ingest package

Subpackages

Submodules

ingest.apps module

The Scale ingest application

class ingest.apps.IngestConfig(app_name, app_module)

Bases: django.apps.config.AppConfig

Configuration for the ingest app

label = u'ingest'
name = u'ingest'
ready()

Override this method in subclasses to run code when Django starts.

verbose_name = u'Ingest'

ingest.ingest_event_serializers module

Defines the serializers for ingests

class ingest.ingest_event_serializers.IngestEventBaseSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: util.rest.ModelIdSerializer

Converts ingest event model fields to REST output

class ingest.ingest_event_serializers.IngestEventDetailsSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.ingest_event_serializers.IngestEventBaseSerializerV6

Converts ingest event model fields to REST output

class ingest.ingest_event_serializers.IngestEventSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.ingest_event_serializers.IngestEventBaseSerializerV6

Converts ingest event model fields to REST output

ingest.ingest_job module

Defines the functions necessary to perform the ingest of a source file

ingest.ingest_job.perform_ingest(ingest_id)

Performs the ingest for the given ingest ID

Parameters:ingest_id (int) – The ID of the ingest to perform

ingest.models module

Defines the database models related to ingesting files

class ingest.models.Ingest(*args, **kwargs)

Bases: django.db.models.base.Model

Represents an instance of a file being ingested into a workspace

Parameters:
  • file_name (django.db.models.CharField) – The name of the file
  • strike (django.db.models.ForeignKey) – The Strike process that created this ingest
  • scan (django.db.models.ForeignKey) – The Scan process that created this ingest
  • status (django.db.models.CharField) – The status of the file ingest process
  • transfer_started (django.db.models.DateTimeField) – When the transfer to the workspace started
  • transfer_ended (django.db.models.DateTimeField) – When the transfer to the workspace ended
  • bytes_transferred (django.db.models.BigIntegerField) – The total number of bytes transferred so far
  • media_type (django.db.models.CharField) – The IANA media type of the file
  • file_size (django.db.models.BigIntegerField) – The size of the file in bytes
  • data_type_tags (django.db.models.ArrayField) – An array of data type “tags” for the file
  • file_path (django.db.models.CharField) – The relative path for where the file is stored in the workspace
  • workspace (django.db.models.ForeignKey) – The workspace where the file was transferred
  • new_file_path (django.db.models.CharField) – The relative path for where the file should be moved as part of ingesting
  • new_workspace (django.db.models.ForeignKey) – The new workspace to move the file into as part of ingesting
  • job (django.db.models.ForeignKey) – The ingest job that is processing this ingest
  • ingest_started (django.db.models.DateTimeField) – When the ingest was started
  • ingest_ended (django.db.models.DateTimeField) – When the ingest ended
  • source_file (django.db.models.ForeignKey) – A reference to the source file that was stored by this ingest
  • data_started (django.db.models.DateTimeField) – The start time of the data in this source file
  • data_ended (django.db.models.DateTimeField) – The end time of the data in this source file
  • created (django.db.models.DateTimeField) – When the ingest model was created
  • last_modified (django.db.models.DateTimeField) – When the ingest model was last modified
ALPHABETIZE_FIELDS = [u'file_name', u'status', u'media_type', u'file_path', u'new_file_path']
exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

INGEST_STATUSES = ((u'TRANSFERRING', u'TRANSFERRING'), (u'TRANSFERRED', u'TRANSFERRED'), (u'DEFERRED', u'DEFERRED'), (u'QUEUED', u'QUEUED'), (u'INGESTING', u'INGESTING'), (u'INGESTED', u'INGESTED'), (u'ERRORED', u'ERRORED'), (u'DUPLICATE', u'DUPLICATE'))
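
INGEST_STATUSES is a tuple of (stored value, display label) pairs, which is the shape Django expects for field choices. The following is a minimal sketch of checking a status string against it without Django; VALID_STATUSES, is_valid_status and status_label are hypothetical names and not part of ingest.models.

```python
# Copied from the attribute above: (stored value, display label) pairs.
INGEST_STATUSES = (
    ('TRANSFERRING', 'TRANSFERRING'),
    ('TRANSFERRED', 'TRANSFERRED'),
    ('DEFERRED', 'DEFERRED'),
    ('QUEUED', 'QUEUED'),
    ('INGESTING', 'INGESTING'),
    ('INGESTED', 'INGESTED'),
    ('ERRORED', 'ERRORED'),
    ('DUPLICATE', 'DUPLICATE'),
)

# Hypothetical lookup set built from the stored values only.
VALID_STATUSES = frozenset(value for value, _label in INGEST_STATUSES)


def is_valid_status(status):
    """Return True if the string is one of the documented status values."""
    return status in VALID_STATUSES


def status_label(status):
    """Return the display label for a status; raises KeyError if unknown."""
    return dict(INGEST_STATUSES)[status]
```

The comparison is case-sensitive, so a lower-case value such as 'queued' is rejected.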
exception MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

add_data_type_tag(tag)

Adds a new data type tag to the file.

Parameters:tag (string) – The data type tag to add
add_file(file_name, workspace, scan_id=None, strike_id=None)

Add file source metadata to ingest record

Parameters:
  • file_name (string) – File name excluding full path
  • workspace (string) –
  • scan_id (int) –
  • strike_id (int) –
bytes_transferred

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

created

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

data_ended

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

data_started

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

data_type_tags

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

file_name

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

file_path

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

file_size

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

get_data_type_tags()

Returns the set of data type tags associated with this file

Returns:The set of data type tags
Return type:set of string
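
The relationship between the stored data_type_tags array and the set returned by get_data_type_tags() can be sketched with a stand-in class. TaggedFile is hypothetical and does not touch the database; skipping duplicates in add_data_type_tag is an assumption of this sketch, not documented behaviour.

```python
class TaggedFile(object):
    """Hypothetical stand-in for the tag-handling part of Ingest."""

    def __init__(self):
        # Mirrors the ArrayField: a plain list of strings.
        self.data_type_tags = []

    def add_data_type_tag(self, tag):
        # Assumption: a tag that is already present is not stored twice.
        if tag not in self.data_type_tags:
            self.data_type_tags.append(tag)

    def get_data_type_tags(self):
        # Documented return type is a set of strings.
        return set(self.data_type_tags)
```

For example, adding 'landsat' twice and 'hdf' once leaves get_data_type_tags() returning a set with two members.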
get_ingest_source_event()

Returns the event that triggered the ingest strike or scan

get_next_by_created(*moreargs, **morekwargs)
get_next_by_last_modified(*moreargs, **morekwargs)
get_previous_by_created(*moreargs, **morekwargs)
get_previous_by_last_modified(*moreargs, **morekwargs)
get_recipe_name()

Returns the recipe name

get_status_display(*moreargs, **morekwargs)
id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

ingest_ended

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

ingest_started

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

ingestevent_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

is_there_rule_match(file_handler, workspaces, no_match_status=None)

Applies rules to an ingest record, determining if there is a match and updating as indicated in rule match

Parameters:
  • file_handler (ingest.handlers.file_handler.FileHandler) – Rules to be matched against the ingest record
  • workspaces (dict) – Media type to workspace mapping
  • no_match_status (string) – Optional status to apply when rules aren’t matched
Returns:

The ingest record if matched, otherwise None

Return type:

ingest.models.Ingest

job

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

job_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

last_modified

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

media_type

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

new_file_path

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

new_workspace

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

new_workspace_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

objects = <ingest.models.IngestManager object>
scan

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

scan_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

source_file

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

source_file_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

status

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

strike

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

strike_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

transfer_ended

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

transfer_started

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

workspace

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

workspace_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

class ingest.models.IngestCounts(time, files=0, size=0)

Bases: object

Represents ingest status values for a specific time slot.

Parameters:
  • time (datetime.datetime) – The time slot being counted.
  • files (int) – The number of files ingested for the time slot.
  • size (int) – The total size of all files ingested for the time slot in bytes.
class ingest.models.IngestEvent(*args, **kwargs)

Bases: django.db.models.base.Model

Represents an ingest event that triggered a recipe

Parameters:
  • type (django.db.models.CharField) – The type of ingest that occurred (strike/scan)
  • ingest (django.db.models.ForeignKey) – The ingest that occurred
  • strike (django.db.models.ForeignKey) – The strike that triggered this event, possibly None (some events are not triggered by rules)
  • scan (django.db.models.ForeignKey) – The scan that triggered this event, possibly None (some events are not triggered by rules)
  • description (django.contrib.postgres.fields.JSONField) – JSON description of the event. This will contain fields specific to the type of the trigger that occurred.
  • occurred (django.db.models.DateTimeField) – When the event occurred
exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

description

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

get_next_by_occurred(*moreargs, **morekwargs)
get_previous_by_occurred(*moreargs, **morekwargs)
id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

ingest

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

ingest_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

job_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

objects = <ingest.models.IngestEventManager object>
occurred

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

recipe_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

scan

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

scan_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

strike

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

strike_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

type

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

class ingest.models.IngestEventManager

Bases: django.db.models.manager.Manager

Manages the IngestEvent model

MANUAL_TYPE = u'MANUAL'
SCAN_TYPE = u'SCAN'
STRIKE_TYPE = u'STRIKE'
create_manual_ingest_event(ingest_id, description, occurred)

Creates a new ingest event and returns the event model.

Parameters:
  • ingest_id – The ID of the ingest that triggered the event
  • description (dict) – The JSON description of the event as a dict
  • occurred (datetime.datetime) – When the event occurred
Returns:

The new ingest event

Return type:

ingest.models.IngestEvent

create_scan_ingest_event(ingest_id, scan, description, occurred)

Creates a new ingest event and returns the event model. The given scan model must have already been saved in the database (it must have an ID). The returned ingest event model will be saved in the database.

Parameters:
  • ingest_id – The ingest that triggered the scan
  • scan (ingest.models.Scan) – The scan that triggered the event
  • description (dict) – The JSON description of the event as a dict
  • occurred (datetime.datetime) – When the event occurred
Returns:

The new ingest event

Return type:

ingest.models.IngestEvent

create_strike_ingest_event(ingest_id, strike, description, occurred)

Creates a new ingest event and returns the event model. The given strike model must have already been saved in the database (it must have an ID). The returned ingest event model will be saved in the database.

Parameters:
  • ingest_id – The ingest that triggered the strike
  • strike (ingest.models.Strike) – The strike that triggered the event
  • description (dict) – The JSON description of the event as a dict
  • occurred (datetime.datetime) – When the event occurred
Returns:

The new ingest event

Return type:

ingest.models.IngestEvent
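
The three create_* methods differ only in which source model they take. A minimal sketch of choosing between them follows; create_ingest_event and FakeEventManager are hypothetical, and the sketch assumes each method records the matching MANUAL_TYPE, SCAN_TYPE or STRIKE_TYPE constant. In real code the manager would presumably be IngestEvent.objects.

```python
def create_ingest_event(manager, ingest_id, description, occurred,
                        scan=None, strike=None):
    """Hypothetical dispatcher over the three documented create_* methods."""
    # Assumption of this sketch: a strike takes precedence if both are given.
    if strike is not None:
        return manager.create_strike_ingest_event(ingest_id, strike, description, occurred)
    if scan is not None:
        return manager.create_scan_ingest_event(ingest_id, scan, description, occurred)
    return manager.create_manual_ingest_event(ingest_id, description, occurred)


class FakeEventManager(object):
    """Test double that returns the event type each method is assumed to record."""

    def create_manual_ingest_event(self, ingest_id, description, occurred):
        return 'MANUAL'

    def create_scan_ingest_event(self, ingest_id, scan, description, occurred):
        return 'SCAN'

    def create_strike_ingest_event(self, ingest_id, strike, description, occurred):
        return 'STRIKE'
```

The scan and strike arguments must already be saved models, as the method descriptions above require; the fake manager here ignores them.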

class ingest.models.IngestManager

Bases: django.db.models.manager.Manager

Provides additional methods for handling ingests.

create_ingest(file_name, workspace, scan_id=None, strike_id=None)

Creates a new ingest for the given file name. The database save is the caller’s responsibility.

Parameters:
  • file_name (string) – The name of the file being ingested
  • workspace (string) –
  • recipe (string) – The name of the recipe to kick off after ingest
  • scan_id (int) –
  • strike_id (int) –
Returns:

The new ingest model

Return type:

ingest.models.Ingest

filter_ingests(source_file_id=None, started=None, ended=None, statuses=None, scan_ids=None, strike_ids=None, file_name=None, order=None)

Returns a query for ingest models that filters on the given fields. The returned query includes the related strike, scan, source_file, and source_file.workspace fields, except for the strike.configuration, scan.configuration, and source_file.workspace.json_config fields.

Parameters:
  • source_file_id (int) – Query ingests for this source file ID.
  • started (datetime.datetime) – Query ingests updated after this time.
  • ended (datetime.datetime) – Query ingests updated before this time.
  • statuses ([string]) – Query ingests with a specific process status.
  • scan_ids ([string]) – Query ingests created by a specific scan processor.
  • strike_ids ([string]) – Query ingests created by a specific strike processor.
  • file_name (string) – Query ingests with a specific file name.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The ingest query

Return type:

django.db.models.QuerySet

get_details(ingest_id, is_staff=False)

Gets additional details for the given ingest model based on related model attributes.

Parameters:
  • ingest_id (int) – The unique identifier of the ingest.
  • is_staff (bool) – Whether the requesting user is a staff member
Returns:

The ingest with extra related attributes.

Return type:

ingest.models.Ingest

get_dupe_ingests_by_scan(scan_id, new_ingests)

Returns a list of ingests associated with a scan and file name/sizes

Parameters:
  • scan_id (int) – Query ingests created by a specific scan processor.
  • new_ingests (dict) – dict of filename/file sizes
Returns:

The list of ingests that match the scan, filenames and file sizes

Return type:

[ingest.models.Ingest]
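
The matching rule described above (same file name and same file size) can be sketched in memory. This is not the manager's query; find_dupes is a hypothetical helper, existing records are plain dicts, and new_ingests is assumed to map file name to file size in bytes.

```python
def find_dupes(existing, new_ingests):
    """Return existing records whose name and size both match new_ingests.

    existing    -- list of dicts with 'file_name' and 'file_size' keys
    new_ingests -- dict mapping file name to file size (assumed layout)
    """
    dupes = []
    for record in existing:
        name = record['file_name']
        # Both the name and the size must agree for a record to count.
        if name in new_ingests and new_ingests[name] == record['file_size']:
            dupes.append(record)
    return dupes
```

A record with a matching name but a different size is not treated as a duplicate in this sketch.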

get_ingest_job_type()

Returns the Scale Ingest job type

Returns:The ingest job type
Return type:job.models.JobType
get_ingests(started=None, ended=None, statuses=None, scan_ids=None, strike_ids=None, file_name=None, order=None)

Returns a list of ingests within the given time range.

Parameters:
  • started (datetime.datetime) – Query ingests updated after this time.
  • ended (datetime.datetime) – Query ingests updated before this time.
  • statuses ([string]) – Query ingests with a specific process status.
  • scan_ids ([string]) – Query ingests created by a specific scan processor.
  • strike_ids ([string]) – Query ingests created by a specific strike processor.
  • file_name (string) – Query ingests with a specific file name.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The list of ingests that match the time range.

Return type:

[ingest.models.Ingest]
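
A call to get_ingests() might look like the sketch below, which asks for ingests that errored within a time window. errored_ingests_since and RecordingManager are hypothetical; the manager is passed in so the example runs without Django, and in real code it would presumably be Ingest.objects. Passing a plain field name in order is an assumption about the accepted format.

```python
import datetime


def errored_ingests_since(manager, now, hours):
    """Hypothetical helper: ingests that errored in the last `hours` hours."""
    started = now - datetime.timedelta(hours=hours)
    return manager.get_ingests(
        started=started,
        ended=now,
        statuses=['ERRORED'],      # one of the INGEST_STATUSES values
        order=['last_modified'],   # assumed: model field names
    )


class RecordingManager(object):
    """Test double that records the keyword arguments it receives."""

    def __init__(self):
        self.calls = []

    def get_ingests(self, **kwargs):
        self.calls.append(kwargs)
        return []
```

The time window is computed by the caller, which keeps the helper deterministic and easy to test.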

get_ingests_by_scan(scan_id, file_names=None)

Returns a list of ingests associated with a scan and optionally files

Parameters:
  • scan_id ([string]) – Query ingests created by a specific scan processor.
  • file_names ([string]) – Query ingests with the specific file names.
Returns:

The list of ingests that match the scan and file_names.

Return type:

[ingest.models.Ingest]

get_recipe_source_config(ingest_id)

Returns the strike/scan recipe configuration for the given ingest id

get_status(started=None, ended=None, use_ingest_time=False)

Returns ingest status information within the given time range grouped by strike process.

Parameters:
  • started (datetime.datetime) – Query ingests updated after this time.
  • ended (datetime.datetime) – Query ingests updated before this time.
  • use_ingest_time (bool) – Whether to group the status values by ingest time (True) or data time (False).
Returns:

The list of ingest status models that match the time range.

Return type:

[ingest.models.IngestStatus]

start_ingest_tasks(ingests, scan_id=None, strike_id=None)

Starts a batch of ingest tasks for the given scan or strike in an atomic transaction.

One of scan_id or strike_id must be set.

Parameters:
  • ingests ([ingest.models.Ingest]) – The ingest models
  • scan_id (int) – ID of Scan that generated ingest
  • strike_id (int) – ID of Strike that generated ingest
start_ingest_tasks_cm(ingests, scan_id=None, strike_id=None)

Starts a batch of ingest tasks for the given scan or strike in an atomic transaction.

One of scan_id or strike_id must be set.

Parameters:
  • ingests ([ingest.models.Ingest]) – The ingest models
  • scan_id (int) – ID of Scan that generated ingest
  • strike_id (int) – ID of Strike that generated ingest
class ingest.models.IngestStatus(strike=None, most_recent=None, files=0, size=0, values=None)

Bases: object

Represents ingest status values for a strike process.

Parameters:
  • strike (ingest.models.Strike) – The strike process that generated the ingests being counted.
  • most_recent (datetime.datetime) – The date/time of the last ingest generated by the strike process.
  • files (int) – The total number of files ingested by the strike process.
  • size (int) – The total size of all files ingested by the strike process in bytes.
  • values ([ingest.models.IngestCounts]) – A list of values that summarize work done by the strike process.
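
How per-slot IngestCounts values roll up into an IngestStatus can be sketched with stand-in classes that copy the documented constructor signatures. The attribute names and the summarize helper are assumptions of this sketch; the real classes may store or compute these values differently.

```python
import datetime


class IngestCounts(object):
    """Stand-in with the documented constructor signature."""

    def __init__(self, time, files=0, size=0):
        self.time = time
        self.files = files
        self.size = size


class IngestStatus(object):
    """Stand-in with the documented constructor signature."""

    def __init__(self, strike=None, most_recent=None, files=0, size=0, values=None):
        self.strike = strike
        self.most_recent = most_recent
        self.files = files
        self.size = size
        self.values = values or []


def summarize(strike, counts):
    """Hypothetical helper: total the per-slot counts for one strike.

    most_recent is left unset because it refers to the last ingest,
    which cannot be derived from the time slots alone.
    """
    return IngestStatus(
        strike=strike,
        files=sum(c.files for c in counts),
        size=sum(c.size for c in counts),
        values=list(counts),
    )
```

Two slots holding 3 files (300 bytes) and 2 files (50 bytes) would give a status with files equal to 5 and size equal to 350.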
class ingest.models.Scan(*args, **kwargs)

Bases: django.db.models.base.Model

Represents an instance of a Scan process which will run and detect files in a workspace for ingest

Parameters:
  • name (django.db.models.CharField) – The identifying name of this Scan process
  • title (django.db.models.CharField) – The human-readable name of this Scan process
  • description (django.db.models.TextField) – An optional description of this Scan process
  • configuration (django.contrib.postgres.fields.JSONField) – JSON configuration for this Scan process
  • dry_run_job (django.db.models.ForeignKey) – The job that is performing the Scan process as dry run
  • job (django.db.models.ForeignKey) – The job that is performing the Scan process with ingests
  • file_count (django.db.models.BigIntegerField) – Number of files identified by last execution of Scan
  • created (django.db.models.DateTimeField) – When the Scan process was created
  • last_modified (django.db.models.DateTimeField) – When the Scan process was last modified
ALPHABETIZE_FIELDS = [u'name', u'title', u'description']
exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

configuration

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

created

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

description

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

dry_run_job

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

dry_run_job_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

file_count

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

get_next_by_created(*moreargs, **morekwargs)
get_next_by_last_modified(*moreargs, **morekwargs)
get_previous_by_created(*moreargs, **morekwargs)
get_previous_by_last_modified(*moreargs, **morekwargs)
get_scan_configuration()

Returns the configuration for this Scan process

Returns:The configuration for this Scan process
Return type:ingest.scan.configuration.scan_configuration.ScanConfiguration
get_v6_configuration_json()

Returns the scan configuration in v6 of the JSON schema

Returns:The scan configuration in v6 of the JSON schema
Return type:dict
id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

ingest_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

ingestevent_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

job

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

job_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

last_modified

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

name

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

objects = <ingest.models.ScanManager object>
title

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

class ingest.models.ScanManager

Bases: django.db.models.manager.Manager

Provides additional methods for handling Scan processes

cancel_scan(scan_id)

Attempts to cancel the job associated with a scan

Parameters:scan_id (int) – The unique identifier of the Scan process.
create_scan(*args, **kwargs)

Creates a new Scan process with the given configuration and returns the new Scan model. All changes to the database will occur in an atomic transaction.

Returns:

The new Scan process

Return type:

ingest.models.Scan

Raises:ingest.scan.configuration.exceptions.InvalidScanConfiguration – If the configuration is invalid.
edit_scan(*args, **kwargs)

Edits the given Scan process and saves the changes in the database. All database changes occur in an atomic transaction. An argument of None for a field indicates that the field should not change.

Raises:ingest.scan.configuration.exceptions.InvalidScanConfiguration – If the configuration is invalid.
get_details(scan_id)

Returns the Scan process for the given ID with all detail fields included.

Parameters:scan_id (int) – The unique identifier of the Scan process.
Returns:The Scan process with all detail fields included.
Return type:ingest.models.Scan
get_scan_job_type()

Returns the Scale Scan job type

Returns:The Scan job type
Return type:job.models.JobType
get_scans(started=None, ended=None, names=None, order=None)

Returns a list of Scan processes within the given time range.

Parameters:
  • started (datetime.datetime) – Query Scan processes updated after this time.
  • ended (datetime.datetime) – Query Scan processes updated before this time.
  • names ([string]) – Query Scan processes associated with the name.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The list of Scan processes that match the time range.

Return type:

[ingest.models.Scan]

queue_scan(*args, **kwargs)

Retrieves a Scan model and uses metadata to place a job to run the Scan process on the queue. All changes to the database will occur in an atomic transaction.

Parameters:
  • scan_id (int) – The unique identifier of the Scan process.
  • dry_run (bool) – Whether the scan will execute as a dry run
Returns:

The new Scan process

Return type:

ingest.models.Scan

validate_scan_v6(configuration)

Validates the given configuration for creating a new scan process

Parameters:configuration (dict) – The scan configuration
Returns:The scan validation
Return type:ingest.models.ScanValidation
class ingest.models.ScanValidation(is_valid, errors, warnings)

Bases: tuple

errors

Alias for field number 1

is_valid

Alias for field number 0

warnings

Alias for field number 2
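
The field aliases above are the signature of a named tuple. A minimal sketch of an equivalent type and of reading a result follows; the stand-in ScanValidation is built here with collections.namedtuple, report is a hypothetical helper, and treating errors and warnings as lists of strings is an assumption.

```python
from collections import namedtuple

# Stand-in with the documented field order: is_valid (0), errors (1), warnings (2).
ScanValidation = namedtuple('ScanValidation', ['is_valid', 'errors', 'warnings'])


def report(validation):
    """Hypothetical helper that turns a validation result into text lines."""
    lines = ['valid' if validation.is_valid else 'invalid']
    lines.extend('error: %s' % e for e in validation.errors)
    lines.extend('warning: %s' % w for w in validation.warnings)
    return lines
```

Because the type is a tuple, a result can also be unpacked positionally, for example is_valid, errors, warnings = validation.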

class ingest.models.Strike(*args, **kwargs)

Bases: django.db.models.base.Model

Represents an instance of a Strike process which will run and detect incoming files in a directory for ingest

Parameters:
  • name (django.db.models.CharField) – The identifying name of this Strike process
  • title (django.db.models.CharField) – The human-readable name of this Strike process
  • description (django.db.models.TextField) – An optional description of this Strike process
  • configuration (django.contrib.postgres.fields.JSONField) – JSON configuration for this Strike process
  • job (django.db.models.ForeignKey) – The job that is performing the Strike process
  • created (django.db.models.DateTimeField) – When the Strike process was created
  • last_modified (django.db.models.DateTimeField) – When the Strike process was last modified
ALPHABETIZE_FIELDS = [u'name', u'title', u'description']
exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

configuration

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

created

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

description

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

get_next_by_created(*moreargs, **morekwargs)
get_next_by_last_modified(*moreargs, **morekwargs)
get_previous_by_created(*moreargs, **morekwargs)
get_previous_by_last_modified(*moreargs, **morekwargs)
get_strike_configuration()

Returns the configuration for this Strike process

Returns:The configuration for this Strike process
Return type:ingest.strike.configuration.strike_configuration.StrikeConfiguration
get_v6_configuration_json()

Returns the strike configuration in v6 of the JSON schema

Returns:The strike configuration in v6 of the JSON schema
Return type:dict
id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

ingest_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

ingestevent_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

job

Accessor to the related object on the forward side of a many-to-one or one-to-one (via ForwardOneToOneDescriptor subclass) relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

job_id

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

last_modified

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

metricsingest_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

name

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

objects = <ingest.models.StrikeManager object>
title

A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.

class ingest.models.StrikeManager

Bases: django.db.models.manager.Manager

Provides additional methods for handling Strike processes

create_strike(*args, **kwargs)

Creates a new Strike process with the given configuration and returns the new Strike model. The Strike model will be saved in the database and the job to run the Strike process will be placed on the queue. All changes to the database will occur in an atomic transaction.

Parameters:
Returns:

The new Strike process

Return type:

ingest.models.Strike

Raises:ingest.strike.configuration.exceptions.InvalidStrikeConfiguration – If the configuration is invalid.
edit_strike(*args, **kwargs)

Edits the given Strike process and saves the changes in the database. All database changes occur in an atomic transaction. An argument of None for a field indicates that the field should not change.

Parameters:
Raises:ingest.strike.configuration.exceptions.InvalidStrikeConfiguration – If the configuration is invalid.
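The "None means unchanged" convention described for edit_strike can be sketched in plain Python. The parameter list of edit_strike is not shown in this reference, so apply_edits and its title, description, and configuration arguments are assumptions based on the Strike model fields, and a dict stands in for the model.

```python
def apply_edits(current, title=None, description=None, configuration=None):
    """Return a copy of `current` with only the non-None fields replaced."""
    updated = dict(current)
    edits = {
        "title": title,
        "description": description,
        "configuration": configuration,
    }
    for field, value in edits.items():
        # None is the sentinel for "leave this field as it is".
        if value is not None:
            updated[field] = value
    return updated


strike = {"title": "Old title", "description": "Watches a directory", "configuration": {}}
edited = apply_edits(strike, title="New title")
```

One consequence of this convention is that a field cannot be explicitly set to None through the same call.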
get_details(strike_id, is_staff=False)

Returns the Strike process for the given ID with all detail fields included.

Parameters:
  • strike_id (int) – The unique identifier of the Strike process.
  • is_staff (bool) – Whether the requesting user is a staff member
Returns:

The Strike process with all detail fields included.

Return type:

ingest.models.Strike

get_strike_job_type()

Returns the Scale Strike job type

Returns:The Strike job type
Return type:job.models.JobType
get_strikes(started=None, ended=None, names=None, order=None)

Returns a list of Strike processes within the given time range.

Parameters:
  • started (datetime.datetime) – Query Strike processes updated after this time.
  • ended (datetime.datetime) – Query Strike processes updated before this time.
  • names ([string]) – Query Strike processes associated with these names.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The list of Strike processes that match the time range.

Return type:

[ingest.models.Strike]
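The get_strikes parameters can be read as a filter followed by a sort. The sketch below applies that reading to plain dicts rather than the database. filter_strikes is a hypothetical helper, and three details are assumptions not confirmed by this reference: inclusive time bounds, last_modified as the default sort field, and a leading "-" for descending order.

```python
import datetime


def filter_strikes(strikes, started=None, ended=None, names=None, order=None):
    """Filter and sort plain dicts the way the get_strikes() parameters suggest."""
    matches = [
        strike for strike in strikes
        if (started is None or strike["last_modified"] >= started)
        and (ended is None or strike["last_modified"] <= ended)
        and (names is None or strike["name"] in names)
    ]
    # Apply sort keys from last to first so the first key dominates (sorts are stable).
    for key in reversed(order or ["last_modified"]):
        field = key.lstrip("-")
        matches.sort(key=lambda strike: strike[field], reverse=key.startswith("-"))
    return matches


strikes = [
    {"name": "alpha", "last_modified": datetime.datetime(2020, 1, 5)},
    {"name": "beta", "last_modified": datetime.datetime(2020, 2, 10)},
    {"name": "gamma", "last_modified": datetime.datetime(2020, 3, 15)},
]
recent = filter_strikes(
    strikes,
    started=datetime.datetime(2020, 2, 1),
    order=["-name"],
)
```
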

validate_strike_v6(configuration)

Validates the given configuration for creating a new strike process

Parameters:configuration (dict) – The strike configuration
Returns:The strike validation
Return type:ingest.models.StrikeValidation
class ingest.models.StrikeValidation(is_valid, errors, warnings)

Bases: tuple

errors

Alias for field number 1

is_valid

Alias for field number 0

warnings

Alias for field number 2
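The field aliases above match the layout of a named tuple. The sketch below rebuilds an equivalent type with collections.namedtuple to show how the three fields can be read. The element types of errors and warnings are not stated in this reference, so the string used here is a placeholder.

```python
from collections import namedtuple

# Same field order as documented: is_valid (0), errors (1), warnings (2).
StrikeValidation = namedtuple("StrikeValidation", ["is_valid", "errors", "warnings"])

result = StrikeValidation(is_valid=False, errors=["placeholder error"], warnings=[])

# Fields can be read by name, by index, or by unpacking.
is_valid, errors, warnings = result
```
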

ingest.serializers module

Defines the serializers for ingests

class ingest.serializers.IngestBaseSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: util.rest.ModelIdSerializer

Converts ingest model fields to REST output

class ingest.serializers.IngestDetailsSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.IngestSerializerV6

Converts ingest model fields to REST output

class ingest.serializers.IngestEventBaseSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: util.rest.ModelIdSerializer

Converts ingest event model fields to REST output

class ingest.serializers.IngestEventDetailsSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.IngestEventBaseSerializerV6

Converts ingest event model fields to REST output

class ingest.serializers.IngestEventSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.IngestEventBaseSerializerV6

Converts ingest event model fields to REST output

class ingest.serializers.IngestSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.IngestBaseSerializer

Converts ingest model fields to REST output

class ingest.serializers.IngestStatusSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: rest_framework.serializers.Serializer

Converts ingest model fields to REST output

class ingest.serializers.IngestStatusValuesSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: rest_framework.serializers.Serializer

Converts ingest model fields to REST output

class ingest.serializers.ScanBaseSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: util.rest.ModelIdSerializer

Converts scan model fields to REST output

class ingest.serializers.ScanDetailsSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.ScanSerializerV6

Converts scan model fields to REST output

class ingest.serializers.ScanSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.ScanBaseSerializer

Converts scan model fields to REST output

class JobBaseSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: util.rest.ModelIdSerializer

Converts job model fields to REST output.

class ingest.serializers.StrikeBaseSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: util.rest.ModelIdSerializer

Converts strike model fields to REST output

class ingest.serializers.StrikeDetailsSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.StrikeSerializerV6

Converts strike model fields to REST output

class ingest.serializers.StrikeSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: ingest.serializers.StrikeBaseSerializer

Converts strike model fields to REST output

class JobBaseSerializerV6(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: util.rest.ModelIdSerializer

Converts job model fields to REST output.

ingest.urls module

Defines the URLs for the RESTful ingest and Strike services

ingest.views module

Defines the views for the RESTful ingest and Strike services

class ingest.views.CancelScansView(**kwargs)

Bases: rest_framework.generics.GenericAPIView

This view is the endpoint for canceling a scan in progress.

get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request.

post(request, scan_id)
queryset
class ingest.views.IngestDetailsView(**kwargs)

Bases: rest_framework.generics.RetrieveAPIView

This view is the endpoint for retrieving/updating details of an ingest.

get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request

queryset
retrieve(request, ingest_id=None, file_name=None)

Determines the API version and calls the version-specific method

Parameters:
  • request (rest_framework.request.Request) – the HTTP GET request
  • ingest_id (int encoded as a str) – The id of the ingest
  • file_name (string) – The name of the ingest
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user
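The version dispatch described for retrieve can be sketched without Django REST framework. Everything here is a stand-in: FakeRequest models only a version attribute, UnsupportedVersion is a hypothetical error, and both the "v6" version string and the behavior for other versions are assumptions, since this reference does not state them.

```python
class FakeRequest:
    """Stand-in for rest_framework.request.Request; only `version` is modeled."""

    def __init__(self, version):
        self.version = version


class UnsupportedVersion(Exception):
    """Hypothetical error for versions this sketch does not handle."""


class IngestDetailsSketch:
    def retrieve(self, request, ingest_id=None):
        # Dispatch on the API version attached to the request.
        if request.version == "v6":
            return self.retrieve_v6(request, ingest_id)
        raise UnsupportedVersion(request.version)

    def retrieve_v6(self, request, ingest_id):
        # The documented ingest_id is an int encoded as a str, so convert it.
        return {"id": int(ingest_id)}


view = IngestDetailsSketch()
response = view.retrieve(FakeRequest("v6"), ingest_id="42")
```
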

retrieve_v6(request, ingest_id)

Retrieves the details for an ingest and returns them in JSON form

Parameters:
  • request (rest_framework.request.Request) – the HTTP GET request
  • ingest_id (int encoded as a str) – The id of the ingest
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

class ingest.views.IngestsStatusView(**kwargs)

Bases: rest_framework.generics.ListAPIView

This view is the endpoint for retrieving summarized ingest status.

get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request

list(request)

Determines the API version and calls the version-specific method

Parameters:request (rest_framework.request.Request) – the HTTP GET request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
list_impl(request)

Retrieves the ingest status information and returns it in JSON form

Parameters:request (rest_framework.request.Request) – the HTTP GET request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
queryset
class ingest.views.IngestsView(**kwargs)

Bases: rest_framework.generics.ListAPIView

This view is the endpoint for retrieving the list of all ingests.

get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request

list(request)

Determines the API version and calls the version-specific method

Parameters:request (rest_framework.request.Request) – the HTTP GET request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
list_impl(request)

Retrieves the list of all ingests and returns it in JSON form

Parameters:request (rest_framework.request.Request) – the HTTP GET request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
queryset
class ingest.views.ScansDetailsView(**kwargs)

Bases: rest_framework.generics.GenericAPIView

This view is the endpoint for retrieving/updating details of a Scan process.

get(request, scan_id)

Retrieves the details for a Scan process and returns them in JSON form

Parameters:
  • request (rest_framework.request.Request) – the HTTP GET request
  • scan_id (int encoded as a str) – The ID of the Scan process
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request.

patch(request, scan_id)

Edits an existing Scan process and returns the updated details

Parameters:
  • request (rest_framework.request.Request) – the HTTP PATCH request
  • scan_id (int encoded as a str) – The ID of the Scan process
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

queryset
class ingest.views.ScansProcessView(**kwargs)

Bases: rest_framework.generics.GenericAPIView

This view is the endpoint for launching a scan execution to ingest

get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request.

post(request, scan_id=None)

Launches a scan to ingest from an existing scan model instance

Parameters:
  • request (rest_framework.request.Request) – the HTTP POST request
  • scan_id (int) – ID for Scan record to pull configuration from
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

queryset
class ingest.views.ScansValidationView(**kwargs)

Bases: rest_framework.views.APIView

This view is the endpoint for validating a new Scan process before attempting to actually create it

post(request)

Validates a new Scan process and returns any warnings discovered

Parameters:request (rest_framework.request.Request) – the HTTP POST request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
queryset
class ingest.views.ScansView(**kwargs)

Bases: rest_framework.generics.ListCreateAPIView

This view is the endpoint for retrieving the list of all Scan processes.

create(request)

Creates a new Scan process and returns a link to the detail URL

Parameters:request (rest_framework.request.Request) – the HTTP POST request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request.

list(request)

Retrieves the list of all Scan processes and returns it in JSON form

Parameters:request (rest_framework.request.Request) – the HTTP GET request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
queryset
class ingest.views.StrikeDetailsView(**kwargs)

Bases: rest_framework.generics.GenericAPIView

This view is the endpoint for retrieving/updating details of a Strike process.

get(request, strike_id)

Determines the API version and calls the version-specific method

Parameters:
  • request (rest_framework.request.Request) – the HTTP GET request
  • strike_id (int encoded as a str) – The ID of the Strike process
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

get_impl(request, strike_id)

Retrieves the details for a Strike process and returns them in JSON form

Parameters:
  • request (rest_framework.request.Request) – the HTTP GET request
  • strike_id (int encoded as a str) – The ID of the Strike process
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request

patch(request, strike_id)

Determines the API version and calls the version-specific method

Parameters:
  • request (rest_framework.request.Request) – the HTTP PATCH request
  • strike_id (int encoded as a str) – The ID of the Strike process
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

patch_impl_v6(request, strike_id)

Edits an existing Strike process and returns the updated details

Parameters:
  • request (rest_framework.request.Request) – the HTTP PATCH request
  • strike_id (int encoded as a str) – The ID of the Strike process
Return type:

rest_framework.response.Response

Returns:

the HTTP response to send back to the user

queryset
class ingest.views.StrikesValidationView(**kwargs)

Bases: rest_framework.views.APIView

This view is the endpoint for validating a new Strike process before attempting to actually create it

post(request)

Determines the API version and calls the version-specific method

Parameters:request (rest_framework.request.Request) – the HTTP POST request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
post_impl_v6(request)

Validates a new Strike process and returns any warnings discovered

Parameters:request (rest_framework.request.Request) – the HTTP POST request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
queryset
class ingest.views.StrikesView(**kwargs)

Bases: rest_framework.generics.ListCreateAPIView

This view is the endpoint for retrieving the list of all Strike processes.

create(request)

Determines the API version and calls the version-specific method

Parameters:request (rest_framework.request.Request) – the HTTP POST request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
create_impl_v6(request)

Creates a new Strike process and returns a link to the detail URL

Parameters:request (rest_framework.request.Request) – the HTTP POST request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
get_serializer_class()

Returns the appropriate serializer based on the REST API version of the request

list(request)

Determines the API version and calls the version-specific method

Parameters:request (rest_framework.request.Request) – the HTTP GET request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
list_impl(request)

Retrieves the list of all Strike processes and returns it in JSON form

Parameters:request (rest_framework.request.Request) – the HTTP GET request
Return type:rest_framework.response.Response
Returns:the HTTP response to send back to the user
queryset

Module contents

This module handles the ingestion of new data into Scale