source package

Submodules

source.apps module

Defines the application configuration for the source application

class source.apps.SourceConfig(app_name, app_module)

Bases: django.apps.config.AppConfig

Configuration for the source app

label = u'source'
name = u'source'
ready()

Override this method in subclasses to run code when Django starts.

verbose_name = u'Source'

source.models module

Defines the database model for source files

class source.models.SourceFile(*args, **kwargs)

Bases: storage.models.ScaleFile

Represents a source data file that is available for processing. This is a proxy model of the storage.models.ScaleFile model. It has the same set of fields, but a different manager that provides functionality specific to source files.

exception DoesNotExist

Bases: storage.models.DoesNotExist

exception MultipleObjectsReturned

Bases: storage.models.MultipleObjectsReturned

VALID_TIME_FIELDS = [u'data', u'last_modified']
classmethod create()

Creates a new source file

Returns: The new source file
Return type: source.models.SourceFile
objects = <source.models.SourceFileManager object>
class source.models.SourceFileManager(*args, **kwargs)

Bases: django.contrib.gis.db.models.manager.GeoManager

Provides additional methods for handling source files

filter_sources(started=None, ended=None, time_field=None, is_parsed=None, file_name=None, order=None)

Returns a query for source models that filters on the given fields. The returned query includes the related workspace, job_type, and job fields, except for the workspace.json_config field. The related countries are set to be pre-fetched as part of the query.

Parameters:
  • started (datetime.datetime) – Query source files updated after this time.
  • ended (datetime.datetime) – Query source files updated before this time.
  • time_field (string) – The time field to use for filtering.
  • is_parsed (bool) – Query source files flagged as successfully parsed.
  • file_name (str) – Query source files with the given file name.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The list of source files that match the time range.

Return type:

list[storage.models.ScaleFile]
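
The time-range parameters can be pictured as a set of Django-style lookup keyword arguments. The sketch below is an illustration only, not the project's implementation. The helper build_time_filters is hypothetical, and so are the default time field and the mapping of the data time field onto data_started and data_ended columns, which are assumed from the parameter names used elsewhere in this module.

```python
import datetime

# Mirrors SourceFile.VALID_TIME_FIELDS as documented above.
VALID_TIME_FIELDS = ['data', 'last_modified']


def build_time_filters(started=None, ended=None, time_field=None):
    """Hypothetical helper: returns Django-style lookup kwargs for a time range."""
    if time_field is None:
        time_field = 'last_modified'  # assumed default, not confirmed by the source
    if time_field not in VALID_TIME_FIELDS:
        raise ValueError('Invalid time field: %s' % time_field)

    filters = {}
    if time_field == 'data':
        # Assumption: a file matches when its data ends on or after `started`
        # and starts on or before `ended` (the two ranges overlap).
        if started is not None:
            filters['data_ended__gte'] = started
        if ended is not None:
            filters['data_started__lte'] = ended
    else:
        if started is not None:
            filters['last_modified__gte'] = started
        if ended is not None:
            filters['last_modified__lte'] = ended
    return filters
```

A dictionary built this way could be passed to a queryset as filter(**filters). Arguments left as None add no lookup, which matches the optional nature of every parameter listed above.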

get_details(source_id)

Gets additional details for the given source model

Parameters: source_id (int) – The unique identifier of the source (file ID)
Returns: The source model with details
Return type: storage.models.ScaleFile
Raises: storage.models.ScaleFile.DoesNotExist – If the file does not exist

get_source_ingests(source_file_id, started=None, ended=None, statuses=None, scan_ids=None, strike_ids=None, order=None)

Returns a query for ingest models for the given source file. The returned query includes the related strike, scan, source_file, and source_file.workspace fields, except for the strike.configuration, scan.configuration, and source_file.workspace.json_config fields.

Parameters:
  • source_file_id (int) – The source file ID.
  • started (datetime.datetime) – Query ingests updated after this time.
  • ended (datetime.datetime) – Query ingests updated before this time.
  • statuses ([string]) – Query ingests with a specific process status.
  • scan_ids ([string]) – Query ingests created by a specific scan processor.
  • strike_ids ([string]) – Query ingests created by a specific strike processor.
  • file_name (string) – Query ingests with a specific file name.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The ingest query

Return type:

django.db.models.QuerySet
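
The order parameter is described only as a list of field names. Assuming it follows Django's order_by convention, where a leading "-" requests descending order, its effect can be imitated on plain dictionaries. The apply_order helper, the sample records, and their status values are hypothetical and serve only to show how several sort fields combine.

```python
def apply_order(records, order):
    """Sorts dict records by a list of field names; a '-' prefix means descending."""
    result = list(records)
    # Sort by the least significant field first. Python's sort is stable, so
    # fields listed earlier in `order` end up taking precedence.
    for field in reversed(order):
        descending = field.startswith('-')
        key = field[1:] if descending else field
        result.sort(key=lambda rec: rec[key], reverse=descending)
    return result


# Hypothetical ingest records standing in for model instances.
ingests = [
    {'id': 1, 'status': 'INGESTED', 'file_name': 'b.h5'},
    {'id': 2, 'status': 'ERRORED', 'file_name': 'a.h5'},
    {'id': 3, 'status': 'INGESTED', 'file_name': 'a.h5'},
]

# Status ascending, then newest ID first within each status.
ordered = apply_order(ingests, ['status', '-id'])
ordered_ids = [rec['id'] for rec in ordered]
```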

get_source_jobs(source_file_id, started=None, ended=None, statuses=None, job_ids=None, job_type_ids=None, job_type_names=None, batch_ids=None, error_categories=None, order=None)

Returns a query for the list of jobs that have used the given source file as input. The returned query includes the related job_type, job_type_rev, event, and error fields, except for the job_type.manifest and job_type_rev.manifest fields.

Parameters:
  • source_file_id (int) – The source file ID.
  • started (datetime.datetime) – Query jobs updated after this time.
  • ended (datetime.datetime) – Query jobs updated before this time.
  • statuses ([string]) – Query jobs with a specific execution status.
  • job_ids ([int]) – Query jobs associated with the identifier.
  • job_type_ids ([int]) – Query jobs of the type associated with the identifier.
  • job_type_names ([string]) – Query jobs of the type associated with the name.
  • batch_ids ([int]) – Query jobs associated with batches with the given identifiers.
  • error_categories ([string]) – Query jobs that failed due to errors associated with the category.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The list of jobs that match the time range.

Return type:

[job.models.Job]

get_source_products(source_file_id, started=None, ended=None, time_field=None, batch_ids=None, job_type_ids=None, job_type_names=None, job_ids=None, is_published=None, is_superseded=None, file_name=None, job_output=None, recipe_ids=None, recipe_type_ids=None, recipe_job=None, order=None)

Returns a query for the list of products produced by the given source file ID. The returned query includes the related workspace, job_type, and job fields, except for the workspace.json_config field. The related countries are set to be pre-fetched as part of the query.

Parameters:
  • source_file_id (int) – The source file ID.
  • started (datetime.datetime) – Query product files updated after this time.
  • ended (datetime.datetime) – Query product files updated before this time.
  • time_field (string) – The time field to use for filtering.
  • batch_ids ([int]) – Query product files produced by batches with the given identifiers.
  • job_type_ids ([int]) – Query product files produced by jobs with the given type identifiers.
  • job_type_names ([string]) – Query product files produced by jobs with the given type names.
  • job_ids ([int]) – Query product files produced by jobs with the given identifiers.
  • is_published (bool) – Query product files flagged as currently exposed for publication.
  • is_superseded (bool) – Query product files that have/have not been superseded.
  • file_name (str) – Query product files with the given file name.
  • job_output (str) – Query product files with the given job output.
  • recipe_ids ([int]) – Query product files produced by the given recipe IDs.
  • recipe_job (str) – Query product files produced by the given recipe name.
  • recipe_type_ids ([int]) – Query product files produced by the given recipe types.
  • order ([str]) – A list of fields to control the sort order.
Returns:

The product file query

Return type:

django.db.models.QuerySet

get_sources(started=None, ended=None, time_field=None, is_parsed=None, file_name=None, order=None)

Returns a list of source files within the given time range.

Parameters:
  • started (datetime.datetime) – Query source files updated after this time.
  • ended (datetime.datetime) – Query source files updated before this time.
  • time_field (string) – The time field to use for filtering.
  • is_parsed (bool) – Query source files flagged as successfully parsed.
  • file_name (str) – Query source files with the given file name.
  • order ([string]) – A list of fields to control the sort order.
Returns:

The list of source files that match the time range.

Return type:

list[storage.models.ScaleFile]

save_parse_results(*args, **kwargs)

Saves the given parse results to the source file for the given ID. All database changes occur in an atomic transaction.

Parameters:
  • src_file_id (int) – The ID of the source file
  • geo_json (dict) – The associated geojson data, possibly None
  • data_started (datetime.datetime or None) – The start time of the data contained in the source file, possibly None
  • data_ended (datetime.datetime or None) – The end time of the data contained in the source file, possibly None
  • data_types ([string]) – List of strings containing the data type tags for this source file.
  • new_workspace_path (str) – New workspace path to move the source file to now that parse data is available. If None, the source file should not be moved.
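
The atomic-transaction guarantee means that either every change is saved or none is. The sketch below illustrates that behaviour with the standard-library sqlite3 module rather than Django's transaction machinery. The table, its columns, and the save_parse_results_sketch function are hypothetical stand-ins, not the project's schema or code.

```python
import json
import sqlite3


def save_parse_results_sketch(conn, src_file_id, geo_json, data_types):
    """Hypothetical stand-in: writes all parse results in one transaction."""
    # Using the connection as a context manager commits on success and
    # rolls back if an exception escapes the block.
    with conn:
        conn.execute(
            'UPDATE source_file SET is_parsed = 1, geo_json = ? WHERE id = ?',
            (json.dumps(geo_json) if geo_json is not None else None, src_file_id),
        )
        # Validation is deliberately placed after the first write so that a
        # failure here demonstrates the rollback.
        for tag in data_types:
            if not isinstance(tag, str):
                raise TypeError('data type tags must be strings')
        conn.execute(
            'UPDATE source_file SET data_types = ? WHERE id = ?',
            (','.join(data_types), src_file_id),
        )


conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE source_file '
    '(id INTEGER PRIMARY KEY, is_parsed INTEGER, geo_json TEXT, data_types TEXT)'
)
conn.execute('INSERT INTO source_file VALUES (1, 0, NULL, NULL)')
conn.commit()

try:
    save_parse_results_sketch(conn, 1, {'type': 'Point', 'coordinates': [0, 0]}, ['a', 42])
except TypeError:
    pass  # the first UPDATE is rolled back together with the failed step

row_after_failure = conn.execute('SELECT is_parsed FROM source_file WHERE id = 1').fetchone()

save_parse_results_sketch(conn, 1, None, ['a', 'b'])
row_after_success = conn.execute(
    'SELECT is_parsed, data_types FROM source_file WHERE id = 1'
).fetchone()
```

After the failed call the row is unchanged, and after the successful call both updates are visible together.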

source.serializers module

Defines the serializers for source files

class source.serializers.SourceFileBaseSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: storage.serializers.ScaleFileSerializerV6

Converts source file model fields to REST output

class source.serializers.SourceFileDetailsSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: source.serializers.SourceFileSerializer

Converts source file model fields to REST output

class source.serializers.SourceFileSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: source.serializers.SourceFileBaseSerializer

Converts source file model fields to REST output

class source.serializers.SourceFileUpdateField(read_only=False, write_only=False, required=None, default=<class rest_framework.fields.empty>, initial=<class rest_framework.fields.empty>, source=None, label=None, help_text=None, style=None, error_messages=None, validators=None, allow_null=False)

Bases: rest_framework.fields.Field

Field for displaying the update information for a source file

to_representation(value)

Converts the model to its update information

Parameters: value (source.models.SourceFile) – The source file model
Return type: dict
Returns: The dict with the update information
type_label = 'update'
type_name = 'UpdateField'
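
The general shape of a field that turns a model into a dictionary can be sketched without Django REST framework installed. Everything below is an assumption made for illustration: the class is a plain stand-in rather than a rest_framework.fields.Field subclass, and the dictionary keys are chosen from field names mentioned in this reference (is_parsed, last_modified), since the actual content of the update information is not described here.

```python
import datetime
from types import SimpleNamespace


class SourceFileUpdateFieldSketch(object):
    """Hypothetical stand-in for a rest_framework.fields.Field subclass."""

    type_name = 'UpdateField'
    type_label = 'update'

    def to_representation(self, value):
        # The dictionary keys are illustrative; the real output format is not
        # described in this reference.
        return {
            'is_parsed': value.is_parsed,
            'last_modified': value.last_modified.isoformat(),
        }


# A simple namespace stands in for a source.models.SourceFile instance.
source_file = SimpleNamespace(
    is_parsed=True,
    last_modified=datetime.datetime(2020, 1, 2, 3, 4, 5),
)
update_info = SourceFileUpdateFieldSketch().to_representation(source_file)
```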
class source.serializers.SourceFileUpdateSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)

Bases: source.serializers.SourceFileSerializer

Converts source file updates to REST output

Module contents