source package
Subpackages
Submodules
source.apps module
Defines the application configuration for the source application
source.models module
Defines the database model for source files
-
class source.models.SourceFile(*args, **kwargs)
Bases: storage.models.ScaleFile
Represents a source data file that is available for processing. This is a proxy model of the storage.models.ScaleFile model. It has the same set of fields, but a different manager that provides functionality specific to source files.
-
exception DoesNotExist
Bases: storage.models.DoesNotExist
-
exception MultipleObjectsReturned
Bases: storage.models.MultipleObjectsReturned
-
VALID_TIME_FIELDS = [u'data', u'last_modified']
-
classmethod create()
Creates a new source file
Returns: The new source file
Return type: source.models.SourceFile
-
objects = <source.models.SourceFileManager object>
-
class source.models.SourceFileManager(*args, **kwargs)
Bases: django.contrib.gis.db.models.manager.GeoManager
Provides additional methods for handling source files
-
filter_sources(started=None, ended=None, time_field=None, is_parsed=None, file_name=None, order=None)
Returns a query for source models that filters on the given fields. The returned query includes the related workspace, job_type, and job fields, except for the workspace.json_config field. The related countries are set to be pre-fetched as part of the query.
Parameters: - started (datetime.datetime) – Query source files updated after this time.
- ended (datetime.datetime) – Query source files updated before this time.
- time_field (string) – The time field to use for filtering.
- is_parsed (bool) – Query source files flagged as successfully parsed.
- file_name (str) – Query source files with the given file name.
- order ([string]) – A list of fields to control the sort order.
Returns: The list of source files that match the time range.
Return type: list[storage.models.ScaleFile]
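The filtering described above can be sketched without Django, using plain dictionaries in place of model rows. This is only an illustration of how the parameters might interact, not the library's implementation: the record keys, the mapping of the 'data' time field onto data_started/data_ended, and the default sort on last_modified are all assumptions. A leading '-' on an order field means descending, following Django's order_by convention.

```python
from datetime import datetime

VALID_TIME_FIELDS = ['data', 'last_modified']


def filter_sources(records, started=None, ended=None, time_field=None,
                   is_parsed=None, file_name=None, order=None):
    """Filter and sort dict records the way the documented parameters suggest."""
    time_field = time_field or 'last_modified'
    if time_field not in VALID_TIME_FIELDS:
        raise ValueError('time_field must be one of %s' % VALID_TIME_FIELDS)

    # Assumption: 'data' compares against the data_started/data_ended range.
    if time_field == 'data':
        start_key, end_key = 'data_started', 'data_ended'
    else:
        start_key = end_key = 'last_modified'

    matches = []
    for rec in records:
        if started is not None and rec[start_key] < started:
            continue
        if ended is not None and rec[end_key] > ended:
            continue
        if is_parsed is not None and rec['is_parsed'] != is_parsed:
            continue
        if file_name is not None and rec['file_name'] != file_name:
            continue
        matches.append(rec)

    # Apply sort fields right to left; a stable sort keeps earlier fields dominant.
    for field in reversed(order or ['last_modified']):
        matches.sort(key=lambda rec, key=field.lstrip('-'): rec[key],
                     reverse=field.startswith('-'))
    return matches


# Hypothetical sample data, for illustration only.
SAMPLE = [
    {'file_name': 'a.h5', 'is_parsed': True, 'last_modified': datetime(2020, 1, 1),
     'data_started': datetime(2019, 12, 1), 'data_ended': datetime(2019, 12, 2)},
    {'file_name': 'b.h5', 'is_parsed': False, 'last_modified': datetime(2020, 1, 5),
     'data_started': datetime(2020, 1, 3), 'data_ended': datetime(2020, 1, 4)},
    {'file_name': 'c.h5', 'is_parsed': True, 'last_modified': datetime(2020, 1, 9),
     'data_started': datetime(2020, 1, 7), 'data_ended': datetime(2020, 1, 8)},
]

recent = filter_sources(SAMPLE, started=datetime(2020, 1, 3))
parsed_newest_first = filter_sources(SAMPLE, is_parsed=True, order=['-last_modified'])
```

Filters that are left as None are simply skipped, which matches the all-optional signature shown above.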
-
get_details(source_id)
Gets additional details for the given source model
Parameters: source_id (int) – The unique identifier of the source (file ID)
Returns: The source model with details
Return type: storage.models.ScaleFile
Raises: storage.models.ScaleFile.DoesNotExist – If the file does not exist
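Because get_details raises rather than returning None, callers are expected to handle the missing-file case explicitly. The sketch below shows that calling pattern with stand-ins: the DoesNotExist class, the dictionary store, and the fetch_source helper are hypothetical and exist only so the example runs without Django.

```python
class DoesNotExist(Exception):
    """Stand-in for storage.models.ScaleFile.DoesNotExist."""


def get_details(store, source_id):
    """Return the record for source_id or raise DoesNotExist."""
    try:
        return store[source_id]
    except KeyError:
        raise DoesNotExist('source file %d does not exist' % source_id)


def fetch_source(store, source_id):
    """Hypothetical caller: map a missing file to a 404-style result."""
    try:
        return 200, get_details(store, source_id)
    except DoesNotExist:
        return 404, None


# Hypothetical in-memory store keyed by file ID.
STORE = {7: {'id': 7, 'file_name': 'a.h5'}}
```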
-
get_source_ingests(source_file_id, started=None, ended=None, statuses=None, scan_ids=None, strike_ids=None, order=None)
Returns a query for ingest models for the given source file. The returned query includes the related strike, scan, source_file, and source_file.workspace fields, except for the strike.configuration, scan.configuration, and source_file.workspace.json_config fields.
Parameters: - source_file_id (int) – The source file ID.
- started (datetime.datetime) – Query ingests updated after this time.
- ended (datetime.datetime) – Query ingests updated before this time.
- statuses ([string]) – Query ingests with a specific process status.
- scan_ids ([string]) – Query ingests created by a specific scan processor.
- strike_ids ([string]) – Query ingests created by a specific strike processor.
- file_name (string) – Query ingests with a specific file name.
- order ([string]) – A list of fields to control the sort order.
Returns: The ingest query
Return type: django.db.models.QuerySet
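The list-valued parameters (statuses, scan_ids, strike_ids) can be read as "match any value in the list", combined with the other filters by AND. That reading is an assumption, modeled on Django's `__in` lookup, and the sketch below applies it to plain dictionaries. The record keys and status values are illustrative only.

```python
def filter_ingests(ingests, source_file_id, statuses=None, scan_ids=None,
                   strike_ids=None):
    """Keep ingests of one source file that pass every filter that was supplied."""
    def matches(ingest):
        if ingest['source_file_id'] != source_file_id:
            return False
        # Each list filter passes when the value is any member of the list.
        if statuses and ingest['status'] not in statuses:
            return False
        if scan_ids and ingest['scan_id'] not in scan_ids:
            return False
        if strike_ids and ingest['strike_id'] not in strike_ids:
            return False
        return True

    return [ingest for ingest in ingests if matches(ingest)]


# Hypothetical sample data, for illustration only.
INGESTS = [
    {'id': 1, 'source_file_id': 7, 'status': 'INGESTED', 'scan_id': None, 'strike_id': 3},
    {'id': 2, 'source_file_id': 7, 'status': 'ERRORED', 'scan_id': 5, 'strike_id': None},
    {'id': 3, 'source_file_id': 8, 'status': 'INGESTED', 'scan_id': 5, 'strike_id': None},
]

done_or_failed = filter_ingests(INGESTS, 7, statuses=['INGESTED', 'ERRORED'])
from_scan_5 = filter_ingests(INGESTS, 7, scan_ids=[5])
```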
-
get_source_jobs(source_file_id, started=None, ended=None, statuses=None, job_ids=None, job_type_ids=None, job_type_names=None, batch_ids=None, error_categories=None, order=None)
Returns a query for the list of jobs that have used the given source file as input. The returned query includes the related job_type, job_type_rev, event, and error fields, except for the job_type.manifest and job_type_rev.manifest fields.
Parameters: - source_file_id (int) – The source file ID.
- started (datetime.datetime) – Query jobs updated after this time.
- ended (datetime.datetime) – Query jobs updated before this time.
- statuses ([string]) – Query jobs with a specific execution status.
- job_ids ([int]) – Query jobs associated with the identifier.
- job_type_ids ([int]) – Query jobs of the type associated with the identifier.
- job_type_names ([string]) – Query jobs of the type associated with the name.
- batch_ids ([int]) – Query jobs associated with batches with the given identifiers.
- error_categories ([string]) – Query jobs that failed due to errors associated with the category.
- order ([string]) – A list of fields to control the sort order.
Returns: The list of jobs that match the time range.
Return type:
-
get_source_products(source_file_id, started=None, ended=None, time_field=None, batch_ids=None, job_type_ids=None, job_type_names=None, job_ids=None, is_published=None, is_superseded=None, file_name=None, job_output=None, recipe_ids=None, recipe_type_ids=None, recipe_job=None, order=None)
Returns a query for the list of products produced by the given source file ID. The returned query includes the related workspace, job_type, and job fields, except for the workspace.json_config field. The related countries are set to be pre-fetched as part of the query.
Parameters: - source_file_id (int) – The source file ID.
- started (datetime.datetime) – Query product files updated after this time.
- ended (datetime.datetime) – Query product files updated before this time.
- time_field (string) – The time field to use for filtering.
- batch_ids ([int]) – Query product files produced by batches with the given identifiers.
- job_type_ids ([int]) – Query product files produced by jobs with the given type identifiers.
- job_type_names ([string]) – Query product files produced by jobs with the given type names.
- job_ids ([int]) – Query product files produced by jobs with the given identifiers.
- is_published (bool) – Query product files flagged as currently exposed for publication.
- is_superseded (bool) – Query product files that have/have not been superseded.
- file_name (str) – Query product files with the given file name.
- job_output (str) – Query product files with the given job output.
- recipe_ids ([int]) – Query product files produced by the given recipe IDs.
- recipe_job (str) – Query product files produced by a given recipe name.
- recipe_type_ids ([int]) – Query product files produced by the given recipe types.
- order ([str]) – A list of fields to control the sort order.
Returns: The product file query
Return type: django.db.models.QuerySet
-
get_sources(started=None, ended=None, time_field=None, is_parsed=None, file_name=None, order=None)
Returns a list of source files within the given time range.
Parameters: - started (datetime.datetime) – Query source files updated after this time.
- ended (datetime.datetime) – Query source files updated before this time.
- time_field (string) – The time field to use for filtering.
- is_parsed (bool) – Query source files flagged as successfully parsed.
- file_name (str) – Query source files with the given file name.
- order ([string]) – A list of fields to control the sort order.
Returns: The list of source files that match the time range.
Return type: list[storage.models.ScaleFile]
-
save_parse_results(*args, **kwargs)
Saves the given parse results to the source file for the given ID. All database changes occur in an atomic transaction.
Parameters: - src_file_id (int) – The ID of the source file
- geo_json (dict) – The associated geojson data, possibly None
- data_started (datetime.datetime or None) – The start time of the data contained in the source file, possibly None
- data_ended (datetime.datetime or None) – The end time of the data contained in the source file, possibly None
- data_types ([string]) – List of strings containing the data types tags for this source file.
- new_workspace_path (str) – New workspace path to move the source file to now that parse data is available. If None, the source file should not be moved.
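The atomic behavior described above can be illustrated with the standard library's sqlite3 module, whose connection context manager commits on success and rolls back when an exception escapes. This is a sketch under stated assumptions, not the Scale implementation: the source_file table, its columns, the is_parsed flag update, and the comma-joined storage of data_types are all hypothetical.

```python
import json
import sqlite3
from datetime import datetime


def save_parse_results(conn, src_file_id, geo_json, data_started, data_ended,
                       data_types, new_workspace_path):
    """Store parse results and optionally move the file, all or nothing."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            'UPDATE source_file SET is_parsed = 1, geo_json = ?, '
            'data_started = ?, data_ended = ?, data_types = ? WHERE id = ?',
            (None if geo_json is None else json.dumps(geo_json),
             None if data_started is None else data_started.isoformat(),
             None if data_ended is None else data_ended.isoformat(),
             ','.join(data_types),
             src_file_id))
        if cur.rowcount != 1:
            raise LookupError('no source file with id %d' % src_file_id)
        # A None path means the file stays where it is.
        if new_workspace_path is not None:
            conn.execute('UPDATE source_file SET file_path = ? WHERE id = ?',
                         (new_workspace_path, src_file_id))


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE source_file (id INTEGER PRIMARY KEY, '
             'file_path TEXT UNIQUE, is_parsed INTEGER DEFAULT 0, geo_json TEXT, '
             'data_started TEXT, data_ended TEXT, data_types TEXT)')
conn.execute("INSERT INTO source_file (id, file_path) VALUES (1, 'incoming/a.h5')")
conn.execute("INSERT INTO source_file (id, file_path) VALUES (2, 'incoming/b.h5')")
conn.commit()

save_parse_results(conn, 1, {'type': 'Point', 'coordinates': [0, 0]},
                   datetime(2020, 1, 1), datetime(2020, 1, 2), ['landsat'],
                   'parsed/2020/a.h5')
```

If the move step fails, for example because the target path is already taken, the earlier field updates are rolled back along with it, so the row is never left half-updated.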
-
source.serializers module
Defines the serializers for source files
-
class source.serializers.SourceFileBaseSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)
Bases: storage.serializers.ScaleFileSerializerV6
Converts source file model fields to REST output
-
class source.serializers.SourceFileDetailsSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)
Bases: source.serializers.SourceFileSerializer
Converts source file model fields to REST output
-
class source.serializers.SourceFileSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)
Bases: source.serializers.SourceFileBaseSerializer
Converts source file model fields to REST output
-
class source.serializers.SourceFileUpdateField(read_only=False, write_only=False, required=None, default=<class rest_framework.fields.empty>, initial=<class rest_framework.fields.empty>, source=None, label=None, help_text=None, style=None, error_messages=None, validators=None, allow_null=False)
Bases: rest_framework.fields.Field
Field for displaying the update information for a source file
-
to_representation(value)
Converts the model to its update information
Parameters: value (source.models.SourceFile) – the source file model
Return type: dict
Returns: the dict with the update information
-
type_label = 'update'
-
type_name = 'UpdateField'
-
class source.serializers.SourceFileUpdateSerializer(instance=None, data=<class rest_framework.fields.empty>, **kwargs)
Bases: source.serializers.SourceFileSerializer
Converts source file updates to REST output