job.data package

Submodules

job.data.exceptions module

Defines exceptions that can occur when interacting with job data

exception job.data.exceptions.InvalidConnection

Bases: exceptions.Exception

Exception indicating that the provided job connection was invalid

exception job.data.exceptions.InvalidData

Bases: exceptions.Exception

Exception indicating that the provided job data was invalid

exception job.data.exceptions.StatusError

Bases: exceptions.Exception

Exception indicating that an operation cannot be completed due to the current job status.
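
A minimal sketch of how calling code might tell these exceptions apart. The three classes are local stand-ins that only mirror the names documented above (real code would import them from job.data.exceptions). The helper `start_job`, the wrapper `try_start`, and the status strings are hypothetical and exist only for illustration.

```python
# Local stand-ins mirroring the documented exception names; real code would
# import these from job.data.exceptions instead of redefining them.
class InvalidConnection(Exception):
    """The provided job connection was invalid."""


class InvalidData(Exception):
    """The provided job data was invalid."""


class StatusError(Exception):
    """The operation cannot be completed in the current job status."""


def start_job(status, data):
    """Hypothetical helper: rejects bad data and disallowed statuses."""
    if not isinstance(data, dict):
        raise InvalidData('Job data must be a dict')
    if status != 'PENDING':
        raise StatusError('Job cannot be started while %s' % status)
    return 'QUEUED'


def try_start(status, data):
    """Maps each exception type to a short, caller-friendly result string."""
    try:
        return start_job(status, data)
    except InvalidData as ex:
        return 'invalid data: %s' % ex
    except StatusError as ex:
        return 'status error: %s' % ex
```

Catching the specific classes, rather than a bare Exception, keeps data problems and status problems separate for the caller.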

job.data.job_connection module

Defines connections that will provide data to execute jobs

class job.data.job_connection.SeedJobConnection

Bases: object

Represents a connection that will provide data to execute jobs. This class contains the description needed to ensure that the data provided by the connection is sufficient to execute the given job.

add_input_file(file_name, multiple, media_types, optional, partial)

Adds a new file parameter to this connection

Parameters:
  • file_name (str) – The file parameter name
  • multiple (bool) – Whether the file parameter provides multiple files (True)
  • media_types (list of str) – The possible media types of the file parameter (unknown if None or [])
  • optional (bool) – Whether the file parameter is optional and may not be provided (True)
  • partial (bool) – Flag indicating if the parameter only requires a small portion of the file
add_property(property_name)

Adds a new property parameter to this connection

Parameters:property_name (str) – The property parameter name
add_workspace()

Indicates that this connection provides a workspace for storing output files

has_workspace()

Indicates whether this connection provides a workspace for storing output files

Returns:True if this connection provides a workspace, False otherwise
Return type:bool
validate_input_files(files)

Validates the given file parameters to make sure they are valid with respect to the job interface.

Parameters:files ([job.seed.types.SeedInputFiles]) – List of file inputs
Returns:A list of warnings discovered during validation.
Return type:list[job.configuration.data.job_data.ValidationWarning]

Raises:job.configuration.data.exceptions.InvalidConnection – If there is a configuration problem.

validate_properties(property_names)

Validates the given property names to make sure all properties exist if they are required.

Parameters:property_names – The property names to validate
Returns:A list of warnings discovered during validation.
Return type:list[job.configuration.data.job_data.ValidationWarning]
Raises:job.configuration.data.exceptions.InvalidConnection – If there is a configuration problem.
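
The way a connection is described can be sketched with a small stand-in class. `SketchConnection` is hypothetical and is not the real SeedJobConnection; it only copies the documented method names and argument order, and its internal attributes (`files`, `properties`, `workspace`) and the example input names are assumptions.

```python
class SketchConnection(object):
    """Hypothetical stand-in that records what a connection provides."""

    def __init__(self):
        # file_name -> (multiple, media_types, optional, partial)
        self.files = {}
        self.properties = set()
        self.workspace = False

    def add_input_file(self, file_name, multiple, media_types, optional, partial):
        # None or [] both mean the media types are unknown
        self.files[file_name] = (multiple, media_types or [], optional, partial)

    def add_property(self, property_name):
        self.properties.add(property_name)

    def add_workspace(self):
        self.workspace = True

    def has_workspace(self):
        return self.workspace


conn = SketchConnection()
conn.add_input_file('INPUT_IMAGE', False, ['image/png'], False, False)
conn.add_property('THRESHOLD')
conn.add_workspace()
```

A connection built this way describes one required single-file input, one property, and a workspace for output files.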

job.data.job_data module

Defines the data needed for executing a job

class job.data.job_data.JobData(data=None)

Bases: object

Represents the data needed for executing a job. Data includes details about the data inputs, links needed to connect shared resources to resource instances in Scale, and details needed to store all resulting output.

add_file_input(name, file_id)

Adds a new file parameter to this job data.

Parameters:
  • name (string) – The file parameter name
  • file_id (long) – The ID of the file
add_file_list_input(name, file_ids)

Adds a new files parameter to this job data.

Parameters:
  • name (string) – The files parameter name
  • file_ids ([long]) – The IDs of the files
add_file_output(data, add_to_internal=True)

Adds a new output files parameter to this job data with a workspace ID.

Parameters:
  • data (dict) – The output parameter dict
  • add_to_internal (bool) – Whether to also add the parameter to the private data dict. Not needed when called from __init__
add_json_input(data, add_to_internal=True)

Adds a new json parameter to this job data.

Parameters:
  • data (dict) – The json parameter dict
  • add_to_internal (bool) – Whether to also add the parameter to the private data dict. Not needed when called from __init__
extend_interface_with_inputs(interface, job_files)

Adds a value property to both the files and json objects within the Seed Manifest

Returns:A dictionary of Seed Manifest input keys mapped to the corresponding data values.
Return type:dict

get_all_properties()

Retrieves all properties from this job data and returns them in ascending order of their names

Returns:List of strings containing name=value
Return type:[string]
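
The documented return format (name=value strings in ascending order of name) can be sketched as follows. The function `all_properties` and its plain dict argument are assumptions for illustration and say nothing about how JobData stores its properties internally.

```python
def all_properties(json_inputs):
    """Returns 'name=value' strings in ascending order of property name."""
    return ['%s=%s' % (name, json_inputs[name]) for name in sorted(json_inputs)]


props = all_properties({'zoom': 4, 'band': 'red'})
```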
get_dict()

Returns the internal dictionary that represents this job data

Returns:The internal dictionary
Return type:dict
get_injected_env_vars(input_files, interface)

Inject all execution time values to job data mappings

Returns:Mapping of all input keys to their true file / property values
Return type:{str: str}

get_injected_input_values(input_files)

Apply all execution time values to job data

TODO: Remove with v6 when old style Job Types are removed

Parameters:input_files ({str: job.execution.configuration.input_file.InputFile}) – Mapping of input names to InputFiles
Returns:Mapping of all input keys to their true file / property values
Return type:{str: str}
get_input_file_ids()

Returns a set of Scale file identifiers for each file in the job input data.

Returns:Set of Scale file identifiers
Return type:{int}
get_input_file_ids_by_input()

Returns the list of file IDs for each input that holds files

Returns:Dict where each file input name maps to its list of file IDs
Return type:dict
get_input_file_info()

Returns a set of Scale file identifiers and input names for each file in the job input data.

Returns:Set of Scale file identifiers and names
Return type:set[tuple]
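
The three file ID views described above (a flat set, a dict per input, and a set of tuples) relate to one another roughly as in this sketch. The function names, the sample IDs, and the (file ID, input name) tuple order are assumptions and are not taken from the JobData implementation.

```python
# Assumed shape: each file input name maps to the list of Scale file IDs it
# holds, i.e. the kind of dict get_input_file_ids_by_input() is documented
# to return.
file_ids_by_input = {'IMAGES': [101, 102], 'MASK': [102]}


def input_file_ids(ids_by_input):
    """Collects every file ID once, regardless of which input holds it."""
    return {file_id for ids in ids_by_input.values() for file_id in ids}


def input_file_info(ids_by_input):
    """Pairs each file ID with its input name; the tuple order is assumed."""
    return {(file_id, name) for name, ids in ids_by_input.items() for file_id in ids}


all_ids = input_file_ids(file_ids_by_input)
info = input_file_info(file_ids_by_input)
```

A file used by two inputs appears once in the flat set but twice in the tuple set, once per input name.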
get_output_workspace_ids()

Returns a list of the IDs for every workspace used to store the output files for this data

Returns:List of workspace IDs
Return type:[int]
get_output_workspaces()

Returns a dict of the output parameter names mapped to their output workspace ID

Returns:A dict mapping output parameters to workspace IDs
Return type:dict
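
A short sketch of how the list of workspace IDs could be derived from the output-parameter mapping. The helper name and sample values are hypothetical, and removing duplicates and sorting are assumptions, since the documentation only says the list covers every workspace used.

```python
# Assumed shape of the dict returned by get_output_workspaces():
# output parameter name -> workspace ID.
output_workspaces = {'OUTPUT_A': 7, 'OUTPUT_B': 7, 'OUTPUT_C': 9}


def output_workspace_ids(workspaces):
    """Lists each workspace ID once (de-duplication and sorting are assumed)."""
    return sorted(set(workspaces.values()))


workspace_ids = output_workspace_ids(output_workspaces)
```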
get_property_values(property_names)

Retrieves the values contained in this job data for the given property names. If no value is available for a property name, it will not be included in the returned dict.

Parameters:property_names ([string]) – List of property names
Returns:Dict with each property name mapping to its value
Return type:{string: string}
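
The "missing names are left out" behavior described above can be sketched with a dict comprehension. The function `property_values` and its `available` argument are hypothetical stand-ins for the values held by the job data.

```python
def property_values(available, property_names):
    """Keeps only the requested names that actually have a value."""
    return {name: available[name] for name in property_names if name in available}


values = property_values({'band': 'red', 'zoom': '4'}, ['band', 'missing'])
```

Callers should therefore check for a key before reading it rather than expect a None placeholder.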
has_workspaces()

Indicates whether this job data contains output workspaces

Returns:Whether this job data contains output workspaces
Return type:bool
retrieve_input_data_files(data_files)

Retrieves the given data input files and writes them to the given local directories. Any given file parameters that do not appear in the data will not be returned in the results.

Parameters:data_files ([job.seed.types.SeedInputFiles]) – Object containing manifest details on input files.
Returns:Dict with each file parameter name mapping to a list of absolute file paths of the written files
Return type:{string: [string]}
setup_job_dir(data_files)

Sets up the directory structure for a job execution and downloads the given files

Parameters:data_files ({string: tuple(bool, string)}) – Dict with each file parameter name mapping to a bool indicating if the parameter accepts multiple files (True) and an absolute directory path
Returns:Dict with each file parameter name mapping to a list of absolute file paths of the written files
Return type:{string: [string]}
validate_input_files(files)

Validates the given file parameters to make sure they are valid with respect to the job interface.

Parameters:files ([job.seed.types.SeedInputFiles]) – List of Seed Input Files
Returns:A list of warnings discovered during validation.
Return type:[job.configuration.data.job_data.ValidationWarning]

Raises:job.configuration.data.exceptions.InvalidData – If there is a configuration problem.

validate_input_json(input_json)

Validates the given property names to ensure they are all populated correctly and exist if they are required.

Parameters:input_json ([job.seed.types.SeedInputJson]) – List of Seed input json fields
Returns:A list of warnings discovered during validation.
Return type:[job.configuration.data.job_data.ValidationWarning]

Raises:job.configuration.data.exceptions.InvalidData – If there is a configuration problem.

validate_output_files(files)

Validates the given file parameters to make sure they are valid with respect to the job interface.

Parameters:files ([string]) – List of file parameter names
Returns:A list of warnings discovered during validation.
Return type:[job.configuration.data.job_data.ValidationWarning]

Raises:job.configuration.data.exceptions.InvalidData – If there is a configuration problem.

class job.data.job_data.ValidationWarning(key, details)

Bases: object

Tracks job data configuration warnings during validation that may not prevent the job from working.
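
The split between warnings (returned) and errors (raised) used by the validate methods can be sketched as below. Both classes are local stand-ins; the `key` and `details` attributes are assumed from the constructor arguments, and the rule in `validate_inputs` (missing required names raise, unknown names warn) is a hypothetical example rather than the actual validation logic.

```python
class InvalidData(Exception):
    """Stand-in for the documented InvalidData exception."""


class ValidationWarning(object):
    """Stand-in with the documented constructor arguments (key, details)."""

    def __init__(self, key, details):
        self.key = key
        self.details = details


def validate_inputs(provided, required, optional):
    """Hypothetical check: missing required names raise, unknown names warn."""
    missing = sorted(set(required) - set(provided))
    if missing:
        raise InvalidData('Missing required inputs: %s' % ', '.join(missing))
    known = set(required) | set(optional)
    return [ValidationWarning('unknown_input', 'Unknown input %s' % name)
            for name in sorted(set(provided) - known)]


warnings = validate_inputs(['A', 'X'], ['A'], ['B'])
```

A non-empty warning list does not stop the job, whereas the exception does.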

job.data.types module

class job.data.types.JobDataFields(data)

Bases: object

name
class job.data.types.JobDataInputFiles(data)

Bases: job.data.types.JobDataFields

file_ids
class job.data.types.JobDataInputJson(data)

Bases: job.data.types.JobDataFields

value
class job.data.types.JobDataOutputFiles(data)

Bases: job.data.types.JobDataFields

workspace_id
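
The class hierarchy above reads as a set of thin wrappers, each taking a data dict and exposing one extra attribute. The sketch below reproduces that shape with stand-in classes. The dict keys, the `dict` attribute, and the use of properties are assumptions, since the documentation lists only the class and attribute names.

```python
class JobDataFields(object):
    """Base wrapper exposing the name shared by every job data field."""

    def __init__(self, data):
        self.dict = data  # raw field dict; this attribute name is an assumption

    @property
    def name(self):
        return self.dict['name']


class JobDataInputFiles(JobDataFields):
    @property
    def file_ids(self):
        return self.dict['file_ids']


class JobDataInputJson(JobDataFields):
    @property
    def value(self):
        return self.dict['value']


class JobDataOutputFiles(JobDataFields):
    @property
    def workspace_id(self):
        return self.dict['workspace_id']


files = JobDataInputFiles({'name': 'IMAGES', 'file_ids': [101, 102]})
setting = JobDataInputJson({'name': 'THRESHOLD', 'value': 0.5})
output = JobDataOutputFiles({'name': 'RESULT', 'workspace_id': 7})
```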

Module contents