job.execution package

Submodules

job.execution.container module

Defines the methods for handling file systems in the job execution’s local container volume

job.execution.container.get_job_exe_input_vol_name(job_exe)

Returns the container input volume name for the given job execution

Parameters:job_exe (job.models.JobExecution) – The job execution model (must not be queued) with related job and job_type fields
Returns:The container input volume name
Return type:string
Raises:Exception – If the job execution is still queued
job.execution.container.get_job_exe_output_vol_name(job_exe)

Returns the container output volume name for the given job execution

Parameters:job_exe (job.models.JobExecution) – The job execution model (must not be queued) with related job and job_type fields
Returns:The container output volume name
Return type:string
Raises:Exception – If the job execution is still queued
job.execution.container.get_mount_volume_name(job_exe, mount_name)

Returns the name of the mount’s container volume for the given job execution

Parameters:
  • job_exe (job.models.JobExecution) – The job execution model (must not be queued) with related job and job_type fields
  • mount_name (string) – The name of the mount
Returns:

The mount’s container volume name

Return type:

string

Raises:

Exception – If the job execution is still queued

job.execution.container.get_workspace_volume_name(job_exe, workspace)

Returns the name of the workspace’s container volume for the given job execution

Parameters:
  • job_exe (job.models.JobExecution) – The job execution model (must not be queued) with related job and job_type fields
  • workspace (string) – The name of the workspace
Returns:

The workspace’s container volume name

Return type:

string

Raises:

Exception – If the job execution is still queued

job.execution.exceptions module

Defines exceptions that can occur when interacting with job executions

exception job.execution.exceptions.InvalidTaskResults

Bases: exceptions.Exception

Exception indicating that the provided task results JSON was invalid

job.execution.job_exe module

Defines the class that represents running job executions

class job.execution.job_exe.RunningJobExecution(agent_id, job_exe, job_type, configuration, priority)

Bases: object

This class represents a currently running job execution. This class is thread-safe.

check_for_starvation(when)

Checks this job execution to see if it has been starved of resources for its next task

Parameters:when (datetime.datetime) – The current time
Returns:Whether this job execution has been starved
Return type:bool
create_job_exe_end_model()

Creates and returns a job execution end model for this job execution. Caller must ensure that this job execution is finished before calling.

Returns:The job execution end model
Return type:job.models.JobExecutionEnd
current_task

Returns the currently running task of the job execution, or None if no task is currently running

Returns:The current task, possibly None
Return type:job.tasks.base_task.Task
error

Returns this job execution’s error, None if there is no error

Returns:The error, possibly None
Return type:error.models.Error
error_category

Returns the category of this job execution’s error, None if there is no error

Returns:The error category, possibly None
Return type:bool
execution_canceled(when)

Cancels this job execution

Parameters:when (datetime.datetime) – The time that the execution was canceled
execution_lost(when)

Fails this job execution for its node becoming lost

Parameters:when (datetime.datetime) – The time that the node was lost
execution_timed_out(task, when)

Fails this job execution for timing out

Parameters:
  • task (job.tasks.exe_task.JobExecutionTask) – The task that timed out
  • when (datetime.datetime) – The time that the job execution timed out
finished

When this job execution finished, possibly None

Returns:When this job execution finished, possibly None
Return type:datetime.datetime
get_container_names()

Returns the list of container names for all tasks in this job execution

Returns:The list of all container names
Return type:[string]
is_finished()

Indicates whether this job execution is finished with all tasks

Returns:True if all tasks are finished, False otherwise
Return type:bool
is_next_task_ready()

Indicates whether the next task in this job execution is ready

Returns:True if the next task is ready, False otherwise
Return type:bool
next_task()

Returns the next task in this job execution. Returns None if there are no remaining tasks.

Returns:The next task, possibly None
Return type:job.tasks.base_task.Task
start_next_task()

Starts the next task in the job execution and returns it. Returns None if the next task is not ready or no tasks remain.

Returns:The new task that was started, possibly None
Return type:job.tasks.base_task.Task
status

Returns the status of this job execution

Returns:The status of the job execution
Return type:string
task_update(task_update)

Updates a task for this job execution

Parameters:task_update (job.tasks.update.TaskStatusUpdate) – The task update

job.execution.manager module

Defines the class that manages job executions

class job.execution.manager.JobExecutionManager

Bases: object

This class manages all running and finished job executions. This class is thread-safe.

add_canceled_job_exes(job_exe_ends)

Adds the given job_exe_end models for job executions canceled off of the queue

Parameters:job_exe_ends (list()) – The job_exe_end models to add
check_for_starvation(when)

Checks all of the currently running job executions for resource starvation. If any starved executions are found, they are failed and returned.

Parameters:when (datetime.datetime) – The current time
Returns:A list of the starved job executions
Return type:list()
clear()

Clears all data from the manager. This method is intended for testing only.

generate_status_json(nodes_list, when)

Generates the portion of the status JSON that describes the job execution metrics

Parameters:
  • nodes_list (list()) – The list of nodes within the status JSON
  • when (datetime.datetime) – The current time
get_messages()

Returns all messages related to jobs and executions that need to be sent

Returns:The list of job-related messages to send
Return type:list()
get_running_job_exe(cluster_id)

Returns the running job execution with the given cluster ID, or None if the job execution does not exist

Parameters:cluster_id (int) – The cluster ID of the job execution to return
Returns:The running job execution with the given cluster ID, possibly None
Return type:job.execution.job_exe.RunningJobExecution
get_running_job_exes()

Returns all currently running job executions

Returns:A list of running job executions
Return type:[job.execution.job_exe.RunningJobExecution]
handle_task_timeout(task, when)

Handles the timeout of the given task

Parameters:
handle_task_update(task_update)

Handles the given task update and returns the associated job execution if it has finished

Parameters:task_update (job.tasks.update.TaskStatusUpdate) – The task update
Returns:The job execution if it has finished, None otherwise
Return type:job.execution.job_exe.RunningJobExecution
init_with_database()

Initializes the job execution metrics with the execution history from the database

lost_job_exes(job_exe_ids, when)

Informs the manager that the job executions with the given IDs were lost

Parameters:
  • job_exe_ids (list()) – The IDs of the lost job executions
  • when (datetime.datetime) – The time that the executions were lost
Returns:

A list of the finished job executions

Return type:

list()

lost_node(node_id, when)

Informs the manager that the node with the given ID was lost and has gone offline

Parameters:
  • node_id (int) – The ID of the lost node
  • when (datetime.datetime) – The time that the node was lost
Returns:

A list of the finished job executions

Return type:

list()

schedule_job_exes(job_exes, messages)

Adds newly scheduled running job executions to the manager

Parameters:
  • job_exes (list()) – A list of the running job executions to add
  • messages (list()) – The messages for the running jobs
sync_with_database()

Syncs with the database to handle any canceled executions. Any job executions that are now finished are returned.

Returns:A list of the finished job executions
Return type:list()

job.execution.metrics module

Defines the classes that manages job execution metrics

class job.execution.metrics.FinishedJobExeMetrics

Bases: object

This class holds metrics for finished job executions

add_job_execution(job_exe)

Adds the given job execution to the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
count

Returns the total number of finished job executions

Returns:The total number of finished job executions
Return type:int
generate_status_json(json_dict)

Generates the portion of the status JSON that describes these finished job executions

Parameters:json_dict (dict) – The JSON dict to add these metrics to
subtract_metrics(metrics)

Subtracts the given metrics

Parameters:metrics (job.execution.manager.FinishedJobExeMetrics) – The metrics to subtract
class job.execution.metrics.FinishedJobExeMetricsByNode

Bases: object

This class holds metrics for finished job executions, grouped by node

EMPTY_METRICS = <job.execution.metrics.FinishedJobExeMetrics object>
add_job_execution(job_exe)

Adds the given job execution to the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
generate_status_json(nodes_list)

Generates the portion of the status JSON that describes these finished job executions

Parameters:nodes_list (list()) – The list of nodes within the status JSON
subtract_metrics(metrics)

Subtracts the given metrics

Parameters:metrics (job.execution.manager.FinishedJobExeMetricsByNode) – The metrics to subtract
class job.execution.metrics.FinishedJobExeMetricsOverTime(when)

Bases: object

This class holds metrics for finished job executions, grouped and sorted into time blocks

BLOCK_LENGTH = datetime.timedelta(0, 300)
class TIME_BLOCK(start, end, metrics)

Bases: tuple

end

Alias for field number 1

metrics

Alias for field number 2

start

Alias for field number 0

TOTAL_TIME_PERIOD = datetime.timedelta(0, 10800)
add_job_execution(job_exe)

Adds the given job execution to the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
update_to_now(when)

Updates the metrics to now by removing any time blocks that are older than our time period and creates any needed new blocks

Parameters:when (datetime.datetime) – The current time
Returns:The list of removed metrics classes for old time blocks
Return type:[job.execution.metrics.FinishedJobExeMetricsByNode]
class job.execution.metrics.JobExeMetrics

Bases: object

This class holds metrics for a list of job executions

add_job_execution(job_exe)

Adds the given job execution to the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
generate_status_json(json_dict)

Generates the portion of the status JSON that describes this list of job executions

Parameters:json_dict (dict) – The JSON dict to add these metrics to
remove_job_execution(job_exe)

Removes the given job execution from the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
subtract_metrics(metrics)

Subtracts the given metrics

Parameters:metrics (job.execution.manager.JobExeMetrics) – The metrics to subtract
class job.execution.metrics.JobExeMetricsByType

Bases: object

This class holds metrics for job executions grouped by their type

add_job_execution(job_exe)

Adds the given job execution to the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
generate_status_json(json_dict)

Generates the portion of the status JSON that describes this group of job executions

Parameters:json_dict (dict) – The JSON dict to add these metrics to
remove_job_execution(job_exe)

Removes the given job execution from the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
subtract_metrics(metrics)

Subtracts the given metrics

Parameters:metrics (job.execution.manager.JobExeMetricsByType) – The metrics to subtract
class job.execution.metrics.RunningJobExeMetricsByNode

Bases: object

This class holds metrics for running job executions grouped by their node

EMPTY_METRICS = <job.execution.metrics.JobExeMetricsByType object>
add_job_execution(job_exe)

Adds the given job execution to the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
generate_status_json(nodes_list)

Generates the portion of the status JSON that describes these running job executions

Parameters:nodes_list (list()) – The list of nodes within the status JSON
remove_job_execution(job_exe)

Removes the given job execution from the metrics

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The job execution
subtract_metrics(metrics)

Subtracts the given metrics

Parameters:metrics (job.execution.manager.RunningJobExeMetricsByNode) – The metrics to subtract
class job.execution.metrics.TotalJobExeMetrics(when)

Bases: object

This class handles all real-time metrics for job executions

add_running_job_exes(job_exes)

Adds newly scheduled running job executions to the metrics

Parameters:job_exes ([job.execution.job_exe.RunningJobExecution]) – A list of the running job executions to add
generate_status_json(nodes_list, when)

Generates the portion of the status JSON that describes the job execution metrics

Parameters:
  • nodes_list (list()) – The list of nodes within the status JSON
  • when (datetime.datetime) – The current time
init_with_database(*args, **kwargs)
job_exe_finished(job_exe)

Handles a running job execution that has finished

Parameters:job_exe (job.execution.job_exe.RunningJobExecution) – The finished job execution

Module contents