Results Manifest
The results manifest is a JSON document that describes the output of an algorithm’s run. In the results manifest, you can specify your outputs, parse information, run_information, and errors. You can also register an artifact by printing a line to stdout in the format “ARTIFACT:<output_name>:<path_to_file>”. Each artifact string must be on its own line. If an artifact conflicts with the manifest file, the manifest file takes precedence.
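As an illustration of the artifact format described above, the following is a minimal Python sketch of an algorithm registering an artifact on stdout. The helper names artifact_line and register_artifact are hypothetical and not part of Scale; the newline check is an added precaution based only on the rule that the artifact string must be on a separate line.

```python
import sys


def artifact_line(output_name: str, path: str) -> str:
    """Build one line in the ARTIFACT:<output_name>:<path_to_file> format."""
    # A newline inside either field would split the artifact string
    # across lines, so reject it up front (precaution, not a Scale rule).
    if "\n" in output_name or "\n" in path:
        raise ValueError("artifact fields must not contain newlines")
    return f"ARTIFACT:{output_name}:{path}"


def register_artifact(output_name: str, path: str) -> None:
    """Print the artifact string on its own line and flush stdout."""
    sys.stdout.write(artifact_line(output_name, path) + "\n")
    sys.stdout.flush()
```

Calling register_artifact("output_file", "/tmp/job_exe_231/outputs/output.csv") prints a single ARTIFACT line for the output named “output_file”.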
The following are some example output manifest files:
Results manifest with one output
{
"version": "1.1",
"output_data": [
{
"name" : "output_file",
"file": {
"path" : "/tmp/job_exe_231/outputs/output.csv"
}
}
]
}
The above manifest simply says that the output with the name “output_file” can be found on the local computer at the location “/tmp/job_exe_231/outputs/output.csv”.
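A manifest like the one above could be generated from Python roughly as follows. This is a sketch only: the function names are hypothetical, and the location where the manifest must be written is not stated in this section, so manifest_path is left for the caller to supply.

```python
import json


def build_single_output_manifest(name: str, path: str) -> dict:
    """Return a version 1.1 manifest dict with a single "file" output."""
    return {
        "version": "1.1",
        "output_data": [
            {"name": name, "file": {"path": path}},
        ],
    }


def write_manifest(manifest: dict, manifest_path: str) -> None:
    """Serialize the manifest as JSON to manifest_path (caller-chosen)."""
    with open(manifest_path, "w") as handle:
        json.dump(manifest, handle, indent=2)
```

Using json.dump rather than hand-written strings avoids syntax slips such as trailing commas, which make a manifest invalid JSON.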
Results manifest with a parsed input
{
"version": "1.1",
"parse_results": [
{
"filename" : "myfile.h5",
"data_types" : [
"H5",
"VEG"
],
"geo_metadata": {
"data_started" : "2015-05-15T10:34:12Z",
"data_ended" : "2015-05-15T10:36:12Z"
}
}
]
}
This example is the result of one of the inputs (myfile.h5) being parsed.
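The parse result above could be assembled in Python along these lines. This is a hedged sketch: build_parse_result and iso8601_utc are hypothetical helpers, and the choice to normalize timestamps to UTC with a trailing “Z” simply mirrors the example values rather than a documented requirement.

```python
from datetime import datetime, timezone


def iso8601_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO-8601 UTC string."""
    if moment.tzinfo is None:
        # Naive datetimes are ambiguous, so this sketch refuses them.
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_parse_result(filename, data_types, started, ended) -> dict:
    """Return one parse_results entry in the version 1.1 layout."""
    return {
        "filename": filename,
        "data_types": list(data_types),
        "geo_metadata": {
            "data_started": iso8601_utc(started),
            "data_ended": iso8601_utc(ended),
        },
    }


manifest = {
    "version": "1.1",
    "parse_results": [
        build_parse_result(
            "myfile.h5",
            ["H5", "VEG"],
            datetime(2015, 5, 15, 10, 34, 12, tzinfo=timezone.utc),
            datetime(2015, 5, 15, 10, 36, 12, tzinfo=timezone.utc),
        )
    ],
}
```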
Results Manifest Specification Version 1.1
A valid results manifest is a JSON document with the following structure:
{
"version": STRING,
"output_data": [
{
"name": STRING,
"file": {
"path": STRING,
"geo_metadata": {
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"geo_json": JSON
}
},
"files": [
{
"path": STRING,
"geo_metadata": {
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"geo_json": JSON
}
}
]
}
],
"parse_results": [
{
"filename": STRING,
"new_workspace_path": STRING,
"data_types": [
STRING,
STRING
],
"geo_metadata": {
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"geo_json": JSON
}
}
]
}
version: JSON string
The version is an optional string value that defines the version of the results manifest specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.1. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your results manifest.
output_data: JSON array
The output_data is an optional array of output files that your algorithm produced. If not provided, it defaults to an empty list. The JSON object that represents each output_data entry has the following fields:
name: JSON string
The name is a required string that indicates which field in the job_interface this output corresponds to.
file: JSON object
The file is an optional JSON object; however, either file or files must be present. The file field should be used if the “file” output_type was used in the job interface. The file object has the following fields:
path: JSON string
The path is the location of the file on the machine that ran the algorithm.
geo_metadata: JSON object
The geo_metadata contains additional geospatial information associated with the output file. It contains the following fields:
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
geo_json: JSON object
The geo_json is an optional JSON object containing the geospatial extents of the data. It is currently required that this contain a 3D geometry. In addition to storing the extents of the data, a center point is automatically calculated.
files: JSON array
The files is an optional array of JSON objects; however, either file or files must be present. The files field should be used if the “files” output_type was used in the job interface. Each files object has the following fields:
path: JSON string
The path is the location of the file on the machine that ran the algorithm.
geo_metadata: JSON object
The geo_metadata contains additional geospatial information associated with the output file. It contains the following fields:
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
geo_json: JSON object
The geo_json is an optional JSON object containing the geospatial extents of the data. It is currently required that this contain a 3D geometry. In addition to storing the extents of the data, a center point is automatically calculated.
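The output_data rules above (name is required, and either file or files must be present) can be checked before a manifest is handed to Scale. The following Python sketch is not Scale's own validation; check_output_data is a hypothetical helper that covers only the rules stated in this section.

```python
def check_output_data(manifest: dict) -> list:
    """Return a list of problems in output_data (empty if none found)."""
    problems = []
    # output_data is optional and defaults to an empty list.
    for index, entry in enumerate(manifest.get("output_data", [])):
        label = f"output_data[{index}]"
        if not isinstance(entry.get("name"), str):
            problems.append(f"{label}: name is required and must be a string")
        if "file" not in entry and "files" not in entry:
            problems.append(f"{label}: either file or files must be present")
        if "files" in entry and not isinstance(entry["files"], list):
            problems.append(f"{label}: files must be an array")
    return problems
```

An empty return value means no problems were found against these rules; it does not guarantee that Scale will accept the manifest.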
parse_results: JSON array
The parse_results is an array of JSON objects that contain information from parsing inputs to your algorithm. These results should be used to associate metadata with the algorithm’s input files. Each parse result corresponds to an input of the type “file” from the job interface. Additionally, the file must be a “source” file, that is, a file that was not produced by an algorithm. Files produced by algorithms are known as “product” files. This distinction does not matter while you are developing an algorithm, but it is important when you tie the algorithm to Scale data. Each parse_results object has the following fields:
filename: JSON string
The filename is a required JSON string that is the name of the file that you have performed the parsing on.
new_workspace_path: JSON string
The new_workspace_path is an optional JSON string that is a new location where the file should be stored.
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
data_types: JSON array
The data_types is an optional array of JSON strings. Each of the strings is a file data type that this input file can be associated with.
gis_data_path: JSON string
The gis_data_path is an optional path to a GeoJSON file. The contents of this file will be set in the meta_data for the given input file. The geometry will also be set for the file. In addition to storing the extents of the data, a center point is automatically calculated.
Results Manifest Specification Version 1.0
A valid version 1.0 results manifest is a JSON document with the following structure:
{
"version": STRING,
"files": [
{
"name": STRING,
"path": STRING
},
{
"name": STRING,
"paths": [
STRING,
STRING
]
}
],
"parse_results": [
{
"filename": STRING,
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"data_types": [
STRING,
STRING
],
"gis_data_path": STRING
}
]
}
version: JSON string
The version is an optional string value that defines the version of the results manifest specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your results manifest.
files: JSON array
The files is an optional array of output files that your algorithm produced. If not provided, files defaults to an empty list. The JSON object that represents each files entry has the following fields:
name: JSON string
The name is a required string that indicates which field in the job_interface this output corresponds to.
path: JSON string
The path is an optional string field; however, either path or paths must be present. The path is the location of the file on the machine that ran the algorithm. The path field should be used if the “file” output_type was used in the job interface.
paths: JSON array
The paths is an optional array of JSON strings; however, either path or paths must be present. Each string in the array is a path to a file that corresponds to a job_output. The paths field should be used if the “files” output_type was used in the job interface.
parse_results: JSON array
The parse_results is an array of JSON objects that contain information from parsing inputs to your algorithm. These results should be used to associate metadata with the algorithm’s input files. Each parse result corresponds to an input of the type “file” from the job interface. Additionally, the file must be a “source” file, that is, a file that was not produced by an algorithm. Files produced by algorithms are known as “product” files. This distinction does not matter while you are developing an algorithm, but it is important when you tie the algorithm to Scale data. Each parse_results object has the following fields:
filename: JSON string
The filename is a required JSON string that is the name of the file that you have performed the parsing on.
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
data_types: JSON array
The data_types is an optional array of JSON strings. Each of the strings is a file data type that this input file can be associated with.
gis_data_path: JSON string
The gis_data_path is an optional path to a GeoJSON file. The contents of this file will be set in the meta_data for the given input file. The geometry will also be set for the file. In addition to storing the extents of the data, a center point is automatically calculated.
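The version descriptions mention that Scale can recognize an older manifest and convert it to the current version. How Scale performs that conversion is not described here, so the following Python sketch is only an illustration of how the version 1.0 files array lines up with the version 1.1 output_data array, based on the two structures shown in this document. The function name convert_files_to_output_data is hypothetical, and parse_results is deliberately left out of scope.

```python
def convert_files_to_output_data(files_1_0: list) -> list:
    """Map version 1.0 "files" entries onto the 1.1 "output_data" layout."""
    output_data = []
    for entry in files_1_0:
        converted = {"name": entry["name"]}
        if "path" in entry:
            # 1.0 single path -> 1.1 "file" object with a "path" field.
            converted["file"] = {"path": entry["path"]}
        elif "paths" in entry:
            # 1.0 list of paths -> 1.1 "files" array of objects.
            converted["files"] = [{"path": p} for p in entry["paths"]]
        else:
            raise ValueError("either path or paths must be present")
        output_data.append(converted)
    return output_data
```

For example, an entry with name “output_file” and a single path becomes an output_data entry with the same name and a file object holding that path.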