This repository implements the internals of the HuBMAP data repository processing pipeline. The code is independent of the UI but responds to requests from the data-ingest UI backend.
`devtest` is a mock assay for use by developers. It provides a testing tool, controlled by a simple YAML file, that lets a developer simulate execution of a full ingest pipeline without the need for real data. To do a devtest run, follow this procedure:
- Create an input dataset, for example using the ingest UI.
- It must have a valid Source ID.
- Its datatype must be `Other -> devtest`.
- Insert a control file named `test.yml` into the top-level directory of the dataset; the file format is described below. Any other files may be present in the directory, as long as `test.yml` exists.
- Submit the dataset.
Ingest operations will proceed normally from that point:
- The state of the original dataset will change from New through Processing to QA.
- A secondary dataset will be created, and will move through Processing to QA with an adjustable delay (see below).
- Files listed in `test.yml` may be copied into the dataset directory of the secondary dataset.
- All normal metadata will be returned, including any extra metadata specified in `test.yml` (see below).
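The state progression above can be sketched as a small state machine. The state names mirror those in this document, but the enum and transition table below are purely illustrative, not the pipeline's actual implementation:

```python
from enum import Enum

class DatasetState(Enum):
    NEW = "New"
    PROCESSING = "Processing"
    QA = "QA"

# Legal forward transitions during a devtest run (illustrative only)
TRANSITIONS = {
    DatasetState.NEW: DatasetState.PROCESSING,
    DatasetState.PROCESSING: DatasetState.QA,
}

def advance(state: DatasetState) -> DatasetState:
    """Move a dataset to its next state; QA is terminal here."""
    return TRANSITIONS.get(state, state)

state = DatasetState.NEW
state = advance(state)  # Processing
state = advance(state)  # QA
state = advance(state)  # still QA (terminal)
```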
The format of `test.yml` is:

```yaml
# the following line is required for the submission to be properly identified
# as assay 'devtest'
collectiontype: devtest

# the pipeline_exec stage will delay for this many seconds before returning
# (default 30 seconds)
delay_sec: 120

# if this list is present, the listed files will be copied from the submission
# directory to the derived dataset
files_to_copy: ["file_068.bov", "file_068.doubles"]

# if present, the given metadata will be returned as dataset metadata for the
# derived dataset
metadata_to_return:
  mymessage: 'hello world'
  othermessage: 'and also this'
```
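As a sanity check before submitting, the control file's contents can be validated once parsed (e.g. with PyYAML). The checker below operates on the already-parsed dictionary; the required `collectiontype` marker and the 30-second default come from the format description above, but the function itself is an illustrative sketch, not part of the pipeline:

```python
def check_test_yml(doc: dict) -> dict:
    """Validate a parsed test.yml and fill in defaults.

    Raises ValueError if the required 'collectiontype: devtest' marker
    is missing or wrong.
    """
    if doc.get("collectiontype") != "devtest":
        raise ValueError("test.yml must contain 'collectiontype: devtest'")
    checked = dict(doc)
    # delay_sec defaults to 30 seconds per the format description
    checked.setdefault("delay_sec", 30)
    # the remaining keys are optional; normalize to empty containers
    checked.setdefault("files_to_copy", [])
    checked.setdefault("metadata_to_return", {})
    return checked

cfg = check_test_yml({
    "collectiontype": "devtest",
    "files_to_copy": ["file_068.bov", "file_068.doubles"],
})
print(cfg["delay_sec"])  # 30
```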
API Test | |
---|---|
Description | Test that the API is available |
HTTP Method | GET |
Example URL | /api/hubmap/test |
URL Parameters | None |
Data Parameters | None |
Success Response | Code: 200 Content: {"api_is_alive":true} |
Error Responses | None |
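The GET endpoints below all share the same shape, so a single helper covers them. The base URL here is a placeholder assumption; only the `/api/hubmap/...` path fragments come from the tables in this document, and the helper is a sketch, not an official client:

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # placeholder; substitute your deployment's host

def api_url(endpoint: str) -> str:
    """Build the full URL for a GET endpoint such as 'test' or 'version'."""
    return f"{BASE_URL}/api/hubmap/{endpoint}"

def api_get(endpoint: str) -> dict:
    """Issue the GET request and decode the JSON body."""
    with urllib.request.urlopen(api_url(endpoint)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# e.g. api_get("test") should return {"api_is_alive": True} on a live server
```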
Get Process Strings | |
---|---|
Description | Get a list of valid process identifier keys |
HTTP Method | GET |
Example URL | /api/hubmap/get_process_strings |
URL Parameters | None |
Data Parameters | None |
Success Response | Code: 200 Content: {"process_strings":[...list of keys...]} |
Error Responses | None |
Get Version Information | |
---|---|
Description | Get API version information |
HTTP Method | GET |
Example URL | /api/hubmap/version |
URL Parameters | None |
Data Parameters | None |
Success Response | Code: 200 Content: {"api":API version, "build":build version} |
Error Responses | None |
Request Ingest | |
---|---|
Description | Cause a workflow to be applied to a dataset in the LZ. The full dataset path is computed from the data parameters. |
HTTP Method | POST |
Example URL | /api/hubmap/request_ingest |
URL Parameters | None |
Data Parameters | `provider` : one of a known set of providers, e.g. 'Vanderbilt' <br> `submission_id` : unique identifier string for the data submission <br> `process` : one of a known set of process names, e.g. 'MICROSCOPY.IMS.ALL' |
Success Response | Code: 200 Content: `{"ingest_id":"some_unique_string", "run_id":"some_other_unique_string"}` |
Error Responses | Bad Request (Code: 400), content strings: <br> "Must specify provider to request data be ingested" <br> "Must specify sample_id to request data be ingested" <br> "Must specify process to request data be ingested" <br> "NAME is not a known ingestion process" <br> Unauthorized (Code: 401), content strings: <br> "You are not authorized to use this resource" <br> Not Found (Code: 404), content strings: <br> "Resource not found" <br> "Dag id DAG_ID not found" <br> Server Error (Code: 500), content strings: <br> "An unexpected problem occurred" <br> "The request happened twice?" <br> "Attempt to trigger run produced an error: ERROR" |
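A client-side check of the three data parameters can catch the 400-class errors before the POST is made. The error strings below are quoted from the table above (note the server reports 'sample_id' for a missing `submission_id`); the function and the example process set are otherwise illustrative sketches:

```python
def validate_ingest_request(params: dict, known_processes: set) -> list:
    """Return the 400-style error strings this request would trigger, if any."""
    errors = []
    if not params.get("provider"):
        errors.append("Must specify provider to request data be ingested")
    if not params.get("submission_id"):
        errors.append("Must specify sample_id to request data be ingested")
    process = params.get("process")
    if not process:
        errors.append("Must specify process to request data be ingested")
    elif process not in known_processes:
        errors.append(f"{process} is not a known ingestion process")
    return errors

# illustrative; fetch the real list from /api/hubmap/get_process_strings
known = {"MICROSCOPY.IMS.ALL"}
ok = validate_ingest_request(
    {"provider": "Vanderbilt", "submission_id": "abc123",
     "process": "MICROSCOPY.IMS.ALL"},
    known,
)
print(ok)  # [] -- no errors, safe to POST to /api/hubmap/request_ingest
```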