API - Data Transformation Service
Contents
- 1 API
- 1.1 transformations/
- 1.2 transformations/<transformation-id>
- 1.3 transformations/<transformation-id>/<file-name>
- 1.4 transformations/<transformation-id>/<version-id>
- 1.5 transformations/<transformation-id>/<version-id>/<file-name>
- 1.6 transform
- 1.7 results/<job-id>
- 1.8 results/<job-id>/<file-name>
- 1.9 results/<job-id>/delete
- 1.10 Additional Remarks
API
transformations/
Lists all available transformations. Each entry contains the metadata, the latest version and its date, the attached files, and the input and output formats.
{"transformations": [
{
"transformation_id": "1",
"title": "ABCD XML to PanSimple",
"description": "converts ABCD 2.06 format to the PanSimple metadata format for harvesting by the GFBio Indexer",
"input_format": "ABCD 2.06",
"output_format": "PanSimple",
"version_id": "2",
"version_date": "2019-05-16",
"files": ["abcd_2.06-pansimple.xslt"],
"engine": "xslt",
"status": "stable"
},
{
"transformation_id": "2",
"title": "ABCD XML to Darwin Core",
"description": "converts ABCD 2.06 format to DwC either as flat files or as archive depending on the parameter.",
"input_format": "ABCD 2.06",
"output_format": "DarwinCore",
"version_id": "1",
"version_date": "2019-07-16",
"files": [
"abcd_dwc_job.kjb",
"abcd_dwc_transformation.ktr"
],
"parameters": [
{
"parameter_name": "archive",
"parameter_description": "a boolean flag indicating whether a DwC Archive should be generated. Allowed values: \"true\" or \"false\"; default value: \"false\""
}
],
"engine": "pdi",
"status": "experimental"
}
]}
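A client will typically parse this listing and pick a transformation by input format or status. The following is a minimal sketch, not part of the service itself: it works on an abbreviated copy of the example response above instead of a live request, and the helper name `find_transformations` is hypothetical.

```python
import json

# Abbreviated copy of the example response above. A real client would
# fetch this document from the transformations/ endpoint instead.
SAMPLE_RESPONSE = """
{"transformations": [
  {"transformation_id": "1", "title": "ABCD XML to PanSimple",
   "input_format": "ABCD 2.06", "output_format": "PanSimple",
   "version_id": "2", "engine": "xslt", "status": "stable"},
  {"transformation_id": "2", "title": "ABCD XML to Darwin Core",
   "input_format": "ABCD 2.06", "output_format": "DarwinCore",
   "version_id": "1", "engine": "pdi", "status": "experimental"}
]}
"""


def find_transformations(listing, input_format, status=None):
    """Return the entries that accept input_format, optionally filtered by status."""
    return [
        entry
        for entry in listing["transformations"]
        if entry["input_format"] == input_format
        and (status is None or entry["status"] == status)
    ]


listing = json.loads(SAMPLE_RESPONSE)
stable = find_transformations(listing, "ABCD 2.06", status="stable")
for entry in stable:
    print(entry["transformation_id"], entry["title"], "->", entry["output_format"])
```

Note that all IDs are delivered as JSON strings in the examples, so the sketch compares and prints them as strings.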
transformations/<transformation-id>
Shows information about the selected transformation and lists its versions.
{"transformation": {
"transformation_id": "1",
"title": "ABCD XML to PanSimple",
"description": "converts ABCD 2.06 format to the PanSimple metadata format for harvesting by the GFBio Indexer",
"input_format": "ABCD 2.06",
"output_format": "PanSimple",
"version_id": "2",
"version_date": "2019-05-16",
"files": ["abcd_2.06-pansimple.xslt"],
"engine": "xslt",
"versions": [
{
"version_id": "2",
"version_comment": "minor changes and fixes",
"version_date": "2019-05-16",
"title": "ABCD XML to PanSimple",
"description": "converts ABCD to PanSimple",
"input_format": "ABCD 2.06",
"output_format": "PanSimple",
"files": ["abcd_2.06-pansimple.xslt"],
"engine": "xslt"
},
{
"version_id": "1",
"version_comment": "initial version",
"version_date": "2019-03-12",
"title": "ABCD XML to PanSimple",
"description": "converts ABCD 2.06 format to the PanSimple metadata format for harvesting by the GFBio Indexer",
"input_format": "ABCD 2.06",
"output_format": "PanSimple",
"files": ["abcd_2.06-pansimple.xslt"],
"engine": "xslt"
}
]
}}
transformations/<transformation-id>/<file-name>
Shows or downloads the most recent version of the specified file.
transformations/<transformation-id>/<version-id>
Shows information about the selected version of the transformation.
{"version": {
"transformation_id": "1",
"version_id": "2",
"version_comment": "minor changes and fixes",
"version_date": "2019-05-16",
"title": "ABCD XML to PanSimple",
"description": "converts ABCD to PanSimple",
"input_format": "ABCD 2.06",
"output_format": "PanSimple",
"files": ["abcd_2.06-pansimple.xslt"],
"engine": "xslt"
}}
transformations/<transformation-id>/<version-id>/<file-name>
Shows or downloads the specified file as it was in the specified version.
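The two file endpoints differ only in the optional version segment. The sketch below builds both URL forms. It assumes that the file endpoints live under the same base URL as the `transform` example on this page. The function name `transformation_file_url` is hypothetical.

```python
from urllib.parse import quote

# Assumption: same base URL as the transform example on this page.
API_BASE = "https://data-transformation.gfbio.org/api"


def transformation_file_url(transformation_id, file_name, version_id=None):
    """Build the URL of a transformation file, optionally pinned to a version."""
    parts = ["transformations", str(transformation_id)]
    if version_id is not None:
        # With a version segment, the file is served as of that version.
        parts.append(str(version_id))
    parts.append(file_name)
    # Percent-encode each path segment separately so that special
    # characters in a file name cannot alter the path structure.
    return API_BASE + "/" + "/".join(quote(part, safe="") for part in parts)


print(transformation_file_url(1, "abcd_2.06-pansimple.xslt"))
print(transformation_file_url(1, "abcd_2.06-pansimple.xslt", version_id=2))
```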
transform
Creates a new job for a transformation. The parameters are passed as GET query parameters. The response is a redirect to the result page of the newly created job.
| Parameter | Description | Values | Example |
|---|---|---|---|
| transformation-id | the id of the requested transformation | positive integer | 1 |
| version-id | the id of the specific version of the transformation. Optional. If none is supplied, the most recent version is used. | positive integer | 2 |
| input-file-url | the URL to the file which should be transformed | url-encoded link | https%3A%2F%2Fdata.example.org%2Fmy-dataset%2Fobservations.zip |
| input-file-zipped | A boolean flag indicating that the input file is compressed and needs to be extracted prior to transformation. | true or false (default: false) | true |
| additional parameters | Additional parameters that are specified by the specific transformation. All parameters must start with an underscore to avoid conflicts with existing parameters. | mixed | _custom_parameter=value |
https://data-transformation.gfbio.org/api/transform?transformation-id=1&version-id=2&input-file-url=https%3A%2F%2Fdata.example.org%2Fmy-dataset%2Fobservations.zip&input-file-zipped=true&_target-file-name=my_data_results.xml
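Assembling such a request URL by hand is error-prone because the input file URL has to be percent-encoded. The sketch below builds it from the parameter names in the table above using the standard library. The function name `build_transform_url` and its argument names are hypothetical. Rejecting additional parameters that lack the leading underscore is a client-side choice, not documented server behaviour.

```python
from urllib.parse import urlencode

API_BASE = "https://data-transformation.gfbio.org/api"


def build_transform_url(transformation_id, input_file_url, version_id=None,
                        input_file_zipped=False, extra=None):
    """Build the GET URL that creates a new transformation job."""
    params = {"transformation-id": transformation_id}
    if version_id is not None:
        # Optional: without it, the most recent version is used.
        params["version-id"] = version_id
    params["input-file-url"] = input_file_url
    params["input-file-zipped"] = "true" if input_file_zipped else "false"
    for name, value in (extra or {}).items():
        # Transformation-specific parameters must start with an underscore.
        if not name.startswith("_"):
            raise ValueError(f"additional parameter {name!r} must start with '_'")
        params[name] = value
    # urlencode percent-encodes every value, including the input file URL.
    return f"{API_BASE}/transform?{urlencode(params)}"


url = build_transform_url(
    1,
    "https://data.example.org/my-dataset/observations.zip",
    version_id=2,
    input_file_zipped=True,
    extra={"_target-file-name": "my_data_results.xml"},
)
print(url)
```

Since the service answers with a redirect to the result page, an HTTP client that follows redirects ends up on `results/<job-id>` directly.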
results/<job-id>
Status page for the job. The first example shows a job that is still running, the second one a completed job.
{"job": {
"job_id": "9297105672",
"transformation_id": "1",
"version_id": "2",
"input_file_url":"https://data.example.org/my-dataset/observations.zip",
"input_file_zipped":"false",
"parameters": [{"result_file_name": "my_data_results.xml"}],
"status": "in_progress",
"start_time": "2019-07-15T13:37:24.782"
}}
{"job": {
"job_id": "9297105672",
"transformation_id": "1",
"version_id": "2",
"input_file_url":"https://data.example.org/my-dataset/observations.zip",
"input_file_zipped":"false",
"parameters": [{"result_file_name": "my_data_results.xml"}],
"status": "complete",
"start_time": "2019-07-15T13:37:24.782",
"finish_time": "2019-07-15T13:37:26.275",
"result_file": "my_data_results.xml",
"combined_download": "9297105672.zip",
"job_expiration_date": "2019-08-15T13:37:26.275"
}}
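Because a job starts out as `in_progress`, a client has to poll the status page until the status changes. The following sketch shows one way to do this. The function `wait_for_job`, the polling interval, and the retry limit are assumptions, and the status fetch is injected as a callable so that the example runs without network access. Only the status values `in_progress` and `complete` appear on this page, so any other value is simply returned to the caller.

```python
import time


def wait_for_job(fetch_status, poll_seconds=2.0, max_polls=30):
    """Poll until the job leaves the "in_progress" state and return its record.

    fetch_status is any callable returning the parsed JSON of results/<job-id>.
    """
    for _ in range(max_polls):
        job = fetch_status()["job"]
        if job["status"] != "in_progress":
            return job
        time.sleep(poll_seconds)
    raise TimeoutError("job is still in progress after the last poll")


# Stand-in for HTTP requests: replays abbreviated versions of the two
# example responses above.
responses = iter([
    {"job": {"job_id": "9297105672", "status": "in_progress"}},
    {"job": {"job_id": "9297105672", "status": "complete",
             "result_file": "my_data_results.xml",
             "combined_download": "9297105672.zip"}},
])

job = wait_for_job(lambda: next(responses), poll_seconds=0)
print(job["status"], job["result_file"], job["combined_download"])
```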
results/<job-id>/<file-name>
Shows or downloads the specified result file or combined download.
results/<job-id>/delete
Removes all data associated with the job. If the job is still running, it is aborted.
Additional Remarks
- Combined downloads will include:
  - original data file
  - transformation metadata
  - job metadata
  - transformation files
  - result files
- Jobs expire automatically: after the expiration date, all associated files are removed.
- Versions can be marked as deprecated. Deprecated versions are not shown in the version list, but they are still served when requested explicitly.
- Version IDs and transformation IDs are sequential numbers, whereas job IDs are long random numbers, so that only the client that requested a job can retrieve its result information.
- At a later point it may become useful to offer filter or search capabilities for /transformations/.