API - Data Transformation Service

API

transformations/

Lists all available transformations with their metadata: latest version, date of the last change, attached files, input format, and output format.

Sample Return
{"transformations": [
    {
        "transformation_id": "1",
        "title": "ABCD XML to PanSimple",
        "description": "converts ABCD 2.06 format to the PanSimple metadata format for harvesting by the GFBio Indexer",
        "input_format": "ABCD 2.06",
        "output_format": "PanSimple",
        "version_id": "2",
        "version_date": "2019-05-16",
        "files": ["abcd_2.06-pansimple.xslt"],
        "engine": "xslt",
        "status": "stable"
    },
    {
        "transformation_id": "2",
        "title": "ABCD XML to Darwin Core",
        "description": "converts ABCD 2.06 format to DwC either as flat files or as archive depending on the parameter.",
        "input_format": "ABCD 2.06",
        "output_format": "DarwinCore",
        "version_id": "1",
        "version_date": "2019-07-16",
        "files": [
            "abcd_dwc_job.kjb",
            "abcd_dwc_transformation.ktr"
        ],
        "parameters": [
            {"parameter_name": "archive"},
            {"parameter_description": "a boolean flag indicating if a DwC Archive should be generated. Allowed Values: \"true\" or \"false\", default value: \"false\""}
        ],
        "engine": "pdi",
        "status": "experimental"
    }
]}
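
Because the endpoint returns every transformation at once, a client that needs only some of them has to filter the list itself. The following is a minimal sketch of such client-side filtering, assuming the response has the shape of the sample return above. The helper name `find_transformations` and the abbreviated `SAMPLE` payload are illustrative and not part of the service.

```python
import json

# Abbreviated copy of the sample return (only the fields used below).
SAMPLE = """
{"transformations": [
    {"transformation_id": "1", "title": "ABCD XML to PanSimple",
     "input_format": "ABCD 2.06", "output_format": "PanSimple",
     "engine": "xslt", "status": "stable"},
    {"transformation_id": "2", "title": "ABCD XML to Darwin Core",
     "input_format": "ABCD 2.06", "output_format": "DarwinCore",
     "engine": "pdi", "status": "experimental"}
]}
"""


def find_transformations(payload, input_format=None, output_format=None, status=None):
    """Return the entries of a transformations/ response that match all given criteria.

    A criterion left as None is ignored. This helper is hypothetical; the
    service itself does not offer filtering.
    """
    wanted = {
        "input_format": input_format,
        "output_format": output_format,
        "status": status,
    }
    matches = []
    for entry in payload.get("transformations", []):
        if all(value is None or entry.get(key) == value for key, value in wanted.items()):
            matches.append(entry)
    return matches


payload = json.loads(SAMPLE)
stable = find_transformations(payload, input_format="ABCD 2.06", status="stable")
print([entry["transformation_id"] for entry in stable])  # ['1']
```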

transformations/<transformation-id>

Shows information about the selected transformation and lists its versions.

Sample Return
{"transformation": {
    "transformation_id": "1",
    "title": "ABCD XML to PanSimple",
    "description": "converts ABCD 2.06 format to the PanSimple metadata format for harvesting by the GFBio Indexer",
    "input_format": "ABCD 2.06",
    "output_format": "PanSimple",
    "version_id": "2",
    "version_date": "2019-05-16",
    "files": ["abcd_2.06-pansimple.xslt"],
    "engine": "xslt",
    "versions": [
        {
            "version_id": "2",
            "version_comment": "minor changes and fixes",
            "version_date": "2019-05-16",
            "title": "ABCD XML to PanSimple",
            "description": "converts ABCD to PanSimple",
            "input_format": "ABCD 2.06",
            "output_format": "PanSimple",
            "files": ["abcd_2.06-pansimple.xslt"],
            "engine": "xslt"
        },
        {
            "version_id": "1",
            "version_comment": "initial version",
            "version_date": "2019-03-12",
            "title": "ABCD XML to PanSimple",
            "description": "converts ABCD 2.06 format to the PanSimple metadata format for harvesting by the GFBio Indexer",
            "input_format": "ABCD 2.06",
            "output_format": "PanSimple",
            "files": ["abcd_2.06-pansimple.xslt"],
            "engine": "xslt"
        }
    ]
}}
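
The sample lists versions newest first, but a client may not want to rely on that ordering. As a sketch, assuming the `versions` array has the shape shown above, the most recent version can be picked by comparing the IDs numerically. The helper `latest_version` is hypothetical.

```python
def latest_version(transformation):
    """Return the version entry with the highest numeric version_id, or None if there is none."""
    versions = transformation.get("versions", [])
    if not versions:
        return None
    # IDs arrive as strings; compare them as integers so that "10" ranks above "9".
    return max(versions, key=lambda version: int(version["version_id"]))


# Abbreviated copy of the sample return.
detail = {"transformation": {
    "transformation_id": "1",
    "versions": [
        {"version_id": "2", "version_comment": "minor changes and fixes", "version_date": "2019-05-16"},
        {"version_id": "1", "version_comment": "initial version", "version_date": "2019-03-12"},
    ],
}}

newest = latest_version(detail["transformation"])
print(newest["version_id"], newest["version_date"])  # 2 2019-05-16
```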

transformations/<transformation-id>/<file-name>

Shows or downloads the most recent version of the specified file.

transformations/<transformation-id>/<version-id>

Shows information about the selected version of the transformation.

Sample Return
{"version": {
    "transformation_id": "1",
    "version_id": "2",
    "version_comment": "minor changes and fixes",
    "version_date": "2019-05-16",
    "title": "ABCD XML to PanSimple",
    "description": "converts ABCD to PanSimple",
    "input_format": "ABCD 2.06",
    "output_format": "PanSimple",
    "files": ["abcd_2.06-pansimple.xslt"],
    "engine": "xslt"
}}

transformations/<transformation-id>/<version-id>/<file-name>

Shows or downloads the specified file at the time of the specified version.
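
The two file endpoints differ only in the optional version segment, so one helper can build both addresses. The sketch below assumes that all endpoints live under the base URL that appears in the sample query of the transform section. That base and the helper name `transformation_file_url` are assumptions, not documented behaviour.

```python
from urllib.parse import quote

# Assumption: the base is inferred from the sample query for `transform`.
API_BASE = "https://data-transformation.gfbio.org/api/"


def transformation_file_url(transformation_id, file_name, version_id=None, base=API_BASE):
    """Build the URL of a transformation file, optionally pinned to a version."""
    parts = ["transformations", str(transformation_id)]
    if version_id is not None:
        parts.append(str(version_id))
    parts.append(file_name)
    # Percent-encode each path segment separately so a "/" in a name cannot add segments.
    return base + "/".join(quote(part, safe="") for part in parts)


# Most recent version of the file:
print(transformation_file_url(1, "abcd_2.06-pansimple.xslt"))
# The file as it was in version 2:
print(transformation_file_url(1, "abcd_2.06-pansimple.xslt", version_id=2))
```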

transform

Creates a new job for a transformation. Parameters are passed as GET query parameters. The response is a redirect to the results page of the newly created job.

Parameters
Parameter | Description | Values | Example
transformation-id | the id of the requested transformation | positive integer | 1
version-id | the id of the specific version of the transformation. Optional. If none is supplied, the most recent version is used. | positive integer | 2
input-file-url | the URL to the file which should be transformed | url-encoded link | https%3A%2F%2Fdata.example.org%2Fmy-dataset%2Fobservations.zip
input-file-zipped | A boolean flag indicating that the input file is compressed and needs to be extracted prior to transformation. | true or false (default: false) | true
additional parameters | Additional parameters that are specified by the specific transformation. All parameters must start with an underscore to avoid conflicts with existing parameters. | mixed | _custom_parameter=value
Sample Query

https://data-transformation.gfbio.org/api/transform?transformation-id=1&version-id=2&input-file-url=https%3A%2F%2Fdata.example.org%2Fmy-dataset%2Fobservations.zip&input-file-zipped=true&_target-file-name=my_data_results.xml
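
Assembling such a query by hand is error-prone because the input file URL must itself be URL-encoded. The following sketch builds the query string with Python's standard library, using the parameter names from the table above. The function name `build_transform_url` and its rule of prefixing custom parameters with an underscore when the caller omits it are illustrative choices, not part of the API.

```python
from urllib.parse import urlencode

TRANSFORM_ENDPOINT = "https://data-transformation.gfbio.org/api/transform"


def build_transform_url(transformation_id, input_file_url, version_id=None,
                        input_file_zipped=False, custom_parameters=None):
    """Build a GET URL for the transform endpoint (hypothetical helper)."""
    query = {"transformation-id": str(transformation_id)}
    if version_id is not None:
        # Optional: without it the service uses the most recent version.
        query["version-id"] = str(version_id)
    query["input-file-url"] = input_file_url
    query["input-file-zipped"] = "true" if input_file_zipped else "false"
    for name, value in (custom_parameters or {}).items():
        # Transformation-specific parameters must start with an underscore.
        key = name if name.startswith("_") else "_" + name
        query[key] = str(value)
    # urlencode percent-encodes every value, including the nested input file URL.
    return TRANSFORM_ENDPOINT + "?" + urlencode(query)


url = build_transform_url(
    1,
    "https://data.example.org/my-dataset/observations.zip",
    version_id=2,
    input_file_zipped=True,
    custom_parameters={"target-file-name": "my_data_results.xml"},
)
print(url)
```

Passing the raw, unencoded file URL and letting `urlencode` handle the escaping avoids accidental double encoding.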

results/<job-id>

Shows the status of the specified job.

Sample Return (while transformation is still ongoing)
{"job": {
    "job_id": "9297105672",
    "transformation_id": "1",
    "version_id": "2",
    "input_file_url": "https://data.example.org/my-dataset/observations.zip",
    "input_file_zipped": "false",
    "parameters": [{"result_file_name": "my_data_results.xml"}],
    "status": "in_progress",
    "start_time": "2019-07-15T13:37:24.782"
}}
Sample Return (after transformation is finished)
{"job": {
    "job_id": "9297105672",
    "transformation_id": "1",
    "version_id": "2",
    "input_file_url": "https://data.example.org/my-dataset/observations.zip",
    "input_file_zipped": "false",
    "parameters": [{"result_file_name": "my_data_results.xml"}],
    "status": "complete",
    "start_time": "2019-07-15T13:37:24.782",
    "finish_time": "2019-07-15T13:37:26.275",
    "result_file": "my_data_results.xml",
    "combined_download": "9297105672.zip",
    "job_expiration_date": "2019-08-15T13:37:26.275"
}}
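
Since the status changes from "in_progress" to "complete" while the job runs, a client typically polls this endpoint. The sketch below shows one way to do that, under two assumptions that the page does not confirm: the endpoint lives under the same base URL as the sample query for transform, and any status other than "in_progress" is final. The names `fetch_job_over_http` and `wait_for_job` are hypothetical, and the demonstration uses canned responses instead of a live request.

```python
import json
import time
import urllib.request

# Assumption: the base is inferred from the sample query for `transform`.
API_BASE = "https://data-transformation.gfbio.org/api/"


def fetch_job_over_http(job_id, base=API_BASE):
    """Fetch results/<job-id> and decode the JSON body (not called in this sketch)."""
    with urllib.request.urlopen(f"{base}results/{job_id}") as response:
        return json.load(response)


def wait_for_job(fetch_job, poll_interval=2.0, timeout=300.0):
    """Call fetch_job() repeatedly until the job leaves the "in_progress" state.

    fetch_job is any zero-argument callable returning a decoded results/<job-id>
    response. Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()["job"]
        if job.get("status") != "in_progress":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job.get('job_id')} still in progress after {timeout} s")
        time.sleep(poll_interval)


# Demonstration with canned responses modelled on the two sample returns.
canned = iter([
    {"job": {"job_id": "9297105672", "status": "in_progress"}},
    {"job": {"job_id": "9297105672", "status": "complete",
             "result_file": "my_data_results.xml"}},
])
finished = wait_for_job(lambda: next(canned), poll_interval=0)
print(finished["status"], finished["result_file"])  # complete my_data_results.xml
```

Against the live service, the callable could be `lambda: fetch_job_over_http("9297105672")`.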

results/<job-id>/<file-name>

Shows or downloads the specified result file or combined download.

results/<job-id>/delete

Removes all data associated with the specified job. If the job is still running, it is aborted.

Sample Return
{"job": {
    "job_id": "9297105672",
    "status": "removed"
}}

Additional Remarks

  • Combined downloads will include:
    • original data file
    • transformation metadata
    • job metadata
    • transformation files
    • result files
  • Jobs have an automatic expiration date, after which all associated files are removed.
  • Versions can be marked as deprecated. Deprecated versions are not shown in the version list but are still served when requested explicitly.
  • Version IDs and transformation IDs are sequential numbers, whereas job IDs are long random numbers, so that only the client that requested a job can access its results.
  • Filter or search capabilities for transformations/ may become useful at a later point.
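
Because job files disappear after the expiration date, a client may want to check how long a result stays available. The sketch below reads the timestamps from a finished job as shown in the sample return for results/<job-id>. It assumes the timestamps are ISO 8601 values without a time zone and that all of them share the same clock, which the page does not state. The helpers `job_timing` and `is_expired` are hypothetical.

```python
from datetime import datetime


def job_timing(job):
    """Return the runtime in seconds and the retention period in days of a finished job."""
    start = datetime.fromisoformat(job["start_time"])
    finish = datetime.fromisoformat(job["finish_time"])
    expires = datetime.fromisoformat(job["job_expiration_date"])
    return {
        "runtime_seconds": (finish - start).total_seconds(),
        "retention_days": (expires - finish).days,
    }


def is_expired(job, now):
    """True if `now` (a naive datetime on the same clock as the job) is at or past the expiration date."""
    return now >= datetime.fromisoformat(job["job_expiration_date"])


# Timestamps copied from the sample return of a finished job.
job = {
    "job_id": "9297105672",
    "start_time": "2019-07-15T13:37:24.782",
    "finish_time": "2019-07-15T13:37:26.275",
    "job_expiration_date": "2019-08-15T13:37:26.275",
}

print(job_timing(job))
print(is_expired(job, datetime(2019, 8, 1)))   # False
print(is_expired(job, datetime(2019, 9, 1)))   # True
```

The retention period computed here describes the sample data only and should not be read as the service's guaranteed policy.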