
GFBio Data Transformation Service

WP3, Task 3.4

How to start transformations

(For the detailed concept and a full API documentation, please refer to the internal wiki.)

A transformation job can be initiated using the transform service, which accepts the following four parameters:
transformation: ID of the transformation to be executed,
version: version ID (optional),
input_file_url: URL of the source document,
input_file_zipped: specifies whether the source document is zipped or not.
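As an illustration of how the four parameters fit together, the following minimal sketch assembles a request URL. The parameter names are taken from the list above; the endpoint address, the use of a query string, and the encoding of the zipped flag as "true"/"false" are assumptions made for this example and should be checked against the API documentation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the real address of the transform service.
TRANSFORM_ENDPOINT = "https://example.org/transform"


def build_transform_url(transformation, input_file_url, input_file_zipped=False,
                        version=None, endpoint=TRANSFORM_ENDPOINT):
    """Assemble a request URL from the four documented parameters."""
    params = {
        "transformation": transformation,
        "input_file_url": input_file_url,
        # Assumed encoding of the flag as lowercase "true"/"false".
        "input_file_zipped": "true" if input_file_zipped else "false",
    }
    # version is optional, so it is only sent when explicitly given.
    if version is not None:
        params["version"] = version
    return f"{endpoint}?{urlencode(params)}"


# Example: transformation 1 applied to an unzipped source document.
url = build_transform_url(1, "https://example.org/data/abcd.xml")
print(url)
```

Omitting version leaves the choice of version to the service; how the service picks a default is not covered by this sketch.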

A list of transformations offered by the service can be retrieved by using this request.
 

#1: ABCD > PANGAEA PanSimple

Transforms a single ABCD document into a PanSimple document, which is used for harvesting purposes from PANGAEA (Data Publisher for Earth and Environmental Science).
Version list | Sample request

#2: ABCD > HTML Landing Page

Transforms a single ABCD document into a human-readable description of the dataset stored in the document. The page generated contains the dataset's metadata (such as title, description, contacts, taxonomic and geographic scope) and lists the individual records with their catalog number and scientific identification result. If the ABCD field RecordURI is filled, the detailed record pages are linked.
Version list | Sample request

#3: ABCD (archive) > DarwinCore Archive

This transformation creates a DarwinCore archive for an ABCD dataset. The source document can be a single ABCD file storing one dataset (parameter input_file_zipped = false) or an ABCD archive containing multiple documents (input_file_zipped = true). If the source file is an ABCD archive, the transformation can take some time to run; a rough estimate is one minute per 100,000 records.
The result file is a zipped DarwinCore archive with an occurrence.txt core file, a descriptor file meta.xml and an EML document eml.xml. In addition, if the respective fields are present in the ABCD documents, the archive may contain up to three extension files: identification.txt, multimedia.txt and measurementorfact.txt.
Version list | Sample request for single ABCD document | Sample request for ABCD archive
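To check a downloaded result against the file layout described above, a small helper along the following lines may be useful. It is only a sketch: the function name is hypothetical, the file names are taken from the description, and the .txt extension for the measurementorfact file is an assumption.

```python
import io
import zipfile

# File names as listed in the description of the result archive.
CORE_FILES = {"occurrence.txt", "meta.xml", "eml.xml"}
# The ".txt" suffix of measurementorfact is assumed here.
EXTENSION_FILES = {"identification.txt", "multimedia.txt", "measurementorfact.txt"}


def summarize_dwca(archive):
    """Return (missing core files, extension files present) for a zipped archive.

    `archive` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Compare base names so files inside a subfolder are still recognised.
        names = {name.rsplit("/", 1)[-1] for name in zf.namelist()}
    return sorted(CORE_FILES - names), sorted(EXTENSION_FILES & names)


# Self-contained demonstration using an in-memory archive with empty files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ("occurrence.txt", "meta.xml", "eml.xml", "multimedia.txt"):
        zf.writestr(name, "")
buffer.seek(0)

missing, extensions = summarize_dwca(buffer)
print(missing, extensions)
```

An empty list of missing core files indicates that the archive has the expected basic structure; the second list shows which optional extensions the transformation produced.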

#4: CDM Light > PANGAEA PanSimple

This transformation will convert a checklist from CDM Light into the PANGAEA PanSimple format.
Version list | Sample request

#5: ABCD > BioSchemas Search File

This transformation converts a single ABCD document into a BioSchemas file representing the dataset. The file is a reduced, summarized version of the ABCD document that contains only the metadata relevant for search engines.
Version list | Sample request