Darwin Core biodiversity data pipelines

This repository contains data production pipelines for building Darwin Core datasets for publication in the Global Biodiversity Information Facility (GBIF), with permanent archiving in Zenodo.

EcoTaxa

Notice: These are pre-production URLs, for testing purposes only

Export EcoTaxa data as TSV (using DOI export with images)
Publish untreated TSV and images to Zenodo
Create Darwin Core occurrences in NDJSON from EcoTaxa TSV, using ecotaxa-darwin-core
Create unique Darwin Core sampling events in NDJSON by reducing the occurrences
@todo Merge with other/authoritative event metadata (e.g. sampling volumes)
Create lists of ignored (not-living) and rejected (non-Eukaryota) objects
Create lists of rejected events (non-unique, invalid or inconsistent metadata)
Finish local processing by executing Darwin Core pipelines below

gbif-no-darwin-core$ ./bin/ecotaxa-pipeline 1420
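
The event reduction step above can be pictured with a minimal JavaScript sketch. It is not the repository's implementation: the choice of `eventDate`, `decimalLatitude` and `decimalLongitude` as the fields that define an event, the rule that any disagreement rejects the event, and the sample records are all assumptions made for illustration. Only the Darwin Core term names are standard.

```javascript
// Illustrative sketch only: the event fields and the rejection rule below are
// assumptions, not the logic used by ./bin/ecotaxa-pipeline.
const EVENT_FIELDS = ["eventDate", "decimalLatitude", "decimalLongitude"];

// Parse NDJSON text (one JSON document per line) into an array of objects.
function parseNdjson(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Reduce occurrences to one event per eventID. An eventID whose occurrences
// disagree on any event field ends up in the rejected list instead.
function reduceToEvents(occurrences) {
  const events = new Map();
  const rejected = new Set();
  for (const occ of occurrences) {
    if (!occ.eventID) continue; // occurrences without an eventID are skipped here
    const candidate = { eventID: occ.eventID };
    for (const field of EVENT_FIELDS) candidate[field] = occ[field];
    const seen = events.get(occ.eventID);
    if (seen === undefined) {
      events.set(occ.eventID, candidate);
    } else if (EVENT_FIELDS.some((field) => seen[field] !== candidate[field])) {
      rejected.add(occ.eventID);
    }
  }
  for (const id of rejected) events.delete(id);
  return { events: [...events.values()], rejected: [...rejected] };
}

// Invented sample data: e1 is consistent, e2 has two different dates.
const sample = [
  '{"occurrenceID":"o1","eventID":"e1","eventDate":"2020-05-01","decimalLatitude":60.1,"decimalLongitude":5.2}',
  '{"occurrenceID":"o2","eventID":"e1","eventDate":"2020-05-01","decimalLatitude":60.1,"decimalLongitude":5.2}',
  '{"occurrenceID":"o3","eventID":"e2","eventDate":"2020-05-02","decimalLatitude":61.0,"decimalLongitude":4.9}',
  '{"occurrenceID":"o4","eventID":"e2","eventDate":"2020-05-03","decimalLatitude":61.0,"decimalLongitude":4.9}',
].join("\n");

const result = reduceToEvents(parseNdjson(sample));
// Accepted events go to stdout as NDJSON; rejected IDs go to stderr.
for (const event of result.events) console.log(JSON.stringify(event));
console.error("rejected events:", result.rejected.join(", "));
```

Writing accepted events to stdout and rejections to stderr keeps the sketch composable with shell redirection, which suits a pipeline that is mostly shell scripts.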

Create taxonomy NDJSON by extracting occurrence taxa and checking against GBIF Species API using WoRMS
Create lists of possible taxonomy issues (not found or incertae sedis)
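
As a rough illustration of the taxonomy issue lists, the sketch below sorts name lookups into accepted taxa and possible issues. The `{ name, match }` shape is hypothetical: it stands for a queried name paired with whatever record came back, or `null` when nothing was found. It makes no network calls and does not reproduce how the pipeline actually queries the GBIF Species API or selects WoRMS; the sample records are invented.

```javascript
// Illustrative sketch only: the lookup shape and rank list are assumptions.
const RANK_FIELDS = ["kingdom", "phylum", "class", "order", "family", "genus", "scientificName"];

// True when any rank of the matched record is labelled "incertae sedis".
function isIncertaeSedis(match) {
  return RANK_FIELDS.some((rank) => /incertae sedis/i.test(match[rank] ?? ""));
}

// Split lookups into accepted taxa and a list of possible taxonomy issues.
function classifyTaxa(lookups) {
  const taxa = [];
  const issues = [];
  for (const { name, match } of lookups) {
    if (match === null) {
      issues.push({ name, issue: "not found" });
    } else if (isIncertaeSedis(match)) {
      issues.push({ name, issue: "incertae sedis" });
    } else {
      taxa.push({ name, scientificName: match.scientificName });
    }
  }
  return { taxa, issues };
}

// Invented sample lookups, one per outcome.
const lookups = [
  { name: "Calanus finmarchicus", match: { scientificName: "Calanus finmarchicus", kingdom: "Animalia" } },
  { name: "Unmatched example name", match: null },
  { name: "Unplaced example name", match: { scientificName: "Unplaced example name", kingdom: "incertae sedis" } },
];

const classified = classifyTaxa(lookups);
for (const taxon of classified.taxa) console.log(JSON.stringify(taxon));
for (const issue of classified.issues) console.error(JSON.stringify(issue));
```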

@todo

This project was co-funded by GBIF Norway; see the Data management plan for further details.

Reproducible Darwin Core data pipelines for GBIF Norway

MIT License
