iobis / Project-team-Genetic-Data

Developing guidelines for adding sequence data to OBIS

Adding Genetic Data to OBIS

Introduction

This repository is the main discussion channel for the OBIS project team on genetic data in 2021. The objective of the project team is to discuss the guidelines required for adding genetic data to OBIS, as well as how OBIS will store, access, and analyse that data.

This project team will work in conjunction with the TDWG task team for Sustainable DarwinCore MIxS Interoperability (https://www.tdwg.org/community/gbwg/MIxS/) and will use the extension agreed on by the community, reporting issues discovered through the use cases back to the task team. In addition, the guidelines under development by GBIF (https://docs.gbif-uat.org/publishing-dna-derived-data/1.0/en/) will be reviewed. OBIS will align its guidelines with them so that interoperability between GBIF and OBIS is retained, and will provide feedback to GBIF if any issues are encountered.

Goals and Outcomes

The objective of the project is to have guidelines with use cases ready to submit to the 10th Session of the SG-OBIS (Nov 2021). Initially, the GBIF guidelines will be reviewed and the DwC-MIxS extension will be tested with different use cases. Most importantly, however, the discussion will focus on how OBIS will store genetic data, how this data will be analysed and updated, and how different issues will be dealt with.

Part 1

  • Review GBIF guidelines
  • Follow DwC-MIxS interoperability developments
  • Review MIxS data fields and how well they are suited to OBIS

Part 2

  • Discussion and decisions on how sequence data will be dealt with in OBIS
    • Will OBIS store sequences or references to other databases?
    • Will OBIS analyse sequence data, i.e. have its own bioinformatics pipeline?
    • How will counts be dealt with?
    • How will unnamed or cryptic species be dealt with?
    • Will taxonomies be updated inside OBIS?
    • Will OBIS support data submission through the BIOM format?
    • How will OBIS deal with control data?
    • How will OBIS deal with simultaneous analysis of several biomarkers?

Questions and suggestions

The most up-to-date guidelines will be collected in the guidelines folder. Questions on specific issues encountered for datasets should be added to the issues tab.

Materials to help get started

OBIS organized a webinar on genetic data, with an introduction to how OBIS is incorporating data, how genetic data can be accessed and a use case from the first eDNA dataset provided by OBIS-USA. The recording of the webinar can be watched here.

In addition, as the first use case, the data and Python scripts used for formatting the first eDNA dataset are available here.
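
To give a rough idea of what such formatting can involve, here is a minimal sketch (not the script from the use case) that reshapes a small, made-up ASV read-count table into occurrence rows plus matching sequence rows. The input table, the function name `asv_table_to_records`, and the identifier scheme are hypothetical. The column names are assumed to follow Darwin Core and the DNA-derived data extension and should be checked against the current guidelines. Reporting read counts through `organismQuantity` is only one possible convention, since how counts will be handled is still under discussion.

```python
import csv
import io

# Hypothetical ASV table: one row per sequence variant,
# one read-count column per sample. All values are made up.
ASV_TABLE = """asv_id,sequence,taxon,sample_A,sample_B
asv1,ACGTACGT,Calanus finmarchicus,120,0
asv2,TTGACCGA,Copepoda,30,50
"""


def asv_table_to_records(table_text, target_gene="18S rRNA"):
    """Reshape a wide ASV table into long occurrence and sequence rows."""
    rows = list(csv.DictReader(io.StringIO(table_text)))
    sample_cols = [c for c in rows[0] if c.startswith("sample_")]
    # Total reads per sample, reported as the sample size.
    totals = {s: sum(int(r[s]) for r in rows) for s in sample_cols}

    occurrences, dna_rows = [], []
    for r in rows:
        for s in sample_cols:
            reads = int(r[s])
            if reads == 0:
                continue  # assumption: zero-read combinations are not reported
            occ_id = f"{s}:{r['asv_id']}"  # hypothetical identifier scheme
            occurrences.append({
                "occurrenceID": occ_id,
                "eventID": s,
                "scientificName": r["taxon"],
                "occurrenceStatus": "present",
                "organismQuantity": reads,
                "organismQuantityType": "DNA sequence reads",
                "sampleSizeValue": totals[s],
                "sampleSizeUnit": "DNA sequence reads",
            })
            # One extension row per occurrence, linked by occurrenceID.
            dna_rows.append({
                "occurrenceID": occ_id,
                "DNA_sequence": r["sequence"],
                "target_gene": target_gene,
            })
    return occurrences, dna_rows


if __name__ == "__main__":
    occ, dna = asv_table_to_records(ASV_TABLE)
    for row in occ:
        print(row["occurrenceID"], row["scientificName"], row["organismQuantity"])
```

Keeping the sequence in a separate row linked by `occurrenceID` mirrors the core-plus-extension layout of a Darwin Core archive. Whether OBIS stores the sequence itself or a reference to another database is one of the open questions listed under Part 2.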

We are also always looking for more use cases that could serve as examples for adding genetic data to OBIS.
