bpbond / cosore

Data, metadata, and software tools for the COSORE database of continuous soil respiration measurements

cosore

A first data analysis using COSORE is published!

The Global Change Biology paper is published!

The cosore package consists of data, metadata, and software tools for COSORE, a reproducibility-oriented community database for continuous soil respiration data.

To use the database from within R, install the cosore package, for example with devtools::install_github("bpbond/cosore").

To download the COSORE database in a flat-file format, i.e. accessible by any data analysis tool, click on the Releases tab above.

A step-by-step guide to using COSORE is available here.

To contribute to the database, fill out the metadata form.

Principles and general information

Only free-use data (CC BY 4.0) are accepted. We request that users cite the database definition paper, and strongly encourage them to (i) cite all dataset primary publications, and (ii) involve data contributors as co-authors when possible.

The package, and the process of contributing and accessing data, should be as focused and simple as possible (but no simpler).

All data contributors will be included on an introductory database paper planned for spring 2020.

COSORE is not designed to be, and should not be treated as, a permanent data repository. It is a community database, not an institutionally backed repository like Figshare, DataONE, ESS-DIVE, etc. We recommend (but do not require) depositing your data in one of these first, and providing its DOI in your COSORE dataset metadata.

Database design

This database comprises a collection of datasets, each converted to a standard format and units. A dataset is one or more files of continuous (automated) soil respiration data, with accompanying metadata, with all measurements taken at a single site and with constant treatment assignments (i.e. treatments may vary between chambers but not over time).

COSORE is designed to be a relatively lightweight database, and metadata are kept to a minimum. Each dataset has seven tables:

  • description - includes data on the site name; location; timezone name and IGBP cover type; measurement instrument; publication and data links; and acknowledgments and notes.
  • contributors - contributor information, including name, email, ORCID, and CRediT role.
  • ports - continuous systems typically, but not always, consist of a single analyzer plumbed to multiple chambers through a multiplexer. This table lists, for each multiplexer port, the measurement variable (typically Rs, Rh, or NEE), treatment, species, and chamber/collar details.
  • columns - describes the mapping between the raw dataset fields and standardized COSORE fields; used during the import of raw (contributed) data.
  • ancillary - arbitrary ancillary data: stand structure, carbon cycle, disturbance, etc. [All optional.]
  • data - the actual chamber respiration data, with many possible fields including the required ones: beginning and end timestamps, flux rate, and port number. May also include meteorological and soil data, flux fit diagnostics, error codes, etc.
  • diagnostics - this is generated by the data import process, and summarizes records that were dropped, problems found, etc.
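To make this layout concrete, here is a minimal sketch in R of one dataset as a named list of seven data frames, with a check that the data table carries the four required fields. Everything in it is made up for illustration: the values are not real COSORE data, and the column names (CSR_TIMESTAMP_BEGIN, CSR_TIMESTAMP_END, CSR_FLUX_CO2, CSR_PORT, and the rest) are assumptions. The table returned by csr_metadata() is the place to look up the actual field names.

```r
# Hypothetical, hand-made dataset; not real COSORE data.
# All column names below are assumed for illustration only.
required_fields <- c("CSR_TIMESTAMP_BEGIN", "CSR_TIMESTAMP_END",
                     "CSR_FLUX_CO2", "CSR_PORT")

begin <- as.POSIXct(c("2019-06-01 00:00:00", "2019-06-01 00:30:00"), tz = "UTC")

toy_dataset <- list(
  description  = data.frame(site_name = "Example site", stringsAsFactors = FALSE),
  contributors = data.frame(name = "A. Contributor", stringsAsFactors = FALSE),
  ports        = data.frame(CSR_PORT = c(1, 2),
                            measurement = c("Rs", "Rh"),
                            treatment = c("None", "Trenched"),
                            stringsAsFactors = FALSE),
  columns      = data.frame(raw_field = "flux", cosore_field = "CSR_FLUX_CO2",
                            stringsAsFactors = FALSE),
  ancillary    = data.frame(),  # optional, so it may be empty
  data         = data.frame(CSR_TIMESTAMP_BEGIN = begin,
                            CSR_TIMESTAMP_END = begin + 120,  # 120 seconds later
                            CSR_FLUX_CO2 = c(2.1, 1.4),
                            CSR_PORT = c(1, 2)),
  diagnostics  = data.frame(records_removed = 0)
)

# Return the required fields that are missing from a dataset's data table
missing_required <- function(dataset, required = required_fields) {
  setdiff(required, names(dataset$data))
}

missing_required(toy_dataset)
```

An empty result means all four required fields are present; any names returned are the ones a contributed file would still need to supply.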

Operation

Four primary functions are available for R users:

  • csr_database() returns a summary data frame about the entire database (all constituent datasets)
  • csr_dataset() returns a single dataset, as a list of data frames
  • csr_table() returns a single table, across one or many datasets
  • csr_metadata() returns a metadata table describing all fields in dataset tables
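A short, hedged sketch of how these functions might be called is shown below. Only the function names come from this README. The argument forms shown for csr_dataset() and csr_table() are assumptions and are left commented out, so check ?csr_dataset and ?csr_table before relying on them. The block does nothing beyond printing a message if the package is not installed.

```r
# Sketch only: the calls run only if the cosore package is installed.
cosore_available <- requireNamespace("cosore", quietly = TRUE)

if (cosore_available) {
  # Summary data frame covering all datasets in the database
  db <- cosore::csr_database()
  str(db)

  # Metadata table describing the fields in the dataset tables
  field_info <- cosore::csr_metadata()
  print(head(field_info))

  # csr_dataset() and csr_table() must be told which dataset or table to
  # return. The argument forms below are assumptions, not confirmed here:
  # one_dataset <- cosore::csr_dataset("<dataset name>")
  # all_ports   <- cosore::csr_table("ports")
} else {
  message("cosore is not installed; see the installation note above.")
}
```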

Reports can be generated for the overall database (csr_report_database()) and each individual dataset (csr_report_dataset()). There are also a number of developer functions that are not intended for the average COSORE user. Probably the most important of these is csr_build(), which scans for and parses metadata on all installed datasets, then loads the data, parsing raw data as necessary and available.

Data access

R users will find it easiest to install this package and then use the functions above. Anyone can also download flat (csv) files from the Releases page.
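Because the release files are plain csv, they can also be read without this package. As a minimal sketch in base R, the example below writes two tiny stand-in files to a temporary directory, reads them back, and joins flux records to port-level information. The file names, column names, and values are hypothetical and do not reflect the actual layout of a COSORE release.

```r
# Stand-in files so the example is self-contained; file and column names
# are hypothetical, not the actual release layout.
release_dir <- tempfile("cosore_release_")
dir.create(release_dir)

write.csv(data.frame(CSR_PORT = c(1, 1, 2),
                     CSR_FLUX_CO2 = c(2.0, 3.0, 1.0)),
          file.path(release_dir, "data.csv"), row.names = FALSE)
write.csv(data.frame(CSR_PORT = c(1, 2),
                     CSR_TREATMENT = c("None", "Trenched")),
          file.path(release_dir, "ports.csv"), row.names = FALSE)

# Read the flat files back in
flux  <- read.csv(file.path(release_dir, "data.csv"), stringsAsFactors = FALSE)
ports <- read.csv(file.path(release_dir, "ports.csv"), stringsAsFactors = FALSE)

# Attach port-level treatment information to each flux record
flux_ports <- merge(flux, ports, by = "CSR_PORT")

# Mean flux per treatment
mean_flux <- aggregate(CSR_FLUX_CO2 ~ CSR_TREATMENT, data = flux_ports, FUN = mean)
mean_flux
```

The same read-then-join pattern applies in any other analysis tool, since flux records and port details are stored in separate tables linked by port number.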

Data priorities

  • Structured/standardized continuous IRGA data
  • Raw LI-8100A data
  • Unusual or long-term survey (i.e. not continuous) measurements

License: Creative Commons Attribution 4.0 International