FAIR schemas to enable and encourage data interoperability across systems.
This repository contains best-practice tables that are reused for specific projects or applications via profiles.
The aim is to be specific enough to be relatable and useful, but not unnecessarily specific.
For instance, Subject is reusable cross-species by replacing human-biased underlying ontologies.
Rules for tables
Tables represent concept archetypes such as Study, Biobank, Subject, Biosample, and Cohort.
Tables are supersets of reusable columns that belong to that concept.
Identical or similar columns should ideally be merged.
Columns can have a partOfStandard attribute to indicate that they represent an accepted standard. Each standard must have its own profile with this name.
Columns that represent the same concept (e.g. Age) expressed in a different value type (e.g. age in years vs. age range categorical) are not explicitly connected, but should be tagged with the same semantics.
Column names are only required to be unique within the context of their table.
Table and column names must start with a letter, followed by any number of letters, digits, spaces or underscores ([a-zA-Z][a-zA-Z0-9_ ]*).
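The naming rule above can be checked mechanically. The following is a minimal Python sketch, not code from this repository: the helper name is_valid_name is hypothetical, and "whitespace" is read as a literal space because that is what the character class in the pattern contains.

```python
import re

# Pattern quoted from the naming rule; the helper around it is hypothetical.
NAME_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9_ ]*")


def is_valid_name(name: str) -> bool:
    """Return True if the whole name matches the naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["Subject", "age group", "2nd_visit", "_hidden"]:
        print(candidate, is_valid_name(candidate))
```

fullmatch is used rather than match so that a name with a valid prefix but an invalid tail (for example a hyphen) is rejected.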
Things not supported:
Inheritance, because it is too limiting. How can use cases for inheritance be solved instead?
Instead of querying for an inheritance subtree, you can indicate in your reference which profile of a table you want.
When using refLabel for a reference, the designer of an instance should ensure that all columns exist in the referred-to profiles.
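The refLabel check described above could be automated along the following lines. This is only a sketch under assumptions: the format of refLabel is not described here, so the columns it expects are passed in as a plain list, and the function name and the dictionary layout of the profiles are hypothetical.

```python
def missing_ref_label_columns(expected_columns, profiles):
    """Map each profile name to the expected columns it lacks (sorted).

    expected_columns: column names the refLabel relies on (assumed input).
    profiles: dict of profile name -> collection of column names (assumed layout).
    """
    missing = {}
    for profile_name, columns in profiles.items():
        absent = sorted(set(expected_columns) - set(columns))
        if absent:
            missing[profile_name] = absent
    return missing


if __name__ == "__main__":
    # Illustrative data only; these column sets are not taken from the repository.
    referred_profiles = {
        "Patient": {"id", "name", "age"},
        "Mouse": {"id", "strain"},
    }
    print(missing_ref_label_columns(["id", "name"], referred_profiles))
```

An empty result means every referred-to profile provides all the columns the label needs.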
Rules for profiles
Profiles represent specific projects or applications.
Profiles can cherry-pick a combination of:
Tables (all columns of that table). We sometimes refer to such a table instance as a 'flavor'. E.g. Patient is a flavor of Subject.
Columns (some columns of that table)
Profiles (all columns included or defined by that profile)
Standards (all columns included or defined by that standard)
Reused columns are chosen by referencing only their name.
Reused columns are placed in their original table structure.
Reused columns cannot be altered for interoperability purposes. This includes relabeling. If relabeling is required, this should be done via runtime internationalization.
Profiles for particular applications can introduce highly specific, non-reusable tables and columns.
To add new columns to an existing table, that table should be represented in the profile using only the name.
New columns in new tables should be fully specified as expected from tables.
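To make the cherry-picking rules concrete, here is a small Python sketch of how a profile might be resolved into columns. Everything in it is an assumption for illustration: the Profile class, the resolve and ordered functions, the in-memory layout of tables, and the name ExampleStandard are hypothetical and do not reflect how this repository stores or processes its definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    tables: list = field(default_factory=list)     # whole tables, by name
    columns: dict = field(default_factory=dict)    # table name -> column names
    profiles: list = field(default_factory=list)   # other profiles, by name
    standards: list = field(default_factory=list)  # standards, by name


def resolve(name, profiles, tables, _seen=None):
    """Return {table name: set of selected column names} for a profile."""
    seen = _seen if _seen is not None else set()
    if name in seen:  # guard against profiles that include each other
        return {}
    seen.add(name)
    profile = profiles[name]
    selected = {}

    def pick(table, column):
        # Columns are reused by name only, so the name must already exist.
        if column not in tables[table]:
            raise KeyError(f"{table}.{column} is not defined")
        selected.setdefault(table, set()).add(column)

    for table in profile.tables:  # all columns of a table
        for column in tables[table]:
            pick(table, column)
    for table, names in profile.columns.items():  # some columns of a table
        for column in names:
            pick(table, column)
    for standard in profile.standards:  # columns tagged with partOfStandard
        for table, cols in tables.items():
            for column, attrs in cols.items():
                if attrs.get("partOfStandard") == standard:
                    pick(table, column)
    for other in profile.profiles:  # everything another profile selects
        for table, cols in resolve(other, profiles, tables, seen).items():
            selected.setdefault(table, set()).update(cols)
    return selected


def ordered(selected, tables):
    """List the selected columns in the order of their original table."""
    return {t: [c for c in tables[t] if c in cols] for t, cols in selected.items()}


# Illustrative data only.
TABLES = {
    "Subject": {"id": {}, "age": {"partOfStandard": "ExampleStandard"}, "sex": {}},
    "Biosample": {"id": {}, "material": {"partOfStandard": "ExampleStandard"}},
}
PROFILES = {
    "Base": Profile(columns={"Subject": ["id"]}),
    "Project": Profile(tables=["Biosample"], profiles=["Base"],
                       standards=["ExampleStandard"]),
}

if __name__ == "__main__":
    print(ordered(resolve("Project", PROFILES, TABLES), TABLES))
```

The sketch only selects columns by name and never modifies their attributes, which mirrors the rule that reused columns cannot be altered. The ordered step keeps each selected column in its original table and in that table's column order.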
Rules for standards
Standards consist only of columns annotated with the partOfStandard attribute.
Standard names that match the column attribute partOfStandard describe that standard.
Standards cannot point to additional columns, tables or profiles.
Standards cannot introduce additional columns.
Syntax
Table attributes
| Attribute | Description |
|---|---|
| name | Name of this table. Required. |
| definedBy | The location of the ontology term that defines this column. |
| definedAs | The column definition according to the ontology term. |
We prefer explicit definitions over magic. E.g. if you are duck typing (e.g. using different flavors of a table), then the standard should make explicit whether you assume particular columns to be present (e.g. via refLabel you can indicate which columns you expect in a lookup).
semantic data types.