fairtracks / fairtracks_standard

FAIRtracks is a JSON Schema defining a minimal standard for genomic track metadata.

Home Page: https://fairtracks.net

Handle versioning of ontologies

sveinugu opened this issue

After some research: ontologies in OWL 2 have an ontology IRI (shared by all releases of the ontology) and an ontology versionIRI (unique to each release, and usually based on the ontology IRI). When these IRIs behave like URLs (the recommended case, although it is not enforced), the ontology IRI is (or should be) a download link for the latest version of the ontology, and the ontology versionIRI is a stable download link to the OWL document of that specific release.
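To make the two IRIs concrete, here is a minimal sketch that reads them from the header of an ontology serialized as RDF/XML, using only the Python standard library. The sample document and its `example.org` IRIs are made up for illustration, and `read_ontology_iris` is a hypothetical helper, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Made-up ontology header; real ontologies carry many more statements.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/onto.owl">
    <owl:versionIRI rdf:resource="http://example.org/onto/2020-01-01/onto.owl"/>
  </owl:Ontology>
</rdf:RDF>
"""


def read_ontology_iris(rdf_xml):
    """Return (ontology IRI, versionIRI); either may be None if absent."""
    root = ET.fromstring(rdf_xml)
    header = root.find(f"{{{OWL}}}Ontology")
    if header is None:
        return None, None
    ontology_iri = header.get(f"{{{RDF}}}about")
    version = header.find(f"{{{OWL}}}versionIRI")
    version_iri = version.get(f"{{{RDF}}}resource") if version is not None else None
    return ontology_iri, version_iri


ontology_iri, version_iri = read_ontology_iris(SAMPLE)
print("ontology IRI:", ontology_iri)
print("versionIRI:  ", version_iri)
```

This only covers the RDF/XML serialization with the header written as an `owl:Ontology` element; other serializations (Turtle, OWL/XML) would need a real RDF parser.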

In Owlready2 the situation is as follows: when you give the library an ontology by URL, it fetches the content, parses the ontology, and stores the ontology definitions. In the table of stored ontologies, the ontology IRI is then recorded and associated with the stored ontology. The ontology versionIRI is ignored, even when it is available. The download URL that was used is recorded in the table of ontology aliases.

So, my proposal for the FAIRtracks JSON Schemas is as follows:

  1. Check which of the ontologies in use can be fetched by their ontology IRI, and use that IRI instead of the download URL.

  2. Improve the ontology declaration in the JSON Schema so that it states both the ontology IRI and fallback URLs for the ontology, so we can check that the fetched ontology is still the right one.

  3. On validation, return the ontology IRIs that were used, as well as the ontology versionIRIs of the loaded ontologies (when available), in the validation result metadata, so that validation is more reproducible.
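The three points above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `OntologyDeclaration` shape, the `resolve_ontology` helper, the result keys, and the injected `load` callable (which stands in for whatever actually fetches and parses an ontology and returns the IRIs it declares). It is not how the validator currently works.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical loader type: takes a URL, returns (ontology IRI, versionIRI or None)
# as declared inside the fetched document, and raises OSError if fetching fails.
Loader = Callable[[str], Tuple[str, Optional[str]]]


@dataclass
class OntologyDeclaration:
    ontology_iri: str
    fallback_urls: List[str] = field(default_factory=list)


def resolve_ontology(decl: OntologyDeclaration, load: Loader) -> dict:
    """Try the ontology IRI first, then the fallbacks, and check the result."""
    for url in [decl.ontology_iri, *decl.fallback_urls]:
        try:
            found_iri, version_iri = load(url)
        except OSError:
            continue  # not reachable, try the next URL
        if found_iri != decl.ontology_iri:
            continue  # something else is now served from this URL
        # Metadata that could be returned alongside the validation results.
        return {
            "ontologyIRI": found_iri,
            "versionIRI": version_iri,
            "fetched_from": url,
        }
    raise LookupError(f"could not load {decl.ontology_iri} from any declared URL")


def fake_load(url: str) -> Tuple[str, Optional[str]]:
    """Stand-in loader: only the mirror responds in this example."""
    known = {
        "http://example.org/mirror/onto.owl": (
            "http://example.org/onto.owl",
            "http://example.org/onto/2020-01-01/onto.owl",
        )
    }
    if url not in known:
        raise OSError(f"cannot fetch {url}")
    return known[url]


decl = OntologyDeclaration(
    "http://example.org/onto.owl", ["http://example.org/mirror/onto.owl"]
)
print(resolve_ontology(decl, fake_load))
```

Injecting the loader keeps the check itself independent of the fetching library, so the same logic could sit on top of Owlready2 or anything else.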

@jmfernandez Thanks for the thorough research and feedback.

  1. I think all the ontologies we support have both a versionIRI and an ontologyIRI (both of which are downloadable URLs), so we can assume this. Versioned ontologies are the most FAIR option, so if we later need an unversioned ontology, I think we should first contact its maintainers and suggest that they add a versionIRI before we look into supporting other cases.

  2. I think the best way to describe this in the Schema is to use ontologyIRIs in the ontology attribute of the term_url fields. In addition, we add a required ontology_versionIRIs object with one field per supported ontology (using the ontologyIRI as the field name). The values should be the versionIRIs. We can help the curators by adding format rules and by autogenerating the latest versionIRI when they use our autogenerate service (currently in TrackFind).
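As an illustration of point 2, a submitted document could carry an object like the one below, and a simple check could list the supported ontologies that lack a recorded versionIRI. Only the `ontology_versionIRIs` name comes from the proposal; the `example.org` IRIs and the `missing_version_iris` helper are made up for this sketch. In the Schema itself, the `required` keyword would do this job.

```python
# ontologyIRIs the schema declares support for (made-up values).
SUPPORTED_ONTOLOGY_IRIS = [
    "http://example.org/onto_a.owl",
    "http://example.org/onto_b.owl",
]

# Fragment of a submitted document: one field per ontology, keyed by
# ontologyIRI, with the versionIRI at creation time as the value.
document = {
    "ontology_versionIRIs": {
        "http://example.org/onto_a.owl": "http://example.org/onto_a/2020-01-01/onto_a.owl",
    }
}


def missing_version_iris(supported_ontology_iris, doc):
    """Return the supported ontologyIRIs that have no recorded versionIRI."""
    recorded = doc.get("ontology_versionIRIs", {})
    return sorted(iri for iri in supported_ontology_iris if not recorded.get(iri))


print(missing_version_iris(SUPPORTED_ONTOLOGY_IRIS, document))
```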

  3. I think this is a good idea. Most ontologies should be backwards compatible, so the versionIRI could mostly be thought of as a minimum version field (although this is not guaranteed). It is also good to be alerted to any incompatible changes to the ontologies. So using only the current ontology for validation might be the best solution, as this improves interoperability. In case of non-backwards-compatible changes to the ontologies, one should easily be able to fix that when one has the version at creation time (which we will require).

So the validator should always use the most current ontology version (which it might already do today) and also return both IRIs of the most recent ontology in the validation response. It should also report the versionIRI in the submitted JSON (if the top-level Schema is validated). You will need to wait for the current issue to be fixed first, though. @jmfernandez Can you add issues in the validator repo for whatever is lacking to support this?
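A rough sketch of what that part of the validation response could look like, assuming the validator already knows the current versionIRI of each loaded ontology. The `version_report` helper and the key names in its output are hypothetical and only meant to show the idea of reporting both the submitted and the current version.

```python
def version_report(submitted_version_iris, current_version_iris):
    """Pair the versionIRI recorded in the submitted JSON with the one used now.

    Both arguments map ontologyIRI -> versionIRI.
    """
    report = []
    for ontology_iri, current in sorted(current_version_iris.items()):
        submitted = submitted_version_iris.get(ontology_iri)
        report.append(
            {
                "ontologyIRI": ontology_iri,
                "current_versionIRI": current,
                "submitted_versionIRI": submitted,
                # True when the document was created against another release.
                "differs": submitted != current,
            }
        )
    return report


# Made-up IRIs: the document was created against an older release.
submitted = {"http://example.org/onto.owl": "http://example.org/onto/2019-06-01/onto.owl"}
current = {"http://example.org/onto.owl": "http://example.org/onto/2020-01-01/onto.owl"}

for entry in version_report(submitted, current):
    print(entry)
```

A `differs` flag alone says nothing about compatibility; it only points the curator at the two releases to compare.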

@sveinugu I have just opened a couple of issues in the fairtracks_validator repo.