fairtracks / fairtracks_standard

FAIRtracks is a JSON Schema defining a minimal standard for genomic track metadata.

Home Page: https://fairtracks.net


Change IRI into URL

sveinugu opened this issue

@dzerbino @jmfernandez The choice of IRI was made without really thinking through the consequences. Even though Unicode support is available more or less everywhere, it is not difficult to envision that non-ASCII (UTF-8 encoded) characters in IRIs might break downstream parsers or consumers. We will not demand that metadata use only ASCII characters, as this would limit its usefulness: many terms naturally include non-ASCII characters, such as Greek letters. So parsers should handle Unicode metadata. But the IRIs/URLs would often be not only parsed but also accessed automatically by the consumers, which opens up a new set of possible bugs. Any IRI can be translated to a URI using %-escaping (https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier), so we do not lose any functionality by the change.
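To illustrate the %-escaping mentioned above, here is a minimal sketch in Python using only the standard library. The function name `iri_to_uri` and the `example.org` addresses are made up for illustration and are not part of the FAIRtracks code. It is a simplification: it assumes the host contains no userinfo or port, and it leaves existing `%` escapes untouched rather than validating them.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters allowed unescaped in a URI path (RFC 3986 sub-delims, ":", "@", "/").
# "%" is kept so that already-escaped sequences are not escaped twice.
_PATH_SAFE = "/:@!$&'()*+,;=%"
# Query and fragment additionally allow "?".
_QUERY_SAFE = _PATH_SAFE + "?"


def iri_to_uri(iri: str) -> str:
    """Translate an IRI into an ASCII-only URI (hypothetical helper, simplified)."""
    parts = urlsplit(iri)
    # Host names are not %-escaped; they use IDNA (Punycode) instead.
    # Assumption: netloc holds a plain host name, no userinfo or port.
    netloc = parts.netloc.encode("idna").decode("ascii")
    # Non-ASCII characters elsewhere are UTF-8 encoded and then %-escaped.
    path = quote(parts.path, safe=_PATH_SAFE)
    query = quote(parts.query, safe=_QUERY_SAFE)
    fragment = quote(parts.fragment, safe=_QUERY_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    # A term with Greek letters in the path and query (made-up address).
    print(iri_to_uri("https://example.org/α-globin?q=β"))
```

A URL that is already pure ASCII passes through unchanged, so consumers would only ever need to handle the ASCII form.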

Another argument is that (as far as I know) there is no concept of an IRL, meaning an internationalized version of a URL (perhaps due to the "In Real Life" acronym?). But our IRIs are all locators, not names, so using URL is more precise.

Also, URL is a better-known acronym.

BTW: For the standard JSON files themselves, there is no reason for them not to be fully ASCII, which they are now (but were not in a previous version, because "..." had by mistake been replaced by a single character). That is why the Python scripts should not be Unicode-safe for these files, but should rather fail on non-ASCII characters. But this is an aside.
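As a rough sketch of such a "fail on non-ASCII" check, here is a hypothetical helper (`assert_ascii` is not taken from the actual FAIRtracks scripts) that reports the first offending character with its position. The example input assumes the stray character was the single-character ellipsis U+2026, which is a guess based on the description above.

```python
def assert_ascii(text: str, name: str = "<input>") -> None:
    """Raise ValueError at the first non-ASCII character in text (hypothetical helper)."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                raise ValueError(
                    f"{name}:{lineno}:{col}: non-ASCII character "
                    f"{ch!r} (U+{ord(ch):04X})"
                )


if __name__ == "__main__":
    assert_ascii('{"description": "..."}')  # three ASCII dots: passes
    try:
        # Assumed culprit: "…" (U+2026) typed instead of "..."
        assert_ascii('{"description": "\u2026"}', name="schema.json")
    except ValueError as err:
        print(err)
```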

Please reopen if you don't agree with the changes!