Anthony-Nolan / Atlas

A free & open-source Donor Search Algorithm Service

Support older nomenclature versions than 3.17.0

benbelow opened this issue

In older versions of the HLA nomenclature, certain files in the source data GitHub repository do not exist. For example, https://github.com/ANHIG/IMGTHLA/blob/3170/Allelelist_history.txt is the earliest copy of the Allelelist_history file.
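To illustrate where such a file would be looked up, here is a minimal sketch that builds the URL of a source file for a given nomenclature version. It assumes the branch name is simply the version with its dots removed, which is inferred only from the 3.17.0 / 3170 example above. The function name `source_file_url` is hypothetical and not part of Atlas.

```python
# Base of the blob URLs, taken from the example link in this issue.
BASE_URL = "https://github.com/ANHIG/IMGTHLA/blob"


def source_file_url(version: str, filename: str) -> str:
    """Build the URL of `filename` for a nomenclature version such as "3.17.0".

    Assumption: the branch name is the version with dots removed
    (3.17.0 -> 3170), as seen in the one URL quoted above.
    """
    branch = version.replace(".", "")
    return f"{BASE_URL}/{branch}/{filename}"


if __name__ == "__main__":
    print(source_file_url("3.17.0", "Allelelist_history.txt"))
```

Requesting the resulting URL for an older version would show whether the file exists there, but that check is left out of this sketch.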

(I haven't confirmed whether this is the only file Atlas uses that has this issue; plausibly the minimum version is even higher than 3.17.0.)

This change will only be necessary if we need to be able to run Atlas on very old nomenclature versions. One current potential use case is when running match prediction to compare to consensus data, which was derived using nomenclature 3.4.0.

I will run using a later nomenclature version and see how the results look. If there are large discrepancies between the consensus and Atlas results, and we believe these can be attributed to the nomenclature version mismatch, we may need to think about how to support the older versions.

Otherwise, this may never be worth doing, as realistic use cases of Atlas in production will not need to use any very old nomenclature versions.

If this is not worked on (which I think is likely), then we should instead document somewhere at a high level that Atlas has minimum requirements for HLA nomenclature versions.

Experimentation has shown that 3.33.0 is actually the minimum supported nomenclature version.
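A minimum-version requirement like this could be enforced with a small guard. The following is only a sketch, not Atlas code: the names `MINIMUM_SUPPORTED_VERSION`, `parse_version`, and `is_supported` are hypothetical, and it assumes versions are always written as dot-separated integers such as "3.33.0".

```python
# Minimum nomenclature version reported in this issue, as a numeric tuple.
MINIMUM_SUPPORTED_VERSION = (3, 33, 0)


def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version string into integers, e.g. "3.33.0" -> (3, 33, 0)."""
    return tuple(int(part) for part in version.split("."))


def is_supported(version: str) -> bool:
    """Return True if `version` is at least the minimum supported version."""
    return parse_version(version) >= MINIMUM_SUPPORTED_VERSION


if __name__ == "__main__":
    for candidate in ("3.4.0", "3.17.0", "3.33.0", "3.40.0"):
        print(candidate, is_supported(candidate))
```

The comparison is done on integer tuples rather than raw strings because a plain string comparison would rank "3.4.0" above "3.33.0".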

No need to support older versions, especially as a new validation dataset is being designed by the WMDA Bioinformatics group. Documented instead.