lenaschimmel / sc2rf

SARS-Cov-2 Recombinant Finder for fasta sequences

Allow more file formats and/or access methods, e.g. Auspice v2 dataset JSON from Nextstrain URLs

lenaschimmel opened this issue · comments

It seems that Auspice v2 dataset JSON has become a de facto standard way to link to a set of samples, as in Nextstrain's fetch URLs.

If I remember correctly, that JSON can easily be traversed to get the full set of mutations for each sample. I would like to accept those URLs (either the part after nextstrain.org/fetch/ or the whole URL) as an alternative to local .fasta filenames.
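A rough sketch of what that traversal could look like, in Python. It assumes the usual Auspice v2 layout: a single root object under `tree`, nested `children`, and per-branch nucleotide changes listed under `branch_attrs.mutations.nuc` as strings like `C241T`. The function name `collect_tip_mutations` is made up for illustration and is not part of sc2rf. Note that the result is relative to the tree root, which may or may not be identical to the reference genome.

```python
import re

# Matches strings such as "C241T" or "A100-" (old base, position, new base).
MUTATION_RE = re.compile(r"^([A-Z-])(\d+)([A-Z-])$")


def collect_tip_mutations(dataset):
    """Return {tip_name: [mutations relative to the tree root]}.

    Hypothetical helper: walks the tree depth-first and accumulates
    the branch mutations on the path from the root to each tip.
    """
    result = {}
    # Each stack entry: (node, {position: (root_base, current_base)})
    stack = [(dataset["tree"], {})]
    while stack:
        node, inherited = stack.pop()
        state = dict(inherited)  # copy so siblings do not share state
        muts = node.get("branch_attrs", {}).get("mutations", {}).get("nuc", [])
        for mut in muts:
            match = MUTATION_RE.match(mut)
            if not match:
                continue  # skip anything that is not a simple substitution
            old, pos, new = match.group(1), int(match.group(2)), match.group(3)
            root_base = state[pos][0] if pos in state else old
            if new == root_base:
                state.pop(pos, None)  # reversion back to the root state
            else:
                state[pos] = (root_base, new)
        children = node.get("children", [])
        if children:
            stack.extend((child, state) for child in children)
        else:
            result[node["name"]] = [
                f"{root}{pos}{cur}" for pos, (root, cur) in sorted(state.items())
            ]
    return result
```

Called on a parsed dataset (for example the result of `json.load`), this yields one mutation list per leaf, with reversions along the path cancelled out.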

Alternatively, I could accept both file formats (.fasta and Auspice v2 dataset JSON) and several access methods (local file name, remote URL, or piped stream) in any combination. That might need something like a rewindable stream, so that I can look at the first few bytes, decide what the format is, and then parse it from the beginning.
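In Python, one way to get the "look at the first few bytes" behaviour without a truly rewindable stream might be `io.BufferedReader.peek()`, which returns buffered bytes without consuming them. The sketch below assumes FASTA input starts with `>` and Auspice JSON starts with `{` (after optional whitespace); `sniff_format` and `as_buffered` are hypothetical names, not existing sc2rf functions.

```python
import io


def as_buffered(raw):
    """Wrap a binary stream in a BufferedReader unless it already is one."""
    return raw if isinstance(raw, io.BufferedReader) else io.BufferedReader(raw)


def sniff_format(stream):
    """Guess the format from the first bytes without consuming them.

    Assumption: FASTA starts with '>' and Auspice JSON with '{'.
    """
    head = stream.peek(64).lstrip()
    if head.startswith(b">"):
        return "fasta"
    if head.startswith(b"{"):
        return "auspice-json"
    raise ValueError("unrecognized input format")
```

The same wrapper should work for a file opened with `open(path, "rb")`, for `sys.stdin.buffer`, and probably for the response object returned by `urllib.request.urlopen`, so the format detection stays independent of the access method. After sniffing, the stream can be handed to the matching parser and read from the start.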