Stockholm domain format doesn't work with non-UniProt FASTA sequences
widdowquinn opened this issue · comments
Leighton Pritchard commented
Summary:
Extracting CDS features uses the `GN=.*` regex, but when adding Stockholm domains to NCBI FASTA files, the `GN=` field is missing from the sequence header. The regex then fails to match, so the corresponding features are not found, leading to false negatives.
We should add a fallback check on the sequence ID for headers where the `GN=` field is missing, rather than relying on `GN=` alone.
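
A minimal sketch of the proposed fallback, not the project's actual code. The function name `feature_query` and the pattern `GN=(\S+)` are assumptions made for illustration (the existing code uses `GN=.*`), and the sequence ID is assumed to be the first whitespace-delimited token of the header:

```python
import re

# Hypothetical pattern: capture the value of a UniProt-style GN= field.
# The existing code uses GN=.*; a non-greedy token match is assumed here.
GN_PATTERN = re.compile(r"GN=(\S+)")


def feature_query(header: str) -> str:
    """Return the GN= value if present, otherwise the sequence ID.

    The sequence ID is assumed to be the first whitespace-delimited
    token of the FASTA header line.
    """
    header = header.lstrip(">").strip()
    if not header:
        raise ValueError("empty FASTA header")
    match = GN_PATTERN.search(header)
    if match:
        # UniProt-style header: use the gene name.
        return match.group(1)
    # No GN= field (e.g. an NCBI header): fall back to the sequence ID.
    return header.split()[0]
```

With this, a UniProt-style header would be looked up by its gene name, while an NCBI-style header would be looked up by its accession instead of being silently skipped.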