Get list of INFO and FORMAT keys from Header
edg1983 opened this issue · comments
Hi Brent,
Is there a way to get the list of keys for INFO and FORMAT definitions from the header of a VCF using hts-nim?
The idea I'm working on is to get all the INFO and FORMAT keys defined in the headers of two VCFs, compute their intersection, and output a new VCF containing only the fields shared by both.
Thanks!
Hi Edoardo,
There's currently no nice way to do this. You could get the header string from each header, write your own code to compute the intersection of the INFO and FORMAT fields, and then look up each shared key, e.g.:
# inside a loop over the shared keys
try:
  var hi = ivcf.header[key]
  # do something with hi (HeaderInfo)
except KeyError:
  # key is not defined in this header; skip to the next one
  continue
You can also merge the headers as done here: https://github.com/brentp/tnsv/blob/main/tnsv.nim#L38 (just letting htslib do that part).
In short, it's possible, but it will take a fair amount of work and string parsing. If you give it a go and get stuck, I'll try to help.
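For the string-parsing part, a minimal sketch using only the Nim standard library might look like the following. It assumes you already have each header as plain text and that the header lines follow the usual `##INFO=<ID=...,...>` / `##FORMAT=<ID=...,...>` layout from the VCF spec. The names `headerIds`, `headerA` and `headerB` are hypothetical, made up for this illustration, and are not part of hts-nim.

```nim
import std/[strutils, sets]

proc headerIds(headerText, kind: string): HashSet[string] =
  ## Hypothetical helper (not part of hts-nim): collect the ID of every
  ## `##<kind>=<ID=...>` line found in a VCF header given as plain text.
  result = initHashSet[string]()
  let prefix = "##" & kind & "=<"
  for line in headerText.splitLines():
    if not line.startsWith(prefix):
      continue
    # Search for "ID=" after the opening "<" (the spec puts it first).
    let idStart = line.find("ID=", start = prefix.len)
    if idStart < 0:
      continue
    # The ID value runs up to the next ',' or the closing '>'.
    var idEnd = idStart + 3
    while idEnd < line.len and line[idEnd] notin {',', '>'}:
      inc idEnd
    result.incl line[idStart + 3 ..< idEnd]

when isMainModule:
  # Made-up header fragments, for illustration only.
  const headerA = """##fileformat=VCFv4.2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths">"""
  const headerB = """##fileformat=VCFv4.2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype quality">"""

  # Keys defined in both headers.
  let sharedInfo = intersection(headerIds(headerA, "INFO"),
                                headerIds(headerB, "INFO"))
  let sharedFormat = intersection(headerIds(headerA, "FORMAT"),
                                  headerIds(headerB, "FORMAT"))
  doAssert sharedInfo == toHashSet(["DP"])
  doAssert sharedFormat == toHashSet(["GT"])
```

The resulting sets can then drive the key lookup shown earlier in the thread. A `HashSet` is used here because only membership and intersection matter; if the output order of the fields is important, a `seq` filtered against the set would be a better fit.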