quinlan-lab / vcf2db

create a gemini-compatible database from a VCF

Multiple floats in INFO columns

samesense opened this issue · comments

commented

I used vcfanno to annotate my VCF with AF from a decomposed and normalized Kaviar VCF. When I ran vcf2db.py, the Kaviar AF info was not included in the database.

The problem was multiple comma-delimited floats in the Kaviar AF. It would be nice to get a warning or error in such cases instead of silently dropping the column.

Can you share the header line for the excluded field?

@samesense I just pushed a change that prints a warning for the case that I think you are hitting. Could you pull and verify?

Or if you have a better idea of how to handle this, I'd be open to it.

commented

This works:

##INFO=<ID=kv_af,Number=1,Type=Float,Description="calculated by max of overlapping values in field AF from Kaviar-160204-Public-hg19.vt.vcf.gz">

These do not:

##INFO=<ID=kv_af,Number=A,Type=Float,Description="transfered from matched variants in Kaviar-160204-Public-hg19.vt.vcf.gz">
##INFO=<ID=kv_src,Number=A,Type=String,Description="transfered from matched variants in Kaviar-160204-Public-hg19.vt.vcf.gz">

Examples from the INFO column of the failing VCF:

kv_af=0.2824;kv_src=GMIAK1|GMIAK2|GMIAK3|GMIAK5|GMIAK6|GS000009920;

kv_af=0.000417,3.79e-05,0.6954,0.0001516,0.0001137;kv_src=ISB_founders-Nge3|NA21767,NA12890,!Gubi|DNK02|G/aq'o|GMIAK1|GMIAK14|;

Yep, OK. You now get a warning message for this.
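
For anyone who wants to check a VCF before loading it, here is a minimal sketch that scans the header for INFO fields declared with anything other than a single value. It is not vcf2db's actual code, and vcf2db's real skip criteria may differ. The helper name `multi_valued_info_fields` and the `example_af` field are hypothetical; `kv_af` and `kv_src` are the fields from this thread.

```python
import re

# Hypothetical helper, not part of vcf2db: pull ID and Number out of
# each ##INFO header line.
INFO_RE = re.compile(r'^##INFO=<ID=([^,>]+),Number=([^,>]+)')

def multi_valued_info_fields(header_lines):
    """Return {field_id: number} for INFO fields whose Number is not 0 or 1."""
    found = {}
    for line in header_lines:
        m = INFO_RE.match(line)
        if m is None:
            continue  # not an INFO header line
        field_id, number = m.groups()
        # Number=0 is a flag and Number=1 is a single value; anything else
        # (A, R, G, ., or an integer > 1) can hold several values.
        if number not in ("0", "1"):
            found[field_id] = number
    return found

header = [
    '##fileformat=VCFv4.2',
    '##INFO=<ID=kv_af,Number=A,Type=Float,Description="example">',
    '##INFO=<ID=kv_src,Number=A,Type=String,Description="example">',
    '##INFO=<ID=example_af,Number=1,Type=Float,Description="example">',
]

for field_id, number in multi_valued_info_fields(header).items():
    print("field %r has Number=%s and may hold multiple values" % (field_id, number))
```

With the header above, only `kv_af` and `kv_src` are reported; `example_af` is declared `Number=1` and passes.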

commented

I'll try it out. Thanks.

commented

I see the warning, but this also tosses out the 1kg columns from your rare-disease example for vcfanno. Do I need to update the tidy 1kg VCF file (ALL.wgs.phase3_shapeit2_mvncall_integrated_v5a.20130502.sites.tidy.vcf.gz) to fix the 'A'?

skipping 'af_1kg_afr' because it has Number=A

Shoot, that's a good point. I changed the ExAC annos from "self" to "max" for this reason. I'll update rare-disease.conf to do the same for 1kg.
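
To illustrate why switching from "self" to "max" helps: "self" carries the comma-delimited list over as-is (the Number=A header above), while "max" reduces it to one float (the Number=1 header above). The sketch below only mimics that effect on a raw INFO value. It is not vcfanno code, and `collapse_max` is a hypothetical name.

```python
def collapse_max(raw):
    """Reduce a comma-delimited INFO value to its maximum float.

    Returns None when there is nothing numeric to reduce
    (empty string or the VCF missing-value marker ".").
    """
    values = [float(v) for v in raw.split(",") if v not in ("", ".")]
    return max(values) if values else None

# The multi-valued kv_af from the failing record in this thread.
print(collapse_max("0.000417,3.79e-05,0.6954,0.0001516,0.0001137"))  # 0.6954

# A single value is returned unchanged.
print(collapse_max("0.2824"))  # 0.2824
```

The result is always a single value per record, which is the shape vcf2db accepts without skipping the field.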

commented

Thanks.

I pushed the change to vcfanno. Thanks for reporting. Please let me know if you hit any other problems.