fritzsedlazeck / SURVIVOR

Toolset for SV simulation, comparison and filtering

Issue merging per-sample VCFs that were already merged across different callers

heidihyang opened this issue

Hi Fritz!

I have 59 samples and have called SVs on each with delly, lumpy, and manta. I merged the VCFs from the three programs for each sample with no issue, and I am now trying to run SURVIVOR merge to combine all of the samples into one VCF. I sorted all of these VCFs beforehand.

The command I'm using is:

SURVIVOR merge samples 1000 0 1 1 0 30 all_merged.vcf

I don't see any errors in my job log, but I think my output VCF is corrupted: bcftools query and other commands throw errors when I try to inspect the file.

For example, when I run bcftools query -f "%FORMAT\n" all_merged.vcf I get the following errors:

[W::vcf_parse] Contig '' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: ""
[E::vcf_parse_format] Incorrect number of FORMAT fields at :1

I then tried to make an index and got more errors:

[E::hts_idx_push] Unsorted positions on sequence #1: 6 followed by 1
tbx_index_build failed: all_merged.vcf.gz

I then got an error trying to sort the file:

[W::vcf_parse] Contig '' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: ""
[E::vcf_parse_format] Incorrect number of FORMAT fields at :1
Error encountered while parsing the input
Cleaning

It could be a problem further upstream, but I'm wondering if you have suggestions on parameters to change in the merge command or other solutions. Thank you in advance!

Best,
Heidi

Found the problem: some of my input VCFs were still compressed because I had not unzipped them before merging. Closing!
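
For anyone who runs into the same symptoms, one quick check is to look for the gzip magic bytes at the start of each input file. The following is a minimal Python sketch, assuming the list file passed to SURVIVOR merge (`samples` in the command above) contains one VCF path per line. The helper names `is_gzipped` and `find_gzipped` are hypothetical and are not part of SURVIVOR or bcftools.

```python
from pathlib import Path

# Every gzip file (including bgzip output) starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def find_gzipped(list_file):
    """Return the paths named in `list_file` (one per line) that are still compressed."""
    lines = Path(list_file).read_text().splitlines()
    paths = [line.strip() for line in lines]
    # Skip blank lines, then keep only the files that look gzip-compressed.
    return [p for p in paths if p and is_gzipped(p)]
```

Calling `find_gzipped("samples")` from the directory where the merge was launched should return any inputs that are still compressed. Decompressing those files (for example with gunzip) and re-running the merge should address the cause reported in this thread.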