Merging UKB SV Files
GHawkes93 opened this issue
Hi,
In the recent release of 500,000 genomes, the UKB has provided SV calls, but only as bgzipped sample-level VCF files.
I've tried merging these files in groups to create a pVCF (after unzipping each VCF, as SURVIVOR doesn't seem to accept .gz files), but the group files grow so large that I can't then merge the groups together: the process dies with a "Killed" error. I also tried using bcftools to trim the VCF files so that only genotypes are kept in the FORMAT field, but then the merge behaved oddly: merging two files with 9,000 people each produced an output containing only 2 individuals.
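For anyone wanting to reproduce the sample-count check, here is a minimal sketch of one way to count the individuals in a VCF without loading the whole file. It assumes the standard VCF layout, where the `#CHROM` header line has nine fixed columns followed by one column per sample. The function name `count_vcf_samples` is hypothetical and is not part of SURVIVOR or bcftools.

```python
import gzip


def count_vcf_samples(path):
    """Return the number of sample columns on the #CHROM header line.

    Hypothetical helper: assumes a standard VCF, plain or gzip/bgzip compressed.
    """
    # bgzip output is a series of gzip members, which the gzip module can read.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                columns = line.rstrip("\n").split("\t")
                # The first 9 columns are fixed (CHROM through FORMAT);
                # every column after that is one sample.
                return max(len(columns) - 9, 0)
            if not line.startswith("#"):
                # Reached the records without seeing a column header line.
                break
    raise ValueError("no #CHROM header line found in " + path)
```

Running this on both inputs and on the merged output shows whether the samples are already missing after the trimming step or only after the merge.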
Do you have any suggestions for how I could perform this analysis?
Cheers,
Gareth
I should add: I'm using a 72-core machine, and each group file (approx. 9k people) is ~270 GB and contains ~0.5M SVs.