About the 1000 Genomes dataset
Caiyong-Yin opened this issue
Dear authors,
When I process the 1000 Genomes dataset with the command 'yhaplo -i 1000y.all.vcf.gz', the Linux system responds with ‘ERROR. Could not open: 1000Y.all.vcf.gz’. I have checked the path and the name of the data file. How can I solve this problem?
Caiyong Yin
Fudan University
Hmm, I just downloaded a fresh copy of the file from:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/chrY/ALL.chrY_10Mbp_mask.glia_freebayes_maxLikGT_siteQC.20130502.60555_biallelic_snps.vcf.gz
I then ran:
ln -sf ALL.chrY_10Mbp_mask.glia_freebayes_maxLikGT_siteQC.20130502.60555_biallelic_snps.vcf.gz 1000y.all.vcf.gz
yhaplo -i 1000y.all.vcf.gz
And it worked fine, so I'm guessing your download was corrupted. You could try reading the file with tabix:
tabix -p vcf 1000y.all.vcf.gz
tabix -H 1000y.all.vcf.gz
But it's probably easiest to just re-download.
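If you would rather check the existing file before downloading again, something along these lines may help narrow it down. This is only a sketch: `check_vcf` is a made-up helper name, not part of yhaplo or tabix, and it only tests that the path exists, is readable, and is a valid gzip stream. Since the error message in the report spells the name with a capital Y, it may also be worth checking the exact spelling, because filenames on Linux are usually case-sensitive.

```shell
# Hypothetical helper (not part of yhaplo): sanity-check a gzipped VCF.
check_vcf() {
    f="$1"
    # -e follows symlinks, so a dangling link is also reported as missing.
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 1
    fi
    if [ ! -r "$f" ]; then
        echo "unreadable: $f"
        return 1
    fi
    # gzip -t tests the compressed stream without writing any output;
    # a truncated or corrupted download should fail here.
    if ! gzip -t "$f" 2>/dev/null; then
        echo "corrupt: $f"
        return 1
    fi
    echo "ok: $f"
}
```

For example, `check_vcf 1000y.all.vcf.gz && yhaplo -i 1000y.all.vcf.gz` would only run yhaplo if the file passes these basic checks.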