23andMe / yhaplo

Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men

About the 1000 Genomes dataset

Caiyong-Yin opened this issue

Dear authors,

When I handle the 1000 Genomes dataset with the command 'yhaplo -i 1000y.all.vcf.gz', the Linux system responds with ‘ERROR. Could not open: 1000Y.all.vcf.gz’. I have checked the path and the name of the data file. How can I solve this problem?

Caiyong Yin
Fudan University

Hmm, I just downloaded a fresh copy of the file from:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/chrY/ALL.chrY_10Mbp_mask.glia_freebayes_maxLikGT_siteQC.20130502.60555_biallelic_snps.vcf.gz

I then ran:

ln -sf ALL.chrY_10Mbp_mask.glia_freebayes_maxLikGT_siteQC.20130502.60555_biallelic_snps.vcf.gz 1000y.all.vcf.gz
yhaplo -i 1000y.all.vcf.gz

And it worked fine. So I'm guessing your download was corrupted. You could try reading the file with tabix:

tabix -p vcf 1000y.all.vcf.gz
tabix -H 1000y.all.vcf.gz

But it's probably easiest to just re-download.
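
Before re-downloading, a quick check can narrow down why the file will not open. The sketch below is a hypothetical helper, not part of yhaplo or tabix; the function name `check_vcf_gz` and its return codes are assumptions made for illustration. It tests whether the path exists (a dangling symlink counts as missing), whether it is readable, and whether the gzip stream is intact. Note also that the file name in the reported error ('1000Y.all.vcf.gz') differs in case from the one in the command ('1000y.all.vcf.gz'); that may simply be a typo in the report, but file names on Linux are case-sensitive, so it is worth ruling out.

```shell
# Hypothetical helper (not part of yhaplo): report why a .vcf.gz might fail to open.
# Return codes are illustrative: 0 ok, 1 missing, 2 unreadable, 3 corrupt.
check_vcf_gz() {
    # A dangling symlink also fails the -e test.
    if [ ! -e "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "unreadable: $1"
        return 2
    fi
    # gzip -t tests the integrity of the compressed stream without writing output.
    if ! gzip -t "$1" 2>/dev/null; then
        echo "corrupt or not gzip: $1"
        return 3
    fi
    echo "ok: $1"
    return 0
}
```

For example, `check_vcf_gz 1000y.all.vcf.gz` prints `ok: 1000y.all.vcf.gz` when the file is present, readable, and passes the gzip integrity test. A "corrupt" result suggests an interrupted download, in which case re-downloading is the fix.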