qpDstat produces strange results for some combinations of species
smallfishcui opened this issue
Hi,
I am using qpDstat in ADMIXTOOLS to detect introgression among species.
I used the convertVCFtoEigenstrat.sh script to convert my VCF file to EIGENSTRAT format and assigned each individual to a population. When I run qpDstat, some analyses seem to run fine, but others just show 0, and the ones that ran fine were all significant. Could anybody tell me what the problem is?
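For context, the population assignment step can be sketched roughly as below. This is a minimal sketch, assuming the EIGENSTRAT .ind layout of three whitespace-separated columns (sample ID, sex, population label). The sample names and the `assign_populations` helper are hypothetical, and what the conversion script writes into the third column by default is also an assumption.

```python
def assign_populations(ind_lines, sample_to_pop):
    """Rewrite the third column of EIGENSTRAT .ind lines.

    Samples missing from sample_to_pop keep their existing label.
    """
    out = []
    for line in ind_lines:
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got: {line!r}")
        sample, sex, label = fields
        out.append(f"{sample} {sex} {sample_to_pop.get(sample, label)}")
    return out


# Hypothetical sample names; the population labels are the ones used in this issue.
ind_lines = ["fish01 U fish01", "fish02 U fish02", "fish03 U fish03"]
sample_to_pop = {"fish01": "Med", "fish02": "CN"}

relabelled = assign_populations(ind_lines, sample_to_pop)
for line in relabelled:
    print(line)
```

Keeping the old label for unmapped samples makes it easy to spot individuals that were never assigned to a population.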
Here is part of my result:
| # | W | X | Y | Z | D | stderr | Zscore | BABA | ABBA | nsnps |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Med | CN | AU | EU | 0 | 1 | 0 | 0 | 0 | 0 |
| 2 | Med | CN | AU | Usland | 0 | 1 | 0 | 0 | 0 | 0 |
| 3 | Med | CN | AU | Usnat | 0 | 1 | 0 | 0 | 0 | 0 |
| 4 | Med | CN | AU | OG | 0 | 1 | 0 | 0 | 0 | 0 |
| 5 | Med | CN | EU | AU | 0 | 1 | 0 | 0 | 0 | 0 |
| 6 | Med | CN | EU | Usland | 0 | 1 | 0 | 0 | 0 | 0 |
| 7 | Med | CN | EU | Usnat | 0 | 1 | 0 | 0 | 0 | 0 |
| 8 | Med | CN | EU | OG | 0 | 1 | 0 | 0 | 0 | 0 |
| 9 | Med | CN | Usland | AU | 0 | 1 | 0 | 0 | 0 | 0 |
| 10 | Med | CN | Usland | EU | 0 | 1 | 0 | 0 | 0 | 0 |
| 37 | Med | AU | OG | CN | 0 | 1 | 0 | 0 | 0 | 0 |
| 38 | Med | AU | OG | EU | 0 | 1 | 0 | 0 | 0 | 0 |
| 39 | Med | AU | OG | Usland | 0 | 1 | 0 | 0 | 0 | 0 |
| 40 | Med | AU | OG | Usnat | 0 | 1 | 0 | 0 | 0 | 0 |
| 41 | Med | EU | CN | AU | 0 | 1 | 0 | 0 | 0 | 0 |
| 42 | Med | EU | CN | Usland | -0,3349 | 0,035253 | -9,5 | 2485 | 4986 | 214014 |
Thanks,
Cui
Hi,
I'm back again, still with the same problem.
W X Y Z D stderr Zscore BABA ABBA nsnps
1 Med EU Usland CN 0.795 0.140 5.68 196 22 163723
2 Med EU AU CN -0.0341 0.0645 -0.529 16 18 163723
3 Med Usland EU CN 0 1 0 0 0 0
4 Med Usland AU CN 0 1 0 0 0 0
5 Med AU EU CN 0 1 0 0 0 0
6 Med AU Usland CN 0 1 0 0 0 0
7 EU Med Usland CN -0.795 0.140 -5.68 22 196 163723
It seems this is not caused by a genotyping problem in one particular population; rather, it goes wrong only for some combinations. For example, the data should have 163723 SNPs in total, but for the failed lines nsnps is 0. How can that be?
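To make that check reproducible, here is a minimal sketch that parses the rows of the table above, separates the failed quadruples (nsnps equal to 0) from the working ones, and lists any population that appears only in failed rows. The helper names are hypothetical, and the script only assumes the ten-column layout shown above.

```python
# Rows copied from the table above: W X Y Z D stderr Zscore BABA ABBA nsnps
RAW = """\
Med EU Usland CN 0.795 0.140 5.68 196 22 163723
Med EU AU CN -0.0341 0.0645 -0.529 16 18 163723
Med Usland EU CN 0 1 0 0 0 0
Med Usland AU CN 0 1 0 0 0 0
Med AU EU CN 0 1 0 0 0 0
Med AU Usland CN 0 1 0 0 0 0
EU Med Usland CN -0.795 0.140 -5.68 22 196 163723
"""


def parse_rows(text):
    """Turn whitespace-separated result lines into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # skip blank or malformed lines
        rows.append({
            "pops": tuple(fields[:4]),
            "D": float(fields[4]),
            "stderr": float(fields[5]),
            "Z": float(fields[6]),
            "BABA": int(fields[7]),
            "ABBA": int(fields[8]),
            "nsnps": int(fields[9]),
        })
    return rows


rows = parse_rows(RAW)
ok = [r for r in rows if r["nsnps"] > 0]
failed = [r for r in rows if r["nsnps"] == 0]

print("failed quadruples:")
for r in failed:
    print("  ", " ".join(r["pops"]))

# Populations that never appear in a working row would point to a bad population.
ok_pops = {p for r in ok for p in r["pops"]}
failed_pops = {p for r in failed for p in r["pops"]}
only_in_failed = sorted(failed_pops - ok_pops)
print("populations only in failed rows:", only_in_failed)
```

Every population in a failed row also appears in at least one working row, which is why a single badly genotyped population does not look like the cause.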
br,
Cui
Hi Nick,
I just noticed an error message from convertf saying "snp order check fail; snp list not order", as someone posted here: DReichLab/EIG#37
and here: https://www.biostars.org/p/389958/
So I guess a malformed map file may be what caused the ADMIXTOOLS run to fail.
Here is the format of my map file produced by vcftools:
7180003098625 340:374:- 0 2078
7180003098625 340:233:- 0 2219
7180003098625 340:213:- 0 2239
7180003098625 340:188:- 0 2264
7180003098625 340:133:- 0 2319
7180003098625 340:83:- 0 2369
Following your instructions, I changed the map file to:
1 340:374:- 0 2078
1 340:233:- 0 2219
1 340:213:- 0 2239
1 340:188:- 0 2264
1 340:133:- 0 2319
1 340:83:- 0 2369
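The change to column 1 shown above could be scripted roughly as follows. This is a minimal sketch, assuming a four-column, whitespace-separated map file; `renumber_chromosomes` is a hypothetical helper that numbers contig names in order of first appearance and leaves the other three columns untouched. Whether convertf accepts the resulting numbers when there are many contigs is an assumption I have not verified.

```python
def renumber_chromosomes(lines):
    """Replace column 1 of map-file lines with sequential integers.

    Returns the rewritten lines and the contig-name-to-integer mapping.
    """
    mapping = {}
    out = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got: {line!r}")
        contig = fields[0]
        if contig not in mapping:
            mapping[contig] = len(mapping) + 1  # number contigs from 1
        out.append(" ".join([str(mapping[contig])] + fields[1:]))
    return out, mapping


# First three lines of the original map file shown above.
example = [
    "7180003098625 340:374:- 0 2078",
    "7180003098625 340:233:- 0 2219",
    "7180003098625 340:213:- 0 2239",
]

new_lines, mapping = renumber_chromosomes(example)
for line in new_lines:
    print(line)
```

Returning the mapping alongside the rewritten lines keeps a record of which original contig each new number stands for.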
However, I don't know how to format columns $2 and $4 (the SNP identifier and the SNP coordinate). Can you give me some suggestions?
thanks,
Cui