chrchang / plink-ng

A comprehensive update to the PLINK association analysis toolset. Beta testing of the first new version (1.90), focused on speed and memory efficiency improvements, is finishing up. Development is now focused on building out support for multiallelic, phased, and dosage data in PLINK 2.0.

Home Page: https://www.cog-genomics.org/plink/2.0/

read haploid dosages with pgenlib

23andme-jaredo opened this issue

Is it possible to read haploid dosages with pgenlib.PgenReader?

thanks,

Jared

As with the plink .bed format, haploid vs. diploid is not directly encoded in the .pgen. Instead, plink and plink2 divide the encoded values by two when the .bim/.pvar (and on chrX, .fam/.psam) file indicates that we're dealing with haploid data.
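To illustrate the halving described above, here is a minimal Python sketch. It is not part of pgenlib or plink2; the function name and the `is_haploid` flag are hypothetical, and in practice the flag would be derived from the chromosome code in the .bim/.pvar file (and, on chrX, the sex column in the .fam/.psam file).

```python
def effective_allele_count(encoded_value: float, is_haploid: bool) -> float:
    """Interpret a value stored on the diploid 0..2 scale.

    Hypothetical helper for illustration only: the stored value does not
    say whether the sample is haploid, so the caller supplies that fact
    from the variant and sample metadata.
    """
    if is_haploid:
        # Haploid data: the stored 0..2 value is divided by two.
        return encoded_value / 2
    # Diploid data: the stored value is used as-is.
    return encoded_value


# The same stored value is read differently depending on ploidy.
diploid_count = effective_allele_count(2, is_haploid=False)
haploid_count = effective_allele_count(2, is_haploid=True)
```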

Hmm, I am a bit confused. I have imputed data converted from BCF via:

plink2 --bcf $bcf dosage=HDS --make-pfile

and I can see that the two haploid dosages per individual are stored because I can recover them via:

plink2 --pfile plink2 --export vcf bgz vcf-dosage=HDS

So I am trying to extract those HDS values via pgenlib.

Maybe I wasn't clear that I meant imputed haploid/phased probabilities, not hard genotypes.

Oh, sorry, I thought you were referring to e.g. chrX/chrY/chrM.

The PgrGetDp() function in pgenlib_read.h is the simplest one that can return biallelic phased dosages.
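For anyone wrapping this, a rough Python sketch of how per-haplotype dosages could be recovered follows. It assumes (this is an assumption, not something confirmed in this thread) that a biallelic phased dosage is represented as a total ALT dosage on the 0..2 scale plus a signed difference between the two haplotype dosages. The function and parameter names are hypothetical and are not pgenlib API.

```python
from typing import Tuple


def haplotype_dosages(total_dosage: float, phase_delta: float) -> Tuple[float, float]:
    """Split a phased dosage into two per-haplotype dosages.

    Assumed representation (hypothetical, for illustration):
      total_dosage = hap1 + hap2   (0..2)
      phase_delta  = hap1 - hap2   (-1..1)
    Solving the two equations gives the values below.
    """
    hap1 = (total_dosage + phase_delta) / 2
    hap2 = (total_dosage - phase_delta) / 2
    return hap1, hap2


# Example: total dosage 1.5 with haplotype 1 carrying 0.5 more than haplotype 2.
hds = haplotype_dosages(1.5, 0.5)
```

Under that assumption, the two returned values correspond to the pair of numbers written per sample in the HDS field on VCF export.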

Thanks! We'll try exposing that in python.