chrchang / plink-ng

A comprehensive update to the PLINK association analysis toolset. Beta testing of the first new version (1.90), focused on speed and memory efficiency improvements, is finishing up. Development is now focused on building out support for multiallelic, phased, and dosage data in PLINK 2.0.

Home Page: https://www.cog-genomics.org/plink/2.0/

pgen spec: when the bottom 3 bits of the record type are 3

kose-y opened this issue

The current PGEN spec states:

3: LD-compressed, inverted. A difflist with all (sample ID, genotype value) pairs for samples in different categories than they would be in the previous non-LD-compressed variant after inversion (categories 0 and 2 swapped). This addresses spots where the reference genome is “wrong” for the population of interest.

I'm working on an independent PGEN reader in Julia, and it seems that the genotype value in the difflist is also inverted. I think it should be explicitly stated in the spec.
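To make the point concrete, here is a minimal Python sketch of the behaviour I am seeing. It is not taken from pgenlib; the names `invert` and `decode_ld_inverted` are made up for illustration, and it assumes genotypes are already unpacked into a list of 2-bit category codes (0 to 3) and that the difflist has been parsed into (sample index, stored value) pairs. Under this reading, the stored difflist values are written onto a copy of the base variant first, and the 0/2 swap is then applied to every sample, so the stored values are pre-inversion.

```python
def invert(genotype):
    """Swap categories 0 and 2; categories 1 and 3 are left unchanged."""
    if genotype == 0:
        return 2
    if genotype == 2:
        return 0
    return genotype


def decode_ld_inverted(base, difflist):
    """Decode a type-3 (LD-compressed, inverted) record.

    base:     category codes of the previous non-LD-compressed variant
    difflist: (sample_index, stored_value) pairs; assumed to be stored
              pre-inversion, which is the reading reported in this issue
    """
    out = list(base)
    # Step 1: apply the difflist values exactly as stored.
    for sample_index, stored_value in difflist:
        out[sample_index] = stored_value
    # Step 2: invert everything, including the entries just written.
    return [invert(g) for g in out]


base = [0, 1, 2, 3, 0]
difflist = [(1, 2), (4, 1)]
decoded = decode_ld_inverted(base, difflist)
print(decoded)
```

If the difflist values were instead stored in their final form, sample 1 in this example would decode to 2 rather than 0, which is why the wording matters for an independent reader.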

Thanks for reporting this; it should be clarified by f004302.

Thanks!