chrchang / plink-ng

A comprehensive update to the PLINK association analysis toolset. Beta testing of the first new version (1.90), focused on speed and memory efficiency improvements, is finishing up. Development is now focused on building out support for multiallelic, phased, and dosage data in PLINK 2.0.

Home Page: https://www.cog-genomics.org/plink/2.0/


plink2 --within flag not clustering

reneemf opened this issue

I was attempting to calculate allele frequencies by population using the --within flag as follows:
plink2 --bfile in_file -allow-no-sex --freq --within populations.txt --out out_file
The analysis ran and produced a .afreq file, but the output had no clusters/stratification. I then ran the exact same analysis with plink 1.9 instead of plink2:
plink --bfile in_file -allow-no-sex --freq --within populations.txt --out out_file
and the output file (.frq.strat) contained the CLST column as expected.
plink2 doesn't seem to be clustering properly with the --within flag. Additionally, the --family flag does not cluster by FID in plink2 (it outputs an unclustered .afreq file).

This is a case where backward-compatibility was deliberately dropped in plink2 because the old behavior was a weird corner-case relative to the rest of plink's design.

The plink2 way to get cluster-stratified statistics is to combine --within/--family with --loop-cats; this generates a separate set of output files for each category. If you prefer what plink 1.x did, it's fine to continue using it; it will continue to be maintained.
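As a rough sketch of what that combination could look like for the command in the original report: the version below assumes that --within loads the cluster file under plink2's default categorical phenotype name, CATPHENO, and that --loop-cats takes that name as its argument. The build_cmd helper is hypothetical and only there to make the command easy to inspect before running it; the file names are the ones from the report.

```shell
#!/bin/sh
# Sketch: per-population allele frequencies in plink2 via --loop-cats.
# Assumption: populations.txt is a PLINK 1.x cluster file (FID, IID, cluster name)
# and --within stores it under the default phenotype name CATPHENO.

# Hypothetical helper: prints the plink2 command line.
# $1 = input fileset prefix, $2 = cluster file, $3 = output prefix
build_cmd() {
  printf 'plink2 --bfile %s --within %s --loop-cats CATPHENO --freq --out %s\n' "$1" "$2" "$3"
}

cmd=$(build_cmd in_file populations.txt out_file)
echo "$cmd"

# Run only if plink2 is installed and the input fileset is present.
# The unquoted expansion relies on word splitting, so paths must not contain spaces.
if command -v plink2 >/dev/null 2>&1 && [ -f in_file.bed ]; then
  $cmd
fi
```

If those assumptions hold, this should write one .afreq file per population rather than a single stratified table. To loop over family IDs instead, replacing `--within populations.txt` with `--family` should work the same way, again assuming the default CATPHENO name.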

Thanks for letting me know!