chrchang / plink-ng

A comprehensive update to the PLINK association analysis toolset. Beta testing of the first new version (1.90), focused on speed and memory efficiency improvements, is finishing up. Development is now focused on building out support for multiallelic, phased, and dosage data in PLINK 2.0.

Home Page: https://www.cog-genomics.org/plink/2.0/


Error: Failed to extract eigenvector(s) from GRM.

ManuelMoradiellos opened this issue

Hi~!

I wanted to use PLINK (v1.90b6.21 on Ubuntu 20.04.1 Linux x86_64) to run a population stratification PCA
on my variant dataset.

This dataset is a .tsv that I've tried to turn into a mock-up .vcf as input for PLINK. The .vcf headers are not exactly
real, and I left out some information in several columns (QUAL, INFO) because it was a bit cumbersome
to add and I thought the genotype (GT) information should be enough for my task.

I don't know if I'm missing something in the input .vcf or overlooking a command, as I'm relatively new to
bioinformatics, but I've included the .log and a fraction of the input .vcf in case it's any help.
Thanks in advance!!

gazpacho.log
plink_test_input.vcf.gz

99.5% of the genotypes in your input VCF are missing. PCA should not be used on variants with more than ~10% missing data.
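For anyone who wants to check this before running `--pca`, here is a minimal Python sketch that estimates per-variant missingness directly from a VCF. It is not how PLINK computes it. It assumes GT is the first FORMAT subfield and counts a genotype as missing if any allele is `.`. The function names (`open_vcf`, `is_missing`, `missing_rates`, `summarize`) are made up for illustration, and the 0.10 default simply mirrors the ~10% guideline above. PLINK's own `--missing` report and `--geno` filter are the more direct route.

```python
import gzip
from typing import Iterable, List, Tuple


def open_vcf(path: str):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def is_missing(sample_field: str) -> bool:
    """True when the GT subfield (assumed to come first) has any '.' allele."""
    gt = sample_field.split(":", 1)[0]
    alleles = gt.replace("|", "/").split("/")
    return any(allele == "." for allele in alleles)


def missing_rates(lines: Iterable[str]) -> List[float]:
    """Per-variant fraction of samples whose genotype is missing."""
    rates = []
    for line in lines:
        # Skip meta-information lines, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        # Sample columns start after the 9 fixed VCF columns.
        samples = line.rstrip("\n").split("\t")[9:]
        if samples:
            rates.append(sum(is_missing(s) for s in samples) / len(samples))
    return rates


def summarize(lines: Iterable[str], max_missing: float = 0.10) -> Tuple[float, int, int]:
    """Return (overall missing fraction, variants at or under the cutoff, total variants)."""
    rates = missing_rates(lines)
    overall = sum(rates) / len(rates) if rates else 0.0
    kept = sum(rate <= max_missing for rate in rates)
    return overall, kept, len(rates)
```

To try it on the attachment from this thread, something like `with open_vcf("plink_test_input.vcf.gz") as fh: print(summarize(fh))` should work. If the first number is close to 1.0, or very few variants pass the cutoff, PCA is unlikely to succeed.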

Thanks, that was it! Here I was using a small subset as a toy example, but once I tried to apply it to my whole dataset I got what I asked for.

all.vcf.gz
Sorry to bother you, but I ran into the same error: Failed to extract eigenvector(s) from GRM. I've included the input .vcf in case it's any help.
Thank you very much!

Rerun --pca with plink 2.0; it will provide a more useful error message.

Thank you for your reply. After I deleted several samples with very high similarity to the reference genes, the problem was solved.