Should I use imputed data for the plink file in step 1 of SAIGE instead of the direct genotype data from the SNP array?
Apprentice2 opened this issue · comments
I built a sparse GRM with the following commands and then ran SAIGE step 1. The plink file contains SNPs with MAF ≥ 1% that were typed directly on the SNP array. The data cover 1000 individuals.
#!/bin/bash
cpu=90
trait=SBP
conda activate saige
mkdir output

# Build the sparse GRM from the directly typed SNPs
createSparseGRM.R \
    --plinkFile=./mygeno \
    --nThreads=${cpu} \
    --outputPrefix=./output/sparseGRM \
    --numRandomMarkerforSparseKin=2000 \
    --relatednessCutoff=0.125

# Step 1: fit the null model using the sparse GRM
step1_fitNULLGLMM.R \
    --plinkFile=./mygeno \
    --sparseGRMFile=./output/sparseGRM_relatednessCutoff_0.125_2000_randomMarkersUsed.sparseGRM.mtx \
    --sparseGRMSampleIDFile=./output/sparseGRM_relatednessCutoff_0.125_2000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt \
    --useSparseGRMtoFitNULL=TRUE \
    --phenoFile=./chrs.imputed.rehead.dose.phe.txt \
    --phenoCol=${trait} \
    --covarColList=Sex,PC1,PC2 \
    --qCovarColList=PC1,PC2 \
    --sampleIDColinphenoFile=IID \
    --invNormalize=TRUE \
    --traitType=quantitative \
    --nThreads=${cpu} \
    --IsOverwriteVarianceRatioFile=TRUE \
    --isCateVarianceRatio=TRUE \
    --outputPrefix=./output/${trait}_sparseGRM_temo
However, after running the step 1 command, I get the following error message:
ERROR! number of genetic variants in 10< MAC <= 20.5 is lower than 30
Please include more markers in this MAC category in the plink file
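To see where the markers go missing, here is a rough sketch for counting markers per MAC category. It assumes a PLINK 1.9 allele-count table (columns CHR SNP A1 A2 C1 C2 G0), which I believe can be produced with `plink --bfile ./mygeno --freq counts --out ./output/mygeno`. The helper names `count_mac_bins` and `min_mac_after_maf_filter` are made up for illustration and are not part of SAIGE. The two bins follow the boundary in the error message. The second helper shows that with 1000 diploid samples (2000 alleles), a MAF ≥ 1% filter leaves only markers with MAC ≥ 20, so only MAC = 20 can land in the 10 < MAC <= 20.5 category.

```shell
#!/bin/bash
# Sketch only: count markers per MAC category from a PLINK 1.9 .frq.counts table.
# Assumed columns: CHR SNP A1 A2 C1 C2 G0 (C1/C2 are the two allele counts).
# count_mac_bins is a hypothetical helper, not a SAIGE or PLINK command.
count_mac_bins() {
  awk 'NR > 1 {
         mac = ($5 < $6) ? $5 : $6            # minor allele count = min(C1, C2)
         if (mac > 10 && mac <= 20.5) low++   # category named in the error message
         else if (mac > 20.5) high++          # everything above the boundary
       }
       END { printf "10<MAC<=20.5: %d\nMAC>20.5: %d\n", low, high }' "$1"
}

# Approximate smallest MAC that survives a MAF filter: 2 * N * MAF,
# assuming diploid samples and no missing genotypes.
min_mac_after_maf_filter() {
  awk -v n="$1" -v maf="$2" 'BEGIN { printf "%.0f\n", 2 * n * maf }'
}
```

For example, `count_mac_bins ./output/mygeno.frq.counts` would print the two counts, and `min_mac_after_maf_filter 1000 0.01` gives the MAC threshold implied by the MAF filter for this sample size.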
Step 1 requires the plink file to contain hard-call genotypes. I used the directly typed SNP-array data because I thought step 1 should be run on high-quality genotypes. On the other hand, low-frequency SNPs in the SNP array data are of low quality.
I also have imputed data for the same samples. The reference panel is 1000 Genomes Phase 3 (all ancestries). The data are in VCF format. Should I convert them to a plink file and use that, without the MAF filter, to build the GRM and run step 1?
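For concreteness, the conversion I have in mind would look roughly like the sketch below. It only prints a plink2 command for review and does not run it. The helper name `vcf_to_plink_cmd`, the file paths, and the default MAC cutoff of 11 are placeholders I chose for illustration. The sketch assumes the VCF carries GT hard calls and uses plink2's `--mac` filter in place of a MAF filter, so that markers with 10 < MAC <= 20.5 are kept. Whether this is appropriate for step 1 is exactly what I am asking.

```shell
#!/bin/bash
# Sketch only: build (but do not run) a plink2 command that converts an
# imputed VCF into hard-call bed/bim/fam files.
# vcf_to_plink_cmd is a hypothetical helper; the paths and cutoff are placeholders.
vcf_to_plink_cmd() {
  local vcf="$1" out="$2" min_mac="${3:-11}"
  # --mac keeps variants with minor allele count >= min_mac, so no MAF
  # filter is applied and low-MAC markers remain in the output.
  printf 'plink2 --vcf %s --mac %s --make-bed --out %s\n' "$vcf" "$min_mac" "$out"
}
```

Calling `vcf_to_plink_cmd ./imputed.vcf.gz ./output/imputed_hardcall` prints the command, which can be inspected and then run by hand if plink2 is installed.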
I have checked the following post but did not find a clear answer to this question, so I am posting it again. I would appreciate any guidance.
weizhouUMICH/SAIGE#226