siddrrsh / lmm22

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Increasingly, genomics is being used for the prediction of specific traits and diseases (phenotypes) among humans. Wider availability of genomics data through multiple research projects (such as International HapMap Project and 1000 Genomes ) has been a catalyst in that direction. With the recent advances in machine learning and big data analysis, data computation resources and data models needed for genomics data analysis are readily available. However, the prediction of traits and diseases has its own challenges in terms of computational requirements and computational analysis, statistical analysis (example: confounding variables), and limited quality of data collection. Linear Mixed Models (LMM, a type of linear regression) is a common approach for Genome-wide Association Studies (GWAS) for the prediction of common traits among humans using genomics. This paper researches the existing LMM-based approaches for Genome-wide Association Studies (GWAS), describes the experiment performed on FaST-LMM approach from Microsoft Research, and then proposes an enhanced approach (called LMM-22) on how to address computational and statistical issues. LMM-22 focuses on the parallelization of LMM computations and execution of LMM-22 on General Purpose Graphics Processing Units (GPU) as against CPUs to accelerate the LMM approach for GWAS studies.

About