DEMON-NEUROHACK / Challenge-1-London-Team-B-Genome-Wide-Association-Lovers-

We suggest a novel way of defining NCD as a continuous phenotype, and we employ a mixed linear model to account for the pronounced population stratification in the South Asian sample, which is more diverse than, for example, European samples.

Challenge-1-London-Team-B aka "Genome-Wide Association Lovers"

Project Summary

For a summary of our completed code and pipeline, please refer to our GitHub page. For a plain R Markdown version of the pipeline that users can run themselves, please refer to the Summary.Rmd file.

Formal Pipeline in Progress

We are implementing our workflows as NextFlow pipelines.
The benefits of using NextFlow are:

  • increased scalability
  • increased portability
  • better reproducibility
  • integration with both Docker and Singularity (so running on servers that only support Singularity is not an issue, unlike with many other workflow managers)
  • workflow itself can be dockerized
  • workflow can pull Docker images to provide software instead of requiring local installations and maintenance
  • modularity

Repo Structure

  • templates dir: contains files with the original code that is used in the .nf (NextFlow) scripts
  • workflow dir: contains the .nf scripts
    • NOTE: due to time constraints the .nf files are incomplete; additional code from templates still needs to be migrated into the .nf files before they are fully functional
    • current status: all zipped .vcf files in a folder are unzipped automatically, and the unzipped files are then merged into a single .vcf file containing all of the information
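
The current status described above (unzip every .vcf file in a folder, then merge them) can be sketched in Python as follows. This is only an illustrative sketch, not the code in the .nf scripts: the function names gunzip_all and merge_vcfs are hypothetical, the files are assumed to be gzip-compressed with a .vcf.gz extension, and the merge is assumed to be a simple concatenation of files that share the same header and sample columns (for example, one file per chromosome). Merging files with different samples would need a dedicated tool instead.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_all(folder: Path) -> list[Path]:
    """Decompress every *.vcf.gz in `folder`; return the resulting .vcf paths.

    Assumption: inputs are gzip-compressed and named *.vcf.gz.
    """
    vcf_paths = []
    for gz_path in sorted(folder.glob("*.vcf.gz")):
        vcf_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
        with gzip.open(gz_path, "rb") as src, open(vcf_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        vcf_paths.append(vcf_path)
    return vcf_paths


def merge_vcfs(vcf_paths: list[Path], merged_path: Path) -> int:
    """Concatenate VCF files, keeping header lines from the first file only.

    Assumption: all files share the same header and sample columns.
    Returns the number of variant records written.
    """
    n_records = 0
    with open(merged_path, "w") as out:
        for index, path in enumerate(vcf_paths):
            with open(path) as handle:
                for line in handle:
                    if not line.endswith("\n"):
                        line += "\n"  # guard against a missing final newline
                    if line.startswith("#"):
                        if index == 0:
                            out.write(line)  # header from the first file only
                    else:
                        out.write(line)
                        n_records += 1
    return n_records
```

Calling merge_vcfs(gunzip_all(Path("data")), Path("merged.vcf")) would process a hypothetical data folder; the output is written outside that folder here so that a rerun does not pick it up as an input.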

NextFlow Environment Setup

To use NextFlow from the DNAnexus Cloud Station, we took the following steps (see history.txt for the detailed commands):

  • NextFlow Installation
    • requires Java installation
  • Docker Installation
  • Data Download
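
Before running a workflow, it can help to confirm that the tools installed in the steps above are actually available. The sketch below is an assumption-laden illustration rather than part of history.txt: it assumes the executables are named java, nextflow and docker and are expected on the PATH, and the function name find_missing_tools is hypothetical.

```python
import shutil

# Assumed executable names for the tools listed in the setup steps.
REQUIRED_TOOLS = ("java", "nextflow", "docker")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from find_missing_tools() means all three executables were found; any names it returns point to the installation step that still needs attention.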

Team Members & Contributions

  • Anna Elisabeth Fürtjes (London) - Genetic quality control; formatting genetic files for analysis; creation and testing of Regenie scripts (Stages 1 & 2); advising team members on GWAS methods
  • Mateus Harrington (Cardiff) - Becoming our go-to person for using the DNAnexus platform; creation of the covariate file; creation of snplist files; setting up Regenie to work via DNAnexus
  • Isy Foote (London) - Advising team members on how to integrate our work meaningfully into the dementia field; creation of the phenotype file; troubleshooting errors and queries to support teammates; creation of the presentation
  • David Enoma (Nigeria) - Technical and scientific support; brainstorming; creation of the presentation
  • Gabriela Karina Paulus (NYC) - Setting up our GitHub page; integrating our workflow as a NextFlow pipeline

All team members helped to shape our overall aims and plan.
