hfbassani / h3agwas

GWAS Pipeline for H3Africa

h3agwas

This is currently in draft

Background

h3aGWAS is a simple human GWAS analysis workflow for data quality control (QC) and basic association testing. It was originally built at the Sydney Brenner Institute and later refined and extended by H3ABionet. It uses Nextflow for workflow management and has been dockerised.

Documentation

Installation instructions, examples, and tutorials for witsGWAS can be found in the wiki.

Features

QC of Affymetrix array data (SNP6 raw .CEL files)

  • genotype calling
  • converting birdseed calls to PLINK format

Sample and SNP QC of PLINK Binaries

Sample QC tasks checking:

  • discordant sex information
  • calculating missingness
  • heterozygosity scores
  • relatedness
  • divergent ancestry
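
To illustrate two of the sample checks listed above, here is a minimal Python sketch of per-sample missingness and heterozygosity. It is not taken from the pipeline, which runs these checks through PLINK. The function name `sample_qc` and the genotype encoding (0, 1, 2 copies of an allele, `None` for a missing call) are assumptions made for this example.

```python
def sample_qc(genotypes):
    """Return (missing_rate, het_rate) for one sample.

    genotypes: list of calls coded 0, 1 or 2 (allele dosage),
    with None marking a missing call. This encoding is an
    assumption for the sketch, not the pipeline's file format.
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = [g for g in genotypes if g is not None]
    # Fraction of SNPs with no call for this sample.
    missing_rate = 1 - len(called) / len(genotypes)
    # Fraction of called SNPs that are heterozygous (dosage 1).
    if called:
        het_rate = sum(1 for g in called if g == 1) / len(called)
    else:
        het_rate = float("nan")
    return missing_rate, het_rate


if __name__ == "__main__":
    print(sample_qc([0, 1, 2, None]))
```

Samples with unusually high missingness, or heterozygosity far from the cohort average, are the usual candidates for removal; the cut-offs are a study-specific choice.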

SNP QC tasks checking:

  • duplicate SNPs
  • discordant sex information
  • minor allele frequencies
  • SNP missingness
  • differential missingness
  • Hardy-Weinberg equilibrium deviations
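
As a rough illustration of the minor allele frequency and Hardy-Weinberg checks above, the following Python sketch computes both from genotype counts at a single SNP. It uses a simple chi-square goodness-of-fit statistic; PLINK's own Hardy-Weinberg filter uses an exact test, so this is an approximation for explanation only. The function name `maf_and_hwe_chi2` is hypothetical.

```python
def maf_and_hwe_chi2(n_aa, n_ab, n_bb):
    """Return (maf, chi2) for one biallelic SNP.

    n_aa, n_ab, n_bb: observed genotype counts (AA, AB, BB).
    chi2 is the 1-degree-of-freedom goodness-of-fit statistic
    against Hardy-Weinberg expected counts.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                          # frequency of allele B
    maf = min(p, q)
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return maf, chi2


if __name__ == "__main__":
    print(maf_and_hwe_chi2(30, 40, 30))
```

A large statistic suggests a departure from equilibrium, which in QC is often read as a sign of genotyping error; the threshold applied is a study-specific choice.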

Association testing

  • Basic PLINK association tests, producing Manhattan and QQ plots
  • CMH association test (association analysis accounting for clusters)
  • permutation testing
  • logistic regression
  • EMMAX association testing
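
To give a sense of what a basic case/control association test computes for each SNP, here is a small Python sketch of an allelic chi-square test on a 2x2 table of allele counts. It is an illustration under simplifying assumptions (no covariates, no clusters, no continuity correction), not the pipeline's code. The function name `allelic_chi2` is hypothetical.

```python
def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Return (chi2, odds_ratio) for one SNP.

    case_a, case_b: counts of alleles A and B among cases.
    ctrl_a, ctrl_b: counts of alleles A and B among controls.
    chi2 is the Pearson statistic with 1 degree of freedom.
    """
    table = [[case_a, case_b], [ctrl_a, ctrl_b]]
    total = case_a + case_b + ctrl_a + ctrl_b
    row_sums = [sum(row) for row in table]
    col_sums = [case_a + ctrl_a, case_b + ctrl_b]
    if total == 0 or 0 in row_sums or 0 in col_sums:
        raise ValueError("table has an empty row or column")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Odds ratio for allele A in cases versus controls.
    if case_b == 0 or ctrl_a == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    return chi2, odds_ratio


if __name__ == "__main__":
    print(allelic_chi2(60, 40, 40, 60))
```

Running such a test at every SNP and plotting the resulting p-values by genomic position gives a Manhattan plot; plotting them against their expected distribution under the null gives a QQ plot.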

Running the pipeline

The pipeline is controlled through the nextflow.config file, which holds all settings, including the input files and analysis parameters. This file can be edited manually.

Copyright

Authors

Lerato E. Magosi, Kiran Anmol, Shaun Aron, Rob Clucas, Eugene de Beste, Scott Hazelhurst, Aboyomini Mosaku, Don Armstrong and the Wits Bioinformatics team

License

witsGWAS is offered under the MIT license. See LICENSE.txt.

Download

git clone https://github.com/h3abionet/h3agwas
