This is currently in draft
h3aGWAS is a simple human GWAS analysis workflow originally built at the Sydney Brenner Institute for data quality control (QC) and basic association testing, and later refined and extended by H3ABionet. It uses Nextflow as the basis for workflow management, and has been dockerised.
Installation instructions, examples and tutorials for witsGWAS can be found in the wiki.
QC of Affymetrix array data (SNP6 raw .CEL files)
- genotype calling
- converting birdseed calls to PLINK format
Sample and SNP QC of PLINK Binaries
Sample QC tasks checking:
- discordant sex information
- calculating missingness
- heterozygosity scores
- relatedness
- divergent ancestry
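To make two of the sample checks above concrete, here is a minimal sketch of per-sample missingness and heterozygosity rate. It is an illustration only, not the pipeline's code: the function name `sample_qc` and the genotype encoding (0/1/2 minor-allele counts, `None` for a missing call) are assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple


def sample_qc(genotypes: Sequence[Optional[int]]) -> Tuple[float, float]:
    """Return (missingness, heterozygosity rate) for one sample.

    genotypes holds one entry per SNP: the minor-allele count (0, 1 or 2),
    or None where the genotype call is missing. This encoding is an
    assumption of the sketch, not a format used by the pipeline.
    """
    total = len(genotypes)
    called = [g for g in genotypes if g is not None]

    # Fraction of SNPs with no call for this sample.
    missingness = (total - len(called)) / total if total else 0.0

    # Fraction of called SNPs that are heterozygous (count == 1).
    het_rate = sum(1 for g in called if g == 1) / len(called) if called else 0.0

    return missingness, het_rate
```

Samples whose missingness is high, or whose heterozygosity rate lies far from the cohort mean, are the usual candidates for exclusion; the thresholds are a study-specific choice.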
SNP QC tasks checking:
- duplicate SNPs, which are removed
- discordant sex information
- minor allele frequencies
- SNP missingness
- differential missingness
- Hardy-Weinberg equilibrium deviations
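As an illustration of the minor allele frequency and Hardy-Weinberg checks listed above, the sketch below computes both from the three genotype counts of a single SNP. It uses a one-degree-of-freedom chi-square goodness-of-fit test, which is a textbook approximation; tools such as PLINK use an exact test instead. The function name `maf_and_hwe` is hypothetical.

```python
import math
from typing import Tuple


def maf_and_hwe(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float, float]:
    """Return (MAF, HWE chi-square statistic, p-value) for one SNP.

    n_aa, n_ab, n_bb are the observed counts of the three genotypes.
    Chi-square approximation with 1 degree of freedom (not an exact test).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotype calls for this SNP")

    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    maf = min(p, q)

    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return maf, chi2, p_value
```

For example, genotype counts of 25, 50 and 25 match the Hardy-Weinberg expectation exactly and give a statistic of 0, while counts of 50, 0 and 50 (no heterozygotes at all) give a statistic of 100.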
Association testing
- Basic PLINK association tests, producing Manhattan and QQ plots
- CMH association test: association analysis accounting for clusters
- permutation testing
- logistic regression
- EMMAX association testing
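To show what a basic case/control association test computes for one SNP, here is a minimal sketch of an allelic chi-square test on a 2x2 table of allele counts. The pipeline itself delegates this work to PLINK; the function name `allelic_chi2` and its arguments are hypothetical and exist only for this example.

```python
import math
from typing import Tuple


def allelic_chi2(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int) -> Tuple[float, float]:
    """Return (chi-square statistic, p-value) for a 2x2 allele-count table.

    case_a / case_b: counts of allele A and allele B among cases.
    ctrl_a / ctrl_b: counts of allele A and allele B among controls.
    Pearson chi-square with 1 degree of freedom, no continuity correction.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column of the table must be non-empty")

    # Shortcut formula for the Pearson statistic of a 2x2 table.
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Running a test like this for every SNP yields the p-values that Manhattan and QQ plots summarise.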
Running the pipeline
The pipeline is controlled through the nextflow.config file, which holds all settings, including the input files and the analysis parameters. This file can be edited manually.
Lerato E. Magosi, Kiran Anmol, Shaun Aron, Rob Clucas, Eugene de Beste, Scott Hazelhurst, Aboyomini Mosaku, Don Armstrong and the Wits Bioinformatics team
witsGWAS is offered under the MIT license. See LICENSE.txt.
git clone https://github.com/h3abionet/h3agwas