This repo contains the workflow developed as part of the H3ABioNet Hackathon held in Pretoria, South Africa, in 2016.
- We track our open tasks using GitHub issues.
- The 1000ft view is on our Trello board.
- Nextflow (can be installed as a local user; see the setup sketch after this list)
  - `NXF_HOME` needs to be set, and the `nextflow` launcher must be in the `PATH`.
  - Note that we have experienced problems running Nextflow when `NXF_HOME` is on an NFS mount.
  - The Nextflow script also needs to be invoked from a non-NFS folder.
- Java 1.8+
- The compute nodes need access to shared storage for input, references, and output.
- The following commands need to be available in the `PATH` on the compute nodes.
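Before dealing with the compute-node tools, the Nextflow prerequisites above could be set up roughly as follows; the install directory is a placeholder choice, not something mandated by this repo:

```bash
# Placeholder local (non-NFS) directory for the Nextflow launcher and NXF_HOME.
mkdir -p "$HOME/local/nextflow" && cd "$HOME/local/nextflow"

java -version                            # the workflow needs Java 1.8+
curl -s https://get.nextflow.io | bash   # official installer; drops ./nextflow here

export NXF_HOME="$HOME/local/nextflow"   # keep NXF_HOME off NFS mounts
export PATH="$NXF_HOME:$PATH"
nextflow -version                        # sanity check that the launcher is found
```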
- Clone this repo.
- Run the "tiny" dataset included:

  ```
  nextflow run imputation.nf -c nextflow.test.tiny.config
  ```

- Check for results in the output folder (`outfolder`):

  ```
  wc -l output/impute_results/FINAL_VCFS/*
  ```
- Download this slightly larger dataset, `small.tar.bz2`, and extract it into the `samples` folder (a fetch-and-unpack sketch follows this list).
- Run this "small" dataset with:

  ```
  nextflow run imputation.nf -c nextflow.test.small.config
  ```

- Check for results in the output folder (`outfolder`):

  ```
  wc -l output/impute_results/FINAL_VCFS/*
  ```
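A fetch-and-unpack sketch for the "small" dataset; the download URL below is a placeholder for the actual `small.tar.bz2` link above:

```bash
# Placeholder URL -- use the small.tar.bz2 link referenced above.
wget https://example.org/small.tar.bz2

# Extract the archive into the samples folder, then run the "small" config.
mkdir -p samples
tar -xjf small.tar.bz2 -C samples/
nextflow run imputation.nf -c nextflow.test.small.config
wc -l output/impute_results/FINAL_VCFS/*
```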
This workflow can be run on AWS Batch, an Amazon service that streamlines the running of containerised workflows; Nextflow supports it natively through its `awsbatch` executor.
- Set up the AWS environment (a quick verification sketch follows this list). The requirements are:
  - an AWS account with credits (see the AWS grants page for available grants)
  - a configured Compute Environment and an associated AWS Batch Job Queue
  - a configured S3 bucket
- Populate the `awsbatch` profile in the Nextflow configuration (a fuller profile sketch follows this list):

  ```
  aws.region = 'eu-west-1'
  aws.client.storageEncryption = 'AES256'
  process.queue = 'large'
  executor.name = 'awsbatch'
  executor.awscli = '/home/ec2-user/miniconda/bin/aws'
  ```
- Run with the `awsbatch` profile:

  ```
  nextflow run imputation.nf -c nextflow.test.small.config -profile awsbatch
  ```
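For the AWS environment step above, a few AWS CLI calls can confirm the Batch resources and bucket are reachable from your account; the region, queue name, and bucket name below are examples only, matching the profile snippet rather than fixed values:

```bash
# Region, queue and bucket names are examples -- match them to your own setup.
aws batch describe-compute-environments --region eu-west-1
aws batch describe-job-queues --region eu-west-1          # should list your queue, e.g. 'large'
aws s3 mb s3://my-imputation-bucket --region eu-west-1    # bucket for work/output data
```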
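For the profile step, the settings above would normally sit inside a `profiles` block in the Nextflow configuration; the sketch below assumes a placeholder S3 bucket for the work directory, which the `awsbatch` executor requires to be on S3:

```
profiles {
    awsbatch {
        aws.region = 'eu-west-1'
        aws.client.storageEncryption = 'AES256'
        process.queue = 'large'
        executor.name = 'awsbatch'
        executor.awscli = '/home/ec2-user/miniconda/bin/aws'
        // Placeholder bucket; the awsbatch executor needs an S3 work directory.
        workDir = 's3://my-imputation-bucket/work'
    }
}
```

Alternatively, the work directory can be supplied at run time with `-w s3://<bucket>/work` instead of setting `workDir` in the profile.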