This repository contains Nextflow implementations of common workflows used by
the Sanders research group. The workflows I am currently looking to implement are:
- QC pipeline
- Stacks 2 pipeline
- CodeML
- Consensus calling
- Transcriptome assembly pipeline
- Variant calling
If there are any other analyses that you are interested in having as a workflow, create an issue or message me and I'll look into implementing it.
Below are functionalities I wish to implement.
- Demographic history sub-workflows:
  - PSMC
  - MSMC
- MSA pipeline
  - Align peptide
  - Convert peptide alignment to nucleotide sequence
  - Clean alignments
    - HmmClean/PREQUAL/Gblocks/GUIDANCE2
- HyPhy
  - (Better) alternative to CodeML
- File checks:
  - Stacks: Population maps contain tab spacing and not spaces
  - Multiple: CSV files actually are comma separated
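As a rough illustration of the file checks listed above, the sketch below shows one way they could be written in shell. It is not part of the pipeline. The function names `check_popmap` and `check_csv` are hypothetical, the population map check assumes at least two tab-separated columns and no spaces anywhere on a line, and the CSV check is naive (it does not handle commas inside quoted fields).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of Sanders-workflows.

# Succeeds only if every non-blank line has at least two tab-separated
# fields and contains no space characters. Offending lines go to stderr.
check_popmap() {
    awk -F'\t' '
        NF == 0 { next }    # skip blank lines
        NF < 2 || / / { bad = 1; print "line " NR ": " $0 > "/dev/stderr" }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Succeeds only if every line has at least two comma-separated fields and
# the same number of fields as the first line. Quoted commas are NOT handled.
check_csv() {
    awk -F',' '
        NR == 1 { n = NF }
        NF < 2 || NF != n { bad = 1; print "line " NR ": " $0 > "/dev/stderr" }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Each function takes a file path and returns a non-zero exit status on failure, so it can be used as `check_popmap popmap.txt || echo "popmap.txt is not tab-separated"`.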
Below are the install instructions for:
- Nextflow
- Sanders-workflow pipeline
- Conda setup
Use the following command to install the Nextflow executable to your Phoenix
account. This is not the pipeline! This is the software used to run the
pipeline. I recommend installing this in your Fast directory ($FASTDIR) in a
sub-directory that is in your path ($PATH). Below is an example.
$ mkdir -p $FASTDIR/bin # Create the directory
$ cd $FASTDIR/bin # Change into the directory
$ wget -qO- https://get.nextflow.io | bash # Download Nextflow executable in the directory
$ nextflow --help # Check that the installation worked
To add Nextflow to your path, simply run the following code once:
$ echo "export PATH=\$PATH:$FASTDIR/bin" >> ~/.bashrc # \$PATH is escaped so it is expanded at login, not now
This will append the export command to your .bashrc file, which is sourced
every time you log onto Phoenix. The export command essentially appends
$FASTDIR/bin to your $PATH variable, making the contents of the $FASTDIR/bin
directory available at the command line, meaning you don't have to provide the
full path to the Nextflow executable when you want to use it.
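To confirm that the change took effect, something like the following can be run in a new login session. This is a minimal sketch: `path_contains` is a hypothetical helper, and the fallback to `$HOME` when `$FASTDIR` is unset is an assumption made only so the example runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if directory $1 is an exact entry in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Falls back to $HOME if FASTDIR is unset (assumption for illustration).
bin_dir="${FASTDIR:-$HOME}/bin"

if path_contains "$bin_dir"; then
    echo "$bin_dir is on PATH"
else
    echo "$bin_dir is not on PATH yet"
fi

# Prints the resolved location of the executable if the shell can find it.
command -v nextflow || echo "nextflow not found on PATH"
```

Wrapping `$PATH` in colons before matching means only whole entries count, so `/some/bin` does not falsely match `/some/bin2`.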
To install the pipeline, I recommend creating a pipelines directory in
$FASTDIR. Below is how I would install this software.
$ mkdir -p ${FASTDIR}/pipelines
$ cd ${FASTDIR}/pipelines
$ git clone https://github.com/a-lud/Sanders-workflows.git # HTTPS installation
## OR
$ git clone git@github.com:a-lud/Sanders-workflows.git # SSH installation
The directory Sanders-workflows should now exist within ${FASTDIR}/pipelines
with all the required scripts.
Each of the sub-workflows needs to install software to run. I've set this up to use
conda. First, all users need to follow the instructions from
here under the Configuring your conda pkgs_dirs and envs_dirs heading. Once that
has been done, we're going to add a few conda channels, which essentially
tell conda where to look online when installing software. Run the following at the
terminal to add the required channels to your account.
$ module load Anaconda3
$ conda config --add channels r
$ conda config --add channels anaconda
$ conda config --add channels bioconda
$ conda config --add channels etetoolkit
$ conda config --add channels conda-forge
Check the channels are there by running the following command:
$ conda config --show
This will print a whole lot of stuff to screen, but within the printout you should see something like the following:
channels:
  - conda-forge
  - etetoolkit
  - bioconda
  - anaconda
  - r
  - defaults
If this information is there, then you are good to go.
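Rather than scanning the full printout by eye, the check can be narrowed down. The sketch below is an illustration only: `missing_channels` is a hypothetical helper that reads a channel listing on standard input and prints any of the five channels added above that it cannot find. It assumes the listing has the `- name` layout shown above, as printed by `conda config --show channels`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a conda channel listing on stdin and prints
# each required channel that does not appear in it (prints nothing if all
# are present).
missing_channels() {
    local listed required ch
    # Keep only list entries, stripping the leading "- " and indentation.
    listed=$(sed -n 's/^[[:space:]]*-[[:space:]]*//p')
    required="conda-forge etetoolkit bioconda anaconda r"
    for ch in $required; do
        # -x matches whole lines, -F treats the name as a fixed string.
        printf '%s\n' "$listed" | grep -qxF -- "$ch" || echo "$ch"
    done
}

# Only runs when conda is available (e.g. after `module load Anaconda3`).
if command -v conda >/dev/null 2>&1; then
    conda config --show channels | missing_channels
fi
```

If nothing is printed, all the required channels are configured.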
Please visit the wiki for detailed information regarding each sub-workflow.
Alastair Ludington: alastair.ludington@adelaide.edu.au