- Clone the repo from GitHub.

      git clone https://github.com/cschu/vortex_knight.git

- Create a conda environment with Nextflow, e.g. by using the provided `environment.yml`.

      cd vortex_knight
      conda env create -f environment.yml
      conda activate vortex_knight

- Make a copy of the `config/run.config` file and adjust it to your environment.
- Run the pipeline

      nextflow run /path/to/vortex_knight/main.nf --input_dir /path/to/input_files --output_dir /path/to/output_dir -c /path/to/run.config

Note: Nextflow itself requires at least 5GB of memory.

Running the pipeline directly from GitHub requires a local Nextflow installation. If you don't have one, see the first two steps above.

- Make a local copy of the run configuration file and adjust it to your environment.
- Run the pipeline

      nextflow run cschu/vortex_knight --input_dir /path/to/input_files --output_dir /path/to/output_dir -c /path/to/run.config

Note: Nextflow itself requires at least 5GB of memory.

- `--input_dir` should be a folder with bam files or with gzipped fastq files. For fastq files, individual samples should be separated into individual folders.
- `--output_dir` is `vknight_out` in the local directory by default.
- `--skip_<analysis>` and `--run_<analysis>` skip or explicitly require execution of the specified analysis (`motus`, `pathseq`, `count_reads`, `mtags`, `mapseq`, `kraken2`).
- `--publishMode` switches between the various modes of how result files are placed in the `output_dir` (cf. Nextflow documentation).

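For fastq input, the per-sample folder layout can be prepared with a few shell commands. The following is a minimal sketch, assuming a flat directory of gzipped fastq files named `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` (or `<sample>.fastq.gz` for single-end data). That naming scheme and the function name `group_fastq_by_sample` are illustrative assumptions, not something the pipeline prescribes.

```shell
# Sketch: sort a flat folder of gzipped fastq files into one folder per sample.
# ASSUMPTION: files are named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
# (or <sample>.fastq.gz); adjust the suffix patterns to your own naming scheme.
group_fastq_by_sample() {
    src_dir="$1"
    dest_dir="$2"
    for f in "$src_dir"/*.fastq.gz; do
        [ -e "$f" ] || continue              # no matching files: nothing to do
        base="$(basename "$f")"
        sample="${base%_R[12].fastq.gz}"     # strip the assumed paired-end suffix
        sample="${sample%.fastq.gz}"         # strip the plain suffix (single-end)
        mkdir -p "$dest_dir/$sample"
        mv "$f" "$dest_dir/$sample/$base"    # move the file into its sample folder
    done
}
```

Calling `group_fastq_by_sample /path/to/flat_fastq /path/to/input_files` would then leave one subfolder per sample in `/path/to/input_files`, which can be passed as `--input_dir`. The files are moved rather than copied; use `cp` instead of `mv` if the originals should stay in place.
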
- `mapseq` can only run in combination with `mtags` and when the parameter `mapseq_bin` is explicitly set.
- `kraken2` can only run when the parameter `kraken_database` is set.
- `pathseq` can only run when the parameter `pathseq_database` is set.
- a pre-downloaded motus database can be set with the parameter `motus_database`.
- results are only collated if the parameter `collate_script` is set. (TODO -> change to baseDir?)

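The dependencies above can be made concrete with an example invocation. The sketch below only prints a command instead of running it, so it can be inspected first. It assumes that the parameters are passed on the command line (Nextflow treats a `--name` flag without a value as `true`); they could equally be set in your copy of the run configuration. All `/path/to/...` values are placeholders and the function name `vknight_cmd` is hypothetical.

```shell
# Sketch: print (do not run) a nextflow command that enables kraken2 and
# pathseq together with the database parameters they depend on.
# ASSUMPTION: the /path/to/... database locations are placeholders.
vknight_cmd() {
    # $1 = input folder, $2 = output folder, $3 = run configuration file
    printf '%s ' nextflow run cschu/vortex_knight \
        --input_dir "$1" \
        --output_dir "$2" \
        --run_kraken2 --kraken_database /path/to/kraken_database \
        --run_pathseq --pathseq_database /path/to/pathseq_database \
        --skip_mapseq \
        -c "$3"
    printf '\n'
}
```

For example, `vknight_cmd /path/to/input_files /path/to/output_dir /path/to/run.config` prints the full command line, which can then be copied and adjusted.
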
- Outputs
The output folder contains:

- one subdirectory `otu_tables` containing the summarised `mapseq` otu tables
- a subdirectory per sample (named `<sample>`) with
  - the kraken2 report `<sample>.kraken2_report.txt`
  - the library size `<sample>.libsize.txt`
  - the mOTUs report `<sample>.motus.txt`
  - pathseq output
    - `<sample>.pathseq.bam`
    - `<sample>.pathseq.bam.sgi`
    - `<sample>.pathseq.score_metrics`
    - `<sample>.pathseq.scores`

Note that by default, all files in the output folder are symlinks into the work dir! Before you delete the work dir, ensure you have dereferenced copies. Alternatively, change the `--publishMode` parameter to `copy` or `link` (if the target file system supports hard links).
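
One way to obtain dereferenced copies is `cp` with the `-L` option, which copies the files that symlinks point to rather than the links themselves. This is a minimal sketch; the function name `dereference_results` and the target folder name in the usage note are illustrative assumptions.

```shell
# Sketch: copy a results folder so that symlinks are replaced by real files.
# ASSUMPTION: the destination does not exist yet; if it does, cp places the
# copy inside it instead of creating it.
dereference_results() {
    src="$1"
    dest="$2"
    # -R: recurse into the sample subdirectories
    # -L: follow symlinks and copy the files they point to
    cp -RL "$src" "$dest"
    # Sanity check: list any symlinks still present in the copy (ideally none)
    find "$dest" -type l
}
```

For example, `dereference_results vknight_out vknight_out_dereferenced` should print nothing if every link was resolved. Only after checking the copy is it safe to delete the work dir.
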