nmdp-bioinformatics / netMHC-spark

netMHC Apache Spark CLI

netMHC-spark

Command line tool for running netMHC with Apache Spark

Install

git clone https://github.com/nmdp-bioinformatics/netMHC-spark
cd netMHC-spark
mvn package
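
The install steps above, and the example further down, call `git`, `mvn`, and `spark-submit`. A minimal sketch of a pre-install check is shown here. The helper name `missing_tools` is hypothetical and not part of netMHC-spark; it only tests that each command can be found on the `PATH` and says nothing about versions.

```shell
# Hypothetical helper, not part of netMHC-spark.
# Reports each named command that cannot be found on PATH and
# returns non-zero if at least one is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            status=1
        fi
    done
    return "$status"
}

# Example call (commented out so that sourcing this file has no side effects):
# missing_tools git mvn spark-submit || echo "install the tools listed above first"
```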

Usage

netmhc-spark 1.0
Usage: spark-submit netmhc-spark-1.0-SNAPSHOT.jar [options]

  -i, --input <value>    input is the input path
  -o, --output <value>   output is the output path
  -a, --alleles <value>  alleles is the list of HLA alleles to use
  -f, --format <value>   format is the output format (default = parquet)

Example

spark-submit --master yarn --deploy-mode client \
    target/netmhc-spark-1.0-SNAPSHOT.jar \
    --input src/test/resources/test_peptides.pep \
    --alleles src/test/resources/allele_list.txt \
    --output peptide_binding
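
A job submitted to a cluster can take some time to fail on a bad path, so it may be worth checking the input files before calling `spark-submit`. The sketch below assumes that the peptide file and the allele list are on the local filesystem, as in the example above; for paths on HDFS or another remote store this check would not apply. The helper name `require_files` is hypothetical and not part of netMHC-spark.

```shell
# Hypothetical helper, not part of netMHC-spark.
# Returns 0 only if every argument is an existing, readable regular file.
require_files() {
    for f in "$@"; do
        if [ ! -f "$f" ] || [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            return 1
        fi
    done
    return 0
}

# Example call (commented out so that sourcing this file has no side effects):
# require_files src/test/resources/test_peptides.pep \
#               src/test/resources/allele_list.txt \
#   && echo "inputs found, ready to run spark-submit"
```

Chaining the check with `&&` in front of the `spark-submit` command from the example would stop the submission when either file is absent.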

Required Software

NetMHC Reference

Massimo Andreatta, Morten Nielsen; Gapped sequence alignment using artificial neural networks: application to the MHC class I system, Bioinformatics, Volume 32, Issue 4, 15 February 2016, Pages 511–517, https://doi.org/10.1093/bioinformatics/btv639
