nyudanin / antigen.garnish

Ensemble Antigen Prediction and Quality Analysis from DNA Variants and Proteins in R

antigen.garnish

Ensemble tumor neoantigen prediction and multi-parameter quality analysis from direct input, SNVs, indels, or gene fusion variants.

Detailed flowchart.

Description

An R package for neoantigen analysis that takes human or murine DNA missense mutations, insertions, deletions, or RNASeq-derived gene fusions and performs ensemble neoantigen prediction using 7 algorithms. Input is a VCF file, JAFFA output, or table of peptides or transcripts. Outputs are ranked and summarized by sample. Neoantigens are ranked by MHC I/II binding affinity, clonality, RNA expression, similarity to known immunogenic antigens, and dissimilarity to the normal peptidome.

Advantages

  1. Thoroughness:
    • missense mutations, insertions, deletions, and gene fusions
    • human and mouse
    • ensemble MHC class I/II binding prediction using mhcflurry, mhcnuggets, netMHC, netMHCII, netMHCpan and netMHCIIpan
    • ranked by
      • MHC I/II binding affinity
      • clonality
      • RNA expression
      • similarity to known immunogenic antigens
      • dissimilarity to the normal peptidome
  2. Speed and simplicity:
  3. Integration with R/Bioconductor
    • upstream/VCF processing
    • exploratory data analysis, visualization
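
To make the idea of ensemble prediction concrete, here is a minimal base R sketch that averages predicted affinities from several tools and ranks peptides by the result. It is an illustration only: the peptide labels, tool column names, values, and the choice of a simple mean are all assumptions, not the package's internal method or output format.

```r
# Hypothetical per-tool predicted affinities (nM) for three peptides.
# Peptide labels, column names, and values are made up for illustration.
pred <- data.frame(
  peptide   = c("pep_1", "pep_2", "pep_3"),
  tool_a_nM = c(120, 35, 20),
  tool_b_nM = c(80, 45, 10),
  tool_c_nM = c(100, 40, 30),
  stringsAsFactors = FALSE
)

tool_cols <- grep("_nM$", names(pred), value = TRUE)

# Assumed ensemble score: mean across tools, ignoring tools with no output (NA).
pred$ensemble_nM <- rowMeans(pred[, tool_cols], na.rm = TRUE)

# A lower predicted affinity in nM means stronger binding, so sort ascending.
ranked <- pred[order(pred$ensemble_nM), ]
print(ranked[, c("peptide", "ensemble_nM")])
```

Averaging with `na.rm = TRUE` lets a peptide keep a score even when one tool does not support its length or allele, at the cost of comparing means built from different numbers of tools.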

Installation

Requirements

  • Linux
  • R ≥ 3.4
  • python-pip
  • sudo is required to install prediction tool dependencies

or

Install all dependencies, prediction tools, and antigen.garnish

One-line installation script:

$ curl -fsSL http://get.rech.io/install_antigen.garnish.sh | sudo sh
$ curl -fsSL "http://get.rech.io/antigen.garnish.tar.gz" | tar -xvz
$ export AG_DATA_DIR="$PWD/antigen.garnish"
  • detailed installation instructions for bootstrapping a fresh AWS instance can be found in the wiki
  • please note that netMHC, netMHCpan, netMHCII, and netMHCIIpan require licenses that are for academic use only
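
After running the commands above, it can help to confirm from within R that the `AG_DATA_DIR` variable is visible to the session. The following base R sketch does only that; the helper name `check_ag_data_dir` is hypothetical and not part of antigen.garnish, and it assumes R was started from the shell in which the variable was exported.

```r
# Hypothetical helper (not part of antigen.garnish): returns TRUE when
# AG_DATA_DIR is set and points to an existing directory.
check_ag_data_dir <- function() {
  ag_dir <- Sys.getenv("AG_DATA_DIR", unset = NA)
  ok <- !is.na(ag_dir) && dir.exists(ag_dir)
  if (ok) {
    message("AG_DATA_DIR found: ", ag_dir)
  } else {
    message("AG_DATA_DIR is unset or does not point to an existing directory")
  }
  ok
}

check_ag_data_dir()
```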

Workflow

  1. Prepare input for MHC affinity prediction and quality analysis:

    • VCF input - garnish_variants
    • Fusions from RNASeq via JAFFA - garnish_jaffa
    • Prepare table of direct transcript or peptide input - see manual page in R (?garnish_affinity)
  2. Add MHC alleles of interest - see examples below.

  3. Run ensemble prediction method and perform antigen quality analysis including proteome-wide differential agretopicity, IEDB alignment score, and dissimilarity: garnish_affinity.

  4. Summarize output at the sample level with garnish_summary and garnish_plot, and prioritize the highest-quality neoantigens per clone and sample with garnish_antigens.
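
As a rough illustration of the differential agretopicity idea mentioned in step 3, the base R sketch below compares predicted binding of a mutant peptide with its wild-type counterpart. It assumes a ratio formulation (wild-type nM divided by mutant nM); the exact definition used by garnish_affinity is not stated here and may differ. The variant identifiers, column names, and values are made up.

```r
# Hypothetical predicted affinities (nM) for mutant peptides and their
# wild-type counterparts. All identifiers and values are made up.
aff <- data.frame(
  var_id = c("v1", "v2", "v3"),
  mut_nM = c(50, 400, 25),
  wt_nM  = c(5000, 400, 50),
  stringsAsFactors = FALSE
)

# Assumed ratio form of differential agretopicity: values above 1 mean the
# mutant peptide is predicted to bind MHC more strongly than the wild-type.
aff$dai <- aff$wt_nM / aff$mut_nM

# Show variants with the largest mutant-over-wild-type advantage first.
print(aff[order(-aff$dai), ])
```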

Examples

Predict neoantigens from missense mutations, insertions, and deletions

library(magrittr)
library(data.table)
library(antigen.garnish)

  # load an example VCF
  dir <- system.file(package = "antigen.garnish") %>%
    file.path(., "extdata/testdata")

  dt <- "antigen.garnish_example.vcf" %>%
    file.path(dir, .) %>%

  # extract variants
    garnish_variants %>%

  # add space separated MHC types
  # see list_mhc() for nomenclature of supported alleles
    .[, MHC := c("HLA-A*01:47 HLA-A*02:01 HLA-DRB1*14:67")] %>%

  # predict neoantigens
    garnish_affinity

  # summarize predictions
  dt %>%
    garnish_summary %T>%
    print

  # generate summary graphs
  dt %>% garnish_plot
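
The MHC column in the example above holds all alleles for a sample as one space-separated string. If the alleles start out as a character vector, a small base R sketch like the following can build that string and split it back; the variable names are illustrative, and the allele names are simply those used in the example.

```r
# Alleles of interest as a character vector (names taken from the example).
alleles <- c("HLA-A*01:47", "HLA-A*02:01", "HLA-DRB1*14:67")

# Collapse to the single space-separated string used in the MHC column.
mhc <- paste(alleles, collapse = " ")
print(mhc)

# Split back into individual alleles, e.g. to inspect or de-duplicate them.
split_back <- strsplit(mhc, " ", fixed = TRUE)[[1]]
print(split_back)
```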

Predict neoantigens from gene fusions

library(magrittr)
library(data.table)
library(antigen.garnish)

  # load example JAFFA output
  dir <- system.file(package = "antigen.garnish") %>%
    file.path(., "extdata/testdata")

  path <- "antigen.garnish_jaffa_results.csv" %>%
    file.path(dir, .)
  fasta_path <- "antigen.garnish_jaffa_results.fasta" %>%
    file.path(dir, .)

  # process the JAFFA fusion calls
  dt <- garnish_jaffa(path, db = "GRCm38", fasta_path) %>%

  # add MHC info with list_mhc() compatible names
    .[, MHC := "H-2-Kb"] %>%

  # predict neoantigens
    garnish_affinity %>%

  # summarize predictions
    garnish_summary %T>%
    print

Get full MHC affinity output from a Microsoft Excel file of variants

library(magrittr)
library(data.table)
library(antigen.garnish)

  # load example Microsoft Excel file
  dir <- system.file(package = "antigen.garnish") %>%
    file.path(., "extdata/testdata")

  path <- "antigen.garnish_test_input.xlsx" %>%
    file.path(dir, .)

  # predict neoantigens
    dt <- garnish_affinity(path = path) %T>%
      str

Directly calculate IEDB score and dissimilarity for a list of sequences

library(magrittr)
library(data.table)
library(antigen.garnish)

  # generate our character vector of sequences
  v <- c("SIINFEKL", "ILAKFLHWL", "GILGFVFTL")

  # calculate IEDB score
  v %>% iedb_score(db = "human") %>% print

  # calculate dissimilarity
  v %>% garnish_dissimilarity(db = "human") %>% print

Automated testing

From ./<Github repo>:

  devtools::test(reporter = "summary")

How are peptides generated?

  library(magrittr)
  library(data.table)
  library(antigen.garnish)

  # generate a fake peptide
    dt <- data.table::data.table(
       pep_base = "Y___*___THIS_IS_________*___A_CODE_TEST!______*__X",
       mutant_index = c(5, 25, 47, 50),
       pep_type = "test",
       var_uuid = c(
                    "front_truncate",
                    "middle",
                    "back_truncate",
                    "end")) %>%
  # create nmers
    make_nmers %T>% print
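
For intuition about what the output of the test above represents, the following base R sketch enumerates every window of a fixed length that overlaps a given mutant position, truncating at the ends of the sequence. It is not the implementation of make_nmers; the function name `sliding_nmers` and its arguments are hypothetical.

```r
# Hypothetical helper (not the package's make_nmers): return all n-mers of
# `pep` that contain the residue at `mutant_index`.
sliding_nmers <- function(pep, mutant_index, n) {
  len <- nchar(pep)
  # Window start positions whose n-mer covers mutant_index and stays in bounds.
  first <- max(1, mutant_index - n + 1)
  last  <- min(mutant_index, len - n + 1)
  if (last < first) return(character(0))
  starts <- first:last
  substring(pep, starts, starts + n - 1)
}

sliding_nmers("ABCDEFGHIJ", mutant_index = 5, n = 3)
# "CDE" "DEF" "EFG"
```

A mutant position near either end of the sequence yields fewer windows, which is the behavior the "front_truncate", "back_truncate", and "end" cases in the test table are named after.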

Plots and summary tables

  • garnish_plot output:

  • garnish_antigens output:

Citation

Richman LP, Vonderheide RH, and Rech AJ. "Neoantigen dissimilarity to the self-proteome predicts immunogenicity and response to immune checkpoint blockade." Cell Systems 2019 in press

Contributing

We welcome contributions and feedback via Github or email.

Acknowledgments

We thank the following individuals for contributions and helpful discussion:

License

Please see included license or contact us with questions.
