Ensemble tumor neoantigen prediction and multi-parameter quality analysis from direct input, SNVs, indels, or gene fusion variants.
An R package for neoantigen analysis that takes human or murine DNA missense mutations, insertions, deletions, or RNASeq-derived gene fusions and performs ensemble neoantigen prediction using 7 algorithms. Input is a VCF file, JAFFA output, or table of peptides or transcripts. Outputs are ranked and summarized by sample. Neoantigens are ranked by MHC I/II binding affinity, clonality, RNA expression, similarity to known immunogenic antigens, and dissimilarity to the normal peptidome.
- Thoroughness:
  - missense mutations, insertions, deletions, and gene fusions
  - human and mouse
  - ensemble MHC class I/II binding prediction using mhcflurry, mhcnuggets, netMHC, netMHCII, netMHCpan and netMHCIIpan
  - ranked by
    - MHC I/II binding affinity
    - clonality
    - RNA expression
    - similarity to known immunogenic antigens
    - dissimilarity to the normal peptidome
- Speed and simplicity:
  - 1000 variants are ranked in a single step in less than five minutes
  - parallelized using parallel::mclapply and data.table::setDTthreads; see their respective documentation for information on setting multicore usage
- Integration with R/Bioconductor
  - upstream/VCF processing
  - exploratory data analysis, visualization
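Since the package is parallelized with parallel::mclapply and data.table::setDTthreads, core usage can usually be adjusted from the R session before running predictions. The following is a minimal sketch, assuming the package relies on the standard `mc.cores` option rather than passing its own value to mclapply; the cap of 4 cores is an arbitrary illustration.

```r
library(data.table)

# Number of cores reported by the machine; detectCores() can return NA
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Arbitrary cap of 4 cores for this sketch; choose what suits your machine
n_cores <- max(1L, min(4L, available))

# Default number of workers read by parallel::mclapply
# (assumption: antigen.garnish does not override this internally)
options(mc.cores = n_cores)

# Threads used by data.table's internally parallelized operations
data.table::setDTthreads(n_cores)

# Inspect the current settings
getOption("mc.cores")
data.table::getDTthreads()
```

On a shared server, setting both values explicitly helps avoid oversubscribing cores when several jobs run at once.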
- Linux
- R ≥ 3.4
  - see the documentation's Imports for required R packages, and the Biostrings package from Bioconductor
- python-pip
- sudo is required to install prediction tool dependencies

or

- a Docker image is available, please see the wiki or contact us
One-line installation script:
$ curl -fsSL http://get.rech.io/install_antigen.garnish.sh | sudo sh
- if installing without using the above installation script, set $AG_DATA_DIR to the required data directory:
$ curl -fsSL "http://get.rech.io/antigen.garnish.tar.gz" | tar -xvz
$ export AG_DATA_DIR="$PWD/antigen.garnish"
- detailed installation instructions for bootstrapping a fresh AWS instance can be found in the wiki
- please note that netMHC, netMHCpan, netMHCII, and netMHCIIpan require academic-use only licenses
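When the data directory is set manually, it can be worth confirming from R that `AG_DATA_DIR` is visible to the session before running any predictions. The helper below, `check_ag_data_dir`, is hypothetical and not part of antigen.garnish; it only checks that the variable is set and points to an existing directory, not that the directory contents are complete.

```r
# Hypothetical helper (not part of antigen.garnish): returns TRUE if the
# AG_DATA_DIR environment variable is set and points to an existing directory.
check_ag_data_dir <- function(path = Sys.getenv("AG_DATA_DIR", unset = "")) {
  if (!nzchar(path)) {
    message("AG_DATA_DIR is not set")
    return(FALSE)
  }
  if (!dir.exists(path)) {
    message("AG_DATA_DIR does not point to an existing directory: ", path)
    return(FALSE)
  }
  message("Using data directory: ", normalizePath(path))
  TRUE
}

check_ag_data_dir()
```

Note that a variable exported in one shell is only inherited by R sessions started from that shell; `Sys.setenv(AG_DATA_DIR = "...")` sets it for the current R session instead.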
- Prepare input for MHC affinity prediction and quality analysis:
  - VCF input - garnish_variants
  - Fusions from RNASeq via JAFFA - garnish_jaffa
  - Prepare table of direct transcript or peptide input - see manual page in R (?garnish_affinity)
- Add MHC alleles of interest - see examples below.
- Run ensemble prediction method and perform antigen quality analysis including proteome-wide differential agretopicity, IEDB alignment score, and dissimilarity: garnish_affinity.
- Summarize output at the sample level with garnish_summary and garnish_plot, and prioritize the highest quality neoantigens per clone and sample with garnish_antigens.
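The "add MHC alleles" step amounts to adding an `MHC` column containing a space-separated string of alleles. When samples have different MHC typing, one way to build that column is a join on a per-sample table. This is a sketch using data.table only: the `sample_id` column name, the sample names, the `variant` column, and the allele assignments are assumptions for illustration. Allele names should be checked against list_mhc().

```r
library(data.table)

# Hypothetical variant table standing in for garnish_variants output
dt <- data.table(
  sample_id = c("tumor_1", "tumor_1", "tumor_2"),
  variant = c("v1", "v2", "v3")
)

# Hypothetical per-sample MHC typing, one character vector per sample
typing <- list(
  tumor_1 = c("HLA-A*02:01", "HLA-DRB1*14:67"),
  tumor_2 = c("HLA-A*01:47", "HLA-A*02:01")
)

# Collapse each sample's alleles into a single space-separated string
mhc <- data.table(
  sample_id = names(typing),
  MHC = vapply(typing, paste, character(1), collapse = " ", USE.NAMES = FALSE)
)

# Attach the MHC string to every variant of the matching sample
dt <- merge(dt, mhc, by = "sample_id", all.x = TRUE)
print(dt)
```

Using `all.x = TRUE` keeps variants from samples without typing, leaving their `MHC` as NA so they are easy to spot before prediction.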
library(magrittr)
library(data.table)
library(antigen.garnish)
# load an example VCF
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

dt <- "antigen.garnish_example.vcf" %>%
  file.path(dir, .) %>%
  # extract variants
  garnish_variants %>%
  # add space separated MHC types
  # see list_mhc() for nomenclature of supported alleles
  .[, MHC := c("HLA-A*01:47 HLA-A*02:01 HLA-DRB1*14:67")] %>%
  # predict neoantigens
  garnish_affinity

# summarize predictions
dt %>%
  garnish_summary %T>%
  print

# generate summary graphs
dt %>% garnish_plot
library(magrittr)
library(data.table)
library(antigen.garnish)
# load example jaffa output
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

path <- "antigen.garnish_jaffa_results.csv" %>%
  file.path(dir, .)
fasta_path <- "antigen.garnish_jaffa_results.fasta" %>%
  file.path(dir, .)

# read fusions from the JAFFA output
dt <- garnish_jaffa(path, db = "GRCm38", fasta_path) %>%
  # add MHC info with list_mhc() compatible names
  .[, MHC := "H-2-Kb"] %>%
  # get predictions
  garnish_affinity %>%
  # summarize predictions
  garnish_summary %T>%
  print
library(magrittr)
library(data.table)
library(antigen.garnish)
# load example Microsoft Excel file
dir <- system.file(package = "antigen.garnish") %>%
file.path(., "extdata/testdata")
path <- "antigen.garnish_test_input.xlsx" %>%
file.path(dir, .)
# predict neoantigens
dt <- garnish_affinity(path = path) %T>%
  str
library(magrittr)
library(data.table)
library(antigen.garnish)
# generate our character vector of sequences
v <- c("SIINFEKL", "ILAKFLHWL", "GILGFVFTL")
# calculate IEDB score
v %>% iedb_score(db = "human") %>% print
# calculate dissimilarity
v %>% garnish_dissimilarity(db = "human") %>% print
From ./<Github repo>:
devtools::test(reporter = "summary")
library(magrittr)
library(data.table)
library(antigen.garnish)
# generate a fake peptide
dt <- data.table::data.table(
  pep_base = "Y___*___THIS_IS_________*___A_CODE_TEST!______*__X",
  mutant_index = c(5, 25, 47, 50),
  pep_type = "test",
  var_uuid = c(
    "front_truncate",
    "middle",
    "back_truncate",
    "end")) %>%
  # create nmers
  make_nmers %T>% print
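To picture what the test above exercises, the sketch below enumerates every fixed-length window of a peptide that contains a given mutant position, truncating at the front or back of the sequence as the `var_uuid` labels suggest. It is an illustration in base R only: `windows_covering` is a hypothetical helper, and it is not confirmed to match the internals or output format of make_nmers.

```r
# Hypothetical helper (not part of antigen.garnish): return every window of
# length k in `pep` that contains the 1-based position `idx`.
windows_covering <- function(pep, idx, k) {
  n <- nchar(pep)
  stopifnot(idx >= 1, idx <= n)
  if (k > n) return(character(0))
  # earliest and latest window starts that keep idx inside the window
  first_start <- max(1, idx - k + 1)
  last_start <- min(idx, n - k + 1)
  starts <- seq(first_start, last_start)
  substring(pep, starts, starts + k - 1)
}

pep <- "ABCDEFGHIJ"

# mutant position near the front: only one 8-mer can contain it
windows_covering(pep, idx = 1, k = 8)

# mutant position in the middle: several overlapping 8-mers contain it
windows_covering(pep, idx = 5, k = 8)
```

Positions close to either end of the sequence yield fewer windows than positions in the middle, which is the behavior the "front_truncate", "back_truncate", and "end" cases in the test are named after.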
garnish_plot output: (image not shown)
garnish_antigens output: (image not shown)
Richman LP, Vonderheide RH, and Rech AJ. "Neoantigen dissimilarity to the self-proteome predicts immunogenicity and response to immune checkpoint blockade." Cell Systems 2019 in press
We welcome contributions and feedback via Github or email.
We thank the following individuals for contributions and helpful discussion:
Please see included license or contact us with questions.