neoantigen summit 20161115
Software for Tumor Neoantigen Prediction and Vaccine Design
Tim O’Donnell
Mount Sinai, Hammer Lab
Nov 15, 2016
Neoantigen Summit
Hammer Lab
■Bringing emerging software development and data science technologies into cancer immunotherapy
■Tools developed at github.com/hammerlab under an Apache license
Personalized genome vaccine (PGV)
NCT02721043: PHASE I, OPEN LABEL, STUDY OF PGV001: A MULTI-PEPTIDE
THERAPEUTIC VACCINE PLATFORM FOR USE IN THE TREATMENT OF SOLID TUMORS IN
THE ADJUVANT SETTING
Nina Bhardwaj
Personalized Genomic Vaccine
■Solid tumor patients (H&N, NSCLC, Breast, Ovarian, Urothelial, SCC) without evidence of metastatic disease
■Vaccine: ten 25-mer peptides containing predicted Class I MHC mutated ligands
■Adjuvant: Poly-ICLC
■No checkpoint blockade (unfortunately)
■Endpoint: safety and feasibility
Tumor neoepitope selection
Sequencing
■150x normal
■300x tumor
■SureSelect XT
■~150 million mRNA reads
Pipeline
Tools developed for the trial
Available at github.com/hammerlab
varcode     Variant effect prediction including indel coding sequence
isovar      Determine mutant coding sequence from RNA-seq
vaxrank     Neoantigen vaccine selection
epidisco    Workflow to generate vaccine peptide report from FASTQs
mhctools    Standard interface to pMHC binding predictors
pyensembl   Python interface to Ensembl reference genome annotations
Coding sequence prediction (varcode)
Code:
    variant = varcode.Variant("3", 36779850, ref="C", alt="", ensembl='grch37')
Value:
    Variant(contig='3', start=36779850, ref='C', alt='', reference_name='GRCh37')
Code:
    effect = variant.effects().top_priority_effect()
Value:
    FrameShift(variant=chr3 g.36779850_36779850delC, transcript_name=DCLK3-001, transcript_id=ENST00000416516, effect_description=p.E101fs)
Code:
    effect.mutant_protein_sequence
Value:
    MGKEPLTLKSIQVAVEELYPNKARALTLAQHSRAPSPRLRSRLFSKALKGDHRCGETETPKSCSEVAGCKAAMRHQGKIPEELSLDDRARTQKKWGRGKWSQNPVASPPGKPLWKRGTQGERSILGWRLKRPRVKLSDARSARERGSSSRAWSVRGFLWGPVSWIWGRAQCMMWRSW
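To make the frameshift concrete, here is a minimal sketch, independent of varcode, of how deleting a single base changes every downstream codon. The toy coding sequence and the helper names (translate, delete_base) are illustrative assumptions and are not part of the trial pipeline.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_sequence):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(coding_sequence) - 2, 3):
        amino_acid = CODON_TABLE[coding_sequence[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(coding_sequence, index):
    """Return the sequence with the base at `index` (0-based) removed."""
    return coding_sequence[:index] + coding_sequence[index + 1:]


reference_cds = "ATGGAAGAGCTGTAA"           # toy sequence, not from the trial
mutant_cds = delete_base(reference_cds, 3)  # single-base deletion in codon 2

reference_protein = translate(reference_cds)
mutant_protein = translate(mutant_cds)
print(reference_protein, mutant_protein)
```

Every residue after the deletion differs from the reference, which is why a frameshift can yield a long stretch of novel peptide sequence.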
Phasing and transcript selection (isovar)
[Figure: reference sequence spanning Exon 1, an intron, and Exon 2, with RNA Reads 1-5 aligned beneath it; labels mark a somatic mutation and a germline mutation.]
Selected coding sequence includes germline mutation.
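A rough sketch of the phasing idea: among RNA reads that carry the somatic allele, take the most frequently observed sequence around the mutation, so that any germline variant on the same reads is carried along. This is not isovar's implementation; the read representation (start offset plus gapless sequence in spliced transcript coordinates), the function name, and the toy reads are all assumptions.

```python
from collections import Counter


def select_mutant_sequence(reads, variant_pos, alt_base, flank=3):
    """Pick the best-supported sequence window around a somatic variant.

    reads: list of (start, sequence) pairs, assumed gaplessly aligned to
    spliced transcript coordinates (a simplification for illustration).
    """
    lo, hi = variant_pos - flank, variant_pos + flank + 1
    windows = Counter()
    for start, seq in reads:
        end = start + len(seq)
        if start > lo or end < hi:
            continue  # read does not span the whole window
        if seq[variant_pos - start] != alt_base:
            continue  # read supports the reference allele
        windows[seq[lo - start:hi - start]] += 1
    if not windows:
        return None
    return windows.most_common(1)[0]  # (sequence, supporting read count)


# Toy reference: AAACCCGGGTTTAAACCC; somatic G>T at 7, germline T>C at 9.
toy_reads = [
    (2, "ACCCGTGCTTAA"),  # somatic + germline
    (0, "AAACCCGTGCTT"),  # somatic + germline
    (3, "CCCGGGTTTAAA"),  # reference allele at the somatic site
    (4, "CCGTGTTTAAAC"),  # somatic only (e.g. an error at the germline site)
]
selected = select_mutant_sequence(toy_reads, variant_pos=7, alt_base="T")
print(selected)
```

Translating the selected window rather than the reference with only the somatic change applied keeps the vaccine peptide consistent with what the tumor actually expresses.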
Vaccine generation (vaxrank)
$ vaxrank \
    --vcf mutect.vcf \
    --vcf strelka.vcf \
    --bam tumor-rna.bam \
    --vaccine-peptide-length 25 \
    --mhc-predictor netmhcpan \
    --mhc-alleles-file alleles.txt
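The slide does not describe how vaxrank chooses among candidates, so the following only illustrates what a 25-mer vaccine peptide length implies: enumerating every 25-residue window of a mutant protein that still contains the mutated residue. The function name and the toy protein are hypothetical.

```python
def vaccine_peptide_windows(protein, mutation_index, length=25):
    """Return (start, peptide) for each window of `length` residues that
    contains the residue at `mutation_index` (0-based)."""
    first_start = max(0, mutation_index - length + 1)
    last_start = min(mutation_index, len(protein) - length)
    return [
        (start, protein[start:start + length])
        for start in range(first_start, last_start + 1)
    ]


# Toy 30-residue protein with the mutated residue written as W at index 12.
toy_protein = "A" * 12 + "W" + "A" * 17
windows = vaccine_peptide_windows(toy_protein, mutation_index=12)
for start, peptide in windows:
    print(start, peptide)
```

A selection tool would then score each candidate window, for example by the predicted MHC ligands it contains, which is the part this sketch leaves out.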
Mutational burden can be limiting
                                          Patient #1  Patient #2  Patient #3  Patient #4  Patient #5
Variants (Non-silent)                     501         888         591         663         912
Coding Variants                           180         253         173         231         305
Frame Shifts                              4           8           1           3           1
Peptides in Report                        11          9           17          32          22
Peptides with Predicted MHC ligands
of affinity <= 100nM                      4           3           8           10          9
First patient
■Oct 5, 2016 - samples acquired
■Oct 10 - pathology deposits samples in Genomics Core
■Oct 17 - sequencing data delivered
■Oct 19 - vaccine pipeline completes
  ■9 credible neoepitope-generating non-synonymous mutations identified
■Oct 20 - Histogenetics (HLA types) report arrives
  ■Concordant with seq2hla except for one HLA-C allele
MHC Binding Affinity Prediction
■This trial uses netMHCpan for peptide/MHC binding prediction
■We think there is room for improvement over netMHC
■We are developing a new predictor called MHCflurry
Motivation for MHC binding prediction
■Identifying T cell antigens is required for vaccine design
■MHC binding is the most restrictive step in antigen processing
■Each MHC allele is capable of binding a distinct set of peptides
■There are thousands of MHC alleles in the human population
Binding motifs
■Introduced: A. Sette 1989
■Scan for occurrences of a “master sequence”, e.g. the 6-mer sequence VHAAHA
■Allow a certain number of substitutions between similar amino acids
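The motif scan described above can be sketched in a few lines. The similarity groups below are an illustrative assumption (the slide does not say which amino acids count as similar), and the function names and toy protein are hypothetical.

```python
# Illustrative similarity groups; the original method's groups may differ.
SIMILAR_GROUPS = [set("AVLIM"), set("FYW"), set("KRH"), set("DE"), set("STNQ")]


def is_similar(a, b):
    """True if the residues are identical or share a similarity group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def motif_hits(protein, master, max_substitutions=1):
    """Find windows matching `master`, allowing up to `max_substitutions`
    positions where a similar (but not identical) residue appears."""
    hits = []
    for start in range(len(protein) - len(master) + 1):
        window = protein[start:start + len(master)]
        if not all(is_similar(a, b) for a, b in zip(window, master)):
            continue
        substitutions = sum(a != b for a, b in zip(window, master))
        if substitutions <= max_substitutions:
            hits.append((start, window))
    return hits


hits = motif_hits("MKVHAAHAGGVHLAHAK", "VHAAHA")
print(hits)
```

The second hit differs from the master sequence by one conservative substitution (L for A), which is the kind of variation the substitution allowance is meant to tolerate.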
Position specific scoring matrices (PSSM)
■Introduced: Parker 1994
■At each position in the peptide, specify a value for each amino acid
■To predict whether a peptide binds, sum the values for each amino acid
Source: Bjoern Peters
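A minimal sketch of PSSM scoring as described above: one table of values per peptide position, summed over the residues actually present. The 3-position matrix and its values are made up for illustration; real matrices cover all 20 amino acids at each position of, typically, a 9-mer.

```python
def pssm_score(peptide, matrix, default=0.0):
    """Sum the per-position value of each residue in the peptide."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the number of matrix positions")
    return sum(
        position_values.get(residue, default)
        for residue, position_values in zip(peptide, matrix)
    )


# Hypothetical matrix: one dict per position, residue -> contribution.
toy_matrix = [
    {"A": 1.5, "K": -1.0},
    {"L": 0.5},
    {"V": 2.0, "D": -2.0},
]

print(pssm_score("ALV", toy_matrix))
print(pssm_score("KGD", toy_matrix))
```

A peptide is predicted to bind when its summed score clears a chosen threshold; because each position contributes independently, the model is purely additive.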
Nonlinear effects
■Suppose a positively charged residue is required at exactly one of two positions in a peptide to bind an MHC allele
■This cannot be represented with binding motifs or PSSMs
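The "exactly one of two positions" rule is an exclusive-or, and an additive score cannot express it: the two binding patterns and the two non-binding patterns use the same four per-position values, so their summed scores are equal and cannot fall on opposite sides of a threshold. The sketch below checks this by brute force over a small grid of weights, then shows a tiny hand-built network with one hidden layer that does capture the rule. All names and the weight grid are illustrative assumptions.

```python
from itertools import product

# (charged at position 1, charged at position 2) -> binds?
EXPECTED = {(1, 0): True, (0, 1): True, (1, 1): False, (0, 0): False}


def additive_binds(weights, charged_first, charged_second):
    """PSSM-style rule: sum one value per position, compare to a threshold."""
    first_pos, first_other, second_pos, second_other, threshold = weights
    score = (first_pos if charged_first else first_other) + (
        second_pos if charged_second else second_other
    )
    return score >= threshold


def additive_model_exists(grid):
    """Search every weight combination on the grid for one fitting EXPECTED."""
    return any(
        all(additive_binds(w, *pattern) == binds for pattern, binds in EXPECTED.items())
        for w in product(grid, repeat=5)
    )


def relu(x):
    return max(0, x)


def hidden_layer_binds(charged_first, charged_second):
    """Two hidden units, each firing when only one position is charged."""
    h1 = relu(charged_first - charged_second)
    h2 = relu(charged_second - charged_first)
    return h1 + h2 >= 1


print(additive_model_exists(range(-3, 4)))
print(all(hidden_layer_binds(*p) == b for p, b in EXPECTED.items()))
```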
Neural Networks
[Figure: network diagram with Input, Hidden Layers, and Output.]
Neural networks for pMHC prediction
■ Allele-specific
  ■ Train a model for each MHC allele
  ■ Input to the model: peptide sequence
■ Pan-allele
  ■ Train a model across alleles
  ■ Input to the model: (peptide sequence, MHC sequence)
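The difference between the two designs shows up in the input vector. Below is a sketch using one-hot encoding, which is one common choice and not necessarily what any particular predictor uses; the MHC sequence string is a made-up placeholder, not a real allele sequence.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Flatten a sequence into 20 indicator values per residue."""
    vector = []
    for residue in sequence:
        indicators = [0] * len(AMINO_ACIDS)
        indicators[AMINO_ACIDS.index(residue)] = 1
        vector.extend(indicators)
    return vector


def allele_specific_input(peptide):
    """Allele-specific model: the allele is implied by which model is used."""
    return one_hot(peptide)


def pan_allele_input(peptide, mhc_sequence):
    """Pan-allele model: the MHC sequence is part of the input."""
    return one_hot(peptide) + one_hot(mhc_sequence)


peptide = "SIINFEKL"
placeholder_mhc_sequence = "YFAMYGEK"  # hypothetical, for illustration only

print(len(allele_specific_input(peptide)))
print(len(pan_allele_input(peptide, placeholder_mhc_sequence)))
```

Because a pan-allele model sees the MHC sequence, it can share what it learns across alleles, including alleles with little or no training data.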
netMHC neural networks
■ Allele-specific
  ■ netMHC - M. Nielsen 2003
  ■ Best choice for alleles with the most training data
■ Pan-allele
  ■ netMHCpan - M. Nielsen 2007
  ■ Best choice for alleles with less training data
MHCflurry
■Hybrid between allele-specific and pan-allele
■Uses imputation (matrix completion) to “fill in” missing training data for alleles with little data
■Allele-specific predictors are trained on the imputed data
■As the training progresses, the imputed data is down-weighted in favor of the real data
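One way to picture the down-weighting is as a per-sample weight that shrinks for imputed entries as epochs pass. The geometric schedule and the decay value below are assumptions chosen for illustration; the slide does not state the schedule MHCflurry actually uses.

```python
def sample_weights(is_imputed, epoch, decay=0.5):
    """Weight 1.0 for measured samples; imputed samples decay each epoch.

    The geometric schedule (decay ** epoch) is a hypothetical choice.
    """
    imputed_weight = decay ** epoch
    return [imputed_weight if imputed else 1.0 for imputed in is_imputed]


flags = [False, True, True, False]  # which training samples were imputed
for epoch in range(4):
    print(epoch, sample_weights(flags, epoch))
```

Early epochs let the imputed values shape the model for data-poor alleles; later epochs are dominated by real measurements.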
Imputation algorithms
Algorithm                      AUC      F1 score
knnImpute (k=1)                0.8088   0.6906
knnImpute (k=3)                0.8202   0.6054
knnImpute (k=5)                0.8164   0.5884
meanFill                       0.6590   0.0677
MICE (20 imputations)          0.8675   0.6292
similarityWeightedAveraging    0.8259   0.6162
softImpute (lambda=1,0)        0.8296   0.5126
softImpute (lambda=10,0)       0.7903   0.3835
softImpute (lambda=5,0)        0.8266   0.4930
svdImpute (rank=10)            0.8283   0.6201
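For orientation, here is a sketch of the simplest entry in the table, meanFill, applied to a small allele-by-peptide matrix with missing measurements. Filling with per-column (per-peptide) means is an assumption about how meanFill was configured, and the matrix values are invented.

```python
def mean_fill(matrix):
    """Replace each None with the mean of the observed values in its column."""
    n_columns = len(matrix[0])
    column_means = []
    for c in range(n_columns):
        observed = [row[c] for row in matrix if row[c] is not None]
        column_means.append(sum(observed) / len(observed) if observed else None)
    return [
        [column_means[c] if value is None else value for c, value in enumerate(row)]
        for row in matrix
    ]


# Rows: alleles, columns: peptides, None: no measurement (toy values).
toy_affinities = [
    [0.25, None],
    [0.75, 1.0],
    [None, 0.5],
]
filled = mean_fill(toy_affinities)
print(filled)
```

Mean filling ignores which allele a missing value belongs to, which is consistent with its weak scores in the table relative to methods that use similarity between alleles.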
Architecture
Performance
“Predicting Peptide-MHC Binding Affinities With Imputed Training Data.” Alex Rubinsteyn, Timothy O'Donnell, Nandita Damaraju, Jeffrey Hammerbacher. ICML 2016: Computational Biology Workshop. doi: http://dx.doi.org/10.1101/054775
Weekly contest performance
http://tools.immuneepitope.org/auto_bench/mhci/weekly/
Using MHCflurry
Install
$ pip install mhcflurry
$ mhcflurry-downloads fetch
Run
$ mhcflurry-predict \
    --alleles HLA-A0201 \
    --peptides SIINFEKL SIINFEKD
MHCflurry
■ Allele-specific class I predictors may be downloaded from http://github.com/hammerlab/mhcflurry
■Working on
  ■Expanding our training data by including less precise assays
  ■Better handling of non-9mers
  ■Class II prediction
  ■Pan-allele prediction
Thanks
■Patients
■Jeff Hammerbacher & Hammer Lab
■Nina Bhardwaj and the PGV team
■Nikesh Kotecha and the PICI Bioinformatics team
■OSS bug reporters!
End
Backup slides
Related work
■Academic
  ■UCSC ProTECT
  ■WashU pVAC-Seq
  ■DTU MuPeXI
■Commercial
  ■PGD ImmunoSelect-R
  ■Personalis ACE ImmunoID
  ■Immatics XPRESIDENT
  ■MedGenome OncoPept
Does chemotherapy create neoantigens?
Question
■Mutations and neoantigens detectable from bulk-sequencing may be a biomarker for response to checkpoint blockade
■Chemotherapy induces mutations
Does chemotherapy meaningfully impact detectable mutation and neoantigen burden in high grade serous ovarian cancer?
Australian Ovarian Cancer Study (AOCS)
AOCS Cohort
■High grade serous ovarian carcinoma
■WGS + RNA on 114 samples from 92 patients
  ■79 chemotherapy-naive primary samples
  ■30 relapse samples taken after surgery and adjuvant chemo
  ■5 primary samples taken after neoadjuvant chemo
Our analysis
■Predict neoantigens and look for expression
■Connect neoantigens to chemotherapy-associated mutational signatures
Two contexts for chemotherapy
■Neoadjuvant: given before surgery to try to shrink tumors to be operable
■Adjuvant: given after surgery, when there are usually no clinical signs of disease
After adjuvant chemo and relapse, detectable mutation and neoantigen burden nearly double
[Figure panels: 81% increase; 124% increase]
But not after neoadjuvant chemo
[Figure panels: No change; No change]
In fact, neoadjuvant-treated samples trend toward fewer expressed neoantigens (p=0.09)
44 → 16
What part of the increase in neoantigens at relapse is due to chemo?
■As adjuvant chemo is standard of care, there are no patients who receive surgery but not adjuvant chemo to answer this question
■Alternative: mutational signatures (Alexandrov 2013)
Source: http://cancer.sanger.ac.uk/cosmic/signatures
Animal studies: cisplatin, cyclophosphamide, etoposide
Extracted signatures from G. gallus (chicken) experiments
Cyclophosphamide signature enriched in samples treated with cyclophosphamide (4/10 treated and 4/104 non-treated, p=0.001)
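The slide does not say which test produced p=0.001. As a rough check, a one-sided Fisher exact test on the 2x2 table (4 of 10 treated samples versus 4 of 104 untreated samples showing the signature) can be computed with the standard library; this is one plausible choice and need not reproduce the reported figure exactly.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the table [[a, b], [c, d]]:
    probability of seeing `a` or more in the top-left cell by chance."""
    row_total = a + b      # treated samples
    column_total = a + c   # samples with the signature
    total = a + b + c + d
    largest = min(row_total, column_total)
    favourable = sum(
        comb(column_total, k) * comb(total - column_total, row_total - k)
        for k in range(a, largest + 1)
    )
    return favourable / comb(total, row_total)


# Treated: 4 with signature, 6 without. Untreated: 4 with, 100 without.
p_value = fisher_exact_greater(4, 6, 4, 100)
print(f"{p_value:.4f}")
```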
Focus on mutations introduced during treatment
■14 samples from 12 patients with paired pre- and post-treatment samples
■Extract unique-to-treated mutations: 0 variant reads and read depth greater than 30 in pre-treatment samples
■93,986 / 206,766 (45%) SNVs satisfy this filter
■Perform deconvolution on these mutations
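The unique-to-treated filter above can be sketched as a simple predicate over per-variant read counts in the pre-treatment sample. The record layout and field names are hypothetical; only the two conditions (zero variant reads, depth above 30) come from the slide.

```python
def unique_to_treated(variants, min_pre_depth=30):
    """Keep post-treatment variants with no supporting reads, and depth
    above `min_pre_depth`, in the paired pre-treatment sample."""
    return [
        v for v in variants
        if v["pre_alt_reads"] == 0 and v["pre_depth"] > min_pre_depth
    ]


# Toy records; field names are illustrative assumptions.
toy_variants = [
    {"id": "snv1", "pre_alt_reads": 0, "pre_depth": 45},  # kept
    {"id": "snv2", "pre_alt_reads": 2, "pre_depth": 60},  # seen before treatment
    {"id": "snv3", "pre_alt_reads": 0, "pre_depth": 12},  # too shallow to rule out
]
kept = unique_to_treated(toy_variants)
print([v["id"] for v in kept])

# Fraction reported on the slide.
fraction_passing = 93986 / 206766
print(round(fraction_passing * 100))
```

The depth requirement matters because zero variant reads at a poorly covered site is weak evidence that the mutation was truly absent before treatment.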
Deconvolution of unique-to-treated mutations
▶ Cisplatin detected only in the two cisplatin-treated samples
▶ Cyclophosphamide (cyc) signature detected in 3/6 cyc-treated samples and, unexpectedly, in 6/8 non-cyc-treated samples
Chemo contributes at most 16% of neoantigens in adjuvant-treated relapse samples
Conclusions
■ Ovarian tumors at relapse harbor nearly double the predicted expressed neoantigen burden of primary chemo-naive samples
■ Mutagenesis from standard chemotherapy regimens contributes a small but detectable part of this effect
■ Processes already operative in the primaries, including COSMIC signatures (3) BRCA disruption and (8) unknown etiology, continue to contribute most mutations
■ Cisplatin-derived mutational signatures may not generalize to carboplatin
■ Some evidence that neoadjuvant treatment may decrease neoantigen expression, but larger cohorts are required to assess