Mutational processes
in the human genome
Serena Nik-Zainal
CRUK Advanced Clinician Scientist
Honorary Consultant in Clinical Genetics
Cambridge Society for the Application of Research Talk, June 19th 2017
normal cell
100% cells
26,700 mutations
10 CN changes
PIK3CA, TP53, GATA3,
SMAD4, NCOR1 muts
normal cell
cancer cell
Sequencing technologies (throughput in Kb/day/machine)
Gel-based
Capillary
Massively-parallel sequencing
Massively-parallel sequencing:
Sequencing information from individual DNA molecules
Massively-parallel sequencing: Gives us unprecedented access to the entire human genome
• Human genome ~ 3,000,000,000 base pairs
--> Whole genome sequencing
• ~ 20,000 genes encompassing ~ 1-2% of human genome
--> Whole exome sequencing
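As a back-of-envelope check of the figures above, the sketch below converts the quoted percentages into base pairs. The genome size and the 1-2% range are taken from the slide; the function and variable names are illustrative only.

```python
GENOME_BP = 3_000_000_000             # approximate human genome size from the slide
CODING_FRACTION_RANGE = (0.01, 0.02)  # "~ 1-2% of human genome"

def exome_size_bp(genome_bp: int, fraction: float) -> int:
    """Return the number of base pairs covered by a given fraction of the genome."""
    return round(genome_bp * fraction)

low, high = (exome_size_bp(GENOME_BP, f) for f in CODING_FRACTION_RANGE)
print(f"Whole genome: {GENOME_BP:,} bp")
print(f"Whole exome:  {low:,} to {high:,} bp")
```

On these numbers, whole exome sequencing reads roughly 30 to 60 million of the 3 billion positions that whole genome sequencing covers.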
Genomic abnormalities
exon, intron, UTR
TTCACG
TTTACG: substitution
TTT-CG: deletion/insertion
duplication
deletion
inversion
translocation
From base pair resolution to chromosomal scale
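The base-pair-resolution examples on this slide can be reproduced in a few lines. The sketch below lists the differences between two aligned sequences; the function name and the use of '-' as a gap character are assumptions made for illustration, not part of the talk.

```python
def small_variants(ref: str, alt: str) -> list[tuple[int, str, str]]:
    """List per-position differences between two aligned sequences.

    '-' marks a gap (an assumed convention): a gap in `alt` is a deletion,
    a gap in `ref` is an insertion, any other mismatch is a substitution.
    Positions are 1-based.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    events = []
    for pos, (r, a) in enumerate(zip(ref, alt), start=1):
        if r == a:
            continue
        if a == "-":
            events.append((pos, "deletion", r))
        elif r == "-":
            events.append((pos, "insertion", a))
        else:
            events.append((pos, "substitution", f"{r}>{a}"))
    return events

# The two variant sequences shown on the slide, compared with TTCACG.
print(small_variants("TTCACG", "TTTACG"))
print(small_variants("TTCACG", "TTT-CG"))
```

Larger events such as duplications, inversions and translocations cannot be read off a short alignment like this, which is why the slide places them at chromosomal scale.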
Driver mutations in cancer genes
Genomic scenario: Targeted drug
• ERBB2 - breast cancer: Herceptin, Lapatinib
• BCR-ABL - leukaemia: Imatinib
• EGFR - lung cancer: Erlotinib, Gefitinib
• BRAF - metastatic melanoma: Vemurafenib
Driver mutations in cancer genes
ER positive ER negative
Driver mutations in cancer genes
Wagle et al. JCO, 2011
Pre-treatment 15 Weeks 23 Weeks
Driver mutations in cancer genes
Many thousands of passenger mutations
Mutation signatures in human cells
Extracting mutation signatures
C>T = G>A
C>A = G>T
C>G = G>C
T>A = A>T
T>C = A>G
T>G = A>C
6 mutation classes
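The equivalences above follow from base pairing: a change at a purine on one strand is the same event as a change at the paired pyrimidine on the other strand. A minimal sketch of that collapse is shown below; the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def to_pyrimidine_class(ref: str, alt: str) -> str:
    """Report a substitution relative to the pyrimidine (C or T) of the base pair."""
    if ref == alt or ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError(f"not a base substitution: {ref}>{alt}")
    if ref in "AG":  # purine reference: describe the change on the other strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"

# All 12 possible single-base changes collapse onto 6 classes.
all_changes = [(r, a) for r in "ACGT" for a in "ACGT" if r != a]
classes = sorted({to_pyrimidine_class(r, a) for r, a in all_changes})
print(len(all_changes), "possible substitutions ->", len(classes), "classes:", classes)
```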
Classification of base substitution mutations
C>T
C>A
C>G
T>A
T>C
T>G
ACA>ATA
ACC>ATC
ACG>ATG
ACT>ATT
CCA>CTA
CCC>CTC
CCG>CTG
CCT>CTT
GCA>GTA
GCC>GTC
GCG>GTG
GCT>GTT
TCA>TTA
TCC>TTC
TCG>TTG
TCT>TTT
6 mutation classes
ATA>AGA
ATC>AGC
ATG>AGG
ATT>AGT
CTA>CGA
CTC>CGC
CTG>CGG
CTT>CGT
GTA>GGA
GTC>GGC
GTG>GGG
GTT>GGT
TTA>TGA
TTC>TGC
TTG>TGG
TTT>TGT
ATA>ACA
ATC>ACC
ATG>ACG
ATT>ACT
CTA>CCA
CTC>CCC
CTG>CCG
CTT>CCT
GTA>GCA
GTC>GCC
GTG>GCG
GTT>GCT
TTA>TCA
TTC>TCC
TTG>TCG
TTT>TCT
ATA>AAA
ATC>AAC
ATG>AAG
ATT>AAT
CTA>CAA
CTC>CAC
CTG>CAG
CTT>CAT
GTA>GAA
GTC>GAC
GTG>GAG
GTT>GAT
TTA>TAA
TTC>TAC
TTG>TAG
TTT>TAT
ACA>AAA
ACC>AAC
ACG>AAG
ACT>AAT
CCA>CAA
CCC>CAC
CCG>CAG
CCT>CAT
GCA>GAA
GCC>GAC
GCG>GAG
GCT>GAT
TCA>TAA
TCC>TAC
TCG>TAG
TCT>TAT
ACA>AGA
ACC>AGC
ACG>AGG
ACT>AGT
CCA>CGA
CCC>CGC
CCG>CGG
CCT>CGT
GCA>GGA
GCC>GGC
GCG>GGG
GCT>GGT
TCA>TGA
TCC>TGC
TCG>TGG
TCT>TGT
96 mutation classes
Classification of base substitution mutations
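The 96 classes are the 6 substitution classes multiplied by the 4 possible bases immediately 5' and the 4 possible bases immediately 3' of the mutated base (6 x 4 x 4 = 96). The sketch below enumerates them and maps an observed mutation onto its class; the function names are hypothetical and the input is assumed to be a reference trinucleotide plus the mutant base.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

def all_96_classes() -> list[str]:
    """Enumerate every 5' base, substitution and 3' base combination, e.g. 'ACA>ATA'."""
    classes = []
    for sub, five, three in product(SUBSTITUTIONS, BASES, BASES):
        ref, alt = sub.split(">")
        classes.append(f"{five}{ref}{three}>{five}{alt}{three}")
    return classes

def classify(context: str, alt: str) -> str:
    """Map a reference trinucleotide and mutant base onto one of the 96 classes."""
    if context[1] in "AG":  # purine at the mutated base: use the reverse complement
        context = context.translate(COMPLEMENT)[::-1]
        alt = alt.translate(COMPLEMENT)
    return f"{context}>{context[0]}{alt}{context[2]}"

classes = all_96_classes()
print(len(classes))
print(classify("TCA", "T"))  # already pyrimidine-centred
print(classify("TGA", "A"))  # G>A in TGA is reported as C>T in TCA
```

Counting how many mutations in a sample fall into each of these classes gives the 96-channel profile on which the signatures in the following slides are drawn.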
Mutational signatures in human cancer
ApCpG
TpCpG
GpCpG
CpCpG
Tobacco
UV light
associated with exposure to UV light
Panels: cancer; simulated sunlight (in vitro)
Aristolochic acid
Panels: cancer; aristolochic acid (in vitro)
HR deficiency
MMR deficiency
Unknown aetiology
It’s early days
• We are still in the infancy of understanding mutational signatures
• This is not dogma: mutational signatures will change
• There are improvements to be made in the mathematical frameworks
DIGGING DEEP INTO DATA:
LOCALIZED MUTATION SIGNATURES
PART II
Panoramic view of whole-genome sequenced cancers
p < 0.001
Localised hypermutation or kataegis
Axes: Mutation number; Genomic coordinate (bp)
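One simple way to flag localised hypermutation is to look for runs of consecutive mutations separated by unusually short distances. The sketch below assumes mutation positions from a single chromosome and uses illustrative thresholds (at least 6 mutations, each within 1,000 bp of the previous one). The thresholds, the function name and the toy positions are assumptions for illustration, not the criteria used in the talk.

```python
def find_clusters(positions, max_gap=1000, min_mutations=6):
    """Return (start, end, count) for runs of closely spaced mutations.

    A run is extended while each mutation lies within `max_gap` bp of the
    previous one; runs with fewer than `min_mutations` are discarded.
    Both thresholds are illustrative assumptions.
    """
    clusters = []
    run = []
    for pos in sorted(positions):
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_mutations:
                clusters.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    if len(run) >= min_mutations:
        clusters.append((run[0], run[-1], len(run)))
    return clusters

# Toy positions (bp) on one chromosome: sparse background plus one dense cluster.
toy = [120_000, 950_000, 2_000_100, 2_000_350, 2_000_420, 2_000_900,
       2_001_300, 2_001_650, 2_002_000, 5_400_000, 9_100_000]
print(find_clusters(toy))
```

Plotting the distance between neighbouring mutations against mutation number makes such clusters stand out visually, which is the idea behind the plot on this slide.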
Generalised hypermutation of signatures 2/13
PD4120a
ER +ve HER2 -ve
Kataegis and Signatures 2/13 share similar characteristics
TpCpA
TpCpG
TpCpT
TpCpC
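The shared characteristic is the sequence context: cytosine mutations preceded by a thymine (TpCpN). A minimal sketch of how that enrichment could be measured is shown below, assuming mutations have already been labelled with pyrimidine-centred trinucleotide classes such as "TCA>TTA". The function name and the toy list are hypothetical.

```python
def tpc_fraction(mutation_classes):
    """Fraction of cytosine substitutions whose 5' neighbour is T (TpCpN context)."""
    cytosine = [m for m in mutation_classes if m[1] == "C"]
    if not cytosine:
        return 0.0
    return sum(m[0] == "T" for m in cytosine) / len(cytosine)

# Toy data: five cytosine substitutions (four at TpC) and one thymine substitution.
toy = ["TCA>TTA", "TCT>TGT", "TCG>TTG", "ACA>ATA", "TCC>TTC", "ATA>ACA"]
print(tpc_fraction(toy))
```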
What is the biological basis for these mutational signatures?
• Deamination of cytosine by one of the family of AID/APOBEC enzymes?
• The family includes AID, APOBEC1, APOBEC2, APOBEC3A-H and APOBEC4
What is the biological basis for these mutational signatures?
Double strand break and rearrangement
AID / APOBEC
DNA editing by the AID/APOBEC family of cytidine deaminases
The AID / APOBEC family of cytidine deaminases
• AID plays a central role in somatic hypermutation and class switch recombination at the immunoglobulin loci
• APOBEC3A-H mutate viruses and restrict retrotransposon activity
Which member(s) of the family is
responsible for Signature 2/13?
AID
APOBEC1
APOBEC2
APOBEC3A
APOBEC3B
APOBEC3C
APOBEC3DE
APOBEC3F
APOBEC3G
APOBEC3H
APOBEC4
Kataegis microclusters demonstrate processivity
Strand asymmetry
A T C G T C G A T G C T C G A T C A T C A G A G T C A A
| | | | | | | | | | | | | | | | | | | | | | | | | | | |
T A G C A G C T A C G A G C T A G T A G T C T C A G T T
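One way to look for this kind of strand asymmetry is to ask whether neighbouring mutations in a cluster hit cytosines on the same strand. The sketch below is a hedged illustration: it assumes each mutation is described by the reference-strand base at the mutated position, so that "C" means the cytosine sits on the reference strand and "G" means it sits on the complementary strand. The function name and toy input are hypothetical.

```python
from itertools import groupby

def strand_runs(ref_bases):
    """Group consecutive cytosine-pair mutations by the strand carrying the C.

    `ref_bases` holds the reference-strand base at each mutated position, in
    genomic order. 'C' means the cytosine is on the reference strand, 'G' means
    it is on the complementary strand; other bases are labelled 'other'.
    """
    labels = ["C-strand" if b == "C" else "G-strand" if b == "G" else "other"
              for b in ref_bases]
    return [(label, len(list(group))) for label, group in groupby(labels)]

# Toy microcluster (assumed data): five C mutations followed by four G mutations.
print(strand_runs("CCCCCGGGG"))
```

Long runs on one strand, rather than a random alternation of C and G, are the pattern the slide title describes as processivity.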
Kataegis
• Cytosine mutations
• TpC sequence context
• Localised hypermutation
• Co-localising with rearrangements
• Strand asymmetry about rearrangement breakpoints
Signature 2 & 13
• Cytosine mutations
• TpC sequence context
• Generalised hypermutation
• NOT co-localising with rearrangements
• Strand asymmetry relating to replication direction
Distinct (but related) signatures in human cancers
Deep computational analysis paired with experimental work
has started to reveal and validate biological insights into
mechanisms of how the human genome is constantly mutating
Accumulation of mutations in cancer
Summary II
• A remarkable phenomenon of localised hypermutation, termed “kataegis”, was frequently observed in 21 primary breast cancers and subsequently in many other cancer types, including pancreatic cancer, lung cancer and haematological cancers.
• Regions of kataegis usually co-localised with somatic rearrangements.
• Base substitutions in these regions were almost exclusively of cytosine at TpC dinucleotides. This is very similar to the genome-wide mutagenesis of Signatures 2 and 13.
• A role for the APOBEC family of cytidine deaminases is proposed.
MUTATIONAL PROCESSES:
WHAT DOES IT MEAN FOR PATIENTS?
PART III
Mutational processes in 560 breast cancers
SNZ.2017
Correlated with age
Associated with cytosine deaminases
Mismatch repair deficiency
Homologous recombination repair deficiency
Uncertain aetiology
Driver mutations in breast cancer patients
The individuality of each tumour
“No two patients share the same set of drivers or
the same quantities of mutational signatures in
their tumors”
Whole genome profiling
CLINICAL APPLICATIONS OF
MUTATIONAL SIGNATURES
PART IV
Helen Davies
Dominik Glodzik
Predicting BRCA1/BRCA2 deficiency
HRDetect identifies additional BRCA1/BRCA2 defective tumours
22 known germline
33 new germline
22 somatic or promoter hypermethylation
47 unknown
HRDetect is superior to any individual mutational signature and the HRD index
HRDetect distinguishes clinical outcome & is robust between samples from the same patient
Summary
• Presented a mutational-signature-based algorithm called HRDetect with excellent sensitivity and specificity for detecting germline BRCA1/BRCA2 deficient tumours
• Identified a significant proportion of BRCA1/BRCA2 deficient tumours that were not previously known (and missed using typical exome sequencing methods)
• Mutational signatures provide an opportunity to fine-tune genomic stratification
WGS reveals MMR deficient breast cancers
Immunohistochemistry of MMR proteins: PD23579a
MLH1, MSH2, MSH6, PMS2
WGS reveals subclonal MMR deficiency
Unpicking cancer evolution
Mutational signatures
Signatures associated with MMR deficiency
Acquired mutations in MMR genes
Mutational signatures in 560 breast cancer patients
The unmet need
560 breast cancer genomes
AMC, Netherlands
CRI, UK
DFCI, USA
Erasmus, Netherlands
Institut Curie, France
Bordet, Belgium
Bari, Italy
NCI, Netherlands
ICGC Korea
Lund, Sweden
MD Anderson, USA
MSKCC, USA
NCC, Singapore
Oslo, Norway
RUNMC, Netherlands
Synergie, France
TCRU, Belgium
Bergen, Norway
Dundee, Scotland
Reykjavik, Iceland
Brisbane, Australia
acknowledgements
BASIS Consortium & ICGC Breast Cancer Working Group
Mike Stratton
Ewan Birney
Marc van der Vijver
Ake Borg
John Martens
Anne-Lise Borreson-Dale
Henk Stunnenberg
Andrea Richardson
Alastair Thompson
Jorunn Erla Eyfjord
Andy Futreal
Christos Sotiriou
Andy Tutt
Sunil Lakhani
Steven van Laere
Paul Span
Carlos Caldas
Laura van’t Veer
Gilles Thomas
Alain Viari
Gu Kong
The team
Becky Harris Project Coordinator
Sandro Morganella (industry)
Dominik Glodzik (Lund University)
Hongwei Chen (Group Leader, Nanjing Uni)
Xueqing Zou
Helen Davies
Andrea Degasperi
Tauanne Dias Amarante
Ilias Georgakopoulos-Soares (PhD)
Gene Koh (PhD)
HR
Johan Staaf
WTSI IT/Admin
Keiran Raine
David Jones
Andrew Menzies
Lucy Stebbings
Jon Hinton
Adam Butler
Sancha Martin
MMR
Colin Purdie (Dundee)
Elin Borgen (Oslo)
Hege Russnes (Oslo)
Se Jin Jang (Asan)