structural variant detection in colorectal cancer€¦ · structural variant detection in...

Post on 30-Apr-2020

7 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Structural variant detection in colorectal cancer

1

1 Dept. of Pathology, VU University Medical Center, Amsterdam, The Netherlands 2 Dept. of Epidemiology & Biostatistics, VU University Medical Center, Amsterdam, The Netherlands 3 Dept. of Pathology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands 4 Dept. of Medical Oncology, Academic Medical Center, University of Amsterdam, The Netherlands 5 Dept. of Computer Science, VU University, Amsterdam, The Netherlands

E van den Broek1, JC Haan1, MH Jansen1, B Carvalho1, MA van de Wiel2, ID Nagtegaal3, CJA Punt4, B Ylstra1, S Abeln5, GA Meijer1, RJA Fijneman1

Colorectal cancer (CRC)

•  Colorectal cancer is a major health concern worldwide

•  Second cause of cancer related death –  The incidence worldwide is 1,200,000 –  The incidence in the US is 144,000

•  Mortality rates

2

Stage 1 < 10 % Stage 2 25 - 30 % Stage 3 45 – 50 % Stage 4 > 90 %

Diagnostic biomarkers

Predictive biomarkers

Prognostic biomarkers

Clinical needs:

1. Screening

2. Predict recurrence

3. Personalized therapy

CRC research Clinical needs for biomarker discovery

3

normal colon progressive adenoma adenoma localized CRC CRC liver

metastasis CRC lymph node

metastasis

activated Wnt signaling

genomic instability (~5% of adenomas)

~15% MIN+ ~85% CIN+

~3% MIN+ ~97% CIN+

Key molecular features

0

20

40

60

80

100

5-ye

ar s

urvi

val r

ate

(%)

Stage IV Stage I+II Stage III CRC stage:

Chromosomal Instability a hallmark of CRC SKY: numerical & structural aberrations

4 M Hermsen et al., Oncogene 2005

CAIRO & CAIRO2 studies

Phase III clinical trials In total 1575 patients were included CApecitabine, IRinotecan, Oxaliplatin in advanced colorectal cancer CAIRO: Koopman et al. Lancet 2007 CAIRO2: Tol et al. N Engl J Med 2009 DNA from 356 patients: primary tumor and matched normal

–  Representative group –  Isolated from FFPE

5

Array CGH: 356 CAIRO & CAIRO2 samples

6

Calling

Copy numbers

Segmentation 1

2

3

Numerical aberrations

Comparative Genomic Hybridization (CGH) Agilent, 180k array CGH

Segmentation - array CGH Profile of one tumor with 180k probes

7

Segmentation was performed using Circular Binary Segmentation algorithm (DNAcopy. Olshen et al. 2004)

Calling - array CGH Profile of one tumor with 180k probes

8

Calling was performed using CGHcall (CGHcall. vd Wiel et al. 2007)

Structural Variants (SV) in cancer

Hematological disorders •  Philadelphia chromosome

–  t(9;22) –  Fusion gene: BCR-ABL –  Drug: Imatinib / Gleevec

Epithelial cancers •  TMPRSS2-ERG in prostate cancers •  VTI1A-TCF7L2 is confirmed in 3% of 97 CRCs

–  Bass et al., Nature Genetics 2011

9

TO IDENTIFY RECURRENT SOMATIC STRUCTURAL GENOMIC VARIANTS THAT CAUSE CRC

AIM

10

Array CGH: 356 CAIRO & CAIRO2 samples

Breakpoint (BP) detection Based on array CGH

11

Breakpoint detection

Candidate genes

Segmentation 1

2

3

Structural variants

BP detection in array CGH Profile of one tumor with 180k probes

12

Breakpoints are defined by the start position of the first

probe of each segment

Breakpoint annotation per gene

•  Total number of genes with BPs: 5,737 genes •  482 candidate genes were identified with recurrent BP (FDR < 0.1)

13

Results based on array CGH BP detection

020

4060

80100

120

140

Am

ount

of a

ffect

ed s

ampl

es in

arr

ay C

GH

Candidate genes (top 15)

MACROD2

Overall survival: MACROD2 Recurrent BP (1) versus no-BP (0)

14

Log rank P= 0.08 0 500 1000 1500

0.0

0.2

0.4

0.6

0.8

1.0

MACROD2

Overall Survival (days)

Sur

viva

l pro

babi

lity

BP (samples)0 ( 207 )1 ( 144 )

•  Total number of genes with BPs: 5,737 genes •  482 candidate genes were identified with recurrent BP (FDR < 0.1) Limitations breakpoint determination using array CGH: –  Location BP is estimation (average probe distance is ~17 kb) –  DNA structure is unknown –  Balanced events will be missed

15

CGH

482 CANDIDATE GENES

Results based on array CGH BP detection

•  482 candidate genes were identified with recurrent BP (FDR < 0.1)

16

CGH

482 CANDIDATE GENES

Candidate validation is required

Validation array CGH BPs NGS data from TCGA

The Cancer Genome Atlas CRC samples (COAD & READ)

Whole Genome DNA Seq from paired tumor-normal samples

•  482 candidate genes were identified with recurrent BP (FDR < 0.1)

17

CGH

482 CANDIDATE GENES

Candidate validation is required

Validation array CGH BPs NGS data from TCGA

Structural Variant (SV) detection Candidate driven

Negative Control Genes

(no BP)

Computational methods Focus on candidate genes

Based on paired-end NGS data •  Read-pair approach

–  Discordance: location / bridge length / orientation reads

Discordant pairs (DP) types •  Translocation > different chromosomes •  Insertion > bridge length •  Deletion > bridge length •  Inversion > orientation •  Eversion > orientation •  Single mapped could indicate a breakpoint

18

ref

Computational methods Focus on candidate genes

Based on paired-end NGS data 1. Read-pair approach

Combined with: 2. Read-depth

3. Define breakpoint location

4. Determine tumor specific events

19

ref

!

1

2

3

Translocation IGV MACROD2 •  Discordant pairs •  Breakpoints

20

!

!

Fusion partner

Based on DP groups:

Approximately 5 fold higher number of translocation-DP groups for candidate genes compared to control genes

Distribution DP groups per type Preliminary results candidate genes in TCGA data

21

Deletion 50.8%

Eversion 8.4%

Inversion 6.7%

Insertion 26.4%

Translocation 7.7%

Translocation-DP groups per candidate gene in TCGA samples Putative translocations

22

Freq

uenc

y of

tran

sloc

atio

n-D

P gr

oups

(au)

Candidate genes

Freq

uenc

y of

tran

sloc

atio

n-D

P gr

oups

(au)

Candidate genes (top 20)

MACROD2

0% 10% 20% 30% 40%

Correlation per candidate gene - Frequency of samples with BP based on array CGH - Frequency of translocation-DP groups in TCGA data

23 Frequency of affected samples in array CGH analysis

Freq

uenc

y of

tran

sloc

atio

n-D

P gr

oups

(au)

MACROD2

Conclusions •  482 candidate genes with recurrent breakpoints were identified

in a large cohort of 356 CRC samples, based on array CGH analysis

•  The TCGA provided an essential CRC reference dataset (COAD, READ) to validate Structural Variants in candidate genes with recurrent breakpoints

•  Identification of BPs based on array CGH is correlated with SV detection in TCGA CRC NGS data

•  Further studies will be performed to investigate clinical and functional significance of validated candidate genes

24

top related