Discovery of two identities of neuroblastoma cells via the analysis of super-enhancer landscapes Valentina BOEVA Computational (Epi-)Genetics of Cancer Institut Cochin, Inserm U1016 / CNRS UMR 8104 / Université Paris Descartes UMR-S1016

Post on 21-Apr-2018


Introduction to neuroblastoma

NB = Pediatric cancer (avg. age 18 months)

Neuroblastoma may be found in the adrenal glands and in paraspinal nerve tissue from the neck to the pelvis

MYCN amplification

H3K27 acetylation (H3K27ac) modifications

NB cells with MYCN amplification may be sensitive to “epigenetic” drugs:

• CDK7 inhibitor (THZ1) – MYCN-amplified tumors

• BRD4 inhibitors (I-Bet726, I-Bet151, JQ1)

Initial aim: Profile super-enhancers in neuroblastoma cell lines and discover core transcriptional regulatory circuitries

• Neuroblastoma: 25 cell lines and 6 patient-derived xenografts

• Normal control: Neural crest cells

• ChIP-seq data

– H3K27ac

• Gene expression: RNA-seq data

Active promoters, enhancers and super-enhancers

Model for aggressive neuroblastoma

Collaboration with the team of Isabelle Janoueix

• Caroline Louis

• Simon Durand

• Agathe Peltier

Bioinformatics methods to work with histone modification data:

• Peak calling from ChIP-seq data

• Calling of super-enhancers based on H3K27ac peaks
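The super-enhancer calling step can be illustrated with a minimal sketch of ROSE-style ranking: regions are sorted by H3K27ac signal, both axes are scaled to [0, 1], and the cutoff is placed where the tangent to the curve has slope 1. The function name and toy scores are hypothetical, and the sketch leaves out peak stitching and the copy number correction that LILY adds.

```python
def call_super_enhancers(scores):
    """Return (cutoff, indices of regions scoring above the cutoff).

    Illustrative only: on the scaled ranked curve, y - x is minimal
    exactly where the slope equals 1, so that point is used as cutoff.
    """
    n = len(scores)
    if n < 2 or max(scores) <= 0:
        return None, []
    top = max(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by signal
    # x = rank / (n - 1), y = score / top; find the rank minimising y - x.
    tangent_rank = min(
        range(n), key=lambda r: scores[order[r]] / top - r / (n - 1)
    )
    cutoff = scores[order[tangent_rank]]
    supers = [i for i in range(n) if scores[i] > cutoff]
    return cutoff, supers


# Toy example: eight ordinary enhancers and two very strong regions.
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8, 100, 200]
cutoff, supers = call_super_enhancers(toy_scores)
print(cutoff, supers)
```

In the toy example only the two strongest regions lie above the tangent point.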

Figure: H3K27ac peaks called by HMCan, MACS and SICER compared with the copy number profile; super-enhancer calls without copy number correction (ROSE) and with copy number correction (LILY)

http://boevalab.com/tools.html

H. Ashoor…V. Boeva, Bioinformatics, 2013

V. Boeva* … I. Janoueix-Lerosey*, Nature Genetics, 2017

Principal component analysis (PCA) based on the SE signal determines 2 groups of cell lines

Figure: PCA plot separating Group I and Group II cell lines

V. Boeva* … I. Janoueix-Lerosey*, Nature Genetics, 2017
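A minimal sketch of how a PCA on SE signal can split samples into two groups, assuming a samples × regions matrix of SE scores. The function name, the toy matrix and the use of power iteration are illustrative assumptions and not the pipeline used in the paper.

```python
import math


def first_pc_scores(matrix, n_iter=200):
    """Score of each sample (row) on the first principal component."""
    n, m = len(matrix), len(matrix[0])
    # Center every region (column) across samples.
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in matrix]
    # Sample-by-sample Gram matrix; its top eigenvector gives the PC1 scores.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered]
            for r1 in centered]
    # Start power iteration from the row of the most variable sample
    # (a constant vector would lie in the null space of this matrix).
    k = max(range(n), key=lambda i: gram[i][i])
    v = list(gram[k])
    for _ in range(n_iter):
        w = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # no variance between samples
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(g * x for g, x in zip(gram[i], v)) for i in range(n)
    )
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]


# Toy SE scores: rows are cell lines, columns are SE regions (hypothetical).
toy = [
    [10, 9, 1, 0],
    [9, 10, 0, 1],
    [1, 0, 10, 9],
    [0, 1, 9, 10],
]
print(first_pc_scores(toy))
```

The sign of a principal component is arbitrary; what matters is that the first two toy samples fall on one side of PC1 and the last two on the other.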

The two groups of neuroblastoma are driven by different transcriptional master regulators

Analysis: Motif enrichment, core-regulatory circuitries, gene expression correlation analysis in cell lines and 498 primary tumors, experimental ChIP-seq validation

Figure: core regulatory circuitries of the two groups

Group I: PHOX2B, GATA3 and HAND2, each associated with a super-enhancer (SE), drive super-enhancers of MYCN, ALK, RET, LMO1, PRKCE, EYA1, BCL11A, etc.

Group II: FOSL1, FOSL2, RUNX1, RUNX2, PRRX1, IRF2, …, each associated with an SE, drive super-enhancers of MYC, BCOR, MECOM, PTPRJ, etc.

V. Boeva* … I. Janoueix-Lerosey*, Nature Genetics, 2017

Group I cells are more sensitive to chemotherapy

“Intermediate” cell lines contain cells of both types

Group II

Group I

V. Boeva* … I. Janoueix-Lerosey*, Nature Genetics, 2017

Figure: SK-N-AS (intermediate) cell line, individual cells by cell ID

The two cell types can co-exist within the same tumor

van Groningen et al, Nat Genetics, 2017

IHC for MAML3 (blue) and PRRX1 (red) in a stage 4 neuroblastoma. MAML3 = pan-neuroblastoma marker; PRRX1 = marker of module 2

Module 1 cells

Module 2 cells

stage 4 neuroblastoma tumor

More details in: Boeva et al, Nature Genetics, 2017

Poster #4!

Acknowledgements

Emmanuel Barillot

Alban Lermine

Amira Kramdi

Isabelle Janoueix-Lerosey

Caroline Louis

Simon Durand

Tatiana Popova

Olivier Delattre

Gudrun Schleiermacher

Institut Curie, Paris

Vladimir Bajic

Haitham Ashoor

KAUST, Saudi Arabia

Irina Medvedeva

Institut Cochin, Paris

Computational methods used in this presentation

1. Detection of regions enriched in H3K27ac (peak calling)

H. Ashoor et al, Bioinformatics, 2013

Hidden Markov Model

Peaks predicted by HMCan do not show copy number bias

H. Ashoor et al, Bioinformatics, 2013

Figure tracks: copy number profile and peaks called by HMCan, MACS and SICER

2. Detection of Super-Enhancers in cancer cells: correction for GC-content bias and variation in copy number

Figure: super-enhancer calls without copy number correction and with copy number correction

LILY: http://boevalab.com/LILY/

3. Motif detection in Super-enhancers

Super-enhancers are too large to look for enriched motifs

Better approach: Discovery of enriched motifs in valley regions of H3K27ac peaks in super-enhancers

Figure tracks (NB cell line): H3K27ac, valleys, TF binding (ChIP-seq), motif hits

LILY: http://boevalab.com/LILY/
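The idea of searching for motifs in valleys rather than in whole super-enhancers can be sketched as follows. This is a simplified illustration, not LILY's algorithm: a valley is taken to be a local minimum of the binned H3K27ac signal that lies clearly below the highest signal on both sides. The function name and the `min_drop` parameter are hypothetical.

```python
def find_valleys(signal, min_drop=0.3):
    """Indices of local minima lying clearly below both flanking maxima.

    signal: binned H3K27ac density inside one peak or super-enhancer.
    min_drop: required relative drop versus the lower of the two flanks.
    """
    valleys = []
    for i in range(1, len(signal) - 1):
        is_local_min = signal[i] < signal[i - 1] and signal[i] <= signal[i + 1]
        if not is_local_min:
            continue
        # Highest signal to the left and to the right of the candidate bin.
        lower_flank = min(max(signal[:i]), max(signal[i + 1:]))
        if signal[i] <= (1.0 - min_drop) * lower_flank:
            valleys.append(i)
    return valleys


# Toy profile with two summits (9 and 10) and one dip between them.
profile = [1, 5, 9, 4, 2, 6, 10, 3]
print(find_valleys(profile))
```

Motif scanning would then be restricted to the genomic windows of the returned bins, which are much shorter than the full super-enhancer.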

Results on neuroblastoma cell lines and tumors

Some maths for SE score normalization

1. Real SE in a diploid region:

ChIP signal: X1 reads

Corresponding input signal for this diploid region: X2 reads

ROSE score: X1 - X2

Our score: ~(X1/1 - X2/1) = (X1 - X2)/1 = X1 - X2

A correction factor of 1 corresponds to a diploid region (copy number equal to the main ploidy)

2. An enhancer in a diploid region (assuming k times less signal than for the SE):

ChIP signal: X1/k reads

Corresponding input signal for this diploid region: X2 reads

ROSE score: X1/k - X2

Our score: ~(X1/k/1 - X2/1) = X1/k - X2

3. SE in the MYCN region present in 100 copies instead of 2:

ChIP signal: X1*50 reads

Corresponding input signal for this amplified region: X2*50 reads

ROSE score: X1*50 - X2*50 = 50*(X1 - X2)

Our score: ~(X1*50/50 - X2*50/50) = X1 - X2

4. An enhancer (no SE) in the MYCN region present in 100 copies instead of 2:

ChIP signal: X1*50/k reads

Corresponding input signal for this amplified region: X2*50 reads

ROSE score: X1*50/k - X2*50 = 50*(X1/k - X2)

Our score: ~(X1*50/50/k - X2*50/50) = X1/k - X2

The same with values X1=400, X2=40 and k=4:

1. ROSE: 360; our score: 360

2. ROSE: 60; our score: 60

3. ROSE: 18000; our score: 360

4. ROSE: 3000; our score: 60
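The four cases can be checked numerically with a short sketch. The function names are hypothetical, and the corrected score is written in the simplified form used on this slide (both signals divided by copy number over main ploidy); the actual LILY implementation may differ in detail.

```python
def rose_score(chip, control):
    """Uncorrected score: ChIP signal minus input signal."""
    return chip - control


def corrected_score(chip, control, copy_number, ploidy=2):
    """Score after dividing both signals by copy_number / ploidy."""
    ratio = copy_number / ploidy
    return chip / ratio - control / ratio


X1, X2, k = 400, 40, 4
cases = {
    "1. SE, diploid": (X1, X2, 2),
    "2. enhancer, diploid": (X1 / k, X2, 2),
    "3. SE, 100 copies": (X1 * 50, X2 * 50, 100),
    "4. enhancer, 100 copies": (X1 * 50 / k, X2 * 50, 100),
}
for name, (chip, control, cn) in cases.items():
    print(name, rose_score(chip, control), corrected_score(chip, control, cn))
```

Without correction, the ordinary enhancer in the amplified region (case 4, score 3000) outranks the real super-enhancer in the diploid region (case 1, score 360); after correction the two amplified cases match their diploid counterparts.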

Neuroblastoma Super-Enhancers defined by H3K27ac peaks are occupied by PHOX2B, HAND2 and GATA3

• Top SE sorted according to the average SE score

• Intersection with TF binding sites defined in CLB-GA

Figure: percentage of top super-enhancers bound by each TF

HAND2, PHOX2B and GATA3 bind closely located regions within enhancers and SEs

10,000 strongest HAND2 binding sites (ChIP-seq)

TF peaks correspond to H3K27ac peaks in the ALK Super-Enhancer

Figure tracks at the ALK locus: HAND2, PHOX2B, GATA3 and H3K27ac

TF peaks correspond to H3K27ac peaks in the TBX2 Super-Enhancer

Figure tracks at the TBX2 locus: HAND2, PHOX2B, GATA3 and H3K27ac

HAND2, PHOX2B and GATA3 bind to a MYCN enhancer

Figure tracks at the MYCN locus: HAND2, PHOX2B, GATA3 and H3K27ac; annotations: FANTOM5 enhancer, MYCN and DDX1 enhancer

Gene expression linearly correlates with SE score (in Log scale): examples

Figure legend: NB cell lines, control samples, other cancer cell lines

DNMT expression can be a CIMP driver in ACC

• DNMT1 and DNMT3A expression is increased in CIMP-high patients

• DNMT1 and DNMT3A expression correlates with poor survival

• DNMT1, but not DNMT3A, expression correlates with proliferation

Figure: gene expression versus proliferation score in the TCGA and Cochin cohorts

We are hiring post-docs

Cancer epigenetics research projects

• Method development

• High-throughput data analysis and data mining

• Experimental validation

NB genes with super-enhancers tend to associate with neuronal differentiation

• Functional annotation of neuroblastoma Super-Enhancers:

– GO:0030182 neuron differentiation

– GO:0022008 neurogenesis

– GO:0048483 autonomic nervous system development

– GO:0045664 regulation of neuron differentiation

– GO:0045202 synapse

Gene expression linearly correlates with SE score (in Log scale): for 1003 SE regions detected in at least 2 NB samples

P-value<0.05

Pearson correlation test on 20 NB cell lines + 2 hNCC
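A minimal sketch of the per-region test, assuming one SE score and one expression value per sample: both are log-transformed, Pearson's r is computed, and r is converted to a t statistic with n - 2 degrees of freedom. The function names and the pseudocount of 1 are assumptions; obtaining the P-value itself requires a t distribution, which is left out here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_log_correlation(se_scores, expression, pseudocount=1.0):
    """Correlate log2 SE score with log2 expression across samples."""
    lx = [math.log2(s + pseudocount) for s in se_scores]
    ly = [math.log2(e + pseudocount) for e in expression]
    return pearson_r(lx, ly)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n samples (n - 2 d.o.f.)."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical values for one SE region across four samples.
r = log_log_correlation([1, 3, 7, 15], [2, 5, 20, 40])
print(r, t_statistic(r, 4))
```

With the 22 samples mentioned on the slide (20 NB cell lines and 2 hNCC), the statistic would be compared against a t distribution with 20 degrees of freedom.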

Is there any difference in CRCs in MYCN amplified NB?

Chipumuro et al, Cell, 2014

Cancer cells with a specific SV have a specific epigenetic profile (super-enhancers)

These cells are sensitive to a specific drug (CDK7 inhibitor)

Cell lines shown: SH-SY5Y, Kelly

Normalization by HMCan does not suggest any “significant” difference in SEs between MYCN-amplified and MYCN non-amplified NBs

MYCN-amplified (top) vs MYCN non-amplified (bottom) cell lines: SE in GATA2

Figure axis: SE score

Super-enhancers with differential score between MYCN-amplified and MYCN non-amplified NBs

With FDR adjustment: no significant regions (Wilcoxon rank test)

P-value <0.01

ChIP-seq technique can provide information about modifications of histone tails

ChIP-seq = chromatin immunoprecipitation + sequencing

Main steps of the ChIP-seq technique (figure): reads of 35-100 bp from the ChIP sample, plus a control (e.g., input DNA); a cluster of reads (peak) as seen in the UCSC genome browser

PHOX2B is critical for the growth of neuroblastoma cells of noradrenergic type

PHOX2B expressed

PHOX2B inhibited with shRNA

CLB-GA cell line (Caroline Louis)

Analysis of ChIP-seq data: density profile calculation

Figure: reads along the chromosome, converted to a density, then to a binned density (e.g., 4, 2, 0) stored in a .wig file

We calculate the density both for the ChIP and control sample
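The density calculation can be sketched as follows, assuming reads are given as 0-based start positions on one chromosome and are extended to a fixed fragment length. The function names, bin size and fragment length are illustrative assumptions; the output uses the fixedStep variant of the WIG format.

```python
def binned_density(read_starts, chrom_length, bin_size=50, fragment_length=200):
    """Number of extended reads overlapping each fixed-size bin."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    density = [0] * n_bins
    for start in read_starts:
        # Extend the read to the assumed fragment length, clipped to the
        # chromosome end, and count it once in every bin it overlaps.
        end = min(start + fragment_length, chrom_length)
        for b in range(start // bin_size, (end - 1) // bin_size + 1):
            density[b] += 1
    return density


def to_wig(chrom, density, bin_size=50):
    """Serialise a binned density as fixedStep WIG text."""
    header = f"fixedStep chrom={chrom} start=1 step={bin_size} span={bin_size}"
    return "\n".join([header] + [str(v) for v in density])


# The same function would be applied to the ChIP and to the control reads.
chip = binned_density([0, 10, 120], chrom_length=400, fragment_length=100)
print(to_wig("chr1", chip))
```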

Nebula: web-service for analysis of ChIP-seq data

Statistics for external connections to Curie

Nebula: web-service for analysis of ChIP-seq data

• Peak calling

• Calculation of the density and cumulative distribution of peak locations relative to gene transcription start sites

• Annotation of peaks with genomic features and genes with peak information
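The cumulative distribution of peaks relative to transcription start sites can be sketched as below, assuming a single chromosome, peak summit coordinates and one TSS per gene. This is an illustration of the statistic, not Nebula's code, and the function names are hypothetical.

```python
from bisect import bisect_left


def nearest_peak_distance(tss, sorted_peaks):
    """Absolute distance from a TSS to the closest peak position."""
    i = bisect_left(sorted_peaks, tss)
    # Only the neighbours just left and right of the TSS can be closest.
    candidates = sorted_peaks[max(i - 1, 0): i + 1]
    return min(abs(p - tss) for p in candidates)


def cumulative_proportion(tss_list, peaks, thresholds):
    """Proportion of genes with a peak within each distance threshold."""
    if not tss_list or not peaks:
        return [0.0] * len(thresholds)
    sorted_peaks = sorted(peaks)
    dists = [nearest_peak_distance(t, sorted_peaks) for t in tss_list]
    return [sum(d <= th for d in dists) / len(dists) for th in thresholds]


tss = [1000, 5000, 9000, 20000]
peaks = [1200, 4000, 9050]
print(cumulative_proportion(tss, peaks, [100, 500, 1000, 2000]))
```

Computing the same curve separately for down-regulated, non-responsive and up-regulated genes gives the kind of comparison shown in the Nebula plots.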

Figure panels A to E:

– Proportion of genes with a peak at a given distance from the TSS (cumulative; distance in Kb) for down-regulated, no-response and up-regulated genes

– Proportion of genes with a peak at a given distance from the TSS (density; distance in bp) for ChIP and control

– Proportion of genes with a peak by genomic feature (Enh., Prom., Imm.Down., Intrag., GeneDown., F.Intron, Exons, 2,3,etc. Introns, E.I.Junctions) for down-regulated, no-response and up-regulated genes and control

– Peak count by peak height for ChIP and control

– Proportion of peaks by genomic feature (GeneDown., Enh., Imm.Down., Interg., Intrag., Prom.) for ChIP and control

Some graphs produced by Nebula

V. Boeva, A. Lermine et al, Bioinformatics, 2012

Disruption of the genomic sequence in cancer can affect epigenetic profiles

• Mutations and structural variants (SVs) in cancer genomes

– Disruption of epigenetic profiles by mutation of epigenome-regulatory proteins (readers, writers or erasers)

– Disruption of regulatory elements

– Disruption of interactions between genes and regulatory elements

What transcription factors may be involved in the formation of NB-specific super-enhancers?

• Likely candidates: TCF12 (which interacts with HAND2), PBX3, JUND, GATA2 and GATA3, together with many other TFs (ChIP-seq experiments are ongoing)

Figure tracks (ENCODE data): CTCF, JUND, PBX3, TCF12, GATA2, GATA3, H3K27ac (SK-N-SH), H3K27ac (SH-SY5Y), super-enhancers, genes

Neuroblastoma cells have been shown to be sensitive to CDK7 inhibitor

• CDK7 inhibitor (THZ1) – MYCN-amplified tumors

Chipumuro et al., Cell, 2014

Suggested mechanism: deactivation of super-enhancers created by MYCN

HMCan normalization of the same data

• MYCN high or MYCN amplified tumors

• MYCN low or MYCN WT tumors

• 650 cancer cell lines

Neuroblastoma cells have been shown to be sensitive to BRD4 inhibitor JQ1

Puissant et al., Cancer Discov. 2013

A variety of compounds reverse epigenetic changes

Molecule – Mechanism – Cancer

JQ1, I-Bet151, I-Bet762 – BRD4 – multiple myeloma, Merkel cell carcinoma, castration-resistant prostate cancer, ER+ breast cancers, ovarian carcinoma, human osteosarcoma (+rapamycin)

THZ1 – CDK7 – T-ALL, basal breast cancer, neuroblastoma, AML

Pivanex, Valproate, TSA, Vorinostat, Romidepsin – HDAC – prostate, endometrial and cervical carcinomas, leukemias and lymphomas

Suramin, Cambinol – SIRT1 & SIRT2 – lymphomas, prostate, lung and breast cancer

5-Azacitidine, Zebularine, RG108 – DNMT1 – hepatocellular carcinoma, breast cancer, prostate cancer, renal carcinoma (+ interferon α-2β)

DZNep, EPZ-6433, GSK126, EI1 – PRC2 – prostate cancer, neuroblastoma
