fusion of ttyh1 with the c19mc microrna cluster … · etmr samples (including 2 recurrences) and 5...

9
NATURE GENETICS ADVANCE ONLINE PUBLICATION LETTERS Embryonal tumors with multilayered rosettes (ETMRs) are rare, deadly pediatric brain tumors characterized by high-level amplification of the microRNA cluster C9MC ,2 . We performed integrated genetic and epigenetic analyses of 2 ETMR samples and identified, in all cases, C9MC fusions to TTYH1 driving expression of the microRNAs. ETMR tumors, cell lines and xenografts showed a specific DNA methylation pattern distinct from those of other tumors and normal tissues. We detected extreme overexpression of a previously uncharacterized isoform of DNMT3B originating at an alternative promoter 3 that is active only in the first weeks of neural tube development. Transcriptional and immunohistochemical analyses suggest that C9MC-dependent DNMT3B deregulation is mediated by RBL2, a known repressor of DNMT3B 4,5 . Transfection with individual C9MC microRNAs resulted in DNMT3B upregulation and RBL2 downregulation in cultured cells. Our data suggest a potential oncogenic re-engagement of an early developmental program in ETMR via epigenetic alteration mediated by an embryonic, brain-specific DNMT3B isoform. Brain tumors are currently the leading cause of cancer-related mor- tality and morbidity in children, with neoplasms of embryonal origin representing the most common group. The World Health Organization has divided these embryonal tumors into three broad entities comprising medulloblastomas, atypical teratoid rhabdoid tumors and central nervous system (CNS) primitive neuroectoder- mal tumors (PNETs). ETMRs (named for their histology) 6 are a rare, newly suggested entity within PNETs 6,7 that mainly affect infants, with the vast majority of affected individuals dying within 2 years of diagnosis, despite intensive therapeutic intervention 1,8,9 . Most of these tumors are characterized by high-level amplification of the polycis- tronic microRNA (miRNA) cluster C19MC on chromosome 19q13.41 (refs. 1,2), one of the largest miRNA clusters that spans ~100 kb and contains 46 miRNA genes 10 . Transcriptional signatures indicate that ETMRs may originate from an early neural stem cell precursor, and high LIN28A and low OLIG2 levels have recently been proposed as surrogate diagnostic markers 8,9 . However, there is insufficient infor- mation to improve disease management and a crucial need to identify relevant targets for the design of novel therapeutic agents. C19MC amplification is accompanied by a striking overexpres- sion of oncogenic miRNAs that cannot be explained solely by the copy number aberration 1 . To decipher the molecular pathogenesis of ETMR, we performed an integrated genetic and epigenetic analysis 1 McGill University and Génome Québec Innovation Centre, Montreal, Quebec, Canada. 2 Department of Human Genetics, McGill University, Montreal, Quebec, Canada. 3 Division of Hematology-Oncology, Arthur & Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, Ontario, Canada. 4 Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada. 5 Department of Pathology, McGill University Health Centre, Montreal, Quebec, Canada. 6 Second Department of Paediatrics, Semmelweis University, Budapest, Hungary. 7 Department of Neurosurgery, Medical and Health Science Center, University of Debrecen, Debrecen, Hungary. 8 Division of Neurosurgery, Department of Surgery, Montreal Children’s Hospital, McGill University Health Centre, Montreal, Quebec, Canada. 9 Department of Molecular Pathology and Neuropathology, Medical University of Lodz, Lodz, Poland. 10 Department of Neurosurgery, Polish Mother’s Memorial Hospital Research Institute, Lodz, Poland. 11 Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada. 12 Department of Biochemistry, McGill University, Montreal, Quebec, Canada. 13 Department of Pathology & Laboratory Medicine, University of Calgary, Calgary, Alberta, Canada. 14 Division of Pediatric Hematology-Oncology, Department of Pediatrics, McGill University and the McGill University Health Centre Research Institute, Montreal, Quebec, Canada. 15 Division of Neurosurgery, Arthur & Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, Ontario, Canada. 16 Program in Cell Biology, Arthur & Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Department of Pediatrics, University of Toronto, Toronto, Ontario, Canada. 17 These authors contributed equally to this work. 18 These authors contributed equally to this workThese authors jointly directed this work. Correspondence should be addressed to N.J. ([email protected]), J.M. ([email protected]) or A.H. ([email protected]). Received 15 June; accepted 13 November; published online 8 December 2013; doi:10.1038/ng.2849 Fusion of TTYH1 with the C19MC microRNA cluster drives expression of a brain-specific DNMT3B isoform in the embryonal brain tumor ETMR Claudia L Kleinman 1,2,17 , Noha Gerges 2,17 , Simon Papillon-Cavanagh 1 , Patrick Sin-Chan 3 , Albena Pramatarova 1 , Dong-Anh Khuong Quang 2 , Véronique Adoue 1 , Stephan Busche 1 , Maxime Caron 1 , Haig Djambazian 1 , Amandine Bemmo 1 , Adam M Fontebasso 4 , Tara Spence 3 , Jeremy Schwartzentruber 1 , Steffen Albrecht 5 , Peter Hauser 6 , Miklos Garami 6 , Almos Klekner 7 , Laszlo Bognar 7 , Jose-Luis Montes 8 , Alfredo Staffa 1 , Alexandre Montpetit 1 , Pierre Berube 1 , Magdalena Zakrzewska 9 , Krzysztof Zakrzewski 10 , Pawel P Liberski 9 , Zhifeng Dong 11 , Peter M Siegel 11 , Thomas Duchaine 12 , Christian Perotti 13 , Adam Fleming 14 , Damien Faury 14 , Marc Remke 15 , Marco Gallo 15 , Peter Dirks 15 , Michael D Taylor 15 , Robert Sladek 1,2 , Tomi Pastinen 1 , Jennifer A Chan 13 , Annie Huang 3,16,18 , Jacek Majewski 1,2,18 & Nada Jabado 2,4,18

Upload: dangdiep

Post on 09-Oct-2018

226 views

Category:

Documents


0 download

TRANSCRIPT

Nature GeNetics  ADVANCE ONLINE PUBLICATION �

l e t t e r s

Embryonal tumors with multilayered rosettes (ETMRs) are rare, deadly pediatric brain tumors characterized by high-level amplification of the microRNA cluster C�9MC�,2. We performed integrated genetic and epigenetic analyses of �2 ETMR samples and identified, in all cases, C�9MC fusions to TTYH1 driving expression of the microRNAs. ETMR tumors, cell lines and xenografts showed a specific DNA methylation pattern distinct from those of other tumors and normal tissues. We detected extreme overexpression of a previously uncharacterized isoform of DNMT3B originating at an alternative promoter3 that is active only in the first weeks of neural tube development. Transcriptional and immunohistochemical analyses suggest that C�9MC-dependent DNMT3B deregulation is mediated by RBL2, a known repressor of DNMT3B4,5. Transfection with individual C�9MC microRNAs resulted in DNMT3B upregulation and RBL2 downregulation in cultured cells. Our data suggest a potential oncogenic re-engagement of an early developmental program in ETMR via epigenetic alteration mediated by an embryonic, brain-specific DNMT3B isoform.

Brain tumors are currently the leading cause of cancer-related mor-tality and morbidity in children, with neoplasms of embryonal

origin representing the most common group. The World Health Organization has divided these embryonal tumors into three broad entities comprising medulloblastomas, atypical teratoid rhabdoid tumors and central nervous system (CNS) primitive neuroectoder-mal tumors (PNETs). ETMRs (named for their histology)6 are a rare, newly suggested entity within PNETs6,7 that mainly affect infants, with the vast majority of affected individuals dying within 2 years of diagnosis, despite intensive therapeutic intervention1,8,9. Most of these tumors are characterized by high-level amplification of the polycis-tronic microRNA (miRNA) cluster C19MC on chromosome 19q13.41 (refs. 1,2), one of the largest miRNA clusters that spans ~100 kb and contains 46 miRNA genes10. Transcriptional signatures indicate that ETMRs may originate from an early neural stem cell precursor, and high LIN28A and low OLIG2 levels have recently been proposed as surrogate diagnostic markers8,9. However, there is insufficient infor-mation to improve disease management and a crucial need to identify relevant targets for the design of novel therapeutic agents.

C19MC amplification is accompanied by a striking overexpres-sion of oncogenic miRNAs that cannot be explained solely by the copy number aberration1. To decipher the molecular pathogenesis of ETMR, we performed an integrated genetic and epigenetic analysis

1McGill University and Génome Québec Innovation Centre, Montreal, Quebec, Canada. 2Department of Human Genetics, McGill University, Montreal, Quebec, Canada. 3Division of Hematology-Oncology, Arthur & Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, Ontario, Canada. 4Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada. 5Department of Pathology, McGill University Health Centre, Montreal, Quebec, Canada. 6Second Department of Paediatrics, Semmelweis University, Budapest, Hungary. 7Department of Neurosurgery, Medical and Health Science Center, University of Debrecen, Debrecen, Hungary. 8Division of Neurosurgery, Department of Surgery, Montreal Children’s Hospital, McGill University Health Centre, Montreal, Quebec, Canada. 9Department of Molecular Pathology and Neuropathology, Medical University of Lodz, Lodz, Poland. 10Department of Neurosurgery, Polish Mother’s Memorial Hospital Research Institute, Lodz, Poland. 11Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada. 12Department of Biochemistry, McGill University, Montreal, Quebec, Canada. 13Department of Pathology & Laboratory Medicine, University of Calgary, Calgary, Alberta, Canada. 14Division of Pediatric Hematology-Oncology, Department of Pediatrics, McGill University and the McGill University Health Centre Research Institute, Montreal, Quebec, Canada. 15Division of Neurosurgery, Arthur & Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, Ontario, Canada. 16Program in Cell Biology, Arthur & Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Department of Pediatrics, University of Toronto, Toronto, Ontario, Canada. 17These authors contributed equally to this work. 18These authors contributed equally to this workThese authors jointly directed this work. Correspondence should be addressed to N.J. ([email protected]), J.M. ([email protected]) or A.H. ([email protected]).

Received 15 June; accepted 13 November; published online 8 December 2013; doi:10.1038/ng.2849

Fusion of TTYH1 with the C19MC microRNA cluster drives expression of a brain-specific DNMT3B isoform in the embryonal brain tumor ETMRClaudia L Kleinman1,2,17, Noha Gerges2,17, Simon Papillon-Cavanagh1, Patrick Sin-Chan3, Albena Pramatarova1, Dong-Anh Khuong Quang2, Véronique Adoue1, Stephan Busche1, Maxime Caron1, Haig Djambazian1, Amandine Bemmo1, Adam M Fontebasso4, Tara Spence3, Jeremy Schwartzentruber1, Steffen Albrecht5, Peter Hauser6, Miklos Garami6, Almos Klekner7, Laszlo Bognar7, Jose-Luis Montes8, Alfredo Staffa1, Alexandre Montpetit1, Pierre Berube1, Magdalena Zakrzewska9, Krzysztof Zakrzewski10, Pawel P Liberski9, Zhifeng Dong11, Peter M Siegel11, Thomas Duchaine12, Christian Perotti13, Adam Fleming14, Damien Faury14, Marc Remke15, Marco Gallo15, Peter Dirks15, Michael D Taylor15, Robert Sladek1,2, Tomi Pastinen1, Jennifer A Chan13, Annie Huang3,16,18, Jacek Majewski1,2,18 & Nada Jabado2,4,18

2  ADVANCE ONLINE PUBLICATION Nature GeNetics

l e t t e r s

using deep sequencing and global DNA methylation arrays of primary tumor samples (Supplementary Table 1), which allowed us to detect the downstream effects of this amplification. Exome sequencing and copy number variant (CNV) analysis of four ETMR samples, includ-ing a recurrence (Supplementary Table 1), confirmed the presence of the amplicon and allowed us to map its boundaries to the TTYH1 locus in all cases (Fig. 1a), suggesting a fusion between this gene and the C19MC cluster. We next sequenced the transcriptome of 12 ETMR samples (including 2 recurrences) and 5 PNET control sam-ples and validated the presence of a fusion between TTYH1 and the miRNA cluster exclusively in all ETMR samples. We determined the genomic locations of the fusion breakpoints at single-base resolu-tion, with consistent results using three different alignment strategies (Supplementary Fig. 1). Each breakpoint was unique to a given sam-ple (Supplementary Fig. 2), falling 5.5–16 kb upstream of the C19MC cluster and occurring either within introns (7/8 cases) or exons (1/8 cases) of TTYH1, downstream of its promoter. Sanger sequencing confirmed the presence of the TTYH1-C19MC fusion in all tested ETMR samples. An analysis of the genomic regions surrounding the fusion breakpoints suggested that a shared mechanism, conferred by sequences with the potential to hybridize and form secondary structures, could perturb replication and lead to genomic instability (Supplementary Fig. 3).

Sequence reads displayed an unusual distribution across the chimeric gene (Fig. 1b–d) that shed light on the mechanism of expression and processing of the fused transcript. We found an anomalous propor-tion of intronic reads in this region (50% in ETMRs compared to 21% in PNETs; t-test P = 0.017) that did not correspond to the proportion observed across the rest of the genome (for example, there was no dif-ference in the frequency of intronic reads for GAPDH or ACTB when comparing the tumor types; P > 0.5), suggesting that the processing of miRNAs from their primary transcripts interferes with proper splic-ing and the usually rapid degradation of introns that follows. Indeed, the substantial amount of intronic reads allowed us to visually con-firm the locations of the breakpoints (Supplementary Fig. 2). Small RNA sequencing showed expression of the C19MC miRNAs at levels 150–1,000 times higher than in control PNET samples (Supplementary Fig. 4 and Supplementary Table 2). Sharp decreases in RNA sequenc-ing (RNA-seq) coverage were observed where the miRNAs are processed out of their primary transcripts, overlapping the exact locations where the mature molecules are detected by small RNA sequencing (Fig. 1e). Notably, expression levels of TTYH1 and C19MC miRNAs were highly correlated across samples (Spearman’s r = 0.93; Supplementary Fig. 5), suggesting that both regions are driven by the same promoter. Next, we confirmed the presence of transcripts extending from the first exon of TTYH1 up to the first miRNAs in C19MC by targeted long-read

c

d

f

e

MIR515-1 MIR519C

C19MC

MIR512-1 MIR512-1

TTYH1

15001500

TTYH1

25003000

C19MC

115 kb

C19MCTTYH1

Chromosome 19

54.2 Mb 54.9 Mb

C19MC

C19MCTTYH1

TTYH1

a

b

Figure 1 Genomic regions involved in the rearrangement leading to ETMR. (a) Amplified region of chromosome 19 detected by CNV analysis of exome sequencing samples. (b) TTYH1 gene. Top rows show RNA-seq reads mapping to TTYH1 for one ETMR sample (pink) and one control PNET sample (green); the black arrow marks the breakpoint location in the ETMR sample. The ETMR sample shows unusual accumulation of intronic reads up to the breakpoint. The bottom row shows the TTYH1 gene model. (c) C19MC locus. Purple triangles mark the locations of miRNA genes. Accumulation of reads corresponding to the fused transcript is shown for two ETMR samples. The top row corresponds to sample ETMR1, which has a 14-kb deletion in the C19MC locus. The bottom row depicts a typical ETMR sample. RNA-seq data for all other samples are provided in supplementary Figure 2. (d) Alignment to an in silico reconstruction of the fusion, with reads spanning the inferred breakpoint. Chimeric reads are colored to illustrate the regions mapping to TTYH1 (orange) and CM19C (red). A deletion of two nucleotides is observed at the breakpoint for this sample, which was confirmed by Sanger sequencing. (e) Mature miRNAs are detected by small RNA sequencing (top rows) where RNA-seq coverage decreases sharply (bottom rows), indicative of proper miRNA processing. Gene models for the mature and precursor miRNAs are shown. (f) Long-range PCR targeted amplification followed by PacBio sequencing was performed on total RNA extracted from the xenograft X-ETMR8. Reads were aligned to a reference genome that combines human genome hg19 and an in silico reconstruction of the TTYH1-C19MC fusion. Top, schematic of the TTYH1 fusion; the first two miRNAs of the C19MC cluster are indicated. Bottom, long reads (>1 kb) mapping to the fusion are shown in blue; each line represents a single read. Solid boxes represent mapping regions; gaps are represented by lines joining the boxes.

Nature GeNetics  ADVANCE ONLINE PUBLICATION 3

l e t t e r s

sequencing using the PacBio platform (Fig. 1f and Supplementary Fig. 6). Taken together, these results argue that the TTYH1 promoter drives expression of the C19MC cluster, explaining the strikingly high expression levels of C19MC miRNAs in ETMRs.

TTYH1, a member of the Tweety family, encodes a chloride chan-nel11 restricted to neural tissue with an ill-defined role in neuron physiology4. We analyzed TTYH1 expression in more than 600 samples covering 62 cell types, compiling expression data from the BrainSpan, Encyclopedia of DNA Elements (ENCODE)12,13 and Roadmap Epigenomics projects14. TTYH1 was expressed in embry-onic stem cells and in most neural structures analyzed, both dur-ing development and in adult brain, whereas limited expression was detected in other cell types and tissues (Supplementary Fig. 7). Thus, a gene fusion between the TTYH1 promoter and the C19MC cluster would account not only for the high expression levels of C19MC miR-NAs in ETMRs but also for the brain specificity of tumors harboring the C19MC amplification1,2.

miRNAs could help modulate the epigenome and regulate global DNA methylation patterns. We therefore analyzed ETMR samples using the Illumina 450K DNA methylation array and compared the

methylation patterns obtained to those of 223 samples compris-ing normal brain and 16 other tumor types, including embryonal tumors (Fig. 2 and Supplementary Table 3). Hierarchical cluster-ing by methylation levels consistently identified ETMR samples as a distinct group (Fig. 2a and Supplementary Fig. 8). Variance-ranked clustering was extremely robust, with as few as 100 individual sites sufficient to properly group ETMR samples (Supplementary Fig. 8c). This specific methylation pattern for ETMRs was further validated by randomly resampling sites (P < 0.01), indicating a distinctive genome-wide alteration in DNA methylation in ETMRs that is not restricted to a few discrete regions (Supplementary Fig. 9). Moreover, this methylation profile was retained in cell lines and mouse xenografts derived from tumors included in this study (Fig. 2a), arguing for a strong association of the miRNA cluster not only with tumor formation, as previously suggested1, but also with global changes in DNA methylation.

Given the unique DNA methylation profile, we focused our exome and transcriptome analyses on genes directly involved in establish-ing and maintaining chromatin structure. Very few of these genes showed deregulated expression in ETMRs (Supplementary Table 4).

a X-ETMR8

ETMRs

PNETs

Pediatric brain control

Adult brain control

Fetal brain control

ETMR8-C

g

Exon 1

Exon 1BM E P S P E PP S L E S M K G D T R H L N G E E

M K G D T R H L N G E E

Exon 2

DNMT3B

exp

ress

ion

leve

ls(e

xon

6 R

PK

M)

b20

15

10

5

0

ET

MR

BR

CA

CE

SC

CO

AD

GB

MH

NS

CK

ICH

LAM

LLG

GLI

HC

LUA

DLU

SC

OV

PA

AD

PR

AD

RE

AD

SK

CM

TH

CA

UC

EC

KIR

PK

IRC

PN

ET

BLC

A

Alte

rnat

ive

prom

oter

usa

ge(e

xon

1B R

PK

M)

3.0

c

ET

MR

BR

CA

CE

SC

CO

AD

GB

MH

NS

CK

ICH

LAM

LLG

GLI

HC

LUA

DLU

SC

OV

PA

AD

PR

AD

RE

AD

SK

CM

TH

CA

UC

EC

KIR

PK

IRC

PN

ET

BLC

A

2.5

2.0

1.5

1.0

0.5

0

DNMT3B

exp

ress

ion

leve

ls(w

hole

-gen

e R

PK

M)

e8

6

4

2

8 13 21

Post-conceptionweeks

Years after birth

Age

1 3 11 19 36

Alte

rnat

ive

prom

oter

usa

ge(e

xon

1B R

PK

M)

f8

6

4

2

0

Post-conceptionweeks

Years after birth

Age

8 13 21 1 3 11 19 36

Alte

rnat

ive

prom

oter

met

hyla

tion

leve

ls (�

valu

e)

1.0

This study

ETMR

d

ET

MR

PN

ET

AA AII

Ang

iosa

rcom

aA

TR

TG

BM

Olig

oden

drog

liom

a III

Fet

al c

ontr

olP

edia

tric

con

trol

Adu

lt co

ntro

l

Ped

iatr

ic c

ontr

olA

dult

cont

rol

BLC

AB

RC

AC

ES

CC

OA

D

GB

MH

NS

CK

ICH

DLB

CE

SC

A

LAM

LLG

GLI

HC

LUA

DLU

SC

SA

RC

ST

AD

PA

AD

PR

AD

RE

AD

SK

CM

TH

CA

UC

EC

KIR

PK

IRC

0.8

0.6

0.4

0.2

0

TCGA tumors BrainSpan

Fetal brain

Figure 2 Analysis of methylation and DNMT3B expression in cancer samples and normal brain. (a) ETMRs constitute a distinct molecular subgroup among other CNS tumors and normal brain. A dendrogram obtained by hierarchical clustering based on the methylation levels of the 10,000 most variant sites is shown. A complete heat map is shown in supplementary Figure 8. ETMR8-C, ETMR-derived cell line; X-ETMR8, mouse xenograft derived from an ETMR tumor. (b) Expression levels of constitutive exon 6 of DNMT3B in 5,445 tumor samples obtained from TCGA; outliers are omitted. RPKM, reads per kilobase per million mapped reads. (c) Expression levels of DNMT3B exon 1B in TCGA data. (d) Methylation levels of CpG sites located in alternative exon 1B of DNMT3B; data are included for fetal brain control, 6–40 weeks old. (e,f) DNMT3B expression in human brain as a function of developmental stage, based on data for 580 human brain samples from the BrainSpan Project, is shown for the whole gene (e) and for the transcript from the alternative promoter (f). (g) Alternative promoters of the DNMT3B gene. The predicted protein sequence is shown; the additional amino acids incorporated when exon 1B is expressed are shown in red. Dark gray dots represent methylation marks analyzed by the 450K methylation array. The schematic is not to scale. In box plots, whiskers represent 1.5 times the interquartile range, the top and bottom edges of the box represent the 75th and 25th percentiles, respectively, and the band inside the box represents the median of the distribution. Outliers were omitted from the plots. A complete description of the samples included here is provided in supplementary table 6.

4  ADVANCE ONLINE PUBLICATION Nature GeNetics

l e t t e r s

Exome sequencing identified few nonrecurrent mutations in individual samples (Supplementary Table 5), which occurred in genes not known to affect DNA methylation (including mutations in VANGL1, involved in neural tube development, and genes involved in the WNT pathway, FZD6 and CTNNB1, each occurring in one sample). Transcriptome analysis identified altered expression of only one gene known to affect DNA methylation—DNMT3B, a de novo DNA methyltransferase gene involved in early neural crest specification and progenitor cell development15,16—which was significantly overexpressed in ETMRs compared to PNETs (P = 5 × 10–5) and other pediatric brain tumors. Notably, in ETMRs, transcription of DNMT3B was found to start at an internal alternative promoter, resulting in the inclusion of exon 1B, with a predicted addition of 11 amino acids to the protein (Fig. 2g and Supplementary Figs. 10 and 11). This exon and the protein-coding sequence contained within it are evolutionary conserved across pla-cental mammals, suggesting that it has a conserved function.

We assessed DNMT3B expression in The Cancer Genome Atlas (TCGA) data set of 5,445 tumor samples and normal tissues of fetal and adult origin (Fig. 2b,c and Supplementary Table 6a). ETMRs had significantly higher DNMT3B mRNA expression levels than any other tumor type, except for a subgroup of acute myeloid leukemia, where high expression of this gene was recently associated with poor prognosis17. Strikingly, we did not detect inclusion of DNMT3B exon 1B in any tumor type except ETMR (Fig. 2c) nor in any single nor-mal tissue from the ENCODE or Roadmap Epigenomics projects (Supplementary Fig. 12). Analysis of the chromatin state at the DNMT3B locus showed that promoter 1A, which is used in the tran-scription of the vast majority of DNMT3B isoforms, was demethylated (Supplementary Fig. 13) and surrounded by open chromatin in almost all cell types studied, both fetal and adult (Supplementary Fig. 14). In contrast, the alternate promoter 1B, which is used in exon 1B tran-scription, was silenced by methylation (Fig. 2d) and located in a region of closed chromatin that was open almost exclusively in fetal brain (Supplementary Fig. 14). We observed DNA demethylation at this promoter locus only in ETMRs and fetal brain (Fig. 2d), with methyl-ation increasing with the gestational age of the samples and paralleling

decreased DNMT3B exon 1B production (Supplementary Fig. 15). These data suggest that stable silencing of the alternate promoter, which normally occurs in fetal and adult tissues, is absent in ETMRs. As the only tissue type where promoter 1B was unmethylated and surrounded by open chromatin was fetal brain, we next profiled DNMT3B expression in 580 samples from the BrainSpan Project covering 16 CNS structures across the full course of human brain development, (Supplementary Table 6b) finding exon 1B expres-sion to be restricted to a narrow developmental window during early neurogenesis, around 8 weeks after conception (Fig. 2e,f).

DNMT3B is essential for key events in embryonic differentiation18, and its depletion leads to early arrest of development19. In embryonic stem cells, it is an integral component of a regulatory feedback loop involving Oct4, Sall4, Nanog and the miR-290-295 cluster, and its transcription is repressed by retinoblastoma-like 2 (Rbl2)4,5,19. The mRNA for this repressor is predicted by five different algorithms (Supplementary Table 7) to be targeted by miRNAs in C19MC that bear an identical sequence seed to members of the miR-290-295 clus-ter. We thus investigated RBL2 and DNMT3B expression levels in ETMR samples and PNET control samples (Fig. 3), both at the mRNA (RNA-seq and quantitative PCR) and protein (inmunohistochemical staining) levels. Consistent with targeting by C19MC miRNAs, RBL2 transcript levels were lower in ETMRs compared to PNETs (by twofold; Fig. 3b and Supplementary Fig. 16), and RBL2 protein levels were undetectable in ETMRs (Fig. 3a) (protein expression was absent in tumor cells in all six ETMRs and present in all four PNETs examined; P = 0.005, Fisher’s exact test). Combined transfection with three mem-bers of the C19MC cluster, selected on the basis of their prediction to target RBL2 by four distinct algorithms (miR-519a) or their previous description as oncogenic in ETMR1 (miR-520g), plus an additional miRNA overexpressed as part of the C19MC cluster (miR-517c), of human neural stem cells (hNSCs) decreased RBL2 expression (Fig. 3d) and was inversely correlated with increased DNMT3B levels (Fig. 3a,b and Supplementary Fig. 16). Stable individual overexpression of each oncogenic miRNA in PFSK-1, a human PNET cell line, showed that the miRNAs predicted to target RBL2 cause a decrease in RBL2 levels

a

e

RBL2 DNMT3B

ET

MR

PN

ET

TTYH1 C19MC C19MC 1 1B DNMT3BRBL2

Retention of embryonicchromatin state

Altered DNA methylationETMR tumor fomation

b

Nor

mal

ized

exp

ress

ion 0.035

0.0300.0250.0200.0150.0100.005

0

RBL2

DNMT3B

0.025

0.020

0.015

0.010

0.005

0

PNETETMRXenograft

c

Nor

mal

ized

exp

ress

ion 5

4

3

2

1

0

RBL2

DNMT1

DNMT3A

DNMT3B

*

*

pcDNAmiR-520g3-miR

d

RBL2

3

2

1

0Nor

mal

ized

exp

ress

ion

DNMT1

DNMT3A

DNMT3B

*

*

*

**

**

pcDHmiR-520gmiR-519amiR-517cmiR-517amiR-512-3p

Figure 3 Expression analysis of RBL2 and DNMT3B in patient samples, an ETMR-derived xenograft and neural stem cells transfected with C19MC miRNAs. (a) Immunohistochemistry performed on ETMR and PNET tumor samples, confirming RBL2 downregulation and DNMT3B induction in ETMRs. Insets show magnified views of the boxed regions. Scale bars, 100 µm (insets, 20 µm). (b) Quantitative PCR performed on an ETMR tumor sample, an ETMR-derived xenograft and a PNET tumor sample; expression is normalized to the combined expression levels for GAPDH and ACTB. Error bars, s.d. from three replicates, calculated from the sum of the weighted variances of the components (Cq value and PCR efficiency). (c,d) Changes in expression of RBL2, DNMT3B, DNMT3A and DNMT1 after transfection with empty vector control (pcDNA or pcDH) or a combination of members of the C19MC cluster in hNSCs (c) or with each of these constructs individually or two additional oncogenic miRNAs in the PNET cell line PSFK-1 (d). Both cell lines lack endogenous expression of C19MC1. Expression levels are normalized to those for ACTB. Error bars, s.d. from three replicates. Significance was determined by unpaired two-tailed Student’s t test. *P < 0.01 (c); *P < 0.05 (d). (e) Proposed model of ETMR tumorigenesis.

Nature GeNetics  ADVANCE ONLINE PUBLICATION 5

l e t t e r s

and concurrent substantial increase in DNMT3B levels (Fig. 3c). These findings were further validated by stable overexpression in PFSK-1 cells of another miRNA predicted to target RBL2 (miR-512-3p) and one not predicted to target it (miR-517-a) (Fig. 3d). Overexpression of miR520g, the oncogenic miRNA previously investigated in ETMR1, did not lead to changes in RBL2 or DNMT3B levels in cells, suggesting that it is not likely to be involved in ETMR tumorigenesis, as further inferred by its deletion in an ETMR sample in our data set (Fig. 1d and Supplementary Fig. 4). Notably, levels of DNMT1, a maintenance DNA methyltransferase, and of DNMT3A, a second de novo meth-yltransferase also shown to be highly expressed during brain devel-opment, remained unchanged following overexpression of the three combined miRNAs or of the five individual ones tested. Exon 1B was not present in overexpressed DNMT3B, a fact we attribute to the cell of origin, that is, postnatal brain-derived neural stem cells and the PNET cell line. RBL2 targeting by specific miRNAs in C19MC is thus one plausible mechanism leading to alternate DNMT3B promoter use.

miRNAs regulate gene expression and diverse biological processes and are increasingly associated with human disease, including cancer. C19MC1,2,10, the largest miRNA cluster found in the human genome3, is specific to primates, imprinted and expressed mainly during the first trimester of gestation10,20. Abnormal expression of some mem-bers of this cluster has been associated with parathyroid, breast and liver cancers and with thyroid adenomas21–23. DNMT3B over-expression, on the other hand, has been associated with a number of cancers24–27 and suppresses the WNT/β-catenin pathway, previously shown to be abnormally regulated in ETMR, whereas hypomorphic germline DNMT3B mutations are responsible for ICF syndrome, a genetic disorder associated with defects in neurogenesis15,28. The expression of TTYH1, C19MC and exon 1B–containing DNMT3B peaks 8–10 weeks after conception, which may suggest an early event leading to ETMR development. We propose a model of tumorigenesis (Fig. 3e) in which a fusion between TTYH1 and C19MC leads to extreme overexpression of the miRNA cluster; the effect, potentially mediated by RBL2, would be an abnormal maintenance of chromatin state at promoter 1B of DNMT3B, leading to the incorporation of exon 1B, an alternate exon 1 specific to early brain embryogenesis. ETMR tumorigenesis in this model would thus be promoted by the ‘reawak-ening’ or aberrant maintenance of an early neurogenesis developmen-tal pathway throughout brain development and after birth, as TTYH1 continues to be expressed at later stages in the brain. If this model is confirmed, DNMT3B would represent a clear candidate for future therapies targeting these deadly tumors, which have been described by pathologists as resembling undifferentiated neural tubes.

URLs. BrainSpan, http://www.brainspan.org/; Picard, http://picard.sourceforge.net/; FastQC, http://www.bioinformatics.babraham.ac.uk/projects/fastqc/; miRBase, http://www.mirbase.org/.

METhODsMethods and any associated references are available in the online version of the paper.

Accession codes. Exome sequencing and small RNA sequencing data have been deposited in the NCBI Sequence Read Archive (SRA) under accession SRP032767. RNA-seq data have been deposited in SRA under accession SRP032476. Methylation data have been deposited in the NCBI Gene Expression Omnibus (GEO) under accession GSE52556.

Note: Any Supplementary Information and Source Data files are available in the online version of the paper.

ACKNowLeDGMeNTSThis work was supported by the Cancer Research Society and was funded in part by Hungarian Scientific Research Fund (OTKA) contract T-04639, by National Research and Development Fund (NKFP) contracts 1A/002/2004 (P.H., M. Garami and L.B.) and TÁMOP-4.2.2A-11/1/KONV-2012-0025 (A.K. and L.B.) and by Canadian Institutes of Health Research grant 102684 (A.H.). N.J. is a member of the Penny Cole laboratory and is the recipient of a Chercheur Clinicien Senior Award. S.B. is supported by an RMGA (Reseau Médical de Génétique Apliquée; Network of Applied Genetic Medicine) fellowship. T.P. and J.M. hold Canada Research Chairs (tier 2). C.L.K. is the recipient of a fellowship award from the Fonds de Recherche du Québec-Santé, N.G. is the recipient of a studentship award from Cedars, and A.M.F. is the recipient of a studentship from the Canadian Institutes of Health Research and the McGill–Canadian Institutes of Health Research Training Program in Systems Biology. R.S. is supported by the Canadian Institutes of Health Research.

AUTHoR CoNTRIBUTIoNSN.G., D.-A.K.Q., A.P., P.S.-C., V.A., S.B., A.M.F., Z.D., C.P., T.S., D.F. and P.M.S. performed experimental work. C.L.K., M.C., H.D., S.B., S.P.-C., A.B., A.M. and J.S. performed bioinformatic analyses. C.L.K., N.G., Z.D., A.H., P.S.-C., T.D., S.P.-C., T.P., S.B., V.A., A.S., P.B., R.S., J.M. and N.J. performed data analyses and generated the text and figures. C.L.K., S.P.-C., P.H., A.K., M. Garami, M.Z., M.R., M. Gallo, P.D., M.D.T., P.P.L., L.B., J.-L.M., J.A.C., K.Z., S.A., A.F. and N.J. collected data and provided patient materials. J.M. and N.J. provided leadership for the project. All authors contributed to the final manuscript.

CoMPeTING FINANCIAL INTeReSTSThe authors declare no competing financial interests.

Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Li, M. et al. Frequent amplification of a chr19q13.41 microRNA polycistron in aggressive primitive neuroectodermal brain tumors. Cancer Cell 16, 533–546 (2009).

2. Pfister, S. et al. Novel genomic amplification targeting the microRNA cluster at 19q13.42 in a pediatric embryonal tumor with abundant neuropil and true rosettes. Acta Neuropathol. 117, 457–464 (2009).

3. Yanagisawa, Y., Ito, E., Yuasa, Y. & Maruyama, K. The human DNA methyltransferases DNMT3A and DNMT3B have two types of promoters with different CpG contents. Biochim. Biophys. Acta 1577, 457–465 (2002).

4. Benetti, R. et al. A mammalian microRNA cluster controls DNA methylation and telomere recombination via Rbl2-dependent regulation of DNA methyltransferases. Nat. Struct. Mol. Biol. 15, 268–279 (2008).

5. Sinkkonen, L. et al. MicroRNAs control de novo DNA methylation through regulation of transcriptional repressors in mouse embryonic stem cells. Nat. Struct. Mol. Biol. 15, 259–267 (2008).

6. Korshunov, A. et al. Focal genomic amplification at 19q13.42 comprises a powerful diagnostic marker for embryonal tumors with ependymoblastic rosettes. Acta Neuropathol. 120, 253–260 (2010).

7. Gessi, M. et al. Embryonal tumors with abundant neuropil and true rosettes: a distinctive CNS primitive neuroectodermal tumor. Am. J. Surg. Pathol. 33, 211–217 (2009).

8. Korshunov, A. et al. LIN28A immunoreactivity is a potent diagnostic marker of embryonal tumor with multilayered rosettes (ETMR). Acta Neuropathol. 124, 875–881 (2012).

9. Picard, D. et al. Markers of survival and metastatic potential in childhood CNS primitive neuro-ectodermal brain tumours: an integrative genomic analysis. Lancet Oncol. 13, 838–848 (2012).

10. Bentwich, I. et al. Identification of hundreds of conserved and nonconserved human microRNAs. Nat. Genet. 37, 766–770 (2005).

11. Suzuki, M. & Mizuno, A. A novel human Cl− channel family related to Drosophila flightless locus. J. Biol. Chem. 279, 22461–22468 (2004).

12. Rosenbloom, K.R. et al. ENCODE data in the UCSC Genome Browser: year 5 update. Nucleic Acids Res. 41, D56–D63 (2013).

13. Meacham, F. et al. Identification and correction of systematic error in high-throughput sequence data. BMC Bioinformatics 12, 451 (2011).

14. Bernstein, B.E. et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048 (2010).

15. Jin, B. et al. DNA methyltransferase 3B (DNMT3B) mutations in ICF syndrome lead to altered epigenetic modifications and aberrant expression of genes regulating development, neurogenesis and immune function. Hum. Mol. Genet. 17, 690–709 (2008).

16. Watanabe, D., Uchiyama, K. & Hanaoka, K. Transition of mouse de novo methyltransferases expression from Dnmt3b to Dnmt3a during neural progenitor cell development. Neuroscience 142, 727–737 (2006).

17. Hayette, S. et al. High DNA methyltransferase DNMT3B levels: a poor prognostic marker in acute myeloid leukemia. PLoS ONE 7, e51527 (2012).

18. Okano, M., Bell, D.W., Haber, D.A. & Li, E. DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 99, 247–257 (1999).

�  ADVANCE ONLINE PUBLICATION Nature GeNetics

l e t t e r s

19. Tan, M.H. et al. An Oct4-Sall4-Nanog network controls developmental progression in the pre-implantation mouse embryo. Mol. Syst. Biol. 9, 632 (2013).

20. Morales-Prieto, D.M., Ospina-Prieto, S., Chaiwangyen, W., Schoenleben, M. & Markert, U.R. Pregnancy-associated miRNA-clusters. J. Reprod. Immunol. 97, 51–61 (2013).

21. Vaira, V. et al. The microRNA cluster C19MC is deregulated in parathyroid tumours. J. Mol. Endocrinol. 49, 115–124 (2012).

22. Flor, I. & Bullerdiek, J. The dark side of a success story: microRNAs of the C19MC cluster in human tumours. J. Pathol. 227, 270–274 (2012).

23. Fornari, F. et al. In hepatocellular carcinoma miR-519d is up-regulated by p53 and DNA hypomethylation and targets CDKN1A/p21, PTEN, AKT3 and TIMP2. J. Pathol. 227, 275–285 (2012).

24. Rhee, I. et al. DNMT1 and DNMT3b cooperate to silence genes in human cancer cells. Nature 416, 552–556 (2002).

25. Oka, M. et al. De novo DNA methyltransferases Dnmt3a and Dnmt3b primarily mediate the cytotoxic effect of 5-aza-2′-deoxycytidine. Oncogene 24, 3091–3099 (2005).

26. Ostler, K.R. et al. Cancer cells express aberrant DNMT3B transcripts encoding truncated proteins. Oncogene 26, 5553–5563 (2007).

27. You, J.S. & Jones, P.A. Cancer genetics and epigenetics: two sides of the same coin? Cancer Cell 22, 9–20 (2012).

28. Martins-Taylor, K., Schroeder, D.I., LaSalle, J.M., Lalande, M. & Xu, R.H. Role of DNMT3B in the regulation of early neural and neural crest specifiers. Epigenetics 7, 71–82 (2012).

Nature GeNeticsdoi:10.1038/ng.2849

ONLINE METhODsCharacteristics of the samples and pathological review. All samples were obtained with informed consent. After approval was obtained from the insti-tutional review boards of the respective hospitals, samples were treated and independently reviewed by senior pediatric neuropathologists (S.A. and M.Z.) according to World Health Organization guidelines. The clinical characteris-tics of the patients are summarized in Supplementary Table 1. Samples were taken at the time of the first surgery before further treatment, except for two samples corresponding to relapses.

Exome sequencing and CNV analysis. Standard instructions from the manufacturer were used for target capture with the Illumina TruSeq exome enrichment kit and 100-bp paired-end sequencing on the Illumina HiSeq 2000 platform, with bioinformatic processing and variant annotation as previously described29. To look for CNVs, we used a CNV detection algorithm that pri-oritizes rare variants by using a large set of exome samples as controls and comparing the exonic read depth of the test sample against the distribution of read depths for the control set30. We first obtained coverage information for 200 controls and the test sample in RPKM and eliminated batch-to-batch variation in the control set by applying principal-component analysis (PCA) and removing the top contributing principal components (~2–10) from the data. We then segmented the test sample into regions of similar RPKM ratios using circular binary segmentation (CBS)31. Finally, we calculated the sta-tistical deviation of the test sample from the distribution of the controls and generated a P value for each segment, controlling the false discovery rate with the Benjamini-Hochberg procedure. Although the C19MC cluster was not included in the exome capture kit, all neighboring genes were, and, for each ETMR case, the breakpoint was inferred to fall within the TTYH1 gene.

RNA sequencing. Brain samples were homogenized in nitrogen using a mortar and pestle. Brain powder was immersed in TRIzol (Invitrogen) and subjected to a second round of homogenization using a POLYTRON instrument for 30 s. Total RNA was extracted using the miRNeasy kit (including a step of DNase I treatment), and RNA quality was assessed by Agilent 2100 Bioanalyzer (Agilent Technologies). Libraries for RNA-seq and small RNA sequencing were prepared according to Illumina TruSeq protocols (with the Gold option and strand-specific preparation for RNA-seq samples). Size selection (130–170 bp) of libraries for small RNA sequencing was performed using 3% gel cassettes (Sage Science). The quality of the libraries was assessed using an Agilent 2100 Bioanalyzer. Samples were indexed (three or four per lane for RNA-seq and ten per lane for small RNA sequencing) and sequenced on an Illumina HiSeq 2000 (100-bp paired-end and 50-bp single-end reads for RNA-seq and small RNA sequencing, respectively).

Sequencing runs were processed with Illumina CASAVA 1.8 software. Reads were trimmed using Trimmomatic v.0.22 (ref. 32), removing low-quality bases at the ends of reads (phred33 < 30) and clipping Illumina adaptor sequences using the palindrome mode in Trimmomatic. Reads shorter than 32 bp after trimming (10 bp for small RNA sequencing) were discarded. Resulting strand-specific, high-quality RNA-seq reads were aligned to human reference genome build hg19 using TopHat v1.4.1 (ref. 33) coupled with Bowtie v0.12.8 (ref. 34), with UCSC gene model annotations to guide spliced alignment. Picard tools were used to mark duplicated reads. Reads mapping to multiple (more than ten) locations were discarded for downstream analysis. In the case of small RNA sequencing data, alignment to the human reference genome hg19 was performed with Bowtie v2.0.2 (ref. 34), reporting up to ten alternative locations for each read, using a seed length of 15 nucleotides and allowing 1 mismatch within the seed.

Integrative Genomics Viewer35 was used for visualization. Multiple quality control metrics were obtained using FastQC, SAMtools36, BEDtools37 and custom scripts (Supplementary Table 1b). Bigwig tracks for visualization were generated with custom scripts, using BEDtools and UCSC tools.

Gene fusion detection from RNA-seq data. To determine at single-base reso-lution the breakpoints of the genomic rearrangement leading to the fusion of C19MC and TTYH1, three alignment strategies were used (Supplementary Fig. 1). First, focusing on the amplified regions detected by exome sequencing data, approximate breakpoints were obtained from the paired-end alignment

to the reference genome described above by identifying mate pairs where each mate mapped to a different gene (Supplementary Fig. 1a). In ETMRs, these pairs had a an unusually long insert size and inverse orientation (negative insert size found in the 99.5th percentile of the inferred insert size distribu-tion, with mates facing outward). Second, a different alignment algorithm, STAR38, was used to map the sequencing reads to the reference genome. In the seed-finding phase of STAR, a sequential search for a maximal mappable prefix is performed, in which the longest substring of the read that exactly matches a genomic location is found. This approach represents a natural way of finding the precise locations of splice junctions or fusion breakpoints in a read sequence and is more accurate than split-read methods that use arbitrary splitting (Supplementary Fig. 1b). Thus, a large number of reads spanning the breakpoint were correctly mapped and reported as soft clipped. Finally, we performed an alignment using an in silico reconstruction of the chimeric gene as the reference (Supplementary Fig. 1c).

Validation of gene fusion by PacBio sequencing. We reverse transcribed 100 ng of DNase I–treated RNA from the xenograft X-ETMR8 using ThermoScript reverse transcriptase (Invitrogen) and either gene-specific primers or a random hexamer primer. TTYH1-C19MC fusion transcripts were then amplified using a combination of forward primers annealing in exons 1, 3 or 5 of the TTYH1 gene and reverse primers annealing downstream of the fusion breakpoint, either close to the breakpoint or close to the first miRNA gene in the C19MC cluster (Supplementary Fig. 15). PCR runs comprised 40 cycles using either Takara PrimeSTAR GXL DNA polymerase (Clontech) or Q5 High-Fidelity DNA polymerase (New England BioLabs).

Each PCR product was prepared for sequencing using 200 ng of input mate-rial. First, a polyA tail was added to the 3′ end using a terminal deoxynucle-otidyl transferase (TdT). A short polyT sequencing primer was then annealed to this tail, where the P4 PacBio (Pacific Biosciences) sequencing polymerase was subsequently bound, forming the DNA polymerase complex. For each sample, 30 pM of the complex was loaded onto SMRTcells using the PacBio MagBead kit. Libraries were sequenced with 120-min movies for each sam-ple on a PacBio RSII instrument with stage start enabled. Raw sequencing reads were trimmed for quality using SMRT Pipe software from PacBio, with a quality cutoff of 0.75 and removing reads shorter than 50 bp. High-quality reads were then aligned to a reference genome consisting of a combination of human genome hg19 and an in silico reconstruction of the TTYH1-C19MC chimeric sequence. Alignments were performed with BLAT39, requiring at least 85% sequence identity and considering only the best scoring hit for each read, with the score defined as the number of matches minus the number of mismatches.

Validation of gene fusion by Sanger sequencing. Total RNA (500 ng to 1 µg) was used to synthesize first-strand cDNA with 500 ng of random hexamers (Invitrogen) and SuperScript II reverse transcriptase (Invitrogen) according to the manufacturer’s instructions. Genomic DNA was isolated from tissue using the DNeasy Blood and Tissue kit (Qiagen) according to the manufac-turer’s protocols.

Primer pairs for PCR amplification and sequencing of each breakpoint were generated using Primer3 (ref. 40). Primers were designed to obtain a product of approximately 300 bp, with the fusion breakpoint near the center of the fragment. To minimize the amplification of nonspecific genomic sequences, primer pairs were filtered using UCSC In Silico PCR. PCR products were puri-fied before sequencing by filtration using Millipore MultiScreen filter plates (EMD Millipore Corporation). Products were sequenced by conventional Sanger methods with the BigDye Terminator v3.1 Cycle Sequencing kit (Life Technologies) and purified by ethanol precipitation before obtaining DNA chromatograms on a 3730xl DNA Analyzer (Applied Biosystems).

Analysis of gene expression from RNA-seq data. To estimate gene expres-sion levels, we used all exonic reads mapping uniquely within the maximal genomic locus containing each gene and its known isoforms, normalizing by library size using DESeq41. For clustering and correlation analysis, variance-stabilized expression values were derived using DESeq. Hierarchical clustering was performed using Pearson’s correlation as the distance metric and average linkage as the agglomeration method. Differential expression analysis was

Nature GeNetics doi:10.1038/ng.2849

performed using DESeq, where statistical significance was calculated using the negative binomial distribution as a null distribution for gene expression values, with the variance and mean estimated from the data and linked by local regression. The expression levels obtained for all coding genes can be found in Supplementary Table 8. Genes were subsequently annotated with the DAVID functional annotation tool42, v6.7.

Analysis of gene expression by quantitative PCR. RNA was extracted from tissue (around 20 mg) or cell lines (1–2 × 106 cells total) using the RNeasy Lipid Tissue kit (Qiagen). RNA integrity was assessed with the Experion sys-tem (Bio-Rad). We then reverse transcribed 100 ng of total RNA using iScript Reverse Transcription Supermix for RT-qPCR (Bio-Rad), following the manu-facturer’s instructions. SsoFast EvaGreen Supermix (Bio-Rad) was used for quantitative PCR, starting with 1 µl of cDNA. Each sample was run in triplicate on a LightCycler 96 instrument (Roche). Cycling conditions comprised 95 °C for 30 s, 95 °C for 5 s and 60 °C for 20 s (n = 40 cycles). Product specificity was examined by melting curve analysis. Quantitative PCR analysis was performed using LC96 Application Software 2.0. Results were normalized to expression levels for reference genes (ACTB and GAPDH).

DNA methylation analysis. We subjected 500 ng of genomic DNA to bisulfite conversion using the EZ DNA Methylation kit (Zymo Research), according to the manufacturer’s protocols. Modified genomic DNA was processed as described in the Infinium Assay Methylation Protocol Guide Rev. C (November 2010) and analyzed on Infinium HumanMethylation450 BeadChips (Illumina), measuring DNA methylation at single-CpG resolution on the basis of geno-typing of C>U polymorphisms. On such a chip, fluorescence intensities are detected in two separate color channels for the methylated (IM) and unmeth-ylated (IU) alleles. For genomic annotation of probes, RefSeq gene and CGI annotations were downloaded from UCSC hg19 (NCBI Reference Sequence Database Release 37), and miRNA annotations were obtained from miRBase version 17. We removed probes that had missing values, bound multiple genomic regions, bound targets with known sequence variants (1000 Genomes Project, SNPs with a minor allele frequency of ≥2/120) or were located on the X or Y chromosome. The final set contained 309,015 probes.

Intensities from both channels (IM and IU) were combined into a single β value per locus ranging from 0 (no methylation) to 1 (complete methylation on both alleles), where

b =+I

I IM

M U

Finally, we used a subset quantile within-array normalization (SWAN) algo-rithm43 to match the intensity peaks of type I and type II probes using the R package minfi43. We detected artifactual samples with the help of quality control graphs and individual density distribution. In total, 16 of 215 samples were removed from further analysis.

Hierarchical clustering was performed using the 10,000 most variant sites from all samples. Distance was computed using 1 − r, where r was the Pearson’s product-moment correlation coefficient between features, and clustering was performed using average linkage (UPGMA). To assess cluster robustness, we used a multiscale bootstrap resampling procedure (1,000 iterations) available through the pvclust R package.

Immunohistochemistry. Immunohistochemical staining for DNMT3B and RBL2 was performed on paraffin-embedded sections that were subjected to antigen retrieval in 10 mM citrate buffer (pH 6.0) for 15 min at temperatures below the boiling point. Slides were incubated overnight at 4 °C with a mouse monoclonal antibody to Dnmt3b (Abcam, ab13604) used at a 1:200 dilution or a polyclonal rabbit antibody to RBL2 (Abcam, ab76234) used at a 1:100 dilution. After incubation with primary antibody, secondary biotin-conjugated antibodies were applied to tissue sections for 30 min. After washing with PBS, slides were developed using diaminobenzidine (Dako) as the chromogen. All slides were counterstained with Harris hematoxylin. Slides were independ-ently scored for RBL2 and DNMT3B positivity by three individuals, and the results were merged after consensus scoring was obtained. Briefly, samples were considered to be negative for RBL2 or DNMT3B staining if tumor cells

showed no nuclear positivity, and the core included a control brain or reactive glia with positive nuclear staining1.

ETMR cell lines and xenografts. Tumor tissue was collected at the time of autopsy from a 2-year-old male with recurrent ETMR and immediately placed in chilled transport medium consisting of 1× PBS with glucose (4.5 g/l), penicillin (50 U/ml) and streptomycin (50 U/ml). After rinsing in transport medium, tissue was transferred to NeuroCult human NS-A prolifera-tion medium (Stem Cell Technologies) supplemented with heparin (0.0002%; Stem Cell Technologies), epidermal growth factor (EGF, 20 ng/ml; Peprotech), fibroblast growth factor (bFGF, 20 ng/ml; R&D) and sonic hedgehog (recom-binant human SHH N terminus, 0.1 µg/ml; R&D), and tissue was mechanically disaggregated into single cells and small clusters of cells by gentle trituration. The cell suspension was passed through a 70-µm filter, and cells were plated in stem cell medium at a density of 20,000 cells/ml in low-adhesion flasks for suspension cells (Sarstedt) and maintained at 37 °C in a 5% CO2 incubator. Tumor spheres formed within 7–10 d. Cells were passaged immediately after spheres reached 200–400 µm in diameter in xenografts. Spheres were disag-gregated to single cells with Accumax (Innovative Cell Technologies) and resuspended in sterile PBS. We injected 100,000 cells in a volume of 2–3 µl into the right striatum of 6- to 8-week-old male NOD-SCID mice (Jackson Laboratory) using the following stereotactic coordinates: −1.0 anterioposte-rior, 2.0 medial-lateral and 3.0 dorsoventral. Mice were euthanized, and brains were collected for histology and extraction of DNA and RNA 8 weeks after injection. The McGill animal care committee approved the protocols.

Overexpression of C19MC members in human neural stem cells and the PFSK-1 cell line. The 3-miR expression plasmid was generated by design-ing a cluster of three miRNA precursor stem loop structures in a pCDH-CMV-MCS-EF1-copGFP vector under a constitutive CMV promoter and was subcloned into the pcDNA3.1 vector (Invitrogen) for expression in hNSCs. Generation of hNSCs with stable expression of 3-miR and propagation of the stable cell line were performed as previously described1. Expression of miR-NAs was validated by quantitative RT-PCR using TaqMan probes (Applied Biosystems). Constructs for individual miRNAs were generated similarly and were stably overexpressed individually in PFSK-1 cells. hNSCs were obtained by P. Dirks, and the PFSK-1cell line (a PNET from a supratentorial tumor from a 22-month-old) was purchased from the American Type Culture Collection (ATCC). Both have been reported previously1. All cell lines used in this study were tested for mycoplasma and were found to be free of this pathogen.

For quantitative RT-PCR, cDNA was reverse transcribed from 1 µg of total RNA using the High-Capacity cDNA Reverse Transcription kit (Applied Biosystems). Expression of RBL2 transcripts was detected using Platinum SYBR Green (Invitrogen) according to the manufacturer’s instructions. mRNA expression levels were normalized to expression levels for ACTB using the ∆∆Ct method.

29. Schwartzentruber, J. et al. Driver mutations in histone H3.3 and chromatin remodelling genes in paediatric glioblastoma. Nature 482, 226–231 (2012).

30. Shi, Y. & Majewski, J. FishingCNV: a graphical software package for detecting rare copy number variations in exome-sequencing data. Bioinformatics 29, 1461–1462 (2013).

31. Olshen, A.B., Venkatraman, E.S., Lucito, R. & Wigler, M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557–572 (2004).

32. Lohse, M. et al. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40, W622–W627 (2012).

33. Trapnell, C., Pachter, L. & Salzberg, S.L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).

34. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).

35. Thorvaldsdóttir, H., Robinson, J.T. & Mesirov, J.P. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 14, 178–192 (2013).

36. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

37. Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

38. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

Nature GeNeticsdoi:10.1038/ng.2849

39. Kent, W.J. BLAT—the BLAST-Like Alignment Tool. Genome Res. 12, 656–664 (2002).

40. Untergasser, A. et al. Primer3Plus, an enhanced web interface to Primer3. Nucleic Acids Res. 35, W71–W74 (2007).

41. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

42. Huang, W., Sherman, B.T. & Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).

43. Maksimovic, J., Gordon, L. & Oshlack, A. SWAN: subset-quantile within array normalization for Illumina Infinium HumanMethylation450 BeadChips. Genome Biol. 13, R44 (2012).