supplementary materials for...other supplementary material for this manuscript include s the...

50
www.sciencemag.org/content/353/6307/aaf1644/suppl/DC1 Supplementary Materials for The linker histone H1.0 generates epigenetic and functional intratumor heterogeneity Cristina Morales Torres, Alva Biran, Matthew J. Burney, Harshil Patel, Tristan Henser-Brownhill, Ayelet-Hashahar Shapira Cohen, Yilong Li, Rotem Ben-Hamo, Emma Nye, Bradley Spencer-Dene, Probir Chakravarty, Sol Efroni, Nik Matthews, Tom Misteli, Eran Meshorer, Paola Scaffidi* *Corresponding author. Email: [email protected] Published 30 September 2016, Science 353, aaf1644 (2016) DOI: 10.1126/science.aaf1644 This PDF file includes: Materials and Methods Supplementary Text Figs. S1 to S17 Tables S1, S2, and S7 References Other Supplementary Material for this manuscript includes the following: (available at www.sciencemag.org/cgi/content/full/353/6307/aaf1644/DC1) Tables S3 to S6 and S8 as Excel files

Upload: others

Post on 11-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

www.sciencemag.org/content/353/6307/aaf1644/suppl/DC1

Supplementary Materials for

The linker histone H1.0 generates epigenetic and functional intratumor heterogeneity

Cristina Morales Torres, Alva Biran, Matthew J. Burney, Harshil Patel, Tristan Henser-Brownhill, Ayelet-Hashahar Shapira Cohen, Yilong Li,

Rotem Ben-Hamo, Emma Nye, Bradley Spencer-Dene, Probir Chakravarty, Sol Efroni, Nik Matthews, Tom Misteli, Eran Meshorer, Paola Scaffidi*

*Corresponding author. Email: [email protected]

Published 30 September 2016, Science 353, aaf1644 (2016) DOI: 10.1126/science.aaf1644

This PDF file includes:

Materials and Methods Supplementary Text Figs. S1 to S17 Tables S1, S2, and S7 References

Other Supplementary Material for this manuscript includes the following: (available at www.sciencemag.org/cgi/content/full/353/6307/aaf1644/DC1) Tables S3 to S6 and S8 as Excel files

Page 2: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

2

Materials and Methods

Cell lines and constructs Growth conditions of all cell lines used in the study are listed in Table S7. Inducible

cell lines were generated by introducing lentiviral constructs expressing specific cDNAs, shRNAs or control plasmids. For knockdown, inducible pTRIPZ constructs V2THS_38052 and V2THS_38055 (Open Biosystems) were used for H1.0 knockdown and V2THS_37942 was used for H1.4 knockdown. For cDNA expression constructs, pTRIPZ was first modified to replace puromycin resistance with blasticidin resistance using Geneart (Life Technologies), and the region containing TurboRFP and the shRNAmir cassette with an SV40 poly-A signal (AgeI-MluI). H1.0, H1.4 and H1.5 coding regions amplified from genomic DNA were then cloned into AgeI-BstbI sites. Empty pTRIPZ, which expresses the mir30 cassette, rtTA3 and TurboRFP, was used as negative control. Virus production was performed by transfecting 293-T cells with the pTRIPZ constructs, psPAX2 and pMD2.G plasmids (Addgene) using Fugene HD (Promega). Cells infected with cDNA-expressing constructs were selected with 5µg/ml blasticidin for 15d (MDA-MB-231) or 2µg/ml blasticidin for 10d (all other cell lines). Cells infected with shRNAs-expressing constructs or empty control were isolated by cell sorting based on TurboRFP expression after transient induction with 1µg/ml Doxycycline (dox) for 16h (in vitro-transformed fibroblasts) or selected with 1µg/ml puromycin for 7 days (all other cell lines). All cancer cell lines have been sourced from the Crick Institute common repository, authenticated by STR profiling and tested for mycoplasma. Cell proliferation rate was measured using the 96 Aqueous One Solution kit (Promega). Patient samples

Normal brain and GBM (grade 4 astrocytoma) tissue sections were obtained from the Whittington tissue bank (UCL license number: 12055, 3 normal and 4 tumors) and from US Biomax (tissue microarray GL806b). All other tissue sections were obtained from US Biomax (tissue microarrays: BR1053b, BCN962). Clinical information for each sample contained in the tissue microarrays is available at www.biomax.us. For immunofluorescence microscopy analysis, a 1:1 mixture of anti-H3K9Ac and anti-H3K27me3 (modH3), which generates a homogenous staining across sections, was used as a control to assess inter-sample variability in antigen retrieval efficiency. Samples showing a negative staining for modH3 were excluded from the analysis. The following cores, showing reliable staining, were used for the analysis: GL806b: A7, A8, C7, C8, G9; BR1053b: A1, A4, A7, A8, A9, B1, B4, B5, B12, B13, C1, C5, C8, D1, D5, E1, E4, F8, G2, H9, J12, J1, J14; BCN962: B6, A4, A5, B11, A10, A11, F1, E1, E3, F6, E4, E6. Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872 and the deposited data matrix was used for the analysis (2). DNA methylation and survival analysis were performed using TCGA datasets. Protein immunodetection

Western blot analysis, flow cytometric analysis, cell sorting and immunofluorescence microscopy of cultured or sorted tumor cells were performed as previously described (15) using anti-H1.0 (Millipore, raised in mouse, clone 3H9, 1:300, originally produced by M. Bustin’s laboratory) (42), anti-H1.4 (Millipore clone AE-4,

Page 3: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

3

1:1000), anti-histone cluster 1 (ABIN310776 antibodies online 1:1000) anti-Actin [ACTN05 (C4)] (Abcam #3280, 1:15000), anti-SSEA1 (eBioscience, clone MC-480, 1:100-1:500), anti-Ki67 (BD bioscience 1:500) anti-ITGA6 (Atlas, R06863, 1:50) anti-SSEA1-Alexa647 (eBioscience 1:10), anti-CD166-PE (Miltenyi Biotec, 1:10), anti-mouse H-2kd MHC class I-FITC (eBioscience, 1:300), and anti-Phospho-Histone H3 (Ser28) (Cell Signaling #9713, 1:500). Recombinant proteins were purchased from Peprotech. For immunostaining of tissue sections, formalin-fixed paraffin-embedded sections were de-waxed in xylene then rehydrated by passage through graded alcohol to water. For antigen retrieval, sections were microwaved in Tris-EDTA pH9 (10mM Tris Base, 1mM EDTA Solution, 0.05% Tween 20) for 15 minutes and then transferred to PBS. Normal donkey serum diluted to 10% in 1% BSA was used to block non-specific staining in the tissue for 30 minutes. Slides were then incubated with primary antibodies diluted in 1% BSA in PBS for 1 hour at room temperature, washed in PBS and then incubated with secondary antibodies raised in donkey for 45 min at room temperature. Sections were washed in PBS, incubated in 1% Sudan Black for 25 minutes, and mounted in Vectamount with DAPI (Vector Labs). Slides were imaged on a Zeiss 510 confocal microscope using 40x or 63x objectives, or on a Yokogawa CV7000 microscope using a 40x objective for high throughput imaging, on a Zeiss Axiovert or on an ArrayScan VTi automated microscope. Tissue sections were imaged using a Zeiss Axio Scope.Z1 scanner and combined images of the whole section were generated using Zeiss Zen 2 (Blue Edition) V2.1 software. Quantification of the fluorescent signal was performed using Metamorph software or with HCS Studio Cell Analysis Software. For imaging flow cytometry, dissociated tumor cells were first stained with anti-SSEA1-Alexa647, anti-CD166-PE and anti-mouse H-2kd MHC class I-FITC, then incubated for 15 min with Live/Dead NIR fixable stain (Molecular Probes), fixed with 4% PFA for 15 min at RT with agitation, permeabilized with PBS containing 2% FCS and 0.1% Triton X-100 for 5 min and finally stained with rabbit anti-H1.0, followed by an anti-Rabbit Alexa 405-secondary antibody, and 7’AAD to stain DNA. Cells were imaged using a two-camera FlowSight imaging flow cytometer (Amnis Corp.) equipped with ten fluorescence channels and two bright-field channels. Quantitative analysis was performed using the IDEAS software. Tumorigenicity assays

Tumor formation in NSG mice, tumor dissociation, limiting dilution transplantation assays, soft agar assays and proliferation assays were performed as previously described (15 ). For limiting dilution assays, 5000 (4 injections), 500 (4 injections) and 100 cells (9 injections) were injected for each condition and tumor growth was monitored for 4 months. For experiments requiring in vivo modulation of H1 variants in established tumors, cells infected with inducible constructs were injected intradermally into NSG mice (2,000 transformed cells and 100,000 hTERT-fibroblasts as carrier cells) in the uninduced state to induce tumor formation (10 injections per condition). When tumors first became palpable, typically after 4 weeks, doxycycline treatment was started and sustained for about a month until tumors were collected and analyzed (2µg/ml Dox in drinking water supplemented with 1% sucrose, changed every 2-3 days). Subsequent soft agar assay were performed without adding further doxycycline, whereas in vivo limiting dilution experiments were performed by continuing the treatment for the first two weeks

Page 4: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

4

after injection to ensure that the induced changes in protein levels were maintained throughout the engraftment stage. Animals were randomly allocated to cages to receive either the drug or the vehicle. Breast cancer cell lines transduced with H1.0 inducible constructs (H1.0 shRNA or cDNA) or control plasmid were injected in the mammary fat pad of 8-10 weeks old NSG female mice (1-2x106 cells in 50 µl of 75% matrigel (reduced growth factors, Scientific Laboratory Supplies, 356231), 5 injections per condition) in an uninduced or induced state (7 days of previous Dox treatment). Doxycycline treatment was sustained in vivo during tumor growth until tumors were excised and analyzed. When two separate tumors arose from one injection, they were analyzed independently. Samples size was estimated based on preliminary experiments showing the degree of inter-sample variability. Animal studies were conducted in agreement with the approved NCI animal protocol LRBGE-007 and the Crick project license PPL 70/8167. DNA methylation analysis

For bisulfite sequencing analysis, genomic DNA extracted from sorted tumor cells, glioblastoma, melanoma and breast cancer cells was treated for bisulfite conversion using EZ DNA methylation-Direct Kit (Zymo research) according to the manufacturer’s instructions. Three regions of H1F0 (CGI_1, CGI_2 and CGIshore) were amplified by PCR (primer sequences in Table S8) from bisulfite-treated DNA and cloned into pCR 2.1 Topo vector using TOPO TA Cloning Kit (Invitrogen). 15-20 colonies were sequenced for each region. For 5-Aza-2’-deoxycytidine (5-Aza) treatment, SSEA1+ cells isolated from a tumor were treated with 5nM 5-Aza (Sigma) or DMSO as control (Fisher chemical) for up to 14 days. Higher concentrations of 5-Aza were toxic to the SSEA1+ cells. Cell culture media including 5nM 5-Aza or DMSO was changed every day and cells were trypsinized and plated again when needed to allow cells to proliferate, until cells were collected for RNA extraction. For analysis of TCGA samples, gene expression (Illumina HiSeq RNA-seq) and DNA methylation (Illumina 450K Infinium analysis) datasets from individual cancers were downloaded from the UCSC Cancer browser https://genome-cancer.ucsc.edu, patients were ranked based on H1F0 expression levels (RSEM as directly provided in the UCSC Cancer Browser) and DNA methylation levels were visualized as a heatmap. Spearman rank correlation between H1F0 mRNA levels and the average methylation levels of CpG 7 and 8 was used to assess statistical dependence between two variables. The analysis was done both on internally normalized datasets for each cancer type, including the available normal samples, and on pan-cancer normalized datasets, in which the expression levels from of each tissue were mean-shifted to 0 such that H1F0 levels were comparable across tissues. Correlation between H1F0 methylation and tumor grade was assessed by dividing patients in two groups (methylation levels below or above 35%) and counting the number of patients with high-grade tumors in each group. Statistical significance of the differences was assessed by one tailed Fisher exact test. Patient cohorts with median methylation levels <10% were excluded from the analysis. CGI shore functional assays

For luciferase assays, a fragment containing the H1F0 CGI shore (nucleotides +543 to +858 from TSS, Table S8) was cloned into pNL3.1 Nanoluc reporter plasmid

Page 5: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

5

(Promega) in front of a minimal promoter driving NanoLuc (Nluc) expression. Two previously published DNA fragments (21, 43), corresponding to LADs in chromosome 1 and 17 and showing no transactivation potential, were also cloned into pNL3.1 and used as negative controls (Table S8). pGL3-Control Vector (Promega), expressing the firefly luc gene under the control of a SV40 promoter and enhancer was used as a constitutive internal control to normalize Nluc for transfection efficiency. To test the effect of methylation on the shore region, plasmids were in vitro methylated using the CpG Methyltransferase M.SssI (New England Biolabs). Methylation efficiency was tested by digestion with the CpG methylation-sensitive restriction enzyme AgeI. Plasmids were transiently transfected into in vitro-transformed fibroblasts using TransIT LT1 (Mirus Bio LCC) according to manufacturer instructions and the activity of the Nluc and firefly luc proteins was assayed after 24 hours using the Nano-Glo® Dual-Luciferase® reporter assay system. Deletion of the endogenous H1F0 CGI shore was carried out by CRISPR-Cas9-mediated genome editing. Two sgRNAs flanking the CGI shore, designed using the CRISPR Design MIT tool (http://crispr.mit.edu/) (Table S8) were cloned into pSpCas9(BB)-2A-GFP vector (Addgene) and transfected into in vitro-transformed fibroblasts. 48h after transfection, GFP+ cells were isolated by cell sorting and grown as individual clones. Clones carrying homozygous or heterozygous deletions of the CGI shore were identified by PCR and DNA sequencing (Table S8; both heterozygous and homozygous clones carried a deletion spanning from +614 to +796bp from the TSS). Clones with homozygous deletion of the H1F0 CGI shore were analyzed by qRT-PCR using primers located downstream of the deletion. For heterozygous clones, allele-specific primers probing either the deleted or the full-length mRNA were used (Table S8). qRT-PCR standard curves were performed to ensure that the allele-specific reactions had comparable efficiency. To induce targeted methylation of the endogenous H1F0 CGI shore, cells were transiently transfected with plasmids encoding for either WT or ANV mutant Cas9-DNMT3A (22) and a pool of 4 sgRNAs targeting the H1F0 CGI shore (Table S8). 48h after transfection, GFP+ cells were sorted and grown for 3 days. Genomic DNA for bisulfite sequencing and RNA for qRT-PCR were collected 5 days after transfection. RNA-seq and qRT-PCR

RNA extraction was carried out using RNeasy Plus Mini Kit (Qiagen) following manufacturer’s instructions. Stranded total RNA sequencing was performed on rRNA-depleted samples (Ribozero Gold, Epicentre Madison). RNA libraries were prepared using TruSeq Stranded Total RNA-Seq Library Prep (Illumina), assessed on a DNA 1000 BioAnalyser 2100 chip (Agilent) to ensure good quality, and sequenced on an Illumina HiSeq 2500 sequencer, generating ~55 million 101-bp strand-specific paired-end reads per sample. For qRT-PCR, cDNA was generated using High Capacity cDNA Reverse Transcription Kits (Life Technologies) and gene expression levels were analyzed on a CFX96 real-time PCR detection system (Bio-rad) using SsoAdvanced™ Universal SYBR® Green Supermix (Bio-rad). Cyclophilin A (PPIA) was used as reference housekeeping gene.

Page 6: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

6

RNA-seq analysis and GSEA The RSEM package (version 1.2.11) was used for alignment and subsequent gene-

level counting of the sequenced reads with respect to hg19 RefSeq genes archived in the Illumina iGenomes resource http://support.illumina.com/sequencing/sequencing_software/igenome.ilmn. Utility scripts within the Trinity software package (version r20140413p1) were used to generate normalized FPKM values from the raw RSEM counts, and to perform differential expression analysis with DESeq (version 1.14.0). Differentially-expressed genes were defined as those showing statistically significant differences (FDR <0.05) using both H1.0-targeting shRNAs, independently of fold-change. Genome-wide expression plots were generated using similar methods to those described in (44). Log2FC values were obtained from the DESeq output for the relevant sample comparisons (NT vs DOX and DOX vs washDOX). The locpoly function in R (version 3.0.2; R Development Core Team, 2008) was used with a bandwidth setting of 10 to smooth the log2FC data for all expressed genes. The resulting data was then plotted at equidistant intervals based on the molecular location of genes along the human genome. An upregulated or downregulated domain was defined as a set of at least two consecutive genes showing, respectively, positive or negative smoothed log2FC values. Deviation from random distribution was calculated on raw, unsmoothed data, by performing a Montecarlo simulation in which genes where randomly shuffled along chromosomes (1000 iterations). Gene set enrichment analysis (GSEA) was performed as previously described, using GSEA software (version 2.1.0, Broad Institute). FPKM values for NT and DOX samples were provided to the algorithm and tested for enrichment of oncogenic signatures from the MSigDB or custom-made positional gene sets generated based on RefSeq genes and cytogenetic sub-bands coordinates downloaded from the UCSC genome browser. Gene sets were considered enriched when a FDR q-value < 0.25 was obtained using both shRNAs after 1000 permutations, as per GSEA recommendation. ChIP

Chromatin immunoprecipitation (ChIP) was performed on in vitro-transformed fibroblasts. For H1.0 ChIP, 10 million cells were fixed with 1% formaldehyde for 10 min at RT, treated with 125 mM glycine for 5 min at RT and washed three times with ice-cold PBS. Dry pellets were flash frozen in liquid nitrogen. Pellets were resuspended in 0.7 ml SDS lysis buffer (1% SDS, 10 mM EDTA pH 8, 50 mM Tris pH 8.1) and incubated for 15 min on ice. Chromatin was sheared to 200-500 bp with 20 cycles of 0.5sec ON/OFF using the Bioruptor sonicator (Diagenode). Chromatin from each biological replicate was divided into 4, diluted 1:10 in ChIP dilution buffer (0.01% SDS, 1.1% Triton, 1.2 mM EDTA, 16.7 mM Tris-HCl, pH 8.1, 167 mM NaCl), pre-cleared with 30 μl of salmon sperm DNA/protein A-agarose 50% gel slurry (Upstate, Millipore) for 30 min at 4°C and immunoprecipitated overnight at 4°C with either H1.0 (5 µg, Millipore, 05-629, originally generated by M. Bustin’s laboratory) (42), H1.4 (5 µg, Millipore, 05-457), H3K4me3 (3 µg, Abcam, ab8580), or H3K27me3 (3 µg, Millipore, 07-449). Chromatin-antibody complexes were isolated by incubation with 60 μl protein A-agarose beads for 1h at 4°C, with agitation. Beads were harvested and sequentially washed for 2 min on a rotating platform with 1 ml of each of the following buffers: low salt immune complex wash buffer (0.1% SDS, 1% Triton, 2 mM EDTA, 20 mM Tris, pH 8.1, 15 mM NaCl),

Page 7: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

7

high salt immune complex buffer (0.1% SDS, 1% Triton, 2 mM EDTA, 20 mM Tris, pH 8.1, 500 mM NaCl), LiCl immune complex buffer (0.25 M LiCl, 1% NP-40, 1% Deoxycholic acid, 1 mM EDTA, 10 mM Tris, pH 8.1) and twice with TE (10 mM Tris, 1 mM EDTA). Chromatin-antibody complexes were eluted using elution buffer (500 µl of 1% SDS, 0.1 M NaHCO3), incubated at 65°C for 10 min followed by 5 min at RT. Cross-linking was reversed by incubation of the eluted samples overnight at 65°C under high-salt (200 mM NaCl). Samples were treated with RNaseI (0.5 h, 37°C) followed by proteinase K (1 h, 55°C) and purified with phenol:chlorophorm and ethanol precipitation. Library preparation was carried out as published (47). Multiplexed libraries were sequenced on an Illumina HiSeq2500 using 50 bp single-end runs (Table S6). ChIP-qPCR was performed to validate selected peaks detected by bioinformatics analysis. Primers were designed according to peaks that were detected using Primer-Blast. H1.0 ChIP-qPCR was performed on BIRC2, MMP7 and APRT (Table S8). For ChIP comparing uninduced and induced cells containing shH1.0-1, 25 million cells were fixed as indicated above and resuspended in 1.8 ml of IP Buffer (100 mM Tris at pH 8.6, 100nM NaCl, 0.25% SDS, 2.5% Triton X-100, and 5 mM EDTA). After sonication with Bioruptor Pico (Diagenode) (30sec On/ 30 sec off for 14 cycles) yielding genomic DNA fragments with a bulk size of 250–300 bp, 500 µg of sonicated chromatin was used for each immunoprecipitation. Samples were immunoprecipitated overnight at 4°C with anti-H3K27ac (5µg, Abcam, ab4729), H3K27me3 (5µg, Millipore, 07-449), HMGA1 (5 µg, Abcam, ab4078), HMGN2 (5 µg, gift of M. Bustin), PARP1 (5 µg Diagenode, c15410245) or IgG (5ug, Abcam, ab46540) antibodies. Immune complexes were recovered by adding 30 μL of magnetics bead (Dynabeads Protein G, Life Technologies) and incubated for 3 h at 4°C with agitation. Beads were washed three times in 0.5 mL of low salt Buffer (20 mM Tris at pH 8.1, 150 mM NaCl, 2 mM EDTA, 1% Triton X-100, and 0.1% SDS), and once in 0.5 ml of High salt Buffer (20mM Tris at pH 8.1, 1% Triton X-100, 500 mM NaCl, and 2 mM EDTA, 0.1%SDS). Chromatin-antibody complexes were eluted using elution buffer (120 µl of 1% SDS, 0.1 M NaHCO3) and incubated at 65°C overnight. Sample DNA was purified using the Qiagen PCR Purification Kit. DNA was resuspended in 200 μL of water and 2 ul used in real time PCR or library preparation using standard Illumina protocols. Multiplexed libraries were sequenced on an Illumina HiSeq2500 using 101bp paired read runs (Table S6). ChIP-seq analysis

ChIP-seq reads were aligned to the human GRCh37/hg19 genome assembly using Bowtie taking only uniquely aligned reads with no more than two mismatches (Table S6). Significant peaks of H3K4me3 and H3K27me3 were called using MACS 1.4, using a minimal p-value cutoff of 10-5 and a fold change for model building of 30. Peak detection for H1.0 and H1.4 was performed using a custom-made algorithm: reads were counted within 50 bp windows. The median was calculated for each three bins and normalized by the sum of the reads. To effectively capture local biases in the genome (48), the log2 ratio between the ChIP samples and the input genomic DNA (input_1) was calculated. Peaks were defined as regions with log2 ratio of ChIP DNA over input DNA > 2 within six neighboring windows, and average number of reads > 15. Common peaks were defined as peaks overlapping over a 200bp window in the two replicate samples. Accuracy of peak calling was confirmed using MACS2 (narrow settings with a bandwidth of 300),

Page 8: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

8

using a high-coverage input (input_2). ~ 90% of peaks called with the custom-made algorithm overlapped with peaks called by MACS2 over a 200bp window. Both large-scale and gene-focused distribution of H1.0 binding sites were highly similar using either peak-calling algorithm. Genomic distributions of H3K4me3, H3K27me3, H1.4 and H1.0 peaks were calculated by evaluating the overlap of each peak with genomic features, based on UCSC gene annotation coordinates. Promoter regions were defined as a 5Kb region upstream of the TSS. When a single base included several genomic annotations, the following order was enforced: promoter, 5' UTR, 3' UTR, exon, intron or intergenic region, so that each base was counted only once. The number of base pairs from each genomic feature was normalized to the total number of base pairs in the genome and then to the fraction of the genome containing the genomic feature being measured. Average peak density profiles of H3K4me3, and H3K27me3, and increased FAIRE peaks centered onto H1.0 peaks (± 5Kb), and average peak density profiles of H1.0 and H1.4 centered onto TSS of UCSC genes (± 5Kb), were generated using SeqMiner. H1.0 peak density profiles along chromosomes were generated by plotting the number of peaks in every megabase of the genome. To correlate H1.0, H1.4, H3K4me3, H3K27me3, FAIRE, NKI LADs and TADs datasets, we used a resolution that enabled the representation and normalization of datasets of a different nature (e.g. both TFs & histone modifications ChIP-seq), to minimize the effect of various peak sizes on the statistical analysis. The human genome (hg19) was partitioned into non-intersecting bins of 1.5 Kb lengths, choosing a similar scale as previously published (49). Next, binary matrices were built, spanning the genomes, where a binding event of a protein (Peak) was assigned with a value of 1 and a “free” region was assigned with 0. When reads were analyzed, each bin was assigned with the corresponding number of reads. Heatmaps were generated using k-means clustering based on the Pearson correlation between the samples. FAIRE

Cells were processed and analyzed as described (45). Briefly, 3x106 cells were fixed for 9 minutes with 1% formaldehyde (Fisher), resuspended in 100 ul of Lysis Buffer A (10 mM Tris-HCl pH 8.0, 2% (vol/vol) Triton X-100, 1% SDS, 100 mM NaCl and 1 mM EDTA) and incubated for 1.5h on ice prior to sonication with 80 cycles 15’’ON/90’’OFF using a Bioruptor Pico (Diagenode). Nucleosome-free regions were isolated by phenol-chloroform extraction of sonicated chromatin. FAIRE-qPCR was performed by analyzing 25ng of DNA (either FAIRE DNA, corresponding to active regulatory regions, or whole genomic DNA) in a 10ul reaction using primers described in Table S8, and by calculating FAIRE enrichment or depletion relative to the whole genome. FAIRE libraries were prepared using a modified ChIP-seq protocol (Illumina) in which the PCR step was performed before gel extraction of the correct band. Multiplexed libraries (3 samples per lane) were sequenced on an Illumina HiSeq2500 with 101 bp paired-end runs. FAIRE-seq analysis

FAIRE-Seq reads were aligned to the human GRCh37/hg19 genome assembly using BWA in paired end mode, allowing three mismatches across a seed length of the full 101bp sequence. Over 70% of reads were aligned for each sample. Reads were filtered for proper-pairing and non-PCR duplication using Samtools and non-overlap of

Page 9: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

9

ENCODE blacklisted regions (http://encodedcc.stanford.edu/ftp/modENCODE_VS_ENCODE/Regulation/Human/blacklist/) using Bamutils. FAIRE peaks were called using MACS-2.0.10 in paired-end mode using broad settings with a bandwidth of 300bp, normalizing each sample to its own background as described (45). The top 100,000 peaks (based on adjusted p-value) were selected for each replicate and those common to both (with a 20% reciprocal overlap) were further analyzed using Bedtools and standard R scripts. Constitutive peaks were defined as peaks present in all NT and DOX samples. Appearing peaks were defined as peaks present in both DOX replicates and in neither NT replicates. Disappearing peaks were defined as peaks present in both NT replicates and in neither DOX replicates. Differential FAIRE peaks analysis was performed using DiffReps (46), with a window size of 200bp, a step size of 20, and a p-value < 0.05. Differential FAIRE was correlated with RNA-seq expression results within 2kb up or downstream of gene TSSs and plotted in R. FAIRE-seq peaks were correlated with GC content using the Bedtools makewindows and nuc functions, in 500bp bins for global piecharts or 10bp bins for local profiles around FAIRE-Seq peaks. GC-rich and AT-rich regions were defined as regions with average GC content higher or lower than 45%, respectively. Average GC content for disappearing and appearing FAIRE-Seq peaks was drawn using the geom_smooth function within the ggplot2 and reshape2 R packages using default settings. FAIRE-seq peaks were annotated to the nearest genomic region using the region_analysis tool as provided in the DiffReps package. We defined the promoter as within 1kb up or downstream of the TSS, with all other regions considered as ‘elsewhere’. Differential H3K27ac and H3K27me3 ChIP-seq analysis

ChIP-seq reads from NT and DOX samples were aligned and filtered as for FAIRE-seq data, with an ~80% average alignment rate. Peaks of read enrichment were called using MACS-2.1.0 in narrow mode for H3K27ac and in broad mode for H3K27me3, using input samples and an adjusted p-value < 10-5. Peaks overlapping between replicates were selected for comparison with FAIRE-seq datasets, whilst differential enrichment was assessed using DiffReps. All comparisons with gene expression results used a DiffReps adjusted p-value of 0.001. Enhancer analysis

Putative distal regulatory regions of H1.0-sensitive genes were identified by comparing the FANTOM5 enhancer atlas (21) with our H3K27ac ChIP-seq dataset. FANTOM enhancers showing H3K27ac peaks within a 2Kb window (43) were defined as active enhancers in in vitro-transformed fibroblasts. An active enhancer was assigned to a given gene if the midpoint of the enhancer was located within 50Kb of the TSS and was located in the same topological domain as the gene (50). The number of H1.0 binding sites within a 2Kb window around upregulated and downregulated gene enhancers was scored and normalized to the density of H1.0 binding sites in the specific region (H1.0 peaks/Mb) to account for large-scale differences in H1.0 occupancy in AT- and GC-rich regions. Similarly, the number of increased or decreased FAIRE peaks upon H1.0 knockdown that overlap with upregulated and downregulated gene enhancers was scored.

Page 10: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

10

Survival analysis Survival analysis was performed on all TCGA datasets containing more than 200

patients, using RNA-seq expression data obtained from www.firebrowse.com (rnaseqv2__illuminahiseq_rnaseqv2__unc_edu__Level_3__RSEM_genes_normalized files), except GBM, which does not have a complete RNA-seq dataset and for which the U133A microarray dataset was used. Clinical data was obtained from http://cancergenome.nih.gov/ (nationwidechildrens.org_clinical_patient files). Patients were ranked based on H1F0 expression and the top and bottom tertile were compared with respect to vital status by Kaplan-Meier analysis using SPSS or SigmaPlot software. Log-rank test was used to assess statistical significance of the differences between patient groups and a p-value ≤ 0.05 was considered significant. Two additional datasets each were analyzed for GBM and BRCA. Due to the lower number of patients in the validation datasets, no filtering was applied and all patients were grouped using the K-means clustering function in Matlab software. GBM validation datasets were GEO accession: GSE4412, 85 patients, GSE7679, 80 patients and GSE13041, 191 patients. BRCA validation datasets were GEO accession: GSE19536, 111 patients, and GSE11121, 200 patients. To test whether the prognostic value of H1F0 is independent of other clinically-relevant features, the distributions of available features in each dataset in H1F0-low and H1F0-high patients were compared by chi-square test for categorical variable or by two-tailed t-test for continuous variables. Multivariate Cox proportional hazards regression analysis was performed to assess the individual contribution of H1F0 levels when correlation with other features (age in GBM, ER and PR status in BRCA) was observed. Association between H1F0 levels and BRCA subtypes was performed using datasets with available PAM50 information (GSE20713 and GSE56493). Statistical analysis

Unless otherwise stated in figure legends, data are presented either as individual samples or as mean ± standard error of the mean (SEM) of multiple replicates, with N indicated in the figure legend. In box plots, the boundary of the box closest to zero indicates the 25th percentile, a line within the box marks the median, and the boundary of the box farthest from zero indicates the 75th percentile. Whiskers (error bars) above and below the box indicate the 95th and 5th percentiles. Outliers are plotted as dots. The statistical test used for each comparison, whether adjustment for multiple corrections was performed and the p-value, are indicated in the corresponding figure legends.

Page 11: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

11

Supplementary Text

Author contributions CMT and PS generated and characterized cell lines and performed in vivo

experiments. CMT performed bisulfite sequencing analysis, CRISPR, ChIP and FAIRE experiments. AB performed H1.0 ChIP experiments and reporter assays. AB and AH performed bioinformatic analysis. MJB performed FAIRE-seq and ChIP-seq bioinformatic analysis. HP performed RNA-seq bioinformatic analysis. THB performed imaging flow cytometry. YL performed DNA methylation analysis of TCGA samples. EN, BSD and PS performed immunostaining of tumor sections. RBH, PS and PC performed survival analysis. NM performed NGS of RNA, ChIP and FAIRE samples. TM provided feedback. CMT, AB, EM and PS designed experiments and interpreted data. EM and PS supervised the study. PS conceived the study, performed RNA-seq analysis and wrote the manuscript.

Text related to Fig.3B

Despite the strong negative selection of cells expressing high levels of H1.0, 4/11 tumors were found to express overall increased levels of H1.0 upon induction of the exogenous protein, although lower than control tumors induced for only 3 days (Fig. 3B). The presence or absence of negative selection of H1.0-expressing cells depends on which cells engraft and form each tumor. Although we selected transduced cells, the population of infected cells expressed variable levels of exogenous H1.0. Suppl. Fig. 8E shows that ~ 80% of transduced cells are capable of expressing high levels of H1.0 upon induction in vitro, but ~20% can only express lower levels. In most injections, both subpopulations engraft and, upon induction, cells expressing lower levels of H1.0 are selected at the expense of those expressing high levels. This is likely what happens in the tumors where only low levels of exogenous H1.0 are detected. However, it is possible that only cells capable of expressing high levels of H1.0 engraft. In this case, negative selection cannot occur even upon induction. Since we inject a low number of cells for tumor formation, which lead to the engraftment of less than 100 cells, stochastically some tumors will show negative selection of cells expressing high levels of H1.0 and some will not. Consistent with random engraftment of cells that can express variable levels of H1.0, we show in Fig. 3B that even control tumors induced for only 3 days show variable levels of exogenous H1.0.

Text related to Fig. 3D-K

Although the changes in tumor organization assessed by FACS and clonogenic assays are moderate, they are significant, reproducible and in agreement with several other reports showing that modest differences in the content of self-renewing tumor cells translate into major differences in terms of tumor aggressiveness and patient outcome. For instance, Prost et al. have shown that 2.4-3.5 fold reductions in the percentage of leukemia stem cells induced by PPARγ agonists result in complete and sustained molecular response in CML patients (25); similarly, Wang et al. have reported that 1.6-1.8-fold reductions in the fraction of tumorigenic liver cancer cells upon LncTCF7 knock-down result in complete protection from tumor formation in one third of tested

Page 12: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

12

mice in transplantation assays (26); as another example, Pece et al. have shown that a fold difference of 3.2 ± 1.4 in cancer stem cell frequency distinguish well-differentiated grade 1 breast cancers (∼95% 5-year survival rate) from aggressive grade 3 cancers (∼50% 5-year survival rate) (18). Thus, the extent of the changes that we observe are in line with what has been reported in other studies, and has been shown to significantly affect patient outcome.

There are two major reasons why bigger differences cannot be expected when analyzing tumor organization. First, experiments in which we aim at restoring normal levels of H1.0 are complicated by the fact that cells expressing high levels of H1.0 are negatively selected during tumor growth because over time they arrest in G0. While being a strong indication of impaired self-renewal ability of H1.0high cells in itself, this negative selection dilutes the differences that can be detected when analyzing tumor organization by FACS or clonogenic assays. Second, in experiments in which we prevent accumulation of H1.0 by expressing shRNAs, the cells that are mainly affected are only a fraction of the whole tumor population, those which otherwise would accumulate high levels of H1.0 (likely the terminally differentiated cells), which represent between 40-60% of cells depending on the tumors (Fig. S2A). The magnitude of the changes that we observe is a reflection of intratumor heterogeneity itself and, as a consequence, of the fact that only subsets of cells are affected by the experimental modulation of H1.0.

Furthermore, due to the limited timeframe of the experiment (treatment over 4 weeks, which exerts its effect while cells differentiate), it is likely that not all cells reach the stage where H1.0 modulation has functional consequences in the primary tumor. However, when cells are transplanted into new recipient mice and have additional time to undergo differentiation, considerable differences between untreated and treated tumors emerge (Fig. 3I, 3K).

Text related to Fig. 3H

The difference in the frequency of self-renewing cells between the uninduced cDNA and uninduced shH1.0 conditions is due to H1.0-independent variability caused by integration of the viral constructs. Stable cell lines generated by integration of viral constructs may display batch-specific features, because integration of the lentivirus randomly disrupts the genome and can potentially affect cell function independently of the insert, and because selection of infected cells randomly amplifies subpopulations of cells from the original population. In order to account for such effects, we have used inducible constructs, rather than constitutive expression, to be able to specifically assess effects induced by the protein knock-down or expression within the same cell line rather than between cell lines. As a confirmation of such batch-specific effects, Fig. S10B shows that the uninduced state of the two cell lines generated by introduction of two distinct H1.0-targeting shRNAs, which should theoretically be identical, are as different as the uninduced and induced state of each shRNA as indicated by PCA. Thus, the only meaningful comparison is uninduced vs induced within each cell line. Indeed, this analysis shows significant differences in all cell lines regardless of the inter-cell line variability (Fig. 3D-I).

Page 13: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

13

Page 14: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

14

Fig. S1. Specific downregulation of the H1.0 variant in self-renewing SSEA1+ tumor cells.

A. Schematic representation of the engineered system used in this study (15), illustrating the subpopulations of tumor cells compared by gene expression analysis. Upon in vitro transformation, human primary dermal fibroblasts acquire the ability to initiate and maintain hierarchically-organized tumors, which contain self-renewing, tumorigenic SSEA1+ cells and differentiated, non-tumorigenic SSEA1- cells. B. Expression levels of the indicated histone H1 variant mRNAs measured by microarray analysis. SSEA1+ (+) and SSEA1- (-) cells from three tumors are compared. H1F0 (encoding H1.0) and HIST1H1E (encoding H1.4) represent ~80% of H1 transcripts. Only H1F0 is consistently differentially expressed in SSEA1+ tumor cells. The name of the protein encoded by each gene is indicated in brackets. C-D. Immunofluorescence microscopy of a tumor induced by in vitro-transformed fibroblasts using the indicated antibodies. Cells were co-stained with DAPI. Lower magnification images show a differentiated area containing no SSEA1+ cells and a more undifferentiated area containing numerous SSEA1+ cells (C). SSEA1- cells consistently express high H1.0 levels in differentiated tumor areas, whereas they are more heterogeneous, and overall lower, in undifferentiated areas. Scale bar 20 µm (upper panel) and 50 µm (lower panels).

Page 15: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

15

Fig. S2. H1.0 levels inversely correlate with cell proliferative status.

A. Immunofluorescence microscopy of a tumor induced by in vitro-transformed fibroblasts using the indicated antibodies. H3S28P marks mitotic cells. Cells were co-stained with DAPI. Higher magnification of the cells in the white squares is shown. Scale bar 50 µm. B. Distribution of mitotic cells across the indicated classes of tumor cells. Values are average and standard deviation from three tumors. N = 100 mitotic cells. Statistical significance of the differences is indicated by two asterisks (p < 0.01, Mann-Whitney test).

Page 16: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

16

Fig. S3. Downregulation of the H1.0 variant in glioblastoma and breast cancer.

A. Immunofluorescence microscopy of a normal brain section and a GBM section using the indicated antibodies. Cells were co-stained with DAPI. Higher magnification of the cells in the white squares is shown. Scale bar 30 µm. B. Average (bar graph) and single cell (scatter plots) H1F0 mRNA levels in GBM samples measured by single cell RNA-seq (2). Every dot represents a cell. H1F0 levels are expressed as transcripts per million (TPM) C. Relative H1F0 expression levels in low grade glioma (LGG) and glioblastoma (GBM) samples from TCGA measured by RNA-seq. N = 527 (LGG) and 168 (GBM). Statistical significance of the differences is indicated (Mann-Whitney test). D. Immunofluorescence microscopy of a normal breast section and breast cancer sections of various histological grades using anti-H1.0 and anti-H1.4 antibodies. Cells were co-stained with DAPI. Scale bar 30 µm.

Page 17: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

17

Page 18: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

18

Fig. S4. Altered levels of H1.0 in various cancer types.

A. Immunofluorescence microscopy of the indicated normal tissues and two corresponding cancer samples using anti-H1.0 antibodies (red). Cells were co-stained with DAPI (blue). Scale bar 30 µm. B-D. Quantitative immunofluorescence of the indicated cell lines from various cancer types, using an anti-H1.0 antibody. Cells were co-stained with DAPI. Cells stained in a 96-well plate were imaged using identical settings. N > 1000 for each sample. Scale bar: 20 µm. Three (B,C) or four (D) wells were stained and imaged for each cell line. Wells in which more than 50% of the acquired fields were of poor quality, or that were statistical outliers compared to the other replicate wells were not included in the analysis. In D, H1.0 levels were measured 3 days after incubation of transformed fibroblasts with the indicated recombinant human proteins. N > 1000 for each sample. Cells with detectable H1.0 were considered those showing a signal intensity greater than 200 (range of intensity: 40-1200). Values indicate average ± SEM from 2-4 wells. One or two asterisks indicate p<0.05 and p<0.01, respectively (Student t-test). N > 200 (max 1300).

Page 19: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

19

Page 20: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

20

Fig. S5. The H1F0 CGI shore contains a regulatory region controlling H1F0 expression, which undergoes methylation in cancer.

A. DNA methylation analysis of various normal tissues and three cancer cell lines by Reduced Representation Bisulfite Sequencing (ENCODE). The location of 450K Infinium probes is indicated as reference. B. DNA methylation analysis of SSEA1+/CD166high cells by bisulfite sequencing. The region corresponding to the H1F0 CpG island (CGI_1 and CGI_2, see Fig.2A), is not methylated. Lines represent individual sequenced molecules. White and black circles represent unmethylated and methylated CpGs, respectively. The last 4 CpGs of the CGI_2 fragment are not shown since they overlap with the first 4 CpGs of the CGIshore fragment. C. DNA methylation analysis of SSEA1+ cells by bisulfite sequencing comparing the methylation status of the H1F0 CGIshore during 5-Aza treatment. Lines represent individual sequenced molecules. White and black circles represent unmethylated and methylated CpGs, respectively. The percentage of methylated sites at selected CpGs is indicated (C) and plotted over time (D). Grey lines (D) represent the individual CpG values, while the red line represents the average value at each time point. The H1F0 CGIshore is demethylated with kinetics comparable to those observed at mRNA levels. E. H3K27ac profile along the H1F0 locus as assessed by ChIP-seq. A red line indicates the approximate location of the CGI shore. The purple and green lines indicate the regions probed by ChIP-qPCR in panel F. F. ChIP-qPCR analysis of the indicated regions of the H1F0 locus using anti-H3K27ac antibodies. Values are average ± SEM from four independent ChIP replicates. Data are show as relative to the signal obtained with 1% of input in each PCR reaction. Two asterisks indicate p < 0.01 (Student t-test) G. PCR on genomic DNA from untransfected cells containing the full-length H1F0 (NT), and different clones modified by CRISPR-Cas9-mediated genome editing carrying a heterozygous (Hetero.) or homozygous (Homo. 1 and 2) deletion of H1F0 CGI shore. MW: molecular weight standard. H. qRT-PCR comparing total H1F0 mRNA levels (left graph) or mRNAs from individual H1F0 alleles in the indicated cell populations (right graph). Values are average ± SEM from three technical replicates. Two asterisks indicate p < 0.01 (Student t-test). I. Quantification of DNA methylation induced by wild type (WT) or a catalytically dead mutant (ANV) Cas9-DNMT3A as assessed by bisulfite sequencing. CGI_1 correspond to a region around the TSS (Fig. 2A). The average methylation levels for all CpGs in each region are shown. Two asterisks indicate p < 0.01 (two-way ANOVA) J. Assessment of in vitro methylation efficiency. H1F0 CGI shore reporter plasmid, either untreated (NT) or methylated in vitro (Meth) digested with XbaI, BamHI and the methylation sensitive enzyme AgeI. The size of the bands from the digested plasmids is indicated. The absence of the 793 bp band indicates efficient methylation of the plasmid. MW: molecular weight standard. K. DNA methylation analysis of the indicated cancer cell lines by bisulfite sequencing, probing the methylation status of the H1F0 CGI shore. Lines represent individual sequenced molecules. White and black circles represent unmethylated and methylated CpGs, respectively.

Page 21: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

21

Page 22: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

22

Fig. S6. Methylation of H1F0 in human cancer.

Analysis of H1F0 methylation in the indicated TCGA samples: Acute Myeloid Leukemia [LAML], Adrenocortical carcinoma [ACC], Bladder urothelial carcinoma [BLCA], Brain lower grade glioma [LGG], Breast invasive carcinoma [BRCA], Cervical squamous cell carcinoma and endocervical adenocarcinoma [CESC], Colon adenocarcinoma and rectum adenocarcinoma [COADREAD], Esophageal carcinoma [ESCA], Glioblastoma multiforme [GBM], Head and neck squamous cell carcinoma [HNSC], Kidney chromophobe [KICH], Kidney renal clear cell carcinoma [KIRC], Kidney renal papillary cell carcinoma [KIRP], Liver hepatocellular carcinoma [LIHC], Lung adenocarcinoma [LUAD], Lung squamous cell carcinoma [LUSC], Lymphoid neoplasm Diffuse large B-cell lymphoma [DLBC], Mesothelioma [MESO], Pancreatic adenocarcinoma [PAAD], Pheochromocytoma and paraganglioma [PCPG], Prostate adenocarcinoma [PRAD], Sarcoma [SARC], Skin cutaneous melanoma [SKCM], Stomach adenocarcinoma [STAD], Thyroid carcinoma [THCA], Uterine carcinosarcoma [UCS], Uterine corpus endometrial carcinoma [UCEC]. Patients from the indicated cancer types are sorted based on H1F0 expression levels and the corresponding DNA methylation levels are visualized by a heatmap where different colors represent the indicated percentage of DNA methylation at individual CpGs (dark blue: 0%, light blue: 25%, yellow: 50%, orange: 75%, red: 100%). Every row corresponds to a patient. Scatter plots show covariance between H1F0 mRNA levels and methylation of CpG 7 and 8 and p-value of the covariance is indicated (Spearman’s rank correlation). Every point represents a patient. With the exception of DLBC, all cancer types show significant Spearman correlation between H1F0 mRNA levels and methylation of the locus (p <0.01). Expression levels are normalized within each cancer group.

Page 23: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

23

Page 24: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

24

Fig. S7. Characterization of inducible cell lines.

A-B. Quantification of H1F0 and HIST1H1E mRNA levels upon expression of exogenous cDNAs or specific shRNAs by qRT-PCR. NT and DOX indicate uninduced and induced samples, respectively. washDOX indicates samples in which Dox has been removed for 4d after induction. Two distinct pairs of primers have been used to quantify levels of exogenous or total (endogenous + exogenous) mRNAs (A) Values represent average ± SEM from two technical replicates. Results from the three biological replicates used for each sample for RNA-seq are plotted in B. Both H1.0 and H1.4 inducible cell lines show comparable levels of cDNA expression or knockdown. C-D. Quantitative immunodetection of H1.0 by fluorescent microscopy in uninduced cells and cells expressing inducible shRNAs targeting H1.0 for 7 days. TurboRFP acts as a reporter of shRNA expression. N>300 for each sample. Scale bar: 30 µm (c). Statistical significance of the differences is indicated (D). E. Western blot analysis of uninduced cells and cells expressing inducible shRNAs targeting H1.0 for 7 days. Core histones visualized by Ponceau staining of the blot are shown as loading control. F. Quantification of the indicated H1 variant mRNAs upon H1.0 or H1.4 knockdown by qRT-PCR. Values are mean ± SEM from three technical replicates. Statistically significant differences induced by the specific shRNAs relative to the corresponding NT samples are indicated by an asterisk (p < 0.01). shH1.4 also induces downregulation of HIST1H1B due to partial similarity of the targeted region. A control, non-retrotranscribed mRNA sample (no RT) is included to account for genomic DNA contamination in the RNA preparation. No compensation from other variants is observed upon H1.0 or H1.4 knockdown. G. Quantification of H1 complement upon H1.0 knock-down by western blot analysis. Blots were probed with an antibody specific for H1.0 and an antibody detecting the main H1 variants. H. Quantification of H1F0 and TurboRFP mRNA levels upon expression of exogenous cDNAs or shH1.0-1 by qRT-PCR in the indicated breast cancer cell lines. NT and DOX indicate uninduced and induced samples, respectively. Values are average ± SEM from two technical replicates.

Page 25: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

25

Fig. S8. Negative selection of cells expressing constitutively high levels of H1.0 during tumor growth.

A. Detection of apoptotic cells by annexin V staining in uninduced cells (NT) or cells expressing the indicated cDNAs or shRNAs. Overexpression of H1.5 cDNA, which is toxic and induces apoptosis, is used as a positive control. B. Growth curve comparing uninduced (NT) and induced (DOX) cells under the indicated conditions. Values are average ±SEM from three technical replicates. Two asterisks indicate p < 0.01. C-D.

Page 26: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

26

Immunofluorescence microscopy of uninduced cells (NT) or cells expressing H1.0 cDNA (DOX) using the indicated antibodies, 8 days after induction. Cells were counterstained with DAPI. The percentage of Ki67+ cells is indicated. Most cells expressing H1.0 cDNA are positive for the proliferation marker Ki67 (C), including cells expressing high levels of exogenous H1.0 (yellow arrows in D and percentages in C), but the fraction of Ki67 negative cells (orange arrow in D) increases over time (C, p = 0.014, Student t-test). Scale bar: 30 µm E. Immunofluorescence microscopy of cells expressing H1.0 cDNA (DOX) stained at the indicated time after induction using an H1.0 antibody. Cells were counterstained with DAPI. The percentage of cells expressing high levels of H1.0 is indicated. Scale bar: 30 µm. F-I. Quantification of the in vivo induction efficiency of GFP cDNA (F) or shH1.0-1 as reported by expression of TurboRFP (I) by flow cytometric analysis. Every number indicates a tumor, either uninduced (NT) or induced (DOX). G-H. Ex vivo induction of cells from tumors analyzed in Fig. 3B. Immunodetection (G) and quantification (H) of H1.0 in dissociated cells from the indicated tumors induced with Dox ex vivo. Prior to induction ex vivo, dissociated tumor cells were grown for a week in the absence of Dox to allow degradation of exogenous H1.0 that had been induced in vivo, thereby reestablished an uninduced state. Tumor cell populations contain variable percentages of H1.0high cells in response to Dox treatment ex vivo. 120<N<500. Merge between H1.0 and DAPI signals is shown. Scale bar: 30 µm. J. Quantification of the in vivo induction efficiency of H1.0 cDNA by qRT-PCR in the indicated breast cancer cell lines. Every number indicates a tumor, either uninduced (NT) or induced (DOX). Cultured cells that were injected to induce the tumors are shown as reference. Reduced levels of H1F0 mRNA in tumors compared to injected cells suggests that cells expressing exogenous H1.0 have been partially negatively selected during tumor growth. Of note, uninduced tumors from HCC1954 cells show lower H1F0 levels compared to the uninduced injected cells, suggesting that cells expressing particularly low levels of H1.0 have been positively selected during tumor growth even in the absence of Dox. K. Immunodetection of H1.0 in tumors 2 (NT) and 6 (DOX) induced by MDA-MB-231 and 1 (NT) and 8 (DOX) induced by HCC1954, analyzed by qRT-PCR in J. Merge between H1.0 and DAPI signals is shown. N>150. Scale bar: 50 µm.

Page 27: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

27

Page 28: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

28

Fig. S9. In vivo modulation of H1.0 levels affect tumor organization.

A. Quantification of the indicated subsets of cells by FACS (undifferentiated and differentiated cells) or by soft agar assay (in vitro self-renewing cells) in uninduced or induced tumors generated by cells containing Dox-responsive H1.4 cDNA constructs. Differences between NT and DOX samples are not significant. B. Immunofluorescence microscopy of one uninduced and one induced tumor generated by in vitro-transformed fibroblasts containing the indicated shRNA. Sections are stained with the indicated antibodies. H3S28P marks mitotic cells. Cells were co-stained with DAPI. Scale bar 100 µm. C. Quantification of mitotic cell abundance in the indicated tumors. Blind scoring of H3S28P staining was performed on whole tumor sections. D. Quantification of the indicated subsets of cells by FACS (undifferentiated and differentiated cells) or by soft agar assay (in vitro self-renewing cells) in uninduced or induced tumors generated by cells containing a Dox-responsive shH1.0-2. Results obtained with shH1.0-1 are shown in in Fig. 3E-I. Statistical significance of the differences is indicated (Student t-test).

Page 29: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

29

Fig. S10. Genome-wide changes in gene expression induced by H1.0 knockdown.

A. Western blot analysis of the indicated samples containing shH1.0-1, using the indicated antibodies. NT, Early, DOX and WashDOX indicate, respectively, uninduced cells, cells induced for 24h, cells induced for 14d, and cells in which DOX has been

Page 30: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

30

removed for 4d after the 14d induction. The amount of H1.0 in each condition relative to the NT sample is indicated under the blots. B. PCA analysis of the indicated samples analyzed by RNA-seq. C. Heatmap representing relative expression of differentially expressed genes (DEGs, FDR < 0.05 using both H1.0-targeting shRNAs) in induced cells (DOX) compared to uninduced cells (NT). Colors represent values normalized to the average value for each gene in all conditions, with green and red representing the lowest and highest values of each gene among all samples, respectively. The outlier of shH1.0-2_WashDOX indicated by an asterisk in B is not shown to avoid skewing of the relative values. Changes in DEGs are already detected 24h after induction (Early) and are partially reversed 4d after Dox removal (WashDOX) when H1.0 begins to be re-expressed D. Quantification of the indicated mRNA levels by RNA-seq. Values from three biological replicates are shown for selected DEGs. Changes induced by H1.0 knockdown (DOX) are partially reversed 4d after Dox removal and H1.0 re-expression (washDOX). E. qRT-PCR comparing the expression levels of the indicated genes before and after induction of shH1.0-1 or shH1.4. Values are average ± SEM from two technical replicates.

Page 31: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

31

Page 32: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

32

Fig. S11. Non-random distribution of self-renewal-related gene signatures and linear proximity of H1.0-sensitive genes along chromosomes.

A. Quantification of GC content of genes belonging to the indicated gene signatures from GSEA. Gene signatures associated with oncogenic or stem cell-related transcriptional programs are indicated in blue, while gene signatures associated with tumor suppressive or differentiation-related programs are in red. Both signatures affected by H1.0 knock-down (see Fig. 4B) and other signatures more generally related to stemness or differentiation are shown. One (p <0.05, Student t-test) or two (p < 0.01) asterisks indicate statistical significance of the differences compared to all RefSeq genes (all genes). For clarity, outliers are omitted. Description of the indicated signatures can be found in GSEA MSigDB (http://software.broadinstitute.org/gsea/msigdb/genesets.jsp?collection=CGP) under the following names: Ramalho_Stemness_Up, Bild_Ctnnb1_oncogenic, Hallmark_E2f_Targets, Hallmark_Kras_Signaling_Up, Bild_Src_oncogenic, Ramalho_Stemness_Dn, Hallmark_P53_Pathway, Hallmark_Kras_Signaling_Down, Hallmark_Myogenesis, Hallmark_Notch_Signaling. B. Selection of positional gene sets showing significant coordinated upregulation or downregulation in response to H1.0 knockdown identified by GSEA (see also Supplementary Table S5). C. RefSeq gene track of a region showing coordinated gene upregulation as detected by GSEA. Genes upregulated upon H1.0 knockdown are in red, non-expressed genes are in grey and genes insensitive to H1.0 knockdown are in black. D. Number of DEGs containing CpG islands in their promoter. Similar percentages are observed for upregulated and downregulated genes. Fractions of GC-rich (GC content > 0.55) or AT-rich genes (GC content < 0.45) containing CGIs are also indicated as a reference. E. Observed distribution of the number of consecutive genes that undergo up- or down-regulation in response to H1.0 knockdown across the genome (Observed), and expected distribution based on a Montecarlo simulation in which genes were randomly shuffled along chromosomes (Shuffled). Statistical significance of the differences between observed and random values (based on 1000 iterations of the Montecarlo simulation) is shown. Note that upregulated or downregulated domains are defined based on smoothed values to account for outliers (see Methods) and contain up to 365 genes. F. Quantification of the indicated mRNA levels by RNA-seq. Values are average ± SEM from the three biological replicates. NT, Early, DOX and WashDOX indicate, respectively, uninduced cells, cells induced for 24h, cells induced for 14d, and cells in which DOX has been removed for 4d after the 14d induction. G. Smoothed log2 fold change of gene expression between Early and NT (shH1.0-1) along the human genome (see Methods). Similar plots were obtained comparing the shH1.0-2 samples. Numbers indicate the chromosomes delimited with vertical lines. Upregulated domains are shown in blue and downregulated domains in red.

Page 33: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

33

Fig. S12. Characterization of genome-wide H1.0 binding profiles.

A. Correlation heatmap of biological replicates of H1.0, H3K4me3 and H3K27me3 ChIP-seq datasets. Colors indicate the calculated correlation, with 1 representing highest positive correlation. B. Tracks of BAM files from the indicated ChIP-seq datasets, showing consistent tag density for H1.0 ChIP-seq replicates and correlation with H3K27me3 around the TSS of the repressed AN05 gene. C. Genomic distribution of H3K4me3, H3K27me3 and H1.0 peaks from two biological replicates. H1.0 is enriched

Page 34: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

34

in gene elements and depleted from intergenic regions (p<0.01, 2-way ANOVA). D. Correlation heatmap of the indicated features. Colors indicate the calculated correlation, with 1 representing highest positive correlation, 0 representing no correlation and negative values representing anticorrelation. E. Quantification of H1.0 binding to the indicated gene promoters by ChIP-qPCR. Individual biological replicates are shown and labelled with a distinct patterned fill (replicate 1: white striped, replicate 2: solid, replicate 3: black striped). For each ChIP, enrichment measured at BIRC2 and MMP7 are shown as relative to the enrichment measured at the highly expressed gene APRT. F. Log2 fold enrichment of H1.0 binding on chromosomes (fraction of H1.0 peaks /fraction of genome in each chromosome). Chromosomes are sorted based on average GC content, with GC-rich chromosomes at the bottom. G. Average density profile of H1.0 peaks is centered on gene TSSs. Genes with the indicated levels of GC content are plotted separately. H. Visualization of the indicated tracks along chromosome 16 showing positive correlation of H1.0 binding sites with DNA GC content, nucleosome density (30) and gene-rich regions, and negative correlation with LADs. I. Comparison between H1.0 and H1.4 binding profiles based on ChIP-seq analysis. Whole genome included genic and non-genic regions. AT-rich and GC-rich regions are 1 Mb regions with GC content < 0.45 or > 0.55, respectively, across the whole genome. Average values from the two ChIP replicates are shown. When analyzing genes, “All” indicates all genes bound by H1.0 or H1.4, whereas “Unique” indicates genes only bound by one H1 variant. H1.0 or H1.4 peaks in a region surrounding (± 1Kb) the transcriptional start site (TSS) are graphed. J. Heatmap showing tag density of the two H1.0 ChIP-seq replicates around the TSS (± 5Kb) of H1.0-sensitive genes. Each line represents a gene. White and red represent the lowest and highest tag density, respectively, along each gene. K. Number of genes bound by H1.0 within 1Kb around the gene transcriptional start site (TSS) or in a region from 3 Kb to 5 Kb downstream of the TSS (gene body). Up and Down indicate, respectively, genes that undergo upregulation or downregulation in response to H1.0 knockdown. AT-rich and GC-rich indicate, respectively, all genes in the genome characterized by GC content < 0.45 or > 0.55. L. Bubble plot of the number of H1.0 peaks located within 1Kb around the TSS (x axis) vs. the number of peaks on the gene body (3 kb to 5 kb downstream of the TSS (y axis) at upregulated (blue) or downregulated (red) genes. The size of the bubble represents the number of genes that have each binding pattern.

Page 35: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

35

Page 36: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

36

Fig. S13. Correlation between H1.0 binding profiles and the position of genes responding to H1.0 loss on chromosomes.

A. H1.0 peak density along all chromosomes. The approximate location of cytogenetic bands and the differentially expressed positional gene sets identified by GSEA are shown. B-E. Comparison between gene sets enriched among upregulated (Up) or downregulated (Down) genes in response to H1.0 KD, with respect to the indicated features. Box plots show 5th-95th percentile of the distributions. Average density profile of H1.0 peaks (D) and FAIRE peaks (E) show enrichment of H1.0 and increased FAIRE signal (decreased nucleosome occupancy) upon H1.0 knock-down around the transcriptional start site (TSS) of upregulated genes. All RefSeq genes (All) are shown as a reference. Statistical significance of the differences between upregulated and downregulated genes is indicated (paired t-test, with Benjamini-Hochberg adjustment for D and E). All genes belonging to positional gene sets identified by GSEA as upregulated (Up) or downregulated (Down) in response to H1.0 knockdown are included in this analysis, regardless of transcriptional status. H1.0 peaks and nucleosome-depleted regions are enriched across entire chromosomal domains including non-expressed genes.

Page 37: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

37

Page 38: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

38

Fig. S14. Loss of H1.0 alters nucleosome occupancy in AT-rich domains.

A. Heatmap showing tag density of the two indicated FAIRE-seq replicates for all FAIRE peaks detected in both replicates. Each line represents a FAIRE peak and 0 is the center of the peak. White and red represent the lowest and highest tag density, respectively, along each region. Peaks were grouped by k-means clustering. B. Nucleosome occupancy at the indicated promoter regions (TSS ± 1Kb) measured by FAIRE-qPCR before (NT) and after (DOX) H1.0 knockdown. An active intergenic regulatory element (Reg. E.) in an AT-rich region and three regions in GC-rich domains were also probed. For every genomic region, the FAIRE enrichment or depletion relative to the whole genome (genomic DNA) is shown. Values represent average ± SEM of three technical qPCR replicates. Statistical significance of the differences is indicated by one (p < 0.05) or two (p < 0.01) asterisks. Two independent biological replicates showed similar results. C. Changes in FAIRE signal upon H1.0 knockdown at the promoter (TSS ±1 Kb) of upregulated (blue) and downregulated (red) genes. Results obtained with either H1.0-targeting shRNA are shown. Statistical significance of the differences between the two sets of genes is indicated (two-tailed Fisher's exact test) D. Average DNA GC content centered on FAIRE peaks that appear or disappear upon H1.0 knockdown. Results obtained with shH1.0-2 are shown. Statistical significance of the differences between appearing and disappearing FAIRE peaks is indicated (paired t-test with Benjamini-Hochberg adjustment). The grey area represents the 95% confidence interval of the best fit after smoothing. E. Fraction of the indicated types of FAIRE peaks overlapping with H3K27ac peaks. Promoter regions and the rest of the genome are shown separately. A large proportion of H1.0-sensitive FAIRE peaks are not located within acetylated chromatin. F. Binding of H1.0 to putative enhancers of H1.0-sensitive genes. Values are normalized to the density of H1.0 binding sites in the each region (H1.0 peaks/Mb) to account for large-scale differences in H1.0 occupancy in AT- and GC-rich regions. G. Nucleosome occupancy at putative enhancers of H1.0-sensitive genes. Upregulated gene enhancers show enrichment of increased FAIRE peaks upon H1.0 knockdown (increased FAIRE signal), indicating reduced nucleosome occupancy, whereas downregulated gene enhancers show enrichment of decreased FAIRE peaks upon H1.0 knockdown (decreased FAIRE signal), indicating increased nucleosome occupancy. Results obtained with both H1.0-targeting shRNAs are shown. Statistical significance of the differences between the two sets of genes is indicated. Statistical significance of the differences between GC-rich and AT-rich domains is indicated (one-tailed Fisher's exact test). H. Changes in nucleosome occupancy at the indicated promoter or enhancer (Reg. E.) regions assessed by FAIRE-qPCR. For every genomic region, the FAIRE signal measured upon knockdown of either H1.0 or H1.4 (DOX) relative to their corresponding uninduced state (NT) is graphed. Values represent average ± SEM of two independent FAIRE experiments.

Page 39: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

39

Fig. S15. Unaffected binding of other architectural chromatin proteins in the absence of H1.0.

Quantitative ChIP-PCR analysis of the indicated proteins in the presence (NT) or absence (DOX) of H1.0. Both nucleosome-enriched and nucleosome-depleted regions, as assessed by the levels of the core histone H3, were probed. No consistent change in ChIP signal at the various regions is observed for any protein in the absence of H1.0 (p > 0.05, two-way ANOVA).

Page 40: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

40

Page 41: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

41

Fig. S16. H1F0 levels correlate with patient survival independently of other clinically-relevant features.

A, C. Kaplan-Meier analysis of the indicated BRCA subsets of patients from the datasets analyzed in Fig. 6, showing that H1F0 correlates with patient survival independently of estrogen receptor (ER), progesterone receptor (PR) and lymph node (LN) status. Statistical significance of the differences between H1F0-low and H1F0-high patients in each subset is indicated (Log-rank test). B. No significant association between H1F0 levels and BRCA subtypes is observed (one-way ANOVA) C. Table summarizing the results of the survival analysis of the indicated datasets. Significant log-rank p-values are in red. Shorter median survival time of H1F0-low patients is in blue. For datasets showing significant log-rank p-values, the p-values of the correlation between H1F0 levels and the indicated clinical features are indicated (Chi-square for categorical variables and two–tailed t-test for continuous variables). Significant correlation is observed with ER and PR status in BRCA TCGA and age in GBM (GSE4412 and GSE7679) but Cox proportional hazard regression (Cox PH) shows an independent contribution of H1F0 to patient survival.

Page 42: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

42

Fig. S17. Correlation between expression levels of other H1 variants and patient survival.

A. Table summarizing the results of the survival analysis of the indicated TCGA datasets, in which low levels of H1F0 correlate with poor patient prognosis. Significant log-rank p-values are in red (lower survival for patients expressing the gene at low levels) or green (lower survival for patients expressing the gene at high levels). Since all H1 variants show a strong correlation with ER status, Cox proportional hazard regression (Cox PH) was used to assess the prognostic values of H1 variants in BRCA. Genes showing undetectable levels (<1 normalized RNA-seq counts or <5 log2 microarrays signal for GBM) in at least 33% of patients were not assessed and are indicated with a dash. B.

Page 43: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

43

Kaplan-Meier analysis of the TCGA LGG patient cohort showing that HIST1H1E correlates with patient survival in an opposite manner compared to H1F0. Statistical significance of the differences between HIST1H1E-low and HIST1H1E-high patients in each subset is indicated (Log-rank test).

Page 44: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

44

Table S1.

Correlation between H1F0 enhancer methylation and tumor grade. Tumors in grey, which have <10% median H1F0 methylation, were not analyzed.

Page 45: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

45

Table S2.

Limiting dilution transplantation assay for secondary tumor formation using cells from uninduced or induced primary tumors containing Dox-responsive shH1.0 constructs or Dox-responsive H1.0 cDNA constructs.

Page 46: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

46

Table S7.

Cell line growth conditions. All cells were grown at 37 °C in 5% CO2.

Page 47: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

47

References 1. G. H. Heppner, Tumor heterogeneity. Cancer Res. 44, 2259–2265 (1984). Medline

2. A. P. Patel, I. Tirosh, J. J. Trombetta, A. K. Shalek, S. M. Gillespie, H. Wakimoto, D. P. Cahill, B. V. Nahed, W. T. Curry, R. L. Martuza, D. N. Louis, O. Rozenblatt-Rosen, M. L. Suvà, A. Regev, B. E. Bernstein, Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344, 1396–1401 (2014). Medline doi:10.1126/science.1254257

3. P. C. Nowell, The clonal evolution of tumor cell populations. Science 194, 23–28 (1976). Medline doi:10.1126/science.959840

4. R. A. Burrell, N. McGranahan, J. Bartek, C. Swanton, The causes and consequences of genetic heterogeneity in cancer evolution. Nature 501, 338–345 (2013). Medline doi:10.1038/nature12625

5. A. Marusyk, V. Almendro, K. Polyak, Intra-tumour heterogeneity: A looking glass for cancer? Nat. Rev. Cancer 12, 323–334 (2012). Medline doi:10.1038/nrc3261

6. K. Illmensee, B. Mintz, Totipotency and normal differentiation of single teratocarcinoma cells cloned by injection into blastocysts. Proc. Natl. Acad. Sci. U.S.A. 73, 549–553 (1976). Medline doi:10.1073/pnas.73.2.549

7. F. Barabé, J. A. Kennedy, K. J. Hope, J. E. Dick, Modeling the initiation and progression of human acute leukemia in mice. Science 316, 600–604 (2007). Medline doi:10.1126/science.1139851

8. H. Easwaran, H. C. Tsai, S. B. Baylin, Cancer epigenetics: Tumor heterogeneity, plasticity of stem-like states, and drug resistance. Mol. Cell 54, 716–727 (2014). Medline doi:10.1016/j.molcel.2014.05.015

9. A. Kreso, J. E. Dick, Evolution of the cancer stem cell model. Cell Stem Cell 14, 275–291 (2014). Medline doi:10.1016/j.stem.2014.02.006

10. W. C. Hahn, C. M. Counter, A. S. Lundberg, R. L. Beijersbergen, M. W. Brooks, R. A. Weinberg, Creation of human tumour cells with defined genetic elements. Nature 400, 464–468 (1999). Medline doi:10.1038/22780

11. B. Elenbaas, L. Spirio, F. Koerner, M. D. Fleming, D. B. Zimonjic, J. L. Donaher, N. C. Popescu, W. C. Hahn, R. A. Weinberg, Human breast cancer cells generated by oncogenic transformation of primary mammary epithelial cells. Genes Dev. 15, 50–65 (2001). Medline doi:10.1101/gad.828901

12. J. N. Rich, C. Guo, R. E. McLendon, D. D. Bigner, X. F. Wang, C. M. Counter, A genetically tractable model of human glioma formation. Cancer Res. 61, 3556–3560 (2001). Medline

13. M. J. Son, K. Woolard, D. H. Nam, J. Lee, H. A. Fine, SSEA-1 is an enrichment marker for tumor-initiating cells in human glioblastoma. Cell Stem Cell 4, 440–452 (2009). Medline doi:10.1016/j.stem.2009.03.003

14. M. L. Suvà, E. Rheinbay, S. M. Gillespie, A. P. Patel, H. Wakimoto, S. D. Rabkin, N. Riggi, A. S. Chi, D. P. Cahill, B. V. Nahed, W. T. Curry, R. L. Martuza, M. N. Rivera, N. Rossetti, S. Kasif, S. Beik, S. Kadri, I. Tirosh, I. Wortman, A. K. Shalek, O. Rozenblatt-Rosen, A. Regev, D. N. Louis, B. E. Bernstein, Reconstructing and reprogramming the tumor-

Page 48: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

48

propagating potential of glioblastoma stem-like cells. Cell 157, 580–594 (2014). Medline doi:10.1016/j.cell.2014.02.030

15. P. Scaffidi, T. Misteli, In vitro generation of human cells with cancer stem cell properties. Nat. Cell Biol. 13, 1051–1061 (2011). Medline doi:10.1038/ncb2308

16. A. Izzo, K. Kamieniarz, R. Schneider, The histone H1 family: Specific members, specific functions? Biol. Chem. 389, 333–343 (2008). Medline doi:10.1515/BC.2008.037

17. D. Doenecke, A. Alonso, Organization and expression of the developmentally regulated H1(o) histone gene in vertebrates. Int. J. Dev. Biol. 40, 395–401 (1996). Medline

18. S. Pece, D. Tosoni, S. Confalonieri, G. Mazzarol, M. Vecchi, S. Ronzoni, L. Bernard, G. Viale, P. G. Pelicci, P. P. Di Fiore, Biological and molecular heterogeneity of breast cancers correlates with their cancer stem cell content. Cell 140, 62–73 (2010). Medline doi:10.1016/j.cell.2009.12.007

19. H. R. Ali, S. J. Dawson, F. M. Blows, E. Provenzano, P. D. Pharoah, C. Caldas, Cancer stem cell markers in breast cancer: Pathological, clinical and prognostic significance. BCR 13, R118 (2011). Medline doi:10.1186/bcr3061

20. R. A. Irizarry, C. Ladd-Acosta, B. Wen, Z. Wu, C. Montano, P. Onyango, H. Cui, K. Gabo, M. Rongione, M. Webster, H. Ji, J. B. Potash, S. Sabunciyan, A. P. Feinberg, The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat. Genet. 41, 178–186 (2009). Medline doi:10.1038/ng.298

21. R. Andersson, C. Gebhard, I. Miguel-Escalada, I. Hoof, J. Bornholdt, M. Boyd, Y. Chen, X. Zhao, C. Schmidl, T. Suzuki, E. Ntini, E. Arner, E. Valen, K. Li, L. Schwarzfischer, D. Glatz, J. Raithel, B. Lilje, N. Rapin, F. O. Bagger, M. Jørgensen, P. R. Andersen, N. Bertin, O. Rackham, A. M. Burroughs, J. K. Baillie, Y. Ishizu, Y. Shimizu, E. Furuhata, S. Maeda, Y. Negishi, C. J. Mungall, T. F. Meehan, T. Lassmann, M. Itoh, H. Kawaji, N. Kondo, J. Kawai, A. Lennartsson, C. O. Daub, P. Heutink, D. A. Hume, T. H. Jensen, H. Suzuki, Y. Hayashizaki, F. Müller, A. R. Forrest, P. Carninci, M. Rehli, A. Sandelin; FANTOM Consortium, An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014). Medline doi:10.1038/nature12787

22. A. Vojta, P. Dobrinić, V. Tadić, L. Bočkor, P. Korać, B. Julg, M. Klasić, V. Zoldoš, Repurposing the CRISPR-Cas9 system for targeted DNA methylation. Nucleic Acids Res. 44, 5615–5628 (2016). Medline doi:10.1093/nar/gkw159

23. E. Järvinen, A. Angers-Loustau, A. M. Osiceanu, K. Wartiovaara, Timing of the cell cycle exit of differentiating hippocampal neural stem cells. Int. J. Stem Cells 3, 46–53 (2010). Medline doi:10.15283/ijsc.2010.3.1.46

24. M. Snir, I. Kehat, A. Gepstein, R. Coleman, J. Itskovitz-Eldor, E. Livne, L. Gepstein, Assessment of the ultrastructural and proliferative properties of human embryonic stem cell-derived cardiomyocytes. Am. J. Physiol. Heart Circ. Physiol. 285, H2355–H2363 (2003). Medline doi:10.1152/ajpheart.00020.2003

25. S. Prost, F. Relouzat, M. Spentchian, Y. Ouzegdouh, J. Saliba, G. Massonnet, J. P. Beressi, E. Verhoeyen, V. Raggueneau, B. Maneglier, S. Castaigne, C. Chomienne, S. Chrétien, P. Rousselot, P. Leboulch, Erosion of the chronic myeloid leukaemia stem cell pool by PPARγ agonists. Nature 525, 380–383 (2015). Medline doi:10.1038/nature15248

Page 49: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

49

26. Y. Wang, L. He, Y. Du, P. Zhu, G. Huang, J. Luo, X. Yan, B. Ye, C. Li, P. Xia, G. Zhang, Y. Tian, R. Chen, Z. Fan, The long noncoding RNA lncTCF7 promotes self-renewal of human liver cancer stem cells through activation of Wnt signaling. Cell Stem Cell 16, 413–425 (2015). Medline doi:10.1016/j.stem.2015.03.003

27. L. Aloia, B. Di Stefano, L. Di Croce, Polycomb complexes in stem cells and embryonic development. Development 140, 2525–2534 (2013). Medline doi:10.1242/dev.091553

28. M. Costantini, O. Clay, F. Auletta, G. Bernardi, An isochore map of human chromosomes. Genome Res. 16, 536–541 (2006). Medline doi:10.1101/gr.4910606

29. J. R. Dixon, S. Selvaraj, F. Yue, A. Kim, Y. Li, Y. Shen, M. Hu, J. S. Liu, B. Ren, Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012). Medline doi:10.1038/nature11082

30. A. Valouev, S. M. Johnson, S. D. Boyd, C. L. Smith, A. Z. Fire, A. Sidow, Determinants of nucleosome organization in primary human cells. Nature 474, 516–520 (2011). Medline doi:10.1038/nature10002

31. D. Tillo, T. R. Hughes, G+C content dominates intrinsic nucleosome occupancy. BMC Bioinformatics 10, 442 (2009). Medline doi:10.1186/1471-2105-10-442

32. J. D. Anderson, J. Widom, Poly(dA-dT) promoter elements increase the equilibrium accessibility of nucleosomal DNA target sites. Mol. Cell. Biol. 21, 3830–3839 (2001). Medline doi:10.1128/MCB.21.11.3830-3839.2001

33. A. E. Vinogradov, Bendable genes of warm-blooded vertebrates. Mol. Biol. Evol. 18, 2195–2200 (2001). Medline doi:10.1093/oxfordjournals.molbev.a003766

34. G. J. Hogan, C. K. Lee, J. D. Lieb, Cell cycle-specified fluctuation of nucleosome occupancy at gene promoters. PLOS Genet. 2, e158 (2006). Medline doi:10.1371/journal.pgen.0020158

35. H. Koohy, T. A. Down, T. J. Hubbard, Chromatin accessibility data sets show bias due to sequence specificity of the DNase I enzyme. PLOS ONE 8, e69853 (2013). Medline doi:10.1371/journal.pone.0069853

36. C. Dingwall, G. P. Lomonossoff, R. A. Laskey, High sequence specificity of micrococcal nuclease. Nucleic Acids Res. 9, 2659–2674 (1981). Medline doi:10.1093/nar/9.12.2659

37. P. Dalerba, T. Kalisky, D. Sahoo, P. S. Rajendran, M. E. Rothenberg, A. A. Leyrat, S. Sim, J. Okamoto, D. M. Johnston, D. Qian, M. Zabala, J. Bueno, N. F. Neff, J. Wang, A. A. Shelton, B. Visser, S. Hisamori, Y. Shimono, M. van de Wetering, H. Clevers, M. F. Clarke, S. R. Quake, Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nat. Biotechnol. 29, 1120–1127 (2011). Medline doi:10.1038/nbt.2038

38. K. Kise, Y. Kinugasa-Katayama, N. Takakura, Tumor microenvironment for cancer stem cells. Adv. Drug Deliv. Rev. 99 (Pt B), 197–205 (2016). Medline doi:10.1016/j.addr.2015.08.005

39. M. Shipitsin, L. L. Campbell, P. Argani, S. Weremowicz, N. Bloushtain-Qimron, J. Yao, T. Nikolskaya, T. Serebryiskaya, R. Beroukhim, M. Hu, M. K. Halushka, S. Sukumar, L. M. Parker, K. S. Anderson, L. N. Harris, J. E. Garber, A. L. Richardson, S. J. Schnitt, Y. Nikolsky, R. S. Gelman, K. Polyak, Molecular definition of breast tumor heterogeneity. Cancer Cell 11, 259–273 (2007). Medline doi:10.1016/j.ccr.2007.01.013

Page 50: Supplementary Materials for...Other Supplementary Material for this manuscript include s the following : ... Single-cell RNA-Seq data of GMB tumors was downloaded from GEO series GSE57872

50

40. A. Alonso, B. Breuer, H. Bouterfa, D. Doenecke, Early increase in histone H1(0) mRNA during differentiation of F9 cells to parietal endoderm. EMBO J. 7, 3003–3008 (1988). Medline

41. J. M. Terme, B. Sesé, L. Millán-Ariño, R. Mayor, J. C. Izpisúa Belmonte, M. J. Barrero, A. Jordan, Histone H1 variants are differentially expressed and incorporated into chromatin during differentiation and reprogramming to pluripotency. J. Biol. Chem. 286, 35347–35357 (2011). Medline doi:10.1074/jbc.M111.281923

42. E. Mendelson, M. Bustin, Monoclonal antibodies against distinct determinants of histone H5 bind to chromatin. Biochemistry 23, 3459–3466 (1984). Medline doi:10.1021/bi00310a012

43. A. Rada-Iglesias, R. Bajpai, T. Swigut, S. A. Brugmann, R. A. Flynn, J. Wysocka, A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011). Medline doi:10.1038/nature09692

44. A. Letourneau, F. A. Santoni, X. Bonilla, M. R. Sailani, D. Gonzalez, J. Kind, C. Chevalier, R. Thurman, R. S. Sandstrom, Y. Hibaoui, M. Garieri, K. Popadin, E. Falconnet, M. Gagnebin, C. Gehrig, A. Vannier, M. Guipponi, L. Farinelli, D. Robyr, E. Migliavacca, C. Borel, S. Deutsch, A. Feki, J. A. Stamatoyannopoulos, Y. Herault, B. van Steensel, R. Guigo, S. E. Antonarakis, Domains of genome-wide gene expression dysregulation in Down’s syndrome. Nature 508, 345–350 (2014). Medline doi:10.1038/nature13200

45. J. M. Simon, P. G. Giresi, I. J. Davis, J. D. Lieb, Using formaldehyde-assisted isolation of regulatory elements (FAIRE) to isolate active regulatory DNA. Nat. Protoc. 7, 256–267 (2012). Medline doi:10.1038/nprot.2011.444

46. L. Shen, N. Y. Shao, X. Liu, I. Maze, J. Feng, E. J. Nestler, diffReps: Detecting differential chromatin modification sites from ChIP-seq data with biological replicates. PLOS ONE 8, e65598 (2013). Medline doi:10.1371/journal.pone.0065598

47. R. Blecher-Gonen, Z. Barnett-Itzhaki, D. Jaitin, D. Amann-Zalcenstein, D. Lara-Astiaso, I. Amit, High-throughput chromatin immunoprecipitation for genome-wide mapping of in vivo protein-DNA interactions and epigenomic states. Nat. Protoc. 8, 539–554 (2013). Medline doi:10.1038/nprot.2013.023

48. Y. Benjamini, T. P. Speed, Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Res. 40, e72 (2012). Medline doi:10.1093/nar/gks001

49. J. Zhu, M. Adli, J. Y. Zou, G. Verstappen, M. Coyne, X. Zhang, T. Durham, M. Miri, V. Deshpande, P. L. De Jager, D. A. Bennett, J. A. Houmard, D. M. Muoio, T. T. Onder, R. Camahort, C. A. Cowan, A. Meissner, C. B. Epstein, N. Shoresh, B. E. Bernstein, Genome-wide chromatin state transitions associated with developmental and environmental cues. Cell 152, 642–654 (2013). Medline doi:10.1016/j.cell.2012.12.033

50. W. A. Whyte, D. A. Orlando, D. Hnisz, B. J. Abraham, C. Y. Lin, M. H. Kagey, P. B. Rahl, T. I. Lee, R. A. Young, Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013). Medline doi:10.1016/j.cell.2013.03.035