DNA methylation, SNP, and gene expression integrated analysis in 30 breast cancer cell lines
Heejoon Chae
Bioinformatics Institute, Seoul National University, Seoul, Korea
October, 2016
Contents
} Introduction
  } Epigenetics in breast cancer
  } Motivation & Goal
  } Biological backgrounds
} Experiment design
  } Experiment data
  } Analysis workflow
} Analysis results
  } Region-specific average methylation pattern
  } CpG island shore methylation
  } Region-specific differential methylation across the tumor subtypes
  } Mutation rate difference
  } Tumor subtype-specific mutations
} Summary
} Acknowledgements
Introduction
Introduction: Epigenetic events
} Epigenetic events are those that control activation and suppression of genetic elements
} The transcription of a single gene is influenced by
  } Transcription factor binding
  } DNA methylation in transcription factor binding sites
  } miRNA targeting mechanism
} Multi-level biological causal relationship
  } Complex analysis process
  } Requires harmony of computational and biological knowledge
Introduction: Epigenetic integrated analysis
} Integrated analysis of DNA methylation, SNPs, gene expression, and TFs is challenging because it involves:
  1. DNA methylation level estimation (genome-wide profiling, DMR estimation)
  2. mRNA expression quantification (alignment, mismatches)
  3. Transcription factor binding (TFBS databases, TF binding prediction, promoters)
  4. Mutations in promoters and their association with methylation and binding (variant calling)
  5. Causality of abnormal gene expression (TF binding blocked by methylation, negative correlation between DNA methylation and gene expression, TF expression, CpG islands)
[Figure: methylation vs. gene expression, normal vs. disease]
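As a rough illustration of step 1 in the list above (methylation level estimation from affinity-based sequencing reads), the sketch below counts reads in fixed-width genomic bins and scales the counts to reads per million. The function name `binned_coverage`, the 100 bp bin size, and the read positions are hypothetical; the slides do not state how the study estimated methylation levels.

```python
from collections import Counter


def binned_coverage(read_starts, bin_size=100, total_reads=None):
    """Count reads per fixed-width bin and scale to reads per million (RPM).

    read_starts: 0-based start positions of reads on one chromosome.
    total_reads: library size used for scaling; defaults to len(read_starts).
    Returns a dict mapping bin start coordinate -> RPM, sorted by position.
    """
    if total_reads is None:
        total_reads = len(read_starts)
    counts = Counter(pos // bin_size for pos in read_starts)
    scale = 1_000_000 / total_reads if total_reads else 0.0
    return {idx * bin_size: n * scale for idx, n in sorted(counts.items())}


# Hypothetical read start positions (not data from the study).
reads = [5, 42, 99, 150, 151, 420]
coverage = binned_coverage(reads, bin_size=100)
for bin_start, rpm in coverage.items():
    print(bin_start, round(rpm, 1))
```

Scaling by library size keeps bins comparable between samples sequenced to different depths, which matters when cell lines are compared bin by bin.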
Introduction: Epigenetics in breast cancer
} Breast cancer
  } A heterogeneous disease
  } Molecular subtypes can be distinguished based on their distinct genomic, transcriptomic, and epigenomic profiles.
} Epigenetics in cancer
  } Aberrant epigenetic modifications, including DNA methylation, are known as key regulators of gene activity in tumorigenesis.
  } Many scientists believe that DNA methylation is not random and that an instructive mechanism is probably embedded in the genomic sequences [1].
[1] Keshet, I., et al.: Evidence for an instructive mechanism of de novo methylation in cancer cells. Nature Genetics 38(2), 149–153 (2006)
Introduction: Motivation & Goal
} Our motivation
  } To investigate whether there is any notable correlation between mutations (cancer subtype-specific genomic sequences), cancer subtype-specific methylation patterns, and abnormal gene expression.
} Our goal in this study
  } To look for any association between genome sequence differences, methylation patterns, and gene expression in the context of breast cancer.
} In this study, we used affinity-based methylation sequencing data from 30 breast cancer cell lines
  } Representing functionally distinct cancer subtypes
  } To investigate methylation and mutation patterns at the whole-genome level
Backgrounds: DNA methylation, SNP
} DNA methylation
  } Chemical modification of DNA
  } Addition of a methyl group to DNA -> no change to the original DNA sequence
  } Heritable and reversible
  } Mammals: 60-90% of all CpGs are methylated
  } Aberrant DNA methylation patterns are often observed in cancer
} SNP (Single Nucleotide Polymorphism)
  } A point mutation that often changes the amino acid sequence of a protein -> gene expression change
  } Important influence on epigenetic diseases such as cancer
Backgrounds: Transcription factor, CpG island
} Transcription factor (TF)
  } A protein that binds to a specific DNA region called a transcription factor binding site (TFBS)
  } A TF binding to a TFBS in a promoter initiates gene transcription
  } A single TF influences multiple genes, and a single gene is influenced by multiple TFs.
  } A gene regulatory network consists of TFs and their target genes
  } Plays a role in phenotypic differences
  } Often considered biomarkers controlling many diseases
} CpG island
  } A CpG island is a region with a high frequency of CpG sites
  } At least 200 bp long, with GC content over 50%
  } In mammalian genomes, about 40% of gene promoter regions are associated with CpG islands
  } In addition, most of the CpG sites in CpG islands are unmethylated
  } Aberrant DNA methylation in CpG islands, which blocks TF binding, is one of the well-known characteristics of genetic diseases
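The two CpG island criteria listed above (length of at least 200 bp and GC content over 50%) can be checked directly on a sequence. The sketch below is a minimal illustration; the function names and the test sequences are hypothetical, and commonly used CpG island definitions add further conditions that the slide does not list, so this should not be read as a complete island caller.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_count(seq):
    """Number of CpG dinucleotides (a C immediately followed by a G)."""
    return seq.upper().count("CG")


def meets_slide_criteria(seq, min_length=200, min_gc=0.5):
    """Apply only the two thresholds stated on the slide."""
    return len(seq) >= min_length and gc_content(seq) > min_gc


# Hypothetical sequences, for illustration only.
cpg_rich = "CG" * 100      # 200 bp, 100% GC
at_rich = "AT" * 100       # 200 bp, 0% GC
too_short = "CG" * 50      # 100 bp

print(meets_slide_criteria(cpg_rich), cpg_count(cpg_rich))
print(meets_slide_criteria(at_rich), cpg_count(at_rich))
print(meets_slide_criteria(too_short))
```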
Experiment design
Experiment design: Experiment data
} 30 ICBP breast cancer cell lines
  } 3 tumor subtypes (Lu, BaA, BaB)
  } MBDCap-seq data
  } Gene expression microarray
} Targeted bisulfite-treated sequencing data
  } 85 DEGs in 8 cell lines
} Raw data
  } DNA methylation: MBDseq, targeted BSseq
  } Gene expression: microarray
Data        Measure type     Data size      # samples  # reads per sample
MBDseq      DNA methylation  6 GB/sample    30         1200000
Microarray  Gene expression  12 MB/sample   30         N/A
BSseq       DNA methylation  1.5 GB/sample  8          37000000
Experiment design: Workflow
[Figure: analysis workflow]
Analysis results
Analysis results: Region-specific average methylation pattern in three tumor subtypes
[Figure: average methylation across gene body, exon, and CpG island regions for the Lu, BaA, and BaB subtypes]
Analysis results: CpG island shore methylation
} CpGI shore methylation is significantly different across tumor subtypes
  } Average methylation level in Basal B subtype
} Peak in CpGI shore also detected in normal cancer sample
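To make the term "CpGI shore" concrete, the sketch below labels a genomic position relative to a single CpG island. The distance cutoffs are an assumption: shores are conventionally taken as the 2 kb flanking an island and shelves as the next 2 kb, but the slides do not state which definition the study used. The function name and coordinates are hypothetical.

```python
def classify_position(pos, island_start, island_end, shore=2000, shelf=4000):
    """Label a position relative to one CpG island.

    shore and shelf are distances in bp from the island edge; the defaults
    follow a common convention and are not taken from the slides.
    """
    if island_start <= pos <= island_end:
        return "island"
    if pos < island_start:
        distance = island_start - pos
    else:
        distance = pos - island_end
    if distance <= shore:
        return "shore"
    if distance <= shelf:
        return "shelf"
    return "open sea"


# Hypothetical island spanning positions 10000-11000.
for position in (10500, 9000, 14000, 20000):
    print(position, classify_position(position, 10000, 11000))
```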
Analysis results: Region-specific average methylation pattern in three tumor subtypes
} Cancer-related genes CAV1, PTRF, GDF15, and TGFB1 show significantly differential methylation in CpGI shore regions
Analysis results: Verification of DNA methylation by targeted bisulfite sequencing
} CpGI shore methylation is verified by targeted BS-seq
} As with MBD-seq, CpGI shores are differentially methylated
Analysis results: Methylation status validation by BS-seq
Strong correlation (0.5 to 0.9) between targeted BS-seq and MBD-seq
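A correlation of this kind can be computed per region between the two platforms. The sketch below uses the Pearson coefficient, which is an assumption, since the slide does not say which correlation measure was used. The function name and the methylation values are hypothetical and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical per-region methylation levels from the two platforms.
bs_seq = [0.10, 0.35, 0.50, 0.80, 0.90]
mbd_seq = [0.15, 0.30, 0.55, 0.70, 0.95]
r = pearson(bs_seq, mbd_seq)
print(round(r, 3))
```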
Analysis results: Region-specific differential methylation between tumor subtypes
} Differentially methylated bins for each pair of tumor subtypes
  } Significance tested by t-test and adjusted with Bonferroni correction (adjusted P-value < 0.05)
  ⇒ Significant difference between Luminal and Basal B in intron and CpGI shore regions
} Ratio of hypomethylation in intron and CpGI shore regions
  ⇒ Significant hypomethylation in Basal B
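The per-bin test described above (a t-test followed by Bonferroni correction at an adjusted P-value below 0.05) can be sketched as follows. Several details are assumptions: the slide does not say whether the t-test assumed equal variances, so a pooled-variance Student's t-test is used here, and the p-value is obtained by numerically integrating the t density, which is adequate only for moderate |t|. All function names and methylation values are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the density on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def pooled_t_test(a, b):
    """Student's t-test with pooled variance (assumes non-constant data)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, two_sided_p(t, df)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical methylation levels per bin: (subtype 1 values, subtype 2 values).
bins = {
    "bin_1": ([0.80, 0.82, 0.79, 0.81], [0.30, 0.35, 0.28, 0.33]),
    "bin_2": ([0.50, 0.55, 0.45, 0.52], [0.51, 0.49, 0.53, 0.48]),
}
raw = [pooled_t_test(a, b)[1] for a, b in bins.values()]
adjusted = dict(zip(bins, bonferroni(raw)))
significant = [name for name, p in adjusted.items() if p < 0.05]
print(significant)
```

Bonferroni is conservative when many bins are tested, which is consistent with the slide reporting only the strongest regional differences.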
Analysis results: Mutation rate difference between tumor phenotypes
} Mutation is often considered an important signature in epigenetic diseases such as cancer.
  } Several drugs target these genetic alterations to recover or repress the functionality affected by mutations.
} The mutation rate pattern across methylation levels was significantly different between subtypes
  } At CpG sites displaying low and intermediate methylation, basal A showed a distinct and higher mutation rate
    } P-value = 1.1258 × 10^-2 by ANOVA test with Bonferroni correction
  } At highly methylated CpG sites, the mutation rate was significantly different (P-value = 6.84 × 10^-7 by ANOVA test with Bonferroni correction) for the basal B subtype
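The ANOVA comparison mentioned above rests on the one-way F statistic, the ratio of between-group to within-group variance. The sketch below computes only that statistic and its degrees of freedom; converting it to a p-value needs the F distribution, which would normally come from a statistics library. The function name and the mutation rates are hypothetical and are not data from the study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-cell-line mutation rates, grouped by subtype.
rates = {
    "Lu": [0.010, 0.012, 0.011],
    "BaA": [0.020, 0.022, 0.019],
    "BaB": [0.011, 0.013, 0.012],
}
f_stat, df_b, df_w = one_way_anova_f(list(rates.values()))
print(round(f_stat, 2), df_b, df_w)
```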
Analysis results: Tumor subtype-specific mutations in various genomic regions
} In CpGI regions (known as "methyl protected" and thus hypomethylated regions), including CpGI shores and shelves, basal A-specific mutations occurred most frequently
} Basal B-specific mutations were significantly more frequent in intron regions
⇒ The mutation rate difference may result from regional subtype-specific mutation occurrence and methylation differences across the subtypes
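One simple way to define subtype-specific mutations is with set operations: a mutation counts as specific to a subtype if it is seen in at least one cell line of that subtype and in no cell line of any other subtype. This definition is an assumption made for illustration; the slides do not state the rule the study applied. The cell line names, mutation tuples, and function name are hypothetical.

```python
def subtype_specific(mutations_by_line, subtype_of_line):
    """Map each subtype to the mutations found only in that subtype.

    mutations_by_line: dict of cell line -> set of (chrom, pos, alt) tuples.
    subtype_of_line: dict of cell line -> subtype label.
    """
    by_subtype = {}
    for line, mutations in mutations_by_line.items():
        by_subtype.setdefault(subtype_of_line[line], set()).update(mutations)

    specific = {}
    for subtype, mutations in by_subtype.items():
        others = set()
        for other, other_mutations in by_subtype.items():
            if other != subtype:
                others |= other_mutations
        specific[subtype] = mutations - others
    return specific


# Hypothetical cell lines and mutations.
mutations_by_line = {
    "line_a": {("chr1", 100, "T"), ("chr2", 500, "G")},
    "line_b": {("chr1", 100, "T"), ("chr3", 42, "A")},
    "line_c": {("chr2", 500, "G"), ("chr4", 7, "C")},
}
subtype_of_line = {"line_a": "Lu", "line_b": "BaA", "line_c": "BaB"}
result = subtype_specific(mutations_by_line, subtype_of_line)
print(result)
```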
Summary
} To see the association between DNA methylation, sequence variation, and gene expression in the context of breast cancer
  } Genome-wide DNA methylation pattern and correlation with gene expression
    } Region-specific average DNA methylation pattern in three tumor subtypes
    } Gene expression difference (DEG) influenced by methylation difference (DMR) between tumor phenotypes
    } Methylated transcription factor binding site and influenced gene expression
  } Mutation rate difference across the tumor subtypes
    } Tumor subtype-specific SNPs in various genomic regions
Acknowledgements
} Supported by
  } Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI15C3224)
  } Collaborative Genome Program for Fostering New Post-Genome Industry through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning (2014M3C9A3063541)
  } Bio & Medical Technology Development Program of the NRF, funded by the Ministry of Science, ICT & Future Planning (2012M3A9D1054622)
  } Funding from the Integrated Cancer Biology Program (ICBP) of the National Cancer Institute (NCI) (Awards CA13001)
Thank you