Genome-wide DNA methylation analysis Bi-Qing Li Key Laboratory of Systems biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences

Upload: ann-waters

Post on 18-Dec-2015

Page 1:

Genome-wide DNA methylation analysis

Bi-Qing Li

Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences

Page 2:

Outline

Background

Method to distinguish 5mC

Array based genome-wide DNA methylation analysis

NGS based genome-wide DNA methylation analysis

Third generation sequencing based genome-wide DNA methylation analysis

Illumina BS-seq data manipulation

Page 4:

Background

DNA methylation is the main covalent chemical modification of DNA and is involved in a variety of biological processes, including embryogenesis and development, silencing of transposable elements, regulation of gene transcription, and tumorigenesis and tumor progression.

The methylation pattern of DNA is highly variable among cell types and developmental stages and is influenced by disease processes and genetic factors, which poses considerable theoretical and technological challenges for its comprehensive analysis.

Recently, various high-throughput approaches have been developed and applied to the genome-wide analysis of DNA methylation, providing quantitative DNA methylation data at single-base-pair resolution with genome-wide coverage.

Genes 2010, 1(1), 85-101; doi:10.3390/genes1010085

Page 6:

Method to distinguish 5mC

Biotechniques. 2010 Oct;49(4):iii-xi

Page 7:

Restriction endonuclease-based analysis

Pu: A or G; mC: 5-methylcytosine, 5-hydroxymethylcytosine, or N4-methylcytosine. These half-sites can be separated by up to 3 kb, but the optimal separation is 55-103 base pairs.

Cut unmethylated DNA

Regardless of methylation

Cut unmethylated DNA

Partially affected by CpG methylation

Cut methylated DNA

isoschizomer

neoschizomer

Biotechniques. 2010 Oct;49(4):iii-xi

Page 8:

Restriction endonuclease-based analysis

Methylation-sensitive restriction digestion followed by PCR across the restriction site is a very sensitive technique that is still used in some applications today.

This method is still applicable for some locus-specific studies that require linkage of DNA methylation information across multiple kilobases, either between CpGs or between a CpG and a genetic polymorphism.

It provides methylation data only at the restriction enzyme recognition sites or adjacent regions.

It is extremely prone to false-positive results caused by incomplete digestion for reasons other than DNA methylation.

Nat Rev Genet. 2010 Feb 2;11(3):191-203

Page 9:

Bisulfite conversion of DNA

Proc Natl Acad Sci U S A. 1992 Mar 1;89(5):1827-31.

Bisulfite conversion

PCR

Page 10:

Bisulfite conversion of DNA

Single base pair resolution, no bias

DNA degradation by high temperature and low pH

Incomplete conversion of unmethylated cytosine

High GC density regions

Protected by histones

Stable secondary structure elements

Reduced complexity of genome, greater sequence redundancy, decreased hybridization specificity

Difficult to map (repetitive regions)

Genes 2010, 1(1), 85-101; doi:10.3390/genes1010085
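
The effect of bisulfite conversion followed by PCR can be sketched in silico. This is a minimal illustration, not part of the original slides: the function name `bisulfite_convert` and the example sequence are hypothetical, and only one strand is modelled. Unmethylated C reads out as T, while methylated C is protected and stays C.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment plus PCR on a single DNA strand.

    seq: DNA sequence (5'->3').
    methylated_positions: 0-based positions of methylated cytosines (assumed known).
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")  # unmethylated C -> U -> read as T after PCR
        else:
            converted.append(base)  # methylated C and all other bases are unchanged
    return "".join(converted)


original = "ACGTCCGA"
# Assume only the C at position 1 is methylated.
converted = bisulfite_convert(original, methylated_positions={1})
print(original, "->", converted)
```

A fully unmethylated sequence loses all of its C's on this strand, which is the reduced complexity and greater redundancy that the slide lists as a mapping difficulty.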

Page 11:

Immunoprecipitation-based methods

Methylated DNA immunoprecipitation (MeDIP-seq)

Antibody recognizes 5mC to pull down the methylated fraction of the genome

More sensitive to highly methylated, intermediate-CpG density regions

Methyl-binding domain protein (MBD-seq)

Uses the affinity of the methyl-binding proteins MeCP2 or MBD2 for methylated CpGs

More sensitive to highly methylated, high-CpG density regions

Methods. 2010 Nov;52(3):203-12

Page 12:

Immunoprecipitation-based methods

Straightforward, and the data are relatively easy to analyze

Bias associated with CpG density, which needs adjustment

High (MBD) or intermediate (MeDIP) CpG density regions will be interpreted as "more methylated" than equally methylated low-CpG density regions

Low resolution; does not yield information on individual CpG dinucleotides

Methods. 2010 Nov;52(3):203-12

Page 14:

Array-based genome wide DNA methylation analysis & restriction endonuclease

Digestion of one pool of genomic DNA with a methylation-sensitive restriction enzyme and mock digestion of another pool, or using two different enzymes

Two DNA pools are amplified and labelled with different fluorescent dyes for two-color array hybridization

Nat Rev Genet. 2010 Feb 2;11(3):191-203

Page 15:

Array-based genome wide DNA methylation analysis & restriction endonuclease

Comprehensive high-throughput arrays for relative methylation (CHARM)

McrBC is used to fractionate unmethylated DNA

Label methyl-depleted DNA with Cy5 and total DNA with Cy3

Hybridized on high density arrays

Genome Res. 2008 May;18(5):780-90

Cut methylated DNA

Page 16:

Array-based genome wide DNA methylation analysis & restriction endonuclease

Digestion of genomic DNA with HpaII and MspI

Ligation-mediated PCR for the amplification of HpaII or MspI genomic restriction fragments

Label HpaII-amplified fragments with Cy5 and MspI-amplified fragments with Cy3

Array hybridization

Genome Res. 2006 Aug;16(8):1046-55

HpaII tiny fragment enrichment by ligation-mediated PCR (HELP)

Cut unmethylated DNA

Regardless of methylation

Page 17:

Array-based genome wide DNA methylation analysis & methylation immunoprecipitation

Enrichment of methylated fragments using 5mC antibody or the affinity of methyl-binding proteins

Input DNA and enriched DNA are labeled with different fluorescent dyes

Array hybridization

Nat Rev Genet. 2010 Feb 2;11(3):191-203

Page 18:

Array-based genome wide DNA methylation analysis & methylation immunoprecipitation

Methylated DNA immunoprecipitation

From Wikipedia, the free encyclopedia

Page 19:

Array-based genome wide DNA methylation analysis & bisulfite conversion

ILLUMINA® EPIGENETIC ANALYSIS

Page 20:

Array-based genome wide DNA methylation analysis & bisulfite conversion

27,578 CpG sites

14,495 protein-coding gene promoters

110 microRNA gene promoters

Nat Rev Genet. 2010 Feb 2;11(3):191-203

Page 21:

Array-based genome wide DNA methylation analysis & bisulfite conversion

Genome Res. 2006 Mar;16(3):383-93

Page 22:

Array-based genome wide DNA methylation analysis & bisulfite conversion

GoldenGate BeadArray: 1536 specific CpG sites in 371 genes

GoldenGate Methylation Cancer Panel I: 1505 CpG sites selected from 807 genes

Nat Rev Genet. 2010 Feb 2;11(3):191-203

Illumina® Epigenetics Analysis

Page 23:

Array-based genome wide DNA methylation analysis

Easy to perform such experiments

Easy to interpret data with many well-characterized software programs

Low resolution

Not easy to distinguish one repetitive element from another in a hybridization-based method

Not truly genome-wide

Page 25:

NGS based genome-wide DNA methylation analysis

Biotechniques. 2010 Oct;49(4):iii-xi

Page 26:

NGS based genome-wide DNA methylation analysis: Roche 454

Roche/454 massively parallel bisulfite pyrosequencing

Longer reads include more CpG sites, facilitating research on complex methylation patterns

Reads are aligned to the reference more easily and more accurately, especially in repetitive regions

Greater chance of covering genotype information (SNPs) adjacent to a cytosine

Relatively high sequencing cost

Higher error rates in calling runs of identical bases (homopolymers)

Genes 2010, 1(1), 85-101; doi:10.3390/genes1010085

Page 27:

NGS based genome-wide DNA methylation analysis: Illumina/Solexa

Methyl-seq

~100-350 bp

Illumina Genome Analyzer II

Genome Res. 2009 Jun;19(6):1044-56

Cut unmethylated DNA

Regardless of methylation

Page 28:

NGS based genome-wide DNA methylation analysis: Illumina/Solexa

Methyl-sensitive cut counting (MSCC)

Nat Biotechnol. 2009 Apr;27(4):361-8

The method is similar to Methyl-seq; however, sequencing of MspI libraries was reported to have little effect on the measurement of methylation and was omitted to reduce costs.

Genome Med. 2009 Nov 16;1(11):106

Page 29:

NGS based genome-wide DNA methylation analysis: Illumina/Solexa

Methyl-DNA immunoprecipitation sequencing (MeDIP-seq)

Methods. 2009 Mar;47(3):142-50

Page 30:

NGS based genome-wide DNA methylation analysis: Illumina/Solexa

Reduced representation bisulfite sequencing (RRBS)

Nucleic Acids Research, 2005, Vol. 33, No. 18

Nature. 2008 Aug 7;454(7205):766-70

Nat Methods. 2010 Feb;7(2):133-6

Illumina Genome Analyzer

Page 31:

NGS based genome-wide DNA methylation analysis: Illumina/Solexa

Bisulfite padlock probes (BSPPs)

Nat Biotechnol. 2009 Apr;27(4):353-60

Page 32:

NGS based genome-wide DNA methylation analysis: Illumina/Solexa

Bisulfite sequencing (BS-seq)

Nature. 2008 Mar 13;452(7184):215-9

Page 33:

NGS based genome-wide DNA methylation analysis: Illumina/Solexa

Cytosine methylome sequencing (MethylC-seq)

Cell. 2008 May 2;133(3):523-36

Nature. 2009 Nov 19;462(7271):315-22

Nature. 2011 Mar 3;471(7336):68-73

Page 35:

Third generation sequencing based genome-wide DNA methylation analysis: PacBio

Single-molecule, real-time sequencing (SMRT)

ZMW: zero mode waveguide

Nat Biotechnol. 2010 May;28(5):426-8

Page 36:

Third generation sequencing based genome-wide DNA methylation analysis: PacBio

Single-molecule, real-time sequencing (SMRT)

Nat Methods. 2010 Jun;7(6):461-5

Nat Methods. 2010 Jun;7(6):435-7

Page 37:

Third generation sequencing based genome-wide DNA methylation analysis-Oxford Nanopore

Oxford Nanopore Technologies

Nat Biotechnol. 2010 May;28(5):426-8

Page 39:

Illumina BS-seq data manipulation

FASTQ file format and PHRED score

Adaptor trimming with FASTX

Quality control with FastQC

Reads filter and trimming with FASTX

Reads mapping with Bismark

Basic analysis

Advanced analysis and application

Page 41:

Illumina BS-seq data manipulation

FASTQ file format

FASTQ has emerged as a common file format for sharing sequencing read data, combining both the sequence and an associated per-base quality score.

Nucleic Acids Research, 2010, Vol. 38, No. 6 1767–1771
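
To make the record layout concrete, here is a minimal reader sketch. It assumes the common four-lines-per-record layout (header starting with `@`, sequence, separator starting with `+`, quality string); the function name `read_fastq` and the example read are hypothetical, and line-wrapped FASTQ files are not handled.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch: " + header)
        yield header[1:], seq, qual


# A made-up single-record example held in memory instead of a file.
example = io.StringIO("@read1\nTTGGATTTGA\n+\nIIIIIHHH#5\n")
records = list(read_fastq(example))
print(records)
```

Each base has exactly one quality character, which is why the sequence and quality lines must have the same length.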

Page 42:

Illumina BS-seq data manipulation

PHRED score

Nucleic Acids Research, 2010, Vol. 38, No. 6 1767–1771

Nature. 2009 Nov 19;462(7271):315-22

Page 43:

Illumina BS-seq data manipulation

PHRED score

http://en.wikipedia.org/wiki/FASTQ_format#cite_note-Illumina_User_Guide_1.5-2
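
The PHRED score Q relates to the base-calling error probability P by Q = -10 log10(P), and FASTQ stores each score as a single ASCII character. The sketch below is an illustration with hypothetical function names. It assumes an ASCII offset of 33 (Sanger and Illumina 1.8+ encoding); Illumina pipelines 1.3 to 1.7 used an offset of 64, so the offset should be checked for each data set.

```python
import math


def phred_to_error_prob(q):
    """Error probability P for a PHRED score Q: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """PHRED score Q for an error probability P: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def decode_qualities(quality_string, offset=33):
    """Convert a FASTQ quality string to a list of integer PHRED scores."""
    return [ord(ch) - offset for ch in quality_string]


scores = decode_qualities("I5#")
print(scores, [phred_to_error_prob(q) for q in scores])
```

With offset 33, `I` decodes to Q40 (1 error in 10,000 calls) and `5` to Q20 (1 error in 100 calls).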

Page 45:

Illumina BS-seq data manipulation

Adaptor trimming with FASTX

Nature. 2009 Nov 19;462(7271):315-22

Page 46:

Illumina BS-seq data manipulation

Adaptor trimming with FASTX

http://hannonlab.cshl.edu/fastx_toolkit/index.html

Page 47:

Illumina BS-seq data manipulation

Adaptor trimming with FASTX

http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastx_clipper_usage
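
The idea behind 3' adaptor clipping can be sketched as follows. This is not how fastx_clipper is implemented; it is a simplified illustration that uses exact string matching only. The function name `clip_adapter`, the `min_overlap` parameter, and the adaptor sequence in the example are assumptions made for the sketch.

```python
def clip_adapter(seq, qual, adapter, min_overlap=5):
    """Remove a 3' adaptor (exact matches only) and everything after it.

    Returns the clipped (sequence, quality) pair; reads without a match
    are returned unchanged.
    """
    # Case 1: the whole adaptor occurs somewhere in the read.
    idx = seq.find(adapter)
    if idx == -1:
        # Case 2: only a prefix of the adaptor is present at the 3' end.
        longest = min(len(adapter), len(seq)) - 1
        for k in range(longest, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                idx = len(seq) - k
                break
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


# Illustrative adaptor prefix; use the adaptor of the actual library instead.
adapter = "AGATCGGAAGAGC"
print(clip_adapter("ACGTACGTAGATCGGAAG", "I" * 18, adapter))
```

Clipping the quality string together with the sequence keeps the record valid for the downstream filtering steps.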

Page 49:

Illumina BS-seq data manipulation

Quality control with FastQC

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

Page 50:

Illumina BS-seq data manipulation

Quality control with FastQC

Page 51:

Illumina BS-seq data manipulation

Quality control with FastQC

Page 53:

Illumina BS-seq data manipulation

Reads filter and trimming with FASTX

http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastq_quality_filter_usage

e.g.1 fastq_quality_filter -Q 33 -q 20 -p 100 -v -i input -o output

e.g.2 fastq_quality_filter -q 10 -p 100 -i /usr/local/data/GBS/OWB-RAD1.fastq -Q 33 | fastq_quality_filter -Q 33 -q 20 -p 80 -o OWB1-filt.fastq
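
The meaning of `-q` (minimum quality score to keep) and `-p` (minimum percent of bases that must have at least that quality) can be sketched as a read-level test. The function below is a hypothetical re-implementation for illustration only, assuming the offset-33 encoding selected by `-Q 33`.

```python
def passes_quality_filter(quality_string, min_quality=20, min_percent=100, offset=33):
    """Return True if at least min_percent of bases have quality >= min_quality."""
    scores = [ord(ch) - offset for ch in quality_string]
    if not scores:
        return False  # treat empty reads as failing
    n_good = sum(1 for s in scores if s >= min_quality)
    return 100.0 * n_good / len(scores) >= min_percent


# 'I' = Q40, '5' = Q20, '#' = Q2 with offset 33
print(passes_quality_filter("IIII5"))                  # every base is >= Q20
print(passes_quality_filter("IIII#"))                  # one base below Q20
print(passes_quality_filter("IIII#", min_percent=80))  # 4 of 5 bases pass
```

In e.g.2, the first pass requires every base to reach Q10 and the second pass requires 80% of bases to reach Q20.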

Page 54:

Illumina BS-seq data manipulation

Reads filter and trimming with FASTX

FASTQ quality trimmer

e.g.1 fastq_quality_trimmer -t 20 -l 35 -v -i input -o output
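
In the command above, `-t` is the quality threshold below which bases are trimmed from the end of the read, and `-l` is the minimum length a read must keep after trimming. A minimal sketch of that behaviour, with a hypothetical function name and an assumed offset-33 encoding, could look like this; it is not the tool's actual code.

```python
def trim_low_quality_tail(seq, qual, threshold=20, min_length=35, offset=33):
    """Trim bases with quality < threshold from the 3' end of a read.

    Returns the trimmed (sequence, quality) pair, or None if the read
    becomes shorter than min_length and should be discarded.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1  # drop one low-quality base from the 3' end
    if end < min_length:
        return None
    return seq[:end], qual[:end]


# Short made-up read, so a small min_length is used for the demonstration.
print(trim_low_quality_tail("ACGTACGT", "IIIIII##", threshold=20, min_length=5))
print(trim_low_quality_tail("ACGTACGT", "IIIIII##", threshold=20, min_length=7))
```

Trimming stops at the first base from the 3' end that meets the threshold, so low-quality bases in the middle of a read are left for the quality filter to handle.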

Page 56:

Illumina BS-seq data manipulation

Reads mapping with Bismark

Page 57:

Illumina BS-seq data manipulation

Reads mapping with Bismark

Bioinformatics. 2011 Jun 1;27(11):1571-2.

Page 58:

Illumina BS-seq data manipulation

Reads mapping with Bismark

Two computationally converted references

Bioinformatics. 2011 Jun 1;27(11):1571-2.
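
The principle of mapping against converted references can be sketched with a toy example. Both the read and the reference are converted in silico (C to T for one strand, G to A for the other), the converted read is located in the converted reference, and methylation is then called by comparing the original read with the original reference. The sketch is a simplification under stated assumptions: exact string search (`str.find`) stands in for the short-read aligner, only the C-to-T reference is searched, and all names and sequences are hypothetical.

```python
def c_to_t(seq):
    """Fully C->T converted version of a sequence."""
    return seq.upper().replace("C", "T")


def g_to_a(seq):
    """Fully G->A converted version of a sequence (for the opposite strand)."""
    return seq.upper().replace("G", "A")


def map_and_call(read, genome):
    """Locate a forward-strand bisulfite read and call each covered cytosine.

    Returns (start, [(position, is_methylated), ...]) or None if unmapped.
    """
    start = c_to_t(genome).find(c_to_t(read))  # exact match as a stand-in for alignment
    if start == -1:
        return None
    calls = []
    for offset, (read_base, ref_base) in enumerate(zip(read.upper(), genome.upper()[start:])):
        if ref_base == "C":
            if read_base == "C":
                calls.append((start + offset, True))   # protected, so methylated
            elif read_base == "T":
                calls.append((start + offset, False))  # converted, so unmethylated
    return start, calls


genome = "ACGTTCGGCA"
read = "ATGTTCGG"  # C at position 1 was converted, C at position 5 was protected
print(c_to_t(genome), g_to_a(genome))
print(map_and_call(read, genome))
```

Converting the read before the search is what removes the mismatch penalty that converted cytosines would otherwise incur against an unconverted reference.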

Page 59:

Illumina BS-seq data manipulation

Reads mapping with Bismark

Page 60:

Illumina BS-seq data manipulation

Reads mapping with Bismark

H = A, C or T

Page 61:

Illumina BS-seq data manipulation

Reads mapping with Bismark

H = A, C or T

Page 62:

Illumina BS-seq data manipulation

Reads mapping with Bismark

H = A, C or T

Page 63:

Illumina BS-seq data manipulation

Reads mapping with Bismark

Page 64:

Illumina BS-seq data manipulation

Reads mapping with Bismark

Page 65:

Illumina BS-seq data manipulation

Reads mapping with Bismark

chromosome  position  strand  context  mC  All C
1           468       +       CG       4   4
1           469       -       CG       5   6
1           470       +       CG       5   5
1           471       -       CG       7   7
1           7384      -       CHG      6   9
1           225896    -       CHH      4   16
1           771455    +       CHH      5   22
1           702235    +       CHG      2   12

H = A, C or T
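
The context column can be derived from the reference sequence by looking at the two bases that follow each cytosine. The function below is a hypothetical sketch for forward-strand cytosines only (reverse-strand cytosines would need the reverse complement); it returns None when the context cannot be determined, for example at the end of the sequence or next to an N.

```python
def cytosine_context(genome, pos):
    """Return 'CG', 'CHG' or 'CHH' for the forward-strand C at 0-based pos."""
    seq = genome.upper()
    if seq[pos] != "C":
        raise ValueError("position %d is not a cytosine" % pos)
    following = seq[pos + 1:pos + 3]  # up to two bases downstream
    if following[:1] == "G":
        return "CG"
    if len(following) < 2 or following[0] not in "ACT":
        return None  # too close to the end, or an ambiguous base
    if following[1] == "G":
        return "CHG"
    if following[1] in "ACT":
        return "CHH"
    return None


genome = "CGCAGCTTC"
for pos in (0, 2, 5, 8):
    print(pos, cytosine_context(genome, pos))
```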

Page 67:

Illumina BS-seq data manipulation

Basic analysis - Reads coverage

Page 68:

Illumina BS-seq data manipulation

Basic analysis - Reads depth

Page 69:

Illumina BS-seq data manipulation

Basic analysis - Reads depth percentage

Page 70:

Illumina BS-seq data manipulation

Basic analysis - Methylation level

\[
\text{methylation level} = \frac{\text{number of methylated reads}}{\text{number of methylated reads} + \text{number of unmethylated reads}}
\]

chromosome  position  strand  context  mC  All C  Methylation level
1           468       +       CG       4   4      100%
1           469       -       CG       5   6      83.3%
1           470       +       CG       5   5      100%
1           471       -       CG       7   7      100%
1           7384      -       CHG      6   9      66.7%
1           225896    -       CHH      4   16     25%
1           771455    +       CHH      5   22     22.7%
1           702235    +       CHG      2   12     16.7%

H = A, C or T
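
The methylation level column can be reproduced from the mC and All C counts, where All C is the sum of methylated and unmethylated reads covering the site. The sketch below uses the rows of the table on this slide; the function and variable names are hypothetical.

```python
def methylation_level(methylated_reads, total_reads):
    """Fraction of reads covering a cytosine that report it as methylated."""
    if total_reads == 0:
        return None  # no coverage, so the level is undefined
    return methylated_reads / total_reads


# (chromosome, position, strand, context, mC, All C) as listed on the slide
sites = [
    ("1", 468, "+", "CG", 4, 4),
    ("1", 469, "-", "CG", 5, 6),
    ("1", 470, "+", "CG", 5, 5),
    ("1", 471, "-", "CG", 7, 7),
    ("1", 7384, "-", "CHG", 6, 9),
    ("1", 225896, "-", "CHH", 4, 16),
    ("1", 771455, "+", "CHH", 5, 22),
    ("1", 702235, "+", "CHG", 2, 12),
]

levels = [round(100 * methylation_level(mc, all_c), 1) for *_, mc, all_c in sites]
print(levels)
```

Sites with low coverage give coarse estimates (4 of 4 reads is reported as 100%), so a minimum read depth is often applied before levels are compared.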

Page 71:

Illumina BS-seq data manipulation

Basic analysis - Methylation density

\[
\text{Absolute (mC)} = \frac{\text{number of calls of a given methylation type (mCG, mCHG, mCHH)}}{\text{bin size}}
\]

\[
\text{Relative methylation (mC/C)} = \frac{\text{number of calls of a given methylation type (mCG, mCHG, mCHH)}}{\text{total number of sites of the same type}}
\]

H = A, C or T
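
Both densities can be computed per genomic bin once the methylation calls have been assigned to bins. The sketch below is an illustration with hypothetical names; the bin size and the total number of CG sites in the bin are made-up values, and positions are binned by integer division.

```python
from collections import Counter


def bin_counts(positions, bin_size):
    """Count positions per bin, where bin index = position // bin_size."""
    return Counter(pos // bin_size for pos in positions)


def absolute_density(n_calls, bin_size):
    """Calls of one methylation type (mCG, mCHG or mCHH) per base of the bin."""
    return n_calls / bin_size


def relative_density(n_calls, n_sites):
    """Calls of one methylation type divided by all sites of the same type."""
    return n_calls / n_sites if n_sites else None


bin_size = 1000                      # assumed bin size
mcg_positions = [468, 469, 470, 471]  # methylated CG calls on chromosome 1
counts = bin_counts(mcg_positions, bin_size)
print(counts)
print(absolute_density(counts[0], bin_size))
print(relative_density(counts[0], n_sites=10))  # 10 CG sites in the bin is a made-up value
```

The relative measure corrects for how many sites of a given type a bin contains, so CG-poor and CG-rich bins can be compared.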

Page 73:

Illumina BS-seq data manipulation

Advanced analysis and application

DNA methylation and gene expression

DNA methylation is linked to gene silencing and is considered to be an important mechanism in the regulation of gene expression.

Gene expression

Gene expression microarray

RNA-seq

Page 74:

Illumina BS-seq data manipulation

Advanced analysis and application

DNA methylation and gene expression

Proximal TSS region (-150 bp to +150 bp around the TSS)

Promoter (1.5 kb upstream of the TSS)

Nature. 2009 Nov 19;462(7271):315-22
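
The two regions can be turned into genomic intervals for a given gene once its TSS position and strand are known. The sketch below is an assumption-laden illustration: the function name, the inclusive coordinate convention, and the example TSS are hypothetical, and "upstream" is taken relative to the direction of transcription.

```python
def tss_windows(tss, strand, proximal=150, promoter=1500):
    """Return ((proximal_start, proximal_end), (promoter_start, promoter_end)).

    Coordinates are on the forward strand, with start <= end.
    """
    proximal_window = (tss - proximal, tss + proximal)
    if strand == "+":
        promoter_window = (tss - promoter, tss)  # upstream is towards lower coordinates
    elif strand == "-":
        promoter_window = (tss, tss + promoter)  # upstream is towards higher coordinates
    else:
        raise ValueError("strand must be '+' or '-'")
    return proximal_window, promoter_window


print(tss_windows(10000, "+"))
print(tss_windows(10000, "-"))
```

Methylation levels averaged over these windows can then be compared with the expression of the corresponding gene; windows near a chromosome start may need clamping at the first base.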

Page 75:

Illumina BS-seq data manipulation

Advanced analysis and application

DNA methylation and gene expression

Genome Res. 2010 Mar;20(3):320-31.

Page 76:

Illumina BS-seq data manipulation

Advanced analysis and application

Differentially methylated regions (DMRs) and gene expression

DNA methylation at DNA–protein interaction sites

DNA methylation, miRNA, and histone modification……

Nature. 2009 Nov 19;462(7271):315-22

Genome Res. 2010 Mar;20(3):320-31.

Page 77:

Thank you!