Measuring methylation: from arrays to sequencing Jovana Maksimovic, PhD [email protected] @JovMaksimovic github.com/JovMaksimovic Bioinformatics Winter School, 3 July 2017


Page 1: Talk outline - Bioinformaticsbioinformatics.org.au/ws17/wp-content/uploads/sites/13/2016/02/Jo… · Talk outline •Epigenetics •DNA methylation •Measuring DNA methylation •Methylation

Measuring methylation: from arrays to sequencing

Jovana Maksimovic, [email protected]

@JovMaksimovic

github.com/JovMaksimovic

Bioinformatics Winter School, 3 July 2017

Talk outline

• Epigenetics
• DNA methylation
• Measuring DNA methylation
• Methylation arrays
  • How do they work?
  • What do they measure?
  • Example analysis
• Methylation sequencing
  • What are the challenges?
  • How does it work?
  • Suggested analysis pipeline
• Summary

I work at MCRI

Me!

I mostly work on human development & disease…

…sometimes using mice or other models

I write software for analysing methylation array data: missMethyl

I analyse a lot of epigenetic data: ChIPseq, ATACseq, BSseq, Microarrays

…and a lot of gene expression data: RNAseq, Microarrays

What is epigenetics?

• Epigenetics refers to stable heritable traits not explained by changes in DNA sequence

• Greek prefix “epi” means “on top of” genetics

• Chromosome modifications that affect gene expression
  • Histones, DNA methylation
  • “Anything” that isn’t DNA!

• Essential for normal development

• Can be modified by environment

• Can be disrupted in disease

Epigenetics brings DNA to life!

[Figure: development from sperm and egg through zygote, blastocyst and embryogenesis to embryonic stem cells and specialised cell types: germ cell, sperm cell, haematopoietic stem cell, B cell, T cell, red blood cell, neuronal progenitor cell, neuron, astrocyte, fat cell, skin cell, muscle cell, gland cell, hormone-secreting cell, lung cell, kidney cell, intestine cell]

Identical DNA in every cell, different epigenetic patterns

• Important in all species

Modified from https://biology.mit.edu/research/stemcell_epigenetics

Epigenetics is CRAZY complicated!

• New sequencing & microarray technologies are enabling us to learn A LOT more about epigenetics

• Different data types need different analysis

• Today I’m only focussing on DNA methylation

Roy et al. (2010), Science


What is DNA methylation?

DNA methylation primarily occurs at CpG dinucleotides


Patterson et al. 2011, J Vis Exp

DNA methylation in the genome

• The human genome contains ~30,000,000 CpGs (~1%)

• VERY different between different species

• CpGs are not evenly spaced across the genome

• Tend to be present in clusters called CpG islands

• CpG methylation is spatially correlated

Eckhardt et al. 2007, Nature Genetics

~500bp

Methylation correlation with distance
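To make "counting CpGs" concrete, here is a minimal Python sketch. The function names `count_cpgs` and `cpg_fraction` are hypothetical, and the 18-base sequence is the toy genome reference used in the mapping slides later in this talk, not real genomic data.

```python
def count_cpgs(sequence: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G)."""
    return sequence.upper().count("CG")


def cpg_fraction(sequence: str) -> float:
    """Fraction of dinucleotide positions in the sequence that are CpGs."""
    n_dinucleotides = len(sequence) - 1
    if n_dinucleotides <= 0:
        return 0.0
    return count_cpgs(sequence) / n_dinucleotides


toy_reference = "CCGGCATGTTTAAACGCT"  # toy example, not real genomic data
print(count_cpgs(toy_reference))      # 2
print(cpg_fraction(toy_reference))
```

On a real genome the same count would be run per chromosome or per window, which is one simple way to see that CpGs cluster rather than spread evenly.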

Methylation can regulate gene expression

Plot from Peter Hickey

http://meeting.dxy.cn/oemethylation2012/article/i18782.html

Methylation at a single CpG vs. gene expression

Each point is one sample

Methylation changes coat colour of Agouti mice

Dolinoy 2008, Nutr Rev.

This gene controls coat colour in Agouti mice

These CpG sites in the promoter change PS1A expression depending on methylation

These mice are genetically identical

[Figure labels: Hypomethylated, Hypermethylated]

Coat colour different due to different maternal diet i.e. environment!

Cridge et al. 2015, Nutrients

Methylation makes worker bees!

These larvae are genetically identical

[Figure labels: Hypomethylated, Hypermethylated]

Methylation is cool

What do we usually want to know about it?

Finding methylation differences can tell us a lot

• Methylation is critical in determining cell type
  • Regulatory T-cell vs. Naïve T-cell

• Methylation can be disrupted in disease
  • Cancer vs. Normal

• Methylation is affected by the environment
  • Smokers vs. Non-smokers

Collect appropriate samples

Extract DNA and measure methylation

Statistical analysis

Normal

Cancer

Epigenome-wide association studies (EWAS)

• Similar to GWAS

• Compare lots of cases to lots of controls

• Often looking for small effects e.g. complex disease or environmental effects

• Need lots of samples
  • 100s or 1000s of cases & controls

https://en.wikipedia.org/wiki/Epigenome-wide_association_study_(EWAS)

How do we measure methylation?

• Bisulphite conversion
  • Create “SNPs”
  • Single nucleotide resolution
    • Array
    • Sequencing

• Enrichment of methylated DNA
  • Restriction enzymes
  • Affinity
  • Regional resolution
    • Array
    • Sequencing

What is bisulphite conversion?

• Chemical process

• Unmethylated Cs are converted to Us, which are read as Ts after PCR

• Methylated Cs are protected

• Creates “SNP”
  • Used to call methylation

PCR
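The conversion rule can be sketched in a few lines of Python. This is an illustration only: `bisulphite_convert` is a hypothetical helper, it models a single strand after PCR (so converted Cs already appear as Ts), and the fragment and methylated positions are toy values chosen to reproduce the example read shown in the mapping slides later in this talk.

```python
def bisulphite_convert(sequence: str, methylated_positions: set) -> str:
    """Simulate bisulphite conversion followed by PCR on one DNA strand.

    Unmethylated Cs are read as Ts; methylated Cs are protected and stay C.
    Positions are 0-based indexes into the sequence.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # unmethylated C: converted
        else:
            converted.append(base)  # methylated C or any other base: unchanged
    return "".join(converted)


fragment = "CCGGCATGTTTAAACGCT"  # toy genomic fragment
methylated = {1, 14}             # assumed: the Cs of the two CpGs are methylated
print(bisulphite_convert(fragment, methylated))  # TCGGTATGTTTAAACGTT
```

Comparing the output with the original fragment shows the C/T “SNPs”: every position where a C survived was methylated.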

Methylation arrays

What are they and how do they work?

Illumina Infinium HumanMethylation BeadChips

• Human only

• Gene biased; selected to be relevant to human development & disease
  • e.g. TSS, promoters, CpG islands, enhancers, ...

Array               Samples per chip   Unique CpG sites measured in each sample
27k array (2009)    12                 >27,000
450k array (2011)   12                 >450,000
850k array (2015)   8                  >850,000

Modified slide from Belinda Phipson

Methylation arrays are based on SNP array technology

• Methylation array “SNPs” (C/T) are created by bisulphite conversion

• Comparing the intensity of C/T gives the proportion of methylation at a single CpG

[Figure labels: What is this base? Measure fluorescence intensity]

What methylation values can we get?

• On an array, we measure methylation in a population of cells

• An individual cell can be 0, 0.5 or 1 at one CpG

• Across a population we get a continuous measurement in the range [0, 1]

[Figure: a CpG carrying 0, 1 or 2 methyl (CH3) groups scores 0, 0.5 or 1; a single sample contains many cells]
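The step from per-cell values to a continuous sample-level value is just an average. A minimal sketch, with a hypothetical function name and a made-up four-cell sample:

```python
def sample_methylation(cell_values):
    """Average per-cell methylation (each 0, 0.5 or 1) across a sample."""
    allowed = {0, 0.5, 1}
    if not cell_values:
        raise ValueError("need at least one cell")
    if any(value not in allowed for value in cell_values):
        raise ValueError("each cell must be 0, 0.5 or 1")
    return sum(cell_values) / len(cell_values)


cells = [0, 0.5, 1, 1]  # hypothetical sample of four cells
print(sample_methylation(cells))  # 0.625
```

With thousands of cells in a real sample, the average can take essentially any value between 0 and 1.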

Measures of methylation

• Arrays measure both methylated (C) and unmethylated (T) signal to get proportion of methylation at a CpG

Beta value:

$\beta = \frac{Meth}{Meth + Unmeth}$

Intuitive, easy to interpret, great for visualisation

M value:

$M = \log_2\left(\frac{Meth}{Unmeth}\right)$

Better statistical properties, recommended for statistical testing

Can convert between them via a logit transformation:

$M = \log_2\left(\frac{\beta}{1 - \beta}\right)$

[Figure: Beta value plotted against M value. Du et al. 2011, BMC Bioinformatics]
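The three formulas above translate directly into code. The sketch below uses hypothetical function names and made-up intensities; it follows the slide formulas exactly and does not add the stabilising offsets that array software often applies, so zero intensities would raise an error here.

```python
import math


def beta_value(meth: float, unmeth: float) -> float:
    """Beta = Meth / (Meth + Unmeth), a proportion between 0 and 1."""
    return meth / (meth + unmeth)


def m_value(meth: float, unmeth: float) -> float:
    """M = log2(Meth / Unmeth), unbounded and symmetric around 0."""
    return math.log2(meth / unmeth)


def beta_to_m(beta: float) -> float:
    """Logit (base 2) transformation from a Beta value to an M value."""
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Inverse of beta_to_m."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# Made-up intensities for one CpG in one sample.
print(beta_value(800, 200))  # 0.8
print(m_value(800, 200))     # 2.0
```

A Beta value of 0.5 corresponds to an M value of 0, and M values grow without bound as Beta approaches 0 or 1.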

What does the data look like?

Table of M-values (rows are CpG sites, columns are samples)

Sample A1   Sample A2   Sample A3   Sample B1   Sample B2   Sample B3
 0.213       0.221       0.311       0.123       0.216       0.198
-0.011       0.001      -0.016       2.011       2.002       2.702
 2.213       2.256       2.698       0.052       0.101       0.238
 4.567       5.231       4.982       4.152       6.216       4.698
-4.723      -3.459      -5.36       -5.763      -5.122      -4.998
-5.567      -4.666      -4.845      -4.522      -4.111      -3.245
 3.421       5.467       5.554       5.445       5.298       4.514
 2.981       3.345       3.512      -3.534      -4.311      -3.889
 3.792       2.987       3.324      -0.231      -0.066      -0.001
 …           …           …           …           …           …

Array analysis pipeline

QC: β density plots, control probes, MDS/clustering plots, …
• Remove bad samples and poor performing probes (CpGs)
• Software: minfi, methylumi, limma

Normalization: within and between arrays
• Transform data to remove unwanted variation
• Software: minfi, missMethyl, wateRmelon

Statistical testing for differential methylation, CpGs & regions
• Estimate means and variances and borrow information across probes
• Software: limma, bumphunter, DMRcate

Annotation to genes, gene set testing, visualization, …
• Think about biological interpretation
• Software: missMethyl, Gviz

Combine with other data types
• e.g. gene expression
• Software: GenomicRanges

[Figure: samples from individuals M28, M29 and M30; cell type labels include naive, activated and rTreg]

After QC, data exploration is your friend!

[Figure: two MDS plots, Dimension 1 vs. Dimension 2 and Dimension 3 vs. Dimension 4]

Clustering by individual and cell type

MDS plots showing largest sources of variation in the data

Statistical testing: Look for differences at single CpGs

Differential methylation

Phipson & Oshlack 2015, Genome Biology

moderated $t = \frac{|\bar{y}_{can} - \bar{y}_{norm}|}{\tilde{s}\sqrt{v}}$

$\tilde{s}^2$ is the empirical Bayes variance

Linear model: $y = X\beta + \varepsilon$

Smyth, 2004

Adjust the p-values using Benjamini and Hochberg’s FDR

Can take into account any other covariates

One test per CpG!

Modified slide from Belinda Phipson & Alicia Oshlack

Lots of differences between immune cell types!
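Because there is one test per CpG, the p-values need adjusting for multiple testing. The sketch below is a plain-Python version of the Benjamini and Hochberg adjustment mentioned above; the function name and the example p-values are made up, and in practice the testing packages named earlier in the talk report adjusted p-values for you.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indexes of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values for four CpGs
print(benjamini_hochberg(raw))
```

CpGs whose adjusted p-value falls below the chosen FDR cut-off are then reported as differentially methylated.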

Statistical testing: Differences across CpG dense regions

• Recall: CpG methylation is spatially correlated

• Can we find consistent group-average level differences between CpGs that are close together?

• More functionally relevant than differences at individual CpGs?

Aryee et al. 2014, Bioinformatics

Lots of DMRs between immune cell types!

You can do other cool stuff!

• Unmethylated regions in rTreg compared to naïve cells enriched for FOXP3 binding motifs!

[Figure: Forkhead-binding motif compared with the consensus motif from DMR seqs.]

DMR consensus motif matches Forkhead-binding motif

Differences in cell types controlled by FOXP3!

Modified slide from Alicia Oshlack

Methylation array analysis is very mature: lots of methods!

https://www.bioconductor.org/

https://f1000research.com/articles/5-1281/v3

Methylation sequencing

AKA bisulphite sequencing: the good, the bad and the ugly

Two main types of bisulphite sequencing

• Whole-genome bisulphite sequencing (BS-seq)
  • Gold standard
  • Genome-wide (~30,000,000 CpGs in human)
  • Expensive but covers almost everything
  • Need high (10-30x) coverage to reliably call methylation

• Targeted BS-seq
  • Only sequence regions of interest
  • Reduced representation BS-seq (restriction enzyme)
  • Capture BS-seq (similar principle to exome)
  • Cheaper but can miss a lot of stuff
  • Can usually do higher (20-60x) coverage

What was bisulphite conversion again?

DNA fragment

All four of these can be sequenced!

What are the challenges?

• Like calling SNPs, methylation in BS-seq is inferred by comparison to an unconverted reference sequence

• Correct alignment is critical

• More challenging than usual!
  • Aligned sequences do not exactly match reference

• Complexity of libraries is reduced
  • Many Cs become Ts, so less info for mapping!

• Methylation is not symmetrical
  • Two strands of DNA in the reference genome must be considered separately

Mapping (Bismark)

DNA fragment → BS conversion & PCR

Mapping (Bismark)

DNA fragment → BS conversion & PCR

Bisulphite-converted read: TCGGTATGTTTAAACGTT

Mapping (Bismark)

Read: TCGGTATGTTTAAACGTT

In silico read conversion
C-to-T: TTGGTATGTTTAAATGTT
G-to-A: TCAATATATTTAAACATT
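The in silico read conversion is a simple character substitution. A minimal sketch, with hypothetical helper names; it illustrates the idea on the example read from this slide and is not Bismark's own code:

```python
def c_to_t(sequence: str) -> str:
    """Fully convert a sequence by replacing every C with T."""
    return sequence.upper().replace("C", "T")


def g_to_a(sequence: str) -> str:
    """Fully convert a sequence by replacing every G with A."""
    return sequence.upper().replace("G", "A")


read = "TCGGTATGTTTAAACGTT"  # example read from the slide
print(c_to_t(read))  # TTGGTATGTTTAAATGTT
print(g_to_a(read))  # TCAATATATTTAAACATT
```

Converting every remaining C (or G) removes the methylation-dependent differences, so the read can be aligned without knowing which Cs were methylated.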

Mapping (Bismark)

Align to in silico bisulphite converted genome

Fwd strand C-to-T converted genome:
…TTGGTATGTTTAAATGTT…
…AACCATACAAATTTACAA… (reverse complement)

Fwd strand G-to-A converted genome:
…CCAACATATTTAAACACT…
…GGTTGTATAAATTTGTGA… (reverse complement)

Mapping (Bismark)

[Figure: the G-to-A converted read TCAATATATTTAAACATT aligned against each of the four converted genome strands; x marks a mismatch]

Read all alignment outputs simultaneously to determine if sequence can be mapped uniquely

Mapping (Bismark)

[Figure: the C-to-T converted read TTGGTATGTTTAAATGTT is also aligned against each of the four converted genome strands; x marks a mismatch]

The C-to-T converted read matches the forward strand C-to-T converted genome (…TTGGTATGTTTAAATGTT…) exactly.
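The whole mapping idea can be sketched with exact string matching. This is a simplified illustration under stated assumptions: the helper names are hypothetical, only perfect matches are considered, and the real aligner uses a proper short-read aligner with mismatch handling rather than `str.find`. The read and genome are the toy sequences from these slides.

```python
def c_to_t(sequence: str) -> str:
    return sequence.upper().replace("C", "T")


def g_to_a(sequence: str) -> str:
    return sequence.upper().replace("G", "A")


def reverse_complement(sequence: str) -> str:
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(target: str, query: str) -> list:
    """Return every start position at which query occurs in target."""
    positions = []
    start = target.find(query)
    while start != -1:
        positions.append(start)
        start = target.find(query, start + 1)
    return positions


def candidate_alignments(read: str, genome: str) -> list:
    """Exact matches of both converted reads against four converted strands."""
    reads = {"C-to-T read": c_to_t(read), "G-to-A read": g_to_a(read)}
    ct_genome, ga_genome = c_to_t(genome), g_to_a(genome)
    targets = {
        "C-to-T genome, forward": ct_genome,
        "C-to-T genome, reverse complement": reverse_complement(ct_genome),
        "G-to-A genome, forward": ga_genome,
        "G-to-A genome, reverse complement": reverse_complement(ga_genome),
    }
    hits = []
    for read_name, read_seq in reads.items():
        for target_name, target_seq in targets.items():
            for position in find_all(target_seq, read_seq):
                hits.append((read_name, target_name, position))
    return hits


def is_uniquely_mapped(read: str, genome: str) -> bool:
    """A read is kept only if exactly one alignment is found overall."""
    return len(candidate_alignments(read, genome)) == 1


genome = "CCGGCATGTTTAAACGCT"  # toy genome reference from the slides
read = "TCGGTATGTTTAAACGTT"    # example bisulphite-converted read
print(candidate_alignments(read, genome))
print(is_uniquely_mapped(read, genome))  # True
```

Once the unique location is known, the original (unconverted) read is compared with the original reference at that location to call methylation.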

Calling methylation

[Figure: bisulphite-converted reads piled up on the genome reference …CCGGCATGTTTAAACGCT…; at each CpG, methylation is the percentage of covering reads that still show a C]

$\frac{2}{10} \times 100 = 20\%$

$\frac{8}{10} \times 100 = 80\%$
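The arithmetic on this slide can be sketched as a small pileup counter. The function names are hypothetical, the ten reads are made up to give 8/10 and 2/10 (they are not the reads in the figure), and only forward-strand reads aligned at a known offset are handled.

```python
def cpg_positions(reference: str) -> list:
    """0-based positions of the C in every CpG of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def call_methylation(reference: str, reads: list) -> dict:
    """Count C (methylated) and T (unmethylated) reads at each CpG.

    reads is a list of (offset, sequence) pairs, where offset is the
    0-based reference position of the first base of the read.
    Returns {position: (methylated, unmethylated, percent or None)}.
    """
    calls = {}
    for position in cpg_positions(reference):
        methylated = unmethylated = 0
        for offset, sequence in reads:
            index = position - offset
            if 0 <= index < len(sequence):
                if sequence[index] == "C":
                    methylated += 1
                elif sequence[index] == "T":
                    unmethylated += 1
        total = methylated + unmethylated
        percent = 100 * methylated / total if total else None
        calls[position] = (methylated, unmethylated, percent)
    return calls


reference = "CCGGCATGTTTAAACGCT"
# Hypothetical pileup of ten full-length forward-strand reads.
reads = (
    [(0, "TCGGTATGTTTAAACGTT")] * 2    # both CpGs methylated
    + [(0, "TCGGTATGTTTAAATGTT")] * 6  # only the first CpG methylated
    + [(0, "TTGGTATGTTTAAATGTT")] * 2  # neither CpG methylated
)
print(call_methylation(reference, reads))
```

With few reads covering a CpG the percentage can only take a handful of values, which is why coverage matters so much for reliable calls.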

Calling methylation

[Figure: the same region covered by only a few reads against the genome reference …CCGGCATGTTTAAACGCT…]

Good coverage is very important for reliable methylation calls!

Some real BS-seq mapping results

https://software.broadinstitute.org/software/igv/interpreting_bisulfite_mode

Methylation calling output

Position of C in genome      % methylation   No. methylated reads   No. unmethylated reads
chr1   753479   753479       50              1                      1
chr1   753492   753492       66.67           2                      1
chr1   753540   753540       100             1                      0
chr1   753541   753541       50              1                      1
chr1   753667   753667       25              1                      3
chr1   753724   753724       66.67           2                      1
chr1   753763   753763       0               0                      2
chr1   753785   753785       0               0                      1
chr1   759932   759932       100             1                      0
chr1   760913   760913       0               0                      1
chr1   761299   761299       100             2                      0
chr1   761371   761371       80              8                      2
chr1   761377   761377       100             10                     0
chr1   761446   761446       92.86           13                     1
chr1   761460   761460       53.85           7                      6
chr1   762005   762005       100             1                      0
chr1   762114   762114       0               0                      5
chr1   762176   762176       0               0                      7
chr1   762180   762180       0               0                      8

No. methylated reads + No. unmethylated reads: sum for total coverage

$80\% = \frac{8}{8+2} \times 100$

This is what we work with!
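A file in this six-column layout is easy to read with plain Python. The sketch below is illustrative only: the class and function names are hypothetical, fields are assumed to be whitespace-separated as shown above, and the minimum coverage of 10 is an assumed threshold, not a recommendation from the talk.

```python
from typing import NamedTuple


class MethylationCall(NamedTuple):
    chrom: str
    start: int
    end: int
    percent: float
    methylated: int
    unmethylated: int

    @property
    def coverage(self) -> int:
        """Total reads covering the site (methylated + unmethylated)."""
        return self.methylated + self.unmethylated


def parse_call(line: str) -> MethylationCall:
    """Parse one line: chrom, start, end, % methylation, meth, unmeth."""
    chrom, start, end, percent, meth, unmeth = line.split()
    return MethylationCall(chrom, int(start), int(end),
                           float(percent), int(meth), int(unmeth))


def recomputed_percent(call: MethylationCall) -> float:
    """Recompute % methylation from the two read counts."""
    return round(100 * call.methylated / call.coverage, 2)


def filter_by_coverage(calls, min_coverage: int):
    """Keep only sites with at least min_coverage reads."""
    return [call for call in calls if call.coverage >= min_coverage]


lines = [
    "chr1 753479 753479 50 1 1",
    "chr1 761371 761371 80 8 2",
    "chr1 761446 761446 92.86 13 1",
    "chr1 762180 762180 0 0 8",
]
calls = [parse_call(line) for line in lines]
for call in filter_by_coverage(calls, min_coverage=10):  # assumed threshold
    print(call.chrom, call.start, call.coverage, recomputed_percent(call))
```

Most of the rows shown on this slide have a coverage of only one to three reads, so a filter like this would discard them before any statistical testing.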

Krueger et al. 2012, Nature Methods

Analysis pipeline

Thorough QC is VERY important for BS-seq

Need to be brutal with trimming off poor quality bases…

…and adapters

As with SNP calling, removing PCR duplicates is a good idea for better methylation calling

Other stuff to find cool biology!

Summary

• Methylation arrays very popular
  • Only for human
  • Great for EWAS
  • Analysis very mature
  • Bioconductor is the place to go!

• BS-seq best option for genome-wide single nucleotide resolution
  • Only option for species other than human
  • Pre-processing, mapping, etc. pretty good
  • Statistical analysis still developing
  • Bioconductor is a valuable resource
  • Downstream analysis dependent on biological question

• Methylation is interesting & we know how to measure it
  • Best technology for the job depends on what you want to know!

Acknowledgments

Murdoch Childrens Research Institute

• Alicia Oshlack

• Belinda Phipson

• MCRI Bioinformatics group!

Johns Hopkins University

• Peter Hickey

missMethyl: https://www.bioconductor.org/packages/release/bioc/html/missMethyl.html

[email protected] @JovMaksimovic

github.com/JovMaksimovic

https://f1000research.com/articles/5-1281/v3