methods for the detection of conservation of methylation ...marjoram... · methods for the...

19
Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your life more complicated)

Upload: others

Post on 19-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Methods for the detection of conservation of methylation

(R21, year 2)

(or, how statisticians can make your life more complicated)

Page 2: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Let’s detect conservation!

Darryl Shibata

1. The phenotype of a cell is determined by its epigenome.

2. Develop methods to rank genes/pathways/regions according to their degree of conservation of methylation status.

3. Thus, we will identify the genes/pathways most important for growth of an individual Colorectal cancer

4. (A small step towards) Personalized medicine!

Page 3: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

That sounds great, but…

1. What exactly do you mean by conservation? How do we measure it?

2. How does your technology work?

3. What kind of error rates do you get using your technology (Illumina Infinium MethylationEPIC BeadChip Kit)?

4. You need this by when?!?

Page 4: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

What does conservation mean?

Scherer, et al., Nucleic Acids Research, 2020, Vol. 48, No. 8

Page 5: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

What does conservation mean?

Scherer, et al., Nucleic Acids Research, 2020, Vol. 48, No. 8

Page 6: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Pooled Data

6

Samples from left and right side of tumor.

Our preliminary data are pooled, rather than cell-level. • Output = “beta values”. • After much QC, we can think of the beta value for a given

site as the “proportion of cells that are methylated at that site in that sample”.

Page 7: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Statistics to measure conservation

• Variance - by site or by region • Manhattan distance • Proportion of sites with extreme methylation

proportions ([0,0.2] or [0.8,1])

7

Page 8: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Overall strategy

• e.g., for a gene-based analysis: • Calculate the observed statistic value for

each gene • Rank genes according to those values • Pick-off the genes with highest(lowest)

value, as the most conserved.

8

Page 9: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

• Look for genes in which methylation is statistically significantly conserved across normal tissue.

• Belief: those genes are “important”. • ‘Validate’: Look at expression of those conserved genes

in the Expression Atlas database. Is it high? • Colon • Small intestine • Endometrium

Proof of concept / QC

Page 10: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Variance of Statistic

• Suppose n sites in the region/gene.

• Proportion of sites with extreme methylation proportions ([0,0.2] or [0.8,1]) • Binomial distribution: variance is O(1/n)

• Variance • Variance of variance:

10

=(n� 1)2µ4

n3� (n� 1)(n� 3)µ2

2

n3= O(1/n)

<latexit sha1_base64="uEADy/LAHmOHSF2KL6d+syM/X8o=">AAACKnicbZDLTgIxFIY7XhFvqEs3jcQEFuAMkKgLEtSNOzGRSwLDpFM60NDpTNqOCZnwPG58FTcsNMStD2K5LBA8SZs//3dO2vO7IaNSmebE2Njc2t7ZTewl9w8Oj45TJ6d1GUQCkxoOWCCaLpKEUU5qiipGmqEgyHcZabiDhylvvBIhacBf1DAkto96nHoUI6UtJ3VXbnsC4TjDc1a2U4BtP3JKo5h3iiOYg0tMX8XslBY6hTkvP2WsK551Umkzb84KrgtrIdJgUVUnNW53Axz5hCvMkJQtywyVHSOhKGZklGxHkoQID1CPtLTkyCfSjmerjuCldrrQC4Q+XMGZuzwRI1/Koe/qTh+pvlxlU/M/1oqUd2PHlIeRIhzPH/IiBlUAp7nBLhUEKzbUAmFB9V8h7iMdj9LpJnUI1urK66JeyFul/O1zKV25X8SRAOfgAmSABa5BBTyCKqgBDN7AB/gEX8a7MTYmxve8dcNYzJyBP2X8/AJycaOd</latexit>

Page 11: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Bootstrapping:

• For a gene with m CpG sites: • Construct a large number of ‘null genes’ that also

have m CpG sites. • Calculate statistic value for each null gene. • Rank the statistic value for the observed gene

within the set of values for null genes to get a p-value.

• Do this for all genes of all lengths, and look at those with lowest resulting p-values.

11

Page 12: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Software: Methcon5

12

JCO Clin Cancer Inform 4:100-107. © 2020 by American Society of Clinical Oncology

github.com/USCbiostats/MethCon5

Page 13: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

More complex models?

𝛼k: region-specific tendency to be methylated𝜌n: density of CpGs at nth CpG𝛽k: region-specific dependence on densitydn: distance between nth CpG and (n-1)th CpGγk:region-specificcorrelationparameter

Page 14: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Software: DNAMeda (Shiny App - Visualization)

Islands

Shores

Non-island

github.com/USCbiostats/DNAMeda

Page 15: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Software: mutational signatures

15

9

A Unifying Model to Test Difference?

Somatic mutations

pmsignature EstimatedProportions,

!"

Are !"different in

two groups?

e.g., Wilcoxon

Uncertainty in Proportions

HiLDA

HiLDA = “Hierarchical Latent Dirichlet Allocation”

!!~#$% &!", . . . , &!#, )!!"~#$% &"", . . . , &"#, )"

Page 16: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

HILDA

16github.com/USCbiostats/HiLDA

Page 17: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Software: iMutSig

17

Page 18: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

iMutSig - Shiny app.

18

Page 19: Methods for the detection of conservation of methylation ...Marjoram... · Methods for the detection of conservation of methylation (R21, year 2) (or, how statisticians can make your

Acknowledgements

19

USC: Emil Hvitfeldt, Kim Siegmund, Darryl Shibata, Zhi Yang

JH: Hari Easwaren, Tom Pisanic

And many thanks to ITCR