semi-supervised enhancer prediction using the segway framework · buske_201003_encode_presentation...

Post on 23-Aug-2020

Semi-supervised enhancer prediction using the Segway framework

Orion J. Buske, Tzitziki Lemus, Michael M. Hoffman, Jeff A. Bilmes, William Stafford Noble

Department of Genome Sciences, University of Washington

[Figure: example segment label sequences (1 2 0 2 1 0 0 …) for unsupervised and semi-supervised segmentation]

[Figure labels: novel; known p300 peaks (Heintzman et al. 2009)]

[Figure: precision plotted against recall, with the "better" and "worse" directions marked]

Recall: fraction of p300 sites overlapped by predictions

Precision: fraction of predictions that overlap p300 sites
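The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: intervals are half-open `(start, end)` pairs on a single chromosome, any overlap of at least one base counts, and the function names and toy coordinates are hypothetical rather than taken from Segway or Segtools.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome (assumption)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(queries: List[Interval], targets: List[Interval]) -> float:
    """Fraction of query intervals that overlap at least one target interval."""
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if any(overlaps(q, t) for t in targets))
    return hits / len(queries)


def precision_recall(predictions: List[Interval], p300_sites: List[Interval]):
    # Precision: fraction of predictions that overlap p300 sites.
    precision = fraction_overlapping(predictions, p300_sites)
    # Recall: fraction of p300 sites overlapped by predictions.
    recall = fraction_overlapping(p300_sites, predictions)
    return precision, recall


# Toy data, invented for illustration only.
predictions = [(100, 200), (500, 600), (900, 1000), (1500, 1600)]
p300_sites = [(150, 250), (950, 1050), (2000, 2100)]
precision, recall = precision_recall(predictions, p300_sites)
print(f"precision: {precision:.2f}  recall: {recall:.2f}")
```

The quadratic scan is fine for a toy example; for genome-scale data a sorted sweep or an interval tree would be the usual choice.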

[Figure: panels for CTCF, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me1, H3K27ac, H3K27me3, H3K36me3, H4K20me1, DNaseI and Pol2; legend: predicted, observed]

Higher: H3K4me1, H3K9me1, H3K36me3, H4K20me1, Input, BDP1, BRF1, GATA1, JunD

Lower: H3K4me3, DNaseI, CTCF, Pol2, TAF1

Example

semi-supervised label

precision: 0.27

recall: 0.56

P-SS

Segway hypothesizes more than one type of p300 site

[Figure: semi-supervised label P-SS (precision: 0.27, recall: 0.56) alongside unsupervised labels P-2 and P-3; scale from lower to higher]

P-SS: H3K4me3, TAF1, Pol2, H3K27ac, ZNF267

P-2: H3K4me3, TAF1, Pol3, H3K4me1, cFos

P-3: H3K4me3, TAF1, Pol2, H3K9ac, CTCF

Subtypes correspond to active/repressed chromatin states

[Figure labels: similar; P-2; P-3; P-SS]

Combined P-SS, P-2, P-3

precision: 0.21

recall: 0.91

With combined labels, we achieve comparable precision with excellent recall

[Figure labels: P-2; P-3; P-SS]
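One plausible way to combine the three labels is to pool their segments into a single prediction set. The sketch below does this and also merges overlapping or touching segments, which is an assumption: the slides do not say whether combined segments were merged, and merging changes the number of predictions and therefore the precision denominator. The dictionary contents and function name are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome (assumption)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Union of intervals; overlapping or touching intervals become one."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy segments per label, invented for illustration only.
segments_by_label: Dict[str, List[Interval]] = {
    "P-SS": [(100, 200), (900, 1000)],
    "P-2": [(180, 260), (400, 450)],
    "P-3": [(1000, 1100)],
}

pooled = [seg for segs in segments_by_label.values() for seg in segs]
combined = merge_intervals(pooled)
print(combined)
```

Pooling can only add predictions, so recall cannot decrease relative to any single label, while precision may move in either direction.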

At least two segments within 1 kb (P-SS, P-2, P-3)

precision: 0.31

recall: 0.77

With multiple predicted sites in close proximity, we improve precision with good recall

[Figure labels: P-2; P-3; P-SS]
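The proximity criterion can be sketched as a filter that keeps a predicted segment only if another predicted segment lies within 1 kb of it. How distance was measured on the slides is not stated; this sketch assumes the gap between the nearest segment edges, with overlapping segments counting as distance 0. The names `gap` and `with_neighbor` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome (assumption)


def gap(a: Interval, b: Interval) -> int:
    """Distance in bp between two intervals; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def with_neighbor(segments: List[Interval], max_gap: int = 1000) -> List[Interval]:
    """Keep segments that have at least one other segment within max_gap bp."""
    kept = []
    for i, seg in enumerate(segments):
        has_neighbor = any(
            j != i and gap(seg, other) <= max_gap
            for j, other in enumerate(segments)
        )
        if has_neighbor:
            kept.append(seg)
    return kept


# Toy pooled segments from P-SS, P-2 and P-3, invented for illustration only.
segments = [(100, 200), (700, 800), (5000, 5100), (9000, 9100), (10200, 10300)]
clustered = with_neighbor(segments)
print(clustered)
```

In the toy data the first two segments are 500 bp apart and are kept, while the last two are 1100 bp apart and are dropped along with the isolated middle segment.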

Acknowledgements

ENCODE Project Consortium

NHGRI

Availability

Segway: http://noble.gs.washington.edu/proj/segway

Segtools: http://noble.gs.washington.edu/proj/segtools

[Figure: panels for CTCF, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me1, H3K27ac, H3K27me3, H3K36me3, H4K20me1, DNase and Pol2; legend: predicted (TP), observed (TP), false negative]

False negatives have higher mean signal than true positives

Lower: GATA1

Higher: H3K4me3, Pol2, Pol3, TAF1

Example
