Modelling gene expression dynamics with Gaussian processes

Regulatory Genomics and Epigenomics, March 10th 2016

Magnus Rattray, Faculty of Life Sciences

University of Manchester

Talk Outline

Introduction to Gaussian process regression

Example 1. Hierarchical models: batches and clusters

Example 2. Branching models: perturbations and bifurcations

Example 3. Differential equations: Pol-II to mRNA dynamics

Gaussian processes: flexible non-parametric models

Probability distributions over functions

$f(t) \sim \mathcal{GP}(\mathrm{mean}(t), \mathrm{cov}(t, t'))$

Covariance function $k = \mathrm{cov}(t, t')$ defines typical properties:

- Static ... Dynamic
- Smooth ... Rough
- Stationary ... Non-stationary
- Periodic ... Chaotic

The covariance function has parameters tuning these properties

Bayesian Machine Learning perspective: Rasmussen & Williams, “Gaussian Processes for Machine Learning” (MIT Press, 2006)

Gaussian processes

Samples from a 25-dimensional multivariate Gaussian distribution:

[Figure: first panel, sampled values $f_n$ against index $n$ (1 to 25); second panel, a 25 × 25 matrix indexed by $n$ and $m$ with a colour scale from 0.2 to 0.8.]

$[f_1, f_2, \dots, f_{25}] \sim \mathcal{N}(0, C)$

Learning and Inference in Computational Systems Biology, MIT Press
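A minimal NumPy sketch of how samples like these can be drawn. The slide does not say which covariance matrix $C$ was used, so the squared-exponential form, the length scale and the jitter term below are illustrative assumptions.

```python
import numpy as np

def sq_exp_cov(x, lengthscale=5.0):
    """Squared-exponential covariance: C[n, m] = exp(-(x_n - x_m)^2 / l^2)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-(diff ** 2) / lengthscale ** 2)

rng = np.random.default_rng(0)
n = np.arange(1, 26, dtype=float)        # indices 1..25
C = sq_exp_cov(n)                         # assumed covariance, not taken from the slide

# Draw samples via a Cholesky factor; the jitter keeps the factorisation stable.
L = np.linalg.cholesky(C + 1e-6 * np.eye(len(n)))
samples = (L @ rng.standard_normal((len(n), 3))).T   # three draws of [f_1, ..., f_25]
print(samples.shape)
```

Neighbouring entries of each draw are strongly correlated, which is why the sampled vectors look like smooth curves when plotted against $n$.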

Gaussian processes

Take dimension →∞

[Figure: two panels of functions plotted over the interval −3 to 3.]

$f \sim \mathcal{GP}(0, k), \qquad k(t, t') = \exp\left(-\frac{(t - t')^2}{l^2}\right)$

Learning and Inference in Computational Systems Biology, MIT Press

Gaussian processes for inference: Bayesian Regression

[Figure: Bayesian regression with a Gaussian process; both axes span −2 to 2.]

Figures from James Hensman

Regression example

[Figure: regression example; both axes span −2 to 2.]

Figures from James Hensman
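The regression figures can be reproduced in spirit with the standard GP posterior equations. The sketch below assumes a zero-mean prior, the squared-exponential covariance from the earlier slide, Gaussian observation noise, and made-up training data; none of these settings are taken from the figures themselves.

```python
import numpy as np

def sq_exp(a, b, lengthscale=0.5):
    """k(t, t') = exp(-(t - t')^2 / l^2), evaluated for all pairs."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / lengthscale ** 2)

def gp_posterior(t_train, y_train, t_test, noise_var=0.01, lengthscale=0.5):
    """Posterior mean and covariance of f(t_test) given noisy observations."""
    K = sq_exp(t_train, t_train, lengthscale) + noise_var * np.eye(len(t_train))
    K_s = sq_exp(t_test, t_train, lengthscale)
    K_ss = sq_exp(t_test, t_test, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    v = np.linalg.solve(L, K_s.T)
    return K_s @ alpha, K_ss - v.T @ v

# Hypothetical toy data on the same range as the slide's axes.
t_train = np.array([-1.5, -0.5, 0.3, 1.2])
y_train = np.sin(2.0 * t_train)
t_test = np.linspace(-2.0, 2.0, 50)

mean, cov = gp_posterior(t_train, y_train, t_test)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))   # pointwise posterior uncertainty
```

Plotting `mean` with a band of `mean ± 2 * std` gives the familiar picture: the band narrows near observations and widens back towards the prior away from them.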

Ex1. Hierarchical models: batches and clusters

[Figure: two rows of four time-course panels; time axis 0 to 20, expression axis −2 to 2.]

Data from Kalinka et al. “Gene expression divergence recapitulates the developmental hourglass model” Nature 2010

Joint work with James Hensman and Neil Lawrence

Usual processing options for time course batches

- Lumped
- Averaged

[Figure: one panel per option; time axis 0 to 20, expression axis −2 to 2.]

Hierarchical Gaussian process

gene: $g(t) \sim \mathcal{GP}(0, k_g(t, t'))$

replicate: $f_i(t) \sim \mathcal{GP}(g(t), k_f(t, t'))$
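Marginalising the gene-level function $g$ gives a single joint Gaussian over all replicates, with covariance $k_g(t, t')$ between different replicates and $k_g(t, t') + k_f(t, t')$ within a replicate. The sketch below builds that matrix; the squared-exponential kernels and their parameter values are assumptions chosen for illustration.

```python
import numpy as np

def sq_exp(a, b, variance, lengthscale):
    return variance * np.exp(-((a[:, None] - b[None, :]) ** 2) / lengthscale ** 2)

def hierarchical_cov(times, replicate_ids, kg_params=(1.0, 5.0), kf_params=(0.2, 2.0)):
    """cov(f_i(t), f_j(t')) = k_g(t, t') + [i == j] * k_f(t, t')."""
    Kg = sq_exp(times, times, *kg_params)   # shared gene-level structure
    Kf = sq_exp(times, times, *kf_params)   # replicate-specific deviation
    same_replicate = replicate_ids[:, None] == replicate_ids[None, :]
    return Kg + same_replicate * Kf

# Two replicates sampled at different (irregular) times.
times = np.array([0.0, 4.0, 8.0, 1.0, 5.0, 9.0])
reps = np.array([0, 0, 0, 1, 1, 1])
K = hierarchical_cov(times, reps)
```

Because the structure lives entirely in the covariance, replicates do not need to share time points, and no lumping or averaging step is required.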

Hierarchical Gaussian process

[Figure: hierarchical Gaussian process fit; expression axis −3 to 2.]

J. Hensman, N.D. Lawrence, M. Rattray “Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters” BMC Bioinformatics 2013

Hierarchical Gaussian process for clustering

An extended hierarchy

cluster: $h(t) \sim \mathcal{GP}(0, k_h(t, t'))$

gene: $g_i(t) \sim \mathcal{GP}(h(t), k_g(t, t'))$

replicate: $f_{ir}(t) \sim \mathcal{GP}(g_i(t), k_f(t, t'))$


[Figure: normalised gene expression (−2 to 2) against time (hours, 0 to 20), one row of panels per gene: CG11786, CG12723, CG13196, CG13627, CG4386 and Osi15.]


Modifying an existing algorithm to include this model of replicate and cluster structure leads to more meaningful clustering

Method               MF    BP    CC    L        N. clust.
agglomerative HGP    0.46  0.16  0.50  7360.8   50
agglomerative GP     0.39  0.13  0.36  6203.7   128
Mclust (concat.)     0.39  0.07  0.25  1324.0   26
Mclust (averaged)    0.40  0.08  0.24  -736.2   20

Variational Bayes algorithm is more efficient, allowing Bayesian clustering of >10K profiles with a Dirichlet Process prior

J. Hensman, M. Rattray, N.D. Lawrence “Fast non-parametric clustering of time-series data” IEEE TPAMI 2015

Clustering with a periodic covariance

Ex2. Branching models: perturbations and bifurcations

[Figure: simulated gene expression (−4 to 5) against time (hrs, 0 to 15), showing simulated control data and simulated perturbed data.]

Joint work with Jing Yang, Chris Penfold and Murray Grant

Samples from a branching model

[Figure: five panels, Sample 1 to Sample 5.]

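Joint distribution of two functions crossing at $t_p$

$f \sim \mathcal{GP}(0, K), \quad g \sim \mathcal{GP}(0, K), \quad g(t_p) = f(t_p)$

$$
\Sigma =
\begin{pmatrix} K_{ff} & K_{fg} \\ K_{gf} & K_{gg} \end{pmatrix}
=
\begin{pmatrix}
K(T, T) & \dfrac{K(T, t_p)\, K(t_p, T)}{k(t_p, t_p)} \\
\dfrac{K(T, t_p)\, K(t_p, T)}{k(t_p, t_p)} & K(T, T)
\end{pmatrix}
\qquad (1)
$$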
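Equation (1) can be assembled directly from kernel evaluations. The sketch below assumes a squared-exponential kernel with an arbitrary length scale; the function names are hypothetical and not taken from any package mentioned in the talk.

```python
import numpy as np

def sq_exp(a, b, lengthscale=3.0):
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / lengthscale ** 2)

def crossing_cov(T, tp, lengthscale=3.0):
    """Joint covariance of [f(T), g(T)] for two GPs constrained to meet at tp."""
    tp_arr = np.array([tp])
    K = sq_exp(T, T, lengthscale)                         # K(T, T)
    k_T_tp = sq_exp(T, tp_arr, lengthscale)               # K(T, tp), shape (N, 1)
    k_tp_tp = sq_exp(tp_arr, tp_arr, lengthscale)[0, 0]   # k(tp, tp)
    K_fg = (k_T_tp @ k_T_tp.T) / k_tp_tp                  # off-diagonal block of (1)
    return np.block([[K, K_fg], [K_fg.T, K]])

T = np.linspace(0.0, 15.0, 16)
Sigma = crossing_cov(T, tp=5.0)
```

A quick sanity check on the construction: at $t = t_p$ the cross-covariance between $f$ and $g$ equals the variance $k(t_p, t_p)$, so the two functions are perfectly correlated there and must take the same value.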

Joint distribution of two datasets diverging at tp

$y^c(t_n) \sim \mathcal{N}(f(t_n), \sigma^2)$

$y^p(t_n) \sim \mathcal{N}(f(t_n), \sigma^2)$ for $t_n \le t_p$

$y^p(t_n) \sim \mathcal{N}(g(t_n), \sigma^2)$ for $t_n > t_p$

Posterior probability of the perturbation time tp

$$
p(t_p \mid y^c(T), y^p(T)) \simeq \frac{p(y^c(T), y^p(T) \mid t_p)}{\sum_{t = t_{\min}}^{t_{\max}} p(y^c(T), y^p(T) \mid t)}
$$
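The posterior above is a normalised sum of marginal likelihoods over candidate perturbation times, which can be sketched in a few lines. The version below assumes a squared-exponential kernel with fixed, hand-picked hyperparameters, a uniform prior over candidates, and deterministic toy data; it is not the implementation used in the talk.

```python
import numpy as np

def sq_exp(a, b, lengthscale=3.0):
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / lengthscale ** 2)

def branching_cov(T, tp, lengthscale=3.0):
    """Covariance of [f(T), h(T)], where h = f up to tp and h = g after tp."""
    tp_arr = np.array([tp])
    K = sq_exp(T, T, lengthscale)
    k_T_tp = sq_exp(T, tp_arr, lengthscale)
    K_cross = (k_T_tp @ k_T_tp.T) / sq_exp(tp_arr, tp_arr, lengthscale)[0, 0]
    after = T > tp
    K_fh = np.where(after[None, :], K_cross, K)      # control vs perturbed
    opposite = after[:, None] != after[None, :]      # points on different sides of tp
    K_hh = np.where(opposite, K_cross, K)            # perturbed vs perturbed
    return np.block([[K, K_fh], [K_fh.T, K_hh]])

def log_marginal(y, Sigma, noise_var):
    """log N(y | 0, Sigma + noise_var * I) via a Cholesky factor."""
    L = np.linalg.cholesky(Sigma + noise_var * np.eye(len(y)))
    a = np.linalg.solve(L, y)
    return -0.5 * a @ a - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

def perturbation_posterior(T, y_c, y_p, candidates, noise_var=0.1, lengthscale=3.0):
    y = np.concatenate([y_c, y_p])
    logp = np.array([log_marginal(y, branching_cov(T, t, lengthscale), noise_var)
                     for t in candidates])
    w = np.exp(logp - logp.max())                    # subtract max for stability
    return w / w.sum()

# Hypothetical toy data: the perturbed series departs from the control after t = 6.
T = np.linspace(0.0, 15.0, 16)
y_c = np.sin(T / 3.0)
y_p = y_c + np.where(T > 6.0, 0.5 * (T - 6.0), 0.0)
post = perturbation_posterior(T, y_c, y_p, candidates=T)
```

Working with log-likelihoods and subtracting the maximum before exponentiating avoids underflow when the candidate likelihoods differ by many orders of magnitude.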

[Figure: simulated gene expression (−4 to 5) against time (hrs, 0 to 15), showing simulated control data and simulated perturbed data.]

Comparison to DE thresholding approach

Alternative: use the first point where some DE threshold is passed.

We tested several DE metrics in the nsgp package.

[Figure: estimated perturbation times against testing perturbation times (both 0 to 20). One panel for DEtime (Mean, Median, MAP) and one for nsgp (EMLL0.5, EMLL1).]

Investigating a plant’s response to bacterial challenge

Infection with virulent Pseudomonas syringae pv. tomato DC3000 vs. disarmed strain DC3000hrpA

[Figure: gene expression (8 to 13) against time (hrs; 0, 10, 17) under Condition 1 and Condition 2, for CATMA1c72124 and CATMA1a09335.]

Yang et al. “Inferring the perturbation time from biological time course data” Bioinformatics (accepted), preprint: arXiv:1602.01743

Current work: modelling branching in single-cell data

[Figure: branching structure in single-cell data; labels "start", 1, 2, 4, 8, 16, 32 and 64.]

with A. Boukouvalas, M. Zwiessele, J. Hensman and N. Lawrence

Ex 3. Linking Pol-II activity to mRNA profiles

[Figure: schematic mRNA and Pol-II profiles labelled a to f, with t = 0, 40, 80, 160, 320, 640, 1280 min. Example gene ENSG00000163659 (TIPARP): Pol II and mRNA (FPKM) against t (min), plus a panel over delay (min, 0 to 100) with vertical axis 0 to 1.]

Joint work with Antti Honkela, Jaakko Peltonen, Neil Lawrence

Linking Pol-II activity to mRNA profiles

$$
\frac{dm(t)}{dt} = \beta\, p(t - \Delta) - \alpha\, m(t)
$$

- $m(t)$ is mRNA concentration (RNA-Seq data)
- $p(t)$ is mRNA production rate (3' pol-II ChIP-Seq data)
- $\alpha$ is degradation rate (mRNA half-life $t_{1/2} = \ln 2 / \alpha$)
- $\Delta$ is processing delay

We model p(t) ∼ GP(0, kp) as a Gaussian process (GP)

Likelihood can be worked out exactly

Bayesian MCMC used to estimate parameters $\alpha$, $\beta$, $\Delta$ and GP covariance (2) and noise variance (1) parameters
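To build intuition for the roles of $\alpha$, $\beta$ and $\Delta$, the ODE can be solved numerically for a known production profile. The sketch below uses the integrating-factor solution $m(t) = e^{-\alpha t}\left(m_0 + \beta \int_0^t e^{\alpha u} p(u - \Delta)\, du\right)$ with trapezoidal quadrature; the pulse-shaped $p(t)$ and all parameter values are hypothetical, and this forward simulation is not the GP inference described on the slide.

```python
import numpy as np

def simulate_mrna(t, p_func, alpha, beta, delta, m0=0.0):
    """Solve dm/dt = beta * p(t - delta) - alpha * m(t) on an increasing grid t (t[0] = 0)."""
    integrand = np.exp(alpha * t) * p_func(t - delta)
    # Cumulative trapezoidal integral of the integrand from 0 to each t.
    steps = 0.5 * np.diff(t) * (integrand[1:] + integrand[:-1])
    cumulative = np.concatenate([[0.0], np.cumsum(steps)])
    return np.exp(-alpha * t) * (m0 + beta * cumulative)

def half_life(alpha):
    """mRNA half-life implied by a first-order degradation rate alpha."""
    return np.log(2.0) / alpha

t = np.linspace(0.0, 10.0, 2001)
pulse = lambda u: np.exp(-(u - 3.0) ** 2)     # hypothetical Pol-II activity pulse

m_no_delay = simulate_mrna(t, pulse, alpha=0.5, beta=2.0, delta=0.0)
m_delayed = simulate_mrna(t, pulse, alpha=0.5, beta=2.0, delta=2.0)
```

Comparing `m_no_delay` with `m_delayed` shows the mRNA peak shifting later by roughly the delay, while a smaller `alpha` (longer half-life) smooths and prolongs the response.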

Example fits

Each example shows Pol II and mRNA (FPKM) against t (min; 0, 40, 80, 160, 320, 640, 1280), plus a panel over delay (min, 0 to 100):

- a: Early pol-II, no production delay. ENSG00000166483 (WEE1)
- b: Early pol-II, delayed production. ENSG00000019549 (SNAI2)
- c: Later pol-II, delayed production. ENSG00000163659 (TIPARP)
- d: Late pol-II, no delay. ENSG00000162949 (CAPN13)
- e: Late pol-II, no delay. ENSG00000117298 (ECE1)
- f: Late pol-II, delayed production. ENSG00000181026 (AEN)

Large processing delays observed in 11% of genes

[Figure: histogram of the number of genes against delay (min), with bins from 0 to >120.]

Delay linked with gene length and intron structure

[Figure: fraction of genes with $\Delta > t$ against t (min, 0 to 80), with $-\log_{10}$(p-value) on a second axis. One panel compares m < 10000 (N=282) with m > 10000 (N=1504); the other compares f < 0.20 (N=1443) with f > 0.20 (N=343).]

$\Delta$: delay; m: gene length; f: final intron length / gene length

Delayed genes show evidence of late-intron retention

[Figure: DLX3 read-density panels at 10, 20, 40, 80, 160 and 320 min; and mean pre-mRNA end accumulation index against t (min, 10 to 60) for genes with $\Delta > t$ and $\Delta < t$, with $-\log_{10}$(p-value) on a second axis.]

Left: density of RNA-Seq reads uniquely mapping to the introns in the DLX3 gene

Right: Differences in the mean pre-mRNA accumulation index in long delay genes (blue) and short delay genes (red)

Comparison of processing and transcription times

[Figure: transcriptional delay (min) against RNA processing delay $\Delta$ (min); both axes on a log scale from 1 to 100.]

Transcription time = length / velocity, estimate from Danko, Mol Cell 2013

Transcription time > processing delay in 87% of genes

Honkela et al. “Genome-wide modeling of transcription kinetics reveals patterns of RNA production delays” PNAS 2015

Extension - inferring time-varying degradation rates

$$
\frac{dm(t)}{dt} = \beta\, p(t) - \alpha(t)\, m(t)
$$

[Figure: two panels against time (0 to 10): degradation rate, and gene expression with mRNA measurements and pre-mRNA measurements.]

Conclusion

- Gaussian process models provide a good mix of flexibility and tractability in temporal and spatial modelling:
  - Hierarchical modelling, e.g. replicates, clusters, species
  - Periodic models without strong sinusoidal assumptions
  - Tractable under branching and bifurcations
  - Tractable under linear operations on functions
- Easy addition and multiplication of kernels
- Good approximate inference algorithms for non-Gaussian data
- Codebase is growing, making methods increasingly flexible and computationally efficient (e.g. GPy, GPflow and some R)
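As an illustration of the point about adding and multiplying kernels, the sketch below combines a squared-exponential kernel with a standard periodic kernel in plain NumPy. The kernel forms, the 24-hour period and the length scales are assumptions made for the example; libraries such as GPy and GPflow provide their own kernel classes for the same purpose.

```python
import numpy as np

def sq_exp(a, b, lengthscale=1.0):
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / lengthscale ** 2)

def periodic(a, b, period=24.0, lengthscale=1.0):
    """Standard periodic kernel: exp(-2 sin^2(pi |t - t'| / period) / l^2)."""
    d = np.abs(a[:, None] - b[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / lengthscale ** 2)

t = np.linspace(0.0, 48.0, 25)   # two assumed 24-hour cycles, sampled every 2 hours

# Sum: a slow trend plus an exactly repeating component.
K_sum = sq_exp(t, t, lengthscale=10.0) + periodic(t, t)

# Product: a periodic pattern whose shape is allowed to drift over time.
K_prod = sq_exp(t, t, lengthscale=40.0) * periodic(t, t)
```

Sums and elementwise products of valid covariance matrices are again valid covariances, so both `K_sum` and `K_prod` can be used directly as GP priors.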

Funding: BBSRC (EraSysBio+ SYNERGY), EU FP7 (RADIANT)

Collaborators: Neil Lawrence, James Hensman, Antti Honkela, Jaakko Peltonen, Jing Yang, Alexis Boukouvalas, Max Zwiessele

Our existential crisis
