combinatorially complex equilibrium model selection
DESCRIPTION
Combinatorially Complex Equilibrium Model Selection. Tom Radivoyevitch Assistant Professor Epidemiology and Biostatistics Case Western Reserve University Email: [email protected] Website: http://epbi-radivot.cwru.edu/. - PowerPoint PPT PresentationTRANSCRIPT
Combinatorially Complex Combinatorially Complex Equilibrium Model Equilibrium Model
SelectionSelection
Tom RadivoyevitchAssistant ProfessorEpidemiology and BiostatisticsCase Western Reserve University
Email: [email protected]: http://epbi-radivot.cwru.edu/
Combinatorially Complex Combinatorially Complex Equilibrium Model Equilibrium Model
SelectionSelection(overview/title breakdown)(overview/title breakdown)
• Combinatorially Complex Equilibria:Combinatorially Complex Equilibria:few reactants => many possible complexesfew reactants => many possible complexes
• Generate the possible models + Select or average the Generate the possible models + Select or average the best best
• Selection/Averaging: Akaike Information Criterion (AIC)Selection/Averaging: Akaike Information Criterion (AIC)• AIC decreases with P and then increasesAIC decreases with P and then increases• Billions of models, but only thousands near AIC upturnBillions of models, but only thousands near AIC upturn
• Challenge: Generate 1P, 2P, 3P model space chunks Challenge: Generate 1P, 2P, 3P model space chunks sequentially sequentially
• 8GB RAM => max of ~5000 models at a time8GB RAM => max of ~5000 models at a time• Models = Hypotheses Models = Hypotheses • Spurs: complete K = ∞ Spurs: complete K = ∞ [Complex]=0 [Complex]=0• Grids: binary K equal each otherGrids: binary K equal each other• R package ccems (CRAN) implements the methodsR package ccems (CRAN) implements the methods
Center RelevanceCenter Relevance
• StructuresStructures constrain minimal simplicity and maximal constrain minimal simplicity and maximal complexity of possible complexes and models in the complexity of possible complexes and models in the model space model space
• Move prior knowledge of structures into enzyme Move prior knowledge of structures into enzyme models models
• System BiologySystem Biology models integrate enzyme models models integrate enzyme models
• Protein-Protein Protein-LigandProtein-Protein Protein-Ligand I Interactionnteraction readouts: readouts:• Enzyme activities and DLS mass (i.e. average Enzyme activities and DLS mass (i.e. average
measures)measures)• Mass distributions of complexes (e.g. SEC and Mass distributions of complexes (e.g. SEC and
GEMMA)GEMMA)• Validate omic-predicted interactions and model themValidate omic-predicted interactions and model them• PEPCC (Harry Gill): Systematically express, purify and PEPCC (Harry Gill): Systematically express, purify and
characterize interaction suspects (e.g. of core wnt characterize interaction suspects (e.g. of core wnt signaling proteins) signaling proteins)
Ultimate GoalUltimate Goal
EXPERIMENTALBIOLOGY
COMPUTERMODELING
CONTROLTHEORY
models
control lawsdata
hypotheses
proposed clinical trial
validated process model development
control system design methods development
Present Future
• Safer flying airplanes with autopilotsSafer flying airplanes with autopilots• Ultimate Goal: individualized, state feedback based Ultimate Goal: individualized, state feedback based
clinical trialsclinical trials
• Better understanding => better controlBetter understanding => better control• Conceptual models help trial designs today Conceptual models help trial designs today • Computer models of airplanes help train pilots and Computer models of airplanes help train pilots and
autopilotsautopilots
Radivoyevitch et al. (2006) BMC Cancer 6:104
dNTP Supply System
Figure 1. dNTP supply. Many anticancer agents act on or through this system to kill cells. The most central enzyme of this system is RNR.
UDP
CDP
GDP
ADP
dTTP
dCTP
dGTP
dATP
dT
dC
dG
dADNA
dUMP
dU
TS
DCTD
dCK
DN
A p
olym
eras
eTK1
cytosol
mitochondria
dT
dC
dG
dA
TK2
dGK
dTMPdCMP
dGMP
dAMP
dTTPdCTP
dGTP
dATP
5NT
NT2
cytosol
nucleus
dUDP
dUTPdUTPase dN
dN
dCK
flux activation inhibition
ATPordATP
RN
R
dCK
R1
R2 R2
R1 R1
R1 R1
R1 R1
R1 R1 R1
R1
R1 R1
R1 R1 R1
R1
R2 R2
UDP, CDP, GDP, ADP bind to catalytic site
ATP, dATP, dTTP, dGTP bind to selectivity site
dATP inhibits at activity site, ATP activates at activity site?
5 catalytic site states x 5 s-site states x 3 a-site states x 2 h-site states = 150 states
(150)6 different hexamer complexes => 2^(150)6 models ~ 1 followed by a trillion zeros
ATP activates at hexamerization site??
Ribonucleotide Reductase (RNR)Ribonucleotide Reductase (RNR)
R2 R2
RNR is Combinatorially Complex
Michaelis-Menten ModelMichaelis-Menten Model
RNR: no NDP and no R2 dimer => kcat of complex is zero,else different R1-R2-NDP complexes can have different kcat values.
m
Tmm
mT
TT
TT
KSSVv
kEKSKS
KSkEv
kEEES
EEES
ESkEv
EPEESPEkvEESkv
][][
][1/][
101/][
/][][
][][][
][0][][
][][
)(][0)(][][0][
max
1
1
1
1
E + S ES
0.005 0.010 0.020 0.050 0.100 0.200 0.500
020
4060
8010
0
Total [r] (uM)
Per
cent
Act
ivity solid line = Eqs. (1-2)
dotted = Eq. (3)
Data from Scott, C. P., Kashlan, O. B., Lear, J. D., and Cooperman, B. S. (2001) Biochemistry 40(6), 1651-166
Model Parameter Initial Value Optimal Value Confidence Interval RRGGttr1.1.0 RRGGtt_r 0.020 0.012 (0.007, 0.024) SSE 1070.252 823.793 AIC 45.006 42.650
MM Kd 0.020 0.033 (0.022, 0.049) SSE 2016.335 1143.682 AIC 50.706 45.603
R=R1 r=R22G=GDP t=dTTP
)2(]][[][][0
)1(]][[][][0
_
_
SET
SET
KSESS
KSEEE
SE
T
SE
T
T
SE
SET
SE
TSE
T
KS
KS
EESversus
KS
KS
EES
KS
EEKSEE
_
_
_
_
_
_ ][1
][
][][][1
][
][][][1
1][][][1][][
Substitute this in here to get a quadratic in [S] which solves as
Bigger systems of higher polynomials cannot be solved algebraically => use ODEs (above)
][4][][(][][(5.][][ _2
__ TSETTSETTSET SKESKESKSES
0)0]([,0)0]([
]][[][][][
]][[][][][
_
_
SE
KSESS
dSd
KSEEE
dEd
SET
SET
Michaelis-Menten ModelMichaelis-Menten Model [S] vs. [S[S] vs. [STT] ]
(3)
E ES
EI ESI
E ES
EI ESISEIIEIE
T
SEIIESET
SEIIEIESET
KKSIE
KIEII
KKSIE
KSESS
KKSIE
KIE
KSEEE
___
___
____
]][][[]][[][][0
]][][[]][[][][0
]][][[]][[]][[][][0
E ES
EI
EIT
EST
EIEST
KIEII
KSESS
KIE
KSEEE
]][[][][0
]][[][][0
]][[]][[][][0
ESIEIT
ESIEST
ESIEIEST
KISE
KIEII
KISE
KSESS
KISE
KIE
KSEEE
]][][[]][[][][0
]][][[]][[][][0
]][][[]][[]][[][][0
E ES
EI ESI
E
EI ESI
E ES
ESI
E
EI ESI
E ES
ESI
=
=E
EI
E
ESI
E ES E
Competitive inhibition
uncompetitive inhibition if kcat_ESI=0
E | ES
EI | ESI
noncompetitive inhibition Example of K=K’ Model
==
Enzyme, Substrate and InhibitorEnzyme, Substrate and Inhibitor
Total number of spur graph models is 16+4=20 Radivoyevitch, (2008) BMC Systems Biology 2:15
Rt Spur Graph ModelsRt Spur Graph Models
RRttRRtRt
T
RRttRRtRRRtT
KtR
KtR
KtRtt=
KtR
KtR
KR
KtRRRp=
222
2222
20
2220
.0)0(;0)0(
2
222
222
2222
tRKtR
KtR
KtRtt=
dtd
KtR
KtR
KR
KtRRRp=
dRd
RRttRRtRtT
RRttRRtRRRtT
R RR
RRtt
RRt Rt
R
RRtt
RRt Rt
R RR
RRtt
Rt
R RR
RRt Rt
R RR
RRtt
RRt
R
RRtt
Rt
R
RRt Rt
R RR
Rt
IJJJJJIJ JJJI JIJJ
IJIJ IJJI JJII
JJJJ
R
RRtt
RRt
R RR
RRtt
R RR
RRt
R
Rt
R
RRtt
R
RRt
JIIJIIJJ JIJI
R RR RIJII IIIJ IIJI JIII IIII
R
Rt
R
RRtt
R
RRt
R RRI0II III0 II0I 0III
R = R1 t = dTTP
for dTTP induced R1 dimerization(RR, Rt, RRt, RRtt)
R Rt t
RRt t
Rt R t
RRt t
Rt Rt RRtt
Kd_R_R
Kd_Rt_R
Kd_Rt_Rt
Kd_R_t
Kd_R_t Kd_RRt_t
Kd_RR_t=
=
=
=
|
|
|
Rt Grid Graph ModelsRt Grid Graph Models
R Rt t
RRt t
Rt R t
RRt t
RRtt
KR_R
KR_t
KRRt_t
KRR_t=
=
===
== ===
=
==
===
==
=
R Rt t
RRt t
Rt R t
RRt t
Rt Rt RRtt
KR_R
KRt_R
KRt_Rt
KR_t
KR_t
|
|
|
|
|
|
| |
|
|
|
|
| |
|
?
?
HIFF
HDFFHDDD
=
][][2][2][22
][)1]([][)(
)(
11TT
T
RRRttRRtRRM
RpRRMyE
yEy
AICc = N*log(SSE/N)+2P+2P(P+1)/(N-P-1)
Scott, C. P., Kashlan, O. B., Lear, J. D., and Cooperman, B. S. (2001) Biochemistry 40(6), 1651-166
Radivoyevitch, (2008) BMC Systems Biology 2:15
Application to DataApplication to Data
HDFF==
R
RRtt
IIIJ
5 10 15
100
120
140
160
180
Total [dTTP] (uM)
Ave
rage
Mas
s (k
Da)
III0mIIIJHDFF
III0m
Model Parameter Initial Value
Optimal Value
Confidence Interval
1 III0m m1 90.000 82.368 (79.838, 84.775)
SSE 4397.550 525.178
AIC 71.965 57.090
2 IIIJ R2t2 1.000^3 2.725^3 (2.014^3, 3.682^3)
SSE 2290.516 557.797
AIC 67.399 57.512
27 HDFF R2t0 1.000 12369.79 (0, 1308627507869)
R1t0_t 1.000 1.744 (0.003, 1187.969)
R2t0_t 1.000 0.010 (0.000, 403.429)
SSE 25768.23 477.484
AIC 105.342 77.423
RRttRRtRt
T
RRttRRtRRRtT
KtR
KtR
KtRtt=
KtR
KtR
KR
KtRRRp=
222
2222
20
2220
.0)0(;0)0(
2
222
222
2222
tRKtR
KtR
KtRtt=
dtd
KtR
KtR
KR
KtRRRp=
dRd
RRttRRtRtT
RRttRRtRRRtT
jitR
ji
KtR=tR
ji
2+5+9+13 = 28 parameters => 228=2.5x108 spur graph models via Kj=∞ hypotheses
28 models with 1 parameter, 428 models with 2, 3278 models with 3, 20475 with 4
R = R1X = ATP
18
6
612
4
46
2
22
1
18
6
612
4
46
2
22
1
642
642
0
6420
i XR
i
i XR
i
i XR
i
i RX
i
T
i XR
i
i XR
i
i XR
i
i RX
i
T
iiii
iiii
KXRi
KXRi
KXRi
KXRiXX=
KXR
KXR
KXR
KXRRR=
Yeast R1 structure. Dealwis Lab, PNAS 102, 4022-4027, 2006
ATP-induced R1 Hexamerization
Kashlan et al. Biochemistry 2002 41:462
==
==
==
= ====
==
= ====
======
====
----
====
==
====
======
==
= ====
------------
==
====
------------
X ====
------------
==
====
------------
X
====
======
XX
====
======
XX
X
====
------------
XX
X
====
======
XX
====
======
==
====
==
====
==
====
======
------------
XX
X
X ====
------------
X====
------------
====
======
X ------------
X
====
------------
------------
X ------------
X------------
X ====
X====
====
XX
X XX
X
Kolmogorov-Smirnov Test p < 10-16 28 of top 30 did not include an h-site term; 28/30 ≠ 503/2081 with p < 10-16
This suggests no h-site. Top 13 included R6X8 or R6X9, save one, single edge model R6X7 This suggests that less than 3 a-sites are occupied in hexamer.
2088 Models with SSE < 2 min (SSE)
142 144 146 148 150 152 154
0.00
0.05
0.10
0.15
0.20
0.25
AIC
dens
ity
no h site (508)h site (1580)
A
142 144 146 148 150
0.05
0.15
0.25
AICde
nsity
no h site (77)h site (78)
B
146 148 150 152 154
0.00
0.05
0.10
0.15
0.20
0.25
0.30
AIC
dens
ity
no h site (287)h site (1174)
C
148 150 152 154
0.0
0.1
0.2
0.3
0.4
0.5
AIC
dens
ity
no h site (146)h site (329)
D
50 100 200 500 1000 2000
100
200
300
400
500
[ATP] (uM)
Mas
s (k
Da)
R6X8 141.23R6X9 142.88R6X7 143.12R6X10 146.12R6X6 148.92R6X11 149.60R6X12 152.82R6X13 155.68R6X14 158.17R6X15 160.33R6X16 162.20R6X17 163.84R6X18 165.26
Data from Kashlan et al. Biochemistry 2002 41:462
Conclusions (so far)1. The dataset does not support the existence of an h-site
2. The dataset suggests that ~1/2 of the a-site are not occupied by ATP
R1
R1
R1 R1
R1 R1
aa
[ATP]=~1000[dATP]So system prefers to have 3 a-sites empty and ready for dATPInhibition versus activation is partly due to differences in pockets
a
a
a
a
UDP
CDP
GDP
ADP
dTTP
dCTP
dGTP
dATP
dT
dC
dG
dADNA
dUMP
dU
TS
DCTD
dCK
DN
A p
olym
eras
e
TK1
cytosol
mitochondria
dT
dC
dG
dA
TK2
dGK
dTMPdCMP
dGMP
dAMP
dTTPdCTP
dGTP
dATP
5NT
NT2
cytosol
nucleus
dUDP
dUTPdUTPase dN
dN
dCK
flux activation inhibition
ATPordATP
RN
R
dCK
== =
=
==_=
=
==
=
=
==
==
== _
EESES2
ES3
ES4
X
8 K Models x 8 k Models => 64 Models
Tetrameric Enzyme Models (e.g. Thymidine Kinase 1)
4
1
4
1
4
1
4
1
4
1
][14
][
][1][4
]][[
][4
][
i ES
iT
i ES
iT
i
i ES
iT
i ES
iT
i
T
i
ii
i
i
i
i
KS
KSik
KSE
KSEik
E
ESikk
4321
4321
432
432
]][[4]][[3]][[2]][[][][0
]][[]][[]][[]][[][][0
ESESESEST
ESESESEST
KSE
KSE
KSE
KSESS
KSE
KSE
KSE
KSEEE
4,3,2,1]][[][ iKSEESiES
ii
][4
][4
1
T
i
ii
E
ESikk
== =
=
==_=
=
===
===
==== _
ESES2
ES3
ES4
If [S] ≈ [ST]
44
][]][[4]][[4][ _
_SE
ESESon
offSEonoff
KKK
ESSE
kk
KSEkESk
66
][]][[6
][2]][[3
][]][[4 __
222
2
2__SESSE
ESESSESSE
KKKK
ESSE
ESSES
ESSEKK
4][3]][[2
][2]][[3
][]][[4 ___
33
2
2___
2
2
SESSESSE
ESSESSESSE
KKKK
ESSES
ESSES
ESSEKKK
SESSESSESSEES
SESSESSESSE
KKKKKESSES
ESSES
ESSES
ESSEKKKK
____4
4
3
3
2
2____
32
32][4]][[
][3]][[2
][2]][[3
][]][[4
Complete K vs. Binary K
0.05 0.20 1.00 5.00 50.00
0.2
0.5
1.0
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 1.79, 1.84, 2.34, 2.24 k = 7.04, 5.92, 5.91, 3.62
1 2 3
-0.0
6-0
.02
0.02
0.06
Model Average
Fitted Value
Tran
sfor
med
Res
idua
l
5e-02 5e-01 5e+00 5e+01
0.5
1.0
2.0
5.0
Berenstein 2000
Total [dT]k
(1/s
ec)
K = 0.76, 0.78, 0.74, 0.20 k = 7.10, 6.98, 7.01, 7.02
1 2 3 4 5 6 7
-0.2
-0.1
0.0
0.1
0.2
0.3
Model Average
Fitted Value
Tran
sfor
med
Res
idua
l
0.2 0.5 1.0 2.0 5.0
0.5
1.0
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 0.73, 0.51, 0.49, 0.49 k = 1.97, 3.16, 3.90, 3.89
0.5 1.0 1.5 2.0 2.5 3.0 3.5
-0.1
5-0
.05
0.05
0.15
Model Average
Fitted ValueTr
ansf
orm
ed R
esid
ual
5e-02 5e-01 5e+00 5e+01
0.5
1.0
2.0
5.0
Li 2004
Total [dT]
k (1
/sec
)
K = 0.70, 0.56, 0.54, 0.56 k = 5.21, 4.77, 4.70, 4.69
1 2 3 4
-0.2
-0.1
0.0
0.1
0.2
0.3
Model Average
Fitted Value
Tran
sfor
med
Res
idua
l
0.1 0.2 0.5 1.0 2.0 5.0
0.02
0.05
0.10
0.20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 1.19, 0.64, 0.49, 0.47 k = 0.20, 0.27, 0.27, 0.27
0.05 0.10 0.15 0.20 0.25
-0.0
15-0
.005
0.00
5
Model Average
Fitted Value
Tran
sfor
med
Res
idua
l
K = (0.85, 0.69, 0.65, 0.51) µM k = (3.3, 3.9, 4.1, 4.1) sec-1
kmax = 4.1/sec S50 = 0.6 µM h = 1.1
h
T
h
T
SS
SSk
k
50
50max
1
DFFF.DDDD DDLL.DDDD DDDM.DDDD DDDD.DFFFDDDD.DDLL DDDD.DDDM
0.05 0.50 5.00
0.2
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 0.72, 0.72, 0.72, 0.72 k = 2.91, 2.91, 4.18, 4.18
DD
DD
.DD
LL
5e-02 5e+00
0.5
5.0
Berenstein 2000
Total [dT]
k (1
/sec
)
K = 0.39, 0.39, 0.39, 0.39 k = 3.58, 3.58, 7.14, 7.14
0.2 1.0 5.0
0.5
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 0.36, 0.36, 0.36, 0.36 k = 1.35, 1.35, 3.80, 3.80
5e-02 5e+00
0.5
5.0 Li 2004
Total [dT]
k (1
/sec
)
K = 0.45, 0.45, 0.45, 0.45 k = 3.89, 3.89, 4.66, 4.66
0.1 0.5 2.0
0.02
0.20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 0.35, 0.35, 0.35, 0.35 k = 0.07, 0.07, 0.27, 0.27
Total [dT]
k (1
/sec
)
0.05 0.50 5.00
0.2
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 1.28, 1.28, 1.28, 1.28 k = 4.99, 4.99, 4.99, 4.11
DD
DD
.DD
DM
5e-02 5e+000.
55.
0
Berenstein 2000
Total [dT]
k (1
/sec
) K = 0.15, 0.15, 0.15, 0.15 k = 1.69, 1.69, 1.69, 7.18
0.2 1.0 5.0
0.5
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 1.20, 1.20, 1.20, 1.20 k = 5.51, 5.51, 5.51, 3.44
5e-02 5e+00
0.5
5.0 Li 2004
Total [dT]
k (1
/sec
)
K = 0.60, 0.60, 0.60, 0.60 k = 5.01, 5.01, 5.01, 4.67
0.1 0.5 2.0
0.02
0.20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 1.43, 1.43, 1.43, 1.43 k = 0.41, 0.41, 0.41, 0.20
Total [dT]
k (1
/sec
)
0.05 0.50 5.00
0.2
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 0.95, 0.95, 0.95, 0.95 k = 3.56, 4.29, 4.29, 4.29
DD
DD
.DFF
F
5e-02 5e+00
0.5
5.0
Berenstein 2000
Total [dT]
k (1
/sec
)
K = 0.54, 0.54, 0.54, 0.54 k = 3.93, 7.22, 7.22, 7.22
0.2 1.0 5.0
0.5
2.0
Frederiksen 2004
Total [dT]k
(1/s
ec)
K = 0.53, 0.53, 0.53, 0.53 k = 0.96, 3.93, 3.93, 3.93
5e-02 5e+00
0.5
5.0 Li 2004
Total [dT]
k (1
/sec
)
K = 0.48, 0.48, 0.48, 0.48 k = 3.50, 4.67, 4.67, 4.67
0.1 0.5 2.0
0.02
0.20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 0.56, 0.56, 0.56, 0.56 k = 0.06, 0.28, 0.28, 0.28
Total [dT]
k (1
/sec
)
0.05 0.50 5.00
0.2
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 1.08, 1.08, 0.85, 0.85 k = 4.23, 4.23, 4.23, 4.23
DD
LL.D
DD
D
5e-02 5e+00
0.5
5.0
Berenstein 2000
Total [dT]
k (1
/sec
)
K = 0.83, 0.83, 0.42, 0.42 k = 7.14, 7.14, 7.14, 7.14
0.2 1.0 5.0
0.5
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 0.90, 0.90, 0.41, 0.41 k = 3.80, 3.80, 3.80, 3.80
5e-02 5e+00
0.5
5.0 Li 2004
Total [dT]
k (1
/sec
)
K = 0.57, 0.57, 0.48, 0.48 k = 4.66, 4.66, 4.66, 4.66
0.1 0.5 2.0
0.02
0.20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 1.08, 1.08, 0.32, 0.32 k = 0.26, 0.26, 0.26, 0.26
Total [dT]
k (1
/sec
)
0.05 0.50 5.00
0.2
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 1.02, 1.02, 1.02, 0.56 k = 4.12, 4.12, 4.12, 4.12
DD
DM
.DD
DD
5e-02 5e+00
0.5
5.0
Berenstein 2000
Total [dT]
k (1
/sec
)
K = 0.76, 0.76, 0.76, 0.20 k = 7.02, 7.02, 7.02, 7.02
0.2 1.0 5.0
0.5
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 0.77, 0.77, 0.77, 0.19 k = 3.69, 3.69, 3.69, 3.69
5e-02 5e+000.
55.
0 Li 2004
Total [dT]
k (1
/sec
)
K = 0.54, 0.54, 0.54, 0.51 k = 4.69, 4.69, 4.69, 4.69
0.1 0.5 2.0
0.02
0.20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 0.90, 0.90, 0.90, 0.08 k = 0.25, 0.25, 0.25, 0.25
Total [dT]
k (1
/sec
)
0.05 0.50 5.00
0.2
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 1.15, 0.97, 0.97, 0.97 k = 4.31, 4.31, 4.31, 4.31
DFF
F.D
DD
D
5e-02 5e+00
0.5
5.0
Berenstein 2000
Total [dT]
k (1
/sec
)
K = 0.98, 0.55, 0.55, 0.55 k = 7.23, 7.23, 7.23, 7.23
0.2 1.0 5.0
0.5
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 1.32, 0.54, 0.54, 0.54 k = 3.92, 3.92, 3.92, 3.92
5e-02 5e+00
0.5
5.0 Li 2004
Total [dT]
k (1
/sec
)
K = 0.65, 0.49, 0.49, 0.49 k = 4.67, 4.67, 4.67, 4.67
0.1 0.5 2.0
0.02
0.20
Birringer 2006
Total [dT]k
(1/s
ec)
K = 1.62, 0.53, 0.53, 0.53 k = 0.27, 0.27, 0.27, 0.27
Total [dT]
k (1
/sec
)
0.05 0.20 1.00 5.00 50.00
0.2
0.5
1.0
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 1.32, 0.72, 1.18, 2.38 k = 5.16, 2.92, 5.99, 3.63
1 2 3
-0.0
8-0
.04
0.00
0.04
Fitted Values
Tran
sfor
med
Res
idua
ls
1 2 3
-0.1
0-0
.05
0.00
0.05
Fitted Values
Res
idua
ls
CV = 0.014
5e-02 5e-01 5e+00 5e+01
0.5
1.0
2.0
5.0
Berenstein 2000
Total [dT]
k (1
/sec
)
K = 1.18, 0.44, 0.69, 0.10 k = 11.94, 3.19, 4.10, 7.03
1 2 3 4 5 6 7
-0.2
-0.1
0.0
0.1
0.2
Fitted Values
Tran
sfor
med
Res
idua
ls
1 2 3 4 5 6 7
-0.3
-0.1
0.1
0.3
Fitted Values
Res
idua
ls
CV = 0.04
0.1 0.5 2.0 5.0
0.2
0.5
1.0
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 1.65, 0.30, 8.94, 0.19 k = 1.14, 5.42, 8.50, 3.56
0.5 1.5 2.5 3.5
-0.1
5-0
.05
0.05
Fitted Values
Tran
sfor
med
Res
idua
ls
0.5 1.5 2.5 3.5
-0.2
0-0
.10
0.00
0.10
Fitted Values
Res
idua
ls
CV = 0.06
5e-02 5e-01 5e+00 5e+01
0.5
1.0
2.0
5.0
Li 2004
Total [dT]
k (1
/sec
)
K = 1.69, 2.01, 0.10, 0.52 k = 12.81, 6.05, 3.52, 4.76
1 2 3 4
-0.1
0.0
0.1
0.2
Fitted Values
Tran
sfor
med
Res
idua
ls
1 2 3 4
-0.2
-0.1
0.0
0.1
0.2
Fitted Values
Res
idua
ls
CV = 0.039
0.1 0.2 0.5 1.0 2.0 5.0
0.02
0.05
0.10
0.20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 104.58, 0.88, 0.01, 0.20 k = 21.33, 3.32, 0.01, 0.29
0.05 0.10 0.15 0.20 0.25
-0.0
100.
000
0.00
5
Fitted Values
Tran
sfor
med
Res
idua
ls
0.05 0.10 0.15 0.20 0.25
-0.0
15-0
.005
0.00
5
Fitted Values
Res
idua
ls
CV = 0.062
0.05 0.20 1.00 5.00 50.00
0.2
0.5
1.0
2.0
MunchPetersen 1993
Total [dT]
k (1
/sec
)
K = 1.32, 0.72, 1.18, 2.38 k = 5.16, 2.92, 5.99, 3.63
1 2 3
-0.0
8-0
.04
0.00
0.04
Fitted Values
Tran
sfor
med
Res
idua
ls
1 2 3
-0.1
0-0
.05
0.00
0.05
Fitted Values
Res
idua
ls
CV = 0.014
5e-02 5e-01 5e+00 5e+01
0.5
1.0
2.0
5.0
Berenstein 2000
Total [dT]
k (1
/sec
)
K = 1.18, 0.44, 0.69, 0.10 k = 11.94, 3.19, 4.10, 7.03
1 2 3 4 5 6 7
-0.2
-0.1
0.0
0.1
0.2
Fitted Values
Tran
sfor
med
Res
idua
ls
1 2 3 4 5 6 7
-0.3
-0.1
0.1
0.3
Fitted Values
Res
idua
ls
CV = 0.04
0.1 0.5 2.0 5.0
0.2
0.5
1.0
2.0
Frederiksen 2004
Total [dT]
k (1
/sec
)
K = 1.65, 0.30, 8.94, 0.19 k = 1.14, 5.42, 8.50, 3.56
0.5 1.5 2.5 3.5
-0.1
5-0
.05
0.05
Fitted Values
Tran
sfor
med
Res
idua
ls
0.5 1.5 2.5 3.5
-0.2
0-0
.10
0.00
0.10
Fitted Values
Res
idua
ls
CV = 0.06
5e-02 5e-01 5e+00 5e+01
0.5
1.0
2.0
5.0
Li 2004
Total [dT]
k (1
/sec
)
K = 1.69, 2.01, 0.10, 0.52 k = 12.81, 6.05, 3.52, 4.76
1 2 3 4
-0.1
0.0
0.1
0.2
Fitted Values
Tran
sfor
med
Res
idua
ls
1 2 3 4
-0.2
-0.1
0.0
0.1
0.2
Fitted Values
Res
idua
ls
CV = 0.039
0.1 0.2 0.5 1.0 2.0 5.00.
020.
050.
100.
20
Birringer 2006
Total [dT]
k (1
/sec
)
K = 104.58, 0.88, 0.01, 0.20 k = 21.33, 3.32, 0.01, 0.29
0.05 0.10 0.15 0.20 0.25
-0.0
100.
000
0.00
5
Fitted Values
Tran
sfor
med
Res
idua
ls
0.05 0.10 0.15 0.20 0.25
-0.0
15-0
.005
0.00
5
Fitted Values
Res
idua
ls
CV = 0.062
8-parameter model fits to the datasets
1e-08 1e-04 1e+00 1e+04
05
1015
2025
30
DFLM.DFLM
Total [S]
k (1
/sec
)A
1e-08 1e-04 1e+00 1e+04
020
4060
8010
0
DFLM.DFLM
Total [S]
k (1
/sec
)
B
K = (10-6, 10-3, 1, 103) A) k = (100, 10, 40, 10) B) k = (10, 10, 40, 100)
With parameters initially at ½ their true values, A identified itself, B had difficulties (its 1st two K estimates were off 2-5-fold).
Horizontal lines are k1/4, 2k2/4, 3k3/4 and 4k4/4 Vertical lines are K1/4, 2K2/3, 3K3/2 and 4K4
Combinatorially Complex Equilibrium Model
Selection (ccems, CRAN 2009)
Systems Biology Markup Language
interface to R (SBMLR, BIOC 2004)Model networks of enzymes
Model individual enzymes
SUMMARYSUMMARY
R1
R2 R2
R1 R1
R1 R1
R1 R1
R1 R1 R1
R1
R1 R1
R1 R1 R1
R1
R2 R2
R2 R2
Figure 8. T. Thorsen et al. (S. R. Quake Lab) Science 2002
Figure 9. J. Melin and S. R. Quake Annu. Rev. Biophys. Biomol. Struct. 2007. 36:213–31
Background: Quake Lab MicrofluidicsBackground: Quake Lab Microfluidics
Figure 9 shows how a peristaltic pump is implemented by three valves that cycle through the control codes 101, 100, 110, 010, 011, 001, where 0 and 1 represent open and closed valves; note that the 0 in this sequence is forced to the right as the sequence progresses.
Adaptive Experimental DesignsAdaptive Experimental DesignsFind best next 10 measurement Find best next 10 measurement conditions given models of data conditions given models of data collected.collected.
Need automated analyses in feedback Need automated analyses in feedback loop of automatic controls of microfluidic loop of automatic controls of microfluidic chips chips
µFluidic M-inputCMPM
C1
Mixing Control bits
C2C3
CM
…
TPC1 …C2 C2 C3 buff buff buff
N-plug stream (C1:2C2:C3:C4)/N
Output
Output
C4
Streams of pulses
Filtered output
(a)
(b)
Output
Output Mixer
Dye-3 (C3)Dye-2 (C2)
Solvent Dye-1 (C1)
0b
2b
1b
3b
Flow velocity = 2 cm/sTP=100 ms, M=4N=20, Levels = 64
Dye-3
C3
Dye-1
Dye-2
C2
Water
C1
Mixing Channel
Output
Control Lines
3 mm C1
C2
C3
5
0
10
-5 0 5 10 15 20
2535
45
minuteste
mpe
ratu
re (C
)
-5 0 5 10 15 20
01
23
45
minutes
cont
rol e
ffort
-5 0 5 10 15 20
2535
45
minutes
tem
pera
ture
(C)
-5 0 5 10 15 20
02
46
8
minutes
cont
rol e
ffort
+-
setpointKp
Ki∫ Σ hot plate water temperature
Simple example of a control system for a single-input single-output (SISO) system
Emphasis is on the stochastic component of the model.
Is there something in the black box or are the input wires disconnected from the output wires such that only thermal noise is being measured? Do we have enough data?
Model components: (Deterministic = signal) + (Stochastic = noise)Statistics Engineerin
gEmphasis is on the deterministic component of the model
We already know what is in the box, since we built it. The goal is to understand it well enough to be able to control it.
Predict the best multi-agent drug dose time course schedules
Increasing amounts of data/knowledge
Why Systems BiologyWhy Systems Biology
dNTPs + AnalogsdNTPs + Analogs DNA + Drug-DNADNA + Drug-DNA
Damage DrivenDamage Drivenor or
S-phase DrivenS-phase Driven
dNTP demand dNTP demand is eitheris either
DNA repairDNA repair
SalvageSalvage
De novoDe novo
MMRMMR-- Cancer Treatment Cancer Treatment Strategy Strategy
IUdRIUdR
ConclusionsConclusions
For systems biology to succeed:For systems biology to succeed:– move biological research toward move biological research toward
systems which are best understoodsystems which are best understood– specialize modelers to become experts specialize modelers to become experts
in biological literatures (SB is not a in biological literatures (SB is not a service)service)
AcknowledgementsAcknowledgements Case Comprehensive Cancer CenterCase Comprehensive Cancer Center NIH (K25 CA104791)NIH (K25 CA104791) Chris Dealwis (CWRU Pharmacology)Chris Dealwis (CWRU Pharmacology) Anders Hofer (Umea) Anders Hofer (Umea) Thank you Thank you
Indirect Approach Indirect Approach pro-B Cell Childhood ALLpro-B Cell Childhood ALL
TT: TEL-AML1 with HR : TEL-AML1 with HR tt : TEL-AML1 with CCR : TEL-AML1 with CCR tt : other outcome : other outcome
BB: BCR-ABL with CCR: BCR-ABL with CCR bb: BCR-ABL with HR: BCR-ABL with HR bb: censored, missing, : censored, missing,
or other outcome or other outcome
B
b
b
b
b
b
b
b
bb
bb
b
b
b
b
tt t
tt
t
t t
ttt
tt
t
t
t
t
t
t
t
t
t
tt
t
t
t
t
t
t
tt
t
t
t
t
t
t
t
t
t
tt
t
t
tt
t
t
t
t
t
tt
tt
t
T
T
T
t
t
t
t
t
tt
t
t t
tt
t
tt
t
t
t
t
0 2 4 6 8 10 12 140
200
400
600
800
1000
1200
DNTS Flux (uM/hr)
DN
PS
Flu
x (u
M/h
r)
Ross et al: Blood 2003, 102:2951-2959 Yeoh et al: Cancer Cell 2002, 1:133-143 Radivoyevitch et al., BMC Cancer 6, 104 (2006)
THF
CH2THF
CH3THF
CHOTHF
DHF
CHODHFHCHO
GAR
FGAR
AICAR
FAICAR
dUMP dTMP
NADP+ NADPHNADP+ NADPH
NADP+ NADPH
MetHcys
Ser
Gly
GART
ATICATIC
TS
ATP
ADP
11R
2R 2
3
4
10
9 8
5 6
7
12 11
13
HCOOH
MTHFDMTHFR
MTR
DHFR
SHMT
FTS
FDS
Morrison PF, Allegra CJ: Folate cycle kinetics in human breast cancer cells. JBiolChem 1989, 264:10552-10566.
library(ccems) # Thymidine Kinase Example topology = list( heads=c("E1S0"), # E1S0 = substrate free E sites=list( c=list( # c for catalytic site t=c("E1S1","E1S2","E1S3","E1S4") ) # t for tetramer ) ) # TK1 is 25kDa = 25mg/umole, so 1 mg = .04 umoles g = mkg(topology, activity=T,TCC=F)dd=subset(TK1,(year==2000),select=c(E,S,v))dd=transform(dd, ET=ET/4,v=ET*v/(.04*60))# now uM/sectops=ems(dd,g,maxTotalPs=8,kIC=10,topN=96)#9 min on 1 cpu
library(ccems) # Ribonucleotide Reductase Exampletopology <- list( heads=c("R1X0","R2X2","R4X4","R6X6"), sites=list( # s-sites are already filled only in (j>1)-mers a=list( #a-site thread m=c("R1X1"), # monomer 1 d=c("R2X3","R2X4"), # dimer 2 t=c("R4X5","R4X6","R4X7","R4X8"), # tetramer 3 h=c("R6X7","R6X8","R6X9","R6X10", "R6X11", "R6X12") # hexamer 4 ), # tails of a-site threads are heads of h-site threads h=list( # h-site m=c("R1X2"), # monomer 5 d=c("R2X5", "R2X6"), # dimer 6 t=c("R4X9", "R4X10","R4X11", "R4X12"), # tetramer 7 h=c("R6X13", "R6X14", "R6X15","R6X16", "R6X17", "R6X18")# hexamer 8 ) ))g=mkg(topology,TCC=TRUE) dd=subset(RNR,(year==2002)&(fg==1)&(X>0),select=c(R,X,m,year))cpusPerHost=c("localhost" = 4,"compute-0-0"=4,"compute-0-1"=4,"compute-0-2"=4)top10=ems(dd,g,cpusPerHost=cpusPerHost, maxTotalPs=3, ptype="SOCK",KIC=100)
K = e∆G/RT => K = (0, ∞) maps to ∆G = (-∞, ∞)
Model weights e∆AIC/∑e∆AIC
Fast Total Concentration Constraint (TCC; i.e. g=0) solvers are critical to model
estimation/selection. TCC ODEs (#ODEs = #reactants) solve TCCs faster than kon =1 and koff = Kd systems (#ODEs = #species = high # in combinatorially complex situations)
Semi-exhaustive approach = fit all models with same number of parameters as parallel batch, then fit next batch only if current shows AIC improvement over previous batch.
Comments on Methods
50 100 200 500 2000
100
200
300
400
500
[ATP] (uM)
Mas
s (k
Da)
A
R6X8R2X4.R6X8R6X9
100 200 300 400 500
-20
020
40
Fitted Value
Res
idua
l
B
1e+02 1e+03 1e+04 1e+05 1e+06
010
020
030
040
050
0
[ATP] (uM)
Mas
s (k
Da)
R2X4.R6X8R2X3.R6X9R2X3.R6X12
Conjecture
Greater X/R ratio dominates at highLigand concentrations
In this limit the system wants to partition As much ATP into a bound form as possible
Bias in 1st four residuals=> Do not weight errors