Modeling Combinatorially Complex Ribonucleotide Reductase
Tom Radivoyevitch
Assistant Professor
Epidemiology and Biostatistics
Case Western Reserve University
Email: [email protected]
Website: http://epbi-radivot.cwru.edu/
Overview
• Model enzymes as quasi-equilibria (e.g. E ⇌ ES)
• Combinatorially Complex Equilibria: few reactants => many possible complexes
• R package: Combinatorially Complex Equilibrium Model Selection (ccems) implements methods for activity and mass data
• Hypotheses: complete (K = ∞ => [Complex] = 0) vs binary (K1 = K2)
• Generate a set of possible models, fit them, and select the best
• Model Selection: Akaike Information Criterion (AIC)
• AIC decreases with P (the number of parameters) and then increases
• Billions of models, but only thousands near AIC upturn
• Generate 1P, 2P, 3P model space chunks sequentially
• Use structures to constrain complexity and simplicity of models
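The batch-wise search described in the bullets above can be sketched in a few lines of R. This is a minimal illustration, not the ccems implementation: `toy_sse` is a hypothetical stand-in for fitting a model to data, and the candidate count, data count, and "true" parameter set are made-up values chosen so the AIC upturn is visible.

```r
# Hypothetical stand-in for fitting a model: returns an SSE for a chosen
# subset of candidate parameters. A real workflow would fit each model to data.
toy_sse <- function(params, true_params = c(1, 3)) {
  hits   <- sum(params %in% true_params)   # needed parameters that are included
  extras <- length(params) - hits          # superfluous parameters that are included
  100 * 0.5^hits * 0.99^extras
}

aic <- function(sse, n, p) n * log(sse / n) + 2 * p

semi_exhaustive <- function(n_candidates, n_data, max_p = n_candidates) {
  best <- list(aic = Inf, params = integer(0))
  for (p in seq_len(max_p)) {
    # Score every model with exactly p parameters as one batch
    batch  <- combn(n_candidates, p, simplify = FALSE)
    scores <- vapply(batch, function(m) aic(toy_sse(m), n_data, p), numeric(1))
    if (min(scores) >= best$aic) break     # AIC has turned upward: stop
    best <- list(aic = min(scores), params = batch[[which.min(scores)]])
  }
  best
}

best <- semi_exhaustive(n_candidates = 5, n_data = 20)
print(best)
```

The loop stops at the first batch whose best AIC fails to improve on the previous batch, so larger model space chunks are never generated.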
Ultimate Goal
[Diagram: loop linking EXPERIMENTAL BIOLOGY, COMPUTER MODELING, and CONTROL THEORY. Arrow labels: data, hypotheses, models, control laws, proposed clinical trial. Stages along a Present-to-Future axis: validated process model development; control system design methods development.]
• Safer flying airplanes with autopilots
• Ultimate Goal: individualized, state feedback based clinical trials
• Better understanding => better control
• Conceptual models help trial designs today
• Computer models of airplanes help train pilots and autopilots
Radivoyevitch et al. (2006) BMC Cancer 6:104
dNTP Supply System
Figure 1. dNTP supply. Many anticancer agents act on or through this system to kill cells. The most central enzyme of this system is RNR.
[Figure: dNTP supply network. Metabolites: UDP, CDP, GDP, ADP; dUDP, dUTP, dUMP, dU; dTMP, dCMP, dGMP, dAMP; dTTP, dCTP, dGTP, dATP; dT, dC, dG, dA (dN); DNA. Enzymes: RNR, TS, DCTD, dCK, dGK, TK1, TK2, 5NT, NT2, dUTPase, DNA polymerase. Regulators: ATP or dATP. Compartments: cytosol, mitochondria, nucleus. Edge types: flux, activation, inhibition.]
Ribonucleotide Reductase (RNR)
RNR is Combinatorially Complex
[Diagram: RNR complexes drawn as groups of R1 subunits with R2 R2 dimers.]
UDP, CDP, GDP, ADP bind to catalytic site
ATP, dATP, dTTP, dGTP bind to selectivity site
dATP inhibits at activity site, ATP activates at activity site?
ATP activates at hexamerization site??
5 catalytic site states x 5 s-site states x 3 a-site states x 2 h-site states = 150 states
150^6 ≈ 1.1 x 10^13 different hexamer complexes => 2^(150^6) models
2^(150^6) ≈ 1 followed by about 3.4 trillion zeros
~11 trillion complexes => ~11 trillion (1 followed by only 13 zeros) 1-parameter models
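As a quick arithmetic check of the state counts on this slide, the sketch below multiplies out the per-subunit states and counts the decimal digits of the model count. It assumes, as the slide does, that a hexamer is an ordered choice of one state per subunit and that each complex is either present or absent in a model.

```r
# State counts per R1 subunit, as listed on the slide
n_states   <- 5 * 5 * 3 * 2      # catalytic x s-site x a-site x h-site = 150
n_hexamers <- n_states^6         # hexamer complexes, about 1.1e13

# Each complex is either present (finite K) or absent (K = Inf), so the
# model count is 2^n_hexamers. That overflows a double, so work with
# its base-10 logarithm instead.
log10_models <- n_hexamers * log10(2)

cat(sprintf("states per subunit: %d\n", n_states))
cat(sprintf("hexamer complexes:  %.3g\n", n_hexamers))
cat(sprintf("log10(#models):     %.3g\n", log10_models))
```

The number of 1-parameter models equals the number of complexes, since each such model keeps exactly one complex.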
Michaelis-Menten Model
RNR: no NDP and no R2 dimer => kcat of complex is zero; else different R1-R2-NDP complexes can have different kcat values.

$$E + S \rightleftharpoons ES \xrightarrow{k} E + P$$

$$v = k\,[ES], \qquad K_m = \frac{[E][S]}{[ES]} \quad\text{so}\quad [ES] = \frac{[E][S]}{K_m}$$

Key perspective: the rate is $E_T$ times the average of the state-specific rate constants (0 for E, k for ES), weighted by the fraction of enzyme in each state.

$$v = E_T\,\frac{0\cdot[E] + k\,[ES]}{[E] + [ES]} = E_T\left(0\cdot\frac{1}{1+[S]/K_m} + k\cdot\frac{[S]/K_m}{1+[S]/K_m}\right) = \frac{k E_T\,[S]}{K_m + [S]} = \frac{V_{max}[S]}{K_m + [S]}$$
[Figure: Percent Activity (0 to 100) vs. Total [r] (uM), x-axis ticks from 0.005 to 0.500. Solid line = Eqs. (1-2); dotted line = Eq. (3).]
Data from Scott, C. P., Kashlan, O. B., Lear, J. D., and Cooperman, B. S. (2001) Biochemistry 40(6), 1651-166
Model          Parameter   Initial Value   Optimal Value   Confidence Interval
RRGGttr1.1.0   RRGGtt_r    0.020           0.012           (0.007, 0.024)
               SSE         1070.252        823.793
               AIC         45.006          42.650
MM             Kd          0.020           0.033           (0.022, 0.049)
               SSE         2016.335        1143.682
               AIC         50.706          45.603

R = R1, r = R2_2 (R2 dimer), G = GDP, t = dTTP
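Michaelis-Menten Model: [S] vs. [S_T]

Total concentration constraints:

$$0 = [E_T] - [E] - \frac{[E][S]}{K_{E\_S}} \qquad (1)$$

$$0 = [S_T] - [S] - \frac{[E][S]}{K_{E\_S}} \qquad (2)$$

Equation (1) gives

$$[E] = \frac{[E_T]}{1 + [S]/K_{E\_S}}$$

Substitute this into (2) to get a quadratic in [S] whose solution is

$$[S] = .5\left( ([S_T] - [E_T] - K_{E\_S}) + \sqrt{([S_T] - [E_T] - K_{E\_S})^2 + 4K_{E\_S}[S_T]} \right)$$

Bound enzyme computed from free [S] versus from total [S_T]:

$$[ES] = [E_T]\,\frac{[S]/K_{E\_S}}{1 + [S]/K_{E\_S}} \quad\text{versus}\quad [ES] = [E_T]\,\frac{[S_T]/K_{E\_S}}{1 + [S_T]/K_{E\_S}} \qquad (3)$$

Bigger systems of higher polynomials cannot be solved algebraically => use ODEs:

$$\frac{d[E]}{d\tau} = [E_T] - [E] - \frac{[E][S]}{K_{E\_S}}, \qquad \frac{d[S]}{d\tau} = [S_T] - [S] - \frac{[E][S]}{K_{E\_S}}, \qquad [E](0) = 0,\ [S](0) = 0$$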
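To make the [S] versus [S_T] comparison concrete, here is a small sketch in R. It solves the quadratic implied by the two total concentration constraints for free [S], then compares bound enzyme computed from free [S] with the approximation that substitutes [S_T]. The function names and the numeric values are illustrative assumptions, not values from the slides.

```r
# Free substrate [S] from totals, solving
#   0 = S_T - [S] - E_T*[S]/(K + [S])
# which is the quadratic [S]^2 + (K + E_T - S_T)*[S] - K*S_T = 0.
free_S <- function(ST, ET, K) {
  b <- ST - ET - K
  0.5 * (b + sqrt(b^2 + 4 * K * ST))
}

# Bound enzyme [ES] using free [S] (exact) and using S_T (approximation)
ES_exact  <- function(ST, ET, K) { S <- free_S(ST, ET, K); ET * S / (K + S) }
ES_approx <- function(ST, ET, K) ET * ST / (K + ST)

# Illustrative values only (not taken from the slides)
ET <- 1; K <- 0.5
ST <- c(0.1, 1, 10, 100)
print(data.frame(ST, exact = ES_exact(ST, ET, K), approx = ES_approx(ST, ET, K)))
```

The approximation overestimates [ES] most when [S_T] is comparable to [E_T], which is where a large share of the substrate is bound.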
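Enzyme, Substrate and Inhibitor

Full model (E, ES, EI, ESI) with binary K values:

$$0 = [E_T] - [E] - \frac{[E][S]}{K_{E\_S}} - \frac{[E][I]}{K_{E\_I}} - \frac{[E][I][S]}{K_{E\_I}K_{EI\_S}}$$

$$0 = [S_T] - [S] - \frac{[E][S]}{K_{E\_S}} - \frac{[E][I][S]}{K_{E\_I}K_{EI\_S}}$$

$$0 = [I_T] - [I] - \frac{[E][I]}{K_{E\_I}} - \frac{[E][I][S]}{K_{E\_I}K_{EI\_S}}$$

Model without ESI (E, ES, EI):

$$0 = [E_T] - [E] - \frac{[E][S]}{K_{ES}} - \frac{[E][I]}{K_{EI}}$$

$$0 = [S_T] - [S] - \frac{[E][S]}{K_{ES}}$$

$$0 = [I_T] - [I] - \frac{[E][I]}{K_{EI}}$$

Full model with complete dissociation constants:

$$0 = [E_T] - [E] - \frac{[E][S]}{K_{ES}} - \frac{[E][I]}{K_{EI}} - \frac{[E][S][I]}{K_{ESI}}$$

$$0 = [S_T] - [S] - \frac{[E][S]}{K_{ES}} - \frac{[E][S][I]}{K_{ESI}}$$

$$0 = [I_T] - [I] - \frac{[E][I]}{K_{EI}} - \frac{[E][S][I]}{K_{ESI}}$$

[Diagrams: candidate models obtained from the full E, ES, EI, ESI scheme by deleting complexes or by equating K values.]
Competitive inhibition: complexes E, ES, EI
Uncompetitive inhibition: complexes E, ES, ESI, if kcat_ESI = 0
Noncompetitive inhibition (E | ES over EI | ESI): example of a K = K' model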
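As a worked illustration of how a K = ∞ hypothesis changes the model, the fraction of enzyme in the ES state can be written in terms of free concentrations from the complete-dissociation-constant form of the constraints. Setting $K_{ESI} = \infty$ removes ESI and leaves the familiar competitive inhibition form. This is a sketch using the slide's symbols; it assumes free [S] and [I] are known.

```latex
\frac{[ES]}{[E_T]}
  = \frac{[S]/K_{ES}}{1 + [S]/K_{ES} + [I]/K_{EI} + [S][I]/K_{ESI}}
  \;\xrightarrow{\;K_{ESI} = \infty\;}\;
  \frac{[S]}{K_{ES}\,\left(1 + [I]/K_{EI}\right) + [S]}
```

In the competitive limit the inhibitor only rescales the apparent $K_{ES}$ by the factor $1 + [I]/K_{EI}$.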
Rt Spur Graph Models
R = R1, t = dTTP
Complexes for dTTP-induced R1 dimerization: (RR, Rt, RRt, RRtt)

Total concentration constraints (p scales total R1):

$$0 = p\,[R_T] - [R] - \frac{[R][t]}{K_{Rt}} - \frac{2[R]^2}{K_{RR}} - \frac{2[R]^2[t]}{K_{RRt}} - \frac{2[R]^2[t]^2}{K_{RRtt}}$$

$$0 = [t_T] - [t] - \frac{[R][t]}{K_{Rt}} - \frac{[R]^2[t]}{K_{RRt}} - \frac{2[R]^2[t]^2}{K_{RRtt}}$$

Equivalent ODEs: $d[R]/d\tau$ and $d[t]/d\tau$ equal the right-hand sides above, with $[R](0) = 0;\ [t](0) = 0$.

[Diagrams: one spur graph per model, each showing R and the subset of RR, Rt, RRt, RRtt retained.]
Model names (one letter per complex):
JJJJ
IJJJ JIJJ JJIJ JJJI
IJIJ IJJI JJII JIIJ IIJJ JIJI
IJII IIIJ IIJI JIII
IIII
I0II III0 II0I 0III

Total number of spur graph models is 16+4=20
Radivoyevitch, (2008) BMC Systems Biology 2:15
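The 16 I/J model names can be generated mechanically. The sketch below assumes that each letter position corresponds to one complex in the order (RR, Rt, RRt, RRtt), with "I" read as K = ∞ (complex absent) and "J" as a finite K to be estimated; that reading is an assumption, though it is consistent with the fit table, where model IIIJ estimates only the RRtt (R2t2) constant. The four models containing "0" are not generated here.

```r
# Complexes of the R (R1) and t (dTTP) system, in the order used on the slide
complexes <- c("RR", "Rt", "RRt", "RRtt")

# Each K is either "I" (K = Inf, complex absent) or "J" (finite, to be estimated)
grid <- expand.grid(rep(list(c("I", "J")), length(complexes)),
                    stringsAsFactors = FALSE)
names(grid) <- complexes
model_names <- apply(grid, 1, paste, collapse = "")

# Number of estimated K values per model
n_params <- rowSums(grid == "J")
print(table(n_params))
print(sort(model_names))
```

Grouping the names by `n_params` gives the 1P, 2P, 3P chunks of model space that are fitted in sequence.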
Rt Grid Graph Models
[Diagram: grid graph of binary reactions among R, t, Rt, RR, RRt, RRtt, with one dissociation constant per edge:
R + R ⇌ RR (Kd_R_R)
R + t ⇌ Rt (Kd_R_t)
Rt + R ⇌ RRt (Kd_Rt_R)
RR + t ⇌ RRt (Kd_RR_t)
Rt + Rt ⇌ RRtt (Kd_Rt_Rt)
RRt + t ⇌ RRtt (Kd_RRt_t)
Edges marked "=" carry equality hypotheses between K values; edges marked "?" are open questions.]
Example grid graph model names: HIFF, HDFF, HDDD
Application to Data
[Equation: expected average mass E(y) as a function of M1, p, [R_T], and the concentrations of R, RR, RRt, and RRtt. The terms are not recoverable from this extraction.]
AICc = N*log(SSE/N)+2P+2P(P+1)/(N-P-1)
Scott, C. P., Kashlan, O. B., Lear, J. D., and Cooperman, B. S. (2001) Biochemistry 40(6), 1651-166
Radivoyevitch, (2008) BMC Systems Biology 2:15

[Figure: Average Mass (kDa), 100 to 180, vs. Total [dTTP] (uM), 5 to 15, with fitted curves for models III0m, IIIJ, and HDFF.]

#    Model   Parameter   Initial Value   Optimal Value   Confidence Interval
1    III0m   m1          90.000          82.368          (79.838, 84.775)
             SSE         4397.550        525.178
             AIC         71.965          57.090
2    IIIJ    R2t2        1.000^3         2.725^3         (2.014^3, 3.682^3)
             SSE         2290.516        557.797
             AIC         67.399          57.512
27   HDFF    R2t0        1.000           12369.79        (0, 1308627507869)
             R1t0_t      1.000           1.744           (0.003, 1187.969)
             R2t0_t      1.000           0.010           (0.000, 403.429)
             SSE         25768.23        477.484
             AIC         105.342         77.423
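The AICc formula on this slide is simple to compute directly. The sketch below applies it to the two optimal SSE values from the table. N = 7 and P = 1 are assumptions (N = 7 is chosen because it reproduces the table's AIC difference of 0.422 between these two models); the absolute values printed here need not match the table, since software may add constant terms or count parameters differently, but differences between models with equal N and P are unaffected by such constants.

```r
# Corrected AIC as written on the slide:
#   AICc = N*log(SSE/N) + 2P + 2P(P+1)/(N-P-1)
aicc <- function(sse, n, p) {
  stopifnot(n - p - 1 > 0)   # the correction term needs N > P + 1
  n * log(sse / n) + 2 * p + 2 * p * (p + 1) / (n - p - 1)
}

# Optimal SSE values from the table; N and P below are assumptions
sse <- c(III0m = 525.178, IIIJ = 557.797)
N <- 7   # assumed number of data points
P <- 1   # assumed parameter count for both models
scores <- aicc(sse, N, P)
print(round(scores - min(scores), 3))   # differences relative to the best model
```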
Each complex term in the R and t total concentration constraints has the general form

$$[R_i t_j] = \frac{[R]^i [t]^j}{K_{R_i t_j}}$$
2+5+9+13 = 28 parameters => 2^28 ≈ 2.7x10^8 spur graph models via Kj=∞ hypotheses
28 models with 1 parameter, 428 models with 2, 3278 models with 3, 20475 with 4
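ATP-induced R1 Hexamerization
R = R1, X = ATP

$$0 = [R_T] - [R] - \sum_{i=1}^{2}\frac{[R][X]^i}{K_{R_1X_i}} - 2\sum_{i=2}^{6}\frac{[R]^2[X]^i}{K_{R_2X_i}} - 4\sum_{i=4}^{12}\frac{[R]^4[X]^i}{K_{R_4X_i}} - 6\sum_{i=6}^{18}\frac{[R]^6[X]^i}{K_{R_6X_i}}$$

$$0 = [X_T] - [X] - \sum_{i=1}^{2}\frac{i[R][X]^i}{K_{R_1X_i}} - \sum_{i=2}^{6}\frac{i[R]^2[X]^i}{K_{R_2X_i}} - \sum_{i=4}^{12}\frac{i[R]^4[X]^i}{K_{R_4X_i}} - \sum_{i=6}^{18}\frac{i[R]^6[X]^i}{K_{R_6X_i}}$$

Yeast R1 structure. Dealwis Lab, PNAS 102, 4022-4027, 2006
Kashlan et al. Biochemistry 2002 41:462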
28 of top 30 did not include an h-site term; 28/30 ≠ 503/2081 with p < 10^-16. This suggests no h-site.
Top 13 all include R6X8 or R6X9, save one, the single edge model R6X7. This suggests fewer than 3 a-sites are occupied in the hexamer.
For details, see Radivoyevitch, T. Automated mass action model space generation and analysis methods for two-reactant combinatorially complex equilibriums: An analysis of ATP-induced ribonucleotide reductase R1 hexamerization data, Biology Direct 4, 50 (2009).
2088 Models with SSE < 2 min (SSE)
[Figure: AIC density plots for models with and without an h-site.
Panel A: AIC 142 to 154; no h site (508), h site (1580)
Panel B: AIC 142 to 150; no h site (77), h site (78)
Panel C: AIC 146 to 154; no h site (287), h site (1174)
Panel D: AIC 148 to 154; no h site (146), h site (329)]

[Figure: Mass (kDa), 100 to 500, vs. [ATP] (uM), 50 to 2000, with one fitted curve per single-complex model. Legend values by model:
R6X8 141.23, R6X9 142.88, R6X7 143.12, R6X10 146.12, R6X6 148.92, R6X11 149.60, R6X12 152.82, R6X13 155.68, R6X14 158.17, R6X15 160.33, R6X16 162.20, R6X17 163.84, R6X18 165.26]
Data from Kashlan et al. Biochemistry 2002 41:462
Conclusions (so far)
1. The dataset does not support the existence of an h-site
2. The dataset suggests that ~1/2 of the a-sites are not occupied by ATP
[Diagram: R1 hexamer with a-sites marked "a".]
[ATP] = ~1000 [dATP], so the system prefers to have 3 a-sites empty and ready for dATP.
Inhibition versus activation is partly due to differences in pockets.
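$$0 = [R_T] - [R] - \frac{2[R]^2}{K_{R_2}} - \frac{2[R]^2[X]^{i_1}}{K_{R_2X_{i_1}}} - \frac{6[R]^6[X]^{i_2}}{K_{R_6X_{i_2}}} - \frac{6[R]^6[X]^{i_3}}{K_{R_6X_{i_3}}}$$

$$[R_iX_j] = \frac{[R]^i[X]^j}{K_{R_iX_j}}$$

$$k = k_0[R_2] + k_1[R_2X_{i_1}] + k_2[R_6X_{i_2}] + k_3[R_6X_{i_3}]$$

[Figure: CDP Reductase Activity (1/sec), about 0.1 to 0.3, vs. [ATP] (µM), 0 to 10000, with fitted curves for the two models listed next.]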
4.13.18 (AIC=-65, SSE=0.00257): i1 4, i2 13, i3 18, k0 0.31, K0 3.66, K1 25.86, K2 21.26, K3 56.23
2.7.12 (AIC=-58.9, SSE=0.00386): i1 2, i2 7, i3 12, k0 0.31, K0 3.86, K1 14.29, K2 12.48, K3 54.67
The integers i1, i2, and i3 follow 18 ≥ i3 > i2 and i2/6 > i1/2 > 0. Models with occupied h-sites are in red, those without are in black. Sizes of spheres are proportional to 1/SSE.
[Figure: Probability Density (0 to 300) vs. SSE (0.002 to 0.012) for models with occupied h-sites (171 models) and with no occupied h-sites (54 models).]
SUMMARY
Model individual enzymes: Combinatorially Complex Equilibrium Model Selection (ccems, CRAN 2009)
Model networks of enzymes: Systems Biology Markup Language interface to R (SBMLR, BIOC 2004)
[Diagram: RNR complexes of R1 subunits with R2 R2 dimers.]
Background: Quake Lab Microfluidics
Figure 8. T. Thorsen et al. (S. R. Quake Lab) Science 2002
Figure 9. J. Melin and S. R. Quake Annu. Rev. Biophys. Biomol. Struct. 2007. 36:213–31
Figure 9 shows how a peristaltic pump is implemented by three valves that cycle through the control codes 101, 100, 110, 010, 011, 001, where 0 and 1 represent open and closed valves; note that the 0 in this sequence is forced to the right as the sequence progresses.
Adaptive Experimental Designs
Find best next 10 measurement conditions given models of data collected.
Need automated analyses in feedback loop of automatic controls of microfluidic chips.
[Figure (a): µFluidic M-input CMPM. Inputs C1, C2, C3, ..., CM are gated by mixing control bits into an N-plug stream, e.g. (C1:2C2:C3:C4)/N, with plug period TP; streams of pulses are filtered to give the output.]
[Figure (b): chip layout with Solvent/Water, Dye-1 (C1), Dye-2 (C2), Dye-3 (C3), control lines, a 3 mm mixing channel, mixer, and output; control codes 0b, 1b, 2b, 3b. Flow velocity = 2 cm/s, TP = 100 ms, M = 4, N = 20, Levels = 64.]
Why Systems Biology
Model components: (Deterministic = signal) + (Stochastic = noise)
Statistics: Emphasis is on the stochastic component of the model. Is there something in the black box or are the input wires disconnected from the output wires such that only thermal noise is being measured? Do we have enough data?
Engineering: Emphasis is on the deterministic component of the model. We already know what is in the box, since we built it. The goal is to understand it well enough to be able to control it.
Increasing amounts of data/knowledge => Predict the best multi-agent drug dose time course schedules
[Figure: block diagram of a feedback loop (setpoint, summing junction Σ with + and - inputs, gains Kp and Ki∫, hot plate, water temperature) and two pairs of time courses over -5 to 20 minutes: temperature (C), 25 to 45, and control effort.]
Simple example of a control system for a single-input single-output (SISO) system
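A proportional-integral (PI) loop of the kind drawn in the block diagram can be simulated in a few lines. This is a minimal sketch, not the system behind the plots: the first-order plant, its gain and time constant, the controller gains, and the setpoint are all illustrative assumptions.

```r
# Minimal PI temperature control sketch (all numbers are illustrative assumptions)
simulate_pi <- function(setpoint = 37, Kp = 0.5, Ki = 0.2,
                        T_amb = 25, gain = 5, tau = 3,
                        dt = 0.01, t_end = 60) {
  n <- round(t_end / dt)
  temp <- numeric(n + 1); effort <- numeric(n + 1)
  temp[1] <- T_amb
  integral <- 0
  u <- 0
  for (k in seq_len(n)) {
    err <- setpoint - temp[k]
    integral <- integral + err * dt
    u <- Kp * err + Ki * integral          # PI control law
    effort[k] <- u
    # First-order plant: tau * dT/dt = -(T - T_amb) + gain * u
    temp[k + 1] <- temp[k] + dt * (-(temp[k] - T_amb) + gain * u) / tau
  }
  effort[n + 1] <- u
  data.frame(minutes = (0:n) * dt, temp = temp, effort = effort)
}

out <- simulate_pi()
print(tail(out, 1))
```

The integral term is what removes the steady-state error: at equilibrium the error is zero and the accumulated integral alone holds the control effort needed to keep the water above ambient temperature.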
MMR- Cancer Treatment Strategy
dNTPs + Analogs → DNA + Drug-DNA
dNTP demand is either Damage Driven or S-phase Driven
[Diagram labels: DNA repair, Salvage, De novo, IUdR]
Indirect Approach: pro-B Cell Childhood ALL
[Figure: scatter plot of DNPS Flux (uM/hr), 0 to 1200, vs. DNTS Flux (uM/hr), 0 to 14, one letter per patient. Legend (symbols sharing a letter differ in styling that is lost in this extraction):
T: TEL-AML1 with HR
t: TEL-AML1 with CCR
t: other outcome
B: BCR-ABL with CCR
b: BCR-ABL with HR
b: censored, missing, or other outcome]
Ross et al: Blood 2003, 102:2951-2959 Yeoh et al: Cancer Cell 2002, 1:133-143
Radivoyevitch et al., BMC Cancer 6, 104 (2006)
[Figure: folate cycle network. Metabolites: THF, CH2THF, CH3THF, CHOTHF, DHF, CHODHF, HCHO, HCOOH, GAR, FGAR, AICAR, FAICAR, dUMP, dTMP, Met, Hcys, Ser, Gly, NADP+/NADPH, ATP/ADP. Enzymes: GART, ATIC, TS, MTHFD, MTHFR, MTR, DHFR, SHMT, FTS, FDS. Reactions are numbered 1 to 13, with 1R, 2R, and 11R also labeled.]
Morrison PF, Allegra CJ: Folate cycle kinetics in human breast cancer cells. JBiolChem 1989, 264:10552-10566.
Conclusions
For systems biology to succeed:
– move biological research toward systems which are best understood
– specialize modelers to become experts in biological literatures (e.g. dNTP Supply)
Systems biology is not a service

Acknowledgements
Case Comprehensive Cancer Center
NIH (K25 CA104791)
Charles Kunos (CWRU)
John Pink (CWRU)
Chris Dealwis (CWRU)
Anders Hofer (Umea)
Yun Yen (COH)
And thank you for listening!
Conjecture
[Figure: Mass (kDa), 100 to 500, vs. [ATP] (uM), 1e+02 to 1e+06, with curves for models R2X4.R6X8, R2X3.R6X9, and R2X3.R6X12.]
Greater X/R ratio dominates at high ligand concentrations.
In this limit the system wants to partition as much ATP into a bound form as possible.
library(ccems)  # Ribonucleotide Reductase Example
topology <- list(
  heads = c("R1X0", "R2X2", "R4X4", "R6X6"),
  sites = list(  # s-sites are already filled only in (j>1)-mers
    a = list(  # a-site thread
      m = c("R1X1"),                                             # monomer  1
      d = c("R2X3", "R2X4"),                                     # dimer    2
      t = c("R4X5", "R4X6", "R4X7", "R4X8"),                     # tetramer 3
      h = c("R6X7", "R6X8", "R6X9", "R6X10", "R6X11", "R6X12")   # hexamer  4
    ),  # tails of a-site threads are heads of h-site threads
    h = list(  # h-site
      m = c("R1X2"),                                             # monomer  5
      d = c("R2X5", "R2X6"),                                     # dimer    6
      t = c("R4X9", "R4X10", "R4X11", "R4X12"),                  # tetramer 7
      h = c("R6X13", "R6X14", "R6X15", "R6X16", "R6X17", "R6X18") # hexamer 8
    )
  )
)
g <- mkg(topology, TCC = TRUE)
dd <- subset(RNR, (year == 2002) & (fg == 1) & (X > 0), select = c(R, X, m, year))
cpusPerHost <- c("localhost" = 4, "compute-0-0" = 4, "compute-0-1" = 4, "compute-0-2" = 4)
top10 <- ems(dd, g, cpusPerHost = cpusPerHost, maxTotalPs = 3, ptype = "SOCK", KIC = 100)
Comments on Methods
Fast Total Concentration Constraint (TCC; i.e. g=0) solvers are critical to model estimation/selection. TCC ODEs (#ODEs = #reactants) solve TCCs faster than kon = 1 and koff = Kd systems (#ODEs = #species = high # in combinatorially complex situations).
Semi-exhaustive approach = fit all models with the same number of parameters as a parallel batch, then fit the next batch only if the current batch shows AIC improvement over the previous batch.
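The TCC ODE idea can be illustrated on the smallest case, E + S ⇌ ES, where the answer is also available in closed form. The sketch below integrates one ODE per reactant in pseudo-time from zero until the constraints are satisfied. Forward Euler with a fixed step is used here only to keep the example dependency-free; it is an assumption standing in for whatever ODE solver a real implementation would use, and the totals and K are illustrative values.

```r
# Relax the total concentration constraints (TCCs) for E + S <-> ES in pseudo-time:
#   d[E]/dtau = E_T - [E] - [E][S]/K
#   d[S]/dtau = S_T - [S] - [E][S]/K
# starting from [E](0) = [S](0) = 0. At steady state both TCCs hold.
solve_tcc <- function(ET, ST, K, dtau = 0.01, n_steps = 10000) {
  E <- 0; S <- 0
  for (i in seq_len(n_steps)) {
    ES <- E * S / K
    dE <- ET - E - ES
    dS <- ST - S - ES
    E <- E + dtau * dE
    S <- S + dtau * dS
  }
  c(E = E, S = S, ES = E * S / K)
}

# Illustrative totals and K (assumptions, not values from the slides)
sol <- solve_tcc(ET = 1, ST = 1, K = 0.5)
print(sol)
```

Only two state variables are integrated regardless of how many complexes exist, since every complex concentration is an algebraic function of the free reactant concentrations.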