Post on 26-Jun-2020
Discrete Modeling, Discovery and Prediction for Evolving, Living Systems
Myra B. Cohen (1), Nicole R. Buan (2), Christine Kelley (3), Mikaela Cashman (1), Jennie L. Catlett (2)
(1) Department of Computer Science & Engineering; (2) Department of Biochemistry; (3) Department of Mathematics
Motivation
Green Energy vs. Petroleum-based Fuels
Methane-producing archaea (methanogens)
• Phylogenetically distinct group
• Derive all their energy from reduction of C1 compounds to methane
• 4% of the global C cycle (2 gigatons per year)*
• Strict anaerobes
* Thauer RK. et al. 2008. Nat. Rev. Microbiol. 6:579-591.
Global C cycle
Methanogen Biotechnology
• WHO Essential Medicine: ~50% of all chemotherapy
• Transportation: cleaner than diesel
[Image sources: www.spaceX.com, www.sagentpharma.com, www.fordcngokc.com, www.mineralhq.com]
Methanogen Biotechnology
Deer Island, MA Hyperion, CA
Lincoln, NE
Biomass Energy - Nebraska
Aliens Among Us
Adapted from Pace, NR. 2009. MMBR. 73(4):565-76
Humans
E. coli
methanogens
A Tale of Two Pathways…
Methylotrophic      Acetoclastic
Entropy-retarded    Enthalpy-retarded
We can control behavior
Typical Organism Behaviors
(e.g., E. coli)
First-principles reasoning?
• Methanogens are ruled by:
– Thermodynamics and biochemistry, information processing, regulation, selection, mutation, etc.
• To date, no general set of equations describes behavior and evolution that:
– Applies equally well to methanogens, bacteria, and eukaryotes
Dynamic
• Organisms reproduce with ~99.999% probability of genetic information being passed to the next generation
• Mutations occur which can change gene functionality
• Environment impacts the behavior:
– Food sources
– Light
– Temperature
– Pressure
– …?
Data Driven
• As these organisms grow and die within their environment, they sense the environment and exchange messages (communicate) with other organisms in their vicinity
• Based on what they sense, they produce outputs (e.g. methane)
Models Today
Chemical Reaction Networks
Reaction Networks
• Allow us to model the chemical reactions (as PDEs) through a cell
• Based on the “whole cell model”
Physical Models
• Flux balance analysis:
– Optimization algorithm that solves the system of reaction equations to calculate the steady-state fluxes of an organism’s reaction network
– Can be used to predict biomass based on inputs
• Gapfilling:
– Incomplete models may have incomplete networks and will not grow. Gapfilling fills in missing reaction pathways using mixed-integer linear programming
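As a hedged sketch, flux balance analysis is commonly stated as the linear program below. The symbols are assumptions that the slides do not define: S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c holds the objective weights (for example, selecting the biomass reaction), and l and u are lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0 \\
& l \le v \le u
\end{aligned}
```

The constraint S v = 0 is the steady-state condition: each internal metabolite is produced as fast as it is consumed. Limiting the maximum flux of a media compound corresponds to tightening the bound on its exchange reaction.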
Problems with Existing Models
• Highly dependent on human annotations from empirical data
• Infer unknown behavior from organisms that are annotated
• Complex – difficult to reason about high level behavior
Variance of Pathways
Lieber, Catlett, Madayiputhiya, Nandukumar, Lopez, Metcalf and Buan. 2014. PLOS One. 9(9): e107563.
Application Systems
Lieber, Catlett, Madayiputhiya, Nandukumar, Lopez, Metcalf and Buan. 2014. PLOS One. 9(9): e107563.
Organisms sense, adapt
Use DDDAS?
Software (Discrete) Testing Perspective
Configurable Software
Discrete/Model Sampling
Observe Behavior
Optimize Parameters for an objective
Pierobon, Cohen, Buan, Kelley, SCIM: Sampling, Characterization, Inference and Modeling of Biological Consortia, 2015
Methanogen Configuration Options
• Media compounds (e.g. glucose)
• Light
• Pressure
• Temp
• Oxygen
Use discrete values for sampling
Reasoning about Configurations with Coding Theory
Error correcting codes: transmit information reliably and efficiently across space/time
Factor graphs
• Variable nodes represent information
• Constraint nodes represent constraints/dependencies
• Decoding and error-correction is performed via message passing on the edges of the graph.
• Update rules of the messages at the nodes follow the belief propagation algorithm on Bayesian networks
Factor Graph
• The input (i.e., “channel information”) to each variable node is a vector with n parameters (one for each factor)
• Update rules are designed for each factor, and iterative decoding is performed to determine how the system behaves for various inputs
• We can test how the system changes with modifications to certain factors
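Factor nodes: f1 f2 f3 f4
Variable nodes: x1 x2 x3 x4 x5 x6
f(x1,x2,x3,x4,x5,x6) = f1(x1,x3,x6) f2(x2,x4) f3(x1,x5) f4(x3,x5)
Messages at variable node x1: μ(x1→f1) is sent to f1; ρ(f1→x1) and ρ(f3→x1) arrive from f1 and f3; in(x1) is the channel input
[Figure label: Configurations]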
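A minimal sketch of what the factorization above means computationally, assuming binary variables and made-up factor tables (the slides give only the structure of f1 to f4, not their values). It evaluates the global function as a product of local factors and computes the marginal of x1 by brute-force enumeration, which is the quantity that message passing estimates.

```python
from itertools import product

# Hypothetical factor tables over binary variables (0/1). Only the argument
# lists of f1..f4 come from the slides; the values are illustrative.
def f1(x1, x3, x6):
    return 1.0 + x1 + x3 * x6

def f2(x2, x4):
    return 2.0 if x2 == x4 else 1.0

def f3(x1, x5):
    return 1.0 + 2.0 * x1 * x5

def f4(x3, x5):
    return 1.5 if x3 != x5 else 1.0

def f(x1, x2, x3, x4, x5, x6):
    """Global function as the product of the four local factors."""
    return f1(x1, x3, x6) * f2(x2, x4) * f3(x1, x5) * f4(x3, x5)

def marginal(index, domain=(0, 1), n_vars=6):
    """Normalized marginal of one variable by brute-force enumeration."""
    totals = {value: 0.0 for value in domain}
    for assignment in product(domain, repeat=n_vars):
        totals[assignment[index]] += f(*assignment)
    z = sum(totals.values())
    return {value: total / z for value, total in totals.items()}

p_x1 = marginal(0)  # index 0 corresponds to x1
```

In this particular graph x1, x3 and x5 form a cycle through f1, f4 and f3, so message passing would be iterative and in general approximate. Enumeration is shown here only as an exact reference for a graph this small.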
[Figure: genetic algorithm loop. A population of configurations p (positions 1, 2, 3, …, n) is ranked by fitness (methane/flux); crossover and mutation (changed positions marked X) produce the next population.]
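The loop in the figure (rank by fitness, crossover, mutation, new population) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the fitness function is a hypothetical stand-in for a model-predicted methane flux, and the population size, mutation rate, truncation selection, and elitism are all choices made here for the example.

```python
import random

LEVELS = (1, 10, 100)  # discrete max-flux levels that appear in the results
N_FACTORS = 7          # one gene per varied media compound

def fitness(config):
    """Hypothetical stand-in for a model-predicted methane flux.

    In the workflow on the slides this value would come from the metabolic
    model; here it simply rewards higher levels with decreasing weight.
    """
    return sum(level ** 0.5 / (i + 1) for i, level in enumerate(config))

def crossover(a, b, rng):
    """Single-point crossover of two parent configurations."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(config, rng, rate=0.1):
    """Replace each gene with a random level with probability `rate`."""
    return tuple(rng.choice(LEVELS) if rng.random() < rate else g for g in config)

def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.choice(LEVELS) for _ in range(N_FACTORS))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)   # rank population by fitness
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]        # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [pop[0]] + children             # elitism: keep the best unchanged
    pop.sort(key=fitness, reverse=True)
    return pop[0], history

best, history = evolve()
```

Because the best configuration is carried over unchanged each generation, the recorded best fitness never decreases. In a DDDAS setting the call to `fitness` is where model predictions or laboratory measurements would enter.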
DDDAS System
Sensors Evolution/Adaptation
Simulation/updating of models
Feasibility
Goals
• Evaluate models for optimization
• Use a well studied methanogen
– Methanosarcina acetivorans
• Explore a part of configuration space contained in KBase
• Understand how well current models describe the organism
Exploring Environment
• Iteration One (729 data points)
– 12 compounds in growth media: H2O, Phosphate, CO2, NH3, Acetate, Sulfate, H+, L-Cysteine, Co2+, Ni2+, Fe2+, H2
– Vary max flux for 6 of them (3 different flux values each)
• Iteration Two (2187 data points)
– Two compounds had no impact and were held constant; 3 more were added, giving 7 factors
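The data-point counts are consistent with a full factorial design: 3 flux values for each of 6 factors gives 3^6 = 729, and 7 factors gives 3^7 = 2187. The sketch below enumerates both designs. It assumes the three levels are the flux values 1, 10 and 100 seen in the results, and that every combination was sampled, which the slides do not state explicitly.

```python
from itertools import product

FLUX_LEVELS = (1, 10, 100)  # assumed: the three max-flux values in the results

def full_factorial(n_factors, levels=FLUX_LEVELS):
    """Every combination of one level per factor."""
    return list(product(levels, repeat=n_factors))

iteration_one = full_factorial(6)  # 6 varied compounds
iteration_two = full_factorial(7)  # 6 - 2 held constant + 3 added = 7 factors
print(len(iteration_one), len(iteration_two))  # 729 2187
```

Each added three-level factor triples the number of runs, which is why exhaustive enumeration stops being practical as more compounds are varied.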
Results (iteration 1)
[Figure: results diagram. Nodes: Phosphate, L-Cysteine. Branch labels: Flux=1, Flux=10 or 100. Values shown: 1.2, 4.6, 5.1.]
Results (iteration 2)
[Figure: results diagram. Nodes: Acetate, Phosphate, L-Cysteine, CO2. Branch labels: Flux=1, Flux=10, Flux=100, Flux=10 or 100. Values shown: .05, .5, 1.2, 4.6, 5.1.]
But
• We know the models are not perfect
• Still need laboratory data
Next Iterations
• Drill down on the four primary factors:
– Acetate, Phosphate, L-Cysteine and CO2
• Use smaller flux distances
• Run a genetic algorithm on a large number of flux values and more compounds
• Validate results in lab and update model
Summary
• View biological organisms as part of a DDDAS system
• Developing techniques for discrete sampling/modeling of their configuration space
• Developing optimization techniques to fit into the DDDAS loop
References
1. Thauer RK. et al. 2008. Nat. Rev. Microbiol. 6:579-591
2. Pace, NR. 2009. MMBR. 73(4):565-76
3. Lieber, Catlett, Madayiputhiya, Nandukumar, Lopez, Metcalf and Buan. 2014. PLOS One. 9(9): e107563
4. Pierobon, Cohen, Buan, Kelley, SCIM: Sampling, Characterization, Inference and Modeling of Biological Consortia, 2015
5. J. Swanson, M.B. Cohen, M.B. Dwyer, B.J. Garvin and J. Firestone, Beyond the Rainbow: Self-Adaptive Failure Avoidance in Configurable Systems, Foundations of Software Engineering, 2014, pp. 377-388
Acknowledgements
CCF-1161767 CNS-1205472 IOS-1449525
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies