![Page 1: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/1.jpg)
Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation
Kamisetty H., Xing, E.P. and Langmead C.J.
Raluca Gordan
February 12, 2008
![Page 2: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/2.jpg)
Papers
- Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation. Kamisetty, H., Xing, E.P. and Langmead, C.J.
- Constructing Free-Energy Approximations and Generalized Belief Propagation Algorithms. Yedidia, J.S., Freeman, W.T. and Weiss, Y.
- Understanding Belief Propagation and its Generalizations. Yedidia, J.S., Freeman, W.T. and Weiss, Y.
- Bethe free energy, Kikuchi approximations, and belief propagation algorithms. Yedidia, J.S., Freeman, W.T. and Weiss, Y.
- Effective energy functions for protein structure prediction. Lazaridis, T. and Karplus, M.
![Page 3: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/3.jpg)
free energy
entropy
internal energy
Markov random field
probabilistic graphical models
potential function
pair-wise MRF
factor graphs
region-based free energy
region graph
belief propagation
generalized belief propagation
marginal probabilities
Gibbs free energy
inference
Bayes nets
enthalpy
![Page 7: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/7.jpg)
Free energy
- Free energy = the amount of energy in a system which can be converted into work
- Gibbs free energy = the amount of thermodynamic energy which can be converted into work at constant temperature and pressure
- Enthalpy = the “heat content” of a system
- Entropy = a measure of the degree of randomness or disorder of a system
G = H – T·S = (E + P·V) – T·S
where G = Gibbs free energy, H = enthalpy, S = entropy, E = internal energy, T = temperature, P = pressure, V = volume
Stryer L., Biochemistry (4th Edition)
![Page 8: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/8.jpg)
Gibbs free energy (G)
- Thermodynamics: changes in free energy, entropy, …
ΔG = ΔH – T·ΔS
ΔG = (ΔE + P·ΔV) – T·ΔS
- For nearly all biochemical reactions ΔV is small and ΔH is almost equal to ΔE. Hence, we can write:
ΔG = ΔE – T·ΔS
Stryer L., Biochemistry (4th Edition)
![Page 9: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/9.jpg)
Free energy functions
G = E – T·S
- Energy functions are used in protein structure prediction, fold recognition, homology modeling, protein design
- E.g.: approaches to protein structure prediction are based on the thermodynamic hypothesis, which postulates that the native state of a protein is the state of lowest free energy under physiological conditions.
- The contribution of Kamisetty H., Xing E.P. and Langmead C.J.: the entropy component of their free energy estimate can be used to
  - distinguish native protein structures from decoys (structures with similar internal energy to that of the native structure, but otherwise incorrect)
  - compute estimates of ΔΔG upon mutation that correlate well with experimental values
Lazaridis T. and Karplus M., Effective energy functions for protein structure prediction
![Page 10: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/10.jpg)
Free energy functions
G = E – T·S
- Internal energy functions E model inter- and intramolecular interactions (e.g. van der Waals, electrostatic, solvent, etc.)
- Entropy functions S are harder to compute because they involve sums over an exponential number of terms
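To see why the entropy term is expensive, here is a minimal Python sketch that computes the entropy of a Boltzmann distribution by enumerating every configuration. The function `exact_entropy`, the toy energy `toy_energy` and the choice of 4 residues with 3 rotamers each are illustrative assumptions, not taken from the papers; units are chosen so that k_B = 1.

```python
import itertools
import math

def exact_entropy(num_states, energy, temperature=1.0):
    """Entropy S = -sum_x p(x) ln p(x) of the Boltzmann distribution
    p(x) = exp(-E(x)/T) / Z, by brute-force enumeration (k_B = 1)."""
    configs = list(itertools.product(*(range(k) for k in num_states)))
    weights = [math.exp(-energy(x) / temperature) for x in configs]
    z = sum(weights)
    probs = [w / z for w in weights]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy, len(configs)

# Hypothetical toy system: 4 residues with 3 rotamers each and a made-up
# energy that penalizes neighbouring residues sharing the same rotamer index.
def toy_energy(x):
    return sum(1.0 for a, b in zip(x, x[1:]) if a == b)

entropy, n_terms = exact_entropy([3, 3, 3, 3], toy_energy)
print(n_terms)  # 81 configurations, i.e. 3**4
```

The number of terms is the product of the per-residue state counts, so a protein with hundreds of residues is far out of reach for this direct approach.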
![Page 11: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/11.jpg)
The entropy term
G = E – T·S
- Ignore the entropy term
  - (+) simple
  - (–) limits the accuracy
- Use statistical potentials derived from known protein structures (PDB)
  - (+) these statistics encode both the entropy S and the internal energy E
  - (–) the interactions are not independent*
- Model the protein structure as a probabilistic graphical model and use inference-based approaches to estimate the free energy (Kamisetty et al.)
  - (+) fast and accurate
* Thomas P.D. and Dill, K.A., Statistical Potentials Extracted From Protein Structures: How Accurate Are They?
![Page 13: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/13.jpg)
Probabilistic Graphical Models
- Are graphs that represent the dependencies among random variables
  - usually each random variable is a node, and the edges between the nodes represent conditional dependencies
- E.g.
  - Bayesian networks
  - (pair-wise) Markov random fields
  - Factor graphs
![Page 14: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/14.jpg)
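Bayes Nets
- $X_i$ – random variables
- $x_i$ – values for the rv
- Each variable can be in a discrete number of states
- Arrows – conditional probabilities
- Each variable is independent of its non-descendants, given its parents
- Joint probability: $p(x_1, \dots, x_N) = \prod_{i=1}^{N} p(x_i \mid \mathrm{Par}(x_i))$
- Marginal probability: $p(x_N) = \sum_{x_1} \cdots \sum_{x_{N-1}} p(x_1, \dots, x_N)$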
![Page 15: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/15.jpg)
Belief: probability computed approximately
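The joint probability of a Bayes net is the product of each node's conditional probability given its parents, and a marginal is the sum of the joint over all other variables. The sketch below illustrates this on a hypothetical three-node chain A → B → C with made-up probability tables; the names `p_a`, `p_b_given_a` and `p_c_given_b` are assumptions for illustration only.

```python
import itertools

# Hypothetical binary network A -> B -> C; all numbers are made up.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}    # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p_c_given_b[b][c]

def joint(a, b, c):
    """Joint probability: product of each node's conditional given its parents."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

def marginal_c(c):
    """Marginal of C: sum the joint over all states of the other variables."""
    return sum(joint(a, b, c) for a, b in itertools.product((0, 1), repeat=2))

# The joint sums to one over all 2**3 configurations.
total = sum(joint(a, b, c) for a, b, c in itertools.product((0, 1), repeat=3))
print(total, marginal_c(0), marginal_c(1))
```

With N variables the sum in `marginal_c` has a number of terms exponential in N, which is what motivates approximate beliefs.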
![Page 16: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/16.jpg)
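Markov Random Fields
- $X_i$ – hidden variables
- $x_i$ – values for the hidden vars
- $y_i$ – observed variables
- compatibility functions (potentials):
  - $\phi_i(x_i, y_i)$, often called the evidence for $x_i$
  - $\psi_{ij}(x_i, x_j)$ for connected vars $x_i$ and $x_j$
- Overall joint probability: $p(\{x\}, \{y\}) = \frac{1}{Z} \prod_{(ij)} \psi_{ij}(x_i, x_j) \prod_i \phi_i(x_i, y_i)$, where Z is a normalization constant (also called the partition function)
- This is a pair-wise MRF because the potential $\psi_{ij}$ is pair-wise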
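As a hedged illustration, the sketch below builds a tiny pair-wise MRF and computes its partition function by brute force. It assumes the usual pair-wise form, in which the joint is proportional to the product of pair-wise potentials over edges and one-node evidence terms. The 3-node cycle, the tables in `phi` and the function `psi` are made up, and the observed variables are folded into `phi`.

```python
import itertools
import math

# Hypothetical 3-node cycle with binary hidden variables; observations are
# folded into the one-node evidence phi[i][x_i]. All numbers are made up.
phi = [[1.0, 2.0], [1.5, 0.5], [1.0, 1.0]]
edges = [(0, 1), (1, 2), (0, 2)]

def psi(xi, xj):
    """Pair-wise compatibility: neighbours prefer to agree."""
    return 2.0 if xi == xj else 1.0

def unnormalized(x):
    """Product of all evidence terms and all pair-wise potentials for x."""
    w = math.prod(phi[i][xi] for i, xi in enumerate(x))
    for i, j in edges:
        w *= psi(x[i], x[j])
    return w

configs = list(itertools.product((0, 1), repeat=3))
Z = sum(unnormalized(x) for x in configs)       # partition function
p = {x: unnormalized(x) / Z for x in configs}   # joint probability
```

Computing `Z` this way visits every configuration, so it is only feasible for very small graphs.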
![Page 17: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/17.jpg)
Factor Graphs
- Bipartite graph:
  - $X_i$ – variable nodes ($x_i$ – values for the vars)
  - $f_a$ – function (factor) nodes (represent the interactions between variables)
- The joint probability factors into a product of functions: $p(\{x\}) = \frac{1}{Z} \prod_a f_a(\{x\}_a)$
E.g.:
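One way to make such a factorization concrete is the Python sketch below. The four binary variables, the factor names `fA`, `fB`, `fC`, their scopes and their tables are all hypothetical choices for illustration. Each factor lists the variable nodes it touches, which gives the bipartite structure directly.

```python
import itertools
import math

# Hypothetical factor graph over four binary variables; factor tables are made up.
# Each factor node lists the variable nodes it touches (the bipartite edges).
factors = {
    "fA": ((0, 1), lambda a, b: 2.0 if a == b else 1.0),
    "fB": ((1, 2, 3), lambda b, c, d: 1.0 + b + c * d),
    "fC": ((3,), lambda d: 0.5 if d == 0 else 1.5),
}

def unnormalized(x):
    """Product of all factor values for one joint assignment x."""
    return math.prod(f(*(x[v] for v in scope)) for scope, f in factors.values())

configs = list(itertools.product((0, 1), repeat=4))
Z = sum(unnormalized(x) for x in configs)
p = {x: unnormalized(x) / Z for x in configs}

# Bipartite adjacency: which factor nodes neighbour each variable node.
neighbours = {v: [name for name, (scope, _) in factors.items() if v in scope]
              for v in range(4)}
print(neighbours)
```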
![Page 19: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/19.jpg)
Graphical Models
Bayes nets
pair-wise MRF
factor graphs
Understanding Belief Propagation and its Generalizations, Yedidia, J.S., Freeman, W.T. and Weiss Y. (2002)
![Page 21: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/21.jpg)
Belief Propagation (BP)
- Marginal probabilities that we compute approximately = beliefs
- Marginal probability: $p(x_N) = \sum_{x_1} \cdots \sum_{x_{N-1}} p(x_1, \dots, x_N)$
  - The number of terms in the sums grows exponentially with the number of variables
- BP is a method for approximating the marginal probabilities in a time that grows linearly with the number of variables (nodes)
- The BP algorithms for pair-wise MRFs, Bayes nets and factor graphs are mathematically equivalent at every iteration
![Page 22: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/22.jpg)
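Belief Propagation (BP)
- Notation: hidden variables $x_i$, observed variables $y_i$, compatibility functions (potentials) $\phi_i(x_i, y_i)$ and $\psi_{ij}(x_i, x_j)$, marginal probabilities $p(x_i)$; below, $\phi_i(x_i)$ is shorthand for $\phi_i(x_i, y_i)$ with $y_i$ fixed
- $m_{ij}(x_j)$: the message from node $i$ to node $j$ about the state node $j$ should be in
  - E.g.: if $x_j$ has 3 possible values {1,2,3}, then $m_{ij}(x_j)$ has 3 components, one per value
- The belief at each node: $b_i(x_i) = k\, \phi_i(x_i) \prod_{j \in N(i)} m_{ji}(x_i)$, where $k$ is a normalization constant and $N(i)$ is the set of neighbours of node $i$
- The message update rule: $m_{ij}(x_j) \leftarrow \sum_{x_i} \phi_i(x_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in N(i) \setminus j} m_{ki}(x_i)$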
![Page 24: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/24.jpg)
Belief Propagation (BP)
Iterative method
When the MRF has no cycles, the beliefs computed using BP are exact!
Even when the MRF has cycles, the BP algorithm is still well defined and empirically often gives good approximate answers.
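The claim that BP is exact on cycle-free graphs can be checked numerically. The sketch below is a minimal sum-product implementation for a pair-wise MRF, under the standard update in which a message from i to j sums, over the states of i, the evidence at i times the pair-wise potential times all messages into i except the one from j. The function names `run_bp` and `exact_marginals`, the star-shaped toy tree and every number in it are assumptions made for illustration.

```python
import itertools
import math

def run_bp(phi, psi, edges, iters=20):
    """Synchronous sum-product BP on a pair-wise MRF.
    phi[i][xi]: one-node evidence; psi[(i, j)][xi][xj]: pair-wise potential."""
    n = len(phi)
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def pair(i, j, xi, xj):
        # psi is stored once per undirected edge, keyed as given in `edges`
        return psi[(i, j)][xi][xj] if (i, j) in psi else psi[(j, i)][xj][xi]

    # m[(i, j)][xj]: message from node i to node j, initialised uniformly
    m = {(i, j): [1.0] * len(phi[j]) for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in m:
            msg = []
            for xj in range(len(phi[j])):
                total = 0.0
                for xi in range(len(phi[i])):
                    incoming = math.prod(m[(k, i)][xi] for k in nbrs[i] if k != j)
                    total += phi[i][xi] * pair(i, j, xi, xj) * incoming
                msg.append(total)
            s = sum(msg)
            new[(i, j)] = [v / s for v in msg]  # normalise for numerical stability
        m = new
    beliefs = []
    for i in range(n):
        b = [phi[i][xi] * math.prod(m[(k, i)][xi] for k in nbrs[i])
             for xi in range(len(phi[i]))]
        s = sum(b)
        beliefs.append([v / s for v in b])
    return beliefs

def exact_marginals(phi, psi, edges):
    """Brute-force marginals, exponential in the number of nodes."""
    n = len(phi)
    marg = [[0.0] * len(phi[i]) for i in range(n)]
    for x in itertools.product(*(range(len(f)) for f in phi)):
        w = math.prod(phi[i][x[i]] for i in range(n))
        for i, j in edges:
            w *= psi[(i, j)][x[i]][x[j]]
        for i in range(n):
            marg[i][x[i]] += w
    return [[v / sum(row) for v in row] for row in marg]

# Hypothetical star-shaped tree (node 1 in the centre); all numbers are made up.
phi = [[1.0, 2.0], [1.5, 0.5], [1.0, 1.0], [0.3, 0.7]]
edges = [(0, 1), (1, 2), (1, 3)]
psi = {(0, 1): [[2.0, 1.0], [1.0, 3.0]],
       (1, 2): [[1.0, 2.0], [2.0, 1.0]],
       (1, 3): [[2.0, 0.5], [1.0, 2.0]]}
bp = run_bp(phi, psi, edges)
exact = exact_marginals(phi, psi, edges)
max_error = max(abs(a - b) for ra, rb in zip(bp, exact) for a, b in zip(ra, rb))
```

On this tree `max_error` is at the level of floating-point round-off. Adding an edge that closes a cycle keeps the algorithm well defined but generally makes the beliefs approximate.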
![Page 25: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/25.jpg)
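Graphical Models and Free Energy
- Statistical physics (Boltzmann’s law): $p(\{x\}) = \frac{1}{Z} e^{-E(\{x\})/T}$
- Kullback-Leibler distance between the beliefs $b$ and the true distribution $p$: $D(b \,\|\, p) = \sum_{\{x\}} b(\{x\}) \ln \frac{b(\{x\})}{p(\{x\})}$
- With $T = 1$: $D(b \,\|\, p) = G(b) + \ln Z$, where $G(b) = \sum_{\{x\}} b(\{x\}) E(\{x\}) + \sum_{\{x\}} b(\{x\}) \ln b(\{x\}) = U(b) - S(b)$ is the Gibbs free energy
- KL = 0 iff the beliefs are exact, and in this case we have $G = -\ln Z$
- When the beliefs are exact the Gibbs free energy achieves its minimal value (–ln Z, also called the “Helmholtz free energy”)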
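The relationship between the Gibbs free energy, the Kullback-Leibler distance and –ln Z can be verified on a toy system. The sketch below assumes T = 1 and defines the Gibbs free energy of a belief as its average energy minus its entropy. The energy function and the "wrong" uniform belief are made up for illustration.

```python
import itertools
import math

# Hypothetical toy energy over three binary variables (made up), with T = 1.
def energy(x):
    return 0.5 * x[0] + 1.0 * (x[0] != x[1]) + 0.3 * x[1] * x[2]

configs = list(itertools.product((0, 1), repeat=3))
Z = sum(math.exp(-energy(x)) for x in configs)
p = {x: math.exp(-energy(x)) / Z for x in configs}  # Boltzmann's law

def gibbs_free_energy(b):
    """G(b) = U(b) - S(b): average energy minus entropy of the belief b."""
    u = sum(b[x] * energy(x) for x in configs)
    s = -sum(b[x] * math.log(b[x]) for x in configs if b[x] > 0)
    return u - s

def kl(b, q):
    """Kullback-Leibler distance between belief b and distribution q."""
    return sum(b[x] * math.log(b[x] / q[x]) for x in configs if b[x] > 0)

uniform = {x: 1.0 / len(configs) for x in configs}  # a deliberately wrong belief

g_exact = gibbs_free_energy(p)        # equals -ln Z
g_wrong = gibbs_free_energy(uniform)  # equals -ln Z + KL(uniform || p)
```

Any belief other than the exact distribution pays a penalty equal to its KL distance, which is why minimizing G over beliefs recovers –ln Z.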
![Page 26: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/26.jpg)
Approximating the Free Energy
- The exact Gibbs free energy involves summations over an exponential number of terms
- Approximations
  - Mean-field free energy approximation: uses one-node beliefs and assumes that $b(\{x\}) = \prod_i b_i(x_i)$
  - Bethe free energy approximation: uses one-node beliefs and two-node beliefs
  - Region-based free energy approximations: idea: break up the graph into a set of regions, compute the free energy over each region and then approximate the total free energy by the sum of the free energies over the regions
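The mean-field and Bethe approximations can be compared on a small example. The sketch below evaluates both on a hypothetical 3-node chain, using the exact one-node and two-node marginals as beliefs and T = 1. It uses the standard Bethe form, in which each one-node entropy is weighted by (degree – 1) to correct for overcounting. The energies `E1` and `E2` are made up.

```python
import itertools
import math

# Hypothetical 3-node chain 0 - 1 - 2 with binary states; energies are made up, T = 1.
E1 = [[0.0, 0.4], [0.2, 0.0], [0.0, 0.7]]  # E1[i][xi]
edges = [(0, 1), (1, 2)]

def E2(xi, xj):
    """Pair energy, the same on every edge: disagreeing neighbours cost 1."""
    return 0.0 if xi == xj else 1.0

configs = list(itertools.product((0, 1), repeat=3))

def energy(x):
    return sum(E1[i][x[i]] for i in range(3)) + sum(E2(x[i], x[j]) for i, j in edges)

Z = sum(math.exp(-energy(x)) for x in configs)
p = {x: math.exp(-energy(x)) / Z for x in configs}

# Exact one-node and two-node marginals, used here as the beliefs.
b1 = [[sum(p[x] for x in configs if x[i] == s) for s in (0, 1)] for i in range(3)]
b2 = {(i, j): [[sum(p[x] for x in configs if x[i] == s and x[j] == t)
                for t in (0, 1)] for s in (0, 1)] for i, j in edges}
degree = [sum(i in e for e in edges) for i in range(3)]

def xlogx(v):
    return v * math.log(v) if v > 0 else 0.0

def bethe_free_energy():
    """Average energy minus the Bethe entropy (one- and two-node beliefs)."""
    u = sum(b1[i][s] * E1[i][s] for i in range(3) for s in (0, 1))
    u += sum(b2[e][s][t] * E2(s, t) for e in edges for s in (0, 1) for t in (0, 1))
    neg_s = sum(xlogx(b2[e][s][t]) for e in edges for s in (0, 1) for t in (0, 1))
    neg_s -= sum((degree[i] - 1) * xlogx(b1[i][s]) for i in range(3) for s in (0, 1))
    return u + neg_s

def mean_field_free_energy():
    """Free energy of the fully factorized belief built from one-node beliefs."""
    u = sum(b1[i][s] * E1[i][s] for i in range(3) for s in (0, 1))
    u += sum(b1[i][s] * b1[j][t] * E2(s, t)
             for i, j in edges for s in (0, 1) for t in (0, 1))
    neg_s = sum(xlogx(b1[i][s]) for i in range(3) for s in (0, 1))
    return u + neg_s

g_bethe = bethe_free_energy()
g_mf = mean_field_free_energy()
```

Because the chain has no cycles, `g_bethe` matches –ln Z up to round-off, while `g_mf` can only be greater than or equal to it since it ignores correlations between neighbours.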
![Page 27: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/27.jpg)
Generalized Belief Propagation
- Region-based free energy approximations
  - idea: break up the graph into a set of regions, compute the free energy over each region and then approximate the total free energy by the sum of the free energies over the regions
- GBP
  - a message-passing algorithm similar to BP
  - messages between regions vs. messages between nodes
  - the regions of nodes that communicate can be visualized in terms of a region graph (Yedidia, Freeman, Weiss)
  - the region-graph approximation method generalizes the Bethe method, the junction graph method and the cluster variation method
  - different choices of region graphs give different GBP algorithms
  - tradeoff: complexity / accuracy
  - how to optimally choose the regions – more art than science
![Page 28: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/28.jpg)
Generalized Belief Propagation
- Usually improves on simple BP (when the graph contains cycles)
- Good advice: when constructing the regions, try to include at least the shortest cycles inside regions
- For region graphs with no cycles, GBP is guaranteed to work
- Even when the region graph has cycles, GBP usually gives good results
Constructing Free-Energy Approximations and Generalized Belief Propagation Algorithms, Yedidia, J.S., Freeman, W.T. and Weiss Y.
![Page 29: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/29.jpg)
Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation
Kamisetty H., Xing, E.P. and Langmead C.J.
![Page 30: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/30.jpg)
Model
- Model the protein structure as a complex probability distribution, using a pair-wise MRF
  - observed variables: backbone atom positions (continuous)
  - hidden variables: side chain atom positions represented using rotamers (discrete)
  - interactions (edges): two variables share an edge if they are closer than a threshold distance (Cα-Cα distance < 8Å)
  - potential functions: $\psi(x_i^p, x_j^q) = \exp\left(-E(x_i^p, x_j^q) / k_B T\right)$, where $E(x_i^p, x_j^q)$ is the energy of interaction between rotamer state $p$ of residue $i$ and rotamer state $q$ of residue $j$
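The edge rule and the potentials can be sketched in a few lines of Python. This is a simplified illustration at the residue level, assuming one Cα coordinate per residue and a Boltzmann-style potential exp(–E/kT). The helper names `build_edges` and `potential`, the coordinates and the `kT` default are hypothetical and not taken from the paper's implementation.

```python
import math

def build_edges(ca_coords, cutoff=8.0):
    """Connect residues i < j whose C-alpha atoms are closer than `cutoff` angstroms."""
    edges = []
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                edges.append((i, j))
    return edges

def potential(interaction_energy, kT=1.0):
    """psi = exp(-E / kT) for one pair of rotamer states (kT in the energy's units)."""
    return math.exp(-interaction_energy / kT)

# Hypothetical C-alpha coordinates (angstroms) for four residues; values are made up.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
edges = build_edges(ca)
print(edges)  # [(0, 1), (0, 2), (1, 2)]
```

The distance cutoff keeps the graph sparse, which bounds the number of pair-wise potential tables that have to be evaluated.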
![Page 31: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/31.jpg)
Model
![Page 32: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/32.jpg)
MRF to Factor Graph
![Page 33: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/33.jpg)
Building the Region Graph
- big regions – 3 or 2 variables
- small regions – one variable
- To form the region graph, add edges from each big region to all small regions that contain a strict subset of the big region’s nodes.
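The construction rule can be sketched as follows. The function `build_region_graph` and the residue labels are hypothetical. The counting numbers are an addition from the general region-graph formalism of Yedidia, Freeman and Weiss (1 for regions with no parents, otherwise 1 minus the sum over ancestors) and are not stated on the slide.

```python
def build_region_graph(big_regions):
    """Two-level region graph: big regions (2 or 3 variables) on top, one small
    region per variable below, with an edge whenever the small region's variable
    set is a strict subset of the big region's."""
    big = [frozenset(r) for r in big_regions]
    small = [frozenset([v]) for v in sorted(set().union(*big))]
    edges = [(b, s) for b in big for s in small if s < b]
    # Counting numbers: 1 for big regions (no parents); for a small region,
    # 1 minus the number of big regions that contain it.
    counting = {b: 1 for b in big}
    for s in small:
        counting[s] = 1 - sum(1 for _, child in edges if child == s)
    return big, small, edges, counting

# Hypothetical example: two big regions that share residue "r3".
big, small, edges, counting = build_region_graph([("r1", "r2", "r3"), ("r3", "r4")])
for region, c in counting.items():
    print(sorted(region), c)
```

A quick sanity check is that, for every variable, the counting numbers of all regions containing it sum to 1, so each variable is counted exactly once in the region-based free energy.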
![Page 34: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/34.jpg)
Generalized Belief Propagation
- Choice of regions
  - Idea: place residues that are closely coupled together in the same big regions
  - Balance accuracy/complexity
  - Aji and McEliece
- “Two-way algorithm” (Yedidia, Freeman, Weiss)
  - Initialize the GBP messages to random starting points and run the algorithm until the beliefs converge or for a maximum of 100 iterations
![Page 35: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/35.jpg)
Results on the Decoy Datasets
G = E – T·S
- 48 datasets
- Each dataset:
  - multiple decoys and the native structure of a protein
  - all decoys had similar backbones to the native structure (Cα RMSD < 2.0Å)
- when ranked in decreasing order of entropy, the native structure is ranked the highest in 87.5% of the datasets
- PROCHECK (protein structure validation): for the datasets in which the native structure was ranked 3rd or 4th, this structure had a very high number of “bad” bond angles
- For dissimilar backbones: 84%
![Page 36: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/36.jpg)
Results on the Decoy Datasets
Comparison to other energy functions:
![Page 37: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/37.jpg)
Predicting ΔΔG upon mutation
![Page 38: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/38.jpg)
Summary
- Model protein structures as complex probability distributions, using probabilistic graphical models (MRFs and FGs)
- Use Generalized Belief Propagation (two-way algorithm) to approximate the free energy
- Successfully use the method to
  - distinguish native structures from decoys
  - predict changes in free energy after mutation
- Other applications: side chain placement (Yanover and Weiss), other inference problems over the graphical model.
![Page 39: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation Kamisetty H., Xing, E.P. and Langmead C.J](https://reader035.vdocument.in/reader035/viewer/2022062316/568167cf550346895ddd20ff/html5/thumbnails/39.jpg)
Questions?