4. lecture ss 2006gk 11261 the development of methods that efficiently determine the global minima...
TRANSCRIPT
4. Lecture SS 2006
GK 1126 1
The development of methods that efficiently determine the global minima of
complex and rugged energy landscapes remains a challenging problem with
applications in many scientific and technological areas.
In particular, for NP-hard problems, stochastic methods offer an acceptable
compromise between the reliability of the method and its computational cost,
which scales only as a power law with the number of variables (for a fixed
probability to locate the true minimum).
In such techniques the global minimization is performed through the simulation of
a dynamical process for a “particle” on the multidimensional potential energy
surface.
V4: Optimization methods for biomolecular structure modelling
4. Lecture SS 2006
GK 1126 2
Unbiased methods vs. Biased methods (not treated here)
e.g. protein structure based on
databases of known structures
algorithms for clusters that favor
compact geometries
The most efficient method for any given problem is likely to be system dependent.
V4: Global optimisation methods
Chapter 6.7
4. Lecture SS 2006
GK 1126 3
The effort involved in solving a given global optimisation problem is described by
computational complexity theory.
Locating the global minimum on a PES belongs to the class of NP-hard
problems. However, this may be a worst case scenario.
Certainly, there are PESs for crystals, naturally occurring proteins and many
clusters that contain exponentially large number of minima.
But locating the probable global minimum is not difficult experimentally and is
sometimes easy in computer models as well.
Looking at the disconnectivity graphs on the following slides suggests that
finding the global minimum should be relatively easy if the system corresponds to
a ‚palm tree‘, but relatively difficult if it corresponds to the ‚banyan tree‘.
Complexity of global search
4. Lecture SS 2006
GK 1126 4
Fig. 5.4. One-dimensional potential energy functions (left) and the corresponding disconnectivity graphs
(right). The dotted lines indicate the energies at which the superbasin analysis was performed.
(a) Low downhill barriers and a well-defined global minimum produced by relatively large systematic
changes in potential energy between successive minima (‚palm tree‘);
(b) higher downhill barriers (‚willow tree‘);
(c) a landscape where the barrier heights are larger than the typical energy difference between
successive minima (‚banyan tree‘).
Disconnectivity graphs
Taken from Wales book
4. Lecture SS 2006
GK 1126 5
Lennard-Jones clusters are clusters of particles that interact via simple LJ
interactions. Optimal test system for global optimization algorithms.
Global minima of Lennard-Jones clusters
Wales, Scheraga, Science 285, 1368 (1999)
4. Lecture SS 2006
GK 1126 6
Simulated annealing (SA)
Taken from Wales book
SA offered perhaps the first generally applicable global
optimisation algorithm.
The system is equilibrated at a high temperature and then
cooled, with the objective of maintaining equilibrium until
the global free energy minimum and potential energy
minimum coincide.
A necessary and sufficient condition for a SA run to locate
the global minimum is that the temperature decreases
logarithmically with time. However, such a schedule is
rather slow in practice.
The problem with SA occurs when the global free energy
minimum changes with temperature.
Then, the system may become trapped behind a
barrier that will rise compared to kT as the
temperature is further decreased.
4. Lecture SS 2006
GK 1126 7
Conformational space annealing (CSA)
In the CSA approach attention is gradually focused onto smaller regions of
conformation space as the search progresses.
A set of conformations is continuously updated, as for genetic algorithms, starting
from pre-assigned randomly generated and energy-minimized conformations.
In each cycle a number of dissimilar conformations are selected from the current set,
perturbed and energy minimised.
For each of the resulting structures the closest conformation from the current set is
identified and the two minima are considered „similar“ if they lie within a cutoff, which
decreases with time.
For two similar structures, the one with the lower energy is saved in the current set.
New structures that are not similar to any of the current set can still replace the
highest energy member of that set if they lie lower in energy.
4. Lecture SS 2006
GK 1126 8
Hypersurface deformation methods attempt to simplify the global optimisation
problem by applying a transformation to the PES that smoothes it and reduces
the number of local minima.
The global minimum is accordingly easier to find on the transformed surface,
but it is then necessary to reverse the transformation and map the global
minimum back to the real surface.
Since the global minimum may change during the reverse mapping, depending on
how the surface was smoothed, an efficient local search procedure must be
applied during this process, and more than one minimum must be tracked
backwards.
Deformation methods
4. Lecture SS 2006
GK 1126 9
In the DEM the true PES is used as the initial boundary condition for a
multidimensional diffusion equation
with the boundary condition
As t we have
and for sufficiently large values of t the deformed surface has a single minimum.
The diffusion equation
t
tVtV
,
,2 XX
XX VV 0,
0,2 tV X
4. Lecture SS 2006
GK 1126 10
Model packing of 2 TM helices for glycophorin A dimer.
Smoothing: pairwise interactions between point atoms are
altered to be interactions between spacially delocalized atoms.
Assume that the thermodynamic ground state – the conformation
of lowest free energy – is also the one of lowest potential energy.
Success story: potential smoothing algorithm accurately predicts TM helix packing
4. Lecture SS 2006
GK 1126 11
The diffusion equation
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
Fig. 1 One-dimensional schematic of the effect of a smoothing protocol on a potential energy surface. The original PES is transformed by successive application of a smoothing operator, where the extent of smoothing is dictated by a control parameter t. The undeformed original surface (t = 0), the surface at an intermediate level of smoothing (t = t1) and a highly smoothed surface (t = tlarge) are shown. As the surface is transformed, higher lying minima merge into catchment regions of low lying minima and barriers between minima are progressively lowered. Open circles are starting or intermediate points on each surface. Solid circles are local minima. Dashed arrows show the result of local optimization ending at a local minimum. Solid arrows represent adiabatic movement from a local minimum on one surface to the corresponding starting point on a rougher surface. A simple smoothing protocol consists of repeated cycles of local optimization followed by adiabatic transfer to the next surface. This figure shows an idealized smoothing protocol wherein the unique minimum that remains on the t = tlarge surface is directly related to the global minimum on the original PES. A series of optimizations followed by a gradual reduction in the level of smoothing will therefore lead back to the global minimum. Note that the reversing protocol depends on a set of discrete t steps between surfaces. For small t, vertical transfer to a less smooth surface will result in a point close to a transition state whenever a bifurcation has been introduced, as at t = t1 in the figure. The simple protocol will only succeed if there is a consistent bias toward the minimum on the broader, deeper side of the bifurcation.
4. Lecture SS 2006
GK 1126 12
The diffusion equation
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
Schematic of a more realistic potential smoothing protocol for molecular search problems. Shown is a crossing between the two surviving minima on the t = t2 surface. A reversing schedule encounters the first bifurcation at t = t2. At this level of smoothing the protocol favors basin B over basin A due to a crossing of relative energies, which is an artifact of the averaging process. The reversing protocol from Fig. 1 follows a path it chooses at the first bifurcation. If bifurcations are sampled where the relative energies of the alternative basins are inverted from the t = 0 surface, then the simple method will not converge to the global minimum.
Between t = t2 and t = 0 there exist values of t for which the energy ordering resembles that of the original PES. A local search process coupled to the smoothing schedule can potentially recognize errors due to earlier energy crossings. For example, a local search represented by the dotted arrow on the t = t1 surface would correctly decide that basin A should be favored over basin B. Local searches are especially efficient when carried out on smoother surfaces since the extent of conformational space sampled is larger than for the original PES. If the global minimum is a very narrow and deep well on the PES then crossings can occur for very small values of the smoothing parameter t. For such problems, smoothing coupled to local search may fail to converge to locate the global minimum due to inadequate local sampling.
4. Lecture SS 2006
GK 1126 13
Fig. 3 Schematic of a helix dimer illustrating the method used to
compute helix packing parameters. H1 and H2 denote the helix
axes for helix 1 and helix 2 respectively. P4 is a point along the
line of contact connecting the two helices between points P3 and
P5. P1 is the C position of Phe 78 for helix 1, P2 is a point on
the helix axis H1 and C1 is the location of the center of mass of
helix 1. Similarly P7 is the C position of Phe 78 for helix 2, P6 is
a point on the helix axis H2, and C2 is the location of the center
of mass for helix 2.
The crossing angle W is the torsion angle defined by the points
P2, P3, P5 and P6. The distance of closest contact d is the
distance between the points P3 and P5. The angle a that
measures the rotation of helix 1 about its axis H1 is the torsion
angle defined by the points P1, P2, P3 and P4.
Similarly, the angle b that measures the rotation of helix 2 about
its axis H2 is the torsion angle defined by the points P7, P6, P5
and P4.
The scalar shift parameter s is defined as |T1-T2|, where T1 is
the distance between points P3 and C1 and T2 is the distance
between points P5 and C2.
Inter-helix degrees of freedom
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
4. Lecture SS 2006
GK 1126 14
Apply modified OPLS united atom force-field: replace each of the pairwise LJ 12-6
van der Waals terms by a sum of two Gaussians:
Force field
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
2
1
2expi
iigaussLJ rbaUU
Fit ai and bi to match LJ 12-6 function with hard sphere radius and well depth of
one.
Here, a1 and b1 are positive to mimick repulsion at small distances,
a2 and b2 are negative and positive to mimick the attractive part of the LJ potential.
4. Lecture SS 2006
GK 1126 15
Transform potential function U(r) to Ud(r,t) such that
where is a multidimensional diffusion operator.
The deformed Gaussian van der Waals function that is an analytical solution to the
diffusion equation is of the form:
Force field
dtU
U
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
2
1
2
23 41exp
41,
i
ii
t
rb
t
atrU
where t is the deformation parameter that controls the extent of potential
smoothing. For the torsional potential term, the smoothened functional form
becomes:
jj tjjV 2expcos1
2
1
where is the torsional angle value, j is the periodicity, Vj is the half-amplitude, is
a phase factor and t is the deformation parameter.
Initial value of smoothing parameter t = 4.25.
4. Lecture SS 2006
GK 1126 16
At some chosen value of t along the reveral protocol the level of smoothing is
reduced followed by local minimization.
The system is moved out of this local minimum along a set of search directions
corresponding to the eigenvectors of the Hessian matrix at the current local
minimum.
The system is moved along a search direction i and the conformational energy is
computed at each equidistant point k along the search direction.
If the energy at point k satistifies
it is chosed to be a new point from which to start a minimization.
Potential smoothing and search (PSS) algorithm
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
1,1,,1, and kikikiki VVVV
4. Lecture SS 2006
GK 1126 17
This pair of conditions suggests apparent downhill progress on a PES.
If the energy of the alternate minimum is lower than the energy of the original
minimum, the system is moved to the alternate location and the search process is
iterated until no new minima of lower energy can be found at the current level of
smoothing.
Start with t = 4.25. Local searches along Hessian eigenvector directions are
performed for all values of t < 4.0 during the reversing schedule.
A typical reversing schedule includes 50 – 100 values of t between 4.25 and 0.
During search, alternate low energy minima are found on the t =1,8, 0.54, 0.23 and
0.18 surfaces.
Search converges to the same minima irrespective of starting conformations.
Potential smoothing and search (PSS) algorithm
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
4. Lecture SS 2006
GK 1126 18
Distribution of interhelical energies for the 5,834
local minima found from a two-body grid search for
helices from the consensus NMR structure for the
GpA helix dimer. The global minimum on this grid
has an energy of -29.56 kcal mol–1, and is also
located by PSS. The low energy conformers
obtained from this grid search show dissimilarities
from the NMR structure akin to the results for the
idealized helices described in Table 1.
b, Distribution of interhelical energies for the 4,105
local minima found from a two-body grid search for
idealized helices built using (,) angles for a
canonical -helix and -angles from a rotamer
library. The global minimum on this grid has an
energy of -31.84 kcal mol–1. This is also the
structure found using the PSS algorithm.
Interhelical energies during a grid search
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
docking of NMR helices
docking of idealized helices:gives similar distribution using NMR helices does notbias result.
4. Lecture SS 2006
GK 1126 19
Ribbon drawing derived from the TM helix
portion of the experimental NMR structure
(PDB file 1AFO, Model 13).
b, The corresponding helix backbone and side
chains from the global minimum determined
by the PSS algorithm. Regions involved in
interhelical packing are very similar in the two
structures, and the RMSD for superposition
over all Ca atoms is 0.59 Å.
Comparison of modeled structure with NMR structure
Pappu, Marshall, Ponder,Nat. Struct. Biol. 6, 50 (1999)
4. Lecture SS 2006
GK 1126 20
Genetic algorithms
Taken from Wales book
4. Lecture SS 2006
GK 1126 21
The freezing problem in stochastic minimization methods arises when the energy
difference between “adjacent” local minima on the PES is much smaller than the
energy of intervening transition states separating them. As an example consider
the dynamics on the model potential below. At high temperatures a particle can
still cross the barriers, but not differentiate between the wells. As the temperature
drops, the particle will eventually become trapped with almost equal probability in
any of the wells, failing to resolve the energy difference between them. The
physical idea behind the stochastic tunneling method (STUN) is to allow the
particle to “tunnel” forbidden regions of the PES, once it has been determined that
they are irrelevant for the low-energy properties of the problem.
Stochastic tunnelling method
Wenzel, Hamacher, Phys. Rev. Lett. 82, 3003 (1999)
4. Lecture SS 2006
GK 1126 22
(a) Schematic one-dimensional PES and (b) STUN
effective potential, where the minimum indicated by
the arrow is the best minimum found so far. All wells
that lie above the best minimum found are
suppressed. If the dynamical process can escape the
well around the current ground-state estimate, it will
not be trapped by local minima that are higher in
energy. Wells with deeper minima are preserved and
enhanced.
(c) After the next minimum has been located, wells
that were still pronounced in (b) are also suppressed.
Once the true ground state has been found (not
shown), all other wells have been suppressed and will
no longer trap the dynamical process. The dotted line
in (c) illustrates an energy threshold 0 < ft < 1 to
classify the nature of the dynamics. Adjusting the
temperature to maintain a particular average effective
energy balances the tunneling and the local-search
phases of the algorithm.
Stochastic tunnelling method
Wenzel, Hamacher, Phys. Rev. Lett. 82, 3003 (1999)
Apply nonlinear transformation to PES:
01 fxfSTUN exf
4. Lecture SS 2006
GK 1126 23
The following transformation of the energy landscape does not change the
relative energies of any minima:
where X represents the 3D vector of nuclear coordinates and „min“ signifies that
an energy minimization is carried out starting from X. The transformed energy
at any point, X, becomes the energy of the structure obtained by minimization.
Each local minimum is, therefore, surrounded by a catchment basin of constant
energy consisting of all the neighboring geometries from which that particular
minimum is obtained. The overall energy landscape becomes a set of plateaus,
one for each catchment basin, but the energies of the local minima are
unaffected by the transformation.
Basin-hopping
Wales, Scheraga, Science 285, 1368 (1999)
XX EE min~
XE~
4. Lecture SS 2006
GK 1126 24
Basin-hopping
Wales, Doye, J Phys Chem A 101, 5111 (1997)
Aside from removing all the transition state regions from the surface, the
catchment basin transformation also accelerates the dynamics, because the
system can psss between basins all along their boundaries. Atoms can even
pass through each other without encountering prohibitive energy barriers.
4. Lecture SS 2006
GK 1126 25
Success story
Ab initio folding of 40-residue headpiece of HIV accessory protein 1F4I-40.
Use all-atom force field: atomically resolved electrostatic model, group specific
dielectric constants, LJ parametrization adapted to exp. distances in X-ray
structures.
SASA solvation model fitted to exp. solvation free energies
of Gly-X-Gly peptides.
Only degrees of freedom that are optimized:
rotations of dihedral angles of the backbone and
rotations of dihedral angles the side chains
Herges, Wenzel, Phys. Rev. Lett. 94, 018101 (2005)
4. Lecture SS 2006
GK 1126 26
Optimization strategy: adapted basin-hopping technique
Replace single minimization step by SA run with self-adapting cooling cycle and
length.
At the end of one annealing step, accept new conformation if its energy
difference to the current configuration was no higher than a given threshold T.
Start each simulation at nonclashing conformation with random backbone
dihedral angles. 3 steps:
(a) high-temperature bracket of 800/300 K, T = 15 K, reduced solvent effect,
(b) start from final configuration of (a), same SA, full solvent effect
(c) low-temperature bracket of 600/3 K, T = 1K.
Within each annealing run, the temperature is geometrically decreased. The
number of steps per annealing run is gradually increased to ensure better
convergence.
Total effort: each full simulation requies ca. 107 energy evalulations 10 ns of
standard MD simulation. Herges, Wenzel, Phys. Rev. Lett. 94, 018101 (2005)
4. Lecture SS 2006
GK 1126 27
Good agreement with NMR
Herges, Wenzel, Phys. Rev. Lett. 94, 018101 (2005)
4. Lecture SS 2006
GK 1126 28
Comparison of C – C distances
Dark diagonal block: intrahelical
contacts well resolved (not
surprising).
Off-diagonal dark blocks indicate that
also a large fraction of long-range
native contacts is reproduced
correctly.
Herges, Wenzel, Phys. Rev. Lett. 94, 018101 (2005)
4. Lecture SS 2006
GK 1126 29
Classification of decoy structures
60.000 low-energy conformations (decoys)
were generated.
Group decoys with RMSD < 3.0 Å into
families with free-energy brackets of 2
kcal/mol.
Construct decoy tree.
As soon as two decoys in different branches
have an RMSD of < 3.0 Å, the 2 families are
merged into one superfamily.
Herges, Wenzel, Phys. Rev. Lett. 94, 018101 (2005)
Trees with very short stems and many low-energy branches are characteristic of glassy PES =
Levinthal paradoxon.
Trees with well structured trees with few terminal branches suggest existence of a folding
funnel. This tree here is consistent with a broad folding funnel.
4. Lecture SS 2006
GK 1126 30
Global optimisation is a great challenge.
Efficient schemes exist that may each be most advantageous in a particular class
of problems.
- Genetic algorithms
- Deformation of PES – diffusion equation method, stochastic tunneling,
basin hopping
Ab initio protein structure prediction of small domains up to 100 amino acids is
about to be mastered for „easy“ cases.
Remains to be seen if one can distinguish „easy“ and „hard“ cases.
Summary