Structuring and Sampling in Complex Conformational Space
Weighted Ensemble Dynamics Simulation
Xin Zhou [email protected]
Asia Pacific Center for Theoretical Physics, Dep. of Phys., POSTECH, Pohang, Korea
Oct 12, 2009, Beijing
Independent Junior Research Group
Multiscale Modeling & Simulations in soft materials
One available position for postdoctor/exchange Ph.D student
http://www.apctp.org/jrg/members/xzhou
Members:
X.Z. (Leader, 2008.6-)
Pakpoom Reunchan (Postdoctor, 2009.10-)
Shun Xu (Ph.D. candidate, 2008.11-)
Linchen Gong (Ph.D. candidate, 2008.12-)
Shijing Lu (Ph.D. candidate, 2009.1-)
Understanding results of simulations
traditional: project onto a low-dimensional (reaction-coordinate) space
new: kinetic transition network
Improving the efficiency of simulations
coarse-graining,
enhanced sampling,
accelerated slow dynamics
Extend spatial and temporal scales but keep necessary details
Multiscale simulations:
1. More sufficient simulation provides a more complete understanding
2. Understanding the system helps in designing more efficient simulation algorithms
Vibration of bonds: $10^{-14}$ second
Protein folding: > $10^{-6}$ second
The different scales are coupled!
multiple scales
Energetic barrier Entropic barrier
Due to high free energy (energy and/or entropy) barriers, standard MC/MD simulations need a very long time to reach equilibrium
Current advanced simulation techniques are not very helpful in overcoming entropic barriers
Barriers
A. F. Voter (1998) V. S. Pande (2000)
Ensemble Dynamics
Independently generate multiple short trajectories
Statistically analyze slow transition dynamics
n trajectories: transition rate $k \to nk$
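A minimal sketch of why n independent trajectories see the first transition at rate nk. It assumes each trajectory escapes after an exponentially distributed waiting time with rate k; the values of `k`, `n_traj` and the number of repeats are hypothetical, chosen only for illustration.

```python
import random

def first_escape_time(n_traj, k, rng):
    """Earliest escape among n_traj independent trajectories.

    Assumption: each trajectory's escape time is exponential with rate k.
    """
    return min(rng.expovariate(k) for _ in range(n_traj))

rng = random.Random(0)
k = 0.5        # hypothetical single-trajectory transition rate
n_traj = 20    # hypothetical number of independent short trajectories

samples = [first_escape_time(n_traj, k, rng) for _ in range(20000)]
mean_wait = sum(samples) / len(samples)
effective_rate = 1.0 / mean_wait   # expected to be close to n_traj * k

print(effective_rate, n_traj * k)
```

The estimate is statistical, so it only approaches n * k as the number of repeats grows.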
Weighted Ensemble Dynamics
arbitrarily select initial conformations
Independently generate multiple short trajectories
weight the trajectories
Statistically analyze slow transition dynamics
analyze state structure and equilibrium properties
Linchen Gong & X.Z. (2009)
Weighted Ensemble Dynamics
A single t-length MD trajectory is not sufficient to reach global equilibrium
Multiple t-length trajectories can be used to reproduce global equilibrium properties by reweighting the trajectories
$\langle A \rangle_{eq} = \frac{\sum_i w_i \langle A \rangle_i}{\sum_i w_i}$
Each trajectory has a unique weight ($w_i$) in contributing to equilibrium properties
The weight of a trajectory depends only on its initial conformation
$w_i = \Omega(\vec{q}_i(\tau = 0))$
$\Omega(\vec{q}) = \frac{P_{eq}(\vec{q})}{P_{init}(\vec{q})}$
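A toy illustration of the reweighting formula, assuming both the equilibrium and initial distributions are known (which the slides note is usually not the case). The two states, their populations and the observable values are hypothetical, and each trajectory is assumed to stay in its starting state.

```python
# Toy system: two meta-stable states, no transitions within the trajectory length.
P_eq = {"A": 0.75, "B": 0.25}          # hypothetical equilibrium populations
start_states = ["A"] * 4 + ["B"] * 6   # arbitrarily selected initial conformations
observable = {"A": 0.0, "B": 1.0}      # <A>_i for a trajectory confined to each state

p = len(start_states)
P_init = {s: start_states.count(s) / p for s in P_eq}

# w_i = Omega(q_i(0)) = P_eq / P_init, evaluated at the initial conformation
weights = [P_eq[s] / P_init[s] for s in start_states]
traj_averages = [observable[s] for s in start_states]

unweighted = sum(traj_averages) / p
reweighted = sum(w * a for w, a in zip(weights, traj_averages)) / sum(weights)

print(unweighted, reweighted)
```

The unweighted average reflects the arbitrary choice of starting points, while the weighted one recovers the equilibrium population of state B.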
Weighted Ensemble Dynamics
$w_i = \Omega(\vec{q}_i(0)) = \frac{P_{eq}(\vec{q}_i(0))}{P_{init}(\vec{q}_i(0))}$
$\{w_i\}$ satisfies a self-consistent equation for any selected initial conformations
The initial distribution might be unknown
The fluctuation of the weights might be too large to be practical for reproducing equilibrium properties
Usually impractical
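Expansion of Probability Density
$\Omega_{1\to2}(x) \equiv \frac{P_2(x)}{P_1(x)}$
$\langle A(x) \rangle_2 = \langle \Omega_{1\to2}(x) A(x) \rangle_1$
$\frac{P_2(x)}{P_1(x)} = 1 + \sum_{\mu\nu} g_{\mu\nu}(1) \langle \delta_1 A^\mu \rangle_2 \, \delta_1 A^\nu(x)$
$\delta_1 A \equiv A - \langle A \rangle_1$
$g_{\mu\nu}(1) = [g^{\mu\nu}(1)]^{-1}$, $g^{\mu\nu}(1) = \langle \delta_1 A^\mu \delta_1 A^\nu \rangle_1$
$\vec{s} = \Omega(x) - 1$, $\delta_1 A^\mu(x)$
$s_{1\to2}^2 \equiv \langle (\Omega_{1\to2}(x) - 1)^2 \rangle_1 = \sum_{\mu\nu} g_{\mu\nu}(1) \langle \delta_1 A^\mu \rangle_2 \langle \delta_1 A^\nu \rangle_2$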
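A small numerical check of the expansion, sketched on four discrete conformations so that a complete basis is easy to write down. The distributions `P1`, `P2` and the indicator basis functions are hypothetical choices, not taken from the talk.

```python
import numpy as np

# Hypothetical toy distributions on four discrete conformations x = 0..3.
P1 = np.array([0.4, 0.3, 0.2, 0.1])
P2 = np.array([0.1, 0.2, 0.3, 0.4])

# Basis functions A^mu(x): indicators of x = 0, 1, 2. Together with the
# constant function they span every function on four points (complete basis).
A = np.eye(4)[:3]                      # shape (3 basis functions, 4 conformations)

mean1 = A @ P1                         # <A^mu>_1
dA = A - mean1[:, None]                # delta_1 A^mu(x)
g_upper = (dA * P1) @ dA.T             # g^{mu nu}(1) = <dA^mu dA^nu>_1
g_lower = np.linalg.inv(g_upper)       # g_{mu nu}(1)
dA_in_2 = dA @ P2                      # <delta_1 A^mu>_2

omega = 1.0 + dA_in_2 @ g_lower @ dA   # expansion evaluated at every x
s2 = dA_in_2 @ g_lower @ dA_in_2       # s^2_{1->2}

print(omega, P2 / P1)
print(s2, np.sum(P1 * (P2 / P1 - 1.0) ** 2))
```

With a complete basis the expansion reproduces P2/P1 exactly; with an incomplete basis it would only give the projection onto the chosen functions.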
Theory of WED
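Self-consistent equation:
$w_i = 1 + \sum_{\mu\nu} g_{\mu\nu}(init) \langle \delta_{init} A^\mu \rangle_{eq} \, \delta_{init} A^\nu(\vec{q}_i(0))$
$= 1 + \frac{1}{p} \sum_j w_j \sum_{\mu\nu} \langle \delta_{init} A^\mu \rangle_j \, g_{\mu\nu}(init) \, \delta_{init} A^\nu(\vec{q}_i(0))$
the (short) initial segments of trajectories replace the initial configurations
$w_i = \Omega(\vec{q}_i(0)) \to \frac{1}{\alpha t} \int_0^{\alpha t} \Omega(\vec{q}_i(\tau)) \, d\tau$, $\alpha \ll 1$
$w_i = 1 + \frac{1}{p} \sum_j \sum_{\mu\nu} g_{\mu\nu}(+) \langle \delta_+ A^\mu \rangle_{i+} \langle \delta_+ A^\nu \rangle_j \, w_j$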
Theory of WED
A symmetric linear homogeneous equation:
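$H = G^T G$
$Hw = 0$
$Gw = 0$
$G_{ij} = \frac{1}{p} \sum_{\mu\nu} g_{\mu\nu}(+) \langle \delta_+ A^\mu \rangle_{i+} \langle \delta_+ A^\nu \rangle_j - \delta_{ij} + \frac{1}{p}$
$w = (w_1, \ldots, w_p)^T$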
The ground state of H (eigenvector with zero eigenvalue) gives weights of trajectories
If the ground state of H is non-degenerate, a unique w is obtained and the equilibrium distribution is reproduced
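$H = \sum_i \vec{G}_i^T \vec{G}_i$, with row vectors $\vec{G}_i = (G_{i1}, \ldots, G_{ip})$
$G_{ij} = \frac{1}{p} \sum_{\mu\nu} g_{\mu\nu}(+) \langle \delta_+ A^\mu \rangle_{i+} \langle \delta_+ A^\nu \rangle_j - \delta_{ij} + \frac{1}{p}$
$Hw = 0$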
parallel simulation from any initial conformations
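A minimal sketch of solving Hw = 0 for the trajectory weights. The test system is hypothetical: a three-conformation Markov chain with exact (not sampled) time-averaged distributions, indicator basis functions, and each initial segment reduced to the single initial conformation. The function name `wed_weights` and all numbers are illustrative assumptions.

```python
import numpy as np

def wed_weights(A_init, A_traj):
    """Ground state of H = G^T G, normalised so that the mean weight is 1.

    A_init[i, mu]: basis-function values at the start of trajectory i
    A_traj[j, mu]: basis-function averages over the whole trajectory j
    """
    p = A_init.shape[0]
    mean_plus = A_init.mean(axis=0)                 # <A^mu>_+
    d_init = A_init - mean_plus                     # <delta_+ A^mu>_{i+}
    d_traj = A_traj - mean_plus                     # <delta_+ A^nu>_j
    g_lower = np.linalg.inv(d_init.T @ d_init / p)  # g_{mu nu}(+)
    G = (d_init @ g_lower @ d_traj.T + 1.0) / p - np.eye(p)
    H = G.T @ G
    eigenvalues, eigenvectors = np.linalg.eigh(H)   # ascending eigenvalues
    w = eigenvectors[:, 0]                          # ground state
    return w * p / w.sum(), eigenvalues

# Hypothetical column-stochastic transition matrix; its stationary
# distribution is P_eq = (0.25, 0.5, 0.25).
T = np.array([[0.90, 0.05, 0.00],
              [0.10, 0.90, 0.10],
              [0.00, 0.05, 0.90]])
P_eq = np.array([0.25, 0.50, 0.25])
starts = np.array([0] * 5 + [1] * 3 + [2] * 2)      # arbitrary initial conformations
basis = np.eye(3)[:, :2]                            # basis[x, mu]: indicator of x == mu

def time_averaged_distribution(start, n_steps=30):
    """Exact distribution visited by a trajectory of n_steps started at `start`."""
    dist, total = np.eye(3)[start], np.zeros(3)
    for _ in range(n_steps):
        total += dist
        dist = T @ dist
    return total / n_steps

A_init = basis[starts]
A_traj = np.array([time_averaged_distribution(s) @ basis for s in starts])
w, eigenvalues = wed_weights(A_init, A_traj)

P_init = np.bincount(starts, minlength=3) / len(starts)
expected = (P_eq / P_init)[starts]                  # Omega at each initial conformation
print(np.round(w, 4), np.round(expected, 4))
```

In this idealised setting the ground state reproduces P_eq/P_init at each starting conformation; with sampled trajectories the agreement would only be statistical.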
Equilibrium Criterion
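In principle, not in practice:
$P(x) = P_{eq}(x) \propto e^{-\beta V(x)}$
In practice:
$\langle A(x) \rangle_{P(x)} \approx \langle A(x) \rangle_{eq}$ for any $A(x)$
$s_{P\to eq}^2 \equiv \langle (\Omega_{P\to eq}(x) - 1)^2 \rangle_P = \sum_{\mu\nu} g_{\mu\nu}(P) \langle \delta_P A^\mu \rangle_{eq} \langle \delta_P A^\nu \rangle_{eq} \approx 0$
for complete independent basis functions
WED: $Hw = 0$
Judge whether simulations have reached equilibrium
Reweight trajectories to reach the equilibrium distribution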
If the ground state is degenerate, the trajectories are confined to different conformational regions that are separated from each other on the scale of the total simulation time: meta-stable states
$Hw = 0$
The number of degenerate ground states equals the number of meta-stable states on the total simulation time scale
Meta-stable states
Simulation trajectories visit a few completely separated conformational regions; the relative weights of the regions are unknown
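A sketch of counting meta-stable states from the degeneracy of the ground state of H. The system is a hypothetical four-conformation Markov chain split into two disconnected pairs, with exact time-averaged distributions and indicator basis functions; the helper names and the zero threshold `1e-8` are assumptions.

```python
import numpy as np

def wed_matrix(A_init, A_traj):
    """H = G^T G built from initial-conformation values and trajectory averages."""
    p = A_init.shape[0]
    mean_plus = A_init.mean(axis=0)
    d_init, d_traj = A_init - mean_plus, A_traj - mean_plus
    g_lower = np.linalg.inv(d_init.T @ d_init / p)   # g_{mu nu}(+)
    G = (d_init @ g_lower @ d_traj.T + 1.0) / p - np.eye(p)
    return G.T @ G

# Hypothetical column-stochastic chain: {0, 1} and {2, 3} never exchange.
T = np.array([[0.8, 0.3, 0.0, 0.0],
              [0.2, 0.7, 0.0, 0.0],
              [0.0, 0.0, 0.6, 0.1],
              [0.0, 0.0, 0.4, 0.9]])
starts = np.array([0, 0, 1, 2, 3, 3])
basis = np.eye(4)[:, :3]                             # indicators of x = 0, 1, 2

def time_averaged_distribution(start, n_steps=50):
    """Exact distribution visited by a trajectory of n_steps started at `start`."""
    dist, total = np.eye(4)[start], np.zeros(4)
    for _ in range(n_steps):
        total += dist
        dist = T @ dist
    return total / n_steps

A_init = basis[starts]
A_traj = np.array([time_averaged_distribution(s) @ basis for s in starts])
eigenvalues = np.linalg.eigvalsh(wed_matrix(A_init, A_traj))

n_states = int(np.sum(eigenvalues < 1e-8))           # number of (near-)zero eigenvalues
print(np.round(eigenvalues, 6), n_states)
```

Two zero eigenvalues appear because the trajectories fall into two groups that never overlap, so their relative weight cannot be fixed.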
States and eigenvalues of H
Eigenvalue = 0: separated states in the time scale
Eigenvalue = 1: trajectories in the same state
0 < eigenvalue < 1: partially separated states in the time scale; a (small) fraction of trajectories undergo transitions between states
Weights and eigenvector
Trajectories are grouped into states
Transition trajectories slightly split the degenerate ground states
The weights of trajectories in the same state are almost constant
Projection in ground states
$L_{\alpha i} \equiv \vec{G}_i \cdot \hat{u}_\alpha$
Non-transition trajectories
S1 (1.75): 77
S2 (0.75): 92
S3 (-0.25): 37
S4 (-1.25): 66
Transition trajectories
S4-S3-S2: 9
S4-S3: 119
1. Non-transition trajectories inside a state are mapped to the same point
2. Transition trajectories between two states are mapped to the line connecting the states
3. Transition trajectories among three states are mapped to the plane of the states
Occupation fraction vs projection
Multi-time transition trajectories
single-time transition trajectories
The occupation fraction of a trajectory in states is linearly related to its projection
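A minimal sketch of why the relation is linear, under stated assumptions: a trajectory that spends a fraction c of its time in one state and 1 - c in another, at local equilibrium in each, has basis-function averages that are the same mixture of the two state averages. The projection here is taken in the space of basis-function averages rather than onto the ground states of H, and the state averages `A_S1`, `A_S2` are hypothetical.

```python
import numpy as np

# Hypothetical basis-function averages in two meta-stable states
A_S1 = np.array([0.9, 0.1, 0.4])
A_S2 = np.array([0.2, 0.7, 0.5])

def trajectory_average(fraction_in_S1):
    """Averages for a trajectory spending the given fraction of time in S1."""
    return fraction_in_S1 * A_S1 + (1.0 - fraction_in_S1) * A_S2

def projection(A_mean):
    """Coordinate along the line from S2 (value 0) to S1 (value 1)."""
    axis = A_S1 - A_S2
    return float((A_mean - A_S2) @ axis / (axis @ axis))

fractions = [0.0, 0.25, 0.6, 1.0]
projections = [projection(trajectory_average(c)) for c in fractions]
print(projections)
```

Any other linear projection of the averages would also be linear in the occupation fraction, only with a different slope and offset.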
Transition time vs projection
single-time transition trajectories
$t_i^{trans} = a + b L_i$
Without requiring knowledge of states and transitions
Transition state ensemble
Free energy reconstruction
Two different initial distributions are re-weighted to the accurate free energy profile
Weights of trajectories started from the same state are almost the same
Transition network in 2D multi-well potential
Topology of transition network is kept
Mexican hat: entropy effects
Eigenvalues gradually increase from 0 to 1
Topology of transition network is kept
Alanine dipeptide in water
An alanine dipeptide solvated in 522 TIP3P water molecules: 1588 atoms
500 initial conformations, generated from a 10 ns simulation at T=600K
500 WED trajectories (600 ps each)
Eigenvalue of H
450K and 300K
Initial psi
Started from C7eq
Started from Alpha7
Started from C7ax
Modified potential at 300K projection
Free energy reconstruction
Occupation fraction vs projection (300K mod)
Single-time transition trajectories
Transition time vs projection (300K mod)
Single-time transition trajectories
Real transition time by checking along the single-time transition trajectories
Dipeptide at 150K
Eigenvalues of H continuously increase from zero to unity at 150K:
Inter-trajectory differences due to entropy effects make multiple eigenvalues smaller than (but close to) unity
150K
Count of traj.
300K modified
More dispersive projection
Diffusive dynamics at 150K
Histogram at transition regions is significant
Diffusively cross the transition regions
Trajectories do not sufficiently cover the whole state at low temperature
Statistical difference between trajectories is large
Solvent effects
Include solvent-related functions in expansion
Completeness of Basis functions
$\Omega_{1\to2}(x) = 1 + \sum_{\mu\nu} g_{\mu\nu}(1) \langle \delta_1 A^\mu \rangle_2 \, \delta_1 A^\nu(x)$
$s_{1\to2}^2 = \sum_{\mu\nu} g_{\mu\nu}(1) \langle \delta_1 A^\mu \rangle_2 \langle \delta_1 A^\nu \rangle_2$
s quickly reaches saturation while the number of basis functions is still far smaller than the sample size
The expansion need not be accurate everywhere; it only needs to distinguish conformational regions
More complex cases
When there are n small eigenvalues, trajectories should be projected onto an (n-1)-dimensional space
Cluster analysis is required
Trajectories might need to be split into multiple shorter segments to distinguish transition and non-transition trajectories
Meta-stable states can be clustered in different time scales
Generalization
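$i = 1, \ldots, n$ trajectories generated from the same potential but different initial configurations, the distributions are denoted as $P_i(x)$
$P_{ref}(x) = \frac{1}{n} \sum_i P_i(x)$
$\Omega_i(x) \equiv \frac{P_i(x)}{P_{ref}(x)}$
The overlap matrix of trajectories
$\Lambda_{ij} = \langle P_i | P_j \rangle \equiv \int \frac{P_i(x) P_j(x)}{P_{ref}(x)} dx = \langle \Omega_i \rangle_j$
$\langle P_i | P_j \rangle = \sum_{\mu\nu} g_{\mu\nu}(ref) \langle \delta A^\mu \rangle_i \langle \delta A^\nu \rangle_j = \sum_\mu \langle \hat{A}^\mu \rangle_i \langle \hat{A}^\mu \rangle_j \leftrightarrow d_{ij} = \sum_\mu (\langle \hat{A}^\mu \rangle_i - \langle \hat{A}^\mu \rangle_j)^2$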
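A small sketch of the overlap matrix on discrete conformations. The three trajectory distributions are hypothetical, and the indicator basis is an assumption that makes the basis-function form exact. The constant term 1 is written explicitly here, so the basis-function sum describes the deviation of each distribution from the reference; that convention is an assumption of this sketch.

```python
import numpy as np

# Hypothetical sampled distributions of n = 3 trajectories over 4 conformations.
P = np.array([[0.7, 0.3, 0.0, 0.0],
              [0.6, 0.4, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])
P_ref = P.mean(axis=0)                        # P_ref(x) = (1/n) sum_i P_i(x)
Omega = P / P_ref                             # Omega_i(x) = P_i(x) / P_ref(x)
Lambda = Omega @ P.T                          # Lambda_ij = sum_x P_i P_j / P_ref

# Same matrix from basis-function averages (indicator basis, complete with
# the constant function).
basis = np.eye(4)[:, :3]                      # basis[x, mu]
A_ref = P_ref @ basis                         # <A^mu>_ref
dA_x = basis - A_ref                          # delta A^mu(x)
g_upper = (dA_x * P_ref[:, None]).T @ dA_x    # g^{mu nu}(ref)
g_lower = np.linalg.inv(g_upper)              # g_{mu nu}(ref)
dA = P @ basis - A_ref                        # <delta A^mu>_i
Lambda_from_basis = 1.0 + dA @ g_lower @ dA.T

print(np.round(Lambda, 4))
```

Trajectories 1 and 2 overlap strongly, while trajectory 3 has zero overlap with both, which is what a clustering step would pick up.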
Each trajectory (set of conformations) is mapped to a point
$P_j(x) \leftrightarrow \{\hat{A}^\mu_j\}, \mu = 1, \ldots$
mapping
$P(x) \leftrightarrow \{\hat{A}^\mu_{P(x)}\}, \mu = 1, \ldots$
(Figure axes: $\hat{A}^\mu$ and $\hat{A}^\nu$)
If two samples come from the same distribution, their mapped points are located at the same position
The error follows a Gaussian distribution (central limit theorem)
$r_c^2 \le \frac{n}{M}$
Trajectory mapping
$P_j(x) \leftrightarrow \{\hat{A}^\mu_j\}$
(Figure axes: $\hat{A}^\mu$ and $\hat{A}^\nu$)
t-length MD trajectory:
1. Inside a state, reaching local equilibrium
2. Transitions among a few states, reaching local equilibrium in each state but not inter-state equilibrium
3. Inside a state (conformational region) but not reaching local equilibrium
$P_j(x) = \sum_\alpha C_j^\alpha P_\alpha(x)$
$P_\alpha(x)$: equilibrium distribution of meta-stable states
Trajectory clustering
$P_j(x) \leftrightarrow \{\hat{A}^\mu_j\}$
t-length MD trajectory:
1. Concentrated points (clusters): non-transition
2. Points on lines connecting clusters: transition
3. Diffusion dynamics in entropy-dominated regions
$r_0^2 = \frac{n}{M}$
Dimension of manifold
$r_0^2 = \frac{n}{M}$
$P_j(x) = \sum_\alpha C_j^\alpha P_\alpha(x)$
$n_d$ is the dimension of
$\delta P_j(x) \equiv P_j(x) - P_{ref}(x) \leftrightarrow \{\hat{A}^\mu_j\}$
$\mu = 1, \ldots, n$ basis functions
$j = 1, \ldots, N_t$ trajectories
$n_s$ is the number of meta-stable states
$n_d \le n_s - 1$
It is an equality when the set of applied basis functions is sufficient
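A noise-free sketch of the dimension bound. It assumes every trajectory is an exact mixture of the state distributions, so its mapped point is the same mixture of the state averages; the state averages and mixing coefficients are random, hypothetical values. With finite sampling the small singular values would be noisy rather than numerically zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_basis, n_traj = 3, 6, 40

# Hypothetical basis-function averages of each meta-stable state
state_means = rng.normal(size=(n_states, n_basis))
# C_j^alpha: fraction of trajectory j spent in state alpha (rows sum to 1)
C = rng.dirichlet(np.ones(n_states), size=n_traj)

points = C @ state_means                       # mapped trajectories
centered = points - points.mean(axis=0)        # deviation from the reference

singular_values = np.linalg.svd(centered, compute_uv=False)
n_d = int(np.sum(singular_values > 1e-10 * singular_values[0]))
print(n_d, n_states - 1)
```

Because the mixing coefficients of each trajectory sum to one, the centered points span at most n_s - 1 directions, regardless of how many basis functions are used.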
Hierarchic kinetic network
Split trajectory into short segments: detect kinetic network in shorter time scales
Hierarchic meta-stable state structure
$P(\vec{q}; t) \leftrightarrow \{\hat{A}^\mu_{P(t)}\}$
Fraction in states:
$f^\alpha(t) \propto \langle P(\vec{q}; t) | P_\alpha(\vec{q}) \rangle$
correlation:
$C(t_2 - t_1) \propto \langle P(\vec{q}; t_1) | P(\vec{q}; t_2) \rangle$
1. Form a hierarchic kinetic network that contains the complete equilibrium and transition kinetic/dynamical properties
2. Calculate the weights of trajectories and the correlation in diffusion regions
The trajectory ensemble is mapped as a trajectory
Understanding of system
• 74 atoms, charged terminals;
• Implicit solvent simulation: Generalized Born;
• 1000 trajectories;
• 20 ns and 40000 conformations per trajectory.
Example
12-alanine peptide
1000 20-ns trajectories
Application
172 basis functions from torsion angles
Dimension of manifold
Principal Component Analysis
Total dimension: 172
$n_d \le n_s - 1$
$n_d = 5$
$r_0^2 = \frac{n}{M} \sim 0.01$
clustering: minimal spanning tree clustering algorithm
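The slide names minimal spanning tree clustering without giving details, so the following is only a generic sketch of one common variant: build the tree with Kruskal's algorithm and leave out edges longer than a cutoff, which gives the same clusters as deleting those edges from the full tree. The function name `mst_clusters`, the cutoff `cut_length` and the sample points are hypothetical.

```python
import numpy as np

def mst_clusters(points, cut_length):
    """Cluster labels from a minimum spanning tree with long edges removed."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    for length, i, j in edges:
        if length > cut_length:
            break                      # all remaining edges are longer
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj            # this edge belongs to the spanning tree

    roots = [find(i) for i in range(n)]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return np.array([relabel[r] for r in roots])

# Hypothetical mapped trajectories: three tight groups in a 2D projection
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0],
                   [10.0, 0.0], [10.0, 0.1]])
labels = mst_clusters(points, cut_length=1.0)
print(labels)
```

The cutoff plays the role of a resolution: a larger value merges nearby groups, a smaller one splits them.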
$\tau(G2) > 2.8\ \mu s$
$k^{-1}(G4 \to G3) \approx 250$ ns
$k^{-1}(G4 \to G11) \approx 500$ ns
sub-trajectory clustering (1 ns)
$n_s \ge 4$
• G1, 1ns clustering
• G1, 2D free energy profile
2D might be insufficient
• G11, sub-trajectory clustering
Hierarchic kinetic transition network
put everything together to form a network
1. Meta-stable states in different time scales
2. Transition connections among states
3. Transition rates, transition states and transition paths
4. Typical (or average) configurations of states
10 ns
1 ns
0.1 ns
Summary
1. A complex conformational space can be understood by constructing a hierarchical meta-stable state structure over time scales
2. WED generates multiple trajectories from different initial configurations; the trajectories are mapped to the average values of some independent physical variables
3. A clustering algorithm groups these trajectories to form the meta-stable state structure
4. Equilibrium properties can be reproduced based on the overlap of trajectories, so the sampling is further enhanced
5. Dynamics and kinetics within the total simulation time can be obtained from the WED simulations
6. Dynamics on longer time scales might be obtained much more easily based on the state structure
Thanks for your attention!