xin zhou [email protected]


Xin Zhou [email protected]

Asia Pacific Center for Theoretical Physics, Dep. of Phys., POSTECH,

Pohang, Korea

Structuring and Sampling in Complex Conformational Space

Weighted Ensemble Dynamics Simulation

Oct 12, 2009 Beijing

Independent Junior Research Group

Multiscale Modeling & Simulations in soft materials

One available position for postdoctor/exchange Ph.D student

[email protected]

http://www.apctp.org/jrg/members/xzhou

Members:

X.Z. (Leader, 2008.6-)

Pakpoom Reunchan (Postdoctor, 2009.10-)

Shun Xu (Ph.D candidate, 2008.11-)

Linchen Gong (Ph.D candidate, 2008.12-)

Shijing Lu (Ph.D candidate, 2009.1-)

Understanding results of simulations

Improve efficiencies of simulations

traditional: project to low-dimensional (reaction coordinates) space

new : kinetic transition network

coarse-graining,

enhanced sampling,

accelerated slow-dynamics

Extend spatial and temporal scales but keep necessary details

Multiscale simulations:

1. More sufficient simulation provides more complete understanding

2. The understanding of systems is helpful for designing more efficient simulation algorithms

Vibration of bonds: $10^{-14}$ second

Protein folding: > $10^{-6}$ second

There is coupling among different scales!
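To put the two time scales side by side, here is a one-line estimate that uses only the numbers quoted above (the symbols $t_{\text{folding}}$ and $t_{\text{vibration}}$ are introduced here for illustration):

```latex
\frac{t_{\text{folding}}}{t_{\text{vibration}}} > \frac{10^{-6}\,\mathrm{s}}{10^{-14}\,\mathrm{s}} = 10^{8}
```

A single folding event therefore spans more than $10^{8}$ bond-vibration periods. Since an MD time step has to be shorter than the fastest vibration, a direct simulation needs at least that many integration steps.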

multiple scales

Energetic barrier Entropic barrier

Due to high free energy (energy and/or entropy) barriers, standard MC/MD simulations need a very long time to reach equilibrium

Current advanced simulation techniques are not very helpful in overcoming entropic barriers

Barriers

A. F. Voter (1998) V. S. Pande (2000)

Ensemble Dynamics

Independently generate multiple short trajectories

Statistically analyze slow transition dynamics

n trajectories: transition rate $k \to nk$
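The rate statement can be checked with a short calculation. The sketch below assumes transitions have exponentially distributed waiting times with rate k, which the slide does not state explicitly; the function name and the numbers are illustrative only.

```python
import math

def prob_any_transition(k, t, n):
    """Probability that at least one of n independent trajectories of
    length t shows a transition (assumption: exponential waiting times, rate k)."""
    p_none_single = math.exp(-k * t)   # one trajectory sees no transition
    return 1.0 - p_none_single ** n    # identical to 1 - exp(-(n * k) * t)

# Hypothetical numbers: k = 0.001 per ns, trajectories of 10 ns each
p_single = prob_any_transition(0.001, 10.0, 1)
p_ensemble = prob_any_transition(0.001, 10.0, 100)
print(p_single, p_ensemble)
```

Under this assumption the ensemble of n short trajectories behaves like a single process with rate nk, which is why many short runs can observe an event that one run of the same length would usually miss.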

Weighted Ensemble Dynamics

arbitrarily select initial conformations

Independently generate multiple short trajectories

weight the trajectories

Statistically analyze slow transition dynamics

Analyze state structure and equilibrium properties

Linchen Gong & X.Z. (2009)


A single t-length MD trajectory is not sufficient to reach global equilibrium

Multiple t-length trajectories can be used to reproduce global equilibrium properties by reweighting the trajectories

$$\langle A \rangle_{eq} = \frac{\sum_i w_i \langle A \rangle_i}{\sum_i w_i}$$

Each trajectory has a unique weight ($w_i$) in contributing to equilibrium properties
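As a minimal sketch of this estimator, the function below combines per-trajectory averages with given weights. The function name and the numbers are hypothetical; how the weights are obtained is a separate question.

```python
def reweighted_average(weights, traj_averages):
    """Equilibrium estimate: sum_i w_i <A>_i / sum_i w_i."""
    numerator = sum(w * a for w, a in zip(weights, traj_averages))
    return numerator / sum(weights)

# Hypothetical example: three trajectories with weights 2, 1, 1
a_eq = reweighted_average([2.0, 1.0, 1.0], [1.0, -1.0, 0.0])
print(a_eq)
```

Only the ratios of the weights matter, because the estimator divides by their sum.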

The weight of a trajectory depends only on its initial conformation

$$w_i = \Omega(\vec{q}_i(\tau = 0))$$

$$\Omega(\vec{q}) = \frac{P_{eq}(\vec{q})}{P_{init}(\vec{q})}$$

$$w_i = \Omega(\vec{q}_i(0)) = \frac{P_{eq}(\vec{q}_i(0))}{P_{init}(\vec{q}_i(0))}$$
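A toy illustration of these weights, under strong simplifying assumptions: a discrete set of three states, both distributions known exactly, and each "trajectory" reduced to its initial state. All names and numbers are hypothetical.

```python
def initial_weights(p_eq, p_init, initial_states):
    """w_i = P_eq(q_i(0)) / P_init(q_i(0)) for each initial state."""
    return [p_eq[s] / p_init[s] for s in initial_states]

p_init = [0.5, 0.25, 0.25]        # distribution the initial states are drawn from
p_eq = [0.2, 0.3, 0.5]            # target equilibrium distribution
initial_states = [0, 0, 1, 2]     # distributed exactly as p_init

w = initial_weights(p_eq, p_init, initial_states)
mean_w = sum(w) / len(w)

# Reweighted average of the observable A(state) = state index
a_rw = sum(wi * s for wi, s in zip(w, initial_states)) / sum(w)
a_exact = sum(p * s for s, p in enumerate(p_eq))
print(mean_w, a_rw, a_exact)
```

In this idealized setting the weights average to one and the reweighted observable agrees with the equilibrium value. In a real simulation neither distribution is available in closed form, which is the difficulty the following slides address.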

{$w_i$} satisfy a self-consistent equation for any selected initial conformations

The initial distribution might be unknown

The fluctuation of the weights might be too large to be practical for reproducing equilibrium properties

Usually impractical

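$$\Omega_{1 \to 2}(x) \equiv \frac{P_2(x)}{P_1(x)}$$

$$\langle A(x) \rangle_2 = \langle \Omega_{1 \to 2}(x) A(x) \rangle_1$$

$$\frac{P_2(x)}{P_1(x)} = 1 + \sum_{\mu\nu} g^{\mu\nu}(1) \langle \delta_1 A_\mu \rangle_2 \, \delta_1 A_\nu(x)$$

$$\delta_1 A \equiv A - \langle A \rangle_1$$

$$g^{\mu\nu}(1) = [g_{\mu\nu}(1)]^{-1}$$

$$g_{\mu\nu}(1) = \langle \delta_1 A_\mu \, \delta_1 A_\nu \rangle_1$$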
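A minimal sketch of the expansion for the simplest case of a single basis function, where the metric reduces to the variance of A in sample 1 and no matrix inversion is needed. The restriction to one basis function, the function name and the sample values are assumptions made for illustration.

```python
def density_ratio_expansion(samples_1, mean_A_2, A):
    """Return Omega(x) ~ P2(x)/P1(x), truncated to one basis function A."""
    values = [A(x) for x in samples_1]
    mean_1 = sum(values) / len(values)                        # <A>_1
    g = sum((v - mean_1) ** 2 for v in values) / len(values)  # <dA dA>_1
    coeff = (mean_A_2 - mean_1) / g                           # inverse metric times <dA>_2
    return lambda x: 1.0 + coeff * (A(x) - mean_1)

samples_1 = [-1.0, 0.0, 0.0, 1.0, 2.0]   # hypothetical sample drawn from P1
A = lambda x: x                          # basis function
omega = density_ratio_expansion(samples_1, mean_A_2=1.0, A=A)

mean_omega = sum(omega(x) for x in samples_1) / len(samples_1)
mean_A_reweighted = sum(omega(x) * A(x) for x in samples_1) / len(samples_1)
print(mean_omega, mean_A_reweighted)
```

By construction the truncated ratio averages to one over sample 1, and reweighting sample 1 with it reproduces the prescribed average of A in distribution 2. Observables outside the span of the basis are only approximated.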

Expansion of Probability Density

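$$\vec{s} = \Omega(x) - 1$$

basis: $\delta_1 A_\mu(x)$

$$s_{1 \to 2}^2 \equiv \langle (\Omega_{1 \to 2}(x) - 1)^2 \rangle_1 = \sum_{\mu\nu} g^{\mu\nu}(1) \langle \delta_1 A_\mu \rangle_2 \langle \delta_1 A_\nu \rangle_2$$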
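The last equality follows from inserting the expansion $\Omega_{1 \to 2}(x) - 1 = \sum_{\mu\nu} g^{\mu\nu}(1) \langle \delta_1 A_\mu \rangle_2 \, \delta_1 A_\nu(x)$ and using $g_{\nu\sigma}(1) = \langle \delta_1 A_\nu \, \delta_1 A_\sigma \rangle_1$ together with the symmetry of $g$. A short derivation, with the extra summation indices $\rho, \sigma$ introduced here:

```latex
\begin{aligned}
\langle (\Omega_{1\to 2} - 1)^2 \rangle_1
&= \sum_{\mu\nu}\sum_{\rho\sigma} g^{\mu\nu}(1)\, g^{\rho\sigma}(1)\,
   \langle \delta_1 A_\mu \rangle_2 \langle \delta_1 A_\rho \rangle_2\,
   \langle \delta_1 A_\nu\, \delta_1 A_\sigma \rangle_1 \\
&= \sum_{\mu\nu}\sum_{\rho\sigma} g^{\mu\nu}(1)\, g_{\nu\sigma}(1)\, g^{\sigma\rho}(1)\,
   \langle \delta_1 A_\mu \rangle_2 \langle \delta_1 A_\rho \rangle_2 \\
&= \sum_{\mu\rho} g^{\mu\rho}(1)\,
   \langle \delta_1 A_\mu \rangle_2 \langle \delta_1 A_\rho \rangle_2 .
\end{aligned}
```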

Theory of WED

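$$w_i = 1 + \sum_{\mu\nu} g^{\mu\nu}(init) \langle \delta_{init} A_\mu \rangle_{eq} \, \delta_{init} A_\nu(\vec{q}_i(0)) = 1 + \frac{1}{p} \sum_j w_j \sum_{\mu\nu} \langle \delta_{init} A_\mu \rangle_j \, g^{\mu\nu}(init) \, \delta_{init} A_\nu(\vec{q}_i(0))$$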

Self-consistent equation:

the (short) initial segments of trajectories replace the initial configurations

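$$w_i = \Omega(\vec{q}_i(0)) \to \frac{1}{\alpha t} \int_0^{\alpha t} \Omega(\vec{q}_i(\tau)) \, d\tau, \qquad \alpha \ll 1$$

$$w_i = 1 + \frac{1}{p} \sum_j \sum_{\mu\nu} g^{\mu\nu}(+) \langle \delta_+ A_\mu \rangle_{i+} \langle \delta_+ A_\nu \rangle_j \, w_j$$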

Theory of WED

A symmetric linear homogeneous equation:

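$$H = G^T G$$

$$Hw = 0$$

$$Gw = 0$$

$$G_{ij} = \frac{1}{p} \sum_{\mu\nu} g^{\mu\nu}(+) \langle \delta_+ A_\mu \rangle_{i+} \langle \delta_+ A_\nu \rangle_j - \delta_{ij} + \frac{1}{p}$$

$$w = (w_1, \dots, w_p)^T$$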
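The step from the self-consistent equation to the homogeneous form can be written out explicitly. The sketch below assumes the weights are normalized as $\sum_j w_j = p$, which is what the $1/p$ term in $G_{ij}$ suggests; the shorthand $M_{ij}$ is introduced here and is not on the slide.

```latex
\begin{aligned}
M_{ij} &\equiv \sum_{\mu\nu} g^{\mu\nu}(+)\,
          \langle \delta_+ A_\mu \rangle_{i+} \langle \delta_+ A_\nu \rangle_j , \\
w_i &= 1 + \frac{1}{p} \sum_j M_{ij}\, w_j ,
\qquad 1 = \frac{1}{p} \sum_j w_j \quad \text{(assumed normalization)} , \\
0 &= \sum_j \left[ \frac{1}{p} M_{ij} - \delta_{ij} + \frac{1}{p} \right] w_j
   = \sum_j G_{ij}\, w_j , \\
w^T H w &= w^T G^T G\, w = \lVert G w \rVert^2
\quad \Longrightarrow \quad H w = 0 \iff G w = 0 .
\end{aligned}
```

Working with the symmetric, positive semi-definite $H$ instead of $G$ turns the problem into finding the zero-eigenvalue eigenvector of a symmetric matrix.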

The ground state of H (the eigenvector with zero eigenvalue) gives the weights of the trajectories

If the ground state of H is non-degenerate, a unique w is obtained and the equilibrium distribution is reproduced
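For a single basis function the self-consistent equation $w_i = 1 + \frac{1}{p}\sum_j M_{ij} w_j$ has a closed-form solution, because $M$ is then a rank-one matrix. The sketch below uses that special case; the toy data (a two-state system with A = +1 in one state and A = -1 in the other), the function name and the restriction to one basis function are all assumptions made for illustration, not the authors' implementation.

```python
def wed_weights_one_basis(init_means, traj_means, var_init):
    """Solve w_i = 1 + (1/p) sum_j M_ij w_j for one basis function A.

    init_means[i]: average of A over the initial segment of trajectory i
    traj_means[j]: average of A over the whole trajectory j
    var_init:      variance of A in the ensemble of initial segments
    """
    p = len(init_means)
    ref = sum(init_means) / p                    # mean of A over initial segments
    u = [a - ref for a in init_means]            # deviations of the initial segments
    v = [a - ref for a in traj_means]            # deviations of the whole trajectories
    denom = p * var_init - sum(ui * vi for ui, vi in zip(u, v))
    if abs(denom) < 1e-12:
        raise ValueError("degenerate ground state: weights are not unique")
    c = sum(v) / denom                           # rank-one (Sherman-Morrison) solution
    return [1.0 + c * ui for ui in u]

# Toy data: three trajectories start in each state; one of the first three
# and two of the last three spend half of their length in the other state.
init_means = [1, 1, 1, -1, -1, -1]
traj_means = [1, 1, 0, -1, 0, 0]
w = wed_weights_one_basis(init_means, traj_means, var_init=1.0)
a_eq = sum(wi * ai for wi, ai in zip(w, traj_means)) / sum(w)
print(w, a_eq)
```

In this toy case the trajectories started in the state that is left less often receive the larger weight, and the weights sum to p. When no trajectory crosses between the states the denominator vanishes, which corresponds to the degenerate case.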

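$$H = \sum_i \vec{G}_i^{\,T} \vec{G}_i, \qquad \vec{G}_i = (G_{i1}, \dots, G_{ip})$$

$$G_{ij} = \frac{1}{p} \sum_{\mu\nu} g^{\mu\nu}(+) \langle \delta_+ A_\mu \rangle_{i+} \langle \delta_+ A_\nu \rangle_j - \delta_{ij} + \frac{1}{p}$$

$$Hw = 0$$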

parallel simulation from any initial conformations

Equilibrium Criterion

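In principle (not in practice):

$$P(x) = P_{eq}(x) \propto e^{-\beta V(x)}$$

In practice:

$$\langle A(x) \rangle_{P(x)} \approx \langle A(x) \rangle_{eq} \quad \text{for any } A(x)$$

$$s_{P \to eq}^2 \equiv \langle (\Omega_{P \to eq}(x) - 1)^2 \rangle_P = \sum_{\mu\nu} g^{\mu\nu}(P) \langle \delta_P A_\mu \rangle_{eq} \langle \delta_P A_\nu \rangle_{eq} \approx 0$$

for complete independent basis functions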

WED: $Hw = 0$

Judge if simulations reach equilibrium

Reweight trajectories to reach the equilibrium distribution

If the ground state is degenerate, the trajectories are confined to different conformational regions that are separated from each other on the scale of the total simulation time: meta-stable states

$$Hw = 0$$

The number of degenerate ground states equals the number of meta-stable states on the total simulation time scale

Meta-stable states

Simulation trajectories visit a few completely separated conformational regions; the relative weights of the regions are unknown

States and eigenvalues of H

Eigenvalue = 0: separated states on the time scale

Eigenvalue = 1: trajectories in the same state

0 < eigenvalue < 1: partially separated states on the time scale; a (small) fraction of trajectories undergo transitions between states
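The link between zero eigenvalues and separated states can be illustrated on a toy two-state system with one basis function (A = +1 in one state, A = -1 in the other). The sketch below builds G and H = G^T G from per-trajectory averages and counts the zero eigenvalues of H as its nullity, computed by Gaussian elimination rather than a full eigenvalue solver. The data, the function names and the single-basis restriction are assumptions made for illustration.

```python
def build_H(init_means, traj_means, var_init):
    """H = G^T G with G_ij = (u_i v_j / var + 1) / p - delta_ij (one basis function)."""
    p = len(init_means)
    ref = sum(init_means) / p
    u = [a - ref for a in init_means]        # initial-segment deviations
    v = [a - ref for a in traj_means]        # whole-trajectory deviations
    G = [[(u[i] * v[j] / var_init + 1.0) / p - (1.0 if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    return [[sum(G[k][i] * G[k][j] for k in range(p)) for j in range(p)]
            for i in range(p)]

def count_zero_eigenvalues(H, tol=1e-9):
    """Nullity of the symmetric matrix H, i.e. the multiplicity of eigenvalue 0."""
    m = [row[:] for row in H]
    n = len(m)
    rank = 0
    for col in range(n):
        if rank == n:
            break
        pivot = max(range(rank, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue                          # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n):
            f = m[r][col] / m[rank][col]
            for c in range(col, n):
                m[r][c] -= f * m[rank][c]
        rank += 1
    return n - rank

# Four trajectories stay in one state, two stay in the other: no transitions
separated = build_H([1, 1, 1, 1, -1, -1], [1, 1, 1, 1, -1, -1], var_init=8.0 / 9.0)
# Some trajectories spend half of their length in the other state
connected = build_H([1, 1, 1, -1, -1, -1], [1, 1, 0, -1, 0, 0], var_init=1.0)

n_zero_separated = count_zero_eigenvalues(separated)
n_zero_connected = count_zero_eigenvalues(connected)
print(n_zero_separated, n_zero_connected)
```

Without transitions G becomes block diagonal, one block per state, and each block contributes one zero eigenvalue. Once some trajectories connect the two states only a single zero eigenvalue remains and the weights are fixed uniquely.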

Weights and eigenvector

Trajectories are grouped into states

Transition trajectories slightly split the degenerate ground states

The weights of trajectories in the same state are almost constant

Projection in ground states

$$L_i^\alpha \equiv \vec{G}_i \cdot \hat{u}_\alpha$$

S1 (1.75): 77

S2 (0.75): 92

S3 (-0.25): 37

S4 (-1.25): 66

S4-S3-S2: 9

S4-S3: 119

Non-transition trajectories

transition trajectories

1. Non-transition trajectories inside a state are mapped to the same point

2. Transition trajectories between two states are mapped to the line connecting the states

3. Transition trajectories among three states are mapped to the plane of the states

Occupation fraction vs projection

Multi-time transition trajectories

single-time transition trajectories

The occupation fraction of a trajectory in states is linearly related to its projection

Transition time vs projection

single-time transition trajectories

$$t_i^{trans} = a + b L_i$$
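Given projections and transition times for a set of single-time transition trajectories, the coefficients a and b of the linear relation can be estimated by ordinary least squares. The sketch below is one straightforward way to do that; the data are made up for illustration and the function name is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical projections L_i and transition times (in ps)
projections = [0.0, 0.25, 0.5, 0.75, 1.0]
transition_times = [100.0, 200.0, 300.0, 400.0, 500.0]
a, b = fit_line(projections, transition_times)
print(a, b)
```

Once a and b are known, the transition time of a further trajectory can be read off from its projection alone.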

Without requiring knowledge of states and transitions

Transition state ensemble

Free energy reconstruction

Two different initial distributions are re-weighted to the accurate free energy profile

Weights of trajectories started from the same state are almost the same

Transition network in 2D multi-well potential

Topology of transition network is kept

Mexican-hat potential: entropy effects

Eigenvalues gradually increase from 0 to 1

Topology of transition network is kept

Alanine dipeptide in water

An alanine dipeptide solvated in 522 TIP3P water molecules: 1588 atoms

500 initial conformations, generated from a 10 ns simulation at T=600K

500 WED trajectories (600 ps each)

Eigenvalue of H

450K and 300K

Initial psi

Started from Ceq7

Started from Alpha7

Started from Cax7

Modified potential at 300K projection

Free energy reconstruction

Occupation fraction vs projection (300K mod)

Single-time transition trajectories

Transition time vs projection (300K mod)

Single-time transition trajectories

Real transition time by checking along the single-time transition trajectories

Dipeptide at 150K

Eigenvalues of H increase continuously from zero to unity at 150K:

Inter-trajectory differences due to entropy effects make multiple eigenvalues smaller than (but close to) unity

150K

Count of traj.

300K modified

More dispersive projection

Diffusive dynamics at 150K

Histogram at transition regions is significant

Diffusively cross the transition regions

Trajectories do not sufficiently cover the whole state at low temperature

Statistical difference between trajectories is large

Solvent effects

Include solvent-related functions in expansion

Completeness of Basis functions

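$$\Omega_{1 \to 2}(x) = 1 + \sum_{\mu\nu} g^{\mu\nu}(1) \langle \delta_1 A_\mu \rangle_2 \, \delta_1 A_\nu(x)$$

$$s_{1 \to 2}^2 = \sum_{\mu\nu} g^{\mu\nu}(1) \langle \delta_1 A_\mu \rangle_2 \langle \delta_1 A_\nu \rangle_2$$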

$s^2$ quickly reaches saturation while the number of basis functions is still far smaller than the sample size

The expansion does not need to be accurate everywhere; it only needs to distinguish conformational regions
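How $s^2$ grows with the number of basis functions can be seen in a small discrete example. The sketch below assumes distributions on a handful of states and uses polynomials orthonormalized under $P_1$ as basis functions, so that the metric is the identity and $s^2$ is a plain sum of squared averages. The choice of basis, the distributions and the function name are illustrative assumptions.

```python
def s2_truncated(p1, p2, n_basis):
    """s^2 between distributions on states 0..K-1, using the first n_basis
    non-constant polynomials, orthonormalized under P1 (requires n_basis <= K - 1)."""
    states = range(len(p1))

    def inner(f, g):                       # average of f * g under P1
        return sum(p * a * b for p, a, b in zip(p1, f, g))

    basis = [[1.0] * len(p1)]              # constant function, norm 1 under P1
    for k in range(1, n_basis + 1):
        f = [float(x) ** k for x in states]
        for b in basis:                    # remove components along earlier functions
            c = inner(f, b)
            f = [fi - c * bi for fi, bi in zip(f, b)]
        norm = inner(f, f) ** 0.5
        basis.append([fi / norm for fi in f])

    # Orthonormal, zero-mean basis: s^2 is the sum of squared averages under P2
    means_2 = [sum(q * a for q, a in zip(p2, b)) for b in basis[1:]]
    return sum(m * m for m in means_2)

p1 = [0.4, 0.3, 0.2, 0.1]
p2 = [0.1, 0.2, 0.3, 0.4]
s2_values = [s2_truncated(p1, p2, n) for n in (1, 2, 3)]
full_value = sum((b - a) ** 2 / a for a, b in zip(p1, p2))  # <(P2/P1 - 1)^2>_1
print(s2_values, full_value)
```

Each added basis function can only increase $s^2$, and with a complete basis it reaches the exact value of $\langle (\Omega - 1)^2 \rangle_1$. In practice the useful regime is the one where the first few functions already capture most of it.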

More complex cases

When there are n small eigenvalues, trajectories should be projected onto an (n-1)-dimensional space

Cluster analysis is required

Trajectories might need to be split into multiple shorter segments to distinguish transition and non-transition trajectories

Meta-stable states can be clustered in different time scales

Generalization

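The overlap matrix of trajectories

$i = 1, \dots, n$ trajectories generated from the same potential but different initial configurations, the distributions are denoted as $P_i(x)$

$$P_{ref}(x) = \frac{1}{n} \sum_i P_i(x)$$

$$\Omega_i(x) \equiv \frac{P_i(x)}{P_{ref}(x)}$$

$$\Lambda_{ij} = \langle P_i | P_j \rangle \equiv \int \frac{P_i(x) P_j(x)}{P_{ref}(x)} \, dx = \langle \Omega_i \rangle_j$$

$$\langle P_i | P_j \rangle = \sum_{\mu\nu} g^{\mu\nu}(ref) \langle \delta A_\mu \rangle_i \langle \delta A_\nu \rangle_j = \sum_\mu \langle \hat{A}_\mu \rangle_i \langle \hat{A}_\mu \rangle_j \;\leftrightarrow\; d_{ij} = \sum_\mu \left( \langle \hat{A}_\mu \rangle_i - \langle \hat{A}_\mu \rangle_j \right)^2$$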
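For distributions on a discrete set of states the overlap matrix can be evaluated directly from its definition. The sketch below does that for three made-up histograms; the function name and the data are hypothetical, and in a real application the distributions are not known explicitly, which is why the basis-function form is used instead.

```python
def overlap_matrix(dists):
    """Lambda_ij = sum_x P_i(x) P_j(x) / P_ref(x), with P_ref the mean of all P_i."""
    n = len(dists)
    n_states = len(dists[0])
    p_ref = [sum(d[x] for d in dists) / n for x in range(n_states)]
    return [[sum(pi[x] * pj[x] / p_ref[x]
                 for x in range(n_states) if p_ref[x] > 0)
             for pj in dists] for pi in dists]

dists = [
    [0.5, 0.5, 0.0, 0.0],   # trajectory 1: confined to the first two states
    [0.4, 0.6, 0.0, 0.0],   # trajectory 2: same region as trajectory 1
    [0.0, 0.0, 0.3, 0.7],   # trajectory 3: a separate region
]
overlaps = overlap_matrix(dists)
for row in overlaps:
    print(row)
```

Trajectories that never visit a common state have zero overlap, and each row of the matrix averages to one because $P_{ref}$ is the mean of the $P_i$.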

Each trajectory (set of conformations) is mapped to a point

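$$P_j(x) \leftrightarrow \{ \langle \hat{A}_\mu \rangle_j \}, \quad \mu = 1, \dots$$

mapping

$$P(x) \leftrightarrow \{ \langle \hat{A}_\mu \rangle_{P(x)} \}, \quad \mu = 1, \dots$$

[Figure: mapped points, axes $\hat{A}_\mu$ and $\hat{A}_\nu$]

If two samples come from the same distribution, their mapped points are located at the same position

The error follows a Gaussian distribution (the central limit theorem)

$$r_c^2 \le \frac{n}{M}$$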

Trajectory mapping

$$P_j(x) \leftrightarrow \{ \langle \hat{A}_\mu \rangle_j \}$$

[Figure: mapped points, axes $\hat{A}_\mu$ and $\hat{A}_\nu$]
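A minimal sketch of the mapping: each trajectory is reduced to the vector of its basis-function averages, and trajectories are compared through the distance between these vectors. As a simplifying assumption, each basis function is only centred and scaled by its mean and standard deviation in the pooled sample, so correlations between basis functions are ignored. Function names and data are hypothetical.

```python
def map_trajectories(trajs, basis):
    """Map each trajectory (a list of conformations) to a point of scaled averages.

    Assumption: basis functions are standardized one by one in the pooled
    sample of all trajectories; cross-correlations are not removed.
    """
    pooled = [x for t in trajs for x in t]
    points = [[] for _ in trajs]
    for A in basis:
        vals = [A(x) for x in pooled]
        mean = sum(vals) / len(vals)
        std = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
        for j, t in enumerate(trajs):
            avg = sum(A(x) for x in t) / len(t)
            points[j].append((avg - mean) / std)
    return points

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical 1D trajectories: two in a well near -1, one in a well near +1
trajs = [
    [-1.1, -0.9, -1.0, -1.0],
    [-1.0, -1.1, -0.9, -1.0],
    [0.9, 1.1, 1.0, 1.0],
]
basis = [lambda x: x, lambda x: x * x]
points = map_trajectories(trajs, basis)
d_same = squared_distance(points[0], points[1])
d_diff = squared_distance(points[0], points[2])
print(d_same, d_diff)
```

The two trajectories that sample the same well land on the same point, while the third is mapped far away, which is the property the clustering step relies on.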

t-length MD trajectory:

1. Stays inside a state and reaches local equilibrium

2. Makes transitions among a few states and reaches local equilibrium in each of the states, but does not reach inter-state equilibrium

3. Stays inside a state (conformational region) but does not reach local equilibrium

$$P_j(x) = \sum_\alpha C_j^\alpha P_\alpha(x)$$

$P_\alpha(x)$: equilibrium distribution of meta-stable states

Trajectory clustering

$$P_j(x) \leftrightarrow \{ \langle \hat{A}_\mu \rangle_j \}$$

t-length MD trajectory:

1. Concentrated points (clusters): non-transition

2. Points on lines connecting clusters: transition

3. Diffusion dynamics in entropy-dominated regions

$$r_0^2 = \frac{n}{M}$$

Dimension of manifold

$$r_0^2 = \frac{n}{M}$$

$$P_j(x) = \sum_\alpha C_j^\alpha P_\alpha(x)$$

$n_d$ is the dimension of

$$\delta P_j(x) \equiv P_j(x) - P_{ref}(x) \leftrightarrow \{ \langle \hat{A}_\mu \rangle_j \}$$

$\mu = 1, \dots, n$: basis functions

$j = 1, \dots, N_t$: trajectories

$n_s$ is the number of meta-stable states

$$n_d \le n_s - 1$$

It is an equality when the set of applied basis functions is sufficient
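The bound can be made plausible in two lines. The sketch assumes that every $P_j$ and every $P_\alpha$ is normalized, so the coefficients of each trajectory sum to one, and that $P_{ref}$ is the average of the $P_j$; the averaged coefficients $\bar{C}^\alpha$ are notation introduced here.

```latex
\begin{aligned}
\sum_\alpha C_j^\alpha = 1 , \qquad
P_{ref}(x) = \sum_\alpha \bar{C}^\alpha P_\alpha(x) , \qquad
\bar{C}^\alpha &\equiv \frac{1}{N_t} \sum_j C_j^\alpha , \\
\delta P_j(x) = \sum_\alpha \left( C_j^\alpha - \bar{C}^\alpha \right) P_\alpha(x) ,
\qquad \sum_\alpha \left( C_j^\alpha - \bar{C}^\alpha \right) &= 0 .
\end{aligned}
```

Each $\delta P_j$ is thus described by $n_s$ coefficients subject to one linear constraint, so the mapped points span at most $n_s - 1$ independent directions.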

Hierarchic kinetic network

Split trajectory into short segments: detect kinetic network in shorter time scales

Hierarchic meta-stable state structure

$$P(\vec{q}; t) \leftrightarrow \{ \langle \hat{A}_\mu \rangle_{P(t)} \}$$

Fraction in states:

$$f_\alpha(t) \propto \langle P(\vec{q}; t) | P_\alpha(\vec{q}) \rangle$$

Correlation:

$$C(t_2 - t_1) \propto \langle P(\vec{q}; t_1) | P(\vec{q}; t_2) \rangle$$

1. Form a hierarchic kinetic network that involves complete equilibrium and transition kinetic/dynamical properties

2. Calculate weights of trajectories and correlation in diffusion regions

The trajectory ensemble is mapped as a trajectory

Understanding of system

• 74 atoms, charged terminals;

• Implicit solvent simulation: Generalized Born;

• 1000 trajectories;

• 20 ns and 40000 conformations per trajectory.

Example

12-alanine peptide

1000 20-ns trajectories

Application

172 basis functions from torsion angles

Dimension of manifold

Principal Component Analysis

Total dimension: 172

$$n_d \le n_s - 1$$

$$n_d = 5$$

$$r_0^2 = \frac{n}{M} \sim 0.01$$

Clustering: minimal spanning tree clustering algorithm
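The slide names the algorithm without giving details, so the following is only a generic sketch of minimal-spanning-tree clustering: build the tree with Kruskal's method and drop every edge longer than a cutoff, so that the remaining connected components are the clusters. The cutoff rule, the function name and the example points are assumptions, not the procedure used in the talk.

```python
def mst_clusters(points, cutoff):
    """Cluster points by a minimal spanning tree whose edges longer than
    `cutoff` are removed; returns one integer label per point."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving
            i = parent[i]
        return i

    edges = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5, i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    for length, i, j in edges:
        if length > cutoff:
            break                            # all remaining edges are longer
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj                  # edge joins two components: a tree edge

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]

# Hypothetical mapped trajectories in a 2D projection
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0)]
cluster_labels = mst_clusters(points, cutoff=0.5)
print(cluster_labels)
```

A natural choice for the cutoff in this setting would be related to the statistical radius $r_0$ of a mapped point, but that connection is an interpretation rather than something stated on the slide.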

$$\tau(G2) > 2.8\ \mu s$$

$$k^{-1}(G4 \to G3) \approx 250\ ns$$

$$k^{-1}(G4 \to G11) \approx 500\ ns$$

2

sub-trajectory clustering (1 ns)

$$n_s \ge 4$$

• G1, 1ns clustering

• G1, 2D free energy profile

2D might be insufficient

• G11, sub-trajectory clustering

Hierarchic kinetic transition network

put everything together to form a network

1. Meta-stable states in different time scales

2. Transition connections among states

3. Transition rates, transition states and transition paths

4. Typical (or average) configurations of states

10 ns

1 ns

0.1 ns

Summary

1. A complex conformational space can be understood by constructing hierarchical meta-stable state structure in time scales

2. WED generates multiple trajectories from different initial configurations; the trajectories are mapped to the average values of some independent physical variables

3. Clustering algorithm groups these trajectories to form the meta-stable state structure

4. Equilibrium properties can be reproduced based on overlapping of trajectories, thus the sampling is further enhanced

5. Dynamics and kinetics within the total simulation time can be obtained from the WED simulations

6. Dynamics in longer time scales might be much more easily obtained based on the state structure

Thanks for your attention!
