Feb 20, 2019
Deep learning enhanced Markov State Models (MSMs)
Wei Wang
Outline
• General protocol of building MSM
• Challenges with MSM
• VAMPnets
• Time-lagged auto-encoder
Revisit the protocol of building MSM
Need a lot of expertise in biology & machine learning
Wang, Cao, Zhu, Huang, WIREs Comput. Mol. Sci., e1343 (2017)
Criterion to choose a model: slowest dynamics
Choose the MSM that best captures the slowest transitions of the system
Wang, Cao, Zhu, Huang WIREs Comput. Mol. Sci., e1343, (2017)
Choose the one with the slowest transition
Timescales (μs)
Da, Pardo, Xu, Zhang, Gao, Wang, Huang, Nature Communications, 7, 11244 (2016)
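The timescales plotted here are implied timescales, t_i = -τ / ln|λ_i(τ)|, where λ_i(τ) are the eigenvalues of the transition probability matrix at lag time τ. A minimal NumPy sketch, assuming a hypothetical 3-state matrix `T` and a lag time of 1 in arbitrary units (neither is taken from the cited work):

```python
import numpy as np

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| of a row-stochastic matrix T."""
    eigvals = np.linalg.eigvals(T)
    # Sort by magnitude; the largest eigenvalue (1) is the stationary process.
    eigvals = np.sort(np.abs(eigvals))[::-1]
    return -lag / np.log(eigvals[1:])

# Hypothetical 3-state transition matrix (illustrative numbers only).
T = np.array([[0.98, 0.01, 0.01],
              [0.02, 0.90, 0.08],
              [0.02, 0.08, 0.90]])
timescales = implied_timescales(T, lag=1.0)
print(timescales)  # slowest process first
```

Under this criterion, the model whose leading timescales are longest would be preferred, provided they are converged with respect to the lag time.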
Perform this cumbersome work: search
• Propose good clustering algorithms & features
• Parametric search using good strategies
http://msmbuilder.org/osprey/1.1.0
Challenges: parametric space is too large: Collective Variable (CV)
http://homepages.laas.fr/jcortes/algosb13/sutto-ALGO13-META.pdf
Need to propose good features
Challenges: parametric space is too large: CV
Need to propose good features; otherwise the clustering stage gets worse
Figure panels: tICA, Truth
Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)
Challenges: parametric space is too large: clustering
Zhang et al., Methods in Enzymology, 578, 343-371 (2016)
Essence of these operations
• Linearly or nonlinearly transform the protein configurations into state vectors: $x \to (\chi_1, \chi_2, \ldots, \chi_n)$, $\sum_{i=1}^{n} \chi_i = 1$
(1, 0, 0, 0)
(0, 0, 1, 0)
Husic and Pande, J. Am. Chem. Soc. 2018, 140, 2386−2396
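For a hard assignment, the state vector is simply a one-hot encoding of the state label, as in the two example vectors above. A minimal sketch, assuming a hypothetical discrete trajectory `labels` over 4 states with 0-based indices:

```python
import numpy as np

def one_hot(labels, n_states):
    """Map integer state labels to one-hot state vectors (each row sums to 1)."""
    chi = np.zeros((len(labels), n_states))
    chi[np.arange(len(labels)), labels] = 1.0
    return chi

# Hypothetical discrete trajectory over 4 states (0-based labels).
labels = np.array([0, 2, 2, 1, 3])
chi = one_hot(labels, n_states=4)
print(chi[0])  # frame in the first state
print(chi[1])  # frame in the third state
```

A soft assignment relaxes the entries to probabilities while keeping the sum-to-one constraint.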
Deep learning can greatly help: powerful
• In the mathematical theory of artificial neural networks, the universal approximation theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of $\mathbb{R}^n$, under mild assumptions on the activation function.
• Deep learning has been widely applied in numerous fields
Dog: 0.99, Cat: 0.01
https://en.wikipedia.org/wiki/Universal_approximation_theorem
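A small numerical illustration of the theorem, not a proof: a single hidden layer of 50 tanh neurons approximating sin(x) on [-π, π]. To keep the sketch short, the hidden weights are random and fixed and only the output weights are fitted by least squares; this shortcut, the target function, and the layer size are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a continuous function on the compact interval [-pi, pi].
x = np.linspace(-np.pi, np.pi, 200)[:, None]
y = np.sin(x).ravel()

# Single hidden layer with 50 tanh neurons. Hidden weights are random and
# fixed (an assumption of this sketch); only the output layer is fitted.
n_hidden = 50
W = rng.normal(scale=2.0, size=(1, n_hidden))
b = rng.uniform(-np.pi, np.pi, size=n_hidden)
H = np.tanh(x @ W + b)                      # hidden activations, shape (200, 50)
H = np.hstack([H, np.ones((len(x), 1))])    # constant column = output bias
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)

max_error = np.max(np.abs(H @ w_out - y))
print(f"max |error| on the grid: {max_error:.2e}")
```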
Deep learning can greatly help MSM
Dog: 0.99, Cat: 0.01
Macro1: 0.990, Macro2: 0.005, Macro3: 0.005
Outline
• General protocol of building MSM
• Challenges with MSM
• VAMPnets
• Time-lagged auto-encoder
VAMPnets for deep learning of molecular kinetics
• VAMPnets employ the variational approach for Markov processes (VAMP) to develop a deep learning framework for molecular kinetics using neural networks. A VAMPnet encodes the entire mapping from molecular coordinates to Markov states, thus combining the whole data processing pipeline in a single end-to-end framework.
Noe et al., Nature Communications, 9, 5 (2018)
coordinates
state vector
Related to the implied timescale plot, maximize it
Understanding VAMPnets
• The basic structure of neural network
• What is the VAMP score?
Basic structure of neural network
Forward propagation
Where can we get the weights?
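Forward propagation can be sketched as a chain of matrix products and nonlinearities. The example below assumes one tanh hidden layer and a softmax output (so the outputs can be read as state probabilities); the layer sizes and the random weights are placeholders, not values from the slides.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the max keeps exp() numerically stable."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, params):
    """Forward pass: input -> tanh hidden layer -> softmax output."""
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)
    return softmax(h @ W2 + b2)

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 4, 2              # hypothetical layer sizes
params = (rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden),
          rng.normal(size=(n_hidden, n_out)), np.zeros(n_out))
x = rng.normal(size=(5, n_in))               # 5 input samples
probs = forward(x, params)
print(probs.sum(axis=1))                     # each row sums to 1
```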
Backpropagation to update the weights
Define an objective function $J = \sum_i (y_{\mathrm{true}} - y_{\mathrm{pred}})^2$
Weights are updated along the steepest-descent direction, i.e., against the gradient of the objective
http://www.saedsayad.com/images/ANN_4.png
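The update rule can be sketched with a hand-written backward pass for a one-hidden-layer network. Assumptions in this example: the squared error is averaged rather than summed (which only rescales the learning rate), and the data, layer sizes, and learning rate are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 2))
y_true = (X[:, 0] * X[:, 1])[:, None]        # hypothetical regression target

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
lr = 0.05
losses = []

for step in range(500):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    y_pred = h @ W2 + b2
    err = y_pred - y_true
    losses.append(float(np.mean(err ** 2)))

    # Backward pass: chain rule applied layer by layer
    g_out = 2.0 * err / len(X)               # d(loss)/d(y_pred)
    gW2 = h.T @ g_out
    gb2 = g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (1.0 - h ** 2)    # tanh'(a) = 1 - tanh(a)^2
    gW1 = X.T @ g_h
    gb1 = g_h.sum(axis=0)

    # Gradient-descent update: step against the gradient
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print(losses[0], losses[-1])
```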
Backpropagation to update the weights
https://independentseminarblog.files.wordpress.com/2017/12/giphy.gif
In VAMPnets, the objective function is the VAMP-2 score
VAMP-2 score: objective function
$\chi(x)$: state vector, e.g., $\chi(x) = (0, 1, 0)$ if $x$ belongs to state 2
Noe et al., Nature Communications, 9, 5 (2018)
VAMP-2 score: related to TPM
$\chi(x)$: state vector, e.g., $\chi(x) = (0, 1, 0)$ if $x$ belongs to state 2
Sum of the eigenvalues of $T(\tau)^2$
Related to the implied timescale plot; we want to maximize it
Noe et al., Nature Communications, 9, 5 (2018)
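A minimal sketch of the score, assuming it is computed as the squared Frobenius norm of C00^{-1/2} C01 C11^{-1/2}, where C00, C01, and C11 are covariance matrices of the state vectors at time t and t + τ. Mean removal is omitted here, so the stationary process contributes 1 to the score; the helper names and the toy trajectories are hypothetical.

```python
import numpy as np

def inv_sqrt(C, eps=1e-10):
    """Inverse square root of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    keep = w > eps                            # drop (near-)zero directions
    return (V[:, keep] / np.sqrt(w[keep])) @ V[:, keep].T

def vamp2_score(chi_t, chi_tau):
    """VAMP-2 score from state vectors at time t and t + tau (no mean removal)."""
    n = len(chi_t)
    C00 = chi_t.T @ chi_t / n
    C01 = chi_t.T @ chi_tau / n
    C11 = chi_tau.T @ chi_tau / n
    K = inv_sqrt(C00) @ C01 @ inv_sqrt(C11)
    return np.linalg.norm(K, "fro") ** 2

eye = np.eye(2)
chi_t = eye[[0, 0, 1, 1]]
score_same = vamp2_score(chi_t, eye[[0, 0, 1, 1]])   # state never changes
score_mixed = vamp2_score(chi_t, eye[[0, 1, 0, 1]])  # next state independent of current
print(score_same, score_mixed)
```

Perfectly metastable states give the maximal score (here the number of states), while states that carry no information about the future give only the trivial contribution of 1.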
VAMPnets: example on alanine dipeptide
Noe et al., Nature Communications, 9, 5 (2018)
10 heavy atoms
xyz for 10 heavy atoms
Output: 6 probabilities
Try to lump to 6 states
VAMPnets: example on alanine dipeptide
• Visualizing the outputs (soft assignments)
• Once we have the state vectors, we can calculate TPM, and get the kinetics
Noe et al., Nature Communications, 9, 5 (2018)
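One way to go from state vectors to a transition probability matrix is the estimator T(τ) = C00^{-1} C01, which reduces to row-normalized transition counts for hard assignments. A minimal sketch, assuming a hypothetical 2-state trajectory `labels` and a lag of one frame:

```python
import numpy as np

def transition_matrix(chi, lag):
    """Estimate T(lag) = C00^{-1} C01 from state vectors chi of shape (frames, states)."""
    chi_t, chi_tau = chi[:-lag], chi[lag:]
    C00 = chi_t.T @ chi_t
    C01 = chi_t.T @ chi_tau
    return np.linalg.solve(C00, C01)

# Hypothetical hard-assigned trajectory over 2 states.
labels = np.array([0, 0, 0, 1, 1, 0, 0, 1, 1, 1])
chi = np.eye(2)[labels]
T = transition_matrix(chi, lag=1)
print(T)
```

The same function accepts soft assignments; the rows still sum to 1, although individual entries are no longer guaranteed to be non-negative.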
Comparison with traditional way to build MSM
• Advantages
  • No need to hand-pick features for tICA or a clustering algorithm
  • Inputs are simple: aligned trajectories
  • Finds the variationally optimal model
• Disadvantages
  • Easy to overfit the data
  • Easily trapped in local optima
Noe et al., Nature Communications, 9, 5 (2018)
Alanine dipeptide
Outline
• General protocol of building MSM
• Challenges with MSM
• VAMPnets
• Time-lagged auto-encoder
Another application of deep learning in MSM: CV
• Improve PCA/tICA through a nonlinear transformation learned by a (time-lagged) auto-encoder
• PCA/tICA: find the directions that maximize the variance (PCA) or the time-lagged covariance (tICA).
PCA: minimizing reconstruction error
http://alexhwilliams.info/itsneuronalblog/2016/03/27/pca/
PCA: Linear version of auto-encoder
Original data Reconstructed data
Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)
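The link between PCA and a linear auto-encoder can be checked numerically: encoding with the top-k principal directions and decoding with their transpose gives the smallest reconstruction error among rank-k linear maps. A minimal sketch on hypothetical 3-D data, with `E` and `D` naming the encoder and decoder matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical correlated 3-D data.
X = rng.normal(size=(500, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.5, 0.2]])
Xc = X - X.mean(axis=0)

# PCA via SVD: rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
E = Vt[:k].T           # encoder: 3 -> k
D = Vt[:k]             # decoder: k -> 3
X_rec = (Xc @ E) @ D   # encode, then decode

pca_error = np.sum((Xc - X_rec) ** 2)
print(pca_error, np.sum(s[k:] ** 2))  # equal: energy in the discarded directions

# A random rank-k orthogonal projection for comparison.
Q, _ = np.linalg.qr(rng.normal(size=(3, k)))
other_error = np.sum((Xc - Xc @ Q @ Q.T) ** 2)
print(other_error)
```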
Improving tICA using time-lagged auto-encoder
Time-lagged autoencoder:
D and E are constant matrices in tICA
Current frame Next frame
Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)
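As a linear baseline for the time-lagged autoencoder, tICA itself can be sketched in a few lines. Assumptions in this example: a symmetrized covariance estimator, a full-rank instantaneous covariance, and toy data made of one slow sine wave mixed with white noise. This is plain tICA, not the autoencoder from the cited paper.

```python
import numpy as np

def tica(X, lag, dim):
    """Linear tICA: solve C(lag) v = lambda C(0) v by whitening with C(0)."""
    X = X - X.mean(axis=0)
    X0, X1 = X[:-lag], X[lag:]
    n = len(X0)
    C0 = (X0.T @ X0 + X1.T @ X1) / (2 * n)   # symmetrized instantaneous covariance
    Ct = (X0.T @ X1 + X1.T @ X0) / (2 * n)   # symmetrized time-lagged covariance
    w, V = np.linalg.eigh(C0)
    W = V / np.sqrt(w)                       # whitening: W.T @ C0 @ W = I (C0 full rank)
    lam, U = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1][:dim]      # largest autocorrelation first
    return lam[order], W @ U[:, order]

rng = np.random.default_rng(0)
t = np.arange(5000)
slow = np.sin(2 * np.pi * t / 1000)          # slow process
fast = rng.standard_normal(len(t))           # fast, uncorrelated noise
X = np.column_stack([slow + 2.0 * fast, slow - fast])

eigvals, comps = tica(X, lag=10, dim=1)
ic1 = (X - X.mean(axis=0)) @ comps[:, 0]     # projection on the first component
corr = abs(np.corrcoef(ic1, slow)[0, 1])
print(eigvals[0], corr)
```

The first component recovers the slow signal even though the noise dominates the variance, which is the behavior a nonlinear time-lagged encoder aims to extend beyond linear mixtures.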
Improving tICA using time-lagged auto-encoder
Time-lagged autoencoder:
D and E are constant matrices in tICA
! = #
Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)
Time-lagged autoencoder improves over tICA
Villin
Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)
Summary
• Deep learning improves MSM construction by reducing the amount of prior knowledge required
• However, deep learning may overfit the data when sampling is insufficient