some slides and images are taken from: david wolfe corne...

44
Introduction to Machine learning Some slides and images are taken from: David Wolfe Corne https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt Wikipedia Geoffrey A. Hinton

Upload: others

Post on 11-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Introduction to Machine learning

Some slides and images are taken from:

David Wolfe Corne https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.pptWikipediaGeoffrey A. Hinton

Page 2: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Examples 1

Page 3: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

WaveNet

Examples 1

Page 4: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

WaveNet

Examples 1

Page 5: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Examples 2

Page 6: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

• Basics of Machine Learning

• Feedforward nets • Backpropagation • Convolutional networks

• Boltzmann machines • Deep Belief Nets

Page 7: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Supervised learning 1 – Function approximation

Shallow ML methods Deep ML methods

• Regression • Kernel Regression

x y = F(x)

• Feedforward networks • Deep tensor networks • ConvNets

Page 8: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Supervised learning 2 – Classification

Page 9: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Supervised learning 2 – Classification

Shallow ML methods Deep ML methods

• Support Vector Machines (SVMs) • Kernel SVMs

Inputs Classes

Page 10: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Unsupervised learning 1 – Learning distributions

Page 11: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Unsupervised learning 1 – Learning distributions

• Markov random fields

Shallow ML methods Deep ML methods

• Boltzmann machines • Deep Belief Nets • variational Autoencoders

Page 12: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Unsupervised learning 2 – Classification

Shallow ML methods Deep ML methods

• Clustering • Deep transformation + clustering

Page 13: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

1. Linear least squares regression Example: Markov state models and Koopman models

2. Validation, cross-validation and hyperparameter selection

3. Regularization and sparsity Example: Sindy

4. Kernel ridge regression

Plan for the first talk

Page 14: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Linear least squares regression

Page 15: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Regression

Page 16: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Regression

Page 17: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Regression

Page 18: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Linear least squares

Page 19: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Normal equations

Page 20: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Normal equations

Solution methods:1)

do not literally do this!

Page 21: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Normal equations

Solution methods:2) Cholesky decomposition of CXX (still numerically unstable, but sometimes useful)

3) Orthogonolization of X (e.g. via Householder reflections) — stable

4) SVD of X to perform the pseudoinversion — stable

Page 22: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Markov state models and Koopman models of molecular dynamics

Page 23: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Markov state models and Koopman models of molecular dynamics

Page 24: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Markov state models and Koopman models of molecular dynamics

Page 25: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Markov state models

Page 26: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Koopman models

Koopman model:

Eigenvalues and eigenvectors of K have been used to perform dimension reduction in:

VAC (variational approach of conformation dynamics)

EDMD (Extended dynamic mode decomposition)

TICA (time-lagged independent component analysis)

Wu et al: JCP 146, 154104 (2017)

Noé and Nüske: SIAM MMS 11, 635-655 (2013)Nüske et al: JCTC 10, 1739-1752. (2014)

Williams, Kevrekidis and Rowley: J. Nonlinear Sci. 25, 1307-1346 (2015)

Perez-Hernandez et al: JCP 139, 015102 (2013)Schwantes and Pande: JCTC 9, 2000-2009 (2013)

Page 27: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Validation, cross-validation and hyperparameter selection

Page 28: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Validation, cross-validation and hyperparameter selection

Page 29: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Validation

Page 30: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Validation

Page 31: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Validation

Page 32: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Cross-validation

Page 33: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Cross-validation

Page 34: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Hyperparameter selection

Page 35: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Regularization and sparsity

Page 36: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

L2 / Tihonov regularization / Ridge regression

Page 37: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Sparsity-inducing regularization

L0 (too hard)

L1 (LASSO)

Elastic net

Good approach in practice: - Use L1 regularization and annihilate small elements with thresholding.- Select regularization hyperparameters with cross-validation.

Page 38: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Sindy (sparse identification of nonlinear dynamics)

Brunton, Proctor and Kutz: PNAS 113, 3932–3937 (2016)

Page 39: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Kernel ridge regression

Page 40: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Kernel ridge regression

Reminder: ridge regression

Kernel formulation

Page 41: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Kernel ridge regression

Kernel formulation

Page 42: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Kernel ridge regression

Kernel formulation

Page 43: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Kernel ridge regression

Key ideas

Page 44: Some slides and images are taken from: David Wolfe Corne ...helper.ipam.ucla.edu/publications/eltut/eltut_14769.pdf · Noé and Nüske: SIAM MMS 11, 635-655 (2013) Nüske et al: JCTC

Example: polynomial Kernel