Hilbert Space Embeddings of Hidden Markov Models
![Page 1: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/1.jpg)
Hilbert Space Embeddings of Hidden Markov Models
Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola
![Page 2: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/2.jpg)
Big Picture Question
• Graphical models: strong with dependent variables and hidden variables; challenged by high-dimensional, nonlinear, multimodal data
• Kernel methods: strong with high-dimensional, nonlinear, multimodal data; challenged by dependent variables and hidden variables
Combine the best of graphical models and kernel methods?
![Page 3: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/3.jpg)
Hidden Markov Models (HMMs)
• Examples: video sequence, music
• High-dimensional features
• Hidden variables
• Unsupervised learning
![Page 4: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/4.jpg)
Notation
• Observation
• Transition
• Prior
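A conventional way to write these three quantities, assuming the notation of the spectral HMM literature cited later in the talk, is sketched below. The names $H_t$ (hidden state), $X_t$ (observation) and $T$ are assumptions for illustration; $O$ and $\pi$ are the symbols used on a later slide.

```latex
\begin{align*}
\text{Prior:}       \quad & \pi_i  = P(H_1 = i) \\
\text{Transition:}  \quad & T_{ij} = P(H_{t+1} = i \mid H_t = j) \\
\text{Observation:} \quad & O_{ij} = P(X_t = i \mid H_t = j)
\end{align*}
```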
![Page 5: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/5.jpg)
Previous Work on HMMs
• Expectation maximization [Dempster et al. 77]:
– Maximum likelihood solution
– Local maxima
– Curse of dimensionality
• Singular value decomposition (SVD) for surrogate hidden states:
– No local optima
– Consistent
– Spectral HMMs [Hsu et al. 09, Siddiqi et al. 10], subspace identification [Van Overschee and De Moor 96]
![Page 6: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/6.jpg)
Predictive Distributions of HMMs
• Input: observation sequence; output: predictive distribution
• Variable elimination
• Observable operator [Jaeger 00]
![Page 7: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/7.jpg)
Predictive Distributions of HMMs
• Input: observation sequence; output: predictive distribution
• Variable elimination (matrix representation)
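Variable elimination in matrix form can be sketched as a product of observable operators. The snippet below is a minimal illustration, assuming the common convention $A_x = T\,\mathrm{diag}(O_{x,\cdot})$ with column-stochastic $T$ and $O$; the toy parameters and the function names `observable_operator` and `predict_next` are hypothetical and not taken from the slides.

```python
import numpy as np

# Hypothetical 2-state, 3-symbol HMM; each column of T and O sums to 1.
pi = np.array([0.6, 0.4])          # pi[i]   = P(H_1 = i)
T = np.array([[0.7, 0.2],
              [0.3, 0.8]])         # T[i, j] = P(H_{t+1} = i | H_t = j)
O = np.array([[0.5, 0.1],
              [0.3, 0.2],
              [0.2, 0.7]])         # O[x, j] = P(X_t = x | H_t = j)

def observable_operator(T, O, x):
    """A_x = T diag(O[x, :]): weight states by the observation x, then transition."""
    return T @ np.diag(O[x])

def predict_next(pi, T, O, history):
    """Return P(X_{t+1} | x_1, ..., x_t) by multiplying observable operators."""
    state = pi.copy()
    for x in history:
        # After this step, state[i] = P(H_{s+1} = i, x_1, ..., x_s).
        state = observable_operator(T, O, x) @ state
    joint = O @ state              # joint[x] = P(X_{t+1} = x, x_1, ..., x_t)
    return joint / joint.sum()     # normalise to get the conditional

prediction = predict_next(pi, T, O, [0, 2, 1])
```

With an empty history the function reduces to the marginal `O @ pi` of the first observation, which is a quick sanity check on the convention.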
![Page 8: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/8.jpg)
Observable representation of HMM
• Key observation: the HMM parameters themselves need not be recovered
• $U$ contains the singular vectors of the joint probability matrix of sequence pairs [Hsu et al. 09]
• Only need to estimate $O$, $A_x$ and $\pi$ up to an invertible transformation $S$
![Page 9: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/9.jpg)
Observable representation for HMM
• Collect singleton, pair and triplet statistics from the sequence
• Thin SVD of $C_{2,1}$: take the principal left singular vectors $U$
![Page 10: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/10.jpg)
Observable representation for HMMs
• Built from pair, triplet and singleton statistics
• Works only for the discrete case
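For the discrete case, the observable representation can be sketched as follows. This is a minimal illustration assuming the formulas of the spectral HMM algorithm of [Hsu et al. 09] ($b_1 = U^\top P_1$, $b_\infty = (P_{2,1}^\top U)^+ P_1$, $B_x = U^\top P_{3,x,1} (U^\top P_{2,1})^+$). To keep it checkable, it uses exact moments of a hypothetical toy HMM in place of empirical counts; all function names are illustrative.

```python
import numpy as np

def exact_moments(pi, T, O):
    """Singleton, pair and triplet probabilities of a known HMM."""
    n = O.shape[0]
    P1 = O @ pi                                   # P1[i]     = P(x1 = i)
    P21 = O @ T @ np.diag(pi) @ O.T               # P21[i, j] = P(x2 = i, x1 = j)
    P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
            for x in range(n)]                    # P3x1[x][i, j] = P(x3 = i, x2 = x, x1 = j)
    return P1, P21, P3x1

def spectral_hmm(P1, P21, P3x1, m):
    """Observable representation from moments; m is the number of hidden states."""
    U = np.linalg.svd(P21)[0][:, :m]              # principal left singular vectors
    b1 = U.T @ P1
    binf = np.linalg.pinv(P21.T @ U) @ P1
    B = [U.T @ P @ np.linalg.pinv(U.T @ P21) for P in P3x1]
    return b1, binf, B

def spectral_prob(b1, binf, B, seq):
    """P(x_1, ..., x_t) = binf^T B_{x_t} ... B_{x_1} b1."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)

def forward_prob(pi, T, O, seq):
    """Reference value from the standard forward recursion."""
    a = pi
    for x in seq:
        a = T @ (O[x] * a)
    return float(a.sum())

# Hypothetical toy model (columns of T and O sum to 1, both have rank 2).
pi = np.array([0.6, 0.4])
T = np.array([[0.7, 0.2], [0.3, 0.8]])
O = np.array([[0.5, 0.1], [0.3, 0.2], [0.2, 0.7]])

b1, binf, B = spectral_hmm(*exact_moments(pi, T, O), m=2)
p_spectral = spectral_prob(b1, binf, B, [0, 2, 1])
p_forward = forward_prob(pi, T, O, [0, 2, 1])
```

The two probabilities agree because each $B_x$ equals $A_x$ up to the invertible change of basis $U^\top O$, so the hidden-state parameters are never recovered explicitly. In practice the moments would be replaced by normalised counts of singletons, pairs and triplets.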
![Page 11: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/11.jpg)
Key Objects in Graphical Models
• Marginal distributions
• Joint distributions
• Conditional distributions
• Sum rule
• Product rule
Use kernel representation for distributions, do probabilistic inference in feature space
![Page 12: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/12.jpg)
Embedding distributions
• Summary statistics for distributions: mean, covariance, probability $P(y_0)$
• Pick a kernel, and generate a different summary statistic: expected features
![Page 13: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/13.jpg)
Embedding distributions
• One-to-one mapping from the distribution $P(Y)$ to its embedding $\mu_Y$ for certain kernels (e.g. the RBF kernel)
• Sample average converges to the true mean at rate $O_p(m^{-1/2})$ for $m$ samples
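The embedding of a distribution can be sketched empirically: the sample average of RBF features is evaluated through kernel sums, and the squared RKHS distance between two embeddings (the maximum mean discrepancy) separates distinct distributions. This is a minimal sketch; the bandwidth `sigma`, the sample sizes, and the function names are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for all rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def embedding_eval(sample, points, sigma=1.0):
    """<mu_hat, k(., y)> for each y in points, i.e. the mean of k(y_i, y)."""
    return rbf_kernel(sample, points, sigma).mean(axis=0)

def mmd2(A, B, sigma=1.0):
    """Biased estimate of ||mu_A - mu_B||^2 in the RKHS."""
    return (rbf_kernel(A, A, sigma).mean()
            + rbf_kernel(B, B, sigma).mean()
            - 2.0 * rbf_kernel(A, B, sigma).mean())

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 1))    # sample from N(0, 1)
a2 = rng.normal(0.0, 1.0, size=(500, 1))   # second sample from N(0, 1)
b = rng.normal(2.0, 1.0, size=(500, 1))    # sample from N(2, 1)

same = mmd2(a, a2)   # small: both samples share one distribution
diff = mmd2(a, b)    # clearly larger: the distributions differ
```

The distance between two samples of the same distribution shrinks as the sample size grows, while the distance between different distributions stays bounded away from zero.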
![Page 14: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/14.jpg)
Embedding joint distributions
• Embed joint distributions using the outer-product feature map
• The joint embedding $C_{XY}$ is also the covariance operator
• Recover discrete probability with the delta kernel
• Empirical estimate converges at rate $O_p(m^{-1/2})$ for $m$ samples
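The delta-kernel case can be checked directly: with one-hot feature maps, the empirical average of outer products is the empirical joint probability table. A minimal sketch with hypothetical toy data and an illustrative helper `one_hot`:

```python
import numpy as np

def one_hot(values, n):
    """Feature map of the delta kernel: value v maps to the v-th unit vector."""
    return np.eye(n)[values]

# Hypothetical paired samples (x_i, y_i), x in {0, 1, 2}, y in {0, 1}.
x = np.array([0, 0, 1, 2, 2, 2])
y = np.array([1, 0, 1, 1, 1, 0])

Phi_x = one_hot(x, 3)                 # shape (m, 3)
Phi_y = one_hot(y, 2)                 # shape (m, 2)

# Empirical joint embedding: average of outer products phi(x_i) phi(y_i)^T.
C_xy = Phi_x.T @ Phi_y / len(x)       # C_xy[i, j] = fraction of samples with x = i, y = j
```

Summing `C_xy` over its second axis gives the empirical marginal of `x`, which is the sum rule in its simplest form.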
![Page 15: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/15.jpg)
Embedding Conditionals
• For each value $X = x$ conditioned on, return the summary statistic for $P(Y \mid X = x)$
• Some $X = x$ are never observed
![Page 16: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/16.jpg)
Embedding conditionals
Conditional Embedding Operator
avoid data partition
![Page 17: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/17.jpg)
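Conditional Embedding Operator
• Estimation via covariance operators [Song et al. 09]: $C_{Y|X} = C_{YX} C_{XX}^{-1}$
• Gaussian case: covariance matrix instead
• Discrete case: joint over marginal
• Empirical estimate converges at rate $O_p((\lambda m)^{-1/2} + \lambda^{1/2})$ for $m$ samples and regularization $\lambda$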
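A regularised empirical estimate of the conditional embedding can be sketched with Gram matrices. The snippet assumes the commonly used form $\hat\mu_{Y|x} = \sum_i \beta_i(x)\,\phi(y_i)$ with $\beta(x) = (K + \lambda m I)^{-1} k_x$; the bandwidth, the regulariser `lam`, the toy data, and the function names are illustrative assumptions. Applied to the function $f(y) = y$, the estimate coincides with kernel ridge regression, which gives an easy check.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for all rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def conditional_weights(X, x_query, sigma=0.5, lam=1e-3):
    """Weights beta(x) such that mu_hat_{Y|x} = sum_i beta_i(x) phi(y_i)."""
    m = X.shape[0]
    K = rbf_kernel(X, X, sigma)               # Gram matrix on the conditioning variable
    k_x = rbf_kernel(X, x_query, sigma)       # kernel between samples and query points
    return np.linalg.solve(K + lam * m * np.eye(m), k_x)

# Hypothetical data: Y = sin(X) + noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
Y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

beta = conditional_weights(X, np.array([[1.0]]))
cond_mean = float(Y @ beta[:, 0])             # estimate of E[Y | X = 1.0]
```

All samples contribute to every query point through the weights, so no partition of the data by the value of $X$ is needed, and queries at values never observed still receive an estimate.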
![Page 18: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/18.jpg)
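Sum and Product Rules

| | Probabilistic relation | Hilbert space relation |
|---|---|---|
| Sum rule | $P(X) = \sum_Y P(X \mid Y) P(Y)$ | $\mu_X = C_{X \mid Y}\, \mu_Y$ |
| Product rule | $P(X, Y) = P(X \mid Y) P(Y)$ | $C_{XY} = C_{X \mid Y}\, C_{YY}$ |

The Hilbert space sum rule follows from total expectation, the conditional embedding, and linearity.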
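The three labelled steps can be written out as a short derivation of the Hilbert space sum rule. The notation is assumed here for illustration: $\phi$ is the feature map, $\mu_X = \mathbb{E}_X[\phi(X)]$ the embedding of the marginal, and $C_{X \mid Y}$ the conditional embedding operator, defined by $\mathbb{E}_{X \mid y}[\phi(X)] = C_{X \mid Y}\,\phi(y)$.

```latex
\begin{align*}
\mu_X &= \mathbb{E}_X[\phi(X)] \\
      &= \mathbb{E}_Y\big[\mathbb{E}_{X \mid Y}[\phi(X)]\big]
         && \text{(total expectation)} \\
      &= \mathbb{E}_Y\big[C_{X \mid Y}\,\phi(Y)\big]
         && \text{(conditional embedding)} \\
      &= C_{X \mid Y}\,\mathbb{E}_Y[\phi(Y)] = C_{X \mid Y}\,\mu_Y
         && \text{(linearity)}
\end{align*}
```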
![Page 19: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/19.jpg)
Hilbert Space HMMs
![Page 20: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/20.jpg)
Hilbert space HMMs
pairs triplets singletons
![Page 21: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/21.jpg)
Experiment
• Video sequence prediction
• Slot car sensor measurement prediction
• Speech classification
• Compare with discrete HMMs learned by EM [Dempster et al. 77], spectral HMMs [Siddiqi et al. 10], and the linear dynamical system (LDS) approach [Siddiqi et al. 08]
![Page 22: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/22.jpg)
Predicting Video Sequences
• Sequence of grey-scale images as inputs
• Latent space dimension 50
![Page 23: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/23.jpg)
Predicting Sensor Time-series
• Inertial unit: 3D acceleration and orientation
• Latent space dimension 20
![Page 24: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/24.jpg)
Audio Event Classification
• Mel-Frequency Cepstral Coefficients features
• Varying latent space dimension
![Page 25: Hilbert Space Embeddings of Hidden Markov Models](https://reader035.vdocument.in/reader035/viewer/2022062811/568161b1550346895dd1749a/html5/thumbnails/25.jpg)
Summary
• Represent distributions in feature spaces, reason using Hilbert space sum and product rules
• Extends HMMs nonparametrically to domains with kernels
• Kernelize belief propagation, CRF and general graphical models with hidden variables?