Chapter 8: Discriminative Classifiers, Hidden Markov Models
Post on 13-Dec-2015
Generative vs. Discriminative
The Perceptron Model
Example: Spam
Binary Decision Rule
Online Perceptron Training
Perceptron Training Illustration
Properties of Perceptrons
Issues with Perceptrons
Reasoning over Time
• Often, we want to reason about a sequence of observations
  - Speech recognition
  - Robot localization
  - User attention
• Need to introduce time into our models
• Basic approach: hidden Markov models (HMMs)
• More general: dynamic Bayes' nets
Markov Models
Conditional Independence
Weather Example
Mini-Forward Algorithm
Example
Stationary Distributions
• If we simulate the chain long enough:
  - What happens?
  - Uncertainty accumulates
  - Eventually, we have no idea what the state is!
• Stationary distributions:
  - For most chains, the distribution we end up in is independent of the initial distribution
  - Called the stationary distribution of the chain
  - Usually, can only predict a short time out
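The claim that the limiting distribution does not depend on the starting point can be checked numerically. The sketch below repeats the mini-forward update until the distribution stops changing. The two-state sun/rain transition table is a hypothetical example chosen for illustration, not taken from the slides, and the function names are likewise assumptions.

```python
def step(dist, T):
    """One mini-forward update: P(X_{t+1} = j) = sum_i P(X_t = i) * T[i][j]."""
    n = len(dist)
    return [sum(dist[i] * T[i][j] for i in range(n)) for j in range(n)]


def stationary(T, init, tol=1e-12, max_steps=10_000):
    """Iterate the update until successive distributions differ by less than tol."""
    dist = list(init)
    for _ in range(max_steps):
        nxt = step(dist, T)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist


# Hypothetical weather chain (assumed numbers). Row = current state, column = next state.
# State 0 = sun, state 1 = rain.
T = [[0.9, 0.1],
     [0.3, 0.7]]

from_sun = stationary(T, [1.0, 0.0])   # start certain of sun
from_rain = stationary(T, [0.0, 1.0])  # start certain of rain
print(from_sun, from_rain)
```

With these assumed numbers both starting points settle at roughly 0.75 sun and 0.25 rain, which is the solution of pi = pi T for this table.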
Example: Web Link Analysis
Mini-Viterbi Algorithm
Hidden Markov Models
Example
Conditional Independence
HMM Applications
Forward Algorithm
Viterbi Algorithm
Viterbi Example
Viterbi Properties
• Designed for computing the most likely hidden state sequence given a sequence of observations in Hidden Markov Models
• Two passes: a forward pass computes, for each state and time step, the probability of the best path ending there (a max where the forward algorithm uses a sum); a backward pass then follows back-pointers to reconstruct the most likely sequence
• What's the time complexity?
• O(d^2 n) for d hidden states and n observations. Why is this exciting? There are d^n possible state sequences, so exhaustive search would be exponential in n.
• Many extensions of the basic Viterbi algorithm have been developed for other models with similar local structure, for instance syntactic parsing.
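The two passes and the O(d^2 n) cost can be made concrete with a short sketch. It assumes the HMM is given as plain probability tables indexed by integer state and observation ids. The rain/umbrella numbers below are hypothetical, chosen only to exercise the code.

```python
def viterbi(obs, init, trans, emit):
    """Return (most likely state path, its joint probability with obs).

    init[s]     : P(X_1 = s)
    trans[r][s] : P(X_{t+1} = s | X_t = r)
    emit[s][o]  : P(E_t = o | X_t = s)
    The nested loops over states give O(d^2 n) time for d states, n observations.
    """
    d = len(init)
    # best[s] = probability of the best path that ends in state s at the current step
    best = [init[s] * emit[s][obs[0]] for s in range(d)]
    backptrs = []
    for o in obs[1:]:
        new_best, ptr = [], []
        for s in range(d):
            # Max over predecessors; the forward algorithm would sum here instead.
            r = max(range(d), key=lambda r: best[r] * trans[r][s])
            ptr.append(r)
            new_best.append(best[r] * trans[r][s] * emit[s][o])
        best = new_best
        backptrs.append(ptr)
    # Backward pass: start from the best final state and follow the back-pointers.
    state = max(range(d), key=lambda s: best[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, max(best)


# Hypothetical model: states 0 = rain, 1 = sun; observations 0 = umbrella, 1 = no umbrella.
init = [0.5, 0.5]
trans = [[0.7, 0.3],
         [0.3, 0.7]]
emit = [[0.9, 0.1],
        [0.2, 0.8]]

path, prob = viterbi([0, 0, 1], init, trans, emit)
print(path, prob)
```

For long sequences the products underflow, so a practical version would usually work with log probabilities and replace the multiplications with additions.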
Speech in an Hour
HMMs for Speech
HMMs for Continuous Obs.?
• Before: discrete, finite set of observations
• Now: spectral feature vectors are real-valued!
• Solution 1: discretization
• Solution 2: continuous emissions models
  - Gaussians
  - Multivariate Gaussians
  - Mixtures of Multivariate Gaussians
• A state is progressively:
  - Context independent subphone (~3 per phone)
  - Context dependent phone (=triphones)
  - State-tying of CD phone
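As a rough illustration of "Solution 2", the sketch below evaluates the log emission density of a real-valued feature vector under a mixture of Gaussians for a single state. It assumes diagonal covariances to keep the code short. The slides only say "multivariate Gaussians", so that restriction, the function names, and the example numbers are all assumptions.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a multivariate Gaussian with diagonal covariance (assumed form)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, weights, means, variances):
    """log p(x | state) for a mixture of diagonal Gaussians, via log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)  # subtract the largest term for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical two-component mixture over 2-dimensional feature vectors.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 0.5]]

near_first = log_gmm([0.1, -0.2], weights, means, variances)
far_away = log_gmm([10.0, 10.0], weights, means, variances)
print(near_first, far_away)
```

In an HMM these log densities would take the place of the discrete emission table entries, one mixture per state.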
ASR Lexicon: Markov Models
Viterbi with 2 Words + Unif. LM
Conclusion
• Perceptron
  - A discriminative model, an alternative to generative models like Naïve Bayes
  - Simple classification rule, based on a weight vector
  - Simple online learning algorithm, guaranteed to converge if training set is separable
• Hidden Markov Models
  - A special kind of Bayesian Network designed for reasoning about sequences of hidden states
  - Polynomial time inference for most likely state sequence (Viterbi) and marginalization (Forward-Backward)
  - Many applications
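The perceptron summary above (a weight-vector decision rule plus a mistake-driven online update) can be sketched in a few lines. This is a minimal binary version under stated assumptions: labels are +1 and -1, ties at a score of zero are classified as -1, and the spam-style feature vectors are hypothetical.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(w, x):
    """Binary decision rule: +1 if the score w . x is positive, otherwise -1."""
    return 1 if dot(w, x) > 0 else -1


def train_perceptron(data, n_features, max_epochs=100):
    """Online training: update the weights only when an example is misclassified."""
    w = [0.0] * n_features
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            if predict(w, x) != y:
                # Add the feature vector for a missed +1, subtract it for a missed -1.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # a full clean pass: the data has been separated
            break
    return w


# Hypothetical features: [bias, count of "free", count of "meeting"]; +1 = spam, -1 = ham.
data = [
    ([1, 2, 0], 1),
    ([1, 0, 1], -1),
    ([1, 1, 0], 1),
    ([1, 0, 2], -1),
]

w = train_perceptron(data, n_features=3)
print(w, [predict(w, x) for x, _ in data])
```

Because this toy set is linearly separable, the loop stops after a mistake-free pass. On non-separable data it would simply run until `max_epochs`.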