chapter 8 discriminative classifiers hidden markov models

Post on 13-Dec-2015


CHAPTER 8

DISCRIMINATIVE CLASSIFIERS, HIDDEN MARKOV MODELS

Generative vs. Discriminative

The Perceptron Model

Example: Spam

Binary Decision Rule

Online Perceptron Training
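The body of this slide did not survive in the transcript. As a rough illustration only, here is a minimal sketch of the standard online perceptron update, assuming labels in {+1, -1} and a fixed-length feature vector. The names `train_perceptron` and `predict` and the toy spam features are hypothetical and do not come from the slides.

```python
def predict(w, x):
    """Binary decision rule: sign of the dot product w . x."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score > 0 else -1


def train_perceptron(examples, epochs=10):
    """Online perceptron training.

    examples: list of (feature_vector, label) pairs, label in {+1, -1}.
    Stops early once a full pass makes no mistakes.
    """
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            if predict(w, x) != y:
                # Mistake-driven update: move w toward (or away from) x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w


# Hypothetical toy data: [bias, count of "free", count of "meeting"];
# +1 = spam, -1 = not spam.
toy = [
    ([1.0, 2.0, 0.0], +1),
    ([1.0, 0.0, 2.0], -1),
    ([1.0, 3.0, 1.0], +1),
    ([1.0, 0.0, 1.0], -1),
]
w = train_perceptron(toy)
```

On this separable toy set the loop stops after the second pass, with every training example classified correctly.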

Perceptron Training Illustration

Properties of Perceptrons

Issues with Perceptrons

Reasoning over Time

• Often, we want to reason about a sequence of observations
    • Speech recognition
    • Robot localization
    • User attention
• Need to introduce time into our models
• Basic approach: hidden Markov models (HMMs)
• More general: dynamic Bayes’ nets

Markov Models

Conditional Independence

Weather Example

Mini-Forward Algorithm

Example

Stationary Distributions

• If we simulate the chain long enough:
    • What happens?
    • Uncertainty accumulates
    • Eventually, we have no idea what the state is!

• Stationary distributions:
    • For most chains, the distribution we end up in is independent of the initial distribution
    • Called the stationary distribution of the chain
    • Usually, can only predict a short time out
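To make the convergence claim concrete, here is a small sketch that pushes a distribution forward through a two-state weather chain, one step at a time. The transition probabilities are assumed for illustration (the numbers from the Weather Example slide are not in the transcript), and `step` and `simulate` are hypothetical helper names.

```python
def step(dist, T):
    """One step of the chain: P(X_t+1 = s2) = sum over s1 of P(X_t = s1) * T[s1][s2]."""
    return {s2: sum(dist[s1] * T[s1][s2] for s1 in dist) for s2 in dist}


def simulate(dist, T, n_steps):
    """Apply `step` n_steps times and return the resulting distribution."""
    for _ in range(n_steps):
        dist = step(dist, T)
    return dist


# Assumed transition model: T[current][next].
T = {
    "sun": {"sun": 0.9, "rain": 0.1},
    "rain": {"sun": 0.3, "rain": 0.7},
}

from_sun = simulate({"sun": 1.0, "rain": 0.0}, T, 100)
from_rain = simulate({"sun": 0.0, "rain": 1.0}, T, 100)
```

Both starting points end up at roughly 0.75 sun and 0.25 rain, which is the stationary distribution of this particular chain: the initial distribution has been forgotten.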

Example: Web Link Analysis

Mini-Viterbi Algorithm

Hidden Markov Models

Example

Conditional Independence

HMM Applications

Forward Algorithm
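The slide body is missing from the transcript. As a hedged sketch of the usual forward recursion, the function below sums over predecessor states at each step to obtain the probability of an observation sequence. The two-state umbrella model and all of its numbers are assumptions chosen for illustration, and `forward` is a hypothetical name.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Return (P(obs), alpha), where alpha[s] = P(obs, last state = s)."""
    # Initialisation: prior times emission of the first observation.
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        # Recursion: sum over predecessors, then weight by the emission.
        alpha = {
            s: emit_p[s][o] * sum(alpha[p] * trans_p[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values()), alpha


# Assumed toy model (not from the slides).
states = ["rain", "sun"]
start_p = {"rain": 0.5, "sun": 0.5}
trans_p = {
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun": {"rain": 0.3, "sun": 0.7},
}
emit_p = {
    "rain": {"umbrella": 0.9, "none": 0.1},
    "sun": {"umbrella": 0.2, "none": 0.8},
}

likelihood, alpha = forward(["umbrella", "umbrella"], states, start_p, trans_p, emit_p)
# Normalising alpha gives the filtered belief over the current state.
posterior_rain = alpha["rain"] / likelihood
```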

Viterbi Algorithm

Viterbi Example

Viterbi Properties

• Designed for computing the most likely hidden state sequence given a sequence of observations in hidden Markov models

• Two passes: forward to compute the best-path probabilities (a max over predecessor states where the forward algorithm takes a sum), and then backward to reconstruct the most likely sequence

• What’s the time complexity?

• O(d²n) for d states and n observations. Why is this exciting? There are d^n possible state sequences, so exhaustive search is exponential in n.

• There are many extensions of the basic Viterbi algorithm, developed for other models with similar local structure: syntactic parsing, for instance.
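The two passes can be sketched as follows. This is a minimal log-space version, assuming every probability in the model is nonzero (otherwise `math.log` raises an error). The umbrella model and its numbers are assumptions for illustration, and `viterbi` is a hypothetical name.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state sequence, its log joint probability)."""
    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    # Forward pass: d states x d predecessors per time step, n steps -> O(d^2 n).
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + math.log(trans_p[p][s]))
            best[t][s] = (best[t - 1][prev] + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][obs[t]]))
            back[t][s] = prev
    # Backward pass: follow the backpointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Assumed toy model (not from the slides).
states = ["rain", "sun"]
start_p = {"rain": 0.5, "sun": 0.5}
trans_p = {
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun": {"rain": 0.3, "sun": 0.7},
}
emit_p = {
    "rain": {"umbrella": 0.9, "none": 0.1},
    "sun": {"umbrella": 0.2, "none": 0.8},
}

path, log_prob = viterbi(["umbrella", "umbrella", "none"], states, start_p, trans_p, emit_p)
```

For this input the recovered path is rain, rain, sun. Working in log space avoids numerical underflow on long observation sequences.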

Speech in an Hour

HMMs for Speech

HMMs for Continuous Obs.?

• Before: discrete, finite set of observations
• Now: spectral feature vectors are real-valued!
• Solution 1: discretization
• Solution 2: continuous emissions models
    • Gaussians
    • Multivariate Gaussians
    • Mixtures of Multivariate Gaussians
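As an illustration of Solution 2, the sketch below scores a real-valued feature vector under a mixture of Gaussians, which would stand in for the discrete emission table of one HMM state. It assumes diagonal covariance matrices to keep the code short (the slides do not say which covariance form is used). The function names are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (per-dimension variances)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, weights, means, variances):
    """Log density of a Gaussian mixture, computed with the log-sum-exp trick."""
    logs = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))


# Hypothetical emission model for one state: two components in 2-D feature space.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 0.5]]

log_emission = log_gmm([0.2, -0.1], weights, means, variances)
```

In Viterbi or the forward algorithm, a value like `log_emission` would take the place of the log of a discrete emission probability.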

• A state is progressively:
    • Context independent subphone (~3 per phone)
    • Context dependent phone (=triphones)
    • State-tying of CD phone

ASR Lexicon: Markov Models

Viterbi with 2 Words + Unif. LM

Conclusion

• Perceptron
    • A discriminative model, an alternative to generative models like Naïve Bayes
    • Simple classification rule, based on a weight vector
    • Simple online learning algorithm, guaranteed to converge if the training set is linearly separable

• Hidden Markov Models
    • A special kind of Bayesian network designed for reasoning about sequences of hidden states
    • Polynomial-time inference for the most likely state sequence (Viterbi) and for marginalization (Forward-Backward)
    • Many applications
