Presented by Jian-Shiun Tzeng 4/9/2009
Chapter 6. Hidden Markov and Maximum Entropy Models
Daniel Jurafsky and James H. Martin, 2008
Introduction
• Maximum Entropy (MaxEnt)
– More widely known as multinomial logistic regression
• Begin from a non-sequential classifier
– A probabilistic classifier
– An exponential or log-linear classifier
– Text classification
– Sentiment analysis
• Positive or negative opinion
– Sentence boundary detection
Linear Regression
• x(j): a particular training instance
• y(j)obs: the observed label of x(j) in the training set
• y(j)pred: the value predicted by the linear regression model
• Training minimizes the sum-squared error between the predicted and observed values
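Reconstructed from these definitions (the slide's own formulas are images), the model and its sum-squared error cost in the textbook's notation:

```latex
y_{pred} = \sum_{i} w_i f_i
\qquad
cost(W) = \sum_{j} \left( y^{(j)}_{pred} - y^{(j)}_{obs} \right)^2
```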
Logistic Regression – simplest case of binary classification
• Consider whether x is in class (1, true) or not (0, false)
• A probability p = P(y=1|x) must lie in [0, 1], but the linear score w · f lies in (−∞, ∞), so the two cannot be equated directly
• The odds p/(1−p) lie in [0, ∞)
• The log odds (logit) ln(p/(1−p)) lie in (−∞, ∞), the same range as w · f
• Setting ln(p/(1−p)) = w · f and solving for p yields the logistic (sigmoid) function
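As a minimal sketch of the resulting classifier (weights and feature values below are illustrative, not from the slides):

```python
import math

def sigmoid(z):
    """Logistic function: maps a score in (-inf, inf) to a probability in [0, 1]."""
    return 1.0 / (1.0 + math.exp(-z))

def p_true(weights, features):
    """P(y=1 | x) = sigmoid(w . f), the binary logistic regression model."""
    z = sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)

# Hypothetical weights and binary feature values for one observation.
w = [1.5, -0.8, 0.3]
f = [1, 0, 1]
print(p_true(w, f))  # ~0.858
```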
Logistic Regression – Classification
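The slide body is an image; the standard decision rule it presumably illustrates is to classify x as true when P(y=1|x) > 0.5, which, since the logistic function crosses 0.5 at input 0, is equivalent to checking w · f > 0.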
Advanced: Learning in logistic regression
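The slide content is an image. As a minimal sketch (all names and data hypothetical): the weights are learned by maximizing the conditional likelihood of the training data, for example with stochastic gradient ascent:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(data, n_features, lr=0.1, epochs=100):
    """Stochastic gradient ascent on the conditional log-likelihood.
    data: list of (features, label) pairs with label in {0, 1}."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for f, y in data:
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)))
            # Gradient of log P(y|x) with respect to w_i is (y - p) * f_i
            for i in range(n_features):
                w[i] += lr * (y - p) * f[i]
    return w

# Tiny hypothetical training set.
data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1)]
print(train_logreg(data, n_features=2))
```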
Maximum Entropy Modeling
• Input: x (a word to be tagged, or a document to be classified)
– Features, for example:
• Ends in -ing
• Previous word is "the"
– Each feature fi has a weight wi
– A particular class c
– Z is a normalizing factor, used to make the probabilities sum to 1
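A minimal sketch of the two indicator features named above (function names are illustrative):

```python
def f1(word, prev_word):
    """Indicator feature: word ends in -ing."""
    return 1 if word.endswith("ing") else 0

def f2(word, prev_word):
    """Indicator feature: previous word is "the"."""
    return 1 if prev_word == "the" else 0

print(f1("racing", "the"), f2("racing", "the"))  # 1 1
```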
Maximum Entropy Modeling
C = {c1, c2, …, cC}
Normalization
fi: A feature that takes on only the values 0 and 1 is also called an indicator function
In MaxEnt, instead of the notation fi, we will often use the notation fi(c, x), meaning the feature i for a particular class c and a given observation x
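With this notation, the slide's formula (lost as an image) can be reconstructed as the textbook's MaxEnt model, where the denominator is the normalizing factor Z:

```latex
P(c \mid x) = \frac{\exp\left( \sum_{i} w_{ci}\, f_i(c, x) \right)}
                   {\sum_{c' \in C} \exp\left( \sum_{i} w_{c'i}\, f_i(c', x) \right)}
```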
Maximum Entropy Modeling – Assume C = {NN, VB}
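The worked example on this slide is an image; a hypothetical stand-in (feature definitions and weights invented for illustration) shows how the class probabilities are computed:

```python
import math

# Hypothetical indicator features f_i(c, x) and weights for C = {NN, VB},
# where x is the word "racing" with previous word "the".
features = {
    "NN": [1, 1],   # ends in -ing and c=NN; prev word "the" and c=NN
    "VB": [1, 0],   # ends in -ing and c=VB; prev word "to" and c=VB
}
weights = {
    "NN": [0.8, 1.2],
    "VB": [0.9, 0.5],
}

# Unnormalized score exp(sum_i w_i * f_i(c, x)) for each class.
scores = {c: math.exp(sum(w * f for w, f in zip(weights[c], features[c])))
          for c in features}
Z = sum(scores.values())                 # normalizing factor
probs = {c: s / Z for c, s in scores.items()}
print(probs)  # NN ~0.75, VB ~0.25
```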
Learning Maximum Entropy Model
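The slide body is an image; per the textbook, MaxEnt models are trained by choosing the weights that maximize the conditional log-likelihood of the training data (a convex optimization problem):

```latex
\hat{w} = \operatorname*{argmax}_{w} \sum_{j} \log P\bigl(c^{(j)} \mid x^{(j)}\bigr)
```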
HMM vs. MEMM
• An MEMM can condition on any useful feature of the input observation; in an HMM this isn't possible
[Figure: HMM vs. MEMM graphical structures, with word and class nodes]
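Reconstructed from the textbook (the slide's formulas are part of the lost figure), the contrast for tagging is:

```latex
\text{HMM:} \quad \hat{T} = \operatorname*{argmax}_{T} \prod_i P(word_i \mid tag_i)\, P(tag_i \mid tag_{i-1})
\\
\text{MEMM:} \quad \hat{T} = \operatorname*{argmax}_{T} \prod_i P(tag_i \mid word_i, tag_{i-1})
```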
Conditional Random Fields (CRFs)
• CRFs (Lafferty, McCallum, and Pereira, 2001) constitute another conditional model based on maximum entropy
• Like MEMMs, CRFs can accommodate many possibly correlated features of the observation
• However, CRFs are better able to trade off decisions at different sequence positions
• MEMMs were found to suffer from the label bias problem
Label Bias
• The problem appears when the MEMM contains states with different numbers of outgoing transitions (out-degrees)
• Because the probabilities of the transitions out of any given state must sum to 1, transitions from low out-degree states receive higher probabilities than transitions from high out-degree states
• In the extreme case, the single transition out of a state with out-degree 1 always gets probability 1, effectively ignoring the observation; the sketch after this list illustrates this numerically
• CRFs do not have this problem because they define a single maximum-entropy distribution over the whole label sequence
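A minimal numeric sketch of the label bias problem, adapted from the classic "rib"/"rob" example of Lafferty et al. (the probabilities below are hypothetical):

```python
# Two label paths compete for the observation sequence "r i b":
#   path "rib": r -> i -> b
#   path "rob": r -> o -> b
# In an MEMM each state locally normalizes its outgoing transitions.

# The start state has two outgoing transitions, so the observation
# matters there; assume it slightly prefers the wrong branch.
p_start = {"rib": 0.4, "rob": 0.6}

# Each branch state has out-degree 1: its single transition gets
# probability 1.0 no matter how well the observation matches.
p_branch = 1.0

for path in ("rib", "rob"):
    # Total path probability = product of locally normalized steps.
    total = p_start[path] * p_branch * p_branch
    print(path, total)

# The "rob" path wins (0.6 vs 0.4) even though the middle observation
# is "i": the degree-1 states were forced to ignore it.
```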