-
Hidden Markov Models (HMMs)
Dhiraj
DSG-MVL
-
The future is independent of the past given the present
Used to model an extraordinarily large number of applications involving temporal data or
sequences of data, e.g. weather, finance,
language, music; it deals with how the world is
evolving over time
Andrei Andreyevich Markov
(1856 -1922)
-
MARKOV CHAINS
3
-
Markov Chain :
Auto Insurance Example
4
-
Auto Insurance Example
5
-
Generics
6
-
Markov Chain :
Auto Insurance Example
7
-
Markov Chain :
Auto Insurance Example
8
-
Markov Chain :
Auto Insurance Example
9
-
Markov Chain : Auto Insurance Example
The power of a Markov chain: it allows us to project
many, many steps into the future
10
-
Markov Chain : Auto Insurance Example
11
-
Markov Chain : Free Throw confidence
12
-
Markov Chain : Free Throw confidence
13
-
Markov Chain : Free Throw confidence
14
-
Markov Chain : Free Throw confidence
15
-
Markov Chain : Free Throw confidence
16
-
Markov Chain : Free Throw confidence Transitions
17
-
Markov Chain : Free Throw confidence Transitions
18
-
Markov Chain : Transition Matrix
19
-
TRANSITION DIAGRAM: EXAMPLE 1
20
-
TRANSITION DIAGRAM: EXAMPLE 2
21
-
TRANSITION DIAGRAM: EXAMPLE 3
Relative Probability
22
-
MARKOV CHAIN
Transient/ephemeral
Recurrent
Absorbing
-
24
-
Transition matrix layout: rows are the current state, columns the state being moved to
25
-
System Behavior
Initial state, represented by the initial vector
[Figure: distribution over the states Arriving, Playing on Phone, Paying Attention; after one time unit and after n time units]
26
-
System Behavior
[Figure: distribution after two time units over the states Playing on Phone, Paying Attention, Writing Notes, Kicked Out]
27
-
System Behavior
[Figure: distribution after 100 time units over the states Arriving, Playing on Phone, Paying Attention, Writing Notes, Listening, Kicked Out]
28
-
Markov Model
A Markov model is a type of stochastic process, sometimes referred to as a chain
The model is similar to the FSM, except that it is executed by probabilistic
moves rather than deterministic moves
It is nondeterministic, whereas the FSM is deterministic
-
Markov Models
A discrete (finite) system:
N distinct states.
Begins (at time t=1) in some initial state(s).
At each time step (t=1, 2, ...) the system moves
from the current state to the next state according to the transition
probabilities associated with the current state.
This kind of system is called a finite, or discrete,
Markov model
30
-
Markov Models
Set of states: {s1, s2, ..., sN}
The process moves from one state to another, generating a
sequence of states: s_{i1}, s_{i2}, ..., s_{ik}, ...
Markov chain property: the probability of each subsequent state
depends only on the previous state:
P(s_{ik} | s_{i1}, s_{i2}, ..., s_{ik-1}) = P(s_{ik} | s_{ik-1})
To define a Markov model, the following probabilities have to be
specified: transition probabilities a_{ij} = P(s_j | s_i) and initial
probabilities π_i = P(s_i)
The output of the process is the set of states at each instant of
time
31
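The definition above can be sketched as a short simulation: a chain is just a set of states plus per-state transition distributions, and each next state is drawn using only the current state. The states and probabilities below are illustrative assumptions, not taken from the slides.

```python
import random

# Hypothetical 3-state weather chain; the probabilities are illustrative only.
states = ["sunny", "rainy", "cloudy"]
transitions = {
    "sunny":  {"sunny": 0.6, "rainy": 0.1, "cloudy": 0.3},
    "rainy":  {"sunny": 0.2, "rainy": 0.5, "cloudy": 0.3},
    "cloudy": {"sunny": 0.3, "rainy": 0.3, "cloudy": 0.4},
}

def simulate(start, steps, rng=random):
    """Generate a state sequence; each next state depends only on the current one."""
    seq = [start]
    for _ in range(steps):
        current = seq[-1]
        nxt = rng.choices(states, weights=[transitions[current][s] for s in states])[0]
        seq.append(nxt)
    return seq

print(simulate("sunny", 5))
```

The output of the process is exactly the generated state sequence, matching the slide's last bullet.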
-
Markov Property
Markov Property: The state of the system at time t+1
depends only on the state of the system at time t:
P[X_{t+1} = x | X_t = x_t] = P[X_{t+1} = x | X_t = x_t, X_{t-1} = x_{t-1}, ..., X_1 = x_1, X_0 = x_0]
[Figure: chain X_{t=1} → X_{t=2} → X_{t=3} → X_{t=4} → X_{t=5}]
32
-
A Markov System
s1 s3
s2
Has N states, called s1, s2 .. sN
There are discrete timesteps,
t=0, t=1, ...
N = 3
t=0
-
A Markov System
s1 s3
s2
Has N states, called s1, s2 .. sN
There are discrete timesteps,
t=0, t=1, ...
On the tth timestep the system is in exactly one of the available
states. Call it qt
Note: qt ∈ {s1, s2, ..., sN}
N = 3
t=0
qt=q0=s3
Current State
-
A Markov System
s1 s3
s2
Has N states, called s1, s2 .. sN
There are discrete timesteps,
t=0, t=1, ...
On the tth timestep the system is in exactly one of the available
states. Call it qt
Note: qt ∈ {s1, s2, ..., sN}
Between each timestep, the next
state is chosen randomly.
N = 3
t=1
qt=q1=s2
Current State
-
A Markov System
s1 s3
s2
Has N states, called s1, s2 .. sN
There are discrete timesteps,
t=0, t=1, ...
On the tth timestep the system is in exactly one of the available
states. Call it qt
Note: qt ∈ {s1, s2, ..., sN}
Between each timestep, the next
state is chosen randomly.
The current state determines the
probability distribution for the
next state.
N = 3
t=1
qt=q1=s2
P(qt+1=s1|qt=s3) = 1/3
P(qt+1=s2|qt=s3) = 2/3
P(qt+1=s3|qt=s3) = 0
P(qt+1=s1|qt=s1) = 0
P(qt+1=s2|qt=s1) = 0
P(qt+1=s3|qt=s1) = 1
P(qt+1=s1|qt=s2) = 1/2
P(qt+1=s2|qt=s2) = 1/2
P(qt+1=s3|qt=s2) = 0
-
A Markov System
s1 s3
s2
Has N states, called s1, s2 .. sN
There are discrete timesteps,
t=0, t=1, ...
On the tth timestep the system is in exactly one of the available
states. Call it qt
Note: qt ∈ {s1, s2, ..., sN}
Between each timestep, the next
state is chosen randomly.
The current state determines the
probability distribution for the
next state.
N = 3
t=1
qt=q1=s2
P(qt+1=s1|qt=s3) = 1/3
P(qt+1=s2|qt=s3) = 2/3
P(qt+1=s3|qt=s3) = 0
P(qt+1=s1|qt=s1) = 0
P(qt+1=s2|qt=s1) = 0
P(qt+1=s3|qt=s1) = 1
P(qt+1=s1|qt=s2) = 1/2
P(qt+1=s2|qt=s2) = 1/2
P(qt+1=s3|qt=s2) = 0
1/2
1/2
1/3
2/3
1
Often notated with arcs
between states
-
Markov Property
s1 s3
s2
qt+1 is conditionally independent
of {qt-1, qt-2, ..., q1, q0} given qt.
In other words:
P(qt+1 = sj |qt = si ) =
P(qt+1 = sj |qt = si ,any earlier history) N = 3
t=1
qt=q1=s2
P(qt+1=s1|qt=s3) = 1/3
P(qt+1=s2|qt=s3) = 2/3
P(qt+1=s3|qt=s3) = 0
P(qt+1=s1|qt=s1) = 0
P(qt+1=s2|qt=s1) = 0
P(qt+1=s3|qt=s1) = 1
P(qt+1=s1|qt=s2) = 1/2
P(qt+1=s2|qt=s2) = 1/2
P(qt+1=s3|qt=s2) = 0
1/2
1/2
1/3
2/3
1
-
Hidden Markov Models
(probabilistic finite state automata)
Often we face scenarios where states cannot be
directly observed.
We need an extension: Hidden Markov Models
[Figure: 4-state chain with self-transitions a11, a22, a33, a44 and forward transitions a12, a23, a34; each state emits an observed phenomenon with output probabilities b11, b12, b13, b14, etc.]
aij are state transition probabilities.
bik are observation (output) probabilities: b11 + b12 + b13 + b14 = 1,
b21 + b22 + b23 + b24 = 1, etc.
-
Hidden Markov Models - HMM
[Figure: chain of hidden variables H1 → H2 → ... → HL-1 → HL, each Hi emitting observed data Xi]
Hi: hidden variables
Xi: observed data
-
Definition of Hidden Markov Model
The Hidden Markov Model (HMM) is a finite set of states,
each of which is associated with a probability
distribution. A hidden Markov model is a statistical model in which the system
being modelled is assumed to be a Markov process with unobserved
(hidden) states.
Transitions among the states are governed by a set
of probabilities called transition probabilities.
In a particular state, an outcome or observation
can be generated according to the associated
probability distribution.
Only the outcome, not the state, is visible to an external
observer; the states are therefore ``hidden'' from the
observer, hence the name Hidden Markov Model.
-
Hidden Markov Models
A hidden Markov model is a statistical model
in which the system being modelled is
assumed to be a Markov process with
unobserved (hidden) states.
In regular Markov models the state is directly visible
to the observer, so the state transition
probabilities are the only parameters, whereas
in an HMM the state is not visible but the
output is.
-
Hidden Markov Model
Consider a discrete-time Markov process:
Consider a system that may be described at any time as being in one of a set of N distinct states
At regularly spaced, discrete times, the system undergoes a change of state according to a set of
probabilities associated with the state
We denote the time instants associated with state changes as t = 1, 2, ..., and the actual state at time t as qt
-
Essentials
To define a hidden Markov model, the following
probabilities have to be specified: a matrix of
transition probabilities A = (aij), aij = P(sj | si);
a matrix of observation probabilities
B = (bi(vm)), bi(vm) = P(vm | si); and a vector of
initial probabilities π = (πi), πi = P(si). The model is
represented by M = (A, B, π).
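A minimal sketch of the triple M = (A, B, π) as arrays: a hypothetical 2-state, 2-symbol model (the numbers are illustrative, not from the slides), with the row-stochastic constraints the definition implies checked explicitly.

```python
import numpy as np

# Hypothetical 2-state, 2-symbol HMM M = (A, B, pi); numbers are illustrative only.
A = np.array([[0.7, 0.3],     # A[i, j] = P(next state j | current state i)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],     # B[i, m] = P(observe symbol v_m | state i)
              [0.2, 0.8]])
pi = np.array([0.6, 0.4])     # pi[i] = P(initial state i)

# Each row of A and B, and the vector pi itself, must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
assert np.isclose(pi.sum(), 1.0)
```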
-
Hidden Markov Model
45
-
Discrete Markov Model: Example
A discrete Markov model
with 5 states.
Each aij represents the
probability of moving
from state i to state j
The aij are given in a
matrix A = {aij}
The probability of starting
in a given state i is πi; the vector π
represents these start
probabilities.
46
-
Overview
Mathematical notation
Example: flow chart
Mathematical Notation
To obtain the conditional probability of reaching a
particular state given the previous state:
X1, X2, ..., Xn, where Xt represents the variable at time t
P[Xn+1 = j | Xn = i] = P(i, j)
i.e. the probability that the system, given it is in state i,
will move to state j
47
-
Mathematical Notation
Probability matrix, where P(i, j) is the probability of moving from state i to state j:

P = | P(0,0) P(0,1) P(0,2) |
    | P(1,0) P(1,1) P(1,2) |
    | P(2,0) P(2,1) P(2,2) |

P(0,0): probability of moving from state 0 to state 0
[Transition diagram over states 0, 1, 2 with the probabilities below]

P = | 0.5  0.5  0   |
    | 0    0    1.0 |
    | 0.3  0.4  0.3 |
48
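The earlier remark about travelling many steps into the future follows from this matrix: the n-step transition probabilities are the entries of the matrix power P^n. A sketch using the 3-state matrix above:

```python
import numpy as np

# Transition matrix from the slide: rows are the current state, columns the next state.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0],
              [0.3, 0.4, 0.3]])

# n-step transition probabilities are the entries of P^n.
P2 = np.linalg.matrix_power(P, 2)
# P2[0, 2] is the probability of being in state 2 two steps after starting in state 0:
# 0.5*0 + 0.5*1.0 + 0*0.3 = 0.5
print(P2[0, 2])
```

Each row of P^n still sums to 1, since it is again a probability distribution over the next states.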
-
Example: Orange Juice
Assumption: a family of four buys orange juice once a week
A = someone using Brand A; A' = someone using another brand
Transition diagram and transition probability matrix (rows: current state; columns: next state):

         A    A'
P = A  | 0.9  0.1 |
    A' | 0.7  0.3 |

S0 = initial state distribution matrix = [0.2  0.8]
49
-
Example: Orange Juice
[Tree diagram: start with probability 0.2 in A and 0.8 in A'; transitions 0.9/0.1 out of A and 0.7/0.3 out of A']
To find the probability that someone
uses Brand A after one week:
P(Brand A after 1 wk) =
(0.2)(0.9) + (0.8)(0.7) = 0.74
S0 = [0.2  0.8] (initial state distribution matrix)
S1 = [0.74  0.26]
50
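The arithmetic above is just a vector-matrix product, S1 = S0 · P, which can be checked directly with the slide's numbers:

```python
import numpy as np

# Transition matrix and initial distribution from the orange-juice slide.
P = np.array([[0.9, 0.1],   # row 0: currently Brand A
              [0.7, 0.3]])  # row 1: currently another brand (A')
S0 = np.array([0.2, 0.8])

S1 = S0 @ P  # distribution after one week
# S1[0] = 0.2*0.9 + 0.8*0.7 = 0.74, matching the slide
print(S1)
```

Iterating `Sn = Sn_minus_1 @ P` gives the distribution after any number of weeks.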
-
Markov Model
51
-
Markov Model
52
-
Markov Model
53
-
Hidden Markov Model
A Markov model is a process in which each state
corresponds to a deterministically observable event,
and hence the output of any given state is not
random
We extend the concept of Markov models to include
the case in which the observation is a probabilistic
function of the state
The resulting model is a doubly embedded
stochastic process with an underlying stochastic
process that is not directly observable (hidden) but
can be observed only through another set of
stochastic processes that produce the sequence of
observations
54
-
HMM Components
A set of states (the x's)
A set of possible output symbols (the y's)
A state transition matrix (the a's): probability of making a transition from
one state to the next
Output emission matrix (the b's): probability of emitting/observing a
symbol in a particular state
Initial probability vector: probability of starting in a particular
state
Not shown; sometimes assumed to be 1
-
COIN-TOSS MODEL
59
-
COIN-TOSS MODEL (contd..)
60
-
COIN-TOSS MODEL (contd..)
61
-
COIN-TOSS MODEL (contd..)
62
-
Weather Example Revisited
63
-
64
Weather Example Revisited
-
65
Weather Example Revisited
-
PROBLEM
66
-
Solution
67
-
Solution contd.
68
-
Solution contd.
69
-
States
Observation
70
-
Main issues using HMMs
Evaluation problem. Given the HMM M = (A, B, π)
and the observation sequence O = o1 o2 ... oK,
calculate the probability that model M has generated
sequence O.
Decoding problem. Given the HMM M = (A, B, π) and the observation sequence O = o1 o2 ... oK,
calculate the most likely sequence of hidden states si
that produced this observation sequence O.
Learning problem. Given some training observation sequences O = o1 o2 ... oK and the general
structure of the HMM (numbers of hidden and visible
states), adjust M = (A, B, π) to maximize the probability.
O = o1 ... oK denotes a sequence of observations
ok ∈ {v1, ..., vM}.
71
-
Learning/Training Problem
Modify the model parameters to best represent
the observed output, given the output sequence
and the model.
Consider the coin-toss example (with 3 biased
coins)
Say we get the observations
{HHHHTTHTTTTHHTT}
Find the model parameters, i.e. the transition matrix,
emission matrix, and initial distribution, that best
represent the output
72
-
Evaluation Problem
What is the chance of seeing a given output
observation when the model is known?
Consider the coin-toss example (with 3 biased coins)
We know some previous output sequence obtained
from the coin-toss experiment
Say {HHHHTTHTTTTHHTT}
We know the model parameters too
So what is the probability that we will get an output
sequence like {HTHT}?
73
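The standard solution to the evaluation problem (named on a later slide) is the forward algorithm. A minimal sketch follows; since the slides do not give the 3-coin parameters, the 2-state H/T model below is a hypothetical example, and `forward` is my own helper name.

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward algorithm: total probability P(O | M) of an observation sequence.

    obs is a list of observation-symbol indices; A, B, pi form the model M = (A, B, pi).
    """
    alpha = pi * B[:, obs[0]]          # alpha_1(i) = pi_i * b_i(o_1)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    return alpha.sum()                 # P(O | M) = sum_i alpha_T(i)

# Hypothetical 2-state HMM over {H, T}; the parameters are illustrative only.
A = np.array([[0.6, 0.4], [0.5, 0.5]])
B = np.array([[0.7, 0.3], [0.4, 0.6]])  # columns: 0 = H, 1 = T
pi = np.array([0.5, 0.5])
p = forward(A, B, pi, [0, 1, 0, 1])     # P(observing H T H T | model)
```

The recursion sums over all hidden-state paths in O(N^2 T) time instead of enumerating the N^T possible paths.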
-
Decoding Problem
What is the state sequence that best explains the
output sequence when the model is known?
Say we get the observations
{HHHHTTHTTTTHHTT}
Decode/find the sequence of states that generates
the output sequence
In simpler words, find the sequence of tosses of the 3
biased coins that generates the given output
sequence
74
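The decoding problem is solved by the Viterbi algorithm (named on the next slide). A sketch under the same hypothetical 2-state H/T model as above; the parameters are illustrative, not the slides' 3-coin model.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Viterbi algorithm: most likely hidden-state sequence for obs (symbol indices)."""
    T, N = len(obs), A.shape[0]
    delta = np.log(pi) + np.log(B[:, obs[0]])  # log-probabilities avoid underflow
    back = np.zeros((T, N), dtype=int)         # backpointers for path recovery
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)    # scores[i, j]: best path ending in j via i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):              # follow backpointers from the end
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Same hypothetical 2-state H/T model as before; parameters are illustrative only.
A = np.array([[0.6, 0.4], [0.5, 0.5]])
B = np.array([[0.7, 0.3], [0.4, 0.6]])  # columns: 0 = H, 1 = T
pi = np.array([0.5, 0.5])
states = viterbi(A, B, pi, [0, 0, 1, 1])  # decode the sequence H H T T
```

Unlike the forward algorithm, which sums over paths, Viterbi maximizes over them, yielding the single best path.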
-
Solution to the Problems
Learning/Training:
Baum-Welch Algorithm
Viterbi Training (Unsupervised Learning)
Evaluation:
Forward Algorithm
Decoding:
Forward-Backward Algorithm
Viterbi Algorithm
75
-
Thank you