Hidden Markov models
Sushmita Roy, BMI/CS 576
www.biostat.wisc.edu/bmi576/
[email protected]
Oct 16th, 2014
Key concepts
• What are hidden Markov models (HMMs)?
  – States, emission characters, parameters
• Three important questions in HMMs and the algorithms to solve them
  – Probability of a sequence of observations: Forward algorithm
  – Most likely path (sequence of states): Viterbi algorithm
  – Parameter estimation: Baum-Welch/Forward-backward algorithm
Revisiting the CpG question
• Given a sequence x_1..x_T, we can use two Markov chains to decide whether x_1..x_T is a CpG island or not
• What do we do if we are asked to “find” the CpG islands in a genome?
• We have to search for these “islands” in a “sea” of non-islands
A simple HMM for identifying CpG islands
[Figure: an 8-state HMM with CpG-island states A+, C+, G+, T+ and background states A-, C-, G-, T-, with transitions within and between the two groups of states.]
Note that there is no longer a one-to-one correspondence between states and observed symbols: an observed C, for example, could have been emitted by either C+ or C-.
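To make the state/symbol relationship concrete, here is a minimal sketch in Python of the 8-state model's states and emissions. In this standard formulation each “+”/“-” state emits its own nucleotide with probability 1; the transition probabilities, which are not given on the slide, would have to be estimated from data.

```python
# Eight hidden states: a "+" (CpG island) and a "-" (background)
# copy of each nucleotide.
states = ["A+", "C+", "G+", "T+", "A-", "C-", "G-", "T-"]

# Emission probabilities e_k(b): each state emits its own nucleotide
# with probability 1, so an observed symbol alone does not reveal the
# state -- an observed "C" may have come from C+ or C-.
emissions = {s: {s[0]: 1.0} for s in states}
```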
An HMM for an occasionally dishonest casino
[Figure: a two-state HMM. The “Fair” state emits each number 1-6 with probability 1/6; the “Loaded” state emits 1-5 with probability 1/10 each and 6 with probability 1/2. Transitions: Fair→Fair 0.95, Fair→Loaded 0.05, Loaded→Loaded 0.9, Loaded→Fair 0.1.]
What is hidden? Which die is being rolled (fair or loaded).
What is observed? The number (1-6) rolled on the die.
What does an HMM do?
• Enables us to model observed sequences of characters generated by a hidden dynamic system
• The system can exist in a fixed number of “hidden” states
• The system probabilistically transitions between states, and at each state it emits a symbol/character (see the generative sketch below)
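To make this concrete, here is a minimal generative sketch in Python of the dishonest-casino HMM above. The parameter values are from the casino slide; the function name `sample_rolls`, the choice to start in the fair state, and the use of `random.choices` are illustrative assumptions, not from the slides.

```python
import random

# Transition probabilities a_kl from the casino slide.
a = {"F": {"F": 0.95, "L": 0.05},
     "L": {"F": 0.10, "L": 0.90}}

# Emission probabilities e_k(b): the fair die is uniform over 1-6,
# the loaded die rolls a six half of the time.
e = {"F": {r: 1 / 6 for r in range(1, 7)},
     "L": {**{r: 1 / 10 for r in range(1, 6)}, 6: 1 / 2}}

def sample_rolls(T, start="F"):
    """Generate T rolls; the states are sampled, but to an observer
    only the numbers on the die are visible."""
    states, rolls, s = [], [], start
    for _ in range(T):
        states.append(s)
        faces, probs = zip(*e[s].items())
        rolls.append(random.choices(faces, weights=probs)[0])
        s = random.choices(list(a[s]), weights=list(a[s].values()))[0]
    return states, rolls

hidden, observed = sample_rolls(20)
print(observed)  # what we see: a sequence of die rolls
print(hidden)    # what we do not see: which die produced each roll
```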
Formally defining an HMM
• States
• Emission alphabet
• Parameters
  – State transition probabilities for probabilistic transitions from the state at time t to the state at time t+1
  – Emission probabilities for probabilistically emitting symbols from a state
Notation
• States with emissions will be numbered from 1 to K
  – 0 is the begin state, N is the end state
• $x_t$: observed character at position t
• $x = x_1 \ldots x_T$: observed sequence
• $\pi = \pi_1 \ldots \pi_T$: hidden state sequence or path
• Transition probabilities: $a_{kl} = P(\pi_t = l \mid \pi_{t-1} = k)$
• Emission probabilities: the probability of emitting symbol b from state k, $e_k(b) = P(x_t = b \mid \pi_t = k)$
An example HMM
[Figure: a five-state HMM with begin state 0, emitting states 1-4, and end state 5.
Transitions: a_01 = 0.5, a_02 = 0.5, a_11 = 0.2, a_13 = 0.8, a_22 = 0.8, a_24 = 0.2, a_33 = 0.4, a_35 = 0.6, a_44 = 0.1, a_45 = 0.9.
Emissions: state 1: A 0.4, C 0.1, G 0.2, T 0.3; state 2: A 0.4, C 0.1, G 0.1, T 0.4; state 3: A 0.2, C 0.3, G 0.3, T 0.2; state 4: A 0.1, C 0.4, G 0.4, T 0.1.
For example, e_2(A) = 0.4 is the probability of emitting character A in state 2, and a_13 = 0.8 is the probability of a transition from state 1 to state 3.]
Path notation
[Figure: the same five-state example HMM as above, annotated with a path. A path is a sequence of hidden states, written $\pi = \pi_1 \pi_2 \ldots \pi_T$, where $\pi_t$ is the state occupied at position t.]
Three important questions in HMMs
• How likely is an HMM to have generated a given sequence?
  – Forward algorithm
• What is the most likely “path” for generating a sequence of observations?
  – Viterbi algorithm
• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)
How likely is a given sequence from an HMM?
$$P(x, \pi) = a_{0\pi_1} \prod_{t=1}^{T} e_{\pi_t}(x_t)\, a_{\pi_t \pi_{t+1}}$$
Here $a_{0\pi_1}$ is the initial transition, $e_{\pi_t}(x_t)$ is the probability of emitting symbol $x_t$, and $a_{\pi_t \pi_{t+1}}$ is the state transition between consecutive time points ($\pi_{T+1}$ denotes the end state).
How likely is a given sequence from an HMM?
• But we don’t know what the path is
• So we need to sum over all paths
• The probability over all paths is:
$$P(x) = \sum_{\pi} P(x, \pi)$$
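As a sketch of these formulas, the Python below computes $P(x, \pi)$ for one particular path through the example HMM defined earlier. The dictionaries `a` and `e` transcribe that figure's parameters; the function name `joint_prob` is an illustrative choice.

```python
# Example HMM from the earlier figure: begin state 0, end state 5.
a = {0: {1: 0.5, 2: 0.5},
     1: {1: 0.2, 3: 0.8},
     2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6},
     4: {4: 0.1, 5: 0.9}}

e = {1: {"A": 0.4, "C": 0.1, "G": 0.2, "T": 0.3},
     2: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
     3: {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
     4: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}

BEGIN, END = 0, 5
EMITTING = [1, 2, 3, 4]

def joint_prob(x, path):
    """P(x, pi) = a_{0,pi_1} * prod_t [ e_{pi_t}(x_t) * a_{pi_t, pi_{t+1}} ],
    where the transition out of the last state goes to the end state."""
    p = a[BEGIN].get(path[0], 0.0)
    for t, (sym, k) in enumerate(zip(x, path)):
        p *= e[k].get(sym, 0.0)
        nxt = path[t + 1] if t + 1 < len(path) else END
        p *= a[k].get(nxt, 0.0)
    return p

print(joint_prob("TAGA", [1, 3, 3, 3]))  # 0.00013824 for this one path
```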
Example
• Consider a candidate CpG island: CGCGC
• Using our CpG-island HMM, some possible paths that are consistent with this sequence are:
  C+G+C+G+C+
  C-G-C-G-C-
  C-G+C-G+C-
Number of paths
[Figure: a small HMM with begin state 0, two emitting states 1 and 2 (each with its own emission distribution over A, C, G, T), and end state 3.]
• For a sequence of length T, how many possible paths through this HMM are there? 2^T
• The Forward algorithm enables us to compute the probability of a sequence by efficiently summing over all possible paths (a brute-force sum is sketched below for contrast)
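For contrast, here is a brute-force sketch that literally sums $P(x, \pi)$ over every assignment of emitting states. It assumes `a`, `e`, `EMITTING`, and `joint_prob` from the previous sketch are in scope; note the example HMM has 4 emitting states, so the sum here has 4^T terms (2^T for the two-state HMM on this slide), which is exactly the cost the Forward algorithm avoids.

```python
from itertools import product

# Assumes a, e, EMITTING, and joint_prob from the previous sketch.
def brute_force_prob(x):
    """P(x) = sum over all K^T assignments of emitting states;
    paths using a zero-probability transition contribute 0."""
    return sum(joint_prob(x, list(path))
               for path in product(EMITTING, repeat=len(x)))

print(brute_force_prob("TAGA"))  # 0.00046224, at the cost of K^T terms
```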
How likely is a given sequence: Forward algorithm
• Define $f_k(t)$ as the probability of observing $x_1 \ldots x_t$ and ending in state k at time t:
$$f_k(t) = P(x_1 \ldots x_t, \pi_t = k)$$
• This can be written recursively as follows:
$$f_l(t) = e_l(x_t) \sum_k f_k(t-1)\, a_{kl}$$
Steps of the Forward algorithm
• Initialization: $f_0(0) = 1$, $f_k(0) = 0$ for k > 0 (0 denotes the “begin” state)
• Recursion: for t = 1 to T
$$f_l(t) = e_l(x_t) \sum_k f_k(t-1)\, a_{kl}$$
• Termination:
$$P(x) = f_N(T) = \sum_k f_k(T)\, a_{kN}$$
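A minimal sketch of these steps in Python, using the `a`, `e`, `BEGIN`, `END`, and `EMITTING` definitions for the example HMM given in the earlier sketch; the nested-dictionary encoding and the name `forward` are illustrative choices, not from the slides.

```python
# Assumes a, e, BEGIN, END, EMITTING from the earlier sketch are in scope.
def forward(x):
    """Forward algorithm: computes P(x) in O(K^2 T) time."""
    states = [BEGIN] + EMITTING
    # Initialization: f_0(0) = 1 and f_k(0) = 0 for k > 0.
    f = {k: (1.0 if k == BEGIN else 0.0) for k in states}
    # Recursion: f_l(t) = e_l(x_t) * sum_k f_k(t-1) * a_{kl}.
    for sym in x:
        new_f = {BEGIN: 0.0}
        for l in EMITTING:
            new_f[l] = e[l][sym] * sum(
                f[k] * a.get(k, {}).get(l, 0.0) for k in states)
        f = new_f
    # Termination: P(x) = sum_k f_k(T) * a_{kN}.
    return sum(f[k] * a.get(k, {}).get(END, 0.0) for k in EMITTING)

print(forward("TAGA"))  # 0.00046224 -- matches the table on the next slide
```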
Forward algorithm example
[Figure: the same five-state example HMM as above.]
What is the probability of sequence TAGA?
Table for TAGA
Forward probabilities f_k(t):

States   t=0   t=1 (T)   t=2 (A)   t=3 (G)   t=4 (A)
0        1     0         0         0         0
1        0     0.15      0.012     0.00048   0.0000384
2        0     0.2       0.064     0.00512   0.0016384
3        0     0         0.024     0.00576   0.0005376
4        0     0         0.004     0.00528   0.0001552
5        0     0         0         0         0.00046224

The entry for state 5 at t=4 is also P(x). Computing it does not require f_1(4) and f_2(4), because states 1 and 2 have no transitions to the end state.
Three important questions in HMMs
• How likely is an HMM to have generated a given sequence?
  – Forward algorithm
• What is the most likely “path” for generating a sequence of observations?
  – Viterbi algorithm
• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)
Viterbi algorithm
• The Viterbi algorithm gives an efficient way to find the most likely (most probable) path, i.e., sequence of states
• Consider the dishonest casino example
  – Given a sequence of die rolls, can you infer when the casino was using the loaded versus the fair die?
  – The Viterbi algorithm gives this answer
• Viterbi is very similar to the Forward algorithm
  – Except instead of summing, we maximize
Notation for Viterbi
• $v_k(t)$: probability of the most likely path for $x_1 \ldots x_t$ ending in state k
• $ptr_t(k)$: pointer to the state that gave the maximizing transition into state k at time t
• $\pi^*$: the most probable path for sequence $x_1, \ldots, x_T$
Steps of the Viterbi algorithm
• Initialization: $v_0(0) = 1$, $v_k(0) = 0$ for k > 0
• Recursion: for t = 1 to T
$$v_l(t) = e_l(x_t) \max_k v_k(t-1)\, a_{kl}$$
$$ptr_t(l) = \arg\max_k v_k(t-1)\, a_{kl}$$
• Termination:
$$P(x, \pi^*) = \max_k v_k(T)\, a_{kN} \quad \text{(probability associated with the most likely path)}$$
$$\pi_T^* = \arg\max_k v_k(T)\, a_{kN}$$
• Traceback: $\pi_{t-1}^* = ptr_t(\pi_t^*)$ for t = T down to 2
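A matching Viterbi sketch in Python, again assuming the `a`, `e`, `BEGIN`, `END`, and `EMITTING` definitions from the Forward sketch are in scope. Ties in the maximization are broken by state order here, which may differ from the slides' choice but yields a path of the same probability.

```python
# Assumes a, e, BEGIN, END, EMITTING from the Forward sketch are in scope.
def viterbi(x):
    """Returns (P(x, pi*), pi*): the most probable path and its probability."""
    states = [BEGIN] + EMITTING
    # Initialization: v_0(0) = 1 and v_k(0) = 0 for k > 0.
    v = {k: (1.0 if k == BEGIN else 0.0) for k in states}
    ptrs = []  # ptrs[t-1][l] = ptr_t(l), the best predecessor of state l
    # Recursion: v_l(t) = e_l(x_t) * max_k v_k(t-1) * a_{kl}.
    for sym in x:
        new_v, back = {BEGIN: 0.0}, {}
        for l in EMITTING:
            best = max(states, key=lambda k: v[k] * a.get(k, {}).get(l, 0.0))
            new_v[l] = e[l][sym] * v[best] * a.get(best, {}).get(l, 0.0)
            back[l] = best
        v = new_v
        ptrs.append(back)
    # Termination: maximize v_k(T) * a_{kN}, then trace the pointers back.
    last = max(EMITTING, key=lambda k: v[k] * a.get(k, {}).get(END, 0.0))
    prob = v[last] * a.get(last, {}).get(END, 0.0)
    path = [last]
    for back in reversed(ptrs[1:]):  # no pointer is needed for t = 1
        path.append(back[path[-1]])
    return prob, path[::-1]

print(viterbi("TAG"))  # (0.004608, [2, 2, 4]) -- matches the tables below
```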
Viterbi algorithm example
[Figure: the same five-state example HMM as above.]
What is the most likely path for TAG?
Viterbi computations for TAG
v_k(t):

States   t=0   t=1 (T)   t=2 (A)   t=3 (G)
0        1     0         0         0
1        0     0.15      0.012     0.00048
2        0     0.2       0.064     0.00512
3        0     0         0.024     0.00288
4        0     0         0.004     0.00512
5        0     0         0         0.004608

ptr_t(k):

States   t=1   t=2   t=3
1        0     1     1
2        0     2     2
3        0     1     3
4        0     2     2

Traceback path: the termination step picks state 4 (0.00512 × a_45 = 0.004608 beats 0.00288 × a_35 = 0.001728), and following the pointers back through ptr_3(4) = 2 and ptr_2(2) = 2 gives π* = 2, 2, 4.
Using HMM to detect CpG islands
• Recall the 8-state HMM for CpG islands
• Apply the Viterbi algorithm to a DNA sequence using this HMM
• Contiguous assignments of ‘+’ states will correspond to CpG islands (a segment-extraction sketch follows)
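A small sketch of this final post-processing step: given a Viterbi-decoded state path over the 8-state model, report contiguous runs of ‘+’ states as predicted CpG islands. The `decoded` path below is a made-up example, since the slides do not give trained parameters for this model.

```python
def cpg_islands(path):
    """Given a decoded state path (e.g., ["C-", "C+", "G+", ...]),
    return (start, end) positions (0-based, end-exclusive) of the
    contiguous runs of '+' states, i.e., the predicted CpG islands."""
    islands, start = [], None
    for i, state in enumerate(path):
        if state.endswith("+") and start is None:
            start = i                    # an island opens here
        elif not state.endswith("+") and start is not None:
            islands.append((start, i))   # the island closes here
            start = None
    if start is not None:                # island runs to the end
        islands.append((start, len(path)))
    return islands

# Hypothetical decoded path for the sequence ACGCGTA.
decoded = ["A-", "C+", "G+", "C+", "G+", "T-", "A-"]
print(cpg_islands(decoded))  # [(1, 5)]
```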
Summary
• Hidden Markov models are extensions of Markov chains that enable us to model and segment sequence data
• HMMs are defined by a set of states, emission characters, transition probabilities, and emission probabilities
• We have examined two questions for HMMs
  – Computing the probability of a sequence of observed characters given an HMM (Forward algorithm)
  – Computing the most likely sequence of states (or path) for a sequence of observed characters (Viterbi algorithm)