Hidden Markov Models
Ellen Walker
Bioinformatics
Hiram College, 2008
State Machine to Recognize “AUG”
[Figure: state machine diagram with a start state, a final state, and labeled transitions]
Each character causes a transition to the next state.
“AUG” anywhere in a string
[Figure: state machine diagram]
“AUG” in frame
[Figure: state machine diagram]
Deterministic Finite Automaton (DFA)
• States
  – One start state
  – One or more accept states
• Transitions
  – For every state, for every character
• Outputs
  – Optional: states can emit outputs, e.g. “Stop” at accept state
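As a minimal sketch, the DFA that accepts any string containing “AUG” fits in a transition table plus a loop. The state names and the fallback rule below are illustrative choices, not from the slides:

```python
# Table-driven DFA that accepts any RNA string containing "AUG".
# State meanings: S0 = no prefix matched, S1 = "A" seen, S2 = "AU" seen,
# S3 = "AUG" seen (the accept state, which absorbs all further input).
TRANSITIONS = {
    ("S0", "A"): "S1",
    ("S1", "A"): "S1",   # a fresh "A" restarts the match
    ("S1", "U"): "S2",
    ("S2", "A"): "S1",
    ("S2", "G"): "S3",
}

def accepts(s):
    state = "S0"
    for ch in s:
        if state == "S3":        # accept state: stay accepted
            break
        # Default transition: an unmatched "A" starts over at S1,
        # anything else falls back to the start state.
        state = TRANSITIONS.get((state, ch), "S1" if ch == "A" else "S0")
    return state == "S3"

assert accepts("CCAUGC") and accepts("AUG") and not accepts("AUCG")
```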
Why DFAs?
• Every regular expression has an associated state machine that recognizes it (and vice versa)
• State machines are easy to implement in very low level code (or hardware)
• Sometimes the state machine is easier to describe than the regular expression
Hidden Markov Models
• Also a form of state machine
• Transitions based on probabilities, not inputs
• Every state has (probabilistic) output (or emission)
• “Hidden” because only emissions are visible, not states or transitions
HMM vs. DFA
• DFA is deterministic
  – Each decision (which state next? what to output?) is fully determined by the input string
• HMM is probabilistic
  – HMM makes both decisions based on probability distributions
HMM vs. DFA (2)
• DFA model is explicit and used directly like a program.
• HMM model must be inferred from data. Only emissions (outputs) can be observed; the states and transitions, as well as the probability distributions for transitions and outputs, are hidden.
HMM Example: Fair Bet Casino
• The casino has two coins, a Fair coin (F) and a Biased coin (B)
  – Fair coin has 50% H, 50% T
  – Biased coin has 75% H, 25% T
• Before each flip, with probability 10%, the dealer will switch coins.
• Can you tell, based only on a sequence of H and T which coin is used when?
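The whole model fits in a few Python dictionaries. The 50/50 start probabilities in this sketch match the “Initial” transitions used in the worked example later:

```python
# The "Fair Bet Casino" HMM: two hidden states, two visible outputs.
states = ("F", "B")                       # Fair coin, Biased coin

start_p = {"F": 0.5, "B": 0.5}            # either coin equally likely at the start

trans_p = {                               # 10% chance the dealer switches coins
    "F": {"F": 0.9, "B": 0.1},
    "B": {"F": 0.1, "B": 0.9},
}

emit_p = {                                # P(H) and P(T) for each coin
    "F": {"H": 0.50, "T": 0.50},
    "B": {"H": 0.75, "T": 0.25},
}
```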
“Fair Bet Casino” HMM
Image from Jones & Pevzner, 2004
The Decoding Problem
• Given an HMM and a sequence of outputs, what is the most likely path through the HMM that generated the outputs?
Viterbi Algorithm
• Uses dynamic programming
• Starting point:
  – When the output string is “”, the most likely state is the start state (and there is no path)
• Taking a step:
  – Likelihood of this state is the maximum over all ways to get here, measured as:
    • likelihood of previous state × likelihood of transition to this state × likelihood of output from this state
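A compact Python sketch of this recursion, using the casino dictionaries from the sketch above; backpointers are stored so the best path can be traced back afterwards:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the likelihood table V and backpointers for an output string."""
    # Starting point: the first output is emitted directly from the start state.
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{s: None for s in states}]
    # Taking a step: maximize previous likelihood * transition * output
    # over all ways to reach state s at time t.
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev, p = max(
                ((r, V[t - 1][r] * trans_p[r][s] * emit_p[s][obs[t]])
                 for r in states),
                key=lambda pair: pair[1],
            )
            V[t][s], back[t][s] = p, prev
    return V, back
```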
Example: “HHT”
• Initial -> F
  – Prev = 1, Trans = 0.5, Out = 0.5, total = 0.25
• Initial -> B
  – Prev = 1, Trans = 0.5, Out = 0.75, total = 0.375
• Result: F = 0.25, B = 0.375
Example: “HHT” (2)
• F -> F
  – Prev = 0.25, Trans = 0.9, Out = 0.5, total = 0.1125
• B -> F
  – Prev = 0.375, Trans = 0.1, Out = 0.5, total = 0.01875
• F -> B
  – Prev = 0.25, Trans = 0.1, Out = 0.75, total = 0.01875
• B -> B
  – Prev = 0.375, Trans = 0.9, Out = 0.75, total = 0.253125
• Result: F = 0.1125, B = 0.253125
Example: “HHT” (3)
• F -> F
  – Prev = 0.1125, Trans = 0.9, Out = 0.5, total = 0.0506
• B -> F
  – Prev = 0.253125, Trans = 0.1, Out = 0.5, total = 0.0127
• F -> B
  – Prev = 0.1125, Trans = 0.1, Out = 0.25, total = 0.00281
• B -> B
  – Prev = 0.253125, Trans = 0.9, Out = 0.25, total = 0.0570
• Result: F = 0.0506, B = 0.0570
Tracing Back
• Pick the highest result from the last step, then follow the best transition back through each previous step (just like Smith-Waterman)
• Result: initial -> B -> B -> B
• Biased coin always used
• What if the next flip is T?
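Running the viterbi sketch from above on “HHT” reproduces these numbers, and the traceback recovers the all-biased path:

```python
V, back = viterbi("HHT", states, start_p, trans_p, emit_p)
print(V[-1])                     # {'F': 0.050625, 'B': 0.056953125}

# Pick the highest result from the last step, then follow backpointers.
state = max(V[-1], key=V[-1].get)
path = [state]
for t in range(len(V) - 1, 0, -1):
    state = back[t][state]
    path.append(state)
path.reverse()
print(path)                      # ['B', 'B', 'B'] -- biased coin throughout

# The slide's question: one more T tips the balance, and
# viterbi("HHTT", ...) traces back to all-Fair instead.
```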
Log Probabilities
• Probabilities become vanishingly small as you multiply numbers less than one
• Computers have limits to precision
• Therefore, it’s better to use a log probability format
• 1/10 × 1/10 = 1/100 (10⁻¹ × 10⁻¹ = 10⁻²)
• In log space: -1 + -1 = -2
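The same arithmetic in Python; in log space the products become sums, and taking the max in Viterbi is unaffected because log is monotonic:

```python
import math

# Products of small probabilities underflow; sums of log probabilities do not.
p = 0.1 * 0.1                            # 1/100
lp = math.log10(0.1) + math.log10(0.1)   # -1 + -1 = -2
assert math.isclose(p, 10 ** lp)

# One Viterbi step in log space: multiplication becomes addition.
def step(prev_logp, trans, out):
    return prev_logp + math.log10(trans) + math.log10(out)
```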
GC Rich Islands
• A GC Rich Island is an area of a genome where GC content is significantly greater than in the genome as a whole
• GC Rich Islands are like Biased Coins
• Can recognize them using the same HMM
  – Genome-wide GC content plays the role of p(H) for the fair coin
  – The island’s larger GC content plays the role of p(H) for the biased coin
  – Estimate the probability of entering vs. leaving a GC Rich Island for the “changing coin” probability
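A sketch of how the casino HMM might be re-parameterized for GC Rich Islands. Every number below is an illustrative placeholder, not a value from the slides; real values would be estimated from a genome:

```python
# Same two-state HMM shape as the casino; only the labels and numbers change.
# All probabilities below are illustrative placeholders.
states = ("Genome", "Island")

emit_p = {
    "Genome": {"GC": 0.40, "AT": 0.60},   # genome-wide GC content ~ p(H), fair coin
    "Island": {"GC": 0.65, "AT": 0.35},   # island GC content ~ p(H), biased coin
}

trans_p = {                               # entering/leaving an island ~ "changing coin"
    "Genome": {"Genome": 0.999, "Island": 0.001},
    "Island": {"Genome": 0.01,  "Island": 0.99},
}
```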
Probability of State Sequence, Given Output Sequence
• Given an HMM and an output string, what is the probability that the HMM is in state S at time t?
  – Forward: similar formulation to the decoding problem, except take the sum of all paths instead of the max of all paths (times 0 to t-1)
  – Backward: similar, but work from the end of the string (times t+1 to the end of the sequence)
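The forward pass is the viterbi recursion with sum substituted for max; a sketch:

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """F[t][s] = summed probability of all paths emitting obs[:t+1], ending in s."""
    F = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    for t in range(1, len(obs)):
        F.append({
            s: emit_p[s][obs[t]] * sum(F[t - 1][r] * trans_p[r][s] for r in states)
            for s in states
        })
    return F

# The backward pass is the mirror image, run from the end of the string;
# combining the two tables gives the probability of state s at time t.
```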
Parameter Estimation
• Given many strings, what are the parameters of the HMM that generated them?
  – Assume we know the states and transitions, but not the probabilities of transitions or outputs
  – This is an optimization problem
Characteristics of an Optimization Problem
• Each potential solution has a “goodness” value (in this case, probability)
• We want the best solution
• Perfect answer: try all possibilities (not usually possible)
• Good, but not perfect answer: use a heuristic
Hill Climbing (an Optimization Heuristic)
• Start with a solution (could be random)
• Consider one or more “steps”, or perturbations to the solution
• Choose the “step” that most improves the score
• Repeat until the score is good enough, or no better score can be reached
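The loop itself is small. A generic sketch, where neighbors and score are hypothetical stand-ins for the problem-specific pieces:

```python
def hill_climb(solution, neighbors, score, max_steps=1000):
    """Generic hill climbing: take the best-scoring perturbation, repeat.

    `neighbors(solution)` yields candidate "steps"; `score` is the
    goodness value. Both are hypothetical problem-specific callbacks.
    """
    for _ in range(max_steps):
        candidates = list(neighbors(solution))
        if not candidates:
            break
        best = max(candidates, key=score)
        if score(best) <= score(solution):
            break                          # no better score can be reached
        solution = best
    return solution
```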
Hill Climbing for HMM
• Guess a state sequence
• Using the string(s), estimate transition and emission probabilities
• Using the probabilities, generate a new state sequence using the decoding algorithm
• Repeat until the sequence stabilizes
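A sketch of the estimation step of this loop (the overall procedure is often called Viterbi training). The pseudocount k is an assumption added here to keep estimated probabilities nonzero:

```python
from collections import Counter

def estimate(obs, path, states, k=1):
    """Count transitions and emissions along a guessed state sequence."""
    symbols = sorted(set(obs))
    tc = Counter(zip(path, path[1:]))       # transition counts
    ec = Counter(zip(path, obs))            # emission counts
    trans_p = {a: {b: (tc[a, b] + k) / (sum(tc[a, c] for c in states) + k * len(states))
                   for b in states} for a in states}
    emit_p = {s: {o: (ec[s, o] + k) / (sum(ec[s, x] for x in symbols) + k * len(symbols))
                  for o in symbols} for s in states}
    return trans_p, emit_p

# Then alternate: re-decode with viterbi(...) using the new probabilities,
# re-estimate from the new path, and stop when the path no longer changes.
```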
HMM for Sequence Profiles
• Three kinds of states:
  – Insertion
  – Deletion
  – Match
• Probability estimations indicate how often each occurs
• Logos are direct representations of HMMs in this format
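A tiny sketch of the state layout for a profile HMM with three match columns. The transition structure listed is the common match/insert/delete layout, shown for illustration rather than taken from the slides:

```python
# States for a 3-column profile HMM: Match, Insertion, Deletion per column.
columns = 3
states = [f"{kind}{i}" for i in range(1, columns + 1) for kind in "MID"]
# -> ['M1', 'I1', 'D1', 'M2', 'I2', 'D2', 'M3', 'I3', 'D3']

# Typical allowed moves out of column i (illustrative):
#   M_i -> M_{i+1} | I_i | D_{i+1}     match, then insert or skip the next column
#   I_i -> I_i | M_{i+1} | D_{i+1}     insertions may repeat before moving on
#   D_i -> M_{i+1} | I_i | D_{i+1}     a deletion skips column i's emission
# Match and Insertion states emit residues; Deletion states are silent.
```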