Hidden Markov Models in the context of genetic analysis
Vincent Plagnol
UCL Genetics Institute
November 22, 2012
Outline
1 Introduction
2 Two basic problems
    Forward/backward Baum-Welch algorithm
    Viterbi algorithm
3 When the parameters are unknown
4 Two applications
    Gene prediction
    CNV detection from SNP arrays
5 Two extensions to the basic HMM
    Stochastic EM
    Semi-Markov models
The problem
Many applications of statistics can be seen as categorisation.
We try to fit complex patterns into discrete boxes in order to apprehend them better.
Clustering approaches are typical of this:
    Inference of an individual's ancestry as a mix of populations X and Y.
    Separation between high-risk and low-risk disease groups . . .
Hidden Markov Models achieve exactly this purpose in a different context.
Basic framework
An example: gene discovery from DNA sequence
We will start from this, the simplest example.
We assume that the hidden chain X has two states: gene or intergenic.
To be complete there should be a third state: gene on the reverse strand.
For now we assume that the emission probabilities P(Y_i | X_i) are independent conditionally on the hidden chain X.
This may not be good enough for most applications, but it is a place to start.
Notations
(Y_i)_{i=1}^n represents the sequence of observed data points.
The Y_i can be discrete or continuous, but we will assume discrete for now.
(X_i)_{i=1}^n is the sequence of hidden states.
For all i, X_i ∈ {1, . . . , S}: we have S discrete hidden states.
We also assume that we know the distribution P(Y | X), but this set of parameters may also be unknown.
Basic description of Markov Chains (1)
A discrete stochastic process X is Markovian if

P(X_1^n | X_i) = P(X_1^{i−1} | X_i) P(X_{i+1}^n | X_i)
Essentially, the future and the past are independent conditionally on the present: the process is "memory-less".
One can easily make a continuous-time version of this.
If the Markov model has S states, then the process can be described using an S × S transition matrix.
The diagonal values p_ii describe the probability of staying in state i.
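As a concrete sketch (in Python with NumPy; the two-state matrix below is made up for illustration, loosely evoking the gene/intergenic example), a transition matrix is an S × S array whose rows sum to one, and simulating the chain is a sequence of categorical draws:

```python
import numpy as np

# Hypothetical 2-state transition matrix (state 0: intergenic, state 1: gene).
# Rows sum to one; the diagonal entries p_ii govern how long the chain
# tends to stay in each state.
P = np.array([[0.99, 0.01],
              [0.05, 0.95]])

def simulate_chain(P, n, x0=0, rng=None):
    """Simulate n steps of a discrete Markov chain with transition matrix P."""
    rng = np.random.default_rng(rng)
    x = np.empty(n, dtype=int)
    x[0] = x0
    for i in range(1, n):
        x[i] = rng.choice(P.shape[0], p=P[x[i - 1]])
    return x

chain = simulate_chain(P, 1000, rng=0)
```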
Basic description of Markov Chains (2)
The probability of spending exactly k units of time in state i is:

P(X spends k units in i) = p_ii^k (1 − p_ii)

This is the definition of a geometric variable.
In continuous time it would be an exponential distribution.
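The geometric stay-length claim can be checked numerically (a sketch; p_ii = 0.9 is an arbitrary choice): the probabilities p_ii^k (1 − p_ii) sum to one, and the expected number of extra steps in the state is p_ii / (1 − p_ii):

```python
# Numerical check of the stay-length distribution: with self-transition
# probability p_ii, the number k of further steps spent in state i is
# geometric, P(k) = p_ii**k * (1 - p_ii).
p_ii = 0.9
ks = range(2000)                       # 0.9**2000 is negligible: safe truncation
probs = [p_ii**k * (1 - p_ii) for k in ks]

total = sum(probs)                     # a valid distribution sums to 1
mean_extra = sum(k * p for k, p in zip(ks, probs))  # should be p_ii / (1 - p_ii)
```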
The definition of the present can also be modified: X_i may for example depend on the previous k states instead of the last one.
This increases the size of the parameter space but makes the model richer.
Basics for hidden Markov Chains
The hidden Markov chain framework adds one layer (denoted Y) to the Markovian process described previously.
The conditional distribution P(Y_j | X_j = s) may be unknown, completely specified, or partially specified.
Typically the number of hidden states S is relatively small (no more than a few hundred states).
But n may be very large, i.e. X and Y may be very long sequences (think DNA sequences).
Slightly more general version
Without complicating anything, we can most of the time assume that P(Y_j | X_j) also varies with j.
Y could also be a Markov chain.
Non-Markovian stays can be, to some extent, mimicked by using a sequence of hidden states:
first part of the gene, middle of the gene, end of the gene.
The set of parameters Θ
1 (P_st) is the transition matrix for the hidden states.
2 Q_sk = P(Y = k | X = s) is the probability distribution of the observed chain Y given X.
3 Lastly, we need a vector μ to initiate the hidden chain X.
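The parameter set Θ = (P, Q, μ) can be collected in a small container; the class name `HMMParams` and all the numbers below are illustrative, not from the slides:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class HMMParams:
    """Container for the parameter set Theta (names are illustrative)."""
    P: np.ndarray    # (S, S) transition matrix for the hidden states
    Q: np.ndarray    # (S, K) emission probabilities, Q[s, k] = P(Y = k | X = s)
    mu: np.ndarray   # (S,) initial distribution of the hidden chain

# Toy gene/intergenic model over the four DNA letters (made-up numbers):
theta = HMMParams(
    P=np.array([[0.99, 0.01],
                [0.05, 0.95]]),
    Q=np.array([[0.3, 0.2, 0.2, 0.3],    # intergenic: AT-rich
                [0.2, 0.3, 0.3, 0.2]]),  # gene: GC-rich
    mu=np.array([0.5, 0.5]),
)
```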
Two related problems
1 At a given point i in the sequence, what is the most likely hidden state X_i?
2 What is the most likely hidden sequence (X_i)_{i=1}^n?
The first question relates to marginal probabilities and the second to the joint likelihood.
What we can compute at this stage
At this stage our tools are limited.
Given a sequence x = (x_1, . . . , x_n) we can compute

P(X = x, Y = y) = P(X = x) P(Y = y | X = x)

This is the full joint likelihood for (X, Y).
Why problem 1 is difficult
P(X_i = x_i | Y) = P(X_i = x_i, Y) / P(Y) = P(X_i = x_i, Y) / Σ_{s=1}^S P(X_i = s, Y)

So the problem amounts to estimating P(X_i = s, Y).
A direct computation would sum over all possible sequences:

P(X_i = s, Y) = Σ_{x : x_i = s} P(X = x, Y)

With S hidden states we would need to sum over S^n terms, which is not practical.
We need to be smarter.
We need to use the Markovian assumption
P(X_i = s, Y) = P(X_i = s) P(Y | X_i = s)
= P(X_i = s) Σ_x P(Y, X = x | X_i = s)
= P(X_i = s) [ Σ_{x_1^i} P(Y_1^i, X_1^i = x_1^i | X_i = s) ] × [ Σ_{x_{i+1}^n} P(Y_{i+1}^n, X_{i+1}^n = x_{i+1}^n | X_i = s) ]
= P(X_i = s) P(Y_1^i | X_i = s) × P(Y_{i+1}^n | X_i = s)
= P(Y_1^i, X_i = s) P(Y_{i+1}^n | X_i = s)
= α_s(i) × β_s(i)
A new computation
We have shown that:
P(X_i = s | Y) = α_s(i) β_s(i) / Σ_{t=1}^S α_t(i) β_t(i)

where:
α_s(i) = P(Y_1^i, X_i = s)
β_s(i) = P(Y_{i+1}^n | X_i = s)

And it is actually possible to compute, recursively, the quantities α_s(i) and β_s(i).
Two recursive computations
The (forward) recursion for α is:

α_s(i+1) = P(Y_{i+1} | X_{i+1} = s) × Σ_{t=1}^S α_t(i) P_ts

The (backward) recursion for β is:

β_s(i−1) = Σ_t P_st β_t(i) P(Y_i | X_i = t)
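The two recursions can be sketched in a few lines of Python with NumPy (the 2-state model below is invented for illustration); `gamma` holds the marginal posteriors P(X_i = s | Y) from the previous slide:

```python
import numpy as np

def forward_backward(P, Q, mu, y):
    """Forward-backward recursions for a discrete HMM (a sketch).

    alpha[i, s] = P(Y_1..Y_i, X_i = s)
    beta[i, s]  = P(Y_{i+1}..Y_n | X_i = s)
    """
    n, S = len(y), P.shape[0]
    alpha = np.zeros((n, S))
    beta = np.ones((n, S))
    alpha[0] = mu * Q[:, y[0]]
    for i in range(1, n):                       # forward pass
        alpha[i] = Q[:, y[i]] * (alpha[i - 1] @ P)
    for i in range(n - 2, -1, -1):              # backward pass
        beta[i] = P @ (Q[:, y[i + 1]] * beta[i + 1])
    post = alpha * beta                         # proportional to P(X_i = s | Y)
    return alpha, beta, post / post.sum(axis=1, keepdims=True)

# Toy 2-state model; all numbers are illustrative.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.7, 0.3], [0.1, 0.9]])
mu = np.array([0.5, 0.5])
y = [0, 0, 1, 1]
alpha, beta, gamma = forward_backward(P, Q, mu, y)
```

On a chain this short the result can be checked against brute-force enumeration of all S^n paths, which is exactly the computation the recursions avoid.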
Proof for the first recursion
α_s(i+1) = P(Y_1^{i+1}, X_{i+1} = s)
= Σ_t P(Y_1^{i+1}, X_{i+1} = s | X_i = t) P(X_i = t)
= Σ_t P(Y_1^{i+1} | X_{i+1} = s, X_i = t) P(X_{i+1} = s | X_i = t) P(X_i = t)
= P(Y_{i+1} | X_{i+1} = s) Σ_t P_ts P(Y_1^i | X_i = t, X_{i+1} = s) P(X_i = t)
= P(Y_{i+1} | X_{i+1} = s) Σ_t P_ts P(Y_1^i, X_i = t)
= P(Y_{i+1} | X_{i+1} = s) Σ_t P_ts α_t(i)
A similar proof is used for the backward recursion.
Computational considerations
The algorithm requires storing n × S floats.
In terms of computation time, the requirements are in S² × n.
Linearity in n is the key feature because it enables the analysis of very long DNA sequences.
Note that probabilities rapidly become vanishingly small.
Everything needs to be done on the log scale (be careful when implementing it).
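The key primitive for log-scale implementations is a stable log-sum-exp; a minimal sketch (the numbers are illustrative, chosen so that the raw probabilities would underflow to zero):

```python
import numpy as np

def logsumexp(a):
    """Numerically stable log(sum(exp(a))): subtract the max first."""
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

# Log-space forward step: log_alpha_next[s]
#   = log P(Y_{i+1} | X_{i+1}=s) + logsumexp_t( log_alpha[t] + log P_ts )
log_alpha = np.array([-1000.0, -1001.0])   # would underflow as raw probabilities
logP = np.log(np.array([[0.9, 0.1], [0.2, 0.8]]))
log_emit = np.log(np.array([0.7, 0.1]))
log_alpha_next = np.array(
    [log_emit[s] + logsumexp(log_alpha + logP[:, s]) for s in range(2)]
)
```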
Various R packages are available for hidden Markov Chains(google it!).
Problem 2: Finding the most likely hidden sequence X̂

A different problem consists of finding the most likely hidden sequence X̂.
Indeed, the most likely X_i under the marginal distribution may be quite different from X̂_i.
An algorithm exists to achieve this maximisation: the Viterbi algorithm.
The Viterbi algorithm
Define

V_s(i) = max_{x_1^{i−1}} P(Y_1^i, X_1^{i−1} = x_1^{i−1}, X_i = s)

Similarly to the previous problem, a forward recursion can be defined for V_s(i+1) as a function of the V_t(i).
Following this forward computation, a reverse parsing of the Markov chain identifies the most likely sequence.
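The forward maximisation plus reverse parsing can be sketched as follows (Python with NumPy, in log space; the toy model numbers are illustrative):

```python
import numpy as np

def viterbi(P, Q, mu, y):
    """Most likely hidden sequence, computed in log space."""
    n, S = len(y), P.shape[0]
    logP, logQ = np.log(P), np.log(Q)
    V = np.zeros((n, S))                 # V[i, s] = best log-prob ending in s
    back = np.zeros((n, S), dtype=int)   # argmax pointers for the reverse parse
    V[0] = np.log(mu) + logQ[:, y[0]]
    for i in range(1, n):                # forward maximisation
        scores = V[i - 1][:, None] + logP          # scores[t, s]
        back[i] = scores.argmax(axis=0)
        V[i] = logQ[:, y[i]] + scores.max(axis=0)
    x = np.zeros(n, dtype=int)           # reverse parsing of the pointers
    x[-1] = V[-1].argmax()
    for i in range(n - 2, -1, -1):
        x[i] = back[i + 1][x[i + 1]]
    return x

P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.7, 0.3], [0.1, 0.9]])
mu = np.array([0.5, 0.5])
path = viterbi(P, Q, mu, [0, 0, 1, 1])
```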
An exercise
Here is a table showing the log-likelihood of the data for three states (one state per row, 6 positions in the chain): entry (s, i) is the log-likelihood of the data at position i given hidden state s (s can be 1, 2 or 3).

State    1  2  3  4  5  6
1        1  3  4  3  5  4
2        2  1  5  8  5  1
3        4  2  2  4  1  5

Assume that remaining in the same state costs nothing in log-likelihood, but that transitioning from one state to another costs one unit of log-likelihood. The initial distribution over the three states is uniform. Compute

V_s(i) = max_{x_1^{i−1}} P(Y_1^i, X_1^{i−1} = x_1^{i−1}, X_i = s)

and find the most likely (Viterbi) path.
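One way to check a hand computation of the exercise (a sketch; I read the table entries as scores to be maximised and a state switch as a penalty of one unit, per the slide's informal conventions):

```python
import numpy as np

# Table from the exercise: rows = states 1..3, columns = positions 1..6.
E = np.array([[1, 3, 4, 3, 5, 4],
              [2, 1, 5, 8, 5, 1],
              [4, 2, 2, 4, 1, 5]], dtype=float)
switch_cost = 1.0                      # penalty for changing state

S, n = E.shape
V = np.zeros((n, S))
back = np.zeros((n, S), dtype=int)
V[0] = E[:, 0]                         # uniform start: a constant, ignored
for i in range(1, n):
    cost = switch_cost * (1 - np.eye(S))   # cost[t, s]: moving from t to s
    scores = V[i - 1][:, None] - cost
    back[i] = scores.argmax(axis=0)
    V[i] = E[:, i] + scores.max(axis=0)

best_score = V[-1].max()               # best total score
path = np.zeros(n, dtype=int)
path[-1] = V[-1].argmax()
for i in range(n - 2, -1, -1):
    path[i] = back[i + 1][path[i + 1]]
states = path + 1                      # report states as 1..3
```

Note that ties are possible; the backtracking returns one optimal path.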
A few words about Andrew Viterbi
Andrew James Viterbi (born in Bergamo in 1935) is an Italian-American electrical engineer and businessman.
In addition to his academic work he co-founded Qualcomm.
Viterbi made a very large donation to the University of Southern California, which named its engineering school the Viterbi School of Engineering.
Computational considerations
Requirements are the same as before.
The algorithm requires storing n × S floats.
In terms of computation time, the requirements are in S² × n.
Linearity in n is the key feature because it enables the analysis of very long DNA sequences.
It is easy to code (in C or R; see the example and R libraries).
Unknown parameters case
Often we do not know the distribution P(Y |X ).
We may also not know the transition probabilities for the hidden Markov chain X.
If the parameters Θ are not known, how can we estimate them?
What if we knew X?
If we knew X, the problem would become straightforward.
For example, the maximum likelihood estimate would be:

P̂(Y = k | X = s) = Σ_i 1_{Y_i = k, X_i = s} / Σ_i 1_{X_i = s}

More sophisticated (but still straightforward) versions of this could be used if Y were a higher-order Markov chain.
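The count-ratio estimate is a one-liner once X is observed; a small sketch with made-up toy data:

```python
import numpy as np

# If the hidden chain X were observed, the emission MLE is a count ratio.
X = np.array([0, 0, 1, 1, 1, 0])   # hidden states (toy data)
Y = np.array([2, 0, 1, 1, 2, 0])   # observed symbols (toy data)
S, K = 2, 3

Q_hat = np.zeros((S, K))
for s in range(S):
    in_s = (X == s)
    for k in range(K):
        # number of positions with Y=k among those with X=s
        Q_hat[s, k] = np.sum(in_s & (Y == k)) / np.sum(in_s)
```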
A typical missing data problem
In this missing data context, a widely used algorithm is the Expectation-Maximisation (EM) algorithm.
The EM algorithm is set up to find the parameters that maximise the likelihood of the observed data Y in the presence of missing data X.
At each step the likelihood is guaranteed to increase.
The algorithm can easily get stuck in a local maximum of the likelihood surface.
The basic idea of the EM
The EM is a general iterative algorithm with multiple applications.
It first computes the expected value of the log-likelihood given the current parameters θ_n (essentially imputing the hidden chain X):

Q(θ, θ_n) = E_{X|Y, θ_n} [ log L(X, Y; θ) ]

It then maximises the quantity Q(θ, θ_n) as a function of θ:

θ_{n+1} = argmax_θ Q(θ, θ_n)
EM in the context of HMM
P_st = Σ_i P(X_i = s, X_{i+1} = t | Y) / Σ_i P(X_i = s | Y)

Q_sk = Σ_i 1_{Y_i = k} P(X_i = s | Y) / Σ_i P(X_i = s | Y)
The updated probabilities can be estimated using the sequences α_s, β_s estimated previously.
This special case of the EM for HMM is called the Baum-Welch algorithm.
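The update formulas can be checked on a tiny example; in practice the posteriors come from the α, β recursions, but here brute-force enumeration over a very short chain keeps the sketch self-contained (all model numbers are illustrative):

```python
import numpy as np
from itertools import product

# Toy model; the posteriors gamma and xi below would normally come from
# the alpha/beta recursions, but on a length-3 chain enumeration works.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.7, 0.3], [0.1, 0.9]])
mu = np.array([0.5, 0.5])
y = [0, 1, 1]
n, S = len(y), 2

def joint(x):
    p = mu[x[0]] * Q[x[0], y[0]]
    for i in range(1, n):
        p *= P[x[i - 1], x[i]] * Q[x[i], y[i]]
    return p

paths = list(product(range(S), repeat=n))
Z = sum(joint(x) for x in paths)                 # P(Y)
gamma = np.zeros((n, S))                         # P(X_i = s | Y)
xi = np.zeros((n - 1, S, S))                     # P(X_i = s, X_{i+1} = t | Y)
for x in paths:
    w = joint(x) / Z
    for i in range(n):
        gamma[i, x[i]] += w
    for i in range(n - 1):
        xi[i, x[i], x[i + 1]] += w

# Baum-Welch (EM) updates, matching the formulas above:
P_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
Q_new = np.zeros_like(Q)
for k in range(Q.shape[1]):
    mask = np.array([yi == k for yi in y], dtype=float)
    Q_new[:, k] = (mask[:, None] * gamma).sum(axis=0) / gamma.sum(axis=0)
```

Here the sum in the denominator of the P update runs over i = 1, . . . , n−1, so that the updated rows are proper distributions.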
Gene prediction
Zhang, Nat Rev Genetics, 2002
Some drawbacks of this approach
The number of hidden states can be very large.
Modelling codons takes three states, plus probably three states for the first and three states for the last codons.
So about nine states just for the exons.
One probably needs nine more states on the reverse strand.
Some alternatives exist (using semi-Markov models).
Copy number variant detection from SNP arrays
[Figure: scatter plot of SNP array signal intensities, Allele 2 against Allele 1, both axes from 0.0 to 1.5.]
Copy number variant detection from SNP arrays
Wang et al, Genome Research 2007
Stochastic EM (SEM)
The EM/Baum-Welch algorithm essentially uses the conditional distribution of X given Y.
Another way to compute this expectation is to use a Monte-Carlo approach, simulating X given Y and taking an average.
This is a trade-off:
    We of course lose the certainty that the likelihood is increasing (as provided by the EM).
    However, the added randomness may avoid the pitfall of the estimator getting stuck in a local maximum (a major issue with the EM).
Stochastic EM (SEM)
A simulation of X conditionally on Y uses the following decomposition:

P(X_1^N | Y_1^N) = P(X_1 | Y_1^N) P(X_2 | Y_1^N, X_1) · · · P(X_N | Y_1^N, X_1^{N−1})

This relies on being able to compute these conditional probabilities, and the forward/backward quantities provide exactly that.
Once α and β have been computed, the simulation is linear in time and multiple sequences can be simulated rapidly.
How to simulate in practice
The simulation uses the equality:

P(X_{i+1} = t | Y, X_i = s) = P_st P(Y_{i+1} | X_{i+1} = t) P(Y_{i+2}^n | X_{i+1} = t) / P(Y_{i+1}^n | X_i = s)
= P_st P(Y_{i+1} | X_{i+1} = t) β_t(i+1) / β_s(i)

Note that this is a forward-backward algorithm as well, but the forward step is built into the simulation step, unlike the traditional Baum-Welch.
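This sampling step can be sketched as follows (Python with NumPy; the backward recursion is recomputed inside the function so the sketch is self-contained, and the model numbers are illustrative):

```python
import numpy as np

def sample_posterior_path(P, Q, mu, y, rng=None):
    """Draw one hidden sequence X from P(X | Y) by forward sampling,
    using the backward quantities beta (a sketch of the SEM step)."""
    rng = np.random.default_rng(rng)
    n, S = len(y), P.shape[0]
    beta = np.ones((n, S))
    for i in range(n - 2, -1, -1):                 # backward recursion
        beta[i] = P @ (Q[:, y[i + 1]] * beta[i + 1])
    x = np.empty(n, dtype=int)
    p0 = mu * Q[:, y[0]] * beta[0]                 # proportional to P(X_1 = s | Y)
    x[0] = rng.choice(S, p=p0 / p0.sum())
    for i in range(n - 1):
        # P(X_{i+1}=t | Y, X_i=s) proportional to P_st Q[t, y_{i+1}] beta_t(i+1)
        p = P[x[i]] * Q[:, y[i + 1]] * beta[i + 1]
        x[i + 1] = rng.choice(S, p=p / p.sum())
    return x

P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.7, 0.3], [0.1, 0.9]])
mu = np.array([0.5, 0.5])
sample = sample_posterior_path(P, Q, mu, [0, 0, 1, 1], rng=0)
```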
Estimation issues
Using a single simulated run for the hidden chain X is necessarily less efficient than relying on the expected probabilities.
The number of data points must be very large to make the estimation precise.
One could potentially take an average over multiple simulated runs.
With a sufficient number of simulations one actually gets very close to the EM.
Like most practical estimation procedures, one has to find the right combination of tools, and there is no single answer.
Semi-Markov models (HSMM)
In the context of gene prediction, using three states per codon is not satisfying.
We would like something that takes groups of 3 bp into account jointly.
Semi-Markov models do exactly this:
    When entering a state s, a random variable T_s is drawn for the duration of the stay in state s.
    The emission probability for Y can then be defined over the entire duration of the stay.
    So codons are naturally defined as groups of 3 bp instead of dealing with multiple hidden states.
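A generative sketch of this mechanism (Python; the transition matrix and duration laws are invented for illustration, with exon-like stays lasting a multiple of 3 bp; the matrix is assumed to carry no mass on self-transitions, since the duration already encodes the stay length):

```python
import numpy as np

def simulate_semi_markov(P, duration_sampler, n, x0=0, rng=None):
    """Generative sketch of a semi-Markov hidden chain: on entering state s,
    draw a duration T_s, stay there for T_s steps, then transition."""
    rng = np.random.default_rng(rng)
    states = []
    s = x0
    while len(states) < n:
        T = duration_sampler(s, rng)       # duration of the stay in state s
        states.extend([s] * T)
        probs = P[s].copy()                # transition to a *different* state
        probs[s] = 0.0
        probs /= probs.sum()
        s = rng.choice(len(probs), p=probs)
    return np.array(states[:n])

# Illustrative: state 1 ("exon") lasts a multiple of 3 bp (whole codons).
P = np.array([[0.0, 1.0], [1.0, 0.0]])
def dur(s, rng):
    return 3 * int(rng.integers(1, 5)) if s == 1 else int(rng.integers(5, 20))

x = simulate_semi_markov(P, dur, 100, rng=0)
```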
Backward recursion for SEM applied to semi-Markov hidden chains

We are interested in computing the quantities:

∀n ∈ [1, N−1], ∀i ∈ [1, S], β_i(n) = P(Y_{n+1}^N | Y_1^n, X_n = i)
β_i(N) = 1

β_i(n) = P(Y_{n+1}^N | Y_1^n, X_n = i)
= Σ_j Σ_{l < N−n} P_ij P(T_j = l) P(Y_{n+1}^{n+l} | X_{n+1}^{n+l} = j) β_j(n + l)

Note that the complexity is now in N S² × max(l), as opposed to N S² before.
Forward simulations for SEM
One can simulate a new hidden sequence recursively with the formula:

P(X_{n+1}^{n+l} = j | Y_1^N, X_n = i) = P_ij P(T_j = l) P(Y_{n+1}^{n+l} | X_{n+1}^{n+l} = j) β_j(n + l) / β_i(n)

This is very much analogous to the basic HMM situation, with the extra complication generated by the variable state length.
Estimation for semi-Markov models
It is possible to run a Viterbi algorithm using the same recursion derived for the Markovian case.
It is also possible to use a SEM algorithm to simulate the hidden sequence X and use it to estimate the parameters of the model.
A full EM is also possible, but I never implemented it.
The computational requirements may become challenging, but it all depends on the application.