Rolling Dice Data Analysis - Hidden Markov Model
Danielle Tan
Haolin Zhu
Observations-Histogram 1
[Figure: Histogram of All Dice Rolls; x-axis: die face (1 to 6); y-axis: counts from 0 to 2500]
Observations-Histogram 2
[Figure: three histograms over die faces 1 to 6: Histogram of Dice Rolls #1 to #1000; Histogram of Dice Rolls #1001 to #2000; Histogram of Dice Rolls #7001 to #8000]
#1-1000: fair die?
#1001-2000: loaded die 1?
#7001-8000: loaded die 2?
Observations-Cumulative Sum
[Figure: Cumulative Sum of Dice Roll Values (0 to 3.5 × 10^4) against roll number (0 to 9000); curves: Actual Data, Fair Dice; marked roll numbers: 1500, 1980, 3690, 3790, 4250, 4660, 5700, 6500, 7700]
Fair region’s slope = 3.5; Loaded regions have approx. same slope of 4.5
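The two slopes can be checked as expected values of a single roll. The sketch below assumes the loaded-die probabilities proposed later in the slides (1/10 for faces 1 to 5, 1/2 for face 6); the helper name `expected_value` is illustrative only.

```python
from fractions import Fraction

# Fair die: every face has probability 1/6.
fair = {face: Fraction(1, 6) for face in range(1, 7)}

# Loaded die (assumed from the emission guess in the slides):
# faces 1-5 have probability 1/10 each, face 6 has probability 1/2.
loaded = {face: Fraction(1, 10) for face in range(1, 6)}
loaded[6] = Fraction(1, 2)


def expected_value(pmf):
    """Mean roll value, i.e. the slope of the cumulative sum per roll."""
    return sum(face * p for face, p in pmf.items())


fair_mean = expected_value(fair)
loaded_mean = expected_value(loaded)
print(float(fair_mean), float(loaded_mean))  # 3.5 4.5
```

A cumulative sum that grows by about 4.5 per roll is therefore consistent with a die that shows a 6 half of the time.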
Observations-Histogram 3
[Figure: histograms over die faces 1 to 6 for the 1st, 3rd and 5th loaded regions; recovered panel titles: Histogram of Dice Rolls #4250 to #4660; Histogram of Dice Rolls #7000 to #7780]
Observations
2 dice: one is fair, one is loaded.
Loaded regions are: #1500-1980, #3690-3790, #4250-4660, #5700-6500 and #7000-7700.
Probability of 6 on the loaded die is ½.
Once either die is in use, it continues to be used for a while.
Hidden Markov Model
Known information:
A sequence of observations with integers between 1-6.
Questions of interest:
How was this data set generated?
What portion of the data was generated by the fair die and the loaded die respectively?
What are the probabilities of the transition between the dice?
What is the probability of generating 6 using the loaded die?
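Hidden-Markov Model
Define two states: Fair and Loaded.
Probabilities of the transition between the two states:
[State diagram: each state returns to itself with probability 0.95 and switches to the other state with probability 0.05]
Transition matrix (rows: current state, columns: next state, in the order Fair, Loaded):
$$A = \begin{pmatrix} 0.95 & 0.05 \\ 0.05 & 0.95 \end{pmatrix}$$
A guess from observation!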
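One way to read the guessed stay probability of 0.95: the number of consecutive rolls spent in a state is geometrically distributed, so its mean is 1 / (1 - stay probability). The sketch below checks this for the guessed matrix; `expected_run_length` is a hypothetical helper name.

```python
# Guessed transition matrix; state order is (Fair, Loaded).
A = [[0.95, 0.05],
     [0.05, 0.95]]

# Each row is a probability distribution over the next state.
for row in A:
    assert abs(sum(row) - 1.0) < 1e-12


def expected_run_length(stay_probability):
    """Mean of a geometric run: rolls spent in a state before leaving it."""
    return 1.0 / (1.0 - stay_probability)


runs = [expected_run_length(A[i][i]) for i in range(len(A))]
print([round(r, 2) for r in runs])
```

Under this guess each die would be used for about 20 rolls at a time, which is much shorter than the loaded regions of several hundred rolls read off the cumulative sum, so re-estimation can be expected to push the stay probabilities closer to 1.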
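Hidden-Markov Model
In each state, there are 6 possible outputs:

| Face | Fair | Loaded |
|------|------|--------|
| 1 | 1/6 | 1/10 |
| 2 | 1/6 | 1/10 |
| 3 | 1/6 | 1/10 |
| 4 | 1/6 | 1/10 |
| 5 | 1/6 | 1/10 |
| 6 | 1/6 | 1/2 |

Emission matrix (rows: Fair, Loaded; columns: faces 1 to 6):
$$b = \begin{pmatrix} 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \\ 1/10 & 1/10 & 1/10 & 1/10 & 1/10 & 1/2 \end{pmatrix}$$
Again a guess!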
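To see what data this guessed model would produce, it can be simulated directly. This is a minimal sketch, not the authors' code: the names `STATES`, `A`, `B` and `simulate` are illustrative, and starting in the fair state is an assumption the slides do not state.

```python
import random

STATES = ("F", "L")  # Fair, Loaded

# Guessed transition probabilities: A[current][next].
A = {"F": {"F": 0.95, "L": 0.05},
     "L": {"F": 0.05, "L": 0.95}}

# Guessed emission probabilities for faces 1..6.
B = {"F": [1 / 6] * 6,
     "L": [1 / 10] * 5 + [1 / 2]}


def simulate(n_rolls, seed=0, start="F"):
    """Draw a hidden state path and the rolls it emits."""
    rng = random.Random(seed)
    state = start  # assumption: the first roll uses the fair die
    states, rolls = [], []
    for _ in range(n_rolls):
        states.append(state)
        rolls.append(rng.choices(range(1, 7), weights=B[state])[0])
        next_weights = [A[state][s] for s in STATES]
        state = rng.choices(STATES, weights=next_weights)[0]
    return states, rolls


states, rolls = simulate(10_000)
loaded_rolls = [r for s, r in zip(states, rolls) if s == "L"]
print("".join(states[:60]))
print("share of sixes while loaded:",
      sum(r == 6 for r in loaded_rolls) / len(loaded_rolls))
```

The printed state string shows the runs of F and L that an observer never sees; only `rolls` corresponds to the data set analysed here.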
Hidden-Markov Model
A set of observations: $\mathbf{y} = (y_1, y_2, \ldots, y_N)$
The states are hidden: $\mathbf{s} = (s_1, s_2, \ldots, s_N)$
For example: s = (FFFFFFFFLLLFFFLL…)
Given the output sequence y, we need to find the most likely set of state transition and output probabilities. In other words, we derive the maximum likelihood estimate of the parameters of the HMM (the transition and emission probabilities) given a dataset of output sequences.
Forward-Backward algorithm
What is the probability that the actual state of the system is i at time t?
The probability of the observed data up to time t:
$$\alpha_1(j) = b_j(y_1); \qquad \alpha_t(j) = \sum_{i_1, \ldots, i_{t-1}} b_{i_1}(y_1)\, A_{i_1 i_2}\, b_{i_2}(y_2) \cdots A_{i_{t-1} j}\, b_j(y_t) = \sum_{i=1}^{M} \alpha_{t-1}(i)\, A_{ij}\, b_j(y_t)$$
The probability of the observed data after time t:
$$\beta_t(i) = \sum_{j=1}^{M} A_{ij}\, b_j(y_{t+1})\, \beta_{t+1}(j)$$
Then:
$$P_t(i) = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{j=1}^{M} \alpha_t(j)\, \beta_t(j)}, \qquad P_t(i) = P(s_t = i \mid \mathbf{y})$$
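The forward and backward recursions can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name is hypothetical, the backward pass starts from the usual convention that the last beta equals 1, and each step is rescaled to avoid numerical underflow on long sequences (a standard trick that the slides do not mention).

```python
import math


def forward_backward(y, A, b):
    """Return P_t(i) = P(s_t = i | y) for every t, plus the log-likelihood.

    y: rolls with values 1..6; A[i][j]: transition probabilities;
    b[i][k - 1]: probability that state i emits face k.
    """
    N, M = len(y), len(A)
    alpha = [[0.0] * M for _ in range(N)]
    beta = [[1.0] * M for _ in range(N)]  # assumed convention: last beta = 1
    scale = [0.0] * N

    # Forward pass: alpha_1(j) = b_j(y_1), then the recursion over t.
    for t in range(N):
        for j in range(M):
            if t == 0:
                prior = 1.0
            else:
                prior = sum(alpha[t - 1][i] * A[i][j] for i in range(M))
            alpha[t][j] = prior * b[j][y[t] - 1]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]  # rescale each step

    # Backward pass, divided by the same scale factors.
    for t in range(N - 2, -1, -1):
        for i in range(M):
            beta[t][i] = sum(A[i][j] * b[j][y[t + 1] - 1] * beta[t + 1][j]
                             for j in range(M)) / scale[t + 1]

    # Posterior: normalised product of forward and backward terms.
    posterior = []
    for t in range(N):
        w = [alpha[t][i] * beta[t][i] for i in range(M)]
        total = sum(w)
        posterior.append([x / total for x in w])

    loglik = sum(math.log(c) for c in scale)
    return posterior, loglik


A = [[0.95, 0.05], [0.05, 0.95]]         # state order: Fair, Loaded
b = [[1 / 6] * 6, [1 / 10] * 5 + [1 / 2]]
y = [1, 2, 3, 4, 5] * 2 + [6] * 15 + [1, 2, 3, 4, 5] * 2  # made-up rolls

posterior, loglik = forward_backward(y, A, b)
print(round(posterior[0][0], 3), round(posterior[17][1], 3))
```

On this made-up sequence the posterior favours the fair state at the start and the loaded state inside the run of sixes. Because the initial condition carries no prior over states, `loglik` is defined only up to an additive constant, which does not affect comparisons between parameter sets.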
Baum-Welch re-estimation
Notice that we are using a guess of the transition matrix and the emission matrix!
Re-estimation of A and b:
$$A'_{ij} = \frac{\sum_t \alpha_t(i)\, A_{ij}\, b_j(y_{t+1})\, \beta_{t+1}(j)}{\sum_t \alpha_t(i)\, \beta_t(i)}$$
$$b'_i(k) = \frac{\sum_t \delta(y_t, k)\, \alpha_t(i)\, \beta_t(i)}{\sum_t \alpha_t(i)\, \beta_t(i)}$$
Here $\delta(y_t, k)$ is 1 when $y_t = k$ and 0 otherwise.
We then iterate, tracking the probability of the whole data set under the current parameters, until it converges to a (local) maximum.
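One re-estimation step can be sketched as below, again in plain Python and under the same assumptions as the scaled forward-backward pass (last beta equal to 1, per-step rescaling). The function names and the short test sequence are hypothetical. In the update of A the denominator runs over all times except the last, which keeps each row of the new matrix summing to 1.

```python
import math


def scaled_forward_backward(y, A, b):
    """Scaled alpha, beta and the per-step scale factors c."""
    N, M = len(y), len(A)
    alpha, c = [], []
    beta = [[1.0] * M for _ in range(N)]
    for t in range(N):
        row = [(1.0 if t == 0 else
                sum(alpha[t - 1][i] * A[i][j] for i in range(M)))
               * b[j][y[t] - 1] for j in range(M)]
        c.append(sum(row))
        alpha.append([v / c[t] for v in row])
    for t in range(N - 2, -1, -1):
        beta[t] = [sum(A[i][j] * b[j][y[t + 1] - 1] * beta[t + 1][j]
                       for j in range(M)) / c[t + 1] for i in range(M)]
    return alpha, beta, c


def baum_welch_step(y, A, b, n_faces=6):
    """One update of A and b; also returns the log-likelihood of the inputs."""
    N, M = len(y), len(A)
    alpha, beta, c = scaled_forward_backward(y, A, b)
    # gamma[t][i] is the posterior probability of state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(M)] for t in range(N)]

    new_A = [[0.0] * M for _ in range(M)]
    for i in range(M):
        denom = sum(gamma[t][i] for t in range(N - 1))
        for j in range(M):
            num = sum(alpha[t][i] * A[i][j] * b[j][y[t + 1] - 1]
                      * beta[t + 1][j] / c[t + 1] for t in range(N - 1))
            new_A[i][j] = num / denom

    new_b = [[0.0] * n_faces for _ in range(M)]
    for i in range(M):
        denom = sum(gamma[t][i] for t in range(N))
        for k in range(1, n_faces + 1):
            new_b[i][k - 1] = sum(gamma[t][i] for t in range(N)
                                  if y[t] == k) / denom

    return new_A, new_b, sum(math.log(v) for v in c)


# Made-up rolls: fair-looking stretch, six-heavy stretch, fair-looking stretch.
fair_like = [1, 2, 3, 4, 5, 6, 2, 4, 1, 3]
loaded_like = [6, 6, 1, 6, 6, 6, 3, 6, 6, 6, 5, 6, 6, 2, 6]
y = fair_like * 3 + loaded_like + fair_like * 3

A = [[0.95, 0.05], [0.05, 0.95]]
b = [[1 / 6] * 6, [1 / 10] * 5 + [1 / 2]]
logliks = []
for _ in range(10):
    A, b, loglik = baum_welch_step(y, A, b)
    logliks.append(loglik)
print([round(v, 3) for v in logliks])
```

The recorded log-likelihoods should never decrease from one iteration to the next, which is the convergence check described above.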
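Results
Transition matrix:
$$A = \begin{pmatrix} 0.9982 & 0.0018 \\ 0.0036 & 0.9964 \end{pmatrix}$$
Emission matrix:
$$b = \begin{pmatrix} 0.1696 & 0.1766 & 0.1583 & 0.1661 & 0.1649 & 0.1645 \\ 0.0985 & 0.0973 & 0.1019 & 0.0951 & 0.1051 & 0.5022 \end{pmatrix}$$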
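A few derived quantities help relate these fitted matrices to the earlier observations. The sketch below uses only the reported numbers; the variable names are illustrative, and the run-length and long-run-share formulas are the standard ones for a two-state Markov chain rather than something stated in the slides.

```python
# Fitted matrices as reported; state order is (Fair, Loaded).
A = [[0.9982, 0.0018],
     [0.0036, 0.9964]]
b = [[0.1696, 0.1766, 0.1583, 0.1661, 0.1649, 0.1645],
     [0.0985, 0.0973, 0.1019, 0.0951, 0.1051, 0.5022]]

# Mean number of consecutive rolls with each die (geometric run length).
fair_run = 1.0 / A[0][1]
loaded_run = 1.0 / A[1][0]

# Long-run share of rolls made with the fair die (two-state chain).
stationary_fair = A[1][0] / (A[0][1] + A[1][0])

# Mean roll value per state, comparable with the cumulative-sum slopes.
fair_mean = sum(face * p for face, p in enumerate(b[0], start=1))
loaded_mean = sum(face * p for face, p in enumerate(b[1], start=1))

print(round(fair_run), round(loaded_run))
print(round(stationary_fair, 3))
print(round(fair_mean, 2), round(loaded_mean, 2))
```

The fitted model implies runs of roughly 556 rolls with the fair die and 278 with the loaded die, about two thirds of all rolls coming from the fair die, and mean roll values close to the slopes of 3.5 and 4.5 seen in the cumulative sum.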
Results
Time when the loaded die was used:
Results
Histogram of the data generated by the Hidden-Markov model:
Results
Cumulative sum of the data generated by the Hidden-Markov model:
Results
Log of the likelihood