Recitation: Bayes Nets and Friends
Lyle Ungar and Tony Liu
Heavily adapted from slides by Mitch Marcus (cis520/lectures/bayes)
Recitation Plan
◆ Naïve Bayes Exercise
◆ LDA Example
◆ Bayes Net Exercises
◆ HMM Example
Recall: Naïve Bayes
Recall: Naïve Bayes
◆ Conditional independence assumption
◆ As MAP estimator (uses prior for smoothing)
⚫ Contrast MLE – what’s the problem?
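To make the MLE problem concrete: with maximum-likelihood word probabilities, any word never seen in a class gets probability zero, which zeroes out the entire class likelihood for documents containing it. A MAP estimate with a symmetric Dirichlet prior (add-one / Laplace smoothing) avoids this. A minimal sketch, using toy counts made up for illustration:

```python
import numpy as np

# Toy spam example (invented counts): word counts per class.
vocab = ["cash", "meeting", "prize"]
counts_spam = np.array([10, 0, 5])   # "meeting" never seen in spam
counts_ham  = np.array([1, 12, 0])

# MLE: relative frequencies -- zero counts give zero probabilities,
# so any document containing "meeting" gets p(doc | spam) = 0.
mle_spam = counts_spam / counts_spam.sum()

# MAP with a symmetric Dirichlet prior (add-one / Laplace smoothing):
alpha = 1.0
map_spam = (counts_spam + alpha) / (counts_spam.sum() + alpha * len(vocab))

print(mle_spam)   # "meeting" has probability 0 under MLE
print(map_spam)   # every word gets nonzero probability under MAP
```

The prior acts as pseudo-counts: every word behaves as if it had been seen `alpha` extra times in each class.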
Naïve Bayes Exercise
Recall: The LDA Model

[Graphical model figure: three document plates, each with topic variables z1–z4 generating words w1–w4, all sharing the topic–word parameters β]

◆ For each document,
⚫ Choose the topic distribution θ ~ Dirichlet(α)
⚫ For each of the N words wn:
◼ Choose a topic z ~ Multinomial(θ)
◼ Then choose a word wn ~ Multinomial(βz)
Where each topic has a different parameter vector β for the words
θd ~ Dirichlet(α)
zd,n ~ Multinomial(θd)
wd,n ~ Multinomial(βzd,n)
David Blei 2012: Probabilistic Topic Models (on course website)
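The generative process above can be sketched directly with numpy for a single document; the sizes, prior values, and seed here are arbitrary choices for illustration, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, N = 3, 8, 10         # topics, vocabulary size, words per document
alpha = np.ones(K)          # symmetric Dirichlet prior (assumed value)
beta = rng.dirichlet(np.ones(V), size=K)  # one word distribution per topic

theta = rng.dirichlet(alpha)              # document's topic distribution
z = rng.choice(K, size=N, p=theta)        # a topic for each word position
w = np.array([rng.choice(V, p=beta[k]) for k in z])  # the observed words
```

In a real corpus only `w` is observed; `theta` and `z` are the hidden variables that inference must recover.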
LDA Parameter Estimation
◆ Given a corpus of documents, find the parameters α and β which maximize the likelihood of the observed data (words in documents), marginalizing over the hidden variables θ, z
◆ E-step:
⚫ Compute p(θ,z|w,α,β), the posterior of the hidden variables (θ,z) given each document w, and parameters α and β.
◆ M-step:
⚫ Estimate parameters α and β given the current hidden variable distribution estimates
θ: topic distribution for the document; z: topic for each word in the document
You don’t need to know the details; only what is hidden and what is observed, and that EM works here.
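The E/M alternation can be seen in miniature on a simpler cousin of LDA, the mixture-of-unigrams model, where each document has a single hidden topic rather than a per-word topic. This is not LDA's variational EM (which is more involved), just a sketch of the same hidden/observed split; the corpus and sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy corpus: word-count matrix, 6 documents x 5 vocabulary words.
X = np.array([[5, 4, 0, 0, 1],
              [6, 3, 1, 0, 0],
              [4, 5, 0, 1, 0],
              [0, 1, 5, 4, 0],
              [1, 0, 4, 5, 1],
              [0, 0, 6, 3, 1]], dtype=float)
D, V = X.shape
K = 2
pi = np.full(K, 1.0 / K)                  # mixing weights over topics
beta = rng.dirichlet(np.ones(V), size=K)  # topic-word distributions

for _ in range(50):
    # E-step: posterior over each document's hidden topic
    log_r = np.log(pi) + X @ np.log(beta).T          # D x K
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate pi and beta from the soft assignments
    pi = r.mean(axis=0)
    beta = (r.T @ X) + 1e-9                          # avoid log(0)
    beta /= beta.sum(axis=1, keepdims=True)
```

Here the words `X` are observed, the topic responsibilities `r` play the role of the hidden-variable posterior, and `pi`, `beta` are the parameters being maximized.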
LDA Exercise
Recall: Bayes Nets
Recall: Bayes Nets
◆ Local Markov Assumption
⚫ given its parents, each node is conditionally independent of its non-descendants
◆ Active Trails
◆ D-Separation
Active Trails
A trail {X1, X2, ⋯, Xk} in the graph (no cycles) is an active trail if for each consecutive triplet in the trail:
◆ Xi−1 → Xi → Xi+1, and Xi is not observed
◆ Xi−1 ← Xi ← Xi+1, and Xi is not observed
◆ Xi−1 ← Xi → Xi+1, and Xi is not observed
◆ Xi−1 → Xi ← Xi+1, and Xi is observed or one of its descendants is observed
Variables connected by active trails are not conditionally independent
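The four triplet rules can be applied mechanically. A sketch (my own helper, not from the slides) that checks whether a given trail is active, assuming consecutive trail nodes are adjacent in the graph:

```python
def is_active_trail(edges, trail, observed):
    """edges: set of directed (parent, child) pairs;
    trail: list of nodes; observed: set of observed nodes."""
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)

    def descendants(x):
        out, stack = set(), [x]
        while stack:
            for c in children.get(stack.pop(), ()):
                if c not in out:
                    out.add(c)
                    stack.append(c)
        return out

    for a, b, c in zip(trail, trail[1:], trail[2:]):
        if (a, b) in edges and (b, c) in edges:      # chain a -> b -> c
            blocked = b in observed
        elif (b, a) in edges and (c, b) in edges:    # chain a <- b <- c
            blocked = b in observed
        elif (b, a) in edges and (b, c) in edges:    # fork  a <- b -> c
            blocked = b in observed
        else:                                        # collider a -> b <- c
            blocked = b not in observed and not (descendants(b) & observed)
        if blocked:
            return False
    return True
```

For example, the collider X → Z ← Y is inactive when nothing is observed but becomes active once Z (or a descendant of Z) is observed, while the chain X → Z → Y behaves the opposite way.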
D-separation
◆ Variables Xi and Xj are independent if there is no active trail between Xi and Xj
⚫ given a set of observed variables O ⊂ {X1,⋯,Xm}
⚫ O is sometimes called a Markov Blanket
Bayes Net Exercises
Recall: Hidden Markov Models
Recall: Hidden Markov Models
◆ Markov assumption
◆ Parameterization
Parameters of an HMM
◆ States: a set of states S = {s1, …, sk}
◆ Markov transition probabilities: A = {a1,1, a1,2, …, ak,k}. Each ai,j = p(sj | si), the probability of transitioning from state si to state sj
◆ Emission probabilities: a set B of functions of the form bi(ot) = p(ot | si), giving the probability of observation ot being emitted by state si
◆ Initial state distribution: πi, the probability that si is a start state
Markov Model Example

Markov Transition Matrix A (today’s weather → tomorrow’s weather):

| Today \ Tomorrow | Sunny | Rainy |
|---|---|---|
| Sunny | 0.8 | 0.2 |
| Rainy | 0.6 | 0.4 |

S1 = [0, 1] (state order [Sunny, Rainy], so day 1 is rainy)
Steady state at [0.75, 0.25]
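The steady state can be found by repeatedly propagating the state distribution through the transition matrix (power iteration):

```python
import numpy as np

A = np.array([[0.8, 0.2],    # row: today sunny -> [sunny, rainy]
              [0.6, 0.4]])   # row: today rainy
s = np.array([0.0, 1.0])     # day 1: rainy

for _ in range(50):
    s = s @ A                # advance the distribution one day

print(np.round(s, 4))        # converges to the steady state [0.75, 0.25]
```

Any valid starting distribution converges to the same [0.75, 0.25], since the chain has a unique stationary distribution.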
Hidden Markov Model Example

Markov Transition Matrix A (today’s weather → tomorrow’s weather):

| Today \ Tomorrow | Sunny | Rainy |
|---|---|---|
| Sunny | 0.8 | 0.2 |
| Rainy | 0.6 | 0.4 |

Emission Probabilities B:

| Observation | Sunny | Rainy |
|---|---|---|
| Umbrella | 0.1 | 0.8 |
| No Umbrella | 0.9 | 0.2 |

S1 = [0.5, 0.5]
We observe: (umbrella, no umbrella)
We can ask questions like: what is the joint probability of the states (rain, sun) and our observations?
HMM Exercise