Selected Topics in Graphical Models
Petr Šimeček
Independence
Unconditional independence $X \perp\!\!\!\perp Y$:
- Discrete r.v.: $p(x, y) = p(x)\,p(y)$ for all $x, y$
- Continuous r.v.: $f(x, y) = f(x)\,f(y)$ P-a.s.
Conditional independence $X \perp\!\!\!\perp Y \mid Z$:
- Discrete r.v.: $p(x, y, z)\,p(z) = p(x, z)\,p(y, z)$ for all $x, y, z$
- Continuous r.v.: $f(x, y, z)\,f(z) = f(x, z)\,f(y, z)$ P-a.s.
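The discrete definitions on this slide can be checked numerically on a small joint table. The following is a minimal sketch, not part of the original slides: the function names, the tolerance and the example distributions are all illustrative assumptions.

```python
def is_independent(joint, tol=1e-12):
    """joint[(x, y)] = p(x, y). Checks p(x, y) = p(x) p(y) for all x, y."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return all(abs(joint.get((x, y), 0.0) - px[x] * py[y]) <= tol
               for x in px for y in py)


def is_cond_independent(joint, tol=1e-12):
    """joint[(x, y, z)] = p(x, y, z). Checks p(x,y,z) p(z) = p(x,z) p(y,z)."""
    pz, pxz, pyz = {}, {}, {}
    for (x, y, z), p in joint.items():
        pz[z] = pz.get(z, 0.0) + p
        pxz[(x, z)] = pxz.get((x, z), 0.0) + p
        pyz[(y, z)] = pyz.get((y, z), 0.0) + p
    xs = {x for x, _ in pxz}
    ys = {y for y, _ in pyz}
    return all(
        abs(joint.get((x, y, z), 0.0) * pz[z]
            - pxz.get((x, z), 0.0) * pyz.get((y, z), 0.0)) <= tol
        for x in xs for y in ys for z in pz
    )


# Hypothetical example: Z is a fair coin; given Z = z, X and Y are
# independent Bernoulli variables with success probability 0.9 (z=1) or 0.1 (z=0).
q = {1: 0.9, 0: 0.1}
joint_xyz = {
    (x, y, z): 0.5 * (q[z] if x else 1 - q[z]) * (q[z] if y else 1 - q[z])
    for x in (0, 1) for y in (0, 1) for z in (0, 1)
}
# Marginal table of (X, Y), obtained by summing out Z.
joint_xy = {
    (x, y): joint_xyz[(x, y, 0)] + joint_xyz[(x, y, 1)]
    for x in (0, 1) for y in (0, 1)
}

print(is_cond_independent(joint_xyz))  # X and Y are independent given Z
print(is_independent(joint_xy))        # but not unconditionally independent
```

The example also shows that conditional independence does not imply unconditional independence: the common cause Z makes X and Y dependent once it is summed out.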
List of Independence Relationships
N random variables X1, X2, …, XN and their distribution P.
List of all conditional and unconditional independence relations between them:
$\{(A, B \mid C);\ A, B, C \subseteq \{1, \dots, N\}\ \text{disjoint},\ (X_i)_{i \in A} \perp\!\!\!\perp (X_i)_{i \in B} \mid (X_i)_{i \in C}\}$
Representation by Graph
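[Figure: two example graphs, one on nodes X1 to X6 and one on nodes X1 to X4; the edges are not recoverable from the transcript.]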
Example – Sprinkler Network
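[Figure: the sprinkler network, a directed acyclic graph with arcs Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → WetGrass and Rain → WetGrass.]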
Example – Sprinkler Network
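[Figure: the sprinkler network with a conditional probability table attached to each node.]

| CLOUDY | T | F |
|---|---|---|
| | 0.5 | 0.5 |

| SPRINK | T | F |
|---|---|---|
| C=T | 0.1 | 0.9 |
| C=F | 0.5 | 0.5 |

| RAIN | T | F |
|---|---|---|
| C=T | 0.8 | 0.2 |
| C=F | 0.2 | 0.8 |

| WET GRASS | T | F |
|---|---|---|
| R=T, S=T | 0.99 | 0.01 |
| R=T, S=F | 0.9 | 0.1 |
| R=F, S=T | 0.9 | 0.1 |
| R=F, S=F | 0 | 1 |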
Example – Sprinkler Network
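[Figure: the sprinkler network with nodes abbreviated to C (Cloudy), S (Sprinkler), R (Rain) and W (WetGrass).]
$P(C, R, S, W) = P(C) \cdot P(R \mid C) \cdot P(S \mid C) \cdot P(W \mid R, S)$
The number of parameters need not grow exponentially with the number of variables!
It depends on the number of parents of each node.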
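The parameter count for the sprinkler network can be worked out directly from the parent sets. This is a minimal sketch: the helper names are illustrative, and all variables are assumed to take the same number of values (binary here).

```python
# Parent sets read off the factorization P(C) P(R|C) P(S|C) P(W|R,S).
PARENTS = {"C": [], "S": ["C"], "R": ["C"], "W": ["R", "S"]}


def bn_parameters(parents, arity=2):
    """Free parameters of a BN: (arity - 1) per parent configuration of each node."""
    return sum((arity - 1) * arity ** len(pa) for pa in parents.values())


def full_table_parameters(n, arity=2):
    """Free parameters of an unrestricted joint table over n variables."""
    return arity ** n - 1


print(bn_parameters(PARENTS))               # 1 + 2 + 2 + 4 = 9
print(full_table_parameters(len(PARENTS)))  # 2**4 - 1 = 15

# Hypothetical chain X1 -> X2 -> ... -> X30: one parent per node after the first.
chain = {i: ([i - 1] if i > 0 else []) for i in range(30)}
print(bn_parameters(chain))                 # 1 + 29 * 2 = 59
print(full_table_parameters(30))            # 2**30 - 1 = 1073741823
```

With four variables the saving is small, but in the chain example the count grows linearly in the number of variables, while the full table grows exponentially.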
Purpose 1 – Propagation of Evidence
[Figure: the sprinkler network with nodes Cloudy, Sprinkler, Rain and WetGrass.]
What is the probability that it is raining, given that the grass is wet?
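For a network this small, the question can be answered by brute-force enumeration of the joint distribution. The sketch below uses the conditional probability tables from the sprinkler example; the function names are illustrative, and enumeration is only one possible way to compute the answer, not necessarily the method intended in the talk.

```python
from itertools import product

# Conditional probability tables from the sprinkler example.
P_C = {True: 0.5, False: 0.5}    # P(C = c)
P_S = {True: 0.1, False: 0.5}    # P(S = T | C = c)
P_R = {True: 0.8, False: 0.2}    # P(R = T | C = c)
P_W = {                          # P(W = T | R = r, S = s), keyed by (r, s)
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.9, (False, False): 0.0,
}


def bern(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(c, s, r, w):
    """P(C, S, R, W) via the factorization P(C) P(S|C) P(R|C) P(W|R,S)."""
    return P_C[c] * bern(P_S[c], s) * bern(P_R[c], r) * bern(P_W[(r, s)], w)


def posterior_rain_given_wet():
    """P(R = T | W = T), summing the joint over the unobserved variables."""
    tf = (True, False)
    numerator = sum(joint(c, s, True, True) for c, s in product(tf, repeat=2))
    evidence = sum(joint(c, s, r, True) for c, s, r in product(tf, repeat=3))
    return numerator / evidence


print(posterior_rain_given_wet())  # 0.4581 / 0.6471, approximately 0.708
```

Observing wet grass raises the probability of rain from the prior 0.5 to roughly 0.71.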
Propagation of Evidence
In general: I have observed some variable(s).
- What is the probability of the other variable(s)?
- What are their most probable values?
Why not convert the BN to a contingency table?
Marginalization does not work for large N: it needs $2^N$ memory and much time, and has low precision…
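The memory argument is easy to quantify. The sketch below assumes binary variables and one 8-byte floating-point number per cell of the contingency table; both are assumptions made for illustration.

```python
def table_bytes(n_variables, bytes_per_cell=8):
    """Memory needed for a full joint table over n binary variables."""
    return (2 ** n_variables) * bytes_per_cell


for n in (10, 20, 30, 40, 50):
    size = table_bytes(n)
    # Report in GiB (2**30 bytes) to make the growth visible.
    print(f"N = {n:2d}: {size / 2 ** 30:.6g} GiB")
```

Under these assumptions, 30 variables already require 8 GiB, and every additional variable doubles the table.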
Purpose 2 – Parameter Learning
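[Figure: the sprinkler network with all entries of the conditional probability tables unknown.]

| CLOUDY | T | F |
|---|---|---|
| | ? | ? |

| SPRINK | T | F |
|---|---|---|
| C=T | ? | ? |
| C=F | ? | ? |

| RAIN | T | F |
|---|---|---|
| C=T | ? | ? |
| C=F | ? | ? |

| WET GRASS | T | F |
|---|---|---|
| R=T, S=T | ? | ? |
| R=T, S=F | ? | ? |
| R=F, S=T | ? | ? |
| R=F, S=F | ? | ? |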
Parameter Learning
We know:
- graph (CI structure)
- sample (observations) of the BN
We don't know:
- conditional probability distributions (could be estimated by MLE or Bayesian statistics)
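With the graph fixed, maximum likelihood estimation of a conditional probability table reduces to counting. The sketch below estimates P(Rain = T | Cloudy) from the eight observations listed on the Structure Learning slide; the function name and the `alpha` pseudo-count (a simple Bayesian estimate with a symmetric prior) are illustrative assumptions.

```python
from collections import Counter

COLUMNS = ("C", "S", "R", "W")  # Cloudy, Sprinkler, Rain, WetGrass
DATA = [
    (True, False, False, False),
    (False, True, False, False),
    (True, False, True, False),
    (False, False, False, False),
    (False, True, False, False),
    (False, False, True, True),
    (True, False, True, True),
    (True, False, True, False),
]


def estimate_cpt(data, child, parent, alpha=0.0):
    """Estimate P(child = T | parent = v) for v in (True, False).

    alpha = 0 gives the maximum likelihood estimate (relative frequency);
    alpha > 0 adds a pseudo-count to each outcome (Bayesian smoothing).
    """
    ci, pi = COLUMNS.index(child), COLUMNS.index(parent)
    counts = Counter((row[pi], row[ci]) for row in data)
    cpt = {}
    for v in (True, False):
        n_true, n_false = counts[(v, True)], counts[(v, False)]
        cpt[v] = (n_true + alpha) / (n_true + n_false + 2 * alpha)
    return cpt


print(estimate_cpt(DATA, "R", "C"))             # MLE: {True: 0.75, False: 0.25}
print(estimate_cpt(DATA, "R", "C", alpha=1.0))  # smoothed towards 0.5
```

Eight rows are far too few for reliable estimates; the smoothed variant mainly guards against zero counts.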
Purpose 3 – Structure Learning
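| CLOUDY | SPRINKLER | RAIN | WET GRASS |
|---|---|---|---|
| TRUE | FALSE | FALSE | FALSE |
| FALSE | TRUE | FALSE | FALSE |
| TRUE | FALSE | TRUE | FALSE |
| FALSE | FALSE | FALSE | FALSE |
| FALSE | TRUE | FALSE | FALSE |
| FALSE | FALSE | TRUE | TRUE |
| TRUE | FALSE | TRUE | TRUE |
| TRUE | FALSE | TRUE | FALSE |
| … | … | … | … |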
Structure Learning
We know:
- independent observations (data) of the BN
- sometimes, the causal ordering of the variables
We don't know:
- graph (CI structure)
- conditional probability distributions
Solution:
- CI tests
- maximization of some criterion (AIC, BIC, Bayesian approach) over a huge search space
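To give a sense of how large the search space is, the number of directed acyclic graphs on n labelled nodes can be computed with Robinson's recurrence. This is an illustration added here, not something taken from the slides; the function name is an assumption.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of DAGs on n labelled nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


for n in range(1, 11):
    print(n, count_dags(n))
```

The count is 25 for three nodes and 543 for four, and it grows super-exponentially, so exhaustive scoring of all candidate graphs is feasible only for very small n.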
Example – Entry Examination
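Markov Equivalence
The direction of some arcs can be changed without changing the CI relationships.
The best one can hope to do is to identify the model up to Markov equivalence.
[Figure: two two-node graphs on Rain and WetGrass; the arrow directions are not recoverable from the transcript.]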
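For two variables, the equivalence can be verified by factorizing the same joint distribution in both directions. The numbers below are hypothetical and chosen only for illustration.

```python
TF = (True, False)

# Model Rain -> WetGrass (hypothetical parameters).
p_r = {True: 0.5, False: 0.5}          # P(R = r)
p_w_given_r = {True: 0.9, False: 0.4}  # P(W = T | R = r)

joint = {
    (r, w): p_r[r] * (p_w_given_r[r] if w else 1.0 - p_w_given_r[r])
    for r in TF for w in TF
}

# Model WetGrass -> Rain: parameters derived from the same joint distribution.
p_w = {w: sum(joint[(r, w)] for r in TF) for w in TF}             # P(W = w)
p_r_given_w = {(r, w): joint[(r, w)] / p_w[w] for r in TF for w in TF}

rebuilt = {(r, w): p_w[w] * p_r_given_w[(r, w)] for r in TF for w in TF}

# Both directions describe exactly the same distribution.
print(all(abs(joint[k] - rebuilt[k]) < 1e-12 for k in joint))
```

Because both factorizations reproduce the same joint table, no amount of observational data can distinguish the two arc directions.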
Structure Learning
Theory:
- algorithms proved to be asymptotically correct; Janžura, Nielsen (2003)
- 1 000 000 observations for 10 binary variables
Practice:
- in medicine, usually 50-1500 observations
- BNs are often used in spite of that
Structure Learning - Simulation
- 3 variables, take m from 100 to 1000
- for each m, do 100 times:
  - generate a Bayesian network
  - generate m samples
  - use the K2 structure learning algorithm
- estimate the probability of successful selection for each m
This should answer the question: “Is there a chance of finding the true model?”
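The simulation loop can be sketched as follows. The slides do not say how the networks were generated or how success was defined, so everything specific here is an assumption: binary variables, a known node ordering, each admissible arc included with probability 0.5, CPT entries drawn uniformly from (0, 1), and success defined as recovering exactly the true parent sets. The K2 part is a plain greedy search with the Cooper-Herskovits score, not the author's implementation.

```python
import itertools
import math
import random


def random_network(rng, n=3, edge_prob=0.5):
    """Random DAG respecting the order 0 < 1 < ... < n-1, with random CPTs."""
    parents = {i: [j for j in range(i) if rng.random() < edge_prob] for i in range(n)}
    cpts = {  # cpts[i][parent values] = P(X_i = 1 | parents)
        i: {cfg: rng.random() for cfg in itertools.product((0, 1), repeat=len(parents[i]))}
        for i in range(n)
    }
    return parents, cpts


def sample(rng, parents, cpts, m):
    """Draw m observations by ancestral sampling along the node order."""
    rows = []
    for _ in range(m):
        row = []
        for i in range(len(parents)):
            cfg = tuple(row[j] for j in parents[i])
            row.append(1 if rng.random() < cpts[i][cfg] else 0)
        rows.append(row)
    return rows


def log_k2_score(rows, child, pa):
    """Log K2 (Cooper-Herskovits) score of a binary child given parent set pa."""
    counts = {}
    for row in rows:
        cfg = tuple(row[j] for j in pa)
        counts.setdefault(cfg, [0, 0])[row[child]] += 1
    return sum(
        math.lgamma(n0 + 1) + math.lgamma(n1 + 1) - math.lgamma(n0 + n1 + 2)
        for n0, n1 in counts.values()
    )


def k2(rows, n):
    """Greedy K2 search: add the best-scoring predecessor while the score improves."""
    learned = {}
    for i in range(n):
        pa, best = [], log_k2_score(rows, i, [])
        candidates = list(range(i))
        while candidates:
            score, j = max((log_k2_score(rows, i, pa + [c]), c) for c in candidates)
            if score <= best:
                break
            pa, best = pa + [j], score
            candidates.remove(j)
        learned[i] = sorted(pa)
    return learned


def success_rate(m, repeats=100, seed=0):
    """Fraction of repeats in which K2 recovers exactly the true parent sets."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        parents, cpts = random_network(rng)
        hits += k2(sample(rng, parents, cpts, m), 3) == parents
    return hits / repeats


for m in range(100, 1001, 300):
    print(m, success_rate(m))
```

The resulting rates depend heavily on these assumptions, in particular on how the CPTs are drawn, so they should not be expected to reproduce the curve reported in the talk.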
[Figure: probability of true model selection (%), y-axis from 45 to 75, plotted against the number of observations, x-axis from 200 to 1000.]
To Do List:
- software: free, open source, easy to use, fast, separate API
- more simulations: theory vs. practice
- popularization of structure learning
- Czech literature: maybe my PhD thesis
References:
Castillo E. et al. (1997): Expert Systems and Probabilistic Network Models, Springer Verlag.
Neapolitan R. E. (2003): Learning Bayesian Networks, Prentice Hall.
Janžura N., Nielsen J. (2003): A numerical method for learning Bayesian Networks from Statistical Data, WUPES.