Probabilistic Modelling and Reasoning
Chris Williams
School of Informatics, University of Edinburgh
September 2008
Course Introduction
Welcome
Administration
Handout
Books
Assignments
Tutorials
Course rep(s)
Maths level
Relationships between courses
PMR Probabilistic Modelling and Reasoning. Focus on probabilistic modelling. Learning and inference for probabilistic models, e.g. probabilistic expert systems, latent variable models, hidden Markov models, Kalman filters, Boltzmann machines.
IAML Introductory Applied Machine Learning. Basic introductory course on supervised and unsupervised learning
MLPR More advanced course on machine learning, including coverage of Bayesian methods
RL Reinforcement Learning. Focus on Reinforcement Learning (i.e.delayed reward).
DME Develops ideas from IAML, PMR to deal with real-world datasets. Also data visualization and new techniques.
Dealing with Uncertainty
The key foci of this course are
1. The use of probability theory as a calculus of uncertainty
2. The learning of probability models from data
Graphical descriptions are used to define (in)dependence
Probabilistic graphical models give us a framework for dealing with hidden-cause (or latent variable) models
Probability models can be used for classification problems, by building a probability density model for each class
Example 1: QMR-DT
Diagnostic aid in the domain of internal medicine
600 diseases, 4000 symptom nodes
Task is to infer diseases given symptoms
[Figure: bipartite network with disease nodes (top layer) connected to symptom nodes (bottom layer); shaded nodes represent observations]
Example 2: Inference for Automated Driving
Model of a vision-based lane sensor for car driving
Dynamic belief network: performing inference through time
See Russell and Norvig, §17.5
[Figure: dynamic belief network with nodes LanePosition, PositionSensor, SensorAccuracy, SensorFailure, Terrain and Weather, each replicated at times t and t+1]
Further Examples
Automated Speech Recognition using Hidden Markov Models
acoustic signal → phones → words
Detecting genes in DNA (Krogh, Mian, Haussler, 1994)
Tracking objects in images (Kalman filter and extensions)
Troubleshooting printing problems under Windows 95 (Heckerman et al., 1995)
Robot navigation: inferring where you are
Probability Theory
Why probability?
Events, Probability
Variables
Joint distribution
Conditional Probability
Bayes’ Rule
Inference
Reference: e.g. Bishop §1.2; Russell and Norvig, chapter 14
Why probability?
Even if the world were deterministic, probabilistic assertions summarize effects of
laziness: failure to enumerate exceptions, qualifications etc.
ignorance: lack of relevant facts, initial conditions etc.
Other approaches to dealing with uncertainty
Default or non-monotonic logics
Certainty factors (as in MYCIN) – ad hoc
Dempster-Shafer theory
Fuzzy logic handles degree of truth, not uncertainty
Events
The set of all possible outcomes of an experiment is called the sample space, denoted by Ω
Events are subsets of Ω
If A and B are events, A ∩ B is the event “A and B”; A ∪ B is theevent “A or B”; Ac is the event “not A”
A probability measure is a way of assigning probabilities to events s.t.
P(∅) = 0, P(Ω) = 1
If A ∩ B = ∅ then P(A ∪ B) = P(A) + P(B)
i.e. probability is additive for disjoint events
Example: when two fair dice are thrown, what is the probability that the sum is 4?
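The dice question can be answered by direct enumeration of the sample space; a minimal Python sketch:

```python
from itertools import product

# Sample space: all 36 ordered pairs of faces from two fair dice.
omega = list(product(range(1, 7), repeat=2))

# Event A: "the sum is 4", i.e. {(1,3), (2,2), (3,1)}.
event = [w for w in omega if sum(w) == 4]

# Outcomes are equally likely, so P(A) = |A| / |Omega|.
p = len(event) / len(omega)
print(p)  # 3/36 = 1/12
```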
Variables
A variable takes on values from a collection of mutually exclusive and collectively exhaustive states, where each state corresponds to some event
A variable X is a map from the sample space to the set of states
Examples of variables
Colour of a car: blue, green, red
Number of children in a family: 0, 1, 2, 3, 4, 5, 6, > 6
Toss two coins, let X = (number of heads)². X can take on the values 0, 1 and 4.
Random variables can be discrete or continuous
Use capital letters to denote random variables and lower case letters to denote values that they take, e.g. P(X = x)
∑_x P(X = x) = 1
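The two-coin variable above can be tabulated directly; a small sketch of the map from outcomes to states:

```python
from itertools import product
from collections import Counter

# Toss two fair coins; X maps each outcome to (number of heads)^2.
outcomes = list(product('HT', repeat=2))

def X(outcome):
    return outcome.count('H') ** 2

# P(X = x) for each state x; the probabilities sum to 1.
counts = Counter(X(w) for w in outcomes)
p_x = {x: n / len(outcomes) for x, n in counts.items()}
print(p_x)  # X takes values 0, 1 and 4 with probabilities 1/4, 1/2, 1/4
```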
Probability: Frequentist and Bayesian
Frequentist probabilities are defined in the limit of an infinite number of trials
Example: “The probability of a particular coin landing heads up is 0.43”
Bayesian (subjective) probabilities quantify degrees of belief
Example: “The probability of it raining tomorrow is 0.3”
Not possible to repeat “tomorrow” many times
Frequentist interpretation is a special case
Joint distributions
Properties of several random variables are important for modelling complex problems
Suppose Toothache and Cavity are the variables:
                 Toothache = true   Toothache = false
Cavity = true         0.04               0.06
Cavity = false        0.01               0.89
Notation: P(Toothache = true, Cavity = false) = 0.01
The order of the variables does not matter:
P(Toothache = true, Cavity = false) = P(Cavity = false, Toothache = true)
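The joint table can be stored as a small lookup; a sketch using the same numbers as the table above:

```python
# Joint distribution P(Toothache, Cavity), keyed by
# (toothache, cavity) truth values, as in the table.
joint = {
    (True,  True):  0.04,
    (False, True):  0.06,
    (True,  False): 0.01,
    (False, False): 0.89,
}

# A valid joint distribution sums to 1 over all cells.
total = sum(joint.values())
print(total)

# P(Toothache = true, Cavity = false) is one cell of the table.
print(joint[(True, False)])  # 0.01
```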
Marginal Probabilities
The sum rule:
P(X) = ∑_Y P(X, Y)
e.g. P(Toothache = true) =?
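The question can be answered with the sum rule; a sketch reusing the earlier joint table:

```python
# Joint P(Toothache, Cavity) from the earlier table, keyed (toothache, cavity).
joint = {(True, True): 0.04, (False, True): 0.06,
         (True, False): 0.01, (False, False): 0.89}

# Sum rule: P(Toothache = true) = sum over Cavity of P(Toothache = true, Cavity).
p_toothache = sum(p for (t, c), p in joint.items() if t)
print(p_toothache)  # 0.04 + 0.01 = 0.05
```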
Conditional Probability
Let X and Y be two disjoint subsets of variables, such that P(Y = y) > 0. Then the conditional probability distribution (CPD) of X given Y = y is given by
P(X = x | Y = y) = P(x|y) = P(x, y) / P(y)
Product rule
P(X, Y) = P(X)P(Y|X) = P(Y)P(X|Y)
Example: In the dental example, what is P(Cavity = true | Toothache = true)?
∑_x P(X = x | Y = y) = 1 for all y
Can we say anything about ∑_y P(X = x | Y = y)?
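The dental question follows directly from the definition of conditional probability; a sketch reusing the joint table:

```python
# Joint P(Toothache, Cavity) keyed (toothache, cavity), as in the table.
joint = {(True, True): 0.04, (False, True): 0.06,
         (True, False): 0.01, (False, False): 0.89}

# P(Cavity = true | Toothache = true) = P(cavity, toothache) / P(toothache)
p_toothache = sum(p for (t, c), p in joint.items() if t)
p_cavity_given_toothache = joint[(True, True)] / p_toothache
print(p_cavity_given_toothache)  # 0.04 / 0.05 = 0.8 (up to floating-point rounding)
```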
• Chain rule is derived by repeated application of the product rule
P(X1, ..., Xn) = P(X1, ..., Xn−1) P(Xn | X1, ..., Xn−1)
              = P(X1, ..., Xn−2) P(Xn−1 | X1, ..., Xn−2) P(Xn | X1, ..., Xn−1)
              = ...
              = ∏_{i=1}^{n} P(Xi | X1, ..., Xi−1)
• Exercise: give six decompositions of p(x, y, z) using the chain rule
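The six decompositions can be checked numerically: for any ordering of the three variables, the chained conditionals multiply back to the joint. A sketch with an arbitrary illustrative joint:

```python
from itertools import product, permutations

# An arbitrary joint p(x, y, z) over three binary variables
# (illustrative numbers; they sum to 1).
vals = [0.02, 0.08, 0.10, 0.05, 0.20, 0.15, 0.25, 0.15]
joint = {xyz: v for xyz, v in zip(product([0, 1], repeat=3), vals)}

def marginal(indices, values):
    """P(variables at `indices` equal `values`), summing the rest out."""
    return sum(p for xyz, p in joint.items()
               if all(xyz[i] == v for i, v in zip(indices, values)))

# Each of the 3! = 6 orderings gives a valid chain-rule decomposition,
# e.g. p(x, y, z) = p(x) p(y|x) p(z|x, y).
for order in permutations(range(3)):
    for xyz, p in joint.items():
        chain = 1.0
        for k in range(3):
            num = marginal(order[:k + 1], [xyz[i] for i in order[:k + 1]])
            den = marginal(order[:k], [xyz[i] for i in order[:k]])
            chain *= num / den  # conditional of the k-th variable in `order`
        assert abs(chain - p) < 1e-12
print("all six decompositions recover the joint")
```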
Bayes’ Rule
From the product rule,
P(X|Y) = P(Y|X) P(X) / P(Y)
From the sum rule the denominator is
P(Y) = ∑_X P(Y|X) P(X)
Why is this useful?
For assessing diagnostic probability from causal probability
P(Cause|Effect) = P(Effect|Cause) P(Cause) / P(Effect)
Example: let M be meningitis, S be stiff neck
P(M|S) = P(S|M) P(M) / P(S) = (0.8 × 0.0001) / 0.1 = 0.0008
Note: posterior probability of meningitis still very small
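The arithmetic on this slide, checked directly:

```python
# Bayes' rule with the slide's numbers:
# P(S|M) = 0.8, P(M) = 0.0001, P(S) = 0.1.
p_s_given_m, p_m, p_s = 0.8, 0.0001, 0.1
p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)  # ≈ 0.0008: still small, because meningitis is rare
```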
Evidence: from Prior to Posterior
Prior probability P(Cavity = true) = 0.1
After we observe Toothache = true, we obtain the posterior probability P(Cavity = true|Toothache = true)
This statement is dependent on the fact that Toothache = true is all I know
Revised probability of toothache if, say, I have a dental examination...
Some information may be irrelevant, e.g.
P(Cavity = true|Toothache = true, DiceRoll = 5) = P(Cavity = true|Toothache = true)
Inference from joint distributions
Typically, we are interested in the posterior joint distribution of the query variables XF given specific values e for the evidence variables XE
Remaining variables XR = X \ (XF ∪ XE)
Sum out over XR:
P(XF | XE = e) = P(XF, XE = e) / P(XE = e)
              = ∑_r P(XF, XR = r, XE = e) / ∑_{r,f} P(XF = f, XR = r, XE = e)
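These formulas translate directly into inference by enumeration: sum out XR for the numerator, additionally sum over XF for the denominator. A sketch over a small illustrative joint of three binary variables (numbers are made up):

```python
from itertools import product

# Illustrative joint over three binary variables X0, X1, X2 (sums to 1).
vals = [0.06, 0.09, 0.10, 0.05, 0.20, 0.15, 0.25, 0.10]
joint = {x: v for x, v in zip(product([0, 1], repeat=3), vals)}

def query(f, evidence):
    """P(X_f | evidence), where evidence maps variable index -> value.

    Numerator: sum out the remaining variables X_R.
    Denominator: additionally sum over the query variable itself.
    """
    unnorm = {v: sum(p for x, p in joint.items()
                     if x[f] == v and all(x[i] == w for i, w in evidence.items()))
              for v in (0, 1)}
    z = sum(unnorm.values())
    return {v: p / z for v, p in unnorm.items()}

posterior = query(0, {2: 1})  # P(X0 | X2 = 1)
print(posterior)
```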
Problems with naïve inference:
Worst-case time complexity O(d^n), where d is the largest arity
Space complexity O(d^n) to store the joint distribution
How to find the numbers for O(d^n) entries?
Decision Theory
Decision Theory = Probability Theory + Utility Theory
When making actions, an agent will have preferences about different possible outcomes
Utility theory can be used to represent and reason with preferences
A rational agent will select the action with the highest expected utility
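A minimal sketch of expected-utility action selection; the actions, outcome probabilities and utilities below are invented for illustration:

```python
# Hypothetical decision problem: two actions, two outcomes.
# P(outcome | action) and U(outcome) are illustrative numbers.
actions = {
    'treat': {'cured': 0.9, 'not_cured': 0.1},
    'wait':  {'cured': 0.3, 'not_cured': 0.7},
}
utility = {'cured': 100.0, 'not_cured': 0.0}

def expected_utility(action):
    # EU(a) = sum over outcomes o of P(o | a) * U(o)
    return sum(p * utility[o] for o, p in actions[action].items())

# A rational agent selects the action with the highest expected utility.
best = max(actions, key=expected_utility)
print(best)  # 'treat' (EU 90 vs 30)
```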
Summary
Course foci:
Probability theory as calculus of uncertainty
Inference in probabilistic graphical models
Learning probabilistic models from data
Events, random variables
Joint, conditional probability
Bayes rule, evidence
Decision theory