Uncertainty (Chapter 13)
CSE 471/598, CBS 598 by H. Liu
Uncertainty

- Evolution of an intelligent agent: problem solving, planning, uncertainty
- Dealing with uncertainty is an unavoidable problem in reality.
- An agent must act under uncertainty.
- To make decisions under uncertainty, we need:
  - Probability theory
  - Utility theory
  - Decision theory
Sources of uncertainty

- No access to the whole truth
- No categorical answer
- Incompleteness: the qualification problem - it is impossible to explicitly enumerate all conditions
- Incorrectness of information about conditions
- A rational decision depends on both the relative importance of the various goals and the likelihood that each will be achieved.
Handling uncertain knowledge

- Difficulties in using FOL to cope with uncertain knowledge, e.g., a dental diagnosis system in FOL:
  - Symptom(p, Toothache) => Disease(p, Cavity)
  - Disease(p, Cavity) => Symptom(p, Toothache)
  - Are they correct?
- Reasons FOL falls short:
  - Laziness - too much work to write every rule!
  - Theoretical ignorance - we don't know everything
  - Practical ignorance - we don't want to include all the details
- Represent uncertain knowledge with a degree of belief; the tool for handling it is probability theory.
- Probability provides a way of summarizing the uncertainty that comes from our laziness and ignorance - how wonderful it is!
- Probability expresses belief in the truth of a sentence: 1 - true, 0 - false, 0 < P < 1 - intermediate degrees of belief in the truth of the sentence
- Degree of truth (fuzzy logic) vs. degree of belief
- Alternatives to probability theory? Yes, to be discussed in later chapters.
- All probability statements must indicate the evidence with respect to which the probability is being assessed.
- Prior (unconditional) probability: before evidence is obtained
- Posterior (conditional) probability: after new evidence is obtained
Uncertainty & rational decisions

- Without uncertainty, decision making is simple - a plan either achieves the goal or it does not.
- With uncertainty, it becomes harder - consider the three plans A90, A120, and A1440.
- We first need preferences between the different possible outcomes of the plans.
- Utility theory is used to represent and reason with preferences.
Rationality

- Decision Theory = Probability Theory + Utility Theory
- The Maximum Expected Utility principle defines rationality: an agent is rational iff it chooses the action that yields the highest utility, averaged over all possible outcomes of the action.
- A decision-theoretic agent (Fig 13.1, p 466) - is it any different from the other agents we have learned about?
Basic probability notation

- Prior probability
  - Proposition: P(Sunny)
  - Random variable: P(Weather = Sunny)
- Boolean, discrete, and continuous random variables; each RV has a domain, e.g., (sunny, rain, cloudy, snow)
- Probability distribution: P(Weather) = <0.7, 0.2, 0.08, 0.02>
- Joint probability P(A^B): probabilities of all combinations of the values of a set of RVs (more later)
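The distribution notation above can be sketched in code; this is a minimal illustration using the slide's P(Weather) values, with a dict standing in for the distribution.

```python
# A prior distribution over the random variable Weather, as on the slide:
# P(Weather) = <0.7, 0.2, 0.08, 0.02> over (sunny, rain, cloudy, snow).
P_weather = {"sunny": 0.7, "rain": 0.2, "cloudy": 0.08, "snow": 0.02}

# A well-formed distribution assigns a probability to every value in the
# domain, and the probabilities sum to 1.
assert abs(sum(P_weather.values()) - 1.0) < 1e-9

print(P_weather["sunny"])  # prior probability that Weather = sunny -> 0.7
```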
Conditional probability

- Conditional probability: P(A|B) = P(A^B)/P(B)
- Product rule: P(A^B) = P(A|B)P(B)
- Probabilistic inference does not work like logical inference: "P(A|B) = 0.6" does not mean "whenever B is true, P(A) is 0.6"
- As evidence accumulates, the relevant conditional probability changes: P(A), P(A|B), P(A|B,C), ...
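The definition and the product rule can be checked numerically; the numbers below are illustrative assumptions, not values from the slides.

```python
# Conditional probability from the definition P(A|B) = P(A ^ B) / P(B),
# with assumed (hypothetical) numbers.
p_a_and_b = 0.04   # P(A ^ B), assumed
p_b = 0.10         # P(B), assumed

p_a_given_b = p_a_and_b / p_b        # definition of conditional probability
assert abs(p_a_given_b - 0.4) < 1e-9

# Product rule, rearranged: P(A ^ B) = P(A|B) P(B)
assert abs(p_a_given_b * p_b - p_a_and_b) < 1e-9
```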
The axioms of probability

- All probabilities are between 0 and 1.
- Necessarily true (valid) propositions have probability 1; necessarily false (unsatisfiable) propositions have probability 0.
- The probability of a disjunction: P(AvB) = P(A) + P(B) - P(A^B) (a Venn diagram illustrates this)
- Ex: deriving the rule of negation from P(a v !a)
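The negation exercise follows directly from the axioms; one short derivation:

```latex
P(a \lor \lnot a) = P(a) + P(\lnot a) - P(a \land \lnot a)
% a \lor \lnot a is valid, so the left side is 1;
% a \land \lnot a is unsatisfiable, so the last term is 0.
1 = P(a) + P(\lnot a) \quad\Rightarrow\quad P(\lnot a) = 1 - P(a)
```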
The joint probability distribution

- The joint completely specifies an agent's probability assignments to all propositions in the domain.
- A probabilistic model consists of a set of random variables (X1, ..., Xn).
- An atomic event is an assignment of particular values to all the variables.
- Given Boolean random variables A and B, what are the atomic events?
Joint probabilities

An example with two Boolean variables:

              Toothache   !Toothache
  Cavity        0.04         0.06
  !Cavity       0.01         0.89

- Observations: the atomic events are mutually exclusive and collectively exhaustive.
- What are P(Cavity), P(Cavity v Toothache), P(Cavity|Toothache)?
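The three questions can be answered by reading off the joint; a minimal sketch, assuming the standard textbook layout of the four entries (Cavity row: 0.04, 0.06; !Cavity row: 0.01, 0.89).

```python
# Answering queries from a full joint distribution of two Boolean variables.
joint = {
    ("cavity", "toothache"): 0.04,
    ("cavity", "!toothache"): 0.06,
    ("!cavity", "toothache"): 0.01,
    ("!cavity", "!toothache"): 0.89,
}

# Marginal: sum the Cavity row.
p_cavity = joint[("cavity", "toothache")] + joint[("cavity", "!toothache")]

# Disjunction: sum every cell where Cavity or Toothache holds.
p_cavity_or_toothache = (joint[("cavity", "toothache")]
                         + joint[("cavity", "!toothache")]
                         + joint[("!cavity", "toothache")])

# Conditional: P(Cavity | Toothache) = P(Cavity ^ Toothache) / P(Toothache).
p_toothache = joint[("cavity", "toothache")] + joint[("!cavity", "toothache")]
p_cavity_given_toothache = joint[("cavity", "toothache")] / p_toothache

print(p_cavity, p_cavity_or_toothache, p_cavity_given_toothache)
# approximately 0.10, 0.11, 0.80
```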
Joint (2)

- If we have the joint, we can read off any probability we need. Is it true? How?
- It is impractical to specify all the entries for the joint over n Boolean variables.
- Sidestep the joint and work directly with conditional probabilities.
Inference using full joint distributions

- Marginal probability (Fig 13.3), e.g., P(cavity)
- Marginalization - summing out all the variables other than Cavity: P(Y) = Σ_z P(Y, z)
- Conditioning - a variant of marginalization using the product rule: P(Y) = Σ_z P(Y|z) P(z)
Normalization

- Method 1, using the definition of conditional probability:
  - P(cavity|toothache) = P(cavity^toothache)/P(toothache)
  - P(!cavity|toothache) = P(!cavity^toothache)/P(toothache)
- Method 2, using normalization:
  - P(Cavity|toothache) = αP(Cavity, toothache) = α[P(Cavity, toothache, catch) + P(Cavity, toothache, !catch)] = α[<0.108, 0.016> + <0.012, 0.064>] = α<0.12, 0.08>
  - What is α?
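The normalization step can be carried out directly with the joint entries quoted above, without ever computing P(toothache) explicitly:

```python
# Normalization: P(Cavity | toothache) = alpha * P(Cavity, toothache).
unnormalized = [
    0.108 + 0.012,   # P(cavity, toothache)  = summed over catch
    0.016 + 0.064,   # P(!cavity, toothache) = summed over catch
]

# alpha is whatever constant makes the entries sum to 1, i.e. 1 / P(toothache).
alpha = 1.0 / sum(unnormalized)
posterior = [alpha * p for p in unnormalized]

print(round(alpha, 3), [round(p, 3) for p in posterior])  # 5.0 [0.6, 0.4]
```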
Independence

- P(Toothache, Catch, Cavity, Weather): a total of 32 entries, given that Weather has 4 values
- How is one's tooth problem related to the weather? P(T, Ch, Cy, W=cloudy) = P(W=cloudy|T, Ch, Cy) P(T, Ch, Cy)?
- Whose tooth problem can influence our weather? P(W=cloudy|T, Ch, Cy) = P(W=cloudy)
- Hence, P(T, Ch, Cy, W=cloudy) = P(W=cloudy) P(T, Ch, Cy)
- How many joint distribution tables? Two, of sizes 4 and 8.
- Independence between X and Y means P(X|Y) = P(X), or P(Y|X) = P(Y), or P(X,Y) = P(X)P(Y)
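The factorization that independence buys can be shown in one line; the two input values below are taken from earlier slides (cloudy prior 0.08, one dental joint entry 0.108), combined under the independence assumption.

```python
# Absolute independence: Weather is independent of the dental variables,
# so P(T, Ch, Cy, W) = P(W) * P(T, Ch, Cy).  One 32-entry table becomes
# a 4-entry table and an 8-entry table.
p_w_cloudy = 0.08     # P(Weather = cloudy), from the Weather distribution
p_dental = 0.108      # P(toothache, catch, cavity), from the dental joint

p_joint_entry = p_w_cloudy * p_dental   # P(toothache, catch, cavity, cloudy)
print(p_joint_entry)
```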
Bayes' rule

- Deriving the rule via the product rule: P(B|A) = P(A|B)P(B)/P(A)
- A more general case: P(X|Y) = P(Y|X)P(X)/P(Y)
- Bayes' rule conditionalized on evidence E: P(X|Y,E) = P(Y|X,E)P(X|E)/P(Y|E)
- Applying the rule to medical diagnosis: meningitis (P(M) = 1/50,000), stiff neck (P(S) = 1/20), P(S|M) = 0.5; what is P(M|S)?
- Why is this kind of inference useful?
Applying Bayes' rule

- Relative likelihood: comparing the relative likelihood of meningitis and whiplash given a stiff neck, which is more likely? P(M|S)/P(W|S) = P(S|M)P(M) / (P(S|W)P(W))
- Avoiding direct assessment of the prior: P(M|S) = ? P(!M|S) = ? And P(M|S) + P(!M|S) = 1; P(S) = ? P(S|!M) = ?
Using Bayes' rule

- Combining evidence from P(Cavity|Toothache) and P(Cavity|Catch) into P(Cavity|Toothache, Catch)
- Bayesian updating, from
  - P(Cavity|T) = P(T|Cavity)P(Cavity)/P(T), i.e., P(A|B) = P(B|A)P(A)/P(B)
  to
  - P(Cavity|T, Catch) = P(Catch|T, Cavity)P(Cavity|T)/P(Catch|T), i.e., P(A|B,C) = P(B|A,C)P(A|C)/P(B|C)
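The conditionalized form P(A|B,C) = P(B|A,C)P(A|C)/P(B|C) can be verified numerically on the standard three-variable dental joint (the same eight entries quoted on the normalization slide):

```python
# Numerical check of Bayes' rule conditionalized on evidence:
# P(Cavity | toothache, catch)
#   = P(catch | Cavity, toothache) * P(Cavity | toothache) / P(catch | toothache)
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}  # keys: (cavity, toothache, catch)

def marginal(pred):
    """Sum the joint over all atomic events satisfying pred."""
    return sum(p for event, p in joint.items() if pred(event))

p_cav_t_c = joint[(True, True, True)]             # P(cavity, toothache, catch)
p_t = marginal(lambda e: e[1])                    # P(toothache)
p_t_c = marginal(lambda e: e[1] and e[2])         # P(toothache, catch)
p_cav_t = marginal(lambda e: e[0] and e[1])       # P(cavity, toothache)

lhs = p_cav_t_c / p_t_c                           # straight from the joint
rhs = (p_cav_t_c / p_cav_t) * (p_cav_t / p_t) / (p_t_c / p_t)

assert abs(lhs - rhs) < 1e-12
print(round(lhs, 4))  # P(cavity | toothache, catch), approximately 0.871
```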
Conditional independence

- Recall that for independent events A, B: P(B|A) = P(B), P(A|B) = P(A), P(A,B) = P(A)P(B)
- Conditional independence (X and Y are independent given Z): P(X|Y,Z) = P(X|Z) and P(Y|X,Z) = P(Y|Z)
- P(X,Y|Z) = P(X|Z)P(Y|Z), derived as in the absolute-independence case
- Given Cavity, Toothache and Catch are independent: P(T, Ch, Cy) = P(T, Ch|Cy)P(Cy) = P(T|Cy)P(Ch|Cy)P(Cy)
- One large table is decomposed into 3 smaller tables - P(T|Cy), P(Ch|Cy), and P(Cy) - with 2^3 - 1 = 7 independent entries vs. 5 (= 2(2^1 - 1) + 2(2^1 - 1) + (2^1 - 1))
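The decomposition P(T, Ch, Cy) = P(T|Cy)P(Ch|Cy)P(Cy) can be sketched with the three small tables; the parameter values below are the conditionals implied by the standard dental joint (e.g. P(toothache|cavity) = 0.12/0.2 = 0.6).

```python
# Rebuild the full joint from three small tables, using conditional
# independence of Toothache and Catch given Cavity.
p_cavity = 0.2
p_t_given_cy = {True: 0.6, False: 0.1}    # P(toothache | Cavity)
p_ch_given_cy = {True: 0.9, False: 0.2}   # P(catch | Cavity)

def joint(t, ch, cy):
    """P(T=t, Ch=ch, Cy=cy) = P(T=t|Cy) * P(Ch=ch|Cy) * P(Cy=cy)."""
    p1 = p_t_given_cy[cy] if t else 1 - p_t_given_cy[cy]
    p2 = p_ch_given_cy[cy] if ch else 1 - p_ch_given_cy[cy]
    p3 = p_cavity if cy else 1 - p_cavity
    return p1 * p2 * p3

print(round(joint(True, True, True), 4))  # P(toothache, catch, cavity), approx 0.108
```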
Independence, decomposition, Naïve Bayes

- If all n symptoms are conditionally independent given Cavity, the size of the representation grows as O(n) instead of O(2^n).
- The decomposition of large probabilistic domains into weakly connected subsets via conditional independence is one important development in modern AI.
- Naïve Bayes model (one Cause, many Effects): P(Cause, E1, ..., En) = P(Cause) Π_i P(Ei|Cause)
- An amazingly successful classifier as well
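A minimal naive Bayes computation, with hypothetical numbers for the priors and likelihoods: the joint over a cause and n effects needs only O(n) parameters, and classification picks the cause value with the larger unnormalized posterior.

```python
from math import prod

# Hypothetical parameters for a naive Bayes model with 3 observed effects.
p_cause = 0.2                              # P(cause), assumed
p_effects_given_cause = [0.6, 0.9, 0.7]    # P(e_i | cause), assumed
p_effects_given_not = [0.1, 0.2, 0.3]      # P(e_i | !cause), assumed

# P(Cause, e1, ..., en) = P(Cause) * prod_i P(e_i | Cause).
score_cause = p_cause * prod(p_effects_given_cause)
score_not = (1 - p_cause) * prod(p_effects_given_not)

# Classify by the larger unnormalized posterior (normalization cancels).
print("cause" if score_cause > score_not else "no cause")  # prints "cause"
```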
Where do probabilities come from?

- There are three positions:
  - The frequentist: numbers can come only from experiments
  - The objectivist: probabilities are real aspects of the universe
  - The subjectivist: probabilities characterize an agent's beliefs
- The reference class problem - an intrusion of subjectivity: a frequentist doctor wants to consider similar patients, but how similar are two patients?
- Laplace's principle of indifference: propositions that are syntactically "symmetric" with respect to the evidence should be accorded equal probability
Summary

- Uncertainty exists in the real world.
- It is good (it allows for laziness) and bad (we need new tools).
- Priors, posteriors, and the joint distribution
- Bayes' rule - the basis of Bayesian inference
- Conditional independence allows Bayesian updating to work effectively with many pieces of evidence.
- But ...