
Uncertainty

Copyright, 1996 © Dale Carnegie & Associates, Inc.

Chapter 13

CSE 471/598, CBS 598 by H. Liu


Uncertainty
• Evolution of an intelligent agent: problem solving, planning, uncertainty
• Dealing with uncertainty is an unavoidable problem in reality.
• An agent must act under uncertainty.
• To make decisions under uncertainty, we need:
  • Probability theory
  • Utility theory
  • Decision theory


Sources of uncertainty
• No access to the whole truth
• No categorical answer
• Incompleteness: the qualification problem makes it impossible to explicitly enumerate all conditions
• Incorrectness of information about conditions
• The rational decision depends on both the relative importance of the various goals and the likelihood of their being achieved.


Handling uncertain knowledge

• Difficulties in using FOL to cope with uncertain knowledge (UK)
• A dental diagnosis system using FOL:
  Symptom(p, Toothache) => Disease(p, Cavity)
  Disease(p, Cavity) => Symptom(p, Toothache)
  Are they correct?
• Reasons:
  • Laziness: too much work!
  • Theoretical ignorance: we don't know everything
  • Practical ignorance: we don't want to include all the conditions
• Represent UK with a degree of belief
• The tool for handling UK is probability theory


• Probability provides a way of summarizing the uncertainty that comes from our laziness and ignorance: how wonderful it is!
• Probability expresses a degree of belief in the truth of a sentence: 1 means true, 0 means false, and 0 < P < 1 gives intermediate degrees of belief in the truth of the sentence
• Degree of truth (fuzzy logic) vs. degree of belief
• Alternatives to probability theory? Yes, to be discussed in later chapters.


• All probability statements must indicate the evidence with respect to which the probability is being assessed.
• Prior or unconditional probability: before evidence is obtained
• Posterior or conditional probability: after new evidence is obtained


Uncertainty & rational decisions

• Without uncertainty, decision making is simple: the goal is either achieved or not
• With uncertainty, the decision itself becomes uncertain: consider three plans A90, A120, and A1440
• We first need preferences between the different possible outcomes of the plans
• Utility theory is used to represent and reason with preferences.


Rationality
• Decision Theory = Probability Theory + Utility Theory
• The Maximum Expected Utility principle defines rationality:
  An agent is rational iff it chooses the action that yields the highest utility, averaged over all possible outcomes of the action
• A decision-theoretic agent (Fig 13.1, p 466)
  Is it any different from the other agents we have learned about?


Basic probability notation
• Prior probability
  • Proposition: P(Sunny)
  • Random variable: P(Weather = Sunny)
• Boolean, discrete, and continuous random variables
  • Each RV has a domain, e.g. (sunny, rain, cloudy, snow)
• Probability distribution: P(Weather) = <0.7, 0.2, 0.08, 0.02>
• Joint probability P(A ^ B)
  • probabilities of all combinations of the values of a set of RVs
  • more later


Conditional probability
• Conditional probability: P(A|B) = P(A ^ B) / P(B)
• Product rule: P(A ^ B) = P(A|B) P(B)
• Probabilistic inference does not work like logical inference:
  "P(A|B) = 0.6" != "when B is true, P(A) is 0.6"
• Beliefs are revised as evidence arrives: P(A), P(A|B), P(A|B,C), ...
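A quick numeric check of these two identities; the numbers are made up for illustration:

```python
# Conditional probability and the product rule, with illustrative numbers.
p_a_and_b = 0.12   # P(A ^ B), assumed value
p_b = 0.30         # P(B), assumed value

# Definition: P(A|B) = P(A ^ B) / P(B)
p_a_given_b = p_a_and_b / p_b

# Product rule: P(A ^ B) = P(A|B) P(B) recovers the joint probability
assert abs(p_a_given_b * p_b - p_a_and_b) < 1e-12

print(round(p_a_given_b, 3))   # ≈ 0.4
```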


The axioms of probability
• All probabilities are between 0 and 1
• Necessarily true (valid) propositions have probability 1; necessarily false (unsatisfiable) propositions have probability 0
• The probability of a disjunction:
  P(A v B) = P(A) + P(B) - P(A ^ B)
  (a Venn diagram illustrates this)
• Ex: deriving the rule of negation from P(a v !a)
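For the negation exercise: a and !a are mutually exclusive (P(a ^ !a) = 0) and a v !a is valid (probability 1), so the disjunction axiom gives 1 = P(a) + P(!a) - 0, i.e. P(!a) = 1 - P(a). A quick numeric check of the axiom itself, with illustrative numbers:

```python
# Inclusion-exclusion for the disjunction axiom, illustrative numbers.
p_a, p_b, p_a_and_b = 0.5, 0.4, 0.2     # assumed values
p_a_or_b = p_a + p_b - p_a_and_b        # P(A v B)

# Special case B = !A: P(A ^ !A) = 0 and P(A v !A) = 1,
# so the axiom reduces to the negation rule P(!A) = 1 - P(A).
p_not_a = 1 - p_a
assert abs(p_a + p_not_a - 1.0) < 1e-12

print(round(p_a_or_b, 3))   # ≈ 0.7
```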


The joint probability distribution

• The Joint completely specifies an agent's probability assignments to all propositions in the domain
• A probabilistic model consists of a set of random variables (X1, ..., Xn)
• An atomic event is an assignment of particular values to all the variables
  • Given Boolean random variables A and B, what are the atomic events?


Joint probabilities

An example with two Boolean variables:

             Toothache   !Toothache
  Cavity        0.04        0.06
  !Cavity       0.01        0.89

• Observations: the atomic events are mutually exclusive and collectively exhaustive
• What are P(Cavity), P(Cavity v Toothache), P(Cavity|Toothache)?
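One way to answer the three questions is to read them off the table, stored as a dictionary over atomic events (a sketch):

```python
# Full joint over Cavity and Toothache (values from the table above).
joint = {
    (True,  True):  0.04,   # Cavity ^ Toothache
    (True,  False): 0.06,   # Cavity ^ !Toothache
    (False, True):  0.01,   # !Cavity ^ Toothache
    (False, False): 0.89,   # !Cavity ^ !Toothache
}

# P(Cavity): sum the atomic events where Cavity holds
p_cavity = sum(p for (c, t), p in joint.items() if c)        # 0.10

# P(Cavity v Toothache): sum the atomic events where either holds
p_c_or_t = sum(p for (c, t), p in joint.items() if c or t)   # 0.11

# P(Cavity|Toothache) = P(Cavity ^ Toothache) / P(Toothache)
p_toothache = sum(p for (c, t), p in joint.items() if t)     # 0.05
p_c_given_t = joint[(True, True)] / p_toothache              # 0.8

print(round(p_cavity, 3), round(p_c_or_t, 3), round(p_c_given_t, 3))
```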


Joint (2)

• If we have the Joint, we can read off any probability we need. Is it true? How?
• It is impractical to specify all the entries of the Joint over n Boolean variables (2^n entries).
• Sidestep the Joint and work directly with conditional probabilities


Inference using full joint distributions

• Marginal probability (Fig 13.3): P(cavity) = ?
• Marginalization: summing out all the variables other than cavity
  P(Y) = Sum_z P(Y, z)
• Conditioning: a variant of marginalization using the product rule
  P(Y) = Sum_z P(Y|z) P(z)
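Both operations can be sketched over the full three-variable joint; the eight entries below are the Fig 13.3 numbers from the textbook (four of them also appear on the normalization slide):

```python
from itertools import product

# Full joint P(Cavity, Toothache, Catch), entries from AIMA Fig 13.3.
joint = {  # (cavity, toothache, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

# Marginalization: P(cavity) = sum over the other variables
p_cavity = sum(joint[(True, t, ch)] for t, ch in product([True, False], repeat=2))
print(round(p_cavity, 3))   # ≈ 0.2

# Conditioning: P(cavity) = Sum_z P(cavity|z) P(z), here with z = Toothache
p_t = sum(p for (c, t, ch), p in joint.items() if t)
p_c_given_t = sum(p for (c, t, ch), p in joint.items() if c and t) / p_t
p_c_given_not_t = sum(p for (c, t, ch), p in joint.items() if c and not t) / (1 - p_t)
p_cavity_again = p_c_given_t * p_t + p_c_given_not_t * (1 - p_t)
assert abs(p_cavity - p_cavity_again) < 1e-12
```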


Normalization
• Method 1: use the definition of conditional probability
  P(cavity|toothache) = P(cavity ^ toothache) / P(toothache)
  P(!cavity|toothache) = P(!cavity ^ toothache) / P(toothache)
• Method 2: use normalization
  P(Cavity|toothache) = α P(Cavity, toothache)
    = α [P(Cavity, toothache, catch) + P(Cavity, toothache, !catch)]
    = α [<0.108, 0.016> + <0.012, 0.064>]
    = α <0.12, 0.08>
  What is α?
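α is whatever constant makes the distribution sum to 1: α = 1/(0.12 + 0.08) = 5, giving P(Cavity|toothache) = <0.6, 0.4>. As a sketch:

```python
# Normalization: scale the unnormalized vector so that it sums to 1.
unnormalized = [0.12, 0.08]      # <P(cavity, toothache), P(!cavity, toothache)>
alpha = 1 / sum(unnormalized)    # 1 / 0.2 = 5
posterior = [alpha * p for p in unnormalized]
print(round(alpha, 3), [round(p, 3) for p in posterior])   # ≈ 5.0 [0.6, 0.4]
```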


Independence
• P(Toothache, Catch, Cavity, Weather)
  • A total of 32 entries, given Weather has 4 values
  • How is one's tooth problem related to the weather?
    P(T, Ch, Cy, W=cloudy) = P(W=cloudy|T, Ch, Cy) P(T, Ch, Cy)?
• Whose tooth problem can influence our weather?
  • P(W=cloudy|T, Ch, Cy) = P(W=cloudy)
  • Hence, P(T, Ch, Cy, W=cloudy) = P(W=cloudy) P(T, Ch, Cy)
  • How many joint distribution tables? Two, of sizes (4, 8)
• Independence between X and Y means
  P(X|Y) = P(X), or P(Y|X) = P(Y), or P(X, Y) = P(X) P(Y)
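A sketch of the bookkeeping: with independence, the 32-entry joint factors into a 4-entry Weather table and an 8-entry tooth table, and each full-joint entry is recovered as a product. The Weather marginal below reuses the distribution from the earlier slide; the uniform tooth joint is just a placeholder:

```python
from itertools import product

# Marginal for Weather (from the probability-distribution slide).
p_weather = {"sunny": 0.7, "rain": 0.2, "cloudy": 0.08, "snow": 0.02}

# Joint over the 3 Boolean tooth variables (uniform placeholder, 8 entries).
p_tooth = {tcc: 1 / 8 for tcc in product([True, False], repeat=3)}

# Independence: P(T, Ch, Cy, W) = P(W) * P(T, Ch, Cy)
full_joint = {(w, tcc): p_weather[w] * p_tooth[tcc]
              for w in p_weather for tcc in p_tooth}

print(len(p_weather) + len(p_tooth), len(full_joint))   # 12 numbers instead of 32
assert abs(sum(full_joint.values()) - 1.0) < 1e-9
```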


Bayes’ rule

• Deriving the rule via the product rule:
  P(B|A) = P(A|B) P(B) / P(A)
• A more general case:
  P(X|Y) = P(Y|X) P(X) / P(Y)
• Bayes' rule conditionalized on evidence E:
  P(X|Y,E) = P(Y|X,E) P(X|E) / P(Y|E)
• Applying the rule to medical diagnosis:
  meningitis (P(M) = 1/50,000), stiff neck (P(S) = 1/20), P(S|M) = 0.5; what is P(M|S)?
  Why is this kind of inference useful?
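With the numbers on the slide, P(M|S) = P(S|M) P(M) / P(S) = 0.5 × (1/50,000) / (1/20) = 0.0002, i.e. 1 in 5,000. As a sketch:

```python
# Bayes' rule for the meningitis example on the slide.
p_m = 1 / 50_000      # prior P(M)
p_s = 1 / 20          # prior P(S)
p_s_given_m = 0.5     # likelihood P(S|M)

p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)    # ≈ 0.0002: a stiff neck alone is weak evidence for meningitis
```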


Applying Bayes' rule
• Relative likelihood
  • Comparing the relative likelihood of meningitis and whiplash, given a stiff neck: which is more likely?
    P(M|S) / P(W|S) = [P(S|M) P(M)] / [P(S|W) P(W)]
• Avoiding direct assessment of the prior
  • P(M|S) = ? P(!M|S) = ? And P(M|S) + P(!M|S) = 1
  • P(S) = ? P(S|!M) = ?


Using Bayes' rule
• Combining evidence: from P(Cavity|Toothache) and P(Cavity|Catch) to P(Cavity|Toothache, Catch)
• Bayesian updating: from
  P(Cavity|T) = P(T|Cavity) P(Cavity) / P(T)        [P(A|B) = P(B|A) P(A) / P(B)]
  to
  P(Cavity|T, Catch) = P(Catch|T, Cavity) P(Cavity|T) / P(Catch|T)
                                                    [P(A|B,C) = P(B|A,C) P(A|C) / P(B|C)]
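A sketch of the update with the Fig 13.3 numbers: condition on Toothache first (giving P(Cavity|toothache) = 0.6), then fold in Catch; the incremental update agrees with conditioning on both pieces of evidence at once.

```python
# Bayesian updating over the full joint of AIMA Fig 13.3 (used as ground truth).
joint = {  # (cavity, toothache, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def P(event):
    """Probability of the set of atomic events satisfying `event`."""
    return sum(p for e, p in joint.items() if event(*e))

# Step 1: condition on Toothache alone.
p_c_given_t = P(lambda c, t, ch: c and t) / P(lambda c, t, ch: t)        # 0.6

# Step 2: fold in Catch via P(Cy|T,Ch) = P(Ch|T,Cy) P(Cy|T) / P(Ch|T).
p_ch_given_tc = P(lambda c, t, ch: ch and t and c) / P(lambda c, t, ch: t and c)
p_ch_given_t = P(lambda c, t, ch: ch and t) / P(lambda c, t, ch: t)
updated = p_ch_given_tc * p_c_given_t / p_ch_given_t

# Conditioning on both pieces of evidence at once gives the same answer.
direct = P(lambda c, t, ch: c and t and ch) / P(lambda c, t, ch: t and ch)
assert abs(updated - direct) < 1e-9
print(round(updated, 3))   # ≈ 0.871
```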


Conditional independence
• Recall that for independent events A and B:
  P(B|A) = P(B), P(A|B) = P(A), P(A, B) = P(A) P(B)
• Conditional independence (X and Y are independent given Z):
  P(X|Y,Z) = P(X|Z) and P(Y|X,Z) = P(Y|Z)
  P(X, Y|Z) = P(X|Z) P(Y|Z), derived as for absolute independence
• Given Cavity, Toothache and Catch are independent:
  P(T, Ch, Cy) = P(T, Ch|Cy) P(Cy) = P(T|Cy) P(Ch|Cy) P(Cy)
• One large table is decomposed into 3 smaller tables
  (T|Cy and T|!Cy, Ch|Cy and Ch|!Cy, and Cy):
  2^3 - 1 = 7 independent entries vs. 5 (= 2(2^1 - 1) + 2(2^1 - 1) + (2^1 - 1))
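The Fig 13.3 joint happens to satisfy this conditional independence exactly, so the three small tables reconstruct all eight joint entries (a sketch):

```python
# Verify P(T, Ch, Cy) = P(T|Cy) P(Ch|Cy) P(Cy) for the AIMA Fig 13.3 joint.
joint = {  # (cavity, toothache, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def P(event):
    """Probability of the set of atomic events satisfying `event`."""
    return sum(p for e, p in joint.items() if event(*e))

# The three small tables (5 independent numbers in all).
p_cy = P(lambda c, t, ch: c)                                       # P(Cavity)
p_t_given = {c: P(lambda c2, t, ch: c2 == c and t) / P(lambda c2, t, ch: c2 == c)
             for c in (True, False)}                               # P(T|Cy), P(T|!Cy)
p_ch_given = {c: P(lambda c2, t, ch: c2 == c and ch) / P(lambda c2, t, ch: c2 == c)
              for c in (True, False)}                              # P(Ch|Cy), P(Ch|!Cy)

# Reconstruct every joint entry from the decomposition.
for (c, t, ch), p in joint.items():
    pt = p_t_given[c] if t else 1 - p_t_given[c]
    pch = p_ch_given[c] if ch else 1 - p_ch_given[c]
    pc = p_cy if c else 1 - p_cy
    assert abs(pt * pch * pc - p) < 1e-9
print("decomposition matches the full joint")
```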


Independence, decomposition, Naïve Bayes
• If all n symptoms are conditionally independent given Cavity, the size of the representation grows as O(n) instead of O(2^n)
• The decomposition of large probabilistic domains into weakly connected subsets via conditional independence is one of the most important developments in modern AI
• Naïve Bayes model (one Cause, many Effects):
  P(Cause, E1, ..., En) = P(Cause) Π_i P(Ei|Cause)
• An amazingly successful classifier as well
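A minimal sketch of the model as a classifier, reusing the dental numbers (P(Cavity) = 0.2, P(toothache|Cavity) = 0.6, P(catch|Cavity) = 0.9, etc., consistent with Fig 13.3); `classify` is a hypothetical helper name:

```python
# Naive Bayes over the dental domain: Cause = Cavity, effects = (Toothache, Catch).
p_cause = {True: 0.2, False: 0.8}          # P(Cavity)
p_effect_given = {                         # P(Ei = true | Cavity)
    "toothache": {True: 0.6, False: 0.1},
    "catch":     {True: 0.9, False: 0.2},
}

def classify(observed):
    """Return P(Cavity | observed effects) under the naive Bayes model."""
    score = {}
    for cause, prior in p_cause.items():
        s = prior                          # P(Cause)
        for effect, value in observed.items():
            p_true = p_effect_given[effect][cause]
            s *= p_true if value else 1 - p_true   # * P(Ei|Cause)
        score[cause] = s
    alpha = 1 / sum(score.values())        # normalize
    return {cause: alpha * s for cause, s in score.items()}

posterior = classify({"toothache": True, "catch": True})
print(round(posterior[True], 3))   # ≈ 0.871
```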


Where do probabilities come from?

• There are three positions:
  • The frequentist: numbers can come only from experiments
  • The objectivist: probabilities are real aspects of the universe
  • The subjectivist: probabilities characterize an agent's beliefs
• The reference class problem: an intrusion of subjectivity
  • A frequentist doctor wants to consider similar patients
  • How similar are two patients?
• Laplace's principle of indifference
  • Propositions that are syntactically "symmetric" with respect to the evidence should be accorded equal probability


Summary
• Uncertainty exists in the real world.
• It is good (it allows for laziness) and bad (we need new tools).
• Priors, posteriors, and the joint distribution
• Bayes' rule: the basis of Bayesian inference
• Conditional independence allows Bayesian updating to work effectively with many pieces of evidence.
• But ...