Download - Cs511 Uncertainty (1)
-
8/20/2019 Cs511 Uncertainty (1)
1/72
Reasoning Under Uncertainty
• Most tasks requiring intelligent behavior have some
degree of uncertainty associated with them.
• The type of uncertainty that can occur in knowledge-
based systems may be caused by problems with thedata. For example:
. !ata might be missing or unavailable
". !ata might be present but unreliable or ambiguous due to measurement errors.
#. The representation of the data may be imprecise
or inconsistent.
$. !ata may %ust be user&s best guess.
'. !ata may be based on defaults and the defaultsmay have exceptions.
-
8/20/2019 Cs511 Uncertainty (1)
2/72
• The uncertainty may also be caused by the represented
knowledge since it might
. (epresent best guesses of the experts that are based on plausible or statistical associations theyhave observed.
". )ot be appropriate in all situations *e.g.+ mayhave indeterminate applicability,
• iven these numerous sources of errors+ most
knowledge-based systems require the incorporation of some form of uncertainty management.
• hen implementing some uncertainty scheme we
must be concerned with three issues:
. /ow to represent uncertain data
". /ow to combine two or more pieces of uncertaindata
#. /ow to draw inference using uncertain data
e will introduce three ways of handling uncertainty:
"
-
8/20/2019 Cs511 Uncertainty (1)
3/72
• 0robabilistic reasoning.
• 1ertainty factors
• !empster-2hafer Theory
#
-
8/20/2019 Cs511 Uncertainty (1)
4/72
1. Classical Probability
The oldest and best defined technique for managinguncertainty is based on classical probability theory. 3et usstart to review it by introducing some terms.
• Sample space: 1onsider an experiment whose
outcome is not predictable with certainty in advance./owever+ although the outcome of the experiment will
not be known in advance+ let us suppose that the set of all possible outcomes is known. This set of all
possible outcomes of an experiment is known as the sample space of the experiment and denoted by S .
For example:
• 4f the outcome of an experiment consists in the
determination of the sex of a newborn child+ then2 5 6g+ b7
where the outcome g means that the child is a girland b that it is a boy.
• 4f the experiment consists of flipping two coins+
then the sample space consists of the followingfour points:
2 5 6*/+ /,+ */+ T,+ *T+ /,+ *T+ T,7
$
-
8/20/2019 Cs511 Uncertainty (1)
5/72
• Event: any subset E of the sample space is known as
an event.
• That is+ an event is a set consisting of possibleoutcomes of the experiment. 4f the outcome of theexperiment is contained in 8+ then we say that 8 hasoccurred.
For example+ if 8 5 6*/+ /,+ 6/+ T,7+ then 8 is theevent that a head appears on the first coin.
• For any event 8 we define the new event 8&+ referred
to as the complement of 8+ to consist of all points inthe sample space 2 that are not in 8.
• Mutually exclusive events: 9 set of events 8+ 8"+ ...+
8n in a sample space 2+ are called mutually exclusiveevents if 8i ∩ 8 % 5 ∅+ i ≠ %+ ≤ i+ % ≤ n.
'
-
8/20/2019 Cs511 Uncertainty (1)
6/72
• 9 formal theory of probability can be made using
three axioms:
. ≤ 0*8, ≤ .
". 0*8i, 5 *or 0*2, 5 , i
This axiom states that the sum of all events which donot affect each other+ called mutually exclusive
events+ is .
9s a corollary of this axiom:
0*8i, ; 0*8i&, 5 +
where 8i& is the complement of event 8i.
#. 0*8 ∪ 8", 5 0*8, ; 0*8",+
where 8 and 8" are mutually exclusive events. 4ngeneral+ this is also true.
-
8/20/2019 Cs511 Uncertainty (1)
7/72
Compound probabilities
• 8vents that do not affect each other in any way are
called independent events. For two independentevents 9 and =+
0*9∩ =, 5 0*9, 0*=,
Independent events: The events 8+ 8"+ ...+ 8n in asample space 2+ are independent if
0*8i ∩ ... ∩ 8ik , 5 0*8i, ...0*8ik ,
for each subset 6i+ ...+ik , ⊆6+ ...+ n7+≤ k ≤ n+ n ≥ .
• 4f events 9 and = are mutually exclusive+ then
0*9∪ =, 5 0*9, ; 0*=,
• 4f events 9 and = are not mutually exclusive+ then
0*9∪ =, 5 0*9, ; 0*=, - 0*9∩ =,
This is also called Addition la.
>
-
8/20/2019 Cs511 Uncertainty (1)
8/72
Conditional Probabilities
The probability of an event 9+ given = occurred+ is calleda conditional probability and indicated by
0*9 ? =,
The conditional probability is defined as
0*9∩ =,
0*9 ? =, 5 ------------------+ for 0*=, ≠ .
0*=,
• Multiplicative !a of probability for two events is
then defined as
0*9∩ =, 5 0*9 ? =, 0*=,
which is equivalent to the following
0*9∩ =, 5 0*= ? 9, 0*9,
•"enerali#ed Multiplicative !a0*9 ∩ 9" ∩ ... ∩ 9n, 5
0*9 ? 9" ∩ ... ∩ 9n, 0*9" ? 9# ∩ ... ∩ 9n,
... 0*9n- ? 9n, 0*9n,
@
-
8/20/2019 Cs511 Uncertainty (1)
9/72
An example
9s an example of probabilities+ Table below showshypothetical probabilities of a disk crash using a =rand Adrive within one year.
=rand A =rand A& Total of (ows
1rash 1 .< . .> )o crash 1& ." . .#Total of columns .@ ." .
$ypot%etical probabilities o& a dis' cras%
A A& Total of rows
1 0*1 ∩ A, 0*1 ∩ A&, 0*1,
1& 0*1&∩ A, 0*1&∩ A&, 0*1&,
Total of columns 0*A, 0*A&,
Probability interpretation o& to setsBsing above tables+ the probabilities of all events can becalculated. 2ome probabilities are*, 0*1, 5 .>*", 0*1&, 5 .#
*#, 0*A, 5 .@*$, 0*A&, 5 ."*', 0*1 ∩ A, 5 .<
*the probability of a crash and using =rand A,
C
-
8/20/2019 Cs511 Uncertainty (1)
10/72
*'0*A, .@
*>, The probability of a crash+ given that =rand A is notused+ is
0*1 ∩ A&, .
0*1 ? A&, 5 ------------- 5 ------- 5 .'
0*A&, ."
• 0robabilities *', and *
-
8/20/2019 Cs511 Uncertainty (1)
11/72
4n contrast+ the meaning of the conditional probability*
-
8/20/2019 Cs511 Uncertainty (1)
12/72
(ayes) *%eorem
)ote that conditional probability is defined as
0*/ ∩ 8,
0*/ ? 8, 5 ------------------+ for 0*8, ≠ .
0*8,
i.e.+ the conditional probability of / given 8.
• 4n real-life practice+ the probability 0*/ ? 8, cannot
always be found in the literature or obtained fromstatistical analysis. The conditional probabilities
0*8 ? /,
however often are easier to come byD
• 4n medical textbooks+ for example+ a disease is
described in terms of the signs likely to be found in atypical patient suffering from the disease.
• The following theorem provides us with a method for
computing the conditional probability 0*/ ? 8, fromthe probabilities 0*8,+ 0*/, and 0*8 ? /,D
"
-
8/20/2019 Cs511 Uncertainty (1)
13/72
From conditional probability:
0*/ ∩ 8,
0*/ ? 8, 5 ------------------+0*8,
0*8 ∩ /,Furthermore+ we have+ 0*8 ? /, 5 ---------------
0*/,2o+
0*8 ? /,0*/, 5 0*/ ? 8,0*8, 5 0*/ ∩ 8,Thus0*8 ? /, 0*/,
0*/ ? 8, 5 ---------------------0*8,
• This is the =ayes& Theorem. 4ts general form can be
written in terms of events+ 8+ and hypotheses*assumptions,+ /+ in the following alternative forms.
0*8 ∩ /i,
0*/i ? 8, 5 -------------------
∑ 0*8 ∩ / %, %
0*8 ? /i, 0*/i, 0*8 ? /i, 0*/i,5 ----------------------- 5 ---------------------
∑ 0*8 ? / %, 0*/ %, 0*8, %
#
-
8/20/2019 Cs511 Uncertainty (1)
14/72
$ypot%etical reasoning andbac'ard induction
• =ayes& Theorem is commonly used for decision tree
analysis of business and the social sciences.
• The method of =ayesian decision making is also used
in expert system 0(E2081TE(.
e use oil exploration in 0(E2081TE( as an example.2uppose the prospector believes that there is a better than'-' chance of finding oil+ and assumes the following.
0*E, 5 .< and 0*E&, 5 .$
Bsing the seismic survey technique+ we obtain the
following conditional probabilities+ where ; means a positive outcome and - is a negative outcome
0*; ? E, 5 .@ 0*- ? E, 5 ." *false -,0*; ? E&,5 . *false ;, 0*- ? E&, 5 .C
• Bsing the prior and conditional probabilities+ we can
construct the initial probability tree as shown below.
$
-
8/20/2019 Cs511 Uncertainty (1)
15/72
ProbabilitiesPrior
2ub%ective Epinionof site: 0*/i,
Conditional2eismic test result
0*8?/i,
+oint: 0*8∩/,
50*8?/i,0*/i,
)o oil oil
0*E&, 5 .$ 0*E,5.<
test ; test test ; test0*-?E&, 0*;?E&, 0*-?E, 0*;?E,5.C 5 . 5 ." 5 .@
0*-∩E&, 0*;∩E&, 0*-∩E, 0*;∩E,
5.#< 5.$ 5." 5.$@
4nitial probability tree for oil exploration
'
-
8/20/2019 Cs511 Uncertainty (1)
16/72
• Bsing 9ddition law to calculate the total probability of
a ; and a - test
0*;, 5 0*; ∩ E, ; 0*; ∩ E&, 5 .$@ ; .$ 5 .'"0*-, 5 0*- ∩ E, ; 0*- ∩ E&, 5 ." ; .#< 5 .$@
• 0*;, and 0*-, are unconditional probabilities that can
now be used to calculate the posterior probabilities atthe site+ as shown below.
Probabilitiesunconditional
0*8,
PosteriorEf site:
0*/i?8, 5 0*8?
/i,0*/i,G0*8,
+oint: 0*8∩/,
50*/i?8,0*8,
test ; test0*-, 5 .$@ 0*;,5.'"
)o oil oil )o oil oil0*E&?-, 0*E?-, 0*E&?;, 0*E?;,5#G$ 5 G$ 5 G# 5 "G#
0*-∩E&, 0*-∩E, 0*;∩E&, 0*;∩E,
5.#< 5." 5.$ 5.$@
(evised probability tree for oil exploration
-
8/20/2019 Cs511 Uncertainty (1)
17/72
• The Figure below shows the =aysian decision tree
using the data from the above Figure. The payoffs areat the bottom of the tree.
Thus if oil is found+ the payoff isH+"'+ - H"+ - H'+ 5 H++
while a decision to quit after the seismic test result gives a payoff of -H'+.
The assumed amounts are:Eil lease+ if successful: H+"'+!rilling expense: -H"+2eismic survey: -H'+
Event*est result
; or -
Act.Iuit or drill
EventEil or no oil
payo&&
0*-, 5 .$@ 0*;,5.'"
Iuit !rill Iuit !rill
-H'+ -H'+-H++ H++ -H++ H++
4nitial =ayesian !ecision tree for oil exploration
)o oil oil )o oil oil
0*E&?-, 0*E?-, 0*E&?;, 0*E?;,5#G$ 5 G$ 5 G# 5 "G#
9
1=
! 8
>
-
8/20/2019 Cs511 Uncertainty (1)
18/72
• 4n order for the prospector to make the best decision+
the expected payoff must be calculated at event node9.
• To compute the expected payoff at 9+ we must work
backward from the leaves. This process is calledbac'ard induction.
@
-
8/20/2019 Cs511 Uncertainty (1)
19/72
• The expected payoff from an event node is the sum of
the payoffs times the probabilities leading to the payoffs.
8xpected payoff at node 1H@$
-
8/20/2019 Cs511 Uncertainty (1)
20/72
• The decision tree shows the optimal strategy for the
prospector. 4f the seismic test is positive+ the siteshould be drilled+ otherwise+ the site should be
abandoned.
• The decision tree is an example of hypothetical
reasoning or Jwhat ifK type of situations.
• =y exploring alternate paths of action+ we can prune
paths that do not lead to optimal payoffs.
"
-
8/20/2019 Cs511 Uncertainty (1)
21/72
(ayes) rule and 'noledge,basedsystems
9s we know+ rule-based systems express knowledge in an4F-T/8) format:
4F A is trueT/8) L can be concluded with probability p
• 4f we observe that A is true+ then we can conclude thatL exist with the specified probability. For example
4F the patient has a coldT/8) the patient will sneee *.>',
• =ut what if we reason abductively and observe L *i.e.+the patient sneees, while knowing nothing about A*i.e.+ the patient has a cold,N hat can we concludeabout itN =ayes& Theorem describes how we canderive a probability for A.
"
-
8/20/2019 Cs511 Uncertainty (1)
22/72
• ithin the rule given above+ L *denotes some piece of
evidence *typically referred to as 8, and A denotessome hypothesis */, given
0*8 ? /, 0*/,*, 0*/ ? 8, 5 -------------------
0*8,or
0*8 ? /, 0*/,*", 0*/ ? 8, 5 -----------------------------------------
0*8 ? /,0*/, ; 0*8 ? /&,0*/&,
To make this more concrete+ consider whether (ob has acold *the hypothesis, given that he sneees *the evidence,.
• 8quation *", states that the probability that (ob has a
cold given that he sneees is the ratio of the probability that he both has a cold and sneees+ to the probability that he sneees.
• The probability of his sneeing is the sum of the
conditional probability that he sneees when he has acold and the conditional probability that he sneees
when he doesn&t have a cold. 4n other words+ the probability that he sneees regardless of whether hehas a cold or not. 2uppose that we know in general
""
-
8/20/2019 Cs511 Uncertainty (1)
23/72
0*/, 5 0*(ob has a cold,5 ."
0*8 ? /, 50*(ob was observed sneeing ? (ob has a cold,
5 .>'0*8 ? /&, 5 0*(ob was observed sneeing ?(ob does not have a cold,
5 ."Then
0*8, 5 0*(ob was observed sneeing,5 *.>',*.", ; *.",*.@,5 .#
and
0*/ ? 8, 50*(ob has a cold ? (ob was observed sneeing,
*.>',*.",5 ---------------
*.#,5 .$@#@>
• Er (ob&s probability of having a cold given that he
sneees is about .'.
"#
-
8/20/2019 Cs511 Uncertainty (1)
24/72
• e can also determine what his probability of having
a cold would be if he was not sneeing:
0*8& ? /,0*/,0*/ ? 8&, 5 -------------------
0*8&,*-.>', *.",
5 -------------------* - .#,
5 .>"$<
• 2o knowledge that he sneees increasing his
probability of having a cold by approximately ".'+while knowledge that does not sneee decreases his
probability by a factor of almost #.
"$
-
8/20/2019 Cs511 Uncertainty (1)
25/72
Propagation o& (elie&
)ote that what we have %ust examined is very limitedsince we have only considered when each piece of evidence affects only one hypothesis.
• This must be generalied to deal with JmK hypotheses
/+ /"+ ... /m and JnK pieces of evidence 8+ ...+ 8n+ the
situation normally encountered in real-world problems. hen these factors are included+ 8quation*", becomes
0*8 %∩8 %"∩...∩8 %k ? /i, 0*/i,
*#, 0*/i ? 8 % ∩8 %"∩ ...∩8 %k , 5 -----------------------------------
0*8 % ∩8 %"∩ ...∩ 8 %k ,
0*8 % ? /i,0*8 %" ? /i, ... 0*8 %k ? /i,0*/i,5 -------------------------------------------------------
m
∑0*8 % ? /l,0*8 %" ? /l, ... 0*8 %k ? /l,0*/l,l5
where 6%+ ...+%k , ⊆6+ ...+ n7
• This probability is called the posterior probability of
hypothesis /i from observing evidence 8 %+ 8 %"+ ...+ 8 %k .
"'
-
8/20/2019 Cs511 Uncertainty (1)
26/72
• This equation is derived based on several
assumptions:
. The hypotheses /+ ...+ /m+ m ≥ + are mutuallyexclusive.
". Furthermore+ the hypotheses /+ ...+ /m arecollectively exhaustive.
#. The pieces of evidence 8+ ...+ 8n+ n ≥ + are
conditionally independent given any hypothesis
/i+ ≤ i ≤ m .
Conditional independent: The events 8+ 8"+ ...+8n+ are conditionally independent given an event/ if
0*8 % ∩ ... ∩ 8 %k ? /, 5 0*8 % ? /, ...0*8 %k ? /,
for each subset 6%+ ...+%k , ⊆6+ ...+ n7.
• This last assumption often causes great difficulties for
probabilistic based methods.
For example+ two symptoms+ 9 and =+ might eachindependently indicate that some disease is ' percent
likely. Together+ however+ it might be that thesesymptoms reinforce *or contradict, each other. 1aremust be taken to ensure that such a situation does notexist before using the =ayesian approach.
"
-
8/20/2019 Cs511 Uncertainty (1)
27/72
• To illustrate how belief is propagated through a
system using =ayes& rule+ consider the values shownin the Table below. These values represent
*hypothetically, three mutually exclusive andexhaustive hypotheses
. /+ the patient+ (ob+ has a coldD
". /"+ (ob has an allergyD and
#. /#+ (ob has a sensitivity to light
with their prior probabilities+ 0*/i,&s+ and twoconditionally independent pieces of evidence
. 8+ (ob sneees and
". 8"+ (ob coughs+
which support these hypotheses to differing degrees.
i 5 i 5 " i 5 #*cold, *allergy, *light sensitive,
0*/i, .< .# .0*8 ? /i, .# .@ .#0*8" ? /i, .< .C .
">
-
8/20/2019 Cs511 Uncertainty (1)
28/72
• 4f we observe evidence 8 *e.g.+ the patient sneees,+
we can compute posterior probabilities for thehypotheses using 8quation *#, *where k 5 , to be:
*.#,*.
-
8/20/2019 Cs511 Uncertainty (1)
29/72
0*/" ? 8 ∩ 8",
*.@,*.C,*.#,5 ------------------------------------------------------------
*.#,*.
-
8/20/2019 Cs511 Uncertainty (1)
30/72
Advantages and disadvantages o& (ayesian met%ods
The =ayesian methods have a number of advantages thatindicates their suitability in uncertainty management.
• Most significant is their sound theoretical foundation
in probability theory. Thus+ they are currently the most
mature of all of the uncertainty reasoning methods.
hile =ayesian methods are more developed than theother uncertainty methods+ they are not without faults.
. They require a significant amount of probability datato construct a knowledge base. Furthermore+ human
experts are normally uncertain and uncomfortableabout the probabilities they are providing.
". hat are the relevant prior and conditional probabilities based onN 4f they are statistically based+the sample sies must be sufficient so the probabilitiesobtained are accurate. 4f human experts have provided
the values+ are the values consistent andcomprehensiveN
#. Eften the type of relationship between the hypothesisand evidence is important in determining how theuncertainty will be managed. (educing these
#
-
8/20/2019 Cs511 Uncertainty (1)
31/72
associations to simple numbers removes relevantinformation that might be needed for successfulreasoning about the uncertainties. For example+
=ayesian-based medical diagnostic systems havefailed to gain acceptance because physicians distrustsystems that cannot provide explanations describinghow a conclusion was reached *a feature difficult to
provide in a =ayesian-based system,.
$. The reduction of the associations to numbers also
eliminated using this knowledge within other tasks.For example+ the associations that would enable thesystem to explain its reasoning to a user are lost+ as isthe ability to browse through the hierarchy of evidences to hypotheses.
#
-
8/20/2019 Cs511 Uncertainty (1)
32/72
- Certainty &actors
• 1ertainty factor is another method of dealing with
uncertainty. This method was originally developed for the ML14) system.
• Ene of the difficulties with =ayesian method is that
there are too many probabilities required. Most of
them could be unknown.
• The problem gets very bad when there are many
pieces of evidence.
• =esides the problem of amassing all the conditional
probabilities for the =ayesian method+ another ma%or problem that appeared with medical experts was therelationship of belief and disbelief.
• 9t first sight+ this may appear trivial since obviously
disbelief is simply the opposite of belief. 4n fact+ thetheory of probability states that
0*/, ; 0*/&, 5
and so
#"
-
8/20/2019 Cs511 Uncertainty (1)
33/72
0*/, 5 - 0*/&,
• For the case of a posterior hypothesis that relies on
evidence+ 8
*, 0*/ ? 8, 5 - 0*/& ? 8,
• /owever+ when the ML14) knowledge engineers
began interviewing medical experts+ they found that physicians were extremely reluctant to state their knowledge in the form of equation *,.
• For example+ consider a ML14) rule such as the
following.
4F , The stain of the organism is gram positive+ and ", The morphology of the organism is coccus+ and #, The growth conformation of the organism is chains
T/8) There is suggestive evidence *.>, that theidentity of the organism is streptococcus
• This can be written in terms of posterior probability:
*", 0*/ ? 8∩ 8" ∩ 8#, 5 .>
where the 8i correspond to the three patterns of theantecedent.
##
-
8/20/2019 Cs511 Uncertainty (1)
34/72
• The ML14) knowledge engineers found that while an
expert would agree to equation *",+ they becameuneasy and refused to agree with the probability result
*#, 0*/& ? 8∩ 8" ∩ 8#, 5 - .> 5 .#
• This illustrates these numbers such as .> and .# are
likelihoods of belief+ not probabilities.
•3et us have another example.
2uppose this is your last course required for a degree.9ssume your grade-point-average *09, has not beentoo good and you need an O9& in this course to bringup your 09. The following formula may expressyour belief in the likelihood of graduation.
*$, 0*graduating ? O9& in this course, 5 .>
)otice that this likelihood is not P. The reason it&snot P is that a final audit of your course andgrades must be made by the school. There could be
problem due to a number of reasons that would still
prevent your graduation.
#$
-
8/20/2019 Cs511 Uncertainty (1)
35/72
• 9ssuming that you agree with *$, *or perhaps your
own value for the likelihood, then by equation *,
*', 0*not graduating ? O9& in this course, 5 .#
• From a probabilistic point of view+ *', is correct.
/owever+ it seems intuitively wrong. 4t is %ust not rightthat if you really work hard and get an O9& in thiscourse+ then there is a #P chance that you won&tgraduate. *', should make you uneasy.
• The fundamental problem is that while 0*/ ? 8,
implies a cause of effect relationship between 8 and/+ there may be no cause and effect relationship
between 8 and /&.
These problems with the theory of probability led the theresearchers in ML14) to investigate other ways of representing uncertainty.
• The method that they used with ML14) was based on
certainty &actors.
#'
-
8/20/2019 Cs511 Uncertainty (1)
36/72
Measures o& belie& and disbelie&
4n ML14)+ the certainty factor *1F, was originallydefined as the difference between belief and disbelief.
1F*/+ 8, 5 M=*/+ 8, - M!*/+ 8,
where
1F is the certainty factor in the hypothesis /due to evidence 8
M= is the measure o& increased belie& in / due to 8 M! is the measure o& increased disbelie& in / due to 8
• The certainty factor is a way of combining belief and
disbelief into a single number.
• 1ombining the measures of belief and disbelief into a
single number has some interesting uses.
• The certainty factor can be used to rank hypothesis in
order of importance.
For example+ 4f a patient has certain symptomswhich suggest several possible diseases+ then thedisease with the highest 1F would be the one that isfirst investigated by ordering tests.
#
-
8/20/2019 Cs511 Uncertainty (1)
37/72
The measures of belief and disbelief were defined in termsof probabilities by
5 if 0*/, 5 M=*/+ 8,
maxQ0*/ ? 8,+ 0*/,R - 0*/, 5 ---------------------------------- otherwise - 0*/,
5 if 0*/, 5M!*/+8,
min Q0*/ ? 8,+ 0*/,R - 0*/, 5 ---------------------------------- otherwise - 0*/,
#>
-
8/20/2019 Cs511 Uncertainty (1)
38/72
• 9ccording to these definitions+ some characteristics
are shown in Table '-.
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS 1haracteristics alues
------------------------------------------------------
(anges ≤ M= ≤
≤ M! ≤
- ≤ 1F ≤
------------------------------------------------------- 1ertain True /ypothesis M= 50*/ ? 8, 5 M! 5
1F 5 -------------------------------------------------------1ertain False /ypothesis M= 5
0*/&?8, 5 M! 5
1F 5 -------------------------------------------------------- 3ack of evidence M= 5 0*/ ? 8, 5 0*/, M! 5
1F 5-------------------------------------------------------
2ome 1haracteristics of M=+ M! and 1F
#@
-
8/20/2019 Cs511 Uncertainty (1)
39/72
• The certainty factor+ 1F+ indicates the net belief in
hypothesis based on some evidence.
• 9 positive 1F means the evidence supports thehypothesis since M= U M!.
• 9 1F 5 means that the evidence definitely proves
the hypothesis.
•9 1F 5 means one of two possibilities.
. First+ a 1F 5 M= - M! 5 could mean that bothM= and M! are .
". The second possibility is that M= 5 M! and bothare nonero. The result is that the belief is
canceled out by the disbelief.
• 9 negative 1F means that the evidence favors the
negation of the hypothesis since M= V M!. 9nother way of stating this is that there is more reason todisbelief a hypothesis than to belief it.
For example+ a 1F 5 ->P means that the disbelief is>P greater than the belief.
9 1F5>P means that the belief is >P greater thanthe disbelief.
#C
-
8/20/2019 Cs511 Uncertainty (1)
40/72
• 1ertainty factors allow an expert to express a belief
without committing a value to the disbelief.
• The following equation is true.
1F*/+ 8, ; 1F*/&+ 8, 5
• The equation means that evidence supporting a
hypothesis reduces support to the negation of thehypothesis by an equal amount so that the sum isalways .
• For the example of the student graduating if an O9& is
given in the course
1F*/+8, 5 .> 1F*/&+8, 5 -.>
which means* P certain that 4 will graduate if 4 get
an O9& in this course. *>, 4 am ->P certain that 4 will not graduate if 4
get an O9& in this course.
• means no evidence.
• 2o certainty values greater than favor the hypothesis
$
-
8/20/2019 Cs511 Uncertainty (1)
41/72
• 1ertainty factors less than favor the negation of the
hypothesis. 2tatements *, are equivalentusing certainty factors
• The above 1F values might be elicited by asking
/ow much do you believe that getting an O9Kwill help you graduateN
if the evidence is to confirm the hypothesis+ or
/ow much do you disbelief that getting O9&will help you graduateN
9n answer of >P to each question will set 1F*/+ 8,5 .>+ and 1F*/&+8, 5 -.>.
$
-
8/20/2019 Cs511 Uncertainty (1)
42/72
Calculation it% Certainty /actors
9lthough the original definition of 1F was
1F 5 M= - M!
there were difficulties with this definition
• because one piece of disconfirming evidence couldcontrol the confirmation of many other pieces of evidence.
For example+ ten pieces of evidence might produce aM= 5 .CCC and one disconfirming piece with M! 5.>CC could then give
1F 5 .CCC - .>CC 5 ."
• The definition of 1F was changed in ML14) in C>>
to be
M= - M!
1F 5 ------------------------ - min*M=+ M!,
$"
-
8/20/2019 Cs511 Uncertainty (1)
43/72
• This softens the effects of a single piece of
disconfirming evidence on many confirming pieces of evidence. Bnder this definition with M=5.CCC+
M!5.>CC
.CCC-.>CC ." 1F 5 --------------------------- 5 ------------- 5 .CC' - min*.CCC+ .>CC, - .>CC
• The ML14) method for combining evidence in the
antecedent of a rule are shown in Table '-".
------------------------------------------------------------- 8vidence+ 8 9ntecedent 1ertainty ------------------------------------------------------------- 8 9)! 8" min Q1F*8+ e,+1F*8"+ e,R
8 E( 8" maxQ1F*8+ e,+1F*8"+ e,R )ET 8 -1F*8+ e,
-------------------------------------------------------------- Table '-"
• For example+ given a logical expression for combining
evidence such as
8 5 *8 9)! 8" 9)! 8#, or *8$ 9)! )ET 8',
the evidence 8 would be computed as
$#
-
8/20/2019 Cs511 Uncertainty (1)
44/72
8 5 maxQmin*8+ 8"+ 8#,+ min*8$+ -8',Rfor values
8 5 .C 8" 5 .@ 8# 5 .#
8$ 5 -.' 8' 5 -.$the result is
8 5 maxQmin*.C+ .@+ .#,+ min*-.'+ -*-.$,R 5 maxQ.#+ -.'R 5 .#
• The formula for the 1F of a rule
4f 8 T/8) /is given by
*@, 1F*/+e, 5 1F*8+e, 1F*/+8,
where
1F*8+e, is the certainty factor of the evidence 8making up the antecedent of the rule base onuncertain evidence e.
1F*/+8, is the certainty factor of hypothesisassuming that the evidence is with certainty+ when1F*8+e, 5 .
1F*/+e, is the certainty factor of the hypothesis based on uncertain evidence e.
$$
-
8/20/2019 Cs511 Uncertainty (1)
45/72
• Thus+ if all the evidence in the antecedent is known
with certainty+ the formula for the certainty factor of the hypothesis is
1F*/+e, 5 1F*/+8,
since 1F*8+e, 5 .
• 2ee an example. 1onsider the 1F for the
streptococcus rule discussed before+
4F , The stain of the organism is gram positive+ and", The morphology of the organism is coccus+ and#, The growth confirmation of the organism is chains
T/8) There is suggestive evidence *.>, that the identity of the organism is streptococcus
where the certainty factor of the hypothesis under certain evidence is
1F*/+ 8, 5 1F */+ 8∩8"∩8#, 5 .>
and is also called the attenuation &actor.
The attenuation &actor is based on the assumptionthat all the evidence--8+ 8" and 8#--is known withcertainty. That is+
1F*8+ e, 5 1F*8"+ e, 5 1F*8#+ e, 5
$'
-
8/20/2019 Cs511 Uncertainty (1)
46/72
• hat happens when all the evidenced are not known
with certaintyN
• 4n the case of ML14)+ the formula *@, must be usedto determine the resulting 1F value since 1F*/+
8∩8"∩8#, 5 .> is no longer valid for uncertain
evidence.
For example+ assuming 1F*8+e, 5 .' 1F*8"+e, 5 .< 1F*8#+e, 5 .#
then
1F*8+e, 5 1F*8∩8"∩8#+e,
5 minQ1F*8+e,+ 1F*8"+e,+ 1F*8#+e,R 5 minQ.'+ .
5 ."
$
-
8/20/2019 Cs511 Uncertainty (1)
47/72
• hat happen when another rule also concludes the
same hypothesis+ but with a different certainty factorN
• The certainty factors of rules concluding the samehypothesis is calculated from the combining functionfor certainty factors defined as
*C, 1F1EM=4)8*1F+1F",
5 1F ; 1F" * - 1F, if both 1F and 1F" U
1F ; 1F"
5 ------------------------- if one of 1F and 1F" V - min*?1F?+?1F"?,
5 1F ; 1F" * ; 1F, if both 1F and 1F" V
where 1F is 1F*/+ e, and 1F" is 1F"*/+ e,.
• The formula for 1F1EM=4)8 used depends on whether
the individual certainty factors are positive or negative.
• The combining function for more than two certainty
factors is applied incrementally. That is+ the 1F1EM=4)8is calculated for two 1F values+ and then the1F1EM=4)8 is combined using formula *C, with the third1F values+ and so forth.
$>
-
8/20/2019 Cs511 Uncertainty (1)
48/72
• The following figure summaries the calculations with
certainty factors for two rules based on uncertainevidence and concluding the same hypothesis.
1F of two rules with the same hypothesis based on uncertain evidence
/ypothesis+ /
1F*/+ e, 5 1F
*8+ e,1F
*/+e, 1F
"*/+ e, 5 1F
"*8+ e,1F
"*/+e,
(ule (ule "
9)! E( )ET*min, *max, *-,
9)! E( )ET*min, *max, *-,
$@
-
8/20/2019 Cs511 Uncertainty (1)
49/72
• 4n our above example+ if another rule concludes
strepococcus with certainty factor 1F" 5 .'+ then thecombined certainty using the first formula of *C, is
1F1EM=4)8*."+ .', 5 ." ; .'* - .", 5 .
-
8/20/2019 Cs511 Uncertainty (1)
50/72
Advantages and disadvantages o& certainty &actors
The 1F formalism has been quite popular with expertsystem developers since its creation because
. 4t is a simple computational model that permitsexperts to estimate their confidence in conclusion
being drawn.
". 4t permits the expression of belief and disbelief ineach hypothesis+ allowing the expression of the effectof multiple sources of evidence.
#. 4t allows knowledge to be captured in a rulerepresentation while allowing the quantification of
uncertainty.
$. The gathering of the 1F values is significantly easier than the gathering of values for the other methods. )ostatistical base is required - you merely have to ask theexpert for the values.
Many systems+ including ML14)+ have utilied thisformalism and have displayed a high degree of competence in their application areas. =ut is thiscompetence due to these systems& ability to manipulateand reason with uncertainty or is it due to other factorsN
'
-
8/20/2019 Cs511 Uncertainty (1)
51/72
2ome studies have shown that changing the certaintyfactors or even turn off the 1F reasoning portion of ML14) does not seems to affect the correct diagnoses
much.
• This revealed that the knowledge described within the
rule contributes much more to the final+ derivedresults than the 1F values.
Ether criticisms of this uncertainty reasoning method
include among others:
. The 1F lack theoretical foundation. =asically+ the 1Fwere partly ad %oc. 4t is an approximation of
probability theory.
". )on-independent evidence can be expressed and
combined only by JchunkingK it together within thesame rule. hen large quantities of non-independentevidence must be expressed+ this proves to beunsatisfactory
#. the 1F values could be the opposite of conditional probabilities.
For example+ if 0*/, 5 .@ 0*/", 5 ."0*/ ? 8, 5 .C 0*/" ? 8, 5 .@
then 1F*/+ 8, 5 .' and 1F*/"+ 8, 5 .>'
'
-
8/20/2019 Cs511 Uncertainty (1)
52/72
• 2ince one purpose of 1F is to rank hypotheses in
terms of likely diagnosis+ it is a contradiction for a
disease to have a higher conditional probability0*/ ? 8, and yet have a lower certainty factor+1F*/+ 8,.
'"
-
8/20/2019 Cs511 Uncertainty (1)
53/72
0 empster,S%a&er *%eory
/ere we discuss another method for handling uncertainty.4t is called !empster-2hafer theory. 4t is evolved duringthe Cs through the efforts of 9rthur !empster and one of his students+ lenn 2hafer.
• This theory was designed as a mathematical theory of
evidence.
• The development of the theory has been motivated by
the observation that probability theory is not able todistinguish between uncertainty and ignorance owingto incomplete information.
'#
-
8/20/2019 Cs511 Uncertainty (1)
54/72
/rames o& discernment
iven a set of possible elements+ called environment+
Θ 5 6θ+ θ"+ ...+ θn7
that are mutually exclusive and exhaustive.
• The environment is the set of ob%ects that are of
interest to us.
• For example+
Θ 5 6airline+ bomber+ fighter7
Θ 5 6red+ green+ blue+ orange+ yellow7
• Ene way of thinking about Θ is in terms of questions
and answers. 2uppose
Θ 5 6airline+ bomber+ fighter7
and the questions is+ Jwhat are the military aircraftNK.
The answer is the subset of Θ
6θ"+ θ#7 5 6bomber+ fighter7
'$
-
8/20/2019 Cs511 Uncertainty (1)
55/72
• 8ach subset of Θ can be interpreted as a possible
answer to a question.
• 2ince the elements are mutually exclusive and the
environment is exhaustive+ there can be only onecorrect answer subset to a question.
• Ef course+ not all possible questions may be
meaningful.
•The subsets of the environment are all possible validanswers in this universe of discourse.
• 9n environment is also called a &rame o&
discernment.
• The term discern means that it is possible to
distinguish the one correct answer from all the other possible answers to a question.
• The power set of the environment *with " ) subsets for
a set of sie ), has as its elements all answers to the possible questions of the frame of discernment.
''
-
8/20/2019 Cs511 Uncertainty (1)
56/72
Mass /unctions and Ignorance
4n =ayesian theory+ the posterior probability changes asevidence is acquired. 3ikewise in !empster-2hafer theory+the belief in evidence may vary.
• 4t is customary in !empster-2hafer theory to think
about the degree of belief in evidence as analogous to
the mass of a physical ob%ect.
That is+ the mass of evidence supports a belief.
• The reason for the analogy with an ob%ect of mass is
to consider belief as a quantity that can move around+ be split up+ and combined.
'
-
8/20/2019 Cs511 Uncertainty (1)
57/72
• 9 fundamental difference between !empster-2hafer
theory and probability theory is the treatment of ignorance.
• 9s discussed in 1hapter $+ probability theory must
distribute an equal amount of probability even inignorance.
For example+ if you have no prior knowledge+ thenyou must assume the probability 0 of each possibility
is
0 5 ----
)
where ) is the number of possibilities.
8.g.+ The formula 0*/, ; 0*/&, 5 must be enforced
• The !empster-2hafer theory does not force belief to
be assigned to ignorance or refutation of a hypothesis.
The mass is assigned only to those subsets of theenvironment to which you wish to assign belief.
'>
-
8/20/2019 Cs511 Uncertainty (1)
58/72
• 9ny belief that is not assigned to a specific subset is
considered no belie& or nonbelie& and %ust associated
with environment Θ.
• =elief that refutes a hypothesis is disbelie& + which is
not nonbelief.
For example+ we are trying to identify whether anaircraft is hostile. 2uppose there is the evidence of .>indicating a belief that the target aircraft is hostile+where hostile aircraft are only considered to be
bombers and fighters. Thus+ the mass assignment is tothe subset 6bomber+ fighter,+ and
m*6bomber+ fighter7, 5 .>
The rest of the belief is left with the environment+ Θ+as nonbelief.
m*Θ, 5 - .> 5 .#.
'@
-
8/20/2019 Cs511 Uncertainty (1)
59/72
• The !empster-2hafer theory has a ma%or difference
with probability theory which would assume that
0*hostile, 5 .>0*non-hostile, 5 - .> 5 .#
• .# in !empster-2hafer theory is held as nonbelief in
the environment by m*Θ,. This means neither belief
nor disbelief in the evidence to a degree of .#.
9 mass has considerably more freedom than probabilitiesas show in the table below.
!empster-2hafer theory 0robability theory-------------------------------------------------------------------
m*Θ, does not have to be ∑0i 5 i
4f A ⊆ L+ it is not necessary 0*A, ≤ 0*L,that m*A, ≤ m*L,
)o required relationship 0*A, ; 0*A&, 5 between m*A, and m*A&,
'C
-
8/20/2019 Cs511 Uncertainty (1)
60/72
e now state things more formally.
• 3et Θ be a frame of discernment *environment,. 9
mass assignment function assigns a number m* x, toeach x ⊆ Θ such that:
*, ≥ m*x, ≥
*", m*∅, 5
*#,∑ m*x, 5 x⊆Θ
• 3et Θ be a frame of discernment *environment,. and
let m be a mass assignment function on Θ. 9 set x ⊆ Θ
is called a focal element in m if m*x, U . The core of m+ denoted by k*m,+ is the set of all focal elements inm.
3et us consider a medical example. 2uppose
Θ 5 6heart-attack+ pericarditis+ pulmonary-embolism+
aortic-dissection7.
• )ote that each mass assignment on Θ assigns mass
numbers to "$
5 < sets. 4f for a specific patient thereis no evidence pointing at a certain diagnosis in
particular+ the mass of is assigned toΘ.
if x 5 Θ
-
8/20/2019 Cs511 Uncertainty (1)
61/72
m*x, 5 otherwise
• 8ach proper subset of Θ gets assigned the number .
The core of m is equal to 6Θ7
• )ow suppose that some evidence has become
available that points to the composite hypothesis
heart#attack or pericarditis with some certainty.
Then the subset 6heart#attack + pericarditis7 will beassigned a mass+ e.g.+ .$. !ue to lack of further information+ the remaining certainty .< is assigned to
Θ.
.< if x 5 Θ
m*x, 5 .$ if x 5 6heart#attack + pericarditis7 otherwise
• )ow suppose we have obtained some evidence
against the hypothesis that our patient is sufferingfrom pericarditis. This information can be consideredas support for hypothesis that the patient is not suffering from pericarditis. This is equivalent to thecomposite hypothesis heart#attack or p$lmonary#
embolism or aortic#dissection. e therefore assign amass+ for example .>+ to the set 6heart#attack + p$lmonary#embolism+ aortic#dissection7
.# if x 5 Θ
-
8/20/2019 Cs511 Uncertainty (1)
62/72
m"*x, 5 .> if x 5 6heart#attack + p$lmonary#embolism+aortic#dissection 7
otherwise
-
8/20/2019 Cs511 Uncertainty (1)
63/72
Combining evidence
!empster-2hafer theory provides a function for computingfrom two pieces of evidence and their associated massesdescribing the combined influence of these pieces of evidence.
• This function is known as %empster&s r$le of
combination.
• 3et m and m" be mass assignments on Θ+ the frame of
discernment. The combined mass is computed usingthe formula *special form of empster)s rule o& Combination,
m ⊕ m"*X, 5 ∑ m*A, m"*L,
A ∩L5 X
For instance+ using our hostile aircraft example+ basedon two pieces of evidence+ we obtain
m"*6=7, 5 .C m"*Θ, 5 .
-------------------------------------------------------------m*6=+ F7, 5 .> 6=7 .
m*Θ, 5 .# 6=7 ."> Θ .#
8.g.+ the entry T is calculated as this
-
8/20/2019 Cs511 Uncertainty (1)
64/72
T*6=7, 5 m*6=+ F7, m"*6=7, 5 *.>,*.C,5. 5 .C =omber
m#*6=+ F7, 5 m ⊕ m"*6=+ F 7,
5 .> =omber or fighter
m#*Θ, 5 m ⊕ m"*Θ, 5 .# nonbelief
• The m#*6=7, represents the belief that the target is a
bomber and only a bomber.
• m#*6=+ F7, and m#*Θ, imply more information as their
sets include a bomber+ it is plausible that their sumsmay contribute to a belief in the bomber.
• 2o .> ; .# 5 . may be added to the belief of .C
in the bomber set to yield the maximum belief *5 ,that it could be a bomber. This is called the plausiblebelie& .
-
8/20/2019 Cs511 Uncertainty (1)
65/72
• e have two belief values for the bomber+ .C and .
This pair represents a range o& belie& . 4t is called anevidential interval.
• The loer bound is called the support *Spt, or (el+
and the upper bound is called plausibility *PIs,.
For instance+ .C is the lower bound in the aboveexample+ and is the upper bound.
• The support is the minimum belief based on the
evidence+ while the plausibility is the maximum belief we are willing to give.
• Thus+ ≤ =el ≤ 04s ≤ . Table below shows some
common evidential interval.
8vidential 4nterval Meaning ------------------------------------------------------------------------
Q+ R 1ompletely true
Q+ R 1ompletely false
Q+ R 1ompletely ignorant
Q=el+ R where V =el V here Tends to support
Q+ 04sR where V 04s V here Tends to refuteQ=el+ 04sR where V =el ≤04s V here Tends to both support and refute
-
8/20/2019 Cs511 Uncertainty (1)
66/72
• The (el *belie& &unction2 or support, is defined to be
the total belief of a set and all its subsets.
=el*A, 5∑ m*L,L⊆ A
For example+
=el*6=+ F7, 5 m*6=+ F7, ; m*6=7, ; m*6F7,5 .> ; ; 5 .>
• =elief function is different from the mass+ which is the
belief in the evidence assigned to a single set.
• 2ince belief functions are defined in terms of masses+
the combination of two belief functions also can beexpressed in terms of masses of a set and all itssubsets.
• For example:
=el ⊕ =el"*6=7, 5 m⊕m"*6=7, ; m⊕ m"*∅,
5 .C ; 5 .C.
=el ⊕ =el"*6=+ F7,
5 m⊕m"*6=+ F7, ; m⊕m"*6=7, ; m⊕ m"*6F7,
5 .> ; .C ; 5 .C>
=el ⊕ =el"*Θ,
5 m⊕ m"*Θ, ; m⊕m"*6=+ F7, ; m⊕m"*6=7,
5 .# ; .> ; .C 5
-
8/20/2019 Cs511 Uncertainty (1)
67/72
• =el*Θ, 5 in all cases since the sum of masses must
always equal .
• The evidential interval of a set 2+ EI*2,+ may bedefined in terms of the belief.
84*2, 5 Q=el*2,+ - =el*2&,R
• For instance+ if 2 5 6=7+ then 2& 5 69+ F7 and
=el*69+ F7,5 m⊕m"*69+ F7, ; m⊕m"*697, ; m⊕ m"*6F7,5 ; ; 5
since these are not focal elements and the mass is for nonfocal elements.
• Thus+ 84*6=7, 5 Q.C+ - R 5 Q.C+ R
9lso+ since =el*697, 5 + and
=el*6=+ F7, 5 =el ⊕ =el"*6=+ F7, 5 .C>.
Then84*6=+ F7, 5 Q.C>+ - R 5 Q.C>+ R
84*697, 5 Q+ .#R
-
8/20/2019 Cs511 Uncertainty (1)
68/72
• The plausibility is defined as the degree to which the
evidence fails to refute A
04s*A, 5 - =el*A&,
Thus+ 84*A, 5 Q=el*A,+ 04s*A,R
• The evidential interval 3total belie&2 plausibility4 can
be expressed+
Qevidence for support+ evidence for support ; ignoranceR
• The dubiety *bt, or doubt represents the degree to
which A is disbelieved or refuted.
• The ignorance *Igr, is the degree to which the mass
supports A and A&.
These are defined as follows:
!bt*A, 5 =el*A&, 5 - 04s*A,
4gr*A, 5 04s*A, - =el*A,
-
8/20/2019 Cs511 Uncertainty (1)
69/72
*%e 5ormali#ation o& (elie&
• 3et us see an example. 2uppose a third evidence now
reports conflicting evidence of an airliner
m#*697, 5 .C'+ m#*Θ, 5 .'
• The table shows how the cross products are
calculated.
m⊕m"*6=7, m⊕m"*6=+ F7, m⊕m"*Θ,
.C .> .#----------------------------------------------------------------------------------
m#*697, 5 .C' ∅ .@'' ∅ .
-
8/20/2019 Cs511 Uncertainty (1)
70/72
∑ m⊕m"⊕m#*A,
5 ."@' ; .$' ; .#' ; .' 5 .>@'
• /owever a sum of is required because the combined
evidence m⊕m"⊕m#+ is a valid mass and the sumover all focal elements must be .
This is a problem.
• The solution to this problem is a normaliation of the
focal elements by dividing each focal element by
- κ where κ is defined for any sets A and L as
κ 5 ∑ m*A,m"*L,A ∩L 5 ∅
For our problem+
κ 5 .@'' ; .
-
8/20/2019 Cs511 Uncertainty (1)
71/72
• The total normalied belief in 6=7 is now
=el*6=7, 5 m⊕m"⊕m#*6=7, 5 .'>#
=el*6=7&, 5 =el*69+ F7,
5 m⊕m"⊕m#*69+ F7, ;
m⊕m"⊕m#*697, ;
m⊕m"⊕m#*6F7,
5 ; .##+ .R
• The general form of empster)s Rule o& Combination is+
∑ m*A, m"*L,A ∩L 5 X
m⊕m"*X, 5 --------------------------
- κ
• )ote that κ 5 is undefined.
>
-
8/20/2019 Cs511 Uncertainty (1)
72/72
i&&iculty it% t%e empster,S%a&ert%eory
Ene difficulty with the !empster-2hafer theory occurswith normaliation and may lead to results which arecontrary to our expectation.
• The problem is that it ignores the belief that the
ob%ect being considered does not exist.
For example+ the beliefs by two doctors+ 9 and =+ in a patient&s illness are as follows
m9*meningitis, 5 .CC+ m9*brain tumor, 5 .m=*concussion, 5 .CC+ m=*brain tumor, 5 .
=oth doctors think there is a very low chance+ .+ of a brain tumor but greatly disagree on the ma%or
problem.
The !empster rule of combination gives a combined belief of in the brain tumor. The result is very
unexpected