Axioms and Algorithms for Inferences Involving Probabilistic Independence
Dan Geiger, Azaria Paz, and Judea Pearl, Information and Computation 91(1), March 1991, 128-141. Presentation by Guy Moses & Omer Weissbrod.
![Page 1: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/1.jpg)
Axioms and Algorithms for Inferences Involving Probabilistic Independence
Dan Geiger, Azaria Paz, and Judea Pearl, Information and Computation 91(1), March 1991, 128-141.
Presentation by Guy Moses & Omer Weissbrod
for the course 236372 - Bayesian Networks
Computer Science Faculty, Technion - winter 2009
partially based on the presentation by Ilan Gronau
![Page 2: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/2.jpg)
2
What’s ahead?
• Introduction - some definitions, notations and reminders.
• Proof of Completeness - “if it’s true, it can be proved”.
• Preparations for the Membership Algorithm - more definitions, and some theoretical groundwork.
• The Membership Algorithm - description, proof of correctness, complexity analysis.
![Page 3: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/3.jpg)
3
Definitions
• U (Universe) - a set of random variables with probability distribution P.
• X, Y - finite sets of random variables: X = {x1,…,xn}, Y = {y1,…,ym}
• P(X,Y) = P(X)·P(Y) - a short-hand notation for the equality:
Pr{x1=a1,…, xn=an, y1=b1, …, ym=bm} = Pr{x1=a1,…, xn=an} · Pr{y1=b1, …, ym=bm}
for every choice of a1, …, an, b1, …, bm
• (X,Y) - short-hand for P(X,Y) = P(X)·P(Y). This is called an independence statement.
*Note that X and Y are disjoint sets of variables (X ∩ Y = ∅).
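A minimal sketch of how an independence statement (X,Y) could be checked numerically. It assumes the joint distribution is given as a dict mapping full assignments (tuples, one value per variable) to probabilities; the helper names `marginal` and `satisfies` are illustrative and do not come from the paper or the slides.

```python
def marginal(joint, variables, subset):
    """Sum the joint distribution down to the variables listed in `subset`."""
    idx = [variables.index(v) for v in subset]
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def satisfies(joint, variables, X, Y, tol=1e-12):
    """True iff P(X,Y) = P(X)*P(Y) for every choice of values for X and Y."""
    X, Y = list(X), list(Y)
    pxy = marginal(joint, variables, X + Y)
    px = marginal(joint, variables, X)
    py = marginal(joint, variables, Y)
    # Value combinations absent from px or py have probability 0 on both sides.
    return all(abs(pxy.get(a + b, 0.0) - px[a] * py[b]) <= tol
               for a in px for b in py)


# Example: a and b are fair coins, c = a XOR b.
variables = ["a", "b", "c"]
joint = {(0, 0, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25, (1, 1, 0): 0.25}
print(satisfies(joint, variables, ["a"], ["b"]))       # (a, b) holds
print(satisfies(joint, variables, ["a", "b"], ["c"]))  # (ab, c) fails
```

In this example every pair of variables is independent, yet (ab, c) does not hold, which is why statements are defined over sets of variables rather than single variables.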
![Page 4: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/4.jpg)
4
Notations
• σ - a specific independence statement of the form (X,Y)
• Σ - a set of independence statements of the form (Xi,Yi): i = 1, …, k
• XY - short-hand notation for the union X ∪ Y
• P satisfies σ = (X,Y) means: P(X,Y) = P(X)·P(Y) for that specific P.
![Page 5: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/5.jpg)
5
Soundness and Completeness
Definitions:
• Σ ⊨ σ iff every distribution that satisfies Σ also satisfies σ.
• Σ ⊢ σ iff σ ∈ cl(Σ), i.e. there exists a derivation chain σ1,…,σn = σ s.t. for each j, either σj ∈ Σ or σj is derived by an axiom from the previous statements.
For a set of axioms A:
• Soundness: A is sound iff for every Σ and σ: Σ ⊢ σ ⇒ Σ ⊨ σ
• Completeness: A is complete iff for every Σ and σ: Σ ⊨ σ ⇒ Σ ⊢ σ
Completeness - alternative definition:
A is complete iff for every Σ and every σ ∉ cl(Σ) there exists a distribution P that satisfies cl(Σ) and does not satisfy σ.
![Page 6: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/6.jpg)
6
Independence Axioms
We saw (in the 1st lecture) that axioms 1a-1d are sound (they always infer correctly).
Today we’ll show they are complete (they can derive every true statement).
$$
\begin{aligned}
&\text{1a Trivial Independence:} && (X, \emptyset) \\
&\text{1b Symmetry:} && (X, Y) \Rightarrow (Y, X) \\
&\text{1c Decomposition:} && (X, YZ) \Rightarrow (X, Y) \\
&\text{1d Mixing:} && (X, Y) \wedge (XY, Z) \Rightarrow (X, YZ)
\end{aligned}
$$
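To make the four axioms concrete, here is a rough brute-force sketch that computes cl(Σ) by applying axioms 1a-1d until nothing new appears. It is exponential and only meant for tiny universes. The representation (a statement as a pair of frozensets, single-character variable names) and the function names are assumptions made for illustration; this is not the membership algorithm presented later.

```python
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    s = sorted(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]


def closure(sigma_set, universe):
    """Brute-force cl(Sigma) under axioms 1a-1d (tiny universes only)."""
    closed = {(frozenset(x), frozenset(y)) for x, y in sigma_set}
    closed |= {(x, frozenset()) for x in subsets(universe)}   # 1a trivial independence
    changed = True
    while changed:
        new = set()
        for x, y in closed:
            new.add((y, x))                                   # 1b symmetry
            new.update((x, y2) for y2 in subsets(y))          # 1c decomposition
            for xy, z in closed:                              # 1d mixing
                if xy == x | y:
                    new.add((x, y | z))
        changed = not new <= closed
        closed |= new
    return closed


# Example: from (a,b) and (ab,c), mixing yields (a,bc).
cl = closure({("a", "b"), ("ab", "c")}, "abc")
print((frozenset("a"), frozenset("bc")) in cl)
```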
![Page 7: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/7.jpg)
7
What’s ahead?
• Introduction - some definitions, notations and reminders.
• Proof of Completeness - “if it’s true, it can be proved”.
• Preparations for the Membership Algorithm - more definitions, and some theoretical groundwork.
• The Membership Algorithm - description, proof of correctness, complexity analysis.
![Page 8: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/8.jpg)
8
Minimal Statement
• Definition: σ=(X,Y) ∉ cl(Σ) is minimal if for every non-empty X’,Y’ s.t. X’ ⊆ X, Y’ ⊆ Y, X’Y’ ≠ XY we have (X’,Y’) ∈ cl(Σ).
• For every σ=(X,Y) ∉ cl(Σ) we can find an appropriate minimal σ’=(X’,Y’) ∉ cl(Σ) through iterative decomposition.
• Observation: P satisfies σ ⇒ P satisfies σ’ (decomposition soundness). Therefore: P doesn’t satisfy σ’ ⇒ P doesn’t satisfy σ.
• Our plan: Given an arbitrary σ ∉ cl(Σ), we will find a distribution P that satisfies cl(Σ) but doesn’t satisfy σ’. This will prove completeness (using the alternative completeness definition and the observation above).
• To simplify notation, we will assume WLOG that σ=(X,Y) is already minimal.
![Page 9: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/9.jpg)
9
Completeness Proof
Let σ=(X,Y) ∉ cl(Σ) be a minimal statement where X={x1,…,xn}, Y={y1,…,ym}, and Z={z1,z2,…,zk} stands for the rest of the variables in U.
We will construct P as follows: all variables except x1 are fair coins (probability ½ for each of their two values), and x1 is defined thus:
$$ x_1 = \Big( \sum_{i=2}^{n} x_i + \sum_{j=1}^{m} y_j \Big) \bmod 2 $$
Part 1: P does not satisfy σ
We will inspect the following scenario: x1=1, all other variables are 0.
$$ \underbrace{P(x_1,\dots,x_n,y_1,\dots,y_m)}_{=0} \;\neq\; \underbrace{P(x_1,\dots,x_n)}_{=0.5^n} \cdot \underbrace{P(y_1,\dots,y_m)}_{=0.5^m} $$
Therefore, P does not satisfy σ, as required.
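The construction can be checked on a small instance. The sketch below builds the parity distribution for X={x1,x2}, Y={y1}, Z={z1} and tests a few statements by enumeration. The variable names, the dictionary encoding of P and the helpers `parity_distribution`, `prob` and `independent` are assumptions made for this illustration only.

```python
from itertools import product


def parity_distribution(X, Y, Z):
    """All variables except X[0] are fair coins; X[0] is the parity of the other X and Y variables."""
    free = X[1:] + Y + Z
    joint = {}
    for bits in product((0, 1), repeat=len(free)):
        value = dict(zip(free, bits))
        value[X[0]] = sum(value[v] for v in X[1:] + Y) % 2
        joint[tuple(sorted(value.items()))] = 0.5 ** len(free)
    return joint


def prob(joint, event):
    """Probability that every variable in `event` takes the given value."""
    return sum(p for world, p in joint.items()
               if all(dict(world)[v] == val for v, val in event.items()))


def independent(joint, V, W):
    """True iff P(V,W) = P(V)*P(W) for every binary assignment to V and W."""
    for bits in product((0, 1), repeat=len(V) + len(W)):
        ev = dict(zip(V + W, bits))
        pv = prob(joint, {v: ev[v] for v in V})
        pw = prob(joint, {w: ev[w] for w in W})
        if abs(prob(joint, ev) - pv * pw) > 1e-12:
            return False
    return True


P = parity_distribution(["x1", "x2"], ["y1"], ["z1"])
print(independent(P, ["x1", "x2"], ["y1"]))  # sigma = (X, Y) itself
print(independent(P, ["x1"], ["y1"]))        # a proper sub-statement of sigma
print(independent(P, ["x1", "y1"], ["z1"]))  # one side contains only Z variables
```

On this instance σ itself fails while the proper sub-statements and the statements involving only Z on one side hold, which matches Part 1 and the scenarios of Part 2.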
![Page 10: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/10.jpg)
10
Completeness Proof - cont’d
Part 2: P satisfies cl(Σ)
Let (V,W) ∈ cl(Σ). We will show that P(V,W)=P(V)·P(W). This is done by inspecting different scenarios:
Scenario 1: either V or W contains only elements of Z. We will assume WLOG that W contains only elements of Z.
All variables in Z are independent under P (of each other and of all other variables) and therefore:
$$ P(VW) = P(V) \cdot \prod_{z_i \in W} P(z_i) = P(V) \cdot P(W) $$
[Figure: diagram of the variables in X, Y and Z, with the sets V and W marked.]
![Page 11: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/11.jpg)
11
Completeness Proof - cont’d
Part 2: P satisfies cl(Σ)
Let (V,W) ∈ cl(Σ). We will show that P(V,W)=P(V)·P(W). This is done by inspecting different scenarios:
Scenario 2: Both V and W contain elements of X ∪ Y, but V ∪ W doesn’t contain all elements of X ∪ Y.
Without full information about the assignments of the variables in X ∪ Y, x1 could turn out to be 0 or 1 with probability ½, and therefore:
$$ P(VW) = 0.5^{|V \cup W|} = 0.5^{|V| + |W|} = 0.5^{|V|} \cdot 0.5^{|W|} = P(V) \cdot P(W) $$
[Figure: diagram of the variables in X, Y and Z, with the sets V and W marked.]
![Page 12: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/12.jpg)
12
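Completeness Proof - cont’d
Part 2: P satisfies cl(Σ) - continued
Scenario 3: Both V and W contain elements of X ∪ Y, and (X ∪ Y) ⊆ (V ∪ W).
We will show a derivation chain for σ=(X,Y), contradicting our original assumption that σ ∉ cl(Σ):
Mark: (V,W) = (X_V Y_V Z_V, X_W Y_W Z_W) ∈ cl(Σ)
where: Y = Y_V Y_W, X = X_V X_W, Z_V Z_W ⊆ Z, V = X_V Y_V Z_V, W = X_W Y_W Z_W
Remove all z’s by decomposition: (X_V Y_V, X_W Y_W) ∈ cl(Σ)
Due to minimality of σ=(X,Y): (X_V,Y_V) ∈ cl(Σ) and (X_W,Y) ∈ cl(Σ)
$$
\begin{aligned}
(X_V, Y_V) \wedge (X_V Y_V, X_W Y_W) &\overset{\text{mix}}{\Longrightarrow} (X_V, Y_V X_W Y_W) = (X_V, X_W Y) \\
(X_W, Y) \wedge (X_W Y, X_V) &\overset{\text{mix}}{\Longrightarrow} (Y, X_V X_W) = (Y, X) = \sigma
\end{aligned}
$$
(Both mixing steps are applied up to symmetry.)
[Figure: diagram of the variables in X, Y and Z, with the sets V and W marked.]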
![Page 13: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/13.jpg)
13
Completeness Proof - Summary
Reminder: Completeness - alternative definition:
A is complete iff for every Σ and every σ ∉ cl(Σ) there exists a distribution P that satisfies cl(Σ) and does not satisfy σ.
We’ve shown: given a minimal σ ∉ cl(Σ), there exists a distribution P that obeys:
1. P does not satisfy σ.
2. P satisfies Σ (indeed, all of cl(Σ)).
Given a non-minimal σ ∉ cl(Σ), we derive its minimal statement σ’ and devise a distribution P’ that satisfies Σ but does not satisfy σ’. Due to soundness of decomposition, P’ cannot satisfy σ either.
![Page 14: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/14.jpg)
14
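Scope of Completeness
The proof uses P, a binary p.d. (probability distribution function). Therefore the axioms are complete for binary p.d.’s, and for any class of p.d.’s that contains them (discrete p.d.’s, all p.d.’s over U).
[Figure: diagram of classes of distributions: all p.d.’s over U, discrete p.d.’s, binary p.d.’s and normal p.d.’s, with the point P marked.]
However, for normal p.d.’s, the axiom set 1a-1d is not complete. A stronger axiom is required:
replace:
$$ \text{1d Mixing:} \quad I(X,Y)_P \wedge I(XY,Z)_P \Rightarrow I(X,YZ)_P $$
with:
$$ \text{1d' Composition:} \quad I(X,Y)_P \wedge I(X,Z)_P \Rightarrow I(X,YZ)_P $$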
![Page 15: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/15.jpg)
15
What’s ahead?
• Introduction - some definitions, notations and reminders.
• Proof of Completeness - “if it’s true, it can be proved”.
• Preparations for the Membership Algorithm - more definitions, and some theoretical groundwork.
• The Membership Algorithm - description, proof of correctness, complexity analysis.
![Page 16: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/16.jpg)
16
Some more Definitions and Tools
Definition: Span
span(σ): the set of elements represented in statement σ.
Example: span((x1x2, x3x4)) = {x1,x2,x3,x4}
span(Σ): the set of elements represented in all statements of Σ.
Example: span({(x1,x2),(x1,x3)}) = {x1,x2,x3}
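As a small illustration, span can be written in a few lines if a statement is represented as a pair of sets of variable names. That representation and the names `span` and `span_of_set` are assumptions of this sketch.

```python
def span(statement):
    """Variables appearing on either side of a single statement."""
    left, right = statement
    return frozenset(left) | frozenset(right)


def span_of_set(statements):
    """Variables appearing in any statement of the set."""
    return frozenset().union(*(span(s) for s in statements))


print(sorted(span(({"x1", "x2"}, {"x3", "x4"}))))
print(sorted(span_of_set([({"x1"}, {"x2"}), ({"x1"}, {"x3"})])))
```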
![Page 17: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/17.jpg)
17
Some more Definitions and Tools
Definition: Projection
The projection of σ on X, denoted σ(X), is the statement derived from σ by removing all elements not in X from σ.
Example:
if σ=(x1x2x3, x4x5) and X={x2,x3,x4} then σ(X)=(x2x3, x4).
The projection of Σ on X, denoted Σ(X), is {σ(X) | σ ∈ Σ}.
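Projection is equally short under the same assumed representation (a statement as a pair of sets of variable names). The names `project` and `project_set` are illustrative.

```python
def project(statement, keep):
    """Remove from both sides of the statement every variable not in `keep`."""
    left, right = statement
    keep = frozenset(keep)
    return (frozenset(left) & keep, frozenset(right) & keep)


def project_set(statements, keep):
    """Project every statement of the set; duplicates collapse."""
    return {project(s, keep) for s in statements}


sigma = ({"x1", "x2", "x3"}, {"x4", "x5"})
left, right = project(sigma, {"x2", "x3", "x4"})
print(sorted(left), sorted(right))
```

Note that a projected statement may become trivial (one side empty) when all the variables of that side are removed.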
![Page 18: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/18.jpg)
18
Some more Definitions and Tools
Projection Lemma: Σ ⊢ σ iff Σ’ ⊢ σ, where Σ’ = Σ(span(σ))
(⇐) If Σ’ ⊢ σ then clearly Σ ⊢ σ, because all the statements in Σ’ can be derived from the statements in Σ by decomposition.
![Page 19: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/19.jpg)
19
Some more Definitions and Tools
Projection Lemma: Σ ⊢ σ iff Σ’ ⊢ σ, where Σ’ = Σ(span(σ)), s = span(σ)
(⇒) If Σ ⊢ σ then there is a derivation chain for σ: σ1, σ2, … , σk.
For each j:
if σi ⊢ σj, i<j (by symmetry or decomposition),
then σi(s) ⊢ σj(s) by symmetry or decomposition respectively.
Similarly,
if σj is derived from σi and σl by mixing,
then σj(s) is derived from σi(s), σl(s) by mixing.
Since every σj ∈ Σ projects to σj(s) ∈ Σ’, the projected chain σ1(s), …, σk(s) is a derivation chain from Σ’ that ends in σk(s) = σ.
![Page 20: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/20.jpg)
20
Some more Definitions and Tools
Projection Lemma: Σ ⊢ σ iff Σ’ ⊢ σ, where Σ’ = Σ(span(σ)), s = span(σ)
Observations from the projection lemma:
• Variables not in σ are unnecessary for determining whether Σ ⊢ σ.
• The problem of verifying whether Σ ⊢ σ can be simplified to the problem of verifying whether Σ’ ⊢ σ, where Σ’ = Σ(span(σ)).
• This problem can be solved with a possibly reduced time and space complexity.
![Page 21: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/21.jpg)
21
Conditions for Inference of Independence
Main claim: for a given σ, we have Σ’ ⊢ σ iff:
1. σ is trivial: σ=(X,∅) (up to symmetry)
OR
2. σ is in Σ’: σ ∈ Σ’ (up to symmetry)
OR
3. σ is derivable from Σ’:
there exists σ’ ∈ Σ’ s.t. span(σ) = span(σ’)
and for σ’=(AP,BQ), σ=(AQ,BP) (A,B,Q,P may be empty):
Σ’ ⊢ (A,P), Σ’ ⊢ (B,Q) (up to symmetry)
![Page 22: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/22.jpg)
22
Proof of Main Claim
Main claim: for a given σ, we have Σ’ ⊢ σ iff:
1. σ is trivial*: σ=(X,∅) *up to symmetry
2. σ is in* Σ’: σ ∈ Σ’
3. σ is derivable* from Σ’: ∃σ’ ∈ Σ’ s.t. span(σ) = span(σ’)
and for σ’=(AP,BQ), σ=(AQ,BP): Σ’ ⊢ (A,P), Σ’ ⊢ (B,Q)
(⇐) If 1. σ is trivial* OR 2. σ is in* Σ’, then the proof is immediate.
Otherwise,
3. there exists σ’ ∈ Σ’ s.t. span(σ) = span(σ’)
and for σ’=(AP,BQ), σ=(AQ,BP): Σ’ ⊢ (A,P), Σ’ ⊢ (B,Q)
We will show a constructive proof under these conditions.
![Page 23: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/23.jpg)
23
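Proof of Main Claim
Main claim: for a given σ, we have Σ’ ⊢ σ iff:
1. σ is trivial*: σ=(X,∅) *up to symmetry
2. σ is in* Σ’: σ ∈ Σ’
3. σ is derivable* from Σ’: ∃σ’ ∈ Σ’ s.t. span(σ) = span(σ’)
and for σ’=(AP,BQ), σ=(AQ,BP): Σ’ ⊢ (A,P), Σ’ ⊢ (B,Q)
(⇐) (contd.) Given that Σ’ ⊢ (AP,BQ), Σ’ ⊢ (A,P), Σ’ ⊢ (B,Q):
$$
\begin{aligned}
1.\ & (A,P) \wedge (AP,BQ) \overset{\text{mix}}{\Longrightarrow} (A,PBQ) \\
2.\ & (B,Q) \wedge (AP,BQ) \overset{\text{mix}}{\Longrightarrow} (APB,Q) \overset{\text{dec.}}{\Longrightarrow} (PB,Q) \\
3.\ & (PB,Q) \wedge (A,PBQ) \overset{\text{mix}}{\Longrightarrow} (AQ,PB) = (AQ,BP) = \sigma
\end{aligned}
$$
(Each step is applied up to symmetry.)
We’ve proven this direction.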
![Page 24: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/24.jpg)
24
Proof of Main Claim
Main claim: for a given σ, we have Σ’ ⊢ σ iff:
1. σ is trivial*: σ=(X,∅) *up to symmetry
2. σ is in* Σ’: σ ∈ Σ’
3. σ is derivable* from Σ’: ∃σ’ ∈ Σ’ s.t. span(σ) = span(σ’)
and for σ’=(AP,BQ), σ=(AQ,BP): Σ’ ⊢ (A,P), Σ’ ⊢ (B,Q)
(⇒) Given Σ’ ⊢ σ, if 1. σ is trivial* OR 2. σ is in* Σ’, then the proof is immediate.
Otherwise, since no axiom can add new variables to a statement, there must exist σ’ ∈ Σ’ s.t. span(σ) = span(σ’) in the derivation chain of σ.
Also, by decomposition (dec.):
σ = (AQ,BP) ⇒ (A,P)
σ = (AQ,BP) ⇒ (Q,B)
so Σ’ ⊢ (A,P) and Σ’ ⊢ (B,Q).
![Page 25: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/25.jpg)
25
Conclusions from Claim
• We’ve seen that, after discarding unneeded variables, it is possible to tell whether Σ’ ⊢ σ (when it’s not immediately obvious) by:
a. Finding another statement σ’ ∈ Σ’ for which span(σ) = span(σ’),
b. Verifying that Σ’ ⊢ (A,P), Σ’ ⊢ (B,Q) when σ’=(AP,BQ), σ=(AQ,BP).
• This suggests using a recursive “divide and conquer” approach.
![Page 26: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/26.jpg)
26
What’s ahead?
• Introduction - some definitions, notations and reminders.
• Proof of Completeness - “if it’s true, it can be proved”.
• Preparations for the Membership Algorithm - more definitions, and some theoretical groundwork.
• The Membership Algorithm - description, proof of correctness, complexity analysis.
![Page 27: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/27.jpg)
27
The Membership Algorithm
Procedure Find(Σ,σ):
1. set Σ’ := Σ(span(σ)).
2. if σ is trivial, or σ ∈ Σ’ (up to symmetry), then Find(Σ,σ) := TRUE.
3. else if for all non-trivial σ’ ∈ Σ’: span(σ) ≠ span(σ’), then Find(Σ,σ) := FALSE.
4. else there exists σ’ ∈ Σ’: span(σ) = span(σ’), and σ’=(AP,BQ), σ=(AQ,BP);
set σ1 := (A,P), σ2 := (B,Q).
Find(Σ,σ) := (Find(Σ’,σ1) ∧ Find(Σ’,σ2))
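The four steps of Find can be sketched directly in Python. This is a minimal sketch under assumptions that are not in the slides: a statement is a pair of iterables of variable names (single-character names in the example), Σ is a set of such pairs, and σ’ in step 4 is simply the first suitable statement encountered.

```python
def find(sigma_set, sigma):
    """Return True iff sigma is derivable from sigma_set by axioms 1a-1d."""
    X, Y = (frozenset(side) for side in sigma)
    s = X | Y
    # Step 1: project every statement of Sigma onto span(sigma).
    projected = {(frozenset(a) & s, frozenset(b) & s) for a, b in sigma_set}
    # Step 2: sigma is trivial, or appears in Sigma' up to symmetry.
    if not X or not Y or (X, Y) in projected or (Y, X) in projected:
        return True
    # Steps 3-4: look for a non-trivial sigma' in Sigma' with the same span.
    for L, R in projected:
        if L and R and (L | R) == s:
            # sigma' = (AP, BQ) and sigma = (AQ, BP)
            A, P = L & X, L & Y
            B, Q = R & Y, R & X
            return find(projected, (A, P)) and find(projected, (B, Q))
    return False  # step 3: no statement of Sigma' spans all of sigma


sigma_set = {("a", "b"), ("ab", "c")}
print(find(sigma_set, ("a", "bc")))   # derivable by mixing
print(find({("a", "b")}, ("a", "c"))) # not derivable
```

Each recursive call works on a strictly smaller span, because both sides of σ’ are non-empty, so the recursion terminates.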
![Page 28: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/28.jpg)
28
Algorithm Correctness Proof
We will prove that Find(Σ,σ) := TRUE ⇔ σ ∈ cl(Σ) by induction on k=|σ| (the number of variables in σ).
Induction base: if k=1 then σ is trivial, therefore the algorithm will return TRUE in step 2 and σ ∈ cl(Σ).
![Page 29: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/29.jpg)
29
Algorithm Correctness Proof
Induction assumption: Find(Σ,σ’) := TRUE ⇔ σ’ ∈ cl(Σ) for each |σ’|<k.
Induction step: Find(Σ,σ) := TRUE iff either:
1. Step 2 returns TRUE ⇔ σ is trivial or σ ∈ Σ’ ⇒ σ ∈ cl(Σ).
2. Step 4 returns TRUE
iff (according to algorithm’s definition)
Find(Σ’,σ1) := TRUE ∧ Find(Σ’,σ2) := TRUE
iff (according to induction assumption)
σ1 ∈ cl(Σ’) ∧ σ2 ∈ cl(Σ’)
iff (according to main claim)
σ ∈ cl(Σ’)
iff (according to projection lemma)
σ ∈ cl(Σ)
![Page 30: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/30.jpg)
30
Complexity Analysis
Definitions:
n = the number of distinct variables in Σ.
k = the number of distinct variables in σ.
• First projection cost: O(|Σ|·n) - happens only once.
• Recursive step: T(k) ≤ |Σ|·k + T(k1) + T(k2), where k1+k2=k, k1=|σ1|, k2=|σ2|
• Can be shown by induction: T(k) ≤ |Σ|·k·(depth of recursion)
• Worst case analysis: T(k) ≤ |Σ|·k·k = |Σ|·k²
• Total run time is bounded by: O(|Σ|·n + |Σ|·k²), which is also O(|Σ|·n²) since k ≤ n.
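The induction behind the bound can be spelled out as follows. This is a sketch of one way to carry it out: d(k) denotes the depth of the recursion tree below a call of size k, and a call that does not recurse is assumed to cost at most |Σ|·k (both are notational assumptions, not taken from the slides).

```latex
\begin{aligned}
T(k) &\le |\Sigma|\,k + T(k_1) + T(k_2), \qquad k_1 + k_2 = k,\quad k_1, k_2 \ge 1 \\
     &\le |\Sigma|\,k + |\Sigma|\,k_1\,d(k_1) + |\Sigma|\,k_2\,d(k_2)
        && \text{(induction assumption)} \\
     &\le |\Sigma|\,k + |\Sigma|\,(k_1 + k_2)\,\max\{d(k_1), d(k_2)\} \\
     &= |\Sigma|\,k\,\bigl(1 + \max\{d(k_1), d(k_2)\}\bigr) = |\Sigma|\,k\,d(k) \\
     &\le |\Sigma|\,k^2 && \text{since } d(k) \le k .
\end{aligned}
```

The last step uses that each recursive call strictly reduces the number of variables, so the depth cannot exceed k.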
![Page 31: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/31.jpg)
31
Improvements and Variations
• Instead of arbitrarily choosing σ’, find one whose sub-statements {A,B,P,Q} have balanced size (can improve run-time complexity).
• Using the derivation chain presented in the constructive proof, the algorithm can also return a derivation chain for σ with a length of O(k).
![Page 32: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/32.jpg)
32
Variations (contd.)
The algorithm can be expanded into a polynomial algorithm for the following problems:
• Given two sets Σ1 and Σ2, is cl(Σ1) ⊆ cl(Σ2)? Is cl(Σ1) = cl(Σ2)?
• Minimize the size of Σ while preserving cl(Σ):
Start with a maximal-size statement σ and remove from Σ all statements derivable from it. Repeat with the next largest statement, etc.
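The closure-comparison problem can be sketched on top of a membership test: cl(Σ1) ⊆ cl(Σ2) holds exactly when every statement of Σ1 is derivable from Σ2. The block below repeats a compact version of the membership procedure so that it runs on its own. The representation of statements as pairs of strings of single-character variable names and the names `find`, `closure_subset` and `closure_equal` are assumptions of this sketch.

```python
def find(sigma_set, sigma):
    """Membership test for sigma in cl(sigma_set), following the four steps of Find."""
    X, Y = (frozenset(side) for side in sigma)
    s = X | Y
    projected = {(frozenset(a) & s, frozenset(b) & s) for a, b in sigma_set}
    if not X or not Y or (X, Y) in projected or (Y, X) in projected:
        return True
    for L, R in projected:
        if L and R and (L | R) == s:
            return (find(projected, (L & X, L & Y))
                    and find(projected, (R & Y, R & X)))
    return False


def closure_subset(sigma1, sigma2):
    """cl(Sigma1) is contained in cl(Sigma2) iff Sigma2 derives every statement of Sigma1."""
    return all(find(sigma2, s) for s in sigma1)


def closure_equal(sigma1, sigma2):
    """The closures coincide iff each set derives all statements of the other."""
    return closure_subset(sigma1, sigma2) and closure_subset(sigma2, sigma1)


s1 = {("a", "bc")}
s2 = {("a", "b"), ("ab", "c")}
print(closure_subset(s1, s2))  # (a,bc) follows from s2 by mixing
print(closure_equal(s1, s2))   # (ab,c) does not follow from (a,bc) alone
```

Each check calls the membership test once per statement, so the cost stays polynomial in the sizes of the two sets.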
![Page 33: Axioms and Algorithms for Inferences Involving Probabilistic Independence](https://reader035.vdocument.in/reader035/viewer/2022070418/5681594b550346895dc68847/html5/thumbnails/33.jpg)
Thank you!