lecture 4: variable elimination - github pages · lecture 4: variable elimination theo rekatsinas...
Post on 30-May-2020
8 Views
Preview:
TRANSCRIPT
CS839:ProbabilisticGraphicalModels
Lecture4:VariableElimination
TheoRekatsinas
1
Recap:P-maps,BNs,MRFs
2
Section1
• ADAGGisaperfectmap(P-map)foradistributionPifI(P)=I(G)
• ThereareindependenceassumptionsthatcanbecapturedbyaBNandnotanMRF.Vice-versa
PartiallyDirectedAcyclicGraphs
3
Section1
• A.K.A.chaingraphs• Nodescanbepartitionedintodisjointchaincomponents• Anedgewithinthesamecomponentsmustbeundirected• Anedgebetweentwonodesindifferentchaincomponentsmustbedirected
ProbabilisticInferenceandLearning
4
Section1
• Wediscussedcompactrepresentationsofprobabilitydistributions:GraphicalModels(GMs)• AGM(fullyparameterized)MdescribesauniqueprobabilitydistributionP• TaskswecanperformwithaGM:• Task1(Inference):WhatistheprobabilityPM(X|Y)?Y:evidence• Task2(Learning):HowdoweestimateaplausiblemodelM fromdataD?
• LearningreferstoobtainingapointestimateofM• Bayesian’sarelookingforP(M|D)whichisactuallyaninferenceproblem.• Missingdata:tocomputeapointestimateofMweneedtoperforminferencetoimputethemissingdata.
Likelihood
5
Section1
• Mostqueriesinvolveevidence• Evidencee isanassignmentofvaluestoasetEvariablesinthedomain• WithoutlossofgeneralityE={Xk+1,…,Xn}
• Example:computetheprobabilityofevidencee
• A.k.a.computethelikelihoodofe
P (e) =X
x1
· · ·X
xk
P (x1, . . . , xk
, e)
ConditionalProbability
6
Section1
• Oftenweareinterestedintheconditionalprobabilitydistributionofavariablegiventheevidence:
• AkaaposterioribeliefinX,givenevidencee• MostofthetimesweonlycareaboutasubsetY ofalldomainvariablesX={Y,Z}anddonotcareabouttheremainingZ,
• theprocessofsummingout“donotcare”variablesiscalledmarginalization,andtheresultingprobabilityiscalledamarginalprobability
P (X|e) = P (X, e)
P (e)=
P (X, e)Px
P (X = x, e)
P (Y|e) =X
z
P (Y,Z = z|e)
Applicationsofaposterioribelief
7
Section1
• Prediction:whatistheprobabilityofanoutcomegiventhestartingcondition
• Thequerynodeisadescendentoftheevidence
• Diagnosis:whatistheprobabilityofdisease/faultgivensymptoms
• Thequerynodeisanancestoroftheevidence• Learningunderpartialobservations• Expectationmaximizationalgorithm(laterinclass)
A B C
?
A B C
?
MPA:MostProbableAssignment
8
Section1
• Whatisthemostprobablejointassignment(MPA)forsomevariablesofinterest
• Weareagaingivensomeevidenceeandweignore(thevaluesof)“donotcare”variablesz
• Akamaximumaposterioriassignmentofy (MAPinference)
MPA(Y|e) = argmax
yP (y|e) = argmax
y
X
z
P (y, z|e)
ApplicationsofMPA
9
Section1
• Classification: findmostlikelylabel,givenevidence• Explanation:whatthemostlikelyscenario,giventheobservedevidence
• MAPstateofavariabledependsonthesetofvariablesthatarejointlyqueried
• Example:• MapofY1• Mapof(Y1,Y2)
Y1 Y2 P(Y1,Y2)0 0 0.050 1 0.351 0 0.31 1 0.3
ComplexityofInference
10
Section1
• Thm: ComputingP(X=x|e)inaGMisNP-hard
• However,formanypracticalinstancesofGMswecansolveinferenceefficiently(inpolynomialtime)• NP-hardnessimpliesthatwecannotfindageneralprocedurethatworksefficientlyforarbitraryGMs• ForcertainGMfamilieswehaveprovablyefficientexactinferenceprocedures.
ApproachestoInference
11
Section1
• Exactinference• Variableelimination• Message-passingalgorithm(sum-product,beliefpropagation)• Junctiontreealgorithms
• Approximateinferencetechniques• Stochasticsimulation/samplingmethods• MarkovchainMonteCarlomethods• Variational algorithms
Operations:MarginalizationandElimination
12
Section1
• ConsiderthefollowingGM:
• WhatisthelikelihoodthatEistrue?
• Query:P(e)
A B C D E
P (e) =X
d
X
c
X
b
X
a
P (a, b, c, d, e)
Operations:MarginalizationandElimination
13
Section1
• ConsiderthefollowingGM:
• WhatisthelikelihoodthatEistrue?
• Query:P(e)
A B C D E
P (e) =X
d
X
c
X
b
X
a
P (a, b, c, d, e)
Anaïvesummationneedstoenumerateoveranexponentialnumberofterms
Operations:MarginalizationandElimination
14
Section1
• ConsiderthefollowingGM:
• WhatisthelikelihoodthatEistrue?
• Query:P(e)
• Chaindecomposition
A B C D E
P (e) =X
d
X
c
X
b
X
a
P (a, b, c, d, e)
P (e) =X
d
X
c
X
b
X
a
P (a)P (b|a)P (c|b)P (d|c)P (e|d)
EliminationonChains
15
Section1
• ConsiderthefollowingGM:
• WhatisthelikelihoodthatEistrue?
• Reorderterms
A B C D E
P (e) =X
d
X
c
X
b
X
a
P (a)P (b|a)P (c|b)P (d|c)P (e|d)
=X
d
X
c
X
b
P (c|b)P (d|c)P (e|d)X
a
P (a)P (b|a)
EliminationonChains
16
Section1
• ConsiderthefollowingGM:
• Performinnermostsummation
• Thissummationeliminatesonevariablefromoursummationargumentatalocalcost
A B C D EXP (e) =
X
d
X
c
X
b
P (c|b)P (d|c)P (e|d)X
a
P (a)P (b|a)
=X
d
X
c
X
b
P (c|b)P (d|c)P (e|d)p(b)
EliminationonChains
17
Section1
• ConsiderthefollowingGM:
• Performinnermostsummation
A B C D EX XP (e) =
X
d
X
c
X
b
P (c|b)P (d|c)P (e|d)p(b)
=X
d
X
c
P (d|c)P (e|d)X
b
P (c|b)p(b)
=X
d
X
c
P (d|c)P (e|d)p(c)
EliminationonChains
18
Section1
• ConsiderthefollowingGM:
• Eliminatenodesone-by-oneallthewaytotheend
• Complexity:• ForeachstepwehaveO(|Dom(Xi)|*|Dom(Xi+1)|)operations:O(kn2)• ComparewithnaïveO(nk)
A B C D EX X X XP (e) =
X
d
P (e|d)p(d)
UndirectedChains
19
Section1
• Rearrangingterms…A B C D E
P (e) =X
d
X
c
X
b
X
a
1
Z�(b, a)�(c, b)�(d, c)�(e, d)
=1
Z
X
d
X
c
X
b
�(c, b)�(d, c)�(e, d)X
a
�(b, a)
= . . .
ConditionalRandomFields
20
Section1
TheSum-ProductOperation
21
Section1
• Ingeneral,wewanttocomputethevalueofanexpressionoftheform:
• whereFisasetoffactors
• Wecallthistaskthesum-productinferencetask.
X
z
Y
�2F
�
InferenceviaVariableElimination
22
Section1
• Generalidea:• Writequeryintheform
• Iteratively• Moveallirrelevanttermsoutsideofinnermostsum• Performinnermostsum,gettinganewterm• Insertthenewtermintotheproduct
• Backtooriginalquery
P (X1, e) =X
xn
· · ·X
x3
X
x2
Y
i
P (xi
|pai
)
P (X1|e) =�(X1, e)Px1
�(X1, e)
Outcomeofelimination
23
Section1
• LetXbesomesetofvariables• LetFbeasetoffactorssuchthatforeach• LetbeasetofqueryvariablesandZ =X– Ybethevariabletobeeliminated
• TheresultofeliminatingZisafactor
• Thisdoesnotnecessarilycorrespondtoanyprobabilityorconditionalprobability
� 2 F, Scope[�] 2 XY ⇢ X
⌧(Y) =X
z
Y
�2F
�
EvidenceandSum-Product
24
Section1
• Evidencepotential
• Totalevidencepotential
• Introducingevidence– restrictedfactors:
⌧(Y, e) =X
z,e
Y
�2F
� · �(E, e)
VariableEliminationAlgorithm
25
Section1
ProcedureElimination(G,//theGME,//evidenceZ,//setofvariablestobeeliminatedX,//queryvariable(s))
1. Initialize(G)2. Evidence(E)3. Sum-Product-Elimination(F,Z,)4. Normalization(F)
�
VariableEliminationAlgorithm
26
Section1
ProcedureInitialize(G,Z)1. LetbeanorderingofZsuchthatiff I<j2. InitializeFwiththefullsetoffactors
ProcedureEvidence(E)1.Foreach
Z1, . . . , Zk Zi � Zj
i 2 IE ,
F = F [ �(Ei, ei)
VariableEliminationAlgorithm
27
Section1
ProcedureSum-Product-Variable-Elimination(F,Z,)
1. Fori =1,…,k
2.3. return4. Normalization()
�
F Sum-Product-Eliminate-Var(F,Zi)
�⇤ Q
�F�
�⇤
�⇤
VariableEliminationAlgorithm
28
Section1
ProcedureSum-Product-Eliminate-Var (F,Z)
1.
2.
3.
4.
5.return
F
0 {� 2 F : Z 2 Scope[�]}F 00 F � F 0
Q
�2F 0 �
⌧ P
Z
F 00 [ {⌧}
VariableEliminationAlgorithm
29
Section1
ProcedureNormalization()
1.
�⇤
P (X|E) = �⇤(X)Px
�⇤(X)
VariableEliminationExample
30
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)
VariableEliminationExample
31
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)Step1:• Conditioning onevidence(fixHtoh)
• Sameasamarginalizationstep:pH(E,F ) = P (H = h|E,F )
pH(E,F ) =P
h0 P (H = h|E,F )�(h0 = h)
VariableEliminationExample
32
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)=>P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)pH(E,F)Step2:EliminateG
=>P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)p(E,F)=>P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A) pH(E,F)
pG(E) =P
g P (G = g|E) = 1
VariableEliminationExample
33
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)=>P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)pH(E,F)Step3:EliminateF
=>P(A)P(B)P(C|B)P(D|A)P(E|C,D)pF(A,E)
pH(E,A) =P
f P (F = f |A)pH(E,F )
VariableEliminationExample
34
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)=>P(A)P(B)P(C|B)P(D|A)P(E|C,D)pF(A,E)Step4:EliminateE
=>P(A)P(B)P(C|B)P(D|A)pE(A,C,D)
pE(A,C,D) =P
e P (E = e|C,D)pF (A,E)
VariableEliminationExample
35
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)=>P(A)P(B)P(C|B)P(D|A)pE(A,C,D)Step5:EliminateD
=>P(A)P(B)P(C|B)pD(A,C)
pD(A,C) =P
d P (D = d|A)pE(A,C,D)
VariableEliminationExample
36
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)=>P(A)P(B)P(C|B)pD(A,C)Step6:EliminateC
=>P(A)P(B)P(C|B)pC(A,B)
pC(A,B) =P
c P (C = c|B)pD(A,C)
VariableEliminationExample
37
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)=>P(A)P(B)pC(A,B)Step7:EliminateB
=>P(A)pB(A)
pB(A) =P
b P (B = b|A)pC(A,B)
VariableEliminationExample
38
Section1
Query:P(A|h)• Needtoeliminate:B,C,D,E,F,G,HInitialfactors:P(A)P(B)P(C|B)P(D|A)P(E|C,D)P(F|A)P(G|E)P(H|E,F)=>P(A)pB(A)Step8:Wrap-up
P (A, h) = P (A)pB(A), P (h) =P
a P (A = a)pB(A = a)
P (A|h) = P (A,h)P (h)
Complexityofvariableelimination
39
Section1
• Supportinoneeliminationstepwecompute
• Thisrequiresmultiplications
• Foreachvalueforwedokmultiplications
• additions
• Foreachvalueforwedo|Val(X)|additions
p
x
(y1, . . . , yk) =P
x
p
0x
(x, y1, . . . , yk)
p
0x
(x, y1, . . . , yk) =Q
k
i=1 pi(x,yci)
k · |V al(X)| ·Q
I |V al(yCi)|x, y1, . . . , yk
|V al(X)| ·Q
i |V al(yCi)|y1, . . . , yk
Complexityofvariableelimination
40
Section1
• Supportinoneeliminationstepwecompute
• Thisrequiresmultiplications
• Foreachvalueforwedokmultiplications
• additions
• Foreachvalueforwedo|Val(X)|additions
p
x
(y1, . . . , yk) =P
x
p
0x
(x, y1, . . . , yk)
p
0x
(x, y1, . . . , yk) =Q
k
i=1 pi(x,yci)
k · |V al(X)| ·Q
I |V al(yCi)|x, y1, . . . , yk
|V al(X)| ·Q
i |V al(yCi)|y1, . . . , yk
Complexityisexponential innumberofvariablesintheintermediatefactor
Summary
41
Section1
• Thesimpleeliminatealgorithmcapturesthekeyalgorithmicoperation
underlyingprobabilisticinference:
• Wetakeasumoverproductofpotentialfunctions
• Complexityofthealgorithmdependsonsizeofthesummandsthatappearin
thesequenceofthesummationoperations.
• ComputationalcomplexityoftheEliminatealgorithmcanbereducedtopurely
graph-theoreticconsiderations:treewidth (nextclass).
• Reasoningaboutthethreewidth wecandesignimprovedinferencealgorithms.
top related