Semantically-Linked Bayesian Networks: A Framework for Probabilistic Inference Over Multiple Bayesian Networks

PhD Dissertation Defense
Advisor: Dr. Yun Peng

Rong Pan
Department of Computer Science and Electrical Engineering
University of Maryland Baltimore County
Aug 2, 2006
Outline
► Motivations
► Background
► Overview
► How Knowledge is Shared
► Inference on SLBN
► Concept Mapping using SLBN
► Future Works
Motivations (1)
► Separately developed BNs about
  - related domains
  - different aspects of the same domain
  - …
Motivations (2)
► Existing approach: Multiply Sectioned Bayesian Networks (MSBN)
  - Every subnet is sectioned from a global BN
  - Strictly consistent subnets
  - Exactly identical shared variables with the same distribution
  - All parents of a shared variable must appear in one subnet
(Figure: sectioning a global BN into subnets)
Motivations (3)
► Existing approach: Agent Encapsulated Bayesian Networks (AEBN)
  - Distributed BN model for a specific application
  - Hierarchical global structure
  - Very restricted expressiveness
  - Exactly identical shared variables with different prior distributions
(Figure: an agent with output, input, and local variables)
Motivations (4)
► A distributed BN model is needed with these features:
  - Uncertainty reasoning over separately developed BNs
  - Variables shared by different BNs can be similar but not identical
  - Principled and well justified
  - Supports various applications
Background: Bayesian Network
► DAG
► Variables with finite states
► Edges: causal influences
► Conditional Probability Table (CPT)
(Figure: example BN with CPT P(A | πA) = (0.35, 0.65; 0.8, 0.2))
Background: Evidences in BN
(Figure: beliefs of Mammal and Male_Mammal under four kinds of evidence)

  Original BN:                                P(Mammal) = (80.0, 20.0),  P(Male_Mammal) = (40.0, 60.0)
  Hard evidence, Male_Mammal = True:          P(Mammal) = (100, 0),      P(Male_Mammal) = (100, 0)
  Soft evidence, Q(Male_Mammal) = (0.5, 0.5): P(Mammal) = (83.3, 16.7),  P(Male_Mammal) = (50.0, 50.0)
  Virtual evidence, L(Male_Mammal) = 0.8/0.2: P(Mammal) = (90.9, 9.1),   P(Male_Mammal) = (72.7, 27.3)
  Virtual evidence equivalent to the soft evidence: L(Male_Mammal) = 0.3/0.2
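The virtual-evidence numbers on this slide can be reproduced in a few lines of Python. This is an illustrative sketch (the function and variable names are mine, not part of the slides): a virtual evidence with likelihood ratio L is applied by multiplying the prior by L and renormalizing.

```python
# Virtual evidence update on a single variable:
# posterior(x) is proportional to prior(x) * L(x), where L is the likelihood ratio.

def virtual_evidence(prior, likelihood):
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# P(Male_Mammal) = (0.4, 0.6) in the original BN; virtual evidence
# L(Male_Mammal) = 0.8/0.2 yields (0.727, 0.273), as on the slide.
post = virtual_evidence([0.4, 0.6], [0.8, 0.2])
print([round(p, 3) for p in post])  # [0.727, 0.273]

# The ratio L = 0.3/0.2 reproduces the soft evidence Q(Male_Mammal) = (0.5, 0.5).
post2 = virtual_evidence([0.4, 0.6], [0.3, 0.2])
print([round(p, 3) for p in post2])  # [0.5, 0.5]
```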
Background: Jeffrey's Rule (Soft Evidence)
► Given an external observation Q(B), the rest of the BN is updated by Jeffrey's rule:

  Q(A) = Σᵢ P(A | bᵢ) Q(bᵢ)

  where P(A | bᵢ) is the conditional probability before the evidence and Q(bᵢ) is the soft evidence.
► Multiple soft evidences
  - Problem: updating one variable's distribution to its target value can push other variables' distributions off their targets
  - Solution: IPFP
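Jeffrey's rule above is just a Q(B)-weighted mixture of the pre-evidence conditionals. A minimal Python sketch (names are mine), using the Mammal example from the earlier slide and assuming, as that example does, that every male mammal is a mammal:

```python
# Jeffrey's rule: Q(A) = sum_i P(A | b_i) * Q(b_i).
# cond_a_given_b[i] is the list P(A | b_i) over the states of A.

def jeffrey_update(cond_a_given_b, q_b):
    n_a = len(cond_a_given_b[0])
    return [sum(cond_a_given_b[i][a] * q_b[i] for i in range(len(q_b)))
            for a in range(n_a)]

# P(Mammal = t | Male_Mammal = t) = 1.0, P(Mammal = t | Male_Mammal = f) = 2/3
# (derived from P(Mammal) = (0.8, 0.2), P(Male_Mammal) = (0.4, 0.6));
# soft evidence Q(Male_Mammal) = (0.5, 0.5) gives Q(Mammal) = (0.833, 0.167).
q_mammal = jeffrey_update([[1.0, 0.0], [2/3, 1/3]], [0.5, 0.5])
print([round(x, 3) for x in q_mammal])  # [0.833, 0.167]
```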
Background: Iterative Proportional Fitting Procedure (IPFP)
► Q₀: initial distribution over the set of variables X
► {P(Sᵢ)}: a consistent set of n marginal probability distributions, where Sᵢ ⊆ X
► The IPFP process:

  Qᵢ(X) = Qᵢ₋₁(X) · P(Sⱼ) / Qᵢ₋₁(Sⱼ)   if Qᵢ₋₁(Sⱼ) > 0
  Qᵢ(X) = 0                            otherwise

  where i is the iteration number and j = (i − 1) mod n + 1
► The distribution after IPFP satisfies the given constraints {P(Sᵢ)} and has minimum cross-entropy to the initial distribution Q₀
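The IPFP update above can be sketched in Python for a joint distribution stored as a dict over full assignments. This is an illustrative implementation of the textbook procedure, not the dissertation's code: each step rescales Q so that one marginal matches its constraint, cycling through the constraints.

```python
from itertools import product

def marginal(q, vars_all, subset):
    """Marginalize a joint dict {assignment tuple: prob} onto `subset`."""
    idx = [vars_all.index(v) for v in subset]
    m = {}
    for assign, p in q.items():
        key = tuple(assign[i] for i in idx)
        m[key] = m.get(key, 0.0) + p
    return m

def ipfp(q, vars_all, constraints, iters=100):
    """constraints: list of (subset, target marginal dict). Each step applies
    Q_i = Q_{i-1} * P(S_j) / Q_{i-1}(S_j) for one constraint j."""
    for step in range(iters):
        subset, target = constraints[step % len(constraints)]
        idx = [vars_all.index(v) for v in subset]
        cur = marginal(q, vars_all, subset)
        for assign in q:
            key = tuple(assign[j] for j in idx)
            q[assign] = q[assign] * target[key] / cur[key] if cur[key] > 0 else 0.0
    return q

# Two binary variables with constraints Q(A) = (0.6, 0.4), Q(B) = (0.5, 0.5),
# starting from the uniform joint.
vars_all = ['A', 'B']
q = {(a, b): 0.25 for a, b in product([0, 1], repeat=2)}
q = ipfp(q, vars_all, [(['A'], {(0,): 0.6, (1,): 0.4}),
                       (['B'], {(0,): 0.5, (1,): 0.5})])
```

After convergence both marginal constraints hold, and the result is the minimum cross-entropy distribution relative to the uniform starting point.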
SLBN: Overview (1)
► Semantically-Linked Bayesian Networks (SLBN): a theoretical framework that supports probabilistic inference over separately developed BNs
(Figure: two BNs linked at similar variables; global knowledge spans the linked pair)
SLBN: Overview (2)
► Features
  - Inference over separate BNs that share semantically similar variables
  - Global knowledge: J-graph
  - Principled and well justified
► In SLBN
  - BNs are linked at the similar variables
  - Probabilistic influences are propagated via the shared variables
  - The inference process uses soft evidence (Jeffrey's rule), virtual evidence, IPFP, and traditional BN inference
How knowledge is shared: Semantic Similarity (1)
What is similarity?
Similar — Pronunciation: 'si-m&-l&r, 'sim-l&r. Function: adjective.
1: having characteristics in common
2: alike in substance or essentials
3: not differing in shape but only in size or position
— www.merriam-webster.com

Examples: High-tech Company Employee vs. High-income People; Computer Keyboard vs. Typewriter
How knowledge is shared: Semantic Similarity (2)
► Semantic similarity of concepts
  - Sharing of common instances
  - Quantified and utilized with direction
  - Quantified by the ratio of the shared instances to all instances, e.g. the conditional probability P(High-tech Company Employee | High-income People)
► Natural language's definition of "similar" is vague
  - Hard to formalize
  - Hard to quantify
  - Hard to utilize in intelligent systems
(Figure: Venn diagram, Man vs. Woman)
How knowledge is shared: Variable Linkage (1)
► In a Bayesian Network (BN) / SLBN
  - Concepts are represented by variables
  - Semantic similarities are between propositions
When we say "High-tech Company Employee" is similar to "High-income People", we mean "High-tech Company Employee = True" is similar to "High-income People = True".
How knowledge is shared: Variable Linkage (2)
► Variable linkages
  - Represent semantic similarities in SLBN
  - Are between variables in different BNs
► A variable linkage is a tuple L = ⟨A, B, N_A, N_B, S_AB⟩, where
  - A: source variable
  - B: destination variable
  - N_A: source BN
  - N_B: destination BN
  - S_AB: quantification of the similarity, an m × n matrix with S_AB(i, j) = P(bⱼ | aᵢ)
How knowledge is shared: Variable Linkage (3)
► Variable Linkage vs. BN Edge
  - Representation of: semantic similarity (linkage) vs. causal influences (edge)
  - Conditional probability: quantification of similarity, invariant w.r.t. any event (linkage) vs. conditional dependency, may be changed by events (edge)
  - Probabilistic influence propagation: along the linkage direction only (linkage) vs. both directions, via π-messages and λ-messages (edge)
How knowledge is shared: Variable Linkage (4)
► Expressiveness of variable linkages
  - Logical relationships defined in OWL syntax: Equivalent, Union, Intersection, Subclass, and Complement
  - Relaxation of the logical relationships by replacing set inclusion with overlapping: Overlap, Superclass, Subclass
  - Equivalence relations in which the same concepts are modeled as different variables
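The logical relationships above constrain the linkage matrix S_AB(i, j) = P(bⱼ | aᵢ). For instance, if the source concept is a subclass of the destination, every instance of A is an instance of B, so P(B = true | A = true) = 1. A small sketch of such encodings for binary concepts (my own illustration with hypothetical values, not the dissertation's code):

```python
# Linkage matrices S_AB for binary concepts, rows ordered (A = true, A = false),
# columns ordered (B = true, B = false). Each row is P(B | A = a_i).

# Equivalent: A and B always co-occur.
equivalent = [[1.0, 0.0],
              [0.0, 1.0]]

# Subclass (A is a subclass of B): P(B = true | A = true) = 1;
# the second row depends on the domain (0.3 here is hypothetical).
subclass = [[1.0, 0.0],
            [0.3, 0.7]]

# Overlap: both rows are genuinely uncertain (hypothetical values).
overlap = [[0.7, 0.3],
           [0.1, 0.9]]

# Every row must be a valid conditional distribution.
for s in (equivalent, subclass, overlap):
    assert all(abs(sum(row) - 1.0) < 1e-9 for row in s)
```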
How knowledge is shared: Examples (1)
(Figures: linkage examples — Identical, Union)

How knowledge is shared: Examples (2)
(Figures: linkage examples — Overlap, Superclass)
How knowledge is shared: Consistent Linked Variables
► The prior beliefs on the linked variables on both sides must be consistent with the variable linkage:
  - P₂(B) = Σᵢ P_S(B | A = aᵢ) P₁(A = aᵢ)
  - There must exist a single distribution consistent with the prior beliefs on A, B, πA, πB, and the linkage's similarity
► Consistency is examined by IPFP
(Figure: BN N₁ with P₁(πA) and P₁(A | πA) linked to BN N₂ with P₂(πB) and P₂(B | πB) via P_S(B | A))
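The first consistency condition above is a matrix–vector product: the destination prior must equal the image of the source prior under the linkage. A self-contained sketch (function names and linkage values are mine, for illustration):

```python
# Consistency of linked priors: P2(B) must equal sum_i P_S(B | a_i) * P1(a_i),
# where row i of s_ab is P_S(B | A = a_i).

def linkage_image(p1_a, s_ab):
    n_b = len(s_ab[0])
    return [sum(p1_a[i] * s_ab[i][j] for i in range(len(p1_a)))
            for j in range(n_b)]

def consistent(p1_a, s_ab, p2_b, tol=1e-6):
    return all(abs(x - y) < tol for x, y in zip(linkage_image(p1_a, s_ab), p2_b))

# Hypothetical linkage and priors:
s_ab = [[0.9, 0.1],   # P_S(B | A = true)
        [0.2, 0.8]]   # P_S(B | A = false)
p1_a = [0.6, 0.4]
print(consistent(p1_a, s_ab, [0.62, 0.38]))  # True: 0.6*0.9 + 0.4*0.2 = 0.62
print(consistent(p1_a, s_ab, [0.5, 0.5]))    # False: inconsistent destination prior
```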
Inference on SLBN: The Process
1. Enter evidence (BN belief update with traditional inference)
2. Propagate (SLBN rules for probabilistic influence propagation)
3. Enter soft/virtual evidences (BN belief update with soft evidence)
4. Updated result
Inference on SLBN: The Theory
(Diagram of the theoretical layers)
  Theoretical basis:         Bayes' rule | Jeffrey's rule | IPFP
  Implementation (existing): BN inference | soft evidence | virtual evidence
  Implementation (SLBN):     SLBN builds on all of the above
Inference on SLBN: Assumptions/Restrictions
► All linked BNs are consistent with the linkages
► One variable can be involved in only one linkage
► Causal precedence in all linked BNs is consistent
(Figures: linked BNs with inconsistent vs. consistent causal sequences)
Inference on SLBN: Assumptions/Restrictions (Cont.)
► For a variable linkage, the causes/effects of the source are also the causes/effects of the destination
  - Linkages cannot cross each other
(Figure: crossed linkages between two BNs)
Inference on SLBN: SLBN Rules for Probabilistic Influence Propagation (1)
► Some hard evidence influences the source from the bottom
► Propagated influences are represented by soft evidences
► Beliefs of the destination BN are updated with SE
(Figure: linked BNs with variables Y1, Y2, Y3, …, X1; the evidence enters below the source of the linkage)
Inference on SLBN: SLBN Rules for Probabilistic Influence Propagation (2)
► Some hard evidence influences the source from the top
► Additional soft evidences are created to cancel the influences from the linkage to parent(dest(L))
(Figure: linked BNs with variables Y1, Y2, Y3, …, X1; the evidence enters above the source of the linkage)
Inference on SLBN: SLBN Rules for Probabilistic Influence Propagation (3)
► Some hard evidence influences the source from both the top and the bottom
► Additional soft evidences are created to propagate the combined influences from the linkage to parent(dest(L))
(Figure: linked BNs with variables Y1, Y2, Y3, …, X1; evidence enters both above and below the source of the linkage)
Inference on SLBN: Belief Update with Soft Evidence (1)
► Represent soft evidences by virtual evidences
  - Belief update with soft evidence is IPFP
  - Belief update with one virtual evidence is one step of IPFP
► Therefore, we can
  - Use virtual evidence to implement IPFP on a BN
  - Use virtual evidence to implement soft evidence
► Two ways to convert SE to VE
  - Iterate on the whole BN
  - Iterate on the soft evidence variables
Inference on SLBN: Belief Update with Soft Evidence (2)
► Iterate on the whole BN
Example: soft evidences Q(A) = (0.6, 0.4) and Q(B) = (0.5, 0.5); virtual evidences are applied to A and B in turn:

  P(A)          P(B)
  (0.3, 0.7)    (0.8, 0.2)     initial beliefs
  (0.6, 0.4)    (0.66, 0.34)   after VE on A
  (0.47, 0.53)  (0.5, 0.5)     after VE on B
  (0.6, 0.4)    (0.60, 0.40)   after VE on A
  (0.55, 0.45)  (0.5, 0.5)     after VE on B
  …
  (0.6, 0.4)    (0.5, 0.5)     converged

(Figure: BN with A and B, each with an attached virtual evidence node)
Inference on SLBN: Belief Update with Soft Evidence (1)
► Iterate on the SE variables
Example: soft evidences Q(A) = (0.6, 0.4) and Q(B) = (0.5, 0.5)

  P(A, B):        A = t   A = f
        B = t     0.2     0.6
        B = f     0.1     0.1

  After IPFP with Q(A) and Q(B):

  Q(A, B):        A = t   A = f
        B = t     0.236   0.264
        B = f     0.364   0.136

(Figure: BN with A and B and a single virtual evidence node attached)
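The table above can be checked numerically. A self-contained sketch (my own code, not the dissertation's implementation) that alternately rescales the A and B marginals of P(A, B) toward Q(A) and Q(B):

```python
# Joint P(A, B) from the slide, rows indexed by A (t, f), columns by B (t, f):
# P(A=t,B=t)=0.2, P(A=t,B=f)=0.1, P(A=f,B=t)=0.6, P(A=f,B=f)=0.1.
p = [[0.2, 0.1],
     [0.6, 0.1]]
q_a, q_b = [0.6, 0.4], [0.5, 0.5]

for _ in range(200):  # alternate IPFP scaling steps until convergence
    ma = [sum(p[a]) for a in (0, 1)]                  # current A marginal
    p = [[p[a][b] * q_a[a] / ma[a] for b in (0, 1)] for a in (0, 1)]
    mb = [p[0][b] + p[1][b] for b in (0, 1)]          # current B marginal
    p = [[p[a][b] * q_b[b] / mb[b] for b in (0, 1)] for a in (0, 1)]

print([[round(x, 3) for x in row] for row in p])
# [[0.236, 0.364], [0.264, 0.136]] — matching the Q(A, B) table on the slide
```

The limit preserves the odds ratio of the initial joint while matching both target marginals, which is exactly the minimum cross-entropy property of IPFP.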
Inference on SLBN: Belief Update with Soft Evidence (3)
► Existing approach: Big-Clique

                        Big-Clique                Iteration on whole BN   Iteration on SE variables
  BN inference basis    Rewrite of junction tree  Wrapper of any method   Wrapper of any method
  Time per iteration    O(e^|C|)                  O(BN Inf.)              O(e^|V|)
  Space                 O(e^|C|)                  O(|V|)                  O(e^|V|)

  C: the big clique; V: the SE variables; |C| ≥ |V|
► Iteration on the whole BN: small BNs, many soft evidences
► Iteration on the SE variables: large BNs, a few soft evidences
J-Graph (1): Overview
► The joint-graph (J-graph) is a graphical probability model that represents
  - The joint distribution of an SLBN
  - The interdependencies between variables across variable linkages
► Usage
  - Check whether all assumptions are satisfied
  - Justify the inference process
J-Graph (2): Definition
► The J-graph is constructed by merging all linked BNs and linkages into one graph
  - DAG
  - Variable nodes and linkage nodes
  - Edges: every edge in the linked BNs has a representation in the J-graph
  - CPTs: Q(A | πA) = P(A | πA), and Q(A | B) = P_S(A | B) for a linkage L_AB
► Q: the distribution in the J-graph; P: the original distribution
J-Graph (3): Example
(Figure: BN 1 with variables A, B, C, D and BN 2 with A', B', C', D'; linkage nodes B→B' and C→C' join the two networks in the J-graph)
► Linkage nodes represent the linked variables and the linkage
  - Encode the similarity of the linkage in the CPT
  - Merge the CPTs by IPFP
Concept Mapping using SLBN (1): Motivations
► Ontology mappings are seldom certain
► Existing approaches
  - use hard thresholds to filter mappings
  - throw similarities away after mappings are created
  - produce mappings that are identical and 1-to-1
► But
  - often one concept is similar to more than one concept
  - semantically similar concepts are hard to represent logically
(Diagram: WWW → learner → probabilistic information → BayesOWL)
Concept Mapping using SLBN (2): The Framework
(Diagram: WWW → Onto1, Onto2 → BayesOWL → BN1, BN2, joined by variable linkages → SLBN)
Concept Mapping using SLBN (3): Objective
► Discover new and complex concept mappings
  - Make full use of the learned similarity in SLBN's inference
  - Create an expression for a concept in another ontology, e.g. find how similar "Onto1:B ∧ Onto1:C" is to "Onto2:A"
► Experiments have shown encouraging results
Concept Mapping using SLBN (3): Experiment
► Artificial Intelligence sub-domain from the ACM Topic Taxonomy and the DMOZ (Open Directory) hierarchies
Learned similarities:

  P(dmoz.sw, acm.rs)   = (0.60, 0.12, 0.21, 0.07)    J(dmoz.sw, acm.rs)   = 0.64
  P(dmoz.sw, acm.sn)   = (0.58, 0.13, 0.25, 0.04)    J(dmoz.sw, acm.sn)   = 0.61
  P(dmoz.sw, acm.krfm) = (0.65, 0.30, 0.04, 0.01)    J(dmoz.sw, acm.krfm) = 0.49

After SLBN inference:

  J(dmoz.sw, acm.rs ∧ acm.sn) = 0.7250
  Q(acm.rs = True ∧ acm.sn = True | dmoz.sw = True) = 0.9646
Future Works
► Modeling with SLBN
  - Discover semantically similar concepts by machine learning algorithms
  - Create effective and correct linkages from the learned similarities
► Distributed inference methods
► Loosening the restrictions
  - Inference with linkages in both directions
  - Using functions to represent similarities

Thank You!
► Questions?
Background: Semantics of BN
► Chain rule:

  P(X) = Πᵢ P(aᵢ | π(aᵢ))

  where π(aᵢ) is the parent set of aᵢ.
► d-separation:
(Figure: serial, diverging, and converging connections, with instantiated and not-instantiated middle nodes)
  - d-separated variables do not influence each other.
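The chain rule and the serial d-separation case can be illustrated with a tiny network A → B → C (a hypothetical example of mine, with made-up CPT values): the joint is the product of each node's CPT entry given its parents, and instantiating B blocks any influence of A on C.

```python
from itertools import product

# CPTs for a serial BN A -> B -> C (all binary; hypothetical numbers).
p_a = {1: 0.3, 0: 0.7}
p_b_given_a = {(1, 1): 0.9, (0, 1): 0.1, (1, 0): 0.2, (0, 0): 0.8}  # key: (b, a)
p_c_given_b = {(1, 1): 0.5, (0, 1): 0.5, (1, 0): 0.4, (0, 0): 0.6}  # key: (c, b)

# Chain rule: P(a, b, c) = P(a) * P(b | a) * P(c | b).
def joint(a, b, c):
    return p_a[a] * p_b_given_a[(b, a)] * p_c_given_b[(c, b)]

total = sum(joint(a, b, c) for a, b, c in product([0, 1], repeat=3))
print(total)  # sums to 1.0, as any valid joint must

# d-separation in a serial connection: once B is instantiated,
# P(C | A, B = 1) no longer depends on A.
c_given_b1_a0 = joint(0, 1, 1) / (joint(0, 1, 0) + joint(0, 1, 1))
c_given_b1_a1 = joint(1, 1, 1) / (joint(1, 1, 0) + joint(1, 1, 1))
print(c_given_b1_a0, c_given_b1_a1)  # equal: both are P(C=1 | B=1) = 0.5
```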