1
Impact of Structuring on Bayesian Network Learning
and Reasoning
Mieczysław A. Kłopotek
Institute of Computer Science,
Polish Academy of Sciences,
Warsaw, Poland
First Warsaw International Seminar on Soft Computing Warsaw, September 8th, 2003
2
Agenda
Definitions
Approximate Reasoning
Bayesian networks
Reasoning in Bayesian networks
Learning Bayesian networks from data
Structured Bayesian networks (SBN)
Reasoning in SBN
Learning SBN from data
Concluding remarks
3
Approximate Reasoning
One possible method of expressing uncertainty: the joint probability distribution
Variables: causes, effects, observables
Reasoning: how probable is it that a variable takes a given value if we know the values of some other variables?
Given: P(X,Y,...,Z)
Find: P(X=x | T=t,...,W=w)
Difficult if more than 40 variables have to be taken into account (hard to represent, hard to reason with, hard to collect data for)
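The blow-up the slide warns about can be made concrete: a full joint over n binary variables needs 2^n entries, and any query P(X=x | evidence) is answered by two marginalizations over that table. A minimal sketch, using a made-up uniform three-variable joint (variable indices and all numbers are illustrative, not from the talk):

```python
from itertools import product

# Hypothetical joint P(X, Y, Z) over three binary variables,
# stored as a dict keyed by value tuples; uniform for illustration.
joint = {assignment: 1 / 8 for assignment in product([0, 1], repeat=3)}

def conditional(joint, query_index, query_value, evidence):
    """P(var[query_index] = query_value | evidence), where evidence
    maps variable indices to observed values.  Both numerator and
    denominator are sums over the full joint table."""
    num = 0.0
    den = 0.0
    for assignment, p in joint.items():
        if all(assignment[i] == v for i, v in evidence.items()):
            den += p
            if assignment[query_index] == query_value:
                num += p
    return num / den

print(conditional(joint, 0, 1, {2: 0}))  # P(X=1 | Z=0) = 0.5 for the uniform joint
```

With 40 variables the table would need 2^40 ≈ 10^12 entries, which is exactly why the explicit joint is impractical.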
4
Bayesian Network
The method of choice for representing uncertainty in AI
Many efficient reasoning and learning methods
Utilizes an explicit representation of structure to:
provide a natural and compact representation of large probability distributions
allow efficient methods for answering a wide range of queries
5
Bayesian Network
An efficient and effective representation of a probability distribution
Directed acyclic graph
Nodes - random variables of interest
Edges - direct (causal) influence
Each node is statistically independent of its non-descendants given the state of its parents
6
A Bayesian network
[Figure: a DAG on nodes R, Z, T, Y, X, S]
Pr(r,s,x,z,y) = Pr(z) · Pr(s|z) · Pr(y|z) · Pr(x|y) · Pr(r|y,s)
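The factorization on this slide can be checked numerically. The sketch below uses made-up CPT entries for the arcs Z→S, Z→Y, Y→X, (Y,S)→R; only the form of the product is taken from the slide:

```python
from itertools import product

# Hypothetical CPTs for binary variables (all numbers invented).
P_z = {0: 0.6, 1: 0.4}
P_s_given_z = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key: (s, z)
P_y_given_z = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.1, (1, 1): 0.9}  # key: (y, z)
P_x_given_y = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # key: (x, y)
P_r_given_ys = {(r, y, s): 0.5 for r in (0, 1) for y in (0, 1) for s in (0, 1)}

def joint(r, s, x, z, y):
    """Pr(r,s,x,z,y) assembled from the factored form on the slide."""
    return (P_z[z] * P_s_given_z[(s, z)] * P_y_given_z[(y, z)]
            * P_x_given_y[(x, y)] * P_r_given_ys[(r, y, s)])

# The factored form defines a proper distribution: it sums to 1.
total = sum(joint(r, s, x, z, y)
            for r, s, x, z, y in product((0, 1), repeat=5))
print(round(total, 10))  # 1.0
```

Note the compression: five CPTs with at most 8 entries each replace a 32-entry joint table, and the gap grows exponentially with the number of variables.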
7
Applications of Bayesian networks
Genetic optimization algorithms with a probabilistic mutation/crossing mechanism
Classification, including text classification
Medical diagnosis (PathFinder, QMR) and other decision-making tasks under uncertainty
Hardware diagnosis (Microsoft troubleshooter, NASA/Rockwell Vista project)
Information retrieval (Ricoh helpdesk)
Recommender systems
Other
8
Reasoning – the problem with a Bayesian network
Pearl's fusion algorithm was elaborated for tree-like networks only
For other types of networks, a transformation to a tree is needed:
Transformation to a Markov tree (MT) (Shafer/Shenoy, Spiegelhalter/Lauritzen) – NP-hard except for trees and polytrees
Cutset reasoning (Pearl) – finding cutsets is difficult, and the reasoning complexity grows exponentially with the cutset size
Evidence absorption reasoning by edge reversal (Shachter) – not always possible in a simple way
9
Towards MT – moral graph
[Figure: moral graph on nodes R, Z, T, Y, X, S]
Parents of each node in the BN are connected; edges are not oriented
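Moralization itself is mechanical: marry the parents of every node, then drop the directions. A sketch, assuming the arcs Z→S, Z→Y, Y→X, (Y,S)→R from the earlier example (T's arcs are not legible in the transcript, so it is left isolated here):

```python
def moralize(parents):
    """Moral graph of a DAG given as {node: set of parents}:
    connect all co-parents of each node, then drop edge directions.
    Returns a set of frozenset edges."""
    edges = set()
    for child, ps in parents.items():
        for p in ps:
            edges.add(frozenset((p, child)))      # undirected version of each arc
        for p in ps:
            for q in ps:
                if p != q:
                    edges.add(frozenset((p, q)))  # "marry" the parents
    return edges

# The example DAG (T shown in the figure but left without arcs here).
dag = {"Z": set(), "T": set(), "S": {"Z"}, "Y": {"Z"},
       "X": {"Y"}, "R": {"Y", "S"}}
moral = moralize(dag)
print(frozenset(("Y", "S")) in moral)  # True: R's parents Y and S get married
```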
10
Towards MT – triangulated graph
[Figure: triangulated graph on nodes R, Z, T, Y, X, S]
All cycles with more than 3 nodes have at least one link between non-neighboring nodes of the cycle.
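A standard way to obtain such a graph is node elimination with fill-in: when a node is eliminated, its remaining neighbors are pairwise connected, and the added edges triangulate the graph. A sketch (the elimination order is chosen arbitrarily here; a bad order can create large cliques, which is the source of the NP-hardness mentioned on slide 8):

```python
def fill_in(adjacency, order):
    """Triangulate an undirected graph by eliminating nodes in the
    given order, connecting the remaining neighbors of each eliminated
    node.  Returns the set of added (fill-in) edges."""
    adj = {v: set(ns) for v, ns in adjacency.items()}
    added = set()
    for v in order:
        nbrs = list(adj[v])
        for i in range(len(nbrs)):
            for j in range(i + 1, len(nbrs)):
                a, b = nbrs[i], nbrs[j]
                if b not in adj[a]:       # missing chord: add it
                    adj[a].add(b)
                    adj[b].add(a)
                    added.add(frozenset((a, b)))
        for n in nbrs:                    # remove the eliminated node
            adj[n].discard(v)
        del adj[v]
    return added

# A 4-cycle Z-S-R-Y-Z needs exactly one chord:
square = {"Z": {"S", "Y"}, "S": {"Z", "R"}, "R": {"S", "Y"}, "Y": {"R", "Z"}}
print(fill_in(square, ["Z", "S", "R", "Y"]))  # one fill-in edge, {S, Y}
```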
11
Towards MT – hypertree
[Figure: hypertree over nodes R, Z, T, Y, X, S]
Hypertree = acyclic hypergraph
12
The Markov tree
[Figure: Markov tree with nodes {Z,T,Y}, {T,Y,S}, {Y,S,R}, {Y,X}]
Hypernodes of the hypertree are the nodes of the Markov tree
13
Junction tree – an alternative representation of the MT
[Figure: Markov-tree nodes {Z,T,S}, {Z,Y,S}, {Y,S,R}, {Y,X} joined by separators {Z,S}, {Y,S}, {Y}]
Common BN nodes are assigned to the edges joining MT nodes
14
Efficient reasoning in Markov trees, but ....
[Figure: the same junction tree, with messages (msg) passed along its edges]
MT node contents, projected onto the common variables, are passed to the neighbors
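The message passing described here is just marginalization of a clique potential onto the separator shared with the neighbor. A sketch with a made-up constant potential on the clique (Z, Y, S) and the separator (Y, S):

```python
from itertools import product

def project(potential, clique_vars, separator_vars):
    """Markov-tree message: sum a clique potential down to the
    separator variables shared with the neighboring node."""
    keep = [clique_vars.index(v) for v in separator_vars]
    msg = {}
    for assignment, value in potential.items():
        key = tuple(assignment[i] for i in keep)
        msg[key] = msg.get(key, 0.0) + value
    return msg

# Hypothetical potential on clique (Z, Y, S), all entries 1.0:
phi = {a: 1.0 for a in product((0, 1), repeat=3)}
msg = project(phi, ["Z", "Y", "S"], ["Y", "S"])
print(msg[(0, 0)])  # 2.0: the two values of Z are summed out
```

The cost of each message is exponential only in the clique size, which is why keeping hypernodes small matters.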
15
Triangulability test – triangulation not always possible
All neighbors need to be connected
16
Evidence absorption reasoning
[Figure: three stages on nodes R, Z, T, Y, X, S: the original network, after evidence absorption, and after edge reversal]
Efficient only for a fortunate selection of conditioning variables
17
Cutset reasoning – fixing the values of some nodes creates a (poly)tree
[Figure: BN on nodes R, Z, T, Y, X, S; one node is fixed, hence an edge becomes ignorable]
18
How to overcome the difficulty when reasoning with a BN
Learn a triangulated graph or Markov tree directly from data (Cercone N., Wong S.K.M., Xiang Y.) – hard and inefficient for long dependence chains, with a danger of large hypernodes
Learn only tree-structured/polytree-structured BNs (e.g. in Goldberg's Bayesian Genetic Algorithms, TAN text classifiers, etc.) – an oversimplification; long dependence chains are lost
Our approach: propose a more general class of Bayesian networks that is still efficient for reasoning
19
What is a structured Bayesian network?
An analogue of well-structured programs
Graphical structure: nested sequences and alternatives
By collapsing sequences and alternatives to single nodes, a single node is obtainable
Efficient reasoning is possible
20
Structured Bayesian Network (SBN), an example
For comparison: a tree-structured BN
21
SBN collapsing
22
SBN construction steps
[Figure legend: the special arrow symbol means 0, 1 or 2 arrows]
23
Reasoning in SBN
Either directly in the structure, or easily transformable to a Markov tree
Direct reasoning consists of:
a forward step (leaf node/root node valuation calculation)
a backward step (intermediate node valuation calculation)
24
Reasoning in SBN – forward step
[Figure: local configurations on nodes A, B (the special arrow symbol means 0, 1 or 2 arrows), with conditional tables P(B|A) and P(B|C,E)]
25
Reasoning in SBN – backward step: local context
[Figure: four local contexts (a)–(d) on nodes A, B, C, D]
The joint distribution of A,B is known; the joint distribution of C,D or of C is sought
26
Reasoning in SBN – backward step: local reasoning
[Figure: Markov-tree fragment with nodes {A,B,...}, {A,B}, {A,B,C,D} exchanging messages Msg1(A,B) and Msg2(A,B); the product P(A)·P(B|A,D) is not needed]
27
SBN – towards a MT
28
SBN – towards a MT
29
SBN – towards a MT
30
Towards a Markov tree – an example
[Figure: SBN on nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S]
31
Towards a Markov tree – an example
[Figure: SBN on nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S]
32
Markov tree from SBN
[Figure: Markov tree with hypernodes {A,B,I}, {B,C,D,I}, {C,D,E,I}, {D,E,I}, {F,G,I}, {G,H,I}, {I,H,E,R}, {E,H,R,J}, {H,R,J}, {K,L,R}, {L,M,N,R}, {M,N,O,R}, {N,O,R}, {O,P,R}, {R,J,P}, {P,J,S}]
33
Structured Bayesian network – a Hierarchical (Object-Oriented) Bayesian network
[Figure: the SBN on nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S with its nested structure outlined]
34
Learning SBN from Data
Define the DEP() measure as follows:
DEP(Y,X) = P(x|y) − P(x|¬y)
Define DEP[](Y,X) = (DEP(Y,X))²
Construct a tree according to the Chow/Liu algorithm, using DEP[](Y,X) with Y belonging to the tree and X not.
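The construction reads like Prim's algorithm for a maximum-weight spanning tree, with DEP[] as the edge weight. A sketch under that reading, with an arbitrary stand-in weight function (in the real algorithm DEP[] would be estimated from data):

```python
def grow_tree(nodes, dep, root):
    """Prim-style sketch of the tree construction: repeatedly attach
    the outside node X that maximizes dep(Y, X) for some node Y
    already in the tree.  `dep` stands in for the DEP[] measure."""
    in_tree = {root}
    edges = []
    while len(in_tree) < len(nodes):
        _, y, x = max((dep(y, x), y, x)
                      for y in in_tree for x in nodes if x not in in_tree)
        edges.append((y, x))
        in_tree.add(x)
    return edges

# Toy weights: adjacent letters depend strongly, distant ones weakly.
nodes = ["A", "B", "C", "D"]
dep = lambda y, x: 1.0 / abs(ord(y) - ord(x))
print(grow_tree(nodes, dep, "A"))  # chain: [('A', 'B'), ('B', 'C'), ('C', 'D')]
```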
35
Continued ....
Let us call all the edges obtained by the previous algorithm "free edges".
During the construction process, the following types of edges may additionally appear: "node X loop unoriented edge", "node X loop oriented edge", "node X loop transient edge".
Do in a loop (until the termination condition below is satisfied):
For each pair of properly connected non-neighboring nodes, identify the unique connecting path between them.
36
Continued ....
Two nodes are properly connected if the path between them consists either of edges having the status of free edges or of oriented, unoriented (but not suspended) edges of the same loop, with no pair of oriented or transient oriented edges pointing in different directions and no transient edge pointing to one of the two connected points.
Note that in this sense there is at most one path properly connecting two nodes.
37
Continued ....
Connect by an edge the pair of non-neighboring nodes X, Y that maximizes DEP[](X,Y), taken as the minimum of the unconditional DEP and the conditional DEP given a direct successor of X on the path to Y.
Identify the loop that has emerged from this operation.
38
Continued ....
We can have one of the following cases:
(1) the loop consists entirely of free edges
(2) it contains some unoriented loop edges, but no oriented edge
(3) it contains at least one oriented edge
Depending on this, give a proper status to the edges contained in the loop: "node X loop unoriented edge", "node X loop oriented edge", "node X loop transient edge" (details in the written presentation).
39
Places of edge insertion
[Figure: six local configurations on nodes B, C, D, E, G, H, X, Y showing where the new edge X–Y may be inserted]
40
Concluding Remarks
A new class of Bayesian networks defined
A completely new method of reasoning in Bayesian networks outlined
Local computation – at most 4 nodes involved
Applicable to a more general class of networks than known reasoning methods
The new class of Bayesian networks is easily transformed to Markov trees
The new class of Bayesian networks is a kind of hierarchical or object-oriented Bayesian network
Can be learned from data
41
THANK YOU