Bayesian Metanetworks for Context-Sensitive Feature Relevance
Vagan Terziyan, Industrial Ontologies Group, University of Jyväskylä, Finland.
SETN-2006, Heraclion, Crete, Greece, 24 May 2006.
Contents
Bayesian Metanetworks
Metanetworks for managing conditional dependencies
Metanetworks for managing feature relevance
Example
Conclusions
Vagan Terziyan
Industrial Ontologies Group
Department of Mathematical Information Technologies
University of Jyväskylä (Finland)
http://www.cs.jyu.fi/ai/vagan
This presentation: http://www.cs.jyu.fi/ai/SETN-2006.ppt
Bayesian Metanetworks
Conditional dependence between variables X and Y
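[Figure: node X, with distribution P(X), linked to node Y, with distribution P(Y), by the conditional dependence P(Y|X).]
$$P(Y) = \sum_{X} P(X) \cdot P(Y \mid X)$$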
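As a minimal sketch of the marginalization on this slide, the following Python computes P(Y) from P(X) and P(Y|X). The function name `marginal` and the two-valued tables are illustrative assumptions, not values from the slides.

```python
def marginal(p_x, p_y_given_x):
    """Return P(Y) = sum over x of P(X = x) * P(Y | X = x).

    p_x: dict mapping each value x to P(X = x).
    p_y_given_x: dict mapping each x to a dict {y: P(Y = y | X = x)}.
    """
    p_y = {}
    for x, px in p_x.items():
        for y, pyx in p_y_given_x[x].items():
            p_y[y] = p_y.get(y, 0.0) + px * pyx
    return p_y


# Hypothetical numbers, chosen only for illustration.
p_x = {"x1": 0.3, "x2": 0.7}
p_y_given_x = {
    "x1": {"y1": 0.9, "y2": 0.1},
    "x2": {"y1": 0.2, "y2": 0.8},
}
p_y = marginal(p_x, p_y_given_x)
print(p_y)
```

Because each column of the conditional table sums to one, the resulting P(Y) also sums to one.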
Bayesian Metanetwork
Definition. A Bayesian Metanetwork is a set of Bayesian networks layered on top of each other in such a way that the elements (nodes or conditional dependencies) of each probabilistic network depend on the local probability distributions associated with the nodes of the next-level network.
Two-level Bayesian C-Metanetwork for Managing Conditional Dependencies
[Figure: a two-level network with a contextual level and a predictive level.]
Contextual Effect on Conditional Probability (1)
[Figure: attribute set X = {x1, x2, …, x7}, split into predictive attributes and contextual attributes; xk and xr are predictive attributes, xt is a contextual attribute.]
Assume a conditional dependence between two predictive attributes xk and xr (for example, a causal relation between physical quantities).
A contextual attribute xt may directly affect the conditional dependence between the predictive attributes without affecting the attributes themselves.
Contextual Effect on Conditional Probability (3)
[Figure: contextual attribute xt acting on the dependence between xk and xr.]
Xk : Order present
Xk1 : order flowers
Xk2 : order wine
Xr : Make a visit
Xr1 : visit football match
Xr2 : visit girlfriend
Xt1 : I am in Paris
Xt2 : I am in Moscow
P1(Xr |Xk ) Xk1 Xk2
Xr1 0.3 0.9
Xr2 0.4 0.5
P2(Xr |Xk ) Xk1 Xk2
Xr1 0.1 0.2
Xr2 0.8 0.7
Contextual Effect on Conditional Probability (4)
P( P (Xr |Xk ) | Xt ) Xt1 Xt2
P1(Xr |Xk ) 0.7 0.2
P2(Xr |Xk ) 0.3 0.8
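One way to read this table is as a mixture: given the context Xt, the effective conditional table is the sum of P1 and P2 weighted by P(P1|Xt) and P(P2|Xt). The sketch below assumes that reading (the law of total probability over the choice of table); the names `TABLES`, `SELECTOR`, and `effective_cpt` are hypothetical. The entries are copied from the slides as transcribed, and the columns of P1 and P2 are not renormalized here.

```python
# Conditional tables from the slides, indexed as table[xr][xk].
TABLES = {
    "P1": {"Xr1": {"Xk1": 0.3, "Xk2": 0.9}, "Xr2": {"Xk1": 0.4, "Xk2": 0.5}},
    "P2": {"Xr1": {"Xk1": 0.1, "Xk2": 0.2}, "Xr2": {"Xk1": 0.8, "Xk2": 0.7}},
}

# P(P(Xr|Xk) | Xt): probability that each table applies in each context.
SELECTOR = {
    "Xt1": {"P1": 0.7, "P2": 0.3},  # I am in Paris
    "Xt2": {"P1": 0.2, "P2": 0.8},  # I am in Moscow
}


def effective_cpt(context):
    """Mix the conditional tables with the weights chosen by the context."""
    weights = SELECTOR[context]
    cpt = {}
    for name, table in TABLES.items():
        for xr, row in table.items():
            for xk, p in row.items():
                cpt.setdefault(xr, {}).setdefault(xk, 0.0)
                cpt[xr][xk] += weights[name] * p
    return cpt


paris = effective_cpt("Xt1")
moscow = effective_cpt("Xt2")
print(paris)
print(moscow)
```

Under this reading, the contextual level never touches Xk or Xr directly; it only changes which conditional table links them.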
Contextual Effect on Unconditional Probability (1)
[Figure: attribute set X = {x1, x2, …, x7}, split into predictive attributes and contextual attributes; xk is a predictive attribute with distribution P(X), xt is a contextual attribute.]
Assume a predictive attribute xk is a random variable with a probability distribution over its values.
A contextual attribute xt may directly affect the probability distribution of that predictive attribute.
Contextual Effect on Unconditional Probability (3)
[Figure: contextual attribute xt acting on the probability distribution of xk.]
Xk : Order present
Xk1 : order flowers
Xk2 : order wine
Xt1 : I am in Paris
Xt2 : I am in Moscow
P1(Xk ) Xk1 Xk2
0.2 0.7
P2(Xk ) Xk1 Xk2
0.5 0.3
P( P (Xk ) | Xt ) Xt1 Xt2
P1(Xk ) 0.4 0.9
P2(Xk ) 0.6 0.1
Two-level Bayesian C-Metanetwork for managing conditional dependencies
[Figure: predictive level with nodes A, B, X, and Y; contextual level with nodes P(B|A) and P(Y|X).]
Two-level Bayesian R-Metanetwork for Modelling Relevant Features’ Selection
[Figure: a two-level network with a contextual level and a predictive level.]
Feature relevance modelling (1)
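We define relevance as the probability that a variable is important for inferring the target attribute in the given context. Under this definition, relevance inherits all the properties of a probability.
[Figure: node X, with probability P(X) and relevance Ψ(X), linked to the target Y by P(Y|X); P(Y) is to be estimated.]
If X is not relevant, the model consists of Y alone and gives P0(Y). The probability of having this model is P(Ψ(X) = “no”) = 1 − ψ_X.
If X is relevant, the model is X linked to Y, with P(X) and P(Y|X), and gives P1(Y). The probability of having this model is P(Ψ(X) = “yes”) = ψ_X.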
Feature relevance modelling (2)
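[Figure: node X, with probability P(X) and relevance Ψ(X), linked to the target Y by P(Y|X); P(Y) is to be estimated.]
X: {x1, x2, …, xnx}
$$P(Y) = \frac{1}{nx} \sum_{X} P(Y \mid X) \cdot [nx \cdot \psi_X \cdot P(X) + (1 - \psi_X)]$$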
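The formula on this slide can be read as a mixture of the two models from the previous slide, weighted by the relevance. The short derivation below assumes that P0(Y), the model without X, averages P(Y|X) uniformly over the nx values of X; this is a reading of the slides, not something they state explicitly.

```latex
\begin{align*}
P(Y) &= \psi_X \cdot P_1(Y) + (1 - \psi_X) \cdot P_0(Y) \\
     &= \psi_X \sum_{X} P(X) \cdot P(Y \mid X)
        + (1 - \psi_X) \cdot \frac{1}{nx} \sum_{X} P(Y \mid X) \\
     &= \frac{1}{nx} \sum_{X} P(Y \mid X) \cdot [nx \cdot \psi_X \cdot P(X) + (1 - \psi_X)]
\end{align*}
```

With ψ_X = 1 the expression reduces to the ordinary marginal, and with ψ_X = 0 it reduces to the uniform average over the values of X.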
Example (1)
Let attribute X be “state of weather” and attribute Y, which is influenced by X, be “state of mood”.
X (“state of weather”) = {“sunny”, “overcast”, “rain”};
P(X=“sunny”) = 0.4; P(X=“overcast”) = 0.5; P(X=“rain”) = 0.1;
Y (“state of mood”) = {“good”, “bad”};
P(Y=“good”|X=“sunny”) = 0.7; P(Y=“good”|X=“overcast”) = 0.5; P(Y=“good”|X=“rain”) = 0.2;
P(Y=“bad”|X=“sunny”) = 0.3; P(Y=“bad”|X=“overcast”) = 0.5; P(Y=“bad”|X=“rain”) = 0.8;
Let ψ_X = 0.6.
Example (2)
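Now we have (with nx = 3 and ψ_X = 0.6, so that nx · ψ_X = 1.8 and 1 − ψ_X = 0.4):
$$P(Y=\text{“good”}) = \frac{1}{3} \{ P(Y=\text{“good”} \mid X=\text{“sunny”}) \cdot [1.8 \cdot P(X=\text{“sunny”}) + 0.4] + P(Y=\text{“good”} \mid X=\text{“overcast”}) \cdot [1.8 \cdot P(X=\text{“overcast”}) + 0.4] + P(Y=\text{“good”} \mid X=\text{“rain”}) \cdot [1.8 \cdot P(X=\text{“rain”}) + 0.4] \} = 0.517;$$
$$P(Y=\text{“bad”}) = 0.483.$$
One can also notice that these values lie within the intervals bounded by the two extreme cases, in which X is not relevant at all (ψ_X = 0) or fully relevant (ψ_X = 1):
$$0.467 = P(Y=\text{“good”})|_{\psi_X=0} < P(Y=\text{“good”})|_{\psi_X=0.6} < P(Y=\text{“good”})|_{\psi_X=1} = 0.55$$
$$0.45 = P(Y=\text{“bad”})|_{\psi_X=1} < P(Y=\text{“bad”})|_{\psi_X=0.6} < P(Y=\text{“bad”})|_{\psi_X=0} = 0.533$$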
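The weather and mood example can be checked numerically. The following Python is a minimal sketch of the relevance-weighted formula P(Y) = (1/nx) · Σ_X P(Y|X) · [nx · ψ_X · P(X) + (1 − ψ_X)], using the probabilities given in the example; the function and variable names are illustrative choices.

```python
def relevance_weighted(p_x, p_y_given_x, psi):
    """P(Y) = (1/nx) * sum_X P(Y|X) * [nx * psi * P(X) + (1 - psi)]."""
    nx = len(p_x)
    p_y = {}
    for x, px in p_x.items():
        weight = (nx * psi * px + (1.0 - psi)) / nx
        for y, pyx in p_y_given_x[x].items():
            p_y[y] = p_y.get(y, 0.0) + weight * pyx
    return p_y


# Values from the weather/mood example.
P_X = {"sunny": 0.4, "overcast": 0.5, "rain": 0.1}
P_Y_GIVEN_X = {
    "sunny": {"good": 0.7, "bad": 0.3},
    "overcast": {"good": 0.5, "bad": 0.5},
    "rain": {"good": 0.2, "bad": 0.8},
}

results = {psi: relevance_weighted(P_X, P_Y_GIVEN_X, psi) for psi in (0.0, 0.6, 1.0)}
for psi, p_y in results.items():
    print(psi, round(p_y["good"], 3), round(p_y["bad"], 3))
```

The three rows correspond to the not-relevant, partially relevant (ψ_X = 0.6), and fully relevant cases, and reproduce 0.467/0.533, 0.517/0.483, and 0.55/0.45.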
General Case of Managing Relevance (1)
[Figure: predictive attributes X1, X2, …, XN, each with probability P(Xi) and relevance Ψ(Xi), linked to the target Y by P(Y|X1,X2,…,XN); P(Y) is to be estimated.]
Predictive attributes:
X1 with values {x11,x12,…,x1nx1};
X2 with values {x21,x22,…,x2nx2};
…
XN with values {xn1,xn2,…,xnnxn};
Target attribute:
Y with values {y1,y2,…,yny}.
Probabilities:
P(X1), P(X2),…, P(XN); P(Y|X1,X2,…,XN).
Relevancies:
ψ_X1 = P(Ψ(X1) = “yes”);
ψ_X2 = P(Ψ(X2) = “yes”);
…
ψ_XN = P(Ψ(XN) = “yes”);
Goal: to estimate P(Y).
General Case of Managing Relevance (2)
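[Figure: predictive attributes X1, X2, …, XN, each with probability P(Xi) and relevance Ψ(Xi), linked to the target Y by P(Y|X1,X2,…,XN); P(Y) is to be estimated.]
$$P(Y) = \frac{1}{\prod_{s=1}^{N} nx_s} \sum_{X_1} \sum_{X_2} \ldots \sum_{X_N} \Big[ P(Y \mid X_1, X_2, \ldots, X_N) \cdot \prod_{\forall r\,(\Psi(X_r)=\text{“yes”})} nx_r \cdot \psi_{X_r} \cdot P(X_r) \cdot \prod_{\forall q\,(\Psi(X_q)=\text{“no”})} (1 - \psi_{X_q}) \Big]$$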
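The sketch below is one possible reading of the general case, not a transcription of the slide's formula. It assumes that the relevance of each attribute is independent of the others, so that each attribute Xi contributes the single-attribute factor [nxi · ψ_Xi · P(Xi) + (1 − ψ_Xi)] / nxi; with N = 1 this reduces to the single-attribute formula. The names `p_target`, `WEEKDAY`, and `mood_cpt`, and the second attribute itself, are hypothetical.

```python
from itertools import product


def p_target(priors, relevances, cpt):
    """Estimate P(Y) for N predictive attributes with independent relevances.

    priors: list of dicts, priors[i][v] = P(Xi = v).
    relevances: list of floats, relevances[i] = psi_Xi.
    cpt: function mapping a tuple of attribute values to {y: P(Y = y | values)}.
    """
    p_y = {}
    for values in product(*(prior.keys() for prior in priors)):
        weight = 1.0
        for value, prior, psi in zip(values, priors, relevances):
            # Per-attribute factor: psi * P(Xi) + (1 - psi) / nxi.
            weight *= psi * prior[value] + (1.0 - psi) / len(prior)
        for y, p in cpt(values).items():
            p_y[y] = p_y.get(y, 0.0) + weight * p
    return p_y


# Weather/mood tables from the example.
WEATHER = {"sunny": 0.4, "overcast": 0.5, "rain": 0.1}
MOOD = {
    "sunny": {"good": 0.7, "bad": 0.3},
    "overcast": {"good": 0.5, "bad": 0.5},
    "rain": {"good": 0.2, "bad": 0.8},
}
WEEKDAY = {"workday": 0.7, "weekend": 0.3}  # hypothetical second attribute


def mood_cpt(values):
    # Hypothetical CPT in which mood depends on the weather only.
    return MOOD[values[0]]


one = p_target([WEATHER], [0.6], mood_cpt)
two = p_target([WEATHER, WEEKDAY], [0.6, 0.2], mood_cpt)
print(one, two)
```

Because the hypothetical second attribute does not influence the target, adding it leaves P(Y) unchanged, which is a quick sanity check that the per-attribute weights sum to one.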
Example of Relevance Bayesian Metanetwork (1)
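[Figure: node X, with probability P(X) and relevance Ψ(X), linked to the target Y by P(Y|X); node A, with probability P(A) and relevance Ψ(A); the relevance of X depends on the relevance of A through P(Ψ(X)|Ψ(A)); P(Y) is to be estimated.]
$$P(Y) = \frac{1}{nx} \sum_{X} \Big\{ P(Y \mid X) \cdot \Big[ nx \cdot P(X) \cdot \sum_{\psi_A} P(\psi_X \mid \psi_A) \cdot P(\psi_A) + (1 - \psi_X) \Big] \Big\}$$
P(Ψ(X)|Ψ(A)) is a conditional relevance.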
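A minimal sketch of how conditional relevance might be used in practice: first marginalize P(Ψ(X)|Ψ(A)) over Ψ(A) to obtain ψ_X, then apply the single-attribute relevance formula. The values of ψ_A and of the conditional relevance table are hypothetical, chosen so that ψ_X comes out at 0.6 as in the weather example; the function names are also assumptions of this sketch.

```python
def marginal_relevance(psi_a, psi_x_given_a):
    """psi_X = sum over Psi(A) of P(Psi(X)="yes" | Psi(A)) * P(Psi(A))."""
    return psi_x_given_a["yes"] * psi_a + psi_x_given_a["no"] * (1.0 - psi_a)


def relevance_weighted(p_x, p_y_given_x, psi):
    """P(Y) = (1/nx) * sum_X P(Y|X) * [nx * psi * P(X) + (1 - psi)]."""
    nx = len(p_x)
    p_y = {}
    for x, px in p_x.items():
        weight = (nx * psi * px + (1.0 - psi)) / nx
        for y, pyx in p_y_given_x[x].items():
            p_y[y] = p_y.get(y, 0.0) + weight * pyx
    return p_y


# Hypothetical contextual level: relevance of A and conditional relevance of X.
PSI_A = 0.5
PSI_X_GIVEN_A = {"yes": 0.9, "no": 0.3}  # P(Psi(X)="yes" | Psi(A)=...)

# Predictive level: values from the weather/mood example.
P_X = {"sunny": 0.4, "overcast": 0.5, "rain": 0.1}
P_Y_GIVEN_X = {
    "sunny": {"good": 0.7, "bad": 0.3},
    "overcast": {"good": 0.5, "bad": 0.5},
    "rain": {"good": 0.2, "bad": 0.8},
}

psi_x = marginal_relevance(PSI_A, PSI_X_GIVEN_A)
p_y = relevance_weighted(P_X, P_Y_GIVEN_X, psi_x)
print(psi_x, p_y)
```

Changing `PSI_A` shifts ψ_X and therefore P(Y), even though neither P(X) nor P(Y|X) changes; this is the contextual level acting on the predictive level.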
Example of Relevance Bayesian Metanetwork (2)
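[Figure: node A linked to node B, with P(A), P(B), and P(B|A); node X linked to node Y, with P(X), P(Y), and P(Y|X); relevance nodes Ψ(A) and Ψ(X), with conditional relevance Ψ(X|A).]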
Example of Relevance Bayesian Metanetwork (3)
[Figure: predictive level with nodes A, B, X, and Y, with P(A), P(B), P(B|A), P(X), P(Y), and P(Y|X); contextual level with relevance nodes Ψ(A), Ψ(B), Ψ(X), and Ψ(Y), including the conditional relevance Ψ(X|A).]
When to Use Bayesian Metanetworks?
1. A Bayesian Metanetwork can be considered a very powerful tool in cases where the structure (or strengths) of causal relationships between observed parameters of an object essentially depends on context (e.g. external environment parameters);
2. It can also be considered a useful model for an object whose diagnosis depends on a different set of observed parameters depending on the context.
Conclusion
We consider a context to be a set of contextual attributes that do not directly affect the probability distribution of the target attributes, but that do affect the “relevance” of the predictive attributes towards the target attributes.
In this paper we use the Bayesian Metanetwork vision to model such context-sensitive feature relevance. Such a model assumes that the relevance of predictive attributes in a Bayesian network may itself be a random attribute, and it provides a tool for reasoning based not only on the probabilities of predictive attributes but also on their relevancies.
Read more about Bayesian Metanetworks in:
Terziyan V., A Bayesian Metanetwork, In: International Journal on Artificial Intelligence Tools, Vol. 14, No. 3, 2005, World Scientific, pp. 371-384.
http://www.cs.jyu.fi/ai/papers/IJAIT-2005.pdf
Terziyan V., Vitko O., Bayesian Metanetwork for Modelling User Preferences in Mobile Environment, In: German Conference on Artificial Intelligence (KI-2003), LNAI, Vol. 2821, 2003, pp. 370-384.
http://www.cs.jyu.fi/ai/papers/KI-2003.pdf
Terziyan V., Vitko O., Learning Bayesian Metanetworks from Data with Multilevel Uncertainty, In: M. Bramer and V. Devedzic (eds.), Proceedings of the First International Conference on Artificial Intelligence and Innovations, Toulouse, France, August 22-27, 2004, Kluwer Academic Publishers, pp. 187-196.
http://www.cs.jyu.fi/ai/papers/AIAI-2004.ps