Bayesian Metanetworks for Context-Sensitive Feature Relevance

Vagan Terziyan


Industrial Ontologies Group, University of Jyväskylä, Finland

SETN-2006, Heraclion, Crete, Greece

24 May 2006


Contents

Bayesian Metanetworks

Metanetworks for managing conditional dependencies

Metanetworks for managing feature relevance

Example

Conclusions

Vagan Terziyan

Industrial Ontologies Group

Department of Mathematical Information Technologies

University of Jyvaskyla (Finland)

http://www.cs.jyu.fi/ai/vagan

This presentation: http://www.cs.jyu.fi/ai/SETN-2006.ppt


Bayesian Metanetworks


Conditional dependence between variables X and Y

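[Figure: node X, with distribution P(X), is linked to node Y, with distribution P(Y), through the conditional probability P(Y|X).]

$$P(Y) = \sum_{X} P(X) \cdot P(Y|X)$$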
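The marginal P(Y) is obtained by summing P(X) · P(Y|X) over all values of X. The following is a minimal Python sketch of that sum. The function name `marginal` and the dictionary layout are assumptions made for illustration; the numbers are borrowed from the weather and mood example that appears later in this presentation.

```python
def marginal(p_x, p_y_given_x):
    """Return P(Y) as the sum over x of P(x) * P(Y | x)."""
    p_y = {}
    for x, px in p_x.items():
        for y, pyx in p_y_given_x[x].items():
            p_y[y] = p_y.get(y, 0.0) + px * pyx
    return p_y


# Values taken from the weather/mood example later in the slides.
p_weather = {"sunny": 0.4, "overcast": 0.5, "rain": 0.1}
p_mood_given_weather = {
    "sunny": {"good": 0.7, "bad": 0.3},
    "overcast": {"good": 0.5, "bad": 0.5},
    "rain": {"good": 0.2, "bad": 0.8},
}

p_mood = marginal(p_weather, p_mood_given_weather)
print(p_mood)
```

With these inputs the result is P(Y="good") = 0.55 and P(Y="bad") = 0.45, up to floating-point rounding.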


Bayesian Metanetwork

Definition. A Bayesian Metanetwork is a set of Bayesian networks layered on top of each other in such a way that the elements (nodes or conditional dependencies) of each probabilistic network depend on the local probability distributions associated with the nodes of the network at the next level.


Two-level Bayesian C-Metanetwork for Managing Conditional Dependencies

[Figure: two-level network with a contextual level and a predictive level.]

Contextual Effect on Conditional Probability (1)

[Figure: attribute vector X = (x1, x2, …, x7) divided into predictive attributes and contextual attributes.]

Assume a conditional dependence between predictive attributes xk and xr (a causal relation between physical quantities).

Some contextual attribute xt may directly affect the conditional dependence between the predictive attributes, but not the attributes themselves.



[Figure: contextual attribute xt acts on the conditional dependence between xk and xr.]

Xk : Order present

Xk1 : order flowers

Xk2 : order wine

Xr : Make a visit

Xr1 : visit football match

Xr2 : visit girlfriend

Xt1 : I am in Paris

Xt2 : I am in Moscow

P1(Xr|Xk)   Xk1   Xk2
Xr1         0.3   0.9
Xr2         0.4   0.5

P2(Xr|Xk)   Xk1   Xk2
Xr1         0.1   0.2
Xr2         0.8   0.7


Contextual Effect on Unconditional Probability (1)

[Figure: attribute vector X = (x1, x2, …, x7) divided into predictive attributes and contextual attributes.]

Assume some predictive attribute xk is a random variable with an appropriate probability distribution P(X) over its values x1, x2, x3, x4.

Some contextual attribute xt may directly affect the probability distribution of the predictive attribute.


Contextual Effect on Unconditional Probability (3)

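[Figure: contextual attribute xt acts on the probability distribution of xk.]

Xk : Order present

Xk1 : order flowers

Xk2 : order wine

Xt1 : I am in Paris

Xt2 : I am in Moscow

[Figure: two candidate distributions over Xk1 and Xk2: P1(Xk), with chart labels 0.2 and 0.7, and P2(Xk), with chart labels 0.5 and 0.3.]

P(P(Xk)|Xt)   Xt1   Xt2
P1(Xk)        0.4   0.9
P2(Xk)        0.6   0.1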
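One natural reading of this slide is that the context Xt selects, with probability P(P(Xk)|Xt), which candidate distribution P1(Xk) or P2(Xk) governs the predictive attribute. The Python sketch below averages the candidates under that reading. Only the table P(P(Xk)|Xt) comes from the slide. The prior over Xt, the values inside P1 and P2, and the name `mix_distributions` are assumptions chosen purely for illustration.

```python
def mix_distributions(p_context, p_model_given_context, models):
    """Average candidate distributions, weighted by context and model probability."""
    result = {}
    for context, p_c in p_context.items():
        for model, p_m in p_model_given_context[context].items():
            for value, p_v in models[model].items():
                result[value] = result.get(value, 0.0) + p_c * p_m * p_v
    return result


# Second-level table P(P(Xk) | Xt), as given on the slide.
p_model_given_context = {
    "Xt1": {"P1": 0.4, "P2": 0.6},
    "Xt2": {"P1": 0.9, "P2": 0.1},
}

# ASSUMPTION: the prior over the context is not given on the slide.
p_context = {"Xt1": 0.5, "Xt2": 0.5}

# ASSUMPTION: illustrative candidate distributions, not the slide's values.
models = {
    "P1": {"Xk1": 0.7, "Xk2": 0.3},
    "P2": {"Xk1": 0.2, "Xk2": 0.8},
}

p_xk = mix_distributions(p_context, p_model_given_context, models)
print(p_xk)
```

Keeping the candidate distributions and the second-level table in separate dictionaries mirrors the two levels of the metanetwork: changing the context changes only the weights, not the candidates.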


Two-level Bayesian C-Metanetwork for managing conditional dependencies

[Figure: the predictive level contains the nodes A, B, X, and Y; the contextual level contains the nodes P(B|A) and P(Y|X).]


Two-level Bayesian R-Metanetwork for Modelling Relevant Features’ Selection

[Figure: two-level network with a contextual level and a predictive level.]


Feature relevance modelling (1)

We define relevance as the probability that a variable is important for inferring the target attribute in the given context. Under this definition, relevance inherits all the properties of a probability.

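[Figure: network X → Y with probability P(X), relevance Ψ(X), conditional probability P(Y|X), and unknown P(Y).]

Two models of the target attribute are possible:

1. X is not relevant: Y stands alone with distribution P0(Y). The probability of having this model is P(Ψ(X)="no") = 1 − ψ_X.

2. X is relevant: X, with P(X), influences Y through P(Y|X), which gives P1(Y). The probability of having this model is P(Ψ(X)="yes") = ψ_X.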


Feature relevance modelling (2)

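[Figure: network X → Y with probability P(X), relevance Ψ(X), conditional probability P(Y|X), and unknown P(Y).]

$$P(Y) = \frac{1}{nx} \sum_{X} P(Y|X) \cdot [nx \cdot \psi_X \cdot P(X) + (1 - \psi_X)]$$

where X takes values in {x1, x2, …, xnx}.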
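The formula for P(Y) can be read as a mixture of the two candidate models P0(Y) and P1(Y). The derivation below is a sketch under one assumption that the slides do not state explicitly: when X is judged irrelevant, its nx values are treated as equally likely, so that P0(Y) is the plain average of P(Y|X). This assumption agrees with the numbers quoted in the weather example.

```latex
\begin{align*}
P_1(Y) &= \sum_{X} P(X)\, P(Y|X), \qquad
P_0(Y) = \frac{1}{nx} \sum_{X} P(Y|X), \\
P(Y)   &= \psi_X \, P_1(Y) + (1 - \psi_X)\, P_0(Y) \\
       &= \frac{1}{nx} \sum_{X} P(Y|X) \cdot [\, nx \cdot \psi_X \cdot P(X) + (1 - \psi_X) \,].
\end{align*}
```

Setting ψ_X = 1 recovers the ordinary marginal P1(Y), and setting ψ_X = 0 recovers P0(Y).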


Example (1)

Let attribute X be "state of weather" and attribute Y, which is influenced by X, be "state of mood".

X ("state of weather") = {"sunny", "overcast", "rain"}; P(X="sunny") = 0.4; P(X="overcast") = 0.5; P(X="rain") = 0.1.

Y ("state of mood") = {"good", "bad"}; P(Y="good"|X="sunny") = 0.7; P(Y="good"|X="overcast") = 0.5; P(Y="good"|X="rain") = 0.2; P(Y="bad"|X="sunny") = 0.3; P(Y="bad"|X="overcast") = 0.5; P(Y="bad"|X="rain") = 0.8.

Let the relevance of X be ψ_X = 0.6.


Example (2)

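Now we have:

$$P(Y) = \frac{1}{nx} \sum_{X} P(Y|X) \cdot [nx \cdot \psi_X \cdot P(X) + (1 - \psi_X)]$$

$$P(Y=\text{``good''}) = \frac{1}{3} \{ P(Y=\text{``good''}|X=\text{``sunny''}) \cdot [1.8 \cdot P(X=\text{``sunny''}) + 0.4] + P(Y=\text{``good''}|X=\text{``overcast''}) \cdot [1.8 \cdot P(X=\text{``overcast''}) + 0.4] + P(Y=\text{``good''}|X=\text{``rain''}) \cdot [1.8 \cdot P(X=\text{``rain''}) + 0.4] \} = 0.517;$$

$$P(Y=\text{``bad''}) = 0.483.$$

One can also notice that these values lie within the intervals bounded by the two extreme cases, when parameter X is not relevant at all or is fully relevant:

$$0.467 \approx P_0(Y=\text{``good''})|_{\psi_X=0} < P(Y=\text{``good''})|_{\psi_X=0.6} < P_1(Y=\text{``good''})|_{\psi_X=1} = 0.55$$

$$0.45 = P_1(Y=\text{``bad''})|_{\psi_X=1} < P(Y=\text{``bad''})|_{\psi_X=0.6} < P_0(Y=\text{``bad''})|_{\psi_X=0} \approx 0.533$$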
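The numbers in this example can be checked with a few lines of Python. This is a minimal sketch, assuming the relevance-weighted formula P(Y) = (1/nx) Σ P(Y|X) · [nx · ψ_X · P(X) + (1 − ψ_X)]; the function name `relevance_weighted` and the dictionary layout are illustrative choices, while the probabilities are the ones given in Example (1).

```python
def relevance_weighted(p_x, p_y_given_x, psi):
    """P(Y) = (1/nx) * sum over x of P(Y|x) * [nx * psi * P(x) + (1 - psi)]."""
    nx = len(p_x)
    p_y = {}
    for x, px in p_x.items():
        weight = (nx * psi * px + (1.0 - psi)) / nx
        for y, pyx in p_y_given_x[x].items():
            p_y[y] = p_y.get(y, 0.0) + pyx * weight
    return p_y


p_weather = {"sunny": 0.4, "overcast": 0.5, "rain": 0.1}
p_mood_given_weather = {
    "sunny": {"good": 0.7, "bad": 0.3},
    "overcast": {"good": 0.5, "bad": 0.5},
    "rain": {"good": 0.2, "bad": 0.8},
}

# psi = 0: X not relevant at all; psi = 0.6: the example; psi = 1: fully relevant.
for psi in (0.0, 0.6, 1.0):
    p_mood = relevance_weighted(p_weather, p_mood_given_weather, psi)
    print(psi, {y: round(p, 3) for y, p in p_mood.items()})
```

At ψ_X = 0.6 the values round to 0.517 for "good" and 0.483 for "bad"; the runs at ψ_X = 0 and ψ_X = 1 give the two extreme cases that bound them.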


General Case of Managing Relevance (1)

[Figure: predictive attributes X1, X2, …, XN, each with a probability P(Xi) and a relevance Ψ(Xi), feed the target attribute Y through P(Y|X1,X2,…,XN); P(Y) is unknown.]

Predictive attributes:

X1 with values {x11, x12, …, x1nx1};

X2 with values {x21, x22, …, x2nx2};

…

XN with values {xn1, xn2, …, xnnxn}.

Target attribute:

Y with values {y1, y2, …, yny}.

Probabilities:

P(X1), P(X2), …, P(XN); P(Y|X1,X2,…,XN).

Relevancies:

ψ_X1 = P(Ψ(X1) = "yes");

ψ_X2 = P(Ψ(X2) = "yes");

…

ψ_XN = P(Ψ(XN) = "yes").

Goal: to estimate P(Y).


General Case of Managing Relevance (2)

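[Figure: predictive attributes X1, X2, …, XN, each with a probability P(Xi) and a relevance Ψ(Xi), feed the target attribute Y through P(Y|X1,X2,…,XN); P(Y) is unknown.]

$$P(Y) = \frac{1}{\prod_{s=1}^{N} nx_s} \sum_{X_1} \sum_{X_2} \cdots \sum_{X_N} \Big[ P(Y|X_1,X_2,\ldots,X_N) \cdot \prod_{\forall r\,(\Psi(X_r)=\text{``yes''})} nx_r \cdot \psi_{X_r} \cdot P(X_r) \cdot \prod_{\forall q\,(\Psi(X_q)=\text{``no''})} (1 - \psi_{X_q}) \Big]$$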
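The general case can be sketched in Python under two assumptions that the slides do not spell out: the relevance indicators Ψ(X1), …, Ψ(XN) are independent, and an attribute judged irrelevant is replaced by a uniform distribution over its values. Summing over all "yes"/"no" configurations then collapses into one factor [nx_r · ψ_Xr · P(Xr) + (1 − ψ_Xr)] per attribute, which is what the code uses. The function name `target_distribution`, the data layout, and the second attribute "day" with its numbers are hypothetical and serve only as an illustration.

```python
from itertools import product
from math import prod


def target_distribution(cpt, priors, relevances):
    """Estimate P(Y) from P(Y | X1..XN), the priors P(Xi), and relevances psi_Xi.

    cpt maps a tuple of attribute values (in the key order of `priors`)
    to a dictionary {y: P(y | values)}.
    """
    names = list(priors)
    norm = prod(len(priors[name]) for name in names)
    p_y = {}
    for combo in product(*(priors[name] for name in names)):
        weight = 1.0
        for name, value in zip(names, combo):
            psi = relevances[name]
            nx = len(priors[name])
            weight *= nx * psi * priors[name][value] + (1.0 - psi)
        for y, p in cpt[combo].items():
            p_y[y] = p_y.get(y, 0.0) + p * weight / norm
    return p_y


# Weather values come from the example; "day" is a HYPOTHETICAL second attribute.
priors = {
    "weather": {"sunny": 0.4, "overcast": 0.5, "rain": 0.1},
    "day": {"workday": 0.7, "weekend": 0.3},
}
relevances = {"weather": 0.6, "day": 0.3}
mood_by_weather = {"sunny": 0.7, "overcast": 0.5, "rain": 0.2}
# In this toy table the mood ignores "day", so P(Y) must match the one-attribute case.
cpt = {
    (w, d): {"good": g, "bad": 1.0 - g}
    for w, g in mood_by_weather.items()
    for d in priors["day"]
}

p_mood = target_distribution(cpt, priors, relevances)
print({y: round(p, 3) for y, p in p_mood.items()})
```

Because the toy conditional table does not depend on "day", the result coincides with the single-attribute example (about 0.517 for "good"), which is a convenient sanity check on the factorized form.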


Example of Relevance Bayesian Metanetwork (1)

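[Figure: network X → Y with P(X), P(Y|X), and unknown P(Y); attribute A with P(A) and relevance Ψ(A); the relevance Ψ(X) depends on Ψ(A) through P(Ψ(X)|Ψ(A)).]

$$P(Y) = \frac{1}{nx} \sum_{X} \Big\{ P(Y|X) \cdot \Big[ nx \cdot P(X) \cdot \sum_{\psi_A} P(\psi_X|\psi_A) \cdot P(\psi_A) + (1 - \psi_X) \Big] \Big\}$$

Conditional relevance!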
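Conditional relevance means that ψ_X is no longer a fixed number but is itself obtained by marginalizing over the relevance of A. The Python sketch below illustrates this with hypothetical numbers: ψ_A = 0.5 and the two conditional values 0.9 and 0.3 are assumptions, chosen so that the resulting ψ_X equals the 0.6 used in the weather example. The name `marginal_relevance` is likewise illustrative.

```python
def marginal_relevance(psi_a, p_x_relevant_given_a):
    """psi_X = sum over Psi(A) of P(Psi(X)="yes" | Psi(A)) * P(Psi(A))."""
    return (p_x_relevant_given_a["yes"] * psi_a
            + p_x_relevant_given_a["no"] * (1.0 - psi_a))


# ASSUMPTION: hypothetical contextual-level values, not taken from the slides.
psi_a = 0.5
p_x_relevant_given_a = {"yes": 0.9, "no": 0.3}
psi_x = marginal_relevance(psi_a, p_x_relevant_given_a)

# Predictive-level values from the weather/mood example.
p_weather = {"sunny": 0.4, "overcast": 0.5, "rain": 0.1}
p_good_given_weather = {"sunny": 0.7, "overcast": 0.5, "rain": 0.2}

nx = len(p_weather)
p_good = sum(
    p_good_given_weather[x] * (nx * psi_x * p_weather[x] + (1.0 - psi_x))
    for x in p_weather
) / nx
print(round(psi_x, 3), round(p_good, 3))
```

With these assumed inputs ψ_X comes out as 0.6, so P(Y="good") matches the earlier example at about 0.517.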



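[Figure: Relevance Bayesian Metanetwork over nodes A, B, X, and Y, labelled with probabilities P(A), P(B), P(B|A), P(X), P(Y), P(Y|X) and relevancies Ψ(A), Ψ(X), Ψ(X|A).]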

[Figure: the same nodes and labels arranged in two levels: a predictive level with nodes A, B, X, and Y, and a contextual level with the relevancies Ψ(A), Ψ(B), Ψ(X), and Ψ(Y).]


When to Use Bayesian Metanetworks?

1. A Bayesian Metanetwork is a powerful tool when the structure (or strength) of the causal relationships between the observed parameters of an object depends substantially on context (e.g. external environment parameters).

2. It is also a useful model for an object whose diagnosis relies on different sets of observed parameters depending on the context.


Conclusion

We consider a context to be a set of contextual attributes that do not directly affect the probability distribution of the target attributes, but do affect the "relevance" of the predictive attributes to the target attributes.

In this paper we use the Bayesian Metanetwork approach to model such context-sensitive feature relevance. The model assumes that the relevance of a predictive attribute in a Bayesian network may itself be a random attribute, and it provides a tool for reasoning based not only on the probabilities of predictive attributes but also on their relevancies.


Read more about Bayesian Metanetworks in:

Terziyan V., A Bayesian Metanetwork, In: International Journal on Artificial Intelligence Tools, Vol. 14, No. 3, 2005, World Scientific, pp. 371-384.

http://www.cs.jyu.fi/ai/papers/IJAIT-2005.pdf

Terziyan V., Vitko O., Bayesian Metanetwork for Modelling User Preferences in Mobile Environment, In: German Conference on Artificial Intelligence (KI-2003), LNAI, Vol. 2821, 2003, pp. 370-384.

http://www.cs.jyu.fi/ai/papers/KI-2003.pdf

Terziyan V., Vitko O., Learning Bayesian Metanetworks from Data with Multilevel Uncertainty, In: M. Bramer and V. Devedzic (eds.), Proceedings of the First International Conference on Artificial Intelligence and Innovations, Toulouse, France, August 22-27, 2004, Kluwer Academic Publishers, pp. 187-196.

http://www.cs.jyu.fi/ai/papers/AIAI-2004.ps
