
Finite mixture model of Bounded Semi- Naïve Bayesian Network Classifiers Kaizhu Huang, Irwin King, Michael R. Lyu Multimedia Information Processing Laboratory The Chinese University of Hong Kong Shatin, NT. Hong Kong {kzhuang, king, lyu} ICANN&ICONIP2003, June, 2003 Istanbul, Turkey


Finite mixture model of Bounded Semi-Naïve Bayesian Network Classifiers

Kaizhu Huang, Irwin King, Michael R. Lyu

Multimedia Information Processing Laboratory

The Chinese University of Hong Kong
Shatin, NT, Hong Kong

{kzhuang, king, lyu}

ICANN&ICONIP 2003, June 2003
Istanbul, Turkey


ICANN&ICONIP 2003, JUNE, 2003 The Chinese University of Hong Kong Multimedia Information Processing Lab



Abstract
Background
    Classifiers
    Naïve Bayesian Classifiers
    Semi-Naïve Bayesian Classifiers
    Chow-Liu Tree
Bounded Semi-Naïve Bayesian Classifiers
Mixture of Bounded Semi-Naïve Bayesian Classifiers
Experimental Results
Discussion
Conclusion




Propose a technique for constructing semi-naïve Bayesian classifiers.
• It is bounded by the number of variables that can be combined into a node.
• It has a lower computational cost than traditional semi-naïve Bayesian networks.
• Experiments show the proposed technique is more accurate.

Upgrade the semi-naïve structure into a mixture structure.
• The expressive power is increased.
• Experiments show the mixture approach outperforms other types of classifiers.



A Typical Classification Problem

Given a set of symptoms, one wants to find out whether these symptoms give rise to a particular disease.



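Probabilistic Classifiers

The classification mapping function is defined as:

$$c(x) = \arg\max_{c_l} P(c_l \mid x) = \arg\max_{c_l} \frac{P(c_l, x)}{P(x)} = \arg\max_{c_l} P(c_l, x)$$

Here $P(c_l \mid x)$ is the posterior probability, $P(c_l, x)$ is the joint probability, and $P(x)$ is a constant for a given $x$ with respect to $c_l$.

The joint probability is not easily estimated from the dataset. Usually an assumption about the distribution has to be made, e.g., are the attributes dependent or independent?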



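Related Work

Naïve Bayesian Classifiers (NB)

Assumption: given the class label C, the attributes are independent:

$$P(x_1, x_2, \ldots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)$$

Classification mapping function:

$$c(x) = \arg\max_{c_l} P(c_l) \prod_{i=1}^{n} P(x_i \mid c_l)$$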
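The NB mapping function can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Matlab code: the function names `train_nb` and `predict_nb`, the Laplace smoothing constant `alpha`, and the toy symptom data are all assumptions made for the example.

```python
import math
from collections import Counter

def train_nb(samples, labels, alpha=1.0):
    """Count class frequencies and per-attribute value frequencies."""
    n_features = len(samples[0])
    class_counts = Counter(labels)
    # value_counts[c][i][v]: number of class-c samples whose i-th attribute is v
    value_counts = {c: [Counter() for _ in range(n_features)] for c in class_counts}
    values = [set() for _ in range(n_features)]
    for x, c in zip(samples, labels):
        for i, v in enumerate(x):
            value_counts[c][i][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values, alpha

def predict_nb(model, x):
    """Return argmax_c log P(c) + sum_i log P(x_i | c), with Laplace smoothing."""
    class_counts, value_counts, values, alpha = model
    total = sum(class_counts.values())

    def score(c):
        s = math.log(class_counts[c] / total)
        for i, v in enumerate(x):
            num = value_counts[c][i][v] + alpha
            den = class_counts[c] + alpha * len(values[i])
            s += math.log(num / den)
        return s

    return max(class_counts, key=score)

# Hypothetical toy data: (fever, cough) -> diagnosis
toy_x = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
toy_y = ["flu", "flu", "flu", "healthy", "healthy", "healthy"]
toy_model = train_nb(toy_x, toy_y)
```

Calling `predict_nb(toy_model, (1, 1))` scores each class by its prior times the per-attribute likelihoods, which is the product form of the mapping function.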




Related Work

Naïve Bayesian Classifiers

NB's performance is comparable with some state-of-the-art classifiers, even though its independency assumption normally does not hold.

Question: can the performance be better when the conditional independency assumption of NB is relaxed?



Related Work

Semi-Naïve Bayesian Classifiers (SNB)

A looser assumption than NB. Independency occurs among the jointed variables, given the class label C.
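A small sketch may help show what jointing buys. Below, a partition of the attribute indices (`groups`) defines the jointed variables, so `[[0], [1]]` is plain NB and `[[0, 1]]` joins both attributes into one node. The four-point XOR data, the function names, and the smoothing constant are assumptions for illustration only; the data is not claimed to match any dataset used in the experiments.

```python
import math
from collections import Counter

def fit(groups, samples, labels, alpha=1.0):
    """Count joint values of each group of attributes, per class."""
    classes = Counter(labels)
    counts = {(c, g): Counter() for c in classes for g in range(len(groups))}
    seen = [set() for _ in groups]
    for x, c in zip(samples, labels):
        for g, idx in enumerate(groups):
            joint = tuple(x[i] for i in idx)
            counts[(c, g)][joint] += 1
            seen[g].add(joint)
    return groups, classes, counts, seen, alpha

def predict(model, x):
    """Pick the class maximizing log P(c) + sum over groups of log P(group | c)."""
    groups, classes, counts, seen, alpha = model
    total = sum(classes.values())

    def score(c):
        s = math.log(classes[c] / total)
        for g, idx in enumerate(groups):
            joint = tuple(x[i] for i in idx)
            s += math.log((counts[(c, g)][joint] + alpha)
                          / (classes[c] + alpha * len(seen[g])))
        return s

    return max(classes, key=score)

def accuracy(groups, samples, labels):
    model = fit(groups, samples, labels)
    hits = sum(predict(model, x) == c for x, c in zip(samples, labels))
    return hits / len(samples)

# Assumed toy data: the class is the XOR of the two attributes.
xor_x = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_y = [a ^ b for a, b in xor_x]
nb_acc = accuracy([[0], [1]], xor_x, xor_y)   # attributes kept separate
snb_acc = accuracy([[0, 1]], xor_x, xor_y)    # attributes jointed into one node
```

On this data each single attribute carries no information about the class, so the NB scores tie for every input, while the jointed variable separates the classes.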



Related Work

Chow-Liu Tree (CLT)

Another looser assumption than NB. A dependence tree exists among the variables, given the class variable C.

[Figure: a tree dependence structure]
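The classical Chow-Liu construction weights every pair of variables by empirical mutual information and keeps a maximum-weight spanning tree. The sketch below follows that recipe with Kruskal's algorithm on data from a single class; the function names and the toy rows are assumptions, and conditioning on C would amount to running it once per class.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(data, i, j):
    """Empirical mutual information between columns i and j."""
    n = len(data)
    pi = Counter(row[i] for row in data)
    pj = Counter(row[j] for row in data)
    pij = Counter((row[i], row[j]) for row in data)
    return sum((c / n) * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())

def chow_liu_edges(data):
    """Maximum-weight spanning tree over pairwise mutual information (Kruskal)."""
    n_vars = len(data[0])
    weighted = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:          # adding this edge creates no cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges

# Assumed toy rows: column 1 copies column 0, column 2 is independent of both.
clt_rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
clt_edges = chow_liu_edges(clt_rows)
```

The strongly dependent pair (0, 1) is always kept; the third variable is attached by a zero-weight edge because a spanning tree must connect every node.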



Summary of Related Work

CLT: a conditional tree dependency assumption among variables. Chow & Liu68 developed a globally optimal, polynomial-time algorithm.

SNB: a conditional independency assumption among jointed variables. Traditional SNBs are not as well developed as CLT.



Problems of Traditional SNBs

Accurate? Kononenko91 and Pazzani96 both rely on a local heuristic.
Efficient? No. Kononenko91 is inefficient even in jointing 3 variables; Pazzani96 has exponential time cost.
Strong assumption? Strong. Semi-dependence does not hold in real cases either.




Our Solution

Bounded Semi-Naïve Bayesian Network (B-SNB)
    Accurate? We use a global combinatorial optimization method.
    Efficient? We find the network based on Linear Programming, which can be solved in polynomial time.

Mixture of B-SNB (MBSNB)
    Strong assumption? The mixture structure is a superclass of B-SNB.



Our Solution

Improved significantly



Bounded Semi-Naïve Bayesian Network Model Definition

• Jointed variables
• Completely covering the variable set without overlapping
• Conditional independency
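The formulas that these three annotations labelled did not survive extraction. One way to write the conditions down, assuming the notation $B_1, \ldots, B_m$ for the jointed variables (a naming choice made here, not taken from the slides), is:

```latex
\begin{align}
& B_i \subseteq \{x_1, \ldots, x_n\}, \quad 1 \le |B_i| \le K
  && \text{(jointed variables, bounded by } K\text{)} \\
& B_i \cap B_j = \emptyset \ (i \ne j), \quad
  \bigcup_{i=1}^{m} B_i = \{x_1, \ldots, x_n\}
  && \text{(complete cover without overlap)} \\
& P(B_1, \ldots, B_m \mid C) = \prod_{i=1}^{m} P(B_i \mid C)
  && \text{(conditional independency)}
\end{align}
```

With $K = 1$ every $B_i$ is a single attribute and the third line reduces to the NB assumption.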



Constraining the Search Space

Large search space. It is reduced by adding the following constraint: the cardinality of each jointed variable is exactly equal to K.

Hidden principle: when K is small, one jointed variable of cardinality K is more accurate than the same variables separated into several smaller jointed variables. Example: P(a,b)P(c,d) is closer to P(a,b,c,d) than P(a,b)P(c)P(d) is.

Search space after reduction:
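The example in the hidden principle can be checked numerically. The sketch below measures closeness by Kullback-Leibler divergence from the true joint, which is an assumption about what "closer" means here, and uses an arbitrary random distribution over four binary variables. The gap between the two divergences equals the mutual information between c and d, so the coarser factorization can never be further from the joint.

```python
import itertools
import math
import random

def marginal(p, idx):
    """Marginalize the joint table p onto the variables in idx."""
    out = {}
    for x, v in p.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + v
    return out

def kl_to_product(p, groups):
    """KL divergence from p to the product of its marginals over `groups`."""
    margs = [marginal(p, g) for g in groups]
    total = 0.0
    for x, v in p.items():
        q = 1.0
        for g, m in zip(groups, margs):
            q *= m[tuple(x[i] for i in g)]
        total += v * math.log(v / q)
    return total

# Arbitrary joint distribution over (a, b, c, d), each binary.
random.seed(0)
states = list(itertools.product([0, 1], repeat=4))
raw = [random.random() for _ in states]
z = sum(raw)
p = {s: r / z for s, r in zip(states, raw)}

kl_pairs = kl_to_product(p, [(0, 1), (2, 3)])      # P(a,b) P(c,d)
kl_split = kl_to_product(p, [(0, 1), (2,), (3,)])  # P(a,b) P(c) P(d)
```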



Searching K-Bounded-SNB Model

How to search for the appropriate model? Find the m = [n/K] K-cardinality subsets (jointed variables) of the variable (feature) set that satisfy the SNB conditions and maximize the log likelihood.

[x] means rounding x to the nearest integer.
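For very small n the search can be done by brute force, which makes the objective concrete. The sketch below is not the authors' method: it enumerates every partition into groups of exactly K variables and assumes n is divisible by K. It relies on the fact that, with maximum-likelihood tables, the log likelihood of a product of jointed variables equals minus the sample count times the sum of their empirical entropies, so maximizing one means minimizing the other. Function names and data are illustrative, and the class variable is left out (the data is taken to come from one class).

```python
import itertools
import math
from collections import Counter

def entropy(data, group):
    """Empirical joint entropy of the columns listed in `group`."""
    n = len(data)
    counts = Counter(tuple(row[i] for i in group) for row in data)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def partitions(variables, k):
    """Yield every split of `variables` into unordered groups of exactly k items."""
    if not variables:
        yield []
        return
    first, rest = variables[0], variables[1:]
    for others in itertools.combinations(rest, k - 1):
        remaining = [v for v in rest if v not in others]
        for tail in partitions(remaining, k):
            yield [(first,) + others] + tail

def best_partition(data, k):
    """Partition with minimum total entropy, i.e. maximum log likelihood."""
    n_vars = len(data[0])
    return min(partitions(list(range(n_vars)), k),
               key=lambda part: sum(entropy(data, g) for g in part))

# Assumed toy rows: column 2 copies column 0 and column 3 copies column 1.
search_rows = [(a, b, a, b) for a in (0, 1) for b in (0, 1)]
best = best_partition(search_rows, 2)
```

The number of partitions grows very quickly with n, which is the reason a polynomial-time formulation is attractive.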



Global Optimization Procedure

Constraints of the integer programming (IP) problem:
• No overlap among jointed variables.
• All the jointed variables together form the variable set.

Relax the previous constraints into 0 ≤ x ≤ 1: the integer programming (IP) problem is changed into a linear programming (LP) problem.

Rounding scheme: round the LP solution into an IP solution.
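The slides do not spell out the rounding scheme, so the following is only one plausible reading: take candidate subsets in decreasing order of their fractional LP value and keep each one that does not overlap anything already kept. The function name, the dictionary input format, and the example values are assumptions; the LP solve itself is not shown.

```python
def round_lp_solution(fractional, n_vars):
    """Greedily turn fractional subset indicators into a 0/1 selection.

    `fractional` maps each candidate subset (a tuple of variable indices)
    to its LP value in [0, 1]. Returns the chosen subsets and a flag saying
    whether they cover all n_vars variables.
    """
    chosen, covered = [], set()
    ranked = sorted(fractional.items(), key=lambda kv: kv[1], reverse=True)
    for subset, _value in ranked:
        if covered.isdisjoint(subset):   # keep the no-overlap constraint
            chosen.append(subset)
            covered.update(subset)
    return chosen, covered == set(range(n_vars))

# Hypothetical LP output for n = 4 variables and K = 2.
lp_values = {(0, 1): 0.6, (2, 3): 0.6, (0, 2): 0.4, (1, 3): 0.4}
rounded, is_cover = round_lp_solution(lp_values, 4)
```

A greedy pass like this always respects the no-overlap constraint, but it can fail to cover every variable, which is why the function reports coverage separately.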



Mixture Upgrading (using EM)



Update S_k by the B-SNB method.
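The E-step and M-step formulas are missing from the extracted slide, so the sketch below only illustrates the general shape of EM for a mixture of jointed-variable models. It makes a simplifying assumption that departs from the slide: each component's structure S_k is fixed in advance, whereas the slide re-learns S_k with the B-SNB method in every update. The names, the smoothing constant, and the data are all illustrative.

```python
import random
from collections import defaultdict

def component_prob(x, groups, tables):
    """Probability of x under one component: product of its group tables."""
    p = 1.0
    for g, table in zip(groups, tables):
        p *= table.get(tuple(x[i] for i in g), 1e-9)
    return p

def em_mixture(data, structures, n_iter=20, alpha=1.0, seed=0):
    """EM for a mixture whose components have fixed jointed-variable structures."""
    rng = random.Random(seed)
    k = len(structures)
    # Start from random soft assignments of samples to components.
    resp = []
    for _ in data:
        r = [rng.random() + 1e-3 for _ in range(k)]
        s = sum(r)
        resp.append([v / s for v in r])
    for _ in range(n_iter):
        # M-step: mixing weights and smoothed, responsibility-weighted tables.
        weights = [sum(r[j] for r in resp) / len(data) for j in range(k)]
        tables = []
        for j, groups in enumerate(structures):
            comp = []
            for g in groups:
                counts = defaultdict(float)
                for x, r in zip(data, resp):
                    counts[tuple(x[i] for i in g)] += r[j]
                total = sum(counts.values()) + alpha * len(counts)
                comp.append({key: (c + alpha) / total for key, c in counts.items()})
            tables.append(comp)
        # E-step: posterior responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[j] * component_prob(x, structures[j], tables[j])
                     for j in range(k)]
            s = sum(joint)
            resp.append([v / s for v in joint])
    return weights, tables, resp

# Assumed toy data and two candidate structures over four variables.
em_data = [(0, 0, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0),
           (1, 1, 1, 1), (0, 1, 0, 1), (1, 0, 1, 0)]
em_structures = [[(0, 1), (2, 3)], [(0, 2), (1, 3)]]
em_weights, em_tables, em_resp = em_mixture(em_data, em_structures)
```

In a fuller version, the M-step would also re-run the structure search on the responsibility-weighted data for each component.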



Experimental Setup

Datasets
• 6 benchmark datasets from the UCI machine learning repository
• 1 synthetically generated dataset named “XOR”

Experimental Environments
• Platform: Windows 2000
• Developing tool: Matlab 6.1



Experimental Results

Overall Prediction Rate (%)

• We set the bound parameter K to 2 and 3.
• 2-BSNB means the B-SNB model with the bound parameter set to 2.



C4.5 vs MBSNB



Average Error Rate

Average Error Rate Chart




Large K B-SNBs are not good for sparse datasets. Post dataset: 90 samples; K=3, the accuracy


Which value of K is good depends on the properties of the dataset. For example, Tic-Tac-Toe and Vehicle have a 3-variable bias; with K=3, the accuracy increases.




When n cannot be divided by K exactly, i.e. (n mod K) = l with l ≠ 0, the assumption that all the jointed variables have the same cardinality K is violated.

Solution:
• Find an l-cardinality jointed variable with the minimum entropy.
• Do the optimization on the other n-l variables, since ((n-l) mod K) will be 0.
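The remainder step can be sketched directly from that description: compute l = n mod K, pick the l-variable subset with the lowest empirical entropy, and hand the remaining n-l variables to the K-bounded search. The function names and the toy data are assumptions, and the exhaustive scan over l-subsets is only practical for small n.

```python
import itertools
import math
from collections import Counter

def entropy(data, group):
    """Empirical joint entropy of the columns listed in `group`."""
    n = len(data)
    counts = Counter(tuple(row[i] for i in group) for row in data)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def split_remainder(data, k):
    """Return (leftover group of size n mod k, variables left for the K-search)."""
    n_vars = len(data[0])
    l = n_vars % k
    if l == 0:
        return (), list(range(n_vars))
    leftover = min(itertools.combinations(range(n_vars), l),
                   key=lambda g: entropy(data, g))
    rest = [v for v in range(n_vars) if v not in leftover]
    return leftover, rest

# Assumed toy rows with n = 5: the last column is constant, so its entropy is 0.
rem_rows = [(a, b, a, b, 0) for a in (0, 1) for b in (0, 1)]
leftover, rest = split_remainder(rem_rows, 2)
```

After the split, the number of remaining variables is a multiple of K, so the equal-cardinality assumption holds again for them.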


How to choose K?
• When the number of samples in the dataset is small, a large K may not perform well.
• A good K should be related to the nature of the dataset.
• A natural way is to use cross validation to find the optimal K.




A novel Bounded Semi-Naïve Bayesian classifier is proposed.
• A direct combinatorial optimization method enables B-SNB to be optimized globally.
• The transformation from an IP into an LP problem reduces the computational complexity to a polynomial one.

A mixture of B-SNB is developed.
• It expands the expressive power of B-SNB.
• Experimental results show the mixture approach outperforms other types of classifiers.



Thank you!