Machine Learning Tutorial 1: Optimal Bayes Classification
TRANSCRIPT
Theory Review
We assume all variables are random variables with known distributions.

Notation:
- $\Omega$: a finite set of classes (categories), $\omega_i \in \Omega$, $i = 1, \dots, N$
- $X$: the input space (patterns), $x \in X$
- A classifier $f : X \to \Omega$ maps $x \in X$ to $\omega \in \Omega$
Basic Assumption
The following distributions are known:
• $p(\omega_1), p(\omega_2), \dots, p(\omega_N)$ - the prior probability of each class.
• $p(x \mid \omega_i)$, $i = 1, \dots, N$ - the conditional probability of the input, given that the class is $\omega_i$.
• If $X$ is a continuous space, $p(x \mid \omega_i)$ denotes the probability density.
Reminder: Fish Classification
• Two classes
• Prior probability can be estimated from relative frequency
• Class conditional probability can be estimated by a frequency histogram, as sketched below
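A minimal Python sketch of both estimates. The feature (fish length), the sample sizes, and the bin count are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Hypothetical labeled sample: a 1-D feature (fish length) per example,
# with class labels 0 and 1 for the two kinds of fish.
rng = np.random.default_rng(0)
lengths = np.concatenate([rng.normal(10, 2, 300), rng.normal(14, 2, 700)])
labels = np.array([0] * 300 + [1] * 700)

# Prior estimated from relative frequency.
priors = np.bincount(labels) / len(labels)   # e.g. [0.3, 0.7]

# Class conditional probability estimated by a normalized frequency histogram.
bins = np.linspace(lengths.min(), lengths.max(), 21)
cond_hist = [np.histogram(lengths[labels == c], bins=bins, density=True)[0]
             for c in (0, 1)]

print(priors)
print(cond_hist[0])  # histogram estimate of p(x | class 0), one value per bin
```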
Optimal Bayes Classifier
Bayes Rule:

$$p(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, p(\omega_i)}{p(x)} = \frac{p(x \mid \omega_i)\, p(\omega_i)}{\sum_{j=1}^{N} p(x \mid \omega_j)\, p(\omega_j)}$$

Optimal Bayes Classifier:

$$f(x) = \underset{i=1,\dots,N}{\arg\max}\; p(\omega_i \mid x) = \underset{i=1,\dots,N}{\arg\max}\; p(x \mid \omega_i)\, p(\omega_i)$$

This classifier minimizes both the conditional error probability and the average error probability.
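As a concrete illustration, here is a minimal Python sketch of the argmax rule. The two-class priors, the Gaussian likelihoods, and the function names are assumptions made for the example; the slide only fixes the rule itself:

```python
import numpy as np
from scipy.stats import norm

def bayes_classify(x, priors, likelihoods):
    """Return argmax_i of p(x | w_i) * p(w_i).

    Dividing by p(x) is unnecessary: it is the same for every class,
    so it does not change the argmax.
    """
    scores = [lik(x) * prior for lik, prior in zip(likelihoods, priors)]
    return int(np.argmax(scores))

# Illustrative example: p(w_0) = 0.3, p(w_1) = 0.7,
# p(x|w_0) = N(10, 2^2), p(x|w_1) = N(14, 2^2).
priors = [0.3, 0.7]
likelihoods = [norm(10, 2).pdf, norm(14, 2).pdf]
print(bayes_classify(11.0, priors, likelihoods))  # -> 0: w_0 wins at x = 11
```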
Exercise 1

It is given that $X = \mathbb{R}^2$ and $\Omega = \{\omega_1, \omega_2, \omega_3\}$.

The prior probability is uniform:

$$p(\omega_1) = p(\omega_2) = p(\omega_3) = \frac{1}{3}$$

And the class conditional probability is Gaussian:

$$p(x \mid \omega_i) \sim N(\mu_i, \Sigma)$$

where

$$\mu_1 = \begin{pmatrix} 0 \\ 3 \end{pmatrix}, \quad \mu_2 = \begin{pmatrix} 3 \\ 0 \end{pmatrix}, \quad \mu_3 = \begin{pmatrix} -3 \\ 0 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix}$$

a. What is the Bayes optimal decision rule? What are the decision boundaries in the plane?
b. Does the decision boundary for the Gaussian case always have the same form? What does it depend on?
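A quick numerical sketch for part (a), using the fact that with equal priors and one shared spherical covariance $\Sigma = \frac{1}{2}I$, maximizing $p(x \mid \omega_i)\,p(\omega_i)$ is the same as minimizing the Euclidean distance $\|x - \mu_i\|$. The test points are arbitrary:

```python
import numpy as np

# Means from the exercise; equal priors, shared covariance Sigma = I/2.
mus = np.array([[0.0, 3.0], [3.0, 0.0], [-3.0, 0.0]])

def classify(x):
    # With equal priors and a shared spherical covariance, the Bayes
    # rule reduces to picking the nearest mean in Euclidean distance.
    return int(np.argmin(np.linalg.norm(mus - x, axis=1))) + 1  # class 1..3

print(classify(np.array([0.0, 1.0])))   # -> 1 (closest to mu_1)
print(classify(np.array([2.0, -1.0])))  # -> 2 (closest to mu_2)
```

This also hints at the shape of the boundaries here: each pairwise boundary is the perpendicular bisector of the segment joining two means, i.e. a straight line.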
Exercise 2

It is given that the input space is the set of binary vectors of length $d$, meaning that $x = (x_1, \dots, x_d) \in \{0,1\}^d$ with $x_i \in \{0,1\}$. The output space is $\Omega = \{\omega_0, \omega_1\}$, with a general prior $p(\omega_0), p(\omega_1)$.

We define the per-coordinate class conditional probabilities:

$$p_i = p(x_i = 1 \mid \omega_0), \qquad q_i = p(x_i = 1 \mid \omega_1)$$

In addition, given the class, each coordinate is statistically independent of all the other coordinates.

a. What is the optimal decision rule?
b. What happens if $p_i = q_i$ for some $i$?
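Under this conditional-independence assumption the likelihood of a whole vector factorizes, $p(x \mid \omega) = \prod_i p(x_i \mid \omega)$, which the following sketch uses. The specific values of $p_i$, $q_i$, and the prior are illustrative assumptions:

```python
import numpy as np

def log_posterior_ratio(x, p, q, prior0, prior1):
    """log[ p(w_1|x) / p(w_0|x) ] for a binary vector x whose coordinates
    are independent given the class.

    p[i] = p(x_i = 1 | w_0), q[i] = p(x_i = 1 | w_1).
    """
    ll0 = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    ll1 = np.sum(x * np.log(q) + (1 - x) * np.log(1 - q))
    return (ll1 + np.log(prior1)) - (ll0 + np.log(prior0))

# Decide w_1 exactly when the log-ratio is positive.
p = np.array([0.2, 0.5, 0.9])
q = np.array([0.7, 0.5, 0.1])
x = np.array([1, 0, 0])
print(log_posterior_ratio(x, p, q, 0.5, 0.5) > 0)  # -> True: decide w_1
```

Note that the middle coordinate, where $p_i = q_i$, contributes identical terms to both log-likelihoods and cancels from the ratio, which is the situation part (b) asks about.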
Exercise 3
Two classes $\omega_0, \omega_1$ are given, with uniform prior. In addition it is given that:

$$p(x \mid \omega_0) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{(x-2)^2}{2} \right\}$$

$$p(x \mid \omega_1) = \frac{1}{2\sqrt{2\pi}} \exp\left\{ -\frac{(x-4)^2}{8} \right\}$$

What is the Bayes optimal decision rule?
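Assuming the densities as reconstructed above (i.e. $p(x \mid \omega_0)$ is a $N(2,1)$ density and $p(x \mid \omega_1)$ is $N(4,4)$), here is a short sketch that locates the decision boundary numerically. With a uniform prior, comparing posteriors reduces to comparing the two densities:

```python
import numpy as np

def log_ratio(x):
    # log p(x|w_1) - log p(x|w_0); with equal priors, decide w_1 when > 0.
    return (-np.log(2) - (x - 4) ** 2 / 8) - (-(x - 2) ** 2 / 2)

# Setting the log-ratio to zero and simplifying gives the quadratic
# 3x^2 - 8x - 8 ln 2 = 0, so the boundary is (at most) two points:
roots = np.roots([3.0, -8.0, -8.0 * np.log(2)])
print(np.sort(roots))        # approx. [-0.57, 3.24]
print(log_ratio(0.0) > 0)    # -> False: between the roots, w_0 wins
print(log_ratio(5.0) > 0)    # -> True: in the tails, w_1 wins
```

Because the two variances differ, the log-ratio is quadratic in $x$, so the boundary is a pair of points rather than a single threshold: the wider class $\omega_1$ wins in both tails.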