chapter 2 (part 3) bayesian decision theory discriminant functions for the normal density bayes...

16
Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used in this course were taken from the textbook “Pattern Classification” by Duda et al., John Wiley & Sons, 2001 with the permission of the authors and the publisher

Post on 21-Dec-2015

227 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Chapter 2 (part 3)Bayesian Decision Theory

Discriminant Functions for the Normal Density

Bayes Decision Theory – Discrete Features

All materials used in this course were taken from the textbook “Pattern Classification” by Duda et al., John Wiley & Sons, 2001 with the permission of the authors and the publisher

Page 2: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

2Discriminant Functions for the Normal Density

We saw that the minimum error-rate classification can be achieved by the discriminant function

gi(x) = ln P(x | i) + ln P(i)

Case of multivariate normal

)(Plnln21

2ln2d

)x()x(21

)x(g ii

1

ii

tii

6

Page 3: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

3

Case i = 2.I (I stands for the identity matrix)

)category! th the for threshold the called is (

)(Pln2

1w ;w

:where

function)nt discrimina (linear wxw)x(g

0i

iiti20i2

ii

0itii

i

6

Page 4: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

4

– A classifier that uses linear discriminant functions is called “a linear machine”

– The decision surfaces for a linear machine are pieces of hyperplanes defined by:

gi(x) = gj(x)

6

Page 5: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

5

6

Page 6: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

6

– The hyperplane separating Ri and Rj

always orthogonal to the line linking the means!

)()(P

)(Pln)(

21

x jij

i2

ji

2

ji0

)(21

x then )(P)(P if ji0ji

6

Page 7: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

7

6

Page 8: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

8

6

Page 9: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

9

Case i = (covariance of all classes are identical but arbitrary!)

– Hyperplane separating Ri and Rj

(the hyperplane separating Ri and Rj is generally not orthogonal to the line between the means!)

).(

)()(

)(P/)(Pln)(

21

x jiji

1tji

jiji0

6

Page 10: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

10

6

Page 11: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

11

6

Page 12: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

12

Case i = arbitrary

– The covariance matrices are different for each category

(Hyperquadrics which are: hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, hyperhyperboloids)

)(lnln2

1

2

1 w

w

2

1 W

:

)(

10

1i

1i

0

iiiitii

ii

i

itii

ti

P

where

wxwxWxxg

6

Page 13: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

13

6

Page 14: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

14

6

Page 15: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

15

Bayes Decision Theory – Discrete Features

Components of x are binary or integer valued, x can take only one of m discrete values

v1, v2, …, vm

Case of independent binary features in 2 category problem

Let x = (x1, x2, …, xd)t where each xi is either 0 or 1, with probabilities:

pi = P(xi = 1 | 1)

qi = P(xi = 1 | 2)

9

Page 16: Chapter 2 (part 3) Bayesian Decision Theory Discriminant Functions for the Normal Density Bayes Decision Theory – Discrete Features All materials used

Dr. Djamel Bouchaffra CSE 616 Applied Pattern Recognition, Chapter 2 , Section 2.

16

The discriminant function in this case is:

0g(x) if and 0g(x) if decide

)(P

)(Pln

q1

p1lnw

:and

d,...,1i )p1(q

)q1(plnw

:where

wxw)x(g

21

2

1d

1i i

i0

ii

iii

0i

d

1ii

9