
CPSC 540

Nando de Freitas
February 2013, University of British Columbia

Decision trees

Outline of the lecture

This lecture provides an introduction to decision trees. It discusses:

• Decision trees
• Using reduction in entropy as a criterion for constructing decision trees
• The application of decision trees to classification

Motivation example 1: object detection

Motivation example 2: Kinect

Image classification example

[MSR Tutorial on decision forests by Criminisi et al, 2011]

Classification tree

[Criminisi et al, 2011]

Another commerce example

Simafore.com

From a spreadsheet to a decision node

[AI book of Stuart Russell and Peter Norvig]

A learned decision tree

[AI book of Stuart Russell and Peter Norvig]

How do we construct the tree? That is, how do we pick the attributes (nodes)?

For a training set containing p positive examples and n negative examples, we have:

$$H\!\left(\frac{p}{p+n}, \frac{n}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$
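A minimal Python sketch of this formula, added here as an illustration rather than code from the lecture; the function name `entropy_bits` and the zero-count convention are choices of the sketch:

```python
from math import log2

def entropy_bits(p, n):
    """Entropy H(p/(p+n), n/(p+n)) in bits for p positive and n negative examples."""
    total = p + n
    h = 0.0
    for count in (p, n):
        if count > 0:                 # convention: 0 * log2(0) is taken to be 0
            q = count / total
            h -= q * log2(q)
    return h

print(entropy_bits(6, 6))   # 1.0 bit: a 50/50 split is maximally uncertain
print(entropy_bits(4, 0))   # 0.0 bits: a pure set has no uncertainty
```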

• A chosen attribute A, with K distinct values, divides the training set E into subsets $E_1, \ldots, E_K$.

• The expected entropy (EH) remaining after trying attribute A (with branches $i = 1, 2, \ldots, K$) is:

$$EH(A) = \sum_{i=1}^{K} \frac{p_i + n_i}{p + n}\, H\!\left(\frac{p_i}{p_i + n_i}, \frac{n_i}{p_i + n_i}\right)$$

How to pick nodes?

• The information gain (I), or reduction in entropy, for attribute A is:

$$I(A) = H\!\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - EH(A)$$

• Choose the attribute with the largest I.

[Hwee Tou Ng & Stuart Russell]
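The sketch below (my own Python illustration, not lecture code) puts the two formulas together: given the per-branch counts $(p_i, n_i)$ produced by splitting on an attribute, it returns $I(A)$. The function names are hypothetical.

```python
from math import log2

def entropy_bits(p, n):
    """H(p/(p+n), n/(p+n)) in bits, with the convention 0 * log2(0) = 0."""
    total = p + n
    return -sum(c / total * log2(c / total) for c in (p, n) if c > 0)

def information_gain(branches):
    """branches: list of (p_i, n_i) counts, one pair per value of the attribute.

    Returns I(A) = H(p/(p+n), n/(p+n)) - sum_i (p_i+n_i)/(p+n) * H(p_i/(p_i+n_i), n_i/(p_i+n_i)).
    """
    p = sum(pi for pi, _ in branches)
    n = sum(ni for _, ni in branches)
    expected = sum((pi + ni) / (p + n) * entropy_bits(pi, ni) for pi, ni in branches)
    return entropy_bits(p, n) - expected
```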

Example

• Convention: for the training set, p = n = 6, so H(6/12, 6/12) = 1 bit.
• Consider the attributes Patrons and Type (and others too):

$$I(\textit{Patrons}) = 1 - \left[\tfrac{2}{12}H(0,1) + \tfrac{4}{12}H(1,0) + \tfrac{6}{12}H\!\left(\tfrac{2}{6},\tfrac{4}{6}\right)\right] = 0.541 \text{ bits}$$

$$I(\textit{Type}) = 1 - \left[\tfrac{2}{12}H\!\left(\tfrac{1}{2},\tfrac{1}{2}\right) + \tfrac{2}{12}H\!\left(\tfrac{1}{2},\tfrac{1}{2}\right) + \tfrac{4}{12}H\!\left(\tfrac{2}{4},\tfrac{2}{4}\right) + \tfrac{4}{12}H\!\left(\tfrac{2}{4},\tfrac{2}{4}\right)\right] = 0 \text{ bits}$$

Patrons has the largest information gain of these attributes, so it is chosen as the root of the tree.

[Hwee Tou Ng & Stuart Russell]
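To check the arithmetic, here is a small self-contained Python sketch (mine, not from the slides). The per-branch positive/negative counts are read directly off the fractions in the formulas above; the underlying attribute values belong to the restaurant example in the cited Russell and Norvig book.

```python
from math import log2

def H(q):
    """Binary entropy in bits of a class proportion q, with 0 * log2(0) = 0."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

# I(Patrons): branches with (positive, negative) counts (0, 2), (4, 0), (2, 4)
I_patrons = 1 - (2/12 * H(0/2) + 4/12 * H(4/4) + 6/12 * H(2/6))

# I(Type): branches with counts (1, 1), (1, 1), (2, 2), (2, 2)
I_type = 1 - (2/12 * H(1/2) + 2/12 * H(1/2) + 4/12 * H(2/4) + 4/12 * H(2/4))

print(round(I_patrons, 3), round(I_type, 3))   # 0.541 0.0
```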

Classification tree

[Criminisi et al, 2011]

Use information gain to decide splits

[Criminisi et al, 2011]

Advanced: Gaussian information gain to decide splits

[Criminisi et al, 2011]


Alternative node decisions

[Criminisi et al, 2011]

Next lecture

The next lecture introduces random forests.
