decision trees: representation - svivek · summary: decision trees • decision trees can represent...

38
Machine Learning Decision Trees: Representation 1 Some slides from Tom Mitchell, Dan Roth and others

Upload: others

Post on 22-May-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

MachineLearning

DecisionTrees:Representation

1SomeslidesfromTomMitchell,DanRothandothers

Page 2: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Keyissuesinmachinelearning

• ModelingHowtoformulateyourproblemasamachinelearningproblem?Howtorepresentdata?Whichalgorithmstouse?Whatlearningprotocols?

• RepresentationGoodhypothesisspacesandgoodfeatures

• Algorithms– Whatisagoodlearningalgorithm?– Whatissuccess?– Generalizationvs overfitting– Thecomputationalquestion:Howlongwilllearningtake?

2

Page 3: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Comingup…(therestofthesemester)

Differenthypothesisspacesandlearningalgorithms– DecisiontreesandtheID3algorithm– Linearclassifiers

• Perceptron• SVM• Logisticregression

– Combiningmultipleclassifiers• Boosting,bagging

– Non-linearclassifiers– Nearestneighbors

3

Page 4: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Comingup…(therestofthesemester)

Differenthypothesisspacesandlearningalgorithms– DecisiontreesandtheID3algorithm– Linearclassifiers

• Perceptron• SVM• Logisticregression

– Combiningmultipleclassifiers• Boosting,bagging

– Non-linearclassifiers– Nearestneighbors

4

Importantissuestoconsider

1. Whatdothesehypothesesrepresent?

2. Implicitassumptionsandtradeoffs

3. Generalization?

4. Howdowelearn?

Page 5: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Thislecture:LearningDecisionTrees

1. Representation:Whataredecisiontrees?

2. Algorithm:Learningdecisiontrees– TheID3algorithm:Agreedyheuristic

3. Someextensions

5

Page 6: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Thislecture:LearningDecisionTrees

1. Representation:Whataredecisiontrees?

2. Algorithm:Learningdecisiontrees– TheID3algorithm:Agreedyheuristic

3. Someextensions

6

Page 7: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Representingdata

Datacanberepresentedasabigtable,withcolumnsdenotingdifferentattributes

Name Label

ClaireCardie -PeterBartlett +EricBaum +Haym Hirsh +Shai Ben-David +MichaelI.Jordan -

7

Page 8: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Representingdata

Datacanberepresentedasabigtable,withcolumnsdenotingdifferentattributes

Name Namehaspunctuation?

Secondcharacteroffirstname

Lengthoffirst

name>5?

Samefirstletterintwonames?

Label

ClaireCardie No l Yes Yes -PeterBartlett No e No No +EricBaum No r No No +Haym Hirsh No a No Yes +Shai Ben-David

Yes h No No +

MichaelI.Jordan

Yes i Yes No -8

Page 9: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Name Namehaspunctuation?

Secondcharacteroffirstname

Lengthoffirst

name>5?

Samefirstletterintwonames?

Label

ClaireCardie No l Yes Yes -PeterBartlett No e No No +EricBaum No r No No +Haym Hirsh No a No Yes +Shai Ben-David

Yes h No No +

MichaelI.Jordan

Yes i Yes No -

Representingdata

Datacanberepresentedasabigtable,withcolumnsdenotingdifferentattributes

Withthesefourattributes,howmanyuniquerowsarepossible?2· 26· 26· 2=2704

Ifthereare100attributes,allbinary,howmanyuniquerowsarepossible?2100

9

Page 10: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Name Namehaspunctuation?

Secondcharacteroffirstname

Lengthoffirst

name>5?

Samefirstletterintwonames?

Label

ClaireCardie No l Yes Yes -PeterBartlett No e No No +EricBaum No r No No +Haym Hirsh No a No Yes +Shai Ben-David

Yes h No No +

MichaelI.Jordan

Yes i Yes No -

Representingdata

Datacanberepresentedasabigtable,withcolumnsdenotingdifferentattributes

Withthesefourattributes,howmanyuniquerowsarepossible?2×26×2×2 = 208

Ifthereare100attributes,allbinary,howmanyuniquerowsarepossible?2100

10

Page 11: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Name Namehaspunctuation?

Secondcharacteroffirstname

Lengthoffirst

name>5?

Samefirstletterintwonames?

Label

ClaireCardie No l Yes Yes -PeterBartlett No e No No +EricBaum No r No No +Haym Hirsh No a No Yes +Shai Ben-David

Yes h No No +

MichaelI.Jordan

Yes i Yes No -

Representingdata

Datacanberepresentedasabigtable,withcolumnsdenotingdifferentattributes

Withthesefourattributes,howmanyuniquerowsarepossible?2×26×2×2 = 208

Ifthereare100attributes,allbinary,howmanyuniquerowsarepossible?2100

11

Page 12: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Name Namehaspunctuation?

Secondcharacteroffirstname

Lengthoffirst

name>5?

Samefirstletterintwonames?

Label

ClaireCardie No l Yes Yes -PeterBartlett No e No No +EricBaum No r No No +Haym Hirsh No a No Yes +Shai Ben-David

Yes h No No +

MichaelI.Jordan

Yes i Yes No -

Representingdata

Datacanberepresentedasabigtable,withcolumnsdenotingdifferentattributes

Withthesefourattributes,howmanyuniquerowsarepossible?2×26×2×2 = 208

Ifthereare100attributes,allbinary,howmanyuniquerowsarepossible?(100times)2×2×2×⋯×2 = 2)**

12

Page 13: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Name Namehaspunctuation?

Secondcharacteroffirstname

Lengthoffirst

name>5?

Samefirstletterintwonames?

Label

ClaireCardie No l Yes Yes -PeterBartlett No e No No +EricBaum No r No No +Haym Hirsh No a No Yes +Shai Ben-David

Yes h No No +

MichaelI.Jordan

Yes i Yes No -

Representingdata

Datacanberepresentedasabigtable,withcolumnsdenotingdifferentattributes

Withthesefourattributes,howmanyuniquerowsarepossible?2×26×2×2 = 208

Ifthereare100attributes,allbinary,howmanyuniquerowsarepossible?(100times)2×2×2×⋯×2 = 2)**

13

Ifwewantedtostoreallpossiblerows,thisnumberistoolarge.

Weneedtofigureouthowtorepresentdatainabetter,moreefficientway

Page 14: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Whataredecisiontrees?

Ahierarchicaldatastructurethatrepresentsdatausingadivide-and-conquerstrategy

Canbeusedashypothesisclassfornon-parametricclassificationorregression

Generalidea:Givenacollectionofexamples,learnadecisiontreethatrepresentsit

14

Page 15: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Whataredecisiontrees?

• Decisiontreesareafamilyofclassifiersforinstancesthatarerepresentedbycollectionsofattributes(i.e.features)

• Nodes aretestsforfeaturevalues

• Thereisonebranch foreveryvaluethatthefeaturecantake

• Leaves ofthetreespecifytheclasslabels

15

Page 16: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Label=ALabel=C Label=B

16

Page 17: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

17

Beforebuildingadecisiontree:

Whatisthelabelforaredtriangle?Andwhy?

Label=ALabel=C Label=B

Page 18: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?

18

Label=ALabel=C Label=B

Page 19: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape

19

Label=ALabel=C Label=B

Page 20: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape Color?

20

Label=ALabel=C Label=B

Page 21: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape Color?

Blue Red Green

21

Label=ALabel=C Label=B

Page 22: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape Color?

Blue Red Green

B

22

Label=ALabel=C Label=B

Page 23: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape Color?

Blue Red Green

B

squaretriangle circle

CAB

Shape?

23

Label=ALabel=C Label=B

Page 24: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape Color?

Shape?circlesquare

AB

Blue Red Green

B

squaretriangle circle

CAB

Shape?

24

Label=ALabel=C Label=B

Page 25: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Label=ALabel=C Label=B

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape Color?

Shape?circlesquare

AB

Blue Red Green

squaretriangle circle

CAB

Shape?

1. Howdowelearn adecisiontree?Comingupsoon…

2. Howtouseadecisiontreeforprediction?• Whatisthelabelforared triangle?

• Justfollowapathfromtheroottoaleaf

• Whataboutagreentriangle?

25

B

Page 26: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Label=ALabel=C Label=B

Let’sbuildadecisiontreeforclassifyingshapes

Whataresomeattributesoftheexamples?Color,Shape Color?

Shape?circlesquare

AB

Blue Red Green

squaretriangle circle

CAB

Shape?

1. Howdowelearn adecisiontree?Comingupsoon…

2. Howtouseadecisiontreeforprediction?• Whatisthelabelforared triangle?

• Justfollowapathfromtheroottoaleaf

• Whataboutagreen triangle?

26

B

Page 27: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

ExpressivityofDecisiontrees

WhatBooleanfunctionscandecisiontreesrepresent?– AnyBooleanfunction

(Color=blue ANDShape=triangle) Label=B)AND(Color=blue ANDShape=square) Label=A) AND(Color=blueANDShape=circle) Label=C)AND….

Everypathfromthetreetoarootisarule

Thefulltreeisequivalenttotheconjunctionofalltherules

AnyBooleanfunctioncanberepresentedasadecisiontree.

27

Page 28: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

ExpressivityofDecisiontrees

WhatBooleanfunctionscandecisiontreesrepresent?– AnyBooleanfunction

(Color=blue ANDShape=triangle) Label=B)AND(Color=blue ANDShape=square) Label=A) AND(Color=blueANDShape=circle) Label=C)AND….

AnyBooleanfunctioncanberepresentedasadecisiontree.

28

Everypathfromthetreetoarootisarule

Thefulltreeisequivalenttotheconjunctionofalltherules

Page 29: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

ExpressivityofDecisiontrees

WhatBooleanfunctionscandecisiontreesrepresent?– AnyBooleanfunction

(Color=blue ANDShape=triangle) Label=B)AND(Color=blue ANDShape=square) Label=A) AND(Color=blueANDShape=circle) Label=C)AND….

AnyBooleanfunctioncanberepresentedasadecisiontree.

29

Everypathfromthetreetoarootisarule

Thefulltreeisequivalenttotheconjunctionofalltherules

Page 30: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

DecisionTrees

• Outputsarediscretecategories

• Butrealvaluedoutputsarealsopossible(regressiontrees)

• Wellstudiedmethodsforhandlingnoisydata(noiseinthelabelorinthefeatures)andforhandlingmissingattributes– Pruningtreeshelpswithnoise– Moreonthislater…

30

Page 31: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Numericattributesanddecisionboundaries

• Wehaveseeninstancesrepresentedasattribute-valuepairs(color=blue,secondletter=e,etc.)– Valueshavebeencategorical

• Howdowedealwithnumericfeaturevalues?(eg length=?)– Discretizethemorusethresholdsonthenumericvalues– Thisexampledividesthefeaturespaceintoaxisparallelrectangles

31

Page 32: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Numericattributesanddecisionboundaries

• Wehaveseeninstancesrepresentedasattribute-valuepairs(color=blue,secondletter=e,etc.)– Valueshavebeencategorical

• Howdowedealwithnumericfeaturevalues?(eg length=?)– Discretizethemorusethresholdsonthenumericvalues– Thisexampledividesthefeaturespaceintoaxisparallelrectangles

32

Page 33: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Numericattributesanddecisionboundaries

• Wehaveseeninstancesrepresentedasattribute-valuepairs(color=blue,secondletter=e,etc.)– Valueshavebeencategorical

• Howdowedealwithnumericfeaturevalues?(eg length=?)– Discretizethemorusethresholdsonthenumericvalues– Thisexampledividesthefeaturespaceintoaxisparallelrectangles

13X

7

5

Y

- +

+ +

+ +

-

-

+

33

Page 34: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Numericattributesanddecisionboundaries

• Wehaveseeninstancesrepresentedasattribute-valuepairs(color=blue,secondletter=e,etc.)– Valueshavebeencategorical

• Howdowedealwithnumericfeaturevalues?(eg length=?)– Discretizethemorusethresholdsonthenumericvalues– Thisexampledividesthefeaturespaceintoaxisparallelrectangles

13X

7

5

Y

- +

+ +

+ +

-

-

+

34

X<3

Y<5

no yes

Y>7yesno

X<1

no yes

- + +

+ -yesno

Page 35: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Numericattributesanddecisionboundaries

• Wehaveseeninstancesrepresentedasattribute-valuepairs(color=blue,secondletter=e,etc.)– Valueshavebeencategorical

• Howdowedealwithnumericfeaturevalues?(eg length=?)– Discretizethemorusethresholdsonthenumericvalues– Thisexampledividesthefeaturespaceintoaxisparallelrectangles

13X

7

5

Y

- +

+ +

+ +

-

-

+Decisionboundariescanbenon-linear

35

X<3

Y<5

no yes

Y>7yesno

X<1

no yes

- + +

+ -yesno

Page 36: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Summary:Decisiontrees

• DecisiontreescanrepresentanyBooleanfunction• Awaytorepresentlotofdata• Anaturalrepresentation(think20questions)• Predicting withadecisiontreeiseasy

• Clearly,givenadataset,therearemanydecisiontreesthatcanrepresentit.[Exercise:Why?]

• Learningagoodrepresentationfromdataisthenextquestion

36

Page 37: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Summary:Decisiontrees

• DecisiontreescanrepresentanyBooleanfunction• Awaytorepresentlotofdata• Anaturalrepresentation(think20questions)• Predicting withadecisiontreeiseasy

• Clearly,givenadataset,therearemanydecisiontreesthatcanrepresentit.[Exercise:Why?]

• Learningagoodrepresentationfromdataisthenextquestion

37

Page 38: Decision Trees: Representation - svivek · Summary: Decision trees • Decision trees can represent any Boolean function • A way to represent lot of data • A natural representation

Exercises

1. WritedownthedecisiontreefortheshapesdataiftherootnodewasShape insteadofColor.

2. Willthetwotreesmakethesamepredictionsforunseenshapes/colorcombinations?

3. ShowthatmultiplestructurallydifferentdecisiontreescanrepresentthesameBooleanfunctionoftwoormorevariables.

38

Label=ALabel=C Label=B

(thinkaboutwhatitmeansfortwotreestobestructurallydifferent)