decision tree learning
DESCRIPTION
Decision Tree Learning. Presented by Ping Zhang, Nov. 26th, 2007. Introduction. Decision tree learning is one of the most widely used and practical methods for inductive inference.
TRANSCRIPT
![Page 1: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/1.jpg)
Decision Tree Learning
Presented by Ping Zhang
Nov. 26th, 2007
![Page 2: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/2.jpg)
Introduction
Decision tree learning is one of the most widely used and practical methods for inductive inference.
Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree.
Decision tree learning is robust to noisy data and capable of learning disjunctive expressions.
![Page 3: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/3.jpg)
Decision tree representation
Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance.
Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of this attribute.
![Page 4: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/4.jpg)
Decision Tree for PlayTennis
![Page 5: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/5.jpg)
When to Consider Decision Trees Instances describable by attribute-value
pairs Target function is discrete valued Disjunctive hypothesis may be required Possibly noisy training data
Examples (Classification problems): Equipment or medical diagnosis Credit risk analysis
![Page 6: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/6.jpg)
Top-Down Induction of Decision Trees
![Page 7: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/7.jpg)
Entropy (1)
![Page 8: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/8.jpg)
Entropy (2)
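The slide content here is only available as an image. As a reminder, the standard definition of entropy for a Boolean classification is given below, where $p_{+}$ and $p_{-}$ are the proportions of positive and negative examples in $S$. The worked case, 9 positive and 5 negative examples, is assumed from the usual PlayTennis data set.

```latex
\begin{align*}
Entropy(S) &= -p_{+}\log_2 p_{+} - p_{-}\log_2 p_{-} \\
Entropy([9+, 5-]) &= -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.940
\end{align*}
```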
![Page 9: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/9.jpg)
Information Gain
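Information gain is the expected reduction in entropy from partitioning the examples on an attribute. A minimal sketch follows. The function names are illustrative, and the Wind and PlayTennis columns are assumed to match the usual 14-example PlayTennis table, which appears in this deck only as an image.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Assumed data: the Wind and PlayTennis columns for examples D1..D14.
wind = ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
        "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

print(round(entropy(play), 3))                 # 0.94
print(round(information_gain(wind, play), 3))  # 0.048
```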
![Page 10: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/10.jpg)
Training Examples
![Page 11: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/11.jpg)
Selecting the Next Attribute
![Page 12: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/12.jpg)
Which attribute should be tested here?
![Page 13: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/13.jpg)
Hypothesis Space Search by ID3
Hypothesis space is complete: the target function is surely in there
Outputs only a single hypothesis
No backtracking, so the search can get stuck in local minima
Statistically based search choices: robust to noisy data
Inductive bias: “prefer shortest tree”
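The greedy search with no backtracking can be sketched as a short recursive function. This is a simplified illustration and not a full ID3 implementation: the helper names are hypothetical, and the data is assumed to be the usual 14-example PlayTennis table.

```python
from collections import Counter
from math import log2

ATTRS = ["Outlook", "Temperature", "Humidity", "Wind"]
TABLE = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
"""
DATA = [dict(zip(ATTRS + ["PlayTennis"], line.split()))
        for line in TABLE.strip().splitlines()]


def entropy(examples, target):
    counts = Counter(e[target] for e in examples)
    return -sum(n / len(examples) * log2(n / len(examples)) for n in counts.values())


def gain(examples, attr, target):
    remainder = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e for e in examples if e[attr] == value]
        remainder += len(subset) / len(examples) * entropy(subset, target)
    return entropy(examples, target) - remainder


def id3(examples, attrs, target):
    labels = [e[target] for e in examples]
    # Stop when the node is pure or no attributes remain; return the majority label.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Greedy choice: the attribute is fixed here and never reconsidered.
    best = max(attrs, key=lambda a: gain(examples, a, target))
    rest = [a for a in attrs if a != best]
    return (best, {
        value: id3([e for e in examples if e[best] == value], rest, target)
        for value in {e[best] for e in examples}
    })


tree = id3(DATA, ATTRS, "PlayTennis")
print(tree)
```

On this data the sketch places Outlook at the root, then tests Humidity under Sunny and Wind under Rain.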
![Page 14: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/14.jpg)
From ID3 to C4.5
C4.5 made a number of improvements to ID3. Some of these are:
Handling both continuous and discrete attributes
Handling training data with missing attribute values
Handling attributes with differing costs
Pruning trees after creation
![Page 15: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/15.jpg)
Overfitting in Decision Trees
![Page 16: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/16.jpg)
Reduced-Error Pruning
![Page 17: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/17.jpg)
Rule Post-Pruning
Convert the tree to an equivalent set of rules
Prune each rule by removing any preconditions whose removal improves its estimated accuracy
Sort the pruned rules by their estimated accuracy, and consider them in this sequence when classifying subsequent instances
Perhaps the most frequently used method
![Page 18: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/18.jpg)
Continuous Valued Attributes
Create a discrete attribute that tests the continuous attribute against a threshold
There are two candidate thresholds.
The information gain can be computed for each of the candidate attributes, Temperature>54 and Temperature>85, and the best can be selected (Temperature>54).
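A sketch of how the two thresholds arise follows. It assumes the six-example Temperature table from the standard textbook treatment, which is not reproduced in this transcript, and the function names are illustrative. Candidate thresholds are taken midway between adjacent sorted values whose class labels differ.

```python
from collections import Counter
from math import log2

# Assumed example data: Temperature values and the PlayTennis label for each.
temperature = [40, 48, 60, 72, 80, 90]
play = ["No", "No", "Yes", "Yes", "Yes", "No"]


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def candidate_thresholds(values, labels):
    """Midpoints between adjacent sorted values whose labels differ."""
    pairs = sorted(zip(values, labels))
    return [(a[0] + b[0]) / 2 for a, b in zip(pairs, pairs[1:]) if a[1] != b[1]]


def gain_for_threshold(values, labels, threshold):
    """Information gain of the Boolean test value > threshold."""
    above = [lab for v, lab in zip(values, labels) if v > threshold]
    below = [lab for v, lab in zip(values, labels) if v <= threshold]
    n = len(labels)
    return (entropy(labels)
            - len(above) / n * entropy(above)
            - len(below) / n * entropy(below))


thresholds = candidate_thresholds(temperature, play)
best = max(thresholds, key=lambda t: gain_for_threshold(temperature, play, t))
print(thresholds, best)
```

On this data the candidates are 54 and 85, with gains of about 0.459 and 0.191, so Temperature>54 is selected.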
![Page 19: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/19.jpg)
Attributes with Many Values
Problem: if an attribute has many values, Gain will tend to select it
Imagine using the attribute Date. It would have the highest information gain of any of the attributes, but the resulting decision tree would not be useful.
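A short derivation shows why. Assume a date-like attribute $A$ that takes a distinct value on each of the 14 PlayTennis examples (the data set size and class counts are assumed from the usual example). Every partition $S_v$ then holds a single example and has zero entropy:

```latex
\begin{align*}
Gain(S, A) &= Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\, Entropy(S_v) \\
           &= Entropy(S) - \sum_{v} \frac{1}{14} \cdot 0 \\
           &= Entropy(S) \approx 0.940
\end{align*}
```

No attribute can score higher than $Entropy(S)$, so such an attribute always wins, yet it says nothing about instances with unseen values.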
![Page 20: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/20.jpg)
Missing Attribute Values
![Page 21: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/21.jpg)
Attributes with Costs
Consider medical diagnosis: BloodTest has a cost of 150 dollars
How to learn a consistent tree with low expected cost?
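One family of answers replaces Gain with a cost-sensitive selection measure. Two measures from the literature, attributed to Tan and Schlimmer and to Nunez respectively, are given below. Whether the original slide showed these is an assumption. Here $w \in [0, 1]$ sets the relative importance of cost.

```latex
\[
\frac{Gain^2(S, A)}{Cost(A)}
\qquad\text{and}\qquad
\frac{2^{Gain(S, A)} - 1}{(Cost(A) + 1)^{w}}
\]
```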
![Page 22: Decision Tree Learning](https://reader036.vdocument.in/reader036/viewer/2022062521/5681679e550346895ddce649/html5/thumbnails/22.jpg)
Conclusion
Decision tree learning:
Is simple to understand and interpret
Requires little data preparation
Is able to handle both numerical and categorical data
Uses a white-box model
Makes it possible to validate a model using statistical tests
Is robust and performs well with large data sets in a short time