Chapter 14. Classification and Regression Trees (benedikt/courses/ch14_2005.pdf)


Post on 25-Jul-2020


Chapter 14. Classification and Regression Trees

Chapter 14: CART

Decision Trees

Introduction

ANFIS training can be split into two parts:
• Structure identification
• Parameter identification

Introduction

Structure identification in fuzzy modeling includes the following basic items:

• Select the relevant input variables
• Select the initial structure for ANFIS:
  - Input partition
  - Number of membership functions for each input
  - Number of IF-THEN rules
  - Premise (antecedent) part of the fuzzy rules
  - Consequent part of the fuzzy rules
• Initialize the parameters of the membership functions

Introduction

Consider tree partitioning in fuzzy modeling:
• The CART (Classification and Regression Tree) algorithm
• CART offers a fast method for structure identification
• CART is less affected by the curse of dimensionality than grid partitioning, because the number of regions (rules) need not grow exponentially with the number of inputs
• ANFIS based on CART is powerful in training and application because weight normalization is built into the procedure

The CART Algorithm

The CART algorithm is a powerful nonparametric algorithm with the following characteristics:

• Based on a simple idea
• Computationally powerful
• Can be used in both classification and regression problems
• Based on a solid statistical foundation
• Suitable for high-dimensional data
• Can identify important variables

The CART Algorithm

Example:

IF x < a AND y < b THEN z = f1

IF x < a AND y > b THEN z = f2

IF x > a AND y < c THEN z = f3

IF x > a AND y > c THEN z = f4
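
The four rules can be read as a two-level binary tree: the root tests x against a, and each branch then tests y against its own threshold (b or c). A minimal Python sketch, assuming constant outputs f1 to f4 and illustrative threshold values that do not come from the slides. The rules leave the boundary cases (x = a, y = b, y = c) unspecified; sending them to the ">" branch here is an assumption.

```python
def crisp_tree(x, y, a=0.0, b=1.0, c=-1.0, f=(1.0, 2.0, 3.0, 4.0)):
    """Evaluate the four crisp rules; thresholds and outputs are illustrative."""
    f1, f2, f3, f4 = f
    if x < a:                      # left branch of the root
        return f1 if y < b else f2
    return f3 if y < c else f4     # right branch of the root


print(crisp_tree(-1.0, 0.0))   # x < a and y < b, so rule 1 fires
print(crisp_tree(2.0, 5.0))    # x > a and y > c, so rule 4 fires
```

Exactly one rule fires for any input, which is what makes the decision boundaries hard.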

See next slide

Decision Trees

Decision Trees

Decision tree:
• Splits the input space into mutually exclusive regions
• Gives every region a label, value or operation in order to characterize its data points
• Is easy to use in classification
• Is structured by internal nodes and terminal (leaf) nodes which are connected by branches
• Binary trees are the simplest
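
The structure described above can be sketched as a small data type. This is an illustrative Python sketch, not code from the course: the `Node` class, its field names, and the example tree are all hypothetical. Internal nodes hold a test on one input, leaves hold the label or value of their region.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    """Internal node (feature/threshold set) or leaf (value set)."""
    feature: Optional[int] = None      # index of the input tested at this node
    threshold: Optional[float] = None  # go left if x[feature] < threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Union[str, float, None] = None  # label or value stored at a leaf

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def predict(node: Node, x):
    """Follow branches from the root until a leaf is reached."""
    while not node.is_leaf():
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value


def count_leaves(node: Node) -> int:
    """Each leaf corresponds to one exclusive region of the input space."""
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


# Hypothetical two-input classification tree with three regions.
tree = Node(feature=0, threshold=0.5,
            left=Node(value="A"),
            right=Node(feature=1, threshold=2.0,
                       left=Node(value="B"), right=Node(value="A")))

print(predict(tree, [0.2, 9.0]), predict(tree, [0.9, 1.0]), count_leaves(tree))
```

Storing a number instead of a label in `value` turns the same structure into a regression tree.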

Decision Trees

Groups of decision trees:
• Classification trees
• Regression trees

The CART Algorithm

Basic idea of CART:
• First, a tree is grown based on the training data
• Second, the tree is pruned to balance the error cost against the size of the tree (minimum cost-complexity principle)
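
The slides name the minimum cost-complexity principle without stating it. In the standard CART formulation it is usually written as below; the symbols are assumptions chosen to match the E(t) notation used for impurity, not notation taken from the slides.

```latex
E_\alpha(T) = E(T) + \alpha \, |\tilde{T}|
```

Here E(T) is the error of tree T on the training data, |T̃| is the number of terminal nodes of T, and α ≥ 0 is the complexity parameter. For a given α, pruning keeps the subtree that minimizes E_α(T): α = 0 favors the fully grown tree, and larger values of α favor smaller trees.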

The CART Algorithm

Classification trees use impurity functions, E(t):

• Entropy function
• Gini index
• Impurity functions take their minimum value (zero) when all the data belong to one class and their maximum value when the data are uniformly distributed over all classes.
• The algorithm attempts to select the partition which gives the maximum decrease in impurity.
• For binary trees, the decrease in impurity for a split s of node t can be written as ∆E(s,t) = E(t) - p_l E(t_l) - p_r E(t_r), where p_l and p_r are the fractions of the data in node t that go to the left child t_l and the right child t_r.
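
A short Python sketch of the two impurity functions and the impurity decrease for a binary split. The function names are illustrative, and the natural logarithm in the entropy is an assumption (the slides do not state the base).

```python
import math


def entropy(p):
    """Entropy impurity (natural log) of a class-probability vector."""
    return sum(-q * math.log(q) for q in p if q > 0)


def gini(p):
    """Gini index of a class-probability vector."""
    return 1.0 - sum(q * q for q in p)


def impurity_decrease(parent, left, right, p_left, impurity=entropy):
    """Delta E(s, t) = E(t) - p_l E(t_l) - p_r E(t_r) for a binary split."""
    p_right = 1.0 - p_left
    return impurity(parent) - p_left * impurity(left) - p_right * impurity(right)


print(entropy([1.0, 0.0]), gini([1.0, 0.0]))   # pure node: both are zero
print(entropy([0.5, 0.5]), gini([0.5, 0.5]))   # uniform node: both are maximal
# A split that separates the two classes perfectly removes all impurity.
print(impurity_decrease([0.5, 0.5], [1.0, 0.0], [0.0, 1.0], p_left=0.5))
```

Both functions agree on the extremes (zero for a pure node, maximum for a uniform one); they differ only in how they score the cases in between.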

CART: Impurity Functions

CART: Node Partition

The entropy function is used. Impurity of the node:

E(t) = 0.6730

Decrease in impurity for each candidate split s1 to s8:

∆E(s1,t) = 0.2231
∆E(s2,t) = 0.0138
∆E(s3,t) = 0.2911
∆E(s4,t) = 0.1185
∆E(s5,t) = 0.2231
∆E(s6,t) = 0.0000
∆E(s7,t) = 0.2911
∆E(s8,t) = 0.1185
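
The figure showing the data points and the candidate splits did not survive in this transcript. The listed numbers are, however, reproduced by the natural-log entropy on a node that holds five points, three of one class and two of the other. The Python sketch below checks this; the class counts per child are a reconstruction that matches the values, not data taken from the slide.

```python
import math


def entropy(counts):
    """Natural-log entropy of a node given its per-class point counts."""
    n = sum(counts)
    return sum(-(k / n) * math.log(k / n) for k in counts if k > 0)


def decrease(left, right):
    """Impurity decrease of a split, given per-class counts in each child."""
    parent = [l + r for l, r in zip(left, right)]
    n = sum(parent)
    return (entropy(parent)
            - sum(left) / n * entropy(left)
            - sum(right) / n * entropy(right))


# Assumed node: 3 points of class A and 2 of class B (counts are [A, B]).
print(f"E(t) = {entropy([3, 2]):.4f}")
# Assumed child counts (left, right), one per distinct value in the list.
for left, right in [([0, 1], [3, 1]), ([1, 1], [2, 1]),
                    ([2, 0], [1, 2]), ([1, 0], [2, 2]), ([0, 0], [3, 2])]:
    print(left, right, f"{decrease(left, right):.4f}")
```

Under these assumptions the largest decrease, 0.2911, belongs to the splits that isolate two points of the majority class, so s3 or s7 would be chosen. The zero entry corresponds to a split that leaves all points on one side.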

CART: Tree Growth

CART: Tree Growth

CART: Error Measure

CART: Input/Output Relations when Outputs are Constant

CART: Input/Output Relations when Outputs are Linear

Structure Identification for ANFIS

CART-ANFIS processing is based on the following:

• CART is used to find an initial set of hard (crisp) rules (structure identification)
• The premise (antecedent) part of the rules is made fuzzy
• ANFIS is used to fine-tune the parameters (parameter identification)
• Normalization of weights is implicit in the above procedure.

Structure Identification for ANFIS

The CART algorithm makes hard decision boundaries.

Example:

IF x < a AND y < b THEN z = f1

IF x < a AND y > b THEN z = f2

IF x > a AND y < c THEN z = f3

IF x > a AND y > c THEN z = f4

We can use fuzzy membership functions to make the premise (antecedent) parts of these rules fuzzy.
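
One way to fuzzify the rules, sketched in Python. The slides mention two types of membership functions but the figures are lost, so the sigmoidal membership function, the product used for AND, the constant outputs, and all parameter values below are assumptions. The grade of "v < threshold" is taken as the complement of "v > threshold".

```python
import math


def mu_greater(v, threshold, slope):
    """Sigmoidal membership grade for the fuzzy statement 'v > threshold'."""
    return 1.0 / (1.0 + math.exp(-slope * (v - threshold)))


def fuzzy_tree(x, y, a=0.0, b=1.0, c=-1.0, slope=4.0, f=(1.0, 2.0, 3.0, 4.0)):
    """Return (output, firing strengths) of the four fuzzified rules."""
    x_gt_a = mu_greater(x, a, slope)
    y_gt_b = mu_greater(y, b, slope)
    y_gt_c = mu_greater(y, c, slope)
    # 'v < threshold' is the complement of 'v > threshold'; AND is a product.
    w = [(1 - x_gt_a) * (1 - y_gt_b),   # x < a AND y < b
         (1 - x_gt_a) * y_gt_b,         # x < a AND y > b
         x_gt_a * (1 - y_gt_c),         # x > a AND y < c
         x_gt_a * y_gt_c]               # x > a AND y > c
    z = sum(wi * fi for wi, fi in zip(w, f))
    return z, w


z, w = fuzzy_tree(0.1, 0.9)
print(z, sum(w))   # firing strengths sum to 1 without explicit normalization
```

With complementary membership grades the four firing strengths always add up to one, which illustrates why no separate weight normalization step is needed. A large `slope` recovers the hard boundaries of the crisp rules.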

CART: Two types of membership functions

CART: Fuzzy input/output relations when the outputs are constants

CART: Fuzzy input/output relations when the outputs are linear