
Chapter 14. Classification and Regression Trees


Decision Trees


Introduction

ANFIS training can be split into two parts:
• Structure identification
• Parameter identification


Introduction

Structure identification in fuzzy modeling includes the following basic items:
• Select the relevant input variables
• Select the initial structure for ANFIS:
  - Input partition
  - Number of membership functions for each input
  - Number of IF-THEN rules
  - Prerequisite part of the fuzzy rules
  - Consequent part of the fuzzy rules
• Initialization of the membership function parameters


Introduction

Consider tree partitioning in fuzzy modeling:
• The CART (Classification and Regression Tree) algorithm
• CART offers a fast method for structure identification
• CART does not suffer from the curse of dimensionality
• ANFIS based on CART is powerful in training and application because weight normalization is included in the procedure


The CART Algorithm

The CART algorithm is a powerful, nonparametric method with the following characteristics:
• Based on a simple idea
• Computationally powerful
• Can be used for both classification and regression problems
• Based on a solid statistical foundation
• Suitable for high-dimensional data
• Can identify important variables


The CART Algorithm

Example:
IF x < a AND y < b THEN z = f1
IF x < a AND y > b THEN z = f2
IF x > a AND y < c THEN z = f3
IF x > a AND y > c THEN z = f4

See next slide
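The four rules above can also be read as a crisp piecewise function. The minimal Python sketch below shows this reading; the thresholds a, b, c and the outputs f1..f4 are placeholders for illustration, not values from the slides.

# Minimal sketch: the four crisp CART rules above as a piecewise function.
# The thresholds a, b, c and the leaf outputs f1..f4 are placeholders.

def cart_rules(x, y, a, b, c, f1, f2, f3, f4):
    if x < a:
        return f1 if y < b else f2   # left branch of the root split on x
    else:
        return f3 if y < c else f4   # right branch uses its own threshold c on y

# Example call with constant leaf outputs (all values illustrative)
print(cart_rules(0.2, 0.9, a=0.5, b=0.5, c=0.3, f1=1.0, f2=2.0, f3=3.0, f4=4.0))  # -> 2.0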


Decision Trees


Decision Trees

A decision tree:
• Splits the input space into mutually exclusive regions
• Gives every region a label, a value or an operation in order to characterize its data points
• Is easy to use in classification
• Is structured by inner nodes and outer (leaf) nodes, which are connected by branches
• Binary trees are the simplest form
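A minimal sketch of such a binary tree as a data structure, assuming inner nodes test one input variable against a threshold and leaves store the label or value of their region; all names here are illustrative.

# Sketch of a binary decision tree: inner nodes split on one input variable,
# leaves (outer nodes) carry the label or value for their region.
from dataclasses import dataclass
from typing import Optional, Any

@dataclass
class Node:
    feature: Optional[int] = None      # index of the input variable to test (inner node)
    threshold: Optional[float] = None  # split point: go left if x[feature] < threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Any = None                  # label/value stored in a leaf

def predict(node: Node, x):
    """Walk from the root to a leaf and return the leaf's label or value."""
    while node.left is not None:       # inner nodes always have two children here
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value

# Two-level tree corresponding to the rule example earlier, with a = b = c = 0.5 for illustration
tree = Node(feature=0, threshold=0.5,
            left=Node(feature=1, threshold=0.5, left=Node(value="f1"), right=Node(value="f2")),
            right=Node(feature=1, threshold=0.5, left=Node(value="f3"), right=Node(value="f4")))
print(predict(tree, [0.7, 0.2]))  # -> "f3"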


Decision Trees

Groups of decision trees:
• Classification trees
• Regression trees


The CART Algorithm

Basic idea of CART:
• First, a tree is constructed based on the training data
• Second, the tree is pruned to minimize cost and computation (the minimum cost-complexity principle; see the sketch below)
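A small sketch of the cost-complexity measure behind the pruning step, R_alpha(T) = R(T) + alpha * |leaves(T)|. The numbers are made up for illustration, and the weakest-link pruning search itself is not shown.

# Sketch of the minimum cost-complexity criterion used in the pruning step:
#   R_alpha(T) = R(T) + alpha * |leaves(T)|
# where R(T) is the tree's resubstitution error and |leaves(T)| its number of
# terminal nodes. alpha >= 0 trades training accuracy against tree size.
# (Illustrative only; the weakest-link pruning loop is omitted.)

def cost_complexity(resubstitution_error: float, n_leaves: int, alpha: float) -> float:
    return resubstitution_error + alpha * n_leaves

# A small tree with slightly higher error can beat a large tree once alpha grows:
big_tree   = cost_complexity(resubstitution_error=0.02, n_leaves=40, alpha=0.01)  # 0.42
small_tree = cost_complexity(resubstitution_error=0.10, n_leaves=5,  alpha=0.01)  # 0.15
print(big_tree, small_tree)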


The CART Algorithm

Classification trees use impurity functions E(t):
• Entropy function
• Gini index
• Impurity functions take their minimum value (zero) when the data in a node all belong to one class, and their maximum value when the data are uniformly distributed over all classes.
• The algorithm selects the partition that gives the maximum decrease in impurity.
• For binary trees the decrease can be written as ∆E(s,t) = E(t) − pL·E(tL) − pR·E(tR), where pL and pR are the fractions of the data in node t sent by split s to the left child tL and the right child tR (see the sketch below).
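A short sketch of the two impurity functions and of the impurity decrease for a binary split. The natural logarithm is assumed for the entropy, and the label lists in the example are illustrative.

# Sketch of the two impurity functions and the impurity decrease for a binary split,
#   dE(s,t) = E(t) - p_L * E(t_L) - p_R * E(t_R),
# computed from class labels. Natural logarithm assumed for the entropy.
from collections import Counter
from math import log

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def impurity_decrease(parent, left, right, impurity=entropy):
    p_left = len(left) / len(parent)
    p_right = len(right) / len(parent)
    return impurity(parent) - p_left * impurity(left) - p_right * impurity(right)

# A pure node has zero impurity; two equally likely classes give the maximum.
print(entropy(list("AABB")))                                      # ln(2) ≈ 0.6931
print(gini(list("AABB")))                                         # 0.5
print(impurity_decrease(list("AABB"), list("AA"), list("BB")))    # ≈ 0.6931: both children become pure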


CART: Impurity Functions


CART: Node Partition

The entropy function is used:

E(t) = 0.6730

E(s1,t) = 0.2231
E(s2,t) = 0.0138
E(s3,t) = 0.2911
E(s4,t) = 0.1185
E(s5,t) = 0.2231
E(s6,t) = 0.0000
E(s7,t) = 0.2911
E(s8,t) = 0.1185
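The data and the candidate splits s1..s8 behind these values are not reproduced in this transcript, so the following is a generic sketch of the node-partition step: every candidate split is scored by its impurity decrease and the best one is kept. It reuses the entropy and impurity_decrease helpers from the previous sketch, and the small data set is illustrative.

# Generic sketch of the node-partition step: each candidate split (a feature index
# and a threshold midway between consecutive sorted values) is scored by its
# impurity decrease, and the split with the maximum decrease is selected.
# Run after the previous sketch, which defines entropy and impurity_decrease.

def best_split(X, y, impurity=entropy):
    best = (None, None, 0.0)                      # (feature, threshold, decrease)
    for j in range(len(X[0])):                    # every input variable
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):    # candidate thresholds between values
            thr = (lo + hi) / 2.0
            left  = [yi for row, yi in zip(X, y) if row[j] <  thr]
            right = [yi for row, yi in zip(X, y) if row[j] >= thr]
            gain = impurity_decrease(y, left, right, impurity)
            if gain > best[2]:
                best = (j, thr, gain)
    return best

X = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.2], [0.9, 0.1]]
y = ["A", "A", "B", "B"]
print(best_split(X, y))   # (0, 0.5, 0.693...): splitting on the first input at 0.5 separates the classes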


CART: Tree Growth


CART: Tree Growth


CART: Error Measure


CART: Input/Output Relations when Outputs are Constant


CART: Input/Output Relations when Outputs are Linear


Structure Identification for ANFIS

CART-ANFIS processing is based on the following:
• CART is used to find the first solution as hard (crisp) rules (structure identification; see the sketch below)
• The prerequisite part of the rules is made fuzzy
• ANFIS is used to fine-tune the parameters (parameter identification)
• Normalization of the weights is implicit in the above procedure.
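As an illustration of the first step, the sketch below fits a CART-style regression tree with scikit-learn (an assumption of this sketch, not part of the original material) and reads off the split variables, thresholds and leaf values that would seed the rule base.

# Sketch of the structure-identification step: fit a CART-style regression tree and
# read off its split variables and thresholds, which then seed the fuzzy rule base.
# scikit-learn and the synthetic data are used here purely for illustration.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))                     # two inputs, as in the rule example
y = np.where(X[:, 0] < 0.5,
             np.where(X[:, 1] < 0.5, 1.0, 2.0),
             np.where(X[:, 1] < 0.3, 3.0, 4.0))    # four constant-output regions

cart = DecisionTreeRegressor(max_leaf_nodes=4).fit(X, y)
t = cart.tree_
for node in range(t.node_count):
    if t.children_left[node] != -1:                # internal node: a crisp IF-part threshold
        print(f"node {node}: split on x{t.feature[node] + 1} at {t.threshold[node]:.3f}")
    else:                                          # leaf: initial THEN-part (constant) value
        print(f"node {node}: leaf value {t.value[node][0][0]:.3f}")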


Structural Identification for ANFIS

The CART algorithm makes hard decision boundaries.

Example:

IF x < a AND y < b THEN z = f1

IF x < a AND y > b THEN z = f2

IF x > a AND y < c THEN z = f3

IF x > a AND y > c THEN z = f4

We can use fuzzy membership functions to make the prerequisite parts fuzzy, as in the sketch below.
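A minimal sketch of that fuzzification, assuming sigmoid membership functions replace the crisp comparisons; the steepness gamma, the thresholds and the constant consequents f1..f4 are illustrative choices, not values from the slides. The rule weights are normalized explicitly, which is the weight normalization mentioned earlier.

# Minimal sketch: replace the crisp comparisons x < a, y < b, y < c with sigmoid
# membership functions, compute each rule's firing strength, normalize the weights,
# and combine the consequents. gamma and the constant consequents are illustrative.
from math import exp

def mf_less(v, threshold, gamma=10.0):
    """Fuzzy version of 'v < threshold' (a decreasing sigmoid)."""
    return 1.0 / (1.0 + exp(gamma * (v - threshold)))

def mf_greater(v, threshold, gamma=10.0):
    return 1.0 - mf_less(v, threshold, gamma)

def fuzzy_cart(x, y, a=0.5, b=0.5, c=0.3, f=(1.0, 2.0, 3.0, 4.0)):
    w = [mf_less(x, a) * mf_less(y, b),        # IF x < a AND y < b
         mf_less(x, a) * mf_greater(y, b),     # IF x < a AND y > b
         mf_greater(x, a) * mf_less(y, c),     # IF x > a AND y < c
         mf_greater(x, a) * mf_greater(y, c)]  # IF x > a AND y > c
    total = sum(w)                             # explicit weight normalization
    return sum(wi / total * fi for wi, fi in zip(w, f))

print(fuzzy_cart(0.2, 0.9))   # close to f2 = 2.0, but the boundaries are now smooth
print(fuzzy_cart(0.5, 0.3))   # near the boundaries all four rules contribute

With a large gamma the sigmoids approach step functions and the output reduces to the crisp CART rules; ANFIS would then fine-tune the membership and consequent parameters.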


CART: Two types of membership functions


CART: Fuzzy input/output relations when the outputs are constants


CART: Fuzzy input/output relations when the outputs are linear
