Semi-random model tree ensembles: an effective and scalable regression method

Bernhard Pfahringer
Department of Computer Science
University of Waikato, New Zealand
September 22nd, 2011


DESCRIPTION

We present and investigate ensembles of semi-random model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivalling the state of the art in numeric prediction. An empirical investigation shows that semi-random model trees achieve predictive performance competitive with state-of-the-art methods such as Gaussian process regression and Additive Groves of regression trees. The training and optimization of random model trees scales better to larger datasets than Gaussian process regression, and is consistently one to two orders of magnitude faster than Additive Groves.

TRANSCRIPT


Page 2: Semi-random model tree ensembles: an effective and scalable regression method

Background

Outline

1 Background

2 Algorithm

3 Results

4 Summary


Page 3: Semi-random model tree ensembles: an effective and scalable regression method

Background

Local regression

Non-linear functions can be approximated by a set of locally linear estimators.
Regression and model trees are fast multi-variate versions of local regression.
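A minimal sketch of this idea (not from the slides): approximate a non-linear function by fitting one least-squares line per region, here with a single split at the median of the inputs.

```python
import numpy as np

# Approximate sin(x) with two locally linear estimators: split the input
# range at its median and fit a least-squares line in each region.
x = np.linspace(0.0, 4.0, 200)
y = np.sin(x)  # the non-linear target

split = np.median(x)

def fit_line(xs, ys):
    # least-squares slope and intercept
    return np.polyfit(xs, ys, 1)

left_model = fit_line(x[x < split], y[x < split])
right_model = fit_line(x[x >= split], y[x >= split])

def predict(xq):
    a, b = left_model if xq < split else right_model
    return a * xq + b

piecewise_mae = np.mean([abs(predict(xi) - np.sin(xi)) for xi in x])
a, b = fit_line(x, y)  # one global line, for contrast
global_mae = np.mean(np.abs(a * x + b - y))
```

On this example the two-piece fit has a clearly lower mean absolute error than a single global line; a tree simply refines the partition further.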


Page 4: Semi-random model tree ensembles: an effective and scalable regression method

Background

Piece-wise linear approximation example


Page 5: Semi-random model tree ensembles: an effective and scalable regression method

Background

Sample Regression Tree: constants in the leaves

A159 <= -0.62 :
  A149 <= 0.52 :  Y = 1.6977
  A149 > 0.52 :   Y = 1.2213
A159 > -0.62 :
  A149 <= 0.638 :
    A57 <= -0.485 : Y = 0.8388
    A57 > -0.485 :  Y = 1.0569
  A149 > 0.638 :  Y = 0.6062


Page 6: Semi-random model tree ensembles: an effective and scalable regression method

Background

Sample Model Tree: linear models in the leaves

A159 <= -0.62 :
  A149 <= 0.52 :  LM1
  A149 > 0.52 :   LM2
A159 > -0.62 :
  A149 <= 0.638 : LM3
  A149 > 0.638 :  LM4

LM1: Y = -0.597 * A149 - 0.211 * A159 + 1.901
LM2: Y = -0.471 * A149 - 0.211 * A159 + 1.353
LM3: Y = -0.365 * A149 - 0.232 * A159 + 1.017
LM4: Y = -0.555 * A149 - 0.232 * A159 + 0.776
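Prediction with a model tree reads off directly: route the instance down the split tests, then evaluate the chosen leaf's linear model. A small sketch of the sample tree above (these leaf models use only A149 and A159):

```python
# Routing one instance through the sample model tree: follow the split
# tests, then evaluate the selected leaf's linear model.
def model_tree_predict(a149, a159):
    if a159 <= -0.62:
        if a149 <= 0.52:
            return -0.597 * a149 - 0.211 * a159 + 1.901  # LM1
        return -0.471 * a149 - 0.211 * a159 + 1.353      # LM2
    if a149 <= 0.638:
        return -0.365 * a149 - 0.232 * a159 + 1.017      # LM3
    return -0.555 * a149 - 0.232 * a159 + 0.776          # LM4
```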


Algorithm

Page 8: Semi-random model tree ensembles: an effective and scalable regression method

Algorithm

Ensembles of Semi-Random Model Trees

Ensembles usually improve results.
Most ensembles use randomization to generate diversity.
Two sources of randomness:
  For each tree: divide the data into a train and a validation set.
  To split: select the best attribute from a random subset of all attributes.


Page 9: Semi-random model tree ensembles: an effective and scalable regression method

Algorithm

Single Semi-Random Model Tree

Only consider the median as split value (=> balanced trees).
Leaf model: a linear ridge regression model.
Cap model predictions inside the observed extremes.
Optimise tree depth and ridge value using the validation set.
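The capping rule is simple enough to sketch directly: a leaf's linear model can extrapolate far outside the training data, so its raw output is clipped to the extremes of the target values seen during training (the `min`/`max` computed at the start of BUILDTREE).

```python
# Clip a leaf model's raw prediction to the observed target range.
def capped_prediction(raw, y_min, y_max):
    return min(max(raw, y_min), y_max)
```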


Page 10: Semi-random model tree ensembles: an effective and scalable regression method

Algorithm

Build ensemble

BUILDENSEMBLE(data, numTrees, k)

1 for i = 1 to numTrees
2   do randomly split data into two:
3        train + validate
4   BUILDTREE(train, validate, k)
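A direct Python rendering of BUILDENSEMBLE, where each tree draws its own random train/validate split (the first source of randomness); `build_tree` stands in for the BUILDTREE routine of the next slide.

```python
import random

# Build an ensemble: every tree sees a fresh random half/half split of the
# data into a train fold and a validation fold.
def build_ensemble(data, num_trees, k, build_tree):
    trees = []
    for _ in range(num_trees):
        shuffled = random.sample(data, len(data))
        mid = len(shuffled) // 2
        trees.append(build_tree(shuffled[:mid], shuffled[mid:], k))
    return trees
```

At prediction time the ensemble averages the member trees' (capped) predictions.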


Page 11: Semi-random model tree ensembles: an effective and scalable regression method

Algorithm

BuildTree

BUILDTREE(train, validate, k)

1  min ← MINTARGETVALUE(train)
2  max ← MAXTARGETVALUE(train)
3  localSSE ← LINREG(train, validate)
4
5  if |train| > 10 and |validate| > 10
6    do split ← RANDOMSPLIT(train, k)
7
8       smT ← SMALLER(train, split)
9       smV ← SMALLER(validate, split)
10      smaller ← BUILDTREE(smT, smV, k)
11
12      laT ← LARGER(train, split)
13      laV ← LARGER(validate, split)
14      larger ← BUILDTREE(laT, laV, k)
15
16      subSSE ← SSE(smaller, larger, validate)
17
18      if localSSE < subSSE
19        do smaller ← null
20           larger ← null
21      else
22           localModel ← null
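A runnable, simplified rendering of BUILDTREE. Two stated simplifications: the leaf model is a mean predictor rather than a ridge regression, and prediction capping is omitted. The control flow follows the pseudocode: fit a local model, recurse while both folds hold more than 10 rows, then keep whichever of local model vs. subtree scores the lower SSE on the validation fold. Rows are `(feature_tuple, target)` pairs.

```python
import random

def sse(predict, rows):
    # sum of squared errors of a prediction function on (x, y) rows
    return sum((predict(x) - y) ** 2 for x, y in rows)

def mean_model(rows):
    # simplified leaf model: predict the mean target (slides use ridge regression)
    m = sum(y for _, y in rows) / len(rows)
    return lambda x: m

def random_split(train, k):
    # RANDOMSPLIT: best of k random (attribute, median) stumps by training SSE
    best = None
    for _ in range(k):
        attr = random.randrange(len(train[0][0]))
        vals = sorted(x[attr] for x, _ in train)
        thr = vals[len(vals) // 2]
        lo = [r for r in train if r[0][attr] <= thr]
        hi = [r for r in train if r[0][attr] > thr]
        if not lo or not hi:
            continue
        s = sse(mean_model(lo), lo) + sse(mean_model(hi), hi)
        if best is None or s < best[0]:
            best = (s, attr, thr)
    return None if best is None else best[1:]

def build_tree(train, validate, k):
    local = mean_model(train)
    if len(train) <= 10 or len(validate) <= 10:
        return local
    split = random_split(train, k)
    if split is None:
        return local
    attr, thr = split
    def side(rows, smaller):
        return [r for r in rows if (r[0][attr] <= thr) == smaller]
    smT, smV = side(train, True), side(validate, True)
    laT, laV = side(train, False), side(validate, False)
    if not smV or not laV:
        return local
    smaller = build_tree(smT, smV, k)
    larger = build_tree(laT, laV, k)
    sub = lambda x: (smaller if x[attr] <= thr else larger)(x)
    # keep the subtree only if it beats the local model on validation SSE
    return local if sse(local, validate) < sse(sub, validate) else sub
```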



Page 13: Semi-random model tree ensembles: an effective and scalable regression method

Algorithm

Ridge regression

LINREG(train, validate)

1 for ridge in 10^-8, 10^-4, 10^-2, 10^-1, 1, 10
2   do model_ridge ← RIDGEREGRESS(train, ridge)
3      sse_ridge ← SSE(model_ridge, validate)
4 if bestModel == model_10
5   do build models for ridge = 10^2, 10^3, ...
6      and so on while improving
7 localModel ← bestModel
8 return minimum SSE on validation data
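A sketch of this search using the closed-form ridge solution: score each ridge value on the validation fold, and keep extending the grid upward (10^2, 10^3, ...) as long as the largest value keeps winning.

```python
import numpy as np

def ridge_fit(X, y, ridge):
    # closed-form ridge regression: (X^T X + ridge * I)^-1 X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)

def linreg_search(X_tr, y_tr, X_va, y_va):
    grid = [1e-8, 1e-4, 1e-2, 1e-1, 1.0, 10.0]
    best_r, best_sse, best_w = None, np.inf, None
    for r in grid:
        w = ridge_fit(X_tr, y_tr, r)
        s = float(np.sum((X_va @ w - y_va) ** 2))
        if s < best_sse:
            best_r, best_sse, best_w = r, s, w
    # extend the grid while its largest value is still the best one
    while best_r == grid[-1]:
        grid.append(grid[-1] * 10.0)
        w = ridge_fit(X_tr, y_tr, grid[-1])
        s = float(np.sum((X_va @ w - y_va) ** 2))
        if s >= best_sse:
            break
        best_r, best_sse, best_w = grid[-1], s, w
    return best_w, best_r, best_sse
```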


Page 14: Semi-random model tree ensembles: an effective and scalable regression method

Algorithm

Random split selection

RANDOMSPLIT(train, k)

1 for i = 1 to k
2   do splitAttr ← RANDOMCHOICE(allAttrs)
3      stump ← STUMP(APPROXMEDIAN(splitAttr))
4      compute SSE(stump, train)
5 return minimum-SSE stump


Page 15: Semi-random model tree ensembles: an effective and scalable regression method

Algorithm

Parameter Settings

Reported experiments:

  average the predictions of 50 randomized model trees
  to split, select the best of 50% randomly selected attributes

Generally: these should be optimised separately for every application, e.g. using cross-validation.

Number of trees: "the more the merrier", but with diminishing returns.
Number of randomly selected attributes: 50% is a good default, but may depend on the total number of attributes and on sparseness.


Results

Page 17: Semi-random model tree ensembles: an effective and scalable regression method

Results

Comparison

More than 20 Torgo/UCI datasets, each with > 900 examples.
Repeated 2/3 training, 1/3 testing splits.
The training data is split into equal build and validation halves (1/3, 1/3).
Preprocessed for missing or categorical values.
Compare to:
  LR: linear ridge regression, optimising the ridge value
  GP: Gaussian process regression, optimising the noise level and RBF gamma
  AG: Additive Groves, using the "fast" script
Evaluation measure: RMAE (relative mean absolute error).
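The slides do not spell out the RMAE formula; a common definition, assumed here, is the model's mean absolute error divided by the MAE of the trivial predictor that always outputs the mean, expressed as a percentage (so values below 100 beat the mean predictor).

```python
# RMAE under the assumed definition: 100 * MAE(model) / MAE(mean predictor).
def rmae(y_true, y_pred, y_train_mean):
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    baseline = sum(abs(t - y_train_mean) for t in y_true) / len(y_true)
    return 100.0 * mae / baseline
```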


Page 18: Semi-random model tree ensembles: an effective and scalable regression method

Results

RMAE on Torgo/UCI

RMAE for Torgo/UCI data

[Bar chart omitted: RMAE values between 0 and 100 for the datasets colorhistogram, layout, cooc, texture, colormoments, bank8FM, stock, mv, ailerons, elnino, elevators, fried, delta_ailerons, 2dplanes, delta_elevators, cal_housing, cpu_act, cpu_small, bank32nh, abalone, pol, house_8L, puma8NH, kin8nm, house_16H, puma32H and quake; one bar each for RMT, GP, LR and AG.]

Figure: RMAE for Torgo/UCI datasets, sorted by the linear regression result.


Page 19: Semi-random model tree ensembles: an effective and scalable regression method

Results

Build times on Torgo/UCI

Training time in seconds for Torgo/UCI data

[Bar chart omitted: training times between 0.1 and 100000 seconds (log scale) for the datasets stock, quake, abalone, delta_ailerons, bank32nh, bank8FM, cpu_act, cpu_small, kin8nm, puma32H, puma8NH, delta_elevators, ailerons, pol, elevators, cal_housing, house_16H, house_8L, 2dplanes, fried, mv, layout, colorhistogram, colormoments, cooc, texture and elnino; one bar each for RMT, GP, LR and AG.]

Figure: Training time in seconds for Torgo/UCI datasets, sorted by the number of instances in each dataset; note the use of a logarithmic y-scale.


Page 20: Semi-random model tree ensembles: an effective and scalable regression method

Results

UCI Census dataset

Table: Partial results, 2,458,285 examples in total, therefore about 800,000 in the training fold.

Method  RMAE   Time (secs)
LR      15.96   1205
RMT      9.78  19811
GP          ?      ?  (would need 5 TB of RAM)
AG          ?      ?  (estimated 2,000,000)


Page 21: Semi-random model tree ensembles: an effective and scalable regression method

Results

Near infrared (NIR) Datasets

proprietary NIR data

7 datasets
From 255 up to 7500 spectra each
Between 170 and 500-odd features
Preprocessed for noise and baseline shift


Page 22: Semi-random model tree ensembles: an effective and scalable regression method

Results

Sample NIR spectrum

Preprocessed sample spectrum (nitrogen in soil)

[Line plot omitted: spectral values between roughly -2 and 4 over about 170 wavelength indices.]


Page 23: Semi-random model tree ensembles: an effective and scalable regression method

Results

RMAE on NIR data

RMAE for NIR datasets

[Bar chart omitted: RMAE values between 10 and 90 for the datasets n, omd, rmd, tc, phe, ph, p5, na and g5; one bar each for RMT, GP, LR and AG.]

Figure: RMAE for NIR datasets, sorted by the linear regression result.


Page 24: Semi-random model tree ensembles: an effective and scalable regression method

Results

Build times on NIR data

Training time in seconds for NIR data

[Bar chart omitted: training times between 0.1 and 100000 seconds (log scale) for the datasets omd, rmd, na, n, tc, ph, phe, p5 and g5; one bar each for RMT, GP, LR and AG.]

Figure: Training time in seconds for NIR datasets, sorted by the number of instances in each dataset; note the use of a logarithmic y-scale.


Page 25: Semi-random model tree ensembles: an effective and scalable regression method

Results

Random Model Tree Build Times discussion

Complexity is O(K * N * log N + K^2 * N).
The second term (the linear model computation) seems to dominate.
Therefore the observed complexity is approximately O(K^2 * N).


Summary

Page 27: Semi-random model tree ensembles: an effective and scalable regression method

Summary

Conclusions

Semi-Random Model Trees perform well.
They are fast: build time is practically linear in N.
They can model non-linear relationships.


Page 28: Semi-random model tree ensembles: an effective and scalable regression method

Summary

Future Work

Improve efficiency for large K.
Study more and different regression problems.
More comparisons to alternative regression schemes.
A streaming/MOA variant.
