GMDH and FAKE GAME: Evolving ensembles of inductive models
TRANSCRIPT
CTU Prague, Faculty of Electrical Engineering, Department of Computer Science
Pavel Kordík
2/67
Outline of the tutorial …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
3/67
Theory in background
• Knowledge discovery
• Data preprocessing
• Data mining
• Neural networks
• Inductive models
• Continuous optimization
• Ensembles of models
• Information visualization
4/67
Problem statement
• Knowledge discovery (from databases) is a time-consuming task demanding expert skills.
• Data preprocessing is an extremely time-consuming process.
• Data mining involves many experiments to find proper methods and to adjust their parameters.
• Black-box data mining methods such as neural networks are not considered credible.
• Knowledge extraction from data mining methods is often very complicated.
5/67
Solutions proposed – FAKE GAME concept

[Diagram: INPUT DATA → AUTOMATED DATA PREPROCESSING → AUTOMATED DATA MINING (GAME ENGINE) → KNOWLEDGE EXTRACTION and INFORMATION VISUALIZATION → KNOWLEDGE, all behind the FAKE INTERFACE]
6/67
Detailed view of proposed solutions

[Diagram: PROBLEM IDENTIFICATION → DATA COLLECTION → DATA INSPECTION → DATA CLEANING → DATA INTEGRATION → DATA WAREHOUSING → INPUT DATA → AUTOMATED DATA PREPROCESSING → GAME ENGINE (an ensemble of models) for Classification, Prediction, Identification and Regression → knowledge extraction through the FAKE INTERFACE: math equations, feature ranking, interesting behaviour, credibility estimation, class boundaries, relationships of variables]
7/67
The GAME engine for automated data mining

[Diagram: PREPROCESSED DATA → GAME (an ensemble of models) → CLASSIFICATION or REGRESSION or PREDICTION or IDENTIFICATION]

What is hidden inside?
8/67
Background (GMDH)
• Group Method of Data Handling (GMDH) – prof. Ivakhnenko, 1968
• The principle of INDUCTION is employed: an inductive model grows from data; it decomposes a multivariate problem into subproblems, solves them in a space of lower dimensionality (two dimensions for units with 2 inputs), and combines the partial solutions to get the global solution of the problem.

[Diagram: a layered network of units P with polynomial transfer functions, from the input variables to the output variable]

Units with a polynomial transfer function; MIA GMDH is the most commonly used algorithm. For instance:

y = a x1 + b x2 + c
9/67
Our improvements

[Diagram comparing three architectures, each built from the input variables to the output variable:
– MIA GMDH: units with 2 inputs, unified units, non-heuristic (exhaustive) search
– ModGMDH: units with up to 3 inputs
– GAME: up to 3 inputs, interlayer connections, diversified units (P, C, G, L), search by GA + Deterministic Crowding]
10/67
Recent improvements (GAME)
• Group of Adaptive Models Evolution (GAME)
  – Heterogeneous units
  – Several competing training methods
  – Interlayer connections; the number of inputs allowed increases with the layer
  – A niching genetic algorithm (explained below) is employed in each layer to optimize the topology of GAME networks
  – An ensemble of models is generated (explained below)

[Diagram: a network built from the input variables through the first, second and third layers to the output layer, with interlayer connections; 3 inputs max in one layer, 4 inputs max in the next; units of types P, C, G, L]
11/67
Heterogeneous units in GAME

Each unit has inputs x1, x2, …, xn and one output y:

Linear (LinearNeuron):       y = \sum_{i=1}^{n} a_i x_i + a_{n+1}

Polynomial (CombiNeuron):    y = \sum_{i=1}^{m} a_i \prod_{j=1}^{n} x_j^{r_{ij}} + a_0

Gaussian (GaussianNeuron):   y = a_{n+1}\, e^{-\sum_{i=1}^{n} (x_i - a_i)^2 / a_{n+2}^2} + a_0

Sin (SinusNeuron):           y = a_{n+1} \sin\left(a_{n+2} \sum_{i=1}^{n} a_i x_i + a_{n+3}\right) + a_0

Logistic (SigmNeuron):       y = \frac{1}{1 + e^{-\sum_{i=1}^{n} a_i x_i}} + a_0

Exponential (ExpNeuron):     y = a_{n+2}\, e^{a_{n+1} \sum_{i=1}^{n} a_i x_i} + a_0

Rational (PolyFractNeuron):  y = \frac{a_{n+1}}{\sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + a_0}

Universal (BPNetwork):       y = \sum_{q=1}^{2n+1} \psi_q\left(\sum_{p=1}^{n} \phi_{qp}(x_p)\right)
12/67
Optimization of coefficients (learning)

Gaussian (GaussianNeuron):   y' = a_{n+1}\, e^{-\sum_{i=1}^{n} (x_i - a_i)^2 / a_{n+2}^2} + a_0

We have inputs x1, x2, …, xn and the target output y in the training data set.
We are looking for optimal values of the coefficients a_0, a_1, …, a_{n+2}.
The difference between the unit output y' and the target value y should be minimal for all vectors from the training data set:

E = \sum_{i=1}^{m} (y'_i - y_i)^2
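As a concrete illustration, this least-squares fitting of a unit's coefficients can be sketched as below (a minimal Python sketch, assuming the Gaussian transfer function as reconstructed above; `gaussian_unit` and `train` are illustrative names, not the GAME API). The gradient is estimated numerically here; the following slides derive the analytic gradient instead.

```python
import numpy as np

def gaussian_unit(x, a):
    # a = [a_0, a_1 ... a_n (centers), a_{n+1} (scale), a_{n+2} (width)]
    n = len(x)
    centers = a[1:n + 1]
    return a[n + 1] * np.exp(-np.sum((x - centers) ** 2) / a[n + 2] ** 2) + a[0]

def unit_error(X, y, a):
    # E = sum over the m training vectors of (y' - y)^2
    return sum((gaussian_unit(xv, a) - yv) ** 2 for xv, yv in zip(X, y))

def train(X, y, a, steps=200, lr=0.1, eps=1e-6):
    """Gradient descent on E; the gradient is estimated by central differences."""
    a = a.astype(float).copy()
    for _ in range(steps):
        grad = np.zeros_like(a)
        for i in range(len(a)):
            ap, am = a.copy(), a.copy()
            ap[i] += eps
            am[i] -= eps
            grad[i] = (unit_error(X, y, ap) - unit_error(X, y, am)) / (2 * eps)
        a -= lr * grad / len(X)
    return a
```

With a reasonable learning rate, the training error E descends toward a local minimum of the unit's energy surface.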
13/67
How to derive the analytic gradient?

• Error of the unit (the energy surface)
• Gradient of the error
• Unit with the Gaussian transfer function
• Partial derivative of the error in the direction of coefficient a_i
14/67
Partial derivatives of the Gauss unit
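The slide's figures did not survive extraction; a hedged reconstruction of the partial derivatives, using the Gaussian unit and the error E defined on the previous slides (y'_k is the unit output for the k-th training vector, \rho_k = \sum_i (x_{ki} - a_i)^2 / a_{n+2}^2):

```latex
E = \sum_{k=1}^{m} (y'_k - y_k)^2, \qquad
y'_k = a_{n+1}\, e^{-\rho_k} + a_0

\frac{\partial E}{\partial a_0} = \sum_{k=1}^{m} 2\,(y'_k - y_k)

\frac{\partial E}{\partial a_{n+1}} = \sum_{k=1}^{m} 2\,(y'_k - y_k)\, e^{-\rho_k}

\frac{\partial E}{\partial a_i} = \sum_{k=1}^{m} 2\,(y'_k - y_k)\, a_{n+1} e^{-\rho_k}\,
  \frac{2\,(x_{ki} - a_i)}{a_{n+2}^2}, \quad i = 1, \dots, n

\frac{\partial E}{\partial a_{n+2}} = \sum_{k=1}^{m} 2\,(y'_k - y_k)\, a_{n+1} e^{-\rho_k}\,
  \frac{2\,\rho_k}{a_{n+2}}
```

Each term is the chain rule applied to the squared error through the exponent \rho_k, so the unit can hand an exact gradient to the optimization method instead of having it estimated numerically.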
15/67
Optimization of their coefficients

[Diagram: a loop between the Unit and the Optimization method – the optimization method is given initial values of the coefficients a_1, a_2, …, a_n, proposes new values, the unit computes the error on the training data, and the loop repeats until the final values are found.]

a) The unit does not provide an analytic gradient, just its error; the optimization method has to estimate the gradient.
b) The unit provides the analytic gradient together with its error.
16/67
A very efficient gradient-based training for hybrid networks was developed!
17/67
Outline of the talk …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
18/67
Genetic algorithm
19/67
Example of encoding into a chromosome

[Diagram: a unit with inputs 1–7 inside the niching GA.
– Linear transfer unit (LinearNeuron): the chromosome 1001000 encodes which inputs are used; the transfer-function and settings parts are not implemented (n.i.); e.g. y = a_1 x_1 + a_2 x_2 + a_0.
– Polynomial transfer unit (CombiNeuron): the chromosome 0000110 encodes the inputs, and additional genes (e.g. 2115130, 1203211) encode the exponents of the polynomial transfer function; e.g. y = a_1 x_1 x_2^2 + a_2 x_1^2 x_2 + a_0.]
20/67
The process of GAME model evolution – Genetic Algorithm

Encoding the GAME unit into a genotype (first layer): Inputs | Type | Other, e.g. 0 0 1 0 | P | trans. fn.

GA (no niching): after a standard GA, all individuals are connected to the most significant feature. Selecting the N best individuals therefore extracts REDUNDANT information.
21/67
A better way is to use worse, but non-correlated units

By using less significant features we get more additional information than by using several of the best individuals connected to the most significant feature!

[Diagram: units A and B (f(A) = 8, f(B) = 7.99) combine into C with f(C) = 8; units X and Y (f(X) = 8, f(Y) = 5) combine into Z with f(Z) = 9]

The outputs of units A and B are correlated; combining them does not bring any improvement in performance.

How to do it?
22/67
Niching Genetic Algorithms

• A niching method extends the genetic algorithm to be able to locate multiple solutions.
• Niching methods promote the formation and maintenance of stable subpopulations in the genetic algorithm (canonical GAs converge to a single one).

Many methods exist for preserving the diversity of individuals (niching methods):

• Overspecification – good for dynamic systems
• Ecological genetic algorithms – multiple environments
• Heterozygote advantage – diploid genome, fitness increased by the distance of the parents
• Crowding (restricted replacement) – replacement of similar individuals in the subpopulation
• Restricted competition – tournament of compatible individuals (threshold)
• Fitness sharing – sharing methods – an overcrowded niche → search elsewhere
• Immune system models – antigens
23/67
Niching GA – Deterministic Crowding

• Deterministic Crowding: S. W. Mahfoud (1995).
• Subpopulations (niches) are based on the distance of individuals.

Distance = Hamming distance of the chromosomes + Hamming distance of the inputs processed.

Unit | Chromosome | Inputs processed
A    | 1000       | 1000
B    | 0010       | 0010
C    | 010010     | 1100
D    | 000011     | 1010

d(A, B) = H(1000, 0010) + H(1000, 0010) = 4
d(C, D) = H(010010, 000011) + H(1100, 1010) = 4

For units in the GAME network, the distance of units in one population is based on the phenotypic difference of the units.
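The distance computation above can be sketched as follows (an illustrative Python sketch; the dictionary fields are named after the slide's table, not after the GAME implementation):

```python
def hamming(a, b):
    # Hamming distance between two equal-length bit strings
    assert len(a) == len(b)
    return sum(c1 != c2 for c1, c2 in zip(a, b))

def unit_distance(u, v):
    # distance = Hamming dist. of the chromosomes + Hamming dist. of the inputs processed
    return hamming(u["chromosome"], v["chromosome"]) + hamming(u["inputs"], v["inputs"])

# The slide's example units:
A = {"chromosome": "1000", "inputs": "1000"}
B = {"chromosome": "0010", "inputs": "0010"}
C = {"chromosome": "010010", "inputs": "1100"}
D = {"chromosome": "000011", "inputs": "1010"}
```

Both pairs end up at distance 4, as on the slide: d(A, B) = 2 + 2 and d(C, D) = 2 + 2.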
24/67
Distance of GAME units

Distance(P1, P2) = genotypic distance + correlation of errors

• Normalized distance of the inputs: Hamming(100010, 101100) + features used
• Normalized distance of the transfer functions: Euclidean distance of the coefficients
• Normalized distance of other attributes: distance of the configuration variables
• Correlation of errors: computed from the units' deviations on the training & validation set

[Diagram: units P1 and P2 with inputs 1–8 in the niching GA; each unit is encoded into a chromosome with Inputs (e.g. 100010 vs. 101100), Transfer function and Other parts]
25/67
The process of GAME model evolution – GA with DC

Encoding the GAME unit into a genotype (first layer): Inputs | Type | Other, e.g. 0 0 1 0 | P | trans. fn.

Niching GA (Deterministic Crowding): the best individual is selected from each niche. Individuals connected to less significant features survive, so we gain more information.

Result: MAXIMUM information extracted.
26/67
Experimental Results: Standard GA versus GA + Deterministic Crowding

[Plot: the number of units connected to particular variables (Time, Day, Rs, Rn, PAR, Tair, RH, u, SatVapPr, VapPress, Battery) over epochs 0–200. The standard Genetic Algorithm ends with a single uniform population; GA with Deterministic Crowding maintains three subpopulations (niches). Units connected to the same input belong to one niche.]
27/67
Experimental Results: Standard GA versus GA + Deterministic Crowding

[Bar charts: RMS error of GA versus GA+DC models for cold water consumption (about 8.1e-3 to 8.8e-3), energy consumption (about 4.80e-2 to 5.30e-2) and hot water consumption (about 3.70e-2 to 4.10e-2).]

Results: a statistical test proved that, at the 95% level of significance, GA+DC performs better than the simple GA for the energy and hot water consumption.
28/67
Experimental Results: Standard GA versus GA + Deterministic Crowding

[Bar chart: average RMS error of 20 models on the on-ground nuclear tests data set (a small, simple data set) for the attributes CRATER DEPTH, CRATER DIAMETER, FIRE RADIUS, INSTANT RADIATION, SUMMARY RADIATION and WAVE PRESSURE, with Deterministic Crowding off (simple GA) versus on (GA+DC).]

Except for the models of the fire radius attribute, the performance of all other models is significantly better with Deterministic Crowding enabled.
29/67
Results: the best way is to combine the genotypic and phenotypic distance

[Charts: RMS error on the Boston testing data set (about 0.276–0.292) and the average, minimum and maximum RMS error of 10 ensemble models on the Boston data set (about 0.276–0.3), for the weighted and simple ensembles, with the niching distance based on: None, Genome, Correlation, Gen&Corr.]
30/67
The distance of units can be monitored during the training phase

[Visualization: units sorted by chromosome distance, correlation, error on training vectors and RMSE, at epoch 1 and epoch 30.]

• At the start of the niching Genetic Algorithm, units are randomly initialized and trained, and their error is computed.
• After 30 epochs the niching Genetic Algorithm terminates.
• Finally, the units are sorted according to their RMSE, chromosome difference and correlation.
31/67
Another visualization of diversity

[Figures: genotypic distances and correlation matrices for model imU (layer 1, epoch 9) and model pp (layer 10, epoch 10). In the first, units connected to the same input form groups, some of them well trained, but clusters are not apparent in the correlation matrix. In the second, each unit has a different genotype (units are connected to diverse inputs) and half of the units are better trained than the others.]
32/67
GA or niching GA?
• Genetic search is more effective than exhaustive search.
• The use of a niching genetic algorithm (e.g. with Deterministic Crowding) is beneficial:
  – More accurate models are generated
  – Feature ranking can be derived
  – Non-correlated units (active neurons) can be evolved
• The crossover of the active neurons' transfer functions brings further improvements.
33/67
Remember? Units with various transfer functions compete in the niching GA

[The same heterogeneous units as before: Linear (LinearNeuron), Polynomial (CombiNeuron), Gaussian (GaussianNeuron), Sin (SinusNeuron), Logistic (SigmNeuron), Exponential (ExpNeuron), Rational (PolyFractNeuron) and Universal (BPNetwork).]
34/67
Natural selection of units really works!

[Chart: RMS on the Boston data set (training and testing, roughly 0.100–0.350) for the configurations: all, FractGaussian, MultiGauss, all-fast, Perceptron, all-simple, Gauss, Sigm, Polynomial, Sin, Exp, CombiR300, Linear, Combi (the last reaching about 4.03e+05).]

The best performance is achieved when units of all types are competing in the construction process.
35/67
Optimization methods available in GAME
36/67
Experimental results of competing training methods on the Building data set

[Charts: RMS error on the testing data (Building data set), averaged over 5 runs, for hot water consumption, cold water consumption and energy consumption. The competing optimization methods include QN, CG, SADE, DE, all, HGAPSO, CACO, SOS, palDE, ACO, PSO and OS; their ranking differs per output (e.g. QN and CG lead for hot water; QN, all and DE for cold water; CG and DE for energy).]
37/67
RMS error on the Boston data set
38/67
Classification accuracy [%] on the Spiral data set
39/67
Evaluation on diverse data sets
What is the "all" configuration?
40/67
Remember the Genetic Algorithm optimizing the structure of GAME?

[Diagram: a unit with inputs 1–7 inside the niching GA, as before, but now the optimization method is added into the chromosomes.
– Linear transfer unit: the chromosome 1001000 encodes the inputs; the transfer-function part is not implemented; optimization method: CACO.
– Polynomial transfer unit: the chromosome 0000110 encodes the inputs, and additional genes (e.g. 2115130, 1203211) encode the transfer function; optimization method: DE.]
41/67
Outline of the talk …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
42/67
Ensemble models – why use them?

• A hot topic in the machine learning community
• Successfully applied in several areas such as:
  – Face recognition [Gutta96, Huang00]
  – Optical character recognition [Drucker93, Mao98]
  – Scientific image analysis [Cherkauer96]
  – Medical diagnosis [Cunningham00, Zhou02]
  – Seismic signal classification [Shimshoni98]
  – Drug discovery [Langdon02]
  – Feature extraction [Brown02]
43/67
Ensemble models – what are they?

• A collection of models (e.g. neural networks) is trained for the same task.
• The outputs of the models are then combined.

[Diagram: Model 1 … Model N, each mapping the input variables to an output variable; a combination of their outputs forms the ensemble output.]
44/67
Bagging

• Sample with replacement several training sets of size n (instead of having just one training set of size n).
• Build a model/classifier for each training set.
• Combine the classifiers' predictions.

[Diagram: Training data → (sampling with replacement) Sample 1 … Sample M → learning algorithm → Model 1 … Model M (classifiers) → ensemble model (classifier) output by averaging or voting.]
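The three steps above can be sketched as (an illustrative Python sketch; the toy mean-predictor stands in for an arbitrary learning algorithm):

```python
import random

def bagging(train_fn, data, m):
    """Draw m bootstrap samples (with replacement, size n = len(data))
    and build one model per sample."""
    models = []
    for _ in range(m):
        sample = [random.choice(data) for _ in data]  # sampling with replacement
        models.append(train_fn(sample))
    return models

def ensemble_predict(models, x):
    # combine the predictions by averaging (voting would be used for classifiers)
    return sum(model(x) for model in models) / len(models)

def train_mean(sample):
    # toy "learning algorithm": always predicts the mean target of its sample
    mean = sum(target for _, target in sample) / len(sample)
    return lambda x: mean
```

Any learner with the same train/predict shape can be plugged in as `train_fn`; the ensemble only relies on the bootstrap samples differing.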
45/67
We employed Bagging in GAME

• Group of Adaptive Models Evolution (GAME) is the method we are working on in our department (see the next slide).

[Diagram: Training data → (sampling with replacement) Sample 1 … Sample M → GAME → GAME model 1 … GAME model M → GAME ensemble output by averaging or voting.]
46/67
Results of Bagging GAME models

[Charts: RMS of skeleton age estimation for 27 individual GAME models, the simple ensemble and the weighted ensemble, on the training & validation data set (about 124–138) and on the testing data set (about 136–156).]

Weighted ensemble of GAME models: the weighted ensemble has an apparent tendency to overfit the data. While its performance is superior on the training and validation data, on the testing data several individual models perform better.
47/67
GAME ensembles are not statistically significantly better than the best individual GAME model!

Why? Are GAME models diverse?
– Vary input data ✓ (bagging)
– Vary input features ✓ (using a subset of features)
– Vary initial parameters ✓ (random initialization of weights)
– Vary model architecture ✓ (heterogeneous units used)
– Vary training algorithm ✓ (heterogeneous units used)
– Use a stochastic method when building the model ✓ (niching GA)

Yes, they use several methods to promote diversity!
48/67
Ensemble models – why do they work?

Bias-variance decomposition (a theoretical tool to study how the training data affect the performance of models/classifiers):

Total expected error of the model = variance + bias

• variance: the part of the expected error due to the nature of the training set
• bias: the part of the expected error caused by the fact that the model is not perfect

G. Brown: Diversity in Neural Network Ensembles, 2004

[Diagrams: ensembling reduces variance (models 1 and 2 trained on different training data are combined into an ensemble model); ensembling reduces bias.]
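In symbols (a standard statement of the decomposition, with \hat{f}_D the model trained on a random training set D and f the target function):

```latex
\mathbb{E}_D\left[(\hat{f}_D(x) - f(x))^2\right]
  = \underbrace{\left(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\left[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\right]}_{\text{variance}}
```

Averaging the outputs of diverse models leaves the bias term essentially unchanged but shrinks the variance term, which is why ensembling helps most for high-variance models.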
49/67
GAME ensembles are not statistically significantly better than the best individual GAME model!

Why?
• GAME models grow until the optimal complexity is reached (they grow from a minimal form up to the required complexity).
• Therefore the bias cannot be further reduced.
• Bagging slightly reduces the variance, but not significantly (for the data sets we have experimented with).

[Diagrams: ensembling reduces bias … just for non-optimal models.]
50/67
Results of Bagging GAME models

[Charts: RMS of the individual GAME models and the simple ensemble for cold water consumption (about 0.265–0.274, 6 models) and for age estimation (about 135–170, 12 models).]

Simple ensemble of GAME models: is the ensemble significantly better than any of the individual models?

Suboptimal GAME models: YES. Optimal GAME models: NO.
51/67
Outline of the talk …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
52/67
Ensembles can estimate credibility

• We found out that ensembling GAME models does not further improve the accuracy of modelling.
• Why use an ensemble then?
• There is one big advantage that ensemble models can provide: using an ensemble of models, we can estimate the credibility of the response for any input vector.
• Ensembles are starting to be used in this manner in several real-world applications:
53/67
Visualization of a model's behavior

[Diagram: one input (e.g. x2) varies from min to max while the other inputs (x1, x3) are held constant; the output y of the ModGMDH or GAME model is plotted against the varying input.]
54/67
Estimating credibility using an ensemble of GAME models

[Diagram: three GAME classifiers over the input attributes (EE1, WE1, WW1, EW1, …, FA3, FV3) are combined into a "Healthy" output; their response curves y1 … y9 are plotted for x1 in <0.0, 1.0>, y in <-1.0, 2.0>, x2 = 0.01, with the region where data are defined highlighted.]

While x2 varies from min to max and the other inputs stay constant, the output of the model shows the sensitivity of the model to the variable x2 in the configuration x1, x3 = const.

In regions where we have enough data, all models from the ensemble give a compromise response – they are credible.

When the responses of the models differ, the models are not credible (dark background).

We need to know when the response of a model is based on the training data and when it is just random guessing – then we can estimate the credibility of the model.
55/67
Artificial data set – partial definition

Credibility: the criterion is the dispersion of the models' responses.

Advantages:
• No need for the training data set,
• The success of the modeling method is considered,
• The importance of the inputs is considered.
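A minimal sketch of this dispersion criterion (illustrative Python; the threshold and the toy models are assumptions, not part of FAKE GAME):

```python
import statistics

def credibility(models, x, threshold=0.1):
    """Query every ensemble member; a low dispersion (standard deviation)
    of the responses suggests the region is covered by training data."""
    responses = [model(x) for model in models]
    dispersion = statistics.pstdev(responses)
    return dispersion, dispersion <= threshold

# toy ensemble: three models that agree near x = 0 and diverge far from it
models = [lambda x, k=k: k * x * x for k in (0.9, 1.0, 1.1)]
```

Near x = 0 all responses coincide and the vector is judged credible; far from the data the responses spread out and the answer is flagged as guessing.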
56/67
Practical example – nuclear data: A. G. Ivakhnenko's experiments with GMDH.
An extremely sparse data set – 6 dimensions, 10 data vectors.
57/67
Typical application of softcomputing methods in industry: the Building data set

[Diagram: a black box with models of hot water consumption, energy consumption and cold water consumption – a GAME ensemble.]

Looking for the optimal configuration of the GAME simulator – Pavel Staněk
58/67
Building data set

[Plot: cold water consumption versus x1 (temperature outside the building) from min to max; the responses y1 … y9 of the ensemble models are shown for x2 (normalized humidity) = 0.272, x3 (normalized solar radiation) = 0.7585, x4 (normalized wind strength) = 0.248.]

The data set is problem A of the "Great Energy Predictor Shootout" contest: "The Great Energy Predictor Shootout" – The First Building Data Analysis and Prediction Competition; ASHRAE Meeting; Denver, Colorado; June 1993.
59/67
Credibility of models in classification

[Plots: class regions of GAME model 1, GAME model 2 and GAME model 3 for Iris Setosa, Iris Virginica and Iris Versicolor, and their combination (models 1*2*3).]
60/67
Credibility of models in classification

[Plots: before and after.]
61/67
Other visualization techniques
62/67
Automated retrieval of 3D plots showing interesting behaviour
A genetic algorithm with a special fitness function is used to adjust all the other inputs (dimensions).
63/67
Conclusion

New quality in modeling:
• GAME models adapt to the complexity of the data set.
• Hybrid models are more accurate than models with a single type of unit.
• A compromise response of the ensemble indicates valid system states.
• The black-box model gives an estimate of its accuracy (2.3 ± 0.1).
• Automated retrieval of "interesting plots" of system behavior.
• A Java application was developed – real-time simulation of models.
64/67
Benchmarks of GAME

Classification of a very complex data set: the benchmark Spiral data classification problem solved by GAME.

Note: the inner crossvalidation mechanism that prevents overfitting was disabled for this problem.
65/67
Internet advertisements database
BackProp best result: 87.97%
GAME best result: 93.13%
66/67
Other benchmarking problems

Pima Indians data set – 10-fold crossvalidation results, classification accuracy [%].

Stock value prediction (S&P DEP RECEIPTS) – change-in-direction prediction ratio:
Backpropagation with momentum: 82.4%; GAME: 91.3%

Zoo database – accuracy of classification into genotypes:
Backpropagation: 80.77%; GAME: 93.5%
67/67
Thank you!