GMDH and FAKE GAME: Evolving ensembles of inductive models
TRANSCRIPT
CTU Prague, Faculty of Electrical Engineering, Department of Computer Science
Pavel Kordík
2/67
Outline of the tutorial …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
3/67
Theory in background
• Knowledge discovery
• Data preprocessing
• Data mining
• Neural networks
• Inductive models
• Continuous optimization
• Ensembles of models
• Information visualization
4/67
Problem statement
• Knowledge discovery (from databases) is a time-consuming task demanding expert skills.
• Data preprocessing is an extremely time-consuming process.
• Data mining involves many experiments to find proper methods and to adjust their parameters.
• Black-box data mining methods such as neural networks are not considered credible.
• Knowledge extraction from data mining methods is often very complicated.
5/67
Solutions proposed – FAKE GAME concept

[Diagram: INPUT DATA → AUTOMATED DATA PREPROCESSING → AUTOMATED DATA MINING (GAME ENGINE) → KNOWLEDGE EXTRACTION and INFORMATION VISUALIZATION → KNOWLEDGE, all behind the FAKE INTERFACE]
6/67
Detailed view of proposed solutions

[Diagram: PROBLEM IDENTIFICATION → DATA COLLECTION → DATA INSPECTION → DATA CLEANING → DATA INTEGRATION → DATA WAREHOUSING → INPUT DATA → AUTOMATED DATA PREPROCESSING → GAME ENGINE (an ensemble of models) for Classification, Prediction, Identification and Regression → knowledge extraction through the FAKE INTERFACE: math equations, feature ranking, interesting behaviour, credibility estimation, class boundaries, relationships of variables]
7/67
The GAME engine for automated data mining

[Diagram: PREPROCESSED DATA → GAME (an ensemble of models) → CLASSIFICATION or REGRESSION or PREDICTION or IDENTIFICATION]

What is hidden inside?
8/67
Background (GMDH)
• Group Method of Data Handling (GMDH) – prof. Ivakhnenko, 1968
• The principle of INDUCTION is employed: an inductive model grows from data; it decomposes a multivariate problem into subproblems, solves them in a space of lower dimensionality (two dimensions for units with 2 inputs), and combines the partial solutions to get the global solution of the problem.

[Diagram: a layered network of units P with polynomial transfer functions, from the input variables to the output variable]

Units with a polynomial transfer function; MIA GMDH is the most commonly used algorithm. For instance:

y = a x1 + b x2 + c
9/67
Our improvements

[Diagram comparing three architectures, each built from the input variables to the output variable:
– MIA GMDH: units with 2 inputs, unified units, non-heuristic (exhaustive) search
– ModGMDH: units with up to 3 inputs
– GAME: up to 3 inputs, interlayer connections, diversified units (P, C, G, L), search by GA + Deterministic Crowding]
10/67
Recent improvements (GAME)
• Group of Adaptive Models Evolution (GAME)
  – Heterogeneous units
  – Several competing training methods
  – Interlayer connections; the number of inputs allowed increases with the layer
  – A niching genetic algorithm (explained below) is employed in each layer to optimize the topology of GAME networks
  – An ensemble of models is generated (explained below)

[Diagram: a network built from the input variables through the first, second and third layers to the output layer, with interlayer connections; 3 inputs max in one layer, 4 inputs max in the next; units of types P, C, G, L]
11/67
Heterogeneous units in GAME

Each unit has inputs x1, x2, …, xn and one output y:

Linear (LinearNeuron):       y = \sum_{i=1}^{n} a_i x_i + a_{n+1}

Polynomial (CombiNeuron):    y = \sum_{i=1}^{m} a_i \prod_{j=1}^{n} x_j^{r_{ij}} + a_0

Gaussian (GaussianNeuron):   y = a_{n+1}\, e^{-\sum_{i=1}^{n} (x_i - a_i)^2 / a_{n+2}^2} + a_0

Sin (SinusNeuron):           y = a_{n+1} \sin\left(a_{n+2} \sum_{i=1}^{n} a_i x_i + a_{n+3}\right) + a_0

Logistic (SigmNeuron):       y = \frac{1}{1 + e^{-\sum_{i=1}^{n} a_i x_i}} + a_0

Exponential (ExpNeuron):     y = a_{n+2}\, e^{a_{n+1} \sum_{i=1}^{n} a_i x_i} + a_0

Rational (PolyFractNeuron):  y = \frac{a_{n+1}}{\sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + a_0}

Universal (BPNetwork):       y = \sum_{q=1}^{2n+1} \psi_q\left(\sum_{p=1}^{n} \phi_{qp}(x_p)\right)
12/67
Optimization of coefficients (learning)

Gaussian (GaussianNeuron):   y' = a_{n+1}\, e^{-\sum_{i=1}^{n} (x_i - a_i)^2 / a_{n+2}^2} + a_0

We have inputs x1, x2, …, xn and the target output y in the training data set.
We are looking for optimal values of the coefficients a_0, a_1, …, a_{n+2}.
The difference between the unit output y' and the target value y should be minimal for all vectors from the training data set:

E = \sum_{i=1}^{m} (y'_i - y_i)^2
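As a concrete illustration, this least-squares fitting of a unit's coefficients can be sketched as below (a minimal Python sketch, assuming the Gaussian transfer function as reconstructed above; `gaussian_unit` and `train` are illustrative names, not the GAME API). The gradient is estimated numerically here; the following slides derive the analytic gradient instead.

```python
import numpy as np

def gaussian_unit(x, a):
    # a = [a_0, a_1 ... a_n (centers), a_{n+1} (scale), a_{n+2} (width)]
    n = len(x)
    centers = a[1:n + 1]
    return a[n + 1] * np.exp(-np.sum((x - centers) ** 2) / a[n + 2] ** 2) + a[0]

def unit_error(X, y, a):
    # E = sum over the m training vectors of (y' - y)^2
    return sum((gaussian_unit(xv, a) - yv) ** 2 for xv, yv in zip(X, y))

def train(X, y, a, steps=200, lr=0.1, eps=1e-6):
    """Gradient descent on E; the gradient is estimated by central differences."""
    a = a.astype(float).copy()
    for _ in range(steps):
        grad = np.zeros_like(a)
        for i in range(len(a)):
            ap, am = a.copy(), a.copy()
            ap[i] += eps
            am[i] -= eps
            grad[i] = (unit_error(X, y, ap) - unit_error(X, y, am)) / (2 * eps)
        a -= lr * grad / len(X)
    return a
```

With a reasonable learning rate, the training error E descends toward a local minimum of the unit's energy surface.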
13/67
How to derive the analytic gradient?

• Error of the unit (the energy surface)
• Gradient of the error
• Unit with the Gaussian transfer function
• Partial derivative of the error in the direction of coefficient a_i
14/67
Partial derivatives of the Gauss unit
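The slide's figures did not survive extraction; a hedged reconstruction of the partial derivatives, using the Gaussian unit and the error E defined on the previous slides (y'_k is the unit output for the k-th training vector, \rho_k = \sum_i (x_{ki} - a_i)^2 / a_{n+2}^2):

```latex
E = \sum_{k=1}^{m} (y'_k - y_k)^2, \qquad
y'_k = a_{n+1}\, e^{-\rho_k} + a_0

\frac{\partial E}{\partial a_0} = \sum_{k=1}^{m} 2\,(y'_k - y_k)

\frac{\partial E}{\partial a_{n+1}} = \sum_{k=1}^{m} 2\,(y'_k - y_k)\, e^{-\rho_k}

\frac{\partial E}{\partial a_i} = \sum_{k=1}^{m} 2\,(y'_k - y_k)\, a_{n+1} e^{-\rho_k}\,
  \frac{2\,(x_{ki} - a_i)}{a_{n+2}^2}, \quad i = 1, \dots, n

\frac{\partial E}{\partial a_{n+2}} = \sum_{k=1}^{m} 2\,(y'_k - y_k)\, a_{n+1} e^{-\rho_k}\,
  \frac{2\,\rho_k}{a_{n+2}}
```

Each term is the chain rule applied to the squared error through the exponent \rho_k, so the unit can hand an exact gradient to the optimization method instead of having it estimated numerically.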
15/67
Optimization of their coefficients

[Diagram: a loop between the Unit and the Optimization method – the optimization method is given initial values of the coefficients a_1, a_2, …, a_n, proposes new values, the unit computes the error on the training data, and the loop repeats until the final values are found.]

a) The unit does not provide an analytic gradient, just its error; the optimization method has to estimate the gradient.
b) The unit provides the analytic gradient together with its error.
16/67
A very efficient gradient-based training for hybrid networks was developed!
17/67
Outline of the talk …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
18/67
Genetic algorithm
19/67
Example of encoding into a chromosome

[Diagram: a unit with inputs 1–7 inside the niching GA.
– Linear transfer unit (LinearNeuron): the chromosome 1001000 encodes which inputs are used; the transfer-function and settings parts are not implemented (n.i.); e.g. y = a_1 x_1 + a_2 x_2 + a_0.
– Polynomial transfer unit (CombiNeuron): the chromosome 0000110 encodes the inputs, and additional genes (e.g. 2115130, 1203211) encode the exponents of the polynomial transfer function; e.g. y = a_1 x_1 x_2^2 + a_2 x_1^2 x_2 + a_0.]
20/67
The process of GAME model evolution – Genetic Algorithm

Encoding the GAME unit into a genotype (first layer): Inputs | Type | Other, e.g. 0 0 1 0 | P | trans. fn.

GA (no niching): after a standard GA, all individuals are connected to the most significant feature. Selecting the N best individuals therefore extracts REDUNDANT information.
21/67
A better way is to use worse, but non-correlated units

By using less significant features we get more additional information than by using several of the best individuals connected to the most significant feature!

[Diagram: units A and B (f(A) = 8, f(B) = 7.99) combine into C with f(C) = 8; units X and Y (f(X) = 8, f(Y) = 5) combine into Z with f(Z) = 9]

The outputs of units A and B are correlated; combining them does not bring any improvement in performance.

How to do it?
22/67
Niching Genetic Algorithms

• A niching method extends the genetic algorithm to be able to locate multiple solutions.
• Niching methods promote the formation and maintenance of stable subpopulations in the genetic algorithm (canonical GAs converge to a single one).

Many methods exist for preserving the diversity of individuals (niching methods):

• Overspecification – good for dynamic systems
• Ecological genetic algorithms – multiple environments
• Heterozygote advantage – diploid genome, fitness increased by the distance of the parents
• Crowding (restricted replacement) – replacement of similar individuals in the subpopulation
• Restricted competition – tournament of compatible individuals (threshold)
• Fitness sharing – sharing methods – an overcrowded niche → search elsewhere
• Immune system models – antigens
23/67
Niching GA – Deterministic Crowding

• Deterministic Crowding: S. W. Mahfoud (1995).
• Subpopulations (niches) are based on the distance of individuals.

Distance = Hamming distance of the chromosomes + Hamming distance of the inputs processed.

Unit | Chromosome | Inputs processed
A    | 1000       | 1000
B    | 0010       | 0010
C    | 010010     | 1100
D    | 000011     | 1010

d(A, B) = H(1000, 0010) + H(1000, 0010) = 4
d(C, D) = H(010010, 000011) + H(1100, 1010) = 4

For units in the GAME network, the distance of units in one population is based on the phenotypic difference of the units.
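The distance computation above can be sketched as follows (an illustrative Python sketch; the dictionary fields are named after the slide's table, not after the GAME implementation):

```python
def hamming(a, b):
    # Hamming distance between two equal-length bit strings
    assert len(a) == len(b)
    return sum(c1 != c2 for c1, c2 in zip(a, b))

def unit_distance(u, v):
    # distance = Hamming dist. of the chromosomes + Hamming dist. of the inputs processed
    return hamming(u["chromosome"], v["chromosome"]) + hamming(u["inputs"], v["inputs"])

# The slide's example units:
A = {"chromosome": "1000", "inputs": "1000"}
B = {"chromosome": "0010", "inputs": "0010"}
C = {"chromosome": "010010", "inputs": "1100"}
D = {"chromosome": "000011", "inputs": "1010"}
```

Both pairs end up at distance 4, as on the slide: d(A, B) = 2 + 2 and d(C, D) = 2 + 2.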
24/67
Distance of GAME units

Distance(P1, P2) = genotypic distance + correlation of errors

• Normalized distance of the inputs: Hamming(100010, 101100) + features used
• Normalized distance of the transfer functions: Euclidean distance of the coefficients
• Normalized distance of other attributes: distance of the configuration variables
• Correlation of errors: computed from the units' deviations on the training & validation set

[Diagram: units P1 and P2 with inputs 1–8 in the niching GA; each unit is encoded into a chromosome with Inputs (e.g. 100010 vs. 101100), Transfer function and Other parts]
25/67
The process of GAME model evolution – GA with DC

Encoding the GAME unit into a genotype (first layer): Inputs | Type | Other, e.g. 0 0 1 0 | P | trans. fn.

Niching GA (Deterministic Crowding): the best individual is selected from each niche. Individuals connected to less significant features survive, so we gain more information.

Result: MAXIMUM information extracted.
26/67
Experimental Results: Standard GA versus GA + Deterministic Crowding

[Plot: the number of units connected to particular variables (Time, Day, Rs, Rn, PAR, Tair, RH, u, SatVapPr, VapPress, Battery) over epochs 0–200. The standard Genetic Algorithm ends with a single uniform population; GA with Deterministic Crowding maintains three subpopulations (niches). Units connected to the same input belong to one niche.]
27/67
Experimental Results: Standard GA versus GA + Deterministic Crowding

[Bar charts: RMS error of GA versus GA+DC models for cold water consumption (about 8.1e-3 to 8.8e-3), energy consumption (about 4.80e-2 to 5.30e-2) and hot water consumption (about 3.70e-2 to 4.10e-2).]

Results: a statistical test proved that, at the 95% level of significance, GA+DC performs better than the simple GA for the energy and hot water consumption.
28/67
Experimental Results: Standard GA versus GA + Deterministic Crowding

[Bar chart: average RMS error of 20 models on the on-ground nuclear tests data set (a small, simple data set) for the attributes CRATER DEPTH, CRATER DIAMETER, FIRE RADIUS, INSTANT RADIATION, SUMMARY RADIATION and WAVE PRESSURE, with Deterministic Crowding off (simple GA) versus on (GA+DC).]

Except for the models of the fire radius attribute, the performance of all other models is significantly better with Deterministic Crowding enabled.
29/67
Results: the best way is to combine the genotypic and phenotypic distance

[Charts: RMS error on the Boston testing data set (about 0.276–0.292) and the average, minimum and maximum RMS error of 10 ensemble models on the Boston data set (about 0.276–0.3), for the weighted and simple ensembles, with the niching distance based on: None, Genome, Correlation, Gen&Corr.]
30/67
The distance of units can be monitored during the training phase

[Visualization: units sorted by chromosome distance, correlation, error on training vectors and RMSE, at epoch 1 and epoch 30.]

• At the start of the niching Genetic Algorithm, units are randomly initialized and trained, and their error is computed.
• After 30 epochs the niching Genetic Algorithm terminates.
• Finally, the units are sorted according to their RMSE, chromosome difference and correlation.
31/67
Another visualization of diversity

[Figures: genotypic distances and correlation matrices for model imU (layer 1, epoch 9) and model pp (layer 10, epoch 10). In the first, units connected to the same input form groups, some of them well trained, but clusters are not apparent in the correlation matrix. In the second, each unit has a different genotype (units are connected to diverse inputs) and half of the units are better trained than the others.]
32/67
GA or niching GA?
• Genetic search is more effective than exhaustive search.
• The use of a niching genetic algorithm (e.g. with Deterministic Crowding) is beneficial:
  – More accurate models are generated
  – Feature ranking can be derived
  – Non-correlated units (active neurons) can be evolved
• The crossover of the active neurons' transfer functions brings further improvements.
33/67
Remember? Units with various transfer functions compete in the niching GA

[The same heterogeneous units as before: Linear (LinearNeuron), Polynomial (CombiNeuron), Gaussian (GaussianNeuron), Sin (SinusNeuron), Logistic (SigmNeuron), Exponential (ExpNeuron), Rational (PolyFractNeuron) and Universal (BPNetwork).]
34/67
Natural selection of units really works!

[Chart: RMS on the Boston data set (training and testing, roughly 0.100–0.350) for the configurations: all, FractGaussian, MultiGauss, all-fast, Perceptron, all-simple, Gauss, Sigm, Polynomial, Sin, Exp, CombiR300, Linear, Combi (the last reaching about 4.03e+05).]

The best performance is achieved when units of all types are competing in the construction process.
35/67
Optimization methods available in GAME
36/67
Experimental results of competing training methods on the Building data set

[Charts: RMS error on the testing data (Building data set), averaged over 5 runs, for hot water consumption, cold water consumption and energy consumption. The competing optimization methods include QN, CG, SADE, DE, all, HGAPSO, CACO, SOS, palDE, ACO, PSO and OS; their ranking differs per output (e.g. QN and CG lead for hot water; QN, all and DE for cold water; CG and DE for energy).]
37/67
RMS error on the Boston data set
38/67
Classification accuracy [%] on the Spiral data set
39/67
Evaluation on diverse data sets
What is the "all" configuration?
40/67
Remember the Genetic Algorithm optimizing the structure of GAME?

[Diagram: a unit with inputs 1–7 inside the niching GA, as before, but now the optimization method is added into the chromosomes.
– Linear transfer unit: the chromosome 1001000 encodes the inputs; the transfer-function part is not implemented; optimization method: CACO.
– Polynomial transfer unit: the chromosome 0000110 encodes the inputs, and additional genes (e.g. 2115130, 1203211) encode the transfer function; optimization method: DE.]
41/67
Outline of the talk …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
42/67
Ensemble models – why use them?

• A hot topic in the machine learning community
• Successfully applied in several areas such as:
  – Face recognition [Gutta96, Huang00]
  – Optical character recognition [Drucker93, Mao98]
  – Scientific image analysis [Cherkauer96]
  – Medical diagnosis [Cunningham00, Zhou02]
  – Seismic signal classification [Shimshoni98]
  – Drug discovery [Langdon02]
  – Feature extraction [Brown02]
43/67
Ensemble models – what are they?

• A collection of models (e.g. neural networks) is trained for the same task.
• The outputs of the models are then combined.

[Diagram: Model 1 … Model N, each mapping the input variables to an output variable; a combination of their outputs forms the ensemble output.]
44/67
Bagging

• Sample with replacement several training sets of size n (instead of having just one training set of size n).
• Build a model/classifier for each training set.
• Combine the classifiers' predictions.

[Diagram: Training data → (sampling with replacement) Sample 1 … Sample M → learning algorithm → Model 1 … Model M (classifiers) → ensemble model (classifier) output by averaging or voting.]
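The three steps above can be sketched as (an illustrative Python sketch; the toy mean-predictor stands in for an arbitrary learning algorithm):

```python
import random

def bagging(train_fn, data, m):
    """Draw m bootstrap samples (with replacement, size n = len(data))
    and build one model per sample."""
    models = []
    for _ in range(m):
        sample = [random.choice(data) for _ in data]  # sampling with replacement
        models.append(train_fn(sample))
    return models

def ensemble_predict(models, x):
    # combine the predictions by averaging (voting would be used for classifiers)
    return sum(model(x) for model in models) / len(models)

def train_mean(sample):
    # toy "learning algorithm": always predicts the mean target of its sample
    mean = sum(target for _, target in sample) / len(sample)
    return lambda x: mean
```

Any learner with the same train/predict shape can be plugged in as `train_fn`; the ensemble only relies on the bootstrap samples differing.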
45/67
We employed Bagging in GAME

• Group of Adaptive Models Evolution (GAME) is the method we are working on in our department (see the next slide).

[Diagram: Training data → (sampling with replacement) Sample 1 … Sample M → GAME → GAME model 1 … GAME model M → GAME ensemble output by averaging or voting.]
46/67
Results of Bagging GAME models

[Charts: RMS of skeleton age estimation for 27 individual GAME models, the simple ensemble and the weighted ensemble, on the training & validation data set (about 124–138) and on the testing data set (about 136–156).]

Weighted ensemble of GAME models: the weighted ensemble has an apparent tendency to overfit the data. While its performance is superior on the training and validation data, on the testing data several individual models perform better.
47/67
GAME ensembles are not statistically significantly better than the best individual GAME model!

Why? Are GAME models diverse?
– Vary input data ✓ (bagging)
– Vary input features ✓ (using a subset of features)
– Vary initial parameters ✓ (random initialization of weights)
– Vary model architecture ✓ (heterogeneous units used)
– Vary training algorithm ✓ (heterogeneous units used)
– Use a stochastic method when building the model ✓ (niching GA)

Yes, they use several methods to promote diversity!
48/67
Ensemble models – why do they work?

Bias-variance decomposition (a theoretical tool to study how the training data affect the performance of models/classifiers):

Total expected error of the model = variance + bias

• variance: the part of the expected error due to the nature of the training set
• bias: the part of the expected error caused by the fact that the model is not perfect

G. Brown: Diversity in Neural Network Ensembles, 2004

[Diagrams: ensembling reduces variance (models 1 and 2 trained on different training data are combined into an ensemble model); ensembling reduces bias.]
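In symbols (a standard statement of the decomposition, with \hat{f}_D the model trained on a random training set D and f the target function):

```latex
\mathbb{E}_D\left[(\hat{f}_D(x) - f(x))^2\right]
  = \underbrace{\left(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\left[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\right]}_{\text{variance}}
```

Averaging the outputs of diverse models leaves the bias term essentially unchanged but shrinks the variance term, which is why ensembling helps most for high-variance models.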
49/67
GAME ensembles are not statistically significantly better than the best individual GAME model!

Why?
• GAME models grow until the optimal complexity is reached (they grow from a minimal form up to the required complexity).
• Therefore the bias cannot be further reduced.
• Bagging slightly reduces the variance, but not significantly (for the data sets we have experimented with).

[Diagrams: ensembling reduces bias … just for non-optimal models.]
50/67
Results of Bagging GAME models

[Charts: RMS of the individual GAME models and the simple ensemble for cold water consumption (about 0.265–0.274, 6 models) and for age estimation (about 135–170, 12 models).]

Simple ensemble of GAME models: is the ensemble significantly better than any of the individual models?

Suboptimal GAME models: YES. Optimal GAME models: NO.
51/67
Outline of the talk …
• Introduction, FAKE GAME framework, architecture of GAME
• GAME – niching genetic algorithm
• GAME – ensemble methods
  – Trying to improve accuracy
• GAME – ensemble methods
  – Models' quality
  – Credibility estimation
52/67
Ensembles can estimate credibility

• We found out that ensembling GAME models does not further improve the accuracy of modelling.
• Why use an ensemble then?
• There is one big advantage that ensemble models can provide: using an ensemble of models, we can estimate the credibility of the response for any input vector.
• Ensembles are starting to be used in this manner in several real-world applications:
53/67
Visualization of a model's behavior

[Diagram: one input (e.g. x2) varies from min to max while the other inputs (x1, x3) are held constant; the output y of the ModGMDH or GAME model is plotted against the varying input.]
54/67
Estimating credibility using an ensemble of GAME models

[Diagram: three GAME classifiers over the input attributes (EE1, WE1, WW1, EW1, …, FA3, FV3) are combined into a "Healthy" output; their response curves y1 … y9 are plotted for x1 in <0.0, 1.0>, y in <-1.0, 2.0>, x2 = 0.01, with the region where data are defined highlighted.]

While x2 varies from min to max and the other inputs stay constant, the output of the model shows the sensitivity of the model to the variable x2 in the configuration x1, x3 = const.

In regions where we have enough data, all models from the ensemble give a compromise response – they are credible.

When the responses of the models differ, the models are not credible (dark background).

We need to know when the response of a model is based on the training data and when it is just random guessing – then we can estimate the credibility of the model.
55/67
Artificial data set – partial definition

Credibility: the criterion is the dispersion of the models' responses.

Advantages:
• No need for the training data set,
• The success of the modeling method is considered,
• The importance of the inputs is considered.
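A minimal sketch of this dispersion criterion (illustrative Python; the threshold and the toy models are assumptions, not part of FAKE GAME):

```python
import statistics

def credibility(models, x, threshold=0.1):
    """Query every ensemble member; a low dispersion (standard deviation)
    of the responses suggests the region is covered by training data."""
    responses = [model(x) for model in models]
    dispersion = statistics.pstdev(responses)
    return dispersion, dispersion <= threshold

# toy ensemble: three models that agree near x = 0 and diverge far from it
models = [lambda x, k=k: k * x * x for k in (0.9, 1.0, 1.1)]
```

Near x = 0 all responses coincide and the vector is judged credible; far from the data the responses spread out and the answer is flagged as guessing.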
56/67
Practical example – nuclear data: A. G. Ivakhnenko's experiments with GMDH.
An extremely sparse data set – 6 dimensions, 10 data vectors.
57/67
Typical application of softcomputing methods in industry: the Building data set

[Diagram: a black box with models of hot water consumption, energy consumption and cold water consumption – a GAME ensemble.]

Looking for the optimal configuration of the GAME simulator – Pavel Staněk
58/67
Building data set

[Plot: cold water consumption versus x1 (temperature outside the building) from min to max; the responses y1 … y9 of the ensemble models are shown for x2 (normalized humidity) = 0.272, x3 (normalized solar radiation) = 0.7585, x4 (normalized wind strength) = 0.248.]

The data set is problem A of the "Great Energy Predictor Shootout" contest: "The Great Energy Predictor Shootout" – The First Building Data Analysis and Prediction Competition; ASHRAE Meeting; Denver, Colorado; June 1993.
59/67
Credibility of models in classification

[Plots: class regions of GAME model 1, GAME model 2 and GAME model 3 for Iris Setosa, Iris Virginica and Iris Versicolor, and their combination (models 1*2*3).]
60/67
Credibility of models in classification

[Plots: before and after.]
61/67
Other visualization techniques
62/67
Automated retrieval of 3D plots showing interesting behaviour
A genetic algorithm with a special fitness function is used to adjust all the other inputs (dimensions).
63/67
Conclusion

New quality in modeling:
• GAME models adapt to the complexity of the data set.
• Hybrid models are more accurate than models with a single type of unit.
• A compromise response of the ensemble indicates valid system states.
• The black-box model gives an estimate of its accuracy (2.3 ± 0.1).
• Automated retrieval of "interesting plots" of system behavior.
• A Java application was developed – real-time simulation of models.
64/67
Benchmarks of GAME

Classification of a very complex data set: the benchmark Spiral data classification problem solved by GAME.

Note: the inner crossvalidation mechanism that prevents overfitting was disabled for this problem.
65/67
Internet advertisements database
BackProp best result: 87.97%
GAME best result: 93.13%
66/67
Other benchmarking problems

Pima Indians data set – 10-fold crossvalidation results, classification accuracy [%].

Stock value prediction (S&P DEP RECEIPTS) – change-in-direction prediction ratio:
Backpropagation with momentum: 82.4%; GAME: 91.3%

Zoo database – accuracy of classification into genotypes:
Backpropagation: 80.77%; GAME: 93.5%
67/67
Thank you!