PSMS for Neural Networks on the Agnostic vs Prior Knowledge
Challenge
Hugo Jair Escalante, Manuel Montes and Enrique Sucar
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE),
México
IJCNN-2007 ALvsPK Challenge Orlando, Florida, August 17, 2007
Outline
• Introduction
• Particle swarm optimization
• Particle swarm model selection
• Results
• Conclusions
Introduction: model selection
• Agnostic learning
  – General-purpose methods
  – No knowledge of the task at hand or of machine learning is required
• Prior knowledge
  – Prior knowledge can increase a model's accuracy
  – Domain expertise is needed
Introduction
• Problem: Given a set of preprocessing methods, feature selection methods, and learning algorithms (CLOP), select the best combination of them, together with their hyperparameters
• Solution: Bio-inspired search strategy (PSO)
(Images: bird flocking and fish schooling)
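One way to picture the search space: each candidate solution is a full pipeline (preprocessing, feature selection, and a classifier, with their hyperparameters) encoded as a fixed-length numeric vector that a swarm can move through. Below is a minimal sketch with hypothetical method inventories and hyperparameter ranges; the actual CLOP toolbox is Matlab-based and uses its own encoding.

```python
# Hypothetical method inventories for illustration; the real CLOP toolbox
# has its own modules and a different internal encoding.
PREPROCESSORS = ["none", "standardize", "normalize", "shift_n_scale"]
FEATURE_SELECTORS = ["none", "s2n", "relief"]
CLASSIFIERS = ["neural", "svc", "kridge"]

def decode(particle):
    """Map a 5-dimensional position vector in [0, 1) to a concrete pipeline."""
    p1, p2, p3, h1, h2 = particle
    return {
        "preprocess": PREPROCESSORS[int(p1 * len(PREPROCESSORS)) % len(PREPROCESSORS)],
        "feat_sel": FEATURE_SELECTORS[int(p2 * len(FEATURE_SELECTORS)) % len(FEATURE_SELECTORS)],
        "classifier": CLASSIFIERS[int(p3 * len(CLASSIFIERS)) % len(CLASSIFIERS)],
        "units": 1 + int(h1 * 10),        # e.g., hidden units in [1, 10] (illustrative range)
        "shrinkage": 0.01 + h2 * 5.0,     # e.g., regularization in [0.01, 5.01] (illustrative range)
    }

# A particle's position decodes to one candidate model
model = decode([0.3, 0.9, 0.1, 0.45, 0.28])
```

With this kind of mapping, moving a particle through the continuous search space corresponds to switching methods and tuning hyperparameters simultaneously.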
Particle swarm optimization (PSO)
• A population of individuals is created (Swarm)
• Each individual (particle) represents a solution to the problem at hand
• Particles fly through the search space by considering the best global and individual solutions
• A fitness function is used for evaluating solutions
Particle swarm optimization (PSO)
• Begin
  – Initialize swarm
  – Locate leader (pg)
  – it = 0
  – While it < max_it
    • For each particle
      – Update position (2)
      – Evaluation (fitness)
      – Update particle's best (p)
    • EndFor
    • Update leader (pg)
    • it++
  – EndWhile
• End
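The loop above can be sketched as a minimal global-best PSO for a continuous, box-bounded search space. The inertia weight w and acceleration constants c1, c2 below are common illustrative values, not the challenge settings; "(2)" in the pseudocode refers to the standard PSO velocity/position update equations.

```python
import random

def pso(fitness, dim, bounds, n_particles=10, max_it=100,
        w=0.5, c1=2.0, c2=2.0):
    """Minimize `fitness` over [lo, hi]^dim with global-best PSO."""
    lo, hi = bounds
    # Initialize swarm: random positions, zero velocities
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best (p)
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]     # locate leader (pg)

    for _ in range(max_it):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity update: inertia + pull toward pbest and gbest
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < pbest_f[i]:                   # update particle's best (p)
                pbest[i], pbest_f[i] = pos[i][:], f
        g = min(range(n_particles), key=lambda i: pbest_f[i])
        if pbest_f[g] < gbest_f:                 # update leader (pg)
            gbest, gbest_f = pbest[g][:], pbest_f[g]
    return gbest, gbest_f

# Example: minimize the sphere function, whose optimum is at the origin
best, best_f = pso(lambda x: sum(v * v for v in x), dim=2, bounds=(-5.0, 5.0))
```

In PSMS the fitness being minimized is not the sphere function but a cross-validation error estimate of the model encoded by each particle.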
PSO for model selection (PSMS)
• Each particle encodes a CLOP model
• Cross-validation BER is used for evaluating models
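The fitness of a candidate model is its k-fold cross-validation balanced error rate (BER), the average of the per-class error rates. A minimal pure-Python sketch, assuming binary ±1 labels and any learner exposed as a `train_and_predict(Xtr, ytr, Xte)` function (a stand-in for training and applying a CLOP model):

```python
def ber(y_true, y_pred):
    """Balanced error rate: mean of the per-class error rates (labels are +1/-1)."""
    errs = []
    for c in (-1, 1):
        idx = [i for i, y in enumerate(y_true) if y == c]
        if idx:
            errs.append(sum(y_pred[i] != c for i in idx) / len(idx))
    return sum(errs) / len(errs)

def cv_ber(train_and_predict, X, y, k=5):
    """k-fold cross-validation BER using simple interleaved folds."""
    n = len(X)
    folds = [list(range(i, n, k)) for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        tr = [i for i in range(n) if i not in held_out]
        preds = train_and_predict([X[i] for i in tr], [y[i] for i in tr],
                                  [X[i] for i in test_idx])
        scores.append(ber([y[i] for i in test_idx], preds))
    return sum(scores) / k

# Example: a majority-class baseline scores BER = 0.5 on imbalanced data,
# which is why BER (rather than plain error rate) is the challenge metric.
X = [[i] for i in range(20)]
y = [1 if i < 15 else -1 for i in range(20)]
majority = lambda Xtr, ytr, Xte: [max(set(ytr), key=ytr.count)] * len(Xte)
score = cv_ber(majority, X, y, k=5)   # -> 0.5
```

PSMS plugs this estimate in as the PSO fitness function, so the swarm searches for the pipeline with the lowest cross-validated BER.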
Experimental settings
• Standard parameters for PSO
• 10 particles per swarm
• PSMS applied to ADA, GINA, HIVA and SYLVA
• 5-fold cross-validation was used
Results up to March 1st
Corrida_final
• 500 iterations for ADA
• 100 iterations for HIVA and GINA
• 50 iterations for SYLVA
• Trial and error for NOVA
Results up to March 1st
The best average BER is still held by Reference (Gavin Cawley) with "the bad". Note that the best entry for each dataset is not necessarily the best entry overall. Some of the best agnostic entries for individual datasets were made as part of prior knowledge entries (the bottom four); there is no corresponding overall agnostic ranking.
Agnostic learning best ranked entries as of March 1st, 2007
Entrant             ADA  GINA  HIVA  NOVA  SYLVA  Overall score  Overall best entry                                       Rank
Roman Lutz            1     1     7     1      1         0.1507  LogitBoost with trees                                       1
Juha Reunanen         5     3     4     2      3         0.2123  cross-indexing-7a                                           4
H. Jair Escalante     4     4     3     5      4         0.2857  Corrida_final                                               6
Vladimir Nikulin      6     5     2     8      5         0.4786  vrs4                                                       11
Vojtech Franc         9     6     1     9      7         0.5269  RBF SVM                                                    13
J. Wichard            3     8     6    11      2         0.5364  ensemble of trees                                          14
Erinija               7     7     5     7      8         0.5583  liknon feature selection + state-of-the-art classifiers    17
Marc Boulle           2    10     8     3      6         0.5985  Data Grid (CMA)                                            19
weseeare              8    11    10     6     12         0.7328  AUCmax                                                     22
pipibjc              10    13    11    10     13         0.8819  naiveBayes_Ensemble                                        26
cww                  12     9    13    12     10         0.8941  cww1_1                                                     27
Partial entries: Chloe Azencott 13 12 12 11; Jorge Sueiras 9; Zhihang Chen 11 2 12 9
Results up to March 1st
(Figures: results after 100 iterations vs. 500 iterations on ADA)
Results up to August 1st
ADA
  Entry: Corrida_final_10CV (ID 922), Test BER 0.1804, Test AUC 0.9015, Score 0.09
  Model: chain({standardize({'center=0'}), normalize({'center=1'}), shift_n_scale({'take_log=0'}), neural({'units=5', 'shrinkage=1.4323', 'balance=0', 'maxiter=257'}), bias})
GINA
  Entry: AdaBoost * (ID 170), Test BER 0.053, Test AUC 0.9878, Score 0.3846
  Model: chain({normalize({'center=0'}), svc({'coef0=0.1', 'degree=5', 'gamma=0', 'shrinkage=0.01'}), bias})
HIVA
  Entry: Corrida_final (ID 919), Test BER 0.2854, Test AUC 0.7551, Score 0.0884
  Model: chain({standardize({'center=1'}), normalize({'center=0'}), neural({'units=5', 'shrinkage=3.028', 'balance=0', 'maxiter=448'}), bias})
NOVA
  Entry: AdaBoost * (ID 170), Test BER 0.0504, Test AUC 0.9895, Score 0.1987
  Model: chain({normalize({'center=0'}), gentleboost(neural({'units=1', 'shrinkage=0.2', 'balance=1', 'maxiter=50'}), {'units=10', 'rejNum=3'}), bias})
SYLVA
  Entry: PSMS_100_4all_NCV (ID 987), Test BER 0.0084, Test AUC 0.9989, Score 0.2362
  Model: chain({standardize({'center=0'}), normalize({'center=0'}), shift_n_scale({'center=1'}), neural({'units=8', 'shrinkage=1.2853', 'balance=0', 'maxiter=362'}), bias})
Overall
  Entry: PSMS_100_4all_NCV (ID 987), Test BER 0.1178, Test AUC 0.925, Score 0.2464
  Model: same as Corrida_final, except for SYLVA's model
* Models selected by trial and error
Results up to August 1st
The best average BER is still held by Reference (Gavin Cawley) with "the bad". Note that the best entry for each dataset is not necessarily the best entry overall. The blue-shaded entries did not count towards the prize (the participant was part of a group or did not wish to be identified).
Agnostic learning best ranked entries as of August 1st, 2007
Entrant               ADA  GINA  HIVA  NOVA  SYLVA   Score  Overall best entry                 Rank
Roman Lutz              1     1    10     2      2  0.1431  LogitBoost with trees                 1
Alexander Borisov       2     2     5     9     10  0.2373  out1-fs-nored-val (Intel final 1)     4
Juha Reunanen           9     5     4     4      8  0.2424  cross-indexing-8                      5
Vladimir Martyanov      3     3     7    11      3  0.2432  GBT + PF                              6
Vladimir Nikulin        4     7     6     3      6  0.2436  vn5                                   7
H. Jair Escalante       8    10     2     7      4  0.2464  PSMS_100_4all_NCV                     8
Mehreen Saeed          11     6    11     1      5  0.2786  Submit D final                       11
Erinija Pranckeviene    6    11     3    15      1  0.3105  TMK                                  17
Joerg Wichard          10     8     9    12      9  0.4332  liknon feature selection + soa(1)    27
Namgil Lee             12    12     8    10      7  0.5059  mlp+commitee+mcic                    34
Vojtech Franc          14     9     1    13     12  0.5355  RBF SVM                              36
Marc Boulle             5    14    12     5     11  0.6062  Data Grid (CMA)                      40
Entry777                7    17    13    16     15  0.7270  Linear Regression Tree               43
weseeare               13    16    15     8     17  0.7405  AUCmax                               44
pipibjc                15    18    16    14     18  0.8862  naiveBayes_Ensemble                  47
cww                    17    13    19    16     14  0.9043  cww1_1                               48
Stijn Vanderlooy       19    15    18    17     16  0.9398  micc-ikat                            50
Results up to August 1st
(Figures: test BER and test AUC for the selected models)
Conclusions
• Competitive and simple models are obtained with PSMS
• No knowledge of the problem at hand, nor of machine learning, is required
• PSMS is easy to implement
• It suffers from the same limitations as other stochastic search algorithms (e.g., no guarantee of finding the global optimum)