PSMS for Neural Networks on the Agnostic vs Prior Knowledge
Challenge
Hugo Jair Escalante, Manuel Montes and Enrique Sucar
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE),
México
IJCNN-2007 ALvsPK Challenge Orlando, Florida, August 17, 2007
Outline
• Introduction
• Particle swarm optimization
• Particle swarm model selection
• Results
• Conclusions
Introduction: model selection
• Agnostic learning
  – General-purpose methods
  – No knowledge of the task at hand or of machine learning is required
• Prior knowledge
  – Prior knowledge can increase a model's accuracy
  – Domain expertise is needed
Introduction
• Problem: Given a set of preprocessing methods, feature selection methods, and learning algorithms (CLOP), select the best combination of them, together with their hyperparameters
• Solution: Bio-inspired search strategy (PSO)
(Images: bird flocking and fish schooling)
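One way to picture the search space: each candidate solution is a full pipeline (preprocessing, feature selection, and a classifier, with their hyperparameters) encoded as a fixed-length numeric vector that a swarm can move through. Below is a minimal sketch with hypothetical method inventories and hyperparameter ranges; the actual CLOP toolbox is Matlab-based and uses its own encoding.

```python
# Hypothetical method inventories for illustration; the real CLOP toolbox
# has its own modules and a different internal encoding.
PREPROCESSORS = ["none", "standardize", "normalize", "shift_n_scale"]
FEATURE_SELECTORS = ["none", "s2n", "relief"]
CLASSIFIERS = ["neural", "svc", "kridge"]

def decode(particle):
    """Map a 5-dimensional position vector in [0, 1) to a concrete pipeline."""
    p1, p2, p3, h1, h2 = particle
    return {
        "preprocess": PREPROCESSORS[int(p1 * len(PREPROCESSORS)) % len(PREPROCESSORS)],
        "feat_sel": FEATURE_SELECTORS[int(p2 * len(FEATURE_SELECTORS)) % len(FEATURE_SELECTORS)],
        "classifier": CLASSIFIERS[int(p3 * len(CLASSIFIERS)) % len(CLASSIFIERS)],
        "units": 1 + int(h1 * 10),        # e.g., hidden units in [1, 10] (illustrative range)
        "shrinkage": 0.01 + h2 * 5.0,     # e.g., regularization in [0.01, 5.01] (illustrative range)
    }

# A particle's position decodes to one candidate model
model = decode([0.3, 0.9, 0.1, 0.45, 0.28])
```

With this kind of mapping, moving a particle through the continuous search space corresponds to switching methods and tuning hyperparameters simultaneously.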
Particle swarm optimization (PSO)
• A population of individuals is created (Swarm)
• Each individual (particle) represents a solution to the problem at hand
• Particles fly through the search space by considering the best global and individual solutions
• A fitness function is used for evaluating solutions
Particle swarm optimization (PSO)
• Begin
  – Initialize swarm
  – Locate leader (pg)
  – it = 0
  – While it < max_it
    • For each particle
      – Update position (2)
      – Evaluation (fitness)
      – Update particle's best (p)
    • EndFor
    • Update leader (pg)
    • it++
  – EndWhile
• End
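The loop above can be sketched as a minimal global-best PSO for a continuous, box-bounded search space. The inertia weight w and acceleration constants c1, c2 below are common illustrative values, not the challenge settings; "(2)" in the pseudocode refers to the standard PSO velocity/position update equations.

```python
import random

def pso(fitness, dim, bounds, n_particles=10, max_it=100,
        w=0.5, c1=2.0, c2=2.0):
    """Minimize `fitness` over [lo, hi]^dim with global-best PSO."""
    lo, hi = bounds
    # Initialize swarm: random positions, zero velocities
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best (p)
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]     # locate leader (pg)

    for _ in range(max_it):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity update: inertia + pull toward pbest and gbest
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < pbest_f[i]:                   # update particle's best (p)
                pbest[i], pbest_f[i] = pos[i][:], f
        g = min(range(n_particles), key=lambda i: pbest_f[i])
        if pbest_f[g] < gbest_f:                 # update leader (pg)
            gbest, gbest_f = pbest[g][:], pbest_f[g]
    return gbest, gbest_f

# Example: minimize the sphere function, whose optimum is at the origin
best, best_f = pso(lambda x: sum(v * v for v in x), dim=2, bounds=(-5.0, 5.0))
```

In PSMS the fitness being minimized is not the sphere function but a cross-validation error estimate of the model encoded by each particle.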
PSO for model selection (PSMS)
• Each particle encodes a CLOP model
• Cross-validation BER is used for evaluating models
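The fitness of a candidate model is its k-fold cross-validation balanced error rate (BER), the average of the per-class error rates. A minimal pure-Python sketch, assuming binary ±1 labels and any learner exposed as a `train_and_predict(Xtr, ytr, Xte)` function (a stand-in for training and applying a CLOP model):

```python
def ber(y_true, y_pred):
    """Balanced error rate: mean of the per-class error rates (labels are +1/-1)."""
    errs = []
    for c in (-1, 1):
        idx = [i for i, y in enumerate(y_true) if y == c]
        if idx:
            errs.append(sum(y_pred[i] != c for i in idx) / len(idx))
    return sum(errs) / len(errs)

def cv_ber(train_and_predict, X, y, k=5):
    """k-fold cross-validation BER using simple interleaved folds."""
    n = len(X)
    folds = [list(range(i, n, k)) for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        tr = [i for i in range(n) if i not in held_out]
        preds = train_and_predict([X[i] for i in tr], [y[i] for i in tr],
                                  [X[i] for i in test_idx])
        scores.append(ber([y[i] for i in test_idx], preds))
    return sum(scores) / k

# Example: a majority-class baseline scores BER = 0.5 on imbalanced data,
# which is why BER (rather than plain error rate) is the challenge metric.
X = [[i] for i in range(20)]
y = [1 if i < 15 else -1 for i in range(20)]
majority = lambda Xtr, ytr, Xte: [max(set(ytr), key=ytr.count)] * len(Xte)
score = cv_ber(majority, X, y, k=5)   # -> 0.5
```

PSMS plugs this estimate in as the PSO fitness function, so the swarm searches for the pipeline with the lowest cross-validated BER.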
Experimental settings
• Standard parameters for PSO
• 10 particles per swarm
• PSMS applied to ADA, GINA, HIVA and SYLVA
• 5-fold cross-validation was used
Results up to March 1st
Corrida_final
• 500 iterations for ADA
• 100 iterations for HIVA and GINA
• 50 iterations for SYLVA
• Trial and error for NOVA
Results up to March 1st
The best average BER is still held by Reference (Gavin Cawley) with "the bad". Note that the best entry for each dataset is not necessarily the best entry overall. Some of the best agnostic entries for individual datasets were made as part of prior knowledge entries (the bottom four); there is no corresponding overall agnostic ranking.
Agnostic learning best ranked entries as of March 1st, 2007
Entrant             ADA  GINA  HIVA  NOVA  SYLVA  Overall score  Overall best entry                                       Rank
Roman Lutz            1     1     7     1      1         0.1507  LogitBoost with trees                                       1
Juha Reunanen         5     3     4     2      3         0.2123  cross-indexing-7a                                           4
H. Jair Escalante     4     4     3     5      4         0.2857  Corrida_final                                               6
Vladimir Nikulin      6     5     2     8      5         0.4786  vrs4                                                       11
Vojtech Franc         9     6     1     9      7         0.5269  RBF SVM                                                    13
J. Wichard            3     8     6    11      2         0.5364  ensemble of trees                                          14
Erinija               7     7     5     7      8         0.5583  liknon feature selection + state-of-the-art classifiers    17
Marc Boulle           2    10     8     3      6         0.5985  Data Grid (CMA)                                            19
weseeare              8    11    10     6     12         0.7328  AUCmax                                                     22
pipibjc              10    13    11    10     13         0.8819  naiveBayes_Ensemble                                        26
cww                  12     9    13    12     10         0.8941  cww1_1                                                     27
Partial entries: Chloe Azencott 13 12 12 11; Jorge Sueiras 9; Zhihang Chen 11 2 12 9
Results up to March 1st
(Figures: results after 100 iterations vs. 500 iterations on ADA)
Results up to August 1st
ADA
  Entry: Corrida_final_10CV (ID 922), Test BER 0.1804, Test AUC 0.9015, Score 0.09
  Model: chain({standardize({'center=0'}), normalize({'center=1'}), shift_n_scale({'take_log=0'}), neural({'units=5', 'shrinkage=1.4323', 'balance=0', 'maxiter=257'}), bias})
GINA
  Entry: AdaBoost * (ID 170), Test BER 0.053, Test AUC 0.9878, Score 0.3846
  Model: chain({normalize({'center=0'}), svc({'coef0=0.1', 'degree=5', 'gamma=0', 'shrinkage=0.01'}), bias})
HIVA
  Entry: Corrida_final (ID 919), Test BER 0.2854, Test AUC 0.7551, Score 0.0884
  Model: chain({standardize({'center=1'}), normalize({'center=0'}), neural({'units=5', 'shrinkage=3.028', 'balance=0', 'maxiter=448'}), bias})
NOVA
  Entry: AdaBoost * (ID 170), Test BER 0.0504, Test AUC 0.9895, Score 0.1987
  Model: chain({normalize({'center=0'}), gentleboost(neural({'units=1', 'shrinkage=0.2', 'balance=1', 'maxiter=50'}), {'units=10', 'rejNum=3'}), bias})
SYLVA
  Entry: PSMS_100_4all_NCV (ID 987), Test BER 0.0084, Test AUC 0.9989, Score 0.2362
  Model: chain({standardize({'center=0'}), normalize({'center=0'}), shift_n_scale({'center=1'}), neural({'units=8', 'shrinkage=1.2853', 'balance=0', 'maxiter=362'}), bias})
Overall
  Entry: PSMS_100_4all_NCV (ID 987), Test BER 0.1178, Test AUC 0.925, Score 0.2464
  Model: same as Corrida_final, except for SYLVA's model
* Models selected by trial and error
Results up to August 1st
The best average BER is still held by Reference (Gavin Cawley) with "the bad". Note that the best entry for each dataset is not necessarily the best entry overall. The blue-shaded entries did not count towards the prize (the participant was part of a group or did not wish to be identified).
Agnostic learning best ranked entries as of August 1st, 2007
Entrant               ADA  GINA  HIVA  NOVA  SYLVA   Score  Overall best entry                 Rank
Roman Lutz              1     1    10     2      2  0.1431  LogitBoost with trees                 1
Alexander Borisov       2     2     5     9     10  0.2373  out1-fs-nored-val (Intel final 1)     4
Juha Reunanen           9     5     4     4      8  0.2424  cross-indexing-8                      5
Vladimir Martyanov      3     3     7    11      3  0.2432  GBT + PF                              6
Vladimir Nikulin        4     7     6     3      6  0.2436  vn5                                   7
H. Jair Escalante       8    10     2     7      4  0.2464  PSMS_100_4all_NCV                     8
Mehreen Saeed          11     6    11     1      5  0.2786  Submit D final                       11
Erinija Pranckeviene    6    11     3    15      1  0.3105  TMK                                  17
Joerg Wichard          10     8     9    12      9  0.4332  liknon feature selection + soa(1)    27
Namgil Lee             12    12     8    10      7  0.5059  mlp+commitee+mcic                    34
Vojtech Franc          14     9     1    13     12  0.5355  RBF SVM                              36
Marc Boulle             5    14    12     5     11  0.6062  Data Grid (CMA)                      40
Entry777                7    17    13    16     15  0.7270  Linear Regression Tree               43
weseeare               13    16    15     8     17  0.7405  AUCmax                               44
pipibjc                15    18    16    14     18  0.8862  naiveBayes_Ensemble                  47
cww                    17    13    19    16     14  0.9043  cww1_1                               48
Stijn Vanderlooy       19    15    18    17     16  0.9398  micc-ikat                            50
Results up to August 1st
(Figures: test BER and test AUC for the selected models)
Conclusions
• Competitive and simple models are obtained with PSMS
• No knowledge of the problem at hand, nor of machine learning, is required
• PSMS is easy to implement
• It suffers from the same limitations as other stochastic search algorithms (e.g., no guarantee of finding the global optimum)