GA-Based Feature Selection and Parameter Optimization for Support Vector Machine
Cheng-Lung Huang, Chieh-Jen Wang
Expert Systems with Applications, Volume 31, Issue 2, pp.231-240, 2006.
Outline
- INTRODUCTION
- BACKGROUND KNOWLEDGE
- GA-BASED FEATURE SELECTION AND PARAMETER OPTIMIZATION
- NUMERICAL ILLUSTRATIONS
- CONCLUSION
Introduction
Support vector machines (SVM) were first proposed by Vapnik (1995) for classification.
SVM separates data with different class labels by determining a set of support vectors that outline a hyperplane in the feature space.
The kernel function transforms training data vectors into the feature space.
SVM has been used in a range of problems including pattern recognition (Pontil and Verri, 1998), bioinformatics (Brown et al., 1999), and text categorization (Joachims, 1998).
Problems
When using SVM, we confront two problems:
- How to set the best parameters for SVM?
- How to choose the input attributes (features) for SVM?
Feature Selection
Feature selection identifies a powerfully predictive subset of fields within the database and reduces the number of fields presented to the mining process.
It affects several aspects of pattern classification:
1. The accuracy of the learned classification algorithm
2. The time needed for learning a classification function
3. The number of examples needed for learning
4. The cost associated with the features
SVM Parameters Setting
Proper parameter settings can improve the classification accuracy of SVM.
The parameters to optimize include the penalty parameter C and the parameters of the chosen kernel function (e.g., gamma for the RBF kernel).
The grid algorithm is one way to find the best C and gamma; however, it is time consuming and does not always perform well.
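As an illustrative sketch of the grid-search baseline the paper compares against (not the authors' code), the snippet below uses scikit-learn's `GridSearchCV` to tune an RBF-SVM's C and gamma over an assumed coarse exponential grid:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Exponentially spaced grid; the exact ranges here are assumptions,
# not values taken from the paper.
param_grid = {
    "C": [2**k for k in range(-5, 16, 2)],
    "gamma": [2**k for k in range(-15, 4, 2)],
}

# Try every (C, gamma) pair with 5-fold cross-validation.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

The cost the slide alludes to is visible here: the number of model fits grows with the product of the grid sizes times the number of folds, and the grid is blind to feature selection.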
Research Purposes
The objective of this research is to optimize the SVM parameters and the feature subset simultaneously, without degrading the classification accuracy of SVM.
Genetic algorithms (GA) can be used to generate both the feature subset and the SVM parameters at the same time.
Chromosomes Design
The chromosome comprises three parts of binary genes:
- g^C_1 ~ g^C_{n_c} : represents the value of parameter C
- g^γ_1 ~ g^γ_{n_γ} : represents the value of parameter γ
- g^f_1 ~ g^f_{n_f} : represents the selected features (one bit g^f_k per feature k)
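The three-part layout above can be sketched as a single bit string, split into its C, γ, and feature-mask segments. The bit lengths here are assumed for illustration; the slide does not specify them:

```python
import random

N_C, N_G, N_F = 12, 12, 8  # assumed gene lengths for C, gamma, and features

def random_chromosome():
    """One chromosome: N_C bits for C, then N_G bits for gamma,
    then N_F feature-mask bits."""
    return [random.randint(0, 1) for _ in range(N_C + N_G + N_F)]

def split(chrom):
    """Split a chromosome into its C, gamma, and feature-mask genes."""
    return chrom[:N_C], chrom[N_C:N_C + N_G], chrom[N_C + N_G:]

chrom = random_chromosome()
c_genes, g_genes, f_genes = split(chrom)
```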
Genotype to Phenotype
The bit strings for parameters C and γ are genotypes that must be transformed into phenotypes:

Value = P_min + D × (P_max − P_min) / (2^L − 1)

where
Value: the phenotype value
P_min: the minimum value of the parameter (user-defined)
P_max: the maximum value of the parameter (user-defined)
D: the decimal value of the bit string
L: the length of the bit string
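The decoding formula above translates directly into code; this is a minimal sketch of that mapping:

```python
def decode(bits, p_min, p_max):
    """Map a bit string (genotype) to a real value (phenotype) via
    value = p_min + D * (p_max - p_min) / (2**L - 1),
    where D is the decimal value of the bit string and L its length."""
    L = len(bits)
    D = int("".join(map(str, bits)), 2)
    return p_min + D * (p_max - p_min) / (2**L - 1)

# With 4 bits spanning [0, 15], each bit string decodes exactly to D:
print(decode([1, 0, 1, 1], 0.0, 15.0))  # 11.0
```

The all-zeros string always decodes to P_min and the all-ones string to P_max, so the search range is covered exactly.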
Fitness Function Design
fitness = W_A × SVM_accuracy + W_F × (Σ_{i=1..n_f} C_i × F_i)^(−1)

where
W_A: weight of the SVM classification accuracy
SVM_accuracy: SVM classification accuracy
W_F: weight of the features
C_i: cost of feature i
F_i: "1" if feature i is selected; "0" if feature i is not selected
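A direct sketch of this fitness function follows; the weight values are assumptions for illustration, not values stated on the slide:

```python
def fitness(svm_accuracy, feature_mask, feature_costs, w_a=0.8, w_f=0.2):
    """Fitness = w_a * accuracy + w_f * 1 / (total cost of selected
    features). Fewer or cheaper selected features raise the second term.
    Assumes at least one feature is selected (nonzero total cost)."""
    total_cost = sum(c for c, f in zip(feature_costs, feature_mask) if f == 1)
    return w_a * svm_accuracy + w_f * (1.0 / total_cost)

# Two unit-cost features selected out of three, accuracy 0.9:
print(fitness(0.9, [1, 0, 1], [1.0, 1.0, 1.0]))  # 0.8*0.9 + 0.2*(1/2) = 0.82
```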
System Flows for GA-based SVM
(1) Data preprocessing: scaling
(2) Converting genotype to phenotype
(3) Feature subset selection
(4) Fitness evaluation
(5) Termination criteria check
(6) Genetic operations
Figure of System Flows
Figure: System flow. The dataset is scaled and split into training and testing sets. For each chromosome in the population, the parameter genes and feature genes are converted from genotype to phenotype; the training set restricted to the selected feature subset is used to train the SVM classifier, and fitness is evaluated. If the termination criteria are not satisfied, genetic operations produce the next population; otherwise the optimized (C, γ) and feature subset are output, and classification accuracy is measured on the testing set with the selected features.
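The flow above can be compressed into a short illustrative sketch (not the authors' implementation): decode each chromosome into (C, gamma, feature mask), train an RBF-SVM on the selected features, and use its cross-validated accuracy as the fitness. The gene lengths, search ranges, and tiny population are assumptions chosen to keep the example small:

```python
import random
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

N_C = N_G = 10  # assumed gene lengths for C and gamma

def decode(bits, lo, hi):
    d = int("".join(map(str, bits)), 2)
    return lo + d * (hi - lo) / (2 ** len(bits) - 1)

def evaluate(chrom, X, y):
    """Fitness of one chromosome: CV accuracy of the decoded SVM
    on the selected feature subset."""
    C = decode(chrom[:N_C], 0.01, 1000.0)             # assumed range for C
    gamma = decode(chrom[N_C:N_C + N_G], 1e-4, 10.0)  # assumed range for gamma
    mask = np.array(chrom[N_C + N_G:], dtype=bool)
    if not mask.any():
        return 0.0  # no features selected: worst possible fitness
    return cross_val_score(SVC(C=C, gamma=gamma), X[:, mask], y, cv=3).mean()

X, y = load_breast_cancer(return_X_y=True)
X = MinMaxScaler().fit_transform(X)  # step (1): scaling

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(N_C + N_G + X.shape[1])]
       for _ in range(8)]  # toy population; the paper's GA evolves this
best = max(pop, key=lambda c: evaluate(c, X, y))
```

A full run would loop: evaluate the population, check termination, apply selection/crossover/mutation, and repeat; only the single evaluation pass is shown here.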
Experimental Dataset
| No. | Name | #Classes | #Instances | Nominal features | Numeric features | Total features |
|-----|------|----------|------------|------------------|------------------|----------------|
| 1 | German (credit card) | 2 | 1000 | 0 | 24 | 24 |
| 2 | Australian (credit card) | 2 | 690 | 6 | 8 | 14 |
| 3 | Pima-Indian diabetes | 2 | 760 | 0 | 8 | 8 |
| 4 | Heart disease (Statlog Project) | 2 | 270 | 7 | 6 | 13 |
| 5 | Breast cancer (Wisconsin) | 2 | 699 | 0 | 10 | 10 |
| 6 | Contraceptive Method Choice | 3 | 1473 | 7 | 2 | 9 |
| 7 | Ionosphere | 2 | 351 | 0 | 34 | 34 |
| 8 | Iris | 3 | 150 | 0 | 4 | 4 |
| 9 | Sonar | 2 | 208 | 0 | 60 | 60 |
| 10 | Statlog project: vehicle | 4 | 940 | 0 | 18 | 18 |
| 11 | Vowel | 11 | 990 | 3 | 10 | 13 |
Experiments Description
To guarantee that the results are valid and can be generalized for making predictions on new data, this study used k-fold cross-validation with k = 10: the data are divided into ten parts, each of which takes a turn as the testing data set.
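A minimal sketch of this 10-fold scheme, using scikit-learn's `KFold` to rotate each of the ten parts through the role of testing set:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Ten folds; each fold is held out once while the other nine train.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=cv)
print(len(scores), scores.mean())
```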
Accuracy Calculation
For binary-target datasets, accuracy is reported as the positive hit rate (sensitivity), the negative hit rate (specificity), and the overall hit rate.
For multi-class datasets, accuracy is reported only as the average hit rate.
Accuracy Calculation
Sensitivity is the proportion of cases with the positive class that are classified as positive: P(T+|D+) = TP / (TP + FN).
Specificity is the proportion of cases with the negative class that are classified as negative: P(T−|D−) = TN / (TN + FP).
The overall hit rate is the overall accuracy: (TP + TN) / (TP + TN + FP + FN).
|                        | Target (or Disease) + | Target (or Disease) − |
|------------------------|-----------------------|-----------------------|
| Predicted (or Test) +  | True Positive (TP)    | False Positive (FP)   |
| Predicted (or Test) −  | False Negative (FN)   | True Negative (TN)    |
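The three rates fall straight out of the 2×2 table; a small sketch:

```python
def rates(tp, fp, fn, tn):
    """Sensitivity, specificity, and overall hit rate from the
    confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    overall = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, overall

# A balanced example with 10 errors of each kind:
print(rates(40, 10, 10, 40))  # (0.8, 0.8, 0.8)
```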
Accuracy Calculation
The SVM_accuracy term in the fitness function is measured by Sensitivity × Specificity for datasets with two classes (positive or negative), and by the overall hit rate for datasets with multiple classes.
GA Parameter Setting
- Chromosome representation: binary code
- Population size: 500
- Crossover rate: 0.7, one-point crossover
- Mutation rate: 0.02
- Roulette wheel selection
- Elitism replacement
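Two of the genetic operators listed above can be sketched in a few lines: roulette-wheel selection (each individual is picked with probability proportional to its fitness) and one-point crossover:

```python
import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to fitness."""
    pick = random.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]  # guard against floating-point round-off

def one_point_crossover(a, b):
    """Swap the tails of two parents at a random cut point."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

random.seed(1)
parent1, parent2 = [0] * 8, [1] * 8
child1, child2 = one_point_crossover(parent1, parent2)
```

Mutation (flipping each bit with probability 0.02) and elitism (copying the best chromosome unchanged into the next generation) round out the operator set on this slide.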
WA and WF
The weights W_A and W_F influence the experimental results through the fitness function:
The higher W_A is, the higher the classification accuracy is.
The higher W_F is, the smaller the number of selected features is.
Fold #4 of German Dataset: Curve Diagrams

Figure: Two charts for fold #4 of the German dataset, plotted against the accuracy weight W_A (from 1.0 down to 0.1). The left chart shows accuracy and fitness (on a 0 to 1 scale); the right chart shows the number of selected features (on a 0 to 20 scale).
Experimental Results for German Dataset
Results summary (GA-based approach vs. Grid search )
| Names | #Original features | #Selected features (GA) | GA Avg Positive Hit Rate | GA Avg Negative Hit Rate | GA Avg Overall Hit Rate % | Grid Avg Positive Hit Rate | Grid Avg Negative Hit Rate | Grid Avg Overall Hit Rate % | p-value (Wilcoxon test) |
|-------|--------------------|--------------------------|--------------------------|--------------------------|----------------------------|-----------------------------|-----------------------------|------------------------------|--------------------------|
| German | 24 | 13±1.83 | 0.89608 | 0.76621 | 85.6±1.96 | 0.888271 | 0.462476 | 76±4.06 | 0.005* |
| Australian | 14 | 3±2.45 | 0.8472 | 0.92182 | 88.1±2.25 | 0.885714 | 0.823529 | 84.7±4.74 | 0.028* |
| diabetes | 8 | 3.7±0.95 | 0.78346 | 0.87035 | 81.5±7.13 | 0.592593 | 0.88 | 77.3±3.03 | 0.139 |
| Heart disease | 13 | 5.4±1.85 | 0.94467 | 0.95108 | 94.8±3.32 | 0.75 | 0.909091 | 83.7±6.34 | 0.005* |
| breast cancer | 10 | 1±0 | 0.9878 | 0.8996 | 96.19±1.24 | 0.98 | 0.944444 | 95.3±2.28 | 0.435 |
| Contraceptive | 9 | 5.4±0.53 | N/A | N/A | 71.22±4.15 | N/A | N/A | 53.53±2.43 | 0.005* |
| ionosphere | 34 | 6±0 | 0.9963 | 0.9876 | 98.56±2.03 | 0.94 | 0.9 | 89.44±3.58 | 0.005* |
| iris | 4 | 1±0 | N/A | N/A | 100±0 | N/A | N/A | 97.37±3.46 | 0.046* |
| sonar | 60 | 15±1.1 | 0.9863 | 0.9842 | 98±3.5 | 0.65555 | 0.9 | 87±4.22 | 0.004* |
| vehicle | 18 | 9.2±1.4 | N/A | N/A | 84.06±3.54 | N/A | N/A | 83.33±2.74 | 0.944 |
| Vowel | 13 | 7.8±1 | N/A | N/A | 99.3±0.82 | N/A | N/A | 95.95±2.91 | 0.02* |
ROC curve for fold #4 of German Credit Dataset
Figure: ROC curves (sensitivity vs. 1 − specificity) for fold #4 of the German credit dataset, comparing the GA-based approach and the grid search against a diagonal guide line.
Average AUC for Datasets
| Dataset | GA-based approach | Grid algorithm |
|---------|-------------------|----------------|
| German | 0.8424 | 0.7886 |
| Australian | 0.9019 | 0.8729 |
| diabetes | 0.8298 | 0.7647 |
| Heart disease | 0.9458 | 0.8331 |
| breast cancer | 0.9423 | 0.9078 |
| Contraceptive | 0.7701 | 0.6078 |
| ionosphere | 0.9661 | 0.8709 |
| iris | 0.9756 | 0.9572 |
| sonar | 0.9522 | 0.8898 |
| vehicle | 0.8587 | 0.8311 |
| Vowel | 0.9513 | 0.9205 |
Conclusion
We proposed a GA-based strategy to select the feature subset and to set the parameters for SVM classification.
We conducted two experiments to evaluate the classification accuracy of the proposed GA-based approach with the RBF kernel against the grid search method on 11 real-world datasets from the UCI repository.
Compared with the grid search approach, the proposed GA-based approach generally achieves good accuracy with fewer features.
Thank You
Q & A