A genetic algorithm-based method for feature subset selection

Feng Tan; Xuezheng Fu; Yanqing Zhang; Anu G. Bourgeois. Springer Soft Comput (2008) 12:111-120. Yi-Chia Lan

TRANSCRIPT

Page 1: A genetic algorithm-based method for feature subset selection


A genetic algorithm-based method for feature subset selection

Feng Tan; Xuezheng Fu; Yanqing Zhang; Anu G. Bourgeois

Springer Soft Comput (2008) 12:111-120

Yi-Chia Lan

Page 2: A genetic algorithm-based method for feature subset selection


Outline

Introduction

Feature selection methods: Entropy-based feature ranking; T-statistics; SVM-RFE (Recursive Feature Elimination)

Framework of feature selection algorithm

Experiments and results

Page 3: A genetic algorithm-based method for feature subset selection


Introduction (cont.)

Page 4: A genetic algorithm-based method for feature subset selection

Training data (sets)

Test data (sets)

Classification accuracy

Introduction (cont.)

Page 5: A genetic algorithm-based method for feature subset selection

Introduction

1. Feature selection

Removes redundant, irrelevant, or noisy features; improves the predictive accuracy

2. The experimental results demonstrate:

Higher classification accuracy; smaller feature subsets

Page 6: A genetic algorithm-based method for feature subset selection

Feature selection and extraction

Page 7: A genetic algorithm-based method for feature subset selection
Page 8: A genetic algorithm-based method for feature subset selection

Feature selection methods (cont.) Entropy-based

α : parameter

D̄ : average distance among the instances

D_ij : Euclidean distance between the two instances i and j
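The formula on this slide did not survive extraction. The following is a minimal sketch of one common similarity-based entropy measure that matches the three definitions above. It assumes the similarity between two instances is exp(-α × distance), that α is set to -ln(0.5) divided by the average distance, and that a feature is ranked as more important when removing it raises the entropy more. The function names and the ranking convention are illustrative assumptions, not taken from the slides.

```python
import math
from itertools import combinations


def dataset_entropy(rows):
    """Similarity-based entropy of a dataset given as a list of numeric rows.

    Assumed form: S = exp(-alpha * D), alpha = -ln(0.5) / mean(D),
    E = -sum over pairs of [S*ln(S) + (1-S)*ln(1-S)].
    """
    dists = [math.dist(a, b) for a, b in combinations(rows, 2)]
    mean_dist = sum(dists) / len(dists)
    if mean_dist == 0:
        return 0.0  # all instances coincide, so there is no disorder to measure
    alpha = -math.log(0.5) / mean_dist
    total = 0.0
    for d in dists:
        s = math.exp(-alpha * d)
        # At s == 0 or s == 1 the term tends to 0, so it is skipped.
        if 0.0 < s < 1.0:
            total -= s * math.log(s) + (1.0 - s) * math.log(1.0 - s)
    return total


def rank_features_by_entropy(rows):
    """Return feature indices, most important first (assumed convention:
    a larger entropy after removing a feature means the feature mattered more)."""
    n_features = len(rows[0])
    entropy_without = []
    for f in range(n_features):
        reduced = [[v for j, v in enumerate(r) if j != f] for r in rows]
        entropy_without.append(dataset_entropy(reduced))
    return sorted(range(n_features), key=lambda f: entropy_without[f], reverse=True)
```

With only two instances the similarity is exactly 0.5, so the entropy equals ln 2, which is a convenient sanity check for the implementation.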

Page 9: A genetic algorithm-based method for feature subset selection

Feature selection methods (cont.) T-statistics
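The formula on this slide did not survive extraction. As a hedged illustration of t-statistic feature ranking, the sketch below assumes a two-class problem and the unequal-variance form |mean_a - mean_b| / sqrt(var_a/n_a + var_b/n_b), computed per feature. The function names are hypothetical, and each class needs at least two samples for the sample variance to be defined.

```python
import math
from statistics import mean, variance


def t_statistic(values_a, values_b):
    """Absolute two-sample t-statistic (unequal-variance form) for one feature."""
    n_a, n_b = len(values_a), len(values_b)
    std_err = math.sqrt(variance(values_a) / n_a + variance(values_b) / n_b)
    diff = abs(mean(values_a) - mean(values_b))
    if std_err == 0:
        # Both classes are constant: perfectly separated or indistinguishable.
        return math.inf if diff > 0 else 0.0
    return diff / std_err


def rank_features_by_t(rows, labels):
    """Return feature indices sorted by decreasing t-statistic."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch expects exactly two classes")
    rows_a = [r for r, y in zip(rows, labels) if y == classes[0]]
    rows_b = [r for r, y in zip(rows, labels) if y == classes[1]]
    n_features = len(rows[0])
    scores = [
        t_statistic([r[f] for r in rows_a], [r[f] for r in rows_b])
        for f in range(n_features)
    ]
    return sorted(range(n_features), key=lambda f: scores[f], reverse=True)
```

A feature whose class means differ strongly relative to its within-class spread gets a large score and lands at the front of the ranking.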

Page 10: A genetic algorithm-based method for feature subset selection

Feature selection methods SVM-RFE

At the optimum of J, the first-order term is neglected

and the second-order term becomes
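The equations for this slide did not survive extraction. The following is a sketch of the usual argument behind the SVM-RFE ranking criterion, assuming a linear SVM with cost function J = ½‖w‖² and assuming that removing feature i corresponds to setting its weight to zero (Δw_i = w_i). These assumptions are not stated on the slide.

```latex
\begin{align*}
\Delta J(i) &\approx \frac{\partial J}{\partial w_i}\,\Delta w_i
  + \frac{1}{2}\,\frac{\partial^2 J}{\partial w_i^2}\,(\Delta w_i)^2
  && \text{(second-order Taylor expansion)} \\
\Delta J(i) &\approx \frac{1}{2}\,\frac{\partial^2 J}{\partial w_i^2}\,(\Delta w_i)^2
  && \text{(first-order term neglected at the optimum)} \\
\Delta J(i) &\approx \frac{1}{2}\,w_i^2
  && \text{(assuming } J = \tfrac{1}{2}\lVert w \rVert^2,\ \Delta w_i = w_i\text{)}
\end{align*}
```

Under these assumptions the features are ranked by w_i², and the feature with the smallest value is eliminated at each recursive step.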

Page 11: A genetic algorithm-based method for feature subset selection

Genetic algorithm

Page 12: A genetic algorithm-based method for feature subset selection

Framework of feature selection algorithm (cont.)

Page 13: A genetic algorithm-based method for feature subset selection

Fitness function :

x : feature vector representing a selected feature subset

c(x) : classification accuracy

w : parameter in [0, 1]

s(x) : weighted size
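The fitness formula itself is missing from the transcript. One form consistent with the symbols defined above is a weighted trade-off between accuracy and subset size, fitness(x) = w × c(x) + (1 - w) × 1/s(x). The sketch below assumes that form; the function and parameter names are hypothetical.

```python
def fitness(accuracy, weighted_size, w):
    """Assumed fitness: w * c(x) + (1 - w) * (1 / s(x)).

    accuracy      -- c(x), classification accuracy of the subset, in [0, 1]
    weighted_size -- s(x), weighted size of the subset, must be positive
    w             -- trade-off parameter in [0, 1]
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if weighted_size <= 0:
        raise ValueError("weighted_size must be positive")
    return w * accuracy + (1.0 - w) / weighted_size
```

Under this form, w close to 1 rewards accuracy almost exclusively, while smaller values of w push the search toward smaller feature subsets.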

Framework of feature selection algorithm

Crossover : Single-point crossover operator

Mutation rate : 0.001
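The two operators named above can be sketched as follows. The sketch assumes each chromosome is a list of bits in which 1 means the corresponding feature is selected, and that 0.001 is a per-bit flip probability. Neither the encoding nor the function names are confirmed by the slides.

```python
import random


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit lists at one random cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2:
        return parent_a[:], parent_b[:]  # nothing to cut
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the string
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def mutate(chromosome, rng, rate=0.001):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    a, b = [1] * 10, [0] * 10
    child_a, child_b = single_point_crossover(a, b, rng)
    print(mutate(child_a, rng), mutate(child_b, rng))
```

With a rate as low as 0.001, most offspring pass through mutation unchanged, so crossover does most of the exploration and mutation mainly prevents the population from losing bit values permanently.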

Page 14: A genetic algorithm-based method for feature subset selection

Experiment result (1)

Page 15: A genetic algorithm-based method for feature subset selection

Experiment result (2)