Augmented Negative Selection Algorithm with Variable-Coverage Detectors

Zhou Ji, St. Jude Children’s Research Hospital
Dipankar Dasgupta, The University of Memphis
CEC 2004. June 20-23, 2004. Portland, Oregon.


Page 1: Augmented Negative Selection Algorithm with Variable-Coverage Detectors

Zhou Ji, St. Jude Children’s Research Hospital

Dipankar Dasgupta, The University of Memphis

CEC 2004. June 20-23, 2004. Portland, Oregon.

Page 2: Introduction

AIS: Artificial Immune Systems

Major types of AIS:
  Negative selection
  Immune networks
  Clonal selection

The matching rule is one of the most important components in a negative or positive selection algorithm.

Page 3: Introduction (continued)

Matching rules

For binary representation:
  rcb (r-contiguous bits)
  r-chunks
  Hamming distance

For real-valued representation:
  Usually based on Euclidean distance or other distance measures
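The matching rules listed above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the slides: the function names are hypothetical, the Hamming rule is assumed to match when the distance is at most a threshold (conventions differ between papers), and the Euclidean rule is assumed to match when a point lies within a detector's radius.

```python
import math


def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """rcb rule: match if a and b agree in at least r contiguous positions."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    run = 0
    for bit_a, bit_b in zip(a, b):
        # Extend the current run of agreeing bits, or reset it on a mismatch.
        run = run + 1 if bit_a == bit_b else 0
        if run >= r:
            return True
    return False


def hamming_match(a: str, b: str, threshold: int) -> bool:
    """Assumed convention: match if a and b differ in at most `threshold` bits."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    distance = sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))
    return distance <= threshold


def euclidean_match(center, point, radius: float) -> bool:
    """Real-valued rule: match if point lies within `radius` of the detector center."""
    return math.dist(center, point) <= radius
```

For example, `r_contiguous_match("10110", "00111", 3)` is true because the two strings agree in the three middle positions.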

Page 4: Introduction (continued)

By allowing the detectors to have some variable properties, V-detector enhances the negative selection algorithm in several respects:
  It takes fewer large detectors to cover the non-self region, saving time and space.
  Small detectors cover “holes” better.
  Coverage is estimated when the detector set is generated.
  The shapes of detectors, or even the types of matching rules, can be extended to be variable too.
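The coverage estimate mentioned above is tied to a stopping threshold of 1/(1 - c_0) consecutive covered samples in the training algorithm, where c_0 is the expected coverage. A minimal sketch of that arithmetic follows; the function name and the small floating-point tolerance are assumptions made for illustration.

```python
import math


def consecutive_covered_needed(expected_coverage: float) -> int:
    """Number of consecutive covered random samples, 1 / (1 - c0), used as the stop threshold."""
    if not 0.0 <= expected_coverage < 1.0:
        raise ValueError("expected_coverage must be in [0, 1)")
    # The tolerance absorbs floating-point error before rounding up (an assumption of this sketch).
    return math.ceil(1.0 / (1.0 - expected_coverage) - 1e-6)
```

With the two coverage levels used in the result plots, 99% coverage corresponds to 100 consecutive covered samples and 99.99% coverage to 10000.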

Page 5: Comparison of constant-sized detectors and variable-sized detectors

[Figure panels: Constant-sized detectors; Variable-sized detectors]

Page 6: Algorithm (training stage)

Generation of constant-sized detectors

Detector-Set(S, m, r_s)
  S: set of self samples
  m: number of detectors
  r_s: self radius
1: D ← ∅
2: Repeat
3:   x ← random sample from [0,1]^n
4:   Repeat for every s_i in S = {s_i, i = 1, 2, ...}
5:     d ← Euclidean distance between s_i and x
6:     if d ≤ r_s, go to 2
7:   D ← D ∪ {x}
8: Until |D| = m
9: return D

Generation of variable-sized detectors

V-Detector-Set(S, T_max, r_s, c_0)
  S: set of self samples
  T_max: maximum number of detectors
  r_s: self radius
  c_0: expected coverage
1: D ← ∅
2: Repeat
3:   t ← 0
4:   T ← 0
5:   r ← infinite
6:   x ← random sample from [0,1]^n
7:   Repeat for every d_i in D = {d_i, i = 1, 2, ...}
8:     d_d ← Euclidean distance between x(d_i) and x, where x(d_i) is the location of d_i
9:     if d_d ≤ r(d_i), where r(d_i) is the radius of detector d_i, then
10:      t ← t + 1
11:      if t ≥ 1/(1 - c_0) then return D
12:      go to 4
13:  Repeat for every s_i in S
14:    d ← Euclidean distance between s_i and x
15:    if d - r_s ≤ r then r ← d - r_s
16:  if r > 0 then D ← D ∪ {<x, r>}, where <x, r> is a detector with location x and radius r
17:  else T ← T + 1
18:  if T > 1/(1 - maximum self coverage) exit
19: Until |D| = T_max
20: return D
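The two generation procedures on this slide can be sketched in Python as follows. This is a structured reading of the pseudocode, not the authors' implementation: the function and parameter names are hypothetical, the "go to" jumps are replaced by loops, and the handling of the two counters (consecutive covered samples and candidates that fall in the self region) is an interpretation. Self samples are assumed to be already scaled into the unit hypercube.

```python
import math
import random


def generate_constant_detectors(self_samples, self_radius, num_detectors, rng=None):
    """Constant-sized detectors: keep random points farther than self_radius from all self samples."""
    rng = rng or random.Random()
    dim = len(self_samples[0])
    detectors = []
    # Note: this loops forever if the self region covers the whole unit hypercube.
    while len(detectors) < num_detectors:
        x = tuple(rng.random() for _ in range(dim))
        if all(math.dist(x, s) > self_radius for s in self_samples):
            detectors.append(x)
    return detectors


def generate_v_detectors(self_samples, self_radius, expected_coverage=0.99,
                         max_detectors=1000, max_self_coverage=0.99, rng=None):
    """Variable-sized detectors as (center, radius) pairs, with a coverage-based stop."""
    rng = rng or random.Random()
    dim = len(self_samples[0])
    covered_limit = 1.0 / (1.0 - expected_coverage)   # threshold for t in the pseudocode
    self_limit = 1.0 / (1.0 - max_self_coverage)      # threshold for T in the pseudocode
    detectors = []
    covered_run = 0   # t: consecutive samples already covered by existing detectors
    self_hits = 0     # T: rejected candidates since the last accepted detector (interpretation)
    while len(detectors) < max_detectors:
        x = tuple(rng.random() for _ in range(dim))
        if any(math.dist(x, center) <= radius for center, radius in detectors):
            covered_run += 1
            if covered_run >= covered_limit:
                break  # expected coverage is considered reached
            continue
        covered_run = 0
        # Largest radius that keeps the detector outside every self sample's radius.
        radius = min(math.dist(x, s) for s in self_samples) - self_radius
        if radius > 0:
            detectors.append((x, radius))
            self_hits = 0
        else:
            self_hits += 1
            if self_hits > self_limit:
                break  # too many candidates fall inside the self region
    return detectors
```

A call such as `generate_v_detectors([(0.5, 0.5)], self_radius=0.1)` returns a list of detectors whose spheres stay at least `self_radius` away from the single self sample.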

Page 7: Outline of the algorithm (generation of variable-sized detector set)

Page 8: Screenshots of the software

[Figure panels: Message view; Visualization of data points and detectors]

Page 9: Experiments and Results

Synthetic data
  2D. Training data are randomly chosen from the normal region.
Fisher’s Iris data
  One of the three types is considered as “normal”.
Biomedical data
  Abnormal data are the medical measures of disease carrier patients.
Pollution data
  Abnormal data are made by artificially altering the normal air measurements.

Page 10: Synthetic data - Cross-shaped self space

Shape of self region and example detector coverage

(a) Actual self space (b) self radius = 0.05 (c) self radius = 0.1

Page 11: Synthetic data - Cross-shaped self space

Results

[Figure: Detection rate and false alarm rate versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

[Figure: Number of detectors versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

Page 12: Error rates

[Figure: Error rate (percentage) versus self radius (0.01 to 0.19); series: false negative and false positive, both at 99% coverage]
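The slides do not spell out how the rates are computed. A minimal sketch under the usual definitions is given below: detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN), where a test point is flagged as abnormal when it falls inside any detector. These definitions, the function names, and the (center, radius) detector representation are assumptions of the sketch, and the results are fractions rather than percentages.

```python
import math


def is_detected(point, detectors):
    """Detection stage: a point is flagged as non-self if it lies inside any detector."""
    return any(math.dist(point, center) <= radius for center, radius in detectors)


def detection_and_false_alarm_rates(test_points, is_abnormal, detectors):
    """Return (detection rate, false alarm rate) as fractions in [0, 1]."""
    tp = fn = fp = tn = 0
    for point, abnormal in zip(test_points, is_abnormal):
        flagged = is_detected(point, detectors)
        if abnormal and flagged:
            tp += 1   # abnormal point correctly detected
        elif abnormal:
            fn += 1   # abnormal point missed (false negative)
        elif flagged:
            fp += 1   # normal point flagged (false positive)
        else:
            tn += 1   # normal point correctly left alone
    detection_rate = tp / (tp + fn) if tp + fn else float("nan")
    false_alarm_rate = fp / (fp + tn) if fp + tn else float("nan")
    return detection_rate, false_alarm_rate
```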

Page 13: Synthetic data - Ring-shaped self space

Shape of self region and example detector coverage

(a) Actual self space (b) self radius = 0.05 (c) self radius = 0.1

Page 14: Synthetic data - Ring-shaped self space

Results

[Figure: Detection rate and false alarm rate versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

[Figure: Number of detectors versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

Page 15: Iris data

Comparison with other methods: performance

Training data     Algorithm            Detection rate (%)   False alarm rate (%)
Setosa 100%       MILA                 95.16                0
Setosa 100%       NSA (single level)   100                  0
Setosa 100%       V-detector           99.98                0
Setosa 50%        MILA                 94.02                8.42
Setosa 50%        NSA (single level)   100                  11.18
Setosa 50%        V-detector           99.97                1.32
Versicolor 100%   MILA                 84.37                0
Versicolor 100%   NSA (single level)   95.67                0
Versicolor 100%   V-detector           85.95                0
Versicolor 50%    MILA                 84.46                19.6
Versicolor 50%    NSA (single level)   96                   22.2
Versicolor 50%    V-detector           88.3                 8.42
Virginica 100%    MILA                 75.75                0
Virginica 100%    NSA (single level)   92.51                0
Virginica 100%    V-detector           81.87                0
Virginica 50%     MILA                 88.96                24.98
Virginica 50%     NSA (single level)   97.18                33.26
Virginica 50%     V-detector           93.58                13.18

Page 16: Iris data

Comparison with other methods: number of detectors

Training data     Mean     Max   Min   SD
Setosa 100%       20       42    5     7.87
Setosa 50%        16.44    33    5     5.63
Versicolor 100%   153.24   255   72    38.8
Versicolor 50%    110.08   184   60    22.61
Virginica 100%    218.36   443   78    66.11
Virginica 50%     108.12   203   46    30.74

Page 17: Iris data

Virginica as normal, 50% of points used to train

[Figure: Detection rate and false alarm rate versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

[Figure: Number of detectors versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

Page 18: Biomedical data

Blood measurements for a group of 209 patients.
Each patient has four different types of measurement.
75 patients are carriers of a rare genetic disorder. The others are normal.

Page 19: Biomedical data

[Figure: Detection rate and false alarm rate versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

[Figure: Number of detectors versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

Page 20: Air pollution data

60 original records in total.
Each record consists of 16 different measurements concerning air pollution.
All the real data are considered normal.
More data are made artificially:
1. Decide the normal range of each of the 16 measurements.
2. Randomly choose a real record.
3. Change three randomly chosen measurements within a larger-than-normal range.
4. If some of the changed measurements are out of range, the record is considered abnormal; otherwise it is considered normal.
In total, 1000 records, including the original 60, are used as test data. The original 60 are used as training data.
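The four-step procedure for making artificial records can be sketched as follows. The slide does not say how the normal range or the larger-than-normal range is chosen, so this sketch assumes the normal range is the minimum and maximum observed in the real records and widens it by a hypothetical factor `widen` on each side; the function name and parameters are likewise illustrative.

```python
import random


def make_synthetic_record(real_records, widen=0.5, n_changed=3, rng=None):
    """Return (record, is_abnormal) built from a randomly chosen real record."""
    rng = rng or random.Random()
    n_features = len(real_records[0])
    # Step 1: normal range of each measurement (assumed: min/max over the real records).
    lows = [min(r[j] for r in real_records) for j in range(n_features)]
    highs = [max(r[j] for r in real_records) for j in range(n_features)]
    # Step 2: randomly choose a real record and copy it.
    record = list(rng.choice(real_records))
    abnormal = False
    # Step 3: change n_changed randomly chosen measurements within a widened range.
    for j in rng.sample(range(n_features), n_changed):
        span = highs[j] - lows[j]
        record[j] = rng.uniform(lows[j] - widen * span, highs[j] + widen * span)
        # Step 4: the record is abnormal if any changed value leaves the normal range.
        if not lows[j] <= record[j] <= highs[j]:
            abnormal = True
    return record, abnormal
```

Calling this function 940 times and adding the 60 original records would give a 1000-record test set of the kind described above.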

Page 21: Pollution data

[Figure: Detection rate and false alarm rate versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

[Figure: Number of detectors versus self radius (0.01 to 0.19), at 99% and 99.99% coverage]

Page 22: Conclusion

V-detector’s advantages:
1. Fewer detectors to achieve similar or better coverage.
2. Smaller detectors can be used when necessary.
3. Coverage estimate is included automatically.

Future work:
  Variable shape of detectors, variable matching rules
  More analysis