Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns



DESCRIPTION

Learning and recognition of secure patterns is a well-known problem in nature. Mimicry and camouflage are widespread techniques in the arms race between predators and prey. The information acquired by our senses is therefore not necessarily secure or reliable. In machine learning and pattern recognition systems, we have started investigating these issues only recently, with the goal of learning to discriminate between secure and hostile patterns. This phenomenon has been observed especially in adversarial settings like biometric recognition, malware detection and spam filtering, in which data can be maliciously manipulated by humans to undermine the outcomes of an automatic analysis. As current pattern recognition methods are not natively designed to deal with the intrinsic, adversarial nature of these problems, they exhibit specific vulnerabilities that an adversary may exploit either to mislead learning or to avoid detection. Identifying these vulnerabilities and analyzing the impact of the corresponding attacks on pattern classifiers is one of the main open issues in the novel research field of adversarial machine learning. In the first part of this talk, I introduce a general framework that encompasses and unifies previous work in the field, allowing one to systematically evaluate classifier security against different, potential attacks. As an example application of this framework, in the second part of the talk, I discuss evasion attacks, where malicious samples are manipulated at test time to avoid detection. I then show how carefully-designed poisoning attacks can mislead learning of support vector machines by manipulating a small fraction of their training data, and how to poison adaptive biometric verification systems to compromise the biometric templates (face images) of the enrolled clients. Finally, I briefly discuss our ongoing work on attacks against clustering algorithms, and sketch some possible future research directions.

TRANSCRIPT

Page 1: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

Pattern Recognition and Applications Lab
University of Cagliari, Italy
Department of Electrical and Electronic Engineering

On Learning and Recognition of Secure Patterns

Battista Biggio
Dept. of Electrical and Electronic Engineering, University of Cagliari, Italy

Scottsdale, Arizona, US, Nov. 7, 2014 - AISec 2014

Page 2: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Secure Patterns in Nature

•  Learning of secure patterns is a well-known problem in nature
   –  Mimicry and camouflage
   –  Arms race between predators and prey

Page 3: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Secure Patterns in Computer Security

•  A similar phenomenon occurs in machine learning and computer security
   –  Obfuscation and polymorphism to hide malicious content

Spam email example:
  "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year ..."

Malware example (obfuscated JavaScript):
  <script type="text/javascript" src="http://palwas.servehttp.com//ml.php"></script>
  ... var PGuDO0uq19+PGuDO0uq20; EbphZcei=PVqIW5sV.replace(/jTUZZ/g,"%"); var eWfleJqh=unescape; var NxfaGVHq="pqXdQ23KZril30"; q9124=this; var SkuyuppD=q9124["WYd1GoGYc2uG1mYGe2YnltY".replace(/[Y12WlG\:]/g,"")]; SkuyuppD.write(eWfleJqh(EbphZcei)); ...

Page 4: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Arms Race

•  Adaptation/evolution is crucial to survive!

[Diagram: attackers devise evasion techniques (e.g., image-based spam to defeat text analysis of spam emails); system designers respond with effective countermeasures (e.g., visual analysis of attached images), and the race continues.]

Page 5: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Machine Learning in Computer Security

Page 6: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Design of Learning-based Systems: Training Phase

[Pipeline: training data (with labels) → pre-processing and feature extraction (x1, x2, ..., xd) → classifier learning]

Linear classifiers: assign a weight to each feature and classify a sample based on the sign of its score:

f(x) = \mathrm{sign}(w^T x) = \begin{cases} +1, & \text{malicious} \\ -1, & \text{legitimate} \end{cases}

start"bang portfolio winner"year ... university campus"

Start 2007 with a bang! Make WBFS YOUR PORTFOLIO’s first winner of the year ..."

start"bang portfolio winner"year ... university campus"

1 1"1"1"1"..."0"0"

x  SPAM   start"bang portfolio winner"year ... university campus"

+2 +1"+1"+1"+1"..."-3"-4"

w  

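The pipeline above can be condensed into a few lines of code. Below is a minimal, illustrative sketch (not the talk's actual system): a toy vocabulary, binary bag-of-words features, and a perceptron-style update standing in for the classifier-learning step.

```python
# Minimal sketch of the training phase: bag-of-words features plus a linear
# classifier of the form f(x) = sign(w^T x). Vocabulary and emails are toys.
import numpy as np

VOCAB = ["start", "bang", "portfolio", "winner", "year", "university", "campus"]

def to_features(text):
    """Binary bag-of-words vector: 1 if the word occurs, 0 otherwise."""
    words = text.lower().split()
    return np.array([1.0 if w in words else 0.0 for w in VOCAB])

# Toy labeled training set: +1 = malicious (spam), -1 = legitimate (ham)
emails = [
    ("start 2007 with a bang portfolio winner of the year", +1),
    ("meeting at the university campus this year", -1),
]
X = np.stack([to_features(t) for t, _ in emails])
y = np.array([label for _, label in emails])

# Learn one weight per feature (perceptron-style updates as an illustrative
# stand-in for the classifier-learning step).
w = np.zeros(len(VOCAB))
for _ in range(10):
    for xi, yi in zip(X, y):
        if np.sign(w @ xi) != yi:
            w += yi * xi  # move the weight vector toward the correct label

def classify(text):
    return "SPAM" if w @ to_features(text) > 0 else "HAM"

print(classify("start with a bang portfolio winner"))  # expected: SPAM
```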

Page 7: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Design of Learning-based Systems: Test Phase

[Pipeline: test data → pre-processing and feature extraction (x1, x2, ..., xd) → classification and performance evaluation, e.g., classification accuracy]

Linear classifiers, as before: assign a weight to each feature and classify a sample based on the sign of its score, f(x) = \mathrm{sign}(w^T x).

Example (spam email "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year ..."):

  feature:  start  bang  portfolio  winner  year  ...  university  campus
  x:          1      1       1         1      1   ...      0          0
  w:         +2     +1      +1        +1     +1   ...     -3         -4

  score: w^T x = +6 > 0 → SPAM (correctly classified)

Page 8: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Can Machine Learning Be Secure?

•  Problem: how to evade a linear (trained) classifier?

Original email x: "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year ..."

  feature:  start  bang  portfolio  winner  year  ...  university  campus
  x:          1      1       1         1      1   ...      0          0
  w:         +2     +1      +1        +1     +1   ...     -3         -4

  score: w^T x = +6 > 0 → SPAM (correctly classified)

Manipulated email x': "St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO's first winner of the year ... campus!"

  x':         0      0       1         1      1   ...      0          1

  score: w^T x' = +3 - 4 = -1 < 0 → HAM (misclassified email)
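The slide's arithmetic is easy to reproduce. A small sketch with the toy vocabulary and weights above (all values illustrative):

```python
# Sketch of the evasion shown above: obfuscate high-weight spammy words (so
# they drop out of the feature vector) and append a negative-weight "good"
# word. Weights mirror the slide's toy example.
import numpy as np

# features: start, bang, portfolio, winner, year, ..., university, campus
w = np.array([+2.0, +1.0, +1.0, +1.0, +1.0, -3.0, -4.0])

x = np.array([1, 1, 1, 1, 1, 0, 0], dtype=float)   # original spam
print("original score:", w @ x)                    # +6 > 0 -> SPAM

x_adv = x.copy()
x_adv[[0, 1]] = 0   # 'start' -> 'St4rt', 'bang' -> 'b4ng': features vanish
x_adv[6] = 1        # appended good word 'campus'
print("evasion score:", w @ x_adv)                 # +3 - 4 = -1 < 0 -> HAM
```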

Page 9: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Can Machine Learning Be Secure?

•  Underlying assumption of machine learning techniques
   –  Training and test data are sampled from the same distribution

•  In practice
   –  The classifier generalizes well from known examples (+ random noise)
   –  ... but it cannot cope with carefully-crafted attacks!

•  It should be taught how to do that
   –  Explicitly taking adversarial data manipulation into account
   –  Adversarial machine learning

•  Problem: how can we assess classifier security in a more systematic manner?

(1)  M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? ASIACCS 2006

(2)  B. Biggio, G. Fumera, F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Engineering, 2014

Page 10: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Security Evaluation of Pattern Classifiers

(1)  B. Biggio, G. Fumera, F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Engineering, 2014

(2)  B. Biggio et al., Security evaluation of SVMs. SVM applications. Springer, 2014


Page 11: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Security Evaluation of Pattern Classifiers [B. Biggio, G. Fumera, F. Roli, IEEE Trans. KDE 2014]

Adversary model (bounded adversary!):
•  Goal of the attack
•  Knowledge of the attacked system
•  Capability of manipulating data
•  Attack strategy as an optimization problem

Security evaluation curve:
[Plot: accuracy vs. bound on the adversary's knowledge/capability (e.g., number of modified words), for two classifiers C1 and C2. The performance of more secure classifiers should degrade more gracefully under attack. Example: performance degradation of text classifiers in spam filtering for an increasing number of modified words in spam emails.]
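A curve of this kind can be computed by sweeping the attacker's bound. A toy sketch (random data and weights, a greedy word-flipping attacker; all values illustrative, not the paper's experiments):

```python
# Toy security evaluation curve: accuracy of a linear classifier vs. the
# maximum number of word features the attacker may modify per spam sample.
import numpy as np

rng = np.random.default_rng(0)
d = 50
w = rng.normal(size=d)                        # "trained" linear classifier
X = (rng.random((200, d)) < 0.3).astype(float)
y = np.sign(X @ w)                            # ground truth = clean decisions

def attack(x, k):
    """Flip the k feature bits that most decrease the score w^T x."""
    x = x.copy()
    gain = np.where(x == 1, -w, w)            # score change if bit i is flipped
    for i in np.argsort(gain)[:k]:
        if gain[i] < 0:
            x[i] = 1 - x[i]
    return x

for k in [0, 1, 2, 5, 10, 20]:                # bound on modified words
    Xa = np.array([attack(x, k) if yi > 0 else x for x, yi in zip(X, y)])
    print(f"modified words <= {k:2d}: accuracy = {np.mean(np.sign(Xa @ w) == y):.2f}")
```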

Page 12: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Adversary’s Goal

1.  Security violation [Barreno et al. ASIACCS06]
    –  Integrity: evade detection without compromising system operation
    –  Availability: cause classification errors to compromise system operation
    –  Privacy: gain confidential information about system users

2.  Attack's specificity [Barreno et al. ASIACCS06]
    –  Targeted/Indiscriminate: misclassification of a specific set of samples / of any sample
    –  e.g., spearphishing vs. phishing

Page 13: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Adversary’s Knowledge

•  Perfect knowledge
   –  gives an upper bound on the performance degradation under attack

•  Components the adversary may know:
   –  training data
   –  feature representation (x1, x2, ..., xd)
   –  learning algorithm (e.g., SVM)
   –  parameters (e.g., feature weights)
   –  feedback on decisions

Page 14: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Adversary's Capability

•  Attack's influence [Barreno et al. ASIACCS06]
   –  Manipulation of training/test data

•  Constraints on data manipulation (see the sketch below)
   –  maximum number of samples that can be added to the training data
      •  the attacker usually controls only a small fraction of the training samples
   –  maximum amount of modifications, d(x, x') \le d_{max}
      •  application-specific constraints in feature space
      •  e.g., max. number of words that are modified in spam emails

[Plot: feasible domain around x in feature space (x1, x2); the manipulated sample x' must satisfy d(x, x') \le d_{max} with respect to the decision function f(x).]
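Such constraints are straightforward to encode. A sketch assuming binary word features and a Hamming-distance bound (the function name is illustrative):

```python
# Sketch of an application-specific capability constraint: for spam,
# d(x, x') can count modified words, so feasibility is a Hamming bound.
import numpy as np

def is_feasible(x, x_adv, dmax):
    """Attacker may modify at most dmax binary word features."""
    return int(np.sum(x != x_adv)) <= dmax
```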

Page 15: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Main Attack Scenarios

•  Evasion attacks
   –  Goal: integrity violation, indiscriminate attack
   –  Knowledge: perfect / limited
   –  Capability: manipulating test samples
   –  e.g., manipulation of spam emails at test time to evade detection

•  Poisoning attacks
   –  Goal: availability violation, indiscriminate attack
   –  Knowledge: perfect / limited
   –  Capability: injecting samples into the training data
   –  e.g., send spam containing some 'good words' to poison the anti-spam filter, which may subsequently misclassify legitimate emails containing those 'good words'

Page 16: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Targeted classifier: SVM

•  Maximum-margin linear classifier: f(x) = \mathrm{sign}(g(x)), with g(x) = w^T x + b

•  Learning problem (1/margin term plus the classification error on training data, i.e., the hinge loss):

\min_{w,b} \; \frac{1}{2} w^T w + C \sum_i \max\left(0,\, 1 - y_i\, g(x_i)\right)
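A minimal sketch of training such a classifier, assuming scikit-learn (whose SVC solves this same objective) and toy two-Gaussian data:

```python
# Sketch of training the targeted classifier: scikit-learn's SVC minimizes
# the margin term plus the C-weighted hinge loss. Data is a toy example.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)),   # legitimate class
               rng.normal(+1.0, 1.0, (50, 2))])  # malicious class
y = np.array([-1] * 50 + [+1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
g = clf.decision_function(X)                     # g(x) = w^T x + b
print("training accuracy:", np.mean(np.sign(g) == y))
```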

Page 17: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Kernels and Nonlinearity

•  The dual representation enables learning and classification using only dot products between samples (the sums run over the support vectors):

   w = \sum_i \alpha_i y_i x_i \;\;\rightarrow\;\; g(x) = \sum_i \alpha_i y_i \langle x, x_i \rangle + b

•  Kernel functions enable nonlinear classification, e.g., the RBF kernel:

   k(x, x_i) = \exp\left(-\gamma \|x - x_i\|^2\right)

[Plot: nonlinear decision boundary of an SVM with RBF kernel.]
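A sketch of evaluating this kernel expansion from a fitted scikit-learn model; `dual_coef_` stores the products alpha_i * y_i for the support vectors (toy data, illustrative):

```python
# Sketch of g(x) = sum_i alpha_i y_i k(x, x_i) + b with an RBF kernel,
# reconstructed from a fitted SVC's support vectors.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)), rng.normal(+1.0, 1.0, (50, 2))])
y = np.array([-1] * 50 + [+1] * 50)
gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

def g(x):
    """Decision function reconstructed from the support vectors."""
    k = np.exp(-gamma * np.sum((x - clf.support_vectors_) ** 2, axis=1))
    return clf.dual_coef_[0] @ k + clf.intercept_[0]

# matches the library's own decision function
assert np.isclose(g(X[0]), clf.decision_function(X[:1])[0])
```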

Page 18: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Evasion Attacks

1.  B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. ECML PKDD, 2013.

2.  B. Biggio et al., Security evaluation of SVMs. SVM applications. Springer, 2014

Page 19: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


A Simple Example

•  Problem: how to evade a linear (trained) classifier?
   –  We have seen this already: misspell spammy words and append good words, so that x' = [0, 0, 1, 1, 1, ..., 0, 1] gives w^T x' = +3 - 4 = -1 < 0, i.e., HAM (misclassified email)

•  But... what if the classifier is nonlinear?
   –  Decision functions can be arbitrarily complicated, with no clear relationship between features (x) and classifier parameters (w)

Page 20: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Gradient-descent Evasion Attacks

•  Goal: maximum-confidence evasion
•  Knowledge: perfect
•  Attack strategy:

   \min_{x'} g(x') \quad \text{s.t.} \quad d(x, x') \le d_{max}

•  Non-linear, constrained optimization
   –  Gradient descent: approximate solution for smooth functions

•  Gradients of g(x) can be analytically computed in many cases
   –  SVMs, neural networks

[Figure: contour plot of g(x), with the attack point moving from x to x' across the boundary of f(x) = \mathrm{sign}(g(x)) = +1 (malicious) / -1 (legitimate).]
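A sketch of this attack strategy as projected gradient descent, assuming an L2 distance for d(x, x') (the distance is application-specific); the gradient function is supplied by the analytic formulas on the next slide:

```python
# Sketch of gradient-descent evasion: descend on g(x') while projecting
# back onto the feasible ball d(x, x') <= dmax after each step.
import numpy as np

def evade(x, grad_g, dmax, eta=0.05, steps=500):
    """Maximum-confidence evasion: minimize g within the feasible ball."""
    x_adv = x.astype(float).copy()
    for _ in range(steps):
        x_adv -= eta * grad_g(x_adv)          # move toward a lower (benign) score
        delta = x_adv - x
        norm = np.linalg.norm(delta)
        if norm > dmax:                       # project onto d(x, x') <= dmax
            x_adv = x + delta * (dmax / norm)
    return x_adv
```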

Page 21: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Computing Descent Directions

Support vector machines:

   g(x) = \sum_i \alpha_i y_i\, k(x, x_i) + b, \qquad \nabla g(x) = \sum_i \alpha_i y_i \nabla k(x, x_i)

RBF kernel gradient:

   \nabla k(x, x_i) = -2\gamma \exp\left(-\gamma \|x - x_i\|^2\right)(x - x_i)

Neural networks (inputs x_1, ..., x_d; hidden units \delta_k(x) with input weights v_{kf} and output weights w_k):

   g(x) = \left(1 + \exp\left(-\sum_{k=1}^{m} w_k \delta_k(x)\right)\right)^{-1}

   \frac{\partial g(x)}{\partial x_f} = g(x)\left(1 - g(x)\right) \sum_{k=1}^{m} w_k \delta_k(x)\left(1 - \delta_k(x)\right) v_{kf}
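A sketch of the SVM-RBF case, reusing a fitted scikit-learn SVC as above (the helper name is illustrative):

```python
# Sketch of the analytic SVM-RBF gradient on this slide:
# grad g(x) = sum_i alpha_i y_i * (-2 gamma) exp(-gamma ||x - x_i||^2) (x - x_i).
# dual_coef_ stores the products alpha_i * y_i for the support vectors.
import numpy as np

def grad_g_rbf(x, clf, gamma):
    diffs = x - clf.support_vectors_                  # shape (n_SV, d)
    k = np.exp(-gamma * np.sum(diffs ** 2, axis=1))   # k(x, x_i) per support vector
    return (clf.dual_coef_[0] * k) @ (-2.0 * gamma * diffs)
```

Plugging this into the `evade` sketch above, e.g. `grad_g=lambda x: grad_g_rbf(x, clf, gamma)`, reproduces the gradient-descent evasion attack.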

Page 22: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


An Example on Handwritten Digits

•  Nonlinear SVM (RBF kernel) to discriminate between ‘3’ and ‘7’

•  Features: gray-level pixel values
   –  28 x 28 image = 784 features

[Figure: the digit before the attack (3 vs 7), after the attack at g(x) = 0, and after the attack at the last iteration; plot of g(x) decreasing with the number of iterations, i.e., the number of modified gray-level values.]

Few modifications are enough to evade detection!
... without even mimicking the targeted class ('7')

Page 23: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Bounding the Adversary's Knowledge: Limited-Knowledge Attacks

•  Only the feature representation and the learning algorithm are known
•  Surrogate data is sampled from the same distribution P(X, Y) as the classifier's training data
•  The classifier's feedback is used to label the surrogate data

[Diagram: the attacker draws surrogate training data from P(X, Y), sends queries to the targeted classifier f(x), gets labels back, and learns a surrogate classifier f'(x) to attack in place of f(x).]

Page 24: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Experiments on PDF Malware Detection

•  PDF: a hierarchy of interconnected objects (keyword/value pairs), e.g.:

   13 0 obj << /Kids [ 1 0 R 11 0 R ] /Type /Page ... >> endobj
   17 0 obj << /Type /Encoding /Differences [ 0 /C0032 ] >> endobj

•  Features: keyword counts (e.g., /Type: 2, /Page: 1, /Encoding: 1, ...)

•  Adversary's capability
   –  adding up to d_max objects to the PDF
   –  removing objects may compromise the PDF file (and the embedded malware code)!

•  Attack strategy (see the sketch below):

   \min_{x'} g(x') \quad \text{s.t.} \quad d(x, x') \le d_{max}, \;\; x \le x'
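A sketch of enforcing these constraints during the descent, assuming integer keyword-count features; the greedy trimming rule is an illustrative choice, not the paper's:

```python
# Sketch of the PDF constraints: keyword counts may only grow (x' >= x,
# since objects can only be added) and at most dmax objects may be added.
import numpy as np

def project_pdf(x, x_adv, dmax):
    """Map a candidate x' back into {x' >= x, sum(x' - x) <= dmax}."""
    x_adv = np.maximum(x, np.floor(x_adv))     # integer counts, never below x
    excess = int(np.sum(x_adv - x) - dmax)
    for _ in range(max(0, excess)):            # drop surplus additions, largest first
        i = int(np.argmax(x_adv - x))
        x_adv[i] -= 1
    return x_adv
```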

Page 25: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Experiments on PDF Malware Detection

•  Dataset: 500 malware samples (Contagio), 500 benign (Internet)
   –  Targeted (surrogate) classifier trained on 500 (100) samples

•  Evasion rate (FN) at FP = 1% vs. max. number of added keywords
   –  Averaged over 5 repetitions
   –  Perfect knowledge (PK); limited knowledge (LK)

[Plots: FN vs. d_max (number of keywords added to each PDF, 0-50), for a linear SVM and a nonlinear SVM (RBF kernel), each with PK (C=1) and LK (C=1) curves.]

Page 26: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Poisoning Attacks against SVMs

1.  B. Biggio, B. Nelson, P. Laskov. Poisoning attacks against SVMs. ICML, 2012
2.  B. Biggio et al., Security evaluation of SVMs. SVM applications. Springer, 2014

Page 27: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Poisoning Attacks against SVMs [B. Biggio, B. Nelson, P. Laskov, ICML 2012]

[Diagram: a classifier is trained on a set Tr of HTTP requests collected at a web server. Both samples below are correctly classified ✔:]

  malicious: http://www.vulnerablehotel.com/components/com_hbssearch/longDesc.php?h_id=1&id=-2%20union%20select%20concat%28username,0x3a,password%29%20from%20jos_users--

  legitimate: http://www.vulnerablehotel.com/login/

Page 28: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

•  Poisoning the classifier to cause a denial of service

[Diagram: the attacker injects a poisoning attack into the HTTP requests used to train the classifier, e.g.:
  http://www.vulnerablehotel.com/login/components/
  http://www.vulnerablehotel.com/login/longDesc.php?h_id=1&
After poisoning, the malicious request is classified as legitimate and the legitimate one as malicious ✖.]

Page 29: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Adversary Model and Attack Strategy

•  Adversary model
   –  Goal: maximize the classification error (availability, indiscriminate)
   –  Knowledge: perfect (the trained SVM and the training set TR are known)
   –  Capability: injecting samples into TR

•  Attack strategy
   –  find the attack point xc in TR that maximizes the classification error

[Figure: injecting xc raises the classification error from 0.022 to 0.039.]

Page 30: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Adversary Model and Attack Strategy (cont.)

•  Same adversary model and attack strategy as above

[Figure: the classification error as a function of the attack point xc, plotted as a surface over feature space; the initial error is 0.022.]

Page 31: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Poisoning Attack Algorithm

•  Maximize the classification error L(xc) w.r.t. xc through gradient ascent (see the sketch below)

•  The gradient is not easy to compute
   –  The training point affects the classification function itself
   –  Details of the derivation are in the paper

[Figure: trajectory of the attack point from xc(0) to xc on the error surface.]

1.  B. Biggio, B. Nelson, P. Laskov. Poisoning attacks against SVMs. ICML, 2012
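A sketch of this loop. The exact gradient, which accounts for how xc shifts the SVM solution, is derived in the paper; here a finite-difference estimate stands in for it, and all names and parameters are illustrative:

```python
# Sketch of poisoning by gradient ascent on the validation error L(xc):
# retrain the SVM with the attack point injected, estimate the gradient of
# the error numerically, and take an ascent step.
import numpy as np
from sklearn.svm import SVC

def val_error(xc, yc, X_tr, y_tr, X_val, y_val):
    """Error of an SVM retrained with the attack point (xc, yc) injected."""
    clf = SVC(kernel="linear", C=1.0).fit(np.vstack([X_tr, xc[None, :]]),
                                          np.append(y_tr, yc))
    return np.mean(clf.predict(X_val) != y_val)

def poison(xc0, yc, X_tr, y_tr, X_val, y_val, eta=0.5, steps=30, eps=1e-2):
    xc = xc0.astype(float).copy()
    for _ in range(steps):
        grad = np.zeros_like(xc)
        for j in range(xc.size):               # finite-difference gradient of L
            e = np.zeros_like(xc); e[j] = eps
            grad[j] = (val_error(xc + e, yc, X_tr, y_tr, X_val, y_val) -
                       val_error(xc - e, yc, X_tr, y_tr, X_val, y_val)) / (2 * eps)
        xc += eta * grad                       # ascend: maximize the error
    return xc
```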

Page 32: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Experiments on the MNIST Digits: Single-Point Attack

•  Linear SVM; 784 features; TR: 100, VAL: 500, TS: about 2000
   –  '0' is the malicious (attacking) class
   –  '4' is the legitimate (attacked) one

[Figure: the attack point evolves from the initial digit xc(0) to the final xc as the gradient ascent proceeds.]

Page 33: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Experiments on the MNIST Digits: Multiple-Point Attack

•  Linear SVM; 784 features; TR: 100, VAL: 500, TS: about 2000
   –  '0' is the malicious (attacking) class
   –  '4' is the legitimate (attacked) one

Page 34: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Attacking Clustering

1.  B. Biggio, I. Pillai, S. R. Bulò, D. Ariu, M. Pelillo, and F. Roli. Is data clustering in adversarial settings secure? AISec, 2013

2.  B. Biggio, S. R. Bulò, I. Pillai, M. Mura, E. Z. Mequanint, M. Pelillo, and F. Roli. Poisoning complete-linkage hierarchical clustering. S+SSPR, 2014

Page 35: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Attacking Clustering

•  So far, we have considered supervised learning
   –  Training data consisting of samples and class labels

•  In many applications, labels are not available or costly to obtain
   –  Unsupervised learning: training data only include samples, no labels!

•  Malware clustering
   –  To identify variants of existing malware or new malware families

[Pipeline: data collection (honeypots) → feature extraction (e.g., URL length, num. of parameters, etc.) → clustering of malware families (e.g., similar HTTP requests) → data analysis / countermeasure design (e.g., signature generation: for each cluster, if ... then ... else ...)]

Page 36: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Is Data Clustering Secure?

•  Attackers can poison the input data to subvert malware clustering

[Pipeline, under attack: well-crafted HTTP requests (e.g., repeated queries to http://www.vulnerablehotel.com/...) are injected at the data collection stage; the clustering of malware families is significantly compromised, and the resulting countermeasure design becomes useless (too many false alarms, low detection rate).]

Page 37: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Our Work

•  A framework to identify/design attacks against clustering algorithms
   –  Poisoning: add samples to maximally compromise the clustering output
   –  Obfuscation: hide samples within existing clusters

•  Some clustering algorithms can be very sensitive to poisoning!
   –  single- and complete-linkage hierarchical clustering can easily be compromised by samples that create heterogeneous clusters (see the sketch below)
   –  details on the attack derivation and implementation are in the papers

[Figure: clustering on untainted data (80 samples) vs. clustering after adding 10 attack samples.]

1.  B. Biggio et al. Is data clustering in adversarial settings secure? AISec, 2013
2.  B. Biggio et al. Poisoning complete-linkage hierarchical clustering. S+SSPR, 2014
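A sketch of the bridging effect behind this fragility, for the single-linkage case with SciPy (toy 2-D data; the papers derive the attacks more generally):

```python
# Sketch of poisoning single-linkage clustering: a handful of "bridge"
# samples placed between two well-separated clusters makes single linkage
# chain through them and merge the clusters into one heterogeneous cluster.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
A = rng.normal([0.0, 0.0], 0.2, (40, 2))      # cluster A
B = rng.normal([5.0, 0.0], 0.2, (40, 2))      # cluster B, far from A
X = np.vstack([A, B])

def n_clusters(data, cut=1.0):
    """Clusters obtained by cutting the single-linkage dendrogram at `cut`."""
    return fcluster(linkage(data, method="single"), t=cut, criterion="distance").max()

print("before attack:", n_clusters(X))        # 2 well-separated clusters

bridge = np.column_stack([np.linspace(0.5, 4.5, 10), np.zeros(10)])
print("after adding 10 bridge samples:", n_clusters(np.vstack([X, bridge])))  # 1
```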

Page 38: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


Conclusions and Future Work

•  Learning-based systems can be vulnerable to well-crafted, sophisticated attacks devised by skilled attackers
   –  ... that exploit specific vulnerabilities of machine learning algorithms!

•  Future (and ongoing) work
   –  Privacy attacks
   –  Secure learning, clustering, and feature selection/reduction

[Diagram: the arms race between attacks against learning and secure learning algorithms.]

Page 39: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 

Joint work with ... and many others

Any questions? Thanks for your attention!

Page 40: Battista Biggio, Invited Keynote @ AISec 2014 - On Learning and Recognition of Secure Patterns

 


References

1.  B. Biggio, G. Fumera, and F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Eng., 26(4):984–996, April 2014.
2.  M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? In Proc. ACM Symp. Information, Computer and Comm. Sec., ASIACCS '06, pages 16–25, New York, NY, USA, 2006. ACM.
3.  M. Barreno, B. Nelson, A. Joseph, and J. Tygar. The security of machine learning. Machine Learning, 81:121–148, 2010.
4.  B. Biggio, I. Corona, B. Nelson, B. Rubinstein, D. Maiorca, G. Fumera, G. Giacinto, and F. Roli. Security evaluation of support vector machines in adversarial environments. In Y. Ma and G. Guo, eds, Support Vector Machines Applications, pages 105–153. Springer International Publishing, 2014.
5.  B. Biggio, G. Fumera, and F. Roli. Pattern recognition systems under attack: Design issues and research challenges. Int'l J. Patt. Recogn. Artif. Intell., 2014, in press.
6.  B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. In H. Blockeel et al., editors, European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Part III, volume 8190 of LNCS, pages 387–402. Springer, 2013.
7.  B. Biggio, B. Nelson, and P. Laskov. Poisoning attacks against support vector machines. In J. Langford and J. Pineau, eds, 29th Int'l Conf. on Machine Learning, pages 1807–1814. Omnipress, 2012.
8.  B. Biggio, I. Pillai, S. R. Bulò, D. Ariu, M. Pelillo, and F. Roli. Is data clustering in adversarial settings secure? In Proc. 2013 ACM Workshop on Artificial Intell. and Security, AISec '13, pages 87–98, New York, NY, USA, 2013. ACM.
9.  B. Biggio, S. R. Bulò, I. Pillai, M. Mura, E. Z. Mequanint, M. Pelillo, and F. Roli. Poisoning complete-linkage hierarchical clustering. In P. Franti, G. Brown, M. Loog, F. Escolano, and M. Pelillo, editors, Joint IAPR Int'l Workshop on Structural, Syntactic, and Statistical Patt. Recogn., volume 8621 of LNCS, pages 42–52, Joensuu, Finland, 2014. Springer Berlin Heidelberg.
