Adversarial pattern classification using multiple classifiers and randomisation


Page 1: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

Pattern Recognition and Applications Group, University of Cagliari, Italy

Department of Electrical and Electronic Engineering


Adversarial pattern classification
using multiple classifiers and randomization

Battista Biggio, Giorgio Fumera, Fabio Roli

S+SSPR 2008, Orlando, Florida, December 4th, 2008

Page 2: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

[Figure: the standard pattern classification model. A physical process is observed through acquisition/measurement, corrupted by random noise, yielding a pattern (image, text document, ...); a feature vector x1, x2, ..., xn is extracted and fed to a learning algorithm that builds the classifier. Example: OCR.]

But many security applications, such as spam filtering, do not fit well with the above model:

- noise is not random, but adversarial: errors are malicious
- false negatives are not random; they are crafted to evade the classifier
- training data can be "tainted" by the attacker
- an important property of a classifier is its "hardness of evasion", that is, the effort the attacker must make to evade it

Standard pattern classification model

Page 3: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

Adversarial pattern classification

It's a game with two players: the classifier and the adversary.

- The adversary camouflages illegitimate patterns in an adversarial way to evade the classifier
- The classifier should be adversary-aware, to handle the adversarial noise and to implement defence strategies

[Figure: the adversarial pattern classification model. Measurement yields a pattern (e-mail, network packet, fingerprint, ...), corrupted by adversarial noise; the feature vector x1, x2, ..., xn is fed to a learning algorithm that builds the classifier. Example: spam e-mails.]

Spam message: "CNBC Features MPRG on Power Lunch Today, Price Climbs 74%! The Motion Picture Group. Symbol: MPRG. Price: $0.33 UP 74%"

Page 4: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

An example of adversarial classification

Feature weights: buy = 1.0, viagra = 5.0

Total score = 6.0

From: [email protected]
Buy Viagra!

> 5.0 (threshold)

Spam

Spam Filtering

Linear classifier, 1st round

Note that the popular SpamAssassin filter is really a linear classifier; see http://spamassassin.apache.org
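To make the first round concrete, here is a minimal sketch in Python of a linear scorer of this kind, using the illustrative weights from the slide (SpamAssassin's real rules and scores differ):

```python
import re

# Illustrative weights and threshold from the slide, not
# SpamAssassin's actual rule scores.
WEIGHTS = {"buy": 1.0, "viagra": 5.0}
THRESHOLD = 5.0

def score(message):
    """Sum the weights of the known tokens occurring in the message."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    return sum(w for tok, w in WEIGHTS.items() if tok in tokens)

def classify(message):
    """Label the message spam if its total score exceeds the threshold."""
    return "spam" if score(message) > THRESHOLD else "ham"

print(score("Buy Viagra!"), classify("Buy Viagra!"))  # 6.0 spam
```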

Page 5: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

A game in the feature space…

1st round

[Figure: first round in the feature space (X1, X2): positive (+) and negative (-) training patterns and the learned decision boundary yc(x).]

Feature weights: buy = 1.0, viagra = 5.0

Classifier’s weights are learnt using an initial “untainted” training set

See, for example, the case of the SpamAssassin filter: http://spamassassin.apache.org/full/3.0.x/dist/masses/README.perceptron

From: [email protected]
Buy Viagra!

N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004

Page 6: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

An example of adversarial classification

Feature weights: buy = 1.0, viagra = 5.0, University = -2.0, Florida = -3.0

Total score = 1.0

From: [email protected]
Buy Viagra!
Florida University Nanjing

< 5.0 (threshold)

Spammer attacks by adding "good" words…

Linear classifier, 2nd round

Ham

Page 7: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

A game in the feature space…

2nd round. Feature weights: buy = 1.0, viagra = 5.0, University = -2.0, Florida = -3.0

Adding good words is a typical trick used by spammers to evade a filter.

The spammer's goal is to modify the mail so that the filter is evaded but the message is still understandable by humans (see the sketch below).
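A toy sketch of the attack, under the assumption that the spammer knows the weights (the function names are ours, not from the paper): greedily append the most negative-weighted words until the score drops below the threshold.

```python
import re

# Illustrative weights from the slides; negative weights are "good" words.
WEIGHTS = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}
THRESHOLD = 5.0

def score(message):
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    return sum(w for tok, w in WEIGHTS.items() if tok in tokens)

def add_good_words(message):
    """Append good words, most negative weight first, until evasion."""
    good = sorted((tok for tok, w in WEIGHTS.items() if w < 0),
                  key=lambda tok: WEIGHTS[tok])
    for tok in good:
        if score(message) <= THRESHOLD:  # already evading: stop editing
            break
        message += " " + tok
    return message

evaded = add_good_words("Buy Viagra!")
print(evaded, score(evaded))  # "Buy Viagra! florida" 3.0 -> classified ham
```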

Spammer attacks by adding Spammer attacks by adding ““goodgood”” words words……X2

X1

+++

+

+---

-

-

yc(x)

-

From: [email protected] Viagra!Florida UniversityNanjing

N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004

Page 8: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

Modelling the spammer's attack strategy

N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004

[Figure: in the feature space (X1, X2), a malicious pattern x on the positive side of yc(x) is mapped by A(x) to a camouflaged pattern x' on the negative side.]

The adversary uses a strategy function A(x) to select malicious patterns that can be camouflaged as innocent with minimum cost W(x, x'):

$$A(x) = \arg\max_{x' \in X} \left[\, U_A(y_c(x'), +) - W(x, x') \,\right]$$

Adversary utility is higher when malicious patterns are misclassified: $U_A(-,+) > U_A(+,+)$.

For spammers, the cost W(x, x’) is related to adding words, replacing words, etc.

The adversary transforms a malicious pattern x into an innocent pattern x' if the camouflage cost W(x, x') is lower than the utility gain

In spam filtering, the adversary selects spam mails which can be camouflaged as ham mails with a minimum number of modifications of the mail content

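A brute-force sketch of this strategy over a finite set of candidate camouflages (all names and the word-level cost are our illustrative choices, not the paper's; `classify` follows the convention of the earlier sketches):

```python
# Utilities U_A(y_c(x'), +): the adversary gains when a malicious
# pattern ("+") is labelled innocent ("-").
U_A = {("-", "+"): 5.0,   # misclassified as innocent: high utility
       ("+", "+"): 0.0}   # correctly detected: no utility

def W(x, x_prime):
    """Camouflage cost: here, the number of words added or removed."""
    return len(set(x.split()) ^ set(x_prime.split()))

def A(x, candidates, classify):
    """Adversary's strategy: argmax over x' of U_A(y_c(x'), +) - W(x, x')."""
    def gain(x_prime):
        label = "-" if classify(x_prime) == "ham" else "+"
        return U_A[(label, "+")] - W(x, x_prime)
    best = max(candidates, key=gain)
    # Transform x only if the best camouflage beats leaving x unchanged.
    return best if gain(best) > gain(x) else x
```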

Page 9: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

An example of adversarial classification

Feature weights: buy = 1.0, viagra = 5.0, University = -0.3, Florida = -0.3

Total score = 5.4

From: [email protected]
Buy Viagra!
Florida University Nanjing

> 5.0 (threshold)

Classifier reaction by retraining…

Linear classifier, 3rd round

Spam

Page 10: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

Modelling classifier reaction

3rd round. Feature weights: buy = 1.0, viagra = 5.0, University = -2.0, Florida = -3.0

[Figure: third round in the feature space (X1, X2): the retrained decision boundary yc(x) puts the camouflaged pattern back on the positive side.]

From: [email protected]
Buy Viagra!
Florida University Nanjing

Classifier retraining…

N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004

The classifier is adversary-aware: it takes into account the previous moves of the adversary

In real cases, this means that the filter's user provides the correct labels for mislabelled mails

The classifier constructs a new decision boundary yc(x) if this move gives a utility higher than the cost of extracting features and re-training, as in the sketch below
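As a sketch, the classifier's move reduces to a cost-benefit test (all quantities are placeholders):

```python
# Retrain only if the expected utility gain of the new boundary
# outweighs the cost of feature extraction and re-training.
def should_retrain(utility_new, utility_current, retraining_cost):
    return utility_new - utility_current > retraining_cost
```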

Page 11: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

Adversary-aware classifier

N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004

Results reported in this paper showed that classifier performance significantly degrades if the adversarial nature of the task is not taken into account, while an adversary-aware classifier can perform significantly better

By anticipating the adversary's strategy, we can defeat it.

“If you know the enemy and know yourself, you need not fear the result of a hundred battles” (Sun Tzu, 500 BC)

Real anti-spam filters should be adversary-aware, which means that they should adapt to and anticipate the adversary's moves: exploiting the feedback of the user, changing their operation, etc.

Page 12: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

Beyond classifier retraining…

[Figure: feature space (x1, x2) split into regions C(x) = + and C(x) = -; a spam pattern x is mapped to its minimum-cost camouflage(s) x', e.g. "BUY VI@GRA!".]

Real anti-spam filters can be re-trained using the feedback of the users, who can provide correct labels for the mislabelled mails. In the model of Dalvi et al., this corresponds to the assumption of perfect knowledge of the adversary's strategy function A(x)

Defence strategies in adversarial classification

Beyond retraining, are there other defence strategies that we can implement?

Page 13: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

A defence strategy: hiding information by randomization

“Keep the adversary guessing. If your strategy is a mystery, it cannot be counteracted. This gives you a significant advantage” (Sun Tzu, 500 BC)

[Figure: two random realizations y1(x) and y2(x) of the boundary yc(x) in the feature space (X1, X2); facing a malicious pattern x and a candidate camouflage x', the adversary must ask: "Am I evading it?"]

An intuitive strategy for making a classifier harder to evade is to hide information about it from the adversary

A possible implementation of this strategy is to introduce some randomness in the placement of the classification boundary

Page 14: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

A defence strategy: hiding information by randomization

[Figure: the same two random realizations y1(x) and y2(x) of yc(x); the camouflage A(x) = x' evades y2(x) but does not evade y1(x)!]

Consider a randomized classifier yc(x, T), where the random variable is the training set T

Example: assume that U_A(-,+) = 5, U_A(+,+) = 0, and W(x, x') = 3.

Case 1: the adversary knows the actual boundary y2(x). The adversary's gain if the pattern x is changed into x' is U_A(-,+) - W(x, x') = 5 - 3 = 2 > 0, so the adversary makes the transformation and evades the classifier.

Case 2: two random boundaries with P(y1(x)) = P(y2(x)) = 0.5. The expected gain is [U_A(+,+) * P(y1(x)) + U_A(-,+) * P(y2(x))] - W(x, x') = [0 * 0.5 + 5 * 0.5] - 3 = 2.5 - 3 = -0.5 < 0, so the adversary does not move, even if such a move would have allowed evading the deployed classifier.
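The same computation in a few lines, using the slide's numbers:

```python
# Slide's numbers: U_A(-,+) = 5 (evasion), U_A(+,+) = 0 (detected),
# camouflage cost W(x, x') = 3.
U_EVADE, U_CAUGHT, COST = 5.0, 0.0, 3.0

# Case 1: the adversary knows the deployed boundary y2(x), which x' evades.
gain_known = U_EVADE - COST                            # 5 - 3 = 2 > 0: move

# Case 2: y1 and y2 each deployed with probability 0.5; x' evades only y2.
gain_expected = 0.5 * U_CAUGHT + 0.5 * U_EVADE - COST  # 2.5 - 3 = -0.5 < 0: stay

print(gain_known, gain_expected)  # 2.0 -0.5
```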

Page 15: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

A defence strategy: hiding information by randomization

[Figure: as before, two random realizations y1(x) and y2(x) of yc(x), with the malicious pattern x and its camouflage A(x) = x'.]

Why is a randomized classifier harder to evade?

In the Proceedings paper we show that the adversary's strategy A(x) becomes suboptimal: the adversary either does not camouflage malicious patterns whose camouflage would allow evading the classifier, or camouflages malicious patterns which the classifier misclassifies anyway.

Key points: yc(x) becomes a random variable Yc, so the adversary has to compute the expected value of its utility by averaging over the possible realizations of yc(x):

$$\mathrm{E}_{Y_c}\{A(x)\} = \arg\max_{x' \in X} \left[\, \mathrm{E}_{Y_c}\{U_A(y_c(x'), +)\} - W(x, x') \,\right]$$

$$\mathrm{E}_{Y_c}\{A(x)\} \neq A(x \mid y_c(x)) = A_{\mathrm{opt}}(x)$$
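A sketch of the adversary's best response under randomization (our names; `utility` and `cost` play the roles of U_A and W from the earlier sketch), with the expectation taken over sampled realizations of the classifier:

```python
# Under randomization the adversary can only maximize the EXPECTED
# utility over realizations of Y_c, which in general differs from the
# optimal move against the boundary actually deployed.
def A_randomized(x, candidates, sampled_classifiers, utility, cost):
    """Argmax over x' of E_{Y_c}[U_A(y_c(x'), +)] - W(x, x')."""
    def expected_gain(x_prime):
        e_u = sum(utility(c, x_prime) for c in sampled_classifiers)
        return e_u / len(sampled_classifiers) - cost(x, x_prime)
    best = max(candidates, key=expected_gain)
    # Move only if the best camouflage beats staying put in expectation.
    return best if expected_gain(best) > expected_gain(x) else x
```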

Page 16: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

[Figure: the SpamAssassin architecture: black/white list, URL filter, signature filter, header analysis and content analysis modules, whose outputs are combined by a weighted sum (Σ) to assign the class: legitimate or spam.]

Evade-hard MCS with randomization

http://spamassassin.apache.org

The defence strategy based on "randomization" can be implemented in several ways. We implemented it using the multiple classifiers approach, by randomising the combination function. For our experiments, we used the SpamAssassin filter, which is basically a linearly weighted combination of classifiers, and randomized the weights by training-set bootstrapping (see the sketch below).
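A minimal sketch of that randomization (`train_linear` is a stand-in for the linear learner used in the experiments and is not shown here):

```python
import random

def bootstrap_weight_sets(training_set, train_linear, n_sets=100, seed=0):
    """Train one weight vector per bootstrap replicate of the training set."""
    rng = random.Random(seed)
    weight_sets = []
    for _ in range(n_sets):
        # Sample with replacement to build each bootstrap replicate.
        replicate = rng.choices(training_set, k=len(training_set))
        weight_sets.append(train_linear(replicate))
    return weight_sets

def deploy(weight_sets, rng=random):
    """Pick one weight set uniformly at random; the adversary sees only
    the distribution of weights, never the deployed set."""
    return rng.choice(weight_sets)
```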

Page 17: Adversarial Pattern Classification Using Multiple Classifiers and Randomisation

Experiments with multiple classifiers and randomization

E-mail data set: TREC 2007. 75,419 real e-mail messages received between April and July 2007; 25,220 ham, 50,199 spam.

SpamAssassin architecture

Experimental set-up: we used the SpamAssassin filter with a weighted sum as the combination function (an SVM with a linear kernel).

Randomization of the combination function by bootstrap: the adversary "sees" 100 different sets of weights with identical probability.

Key point: the adversary does not know the actual set of weights deployed for combining the multiple classifiers (filtering rules), so it can devise only a suboptimal strategy A(x).

Results:

         FN (%)    U_C     U_A
det       19.55    1.30    0.98
rnd       11.21    1.46    0.56

(det: deterministic weights; rnd: randomized weights)

The average false negative rate decreases from 19.55% to 11.21% when the classifier uses randomization. This is confirmed by the decrease of the adversary's utility U_A and the increase of the classifier's utility U_C.

These results assume that the adversary can make any modification which reduces the score of a rule.