


Classify, but Verify: Breaking the Closed-World Assumption in Stylometric Authorship Attribution

Ariel Stolerman, Rebekah Overdorf, Sadia Afroz, Rachel Greenstadt
The Privacy, Security and Automation Lab, Drexel University

USENIX Security, August 14-16, 2013

Motivation

• The web is full of anonymous communication that was never meant to be analyzed for authorship attribution

• Stylometry is a form of authorship attribution that relies on the linguistic information found in a document

• Stylometry research has thus far focused on closed-world models, limited to a set of known suspect authors

• Often the closed-world assumption is broken, requiring a solution for forensic analysts and Internet activists who wish to remain anonymous

Contribution and Application to Security & Privacy

• The Classify-Verify method
  An abstaining classification approach that augments authorship classification with a verification step
  • Performs well in open-world problems, with accuracy similar to that of traditional methods in closed-world problems
  • Improves closed-world solutions by replacing misclassifications with “unknown”
  • Performs well in adversarial settings where traditional methods fail, without the need to train on adversarial data

• The Sigma Verification method
  An extension of the distractorless verification method [Noecker & Ryan, LREC’12] for author-document distance measurement
  • Incorporates pairwise distances within the author’s documents
  • Normalizes over the standard deviations of the author’s features

• Security & Privacy Applications
  Useful when the target class may be absent from the suspect set:
  • Authorship Attribution/Verification (this work)
  • Website fingerprinting
  • Malware family identification

Problem Statement

Definitions:

• D – document of unknown authorship

• A – candidate author

• A = {A1, ..., An} – set of candidate authors

• p = Pr[A_D ∈ A] – the probability that D’s author is in the set of candidates A, denoted the in-set prob. (1 − p is the not-in-set prob.)

• t – verification acceptance threshold

Problems:

• Authorship Attribution: Which A ∈ A is the author of D?

• Authorship Verification: Is D written by A?

• The Classify-Verify Problem: given D, A and optionally p:
  • Determine the author A ∈ A of D, or
  • Determine that D’s author is not in A (w.r.t. acceptance threshold t)

Classify-Verify

[Figure: Classify-Verify flow — the suspect set A is used to train the classifier C_A and the verifiers V_A1, ..., V_An; a document D is classified to some Ai, then verified against the threshold t (derived from the optional t and p inputs), ending in Accept or Reject]

Flow of the Classify-Verify algorithm on a test document D and a suspect set A, with optional acceptance threshold t and in-set prob. p.

The Classify-Verify Algorithm

Input: Document D, suspect author set A = {A1, ..., An}, target measure to maximize µ
Optional: in-set prob. p, manual threshold t
Output: A_D if A_D ∈ A, and ⊥ otherwise

  C_A ← classifier trained on A
  V_A = {V_A1, ..., V_An} ← verifiers trained on A
  if t, p not set then
      t ← threshold maximizing p-µR of Classify-Verify cross-validation on A
  else if t not set then
      t ← threshold maximizing p-µ of Classify-Verify cross-validation on A
  end if
  A ← C_A(D)
  if V_A(D, t) = True then
      return A
  else
      return ⊥
  end if

Synopsis:

• Train one closed-world classifier C_A over A and n verifiers†‡ V_1, ..., V_n, one for each Ai ∈ A

• Classify D using C_A, and let the result be author Ai

• Verify D using V_i

• If it accepts, return the author Ai

• Otherwise, return ⊥, which stands for “none”
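As a concrete illustration of the flow above, here is a minimal, self-contained sketch in Python. It is not the authors' implementation: a toy nearest-centroid classifier stands in for C_A, a single cosine-distance test stands in for the per-author verifiers, and the two-author dataset is invented for the example.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def centroid(docs):
    # Per-feature mean of an author's training documents
    n = len(docs)
    return [sum(col) / n for col in zip(*docs)]

def classify_verify(doc, centroids, t):
    """Return the closest author if the verification distance is below t, else None (⊥)."""
    # Step 1: closed-world classification (here: nearest centroid).
    author = min(centroids, key=lambda a: cosine_distance(doc, centroids[a]))
    # Step 2: verify D against the chosen author's model only.
    if cosine_distance(doc, centroids[author]) < t:
        return author
    return None  # abstain: D's author is judged not to be in A

# Invented two-author training set with distinctive feature vectors.
train = {
    "alice": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "bob":   [[0.0, 0.1, 1.0], [0.1, 0.2, 0.9]],
}
centroids = {a: centroid(docs) for a, docs in train.items()}
print(classify_verify([0.95, 0.15, 0.05], centroids, t=0.1))  # in-set document -> "alice"
print(classify_verify([0.5, 1.0, 0.5], centroids, t=0.1))     # out-of-set document -> None
```

Swapping in the poster's actual components (an SMO SVM for C_A and a Vσ verifier per author) preserves this same two-step structure.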

†Verification Methods

Classifier-Induced Verifiers
Let P_i denote the i-th order statistic of the probability outputs of C_A(D); then:
• P1: probability of the chosen class
• P1-P2-Diff: difference between the chosen and second-to-chosen class probabilities
• Gap-Conf [Paskov, MIT 2010]: P1-P2-Diff based on n 1-vs-all classifiers

Standalone Verifiers∗: distractorless or Sigma verification
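The classifier-induced acceptance tests above reduce to simple checks on the classifier's probability output. A hedged sketch (function names invented for illustration):

```python
def p1_accept(probs, t):
    """P1: accept iff the probability of the chosen (top) class exceeds t."""
    return max(probs) > t

def p1_p2_diff_accept(probs, t):
    """P1-P2-Diff: accept iff the gap between the two highest class probabilities exceeds t."""
    top2 = sorted(probs, reverse=True)[:2]
    return top2[0] - top2[1] > t

print(p1_accept([0.7, 0.2, 0.1], t=0.5))            # confident prediction -> True
print(p1_p2_diff_accept([0.4, 0.35, 0.25], t=0.2))  # ambiguous prediction -> False
```

Gap-Conf applies the same gap test, but to the outputs of n separate 1-vs-all classifiers rather than one multi-class classifier.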

‡Verification Acceptance Threshold t

• Manual: acceptance threshold t is set manually

• p-induced threshold: t is set empirically using cross-validation over the training set, to maximize the target evaluation measure µ (e.g. F1-score) for a given in-set prob. p

• p-Robust: t is set as in p-induced, but to maximize the average µ across any p
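The p-induced threshold can be pictured as a grid search over candidate thresholds on held-out verifier scores. A toy sketch, with all data invented: scores are verifier distances, a document is accepted when its score falls below t, and for simplicity in-set and not-in-set errors are weighted equally (i.e. p = 0.5 is assumed).

```python
def f1(tp, fp, fn):
    # F1-score from true positives, false positives and false negatives
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(in_set_scores, not_in_set_scores, grid):
    """Pick the threshold t on a grid that maximizes F1 of the accept decision."""
    def f1_at(t):
        tp = sum(s < t for s in in_set_scores)       # in-set docs correctly accepted
        fn = len(in_set_scores) - tp                 # in-set docs wrongly rejected
        fp = sum(s < t for s in not_in_set_scores)   # not-in-set docs wrongly accepted
        return f1(tp, fp, fn)
    return max(grid, key=f1_at)

# Invented cross-validation scores: in-set documents sit closer to their author.
t = best_threshold([0.1, 0.2, 0.25], [0.6, 0.7, 0.8], [i / 10 for i in range(1, 10)])
print(t)  # -> 0.3, the first grid point that separates the two groups perfectly
```

The p-Robust variant would instead pick the t that maximizes the average of this objective over a range of p values.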

[Figure: F1-score vs. verification threshold, with in-set, not-in-set and 0.5-F1 curves]

F1-score for Classify-Verify on EBG using p = 0.5, SMO SVM and Vσ with varying manually-set thresholds.

Evaluation & Results

• Corpora:
  • EBG: The Extended-Brennan-Greenstadt Adversarial corpus [Brennan et al., ACM Trans. Inf. Syst. Secur.’12], 45 authors
  • Blog: The ICWSM 2009 Spinn3r Blog dataset [Burton et al., ICWSM’09], 50 authors

• Closed-world classifier: SVM SMO

• Feature set: 500 most common character bigrams
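A character-bigram feature set like the one above can be sketched as follows. The helper names and toy corpus are invented, the vocabulary is truncated far below 500, and whether counts are raw or normalized is a detail the poster does not state (relative frequencies are assumed here):

```python
from collections import Counter

def top_bigrams(texts, k):
    """Vocabulary: the k most frequent character bigrams across a corpus."""
    counts = Counter(b for t in texts for b in (t[i:i + 2] for i in range(len(t) - 1)))
    return [b for b, _ in counts.most_common(k)]

def featurize(text, vocab):
    """Relative-frequency vector of the vocabulary bigrams in `text`."""
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values()) or 1
    return [counts[b] / total for b in vocab]

vocab = top_bigrams(["the cat sat", "the hat"], k=5)
print(vocab[0])  # "at" is the most frequent bigram in this toy corpus
print(featurize("the mat", vocab))
```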

Results:

Term     Meaning
p        Prob. of in-set documents
p-F1     F1-score of Classify-Verify for p prob. of in-set documents
p-F1R    F1-score of Classify-Verify for p prob. of in-set documents, using robust thresholds
p-Base   F1-score of the closed-world classifier for p prob. of in-set documents; 1-Base means pure closed-world settings; for each p ∈ [0, 1], p-Base = p · 1-Base

F1-score reference terminology
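As a worked instance of the p-Base relation above (the closed-world classifier can at best be right on the fraction p of documents whose true author it could possibly name), with illustrative values:

```latex
% Illustrative numbers, not results from the poster:
% suppose the pure closed-world F1 is 1\text{-Base} = 0.9 and p = 0.5. Then
p\text{-Base} \;=\; p \cdot 1\text{-Base} \;=\; 0.5 \times 0.9 \;=\; 0.45
```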

[Figure: 0.5-F1 per verifier on EBG — Vσ: 0.749, Gap-Conf: 0.799, P1: 0.868, P1-P2-Diff: 0.873 — with 0.5-Base and 1-Base shown for reference]

0.5-F1 results of the Classify-Verify method on the EBG corpus, where the expected prob. of in-set and not-in-set documents is equal (50%). Classify-Verify attains 0.5-F1 that outperforms 0.5-Base and is even close to 1-Base.

[Figure: 0.5-F1 per verifier on the blog corpus — Vσ: 0.869, Gap-Conf: 0.819, P1: 0.88, P1-P2-Diff: 0.869 — with 0.5-Base and 1-Base shown for reference]

0.5-F1 results of the Classify-Verify method on the blog corpus, where the expected prob. of in-set and not-in-set documents is equal (50%). Classify-Verify attains 0.5-F1 that outperforms both 0.5-Base and 1-Base.

[Figure: F1 per verifier (Vσ, Gap-Conf, P1, P1-P2-Diff) under imitation and obfuscation attacks; values shown: 0.367, 0.826, 0.612, 0.693, 0.809, 0.784, 0.874, 0.874]

F1-score results of the Classify-Verify method on the EBG corpus under imitation and obfuscation attacks, where the expected prob. of in-set and not-in-set documents is equal (50%). Classify-Verify successfully thwarts most of the attacks.

[Figure: p-F1R vs. in-set portion p (0.1 to 0.9) for Vσ, Gap-Conf, P1, P1-P2-Diff and p-Base]

p-F1R results of the Classify-Verify method on the EBG corpus, where the expected prob. of in-set documents p varies from 10% to 90% and is assumed to be unknown. Robust p-independent thresholds are used for the underlying verifiers. Classify-Verify attains p-F1R that outperforms the respective p-Base.

[Figure: p-F1R vs. in-set portion p (0.1 to 0.9) for Vσ, Gap-Conf, P1, P1-P2-Diff and p-Base]

p-F1R results of the Classify-Verify method on the blog corpus, where the expected prob. of in-set documents p varies from 10% to 90% and is assumed to be unknown. Robust p-independent thresholds are used for the underlying verifiers. Classify-Verify attains p-F1R that outperforms the respective p-Base.

∗Distractorless & Sigma Verification

Distractorless – V [Noecker & Ryan, LREC’12]: verification based on the vector distance between A’s centroid A and D, using cosine distance:

  δ(A, D) = (A · D) / (‖A‖ ‖D‖) = Σⁿᵢ₌₁ AᵢDᵢ / (√(Σⁿᵢ₌₁ Aᵢ²) · √(Σⁿᵢ₌₁ Dᵢ²))

Sigma – Vᵃσ: enhances distractorless verification with per-feature SD (Vσ) and per-author threshold (Vᵃ) normalization

Distance \ Test                                      δ < t    δ − δ_A < t
δ_D,A  = Δ((Dᵢ, Aᵢ))ⁿᵢ₌₁                             V        Vᵃ
δσ_D,A = Δ((Dᵢ/σ(A)ᵢ, Aᵢ/σ(A)ᵢ))ⁿᵢ₌₁                 Vσ       Vᵃσ

Differences in distance calculation and t-threshold test for V, Vσ, Vᵃ and Vᵃσ.
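The per-feature SD normalization at the heart of Vσ can be sketched as below. This is an illustrative toy, not the paper's code: Euclidean distance stands in for the distance Δ, and the two-document author and all values are invented.

```python
import math

def per_feature_sd(docs):
    """Per-feature standard deviation sigma(A) over an author's documents."""
    n = len(docs)
    cols = list(zip(*docs))
    means = [sum(c) / n for c in cols]
    # `or 1.0` guards against division by zero for constant features.
    return [math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
            for c, m in zip(cols, means)]

def sigma_distance(doc, centroid_vec, sds):
    """Distance after dividing each feature by the author's SD (Euclidean here)."""
    return math.sqrt(sum(((d - c) / s) ** 2
                         for d, c, s in zip(doc, centroid_vec, sds)))

author_docs = [[1.0, 2.0], [3.0, 2.2]]
sds = per_feature_sd(author_docs)   # feature SDs: [1.0, ~0.1]
centroid_vec = [2.0, 2.1]           # per-feature mean of author_docs
# The same 0.1 deviation counts for little in the noisy feature,
# but for a lot in the stable one:
print(sigma_distance([2.1, 2.1], centroid_vec, sds))
print(sigma_distance([2.0, 2.2], centroid_vec, sds))
```

This is the intuition behind Vσ: deviations are judged relative to how much the author's own writing varies on each feature.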

[Figure: ROC curves, TPR vs. FPR, for V, Vσ and Vᵃσ]

ROC curves for V, Vσ and Vᵃσ evaluation on the EBG corpus.

[Figure: ROC curves, TPR vs. FPR, for V, Vσ and Vᵃσ]

ROC curves for V, Vσ and Vᵃσ evaluation on the blog corpus.

Dept. of Computer Science - Drexel University - 3175 JFK Blvd., Philadelphia, PA 19104 Mail: {stolerman,rjo43,sa499,greenie}@cs.drexel.edu WWW: http://psal.cs.drexel.edu/