DISTRIBUTED COLLABORATIVE FILTERING FOR ROBUST RECOMMENDATION AGAINST SHILLING ATTACKS

AE-TTIE JI1, CHEOL YEON1, HEUNG-NAM KIM1, AND GEUN-SIK JO2

1 Intelligent E-Commerce Systems Laboratory, Department of Computer Science & Information Engineering, Inha University

{aerry13, entireboy, nami}@eslab.inha.ac.kr

2 School of Computer Science & Engineering, Inha University, 253 Yonghyun-dong, Incheon, Korea 402-751

gsjo@inha.ac.kr

INTRODUCTION & BACKGROUND

A Robustness Analysis of Collaborative Filtering
- User profiles are made by anonymous, unauthenticated users.
- Vulnerability to profile injection attacks (PIA).

PocketLens - Distributed Personal Recommender
- It can partially mitigate the effects of PIA from system providers.
- But it is still not safe from anonymous attackers!

Trust in Recommender Systems
- “Trust” in recommender systems.
- Automated attack detection schemes and robustness of recommendation algorithms.
- Correlation between trust and user similarity.

TCFMA ARCHITECTURE: TRUST-BASED COLLABORATIVE FILTERING WITH MOBILE AGENTS

Credibility of recommendations
- To achieve robustness against shilling attacks
- Distributed Personal Recommender
- Web of Trust

Trust Propagation
- To overcome sparseness of webs of trust
- The Advogato trust metric

Scalability
- To raise the efficiency of distributed computing
- Mobile Agent Framework

ARCHITECTURE

[Figure: components shown are the Model Owner, the User Agent (TrustList, ItemList, BlockList, Owner’s Similarity Model), the Mobile Agent, the Neighbor’s Agent, and the Web of Trust. Labeled interactions: Action & Feedback, Recommendation, Creation, Dispatch, Find Migration Path (Owner’s Trust List, Neighbors’ Trust List), Get Neighbors’ Ratings, Message, UpdateSimilarity.]

Fig. 1. Overview of trust-based collaborative filtering with mobile agents

THE MEANING OF NOTATIONS

Notation      Meaning
PX            Arbitrary user included in the web of trust
PO            Target user, i.e. the similarity model owner
PC            Current user whom PO’s mobile agent is visiting at the moment
{TRUSTPx}     List of users who are trusted by PX
{BLOCKPx}     List of users who are distrusted by PX
{ITEMSPx}     List of <item, rating> pairs, i.e. the items on which PX has already expressed an opinion, with the corresponding preference ratings
{PATHPx}      Migration path along which PX’s mobile agent migrates
AGENTPx       Personal agent of PX
AGENTMPx      Mobile agent of PX

Table 1. The meaning of notations

TRUST-BASED USER SELECTION

I. AGENTPo finds the migration path {PATHPo}, which includes the users trusted by PO, for a mobile agent AGENTMPo.

II. The neighbors of the target user PO are chosen from the users included in {PATHPo}.

III. PO’s personal agent AGENTPo creates a mobile agent, AGENTMPo, to find neighbors and build a similarity model based on them incrementally.

IV. AGENTMPo traces the path recursively until no users remain in {PATHPo}∩{TRUSTPc}.

V. AGENTMPo is disposed of at the last node after visiting all users in {PATHPo}.
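The recursive walk in steps IV and V can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the trust lists are given as a plain dictionary, the migration path as a set, and `trace_path` together with the sample users is hypothetical (the prototype itself ran on a mobile agent platform).

```python
def trace_path(owner, path, trust):
    """Return the order in which a mobile agent would visit users.

    owner: the model owner P_O (hypothetical identifier)
    path:  set of users in {PATH_Po}
    trust: dict mapping each user to the users he or she trusts ({TRUST_Px})
    """
    remaining = set(path)   # users in the path not yet visited
    visited = []            # visiting order of the mobile agent

    def visit(current):
        # Candidates are the users in {PATH_Po} ∩ {TRUST_Pc} still unvisited.
        for nxt in sorted(trust.get(current, ())):
            if nxt in remaining:
                remaining.discard(nxt)
                visited.append(nxt)
                visit(nxt)  # migrate to the next trusted user

    visit(owner)
    return visited


# Hypothetical web of trust: O trusts A and B, A trusts C, C trusts A.
trust = {"O": ["A", "B"], "A": ["C"], "B": [], "C": ["A"]}
order = trace_path("O", {"A", "B", "C"}, trust)
print(order)
```

The walk stops on its own once no trusted, unvisited user is left at the current node, which corresponds to the point where the agent is disposed of.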

TRUST-BASED USER SELECTION

The Advogato maximum flow algorithm
- Discovers which users are trusted by credible members of an online community and which are not.

The bottleneck property
- “The total trust quantity accorded to an s → t edge is not significantly affected by changes to the successors of t.”
- The minimum number of profiles needed for an attack to succeed is therefore kept out of the collaborative filtering process.

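INCREMENTAL MODEL BUILDING

I. AGENTMPO identifies IOi and IPj, which belong to {ITEMSPO}∩{ITEMSPC} and {ITEMSPC} - {ITEMSPO} respectively, by communicating with a neighbor agent AGENTPC.

II. For each pair (IOi, IPj), AGENTMPO calculates the following values (for cosine or adjusted cosine similarity) and sends them to its own user agent AGENTPO.

Cosine similarity:

$$W_{IO_i,IP_j} = \mathrm{Rating}_{P_C,IO_i} \times \mathrm{Rating}_{P_C,IP_j}$$

$$W_{IO_i} = \mathrm{Rating}_{P_C,IO_i}^{2}$$

$$W_{IP_j} = \mathrm{Rating}_{P_C,IP_j}^{2}$$

Adjusted cosine similarity:

$$W_{IO_i,IP_j} = (\mathrm{Rating}_{P_C,IO_i} - \mathrm{AvgRating}_{P_C}) \times (\mathrm{Rating}_{P_C,IP_j} - \mathrm{AvgRating}_{P_C})$$

$$W_{IO_i} = (\mathrm{Rating}_{P_C,IO_i} - \mathrm{AvgRating}_{P_C})^{2}$$

$$W_{IP_j} = (\mathrm{Rating}_{P_C,IP_j} - \mathrm{AvgRating}_{P_C})^{2}$$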
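As a rough illustration of steps I and II, the sketch below computes the per-neighbor terms for one visit. It assumes that ratings are held in plain dictionaries and that the squared terms feed the denominators, as in the worked examples at the end of the slides; `pair_terms` and the variable names are hypothetical.

```python
def pair_terms(owner_items, neighbor_items, adjusted=False):
    """Return {(IO_i, IP_j): (w_pair, w_io, w_ip)} for one visited neighbor P_C.

    owner_items / neighbor_items: dicts mapping item -> rating.
    adjusted=True subtracts the neighbor's average rating (adjusted cosine).
    """
    # IO_i: items rated by both users; IP_j: items only the neighbor has rated.
    shared = [i for i in neighbor_items if i in owner_items]
    new = [i for i in neighbor_items if i not in owner_items]

    avg = 0.0
    if adjusted and neighbor_items:
        avg = sum(neighbor_items.values()) / len(neighbor_items)

    terms = {}
    for io in shared:
        for ip in new:
            r_io = neighbor_items[io] - avg
            r_ip = neighbor_items[ip] - avg
            terms[(io, ip)] = (r_io * r_ip, r_io ** 2, r_ip ** 2)
    return terms


# Ratings of owner A and neighbor B taken from the example slides.
owner_a = {"Matrix": 3, "Space Odyssey": 4, "Dark City": 4}
neighbor_b = {"Matrix": 4, "Star Wars": 3, "Dark City": 4, "Ghost Busters": 5}

cosine = pair_terms(owner_a, neighbor_b)
adjusted_cosine = pair_terms(owner_a, neighbor_b, adjusted=True)
print(cosine[("Dark City", "Star Wars")])
```

Note that only the neighbor's own ratings enter the terms; the owner's item list is used solely to decide which items count as IOi and which as IPj.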

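INCREMENTAL MODEL BUILDING

III. AGENTPO adds up these values incrementally until AGENTMPO has sent the values for all users in {PATHPO}, except for those who have not rated IOi. For each item pair (IOi, IPj), three running sums are kept:

$$\mathrm{Numer}_{IO_i,IP_j} = \mathrm{Numer}_{IO_i,IP_j} + W_{IO_i,IP_j}$$

$$\mathrm{Denom}_{IO_i} = \mathrm{Denom}_{IO_i} + W_{IO_i}$$

$$\mathrm{Denom}_{IP_j} = \mathrm{Denom}_{IP_j} + W_{IP_j}$$

IV. AGENTPO calculates the similarity of the item pair (IOi, IPj).

$$sim(IO_i, IP_j) = \frac{\mathrm{Numer}_{IO_i,IP_j}}{\sqrt{\mathrm{Denom}_{IO_i}} \times \sqrt{\mathrm{Denom}_{IP_j}}}$$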
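A minimal sketch of how a user agent might keep these running sums is shown below. It assumes cosine terms with squared denominators, as in the worked examples at the end of the slides; the class `SimilarityModel` and its methods are hypothetical names. The rating pairs are the (Dark City, Star Wars) pairs from those examples.

```python
import math
from collections import defaultdict


class SimilarityModel:
    """Keeps [Numer, Denom_IO, Denom_IP] per item pair and updates them per visit."""

    def __init__(self):
        self.sums = defaultdict(lambda: [0.0, 0.0, 0.0])

    def add(self, pair, w_pair, w_io, w_ip):
        # Step III: add the values reported by the mobile agent.
        s = self.sums[pair]
        s[0] += w_pair
        s[1] += w_io
        s[2] += w_ip

    def sim(self, pair):
        # Step IV: similarity from the accumulated sums.
        numer, denom_io, denom_ip = self.sums[pair]
        denom = math.sqrt(denom_io) * math.sqrt(denom_ip)
        return numer / denom if denom else 0.0


model = SimilarityModel()
pair = ("Dark City", "Star Wars")
history = []
# (Dark City rating, Star Wars rating) of each user who rated both items.
for r_io, r_ip in [(4, 3), (3, 1), (1, 2), (4, 3), (3, 1), (1, 5)]:
    model.add(pair, r_io * r_ip, r_io ** 2, r_ip ** 2)
    history.append(model.sim(pair))

print(round(history[4], 4), round(history[5], 4))
```

Because only three sums per pair are stored, a newly visited neighbor updates the model in constant time per item pair, without revisiting earlier neighbors.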

AGENTS’ TASKS IN EACH CASE

[Figure: three cases (Case 1, Case 2, Case 3) of interaction between the owner’s mobile agent, the user agent, and a neighbor’s agent; each agent holds a TrustList, ItemList, and BlockList. Labels in the diagram: Request for matched item rating, Matched item rating, Reject, Rejection Message, Pre-computed information, Neighbor’s Similarity Model, Owner’s Similarity Model, Owner’s Item List, Information for similarity, Migration Path.]

Fig. 2. Agents’ tasks in each case

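RECOMMENDATIONS & FEEDBACK

Predictions

$$\mathit{p\_rating}_{P_O,IP_j} = \frac{\sum_{\text{All } IO_i} \left( sim(IO_i, IP_j) \times \mathrm{Rating}_{P_O,IO_i} \right)}{\sum_{\text{All } IO_i} sim(IO_i, IP_j)}$$

Feedback

[Figure: the User Agent recommends items (IP1 ... IPk ... IPj) to the Model Owner and receives rating feedback, which triggers Update and Delete operations on the item list. The user feedback is propagated to trusted peers’ agents (IP4 ... IPk ... IPn), which update in turn.]

Fig. 3. Recommendations and propagation of user’s feedback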
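The prediction step can be illustrated with a similarity-weighted average over the owner's rated items. The sketch below assumes the similarity model is a dictionary keyed by (owner item, candidate item); `predict` and the sample values are hypothetical and not taken from the experiments.

```python
def predict(owner_ratings, sims, target):
    """Predict the owner's rating for `target` as a similarity-weighted average.

    owner_ratings: dict item -> rating given by the model owner
    sims:          dict (owner item, candidate item) -> similarity
    Returns None when no similarity information is available.
    """
    numer = 0.0
    denom = 0.0
    for item, rating in owner_ratings.items():
        s = sims.get((item, target))
        if s is None:
            continue  # no similarity stored for this pair
        numer += s * rating
        denom += s
    return numer / denom if denom else None


# Hypothetical ratings and similarities.
owner_ratings = {"Matrix": 3, "Dark City": 4}
sims = {("Matrix", "Star Wars"): 0.5, ("Dark City", "Star Wars"): 1.0}
prediction = predict(owner_ratings, sims, "Star Wars")
print(prediction)
```

Items that are more similar to the candidate pull the prediction toward the owner's rating for them, so here the result lies closer to 4 than to 3.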

DATASETS & EVALUATION METRICS

Datasets
- Crawled from epinions.com (http://www.epinions.com) in May 2006
- Numeric rating of an item is in the range of 1 to 5
- Web of Trust among users
- Users who had rated at least 5 items
- Users who had expressed trust opinions about at least 25 users
- Items that had been rated by at least 10 users

users    trusts     items    ratings
4,751    216,490    2,955    121,862

Table 1. Dataset for Experiment

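DATASETS & EVALUATION METRICS

Evaluation Metrics

Mean absolute error (MAE)

$$MAE = \frac{\sum_{i=1}^{N} \left| \mathit{p\_rating}_i - \mathit{a\_rating}_i \right|}{N}$$

Absolute Prediction Shift (APS)

$$APS = \frac{\sum_{i=1}^{N} \left| \mathit{p\_rating}'_i - \mathit{p\_rating}_i \right|}{N}$$

Here p_rating is a predicted rating, a_rating the actual rating, p_rating' the prediction after manipulated users are injected, and N the number of predictions.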
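Both metrics are plain averages of absolute differences. A small sketch, assuming predictions and ratings are given as equally long lists; the function names and the sample numbers are hypothetical.

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def aps(after_attack, before_attack):
    """Absolute prediction shift: mean absolute change of the predictions
    once manipulated profiles have been injected."""
    pairs = zip(after_attack, before_attack)
    return sum(abs(x - y) for x, y in pairs) / len(before_attack)


# Hypothetical values for illustration only.
print(mae([3, 4, 5], [4, 4, 3]))      # (1 + 0 + 2) / 3
print(aps([3.5, 4.0], [3.0, 4.5]))    # (0.5 + 0.5) / 2
```

A lower MAE indicates better prediction quality, while a lower APS indicates that injected profiles moved the predictions less.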

PERFORMANCE EVALUATION

- Prototype system implemented using IBM Aglets software with JDK 1.4.2
- Benchmark system to compare the performance: random model building (in PocketLens)
  - Miller, B., Konstan, J., Terveen, L., Riedl, J.: PocketLens: Towards a Personal Recommender System. In ACM Transactions on Information Systems 22 (2004) 437-476

PERFORMANCE EVALUATION

Overall Performance of Prediction Quality
- The TCFMA + cosine-based scheme showed better prediction quality than the other two methods.
- Even a small number of users can result in a relatively better model with our proposed methods.

Table 2. Overall Performance of Prediction Quality

Neighbor peer size    10        30        50        70        100
Random                1.2866    1.2863    1.2859    1.2859    1.2859
TCFMA + cosine        1.2113    1.2114    1.2100    1.2101    1.2101
TCFMA + adjusted      1.2384    1.2480    1.2412    1.2415    1.2402

PERFORMANCE EVALUATION

Positive Effect of Trust for Prediction
- Datasets with users who have more than x trusted users.
- The more trust opinions each user has, the better the prediction quality obtained.
- Direct trust opinions have a positive influence on prediction quality.

Trust x               Trust 5   Trust 10  Trust 15  Trust 25  Trust 45
TCFMA + cosine        1.4131    1.3338    1.3313    1.2867    1.1611
TCFMA + adjusted      1.5688    1.3028    1.2952    1.2512    1.2238

Table 3. Sensitivity of trust on MAE (neighbor peer size = 50)

PERFORMANCE EVALUATION

Robustness against the shilling problem
- A set of manipulated users, each with 50 arbitrary ratings, was inserted into the training dataset.

[Figure: #MR and APS of TCFMA + cosine and of Random, plotted against the number of injected manipulated users (100, 500, 1000, 2000).]

Fig. 4. Comparison of robustness on manipulated users

PERFORMANCE EVALUATION

Efficiency of similarity model building
- The time required for model building
- The number of neighbors required for model building
- The proposed method is far superior with respect to the efficiency of similarity model building: on average it needed 9848.49 ms and accessed about 685 users, versus 27574.48 ms and about 3875 users for Random.

Table 4. Comparison of required time and accessed users (neighbor user size = 50)

Model Owner                   User 1      User 2      User 3      User 4      User 5      Average
TCFMA + cosine   Time(ms)     5786.81     11576.54    9776.97     12676.54    9425.59     9848.49
                 # User       292.64      861.94      680.08      953.54      636.64      684.968
Random           Time(ms)     31590.24    30129.18    31966.27    23209.48    20977.24    27574.48
                 # User       4379.48     4209.75     4505.89     3315.13     2962.29     3874.51

CONCLUSION

We proposed a novel TCFMA architecture to solve the problems that can occur in online CF recommender systems related to an improper use of personal information and a profile injection attack.

We obtained very good robustness against malicious attacks without any degradation of prediction quality, compared to general peer-to-peer CF recommender systems.

We also achieved efficient distributed computing for building item-item similarity models by adding useful functionalities of mobile agents.

FUTURE WORK

Trust Decay
- The trust relationship becomes weaker as it is forwarded to its successors.
- It is essential to take this phenomenon into consideration for applying trust propagation algorithms to real-world applications.

Attack Detection
- Automated attack detection algorithms based on diverse types of attack models can lead to more robust recommendation algorithms.

THANK YOU!

TRUST GRAPH CONVERSION - ADVOGATO

Advogato graph transform

function transform ( G = (V, E, CV) ) {
    set E′ ← ∅, V′ ← ∅;
    for all x ∈ V do
        add node x+ to V′;
        add node x- to V′;
        if CV (x) >= 1 then
            add edge (x-, x+) to E′;
            set CE′ (x-, x+) ← CV (x) - 1;
            for all (x, y) ∈ E do
                add edge (x+, y-) to E′;
                set CE′ (x+, y-) ← ∞;
            end do
            add edge (x-, supersink) to E′;
            set CE′ (x-, supersink) ← 1;
        end if
    end do
    return G′ = (V′, E′, CE′);
}
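The transform above can be mirrored in Python roughly as follows. This is a sketch under assumptions: node capacities come as a dictionary, trust edges as a list of pairs, split nodes are named by appending "+" or "-" to the user name, and the supersink is added to the node set explicitly. None of these representation choices are from the original slides.

```python
import math


def transform(capacity, edges):
    """Convert node capacities into edge capacities (Advogato-style split).

    capacity: dict user -> node capacity C_V(x)
    edges:    list of (x, y) pairs meaning "x trusts y"
    Returns (nodes, edge_capacity) of the converted graph G'.
    """
    nodes = {"supersink"}
    edge_capacity = {}
    for x, cap in capacity.items():
        x_in, x_out = f"{x}-", f"{x}+"
        nodes.update((x_in, x_out))
        if cap >= 1:
            # One unit is reserved for x itself, the rest flows to successors.
            edge_capacity[(x_in, x_out)] = cap - 1
            for u, v in edges:
                if u == x:
                    edge_capacity[(x_out, f"{v}-")] = math.inf
            edge_capacity[(x_in, "supersink")] = 1
    return nodes, edge_capacity


# Hypothetical capacities: the seed has capacity 2, user a 1, user b 0.
capacity = {"seed": 2, "a": 1, "b": 0}
edges = [("seed", "a"), ("a", "b")]
nodes, edge_capacity = transform(capacity, edges)
print(sorted(edge_capacity.items()))
```

A user with capacity 0, such as `b` here, gets no edge to the supersink and therefore cannot be accepted as trusted by the flow computation.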

CAPACITY ASSIGNMENT

[Figure: capacity assignment, starting from the trust source.]

CONVERTED GRAPH

[Figure: converted graph with the trust source and the supersink.]

TRUST PROPAGATION & FINDING MIGRATION PATH

Ford-Fulkerson maxflow algorithm

function maxflow (G′, seed, supersink) {
    for each edge (x, y) ∈ E′ in G′ do
        F (x, y) ← 0;
        F (y, x) ← 0;
    end do
    while there exists a path P from seed to supersink in the residual network G′F do
        CF (P) ← min {CF (x, y) : (x, y) in P};
        for each edge (x, y) in P do
            F (x, y) ← F (x, y) + CF (P);
            F (y, x) ← -F (x, y);
        end do
    end while
}
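For experimentation, the augmenting-path loop can be written compactly in Python. The sketch below picks augmenting paths by breadth-first search (the Edmonds-Karp variant of Ford-Fulkerson), which is a choice made here for simplicity and is not stated in the slides; edge capacities are assumed to be given as a dictionary, and the sample graph is hypothetical.

```python
from collections import defaultdict, deque


def maxflow(capacity, seed, supersink):
    """Return the maximum flow from seed to supersink.

    capacity: dict (x, y) -> capacity of the directed edge x -> y
    """
    residual = defaultdict(float)   # residual capacity C_F(x, y)
    neighbors = defaultdict(set)
    for (x, y), cap in capacity.items():
        residual[(x, y)] += cap
        neighbors[x].add(y)
        neighbors[y].add(x)         # reverse edge for flow cancellation

    total = 0
    while True:
        # Breadth-first search for a path with spare residual capacity.
        parent = {seed: None}
        queue = deque([seed])
        while queue and supersink not in parent:
            x = queue.popleft()
            for y in neighbors[x]:
                if y not in parent and residual[(x, y)] > 0:
                    parent[y] = x
                    queue.append(y)
        if supersink not in parent:
            return total            # no augmenting path is left

        # Collect the path and its bottleneck capacity C_F(P).
        path = []
        y = supersink
        while parent[y] is not None:
            path.append((parent[y], y))
            y = parent[y]
        bottleneck = min(residual[edge] for edge in path)

        # Push the bottleneck along the path and update the residual network.
        for x, y in path:
            residual[(x, y)] -= bottleneck
            residual[(y, x)] += bottleneck
        total += bottleneck


# Hypothetical graph with finite capacities.
graph = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 2, ("b", "t"): 3, ("a", "b"): 1}
flow = maxflow(graph, "s", "t")
print(flow)
```

On a converted trust graph, the users whose edge to the supersink carries flow after this computation are the ones accepted as trusted, and they form the candidates for the migration path.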

EXAMPLES

User    Item               Rating
A       Matrix             3
        Space Odyssey      4
        Dark City          4
B       Matrix             4
        Star Wars          3
        Dark City          4
        Ghost Busters      5
C       Space Odyssey      3
        Star Wars          1
        Dark City          3
        Resident Evil      2
D       AI                 4
        Resident Evil      5
        Minority Report    3
        Star Wars          2
        Dark City          1

EXAMPLES

Owner A: Matrix 3, Space Odyssey 4, Dark City 4
Neighbor B: Matrix 4, Star Wars 3, Dark City 4, Ghost Busters 5

Mi: Model owner’s items
Ni: Neighbor’s items

Before visiting B:

               Star Wars    Ghost Busters    Resident Evil
Matrix         ∨            ∨
Dark City      0.8910       ∨                0.6459
AI             0.9487                        0.9191

After visiting B:

               Star Wars    Ghost Busters    Resident Evil
Matrix         0.9995       0.9856
Dark City      0.9134       0.8957           0.6459
AI             0.9487                        0.9191

Updated entry for (Dark City, Star Wars), shown as 0.9134:

$$sim(\text{Dark City}, \text{Star Wars}) = \frac{4 \cdot 3 + 3 \cdot 1 + 1 \cdot 2 + 4 \cdot 3}{\sqrt{4^2 + 3^2 + 1^2 + 4^2}\,\sqrt{3^2 + 1^2 + 2^2 + 3^2}}$$

EXAMPLES

Owner A: Matrix 3, Space Odyssey 4, Dark City 4
Neighbor C: Space Odyssey 3, Star Wars 1, Dark City 3, Resident Evil 2

Mi: Model owner’s items
Ni: Neighbor’s items

Before visiting C:

               Star Wars    Ghost Busters    Resident Evil
Matrix         0.9995       0.9856
Dark City      0.9134       0.8957           0.6459
AI             0.9487                        0.9191

After visiting C:

               Star Wars    Ghost Busters    Resident Evil
Matrix         0.9995       0.9856
Dark City      0.9146       0.8957           0.6459
Space Odyssey  ∨                             ∨
AI             0.9487                        0.9191

Updated entry for (Dark City, Star Wars), shown as 0.9146:

$$sim(\text{Dark City}, \text{Star Wars}) = \frac{4 \cdot 3 + 3 \cdot 1 + 1 \cdot 2 + 4 \cdot 3 + 3 \cdot 1}{\sqrt{4^2 + 3^2 + 1^2 + 4^2 + 3^2}\,\sqrt{3^2 + 1^2 + 2^2 + 3^2 + 1^2}}$$

EXAMPLES

Owner A: Matrix 3, Space Odyssey 4, Dark City 4
Neighbor D: AI 4, Resident Evil 5, Minority Report 3, Star Wars 5, Dark City 1

Before visiting D:

               Star Wars    Ghost Busters    Resident Evil
Matrix         0.9995       0.9856
Dark City      0.9146       0.8957           0.6459
Space Odyssey  ∨                             ∨
AI             0.9487                        0.9191

After visiting D:

               Star Wars    Ghost Busters    Resident Evil    Minority Report
Matrix         0.9995       0.9856
Dark City      0.7329       0.8957           0.5658           ∨
Space Odyssey  ∨                             ∨
AI             0.9487                        0.8967           ∨

Updated entry for (Dark City, Star Wars), shown as 0.7329:

$$sim(\text{Dark City}, \text{Star Wars}) = \frac{4 \cdot 3 + 3 \cdot 1 + 1 \cdot 2 + 4 \cdot 3 + 3 \cdot 1 + 1 \cdot 5}{\sqrt{4^2 + 3^2 + 1^2 + 4^2 + 3^2 + 1^2}\,\sqrt{3^2 + 1^2 + 2^2 + 3^2 + 1^2 + 5^2}}$$
