personalizing the web: building effective recommender systems

Center for Web IntelligenceCenter for Web IntelligenceSchool of CTI, DePaul UniversityChicago, Illinois, USA

Personalizing the Web:Building effective recommender systems

Bamshad Mobasher

Center for Web Intelligence

School of Computer Science, Telecommunication, and Information SystemsDePaul University, Chicago, Illinois, USA


2

Outline

Web Personalization & Recommender systemsBasic Approaches & Algorithms

Special focus on collaborative filtering

Extending Traditional ApproachesHybrid modelsPersonalization Based on Data Mining

Vulnerability of Collaborative Filtering to Attacks


3

Web Personalization

The Problem Dynamically serve customized content (pages, products, recommendations,

etc.) to users based on their profiles, preferences, or expected interests

Common Approaches

Collaborative Filtering Give recommendations to a user based on preferences of “similar” users Preferences on items may be explicit or implicit

Content-Based Filtering Give recommendations to a user based on items with “similar” content in

the user’s profile

Rule-Based (Knowledge-Based) Filtering Provide recommendations to users based on predefined (or learned) rules age(x, 25-35) and income(x, 70-100K) and childred(x, >=3) recommend(x,

Minivan)


4

Content-Based Recommender Systems


5

Content-Based Recommenders: Personalized Search Agents

How can the search engine determine the “user’s context”?

Query: “Madonna and Child”

?

?

Need to “learn” the user profile: User is an art historian? User is a pop music fan?


6

Collaborative Recommender Systems


7



8



9

Other Forms of Collaborative Filtering

Social Tagging (Folksonomy) people add free-text tags

to their content where people happen to

use the same terms then their content is linked

frequently used terms floating to the top to create a kind of positive feedback loop for popular tags.

Examples: Del.icio.us Flickr QLoud & iTunes


10

The Recommendation Task

Basic formulation as a prediction problem

Typically, the profile Pu contains preference scores by u on some other items, {i1, …, ik} different from it preference scores on i1, …, ik may have been obtained explicitly

(e.g., movie ratings) or implicitly (e.g., time spent on a product page or a news article)

Given a profile Pu for a user u, and a target item it, predict the preference score of user u on item it

Given a profile Pu for a user u, and a target item it, predict the preference score of user u on item it


11

Content-Based Recommenders

Predictions for unseen (target) items are computed based on their similarity (in terms of content) to items in the user profile.

E.g., user profile Pu contains

recommend highly: and recommend “mildly”:


12


Collaborative filtering recommenders Predictions for unseen (target) items are computed based the other

users’ with similar interest scores on items in user u’s profilei.e. users with similar tastes (aka “nearest neighbors”)requires computing correlations between user u and other users

according to interest scores or ratingsk-nearest-neighbor (knn) strategy

Can we predict Karen’s rating on the unseen item Independence Day?Can we predict Karen’s rating on the unseen item Independence Day?


13

Basic Collaborative Filtering Process

Neighborhood Formation Phase

Recommendations

NeighborhoodFormation

NeighborhoodFormation

RecommendationEngine

RecommendationEngine

Current User Record

HistoricalUser Records

user item rating

<user, item1, item2, …>

NearestNeighbors

CombinationFunction

Recommendation Phase

Both of the Neighborhood formation and the recommendation phases are real-time components


14

Collaborative Filtering: Measuring Similarities

Pearson Correlation weight by degree of correlation between user U and user J

1 means very similar, 0 means no correlation, -1 means dissimilar

Works well in case of user ratings (where there is at least a range of 1-5)

Not always possible (in some situations we may only have implicit binary values, e.g., whether a user did or did not select a document)

Alternatively, a variety of distance or similarity measures can be used

Average rating of user Jon all items.2 2

( )( )

( ) ( )UJ

U U J Jr

U U J J


15

Collaborative filtering recommenders Predictions for unseen (target) items are computed based the

other users’ with similar interest scores on items in user u’s profilei.e. users with similar tastes (aka “nearest neighbors)requires computing correlations between user u and other

users according to interest scores or ratings

prediction

Correlation to KarenCorrelation to KarenPredictions for Karen on Indep. Day based on the K nearest neighbors

Predictions for Karen on Indep. Day based on the K nearest neighbors



16

Collaborative Filtering: Making Predictions

When generating predictions from the nearest neighbors, neighbors can be weighted based on their distance to the target user

To generate predictions for a target user a on an item i:

ra = mean rating for user a

u1, …, uk are the k-nearest-neighbors to a

ru,i = rating of user u on item I

sim(a,u) = Pearson correlation between a and u

This is a weighted average of deviations from the neighbors’ mean ratings (and closer neighbors count more)

k

u

k

u uiuaia

uasim

uasimrrrp

1

1 ,,

),(

),()(


17

Example Collaborative System

Item1 Item 2 Item 3 Item 4 Item 5 Item 6 Correlation with Alice

Alice 5 2 3 3 ?

User 1 2 4 4 1 -1.00

User 2 2 1 3 1 2 0.33

User 3 4 2 3 2 1 .90

User 4 3 3 2 3 1 0.19

User 5 3 2 2 2 -1.00

User 6 5 3 1 3 2 0.65

User 7 5 1 5 1 -1.00

Bestmatch

Prediction

Using k-nearest neighbor with k = 1


18

Item-based Collaborative Filtering

Find similarities among the items based on ratings across users Often measured based on a variation of Cosine measure

Prediction of item I for user a is based on the past ratings of user a on items similar to i.

Suppose:

Predicted rating for Karen on Indep. Day will be 7, because she rated Star Wars 7 That is if we only use the most similar item Otherwise, we can use the k-most similar items and again use a weighted

average

sim(Star Wars, Indep. Day) > sim(Jur. Park, Indep. Day) > sim(Termin., Indep. Day)


19

Item-Based Collaborative Filtering

Item1 Item 2 Item 3 Item 4 Item 5 Item 6

Alice 5 2 3 3 ?

User 1 2 4 4 1

User 2 2 1 3 1 2

User 3 4 2 3 2 1

User 4 3 3 2 3 1

User 5 3 2 2 2

User 6 5 3 1 3 2

User 7 5 1 5 1

Item similarity

0.76 0.79 0.60 0.71 0.75Bestmatch

Prediction


20

Collaborative Filtering: Evaluation

split users into train/test setsfor each user a in the test set:

split a’s votes into observed (I) and to-predict (P)measure average absolute deviation between

predicted and actual votes in PMAE = mean absolute error

average over all test users


21

Semantically Enhanced Collaborative Filtering

Basic Idea: Extend item-based collaborative filtering to incorporate both

similarity based on ratings (or usage) as well as semantic similarity based on domain knowledge

Semantic knowledge about items Can be extracted automatically from the Web based on domain-

specific reference ontologies Used in conjunction with user-item mappings to create a

combined similarity measure for item comparisons Singular value decomposition used to reduce noise in the

semantic data

Semantic combination threshold Used to determine the proportion of semantic and rating (or

usage) similarities in the combined measure


22

Semantically Enhanced Hybrid Recommendation

An extension of the item-based algorithm Use a combined similarity measure to compute item similarities:

where, SemSim is the similarity of items ip and iq based on semantic

features (e.g., keywords, attributes, etc.); andRateSim is the similarity of items ip and iq based on user

ratings (as in the standard item-based CF) is the semantic combination parameter:

= 1 only user ratings; no semantic similarity = 0 only semantic features; no collaborative similarity


23

Semantically Enhanced CF

Movie data set Movie ratings from the movielens data set Semantic info. extracted from IMDB based on the following

ontology

Movie

Actor DirectorYearName Genre

Genre-All

Romance Comedy

Romantic Comedy

Black Comedy

Kids & Family

Action

Actor

Name Movie Nationality

Director


Movie


Movie


Genre-All

Romance Comedy

Romantic Comedy

Black Comedy

Kids & Family

Action

Genre-All

Romance Comedy

Romantic Comedy

Black Comedy

Kids & Family

Action

Actor


Actor


Director



24

Semantically Enhanced CF Used 10-fold x-validation on randomly selected test and training

data sets Each user in training set has at least 20 ratings (scale 1-5)

Movie Data SetRating Prediction Accuracy

0.71

0.72

0.73

0.74

0.75

0.76

0.77

0.78

0.79

0.8

No. of Neighbors

MA

E

enhanced standard

Movie Data Set Impact of SVD and Semantic Threshold

0.725

0.73

0.735

0.74

0.745

0.75

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Alpha

MA

E

SVD-100 No-SVD


25

Semantically Enhanced CF

Dealing with new items and sparse data sets For new items, select all movies with only one rating as the test data Degrees of sparsity simulated using different ratios for training data

Movie Data Set Prediction Accuracy for New Items

0.72

0.74

0.76

0.78

0.8

0.82

0.84

0.86

0.88

5 10 20 30 40 50 60 70 80 90 100 110 120

No. of Neighbors

MA

E

Avg. Rating as Prediction Semantic Prediction

Movie Data Set% Improvement in MAE

0

5

10

15

20

25

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1

Train/Test Ratio

% Im

pro

vem

ent


26

Collaborative Filtering: Problems

Problems with standard CF major problem with CF is scalability

neighborhood formation is done in real-time small number of users relative to items may result in poor performance

data become too sparse to provide accurate predictions “new item” problem Vulnerability to attacks (will come back to this later)

Problems in context of clickstream / e-commerce data explicit user ratings are not available

features are binary (visit or a non-visit for a particular item) or a function of the time spent on a particular item

a visit to a page is not necessarily an indication of interest in that item number of user records (and items) is far larger than the standard domains

for CF where users are limited to purchasers or people who rated items need to rely on very short user histories


27

Web Mining Approach to Personalization

Basic Idea generate aggregate user models (usage profiles) by discovering user

access patterns through Web usage mining (offline process)Clustering user transactionsClustering itemsAssociation rule miningSequential pattern discovery

match a user’s active session against the discovered models to provide dynamic content (online process)

Advantages no explicit user ratings or interaction with users helps preserve user privacy, by making effective use of anonymous data enhance the effectiveness and scalability of collaborative filtering


28

Web Usage Mining

Web Usage Mining discovery of meaningful patterns from data generated by user access to

resources on one or more Web/application servers

Typical Sources of Data: automatically generated Web/application server access logs e-commerce and product-oriented user events (e.g., shopping cart

changes, product clickthroughs, etc.) user profiles and/or user ratings meta-data, page content, site structure

User Transactions sets or sequences of pageviews possibly with associated weights a pageview is a set of page files and associated objects that contribute

to a single display in a Web Browser


29

Personalization Based on Web Usage Mining

Offline Process

Web &ApplicationServer Logs

Data CleaningPageview Identification

SessionizationData Integration

Data Transformation

Data Preprocessing

UserTransactionDatabase

Transaction ClusteringPageview ClusteringCorrelation Analysis

Association Rule MiningSequential Pattern Mining

Usage Mining

Patterns

Pattern FilteringAggregation

Characterization

Pattern Analysis

Site Content& Structure

Domain Knowledge

AggregateUsage Profiles

Data Preparation Phase Pattern Discovery Phase


30

Personalization Based on Web Usage Mining:

Online Process

Recommendation EngineRecommendation Engine

Web Server Client BrowserActive Session

RecommendationsIntegrated User Profile

AggregateUsage Profiles

<user,item1,item2,…>

Stored User Profile

Domain Knowledge


31

Conceptual Representation of User Transactions or Sessions

A B C D E Fuser0 15 5 0 0 0 185user1 0 0 32 4 0 0user2 12 0 0 56 236 0user3 9 47 0 0 0 134user4 0 0 23 15 0 0user5 17 0 0 157 69 0user6 24 89 0 0 0 354user7 0 0 78 27 0 0user8 7 0 45 20 127 0user9 0 38 57 0 0 15

Session/user data

Pageview/objects

Raw weights are usually based on time spent on a page, but in practice, need to normalize and transform.


32

Web Usage Mining: clustering example

Transaction Clusters: Clustering similar user transactions and using centroid of each

cluster as a usage profile (representative for a user segment)

Support URL Pageview Description

1.00 /courses/syllabus.asp?course=450-96-303&q=3&y=2002&id=290

SE 450 Object-Oriented Development class syllabus

0.97 /people/facultyinfo.asp?id=290 Web page of a lecturer who thought the above course

0.88 /programs/ Current Degree Descriptions 2002

0.85 /programs/courses.asp?depcode=96&deptmne=se&courseid=450

SE 450 course description in SE program

0.82 /programs/2002/gradds2002.asp M.S. in Distributed Systems program description

Sample cluster centroid from CTI Web site (cluster size =330)


33

Using Clusters for Personalization

A.html B.html C.html D.html E.html F.htmluser0 1 1 0 0 0 1user1 0 0 1 1 0 0user2 1 0 0 1 1 0user3 1 1 0 0 0 1user4 0 0 1 1 0 0user5 1 0 0 1 1 0user6 1 1 0 0 0 1user7 0 0 1 1 0 0user8 1 0 1 1 1 0user9 0 1 1 0 0 1

A.html B.html C.html D.html E.html F.htmlCluster 0 user 1 0 0 1 1 0 0

user 4 0 0 1 1 0 0user 7 0 0 1 1 0 0

Cluster 1 user 0 1 1 0 0 0 1user 3 1 1 0 0 0 1user 6 1 1 0 0 0 1user 9 0 1 1 0 0 1

Cluster 2 user 2 1 0 0 1 1 0user 5 1 0 0 1 1 0user 8 1 0 1 1 1 0

PROFILE 0 (Cluster Size = 3)--------------------------------------1.00 C.html1.00 D.html

PROFILE 1 (Cluster Size = 4)--------------------------------------1.00 B.html1.00 F.html0.75 A.html0.25 C.html

PROFILE 2 (Cluster Size = 3)--------------------------------------1.00 A.html1.00 D.html1.00 E.html0.33 C.html

Original Session/user

data

Result ofClustering

Given an active session A B, the best matching profile is Profile 1. This may result in a recommendation for page F.html, since it appears with high weight in that profile.

Given an active session A B, the best matching profile is Profile 1. This may result in a recommendation for page F.html, since it appears with high weight in that profile.


34

Profile Injection Attacks

Consist of a number of "attack profiles" added to the system by providing ratings for various items engineered to bias the system's recommendations Two basic types:

“Push attack” (“Shilling”): designed to promote an item“Nuke attack”: designed to demote a item

Prior work has shown that CF recommender systems are highly vulnerable to such attacks

Attack Models strategies for assigning ratings to items based on

knowledge of the system, products, or users examples of attack models: “random”, “average”,

“bandwagon”, “segment”, “love-hate”


35

A Successful Push Attack

Item1 Item 2 Item 3 Item 4 Item 5 Item 6 Correlation with Alice

Alice 5 2 3 3 ?

User 1 2 4 4 1 -1.00

User 2 2 1 3 1 2 0.33

User 3 4 2 3 2 1 .90

User 4 3 3 2 3 1 0.19

User 5 3 2 2 2 -1.00

User 6 5 3 1 3 2 0.65

User 7 5 1 5 1 -1.00

Attack 1 2 3 2 5 -1.00

Attack 2 3 2 3 2 5 0.76

Attack 3 3 2 2 2 5 0.93

Prediction

Best

Match

“user-based” algorithm using k-nearest neighbor with k = 1


36

A Generic Attack Profile

Attack models differ based on ratings assigned to filler and selected items

… … … it

… … null null null

Ratings for k selected items

Rating for the target item

1Si S

ki

IS

1Fi F

li

IF

1i

vi

I

Ratings for l filler items

Unrated items in the attack profile

1( )Fi ( )Fli

1( )Si ( )Ski ( )ti


37

Random ratings for l filler items

Average and Random Attack Models

Random Attack: filler items are assigned random ratings drawn from the overall distribution of ratings on all items across the whole DB

Average Attack: ratings each filler item drawn from distribution defined by average rating for that item in the DB

The percentage of filler items determines the amount knowledge (and effort) required by the attacker

… … it

… null null null rmax


1Fi F

li

IF

1i

vi

I


1( )Fi ( )Fli


38

Bandwagon Attack Model

What if the system's rating distribution is unknown? Identify products that are frequently rated (e.g., “blockbuster” movies) Associate the pushed product with them Ratings for the filler items centered on overall system average rating

(Similar to Random attack) frequently rated items can be guessed or obtained externally

… … … it

rmax … rmax … null null null rmax

Ratings for k frequently rated items


1Si S

ki

IS

1Fi F

li

IF

1i

vi

I

Random ratings for l filler items


1( )Fi ( )Fli


39

Segment Attack Model

Assume attacker wants to push product to a target segment of users those with preference for similar products

fans of Harrison Ford fans of horror movies

like bandwagon but for semantically-similar items originally designed for attacking item-based CF algorithms

maximize sim(target item, segment items)minimize sim(target item, non-segment items)

… … … it

rmax … rmax rmin … rmin null null null rmax

Ratings for k favorite items in user segment


1Si S

ki

IS

1Fi F

li

IF

1i

vi

I

Ratings for l filler items



40

Nuke Attacks: Love/Hate Attack Model

… … it

rmax … rmax null null null rmin

Min rating for the target item

1Fi F

li

IF

1i

vi

I

Max rating for l filler items


A limited-knowledge attack in its simplest form Target item given the minimum rating value All other ratings in the filler item set are given the maximum rating value

Note: Variations of this (an the other models) can also be used as a push or

nuke attacks, essentially by switching the roles of rmin and rmax.


41

How Effective Can Attacks Be?

First A Methodological Note Using MovieLens 100K data set 50 different "pushed" movies

selected randomly but mirroring overall distribution 50 users randomly pre-selected

Results were averages over all runs for each movie-user pair K = 20 in all experiments Evaluating results

prediction shift how much the rating of the pushed movie differs before and

after the attackhit ratio

how often the pushed movie appears in a recommendation list before and after the attack


42

Example Results: Average Attack Average attack is very effective against user based algorithm

(Random not as effective) Item-based CF more robust (but vulnerable to other attack

types such as “segment attack” [Burke & Mobasher, 2005]

Average attack

0

0.2

0.40.6

0.8

1

1.21.4

1.6

1.8

0% 3% 6% 9% 12% 15%

Attack Size

Pre

dic

tio

n S

hif

t

User Based Item Based


43

Example Results: Bandwagon Attack Only a small profile needed (3%-7%) Only a few (< 10) popular movies needed As effective as the more data-intensive average attack (but still not

effective against item-based algorithms)

Bandwagon and Average Attacks

00.2

0.40.6

0.81

1.21.4

1.6

0% 3% 6% 9% 12% 15%

Attack Size

Pre

dic

tio

n S

hif

t

Average(10%) Bandwagon(6%)

Bandwagon and Average Attacks(10% attack size)

0

0.2

0.4

0.6

0.8

1

0 10 20 30 40 50 60

# of recommendations

Hit

Rat

io

Average Attack Bandwagon Attack Baseline


44

Results: Impact of Profile Size

Only a small number of filler items need to be assigned ratings. An attacker, therefore, only needs to use part of the product space to make the attack effective.

In the item-based algorithm we don’t see the same drop-off, but prediction shift shows a logarithmic behavior – near maximum at about 7% filler size.


45

Example Results: Segmented Attack Against Item-Based CF

Item-Based Algorithm: 1% Attack against the Horror Movie Segment

0%

10%

20%

30%

40%

50%

60%

0 10 20 30 40 50

# of Recommendations

Hit

Rat

io

in-segment all-user pre-attack

Item-Based Algorithm: Horror Movie Segment

0

0.2

0.4

0.6

0.8

1

1.2

0% 5% 10% 15%

Attack Size

Pre

dic

tio

n S

hif

t

in-segment all-user

•Very effective against targeted group•Best against item-based•Also effective against user-based

•Low knowledge


46

Possible Solutions

Explicit trust calculation? select peers through network of trust relationships law of large numbers

hard to achieve numbers needed for CF to work wellHybrid recommendation

Some indications that some hybrids may be more robustModel-based recommenders

Certain recommenders using clustering are more robust, but generally at the cost of less accuracy

But a probabilistic approach has been shown to be relatively accurate [See: Model-Based Collaborative Filtering as a Defense Against Profile Injection Attacks, B. Mobasher, R. Burke, JJ Sandvig. AAAI 2006, Boston.]

Detection and Response


47

Results: Semantically Enhanced Hybrid

Alpha 0.0 = 100% semantic item-based similarityAlpha 1.0 = 100% collaborative item-based similarity

Hybrid Algorithm 10% Horror Segment Attack at Alpha = 0.4

0

0.1

0.2

0.3

0.4

0.5

0.6

0 10 20 30 40 50

# of RecommendationsH

it R

ati

o

Hybrid Item based

Hybrid Algorithm - Impact of Semantic / Collaborative Combination (Alpha) on

Prediction Accuracy

0.56

0.57

0.58

0.59

0.6

0.61

0.62

0 0.2 0.4 0.6 0.8 1

Alpha

MA

E

Semantic features extracted for movies: top actors, director, genre, synopsis (top keywords), etc.


48

Approaches to Detection & Response

Profile Classification Classification model to identify attack profiles and exclude these

profiles in computing predictions Uses the characteristic features of most successful attack models Designed to increase cost of attacks by detecting most effective attacks

Anomaly Detection Classify Items (as being possibly under attack)

Not dependent on known attack models Can shed some light on which type of items are most vulnerable to

which types of attacks

But, what if the attack does not closely correspond to known attack signature

In Practice: need a comprehensive framework combining both approaches


49

Anomaly Detection: Using Control Charts

0

0.5

1

1.5

2

2.5

3

3.5

1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65

Items

Item

's a

vera

ge

rati

ng

Upper and lower boundaries on average ratings of items used as signal thresholds for push and nuke attacks, respectively.

A new item’s average rating

Observations: avg. ratings on training items in a particular category, assuming no biased ratings


50

Anomaly Detection: Using Time Series

0

2

4

6

1 6 11 16 21 26 31 36 41 46 51 56 61 66 71

time interval

aver

age

rati

ng

per

in

terv

alw ithout attack push nuke

A sudden change in an item’s mean rating may indicate a suspicious pattern


51

Anomaly Detection Results

SPC can be effective in identifying items under attack

Time series effective in long-term monitoring of items

Detection performance highly affected by the rating density and popularity of items

For more on the anomaly detection approach see:

Securing Collaborative Filtering Against Malicious Attacks Through Anomaly Detection. R. Bhaumik, C. Williams, B. Mobasher, R. BurkeIn Proceedings of the 4th Workshop on Intelligent Techniques for Web Personalization (ITWP'06), held at AAAI 2006, Boston, July 2006.


52

Classification-Based Approach to Detection

Profile Classification Automatically identify attack profiles and exclude them from predictions Reverse-engineered profiles likely to be most damaging Increase cost of attacks by detecting most effective attacks Characteristics of known attack models are likely to appear in other

effective attacks as well

Basic Approach Create attributes that capture characteristics of suspicious profiles Use attributes to build classification models Apply model to user profiles to identify and discount potential attacks

Two Types of Detection Attributes Generic – Focus on overall profile characteristics

Model-specific – based on characteristics of specific attack models Partition profile to maximize similarity to known models Generate attributes related to partition characteristics


53

Methodological Note for Detection Results

Data set Using MovieLens 100K data set Data split 50% training, 50% test

Profile classifier - Supervised training approach kNN classifier, k=9 Training data

Half of actual data labeled as “Authentic” Insert a mix of attack profiles built from several attack models labeled as “Attack”

Test data Start with second half of actual data Insert test attack profiles targeting different movies than targeted in training data

Recommendation Algorithm User based kNN, k = 20

Evaluating results 50 different target movies

selected randomly but mirroring overall distribution 50 users randomly pre-selected

Results were averaged over all runs for each movie-user pair


54

Evaluation Metrics

Detection attribute value: Information Gain – attack profile vs. authentic profile

Classification performance:True positive = # of attack profiles correctly identified

False positive = # of authentic profiles misclassified as attacks

False negatives = # of attack profiles misclassified as authentic

Precision = true positives / (true pos. + false pos.)

Percent of profiles identified as attacks that are attacks Recall = true positives / (true pos. + false negatives)

Percent of attack profiles that were identified correctly

Recommender robustness: Prediction shift – change in recommender’s prediction resulting

from the attack


55

Classification Effectiveness: Average and Random Push Attacks

Push attack precision

0%

10%

20%

30%

40%

50%

60%

0% 20% 40% 60% 80% 100%

Filler Size

Pre

cisi

on

Average-Model detection Random-Model detection

Average-Chirita detection Random-Chirita detection

Push attack recall

0%

20%

40%

60%

80%

100%

0% 20% 40% 60% 80% 100%

Filler Size

Rec

all


Average-Chirita detection Random-Chirita detection

Note: As a baseline we compared our classifier with the ad hoc approach for attack detection by Chirita et al., WIDM 2005, which does not use all of the proposed attributes and does not build a classification model.

Note: As a baseline we compared our classifier with the ad hoc approach for attack detection by Chirita et al., WIDM 2005, which does not use all of the proposed attributes and does not build a classification model.


56

Robustness:Impact of Detection on Prediction Shift Due to Attacks

Push attack prediction shift (3% filler size)

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

0% 2% 4% 6% 8% 10% 12% 14%

Attack Size

Pre

dic

tio

n S

hif

tAverage-No detection Random-No detection



57

Attacks in Collaborative Recommenders: Summary

Collaborative spam (clam?) Worse than we thought; common algorithms vulnerable;

targeting quite easy to achieve Attacks, if designed correctly, can require very limited system-

or user-specific knowledge

Need to understanding properties of attack models Can help in designing more robust algorithms

E.g., hybrid and model-based algorithms Needed fro effective detection and response

Most effective attacks are those that mimic known attack models


58

ConclusionsWhy recommender systems?

Many algorithmic advances more accurate and reliable systems more confidence by users

Assist users inFinding more relevant information, items, productsGive users alternatives broaden user knowledgeBuilding communities

Help companies toBetter engage users and customers building loyaltyIncrease sales (on average 5-10%)

Problems and challenges More complex Web-based applications more complex user

interactions need more sophisticated models Need to further explore the impact of recommendations on (a)

user behavior and (b) on the evolution of Web communities Privacy, security, trust


59

?


60

A Push Attack Against Item-Based Algorithm

Item1 Item 2 Item 3 Item 4 Item 5 Item 6

Alice 5 2 3 3 ?

User 1 2 4 4 1

User 2 2 1 3 1 2

User 3 4 2 3 2 1

User 4 3 3 2 3 1

User 5 3 2 2 2

User 6 5 3 1 3 2

User 7 5 1 5 1

Attack 1 5 1 1 1 1 5

Attack 2 5 1 1 1 1 5

Attack 3 5 1 1 1 1 5

Item similarity

0.89 0.53 0.49 0.70 0.50

Prediction

BestMatch


61

Examples of Generic Attributes

Weighted Deviation from Mean Agreement (WDMA) Average difference in profile’s rating from mean rating on

each item weighted by the item’s inverse rating frequency squared

Weighted Degree of Agreement (WDA) Sum of profile’s rating agreement with mean rating on

each item weighted by inverse rating frequency

Average correlation of the profile's k nearest neighbors Captures rogue profiles that are part of large attacks with

similar characteristics

Variance in the number of ratings in a profile compared to the average number of ratings per user Few real users rate a large # of items

,

20WDMA

un u i i

i iu

u

r r

l

n

,

0

WDAun u i i

ui i

r r

l

2

0

# #LengthVar

(# # )

j

j N

ji

ratings ratings

ratings ratings

1DegSim

k

iji

j

W

k


62

Model Specific Attributes

Partition profile to maximize similarity to known models

Generate attributes related to partition characteristics that would stand out if the profile was that type of attack


63

Examples of Model Specific Attributes

Average attack detection model Partition profile to minimize variance in

ratings in Pu,F from mean rating for each item

For average attack, the mean variance of the filler partition is likely less than an authentic user

Segment attack detection model Partition profile into items with high ratings

and low ratings For segment attack, the difference between

the average rating of these two groups is likely greater than that of an authentic user

Target focus detection model (TMF) Use the identified Pu,T partitions to identify

concentrations of items under attack across all profiles

ivØ…i1ØilF…………i1Fit ivØ…i1ØilF…………i1Fit

iu,t Iu,F Iu,Ø

Pu,FPu,T

arg

2

,( )

argMeanVar( , )| |

j t et

i j ii P r

t et

r r

r jK

ivØ…i1ØilF…i1FikS…i1Sit ivØ…i1ØilF…i1FikS…i1Sit

iu,t Iu,S Iu,F Iu,Ø

Pu,FPu,T

, ,, ,

, ,

FMTD u T u Fu i u ki P k P

u

u T u F

r r

P P

personalizing the web: building effective recommender systems

Documents

preference score of

unseen target items

users context

similar content

terms of content

similar tastes

itpreference scores

unseen item independence