A quick introduction to item-based collaborative filtering
Pham Cong Dinh (@pcdinh), PHPDay Saigon 2012
Outline
● PHP popularity and challenges to produce engaging content
● Recommendation engine at work
● How to build an item-based collaborative filtering recommendation engine
PHP is everywhere
● W3Techs report in 2012
PHP website distribution
● Reported by builtwith.com in 2012 (more than 28 million sites in PHP)
You have a website. Now what?
Information overload
From http://bethesignal.org/
OR no engaging content?
Why a recommendation system?
Recommendation engine at work
Build a recommendation system
● Collaborative filtering: users and items
– Filtering: automatic predictions about the interests of a user
– Collaborative: many users (preferences or taste information)
Item-based collaborative filtering
● Model-based
– The similarities between different items in the data set are calculated
– Predict ratings for user-item pairs not present in the data set
Steps to do item-based collaborative filtering
● Data collection and representations (preferences/taste …)
● Finding the relationships and determining the similarity
● Recommendation computations - recommendations/suggestions/discoveries (produce engaging content)
Collaborative filtering: data collection
● Data collection and representations (preferences/taste …)
– Clicks
– Likes, favorites
– Watch, read
– Survey
– Ratings
– Others …
● E.g.: Find the set of movies that user X likes
(user, item) pairs:
(X, 1)
(X, 2)
(Y, 1)
(Y, 2)
(Z, 2)
(Z, 3)
Collaborative filtering: Similarity (1)
● Finding the relationships and determining the similarity
– The similarity value between two items is measured by observing all the users who have interacted with (rated) both items
● E.g.: Find a group of movies similar to the set of movies that we know user X likes
Collaborative filtering: Similarity (2)
● Manhattan distance: |x1 – x2| + |y1 – y2|
User(x, y): Amy(5, 5), Bill(2, 5), Jim(1, 4)
Item(x1, x2, x3) → Ratings: Snow Crash (5, 2, 1), Girl with the Dragon Tattoo (5, 5, 1)
Manhattan distance:
→ Amy – Bill: |5 – 2| + |5 – 5| = 3
→ Snow Crash – Girl with the Dragon Tattoo: 3
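The Manhattan distance above is just the sum of absolute coordinate differences. A Python sketch using the slide's numbers:

```python
def manhattan(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# User ratings from the slide: Amy(5, 5), Bill(2, 5)
print(manhattan((5, 5), (2, 5)))  # |5-2| + |5-5| = 3

# Item rating vectors: Snow Crash (5, 2, 1) vs Girl with the Dragon Tattoo (5, 5, 1)
print(manhattan((5, 2, 1), (5, 5, 1)))  # 3
```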
Collaborative filtering: Similarity (3)
● Cosine similarity: the cosine of the angle between the two rating vectors. Values range from -1 (opposite) to 1 (same direction)
Item(x1, x2, x3) → Ratings: Snow Crash (5, 2, 1), Girl with the Dragon Tattoo (5, 5, 1)
Cosine similarity
→ Snow Crash – Girl with the Dragon Tattoo: (5×5 + 2×5 + 1×1) / (√(5² + 2² + 1²) × √(5² + 5² + 1²)) ≈ 0.92
PHP: https://github.com/aoiaoi/CosineSimilarity/blob/master/CosineSimilarity.php
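The linked example is PHP; the same computation in a few lines of Python (dot product divided by the product of the vector magnitudes):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors; 1 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Snow Crash (5, 2, 1) vs Girl with the Dragon Tattoo (5, 5, 1)
sim = cosine_similarity((5, 2, 1), (5, 5, 1))
print(round(sim, 3))  # ≈ 0.92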
Collaborative filtering: Similarity (4)
● Pearson correlation coefficient: from -1 (perfect negative correlation) to +1 (perfect positive correlation)
● How much the ratings by common users for a pair of items deviate from the average ratings for those items
● Correlation is essentially the average product of these deviations
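As described above, Pearson correlation divides the summed product of deviations from the mean by the product of the standard deviations. A Python sketch on the slide's rating vectors:

```python
import math

def pearson(a, b):
    """Pearson correlation: covariance of the two rating vectors
    divided by the product of their standard deviations."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (std_a * std_b)

# Snow Crash (5, 2, 1) vs Girl with the Dragon Tattoo (5, 5, 1)
sim = pearson((5, 2, 1), (5, 5, 1))
print(round(sim, 3))  # ≈ 0.693
```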
Collaborative filtering: Similarity (5)
● Euclidean distance: the "ordinary" straight-line distance between two points
● Converted to a similarity score (e.g. 1 / (1 + distance)), values range from near 0 (unrelated) to 1 (identical)
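A Python sketch, with the distance mapped into the (0, 1] similarity range via 1 / (1 + distance) (one common choice of conversion; the slide does not name a specific one):

```python
import math

def euclidean(a, b):
    """Ordinary straight-line distance between two rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def euclidean_similarity(a, b):
    """Map distance to (0, 1]: 1 means identical, near 0 means far apart."""
    return 1.0 / (1.0 + euclidean(a, b))

# Snow Crash (5, 2, 1) vs Girl with the Dragon Tattoo (5, 5, 1)
print(euclidean((5, 2, 1), (5, 5, 1)))             # sqrt(0 + 9 + 0) = 3.0
print(euclidean_similarity((5, 2, 1), (5, 5, 1)))  # 1 / (1 + 3) = 0.25
```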
Collaborative filtering: Similarity (6)
● Spearman distance: the squared Euclidean distance between two rank vectors
● Spearman rank correlation: ranges from -1 (perfect negative correlation) to +1 (perfect positive correlation)
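Spearman rank correlation is the Pearson correlation applied to rank vectors. For tie-free data it reduces to the shortcut formula 1 − 6·Σd² / (n(n² − 1)), sketched below on hypothetical tie-free rating vectors:

```python
def spearman(a, b):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)).
    This shortcut formula assumes no tied values in either vector."""
    n = len(a)
    rank = lambda v: [sorted(v).index(x) + 1 for x in v]  # 1 = smallest
    d2 = sum((ra - rb) ** 2 for ra, rb in zip(rank(a), rank(b)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical tie-free rating vectors for two items
print(spearman([5, 2, 1], [4, 5, 1]))    # ranks (3,2,1) vs (2,3,1) -> 0.5
print(spearman([1, 2, 3], [10, 20, 30]))  # identical rank order -> 1.0
```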
Collaborative filtering: Similarity (7)
● Adjusted Euclidean distance: takes the length of the vectors into account (e.g. by normalizing them to unit length first)
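One way to adjust for vector length, sketched below as an assumption since the slide does not spell out the formula: normalize each vector to unit length, then take the ordinary Euclidean distance, so only direction (relative rating pattern) matters:

```python
import math

def normalize(v):
    """Scale a vector to unit length so only its direction matters."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def adjusted_euclidean(a, b):
    """Euclidean distance between the length-normalized vectors."""
    return math.sqrt(sum((x - y) ** 2
                         for x, y in zip(normalize(a), normalize(b))))

# Same rating pattern at different scales -> distance 0 after normalizing
print(adjusted_euclidean([1, 2, 3], [2, 4, 6]))  # 0.0
```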
Collaborative filtering: Recommendation computations
● Calculate the similarity between item A, which user X watched/bought/liked, and the items user X has not watched/bought/liked
● Score all the items (e.g. apply a weighted algorithm: average the user's ratings, weighted by item-item similarity)
● Sort
● Return the top-N items
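The steps above can be sketched as a similarity-weighted scoring pass followed by a sort. The item names and similarity values below are hypothetical, and the weighted-average scoring is one common choice, not the only one:

```python
def recommend(user_ratings, similarity, top_n=3):
    """Score each unrated item as the similarity-weighted average of the
    user's own ratings, then return the top-N items by score."""
    scores, weights = {}, {}
    for rated_item, rating in user_ratings.items():
        for other_item, sim in similarity.get(rated_item, {}).items():
            if other_item in user_ratings:
                continue  # skip items the user already rated
            scores[other_item] = scores.get(other_item, 0.0) + sim * rating
            weights[other_item] = weights.get(other_item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i)
                     for i in scores if weights[i] > 0), reverse=True)
    return [item for _, item in ranked[:top_n]]

# Hypothetical precomputed item-item similarities
similarity = {
    "Snow Crash": {"Dragon Tattoo": 0.92, "Neuromancer": 0.80},
    "Dragon Tattoo": {"Snow Crash": 0.92, "Neuromancer": 0.30},
}
print(recommend({"Snow Crash": 5, "Dragon Tattoo": 4}, similarity))
```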
Collaborative filtering: Other issues
● Accuracy of predicted ratings: to evaluate accuracy when predicting unrated items for the active user, use Mean Absolute Error (MAE).
● Accuracy of recommendations: to evaluate the accuracy of recommendations, use Mean Average Precision (MAP), defined as the mean of the Average Precision (AP) values over a set of queries (in a recommender system, a query can be thought of as a user asking for recommended items).
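MAE is the simpler of the two metrics: the average absolute gap between predicted and actual ratings. A minimal sketch with made-up predicted/actual values:

```python
def mean_absolute_error(predicted, actual):
    """MAE: average absolute difference between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical predicted ratings vs the user's actual ratings
print(mean_absolute_error([4.5, 3.0, 2.0], [5, 3, 1]))  # (0.5 + 0 + 1) / 3 = 0.5
```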
The End
● Q & A