Collaborative Filtering Recommendation Algorithm Based on Hadoop

2015/2/27 Scaling-up Item-based Collaborative Filtering Recommendation Algorithm based on Hadoop Jing Jiang, Jie Lu, Guangquan Zhang, Guodong Long 2011 IEEE World Congress Services


Page 1: Collaborative Filtering Recommendation Algorithm based on Hadoop

2015/2/27

Scaling-up Item-based Collaborative Filtering Recommendation Algorithm based on Hadoop

Jing Jiang, Jie Lu, Guangquan Zhang, Guodong Long 2011 IEEE World Congress Services

Page 2: Collaborative Filtering Recommendation Algorithm based on Hadoop

outline

✤ Collaborative Filtering

✤ scaling-up item-based CF

✤ experimentation and evaluation

Page 3: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ Collaborative filtering (CF) techniques have achieved widespread success in E-commerce nowadays.

Page 4: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ Collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). (Source: Wikipedia)

Page 5: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

1. Weight all users with respect to similarity with active user

2. Select a subset of users to use as a set of predictors

3. Compute a prediction from a weighted combination of selected neighbors’ ratings

Page 6: Collaborative Filtering Recommendation Algorithm based on Hadoop

1. Weight all users with respect to similarity with active user

2. Select a subset of users to use as a set of predictors

3. Compute a prediction from a weighted combination of selected neighbors’ ratings

Simple example

Nathan [5,1,5]

Joe [5,2,5]

John [2,5,2.5]

Al [2,2,4]

Cosine similarity between the active user (Nathan) and each other user:

cos(Nathan, Joe) = 0.99

cos(Nathan, John) = 0.64

cos(Nathan, Al) = 0.91
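The similarity step above can be sketched in a few lines of Python. This is a minimal illustration using the four rating vectors from the slide; the helper name `cosine` and the `ratings` dictionary are assumptions made for this example, not part of the original deck.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rating vectors copied from the slide.
ratings = {
    "Nathan": [5, 1, 5],
    "Joe": [5, 2, 5],
    "John": [2, 5, 2.5],
    "Al": [2, 2, 4],
}

active = "Nathan"
# Step 1 of the algorithm: weight every other user by similarity to the active user.
weights = {
    name: cosine(ratings[active], vec)
    for name, vec in ratings.items()
    if name != active
}

for name, w in weights.items():
    print(f"cos({active}, {name}) = {w:.4f}")
```

The computed value for John is roughly 0.649, which the slide shows truncated to 0.64.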

Page 7: Collaborative Filtering Recommendation Algorithm based on Hadoop

1. Weight all users with respect to similarity with active user

2. Select a subset of users to use as a set of predictors

3. Compute a prediction from a weighted combination of selected neighbors’ ratings

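Simple example (continued)

cos(Nathan, Joe) = 0.99

cos(Nathan, John) = 0.64

cos(Nathan, Al) = 0.91

Predicted rating for Nathan, weighting the neighbors' ratings of the target item (Joe 4, John 3, Al 2) by their similarity to Nathan:

(0.99*4 + 0.64*3 + 0.91*2) / (0.99 + 0.64 + 0.91) = 3.03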
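The weighted combination on this slide can be reproduced with a short Python sketch. The function name `weighted_prediction` is an assumption for illustration; the similarity and rating values are the ones shown on the slide.

```python
def weighted_prediction(neighbors):
    """Similarity-weighted average of neighbor ratings.

    neighbors: list of (similarity, rating) pairs for the target item.
    """
    numerator = sum(sim * rating for sim, rating in neighbors)
    denominator = sum(sim for sim, _ in neighbors)
    if denominator == 0:
        raise ValueError("no usable neighbors")
    return numerator / denominator

# (similarity to Nathan, neighbor's rating of the target item), from the slide.
neighbors = [(0.99, 4), (0.64, 3), (0.91, 2)]
prediction = weighted_prediction(neighbors)
print(round(prediction, 2))  # 3.03
```

Dividing by the sum of the weights keeps the prediction on the same scale as the ratings themselves.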

Page 8: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ User-Based CF: compute similarity based on users

✤ Item-Based CF: compute similarity based on items

Page 9: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ User-Based CF: compute similarity based on users

To predict user A's rating of item4, given that user B rated item4 as 5 and user F rated item4 as 1:

$$\hat{r}_{A,\text{item4}} = \frac{5 \cdot \text{sim}(A, B) + 1 \cdot \text{sim}(A, F)}{\text{sim}(A, B) + \text{sim}(A, F)}$$

Page 10: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ Item-Based CF: compute similarity based on items

To predict user A's rating of item4, given that user A rated item2 as 1 and item3 as 1:

$$\hat{r}_{A,\text{item4}} = \frac{1 \cdot \text{sim}(\text{item2}, \text{item4}) + 1 \cdot \text{sim}(\text{item3}, \text{item4})}{\text{sim}(\text{item2}, \text{item4}) + \text{sim}(\text{item3}, \text{item4})}$$
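A minimal Python sketch of the item-based prediction follows. The function name `predict_item_based` and the two similarity values are hypothetical, chosen only to make the example runnable; the slide does not give numeric similarities.

```python
def predict_item_based(user_ratings, item_sims, target):
    """Similarity-weighted average of the user's own ratings.

    user_ratings: {item: rating} for items the user has already rated
    item_sims:    {(rated_item, target_item): similarity}
    """
    num = 0.0
    den = 0.0
    for item, rating in user_ratings.items():
        sim = item_sims.get((item, target))
        if sim is None:
            continue  # no similarity known for this pair
        num += rating * sim
        den += sim
    if den == 0:
        return None  # nothing to base a prediction on
    return num / den

# User A's ratings, from the slide.
user_a = {"item2": 1, "item3": 1}
# Hypothetical similarities to item4 (not given in the slides).
sims = {("item2", "item4"): 0.8, ("item3", "item4"): 0.3}

pred = predict_item_based(user_a, sims, "item4")
print(pred)
```

Because both of user A's known ratings are 1, the prediction is 1 whatever positive similarities are used: a weighted average of identical values returns that value.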

Page 11: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

The CF algorithm is divided into two steps:

1. Similarity computation

2. Prediction and recommendation

Similarity measure: Pearson correlation (values range from -1 to 1)

Page 12: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

Pearson correlation (values range from -1 to 1)

Covariance
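The formula on this slide did not survive extraction. For reference, the usual form of the Pearson correlation between two items in item-based CF is given below. This is an assumption about what the slide showed, not a transcription: $U$ is taken to be the set of users who rated both items, $R_{u,i}$ is user $u$'s rating of item $i$, and $\bar{R}_i$ is the average rating of item $i$.

```latex
\[
\mathrm{sim}(i, j) =
\frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}
     {\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2}\,
      \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}}
\]
```

In this form the numerator is the (unnormalized) covariance of the two items' ratings, and the denominator is the product of their standard deviations over the same users, which bounds the result between -1 and 1.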

Page 13: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

Similarity computation

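        apple   milk   toast
sam       2      0       4
john      5      5       3
tim       2      4       ?

Rows are users (u); item i = apple, item j = toast.

Ri = (2+5+2)/3 = 3 (average rating of apple)

Rj = (4+3)/2 = 3.5 (average rating of toast)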
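The averages above feed directly into the similarity between apple and toast. The Python sketch below assumes the standard item-based Pearson form (item averages taken over all users who rated the item, sums taken over users who rated both items); the helper names `item_mean` and `pearson` are hypothetical.

```python
import math

# Ratings from the slide's table; None marks the unknown entry ("?").
ratings = {
    "sam":  {"apple": 2, "milk": 0, "toast": 4},
    "john": {"apple": 5, "milk": 5, "toast": 3},
    "tim":  {"apple": 2, "milk": 4, "toast": None},
}

def item_mean(item):
    """Average rating of an item over every user who rated it."""
    vals = [r[item] for r in ratings.values() if r[item] is not None]
    return sum(vals) / len(vals)

def pearson(i, j):
    """Pearson correlation between items i and j over co-rating users."""
    mean_i, mean_j = item_mean(i), item_mean(j)
    co = [(r[i], r[j]) for r in ratings.values()
          if r[i] is not None and r[j] is not None]
    num = sum((a - mean_i) * (b - mean_j) for a, b in co)
    den_i = math.sqrt(sum((a - mean_i) ** 2 for a, _ in co))
    den_j = math.sqrt(sum((b - mean_j) ** 2 for _, b in co))
    if den_i == 0 or den_j == 0:
        return 0.0  # undefined correlation; treated as no similarity here
    return num / (den_i * den_j)

sim = pearson("apple", "toast")
print(item_mean("apple"), item_mean("toast"), round(sim, 4))
```

Only sam and john rated both items, so only their rows enter the sums; tim contributes to the apple average but not to the correlation.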

Page 14: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

Similarity computation

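        apple   milk   toast
sam       2      0       4
john      5      5       3
tim       2      4       ?

Rows are users (u); item j = apple, item i = toast.

Ru(sam) = (2+0+4)/3 = 2 (average rating given by sam)

Rj = (2+5+2)/3 = 3 (average rating of apple)

Ri = (4+3)/2 = 3.5 (average rating of toast)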

Page 15: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

The three parts of intensive computation are:

(1) computing the average rating for each item

(2) computing the similarity between item pairs

(3) computing predicted items for the target user
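To make part (1) concrete, the sketch below simulates a map, shuffle, and reduce pass in plain Python, using the apple/milk/toast ratings from the earlier example. It is not Hadoop code and does not reproduce the paper's implementation; the function names `map_ratings`, `shuffle`, and `reduce_average` are assumptions for illustration.

```python
from collections import defaultdict

def map_ratings(record):
    """Map: (user, item, rating) -> (item, rating), keyed on the item."""
    user, item, rating = record
    yield item, rating

def shuffle(pairs):
    """Group mapped values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_average(item, item_ratings):
    """Reduce: average all ratings collected for one item."""
    return item, sum(item_ratings) / len(item_ratings)

# One record per known rating; tim's toast rating is unknown and omitted.
records = [
    ("sam", "apple", 2), ("sam", "milk", 0), ("sam", "toast", 4),
    ("john", "apple", 5), ("john", "milk", 5), ("john", "toast", 3),
    ("tim", "apple", 2), ("tim", "milk", 4),
]

mapped = [pair for rec in records for pair in map_ratings(rec)]
item_avg = dict(reduce_average(k, v) for k, v in shuffle(mapped).items())
print(item_avg)
```

Keying the map output on the item means each reducer sees all ratings of one item, so the averages can be computed independently and in parallel.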

Page 16: Collaborative Filtering Recommendation Algorithm based on Hadoop

[Figure: MapReduce data flow through steps 1, 2 and 3. Labels: "item i by user j", "map item i".]

Page 17: Collaborative Filtering Recommendation Algorithm based on Hadoop

1

where (symbol not preserved in the transcript) denotes the set of users who rated both item k and item l
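One way to build the set of users who rated both items of a pair is to map over users and emit every pair of items each user rated. The Python sketch below illustrates that idea under stated assumptions: the `user_items` data and the `map_user` helper are hypothetical, and the paper's actual key design may differ.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: the items each user has rated.
user_items = {
    "sam": {"apple", "milk", "toast"},
    "john": {"apple", "milk", "toast"},
    "tim": {"apple", "milk"},
}

def map_user(user, items):
    """Map: emit ((item_k, item_l), user) for every pair the user rated."""
    for k, l in combinations(sorted(items), 2):
        yield (k, l), user

# Shuffle/reduce: collect the co-rating users for each item pair.
co_raters = defaultdict(set)
for user, items in user_items.items():
    for pair, u in map_user(user, items):
        co_raters[pair].add(u)

for pair, users in sorted(co_raters.items()):
    print(pair, sorted(users))
```

Sorting the items before pairing gives each pair a single canonical key, so (apple, toast) and (toast, apple) do not end up in different groups.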

Page 18: Collaborative Filtering Recommendation Algorithm based on Hadoop

[Figure: steps 2 and 3 of the MapReduce flow. Labels: "similarity", "map user j".]

Page 19: Collaborative Filtering Recommendation Algorithm based on Hadoop

experimentation and evaluation

Cluster of 3 nodes

Each node: Intel P4 CPU, 1 GB RAM, 80 GB disk

All machines were connected through one 100 Mbps switch.

Page 20: Collaborative Filtering Recommendation Algorithm based on Hadoop

experimentation and evaluation
