collaborative filtering
DESCRIPTION
Collaborative Filtering. Rong Jin, Department of Computer Science and Engineering, Michigan State University. Outline: brief introduction to information filtering; collaborative filtering; major issues in collaborative filtering; main methods for collaborative filtering.
![Page 1: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/1.jpg)
1
Collaborative Filtering
Rong Jin
Department of Computer Science and Engineering
Michigan State University
![Page 2: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/2.jpg)
2
Outline
  Brief introduction to information filtering
  Collaborative filtering
    Major issues in collaborative filtering
    Main methods for collaborative filtering
    Flexible mixture model for collaborative filtering
    Decoupling model for collaborative filtering
![Page 3: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/3.jpg)
3
Short vs. Long Term Info. Need
Short-term information need (Ad hoc retrieval)
  “Temporary need”, e.g., info about used cars
  Information source is relatively static
  User “pulls” information
  Application example: library search, Web search
Long-term information need (Filtering)
  “Stable need”, e.g., new data mining algorithms
  Information source is dynamic
  System “pushes” information to user
  Applications: news filter
![Page 4: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/4.jpg)
4
Examples of Information Filtering
  News filtering
  Email filtering
  Movie/book/product recommenders
  Literature recommenders
  And many others …
![Page 5: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/5.jpg)
5
Information Filtering
Basic filtering question: Will user U like item X?
Two different ways of answering it
  Look at what U likes → characterize X → content-based filtering
  Look at who likes X → characterize U → collaborative filtering
Combine content-based filtering and collaborative filtering
![Page 6: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/6.jpg)
6
Other Names for Information Filtering
Content-based filtering is also called
  “Adaptive Information Filtering” in TREC
  “Selective Dissemination of Information” (SDI) in Library & Information Science
Collaborative filtering is also called
  Recommender systems
![Page 7: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/7.jpg)
7
Example: Content-based Filtering
History
Description: A homicide detective and a fire marshal must stop a pair of murderers who commit videotaped crimes to become media darlings.
Rating:
Description: Benjamin Martin is drawn into the American Revolutionary War against his will when a brutal British commander kills his son.
Rating:
Description: A biography of sports legend Muhammad Ali, from his early days to his days in the ring.
Rating:
What to Recommend?
Description: A high-school boy is given the chance to write a story about an up-and-coming rock band as he accompanies it on its concert tour.
Recommend: ?
Description: A young adventurer named Milo Thatch joins an intrepid group of explorers to find the mysterious lost continent of Atlantis.
Recommend: ?
No
Yes
![Page 8: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/8.jpg)
8
Example: Collaborative Filtering
User 1 1 5 3 4 3
User 2 4 1 5 2 5
User 3 2 ? 3 5 4
User 3 is more similar to user 1 than user 2
5 for movie “15 minutes” for user 3
5
![Page 9: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/9.jpg)
9
Collaborative Filtering (CF) vs. Content-based Filtering (CBF)
CF does not need the content of items, while CBF relies on the content of items
CF is useful when the content of items
  is not available or difficult to acquire
  is brief and insufficient
Example: movie recommendation
  A movie may be preferred because of
    its actors
    its director
    its popularity
![Page 10: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/10.jpg)
10
Application of Collaborative Filtering
![Page 11: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/11.jpg)
11?
Collaborative Filtering
Goal: Making filtering decisions for an individual user based on the judgments of other users
Users: U (rows u1, u2, …, um, plus the test user utest)
Objects: O (columns o1 o2 … oj oj+1 … on)
u1: 3 1 … … 4 2 ?
u2: 2 5 ? 4 3
um: ? 3 ? 1 2
utest: 3 4 … … 1
![Page 12: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/12.jpg)
12
Collaborative Filtering
Goal: Making filtering decisions for an individual user based on the judgments of other users
General idea
  Given a user u, find similar users {u1, …, um}
  Predict u’s rating based on the ratings of u1, …, um
![Page 13: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/13.jpg)
13
Example: Collaborative Filtering
User 1 1 5 3 4 3
User 2 4 1 5 2 5
User 3 2 ? 3 5 4
User 3 is more similar to user 1 than user 2
5 for movie “15 minutes” for user 3
5
![Page 14: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/14.jpg)
14
Memory-based Approaches for CF
The key is to find users that are similar to the test user
Traditional approach
  Measure the similarity in rating patterns between different users
  Example: Pearson Correlation Coefficient
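$$w_{y,y_0} = \frac{\sum_{x \in X(y) \cap X(y_0)} \left(R_y(x) - \bar{R}_y\right)\left(R_{y_0}(x) - \bar{R}_{y_0}\right)}{\sqrt{\sum_{x \in X(y) \cap X(y_0)} \left(R_y(x) - \bar{R}_y\right)^2 \; \sum_{x \in X(y) \cap X(y_0)} \left(R_{y_0}(x) - \bar{R}_{y_0}\right)^2}}$$
$$\hat{R}_{y_0}(x) = \bar{R}_{y_0} + \frac{\sum_{y \in Y} w_{y,y_0} \left(R_y(x) - \bar{R}_y\right)}{\sum_{y \in Y} \left|w_{y,y_0}\right|}$$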
![Page 15: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/15.jpg)
15
Pearson Correlation Coefficient for CF
Similarity between a training user $y$ and a test user $y_0$: $w_{y,y_0}$
  $R_y(x)$: the rating of object $x$ given by user $y$
  $\bar{R}_y$: the average rating given by user $y$
  $X(y)$: the set of objects rated by user $y$
$$w_{y,y_0} = \frac{\sum_{x \in X(y) \cap X(y_0)} \left(R_y(x) - \bar{R}_y\right)\left(R_{y_0}(x) - \bar{R}_{y_0}\right)}{\sqrt{\sum_{x \in X(y) \cap X(y_0)} \left(R_y(x) - \bar{R}_y\right)^2 \; \sum_{x \in X(y) \cap X(y_0)} \left(R_{y_0}(x) - \bar{R}_{y_0}\right)^2}}$$
Remove the rating bias from each training user
Normalized Rate: $R_y(x) - \bar{R}_y$
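A minimal Python sketch of this similarity, written only with the standard library. The function name `pearson_similarity`, the dict-based rating format, and the two fallbacks that return 0.0 are assumptions made for illustration; the slides do not say how degenerate cases are handled. Each user's mean is taken over everything that user rated, which matches the normalized rates used in the worked example later in the deck.

```python
import math

def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation between two users over the items both have rated.

    Each argument is a dict mapping item id -> rating. A user's mean is
    computed over all items that user rated, not only the co-rated ones.
    """
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0  # assumption: no co-rated items means no evidence of similarity
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    num = sum((ratings_a[x] - mean_a) * (ratings_b[x] - mean_b) for x in common)
    den_a = sum((ratings_a[x] - mean_a) ** 2 for x in common)
    den_b = sum((ratings_b[x] - mean_b) ** 2 for x in common)
    if den_a == 0 or den_b == 0:
        return 0.0  # assumption: a user with constant ratings carries no signal
    return num / math.sqrt(den_a * den_b)

# Ratings from the three-user example in the slides (item 2 is unrated by user 3).
user1 = {1: 1, 2: 5, 3: 3, 4: 4, 5: 3}
user2 = {1: 4, 2: 1, 3: 5, 4: 2, 5: 5}
user3 = {1: 2, 3: 3, 4: 5, 5: 4}

w13 = pearson_similarity(user1, user3)
w23 = pearson_similarity(user2, user3)
print(round(w13, 2), round(w23, 2))  # 0.85 -0.49
```

The two values agree with the weights shown in the example slides: user 3 correlates positively with user 1 and negatively with user 2.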
![Page 16: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/16.jpg)
16
Pearson Correlation Coefficient for CF
Estimate ratings for the test user
$$\hat{R}_{y_0}(x) = \bar{R}_{y_0} + \frac{\sum_{y \in Y} w_{y,y_0} \left(R_y(x) - \bar{R}_y\right)}{\sum_{y \in Y} \left|w_{y,y_0}\right|}$$
Weighted vote of normalized rates
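The weighted vote can be sketched in a few lines of Python. The function name `predict_rating`, the tuple layout of `neighbors`, and the fallback to the test user's mean when all weights are zero are illustrative assumptions, not part of the slides. The inputs below are the weights and ratings from the three-user example.

```python
def predict_rating(test_mean, neighbors):
    """Weighted vote of normalized ratings for one target item.

    neighbors: list of (weight, rating_on_item, neighbor_mean) tuples, one per
    training user who rated the target item.
    """
    total_weight = sum(abs(w) for w, _, _ in neighbors)
    if total_weight == 0:
        return test_mean  # assumption: fall back to the test user's mean rating
    vote = sum(w * (rating - mean) for w, rating, mean in neighbors)
    return test_mean + vote / total_weight

# User 3 (mean 3.5) on the unrated item: user 1 rated it 5 (mean 3.2, weight 0.85),
# user 2 rated it 1 (mean 3.4, weight -0.49).
prediction = predict_rating(3.5, [(0.85, 5, 3.2), (-0.49, 1, 3.4)])
print(round(prediction, 1))  # 5.5
```

Because the weights are normalized by their absolute values, a negatively correlated user who disliked the item pushes the prediction up. The raw estimate can also fall outside the rating scale, as it does here.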
![Page 17: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/17.jpg)
17
Example
User 1 1 5 3 4 3
Normalized Rate
User 2 4 1 5 2 5
Normalized Rate
User 3 2 ? 3 5 4
Normalized Rate
$w_{y,y_0}$
![Page 18: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/18.jpg)
18
Example
User 1 1 5 3 4 3
Normalized Rate -2.2 1.8 -0.2 0.8 -0.2
User 2 4 1 5 2 5
Normalized Rate 0.6 -2.4 1.6 -1.4 1.6
User 3 2 ? 3 5 4
Normalized Rate -1.5 ? -0.5 1.5 0.5
$w_{y,y_0}$
![Page 19: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/19.jpg)
19
Example
User 1 1 5 3 4 3
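Normalized Rate -2.2 1.8 -0.2 0.8 -0.2, $w_{y,y_0}$ = 0.85
User 2 4 1 5 2 5
Normalized Rate 0.6 -2.4 1.6 -1.4 1.6, $w_{y,y_0}$ = -0.49
User 3 2 ? 3 5 4
Normalized Rate -1.5 ? -0.5 1.5 0.5
Predicted rating for User 3 on the unrated item:
$$\hat{r} = 3.5 + \frac{0.85 \times 1.8 - 0.49 \times (-2.4)}{0.85 + 0.49} \approx 5.5$$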
![Page 20: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/20.jpg)
20
Problems with Memory-based Approaches
User 1 ? 5 3 4 2
User 2 4 1 5 ? 5
User 3 5 ? 4 2 5
User 4 1 5 3 5 ?
Most users only rate a few items
Two similar users may not rate the same set of items
Solution: clustering users and items
![Page 21: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/21.jpg)
21
Flexible Mixture Model (FMM)
Cluster both users and items simultaneously
User 1 ? 5 3 4 2
User 2 4 1 5 ? 5
User 3 5 ? 4 2 5
User 4 1 5 3 5 ?
User clustering and item clustering are correlated !
![Page 22: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/22.jpg)
22
Flexible Mixture Model (FMM)
Cluster both users and items simultaneously
User Class I: 1; p(4)=1/4, p(5)=3/4; 3
User Class II: p(4)=1/4, p(5)=3/4; p(1)=1/2, p(2)=1/2; p(4)=1/2, p(5)=1/2
Movie Type I
Movie Type II
Movie Type III
Unknown ratings are gone!
![Page 23: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/23.jpg)
23
Flexible Mixture Model (FMM)
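Graphical model: Zo → O, Zu → U, (Zo, Zu) → R
$$P(o_{(l)}, u_{(l)}, r_{(l)}) = \sum_{Z_o, Z_u} P(Z_o)\, P(Z_u)\, P(o_{(l)} \mid Z_o)\, P(u_{(l)} \mid Z_u)\, P(r_{(l)} \mid Z_o, Z_u)$$
Parameters: P(o|Zo), P(u|Zu), P(Zo), P(Zu), P(r|Zo,Zu)
Hidden variables
  Zu: user class
  Zo: item class
Observed variables
  U: user
  O: item
  R: rating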
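To make the factorization concrete, here is a small Python sketch that evaluates the joint probability for a toy model with randomly chosen parameters. All sizes, names (`joint`, `random_dist`, `p_o_zo`, and so on), and parameter values are hypothetical; only the form of the sum over the two hidden classes comes from the slide.

```python
import itertools
import random

random.seed(0)

def random_dist(n):
    """Return n random positive numbers normalized to sum to 1."""
    xs = [random.random() for _ in range(n)]
    total = sum(xs)
    return [x / total for x in xs]

N_ITEMS, N_USERS, N_RATINGS = 4, 3, 5   # toy sizes (hypothetical)
K_O, K_U = 2, 2                         # number of item classes and user classes

p_zo = random_dist(K_O)                                   # P(Zo)
p_zu = random_dist(K_U)                                   # P(Zu)
p_o_zo = [random_dist(N_ITEMS) for _ in range(K_O)]       # p_o_zo[i][o] = P(o | Zo=i)
p_u_zu = [random_dist(N_USERS) for _ in range(K_U)]       # p_u_zu[j][u] = P(u | Zu=j)
p_r_z = [[random_dist(N_RATINGS) for _ in range(K_U)]     # p_r_z[i][j][r] = P(r | Zo=i, Zu=j)
         for _ in range(K_O)]

def joint(o, u, r):
    """P(o, u, r): sum over both hidden classes of the five factors."""
    return sum(
        p_zo[i] * p_zu[j] * p_o_zo[i][o] * p_u_zu[j][u] * p_r_z[i][j][r]
        for i in range(K_O)
        for j in range(K_U)
    )

# Sanity check: the joint is a proper distribution over (item, user, rating).
total = sum(
    joint(o, u, r)
    for o, u, r in itertools.product(range(N_ITEMS), range(N_USERS), range(N_RATINGS))
)
print(total)
```

The total is 1 up to floating-point rounding, since every factor is itself a normalized distribution.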
![Page 24: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/24.jpg)
24
Flexible Mixture Model: Estimation
Annealed Expectation Maximization (AEM) algorithm
E-step: calculate posterior probability for hidden variables Zu and Zo
$$P(z_o, z_u \mid o_{(l)}, u_{(l)}, r_{(l)}) = \frac{\left(P(Z_o)\, P(Z_u)\, P(o_{(l)} \mid Z_o)\, P(u_{(l)} \mid Z_u)\, P(r_{(l)} \mid Z_o, Z_u)\right)^b}{\sum_{Z_o, Z_u} \left(P(Z_o)\, P(Z_u)\, P(o_{(l)} \mid Z_o)\, P(u_{(l)} \mid Z_u)\, P(r_{(l)} \mid Z_o, Z_u)\right)^b}$$
b: temperature for Annealed EM algorithm
M-step: update parameters
$$P(Z_o);\; P(Z_u);\; P(o_{(l)} \mid Z_o);\; P(u_{(l)} \mid Z_u);\; P(r_{(l)} \mid Z_o, Z_u)$$
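The E-step for a single observed (item, user, rating) triple can be sketched as follows. The function name `tempered_posterior` and the toy parameter tables are hypothetical, and the sketch assumes strictly positive parameters so the normalizer is never zero. Setting `b = 1` gives the ordinary EM posterior, and `b = 0` gives a uniform posterior over class pairs.

```python
def tempered_posterior(o, u, r, p_zo, p_zu, p_o_zo, p_u_zu, p_r_z, b=1.0):
    """Posterior over (item class i, user class j) with each joint term raised to b."""
    scores = [
        [
            (p_zo[i] * p_zu[j] * p_o_zo[i][o] * p_u_zu[j][u] * p_r_z[i][j][r]) ** b
            for j in range(len(p_zu))
        ]
        for i in range(len(p_zo))
    ]
    norm = sum(sum(row) for row in scores)  # assumes at least one positive score
    return [[s / norm for s in row] for row in scores]

# Toy parameters (hypothetical): 2 item classes, 2 user classes, 2 items, 2 users, 2 rating values.
P_ZO = [0.5, 0.5]
P_ZU = [0.5, 0.5]
P_O_ZO = [[0.9, 0.1], [0.1, 0.9]]                               # P_O_ZO[i][o]
P_U_ZU = [[0.8, 0.2], [0.2, 0.8]]                               # P_U_ZU[j][u]
P_R_Z = [[[0.7, 0.3], [0.4, 0.6]], [[0.5, 0.5], [0.2, 0.8]]]    # P_R_Z[i][j][r]

args = (0, 0, 1, P_ZO, P_ZU, P_O_ZO, P_U_ZU, P_R_Z)
post_full = tempered_posterior(*args, b=1.0)
post_warm = tempered_posterior(*args, b=0.5)
post_flat = tempered_posterior(*args, b=0.0)
print(post_full)
print(post_warm)
print(post_flat)
```

Lower values of `b` flatten the posterior, which is the usual motivation for annealing: early iterations commit less strongly to any one class assignment.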
![Page 25: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/25.jpg)
25
Flexible Mixture Model: Prediction
Fold-in process
  Repeat the EM algorithm including ratings from the test user
  Fix all the parameters except for $P(u^t \mid Z_u)$
$$P(o, u^t, r) = \sum_{Z_o, Z_u} P(Z_o)\, P(Z_u)\, P(o \mid Z_o)\, P(u^t \mid Z_u)\, P(r \mid Z_o, Z_u)$$
Key issue: What user class does the test user belong to?
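Once the fold-in step has produced values of P(u^t | Zu), the joint above still has to be turned into a single predicted rating. The slides do not say how; one common choice, assumed here, is the expected rating under P(r | o, u^t). The function name `expected_rating` and the toy parameters are hypothetical.

```python
def expected_rating(o, p_ut_zu, p_zo, p_zu, p_o_zo, p_r_z, rating_values):
    """Expected rating of the test user for item o under P(r | o, u_t).

    p_ut_zu[j] holds the folded-in value of P(u_t | Zu=j). Using the expectation
    as the point prediction is an assumption of this sketch.
    """
    joint = [
        sum(
            p_zo[i] * p_zu[j] * p_o_zo[i][o] * p_ut_zu[j] * p_r_z[i][j][r]
            for i in range(len(p_zo))
            for j in range(len(p_zu))
        )
        for r in range(len(rating_values))
    ]
    total = sum(joint)  # P(o, u_t), assumed to be positive
    return sum(value * p / total for value, p in zip(rating_values, joint))

# Toy model: one item class, two user classes. Class 0 always rates 5, class 1 always rates 1.
RATING_VALUES = [1, 2, 3, 4, 5]
P_ZO = [1.0]
P_ZU = [0.5, 0.5]
P_O_ZO = [[1.0]]
P_R_Z = [[[0, 0, 0, 0, 1], [1, 0, 0, 0, 0]]]

fan = expected_rating(0, [1.0, 0.0], P_ZO, P_ZU, P_O_ZO, P_R_Z, RATING_VALUES)
mixed = expected_rating(0, [0.5, 0.5], P_ZO, P_ZU, P_O_ZO, P_R_Z, RATING_VALUES)
print(fan, mixed)  # 5.0 3.0
```

The example shows why the user class is the key issue: a test user placed entirely in class 0 gets a prediction of 5, while one split evenly between the classes gets the midpoint.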
![Page 26: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/26.jpg)
26
Another Prob. with Memory-based Approaches
User 1 2 5 3 4 2
User 2 4 1 4 1 3
User 3 5 2 5 2 5
User 4 1 4 2 3 1
Users with similar interests can have different rating patterns
Solution: decoupling preference patterns from rating patterns
![Page 27: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/27.jpg)
27
Decoupling Model (DM)
Zo Zu
O U
Hidden variable
Observed variable
Zu: user class
Zo: item class
U: user
O: item
R: rating
![Page 28: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/28.jpg)
28
Decoupling Model (DM)
Zu: user class
Zo: item class
U: user
O: item
R: rating
Zpref: whether users like items
Zo Zu
O U
Zpref
Hidden variable
Observed variable
![Page 29: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/29.jpg)
29
Decoupling Model (DM)
Zu: user class
Zo: item class
U: user
O: item
R: rating
Zpref: whether users like items
ZR: rating class
Zo Zu
O U R
Zpref
ZR
Separating preference and rating patterns
  User class + Rating class → rating R
  Zu → Zpref, and ZR + Zpref → r
Hidden variable
Observed variable
![Page 30: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/30.jpg)
30
Experiment
Datasets: EachMovie and MovieRating

| | MovieRating | EachMovie |
| --- | --- | --- |
| Number of Users | 500 | 2000 |
| Number of Items | 1000 | 1682 |
| Avg. # of rated items/User | 87.7 | 129.6 |
| Number of ratings | 5 | 6 |

Evaluation: Mean Absolute Error (MAE): average absolute deviation of the predicted ratings from the actual ratings on items
$$\mathrm{MAE} = \frac{1}{L_{Test}} \sum_{l} \left| r_{(l)} - \hat{R}_{y_{(l)}}(x_{(l)}) \right|$$
The smaller the MAE, the better the performance
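For reference, the MAE metric is a one-liner in Python. The function name `mean_absolute_error` and the sample ratings are made up for illustration; they are not taken from the experiments.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted test ratings."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one test rating is required")
    return sum(abs(r - p) for r, p in zip(actual, predicted)) / len(actual)

# Hypothetical test ratings and predictions.
mae = mean_absolute_error([5, 3, 4, 1], [4.5, 3.0, 2.0, 2.0])
print(mae)  # 0.875
```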
![Page 31: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/31.jpg)
31
Experiment Protocol
Test the sensitivity of the proposed model to the amount of training data
  Vary the number of training users
  MovieRating dataset: 100 and 200 training users
  EachMovie dataset: 200 and 400 training users
Test the sensitivity of the proposed model to the information needed for the test user
  Vary the number of rated items provided by the test user
  5, 10, and 20 items are given with ratings
![Page 32: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/32.jpg)
32
Experimental Results: FMM and other baseline algorithms
[Figure: four panels plotting MAE against the number of rated items given by the test user (5, 10, 20): Movie Rating with 100 training users, Movie Rating with 200 training users, Each Movie with 200 training users, and Each Movie with 400 training users.]
A smaller MAE indicates better performance
![Page 33: Collaborative Filtering](https://reader036.vdocument.in/reader036/viewer/2022062310/56816865550346895ddec1d7/html5/thumbnails/33.jpg)
33
FMM vs. DM

Results on Movie Rating

| Training Users Size | Algorithms | 5 Items Given | 10 Items Given | 20 Items Given |
| --- | --- | --- | --- | --- |
| 100 | FMM | 0.829 | 0.822 | 0.807 |
| 100 | DM | 0.791 | 0.774 | 0.751 |
| 200 | FMM | 0.800 | 0.787 | 0.768 |
| 200 | DM | 0.770 | 0.753 | 0.730 |

Results on Each Movie

| Training Users Size | Algorithms | 5 Items Given | 10 Items Given | 20 Items Given |
| --- | --- | --- | --- | --- |
| 200 | FMM | 1.07 | 1.04 | 1.02 |
| 200 | DM | 1.06 | 1.02 | 1.00 |
| 400 | FMM | 1.05 | 1.03 | 1.01 |
| 400 | DM | 1.04 | 1.01 | 0.99 |

Smaller value indicates better performance