recommender system
![Page 1: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/1.jpg)
Recommender System
Yinghan Fu
![Page 2: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/2.jpg)
Non-personalized recommendation
• Every customer gets the same recommendation.
![Page 3: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/3.jpg)
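Non-personalized recommendation
Reddit comment score (the lower bound of the Wilson score interval):
$$\frac{1}{1+\frac{1}{n}z^2}\left[\hat{p}+\frac{1}{2n}z^2-z\sqrt{\frac{1}{n}\hat{p}(1-\hat{p})+\frac{1}{4n^2}z^2}\right]$$
$\hat{p}$: proportion of positive ratings among all ratings; $n$: total number of ratings; $z$: standard normal quantile for the chosen confidence level
In a binomial distribution, the confidence interval (Wilson score interval) for the true proportion given $\hat{p}$:
$$\frac{1}{1+\frac{1}{n}z^2}\left[\hat{p}+\frac{1}{2n}z^2\pm z\sqrt{\frac{1}{n}\hat{p}(1-\hat{p})+\frac{1}{4n^2}z^2}\right]$$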
http://www.redditblog.com/2009/10/reddits-new-comment-sorting-system.html
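The score above can be sketched in a few lines of Python. This is a minimal illustration, not Reddit's actual code: the function names and the default `z=1.96` (roughly a 95% confidence level) are assumptions made for the example.

```python
import math

def wilson_interval(positive, total, z=1.96):
    """Return (lower, upper) bounds of the Wilson score interval.

    positive: number of positive ratings
    total:    number of ratings overall (n)
    z:        normal quantile; 1.96 is an assumed default (about 95%)
    """
    if total == 0:
        return (0.0, 0.0)
    p_hat = positive / total
    scale = 1.0 / (1.0 + z * z / total)
    centre = p_hat + z * z / (2.0 * total)
    spread = z * math.sqrt(p_hat * (1.0 - p_hat) / total
                           + z * z / (4.0 * total * total))
    return (scale * (centre - spread), scale * (centre + spread))

def comment_score(positive, total, z=1.96):
    """Ranking score: the lower bound of the interval."""
    return wilson_interval(positive, total, z)[0]

# One upvote out of one rating ranks below 90 upvotes out of 100,
# even though its raw proportion (1.0) is higher.
few = comment_score(1, 1)
many = comment_score(90, 100)
```

Sorting by the lower bound penalizes items with few ratings, which is why a single positive vote does not outrank a long record of mostly positive votes.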
![Page 4: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/4.jpg)
Non-personalized recommendation
• Advantage
– Quick to calculate.
– Under the right context, can be accurate.
• Disadvantage
– Without the context, not so helpful.
![Page 5: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/5.jpg)
Personalized recommendation
You open Amazon in the browser:
Why is this a failure compared with the Reddit comment ranking? No context!
![Page 6: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/6.jpg)
Content-based recommendation
![Page 7: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/7.jpg)
Content-based recommendation
Based on a book
[Diagram labels: "Magic", "Harry Potter and the Deathly Hallows", "Me"]
$$tf(k)\cdot idf(k), \qquad idf(k)=\frac{1}{f(k)}$$
http://nlp.stanford.edu/IR-book/html/htmledition/document-and-query-weighting-schemes-1.html
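One way to read the weighting above is as a tag-based profile. The sketch below rests on assumptions that the slide does not state: that `tf` counts how many of the user's liked items carry a tag, that the denominator of `idf` counts how many catalog items carry it, and that an item is scored by summing the weights of its tags. The catalog and the names `book_a`, `book_b`, `book_c` are hypothetical.

```python
from collections import Counter

def tag_weights(liked_items, catalog):
    """Weight each tag by tf * idf, with idf = 1 / frequency.

    Assumed meaning: tf = liked items carrying the tag,
    frequency = catalog items carrying the tag.
    """
    tf = Counter(tag for item in liked_items for tag in catalog[item])
    freq = Counter(tag for tags in catalog.values() for tag in tags)
    return {tag: tf[tag] / freq[tag] for tag in tf}

def score(item, weights, catalog):
    """Sum the profile weights of the tags attached to an item."""
    return sum(weights.get(tag, 0.0) for tag in catalog[item])

def recommend(liked_items, catalog):
    """Rank items the user has not liked yet, best match first."""
    weights = tag_weights(liked_items, catalog)
    candidates = [item for item in catalog if item not in liked_items]
    return sorted(candidates,
                  key=lambda item: score(item, weights, catalog),
                  reverse=True)

# Hypothetical, manually labelled catalog.
catalog = {
    "Harry Potter and the Deathly Hallows": {"magic", "fiction"},
    "book_a": {"magic", "fiction"},
    "book_b": {"fiction"},
    "book_c": {"cooking"},
}
liked = ["Harry Potter and the Deathly Hallows"]
weights = tag_weights(liked, catalog)
ranking = recommend(liked, catalog)
```

The rarer tag ("magic") gets a larger weight than the common one ("fiction"), so the book sharing both tags ranks first.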
![Page 8: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/8.jpg)
Content-based recommendation
• Advantage
– Quick to compute.
• Disadvantage
– Needs manual labelling.
![Page 9: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/9.jpg)
Collaborative filtering
[Figure: user-item rating matrix with rows $u_1, \dots, u_5, \dots$ and columns $v_1, \dots, v_4, \dots$; entries are ratings from 1 to 5, and some entries are missing]
$u_i$: vector for user $i$
$v_j$: vector for item $j$
![Page 10: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/10.jpg)
Collaborative filtering
• Item-item collaborative filtering
• $r_{ij} = \dfrac{\sum_{j'} s(j,j')\, r_{ij'}}{\sum_{j'} s(j,j')}$, with $s(j,j') = k(v_j, v_{j'})$
• User-user collaborative filtering
• $r_{ij} = \dfrac{\sum_{i'} s(i,i')\, r_{i'j}}{\sum_{i'} s(i,i')}$, with $s(i,i') = k(u_i, u_{i'})$
• Slow to compute, more accurate for most situations.
http://files.grouplens.org/papers/FnT%20CF%20Recsys%20Survey.pdf
http://grouplens.org/site-content/uploads/Item-Based-WWW-2001.pdf
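The item-item prediction can be sketched as follows. The ratings are hypothetical toy data, cosine similarity stands in for the kernel, and missing ratings are treated as zeros when comparing item vectors; all three are assumptions made for the example, not choices stated on the slide.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {key: value}."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def item_vectors(ratings):
    """Turn {user: {item: rating}} into {item: {user: rating}}."""
    columns = {}
    for user, row in ratings.items():
        for item, rating in row.items():
            columns.setdefault(item, {})[user] = rating
    return columns

def predict_item_item(ratings, user, item):
    """Similarity-weighted average of the user's own ratings on other items."""
    columns = item_vectors(ratings)
    target = columns.get(item, {})
    numerator = denominator = 0.0
    for other, rating in ratings[user].items():
        if other == item:
            continue
        s = cosine(target, columns[other])
        numerator += s * rating
        denominator += s
    return numerator / denominator if denominator else None

# Hypothetical ratings; absent keys are unrated items.
ratings = {
    "u1": {"v1": 4, "v2": 3, "v3": 5},
    "u2": {"v1": 5, "v2": 5, "v3": 5, "v4": 4},
    "u3": {"v1": 2, "v4": 3},
    "u4": {"v2": 1, "v3": 4, "v4": 2},
}
prediction = predict_item_item(ratings, "u1", "v4")
```

Because the prediction is a weighted average with non-negative weights, it always falls between the lowest and highest rating the user has already given.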
![Page 11: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/11.jpg)
Collaborative filtering
• Variation of kernel
• $k(u_i, u_{i'}) = \cos(u_i, u_{i'})$: cosine similarity
• $k(u_i, u_{i'}) = \rho(u_i, u_{i'})$: correlation similarity
• $u_i' = u_i - (\bar{v}_1, \bar{v}_2, \bar{v}_3, \dots)^T$, $k(u_i, u_{i'}) = \cos(u_i', u_{i'}')$: adjusted cosine similarity, where $\bar{v}_j$ is the mean rating of item $j$
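The three kernels differ only in what is subtracted before taking the cosine. The sketch below assumes fully observed rating vectors (no missing entries) and reads the adjusted variant as subtracting per-item mean ratings from a user vector; both are simplifying assumptions for illustration.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def correlation(a, b):
    """Pearson correlation: cosine after centering each vector on its own mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])

def adjusted_cosine(a, b, item_means):
    """Cosine after subtracting the per-item mean rating from both user vectors."""
    return cosine([x - m for x, m in zip(a, item_means)],
                  [y - m for y, m in zip(b, item_means)])

# Two hypothetical users with opposite tastes on two items.
u_a = [5.0, 1.0]
u_b = [1.0, 5.0]
item_means = [3.0, 3.0]

plain = cosine(u_a, u_b)
adjusted = adjusted_cosine(u_a, u_b, item_means)
```

Plain cosine still reports a positive similarity here because all ratings are positive, while the centered variants expose the disagreement.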
![Page 12: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/12.jpg)
Collaborative filtering
• Variation of neighbor size
• $r_{ij} = \dfrac{\sum_{i' \in B} s(i,i')\, r_{i'j}}{\sum_{i' \in B} s(i,i')}$, where $B$ is the set of neighbors of user $i$
• Normalizing, centering and linearly transforming the vectors.
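Restricting the sum to a neighborhood can be sketched like this. The example assumes that the neighborhood is the `k` most similar users who have rated the item, that similarity is cosine over sparse rating vectors, and that the ratings are hypothetical toy data; the parameter name `k` here is the neighborhood size, not the kernel.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {key: value}."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_user_user(ratings, user, item, k=2):
    """Similarity-weighted average over the k nearest users who rated the item."""
    candidates = [
        (cosine(ratings[user], row), row[item])
        for other, row in ratings.items()
        if other != user and item in row
    ]
    # Neighborhood: the k candidates with the highest similarity.
    neighbours = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    denominator = sum(s for s, _ in neighbours)
    if denominator == 0:
        return None
    return sum(s * r for s, r in neighbours) / denominator

# Hypothetical ratings; absent keys are unrated items.
ratings = {
    "u1": {"v1": 4, "v2": 3, "v3": 5},
    "u2": {"v1": 5, "v2": 5, "v3": 5, "v4": 4},
    "u3": {"v1": 2, "v4": 3},
    "u4": {"v2": 1, "v3": 4, "v4": 2},
}
nearest_one = predict_user_user(ratings, "u1", "v4", k=1)
nearest_two = predict_user_user(ratings, "u1", "v4", k=2)
```

With `k=1` the prediction simply copies the most similar user's rating; larger neighborhoods trade that sharpness for stability.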
![Page 13: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/13.jpg)
Collaborative filtering
$$MAE = \frac{1}{N}\sum_{1 \le i \le N} |p_i - q_i|$$
computed over $N$ rating-prediction pairs $(p_i, q_i)$
http://grouplens.org/site-content/uploads/Item-Based-WWW-2001.pdf
![Page 14: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/14.jpg)
Collaborative filtering
http://grouplens.org/site-content/uploads/Item-Based-WWW-2001.pdf
![Page 15: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/15.jpg)
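Collaborative filtering
$$\underbrace{\begin{pmatrix} r_{11} & \cdots & r_{1n} \\ \vdots & \ddots & \vdots \\ r_{m1} & \cdots & r_{mn} \end{pmatrix}}_{m \times n} \approx \underbrace{\begin{pmatrix} p_1^T \\ \vdots \\ p_m^T \end{pmatrix}}_{m \times k} \underbrace{\begin{pmatrix} q_1 & \dots & q_n \end{pmatrix}}_{k \times n}$$
SVD can factorize any matrix … without null values!
Null values are the reason we want to do matrix factorization in the first place.
Quick to predict ratings … if we are able to factorize the matrix.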
![Page 16: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/16.jpg)
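Collaborative filtering
$$\underbrace{\begin{pmatrix} r_{11} & \cdots & r_{1n} \\ \vdots & \ddots & \vdots \\ r_{m1} & \cdots & r_{mn} \end{pmatrix}}_{m \times n} \approx \underbrace{\begin{pmatrix} p_1^T \\ \vdots \\ p_m^T \end{pmatrix}}_{m \times k} \underbrace{\begin{pmatrix} q_1 & \dots & q_n \end{pmatrix}}_{k \times n}$$
$$p(r_{ij} \mid p_i, q_j) = \mathcal{N}(p_i^T q_j, \sigma^2)$$
MLE/minimize:
$$\frac{1}{2}\sum_{r_{ij} \ne \text{null}} (r_{ij} - p_i^T q_j)^2$$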
http://papers.nips.cc/paper/3208-probabilistic-matrix-factorization.pdf
![Page 17: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/17.jpg)
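Collaborative filtering
$$\underbrace{\begin{pmatrix} r_{11} & \cdots & r_{1n} \\ \vdots & \ddots & \vdots \\ r_{m1} & \cdots & r_{mn} \end{pmatrix}}_{m \times n} \approx \underbrace{\begin{pmatrix} p_1^T \\ \vdots \\ p_m^T \end{pmatrix}}_{m \times k} \underbrace{\begin{pmatrix} q_1 & \dots & q_n \end{pmatrix}}_{k \times n}$$
$$p(r_{ij} \mid p_i, q_j) = \mathcal{N}(p_i^T q_j, \sigma^2)$$
MLE/minimize:
$$E = \frac{1}{2}\sum_{r_{ij} \ne \text{null}} (r_{ij} - p_i^T q_j)^2 + \frac{\lambda}{2}\left(\sum_i \lVert p_i \rVert^2 + \sum_j \lVert q_j \rVert^2\right)$$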
![Page 18: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/18.jpg)
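Collaborative filtering
MLE/minimize:
$$E = \frac{1}{2}\sum_{r_{ij} \ne \text{null}} (r_{ij} - p_i^T q_j)^2 + \frac{\lambda}{2}\left(\sum_i \lVert p_i \rVert^2 + \sum_j \lVert q_j \rVert^2\right)$$
Derivative:
$$\frac{\partial E}{\partial p_i} = -\sum_{j,\; r_{ij} \ne \text{null}} (r_{ij} - p_i^T q_j)\, q_j + \lambda p_i$$
$$\frac{\partial E}{\partial q_j} = -\sum_{i,\; r_{ij} \ne \text{null}} (r_{ij} - p_i^T q_j)\, p_i + \lambda q_j$$
Advantage: Quick for predicting new ratings. More accurate with enough data?
Disadvantage: Difficult to update the model.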
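The objective and its gradient can be checked numerically. The sketch below stores only the observed ratings in a dictionary keyed by `(i, j)`, so the "not null" condition becomes "present in the dictionary"; the data, the factor values and the function names are hypothetical. The gradient with respect to a factor vector is the regularization term minus the error-weighted sum of the opposite factors.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def objective(R, P, Q, lam):
    """Half the squared error on observed ratings plus the L2 penalty."""
    fit = sum((r - dot(P[i], Q[j])) ** 2 for (i, j), r in R.items())
    penalty = sum(dot(p, p) for p in P) + sum(dot(q, q) for q in Q)
    return 0.5 * fit + 0.5 * lam * penalty

def grad_p(R, P, Q, lam, i):
    """Gradient of the objective with respect to the user factor P[i]."""
    grad = [lam * x for x in P[i]]
    for (row, j), r in R.items():
        if row == i:
            err = r - dot(P[i], Q[j])
            grad = [g - err * q for g, q in zip(grad, Q[j])]
    return grad

def grad_q(R, P, Q, lam, j):
    """Gradient of the objective with respect to the item factor Q[j]."""
    grad = [lam * x for x in Q[j]]
    for (i, col), r in R.items():
        if col == j:
            err = r - dot(P[i], Q[j])
            grad = [g - err * p for g, p in zip(grad, P[i])]
    return grad

# Hypothetical observed ratings and factors (2 users, 3 items, 2 dimensions).
R = {(0, 0): 4.0, (0, 1): 3.0, (1, 0): 5.0, (1, 2): 4.0}
P = [[0.5, 0.1], [0.3, 0.7]]
Q = [[0.2, 0.4], [0.6, 0.1], [0.9, 0.3]]
LAM = 0.1

analytic = grad_p(R, P, Q, LAM, 0)
```

Comparing `analytic` with a finite-difference estimate of `objective` is a cheap way to catch sign mistakes in the derivative.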
![Page 19: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/19.jpg)
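Collaborative filtering
MLE/minimize:
$$E = \frac{1}{2}\sum_{r_{ij} \ne \text{null}} (r_{ij} - p_i^T q_j)^2 + \frac{\lambda}{2}\left(\sum_i \lVert p_i \rVert^2 + \sum_j \lVert q_j \rVert^2\right)$$
Stochastic Gradient Descent: for each observed $r_{ij}$:
$$p_i' = p_i + \alpha\left[(r_{ij} - p_i^T q_j)\, q_j - \lambda p_i\right]$$
$$q_j' = q_j + \alpha\left[(r_{ij} - p_i^T q_j)\, p_i - \lambda q_j\right]$$
$$p_i = p_i', \qquad q_j = q_j'$$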
http://sifter.org/~simon/journal/20061211.html
Advantage: Easy to update the model
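A minimal sketch of the stochastic gradient descent loop follows. The toy ratings, the hyperparameters (`alpha`, `lam`, `epochs`), the factor dimension and the small positive random initialization are all assumptions chosen for illustration, not values from the slides or the linked post.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_squared_error(R, P, Q):
    """Mean squared error over the observed ratings only."""
    return sum((r - dot(P[i], Q[j])) ** 2 for (i, j), r in R.items()) / len(R)

def sgd_factorize(R, m, n, k=2, alpha=0.02, lam=0.05, epochs=300, seed=0):
    """Fit user factors P (m x k) and item factors Q (n x k) by SGD.

    R maps (user index, item index) to an observed rating; missing
    ratings are simply absent from the dictionary.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(m)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n)]
    observed = list(R.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (i, j), r in observed:
            err = r - dot(P[i], Q[j])
            # Compute both updates from the old values, then assign.
            new_p = [p + alpha * (err * q - lam * p) for p, q in zip(P[i], Q[j])]
            new_q = [q + alpha * (err * p - lam * q) for p, q in zip(P[i], Q[j])]
            P[i], Q[j] = new_p, new_q
    return P, Q

# Hypothetical observed ratings for 4 users and 4 items.
R = {
    (0, 0): 4, (0, 1): 3, (0, 2): 5,
    (1, 0): 5, (1, 1): 5, (1, 2): 5, (1, 3): 4,
    (2, 0): 2, (2, 3): 3,
    (3, 1): 1, (3, 2): 4, (3, 3): 2,
}
P, Q = sgd_factorize(R, m=4, n=4)
training_error = mean_squared_error(R, P, Q)
predicted = dot(P[0], Q[3])  # rating of user 0 for item 3, unseen in R
```

Each update touches only one user vector and one item vector, so a newly observed rating can be folded in with a few extra steps instead of refitting the whole model.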
![Page 21: Recommender system](https://reader031.vdocument.in/reader031/viewer/2022022201/5886935d1a28abf6158b6cab/html5/thumbnails/21.jpg)
Evaluation
• Basic accuracy
– MAE
– RMSD
• Ranking accuracy
– Pearson correlation over ranks
– Kendall tau test
• Decision support
– Precision
– Recall
https://www.coursera.org/learn/recommender-systems/supplement/Jh5Kx/pdf-version-of-module-5-presentations
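Several of these metrics fit in a few lines each. The sketch below covers MAE, RMSD, Kendall tau and precision/recall; the function names are hypothetical, and the Kendall tau shown is the simple tau-a form, which assumes there are no tied values.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between paired ratings and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmsd(actual, predicted):
    """Root mean squared deviation between paired ratings and predictions."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def kendall_tau(x, y):
    """Tau-a: (concordant pairs - discordant pairs) / total pairs."""
    concordant = discordant = 0
    n = len(x)
    for a in range(n):
        for b in range(a + 1, n):
            sign = (x[a] - x[b]) * (y[a] - y[b])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def precision_recall(recommended, relevant):
    """Precision: hits / recommended. Recall: hits / relevant."""
    hits = len(set(recommended) & set(relevant))
    return hits / len(recommended), hits / len(relevant)

# Hypothetical ratings, predictions and recommendation lists.
error = mae([4, 3, 5], [3, 3, 4])
deviation = rmsd([4, 3, 5], [3, 3, 4])
tau = kendall_tau([1, 2, 3], [1, 3, 2])
precision, recall = precision_recall(["a", "b", "c"], {"a", "c", "d", "e"})
```

MAE and RMSD judge the predicted rating values, Kendall tau judges the ordering, and precision/recall judge only which items end up in the recommended set.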