probabilistic collaborative filtering with negative cross entropy

1
1 Probabilistic Collaborative Filtering with Negative Cross Entropy Alejandro Bellogín 1 , 3 , Javier Parapar 2 , Pablo Castells 3 [email protected], [email protected], [email protected] 1 Information Access, Centrum Wiskunde & Informatica 2 Information Retrieval Lab, University of A Coruña 3 Information Retrieval Group, Universidad Autónoma de Madrid Introduction Neighbourhood identification in memory-based CF algorithms is based on selecting those users who are most similar to the active user according to a certain similarity metric. Hypothesis: neighbour-based CF tech- niques may be improved by using Rele- vance Models (RM) from Information Re- trieval (IR) to identify such neighbour- hoods (and also to weight them). Experiments: a relevance-based lan- guage model has been introduced into a neighbour-based CF algorithm which outperforms other standard techniques in terms of ranking precision. Further im- provements are achieved when we use a complete probabilistic representation of the problem. Relevance Modelling for Recommendation We decompose the rating prediction task ˆ r (u, i)= C v N k (u) sim(u, v )r (v,i) (*) as: 1. We compute a relevance model for each user in order to capture how relevant any other user would be as a potential neighbour. Probability of a neighbour v under the relevance model R u for a given user u: p(v |R u )= X iP RS (u) p(i)p(v |i) Y j I (u) p(j |i) (1) where p(i) is the probability of the item i in the collection, p(v |i) is the probability of the neighbour v given the item i, and p(i|j ) is the conditional probability of item i given another item j . I (u) corresponds to the set of items rated by user u, and P RS (u) I (u) is the subset of items rated by u above some specific threshold. 2. Then we replace the rating prediction by weighted average in CF with the negative cross entropy (from IR) to incorporate the information learnt from the RM: ˆ r (u, i) = H (p(·|R u ); p(·|i)|N k (u)) = X v N k (u) p(v |R u ) log p(v |i) (2) Notation: RMUB: Eqs. (1) + (*) and RMCE: Eqs. (1) + (2) Experiments and Results Evaluation methodology: TestItems [1] (for each user a ranking is generated by predicting a score for every item in the test set). Baselines: UB (user-based CF with Pearson’s correlation as similarity measure), NC+P (Normalised Cut (NC) with Pearson similarity [2]) (better performing than a matrix factorization algorithm), UIR (relevance model for log-based CF [6]), URM (rating-based probability estimations [7]). 0 0.05 0.1 0.15 0.2 0.25 100 200 300 400 500 600 700 P@5 k RMUB λ=opt × × × × × × × × × × × × × × × RMUB λ=0.5 RMCE λ=opt O O O O O O O O O O O O O O O RMCE λ=0.5 UB NC + P N N N N N N N N N N N N N N N Evolution of the performance of the compared methods in terms of P @5 when varying k on the MovieLens 100K collection. MovieLens 100K Method P@5 nDCG@5 nDCG@10 P@50 cvg UB 0.049 cd 0.041 cd 0.047 cd 0.056 ce 100% NC+P 0.111 acde 0.097 acde 0.095 acde 0.058 ce 83% UIR 0.004 0.002 0.002 0.002 100% URM 0.005 0.003 0.018 0.054 ce 100% RMUB 0.081 acd 0.064 acd 0.062 acd 0.050 c 60% RMCE 0.224 abcde 0.204 abcde 0.204 abcde 0.138 abcde 100% MovieLens 1M Method P@5 nDCG@5 nDCG@10 P@50 cvg UB 0.035 cd 0.031 cd 0.031 cd 0.039 cde 100% NC+P 0.037 acd 0.033 acd 0.036 acd 0.048 acde 99% UIR 0.001 0.001 0.001 0.001 100% URM 0.001 0.001 0.006 0.034 c 100% RMUB 0.075 abcd 0.061 abcd 0.057 abcd 0.038 c 41.4% RMCE 0.187 abcde 0.176 abcde 0.168 abcde 0.108 abcde 100% Summary of comparative effectiveness References [1] B ELLOGÍN , A., C ASTELLS , P., AND C ANTADOR , I. Precision-oriented evaluation of recommender systems: an algorithmic comparison. In RecSys (2011) [2] B ELLOGÍN , A., AND P ARAPAR , J. Using graph partitioning techniques for neighbour selection in user-based collaborative filtering. In RecSys (2012) [3] L AVRENKO , V., AND C ROFT, W. B. Relevance based language models. In SIGIR (2001) [4] M C L AUGHLIN , M. R., AND H ERLOCKER , J. L. A collaborative filtering algorithm and evaluation metric that accurately model the user experience. In SIGIR (2004) [5] P ARAPAR , J., B ELLOGÍN , A., C ASTELLS , P., AND B ARREIRO , A. Relevance-based language modelling for recommender systems. IPM. 49, 4 (2013) [6] WANG , J., DE V RIES , A., AND R EINDERS , M. A user-item relevance model for log-based collaborative filtering. In ECIR (2006) [7] WANG , J., DE V RIES , A. P., AND R EINDERS , M. J. T. Unified relevance models for rating prediction in collab- orative filtering. ACM TOIS 26 (2008) Conclusions RMUB outperforms other state-of-the-art approaches but is not optimal. The complete probabilistic model (RMCE) achieves even larger improve- ments. Improvement in performance are consis- tent across different datasets We have also produced different mappings the involved variables [5], nevertheless, more research is still needed on this point. RecSys 2013, 7th ACM Conference on Recommender Systems. October 12–16, 2013. Hong Kong, China.

Upload: alejandro-bellogin

Post on 26-May-2015

568 views

Category:

Education


4 download

DESCRIPTION

Poster presented at ACM RecSys 2013

TRANSCRIPT

Page 1: Probabilistic Collaborative Filtering with Negative Cross Entropy

1

Probabilistic Collaborative Filteringwith Negative Cross EntropyAlejandro Bellogín1,3, Javier Parapar2, Pablo Castells3

[email protected], [email protected], [email protected] Access, Centrum Wiskunde & Informatica2Information Retrieval Lab, University of A Coruña3Information Retrieval Group, Universidad Autónoma de Madrid

Introduction

• Neighbourhood identification inmemory-based CF algorithms is basedon selecting those users who are mostsimilar to the active user according to acertain similarity metric.

• Hypothesis: neighbour-based CF tech-niques may be improved by using Rele-vance Models (RM) from Information Re-trieval (IR) to identify such neighbour-hoods (and also to weight them).

• Experiments: a relevance-based lan-guage model has been introduced intoa neighbour-based CF algorithm whichoutperforms other standard techniques interms of ranking precision. Further im-provements are achieved when we use acomplete probabilistic representation ofthe problem.

Relevance Modelling for RecommendationWe decompose the rating prediction task r̂(u, i) = C

∑v∈Nk(u)

sim(u, v)r(v, i) (*) as:1. We compute a relevance model for each user in order to capture how relevant any other

user would be as a potential neighbour.Probability of a neighbour v under the relevance model Ru for a given user u:

p(v|Ru) =∑

i∈PRS(u)

p(i)p(v|i)∏

j∈I(u)

p(j|i) (1)

where p(i) is the probability of the item i in the collection, p(v|i) is the probability ofthe neighbour v given the item i, and p(i|j) is the conditional probability of item i givenanother item j. I(u) corresponds to the set of items rated by user u, and PRS(u) ⊂ I(u)is the subset of items rated by u above some specific threshold.

2. Then we replace the rating prediction by weighted average in CF with the negative crossentropy (from IR) to incorporate the information learnt from the RM:

r̂(u, i) = H(p(·|Ru); p(·|i)|Nk(u)) =∑

v∈Nk(u)

p(v|Ru) log p(v|i) (2)

Notation: RMUB: Eqs. (1) + (*) and RMCE: Eqs. (1) + (2)

Experiments and ResultsEvaluation methodology: TestItems [1] (for each user a ranking is generated by predicting a score for every item in the test set).Baselines: UB (user-based CF with Pearson’s correlation as similarity measure), NC+P (Normalised Cut (NC) with Pearson similarity [2])(better performing than a matrix factorization algorithm), UIR (relevance model for log-based CF [6]), URM (rating-based probabilityestimations [7]).

0

0.05

0.1

0.15

0.2

0.25

100 200 300 400 500 600 700

P@5

k

RMUBλ=opt

×

×× × × × ×

××

× × × × ×

×

RMUBλ=0.5

� � � �� �

��

� � � � �

RMCEλ=opt

O O O O O O O O O O O O O O

O

RMCEλ=0.5

◦ ◦ ◦ ◦ ◦◦

◦ ◦ ◦ ◦ ◦ ◦ ◦◦

UB

•• • • • • • • • • • • •

NC + P

N

NN N N N

NN

NN N N

N N

N

Evolution of the performance of the compared methodsin terms of P@5 when varying k on the MovieLens 100K collection.

MovieLens 100K

Method P@5 nDCG@5 nDCG@10 P@50 cvgUB 0.049cd 0.041cd 0.047cd 0.056ce 100%

NC+P 0.111acde 0.097acde 0.095acde 0.058ce 83%UIR 0.004 0.002 0.002 0.002 100%

URM 0.005 0.003 0.018 0.054ce 100%RMUB 0.081acd 0.064acd 0.062acd 0.050c 60%RMCE 0.224abcde0.204abcde0.204abcde 0.138abcde100%

MovieLens 1M

Method P@5 nDCG@5 nDCG@10 P@50 cvgUB 0.035cd 0.031cd 0.031cd 0.039cde 100%

NC+P 0.037acd 0.033acd 0.036acd 0.048acde 99%UIR 0.001 0.001 0.001 0.001 100%

URM 0.001 0.001 0.006 0.034c 100%RMUB 0.075abcd 0.061abcd 0.057abcd 0.038c 41.4%RMCE 0.187abcde0.176abcde0.168abcde 0.108abcde 100%

Summary of comparative effectiveness

References[1] BELLOGÍN, A., CASTELLS, P., AND CANTADOR, I. Precision-oriented evaluation of recommender systems: an

algorithmic comparison. In RecSys (2011)

[2] BELLOGÍN, A., AND PARAPAR, J. Using graph partitioning techniques for neighbour selection in user-basedcollaborative filtering. In RecSys (2012)

[3] LAVRENKO, V., AND CROFT, W. B. Relevance based language models. In SIGIR (2001)

[4] MCLAUGHLIN, M. R., AND HERLOCKER, J. L. A collaborative filtering algorithm and evaluation metric thataccurately model the user experience. In SIGIR (2004)

[5] PARAPAR, J., BELLOGÍN, A., CASTELLS, P., AND BARREIRO, A. Relevance-based language modelling forrecommender systems. IPM. 49, 4 (2013)

[6] WANG, J., DE VRIES, A., AND REINDERS, M. A user-item relevance model for log-based collaborative filtering.In ECIR (2006)

[7] WANG, J., DE VRIES, A. P., AND REINDERS, M. J. T. Unified relevance models for rating prediction in collab-orative filtering. ACM TOIS 26 (2008)

Conclusions• RMUB outperforms other state-of-the-art

approaches but is not optimal.

• The complete probabilistic model(RMCE) achieves even larger improve-ments.

• Improvement in performance are consis-tent across different datasets

We have also produced different mappingsthe involved variables [5], nevertheless,more research is still needed on this point.

RecSys 2013, 7th ACM Conference on Recommender Systems. October 12–16, 2013. Hong Kong, China.