Modeling Difficulty in Recommender Systems
DESCRIPTION
Presentation given at the Workshop on Recommendation Utility Evaluation: Beyond RMSE, held in conjunction with the ACM Conference on Recommender Systems (RecSys) on September 9, 2012.
TRANSCRIPT
Competence Center Information Retrieval & Machine Learning
Modeling Difficulty in Recommender Systems
Benjamin Kille (@bennykille)
September 9, 2012
Recommendation Utility Evaluation: Beyond RMSE (2012)
Outline
► Recommender System Evaluation
► Problem definition
► Difficulty in Recommender Systems
► Future work
► Conclusions
Recommender Systems Evaluation
► Definition of an evaluation measure:
  RMSE (rating prediction scenario)
  nDCG (ranking scenario)
  Precision@N (top-N scenario)
► Splitting data into training and test partitions
► Reporting results as average over the full set of users
► Is recommending to all users equally difficult?
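These measures are straightforward to compute. As a minimal sketch (the function names and toy data below are illustrative, not from the talk), RMSE for the rating prediction scenario and Precision@N for the top-N scenario look like this in Python:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error over parallel lists of ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_at_n(recommended, relevant, n):
    """Fraction of the top-N recommended items that are relevant."""
    return sum(1 for item in recommended[:n] if item in relevant) / n

# Toy data for one user: predicted vs. observed ratings, and a top-4 list
print(rmse([3.5, 4.0, 2.0], [4.0, 4.0, 1.0]))
print(precision_at_n(["a", "b", "c", "d"], {"a", "c"}, 4))  # 2 of 4 relevant → 0.5
```

Reporting only the average of such scores over all users is exactly what hides the per-user differences discussed next.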
Observed Differences
► Users differ with respect to:
  demographics (e.g., age, gender, and location)
  taste
  needs
  expectations
  consumption patterns
  …
► Recommendation algorithms do not perform equally well for each single user; users should not all be evaluated in the same way!
Risks of disregarding users' differences
► A subset of users receives worse recommendations than they could get
► Recommendation algorithm optimization targets all users equally:
  "easy" users: costs could be saved
  "difficult" users: insufficient optimization
Control optimization towards those users who really require it!
How to determine difficulty?
Problem Formulation
► Measuring how difficult it will be to recommend items to a user
► Ideally: derive difficulty directly from user attributes
► Problem: unknown correlation between (combinations of) attributes and difficulty
► We need a method to calculate the correlation between user attributes and recommendation difficulty
Difficulty in Information Retrieval
► Target object: query
► Method:
[Diagram: one query is submitted to several IR systems; each system returns its own ranked list of documents (Doc 1, Doc 2, Doc 3, …)]
Difficulty = Diversity of returned list of documents
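This idea can be sketched in code. Below is a minimal illustration (the function names and document IDs are invented for the example) that scores a query's difficulty as the average pairwise dissimilarity, here Jaccard distance, of the top-k lists returned by the different IR systems:

```python
from itertools import combinations

def jaccard(a, b):
    """Set overlap of two result lists, ignoring rank order."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def query_difficulty(result_lists, k=10):
    """Average pairwise dissimilarity of the systems' top-k lists.

    Requires at least two result lists; 0.0 means all systems agree,
    1.0 means no two systems share any document.
    """
    top = [lst[:k] for lst in result_lists]
    pairs = list(combinations(top, 2))
    return sum(1 - jaccard(a, b) for a, b in pairs) / len(pairs)

# Three hypothetical IR systems answering the same query
lists = [["d1", "d2", "d3"], ["d1", "d3", "d4"], ["d5", "d6", "d7"]]
print(query_difficulty(lists, k=3))  # high disagreement → difficulty ≈ 0.83
```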
Difficulty in Recommender Systems
► Select several (state-of-the-art) recommendation methods
► Measure the diversity of their output for a specific user
► Based on the methods' agreement with respect to predicted rating / ranking / top-N items, we conclude:
  high agreement → low difficulty
  low agreement → high difficulty
► Target correlation (user attributes ~ difficulty) can be estimated using the observed difficulties for a sufficiently large set of users
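For the rating-prediction case, the agreement criterion above can be sketched as the spread of several methods' predictions for one user (a minimal illustration; the method names and ratings are made up, and for the ranking case one would instead use a rank correlation such as Kendall's tau):

```python
from statistics import mean, pstdev

def user_difficulty(predictions):
    """predictions: {method_name: {item_id: predicted_rating}} for one user.

    Difficulty = mean, over the items all methods scored, of the spread
    (population std. dev.) of the predicted ratings:
    high spread = low agreement = high difficulty.
    """
    items = set.intersection(*(set(p) for p in predictions.values()))
    spreads = [pstdev([p[i] for p in predictions.values()]) for i in items]
    return mean(spreads)

# Three hypothetical recommenders scoring the same two items for one user
preds = {
    "item_knn": {"i1": 4.0, "i2": 2.0},
    "user_knn": {"i1": 4.2, "i2": 2.4},
    "mat_fact": {"i1": 3.8, "i2": 1.6},
}
print(user_difficulty(preds))
```

Computing this score for a sufficiently large set of users yields the observed difficulties from which the target correlation (user attributes ~ difficulty) could then be estimated, e.g., with a standard correlation coefficient.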
Future Work
► Experimentally verify feasibility of difficulty estimation
► Evaluate observed correlation (user attributes ~ difficulty) on
data sets
► Investigate business rationale (reduced costs through
controlled optimization efforts)
► How to deal with sparsity / cold-start issues
Conclusions
► Users should not be treated equally when evaluating
recommender systems
► Difficulty of recommendation tasks varies between users
► Difficulty will allow controlling optimization towards those users
who require it
► Diversity metrics could be used to estimate difficulty scores
(analogously to information retrieval)
► Proposed method needs to be evaluated
Thank you for your attention!
Questions?
References
[He2008] J. He, M. Larson, and M. de Rijke. Using coherence-based measures to predict query difficulty. ECIR 2008.
[Herlocker2004] J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. ACM TOIS 22(1), 2004.
[Kuncheva2003] L. Kuncheva and C. Whitaker. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning 51, 2003.
[Vargas2011] S. Vargas and P. Castells. Rank and relevance in novelty and diversity metrics for recommender systems. RecSys 2011.