parsimonious and adaptive contextual information acquisition in recommender systems

IntRS’15 - September 2015, Vienna, Austria

Parsimonious and Adaptive Contextual Information Acquisition in Recommender Systems

Matthias Braunhofer¹, Ignacio Fernández-Tobías² and Francesco Ricci¹

¹ Free University of Bozen-Bolzano, Piazza Domenicani 3, 39100 Bolzano, Italy
{mbraunhofer,fricci}@unibz.it

² Universidad Autónoma de Madrid, C/ Francisco Tomás y Valiente 11, 28049 Madrid
[email protected]



Outline


• Introduction

• Related Works

• Selective Context Acquisition

• Experimental Evaluation and Results

• Conclusions and Future Work




Context-Aware Recommender Systems

• Context-Aware Recommender Systems (CARSs) aim to provide better recommendations by exploiting contextual information (e.g., weather)

• The rating prediction function is R: Users × Items × Context → Ratings

[Figure: example rating matrices on a 1-5 scale, with unknown ratings marked "?"]
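As a minimal sketch of what the signature R: Users × Items × Context → Ratings implies for the data, the snippet below stores each rating together with the context in which it was given. The class name `ContextualRating`, its field names, and the toy values are illustrative assumptions and do not come from STS or from any dataset in this talk.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ContextualRating:
    """One cell of the Users x Items x Context rating tensor (hypothetical type)."""
    user: str
    item: str
    context: Dict[str, str]   # contextual factor -> contextual condition
    rating: Optional[float]   # None marks an unknown rating ("?")


def known(ratings: List[ContextualRating]) -> List[ContextualRating]:
    """Keep only the ratings that have actually been observed."""
    return [r for r in ratings if r.rating is not None]


# Toy data (assumed values): the same user-item pair under two weather conditions.
toy_ratings = [
    ContextualRating("Alice", "Skiing", {"weather": "sunny"}, 5.0),
    ContextualRating("Alice", "Skiing", {"weather": "rainy"}, None),
]
```

The unknown entry is what a CARS would try to predict for the given user, item, and context.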


Challenges for CARSs

• Identification of the contextual factors that influence user preferences and the decision-making process, and hence are worth collecting from users along with their ratings

• Development of a model that predicts the user’s ratings for items under various contextual situations

• Design of a human-computer interaction layer on top of the predictive model


Example: STS (South Tyrol Suggests)

STS provides context-aware suggestions for Places of Interest (POIs) in South Tyrol, Italy


Example: STS without Selective Context Acquisition

Don’t: all contextual factors are requested.


Example: STS with Selective Context Acquisition

Do: only relevant contextual factors are requested.




Context Selection A Priori (i.e., Before Collecting Ratings)

• (Baltrunas et al., 2012): Development of a web survey where users were requested to evaluate the influence of contextual conditions on POI categories

• This made it possible to identify the relevant contextual factors for different POI categories (using the mutual information statistic)

• Pros: can acquire ratings under relevant contextual conditions

• Cons: artificial setting; survey requires extra effort from the user



Context Selection A Posteriori (i.e., After Collecting Ratings)

• (Odić et al., 2013): Provision of several statistics-based methods for detecting relevant context, i.e., unalikeability, entropy, sample variance, χ² test, Freeman–Halton test

• Results show a significant difference between rating predictions that use context detected as relevant and those that use context detected as irrelevant

• Pros: can improve rating prediction

• Cons: irrelevant context is still acquired during the rating acquisition phase

[Figure legend: relevant context, unclassified context, irrelevant context, baseline predictors]




Parsimonious & Adaptive Context Acquisition

• Main idea: for each user-item pair (u, i), identify the contextual factors that, when acquired together with u’s rating for i, improve the overall system the most

• Heuristic: acquire the contextual factors that have the largest impact on rating prediction

• Example: (Alice, Skiing)

[Bar chart: impact of the contextual factors Season, Weather, Temperature and Daytime for (Alice, Skiing); horizontal axis: Impact, from 0.000 to 0.500]



How to quantify this impact?


CARS Prediction Model

• We use a new variant of Context-Aware Matrix Factorization (CAMF) (Baltrunas et al., 2011) that treats contextual conditions similarly to either item or user attributes

• Advantage: it captures latent correlations and patterns across a potentially wide range of knowledge sources, which makes it well suited to deriving the usefulness of contextual factors

$$\hat{r}_{uic_1,\dots,c_k} = \Big(q_i + \sum_{a \in A(i) \cup C(i)} x_a\Big)^{\top} \cdot \Big(p_u + \sum_{b \in A(u) \cup C(u)} y_b\Big) + \bar{r}_i + b_u$$

where:

$q_i$: latent factor vector of item i
$A(i)$: set of conventional item attributes (e.g., genre)
$C(i)$: set of contextual item attributes (e.g., weather)
$x_a$: latent factor vector of item attribute a
$p_u$: latent factor vector of user u
$A(u)$: set of conventional user attributes (e.g., age)
$C(u)$: set of contextual user attributes (e.g., mood)
$y_b$: latent factor vector of user attribute b
$\bar{r}_i$: average rating for item i
$b_u$: baseline for user u
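The prediction rule can be sketched in a few lines of plain Python. This is only an illustration of the formula under stated assumptions, not the authors' implementation: the function names, the two-dimensional latent vectors, and all numeric values are hypothetical, and training of the latent vectors is not shown.

```python
from typing import List, Sequence

Vector = List[float]


def add_vectors(base: Sequence[float], extras: Sequence[Sequence[float]]) -> Vector:
    """Return base plus the element-wise sum of all vectors in extras."""
    total = list(base)
    for vec in extras:
        total = [t + v for t, v in zip(total, vec)]
    return total


def predict_rating(q_i: Sequence[float], p_u: Sequence[float],
                   item_attr_vecs: Sequence[Sequence[float]],
                   user_attr_vecs: Sequence[Sequence[float]],
                   item_mean: float, user_bias: float) -> float:
    """(q_i + sum of x_a)^T (p_u + sum of y_b) + item average + user baseline."""
    item_side = add_vectors(q_i, item_attr_vecs)   # x_a for a in A(i) and C(i)
    user_side = add_vectors(p_u, user_attr_vecs)   # y_b for b in A(u) and C(u)
    dot = sum(a * b for a, b in zip(item_side, user_side))
    return dot + item_mean + user_bias


# Toy values (assumptions): one contextual item attribute, one contextual user attribute.
x_weather = [0.0, 0.5]
y_mood = [0.5, 0.0]
example = predict_rating([0.5, 0.0], [0.5, 2.0], [x_weather], [y_mood],
                         item_mean=3.0, user_bias=0.2)
```

Passing empty attribute lists yields a prediction that ignores attributes and context, which is one way to obtain a context-free prediction from the same model.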


Largest Deviation

• Computes a personalized relevance score for a contextual factor Cj and a user-item pair (u, i)

• Given (u, i), it first measures the “impact” of each contextual condition $c_j \in C_j$ by calculating the absolute deviation between the rating prediction when the condition holds (i.e., $\hat{r}_{uic_j}$) and the predicted context-free rating (i.e., $\hat{r}_{ui}$):

$$\hat{w}_{uic_j} = f_{c_j} \, \lvert \hat{r}_{uic_j} - \hat{r}_{ui} \rvert,$$

where $f_{c_j}$ is the normalized frequency of $c_j$

• Finally, it takes the average of these individual scores for the contextual conditions to yield a single relevance score for the contextual factor $C_j$


Illustrative Example

• $\hat{r}_{\text{Alice, Skiing, Sunny}} = 5$

• $\hat{r}_{\text{Alice, Skiing}} = 3.5$

• 20% of ratings are tagged with Sunny (i.e., $f_{\text{Sunny}} = 0.2$)

• $\hat{w}_{\text{Alice, Skiing, Sunny}} = 0.2 \cdot |5 - 3.5| = 0.3$
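The Largest Deviation score can be sketched as follows. Only the Sunny numbers come from the example above; the Cloudy and Rainy frequencies and predictions, the per-factor scores passed to `top_n_factors`, and all function names are hypothetical and serve only to show the averaging and ranking steps.

```python
from typing import Dict, List


def condition_impact(freq: float, pred_in_context: float, pred_context_free: float) -> float:
    """Frequency-weighted absolute deviation for one contextual condition."""
    return freq * abs(pred_in_context - pred_context_free)


def factor_relevance(cond_freqs: Dict[str, float],
                     preds_in_context: Dict[str, float],
                     pred_context_free: float) -> float:
    """Average the condition impacts to score a whole contextual factor."""
    impacts = [condition_impact(cond_freqs[c], preds_in_context[c], pred_context_free)
               for c in cond_freqs]
    return sum(impacts) / len(impacts)


def top_n_factors(scores: Dict[str, float], n: int) -> List[str]:
    """Return the n contextual factors with the largest relevance scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Sunny values are from the slide; Cloudy and Rainy are assumed for illustration.
sunny_impact = condition_impact(0.2, 5.0, 3.5)
weather_score = factor_relevance(
    {"Sunny": 0.2, "Cloudy": 0.5, "Rainy": 0.3},
    {"Sunny": 5.0, "Cloudy": 3.5, "Rainy": 2.5},
    pred_context_free=3.5,
)
# Hypothetical factor scores for (Alice, Skiing).
selected = top_n_factors(
    {"Season": 0.45, "Weather": 0.2, "Temperature": 0.1, "Daytime": 0.05}, n=2)
```

Because the scores depend on the predictions for a specific (u, i), the selected factors can differ from one user-item pair to the next.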





Datasets

| | CoMoDa | TripAdvisor |
|---|---|---|
| Domain | Movies | POIs |
| Rating scale | 1-5 | 1-5 |
| Ratings | 2,098 | 4,147 |
| Users | 112 | 3,916 |
| Items | 1,189 | 569 |
| Contextual factors | 12 | 3 |
| Contextual conditions | 49 | 31 |
| User attributes | 4 | 2 |
| Item features | 7 | 12 |

• CoMoDa contextual factors: time, daytype, season, location, weather, social, mood, …

• CoMoDa user attributes: age, gender, city, country

• CoMoDa item features: director, country, language, year, budget, genres, actors

• TripAdvisor contextual factors: type, month and year of the trip

• TripAdvisor user attributes: user location, member type

• TripAdvisor item features: item type, amenities, item locality, price range, hotel class, …


Evaluation Procedure Overview

• Repeated random sub-sampling validation (20 times):

• Randomly partition the ratings into three subsets: training set (25%), candidate set (50%) and testing set (25%)

• For each user-item pair (u, i) in the candidate set, compute the N most relevant contextual factors and transfer the corresponding rating and context information $r_{uic}$ from the candidate set to the training set as $r_{uic'}$, where $c' \subseteq c$ contains the contextual conditions associated with these factors

• Measure user-averaged MAE (U-MAE), Precision@10 and Recall@10 on the testing set, after training the prediction model on the new extended training set

• Repeat
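The partitioning and the U-MAE measurement can be sketched as below. The sketch assumes ratings are tuples of (user, item, context, rating) and interprets U-MAE as the mean of the per-user mean absolute errors; the function names and the `predict` callback are hypothetical, and model training is not shown.

```python
import random
from collections import defaultdict
from typing import Callable, List, Sequence, Tuple


def split_ratings(ratings: Sequence, seed: int) -> Tuple[List, List, List]:
    """Randomly partition ratings into training (25%), candidate (50%), testing (25%)."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    quarter = len(shuffled) // 4
    train = shuffled[:quarter]
    test = shuffled[quarter:2 * quarter]
    candidates = shuffled[2 * quarter:]   # the remaining ratings, about 50%
    return train, candidates, test


def user_averaged_mae(test: Sequence[Tuple[str, str, dict, float]],
                      predict: Callable[[str, str, dict], float]) -> float:
    """Compute each user's MAE, then average those values over users."""
    errors = defaultdict(list)
    for user, item, context, rating in test:
        errors[user].append(abs(predict(user, item, context) - rating))
    per_user = [sum(errs) / len(errs) for errs in errors.values()]
    return sum(per_user) / len(per_user)


# One of the 20 repetitions would use its own seed, e.g. seeds 0..19.
train, candidates, test = split_ratings(range(100), seed=0)
```

Averaging per user first keeps users with many test ratings from dominating the error.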


Evaluation Procedure Example

• User-item pair: (Alice, Skiing)

• Top two contextual factors: Season and Weather

• Rating in candidate set: $r_{\text{Alice, Skiing, Winter, Sunny, Warm, Morning}} = 5$

• Rating transferred to training set: $r_{\text{Alice, Skiing, Winter, Sunny}} = 5$
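The transfer step in this example amounts to projecting the full context of a candidate rating onto the selected factors. The sketch below assumes a context is a mapping from contextual factor to contextual condition; the function names and the tuple layout are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Rating = Tuple[str, str, Dict[str, str], float]   # (user, item, context, rating)


def project_context(context: Dict[str, str], selected_factors: Iterable[str]) -> Dict[str, str]:
    """Keep only the conditions that belong to the selected contextual factors."""
    keep = set(selected_factors)
    return {factor: cond for factor, cond in context.items() if factor in keep}


def transfer(candidate: Rating, selected_factors: Iterable[str]) -> Rating:
    """Build the training-set version of a candidate rating (context c' instead of c)."""
    user, item, context, rating = candidate
    return user, item, project_context(context, selected_factors), rating


candidate = ("Alice", "Skiing",
             {"Season": "Winter", "Weather": "Sunny",
              "Temperature": "Warm", "Daytime": "Morning"},
             5.0)
transferred = transfer(candidate, ["Season", "Weather"])
```

The rating value itself is unchanged; only the amount of context that accompanies it is reduced.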


Baseline Methods for Evaluation

• Mutual Information (Baltrunas et al., 2012): given a user-item pair (u,i), it computes the relevance score for the contextual factor Cj as the normalized mutual information between the ratings for items belonging to i’s category and Cj

• Freeman-Halton Test (Odić et al., 2013): calculates the relevance of a contextual factor Cj using the Freeman-Halton test, which extends Fisher’s exact test to contingency tables larger than 2 × 2

• Minimum Redundancy Maximum Relevance - mRMR (Peng et al., 2005): ranks each contextual factor Cj according to its relevance to the rating variable and its redundancy with respect to the other contextual factors

• Random: randomly chooses the top N contextual factors for a user-item pair
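To make the Mutual Information baseline more concrete, the sketch below computes a normalized mutual information between a list of ratings and the contextual conditions observed with them. Dividing by the entropy of the ratings is an assumption made here for illustration; the normalization used by Baltrunas et al. (2012) is not given in these slides, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """I(X; Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mutual_information(ratings: Sequence[Hashable],
                                  conditions: Sequence[Hashable]) -> float:
    """Mutual information divided by the rating entropy (assumed normalization)."""
    h_ratings = entropy(ratings)
    if h_ratings == 0:
        return 0.0   # constant ratings carry no information to explain
    return mutual_information(ratings, conditions) / h_ratings


# Toy data (assumed): weather fully determines the rating in the first case.
dependent = normalized_mutual_information([5, 5, 1, 1], ["sunny", "sunny", "rainy", "rainy"])
independent = normalized_mutual_information([5, 1, 5, 1], ["sunny", "sunny", "rainy", "rainy"])
```

Unlike Largest Deviation, a score of this kind is computed from observed ratings for an item category rather than from model predictions for a specific user-item pair.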



Evaluation Results: U-MAE

[Figure: U-MAE versus number of selected contextual factors. CoMoDa panel: 1 to 4 factors, U-MAE axis from 0.71 to 0.82. TripAdvisor panel: 1 to 3 factors, U-MAE axis from 0.520 to 0.533. Compared methods: Largest Deviation, Mutual Information, Freeman-Halton, mRMR, Random, All features]


Evaluation Results: Precision@10

[Figure: Precision@10 versus number of selected contextual factors. CoMoDa panel: 1 to 4 factors, Precision@10 axis from 0.0000 to 0.0016. TripAdvisor panel: Precision@10 axis from 0.0100 to 0.0160]

Evaluation Results: Recall@10

[Figure: Recall@10 versus number of selected contextual factors. CoMoDa panel: 1 to 4 factors, Recall@10 axis from 0.000 to 0.016. TripAdvisor panel: Recall@10 axis from 0.100 to 0.160]


Evaluation Results: Practical Implications

• With Largest Deviation, the system can request only the contextual factors C1, C2 and C3 when it asks user u to rate item i







Conclusions

• Identifying which contextual factors should be acquired from the user upon rating an item is an important and practical problem for CARSs

• We tackled this problem with a new method that asks the user to specify those contextual factors that, if considered in the CARS prediction model, would produce a rating prediction that differs most from the context-free prediction

• Results from our offline experiment confirm that the proposed parsimonious context acquisition strategy elicits ratings with contextual information that improves recommendation performance more than the compared strategies


Future Work

• Evaluate the performance of employing an Active Learning method for adaptively selecting both the item to rate and the contextual information to add

• Understand how the proposed method can be extended to generate requests for contextual data that take into account possible correlations between contextual factors

• Update the evaluation procedure so that it can also be used on rating datasets for which only a subset of contextual factors is known

• Integrate the developed method into our STS app and perform a live user study



Questions?

Thank you.
