TRANSCRIPT
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Management Systems
Cornelius A. Ludmann, Marco Grawunder, Timo Michelsen, H.-Jürgen Appelrath
{cornelius.ludmann, marco.grawunder, timo.michelsen, appelrath}@uni-oldenburg.de
University of Oldenburg · Department of Computer Science · Information Systems Group
DSEP @ BTW 2015, Hamburg, Germany · March 3, 2015
cornelius.ludmann.org · @ludmann
2
Content
● Short introduction to Recommender Systems
● From static and finite rating datasets to an infinite stream of rating events
● Continuous query plan for collaborative RecSys
● Evaluation methodology for a DSMS-based RecSys
● Prototypical implementation with Odysseus
3
Problem: Information Overload
Photo: CC BY-NC 2.0, https://www.flickr.com/photos/lyonora/3608656428/, User Leonora Giovanazzi
How to handle the flood of information?
4
From Search to Recommendation
“The Web [...] is leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.”
– Jeffrey M. O'Brien: “The race to create a 'smart' Google”, CNN Money / Fortune Magazine (2006)
http://archive.fortune.com/magazines/fortune/fortune_archive/2006/11/27/8394347/index.htm
5
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
6
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
products
7
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
movies
8
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
music
9
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
quantified by a rating score, e. g., 1 to 5 stars
10
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
f: U × I → R
U: set of users, I: set of items, R: range of the rating score
e. g., R = {1, 2, 3, 4, 5}
11
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
How to estimate f?
A common approach is Collaborative Filtering.
12
Collaborative Filtering (CF)
[Figure: user–item rating matrix with sample entries such as 5, 3, 1, 4]
Ratings are given by the users: explicitly or implicitly by their behavior.
13
Collaborative Filtering (CF)
[Figure: the same user–item matrix, a sparse rating matrix with most cells empty]
14
Collaborative Filtering (CF)
A learner “finds” an approximation f̂ to the true function f.
Examples of learners are:
● User similarity methods
● Matrix Factorization methods (see the sketch below)
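As a brief sketch of the matrix factorization idea (the slide gives no formula, so the plain latent-factor form is assumed here; BRISMF, used later in the prototype, adds bias terms on top of it): every user u and item i gets a K-dimensional factor vector, and the predicted rating is their inner product. The factors are fitted by minimizing the regularized squared error over the known ratings:

```latex
\hat{r}_{ui} = \hat{f}(u, i) = p_u^{\top} q_i = \sum_{k=1}^{K} p_{uk}\, q_{ik},
\qquad p_u, q_i \in \mathbb{R}^{K}

\min_{P, Q} \sum_{(u,i)\ \text{rated}} \bigl(r_{ui} - p_u^{\top} q_i\bigr)^2
+ \lambda \bigl(\lVert p_u \rVert^2 + \lVert q_i \rVert^2\bigr)
```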
15
Collaborative Filtering (CF)
[Figure: the rating matrix with predicted ratings (e. g., 2.4) filled into the empty cells]
Predict a rating for all unrated items; recommend the items with the highest predicted ratings.
16
What about …?
… different situations?
… mood of the users?
… (temporary) changes of user interests?
17
Problem of Traditional RecSys
The interests of users in items can differ at different points in time.
● Depending on the situation of the user
– Context awareness
– Hidden contexts
● Due to changing user preferences
– Concept drift
Context-Aware RecSys consider context data.
Time-Aware RecSys consider temporal effects.
18
From Static Rating Datasets …
Traditional RecSys learners use a static and finite set of rating data to build a RecSys model.
The model is re-learned recurringly after a specific time span to incorporate new rating data.
19
… to Rating Events (Real-life RecSys)
[Figure: a continuous stream of rating events (user, item, rating) at time t feeding a model, e. g., the factor matrices P and Q]
Rating events occur continuously, which leads to new learning data that potentially improves the predictions for all users.
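A rating event can be thought of as a small, timestamped tuple. As a minimal sketch (the type and field names are illustrative, not from Odysseus):

```java
/** One rating event in the stream: who rated which item, how, and when. */
public record RatingEvent(long userId, long itemId, double rating, long timestamp) { }
```

Each such tuple is simultaneously feedback on past recommendations and new learning data for the model.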
20
Related Work
● Time-aware Collaborative RecSys (e. g., Koren 2010)
● Incremental/Online Collaborative RecSys Algorithms (e. g., BRISMF by Gábor et al. 2009)
● Collaborative RecSys with Apache Storm by Ali et al. 2011– Algorithm for parallel collaborative filtering
● StreamRec by Campos et al. 2011– Based on Microsoft StreamInsight– Calculates item/user similarities with DSMS operators
● Massive Online Analysis (MOA) by Bifet et al. 2010– Framework for evaluating machine learning algorithms– Implements BRISMF
21
Why DSMS + RecSys?
Characteristics of a (our) DSMS:
● Continuous queries on potentially infinite data streams
● Stream-based operators (one pass) / stream-based relational algebra
● Query plan as a directed graph of operators
● Logical operators are transformed to physical operators
● Query optimizations, query sharing
● Time annotation of stream elements
● Operator framework
22
Why DSMS + RecSys?
What is the current interest of a user?
● Continuous processing of rating data
● Models are valid for a specific time interval
● Deterministic temporal matching of model and data

In general:
● Take advantage of DSMS features (like optimizations, flexible query formulation, …)
● Usage of established standard operators
● Extensibility, e. g., pre- and post-processing (context reasoning/modeling, normalization, …)
23
RecSys Operators
[Diagram: RecSys operators as a dataflow graph. Rating events feed the “train recsys model” operator, which emits models with a validity time interval. A request for recommendations enters “get unrated items”; the resulting recommendation candidates flow into “predict rating”, which joins them with the current model, and “recommend” selects the top-k items. Delivered recommendations produce feedback in the form of new rating events.]
24
RecSys Operators
[Diagram: the same operator graph with a window operator inserted between the rating events and “train recsys model”.]
Rating data is potentially infinite in size!
A window operator limits the tuples in memory.
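As a minimal sketch of the idea (illustrative, not the Odysseus operator API; a count-based window is shown for brevity, while the prototype works with time-based validity intervals):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Keeps only the most recent maxSize rating tuples in memory. */
public class CountWindow<T> {
    private final Deque<T> buffer = new ArrayDeque<>();
    private final int maxSize;

    public CountWindow(int maxSize) { this.maxSize = maxSize; }

    /** Adds a new tuple; evicts the oldest one if the window is full. */
    public void add(T tuple) {
        if (buffer.size() == maxSize) {
            buffer.removeFirst(); // the oldest tuple leaves the window (its validity ends)
        }
        buffer.addLast(tuple);
    }

    public Iterable<T> contents() { return buffer; }
}
```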
25
RecSys Operators
● “train recsys model” Operator
– Implements a learner to train a model.
– Holds the valid rating data.
– Outputs models with a validity time interval.
● “get unrated items” Operator
– Receives requests for recommendations.
– Outputs a tuple as recommendation candidate for each item that was not rated by the requesting user.
● “predict rating” Operator
– Predicts the rating for each recommendation candidate.
● “recommend” Operator
– Selects the recommendations for the requesting user.
– Selects items with a min. rating and/or the top-k items.
(A minimal code sketch of this recommendation path follows below.)
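A compact sketch of what these operators compute, with illustrative names and a generic model interface (this is not the Odysseus API):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of the recommendation path. */
public class RecommendPath {

    /** Stand-in for a RecSys model with a validity time interval. */
    interface Model { double predict(long userId, long itemId); }

    /** “get unrated items”: emit a candidate for every item the user has not rated. */
    static List<Long> unratedItems(Set<Long> allItems, Set<Long> ratedByUser) {
        return allItems.stream().filter(i -> !ratedByUser.contains(i)).toList();
    }

    /** “predict rating” + “recommend”: score candidates, keep those above
     *  minRating, and return the top-k by predicted rating. */
    static List<Long> recommend(long userId, List<Long> candidates, Model model,
                                double minRating, int k) {
        return candidates.stream()
                .filter(i -> model.predict(userId, i) >= minRating)
                .sorted(Comparator.comparingDouble((Long i) -> model.predict(userId, i)).reversed())
                .limit(k)
                .toList();
    }
}
```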
26
Operators for Continuous Evaluation
[Diagram: the operator graph extended for continuous evaluation. A “route” operator splits incoming rating events into learning data (through the window into “train recsys model”) and test data (into a second “predict rating” operator). A “test prediction” operator compares the true and predicted ratings and outputs the prediction error.]
27
Operators for Continuous Evaluation
● “route” Operator
– Distributes incoming data as learning or test data.
– Implements an evaluation methodology.
– e. g., hold-out: route 10 % as test data
● “predict rating” Operator
– Predicts the rating for each test tuple.
● “test prediction” Operator
– Calculates an error value from the true rating and the predicted rating.
– e. g., Root Mean Square Error (RMSE), as sketched below
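A minimal sketch of the hold-out “route” decision and the RMSE metric (the 10 % ratio follows the slide; the class and method names are assumptions, not the Odysseus API):

```java
import java.util.Random;

/** Illustrative sketch of hold-out routing and the RMSE error metric. */
public class HoldOutEvaluation {
    private final Random random = new Random();
    private double sumSquaredError = 0;
    private long testCount = 0;

    /** “route”: with probability 0.1 a rating tuple becomes test data,
     *  otherwise it is forwarded as learning data. */
    boolean routeAsTestData() {
        return random.nextDouble() < 0.1;
    }

    /** “test prediction”: accumulate the squared error of one test tuple. */
    void testPrediction(double trueRating, double predictedRating) {
        double error = trueRating - predictedRating;
        sumSquaredError += error * error;
        testCount++;
    }

    /** Root Mean Square Error over all test tuples seen so far. */
    double rmse() {
        return Math.sqrt(sumSquaredError / testCount);
    }
}
```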
28
Prototypical Implementation
29
Physical Query Plan
[Screenshot: physical query plan in Odysseus. Annotated operators: metadata creation (time interval) for the incoming rating data and requests for recommendations (RfR); an operator that routes learning and test data; joins of RfR and of test data with temporally matching models; an operator that limits the validity of learning data; the operator that implements the learner; and a “now” window.]
30
Physical Query Plan
[Screenshot, continued: a map operator that adds the predicted rating to each test tuple; window (aggregation size), aggregation, and map operators for the error metric; and the recommendation path, which outputs unrated items, joins them with models, adds the predicted rating to each unrated item, selects items with a min. rating, and selects the top-k items.]
31
Physical Operators
● “train recsys model” Operator
– BRISMF (incremental matrix factorization)
– BRISMF implementation of Massive Online Analysis (MOA) integrated in Odysseus: http://moa.cms.waikato.ac.nz/details/recommender-systems/
– One model for each subset of valid learning tuples.
– Exactly one valid model at every point in time.
– Needs to hold all learning tuples in memory but does not build models from scratch (see the update sketch below).
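A minimal sketch of the incremental update idea: a simplified, BRISMF-like SGD step without the bias terms, where the learning rate and regularization constants are assumed values, not taken from the paper:

```java
/** Simplified incremental matrix-factorization update (BRISMF-like SGD step). */
public class IncrementalMF {
    static final double LEARN_RATE = 0.01; // assumed value
    static final double REG = 0.05;        // assumed value

    /** One SGD step on a single new rating: adjust user and item factors in place. */
    static void train(double[] userFactors, double[] itemFactors, double rating) {
        double error = rating - predict(userFactors, itemFactors);
        for (int k = 0; k < userFactors.length; k++) {
            double pu = userFactors[k], qi = itemFactors[k];
            userFactors[k] += LEARN_RATE * (error * qi - REG * pu);
            itemFactors[k] += LEARN_RATE * (error * pu - REG * qi);
        }
    }

    /** Predicted rating = inner product of the latent factor vectors. */
    static double predict(double[] userFactors, double[] itemFactors) {
        double dot = 0;
        for (int k = 0; k < userFactors.length; k++) dot += userFactors[k] * itemFactors[k];
        return dot;
    }
}
```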
32
Physical Operators
● “interleaved test-then-train” Operator
– Physical counterpart to the “route” operator
– Uses each rating tuple first for testing and then for learning (sketched below)
– Sets the validity interval of the test tuple to [t-1, t) to ensure a match with a model that has not used this tuple for learning
● “test prediction” Operator is implemented by
– a map operator → calculates the squared error
– a time window → sets the aggregation time span
– an aggregation operator → averages the errors
– and another map operator → calculates the root of the error
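A minimal sketch of the interleaved test-then-train ordering (names are assumptions, not the Odysseus API): every rating event is first evaluated against a model that has not seen it, mirroring the [t-1, t) validity trick, and only afterwards handed to the learner.

```java
/** Illustrative sketch of interleaved test-then-train with a running RMSE. */
public class InterleavedTestThenTrain {

    /** Stand-in for an incremental learner such as BRISMF. */
    interface Learner {
        double predict(long userId, long itemId);
        void train(long userId, long itemId, double rating);
    }

    private double sumSquaredError = 0;
    private long count = 0;

    void onRatingEvent(long userId, long itemId, double rating, Learner learner) {
        double predicted = learner.predict(userId, itemId); // test first ...
        double error = rating - predicted;
        sumSquaredError += error * error;
        count++;
        learner.train(userId, itemId, rating);              // ... then train
    }

    /** Running RMSE over all rating events seen so far. */
    double rmse() { return Math.sqrt(sumSquaredError / count); }
}
```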
33
Plot of Root Mean Square Error
34
Prototype Evaluation
● Comparison of the RMSE after every learning tuple with Massive Online Analysis (MOA)
– MovieLens dataset, ordered by timestamp, read line-by-line as rating data
– No decay of learning tuples (unbounded window)
– Aggregation of the RMSE over the whole dataset
– (Random users for the requests for recommendations)
● Same results as MOA
– MOA operates sequentially (first test, then train); we ensure the correct order by the time annotation and the temporal join
– Temporal matching works as expected
35
Summary and Future Work
● Generic, extendable and modular structure for a RecSys based on DSMS operators
● Logical operators allow different physical implementations● Time annotations ensures deterministic temporal matching of models
and data● Prototypical implementation with BRISMF and Interleaved Test-Than-
Train
● Future Work:– Implementation of learners that consider temporal aspects– Impact of decay of tuples (different windows) on accuracy, latency,
throughput, memory consumption– Optimizations of algorithms, query plan, transformations …
Thank you for your attention!