TRANSCRIPT
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Management Systems
Cornelius A. Ludmann, Marco Grawunder, Timo Michelsen, H.-Jürgen Appelrath
{cornelius.ludmann, marco.grawunder, timo.michelsen, appelrath}@uni-oldenburg.de
University of Oldenburg · Department of Computer Science · Information Systems Group
DSEP @ BTW 2015, Hamburg, Germany · March 3, 2015
cornelius.ludmann.org · @ludmann
2
Content
● Short introduction to Recommender Systems
● From static and finite rating datasets to an infinite stream of rating events
● Continuous query plan for collaborative RecSys
● Evaluation methodology for a DSMS-based RecSys
● Prototypical implementation with Odysseus
3
Problem: Information Overload
Photo: CC BY-NC 2.0, https://www.flickr.com/photos/lyonora/3608656428/, User Leonora Giovanazzi
How to handle the flood of information?
4
From Search to Recommendation
“The Web [...] is leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.”
– Jeffrey M. O'Brien: “The race to create a 'smart' Google”, CNN Money / Fortune Magazine (2006)
http://archive.fortune.com/magazines/fortune/fortune_archive/2006/11/27/8394347/index.htm
5
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
6
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
products
7
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
movies
8
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
music
9
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
quantified by a rating score, e. g., 1 to 5 stars
10
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
f: U × I → R
U: set of users, I: set of items, R: range of the rating score
e. g., R = {1, 2, 3, 4, 5}
11
The Recommender Problem
Estimate a utility function that predicts how a user will like an item.
How to estimate f?
A common approach is Collaborative Filtering.
12
Collaborative Filtering (CF)
[Figure: user–item rating matrix with sample entries such as 5, 3, 1, 4]
Ratings are given by the users: explicitly or implicitly by their behavior.
13
Collaborative Filtering (CF)
[Figure: the same user–item matrix, a sparse rating matrix with most cells empty]
14
Collaborative Filtering (CF)
A learner “finds” an approximation f̂ to the true function f.
Examples of learners are:
● User similarity methods
● Matrix Factorization methods (see the sketch below)
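As a brief sketch of the matrix factorization idea (the slide gives no formula, so the plain latent-factor form is assumed here; BRISMF, used later in the prototype, adds bias terms on top of it): every user u and item i gets a K-dimensional factor vector, and the predicted rating is their inner product. The factors are fitted by minimizing the regularized squared error over the known ratings:

```latex
\hat{r}_{ui} = \hat{f}(u, i) = p_u^{\top} q_i = \sum_{k=1}^{K} p_{uk}\, q_{ik},
\qquad p_u, q_i \in \mathbb{R}^{K}

\min_{P, Q} \sum_{(u,i)\ \text{rated}} \bigl(r_{ui} - p_u^{\top} q_i\bigr)^2
+ \lambda \bigl(\lVert p_u \rVert^2 + \lVert q_i \rVert^2\bigr)
```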
15
Collaborative Filtering (CF)
[Figure: the rating matrix with predicted ratings (e. g., 2.4) filled into the empty cells]
Predict a rating for all unrated items; recommend the items with the highest predicted ratings.
16
What about …?
… different situations?
… mood of the users?
… (temporary) changes of user interests?
17
Problem of Traditional RecSys
The interests of users in items can differ at different points in time.
● Depending on the situation of the user
– Context awareness
– Hidden contexts
● Due to changing user preferences
– Concept drift
Context-Aware RecSys consider context data.
Time-Aware RecSys consider temporal effects.
18
From Static Rating Datasets …
Traditional RecSys learners use a static and finite set of rating data to build a RecSys model.
The model is re-learned recurringly after a specific time span to incorporate new rating data.
19
… to Rating Events (Real-life RecSys)
[Figure: a continuous stream of rating events (user, item, rating) at time t feeding a model, e. g., the factor matrices P and Q]
Rating events occur continuously, which leads to new learning data that potentially improves the predictions for all users.
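A rating event can be thought of as a small, timestamped tuple. As a minimal sketch (the type and field names are illustrative, not from Odysseus):

```java
/** One rating event in the stream: who rated which item, how, and when. */
public record RatingEvent(long userId, long itemId, double rating, long timestamp) { }
```

Each such tuple is simultaneously feedback on past recommendations and new learning data for the model.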
20
Related Work
● Time-aware Collaborative RecSys (e. g., Koren 2010)
● Incremental/Online Collaborative RecSys Algorithms (e. g., BRISMF by Gábor et al. 2009)
● Collaborative RecSys with Apache Storm by Ali et al. 2011– Algorithm for parallel collaborative filtering
● StreamRec by Campos et al. 2011– Based on Microsoft StreamInsight– Calculates item/user similarities with DSMS operators
● Massive Online Analysis (MOA) by Bifet et al. 2010– Framework for evaluating machine learning algorithms– Implements BRISMF
21
Why DSMS + RecSys?
Characteristics of a (our) DSMS:
● Continuous queries on potentially infinite data streams
● Stream-based operators (one pass) / stream-based relational algebra
● Query plan as a directed graph of operators
● Logical operators are transformed to physical operators
● Query optimizations, query sharing
● Time annotation of stream elements
● Operator framework
22
Why DSMS + RecSys?
What is the current interest of a user?
● Continuous processing of rating data
● Models are valid for a specific time interval
● Deterministic temporal matching of model and data

In general:
● Take advantage of DSMS features (like optimizations, flexible query formulation, …)
● Usage of established standard operators
● Extensibility, e. g., pre- and post-processing (context reasoning/modeling, normalization, …)
23
RecSys Operators
[Diagram: RecSys operators as a dataflow graph. Rating events feed the “train recsys model” operator, which emits models with a validity time interval. A request for recommendations enters “get unrated items”; the resulting recommendation candidates flow into “predict rating”, which joins them with the current model, and “recommend” selects the top-k items. Delivered recommendations produce feedback in the form of new rating events.]
24
RecSys Operators
[Diagram: the same operator graph with a window operator inserted between the rating events and “train recsys model”.]
Rating data is potentially infinite in size!
A window operator limits the tuples in memory.
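As a minimal sketch of the idea (illustrative, not the Odysseus operator API; a count-based window is shown for brevity, while the prototype works with time-based validity intervals):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Keeps only the most recent maxSize rating tuples in memory. */
public class CountWindow<T> {
    private final Deque<T> buffer = new ArrayDeque<>();
    private final int maxSize;

    public CountWindow(int maxSize) { this.maxSize = maxSize; }

    /** Adds a new tuple; evicts the oldest one if the window is full. */
    public void add(T tuple) {
        if (buffer.size() == maxSize) {
            buffer.removeFirst(); // the oldest tuple leaves the window (its validity ends)
        }
        buffer.addLast(tuple);
    }

    public Iterable<T> contents() { return buffer; }
}
```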
25
RecSys Operators
● “train recsys model” Operator
– Implements a learner to train a model.
– Holds the valid rating data.
– Outputs models with a validity time interval.
● “get unrated items” Operator
– Receives requests for recommendations.
– Outputs a tuple as recommendation candidate for each item that was not rated by the requesting user.
● “predict rating” Operator
– Predicts the rating for each recommendation candidate.
● “recommend” Operator
– Selects the recommendations for the requesting user.
– Selects items with a min. rating and/or the top-k items.
(A minimal code sketch of this recommendation path follows below.)
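A compact sketch of what these operators compute, with illustrative names and a generic model interface (this is not the Odysseus API):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of the recommendation path. */
public class RecommendPath {

    /** Stand-in for a RecSys model with a validity time interval. */
    interface Model { double predict(long userId, long itemId); }

    /** “get unrated items”: emit a candidate for every item the user has not rated. */
    static List<Long> unratedItems(Set<Long> allItems, Set<Long> ratedByUser) {
        return allItems.stream().filter(i -> !ratedByUser.contains(i)).toList();
    }

    /** “predict rating” + “recommend”: score candidates, keep those above
     *  minRating, and return the top-k by predicted rating. */
    static List<Long> recommend(long userId, List<Long> candidates, Model model,
                                double minRating, int k) {
        return candidates.stream()
                .filter(i -> model.predict(userId, i) >= minRating)
                .sorted(Comparator.comparingDouble((Long i) -> model.predict(userId, i)).reversed())
                .limit(k)
                .toList();
    }
}
```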
26
Operators for Continuous Evaluation
[Diagram: the operator graph extended for continuous evaluation. A “route” operator splits incoming rating events into learning data (through the window into “train recsys model”) and test data (into a second “predict rating” operator). A “test prediction” operator compares the true and predicted ratings and outputs the prediction error.]
27
Operators for Continuous Evaluation
● “route” Operator
– Distributes incoming data as learning or test data.
– Implements an evaluation methodology.
– e. g., hold-out: route 10 % as test data
● “predict rating” Operator
– Predicts the rating for each test tuple.
● “test prediction” Operator
– Calculates an error value from the true rating and the predicted rating.
– e. g., Root Mean Square Error (RMSE), as sketched below
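A minimal sketch of the hold-out “route” decision and the RMSE metric (the 10 % ratio follows the slide; the class and method names are assumptions, not the Odysseus API):

```java
import java.util.Random;

/** Illustrative sketch of hold-out routing and the RMSE error metric. */
public class HoldOutEvaluation {
    private final Random random = new Random();
    private double sumSquaredError = 0;
    private long testCount = 0;

    /** “route”: with probability 0.1 a rating tuple becomes test data,
     *  otherwise it is forwarded as learning data. */
    boolean routeAsTestData() {
        return random.nextDouble() < 0.1;
    }

    /** “test prediction”: accumulate the squared error of one test tuple. */
    void testPrediction(double trueRating, double predictedRating) {
        double error = trueRating - predictedRating;
        sumSquaredError += error * error;
        testCount++;
    }

    /** Root Mean Square Error over all test tuples seen so far. */
    double rmse() {
        return Math.sqrt(sumSquaredError / testCount);
    }
}
```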
28
Prototypical Implementation
29
Physical Query Plan
[Screenshot: physical query plan in Odysseus. Annotated operators: metadata creation (time interval) for the incoming rating data and requests for recommendations (RfR); an operator that routes learning and test data; joins of RfR and of test data with temporally matching models; an operator that limits the validity of learning data; the operator that implements the learner; and a “now” window.]
30
Physical Query Plan
[Screenshot, continued: a map operator that adds the predicted rating to each test tuple; window (aggregation size), aggregation, and map operators for the error metric; and the recommendation path, which outputs unrated items, joins them with models, adds the predicted rating to each unrated item, selects items with a min. rating, and selects the top-k items.]
31
Physical Operators
● “train recsys model” Operator
– BRISMF (incremental matrix factorization)
– BRISMF implementation of Massive Online Analysis (MOA) integrated in Odysseus: http://moa.cms.waikato.ac.nz/details/recommender-systems/
– One model for each subset of valid learning tuples.
– Exactly one valid model at every point in time.
– Needs to hold all learning tuples in memory but does not build models from scratch (see the update sketch below).
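A minimal sketch of the incremental update idea: a simplified, BRISMF-like SGD step without the bias terms, where the learning rate and regularization constants are assumed values, not taken from the paper:

```java
/** Simplified incremental matrix-factorization update (BRISMF-like SGD step). */
public class IncrementalMF {
    static final double LEARN_RATE = 0.01; // assumed value
    static final double REG = 0.05;        // assumed value

    /** One SGD step on a single new rating: adjust user and item factors in place. */
    static void train(double[] userFactors, double[] itemFactors, double rating) {
        double error = rating - predict(userFactors, itemFactors);
        for (int k = 0; k < userFactors.length; k++) {
            double pu = userFactors[k], qi = itemFactors[k];
            userFactors[k] += LEARN_RATE * (error * qi - REG * pu);
            itemFactors[k] += LEARN_RATE * (error * pu - REG * qi);
        }
    }

    /** Predicted rating = inner product of the latent factor vectors. */
    static double predict(double[] userFactors, double[] itemFactors) {
        double dot = 0;
        for (int k = 0; k < userFactors.length; k++) dot += userFactors[k] * itemFactors[k];
        return dot;
    }
}
```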
32
Physical Operators
● “interleaved test-then-train” Operator
– Physical counterpart to the “route” operator
– Uses each rating tuple first for testing and then for learning (sketched below)
– Sets the validity interval of the test tuple to [t-1, t) to ensure a match with a model that has not used this tuple for learning
● “test prediction” Operator is implemented by
– a map operator → calculates the squared error
– a time window → sets the aggregation time span
– an aggregation operator → averages the errors
– and another map operator → calculates the root of the error
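A minimal sketch of the interleaved test-then-train ordering (names are assumptions, not the Odysseus API): every rating event is first evaluated against a model that has not seen it, mirroring the [t-1, t) validity trick, and only afterwards handed to the learner.

```java
/** Illustrative sketch of interleaved test-then-train with a running RMSE. */
public class InterleavedTestThenTrain {

    /** Stand-in for an incremental learner such as BRISMF. */
    interface Learner {
        double predict(long userId, long itemId);
        void train(long userId, long itemId, double rating);
    }

    private double sumSquaredError = 0;
    private long count = 0;

    void onRatingEvent(long userId, long itemId, double rating, Learner learner) {
        double predicted = learner.predict(userId, itemId); // test first ...
        double error = rating - predicted;
        sumSquaredError += error * error;
        count++;
        learner.train(userId, itemId, rating);              // ... then train
    }

    /** Running RMSE over all rating events seen so far. */
    double rmse() { return Math.sqrt(sumSquaredError / count); }
}
```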
33
Plot of Root Mean Square Error
34
Prototype Evaluation
● Comparison of the RMSE after every learning tuple with Massive Online Analysis (MOA)
– MovieLens dataset, ordered by timestamp, read line-by-line as rating data
– No decay of learning tuples (unbounded window)
– Aggregation of the RMSE over the whole dataset
– (Random users for the requests for recommendations)
● Same results as MOA
– MOA operates sequentially (first test, then train); we ensure the correct order by the time annotation and the temporal join
– Temporal matching works as expected
35
Summary and Future Work
● Generic, extendable and modular structure for a RecSys based on DSMS operators
● Logical operators allow different physical implementations● Time annotations ensures deterministic temporal matching of models
and data● Prototypical implementation with BRISMF and Interleaved Test-Than-
Train
● Future Work:– Implementation of learners that consider temporal aspects– Impact of decay of tuples (different windows) on accuracy, latency,
throughput, memory consumption– Optimizations of algorithms, query plan, transformations …
Thank you for your attention!