open recommendation platform
![Page 1: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/1.jpg)
Open Recommendation Platform
ACM RecSys 2013, Hong Kong
Torben Brodt, plista GmbH
Keynote
International News Recommender Systems Workshop and Challenge
October 13th, 2013
![Page 2: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/2.jpg)
where
● news websites
● below the article
different types
● content
● advertising
Where it’s coming from
Recommendations
Visitors Publisher
![Page 3: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/3.jpg)
* the company I work for
Where it’s coming from
good recommendations for...
User happy!
Advertiser happy!
Publisher happy!
plista* happy!
![Page 4: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/4.jpg)
Where it’s coming from
some years ago
Visitors Publisher Context
Recommendations
Collaborative Filtering
![Page 5: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/5.jpg)
● well-known algorithm
● more data means more knowledge
Where it’s coming from
one recommender
Collaborative Filtering
● time
● trust
● mainstream
Parameter Tuning
![Page 6: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/6.jpg)
2008
● finished studies
● 1st publication
● plista was born
today
● 5k recs/second
● many publishers
Where it’s coming from
one recommender = good results
![Page 7: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/7.jpg)
"use as many recommenders as possible!"
Where it’s coming from
netflix prize
![Page 8: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/8.jpg)
Collaborative Filtering
Most Popular
Text Similarity
etc ...
Where it’s coming from
more recommenders
![Page 9: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/9.jpg)
● we have one score
● lucky success? bad loss?
● we needed to keep track of the different recommenders
success: 0.31 %
understanding performance
lost in serendipity
![Page 10: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/10.jpg)
number of
● clicks
● orders
● engagements
● time on site
● money
bad good
understanding performance
how to measure success
![Page 11: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/11.jpg)
● features?
● big data math?
● counting!
for blending we just count floats
understanding performance
evaluation technology
Algo1 1+1 10
Algo2 100 2+5
Algo...
![Page 12: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/12.jpg)
understanding performance
evaluation technology
impressions
collaborative filtering 500 +1
most popular 500
text similarity 500
ZINCRBY "impressions" 1 "collaborative_filtering"
ZREVRANGEBYSCORE "impressions" +inf -inf WITHSCORES
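The counting on this slide can be tried without a Redis server. The following is a minimal sketch, assuming a plain in-memory `Counter` as a stand-in for the "impressions" sorted set; the helper names `zincrby` and `zrevrange_by_score` are hypothetical and only mirror the Redis commands above.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the Redis sorted set "impressions".
impressions = Counter()


def zincrby(sorted_set, member, increment=1):
    """Add `increment` to the member's score, similar to Redis ZINCRBY."""
    sorted_set[member] += increment
    return sorted_set[member]


def zrevrange_by_score(sorted_set):
    """Return (member, score) pairs ordered from highest to lowest score."""
    return sorted(sorted_set.items(), key=lambda pair: pair[1], reverse=True)


# Start from the counts shown on the slide, then record one more impression.
for name in ("collaborative_filtering", "most_popular", "text_similarity"):
    zincrby(impressions, name, 500)
zincrby(impressions, "collaborative_filtering", 1)

ranking = zrevrange_by_score(impressions)
```

After the extra impression, `collaborative_filtering` leads the ranking with a score of 501.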
![Page 13: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/13.jpg)
understanding performance
evaluation technology
impressions
collaborative filtering 500
most popular 500
text similarity 500
clicks
collaborative filtering 100
most popular 10
... 1
needs division (clicks / impressions)
ZREVRANGEBYSCORE "clicks" +inf -inf WITHSCORES
ZREVRANGEBYSCORE "impressions" +inf -inf WITHSCORES
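The division step can be sketched as follows. This is an illustration only: the function name `click_through_rates` is hypothetical, the counts are the ones readable on the slide, and a recommender without a listed click count is assumed to have zero clicks.

```python
def click_through_rates(clicks, impressions):
    """Divide clicks by impressions per recommender, skipping unseen ones."""
    rates = {}
    for name, shown in impressions.items():
        if shown > 0:  # avoid dividing by zero for recommenders never shown
            rates[name] = clicks.get(name, 0) / shown
    return rates


impressions = {
    "collaborative_filtering": 500,
    "most_popular": 500,
    "text_similarity": 500,
}
# Assumption: text_similarity has no click count on the slide, so it is left out.
clicks = {"collaborative_filtering": 100, "most_popular": 10}

rates = click_through_rates(clicks, impressions)
best = max(rates, key=rates.get)
```

With these numbers collaborative filtering reaches a rate of 0.2 (100 / 500) and is selected as `best`.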
![Page 14: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/14.jpg)
● CF is "always" the best recommender
● but "always" is just the average over all contexts
let's check the context!
understanding performance
evaluation results
success
![Page 15: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/15.jpg)
● We like anonymization! The web already gives us a rich context
● URL + HTTP headers provide
○ user agent -> device -> mobile
○ IP address -> geolocation
○ referer -> origin (search, direct)
Context
Context
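A rough sketch of deriving such context attributes from a single request, using only the Python standard library. Everything here is an assumption for illustration: the function name `extract_context`, the `SEARCH_HOSTS` list, and the "Mobi" substring heuristic for mobile devices. Geolocation from the IP address is omitted because it needs an external database.

```python
from urllib.parse import urlparse

# Illustrative list of search engine host fragments (an assumption).
SEARCH_HOSTS = ("google.", "bing.", "duckduckgo.")


def extract_context(url, headers):
    """Derive coarse, non-personal context attributes from one request."""
    user_agent = headers.get("User-Agent", "")
    referer = headers.get("Referer", "")
    referer_host = urlparse(referer).netloc.lower()

    if not referer:
        origin = "direct"
    elif any(marker in referer_host for marker in SEARCH_HOSTS):
        origin = "search"
    else:
        origin = "other"

    return {
        "publisher": urlparse(url).netloc.lower(),
        # Heuristic: most mobile browsers include "Mobi" in the user agent.
        "device": "mobile" if "Mobi" in user_agent else "desktop",
        "origin": origin,
    }


context = extract_context(
    "http://www.welt.de/some-article",
    {"User-Agent": "Mozilla/5.0 (iPhone) Mobile Safari",
     "Referer": "https://www.google.com/search?q=news"},
)
```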
![Page 16: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/16.jpg)
consider the list of best recommenders for each context attribute
sorted list for what is relevant by
● clicks (content recs)
● price (advertising recs)
publisher = welt.de
collaborative filtering 689
most popular 420
text similarity 135
category = archive
text similarity 400
collaborative filtering 200
... 100
hour = 15
recent 80
collaborative filtering 10
... 5
Context
Context
![Page 17: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/17.jpg)
publisher = welt.de
collaborative filtering 689
most popular 420
text similarity 135
weekday = sunday
collaborative filtering 400
most popular 200
... 100
category = archive
text similarity 200
collaborative filtering 10
... 5
ZUNIONSTORE "clk" 3 p:welt.de:clk w:sunday:clk c:archive:clk WEIGHTS 4 1 1
ZREVRANGEBYSCORE "clk" +inf -inf WITHSCORES
ZUNIONSTORE "imp" 3 p:welt.de:imp w:sunday:imp c:archive:imp WEIGHTS 4 1 1
ZREVRANGEBYSCORE "imp" +inf -inf WITHSCORES
Context
evaluation context
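The weighted union across context attributes can be sketched in plain Python. This is a minimal illustration, assuming dictionaries in place of Redis sorted sets; the function name `weighted_union` is hypothetical, the scores are the ones readable on the slide (elided "..." rows are left out), and the weights 4, 1, 1 follow the slide.

```python
from collections import defaultdict


def weighted_union(sorted_sets, weights):
    """Sum each member's score across all sets, scaled by the set's weight."""
    combined = defaultdict(float)
    for scores, weight in zip(sorted_sets, weights):
        for member, score in scores.items():
            combined[member] += weight * score
    return dict(combined)


publisher_clicks = {"collaborative_filtering": 689, "most_popular": 420,
                    "text_similarity": 135}   # publisher = welt.de
weekday_clicks = {"collaborative_filtering": 400, "most_popular": 200}  # sunday
category_clicks = {"text_similarity": 200, "collaborative_filtering": 10}  # archive

clk = weighted_union(
    [publisher_clicks, weekday_clicks, category_clicks], [4, 1, 1]
)
ranking = sorted(clk.items(), key=lambda pair: pair[1], reverse=True)
```

With these inputs collaborative filtering scores 4 * 689 + 400 + 10 = 3166, most popular 1880, and text similarity 740. The same union over impression counts would give the denominator for a per-context click rate.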
![Page 18: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/18.jpg)
Context can be used for optimization and targeting.
classical targeting is a limitation
Context
Targeting
![Page 19: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/19.jpg)
Recommenders
collaborative filtering 500 +1
most popular 500
text similarity 500
Advertising
RWE Europe 500 +1
IBM Germany 500
Intel Austria 500
Onsite
new iphone su... 500 +1
twitter buys p.. 500
google has seri. 500
Context
Livecube
![Page 20: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/20.jpg)
context
recap
● added another dimension
result
● better for news: Collaborative Filtering
● better for content: Text Similarity
Context
evaluation context
success
![Page 21: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/21.jpg)
what did we get?
● possibly many recommenders
● know how to measure success
● technology to see success
now breathe!
![Page 22: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/22.jpg)
● real-time evaluation technology exists
● to choose the best algorithm for the current context we need to learn: multi-armed Bayesian bandit
the ensemble
![Page 23: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/23.jpg)
Data Science
“shuffle” exploration exploitation
temporary success?
No. 1 getting most
local minima?
Interested? Look for Ted Dunning + Bayesian Bandit
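One common form of a Bayesian bandit is Thompson sampling with a Beta posterior per arm. The sketch below illustrates that general technique and is not plista's implementation; the class name `BetaBandit`, the `simulate` helper, and the click rates used in the example are all assumptions.

```python
import random


class BetaBandit:
    """Thompson sampling over recommenders with click / no-click feedback."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior per arm, stored as [clicks + 1, non-clicks + 1].
        self.params = {arm: [1, 1] for arm in arms}

    def choose(self):
        """Sample a plausible click rate per arm and pick the highest sample."""
        samples = {
            arm: self.rng.betavariate(a, b) for arm, (a, b) in self.params.items()
        }
        return max(samples, key=samples.get)

    def update(self, arm, clicked):
        """Record the outcome of one impression for the chosen arm."""
        self.params[arm][0 if clicked else 1] += 1


def simulate(true_rates, rounds, seed=0):
    """Run the bandit against made-up click rates and count pulls per arm."""
    bandit = BetaBandit(true_rates, seed=seed)
    world = random.Random(seed + 1)
    pulls = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = bandit.choose()
        pulls[arm] += 1
        bandit.update(arm, world.random() < true_rates[arm])
    return pulls


# Hypothetical click rates, chosen only to make the effect visible.
pulls = simulate(
    {"collaborative_filtering": 0.20, "most_popular": 0.05, "text_similarity": 0.02},
    rounds=3000,
)
```

Sampling from the posterior gives the "shuffle": weak arms still get occasional impressions (exploration), while the arm that currently looks best gets most of them (exploitation).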
![Page 24: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/24.jpg)
● new total / avg is much better
● thx bandit
● thx ensemble
more research
● time series
✓ better results
time
success
![Page 25: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/25.jpg)
✓ easy exploration
● tradeoff (money decision)
● between the price/time we “waste” in offline evaluation
● and the price we lose with bad recommendations
![Page 26: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/26.jpg)
● minimum pre-testing
● no risk if a recommender crashes
● "bad" code might find its context
trial and error
![Page 27: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/27.jpg)
● now plista developers can try ideas
● and allow researchers to do the same
collaboration
![Page 28: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/28.jpg)
Ensemble is able to choose
big pool of algorithms
Collaborative Filtering
Most Popular
Text Similarity
Ensemble
BPR-Linear
WR-MF
SVD++
etc.
Research Algorithms
![Page 29: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/29.jpg)
researcher has idea
src http://g-ecx.images-amazon.com/images/G/03/video/m/feature/wickie_figur.jpg
![Page 30: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/30.jpg)
● first and only dataset in news context
○ millions of items
○ only relevant for short time
● dataset has many attributes !!
● many publishers have user intersection
○ regional
○ contextual
● real world !!!
○ you can guide the user
○ you don’t need to follow his route
● real time !!
○ This is industry, it has to be usable
researcher has idea
src http://userserve-ak.last.fm/serve/_/7291575/Wickie%2B4775745.jpg
![Page 31: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/31.jpg)
... probably hosted by university, plista or any cloud provider?
... needs to start the server
![Page 32: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/32.jpg)
"message bus"
● event notifications
○ impression
○ click
● error notifications
● item updates
train model from it
... api implementation
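A participant's server could train a simple model directly from the message stream. The following is a sketch under stated assumptions: the type names "impression" and "click" appear in the slides, but "item_update", "error", and the fields `item` and `text` are hypothetical placeholders, and the real field names are defined by the API specs.

```python
from collections import Counter


class MostPopularModel:
    """Keeps click counts per item and recommends the most clicked items."""

    def __init__(self):
        self.clicks = Counter()
        self.known_items = set()
        self.errors = []

    def handle(self, message):
        """Dispatch one message from the bus by its type."""
        kind = message.get("type")
        if kind == "click":
            self.clicks[message["item"]] += 1
        elif kind == "item_update":
            self.known_items.add(message["item"])
        elif kind == "error":
            self.errors.append(message.get("text", ""))
        # Impressions carry no signal for this very simple model.

    def recommend(self, n):
        """Return up to n known items, most clicked first."""
        ranked = [
            item for item, _ in self.clicks.most_common()
            if item in self.known_items
        ]
        return ranked[:n]


model = MostPopularModel()
for message in (
    {"type": "item_update", "item": 1},
    {"type": "item_update", "item": 2},
    {"type": "click", "item": 2},
    {"type": "click", "item": 2},
    {"type": "click", "item": 1},
    {"type": "impression"},
):
    model.handle(message)

top = model.recommend(2)
```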
![Page 33: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/33.jpg)
{ // json
"type": "impression",
"context": {
"simple": {
"27": 418, // publisher
"14": 31721, // widget
...
},
"lists": {
"10": [100, 101] // channel
}
...
}
}
... package content
api specs hosted at http://orp.plista.com
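An impression message like the one above can be parsed with a standard JSON library once the explanatory comments are removed. The sketch below assumes that the numeric ids map to names as annotated on the slide (27 = publisher, 14 = widget, 10 = channel); the function name `readable_context` is hypothetical, and unknown ids are kept numeric.

```python
import json

# Attribute ids as annotated on the slide; any other id stays numeric.
SIMPLE_NAMES = {"27": "publisher", "14": "widget"}
LIST_NAMES = {"10": "channel"}


def readable_context(raw_message):
    """Parse one message and replace known attribute ids with names."""
    message = json.loads(raw_message)
    context = message.get("context", {})
    named = {}
    for key, value in context.get("simple", {}).items():
        named[SIMPLE_NAMES.get(key, key)] = value
    for key, values in context.get("lists", {}).items():
        named[LIST_NAMES.get(key, key)] = list(values)
    return message.get("type"), named


raw = (
    '{"type": "impression", "context": {'
    '"simple": {"27": 418, "14": 31721}, '
    '"lists": {"10": [100, 101]}}}'
)
event_type, context = readable_context(raw)
```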
![Page 34: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/34.jpg)
{ // json
"type": "impression",
"recs": ...
// what was recommended
}
api specs hosted at http://orp.plista.com
... package content
![Page 35: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/35.jpg)
{ // json
"type": "click",
"context": ...
// will include the position
}
... package content
api specs hosted at http://orp.plista.com
![Page 36: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/36.jpg)
recs
{ // json
"recs": {
"int": {
"3": [13010630, 84799192]
// 3 refers to content recommendations
}
...
}
}
generated by researchers, to be shown to real users
API
Real User
Researcher
... reply to recommendation requests
api specs hosted at http://orp.plista.com
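Building a reply in the shape shown above could look like this. It is only a sketch: the function name `build_reply` and the `limit` parameter are assumptions, and the authoritative format is the one in the API specs.

```python
import json

CONTENT_RECS = "3"  # per the slide, 3 refers to content recommendations


def build_reply(item_ids, limit):
    """Serialize up to `limit` distinct item ids in the reply shape above."""
    unique = list(dict.fromkeys(item_ids))[:limit]  # drop repeats, keep order
    return json.dumps({"recs": {"int": {CONTENT_RECS: unique}}})


reply = build_reply([13010630, 84799192, 13010630], limit=2)
```

Removing duplicates before truncating avoids showing the same article twice in one widget.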
![Page 37: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/37.jpg)
recs
Real User
Researcher
● happy user
● happy researcher
● happy plista
research can profit
● real user feedback
● real benchmark
quality is win win #2
![Page 38: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/38.jpg)
use common frameworks
src http://en.wikipedia.org/wiki/Pac-Man
how to build fast system?
![Page 39: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/39.jpg)
● no movies!
● news articles will outdate!
● visitors need the recs NOW
● => handle the data very fast
src http://static.comicvine.com/uploads/original/10/101435/2026520-flash.jpg
quick and fast
![Page 40: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/40.jpg)
● fast web server
● fast network protocol
● fast message queue
● fast storage
or Apache Kafka
"send quickly" technologies
![Page 41: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/41.jpg)
"real-time features feel better in a real-time world"
we don't need batch! see http://goo.gl/AJntul
our setup
● php, it's easy
● redis, it's fast
● r, it's well known
comparison to plista
![Page 42: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/42.jpg)
Overview
Publisher
Recommendations
Feedback
Collaborative Filtering
Most Popular
Text Similarity
etc.
Preferences
Ensemble
Visitors
![Page 43: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/43.jpg)
● 2012
○ Contest v1
● 2013
○ ACM RecSys “News Recommender Challenge”
● 2014
○ CLEF News Recommendation Evaluation Labs “newsreel”
Overview
![Page 44: Open recommendation platform](https://reader033.vdocument.in/reader033/viewer/2022051412/54b73e674a795966598b46e6/html5/thumbnails/44.jpg)
Contact
http://goo.gl/pvXm5 (Blog)
[email protected]
lnkd.in/MUXXuv
xing.com/profile/Torben_Brodt
www.plista.com
News Recommender Challenge
https://sites.google.com/site/newsrec2013/
#RecSys @torbenbrodt @NRSws2013 @plista
questions?