Living Labs Challenge Workshop

Open Recommendation Platform for Researchers and Developers. Living Labs Challenge Workshop, University of Amsterdam, June 6th, 2014. Torben Brodt, plista GmbH. -> http://orp.plista.com -> http://living-labs.net/llc/ @torbenbrodt

Upload: torben-brodt

Post on 12-Apr-2017

TRANSCRIPT

Page 1: Living Labs Challenge Workshop

Open Recommendation Platform

For Researchers and Developers
Living Labs Challenge Workshop
University of Amsterdam
June 6th, 2014

Torben Brodt
plista GmbH

-> http://orp.plista.com
-> http://living-labs.net/llc/

@torbenbrodt

Page 2: Living Labs Challenge Workshop

Contents

1. what we built for ourselves
  ○ recommendation engine
2. how we built it
  ○ big data math
  ○ system architecture
3. application for “living labs”
  ○ for developers, researchers and geeks

Not just opening algorithms to partners, but opening our platform to algorithms.

Page 3: Living Labs Challenge Workshop

What we built for ourselves

Recommendation Engine

where
● news websites
● below the article
● now in NL too!

different types
● content
● advertising

[Diagram: Visitors, Publisher]

Page 4: Living Labs Challenge Workshop

What we built for ourselves

Recommendation Engine

[Diagram: Visitors, Publisher, Request, Engine, Results; labels: Context, Personalized]

Page 5: Living Labs Challenge Workshop

What we built for ourselves

Collaborative Filtering

[Diagram: Peter, James]

Peter and James have something in common: they both like football.

Term: User Similarity

Page 6: Living Labs Challenge Workshop

What we built for ourselves

Collaborative Filtering

[Diagram: Peter, James]

Tennis will be recommended to Peter, because James likes it too.

Item Recommendation from User Similarity
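The Peter and James example can be sketched in a few lines of Python. This is only an illustration, not plista's implementation: the toy data, the Jaccard measure and the function names are assumptions chosen for brevity.

```python
# Hypothetical toy data: which topics each user liked (names taken from the slide).
likes = {
    "Peter": {"football"},
    "James": {"football", "tennis"},
}

def jaccard(a, b):
    """User similarity as the overlap of liked items (one of many possible measures)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, likes):
    """Rank items the user has not liked yet by the similarity of the users who like them."""
    scores = {}
    for other, items in likes.items():
        if other == user:
            continue
        sim = jaccard(likes[user], items)
        for item in items - likes[user]:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("Peter", likes))
```

No domain knowledge about football or tennis is involved; the recommendation follows from the overlap between the two users alone.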

Page 7: Living Labs Challenge Workshop

What we built for ourselves

Collaborative Filtering

● more data => more knowledge
● not needed:
  ○ domain knowledge
  ○ concrete user
  ○ concrete article

Page 10: Living Labs Challenge Workshop

What we built for ourselves

More recommenders

Text Similarity
● article content matching recommendation content
● but which ads to present to political content?

Most Popular
● premise: what everybody likes is also good for me

etc ...
● e.g. public trends, social likes, wiki data, NLP, matrix factorization
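As a rough illustration of the two recommenders named on this slide, the sketch below scores text similarity with a bag-of-words cosine and ranks items by popularity. Both functions and their names are assumptions for illustration; the slides do not say which similarity measure or popularity signal is used in production.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts over simple whitespace-separated word counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def most_popular(click_counts, n=3):
    """Return the n items with the highest click counts."""
    return [item for item, _ in Counter(click_counts).most_common(n)]

print(cosine_similarity("new iphone released", "iphone review"))
print(most_popular({"article a": 5, "article b": 9, "article c": 1}, n=2))
```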

Page 11: Living Labs Challenge Workshop

What we built for ourselves

good recommendations for...

User happy!

Advertiser happy!

Publisher happy!

plista happy!

Page 12: Living Labs Challenge Workshop

What we built for ourselves

What are the goals?

high number of...
● clicks
● attention
● orders
● engages/videos
● time on site
● page depth

[Scale: bad to good]

Page 14: Living Labs Challenge Workshop

What we built for ourselves

Who wants these goals?

Advertising Goal
● RWE Europe: 500 +1
● IBM Germany: 500
● Intel Austria: 500

Recommenders Goal
● collaborative filtering: 500 +1
● most popular: 500
● text similarity: 500

Content Goal
● new iphone su...: 500 +1
● twitter buys p..: 500
● google has seri.: 500

used to A/B test our algorithms

Page 16: Living Labs Challenge Workshop

What we built for ourselves

All goals have a context

Ad or Content or Recommender

...

● user agent > device > mobile
● IP address > geolocation
● referer > origin (search, direct)
● anonymous!
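The mapping from raw request data to context attributes could look roughly like the sketch below. The function name, the returned keys and the matching rules are hypothetical simplifications; geolocation is left out because it needs an external IP database.

```python
from urllib.parse import urlparse

def derive_context(user_agent, referer):
    """Map raw request headers to coarse, anonymous context attributes."""
    # Very rough device detection; real user-agent parsing is far more involved.
    device = "mobile" if "Mobile" in user_agent else "desktop"
    host = urlparse(referer).netloc if referer else ""
    if not host:
        origin = "direct"
    elif "google." in host or "bing." in host:
        origin = "search"
    else:
        origin = "other"
    return {"device": device, "origin": origin}

print(derive_context("Mozilla/5.0 (iPhone) Mobile Safari", "https://www.google.com/search?q=news"))
```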

Page 17: Living Labs Challenge Workshop

What we built for ourselves

All goals have a context

Questions the context can answer

● Which channel should show this advertising?
● On which publishers do semantic recommendations tend to get clicked?
● Which geolocation is the right one for this content?

The algorithms give the answers to these questions.

Page 18: Living Labs Challenge Workshop

What we built for ourselves

All goals have a context

● answers change each second
● bayesian bandit approach

[Chart annotations: temporary success? No. 1 getting most; local minima?]
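One common Bayesian bandit is Thompson sampling with Beta posteriors; the sketch below assumes that variant, since the slide does not name the exact method. The class and arm names are hypothetical. Each recommender keeps click and impression counts, and the one with the highest sampled click rate serves the next request, so weaker recommenders still get occasional traffic.

```python
import random

class BetaBandit:
    """Thompson sampling over recommenders with Beta(clicks + 1, misses + 1) posteriors."""

    def __init__(self, arms, seed=None):
        self.stats = {arm: [0, 0] for arm in arms}  # arm -> [clicks, impressions]
        self.rng = random.Random(seed)

    def choose(self):
        """Sample a plausible click rate per arm and pick the arm with the highest sample."""
        def sample(arm):
            clicks, impressions = self.stats[arm]
            return self.rng.betavariate(clicks + 1, impressions - clicks + 1)
        return max(self.stats, key=sample)

    def update(self, arm, clicked):
        """Record one impression for the arm and, if it was clicked, one click."""
        self.stats[arm][1] += 1
        if clicked:
            self.stats[arm][0] += 1

bandit = BetaBandit(["collaborative filtering", "most popular", "text similarity"], seed=1)
arm = bandit.choose()
bandit.update(arm, clicked=True)
```

Because the choice is sampled rather than fixed, a recommender with a short run of bad luck is not locked out, which addresses the "temporary success" and "local minima" concerns on the slide.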

Page 19: Living Labs Challenge Workshop

✓ easy exploration

● minimum pre-testing
● no risk if recommender crashes
● "bad" code might find its context

Page 20: Living Labs Challenge Workshop

How we built it?

numbers in short
● 5k recs per second
● 250 Mbit contextual data
● 100 items per second

quite some scaling issues
● big data math
● message bus

Page 21: Living Labs Challenge Workshop

Technology Stack

Message Bus

Events
● new articles
● delivered
● clicks

Subscribers
● algorithms
● payment
● etc

[Diagram label: Visitor]
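The publish/subscribe idea behind the message bus can be sketched in-process as follows. This is a toy model under stated assumptions: the class, the event names and the payload shape are hypothetical, and a production bus would be a separate networked service.

```python
from collections import defaultdict

class MessageBus:
    """Minimal in-process publish/subscribe bus."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # event type -> list of handlers

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Every subscriber of this event type receives the same payload.
        for handler in self.subscribers[event_type]:
            handler(payload)

clicks_per_article = defaultdict(int)

def count_click(event):
    """Example subscriber: an algorithm that counts clicks per article."""
    clicks_per_article[event["article"]] += 1

bus = MessageBus()
bus.subscribe("click", count_click)
bus.publish("click", {"article": "article 1"})
```

Algorithms and other consumers such as payment subscribe independently, so adding a new recommender does not require changes on the publishing side.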

Page 22: Living Labs Challenge Workshop

How we built it?

Big Data Math

Article 1+1 10

Article 100 2+5

Art...

number of
● clicks
● orders
● engages
● time on site
● money

What math do we need?

Page 23: Living Labs Challenge Workshop

How we built it?

Big Data Math

● Addition can solve most formulas
● with Logarithm also multiplications
● Real-Time Ready
  ○ atomic
  ○ fast
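The logarithm remark rests on the identity log(a·b) = log(a) + log(b): a store that only supports atomic increments can accumulate a product by adding logarithms. A minimal sketch, with a hypothetical function name and valid for positive factors only:

```python
import math

def product_via_log_sum(factors):
    """Multiply positive numbers by summing their logarithms, then exponentiating."""
    return math.exp(sum(math.log(f) for f in factors))

print(product_via_log_sum([2, 5, 10]))  # approximately 100, up to floating-point error
```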

Page 24: Living Labs Challenge Workshop

How we built it?

Big Data Math

welt.de_201406
● new iphone su...: 500 +1
● twitter buys p..: 400
● google has seri...: 300

ZINCRBY (WRITE): "welt.de_201406" 1 "article 1"

ZUNION (JOIN): "welt.de_201406", "geolocation:NL_201406"

ZREVRANGEBYSCORE (FETCH)
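To make the three sorted-set operations concrete without a running Redis server, the sketch below emulates their semantics with plain dictionaries. The function names mirror the Redis commands ZINCRBY, ZUNIONSTORE and ZREVRANGEBYSCORE but are local stand-ins, and the counts are hypothetical.

```python
from collections import defaultdict

store = defaultdict(lambda: defaultdict(float))  # key -> member -> score

def zincrby(key, increment, member):
    """WRITE: add increment to the member's score, like Redis ZINCRBY."""
    store[key][member] += increment
    return store[key][member]

def zunionstore(dest, keys):
    """JOIN: sum the scores of all members across keys, like Redis ZUNIONSTORE."""
    merged = defaultdict(float)
    for key in keys:
        for member, score in store[key].items():
            merged[member] += score
    store[dest] = merged
    return len(merged)

def top_members(key, count):
    """FETCH: highest-scored members first, like ZREVRANGEBYSCORE key +inf -inf LIMIT 0 count."""
    ranked = sorted(store[key].items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:count]

zincrby("welt.de_201406", 500, "new iphone")
zincrby("welt.de_201406", 400, "twitter buys")
zincrby("welt.de_201406", 1, "new iphone")           # one more click
zincrby("geolocation:NL_201406", 200, "twitter buys")
zunionstore("tmp", ["welt.de_201406", "geolocation:NL_201406"])
print(top_members("tmp", 2))
```

Every write is a single addition, and combining a publisher counter with a context counter is just a union with summed scores, which matches the "addition can solve most formulas" point made earlier.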

Page 25: Living Labs Challenge Workshop

Application for Living Labs

● These are your visitors
● This is your data
● Assume this is open!
● This is your challenge

Page 26: Living Labs Challenge Workshop

Application for Living Labs

Your role in the ORP

● Message Bus provides YOU with data
● Real-Time Results are provided by YOU
● ORP master will choose YOU
● User will see YOUR results

[Diagram labels: plista, ORP master, YOU!]

Page 27: Living Labs Challenge Workshop

Application for Living Labs

YOU, a technology enthusiast

Try latest technologies
● Mahout implementation exists with Kornakapi
● what will be next? Oryx? MyMediaLite? LensKit? Predict.io?

we have strong open source connections

Page 28: Living Labs Challenge Workshop

Application for Living Labs

YOU, a researcher

● try if ideas work
● write papers
● we are at conferences!
  ○ SIGIR 2013
  ○ RecSys 2013
  ○ CLEF 2014
  ○ … 2015 ?

we have strong university cooperations

Page 29: Living Labs Challenge Workshop

Application for Living Labs

YOU, a partner

● plista earns money with recommendations on publishers
● help us -> we help you
● weekly contest with 250 € prizes

http://contest.plista.com (currently in maintenance)

Page 30: Living Labs Challenge Workshop

Application for Living Labs

YOU, a developer

● APIs in PHP and Java exist
● start your own using the API

Page 31: Living Labs Challenge Workshop

Application for Living Labs

YOU, a developer

Your server is probably hosted by a university, plista, or any cloud provider

Page 32: Living Labs Challenge Workshop

Application for Living Labs

YOU, a developer

"message bus"
● event notifications
  ○ impression
  ○ click
● error notifications
● item updates

train model from it

Page 33: Living Labs Challenge Workshop

Application for Living Labs

YOU, a developer

{ // json
  "type": "impression",
  "context": {
    "simple": {
      "27": 418, // publisher
      "14": 31721, // widget
      ...
    },
    "lists": {
      "10": [100, 101] // channel
    }
    ...
  }
}
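A participant's server might read such an impression message as sketched below. The attribute ids 27, 14 and 10 are the ones annotated on the slide; the sample message, the function name and the returned dictionary are hypothetical, and the authoritative format is the API spec at http://orp.plista.com.

```python
import json

# Hypothetical message modeled on the slide; comments and elisions removed so it is valid JSON.
raw = '{"type": "impression", "context": {"simple": {"27": 418, "14": 31721}, "lists": {"10": [100, 101]}}}'

# Attribute ids as annotated on the slide.
PUBLISHER, WIDGET, CHANNEL = "27", "14", "10"

def parse_impression(raw_message):
    """Extract the annotated context attributes from one message."""
    message = json.loads(raw_message)
    context = message.get("context", {})
    simple = context.get("simple", {})
    lists = context.get("lists", {})
    return {
        "type": message.get("type"),
        "publisher": simple.get(PUBLISHER),
        "widget": simple.get(WIDGET),
        "channels": lists.get(CHANNEL, []),
    }

print(parse_impression(raw))
```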

Page 34: Living Labs Challenge Workshop

Application for Living Labs

YOU, a developer

Your response shown to real users

{ // json
  "recs": {
    "int": {
      "3": [13010630, 84799192]
      // 3 refers to content recommendations
    }
    ...
  }
}

[Diagram: YOU, recs, API, Real User]

api specs hosted at http://orp.plista.com
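Building the response body shown on this slide could look like the following sketch. Only the `recs` / `int` / `"3"` nesting comes from the slide; the function name and the `limit` parameter are assumptions, and any further required fields would have to be taken from the API spec.

```python
import json

CONTENT_RECS = "3"  # per the slide, 3 refers to content recommendations

def build_response(item_ids, limit=None):
    """Serialize recommended item ids in the nesting shown on the slide."""
    ids = list(item_ids)
    if limit is not None:
        ids = ids[:limit]
    return json.dumps({"recs": {"int": {CONTENT_RECS: ids}}})

print(build_response([13010630, 84799192]))
```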

Page 35: Living Labs Challenge Workshop

Application for Living Labs

quality is win win

● user, publisher, advertiser, plista

YOU can profit
● real user feedback
● real benchmark with others

[Diagram: YOU, recs, Real User]

Page 36: Living Labs Challenge Workshop

Application for Living Labs

Overview

● 2012
  ○ Contest v1
● 2013 October
  ○ ACM RecSys “News Recommender Challenge”
● 2014 November
  ○ CLEF News Recommendation Evaluation Labs “newsreel”

Page 37: Living Labs Challenge Workshop

Application for Living Labs

Challenge Numbers :)

● during recsys’13:
  ○ 571,744,114 impressions delivered by researchers
  ○ 23 registrations => 11 active teams

● news articles of ~13 publishers

● contextual data with ~50 attributes

● cross domain application

Page 38: Living Labs Challenge Workshop

Application for Living Labs

Challenge Challenges :(

● what is the benchmark?
  ○ click per impression?
  ○ absolute number of clicks?
  ○ absolute number weighted by time range?
● integration in real application is challenging
  ○ starting from scratch?
  ○ having runtime environment?
● papers better match offline data
  ○ here I can compare against previous work
  ○ are we working for papers or for passion?
● real users = real privacy issues?
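The benchmark question matters because the candidate metrics can rank teams differently. In the sketch below the team names and numbers are invented purely for illustration: one team wins on clicks per impression, the other on absolute clicks.

```python
def click_through_rate(clicks, impressions):
    """Clicks per impression; 0.0 when nothing was delivered."""
    return clicks / impressions if impressions else 0.0

# Hypothetical (clicks, impressions) per team.
teams = {"team_a": (120, 10_000), "team_b": (300, 60_000)}

by_ctr = max(teams, key=lambda t: click_through_rate(*teams[t]))
by_clicks = max(teams, key=lambda t: teams[t][0])

print(by_ctr, by_clicks)
```

A team that receives more traffic can lead on absolute clicks while converting worse, which is one reason a time-range weighting is listed as a third option.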