mlconf nyc justin basilico

Learning a Personalized Homepage

Justin Basilico

Page Algorithms Engineering

April 11, 2014

NYC 2014

@JustinBasilico

Change of focus

2006 2014

Netflix Scale

> 44M members

> 40 countries

> 1000 device types

> 5B hours in Q3 2013

Plays: > 30M/day

Log 100B events/day

31.62% of peak US downstream traffic

Approach to Recommendation

“Emmy Winning”

Goal

Help members find content to watch and enjoy to maximize member satisfaction and retention

Everything is a Recommendation

Rows

Ranking

Over 75% of what people watch comes from our recommendations

Recommendations are driven by Machine Learning

Top Picks

Personalization awareness

Diversity

Personalized genres

Genres focused on user interest

Derived from tag combinations

Provide context and evidence

How are they generated?

Implicit: Based on recent plays, ratings & other interactions

Explicit: Taste preferences

Hybrid: combine the above

Similarity

Find something similar to something you’ve liked

Because you watched rows

Also:

Video display page

In response to user actions (search, list add, …)

Support for Recommendations

Social Support

Learning to Recommend

Machine Learning Approach

Problem

Data

Model

Algorithm

Metrics

Data

Plays: Duration, bookmark, time, device, …

Ratings

Metadata: Tags, synopsis, cast, …

Impressions

Interactions: Search, list add, scroll, …

Social

Models & Algorithms

Regression (Linear, logistic, elastic net)

SVD and other Matrix Factorizations

Factorization Machines

Restricted Boltzmann Machines

Deep Neural Networks

Markov Models and Graph Algorithms

Clustering

Latent Dirichlet Allocation

Gradient Boosted Decision Trees/Random Forests

Gaussian Processes

…

Offline/Online testing process

Offline testing (days) → [success] → Online A/B testing (weeks to months) → [success] → Roll out feature to all users

[fail]

Rating Prediction

First progress prize

Top 2 algorithms

Matrix Factorization (SVD++)

Restricted Boltzmann Machines (RBM)

Ensemble: Linear blend

[Figure: R (Users × Videos, 99% sparse) ≈ U (Users × d) × V (d × Videos)]
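
A minimal sketch of the factorization R ≈ U V on this slide, fitted by stochastic gradient descent on the observed ratings only. This is plain regularized matrix factorization, not SVD++ and not the prize-winning blend; the toy ratings, the latent dimension d, and all hyperparameters are assumptions chosen for illustration.

```python
import random

def factorize(ratings, n_users, n_videos, d=2, lr=0.05, reg=0.02,
              epochs=500, seed=0):
    """Fit R ≈ U·V from observed (user, video, rating) triples using SGD."""
    rng = random.Random(seed)
    # Small random start so the factors can grow in a consistent direction.
    U = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_videos)]
    for _ in range(epochs):
        for u, v, r in ratings:
            err = r - sum(U[u][k] * V[v][k] for k in range(d))
            for k in range(d):
                uk, vk = U[u][k], V[v][k]
                # Gradient step on squared error with L2 regularization.
                U[u][k] += lr * (err * vk - reg * uk)
                V[v][k] += lr * (err * uk - reg * vk)
    return U, V

def predict(U, V, u, v):
    """Predicted rating is the dot product of user and video factors."""
    return sum(a * b for a, b in zip(U[u], V[v]))

def rmse(ratings, U, V):
    """Root mean squared error over a set of (user, video, rating) triples."""
    sq = sum((r - predict(U, V, u, v)) ** 2 for u, v, r in ratings)
    return (sq / len(ratings)) ** 0.5

# Hypothetical toy data: 3 users, 3 videos, 6 of 9 cells observed.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0)]
U, V = factorize(ratings, n_users=3, n_videos=3)
train_rmse = rmse(ratings, U, V)
```

Only the observed cells contribute to the loss, which is what makes the approach workable when the matrix is 99% sparse. The unobserved cells are then filled in with `predict`.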

Ranking by ratings

4.7 4.6 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5

Niche titles

High average ratings… by those who would watch it

RMSE

Learning to Rank Approaches:

Point-wise: Loss over items (Classification, ordinal regression, MF, …)

Pair-wise: Loss over preferences (RankSVM, RankNet, BPR, …)

List-wise: (Smoothed) loss over ranking (LambdaMART, DirectRank, GAPfm, …)

Ranking quality measures: NDCG, MRR, ERR, MAP, FCP, Precision@N, Recall@N, …

[Plot: Importance (0 to 1) vs. Rank (0 to 100) for NDCG, MRR, FCP]
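
To make a few of the ranking quality measures named on this slide concrete, here is a small sketch of NDCG@k, reciprocal rank (the per-user term that MRR averages), and Precision@N. It assumes binary or graded relevance labels listed in ranked order and uses the linear-gain form of DCG; other gain functions are also common. The example list is hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain: each gain is discounted by log2(rank + 1)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def reciprocal_rank(relevances):
    """1 / rank of the first relevant item, or 0 if there is none."""
    for i, rel in enumerate(relevances):
        if rel > 0:
            return 1.0 / (i + 1)
    return 0.0

def precision_at_n(relevances, n):
    """Fraction of the top n positions that hold a relevant item."""
    return sum(1 for rel in relevances[:n] if rel > 0) / n

# Hypothetical ranked list: relevant items sit at positions 2 and 4.
ranked = [0, 1, 0, 1, 0]
ndcg = ndcg_at_k(ranked, 5)
rr = reciprocal_rank(ranked)
p_at_3 = precision_at_n(ranked, 3)
```

All three measures weight the top of the list most heavily, but they decay at different rates as rank increases.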

Example: Two features, linear model

[Plot: Predicted Rating vs. Popularity, points labeled 1 to 5, Final Ranking]

Linear Model: f_rank(u,v) = w1 p(v) + w2 r(u,v)
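
The two-feature linear model on this slide can be sketched as follows. The candidate videos, their feature values, and the weights w1 and w2 are all hypothetical; in practice the weights would be learned rather than set by hand.

```python
def f_rank(popularity, predicted_rating, w1, w2):
    """Linear ranking score: w1 * p(v) + w2 * r(u, v)."""
    return w1 * popularity + w2 * predicted_rating

# Hypothetical candidates for one user u: video -> (p(v), r(u, v)).
candidates = {
    "A": (0.9, 3.1),
    "B": (0.2, 4.7),
    "C": (0.6, 4.0),
    "D": (0.1, 2.5),
}

w1, w2 = 2.0, 1.0  # assumed weights for illustration only

# Final ranking: sort by the blended score, highest first.
final_ranking = sorted(
    candidates, key=lambda v: f_rank(*candidates[v], w1, w2), reverse=True
)

# For comparison: ranking by predicted rating alone.
rating_only = sorted(candidates, key=lambda v: candidates[v][1], reverse=True)
```

With these numbers the niche title B leads when ranking by predicted rating alone, while the blended score places the more popular C first.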

Ranking

Putting it together

“Learning to Row”

Page-level algorithmic challenge

10,000s of possible rows …

10-40 rows

Variable number of possible videos per row (up to thousands)

1 personalized page per device

Balancing a Personalized Page

Accurate vs. Diverse

Discovery vs. Continuation

Depth vs. Coverage

Freshness vs. Stability

Recommendations vs. Tasks

2D Navigational Modeling

More likely to see

Less likely

Row Lifecycle

Select Candidates

Select Evidence

Rank

Filter

Format

Choose

Building a page algorithmically

Approaches

Template: Non-personalized layout

Row-independent: Greedy rank rows by f(r | u, c)

Stage-wise: Pick next rows by f(r | u, c, p1:n)

Page-wise: Total page fitness f(p | u, c)

Obey constraints per device

Certain rows may be required

Examples: Continue watching and My List
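
The difference between the row-independent and stage-wise approaches can be sketched with one greedy loop and two scoring functions. Everything here is an assumption for illustration: the row names, relevance values, genre labels, and the 0.3 redundancy penalty are hypothetical, and the user u and context c are treated as fixed and folded into the relevance table.

```python
def build_page(candidates, required, n_rows, score):
    """Greedily fill a page: required rows first, then best-scoring rows."""
    page = list(required)
    remaining = [r for r in candidates if r not in page]
    while remaining and len(page) < n_rows:
        # score(r, page) plays the role of f(r | u, c, p1:n).
        best = max(remaining, key=lambda r: score(r, page))
        page.append(best)
        remaining.remove(best)
    return page

# Hypothetical per-row relevance for one user and context.
relevance = {"Top Picks": 0.9, "Comedies": 0.8,
             "Dark Comedies": 0.78, "Documentaries": 0.6}
genre = {"Top Picks": "mixed", "Comedies": "comedy",
         "Dark Comedies": "comedy", "Documentaries": "documentary",
         "Continue Watching": "task"}

def independent_score(row, page):
    """Row-independent: ignores what is already on the page."""
    return relevance[row]

def stagewise_score(row, page):
    """Stage-wise: penalize rows that repeat a genre already on the page."""
    overlap = sum(1 for p in page if genre[p] == genre[row])
    return relevance[row] - 0.3 * overlap

required = ["Continue Watching"]  # a required row, per the constraint above
independent_page = build_page(list(relevance), required, 4, independent_score)
stagewise_page = build_page(list(relevance), required, 4, stagewise_score)
```

In this toy case the row-independent page shows two comedy rows, while the stage-wise page swaps the second one for Documentaries because the score depends on the rows already chosen.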

Row Features

Quality of items

Features of items

Quality of evidence

User-row interactions

Item/row metadata

Recency

Item-row affinity

Row length

Position on page

Context

Title

Diversity

Freshness

…

Page-level Metrics

How do you measure the quality of the homepage?

Ease of discovery

Diversity

Novelty

…

Challenges:

Position effects

Row-video generalization

2D versions of ranking quality metrics

Example: Recall @ row-by-column

[Plot: Recall vs. Row, row axis 0 to 35]
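
One plausible reading of Recall @ row-by-column is sketched below: the fraction of a member's held-out plays that appear inside the top-left window of n_rows by n_cols on the page. The slide does not give a definition, so this formula, the function name, and the toy page are assumptions.

```python
def recall_at_row_by_col(page, watched, n_rows, n_cols):
    """Share of watched videos visible in the first n_rows x n_cols cells."""
    visible = {video for row in page[:n_rows] for video in row[:n_cols]}
    return len(visible & watched) / len(watched) if watched else 0.0

# Hypothetical page: each inner list is one row of video ids, left to right.
page = [
    ["a", "b", "c", "d"],
    ["e", "f", "g", "h"],
    ["i", "j", "k", "l"],
]
# Hypothetical held-out plays; "z" never appears on the page.
watched = {"b", "g", "l", "z"}

r_2x2 = recall_at_row_by_col(page, watched, 2, 2)
r_2x3 = recall_at_row_by_col(page, watched, 2, 3)
r_3x4 = recall_at_row_by_col(page, watched, 3, 4)
```

Sweeping n_rows with n_cols fixed gives a recall-versus-row curve, the 2D analogue of Recall@N for a flat list.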

Conclusions

Evolution of Recommendation Approach

Rating → Ranking → Page Generation

Research Directions

Context awareness

Full-page optimization

Presentation effects

Social recommendation

Personalized learning to rank

Cold start

Thank You

Justin Basilico

[email protected]

@JustinBasilico

We’re hiring
