mlconf nyc justin basilico
DESCRIPTION
TRANSCRIPT
11
Learning a Personalized HomepageJustin BasilicoPage Algorithms Engineering April 11, 2014
NYC 2014
@JustinBasilico
2
3
Change of focus
2006 2014
4
Netflix Scale > 44M members
> 40 countries
> 1000 device types
> 5B hours in Q3 2013
Plays: > 30M/day
Log 100B events/day
31.62% of peak US downstream traffic
55
Approach to Recommendation“Emmy Winning”
6
Goal
Help members find content to watch and enjoy to maximize member satisfaction and retention
7
Everything is a Recommendation
Row
s
Ranking
Over 75% of what people watch comes from our recommendations
Recommendations are driven by Machine Learning
8
Top Picks
Personalization awareness
Diversity
9
Personalized genres Genres focused on user interest
Derived from tag combinations
Provide context and evidence
How are they generated?
Implicit: Based on recent plays, ratings & other interactions
Explicit: Taste preferences
Hybrid: combine the above
10
Similarity Find something similar to
something you’ve liked
Because you watched rows
Also Video display page In response to user actions
(search, list add, …)
11
Support for Recommendations
Social Support
1212
Learning to Recommend
13
Machine Learning Approach
Problem
Data
ModelAlgorithm
Metrics
14
Data Plays
Duration, bookmark, time, device, …
Ratings
Metadata Tags, synopsis, cast, …
Impressions
Interactions Search, list add, scroll, …
Social
15
Models & Algorithms Regression (Linear, logistic, elastic net)
SVD and other Matrix Factorizations
Factorization Machines
Restricted Boltzmann Machines
Deep Neural Networks
Markov Models and Graph Algorithms
Clustering
Latent Dirichlet Allocation
Gradient Boosted Decision Trees/Random Forests
Gaussian Processes
…
16
Offline/Online testing process
Rollout Feature to all users
Offline testing
Online A/Btesting[success] [success]
[fail]
days Weeks to months
17
Rating Prediction First progress prize
Top 2 algorithms
Matrix Factorization (SVD++)
Restricted Boltzmann Machines (RBM)
Ensemble: Linear blend
R
Videos
≈
Use
rs U
V
(99% Sparse) d
Videos
Use
rs
d×
18
Ranking by ratings
4.7 4.6 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5
Niche titlesHigh average ratings… by those who would watch it
19
RMSE
20
Learning to Rank Approaches:
Point-wise: Loss over items (Classification, ordinal regression, MF, …)
Pair-wise: Loss over preferences (RankSVM, RankNet, BPR, …)
List-wise: (Smoothed) loss over ranking (LambdaMART, DirectRank, GAPfm, …)
Ranking quality measures: NDCG, MRR, ERR, MAP, FCP,
Precision@N, Recall@N, …0 10 20 30 40 50 60 70 80 90 100
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
NDCG MRR FCP
Rank
Impo
rtan
ce
21
Example: Two features, linear model
Popularity
Pred
icte
d Ra
ting
1
23
4
5
Linear Model:frank(u,v) = w1 p(v) + w2 r(u,v)
Final Ranking
22
Ranking
2323
Putting it together“Learning to Row”
24
Page-level algorithmic challenge
10,000s of possible
rows …
10-40 rows
Variable number of possible videos per
row (up to thousands)
1 personalized page
per device
25
Balancing a Personalized Page
vs.Accurate Diverse
vs.Discovery Continuation
vs.Depth Coverage
vs.Freshness Stability
vs.Recommendations Tasks
26
2D Navigational Modeling
More likely to see
Less likely
27
Row Lifecycle
Select Candidates
Select Evidence
Rank
Filter
Format
Choose
28
Building a page algorithmically Approaches
Template: Non-personalized layout Row-independent: Greedy rank rows by f(r | u, c) Stage-wise: Pick next rows by f(r | u, c, p1:n)
Page-wise: Total page fitness f(p | u, c)
Obey constraints per device Certain rows may be required Examples: Continue watching and My List
29
Row Features Quality of items Features of items Quality of evidence User-row interactions Item/row metadata Recency Item-row affinity Row length Position on page Context Title Diversity Freshness …
30
Page-level Metrics How do you measure the quality of
the homepage? Ease of discovery Diversity Novelty …
Challenges: Position effects Row-video generalization
2D versions of ranking quality metrics Example: Recall @ row-by-column
0 5 10 15 20 25 30 35
RowRe
call
3131
Conclusions
32
Evolution of Recommendation Approach
Rating Ranking Page Generation
4.7
33
Research Directions
Context awareness
Full-page optimization
Presentation effects
Social recommendation
Personalized learning to rank Cold start