![Page 1: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/1.jpg)
Building the Next New York Times Recommendation Engine
By Alexander Spangher
![Page 2: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/2.jpg)
Problem Statement:
1. The New York Times publishes over 300 articles, blog posts and interactive stories a day.
Corpus:
n articles that are still relevant over the past x days
![Page 3: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/3.jpg)
For each user:
1 2 3 4 ...
30-day reading history
![Page 4: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/4.jpg)
![Page 5: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/5.jpg)
Machine Learning
“All of machine learning can be broken down into regression and matrix factorization.”
-A drunk PhD student at a bar
1. Regression: f(input) = output
2. Factorization: f(output) = output
-Yann LeCun, 2015
![Page 6: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/6.jpg)
Problem Statement (Refined)
1. Define pool of articles.
Not all articles expire at the same rate
2. Rank order articles based on reading history of user.
Assume that reader’s future preferences will match past preferences
![Page 7: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/7.jpg)
Defining the Pool of Articles
![Page 8: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/8.jpg)
Defining Relevancy
![Page 9: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/9.jpg)
Exponential Distribution
![Page 10: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/10.jpg)
Evergreen Model
Section, Desk, Word Count...
clicks per day
1. Learn training metric
2. Learn relationship between features and metric
3. Convert to interpretable expiration date
![Page 11: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/11.jpg)
Fit a rate parameter λ_i to each item i in the training set
Fit: λ_i
![Page 12: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/12.jpg)
Likelihood function:
Maximum Likelihood Estimate (MLE)
likelihood of data and parameters
joint pdf of data given parameter
product of independent pdfs
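The three labels above annotate an equation that did not survive the slide export. A standard way to write it, assuming independent observations $x_1, \dots, x_n$ with density $f(\cdot \mid \theta)$ (the notation here is an assumption, not taken from the slide):

```latex
\mathcal{L}(\theta; x_1, \dots, x_n)
  = f(x_1, \dots, x_n \mid \theta)
  = \prod_{i=1}^{n} f(x_i \mid \theta),
\qquad
\hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta} \mathcal{L}(\theta; x_1, \dots, x_n)
```

Left to right, the terms are the likelihood of data and parameters, the joint pdf of the data given the parameter, and the product of independent pdfs.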
![Page 13: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/13.jpg)
Maximum Likelihood Estimate
Given timestamp of every click:
![Page 14: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/14.jpg)
Maximum Likelihood Estimate
???
![Page 15: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/15.jpg)
Maximum Log Likelihood Estimate
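The derivation on this slide was an image. A sketch of the usual argument, assuming the delays $t_1, \dots, t_n$ between publication and each click are independent draws from an exponential distribution with rate $\lambda$ (whether the talk parametrizes by rate or by scale is not recoverable from the export):

```latex
f(t \mid \lambda) = \lambda e^{-\lambda t}
\;\Longrightarrow\;
\log \mathcal{L}(\lambda) = \sum_{i=1}^{n} \log\!\left(\lambda e^{-\lambda t_i}\right)
  = n \log \lambda - \lambda \sum_{i=1}^{n} t_i

\frac{d}{d\lambda} \log \mathcal{L}(\lambda) = \frac{n}{\lambda} - \sum_{i=1}^{n} t_i = 0
\;\Longrightarrow\;
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i}
```

Taking the log turns the product into a sum, which is why the maximization becomes easy: the estimate is one over the mean delay.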
![Page 16: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/16.jpg)
Or, use an optimization package:
Python: http://cvxopt.org/
Convex Optimization by Stephen Boyd and Lieven Vandenberghe
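A minimal sketch of the numerical route, under stated assumptions: it uses a hand-written golden-section search from the standard library rather than cvxopt, the click delays are simulated, and the names `click_delays` and `true_rate` are made up for illustration. It checks that the numerical optimum agrees with the closed-form exponential estimate n / Σt.

```python
import math
import random


def neg_log_likelihood(rate, delays):
    """Negative exponential log-likelihood: -(n*log(rate) - rate*sum(delays))."""
    return -(len(delays) * math.log(rate) - rate * sum(delays))


def golden_section_minimize(f, lo, hi, tol=1e-9):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2


# Simulated data: delays between publication and each click (hypothetical).
random.seed(0)
true_rate = 0.5
click_delays = [random.expovariate(true_rate) for _ in range(5000)]

numeric_rate = golden_section_minimize(
    lambda rate: neg_log_likelihood(rate, click_delays), 1e-6, 10.0
)
closed_form_rate = len(click_delays) / sum(click_delays)
```

The negative log-likelihood is convex in the rate, so a one-dimensional search like this is enough here; a general-purpose convex solver matters more once the model has many parameters or constraints.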
![Page 17: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/17.jpg)
Learn relationship between article features and the training metric
x = [desk, word count, section, ...]
y = training metric fitted for each article
General linear model:
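The model equation on this slide was an image, so the following is only a sketch under assumptions: a single numeric feature (word count), ordinary least squares on the log of each article's fitted exponential rate, and an "expiration date" defined as the time by which a chosen fraction of an article's clicks is expected to have arrived. The data values and the helper names `fit_simple_linear` and `expiration_days` are hypothetical.

```python
import math


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def expiration_days(rate, fraction=0.95):
    """Days until `fraction` of clicks have arrived under Exponential(rate)."""
    return -math.log(1.0 - fraction) / rate


# Made-up training set: one fitted rate (per day) per article.
word_counts = [400, 800, 1200, 2000, 3500]
fitted_rates = [1.2, 0.9, 0.7, 0.45, 0.2]

# Step 2: regress log(rate) on the feature so predicted rates stay positive.
intercept, slope = fit_simple_linear(
    word_counts, [math.log(r) for r in fitted_rates]
)

# Step 3: predict a rate for a new 1500-word article and convert it to days.
predicted_rate = math.exp(intercept + slope * 1500)
predicted_expiry = expiration_days(predicted_rate)
```

Categorical features such as desk and section would need an encoding (for example one-hot columns) and a multivariate fit; the single-feature version is kept here only to show the three steps end to end.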
![Page 18: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/18.jpg)
![Page 19: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/19.jpg)
Performance
![Page 20: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/20.jpg)
Building the Recommender
(http://open.blogs.nytimes.com/2015/08/11/building-the-next-new-york-times-recommendation-engine/)
![Page 21: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/21.jpg)
First Iteration
Keyword-Based model: TF-IDF Vector
N = number of times word appears in document
D = number of documents that word appears in
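The formula itself was an image. A common form, assumed here, weights each word by N · log(total documents / D). The sketch below builds such vectors for a toy corpus; the tokenizer (lowercase, split on whitespace) and the sample documents are illustrative choices, not the production pipeline.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with weight N * log(total_docs / D)."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    # D: number of documents each word appears in.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    total_docs = len(tokenized)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # N: occurrences of each word in this document
        vectors.append(
            [counts[word] * math.log(total_docs / doc_freq[word])
             for word in vocabulary]
        )
    return vocabulary, vectors


docs = ["cat cat dog", "dog scholar", "nice dog"]
vocabulary, vectors = tfidf_vectors(docs)
```

In this toy corpus "dog" appears in every document, so its weight is zero everywhere, while "cat" is frequent in one document and absent from the others, so it dominates that document's vector.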
![Page 22: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/22.jpg)
First Iteration
Keyword-Based model: TF-IDF Vector
[ 0.02, 0.5, 0, 0, … , .01 ]
[ 0.9, 0.01, 0.2, … , .05 ]
fun cat dog scholar nice
![Page 23: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/23.jpg)
Feedback:
“Recommendations work for me
I have been following the Oscar Pistorius case for over a year now and every time there has been a relevant story about the case, I have been recommended that story.
Recommendations seem to be working very well for me.”
![Page 24: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/24.jpg)
Feedback:
“No More Brooks recommendations, please
Your constant pushing of David Brooks onto me is like an annoying grandmother who won't believe you are really allergic to peanuts even though you regularly go into anaphylactic shock at her dinner table and need to be rushed to the hospital. What can I say… you're killing me. Please stop it.
...
Thanks for your attention to this matter.”
![Page 25: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/25.jpg)
Feedback:
“Dear NY Times,
You seem to have missed the fact that, while I do read the Weddings section, I only (or almost only) read about the weddings of same sex couples.
Please stop recommending heterosexual weddings articles to me!!”
![Page 26: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/26.jpg)
Second Iteration:
LDA-Based model: Topic Vector
[ 0.02, 0.5, 0, 0, … , .01 ]
[ 0.9, 0.01, 0.2, … , .05 ]
1 2 3 4 … k
![Page 27: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/27.jpg)
Example topic
[Figure: probability weight by word: cat, yarn, tree, building, car, money, bank, paw, toy, newspaper, Spotify]
![Page 28: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/28.jpg)
Example topic
[Figure: probability weight by word: cat, yarn, tree, building, car, money, bank, paw, toy, newspaper, Spotify]
![Page 29: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/29.jpg)
LDA
![Page 30: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/30.jpg)
David Blei (2003)
![Page 31: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/31.jpg)
Topic Space
![Page 32: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/32.jpg)
How do we learn these parameters?
LDA Definition:
Choose θ ~ Dirichlet(α)
For each word in document:
Choose word topic ~ Mult(θ)
Choose word from the chosen topic's word distribution
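The generative story above can be sketched in a few lines. This is an illustration only: the two topics, their word probabilities, and the α values are invented, and the Dirichlet draw uses the standard trick of normalizing independent Gamma samples.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing Gamma(alpha_k, 1) samples."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, topics, n_words, rng):
    """Generate one document following the LDA generative process."""
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    assignments, words = [], []
    for _ in range(n_words):
        # Choose the word's topic from Mult(theta).
        z = rng.choices(range(len(theta)), weights=theta, k=1)[0]
        # Choose the word from that topic's word distribution.
        vocab = list(topics[z].keys())
        w = rng.choices(vocab, weights=list(topics[z].values()), k=1)[0]
        assignments.append(z)
        words.append(w)
    return theta, assignments, words


# Hypothetical topics: each maps words to probabilities.
topics = [
    {"cat": 0.4, "yarn": 0.3, "paw": 0.3},
    {"money": 0.4, "bank": 0.4, "car": 0.2},
]
rng = random.Random(0)
theta, assignments, words = generate_document([0.5, 0.5], topics, 20, rng)
```

Inference runs this story in reverse: given only the words, recover the topic mixture θ for each document and the word distribution for each topic.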
![Page 33: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/33.jpg)
Variational Inference
Image borrowed from David Blei (2003)
![Page 34: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/34.jpg)
Variational Inference (cont.)
![Page 35: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/35.jpg)
Variational Inference (cont.)
1. (E-Step):
2. (M-Step):
tractable!!!
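The update equations on this slide were images. For reference, the standard mean-field updates for LDA from Blei, Ng and Jordan (2003) are given below; the indexing convention (document $d$, word position $n$, topic $k$, vocabulary term $v$) is chosen here and may differ from the slide's.

```latex
% E-step: update the per-document variational parameters
\phi_{dnk} \;\propto\; \beta_{k, w_{dn}} \exp\!\big(\Psi(\gamma_{dk})\big),
\qquad
\gamma_{dk} \;=\; \alpha_k + \sum_{n=1}^{N_d} \phi_{dnk}

% M-step: update the topic-word distributions
\beta_{kv} \;\propto\; \sum_{d=1}^{D} \sum_{n=1}^{N_d} \phi_{dnk}\, \mathbb{1}[w_{dn} = v]
```

Here $\Psi$ is the digamma function, $\gamma_d$ parametrizes the variational Dirichlet over $\theta_d$, and $\phi_{dn}$ parametrizes the variational multinomial over the topic of word $n$. Each update is available in closed form, which is what makes the procedure tractable.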
![Page 36: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/36.jpg)
Collaborative Topic Modeling (CTM)
Image borrowed from David Blei (2011)
The graphical model for the CTM model we use.
![Page 37: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/37.jpg)
Scaling the algorithm
The training procedure is batch. Do we have time to scale it to all our users in real time?
![Page 38: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/38.jpg)
Strategy:
1. Iterate until some variables don’t change (article-topics).
2. Scale out, fixing non-changing variables. Update equation for one variable becomes a closed-form equation.
![Page 39: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/39.jpg)
Algorithm
1. Batch train on training set of users
2. Fix the non-changing variables (article-topics) and scale out to all users
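Once the article vectors are held fixed, each user's vector can be solved independently, which is what makes scaling out possible. The sketch below shows one closed-form update of this kind under a simplifying assumption: plain ridge-regularized least squares, u = (VᵀV + λI)⁻¹ Vᵀr, with uniform confidence on every read/not-read observation. The update actually used in the talk is not recoverable from the slides, and the vectors, `reads`, and `reg` values are made up.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def update_user_vector(article_vectors, reads, reg=0.1):
    """Closed-form ridge update for one user with article vectors fixed."""
    k = len(article_vectors[0])
    gram = [
        [sum(v[a] * v[b] for v in article_vectors) + (reg if a == b else 0.0)
         for b in range(k)]
        for a in range(k)
    ]
    rhs = [sum(v[a] * r for v, r in zip(article_vectors, reads))
           for a in range(k)]
    return solve_linear_system(gram, rhs)


# Hypothetical 2-topic article vectors; this user read the first two articles.
article_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
reads = [1, 1, 0, 0]
user_vector = update_user_vector(article_vectors, reads)
```

Because no user's update depends on any other user's, the loop over users can be distributed across machines or run on demand when a user's reading history changes.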
![Page 40: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/40.jpg)
Derive scores for users
As seen in:
http://benanne.github.io/2014/08/05/spotify-cnns.html
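With users and articles in the same vector space, one common way to derive a score is the dot product between the two vectors, then sorting the pool by that score. The slide's own scoring formula was an image, so treat this as an assumed stand-in; the article ids and vectors are invented.

```python
def rank_articles(user_vector, article_vectors):
    """Rank article ids by dot-product score against the user vector."""
    scores = {
        article_id: sum(u * v for u, v in zip(user_vector, vector))
        for article_id, vector in article_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 2-topic vectors for one user and three articles in the pool.
user = [0.8, 0.2]
pool = {
    "politics": [0.9, 0.1],
    "sports": [0.1, 0.9],
    "mixed": [0.5, 0.5],
}
ranking = rank_articles(user, pool)
```

The same function works regardless of how the vectors were produced, as long as users and articles share dimensions.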
![Page 41: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/41.jpg)
C parameter: the back-off average
![Page 42: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/42.jpg)
Any vector-based algorithm.
1) Deep Network (Spotify’s audio-CNN)
2) Shallow Network (Doc2Vec)
3) Topic Model
4) pLSA
![Page 43: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/43.jpg)
In conclusion
Modeling is fun!
All models are bad, but some can be useful!
Improve by recognizing shortfalls.
Evaluate on KPIs, on customer feedback, on design decisions.
![Page 44: DataEngConf: Building the Next New York Times Recommendation Engine](https://reader035.vdocument.in/reader035/viewer/2022062902/58edb6d21a28abaa308b45f5/html5/thumbnails/44.jpg)
not functional
sub-optimal
flat-lining/degrading