![Page 1: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/1.jpg)
Recommender systems and the Netflix prize
Charles Elkan
January 14, 2011
Solving the World's Problems Creatively
![Page 2: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/2.jpg)
Recommender systems
We Know What You Ought To Be Watching This Summer
![Page 3: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/3.jpg)
![Page 4: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/4.jpg)
“We’re quite curious, really. To the tune of one million dollars.” – Netflix Prize rules
1. Goal: Improve the Netflix recommendation algorithm, Cinematch
2. Criterion: Reduction in error (RMSE)
3. Oct ’06: Contest start
4. Oct ’07: $50K progress prize for 8.43% improvement
5. Oct ’08: $50K progress prize for 9.44% improvement
6. Sept ’09: $1 million grand prize for 10.06% improvement
![Page 5: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/5.jpg)
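Movie rating data

Training data

| score | movie | user |
|---|---|---|
| 1 | 21 | 1 |
| 5 | 213 | 1 |
| 4 | 345 | 2 |
| 4 | 123 | 2 |
| 3 | 768 | 2 |
| 5 | 76 | 3 |
| 4 | 45 | 4 |
| 1 | 568 | 5 |
| 2 | 342 | 5 |
| 2 | 234 | 5 |
| 5 | 76 | 6 |
| 4 | 56 | 6 |

Test data

| score | movie | user |
|---|---|---|
| ? | 62 | 1 |
| ? | 96 | 1 |
| ? | 7 | 2 |
| ? | 3 | 2 |
| ? | 47 | 3 |
| ? | 15 | 3 |
| ? | 41 | 4 |
| ? | 28 | 4 |
| ? | 93 | 5 |
| ? | 74 | 5 |
| ? | 69 | 6 |
| ? | 83 | 6 |

• Training data
– 100 million ratings
– 480,000 users
– 17,770 movies
– 6 years of data: 2000-2005
• Test data
– Last few ratings from each user (2.8 million)
• Dates of ratings are given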
![Page 6: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/6.jpg)
What is RMSE?
• RMSE stands for “root mean squared error.”
• Let p be the prediction for user u and movie m; let r be the true rating.
• $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum (p - r)^2}$, where the sum runs over all n (user, movie) pairs being scored.
• RMSE measures the average mistake, with higher penalty for big mistakes, i.e. large values of (p-r)².
• You can’t have a contest without a precise goal!
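The definition above can be sketched in a few lines of Python. This is a minimal illustration, not contest code; the function name `rmse` and the sample numbers are made up for the example.

```python
import math

def rmse(predictions, ratings):
    """Root mean squared error between predicted and true ratings."""
    if len(predictions) != len(ratings) or not ratings:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predictions, ratings)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions and true ratings, for illustration only.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 3]
print(rmse(predicted, actual))

# Same total absolute error, but one big mistake is penalized more
# than two small ones.
print(rmse([4, 4], [3, 3]))  # two errors of size 1
print(rmse([5, 3], [3, 3]))  # one error of size 2
```

The last two calls show the "higher penalty for big mistakes" point: both cases have a total absolute error of 2, yet the single large miss gives the larger RMSE.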
![Page 7: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/7.jpg)
#ratings per user
1. Avg #ratings/user: 208
![Page 8: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/8.jpg)
Most Active Users
| User ID | # Ratings | Mean Rating |
|---|---|---|
| 305344 | 17,651 | 1.90 |
| 387418 | 17,432 | 1.81 |
| 2439493 | 16,560 | 1.22 |
| 1664010 | 15,811 | 4.26 |
| 2118461 | 14,829 | 4.08 |
| 1461435 | 9,820 | 1.37 |
| 1639792 | 9,764 | 1.33 |
| 1314869 | 9,739 | 2.95 |
The dataset contains 17,770 movies!
![Page 9: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/9.jpg)
#ratings per movie
1. Avg #ratings/movie: 5627
![Page 10: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/10.jpg)
Movies Rated Most Often
| Title | # Ratings | Mean Rating |
|---|---|---|
| Miss Congeniality | 227,715 | 3.36 |
| Independence Day | 216,233 | 3.72 |
| The Patriot | 200,490 | 3.78 |
| The Day After Tomorrow | 194,695 | 3.44 |
| Pretty Woman | 190,320 | 3.90 |
| Pirates of the Caribbean | 188,849 | 4.15 |
| The Green Mile | 180,883 | 4.31 |
| Forrest Gump | 180,736 | 4.30 |
![Page 11: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/11.jpg)
Important RMSE levels
Scale, from erroneous to accurate:

• Global average: 1.1296
• User average: 1.0651
• Movie average: 1.0533
• Cinematch system: 0.9514
• Prize’07 (BellKor): 0.8712
• Prize’08 (BellKor+BigChaos): 0.8616
• Grand Prize (BellKor’s Pragmatic Chaos): 0.8554
• Inherent noise: ????

Annotation on the scale: Personalization
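The improvement percentages quoted for the prizes can be checked against these RMSE levels. The sketch below assumes that "improvement" means the relative reduction in RMSE compared with Cinematch; the function name is hypothetical.

```python
CINEMATCH_RMSE = 0.9514

def improvement_percent(rmse, baseline=CINEMATCH_RMSE):
    """Relative RMSE reduction over the baseline, in percent."""
    return 100.0 * (baseline - rmse) / baseline

prize_levels = {
    "Prize'07 (BellKor)": 0.8712,
    "Prize'08 (BellKor+BigChaos)": 0.8616,
    "Grand Prize (BellKor's Pragmatic Chaos)": 0.8554,
}

for name, value in prize_levels.items():
    print(f"{name}: {improvement_percent(value):.2f}%")
```

The first two values come out at 8.43% and 9.44%, matching the progress prizes. The third comes out near 10.09%, slightly above the 10.06% quoted for the grand prize; the slides do not say which held-out subset each figure was measured on, which may account for the gap.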
![Page 12: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/12.jpg)
Major Challenges

1. Size of data
– Need memory management and efficiency of algorithms
2. Training and test data are different
– Test ratings are later in time
3. 99% of data are missing
– Eliminates many standard methods
4. Countless factors affect ratings:
– Genre, movie vs. TV vs. other
– Style of action, dialogue, plot, music
– Director, actors
movie #16322
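The "99% missing" figure follows from the dataset sizes given earlier. The arithmetic below uses the rounded counts from the slides (100 million ratings, 480,000 users, 17,770 movies), so the results are approximate.

```python
# Rounded figures quoted on the slides.
NUM_RATINGS = 100_000_000
NUM_USERS = 480_000
NUM_MOVIES = 17_770

possible_ratings = NUM_USERS * NUM_MOVIES          # every (user, movie) pair
filled_fraction = NUM_RATINGS / possible_ratings
missing_percent = 100.0 * (1.0 - filled_fraction)
ratings_per_user = NUM_RATINGS / NUM_USERS
ratings_per_movie = NUM_RATINGS / NUM_MOVIES

print(f"possible (user, movie) pairs: {possible_ratings:,}")
print(f"missing: {missing_percent:.1f}%")
print(f"average ratings per user: {ratings_per_user:.0f}")
print(f"average ratings per movie: {ratings_per_movie:.0f}")
```

With these rounded inputs, the per-user and per-movie averages land on the 208 and 5627 shown on the histogram slides.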
![Page 13: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/13.jpg)
Types of recommender systems
1. Personalized recommendations of items (e.g. Amazon products) to users
2. Content-based:
– Pre-specified attributes measured for items
– Users’ interests estimated for same attributes
– Examples: eHarmony, Pandora
3. Collaborative filtering (CF):
– Does not require content information about items or user surveys
– Infers relationships from purchases or ratings
– Nearest neighbor methods
– Hidden attribute methods
![Page 14: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/14.jpg)
![Page 15: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/15.jpg)
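Hidden attribute methods

[Figure: movies and two users, Gus and Dave, placed on two hidden-attribute axes. One axis runs from "Geared towards females" to "Geared towards males"; the other runs from "Serious" to "Escapist". Movies shown: The Princess Diaries, The Lion King, Braveheart, Lethal Weapon, Independence Day, Amadeus, The Color Purple, Dumb and Dumber, Ocean’s 11, Sense and Sensibility.]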
![Page 16: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/16.jpg)
Lessons from the Netflix contest
Movie # 13043
![Page 17: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/17.jpg)
Lesson #1: Look at the data
1. Major steps forward were based on including new aspects of the data:
– Time-based effects
– Selection bias:
  • Which movies a user rates is predictive of rating values
  • Daily rating counts are predictive
2. Use human intelligence to define new features of the data.
![Page 18: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/18.jpg)
Multiple sources of temporal dynamics
• Item-based effects:
– Product perception and popularity change constantly
– Seasonal patterns influence popularity
• User-based effects:
– Customers continually change their tastes
– Transient, short-term bias; anchoring
– Drifting rating scale
– Change of rater within household
![Page 19: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/19.jpg)
Something happened in 2004…
![Page 20: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/20.jpg)
Are old movies better than new ones?
![Page 21: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/21.jpg)
Lesson #3: Mathematics helps!

• Matrix factorization is the leading approach
– Gradient-descent-based optimization
– Integration of biases
– Incorporation of implicit feedback
– Accounting for temporal effects
– Combination with a neighborhood model
![Page 22: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/22.jpg)
Matrix factorization method
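[Figure: a 6 × 12 ratings matrix, axes labeled users and items, with most cells empty, written as the product of a 6 × 3 and a 3 × 12 factor matrix.]

Observed ratings by row (empty cells omitted): (4 5 5 3 1), (3 1 2 4 4 5), (5 3 4 3 2 1 4 2), (2 4 5 4 2), (5 2 2 4 3 4), (4 2 3 3 1)

$$
\text{ratings} =
\begin{pmatrix}
-1 & -0.4 & 0.1 \\
0.5 & 0.6 & -0.5 \\
0.5 & 0.3 & -0.2 \\
0.3 & 2.1 & 0.2 \\
-2 & 2.1 & -0.7 \\
0.3 & 0.7 & 1.1
\end{pmatrix}
\begin{pmatrix}
-0.9 & 2.4 & 1.4 & 0.3 & -0.4 & 0.8 & -0.5 & -2 & 0.5 & 0.3 & -0.2 & 1.1 \\
1.3 & -0.1 & 1.2 & -0.7 & 2.9 & 1.4 & -1 & 0.3 & 1.4 & 0.5 & 0.7 & -0.8 \\
0.1 & -0.6 & 0.7 & 0.8 & 0.4 & -0.3 & 0.9 & 2.4 & 1.7 & 0.6 & -0.4 & 2.1
\end{pmatrix}
$$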
This is a rank-3 linear algebra approximation!
![Page 23: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/23.jpg)
Estimate unknown ratings as dot-products of factor values:
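[Figure: the 6 × 12 ratings matrix, axes labeled users and items, with one empty cell marked "?", written as the product of a 6 × 3 and a 3 × 12 factor matrix.]

Observed ratings by row (empty cells omitted): (4 5 5 3 1), (3 1 2 4 4 5), (5 3 4 3 2 1 4 2), (2 4 5 4 2), (5 2 2 4 3 4), (4 2 3 3 1)

$$
\text{ratings} =
\begin{pmatrix}
0.2 & -0.4 & 0.1 \\
0.5 & 0.6 & -0.5 \\
0.5 & 0.3 & -0.2 \\
0.3 & 2.1 & 1.1 \\
-2 & 2.1 & -0.7 \\
0.3 & 0.7 & -1
\end{pmatrix}
\begin{pmatrix}
-0.9 & 2.4 & 1.4 & 0.3 & -0.4 & 0.8 & -0.5 & -2 & 0.5 & 0.3 & -0.2 & 1.1 \\
1.3 & -0.1 & 1.2 & -0.7 & 2.9 & 1.4 & -1 & 0.3 & 1.4 & 0.5 & 0.7 & -0.8 \\
0.1 & -0.6 & 0.7 & 0.8 & 0.4 & -0.3 & 0.9 & 2.4 & 1.7 & 0.6 & -0.4 & 2.1
\end{pmatrix}
$$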
![Page 24: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/24.jpg)
Estimate unknown ratings as inner-products of factors:
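[Figure: the 6 × 12 ratings matrix, axes labeled users and items, with one empty cell marked "?", approximated by the product of a 6 × 3 and a 3 × 12 factor matrix.]

Observed ratings by row (empty cells omitted): (4 5 5 3 1), (3 1 2 4 4 5), (5 3 4 3 2 1 4 2), (2 4 5 4 2), (5 2 2 4 3 4), (4 2 3 3 1)

$$
\text{ratings} \approx
\begin{pmatrix}
0.2 & -0.4 & 0.1 \\
0.5 & 0.6 & -0.5 \\
0.5 & 0.3 & -0.2 \\
0.3 & 2.1 & 1.1 \\
-2 & 2.1 & -0.7 \\
0.3 & 0.7 & -1
\end{pmatrix}
\begin{pmatrix}
-0.9 & 2.4 & 1.4 & 0.3 & -0.4 & 0.8 & -0.5 & 4 & 0.5 & 0.3 & -0.2 & 1.1 \\
1.3 & -0.1 & 1.2 & -0.7 & 2.9 & 1.4 & -1 & 0.5 & 1.4 & 0.5 & 0.7 & -0.8 \\
0.1 & -0.6 & 0.7 & 0.8 & 0.4 & -0.3 & 0.9 & 1.4 & 1.7 & 0.6 & -0.4 & 2.1
\end{pmatrix}
$$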
![Page 25: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/25.jpg)
Estimate unknown ratings as inner-products of factors:
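[Figure: the 6 × 12 ratings matrix, axes labeled users and items, with one previously empty cell filled in as 1.6, approximated by the product of a 6 × 3 and a 3 × 12 factor matrix.]

Observed ratings by row (empty cells omitted): (4 5 5 3 1), (3 1 2 4 4 5), (5 3 4 3 2 1 4 2), (2 4 5 4 2), (5 2 2 4 3 4), (4 2 3 3 1)

$$
\text{ratings} \approx
\begin{pmatrix}
0.2 & -0.4 & 0.1 \\
0.5 & 0.6 & -0.5 \\
0.5 & 0.3 & -0.2 \\
0.3 & 2.1 & 1.1 \\
-2 & 2.1 & -0.7 \\
0.3 & 0.7 & -1
\end{pmatrix}
\begin{pmatrix}
-0.9 & 2.4 & 1.4 & 0.3 & -0.4 & 0.8 & -0.5 & 4 & 0.5 & 0.3 & -0.2 & 1.1 \\
1.3 & -0.1 & 1.2 & -0.7 & 2.9 & 1.4 & -1 & 0.5 & 1.4 & 0.5 & 0.7 & -0.8 \\
0.1 & -0.6 & 0.7 & 0.8 & 0.4 & -0.3 & 0.9 & 1.4 & 1.7 & 0.6 & -0.4 & 2.1
\end{pmatrix}
$$

The estimate is the inner product of the second row of the left factor matrix with the eighth column of the right factor matrix:

$$
0.5 \cdot 4 + 0.6 \cdot 0.5 + (-0.5) \cdot 1.4 = 1.6
$$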
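The worked estimate on this slide is a plain inner product of two length-3 factor vectors. A minimal sketch, using the numbers from the slide; the function and variable names are hypothetical:

```python
def predict_rating(row_factors, column_factors):
    """Inner product of two factor vectors of equal length."""
    if len(row_factors) != len(column_factors):
        raise ValueError("factor vectors must have the same length")
    return sum(a * b for a, b in zip(row_factors, column_factors))

# Factor values taken from the slide's worked example.
left_factors = [0.5, 0.6, -0.5]
right_factors = [4, 0.5, 1.4]

estimate = predict_rating(left_factors, right_factors)
print(round(estimate, 2))
```

Rounding is applied only for display, since floating-point addition may leave the result a hair away from 1.6.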
![Page 26: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/26.jpg)
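Matrix factorization model

[Figure: the 6 × 12 ratings matrix with most cells empty, written as the product of a 6 × 3 and a 3 × 12 factor matrix.]

Observed ratings by row (empty cells omitted): (4 5 5 3 1), (3 1 2 4 4 5), (5 3 4 3 2 1 4 2), (2 4 5 4 2), (5 2 2 4 3 4), (4 2 3 3 1)

$$
\text{ratings} =
\begin{pmatrix}
0.2 & -0.4 & 0.1 \\
0.5 & 0.6 & -0.5 \\
0.5 & 0.3 & -0.2 \\
0.3 & 2.1 & 1.1 \\
-2 & 2.1 & -0.7 \\
0.3 & 0.7 & -1
\end{pmatrix}
\begin{pmatrix}
-0.9 & 2.4 & 1.4 & 0.3 & -0.4 & 0.8 & -0.5 & -2 & 0.5 & 0.3 & -0.2 & 1.1 \\
1.3 & -0.1 & 1.2 & -0.7 & 2.9 & 1.4 & -1 & 0.3 & 1.4 & 0.5 & 0.7 & -0.8 \\
0.1 & -0.6 & 0.7 & 0.8 & 0.4 & -0.3 & 0.9 & 2.4 & 1.7 & 0.6 & -0.4 & 2.1
\end{pmatrix}
$$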
Why can’t we use standard linear algebra?
1. Standard linear algebra only applies to matrices where every entry has a known value.
2. Smoothing is necessary: We must learn as much signal as possible where there are sufficient data, but not overfit where data are scarce.
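One common way to express both points, not spelled out on the slides, is a squared-error objective summed only over the observed ratings, with a penalty term that does the smoothing. The notation is introduced here for illustration: $K$ is the set of (user, movie) pairs with a known rating $r_{um}$, $p_u$ and $q_m$ are the factor vectors for user $u$ and movie $m$, and $\lambda \ge 0$ controls the amount of smoothing.

$$
\min_{p,\,q} \sum_{(u,m) \in K} \left[ \left( r_{um} - p_u^{\top} q_m \right)^2 + \lambda \left( \lVert p_u \rVert^2 + \lVert q_m \rVert^2 \right) \right]
$$

Because the penalty is added once per observed rating, users and movies with few ratings are pulled more strongly toward zero relative to their data, while heavily rated ones are left freer to fit.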
![Page 27: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/27.jpg)
Matrix approximation and the Netflix contest
• Probably the most popular contest method
– Powerful, fast, and easy to program
– Simon Funk described the first gradient-descent SVD method.
– It immediately ranked 3rd on the leaderboard
– Still today: many related discussions in the Prize forum

Blog post: “Netflix Update: Try This at Home”, Monday, December 11, 2006

Simon Funk is the pseudonym of Brandyn Webb, UCSD CSE B.S. alumnus.
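To give a feel for why this family of methods is "easy to program", here is a small sketch of matrix factorization trained by stochastic gradient descent on observed ratings only. It is written in the spirit of the gradient-descent approach mentioned above, but it is not a reproduction of Funk's code: the function names, the toy data, and all hyperparameters (rank, learning rate, regularization strength, epoch count) are assumptions chosen for illustration.

```python
import math
import random

def factorize(samples, num_users, num_items, rank=2,
              learning_rate=0.02, reg=0.02, epochs=500, seed=0):
    """Fit user and item factors by stochastic gradient descent.

    samples is a list of (user_index, item_index, rating) triples; only
    these observed entries contribute to the error.
    """
    rng = random.Random(seed)
    user_f = [[rng.uniform(0.0, 0.5) for _ in range(rank)]
              for _ in range(num_users)]
    item_f = [[rng.uniform(0.0, 0.5) for _ in range(rank)]
              for _ in range(num_items)]
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, rating in order:
            predicted = sum(a * b for a, b in zip(user_f[u], item_f[i]))
            error = rating - predicted
            for k in range(rank):
                pu, qi = user_f[u][k], item_f[i][k]
                # Gradient step on squared error plus an L2 penalty.
                user_f[u][k] += learning_rate * (error * qi - reg * pu)
                item_f[i][k] += learning_rate * (error * pu - reg * qi)
    return user_f, item_f

def training_rmse(samples, user_f, item_f):
    """RMSE of the factor model on the given observed ratings."""
    total = 0.0
    for u, i, rating in samples:
        predicted = sum(a * b for a, b in zip(user_f[u], item_f[i]))
        total += (rating - predicted) ** 2
    return math.sqrt(total / len(samples))

# Hypothetical toy data: (user, item, rating) triples for 4 users, 4 items.
toy = [(0, 0, 5), (0, 1, 4), (0, 3, 1), (1, 0, 4), (1, 1, 5),
       (1, 2, 1), (2, 1, 5), (2, 3, 2), (3, 2, 1), (3, 3, 2)]

before = training_rmse(toy, *factorize(toy, 4, 4, epochs=0))
after = training_rmse(toy, *factorize(toy, 4, 4))
print(f"training RMSE before: {before:.3f}, after: {after:.3f}")
```

Training error alone says little about how well the model predicts unseen ratings; on real data one would hold out ratings and tune `reg` against them.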
![Page 28: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/28.jpg)
Ideas needed to win the Netflix prize
1. Matrix factorization (see your linear algebra class)
2. RMSE cost function (see your statistics class)
3. Gradient descent (see your calculus class)
4. Stochastic gradient descent (machine learning)
5. Regularization (machine learning)
6. Baseline factors
7. A different target: which movies does a user rate?
8. Time-dependent factor values
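The slides do not define "baseline factors". One common reading, assumed here, is a prediction built from the global mean plus a per-movie offset and a per-user offset, before any matrix factorization is applied. The sketch below estimates those offsets with simple averages; the function names and the toy ratings are hypothetical.

```python
from collections import defaultdict

def fit_baseline(ratings):
    """Estimate global mean, per-user offsets and per-movie offsets.

    ratings is a list of (user, movie, rating) triples.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Movie offset: average deviation of the movie's ratings from the mean.
    movie_residuals = defaultdict(list)
    for _, movie, r in ratings:
        movie_residuals[movie].append(r - mu)
    movie_bias = {m: sum(v) / len(v) for m, v in movie_residuals.items()}

    # User offset: average of what is left after removing the movie offset.
    user_residuals = defaultdict(list)
    for user, movie, r in ratings:
        user_residuals[user].append(r - mu - movie_bias[movie])
    user_bias = {u: sum(v) / len(v) for u, v in user_residuals.items()}

    return mu, user_bias, movie_bias

def predict_baseline(model, user, movie):
    """Baseline prediction; unseen users or movies get a zero offset."""
    mu, user_bias, movie_bias = model
    return mu + user_bias.get(user, 0.0) + movie_bias.get(movie, 0.0)

# Hypothetical toy ratings.
toy_ratings = [("a", "x", 5), ("a", "y", 3), ("b", "x", 3), ("b", "y", 1)]
model = fit_baseline(toy_ratings)
print(predict_baseline(model, "a", "x"))
print(predict_baseline(model, "c", "x"))  # user "c" has no ratings
```

In this toy data, user "a" rates one point above what the movie offsets predict and user "b" one point below, so the offsets explain the ratings without any factor vectors at all.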
![Page 29: Recommender systems and the Netflix prizecseweb.ucsd.edu/classes/wi12/cse91-a/Lectures/...Grand Prize (BellKor’s Pragmatic Chaos) : 0.8554 Major Challenges 1. Size of data – Need](https://reader035.vdocument.in/reader035/viewer/2022062920/5f029ad17e708231d40517f5/html5/thumbnails/29.jpg)
Conclusion
Learn to be computer scientists, then go out and change the world!