collaborative filtering matrix completion alternating least squares · 2016-05-20 · women on the...
TRANSCRIPT
![Page 1: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/1.jpg)
©Sham Kakade 2016 1
Machine Learning for Big Data CSE547/STAT548, University of Washington
Sham Kakade
May 19, 2016
Collaborative Filtering
Matrix Completion
Alternating Least Squares
Case Study 4: Collaborative Filtering
![Page 2: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/2.jpg)
Collaborative Filtering
• Goal: Find movies of interest to a user based on movies watched by the user and others
• Methods: matrix factorization
![Page 3: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/3.jpg)
City of God
Wild Strawberries
The Celebration
La Dolce Vita
Women on the Verge of a Nervous Breakdown
What do I recommend???
![Page 4: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/4.jpg)
Netflix Prize
• Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest accuracy
• 17770 total movies
• 480189 total users
• Over 8 billion possible ratings (17,770 movies × 480,189 users ≈ 8.5 billion entries), so only roughly 1% are observed
• How to fill in the blanks?
Figures from Ben Recht
![Page 5: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/5.jpg)
Matrix Completion Problem
• Filling missing data?
$X$ = ratings matrix
Rows index users
Columns index movies
$X_{ij}$ known for black cells
$X_{ij}$ unknown for white cells
![Page 6: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/6.jpg)
Interpreting Low-Rank Matrix Completion (aka Matrix Factorization)
$X = LR'$
![Page 7: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/7.jpg)
Identifiability of Factors
• If $r_{uv}$ is described by $L_u$, $R_v$, what happens if we redefine the “topics” as
• Then,
$X = LR'$
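The equations on this slide did not survive extraction. As a sketch of the standard identifiability argument (the invertible $k \times k$ matrix $Q$ is notation introduced here, not taken from the slide), any change of basis for the "topics" leaves the predictions unchanged:

```latex
\tilde{L} = L Q, \qquad \tilde{R} = R\,(Q^{-1})'
\quad\Longrightarrow\quad
\tilde{L}\tilde{R}' = L\,Q\,Q^{-1}R' = LR' = X .
```

A permutation of the $k$ topics is the special case where $Q$ is a permutation matrix, so the factors can at best be recovered up to such transformations.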
![Page 8: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/8.jpg)
Matrix Completion via Rank Minimization
• Given observed values:
• Find matrix
• Such that:
• But…
• Introduce bias:
• Two issues:
![Page 9: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/9.jpg)
Approximate Matrix Completion
• Minimize squared error:
– (Other loss functions are possible)
• Choose rank k:
• Optimization problem:
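The optimization problem itself is missing from the transcript. A plausible form, consistent with the $L_u$, $R_v$ notation used elsewhere in these slides (the sizes $n$ users and $m$ movies are assumed names), is the rank-$k$ squared-error fit over observed ratings:

```latex
\min_{L \in \mathbb{R}^{n \times k},\; R \in \mathbb{R}^{m \times k}}
\;\sum_{(u,v)\,:\, r_{uv}\ \text{observed}} \left( L_u \cdot R_v - r_{uv} \right)^2 .
```

Choosing $k$ fixes the rank of $LR'$ in advance, which replaces the hard rank-minimization objective with a squared-error objective at a given rank.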
![Page 10: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/10.jpg)
Coordinate Descent for Matrix Factorization
• Fix movie factors, optimize for user factors
• First observation:
![Page 11: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/11.jpg)
Minimizing Over User Factors
• For each user u:
• In matrix form:
• Second observation: Solve by
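The formulas for this slide were lost in extraction. Under the assumption that the objective is the squared error over observed ratings, with the movie factors $R$ held fixed, the per-user problem is an ordinary least-squares regression. Here $V_u$ (the set of movies rated by user $u$), $R_{V_u}$ (the rows of $R$ for those movies) and $r_u$ (the vector of that user's observed ratings) are notation introduced for this sketch:

```latex
\begin{aligned}
L_u &= \arg\min_{\ell \in \mathbb{R}^k} \sum_{v \in V_u} \left( \ell \cdot R_v - r_{uv} \right)^2
     = \arg\min_{\ell \in \mathbb{R}^k} \left\| R_{V_u}\,\ell - r_u \right\|_2^2, \\
L_u &= \left( R_{V_u}' R_{V_u} \right)^{-1} R_{V_u}'\, r_u
     \quad \text{when } R_{V_u}' R_{V_u} \text{ is invertible.}
\end{aligned}
```

The sum decouples across users, so each $L_u$ can be solved independently of the others.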
![Page 12: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/12.jpg)
Coordinate Descent for Matrix Factorization: Alternating Least-Squares
• Fix movie factors, optimize for user factors
– Independent least-squares over users
• Fix user factors, optimize for movie factors
– Independent least-squares over movies
• System may be underdetermined:
• Converges to
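The alternating scheme above can be sketched in NumPy as follows. This is a minimal illustration, not the course's reference implementation: the function name `als`, the single ridge parameter `lam` (added so each solve stays well-posed when a user or movie has fewer than `k` ratings), and the `(user, movie, rating)` triple format are all assumptions.

```python
import numpy as np

def als(ratings, n_users, n_movies, k=2, lam=0.1, n_iters=20, seed=0):
    """Alternating least squares on observed (user, movie, rating) triples.

    Minimizes sum (L[u] @ R[v] - r)^2 + lam * (||L||_F^2 + ||R||_F^2).
    """
    rng = np.random.default_rng(seed)
    L = 0.1 * rng.standard_normal((n_users, k))   # user factors
    R = 0.1 * rng.standard_normal((n_movies, k))  # movie factors

    # Group the observations by user and by movie once, up front.
    by_user = [[] for _ in range(n_users)]
    by_movie = [[] for _ in range(n_movies)]
    for u, v, r in ratings:
        by_user[u].append((v, r))
        by_movie[v].append((u, r))

    def solve_rows(target, fixed, groups):
        # Each row of `target` is an independent ridge regression
        # against the rows of `fixed` it was observed with.
        for i, obs in enumerate(groups):
            if not obs:
                target[i] = 0.0  # no data: the regularizer alone gives zero
                continue
            idx = [j for j, _ in obs]
            y = np.array([r for _, r in obs])
            A = fixed[idx]
            target[i] = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ y)

    for _ in range(n_iters):
        solve_rows(L, R, by_user)   # fix movie factors, update users
        solve_rows(R, L, by_movie)  # fix user factors, update movies
    return L, R
```

Because each half-step exactly minimizes the regularized objective over one block of variables, the objective never increases from one sweep to the next; it does not follow that the limit is a global optimum.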
![Page 13: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/13.jpg)
Effect of Regularization
$X = LR'$
![Page 14: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/14.jpg)
What you need to know…
• Matrix completion problem for collaborative filtering
• Under-determined (far fewer observed ratings than unknown entries) -> low-rank approximation
• Rank minimization is NP-hard
• Minimize least-squares prediction for known values for given rank of matrix
– Must use regularization
• Coordinate descent algorithm = “Alternating Least Squares”
![Page 15: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/15.jpg)
SGD for Matrix Completion, more algorithms
(Matrix-norm Minimization)
Case Study 4: Collaborative Filtering
![Page 16: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/16.jpg)
Stochastic Gradient Descent
• Observe one rating $r_{uv}$ at a time
• Gradient observing $r_{uv}$:
• Updates:
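The gradient and update formulas are missing from the transcript. A minimal sketch of one such update is given below, assuming the regularized squared-error objective; the function name `sgd_step`, the step size `eta`, and the shared regularization weight `lam` are hypothetical names chosen for illustration.

```python
import numpy as np

def sgd_step(L, R, u, v, r_uv, eta=0.01, lam=0.1):
    """One in-place SGD update of L[u] and R[v] after observing rating r_uv."""
    err = L[u] @ R[v] - r_uv            # prediction error on this rating
    # Gradients of 0.5 * err^2 + 0.5 * lam * (||L_u||^2 + ||R_v||^2).
    # Both are computed before either factor is modified.
    grad_L = err * R[v] + lam * L[u]
    grad_R = err * L[u] + lam * R[v]
    L[u] -= eta * grad_L
    R[v] -= eta * grad_R
    return err
```

Only the factors of the user and movie involved in the observed rating change, which is what makes the update cheap enough for streaming data.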
![Page 17: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/17.jpg)
Local Optima v. Global Optima
• We are solving:
• We (kind of) wanted to solve:
• Which is NP-hard…
– How do these things relate???
![Page 18: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/18.jpg)
Eigenvalue Decompositions for PSD Matrices
• Given a (square) symmetric positive semidefinite matrix:
– Eigenvalues:
• Thus rank is:
• Approximation:
• Property of trace:
• Thus, approximate rank minimization by:
![Page 19: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/19.jpg)
Generalizing the Trace Trick
• Non-square matrices have no trace
• For (square) positive semidefinite matrices, eigendecomposition:
• For rectangular matrices, singular value decomposition:
• Nuclear norm:
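As a numerical illustration of the two facts above (the helper name `nuclear_norm` and the random test matrices are assumptions made for this sketch): the nuclear norm is the sum of singular values, and for a symmetric PSD matrix it coincides with the trace, since the singular values are then the eigenvalues.

```python
import numpy as np

def nuclear_norm(X):
    """Sum of the singular values of X (works for rectangular X)."""
    return np.linalg.svd(X, compute_uv=False).sum()

rng = np.random.default_rng(0)

# Symmetric PSD matrix of rank 2: nuclear norm equals the trace.
B = rng.standard_normal((4, 2))
P = B @ B.T
print(np.isclose(nuclear_norm(P), np.trace(P)))

# Rectangular matrix: no trace, but the nuclear norm is still defined.
A = rng.standard_normal((5, 3))
print(np.isclose(nuclear_norm(A), np.linalg.norm(A, ord="nuc")))
```

NumPy exposes the same quantity directly as `np.linalg.norm(A, ord="nuc")`, which the last line uses as a cross-check.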
![Page 20: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/20.jpg)
Nuclear Norm Minimization
• Optimization problem:
Possible to relax equality constraints:
Both are convex problems! (solved by semidefinite programming)
![Page 21: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/21.jpg)
Nuclear Norm Minimization vs. Direct (Bilinear) Low Rank Solutions
• Nuclear norm minimization:
– Annoying because:
• Instead:
– Annoying because:
– But
• So
• And
• Under certain conditions [Burer, Monteiro ‘04]
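The slide's equations are not recoverable from the transcript. One standard identity that connects the two formulations, offered here as background rather than as the slide's exact statement, is the variational form of the nuclear norm:

```latex
\|X\|_* \;=\; \min_{L,\,R \,:\, X = LR'} \;\tfrac{1}{2}\left( \|L\|_F^2 + \|R\|_F^2 \right).
```

This suggests why Frobenius-norm regularization of the factors in the bilinear problem behaves like nuclear-norm regularization of $X = LR'$, provided the number of columns $k$ of $L$ and $R$ is large enough; the bilinear problem is still non-convex in $(L, R)$.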
![Page 23: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/23.jpg)
Theory
• Suppose true matrix is exactly low rank, what might we hope for?
– Exact recovery?
– Statistically?
– Computationally?
• Is this possible?
• Assumptions:
![Page 24: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/24.jpg)
Analysis of Nuclear Norm
• Nuclear norm minimization = convex relaxation of rank minimization:
• Theorem [Candes, Recht ‘08]:
– If there is a true matrix of rank k,
– And, we observe at least
random entries of true matrix
– Then true matrix is recovered exactly with high probability via convex nuclear norm minimization!
• Under certain conditions
![Page 25: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/25.jpg)
Alternating Minimization
• Alt. Min. used in practice.
• Alt. Min. was widely thought to be only a search heuristic. Does it work?
![Page 26: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/26.jpg)
SGD
• SGD also widely used in practice (streaming, fast)
• Does it work?
![Page 27: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/27.jpg)
What you need to know…
• Stochastic gradient descent for matrix factorization
• Norm minimization as convex relaxation of rank minimization
– Trace norm for PSD matrices
– Nuclear norm in general
• Intuitive relationship between nuclear norm minimization and direct (bilinear) minimization
![Page 28: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/28.jpg)
Machine Learning for Big Data CSE547/STAT548, University of Washington
Emily Fox
May 6th, 2015
Nonnegative Matrix Factorization
Projected Gradient
Case Study 4: Collaborative Filtering
![Page 29: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/29.jpg)
Matrix factorization solutions can be unintuitive…
• Many, many, many applications of matrix factorization
• E.g., in text data, can do topic modeling (alternative to LDA):
• Would like:
• But…
$X = LR'$
![Page 30: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/30.jpg)
Nonnegative Matrix Factorization
• Just like before, but
• Constrained optimization problem
– Many, many, many, many solution methods… we’ll check out a simple one
$X = LR'$
![Page 31: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/31.jpg)
Recall: Projected Gradient
• Standard optimization:
– Want to minimize:
– Use, e.g., gradient updates:
• Constrained optimization:
– Given convex set C of feasible solutions
– Want to find minima within C:
• Projected gradient:
– Take a gradient step (ignoring constraints):
– Projection into feasible set:
![Page 32: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/32.jpg)
Projected Stochastic Gradient Descent for Nonnegative Matrix Factorization
• Gradient step observing $r_{uv}$, ignoring constraints:
• Convex set:
• Projection step:
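The two steps above can be sketched as follows, assuming the feasible set is the nonnegative orthant, so that the Euclidean projection is an elementwise clamp at zero. The function name `projected_sgd_step` and the parameters `eta` and `lam` are hypothetical, and the regularized squared-error objective is an assumption.

```python
import numpy as np

def projected_sgd_step(L, R, u, v, r_uv, eta=0.01, lam=0.1):
    """One projected SGD update for nonnegative matrix factorization."""
    err = L[u] @ R[v] - r_uv
    # Gradient step, ignoring the nonnegativity constraints.
    new_Lu = L[u] - eta * (err * R[v] + lam * L[u])
    new_Rv = R[v] - eta * (err * L[u] + lam * R[v])
    # Projection onto the feasible set {entries >= 0}: clamp negatives to zero.
    L[u] = np.maximum(new_Lu, 0.0)
    R[v] = np.maximum(new_Rv, 0.0)
```

For example, with `L = np.array([[0.01, 1.0]])` and `R = np.array([[1.0, 1.0]])`, a step toward a rating of 0 would push the first user coordinate negative, and the projection sets it back to exactly zero.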
![Page 33: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/33.jpg)
What you need to know…
• In many applications, want factors to be nonnegative
• Corresponds to constrained optimization problem
• Many possible approaches to solve, e.g., projected gradient
![Page 34: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/34.jpg)
Cold Start Problem
Case Study 4: Collaborative Filtering
![Page 35: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/35.jpg)
Cold-Start Problem
• Challenge: Cold-start problem (new movie or user)
• Methods: use features of movie/user
![Page 36: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/36.jpg)
Cold-Start Problem More Formally
• Consider a new user u’ and predicting that user’s ratings
– No previous observations
– Objective considered so far:
– Optimal user factor:
– Predicted user ratings:
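The three formulas on this slide are missing from the transcript. Assuming the regularized squared-error objective with a single weight $\lambda$ (an assumption; the slide may use separate weights for users and movies), the argument can be sketched as:

```latex
\begin{aligned}
&\min_{L,R}\; \sum_{(u,v)\,:\, r_{uv}\ \text{observed}} \left( L_u \cdot R_v - r_{uv} \right)^2
   + \lambda \left( \|L\|_F^2 + \|R\|_F^2 \right), \\
&\hat{L}_{u'} = \arg\min_{\ell}\; \lambda \|\ell\|_2^2 = 0, \qquad
 \hat{r}_{u'v} = \hat{L}_{u'} \cdot R_v = 0 \quad \text{for every movie } v .
\end{aligned}
```

With no observed ratings, only the regularizer involves $L_{u'}$, so the model predicts the same value for every movie and cannot rank them for the new user.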
![Page 37: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/37.jpg)
An Alternative Formulation
• A simpler model for collaborative filtering
– We would not have this issue if we assumed all users were identical
– What about for new movies? What if we had side information?
– What dimension should w be?
– Fit linear model:
– Minimize:
![Page 38: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/38.jpg)
Personalization
• If we don’t have any observations about a user, use wisdom of the crowd
– Address cold-start problem
• Clearly, not all users are the same
• Just as in personalized click prediction, consider model with global and user-specific parameters
• As we gain more information about the user, forget the crowd
![Page 39: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/39.jpg)
User Features…
• In addition to movie features, may have information about the user:
• Combine with features of movie:
• Unified linear model:
![Page 40: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/40.jpg)
Feature-Based Approach vs. Matrix Factorization
• Feature-based approach:
– Feature representation of user and movies fixed
– Can address cold-start problem
• Matrix factorization approach:
– Suffers from cold-start problem
– User & movie features are learned from data
• A unified model:
![Page 41: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/41.jpg)
Unified Collaborative Filtering via SGD
• Gradient step observing $r_{uv}$
– For $L$, $R$:
– For $w$ and $w_u$:
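The update formulas are missing from the transcript. The sketch below assumes the unified model predicts $r_{uv} \approx L_u \cdot R_v + (w + w_u) \cdot \phi(u, v)$, where $\phi(u, v)$ is a fixed feature vector for the user/movie pair. That model form, the function name `unified_sgd_step`, the array `W_user` holding one $w_u$ per row, and the shared regularization weight `lam` are all assumptions made for illustration.

```python
import numpy as np

def unified_sgd_step(L, R, w, W_user, phi_uv, u, v, r_uv, eta=0.01, lam=0.1):
    """One in-place SGD step for latent factors plus global and per-user weights."""
    pred = L[u] @ R[v] + (w + W_user[u]) @ phi_uv
    err = pred - r_uv
    # Compute every gradient before modifying any parameter.
    grad_L = err * R[v] + lam * L[u]
    grad_R = err * L[u] + lam * R[v]
    grad_w = err * phi_uv + lam * w              # global ("crowd") weights
    grad_wu = err * phi_uv + lam * W_user[u]     # user-specific deviation
    L[u] -= eta * grad_L
    R[v] -= eta * grad_R
    w -= eta * grad_w
    W_user[u] -= eta * grad_wu
    return err
```

A brand-new user starts with `W_user[u]` and `L[u]` at zero, so predictions fall back on the global weights `w` until that user's own ratings arrive.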
![Page 42: Collaborative Filtering Matrix Completion Alternating Least Squares · 2016-05-20 · Women on the Verge of a Nervous Breakdown What do I recommend??? ©Sham Kakade 2016 3. Netflix](https://reader034.vdocument.in/reader034/viewer/2022050401/5f7f14e42bb6e969de5d3775/html5/thumbnails/42.jpg)
What you need to know…
• Cold-start problem
• Feature-based methods for collaborative filtering
– Help address cold-start problem
• Unified approach