DataEngConf: Building a Music Recommender System from Scratch with Spotify Data Team

November 14, 2015 Building a Music Recommender from Scratch Vidhya Murali @vid052

Upload: hakka-labs

Post on 23-Jan-2018

Page 1: DataEngConf: Building a Music Recommender System from Scratch with Spotify Data Team

November 14, 2015

Building a Music Recommender from Scratch

Vidhya Murali
@vid052

Page 2

Who Am I?

Vidhya Murali

• Areas of Interest: Data & Machine Learning
• Data Science Engineer @Spotify
• Master's student from the University of Wisconsin-Madison, aka Happy Badger for life!

Page 3

“Torture the data, and it will confess!”
– Ronald Coase, Nobel Prize Laureate

Page 4

Music Recommendations at Spotify

Features: Discover, Discover Weekly, Moments, Radio, Related Artists

Page 5

30 million tracks… What to recommend?

Page 6

Approaches

• Manual Curation by Experts
• Editorial Tagging
• Metadata (e.g. label-provided data, NLP over news, blogs)
• Audio Signals
• Collaborative Filtering Model

Page 8

Definition of CF

Hey, I like tracks P, Q, R, S!
Well, I like tracks Q, R, S, T!
Then you should check out track P!
Nice! Btw try track T!

(Legacy slide of Erik Bernhardsson)
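
The exchange above can be sketched in a few lines of Python. This is a toy illustration only: the user names `alice` and `bob`, the `recommend` helper and the use of Jaccard overlap to pick the most similar listener are all assumptions made for the example, not something the slide specifies.

```python
def jaccard(a, b):
    """Overlap between two sets of liked tracks, between 0 and 1."""
    return len(a & b) / len(a | b)


def recommend(likes, user):
    """Suggest tracks liked by the most similar other user that `user` lacks."""
    mine = likes[user]
    others = [u for u in likes if u != user]
    # Pick the listener whose taste overlaps most with ours.
    closest = max(others, key=lambda u: jaccard(mine, likes[u]))
    return sorted(likes[closest] - mine)


# Hypothetical listeners mirroring the dialogue on the slide.
likes = {
    "alice": {"P", "Q", "R", "S"},
    "bob": {"Q", "R", "S", "T"},
}

for_alice = recommend(likes, "alice")  # tracks bob likes that alice has not heard
for_bob = recommend(likes, "bob")      # tracks alice likes that bob has not heard
```

With only two listeners each one simply receives the track the other has and they lack; with many listeners the same idea needs the more compact models introduced on the following slides.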

Page 9

Collaborative Filtering Model

• Find patterns in users’ past behavior to generate recommendations
• Domain independent
• Scalable
• Accuracy (Collaborative Model) >= Accuracy (Content Based Model)

Page 11

Construct Big Matrix!

Rows: Users (m), e.g. Vidhya
Columns: Artists (n), e.g. Ellie Goulding

Order of Millions!
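
As a rough sketch of how such a matrix might be assembled, the snippet below counts (user, artist) stream events and lays them out with users as rows and artists as columns. The event list, the second user and artist, and the `build_play_counts` helper are hypothetical; at the scale mentioned above a sparse representation would be needed instead of nested lists.

```python
from collections import Counter


def build_play_counts(streams):
    """Turn (user, artist) stream events into a dense users x artists count matrix."""
    counts = Counter(streams)  # (user, artist) -> number of plays
    users = sorted({user for user, _ in counts})
    artists = sorted({artist for _, artist in counts})
    row = {user: r for r, user in enumerate(users)}
    col = {artist: c for c, artist in enumerate(artists)}
    matrix = [[0] * len(artists) for _ in users]
    for (user, artist), plays in counts.items():
        matrix[row[user]][col[artist]] = plays
    return users, artists, matrix


# Made-up stream log for illustration only.
streams = [
    ("vidhya", "Ellie Goulding"),
    ("vidhya", "Ellie Goulding"),
    ("sam", "Daft Punk"),
]
users, artists, matrix = build_play_counts(streams)
```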

Page 16

Latent Factor Models

• Use a “small” representation for each user and item (artist): f-dimensional vectors

[Figure: the user-artist matrix factorized into a user vector matrix and an artist vector matrix, with example vectors for Vidhya and Ellie (here, f = 2)]

User Artist Matrix: (m x n)
User Vector Matrix: X: (m x f)
Artist Vector Matrix: Y: (n x f)
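
A small NumPy sketch of the shapes involved may help. The sizes are toy values, the vectors are random rather than learned, and the `parameter_counts` helper and the choice of f = 40 in its usage are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, f = 4, 5, 2                 # toy sizes; f = 2 as on the slide

X = rng.normal(size=(m, f))       # one f-dimensional vector per user
Y = rng.normal(size=(n, f))       # one f-dimensional vector per artist

approx = X @ Y.T                  # (m x n) approximation of the user-artist matrix

u, i = 1, 3
score = X[u] @ Y[i]               # single entry: inner product of user u and artist i


def parameter_counts(m, n, f):
    """Numbers stored by the full matrix versus by the two factor matrices."""
    return m * n, (m + n) * f


# With a million users and a million artists (assumed round numbers):
full, factored = parameter_counts(1_000_000, 1_000_000, 40)
```

The point of the factorization is visible in the last line: the full matrix grows with m times n, while the two factor matrices grow with (m + n) times f.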

Page 17

Why Vectors?

• Vectors encode higher order dependencies
• Users and Items in the same vector space!
• Use vector similarity to compute:
  • Item-Item similarities
  • User-Item recommendations
• Linear complexity: order of number of latent factors
• Easy to scale up

Page 18

Explicit Matrix Factorization

• User explicitly rates a subset of the music catalog
• Goal: Predict how users will rate new music
• How: Approximate the ratings matrix by the product of 2 smaller matrices, X (users) and Y (artists), by minimizing the RMSE (root mean squared error)

• $r_{ui}$ = user $u$'s rating for item $i$
• $x_u$ = user $u$'s latent factor vector
• $y_i$ = item $i$'s latent factor vector
• $\beta_u$ = bias for user $u$
• $\beta_i$ = bias for item $i$
• $\lambda$ = regularization parameter
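
One common way to write the objective this slide describes is given below. It is a sketch in standard matrix-factorization notation, not necessarily the slide's exact formula: $r_{ui}$ is user $u$'s rating of item $i$, $x_u$ and $y_i$ are the user and item latent factor vectors, $\beta_u$ and $\beta_i$ are the user and item biases, and $\lambda$ is the regularization parameter. Summing only over observed ratings and regularizing only the latent vectors are assumptions.

```latex
\min_{X,\,Y,\,\beta}\;
\sum_{(u,i)\,\in\,\text{observed}}
\left( r_{ui} - x_u^{\top} y_i - \beta_u - \beta_i \right)^2
\;+\;
\lambda \left( \sum_{u} \lVert x_u \rVert^2 + \sum_{i} \lVert y_i \rVert^2 \right)
```

Minimizing the summed squared error is equivalent to minimizing the RMSE over the same set of ratings, since the square root and the division by the number of ratings are monotone.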

Page 22

Matrix Factorization using Implicit Feedback

• User Artist Play Count Matrix
• User Artist Preference Matrix: Binary Label: 1 => played, 0 => not played
• Weights Matrix: Weights based on play count and smoothing
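
The two derived matrices can be sketched as follows. The binary rule comes from the slide; the weighting formula `1 + alpha * log(1 + plays)` and the parameter name `alpha` are assumptions, since the slide only says the weights depend on play count and smoothing. The play counts are made up.

```python
import numpy as np


def preference_and_weights(play_counts, alpha=1.0):
    """Split a play-count matrix into a binary preference matrix and a weight matrix."""
    counts = np.asarray(play_counts, dtype=float)
    preference = (counts > 0).astype(float)      # 1 => played, 0 => not played
    # Assumed smoothing: log-damped play counts, so heavy listening raises
    # the weight without dominating; unplayed pairs keep weight 1.
    weights = 1.0 + alpha * np.log1p(counts)
    return preference, weights


# Toy data: 2 users x 3 artists.
play_counts = [[12, 0, 3],
               [0, 1, 0]]
P, C = preference_and_weights(play_counts)
```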

Page 23

Equation(s) Alert!

Page 24

Implicit Matrix Factorization

1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1

[Figure: the binary matrix above, with users as rows and artists as columns, approximated by the product of X (users) and Y (artists)]

• Aggregate all (user, artist) streams into a large matrix
• Goal: Approximate the binary preference matrix by the product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error), using a function of total plays as weight
• Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the artist latent factor vectors in Y.

• $p_{ui}$ = 1 if user $u$ streamed artist $i$, else 0
• $c_{ui}$ = weight, a function of user $u$'s total plays of artist $i$
• $x_u$ = user $u$'s latent factor vector
• $y_i$ = item $i$'s latent factor vector
• $\beta_u$ = bias for user $u$
• $\beta_i$ = bias for item $i$
• $\lambda$ = regularization parameter
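
A weighted objective consistent with the description above can be written as follows. It is a sketch in standard notation for implicit-feedback matrix factorization, not necessarily the slide's exact formula: $p_{ui}$ is the binary preference, $c_{ui}$ the weight derived from play counts, $x_u$ and $y_i$ the latent factor vectors, $\beta_u$ and $\beta_i$ the biases and $\lambda$ the regularization parameter.

```latex
\min_{X,\,Y,\,\beta}\;
\sum_{u,\,i}
c_{ui} \left( p_{ui} - x_u^{\top} y_i - \beta_u - \beta_i \right)^2
\;+\;
\lambda \left( \sum_{u} \lVert x_u \rVert^2 + \sum_{i} \lVert y_i \rVert^2 \right)
```

Unlike the explicit case, the sum runs over every (user, artist) pair, played or not. A frequently used weight in the implicit-feedback literature is $c_{ui} = 1 + \alpha\, r_{ui}$, with $r_{ui}$ the play count and $\alpha$ a tuning constant; that particular form is an assumption here.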

Page 25

Alternating Least Squares

• Fix artists, solve for users
• Fix users, solve for artists
• Repeat until convergence…
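
The alternation can be sketched with dense NumPy arrays. With one side fixed, each vector on the other side has a closed-form weighted ridge-regression solution, for example $x_u = (Y^\top C^u Y + \lambda I)^{-1} Y^\top C^u p_u$ for a user, where $C^u$ is the diagonal matrix of that user's weights. This sketch leaves out the bias terms, uses invented play counts and an assumed weight `1 + 0.5 * plays`, and runs a fixed number of iterations rather than testing for convergence; the function names are hypothetical and a production system would use sparse, distributed computation.

```python
import numpy as np


def weighted_loss(P, C, X, Y, lam):
    """Weighted squared error plus L2 penalty (bias terms omitted in this sketch)."""
    err = P - X @ Y.T
    return float(np.sum(C * err**2) + lam * (np.sum(X**2) + np.sum(Y**2)))


def solve_side(fixed, P, C, lam):
    """Given the fixed factor matrix, solve one weighted ridge regression per row of P."""
    f = fixed.shape[1]
    out = np.empty((P.shape[0], f))
    for r in range(P.shape[0]):
        weighted = fixed * C[r][:, None]            # each fixed vector scaled by its weight
        A = fixed.T @ weighted + lam * np.eye(f)    # f x f normal-equations matrix
        b = weighted.T @ P[r]                       # right-hand side, length f
        out[r] = np.linalg.solve(A, b)
    return out


def als_implicit(P, C, f=2, lam=0.1, iters=20, seed=0):
    """Alternate between solving for user vectors X and artist vectors Y."""
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=0.1, size=(P.shape[1], f))  # random starting artist vectors
    for _ in range(iters):
        X = solve_side(Y, P, C, lam)        # fix artists, solve for users
        Y = solve_side(X, P.T, C.T, lam)    # fix users, solve for artists
    return X, Y


# Binary preference matrix in the style of the slides (6 users x 8 artists).
P = np.array([[1, 0, 0, 0, 1, 0, 0, 1],
              [0, 0, 1, 0, 0, 1, 0, 0],
              [1, 0, 1, 0, 0, 0, 1, 1],
              [0, 1, 0, 0, 0, 1, 0, 0],
              [0, 0, 1, 0, 0, 1, 0, 0],
              [1, 0, 0, 0, 1, 0, 0, 1]], dtype=float)
plays = P * np.random.default_rng(1).integers(1, 20, size=P.shape)  # invented play counts
C = 1.0 + 0.5 * plays                                               # assumed weighting

X, Y = als_implicit(P, C)

# Recommend, for each user, the highest-scoring artist they have not played yet.
scores = X @ Y.T
scores[P > 0] = -np.inf
best_new_artist = scores.argmax(axis=1)
```

Because each half-step exactly minimizes the objective with the other side held fixed, the loss never increases from one iteration to the next, which is what makes the simple "repeat until convergence" loop safe.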

Page 31

Vectors

• “Compact” representation for users and items (artists) in the same space

Page 32

Recommendations via Cosine Similarity
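
A minimal sketch of ranking by cosine similarity, assuming user and artist vectors already live in the same space. The vectors and the helper names `cosine_similarity` and `top_k` are made up for the example.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def top_k(query, items, k):
    """Indices of the k rows of `items` most cosine-similar to `query`."""
    norms = np.linalg.norm(items, axis=1) * np.linalg.norm(query)
    sims = items @ query / norms
    return np.argsort(-sims)[:k].tolist()


# Toy 2-dimensional vectors: three artists and one user.
artist_vectors = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [1.0, 1.0]])
user_vector = np.array([1.0, 0.1])

recommended = top_k(user_vector, artist_vectors, k=2)
```

Passing an artist vector as the query instead of a user vector gives item-item similarities with the same function.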

Page 34

Annoy

• 70 million users, at least 4 million tracks for candidates per user
• Brute Force Approach:
  • O(70M x 4M x 10) ≈ 2.8 x 10^15, i.e. about 3 peta-operations!
• Approximate Nearest Neighbors Oh Yeah!
• Uses Locality-Sensitive Hashing
• Clone: https://github.com/spotify/annoy
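
To illustrate the general idea of hashing vectors so that only a small bucket of candidates is scored, here is a sketch of random-hyperplane locality-sensitive hashing in NumPy. It is not Annoy's implementation or API; the class name `HyperplaneLSH`, its parameters and the random track vectors are all hypothetical.

```python
from collections import defaultdict

import numpy as np


class HyperplaneLSH:
    """Bucket vectors by which side of each random hyperplane they fall on."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))   # one random hyperplane per bit
        self.buckets = defaultdict(list)
        self.vectors = {}

    def _key(self, v):
        # Vectors pointing in similar directions tend to share the same sign pattern.
        return tuple((self.planes @ v > 0).tolist())

    def add(self, item_id, v):
        v = np.asarray(v, dtype=float)
        self.vectors[item_id] = v
        self.buckets[self._key(v)].append(item_id)

    def query(self, v, k=5):
        """Rank only the items in the query's bucket by cosine similarity."""
        v = np.asarray(v, dtype=float)
        candidates = self.buckets.get(self._key(v), [])

        def cos(item_id):
            w = self.vectors[item_id]
            return float(v @ w / (np.linalg.norm(v) * np.linalg.norm(w)))

        return sorted(candidates, key=cos, reverse=True)[:k]


# Index 200 random 10-dimensional "track vectors" (made-up data).
rng = np.random.default_rng(1)
track_vectors = rng.normal(size=(200, 10))
index = HyperplaneLSH(dim=10, n_bits=6)
for track_id, vec in enumerate(track_vectors):
    index.add(track_id, vec)

neighbours = index.query(track_vectors[7], k=3)
```

The trade-off is the usual one for approximate search: a query touches one bucket instead of all items, at the price of occasionally missing a true neighbour that landed in a different bucket. Using several independent hash tables is a common way to reduce such misses.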

Page 36

Thank You!

You can reach me at:
Email: [email protected]
@vid052