Music recommendations at Spotify

Music recommendations at Spotify Erik Bernhardsson [email protected]

Post on 25-Feb-2016

TRANSCRIPT

Page 1: Music recommendations at Spotify

Music recommendations at Spotify

Erik Bernhardsson [email protected]

Page 2: Music recommendations at Spotify

Spotify

- Launched in 2009
- Available in 17 countries
- 20M active users, 5M paying subscribers
- Peak at 5k tracks/s, 1M logged in users
- 20M tracks

Page 3: Music recommendations at Spotify

Some applications

Page 4: Music recommendations at Spotify

Recommendation stuff at Spotify

- Related artists:

Page 5: Music recommendations at Spotify

Recommendation stuff at Spotify, cont…

Page 6: Music recommendations at Spotify

More!

Page 7: Music recommendations at Spotify

How can we find music?

Page 8: Music recommendations at Spotify

Recommendations

- Manual classification
- Feature extraction
- Social media analysis, web scraping, metadata based
- Collaborative filtering

Page 9: Music recommendations at Spotify

Pandora & Music Genome Project

- Classifies tracks in terms of 400 attributes
- Each track takes 20-30 minutes to classify
- A distance function finds similar tracks

Example attributes:

- “Subtle use of strings”
- “Epic buildup”
- “Acid Jazz roots”
- “Beats made for dancing”
- “Trippy soundscapes”
- “Great trombone solo”
- …
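To illustrate the "distance function finds similar tracks" bullet, here is a minimal sketch. The slides do not say which distance Pandora uses, so Euclidean distance is an assumption, and the track names and three-attribute vectors are hypothetical stand-ins for the roughly 400 real attributes.

```python
import math

# Hypothetical attribute scores (0..1) for three tracks; a real Music Genome
# vector would have roughly 400 entries per track.
tracks = {
    "track_a": [0.9, 0.1, 0.8],
    "track_b": [0.8, 0.2, 0.7],
    "track_c": [0.1, 0.9, 0.2],
}

def distance(x, y):
    """Euclidean distance between two attribute vectors (an assumed choice)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def most_similar(name, k=1):
    """Return the k tracks closest to `name`, nearest first."""
    others = [t for t in tracks if t != name]
    return sorted(others, key=lambda t: distance(tracks[name], tracks[t]))[:k]
```

With these made-up vectors, `most_similar("track_a")` picks `track_b`, whose attributes are nearly the same.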

Page 10: Music recommendations at Spotify

Scraping the web is another approach

Page 11: Music recommendations at Spotify

Feature extraction

Page 12: Music recommendations at Spotify

Collaborative filtering

Idea:

- If two movies x, y get similar ratings then they are probably similar
- If a lot of users all listen to tracks x, y, z, then those tracks are probably similar

Page 13: Music recommendations at Spotify

Collaborative filtering

Page 14: Music recommendations at Spotify

Get data

Page 15: Music recommendations at Spotify

… lots of data

Page 16: Music recommendations at Spotify

Aggregate data

Throw away temporal information and just look at the number of times
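A minimal sketch of this aggregation step, assuming the count in question is plays per (user, track) pair. The log format and the names are hypothetical, not Spotify's actual schema.

```python
from collections import Counter

# Hypothetical play log: (user, track, timestamp) tuples.
play_log = [
    ("alice", "track_x", 1001),
    ("alice", "track_x", 1050),
    ("alice", "track_y", 1100),
    ("bob", "track_x", 1200),
]

def aggregate(log):
    """Drop the timestamps and count plays per (user, track) pair."""
    return Counter((user, track) for user, track, _ts in log)

counts = aggregate(play_log)
```

The resulting counter is a sparse representation of the user-by-track matrix: pairs that never occur are simply absent and read as zero.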

Page 17: Music recommendations at Spotify

OK, so now we have a big matrix

Page 18: Music recommendations at Spotify

… very big matrix

Throw out all the temporal data:

Page 19: Music recommendations at Spotify

Supervised collaborative filtering is pretty much matrix completion

Page 20: Music recommendations at Spotify

Supervised learning: Matrix completion

Page 21: Music recommendations at Spotify

Supervised: evaluating rec quality

Page 22: Music recommendations at Spotify

Unsupervised learning

- Trying to estimate the density
- i.e. predict probability of future events

Page 23: Music recommendations at Spotify

Try to predict the future given the past

Page 24: Music recommendations at Spotify

How can we find similar items?

Page 25: Music recommendations at Spotify

We can calculate a correlation coefficient as an item similarity

- Use something like Pearson, Jaccard, …
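The two measures named above can be sketched in a few lines. The listener sets and play-count lists are hypothetical; Jaccard works on sets of users per track, Pearson on aligned play counts.

```python
import math

# Hypothetical listener sets per track.
listeners = {
    "track_x": {"alice", "bob", "carol"},
    "track_y": {"alice", "bob"},
    "track_z": {"dave"},
}

def jaccard(a, b):
    """|A intersect B| / |A union B| for two sets of users."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of play counts."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0
```

Computing either measure for every pair of items is quadratic in the number of items, which is costly at a catalogue size of 20M tracks.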

Page 26: Music recommendations at Spotify

Amazon did this for “customers who bought this also bought”

- US patent 7113917

Page 27: Music recommendations at Spotify

Parallelization is hard though

Page 28: Music recommendations at Spotify

Can speed this up using various LSH tricks

- Twitter: Dimension Independent Similarity Computation (DISCO)
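As one example of an LSH-style trick, the sketch below uses MinHash, a well-known scheme for estimating Jaccard similarity from short signatures. It is a stand-in chosen for illustration: it is not the DISCO algorithm, and the slides do not say which scheme Spotify uses. The helper names are hypothetical.

```python
import hashlib

def _hash(seed, item):
    """Deterministic 64-bit hash of `item` under a given seed."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def minhash_signature(items, num_hashes=128):
    """One minimum hash value per seed; assumes `items` is non-empty."""
    return [min(_hash(seed, it) for it in items) for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)
```

Signatures have a fixed length regardless of how many users listened to a track, so comparing two tracks costs the same whether they have ten listeners or ten million.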

Page 29: Music recommendations at Spotify

Are there other approaches?

Page 30: Music recommendations at Spotify

Natural Language Processing has a lot of similar problems

…matrix factorization is one idea

Page 31: Music recommendations at Spotify

Matrix factorization

Page 32: Music recommendations at Spotify

Matrix factorization

- Want to get user vectors and item vectors
- Assume f latent factors (dimensions) for each user/item
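A small sketch of what the factorized representation looks like. The sizes are hypothetical and the vectors are random rather than learned; the point is only the shapes and the parameter count.

```python
import random

random.seed(0)
num_users, num_items, f = 4, 5, 2  # f latent factors (hypothetical sizes)

# One f-dimensional vector per user and per item.
user_vectors = [[random.gauss(0, 0.1) for _ in range(f)] for _ in range(num_users)]
item_vectors = [[random.gauss(0, 0.1) for _ in range(f)] for _ in range(num_items)]

def predict(u, i):
    """Approximate matrix entry (u, i) from the two latent vectors."""
    return sum(a * b for a, b in zip(user_vectors[u], item_vectors[i]))

# The full num_users x num_items matrix is described by only
# (num_users + num_items) * f numbers.
approx = [[predict(u, i) for i in range(num_items)] for u in range(num_users)]
num_parameters = (num_users + num_items) * f
```

Here 18 numbers stand in for a 20-entry matrix; at the scale quoted earlier in the talk (20M users, 20M tracks) the saving is what makes the approach practical.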

Page 33: Music recommendations at Spotify

Probabilistic Latent Semantic Analysis (PLSA)

- Hofmann, 1999
- Also called PLSI

Page 34: Music recommendations at Spotify

PLSA, cont.

+ a bunch of constraints:

Page 35: Music recommendations at Spotify

PLSA, cont.

Optimization problem: maximize log-likelihood
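The equations from the PLSA slides did not survive in this transcript. For reference, the standard formulation from Hofmann (1999), written here for users u, items i and latent classes z, is given below. The notation is an assumption and may differ from the slides; n(u, i) denotes the number of times user u played item i.

```latex
% Model: a user picks a latent class, and the class picks an item
P(u, i) = P(u) \sum_{z} P(i \mid z)\, P(z \mid u)

% Constraints: every factor is a probability distribution
\sum_{i} P(i \mid z) = 1, \qquad
\sum_{z} P(z \mid u) = 1, \qquad
P(i \mid z) \ge 0, \quad P(z \mid u) \ge 0

% Objective: maximize the log-likelihood of the observed play counts
\mathcal{L} = \sum_{u, i} n(u, i) \, \log P(u, i)
```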

Page 36: Music recommendations at Spotify

PLSA, cont.

Page 37: Music recommendations at Spotify
Page 38: Music recommendations at Spotify
Page 39: Music recommendations at Spotify
Page 40: Music recommendations at Spotify
Page 41: Music recommendations at Spotify

“Collaborative Filtering for Implicit Feedback Datasets”

- Hu, Koren, Volinsky (2008)
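For reference, the objective from that paper, in the paper's own notation (the slide's rendering is not preserved here): r_ui is the observed play count, p_ui a binary preference, c_ui a confidence weight, and x_u, y_i the user and item vectors.

```latex
p_{ui} =
\begin{cases}
1 & r_{ui} > 0 \\
0 & r_{ui} = 0
\end{cases}
\qquad
c_{ui} = 1 + \alpha\, r_{ui}

\min_{x_*,\, y_*} \;
\sum_{u, i} c_{ui} \left( p_{ui} - x_u^{\top} y_i \right)^2
+ \lambda \left( \sum_{u} \lVert x_u \rVert^2 + \sum_{i} \lVert y_i \rVert^2 \right)
```

The sum runs over all user-item pairs, including unobserved ones, and the paper minimizes it with alternating least squares.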

Page 42: Music recommendations at Spotify

“Collaborative Filtering for Implicit Feedback Datasets”, cont.

Page 43: Music recommendations at Spotify

Here is another method we use

Page 44: Music recommendations at Spotify

What happens each iteration

- Assign all latent vectors small random values
- Perform gradient ascent to optimize log-likelihood

Page 45: Music recommendations at Spotify

Calculate derivative and do gradient ascent

- Assign all latent vectors small random values
- Perform gradient ascent to optimize log-likelihood
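The two steps can be sketched as follows. The transcript does not say which likelihood the method uses, so this sketch assumes a logistic model in which the probability that a user played an item is the sigmoid of the dot product of their vectors. The data, sizes and learning rate are all hypothetical.

```python
import math
import random

random.seed(0)
f = 2  # number of latent factors

# Hypothetical observations: (user, item, played) with played in {0, 1}.
data = [(0, 0, 1), (0, 1, 1), (0, 2, 0), (1, 0, 0), (1, 1, 0), (1, 2, 1)]

# Step 1: assign all latent vectors small random values.
users = [[random.uniform(-0.1, 0.1) for _ in range(f)] for _ in range(2)]
items = [[random.uniform(-0.1, 0.1) for _ in range(f)] for _ in range(3)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(u, i):
    return sum(a * b for a, b in zip(users[u], items[i]))

def log_likelihood():
    total = 0.0
    for u, i, y in data:
        p = sigmoid(score(u, i))
        total += math.log(p) if y else math.log(1.0 - p)
    return total

def gradient_step(rate=0.1):
    """Step 2: move every vector a little way up the log-likelihood gradient."""
    grad_users = [[0.0] * f for _ in users]
    grad_items = [[0.0] * f for _ in items]
    for u, i, y in data:
        err = y - sigmoid(score(u, i))  # d(log-likelihood) / d(score)
        for k in range(f):
            grad_users[u][k] += err * items[i][k]
            grad_items[i][k] += err * users[u][k]
    for vecs, grads in ((users, grad_users), (items, grad_items)):
        for vec, grad in zip(vecs, grads):
            for k in range(f):
                vec[k] += rate * grad[k]

before = log_likelihood()
for _ in range(500):
    gradient_step()
after = log_likelihood()
```

The random initialization matters: with all vectors exactly zero, every gradient is zero and the vectors never move.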

Page 46: Music recommendations at Spotify

2D iteration example

Page 47: Music recommendations at Spotify

Vectors are pretty nice because things are now super fast

- User-item score is a dot product:

- Item-item similarity score is a cosine similarity:

- Both cases have trivial complexity in the number of factors f:
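The formulas on this slide are missing from the transcript. A minimal sketch of the two scores, with hypothetical 2-factor vectors, is below. Each call does a fixed number of passes over f numbers, so the cost is linear in f.

```python
import math

def dot(a, b):
    """User-item score: one multiply-add per latent factor."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Item-item similarity: dot product normalised by both vector lengths."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0

# Hypothetical 2-factor vectors.
user = [1.0, 2.0]
item_a = [3.0, 4.0]
item_b = [6.0, 8.0]   # same direction as item_a
item_c = [-4.0, 3.0]  # orthogonal to item_a
```

Cosine ignores vector length, so `item_a` and `item_b` count as maximally similar even though one is twice as long as the other.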

Page 48: Music recommendations at Spotify

Example: item similarity as a cosine of vectors

Page 49: Music recommendations at Spotify

Two dimensional example for tracks

Page 50: Music recommendations at Spotify

We can rank all tracks by the user’s vector
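Ranking then amounts to scoring every track against the user's vector and sorting. A brute-force sketch with hypothetical vectors follows; a production system would likely need something faster than a full sort over 20M tracks.

```python
def rank_tracks(user_vector, track_vectors, k=3):
    """Return the k track names with the highest dot-product score."""
    def score(name):
        return sum(u * t for u, t in zip(user_vector, track_vectors[name]))
    return sorted(track_vectors, key=score, reverse=True)[:k]

# Hypothetical 2-factor vectors.
track_vectors = {
    "track_a": [0.9, 0.1],
    "track_b": [0.2, 0.8],
    "track_c": [-0.5, 0.3],
}
top = rank_tracks([1.0, 0.0], track_vectors, k=2)
```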

Page 51: Music recommendations at Spotify

So how do we implement this?

Page 52: Music recommendations at Spotify

Hadoop at Spotify

Page 53: Music recommendations at Spotify

One iteration of a matrix factorization algorithm

“Google News personalization: scalable online collaborative filtering”

Page 54: Music recommendations at Spotify
Page 55: Music recommendations at Spotify

So now we've solved the problem of recommendations, right?

Page 56: Music recommendations at Spotify

Actually what we really want is to apply it to other domains

Page 57: Music recommendations at Spotify

Radio

- Artist radio: find related tracks
- Optimize ensemble model based on skip/thumbs data

Page 58: Music recommendations at Spotify

Learning from feedback is actually pretty hard

Page 59: Music recommendations at Spotify

A/B testing

Page 60: Music recommendations at Spotify

More applications!!!

Page 61: Music recommendations at Spotify
Page 62: Music recommendations at Spotify
Page 63: Music recommendations at Spotify

Last but not least: we’re hiring!

Page 64: Music recommendations at Spotify

Thank you