Spotify Music Playlist Recommender · 2021. 1. 25.

Spotify Music Playlist Recommender Claire Medina, Ferdie Taruc, Psalm Masaya

Post on 30-Jan-2021

  • Spotify Music Playlist Recommender

    Claire Medina, Ferdie Taruc, Psalm Masaya

  • Team

  • Context

    ● Streaming services revolutionized the accessibility of music

  • Research Question

    Given user preferences, how can we create a playlist that encompasses the interests of different people?

  • Motivation

    ● Creation of shared playlists with others

    ● Catering to different interests

    ● Current Projects: Recommending songs based on mood, etc.

    ● This Project: Appealing to different users

  • Definition of “Playlists”

    ● Unit composed of songs in some ordering

    ● Subjective

    ● Definition: Group of songs that fit a certain theme or genre

  • Limitations to this Definition

    ● Why this definition?

    ○ Difficulty in Dimension Reduction

    ● Can no longer capture:

    ○ Context of the playlist

    ● Content-based filtering

  • Music Recommendation Systems

    ● Collaborative Filtering:

    ○ Exploits interactions between users and items

    ○ Identifies similarities between users

    ● Content-Based Filtering:

    ○ Intrinsic audio features

    ○ Textual metadata

    ● We will use the latter method (content-based filtering) for our model.

  • Introduction to Approaches

    ● Take .json files

    ● Convert dataset into dataframe

    ● Find optimal k given data

    ● Cluster using k-means

    ● Given song input, create a playlist of similar songs

  • Spotify Web API

    ● Spotify’s Web API

    ● Spotipy

    ● Other interesting audio features it provides:

    ○ Danceability, energy, instrumentalness, liveness, loudness, speechiness, valence (positiveness), tempo (BPM)

    ○ Explicitness, Release Date of Song

    More information about audio features: https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/
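As a rough sketch of how the features listed above might be collected, the helper below asks a client for audio features in batches. It assumes `client` behaves like a `spotipy.Spotify` instance, whose `audio_features` method accepts at most 100 tracks per call; the function name `fetch_audio_features` and the output layout are illustrative, not taken from the slides.

```python
from typing import Any, Dict, Iterable, List

# Audio-feature fields named on the slide.
FEATURE_KEYS = [
    "danceability", "energy", "instrumentalness", "liveness",
    "loudness", "speechiness", "valence", "tempo",
]


def fetch_audio_features(client: Any, track_uris: Iterable[str],
                         batch_size: int = 100) -> List[Dict[str, Any]]:
    """Return one row of audio features per track URI.

    `client` is assumed to expose audio_features(tracks) like spotipy.Spotify,
    returning one dict (or None for an unknown track) per requested track.
    """
    uris = list(track_uris)
    rows = []
    for start in range(0, len(uris), batch_size):
        batch = uris[start:start + batch_size]
        for uri, features in zip(batch, client.audio_features(batch)):
            if features is None:  # track not found: skip it
                continue
            row = {"track_uri": uri}
            row.update({key: features.get(key) for key in FEATURE_KEYS})
            rows.append(row)
    return rows
```

With Spotipy installed and credentials configured, a client could be built with `spotipy.Spotify(auth_manager=SpotifyClientCredentials())` and passed in as `client`.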

  • Spotify Million Playlist Dataset

    ● Released in 2018 to enable research on music recommendation systems.

    ● Contains 1 million playlists covering over 2 million unique tracks.

    ● Re-released September 2020 by AICrowd

    Dataset source: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge

  • Dataset Statistics

    ● All playlists were generated by users from the United States:

    ○ Gender

    ■ 45% Male | 54% Female | 0.5% Unspecified | 0.5% Nonbinary

    ○ Age

    ■ 13-17: 10%

    ■ 18-24: 43%

    ■ 25-34: 31%

    ■ 35-44: 9%

    ■ 45-54: 4%

    ■ 55+: 3%

  • Dataset: Playlist Criterion

    ● Playlists that meet the criteria are selected at random:

    ○ Contains at least 5 tracks and no more than 250 tracks

    ○ Contains at least 3 unique artists and at least 2 unique albums

    ○ Was created after January 1, 2010 and before December 1, 2017

    ○ Does not have an offensive title

    ○ Etc.

    ● Other modifications made to some playlists
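The numeric criteria above can be expressed as a simple predicate. This is only a sketch: the `created` field and the function name `meets_criteria` are hypothetical, the track keys are assumed to be `artist_name` and `album_name`, and the offensive-title check and the remaining ("Etc.") criteria are not covered.

```python
from datetime import date


def meets_criteria(playlist: dict) -> bool:
    """Check the track-count, artist, album, and date criteria from the slide."""
    tracks = playlist["tracks"]
    artists = {track["artist_name"] for track in tracks}
    albums = {track["album_name"] for track in tracks}
    created = playlist["created"]  # hypothetical field holding a datetime.date
    return (
        5 <= len(tracks) <= 250
        and len(artists) >= 3
        and len(albums) >= 2
        and date(2010, 1, 1) < created < date(2017, 12, 1)
    )
```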

  • Dataset Formatting

    ● Approx. 33 GB

    ● Stored as .json files

    ○ Approx. 1000 slices

    ○ 1000 playlists within each slice

    ● Huge Database

  • Example of .json

    ● Playlist information provided:

    ○ Name of playlist

    ○ PID (unique identifier)

    ○ Collaborative

    ○ Last Modified

    ○ Etc.

    ● “URI” (song and artist identifier) used to query specific information from Spotify’s API.
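A minimal sketch of the .json-to-dataframe step, assuming each slice is a dictionary with a `playlists` list whose entries carry `pid`, `name`, `collaborative`, and a nested `tracks` list. Those key names and the helper name `slice_to_dataframe` are assumptions about the file layout rather than something shown on the slides.

```python
import pandas as pd


def slice_to_dataframe(slice_data: dict) -> pd.DataFrame:
    """Flatten one slice into a table with one row per (playlist, track)."""
    return pd.json_normalize(
        slice_data["playlists"],
        record_path="tracks",                   # one output row per track
        meta=["pid", "name", "collaborative"],  # playlist fields copied to each row
        meta_prefix="playlist_",                # e.g. pid becomes playlist_pid
    )
```

A slice read with `json.load` can be passed in directly, and the resulting frames from several slices can be combined with `pd.concat`.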

  • Dataframe Example

  • Dataframe Example (2)

  • Adding Artist Genres

    ● Approximately 5,071 distinct genres to describe artists in Spotify

    ● Echonest applies ML on audio features to classify genres

    ● Classify genre of song based on genre of artist (has limitations!)

  • Adding Artist Genres (everynoise.com)

    ● Atmospheric vs. bouncy/upbeat

    ● Mechanical instrumentation (techno music) vs. organic instrumentation (classical)

  • T-SNE Visualization

    ● Organic instrumentation / more classical

    ● Upbeat and radio-edited songs (e.g., Britney Spears, Drake)

    ● Slower music like jazz or Christian music

    ● Pop songs with a lot of mechanical instrumentation

  • The Playlist Recommendation System

  • K-Means Clustering

    Elbow Method
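One way the elbow method could be carried out with scikit-learn is to fit k-means for a range of k and record the inertia (within-cluster sum of squares) for each. The helper name `inertia_by_k` is illustrative, and standardizing the features first is an assumption, not something the slides state.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def inertia_by_k(features, k_values, seed=0):
    """Fit k-means for each k and return {k: inertia}."""
    # Assumption: features are standardized so no single feature dominates.
    scaled = StandardScaler().fit_transform(np.asarray(features, dtype=float))
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=seed).fit(scaled).inertia_
        for k in k_values
    }
```

Plotting the returned values against k and picking the point where the curve flattens gives the "elbow".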

  • Building a Playlist

    ● Input = Song Name, Artist Name, Spotify DataFrame

    ● Randomized selection of 30,000 rows from the csv due to hardware limitations

  • Building a Playlist

    ● Train the KMeans model on the given DataFrame

    ● Append cluster labels to the DataFrame for each row using kmeans.labels_
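The sampling and training steps might look roughly like the sketch below. The function name `cluster_songs`, the `cluster` column name, and the standardization step are assumptions made for illustration; only the 30,000-row sample, the KMeans model, and `kmeans.labels_` come from the slides.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_songs(df, feature_cols, k, max_rows=30000, seed=0):
    """Sample rows, fit k-means on the feature columns, and label each row."""
    if len(df) > max_rows:
        # Random subset, mirroring the hardware-driven limit on the slide.
        df = df.sample(n=max_rows, random_state=seed)
    df = df.reset_index(drop=True).copy()
    scaler = StandardScaler().fit(df[feature_cols])  # assumption: scale first
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=seed)
    kmeans.fit(scaler.transform(df[feature_cols]))
    df["cluster"] = kmeans.labels_  # one cluster label per row
    return df, scaler, kmeans
```

Returning the fitted scaler and model alongside the labeled frame lets a later step apply the same transformation to an input song.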

  • Building a Playlist

    ● Preprocessing for the input.

    ● Predict cluster label for input song.

  • Building a Playlist

    ● Filter the dataframe by cluster label

    ● Calculate Euclidean distances and sort by minimum distance

    ● Pick top n songs (n = 2 for example)
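The prediction, filtering, and ranking steps can be sketched as follows. This assumes the input song is already a row of the DataFrame, that the columns are named `track_name`, `artist_name`, and `cluster`, and that `kmeans` and `scaler` are a fitted scikit-learn `KMeans` and `StandardScaler`; all of these names, including `recommend_similar`, are illustrative.

```python
import numpy as np
import pandas as pd


def recommend_similar(df, feature_cols, kmeans, scaler, song_name, artist_name, n=2):
    """Return the n songs closest to the input song within its cluster."""
    is_input = (df["track_name"] == song_name) & (df["artist_name"] == artist_name)
    if not is_input.any():
        raise ValueError("input song not found in the DataFrame")
    # Preprocess the input song the same way as the training data.
    query = scaler.transform(df.loc[is_input, feature_cols].iloc[[0]])
    label = kmeans.predict(query)[0]
    # Keep only songs in the predicted cluster, excluding the input itself.
    candidates = df[(df["cluster"] == label) & ~is_input].copy()
    if candidates.empty:
        return candidates
    scaled = scaler.transform(candidates[feature_cols])
    candidates["distance"] = np.linalg.norm(scaled - query, axis=1)  # Euclidean
    return candidates.nsmallest(n, "distance")
```

Restricting the distance computation to one cluster keeps the search small, at the cost of missing near neighbors that fall just across a cluster boundary.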

  • Building a Playlist: n-songs Input

    Inputs:

    ● List of Lists → [ [Song Name, Artist Name], [Song Name, Artist Name], ... ]

    ● Spotify DataFrame

    ● Number of Songs to Recommend / Song
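A possible way to extend the single-song step to a list of inputs is to recommend a fixed number of songs per input and merge the results. In this sketch `recommend_one` is a hypothetical callable standing in for the single-song recommender, and dropping duplicates and the input songs themselves is a design choice of the sketch, not something the slides specify.

```python
from typing import Callable, List, Sequence, Tuple

Song = Tuple[str, str]  # (song name, artist name)


def build_playlist(song_inputs: Sequence[Sequence[str]],
                   recommend_one: Callable[[str, str, int], List[Song]],
                   per_song: int) -> List[Song]:
    """Merge per-song recommendations into one playlist without repeats."""
    seen = {(name, artist) for name, artist in song_inputs}  # skip the inputs
    playlist: List[Song] = []
    for name, artist in song_inputs:
        for song in recommend_one(name, artist, per_song):
            if song not in seen:
                seen.add(song)
                playlist.append(song)
    return playlist
```

Because each input contributes its own neighbors, the merged playlist draws from every listener's taste rather than averaging them into a single query.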

  • The Example

  • Output

  • Output

  • Output

  • Limitations of Dataset

    ● Only used 4 slices (500 MB) of .json files

    ● Limited to only a small number of songs

    ● Playlist criterion also creates a biased sample.

    ● Unable to represent all genres

    ● Smaller in comparison to larger recommendation systems like:

  • Conclusion

    ● Real-World Implications

    ○ Ability to create a playlist based on different user preferences

  • Next Steps: Ready for the Real-World?

    ● Ready for the Real-World?

    ○ Scale up

    ○ Playlist metadata

    ○ Ensemble of collaborative filtering with content-based techniques

  • Thank You!

  • Questions?
