Music Recommendation A Data Mining Approach Daniel McEnnis 2nd year PhD

Upload: aubrey-fisher

Post on 12-Jan-2016

TRANSCRIPT

Page 1: Music Recommendation: A Data Mining Approach

Daniel McEnnis, 2nd year PhD

Page 2: Overview

High level overview
Toolkit Improvements
Experiments
Evaluation
Algorithms research
Data
Future work

Page 3: Project Goals

Integrate social information
Make algorithms ‘culturally aware’
Implement existing algorithms
Systematic evaluation framework

Page 4: Similarity Algorithms

Create new relations based on some aspect of similarity
6 different varieties of similarity
Each algorithm can use one of 6 distance functions
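
The idea of creating new relations from similarity, with the distance function as a parameter, can be sketched as follows. This is a minimal illustration and not Graph-RAT's API: the function names are hypothetical, actors are assumed to be described by sparse feature dictionaries, and cosine distance is assumed as one possible distance function (the slides do not name the six).

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Cosine distance between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def similarity_links(profiles, distance, threshold):
    """Return new (actor, actor, distance) relations for pairs closer than threshold."""
    links = []
    for (u, pu), (v, pv) in combinations(sorted(profiles.items()), 2):
        d = distance(pu, pv)
        if d < threshold:
            links.append((u, v, d))
    return links


# Made-up listener profiles (hypothetical data).
profiles = {
    "alice": {"jazz": 3.0, "swing": 1.0},
    "bob": {"jazz": 2.0, "swing": 1.0},
    "carol": {"metal": 5.0},
}
links = similarity_links(profiles, cosine_distance, threshold=0.2)
```

Because the distance function is passed in as an argument, swapping in another one does not change the link-building code.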

Page 5: Aggregator Algorithms

Takes data from one set of actors and moves it to another
6 different varieties
Each variety uses one of 7 aggregator functions
Basic building block of Graph-RAT applications
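
As a rough illustration of moving data from one set of actors to another, the sketch below pushes a per-song value onto users along user-to-song links and combines it with a pluggable aggregator function. The names are hypothetical and not taken from Graph-RAT; sum and mean are assumed as two possible aggregator functions, since the slides do not list the seven.

```python
from collections import defaultdict


def aggregate(links, source_values, combine):
    """Move values from source actors to destination actors along links.

    links: iterable of (destination, source) pairs, e.g. (user, song).
    source_values: dict mapping each source actor to a number.
    combine: aggregator function applied to the list of collected values.
    """
    collected = defaultdict(list)
    for dest, src in links:
        if src in source_values:
            collected[dest].append(source_values[src])
    return {dest: combine(values) for dest, values in collected.items()}


def mean(values):
    return sum(values) / len(values)


# Made-up data: which user listened to which song, and a tempo per song.
listens = [("alice", "song1"), ("alice", "song2"), ("bob", "song2")]
tempo = {"song1": 120.0, "song2": 100.0}

user_total = aggregate(listens, tempo, sum)
user_mean = aggregate(listens, tempo, mean)
```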

Page 6: Graph Triples Census

Probable novel algorithm
Proof of Correctness Completed
Proof of Time Complexity Completed
Literature review in progress
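
The slides do not describe the triples census algorithm itself. For orientation only, the following is a brute-force baseline that counts, for an undirected graph, how many triples of actors have 0, 1, 2, or 3 links among them. It is an assumption about what a triples census reports, it is not the algorithm referred to above, and it takes time cubic in the number of actors.

```python
from itertools import combinations


def naive_triples_census(nodes, edges):
    """Count node triples by how many of their three possible undirected links exist."""
    edge_set = {frozenset(e) for e in edges}
    census = [0, 0, 0, 0]  # index = number of links present in the triple
    for a, b, c in combinations(nodes, 3):
        present = sum(
            frozenset(pair) in edge_set for pair in ((a, b), (a, c), (b, c))
        )
        census[present] += 1
    return census


# Made-up four-actor graph.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
census = naive_triples_census(nodes, edges)
```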

Page 7: SUCCESS!

Graph-RAT programming language now functioning
Graph-RAT integrates social, cultural, personal, and audio data into algorithms
Includes most commercial algorithms
Contains primitives for existing academic systems
Evaluation is entirely automated

Page 8: PROBLEMS

Page 9: Evaluation Exploration

9 types of music recommendation
Personalized versus generic
Open query versus targeted query
Dynamic versus static data
New music versus all music

Page 10: Personalized Radio

Open query with personalized presentation
Static data vs dynamic data
New items prediction vs predict anything

Page 11: Targeted Search

Not personalized
Similarity queries
Automatically generating targeted lists for a browsing hierarchy
New music vs all music
Static vs dynamic data

Page 12: Personalized Tag Radio

Create a personalized playlist matching a given query
New music vs all music
Static vs dynamic data

Page 13: Excluded Types

‘Top 40’ prediction
  Rendered obsolete by other types

Page 14: Existing Algorithms

Item-to-Item collaborative filtering
  7 variations
User-to-user collaborative filtering
  7 variations
Associative mining collaborative filtering
Direct machine learning on playlist data
Direct machine learning on audio data
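
For readers unfamiliar with item-to-item collaborative filtering, a minimal sketch follows. It assumes cosine similarity over user play counts and a similarity-weighted score, which is one common formulation; the slides do not say which seven variations were implemented, and all names and data here are hypothetical.

```python
import math
from collections import defaultdict


def item_similarities(plays):
    """Cosine similarity between items, computed from user play counts.

    plays: dict mapping user -> {item: play_count}.
    """
    item_vectors = defaultdict(dict)
    for user, items in plays.items():
        for item, count in items.items():
            item_vectors[item][user] = count

    def cosine(a, b):
        dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
            sum(v * v for v in b.values())
        )
        return dot / norm if norm else 0.0

    sims = defaultdict(dict)
    for i in item_vectors:
        for j in item_vectors:
            if i != j:
                sims[i][j] = cosine(item_vectors[i], item_vectors[j])
    return sims


def recommend(user, plays, sims, k=1):
    """Score unseen items by similarity to the items the user has played."""
    seen = plays[user]
    scores = defaultdict(float)
    for item, count in seen.items():
        for other, sim in sims[item].items():
            if other not in seen:
                scores[other] += sim * count
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# Made-up play counts.
plays = {
    "alice": {"song1": 5, "song2": 3},
    "bob": {"song1": 4, "song2": 2, "song3": 1},
    "carol": {"song3": 6, "song4": 2},
}
sims = item_similarities(plays)
top = recommend("alice", plays, sims, k=1)
```

User-to-user filtering follows the same pattern with the roles swapped: similarities are computed between users, and items are scored through the plays of similar users.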

Page 15: Novel Algorithms

Machine learning over profile data
Machine learning over cultural and profile data
Machine learning on different concatenations
  Audio
  Playlist
  Profile
  Cultural
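
One way to picture "different concatenations" is to join the feature vectors of the four data sources in every possible combination and hand each result to a learner. The sketch below only builds the concatenated vectors. Enumerating all combinations is an assumption, since the slides do not say which ones were tried, and the feature values are made up.

```python
from itertools import combinations

SOURCES = ("audio", "playlist", "profile", "cultural")


def concatenations(features):
    """Yield (names, vector) for every non-empty combination of feature sources."""
    for size in range(1, len(SOURCES) + 1):
        for names in combinations(SOURCES, size):
            vector = []
            for name in names:
                vector.extend(features[name])
            yield names, vector


# Made-up feature values for one (user, song) example.
features = {
    "audio": [0.12, 0.80],
    "playlist": [1.0],
    "profile": [0.0, 1.0, 0.0],
    "cultural": [0.5],
}
variants = dict(concatenations(features))
```

With four sources this yields 15 input variants, from each source alone up to all four concatenated.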

Page 16: Initial Data

LiveJournal
  Separating music data is difficult
  No tag info or audio content
  Not enough musical data
LastFM by User
  No audio content
  Data cleaning is an issue

Page 17: Current Data

40’s Jazz Recordings
  1800 annotated recordings from 70 CDs
  Covers nearly all 40’s popular music
LastFM by Song
  Retrieves tag and user info by song
  Data cleaning on user playcounts needed

Page 18: Data Cleaning Tags

Polysemy
Synonymy
Disjoint
Hypernymy
Hyponymy
Initial algorithms developed
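
The slides do not describe the tag-cleaning algorithms that were developed. As an illustration of the simplest part of the synonymy problem, the sketch below merges tags that differ only in case, spacing, or punctuation. The normalization rule and all names are assumptions, and true synonyms with different spellings would need more than this.

```python
import re
from collections import defaultdict


def normalize(tag):
    """Lowercase a tag and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())


def merge_spelling_variants(tag_counts):
    """Group tags that share a normalized form and sum their counts.

    The most frequent spelling in each group is kept as the canonical tag.
    """
    groups = defaultdict(list)
    for tag, count in tag_counts.items():
        groups[normalize(tag)].append((count, tag))
    merged = {}
    for variants in groups.values():
        canonical = max(variants)[1]
        merged[canonical] = sum(count for count, _ in variants)
    return merged


# Made-up tag counts.
tag_counts = {"Hip-Hop": 40, "hip hop": 25, "hiphop": 10, "swing": 7}
merged = merge_spelling_variants(tag_counts)
```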

Page 19: Future Work: Programming

Radically different programming environment
SQL LINQ library package in C#

Page 20: Future Work: Scalability

Distributed SQL database implementation
Just-in-time compilation
Event-based recalculation of algorithm results
Parallel execution of algorithms
Multi-threaded algorithms
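
Parallel execution of independent algorithms could look roughly like the sketch below, which submits each algorithm to a thread pool and gathers the results by name. This is a hypothetical illustration in Python rather than a description of the planned implementation, and the three "algorithms" are stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(algorithms, data, max_workers=4):
    """Run independent algorithms over the same read-only data and collect results by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn, data) for name, fn in algorithms.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-in algorithms over made-up play counts.
play_counts = [5, 3, 4, 2, 1]
algorithms = {
    "total": sum,
    "maximum": max,
    "distinct": lambda values: len(set(values)),
}
results = run_in_parallel(algorithms, play_counts)
```

This only works safely when the algorithms do not depend on each other's output. In CPython, CPU-bound work would gain more from a process pool than from threads.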