improving music genre classification using collaborative tagging data ling chen, phillip wright *,...
Improving Music Genre Classification Using Collaborative Tagging Data
Ling Chen, Phillip Wright*, Wolfgang Nejdl
Leibniz University Hannover, *Georgia Institute of Technology
WSDM 2009
Introduction – Music Information Retrieval
People need to search music by its content.
Music genre: a top-level description of content (e.g., Jazz, Rock, Country); critical for music information retrieval.
Microsoft required 30 musicologists over one year to manually label a “few hundred thousand songs”.
Introduction – Music Genre Classification
Challenge: music is an evolving art.
Past works trained classifiers with low-level features from audio signals: timbral texture, rhythmic content, melodic and harmonic content.
Tags of music tracks provide high-level features.
Is utilizing tags trivial? Tags may be useful information or noise.
Problem Description
A set of music tracks X = {x1, x2, …, xn}
A set of genre classes C = {c1, c2, …, ck}
Classification: assign each track xi a label C(xi) ∈ C
Γ(xi) = audio signal features of xi
T(xi) = the set of tags of xi
Graph of Tracks
Adjacent nodes are tracks that are semantically similar in terms of tags.
Goal: use the tag information indirectly, because of the data sparsity problem.
Sim(xi, xj): cosine similarity over TF-IDF-weighted tag vectors
xi and xj are adjacent if Sim(xi, xj) > the threshold ε
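The graph construction described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `log(n / df)` form of IDF, and the toy tag data are all assumptions; only the cosine similarity, the TF-IDF weighting, and the threshold ε come from the slide.

```python
import math
from collections import Counter


def tfidf_vectors(track_tags):
    """Map each track to a sparse TF-IDF vector over its tags.

    track_tags: dict of track id -> list of tags (repeats count as term frequency).
    The IDF variant log(n / df) is an assumption; the slides do not specify one.
    """
    n = len(track_tags)
    df = Counter()
    for tags in track_tags.values():
        df.update(set(tags))  # document frequency: number of tracks carrying the tag
    vectors = {}
    for track, tags in track_tags.items():
        tf = Counter(tags)
        vectors[track] = {t: c * math.log(n / df[t]) for t, c in tf.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(track_tags, eps=0.2):
    """Connect two tracks when their tag similarity exceeds the threshold eps."""
    vectors = tfidf_vectors(track_tags)
    tracks = sorted(vectors)
    edges = {t: set() for t in tracks}
    for i, a in enumerate(tracks):
        for b in tracks[i + 1:]:
            if cosine(vectors[a], vectors[b]) > eps:
                edges[a].add(b)
                edges[b].add(a)
    return edges


# Hypothetical toy data: tracks "a" and "b" share tags, "c" shares none.
toy_tags = {
    "a": ["jazz", "sax"],
    "b": ["jazz", "sax", "smooth"],
    "c": ["metal", "loud"],
}
toy_graph = build_graph(toy_tags, eps=0.2)
```

The pairwise loop is quadratic in the number of tracks, which is acceptable for a collection of a few thousand tracks but would need an inverted index over tags for larger ones.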
Single-layer Classification
Assuming the audio content of a track has no direct coupling with its neighbors’ genres:
Double-layer Classification
Idea: learn also from the unknown tracks whose genre labels need to be predicted.
The relaxation labeling technique is adopted.
Δk = all of the known information: the audio content of all tracks and the genre labels of the known tracks
Find the class ci for xi that maximizes Pr(ci|Δk)
Framework of Double-layer Classification
Naïve Bayes classifier using audio content information
Iterative process
Nu(xi) = the set of unknown neighbors of xi
Nk(xi) = the set of known neighbors of xi
base classifier
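The iterative process can be illustrated with a generic relaxation labeling loop. The update rule below (a weighted mix of the base classifier's estimate and the average belief of the neighbors) is an assumption chosen for simplicity, not the formula from the slides; `relaxation_labeling`, `weight`, and `n_iter` are hypothetical names. Known neighbors contribute their fixed labels and unknown neighbors contribute their current estimates, which is how the unknown tracks inform each other.

```python
def relaxation_labeling(base_probs, known_labels, neighbors, classes,
                        n_iter=10, weight=0.5):
    """Generic relaxation labeling sketch (the update rule is an assumption).

    base_probs:   dict of unknown track -> {class: Pr(class | audio)} from a base classifier
    known_labels: dict of known track -> class
    neighbors:    dict of track -> set of adjacent tracks in the tag graph
    """
    # Known tracks hold a fixed one-hot belief; unknown tracks start from the base classifier.
    beliefs = {}
    for x in neighbors:
        if x in known_labels:
            beliefs[x] = {c: 1.0 if c == known_labels[x] else 0.0 for c in classes}
        else:
            beliefs[x] = dict(base_probs[x])

    unknown = [x for x in neighbors if x not in known_labels]
    for _ in range(n_iter):
        updated = {}
        for x in unknown:
            nbrs = neighbors[x]
            if not nbrs:
                # No graph evidence: fall back to the audio-only estimate.
                updated[x] = dict(base_probs[x])
                continue
            scores = {}
            for c in classes:
                nbr_avg = sum(beliefs[y][c] for y in nbrs) / len(nbrs)
                scores[c] = (1 - weight) * base_probs[x][c] + weight * nbr_avg
            total = sum(scores.values()) or 1.0
            updated[x] = {c: s / total for c, s in scores.items()}
        beliefs.update(updated)  # synchronous update of all unknown tracks

    return {x: max(classes, key=lambda c: beliefs[x][c]) for x in unknown}


# Hypothetical example: audio alone favors "rock" for u1, but both known neighbors are "jazz".
demo_classes = ["jazz", "rock"]
demo_neighbors = {"k1": {"u1"}, "k2": {"u1"}, "u1": {"k1", "k2"}, "u2": set()}
demo_known = {"k1": "jazz", "k2": "jazz"}
demo_base = {"u1": {"jazz": 0.4, "rock": 0.6}, "u2": {"jazz": 0.1, "rock": 0.9}}
demo_result = relaxation_labeling(demo_base, demo_known, demo_neighbors, demo_classes)
```

In this toy run the neighbor evidence overrides the audio-only guess for `u1`, while the isolated track `u2` keeps the base classifier's prediction.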
Experiment Data
MP3 files were crawled from Last.fm.
Ground-truth genre data were collected from All Music Guide.
2,262 tracks remain, in 6 genres.
Each track has at least 1 and at most 99 tags; 29.9 tags on average.
Baseline Performance
Performance of Single-layer Classification
The similarity threshold ε is set to 0.2
Performance of Double-layer Classification
Misclassification Analysis
The performance is limited when using a smaller set of training data
Misclassification usually occurs among Rock, R&B, and Rap.
Reason: many cross-class edges between tracks of the three genres, caused by the noise problem of tag data.
Optimizing Strategies
Tag discrimination
Tag augmentation
Content combination
Tag Discrimination
Idea: assign a higher weight to a tag with a lower class entropy: TF-IDF(tj, xi) ← TF-IDF(tj, xi) / EC(tj)
The similarity values decrease, so ε is set to 0.05.
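The reweighting can be sketched as below. The slide gives only the division by EC(tj); how EC is estimated is not shown, so this sketch assumes it is the entropy of the genre distribution among labeled tracks carrying the tag. The `smooth` constant (added to avoid dividing by zero when a tag appears in a single genre) and the fallback for tags unseen in labeled tracks are also assumptions.

```python
import math
from collections import Counter, defaultdict


def class_entropy(track_tags, labels):
    """Entropy of the genre distribution among labeled tracks that carry each tag."""
    counts = defaultdict(Counter)
    for track, genre in labels.items():
        for tag in set(track_tags[track]):
            counts[tag][genre] += 1
    entropy = {}
    for tag, by_genre in counts.items():
        total = sum(by_genre.values())
        entropy[tag] = -sum((n / total) * math.log(n / total)
                            for n in by_genre.values())
    return entropy


def discriminate(tfidf_vec, entropy, smooth=0.1):
    """Divide each TF-IDF weight by the tag's class entropy.

    smooth is an assumed constant that keeps the division defined when the
    entropy is zero. Tags unseen in labeled tracks get the maximum observed
    entropy, i.e. they are treated as least discriminative (also an assumption).
    """
    default = max(entropy.values(), default=0.0)
    return {t: w / (entropy.get(t, default) + smooth)
            for t, w in tfidf_vec.items()}


# Hypothetical example: "bebop" occurs in one genre only, "favorite" in two.
demo_tags = {"a": ["bebop", "favorite"], "b": ["favorite"]}
demo_labels = {"a": "jazz", "b": "rock"}
demo_entropy = class_entropy(demo_tags, demo_labels)
demo_weights = discriminate({"bebop": 1.0, "favorite": 1.0}, demo_entropy)
```

With equal starting weights, the genre-specific tag `bebop` ends up weighted well above the generic tag `favorite`.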
Performance of Tag Discrimination
Tag Augmentation
Idea: increase the number of in-class edges.
For each known track, its original tag vector is augmented by adding the tags of its neighbors.
Similarity between two tracks after augmentation:
Performance of Tag Augmentation
α = 0.6, ε = 0.2
Content Combination
Idea: augment features with other information sources.
SC(xi, xj) = content-based similarity between xi and xj
Overall similarity
Performance of Content Combination
β = 0.6, ε = 0.5
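The overall similarity formula is not preserved in this transcript. One common way to combine two similarity sources is a convex combination, sketched below purely as an assumption: the function names are hypothetical, and which term β multiplies on the original slide is not known. The defaults reuse the β and ε values reported above.

```python
def overall_similarity(tag_sim, content_sim, beta=0.6):
    """Assumed convex combination of tag-based and content-based similarity."""
    return beta * tag_sim + (1.0 - beta) * content_sim


def is_adjacent(tag_sim, content_sim, beta=0.6, eps=0.5):
    """Two tracks are linked when the combined similarity exceeds the threshold eps."""
    return overall_similarity(tag_sim, content_sim, beta) > eps


# Hypothetical similarity values for two track pairs.
strong_pair = is_adjacent(tag_sim=0.9, content_sim=0.4)
weak_pair = is_adjacent(tag_sim=0.2, content_sim=0.4)
```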
Conclusions
While most existing approaches to automatic music genre classification focus on finding better low-level features, here we explore the use of social tags for this task.
Tag information is used to construct a graph of tracks.
Two classification methods are introduced; the double-layer classifier performs better.
Several feature-processing strategies are considered to improve the performance.