musi-6201 | computational music analysis · 11/9/2015  · musi-6201 | computational music analysis...

Post on 20-Aug-2020

overview intro similarity mood summary

MUSI-6201 | Computational Music Analysis
Part 9.2: Music Similarity and Mood Recognition

alexander lerch

November 11, 2015

temporal analysis | overview

text book
  Chapter 8: Musical Genre, Similarity, and Mood (pp. 156–162)

sources: slides (latex) & Matlab
  github repository

lecture content
  music similarity
  k-means clustering and SOMs
  mood and emotion
  linear regression

similarity and mood | introduction

commonalities with genre classification
  similar set of features
  ambiguous 'ground truth'
  unclear value/impact of low level and high level features

differences from genre classification
  mood: often a regression problem rather than a classification problem
  similarity: distance measure instead of categorizing into classes

audio similarity | introduction

perception of music similarity
  multi-dimensional (melodic, rhythmic, sound quality, ...)
  user dependent
  associative, may also depend on editorial data
  may be context dependent

genres are clusters of musical similarity
  ⇒ genre classification is a special case of audio similarity measures

instead of assigning (genre) labels, the similarity/distance between pairs of files is measured
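A minimal sketch of measuring similarity as a pairwise distance rather than as a label. The file names and feature values are made up for illustration, and the Euclidean distance is only one possible choice of measure.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(features):
    """Symmetric matrix of pairwise distances between all feature vectors."""
    n = len(features)
    return [[euclidean(features[i], features[j]) for j in range(n)]
            for i in range(n)]

def most_similar(query, dist):
    """Index of the file closest to `query`, excluding the query itself."""
    candidates = [j for j in range(len(dist)) if j != query]
    return min(candidates, key=lambda j: dist[query][j])

# hypothetical per-file feature vectors (e.g. averaged, normalized features)
names = ["song_a", "song_b", "song_c"]
features = [[0.0, 0.0], [3.0, 4.0], [0.5, 0.0]]

dist = distance_matrix(features)
print("closest to", names[0], "is", names[most_similar(0, dist)])
```

No class labels appear anywhere; a query simply returns the file with the smallest distance.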

audio similarity | K-Means clustering example

simple k-means example
  goal: minimize intra-cluster variance
  distance: Euclidean
  procedure:

1 initialization: randomly select K points in the feature space as initial centroids.

2 assignment: assign each observation to the cluster with the closest mean/centroid.

3 update: compute the mean/centroid for each cluster.

4 iteration: go to step 2 until the clusters converge.
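The four steps above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the course's Matlab code: the toy data, the function names, the choice to draw the initial centroids from the observations, and the convergence test (assignments stop changing) are all assumptions.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance (same ordering as the Euclidean distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def kmeans(data, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # 1 initialization: K observations are drawn as starting centroids (assumption)
    centroids = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # 2 assignment: each observation joins the cluster with the closest centroid
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in data]
        # 4 iteration: stop once the assignment no longer changes
        if new_labels == labels:
            break
        labels = new_labels
        # 3 update: recompute each centroid as the mean of its members
        for c in range(k):
            members = [p for p, l in zip(data, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean(members)
    return labels, centroids

# toy 2D observations forming two well separated groups
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centroids = kmeans(data, k=2)
print(labels, centroids)
```

Which cluster receives index 0 depends on the random initialization; only the grouping is meaningful.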

audio similarity | K-Means clustering example

[figure sequence: data points; init and assignment; update centroids and assignment (repeated over three frames); convergence. matlab source: matlab/displayKMeans.m]

audio similarity | visualization in a 2D space

problem
  feature space is high-dimensional
  → cannot be visualized

find mapping to 2D “preserving” (high-dimensional) distance metrics

audio similarity | visualization example: SOM 1/2

1 create a map with 'neurons'

2 train
  for each training sample find BMU (best matching unit)
  adapt BMU and neighbors toward training sample

$$W_v(t+1) = W_v(t) + \theta(v,t)\,\alpha(t)\,\bigl(D(t) - W_v(t)\bigr)$$

  $\theta(v,t)$: depends on distance from BMU
  $\alpha(t)$: learning restraint
  $D(t)$: training sample

https://en.wikipedia.org/wiki/Self-organizing_map
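A small sketch of the training loop implied by the update rule above, on a rectangular grid of neurons. The slide only says that θ depends on the distance from the BMU and that α is a learning restraint; the Gaussian neighborhood, the linear decay of both α and the neighborhood radius, and all parameter values here are assumptions chosen for illustration.

```python
import math
import random

def find_bmu(weights, sample):
    """Grid position of the best matching unit (closest weight vector)."""
    return min(weights,
               key=lambda v: sum((w - s) ** 2 for w, s in zip(weights[v], sample)))

def train_som(data, rows, cols, n_iter=500, alpha0=0.5, sigma0=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # 1 create a map with 'neurons': one weight vector W_v per grid position v
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    # 2 train
    for t in range(n_iter):
        decay = 1.0 - t / n_iter
        alpha = alpha0 * decay             # alpha(t): decreasing (assumed linear)
        sigma = max(sigma0 * decay, 0.1)   # neighborhood radius, shrinking (assumed)
        sample = rng.choice(data)          # D(t): training sample
        bmu = find_bmu(weights, sample)
        for v, w in weights.items():
            grid_dist2 = (v[0] - bmu[0]) ** 2 + (v[1] - bmu[1]) ** 2
            theta = math.exp(-grid_dist2 / (2 * sigma ** 2))  # theta(v, t)
            for d in range(dim):
                # W_v(t+1) = W_v(t) + theta(v,t) * alpha(t) * (D(t) - W_v(t))
                w[d] += theta * alpha * (sample[d] - w[d])
    return weights

# toy 2D feature vectors; real feature spaces would be high-dimensional
data = [[0.1, 0.1], [0.15, 0.1], [0.9, 0.9], [0.85, 0.95]]
som = train_som(data, rows=3, cols=3)
for sample in data:
    print(sample, "->", find_bmu(som, sample))
```

After training, each file can be placed on the 2D map at the grid position of its BMU, which is what makes the map usable for visualization.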

audio similarity | SOM 2/2

(figure) from [1]

[1] E. Pampalk, “Islands of Music,” Diploma Thesis, Technische Universität Wien, 2001.

mood recognition | introduction

objective: identify mood/emotion of a song

terminology:
  Music Mood Recognition and Music Emotion Recognition are usually used synonymously

What is the difference between mood and emotion?

emotion:
  temporary, evanescent
  (directly) related to external stimuli

mood:
  longer term, stable
  diffuse affect state

mood recognition | challenges

ground truth data
  verbalization of emotions/moods usually misleading
  not easily quantifiable/categorizable
  change over time?

research focus
  can established basic emotions (happiness, anger, fear, ...) be used for music perception?
  aroused vs. transported moods?

mood recognition | models

classification into label clusters

Cluster 1    Cluster 2             Cluster 3    Cluster 4  Cluster 5
Rowdy        Amiable/Good Natured  Literate     Witty      Volatile
Rousing      Sweet                 Wistful      Humorous   Fiery
Confident    Fun                   Bittersweet  Whimsical  Visceral
Boisterous   Rollicking            Autumnal     Wry        Aggressive
Passionate   Cheerful              Brooding     Campy      Tense/Anxious
                                   Poignant     Quirky     Intense
                                                Silly

mood model, circumplex model most common [2]

[2] J. A. Russell, “A Circumplex Model of Affect,” Journal of Personality and Social Psychology, vol. 39, no. 6, pp. 1161–1178, 1980. ISSN: 1939-1315 (electronic), 0022-3514 (print). doi: 10.1037/h0077714.

mood recognition | mood model: regression modeling

mapping
  (N-dimensional) observation (feature) to 2-dimensional coordinate (valence/arousal)

training
  find model to minimize error between data points and “prediction”
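One way to picture the mapping and training steps is a linear model with two outputs, fitted by gradient descent on the mean squared error. This is a sketch under assumptions: the slides do not prescribe a model type or optimizer, and the features, annotations, learning rate, and epoch count below are invented toy values.

```python
def predict(weights, bias, x):
    """Map one N-dimensional feature vector to a [valence, arousal] pair."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def train(features, targets, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean squared error."""
    n, n_in, n_out = len(features), len(features[0]), len(targets[0])
    weights = [[0.0] * n_in for _ in range(n_out)]
    bias = [0.0] * n_out
    for _ in range(epochs):
        grad_w = [[0.0] * n_in for _ in range(n_out)]
        grad_b = [0.0] * n_out
        for x, y in zip(features, targets):
            pred = predict(weights, bias, x)
            for k in range(n_out):
                err = pred[k] - y[k]          # prediction minus annotation
                grad_b[k] += 2 * err / n
                for i in range(n_in):
                    grad_w[k][i] += 2 * err * x[i] / n
        for k in range(n_out):
            bias[k] -= lr * grad_b[k]
            for i in range(n_in):
                weights[k][i] -= lr * grad_w[k][i]
    return weights, bias

# hypothetical normalized features (N = 2) and annotated (valence, arousal) points
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [[-0.2, -0.4], [0.6, -0.1], [-0.6, 0.1], [0.2, 0.4]]

weights, bias = train(features, targets)
print(predict(weights, bias, [0.5, 0.5]))
```

The toy annotations were generated from an exactly linear relation, so the fit reproduces them; real annotations are noisy and the residual error would not vanish.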

linear regression | introduction to regression 1/2

fit a linear function to a series of points $(x_n, y_n)$

$$y_n = m \cdot x_n + b$$

https://en.wikipedia.org/wiki/Linear_regression

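linear regression | introduction to regression 2/2

minimize error between model and data (here: least squares)

$$e_n^2 = (y_n - m x_n - b)^2$$

$$E = \sum_n (y_n - m x_n - b)^2$$

$$
\begin{aligned}
\frac{\partial E}{\partial b} = \sum_n -2\,(y_n - m x_n - b) &= 0 \\
-2\sum_n y_n + 2\sum_n m x_n + 2\sum_n b &= 0 \\
\sum_n m x_n + \sum_n b &= \sum_n y_n \\
m\sum_n x_n + N b &= \sum_n y_n
\end{aligned}
$$

$$
\begin{aligned}
\frac{\partial E}{\partial m} = \sum_n -2\,x_n\,(y_n - m x_n - b) &= 0 \\
-2\sum_n x_n y_n + 2\sum_n m x_n^2 + 2\sum_n b x_n &= 0 \\
\sum_n m x_n^2 + \sum_n b x_n &= \sum_n x_n y_n \\
m\sum_n x_n^2 + b\sum_n x_n &= \sum_n x_n y_n
\end{aligned}
$$

$$m = \frac{N\sum_n x_n y_n - \sum_n x_n \sum_n y_n}{N\sum_n x_n^2 - \bigl(\sum_n x_n\bigr)^2}$$

$$b = \frac{\sum_n y_n}{N} - m\,\frac{\sum_n x_n}{N}$$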
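The closed-form results for m and b translate directly into code. A minimal sketch with made-up data points; the function name and the guard against a zero denominator are additions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for the model y = m * x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        # all x values identical: the slope is undefined
        raise ValueError("need at least two distinct x values")
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = sum_y / n - m * sum_x / n
    return m, b

# toy points lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b = fit_line(xs, ys)
print(m, b)
```

For these points the sums are N = 4, Σx = 6, Σy = 16, Σxy = 34 and Σx² = 14, so m = (136 − 96) / (56 − 36) = 2 and b = 4 − 2 · 1.5 = 1.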

mood recognition | range of results

5 mood clusters: 40–60% classification rate

mood model: 0.1–0.4 absolute prediction error (unit circle)

summary | lecture content

1 what are typical use cases for music similarity and mood recognition?

2 discuss advantages and disadvantages of using classes vs. the circumplex model for mood recognition
