Cross-Cultural Mood Regression for Music Digital Libraries

Xiao Hu, Yi-Hsuan Yang

Faculty of Education, The University of Hong Kong

Center for IT Innovation, Academia Sinica

• Mood is a popular access point in music digital libraries

• Music mood can be represented by a dimensional model

• Arousal-valence values can be predicted from music audio using linear regression

• Good cross-dataset performances between CH496 and MER60:
  • music in different cultures but annotated by listeners in the same cultural group
• Good cross-dataset performances between MER60 and DEAP120:
  • music in the same culture but annotated by listeners in different cultural groups
• Poor cross-dataset performances between CH496 and DEAP120:
  • music in different cultures and annotated by listeners in different cultural groups
• Good performance within DEAP120 suggests a consistent effect of visual and audio channels on valence perception

• Cross-cultural generalizability of audio-based mood prediction is supported for both arousal and valence predictions when the annotators are from the same cultural background

• For valence prediction, either the music or the annotators need to be from the same culture

Audio spectrum

• Music mood is influenced by the cultural backgrounds of
  • Music
  • Listeners

• Can prediction models built on music in one culture be applied to music in another culture?

[Diagram: experimental design]

Audio features studied for Western songs

• CH496: Chinese music annotated by Chinese listeners → automatic prediction by SVR
• MER60: Western music annotated by Chinese listeners → automatic prediction by SVR
• DEAP120: Western music annotated by Western listeners → automatic prediction by SVR

Compare the regression performance

• R² (squared correlation coefficient) measures the level of agreement between the predicted and annotated values

• Good cross-dataset performances between CH496 and MER60:
  • music in different cultures but annotated by listeners in the same cultural group
• Poor performances on DEAP120 may suggest an inconsistent effect of visual and audio channels on arousal perception
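The squared correlation coefficient mentioned above can be sketched in a few lines. This is a minimal illustration, assuming R² is taken as the square of the Pearson correlation between predicted and annotated values; the function name `squared_correlation` and the example numbers are hypothetical and not from the study.

```python
def squared_correlation(predicted, annotated):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    if n != len(annotated) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_a = sum(annotated) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, annotated))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in annotated)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_a)


# Made-up values for illustration only (not data from the slides).
example = squared_correlation([1, 2, 3, 4], [1, 3, 2, 4])
print(round(example, 2))  # 0.64
```

A value near 1 means the predictions track the annotations closely; a value near 0 means they carry little linear information about them.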

Arousal            CH496 [test]   MER60 [test]   DEAP120 [test]
CH496 [train]      0.80           0.73           0.42
MER60 [train]      0.77           0.77           0.47
DEAP120 [train]    0.67           0.70           0.44

Valence            CH496 [test]   MER60 [test]   DEAP120 [test]
CH496 [train]      0.25           0.15           0.08
MER60 [train]      0.26           0.11           0.22
DEAP120 [train]    0.14           0.22           0.21
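The train/test grids above follow a simple protocol: fit a regressor on one dataset, predict on every dataset, and score each pair with R². The sketch below illustrates only that loop. It makes several assumptions: the data are synthetic, the dataset names `set_A`, `set_B`, `set_C` are hypothetical stand-ins, and ordinary least squares on two made-up features replaces SVR to keep the example dependency-free. The diagonal cells here train and test on the same data, which a real study would avoid by holding out data.

```python
import random


def fit_least_squares(features, targets):
    """Fit y = w1*x1 + w2*x2 + b by ordinary least squares (two features)."""
    n = len(targets)
    m1 = sum(f[0] for f in features) / n
    m2 = sum(f[1] for f in features) / n
    my = sum(targets) / n
    s11 = sum((f[0] - m1) ** 2 for f in features)
    s22 = sum((f[1] - m2) ** 2 for f in features)
    s12 = sum((f[0] - m1) * (f[1] - m2) for f in features)
    s1y = sum((f[0] - m1) * (y - my) for f, y in zip(features, targets))
    s2y = sum((f[1] - m2) * (y - my) for f, y in zip(features, targets))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("features are collinear")
    # Solve the 2x2 normal equations in closed form.
    w1 = (s22 * s1y - s12 * s2y) / det
    w2 = (s11 * s2y - s12 * s1y) / det
    return w1, w2, my - w1 * m1 - w2 * m2


def predict(model, features):
    w1, w2, b = model
    return [w1 * f[0] + w2 * f[1] + b for f in features]


def squared_correlation(predicted, annotated):
    """Squared Pearson correlation; returns 0.0 if either input is constant."""
    n = len(predicted)
    mp, ma = sum(predicted) / n, sum(annotated) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, annotated))
    vp = sum((p - mp) ** 2 for p in predicted)
    va = sum((a - ma) ** 2 for a in annotated)
    return 0.0 if vp == 0 or va == 0 else cov * cov / (vp * va)


def make_toy_dataset(n, weights, noise, seed):
    """Synthetic (features, targets) pair; NOT real music data."""
    rng = random.Random(seed)
    feats = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    targets = [weights[0] * f[0] + weights[1] * f[1] + rng.gauss(0, noise)
               for f in feats]
    return feats, targets


def cross_dataset_grid(datasets):
    """Train on each dataset, test on each dataset, return R² per pair."""
    grid = {}
    for train_name, (train_x, train_y) in datasets.items():
        model = fit_least_squares(train_x, train_y)
        for test_name, (test_x, test_y) in datasets.items():
            grid[(train_name, test_name)] = squared_correlation(
                predict(model, test_x), test_y)
    return grid


# Hypothetical stand-ins: set_C uses a different feature-to-label mapping.
datasets = {
    "set_A": make_toy_dataset(200, (1.0, 0.5), 0.3, seed=1),
    "set_B": make_toy_dataset(60, (0.9, 0.6), 0.3, seed=2),
    "set_C": make_toy_dataset(120, (-0.2, 1.0), 0.3, seed=3),
}
grid = cross_dataset_grid(datasets)
for (train_name, test_name), score in grid.items():
    print(f"{train_name} [train] -> {test_name} [test]: R2 = {score:.2f}")
```

Swapping the least-squares fit for an SVR and the toy features for real audio features would give the kind of grid reported in the tables, though the actual feature set and SVR settings are not given in the slides.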

• S. C. Koelstra et al., “DEAP: A database for emotion analysis; using physiological signals,” IEEE Trans. Affective Comput., 3(1), 2012.

• J. A. Russell, “A circumplex model of affect,” Journal of Personality and Social Psychology, 39(6), 1980.

• Y.-H. Yang and H. H. Chen, “Predicting the distribution of perceived emotions of a music signal for content retrieval,” IEEE Trans. Acoust. Speech Signal Process., 19(7), 2011.

[Figure: Circumplex model (Russell 1980)]

• Arousal: energy or neurophysiological stimulation level
• Valence: pleasantness; positive and negative affective states
