
Emotionally-Controlled Music Synthesis

António Pedro Oliveira and Amílcar Cardoso
University of Coimbra, Portugal
12/12/2008


Outline

Introduction
Computational Model
Features Extraction
Regression Models
Conclusion


Introduction


Music is accepted as a language of emotional expression

To control this expression automatically, we are developing a computational model that establishes relations between emotions and musical features

Emotions are defined in 2 dimensions:
Valence: degree of happiness (from very sad to very happy music)
Arousal: degree of activation (from very relaxing to very activating music)
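As a minimal illustration of this representation (not part of the authors' system), an emotional target can be stored as a point in the valence-arousal plane; the class name and the 0-10 range below simply mirror the rating scale used in the experiments described later.

```python
from dataclasses import dataclass

@dataclass
class EmotionTarget:
    """A point in the valence-arousal plane (0 = lowest, 10 = highest)."""
    valence: float  # very sad (0) .. very happy (10)
    arousal: float  # very relaxing (0) .. very activating (10)

    def __post_init__(self):
        # Clamp both dimensions to the 0-10 rating scale.
        self.valence = min(max(self.valence, 0.0), 10.0)
        self.arousal = min(max(self.arousal, 0.0), 10.0)

# Example: a happy, energetic target.
target = EmotionTarget(valence=8.5, arousal=7.0)
```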


Computational Model – Features Extraction


Use a database of MIDI music labelled with symbolic and audio features

Computational Model – Regression models


Use a database of MIDI music labelled with symbolic and audio features

Modelling relations between emotions and music features with regression models

Computational Model


Use a database of MIDI music labelled with symbolic and audio features

Modelling relations between emotions and music features with regression models

Use these models to control the affective content of synthesized music
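A hedged sketch of the last step, controlling affective content with the fitted models: given regressors trained on the labelled music base, pick the candidate whose predicted (valence, arousal) is closest to a requested target. The RBF-kernel SVR, the random stand-in data, and the Euclidean-distance selection rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVR

def closest_to_target(candidates, valence_model, arousal_model, target):
    """Index of the candidate feature vector whose predicted (valence, arousal)
    lies closest to the requested emotional target."""
    pred = np.column_stack([valence_model.predict(candidates),
                            arousal_model.predict(candidates)])
    return int(np.argmin(np.linalg.norm(pred - np.asarray(target, dtype=float), axis=1)))

# Toy end-to-end usage with random stand-in data (real feature vectors and labels
# would come from the labelled music base and the listener ratings).
rng = np.random.default_rng(0)
X = rng.normal(size=(96, 12))
val, aro = rng.uniform(0, 10, 96), rng.uniform(0, 10, 96)
v_model = SVR(kernel="rbf").fit(X, val)
a_model = SVR(kernel="rbf").fit(X, aro)
print(closest_to_target(rng.normal(size=(5, 12)), v_model, a_model, target=(8.0, 7.0)))
```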

Computational Model – Experiments


96 MIDI pieces of film music, each lasting between 20 and 90 seconds

80 listeners labelled each affective dimension online with integer values between 0 and 10
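For illustration, a per-piece ground-truth label could be obtained by averaging the listeners' 0-10 ratings for each dimension; the slides do not state the exact aggregation, so the mean below (and the random stand-in data) is an assumption.

```python
import numpy as np

# ratings[i, j]: rating (0-10) given by listener j to piece i for one dimension
# (valence or arousal); shape (96, 80) matches this experiment.
ratings = np.random.randint(0, 11, size=(96, 80))  # stand-in for the real data
labels = ratings.mean(axis=1)                      # one target value per piece
print(labels.shape)  # (96,)
```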


Features Extraction


Build a music base of MIDI music labelled with symbolic and audio features

Features Extraction – Correlation between audio features and valence

Sharpness – ratio of high/bass frequencies
Loudness – total energy
Flatness – spectral distribution of energy
Dissonance – perceptive interference of sinusoids

Features Extraction – Correlation between audio features and arousal

Similarity – temporal spectral correlation of energy distribution by frequency bands
Dissonance – perceptive interference of sinusoids
Sharpness – ratio of high/bass frequencies
Energy – total energy
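As a hedged sketch of how such audio descriptors might be computed, the snippet below uses librosa (the slides do not name an extraction toolkit, so this is an assumption): RMS energy as a proxy for loudness/energy and spectral flatness for the flatness feature. Sharpness and sensory dissonance require psychoacoustic models not shown here.

```python
import librosa
import numpy as np

def basic_audio_features(path: str) -> dict:
    """Rough counterparts of two of the listed descriptors for one audio file."""
    y, sr = librosa.load(path, sr=22050, mono=True)
    rms = librosa.feature.rms(y=y)                  # frame-wise energy (loudness proxy)
    flat = librosa.feature.spectral_flatness(y=y)   # spectral distribution of energy
    return {
        "energy": float(np.mean(rms)),
        "flatness": float(np.mean(flat)),
    }
```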

Features Extraction – Correlation between audio and symbolic features

Bridge the gap between the audio and symbolic domains:
Spectral similarity vs. note duration and inter-onset interval
Spectral dissonance vs. prevalence of percussion instruments
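On the symbolic side, note durations, inter-onset intervals and the prevalence of percussion instruments can be read directly from the MIDI data. The sketch below uses pretty_midi, which is an assumption; the slides do not specify a toolkit.

```python
import numpy as np
import pretty_midi

def symbolic_features(path: str) -> dict:
    """Mean note duration, mean inter-onset interval and share of percussion notes."""
    midi = pretty_midi.PrettyMIDI(path)
    durations, onsets, drum_notes, total_notes = [], [], 0, 0
    for inst in midi.instruments:
        for note in inst.notes:
            durations.append(note.end - note.start)
            onsets.append(note.start)
            total_notes += 1
            drum_notes += int(inst.is_drum)
    onsets.sort()
    iois = np.diff(onsets) if len(onsets) > 1 else np.array([0.0])
    return {
        "mean_note_duration": float(np.mean(durations)) if durations else 0.0,
        "mean_ioi": float(np.mean(iois)),
        "percussion_share": drum_notes / total_notes if total_notes else 0.0,
    }
```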


Regression models


Establish weighted relations between emotions and musical features
Use non-linear regression models
Model with symbolic and audio features
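A minimal sketch of fitting such a non-linear model with scikit-learn, assuming a feature matrix that concatenates the symbolic and audio features of each piece; the RBF-kernel support vector regressor and the random stand-in data are illustrative assumptions, since the slides only say "non-linear regression models".

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# X: one row per piece, columns = symbolic + audio features; y: listener-derived label.
rng = np.random.default_rng(0)
X = rng.normal(size=(96, 12))        # stand-in for the real hybrid feature matrix
y = rng.uniform(0, 10, size=96)      # stand-in for valence (or arousal) labels

valence_model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
valence_model.fit(X, y)
print(valence_model.predict(X[:3]))
```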

Regression models – Correlation between models and valence

Best hybrid (use of audio and symbolic features) non-linear regression model – 84%
Best symbolic linear regression model – 75%
Best audio non-linear regression model – 61%
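The percentages above read as correlations between model predictions and listener ratings. One hedged way to obtain such a figure is shown below; the exact evaluation protocol (e.g. the cross-validation scheme) is not given in the slides, and the data here is synthetic just so the snippet runs end to end.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVR

# X, y as in the previous sketch: hybrid features and listener-derived labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(96, 12))
y = X[:, 0] * 2 + rng.normal(scale=0.5, size=96)   # synthetic stand-in labels

pred = cross_val_predict(SVR(kernel="rbf"), X, y, cv=5)
r, _ = pearsonr(y, pred)
print(f"correlation: {r:.0%}")
```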

Regression models – Best audio and symbolic features for valence

Regression models – Correlation between models and arousal

Best hybrid (use of audio and symbolic features) non-linear regression model – 90%
Best symbolic linear regression model – 84%
Best audio non-linear regression model – 75%

Regression models – Best audio and symbolic features for arousal


Conclusion


Hybrid non-linear regression models outperformed symbolic linear regression models

Non-linear models seem more appropriate than linear models

The use of features from audio and symbolic domains is more appropriate than the use of features from only one domain

Timbre/sound can be used to control/influence the emotional expression
