multi-modal music information retrieval - visualisation ...€¦ · visualisation and evaluation of...

22
introduction music information retrieval and ml multi-modality in mir conclusions Multi-Modal Music Information Retrieval - Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at Vienna University of Technology Institute for Software Technology and Interactive Systems Favoritenstraße 9-11, 1040, Vienna, Austria presented at RIAO’07 May 30, 2007 1/22

Upload: others

Post on 30-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

Multi-Modal Music Information Retrieval -Visualisation and Evaluation of Clusterings

by Both Audio and Lyrics

Robert Neumayer and Andreas Rauber{neumayer,rauber}@ifs.tuwien.ac.at

Vienna University of TechnologyInstitute for Software Technology and Interactive Systems

Favoritenstraße 9-11, 1040, Vienna, Austria

presented at RIAO’07May 30, 2007

1/22

Page 2: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

motivation for multi-modal analysis

generally increasing amount of digital audio

private userscommercial holdings

novel interfaces & music classification

clustering/maps: PlaySOM,PocketSOMPlayerclassification into categories: genres,emotions, situations,. . .

explore the influence of relevant information

christmas songslove songsspoken documents

2/22

Page 3: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

motivation for multi-modal analysis

generally increasing amount of digital audio

private userscommercial holdings

novel interfaces & music classification

clustering/maps: PlaySOM,PocketSOMPlayerclassification into categories: genres,emotions, situations,. . .

explore the influence of relevant information

christmas songslove songsspoken documents

3/22

Page 4: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

motivation for multi-modal analysis

generally increasing amount of digital audio

private userscommercial holdings

novel interfaces & music classification

clustering/maps: PlaySOM,PocketSOMPlayerclassification into categories: genres,emotions, situations,. . .

explore the influence of relevant information

christmas songslove songsspoken documents

4/22

Page 5: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

what to expect from this presentation

information retrieval

text irmusic ir

integration of both audio and text features

clustering

self-organising map (som)multi-modal clustering

user interface

multi-modal cluster evaluation

5/22

Page 6: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

audio features

computed from the audio waveform

abstract representation

can be computed for every piece of audio

a few feature sets

MPEG7 standard featuresMARSYAS featuresrhythm patterns (1440)rhythm histograms (60)statistical spectrum descriptors (168)

6/22

Page 7: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

text features

plain text lyrics retrieval

three lyrics portals are accessedmissing values issues (e.g. lyrics cannot be retrieved)

‘bag-of-words’ approach

stop word removal: yesstemming: no

tfidf weighting

still abstract, may yet be helpful

interpretabilitycontent words / semantic categories

high dimensionality → reduction needed

7/22

Page 8: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

self-organising map clustering

unsupervised neural network model

data mapping

from high-dimensional input spaceto low-dimensional output space

topology preservation

simplification and visualisation

8/22

Page 9: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

som training process

Figure: self-organising map training algorithm

9/22

Page 10: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

map based user interfaces

Figure: PlaySOM application10/22

Page 11: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

clustering music by lyrics

audio/lyrics collection (7500 songs / 54 genres)som of 20× 20 unitscomprises a range of styles and genres:

metal, r&b, indie , . . .

Figure: clustering of songs centred around the love topic

11/22

Page 12: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

clustering and multiple data sources

why not cluster according to each modality?

connect instances/songs on both maps

identify differences in the data distributions on the map acrossclusterings

12/22

Page 13: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

genre-wise distribution across mappings

audio clustering

Figure: clustering of Christmassongs on the 2D audio map

lyrics clustering

Figure: clustering of Christmassongs on the 2D lyrics map

13/22

Page 14: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

Figure: full view of the visualisation prototype

14/22

Page 15: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

Figure: full view of the visualisation prototype – the vertical map clusterssongs by audio features, the horizontal map is trained on lyrics features

15/22

Page 16: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

detailed view ofconnections

equally distributedartist ‘Kid Rock’

colour-codedconnections

Figure: Kid Rock’s songs16/22

Page 17: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

detailed view ofconnections

genre ‘ChristmasCarols’

colour-codedconnections

Figure: Christmas songs17/22

Page 18: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

distribution across mappings

audio clustering

Figure: clustering of Christmassongs on the 2D audio map

lyrics clustering

Figure: clustering of Christmassongs on the 2D lyrics map

18/22

Page 19: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

quantitative evaluation

select instances belonging to one artist/genre

compute spreading factors on each map considering individualclusterings

integrate these values for both maps

19/22

Page 20: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

quantitative evaluation by example

(a) Upperleft corner(audio)

(b) Di-agonally(lyrics)

(c) Non-Continuous(audio)

(d) Sub-clusters(lyrics)

average distance ratio

contiguity ratio

bonus for continuous clusters

adra,l cra,l adr · cra/b .48 .20 .06c/d .76 .73 .55

20/22

Page 21: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

genre/artist-wise distribution measures

Artist CR ADR ADR×CRSean Paul .4152 .4917 .2042

Good Riddance .8299 .7448 .6181Shakespeare .2626 .3029 .0795Kid Rock .9640 .9761 .9410

Genre CR ADR ADR×CRSpeech .8092 .3417 .2765

Christmas Carol .5800 .7779 .4512Reggae .9495 .8475 .8047Rock .9740 .9300 .9059

21/22

Page 22: Multi-Modal Music Information Retrieval - Visualisation ...€¦ · Visualisation and Evaluation of Clusterings by Both Audio and Lyrics Robert Neumayer and Andreas Rauber {neumayer,rauber}@ifs.tuwien.ac.at

introduction music information retrieval and ml multi-modality in mir conclusions

recap

multi-modal clustering

plus evaluation

possible usage

artist genre identificationadditional info for music information retrieval systemsquality metrics for cluster evaluation (focusing on musiccontext)

. . . and we’ll be hosting the ISMIR conference in September!

22/22