
A Musical Data Mining Primer

CS235 – Spring ’03
Dan Berger

dberger@cs.ucr.edu

Outline

- Motivation/Problem Overview
- Background
  - Types of Music
  - Digital Representations
  - Psychoacoustics
- Query (Content vs. Meta-Data)
- Categorization & Clustering
- Finding More
- Conclusion

Motivation

- More music is being stored digitally: PressPlay offers 300,000 tracks for download.

- As collections grow, organizing and searching manually become hard:
  - How to find the “right” music in a sea of possibilities?
  - How to find new artists given current preferences?
  - How to find a song you heard on the radio?

Problem Overview

- Music is a high-dimensional time series: 5 minutes @ CD quality is more than 13M samples! (The arithmetic is sketched below.)

- It seems logical to apply data mining and IR techniques to this form of information: query, clustering, prediction, etc.
- Application isn’t straightforward, for reasons we’ll discuss shortly.
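
To make that dimensionality concrete, here is the back-of-the-envelope arithmetic in Python (44.1 kHz is the CD sampling rate; stereo doubles the count):

```python
# Dimensionality of 5 minutes of CD-quality audio.
SAMPLE_RATE = 44_100        # CD quality: 44.1 kHz
SECONDS = 5 * 60            # five minutes

mono = SAMPLE_RATE * SECONDS     # samples in one channel
stereo = 2 * mono                # two channels

print(f"{mono:,} mono samples")      # 13,230,000 -> the ">13M" above
print(f"{stereo:,} stereo samples")  # 26,460,000
```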

Background: Types of Music

- Monophonic: one note sounds at a time.
- Homophonic: multiple notes sound – all starting (and ending) at the same instant.
- Polyphonic: no constraints on concurrency. The most general case – and the most difficult to handle. (A toy classifier is sketched below.)
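
As a minimal illustration of the three textures, here is a sketch that classifies a piece from a list of note events; the (start, end, pitch) tuple representation is a hypothetical choice for illustration, not something from the slides:

```python
def classify_texture(notes):
    """notes: list of (start, end, pitch) tuples, times in seconds."""
    notes = sorted(notes)
    overlap = False
    for (s1, e1, _), (s2, e2, _) in zip(notes, notes[1:]):
        if s2 < e1:                      # successive notes overlap in time
            overlap = True
            if (s1, e1) != (s2, e2):     # overlap without a shared start/end
                return "polyphonic"
    return "homophonic" if overlap else "monophonic"

print(classify_texture([(0, 1, 60), (1, 2, 62)]))               # monophonic
print(classify_texture([(0, 1, 60), (0, 1, 64), (0, 1, 67)]))   # homophonic (a chord)
print(classify_texture([(0, 2, 60), (1, 3, 64)]))               # polyphonic
```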

Background: Digital Representations

- Structured (symbolic):
  - MIDI – stores note duration & intensity; instructions for a synthesizer.
- Unstructured (sampled):
  - PCM – stores quantized periodic samples. Leverages the Nyquist/Shannon sampling theorem to faithfully capture the signal. (A PCM sketch follows this slide.)
  - MP3/Vorbis/AAC – discard “useless” information, reducing storage and fidelity. These use psychoacoustics.
- Some work exists on rediscovering musical structure from sampled audio.
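
A minimal sketch of the PCM idea, assuming NumPy: sample a pure 440 Hz tone at the CD rate and quantize to 16 bits. By Nyquist/Shannon, 44.1 kHz faithfully captures content below about 22.05 kHz:

```python
import numpy as np

RATE = 44_100                                     # samples per second (CD rate)
t = np.arange(RATE) / RATE                        # one second of sample times
signal = np.sin(2 * np.pi * 440 * t)              # pure A440 tone in [-1, 1]
pcm = np.round(signal * 32767).astype(np.int16)   # quantize to 16-bit samples

print(len(pcm))   # 44100 samples for one second of mono audio
```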

Background: Psychoacoustics

- Two main relevant results:
  - Limited, frequency-dependent resolution: we hear different frequencies differently; the sound spectrum breaks into “critical bands” (a Bark-scale sketch follows).
  - Auditory masking: we “miss” signals due to spectral and/or temporal “collision.” Loud sounds mask softer ones; two sounds of similar frequency get blended.
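
One commonly cited approximation of the critical-band scale is Zwicker’s Bark formula; the sketch below maps frequency in Hz to a Bark number (the audible range spans roughly 24 Barks), showing why nearby tones tend to blend:

```python
import math

def hz_to_bark(f):
    """Zwicker's approximation: frequency in Hz -> critical-band (Bark) number."""
    return 13.0 * math.atan(0.00076 * f) + 3.5 * math.atan((f / 7500.0) ** 2)

# 440 Hz and 460 Hz land in the same critical band and tend to blend/mask:
print(hz_to_bark(440), hz_to_bark(460))   # ~4.2 and ~4.4
print(hz_to_bark(8000))                   # ~21.3: coarse resolution up high
```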

Query – Content is King

- Current systems use textual meta-data to facilitate query: Song/Album Title, Artist, Genre*
- The goal is to query by the musical content – similarity:
  - ‘find songs “like” the current one’
  - ‘find songs “with” this musical phrase’

Result: Query By Humming

- A handful of research systems have been built that locate songs in a collection based on the user humming or singing a melodic portion of the song.
- They typically search over a collection of monophonic MIDI files (a contour-matching sketch follows).
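
A sketch of one classic contour approach behind such systems, with a hypothetical two-song collection: reduce both the hummed query and each stored melody to a Parsons code (U = up, D = down, R = repeat) and rank songs by edit distance between the codes.

```python
def parsons(pitches):
    """MIDI pitch sequence -> contour string, e.g. [60, 62, 62, 59] -> 'URD'."""
    return "".join("U" if b > a else "D" if b < a else "R"
                   for a, b in zip(pitches, pitches[1:]))

def edit_distance(s, t):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (a != b)))
        prev = cur
    return prev[-1]

# Hypothetical monophonic MIDI collection, ranked against a hummed query:
collection = {"song_a": [60, 62, 64, 62], "song_b": [60, 59, 57, 59]}
query = parsons([61, 63, 65, 63])   # sung off-key, but with the same contour
print(sorted(collection, key=lambda s: edit_distance(query, parsons(collection[s]))))
# ['song_a', 'song_b'] - contour matching forgives inaccurate pitch
```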

Content Based Query

- Recall: music is a time series with high dimensionality. We need robust dimensionality reduction (one simple reduction is sketched below).
- Not all parts of music are equally important.
- Feature extraction: remember the important features.
- Which features are important?
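
A minimal sketch of one such reduction, assuming NumPy: split the waveform into frames and keep a single summary number per frame (here RMS energy), turning millions of samples into a short vector. Which summaries to keep is exactly the open question above.

```python
import numpy as np

def frame_rms(signal, n_frames):
    """Reduce a long waveform to one RMS-energy value per frame."""
    frames = np.array_split(np.asarray(signal, dtype=float), n_frames)
    return np.array([np.sqrt(np.mean(f ** 2)) for f in frames])

samples = np.random.randn(13_230_000)   # stand-in for ~5 min of mono CD audio
features = frame_rms(samples, 512)      # ~13M dimensions -> 512
print(features.shape)                   # (512,)
```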

Similarity/Feature Extraction

- The current “hard problem” – there are ad-hoc solutions, but little supporting theory.
- Example features: tempo (bpm), volume, spectral qualities (one is sketched below), transitions, etc.
- Sound source: is it a piano? A trumpet?
- Singer recognition: who’s the vocalist?
- Collectively: “Machine Listening.” These are hard problems with some positive results.
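
One ad-hoc spectral feature, sketched with NumPy: the spectral centroid, a rough “brightness” measure computed as the magnitude-weighted mean frequency of a frame’s spectrum.

```python
import numpy as np

def spectral_centroid(frame, rate):
    """Magnitude-weighted mean frequency of one audio frame."""
    mags = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / rate)
    return np.sum(freqs * mags) / np.sum(mags)

rate = 44_100
t = np.arange(2048) / rate
print(spectral_centroid(np.sin(2 * np.pi * 440 * t), rate))    # near 440 Hz
print(spectral_centroid(np.sin(2 * np.pi * 4400 * t), rate))   # near 4400 Hz: "brighter"
```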

Compression Complexity

- Different compression schemes (MP3/Vorbis/AAC) use psychoacoustics differently. Different implementations of a scheme may, too!
- Feature extraction needs to be robust to these variations. This seems to be an open problem.

Categorization/Clustering

- Genre (rock/R&B/pop/jazz/blues/etc.) is manually assigned – and subjective.
- Work is being done on automatic classification and clustering (a k-means sketch follows this list).
- This relies on (and sometimes reinvents) the similarity-metric work described previously.
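
A sketch of the clustering side, assuming NumPy and song-level feature vectors like those above: plain k-means, whose results are only as good as the underlying similarity metric.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means: assign points to the nearest center, recompute centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

# Two well-separated synthetic "genres" in an 8-D feature space:
X = np.vstack([np.random.randn(20, 8) + 3, np.random.randn(20, 8) - 3])
print(kmeans(X, 2))   # the two groups come back as two clusters
```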

Browsing & Visualization

- LOUD: physical exploration.
- Islands of Music: uses self-organizing maps to visualize clusters of similar songs (a toy SOM sketch follows).
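
A toy sketch in the spirit of Islands of Music (not its actual implementation), assuming NumPy: a small self-organizing map pulls a grid of prototype vectors toward the songs, so similar feature vectors land at nearby grid cells.

```python
import numpy as np

def train_som(X, grid=(8, 8), iters=2000, lr=0.5, sigma=2.0, seed=0):
    """Fit a grid of prototype vectors to the rows of X."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(grid + (X.shape[1],))     # one prototype per cell
    coords = np.stack(np.meshgrid(*map(np.arange, grid), indexing="ij"), -1)
    for t in range(iters):
        x = X[rng.integers(len(X))]                   # pick a random song vector
        bmu = np.unravel_index(np.argmin(((w - x) ** 2).sum(-1)), grid)
        d2 = ((coords - np.array(bmu)) ** 2).sum(-1)  # grid distance to the winner
        h = lr * (1 - t / iters) * np.exp(-d2 / (2 * sigma ** 2))
        w += h[..., None] * (x - w)                   # pull the neighborhood toward x
    return w

X = np.random.randn(100, 8)   # 100 songs, 8 features each
w = train_som(X)              # map each song to its closest cell to visualize
```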

Current Efforts

- Amazon/iTunes/etc. use collaborative filtering (sketched below). If the population is myopic and predictable, it works well; otherwise not.
- Hit Song Science – clusters a provided set of songs against a database of top 30 hits to predict success. Claims to have predicted the success of Norah Jones.
- Relatable – musical “fingerprint” technology, involved with “Napster 2.”
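
A minimal sketch of the collaborative-filtering idea, with hypothetical ratings: recommend the unheard songs of the user whose rating vector points the same way as yours (cosine similarity). When tastes are “myopic and predictable,” the nearest peer is informative; when they aren’t, it isn’t.

```python
import numpy as np

ratings = np.array([       # rows = users, columns = songs; 0 = unrated
    [5, 4, 0, 1],          # "me": song 2 unrated
    [4, 5, 5, 1],          # similar taste, loved song 2
    [1, 0, 2, 5],          # different taste
])

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

me = ratings[0]
peer = max(range(1, len(ratings)), key=lambda i: cosine(me, ratings[i]))
recs = [j for j in range(len(me)) if me[j] == 0 and ratings[peer][j] >= 4]
print(peer, recs)   # peer 1, recommend song 2
```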

Finding More

- Conferences:
  - Int. Symposium on Music IR (ISMIR)
  - Int. Conference on Music and AI (ICMAI)
  - Joint Conference on Digital Libraries
- Journals:
  - ACM/IEEE Multimedia
- Groups:
  - MIT Media Lab: Machine Listening Group

Conclusion

- Slow, steady progress is being made.
- “Music appreciation” is fuzzy: we can’t define it, but we know it when we hear it.
- References, and more detail, are in my survey paper, available shortly on the web: http://www.cs.ucr.edu/~dberger

Fini

Questions?
