convenient mir systems

convenient MIR systems vision vs. reality check, research & e-commerce Stephan Baumann


Post on 30-Jan-2016


Page 1: convenient MIR systems

convenient MIR systems
vision vs. reality check, research & e-commerce
Stephan Baumann

Page 2: convenient MIR systems

Agenda

• Personal Profile
• Convenient Music Information Retrieval
  – Multi-modal queries
  – Identification by description
  – Multi-facet music similarity
    • Timbre
    • Lyrics
    • Cultural aspects

• Project MPEER: P2P, semantic web and MIR

Page 3: convenient MIR systems

Research Diary (1991-2003)

• 1991/92 optical music recognition
• 1992/93 online handwriting recognition
• 1993/94 optical music recognition
• 1995/97 document analysis and understanding
• 1996 first look at multimedia IR (S. Pfeiffer)
• 1998/99 spin-off activities with Insiders GmbH
• 2000 freelancing/research for draft MIR system
• 2001 co-founding Sonicson GmbH
• 2001/03 subjective music similarity (Ph.D., Sep 03)

Page 4: convenient MIR systems

Desiderata MIR [Huron]

• 1. Access to all of the world's music
• 2. Access via an indexing method
• 3. Fair use (reimbursement to all contributors)
• 4. Open system
• 5. Self-correcting system
• 6. Protection of privacy and cultural practices

Page 5: convenient MIR systems

MIR Categorization [Futrelle]

Representation | Description | Research
Symbolic | Notation, Event-based recordings (MIDI), Hybrid representations | Matching, Theme/Melody Extraction, Voice Separation, Musical Analysis
Audio | Recordings, Streaming Audio, Instr. Libraries | Sound/Song Spotting, Transcription, Timbre/Genre Classification, Musical Analysis, Recommendation Systems
Visual | Scores | Score Reading (OMR)
Metadata | Cataloging, Bibliography, Descriptions | Library Testbeds, Traditional IR, Interoperability, Recommendation Systems

Page 6: convenient MIR systems

Related Work

• Audio:
  – [Blum, Wold], [Pfeiffer], [Foote], [Logan], ...
  – [Scheirer], [Tzanetakis], [Welsh], [Aucouturier], [Peeters], ...
• Cultural:
  – [Whitman], [Pachet], [Ellis, Berenzweig]
• Multi-modal MIR
  – [Bainbridge], ...
• Recommendation
  – [Amazon, Moodlogic, MusicGenome, MuBu, MongoMusic], ...
  – [Uitdenbogerd]
• User Models
  – [Chai, Vercoe], [Rolland]
• Music Psychology
  – [Bruhn, Rösing], [Gabrielsson, Västfjäll], ...
• Usability, Convenience
  – [Shneiderman], [Nielsen], ...

Page 7: convenient MIR systems

Convenience

• Using natural language as input for queries of non-musicians

• Accessing meta data, symbolic and audio layers in one interface

• Evaluation of usability (e.g. eye-tracking + user interviews)

• Acquisition of audio features, symbolic features, meta data and lyrics

• Machine communication by using shared music ontologies (MPEG-7, RDF/S, DAML-S)

Page 8: convenient MIR systems

Prototype

• Bilingual matching of phonetic ambiguities and misspellings
• Recognition of intention
• Treatment of refinements and negations
• Automatic generation of SQL queries on demand
• Intention-based result presentation
• Extraction of musical concepts from natural language queries

Page 9: convenient MIR systems

Software Development Lifecycle

• System Design Philosophy: Google-Style
• 1. Collection of user requirements V1
  – Offline
  – 20 Germans, different user segments
• 2. Setup of prototype V1
  – Online refinement of requirements V1 -> introduction of PhoneticMatch
• 3. Collection of user requirements V2
  – Online with prototype V1
  – 100 American native speakers, internet-aware users
• 4. Setup of prototype V2
  – Bilingual phonetic match
  – NLP frontend
  – Audio-based music similarity
• 5. Scaling of phonetic match component for commercial website

Page 10: convenient MIR systems

Convenience

www.musicline.de

´s no.1 hit

status que -> Status Quo

golgen earing -> Golden Earring

Fisher Set -> Fischer Z

Novospaski Chor -> Novo Spassky Chor

four none blondes -> 4 Non Blondes

Matchbox twenty -> Matchbox 20

Statistics:
• 540,000 queries/month
• 400,000 queries for artists/month
• 80,000 fuzzy queries for artists/month
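The corrections listed on this slide (for example "status que" -> "Status Quo") can be approximated with a small sketch. The slide's matcher is phonetic and bilingual; the code below swaps in plain character-based similarity from Python's standard `difflib` module, so it is an illustration of fuzzy artist lookup in general and not the deployed component. The names `normalize`, `fuzzy_artist_lookup`, `NUMBER_WORDS`, and the cutoff value are assumptions made for this example.

```python
import difflib
import re

# Hypothetical, deliberately tiny mapping of number words to digits.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "twenty": "20"}


def normalize(name):
    """Lowercase, drop punctuation, and replace known number words by digits."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(NUMBER_WORDS.get(token, token) for token in tokens)


def fuzzy_artist_lookup(query, catalog, cutoff=0.7):
    """Return the catalog artist closest to the query, or None if nothing is close.

    Uses difflib's character-based ratio, not a phonetic code.
    """
    by_key = {normalize(artist): artist for artist in catalog}
    hits = difflib.get_close_matches(normalize(query), list(by_key), n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None


catalog = ["Status Quo", "Golden Earring", "Fischer Z", "4 Non Blondes", "Matchbox 20"]
for query in ["status que", "golgen earing", "four none blondes", "Matchbox twenty"]:
    print(query, "->", fuzzy_artist_lookup(query, catalog))
```

A purely character-based measure handles misspellings, but it does not model pronunciation across languages, which is the part the slide attributes to the phonetic match component.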

Page 11: convenient MIR systems

Usability Evaluation: helping text

Page 12: convenient MIR systems
Page 13: convenient MIR systems

Multi-facet Music Similarity

• Audio: MFCCs
• Lyrics: TFIDF
• Cultural:
  – Webcrawling
  – POS
  – TFIDF

Page 14: convenient MIR systems

Song Similarity: Audio-based Perception

• Feature Extraction
  – Input segment [30..60] sec
  – 30 ms Hanning windows, log spectrum, mel scale, inverse Fourier transform
  – 1000 vectors using the first 13 MFCCs
• Representation
  – Intra-song clustering -> song signature [Logan]
  – (Gaussian Mixture Models [Aucouturier])
• Similarity Measure
  – Euclidean distance [Foote]
  – Kullback-Leibler distance [Logan, Aucouturier]
  – (Approximate solutions: sampling [Ellis, Aucouturier])
  – DistMinMean [Ellis]
  – Earth Mover's Distance (EMD) [Logan]
• Different Features & Similarity Measures
  – [Welsh] Tonal histograms, tonal transition, volume, tempo, noise -> Euclidean distance
  – [Rauber & Frühwirth] Psychoacoustic features -> hierarchical SOM
  – [Pfeiffer] A review of MP3-native features
  – ...
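As a rough illustration of the Kullback-Leibler distance named above, the sketch below summarizes each song by one diagonal Gaussian over its feature vectors and compares two songs with a symmetrized KL divergence. This is a simplification chosen for brevity: the slide refers to clustered song signatures and Gaussian mixture models, for which no closed form exists (hence the sampling approximations it mentions). The function names and the variance floor are assumptions for this example, and the feature vectors are assumed to be MFCC frames computed elsewhere.

```python
import math


def gaussian_signature(frames, floor=1e-9):
    """Summarize feature vectors (e.g. 13 MFCCs per frame) by per-dimension mean and variance."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dims)]
    variances = [
        max(sum((frame[d] - means[d]) ** 2 for frame in frames) / n, floor)
        for d in range(dims)
    ]
    return means, variances


def kl_divergence(p, q):
    """Closed-form KL(p || q) for two diagonal Gaussians given as (means, variances)."""
    (means_p, vars_p), (means_q, vars_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(means_p, vars_p, means_q, vars_q)
    )


def symmetric_kl(p, q):
    """Symmetrized KL divergence; smaller values mean more similar signatures."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Toy example with 2-dimensional "features" instead of real MFCC frames.
song_a = gaussian_signature([[0.0, 1.0], [2.0, 3.0]])
song_b = gaussian_signature([[0.5, 1.0], [2.5, 3.0]])
print(symmetric_kl(song_a, song_b))
```

KL divergence is not symmetric, so summing both directions is one common way to obtain a usable distance-like score between two songs.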

Page 15: convenient MIR systems

Perception of similar Timbre in Songs: Evaluation?!?!

• Audio Database: 700 MP3s of mainstream music at full-length, 40 artists, 70 different genres

• Evaluation: no ground truth (GT) available! Only anecdotal evidence or genre/artist/volume GT

Page 16: convenient MIR systems

Lyrics: Vector Space Model (TFIDF)

• Representation of a Collection of Lyrics
  – # of terms k:
  – Song j:
  – Occurrence of term h in collection d(h):
  – Weight of term j in song i:
• Similarity metric
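The formulas on this slide did not survive in the transcript, so the following is only a sketch of a standard vector space model, not a reconstruction of the slide. It assumes the common weighting tf × log(N / df) and cosine similarity (the measure named later in the deck for the cultural facet). Function names and the toy lyrics are invented for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs maps a song name to its list of tokens; returns sparse TF-IDF vectors."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per song
    vectors = {}
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        vectors[name] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


lyrics = {
    "song_a": "dance tonight money dance".split(),
    "song_b": "come out tonight dance".split(),
    "song_c": "father son decision".split(),
}
vectors = tfidf_vectors(lyrics)
print(cosine(vectors["song_a"], vectors["song_b"]))
print(cosine(vectors["song_a"], vectors["song_c"]))
```

Songs that share no terms get a similarity of zero under this model, which is one plausible reading of a "Zero Hits" result for lyrics in a language that is rare in the collection.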

Page 17: convenient MIR systems

Song Similarity: Lyrics (1)

Reference Song 112: Lucy pearl - Dance tonight.txt
Most-relevant terms: toast spend tonight dance money
1. Similar Song: Lucy Pearl - you (feat. snoop dogg and Q-tipp).txt
2. Similar Song: Phil Collins - Please Come Out Tonight.txt
3. Similar Song: Madonna - Into the groove.

Reference Song 56: Das Kind Vor Dem Euch.txt - die fantastischen vier
Most-relevant terms: wollten euch sehn entsetzt selben
1. Similar Song: Die fantastischen Vier - Auf Der Flucht.txt
2. Similar Song: Freundeskreis - Mit Dir.txt
3. Similar Song: Die fantastischen Vier - Populär

Reference Song 145: madonna - Paradise.txt
Most-relevant terms: remains pas encore fois moi
Zero Hits

Page 18: convenient MIR systems

Song Similarity: Lyrics (2)

Reference Song 193: Phil Collins - One More Night.txt
Most-relevant terms: forever wait night cos ooh
1. Similar Lyrics: Phil Collins - YOU CAN'T HURRY LOVE.txt
2. Similar Lyrics: Phil Collins - Inside Out.txt
3. Similar Lyrics: Phil Collins - This must be Love.txt

Reference Song 297: Cat Stevens - Father And Son.txt
Most-relevant terms: fault decision marry son settle
1. Similar Lyrics: Phil Collins - We're Sons Of Our Fathers.txt
2. Similar Lyrics: Sheryl Crow - No One Said It Would Be Easy.txt
3. Similar Lyrics: George Michael - Father Figure.txt

Page 19: convenient MIR systems

Artist Similarity: Cultural Aspects

Page 20: convenient MIR systems

Web Crawling+PartOfSpeech+TFIDF

adj Terms | TFIDF | Phrases | TFIDF
daft | 0.20463 | techno music | 0.86982
new | 0.14242 | old school | 0.80009
french | 0.12907 | great techno buzz | 0.40004
different | 0.09314 | overall groove | 0.40004
digital | 0.08607 | electronic artists | 0.40004
vocal | 0.07558 | new wave | 0.40004
cool | 0.07339 | usual drum n bass | 0.40004
electronic | 0.06887 | only band | 0.36956
funky | 0.06497 | big thing prodigy | 0.36956
underground | 0.06497 | good beat | 0.34793

Page 21: convenient MIR systems

Visual Evaluation: Similarity (Cosine)

[Figure: cosine similarity matrix; axes grouped by genre (Heavy Metal, Rock, Pop, Soul, Dance); color scale from low to high]

Page 22: convenient MIR systems

Recall/Precision against P2P, AMG data


Page 23: convenient MIR systems

[Diagram: open questions around similarity, clustering, and classification]
• Tasks: Similarity, Clustering, Classification
• Learning? Supervised, Unsupervised
• Evaluation?! [Downie, Uitdenbogerd]
• Ground truth: MusicSeer? AMG = experts, P2P = collaborative, experiment?
• Web sources; Part of Speech + term weighting; Vector Space Model
• Cosine vs. learning; WEKA Suite?
• Relevance feedback (Rocchio): subjective, context-dependent, "personal taste"
• Listening mode
• Personal classifier

Page 24: convenient MIR systems

Psychological Factors >>Musical Taste

• Personality >> preferred Styles, Genres
  – Stability
  – Introversion / Extraversion
  – Aggressive / Passive
• Socio-economics >> preferred Styles, Genres
• Demographic >> similar users in CF approaches >> recommendations
  – Gender
  – Age
• Situation
  – Mood >> tempo, tonality, beatness, pitch height
  – Listening Mode [Huron]

Page 25: convenient MIR systems

User Model [Chai,Vercoe]

<user>
  <generalbackground>
    <name>John White</name>
    <education>MS</education>
    <citizenship>US</citizenship>
    <birthdate>9/7/1974</birthdate>
    <sex>male</sex>
    <occupation>student</occupation>
  </generalbackground>
  <musicbackground>
    <education>none</education>
    <instrument>piano</instrument>
    <instrument>vocal</instrument>
  </musicbackground>
  <generalpreferences>
    <color>blue</color>
    <animal>dog</animal>
  </generalpreferences>
  <musicpreferences>
    <genre>classical</genre>
    <genre>blues</genre>
    <genre>rock/pop</genre>
    <composer>Wolfgang Amadeus Mozart</composer>
    <artist>Beatles</artist>
    <sample>
      <title>Yesterday</title>
      <artist>Beatles</artist>
    </sample>
  </musicpreferences>
  <habit>
    <context>I’m happy
      <tempo>very fast</tempo>
      <genre>pop</genre>
    </context>
    <pfeature>romantic
      <tempo>very slow</tempo>
      <softness>very soft</softness>
      <title>*love*</title>
    </pfeature>
    <context>bedtime
      <pfeature>romantic</pfeature>
    </context>
  </habit>
</user>

Page 26: convenient MIR systems

Multi-facet Music Similarity and Adaptive User Model

• Hard-wired multi-facet similarity [Whitman]
• Weighting of audio vs. cultural description by slider usage [Aucouturier]
• Description Weight Vectors (DWV) [Rolland]
  – Original work for melodic similarity
  – DWV contains a weight for each description in the representation
  – Weight varies with user interaction
  – Explicit user feedback: re-ranking of the system's output
  – Implicit adaptation of weights
• Future Work
  – Apply DWV to multi-facet similarity (audio, lyrics, cultural)
  – Infer initial setting of weights according to psychological factors
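The future-work idea of applying per-description weights to the three facets can be sketched as a weighted sum whose weights shift with user feedback. The update rule below is an assumption made for illustration and is not taken from Rolland's DWV work; facet similarities are assumed to lie in [0, 1], and all names and the learning rate are hypothetical.

```python
def combined_similarity(facet_sims, weights):
    """Weighted average of per-facet similarities (audio, lyrics, cultural)."""
    total = sum(weights.values())
    return sum(weights[facet] * facet_sims[facet] for facet in weights) / total


def update_weights(weights, facet_sims, liked, rate=0.2):
    """Shift weight toward facets that agreed with the user's explicit feedback.

    A facet "agrees" when it rated a liked pair as similar, or a disliked
    pair as dissimilar. The result is renormalized to sum to 1.
    """
    adjusted = {}
    for facet, weight in weights.items():
        agreement = facet_sims[facet] if liked else 1.0 - facet_sims[facet]
        adjusted[facet] = weight * (1.0 + rate * (agreement - 0.5))
    total = sum(adjusted.values())
    return {facet: weight / total for facet, weight in adjusted.items()}


weights = {"audio": 1 / 3, "lyrics": 1 / 3, "cultural": 1 / 3}
sims = {"audio": 0.9, "lyrics": 0.1, "cultural": 0.5}
print(combined_similarity(sims, weights))
# The user confirms that the two songs are similar, so audio gains weight.
weights = update_weights(weights, sims, liked=True)
print(weights)
```

Initial weights could in principle be seeded from a user profile instead of starting equal, which is the second future-work item on the slide.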

Page 27: convenient MIR systems

Project MPEER

"In a world of spontaneously federating services, there is no point in having a proprietary service, there is no point in staying out of the directory, there is no point in using an XML protocol that no one understands, there is no point in basing it on a proprietary server, and there is no need to justify the obvious error in following that path."

- Simon Phipps, chief technology evangelist, Sun Microsystems, Inc., 2001

Page 28: convenient MIR systems

MPEER Objectives

“Bringing the web to its full potential” [Fensel, Bussler]

[Diagram, after Fensel and Bussler]
• Centralized / Static: WWW (URI, HTML, HTTP) -> Semantic Web (RDF, RDF(S), DAML, OIL)
• Distributed / Dynamic: Web Services (UDDI, WSDL, SOAP) -> Intelligent Web Services (WSFL -> WSMF, DAML-S)
• Label: Formal Semantic

• Relate MIR to the Semantic Web activities (W3C)

• Create (composite) Semantic Web Services for MIR

• Explore the P2P computing paradigm (shared resources)

Page 29: convenient MIR systems

MPEER Architecture

[Diagram: MPEER architecture]
• User
• Audio (MP3) + Meta Data (XML / RDF): Title, Artist, Volume, Genre, bpm, Loud, Sound, Like, Dislike, SimilarTo (two instances)
• Audio (MP3) + Meta Data (XML / MPEG-7 / RDF-S): Title, Artist, Volume, Genre, bpm, Loudness, Timbre, Like, Dislike, SimilarTo
• WebService, e.g.
  – Ontologies, Taxonomies
  – CD-Retailers, EMD
  – MIR services
  – Audio ID
  – Thumbnails
  – ...
• P2P Client/Server (Jtella/JXTA)
• P2P Client GUI
• Basic Features, Descriptors: Tempo, Loudness, Timbre
• Classification, Music Similarity, Clustering
• Semantic Web Wrapper: "Title Artist Volume Genre Bpm Loud Sound Like Dislike SimilarTo ..."

Page 30: convenient MIR systems

MPEER: composite Webservice

• Service Type: "query service"

– Sub Type: Semantic web enabled

– Domain: Music

– Supported ontologies: {ontoson, musicbrainz.com, allmusicguide, ..}

• Port Types:

– Identification by audio, Similarity by audio, Retrieval by partial information

– Personalized recommendations, Playlist generation

– Music-Question Answering

• Operations/Messages of Port Type Identification by audio:

– IF_NOT_MP3(input)->Convert2MP3(input)->CalculateMetadata-> ...

• Composite, Distributed Services (possibly P2P, using users' local content and processing power):

– (1) MPeer.getEverythingFrom(Prince)

– (2) WebServiceRepository.discover&select(SpecialArtistService)

– (3) SpecialArtistService=AllMusicGuide.detailedInfo

– (4) NegotiateContract(contract1,MPeer,AllMusicGuide)

– (5) Contract1.StartTransaction(MPeer,AllMusicGuide)

– (5.1) AllMusicGuide.detailedinfo(Prince)

– (5.2) ...
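Steps (1) to (5.1) above can be mimicked with in-memory stand-ins to show the control flow of a composite service. Everything below is hypothetical: the classes, the trivial negotiation, and the fake AllMusicGuide operation are placeholders that only mirror the step names on the slide. No real web service, registry, or P2P protocol is involved.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Service:
    name: str                         # e.g. "AllMusicGuide"
    service_type: str                 # e.g. "SpecialArtistService"
    operation: Callable[[str], dict]  # stand-in for a remote call


class WebServiceRepository:
    """Toy registry standing in for a real service directory."""

    def __init__(self) -> None:
        self._services: List[Service] = []

    def register(self, service: Service) -> None:
        self._services.append(service)

    def discover_and_select(self, service_type: str) -> Optional[Service]:
        # Step (2): return the first registered service of the requested type.
        for service in self._services:
            if service.service_type == service_type:
                return service
        return None


@dataclass
class Contract:
    requester: str
    provider: Service

    def start_transaction(self, artist: str) -> dict:
        # Steps (5) and (5.1): invoke the provider's operation.
        return self.provider.operation(artist)


def negotiate_contract(requester: str, provider: Service) -> Contract:
    # Step (4): negotiation is a stub that always succeeds.
    return Contract(requester, provider)


def get_everything_from(artist: str, repository: WebServiceRepository) -> dict:
    # Step (1): entry point on the MPeer side.
    provider = repository.discover_and_select("SpecialArtistService")  # steps (2), (3)
    if provider is None:
        return {}
    contract = negotiate_contract("MPeer", provider)
    return contract.start_transaction(artist)


def fake_detailed_info(artist: str) -> dict:
    """Placeholder for AllMusicGuide.detailedInfo; returns canned data."""
    return {"artist": artist, "source": "AllMusicGuide"}


repository = WebServiceRepository()
repository.register(Service("AllMusicGuide", "SpecialArtistService", fake_detailed_info))
print(get_everything_from("Prince", repository))
```

In a real deployment, discovery, contract negotiation, and invocation would each be remote interactions; the sketch only fixes their order.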

Page 31: convenient MIR systems

Prototypical P2P Client

Page 32: convenient MIR systems

OpenSource Tools: Ontology Editor

Page 33: convenient MIR systems

OpenSource Tools: DataMining, ML

Page 34: convenient MIR systems

Conclusion

• The Web offers potential beyond symbolic or audio-based MIR because it reflects cultural issues

• User-centric MIR systems may benefit from user models and situation-driven adaptation

• The field is too large to be handled by individual institutes

• Composite web services offer a way to collaborate on the topic and may make holistic, high-quality MIR systems possible