Music Information Retrieval
Student: Mike Jiang
Advisor: Dr. Zbigniew W. Ras
Facets of Music Information
◦ Pitch: fundamental frequency; melody
◦ Temporal: duration; rhythm
◦ Timbral: tone color
Aural Queries
◦ Query By Humming (QBH) systems
◦ Input: an aural melody, which is matched on melody and rhythm
Indexing for Aural Queries
◦ Melodies are extracted from the source
◦ Translated into text representations of intervals and pitch
Possible Applications
Legal
◦ Is any passage from this piece sampled or copied from one of ours?
Music education
◦ Music performance analysis
◦ Searching music by instruments for quintet practice
Music therapy
◦ Help doctors identify effective musical pieces
string quartet
piano sonata
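The indexing step for aural queries described earlier (melodies translated into text representations of intervals) can be illustrated with a small sketch. It assumes the melody has already been extracted as a list of MIDI note numbers. The function names and the U/D/R contour code are illustrative choices, not something the slides specify.

```python
def to_intervals(midi_notes):
    """Semitone steps between consecutive notes (unchanged by transposition)."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


def interval_string(midi_notes):
    """Space-separated signed intervals, usable as a text index key."""
    return " ".join(f"{step:+d}" for step in to_intervals(midi_notes))


def to_contour(midi_notes):
    """Coarser text code: U = up, D = down, R = repeated note."""
    return "".join(
        "U" if step > 0 else "D" if step < 0 else "R"
        for step in to_intervals(midi_notes)
    )


# Hypothetical example melody as MIDI notes: C C G G A A G
melody = [60, 60, 67, 67, 69, 69, 67]
transposed = [note + 5 for note in melody]  # same tune, hummed higher

print(interval_string(melody))      # +0 +7 +0 +2 +0 -2
print(to_contour(melody))           # RURURD
print(to_intervals(melody) == to_intervals(transposed))  # True
```

Because only the differences between notes are stored, a query hummed in a different key produces the same text key as the indexed melody.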
The nature and types of raw data
Data source        Organization   Volume       Type                    Quality
Traditional data   Structured     Modest       Discrete, categorical   Clean
Audio data         Unstructured   Very large   Continuous, numeric     Noisy
ID    Age   Occupation   Salary   City
1     18    Student      Low      Atlanta
2     30    Worker       Medium   Cleveland
3     43    Teacher      Medium   Richmond
4     50    Professor    High     Boston
5     40    Banker       High     New York
...   ...   ...          ...      ...
Traditional pattern recognition
◦ Lower-level raw data form
◦ Feature extraction → Feature Database
◦ Object/pattern detection (classification, clustering, regression) → Pattern Database
◦ Higher-level representations
◦ Energy values at each sample point
◦ Manageable, (nearly) homogeneous subset of objects
MusicMiner
◦ Organizes large collections of music and creates MusicMaps
◦ Automatic description of digital audio files by sound features
◦ Visualizes the similarity of songs and artists
◦ Similarity search in a music collection
MusicMiner: numerical measure of perceptual music similarity
◦ Low-level feature extraction: 400 features
◦ High-level features: 60
◦ Feature selection
◦ Clustering
notify! Whistle
◦ A query-by-whistling/humming system for melody retrieval
◦ A collection of approx. 2000 melodies and classical themes
Note extraction process
◦ Thresholding
◦ Signal splitting
◦ Fourier analysis
◦ Quantization to MIDI note level
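The last step of the note extraction process, quantization to MIDI note level, can be sketched with the standard equal-temperament mapping (A4 = 440 Hz = MIDI note 69). The input frequencies are made-up values standing in for the output of the Fourier analysis step. The function names are illustrative and are not taken from notify! Whistle.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(freq_hz, a4_hz=440.0):
    """Quantize a frequency to the nearest MIDI note number (A4 = 69)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(freq_hz / a4_hz))


def midi_to_name(midi_note):
    """Name a MIDI note, using the convention that MIDI 60 is C4."""
    return f"{NOTE_NAMES[midi_note % 12]}{midi_note // 12 - 1}"


# Hypothetical, slightly out-of-tune whistled frequencies in Hz
detected = [262.0, 294.5, 329.0, 441.2]
notes = [frequency_to_midi(f) for f in detected]
names = [midi_to_name(n) for n in notes]

print(notes)  # [60, 62, 64, 69]
print(names)  # ['C4', 'D4', 'E4', 'A4']
```

Rounding to the nearest semitone absorbs small tuning errors in the whistled input, so the query can be compared against stored melodies note by note.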
PlaySOM
◦ Collection provided by the user; music archives
◦ Query by Example, using an audio file
◦ Audio is indexed and feature vectors are stored in a vector file
◦ Interactive exploration
◦ Similarity-based search
Matching description
◦ Features (Rhythm Patterns) are passed to a self-organizing map
◦ Retrieves similar music by creating paths on the map
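The matching idea (feature vectors placed on a self-organizing map, similar music retrieved along a path on the map) can be sketched as follows. This is only an illustration under strong assumptions: the map is a tiny hand-written 2x3 grid rather than a trained one, the feature vectors have two dimensions instead of real Rhythm Patterns features, and all names are hypothetical rather than PlaySOM's own.

```python
import math

# Hypothetical 2x3 "trained" map: each unit holds a weight vector.
SOM = [
    [[0.1, 0.1], [0.5, 0.1], [0.9, 0.1]],
    [[0.1, 0.9], [0.5, 0.9], [0.9, 0.9]],
]

# Hypothetical feature vectors for four tracks.
TRACKS = {
    "track_a": [0.12, 0.15],
    "track_b": [0.48, 0.05],
    "track_c": [0.88, 0.92],
    "track_d": [0.93, 0.12],
}


def best_matching_unit(som, vector):
    """Return (row, col) of the unit whose weights are closest to vector."""
    best_unit, best_dist = None, math.inf
    for r, row in enumerate(som):
        for c, weights in enumerate(row):
            dist = sum((w - v) ** 2 for w, v in zip(weights, vector))
            if dist < best_dist:
                best_unit, best_dist = (r, c), dist
    return best_unit


def place_tracks(som, tracks):
    """Map each unit to the list of tracks that land on it."""
    placement = {}
    for name, features in tracks.items():
        placement.setdefault(best_matching_unit(som, features), []).append(name)
    return placement


def tracks_along_path(placement, path):
    """Collect tracks unit by unit along a path drawn on the map."""
    playlist = []
    for unit in path:
        playlist.extend(placement.get(unit, []))
    return playlist


placement = place_tracks(SOM, TRACKS)
playlist = tracks_along_path(placement, [(0, 0), (0, 1), (0, 2)])
print(playlist)  # ['track_a', 'track_b', 'track_d']
```

Neighboring units hold similar weight vectors, so walking a path across the map yields a playlist whose character changes gradually.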
Shazam: industry leader in audio fingerprinting
◦ For each audio file, generate reproducible landmarks
◦ Each landmark occurs at a time offset
◦ For each landmark, generate a “fingerprint” tag that characterizes its location
◦ Do the same for the sample
◦ Generate a list of matching fingerprints
◦ For a true match: time_db - time_sample = constant
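The matching rule (time_db - time_sample = constant) can be sketched by counting how often each time difference occurs among matching fingerprints. The fingerprint tags and times below are invented integers. How real landmarks and tags are computed is not covered by the slides and is not modeled here.

```python
from collections import Counter


def match_score(db_fingerprints, sample_fingerprints):
    """Return (offset, count) for the most common time_db - time_sample."""
    db_times = {}
    for tag, t_db in db_fingerprints:
        db_times.setdefault(tag, []).append(t_db)

    offsets = Counter()
    for tag, t_sample in sample_fingerprints:
        for t_db in db_times.get(tag, []):
            offsets[t_db - t_sample] += 1

    if not offsets:
        return None, 0
    return offsets.most_common(1)[0]


# Hypothetical (tag, time) pairs; times are in arbitrary frame units.
db_track = [(101, 0), (205, 3), (317, 7), (422, 12), (538, 15), (205, 20)]
other_track = [(317, 40), (999, 2), (205, 11)]
sample = [(317, 1), (422, 6), (538, 9), (205, 30)]

print(match_score(db_track, sample))     # (6, 3)
print(match_score(other_track, sample))  # best count is only 1
```

A track that truly contains the sample produces many matching fingerprints at the same offset, while accidental tag matches in other tracks scatter across different offsets.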
C-Brahms Retrieval Engine for Melody Searching
◦ Input the melody
◦ Match the note sequence and get the answer: composer, title, and the notes that matched
A Java-based online QBH system
◦ A Java applet records the audio signal
◦ Its fundamental frequency is then analyzed
◦ Adaptive preprocessing reduces the influence of background noise on the succeeding steps
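The slides do not say how this system analyzes the fundamental frequency. As one common possibility, the sketch below estimates it from the strongest autocorrelation lag. The function name, the search range, and the synthetic test tone are assumptions for illustration, and the adaptive noise preprocessing is not modeled.

```python
import math


def estimate_f0(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate fundamental frequency from the strongest autocorrelation lag."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_value = None, -math.inf
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        value = sum(
            samples[n] * samples[n + lag] for n in range(len(samples) - lag)
        )
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Synthetic stand-in for a hummed note: 0.1 s of a 200 Hz sine at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]

f0 = estimate_f0(tone, sample_rate)
print(f0)  # 200.0
```

A periodic signal correlates most strongly with itself when shifted by one full period, so the best lag gives the period and its inverse gives the pitch.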
GUIDO
◦ Query by Example
◦ Probabilistic matching: probabilistic models
◦ Clustered dataset: tree structure; the query is matched by following the paths
Midomi
◦ Query by Humming, Query by Example
◦ Multimodal Adaptive Recognition System: also takes into account speech and phonetic content
◦ Compares hummed queries to other hummed queries
◦ http://www.midomi.com/
Summary
◦ 43 MIR systems
◦ Most match melody and rhythm based on pitch estimation
◦ Is there an existing MIR system based on timbre matching?
Automatic indexing system for musical instruments
Intelligent query answering system for musical instruments
WWW.MIR.UNCC.EDU
Polyphonic Sound (flow diagram; blocks listed in the order they appear on the slide)
◦ Get frame
◦ FFT
◦ Feature extraction
◦ Classifier
◦ Pitch estimation
◦ Get instrument
◦ Sound separation
◦ Power spectrum
◦ New spectrum
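The first blocks of this pipeline (get frame, FFT, power spectrum) can be sketched as below. The 40 ms frame length is the value shown on the slides. The non-overlapping framing, the function names, and the synthetic test signal are assumptions. A naive DFT is used instead of an FFT only to keep the example free of external libraries; it computes the same spectrum more slowly.

```python
import cmath
import math


def split_frames(samples, sample_rate, frame_ms=40):
    """Cut a signal into consecutive, non-overlapping frames of frame_ms."""
    size = int(sample_rate * frame_ms / 1000)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def power_spectrum(frame):
    """Power of each DFT bin up to the Nyquist frequency (naive O(N^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


# Synthetic stand-in for audio: 0.1 s of a 200 Hz sine sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]

frames = split_frames(signal, sample_rate)
spectrum = power_spectrum(frames[0])
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * sample_rate / len(frames[0])

print(len(frames), peak_bin, peak_hz)  # 2 8 200.0
```

At 8 kHz a 40 ms frame holds 320 samples, so each spectrum bin is 25 Hz wide and the 200 Hz tone falls in bin 8. Per-frame features for the classifier would be computed from this spectrum.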
Instrument hierarchy (tree diagram)
◦ Root: Music
◦ Families: Strings, Brass, Percussion, Wood Winds
◦ Instruments shown: Violin, Viola, Cello, Guitar, Harp, Piano, Trumpet, French Horn, Flute, Bass Flute, Oboe, English Horn, Bass Clarinet
Per-frame classification (frame length: 40 ms)
◦ Feature extraction → Features → Classifier
Instrument     Confidence
Candidate 1    70%
Candidate 2    50%
...            ...
Candidate N    10%
Polyphonic Sound
◦ Get frame
◦ FFT
◦ Feature extraction
◦ Higher-level classifier: get family
◦ Lower-level classifier: get instrument candidates
◦ Finish estimation for all frames
◦ Voting process
◦ Get final winners
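The final two steps (voting process, get final winners) can be sketched as follows. The slides do not specify the voting rule, so summing per-frame confidences is an assumption here, as are the function name, the number of winners, and the made-up candidate lists.

```python
from collections import defaultdict


def vote(frame_candidates, winners=2):
    """Sum per-frame confidences and return the top-scoring instruments."""
    totals = defaultdict(float)
    for candidates in frame_candidates:
        for instrument, confidence in candidates:
            totals[instrument] += confidence
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [instrument for instrument, _ in ranked[:winners]]


# Hypothetical classifier output: one candidate list per 40 ms frame.
frame_candidates = [
    [("violin", 0.7), ("viola", 0.5), ("flute", 0.1)],
    [("violin", 0.6), ("flute", 0.4)],
    [("viola", 0.3), ("violin", 0.5), ("trumpet", 0.2)],
]

final_winners = vote(frame_candidates)
print(final_winners)  # ['violin', 'viola']
```

Aggregating over all frames lets an instrument that is consistently a strong candidate outrank one that scores well in only a single frame.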