LAM: Musical Audio Similarity
Michael Casey
Centre for Cognition, Computation and Culture
Department of Computing
Goldsmiths College, University of London
Overview
• Machine Music Understanding
• Features / Classes / Clusters
• Real-Time Audio Matching
• Feature Extraction
• Feature Similarity (Indexing / Retrieval)
• PD/MSP Tools
• Music Similarity Applications
• Sound object matching
• Texture matching
Sound Understanding
Signal Processing Sound Understanding
Feature Extraction
[Figure: an audio source divided into overlapping frames (frame 1, frame 2, frame 3); time axis marked at 10 ms, 20 ms, 30 ms and 40 ms]
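The framing step in the figure can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function name `frame_signal`, the 20 ms frame length and the 10 ms hop are assumptions read off the time axis of the figure, and the 1 kHz sample rate in the demo is chosen only so that one sample equals one millisecond.

```python
def frame_signal(samples, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Split a sequence of samples into overlapping fixed-length frames.

    frame_ms and hop_ms are assumed values, not taken from the slides.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must span at least one sample")
    frames = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames


# Hypothetical demo: 40 ms of dummy audio at 1 sample per millisecond.
demo_frames = frame_signal(list(range(40)), sample_rate=1000)
for index, frame in enumerate(demo_frames, start=1):
    print("frame", index, "starts at", frame[0], "ms and has", len(frame), "samples")
```

Each frame would then be passed to a spectral analysis stage; successive frames share half their samples under these assumed settings.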
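Statistical Learning for Decision Making
$$P(c \mid x) = \frac{p(x \mid c)\,P(c)}{p(x)}, \qquad c \in \{\text{Music}, \text{Speech}\}$$
[Figure: partitioning of feature space into Music and Speech regions by a decision boundary]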
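A small sketch of how Bayes' rule yields a decision boundary between two classes. Everything numeric here is an assumption made for illustration: the one-dimensional Gaussian class models in `MODELS`, the equal priors in `PRIORS`, and the function names are hypothetical and do not come from the slides.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, models, priors):
    """Return P(class | x) for every class using Bayes' rule."""
    joint = {c: gaussian_pdf(x, *models[c]) * priors[c] for c in models}
    evidence = sum(joint.values())  # p(x), the normalizing term
    return {c: joint[c] / evidence for c in joint}


def classify(x, models, priors):
    """Pick the class with the largest posterior probability."""
    post = posteriors(x, models, priors)
    return max(post, key=post.get)


# Hypothetical class models as (mean, std) and hypothetical priors.
MODELS = {"music": (0.0, 1.0), "speech": (4.0, 1.0)}
PRIORS = {"music": 0.5, "speech": 0.5}

for value in (0.5, 1.9, 2.1, 3.5):
    print(value, classify(value, MODELS, PRIORS))
```

With equal priors and equal spreads, the boundary of this toy model sits midway between the two class means; changing the priors shifts it toward the less probable class.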
MPEG-7 Audio Tools
[Figure: processing chain. Audio → Log Frequency Spectrogram (AudioSpectrumEnvelopeD) → Log Amplitude → Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD) → Hidden Markov Model (SoundModelDS) → State Path (SoundModelStatePathD)]
State Path: use the estimated state sequence as a feature.
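The idea of using an estimated state sequence as a feature can be illustrated with Viterbi decoding. The sketch below is not the MPEG-7 reference software: it assumes a toy hidden Markov model with scalar Gaussian emissions, whereas the chain in the slides feeds multi-dimensional projected spectra to the model. The function names and all model parameters are hypothetical.

```python
import math


def log_gauss(x, mean, std):
    """Log density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def viterbi(observations, log_start, log_trans, emissions):
    """Most likely state path for a sequence of scalar observations.

    log_start[i]    : log P(first state = i)
    log_trans[i][j] : log P(next state = j | current state = i)
    emissions[i]    : (mean, std) of the Gaussian emitted by state i
    """
    if not observations:
        return []
    n_states = len(log_start)
    score = [log_start[i] + log_gauss(observations[0], *emissions[i])
             for i in range(n_states)]
    backpointers = []
    for x in observations[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor of state j at this time step.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j]
                             + log_gauss(x, *emissions[j]))
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state model with "sticky" transitions.
LOG_START = [math.log(0.5), math.log(0.5)]
LOG_TRANS = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
EMISSIONS = [(0.0, 1.0), (5.0, 1.0)]

print(viterbi([0.1, -0.3, 0.2, 5.1, 4.8, 5.3], LOG_START, LOG_TRANS, EMISSIONS))
```

The returned list of state indices is the kind of sequence that a state-path descriptor stores, one index per analysis frame.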
MPEG-7 Audio Strings
Acoustic Lexicons
[Figure: processing chain. Audio → Log Frequency Spectrogram (AudioSpectrumEnvelopeD) → Log Amplitude → Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD) → Hidden Markov Model (SoundModelDS) → State Path (SoundModelStatePathD), written out as a symbol string: ? 7 1 V 7 1 0 1 ...]
State Symbol Sequence (40 State Model)
? 7 1 V 7 1 0 1 ...
SoundModelStateHistogramD
[Figure: two plots of state index against time, one with the time axis in seconds and one in 0.01 s frames]
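A state histogram summarizes a state path by how often each state occurs. The sketch below assumes the histogram holds relative frequencies (counts divided by path length); the function name `state_histogram` and the example path are hypothetical.

```python
from collections import Counter


def state_histogram(state_path, n_states):
    """Relative frequency of each state index in a state path."""
    if not state_path:
        return [0.0] * n_states
    counts = Counter(state_path)
    total = len(state_path)
    return [counts.get(state, 0) / total for state in range(n_states)]


# Hypothetical path over a four-state model, one state index per frame.
print(state_histogram([0, 0, 1, 3], n_states=4))
```

Unlike the state path itself, the histogram discards temporal order, which makes it a fixed-length vector regardless of the duration of the sound.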
Self-Similarity Matrix
$$\theta_{a,b} = \cos^{-1}\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$
[Figure: S-Matrix, in which each entry compares a pair of feature vectors a and b]
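A self-similarity matrix can be sketched by comparing every feature vector with every other one. This example fills each entry with the normalized dot product (the cosine of the angle between the two vectors); the function names and the toy feature list are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Normalized dot product of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def self_similarity_matrix(features):
    """Symmetric matrix whose entry (i, j) compares features[i] and features[j]."""
    n = len(features)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = cosine_similarity(features[i], features[j])
            matrix[i][j] = value
            matrix[j][i] = value  # similarity is symmetric
    return matrix


# Hypothetical features: the first and third frames are identical.
for row in self_similarity_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]):
    print(row)
```

Repeated material in a recording shows up as high values away from the main diagonal, which is why the matrix is useful for finding structure.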
Efficient Storage / Retrieval
• Real-Time Access
• Large Databases
• Distributed Databases
PostgreSQL Database Representation of State Path “Strings” and Histograms
Similarity
• Compute distance between feature pairs
• Features == SoundModelStateHistogramD
• Similarity Metric
– dist(a,b) >= 0
– dist(a,b) == 0 iff a == b
– dist(a,b) + dist(b,c) >= dist(a,c)
• Vector Dot Product
$$\theta_{a,b} = \cos^{-1}\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$
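The listed metric properties can be checked numerically for a distance built from the vector dot product. The sketch below assumes the distance is the angle between two vectors; the function names, the tolerance and the sample histograms are hypothetical. Note that the angle between a vector and any positive multiple of it is zero, so the "iff" property only holds when the features are normalized, as relative-frequency histograms are.

```python
import itertools
import math


def angular_distance(a, b):
    """Angle in radians between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("angular distance is undefined for a zero vector")
    # Clamp to [-1, 1] to absorb floating-point rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def satisfies_listed_properties(vectors, tol=1e-6):
    """Check non-negativity, dist(v, v) == 0 and the triangle inequality."""
    for a, b, c in itertools.product(vectors, repeat=3):
        if angular_distance(a, b) < 0.0:
            return False
        if angular_distance(a, b) + angular_distance(b, c) < angular_distance(a, c) - tol:
            return False
    return all(angular_distance(v, v) <= tol for v in vectors)


# Hypothetical state histograms (each sums to 1).
HISTOGRAMS = [[0.5, 0.25, 0.0, 0.25], [0.1, 0.2, 0.3, 0.4], [0.0, 0.0, 1.0, 0.0]]
print(satisfies_listed_properties(HISTOGRAMS))
```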
Similarity of Feature Trajectories
Dynamic Time Warping
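Dynamic time warping aligns two sequences that unfold at different speeds and reports the cost of the best alignment. The sketch below uses the basic recurrence with no window or slope constraints. It operates on scalar sequences for brevity; feature trajectories would be sequences of vectors, which the hypothetical `cost` parameter can accommodate by passing a vector distance.

```python
def dtw_distance(seq_a, seq_b, cost=lambda x, y: abs(x - y)):
    """Accumulated cost of the best monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # acc[i][j] is the best cost of aligning the first i items with the first j.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = step + min(acc[i - 1][j],      # seq_a advances alone
                                   acc[i][j - 1],      # seq_b advances alone
                                   acc[i - 1][j - 1])  # both advance
    return acc[n][m]


# The second trajectory lingers on the value 2 but is otherwise the same.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because one element may be matched to several elements of the other sequence, a time-stretched copy of a trajectory costs nothing extra under this recurrence.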
Acousticon Strings
• Distance Metric
– String Edit Distance (Levenshtein)
• Scalable to Large Databases
– PostgreSQL Implementation
– Can use built-in Index Structures
• Scalable to Real-Time Implementation
– matching and audio streaming (< 20 ms)
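The string edit distance named above can be sketched with the standard two-row dynamic program. This is a plain Python illustration, not the PostgreSQL implementation mentioned in the slides; the function name `edit_distance` and the example strings are assumptions.

```python
def edit_distance(s, t):
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    if len(s) < len(t):
        s, t = t, s  # keep the shorter string in the inner loop
    previous = list(range(len(t) + 1))
    for i, char_s in enumerate(s, start=1):
        current = [i]
        for j, char_t in enumerate(t, start=1):
            substitution = previous[j - 1] + (char_s != char_t)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


# Two hypothetical state-symbol strings differing in one symbol.
print(edit_distance("?71V7101", "?71V7001"))
```

Each symbol stands for one state of the model, so the distance counts how many state labels must change for one sound's string to become the other's.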
Information Retrieval for Creativity
• Utilize an extant sound database for new material
• Take the structure of a music clip but replace the content.
• New interfaces for music creativity.
Audio Information Retrieval
[Figure: Audio Query → Extract → Segment → Match → Result List, with Match drawing on the MPEG-7 Database]
• MPEG-7 Database: a pre-indexed collection of sounds
• Extract: feature extraction from audio
• Segment: partitioning of audio into chunks
• Match: find similar chunks of audio
• Result List: a sound, a scene, or a list of sounds
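The segment and match stages can be sketched end to end on toy data. Several simplifications are assumed here: the query arrives already extracted as a state path, chunks have a fixed length, the database is an in-memory dictionary of state histograms rather than an MPEG-7 database, and ranking uses the normalized dot product. All names and values are hypothetical.

```python
import math
from collections import Counter


def histogram(state_path, n_states):
    """Relative frequency of each state in a non-empty state path."""
    counts = Counter(state_path)
    return [counts.get(s, 0) / len(state_path) for s in range(n_states)]


def cosine_similarity(a, b):
    """Normalized dot product of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def segments(state_path, chunk_len):
    """Partition a state path into consecutive chunks of chunk_len frames."""
    return [state_path[i:i + chunk_len] for i in range(0, len(state_path), chunk_len)]


def match(query_path, database, n_states, chunk_len, top_k=1):
    """For each query chunk, list the names of the top_k most similar entries."""
    results = []
    for chunk in segments(query_path, chunk_len):
        query_hist = histogram(chunk, n_states)
        ranked = sorted(database,
                        key=lambda name: cosine_similarity(query_hist, database[name]),
                        reverse=True)
        results.append(ranked[:top_k])
    return results


# Hypothetical pre-indexed collection: sound name -> state histogram.
DATABASE = {"drum": [1.0, 0.0, 0.0], "voice": [0.0, 1.0, 0.0], "string": [0.0, 0.0, 1.0]}
QUERY = [0, 0, 0, 0, 1, 1, 1, 0, 2, 2, 2, 2]
print(match(QUERY, DATABASE, n_states=3, chunk_len=4))
```

The result is one ranked list per query chunk, so a long query yields a sequence of matched sounds rather than a single answer.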
Real-Time Matching
Musaics