new applicatiosn and learning machines · provider dmarc broadcasting for $102 million in cash....

77
Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 1 New applications of learning machines New applications of learning machines www.intelligentsound.org isp.imm.dtu.dk Jan Larsen

Upload: others

Post on 24-Sep-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

1 New applications of learning machines

New applications of learning machines

www.intelligentsound.org

isp.imm.dtu.dk

Jan Larsen

Page 2: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

2 New applications of learning machines

Cross-disciplinary research

Signal processing

StatisticsMachine learning

Domain knowledge

Page 3: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

3 New applications of learning machines

Informatics and Mathematical Modelling, DTU

2003 figures84 faculty members28 administrative staff members60 Ph.D. students90 M.Sc. students annually4000 students follow an IMM course annually

image processing and computer graphics

ontologies and databases

safe and secure IT systems

languages and verification

design methodologies

embedded/distributed systemsmathematical physics

mathematical statistics

geoinformatics

operations research

intelligent signal processing

system on-chipsnumerical analysis

Page 4: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

4 New applications of learning machines

ISP Group

HumanitarianDemining

Monitor Systems

Biomedical

Neuroinformatics

Multimedia

Machinelearning

•3+1 faculty•6+1 postdocs•20 Ph.D. students•10 M.Sc. students

•3+1 faculty•6+1 postdocs•20 Ph.D. students•10 M.Sc. students

from processing to understanding

extraction of meaningful information by learning

Page 5: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

5 New applications of learning machines

The potential of learning machines

Most real world problems are too complex to be handled by classical physical modelsIn most real world situations there is access to data describing properties of the problemLearning machines can offer– Learning of optimal prediction/decision/action– Adaptation to the usage environment– New insights into the problem and suggestions for

improvement

Page 6: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

6 New applications of learning machines

A short history of learning machines

clas

sica

l

moder

n

ADALINE

Neural nets

Gaussian processes

Kernel machines

Mixture of experts

Page 7: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

7 New applications of learning machines

Issues in machine learning

Data

•quantity

•stationarity

•quality

•structure

Features

•representation

•selection

•extraction

•integration

Models

•structure

•type

•learning

•selection and

integration

•unsupervised

•semi-supervised

•supervised

•cost function

•maximum likelihood

•Bayesian

•online vs. off-line

Evaluation

•performance

•robustness

•complexity

•interpretation and visualization

•HCI

•parametric: linear, nonlinear, mixture models

•non-parametric: kernel, Gaussian processes, clustering

•noise models

•integration of prior and domain knowledge

Page 8: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

8 New applications of learning machines

Outline

Machine learning framework for sound search– Involves all issues of machine learning

Genre classification– Involves feature selection, projection and integration– Involves linear and nonlinear classifiers

Music separation– Involves combination machine learning and other signal

processing– NMF and ICA machine learning algorithms

MIMO channel estimation and symbol detection– Involves advanced variational Bayesian learning

Page 9: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

9 New applications of learning machines

The digital music market

Wired, April 27, 2005:

"With the new Rhapsody, millions of people can now experience andshare digital music legally and with no strings attached," Rob Glaser,RealNetworks chairman and CEO, said in a statement. "We believe thatonce consumers experience Rhapsody and share it with their friends,many people will upgrade to one of our premium Rhapsody tiers."

Financial Times (ft.com) 12:46 p.m. ET Dec. 28, 2005:

LONDON - Visits to music downloading Web sites saw a 50 percent riseon Christmas Day as hundreds of thousands of people began loadingsongs on to the iPods they received as presents.

Wired, January 17, 2006:

Google said today it has offered to acquire digital radio advertisingprovider dMarc Broadcasting for $102 million in cash.

Page 10: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

10 New applications of learning machines

Huge demand for tools

Organization, search and retrieval– Recommender systems (”taste prediction”)– Playlist generation– Finding similarity in music (e.g., genre classification,

instrument classification, etc.)– Hit prediction– Newscast transcription/search– Music transcription/search

Machine learning is going to play a key role in future systems

Page 11: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

11 New applications of learning machines

Aspects of search

Specificitystandard search enginesindexing of deep contentObjective: high retrieval performance

Similaritymore like thissimilarity metrics Objective: high generalization and user acceptance

Page 12: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

12 New applications of learning machines

Specialized search and music organization

The NGSW is creating an online fully-searchable digital library of spoken word collections spanning the 20th century

Organize songs according to tempo, genre, mood

Query by humming

search for related songs using the “400 genes of music”

Explore byGenre, mood, theme, country, instrument

Page 13: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

13 New applications of learning machines

Sound information data

audio

data

User networks

co-play data

playlist

communities

user groups

Meta data

ID3 tags

context

low

high

Description

level

ontology

Page 14: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

14 New applications of learning machines

Machine learning in sound information processing

machine learning model

audio

data

User networks

co-play data

playlist

communities

user groups

Meta data

ID3 tags

contextTasks

Grouping

Classification

Mapping to a structure

Prediction e.g. answer

to query

Page 15: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

15 New applications of learning machines

Machine learning for high level interpretations

machine learning model

data

feature extraction and

selection

feature extraction and

selection

feature extraction and

selection

feature extraction and

selection

feature extraction and

selection

feature extraction and

selection

feature extraction and

selection

time integrationtime

integrationtime integrationtime

integrationtime integrationtime

integrationtime integration

unsupervisedsupervised

Similarity functions

Euclidian, Weighted Euclidian, Cosine,

Nearest Feature Line, earth Mover Distance, Self-organized Maps,

Distance From Boundary, Cross-

sampling, Bregman, KL, Manhattan,

Adaptive

Page 16: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

16 New applications of learning machines

Similarity structures

Low level features– Ad hoc from time-domain, Ad hoc from spectrum, MFCC,

RCC, Bark/Sone, Wavelets, Gamma-tone-filterbank

High level features– Basic statistics, Histograms, Selected subsets, GMM,

Kmeans, Neural Network, SVM, QDA, SVD, AR-model, MoHMM

Metrics– Euclidian, Weighted Euclidian, Cosine, Nearest Feature Line,

earth Mover Distance, Self-organized Maps, Distance From Boundary, Cross-sampling, Bregman, Manhattan

Time domian

• loudness

• zero-crossing energy

• log-energy

• down sampling

• autocorrelation

• peak detection

• delta-log-loudness

Frequency domain

• MFCC

• Gamma tone filterbank

• pitch

• brightness

• bandwidth

• harmonicity

• spectrum power

• subband power

• centroid

• roll-off

• low-pass filtering

• spectral flatness

• spectral tilt

• sharpness

• roughness

Page 17: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

17 New applications of learning machines

Predicting the answer from query

• : index for answer song

• : index for query song

• : user (group index)

• : hidden cluster index of similarity

Page 18: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

18 New applications of learning machines

Search and similarity integration

Integration

Projection onto latent

space

Clustering –perceptual resolution

user

List of songs,

metadata and content

d1

d2

dn

Page 19: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

19 New applications of learning machines

http://www.intelligentsound.org/demos/conceptdemo.swf

Page 20: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

20 New applications of learning machines

Demo ofWINAMPplugin

Lehn-Schiøler, T., Arenas-García, J., Petersen, K. B., Hansen, L. K., A Genre Classification Plug-in for Data Collection, ISMIR, 2006

Page 21: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

21 New applications of learning machines

http://www.intelligentsound.org/demos/automusic.swf

Page 22: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

22 New applications of learning machines

Intelligent Sound ProjectIMM (DTU) – CS, CT (AaU)

– Signal processing– Databases– Machine learning

Demo: Sound search engine

Demo: Matlab toolbox

Phd projects

Group publications

Joint publications

Workshops/Phd-courses

Page 23: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

23 New applications of learning machines

Research tasks

AaU Communication Technology:

TASK i): Features for sound based context modelling - MPEG and beyond

TASK ii): Signal separation in noisy environments: ICA and noise reduction

AaU Computer Science/Database Management:

TASK iii): Multidimensional management of sound as context

TASK iv): Advanced Query Processing for Sound Feature Streams

DTU IMM-ISP

TASK v): Context detection in sound streamsTASK vi): Webmining for sound

Page 24: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

24 New applications of learning machines

Genre classification

Prototypical example of predicting meta dataThe problem of interpretation of genresCan be used for other applications e.g. hearing aids

Page 25: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

25 New applications of learning machines

Model

Making the computer classify a sound piece intomusical genres such as jazz, techno and blues.

Pre-processingFeature extraction

Statisticalmodel

Post-processing

Sound Signal

Featurevector

Probabilities Decision

Page 26: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

26 New applications of learning machines

How do humans do?

Sounds – loudness, pitch, duration and timbreMusic – mixed streams of soundsRecognizing musical genre– physical and perceptual: instrument recognition, rhythm,

roughness, vocal sound and content– cultural effects

Page 27: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

27 New applications of learning machines

How well do humans do?

Data set with 11 genres25 people assessing 33 random 30s clips

accuracy 54 - 61 %

Baseline: 9.1%

Page 28: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

28 New applications of learning machines

What’s the problem ?

Technical problem: Hierarchical, multi-labelsReal problems: Musical genre is not an intrinsicproperty of music– A subjective measure– Historical and sociological context is important– No Ground-Truth

Page 29: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

29 New applications of learning machines

Music genres form a hierarchy

Music

Jazz New Age Latin

Swing New OrleansCool

Classic BB Vintage BB Contemp. BB

Quincy Jones: ”Stuff like that”

(according to Amazon.com)

Page 30: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

30 New applications of learning machines

Wikipedia

Page 31: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

31 New applications of learning machines

Music Genre Classification Systems

Pre-processingFeature extraction

Statisticalmodel

Post-processing

Sound Signal

Featurevector

Probabilities Decision

Page 32: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

32 New applications of learning machines

Features

Short time features (10-30 ms)– MFCC and LPC– Zero-Crossing Rate (ZCR), Short-time Energy (STE)– MPEG-7 Features (Spread, Centroid and Flatness Measure)

Medium time features (around 1000 ms)– Mean and Variance of short-time features– Multivariate Autoregressive features (DAR and MAR)

Long time features (several seconds)– Beat Histogram

Page 33: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

33 New applications of learning machines

On MFCC

Discrete Fourier

transform

Log amplitude spectrum

Mel scaling and

smoothing

Discrete Cosine

transform

MFCC represents a mel-weighted spectral envelope. The mel-scale models human auditory perception.Are believed to encode music timbre

Sigurdsson, S., Petersen, K. B., Mel Frequency Cepstral Coefficients: An Evaluation of Robustness of MP3 Encoded Music, Proceedings of the Seventh International Conference on Music Information Retrieval (ISMIR), 2006.

Page 34: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

34 New applications of learning machines

Features for genre classification

30s sound clip from the center of the song

6 MFCCs, 30ms frame

6 MFCCs, 30ms frame

6 MFCCs, 30ms frame

3 ARCs per MFCC, 760ms frame

30-dimensional AR features, xr ,r=1,..,80

Page 35: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

35 New applications of learning machines

Page 36: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

36 New applications of learning machines

Statistical models

Desired: (genre class and song )Used models – Intregration of MFCCs using MAR models– Linear and non-linear neural networks– Gaussian classifier– Gaussian Mixture Model– Co-occurrence models

Page 37: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

37 New applications of learning machines

Example of MFCC’s

•Cross correlation

•Temporal correlation

Page 38: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

38 New applications of learning machines

Results reported in• Meng, A., Ahrendt, P., Larsen, J., Hansen, L. K., Temporal Feature Integration for Music Genre Classification, IEEE Transactions on Signal Processing, 2006.

• A. Meng, P. Ahrendt, J. Larsen, Improving Music Genre Classification by Short-Time Feature Integration, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. V, pp. 497-500, 2005.

• Ahrendt, P., Goutte, C., Larsen, J., Co-occurrence Models in Music Genre Classification, IEEE International workshop on Machine Learning for Signal Processing, pp. 247-252, 2005.

• Ahrendt, P., Meng, A., Larsen, J., Decision Time Horizon for Music Genre Classification using Short Time Features, EUSIPCO, pp. 1293--1296, 2004.

• Meng, A., Shawe-Taylor, J., An Investigation of Feature Models for Music Genre Classification using the Support Vector Classifier, International Conference on Music Information Retrieval, pp. 604-609, 2005

Page 39: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

39 New applications of learning machines

Best results

5-genre problem (with little class overlap) : 2% error– Comparable to human classification on this database

Amazon.com 6-genre problem (some overlap) : 30% error11-genre problem (some overlap) : 50% error– human error about 43%

Page 40: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

40 New applications of learning machines

Best 11-genre confusion matrix

Page 41: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

41 New applications of learning machines

11-genre human evaluation

Page 42: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

42 New applications of learning machines

Supervised Filter Design in Temporal Feature Integration

Model the dynamics of MFCCs:Obtaining periodograms for eachframe of 768ms MFCC“Bank-filter” these new features to obtain discriminative data

J. Arenas-Gacía, J. Larsen, L.H. Hansen, A. Meng: Optimal filtering of dynamics in short-time features for music organization, ISMIR 2006.

Page 43: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

43 New applications of learning machines

MFCC3

frequency

Periodograms contain information about how fast MFCCs changeA bank with 4 constant-amplitude was proposed for genre classification

- 0 Hz : DC Value- 1 – 2 Hz : Beat rates- 3 – 15 Hz : Modulation energy (e.g., vibrato)- 20 – Fs/2 Hz : Perceptual Roughness

Orthonormalized PLS can be used for a better design of this bank filter.Additional constraint U>0: Positive Constrained OPLS (POPLS)

Page 44: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

44 New applications of learning machines

Principal component analysis (PCA)

Choose U to make the best approximation of X

This is equivalent to

– Xu1 proj. explains the maximumvariance of the data

– Xu2 the second one, s.t.

X2 T 1 Tˆ ˆarg min ( ) ,F

−= − = =U X XB B X X X X X XU

1

2 T

,..., 1max , . .

f

n f

n

k j k jkk

s t δ=

=∑u uXu u u

T2 1 0=u u

-1 -0.5 0 0.5 1 1.5 2 2.5-1

-0.5

0

0.5

1

1.5

2

PCA uk: eigenvectors of Cxx

Page 45: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

45 New applications of learning machines

-1 -0.5 0 0.5 1 1.5 2-1

-0.5

0

0.5

1

1.5

2

x(1)

x(2)

Think about the following classification problem

When facing classification or regression problems, we should use the labels to obtain good features

– Which direction will PCA consideras the most relevant one?

– If only one feature is to be kept, which is the best projectionvector under a classificationperspective?

PCA for supervised learning

Page 46: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

46 New applications of learning machines

Orthonormalized Partial Least Squares

Choose U to make the best approximation of Y in some space of reduced dimensionality

Rewriting the above equationOPLS:

X

2 T 1 Tˆ ˆarg min ( ) ,F

−= − = =U Y XB B X X X Y X XU

T T T T

T T T

max max ,

s.t.

xy yx=

= =U UU X YY XU U C C U

U X XU X X I(projected data is white)

Page 47: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

47 New applications of learning machines

Optimality of OPLS

OPLS is optimal for doing regressionbut can also be used for FE in classification if Y is used to encode class membership informationWe can directly use OPLS to design regressors and classifiers:– We compute from the training data– For new data

T 1 Tˆ ( )−=B X X X Y

( )ˆ ˆˆ (regression)

ˆˆ w.t.a. (classification)

= =

=

y xUB xB

y xB

Page 48: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

48 New applications of learning machines

Illustrative example: vibrato detection

Vib

NonVib

64 (32/32) AltoSax music snippets in Db3-Ab5Only the first MFCC was used

Leave-one-out CV error: 9,4 % (nf = 25); 20 % (nf = 2)(Fixed filter bank: 48,3 %)

Page 49: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

49 New applications of learning machines

POPLS for genre classification

1317 music snippets (30 s) evenly distributed among 11 genres7 MFCCs, but an unique filter bank

POPLS 2% better on average

compared to a fixed filter

bank of four filter

10-fold cross-validation

error falls to 61 % for nf =

25

Page 50: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

50 New applications of learning machines

Interpretation of filters

Filter 1: modulationfrequencies of instrumentsFilter 2: lower modulationfrequency + beat-scaleFilter 4: perceptualroughness

Consistent filters across 10-fold cross-validation– robustness to noise– relevant features for genre

Page 51: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

51 New applications of learning machines

Music separation

A possible front end component for the music search frameworkNoise reductionMusic transcriptionInstrument detection and separationVocalist identification

Unsupervised/supervised learning methods

Page 52: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

52 New applications of learning machines

Nonnegative matrix factor 2D deconvolution

M. N. Schmidt, M. Mørup Nonnegative Matrix Factor 2-D Deconvolution for Blind Single Channel Source Separation, ICA2006, 2006. Demo also available.

φ

048

τ0 2 4 6

Time [s]

Freq

uenc

y [H

z]

0 0.2 0.4 0.6 0.8200

400

800

1600

3200time

pitch

Page 53: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

53 New applications of learning machines

Demonstration of the 2D convolutive NMF model

φ

01531

τ0 1 2

Time [s]

Freq

uenc

y [H

z]

0 2 4 6 8 10200

400

800

1600

3200

Page 54: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

54 New applications of learning machines

Separating music into basic components

Page 55: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

55 New applications of learning machines

Separating music into basic components

Combined ICA and masking

• Pedersen, M. S., Wang, D., Larsen, J., Kjems, U., Two-microphone Separation of Speech Mixtures, 2006

• Pedersen, M. S., Lehn-Schiøler, T., Larsen, J., BLUES from Music: BLind Underdetermined Extraction of Sources from Music, ICA2006, vol. 3889, pp. 392-399, Springer Berlin / Heidelberg, 2006

• Pedersen, M. S., Wang, D., Larsen, J., Kjems, U., Separating Underdetermined Convolutive Speech Mixtures, ICA 2006, vol. 3889, pp. 674-681, Springer Berlin / Heidelberg, 2006

•Pedersen, M. S., Wang, D., Larsen, J., Kjems, U., OvercompleteBlind Source Separation by Combining ICA and Binary Time-Frequency Masking, IEEE International workshop on Machine Learning for Signal Processing, pp. 15-20, 2005

Page 56: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

56 New applications of learning machines

Assumptions

Stereo recording of the music piece is available.The instruments are separated to some extent in time and in frequency, i.e., the instruments are sparse in the time-frequency (T-F) domain. The different instruments originate from spatially different directions.

Page 57: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

57 New applications of learning machines

Separation principle: ideal T-F masking

Page 58: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

58 New applications of learning machines

Ste

reo

chan

nel 1

Ste

reo

chan

nel 2

Gain difference between channels

Page 59: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

59 New applications of learning machines

Separation principle 2: ICA

sources mixedsignals

recovered source signals

mixing

x = As

separation

ICAy = Wx

What happens if a 2-by-2 separation matrix W is applied to a

2-by-N mixing system?

Page 60: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

60 New applications of learning machines

ICA on stereo signals

We assume that the mixture can be modeled as an instantaneous mixture, i.e.,

The ratio between the gains in each column in the mixing matrix corresponds to a certain direction

1 1 1

2 1 2

( ) ( )( )

( ) ( )N

N

r rA

r rθ θ

θθ θ

⎡ ⎤= ⎢ ⎥⎣ ⎦

1( , ... , )Nx A sθ θ=

Page 61: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

61 New applications of learning machines

Direction dependent gain

( ) 20log | ( ) |=r θ WA θWhen W is applied, the two separated channels each contain a group of sources, which is as independent as possible from the other channel.

Page 62: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

62 New applications of learning machines

Combining ICA and T-F maskingx1 x2

ICA

STFT STFT

y1 y2

Y1(t, f) Y2(t, f)

1 when

0 otherwise 1 2

1

Y / Y cBM

>⎧= ⎨

1 when

0 otherwise 2 1

2

Y / Y cBM

>⎧= ⎨

X1(t,f)

BM1 BM2

x1(1) x2

(1)

ICA+BM separator

^ ^ISTFT

X2(t,f)

ISTFT

X1(t,f)

x1(2) x2

(2)^ ^ISTFT

X2(t,f)

ISTFT

Page 63: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

63 New applications of learning machines

Method applied iteratively

x1 x2

ICA+BM

ICA+BM ICA+BM

ICA+BM ICA+BM

Page 64: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

64 New applications of learning machines

Improved methodThe assumption of instantaneous mixing may not always holdAssumption can be relaxed Separation procedure is continued until very sparse masks are obtainedMasks that mainly contain the same source are afterwards merged

ICA+BM

ICA+BM

ICA+BM

ICA+BM

ICA+BM ICA+BM ICA+BM

ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM

ICA+BMICA+BMICA+BMICA+BMICA+BMICA+BMICA+BMICA+BM ICA+BMICA+BMICA+BMICA+BM ICA+BMICA+BMICA+BMICA+BM

ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BMICA+BM ICA+BM ICA+BM ICA+BM

ICA+BM ICA+BM ICA+BM ICA+BMICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BM ICA+BMICA+BM ICA+BM ICA+BM ICA+BMICA+BM ICA+BM ICA+BM ICA+BM

Page 65: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

65 New applications of learning machines

Mask merging

If the signals are correlated (envelope), their corresponding masks are merged.

The resulting signal from the merged mask is of higher quality.

+

Page 66: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

66 New applications of learning machines

Results

Evaluation on real stereo music recordings, with the stereo recording of each instrument available, before mixing.We find the correlation between the obtained sources and the by the ideal binary mask obtained sources.Other segregated music examples and code are available online via http://www.imm.dtu.dk

Page 67: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

67 New applications of learning machines

Results

The segregated outputs are dominated by individual instrumentsSome instruments cannot be segregated by this method, because they are not spatially different.

Page 68: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

68 New applications of learning machines

Conclusion on combined ICA T-F separation

An unsupervised method for segregation of single instruments or vocal sound from stereo music. The segregated signals are maintained in stereo.Only spatially different signals can be segregated from each other. The proposed framework may be improved by combining the method with single channel separation methods.

Page 69: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

69 New applications of learning machines

MIMO channel estimation and symbol detection

Christensen, L. P. B., Larsen, J., On Data and Parameter Estimation Using the VariationalBayesian EM-algorithm for Block-fading Frequency-selective MIMO Channels, ICASSP, 2006

Application of machine-learning algorithm to wireless communicationsImproved iterative parameter estimation framework compared to the EM-algorithmGeneralizes the EM-algorithm by working with parameter distributions instead of point-estimatesExplicit solutions provided for channel and covariance estimationSimilar complexity per iteration as the EM-algorithm

Page 70: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

70 New applications of learning machines

MIMO system model

Model parameters

Symbols

•Block fading: the channel is constant over a frame of symbols

•Frequency selective channel has length of symbols

Page 71: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

71 New applications of learning machines

EM learning with hidden variables

The likelihood of is incomplete as the symbols are unknown

E step is computed using the BCJR forward-backward algorithm

Page 72: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

72 New applications of learning machines

Variational Bayes learning

Parameter fluctuations is taken into account

Reduces to EM when

Page 73: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

73 New applications of learning machines

Algorithm

Conjugated priors enables VBE step similar to that of BCJR in EM

VBE-step(Equalization)

VBM-stepChannel

Estimation

Covariance Estimation

( )qx x

( )qh h( )qΣ Σ

( ) ( ),q qh Σh Σ

Page 74: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

74 New applications of learning machines

Simulation example

BPSK link considered (linearized GSM system)SISO Block-fading Typical Urban (TU) channel model, channel length L=7AWGN, i.e., scalar covariance estimationNon-informative priors used

For Nf=142+6 and Ntr=26, VBEM falls back to EMFor Nf=71+6 and Ntr=13, VBEM gains over EM due to increased uncertainty in the parameters

Page 75: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

75 New applications of learning machines

Results

Training: only training bitsKnown: true parameters

No significant difference

Page 76: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

76 New applications of learning machines

Results

Significant differenceUseful when few data relative to number of parameters: short bursts and/or MIMO systems

Page 77: New applicatiosn and learning machines · provider dMarc Broadcasting for $102 million in cash. Intelligent Signal Processing Group, IMM, DTU / Jan Larsen 10 New applications of learning

Intelligent Signal Processing Group, IMM, DTU / Jan Larsen

77 New applications of learning machines

SummaryMachine learning is, and will become, an important component in most real world applicationsSearching in massive amounts of heterogeneous enhances “productivity”simply important to ….quality of life…Machine learning is essential for search – in particular mapping low level data to high description levels enabling human interpretationMusic separation combines unsupervised methods ICA/MNF with other SP and supervised techniquesAdvanced Bayesian learning schemes provides optimal performance in highly non-stationary environment with few training data – e.g. communication