On Learning with Kernels for Audio Signal Processing: the old and the new

Hachem Kadri
QARMA team - LIF, Aix-Marseille University
GIPSA-Lab 2013



Background - Functional Learning (1/2)

$y_i = f(x_i) + \epsilon_i$

▶ Supervised learning
   → Data: $n$ training examples $\{(x_1, y_1), \ldots, (x_n, y_n)\}$
   → Goal: learn $f$

Predictor → Response                      Model
$\mathbb{R}^d$ → $\{-1, 1\}$              Binary classification
$\mathbb{R}^d$ → $\{1, 2, 3, \ldots\}$    Multi-class classification
$\mathbb{R}^d$ → $\mathbb{R}$             Multiple regression

H. Kadri, QARMA Learning with kernels: the old and the new 2/1


Background - Functional Learning (2/2)

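▶ Minimization problem

$\min_{f \in \mathcal{F}} \sum_{i=1}^{n} V\big(y_i, f(x_i)\big)$

   → $V$: loss function, e.g. square loss: $\big(y_i - f(x_i)\big)^2$

▶ Overfitting problem

▶ Regularized minimization

$\min_{f \in \mathcal{F}} \sum_{i=1}^{n} V\big(y_i, f(x_i)\big) + \lambda\,\Omega(f)$

   → $\Omega$: regularization, e.g. L2-norm: $\Omega(f) = \|f\|_{\mathcal{F}}^2$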
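As one concrete instance of the regularized problem above, the sketch below takes the square loss and the squared L2-norm penalty over linear functions $f(x) = \langle a, x \rangle$, which gives ridge regression in closed form. The linear hypothesis class, the synthetic data, and the names `ridge_fit` and `a_true` are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Minimize sum_i (y_i - <a, x_i>)^2 + lam * ||a||^2 in closed form."""
    d = X.shape[1]
    # Normal equations of the regularized problem: (X^T X + lam I) a = X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


# Hypothetical synthetic data: 30 examples in R^5 with small additive noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
a_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ a_true + 0.1 * rng.normal(size=30)

a_small = ridge_fit(X, y, lam=1e-3)   # weak regularization: close to least squares
a_large = ridge_fit(X, y, lam=1e3)    # strong regularization: coefficients shrink
```

A larger `lam` trades data fit for a smaller norm, which is the mechanism the slide invokes against overfitting.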


Background - Learning with Kernels (1/2)

$y_i = f(x_i) + \epsilon_i$ ; $y_i \in \mathbb{R}$

▶ Linear model: $f(x) = \langle a, x \rangle + b$

▶ Kernels: nonlinear/nonparametric estimation

[Figure: mapping from input space to feature space]

The RKHS associated with a positive definite kernel $k$ gives a desired feature space.


Background - Learning with Kernels (2/2)

2 Perspectives

▶ Feature space
   → nonlinear in input space
   → projecting data into a feature space
   → linear in feature space
   → kernel trick: $\langle \Phi(x_1), \Phi(x_2) \rangle = k(x_1, x_2)$

[Diagram: spaces $X$, $\mathbb{R}$, $\mathcal{F}_X$ with maps $\Phi_X$, $f$, $g$]
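The kernel trick can be checked numerically on a small case. The sketch below uses the homogeneous degree-2 polynomial kernel, whose explicit feature map is known in closed form; the choice of this kernel and the names `poly2_kernel` and `poly2_features` are illustrative assumptions, not something the slides specify.

```python
import numpy as np


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel k(x, z) = <x, z>^2."""
    return float(np.dot(x, z)) ** 2


def poly2_features(x):
    """Explicit feature map Phi(x) = vec(x x^T), so <Phi(x), Phi(z)> = <x, z>^2."""
    return np.outer(x, x).ravel()


x1 = np.array([1.0, 2.0, -1.0])
x2 = np.array([0.5, -1.0, 3.0])

lhs = float(np.dot(poly2_features(x1), poly2_features(x2)))  # inner product in feature space
rhs = poly2_kernel(x1, x2)                                   # kernel evaluated in input space
```

Both sides agree, but `rhs` never builds the 9-dimensional feature vectors, which is the point of the trick.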

▶ RKHS theory
   → Mercer theorem: integral operator + positive kernel
   → reproducing property: $\langle f, k(x, \cdot) \rangle = f(x)$
   → representer theorem: $f(\cdot) = \sum_i \alpha_i k(x_i, \cdot)$ ; $\alpha_i \in \mathbb{R}$
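The representer theorem reduces learning in an RKHS to finding the $n$ coefficients $\alpha_i$. A minimal sketch for the square loss follows, assuming a Gaussian kernel and toy one-dimensional data; the kernel choice, the bandwidth `gamma`, and the helper names `gaussian_gram`, `krr_fit`, `krr_predict` are all illustrative assumptions.

```python
import numpy as np


def gaussian_gram(A, B, gamma=1.0):
    """Gram matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_fit(X, y, lam, gamma=1.0):
    """Coefficients alpha of f(.) = sum_i alpha_i k(x_i, .) for the square loss."""
    G = gaussian_gram(X, X, gamma)
    return np.linalg.solve(G + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Evaluate f(x) = sum_i alpha_i k(x_i, x) at the rows of X_new."""
    return gaussian_gram(X_new, X_train, gamma) @ alpha


# Hypothetical toy problem: recover a sine wave from 40 noiseless samples.
X = np.linspace(0.0, 1.0, 40)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
alpha = krr_fit(X, y, lam=1e-3, gamma=20.0)
y_hat = krr_predict(X, alpha, X, gamma=20.0)
```

The fitted function is nonlinear in the input but linear in the coefficients, so training is a single linear solve.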


Background - Applications in audio processing

▶ Music segmentation (Davy et al., 2006 - IRCCyN)

▶ Speaker verification (Louradour et al., 2007 - IRIT)

▶ Speaker change detection (Harchaoui et al., 2008 - LTCI)

▶ Sound recognition (Rabaoui et al., 2008 - LAGIS)

▶ Speech inversion (Toutios et al., 2008 - LORIA)


Learning with kernels - Limitations and Challenges

++ Geometric intuition and interpretation
-- Choosing the kernel in advance
-- Sequential, time-varying characteristics
-- Limited to single task/scalar output

▶ "Sophisticated" kernel methods
   • learning kernels → MKL: Multiple Kernel Learning
   • probability distribution → RKHS embedding of distributions
   • connection geometric/time-varying → FDA
   • multi-task/complex outputs → Operator-valued kernel
   • Deep Learning - Representation Learning → ? ...


→ Learning with kernels/RKHS embedding
   1950: Aronszajn
   1995 - 2002: Vapnik, Cortes, Schölkopf, Smola
   2005 - ...: Gretton, Le Song, Fukumizu

→ Learning kernels (learning the kernel itself)
   2004: Lanckriet - Bach
   2006: Sonn.
   2007: Rakoto.
   2010: Cortes - Kloft

→ Learning with operator-valued kernels
   1958 - 1960: Pedrick - Schwartz
   2005: Micchelli & Pontil
   2008: Caponnetto
   2010/2011: Kadri/d'Alché-Buc

→ Learning operator-valued kernels (learning the kernel itself)
   2011: Dinuzzo
   2012: Kadri
   2013: Sindhwani


FDA - Examples

[Figure: four examples of functional data]
   • Spectrometric curves (absorbances vs. wavelengths)
   • Speech (amplitude vs. frequencies)
   • Handwriting (meters vs. meters)
   • Electricity consumption (differentiated log vs. months)

Tasks: regression, classification, time warping, forecasting

Ramsay and Silverman (2002) - Ferraty and Vieu (2006)


FDA - Functional inputs & functional outputs

$y_i = f(x_i) + \epsilon_i$

Predictor → Response    Model
$L^2$ → $L^2$           Functional model - functional responses

[Figure: (a) Temperature (Deg C) vs. day; (b) Precipitation (mm) vs. day]

• Operator estimation → $\min_{f \in \mathcal{F}} \sum_{i=1}^{n} \|y_i - f(x_i)\|_{\mathcal{Y}}^2 + \lambda \|f\|_{\mathcal{F}}^2$


Learning from functional responses - Discrete case

[Diagram: learning from multiple response data]
   • Statistics: multiple output regression, C&W procedure (Breiman and Friedman, 1997)
   • Machine learning: learning vector-valued functions, multi-task learning (Micchelli and Pontil, 2005)


Reproducing kernels - From scalar to functional

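[Diagram, scalar-valued case: spaces $X$, $\mathbb{R}$, $\mathcal{F}_X$ with maps $\Phi_X$, $f$, $g$]
[Diagram, function-valued case: spaces $X$, $Y$, $\mathcal{F}_{XY}$ with maps $\Phi_{XY}$, $f$, $g$]

• Operator-valued kernels & function-valued RKHS → Nonlinear FDA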


Outline

• Hilbert space of operators with reproducing kernels
   → Function-valued RKHS
   → Operator-valued kernel

• Operator estimation
   → $L^2$-regularized operator learning algorithm
   → Block operator kernel matrix inversion

• Application to audio and speech processing
   → Speech inversion
   → Environmental sound recognition


Operator-valued kernels - Definition

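• $(x_i(s), y_i(t))_{i=1}^{n} \in \mathcal{X} \times \mathcal{Y}$

• $\mathcal{X} : \Omega_x \to \mathbb{R}$ ; $\mathcal{Y} : \Omega_y \to \mathbb{R}$

• $\Omega \subseteq \mathbb{R}$ : curve ; $\Omega \subseteq \mathbb{R}^2$ : image

Definition
$K_{\mathcal{F}}(\cdot, \cdot) : \mathcal{X} \times \mathcal{X} \to L(\mathcal{Y})$

▶ $K_{\mathcal{F}}$ is Hermitian if $K_{\mathcal{F}}(w, z) = K_{\mathcal{F}}(z, w)^*$,

▶ it is nonnegative on $\mathcal{X}$ if for any $\{(w_i, u_i)_{i=1,\ldots,r}\} \in \mathcal{X} \times \mathcal{Y}$

$\sum_{i,j} \langle K_{\mathcal{F}}(w_i, w_j) u_i, u_j \rangle_{\mathcal{Y}} \geq 0$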
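Both conditions of the definition can be checked numerically once the output space is discretized. The sketch below assumes a separable kernel $K(w, z) = k(w, z)\,T$ with a Gaussian scalar kernel and a symmetric positive semidefinite matrix standing in for the operator $T$; the finite-dimensional surrogate, the random data, and the name `block_gram` are assumptions for illustration.

```python
import numpy as np


def block_gram(G, T):
    """Block matrix [G[i, j] * T]: Gram matrix of K(w, z) = k(w, z) T."""
    return np.kron(G, T)


rng = np.random.default_rng(1)
W = rng.normal(size=(6, 3))                      # r = 6 hypothetical inputs w_i
sq = np.sum((W[:, None, :] - W[None, :, :]) ** 2, axis=-1)
G = np.exp(-0.5 * sq)                            # scalar Gaussian Gram matrix k(w_i, w_j)

A = rng.normal(size=(4, 4))
T = A @ A.T                                      # symmetric PSD stand-in for an operator in L(Y)
K = block_gram(G, T)

U = rng.normal(size=(6, 4))                      # arbitrary vectors u_i in the discretized Y
u = U.ravel()
quad = float(u @ K @ u)                          # sum_{i,j} <K(w_i, w_j) u_i, u_j>
hermitian = bool(np.allclose(K, K.T))            # K(w, z) = K(z, w)^* in the real case
min_eig = float(np.linalg.eigvalsh(K).min())     # nonnegativity for every choice of u
```

Checking the smallest eigenvalue of the block matrix covers all choices of the $u_i$ at once, whereas `quad` tests a single draw.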


Operator-valued kernels - Function-valued RKHS

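• Extending real/vector-valued RKHS theory to FDA (Kadri et al., AISTATS 2010)

• RKHS of function-valued functions

Definition
A Hilbert space $\mathcal{F} = \{f : \mathcal{X} \to \mathcal{Y}\}$ is called a reproducing kernel Hilbert space if there is an operator-valued kernel $K_{\mathcal{F}}$ such that:

▶ $h : z \mapsto K_{\mathcal{F}}(w, z) g \implies h \in \mathcal{F}$, $\forall w \in \mathcal{X}$ and $g \in \mathcal{Y}$

▶ $\forall f \in \mathcal{F}$, $\langle f, K_{\mathcal{F}}(w, \cdot) g \rangle_{\mathcal{F}} = \langle f(w), g \rangle_{\mathcal{Y}}$ (reproducing property)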


Operator-valued kernels - Uniqueness & Bijection

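Lemma
$\mathcal{F}$ function-valued RKHS $\implies$ $K_{\mathcal{F}}(w, z)$ is unique

▶ Proof:
   • $\langle K'(w', \cdot) g', K(w, \cdot) g \rangle_{\mathcal{F}} = \langle K'(w', w) g', g \rangle_{\mathcal{Y}}$
   • $\langle K(w, \cdot) g, K'(w', \cdot) g' \rangle_{\mathcal{F}} = \langle K(w, w') g, g' \rangle_{\mathcal{Y}} = \langle g, K(w, w')^* g' \rangle_{\mathcal{Y}} = \langle g, K(w', w) g' \rangle_{\mathcal{Y}}$

Theorem
$K_{\mathcal{F}}(w, z)$ nonnegative $\iff$ RKHS $\mathcal{F}$

▶ Proof:
   $\Leftarrow$ : $\sum_{i,j=1}^{n} \langle K(w_i, w_j) u_i, u_j \rangle_{\mathcal{Y}} = \sum_{i,j=1}^{n} \langle K(w_i, \cdot) u_i, K(w_j, \cdot) u_j \rangle_{\mathcal{F}}$
   $\Rightarrow$ : $\mathcal{F}_0$, $\forall f \in \mathcal{F}_0$, $f(\cdot) = \sum_{i=1}^{n} K_{\mathcal{F}}(w_i, \cdot) \alpha_i$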


Operator-valued kernels - Construction

• Multi-task kernel $\implies$ $K(w, z) = k(w, z)\,T$
   ▶ $k$: real-valued kernel
   ▶ $T$: diagonal matrix + low rank matrix (finite dimension)

• FDA kernel $\implies$ $T \in L(\mathcal{Y})$ (infinite dimension) ?
   ▶ Concurrent functional linear model
      → $y(t) = \alpha(t) + \beta(t) x(t)$
      → Multiplication operator
      → Varying coefficient model (Hastie and Tibshirani, 1993)
   ▶ Functional linear model for functional responses (Ramsay and Silverman, 2005)
      → $y(t) = \alpha(t) + \int \beta(s, t) x(s)\, ds$
      → Hilbert-Schmidt integral operator
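The two functional linear models above differ only in the operator applied to $x$. A rough sketch on a uniform grid over $[0, 1]$ follows, with the integral approximated by a Riemann sum; the grid size and the particular choices of `alpha`, `beta_pointwise` and `beta_surface` are invented for illustration.

```python
import numpy as np

m = 200
t = np.linspace(0.0, 1.0, m)
dt = t[1] - t[0]

x = np.sin(2.0 * np.pi * t)            # hypothetical input curve x(t)
alpha = 0.5 * t                        # hypothetical intercept function alpha(t)
beta_pointwise = 1.0 + t**2            # beta(t) for the concurrent model
# beta(s, t) for the integral model: rows index s, columns index t
beta_surface = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.02)

# Concurrent model: y(t) = alpha(t) + beta(t) x(t), a multiplication operator,
# so y at time t depends on x at the same time t only.
y_concurrent = alpha + beta_pointwise * x

# Functional linear model: y(t) = alpha(t) + int beta(s, t) x(s) ds, an integral
# operator, so y at time t depends on the whole curve x (Riemann-sum quadrature).
y_integral = alpha + (beta_surface.T @ x) * dt
```

In matrix terms the multiplication operator is diagonal on the grid while the integral operator is a dense matrix, which is the finite-dimensional picture of the two operator families.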


Operator-valued kernels - Examples

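1. Multiplication operator
   $K_{\mathcal{F}} : \mathcal{X} \times \mathcal{X} \to L(\mathcal{Y})$
   $x_1, x_2 \mapsto k_x(x_1, x_2)\, T^{k_y}$ ; $T^{h} y(t) \triangleq h(t) y(t)$

2. Hilbert-Schmidt integral operator
   $K_{\mathcal{F}} : \mathcal{X} \times \mathcal{X} \to L(\mathcal{Y})$
   $x_1, x_2 \mapsto k_x(x_1, x_2)\, T^{k_y}$ ; $T^{h} y(t) \triangleq \int h(s, t) y(s)\, ds$

3. Composition operator
   $K_{\mathcal{F}} : \mathcal{X} \times \mathcal{X} \to L(\mathcal{Y})$
   $x_1, x_2 \mapsto C_{\psi(x_1)} C^{*}_{\psi(x_2)}$ ; $C_{\varphi} : f \mapsto f \circ \varphi$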
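The first two examples can be written as matrices once functions are sampled on a grid. The sketch below builds $K(x_1, x_2) = k_x(x_1, x_2)\,T$ for both operator types and applies it to a sampled function $y$; the Gaussian form of `k_x`, the functions `h` and `h_surface`, and all helper names are assumptions, and the composition operator is left out.

```python
import numpy as np

m = 100
t = np.linspace(0.0, 1.0, m)
dt = t[1] - t[0]


def k_x(x1, x2, gamma=1.0):
    """Scalar Gaussian kernel between two sampled curves (L2 distance by Riemann sum)."""
    return float(np.exp(-gamma * np.sum((x1 - x2) ** 2) * dt))


def T_mult(h):
    """Multiplication operator (T^h y)(t) = h(t) y(t) as a diagonal matrix."""
    return np.diag(h)


def T_int(h_surface):
    """Integral operator (T^h y)(t) = int h(s, t) y(s) ds as a matrix (rows of h_surface index s)."""
    return h_surface.T * dt


h = np.exp(-np.abs(t - 0.5))                              # hypothetical multiplier h(t)
h_surface = np.exp(-np.abs(t[:, None] - t[None, :]))      # hypothetical kernel h(s, t)

x1 = np.sin(2.0 * np.pi * t)
x2 = np.cos(2.0 * np.pi * t)
y = t**2

K_mult = k_x(x1, x2) * T_mult(h)       # K(x1, x2) with a multiplication operator
K_int = k_x(x1, x2) * T_int(h_surface) # K(x1, x2) with an integral operator
out_mult = K_mult @ y                  # the function K(x1, x2) y on the grid
out_int = K_int @ y
```

Each kernel value is itself an $m \times m$ matrix here, the discretized counterpart of an operator in $L(\mathcal{Y})$.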


Operator-valued kernels - Feature map (Kadri et al., ICML 2011)

▶ Operator-valued kernel admits a feature map representation
   → $\langle K(x_1, x_2) y_1, y_2 \rangle_{\mathcal{Y}} = \langle \Phi(x_1, y_1), \Phi(x_2, y_2) \rangle_{L(\mathcal{X}, \mathcal{Y})}$
   → $\langle K(x_1, \cdot) y_1, K(x_2, \cdot) y_2 \rangle_{\mathcal{F}} = \langle K(x_1, x_2) y_1, y_2 \rangle_{\mathcal{Y}}$

▶ Complex/infinite-dimensional inputs
   → multiple functional data $x_i \in (L^2)^p$

▶ FDA viewpoint
   → one observation = one continuous curve

Real-valued RKHS
   $\Phi_k : (L^2)^p \to L((L^2)^p, \mathbb{R})$, $x \mapsto k(x, \cdot)$
   dim: $p \to 1$

Function-valued RKHS
   $\Phi^{y}_{K} : (L^2)^p \to L((L^2)^p, L^2)$, $x \mapsto K(x, \cdot) y$
   dim: $p \to \infty$


Optimization problem - Representer theorem

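Theorem
The solution of the minimization problem

$\min_{f \in \mathcal{F}} \sum_{i=1}^{n} \|y_i - f(x_i)\|_{\mathcal{Y}}^2 + \lambda \|f\|_{\mathcal{F}}^2$

is achieved by a function of the form

$f^{*}(\cdot) = \sum_{i=1}^{n} K_{\mathcal{F}}(x_i, \cdot) \beta_i$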


Optimization problem - Solution

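$\min_{f \in \mathcal{F}} \sum_{i=1}^{n} \|y_i - f(x_i)\|_{\mathcal{Y}}^2 + \lambda \|f\|_{\mathcal{F}}^2$

using the representer theorem & the reproducing property

$\iff \min_{\beta_i \in \mathcal{Y}} \sum_{i=1}^{n} \Big\| y_i - \sum_{j=1}^{n} K_{\mathcal{F}}(x_i, x_j) \beta_j \Big\|_{\mathcal{Y}}^2 + \lambda \sum_{i,j}^{n} \langle K_{\mathcal{F}}(x_i, x_j) \beta_i, \beta_j \rangle_{\mathcal{Y}}$

▶ Discretization (Kadri et al., AISTATS 2010)
   → grid $\{t_1, \ldots, t_m\}$ $\implies$ $\beta_i(t_1), \ldots, \beta_i(t_m)$

▶ Approximation (Kadri et al., Tech. Report 2011)
   → $\mathcal{Y}$ a real RKHS $\implies$ $\beta_i = \sum_{l=1}^{m} \alpha_{il} k(t_l, \cdot)$

▶ Analytic solution (Kadri et al., ICML 2011)
   → $(\mathbf{K} + \lambda I)\beta = y$ ; $\beta \in (\mathcal{Y})^n$ and $\mathbf{K} \in [L(\mathcal{Y})]^{n \times n}$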
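A sketch of the discretization route follows: every function is sampled on a grid of $m$ points, so $\mathbf{K}$ becomes an $nm \times nm$ matrix and $(\mathbf{K} + \lambda I)\beta = y$ is an ordinary linear system. The separable form $K(x_i, x_j) = G_{ij}\,T$, the Euclidean inner product on grid values, the random data, and the names `fit_function_valued_krr` and `predict` are assumptions for illustration rather than the cited implementations.

```python
import numpy as np


def fit_function_valued_krr(G, T, Y, lam):
    """Solve (K + lam I) beta = y with K = kron(G, T); Y has shape (n, m)."""
    n, m = Y.shape
    K = np.kron(G, T)                                   # block (i, j) equals G[i, j] * T
    beta = np.linalg.solve(K + lam * np.eye(n * m), Y.ravel())
    return beta.reshape(n, m)                           # row i holds beta_i on the grid


def predict(g_new, T, beta):
    """f*(x) = sum_i K(x_i, x) beta_i, with g_new[i] = k(x_i, x)."""
    return T @ (beta.T @ g_new)


rng = np.random.default_rng(2)
n, m, lam = 8, 20, 1e-2
t = np.linspace(0.0, 1.0, m)
X = rng.normal(size=(n, m))                             # hypothetical sampled input curves
Y = np.cumsum(X, axis=1) / m                            # hypothetical functional responses

sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1) / m
G = np.exp(-sq)                                         # scalar Gram matrix on the inputs
T = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.1)     # discretized operator on the outputs

beta = fit_function_valued_krr(G, T, Y, lam)
Y_fit = np.stack([predict(G[:, i], T, beta) for i in range(n)])
```

At the solution the training residual satisfies $y - \mathbf{K}\beta = \lambda\beta$, which gives a quick sanity check on any solver.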


Optimization problem - Block operator kernel matrix inversion

▶ (block) numerical range: spectral and operator theory

▶ Spectral theory of block operator matrices (C. Tretter, 2008)

▶ $K(x_i, x_j) = G(x_i, x_j)\,T$, $\forall x_i, x_j \in \mathcal{X}$

▶ Kronecker product

   → $\mathbf{K} = \begin{pmatrix} G(x_1, x_1) T & \ldots & G(x_1, x_n) T \\ \vdots & \ddots & \vdots \\ G(x_n, x_1) T & \ldots & G(x_n, x_n) T \end{pmatrix} = G \otimes T$

   → $\mathbf{K}^{-1} = G^{-1} \otimes T^{-1}$
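The Kronecker inverse identity is easy to confirm in finite dimensions. The check below uses small random symmetric positive definite matrices as stand-ins for $G$ and a discretized $T$; the sizes and the random construction are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
G = A @ A.T + np.eye(4)          # stand-in for the n x n scalar Gram matrix (n = 4)
B = rng.normal(size=(3, 3))
T = B @ B.T + np.eye(3)          # stand-in for the discretized operator T (m = 3)

K = np.kron(G, T)                                          # (nm) x (nm) block matrix
K_inv_direct = np.linalg.inv(K)                            # one inversion of size nm
K_inv_kron = np.kron(np.linalg.inv(G), np.linalg.inv(T))   # two inversions of sizes n and m
```

Inverting the factors separately replaces one inversion of size $nm$ by two of sizes $n$ and $m$, which is what makes the block structure useful.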


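Algorithm 1: $L^2$-Regularized Operator Learning Algorithm

Input
   data $x_i \in (L^2([0, 1]))^p$, $y_i \in L^2([0, 1])$, size $n$

Eigendecomposition of $G = (G(x_i, x_j))_{i,j=1}^{n} \in \mathbb{R}^{n \times n}$
   eigenvalues $\alpha_i \in \mathbb{R}$, eigenvectors $v_i \in \mathbb{R}^n$, size $n$

Eigendecomposition of $T \in L(\mathcal{Y})$
   Initialize $k$: number of eigenfunctions
   eigenvalues $\delta_i \in \mathbb{R}$, eigenfunctions $w_i \in L^2([0, 1])$, size $k$

Eigendecomposition of $\mathbf{K} = G \otimes T$
   $\mathbf{K} = (K(x_i, x_j))_{i,j=1}^{n} \in (L(\mathcal{Y}))^{n \times n}$
   eigenvalues $\theta_i \in \mathbb{R}$, eigenfunctions $z_i \in (L^2([0, 1]))^n$, size $n \times k$
   $\theta = \alpha \otimes \delta$, $z = v \otimes w$

Solution $\beta = (\mathbf{K} + \lambda I)^{-1} y$
   Initialize $\lambda$: regularization parameter
   $\beta = \sum_{i=1}^{n \times k} (\theta_i + \lambda)^{-1} \sum_{j=1}^{n} \langle z_{ij}, y_j \rangle z_i$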
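Algorithm 1 can be sketched on a grid by replacing the eigenfunctions of $T$ with eigenvectors of its discretization. The version below is a finite-dimensional reading under stated assumptions: functions are sampled at $m$ points, inner products are Euclidean on the grid values, and the name `operator_learning`, the random test data, and the truncation rule (keep the `k` largest eigenvalues of `T`) are illustrative choices.

```python
import numpy as np


def operator_learning(G, T, Y, lam, k=None):
    """beta = (kron(G, T) + lam I)^{-1} y through the eigendecompositions of G and T."""
    a, V = np.linalg.eigh(G)              # eigenvalues alpha_i, eigenvectors v_i of G
    d, W = np.linalg.eigh(T)              # eigenvalues delta_i, discretized eigenfunctions w_i of T
    if k is not None:                     # optionally keep the k leading eigenfunctions of T
        idx = np.argsort(d)[::-1][:k]
        d, W = d[idx], W[:, idx]
    # Eigenpairs of K = G (x) T: theta = alpha (x) delta, z = v (x) w.
    # C[a, b] = <z_ab, y> with z_ab = v_a (x) w_b, computed for all pairs at once.
    C = V.T @ Y @ W
    C = C / (np.outer(a, d) + lam)        # scale each coefficient by (theta_ab + lam)^{-1}
    return V @ C @ W.T                    # beta, shape (n, m): row i is beta_i on the grid


rng = np.random.default_rng(5)
n, m, lam = 6, 10, 0.1
A = rng.normal(size=(n, n))
G = A @ A.T                               # hypothetical PSD Gram matrix on the inputs
B = rng.normal(size=(m, m))
T = B @ B.T                               # hypothetical PSD discretized operator
Y = rng.normal(size=(n, m))               # hypothetical sampled responses y_i

beta = operator_learning(G, T, Y, lam)
beta_direct = np.linalg.solve(np.kron(G, T) + lam * np.eye(n * m), Y.ravel()).reshape(n, m)
beta_truncated = operator_learning(G, T, Y, lam, k=3)
```

With all eigenvectors kept the result matches the direct solve; with `k` smaller than `m` the discarded directions are dropped entirely, so `beta_truncated` is only an approximation.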


Applications - Speech inversion

[Figure: Acoustic-to-articulatory inversion. Speech production goes from the articulators (1: upper lip, 2: lower lip, 3: jaw, 4: tongue tip, 5: tongue body, 6: velum, 7: glottis) to the speech signal (amplitude vs. time in s); speech inversion goes the other way.]

▶ Speech inversion
   → learning the acoustic-to-articulatory mapping
   → from MFCC to vocal-tract time functions (VTTF)
   → improving speech technology and understanding
   → helping individuals with speech and hearing disorders


Applications - Speech inversion

[Figure: time functions of the tract variables LA, LP, TTCD, TTCL, TBCD, TBCL, VEL and GLO against time (s) for the words "beautiful", "conversation" and "smooth"]


Applications - Speech inversion

Tab. 2: Average RSSE for the tract variables

VT variables   ε-SVR     Multi-task   Functional
LA              2.763     2.341        1.562
LP              0.532     0.512        0.528
TTCD            3.345     1.975        1.647
TTCL            7.752     5.276        3.463
TBCD            2.155     2.094        1.582
TBCL           15.083     9.763        7.215
VEL             0.032     0.034        0.029
GLO             0.041     0.052        0.064
Total           3.962     2.755        2.011

▶ ε-SVR (Mitra et al., ICASSP 2009)

▶ Multi-task kernel (Kadri et al., ICASSP 2011)
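The Total row of Tab. 2 appears to be the unweighted mean of the eight per-variable errors. That reading is an assumption inferred from the numbers, not something the slide states; the short check below recomputes the column means from the table values.

```python
import numpy as np

# Rows: LA, LP, TTCD, TTCL, TBCD, TBCL, VEL, GLO
# Columns: epsilon-SVR, multi-task kernel, functional kernel (values copied from Tab. 2)
rsse = np.array([
    [2.763, 2.341, 1.562],
    [0.532, 0.512, 0.528],
    [3.345, 1.975, 1.647],
    [7.752, 5.276, 3.463],
    [2.155, 2.094, 1.582],
    [15.083, 9.763, 7.215],
    [0.032, 0.034, 0.029],
    [0.041, 0.052, 0.064],
])
reported_total = np.array([3.962, 2.755, 2.011])

column_means = rsse.mean(axis=0)                       # unweighted mean over the 8 variables
max_gap = float(np.max(np.abs(column_means - reported_total)))
```

The recomputed means agree with the reported totals to within rounding in the third decimal.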


Applications - Sound recognition

▶ Sound recognition
   → Surveillance and security applications


Applications - Sound recognition

▶ Feature extraction
   → temporal, spectral, cepstral, ... characteristics

[Figure: a sound signal (amplitude vs. time) and three feature trajectories extracted from it: evolution of the Zero Crossing Rate (ZCR), Spectral Roll-off (SRF), and 13 cepstral coefficients (MFCC)]
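Of the features shown, the zero crossing rate is the simplest to compute: it is the fraction of consecutive samples whose sign changes within a frame. The sketch below is one common formulation; the frame length, hop size, sampling rate, and test tones are hypothetical, since the slides do not give the extraction settings.

```python
import numpy as np


def zero_crossing_rate(signal, frame_len, hop):
    """Fraction of consecutive-sample sign changes in each frame of the signal."""
    negative = np.signbit(signal)
    rates = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = negative[start:start + frame_len]
        rates.append(np.mean(frame[1:] != frame[:-1]))
    return np.array(rates)


sr = 8000                                   # hypothetical sampling rate in Hz
t = np.arange(sr) / sr                      # one second of samples
low = np.sin(2.0 * np.pi * 100.0 * t)       # 100 Hz tone
high = np.sin(2.0 * np.pi * 1000.0 * t)     # 1000 Hz tone

zcr_low = zero_crossing_rate(low, frame_len=256, hop=128)
zcr_high = zero_crossing_rate(high, frame_len=256, hop=128)
```

The result is one value per frame, that is, a trajectory over time like the ZCR panel of the figure, and the higher-frequency tone gives the larger rate.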


Applications - Sound recognition

▶ Limitations - Multivariate data modeling
   → features contain discrete values of various parameters
   → feature vector $\in \mathbb{R}^{DP}$ by concatenating samples of different features

▶ Solution - Multivariate functional data modeling
   → modeling each audio signal by a vector of functions in $(L^2)^D$
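The contrast between the two representations can be made concrete. In the sketch below, $D$ feature trajectories of $P$ samples each are either flattened into one vector of $\mathbb{R}^{DP}$ or turned into $D$ functions by a least-squares fit on a basis. The Fourier basis, the number of harmonics, the synthetic trajectories, and the names `fourier_basis` and `evaluate` are assumptions; the slides do not say how the functions are built.

```python
import numpy as np

D, P = 3, 50                                   # hypothetical: D features, P samples each
t = np.linspace(0.0, 1.0, P)
rng = np.random.default_rng(4)
clean = np.stack([np.sin(2.0 * np.pi * (d + 1) * t) for d in range(D)])
features = clean + 0.05 * rng.normal(size=(D, P))

# Multivariate view: concatenate everything into a single vector of R^{DP}.
x_vec = features.reshape(D * P)


def fourier_basis(t, n_harmonics):
    """Design matrix with columns 1, cos(2 pi h t), sin(2 pi h t) for h = 1..n_harmonics."""
    cols = [np.ones_like(t)]
    for h in range(1, n_harmonics + 1):
        cols += [np.cos(2.0 * np.pi * h * t), np.sin(2.0 * np.pi * h * t)]
    return np.stack(cols, axis=1)


# Functional view: each trajectory becomes a function given by basis coefficients.
basis = fourier_basis(t, n_harmonics=4)                         # shape (P, 9)
coefs, *_ = np.linalg.lstsq(basis, features.T, rcond=None)      # shape (9, D)


def evaluate(t_new):
    """Evaluate the D fitted functions at arbitrary time points."""
    return (fourier_basis(t_new, 4) @ coefs).T                  # shape (D, len(t_new))


dense = evaluate(np.linspace(0.0, 1.0, 200))
```

The functional representation can be evaluated at any time point and keeps the ordering in time, which the concatenated vector discards.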


Applications - Sound recognition

Tab. 3: Classes of sounds and number of samples in the database used for performance evaluation.

Classes           Number   Train   Test   Total   Duration (s)
Human screams     C1       40      25     65      167
Gunshots          C2       36      19     55      97
Glass breaking    C3       48      25     73      123
Explosions        C4       41      21     62      180
Door slams        C5       50      25     75      96
Phone rings       C6       34      17     51      107
Children voices   C7       58      29     87      140
Machines          C8       40      20     60      184
Total                      327     181    508     18 mn 14 s


Applications - Sound recognition

Figure: Structural similarities between two different classes


Applications - Sound recognition

Figure: Structural diversity inside the same sound class and between classes


Applications - Sound recognition

Tab. 4: Confusion matrix obtained when using the Regularized Least Squares Classification (RLSC) algorithm (Rifkin et al., 2003)

      C1    C2    C3      C4    C5      C6     C7      C8
C1    92    4     4.76    0     5.27    11.3   6.89    0
C2    0     52    0       14    0       2.7    0       0
C3    0     20    76.2    0     0       0      17.24   5
C4    0     16    0       66    0       0      0       0
C5    4     8     0       4     84.21   0      6.8     0
C6    4     0     0       0     10.52   86     0       0
C7    0     0     0       8     0       0      69.07   0
C8    0     0     19.04   8     0       0      0       95

Total Recognition Rate = 77.56%


Applications - Sound recognition

Tab. 5: Confusion matrix obtained when using the Functional Regularized Least Squares algorithm

      C1     C2    C3     C4    C5      C6     C7     C8
C1    100    0     0      2     0       5.3    3.4    0
C2    0      82    0      8     0       0      0      0
C3    0      14    90.9   8     0       0      3.4    0
C4    0      4     0      78    0       0      0      0
C5    0      0     0      1     89.47   0      6.8    0
C6    0      0     0      0     10.53   94.7   0      0
C7    0      0     0      0     0       0      86.4   0
C8    0      0     9.1    3     0       0      0      100

Total Recognition Rate = 90.18%
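The reported total recognition rates are consistent with averaging the diagonal of each confusion matrix, that is, the mean of the eight per-class rates. This is an inference from the numbers rather than a definition given on the slides; the check below uses the values of Tab. 5 and the diagonal of Tab. 4.

```python
import numpy as np

# Tab. 5 (functional regularized least squares), entries in percent
tab5 = np.array([
    [100, 0, 0, 2, 0, 5.3, 3.4, 0],
    [0, 82, 0, 8, 0, 0, 0, 0],
    [0, 14, 90.9, 8, 0, 0, 3.4, 0],
    [0, 4, 0, 78, 0, 0, 0, 0],
    [0, 0, 0, 1, 89.47, 0, 6.8, 0],
    [0, 0, 0, 0, 10.53, 94.7, 0, 0],
    [0, 0, 0, 0, 0, 0, 86.4, 0],
    [0, 0, 9.1, 3, 0, 0, 0, 100],
])
# Diagonal of Tab. 4 (RLSC), entries in percent
diag_rlsc = np.array([92, 52, 76.2, 66, 84.21, 86, 69.07, 95])

rate_functional = float(np.mean(np.diag(tab5)))   # mean per-class rate, functional method
rate_rlsc = float(np.mean(diag_rlsc))             # mean per-class rate, RLSC
column_sums = tab5.sum(axis=0)                    # each column of Tab. 5 sums to 100
```

Because every column sums to 100, each column reads as the distribution of decisions for one class, and the diagonal entry is that class's recognition rate.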


Applications - Beyond Audio Processing

▶ Functional outputs - BCI
   [Figure: five input channels (Ch. 1 to Ch. 5) and the outputs "Finger Movement State" and "Finger Movement", all plotted against time samples]

▶ Structured outputs - Image, Text, Graph prediction
   [Figure: eight small image panels]

▶ Tensor outputs - Multilinear multitask

                          Jury 1   Jury 2   Jury 3
   Athlete performance
      technical score     ·        ·        ·
      artistic score      ·        ·        ·


Conclusion & Perspectives

▶ Conclusion
   → RKHS framework for functional data - Nonlinear FDA
   → FDA kernels
   → Audio and speech processing applications

▶ Perspectives
   → Mixed data (discrete, continuous, ...)
   → Learning the operator-valued kernel
   → Multilinear representation learning
