
Page 1 / 37 C4DM Seminar Roland Badeau

Wednesday, February 13, 2013

High resolution spectral analysis and nonnegative decompositions applied to music signal processing

Roland Badeau
Associate Professor, Telecom ParisTech
Visiting researcher at C4DM


Music representations

Spectrogram of "Au clair de la lune"

Musical score


Low-rank approximations

[Figure: a matrix written as a product of low-rank factors]


High Resolution (HR) spectral analysis

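[Figure: the waveform s(t) plotted against time t; its samples s(0), s(1), s(2), ... fill a data matrix whose successive rows are shifted by one sample, and this matrix is written as a product of factors]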


Nonnegative Matrix Factorization (NMF)

[Figure: NMF of a spectrogram. Panels: musical score; temporal activations (time axis in s, 0 to 10); spectral atoms (frequency axis in Hz, 0 to 10000); spectrogram]
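As a rough illustration of the decomposition pictured on this slide, the sketch below factorizes a nonnegative matrix V (standing in for a magnitude spectrogram) as V ≈ WH, where the columns of W play the role of spectral atoms and the rows of H the role of temporal activations. The slide does not name an algorithm at this point: the multiplicative updates for the Euclidean cost (Lee and Seung) are used here as an assumption, and the function name `nmf_multiplicative`, the matrix sizes and the random test data are all hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix V by W @ H with W, H >= 0.

    Uses multiplicative updates for the Euclidean cost ||V - W H||^2
    (an assumed choice; the slides do not specify the algorithm here).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # spectral atoms (columns)
    H = rng.random((rank, n_frames)) + eps  # temporal activations (rows)
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs are preserved.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic "spectrogram" built from 3 nonnegative atoms (illustrative data).
rng = np.random.default_rng(1)
W_true = rng.random((64, 3))
H_true = rng.random((3, 100))
V = W_true @ H_true

W, H = nmf_multiplicative(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.3e}")
```

With a real recording, V would be the magnitude (or power) of a short-time Fourier transform, and the rank would be chosen close to the number of distinct notes.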


Applications

Applications of High Resolution methods
  Spectral analysis (modal analysis, spectroscopy)
  Array processing (beamforming, direction of arrival (DOA) estimation)
  Digital communications (channel identification)

Applications of NMF
  Image analysis (face recognition)
  Text mining, spectroscopy, finance, etc.

Applications to audio signal processing
  Source separation, audio coding
  Pitch and tempo estimation, automatic transcription


Part I

High Resolution spectral analysis


Exponential Sinusoidal Model (ESM)

Real-valued model: $s(t) = \sum_{k=1}^{r} a_k\, e^{-\delta_k t} \cos(2\pi\nu_k t + \varphi_k)$
  $a_k \in \mathbb{R}_+^*$ and $\varphi_k \in\, ]-\pi, \pi]$ are the amplitude and phase
  $\delta_k \in \mathbb{R}$ and $\nu_k \in\, ]-\tfrac{1}{2}, \tfrac{1}{2}]$ are the damping factor and frequency

Complex-valued model: $s(t) = \sum_{k=1}^{r} \alpha_k z_k^{t}$
  $\alpha_k = a_k e^{i\varphi_k} \in \mathbb{C}^*$ is a complex amplitude
  $z_k = e^{-\delta_k + i 2\pi\nu_k} \in \mathbb{C}^*$ is a complex pole

Noisy model: $x(t) = s(t) + b(t)$, where $b(t)$ is a white Gaussian noise

Model estimation
  Data vector: $\mathbf{s}(t) = [s(t), \ldots, s(t+n-1)]^T$ with $n > r$
  Fourier analysis: spectral resolution of the order of $1/n$
  Subspace analysis: high spectral resolution
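A minimal numerical sketch of the complex-valued model, under assumed parameter values (the amplitudes, damping factors and frequencies below are illustrative and not taken from the talk). It synthesizes s(t) as a sum of r complex exponentials, stacks the data vectors as the columns of a matrix, and checks that this matrix has rank r, which is the property that subspace methods exploit. The helper names `esm_signal` and `data_matrix` are hypothetical.

```python
import numpy as np

def esm_signal(alphas, poles, n_samples):
    """Noiseless ESM: s(t) = sum_k alpha_k * z_k**t for t = 0..n_samples-1."""
    t = np.arange(n_samples)
    alphas = np.asarray(alphas)
    poles = np.asarray(poles)
    # poles[:, None] ** t has shape (r, n_samples); weight each row by alpha_k.
    return (alphas[:, None] * poles[:, None] ** t).sum(axis=0)

def data_matrix(s, n):
    """Matrix whose column t is the data vector [s(t), ..., s(t+n-1)]^T."""
    n_cols = len(s) - n + 1
    return np.array([s[i:i + n_cols] for i in range(n)])

# Illustrative parameters (assumptions): r = 2 damped components.
deltas = np.array([0.01, 0.02])                    # damping factors
nus = np.array([0.10, 0.11])                       # frequencies in ]-1/2, 1/2]
alphas = np.array([1.0, 0.5]) * np.exp(1j * np.array([0.3, -1.2]))
poles = np.exp(-deltas + 2j * np.pi * nus)

s = esm_signal(alphas, poles, n_samples=64)
S = data_matrix(s, n=8)

# Numerical rank: number of singular values above a relative threshold.
singular_values = np.linalg.svd(S, compute_uv=False)
rank = int(np.sum(singular_values > 1e-8 * singular_values[0]))
print("data matrix shape:", S.shape, "numerical rank:", rank)
```

Although the data vectors have dimension n = 8, they only ever explore r = 2 dimensions; adding the noise b(t) makes the remaining singular values small but nonzero.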


Subspace analysis

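[Figure (animation): for the signal $s(t) = \alpha_1 z_1^t + \alpha_2 z_2^t$, the successive data vectors $\mathbf{s}(t)$, $\mathbf{s}(t+1)$, $\mathbf{s}(t+2)$, of dimension $n$, all lie in the same signal subspace of dimension $r = 2$; the axes of this plane are labelled $z_1^t$ and $z_2^t$]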
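The geometric statement of this slide follows from a short derivation, sketched here in the notation of the ESM slide. The symbol $\mathbf{v}(z)$ for the Vandermonde vector is introduced for convenience and does not appear on the slides. Substituting the complex-valued model into the data vector gives:

```latex
\mathbf{s}(t)
  = \begin{bmatrix} s(t) \\ s(t+1) \\ \vdots \\ s(t+n-1) \end{bmatrix}
  = \sum_{k=1}^{r} \alpha_k z_k^{t}\, \mathbf{v}(z_k),
\qquad
\mathbf{v}(z) = [1, z, \ldots, z^{n-1}]^T .
```

Every data vector is therefore a linear combination of the same $r$ vectors $\mathbf{v}(z_1), \ldots, \mathbf{v}(z_r)$, which are linearly independent when the poles are distinct and $n \geq r$. Only the coefficients $\alpha_k z_k^t$ depend on $t$, so for $r = 2$ the vectors $\mathbf{s}(t)$, $\mathbf{s}(t+1)$, $\mathbf{s}(t+2)$ move within a fixed plane.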


Model estimation

Choose a window $(\gamma_\tau)_{\tau \in \mathbb{N}}$ (exponential, rectangular, hybrid)

Compute a "correlation" matrix
$$C_{xx}(t) = \sum_{\tau=0}^{t} \gamma_\tau\, \mathbf{x}(t-\tau)\, \mathbf{x}(t-\tau)^H$$

Estimate the model parameters

Processing chain: $C_{xx}(t) \rightarrow W_r(t) \rightarrow \Phi_r(t) \rightarrow z_k(t) \rightarrow \alpha_k(t)$
  Computation of signal subspace $W_r(t)$: diagonalisation of $C_{xx}(t)$
  Computation of spectral matrix $\Phi_r(t)$: ordinary least squares
  Computation of complex poles $z_k(t)$: diagonalisation of $\Phi_r(t)$
  Computation of complex amplitudes $\alpha_k(t)$: ordinary least squares
  Estimation of order $r$: AIC, MDL... criteria

ESPRIT method [Roy and Kailath, 1989]
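The processing chain of this slide can be sketched in a few lines of NumPy. This is a minimal illustration with a rectangular window and a known order r, not the speaker's implementation; the function name `esprit` and the test signal are assumptions.

```python
import numpy as np

def esprit(x, n, r):
    """Estimate r complex poles and amplitudes from the samples x.

    Sketch with a rectangular window; n is the data vector dimension.
    """
    x = np.asarray(x, dtype=complex)
    n_cols = len(x) - n + 1
    # Data matrix: column t is the data vector [x(t), ..., x(t+n-1)]^T.
    X = np.array([x[i:i + n_cols] for i in range(n)])
    # "Correlation" matrix C_xx = sum_t x(t) x(t)^H.
    C = X @ X.conj().T
    # Signal subspace W_r: the r principal eigenvectors of C_xx.
    _, eigvecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    W = eigvecs[:, -r:]
    # Spectral matrix Phi_r: least-squares solution of W_down @ Phi = W_up.
    Phi = np.linalg.lstsq(W[:-1], W[1:], rcond=None)[0]
    # Complex poles: eigenvalues of Phi_r.
    z = np.linalg.eigvals(Phi)
    # Complex amplitudes: least-squares fit of x(t) = sum_k alpha_k z_k**t.
    t = np.arange(len(x))
    V = z[None, :] ** t[:, None]              # Vandermonde matrix, shape (N, r)
    alpha = np.linalg.lstsq(V, x, rcond=None)[0]
    return z, alpha

# Illustrative noiseless test signal (assumed values): two damped sinusoids
# whose frequencies differ by 0.01, less than the Fourier resolution 1/N.
N = 64
t = np.arange(N)
true_z = np.exp(-np.array([0.01, 0.02]) + 2j * np.pi * np.array([0.10, 0.11]))
true_alpha = np.array([1.0, 0.5 * np.exp(1j * 0.7)])
x = (true_alpha[None, :] * true_z[None, :] ** t[:, None]).sum(axis=1)

z_hat, alpha_hat = esprit(x, n=32, r=2)
order = np.argsort(np.angle(z_hat))           # sort by increasing frequency
z_hat, alpha_hat = z_hat[order], alpha_hat[order]
freqs = np.angle(z_hat) / (2 * np.pi)         # estimated frequencies nu_k
damps = -np.log(np.abs(z_hat))                # estimated damping factors delta_k
print("frequencies:", freqs, "dampings:", damps)
```

The least-squares step for Phi_r relies on the rotational invariance of the signal subspace: dropping its first row or its last row gives two matrices that differ by a factor whose eigenvalues are the poles. In practice r is unknown and has to be estimated first, for example with the criteria listed on the slide.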


Subspace analysis

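[Figure (animation, continued): after the data vectors $\mathbf{s}(t)$, $\mathbf{s}(t+1)$, $\mathbf{s}(t+2)$, the poles change to $z'_1$ and $z'_2$, so that $s(t) = \alpha_1 {z'_1}^{t} + \alpha_2 {z'_2}^{t}$; the data vector $\mathbf{s}(t+3)$, of dimension $n$, lies in a new signal subspace of dimension $r = 2$]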



Time-frequency analysis

Processing chain: $x(t) \rightarrow W_r(t) \rightarrow \Phi_r(t) \rightarrow z_k(t) \rightarrow \alpha_k(t)$
  Tracking of the signal subspace: SWASVD [3], FAPI [2], YAST [4]
  Tracking of the spectral matrix and of the complex poles: Adaptive ESPRIT [5], HRHATRAC [6]
  Tracking of the complex amplitudes: Adaptive least squares [7]
  Estimation of order $r$: ESTER criterion [1]

[1] Roland Badeau, Bertrand David, and Gaël Richard. "A new perturbation analysis for signal enumeration in rotational invariance techniques". IEEE Transactions on Signal Processing, 54(2): 450-458, February 2006.
[2] Roland Badeau, Bertrand David, and Gaël Richard. "Fast Approximated Power Iteration Subspace Tracking". IEEE Transactions on Signal Processing, 53(8): 2931-2941, August 2005.
[3] Roland Badeau, Gaël Richard, and Bertrand David. "Sliding window adaptive SVD algorithms". IEEE Transactions on Signal Processing, 52(1): 1-10, January 2004.
[4] Roland Badeau, Gaël Richard, and Bertrand David. "Fast and stable YAST algorithm for principal and minor subspace tracking". IEEE Transactions on Signal Processing, 56(8): 3437-3446, August 2008.
[5] Roland Badeau, Gaël Richard, and Bertrand David. "Fast adaptive ESPRIT algorithm". In Proc. of IEEE Workshop on Statistical Signal Processing (SSP), Bordeaux, France, July 2005.
[6] Bertrand David, Roland Badeau, and Gaël Richard. "HRHATRAC Algorithm for Spectral Line Tracking of Musical Signals". In Proc. of IEEE ICASSP, volume 3, pages 45-48, Toulouse, France, May 2006.
[7] Bertrand David and Roland Badeau. "Fast sequential LS estimation for sinusoidal modeling and decomposition of audio signals". In Proc. of IEEE WASPAA, pages 211-214, New Paltz, New York, USA, October 2007.


Power iteration method

Power iteration method (recursive computation of $W_r(t)$)
  1) $C_{xy}(t) = C_{xx}\, W_r(t-1)$ (compression of $C_{xx}$)
  2) $W_r(t)\, R(t) = C_{xy}(t)$ (orthonormalisation of $C_{xy}(t)$)

$\mathrm{Span}(W_r(t))$ converges exponentially to the signal subspace
If 2) is an orthogonal-triangular (QR) decomposition, $W_r(t)$ converges to the $r$ principal eigenvectors of $C_{xx}$

Signal subspace tracking if $C_{xx}(t)$ is time-varying

Fast algorithm [Strobach, 1996] (complexity of $nr^2$ instead of $n^2 r$)
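The two steps of the power iteration can be sketched directly with NumPy, using a QR decomposition for step 2. This is a plain illustration with a fixed $C_{xx}$ (of the order of $n^2 r$ operations per iteration), not the fast algorithm of [Strobach, 1996]; the function name `power_iteration_subspace` and the test matrix are assumptions.

```python
import numpy as np

def power_iteration_subspace(C, r, n_iter=100, seed=0):
    """Orthogonal (power) iteration for the r principal eigenvectors of C."""
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    # Random orthonormal starting basis W_r(0).
    W, _ = np.linalg.qr(rng.standard_normal((n, r)))
    for _ in range(n_iter):
        C_xy = C @ W                  # 1) compression of C_xx
        W, _ = np.linalg.qr(C_xy)     # 2) orthonormalisation: W R = C_xy
    return W

# Symmetric test matrix with known eigenstructure (assumed values):
# eigenvalues 10 and 5 for the "signal" part, small ones for the rest.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
eigenvalues = np.array([10.0, 5.0, 0.5, 0.4, 0.3, 0.2])
C = (Q * eigenvalues) @ Q.T           # C = Q diag(eigenvalues) Q^T

W = power_iteration_subspace(C, r=2)

# Compare the projector onto span(W) with the true principal subspace.
P_est = W @ W.T
P_true = Q[:, :2] @ Q[:, :2].T
subspace_error = np.linalg.norm(P_est - P_true)
print(f"subspace error: {subspace_error:.2e}")
```

The convergence rate of the subspace is governed by the ratio between the (r+1)-th and the r-th eigenvalues (0.1 per iteration in this example). For tracking, $C_{xx}$ would be replaced by $C_{xx}(t)$ and a single iteration performed at each time step.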


Subspace tracking

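[Figure (animation): the basis vectors $\mathbf{w}_1(t)$ and $\mathbf{w}_2(t)$ follow the signal subspace of dimension $r = 2$ as the data vectors $\mathbf{s}(t)$, $\mathbf{s}(t+1)$, $\mathbf{s}(t+2)$, of dimension $n$, arrive; at $\mathbf{s}(t+3)$ the poles change to $z'_1$ and $z'_2$, so that $s(t) = \alpha_1 {z'_1}^{t} + \alpha_2 {z'_2}^{t}$, and the subspace changes]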


Natural power method

Natural power method1) Cxy (t) = Cxx(t)W r (t − 1) (compression of Cxx )

2) W r (t) = Cxy (t)(

Cxy (t)HCxy (t))−

12 (orthonormalisation of Cxy (t))

FAPI algorithm [1] (complexity of 3nr instead of nr2)reaches the complexity lower bound (3nr )converges faster than PAST and its variantsguarantees the orthonormality of W r (t) and the numerical stability

[1] Roland Badeau, Bertrand David, and Gaël Richard. "Fast Approximated Power Iteration Subspace Tracking". IEEE Transactionson Signal Processing, 53(8): 2931-2941, August 2005.

Page 53: High resolution spectral analysis and nonnegative ... · Array processing (beamforming, direction of arrival (DOA) estimation) Digital communications (channel identification) Applications

Page 17 / 37 C4DM Seminar Roland Badeau

Wednesday, February 13, 2013

Applications of High Resolution analysis

Analysis / Synthesis
- High resolution time-frequency representation
- Analysis of the sympathetic string modes in a concert harp
- Audio coding

Automatic transcription
- Pitch estimation of piano notes
- Musical tempo estimation

Other applications
- Channel estimation in digital communications
- Adaptive multilinear SVD for structured tensors

Decomposition of a piano sound

[Figure: three time-frequency panels of the piano sound (Spectrogram, HR-ogram, Residual); time axis 0 to 0.9 s, frequency axis 0 to 4500 Hz]

[1] Roland Badeau and Bertrand David. "Adaptive subspace methods for high resolution analysis of music signals". In Acoustics’08, Paris, France, July 2008.
[2] Bertrand David, Gaël Richard, and Roland Badeau. "An EDS modelling tool for tracking and modifying musical signals". In Proc. of Stockholm Music Acoustics Conference (SMAC), volume 2, pages 715-718, Stockholm, Sweden, August 2003.
[3] Roland Badeau, Rémy Boyer, and Bertrand David. "EDS parametric modeling and tracking of audio signals". In Proc. of the 5th International Conference on Digital Audio Effects (DAFx), pages 139-144, Hamburg, Germany, September 2002.

Decomposition of a violin sound

[Figure: three time-frequency panels of the violin sound (Spectrogram, HR-ogram, Residual); time axis 0 to 0.5 s, frequency axis 0 to 8000 Hz]

[1] Roland Badeau and Bertrand David. "Adaptive subspace methods for high resolution analysis of music signals". In Acoustics’08, Paris, France, July 2008.
[2] Bertrand David, Gaël Richard, and Roland Badeau. "An EDS modelling tool for tracking and modifying musical signals". In Proc. of Stockholm Music Acoustics Conference (SMAC), volume 2, pages 715-718, Stockholm, Sweden, August 2003.
[3] Roland Badeau, Rémy Boyer, and Bertrand David. "EDS parametric modeling and tracking of audio signals". In Proc. of the 5th International Conference on Digital Audio Effects (DAFx), pages 139-144, Hamburg, Germany, September 2002.

Sinusoids and noise separation

Principle: projection onto the signal or the noise subspace [1,2]

[Table of audio examples: for each instrument (piano, guitar, violin, flute, saxophone, bell), the original sound, its sinusoidal part and its noise part]
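The projection principle can be illustrated with a small NumPy sketch. It assumes one common construction that the slide does not spell out: a Hankel data matrix is built from the frame, the signal subspace is taken as the span of its leading left singular vectors, and the projected matrix is mapped back to a signal by averaging its anti-diagonals. The function name, the frame sizes and the synthetic test signal are hypothetical.

```python
import numpy as np

def split_sinusoids_noise(x, n_rows, rank):
    """Split x into a signal-subspace part and the remaining noise part."""
    n_cols = len(x) - n_rows + 1
    # Hankel data matrix: entry (i, j) is x[i + j]
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    data = x[idx]
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    w = u[:, :rank]                       # orthonormal basis of the signal subspace
    projected = w @ (w.conj().T @ data)   # projection onto the signal subspace
    # Average the anti-diagonals to return to a one-dimensional signal
    sinusoids = np.zeros(len(x), dtype=projected.dtype)
    counts = np.zeros(len(x))
    np.add.at(sinusoids, idx, projected)
    np.add.at(counts, idx, 1.0)
    sinusoids /= counts
    return sinusoids, x - sinusoids       # the remainder is taken as the noise part

# Synthetic example: two damped real sinusoids (signal subspace of dimension 4) plus noise
rng = np.random.default_rng(0)
t = np.arange(256)
clean = np.exp(-0.005 * t) * (np.cos(2 * np.pi * 0.05 * t)
                              + 0.5 * np.cos(2 * np.pi * 0.12 * t))
noisy = clean + 0.05 * rng.standard_normal(t.size)
sinusoids, noise = split_sinusoids_noise(noisy, n_rows=32, rank=4)
```

In this sketch the rank is supplied by hand (two real damped sinusoids give four complex exponentials); in practice the model order has to be estimated.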

[1] Roland Badeau, Rémy Boyer, and Bertrand David. "EDS parametric modeling and tracking of audio signals". In Proc. of the 5th International Conference on Digital Audio Effects (DAFx), pp. 139-144, Hamburg, Germany, September 2002.
[2] Bertrand David, Gaël Richard, and Roland Badeau. "An EDS modelling tool for tracking and modifying musical signals". In Proc. of Stockholm Music Acoustics Conference (SMAC), volume 2, pp. 715-718, Stockholm, Sweden, August 2003.

Drum separation and beat estimation

Drum source separation [1] (audio examples)
- Original (Aerosmith)
- Separated drums
- Remix, more drums
- Remix, less drums

Beat tracking [2] (audio examples)
- Pink Floyd
- Brad Mehldau

[1] Olivier Gillet and Gaël Richard. "Transcription and separation of drum signals from polyphonic music". IEEE Transactions on Audio, Speech, and Language Processing, 16(3): 529-540, March 2008.
[2] Miguel Alonso Arevalo, Roland Badeau, Bertrand David, and Gaël Richard. "Musical tempo estimation using noise subspace projections". In Proc. of IEEE WASPAA, pages 95-98, New Paltz, New York, USA, October 2003.

Sympathetic string modes in a concert harp

Modelling sympathetic string modes in a concert harp [1]

[Figures: experimental protocol; physical model]

[1] Jean-Loïc Le Carrou, François Gautier, and Roland Badeau. "Sympathetic string modes in the concert harp". Acta Acustica united with Acustica, 95(4): 744-752, July/August 2009.

Audio coding

Parametric coder based on the ESM model [1] → exponential modulations

[Block diagram of the coder, with blocks: input signal, subband decomposition, attack detection, temporal segmentation, estimation of the subband ESM model parameters, quantisation, entropy coding]

Joint scalar quantisation with entropy constraint [2,3]

[1] Olivier Derrien, Gaël Richard, and Roland Badeau. "Damped sinusoids and subspace based approach for lossy audio coding". In Acoustics’08, Paris, France, July 2008.
[2] Olivier Derrien, Roland Badeau, and Gaël Richard. "Entropy-constrained quantization of exponentially damped sinusoids parameters". In Proc. of IEEE ICASSP, Prague, Czech Republic, May 2011.
[3] Olivier Derrien, Roland Badeau, and Gaël Richard. "Calculation of an entropy-constrained quantizer for exponentially damped sinusoids parameters". Technical report, Laboratoire de Mécanique et d’Acoustique, Marseille, France, June 2010.

Audio examples: original sound, then coded versions at the following bitrates (bits per sample, bits/spl):

MDCT          ESM
9 bits/spl    8.9 bits/spl
8 bits/spl    -
7 bits/spl    6.8 bits/spl
6 bits/spl    6.4 bits/spl
5 bits/spl    4.7 bits/spl
4 bits/spl    4.4 bits/spl
3 bits/spl    3.2 bits/spl
2 bits/spl    2.1 bits/spl

Part II

Nonnegative decompositions

Nonnegative Matrix Factorization (NMF)

[Figure: musical score and its spectrogram V (time 0 to 10 s, frequency 0 to 10000 Hz), shown together with the spectral atoms W and the temporal activations H of the decomposition]

β-divergence and multiplicative rules

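Minimisation of the criterion $D(V|WH) = \sum_{n=1}^{N} \sum_{f=1}^{F} d(v_{fn}|\hat{v}_{fn})$, where $\hat{v}_{fn} = [WH]_{fn}$

β-divergence [Eguchi and Kano, 2001]:

$d_\beta(a|b) = \frac{1}{\beta(\beta-1)} \left( a^{\beta} + (\beta-1)\,b^{\beta} - \beta\,a\,b^{\beta-1} \right)$

- β = 2 corresponds to the Euclidean distance (EUC),
- β = 1 corresponds to the Kullback-Leibler divergence (KL), obtained as a limit,
- β = 0 corresponds to the Itakura-Saito divergence (IS), obtained as a limit,
- $d_\beta(a|b)$ is convex w.r.t. b if and only if β ∈ [1,2].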
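As a quick numerical check of the three special cases, here is a minimal NumPy sketch of the β-divergence. The function name is an assumption, and the KL and IS branches use the standard closed forms of the limits β → 1 and β → 0; inputs are assumed strictly positive.

```python
import numpy as np

def beta_divergence(a, b, beta):
    """Element-wise beta-divergence d_beta(a | b) for strictly positive a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if beta == 1:                       # Kullback-Leibler (limit beta -> 1)
        return a * np.log(a / b) - a + b
    if beta == 0:                       # Itakura-Saito (limit beta -> 0)
        return a / b - np.log(a / b) - 1
    return (a**beta + (beta - 1) * b**beta
            - beta * a * b**(beta - 1)) / (beta * (beta - 1))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 1.0])
print(beta_divergence(a, b, 2))   # equals 0.5 * (a - b)**2
print(beta_divergence(a, b, 1))
print(beta_divergence(a, b, 0))
```

With this normalisation, β = 2 gives half the squared Euclidean distance.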

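Multiplicative update rules [Kompass, 2007]:

$W \leftarrow W \otimes \frac{\left(V \otimes (WH)^{\beta-2}\right) H^T}{(WH)^{\beta-1}\,H^T}$

$H \leftarrow H \otimes \frac{W^T \left(V \otimes (WH)^{\beta-2}\right)}{W^T (WH)^{\beta-1}}$

(⊗, the fraction bars and the powers are element-wise operations.)

$D(V|WH)$ is non-increasing if and only if β ∈ [1,2].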
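The update rules translate almost line by line into NumPy. The sketch below is illustrative only: the random initialisation, the small constant `eps` that guards against division by zero, and the synthetic rank-3 "spectrogram" are assumptions that do not come from the slides.

```python
import numpy as np

def nmf_multiplicative(v, rank, beta=1.0, n_iter=200, seed=0, eps=1e-12):
    """Factorise a nonnegative matrix V as W H with the multiplicative rules."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = v.shape
    w = rng.random((n_freq, rank)) + eps
    h = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        wh = w @ h + eps
        # W <- W * [(V * (WH)^(beta-2)) H^T] / [(WH)^(beta-1) H^T]
        w *= ((v * wh ** (beta - 2)) @ h.T) / (wh ** (beta - 1) @ h.T + eps)
        wh = w @ h + eps
        # H <- H * [W^T (V * (WH)^(beta-2))] / [W^T (WH)^(beta-1)]
        h *= (w.T @ (v * wh ** (beta - 2))) / (w.T @ wh ** (beta - 1) + eps)
    return w, h

# Toy "spectrogram": an exactly rank-3 nonnegative matrix (synthetic data)
rng = np.random.default_rng(1)
v = rng.random((40, 3)) @ rng.random((3, 60))
w, h = nmf_multiplicative(v, rank=3, beta=1.0)
relative_error = np.linalg.norm(v - w @ h) / np.linalg.norm(v)
print(f"relative reconstruction error: {relative_error:.3f}")
```

Because every factor in the updates is nonnegative, W and H stay nonnegative without any explicit projection step.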

Stability of multiplicative update rules

Introduction of an exponentiation step η into the NMF multiplicative update rules designed for minimizing the β-divergence [1]:

$W \leftarrow W \otimes \left( \frac{\left(V \otimes (WH)^{\beta-2}\right) H^T}{(WH)^{\beta-1}\,H^T} \right)^{\eta}$  (1)

$H \leftarrow H \otimes \left( \frac{W^T \left(V \otimes (WH)^{\beta-2}\right)}{W^T (WH)^{\beta-1}} \right)^{\eta}$  (2)

- Monotonic decrease of the criterion if β ∈ [1, 2] and η ∈ ]0, 1]
- Exponential or asymptotic stability of both rules (1) and (2) if η ∈ ]0, η⋆[, where η⋆ ∈ ]0, 2] for all β ∈ R, and η⋆ = 2 if β ∈ [1, 2]
- Instability if η ∉ [0, 2], for all β ∈ R
- The step η makes it possible to control the convergence rate
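A hedged way to experiment with the step η is to raise the usual update ratios to the power η and record the criterion after each iteration. In the sketch below, the function names, the initialisation and the synthetic data are assumptions; only the update formulas (1) and (2) are taken from the slide.

```python
import numpy as np

def beta_cost(v, v_hat, beta):
    """Total beta-divergence D(V | V_hat), summed over all entries."""
    if beta == 1:
        return np.sum(v * np.log(v / v_hat) - v + v_hat)
    if beta == 0:
        return np.sum(v / v_hat - np.log(v / v_hat) - 1)
    return np.sum(v**beta + (beta - 1) * v_hat**beta
                  - beta * v * v_hat**(beta - 1)) / (beta * (beta - 1))

def nmf_with_step(v, rank, beta, eta, n_iter=100, seed=0):
    """Multiplicative updates (1) and (2), each ratio raised to the power eta."""
    rng = np.random.default_rng(seed)
    w = rng.random((v.shape[0], rank)) + 0.1
    h = rng.random((rank, v.shape[1])) + 0.1
    costs = [beta_cost(v, w @ h, beta)]
    for _ in range(n_iter):
        wh = w @ h
        w *= (((v * wh**(beta - 2)) @ h.T) / (wh**(beta - 1) @ h.T)) ** eta
        wh = w @ h
        h *= ((w.T @ (v * wh**(beta - 2))) / (w.T @ wh**(beta - 1))) ** eta
        costs.append(beta_cost(v, w @ h, beta))
    return w, h, np.array(costs)

rng = np.random.default_rng(1)
v = rng.random((30, 40)) + 0.1            # strictly positive synthetic data
for eta in (0.5, 1.0, 1.5):
    _, _, costs = nmf_with_step(v, rank=4, beta=1.5, eta=eta)
    print(f"eta = {eta}: criterion after 100 iterations = {costs[-1]:.4f}")
```

Comparing the recorded `costs` curves for several values of η gives a direct, if informal, view of how the step affects the convergence rate.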

[1] Roland Badeau, Nancy Bertin, and Emmanuel Vincent. "Stability analysis of multiplicative update algorithms and application to nonnegative matrix factorization". IEEE Transactions on Neural Networks, vol. 21, no. 12, pp. 1869-1881, December 2010.

Avoiding local minima

The three divergences EUC, KL, and IS have local minima [1]

[Figure: the criterion D(V|WH) plotted as a function of (W, H)]

[1] Nancy Bertin and Roland Badeau. "Initialization, distances and local minima in audio applications of the nonnegative matrix factorization". In Acoustics’08, Paris, France, July 2008.

Strategies for initialising the algorithm [1]
- Failure of algorithms from automatic classification

"Simulated cooling" algorithm for IS-NMF [2]
- Parameter β becomes a function of the iteration index p:

[Figure: β(p) plotted against the iteration index p; the β axis is marked 0, 1, 2 and the convexity area of $d_\beta$ is indicated]

The best transcription is not obtained by finding the lowest minimum of the criterion, but rather by constraining the decomposition
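To make the idea of an iteration-dependent β concrete, here is a small sketch. The slide does not give the actual schedule, so the linear decrease from β = 2 (inside the convexity area) to β = 0 (Itakura-Saito) is purely an assumption, as are the function names, the iteration counts and the test data.

```python
import numpy as np

def beta_schedule(p, n_temper, beta_start=2.0, beta_end=0.0):
    """Hypothetical linear schedule: beta_start at p = 0, beta_end from p = n_temper on."""
    if p >= n_temper:
        return beta_end
    return beta_start + (beta_end - beta_start) * p / n_temper

def tempered_is_nmf(v, rank, n_iter=300, n_temper=200, seed=0, eps=1e-12):
    """Multiplicative updates in which beta follows beta_schedule(p)."""
    rng = np.random.default_rng(seed)
    w = rng.random((v.shape[0], rank)) + 0.1
    h = rng.random((rank, v.shape[1])) + 0.1
    for p in range(n_iter):
        beta = beta_schedule(p, n_temper)
        wh = w @ h + eps
        w *= ((v * wh**(beta - 2)) @ h.T) / (wh**(beta - 1) @ h.T)
        wh = w @ h + eps
        h *= (w.T @ (v * wh**(beta - 2))) / (w.T @ wh**(beta - 1))
    return w, h

rng = np.random.default_rng(1)
v = rng.random((30, 40)) + 0.1            # strictly positive synthetic data
w, h = tempered_is_nmf(v, rank=4)
```

The last 100 iterations run at β = 0, so the procedure ends as plain IS-NMF started from the point reached by the tempered phase.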

[1] Nancy Bertin and Roland Badeau. "Initialization, distances and local minima in audio applications of the nonnegative matrix factorization". In Acoustics’08, Paris, France, July 2008.
[2] Nancy Bertin, Cédric Févotte, and Roland Badeau. "A tempering approach for Itakura-Saito nonnegative matrix factorization. With application to music transcription". In Proc. of IEEE ICASSP, pages 1545-1548, Taipei, Taiwan, April 2009.

Harmonicity and spectral smoothness

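Model [1]: $v_{fn} = \sum_{k=1}^{K} w_{fk}(\theta)\,h_{kn}$ where $w_{fk}(\theta) = \sum_{m=1}^{M} e_{mk}\,P_{km}(f)$

[Figure: the patterns $P_{k1}(f), \ldots, P_{kM}(f)$ are multiplied by $e_{1k}, \ldots, e_{Mk}$ and summed to form $w_{fk}$]

- $P_{km}(f)$ is a predefined harmonic spectral pattern
- $e_{mk}$ and $h_{kn}$ are estimated by means of a multiplicative algorithm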
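The construction of the atoms from fixed patterns is a weighted sum, sketched below. The way the patterns are built here (Gaussian lobes at the partials of each pitch, two partials per pattern) is a made-up stand-in for the predefined patterns of [1], and all names and sizes are hypothetical.

```python
import numpy as np

def build_atoms(patterns, weights):
    """w[f, k] = sum over m of weights[m, k] * patterns[k, m, f]."""
    return np.einsum("mk,kmf->fk", weights, patterns)

# Hypothetical patterns: Gaussian lobes at the partials of each pitch,
# grouped into M patterns of consecutive partials.
n_freq, n_pitch, n_patterns, partials_per_pattern = 512, 3, 4, 2
freqs = np.arange(n_freq)
f0_bins = np.array([20.0, 25.0, 30.0])        # fundamental frequencies in bins (made up)
patterns = np.zeros((n_pitch, n_patterns, n_freq))
for k, f0 in enumerate(f0_bins):
    for m in range(n_patterns):
        for j in range(partials_per_pattern):
            partial = (m * partials_per_pattern + j + 1) * f0
            patterns[k, m] += np.exp(-0.5 * ((freqs - partial) / 1.5) ** 2)

rng = np.random.default_rng(0)
weights = rng.random((n_patterns, n_pitch))   # e_mk, nonnegative
w = build_atoms(patterns, weights)            # shape (n_freq, n_pitch)
```

Only the weights are free parameters, so each atom stays harmonic by construction while its spectral envelope can adapt.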

[1] Emmanuel Vincent, Nancy Bertin, and Roland Badeau. "Adaptive harmonic spectral decomposition for multiple pitch estimation". IEEE Transactions on Audio, Speech, and Language Processing, 18(3): 528-537, March 2010.

Temporal smoothness

MAP estimator: $C(\Theta) = L(\Theta) + \log p(\Theta)$, where $\Theta = \{e_{mk}, h_{kn}\}$

Markov chain structured a priori distribution:

$p(H) = \prod_{k=1}^{K} p(h_{k1}) \prod_{n=2}^{N} p(h_{kn}|h_{k(n-1)})$

where $p(h_{kn}|h_{k(n-1)})$ follows an inverse-Gamma distribution

SAGE algorithm [1] and multiplicative update rules [2] with η = 0.5
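The effect of such a prior can be checked numerically. The sketch below evaluates log p(H) under two assumptions that the slide does not state: the initial term p(h_k1) is taken flat, and the inverse-Gamma transition uses a shape parameter α with scale (α + 1) h_k(n-1), so that its mode sits at the previous activation. Function names and test sequences are hypothetical.

```python
import math
import numpy as np

def log_inverse_gamma(x, shape, scale):
    """Log-density of the inverse-Gamma distribution IG(shape, scale) at x > 0."""
    return (shape * np.log(scale) - math.lgamma(shape)
            - (shape + 1) * np.log(x) - scale / x)

def log_markov_prior(h, shape=10.0):
    """Sum over k and n >= 2 of log p(h_kn | h_k(n-1)); p(h_k1) is taken flat."""
    previous, current = h[:, :-1], h[:, 1:]
    # Assumed parametrisation: the scale places the mode of h_kn at h_k(n-1)
    return float(np.sum(log_inverse_gamma(current, shape, (shape + 1) * previous)))

smooth = np.ones((1, 50))                    # constant activation
rough = np.tile([0.5, 1.5], 25)[None, :]     # activation jumping at every frame
print(log_markov_prior(smooth), log_markov_prior(rough))
```

Under these assumptions the smooth sequence receives the larger log-prior, which is the behaviour the temporal smoothness constraint is meant to encourage.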

[1] Nancy Bertin, Roland Badeau, and Emmanuel Vincent. "Enforcing Harmonicity and Smoothness in Bayesian Nonnegative Matrix Factorization Applied to Polyphonic Music Transcription". IEEE Transactions on Audio, Speech, and Language Processing, 18(3): 538-549, March 2010.
[2] Nancy Bertin, Roland Badeau, and Emmanuel Vincent. "Fast Bayesian NMF algorithms enforcing harmonicity and temporal continuity in polyphonic music transcription". In Proc. of IEEE WASPAA, pages 29-32, New York, USA, October 2009.

NMF-based automatic transcription

Algorithm [1,2]

Input signal → Time-frequency representation → Nonnegative decomposition → Transcription → MIDI file

W → Estimation of MIDI pitch

H → Detection of the attacks and ends of notes
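The two post-processing stages of this pipeline can be sketched in Python as follows. This is an illustration under stated assumptions, not the method of [1,2]: it assumes a fundamental frequency has already been estimated for each column of W, and it detects attacks and ends of notes by simple thresholding of each row of H. The helper names, the threshold value, and the toy inputs are all hypothetical.

```python
import math

def hz_to_midi(f0_hz):
    """Nearest MIDI note number for a fundamental frequency (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(f0_hz / 440.0)))

def note_events(activation, threshold):
    """Return (onset_frame, offset_frame) pairs where activation > threshold.

    offset_frame is the first frame at which the note is no longer active.
    """
    events, onset = [], None
    for n, value in enumerate(activation):
        if value > threshold and onset is None:
            onset = n
        elif value <= threshold and onset is not None:
            events.append((onset, n))
            onset = None
    if onset is not None:
        events.append((onset, len(activation)))
    return events

# Hypothetical inputs: one fundamental per column of W, one row of H per template.
f0_per_template = [261.63, 440.0]
H = [[0.0, 0.9, 0.8, 0.1, 0.0, 0.7],
     [0.0, 0.0, 0.6, 0.7, 0.0, 0.0]]

# Each entry is (MIDI pitch, onset frame, offset frame).
transcription = [(hz_to_midi(f0), on, off)
                 for f0, row in zip(f0_per_template, H)
                 for on, off in note_events(row, threshold=0.5)]
print(transcription)
```

The resulting list of (pitch, onset, offset) triples carries the information a MIDI file needs once frame indices are converted to time.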

Demo

Original signal (Liszt):

Transcribed signal:

[1] Nancy Bertin, Roland Badeau, and Gaël Richard. "Blind signal decompositions for automatic transcription of polyphonic music: NMF and K-SVD on the benchmark". In Proc. of IEEE ICASSP, volume 1, pages 65-68, Honolulu, Hawaii, USA, April 2007.
[2] Nancy Bertin, Roland Badeau, and Emmanuel Vincent. "Enforcing Harmonicity and Smoothness in Bayesian Nonnegative Matrix Factorization Applied to Polyphonic Music Transcription". IEEE Transactions on Audio, Speech, and Language Processing, 18(3): 538-549, March 2010.


Time-frequency activations

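Model [1]: $v_{fn} = \sum_{k=1}^{K} w_{fk}\, h_{kn}(f)$ where $h_{kn}(f) = \sigma_{kn}^{2}\, \dfrac{\left| 1 + \sum_{q=1}^{Q} b_{kn}^{(q)} e^{-i 2\pi \nu_f q} \right|^{2}}{\left| 1 + \sum_{p=1}^{P} a_{kn}^{(p)} e^{-i 2\pi \nu_f p} \right|^{2}}$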
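The frequency-dependent activation above is the power response of an ARMA filter, and a short Python sketch can make that explicit. It evaluates $h_{kn}(f)$ for a single component and frame; the function name and the coefficient values in the example are hypothetical, and `nu` is assumed to be the normalized frequency $\nu_f$ in cycles per sample.

```python
import cmath
import math

def arma_activation(nu, sigma2, b=(), a=()):
    """ARMA power response: sigma2 * |1 + sum_q b_q z^q|^2 / |1 + sum_p a_p z^p|^2,
    with z = exp(-i 2 pi nu).

    b holds the MA coefficients for lags 1..Q and a holds the AR coefficients
    for lags 1..P.
    """
    def poly(coeffs):
        return 1 + sum(c * cmath.exp(-2j * math.pi * nu * lag)
                       for lag, c in enumerate(coeffs, start=1))
    return sigma2 * abs(poly(b)) ** 2 / abs(poly(a)) ** 2

# Example with one MA and one AR coefficient, i.e. an ARMA model of order (1,1).
for nu in (0.0, 0.1, 0.25, 0.5):
    print(nu, arma_activation(nu, sigma2=1.0, b=[0.5], a=[-0.9]))
```

With empty coefficient lists the activation reduces to the constant $\sigma_{kn}^{2}$, which recovers a standard NMF activation that does not depend on frequency.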

[Figure: Jew's harp signal decomposed with an ARMA model of order (1,1). Panels: (a) Original spectrogram (time in seconds, frequency in kHz); (b) Spectral form (frequency in kHz, amplitude in dB); (c) Time-frequency activation (time in seconds, frequency in kHz).]

[1] Romain Hennequin, Roland Badeau, and Bertrand David. "NMF with time-frequency activations to model non-stationary audio events". IEEE Transactions on Audio, Speech, and Language Processing, 19(4): 744-753, May 2011.


Fundamental frequency variations

Model [1]: $v_{fn} = \sum_{k=1}^{K} w_{fk}(\nu^{0}_{kn})\, h_{kn}$ where $w_{fk}(\nu^{0}_{kn}) = \sum_{h=1}^{n_h(\nu^{0}_{kn})} a_h\, g(\nu_f - h\, \nu^{0}_{kn})$
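The parametric harmonic template can be sketched in Python as below. The slide does not specify the lobe function $g$ or the rule for the number of partials $n_h$, so both are assumptions here: $g$ is taken to be a Gaussian bump, and $n_h$ keeps the partials that fit below the top of the frequency grid. The function name, the lobe width, and the amplitudes are hypothetical.

```python
import math

def harmonic_template(freqs_hz, f0_hz, amplitudes, lobe_width_hz=20.0):
    """Sum over partials h of a_h * g(nu_f - h * nu0), evaluated on a frequency grid."""
    def g(delta_hz):
        # Assumed lobe shape: a Gaussian bump centred on each partial.
        return math.exp(-0.5 * (delta_hz / lobe_width_hz) ** 2)

    # Assumed rule for n_h: keep the partials that fit below the top of the grid.
    n_h = min(len(amplitudes), int(max(freqs_hz) // f0_hz))
    return [sum(amplitudes[h - 1] * g(f - h * f0_hz) for h in range(1, n_h + 1))
            for f in freqs_hz]

freqs = [10.0 * i for i in range(501)]  # 0 to 5 kHz in 10 Hz steps
template = harmonic_template(freqs, f0_hz=440.0, amplitudes=[1.0, 0.5, 0.25])
print(template[44], template[88], template[132])  # values at 440, 880 and 1320 Hz
```

Because the template is a function of the fundamental frequency, one atom can follow a pitch that changes from frame to frame instead of requiring one fixed spectrum per note.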

[Figure: Decomposition of an excerpt of J.S. Bach's first prelude. Panels: Original spectrogram (time in windows, frequency in kHz); Temporal activations (time in windows, pitch in semitones).]

[1] Romain Hennequin, Roland Badeau, and Bertrand David. "Time-dependent parametric and harmonic templates in nonnegative matrix factorization". In Proc. of DAFx, Graz, Austria, September 2010.


Score-based informed source separation

Algorithm [1]

Round Midnight (Thelonious Monk):

[1] Romain Hennequin, Bertrand David, and Roland Badeau. "Score informed audio source separation using a parametric model of non-negative spectrogram". In Proc. of IEEE ICASSP, Prague, Czech Republic, May 2011.


Conclusions

Nonstationary signal modelling
  Adaptive high resolution methods
  Nonnegative decompositions enforcing harmonicity and smoothness

Applications to audio and music signals
  Source separation, audio coding
  Pitch and tempo estimation, automatic transcription

Outlook
  Is it possible to merge HR methods and NMF in some way?
  To be continued in an upcoming seminar (March 6)
