Online Divergence Switching for Superresolution-Based
Nonnegative Matrix Factorization
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
Yu Takahashi, Kazunobu Kondo (Yamaha Corporation, Japan)
Hirokazu Kameoka (The University of Tokyo, Japan)
2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing, Speech Analysis (2), 2PM2-2
Outline
• 1. Research background
• 2. Conventional methods
  – Nonnegative matrix factorization
  – Supervised nonnegative matrix factorization
  – Directional clustering
  – Hybrid method
• 3. Proposed method
  – Online divergence switching for hybrid method
• 4. Experiments
• 5. Conclusions
Research background
• Music signal separation technologies have received much attention.
  – Applications: automatic music transcription, 3D audio systems, etc.
• Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area.
• The separation performance of supervised NMF (SNMF) markedly degrades when the mixture contains many sources.
• We have proposed a new hybrid separation method for stereo music signals.
Research background
• Our proposed hybrid method
  – Input stereo signal (L, R) → spatial separation method (directional clustering) → SNMF-based separation method (superresolution-based SNMF) → separated signal
Research background
• The optimal divergence criterion in superresolution-based SNMF depends on the spatial conditions of the input signal.
• Our aim in this presentation: we propose a new optimal separation scheme for this hybrid method that separates the target signal with high accuracy under any spatial condition.
NMF [Lee, et al., 2001]
• NMF
  – is a sparse representation algorithm.
  – can extract significant features from the observed matrix.
• Model: observed matrix (amplitude spectrogram, frequency × time) ≈ basis matrix (spectral patterns, frequency × basis) × activation matrix (time-varying gains, basis × time)
• Dimensions: number of frequency bins, number of time frames, and number of bases.
Optimization in NMF
• The variable matrices (the basis matrix and the activation matrix) are optimized by minimizing the divergence between the observed matrix and their product.
• Euclidean distance (EUC-distance) and Kullback-Leibler divergence (KL-divergence) are often used as the divergence in the cost function.
• In NMF-based separation, a KL-divergence-based cost function achieves high separation performance.
Cost function: the divergence between each entry of the observed matrix and the corresponding entry of the product of the variable matrices, summed over all entries.
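The optimization above is commonly carried out with the multiplicative update rules of Lee and Seung. The following is a minimal pure-Python sketch under that assumption; the names `Y` (observed matrix), `F` (basis matrix), `G` (activation matrix), and all function names are chosen for illustration and are not notation recovered from the slides.

```python
import math
import random

EPS = 1e-12  # keeps divisions and logarithms finite


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def cost(Y, F, G, div):
    """Entry-wise divergence between Y and FG, summed over all entries."""
    total = 0.0
    for y_row, x_row in zip(Y, matmul(F, G)):
        for y, x in zip(y_row, x_row):
            if div == "euc":
                total += (y - x) ** 2
            else:  # generalized KL-divergence
                total += y * math.log((y + EPS) / (x + EPS)) - y + x
    return total


def _num_den(Y, X, div):
    """Matrices that enter the numerator and denominator of the updates."""
    if div == "euc":
        return Y, X
    ratio = [[y / (x + EPS) for y, x in zip(yr, xr)] for yr, xr in zip(Y, X)]
    ones = [[1.0] * len(row) for row in Y]
    return ratio, ones


def _scale(M, num, den):
    """Entry-wise M * num / den."""
    return [[m * n / (d + EPS) for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, num, den)]


def nmf_step(Y, F, G, div):
    """One multiplicative update of F followed by one of G."""
    N, D = _num_den(Y, matmul(F, G), div)
    F = _scale(F, matmul(N, transpose(G)), matmul(D, transpose(G)))
    N, D = _num_den(Y, matmul(F, G), div)
    G = _scale(G, matmul(transpose(F), N), matmul(transpose(F), D))
    return F, G


def random_matrix(rows, cols, rng):
    return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
Y = random_matrix(6, 8, rng)  # toy stand-in for a spectrogram
for div in ("euc", "kl"):
    F, G = random_matrix(6, 3, rng), random_matrix(3, 8, rng)
    before = cost(Y, F, G, div)
    for _ in range(50):
        F, G = nmf_step(Y, F, G, div)
    print(div, "cost before:", before, "after:", cost(Y, F, G, div))
```

Plain lists keep the sketch dependency-free; an array library would be the usual choice for real spectrograms.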
SNMF [Smaragdis, et al., 2007]
• SNMF utilizes some sample sounds of the target.
  – Construct the trained basis matrix of the target sound.
  – Decompose the mixed signal into the target signal and the other signal.
• Training process: sample sounds of the target signal (e.g., a musical scale) → supervised basis matrix (spectral dictionary)
• Separation process: the supervised basis matrix is fixed and the remaining matrices are optimized, which splits the mixed signal into the target signal and the other signal.
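The separation process can be sketched as NMF in which the supervised bases stay fixed while everything else is updated. The example below assumes KL-divergence multiplicative updates; the names `F` (supervised bases), `G` (their activations), `H` and `U` (bases and activations for the other signal), and `snmf_separate` are illustrative assumptions, not the authors' implementation.

```python
import random

EPS = 1e-12  # avoids division by zero


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def ratio(Y, X):
    """Entry-wise Y / X, the factor that drives the KL updates."""
    return [[y / (x + EPS) for y, x in zip(yr, xr)] for yr, xr in zip(Y, X)]


def snmf_separate(Y, F, n_other, n_iter=200, seed=0):
    """Split Y into a target part F G (F fixed) and a remainder H U."""
    rng = random.Random(seed)
    n_freq, n_time, n_sup = len(Y), len(Y[0]), len(F[0])
    H = [[rng.random() + 0.1 for _ in range(n_other)] for _ in range(n_freq)]
    A = [[rng.random() + 0.1 for _ in range(n_time)]
         for _ in range(n_sup + n_other)]  # stacked activations [G; U]
    for _ in range(n_iter):
        W = [f_row + h_row for f_row, h_row in zip(F, H)]  # [F H]
        # Update all activations: A <- A * (W^T R) / (W^T 1)
        num = matmul(transpose(W), ratio(Y, matmul(W, A)))
        den = [sum(col) for col in zip(*W)]
        A = [[a * n / (d + EPS) for a, n in zip(a_row, n_row)]
             for a_row, n_row, d in zip(A, num, den)]
        # Update only the free bases: H <- H * (R U^T) / (1 U^T); F is untouched
        U = A[n_sup:]
        num = matmul(ratio(Y, matmul(W, A)), transpose(U))
        den = [sum(u_row) for u_row in U]
        H = [[h * n / (d + EPS) for h, n, d in zip(h_row, n_row, den)]
             for h_row, n_row in zip(H, num)]
    G, U = A[:n_sup], A[n_sup:]
    return matmul(F, G), matmul(H, U)


# Toy data: the target uses two supervised bases, the interference does not.
F = [[1.0, 0.0], [0.5, 0.1], [0.0, 1.0], [0.1, 0.6]]
G_true = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
other = [[0.2, 0.2, 0.0], [0.0, 0.0, 0.0], [0.3, 0.3, 0.0], [0.9, 0.9, 0.0]]
Y = [[t + o for t, o in zip(t_row, o_row)]
     for t_row, o_row in zip(matmul(F, G_true), other)]
target, rest = snmf_separate(Y, F, n_other=1)
print(target)
```

The training step that would produce `F` from sample sounds is omitted here; ordinary NMF on the sample spectrogram is one way to obtain it.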
Problem of SNMF
• The separation performance of SNMF markedly degrades when many interference sources exist.
[Figure: separation in the two-source case and in the five-source case, with residual components marked.]
Directional clustering [Araki, et al., 2007]
• Directional clustering
  – utilizes differences between channels as a separation cue.
  – is equivalent to binary masking in the spectrogram domain.
• Problems
  – Cannot separate sources in the same direction.
  – Artificial distortion arises owing to the binary masking.
[Figure: a stereo input signal (L, R) with sources at left, center, and right. Each time-frequency bin of the spectrogram is labeled L, C, or R. The separated signal (here, the center) is the entry-wise product of the spectrogram and a binary mask that is 1 for bins labeled C and 0 elsewhere. Axes: time, frequency.]
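The binary-masking view of directional clustering can be illustrated with a small sketch. It is a simplified stand-in: bins are labeled by a fixed amplitude-ratio threshold between the channels rather than by the clustering of Araki et al., and the function names and the `ratio` parameter are hypothetical.

```python
def direction_labels(left, right, ratio=1.5):
    """Label each time-frequency bin 'L', 'C', or 'R' from channel amplitudes."""
    labels = []
    for l_row, r_row in zip(left, right):
        row = []
        for l, r in zip(l_row, r_row):
            if l > ratio * r:
                row.append("L")
            elif r > ratio * l:
                row.append("R")
            else:
                row.append("C")
        labels.append(row)
    return labels


def binary_mask(labels, target):
    """1 where the bin belongs to the target direction, 0 elsewhere."""
    return [[1 if lab == target else 0 for lab in row] for row in labels]


def apply_mask(spectrogram, mask):
    """Entry-wise product of a spectrogram and a binary mask."""
    return [[s * m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(spectrogram, mask)]


left = [[1.0, 0.2], [0.5, 0.9]]    # rows: frequency bins, columns: time frames
right = [[0.9, 1.0], [0.5, 0.1]]
labels = direction_labels(left, right)
mask = binary_mask(labels, "C")
print(labels)                       # [['C', 'R'], ['C', 'L']]
print(apply_mask(left, mask))       # bins outside the center are zeroed
```

The zeroed bins are the holes that the later slides call chasms, and `mask` plays the role of the index matrix used there.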
Hybrid method [D. Kitamura, et al., 2013]
• We have proposed a new SNMF called superresolution-based SNMF and a hybrid method built on it.
• The hybrid method consists of directional clustering and superresolution-based SNMF.
  – Stereo input (L, R) → directional clustering (spatial separation) → superresolution-based SNMF (spectral separation)
Superresolution-based SNMF
• This SNMF reconstructs the spectrogram obtained from directional clustering using supervised basis extrapolation.
[Figure: input spectrogram (target direction and other direction) → directional clustering → separated cluster with chasms → superresolution-based SNMF → reconstructed spectrogram. Axes: time, frequency.]
Superresolution-based SNMF
• Spectral chasms arise in the separated cluster owing to directional clustering.
• Treat these chasms as unseen observations and extrapolate the fittest supervised bases.
Superresolution-based SNMF
[Figure: frequency of source components versus direction (left, center, right): (a) input signal, with the target marked; (b) after directional clustering (binary masking); (c) after superresolution-based SNMF, with the extrapolated components marked.]
[Figure: observed spectrogram (target and interference) → directional clustering → separated cluster → superresolution-based SNMF extrapolates with the supervised spectral bases → reconstructed data. Axes: time, frequency.]
Decomposition model and cost function
• The divergence is defined at all grids except for the chasms by using the index matrix.
Decomposition model: the separated cluster is approximated by the supervised bases (fixed) with their activations, plus other bases with their activations.
Cost function: divergence term restricted by the index matrix + regularization term + penalty term
  – Index matrix: obtained from directional clustering.
  – Notation used in the cost function: weighting parameters, the binary complement, and the Frobenius norm.
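How the index matrix restricts the divergence can be shown with a short sketch. It covers only the divergence term and leaves out the regularization and penalty terms; `masked_divergence` and its argument names are hypothetical, with `index` assumed to be 1 at observed grids and 0 at chasms.

```python
import math

EPS = 1e-12  # keeps the logarithm finite when an entry is zero


def masked_divergence(Y, X, index, div="kl"):
    """Sum the divergence between Y and X over grids where index is 1."""
    total = 0.0
    for y_row, x_row, i_row in zip(Y, X, index):
        for y, x, i in zip(y_row, x_row, i_row):
            if i == 0:  # chasm: nothing was observed, so nothing is compared
                continue
            if div == "euc":
                total += (y - x) ** 2
            else:  # generalized KL-divergence
                total += y * math.log((y + EPS) / (x + EPS)) - y + x
    return total


Y = [[1.0, 0.0], [2.0, 0.0]]  # masked spectrogram, zeros in the chasms
X = [[1.0, 5.0], [2.0, 3.0]]  # model output, free to fill the chasms
index = [[1, 0], [1, 0]]
print(masked_divergence(Y, X, index, "euc"))   # 0.0: the chasm column is ignored
```

Because the chasm column contributes nothing, the model is not pushed toward zero there, which is what lets the supervised bases extrapolate into the chasms.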
Update rules
• We can obtain the update rules for optimizing the three variable matrices.
Consideration for optimal divergence
• Separation performance of conventional SNMF: KL-divergence > EUC-distance
• However, for superresolution-based SNMF: KL-divergence > EUC-distance?
  – The optimal divergence depends on the amount of spectral chasms.
Consideration for optimal divergence
• Superresolution-based SNMF has two tasks: signal separation and basis extrapolation.
• Abilities of each divergence

                 Signal separation   Basis extrapolation
KL-divergence    Very good           Poor
EUC-distance     Good                Good
Consideration for optimal divergence
• A spectrum decomposed by NMF with KL-divergence tends to be sparser than one decomposed by NMF with EUC-distance.
• A sparse basis is not suitable for extrapolation from the observable data.
[Figure: example basis spectra obtained with KL-divergence and with EUC-distance. Axes: frequency [kHz] (0 to 5), amplitude [dB] (-10 to 0).]
Consideration for optimal divergence
• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms because of the trade-off between separation and extrapolation abilities.
[Figure: performance versus sparseness, from sparse (KL-divergence, strong sparseness) to anti-sparse (EUC-distance, weak sparseness). Curves: separation, extrapolation, and total performance. Insets: basis spectra, frequency [kHz] versus amplitude [dB].]
Consideration for optimal divergence
• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms.
  – If there are many chasms: the extrapolation ability is required, so EUC-distance should be used.
  – If there are few or no chasms: the separation ability is required, so KL-divergence should be used.
Hybrid method for online input data
• When we consider applying the hybrid method to online input data, directional clustering applies the binary mask to the observed spectrogram as it arrives, which yields an online binary-masked spectrogram.
Hybrid method for online input data
• We divide the online spectrogram into blocks along the time axis.
• Superresolution-based SNMF is applied to each block in parallel.
Online divergence switching
• We calculate the rate of chasms in each block and compare it with a threshold value.
  – Rate above the threshold value (there are many chasms): superresolution-based SNMF with EUC-distance
  – Rate below the threshold value (there are few chasms): superresolution-based SNMF with KL-divergence
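The switching rule can be sketched in a few lines. The slides do not give the block length, the threshold value, or the exact definition of the chasm rate, so this example assumes the rate is the fraction of zero entries of the index matrix within a block and assigns a rate exactly equal to the threshold to KL-divergence; all names and numbers are hypothetical.

```python
def split_blocks(index_matrix, block_len):
    """Split a frequency-by-time index matrix into consecutive time blocks."""
    n_frames = len(index_matrix[0])
    return [[row[start:start + block_len] for row in index_matrix]
            for start in range(0, n_frames, block_len)]


def chasm_rate(block):
    """Fraction of grids in the block that are chasms (index value 0)."""
    total = sum(len(row) for row in block)
    chasms = sum(1 for row in block for value in row if value == 0)
    return chasms / total


def choose_divergence(block, threshold):
    """Many chasms call for extrapolation (EUC); few call for separation (KL)."""
    return "euc" if chasm_rate(block) > threshold else "kl"


index_matrix = [[1, 1, 0, 0],      # 1: observed grid, 0: chasm
                [1, 0, 0, 0]]
for block in split_blocks(index_matrix, block_len=2):
    print(chasm_rate(block), choose_divergence(block, threshold=0.5))
```

Each block would then be handed to superresolution-based SNMF with the chosen divergence; that step is outside this sketch.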
Procedure of proposed method
Experimental conditions
• We used stereo-panning signals.
• Each mixture consists of four instruments generated by a MIDI synthesizer.
• We used the same type of MIDI sounds as the target instruments as supervision for the training process.
• Supervision sound: two-octave notes that cover all the notes of the target signal.
[Figure: panning layout from left to right with sources 1 to 4; the target source is marked.]
Experimental conditions
• We compared three methods.
  – Hybrid method using only EUC-distance-based SNMF (Conventional method 1)
  – Hybrid method using only KL-divergence-based SNMF (Conventional method 2)
  – Proposed hybrid method that switches the divergence to the optimal one (Proposed method)
• We used the signal-to-distortion ratio (SDR) as the evaluation score.
  – SDR indicates the total separation accuracy, which includes both the quality of the separated target signal and the degree of separation.
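To give a feel for the SDR scale, the sketch below computes a simplified ratio: the energy of the reference signal over the energy of the residual, in decibels. This is an assumption for illustration only. It omits the decomposition into target, interference, and artifact components that BSS-Eval-style SDR uses, and the slides do not say which implementation produced the reported scores.

```python
import math


def simple_sdr(reference, estimate):
    """Energy of the reference over energy of the residual, in dB."""
    signal = sum(s * s for s in reference)
    distortion = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10(signal / distortion)  # undefined for a perfect estimate


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]   # 10 % amplitude error
print(simple_sdr(reference, estimate))   # about 20 dB
```

Higher values mean the separated signal is closer to the reference.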
Experimental result
• Average SDR scores for each method, where the four instruments are shuffled in 12 combinations.
• The proposed method outperforms the other methods.
[Figure: bar chart of average SDR [dB] (axis from 8.0 to 9.8; higher is better) for Conventional method 1, Conventional method 2, and the Proposed method.]
Conclusions
• We propose a new divergence switching scheme for superresolution-based SNMF.
• The method separates online input signals using the optimal divergence in NMF.
• The proposed method can be used under any spatial condition of the sources and separates the target signal with high accuracy.
Thank you for your attention!