Onset Detection
Onset
◦ The beginning of a musical note or other sound
◦ The amplitude rises from zero to an initial peak
Onset Detection Function
◦ Function peaks coincide with onsets
◦ Various methods exist
Short-Time Fourier Transform
$$X(n,k) = \sum_{m=-N/2}^{N/2-1} x(hn+m)\, w(m)\, e^{-\frac{2j\pi mk}{N}}$$
◦ $x$: signal
◦ $w$: Hamming window
◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $h$: hop size
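The STFT definition above can be sketched directly as a naive double loop. This is only an illustration, not an efficient implementation: the names `hamming` and `stft_frame` are hypothetical, the window uses the common 0.54/0.46 Hamming coefficients, and samples outside the signal are assumed to be zero.

```python
import cmath
import math

def hamming(N):
    """N-point Hamming window; entry i corresponds to m = i - N/2."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (N - 1)) for i in range(N)]

def stft_frame(x, n, N, h, w):
    """Return X(n, k) for k = -N/2 .. N/2-1; entry i corresponds to k = i - N/2."""
    half = N // 2
    frame = []
    for k in range(-half, half):
        acc = 0j
        for m in range(-half, half):
            idx = h * n + m
            # Zero padding outside the signal is an assumption of this sketch.
            sample = x[idx] if 0 <= idx < len(x) else 0.0
            acc += sample * w[m + half] * cmath.exp(-2j * math.pi * m * k / N)
        frame.append(acc)
    return frame
```

For a sinusoid that completes two cycles per frame (N = 8), the largest magnitudes appear at k = ±2, as expected.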
Onset Detection Function
Energy
◦ Occurrence of an onset ←→ Increase of the signal's amplitude
$$E(n) = \frac{1}{N} \sum_{m=-N/2}^{N/2-1} [x(n+m)]^2\, w(m)$$
◦ $n$: frame index
◦ $N$: frame size
◦ $w$: $N$-point window (smoothing kernel)
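A minimal sketch of the local energy formula, written exactly as the sum reads, so `n` is used as the centre position in samples. The function name `local_energy` and the zero padding at the signal edges are assumptions.

```python
def local_energy(x, n, w):
    """E(n) for an N-point window w; entry i of w corresponds to m = i - N/2."""
    N = len(w)
    half = N // 2
    total = 0.0
    for m in range(-half, half):
        idx = n + m
        # Samples outside the signal are treated as zero (assumption).
        sample = x[idx] if 0 <= idx < len(x) else 0.0
        total += sample * sample * w[m + half]
    return total / N
```

With a step signal (eight zeros followed by eight ones) and a 4-point rectangular window, the energy rises from 0 before the step, to 0.5 at the step, to 1 after it.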
Onset Detection Function
Energy
◦ Spectral domain energy, computed from the STFT frame $X(n,k)$
$$\tilde{E}(n) = \frac{1}{N} \sum_{k=-N/2}^{N/2-1} W(k)\, |X(n,k)|^2$$
◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $W(k)$: frequency dependent weighting
High-Frequency Content
◦ Weights each bin's contribution in proportion to its frequency
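The weighted spectral energy can be sketched as below. The choice W(k) = |k| for high-frequency content is an assumption taken as the simplest weighting proportional to frequency; the function names are hypothetical, and `frame` is assumed to hold X(n, k) for k = -N/2 .. N/2-1.

```python
def spectral_energy(frame, weight=lambda k: 1.0):
    """Weighted energy of one STFT frame; entry i corresponds to k = i - N/2."""
    N = len(frame)
    half = N // 2
    return sum(weight(k) * abs(frame[k + half]) ** 2 for k in range(-half, half)) / N

def high_frequency_content(frame):
    """Spectral energy with W(k) = |k| (assumed linear weighting)."""
    return spectral_energy(frame, weight=lambda k: abs(k))
```

Energy concentrated at k = 0 contributes nothing to the high-frequency content, while the same energy at the highest bin is weighted most heavily.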
Onset Detection Function
Spectral Flux
◦ Measures the change in magnitude in each frequency bin between successive frames $X(n-1,k)$ and $X(n,k)$
$$SF(n) = \sum_{k=-N/2}^{N/2-1} H\big(|X(n,k)| - |X(n-1,k)|\big)$$
◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $H(x) = \frac{x + |x|}{2}$: half-wave rectifier function
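Spectral flux takes only a few lines once two successive frames are available. In this sketch the half-wave rectifier is written as H(x) = (x + |x|)/2, and the function names are hypothetical.

```python
def half_wave(x):
    """Half-wave rectifier: x for x > 0, otherwise 0."""
    return (x + abs(x)) / 2

def spectral_flux(prev_frame, cur_frame):
    """Sum of the positive magnitude changes between two frames."""
    return sum(half_wave(abs(c) - abs(p)) for p, c in zip(prev_frame, cur_frame))
```

For magnitudes [1, 2, 3] followed by [2, 1, 5], only the two increases count, giving 1 + 0 + 2 = 3.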
Onset Detection Function
Spectral Difference
◦ Distance between successive short-term Fourier spectra $X(n-1,k)$ and $X(n,k)$
$$SD(n) = \sum_{k=-N/2}^{N/2-1} \Big\{ H\big(|X(n,k)| - |X(n-1,k)|\big) \Big\}^2$$
◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $H$: half-wave rectifier function
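The spectral difference squares each rectified magnitude change before summing, which emphasises large increases. A minimal sketch with a hypothetical function name:

```python
def spectral_difference(prev_frame, cur_frame):
    """Sum of squared positive magnitude changes between two frames."""
    total = 0.0
    for p, c in zip(prev_frame, cur_frame):
        change = abs(c) - abs(p)
        if change > 0:  # half-wave rectification
            total += change ** 2
    return total
```

For magnitudes [1, 2, 3] followed by [2, 1, 5], the result is 1 + 0 + 4 = 5.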
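Onset Detection Function
Phase
$$X(n,k) = |X(n,k)|\, e^{j\psi(n,k)}$$
◦ Phase: $\psi(n,k)$
◦ Instantaneous frequency: $\psi'(n,k) = \psi(n,k) - \psi(n-1,k)$
◦ Change in instantaneous frequency: $\psi''(n,k) = \psi'(n,k) - \psi'(n-1,k)$
Phase Deviation
$$PD(n) = \frac{1}{N} \sum_{k=-N/2}^{N/2-1} |\psi''(n,k)|$$
◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size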
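Phase deviation needs three consecutive frames, because the second phase difference is ψ(n,k) - 2ψ(n-1,k) + ψ(n-2,k). In this sketch the result is wrapped to [-π, π), which is an assumption about how phase is unwrapped; the function names are hypothetical.

```python
import cmath
import math

def princarg(phi):
    """Wrap an angle to the interval [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi

def phase_deviation(frame_n2, frame_n1, frame_n):
    """Mean absolute second phase difference over frames n-2, n-1, n."""
    total = 0.0
    for a, b, c in zip(frame_n2, frame_n1, frame_n):
        psi2 = princarg(cmath.phase(c) - 2 * cmath.phase(b) + cmath.phase(a))
        total += abs(psi2)
    return total / len(frame_n)
```

A bin whose phase advances steadily (0, 0.5, 1.0) gives a deviation of 0, while a jump (0, 0.5, 2.0) gives 1.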
Onset Detection Function
Weighted Phase Deviation
◦ Considers magnitude and phase jointly
◦ Significant improvement over unweighted phase deviation
$$WPD(n) = \frac{1}{N} \sum_{k=-N/2}^{N/2-1} |X(n,k)\, \psi''(n,k)|$$
Normalized Weighted Phase Deviation
$$NWPD(n) = \frac{\sum_{k=-N/2}^{N/2-1} |X(n,k)\, \psi''(n,k)|}{\sum_{k=-N/2}^{N/2-1} |X(n,k)|}$$
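Both weighted variants can be sketched from three consecutive frames. The second phase difference is taken as ψ(n,k) - 2ψ(n-1,k) + ψ(n-2,k) wrapped to [-π, π), and the zero-denominator guard `eps` is an addition of this sketch rather than part of the formula; the function names are hypothetical.

```python
import cmath
import math

def _psi2(a, b, c):
    """Wrapped second phase difference for one bin over frames n-2, n-1, n."""
    phi = cmath.phase(c) - 2 * cmath.phase(b) + cmath.phase(a)
    return (phi + math.pi) % (2 * math.pi) - math.pi

def weighted_phase_deviation(frame_n2, frame_n1, frame_n):
    """Phase deviation with each bin weighted by its magnitude."""
    terms = [abs(c) * abs(_psi2(a, b, c)) for a, b, c in zip(frame_n2, frame_n1, frame_n)]
    return sum(terms) / len(frame_n)

def normalized_weighted_phase_deviation(frame_n2, frame_n1, frame_n, eps=1e-12):
    """Weighted phase deviation divided by the total magnitude of frame n."""
    num = sum(abs(c) * abs(_psi2(a, b, c)) for a, b, c in zip(frame_n2, frame_n1, frame_n))
    den = sum(abs(c) for c in frame_n)
    return num / den if den > eps else 0.0  # guard against silent frames (assumption)
```

With one bin of magnitude 3 and second phase difference 1, and one bin of magnitude 1 with steady phase, WPD is 3/2 = 1.5 and NWPD is 3/4 = 0.75.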
Onset Detection Function
Complex Domain
◦ Considers amplitude and phase jointly
◦ Target $X_T(n,k)$: predicted by assuming constant amplitude and constant rate of phase change
$$X_T(n,k) = |X(n-1,k)|\, e^{j\left(\psi(n-1,k) + \psi'(n-1,k)\right)}$$
◦ Sum of absolute deviations from the target
$$CD(n) = \sum_{k=-N/2}^{N/2-1} |X(n,k) - X_T(n,k)|$$
◦ $n$: frame index; $k$: frequency bin index
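A sketch of the complex domain function, assuming the phase increment is the first difference ψ'(n-1,k) = ψ(n-1,k) - ψ(n-2,k), so the predicted phase is 2ψ(n-1,k) - ψ(n-2,k). The function names are hypothetical.

```python
import cmath

def target_frame(frame_n2, frame_n1):
    """Predict X_T(n, k) from frames n-2 and n-1 (constant amplitude, constant phase rate)."""
    target = []
    for a, b in zip(frame_n2, frame_n1):
        predicted_phase = 2 * cmath.phase(b) - cmath.phase(a)
        target.append(cmath.rect(abs(b), predicted_phase))
    return target

def complex_domain(frame_n2, frame_n1, frame_n):
    """Sum of absolute deviations between the observed frame and its prediction."""
    target = target_frame(frame_n2, frame_n1)
    return sum(abs(c - t) for c, t in zip(frame_n, target))
```

A stationary bin matches its prediction and contributes 0. If the magnitude doubles from 1 to 2 with the phase still on track, the deviation is 1.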
Onset Detection Function
Rectified Complex Domain
◦ CD does not distinguish between increases and decreases in amplitude
◦ Onsets versus offsets
$$RCD(n) = \sum_{k=-N/2}^{N/2-1} RCD(n,k)$$
$$RCD(n,k) = \begin{cases} |X(n,k) - X_T(n,k)|, & \text{if } |X(n,k)| \ge |X(n-1,k)| \\ 0, & \text{otherwise} \end{cases}$$
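The rectified variant only counts bins whose magnitude did not decrease. This sketch assumes the same target prediction as the complex domain function, with predicted phase 2ψ(n-1,k) - ψ(n-2,k); the function name is hypothetical.

```python
import cmath

def rectified_complex_domain(frame_n2, frame_n1, frame_n):
    """Complex domain deviation summed only over bins with non-decreasing magnitude."""
    total = 0.0
    for a, b, c in zip(frame_n2, frame_n1, frame_n):
        if abs(c) >= abs(b):
            # Target assumes constant amplitude and constant rate of phase change.
            target = cmath.rect(abs(b), 2 * cmath.phase(b) - cmath.phase(a))
            total += abs(c - target)
    return total
```

A magnitude increase from 1 to 2 contributes 1, while a decrease from 1 to 0.5 (an offset) contributes nothing.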
Onset Detection Function
Kullback-Leibler Divergence
◦ A measure of the information lost when $Q$ is used to approximate $P$
$$D_{KL}(P \parallel Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}$$
◦ $P$: true distribution
◦ $Q$: model or approximation of $P$
Kullback-Leibler
◦ Highlights positive amplitude changes
$$D_{KL}(n) = \sum_{k=-N/2}^{N/2-1} |X(n,k)| \log \frac{|X(n,k)|}{|X(n-1,k)|}$$
◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
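Both forms of the divergence can be sketched with the standard library. The sketch assumes the previous-frame magnitudes (and Q) are strictly positive and treats 0·log 0 as 0; the function names are hypothetical.

```python
import math

def kl_divergence(P, Q):
    """D_KL(P || Q) for two discrete distributions given as lists."""
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)

def kl_onset(prev_mag, cur_mag):
    """Kullback-Leibler onset function from two frames of spectral magnitudes."""
    return sum(c * math.log(c / p) for p, c in zip(prev_mag, cur_mag) if c > 0)
```

Identical distributions give a divergence of 0. For magnitudes [1, 1] followed by [2, 1], only the first bin contributes, giving 2·log 2.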
Onset Detection Function
Modified Kullback-Leibler
◦ Removes the $|X(n,k)|$ weighting
$$D_{MKL}(n) = \sum_{k=-N/2}^{N/2-1} \log \frac{|X(n,k)|}{|X(n-1,k)|}$$
Rectified MKL (1)
$$D_{MKL}(n) = \sum_{k=-N/2}^{N/2-1} d(n,k)$$
$$d(n,k) = \begin{cases} \log \dfrac{|X(n,k)|}{|X(n-1,k)|}, & \text{if } |X(n,k)| \ge |X(n-1,k)| \\ 0, & \text{otherwise} \end{cases}$$
Rectified MKL (2)
$$D_{MKL}(n) = \sum_{k=-N/2}^{N/2-1} \log \left( 1 + \frac{|X(n,k)|}{|X(n-1,k)| + \epsilon} \right)$$
◦ Avoids negative values
◦ Defined when a series of small values is encountered
◦ Prevents peaks at offsets
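The three variants differ only in the per-bin term, which a short sketch makes visible. The first two functions assume strictly positive magnitudes; the default `eps=1e-6` is an arbitrary value chosen for illustration, and the function names are hypothetical.

```python
import math

def mkl(prev_mag, cur_mag):
    """Modified KL: log-ratio of magnitudes without the |X(n,k)| weighting."""
    return sum(math.log(c / p) for p, c in zip(prev_mag, cur_mag))

def rectified_mkl_1(prev_mag, cur_mag):
    """Rectified MKL (1): only bins whose magnitude did not decrease contribute."""
    return sum(math.log(c / p) for p, c in zip(prev_mag, cur_mag) if c >= p)

def rectified_mkl_2(prev_mag, cur_mag, eps=1e-6):
    """Rectified MKL (2): log(1 + ratio), non-negative and defined for zero magnitudes."""
    return sum(math.log(1 + c / (p + eps)) for p, c in zip(prev_mag, cur_mag))
```

For magnitudes [1, 2] followed by [2, 1], the unrectified sum cancels to 0, whereas variant (1) keeps only the increase and returns log 2. Variant (2) stays defined even when both frames are silent.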
References
Bello, J. P., Daudet, L., Abdallah, S., Duxbury, C., Davies, M., & Sandler, M. B. (2005). A tutorial on onset detection in music signals. IEEE Transactions on Speech and Audio Processing, 13(5), 1035-1047.
Dixon, S. (2006, September). Onset detection revisited. In Proceedings of the 9th International Conference on Digital Audio Effects (Vol. 120, pp. 133-137).
Hainsworth, S., & Macleod, M. (2003, September). Onset detection in musical audio signals. In Proc. Int. Computer Music Conference (pp. 163-6).
Brossier, P. M. (2006). Automatic annotation of musical audio for interactive applications (Doctoral dissertation).