[ieee 2010 data compression conference - snowbird, ut, usa (2010.03.24-2010.03.26)] 2010 data...
Analysis of Amplitude Quantization in ACELP Excitation Coding
Wisarn Patchoo¹, Thomas R. Fischer¹, Changho Ahn², and Sangwon Kang²
¹School of EECS, Washington State University, Pullman, WA 99163-2752, USA
²School of Elec. Eng. and Comp. Science, Hanyang Univ., Ansan, Korea
Algebraic Code-Excited Linear Prediction (ACELP) is a popular linear prediction speech coding algorithm that provides good performance with reasonable implementation complexity and requires a low transmission bit rate (e.g., [1]). The excitation sequence is formed as the sum of two quantized excitations: an adaptive (pitch) codebook excitation and an algebraic (fixed) codebook excitation. Algebraic codevectors are sparse. A sub-frame of length L is partitioned into K interleaved tracks of pulse positions, and m total non-zero pulse positions are partitioned into $m_k$ non-zero pulse positions in track k, with $m = \sum_{k=0}^{K-1} m_k$. The algebraic codebook is a product code of the form gc, where g is a (quantized) sub-frame gain and c is the sparse, L-dimensional codevector with m non-zero amplitudes restricted to the values ±1. The algebraic codevector is used to excite a synthesis filter with impulse response h(n), forming the fixed-codebook excitation contribution to the synthesized speech sub-frame, gHc, where H is a lower-triangular matrix formed from h(n) [1].
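The codevector structure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the codec of [1]: the sub-frame length, the track count, the pulse placement, the impulse response `h`, and the gain `g` are all assumed values chosen for readability, and the helper names are hypothetical.

```python
import numpy as np

def track_positions(L, K, k):
    """Positions of interleaved track k in a sub-frame of length L: k, k+K, k+2K, ..."""
    return np.arange(k, L, K)

def build_codevector(L, positions, signs):
    """Sparse L-dimensional codevector with +/-1 pulses at the given positions."""
    c = np.zeros(L)
    for p, s in zip(positions, signs):
        c[p] += s  # pulses that share a position add up
    return c

def impulse_response_matrix(h):
    """Lower-triangular Toeplitz matrix with H[n, j] = h[n - j] for n >= j."""
    L = len(h)
    H = np.zeros((L, L))
    for j in range(L):
        H[j:, j] = h[:L - j]
    return H

# Assumed sizes: L = 40 samples, K = 4 tracks, one pulse per track (m = 4).
L, K = 40, 4
positions = [int(track_positions(L, K, k)[2]) for k in range(K)]  # third slot of each track
signs = [1, -1, 1, -1]
c = build_codevector(L, positions, signs)

h = 0.8 ** np.arange(L)          # hypothetical decaying impulse response
H = impulse_response_matrix(h)
g = 0.5                          # hypothetical quantized sub-frame gain
contribution = g * (H @ c)       # fixed-codebook contribution gHc
```

Because H is lower triangular, the contribution is zero before the first pulse position, which reflects the causality of the synthesis filter.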
In this paper we study the coding performance advantages possible by using the optimum ACELP codevector amplitudes compared to the ACELP codebook. Denote the m non-zero positions in the codevector as $0 \le i_0 \le \cdots \le i_{m-1} \le L$. Combining the sub-frame gain, g, with the codevector amplitudes c, and since the remaining L-m dimensions of c are zero, an optimum algebraic codevector must minimize $\|\mathbf{x} - H\mathbf{c}\|^2 = \|\mathbf{x} - \tilde{H}\tilde{\mathbf{c}}\|^2$, where x is a target signal, $\tilde{H} = [\mathbf{h}_{i_0}\ \mathbf{h}_{i_1}\ \cdots\ \mathbf{h}_{i_{m-1}}]$ is an L row by m column matrix with $\mathbf{h}_{i_j}$ the $i_j$-th column of the impulse response matrix H, and $\mathbf{c} = \sum_{j=0}^{m-1} \mathbf{c}_{i_j}$ is the sum of the m interleaved algebraic codevectors. The optimum (unquantized) non-zero pulse amplitudes are $\tilde{\mathbf{c}}_{\mathrm{opt}} = (\tilde{H}^T \tilde{H})^{-1} \tilde{H}^T \mathbf{x}$, provided $\tilde{H}^T \tilde{H}$ is invertible. The empirical density of the normalized non-zero pulse amplitudes of $\tilde{\mathbf{c}}_{\mathrm{opt}}$ is symmetric and bimodal. Rate-distortion analysis of the quantization of such random variables indicates that at the (typical) ACELP encoding rate of 1 bit per amplitude, and for the square error distortion measure, simple uniform scalar quantization is optimum. At rates larger than 1 bit per pulse amplitude some increase in SNR is possible, but the increase in SNR yields only modest improvement in perceptual quality, as measured by mean opinion score.

[1] ITU-T G.729.1, "G.729-based embedded variable bit-rate coder: An 8-32 kbit/s scalable wideband coder bitstream interoperable with G.729," May 2006.
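The least-squares pulse amplitudes described in the abstract can be illustrated with a short NumPy sketch. Everything numeric here is an assumption made for the example: the target is random noise rather than speech, the impulse response is a hypothetical decaying exponential, and the pulse positions are fixed by hand rather than found by a codebook search. The 1-bit comparison simply keeps the signs of the least-squares amplitudes and fits one shared gain, which is a stand-in for illustration and not the quantizer analyzed by the authors.

```python
import numpy as np

def impulse_response_matrix(h):
    """Lower-triangular Toeplitz matrix with H[n, j] = h[n - j] for n >= j."""
    L = len(h)
    H = np.zeros((L, L))
    for j in range(L):
        H[j:, j] = h[:L - j]
    return H

def optimum_amplitudes(H, x, positions):
    """Least-squares amplitudes on the columns of H selected by the pulse positions."""
    H_tilde = H[:, positions]  # L x m matrix of selected columns
    # lstsq minimizes ||x - H_tilde c||^2; this equals (H~^T H~)^{-1} H~^T x
    # whenever H~^T H~ is invertible (distinct positions assumed here).
    c_opt, *_ = np.linalg.lstsq(H_tilde, x, rcond=None)
    return H_tilde, c_opt

def sign_amplitudes(H_tilde, x, c_opt):
    """+/-1 amplitudes (1 bit each) scaled by the least-squares gain for that sign pattern."""
    signs = np.where(c_opt >= 0, 1.0, -1.0)
    y = H_tilde @ signs
    gain = (x @ y) / (y @ y)
    return gain * signs

def snr_db(x, x_hat):
    """Signal-to-noise ratio in dB between a target and its approximation."""
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum((x - x_hat) ** 2))

rng = np.random.default_rng(0)
L = 40
H = impulse_response_matrix(0.8 ** np.arange(L))  # hypothetical impulse response
x = rng.standard_normal(L)                        # stand-in target signal
positions = [3, 12, 21, 30]                       # assumed, distinct pulse positions

H_tilde, c_opt = optimum_amplitudes(H, x, positions)
c_sign = sign_amplitudes(H_tilde, x, c_opt)
snr_opt = snr_db(x, H_tilde @ c_opt)
snr_sign = snr_db(x, H_tilde @ c_sign)
print(f"SNR with optimum amplitudes: {snr_opt:.2f} dB, with sign-only amplitudes: {snr_sign:.2f} dB")
```

For fixed positions the unquantized amplitudes can never give a lower SNR than the sign-only ones, since the sign-only vector is one of the candidates the least-squares solution is minimized over. The size of the gap depends on the data and is not representative of the speech results summarized in the abstract.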
2010 Data Compression Conference
1068-0314/10 $26.00 © 2010 IEEE
DOI 10.1109/DCC.2010.52