celp presentation
TRANSCRIPT
![Page 1: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/1.jpg)
DESIGN OF A CELP CODER AND A STUDY OF ITS PERFORMANCE USING VARIOUS QUANTIZATION METHODS
EECS 651: PROJECT PRESENTATION
UNIVERSITY OF MICHIGAN, ANN ARBOR
APRIL 18, 2005
By
Awais M. Kamboh
Krispian C. Lawrence
Aditya M. Thomas
Philip I. Tsai
![Page 2: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/2.jpg)
PROJECT GOALS
To design and implement a CELP coder in MATLAB
To use different quantization methods to quantize the LP parameters of the coder
To evaluate the performance of the coder in terms of MSE and 'perceptual MSE' using the various methods of quantization
![Page 3: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/3.jpg)
Presentation Outline
Introduction to Speech Coding
CELP
CELP coder
Quantization Methods
Results and Comparisons
Conclusions and recommendations
Q&A
![Page 4: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/4.jpg)
Introduction to Speech Coding
Concerned with obtaining a compact digital representation of voice signals for more efficient transmission or smaller storage size.
The objective is to represent the speech signal with the minimum number of bits while maintaining the perceptual quality.
![Page 5: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/5.jpg)
Speech Production
Speech
– Air is pushed from the lungs past the vocal cords and along the vocal tract
– The basic vibrations originate at the vocal cords
– The sound is altered by the disposition of the vocal tract (tongue and mouth)
Model the vocal tract as a filter
– The shape changes relatively slowly
The vibrations at the vocal cords
– The excitation signal
![Page 6: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/6.jpg)
Speech sounds
Voiced sounds
– The vocal cords vibrate open and closed
– Quasi-periodic pulses of air
– The rate of the opening and closing is the pitch
Unvoiced sounds
– Forcing air at high velocities through a constriction
– Noise-like turbulence
– Show little long-term periodicity
– Short-term correlations still present
Plosive sounds
– A complete closure in the vocal tract
– Air pressure is built up and released suddenly
![Page 7: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/7.jpg)
Code-Excited Linear Predictor (CELP)
Variants of CELP (LD-CELP, ACELP etc.)
The main differences are in the generation of the excitation signal, the filters and the bit rate.
Performance
– Bit rates of 4 kbps or lower give synthetic-quality (mechanical) speech.
– Most modern CELP variants operate at relatively higher bit rates and produce good-quality speech.
– Performance cannot be judged by MSE alone.
![Page 8: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/8.jpg)
Linear Predictive Coding
Lungs generate an excitation signal which is modeled as white noise.
Vocal cords either remain open or vibrate with some frequency, called 'Pitch'.
The resulting speech is either unvoiced or voiced, respectively.
Vocal tract acts as an IIR filter.
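The slides do not say how the coefficients of the vocal tract filter are computed. A common choice is the autocorrelation method solved with the Levinson-Durbin recursion; the sketch below assumes that method and the sign convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p, neither of which is confirmed by the presentation. The function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations for A(z) = 1 + a[1] z^-1 + ... (assumed convention).

    Returns (a, err) where a[0] == 1.0 and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent frame: nothing left to predict
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err          # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err
```

For an ideal first-order source with r = [1, 0.8, 0.64], the recursion returns a[1] = -0.8 and a[2] = 0, i.e. the all-pole model 1/(1 - 0.8 z^-1).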
![Page 9: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/9.jpg)
CELP Parameters (In this Implementation)
Excitation Signal: A number of signals are stored in a codebook. We choose the signal that best suits a particular chunk of data (frame).
LP Coefficients: The coefficients of the vocal tract filter.
Gain: Represents the loudness/energy of speech.
Pitch Filter Coefficient: We determine pitch by modeling it as a long-delay correlation filter which produces quasi-periodic signals when excited.
Pitch: Pitch of the sound, in the range 50 Hz to 500 Hz. In this case it is referred to as Pitch Delay, measured in number of samples.
![Page 10: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/10.jpg)
Rate of CELP
Frame Size: 160 samples (20 ms)
Subframe Size: 40 samples (5 ms)
LP coefficients are transmitted once per frame. All others are transmitted once per subframe.
Code Book: 512 entries; 9 bits
Gain: Generally between -2 and +2; 8 bits
Pitch: 50 Hz to 500 Hz => 16 to 160 samples (at 8 kHz sampling); 8 bits
Pitch Filter Coefficient: 0 to 1.4; 6 bits
LP Coefficients: Different for different rates.
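The bit allocation above can be turned into a bit rate with a few lines of arithmetic. The sketch below uses only the numbers on this slide; the number of bits spent on the LP coefficients per frame is left as a parameter because the slides say it varies, and the example value of 80 bits (10 coefficients at 8 bits each) is purely an assumption for illustration.

```python
import math

FS_HZ = 8000          # sampling rate from the slide
FRAME = 160           # samples per frame (20 ms)
SUBFRAME = 40         # samples per subframe (5 ms)

# Bits sent once per subframe, as listed on the slide.
BITS_PER_SUBFRAME = {"codebook_index": 9, "gain": 8,
                     "pitch_delay": 8, "pitch_coeff": 6}


def pitch_delay_range(f_low_hz=50, f_high_hz=500, fs_hz=FS_HZ):
    """Pitch delay limits in samples for a pitch range given in Hz."""
    return fs_hz // f_high_hz, fs_hz // f_low_hz


def bit_rate(lp_bits_per_frame):
    """Total bit rate in bits per second for a given LP bit budget per frame."""
    subframes = FRAME // SUBFRAME
    bits = subframes * sum(BITS_PER_SUBFRAME.values()) + lp_bits_per_frame
    return bits * FS_HZ / FRAME


lo, hi = pitch_delay_range()
print(lo, hi)                               # 16 160
print(math.ceil(math.log2(hi - lo + 1)))    # 8 bits cover the 145 possible delays
print(bit_rate(0))                          # 6200.0 bps before LP coefficients
print(bit_rate(80))                         # 10200.0 bps with the assumed 80 LP bits
```

The per-subframe parameters alone cost 4 x 31 = 124 bits per 20 ms frame, so the LP coefficient quantizer decides how far above 6.2 kbps the coder ends up.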
![Page 11: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/11.jpg)
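CELP Encoder
[Block diagram: the input speech feeds the LP Analyzer, which produces the LP coefficients 'a'. Each code book excitation sequence e_k is multiplied by the gain, passed through the pitch filter and the reconstruction filter, and subtracted from the input speech. The difference passes through the perceptual filter, and the sequence giving the minimum error energy E is selected.]
Pitch filter: $\frac{1}{1 - b z^{-P}}$
Reconstruction filter: $\frac{1}{A(z)}$
Perceptual filter: $\frac{A(z)}{A(z/c)}$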
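The selection of the excitation sequence with minimum error energy can be sketched as an analysis-by-synthesis loop. This is a deliberately reduced illustration, not the presenters' implementation: it leaves out the pitch filter and the perceptual filter, starts every filter from zero state, computes the least-squares gain per candidate, and assumes A(z) = 1 + a[1] z^-1 + ... as the sign convention. All names are hypothetical.

```python
def synthesize(excitation, a):
    """All-pole filter 1/A(z) with zero initial state (a[0] is assumed to be 1)."""
    out = []
    for n, x in enumerate(excitation):
        y = x - sum(a[k] * out[n - k] for k in range(1, len(a)) if n - k >= 0)
        out.append(y)
    return out


def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the codebook entry that best matches target."""
    best = None
    for index, code in enumerate(codebook):
        y = synthesize(code, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue                      # an all-zero entry cannot match anything
        # Least-squares gain for this candidate.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


a = [1.0, -0.5]
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, -1.0, 1.0, -1.0]]
target = [2.0 * v for v in synthesize(codebook[1], a)]
print(search_codebook(target, codebook, a))   # (1, 2.0, 0.0)
```

In the full encoder the error would be filtered by the perceptual filter before its energy is measured, so the search favours candidates whose noise is masked by the speech spectrum.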
![Page 12: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/12.jpg)
CELP Encoder (Contd.)
[Block diagram: Gain 'G', Pitch Filter Coefficient 'b', Pitch Delay 'P' and Excitation Sequence 'k' pass through a Scalar Quantizer. The Linear Predictor Coefficients 'a' are quantized with SQ, VQ or DPCM. Both outputs form the Binary Encoded Data.]
![Page 13: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/13.jpg)
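CELP Decoder
[Block diagram: Binary Decoding feeds the reconstruction of Gain 'G', Pitch Filter Coefficient 'b', Pitch Delay 'P' and Excitation Sequence 'k', and the reconstruction of the Linear Predictor Coefficients 'a'. The excitation e_k is multiplied by the gain, then passed through the pitch filter $\frac{1}{1 - b z^{-P}}$ and the filter $\frac{1}{A(z)}$ to give the reconstructed speech.]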
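The decoder chain in the diagram (gain, pitch filter, then 1/A(z)) can be written out directly. The sketch below is an illustration under stated assumptions: each subframe starts from zero filter state, whereas a working decoder would carry the filter memories across subframes, and A(z) = 1 + a[1] z^-1 + ... is an assumed sign convention. The function name is hypothetical.

```python
def decode_subframe(code, gain, b, delay, a):
    """Rebuild one subframe from its decoded parameters.

    code  : excitation sequence e_k looked up in the codebook
    gain  : decoded gain G
    b     : pitch filter coefficient
    delay : pitch delay P in samples
    a     : LP coefficients with a[0] == 1 (assumed convention)
    """
    scaled = [gain * v for v in code]

    # Pitch filter 1 / (1 - b z^-P):  y[n] = x[n] + b * y[n - P]
    pitched = []
    for n, x in enumerate(scaled):
        past = pitched[n - delay] if n >= delay else 0.0
        pitched.append(x + b * past)

    # Synthesis filter 1 / A(z):  s[n] = y[n] - sum_k a[k] * s[n - k]
    speech = []
    for n, x in enumerate(pitched):
        acc = x
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * speech[n - k]
        speech.append(acc)
    return speech


print(decode_subframe([1.0, 0.0, 0.0], 1.0, 0.0, 1, [1.0, -0.5]))  # [1.0, 0.5, 0.25]
```

With b = 0 the pitch filter is transparent and the output is just the impulse response of 1/A(z), which makes the chain easy to check in isolation.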
![Page 14: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/14.jpg)
Perceptual Filtering
$H(z) = \frac{A(z)}{A(z/c)}$, with c = 0.8
[Plot of filter responses against Frequency (Hz): blue = $\frac{1}{A(z)}$, green = $\frac{1}{A(z/c)}$, red = $\frac{A(z)}{A(z/c)}$]
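Replacing z by z/c in A(z) multiplies the coefficient of z^-k by c^k, so the denominator of the perceptual filter is cheap to build from the LP coefficients. The sketch below shows that step and a direct-form application of H(z) = A(z)/A(z/c); it assumes zero initial state and a coefficient list with a[0] = 1, and the function names are hypothetical.

```python
def bandwidth_expand(a, c):
    """Coefficients of A(z/c): the k-th coefficient of A(z) scaled by c**k."""
    return [coef * c ** k for k, coef in enumerate(a)]


def perceptual_weight(x, a, c=0.8):
    """Filter x through H(z) = A(z) / A(z/c), starting from zero state."""
    num = a
    den = bandwidth_expand(a, c)
    out = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * out[n - k] for k in range(1, len(den)) if n - k >= 0)
        out.append(acc)
    return out


print(bandwidth_expand([1.0, -0.5], 0.8))              # [1.0, -0.4]
print(perceptual_weight([1.0, 2.0, 3.0], [1.0, -0.5], c=1.0))  # [1.0, 2.0, 3.0]
```

With c = 1 the numerator and denominator cancel and the filter passes the signal unchanged; smaller values of c flatten the denominator and de-emphasise the error near the formant peaks.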
![Page 15: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/15.jpg)
Perceptual Filtering (Contd.)
[Plot: $\frac{A(z)}{A(z/c)}$ for different values of 'c' in the perceptual filter.]
![Page 16: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/16.jpg)
Performance of CELP (Unquantized) mse = 0.0041
Original Unquantized
![Page 17: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/17.jpg)
Performance of CELP (Quantized) mse = 0.0120
LP Coefficients: Unquantized
Other Parameters: Quantized
![Page 18: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/18.jpg)
Quantization Methods Used
Scalar Quantization
DPCM
Vector Quantization
TSVQ
![Page 19: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/19.jpg)
Scalar Quantization
Quantize one sample at a time
The simplest quantization scheme
Design quantizers with sizes M = 2, 4, 8, 16, 32, 64, 128, 256
![Page 20: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/20.jpg)
Scalar Quantizer Design
Lloyd algorithm
Initial guess: a uniform codebook
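A minimal sketch of the Lloyd iteration with a uniform initial codebook, as named on the slide. The stopping rule (stop when the codebook no longer changes, or after a fixed number of passes) and the handling of empty cells are assumptions, since the slides do not describe them; the function names are hypothetical.

```python
def lloyd(samples, levels, iterations=50):
    """Design a scalar quantizer with `levels` codewords from training samples."""
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / levels
    # Initial guess: codewords at the centres of `levels` equal-width cells.
    codebook = [lo + (i + 0.5) * step for i in range(levels)]
    for _ in range(iterations):
        # Nearest-neighbour partition of the training data.
        cells = [[] for _ in codebook]
        for x in samples:
            nearest = min(range(levels), key=lambda i: (x - codebook[i]) ** 2)
            cells[nearest].append(x)
        # Centroid update; an empty cell keeps its old codeword.
        updated = [sum(c) / len(c) if c else codebook[i]
                   for i, c in enumerate(cells)]
        if updated == codebook:
            break
        codebook = updated
    return codebook


def quantize(x, codebook):
    """Map x to its nearest codeword."""
    return min(codebook, key=lambda c: (x - c) ** 2)
```

For example, `lloyd([0.0, 0.1, 0.9, 1.0], 2)` starts from the uniform codebook [0.25, 0.75] and settles on the two cluster means, approximately 0.05 and 0.95.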
![Page 21: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/21.jpg)
Scalar Quantizer Design
Training data: 15000 samples of LP coefficients generated from different speech sources
15000/256 ≈ 58 points/cell for M = 256
15000/2 = 7500 points/cell for M = 2
![Page 22: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/22.jpg)
Performance of the SQ
![Page 23: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/23.jpg)
DPCM
Quantizing the prediction error, one sample at a time
Essentially a scalar quantizer
Good for slowly varying sources
Need a model for the source to design the linear predictor
![Page 24: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/24.jpg)
DPCM Design – Predictor
Assume a source model
First-order AR, zero-mean Gaussian
![Page 25: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/25.jpg)
DPCM Design – Predictor
Gaussian? Many different kinds of speech, and LP coefficients
Zero-mean? Empirical mean is near zero
![Page 26: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/26.jpg)
DPCM Design – Predictor
First-order AR? Correlation analysis indicates a large first-order correlation coefficient, near 0.8, and small higher-order coefficients, smaller than 0.01
![Page 27: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/27.jpg)
DPCM Design – Quantizer
Designed to be optimal for the random variables
$V_i = X_i - a_1 X_{i-1}$
Extract $a_1$ from correlation analysis, like solving the Yule-Walker equation
Avoid calculating the limiting density of the prediction error
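The first-order DPCM scheme described above can be sketched as follows. The estimate of a_1 as R(1)/R(0) is the first-order Yule-Walker solution for a zero-mean source. One point is an assumption on my part: the encoder below predicts from the previous reconstructed value (the usual closed-loop arrangement, so encoder and decoder stay in step), while the slide designs the quantizer for V_i formed from the unquantized X_{i-1}. Function names and the example codebook are hypothetical.

```python
def estimate_a1(samples):
    """First-order Yule-Walker estimate a1 = R(1) / R(0) for zero-mean data."""
    r0 = sum(x * x for x in samples)
    r1 = sum(samples[i] * samples[i - 1] for i in range(1, len(samples)))
    return r1 / r0


def dpcm_encode(samples, a1, codebook):
    """Quantize the prediction error of each sample; return codeword indices."""
    indices, prev = [], 0.0
    for x in samples:
        pred = a1 * prev
        err = x - pred
        idx = min(range(len(codebook)), key=lambda i: (err - codebook[i]) ** 2)
        indices.append(idx)
        prev = pred + codebook[idx]     # reconstruction the decoder will also see
    return indices


def dpcm_decode(indices, a1, codebook):
    """Rebuild the samples from the quantized prediction errors."""
    out, prev = [], 0.0
    for idx in indices:
        prev = a1 * prev + codebook[idx]
        out.append(prev)
    return out


codebook = [-1.0, 0.0, 1.0]             # illustrative error codebook
idx = dpcm_encode([1.0, 0.5, 0.25], 0.5, codebook)
print(idx)                              # [2, 1, 1]
print(dpcm_decode(idx, 0.5, codebook))  # [1.0, 0.5, 0.25]
```

In the project the error codebook would come from a quantizer design such as the Lloyd algorithm run on the prediction errors, with a_1 near the measured correlation coefficient of 0.8.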
![Page 28: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/28.jpg)
DPCM Performance
![Page 29: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/29.jpg)
SQ vs. DPCM
![Page 30: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/30.jpg)
SQ vs. DPCM
![Page 31: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/31.jpg)
SQ vs. DPCM
For DPCM:
Significant improvement over SQ at lower rates
The simple models for the source and the quantizer input are effective
![Page 32: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/32.jpg)
Vector Quantization
Key challenge
– Given a source distribution, how to select the codebook (*) and partitions (---) that result in the smallest average distortion
![Page 33: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/33.jpg)
VQ Design
LBG algorithm was designed and implemented in MATLAB
Computes a codebook of a desired size given a training sequence
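The project's LBG code was written in MATLAB and is not shown in the slides. The Python sketch below illustrates the general idea under common assumptions: the codebook grows by splitting every codeword into two slightly perturbed copies, followed by nearest-neighbour and centroid updates. The additive perturbation `eps`, the fixed iteration count and the power-of-two size restriction are choices made for this illustration only.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(training, size, eps=0.01, iterations=20):
    """Design a VQ codebook of `size` codewords (size assumed a power of two)."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into two perturbed copies.
        codebook = [[c + s for c in code]
                    for code in codebook for s in (eps, -eps)]
        for _ in range(iterations):
            # Nearest-neighbour partition of the training vectors.
            cells = [[] for _ in codebook]
            for v in training:
                nearest = min(range(len(codebook)),
                              key=lambda i: sq_dist(v, codebook[i]))
                cells[nearest].append(v)
            # Centroid update; an empty cell keeps its old codeword.
            codebook = [centroid(c) if c else codebook[i]
                        for i, c in enumerate(cells)]
    return codebook


training = [[0, 0], [0, 1], [10, 10], [10, 11]]
print(sorted(lbg(training, 2)))   # [[0.0, 0.5], [10.0, 10.5]]
```

Each vector here would be one frame's set of LP coefficients, so a codebook of size M costs log2(M) bits per frame for the whole coefficient set rather than per coefficient.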
![Page 34: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/34.jpg)
Performance of the CELP coder
MOS, Mean Opinion Score
– A sample of 20 people
– Listen to the reconstructed speech sample and rate the intelligibility
Excellent – 5
Good – 4
Fair – 3
Poor – 2
Bad – 1
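The MOS is simply the average of the listeners' ratings on the 1 to 5 scale. A short sketch follows; the ratings in the usage line are made-up placeholders, not data from the study, and the function name is hypothetical.

```python
def mean_opinion_score(ratings):
    """Average of listener ratings, each an integer from 1 (Bad) to 5 (Excellent)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings) / len(ratings)


print(mean_opinion_score([5, 4, 4, 3]))   # 4.0 (placeholder ratings)
```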
![Page 35: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/35.jpg)
Performance of Coder with DPCM
M = 2 MOS = 1
M = 4 MOS = 1
M = 8 MOS = 1
M = 16 MOS = 1
M = 32 MOS = 2.3
M = 64 MOS = 3.1
M = 128 MOS = 3.9
M = 256 MOS = 4.5
![Page 36: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/36.jpg)
Performance of Coder with SQ
M = 2 MOS = 1
M = 4 MOS = 1
M = 8 MOS = 1
M = 16 MOS = 1
M = 32 MOS = 1.8
M = 64 MOS = 2.9
M = 128 MOS = 3.6
M = 256 MOS = 4.1
![Page 37: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/37.jpg)
Performance of Coder with VQ
M = 2 MOS = 1.7
M = 4 MOS = 1.9
M = 8 MOS = 2.5
M = 16 MOS = 2.9
M = 32 MOS = 3.1
M = 64 MOS = 3.1
M = 128 MOS = 2.9
M = 256 MOS = 3.0
![Page 38: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/38.jpg)
Conclusions
Improvement in the quantization of LP coefficients improves the performance of the coder
For a given codebook size, VQ performed better in terms of MSE
DPCM performed better in terms of perceptual MSE
![Page 39: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/39.jpg)
Questions?
![Page 40: CELP Presentation](https://reader030.vdocument.in/reader030/viewer/2022013114/552488614a7959ac488b47f3/html5/thumbnails/40.jpg)
THANK YOU