LPC10: 2.4 kbps Federal Standard in Speech Coding
Soo Hyun Bae
School of Electrical & Computer Engineering
Georgia Institute of Technology
ECE 8873 Data Compression & Modeling
03/17/2004
Agenda
1. Taxonomy of Speech Coders
2. LPC10 Properties
3. Voicing Classification
4. Levinson-Durbin Recursion
5. Pitch Detection
6. Synthesize Speech
7. Speech Coder Comparison
Linear Prediction
Speech coder standards (each one is built on LP):
• FS1015 (LPC10): LP with 10 coefficients
• FS1016 (CELP): Code-Excited LP
• MELP: Mixed-Excitation LP
• IS-54 (VSELP): Vector-Sum Excited LP
• IS-96 (QCELP): Qualcomm Code-Excited LP
• G.728 (LD-CELP): Low-Delay Code-Excited LP
• G.729 (CS-ACELP): Conjugate-Structure Algebraic-Code-Excited LP
Where is LPC10?
• Taxonomy of Speech Coders
Speech Coders
– Waveform Coders
  – Time Domain: PCM, ADPCM
  – Frequency Domain: Sub-band coders, Adaptive transform coder
– Vocoders
  – Linear Predictive Coder (LPC10 sits here)
  – Formant Coders
• Waveform coders: preserve the signal waveform; not specific to speech
• Vocoders: analyze speech, extract parameters, and use the parameters to synthesize speech
Properties (1)
• Called LPC10 because 10 LP coefficients are used
• Bit rate: 2.4 kbps
• Samples/frame: 180 samples
• Bits/frame: 54 bits
• Frame size: 22.5 ms (44.44 frames/sec)
• Target stream: 8 kHz sampling rate, 16-bit quantization
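The numbers on this slide are tied together by simple arithmetic. The short Python check below derives the frame duration and bit rate from the sampling rate, samples per frame and bits per frame quoted above. The comparison against raw 16-bit PCM is an added illustration, not a figure from the slide.

```python
# Frame parameters quoted on the slide.
SAMPLE_RATE_HZ = 8000
SAMPLES_PER_FRAME = 180
BITS_PER_FRAME = 54
INPUT_BITS_PER_SAMPLE = 16

# 180 samples at 8 kHz last 22.5 ms, i.e. about 44.44 frames per second.
frame_duration_s = SAMPLES_PER_FRAME / SAMPLE_RATE_HZ
frames_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME

# 54 bits every 22.5 ms gives the 2.4 kbps bit rate.
bit_rate_bps = BITS_PER_FRAME * SAMPLE_RATE_HZ / SAMPLES_PER_FRAME

# Illustration only: rate of the uncompressed input and the resulting ratio.
pcm_rate_bps = SAMPLE_RATE_HZ * INPUT_BITS_PER_SAMPLE
compression_ratio = pcm_rate_bps / bit_rate_bps

print(f"frame duration: {frame_duration_s * 1000:.1f} ms")
print(f"frames/sec:     {frames_per_second:.2f}")
print(f"bit rate:       {bit_rate_bps:.0f} bps")
print(f"vs. 16-bit PCM: {compression_ratio:.1f}x smaller")
```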
Properties (2)
• Sounds “buzzy” because of noise introduced by the parameter updates
• Strictly regular voiced excitation sounds unnatural and introduces some jitter
• Voicing errors produce significant distortion
• Models speech only, so it does not work with background noise; not suitable for mobile phone applications
Encoded stream
Frame layout (54 bits):
• Bits 0–40 (41 bits): LP coefficients
• Bits 41–47 (7 bits): Pitch & voicing
• Bits 48–52 (5 bits): Energy
• Bit 53 (1 bit): Synchronization
• LP Coefficients: Levinson-Durbin Recursion
• Pitch & Voicing : Causal & Noncausal Prediction Gain
• Energy : Low-Band Speech Energy
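To make the frame layout concrete, here is a minimal Python sketch that packs the four fields into one 54-bit integer and unpacks them again. The field widths follow the bit positions on the slide (0, 41, 48, 53). The field names, the contiguous LSB-first packing, and the example values are assumptions for illustration; the slide does not give the standard's actual bit ordering or its quantization tables.

```python
# Field widths in stream order, read off the slide's bit positions.
# Names and LSB-first contiguous packing are illustrative assumptions.
FIELDS = [
    ("lp_coefficients", 41),
    ("pitch_voicing", 7),
    ("energy", 5),
    ("sync", 1),
]


def pack_frame(values):
    """Pack already-quantized field values into one 54-bit integer."""
    frame, offset = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        frame |= value << offset
        offset += width
    return frame


def unpack_frame(frame):
    """Split a 54-bit integer back into its fields."""
    values, offset = {}, 0
    for name, width in FIELDS:
        values[name] = (frame >> offset) & ((1 << width) - 1)
        offset += width
    return values


# Hypothetical quantizer outputs, chosen only to exercise the round trip.
example = {"lp_coefficients": 0x123456789, "pitch_voicing": 77,
           "energy": 19, "sync": 1}
packed = pack_frame(example)
assert unpack_frame(packed) == example
```

Keeping the widths in a single table means the 41 + 7 + 5 + 1 = 54 bit budget can be checked in one place.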
Vocoder
[Block diagram]
Encoder: Original Speech → Analysis:
• Voiced/Unvoiced decision
• Pitch period (voiced only)
• Signal power (gain)
Decoder: Pulse Train (set by the pitch period) or Random Noise, selected by the V/U switch → gain G (signal power) → Vocal Tract Model → Synthesized Speech
Voicing Classification (1)
Voiced Source
– Generated by vocal cords’ vibrations
– Periodic; the pulse spacing sets the pitch $F_0$
Unvoiced Source
– Generated without vibrations
– Excitation is modeled by a white Gaussian noise source
– No pitch
How to discriminate?
Fisher’s Method
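The slide names Fisher's method without detail. Assuming this refers to Fisher's linear discriminant, the Python sketch below shows the idea on two hypothetical per-frame features (normalized low-band energy and zero-crossing rate). The feature choice, the toy training values and the midpoint threshold are all illustrative assumptions, not the classifier actually used in LPC10.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def mean2(points):
    """Mean of a list of 2-D feature vectors."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter2(points, mu):
    """2x2 scatter matrix of the points around their mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - mu[0], p[1] - mu[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_discriminant(voiced, unvoiced):
    """Return (w, threshold); a frame x is called voiced if w . x > threshold."""
    mu_v, mu_u = mean2(voiced), mean2(unvoiced)
    s_v, s_u = scatter2(voiced, mu_v), scatter2(unvoiced, mu_u)
    # Within-class scatter S_w and its inverse (closed form for 2x2).
    sw = [[s_v[i][j] + s_u[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_v[0] - mu_u[0], mu_v[1] - mu_u[1]]
    # Fisher direction: w = S_w^{-1} (mu_voiced - mu_unvoiced).
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    # Threshold halfway between the projected class means.
    threshold = 0.5 * (dot(w, mu_v) + dot(w, mu_u))
    return w, threshold


def is_voiced(features, w, threshold):
    return dot(w, features) > threshold


# Toy training data (energy, zero-crossing rate), invented for illustration.
VOICED = [(0.9, 0.10), (0.8, 0.15), (1.0, 0.12), (0.7, 0.20)]
UNVOICED = [(0.2, 0.60), (0.3, 0.55), (0.1, 0.70), (0.25, 0.50)]

w, threshold = fisher_discriminant(VOICED, UNVOICED)
print(is_voiced((0.85, 0.12), w, threshold))
```

The projection direction maximizes the separation of the class means relative to the within-class scatter, which is why a single threshold on the projected value is enough.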
Voicing Classification (2)
1. Compute R(0)
2. Is R(0) greater than R(0) for noise?
– Yes: compute LPC and run pitch detection
– No: silence period
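The decision in this flowchart can be sketched in a few lines of Python. R(0) is the zero-lag autocorrelation, i.e. the frame energy. How the noise reference is obtained is not stated on the slide; here it is assumed to be measured on a frame known to contain only background noise, and the test signals are synthetic.

```python
import math


def frame_energy(frame):
    """R(0): zero-lag autocorrelation, the sum of squared samples."""
    return sum(s * s for s in frame)


def is_silence(frame, noise_energy):
    """A frame is treated as silence unless its R(0) exceeds the noise R(0)."""
    return frame_energy(frame) <= noise_energy


# Synthetic stand-ins: a low-level background frame and a loud tone.
background = [((n * 7) % 5) - 2 for n in range(180)]
tone = [1000.0 * math.sin(2 * math.pi * n / 50) for n in range(180)]

noise_energy = frame_energy(background)  # assumed noise reference
for name, frame in (("background", background), ("tone", tone)):
    label = "silence" if is_silence(frame, noise_energy) else "analyze"
    print(name, label)
```

Only frames that pass this check go on to LPC analysis and pitch detection.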
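Pitch & Voicing (1)
• If x(n) is periodic with period P, R(k) is also periodic with period P
$$R(k) = \sum_{m=0}^{N-k-1} x(m)\,x(m+k)$$
• R(k) is hard to compute, so the autocorrelation of the center-clipped signal $x_c(n)$ is used instead
$$R_c(k) = \sum_{m=0}^{N-k-1} x_c(m)\,x_c(m+k)$$
$$x_c(n) = \begin{cases} +1 & \text{if } x(n) > C_L \\ -1 & \text{if } x(n) < -C_L \\ 0 & \text{otherwise} \end{cases}$$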
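A minimal Python sketch of pitch estimation from the clipped autocorrelation follows. It applies the three-level clipper with threshold C_L, evaluates R_c(k) over a range of lags, and picks the lag with the largest value. The clip ratio (C_L as a fraction of the frame's peak amplitude) and the lag search range are assumptions made for this example, not values given on the slide.

```python
import math


def center_clip(x, c_l):
    """Three-level clipper: +1 above C_L, -1 below -C_L, 0 otherwise."""
    return [1 if s > c_l else -1 if s < -c_l else 0 for s in x]


def autocorr(x, k):
    """R(k) = sum over m = 0 .. N-k-1 of x(m) * x(m+k)."""
    return sum(x[m] * x[m + k] for m in range(len(x) - k))


def estimate_pitch_lag(frame, min_lag=20, max_lag=156, clip_ratio=0.6):
    """Return the lag (in samples) that maximizes the clipped autocorrelation.

    min_lag, max_lag and clip_ratio are illustrative assumptions.
    """
    c_l = clip_ratio * max(abs(s) for s in frame)
    xc = center_clip(frame, c_l)
    return max(range(min_lag, max_lag + 1), key=lambda k: autocorr(xc, k))


# Synthetic 180-sample frame with a period of 50 samples (160 Hz at 8 kHz).
frame = [math.sin(2 * math.pi * n / 50) for n in range(180)]
lag = estimate_pitch_lag(frame)
print(f"estimated pitch: {8000 / lag:.0f} Hz")
```

Because the clipped samples only take the values -1, 0 and +1, each product in R_c(k) reduces to a sign comparison, which is what makes this cheaper than the full autocorrelation.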
Pitch & Voicing (2)
Reflection Coefficient (1)
• Human auditory system is more sensitive to poles than to zeros
$$H(z) = \frac{G}{\prod_{i=1}^{p} (1 - a_i z^{-1})(1 - a_i^{*} z^{-1})}$$
where G is the gain, p is the order, and the $a_i$ are the poles
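Reflection Coefficient (2)
• Levinson-Durbin Recursion for all-pole model
With the denominator of H(z) expanded as $1 + \sum_{k=1}^{p} a_k z^{-k}$, the LP coefficients $a_1, \dots, a_p$ solve the normal equations, whose autocorrelation matrix is Toeplitz:
$$\begin{bmatrix} R(0) & R(1) & R(2) & \cdots & R(p-1) \\ R(1) & R(0) & R(1) & \cdots & R(p-2) \\ R(2) & R(1) & R(0) & \cdots & R(p-3) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R(p-1) & R(p-2) & R(p-3) & \cdots & R(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_p \end{bmatrix} = -\begin{bmatrix} R(1) \\ R(2) \\ R(3) \\ \vdots \\ R(p) \end{bmatrix}$$
Order update from j to j+1, where $a_j(i)$ is the i-th coefficient of the order-j model and $\Gamma_{j+1}$ is the reflection coefficient:
$$R_{j+1} \left\{ \begin{bmatrix} 1 \\ a_j(1) \\ a_j(2) \\ \vdots \\ a_j(j) \\ 0 \end{bmatrix} + \Gamma_{j+1} \begin{bmatrix} 0 \\ a_j(j) \\ a_j(j-1) \\ \vdots \\ a_j(1) \\ 1 \end{bmatrix} \right\} = \begin{bmatrix} \epsilon_j \\ 0 \\ \vdots \\ 0 \\ \gamma_j \end{bmatrix} + \Gamma_{j+1} \begin{bmatrix} \gamma_j \\ 0 \\ \vdots \\ 0 \\ \epsilon_j \end{bmatrix}$$
Setting $\Gamma_{j+1} = -\gamma_j / \epsilon_j$ zeroes the last entry on the right-hand side, which gives the order-(j+1) solution with error $\epsilon_{j+1} = \epsilon_j (1 - \Gamma_{j+1}^2)$.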
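The recursion can be sketched in Python for real-valued autocorrelations. This sketch assumes the sign convention in which the denominator of H(z) is 1 + a_1 z^-1 + ... + a_p z^-p, so the normal equations read R a = -r. The function and variable names are illustrative, and this is not the fixed-point routine of the standard.

```python
def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations R a = -r by order recursion.

    r: real autocorrelation values R(0) .. R(order).
    Returns (a, reflection, error) with a = [1, a_1, ..., a_order].
    """
    a = [1.0] + [0.0] * order
    reflection = []
    error = r[0]  # order-0 modeling error is R(0)
    for j in range(order):
        # gamma_j = R(j+1) + sum_{i=1..j} a_j(i) * R(j+1-i)
        gamma = r[j + 1] + sum(a[i] * r[j + 1 - i] for i in range(1, j + 1))
        k = -gamma / error  # reflection coefficient for order j+1
        reflection.append(k)
        # a_{j+1}(i) = a_j(i) + k * a_j(j+1-i), using a_j(0) = 1, a_j(j+1) = 0
        prev = a[:]
        for i in range(1, j + 2):
            a[i] = prev[i] + k * prev[j + 1 - i]
        error *= 1.0 - k * k
    return a, reflection, error


# Autocorrelation of a first-order process, R(k) = 0.5 ** k.
a, reflection, error = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(a, reflection, error)
```

Each pass costs O(j) operations, so the whole solve is O(p^2) instead of the O(p^3) of a general linear solver, and the reflection coefficients fall out as a by-product.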
Energy – Gain Coefficient
• From the autocorrelation matching property, G is calculated from the MSE given by the Levinson-Durbin recursion (with the denominator of H(z) expanded as $1 + \sum_{k=1}^{p} a_k z^{-k}$):
$$G^2 = R(0) + \sum_{k=1}^{p} a_k R(k)$$
• Transmit the coefficient G
• Recall
$$H(z) = \frac{G}{\prod_{i=1}^{p} (1 - a_i z^{-1})(1 - a_i^{*} z^{-1})}$$
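As a small worked example, the gain can be computed directly from the autocorrelation values and the LP coefficients. The sketch assumes coefficients a = [1, a_1, ..., a_p] under the convention where the denominator of H(z) is 1 + a_1 z^-1 + ... + a_p z^-p; with the opposite sign convention the sum would be subtracted. The input values are illustrative.

```python
import math


def lpc_gain(r, a):
    """G = sqrt(R(0) + sum_{k=1..p} a_k R(k)), with a = [1, a_1, ..., a_p]."""
    g_squared = r[0] + sum(a[k] * r[k] for k in range(1, len(a)))
    return math.sqrt(g_squared)


# Illustrative values: R(k) = 0.5 ** k, whose order-2 solution is a = [1, -0.5, 0].
r = [1.0, 0.5, 0.25]
a = [1.0, -0.5, 0.0]
gain = lpc_gain(r, a)
print(f"G^2 = {gain ** 2:.2f}, G = {gain:.4f}")
```

For these values G^2 equals 0.75, the same as the final modeling error of the recursion, which is the sense in which G comes from the MSE.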
Synthesize speech
• Recall the Encoder/Decoder structure
[Block diagram] Decoder: Pulse Train (set by the pitch period) or Random Noise, selected by the V/U switch → gain G (signal power) → H(z) → Synthesized Speech
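The decoder path can be sketched in Python as an excitation generator followed by an all-pole filter. The sketch assumes H(z) = G / (1 + a_1 z^-1 + ... + a_p z^-p) in direct form, a unit pulse train for voiced frames, and Gaussian noise for unvoiced frames. Function names, the seeded generator and the example coefficients are illustrative assumptions; frame-to-frame parameter interpolation is left out.

```python
import random


def make_excitation(num_samples, voiced, pitch_period, rng):
    """Pulse train (voiced) or white Gaussian noise (unvoiced)."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0
                for n in range(num_samples)]
    return [rng.gauss(0.0, 1.0) for _ in range(num_samples)]


def synthesize(a, gain, excitation):
    """All-pole filter: y(n) = G * e(n) - sum_{k=1..p} a_k * y(n-k).

    a = [1, a_1, ..., a_p]; the filter starts from zero state.
    """
    y = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


rng = random.Random(0)  # seeded so the unvoiced example is repeatable
a = [1.0, -0.5]         # illustrative first-order model
voiced_frame = synthesize(a, 2.0, make_excitation(8, True, 4, rng))
unvoiced_frame = synthesize(a, 2.0, make_excitation(8, False, 4, rng))
print(voiced_frame)
```

Switching between the two excitation sources per frame is the V/U switch in the diagram; everything else in the decoder is the same for both cases.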
Speech Coder Comparison
Original