ee225d spring,1999 medium & high rate coding* voice - excited channel vocoder * velp & relp...
TRANSCRIPT
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.1
PE Spring,1999
25D LE
ORGAN / B.GOLD LECTURE 26
University of CaliforniaBerkeley
College of EngineeringDepartment of Electrical Engineering
and Computer Sciences
rofessors : N.Morgan / B.GoldE225D
Medium & High Rate Coding
Lecture 26
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.2
f Other Sounds.
ms
CM)-(CVSD)
25D LE
ORGAN / B.GOLD LECTURE 26
Quality, Robustness, Speaker I.D., Possible Vocoding o
Summary for Chapter 33 - Medium & High Rate Syste
* Voice Excitation & Spectral Flattering
* Voice - Excited Channel Vocoder
* VELP & RELP
* Wave form Coding using predictive methods.(ADP
* APC
* Subband Coding
Emphasis in an getting a good efficient repesentation of the excitation.
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.3
on APC.
CELP.
on
t on Error Signal.
25D LE
ORGAN / B.GOLD LECTURE 26
*Multi pulse LPC
* CELP Celp is a variation
* Reducing Codebook Search Time in
* Back Ward Filtering
* Multi Resolution Codebook Search
* Partial Sequence Elimination
* Tree Structured Delta Codebooks
* Adaptive Codebooks
* Linear Conbination Codebooks
* Vector Sum Excited Linear Preducti
* Adaptive Transform Coding
Representation
Accen
of good Excitation
is still the basic issue.
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.4
ital System
Pure Waveform Coding
nel basis.
rom same
mited Excitation Function.
of the speech.
+
25D LE
ORGAN / B.GOLD LECTURE 26
Different Structures for Wide Band & Medium Band Dig
* Pure Waveform Coding Subband Coding
on a channel by chan* Pure Modelling
* Hybrid Systems
* Voice - Excited System Excitation derived f
* Processing of the Error Signal to Produce a Band Li
APCCELPMulti-Pulse
Many Variation
band limited function
speech
Waveform Coding
0 f1–
f1 f2– Waveform Coding
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.5
PP CUSD
ntization.
fn 1+( ) f∆
int, fewer bits are pared to I.
sing!, so original can be recovered.
25D LE
ORGAN / B.GOLD LECTURE 26
Pure Waveform Coding(Very Robust)
lay Molly’s tape.ossible Viewgraphs : Local max construct ADPCMFig.33.8, 33.9, 33.10.
Sampling & Qua
I
IIf
f f∆
n f∆
f∆
B
Let’s stick to pure PCM.from a perceptual viewponeeded to encode II, com
Each band is sampled and quantized.
Important Point : if bandwidths are judiciously clean, sampling can be done at Nyquist rate. fs 2B=
No alia
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.6
nt, fewer bits
s are judicialy clean,
ompared to I.
Nyquist rate.
inal can be recovered.
25D LE
ORGAN / B.GOLD LECTURE 26
Subband Coding
Let’s stick to pure PCM.
From a perceptual Viewpoi
Each band is sampled and quantized.
Important Point: If bandwidth
are needed to encode II, c
sampling can be done at B
I
II
f
f
f
No Aliasing! So orig
∆f
n∆f n 1+( )∆f
fs 2B=f
f
∆f
Fig.33.14
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.7
0’s)
eech Signal (0-8kHz)
Hz].
der was NOT Robust.
generally carried
annel vibrations.
less sounds.
25D LE
ORGAN / B.GOLD LECTURE 26
Voice -Excited Channel Vocoder (late 1950’s to early 196
Motivations :
(2) Possible Transmission of a Wide Band (Analog) Sp
throughan excisting Telephone Channel [300-3000
(1) General Faling - Excitation Signal in Channel Voco
* Intuitive grasp of the fact that baseband [0-900Hz],
all the necessary excitation information for vocal ch
* A white noise same use a suitable excitatin for voice
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.8
Base Band
er Synthetic sizer Speech
Speech
tation spectrum.
nerally carried
nnel vibrations.
ess sounds.
25D LE
ORGAN / B.GOLD LECTURE 26
Original Idea
Analysis
Synthesis
Problem :
Speech Channel Spectral
100-900Hz
100-900Hz Non-Linear Vocod
Base Band
Vocoder Analyzer
Filter
Distoration Synthe
Speech Signal
Base Band Speech
100-800Hz
Distortion of the exci
* Intuitive grasp of the fact that baseband [0-900Hz], ge
all the necessary excitation information for vocal cha
* A white noise same use a suitable excitatin for voicel
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.9
ced spectral flattering
ortion products of the
25D LE
ORGAN / B.GOLD LECTURE 26
In early 1960’s, Schroeder, David et al at BTL, introdu
-got rid of spectral distortion [more or less].
Base Band Signal
Simple non-linearity (rectifier)
Result of non-linearnity distfrequencies are harmonics fundamental frequency.
f
Spectral Flattening
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.10
e Predictive Coding.
d
time
25D LE
ORGAN / B.GOLD LECTURE 26
In the late 1960’s, schroeder and Atal introduced APC -Adaptiv
Long Term Predictor Prediction
Sample to be predicte
Previous Samples
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.11
n
the sampling rate.
M)
e1 n k–( ) e2+
25D LE
ORGAN / B.GOLD LECTURE 26
e1 n( )
Two types of prediction.
Perform LPC Analysis on
By transmitting
Major Assumption - is so small
that it can be quantized to 1 bit. BUT SENT at
Predicted Valuey n( ) αy n T–( ) y n( ) e n( )+= =
so e1 n( ) y n( )– αy n –(+=
a1 a2 …ak and α M,, ,
e1 n( ) a1e1 n 1–( ) a2e1 n 2–( ) …ak+ +=
e1 n( )
e2 n( )
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.12
.
signal costs 8kbs.
2kbs.
Bit Rate.
is big.
25D LE
ORGAN / B.GOLD LECTURE 26
Even a ONE bit error signal results in a large bit rate
If sampling rate is 8kHz, then transmission of error
Addition of transmitting ’s sould be another
Early APC Systems Operated at 9600bps.
Major Research Efforts : Reduction of Error Signal
If pitch is wrong, first error signal
d M ak, ,
e1 n( )Pitch Detection
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.13
dard Excitation Signals.
mpling rate.antization.
pefully].
25D LE
ORGAN / B.GOLD LECTURE 26
* LPC - Error Signal is eliminated and replaced by stan
* RELP - Residual Excited linear Prediction.
(like Channel Vocoder)
Low Pass Filter of error Signal - reduced sastill one bit qu
* VELP - Voice-Excited LP.
So error signal rate can be reduced [ho
They aimed for 4800 bps.
Key to Error Signal Reduction.
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.14
vailable for
the Error Signal.
vailable.
ter.
sequentially
n frames] and
tain a GOOD fit
H.countered this ideaSetevens Halle concept pter 17.
25D LE
ORGAN / B.GOLD LECTURE 26
As computer speeds increased, New sptions became a
real-time Coding of
Basic Philosophy - Analysis by Synthesis
* Transmitting system has both analyzer and synthesizer a
* So Synthetic Speech can be generated at the transmit
* Using same criteria, the synthetic speech is compared
with the actual speech [perhaps every frame, or every
synthesizer parameters obtained by analysis VARIED to ob
between ACTUAL SPEECH vs. SYNTHESIZED SPEEC
* So the best parameters are sent. we enin the in cha
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.15
r quantized error signal.
A
Synthetic
act Parameters
uered difference
esisSpeech
s n( )
he Vocal Tract Parameters.
R meters
Synthetic SpeechTe
���������������������������������������������
25D LE
ORGAN / B.GOLD LECTURE 26
Basic Idea - Replace the one-bit error signal of APC with a vecto
nalyzer
Store of
Speech LPC
LPC
Vocal Tr
mean sq
M Sequences, each of length
one frame.
Change
Address
Analyzer
Synth
s n( )
Transmit address of best of the M sequences transmit t
eceiver
LPC Synthesis
Store of M Sequences, each of length
one frame.
Vocal Tract Para
“best” exciter
best address
his is an
…
xample of VQ.
��� ���������������������������������������������������������������������������������������������������������������������������������
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.16
lyzer must
t speaker]
es in a single frame.]
stems can opearate
25D LE
ORGAN / B.GOLD LECTURE 26
Problems discussed in Chapter 33
* How to create the stone?
* Perceptual weighting filter.
* Delay
* Reduction of Codebook Search. [In 20ms, the ana
* Adaptive Coding.[ Storage is adaptive to a differen
performe complete analysis-synthesis many tim
So computer speed must be such that MANY sy
in real time SIMULTANEOUSLY.
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.17
s(z 1–
s n( )
Modulation.
s( n( )
z 1–
s n( )+_
V
25D LE
ORGAN / B.GOLD LECTURE 26
_
n) e n( ) e n( )
e n( )z 1–
Figure 33.7 : Differential Pulse Code Modulation.
Figure 33.8 : Adaptive Differential Pulse Code
Quantized error Signal
Adaptive Format
Low-Pass Rectification
+Q
_n) e n( )
e n( )e
z 1–
Filtering
Quantizer Q(V)
Format
V
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.18
s n( )
Receiver
M Analyzer
Σ
z 1–
z 1–
1
M n( )
a1 0.9418=
odulation.
25D LE
ORGAN / B.GOLD LECTURE 26
Channel
M
s n( ) e n( ) e n( )
s n( ) a1s n 1–( ) M n( )+=
Analyzer
Transmitter
Σ Hard Limiter
z 1–
z 1–
Σ
z 1–z–
…
s n 1–( )
M n( ) f(e n( ) e n 1–( ),,=
e n 2–( ))a1
a1
Figure 33.9 : Continuously Variable Slope Delta M
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.19
S n( )
oncept
S
z( )
11 P z( )–--------------------
25D LE
ORGAN / B.GOLD LECTURE 26
s n( )
P z( ) e n( )
+Q
Figure 33.10 : Rudimentary Linear Prediction C
peech
AnalyzerParameter
+
Coder
P
Quantized Error Signal1 P z( )– +
_
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.20
drature Mirror Filters.
s' n( )
K1
_
K21:2
1:2
25D LE
ORGAN / B.GOLD LECTURE 26
Figure 33.14 : Four Channel Subband Coder with Qua
_
e1 n( )
x n( )
+
H2
K1
K2
e2 n( )
e4 n( )
H1
+
H1
_
+ K1
e3 n( )
H1
+
K2
H2
H2
1:22:1
2:1
2:1
2:1
2:1
2:1
1:2
1:2
1:2
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.21
25D LE
ORGAN / B.GOLD LECTURE 26
Figure 33.2 : Zig-Zag Network
OUTPUT
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.22
ignal
Frequency
25D LE
ORGAN / B.GOLD LECTURE 26
Figure 33.1 : Spectral Flattening of the Base-Band S
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.23
ning.
tic
S
Fig. 33.1c
25D LE
ORGAN / B.GOLD LECTURE 26
Figure 33.3 : Implementation of Spectral Flatte
Hand Limiter Input-Output Characteris
peechLow-Pass
Half-Wave
Bandpass Hand
Fig. 33.1a
Fig. 33.1b
Output
Input
Filter
Rectifier
Filter
Limiter
Bandpass Hand Filter Limiter
Bandpass Hand Filter Limiter
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.24
Coding
to RCVR
)
and
Packing
25D LE
ORGAN / B.GOLD LECTURE 26
Adaptive Predictive Coding of Speech
Input One Bit
LPC
Pitch
First Error Signal
Σ
-
+
αzM–
sn
a1
e n(
α
akzk–∑
M
q
Quantizer
Σ
Σ
Σ
Σ
αz
M–Analysis
Analysis
a2
a4
q
Speech
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.25
I
Pass
ignal
Spectrum
Output r
Flattener
+S Speech
tion)
Signal
Spectrum
Output
Flattener
S Speech
25D LE
ORGAN / B.GOLD LECTURE 26
Block Diagrams of VELP and RELP.
nput
LPC
Low-Pass
Coding LPC High-
Excitation S
Distotion
Analysis
Filter
Synthesis
Network
Filte
Low-Pass Filter
Formatting
Transmission
Decoding
peech
(a) VELP (Voice-Excited Linear Prediction)
(b) RELP (Residual-Excited Linear Predic
LPC parameterInput LPC
Low-Pass
Coding LPC
Excitation
Distotion
Analysis
Filter
Synthesis
NetworkLow-Pass Filter
Formatting
Transmission
Decoding
peech Error Signal
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.26
hm.
25D LE
ORGAN / B.GOLD LECTURE 26
Spectra at different nodes in the VELP algorit
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.27
g Delay Predictors.
I
Speech
Selecting
Instantaneous_ Objective
Error
igianl eech
gnalSn
25D LE
ORGAN / B.GOLD LECTURE 26
Figure 1 : Speech Synthesis Model with Short and Lon
++nnovation
Fine Structure Spectral Envelope
Long Delay Short Delay
(Formants) (Pitch) “White”
Predictor Predictor
Figure 2 : Block Diagram Illustrating the Procedure for
++Innovation
Long Delay Short Delay
“White”
Predictor Predictor
PerceptualAverage Square
Perceptual Weighting Filter
Error
Or Sp Si
Sn
the Optimum innovation Sequence.
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.28
25D LE
ORGAN / B.GOLD LECTURE 26
Analysis by Synthesis
How to create the store?
Reduction of delay.
Reduction of Codebook Search
Adaptive Codebook.
APC 10,000bps.
Bessie Smith
Louis Armstong.
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.29
e1 n( )
25D LE
ORGAN / B.GOLD LECTURE 26
APC - Adaptive Preductor Coder Late 1960’s
pitch
Analysis
LPC
M
e1 n( )
e2 n( )
y n( ) αy n M–( ) y n( ) e1 n( )+= =
e1 n( ) y n( )– αy n M–( )+=
e1 n( ) a1e1 n 1–( ) a2e1 n 2–( ) … e2 n( )+ + +=
i bit
y n M–( ) y n( )
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.30
ocodar
to get Flat Spectrum.
nthesizer
Hz
ts per second.
25D LE
ORGAN / B.GOLD LECTURE 26
Channel
Low-Pass Baseband
V
Non-Linear
Impossible
Distortion
Sy
Filter 0-800Hz
Vocoder Analyzer 800-8000Hz
Parameter
0-800
10,000 bi
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.31
1 f2
f∆ n 1+( ) f∆
25D LE
ORGAN / B.GOLD LECTURE 26
Subband Coding
64k bits
20-30k bits
4bits
2bits
0 f1–
f1 f2–
fn fn 1+–
0 f1–
f1 f2–
1500
4000
f
n
EE 2 CTURE ON MEDIUM AND HIGH RATE CODING
N.M 26.32
r than Speech
25D LE
ORGAN / B.GOLD LECTURE 26
Medium & High Rate Systems
Quality
Robustness Speaker I.D.
Possible Vocoding of Sounds othe
Waveform - PCM - 8kHz, 8bits 64kbits.(Subband Coding)
DPCM ADPCM CVSD
Basic Issue
Efficient Representation of the Excitation.
Voice-Excited Systems LPC
Channel Vocoders
Prediction - APCCELPMulti-Pulse