robingerzaguet - univ-rennes1.fr

Robin GerzaguetAssociate Professor @ENSSAT

IRISA • Equipe [email protected]

[email protected]

2021-2022

Digital Signal ProcessingLecture notes • SysNum-2

[email protected]

[email protected]

Objectives

From signal processing specificationsDigital filter analysisDigital Signal processing transformInfinite Impulse Response (IIR) filtersFinite Impulse Response (FIR) filtersSpectral analysisMultirate processing

To practical implementationFixed point designDigital Signal Processor (DSP)

Digital Signal processing versus Processing of digital signal !

Digital Signal Processing 2 / 355

Objectives

DSP in a nutshellFrom Signal processing tools

Filters, transforms, analysisTo DSP applications

Fixed point for real time efficient implementationsSignal processing architectures (DS)

Operating methodsLecture (20h), Tutorials (16h)Lab (9h)And DSP module (Lab and project)Some quizz using Klaxxon


Gratitude

These slides and the lecture structure have been massively inherited from the workof several (awesome) persons:

Olivier Sentieys: people.rennes.inria.fr/Olivier.Sentieys/

Daniel Menard: http://dmenard.perso.insa-rennes.fr/

Olivier Berder: https://scholar.google.fr/citations?user=aUCL1iMAAAAJ&hl=fr&oi=ao

Sources can be found at my personnel page:https://perso.univ-rennes1.fr/robin.gerzaguet/


people.rennes.inria.fr/Olivier.Sentieys/

http://dmenard.perso.insa-rennes.fr/

https://scholar.google.fr/citations?user=aUCL1iMAAAAJ&hl=fr&oi=ao

https://scholar.google.fr/citations?user=aUCL1iMAAAAJ&hl=fr&oi=ao

https://perso.univ-rennes1.fr/robin.gerzaguet/

Objectives

1 IntroductionDigital signal processing in a nutshellMain applicationsAnalog Digital conversion

2 Digital filtering analysisZ transformParallel and successive structuresFiltering specificationFIR filteringIIR filteringSecond Order Sections (SoS)


Objectives

3 IIR synthesisIntroductionFilter prototype functions

Butterworth filtersChebyshev filtersElliptic filtersBessel filters

Transformation functionsImpulse Invariance method

Conclusion

4 FIR synthesisIntroductionLinear phase filtersWindow based synthesis

PrincipleApodisation functionsWindows comparisonWindow selection

Frequency sampling methodConclusion


Objectives

5 Fixed point arithmeticIntroductionNumbers representation: coding familyFixed and floating points

Fixed pointFloating point

Fixed point arithmeticAdditionMultiplication

Fixed point designDynamic

PrincipleSimulation methodsAnalytical methods

Point positionningDynamicRulesConclusion

Data width & quantization noiseFormatWord charac.Quantization noise

Conclusion


Objectives

6 Fast Fourier TransformIntroductionPrinciple of the FFTTime split FFTFrequency split FFTConclusion

7 Multirate processingIntroductionDownsamplingUpsamplingFractional interpolation


Objectives

8 Digital Signal processorDSP algorithm implementationDigital signal processor architectureDSP core units

Processing unitsMemory unitsControl unit

Some DSPsConclusion on DSP


Introduction

Digital Signal Processing Introduction 10 / 355

Introduction1 Introduction

Digital signal processing in a nutshellMain applicationsAnalog Digital conversion

2 Digital filtering analysis

3 IIR synthesis

4 FIR synthesis

5 Fixed point arithmetic

6 Fast Fourier Transform

7 Multirate processing

8 Digital Signal processorDigital Signal Processing Introduction 11 / 355

Digital Signal processing ?

Figure: Digital and analog domains

Digital Signal Processing Introduction Digital signal processing in a nutshell 12 / 355

DSP in a global chain

From sensors to actuators [1]

Sens

ors

A

NB

ADC

Low passAnti aliasing filter

D.S.P. DAC

Actu

ator

s

0.2360.4210.5630.60.610.6130.611.2360.580.490350.20.20.21

0.2360.4210.5630.60.610.6130.611.2360.580.490350.20.20.21

Digital action Te

[1] Proakis and Manolakis, Digital Signal Processing : Principles, Algorithms and Applications, 1996.


DSP in a global chain

Example with a smooth filtering of sensor capture1 Raw sensor measure (high variations)2 After blocker in ADC chain3 Digital samples4 Treated digital samples can be used for additional post-processing5 After converted back in analog domain6 Smooth output from sensors

Possible to do the processing directly in analog domain.More flexibility to do in digital domainAdditional processing and re-useMore complex to design


Digital versus analog processingAdvantages

No drift : processing independent from temperature, agingPrecision depending on bit rangeSoftness: several possible tasks and reconfigurationPrototyping: Software/firmware updatesPrediction: same behaviour as in simulationPerformancesIntegration in systems (SoC, DSP, VLSI)

DrawbacksHigh cost for simple systemsRequired computational speed is function of system bandwidth (can be veryhigh !)Complexity: Hardware, software, interfaces

Need a specific methodology to design a DSP system


Digital versus analog processingAdvantages

No drift : processing independent from temperature, agingPrecision depending on bit rangeSoftness: several possible tasks and reconfigurationPrototyping: Software/firmware updatesPrediction: same behaviour as in simulationPerformancesIntegration in systems (SoC, DSP, VLSI)

DrawbacksHigh cost for simple systemsRequired computational speed is function of system bandwidth (can be veryhigh !)Complexity: Hardware, software, interfaces

Need a specific methodology to design a DSP systemDigital Signal Processing Introduction Digital signal processing in a nutshell 15 / 355

For what applications ?

Digital Signal processing is everywhereAt home

Broadcast TV, Set top Box, video games, VR, networksDVD, HDTV, CD, DAB, DVD, WiFi, . . .

At workVideoconference (Webex), VoIP, phones, internetHigh speed networks (Fiber, ADSL, . . . ), smart buildings, IA, . . .

In mobilityMobile connectivity, smartphones, smartwatch, . . .GPS, Radar, cellular coverage (3G, 4G-LTE, 5G-NR)

Digital signal processing is everywhere, leveraging new usages and markets


Some DSP applications

Different processing on different signalsSignal from sensors

I Associated to physical measure (temperature, pressure, . . . )

Biomedical signalsI EEG, IRM, scanner, . . .

Data signalsI Financial data (trading), statistics

Speech processing

Image processing

. . .

Processing must be adapted to the signal.

Digital Signal Processing Introduction Main applications 17 / 355

Example of speech processing

Figure: Time and frequency component. Up: Voyel. Down: consonantDigital Signal Processing Introduction Main applications 18 / 355

Example of speech processing

Signal in time domainSparsity and statisticalpropertiesIntra and Inter uservariability (speechsignature)Speech masking

Signal in frequency domainHarmonic analysisFrequency masking


Example of echo cancellingSignal processing and intelligence (adaptive processing) to cancel the unwantedcontribution [2]

Figure: Noise cancelling principle apply to earphones. [3]

[2] Haykin, Adaptive filter theory, 1996.[3] Electronic Design, “What’s the Difference Between Passive and Active Noise Cancellation?” 2018.


Analog to Digital conversion


Analog to Digital conversion

First stage that model analog to digital transformationAnalog inputDigital output

We call this a mixed signal device

QuantizationInput signal is continuous and output signal is discrete

Only N level as the outputLead to a continuous to discrete conversion error: quantization errorCan also be modeled as a additive noise (quantization noise)

Digital Signal Processing Introduction Analog Digital conversion 22 / 355

ADC architectureBasic ADC architecture

Input is between 0V andVref

Conversion on 3 bits: 8levels

Observed voltage Output level0 - 1 V 0 0 01 - 2 V 0 0 12 - 3 V 0 1 03 - 4 V 0 1 14 - 5 V 1 0 05 - 6 V 1 0 16 - 7 V 1 1 07 - 8 V 1 1 1

Analog input 3 bits. . . ADC

vref vcc

gnd

b0b1b2


ADC architectureOutput is a digital word

Composed of n bits associated to an analog range (uniform quantization)Some bits carry more “information” than other

Least Significant bit (LSB) and Most Significant bit (MSB)

We have b0, b1, b2An error in b0 leads to anerror of 1 VAn error in b2 leads to anerror of 4 V Analog input 3 bits

. . . ADC

vref vcc

gnd

b0b1b2

See quantization noise in next lectures.


Samples and hold ADCMost famous (and easy to understand) topology, architecture relies on

Sampling time Tc

ADC resolution (i.e numbers of outputs bits)ADC samples (captures, takes) the voltage of an analog signal and holds its valueat a constant level for a specified minimum period of time (Tc)

Analog signal Digital output

C

Figure: Sample and hold architecture [4]

Control signal pilot the holding during Tc

[4] Analog Devices, “Samples and hold amplifiers,” 2008.


Samples and hold ADC

0 2 4 6 8 100

5

10

15

20

25

Time index

Inpu

tvo

ltage

0 2 4 6 8 100

5

10

15

20

25

Time index

Inpu

tvo

ltage

Figure: Sampling and Hold behavior

Rate at a given sampling frequency (dash lines)Signal at this time is hold (red point)An output is given for each interval (purple curve)


Sigma-Delta ADCIn ADC mode, convert a high sampling frequency low bit resolution to a Lowsampling frequency high bit rateBased on 1-Bit ADC quantization stageTopology is half Digital // half analogAdvantage: the digital filter reduces noise impact

Figure: Sigma-Delta ADC topology [5]

[5] Baker, “How Sigma-Delta ADCs work, Part I,” 2011.


Sigma-Delta ADCSigma-Delta modulator topology [6]

Input x(t) +∫

+- To filter

+Vcc

-Vcc

Figure: Scheme of sigma-delta Modulator

A comparator (analog)An integrator (analog)A 1-bit ADC (output is filter input)A 1-bit DAC retro-action

Application: Input signal is 1V[6] Río, Medeiro, and Perez-Verdu, CMOS cascade sigma-delta modulators for sensors and telecom: error analysis and practical design, 2006.


Sigma-Delta ADC

If input is a constant 1V signalx(t) = 1, ∀ tOrder I integrator1 bit ADC/DAC piloted byVcc

Input x(t) +∫

+- To filter

+Vcc

-Vcc

Figure: Scheme of a sigma-deltamodulator

0 100 200 300 400 500 6000

0.5

1

Oversampling indexOutpu

tof

filtervalue

Figure: Output of sigma-delta ADC for aconstant 1V input


Sigma-Delta ADCOutput of Σ ∆ is not the binaryword !

Necessity to apply atransformation on it with adigital filter.Out of our scope here, butsome filter properties

Key design space is the digital fil-ter

Often sinc or sinc3 shapefiltersFilter built to allow flatresponse in useful bandCare design due to notchfrequencies

Figure: Digital filter used in sigma-deltaADC [7][7] Pisani, “Digital Filter Types in Delta-Sigma ADCs,” 2017.


SAR - ADC

SAR - Successive ApproximationRegister ADC

The input voltage is successivelyindentified with a register thatmoves its binary valuesfor N bits identification, SAR shouldbe N times fasterDAC for analog comparison betweeninput and outputAfter convergence, output binarysequence is sent to digital stage

I End of Conversion (EOC)

SARClock

DAC

b0 bN−1

Vref

-+V

e

EOC

Figure: SAR scheme


SAR - ADC

SAR - ADC • Algorithmfor i = 1 : N

bi = 1;W = DAC b(:) ;e = V −W ;

I if e > 0bi = 1;

I elseif e <0bi = 0;

I elsebi = 1, EOC

end

W =

(∑N−1i=0 b[i ]2N−i

)2N − 1 × VRef

SARClock

DAC

b0 bN−1

Vref

-+V

e

EOC

Figure: SAR scheme


SAR ADCIf input is a constant 1V signal

x(t) = 1, ∀ t8 bitsEOC computed at eachiteration

SARClock

DAC

b0 bN−1

Vref

-+V

e

EOC

Figure: Scheme of a sigma-deltamodulator

0 2 4 6 8 10 12 14 16

1

1.5

2

2.5

Bit indexEO

Cev

olut

ion

SAR performance

Figure: Output of SAR ADC for a constant1V output


Pipeline ADCVery used for high bit rateCorresponds to successive low resolution ADCSample and Hold ADC and low bits DAC, last stage is different (lowerresolution)

Figure: Pipeline architecture [8]

[8] Sahoo and Razavi, “A fast simulator for pipelined A/D converters,” 2009.


ADC comparison

Different technology for different purposeSample and Hold

4 Simple architecture

4 Analog architecture that takes into account good properties of MOStechnology

7 Sampling moment distortion (due to clock raising failing time issue)7 Errors due to MOS transistor switches (MOS switch charge injection)


ADC comparison


4 Simple architecture4 Analog architecture that takes into account good properties of MOS

technology

7 Sampling moment distortion (due to clock raising failing time issue)7 Errors due to MOS transistor switches (MOS switch charge injection)


ADC comparison



technology7 Sampling moment distortion (due to clock raising failing time issue)

7 Errors due to MOS transistor switches (MOS switch charge injection)


ADC comparison



technology7 Sampling moment distortion (due to clock raising failing time issue)7 Errors due to MOS transistor switches (MOS switch charge injection)


ADC comparison

Different technology for different purposeSigma-Delta architecture

4 Massive digital architecture: low power, easy to scale, not temperaturedependent

4 Possible very good precision only dependent on speed and processing4 Flexibility ensured with digital post-filtering (no need strong anti aliasing

filters)7 Required speed for high bit precision: in practice limited bandwidth7 Latency induced by digital filtering stage


ADC comparison



4 Possible very good precision only dependent on speed and processing

4 Flexibility ensured with digital post-filtering (no need strong anti aliasingfilters)

7 Required speed for high bit precision: in practice limited bandwidth7 Latency induced by digital filtering stage


ADC comparison




filters)

7 Required speed for high bit precision: in practice limited bandwidth7 Latency induced by digital filtering stage


ADC comparison




filters)7 Required speed for high bit precision: in practice limited bandwidth

7 Latency induced by digital filtering stage


ADC comparison




filters)7 Required speed for high bit precision: in practice limited bandwidth7 Latency induced by digital filtering stage


ADC comparison

Different technology for different purposeSAR architecture

4 Massive digital architecture: low power, easy to scale

4 One clock to rule them all (versus pipeline architectures)4 More bandwidth than Sigma-delta7 Calibration and DAC architecture


ADC comparison


4 Massive digital architecture: low power, easy to scale4 One clock to rule them all (versus pipeline architectures)

4 More bandwidth than Sigma-delta7 Calibration and DAC architecture


ADC comparison


4 Massive digital architecture: low power, easy to scale4 One clock to rule them all (versus pipeline architectures)4 More bandwidth than Sigma-delta

7 Calibration and DAC architecture


ADC comparison


4 Massive digital architecture: low power, easy to scale4 One clock to rule them all (versus pipeline architectures)4 More bandwidth than Sigma-delta7 Calibration and DAC architecture


ADC comparison

Different technology for different purposePipeline architecture

4 Higher achievable rate

4 Higher instantaneous bandwidth4 Scalable architecture7 Pipeline latency7 Calibration and error corrections


ADC comparison


4 Higher achievable rate4 Higher instantaneous bandwidth

4 Scalable architecture7 Pipeline latency7 Calibration and error corrections


ADC comparison


4 Higher achievable rate4 Higher instantaneous bandwidth4 Scalable architecture

7 Pipeline latency7 Calibration and error corrections


ADC comparison


4 Higher achievable rate4 Higher instantaneous bandwidth4 Scalable architecture7 Pipeline latency7 Calibration and error corrections


ADC comparison

Figure: ADC architecture-use cases adequation from Texas Instrument [9]

[9] Texas instrument, “Choose the right A/D converter for your application,” 2015.


Digital filtering

Digital Signal Processing Digital filtering analysis 40 / 355

Filtering

Digital filteringWhat is digital filtering ?

A modification of the time frequency components of the input signalTime and frequency are linked (dual), so define transformation in one domainis enough

We consider a digital signal sampled at sampling time Tc defined as

x [n] = x(nTc)

Filtering affects all the samples x [n] (but not only x !)

Digital Signal Processing Digital filtering analysis Z transform 41 / 355

Filtering

x[n] h y[n]

Filter transformation is expressed with the Z transformFilter and signal can be derived with the Z transform

h(n) <=> H(z)

Filtre numérique

x(n) <=> X(z) y(n) <=> Y(z)7z


Introducing Z transform

Filtering

General framework of digital filteringFilters affects input and output of signal

2 parts in the digital filtering: one affects x and one yWe will focus here on causal filters with depth LFiltering is a bilinear form

y [n] =Lb−1∑k=0

bkx [n − k] +La−1∑k=1

aky [n − k] (2.1)

Coefficients bk affects the inputCoefficients ak affects the output


Filtering

y [n] =Lb−1∑k=0

bkx [n − k] +La−1∑k=1

aky [n − k] (2.2)

By rearranging the previous equation, we can link Z transformsx [n]→ X (z)y [n]→ Y (z)h[n]→ H(z)

We define the filtering operation in Z domain as

Y (z) = X (z)× H(z) (2.3)


Filtering and Z transform

Filtering

y [n] =Lb−1∑k=0

bkx [n − k] +La−1∑k=1

aky [n − k]

We have

Y (z) =(Lb−1∑

k=0bkz−k

)X (z) +

(La−1∑k=1

akz−k)

Y (z) (2.4)

(Lb−1∑k=0

bkz−k)

︸︷︷︸N(z)

X (z) =(1−

L−1∑k=1

akz−k)

︸︷︷︸D(z)

Y (z) (2.5)



Filtering

(Lb−1∑k=0

bkz−k)

︸︷︷︸N(z)

X (z) =(1−

La−1∑k=1

akz−k)

︸︷︷︸D(z)

Y (z) (2.6)

By recalling H(z) = Y (z)X(z)(Lb−1∑

k=0bkz−k

)X (z) =

(1−

L−a1∑k=1

akz−k)

Y (z) (2.7)

H(z) =∑Lb−1

k=0 bkz−k

1−∑La−1

k=1 akz−k= N(z)

D(z) (2.8)



Poles and ZerosFilter can be expressed with a numerator and a denominator

N(z) =∑Lb−1

k=0 bkz−k

D(z) = 1−∑La−1

k=1 akz−k

Note that we can have D(z) = −∑La−1

k=0 akz−k with a0 = −1.We can express the 2 polynomials as

N(z) = b0 ×∏Nz

k=1(z − zk)D(z) =

∏Npk=1(z − pk)

(2.9)

N(z) is the numeratorI It has Nz zeros

D(z) is the denominatorI It has Np poles


Poles and ZerosFilter can be expressed with a numerator and a denominator

N(z) =∑Lb−1

k=0 bkz−k

D(z) = 1−∑La−1

k=1 akz−k

The filter can be expressed in Z transform as

H(z) =∑Lb−1

k=0 bkz−k

1−∑La−1

k=1 akz−k(2.10)

H(z) = b0 ×∏Nz

k=1(z − zk)∏Npk=1(z − pk)

(2.11)

ReminderStability is ensured with |pk | < 1Minimal phase filter1: zeros belongs to the unit circle

1A filter is said to be minimum-phase if the system and its inverse are causal and stable.


Parallel and cascade formSome filters can be split into parallel architecture or cascade architecture

The global Z transform is function of subindexed filters Hi .

Hparallel =M∑i=1

Hi(z) =M∑i=1

∑Lb−1k=0 bi

kz−k

1−∑La−1

k=1 aikz−k(2.12)

Hcascade =M∏i=1

Hi(z) =M∏i=1

∑Lb−1k=0 bi

kz−k

1−∑La−1

k=1 aikz−k(2.13)

X(z) H(z) Y (z)

Direct form

X(z)

H1(z)

H2(z)

HM (z)

...

Y (z)

Parallel architecture

Digital Signal Processing Digital filtering analysis Parallel and successive structures 49 / 355

Parallel and cascade formSome filters can be split into parallel architecture or cascade architecture

The global Z transform is function of subindexed filters Hi .

Hparallel =M∑i=1

Hi(z) =M∑i=1

∑Lb−1k=0 bi

kz−k

1−∑La−1

k=1 aikz−k(2.12)

Hcascade =M∏i=1

Hi(z) =M∏i=1

∑Lb−1k=0 bi

kz−k

1−∑La−1

k=1 aikz−k(2.13)

X(z) H(z) Y (z)

Direct form

X(z) H1(z) H2(z) HM (z). . . Y (z)

Successive architecture


Parallel and cascade formExample of cascade form

y [n] = 4x [n] + 3x [n − 1]− x [n − 2]

Polynomial formWe have H(z) = 4 + 3z−1 − z−2

−x2 + 3x + 4 = 0

That leads tox1 = −3−

√25

2×−1 = 4x2 = −3+

√25

2×−1 = −1

x[n] x

4

z−1 x

3

z−1 x

-1

+ y[n]


Parallel and cascade formExample of cascade form

y [n] = 4x [n] + 3x [n − 1]− x [n − 2]

We have H(z) = 4 + 3z−1 − z−2, equivalent to H(z) = (z−1 + 1)(z−1 − 4)

x[n]

z−1

x

1

+

z−1

x

-4

+

H1 H2


Filtering specificationBefore being implemented filtersshould be specified

What do they do in timeand frequency domainsHow are they characterized

More convenient to lookat the frequency domain be-haviour

Some parts are cut andsome are allowed

1 Low pass filtering (a)2 High pass filtering (b)3 Bandpass filtering (c)4 Stopband filtering (d)

ΩπΩc

1

H(ejΩ)

a) Filtre passe-bas

ΩπΩc

1

H(ejΩ)

b) Filtre passe-haut

ΩπΩ0

1

H(ejΩ)

c) Filtre passe-bande

Ωπ

1

H(ejΩ)

d) Filtre réjecteur-de-bande

Ω2Ω1 Ω0 Ω2Ω1

Bandpass filtering

High pass filteringLow pass filtering

Stopband filtering

Figure: Frequency response (ideal)

Digital Signal Processing Digital filtering analysis Filtering specification 52 / 355

Low pass filteringWe define 3 area

1 Bandwidth: 0 < Ω < Ωp

2 Transition band Ωp < Ω < Ωa

3 Attenuated band Ωa < Ω < π

ΩπΩp

1

|H(ejΩ)|

1-δ1

1+δ1

δ2

Ωa

Ωπ

Ωp

0 dB

|H(ejΩ)| (dB)

20log(1-δ1)

20log(1+δ1)

20logδ2

Ωa

a) Gabarit fréquentiel linéaire b) Gabarit fréquentiel en dB


Bandpass filtering

ΩπΩp+

1

|H(ejΩ)|

1-δ1

1+δ1

δ2

Ωa+Ωa- Ωp-

Figure: Frequency mask of bandpass


Low pass filtering

All filters are LPFHPF as 1-LPFBP as translated LPFSB as 1-BP

Summary of filter parameters

LPF HPF Bandpass StopbandSelectivity s Ωp

ΩaΩaΩp

Ωp+−Ωp−Ωa+−Ωa−

Ωa+−Ωa−Ωp+−Ωp−

Ripple δ1 δ1 δ1 δ1Attenuation δ2 δ2 δ2 δ2

Centering frequency Ω0 - -√

Ωp+.Ωp−√

Ωp+.Ωp−Bandwidth broadness B - - Ωp+−Ωp−

Ω0

Ωp+−Ωp−Ω0


Filter classificationSeveral criteria for classification

Impulse response classificationDepending on the Z transform structure, two kind of filters

Finite Impulse Response filter (FIR)Infinite Impulse Response (IIR)

As the name suggests, function of impulse response nature

Structure classificationFunction of recursive classification2

Non recursive filters FIRRecursive filters IIR

2Equivalence with one exception, see later


Finite Impulse ResponseCannot be derived from analog filtersFIR filters

Impulse response of output signal is finiteParticular case of IIR filters when there is no denominator

Output is only function of input signal

y [n] =L−1∑k=0

bkx [n − k] (2.14)

H(z) =L−1∑k=0

bkz−k (2.15)

Limited response in time domain∀k > L− 1, bk = 0.

Digital Signal Processing Digital filtering analysis FIR filtering 57 / 355

Finite Impulse ResponseCannot be derived from analog filtersFIR filters

Impulse response of output signal is finiteParticular case of IIR filters when there is no denominatorOutput is only function of input signal

y [n] =L−1∑k=0

bkx [n − k] (2.14)

H(z) =L−1∑k=0

bkz−k (2.15)

Limited response in time domain∀k > L− 1, bk = 0.


Finite Impulse Response

Some key FIR properties4 Synthesis methods can lead to arbitrary design of any desired frequency

response4 Inherent stability as output only depends on input4 Design can ensure linear phase, equivalent to constant group delay (no

harmonic distortion in signal)4 Easier implementation in digital architecture7 Transition bandwidth larger than the one of IIR filter with same size (depth

of numerator)


Finite Impulse ResponseFIR implementation

Direct structure (one partof the previously introducedIIR filter)Alternative method:transposed structure

x[n] x

b0

z−1 x

b1

+

z−1 x

b2

+

z−1

...

x

bL−1

+ y[n]


Finite Impulse ResponseFIR implementation

Direct structure (one partof the previously introducedIIR filter)Alternative method:transposed structure

x[n] x

b0

+

x

b1

+

z−1

x

b2

+

z−1

x

bL−1

...

z−1

...

y[n]


FIR complexityLets assess the number of operation to perform a L taps filtering

L multiplicationsL− 1 additions2L elements in memory (storing b0 · · · bL−1 and L elements for the inputsignal x [n − k])

Equivalent to L Multiplication and accumulation (MAC)

A DSP can do one MAC per clock cycleTo filter a data sampled at Fc what is the minimal DSP ?Pc expressed in Million Instruction per Second would require a powercalculation at least greater than

Pc >L× Fc106 (2.16)


Infinite Impulse ResponseThese filters are directly derived from analog filters.

Same effects for digital filtersBut additional effects due to digitalisation (and quantization)

IIR formulationDefinition with

Z transformationDifference equation

y [n] =L−1∑k=0

bkx [n − k] +L−1∑k=1

aky [n − k] (2.17)

H(z) =∑L−1

k=0 bkz−k

1−∑L−1

k=1 akz−k= N(z)

D(z) (2.18)

General bilinear form that links output to input

Digital Signal Processing Digital filtering analysis IIR filtering 62 / 355

Infinite Impulse Response

We have defined a numerator and a denominatorN(z) =

∑L−1k=0 bkz−k

D(z) = 1−∑L−1

k=1 akz−k

If D(z) does not divide N(z) we have an infinite number of terms:

H(z) =∞∑k=0

ckz−k

Otherwise, FIR filtering


IIR properties

Some key IIR properties4 Small transition bandwidth

I Best template can be achieved with IIR filtering

4 Synthesis methods directly inherited from analog methods

7 Risk of instability due to pole positioning

7 Risk of digital instability (feedback loop as output depends in previousoutputs)

Efficient filtering architecture but use with caution


Exercise

We consider H(z) = 1M

1−z−M

1−z−1

Questions1 Show that we have y [n] = y [n − 1] + 1

M [x [n]− x [n −M]]2 Show that we have y [n] = 1

M∑M−1

i=0 x [n − i ]

Correction1 From rational expression directly as H(z) = Y (z)

X(z)

2 Several methods thereI By developing y [n − 1] = y [n − 2] + 1

M [x [n − 1]− x [n − 1−M]]I By factoring (1− z−M) = (1− z−1)× (1 + z−1 + z−2 + · · ·+ z−M+1)I Considering we have a geometric progression with scale factor z−1 FIR

expression


IIR structures

Different structures can be used for IIR filtering

Two based on recursive and non recursive decompositionI One straightforward structure (independent stages)I One that group same delay operation

Structure that minimize memory footprintI Introduction of w [n − k] linked to what happens at delay kI 2 structures can be defined (direct and transposed)

Non canonic structuresH(z) = N(z)

D(z) = N(z)× 1D(z)

IIR is the cascade of a FIR and a pure recursive filter (also called all polefilter)


IIR structure - Non Canonic Cascade

x[n] x

b0

z−1 x

b1

z−1 x

b2

z−1

...

x

bL−1

+

z−1x

a1

z−1x

a2

z−1

...

x

aL−1

+ y[n]

Figure: IIR cascade structure

In cascade form (FIR + All pole filter)One stage dedicated to Numerator and one to Denominator


IIR structure - Non Canonic direct

x[n] x

b0

+

z−1 x

b1

+

z−1 x

b2

+

z−1

...

x

bL−1

+

z−1x

a1

z−1x

a2

z−1

...

x

aL−1

y[n]

Figure: IIR direct structure

In direct formOne stage per delay


IIR filter - Canonic structuresDirect versus transpose structures

Due to multiplication commutativityIIR is also the cascade of an all pole filter and a FIR filter

However these structures require a ×2 in terms of memory (as D(z) and N(z) aretreated independently)

Minimization of number of memories: Canonic structures

Canonic structuresWe introduce W (z) and we express the filter as a set of equations

W (z) = 1D(z) .X (z)

Y (z) = N(z).W (z)

w [n] = x [n] +

∑L−1k=1 ak .w [n − k]

y [n] =∑L−1

k=0 bk .w [n − k](2.19)

Same memory register w [n − k] in the two structuresA canonic transpose can be straightforwardly deduced from (2.19) due tomultiplication commutativity.


IIR Filter - Canonic directx[n] + • x

b0+ y[n]

z−1

•xa1

+ xb1

+

z−1

•xa2

+ xb2

+

z−1

...

•xaL1

+

...

xbL−1

+

...

w[n]

w[n − 1]

w[n − 2]

w[n − L + 1]

Figure: Direct Canonic structureDigital Signal Processing Digital filtering analysis IIR filtering 70 / 355

IIR Filter - Canonic transposex[n] x

b0

+ y[n]

z−1

+xb1

xa1

z−1

+xb2

xa2

z−1

...

+xbL−1

...

xaL−1

...

Figure: Transpose Canonic structure


IIR complexity evaluationLets assess the number of operation to perform a L taps IIR filtering

2L multiplications2(L− 1) additionsMemory depends on structure

I 4L elements for direct structure (2L coefficients, L inputs, L outputs)I 3L + 2 for canonic structure(x [n],y [n],2L coefficients, and L w [n − k]

coefficients)

Equivalent to 2L Multiplication and accumulation (MAC)

A DSP can do one MAC per clock cycleTo filter a data sampled at Fc what is the minimal DSP ?Pc expressed in Million Instruction per Second would require a powercalculation at least greater than

Pc >2L× Fc106 (2.20)


IIR complexity evaluationLets assess the number of operation to perform a L taps IIR filtering

2L multiplications2(L− 1) additionsMemory depends on structure

I 4L elements for direct structure (2L coefficients, L inputs, L outputs)I 3L + 2 for canonic structure(x [n],y [n],2L coefficients, and L w [n − k]

coefficients)

Equivalent to 2L Multiplication and accumulation (MAC)A DSP can do one MAC per clock cycle

To filter a data sampled at Fc what is the minimal DSP ?Pc expressed in Million Instruction per Second would require a powercalculation at least greater than

Pc >2L× Fc106 (2.20)


Second order Sections

We know how to describe an IIR and a FIR flowgraphI Several topologies, direct and transposeI Canonic structures for efficient memory management

2 main issues may ariseThe filter order may be important (based on specification).Implementation of a stable IIR may lead to unstable one due to Fixed Pointquantization.The higher the order, the higher the risk

I A solution can be to split high order IIR filter into cascaded small structures

Digital Signal Processing Digital filtering analysis Second Order Sections (SoS) 73 / 355

Second order Sections

Second order Sections (SoS)Divide IIR into cascaded cells of order II or order III (when order is odd)This is also called a Biquad filter [10]Cells are obtained simple element decompositionBe aware of order of filter cells

I Multiplication is commutative (cascaded filters can be exchanged)I But quantization noise is not the same depending on filter order

X(z) H(z) Y (z)

Direct form

X(z) H1(z) H2(z) HM (z). . . Y (z)

Successive architecture

Note: some design tools provide automatic direct to cascaded structure (for instancedir2cas.m in Matlab language).

[10] Jenkins and Nayeri, “Adaptive filters realized with second order sections,” 1986.

Digital Signal Processing Digital filtering analysis Second Order Sections (SoS) 74 / 355

IIR Synthesis

Digital Signal Processing IIR synthesis 75 / 355

Introduction

Create a arbitrary filter that respects the target specificationsDifferent methods can be used in order to built the filter

Synthesis methodsAnalog synthesis methods . From analog function H(p), transformation fis done to switch from p plan to frequency plan in order to obtain H(z). IIRstructure is straightforward from z structure.

I See next point for different f functions

Direct synthesis in z domain can also be done (but similar to first point)

Digital optimisation methods can be used in order to find a H(z). Costfunctions are introduced to minimize the error (distortion between desiredand obtained)

Digital Signal Processing IIR synthesis Introduction 76 / 355

Analog filter specificationsAnalog specifications

Normalized template Filter order

HNorm(p)

Approximation of H(p)

Filter type(Butterworth, Chebyshev,...)

H(p)

Un-normalization

Digital filtering Analog filtering

Transformation p=f(z)Which f ?

Structure choiceRauch, Sallien-Key,Bi-quadratic

Filter order


Figure: Reminder of analog specification (and introduction of digital filtering)


Filter normalisationFirst objective: obtain a normalized specifications: normalized template

The cut-off pulsation is set to 1Attenuated bandwidth start after 1/s (for s, see slide 76)Ripple and attenuation remains unchanged

ω1

1

|HNorm(p)|

1-δ1

1+δ1

δ2

1/s

ω

0

|HNorm(p)| (dB)

−∆1

∆1

∆2

a) Gabarit prototype linéaire b) Gabarit prototype en dB

1 1/s

Normalized template (linear) Normalized template (log)

Figure: Prototype template (normalized)


Low pass filtering

All filters are LPFHPF as 1-LPFBP as translated LPFSB as 1-BP

Summary of filter parameters

LPF HPF Bandpass StopbandSelectivity s Ωp

ΩaΩaΩp

Ωp+−Ωp−Ωa+−Ωa−

Ωa+−Ωa−Ωp+−Ωp−

Ripple δ1 δ1 δ1 δ1Attenuation δ2 δ2 δ2 δ2

Centering frequency Ω0 - -√

Ωp+.Ωp−√

Ωp+.Ωp−Bandwidth broadness B - - Ωp+−Ωp−

Ω0

Ωp+−Ωp−Ω0


Normalized p transform

From normalized template, deduce a p transform with approximation functions

Filter prototype functions1 Butterworth filters

2 Chebyshev filters

3 Elliptic filters

4 Bessel filters


Butterworth filtersIntroduced in 1930 by Stephen Butterworth [11]

Defined with a target length NThe filter is defined from the square of its frequency response

∣∣HB(ω)∣∣2 = 1

1 +(ωωc

)2N (3.1)

Butterworth filters propertiesAll pole filterAmplitude is a decreasing monolitic function of wFrequency response if as constant as possible in bandpass and attenuatedbandMaximal amplitude for w = 0 (with amplitude of 1)wc is the 3dB cut-off frequency

I As |HB(wc)| = 1√2

[11] Butterworth, “On the theory of filter amplifiers,” 1930.

Digital Signal Processing IIR synthesis Filter prototype functions 81 / 355

Butterworth filtersIntroduced in 1930 by Stephen Butterworth [11]

Defined with a target length NThe filter is defined from the square of its frequency response

∣∣HB(ω)∣∣2 = 1

1 +(ωωc

)2N (3.2)

Butterworth filters propertiesThe attenuation in high frequency (asymptotic) is equal to 20× N perdecadesThe filter order is given by the following relationship

N ≥log(

1δ2√δ1

)log( 1s) (3.3)

[11] Butterworth, “On the theory of filter amplifiers,” 1930.


Butterworth filters

0 0.5 1 1.5 2 2.5 3 3.5 4−80

−70

−60

−50

−40

−30

−20

−10

0

2

3

4

5

6

7

8

9

1/s

Gai

n[d

B]

Figure: Butterworth filter in s domainDigital Signal Processing IIR synthesis Filter prototype functions 83 / 355

Butterworth filters

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−3

−2.5

−2

−1.5

−1

−0.5

0

2

3

4

56789

1/s

Gai

n[d

B]

Figure: Butterworth filter in s domain (zoom in bandpass)Digital Signal Processing IIR synthesis Filter prototype functions 84 / 355

Butterworth filter in analog domainWell known filter used in analog domain [12]

Various possible implementationsPassive implementation with Cauer structure

Figure: Passive implementation of Butterworth filter using Cauer structure

I Corresponds to a cascade structureI Value of capacitance and inductance depending on dsired order

[12] Van Valkenburg, Analog filter design, 1982.


Butterworth filter in analog domainWell known filter used in analog domain [12]

Various possible implementationsActive implementation with Sallen Key topology

Figure: Active implementation of Butterworth filter using Sallen Key structure

I Each stage implement a conjugate pair of polesI If the order is odd, need to add a RC circuits separately

[12] Van Valkenburg, Analog filter design, 1982.


Chebyshev filters - Type I

Filters named from Pafnuty Chebyshev as the filter is derived from Chebyshevpolynoms [13]Also defined by the square of the frequency response∣∣HC (ω)

∣∣2 = 11 + ε2T 2

N

(ωωc

) , (3.4)

I wc is the cutt-off frequencyI ε is a parameter that controls ripple amplitudeI Tn(·) is the N order Chebyshev polynom

Chebyshev filters minimize the error between the idealized and the actual filtercharacteristic over the range of the filter but with ripples in the passband

[13] Sedra and Brackett, Filter theory and design: active and passive, 1978.



Chebyshev propertiesChebyshev polynom definition (first kind) [14]

T0(x) = 1,T1(x) = x ,Tn+1(x) = 2xTn(x)− Tn−1(x).

For a type I, the amplitude in the bandpass as a ripple expressed as

0 ≤ ω ≤ ωc =⇒ 11 + ε2

≤∣∣HC (ω)

∣∣2 ≤ 1

ε is thus closely related to the effective ripple as

ε =√10 δ

10 − 1

[14] Chebyshev, “Théorie des mécanismes connus sous le nom de parallélogrammes",” 1854.


Chebyshev filters - Type IChebyshev propertiesZeros of the filter can be analytically expressed

s±pm = ± sinh(1narsinh

(1ε

))sin(θm) + j cosh

(1narsinh

(1ε

))cos(θm)

n is the order of the Chebyshev filterθm = π

22m−1

n , with m from 1 to nCompromise between desired oscillation and length (with respect to stability).

Minimal order N for a given ripple ε and a bandstop attenuation ∆2 in dB(or δ2 in linear scale)

N ≥cosh−1

(√δ22−1ε2

)cosh−1

( 1s)



0 0.5 1 1.5 2 2.5 3 3.5 4−80

−70

−60

−50

−40

−30

−20

−10

0

2

3

4

5

6

789

1/s

Gai

n[d

B]

Chebyshev Type I filters (1dB)



0 0.5 1 1.5 2 2.5 3 3.5 4−80

−70

−60

−50

−40

−30

−20

−10

02

3

4

5

6

7

8

9

1/s

Gai

n[d

B]

Chebyshev Type I filters (0.1 dB)



0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−1

−0.8

−0.6

−0.4

−0.2

0

2

3

4

5

1/s

Gai

n[d

B]Chebyshev Type I filters (1dB)



0 0.2 0.4 0.6 0.8 1

1/s

-0.1

-0.08

-0.06

-0.04

-0.02

0

Gain

[dB

]

Chebyshev Type I filters (0.1 dB)

2

3

4

5


Chebyshev filters - Type II

Filters named from Pafnuty Chebyshev as the filter is derived from Chebyshevpolynoms [13]Type II are also called inverse Chebyshev filtersAlso defined by the square of the frequency response∣∣HC (ω)

∣∣2 = 11 + 1

ε2T 2N( ω

ωc ), (3.5)

I wc is the cutt-off frequencyI ε is a parameter that controls ripple amplitudeI Tn(·) is the N order Chebyshev polynom

Chebyshev filters minimize the error between the idealized and the actual filtercharacteristic over the range of the filter but with ripples in the stopband

[13] Sedra and Brackett, Filter theory and design: active and passive, 1978.


Chebyshev filters - Type IIChebyshev propertiesZeros of the filter can be analytically expressed as inverse of type I Chebyshev ones

1s±pm

= ± sinh(1narsinh

(1ε

))sin(θm) + j cosh

(1narsinh

(1ε

))cos(θm)

n is the order of the Chebyshev filterθm = π

22m−1

n , with m from 1 to nCompromise between desired oscillation and length (with respect to stability).

Minimal order N for a given ripple ε and a bandpass attenuation ∆1 in dB(or δ1 in linear scale)

N ≥cosh−1

(√1−δ21εδ1

)cosh−1

( 1s)


Elliptic filters

Butterworth filters: No rippleChebyshev filters have ripple in bandpassElliptic filters have ripples in both bandpass and attenuated bandwidth [15]

Elliptic filtersRipple in bandpass and attenuated bandwidthOptimal in terms of selectivity (minimal order for a given spec)Also defined by the square of the frequency response

∣∣HE (ω)∣∣2 = 1

1 + ε2R2N

(ωωc

) , (3.6)

[15] Lutovac, Tošić, and Evans, Filter design for signal processing using MATLAB and Mathematica, 2001.


Elliptic filtersElliptic filter expression ∣∣HE (ω)

∣∣2 = 11 + ε2R2

N

(ξ, ωωc

) ,RN(x , y) is a rational Chebyshev function with order Nε and ε are parameters that control ripple amplitudesRipple and bandpass and attenuated bandwidth can be independentlymonitoredIf ε tends to 0, elliptic tends to be a order I ChebyshevThe gain in the bandpass varies between 1 and 1/

√1 + ε

The attenuation in the attenuated bandwidth varies between −∞ and1/√

1 + ε2L2N with Ln (discrimination factor) defined as

Ln = RN(ε, ξ) (3.7)


Elliptic filters

Figure: Elliptic profile and ripple parameters


Elliptic filters

Rational Chebyshev function

Rn(ξ, x) ≡ cd(

nK (1/Ln)K (1/ξ) cd−1(x , 1/ξ), 1/Ln

)(3.8)

With cd defined as the Jacobi elliptic cosine functionK () is a complete elliptic integral of the first kind

K (k) =∫ π

2

0

dθ√1− k2 sin2 θ

=∫ 1

0

dt√(1− t2) (1− k2t2)

(3.9)


Elliptic filters

0 0.5 1 1.5 2 2.5 3 3.5 4−80

−70

−60

−50

−40

−30

−20

−10

0

2

3

4

5

6

7

89

1/s

Gai

n(d

B)

Elliptic filter (1dB/50dB)

Figure: Elliptic filter with 1dB and 50dB rejection


Elliptic filters

0 0.5 1 1.5 2 2.5 3 3.5 4−80

−70

−60

−50

−40

−30

−20

−10

02

34

5

6

7

8

9

1/s

Gai

n(d

B)

Elliptic filter with (0.1dB/30dB)

Figure: Elliptic filter with 0.1dB and 30dB rejection


Elliptic filters

0 0.2 0.4 0.6 0.8 1

1/s

-0.1

-0.08

-0.06

-0.04

-0.02

0G

ain

(dB

)Elliptic filter bandpass (0.1dB/30dB)

2

3

4

5

Figure: Elliptic filter with 1dB and 50dB rejection in bandpass


Elliptic filters

0 0.2 0.4 0.6 0.8 1

1/s

-0.1

-0.08

-0.06

-0.04

-0.02

0

Gain

(dB

)Elliptic filter zoom (0.1dB/30dB)

2

3

4

5

Figure: Elliptic filter with 0.1dB and 30dB rejection in bandpass


Bessel filtersBessel filters

Introduced with Thomson (sometimes called Thomson filters) [16]

Linear filter with a maximally flat group/phase delay

It preserves the signals in the passband.

Often used in audio processing

But not a good selectivity !

Defined directly with the transfer function

H(s) = θn(0)θn(s/ω0) (3.10)

I θn(x) is a reverse Bessel polynomialI Again, all pole filter

[16] Thomson, “Delay networks having maximally flat frequency characteristics,” 1949.


Bessel filtersFirst Bessel polynomials

1 s + 12 s2 + 3s + 33 s3 + 6s2 + 15s + 154 s4 + 10s3 + 45s2 + 105s + 1055 s5 + 15s4 + 105s3 + 420s2 + 945s + 945

In general case, we have

θn(s) =n∑

k=0aksk

where

ak = (2n − k)!2n−kk!(n − k)! k = 0, 1, . . . , n


Bessel filters

0 0.5 1 1.5 2 2.5 3 3.5 4−80

−70

−60

−50

−40

−30

−20

−10

0

2

3

4

5

6

7

8

9

1/s

Gai

n(d

B)

Bessel filters

Figure: Bessel filter in fullbandDigital Signal Processing IIR synthesis Filter prototype functions 106 / 355

Bessel filters

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−3

−2.5

−2

−1.5

−1

−0.5

0

234

5

6

7

8

9

1/s

Gai

n(d

B)

Bessel filter (zoom in bandpass)

Figure: Bessel filter in bandpassDigital Signal Processing IIR synthesis Filter prototype functions 107 / 355

Bessel filters

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−3

−2.5

−2

−1.5

−1

−0.5

0

2

3

4

5

6

7

89

1/s

Pha

se(r

ad)

Bessel filter (phase)

Figure: Bessel filter phase in bandpassDigital Signal Processing IIR synthesis Filter prototype functions 108 / 355

Bessel filters

0 0.5 1 1.5 2 2.5 3 3.5 4−12

−10

−8

−6

−4

−2

0

1/s

Pha

se(r

ad)

Bessel filters (phase)

Figure: Bessel filter phase in fullbandDigital Signal Processing IIR synthesis Filter prototype functions 109 / 355

Un-normalization

The obtained transfer function is normalizedNeed to switch from normalized Hnorm(pN) to H(p)

Need to introduce a function for variable switchingDepends on filter nature

LPF HPF BPF SBFpN = p

ωcpN = ωc

p pN = 1B

(pω0

+ ω0p

)pN =

[1B

(pω0

+ ω0p

)]−1Table: Unormalization function


Analog filter specificationsAnalog specifications

Normalized template Filter order

HNorm(p)

Approximation of H(p)


H(p)

Un-normalization

Digital filtering Analog filtering

Transformation p=f(z)Which f ?

Structure choiceRauch, Sallien-Key,Bi-quadratic

Filter order


Figure: Reminder of analog specification (and introduction of digital filtering)

Digital Signal Processing IIR synthesis Transformation functions 111 / 355

Impulse invariance methodFilter we have defined are analog filters

Impulse invariance methodFilter in digital domains corresponds to analog filter sampled at the desiredsampling frequency

h[n] = h(nTc) = h(t)t=nTc (3.11)

t

ha(t)

Ha(p)x(t) y(t)

Analog filter

t

h(nT)

H(z)x[n] y[n]

Digital filter

T


Impulse invariance methodAssuming that the filter is an all pole one, we can deduce the digital poles fromthe analog poles.Analog poles

Ha(p) = N(p)D(p) =

∑L−1k=0

hkp+pk , (3.12)

ha(t) =∑L−1

k=0 hke−pk t with inverse LT (3.13)

Digital poles

h[nT ] =L−1∑k=0

hke−pkkT ,H(z) =L−1∑k=0

hkz

z − e−pkT =L−1∑k=0

hk1

1− e−pkT z−1 (3.14)


Example: simple LPF filterIf we consider a simple LPF filter expressed as

H(p) = 1a + p

We obtain in time domain (inverse Laplace transform)

h(t) = u(t)× e−at

Which leads to the analog response (function of sampled data)

he(t) = Tc

∞∑k=0

u(kTc)× e−akTc δ(T − kTc)

Which leads to IIR expression (Z transform)

He(z) = Tc

∞∑k=0

e−akTc z−k = Tc

∞∑k=0

(eaTc z

)−kHe(z) = Tc

zz − e−aTc


Impulse invariance method

Frequency response

H(ejωT ) = H(z)|z=ejωT = 1T∑k

Ha(jω + j 2πkT ) (3.15)

Aliasing in frequency domainI Spectrum is folded for frequency not in [ −12Tc

; 12Tc

]I Only valid if analog filter spectrum is null in this interval

Requirement:|Ha(jω)| ' 0 for |ω| > ωB avec ωB <

π

Tc

Scaling factor 1T in PSD expression: need a normalisation factor to recover

analog response


Impulse invariance method

1

Ha(p)

1/T

p

Ω

2Π−2Π Π

H(Ω)

Figure: Aliasing consequence in frequency domain if the filter is not bandlimited (impulseinvariance method)


Bilinear transformationImpulse invariance method is a simple efficient method for IIR synthesis

4 Ensure stability and direct sampling method7 Risk of frequency domain aliasing

ReasonThe link between Laplace (continuous) and Z transform (discrete) is not abijection.p → z = epT is a surjection of the Laplace transform convergence area

Bilinear transform principleUse a algebraic transformation between p and z with integrationapproximationInteger approximation is used

de(t)dt |t=nT = lim

∆t→0

∆e(nT )∆t ' e[nT ]− e[(n − 1)T ]

T (3.16)


Bilinear transform

Integer approximated with rectanglesBilinear transform is special case of Möbius transformation [17],

t

e(nT)

nT(n-1)T

Figure: Integer approximation with rectangle method

s(nT ) = s((n − 1)T ) + T e(nT ) + e((n − 1)T )2 (3.17)

[17] Arnold and Rogness, “Möbius transformations revealed,” 2008.


Bilinear transformWe can map the integer with a filter defined in z domain (digital filter)

S(z) = z−1S(z) + T E (z) + z−1E (z)2

H(z) = S(z)E (z) = T

21 + z−11− z−1

Besides, in p plan, integration is defined as

Ha(p) = 1p

So we have a p to z transformation defined as

p = 2T

1− z−11 + z−1 (3.18)

And a z to p transformation

z = 2/T + p2/T − p = 2/T + jωa

2/T − jωa(3.19)


Bilinear transformLink between analog and digital sequences can be derived

z = 2 + jωaT2− jωaT

= ejωT

|z | = 1

arg(z) = arctan(ωaT2 ) + arctan(ωaT2 ) = 2 arctan(ωaT2 )

z = ejωT ⇒ arg(z) = ωT

And we finally haveωaT2 = tan

(ωdT2

)(3.20)

Every point on the unit circle in the discrete-time filter z-plane, z = ejωdT

is mapped to a point on the jω axis on the continuous-time filter s-plane, s = jωa

Equivalently, continuous frequency range −∞ < ωa <∞ is mapped ontothe fundamental frequency interval − π

T < ωd <πT


Bilinear transform

plan p

p = jωa

plan z

z = ejωT

π 0

Figure: Bilinear transform

Transformation is bijective between <p < 0 and unity circle: maintainstabilityFrequency is distorted with the transformation but this ensures non aliasing


Bilinear transform

π

π/T

ωa

ωT

−π

Figure: Analog to digital pulsations link

This distortion can be critical to maintain filter specificationsIn this case, we compensate (pre-distortion) this critical frequencies tomaintain the filter specifications


Bilinear transform

Ω

πΩp

1

|H(ejΩ)|1-δ1

1+δ1

δ2

Ωa

ωa

p

1

|Ha(

jωa)

|

1-δ

1

1+

δ1

δ2

ωa

a

π

ωa

ωa

Ω

ωa=2/T tan(Ω/2)

2π

Figure: Pre-distortion principle


Bilinear transform

Bilinear transform method for filter creation1 Have an initial filter specification in analog domain

2 From filter specifications, apply the pre-distortion on critical frequencies (i.efrequencies in bandpass)

ωa p,a = 2T tan

(Ωp,a2

), (3.21)

3 Getting analog filter from appropriate method (see before)

4 using the bilinear transform with p to z transform

H(z) , H(p)|p= 2T

1−z−11+z−1

(3.22)


Conclusion on IIR synthesisDesigning a IIR filter is difficult as the filter should be function of input andoutput

But IIR filter is closed to analog filterWell known methods to design a specific filter based on different approaches

Filter prototype functions1 Butterworth filters

2 Chebyshev filters

3 Elliptic filters

4 Bessel filters

Digital Signal Processing IIR synthesis Conclusion 125 / 355

Conclusion on IIR synthesisFrom analog filter specification, we need to derive a digital filter Two mainmethods for filter transformation

1 Impulse invariance2 Bilinear transform

Impulse invarianceMaintain response in time domain, stability preservedCreates aliasing in frequency domain (no p to z bijection)Straightforward transformation

Bilinear transformationTransformation of analog filter with p = 2

T1−z−1

1+z−1

No aliasing and stability preservedDesigning the filter with pre-distortion to maintain specifications

Digital Signal Processing IIR synthesis Conclusion 126 / 355

FIR Synthesis

Digital Signal Processing FIR synthesis 127 / 355

IntroductionWe recall first what a FIR filtering is

Finite response in time domainI Applied on a digital input x [n]I With a depth L

Equivalent formulation: output only depends on input

y [n] =L−1∑k=0

hkx [n − k] (4.1)

H(z) =L−1∑k=0

hkz−k (4.2)

These filters can only exists in digital domainSynthesis methods can not be inherited from analog ones

Digital Signal Processing FIR synthesis Introduction 128 / 355

Introduction

FIR synthesis methodsBased on signal processing approachesDifferent methods to enable a specific design

FIR Filtering synthesis1 Windowing method

2 Frequency sampling

3 Optimisation methods

Digital Signal Processing FIR synthesis Introduction 129 / 355

Linear phase filters

Phase is important to monitor a filterPhase can not be equal to zero for causal systemsWe may want to have minimal impact of the phase terms Linearphase

Linear phaseConstant group delay versus frequencyEach frequency is delayed with the same amount (coefficient of the linearphase)No frequency distortion

Impact on filter:H(ejΩ) = A(ejΩ)e−jαΩ+jβ , (4.3)

Digital Signal Processing FIR synthesis Linear phase filters 130 / 355


Filter can be expressed as:

H(ejΩ) = A(ejΩ)e−jαΩ+jβ ,

A(ejΩ) is called pseudo-module (module without taking off the sign)

Linear phase propertiesA linear phase filter is defined with

group delay αOffset phase βThe depth L (number of filter taps)

α = L− 12 (4.4)


Linear phase filtersz transform and linear phase filter:

H(ejΩ) = |H(ejΩ)|ej arg(Ω) = A(ejΩ)ejϕ(Ω)

with ϕ(Ω) = −αΩ + β

First case: ϕ(Ω) = −αΩβ = 0−π ≤ Ω ≤ π

Second case: ϕ(Ω) = β − αΩβ 6= 0−π ≤ Ω ≤ π



We have 4 cases

1 β = 0 and L is odd : Type I

2 β = 0 and L is even: Type II

3 β = ±π2 and L is odd : Type III

4 β = ±π2 and L is even : Type IV



β = 0 ϕ(Ω) = −αΩ

L is odd

Symetric impulse response

hk = hL−1−k

Group delay α = L−12 is an

integer

h(n)

α N-1L-1ß = 0, L odd


Type I linear phase filter


β = 0 ϕ(Ω) = −αΩL is evenSymetric impulse response

hk = hL−1−k

Group delay α = L−12 is not

an integer

π 2π

h(n)

α N-1L-1ß = 0,L even

Remark: If we have Ω = π, we shall have H(ejπ) = 0 which means that type IIfilter shall validate this constraint. As a consequence, HPF cannot be type IIlinear phase filters


Type II linear phase filter


β± π2 ϕ(Ω) = β−αΩ

L is oddAntisymetric impulseresponse

hk = −hL−1−k

Group delay α = L−12 is an

integer

π 2π

h(n)

α N-1L-1ß = +/- π/2,

L odd

Remark: We shall have H(ejπ) = H(ejπ) = 0 which means that type III filtershall validate this constraint. Example: bandpass filters


Type III linear phase filter


β± π2 ϕ(Ω) = β−αΩ

L is evenAntisymetric impulseresponse

hk = −hL−1−k

Group delay α = L−12 is not

an integer

π 2π

h(n)

α N-1L-1ß = +/- π/2,

L even

Remark: We shall have H(ejπ) 6= 0 and H(ej0) = 0 which means that type IVfilter shall validate this constraint. Example: HPF


Type IV linear phase filter

Conclusion on linear phase filtersImportance of linear phase filters

Each frequency is delayed with the same amountEnsure no distortion in frequency domain

I Particularly suitable for audio applicationsI Also used in digital communications

Linear phase in frequency domain is linked to time domain symetryI ϕ(Ω) = β − αΩ

Depending on phase caracteristics and filter depth L, 4 classes

Linear phase types1 β = 0 and L is odd : Type I2 β = 0 and L is even: Type II3 β = ±π2 and L is odd : Type III4 β = ±π2 and L is even : Type IV


Conclusion on linear phase filtersh(n)

α N-1L-1ß = 0, L odd

π 2π

h(n)

α N-1L-1ß = 0,L even

a) Type I filter b) Type II filter

π 2π

h(n)

α N-1L-1ß = +/- π/2,

L odd

π 2π

h(n)

α N-1L-1ß = +/- π/2,

L even

c) Type III filter d) Type IV filterFigure: Linear phase filter type (time and frequency domains)


Window based synthesis

Window basic ideaAny filter defined with a finite length corresponds to an infinite filter that isobserved

Example of simplest observation window: the gateAny window have consequence both in time and frequency domains

Mathematical modelWe consider a ideal filter H(ejΩ), with 2π period. We can express is Fouriertransform as

H(ejΩ)=∞∑

k=−∞hke−jkΩ (4.5)

Which leads to the expression of each tap as

hk = 12π

∫ π

−πH(ejΩ)ejkΩdΩ (4.6)

Digital Signal Processing FIR synthesis Window based synthesis 140 / 355

Window based synthesisThe previous definition is valid for each hk from −∞ to +∞

The desired FIR can be deduced by troncation

hk =

hk , 0 ≤ k < L0, k < 0 and k ≥ L

This corresponds to a windowing that affects both the time and frequencydomain

hk = hk .wk (4.7)The rectangular shape r(n) is defined as

rk =

1, 0 ≤ k < L0, k < 0 and k ≥ L− 1

For rectangular shape in time domain, we have a Gibbs effect [18]: Ripple infrequency domain due to hard truncation in time domain

H(ejΩ) = H(ejΩ) ∗W (ejΩ)

[18] Oppenheim and Schafer, Discrete Time Signal Processing, 1999.


Window based synthesisThe previous definition is valid for each hk from −∞ to +∞

The desired FIR can be deduced by troncation

hk =

hk , 0 ≤ k < L0, k < 0 and k ≥ L

This corresponds to a windowing that affects both the time and frequencydomain

hk = hk .wk (4.7)The rectangular shape r(n) is defined as

rk =

1, 0 ≤ k < L0, k < 0 and k ≥ L− 1

For rectangular shape in time domain, we have a Gibbs effect [18]: Ripple infrequency domain due to hard truncation in time domain

H(ejΩ) = H(ejΩ) ∗W (ejΩ)[18] Oppenheim and Schafer, Discrete Time Signal Processing, 1999.


Window based synthesis

H(Ω) W(Ω)*

=H(Ω)^

Figure: Impact of apodisation on filter frequency template


Apodisation functions (windows)

We can use different window shapesHaving a smooth transition in time domain to limits the ripple in frequencydomainOften used in power spectral density estimation

Main apodisation functions1 Rectangular shape2 Triangular shape (Bartlett window)3 Hanning window4 Hamming window5 Blackman window



wk =

g1, 0 ≤ k < Lg0, k < 0 and k ≥ L (4.8)

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

1

Time index

Win

dow

valu

e

Rectangular window−0.25 −0.2 −0.15 −0.1−5 · 10−2 0 5 · 10−2 0.1 0.15 0.2 0.25

−100

−80

−60

−40

−20

0

Normalized frequency

Freq

uenc

ym

agni

tude

Rectangular window

Figure: Rectangular window (time and frequency responses)


Rectangular shape


wk =

2kL−1 , 0 ≤ k ≤ L−1

22− 2k

L−1 ,L−12 ≤ k ≤ L− 1

0, k < 0 and k ≥ L(4.9)

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

1

Time index

Win

dow

valu

e

Bartlett window−0.25 −0.2 −0.15 −0.1−5 · 10−2 0 5 · 10−2 0.1 0.15 0.2 0.25

−100

−80

−60

−40

−20

0


Freq

uenc

ym

agni

tude

Bartlett window

Figure: Bartlett window (time and frequency responses)Digital Signal Processing FIR synthesis Window based synthesis 145 / 355

Triangular (Bartlett) shape


wk =

12 −

12 cos

(2πkL−1

), 0 ≤ k ≤ L− 10, k < 0 and k ≥ L

(4.10)

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

1

Time index

Win

dow

valu

e

Hanning window−0.25 −0.2 −0.15 −0.1−5 · 10−2 0 5 · 10−2 0.1 0.15 0.2 0.25

−100

−80

−60

−40

−20

0


Freq

uenc

ym

agni

tude

Hanning window

Figure: Hanning window (time and frequency responses)


Hanning window


wk =

0.54− 0.46 cos(

2πkL−1

), 0 ≤ k ≤ L− 10, k < 0 and k ≥ L

(4.11)

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

1

Time index

Win

dow

valu

e

Hamming window−0.25 −0.2 −0.15 −0.1−5 · 10−2 0 5 · 10−2 0.1 0.15 0.2 0.25

−100

−80

−60

−40

−20

0


Freq

uenc

ym

agni

tude

Hamming window

Figure: Hamming window (time and frequency responses)


Hamming window


wk =

0.42− 0.5 cos(

2πkL−1

)+ 0.08 cos

(4πkL−1

), 0 ≤ k ≤ L− 10, k < 0 et k ≥ L

(4.12)

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

1

Time index

Win

dow

valu

e

Blackman window−0.25 −0.2 −0.15 −0.1−5 · 10−2 0 5 · 10−2 0.1 0.15 0.2 0.25

−100

−80

−60

−40

−20

0


Freq

uenc

ym

agni

tude

Blackman window

Figure: Blackman window (time and frequency responses)


Blackman window

Apodisation functions (windows)Hanning window

Introduced by Julius von HanBased on harvesine function sin2() [19]Exponentially attenuated sinc in frequency domain

Hamming windowDirect derivation from Hanning function with different numeric parametersNumeric choices (0.42 and 0.08) places a zero-crossing at frequency5π/(L− 1) which cancels the first sidelobe of the Hann window [20]

Blackman windowAlso derivative of Hanning window [21]0.42, 0.5 and 0.08 place zeros at the third and fourth sidelobes (discontinuityat the edges and a 6 dB/oct fall-off).

[19] Harris, “On the use of windows for harmonic analysis with the discrete Fourier transform,” 1978.[20] Bojkovic, Bakmaz, and Bakmaz, “Hamming Window to the Digital World,” 2017.[21] Blackman and Tukey, “The measurement of power spectra from the point of view of communications engineering—Part I,” 1958.



0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

1

Time index

Win

dow

valu

e

Rectangular windowBartlett windowHanning windowHamming windowBlackman window

Figure: Comparison between apodisation windows (time domain)


Comparison between windows


−0.25 −0.2 −0.15 −0.1−5 · 10−2 0 5 · 10−2 0.1 0.15 0.2 0.25−100

−80

−60

−40

−20

0


Freq

uenc

ym

agni

tude

Rectangular windowBartlett windowHanning windowHamming windowBlackman window

Figure: Comparison between apodisation windows (frequency domain)Digital Signal Processing FIR synthesis Window based synthesis 151 / 355

Comparison between windows


Window Amplitude ratio Mainlobe width Minimal attenuationbetween main lobe ∆Ωm in attenuated bandwidthand secondary lobe

λ ∆ARectangular −13dB 4π/L -21dBBartlett −25dB 8π/L -25dBHanning −31dB 8π/L -44dBHamming −41dB 8π/L -53dBBlackman −57dB 12π/L -74dB

Table: Window characterization


Other windows

There are many (many many many) windows that can be designed and usedSome are parameter based window that can lead to proper design

Adaptive windows that can be tuned for a proper uses in time and/orfrequency domain

Adjustable windows1 Gaussian window2 Dolph Chebyshev window

For more information (and many examples) see [19]

[19] Harris, “On the use of windows for harmonic analysis with the discrete Fourier transform,” 1978.


Other windows

Eigenvector: Corresponds to the eigenvector of the Fourier transform (FT ofGaussian window is a Gaussian window)Smoothness: Smooth transition both in time and frequency domainWindowception: Gaussian function is infinite, so window is troncated (oritself windowed)Parametrized: σ also linked to bandwidth time product (BT) < 0.5

wk = e−12

(k−(L−1)/2σ(L−1)/2

)2(4.13)


Gaussian window

Other windows

wk = e−12

(k−(L−1)/2σ(L−1)/2

)2

0 10 20 30 40 50 60 700

0.2

0.4

0.6

0.8

1

Time index

Win

dow

valu

e

Gaussian window

−0.25 −0.2 −0.15 −0.1−5 · 10−2 0 5 · 10−2 0.1 0.15 0.2 0.25−100

−80

−60

−40

−20

0


Freq

uenc

ym

agni

tude

Gaussian window

Figure: Gaussian window (time and frequency responses)


Gaussian window

Other windowsPlenty of different window with different uses

Maximization of energy in main lobeSide lobe reductionTrade-off. . .

Other windows1 Generalized normal window2 Tukey window3 Discrete prolate spheroidal sequence (or Slepian window)4 Kaiser window5 Dolph–Chebyshev window


Window selection

We focus on the non-parametric windows1 Rectangular shape2 Triangular shape (Bartlett window)3 Hanning window4 Hamming window5 Blackman window

Strong influence of window in both time and frequency domainsHow to we choose the correct window ?

We must define appropriate metrics


Window selection

Transition bandwidthDefined as ∆Ω = |Ωp − Ωa|Linked to main lobe width ∆Ωm.Approximation: ∆Ω ∼= ∆Ωm

2 .The larger L; the lower ∆Ω

RipplesEqual in bandpass and attenuated bandwidth: Equiripple filters∆A = δ1 = δ2

Window dependant but independant from L


Window selection

∆Ω

∆A

1+∆A

H(Ω)^

H(Ω)

W(Ω)

1

1−∆A

Figure: Windowing impact and important metrics


Window selectionFIR synthesis with windowing

Recall: we have an infinite impulse filter from direct synthesisTo transform it into FIR, apodisation (windowing) is necessary

Window selectionHow do we select the window

1 Choose the window nature with ∆A selection2 From ∆Ω and from the window choice, deduce the targeted length L

The FIR is finally obtained by apodisation i.e

hk = hn−α.wk (4.14)

With α the added delay to ensure linear phase filter


Window selection

Window selectionWe consider the following requirements

1 Transition bandwidth= ∆Ω = 0.1π,2 Targeted attenuation greater than 30dB

From attenuation, we choose Hanning, Hamming or BlackmanAs transition is short, we prefer Hamming (shorter window with better λ)We have a window linked to ∆Ωm = 8π

L = 2∆Ω; L = 40.


Exercise

Frequency sampling method

The filter specifications are often in frequency domainBasic idea: use the specification in frequency domain to obtain the filter intime domainWe need a tool to convert a sampled signal in frequency domain to asampled signal in time domain

Fourier transform

Frequency sampling methodThe desired filter is obtained as the inverse Fourier transform of the filterspecifications defined in frequency domain

We start from the ideal continuous frequency response H(ejω)We choose to sample this response with L points

Digital Signal Processing FIR synthesis Frequency sampling method 162 / 355

Frequency sampling methodFrequency sampling methodThe desired filter is obtained as the inverse Fourier transform of the filterspecifications defined in frequency domain

We start from the ideal continuous frequency response H(ejω)We choose to sample this response with L pointsThe sampling frequency grid (or sampling space in frequency domain) isΩe = 2π

L and the digital filter (in frequency domain) is

H(ejkΩe ) , H(ejΩ)|Ω= 2kπL

(4.15)

We obtain the FIR impulse response by means of inverse Fourier transform

hk = 1L

L−1∑n=0

H(ejkΩe )e2jπ k.nL , k = 0, 1 . . . L− 1 (4.16)

hk = 0, otherwise (4.17)


Frequency sampling methodFilter in time domain

hk = 1L

L−1∑n=0

H(ejkΩe )e2jπ k.nL , k = 0, 1 . . . L− 1

hk = 0 otherwise

Why 2 equations ?Sampling in frequency Periodisation in time domainEquivalent to apply a rectangular window to the filter

hk = 1L

L−1∑n=0

H(ejkΩe )e2jπ k.nL × wk

wk =

1, 0 ≤ k < L0, k < 0 and k ≥ L

Any window can be used !Digital Signal Processing FIR synthesis Frequency sampling method 164 / 355


ΩπΩc

1

H(ejΩ)

H(kΩe)

2π

Figure: Example of frequency sampling of derivative filter

We took L point with uniform frequency sampling between 0 and 2πLast point is not taken !Important: Ensure a linear phase


Example of derivative filter


Let’s assume that we sant to design a LPF filter with a cutt-off frequency of1/4× 2π

One way to this is to force the frequency to 0 when frequency is greater thanthe desired cutt-off frequencyThe higher the filter size, the lower the space between two consecutivefrequency points: the filter becomes more frequency selectiveAlso impact of the window used (no window is a rectangular window)


Example of Low pass filter


0 0.5 1 1.5 2 2.5 3−100

−80

−60

−40

−20

0

Frequency

Mag

nitu

de

L = 16 with rect windowL = 32 with rect windowL = 64 with rect windowL = 128 with rect windowFilter spec

Figure: Influence of frequency spacing in Frequency sampling method



0 0.2 0.4 0.6 0.8 1 1.2 1.4−6

−5

−4

−3

−2

−1

0

Frequency

Mag

nitu

de

L = 16 with rect windowL = 32 with rect windowL = 64 with rect windowL = 128 with rect windowFilter spec

Figure: Influence of frequency spacing in Frequency sampling method (zoom onbandpass)



0 0.5 1 1.5 2 2.5 3−100

−80

−60

−40

−20

0

Frequency

Mag

nitu

de

L = 128 , rect windL = 128 , bartlett windL = 128 , hamming windL = 128 , blackman windFilter spec

Figure: Influence of window apodisation in Frequency sampling method



0 0.2 0.4 0.6 0.8 1 1.2 1.4−6

−5

−4

−3

−2

−1

0

Frequency

Mag

nitu

de

L = 128 , rect windL = 128 , bartlett windL = 128 , hamming windL = 128 , blackman windFilter spec

Figure: Influence of window apodisation in Frequency sampling method (zoom onbandpass)


Conclusion in frequency sampling

Simple method for FIR synthesisBased on what we want to have in frequency domainFIR impulse is obtained with inverse Fourier transform2 freedom degrees in synthesis

I Filter length in samples (linked to frequency spacing)I Choice of apodisation window

But can be tricky if used without cautionForcing the grid leads to to many constraints for proper designRipples in bandpassGain in bandpass to monitor


Conclusion on FIR synthesisDedicated methods for FIR synthesis

Cannot be based on analog filter identification

FIR propertiesFIR implementation are simpler

Stable by constructionOnly dependant on signal inputLinear phase FIR filters are often used

Synthesis methodsDifferent synthesis methods can be used

1 Window based FIR2 Frequency sampling FIR3 Adaptive FIR

Digital Signal Processing FIR synthesis Conclusion 172 / 355

Conclusion on FIR synthesis

Window based filtersFIR can be seen as IIR with observation window (troncature)

Different window can be used with different properties

Trade-off between main lobe width and other lobes level

Main windows1 Rectangular window2 Bartlett window3 Hamming, Hanning, Blackman windows4 Parametrable windows (Gaussian, . . . )


Conclusion on FIR synthesisFrequency sampling methodBased on sampling of the continous analog frequency response

FIR is obtained with inverse Fourier transformJointly used with apodisation (windowing)Arbitrary frequency response can be otained

Adaptive filtersSometimes FIR cannot bepre-computed

Set up a cost function tominimizeThe filter taps areiteratively computed with aminimization method(LMS, RLS, APA, . . . )

Application example: noisecancellation


Fixed point arithmetic

Digital Signal Processing Fixed point arithmetic 175 / 355

A simple example to begin

Picsou wants to have an interesting investmentfor his cash .

The banker proposes the following rule:You place e − 1 eurosThe first year, we multiply by 1 and take over 1 euroThe second year, we multiply by 2 and take over 1euroThe n year, we multiply by n and take over 1 euroYou get your cash back in 25 years

Digital Signal Processing Fixed point arithmetic Introduction 176 / 355

A simple example to begin

So the banker do the simulation on his computerAfter 25 years, Picsou would earn +4645987753e

So picsou is happy

After going back to his home, he calculates the expected benefits in his calculatorHe obtains −140× 1012e

How we use (and represent) numbers matter !


that can have tremendous impact. . .

Ariane V explosion on June 4th 1996(flight 501).

Explosion just after launch (30second)Loss of launcher and cargo (500million dollars)

After investigationThis was due to the inertialmeasurement unit (IMU)The horizontal speed was calculatedon 64 bits and then converted to 16bits without overflow checkingSystem was inherited from Ariane 4,with lower ground speed


What happens ?

Digital signal is discrete in time domainBut it also have a finite number for its representationAs there is a time-frequency grid for digital signal processing, there is anamplitude grid

Some (weird) phenomena can occurAbsorptionTruncationElimination

The way we do the calculation also matters !

First let’s have a look on number representation


Numbers representation

We have a number x , represented on a basis BElement B: base (can be 10, 2, whatever)Decomposition on n elementsDecomposition on the elementary power of the base element

x =n−1∑k=0

akBk (5.1)

Bk are the weightsak basis elements (coefficients) such as ak ∈ [|0; B − 1|]

Digital Signal Processing Fixed point arithmetic Numbers representation: coding family 180 / 355

Unsigned integer coding

Numbers representationExample of binary codingThe coefficient ak are also called the bits (or binary digits)

Bit originFirst use in a 1948 paper entitled “A Mathematical Theory of Communication” byClaude Shannon [22]

He attributed its origin to John W. Tukey, who had written a Bell Labsmemo on 9 January 1947 in which he contracted "binary information digit"to simply "bit"

x =n−1∑k=0

akBk = a0a1a2 . . . an−1 = an−1an−2 . . . a0

a0 is the Less significant bit (LSB)aN−1 is the Most significant bit (MSB)

[22] Shannon, “A mathematical theory of communication,” 1948.


Numbers representationDecimal value Binary code

A10 a3a2a1a00 0 0 0 01 0 0 0 12 0 0 1 03 0 0 1 14 0 1 0 05 0 1 0 16 0 1 1 07 0 1 1 18 1 0 0 09 1 0 0 110 1 0 1 011 1 0 1 112 1 1 0 013 1 1 0 114 1 1 1 015 1 1 1 1

Table: Unsigned binaryrepresentation

For binary representation

x =b−1∑k=0

bk2k

Coding on b bitsCan represent any number between0 and 2b − 1


Sign encoding

We focus now on binary representationStill b bits

Unsigned bit coding allow to code between 0 and 2b − 1

How to represent negative numbers ?

One bit is now allocated to the sign1 bit for sign (often called S)b − 1 bits for binary representation

x = (−1)Sb−2∑k=0

bk2k (5.2)


Signed integer coding (Sign absolute value)

Sign encoding

Drawbacks0 can be represented in 2 manners: +0 and -0.Basic operation cannot be straightforwardly obtained from one bit operations

Example of signed absolute coding drawbacks: additionsDepending on the sign, two operations must be performed

I Same sign: adding value and keep the signI Different sign: subtraction is required


Signed integer coding (Sign absolute value)

Sign encoding

Special case of binary representationAvoid the two representation of 0Representation in order to ease arithmetic operationFrom sign product to sign addition

Complement to two

x = −2b−1 × S +b−2∑k=0

bk2k (5.3)

MSB is the sign bitPositive number have a legacy unsigned representation on b − 1 bitsNegative numbers starts from −2b−1


Signed integer coding (2’s complement)

Sign encoding

PropertiesDefinition domain is not symmetric: more negative values than positive ones

D = [| − 2b−1, . . . , 2b−1 − 1|]

Transition from positive numbers to negative numbers: Bit flip +1

0 = 00 . . . 0−0 = 11 . . . 1+ 1 = 00 . . . 0



Sign encoding

Example: 0000 01018 bits, signed, two’s complementFirst bit is 0: positive numberPositive part: 000 0101 5

Example: -5 ?What is the binary representation of -5 ?From initial example

I Bit flip of 5 + 1I 1111 1011

From scratchI First bit is equal to 1 as negative number: 1xxx xxxxI 8 bits (signed), -5 is coded from inverse on 7 bits: 27 = 128− 5 = 123 x111

1011I Representation of 123 is 111 1011



Fixed point

Number codingWe have consider integer (unsigned and signed) numbersIn a DSP acquisition, any number can be consideredHow can we use real numbers ?

We have a certain number of bitsAny number cannot be represented, and precision depends on how werepresent the numberA number can be split as a integer part and a fractional part

Fixed pointInteger part is coded on m bitsFractional part is on n bits

Digital Signal Processing Fixed point arithmetic Fixed and floating points 188 / 355

Fixed point

Fixed point representation

bm-1 bm-2 b1 b0 b-n+2 b-nb-n+1

Integer part Fractional partm n

b-1

2-n2-121 202m-1-2m

S

Figure: Fixed point representation

m bits for integer partn bits for fractional partb bits = m + n + 1

(b,m,n) bits or Q(m,n)

Still depends on the representation


Fixed point parameters

Some usefull metrics can be defined for fixed point arithmeticRange

Range of the number coding. For 2’s complement, we have

D = Xmax − Xmin = 2× 2m − 2−n) (5.4)

Quantization stepDistance (euclidean) between 2 successive values (uniform quantization)

q = 2−n (5.5)


Fixed point

Focus here on 2’s complement approach [23].There are three ways for decimal point positioning:

Right justified : it means that the data is an integer as no fractional bits areused

Left justified : it means that the data is completely fractional: normalizedinput

Q(m,n) with m 6= 0 and n 6= 0: data is composed of an integer and afractional parts.

[23] Kunt, Bellanger, Coulon, et al., Techniques modernes de traitement numérique des signaux, 1991.


Two’s Complement FP

Fixed point

Representation Framing Framing FramingLeft Right n

Condition m = 0 n = 0 n + m = b − 1

q 2−(b−1) 1 2−(n)

D [−1; 1− q] [−2b−1; 2b−1 − q] [−2m; 2m − q]

Table: FP parameters


Example

Value RepresentationLeft framing Right framing m = 3 n = 2 C.A.2

0.96875 31 7.75 0111110.9375 30 7.5 011110

......

......

0.3125 1 0.25 0000010 0 0 000000

-0.3125 -1 -0.25 111111...

......

...-0.9375 -30 -7.5 100010-0.96875 -31 -7.75 100001

-1 -32 -8 100000


Example

We consider a Q(6,3,2) system

Binary Binary Decimal value011111 0 111.11 7.75011110 0 111.10 7.5000011 0 000.11 0.75000010 0 000.10 0.5000001 0 000.01 0.25000000 0 000.00 0111111 1 111.11 -0.25111111 1 111.10 -0.5100001 1 000.01 -7.75100000 1 000.00 -8.00


Floating point

Fixed point codingOne part is dedicated to integer codingOne part is dedicated to fractional part coding

The position of the point is fixed

Alternative way to represent number: floating pointEquivalent to scientific notationThe number of bits b is fixed, but the position of the point varies and adaptto the encoded dataPart of the number coded is devoted to point location information


Floating point

How can we use a floating point systemStandardized by IEEE (IEEE 754)We still have the sign bit SOne part is called exponentOne part is called significand/mantissa/fraction

S dE-1 d1 d0 cM-1 c2 c0c1

SignificandExponent


Floating point

ExponentDenoted as u

Corresponds to a scaling factor that can be adaptedExponent can be positive or negative, and coded with 2’s complement

Significand/mantissa/fractionCorresponds to the value of the data divided by the scaling factorSignificand belongs to [1; 2[

Expression of data x is

x = (−1)S .2u.(1 +

M−1∑i=0

Ci2i−M)

with u =E−1∑i=0

di2i −(2E−1 − 1

)(5.6)


Floating point

Representation WL (bits) F (bits) E (bits)

simple precision (float) 32 23 8

double precision (double) 64 52 11

Extended double precision (long double) 80 64 15

Table: IEEE 754 number representation. WL, M, et E denote respectively bit width,fractional size and exponent size

2 codes corresponds to 0 (+0 and -0)Special sequence corresponds to ±∞, NaN . . .More (and precious) information in [24]

[24], “IEEE Standard for Floating-Point Arithmetic,” 2008.


Fixed point vs Floating pointWhat is the dynamic for fixed point and floating point ?

For fixed point, dynamic is 6.02bDepends on bit size and exponent allocation for floating point (assume 1/4exponent alloc)

0 5 10 15 20 25 30 350

200

400

600

800

1,000

1,200

1,400

1,600

Number of bits

Dyn

amic

[dB

]

Fixed pointFloating point

Figure: Dynamic comparison between fixed point implementation and floating pointimplementation


Fixed point vs Floating point

Floating point takes advantage when many bits are usedIf number of bits is greater than 16, better use a Floating point system

However having a lot of bits can be costlyIn many DSP applications, few bits are consideredHigh number of bits affects the DAC architecture (and the targetedbandwidth)Modern architectures tend to be floating point but many DSP applicationsare still based on fixed point arithmetic (and will be)

Let’s focus on Fixed point approach


Conclusion on Fixed/Floating point

Fixed pointOne sign, on integer part,on fractional partTwo’s complementimplementation(b,m,n) or Q(m,n) withb = m + n + 1



b-1

2-n2-121 202m-1-2m

S


SignificandExponent

Floating pointOne sign, exponent andfractionResolution depends oncoded wordIEEE 754 norm


Fixed point arithmeticFixed point arithmeticSimple rules for bit manipulation

Arithmetic can be done with logical gates

Figure: Logical gatesDigital Signal Processing Fixed point arithmetic Fixed point arithmetic 202 / 355

Fixed Point arithmetic

As shown before, operation corresponds to exclusive OR and AND for carryThe carry is iteratively shifted to the leftIf inputs on n bits and output on n bits: overflow can occurIf the carry of the last stage is 1, result is greater than 2n − 1

a3 a2 a1 a0

b3 b2 b1 b0

s 3 s 2 s 1 s 0

r 0r 3 r 2 r 1r 4

+

a3 a2 a1 a0

b3 b2 b1 b0

d3 d2 d1 d0

r 0r 3 r 2 r 1r 4

_

Digital Signal Processing Fixed point arithmetic Fixed point arithmetic 203 / 355

Addition and subtraction for unsigned numbers

Fixed Point arithmetic

The carry is iteratively shifted to the leftIf inputs on n bits and output on n bits: overflow can occurSubtraction and additions are the same (carry of the sign bit is not shiftedthough).Fixed point addition: same !

14 0000 1110

+13 0000 1101

+27 0001 1011

- 14 1111 0010

+13 0000 1101

-1 1111 1111

-14 1111 0010

-13 1111 0011

-27 1110 0101


Addition and subtraction for Two’s complement

Fixed point arithmeticAddition

Exclusive OR for bit additionThe carry is a AND

x 0 0 1 1y 0 1 0 1

Exclusive OR & carry 0 1 1 (1, 0)Propag-1 0 1 1 0carry-1 0 0 1 0Res-1 0 1 (1, 0) 0

Propag-2 0 1 0 0carry-2 0 1 0 0Res-2 0 (1, 0) 0 0

Propag-3 0 0 0 0carry-3 1 0 0 0Res-3 1 0 0 0Res 1 0 0 0


Examplex 0 1 1 1 0 1 1y 1 0 1 1 1 1 1

x+y ? Check with decimal that the result is correct


Sign encodingExample with 8 bits coding

Binary coding of 3Binary coding of -4Check that 3 + 3 leads to 6Check that 3 - 4 leads to -1


Addition in Fixed point

Rulesx and y must have the same format (b,m, n)If they have not the same format, need to recast the data

mc = max(ma,mb)nc = max(na, nb)bc = mc + nc + 1

(5.7)

And the recast is done as followsFractional bits added (nc − na) (right) are set to 0Additional integer bits (mc −ma) are set to sign bit value (sign bit extension)


Addition in Fixed pointResult of addition is on

z → (bc ,mc , nc)

Result may not belong to interval of interestOne bit can be necessary to avoid the overflowIf result does not belong to Dc = [−2mc ; 2mc [

z → (bc+1,mc+1, nc)

x (7.75)

y (-0.5)

z (7.25)

x (7.75)

y (0.5)

z (8.25)


Fixed point arithmeticMultiplication

Elementary bit operation is a ANDOutput is extended on more bits than input

a3 a3 a3 a3 a3 a2 a1 a0× b3 b3 b3 b3 b3 b2 b1 b0

b0a3 b0a3 b0a3 b0a3 b0a3 b0a2 b0a1 b0a0b1a3 b1a3 b1a3 b1a3 b1a2 b1a1 b1a0 0b2a3 b2a3 b2a3 b2a2 b2a1 b2a0 0 0b3a3 b3a3 b3a2 b3a1 b3a0 0 0 0b3a3 b3a2 b3a1 b3a0 0 0 0 0b3a2 b3a1 b3a0 0 0 0 0 0b3a1 b3a0 0 0 0 0 0 0b3a0 0 0 0 0 0 0 0c7 c6 c5 c4 c3 c2 c1 c0

Figure: Sign number multiplication

In practice, risk of overflow -> One additional bitDigital Signal Processing Fixed point arithmetic Fixed point arithmetic 210 / 355

Fixed point arithmeticMultiplication

In fixed point output is on more bit than inputsSign bit extension at each stage

mMult = ma + mb + 1nMult = na + nbbMult = ba + bb

(5.8)

0 1 0 0

0 1 0 0

0 0 0 1 0 0 0 0

0 1 0 0

0 0 0 00 0 0 0

0 0 0 0

.. .. . .

One sign bit

One sign bit

Redundant sign bit

x (0.5). (4,0,3)y (0.5). (4,0,3)

2-62-1

znzm

z (0.25). (4,0,3)


Fixed point arithmeticExample Consider format (4,0,3)

x = -0.5y = 0.75

Calculate x × y

1 1 0 0

0 1 1 0

1 1 1 0 1 0 0 0

Redoundant sign bit

x (-0.5) format : (4,0,3)

y (0.75) format : (4,0,3)

z (-0.375) format : (8,1,6)

2-62-1

.

1 1 1 1

0 0 0 0

Sign bit

1 1 10 0 0 00 0 0 0

1 1 0 01 1 11 1 0 01 1

0 0 0 00


Conclusion

AdditionSame format (sign bit extension)Output is mc + 1 (truncation if definition domain is maintained)

MultiplicationBit sign extensionOutput format is different

mMult = ma + mb + 1nMult = na + nbbMult = ba + bb


Fixed point designWe now how to do bit operation in Fixed point

How to manage and built a complete system ?



b-1

2-n2-121 202m-1-2m

S

Fixed point design3 aspects

1 What is the dynamic ?2 How to split integer and fractional parts3 What is the quantization noise ?

Digital Signal Processing Fixed point arithmetic Fixed point design 214 / 355

Fixed point design

Requires a graph flow ofdesired applications

Output is noisy: Precision isevaluated with Signal toNoise Ratio (SNR)

Floating point baseline forinput (and SNR estimation)

Design is done beforeimplementation Hardware

implementation

Precision analysis (SNR)

System specification(data width)

Point specification

Dynamic estimation

Constraints

DSPµC

Floating point baseline

Fixe

dpo

int s

peci

ficat

ion

ASICFPGA

Digital Signal Processing Fixed point arithmetic Fixed point design 215 / 355

Fixed point design

Dynamic is used in order to minimize number of bitsDone to ensure that no overflow occur

x[n] F y[n]

x [n] in floating pointy [n] in floating point

How to calculate dynamic of y [n] ?

Dynamic estimation1 Simulation method2 Analytical methods

Digital Signal Processing Fixed point arithmetic Dynamic 216 / 355

Dynamic

Fixed point design

x[n] F y[n]

Simulation methodFirst way to estimate dynamic

Simpler but risky methodFrom various x [n] calculate the floating output and determine its dynamicDoes not ensure no overflow ! Depends on how simulated x [n] differs fromexperienced x [n]Sophisticated methods based on statistical approach


Dynamic

Fixed point designExample with a FIR filter

0 5 10 15 20 25 30 350

0.1

0.2

0.3

0.4

0.5

Lag index

Val

ue

Figure: FIR example impulse response


Fixed point designExample with a FIR filter

0 5 · 10−2 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5−60

−40

−20

0

20


Mag

nitu

de

Figure: FIR example frequency response


Fixed point design

White Gaussian noise

Chirp (0 ® Fe/2)

( ) 4.2)(max =nyn

( ) 3.9)(max =nyn

)(ny)(nx

)(ny)(nx

Uniform white noise)(nx )(ny ( ) 4.3)(max =nyn


Fixed point design

x[n] F y[n]

Interval arithmeticOther way to estimate dynamic

Deduce definition domain from all elementary operationsUse definition domain of input of elementary operations

Dx = [x , x ]Dy = [y , y ]

Output definition domain depends on input and operationsz = x + y [z , z ] = [x + y , x + y ]z = x − y [z , z ] = [x − y , x − y ]z = x × y [z , z ] = [min(xy , xy , xy , xy), max(xy , xy , xy , xy)]

(5.9)


Dynamic

Fixed point design

x[n] F y[n]

Interval arithmeticComplete graphflow can be obtained from all interval calculationBut can only be done with non-recursive systemsCan be done for FIR and not for IIR


Dynamic

Fixed point design

Norm is an operand that calculate output dynamic based on input dynamicAssume that the system is

I LinearI Invariant in time domain

For Single input

y [n] = h[n] ∗ x [n] Y (z) = H(z).X (z) (5.10)

and for Multiple inputs

y [n] =Ne−1∑i=0

hi [n] ∗ xi [n] (5.11)

Objective: Determine maximal output based on maximal inputDigital Signal Processing Fixed point arithmetic Dynamic 223 / 355

Norm

Fixed point design

Single inputConsidering single input system, we have

y [n] = h[n] ∗ x [n] Y (z) = H(z).X (z)

And absolute value of output is

|y(n)| =∣∣∣∣∣

+∞∑m=−∞

h(m)x(n −m)∣∣∣∣∣ (5.12)

This can be bounded by :

ymax = maxn

(|y(n)|) ≤ maxn

(|x(n)|)+∞∑

m=−∞|h(m)| (5.13)


L1 norm

Fixed point design

Multiple inputsExtension to multiple inputs

ymax = maxn

(|y(n)|) ≤Ne−1∑i=0

maxn

(|xi(n)|)+∞∑

m=−∞|hi(m)| (5.14)

" Several inputs is not a cascaded system (see after)

L1 Norm summaryBased on maximal value boundL1 norm does not rely on x statistical hypothesisNo overflow, but may be overestimated


L1 norm

Fixed point design

Single inputL1 norm can be overestimated, particularly if input signal is narrowbandWe consider a pure tone signal

x(n) = A cos(Ω.n) with 0 ≤ Ω < π (5.15)

H is a filter, and the output is

|y(n)| =∣∣A|.|H(ejΩ)|.| cos

(Ω.n + arg

(H(ejΩ)

))∣∣ with 0 ≤ Ω < π (5.16)

Chebyshev norm is defined as

ymax = maxn

(|y(n)|) ≤ maxn

(|x(n)|) maxΩ

(|H(ejΩ)|

)(5.17)


Chebyshev norm

Application to cascaded systemWe consider a cascad system with 2 filters

H1(z) H

2(z) H

3(z)

x(n)

)(zH

)().( zHzH

)().().( zHzHzH

y (n) y (n) y (n)

( )Ω

( ) ( )ΩΩ=

( )Ω


Application to cascaded systemWhen a cascaded system is considered, need to use the equivalent transfertfunction

H1(z) H

2(z) H

3(z)

x(n)

)(zH

)().( zHzH

)().().( zHzHzH

y (n) y (n) y (n)

Cascaded system

H(z) = H1(z)× H2(z)× H3(z)Dynamic:

y2−max = maxn

(|x(n)|) maxΩ

(|H12(ejΩ)|

)= max

Ω

(|H1.H2(ejΩ)|

)= 0, 356


Application to cascaded systemWhen a cascaded system is considered, need to use the equivalent transfertfunction

H1(z) H

2(z) H

3(z)

x(n)

)(zH

)().( zHzH

)().().( zHzHzH

y (n) y (n) y (n)

Sucessive system

Y2(z) = H2(z)× Y1[z ]Dynamic:

y2−max = maxn

(|x(n)|) maxΩ

(|H1(ejΩ)|

)max

Ω

(|H2(ejΩ)|

)= 1× 1 = 1


Conclusion on dynamicDynamic to calculate variation of output (wrt input): number of bits calculation

Simulated methodsBased on maximal encountered value based on inputsPrecise estimation. Does not avoid overflow

Analytic methodsInterval arithmetic

I Based on flowgraph between input and outputI Elementary operations are propagated

NormsI L1 Norm

F No overflow but (can be) overestimated

I Chebyshev normF Based on pure tone excitation

I For norms, cascaded systems as single output/input


Point positioningWe know what the dynamic is

As we are in fixed point, number of bits is not enough to characterize dataNeed to position the point (limit between integer and fractional part)

Point positioningEach stage of the flowgraph (elementary operation) will have its own tripletsEnsure correct specification with respect to arithmeticEnsure no overflow

From dynamic and interval definition

2mx−1 ≤ xmax < 2mx (5.19)

Position is thus:mx =

⌊log2

(maxn

(|x(n)|))⌋

+ 1 (5.20)

Digital Signal Processing Fixed point arithmetic Point positionning 232 / 355

Example of point positioningData is on 8 bits

1 Position of point if x ∈ [0, 1] ?2 Position of point if x ∈ [0, 1[ ?3 Position of point if x ∈ [0, 20] ?4 Position of point if x ∈ [0, 512] ?5 Position of point if x ∈ [0, 0.004] ?1 Position 1 (i.e 1 bit) for integer part (6 bits for FP)2 Position 0 for integer part (framing left) (7 bits for FP)3 Position 5 (2 bits for FP)4 Impossible (should require 10 bits).5 -7: Possible ! (7 bits for FP, start at 2−7)


Point positioning

Where the point is after elementary operation ?

MultiplicationFixed point multiplication leads to redundant sign bitThis bit is integrated into integer part of result

mz = mx + my + 1

In other wordsbz = bx + by , nz = nx + ny

→ mz = mx + my + 1


Rules

Point positioning


Addition without guard bitsCommon point positioning for the two inputsPosition to ensure no overflow

If output has no guard bit, point positioning as follows:

mc = max (mx ,my ,mz) (5.21)

Fixed point format update before mapping to addition module:

dx = mc −mx Rightshift dx bits ()dy = mc −my Rightshift dy bits ()dz = mc −mz Leftshift dz bits ()

(5.22)


Rules

Point positioning


Addition with guard bitsIf inputs have not same format (MSB not aligned)Data should be realigned before doing addition mx ′ = mx − gx

my ′ = my − gymz′ = mz − gz

(5.23)

With g• number of guard bits, m• initial dot positioning and m′• common dotpositioning


Rules

Point positioning

+

b

mc = max (mx ′ ,my ′ ,mz′) = max (mx − gx ,my − gy ,mz − Bg )


Addition with guard bits

Point positioning

+

b

Guard bit set to maximal value from adderCommon point positioningAdding guard bit (sign replication)Result is then right shifting if not all guard bits are effective used



Point positioning

+

b

Guard bits can be used at the output of the addergz = mz −mc if mz > mcgz = 0 if mz ≤ mc

(5.24)

Input point positioning mx = mc + gxmy = mc + gy

(5.25)

Output point positioningmz = mc + gz (5.26)



Point positioning

+

b

Input point positioning mx = mc + gxmy = mc + gy

(5.27)

Output point positioningmz = mc + gz (5.28)



Conclusion on Point positioning

Dynamic gives position of point

mx =⌊

log2(

maxn

(|x(n)|))⌋

+ 1

Rules for Fixed point arithmeticI Multiplication

mz = mx + my + 1 (5.29)Adder w/o guard bit

I Common formatI Shift for alignment and no overflow

dx = mc −mx Rightshift dx bits ()dy = mc −my Rightshift dy bits ()dz = mc −mz Leftshift dz bits ()


Conclusion

Conclusion on Point positioning

Adder with guard bitsI Need to shift to common format mx′ = mx − gx

my′ = my − gymz′ = mz − gz

I Guard bits can be used at the output of the addergz = mz −mc if mz > mcgz = 0 if mz ≤ mc

I Input point positioning mx = mc + gxmy = mc + gy

I Output point positioning

mz = mc + gz


Conclusion

Data width



b-1

2-n2-121 202m-1-2m

S

From previous sectionknow the dynamicWe know how to locate the point

In some cases, enough as data width is fixed (for example, 8 or 16 bits)Just ensure that data format is acceptable

In many cases, we want to use only bits that bear effective dataWhat is the fractional part length ?

Digital Signal Processing Fixed point arithmetic Data width & quantization noise 243 / 355

Data widthData width depends on instruction set used

Instruction set depends on chosen targetFor µC , DSP, 3 sets

Classic operations1 operation per cycle, with data width bopbop depends on chosen architecture

Arbitrary precision operationsData has a larger widthData is obtained by word concatenation

Sub word parallelism operationsData has a narrow widthParallel processing with k branchCan process word with data length bop/k.


Example of MAC operation

+

Accum

´

A B16 bits

32 bits32 bits

32 bits

+

Accum

´

A B16 bits

16 bits16 bits

16 bits

Double precision (w/o guard bit)

Single precision

+

Accum

´

A B16 bits

40 bits32 bits

40 bits

Double precision(with guard bit)


Data widthIn some components (ASIC or FPGA), data width is customizable: want to reducedata footprint

Lower the energy consumptionEnhance the available space

The lower the data width the better

What is the limit ?Ensure that processing is done as we wantLimitation of the quantization noiseWe define the Signal to Quantization noise

SNRq = 10log10(

E [|yfloating|]2E [|yfixed − yfloating|]2

)(5.30)

Purpose: maximizing SNR while reducing data width


Quantization noiseData is quantified to a discrete level. Two consequences

OverflowQuantization noise

OverflowFrom definition domain, output is

fD(x) =

x ∀ x ∈ DDD(x) ∀ x 6∈ DD

(5.31)

with D the overflow rule

Quantization noiseSurjective mapping between input and output

yi = ui+1−ui2 = ui + q

2 ∀ x ∈ ∆i = [ui ; ui+1] (5.32)

Output noise defined ase = x − xi (5.33)


Overflow

Xmin

Xmin

Xmax

Xmax

x

fD(x)

Xmin

Xmin

Xmax

Xmax

fD(x)

(b)(a)

Saturation:D(x) =

Xmin ∀ x < XminXmax ∀ x > Xmax

(5.34)

Modular rule

D(x) =

Xmax + (x − Xmin) ∀ x < XminXmin + (x − Xmax ) ∀ x > Xmax

(5.35)


Quantization noise

x

e(x)

ui+1ui

xi

x

fQ(x)

ui+1ui xi

q

2

-q

2

Figure: Rounding

Maximal error between 2 quantization steps

yi = ui+1−ui2 = ui + q

2 ∀ x ∈ ∆i = [ui ; ui+1] (5.36)


Rounding versus truncation

Quantization noise

e(x)

xui+1uix

fQ(x)

ui+1ui

q

-q

Figure: Truncation (Absolute sign value)

Maximal error before quantization stepsFormulation with and without 2’s complementWithout 2’s complement:

yi =

ui ∀ x > 0ui+1 ∀ x < 0 (5.37)



Quantization noise

e(x)

xui+1uix

fQ(x)

ui+1ui

q

Figure: Truncation (2’s complement)

Maximal error before quantization stepsFormulation with and without 2’s complementWith 2’s complement:

yi = ui (5.38)



Noise modelQuantization leads to noiseInput is random, so is noise

How to model it ?

n

x(n)

3q

2q

q

y(n)

3q

2q

q

CAN

x(n) y (n)n

e(n)

q/2

- q/2

e[n] = x [n]− y [n] (5.39)


Noise modelNoise quantization is added to the signal

e[n] = x [n]− y [n]

Equivalent noise model:

+ y

-e

xQ( ) yx

Probability density function

2q

2q-

q1

)(ep

e q

q1

)(ep

e

Rounding Truncation


Noise modelProbability density function

2q

2q-

q1

)(ep

e q

q1

)(ep

e

Rounding Truncation

RoundingMean: 0Variance: q2

12

TruncationMean: q

2

Variance: q212


Noise model

What happens if data format change ?Some LSB bits are removed (assume truncation)Noise impact is more important (SNR decreases)

S bm-1 b0 b-1 b-n+2 b-nb-n+1b-j-1b-j

2-n2-1202m-1 2-j-12-j

j k

Remaining bits Removed bits

Figure: Impact of truncation

If we remove k bits of fractional part

e =n∑

i=n−kb−i2−i (5.40)


Format change: truncation

Noise model

What happens if data format change ?Some LSB bits are removed (assume truncation)Noise impact is more important (SNR decreases)

S bm-1 b0 b-1 b-n+2 b-nb-n+1b-j-1b-j

2-n2-1202m-1 2-j-12-j

j k

Remaining bits Removed bits

Figure: Impact of truncation

Equivalent to add a noise with (q new quantization step: 2−n+k = 2−j)Mean

µ = q2 × (1− 2−k) (5.41)

Variance:

σ = q2

12 × (1− 2−2k) (5.42)Digital Signal Processing Fixed point arithmetic Data width & quantization noise 256 / 355

Format change: truncationFormat change: truncation

Fixed point noise analysis

We need to calculate the noise induced by the FP structure (FIR example)

´b0

+

z-1 z-1 z-1

bg mem

+bs0

+

+ ´b1

+

+bs1

+ ´bL-2

+

+bs L-2

+ ´bL-1

+

+bs L-1

+

+

x (n)

y (n)

b

Figure: Noise sources in FIR structure.


Fixed point noise analysisWe need to calculate the noise induced by the FP structure (FIR example)

Quantization noise with b with variance σ2b

σ2b = q2

12 (5.43)

Truncation noise bk for multiplier (when single precision multiplier is used)with variance σ2bk

σ2bk = q2

12 (1− 2−2k) (5.44)

I q: New quantization step (number of FP bits in new format)I k: Number of dropped bits

Common format noise bsk for each shift (common point positioning) withvariance σ2bs k

σ2bsk= q2

12 (1− 2−2k) (5.45)


Fixed point noise analysisFinal truncation bg due to write in memory (potential double to singleprecision transition)

σ2bg = q2

12 (1− 2−2k) (5.46)

I q: New quantization step (number of FP bits in new format)I k: Number of dropped bits

Final noise expression

σ2 = σ2b

(L−1∑k=0|h[k]|2

)+(L−1∑

k=0σ2bk + σ2bsk

)+ σ2bg (5.47)

First term: quantization noise filtered by the FIRh : Coefficients in time domainσ2bsk

and σ2bk can be merged in only one noise variance term (truncation andcommon point positioning in one expression)


Conclusion on data width

Need to calculate number of fractional part bitsFor some applications, number of bits is fixed

I Data can be split or group to modulate data widthI Arithmetic rules for propagation

Other applications: number of bits is a freedom degreeI Maximizing SNR while reducing number of bits

Data is quantized in a digital applicationI Quantization noise can be modeled as a random processI Uniform white processI Different digitalization methods leads to different noise behaviour


Fixed point conclusion



b-1

2-n2-121 202m-1-2m

S

Fixed point design: sign, integer part and fractional part (b,m,n)Baseline: floating point model

SignUnsigned versus signed versionDifferent way to encode sign data: Sign absolute value versus 2’s complement2’s complement allows straightforward use of arithmetic and only valueencodes 0

Digital Signal Processing Fixed point arithmetic Conclusion 261 / 355

Fixed point conclusion

ExponentEncodes integer partLinked to dynamic of dataSign extension bit in many arithmetic operations to avoid overflow

Fractional partEncodes fractionalLinked to output Signal to quantized Noise Ratio (SNR or SQNR)

Arithmetic rulesMultiplication and additions have rules for FP designIn practical systems, fixed point design corresponds to format calculation(Point positioning, data width)


FIR filter exampleKeep calm and do a complete example:

Filter expression

ck = 15 + 3.k

160 ∀ k ∈ [0; L2 − 1]

ck = hL−1−k ∀ k ∈ [L2 ; L− 1]

Design the filter with1 Double precision no guard bits

architecture2 Double precision - 8 guard bits

architecture3 Single precision architecture

0 5 10 15 20 25 30 350

0.1

0.2

0.3

0.4

0.5

Lag index

Val

ueFigure: FIR example impulseresponse


FIR filter exampleKeep calm and do a complete example:

Filter expression

ck = 15 + 3.k

160 ∀ k ∈ [0; L2 − 1]

ck = cL−1−k ∀ k ∈ [L2 ; L− 1]

Design the filter with1 Double precision no guard bits

architecture2 Double precision - 8 guard bits

architecture3 Single precision architecture

0 5 · 10−2 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5−60

−40

−20

0

20


Mag

nitu

deFigure: FIR example impulseresponse


Fast Fourier Transform

Digital Signal Processing Fast Fourier Transform 265 / 355

Introduction

Many processing in Digital signal processing requires to shift from time domainto frequency domain Done with Fourier transformBut how Fourier transform can be (efficiently) done ?

Digital Signal Processing Fast Fourier Transform Introduction 266 / 355

Fourier transform formulationLet’s go back to the initial Fourier transform

The signal x(n) is sampled at its sampling frequency Fc

If we consider a FT of size N we have

X (k) =N−1∑n=0

x(n).e(−2jπ n.kN ), k = 0, 1 . . .N − 1 (6.1)

And its inverse Fourier transform is defined as

x(n) = 1N

N−1∑k=0

X (k).e(2jπ n.kN ), n = 0, 1 . . .N − 1 (6.2)

Note that in practice both time domain and frequency domain signal are complexComputational complexity ?


Fourier transform formulationEquation 6.1 can be formulated with the use of matrixes

X (k) =N−1∑n=0

x(n).e(−2jπ n.kN ), k = 0, 1 . . .N − 1 (6.3)

Discrete Fourier Transform (DFT) with matrix

X (0)X (1)...

X (N − 1)

=

1 1 1 · · · 11 W 1

N W 2N · · · W N−1

N...

......

......

1 W 2N W 4

N · · · W 2(N−1)N

1 W N−1N W 2(N−1)

N · · · W (N−1)2N

×

x(0)x(1)...

x(N − 1)

(6.4)

with WN is called twiddle factor and is equal to e(−2j πN )


Fourier transform formulationWe introduce the Discrete Fourier Transform (DFT) matrix as

FN = 1√N

1 1 1 · · · 11 W 1

N W 2N · · · W N−1

N...

......

......

1 W 2N W 4

N · · · W 2(N−1)N

1 W N−1N W 2(N−1)

N · · · W (N−1)2N

with WN is called twiddle factor and is equal to e(−2j πN )

DFT matrix propertiesThe twiddle factor is a primitive Nth root of unityThe matrix is a Vandermonde matrix

I Matrix with the terms of a geometric progressionNormalisation factor definition can sometimes be different

I Normalisation product (between F and F−1) should be equal to 1I With 1√

N the matrix becomes unitary (i.e it preserves energy)


Fourier transform formulation

X (0)X (1)...

X (N − 1)

=

1 1 1 · · · 11 W 1

N W 2N · · · W N−1

N...

......

......

1 W 2N W 4

N · · · W 2(N−1)N

1 W N−1N W 2(N−1)

N · · · W (N−1)2N

×

x(0)x(1)...

x(N − 1)

Complexity (complex)N2 complex multiplicationsN(N − 1) complex additions

Complexity (real)4N2 real multiplicationsN(4N − 2) real additions

Complexity in O(N2)Digital Signal Processing Fast Fourier Transform Introduction 270 / 355

Principle of the FFTFourier transform is too complex

4N2 real multiplicationsN(4N − 2) real additionsAnd lot of memory (N samples lot of basis elements)

But possibility to use Fourier transform properties

W k(N−n)N = (W kn

N )∗

W k(n+N)N = W n(k+N)

N = W knN

W n+N/2N = −W n

N

W 2knN = W kn

N/2

Complexity is reduced by still in O(N2) [25][25] Oppenheim and Schafer, Digital Signal Processing, 1975.

Digital Signal Processing Fast Fourier Transform Principle of the FFT 271 / 355

Principle of the FFT

Complexity can be reduced as Fourier transform can be clustered (or regrouped)Principle of the Fast Fourier Transform (FFT)

FFT algorithm principleSeparation can be done in frequency domain

I 2 sequences depending on index parityI X(k) is divided into 2 sub-sequences

Separation cane be done in time domainI 2 sequences depending on index parityI x(n) is divided into 2 sub-sequences

Initial FFT proposition in 1965 in [26]Lot of research still today to these methodsSimilar methods for other decomposition (Hadamard . . . )

[26] Cooley and Tukey, “An algorithm for the machine calculation of complex Fourier series,” 1965.

Digital Signal Processing Fast Fourier Transform Principle of the FFT 272 / 355

Time split FFTLet’s express the DFT in time domain

Separate the odd from the even indexes

X (k) =∑n even

x(n).W nkN +

∑n odd

x(n).W nkN

X (k) =N/2−1∑n=0

x(2n).W 2nkN +

N/2−1∑n=0

x(2n + 1).W (2n+1)kN

X (k) =N/2−1∑n=0

x(2n).W nkN/2 + W k

N

N/2−1∑n=0

x(2n + 1).W nkN/2

X (k) = G(k) + W kN .H(k)

With G(k) is the DFT on N/2 even pointsH(k) is the DFT on N/2 odd points

Digital Signal Processing Fast Fourier Transform Time split FFT 273 / 355

Time split FFTDFT of size N is now divided into 2 sub-DFT of size N/2

Not a complexity reduction though !

We can now use the DFT properties (linked to twiddle factor)

W k+N/2N = −W k

N

And the DFT can be expressed as

X (k + N2 ) =

N/2−1∑n=0

x(2n).W n(k+N/2)N/2 + W (k+N/2)

N

N/2−1∑n=0

x(2n + 1).W n(k+N/2)N/2

X (k + N2 ) = G(k)︸︷︷︸

Evenpoints

−W kN . H(k)︸︷︷︸

Oddpoints

This is called a butterfly


To FFTIf N is a power of 2, N/2 can also be divided by 2.

In a Divide and conquer approachCan be done iteratively until a DFT on two points

Input - output link

X (k) = G(k) + W kN .H(k)

X (k + N2 ) = G(k)−W k

N .H(k)

2 points DFTMinimal graph flow

Minimal complexity, as complexity is on O(n)Then, recombination with log2(n) recombinations

FFT complexity is O(n)× log2(n)Digital Signal Processing Fast Fourier Transform Time split FFT 275 / 355

Time split FFTSeparation of the DFT into two main streams

One dedicated to odd pointsOne dedicated to even points

With twiddle factor, recombination can be done

X(0)

X(1)

X(2)

X(3)

X(4)

X(5)

X(6)

X(7)

N/2 points

N/2 points

TFD

TFD

x(0)

x(2)

x(4)

x(6)

x(1)

x(3)

x(5)

x(7)

W 18

W 28

W 38

W 48

W 58

W 68

W 78

W 08

Figure: Index division principle


Minimal FFT butterfly [Time split]A FFT is a flow based on a minimal butterfly

This minimal flow is present at each log2(N) stage of the FFTLet’s have a glance at stage m

Xm(p)

Xm(q)

Xm+1(p)

Xm+1(q)W rN

1

-1

+

+

Figure: Butterfly flow

Link between input and output (depends on index, generic pattern here) linked tothe equations

Xm+1(p) = Xm(p) + W rN .Xm(q)

Xm+1(q) = Xm(p)−W rN .Xm(q) (6.5)


Minimal FFT butterfly [Time split]

Xm(p)

Xm(q)

Xm+1(p)

Xm+1(q)W rN

1

-1

+

+


For each unitOne complex multiplication (based on twiddle factor)One additionOne subtraction

Now, this unit has to be used in each log2(N) stages of the FFTDivide and conquer: go from last stage to first stageIndexes can be rearranged (index are in bit reverse mode)


Example with 16 points FFT

Even N/2 FFT

Odd N/2 FFT

X(0)

X(1)

X(2)

X(3)

X(4)

X(5)

X(6)

X(7)

X(8)

X(9)

X(10)

X(11)

X(12)

X(13)

X(14)

X(15)

x(0)

x(2)

x(4)

x(6)

x(8)

x(10)

x(12)

x(14)

x(1)

x(2)

x(3)

x(5)

x(7)

x(9)

x(11)

x(13)

Figure: Example of last stage for a 16 points FFT with butterfly approach


Example with 16 points FFT

Figure: Example of a 16 points FFT with butterfly approach


Bit reverseDue to the split formulation, indexes are in bit reverse.

Transition is obtained by shifting the bit order and describe them in bigendianbN−1 . . . b0 −→ b0b1 · · · bN−1Example on 8 bits

Regular index Bit sequence Bit reverse sequence Bit reverse value0 000 000 01 001 100 42 010 010 23 011 110 64 100 001 15 101 101 56 110 011 37 111 111 7

It means that the third index corresponds now to the input 2


Frequency split FFT => From DFT expression

The FT can be separated into to half FT size blocks.

X (k) =N/2−1∑n=0

x(n).W nkN +

N−1∑n=N/2

x(n).W nkN

X (k) =N/2−1∑n=0

x(n).W nkN + W kN/2

N

N/2−1∑n=0

x(n + N/2).W nkN

X (k) =N/2−1∑n=0

[x(n) + (−1)k .x(n + N/2)

]W nk

N

Digital Signal Processing Fast Fourier Transform Frequency split FFT 282 / 355

Frequency split FFTWe can now separate odd and even indexes

X (2p) =N/2−1∑n=0

[x(n) + x(n + N/2)]W 2pkN

X (2p + 1) =N/2−1∑n=0

[x(n)− x(n + N/2)] W 2pkN .W n

N

And same divide and conquer method

X (2p) =N/2−1∑n=0

[x(n) + x(n + N/2)] W pkN/2

X (2p + 1) =N/2−1∑n=0

([x(n)− x(n + N/2)] W pk

N/2

).W n

N


Minimal FFT butterfly [Frequency split]

Xm(p)

Xm(q)

Xm+1(p)

Xm+1(q)W rN

1

1

1

-1

+

+


For each unitOne complex multiplication (based on twiddle factor)One additionOne subtraction (Addition and bitflip)

Same computational complexity as with Time split FFTBUT not same butterfly pattern !


Frequency split FFT

Figure: FFT with decimation in frequency


Other radix schemes

Figure: Overview of radix schemes for FFT


Other radix schemes - Radix 4X(0)

X(1)

X(2)

X(3)

X(4)

X(5)

X(6)

X(7)

X(8)

X(9)

X(10)

X(11)

X(12)

X(13)

X(14)

X(15)

X(0)

X(8)

X(4)

X(12)

X(2)

X(10)

X(6)

X(14)

X(1)

X(9)

X(5)

X(13)

X(3)

X(11)

X(7)

X(15)

Figure: FFT with 16 bins with radix 4


Conclusion on FFT

Discrete Fourier Transform is a basis operation in signal processingFrequency processing widely used in many fieldsBase of modern transmission systems (OFDM)

Classic DFT requires lot of computational load4N2 real multiplicationsN(4N − 2) real additionsAnd lot of memory (N samples lot of basis elements)

Fourier transform has a lot of embedded symmetry that can be exploitedBased on roots of unity (twiddle factor)Complexity is highly reduced when N is a power of 2

Digital Signal Processing Fast Fourier Transform Conclusion 288 / 355

Conclusion FFTTime split FFTDivide and conquer method

Each stage is divided into 2 subflows with n/2 DFT stagesBased on DFT properties

X (k) = G(k) + W kN .H(k)

X (k + N2 ) = G(k)−W k

N .H(k)

Minimal system with a butterfly with 2 coefficientsIndex can be rearranged (index in bit reverse mode) to better show thebutterflyN/2 log2(N) complex multiplications and N log2(N) additions

Remark: this approach is called a Radix-2 butterflyPossibility to use larger number of radix (4, 16, . . . )In practice, radix-2 is not best solution (performance/calculation, memory . . . )

Digital Signal Processing Fast Fourier Transform Conclusion 289 / 355

Multirate processing

Digital Signal Processing Multirate processing 290 / 355

IntroductionSignal to be sampled must respect Shannon theorem [22].

For the moment we have considered only one sampling frequencyIn some (many !) cases, frequency sampling can be changed during processing.Example of low pass filtering and downsampling:

-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5


-45

-40

-35

-30

-25

-20

-15

-10

-5

0

Magnitude

-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5


-30

-25

-20

-15

-10

-5

0

Magnitude

Figure: Signal with low bandwidth (wrt sampling frequency)

[22] Shannon, “A mathematical theory of communication,” 1948.

Digital Signal Processing Multirate processing Introduction 291 / 355

IntroductionModification of sampling frequency

Reduction of sampling frequency: downsamplingIncreasing of sampling frequency: upsampling

DownsamplingHow ensure that signal is not distorted ?How design the dowsampling filter ?

UpsamplingHow ensure that signal is still a low bandwidth signalHow design the interpolation filter ?

How downsampling and upsampling can be done with arbitrary factors ?

Digital Signal Processing Multirate processing Introduction 292 / 355

DownsamplingReduction of sampling frequency.

Reduction is done with a downsampling factor M (assumed to be integer)We start from continous signal

x [n] = xc(nTc)

with Tc initial sampling time. Reduction of sampling frequency corresponds toincrease of sampling time.

xd [n] = xc(nTd)

with Td new sampling time, greater than Tc , with the relationship

Td = M × Tc

Fd = FcM

Digital Signal Processing Multirate processing Downsampling 293 / 355

Downsampling

Downsampling conditionA signal can be sampled if it respects the Shannon theorem

A signal can be downsampled at rate Fd only if it respects Shannontheorem with respect to Fd .

Signal shall have limited bandwidth: We denote ω0 the maximal pulsation of xc .It means

Xc(jω) = 0 for |ω| ≤ ω0

Downsampling can be done (no aliasing) only if

π/Td = π/(MTc) ≥ ω02πFd = 2πFc/M ≥ ω0


Downsampling

Classic downsampling scheme

x[nTc] ↓ M x[nTd]

Td = M × Tc

Fd = Fc

M

Figure: Downsampling principle

What is the spectral consequence of downsampling ?Why an anti-aliasing filter ?


Downsampling

t

x(nTc)

Tc ww0

1/T

-w0

X(ejW)

p/Tc-p/Tc

t

xd(nTd)

Td w

1/T' Xd(ejW)

p/Td-p/Td

Figure: Sampling impact on frequency domain


Downsampling

Real signals have leakagesAliasing if signal is not band limitedIn practise, leakages at high frequency (noise, replica, . . . )

Before downsampling, always apply a Low pass filter to avoid aliasing

Which filter ?See IIR/FIR synthesis part

x[nTc]

Low pass filterGain = 1Ωc = π

M

↓M x[nTd]


UpsamplingIncreasing of sampling frequency.

Increasing is done with a upsampling factor L (assumed to be integer)We start from continous signal

x [n] = xc(nTc)

with Tu initial sampling time. Increasing of sampling frequency corresponds toincrease of sampling time.

xu[n] = xc(nTu)

with Td new sampling time, lower than Tc , with the relationship

Tu = TcL

Fu = L× Fc

Digital Signal Processing Multirate processing Upsampling 298 / 355

UpsamplingHow can we do an upsampling in practise ?

From signal at rate Fc , with insertion of zero

xe(n) =

x(n/L), n = 0,±L,±2L, . . .0, otherwise (7.1)

or equivalently

xe(n) =∞∑

k=−∞x(k) · δ(n − kL) (7.2)

x[nTc] ↑ L x[nTd]

Td = Tc

L × Tc

Fd = Fc × L


UpsamplingWhat happens in frequency domain ?

Sampling in time at Td Periodisation in frequency domain at each Fd

-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5


-45

-40

-35

-30

-25

-20

-15

-10

-5

0

Magnitude

-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5


-45

-40

-35

-30

-25

-20

-15

-10

-5

0

Magnitude

Figure: Upsampling consequence (upsampling factor of 3)


UpsamplingNeed to apply a low pass filtering to remove images

Signal with a maximal frequency of ω0Cut-off frequency of LPF filter with value ω0/L (π/L in full band mode)

-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5


-45

-40

-35

-30

-25

-20

-15

-10

-5

0

Magnitude

-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5


-40

-35

-30

-25

-20

-15

-10

-5

0

Magnitude

Figure: Upsampling consequence (upsampling factor of 3) followed by digital filtering


Upsampling

How can we do an upsampling in practise ?From signal at rate Fc , with insertion of zeroFollowed by a low pass filtering

x[nTc] ↑ L LPF filter x[nTu]

Tu = Tc

L × Tc

Fu = Fc × L


Fractional interpolation

Downsampling: Frequency reduction with integer factorUpsampling: Frequency increase with integer factor

How can we manage fractional upsampling or fractional downsampling ?

Output frequency Fo = k × Fc with k ∈ Q.

Fo = LM Fc

Fractional interpolator can be seen as upsampling followed by a decimation

Digital Signal Processing Multirate processing Fractional interpolation 303 / 355

Fractional interpolation

x[nTc] ↑ L LPF filter x[nTd] LPF filter ↓M x[nTd]

Fractional interpolator can be seen as upsampling L followed by a decimationM

Fractional interpolation and filter designFilter after upsampling stageFilter before downsampling stage

Common interpolation filterOnly more selective filter has to be applied

Ωc = min(π

L ,π

M

)


Conclusion on multirate processing

One functional frequency is not enoughNeed to decrease: downsamplingNeed to increase: upsampling

DownsamplingKeep one sample per MBefore dropping samples, LPF filter with π

M

x[nTc]

Low pass filterGain = 1Ωc = π

M

↓M x[nTd]


Conclusion on multirate processingUpsampling

Insert L zeros between samplesSpectral images have tp be removed with LPF filter of πL

x[nTc] ↑LLow pass filter

Gain = 1Ωc = π

L

x[nTu]

Fractional interpolatorChanging rate with a L

M rateCommon filter (more selective applied) between DS and US

x[nTc] ↑ LLPF filterGain = 1

Ωc = min(πL ,

πM

) ↓M x[nTd]


Digital Signal Processor

Digital Signal Processing Digital Signal processor 307 / 355

DSP algorithm implementation

Signal is now digitalAssume that digitalization went OKConsequence of digital conversion in next sections

I Fixed point implementationI Quantization noise impact

DSP implementationData is ready to be processedProcessing depends on chosen architectureBut common rules on how we can do the processing

Digital Signal Processing Digital Signal processor DSP algorithm implementation 308 / 355

DSP algorithm implementationFrom difference equation to flowgraph

y [n] =L−1∑k=0

h[k]x [n − k]

x[n] z−1

×h0

z−1

×h1

+

. . . z−1

×hL−2

+. . .

z−1

×hL−1

+ y[n]

Figure: FIR flowgraph

Equivalence between equation and flowgraphImplementation description


DSP algorithm implementationSome typical examples

Filteringy [n] = y [n] + x [n − k]× h[k] MAC: Multiplication and accumulation

Adaptive formy [n] = y [n − 1] + x [n − k]× h[k] MAD: Multiplication and addition

Complex arithmeticx [n − k] = xi [n − k], xq[n − k], h[k] = hi [k], hq[k]yi [n] = xi [n − k]× hi [k]− xq[n − k]× hq[k]yq[n] = xi [n − k]× hq[k] + xq[n − k]× hi [k]


DSP algorithm implementation

Some properties to keep in mind

Signal processing needs heavy calculationI lot of MAC/MADI Complex multiplications in many fields (FFT,. . . )

Quantization and digital issuesI Full scale and fixed point analysisI Noise and distortions

Data storage and manipulationI Linear or indexed accessI LUTI Data collection and ageing


DSP algorithm implementationSome properties to keep in mind

Digital signal processing core is real time processingExecution time linked to input sampling rate (TI) and output sampling rates(TO)

I Same rate between I and O: TI = TOI Downsampling rates (TI < TO)I Upsampling rates (TI > TO)

Effective processing done in TexI Tex < TO

Digital signal processing can means lot of dataHigh computational loadExtensive use of memory and I/O (DMA)


Real time processingOnline processing

One output per inputProcessing time must be lower than output sampling time3 steps

1 Acquisition2 Processing3 Restitution

eT eT.2 eTn. t

firInput(), output()

idle

x[n] D S P y[n]

y[n]x[n]

nn


Real time processingBlock processing

processing done on several consecutive inputs3 steps

1 acquisition and storing2 processing3 restitution

eT eT.2 eTn. t

FirBlocInput[], output[]

idle

acqexece tNtTN .. +>

D S P

y[n]x[n]

n

x[n0]x[n0-1] x[n0 -2] x[n0 -3] x[n0 -4]

y[n0]y[n0-1] y[n0 -2] y[n0 -3] y[n0 -4]

n


Real time processing

Block processingBlock is associated to samples selected by a window

1 Window can be consecutive (IFFT processing in digital communications)2 Window can overlap (Welch spectral analysis)3 Window can be disjointed


Processing rate

Device and application parameters:Number of parallel path that can be processed Np

Elementary number of DSP units Ne

Sampling period Te

Processing rate of the device

Pp = Np × NeTe

(8.1)


Architecture comparison

EmbeddedFPGA

EE : Efficiency : MIPS / Watt

ReconfigurableProcessor

EmbeddedProcessor

SA1100.4 MIPS/mW

Alpha0.007 MIPS/mW

DSP

2 V DSP3 MOPS/mW

Pleiades10-50 MOPS/mW

ASIC100-1000 MOPS/mW

1100

Flex

ibili

ty)/( mWMIPSCPE c

E =

Energy efficiency

10

ASIP

Digital Signal Processing Digital Signal processor Digital signal processor architecture 317 / 355

DSP applicationSpecific flowgraph with ADC/DAC

Sens

ors

A

NB

ADC

Low passAnti aliasing filter

D.S.P. DAC

Actu

ator

s

0.2360.4210.5630.60.610.6130.611.2360.580.490350.20.20.21

0.2360.4210.5630.60.610.6130.611.2360.580.490350.20.20.21

Digital action Te

Specific instructions dedicated to signal processingSpecific architecture

Digital Signal Processing Digital Signal processor Digital signal processor architecture 318 / 355

DSP core unitsSeveral unity dedicated to different purposeInstruction processor

Also called control unityInterpret instructions, and command other unities

Data processorModify and update data

Memory unityInstruction memory that stores instructionsData memory that buffers data from data processor

Communication unitExternal interface unitPilot access to data (or external instructions) and other processor links.

Digital Signal Processing Digital Signal processor DSP core units 319 / 355

DSP core units

Instruction Processor (IP): Functional Unit (FU) that interpret instructionsand command other FUData Processor (DP): FU that modifies or transform dataMemory

I IM : Instruction Memory : Instruction storageI DM : Data Memory


DSP core units

Data unitD.P.

Control unit: I.P.

Memory unitI.M. + D.M.

Communication unit EIU

Figure: Synoptic of DSP architecture


Processing unitsProcessing is done depending on topologyFixed point

One sign, on integer part,on fractional partTwo’s complementimplementation(b,m,n) or Q(m,n) withb = m + n − 1



b-1

2-n2-121 202m-1-2m

S


SignificandExponent

Floating pointOne sign, exponent andfractionResolution depends oncoded wordIEEE 754 norm


OperationHardware multiplier

Perform multiplication in one cycleOutput is stored in register or given to ALU

Arithmetic and Logic Unit (ALU)Arithmetic operation (add/sub, sign shift,. . . )Logical operation (or/not,. . . )

Shift registersCylinder shift: perform arbitrary shifting in one cycle

Specific blocksDepends on DSP (LMS, Viterbi, . . . )

Address generationOperand registerCircular buffersBit reverse (FFT)


Operation

Operation during a MAC

Operation Input bits Output bitsMultiplier b 2bUAL (add) 2 b 2b

2b + bg 2b+bgSaturation 2b b

2b + bg b

Registers used:

Register Bit numberInput operand b

Accum 2b2b + bg

+

Accumulateur

´

A B

Sat /Arr

bnat

bmult

badd

MACP

bnat


Classic processor architecture problemIf we consider a classic Von Neuman architecture

loop:mov *r0,x0mov *r1,x1mpy x0,x1,aadd a,bmov x1,*r2inc r0inc r1inc r2dec ctrtst ctrjnz loop

15 à 20 cycles for execution

Data Path Memory

Figure: FIR operation with classic architecture

Memory pointer to input dataData processing (multiplication and addition)Aging data (x [n]→ x [n − 1])Update memory pointers


Classic processor architecture problem

Purpose: do these operation in one clock1 Getting input data x [n − k]2 Getting filter coefficient h[k]3 Process data and accumulate4 Aging data

Communications between memory and processing can struggle theeffective rate.

ArchitectureVon Neuman: Unique memory unit for both program and dataHarvard: Separation in two memory units (one for program and one fordata). Each of them have a separate communication bus


DSP architecturesBased on Harvard architecture [27]

Separation of memory unit and processing unitFaster processing compared to Von NeumanPipeline: Instruction fetch → operand fetch

DM

DP

IM

IP

Figure: Classic Harvard architecturefor DSP

Example: TMS320C10

DM for data memoryIM for instruction memoryDP for data processorIP for instruction processor

[27] Lee, “Programmable DSP Architectures: Part I,” 1988.


DSP architectures

Harvard: first variant [27]

Data can be stored in IM

Two operand can beaccessed if IM supports twoaccess in one cycle

I Instruction fetchI 2 operand fetchesI MAC executionI Output in I/O or in

memory

DM

DP

IM/DM

IP

1-to-2

Figure: Classic Harvard architecture for DSP

Example: AT&T DSP32 or DSP32C



DSP architecturesHarvard: Second variant [27]

DM is a multiport memorySeveral access to memory per cycle

DM

DP

IM

IP


Example: Fujitsu MB86232 (3 ports)



DSP architecturesHarvard: Third variant [27]

2 separated DMIf time access to memory is the same, process 2 operand per cycle

DM 1

DP

IM

IP

1-to-2

DM 2


Example: Motorola DSP 56001 - 96002; TMS320C30 - C40



DSP architecturesHarvard: Fourth variant [27]

IM can store operandCache memory forfrequently used instructions(ex. loop)Avoid memory accessconflict (in version 1)

DM

DP

IM/DM

IP

1-to-2 cache


Example: TMS320C25 (cache 1), ADSP-2100 (16-cache)



Conclusion on DSP architecture

Von Neuman architecture is not well adaptedDSP is real time data oriented processing

Harvard architecture for DSPOne Data Memory (DM), one Instruction Memory (IM)Several variant to speed up processing

I Allow use of IM to store dataI Multiport DMI Several DMI DM/IM combined with cache

How to choose ? Depends on other constraints, and operand location: see [28]

[28] Araujo, “Code Generation Algorithms for Digital Signal Processors,” 1997.


Operand location

Use one data from memory and one data inregisterIn one cycle, only one memory lecture isdoneLegacy Harvard architecture can be usedInstruction execution time

tinst = tacc + texec

I With tacc duration of reading from memoryI texec execution of operation

Minimization of tacc with static memory(and pagination)

Memory

Instruction

R

A

texec

tacc

Figure: Register-memory model


Register - memory model

Operand location

Avoid access to memory: operand are inregistersInstruction and memory-to-register is donesimultaneouslyNext operand are moved to register duringcurrent instructionInstruction execution time

tinst = max (tacc; texec)

I With tacc duration of reading from memoryI texec execution of operation

2 operand are read: multi-port DM orseveral DM architecture

Memory

RX

Instruction

RY

A

texec

tacc

Figure: Register-register(load-store) model


Memory- memory model (load-store)

Addressing modes

Data Memory is frequency used How to map address in a efficient way ?

Immediate addressingInstruction contains the operand valueADD #4, AUseful for initialisation and for small data2 words is needed to code the instruction for long word

Direct addressing modeInstruction contains the address of the operandIf full address is coded, 2 words is neededPossibility to use only one word with paged memory


Addressing modesData Memory is frequency used How to map address in a efficient way ?

Direct addressing modePaged memory is used to partially encode addressAddress is a page pointer (in register) and a offset

I Data pointer and Stack pointer

LD #x ,DP Loads MSB of x’s address in page pointerLD @x + 1,AADD @x ,AADD @x + 2,A

a11 A10 a9 a8 a0a15

In instruction9 bits TMS320C54x 7 bits

In register (DP/SP)


Addressing modesData Memory is frequency used How to map address in a efficient way ?

Indirect addressing modeData in memory is pointed by an auxiliary registerInstruction contains the register indexLimited number of register to limits the bit useParticularly efficient in DSP architectureNatural array indexing (pointer) and BK to use circular buffer

Data MemoryStart_address =

xxxxxxxxxxx00000

ARi

End_address = xxxxxxxxxxx11111

xxxxxxxxxxx00010

ARi BK

N=30=1 1 1 1 0


Addressing modesData Memory is frequency used How to map address in a efficient way ?Indirect addressing mode

Possibility to update (calculate) next used addressLinear, indexed, modulo, circular buffer operations

• Register points to data • LD *AR1, A

• Post - Modification:• linear : AR:= AR ± 1 LD *AR1+, A

• indexed : AR:= AR ± MR LD *AR1+0, A

• MR: registre d’index

• modulo : (AR:= AR ± 1)N LD *AR1+% ,A

• bit-reverse : FFT LD *AR1+0B ,A

x0 x1x2

x3x5

addr = AR1AR1 = AR1 + 1

addr = AR1AR1 = AR1 + AR0

addr = AR1AR1 = (AR1 + 1) modulo BK

(BK) specifies the size of the circular buffer.

addr = AR1 (Aï *AR1)

addr = AR1AR1 = bitrev(AR1 + AR0)

After access, AR0 is added to ARx with reverse carry (rc) propagation.


Addressing modesData Memory is frequency used How to map address in a efficient way ?Indirect addressing mode

•Circular buffer• Aging data is automatic (shifting address)

• Bit-Reverse :•. Addressing data in FFT stages

Index bit bit reverse index bitreverse

0 000 000 01 001 100 42 010 010 23 011 110 64 100 001 15 101 101 56 110 011 37 111 111 7

xn-4 xn-3

xn-2

xn-1xn-6N=8

xn

*AR1

xn-2

xn-1

xn

xn-7

xn-3

xn-6

xn-5

xn-4

Élément courant du buffer

First array element

Last element

xn-1

xn

xn+1

xn-6

xn-2

xn-5

xn-4

xn-3

Current data


Example of TSM320C4x

8 auxiliary registers (AR0. . . AR7)2 processing unit ARAU2 specific register

I BK : circular buffer sizeI Index register AR0


Control unitClassic pipeline used in several steps

1 Instruction search2 Decoding instruction3 Operand search4 Execution

Inst n-1

i+1i i+2 i+3 i+4

Inst n+1Inst n

Inst n-1 Inst n Inst n+1

Inst n-1 Inst n Inst n+1

Instruction search

Operand search

Decoding

Inst n-1 Inst nExecution


Control unit

Stationnarity (time domain) from unit point of view

MULT R1,R2,T ADD T,A,A LD R1,AR1LD R2,AR2

LOAD

i+1i i+2 i+3 i+4

LOADLOAD

MULT MULT MULT

ADD ADD ADD

Memorytransfert.

ACUM unit

Mult Unit

Instruction are coded sequentially


Control unit

Stationnarity (data domain) from data processing point of view

LOAD

i+1i i+2 i+3 i+4

LOADLOAD

MULT MULT MULT

ADD ADD ADD

Memorytransfert

Accum unit

Mult unit

Simpler to code as closer to algorithm


Some DSPs

Different baselinesFixed point on 16-24 bits (accumulation 40-60 bits)32 bits in floating pointInstruction set: 16 - 32 bitsOne instruction per cycle (complex set)

leads to different architecturesStrong architecture constraintsExternal memory (on-chip)Hardware for loop management

Digital Signal Processing Digital Signal processor Some DSPs 344 / 355

Some DSPs

Texas instrumentC2000 : Fixed point 16 bits : motor controlC5000 : Fixed point 16 bits : Low consumptionOMAP : C55x + µP ARMC6000 : Fixed point 16 bits (C67 floating) : High performance

Analog deviceADSP21xx : Fixed point 16 bitsSHARC : Floating point 32 bits


Specific instruction set

Syntax Execution Example

MAC[R] Smem, src (Smem) x (TREG) + (src) --> src MAC *AR5, A

MAC[R] Xmem, Ymem, src [,dst] (Xmem) x (Ymem) + (src) --> src or [,dst]

(Xmem) --> TREG

MACR *AR5+, *AR6+, A, B

MAC #lk, src [,dst] (TREG) x lk + (src) or [,dst] MAC #345h, A, B

MAC Smem, #lk, src [,dst] (Smem) x lk + (src) or [,dst]

(Smem) --> TREG

MAC *AR5+, #1234h, A

Description The MAC[R] instruction multiplies and adds with or without rounding. The result is stored in the destination accumulator, if specified, or in the source accumulator. For syntaxes 2 and 3, the data-memory value after the instruction is stored in TREG. TREG is updated during the read phase. The MACR instruction rounds the result of the MAC operation by adding 2^15 to the result and clearing the 16 LSBs (bits 15-0) to 0. Operands Smem: Single data-memory operand Xmem, Ymem: Dual data-memory operands src, dst: A (accumulator A), B (accumulator B) lk: 16-bit long-immediate value


MAC operation



FIRS Xmem, Ymem, pmad

pmad -->PAR While (RC) != 0 (B) + (A(32-16)) x (Pmem addressed by PAR) --> B ((Xmem) + (Ymem)) << 16 --> A (PAR) + 1 --> PAR (RC) - 1 --> RC

FIRS *AR3+, *AR4+, COEFFS

Description FIRS is useful to implement symmetrical FIR filters. The FIRS instruction multiplies accumulator A(32-16) with a program memory value addressed by pmad (program memory address) and adds the result to the value in accumulator B. At the same time, it adds the memory operands Xmem and Ymem, shifts the result left 16 bits and loads this value into accumulator A. In the next iteration, pmad is incremented by 1. Once the repeat pipeline is started, the instruction becomes a single-cycle instruction.


FIR operation



LMS Xmem, Ymem

(A) + (Xmem) << 16 + 2^15 --> A (B) + (Xmem) x (Ymem) --> B

LMS *AR3+. *AR4+

Description

The LMS instruction is used to execute the least mean square algorithm (LMS). The dual data-memory operand

Xmem is shifted left 16 bits and added to accumulator A. The result is rounded by adding 2^15 to the high part of the

accumulator (bits 31-16). The final result is stored in accumulator A. In parallel, Xmem and Ymem are multiplied and

the result is added to accumulator B. Xmem does not overwrite TREG; therefore, TREG always contains the error

value used to update coefficients.


LMS operation

Conclusion on DSPDigital signal Processor target DSP applicationsMany constraints linked to real time data oriented processing

Processing unitFixed point or floating point basedProcessing unit are implemented in DSP

I ALUI MACI Additional blocks

Memory architectureHarvard based topologySeveral variant depending on operand locationDirect or indirect addressing modes

Digital Signal Processing Digital Signal processor Conclusion on DSP 349 / 355

Conclusion on DSP

Control unitPipeline based architectureInstruction can be coded to be stationary in time domain, or in data domainHardware loop to optimize resource management

Topology depends on target DSPVarious architectures and unitsWorkflow must be adapted to targeted DSP

Digital Signal Processing Digital Signal processor Conclusion on DSP 350 / 355

References

Digital Signal Processing References 351 / 355

References[1] J. Proakis and D. Manolakis, Digital Signal Processing : Principles, Algorithms

and Applications. Prentice Hall, 1996.[2] S. Haykin, Adaptive filter theory. Prentice Hall, 1996, ISBN: 9780133227604.[3] Electronic Design, “What’s the Difference Between Passive and Active Noise

Cancellation?,” 2018, http://www.electronicdesign.com/industrial-automation/what-s-difference-between-passive-and-active-noise-cancellation.

[4] Analog Devices, “Samples and hold amplifiers,” MT-090 Tutorial, 2008.[5] B. Baker, “How Sigma-Delta ADCs work, Part I,” Texas Instrument Incorporated,

Analog application journal, 2011.[6] R. del Río, F. Medeiro, and B. Perez-Verdu, CMOS cascade sigma-delta

modulators for sensors and telecom: error analysis and practical design. Springer,2006.

[7] B. Pisani, “Digital Filter Types in Delta-Sigma ADCs,” Texas Instrument,application report, 2017.

[8] B. D. Sahoo and B. Razavi, “A fast simulator for pipelined A/D converters,” inProc. 52nd IEEE International Midwest Symposium on Circuits and Systems,IEEE, 2009, pp. 402–406.


References[9] Texas instrument, “Choose the right A/D converter for your application,”, 2015.[10] W Jenkins and M Nayeri, “Adaptive filters realized with second order sections,” in

Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing,IEEE, vol. 11, 1986, pp. 2103–2106.

[11] S. Butterworth, “On the theory of filter amplifiers,” Wireless Engineer, vol. 7,no. 6, pp. 536–541, 1930.

[12] M. E. Van Valkenburg, Analog filter design. Holt, Rinehart, and Winston, 1982.[13] A. S. Sedra and P. O. Brackett, Filter theory and design: active and passive.

Matrix Pub, 1978.[14] P. L. Chebyshev, “Théorie des mécanismes connus sous le nom de

parallélogrammes",” Mémoires des Savants étrangers présentés à l’Académie deSaint-Pétersbourg., vol. 7, pp. 539–586, 1854.

[15] M. D. Lutovac, D. V. Tošić, and B. L. Evans, Filter design for signal processingusing MATLAB and Mathematica. Miroslav Lutovac, 2001.

[16] W. Thomson, “Delay networks having maximally flat frequency characteristics,”Proceedings of the IEE-Part III: Radio and Communication Engineering, vol. 96,no. 44, pp. 487–490, 1949.


References[17] D. N. Arnold and J. Rogness, “Möbius transformations revealed,”, 2008.[18] A. Oppenheim and R. Schafer, Discrete Time Signal Processing, 2nd ed.,

ser. Prentice All Signal Processing series. Upper Saddle River, New Jersey:Prentice All, 1999, ISBN: 0-13-754920-2.

[19] F. J. Harris, “On the use of windows for harmonic analysis with the discrete fouriertransform,” Proceedings of the IEEE, vol. 66, no. 1, pp. 51–83, 1978, ISSN:0018-9219. DOI: 10.1109/PROC.1978.10837.

[20] Z. S. Bojkovic, B. M. Bakmaz, and M. R. Bakmaz, “Hamming window to thedigital world,” Proceedings of the IEEE, vol. 105, no. 6, pp. 1185–1190, 2017, ISSN:0018-9219. DOI: 10.1109/JPROC.2017.2697118.

[21] R. B. Blackman and J. W. Tukey, “The measurement of power spectra from thepoint of view of communications engineering—part i,” Bell System TechnicalJournal, vol. 37, no. 1, pp. 185–282, 1958.

[22] C. E. Shannon, “A mathematical theory of communication,” The Bell SystemTechnical Journal, vol. 27, no. 3, pp. 379–423, 1948, ISSN: 0005-8580. DOI:10.1002/j.1538-7305.1948.tb01338.x.


https://doi.org/10.1109/PROC.1978.10837

https://doi.org/10.1109/JPROC.2017.2697118

https://doi.org/10.1002/j.1538-7305.1948.tb01338.x

References

[23] M. Kunt, M. Bellanger, F. D. Coulon, and C Guegen, Techniques modernes detraitement numérique des signaux, 1st ed., ser. Traitement de l’information.Presses polytechniques et universitaires romandes et CNET-ENST, 1991, vol. 1,ISBN: 2-88074-207-2.

[24] “IEEE Standard for Floating-Point Arithmetic,” IEEE Std 754-2008, pp. 1–70,2008. DOI: 10.1109/IEEESTD.2008.4610935.

[25] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Prentice-Hall,1975.

[26] J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation ofcomplex fourier series,” Mathematics of computation, vol. 19, no. 90,pp. 297–301, 1965.

[27] E. Lee, “Programmable DSP Architectures: Part I,” IEEE ASSP Magazine,pp. 4–14, 1988.

[28] G. Araujo, “Code Generation Algorithms for Digital Signal Processors,” Ph.D.dissertation, Princeton University Department of EE, Princeton, 1997.


https://doi.org/10.1109/IEEESTD.2008.4610935

robingerzaguet - univ-rennes1.fr

Documents