Chapter 1, Slide 1

Chapter 1

Introduction

DSP Lecture 01

Chapter 1, Slide 2

Learning Objectives

• Why process signals digitally?
• Definition of a real-time application.
• Why use Digital Signal Processing processors?
• What are the typical DSP algorithms?
• Parameters to consider when choosing a DSP processor.
• Programmable vs ASIC DSP.
• Texas Instruments’ TMS320 family.

Chapter 1, Slide 3

Present Day Applications

DSP: Technology Enabler

• Consumer Audio: Stereo A/D, D/A; PLL; Mixers
• Multimedia: Stereo audio; Imaging; Graphics palette; Voltage regulation
• Wireless / Cellular: Voice-band audio; RF codecs; Voltage regulation
• HDD: PRML read channel; MR pre-amp; Servo control; SCSI transceivers
• Automotive: Digital radio A/D/A; Active suspension; Voltage regulation
• DTAD: Speech synthesizer; Mixed-signal processor

Chapter 1, Slide 4

Why go digital?

• Digital signal processing techniques are now so powerful that sometimes it is extremely difficult, if not impossible, for analogue signal processing to achieve similar performance.

• Examples:
  – FIR filter with linear phase.
  – Adaptive filters.

Chapter 1, Slide 5

Why go digital?

• Analogue signal processing is achieved by using analogue components such as:

  – Resistors.
  – Capacitors.
  – Inductors.

• The inherent tolerances of these components, together with temperature changes, voltage changes and mechanical vibrations, can dramatically affect the effectiveness of the analogue circuitry.

Chapter 1, Slide 6

Why go digital?

• With DSP it is easy to:
  – Change applications.
  – Correct applications.
  – Update applications.

• Additionally DSP reduces:
  – Noise susceptibility.
  – Chip count.
  – Development time.
  – Cost.
  – Power consumption.

Chapter 1, Slide 7

Why NOT go digital?

• High-frequency signals may not be suitable for digital processing, for two reasons:

  – Analogue-to-Digital Converters (ADCs) cannot work fast enough.

  – The application can be too complex to be performed in real-time.

Chapter 1, Slide 8

Real-time processing

• DSP processors have to perform tasks in real-time, so how do we define real-time?

• The definition of real-time depends on the application.

• Example: a 100-tap FIR filter is performed in real-time if the DSP can perform and complete the following operation between two samples:

$y(n) = \sum_{k=0}^{99} a(k)\, x(n-k)$
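The 100-tap example can be sketched in C as follows. This is a minimal illustration rather than code from the lecture: the names `fir_sample`, `fir_push` and `NUM_TAPS` are hypothetical, and the delay line is a plain shifted array rather than the circular buffer a DSP would normally use.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TAPS 100  /* the 100-tap filter used as the example */

/* Hypothetical helper: one output sample y(n) = sum over k of a(k) * x(n-k).
 * history[0] holds x(n), history[1] holds x(n-1), and so on. */
float fir_sample(const float a[NUM_TAPS], const float history[NUM_TAPS])
{
    assert(a != NULL && history != NULL);
    float y = 0.0f;
    for (size_t k = 0; k < NUM_TAPS; k++) {
        y += a[k] * history[k];   /* one multiply-accumulate per tap */
    }
    return y;
}

/* Hypothetical helper: shift the delay line by one place and store the
 * newest input sample at index 0. */
void fir_push(float history[NUM_TAPS], float x_new)
{
    assert(history != NULL);
    for (size_t k = NUM_TAPS - 1; k > 0; k--) {
        history[k] = history[k - 1];
    }
    history[0] = x_new;
}
```

For each new input sample, a caller would invoke `fir_push` and then `fir_sample`; both calls have to finish before the next sample arrives for the filter to run in real-time.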

Chapter 1, Slide 9

Real-time processing

• We can say that we have a real-time application if:
  – Waiting Time ≥ 0

[Figure: timeline of one sample period, from sample n to sample n+1, divided into Processing Time followed by Waiting Time.]
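The waiting-time condition can be checked with a few lines of arithmetic. The sketch below assumes the processing cost is known as a cycle count per sample; the function names and the example figures in the usage note (48 kHz sampling, cycle counts) are illustrative assumptions, not values from the lecture.

```c
#include <assert.h>
#include <stdbool.h>

/* Waiting time in seconds: sample period minus processing time.
 * A negative result means the processing overruns the sample period. */
double waiting_time_s(double sample_rate_hz, double cycles_per_sample,
                      double clock_hz)
{
    assert(sample_rate_hz > 0.0 && clock_hz > 0.0);
    double sample_period   = 1.0 / sample_rate_hz;
    double processing_time = cycles_per_sample / clock_hz;
    return sample_period - processing_time;
}

/* Real-time condition: Waiting Time >= 0. */
bool is_real_time(double sample_rate_hz, double cycles_per_sample,
                  double clock_hz)
{
    return waiting_time_s(sample_rate_hz, cycles_per_sample, clock_hz) >= 0.0;
}
```

With a 150 MHz clock and an assumed 48 kHz sample rate, a routine needing 100 cycles per sample leaves plenty of waiting time, whereas one needing 4000 cycles per sample takes longer than the sample period and is no longer real-time.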

Chapter 1, Slide 10

Why do we need DSP processors?

• Why not use a General Purpose Processor (GPP) such as a Pentium instead of a DSP processor?

  – What is the power consumption of a Pentium and a DSP processor?

  – What is the cost of a Pentium and a DSP processor?

Chapter 1, Slide 11

Why do we need DSP processors?

• Use a DSP processor when the following are required:
  – Cost saving.
  – Smaller size.
  – Low power consumption.
  – Processing of many “high” frequency signals in real-time.

• Use a GPP when the following are required:
  – Large memory.
  – Advanced operating systems.

Chapter 1, Slide 12

What are the typical DSP algorithms?

• The Sum of Products (SOP) is the key element in most DSP algorithms:

Algorithm                          Equation

Finite Impulse Response Filter     $y(n) = \sum_{k=0}^{M} a_k\, x(n-k)$

Infinite Impulse Response Filter   $y(n) = \sum_{k=0}^{M} a_k\, x(n-k) + \sum_{k=1}^{N} b_k\, y(n-k)$

Convolution                        $y(n) = \sum_{k=0}^{N} x(k)\, h(n-k)$

Discrete Fourier Transform         $X(k) = \sum_{n=0}^{N-1} x(n)\, \exp[-j(2\pi/N)nk]$

Discrete Cosine Transform          $F(u) = \sum_{x=0}^{N-1} c(u)\, f(x)\, \cos\left[\frac{\pi}{2N}\, u(2x+1)\right]$
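To make the sum-of-products structure concrete, here is a minimal C sketch of the IIR filter equation, written directly from its two sums (feed-forward coefficients a_k and feedback coefficients b_k). The function name `iir_filter` and its argument layout are assumptions made for illustration; samples before n = 0 are taken to be zero.

```c
#include <assert.h>
#include <stddef.h>

/* Direct evaluation of
 *   y(n) = sum_{k=0..M} a[k] * x(n-k) + sum_{k=1..N} b[k] * y(n-k)
 * a has M+1 entries; b has N+1 entries and b[0] is unused.
 * Inputs and outputs before n = 0 are treated as zero. */
void iir_filter(const float *a, size_t M, const float *b, size_t N,
                const float *x, float *y, size_t len)
{
    assert(a != NULL && b != NULL && x != NULL && y != NULL);
    for (size_t n = 0; n < len; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k <= M && k <= n; k++) {
            acc += a[k] * x[n - k];      /* feed-forward sum of products */
        }
        for (size_t k = 1; k <= N && k <= n; k++) {
            acc += b[k] * y[n - k];      /* feedback sum of products */
        }
        y[n] = acc;
    }
}
```

Setting N to 0 removes the feedback sum and leaves the FIR equation, which shows that both filters reduce to the same multiply-accumulate inner loop.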

Chapter 1, Slide 13

What Problem Are We Trying To Solve?

Digital sampling of an analog signal:

[Figure: an analog waveform, amplitude A against time t, sampled at discrete instants.]

Most DSP algorithms can be expressed with MAC:

$Y = \sum_{i=1}^{count} a_i \cdot x_i$

    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }

[Figure: signal chain x → ADC → DSP → DAC → Y.]

What does it take to do this fast … and easy?

Chapter 1, Slide 14

Fast MAC using only C

Multiply-Accumulate (MAC) in Natural C Code

    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }

• Fastest Execution of MACs
  – The ‘C6x roadmap ... from 200 to 2400 MMACs

• Ease of C Programming
  – Even using natural C, the ‘C6000 Architecture can perform 2 to 4 MACs per cycle
  – Compiler generates 80-100% efficient code

How does the ‘C6000 achieve such performance from C?

Chapter 1, Slide 16

'C6000 Architecture: Built for Speed

[Figure: ‘C6000 CPU block diagram. Register file A (A0 ... A15 ... A31) serves functional units .M1, .L1, .D1 and .S1; register file B (B0 ... B15 ... B31) serves .M2, .L2, .D2 and .S2; a Controller/Decoder and Memory are shared by both sides.]

‘C6000 Compiler excels at Natural C

• While dual-MAC speeds math intensive algorithms, flexibility of 8 independent functional units allows the compiler to quickly perform other types of processing

• All ‘C6000 instructions are conditional allowing efficient hardware pipelining

• Instruction set and CPU hardware orthogonality allow the compiler to achieve 80-100% efficiency

Chapter 1, Slide 17

Fastest MAC using Natural C

    float mac(float *m, float *n, int count)
    {
        int i;
        float sum = 0;

        for (i = 0; i < count; i++) {
            sum += m[i] * n[i];
        }
        …

    ;** --------------------------------------------------*
    LOOP:                ; PIPED LOOP KERNEL
            LDDW    .D1     A4++,A7:A6
    ||      LDDW    .D2     B4++,B7:B6
    ||      MPYSP   .M1X    A6,B6,A5
    ||      MPYSP   .M2X    A7,B7,B5
    ||      ADDSP   .L1     A5,A8,A8
    ||      ADDSP   .L2     B5,B8,B8
    || [A1] B       .S2     LOOP
    || [A1] SUB     .S1     A1,1,A1
    ;** --------------------------------------------------*

[Figure: ‘C6000 CPU block diagram, with register files A and B, functional units .M1, .L1, .D1, .S1, .M2, .L2, .D2, .S2, Controller/Decoder and Memory.]
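The loop kernel keeps two running sums, one in A8 and one in B8, and loads two floats per array on each pass. A rough C equivalent of that structure is sketched below; it illustrates the idea only and is not the compiler's output, and the name `mac_dual` is hypothetical. Because floating-point addition is not associative, splitting the sum can change the rounding of the result slightly compared with the single-accumulator loop.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply-accumulate with two independent accumulators, so that two
 * products can be summed per loop iteration (one per data path). */
float mac_dual(const float *m, const float *n, int count)
{
    assert(m != NULL && n != NULL);
    float sum_a = 0.0f;   /* plays the role of accumulator A8 */
    float sum_b = 0.0f;   /* plays the role of accumulator B8 */
    int i;

    for (i = 0; i + 1 < count; i += 2) {
        sum_a += m[i]     * n[i];
        sum_b += m[i + 1] * n[i + 1];
    }
    if (i < count) {
        sum_a += m[i] * n[i];   /* leftover element when count is odd */
    }
    return sum_a + sum_b;       /* combine the partial sums at the end */
}
```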

Chapter 1, Slide 18

'C6000 System Block Diagram

[Figure: ‘C6000 system block diagram. The CPU (Register Set A with .D1, .M1, .L1, .S1; Register Set B with .D2, .M2, .L2, .S2) connects through the internal buses to internal memory, external memory and the peripherals.]

Looking at the internal buses ...

Chapter 1, Slide 19

‘C6000 Internal Buses

[Figure: ‘C6000 internal buses linking the PC, the A and B register files and the DMA to internal memory, external memory and the peripherals.]

• Program Addr (from PC): x32
• Program Data: x256
• Data Addr - T1: x32
• Data Data - T1: x32/64
• Data Addr - T2: x32
• Data Data - T2: x32/64
• DMA Addr - Read
• DMA Data - Read
• DMA Addr - Write
• DMA Data - Write

Chapter 1, Slide 20



[Figure: ‘C6000 CPU, with Register Set A (.D1, .M1, .L1, .S1) and Register Set B (.D2, .M2, .L2, .S2).]

Next, the internal memory ...


Chapter 1, Slide 21

‘C6711 Memory

Memory map:

Address       Region
0000_0000     64KB Internal
0180_0000     On-chip Peripherals
8000_0000     128MB External0
9000_0000     128MB External1
A000_0000     128MB External2
B000_0000     128MB External3
FFFF_FFFF     (end of address space)

[Figure: the CPU connects to a 4K Program Cache and a 4K Data Cache, both backed by 64K of Prog / Data memory (Level 2).]
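As a small illustration of how the external part of this memory map could be decoded in software, the sketch below maps an address to one of the four 128MB external spaces using only the base addresses and size listed on the slide. The function name `external_space_index` is hypothetical, and treating the gaps between spaces as unmapped is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 0..3 if addr falls inside External0..External3, otherwise -1.
 * Bases are 8000_0000, 9000_0000, A000_0000 and B000_0000; each space is
 * 128MB long. Addresses in the gaps between spaces return -1 (assumption). */
int external_space_index(uint32_t addr)
{
    const uint32_t base0  = 0x80000000u;           /* start of External0 */
    const uint32_t stride = 0x10000000u;           /* distance between bases */
    const uint32_t size   = 128u * 1024u * 1024u;  /* 128MB per space */
    assert(size <= stride);

    if (addr < base0) {
        return -1;                                 /* internal or peripheral range */
    }
    uint32_t offset = addr - base0;
    uint32_t index  = offset / stride;
    if (index > 3u || (offset % stride) >= size) {
        return -1;
    }
    return (int)index;
}
```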

Chapter 1, Slide 24



[Figure: ‘C6000 CPU, with Register Set A (.D1, .M1, .L1, .S1) and Register Set B (.D2, .M2, .L2, .S2), connected to the peripherals.]

Looking at each peripheral ...


Chapter 1, Slide 26

Hardware vs. Microcode multiplication

• DSP processors are optimised to perform multiplication and addition operations.

• Multiplication and addition are done in hardware and in one cycle.

• Example: 4-bit multiply (unsigned).

Hardware (one cycle):

      1011
    x 1110
  --------
  10011010

Microcode (one shifted partial product per cycle):

      1011
    x 1110
  --------
      0000       Cycle 1
     1011.       Cycle 2
    1011..       Cycle 3
   1011...       Cycle 4
  --------
  10011010       Cycle 5
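The microcoded approach amounts to a shift-and-add loop. The C sketch below models it for 4-bit unsigned operands; the function name `shift_add_mul4` and the optional step counter are assumptions for illustration. It counts the four partial-product steps only, whereas the slide counts a fifth cycle for producing the final result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift-and-add multiply of two 4-bit unsigned values.
 * One shifted partial product is considered per step, as in the
 * microcoded version; *steps (if not NULL) receives the step count. */
uint8_t shift_add_mul4(uint8_t multiplicand, uint8_t multiplier,
                       unsigned *steps)
{
    assert(multiplicand < 16 && multiplier < 16);
    uint8_t product = 0;
    unsigned used = 0;

    for (unsigned bit = 0; bit < 4; bit++) {
        if (multiplier & (1u << bit)) {
            /* add the multiplicand shifted to this bit position */
            product = (uint8_t)(product + (multiplicand << bit));
        }
        used++;
    }
    if (steps != NULL) {
        *steps = used;
    }
    return product;
}
```

Calling `shift_add_mul4(0x0B, 0x0E, &steps)` reproduces the slide's example, 1011 x 1110 = 10011010. A hardware multiplier delivers the same product in a single cycle.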

Chapter 1, Slide 27

Parameters to consider when choosing a DSP processor

Parameter                          TMS320C6211 (@150MHz)                  TMS320C6711 (@150MHz)
Arithmetic format                  32-bit                                 32-bit
Extended floating point            N/A                                    64-bit
Extended arithmetic                40-bit                                 40-bit
Performance (peak)                 1200 MIPS                              1200 MFLOPS
Number of hardware multipliers     2 (16 x 16-bit) with 32-bit result     2 (32 x 32-bit) with 32 or 64-bit result
Number of registers                32                                     32
Internal L1 program memory cache   32 Kbit                                32 Kbit
Internal L1 data memory cache      32 Kbit                                32 Kbit
Internal L2 cache                  512 Kbit                               512 Kbit

C6711 Datasheet: \Links\TMS320C6711.pdf
C6211 Datasheet: \Links\TMS320C6211.pdf
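The 1200 MIPS peak figure for the C6211 is consistent with the eight functional units shown earlier, under the assumption that every unit issues one instruction on every cycle at 150 MHz. This is a peak-rate estimate, not a sustained figure:

```latex
\text{Peak rate} = 8\ \tfrac{\text{instructions}}{\text{cycle}} \times 150 \times 10^{6}\ \tfrac{\text{cycles}}{\text{s}}
                 = 1200 \times 10^{6}\ \tfrac{\text{instructions}}{\text{s}}
                 = 1200\ \text{MIPS}
```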

Chapter 1, Slide 28

Parameters to consider when choosing a DSP processor

Parameter                                     TMS320C6211 (@150MHz)   TMS320C6711 (@150MHz)
I/O bandwidth: Serial Ports (number/speed)    2 x 75Mbps              2 x 75Mbps
DMA channels                                  16                      16
Multiprocessor support                        Not inherent            Not inherent
Supply voltage                                3.3V I/O, 1.8V Core     3.3V I/O, 1.8V Core
Power management                              Yes                     Yes
On-chip timers (number/width)                 2 x 32-bit              2 x 32-bit
Cost                                          US$ 21.54               US$ 21.54
Package                                       256 Pin BGA             256 Pin BGA
External memory interface controller          Yes                     Yes
JTAG                                          Yes                     Yes

Chapter 1, Slide 29

Floating vs. Fixed point processors

• Applications which require:
  – High precision.
  – Wide dynamic range.
  – High signal-to-noise ratio.
  – Ease of use.
  need a floating point processor.

• Drawbacks of floating point processors:
  – Higher power consumption.
  – Can be more expensive.
  – Can be slower than fixed-point counterparts and larger in size.

Chapter 1, Slide 30

Floating vs. Fixed point processors

• It is the application that dictates which device and platform to use in order to achieve optimum performance at a low cost.

• For educational purposes, use the floating-point device (C6711) as it can support both fixed and floating point operations.
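To show what fixed-point arithmetic involves compared with simply using `float`, here is a minimal sketch of the common Q15 format (16-bit values representing the range -1 to just under +1). The type and function names are assumptions for illustration, and the format itself is not specified in the lecture.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t q15_t;   /* Q15: 1 sign bit, 15 fractional bits */

/* Convert a float in [-1, 1) to Q15 (fractional part below 2^-15 is dropped). */
q15_t float_to_q15(float v)
{
    assert(v >= -1.0f && v < 1.0f);
    return (q15_t)(v * 32768.0f);
}

/* Convert a Q15 value back to float. */
float q15_to_float(q15_t v)
{
    return (float)v / 32768.0f;
}

/* Multiply two Q15 values. The 32-bit product is in Q30, so 15 fractional
 * bits are dropped; the one overflow case (-1 * -1) saturates. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t wide   = (int32_t)a * (int32_t)b;   /* Q30 product */
    int32_t scaled = wide / 32768;              /* back to Q15, truncating */
    if (scaled > INT16_MAX) {
        scaled = INT16_MAX;                     /* saturate instead of wrapping */
    }
    return (q15_t)scaled;
}
```

The programmer has to track scaling, range and saturation by hand, which is part of what "ease of use" means for floating point; in exchange, the fixed-point version needs only integer multiply hardware.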

Chapter 1, Slide 31

General Purpose DSP vs. DSP in ASIC

• Application Specific Integrated Circuits (ASICs) are semiconductors designed for dedicated functions.

• The advantages and disadvantages of using ASICs are listed below:

Advantages

• High throughput
• Lower silicon area
• Lower power consumption
• Improved reliability
• Reduction in system noise
• Low overall system cost

Disadvantages

• High investment cost
• Less flexibility
• Long time from design to market

Chapter 1, Slide 32

General-purpose DSP market in 2003

Chapter 1, Slide 33

System Considerations

• Performance
• Interfacing
• Power
• Size
• Ease-of-Use: Programming; Interfacing; Debugging
• Integration: Memory; Peripherals
• Cost: Device cost; System cost; Development cost; Time to market

Chapter 1, Slide 34

Texas Instruments’ TMS320 family

• Different families and sub-families exist to support different markets.

C2000: Lowest Cost, Control Systems
  – Motor Control
  – Storage
  – Digital Ctrl Systems

C5000: Efficiency, Best MIPS per Watt / Dollar / Size
  – Wireless phones
  – Internet audio players
  – Digital still cameras
  – Modems
  – Telephony
  – VoIP

C6000: Performance & Best Ease-of-Use, Multi Channel and Multi Function App's
  – Comm Infrastructure
  – Wireless Base-stations
  – DSL
  – Imaging
  – Multi-media Servers
  – Video

Chapter 1, Slide 35

Texas Instruments’ TMS320 family

TMS320C64x: The C64x fixed-point DSPs offer the industry's highest level of performance to address the demands of the digital age. At clock rates of up to 1 GHz, C64x DSPs can process information at rates up to 8000 MIPS with costs as low as $19.95. In addition to a high clock rate, C64x DSPs can do more work each cycle with built-in extensions. These extensions include new instructions to accelerate performance in key application areas such as digital communications infrastructure and video and image processing.

TMS320C62x: These first-generation fixed-point DSPs represent breakthrough technology that enables new equipment and energizes existing implementations for multi-channel, multi-function applications, such as wireless base stations, remote access servers (RAS), digital subscriber loop (xDSL) systems, personalized home security systems, advanced imaging/biometrics, industrial scanners, precision instrumentation and multi-channel telephony systems.

TMS320C67x: For designers of high-precision applications, C67x floating-point DSPs offer the speed, precision, power savings and dynamic range to meet a wide variety of design needs. These dynamic DSPs are the ideal solution for demanding applications like audio, medical imaging, instrumentation and automotive.

Chapter 1, Slide 36

C6000 Roadmap

[Figure: performance against time for the C6000 family, with object code software compatibility across devices. C62x/C64x/DM642: Fixed Point. C67x: Floating Point.]

• 1st Generation: C6201, C6701, C6202, C6203, C6204, C6205, C6211, C6711, C6712, C6713
• 2nd Generation: C6411, C6412, C6414, C6415, C6416, DM642
• Highest performance points labelled on the figure: Floating Point; Multi-core; C64x™ DSP 1.1 GHz

Chapter 1, Slide 37

’C6000 Floating-Point

[Figure: floating-point performance against time.]

• C30, C31, C32, C33: 150 MFLOPS
• C6701: 1 GFLOPS
• C6711: 900 MFLOPS
• C6712: 600 MFLOPS
• C67x: 3 GFLOPS and beyond

Chapter 1, Slide 38

TI Floating-Point Innovation

TI Floating Point - A History of Firsts:

• First commercially-successful floating-point DSP: ‘C30 (1987)
• First floating-point DSP with multiprocessing support: ‘C40 (1991)
• First $10 floating-point DSP: ‘C32 (1995)
• First 1-GFLOPS DSP: ‘C6701 (1998)
• First $5 floating-point DSP: ‘C33 (1999)
• First 2-level cache floating-point DSP: ‘C6711 (1999)
• First to offer 600 MFLOPS for under $10: ‘C6712 (2000)

Chapter 1, Slide 39

Useful Links

• Selection Guide:
  – \Links\DSP Selection Guide.pdf
  – \Links\DSP Selection Guide.pdf (3Q 2004)
  – \Links\DSP Selection Guide.pdf (4Q 2004)

Chapter 1, Slide 40

Looking for Literature on DSP?

• “Understanding Digital Signal Processing” by Richard G. Lyons; Prentice Hall; 2nd edition (March 15, 2004); ISBN 0-1310-8989-7

• “A Simple Approach to Digital Signal Processing” by Craig Marven and Gillian Ewers; ISBN 0-4711-5243-9

• “DSP Primer (Primer Series)” by C. Britton Rorabaugh; ISBN 0-0705-4004-7

• “DSP First: A Multimedia Approach” by James H. McClellan, Ronald W. Schafer, and Mark A. Yoder; ISBN 0-1324-3171-8

Chapter 1, Slide 41

Looking for Books on ‘C6000 DSP?

• “Digital Signal Processing Implementation using the TMS320C6000™ DSP Platform” by Naim Dahnoun; ISBN 0201-61916-4

• “C6x-Based Digital Signal Processing” by Nasser Kehtarnavaz and Burc Simsek; ISBN 0-13-088310-7

• “Real-Time Digital Signal Processing: Based on the TMS320C6000” by Nasser Kehtarnavaz; Newnes; Book & CD-Rom (July 14, 2004); ISBN 0-7506-7830-5

• “Digital Signal Processing and Applications with the C6713 and C6416 DSK (Topics in Digital Signal Processing)” by Rulph Chassaing; Wiley-Interscience; Book & CD-Rom (December 3, 2004); ISBN 0-4716-9007-4

Chapter 1, Slide 42

Looking for Books on ‘C6000 DSP?

• “Real-Time Digital Signal Processing from Matlab to C with the TMS320C6x DSK” by Thad B. Welch, Cameron Wright and Michael Morrow; Book & CD-Rom (2006); ISBN 0-8493-7382-4

Chapter 1, Slide 43

Chapter 1

Introduction

- End -
