Chapter 1, Slide 1
Chapter 1
Introduction
DSP Lecture 01
Chapter 1, Slide 2
Learning Objectives
• Why process signals digitally?
• Definition of a real-time application.
• Why use Digital Signal Processing processors?
• What are the typical DSP algorithms?
• Parameters to consider when choosing a DSP processor.
• Programmable vs ASIC DSP.
• Texas Instruments’ TMS320 family.
Chapter 1, Slide 3
Present Day Applications
[Figure: the mixed-signal processor as a technology enabler for DSP, surrounded by application areas.]
• Consumer Audio: Stereo A/D, D/A; PLL; Mixers
• Multimedia: Stereo audio; Imaging; Graphics palette; Voltage regulation
• Wireless / Cellular: Voice-band audio; RF codecs; Voltage regulation
• HDD: PRML read channel; MR pre-amp; Servo control; SCSI transceivers
• Automotive: Digital radio A/D/A; Active suspension; Voltage regulation
• DTAD: Speech synthesizer
• DSP: Technology Enabler
Chapter 1, Slide 4
Why go digital?
• Digital signal processing techniques are now so powerful that sometimes it is extremely difficult, if not impossible, for analogue signal processing to achieve similar performance.
• Examples:
– FIR filter with linear phase.
– Adaptive filters.
Chapter 1, Slide 5
Why go digital?
• Analogue signal processing is achieved by using analogue components such as:
– Resistors.
– Capacitors.
– Inductors.
• The inherent tolerances associated with these components, temperature, voltage changes and mechanical vibrations can dramatically affect the effectiveness of the analogue circuitry.
Chapter 1, Slide 6
Why go digital?
• With DSP it is easy to:
– Change applications.
– Correct applications.
– Update applications.
• Additionally DSP reduces:
– Noise susceptibility.
– Chip count.
– Development time.
– Cost.
– Power consumption.
Chapter 1, Slide 7
Why NOT go digital?
• High-frequency signals cannot be processed digitally, for two reasons:
– Analogue-to-digital converters (ADCs) cannot work fast enough.
– The application can be too complex to be performed in real-time.
Chapter 1, Slide 8
Real-time processing
• DSP processors have to perform tasks in real-time, so how do we define real-time?
• The definition of real-time depends on the application.
• Example: a 100-tap FIR filter is performed in real-time if the DSP can perform and complete the following operation between two samples:
$y(n) = \sum_{k=0}^{99} a(k)\, x(n-k)$
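The 100-tap operation above can be sketched in C as follows. This is only an illustration: the function names and the delay-line layout (`history[k]` holding x(n-k)) are assumptions made for the example, not something the slides specify.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TAPS 100  /* 100-tap filter, k = 0..99 as in the slide */

/*
 * Compute one output sample y(n) = sum_{k=0}^{99} a(k) * x(n-k).
 * Assumed (hypothetical) buffer layout: history[k] holds x(n-k),
 * so history[0] is the newest input sample.
 */
float fir_output(const float a[NUM_TAPS], const float history[NUM_TAPS])
{
    float y = 0.0f;
    for (size_t k = 0; k < NUM_TAPS; k++) {
        y += a[k] * history[k];   /* one multiply-accumulate per tap */
    }
    return y;
}

/* Shift the delay line by one and insert the newest sample at index 0. */
void fir_push_sample(float history[NUM_TAPS], float x_new)
{
    assert(history != NULL);
    for (size_t k = NUM_TAPS - 1; k > 0; k--) {
        history[k] = history[k - 1];
    }
    history[0] = x_new;
}
```

For each new input sample, `fir_push_sample` followed by `fir_output` must both finish before the next sample arrives; that is the real-time requirement for this filter. The linear shift of the delay line is chosen for clarity; a circular buffer would avoid the copy.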
Chapter 1, Slide 9
Real-time processing
• We can say that we have a real-time application if:
– Waiting Time ≥ 0
[Timing diagram: between samples n and n+1, Sample Time = Processing Time + Waiting Time.]
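The condition can be checked with simple arithmetic, sketched below in C. The function names are hypothetical, and the sample rate and cycle counts used in the usage note are assumed values for illustration only.

```c
#include <assert.h>

/*
 * Waiting time, in CPU cycles, left over in one sample period.
 * cycles_per_sample = clock_hz / sample_rate_hz (integer division, so the
 * budget is rounded down, which is the conservative direction).
 */
long waiting_cycles(long clock_hz, long sample_rate_hz, long processing_cycles)
{
    assert(clock_hz > 0 && sample_rate_hz > 0 && processing_cycles >= 0);
    long cycles_per_sample = clock_hz / sample_rate_hz;
    return cycles_per_sample - processing_cycles;
}

/* Real-time condition: Waiting Time >= 0. Returns 1 if met, 0 otherwise. */
int is_real_time(long clock_hz, long sample_rate_hz, long processing_cycles)
{
    return waiting_cycles(clock_hz, sample_rate_hz, processing_cycles) >= 0;
}
```

For example, a 150 MHz processor sampling at an assumed 48 kHz has 3125 cycles per sample; an algorithm needing 1000 cycles leaves 2125 cycles of waiting time, while one needing 4000 cycles is not real-time.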
Chapter 1, Slide 10
Why do we need DSP processors?
• Why not use a General Purpose Processor (GPP) such as a Pentium instead of a DSP processor?
– What is the power consumption of a Pentium and a DSP processor?
– What is the cost of a Pentium and a DSP processor?
Chapter 1, Slide 11
Why do we need DSP processors?
• Use a DSP processor when the following are required:
– Cost saving.
– Smaller size.
– Low power consumption.
– Processing of many “high” frequency signals in real-time.
• Use a GPP when the following are required:
– Large memory.
– Advanced operating systems.
Chapter 1, Slide 12
What are the typical DSP algorithms?
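Algorithm | Equation
Finite Impulse Response Filter | $y(n) = \sum_{k=0}^{M} a_k\, x(n-k)$
Infinite Impulse Response Filter | $y(n) = \sum_{k=0}^{M} a_k\, x(n-k) + \sum_{k=1}^{N} b_k\, y(n-k)$
Convolution | $y(n) = \sum_{k=0}^{N} x(k)\, h(n-k)$
Discrete Fourier Transform | $X(k) = \sum_{n=0}^{N-1} x(n) \exp[-j(2\pi/N)nk]$
Discrete Cosine Transform | $F(u) = \sum_{x=0}^{N-1} c(u)\, f(x) \cos\left[\frac{\pi}{2N} u(2x+1)\right]$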
• The Sum of Products (SOP) is the key element in most DSP algorithms:
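As an illustration of the sum-of-products structure, the IIR filter from the table can be sketched in C as two accumulation loops. The function name and buffer layout are assumptions for the example; the sign convention (feedback terms added) follows the equation as given in the table.

```c
#include <assert.h>
#include <stddef.h>

/*
 * One output sample of the IIR filter:
 *   y(n) = sum_{k=0}^{M} a[k]*x(n-k) + sum_{k=1}^{N} b[k]*y(n-k)
 * Assumed (hypothetical) layout: x_hist[k] holds x(n-k) for k = 0..M;
 * y_hist[k] holds y(n-k) for k = 1..N (y_hist[0] and b[0] are unused).
 */
float iir_output(const float *a, const float *x_hist, size_t M,
                 const float *b, const float *y_hist, size_t N)
{
    assert(a != NULL && x_hist != NULL);
    float y = 0.0f;
    for (size_t k = 0; k <= M; k++) {
        y += a[k] * x_hist[k];        /* feed-forward sum of products */
    }
    for (size_t k = 1; k <= N; k++) {
        y += b[k] * y_hist[k];        /* feedback sum of products */
    }
    return y;
}
```

With N = 0 the second loop does not execute and the function reduces to the FIR case, which shows how both filters rest on the same multiply-accumulate kernel.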
Chapter 1, Slide 13
What Problem Are We Trying To Solve?
Digital sampling of an analog signal:
[Figure: analog signal, amplitude A versus time t, sampled at discrete instants. Signal path: x → ADC → DSP → DAC → Y]
What does it take to do this fast … and easy?
Most DSP algorithms can be expressed with MAC:
$Y = \sum_{i=1}^{count} a_i \cdot x_i$
    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }
Chapter 1, Slide 14
Fast MAC using only C
• Fastest Execution of MACs
– The ‘C6x roadmap ... from 200 to 2400 MMACs
• Ease of C Programming
– Even using natural C, the ‘C6000 Architecture can perform 2 to 4 MACs per cycle
– Compiler generates 80-100% efficient code
Multiply-Accumulate (MAC) in Natural C Code
    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }
How does the ‘C6000 achieve such performance from C?
Chapter 1, Slide 16
'C6000 Architecture: Built for Speed
[Block diagram: register file A (A0 .. A15 .. A31) with functional units .M1, .L1, .D1, .S1; register file B (B0 .. B15 .. B31) with functional units .M2, .L2, .D2, .S2; Controller/Decoder; Memory.]
• The ‘C6000 compiler excels at natural C.
• While dual-MAC speeds math-intensive algorithms, the flexibility of 8 independent functional units allows the compiler to quickly perform other types of processing.
• All ‘C6000 instructions are conditional, allowing efficient hardware pipelining.
• Instruction set and CPU hardware orthogonality allow the compiler to achieve 80-100% efficiency.
Chapter 1, Slide 17
Fastest MAC using Natural C
float mac(float *m, float *n, int count)
{
    int i;
    float sum = 0;

    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }
    …

;** --------------------------------------------------*
LOOP:   ; PIPED LOOP KERNEL
           LDDW    .D1     A4++,A7:A6
||         LDDW    .D2     B4++,B7:B6
||         MPYSP   .M1X    A6,B6,A5
||         MPYSP   .M2X    A7,B7,B5
||         ADDSP   .L1     A5,A8,A8
||         ADDSP   .L2     B5,B8,B8
|| [A1]    B       .S2     LOOP
|| [A1]    SUB     .S1     A1,1,A1
;** --------------------------------------------------*

[Block diagram: register file A (A0 .. A15 .. A31) with .M1, .L1, .D1, .S1; register file B (B0 .. B15 .. B31) with .M2, .L2, .D2, .S2; Controller/Decoder; Memory.]
Chapter 1, Slide 18
'C6000 System Block Diagram
[Block diagram: CPU containing Register Set A (.D1, .M1, .L1, .S1) and Register Set B (.D2, .M2, .L2, .S2), connected through the Internal Buses to Internal Memory, External Memory and Peripherals.]
Looking at the internal buses ...
Chapter 1, Slide 19
‘C6000 Internal Buses
[Bus diagram: the PC, A regs, B regs and DMA connect to Internal Memory, External Memory and Peripherals over the following buses.]
• Program Addr (from PC): x32
• Program Data: x256
• Data Addr - T1: x32
• Data Data - T1: x32/64
• Data Addr - T2: x32
• Data Data - T2: x32/64
• DMA Addr - Read
• DMA Data - Read
• DMA Addr - Write
• DMA Data - Write
Chapter 1, Slide 20
'C6000 System Block Diagram
[Block diagram: CPU containing Register Set A (.D1, .M1, .L1, .S1) and Register Set B (.D2, .M2, .L2, .S2), connected through the Internal Buses to Internal Memory and External Memory.]
Next, the internal memory ...
Chapter 1, Slide 21
‘C6711 Memory
[Memory map, with cache logic and cache details:]
0000_0000 | 64KB Internal (Prog / Data, Level 2)
0180_0000 | On-chip Peripherals
8000_0000 | 128MB External 0
9000_0000 | 128MB External 1
A000_0000 | 128MB External 2
B000_0000 | 128MB External 3
FFFF_FFFF | (top of address space)
[Cache hierarchy: CPU → 4K Program Cache and 4K Data Cache → 64K Prog / Data (Level 2).]
Chapter 1, Slide 24
'C6000 System Block Diagram
[Block diagram: CPU containing Register Set A (.D1, .M1, .L1, .S1) and Register Set B (.D2, .M2, .L2, .S2), connected through the Internal Buses to Internal Memory, External Memory and Peripherals.]
Looking at each peripheral ...
Chapter 1, Slide 26
Hardware vs. Microcode multiplication
• DSP processors are optimised to perform multiplication and addition operations.
• Multiplication and addition are done in hardware and in one cycle.
• Example: 4-bit multiply (unsigned).
Hardware (one cycle):
      1011
    x 1110
  --------
  10011010

Microcode (one partial product per cycle, then the sum):
      1011
    x 1110
  --------
      0000      Cycle 1
     1011.      Cycle 2
    1011..      Cycle 3
   1011...      Cycle 4
  --------
  10011010      Cycle 5
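The microcode-style multiplication can be sketched in C as a shift-and-add loop. The function name is hypothetical, and the loop only mimics the step-by-step procedure; it says nothing about actual cycle counts on any particular processor. Unlike the slide, which sums the partial products in a final step, this sketch accumulates them as it goes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Unsigned 4-bit x 4-bit multiply done step by step: examine one
 * multiplier bit per iteration and add the shifted multiplicand
 * (the partial product) whenever that bit is 1.
 */
uint8_t shift_add_mul4(uint8_t multiplicand, uint8_t multiplier)
{
    assert(multiplicand < 16 && multiplier < 16);
    uint8_t product = 0;
    for (int bit = 0; bit < 4; bit++) {
        if ((multiplier >> bit) & 1u) {
            /* Largest possible result is 15 * 15 = 225, which fits in 8 bits. */
            product = (uint8_t)(product + (multiplicand << bit));
        }
    }
    return product;
}
```

Calling `shift_add_mul4(0xB, 0xE)` reproduces the slide's example, 1011 x 1110 = 10011010 (11 x 14 = 154). A hardware multiplier produces the same result without the loop, which is the point of the comparison.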
Chapter 1, Slide 27
Parameters to consider when choosing a DSP processor
Parameter | TMS320C6211 (@150MHz) | TMS320C6711 (@150MHz)
Arithmetic format | 32-bit | 32-bit
Extended floating point | N/A | 64-bit
Extended Arithmetic | 40-bit | 40-bit
Performance (peak) | 1200 MIPS | 1200 MFLOPS
Number of hardware multipliers | 2 (16 x 16-bit) with 32-bit result | 2 (32 x 32-bit) with 32 or 64-bit result
Number of registers | 32 | 32
Internal L1 program memory cache | 32K | 32K
Internal L1 data memory cache | 32K | 32K
Internal L2 cache | 512K | 512K
C6711 Datasheet: \Links\TMS320C6711.pdf
C6211 Datasheet: \Links\TMS320C6211.pdf
Chapter 1, Slide 28
Parameters to consider when choosing a DSP processor
Parameter | TMS320C6211 (@150MHz) | TMS320C6711 (@150MHz)
I/O bandwidth: Serial Ports (number/speed) | 2 x 75Mbps | 2 x 75Mbps
DMA channels | 16 | 16
Multiprocessor support | Not inherent | Not inherent
Supply voltage | 3.3V I/O, 1.8V Core | 3.3V I/O, 1.8V Core
Power management | Yes | Yes
On-chip timers (number/width) | 2 x 32-bit | 2 x 32-bit
Cost | US$ 21.54 | US$ 21.54
Package | 256 Pin BGA | 256 Pin BGA
External memory interface controller | Yes | Yes
JTAG | Yes | Yes
Chapter 1, Slide 29
Floating vs. Fixed point processors
• Applications which require the following need a floating-point processor:
– High precision.
– Wide dynamic range.
– High signal-to-noise ratio.
– Ease of use.
• Drawbacks of floating-point processors:
– Higher power consumption.
– Can be more expensive.
– Can be slower than fixed-point counterparts and larger in size.
Chapter 1, Slide 30
Floating vs. Fixed point processors
• It is the application that dictates which device and platform to use in order to achieve optimum performance at a low cost.
• For educational purposes, use the floating-point device (C6711) as it can support both fixed and floating point operations.
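To make the "ease of use" difference concrete, the sketch below contrasts a fixed-point multiply with its floating-point equivalent. The Q15 format (a 16-bit integer v representing v / 32768) is a common fixed-point convention chosen here as an assumption; the slides do not prescribe a format, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Q15: a 16-bit signed integer v represents the value v / 32768. */
typedef int16_t q15_t;

/*
 * Multiply two Q15 numbers: 16 x 16 -> 32-bit product, then rescale.
 * Note: right-shifting a negative value is implementation-defined in C;
 * this sketch assumes the usual arithmetic shift.
 */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;   /* Q30 result */
    product += 1 << 14;                          /* round to nearest */
    product >>= 15;                              /* back to Q15 */
    if (product > INT16_MAX) {                   /* -1.0 * -1.0 would overflow */
        product = INT16_MAX;                     /* saturate instead of wrapping */
    }
    assert(product >= INT16_MIN);
    return (q15_t)product;
}

/* The floating-point equivalent needs no scaling, rounding or saturation. */
float float_mul(float a, float b)
{
    return a * b;
}
```

In Q15, 0.5 is 16384 and 0.25 is 8192, so `q15_mul(16384, 16384)` yields 8192, matching `float_mul(0.5f, 0.5f)`. The programmer has to track the scaling and guard against overflow in the fixed-point version, which is the burden a floating-point device removes.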
Chapter 1, Slide 31
General Purpose DSP vs. DSP in ASIC
• Application Specific Integrated Circuits (ASICs) are semiconductors designed for dedicated functions.
• The advantages and disadvantages of using ASICs are listed below:
Advantages
• High throughput
• Lower silicon area
• Lower power consumption
• Improved reliability
• Reduction in system noise
• Low overall system cost
Disadvantages
• High investment cost
• Less flexibility
• Long time from design to market
Chapter 1, Slide 32
General-purpose DSP market in 2003
Chapter 1, Slide 33
System Considerations
Performance
Interfacing
Power
Size
Ease of Use
• Programming
• Interfacing
• Debugging
Integration
• Memory
• Peripherals
Cost
• Device cost
• System cost
• Development cost
• Time to market
Chapter 1, Slide 34
Texas Instruments’ TMS320 family
• Different families and sub-families exist to support different markets.
C2000: Lowest Cost
Control Systems
• Motor Control
• Storage
• Digital Ctrl Systems
C5000: Efficiency
Best MIPS per Watt / Dollar / Size
• Wireless phones
• Internet audio players
• Digital still cameras
• Modems
• Telephony
• VoIP
C6000: Performance & Best Ease-of-Use
Multi Channel and Multi Function App's
• Comm Infrastructure
• Wireless Base-stations
• DSL
• Imaging
• Multi-media Servers
• Video
Chapter 1, Slide 35
Texas Instruments’ TMS320 family
TMS320C64x: The C64x fixed-point DSPs offer the industry's highest level of performance to address the demands of the digital age. At clock rates of up to 1 GHz, C64x DSPs can process information at rates up to 8000 MIPS with costs as low as $19.95. In addition to a high clock rate, C64x DSPs can do more work each cycle with built-in extensions. These extensions include new instructions to accelerate performance in key application areas such as digital communications infrastructure and video and image processing.
TMS320C62x: These first-generation fixed-point DSPs represent breakthrough technology that enables new equipment and energizes existing implementations for multi-channel, multi-function applications, such as wireless base stations, remote access servers (RAS), digital subscriber loop (xDSL) systems, personalized home security systems, advanced imaging/biometrics, industrial scanners, precision instrumentation and multi-channel telephony systems.
TMS320C67x: For designers of high-precision applications, C67x floating-point DSPs offer the speed, precision, power savings and dynamic range to meet a wide variety of design needs. These dynamic DSPs are the ideal solution for demanding applications like audio, medical imaging, instrumentation and automotive.
Chapter 1, Slide 36
C6000 Roadmap
[Figure: performance versus time. Legend: C62x/C64x/DM642: Fixed Point; C67x: Floating Point. Banner: Object Code Software Compatibility. Generation labels: 1st Generation, 2nd Generation. Devices shown: C6201, C6701, C6202, C6203, C6211, C6711, C6204, C6205, C6712, C6713, C6411, C6412, DM642, C6414, C6415, C6416. Highest performance: Multi-core and C64x™ DSP 1.1 GHz.]
Chapter 1, Slide 37
’C6000 Floating-Point
[Figure: performance versus time. Earlier devices C30, C31, C32 and C33 (labelled 150 MFLOPS); C6712: 600 MFLOPS; C6711: 900 MFLOPS; C6701: 1 GFLOPS; C67x: 3 GFLOPS and beyond.]
Chapter 1, Slide 38
TI Floating Point - A History of Firsts:
First commercially-successful floating-point DSP ‘C30 (1987)
First floating-point DSP with multiprocessing support ‘C40 (1991)
First $10 floating-point DSP ‘C32 (1995)
First 1-GFLOPS DSP ‘C6701 (1998)
First $5 floating-point DSP ‘C33 (1999)
First 2-level cache floating-point DSP ‘C6711 (1999)
First to offer 600 MFLOPS for under $10 ‘C6712 (2000)
TI Floating-Point Innovation
Chapter 1, Slide 39
Useful Links
• Selection Guide:
– \Links\DSP Selection Guide.pdf
– \Links\DSP Selection Guide.pdf (3Q 2004)
– \Links\DSP Selection Guide.pdf (4Q 2004)
Chapter 1, Slide 40
Looking for Literature on DSP?
“Understanding Digital Signal Processing” by Richard G. Lyons; Prentice Hall; 2nd edition (March 15, 2004); ISBN 0-1310-8989-7
“A Simple Approach to Digital Signal Processing” by Craig Marven and Gillian Ewers; ISBN 0-4711-5243-9
“DSP Primer (Primer Series)” by C. Britton Rorabaugh; ISBN 0-0705-4004-7
“DSP First: A Multimedia Approach” by James H. McClellan, Ronald W. Schafer, and Mark A. Yoder; ISBN 0-1324-3171-8
Chapter 1, Slide 41
Looking for Books on ‘C6000 DSP?
“Digital Signal Processing Implementation using the TMS320C6000™ DSP Platform” by Naim Dahnoun; ISBN 0201-61916-4
“C6x-Based Digital Signal Processing” by Nasser Kehtarnavaz and Burc Simsek; ISBN 0-13-088310-7
“Real-Time Digital Signal Processing: Based on the TMS320C6000” by Nasser Kehtarnavaz; Newnes; Book & CD-Rom (July 14, 2004); ISBN 0-7506-7830-5
“Digital Signal Processing and Applications with the C6713 and C6416 DSK (Topics in Digital Signal Processing)” by Rulph Chassaing; Wiley-Interscience; Book & CD-Rom (December 3, 2004); ISBN 0-4716-9007-4
Chapter 1, Slide 42
Looking for Books on ‘C6000 DSP?
“Real-Time Digital Signal Processing from Matlab to C with the TMS320C6x DSK” by Thad B. Welch, Cameron Wright, and Michael Morrow; Book & CD-Rom (2006); ISBN 0-8493-7382-4
Chapter 1, Slide 43
Chapter 1
Introduction
- End -