

Application Report SPRA864 – November 2002

Choosing the Appropriate Simulator Configuration in Code Composer Studio IDE

Pankaj Ratan Lal, Ambar Gadkari Software Development Systems

ABSTRACT

Software development for digital signal processors (DSPs) goes through algorithm development and application integration stages. Each stage involves validation and optimization of the developed code. Code Composer Studio IDE has a set of simulator configurations and tools to enable application development in these stages.

The simulator configurations for a DSP include: functional CPU simulator, cycle accurate CPU simulator, functional device simulator, and device simulator. The simulators also support features for validation and optimization such as the pipeline stall analyzer, code coverage and multi-event profiler, and cache analysis tool. There are also features, such as the pin connect and port connect, which provide external stimuli to the application. It is critical to use the appropriate simulator configuration and a combination of features during the different stages of application development.

This application note relates the application validation and optimization challenges to available simulation configurations and supported tools. It helps the developer choose the appropriate simulator configuration along with applicable features during different stages of application development.

Contents

1 Introduction
2 Application Software Development Flow
    2.1 Validation and Optimization Challenges
3 Choosing the Appropriate Configuration
    3.1 Simulator Configurations
    3.2 Algorithm Validation
        3.2.1 Recommended Configurations
    3.3 Algorithm Optimization
        3.3.1 Recommended Configurations
        3.3.2 Profilers
        3.3.3 Pipeline Analysis
    3.4 Application Validation
        3.4.1 Recommended Configurations
    3.5 Application Optimization
        3.5.1 Recommended Configurations
        3.5.2 Cache Optimizations
        3.5.3 Buffer Transfer Optimizations
    3.6 Real-World Interactions
    3.7 Setting Up Simulator Configurations in Code Composer Studio IDE
4 User Scenarios
5 References
Appendix A
    A.1 Simulator Configurations Based on Extent and Detail of Device Simulated
    A.2 Features to Provide External Stimuli to Target Device
    A.3 Visibility and Analysis Features

List of Figures

Figure 1 Component View of an Application
Figure 2 Algorithm Validation and Optimization and the Simulator Configurations
Figure 3 Application Validation and Optimization and the Simulator Configurations

List of Tables

Table 1 TMS320C6x Simulator Configurations
Table 2 TMS320C55x Simulator Configurations
Table 3 Functional Device Simulators vs. Device Simulators
Table 4 Algorithm Validation and Optimization
Table 5 Application Validation and Optimization

Trademarks are the property of their respective owners.

1 Introduction

Code Composer Studio IDE for C55x and C6x has multiple simulator configurations available through the import menu in Code Composer Studio setup. The simulators support features such as the pipeline stall analyzer, code coverage and multi-event profiler, and cache analysis tool. There are also features, such as the pin connect and port connect, which provide external stimuli to the application. It is critical to use the appropriate simulator configuration and a combination of features during the different stages of application development.

This application note is organized as follows:

• Section 2 describes a typical application development flow and associated validation and optimization challenges.

• Section 3 enumerates the different simulator configurations, supported features, and tools. Further, it describes how appropriate combinations of simulator configuration, features, and tools address these challenges.

• Section 4 describes typical user scenarios, illustrating the usage of simulators to tackle the validation and optimization challenges.

NOTE: The various simulator configurations and features discussed in this document refer to those supported by Code Composer Studio IDE v2.2.


2 Application Software Development Flow

A typical application software development flow involves the following stages:

1. Identifying the application and partitioning it into modules.

2. Making heuristic estimates of CPU cycles and memory usage for individual modules and the overall application.

3. Creating/reusing algorithms for the modules.

4. Verifying the functionality of individual modules on the target DSP.

5. Performing optimizations on each module, for code size and CPU cycles, to meet or better the cycle estimates made in step 2.

Steps 4 and 5 are repeated until satisfactory results are obtained.

6. Integrating modules and verifying the full application. Integration typically involves creating DSP/BIOS threads, and using the chip support library (CSL) to program the DMA and other peripherals.

7. Optimizing the application for transaction latencies and efficient buffer management, to match or better the heuristic estimates.

This process may require changes to the application code and hence involve iterating steps 3 through 7.
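The budget check behind steps 2 and 5 comes down to simple arithmetic. The sketch below is an illustration only and is not taken from the report: the function names, the clock rate, sample rate, frame size, and per-module cycle counts are all hypothetical, and it assumes frame-based processing in which one frame must be finished before the next one arrives.

```python
def cycle_budget_per_frame(cpu_hz, sample_rate_hz, frame_size):
    """CPU cycles available to process one frame of samples in real time."""
    if cpu_hz <= 0 or sample_rate_hz <= 0 or frame_size <= 0:
        raise ValueError("all arguments must be positive")
    # A new frame arrives every frame_size / sample_rate_hz seconds, so the
    # budget is the number of CPU cycles that elapse in that interval.
    return cpu_hz * frame_size // sample_rate_hz


def check_budget(module_cycles, budget):
    """Return (total cycles used, remaining headroom, within-budget flag)."""
    total = sum(module_cycles.values())
    return total, budget - total, total <= budget


# Hypothetical numbers: 100 MHz CPU, 8 kHz sampling, 80-sample frames.
budget = cycle_budget_per_frame(100_000_000, 8_000, 80)

# Hypothetical per-frame cycle counts, e.g. as reported by a profiler run.
measured = {"FIR": 600_000, "VOL": 50_000}
total, headroom, ok = check_budget(measured, budget)
print(budget, total, headroom, ok)  # 1000000 650000 350000 True
```

If the flag comes back false, the measured counts per module show where the next round of step 5 should focus.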

Figure 1 depicts a typical structure of an application that contains multiple modules integrated into the application framework. The framework includes using DSP/BIOS and CSL. The application operates on external stimuli through peripherals such as a serial port.

Algorithms, such as FIR and VOL in the figure, are developed first. These are then integrated into the application framework.

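[Figure 1: layered block diagram. Top layer: Application Framework, containing the VOL and FIR algorithms. Middle layer: DSP/BIOS and CSL. Bottom layer: TMS320 Hardware, comprising the CPU, DMA, McBSP, and other peripherals.]

Figure 1. Component View of an Application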


2.1 Validation and Optimization Challenges

Application software development involves creating correct and efficient applications. Ensuring efficiency and correctness involves thorough validation of all aspects of the software and is critical to the viability of real-time applications. The typical challenges faced during application development fall into the following categories:

• Algorithm validation involves verifying the correctness of developed code. This includes validating all parts of the code and debugging any problems encountered in the process.

• Algorithm optimization involves meeting cycle and memory budgets, typically estimated using heuristics. The application developer can first meet the CPU cycle budgets assuming an ideal memory system (where memory latency is assumed to be 0). Maximum utilization of the CPU resources and instruction set features is needed at this stage to meet the constraints.

• Application validation involves ensuring correctness after integrating different algorithms and control code into the application framework. The integration process may involve creating DSP/BIOS tasks, using CSL and device drivers for programming the peripherals, and properly placing buffers in internal and external memories. The application needs to be validated for correct data transfers across buffers, and proper task scheduling and prioritizations to ensure overall correctness.

• Application optimization involves ensuring efficient code placement of all modules to minimize memory latency and cache misses, efficient usage of the DMA for transferring data across internal memories, peripherals and external memories, and minimizing the CPU idle time. In this process, changes may have to be made to the application, which will require iterating through the validation and optimization cycles.

• Real-world interactions involve supplying appropriate external stimuli to run the application. These stimuli could be application data inputs/outputs or control signals, such as interrupts.

Application development may involve going through multiple iterations of these steps. Figure 2 and Figure 3 depict this application development flow and broadly indicate which simulator configurations are suitable during this flow.

The next section describes the available configurations and features to meet the developmental challenges.


[Figure 2: flow diagram. Stage labels: Build, Debug, Test functionality, Bug fix, Tune, Test performance, Optimize, Optimized algorithm. Simulator configurations shown: functional CPU simulator, cycle accurate CPU simulator.]

Figure 2. Algorithm Validation and Optimization and the Simulator Configurations

[Figure 3: flow diagram. Inputs: application framework, VOL, FIR, real-world interaction. Stage labels: Integrate/build, Debug, Test functionality, Bug fix, Tune, Test performance, Optimize. Simulator configurations shown: functional device simulator and device simulator; functional device simulator, device simulator, and real device.]

Figure 3. Application Validation and Optimization and the Simulator Configurations

3 Choosing the Appropriate Configuration

Code Composer Studio IDE provides different simulator configurations that address the needs of different stages of application development. These configurations differ in capabilities such as the extent of the DSP device simulated, the level of detail to which the DSP device is simulated, features for simulating external stimuli, and support for debug and efficiency analysis tools.

The simulator configurations are classified based on the extent of the DSP device simulated and the level of detail to which the DSP device is simulated (for details, refer to section A.1):


• Core simulators

– functional CPU simulator

– cycle-accurate CPU simulator

– CPU and cache simulators

• Full-device simulators

– functional-device simulator

– device simulator

The features supported for simulating external stimuli to the DSP are classified as follows (for details, refer to section A.2):

• pin connect

• port connect

• cross bar

• boot load

• external host port

The validation and optimization features supported by the simulators are classified as follows (for details, refer to section A.3):

• pipeline analysis

• simulator analysis events

• Code Composer Studio IDE profiler

• cache analysis (part of the analysis tool kit released with Code Composer Studio IDE v2.2)

• code coverage and multi-event profiler (part of the analysis tool kit released with Code Composer Studio IDE v2.2)

Refer to Appendix A for further details on these capabilities. All these simulator configurations and features are integrated into the Code Composer Studio IDE. Therefore, all the simulator configurations support Code Composer Studio IDE features such as viewing CPU/peripheral registers, viewing memory contents, setting breakpoints, etc.

Let us now focus on how the simulation configurations can be used to address the different validation and optimization challenges identified.

3.1 Simulator Configurations

The tables below provide information on the various simulator configurations available with Code Composer Studio IDE v2.2.

Table 1. TMS320C6x Simulator Configurations

| Configuration | Description |
|---|---|
| C62xx Cycle Accurate Sim, Little Endian | Simulates the core of the C62x processor. This is faster than the device simulator but does not simulate peripherals or the cache system (uses a flat memory system). |
| C64xx Cycle Accurate Sim, Little Endian | Simulates the core of the C64x processor. This is faster than the device simulator but does not simulate peripherals or the cache system (uses a flat memory system). |
| C64xx Cache Simulator, Little Endian | Simulates the core of the C64x processor. This also models the L1D, L1P and the L2 caches. Beyond L2 it uses a flat memory system. |
| C67xx Cycle Accurate Sim, Little Endian | Simulates the core of the C67x processor. This is faster than the device simulator but does not simulate peripherals or the cache system (uses a flat memory system). |
| C6201 Device Sim, Little Endian Map 0 | Simulates the C6201 processor. Supports PBUS, DMA, DMS, PMS, McBSP(2), timer(2); EMIF supports interfacing with async, SDRAM and SBSRAM memory models (EMIF not fully cycle accurate). Does not support HPI. |
| C6202 Device Sim, Little Endian Map 0 | Simulates the C6202 processor. Supports PBUS, DMA, DMS, PMS, McBSP(3), timer(2); EMIF supports interfacing with async, SDRAM and SBSRAM memory models (EMIF not fully cycle accurate). Does not support Exp.Bus 32-bit. |
| C6203 Device Sim, Little Endian Map 0 | Simulates the C6203 processor. Supports PBUS, DMA, DMS, PMS, McBSP(3), timer(2); EMIF supports interfacing with async, SDRAM and SBSRAM memory models (EMIF not fully cycle accurate). Does not support Exp.Bus 32-bit. |
| C6204 Device Sim, Little Endian Map 0 | Simulates the C6204 processor. Supports PBUS, DMA, DMS, PMS, McBSP(2), timer(2); EMIF supports interfacing with async, SDRAM and SBSRAM memory models (EMIF not fully cycle accurate). Does not support Exp.Bus 32-bit. |
| C6205 Device Sim, Little Endian Map 0 | Simulates the C6205 processor. Supports PBUS, DMA, DMS, PMS, McBSP(2), timer(2); EMIF supports interfacing with async, SDRAM and SBSRAM memory models (EMIF not fully cycle accurate). Does not support PCI. |
| C6211 Device Simulator, Little Endian | Simulates the C6211 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, timer(2), McBSP(2); EMIF supports interfacing with async and SDRAM memory models. Does not support HPI. |
| C6414 Device Simulator, Little Endian | Simulates the C6414 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, interrupt selector, McBSP(3), timer(3); EMIF supports interfacing with async, SDRAM and generic sync RAM memory models. Does not support HPI, Utopia. |
| C6415 Device Simulator, Little Endian | Simulates the C6415 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, interrupt selector, McBSP(3), timer(3); EMIF supports interfacing with async, SDRAM and generic sync RAM memory models. Does not support HPI, PCI, Utopia. |
| C6416 Device Simulator, Little Endian | Simulates the C6416 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, interrupt selector, McBSP(3), timer(3), TCP, VCP; EMIF supports interfacing with async, SDRAM and generic sync RAM memory models. Does not support HPI, PCI, Utopia. |
| C6416 Functional Simulator, Little Endian | Simulates the C6416 processor. This is faster than the device simulator but does not simulate all the peripherals. Supports functional timer(2), interrupt selector, EDMA and QDMA (uses a flat memory system). |
| C6411 Device Simulator, Little Endian | Simulates the C6411 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, interrupt selector, McBSP(2), timer(3); EMIF supports interfacing with async, SDRAM and generic sync RAM memory models. Does not support HPI, Utopia. |
| C6701 Device Sim, Little Endian Map 0 | Simulates the C6701 processor. Supports DMA, McBSP(2), timer(2); EMIF supports interfacing with async, SDRAM and SBSRAM memory models (EMIF not fully cycle accurate). Does not support HPI. |
| C6711 Device Simulator, Little Endian | Simulates the C6711 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, McBSP(2), timer(2); EMIF supports interfacing with async and SDRAM memory models. Does not support HPI. |
| C6712 Device Simulator, Little Endian | Simulates the C6712 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, McBSP(2), timer(2); EMIF supports interfacing with async and SDRAM memory models. |
| C6713 Device Simulator, Little Endian | Simulates the C6713 processor. Supports L1D, L1P, L2 cache, EDMA, QDMA, timer(2), McBSP(2), McASP(2), interrupt selector; EMIF supports interfacing with async and SDRAM memory models. Does not support HPI, IIC. |
| C6713 Functional Simulator, Little Endian | Simulates the C6713 processor. This is faster than the device simulator but does not simulate all the peripherals. Supports functional timer(2), interrupt selector, EDMA and QDMA (uses a flat memory system). |

Table 2. TMS320C55x Simulator Configurations

| Configuration | Description |
|---|---|
| C55xx Functional Simulator | Simulates the C55x CPU Rev 2.1 core. This gives the fastest possible result, but pipeline effects of the CPU are neglected; that is, instructions are executed one at a time. Supports the timer but doesn’t support any other peripherals. This simulator will not be cycle or cycle count accurate. |
| C55xx Cycle Accurate Simulator | Simulates the C55x CPU Rev 2.1 core. Supports program/data memory with latency. If the memory configuration is not provided, a flat memory system (memory with no latency, no DARAM/SARAM) is used as default. Supports the timer but doesn’t support any other peripheral. |
| C55xx Cache Simulator | Simulates the C55x CPU Rev 2.1 core. Supports program/data memory with latency. If the memory configuration is not provided, a flat memory system (memory with no latency, no DARAM/SARAM) is used as default. Also supports the timer and the C55x Instruction Cache. Does not support any other peripheral. |
| C5510 Device Simulator | Simulates the C5510 processor. Supports ICache, DMA, EMIF, timers (2), McBSP (3), RHEA, and EHPI. Doesn’t support DPLL and GPIO. Internal memory interface supports interfacing with SARAM and DARAM models. External memory supports interfacing with asynchronous and SBSRAM models. |
| C5502 Functional Simulator | Simulates the C5502 processor. This is faster than the device simulator but does not simulate all the peripherals. Supports functional timers (3), watchdog timer, DMA, and ICache. Doesn’t support EMIF, McBSP, VBUS, IIC, UART and UHPI peripherals. Uses a flat memory system. |
| C5502 Device Simulator | Simulates the C5502 processor. Supports ICache, DMA, EMIF, timers (3), watchdog timer, McBSP (3), VBUS, IIC, and UART peripherals. Doesn’t support UHPI. Internal memory interface supports interfacing with SARAM and DARAM models. External memory supports interfacing with async, SBSRAM, and SDRAM models. |

NOTE:

• All the configurations on the C6x have a corresponding big endian mode configuration supported in the product. Subsequent references to simulator configurations apply to both the endian modes.

• All the configurations on the C6x having map 0 specified in the above table have a corresponding map 1 configuration supported in the product.

• Since C55x has a protected pipelined architecture, there are two variants of CPU simulators available on the C55x platform: functional CPU simulator (having no pipeline effects modeled) and cycle accurate CPU simulator (having the pipeline effects modeled accurately).

• Most of the capabilities of the simulator configurations mentioned in algorithmic development stages are also applicable to the application development stage. If they are not applicable, they will be specifically mentioned.

3.2 Algorithm Validation

Algorithm development is typically CPU centric; algorithms may be developed and validated in isolation from the application framework. Debug visibility into application code variables and data structures, CPU registers, and memory is key during this development stage. Algorithm validation is targeted to ensure coverage of all developed code.

The functional CPU simulator on the C55x and the CPU simulators on the C6x are best suited for algorithm validation. These run faster than other simulator configurations (refer to the TMS320C6000 Instruction Set Simulator Technical Overview (SPRU600) and the TMS320C55x Instruction Set Simulator Technical Overview (SPRU599) for details on performance numbers of various simulator configurations). These configurations also simulate tasks running on the DSP/BIOS.

The code coverage tool gives information about the source code that was not exercised in a run of the application.

3.2.1 Recommended Configurations

TMS320C55x:

C55x functional simulator

TMS320C6x:

C62x, C64x, C67x cycle accurate CPU simulators

Related features:

Breakpoints, register views, memory views, RTDX, probe points, pin connect, port connect, DSP/BIOS, and code coverage and multi-event profiler.

Example:

In Figure 1, the algorithms VOL and FIR can be debugged using these simulator configurations. Data may be fed to these algorithms through RTDX channels or probe points.
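When validating algorithms such as VOL and FIR this way, it helps to have expected outputs to compare against. The sketch below is one possible host-side reference model, not something described in the report. It assumes that FIR is a direct-form filter over integer samples and that VOL is a Q15 gain stage with 16-bit saturation; the report does not define either algorithm, so both definitions and all function names are hypothetical. How the simulator output is collected is left open.

```python
def fir_reference(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n - k], x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


def vol_reference(samples, gain_q15):
    """Scale each sample by a Q15 gain and saturate the result to 16 bits."""
    out = []
    for s in samples:
        scaled = (s * gain_q15) >> 15  # arithmetic shift, rounds toward -infinity
        out.append(max(-32768, min(32767, scaled)))
    return out


def first_mismatch(expected, actual):
    """Index of the first differing sample, or None if the sequences agree."""
    for i, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return i
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


# An impulse input should reproduce the coefficients at the filter output.
expected = fir_reference([1, 0, 0, 0], [3, 2, 1])
print(expected)  # [3, 2, 1, 0]

# 'captured' stands in for samples read back from a simulator run.
captured = [3, 2, 1, 0]
print(first_mismatch(expected, captured))  # None
```

A mismatch index gives a concrete input position at which to set a breakpoint and inspect registers and memory in the simulator.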

3.3 Algorithm Optimization

Algorithmic optimization involves optimizing for CPU cycles and code size. Identifying regions of application code that consume excess CPU cycles, as well as the causes for these excess cycles, is key in this stage.

The C55x cycle accurate CPU simulator and the C6x CPU simulator are best suited for this stage.


3.3.1 Recommended Configurations

TMS320C55x:

C55x cycle accurate simulator

TMS320C6x:

C62x, C64x, C67x cycle accurate CPU simulators

Related features:

• Pipeline stall analyzer (C55x).

• Code Composer Studio IDE profiler, code coverage and multi-event profiler.

3.3.2 Profilers

The Code Composer Studio IDE profiler helps identify hot spots by giving cumulative, min, max, and average event counts over functions and ranges of code. The code coverage and multi-event profiler helps identify causes of the performance losses by providing profile data over events such as pipeline stalls, cache misses, and so on. It gives cumulative event counts for multiple events on a source line basis. The multi-event profiler helps pinpoint the exact lines of code contributing stall cycles. For more details, refer to online help for the Code Composer Studio IDE profiler and Using Code Coverage and the Multi-event Profiler for Robustness and Efficiency Analysis (SPRA868).
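The kind of summary described above (cumulative, min, max, and average counts per function) can be reproduced on any list of per-invocation measurements. The following sketch is illustrative only: it does not read any Code Composer Studio file format, and the record layout, function names, and cycle counts are hypothetical.

```python
from collections import defaultdict


def summarize(records):
    """Aggregate (function_name, cycles) records, one record per invocation.

    Returns {name: {"count", "total", "min", "max", "avg"}}.
    """
    per_function = defaultdict(list)
    for name, cycles in records:
        per_function[name].append(cycles)
    return {
        name: {
            "count": len(values),
            "total": sum(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for name, values in per_function.items()
    }


def hot_spots(summary, top=1):
    """Names of the functions with the largest cumulative counts, highest first."""
    return sorted(summary, key=lambda name: summary[name]["total"], reverse=True)[:top]


# Hypothetical per-invocation cycle counts.
records = [("fir", 120), ("vol", 30), ("fir", 100), ("fir", 140)]
summary = summarize(records)
print(summary["fir"])      # {'count': 3, 'total': 360, 'min': 100, 'max': 140, 'avg': 120.0}
print(hot_spots(summary))  # ['fir']
```

Ranking by cumulative count rather than by average keeps frequently called short routines visible, which is usually what matters for a cycle budget.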

3.3.3 Pipeline Analysis

Once the hot spots are identified, it is important to know their causes. The CPU simulator accurately simulates the instruction set behavior of the target DSP. For the C55x pipeline-protected architecture, the simulator accurately simulates the stall behavior due to various resource conflicts. This pipeline behavior can be seen on the simulator through the pipeline stall analyzer. This will help identify causes for pipeline stalls and help optimize critical routines coded in assembly language. For more details, refer to the online help for information on the pipeline stall analyzer.

Example:

In Figure 1, VOL and FIR need to be optimized to meet cycle and memory budgets. The multi-event profiler tells the exact location of cycle losses in the code. Further, on the C55x, the pipeline stall analyzer can be used to trace the exact cause for such a stall.

3.4 Application Validation

Application correctness needs to be ensured after integrating the different algorithms and control code into the application framework. The integration process may involve creation of DSP/BIOS tasks, using CSL and device drivers for programming the peripherals, and proper placement of buffers in internal and external memories.

The device simulator configurations model the DMA, serial ports, timers, and other peripherals. These configurations help in validating the integrated applications. Since there may be multiple iterations involved in ensuring correctness of the application, the functional device simulator configurations are preferred due to their higher simulation speeds.

The differences between the functional and the cycle accurate device simulators are presented in Table 3.


Table 3. Functional Device Simulators vs. Device Simulators

Functional vs. Cycle Accurate Memory Subsystems

The functional device simulators model the cache system to give correct event counts for cache hits and misses. The cycle accurate simulators, besides giving correct event counts, also model the cycle latencies due to memory accesses. The cycle accurate cache models in the device simulators can be used to get an estimate of the cycles gained after optimizing the application, using the analysis information from the functional device simulators.

Functional vs. Cycle Accurate DMA

Applications use DMA to transfer data across internal memories, peripherals and external memories. These transfers are typically synchronized with events such as interrupts, serial port events, etc. For simulating the application behavior correctly, it would suffice to simulate the DMA transfers synchronized on those events, and not necessarily model the latencies accurately.

The functional simulator utilizes this characteristic to simulate applications correctly with higher speeds by not modeling the latencies.

Cycle accurate DMA on device simulators mimics the real target behavior by modeling the latencies, the bus contentions, the transfer protocols, etc.
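The DMA distinction in Table 3 can be illustrated with a toy model. The sketch below is not the simulator's implementation; the function names, the setup cost, and the per-word cost are assumptions chosen only to show that both models move the same data on the same synchronization event, while only the cycle-aware one charges latency. Bus contention and transfer protocols, which a real cycle accurate model also covers, are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TransferResult:
    dest: List[int]   # destination buffer contents after the transfer
    cycles: int       # cycles the model charges for the transfer


def functional_dma(src, sync_event_seen):
    """Functional model: once the sync event fires, the data simply appears."""
    if not sync_event_seen:
        return TransferResult(dest=[], cycles=0)
    return TransferResult(dest=list(src), cycles=0)


def cycle_accurate_dma(src, sync_event_seen, setup_cycles=4, cycles_per_word=2):
    """Cycle-aware model: same data movement plus an assumed latency charge."""
    if not sync_event_seen:
        return TransferResult(dest=[], cycles=0)
    # setup_cycles and cycles_per_word are illustrative values, not device data.
    cycles = setup_cycles + cycles_per_word * len(src)
    return TransferResult(dest=list(src), cycles=cycles)


samples = [10, 20, 30, 40]
fast = functional_dma(samples, sync_event_seen=True)
slow = cycle_accurate_dma(samples, sync_event_seen=True)
```

Both results hold the same destination data, which is why the functional model is sufficient for checking correctness; only the second result carries a cycle cost that could feed an optimization decision.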

3.4.1 Recommended Configurations

TMS320C55x:

C5502 functional simulator

Use this simulator if the peripheral capabilities needed by the application match the peripherals supported in this simulator. Otherwise, the C5502 or C5510 device simulator can be used.

For example, if DMA and timers are used, the C5502 functional device simulator is used. If UART or I2C is used by the application, use the C5502 device simulator, since the C5502 functional simulator does not model these peripherals.

TMS320C6x:

C6713 functional simulator, C6416 functional simulator

Use these simulators if the peripheral capabilities needed by the application match the peripherals supported in these simulators. Otherwise, the C6414/15/16, C620x, C621x, C671x device simulators can be used.

For example, if DMA and timers are used, the C6416/C6713 functional device simulator is used. If TCP/VCP is used by the application, use the C6416 device simulator.

Related features:

Pin connect and port connect are used for providing external stimuli for peripherals such as the serial port. Simulator analysis events can be used for observing events on the target. These events can be used for debugging by configuring them to stop simulation when the event occurs.

Example:

For the application in Figure 1, the VOL and FIR algorithms are integrated into an application framework that uses DSP/BIOS and CSL. The functional device simulator may be used to verify application correctness.
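The selection rule in the recommended configurations above (prefer the functional simulator when it models every peripheral the application needs, otherwise fall back to the device simulator) can be sketched as a small lookup. The table and function names are hypothetical, and the supported-peripheral sets list only DMA and timers because those are the peripherals the text names; the complete sets should be taken from the documentation of each simulator.

```python
# device -> (functional configuration, device configuration, peripherals the
# functional configuration is assumed to model for this illustration)
CONFIGS = {
    "C5502": ("C5502 functional simulator", "C5502 device simulator", {"DMA", "TIMER"}),
    "C6416": ("C6416 functional simulator", "C6416 device simulator", {"DMA", "TIMER"}),
}


def choose_validation_config(device, needed_peripherals):
    """Return the faster functional configuration when it covers all needs."""
    functional, full_device, supported = CONFIGS[device]
    if set(needed_peripherals) <= supported:
        return functional
    return full_device


dma_timer_app = choose_validation_config("C5502", {"DMA", "TIMER"})
uart_app = choose_validation_config("C5502", {"DMA", "UART"})
tcp_app = choose_validation_config("C6416", {"TCP"})
```

With these assumed sets, an application that uses only DMA and timers maps to the functional configuration, while one that also needs UART, I2C, TCP, or VCP maps to the device configuration, matching the examples given for each platform.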

3.5 Application Optimization

Application optimization involves ensuring efficient code placement of all modules to minimize memory latency and cache misses, efficient usage of the DMA for transferring data across internal memories, peripherals and external memories, and minimizing the CPU idle time. The functional device simulators and the device simulators can be used to perform these optimizations.

3.5.1 Recommended Configurations

Cache optimizations:

• TMS320C55x:

– C55x cycle accurate CPU simulator

This configuration simulates the I-cache and can be used with the cache analysis tool to analyze the cache hits/misses.

• TMS320C6x:

– C6713 functional simulator, C6416 functional simulator

These functional device simulator configurations can be used to analyze cache hits/misses. They support the cache analysis tool for analyzing the L1P and L1D cache performance.

– C64x cache simulator

This configuration can be used to analyze the cycle effects of cache hits and misses on the C64x devices. This simulator configuration simulates the cycle effects of the L1P, L1D, and L2.

Note: The cache analysis tool is supported only with the functional device simulator configurations.

Buffer transfer optimizations:

• TMS320C55x:

– C5502 device simulator, C5510 device simulator

• TMS320C6x:

– C6201/2/3/4/5 device simulator

– C6211 device simulator

– C6711/2/3 device simulator

– C6411/4/5/6 device simulator

3.5.2 Cache Optimizations

Optimizing for memory accesses involves analyzing program/data cache misses and memory access patterns. The functional device simulator configurations can be used to quickly simulate the application and obtain this analysis data. The cache analysis tool can be used to visualize these access patterns and identify areas of application improvement. The higher simulation speeds of the functional device simulator configurations enable performing these application optimizations over multiple iterations.

Once these optimizations are completed, the user can run the application on the device simulator or the hardware, such as the test and evaluation board (TEB), to get an accurate estimate of the cycles gained.
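To see why hit and miss event counts alone are enough to guide placement decisions, consider a toy direct-mapped cache. The cache geometry (4 lines of 16 bytes), the addresses, and the function names below are assumptions for illustration and do not describe any C55x or C6x cache. The sketch counts events only, as a functional model would, and shows how moving one routine changes the miss count.

```python
def simulate_direct_mapped(addresses, num_lines=4, line_size=16):
    """Count hits and misses for an address trace on a direct-mapped cache."""
    tags = [None] * num_lines            # block currently held by each line
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size        # which memory block is touched
        index = block % num_lines        # which cache line that block maps to
        if tags[index] == block:
            hits += 1
        else:
            misses += 1
            tags[index] = block          # replace whatever was there
    return hits, misses


def alternating_trace(base_a, base_b, iterations=8):
    """Two routines called in turn, each fetching from its own base address."""
    return [base_a, base_b] * iterations


# Routine B placed 64 bytes after A maps to the same cache line as A.
conflicting = simulate_direct_mapped(alternating_trace(0x00, 0x40))
# Moving B by one cache line removes the conflict.
relocated = simulate_direct_mapped(alternating_trace(0x00, 0x50))
```

In this trace the conflicting placement misses on every access, while the relocated placement misses only on the first touch of each routine. The cycle benefit of such a change is what the device simulator or the hardware would then quantify.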

3.5.3 Buffer Transfer Optimizations

In the background of CPU computations, applications transfer buffers using the DMA across internal memories, peripherals and external memories. Optimizing these transfers may involve choosing the right buffer sizes, and sequencing the DMA transfers appropriately to maximize parallelism between CPU and DMA. The device simulator configurations can be used to simulate and evaluate the effects of these optimizations.

Example:

When FIR and VOL are integrated into the application framework, the functional device simulator configurations can be used for memory placement and cache optimizations. The device simulator configurations can be used to perform buffer transfer optimizations and measure the cycle effects of all optimizations.
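The benefit of overlapping CPU work with DMA transfers can be estimated with simple arithmetic before measuring it on a device simulator. The sketch below compares a strictly sequential scheme with a two-buffer (ping-pong) scheme. It is a back-of-the-envelope model under stated assumptions: only input transfers are modeled, every block has the same cost, and the cycle figures are made-up inputs rather than measurements.

```python
def sequential_cycles(num_blocks, cpu_cycles, dma_cycles):
    """CPU waits for each transfer to finish before processing the block."""
    return num_blocks * (cpu_cycles + dma_cycles)


def pingpong_cycles(num_blocks, cpu_cycles, dma_cycles):
    """CPU processes one buffer while the DMA fills the other."""
    if num_blocks == 0:
        return 0
    # The first block must arrive before any processing can start, the last
    # block is processed with nothing left to fetch, and every step in
    # between costs whichever of the two activities is slower.
    overlapped_steps = num_blocks - 1
    return dma_cycles + overlapped_steps * max(cpu_cycles, dma_cycles) + cpu_cycles


# Hypothetical per-block costs: 1000 CPU cycles, 600 DMA cycles, 10 blocks.
before = sequential_cycles(10, cpu_cycles=1000, dma_cycles=600)
after = pingpong_cycles(10, cpu_cycles=1000, dma_cycles=600)
```

Under these assumptions the estimate drops from 16000 to 10600 cycles. The model also shows that once the DMA time per block exceeds the CPU time, buffer size and transfer sequencing become the limiting factors, which is the case where cycle accurate DMA modeling matters most.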

3.6 Real-World Interactions

When simulating the integrated application, it is sometimes essential to provide appropriate external stimuli to the peripherals, as would happen in a real system. These stimuli could be external interrupts or events indicating data availability from external sources. They can be simulated using the pin connect and port connect features in Code Composer Studio IDE.

Pin connect and port connect features are available on all the simulator configurations. Details on these are available in the technical reference guide for each ISA simulator.

It is also possible to simulate the device boot process on the C6x simulators and external host port interactions on the C55x simulators.
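Appendix A.2 notes that a pin connect file specifies the absolute or relative cycle at which each interrupt occurs. The sketch below shows how such a mixed schedule resolves to a list of absolute cycles. The ("abs", n) / ("rel", n) tuple representation is a hypothetical stand-in and is not the actual file syntax, which is defined in the simulator technical reference guides.

```python
def expand_schedule(entries):
    """Resolve absolute and relative stimulus entries into absolute cycles."""
    cycles = []
    current = 0
    for kind, value in entries:
        if kind == "abs":
            current = value              # fire at exactly this cycle
        elif kind == "rel":
            current += value             # fire this many cycles after the previous one
        else:
            raise ValueError(f"unknown entry kind: {kind!r}")
        cycles.append(current)
    return cycles


# Hypothetical stimulus: first interrupt at cycle 100, two more at 50-cycle
# spacing, then one at absolute cycle 1000.
interrupt_cycles = expand_schedule([("abs", 100), ("rel", 50), ("rel", 50), ("abs", 1000)])
```

Relative entries are convenient for periodic stimuli such as frame syncs, while absolute entries pin a one-off event to a known point in the run.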

3.7 Setting Up Simulator Configurations in Code Composer Studio IDE

Following are the steps to set up the desired simulator configuration in Code Composer Studio IDE.

1. Click on the Code Composer Studio Setup icon. This will bring up the Import Configuration window as shown below.

2. Select the desired configuration from the available list of configurations shown in the window.

Click on Import and then Save and Quit.

This will set the selected simulator configuration in the Code Composer Studio setup.

3. Save and close the Code Composer Studio setup.

4 User Scenarios

Tables 4 and 5 list a few user scenarios during application development and highlight the appropriate simulator configuration and the applicable features that can be used in Code Composer Studio IDE.

Table 4. Algorithm Validation and Optimization

(a) Algorithm Validation

Platform: C55x/C6x

Problem: I am using FIR, IIR and LMS algorithms from a vendor. How can I quickly validate them on the C55x/C6x platforms?

Solution: For quick validation, use the C55x functional simulators. For the C6x, the CPU simulator configurations can be used.

(b) Pipeline Stalls

Platform: C55x

Problem: I am trying to minimize pipeline stalls by reordering certain instructions that seem to create resource conflicts. How can I do this easily?

Solution: The C55x CPU cycle accurate simulator configuration supports the pipeline stall analysis feature. This tool clearly shows the causes of pipeline stalls, IBU stalls, and the instructions and resources that are resulting in conflicts. It allows stepping through the application code to easily identify the exact areas of code where conflicts occur.

(c) Algorithm Profiling

Platform: C6x/C55x

Problem: I am working on the C6x/C55x platform. I need to get the cycles that are spent in instructions, without memory latency.

Solution: Use any of the 62x, 64x, 67x ISA simulator configurations or the C55x CPU cycle accurate simulator configurations. The Code Composer Studio IDE profiler or the multi-event profiler may be used for profiling the application code.

(d) Profiling a Particular Event

Platform: C6x

Problem: How can I find out details of memory bank conflicts on a per-function basis?

Solution: A memory bank conflict event is supported on all C6x simulators. The Code Composer Studio IDE profiler can be set up to profile the code on this event, instead of the default CPU cycles. This enables profiling the code for memory bank conflicts and seeing the per-function distribution of this event.

The multi-event profiler may be used if this profile information is needed for memory bank conflicts simultaneously with other events such as CPU cycles, cache events, etc.

(e) Algorithm Optimization

Platform: C55x

Problem: I would like to see the cycle effects of assembly code optimizations such as using dual MAC instructions and adding parallel instructions in various places in my code. How can I do that?

Solution: The C55x cycle accurate CPU simulator configuration can be used to measure these optimizations. Profile the portion of the code that needs to be optimized; the cycle difference between runs can then be used to reach the best possible performance.

(f) Algorithmic Optimization

Platform: C55x

Problem: I am developing an algorithm for the C55x platform. I do not know how to choose the ideal sizes for the local-repeat and block-repeat loops so that I get the best performance from the C55x architecture.

Solution: The C55x CPU simulator configuration can be used with the Code Composer Studio IDE profiler, the code coverage and multi-event profiler, and the pipeline stall analyzer to measure branch overheads.
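Scenario (d) in Table 4 asks for the per-function distribution of an event, possibly alongside other events. The sketch below shows the kind of aggregation such a profile represents. It does not reproduce the profiler's data format; the (function, event, count) records, the event names, and the counts are made-up inputs for illustration.

```python
from collections import Counter, defaultdict


def profile_by_function(samples):
    """Sum event counts per function from (function, event, count) records."""
    profile = defaultdict(Counter)
    for function, event, count in samples:
        profile[function][event] += count
    return profile


# Hypothetical records from a single run, covering two events at once.
samples = [
    ("fir", "cpu_cycles", 1200),
    ("fir", "bank_conflict", 8),
    ("vol", "cpu_cycles", 300),
    ("fir", "cpu_cycles", 1150),
    ("vol", "bank_conflict", 1),
]
profile = profile_by_function(samples)

# The function with the most bank conflicts is the first candidate for a
# change in data layout.
worst = max(profile, key=lambda name: profile[name]["bank_conflict"])
```

Collecting several events in one pass is what distinguishes a multi-event profile from repeating the run once per event.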

Table 5. Application Validation and Optimization

(a) Cache Analysis

Platform: C55x

Problem: I am developing an application for the C5502 device. I want to decide on the placement of my program within the internal and external memory such that I can take full advantage of the I-cache in the C5502. How can I do it?

Solution: The C55x cycle accurate CPU simulator has I-cache information and can be used with the cache analysis tool.

The C5502 functional simulator configuration provides information on I-cache hits and misses. The application can be simulated using any of the above configurations and modified to minimize the number of misses. The C5502 device simulator configuration can be used to measure the effects of cycle latencies for different types of internal and external memories.

(b) Cache Analysis

Platform: C6x

Problem: I am developing an application for the C6416 device. I want to decide on the placement of my program within the internal and external memory such that I can take full advantage of the caches in these devices.

Solution: The C6416 functional device simulator configuration can be used with the cache analysis tool to visualize the memory access patterns and details of cache hits and misses.

(c) Simulating Full Application

Platform: C55x

Problem: I am developing a vocoder application on the C5502 device. The application uses the McBSP0 for input data and the McBSP1 for sending the decoded data streams. The data streams are transferred to internal memory using the DMA for the CPU to operate on them.

Can I use the simulator to validate this application?

Solution: The C5502 functional device simulator configuration can be used initially while integrating the application modules.

The C5502 device simulator configuration can be used to validate the programming of the McBSP0 and McBSP1 peripherals. The pin connect and port connect features can be used to provide the input data stream to the McBSP0 and collect the output through the McBSP1.

(d) Simulating Full Application

Platform: C6x

Problem: I have a gsm.729 algorithm running on an RF3 framework. The application uses tasks through DSP/BIOS, programs DMA through CSL calls, and uses McBSP for the data transfers. I want to test the functional correctness of the application.

Solution: A functional simulator would have been more appropriate if the McBSP was supported. Since it is not supported, use the C6416 device simulator for verifying the functional correctness of the application. Use the port connect feature to transfer in the data through the McBSP. The pin connect feature can be used to provide the external frame syncs and the clock to the McBSP if they are programmed for external clock and frame syncs.

(e) Difference Between CPU, Functional Device and Device Simulators

Platform: C6x/C55x

Problem: I have code that does not use any peripheral or the DMA. What differences would I observe if I run it on the CPU simulator, functional device simulator, and device simulator?

Solution: Cycle counts: The CPU simulator and functional device simulator configurations will provide identical cycle counts. The device simulator configurations may provide different cycle counts if the memory ranges used by the application fall in the external memory ranges of the device. On the C55x, the C55x simulator configuration will give accurate CPU cycle counts and will differ from the C55x functional simulator configuration.

Simulation speed: The CPU simulator configurations will run faster than the functional device simulator configurations. The device simulator configurations will run slower than the functional device simulator configurations.

5 References

1. Code Composer Studio Getting Started Guide (SPRU509)

2. Code Coverage And Multi-event Profiler User Guide (SPRU624)

3. Using the Code Coverage And Multi-event Profiler for Robustness and Efficiency Analysis (SPRA868)

4. Cache Analysis User Guide (SPRU575)

5. Using the Cache Analysis Tool to Improve Cache Utilization (SPRA863)

Appendix A

A.1 Simulator Configurations Based on Extent and Detail of Device Simulated

• Functional CPU simulator

This models the ISA behavior without bringing in the pipeline effects. Memory latency is constant. This helps in ensuring the functional validation of the modules. These typically run faster than the cycle accurate CPU simulator. This variant is available for C55x simulators only.

• Cycle accurate CPU simulator

This models the pipeline behavior accurately, giving the detailed behavior on cycle losses due to CPU stalls and bank conflicts in on-chip memory. They help optimize the modules for instruction and efficient utilization of the CPU resources.

• CPU and cache simulator

This models the cycle behavior of L1P, L1D, and L2 for the C64x platform.

• Functional device simulators

The functionality of DMA, caches, and other peripherals is modeled to the extent of ensuring that applications may be run without modification. These simulators are an order of magnitude faster compared to the device simulators. They compromise on the cycle counts, though they still give accurate event counts. These are primarily used in the validation and certain optimization phases at the application level.

• Device simulators

These simulators model very closely the cycle behavior of the caches, DMA, and other peripherals. These are generally used to get an estimate for real cycles. They run at a much slower speed compared to the functional device simulators.

A.2 Features to Provide External Stimuli to Target Device

• Pin connect

Pin connect enables the user to simulate and monitor signals from external interrupts. For taking in external interrupts/triggers, some pins are simulated in the C6x/C5x devices. Any file having the specified format can be connected to those pins. These formats specify the absolute/relative cycle for these interrupts to occur.

• Port connect

The port connect tool allows the user to access a file through a memory address. This feature is very useful to set up an input or output data stream to the simulator at supported addresses. Whenever a file is connected to a memory (port) address for read (write), data from the file is accessed whenever there is a read (write) to the address.

• Cross bar

This tool is used for specifying the interconnectivity between McBSPs. Different McBSPs can be interconnected through their external pins, such as DR, FSR, and FSX. This is available on only some C6x simulators.

• Boot load

This feature can be used to bootload some code into the device memory at the start of simulation. This is available on some C6x simulators.

• External host port

This helps simulate the behavior of host interaction through a simple command-file based mechanism. This is available on some C55x simulators.
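The port connect behavior listed above, where a read or write to a connected address is served from a file, can be pictured with a small in-memory model. The class below is an illustrative sketch and not the Code Composer Studio mechanism: Python lists stand in for the connected files, the addresses 0x1000 and 0x2000 are hypothetical, and reading past the end of the input stream simply raises StopIteration here, which may differ from the simulator's behavior.

```python
class PortConnectModel:
    """Toy memory in which selected addresses are backed by data streams."""

    def __init__(self):
        self._read_streams = {}   # address -> iterator over input values
        self._write_logs = {}     # address -> list collecting written values
        self._memory = {}         # ordinary, unconnected locations

    def connect_read(self, address, values):
        """Serve successive reads of `address` from `values` (the input file)."""
        self._read_streams[address] = iter(values)

    def connect_write(self, address):
        """Capture writes to `address`; returns the list standing in for the output file."""
        self._write_logs[address] = []
        return self._write_logs[address]

    def read(self, address):
        if address in self._read_streams:
            return next(self._read_streams[address])
        return self._memory.get(address, 0)

    def write(self, address, value):
        if address in self._write_logs:
            self._write_logs[address].append(value)
        else:
            self._memory[address] = value


port = PortConnectModel()
port.connect_read(0x1000, [100, 200, 300])     # input samples
captured = port.connect_write(0x2000)          # output stream
for _ in range(3):
    sample = port.read(0x1000)                 # each read consumes one value
    port.write(0x2000, sample // 2)            # e.g. a simple volume scaling
```

The application code under test only sees ordinary reads and writes, so the same code can later run against a real peripheral data register without modification.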

A.3 Visibility and Analysis Features

• Pipeline analysis

This feature allows visibility into pipeline stages and instructions residing in each stage of the pipeline as they enter and exit it. It reports stalls, including the instructions involved and the conflicting resources.

• Simulator analysis

Simulator analysis allows the user to set up and monitor the occurrence of specific events. The simulator analysis plug-in reports the occurrence of particular system events to monitor and measure the performance of your program. The events can be set up to either increment a counter when they are triggered or to halt the execution when they are triggered.

• Code Composer Studio IDE profiler

This helps profile functions or ranges of code over clock cycles or analysis events.

• Cache analysis

It graphically visualizes the memory reference pattern of a program over time. This enables the programmer to quickly target the areas of code and data that are incurring cache misses, and provides a road map for applying optimizations and transformations to improve cache performance.

• Code coverage and multi-event profiler

The code coverage and multi-event profiler tool, available in the Analysis Toolkit for CCS V2.2 User’s Guide (SPRU623), provides two capabilities:

– Code coverage information by identifying source code that was not exercised in a run of the application.

– Profile data for functions over multiple events of interest in a single run of the application.
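The two actions described for simulator analysis above, incrementing a counter on an event or halting execution on it, can be sketched as follows. The class, the event names, and the trace are hypothetical and serve only to illustrate the count-or-halt behavior; they are not the plug-in's interface.

```python
class EventMonitor:
    """Toy monitor: events are either counted or used as halt conditions."""

    def __init__(self):
        self.counters = {}        # event name -> number of occurrences seen
        self.halt_events = set()  # events that stop the run when they occur

    def count(self, event):
        self.counters[event] = 0

    def halt_on(self, event):
        self.halt_events.add(event)

    def run(self, event_trace):
        """Consume events in order; return the step index of a halt, or None."""
        for step, event in enumerate(event_trace):
            if event in self.counters:
                self.counters[event] += 1
            if event in self.halt_events:
                return step
        return None


monitor = EventMonitor()
monitor.count("cache_miss")
monitor.halt_on("bank_conflict")
trace = ["cache_miss", "cache_hit", "cache_miss", "bank_conflict", "cache_miss"]
stopped_at = monitor.run(trace)
```

Halting on an event is the debugging use: execution stops at the point where the condition first occurs, so the state leading up to it can be inspected. Counting is the measurement use and leaves execution undisturbed.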

Copyright 2002, Texas Instruments Incorporated