system-level simulation (hw/sw co-simulation)

58
System-level simulation System-level simulation (HW/SW co-simulation) (HW/SW co-simulation)

Upload: sherri

Post on 19-Jan-2016

97 views

Category:

Documents


7 download

DESCRIPTION

System-level simulation (HW/SW co-simulation). Outline. Problem statement Simulation and embedded system design functional simulation performance simulation POLIS implementation partitioning example implementation simulation software-oriented hardware-centric Summary. analog. digital. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: System-level simulation (HW/SW co-simulation)

System-level simulationSystem-level simulation(HW/SW co-simulation)(HW/SW co-simulation)

Page 2: System-level simulation (HW/SW co-simulation)

OutlineOutline

Problem statementProblem statement Simulation and embedded system designSimulation and embedded system design

functional simulationfunctional simulation performance simulationperformance simulation

POLIS implementationPOLIS implementation partitioning examplepartitioning example

implementation simulationimplementation simulation software-orientedsoftware-oriented hardware-centrichardware-centric

SummarySummary

Page 3: System-level simulation (HW/SW co-simulation)

Embedded Heterogeneous Embedded Heterogeneous SystemSystem

A

Ddigital

downconv

900Mhz70Mhz 10.7Mhz

SH

SAW FILTER

RF IF 40Ms/sec - 540ks/sec

ViterbiEqls.

270.8ks/sec

demodandsync

BB

phone

bookkeypad

intfc

protocolcontrol

de-intl&

decoder

RPE-LTPspeechdecoder

speechquality

enhancement

voicerecognition

Logic

analog digital

phonebookDMA

S/P

DSP core

uC core

RAM & ROM

GSM Phone

Page 4: System-level simulation (HW/SW co-simulation)

Modeling andModeling andSimulation TechniquesSimulation Techniques

execute C code on a workstation

execute object code using

instruction level model instruction cycle

accurate clock cycle accurate

HDL model of the processor RTL gate level netlist

phase accurate model

pin accurate model

fully functional model

spice model of the processor

in-circuit emulators

real hardware

bus functional model

some sort of C code

behavioral HDL code

RTL HDL code discrete event clock cycle

based

gate level netlist

real hardware

some sort of C code

frequency domain model

spice model

real hardware

RF IF

uC

DSPor

ASIC

SourceModels

SmartModels

DesignWare

Hardware Modeler

Page 5: System-level simulation (HW/SW co-simulation)

Problem StatementProblem Statement

To model the behavior of a combined To model the behavior of a combined hardware and software system based on hardware and software system based on models of the behavior of the hardware models of the behavior of the hardware and software componentsand software components

Usually requires trading offUsually requires trading off accuracyaccuracy throughputthroughput convenienceconvenience

using the right abstractions for the taskusing the right abstractions for the task

Page 6: System-level simulation (HW/SW co-simulation)

Function and Interface Function and Interface AbstractionAbstraction

what is going on inside a subsystem

how do two subsystems communicate

How much visibility do I need to have into

function abstraction

interface abstraction

? ?? ?

Page 7: System-level simulation (HW/SW co-simulation)

OutlineOutline

Problem statementProblem statement Simulation and embedded system designSimulation and embedded system design

functional simulationfunctional simulation performance simulationperformance simulation

POLIS implementationPOLIS implementation partitioning examplepartitioning example

implementation simulationimplementation simulation software-orientedsoftware-oriented hardware-centrichardware-centric

SummarySummary

Page 8: System-level simulation (HW/SW co-simulation)

Cycle-based,logic simul.

Design flowDesign flow

Behaviorcapture

Mapping(partitioning)

Architecturecapture

Functionalsimul.

Architecturesimul.

Performancesimul.

Synthesis,coding

Page 9: System-level simulation (HW/SW co-simulation)

for (i=0;i<8;i++) idctrow(i); for (i=0;i<8;i++) idctcol(i);

Functional modelFunctional model

Variable-length dec.

Flowsplitting

IDCT

Motioncomp.

Imagememory

Videooutput

MPEG 2decoder

Stream in Video out

Page 10: System-level simulation (HW/SW co-simulation)

Functional simulationFunctional simulation Timeless (not really co-simulation…)Timeless (not really co-simulation…) Algorithm exploration, functional Algorithm exploration, functional

debugging, virtual prototypingdebugging, virtual prototyping Different formal models: Different formal models:

control-dominated: CSP, EFSMs, DE, etc.control-dominated: CSP, EFSMs, DE, etc.

event-based: Bones, StateCharts, etc.event-based: Bones, StateCharts, etc. data dominated: DF networksdata dominated: DF networks

token-based: Cossap, SPW, etc.token-based: Cossap, SPW, etc. Single process network (Ptolemy-style)Single process network (Ptolemy-style)

Page 11: System-level simulation (HW/SW co-simulation)

HW and SW Modeling HW and SW Modeling TechniquesTechniques

throughput

accu

racy

progmemory

procHW HW

HW

SW&

procHW

HWSW

hardware centrichardware centric

software orientedsoftware oriented

functionalfunctional

Page 12: System-level simulation (HW/SW co-simulation)

Cycle-based,logic simul.

Design flowDesign flow

Behaviorcapture

Mapping(partitioning)

Architecturecapture

Functionalsimul.

Architecturesimul.

Performancesimul.

Synthesis,coding

Page 13: System-level simulation (HW/SW co-simulation)

Architecture modelArchitecture model

AbstractAbstract model for mapping model for mapping no detailed wiring (busses, serial links, etc.)no detailed wiring (busses, serial links, etc.) black-box components (ASICs, micro-black-box components (ASICs, micro-

controllers, DSPs, memories, etc.)controllers, DSPs, memories, etc.) Later refined to a detailed designLater refined to a detailed design

implement communicationimplement communication refine interfacesrefine interfaces

Page 14: System-level simulation (HW/SW co-simulation)

Architecture modelArchitecture model

CPU(10 SPEC)

ASIC(s)(160 Mops)

Memory(2Mb) bus

10Mb/s

Page 15: System-level simulation (HW/SW co-simulation)

Cycle-based,logic simul.

Design flowDesign flow

Behaviorcapture

Mapping(partitioning)

Architecturecapture

Functionalsimul.

Architecturesimul.

Performancesimul.

Synthesis,coding

Page 16: System-level simulation (HW/SW co-simulation)

MappingMapping

Associates functional units with Associates functional units with architectural unitsarchitectural units

Performs HW/SW partitioningPerforms HW/SW partitioning Associates functional communication Associates functional communication

with resources (buffers, busses, serial with resources (buffers, busses, serial links, etc.)links, etc.)

Provides Provides estimatesestimates of performance of a of performance of a given function on a given architectural given function on a given architectural unitunit

Page 17: System-level simulation (HW/SW co-simulation)

MappingMapping

CPU(10 SPEC)

Memory(2Mb) bus

10Mb/s

Flowsplitting

IDCT

Motioncomp.

Imagememory

Videooutput

ASIC(160Mops)

Variable-length dec.

Stream in Video out

Page 18: System-level simulation (HW/SW co-simulation)

MappingMapping

CPU(10 SPEC)

Memory(2Mb) bus

10Mb/s

Flowsplitting

IDCT

Motioncomp.

Imagememory

Videooutput

ASIC(160Mops)

Variable-length dec.

Stream in Video out

Page 19: System-level simulation (HW/SW co-simulation)

MappingMapping

CPU(10 SPEC)

Memory(2Mb) bus

10Mb/s

Flowsplitting

IDCT

Motioncomp.

Imagememory

Videooutput

ASIC(160Mops)

Variable-length dec.

Stream in Video out

Page 20: System-level simulation (HW/SW co-simulation)

MappingMapping

CPU(10 SPEC)

Memory(2Mb) bus

10Mb/s

Flowsplitting

IDCT

Motioncomp.

Imagememory

Videooutput

ASIC(160Mops)

Variable-length dec.

Stream in Video out

Page 21: System-level simulation (HW/SW co-simulation)

Another mappingAnother mapping

CPU(10 SPEC)

Memory(2Mb) bus

10Mb/s

Flowsplitting

IDCT

Motioncomp.

Imagememory

Videooutput

Variable-length dec.

Stream in Video out

ASIC(160Mops)

Page 22: System-level simulation (HW/SW co-simulation)

Performance simulationPerformance simulation

Analyzes performance of behavior on Analyzes performance of behavior on given architecturegiven architecture

Architectural model provides:Architectural model provides: performance estimatesperformance estimates resource constraintsresource constraints

CPU schedulingCPU scheduling bus arbitration policybus arbitration policy (abstract) cache modeling(abstract) cache modeling

Page 23: System-level simulation (HW/SW co-simulation)

OutlineOutline

Problem statementProblem statement Simulation and embedded system designSimulation and embedded system design

functional simulationfunctional simulation performance simulationperformance simulation

POLIS implementationPOLIS implementation partitioning examplepartitioning example

implementation simulationimplementation simulation software-orientedsoftware-oriented hardware-centrichardware-centric

SummarySummary

Page 24: System-level simulation (HW/SW co-simulation)

Performance simulation Performance simulation prototypeprototype

PolisPolis High level specificationHigh level specification Unified model for both hardware and software Unified model for both hardware and software

(CFSMs)(CFSMs) Code generation and estimationCode generation and estimation Hardware synthesisHardware synthesis

Page 25: System-level simulation (HW/SW co-simulation)

The POLIS design flowThe POLIS design flow

Graphical EFSMGraphical EFSM ESTERELESTEREL ................................

CompilersCompilers

CFSMsPartitioningPartitioning

Sw SynthesisSw Synthesis

FormalFormal

VerificationVerification

Sw Code + Sw Code +

RTOSRTOSLogic NetlistLogic NetlistSimulationSimulation

Hw SynthesisHw SynthesisIntfc SynthesisIntfc Synthesis

PrototypePrototype

Page 26: System-level simulation (HW/SW co-simulation)

Performance simulation Performance simulation in POLISin POLIS

Based on synthesized software timing estimatesBased on synthesized software timing estimates Generate C code for both hardware and software Generate C code for both hardware and software

componentscomponents each statement is labeled with the estimated running each statement is labeled with the estimated running

timetime accumulate delays during execution of software accumulate delays during execution of software

componentscomponents use cycle based simulation for hardware componentsuse cycle based simulation for hardware components

Very fastVery fast The system is compiled on the host machineThe system is compiled on the host machine Don’t need a model of the processorDon’t need a model of the processor

Page 27: System-level simulation (HW/SW co-simulation)

Performance estimationPerformance estimation

Current practice: manual guess or detailed Current practice: manual guess or detailed simulationsimulation

Software: need restrictions on coding styleSoftware: need restrictions on coding style still requires lots of user inputs still requires lots of user inputs can be automated if software is synthesized can be automated if software is synthesized

Hardware: need RTL specification or fast Hardware: need RTL specification or fast behavioral synthesisbehavioral synthesis

Communication: need communication Communication: need communication mapping (and refinement)mapping (and refinement)

Page 28: System-level simulation (HW/SW co-simulation)

Performance estimationPerformance estimation

4040

26264141 6363

14

18 9

a := a a := a + 1+ 1

a := 0a := 0

detect(c)detect(c)a<a<

bb

BEGINBEGIN

ENDEND

FF

TT

TT FF

emit(y)emit(y)

L0: if (detect_c) goto L1; else goto end;

L1: if (a<b) goto L3; else goto L2;

L2: a = a + 1; goto L4;

L3: a = 0;

L4: emit(y);

end: clean_up();

Page 29: System-level simulation (HW/SW co-simulation)

CFSM scheduling policyCFSM scheduling policy

Concurrency and shared resourcesConcurrency and shared resources hardware components are concurrenthardware components are concurrent only one software component can be executed at any only one software component can be executed at any

timetime if a software component receives an event,if a software component receives an event,

but the simulated processor is busy, but the simulated processor is busy,

its execution is postponedits execution is postponed need a scheduler to choose among all enabled need a scheduler to choose among all enabled

software processessoftware processes Discrete event simulation with time-stamped Discrete event simulation with time-stamped

events is used to synchronize HW and SW events is used to synchronize HW and SW

Page 30: System-level simulation (HW/SW co-simulation)

Car dashboard exampleCar dashboard example

FRCFRC

TimerTimer

OdoOdo

BeltBelt

SpeedSpeed

RPMRPM CrossdispCrossdisp

CrossdispCrossdisp

CrossdispCrossdispFuelFuel

fuelfuel

key, beltkey, belt

clockclock

wheelwheel

engineengine

fuel_dispfuel_disp

speed_dispspeed_disp

RPM_dispRPM_disp

odo_dispodo_disp

Timing generatorsData processingPWM drivers

Page 31: System-level simulation (HW/SW co-simulation)

Trade-off evaluationTrade-off evaluation

Interactively changed simulation Interactively changed simulation parametersparameters

Define different aspects:Define different aspects: Implementation of each CFSMImplementation of each CFSM CPU and clock frequencyCPU and clock frequency SchedulerScheduler

May be inherited in hierarchyMay be inherited in hierarchy Automatically transmitted to following Automatically transmitted to following

synthesis stepssynthesis steps

Page 32: System-level simulation (HW/SW co-simulation)

Trade-off evaluationTrade-off evaluation

Hw/Sw implementation and partitioningHw/Sw implementation and partitioning meeting timing constraintsmeeting timing constraints trade-offs: speed, code size, chip areatrade-offs: speed, code size, chip area

Scheduling policiesScheduling policies Round RobinRound Robin Pre-emptive and non pre-emptivePre-emptive and non pre-emptive

CPU selectionCPU selection MC 68HC11, MC 68332, MIPS R3000MC 68HC11, MC 68332, MIPS R3000

Don’t need to recompile the systemDon’t need to recompile the system

Page 33: System-level simulation (HW/SW co-simulation)

Trade-off evaluationTrade-off evaluation

Input patternsInput patterns Impulse, clock, randomImpulse, clock, random Sliders, buttonsSliders, buttons Waveforms from fileWaveforms from file

Monitoring the systemMonitoring the system Processor utilization and task scheduling chartsProcessor utilization and task scheduling charts Missed deadlinesMissed deadlines Cost of implementationCost of implementation Internal valuesInternal values

Page 34: System-level simulation (HW/SW co-simulation)

Performance evaluationPerformance evaluation

Targetproc.

MHz Part. Wheel &Engine

Missed

68HC11 4 1 260-8000 0

68HC11 1 1 180-6000 900

68HC11 4 2 50-3000 5000

68332 20 3 50-3000 30

68332 40 3 260-8000 20

Partition 1: HW={Timing unit, PWM drivers) SW={Data Processing}Partition 2: HW={Timing unit} SW={Data processing, PWM drivers}Partition 3: HW={} SW={Timing unit, Data processing, PWM drivers}

Page 35: System-level simulation (HW/SW co-simulation)

Simulation speedSimulation speed

Targetproc.

MHz Part. Graph. cycles/sec

68HC11 4 3 No < 20000

68HC11 4 2 No 650000

68HC11 4 1 No 550000

68HC11 4 2 Yes 100000

Partition 1: HW={Timing unit, PWM drivers) SW={Data Processing}Partition 2: HW={Timing unit} SW={Data processing, PWM drivers}Partition 3: HW={} SW={Timing unit, Data processing, PWM drivers}

Page 36: System-level simulation (HW/SW co-simulation)

Cycle-based,logic simul.

Design flowDesign flow

Behaviorcapture

Mapping(partitioning)

Architecturecapture

Functionalsimul.

Architecturesimul.

Performancesimul.

Synthesis,coding

Page 37: System-level simulation (HW/SW co-simulation)

Architecture refinementArchitecture refinement

CPU(10 SPEC)

ASIC(160 Mops)

Memory(2Mb) bus

10Mb/s

486(66 MHz)

Memory(2Mb) PCI bus

132 Mb/s

ASIC(30 Kgates66 MHz)

Page 38: System-level simulation (HW/SW co-simulation)

Architecture simulationArchitecture simulation

Performed on refined model of the Performed on refined model of the architecture, architecture, ignoring behaviorignoring behavior

Simulation verifies Simulation verifies interface correctnessinterface correctness bus-functional model for processors, with bus-functional model for processors, with

random address and data generationrandom address and data generation

(Logic Modeling, etc.)(Logic Modeling, etc.) cycle-based or logic simulation for the hardware cycle-based or logic simulation for the hardware

interfacesinterfaces Interface refinement simulationInterface refinement simulation

(Rowson et al., Hines et al.)(Rowson et al., Hines et al.)

Page 39: System-level simulation (HW/SW co-simulation)

HW and SW Modeling HW and SW Modeling TechniquesTechniques

throughput

accu

racy

progmemory

procHW HW

HW

SW&

procHW

HWSW

hardware centrichardware centric

software orientedsoftware oriented

functionalfunctional

Page 40: System-level simulation (HW/SW co-simulation)

Bus Functional ModelBus Functional Model

command interpreter command interpreter executing memory bus executing memory bus access cyclesaccess cycles

drives all relevant drives all relevant processor pinsprocessor pins

does not implement does not implement the complete the complete instruction set of the instruction set of the processorprocessor

cmd> WRT val,addrcmd> WRT R2,addrcmd> RD addr,R1cmd> ...

C/C++

HDLlanguage

and/orsimulatorbinding

adrdatardwrt

HW

SW&

procHW

Page 41: System-level simulation (HW/SW co-simulation)

Refined mappingRefined mapping

486(66 MHz)

Memory(2Mb) PCI bus

132 Mb/s

Flowsplitting

IDCT

Motioncomp.

Imagememory

Videooutput

ASIC(30Kgates66 MHz)

Variable-length dec.

Stream in Video out

Page 42: System-level simulation (HW/SW co-simulation)

Cycle-based,logic simul.

Design flowDesign flow

Behaviorcapture

Mapping(partitioning)

Architecturecapture

Functionalsimul.

Architecturesimul.

Performancesimul.

Synthesis,coding

Page 43: System-level simulation (HW/SW co-simulation)

Synthesis and codingSynthesis and coding

Implement functional specification on Implement functional specification on architecturearchitecture

Ideally should be automated Ideally should be automated (e.g. if function for HW is specified as RTL…)(e.g. if function for HW is specified as RTL…)

In practice generally a mix of hand In practice generally a mix of hand design and synthesisdesign and synthesis

Page 44: System-level simulation (HW/SW co-simulation)

OutlineOutline

Problem statementProblem statement Simulation and embedded system designSimulation and embedded system design

functional simulationfunctional simulation performance simulationperformance simulation

POLIS implementationPOLIS implementation partitioning examplepartitioning example

implementation simulationimplementation simulation software-orientedsoftware-oriented hardware-centrichardware-centric

SummarySummary

Page 45: System-level simulation (HW/SW co-simulation)

Implementation simulationImplementation simulation

Necessary because:Necessary because: need detailed performance analysisneed detailed performance analysis hand design is error-pronehand design is error-prone

Most errors should have been caught Most errors should have been caught beforebefore

How to use test vectors and compare How to use test vectors and compare results between results between different abstraction different abstraction levels levels ??

Page 46: System-level simulation (HW/SW co-simulation)

HW and SW Modeling HW and SW Modeling TechniquesTechniques

throughput

accu

racy

progmemory

procHW HW

HW

SW&

procHW

HWSW

hardware centrichardware centric

software orientedsoftware oriented

functionalfunctional

Page 47: System-level simulation (HW/SW co-simulation)

Software OrientedSoftware Oriented

software software executing on aexecuting on amodel of the model of the processorprocessor combine program and data memory with

the processor into a single model

implement instruction fetch, decode, andexecution cycle in opcode interpreter in software

HW

SW&

procHW

Page 48: System-level simulation (HW/SW co-simulation)

Clock Cycle AccurateClock Cycle AccurateInstruction Set Architecture ModelInstruction Set Architecture Model

clock cycle accurate

ISA model

C/C++

HDLlanguage

and/orsimulatorbinding

model derived/generated fromthe instruction set description

ADD R1,R2,R3

{ A1.OP = 0x01

A1.RA = R1

A1.RB = R2

... }

{ R3<- ADD(R2,R1) }

ADD R1,R2,R3

{ A1.OP = 0x01

A1.RA = R1

A1.RB = R2

... }

{ R3<- ADD(R2,R1) }

HW

SW&

procHW

Page 49: System-level simulation (HW/SW co-simulation)

Clock Cycle AccurateClock Cycle AccurateInstruction Set Architecture ModelInstruction Set Architecture Model

hardware devices hardware devices hanginghangingoff the memory busoff the memory bus memory read and write memory read and write

cycles are executed cycles are executed when specific memory when specific memory addresses are being addresses are being referencedreferenced

clock cycle accurate

ISA model

C/C++

HDLlanguage

and/orsimulatorbinding

adrdatardwrt

HW

SW&

procHW

Page 50: System-level simulation (HW/SW co-simulation)

Instruction Cycle AccurateInstruction Cycle AccurateInstruction Set Architecture ModelInstruction Set Architecture Model

same basic idea as same basic idea as Clock Cycle Clock Cycle Accurate ISA ModelAccurate ISA Model processor status can only be observed at processor status can only be observed at

instruction cycle boundariesinstruction cycle boundaries usually no notion of processor pins, usually no notion of processor pins,

besides serial and parallel portsbesides serial and parallel ports can be enhanced to drive all pins, can be enhanced to drive all pins,

including the memory interfaceincluding the memory interface Clock Cycle Accurate ISA ModelClock Cycle Accurate ISA Model

HW

SW&

procHW

Page 51: System-level simulation (HW/SW co-simulation)

Virtual ProcessorVirtual Processor

C/C++ code is compiled and executed C/C++ code is compiled and executed on a workstationon a workstation

accuracy of numeric results may be an accuracy of numeric results may be an issueissue

Hardware devices hanging off the Hardware devices hanging off the memory busmemory bus

additional function calls to execute additional function calls to execute memory read and write cycles when memory read and write cycles when specific memory addresses are being specific memory addresses are being referenced can be inserted in the original referenced can be inserted in the original softwaresoftware

need estimation and synchronization with need estimation and synchronization with hardware simulation to model software hardware simulation to model software timingtiming

HW

SW&

procHW

Page 52: System-level simulation (HW/SW co-simulation)

HW and SW Modeling HW and SW Modeling TechniquesTechniques

throughput

accu

racy

progmemory

procHW HW

hardware centrichardware centric

HW

SW&

procHW

software orientedsoftware oriented

HWSW

functionalfunctional

Page 53: System-level simulation (HW/SW co-simulation)

Hardware-CentricHardware-Centric

Possible descriptions of the Possible descriptions of the hardwarehardware

actual hardwareactual hardware gate level netlistgate level netlist clock cycle accurate architectural clock cycle accurate architectural

descriptiondescription

hardware executing a softwareimage stored in memory

progmemory

procHW HW

Page 54: System-level simulation (HW/SW co-simulation)

progmemory

procHW HWActual HardwareActual Hardware

processor is plugged into a device processor is plugged into a device connected to a workstationconnected to a workstation

discrete event simulator just sees discrete event simulator just sees another componentanother component

software image is loaded into memory software image is loaded into memory and clock is appliedand clock is applied

workstation running discrete event simulator

XPLORERMS-3X00

ModelSource

XPLORERMS-3X00

ModelSource

Hardware Modeler

Page 55: System-level simulation (HW/SW co-simulation)

progmemory

procHW HWGate Level NetlistGate Level Netlist

exact timing of the signals at all input exact timing of the signals at all input and output pins and output pins

full visibility of all internal signals and full visibility of all internal signals and storagestorage

gate level netlist executed in a discreteevent verification environment

Page 56: System-level simulation (HW/SW co-simulation)

Clock Cycle-AccurateClock Cycle-AccurateArchitectural DescriptionArchitectural Description

model delivers at each clock model delivers at each clock edgeedgea set of a set of stablestable output signals output signals givengivena set of a set of stablestable input signals input signals

visibility of internal signalsvisibility of internal signalsand storage depends onand storage depends ondegree of model refinementdegree of model refinement

architectural description executing ina cycle based verification environment

progmemory

procHW HW

Page 57: System-level simulation (HW/SW co-simulation)

SummarySummary

Functional modelVERIFY

FUNCTIONALITY

Implementation modelVERIFY

ABSTRACTIONS

Performance modelVERIFY

PERFORMANCE

Architecture modelVERIFY

INTERFACES

Page 58: System-level simulation (HW/SW co-simulation)

ConclusionsConclusions

Abstraction key to speedupAbstraction key to speedup Separate behavior (functionality), Separate behavior (functionality),

communication and timingcommunication and timing Architecture refinement essential for Architecture refinement essential for

true co-design and fast co-simulationtrue co-design and fast co-simulation Software and hardware synthesis helps Software and hardware synthesis helps

performance estimationperformance estimation Rapid prototyping may be the ultimate Rapid prototyping may be the ultimate

co-simulation tool...co-simulation tool...