Cooperative Computing & Communication Laboratory
QEMU/SystemC Cosimulation at Different Abstraction Levels
1st International QEMU Users Forum (QUF’11), March 18th, 2011
Markus Becker, Henning Zabel, Wolfgang Müller
University of Paderborn/C-LAB
Today’s Embedded Software Complexity
Highly complex platforms
• Multi-core with pipelines & branch prediction
• Shared memories & hierarchical caches
• Buses & networks-on-chip
Modern real-time operating systems & compilers
• Preemptive multitasking
• Virtual memory
• Code optimization
Early virtual platforms
• Software development
• Performance estimation
• Real-time verification
2© 2011 Siemens AG und Universität Paderborn QUF‘11 / M. Becker
System Level & Transaction Level Methodology
[Figure: modeling graph with a Computation axis and a Communication axis, each graded untimed, approx. timed and cycle timed. Models shown:]
A. Specification
B. Component assembly
C. Bus arbitration
D. Bus functional
E. Cycle accurate computation
F. Implementation
System Level RTOS Modeling: State of the Art
HW/SW cosimulation: HDL and cycle-accurate ISS
• Cycle-accurate timing
• Infeasible for early investigations of complex systems
Abstract system level RTOS models in SystemC
• Native speed
• Sufficient timing accuracy
• All source code must be available!
Advanced emulation (virtual prototypes)
• Efficient target binary execution
• Instruction-accurate
Outline
• SystemC RTOS Modeling
• QEMU/SystemC Cosimulation Environment
• QEMU Cycle-Approximate Time Estimation
SystemC Abstract RTOS Modeling
Instruction-accurate software
• Application tasks
• Actual RTOS kernel
• Device drivers & comm. stacks
• Executed as target instructions on an Instruction Set Simulator (ISS), e.g. movw r22, r28; movw r24, r18; call 0xa2 <mod>; movw r18, r28; sbiw r24, 0x00; brne .-16
RTOS abstraction
• Application: native functional segments with time annotations
• RTOS model provides
  • Scheduling policies
  • Context switching
  • Canonic API/standard APIs
[Figure: tasks T1, T2, …, Tn executed as instructions on an RTOS and ISS, compared with time-annotated segments of T1, T2, …, Tn on an abstract RTOS model (resource synchronization, scheduling) in SystemC/SpecC.]
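The idea of native functional segments with time annotations can be sketched in plain C++ without any SystemC dependency. This is only an illustration under stated assumptions: the names `Segment`, `Task` and `AbstractRtos`, the cycle unit, and the simple fixed-priority policy (rescheduling only at segment boundaries, all tasks ready at time zero) are hypothetical and are not taken from the authors' RTOS model.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One native functional segment plus its annotated execution time.
struct Segment {
    std::function<void()> body;  // executes at host speed
    std::uint64_t cycles;        // annotated target time (hypothetical unit)
};

// A task is a sequence of time-annotated segments.
struct Task {
    Task(std::string n, int p, std::vector<Segment> s)
        : name(std::move(n)), priority(p), segments(std::move(s)) {}
    std::string name;
    int priority;                  // assumption: larger value = more urgent
    std::vector<Segment> segments;
    std::size_t next = 0;          // index of the next segment to run
    std::uint64_t finish_time = 0; // simulated time at which the task ended
    bool done() const { return next == segments.size(); }
};

// Hypothetical abstract RTOS: fixed-priority scheduling, decisions are
// taken only at segment boundaries, all tasks are ready at time zero.
class AbstractRtos {
public:
    void add(Task t) { tasks_.push_back(std::move(t)); }

    // Runs all tasks to completion and returns the final simulated time.
    std::uint64_t run() {
        for (;;) {
            Task* pick = nullptr;
            for (Task& t : tasks_) {
                if (!t.done() && (!pick || t.priority > pick->priority)) {
                    pick = &t;
                }
            }
            if (!pick) return now_;            // nothing left to schedule
            Segment& s = pick->segments[pick->next++];
            s.body();                          // native functional behaviour
            now_ += s.cycles;                  // time annotation
            if (pick->done()) pick->finish_time = now_;
        }
    }

    const std::vector<Task>& tasks() const { return tasks_; }

private:
    std::vector<Task> tasks_;
    std::uint64_t now_ = 0;
};
```

The functional code runs natively while simulated time advances only through the annotations, which is why such models reach native speed but need the task source code.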
SystemC Abstract RTOS Modeling (cont’d)
Tasks and Interrupt Service Routines (ISR)
• Modeled/wrapped by SystemC threads
Derive SystemC modules from RTOS Module
• Base class to provide RTOS modeling capabilities
Module class provides primitives for
• Synchronization of functional segments
• Time annotation: consume(t)
Context class synchronizes local task/ISR time with global SystemC time
Context class corresponds to a simulated CPU
[Figure: SystemC threads wrapping RTOS tasks/ISRs, RTOS modules, RTOS context (ISR scheduler, task scheduler) and CPU, connected by 1:n, 1:n and 1:1 relations.]
QEMU Emulator
QEMU open source emulator
• Dynamic binary translation based CPU emulation
• PowerPC, ARM, MIPS, etc.
Full system emulation
• Complete target software stack
• OS and device drivers
• Memory Management Unit (MMU)
• I/O & peripherals
User mode emulation
• Target-compiled application task
• Unprivileged CPU instructions only
• Trap system calls
[Figure: full system emulation (application, OS & drivers, CPU, memory & I/O inside a host process) compared with user mode emulation (application task in user mode on a user mode CPU inside a host process).]
QEMU/SystemC Cosimulation Environment
[Figure: RTOS contexts with task scheduler and memories & I/O; SC_THREADs act as QEMU task wrappers (Task.elf) and native task wrappers.]
QEMU/SystemC Cosimulation Environment
[Figure: detail of the QEMU task wrapper (SC_THREAD) inside an RTOS context. Components: QEMU user mode emulator (Task.elf), syscall translator, TLM transactor (TLM ops), exec time estimator. Calls shown: execute(), syscall(), consume(), delay.]
void QEMUTaskWrapper::Thread() {
    do {
        wait();  // For task activation
        while (!END_OF_TASK) {
            // Run the task binary until the emulator returns with an exception
            switch (QEMU->execute()) {
            case SYSCALL_EXCEPTION:
                // Consume the estimated execution time before the system call
                RTOS->consume(ESTIMATOR->delay);
                // Forward the system call to the RTOS model
                TRANSLATOR->syscall(&QEMU->env);
                break;
            ...
            }
        }
    } while (true);
}
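The control flow of the wrapper thread can be tried out without QEMU or SystemC by replacing both with small mocks. The sketch below is an assumption-laden illustration: `MockEmulator`, `MockRtos`, `ExitReason` and `run_activation` are hypothetical names, the emulator is a scripted list of steps, and the handling of the end-of-task case is a guess, since the slide elides the remaining switch cases.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Why the emulator returned control to the wrapper (hypothetical enum).
enum class ExitReason { Syscall, EndOfTask };

// Scripted stand-in for the user mode emulator: each execute() call
// "runs" until the next system call or the end of the task.
struct MockEmulator {
    struct Step {
        std::uint64_t cycles;  // estimated delay of the executed code
        ExitReason reason;     // why execution stopped
    };
    std::vector<Step> script;
    std::size_t pos = 0;
    std::uint64_t last_delay = 0;  // plays the role of ESTIMATOR->delay

    ExitReason execute() {
        const Step& s = script.at(pos++);
        last_delay = s.cycles;
        return s.reason;
    }
};

// Stand-in for the RTOS model: consume() advances simulated time.
struct MockRtos {
    std::uint64_t time = 0;
    int syscalls = 0;
    void consume(std::uint64_t t) { time += t; }
    void syscall() { ++syscalls; }
};

// One task activation: the inner while loop of the wrapper thread.
void run_activation(MockEmulator& emu, MockRtos& rtos) {
    bool end_of_task = false;
    while (!end_of_task) {
        switch (emu.execute()) {
        case ExitReason::Syscall:
            rtos.consume(emu.last_delay);  // account for time before the call
            rtos.syscall();                // hand the call to the RTOS model
            break;
        case ExitReason::EndOfTask:        // assumed handling, not on the slide
            rtos.consume(emu.last_delay);
            end_of_task = true;
            break;
        }
    }
}
```

Consuming the delay before translating the system call keeps the order of time advance and RTOS interaction the same as in the slide's loop.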
QEMU/SystemC Multilevel Cosimulation
RTOS software refinement across three levels
• Native Task Level: SW tasks on an RTOS model
• Binary Task Level: QEMU in user mode (user emu.) on an RTOS model
• CPU Level: QEMU in system mode (CPU emu.) with drivers, comm. stacks and RTOS kernel
[Figure: the three levels side by side, each connected to I/O, with CPU and network.]
Simulation Overhead
Example application
• PowerPC 405
• ORCOS real-time operating system
• Two RTOS tasks synchronized via kernel signals

Cosim. Level         Sim. Time
Native Task Level    5.6 s
Mixed Task Set       7.6 s
Binary Task Level    9.2 s
CPU Level            51.6 s
CPU Level Cosim      1472.2 s

Binary task level uses QEMU in user mode; CPU level uses QEMU in full system mode.
[Figure: configurations ordered from native simulation to target emulation: Task1 and Task2 on an RTOS model, and Task1 and Task2 on the RTOS kernel.]
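To compare the levels, the simulation times listed above can be normalized to the native task level. The helper below is a small sketch; the struct and function names are hypothetical, and only the five measured values come from the slide.

```cpp
#include <string>
#include <vector>

struct Measurement {
    std::string level;  // cosimulation level
    double seconds;     // measured simulation time
};

// Simulation times as given on the slide.
inline std::vector<Measurement> measurements() {
    return {{"Native Task Level", 5.6},
            {"Mixed Task Set", 7.6},
            {"Binary Task Level", 9.2},
            {"CPU Level", 51.6},
            {"CPU Level Cosim", 1472.2}};
}

// Slowdown of every entry relative to the first entry.
inline std::vector<double> slowdown(const std::vector<Measurement>& m) {
    std::vector<double> ratios;
    for (const Measurement& x : m) {
        ratios.push_back(x.seconds / m.front().seconds);
    }
    return ratios;
}
```

With these numbers, binary task level simulation is about 1.6 times slower than native task level simulation, while CPU level cosimulation is roughly 260 times slower.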
Time Annotated Basic Blocks
Static WCET/BCET annotation error: $T_{error} \le T_{max\_dynamic}/2$
[Figure: basic block delay composed of $T_{static}$ and $T_{max\_dynamic}$; the real BB delay lies between BB BCET and BB WCET.]
Basic Block Delay Estimation (QEMU)
Cycle-approximate delay estimation
• Integrates with QEMU’s dynamic binary translator
• No explicit micro-architecture CPU model
Two-phase approach
• Basic block translation (static analysis)
  • Accumulate static instruction delays
  • Annotate translated blocks with delay accumulation
  • Instrument translated blocks with dynamic estimation code
• Translated block execution (dynamic estimation)
  • Execute instrumented translated blocks
  • Accumulate dynamic delays
$T_{TotalAnnotation} = T_{StaticAnalysis} + T_{DynamicEstimation}$
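The two phases can be illustrated with a toy estimator that is independent of QEMU. Everything here is an assumption made for the sketch: the `Instr` record, the per-instruction cycle counts, and the choice of a taken-branch penalty as the only dynamic effect. The slides do not say which dynamic effects the actual QEMU extension models.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical instruction record; cycle values are illustrative only.
struct Instr {
    std::uint32_t static_cycles;  // latency known at translation time
    bool is_cond_branch;          // may cause a run-time dependent penalty
};

// Result of phase 1: a translated block annotated with its static delay.
struct TranslatedBlock {
    std::uint32_t static_delay = 0;  // accumulated static instruction delays
    bool needs_dynamic = false;      // block was instrumented for phase 2
};

// Assumed penalty for a taken conditional branch (not from the slides).
constexpr std::uint32_t kTakenBranchPenalty = 2;

// Phase 1, at translation time: accumulate static delays and decide
// whether the block needs dynamic estimation code.
TranslatedBlock translate(const std::vector<Instr>& basic_block) {
    TranslatedBlock tb;
    for (const Instr& i : basic_block) {
        tb.static_delay += i.static_cycles;
        tb.needs_dynamic = tb.needs_dynamic || i.is_cond_branch;
    }
    return tb;
}

// Phase 2, at execution time: add the annotation and any dynamic delay.
class Estimator {
public:
    void on_execute(const TranslatedBlock& tb, bool branch_taken) {
        static_total_ += tb.static_delay;
        if (tb.needs_dynamic && branch_taken) {
            dynamic_total_ += kTakenBranchPenalty;
        }
    }
    std::uint64_t static_total() const { return static_total_; }
    std::uint64_t dynamic_total() const { return dynamic_total_; }
    // Total annotation = static analysis part + dynamic estimation part.
    std::uint64_t total() const { return static_total_ + dynamic_total_; }

private:
    std::uint64_t static_total_ = 0;
    std::uint64_t dynamic_total_ = 0;
};
```

Because the static sum is computed once per translated block and reused on every execution, only the dynamic part costs extra work at run time.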
Execution Time Estimation Accuracy
[Figure: chart of CPU KCycles (axis from 0 to 1.600.000) comparing BB-BCET, Estimation (QEMU), Real (Logic Analyzer) and BB-WCET.]
Execution Time Estimation Accuracy (cont’d)
[Figure: chart of the deviation (%) of estimation vs. real; axis from -15 to 10.]
Execution Time Estimation Overhead
[Figure: chart of simulation time (s), axis from 0 to 4, comparing untimed QEMU @ Intel P4 3 GHz, QEMU w/ time @ Intel P4 3 GHz, and PowerPC board @ 300 MHz.]
RTOS Modeling and TLM Methodology?
Synchronization Scheme
Data dependency awareness (causality-true)
• System calls
• Shared variables
• I/O
Dynamic software segment
• Comprises TB execution between consecutive interaction points
Interaction points
• Consume accumulated execution time (interruptible wait statement)
[Figure: timing granularities from cycle, instruction, basic block (tBB) and TB (tTB) up to dynamic software segment and task (tTask).]
TB = Translated Block
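A dynamic software segment can be pictured as a counter that collects translated block delays and is flushed at each interaction point. The sketch below makes that concrete with hypothetical names (`RtosModel`, `SegmentTimer`); the plain addition inside `consume` stands in for what would be an interruptible wait in the SystemC model.

```cpp
#include <cstdint>

// Stand-in for the RTOS model. In the SystemC environment consume()
// would be an interruptible wait; here it only advances a counter.
struct RtosModel {
    std::uint64_t global_time = 0;
    int waits = 0;  // number of synchronizations with the kernel
    void consume(std::uint64_t t) {
        global_time += t;
        ++waits;
    }
};

// Accumulates TB delays locally and synchronizes only at interaction
// points (system calls, shared variable accesses, I/O).
class SegmentTimer {
public:
    explicit SegmentTimer(RtosModel& rtos) : rtos_(rtos) {}

    // Called after each translated block: no synchronization yet.
    void on_tb_executed(std::uint64_t t_tb) { pending_ += t_tb; }

    // Called at an interaction point: ends the dynamic software segment
    // and consumes its accumulated execution time in one step.
    void on_interaction_point() {
        if (pending_ == 0) return;
        rtos_.consume(pending_);
        pending_ = 0;
    }

    std::uint64_t pending() const { return pending_; }

private:
    RtosModel& rtos_;
    std::uint64_t pending_ = 0;
};
```

Synchronizing once per segment instead of once per translated block keeps the number of kernel interactions low while simulated time is still up to date wherever data dependencies can occur.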
Conclusion
Abstract RTOS simulation and QEMU emulation
• Early performance estimation
• Fast RTOS verification
Fast simulation
• Speed-up through dynamic binary translation w.r.t. interpretive ISS
• Speed-up through native RTOS kernel & driver abstraction
Fast cycle-approximate time accuracy through QEMU extension
Flexibility by means of mixing native and binary task levels
Open issues & future work
• RTOS modeling TLM methodology
• Efficient cache & multicore modeling
Research Outlook (1/2)
• Motivation
• Transaction-level models
• RTOS-aware refinement flow
• Conclusion
• Research outlook
Dynamic Binary Translation (DBT)
Target instruction set emulation through host code
• Static pre-compilation of functionally equivalent host code snippets
• Dynamic translation of linear Basic Blocks (BB) at runtime
  • Concatenate code snippets until branch instruction
  • Store Translated Blocks (TB) in translation cache
Main loop
• Translate BB if program counter (PC) value is unknown
• Otherwise, chain TBs directly from cache
[Figure: flowchart with Fetch, Decode, Branch?, Known PC?, Execute, TB generation, TB cache and host code snippets.]
[Adapted from: M. Gligor et al. - Using binary translation in event driven simulation for fast and flexible MPSoC simulation, CODES+ISSS’09, Grenoble, France]
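The main loop described above amounts to a cache lookup keyed by the guest program counter. The following sketch shows that structure only; it is not QEMU code. `Translator`, `TranslatedBlock` and the callback-based "host code" are hypothetical, and real TB chaining (patching direct jumps between blocks) is simplified here to a repeated cache lookup.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using Pc = std::uint32_t;  // guest program counter

// A translated block: host code that runs one guest basic block and
// returns the guest PC of the block to execute next.
struct TranslatedBlock {
    std::function<Pc()> run;
};

class Translator {
public:
    using TranslateFn = std::function<TranslatedBlock(Pc)>;

    explicit Translator(TranslateFn fn) : translate_(std::move(fn)) {}

    // Executes blocks starting at pc until halt_pc is reached and
    // returns the number of executed blocks.
    int run(Pc pc, Pc halt_pc) {
        int executed = 0;
        while (pc != halt_pc) {
            auto it = cache_.find(pc);
            if (it == cache_.end()) {
                // Unknown PC: translate the basic block and store the TB.
                it = cache_.emplace(pc, translate_(pc)).first;
                ++translations_;
            }
            // Known PC: reuse the TB from the translation cache.
            pc = it->second.run();
            ++executed;
        }
        return executed;
    }

    int translations() const { return translations_; }

private:
    TranslateFn translate_;
    std::unordered_map<Pc, TranslatedBlock> cache_;
    int translations_ = 0;
};
```

A guest loop that executes the same block many times is translated once and then served from the cache, which is where the speed-up over an interpretive ISS comes from.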
QEMU/SystemC Co-Simulation Levels (cont’d)
Fully native RTOS model in SystemC
• Early and fast verification through native simulation
Mixed native/emulated user space
• Flexibility in case of limited source code availability
User space emulation
• RTOS kernel & device driver abstraction
• Abstracts from register-accurate I/O
Co-simulation of full system emulator and SystemC
• Verification of actual RTOS and device drivers
• Final target firmware verification
[Figure: configurations ordered from native simulation to target emulation: Task1 and Task2 on an RTOS model, and Task1 and Task2 on the RTOS kernel.]
SystemC
System Level Design Language (IEEE standard)
C++ class and macro library
• Modules
• Ports
• Interfaces
• Channels
Cooperative event-based simulation kernel
• Process types: SC_METHOD, SC_THREAD
• wait() for event or time
Abstraction levels
• Register Transfer Level
• Transaction Level Modeling (TLM) support