model-based timing analysis and deployment optimization
TRANSCRIPT
![Page 1: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/1.jpg)
ITEA3 - 17003
Lukas Krawczyk, Mahmoud Bazzal, Ram Prasath Govindarajan, and Carsten Wolff
Institute for the Digital Transformation of Application and Living DomainsDortmund University of Applied Sciences and Arts
44227 Dortmund, [email protected]
1
Model-based Timing Analysis and Deployment Optimization for Heterogeneous Multi-Core Systems using Eclipse APP4MC
![Page 2: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/2.jpg)
ITEA3 - 17003
Agenda
§ Introduction§ Motivational example§ APP4MC§ Integration§ End-to-End reaction latency analysis§ Optimization§ Case Study§ Conclusion and outlook
2
![Page 3: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/3.jpg)
ITEA3 - 17003
Introduction
§ Increasing demands on automotive computing platforms driven by new automotive functionalities
§ Set of about 80 ECUs in todays cars will be reduced to about 10 high-performance units
§ Centralized computing platforms consisting of sophisticated heterogeneous accelerators
3
Source: Bosch
![Page 4: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/4.jpg)
ITEA3 - 17003
Introduction
§ Heterogeneous Hardware§ Different models of computation§ Variety processing-unit specific scheduling strategies
§ Heterogeneous functional Domains§ Mixed levels of criticality§ Freedom from interference
§ COTS Hardware§ Limited to no capabilities for adjusting hardware
4
![Page 5: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/5.jpg)
ITEA3 - 17003
Contributions
§ Model-based approach for deploying software to heterogeneous hardware while ensuring end-to-end reaction latency constraints.§ Applicable in early design phases§ Response Time Analysis§ Design Space Exploration using Genetic Algorithm§ Industrial Case Study on an heterogeneous COTS hardware platform
5
![Page 6: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/6.jpg)
ITEA3 - 17003
Motivational Example - Hardware
Specification• NVIDIA Pascal™ Architecture
GPU• GPU - NVIDIA Pascal™,256
CUDA cores• CPU - HMP Dual Denver 2/2
MB L2 +Quad ARM® A57/2 MB L2
• Memory - 8 GB 128 bit LPDDR4
6
Architecture of Jetson TX2
![Page 7: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/7.jpg)
ITEA3 - 170037
Motivational Example - Software
Task chainsLidar è Localization è EKF è Planner è DASM
CAN è Localization è EKF è Planner è DASM
SFM è Planner è DASM
Lane_detection è Planner è DASM
Detection è Planner è DASM
![Page 8: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/8.jpg)
ITEA3 - 170038
Motivational Example – Self suspension
CPU(Denver/ARM) GPU
Execution
Suspension Execution Eng.
Execution
Host to accelerator offloading
return
![Page 9: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/9.jpg)
ITEA3 - 17003
APP4MC
9
AMALTHEA
TraceModel
AMALTHEA
SystemModel
SW ModelingInitial model of thesoftware
PartitioningIdentification of initialtasks
Optimization- Task distribution- Memory mapping
Simulation
Software Execution
System Modeling- Hardware- Constraints
![Page 10: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/10.jpg)
ITEA3 - 17003
APP4MC
10
SW Model HW Model
![Page 11: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/11.jpg)
ITEA3 - 17003
Integrated Approach
Amalthea model is updated with mapping information.
11
Amalthea model is constructed out of system design information.
APP4MC is used to implement the real-time analysis and deployment optimization approach.
![Page 12: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/12.jpg)
ITEA3 - 17003
End-to-End Reaction Latency Analysis
Implicit LET (Logical Execution Time)
12
§ LET communication è deterministic data propagation points§ Implicit communication è shorter end-to-end reaction latency
![Page 13: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/13.jpg)
ITEA3 - 17003
Optimization - goals§ Evaluation of solutions is computationally complex§ Multi-phased optimization strategy
13
Utilization
Deadline miss
End to end reaction latency
Total Utilization of any core < 1.0
Worst Case Response Time of any task < period of that task
Minimize the Critical path (longest data propagation
path)
![Page 14: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/14.jpg)
ITEA3 - 17003
Optimization – goals§ Utilization
∀𝒫# ∈ 𝒫𝒰, '()
𝑈+ ≤ 1.0
§ No deadline miss (Worst case response time)∀𝜏+ ∈ 𝑇, ℛ+3 ≤ 𝑃+
§ Worst case end to end latencymax89:
(𝐿=>(𝜎))
14
![Page 15: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/15.jpg)
ITEA3 - 17003
Optimization – degrees of freedom
§ Allocation target§ The processing unit executing a task.
§ Priority§ Higher priority tasks will preempt lower priority tasks
§ Accelerator target § Defines the target which should execute the accelerable content of an
executable§ Time slice
§ Amount of time given periodically to execute a task on the accelerator
15
![Page 16: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/16.jpg)
ITEA3 - 17003
Optimization – encoding
§ Allocation target§ All processing units except accelerators§ Offloadable tasks can also be allocated
§ Priority§ Number of priorities is the number of tasks executing
on a core. § Priorities are unique.
§ Accelerator target § For offloadable tasks only the runnable to be
accelerated is offloaded.§ Time slice
§ Only valid for an accelerator bound executable.16
![Page 17: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/17.jpg)
ITEA3 - 17003
Case Study – Configuration
§ GA configuration:§ 500 initial population§ Mutation rate of 5%
§ Termination criteria:§ Implicit communications: 2000 generations of steady fitness values.§ LET: first feasible solution (no deadline miss)
§ Hardware configuration§ Intel Core i5-3570K quad-core CPU @ 3.4 GHz
17
![Page 18: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/18.jpg)
ITEA3 - 17003
Case Study – run time
18
• LET• Feasible solution after ~3s
(avg.)• Implicit
• Feasible solution after ~3s (avg.)
• Implicit optimized• Optimal solution after ~6s
(avg.)• Similar runtime to other
approaches for the same case study.
![Page 19: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/19.jpg)
ITEA3 - 17003
Case Study – results
19
![Page 20: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/20.jpg)
ITEA3 - 17003
Conclusion and Outlook
§ A combined § Genetic Algorithm based Design Space Exploration approach § Response Time Analysis for heterogeneous hardware applying RMS and WRR
scheduling§ Fully integrated into App4MC
§ Results demonstrate the applicability of our approach for industrial problems with similar run-times as other approaches while delivering better bounds.
§ Future work§ Validate Results on real hardware§ Evaluate performance on larger problems§ Consider further blocking factors
20
![Page 21: Model-based Timing Analysis and Deployment Optimization](https://reader033.vdocument.in/reader033/viewer/2022060302/62941ef374fa37505d6a9bdf/html5/thumbnails/21.jpg)
ITEA3 - 17003
Acknowledgements
The research leading to these results has received funding from the Federal Ministry for Education and Research (BMBF) under Grant 01IS18047D in the context of the ITEA3 EU-Project PANORAMA.
https://[email protected]
21