make life easier for embedded software engineers facing … · 1 31/01/2020 make life easier for...
TRANSCRIPT
![Page 1: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/1.jpg)
1
31/01/2020
Make Life Easier for Embedded Software EngineersFacing Complex Hardware Architectures
R. Leconte, E. Jenn, G. Bois, H. Guérard
IRT Saint Exupéry, Space Codesign, Thales AVS
![Page 2: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/2.jpg)
2
AGENDA
• SpaceStudio and deployment to hybrid HW/SW platform
• TOAST (Tool for OpenMP Annotations to Space design Translation) and its purpose
• OpenMP Offloading
• Example on a LineDetection algorithm
![Page 3: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/3.jpg)
3
Sequential C code
Parallel SW
architecture on
Hybrid platform
(CPU + FPGA)SpaceModules
• Workflow based on an existing tool: SpaceStudio to exposeparallelism• Encapsulate logic inside SpaceModules
![Page 4: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/4.jpg)
4
SpaceModules
• Use OpenMP to expose parallelism• Then TOAST generates SpaceModules
• Finally SpaceStudio deploys to the target platform
OpenMP annotated C codeSequential C code
Parallel SW
architecture on
Hybrid platform
(CPU + FPGA)
OpenMP
Offloading
![Page 5: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/5.jpg)
5
• OpenMP allows to offload computations to an accelerator:• GPU, FPGA, etc.
SpaceModulesSequential C code OpenMP annotated C code
int a = 2;
int b = 12;
b = a + 2;
printf("b = %d\n", b);
// b = 4
Parallel SW
architecture on
Hybrid platform
(CPU + FPGA)
![Page 6: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/6.jpg)
6
• OpenMP allows to offload computations to an accelerator• GPU, FPGA, etc.
SpaceModulesSequential C code OpenMP annotated C code
int a = 2;
int b = 12;
#pragma omp target map(from:b)
{
b = a + 2;
}
printf("b = %d\n", b);
// b = 4
Parallel SW
architecture on
Hybrid platform
(CPU + FPGA)
![Page 7: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/7.jpg)
7
int a = 2;
int b = 12;
#pragma omp target map(from:b)
{
b = a + 2;
}
printf("b = %d\n", b);
// b = 4
• OpenMP allows to offload computations to an accelerator• GPU, FPGA, etc.
SpaceModulesSequential C code OpenMP annotated C code
FPGA
Parallel SW
architecture on
Hybrid platform
(CPU + FPGA)
CPU
![Page 8: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/8.jpg)
8
Template code
Host code
Communications
Host code
Host.cpp
Template code
Communications
Offloaded code
Communications
Target0.cpp
Compilation
High Level
Synthesis
CPU
int a = 2;
int b = 12;
#pragma omp target map(from:b)
{
b = a + 2;
}
printf("b = %d\n", b);
// b = 4
• OpenMP allows to offload computations to an accelerator• GPU, FPGA, etc.
SpaceModulesSequential C code OpenMP annotated C code
FPGA
Parallel SW
architecture on
Hybrid platform
(CPU + FPGA)
![Page 9: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/9.jpg)
9
#pragma omp target
{
}
Code to
offload
Communications
Communications
Template code
Communications
Host Code
Host Code
Offloaded Code
Host Code
Host Code
OpenMP Source Code with Offloading
Host Source Code
TargetSource Code
Template code
Host.cpp
Target0.cpp
• TOAST converts OpenMP annotated code into C/C++ code using SpaceStudio Communication APIs
![Page 10: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/10.jpg)
10
Project
C
Host
SpaceModule
Template
C
Host code
SpaceModule
CTarget
SpaceModule
Template
C
Target code
SpaceModule
Apply
Replacements
+
TOASTTool for OpenMP Annotations
to Space design Translation
C
code with
Offloading
Match AST
Nodes
Call Match
CallbackHost Source
Replacements
C
Modified
Host code
Target Source
elements
Add Template
code
Instantiate Template
Clang LibTooling
AST Matcher Match
Callback
TOAST Architecture
![Page 11: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/11.jpg)
11
Edge
Detection
Line
DetectionFiltering
Input Image Detected Lines
Edge
Detection
Line
Detection
Input Image Detected Lines
Send
Images
Receive
and
Display
Lines
Receive
Images
Send
Lines
Z-turnFiltering
FPGA
Cortex-A9Communi-
cations
Line Detection and test bench
NetworkNetwork
LineDetection Use Case
![Page 12: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/12.jpg)
12
Slower because of :
• Communications
• Low FPGA frequency
Convolution
Accelerator
Host
Host
#pragma omp declare targetvoid sub_filter(int line_start, int line_end, int height, int width, int kernel_size, unsigned char* source_image, unsigned char* result_image) {
// convolution accessing source_image and writing results in result_image...
}#pragma omp end declare target... #pragma omp target map(always, to: source_image[0:IMAGE_SIZE])
map(from: result_image[0:IMAGE_SIZE]){
sub_filter(0, height, height, width, kernel_size, source_image, result_image);}
Communications
LineDetection: One Accelerator
![Page 13: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/13.jpg)
13
#pragma omp target map(always, to: source_image[0:IMAGE_SIZE]) map(from: result_image[0:IMAGE_SIZE])
{sub_filter(0, height, height, width, kernel_size, source_image, result_image);
}
#pragma omp target data map(always, to: source_image[0:IMAGE_SIZE]) map(from: result_image[0:IMAGE_SIZE])
{#pragma omp target map(to: source_image[0: SLICE_SIZE + OVERLAP_SIZE])
map(from: result_image[0: SLICE_SIZE]) nowait{ sub_filter(0, height / 2, height, width, kernel_size, source_image, result_image); }
#pragma omp target map(to: source_image[SLICE_SIZE - OVERLAP_SIZE: SLICE_SIZE + 2*OVERLAP_SIZE])map(from: result_image[SLICE_SIZE:SLICE_SIZE]) nowait
{ sub_filter(height / 2, height, height, width, kernel_size, source_image, result_image); }}
LineDetection : Several Accelerators
![Page 14: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/14.jpg)
14
LineDetection : Several Accelerators
#pragma omp target data map(always, to: source_image[0:IMAGE_SIZE]) map(from: result_image[0:IMAGE_SIZE])
{#pragma omp target map(to: source_image[0: SLICE_SIZE + OVERLAP_SIZE])
map(from: result_image[0: SLICE_SIZE]) nowait{ sub_filter(0, height / 2, height, width, kernel_size, source_image, result_image); }
#pragma omp target map(to: source_image[SLICE_SIZE - OVERLAP_SIZE: SLICE_SIZE + 2*OVERLAP_SIZE])map(from: result_image[SLICE_SIZE:SLICE_SIZE]) nowait
{ sub_filter(height / 2, height, height, width, kernel_size, source_image, result_image); }}
Accelerator
Convolution
Host
Host
Evaluation of the solution :
• Execution time
• Space occupation on the FPGA
![Page 15: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/15.jpg)
15
LineDetection : Several Accelerators
Accelerator
Convolution
Host
Host ~60ms
~100msAccelerators
Convolution
Host
Host ~60ms
~40ms
Small code changes for
architecture exploration
Possible additional work:
• HLS pragmas -> ~17ms
![Page 16: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/16.jpg)
16
Sequential C code OpenMP annotated C code
8 identical
accelerators
• OpenMP allows to offload computations to multipleaccelerators• Architecture exploration:
allows to find the good offloading pattern
Accelerators
Convolution
Host
Host
Parallel SW
architecture on
Hybrid platform
(CPU + FPGA)
![Page 17: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/17.jpg)
17
Xilinx : Zynq Intel : Cyclone
Architectures in SpaceStudio
![Page 18: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/18.jpg)
18
![Page 19: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/19.jpg)
19
![Page 20: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/20.jpg)
20
![Page 21: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/21.jpg)
21
Video: application running on Z-turn
![Page 22: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/22.jpg)
22
Conclusion
TOAST allows to leverage the OpenMP Offloadingstandard to deploy software on FPGAs using SpaceStudio
• OpenMP is a well established standard
Our workflows allows:• Design space exploration
• Deployment on SoCs featuring an FPGA
• Support multiple vendors (Xilinx, Intel)
![Page 23: Make Life Easier for Embedded Software Engineers Facing … · 1 31/01/2020 Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures R. Leconte, E. Jenn,](https://reader034.vdocument.in/reader034/viewer/2022050608/5faf1e4454ac792cb27ff04c/html5/thumbnails/23.jpg)
2323
Thank you for your attention
© IRT AESE ”Saint Exupéry” - All rights reserved. This document and all information contained herein is the sole property of IRT
AESE “Saint Exupéry”. No intellectual property rights are granted by the delivery of this document or the disclosure of its content. IRT
AESE ”Saint Exupéry” and its logo are registered trademarks.