![Page 1: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/1.jpg)
FPGA-Based Multi-Threading for On-board Processing
Pasquale Lombardi - Syderal
Andrea Guerrieri - EPFL
Bilel Belhadj - Syderal
9-11 of April 2018 SEFUW 2018
![Page 2: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/2.jpg)
Content
Spacecraft on-board computing trend and needs
FPGA-Based Multi-Threading (FBMT)
o Introduction
o Implementation
o Proof of concept test case
![Page 3: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/3.jpg)
Spacecraft On-board Computing Trend
Big data challenge
o Increasing data volume
o Requiring data reduction before downloading
o On-board selection of useful data
Miniaturization allows for smaller satellites
o Similar observational capabilities
o Lower energy/power capacity
o Lower download bandwidth
![Page 4: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/4.jpg)
Spacecraft On-board Key Computing Needs
Scalable and energy efficient
Fast adoption of new technologies
![Page 5: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/5.jpg)
FBMT basic elements and their relevance
FPGA (Field Programmable Gate Array) is an attractive technology for space missions
o Unbeatable flexibility
o Best performance-to-power ratio
Multi-threading is effectively used in software to
o Exploit multi-core as well as single-core platforms
o Improve responsiveness
o Improve real-time performance
Heterogeneous computing is a new trend for achieving
o High performance and
o Energy efficiency
![Page 6: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/6.jpg)
FBMT conceptual basis
Do heterogeneous computing on a CPU + FPGA platform
o Increase performance and power efficiency
Use abstraction to handle threads on FPGA hardware
o Make FPGA hardware threading as easy as software threading
Use dynamic partial reconfiguration to allocate FPGA resources to threads
o Maximize hardware utilization by reusing FPGA resources for different functions
![Page 7: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/7.jpg)
FBMT benefits and architecture
Does not require a high-performance CPU
o Computationally intensive tasks are executed by the FPGA
Allows the design to be ported across different technologies and component quality grades
o e.g. due to technology obsolescence or mission quality requirements
Allows for scalability of system performance
o More than one FPGA can be operated by the same CPU; the CPU can be multicore
Allows for graceful degradation of the system
o Either the CPU or the FPGA can be reconfigured to replace any failed functionality in the system, even at degraded performance
![Page 8: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/8.jpg)
FBMT programming
Multi-threading vs. sequential programming
o Improved responsiveness
o Optimized exploitation of parallel resources
o Improved performance
Allows for independent tasks and parallel execution
o Threaded tasks can execute independently of each other and possibly in parallel
Threads require operating system support
o Most operating systems support threads
FPGA threading abstraction
o It supports FPGA hardware threads in the same way as software threads
![Page 9: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/9.jpg)
IMPLEMENTATION
![Page 10: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/10.jpg)
Architecture Requirements
FBMT on Zynq
o Portability
o Scalability
![Page 11: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/11.jpg)
Architectural Overview
Key Ideas
o Manage the FPGA resources by software: Execution Controller (EC) (Flexibility)
o Logical separation between the Host Machine (HM) and the EC (Portability and Scalability)
o HM and EC are both part of the Processing System (PS) (Hard IPs)
o Modular and scalable FPGA design (Reusability and Power Scalability)
Advantages
o The HM and EC continue running even during full FPGA reconfiguration
o No halt or reboot is needed to change the size and the number of the PRRs (partially reconfigurable regions)
o Could perform SEE (single-event effect) mitigation techniques such as partial and full FPGA scrubbing
![Page 12: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/12.jpg)
Demonstrator
Xilinx ZC706 Development Board
![Page 13: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/13.jpg)
Internal Block Diagram
![Page 14: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/14.jpg)
Execution Controller
Management software running on the ARM Cortex-A9 (CPU1) of the Zynq SoC.
Main functions
Executes a sequence of predefined operations for FPGA management:
o Exchanges information with the HM (asynchronous commands and status)
o Loads partial/full bitstreams and configures the PL (Programmable Logic)
o Implements non-preemptive scheduling algorithms
o Monitors power consumption/temperature of the SoC
Advantages
o Written in ANSI C for maximum portability
o High-performance and real-time execution
o Real concurrency of operation with respect to the HM
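The slides do not say which non-preemptive scheduling algorithms the Execution Controller implements. As a minimal sketch only, the block below models one possible policy, first-come-first-served placement onto a fixed number of reconfigurable regions, where a thread runs to completion once it has been placed. The names Request, Completion and schedule_fcfs, and the single fixed reconfiguration cost, are assumptions made for illustration and are not part of FBMT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical hardware thread request: only its run time is modelled.
struct Request {
    int id;
    double run_ms;
};

// Where and when a request was executed in the simulation.
struct Completion {
    int id;
    std::size_t region;  // index of the reconfigurable region used
    double start_ms;
    double finish_ms;
};

// Non-preemptive first-come-first-served placement (an assumed policy):
// each request goes to the region that becomes free earliest, pays a fixed
// reconfiguration cost, and then runs to completion without interruption.
std::vector<Completion> schedule_fcfs(const std::vector<Request>& queue,
                                      std::size_t region_count,
                                      double reconfig_ms) {
    assert(region_count > 0);
    std::vector<double> free_at(region_count, 0.0);
    std::vector<Completion> done;
    done.reserve(queue.size());
    for (const Request& r : queue) {
        auto earliest = std::min_element(free_at.begin(), free_at.end());
        const std::size_t region =
            static_cast<std::size_t>(earliest - free_at.begin());
        const double start = *earliest + reconfig_ms;  // load the bitstream first
        const double finish = start + r.run_ms;
        *earliest = finish;  // the region is busy until the thread ends
        done.push_back({r.id, region, start, finish});
    }
    return done;
}
```

With two regions, a 60 ms reconfiguration cost and three requests of 100, 50 and 30 ms, the third request waits for the second region to free up at 110 ms, starts at 170 ms and finishes at 200 ms.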
![Page 15: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/15.jpg)
Host Machine
Embedded Linux on the ARM Cortex-A9 (CPU0) of the Zynq SoC
o Application code in C/C++
o Manages Hardware Threads using APIs
APIs
Set of predefined functions to control and use the FPGA resources:
o Simple for a software developer familiar with Pthreads
o Hides the complexity and details of the FPGA
![Page 16: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/16.jpg)
List of APIs
| API | Description |
| --- | --- |
| fbmt_initialize() | Set the maximum number of concurrent HW Threads |
| fbmt_create() | Create one or more HW Threads |
| fbmt_join() | Wait until the end of one or more HW Threads |
| fbmt_cancel() | Terminate the execution of one or more HW Threads |
| fbmt_set_sched() | Select the scheduling algorithm |
| fbmt_malloc() | Allocate virtual memory space for HW Thread |
| fbmt_free() | Free virtual memory space for HW Thread |
| fbmt_mutex() | Mutual Exclusion for HW Thread |
![Page 17: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/17.jpg)
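APIs Usage Example
Host Source Code
#include <iostream>
#include "fbmt_APIs.h"

// hw_thread, HW_THREAD_0, ONE_HW_THREAD_MAX and CLOUD_DETECTION are not
// declared on the slide; they are assumed to come from fbmt_APIs.h.
int main(int argc, char *argv[])
{
    std::cout << "FBMT APPLICATION TEST 7.1.7 \n\r";

    // INIT FPGA: allow at most one concurrent HW thread
    fbmt_initialize(ONE_HW_THREAD_MAX);

    // INIT SW STRUCTURE: describe the HW thread to run
    hw_thread[HW_THREAD_0].id = 1;
    hw_thread[HW_THREAD_0].type = CLOUD_DETECTION;

    // INSTANTIATE: create one HW thread
    fbmt_create(1, &hw_thread[HW_THREAD_0]);

    // WAIT: block until the HW thread has finished
    fbmt_join(1, &hw_thread[HW_THREAD_0]);

    return 0;
}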
![Page 18: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/18.jpg)
Performance
APIs Performance
| Description | API | Latency [ms] |
| --- | --- | --- |
| HOST-FPGA Command/Response | fbmt_create() | 1 |
| HW Thread Instantiation | fbmt_create() | 60* |
| FPGA Initialization | fbmt_initialize() | 700* |
*Full Bitstream/Partial Bitstream Configuration: latency depends on the size of the FPGA/Hardware Thread
FPGA Configuration Throughput: 210 Mib/s
• C++ (APIs) / C (EC)
• VHDL (no Vendor IPs)
• 200 MHz
[Figure: SoC Layout]
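The starred latencies are said to depend on bitstream size. As a rough sketch, assuming "Mib/s" means mebibits per second and that configuration latency is dominated by the bitstream transfer at the quoted throughput (neither is confirmed by the slides), the helpers below convert between bitstream size and configuration time. The function names are hypothetical.

```cpp
#include <cassert>

constexpr double kBitsPerMebibit = 1024.0 * 1024.0;

// Estimated time in milliseconds to load a bitstream of the given size,
// assuming the transfer alone determines the latency.
double config_time_ms(double bitstream_bytes, double throughput_mib_per_s) {
    assert(throughput_mib_per_s > 0.0);
    const double bits = bitstream_bytes * 8.0;
    return 1000.0 * bits / (throughput_mib_per_s * kBitsPerMebibit);
}

// Inverse view: bitstream size in mebibytes that would be transferred
// during the given latency at the given throughput.
double implied_size_mebibytes(double latency_ms, double throughput_mib_per_s) {
    const double mebibits = throughput_mib_per_s * latency_ms / 1000.0;
    return mebibits / 8.0;
}
```

Under these assumptions, implied_size_mebibytes(60.0, 210.0) gives about 1.6 MiB for a hardware thread and implied_size_mebibytes(700.0, 210.0) gives about 18.4 MiB for FPGA initialization. These are upper bounds on the data moved, since the measured latencies may include other overheads.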
![Page 19: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/19.jpg)
Platform's Main Features
Continues to operate during full FPGA reconfiguration
Can change the size and the number of concurrent Hardware Threads at run-time
Manages the Hardware Threads using scheduling algorithms similar to those used for software threads
Provides virtual memory support to Hardware Threads
Monitors power consumption and temperature of the SoC and adapts the scheduling accordingly
Could autonomously perform SEE mitigation techniques such as full and partial FPGA scrubbing
Source code fully portable and scalable over different platforms
![Page 20: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/20.jpg)
PROOF OF CONCEPT TEST CASE
![Page 21: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/21.jpg)
Cloud Detection Application
Motivation
o On-board processing of multi-spectral images
o 66% of cloudy pixels are considered useless data
o Save storage resources and downlink capacity
Sentinel-2 multi-spectral images
o Copernicus Open Access Hub
o Download S2 Level-1C products
o Process Scene Classification Algorithm
o Accelerate execution with FBMT Hardware Threads
Data volume
o Real-world, high-volume data
o High memory throughput
o High processing power
![Page 22: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/22.jpg)
Cloud Detection Algorithm
Input Data
o 9 spectral bands of S2 MSI ~ 250 MB
o Covers 100x100 km of Earth's surface
o 60 meters resolution per pixel
Output Data
o Scene Classification Mask
o Cloud Mask
o Snow Mask
Threshold-based Algorithm
o Compare reflectance to thresholds
o Two main filter sequences
• Cloud Detection Sequence (CDS)
• Snow Detection Sequence (SDS)
Two accelerator designs
o Accelerator 1 includes CDS and SDS
o Accelerator 2 includes CDS only
o Dynamic reconfiguration of accelerator designs according to image content
[Figures: Accelerator 1, Accelerator 2]
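The slides name the two filter sequences but give neither their bands nor their thresholds. The following is an illustrative sketch only: it uses a single brightness test for the cloud sequence and a normalized-difference test between a green band and a short-wave infrared band for the snow sequence. The two-band Pixel type, the threshold values and all function names are assumptions, not the authors' algorithm. It shows why the choice between the two accelerator designs can depend on image content: without the snow sequence, bright snow pixels fall through to the cloud test.

```cpp
#include <cassert>

enum class PixelClass { Clear, Cloud, Snow };
enum class Accelerator { CdsAndSds, CdsOnly };  // Accelerator 1, Accelerator 2

// Two illustrative reflectance values per pixel, each expected in [0, 1].
struct Pixel {
    double green;
    double swir;
};

// Hypothetical thresholds; not the values used in the presentation.
struct Thresholds {
    double bright_min = 0.3;  // minimum green reflectance for a bright pixel
    double ndsi_min = 0.4;    // minimum normalized difference for snow
};

// Normalized difference between the green and short-wave infrared bands.
double ndsi(const Pixel& p) {
    const double sum = p.green + p.swir;
    return sum > 0.0 ? (p.green - p.swir) / sum : 0.0;
}

// Stand-in for the Cloud Detection Sequence: one brightness threshold.
bool cloud_sequence(const Pixel& p, const Thresholds& t) {
    return p.green > t.bright_min;
}

// Stand-in for the Snow Detection Sequence: bright and high index.
bool snow_sequence(const Pixel& p, const Thresholds& t) {
    return p.green > t.bright_min && ndsi(p) > t.ndsi_min;
}

// Classify one pixel with the sequences available in the chosen design.
PixelClass classify(const Pixel& p, const Thresholds& t, Accelerator a) {
    assert(p.green >= 0.0 && p.swir >= 0.0);
    if (a == Accelerator::CdsAndSds && snow_sequence(p, t)) return PixelClass::Snow;
    if (cloud_sequence(p, t)) return PixelClass::Cloud;
    return PixelClass::Clear;
}
```

In this model a pixel with green 0.8 and swir 0.1 is classified as snow by the CDS+SDS design but as cloud by the CDS-only design, so the smaller design is only appropriate for scenes where snow is not expected.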
![Page 23: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/23.jpg)
Processing Examples
![Page 24: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/24.jpg)
Inside PR regions: Accelerator Architecture
Common Modules
o DMA engines, Arbiters, AXI interfaces
o Internal Buses, Configuration Registers
Custom Processing Elements (PEs)
o From simple addition to user-defined computation
o Can communicate with each other
o May be heterogeneous
Data-Triggered Configuration
o Computation is a side effect of data transfer
o Provides opportunities for data processing parallelism
Configurable parameters
o PE number, DMA size, Register map, ...
o AXI data bus width, AXI burst length
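One way to picture "computation is a side effect of data transfer" is a software model in which moving a buffer from source to destination passes every word through a chain of processing elements, with no separate command to start the computation. The sketch below is such a model under that reading; ProcessingElement and transfer_through are hypothetical names, and the real accelerator is VHDL hardware fed by DMA engines, not C++.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// A processing element modelled as a function applied to one 32-bit word.
using ProcessingElement = std::function<std::uint32_t(std::uint32_t)>;

// Models a transfer through a chain of PEs: each word is transformed on its
// way from source to destination, so the transfer itself drives the work.
std::vector<std::uint32_t> transfer_through(
        const std::vector<std::uint32_t>& source,
        const std::vector<ProcessingElement>& chain) {
    std::vector<std::uint32_t> destination;
    destination.reserve(source.size());
    for (std::uint32_t word : source) {
        for (const ProcessingElement& pe : chain) {
            assert(static_cast<bool>(pe));  // every PE must hold a callable
            word = pe(word);
        }
        destination.push_back(word);
    }
    return destination;
}
```

Heterogeneous PEs correspond to different functions in the chain, and an empty chain degenerates to a plain copy. Because each word is processed independently here, the words could in principle be handled in parallel, which is the opportunity the slide points to.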
![Page 25: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/25.jpg)
Cloud Detection Example
RGB image
![Page 26: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/26.jpg)
Performance
Software version
o ARM Cortex-A9 hard core, 1 GHz
o Linux OS, 1 GB of RAM
FBMT resources
o FPGA utilization (4 PRRs, 32 bits)
• Static Design ~ 5%
• Total Design ~ 38%
o Accelerator - DDR3 throughput
• Read ~ 2 GB/s (max. 6 GB/s)
• Write ~ 2.5 GB/s (max. 8 GB/s)
FBMT vs. software version
o ~ x40 faster
o ~ x12 more energy efficient
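The deck reports the FBMT version as roughly 40 times faster and 12 times more energy efficient than the software version. If the energy figure compares energy per processed image on the same workload (an assumption, since the slides do not define it), then energy equals average power times execution time, and the two ratios together imply how much more average power the accelerated run draws. The helper name below is hypothetical.

```cpp
#include <cassert>

// With E = P * T for both versions:
//   energy_gain = (P_sw * T_sw) / (P_hw * T_hw) = (P_sw / P_hw) * speedup
// so the ratio of accelerated to software average power is
//   P_hw / P_sw = speedup / energy_gain.
double implied_power_ratio(double speedup, double energy_gain) {
    assert(speedup > 0.0 && energy_gain > 0.0);
    return speedup / energy_gain;
}
```

For the reported figures, implied_power_ratio(40.0, 12.0) is about 3.3: under this reading the accelerated run draws a little over three times the average power but finishes forty times sooner.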
![Page 27: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/27.jpg)
CONCLUSION AND PERSPECTIVE
![Page 28: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/28.jpg)
Conclusion
FPGA-based multi-threading system for on-board processing
o Opens opportunities for flexible parallel computing
o Best performance-to-power ratio
FBMT demonstrator
o Implemented on Xilinx Zynq SoC FPGA
o Leverages dynamic partial reconfiguration technology
o Full abstraction of FPGA thread creation and synchronization
o Portable and industry-friendly design
Cloud detection test case
o Proof of FPGA-based multi-threading concepts
o Processing-element-based accelerator design
o Performance
• x40 faster
• x12 more energy efficient
![Page 29: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/29.jpg)
Perspectives
Develop a library of accelerators
o High-level synthesis
o Automatic RHBD (Radiation Hardened By Design) design generation
Scalability roadmap
o Multi-FPGA system
o Compact backplane technology
o High-speed links
FBMT qualification
o System-level radiation analysis
o COTS component selection
o Radiation tests
[Figure labels: User Application, HOST, Threads T1, T2, T3, FPGA (x3)]
![Page 30: FPGA-Based Multi-Threading for On-board Processing](https://reader031.vdocument.in/reader031/viewer/2022020620/61e44b9ab379c607cc0a3d37/html5/thumbnails/30.jpg)
For further information or feedback
Syderal
Pasquale Lombardi
Bilel Belhadj
EPFL LAP
Andrea Guerrieri
Sahand Kashani-Akhavan
Paolo Ienne