3rd 3DDRESD: BiRF
1D and 2D Bitstream Relocation for Partially
Dynamically Reconfigurable Architecture
BY
Marco Novati
3-Day DRESD: July 29, 2008
Aims
Architectural support for relocation:
– Create an integrated HW/SW system to manage online relocation (1D and 2D) in a reconfigurable architecture
– Create efficient bitstream relocation solutions suitable for the target system:
• 1D and 2D
• HW and SW
Outline
Introduction
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
What’s Next…
Introduction
– Reconfigurable Architecture
– Xilinx FPGAs
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
Reconfigurable architecture
A basic reconfigurable architecture consists of:
– a static area: a basic Harvard architecture
– a reconfigurable area: a device area composed of several reconfigurable regions
Xilinx FPGAs and Configuration Memory
Frame Addressing: Virtex, Virtex-E
* Adapted from the Virtex Series Configuration Architecture User Guide
Frame Addressing: Virtex2pro
* Taken from the Virtex-II Pro and Virtex-II Pro X FPGA User Guide
Frame Addressing: Virtex 4-5 (1/2)
New frame addressing: both rows and columns can be addressed
* Adapted from the Virtex 4 & 5 Configuration Architecture User Guide
Frame Addressing: Virtex 4-5 (2/2)
* Adapted from the Virtex 4 & 5 Configuration Architecture User Guide
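A minimal sketch of why row and column addressing matters for 2D relocation: if the frame address is a packed word with separate row and column fields, moving a module amounts to rewriting those two fields and leaving everything else untouched. The names `Field`, `FarLayout`, `EXAMPLE_LAYOUT` and `relocate_far`, and the bit positions and widths used, are illustrative assumptions and are not taken from these slides; the real layout must be checked against the Virtex-4 and Virtex-5 configuration user guides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    shift: int  # position of the field's least significant bit
    width: int  # number of bits in the field

    def get(self, word: int) -> int:
        return (word >> self.shift) & ((1 << self.width) - 1)

    def set(self, word: int, value: int) -> int:
        if not 0 <= value < (1 << self.width):
            raise ValueError("value does not fit in the field")
        mask = ((1 << self.width) - 1) << self.shift
        return (word & ~mask) | (value << self.shift)


@dataclass(frozen=True)
class FarLayout:
    row: Field
    column: Field


# Hypothetical field positions, for illustration only.
EXAMPLE_LAYOUT = FarLayout(row=Field(shift=14, width=5),
                           column=Field(shift=6, width=8))


def relocate_far(far: int, d_row: int, d_col: int,
                 layout: FarLayout = EXAMPLE_LAYOUT) -> int:
    """Shift a frame address by (d_row, d_col); all other bits are preserved."""
    far = layout.row.set(far, layout.row.get(far) + d_row)
    return layout.column.set(far, layout.column.get(far) + d_col)
```

With a 1D scheme only the column field would be rewritten; the 2D case adds the row displacement.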
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
Relocation: Rationale
Bitstream relocation technique to:
– speed up the overall system execution
– reduce the amount of memory used to store partial bitstreams
– assign the bitstream placement at runtime
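A rough sketch of the memory argument, under assumptions that are not stated on the slide: without relocation, one partial bitstream is stored for every (functionality, feasible position) pair, while with relocation a single copy per functionality is enough because the other placements are derived at run time. The functionality names, position counts and the uniform bitstream size below are hypothetical.

```python
def stored_bitstreams(positions_per_module: dict[str, int], relocation: bool) -> int:
    """Number of partial bitstreams that must be kept in memory."""
    if relocation:
        return len(positions_per_module)       # one copy per functionality
    return sum(positions_per_module.values())  # one copy per feasible position


# Hypothetical example: four functionalities and their feasible positions.
modules = {"f1": 2, "f2": 4, "f3": 2, "f4": 4}
size_kb = 110  # assumed uniform partial bitstream size, in KB

without = stored_bitstreams(modules, relocation=False) * size_kb
with_reloc = stored_bitstreams(modules, relocation=True) * size_kb
print(f"without relocation: {without} KB, with relocation: {with_reloc} KB")
```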
Relocation: The Problem
[Figure: people demanding functionalities from a set of available functionalities, implemented as RFUs on an FPGA with three reconfigurable regions (RR1, RR2, RR3).]
Legend: Fi = Area/Time
Set of available functionalities: A 2/1, B 1/2, C 2/2, D 1/1, E 1/1, F 2/2
RFU implementations: placements on RR1–RR3 are shown for A, F, D, B, C and E
Relocation: Scenario
[Figure: a possible scenario, listing A, E, D, C, B and F along a time axis, together with the RFU implementations on RR1–RR3 and the resulting schedule plotted as Area versus Time. Blocks in the schedule: A, B, Rec. F, F, Rec. E, E, Rec. C, C, Rec. D, D.]
Legend: Fi = Area/Time
Relocation: Motivation
[Figure: the same possible scenario (A, E, D, C, B and F along a time axis) with two sets of RFU implementations on RR1–RR3: the original set, and an extended set with additional placements for B, D and E. Two Area versus Time schedules are compared. Blocks in the first schedule: A, B, Rec. C, C, Rec. F, F, Rec. E, E, D, Rec. D. Blocks in the second schedule: A, B, Rec. C, C, R2 F, F, R2 E, E, D, R2 D.]
Legend: Fi = Area/Time
What’s Next…
Introduction
Relocation
State of the Art
– PARBIT
– BITPOS
– BAnMaT
– REPLICA
Proposed Solutions
Results
Concluding Remarks and Future Work
PARBIT
[E. Horta and John W. Lockwood, “PARBIT: A Tool to Transform Bitfiles to Implement Partial Reconfiguration of Field Programmable Gate Arrays (FPGAs)”, Washington University, Technical Report, July 2001.]
Features:
– Pure C software
– Enables the generation of the partial bitstream file
– Small modifications, altering only the parts related to the location on the device
CONS:
– Only offline
– Only 1D reconfiguration
BITPOS
[Yana E. Krasteva, Eduardo de la Torre, Teresa Riesgo and Didier Joly, “Virtex II FPGA Bitstream Manipulation: Application to Reconfiguration Control Systems”, 2006 International Conference on Field Programmable Logic and Applications, August 2006.]
Features:
– Extracts an area from a configuration file
– Generates the new relocated bitstream
CONS:
– Only offline
– Only Virtex II, Virtex II Pro [1D]
BAnMaT
[D. Deori, “BAnMaT: un Framework per l’Analisi e la Manipolazione di un Bitstream Orientato alla Riconfigurabilità Parziale”, DEI, Milano, Politecnico di Milano, 2006]
Features:
– Bitstream correctness check
– Performs modifications on a configuration bitstream
– Allows bypassing the synthesis process from the VHDL
CONS:
– Only offline manipulation
REPLICA
[H. Kalte, G. Lee, M. Porrmann and U. Rückert, “REPLICA: A Bitstream Manipulation Filter for Module Relocation in Partial Reconfigurable Systems”, The 12th Reconfigurable Architectures Workshop (RAW 2005), 2005.]
Features:
– Hardware filter that exploits relocation
– Necessary manipulations performed during the download process
– Relocation hiding
CONS:
– Only for external reconfigurable systems
– Only 1D relocation
– Maximum frequency of 50 MHz
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
– Polaris
– Target Architecture
– Proposed Relocation Solutions
Results
Concluding Remarks and Future Work
Polaris: Motivations
Complete workflow to generate a self dynamically reconfigurable architecture that:
– Supports 1D and 2D reconfiguration
– Has “good” area constraints for cores
– Performs runtime task placement decisions
– Exploits internal and fast core relocation
Starting from specification of:
– Target application
– Target device info
– Reconfiguration model
– Communication infrastructure
Polaris Overview
Workflow to manage allocation and relocation of tasks in self dynamically reconfigurable architectures
Final goal: complete architecture (bitstreams and code) generation
Target Architecture: YaRA
PPC Based YaRA
[Figure: block diagram of the PPC-based YaRA architecture, with the static area labelled.]
Proposed Relocation Solutions
Runtime support for self dynamic 1D and 2D reconfiguration:
– Xilinx Virtex, Virtex-E, Virtex2pro [1D]
– Xilinx Virtex-4 and Virtex-5 [2D]
Relocation, different solutions:
– Software:
• BAnMaT Lite
– Hardware:
• BiRF [1D]
• BiRF Square [2D]
Configuration Bitstream
BiRF & BiRF Square Block Diagram
The Parser
CRC Calculation
Particular CRC value, used by Xilinx tools
Two versions of BiRF and BiRF Square:
– By using the “predefined” values
– With actual CRC calculation
$x^{16} + x^{15} + x^{2} + 1$ [1D]
$x^{32} + x^{28} + x^{27} + x^{26} + x^{25} + x^{23} + x^{22} + x^{20} + x^{19} + x^{18} + x^{14} + x^{13} + x^{11} + x^{10} + x^{9} + x^{8} + x^{6} + 1$ [2D]
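A minimal sketch of the "actual CRC calculation" variant as plain polynomial division, using the two generator polynomials listed on this slide (exponents 16, 15, 2, 0 for 1D and the 32-bit polynomial for 2D). The bit ordering, the initial value and exactly which words the device feeds into its CRC are not given in the slides, so the choices here (MSB-first, zero initial value, no final XOR) and the function names are assumptions rather than the BiRF implementation.

```python
def poly_from_exponents(exponents: list[int]) -> tuple[int, int]:
    """Return (width, polynomial bits without the leading term)."""
    width = max(exponents)
    poly = sum(1 << e for e in exponents if e != width)
    return width, poly


def crc_bitwise(data: bytes, width: int, poly: int, init: int = 0) -> int:
    """Bit-serial CRC: MSB-first, no reflection, no final XOR."""
    mask = (1 << width) - 1
    crc = init
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            feedback = ((crc >> (width - 1)) & 1) ^ bit
            crc = (crc << 1) & mask
            if feedback:
                crc ^= poly
    return crc


# Generator polynomials from the slide.
W16, P16 = poly_from_exponents([16, 15, 2, 0])
W32, P32 = poly_from_exponents(
    [32, 28, 27, 26, 25, 23, 22, 20, 19, 18, 14, 13, 11, 10, 9, 8, 6, 0]
)

payload = b"relocated frame data"  # placeholder bytes, not a real bitstream
print(hex(crc_bitwise(payload, W16, P16)), hex(crc_bitwise(payload, W32, P32)))
```

A hardware filter would typically update the same register one word per clock cycle instead of one bit per loop iteration. The "predefined" variant skips this computation entirely.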
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
Results
– Synthesis Results
– Relocation Solutions Results
Concluding Remarks and Future Work
Synthesis Results: Area
FPGA family     Model    BiRF generic   BiRF optimized   BiRF Square generic   BiRF Square optimized
Virtex II Pro   vp7      11.6 %         3.6 %            −                     −
Virtex II Pro   vp20     5.8 %          1.8 %            −                     −
Virtex II Pro   vp30     4.2 %          1.3 %            −                     −
Virtex 4        vlx40    −              −                2.2 %                 0.9 %
Virtex 4        vlx60    −              −                1.5 %                 0.6 %
Virtex 4        vlx100   −              −                0.8 %                 0.3 %
Virtex 5        vlx50    −              −                1.1 %                 0.8 %
Virtex 5        vlx85    −              −                0.6 %                 0.4 %
Virtex 5        vlx110   −              −                0.5 %                 0.3 %
Synthesis Results: Time Performances
BiRF:
– On a Virtex2pro with speed grade -5
• General purpose version: max frequency of 101 MHz
• Specific version: max frequency of 136 MHz
BiRF Square:
– On a Virtex-4 with speed grade -12
• General purpose version: max frequency of 160 MHz
• Specific version: max frequency of 290 MHz
– On a Virtex-5 with speed grade -3
• General purpose version: max frequency of 226 MHz
• Specific version: max frequency of 304 MHz
Relocation Solutions Results (1/2)
BiRF, BiRF Square, BAnMaT Lite:
– Support relocation in a self partially and dynamically 1D or 2D reconfigurable system
– The occupation ratio is relatively small
– The maximum frequency is more than acceptable
– Reduction of internal memory requirements
Throughput:
– BiRF: 6 MB/s
– BiRF Square: 7.3 MB/s
– BAnMaT Lite: 2.6 MB/s
Relocation Solutions Results (2/2)
The total configuration file size is about 1 MB
Considering an architecture with:
– 1/3 of the area as fixed part
– 2/3 as reconfigurable part with 6 slots
Under these hypotheses:
– The size of a partial bitstream will be about 110 KB
– Relocation time of about:
• 18 ms with BiRF
• 15 ms with BiRF Square
• 42 ms with BAnMaT Lite
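The figures on this slide can be reproduced with a short calculation: the partial bitstream size is the reconfigurable share of the full bitstream divided by the number of slots, and the relocation time is that size divided by the throughput from the previous slide. The function names are illustrative, and the sketch assumes 1 MB = 1000 KB.

```python
def partial_bitstream_kb(total_kb: float, reconfigurable_fraction: float, slots: int) -> float:
    """Approximate size of one slot's partial bitstream, in KB."""
    return total_kb * reconfigurable_fraction / slots


def relocation_time_ms(size_kb: float, throughput_mb_s: float) -> float:
    """Time to stream size_kb through a filter; 1 MB/s equals 1 KB/ms."""
    return size_kb / throughput_mb_s


size = partial_bitstream_kb(total_kb=1000, reconfigurable_fraction=2 / 3, slots=6)
print(f"partial bitstream: about {size:.0f} KB (the slide rounds to 110 KB)")

throughput = {"BiRF": 6.0, "BiRF Square": 7.3, "BAnMaT Lite": 2.6}  # MB/s
for name, mb_s in throughput.items():
    print(f"{name}: about {relocation_time_ms(110, mb_s):.0f} ms")
```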
Relocation Time Results (1/4)
Relocation Time Results (2/4)
FPU1: clock time 0.01 ms, required for 3.65 s (7 add, 3 sub, 10 mul, 1 square root and 4 div)
– Feasible RR assignment: (0,0) and (6,0)
JPEG: a complete JPEG hardware compressor, compression rate 24 img(352x288)/s, required for 3 seconds (72 img 352x288)
– Feasible RR assignment: (0,0), (0,1), (6,0) and (6,1)
FPU2: clock time 0.01 ms, required for 3.13 s (6 add, 5 sub, 8 mul and 4 div)
– Feasible RR assignment: (0,0) and (6,0)
3DES: a Triple-DES 64-bit block cipher, required for 1 second, in order to process a file of 72 MB
– Feasible RR assignment: (0,0), (1,0), (3,0) and (3,1)
Relocation Time Results (3/4)
Relocation Time Results (4/4)
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
Concluding Remarks
Architectural support for relocation:
– Create an integrated HW/SW system to manage online relocation (1D and 2D) in a reconfigurable architecture
– Create efficient bitstream relocation solutions suitable for the target system:
• 1D and 2D
• HW and SW
Publications:
– International conferences:
• M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Core allocation and relocation management for a self dynamically reconfigurable architecture, ISVLSI 2008, IEEE Computer Society Annual Symposium on VLSI
• S. Corbetta, F. Ferrandi, M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Two Novel Approaches to Online Partial Bitstream Relocation in a Dynamically Reconfigurable System, ISVLSI 2007, IEEE Computer Society Annual Symposium on VLSI
– IEEE Transactions on VLSI (second review phase):
• M. Morandi, M. Novati, M. D. Santambrogio, P. Spoletini, D. Sciuto, Internal and External Bitstream Relocation for Partial Dynamic Reconfiguration, TSVLSI, IEEE Transactions on Very Large Scale Integration Systems
Future Work
Validation tool for the chosen:
– Reconfiguration model
– Communication infrastructure
Simulation framework:
– Monitor the reconfigurable system evolution
– Evaluate different placement policies and area constraints definitions
General Information
Webpage:
– www.dresd.org/?q=polaris
Mailing List:
– [email protected]
Contact:
– For more information regarding Polaris:
• [email protected]
– For a complete list of information on how to contact us:
• www.dresd.org/?q=contact_polaris
Questions