uic thesis novati

Post on 04-Jun-2015

1D and 2D Bitstream Relocation for Partially Dynamically Reconfigurable Architecture

BY

Marco Novati

marco.novati@dresd.org

Thesis committee:

Shantanu Dutt (chair), Marco Domenico Santambrogio, Piotr Gmytrasiewicz

UIC Thesis Defense: May 8, 2008

Aims

Architectural support for relocation:

– Create an integrated HW/SW system to manage online relocation (1D and 2D) in reconfigurable architecture

– Create efficient bitstream relocation solutions suitable for the target system:

• 1D - 2D
• HW – SW

Outline

– Introduction
– Relocation
– State of Art
– Proposed Solutions
– Results
– Concluding Remarks and Future Work

What’s Next…

– Introduction
• Reconfiguration
• Xilinx FPGAs
– Relocation
– State of Art
– Proposed Solutions
– Results
– Concluding Remarks and Future Work

Reconfigurable Computing

“Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much higher performance than software, while maintaining a higher level of flexibility than hardware”

(K. Compton and S. Hauck, Reconfigurable Computing: a Survey of Systems and Software, 2002)

5 W

– who controls the reconfiguration
– where the reconfigurator is located
– when the configurations are generated
– which is the granularity of the reconfiguration
– in what dimension the reconfiguration operates

Reconfiguration in everyday life

[Figure: hockey, football and soccer as everyday examples of reconfiguration, labeled Complete – Static, Partial – Dynamic and Partial – Static]

Reconfigurable architecture

A basic reconfigurable architecture consists of:

– a static area: a basic Harvard architecture
– a reconfigurable area: a device area composed of several reconfigurable regions

Basic Definitions

– Core: a specific representation of a functionality. A core can be described, for example, in VHDL, in C or in an intermediate representation (e.g. a DFG)

– IP-Core: a core described in a hardware description language, combined with its communication infrastructure (i.e. the bus interface)

– Reconfigurable Functional Unit (RFU): an IP-Core that can be plugged and/or unplugged at runtime in an already working architecture

– Reconfigurable Region (RR): a portion of the device area used to implement a reconfigurable core

Xilinx FPGAs and Configuration Memory

Frame Addressing: Virtex, Virtex-E

* Inspired by the Virtex Series Configuration Architecture User Guide

Frame Addressing: Virtex2pro

* Taken from the Virtex-II Pro and Virtex-II Pro X FPGA User Guide

Frame Addressing: Virtex 4-5 (1/2)

New frame addressing: possibility of addressing rows and columns

* Inspired by the Virtex 4 & 5 Configuration Architecture User Guide

Frame Addressing: Virtex 4-5 (2/2)

* Inspired by the Virtex 4 & 5 Configuration Architecture User Guide
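Row and column addressing is what makes 2D relocation expressible at the frame-address level: moving a module amounts to rewriting the row and column fields of each frame address and leaving the other fields untouched. The Python sketch below illustrates that idea only. The field names and bit widths in `FIELDS` are hypothetical and do not reproduce the layout of any specific Virtex device; the real layout has to be taken from the configuration user guide of the target family.

```python
# Illustrative frame-address layout, most significant field first.
# ASSUMPTION: names and widths are hypothetical, chosen only for this example.
FIELDS = [("block_type", 3), ("row", 5), ("column", 8), ("minor", 6)]


def unpack(address):
    """Split a packed frame address into a dict of named fields."""
    values = {}
    shift = sum(width for _, width in FIELDS)
    for name, width in FIELDS:
        shift -= width
        values[name] = (address >> shift) & ((1 << width) - 1)
    return values


def pack(values):
    """Pack named fields back into a single integer frame address."""
    address = 0
    for name, width in FIELDS:
        if not 0 <= values[name] < (1 << width):
            raise ValueError(f"{name}={values[name]} does not fit in {width} bits")
        address = (address << width) | values[name]
    return address


def relocate(address, d_row, d_col):
    """Shift a frame address by d_row rows and d_col columns."""
    fields = unpack(address)
    fields["row"] += d_row
    fields["column"] += d_col
    return pack(fields)


original = pack({"block_type": 0, "row": 1, "column": 4, "minor": 7})
moved = relocate(original, d_row=2, d_col=-3)
print(unpack(moved))
```

With `d_row=0` the same function describes a 1D (column-only) relocation. A target that falls outside the field range is rejected by `pack` rather than silently wrapped.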

What’s Next…

– Introduction
– Relocation
– State of Art
– Proposed Solutions
– Results
– Concluding Remarks and Future Work

Relocation: Rationale

Bitstream relocation techniques are used to:

– speed up the overall system execution
– reduce the amount of memory used to store partial bitstreams
– achieve preemptive core execution
– assign the bitstream placement at runtime

Relocation: The Problem

[Figure: people demanding functionalities from a set of available functionalities (A to F), whose RFU implementations are placed on an FPGA with three reconfigurable regions RR1, RR2 and RR3]

Legend: Fi Area/Time

A 2/1, B 1/2, C 2/2, D 1/1, E 1/1, F 2/2

Relocation: Scenario

A possible scenario

[Figure: area/time schedule of the functionalities A to F over RR1, RR2 and RR3, with the reconfiguration intervals Rec. F, Rec. E, Rec. C and Rec. D, shown next to the RFU implementations]

Legend: Fi Area/Time

Relocation: Motivation

A possible scenario

[Figure: two area/time schedules of the same scenario over RR1, RR2 and RR3, shown next to the RFU implementations. The first uses the reconfiguration intervals Rec. C, Rec. F, Rec. E and Rec. D; in the second, the intervals for F, E and D are labeled R2 F, R2 E and R2 D]

Legend: Fi Area/Time

What’s Next…

– Introduction
– Relocation
– State of Art
• PARBIT
• BITPOS
• BAnMaT
• REPLICA
– Proposed Solutions
– Results
– Concluding Remarks and Future Work

PARBIT

[E. Horta and John W. Lockwood, “PARBIT: A Tool to Transform Bitfiles to Implement Partial Reconfiguration of Field Programmable Gate Arrays (FPGAs)”, Washington University, Technical Report, July 2001.]

Features:
– Pure C software
– Enables the generation of the partial bitstream file
– Small modifications, altering only the parts related to the location on the device

CONS:
– Only offline
– Only 1D reconfiguration

BITPOS

[Yana E. Krasteva, Eduardo de la Torre, Teresa Riesgo and Didier Joly, “Virtex II FPGA Bitstream Manipulation: Application to Reconfiguration Control Systems”, 2006 International Conference on Field Programmable Logic and Applications, August 2006.]

Features:
– Extracts an area from a configuration file
– Generates the new relocated bitstream

CONS:
– Only offline
– Only Virtex II, Virtex II Pro [1D]

BAnMaT

[D. Deori, “BAnMaT: un Framework per l’Analisi e la Manipolazione di un Bitstream Orientato alla Riconfigurabilità Parziale”, DEI, Milano, Politecnico di Milano, 2006]

Features:
– Bitstream correctness check
– Performs modifications on a configuration bitstream
– Allows bypassing the synthesis process from the VHDL

CONS:
– Only offline manipulation

REPLICA

[H. Kalte, G. Lee, M. Porrmann and U. Rückert, “REPLICA: A Bitstream Manipulation Filter for Module Relocation in Partial Reconfigurable Systems”, The 12th Reconfigurable Architectures Workshop (RAW 2005), 2005.]

Features:
– Hardware filter that exploits relocation
– Necessary manipulations are performed during the download process
– Relocation hiding

CONS:
– Only for externally reconfigurable systems
– Only 1D relocation
– Maximum frequency of 50 MHz

What’s Next…

– Introduction
– Relocation
– State of Art
– Proposed Solutions
• Polaris
• Target Architecture
• Proposed Relocation Solutions
– Results
– Concluding Remarks and Future Work

Polaris: Motivations

Complete workflow to generate a self dynamically reconfigurable architecture that:
– Supports 1D and 2D reconfiguration
– Has “good” area constraints for cores
– Performs runtime task placement decisions
– Exploits internal and fast core relocation

Starting from the specification of:
– Target application
– Target device info
– Reconfiguration model
– Communication infrastructure

Polaris Overview

Workflow to manage allocation and relocation of tasks in self dynamically reconfigurable architectures

Final goal: complete architecture (bitstreams and code) generation

Target Architecture: YaRA

PPC Based YaRA

[Figure: PPC-based YaRA block diagram, including the static area]

Proposed Relocation Solutions

Runtime support for self dynamic 1D and 2D reconfiguration:
– Xilinx Virtex, Virtex-E, Virtex2pro [1D]
– Xilinx Virtex-4 and Virtex-5 [2D]

Relocation, different solutions:
– Software:
• BAnMaT Lite
– Hardware:
• BiRF [1D]
• BiRF Square [2D]

Configuration Bitstream

BiRF & BiRF Square Block Diagram

The Parser

CRC Calculation

Particular CRC value, used by Xilinx tools

Two versions of BiRF and BiRF Square:
– Using the “predefined” values
– With actual CRC calculation

CRC polynomials:
– $X^{16} + X^{15} + X^{2} + 1$ [1D]
– $X^{32} + X^{28} + X^{27} + X^{26} + X^{25} + X^{23} + X^{22} + X^{20} + X^{19} + X^{18} + X^{14} + X^{13} + X^{11} + X^{10} + X^{9} + X^{8} + X^{6} + 1$ [2D]
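To make the "actual CRC calculation" variant concrete, here is a minimal bit-serial CRC in Python driven directly by the exponents of the two polynomials above. It is a sketch under stated assumptions: bits are processed most significant first, the register starts at zero, and plain bytes are fed in. The slides do not say which words the Xilinx configuration logic feeds to the CRC, in what bit order, or with what initial value, so this is not a reproduction of that procedure.

```python
# Exponents of the two polynomials listed on the slide.
CRC16_TERMS = [16, 15, 2, 0]
CRC32_TERMS = [32, 28, 27, 26, 25, 23, 22, 20, 19, 18,
               14, 13, 11, 10, 9, 8, 6, 0]


def poly_mask(terms):
    """Return (width, mask): the degree and the polynomial without its top term."""
    width = max(terms)
    mask = sum(1 << e for e in terms if e != width)
    return width, mask


def crc_bitwise(data, terms, init=0):
    """Bit-serial CRC over bytes.

    ASSUMPTION: MSB-first processing, no reflection, no final XOR.
    """
    width, poly = poly_mask(terms)
    top = 1 << (width - 1)
    keep = (1 << width) - 1
    reg = init
    for byte in data:
        reg ^= byte << (width - 8)      # bring the next byte into the top bits
        for _ in range(8):
            if reg & top:
                reg = ((reg << 1) ^ poly) & keep
            else:
                reg = (reg << 1) & keep
    return reg


payload = b"123456789"
print(hex(poly_mask(CRC16_TERMS)[1]), hex(poly_mask(CRC32_TERMS)[1]))
print(hex(crc_bitwise(payload, CRC16_TERMS)), hex(crc_bitwise(payload, CRC32_TERMS)))
```

The 32-bit polynomial is the one commonly known as CRC-32C (Castagnoli). A hardware filter would typically compute the same recurrence one word per clock rather than one bit per iteration; the loop form is used here only because it maps one-to-one onto the polynomial.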

What’s Next…

– Introduction
– Relocation
– State of Art
– Proposed Solutions
– Results
• Synthesis Results
• Relocation Solutions Results
– Concluding Remarks and Future Work

Results

Relocation solutions:
– Small area usage (slide 37)
– High time performance (slide 38)

Relocation results:
– Internal memory saving (slides 39 – 40)
– Time saving (slides 41 – 44)

Synthesis Results: Area (slide 37)

FPGA family   | Model  | BiRF, generic version | BiRF, optimized version | BiRF Square, generic version | BiRF Square, optimized version
Virtex II Pro | vp7    | 11.6 %                | 3.6 %                   | −                            | −
Virtex II Pro | vp20   | 5.8 %                 | 1.8 %                   | −                            | −
Virtex II Pro | vp30   | 4.2 %                 | 1.3 %                   | −                            | −
Virtex 4      | vlx40  | −                     | −                       | 2.2 %                        | 0.9 %
Virtex 4      | vlx60  | −                     | −                       | 1.5 %                        | 0.6 %
Virtex 4      | vlx100 | −                     | −                       | 0.8 %                        | 0.3 %
Virtex 5      | vlx50  | −                     | −                       | 1.1 %                        | 0.8 %
Virtex 5      | vlx85  | −                     | −                       | 0.6 %                        | 0.4 %
Virtex 5      | vlx110 | −                     | −                       | 0.5 %                        | 0.3 %

Synthesis Results: Time Performance (slide 38)

BiRF:
– On a Virtex2pro with speed grade -5
• General purpose version: max frequency of 101 MHz
• Specific version: max frequency of 136 MHz

BiRF Square:
– On a Virtex-4 with speed grade -12
• General purpose version: max frequency of 160 MHz
• Specific version: max frequency of 290 MHz
– On a Virtex-5 with speed grade -3
• General purpose version: max frequency of 226 MHz
• Specific version: max frequency of 304 MHz

Relocation Solutions Results (1/2) (slide 39)

BiRF, BiRF Square, BAnMaT Lite:
– Support relocation in a self partially and dynamically reconfigurable system (1D or 2D)
– The occupation ratio is relatively small
– The achievable frequency is more than acceptable
– Reduction of internal memory requirements

Throughput:
– BiRF: 6 MB/s
– BiRF Square: 7.3 MB/s
– BAnMaT Lite: 2.6 MB/s

Relocation Solutions Results (2/2) (slide 40)

Assume a total configuration file size of about 1 MB and an architecture with:
– 1/3 of the area as fixed part
– 2/3 as reconfigurable part with 6 slots

Under these hypotheses:
– The size of a partial bitstream will be about 110 KB
– The relocation time will be about:
• 18 ms with BiRF
• 15 ms with BiRF Square
• 42 ms with BAnMaT Lite
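The figures on this slide follow from the throughput values reported for the three solutions (6, 7.3 and 2.6 MB/s). The short Python sketch below redoes the arithmetic as a sanity check; it assumes 1 MB = 1000 KB and that relocation time is simply bitstream size divided by throughput, with no fixed overhead. Neither assumption is stated in the slides.

```python
TOTAL_BITSTREAM_KB = 1000.0       # "about 1 MB", taken as 1000 KB (assumption)
RECONFIGURABLE_FRACTION = 2 / 3   # 2/3 of the area is reconfigurable
SLOTS = 6                         # number of reconfigurable slots

# Throughput reported for each solution, in MB/s.
THROUGHPUT_MB_S = {"BiRF": 6.0, "BiRF Square": 7.3, "BAnMaT Lite": 2.6}


def partial_bitstream_kb(total_kb, fraction, slots):
    """Size of one slot's partial bitstream, assuming size scales with area."""
    return total_kb * fraction / slots


def relocation_time_ms(size_kb, throughput_mb_s):
    """KB divided by MB/s gives milliseconds when 1 MB = 1000 KB."""
    return size_kb / throughput_mb_s


estimated_kb = partial_bitstream_kb(TOTAL_BITSTREAM_KB, RECONFIGURABLE_FRACTION, SLOTS)
print(f"estimated partial bitstream: {estimated_kb:.1f} KB")

SLIDE_SIZE_KB = 110.0             # rounded value used on the slide
times_ms = {name: relocation_time_ms(SLIDE_SIZE_KB, rate)
            for name, rate in THROUGHPUT_MB_S.items()}
for name, t in times_ms.items():
    print(f"{name}: {t:.1f} ms")
```

The estimate of roughly 111 KB per slot matches the "about 110 KB" on the slide, and dividing 110 KB by each throughput gives values that round to the 18, 15 and 42 ms quoted above.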

Relocation Time Results (1/4) (slide 41)

Relocation Time Results (2/4) (slide 42)

– FPU1: clock time 0.01 ms, required for 3.65 s (7 add, 3 sub, 10 mul, 1 square root and 4 div)
• Feasible RR assignment: (0,0) and (6,0)

– JPEG: a complete JPEG hardware compressor, compression rate 24 img(352x288)/s, required for 3 seconds (72 img 352x288)
• Feasible RR assignment: (0,0), (0,1), (6,0) and (6,1)

– FPU2: clock time 0.01 ms, required for 3.13 s (6 add, 5 sub, 8 mul and 4 div)
• Feasible RR assignment: (0,0) and (6,0)

– 3DES: a Triple-DES 64-bit block cipher, required for 1 second, in order to process a file of 72 MB
• Feasible RR assignment: (0,0), (1,0), (3,0) and (3,1)
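The feasible RR assignments listed above can be read as placement constraints: each core may only be relocated to one of its listed positions. The Python sketch below encodes those sets and picks the first free feasible position for each core. Both the first-fit policy and the simplification that each coordinate is an independent slot holding one core at a time (core footprints and overlaps are ignored) are assumptions made for illustration; the slides do not describe the placement policy actually used.

```python
# Feasible positions per core, copied from the slide.
FEASIBLE = {
    "FPU1": [(0, 0), (6, 0)],
    "JPEG": [(0, 0), (0, 1), (6, 0), (6, 1)],
    "FPU2": [(0, 0), (6, 0)],
    "3DES": [(0, 0), (1, 0), (3, 0), (3, 1)],
}


def first_fit(core, occupied):
    """Return the first feasible position of `core` not in `occupied`, or None."""
    for position in FEASIBLE[core]:
        if position not in occupied:
            return position
    return None


def place_all(cores):
    """Place cores in the given order (hypothetical first-fit policy)."""
    placement = {}
    for core in cores:
        position = first_fit(core, set(placement.values()))
        if position is None:
            raise RuntimeError(f"no free feasible position for {core}")
        placement[core] = position
    return placement


placement = place_all(["FPU1", "JPEG", "FPU2", "3DES"])
for core, position in placement.items():
    print(core, "->", position)
```

Arrival order matters with such a greedy policy: placing JPEG before FPU1 and FPU2 lets it take (0,0), after which the two FPUs compete for the single remaining position (6,0) and the second one cannot be placed, even though a valid assignment exists.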

Relocation Time Results (3/4) (slide 43)

Relocation Time Results (4/4) (slide 44)

What’s Next…

– Introduction
– Relocation
– State of Art
– Proposed Solutions
– Results
– Concluding Remarks and Future Work

Concluding Remarks

Architectural support for relocation:

– Create an integrated HW/SW system to manage online relocation (1D and 2D) in reconfigurable architecture

– Create efficient bitstream relocation solutions suitable for the target system:

• 1D - 2D
• HW – SW

Publications:

– International conferences:

• M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Core allocation and relocation management for a self dynamically reconfigurable architecture, ISVLSI 2008, IEEE Computer Society Annual Symposium on VLSI

• S. Corbetta, F. Ferrandi, M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Two Novel Approaches to Online Partial Bitstream Relocation in a Dynamically Reconfigurable System, ISVLSI 2007, IEEE Computer Society Annual Symposium on VLSI

– IEEE Transactions on VLSI (second review phase):

• M. Morandi, M. Novati, M. D. Santambrogio, P. Spoletini, D. Sciuto, Internal and External Bitstream Relocation for Partial Dynamic Reconfiguration, TVLSI, IEEE Transactions on Very Large Scale Integration Systems

Future Work

Validation tool for the chosen:
– Reconfiguration model
– Communication infrastructure

Simulation framework:
– Monitor the reconfigurable system evolution
– Evaluate different placement policies and area constraint definitions

General Information

Webpage
– www.dresd.org/?q=polaris

Mailing List
– polaris-ml@dresd.org

Contact
– For more information regarding Polaris:
• polaris@dresd.org
– For a complete list of information on how to contact us:
• www.dresd.org/?q=contact_polaris

Questions