Search Space Properties for Pipelined FPGA Applications

Page 1: Search Space Properties  for Pipelined FPGA Applications

Search Space Properties for Pipelined FPGA Applications

University of Southern California, Information Sciences Institute

Heidi Ziegler, Mary Hall, Byoungro So

Oct 2, 2003

Page 2: Search Space Properties  for Pipelined FPGA Applications

Mapping Assignment

Machine Vision Kernel (application requirements)

1. Edge detection
2. Feature extraction
3. Distance computation

[Figure: the kernel is mapped onto an FPGA.]

Page 3: Search Space Properties  for Pipelined FPGA Applications

Mapping an Application to Hardware

Machine Vision Kernel (MVIS)

1. Edge detection
2. Feature extraction
3. Distance computation

[Figure: target hardware with configurable logic elements, datapath, on-chip storage, interconnect, and off-chip memory.]

Mapping tasks:

Compute Data Layout

Partition Chip Capacity

Manage Communication

Page 4: Search Space Properties  for Pipelined FPGA Applications

Build on Prior Work in DEFACTO

Automatic design space exploration for individual loop nests (DAC03, PLDI02)

Analyses and transformations to exploit ILP (PLDI02) and maximize memory bandwidth (LPCP02)

Communication and pipeline analysis to exploit data and task parallelism (FCCM02, DAC03)

[Figure: design flow with boxes C, Analyses and Transformations, SUIF to VHDL, Behavioral Synthesis and Estimation, a "Good Design?" decision with No and Yes branches, and Logic Synthesis and Place & Route.]

Page 5: Search Space Properties  for Pipelined FPGA Applications

This Research

Integrates communication and pipelining analysis with the single loop design space exploration

Defines and illustrates search space properties for the global optimization problem

Describes a search algorithm and presents a case study

Page 6: Search Space Properties  for Pipelined FPGA Applications

Sequential MVIS Kernel

[Figure: execution order over time for the sequential kernel. The Edge, Feature, and Distance loop nests form Pipeline Stages S1, S2, and S3. Each reads and writes 2-D arrays (A, B, F, D) in row-wise access order, and read-after-write (RAW) data dependences on B and F link the stages.]

Page 7: Search Space Properties  for Pipelined FPGA Applications

Reaching Definition Data Access Descriptor

$RDAD_{\{r,w\},\,s}(A)$: a set of tuples that describes basic data access information for array A

s: program point

r, w: read or write array access

Each tuple contains:

accessed array section, as integer linear inequalities

traversal order, as a vector of dimensions from slowest to fastest

vector of dominant induction variables for each dimension

set of statements this tuple describes (def or use)

set of reaching definitions
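As a rough illustration of the descriptor fields listed above, one tuple could be held in a C structure like the following. This is only a sketch under assumptions: the names (`rdad_tuple`, `dim_range`, `MAX_DIMS`, `rdad_reaches`) are hypothetical, the array section is simplified to per-dimension bounds rather than general integer linear inequalities, and statement sets are simplified to bitmasks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_DIMS 4  /* hypothetical limit on array dimensionality */

/* Inclusive bounds lo <= d <= hi for one array dimension
   (a simplification of general integer linear inequalities). */
typedef struct { int lo, hi; } dim_range;

typedef enum { ACCESS_READ, ACCESS_WRITE } access_kind;

/* One RDAD tuple for an array at a program point (illustrative names). */
typedef struct {
    access_kind kind;                 /* r or w */
    int         stage;                /* program point s */
    int         ndims;                /* number of array dimensions */
    dim_range   section[MAX_DIMS];    /* accessed array section */
    int         order[MAX_DIMS];      /* dimensions, slowest to fastest */
    char        induction[MAX_DIMS];  /* dominant induction variable per dim. */
    uint32_t    statements;           /* bitmask: statements described */
    uint32_t    reaching_defs;        /* bitmask: reaching definitions */
} rdad_tuple;

/* A use is fed by a definition when one of the definition's statements
   appears in the use's reaching-definition set. */
static bool rdad_reaches(const rdad_tuple *def, const rdad_tuple *use)
{
    assert(def->kind == ACCESS_WRITE && use->kind == ACCESS_READ);
    return (def->statements & use->reaching_defs) != 0;
}
```

A write tuple and a read tuple for the same array can then be paired with `rdad_reaches` to decide whether a communication edge between their stages is needed.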

Page 8: Search Space Properties  for Pipelined FPGA Applications

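Communication Requirements

Stage S1: Write (3) of array B

Stage S2: Read (4) of array B

The read-after-write (RAW) dependence on B between the two stages requires communication.

$$RDAD_{w,s_1}(B) = \langle\, 0 \le d_1 \le 29,\ 0 \le d_2 \le 29 \mid (1, 2) \mid (x, y) \mid 3 \mid \emptyset \,\rangle$$

$$RDAD_{r,s_2}(B) = \langle\, 0 \le d_1 \le 29,\ 0 \le d_2 \le 29 \mid (1, 2) \mid (x, y) \mid 4 \mid \{3\} \,\rangle$$

$$f\left(RDAD_{w,s_i}(B),\ RDAD_{r,s_j}(B)\right)$$

Solve directly for data, granularity, placement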
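One possible reading of "solve directly for data" is to intersect the section written by the producer with the section read by the consumer. The sketch below assumes rectangular 2-D sections. The type `section2d` and both helper functions are hypothetical names for illustration, not part of DEFACTO.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive rectangular section of a 2-D array (hypothetical simplification). */
typedef struct { int lo1, hi1, lo2, hi2; } section2d;

static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Intersects the section written by the producer with the section read by
   the consumer. Returns false when the two sections do not overlap. */
static bool communicated_section(section2d written, section2d read,
                                 section2d *out)
{
    section2d s = {
        max_int(written.lo1, read.lo1), min_int(written.hi1, read.hi1),
        max_int(written.lo2, read.lo2), min_int(written.hi2, read.hi2)
    };
    if (s.lo1 > s.hi1 || s.lo2 > s.hi2)
        return false;
    *out = s;
    return true;
}

/* Number of array elements in a non-empty section. */
static long section_elements(section2d s)
{
    assert(s.lo1 <= s.hi1 && s.lo2 <= s.hi2);
    return (long)(s.hi1 - s.lo1 + 1) * (s.hi2 - s.lo2 + 1);
}
```

For two identical sections covering indices 0 to 29 in both dimensions, the intersection is the whole section, so all 900 elements would need to be communicated.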

Page 9: Search Space Properties  for Pipelined FPGA Applications

Task Graph

Nodes are pipeline stages

Communication edge descriptors (CEDs), $CED_{s_i \to s_j}(A)$, computed from RDADs:

array section, per communication instance

send point

receive point

[Figure: task graph with five stages, S1 to S5, each annotated with its set of RDADs. Edges: CED s1->s2 (a), CED s1->s4 (x), CED s1->s5 (a), CED s2->s3 (a), CED s2->s3 (b), CED s4->s5 (y), CED s5->s3 (y). Each edge is labeled with the producer rate of its source stage and the consumer rate of its destination stage.]

Page 10: Search Space Properties  for Pipelined FPGA Applications

Global Optimization Strategy

2 Criteria:

Design's execution time should be minimized

Design's space utilization, for a given level of performance, should be minimized

Estimates:

Behavioral synthesis area (all loops)

Behavioral synthesis timing (all loops)

Communication rates
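The two criteria can be read as a lexicographic ordering on candidate designs: faster designs win, and among equally fast designs the smaller one wins. The following is a minimal sketch of that reading. The `design_estimate` record and `design_better` function are hypothetical, and the numbers would come from the estimates listed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Estimates for one candidate design (hypothetical record). */
typedef struct {
    long cycles;  /* estimated execution time */
    long area;    /* estimated space utilization */
} design_estimate;

/* Returns true when candidate a is preferred to candidate b:
   lower execution time first, then lower area at equal performance. */
static bool design_better(design_estimate a, design_estimate b)
{
    assert(a.cycles >= 0 && b.cycles >= 0 && a.area >= 0 && b.area >= 0);
    if (a.cycles != b.cycles)
        return a.cycles < b.cycles;
    return a.area < b.area;
}
```

Under this ordering a larger design is still preferred whenever it is strictly faster, which matches the phrase "for a given level of performance".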

Page 11: Search Space Properties  for Pipelined FPGA Applications

Transformations

Local:

Unroll and jam

Scalar replacement

Custom data layout

Global:

Communication granularity and placement

Producer-Consumer Rate Matching

Data reorganization on-chip
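To make the first two local transformations concrete, the sketch below applies unroll and jam followed by scalar replacement to a small stand-in loop nest. The loop body, the array size `N`, and the function names are assumptions chosen for illustration. This is not the MVIS kernel itself.

```c
#include <assert.h>

#define N 8  /* hypothetical image size; N - 2 must be even in this sketch */

/* Original loop nest: each output element sums two vertically
   adjacent input elements. */
static void sum_rows(int u[N][N], int out[N][N])
{
    for (int x = 0; x < N - 2; x++)
        for (int y = 0; y < N - 2; y++)
            out[x][y] = u[x][y] + u[x + 1][y];
}

/* Outer loop unrolled by 2 and jammed into the inner loop. Scalar
   replacement then loads u[x+1][y] once and reuses it in both copies
   of the body, saving one memory access per inner iteration. */
static void sum_rows_uj2(int u[N][N], int out[N][N])
{
    assert((N - 2) % 2 == 0);  /* no cleanup loop in this sketch */
    for (int x = 0; x < N - 2; x += 2)
        for (int y = 0; y < N - 2; y++) {
            int mid = u[x + 1][y];             /* shared by both copies */
            out[x][y]     = u[x][y] + mid;
            out[x + 1][y] = mid + u[x + 2][y];
        }
}
```

Both functions write the same values. The unrolled version exposes two independent additions per inner iteration and performs three loads where the original would perform four over the same two rows.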

Page 12: Search Space Properties  for Pipelined FPGA Applications

High-Level Design Flow

[Figure: design flow from C input to a Configuration Bit Stream. Boxes: Basic Compiler Optimizations, Communication and Pipeline Analysis, Unroll and Jam, Scalar Replacement, Custom Data Layout, Producer-Consumer Rate Matching, Communication Granularity Analysis, SUIF to VHDL, Behavioral Synthesis and Estimation, Logic Synthesis / Place & Route. A "Good Design?" decision has No and Yes branches.]

Page 13: Search Space Properties  for Pipelined FPGA Applications

Observation 1: Non-increasing Memory Accesses

Choose to place communication on-chip

[Figure: Single Loop Solution compared with Global Solution. Labels: off-chip memory, configurable logic device, Stage 1 (S1), Stage 2 (S2), and arrays A, B, D, E.]

Page 14: Search Space Properties  for Pipelined FPGA Applications

Observation 2: Non-increasing Unroll Factor

Local solution assumed to be best-case performance, worst-case space estimate

[Figure: Single Loop Solution compared with Global Solution for Stage 1 (S1) and Stage 2 (S2). Annotation: reduce unroll factors.]

Page 15: Search Space Properties  for Pipelined FPGA Applications

Observation 3: Matching Rates without Affecting Performance

Avoid creating longer critical paths

If rate_prod(d) < rate_cons(d), we can safely reduce the unroll factor for S3 until the rates match

[Figure: task graph with stages S1, S2, and S3. Edge CED(d) is labeled rate_prod(d) and rate_cons(d); edge CED(a) is labeled rate_prod(a) and rate_cons(a).]
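The rate-matching rule can be sketched as a small loop. The rate model here is an assumption made only for illustration: the consumer's rate is taken to be its unroll factor times a fixed per-copy rate, and unroll factors are assumed to be powers of two. The function name `match_consumer_unroll` is hypothetical.

```c
#include <assert.h>

/* Reduces the consumer's unroll factor while the reduced consumer would
   still keep up with the producer, so the producer remains the limiting
   stage and overall performance is unaffected.
   Assumed model: rate_cons = unroll * cons_rate_per_copy. */
static int match_consumer_unroll(int rate_prod, int cons_rate_per_copy,
                                 int cons_unroll)
{
    assert(rate_prod > 0 && cons_rate_per_copy > 0 && cons_unroll >= 1);
    assert((cons_unroll & (cons_unroll - 1)) == 0);  /* power of two */

    while (cons_unroll > 1 &&
           (cons_unroll / 2) * cons_rate_per_copy >= rate_prod)
        cons_unroll /= 2;
    return cons_unroll;
}
```

With a producer rate of 4 and a per-copy consumer rate of 1, a consumer unrolled 16 times is reduced to an unroll factor of 4, where the two rates are equal. A consumer that is already slower than the producer is left unchanged.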

Page 16: Search Space Properties  for Pipelined FPGA Applications

Optimization Algorithm: Step 1

Apply Pipeline and Communication Analysis

[Figure: task graph S1 -> S2 -> S3 with edges CED s1,s2 (peak) and CED s2,s3 (feature_x). S1: RDAD r,s1 (u), RDAD w,s1 (peak). S2: RDAD r,s2 (peak), RDAD w,s2 (feature_x), RDAD w,s2 (feature_y). S3: RDAD r,s3 (feature_x), RDAD r,s3 (u), RDAD r,s3 (v), RDAD w,s3 (ssd).]

    /* S1: edge detection, writes peak */
    for (x = 0; x < image-2; x++) {
      for (y = 0; y < image-2; y++) {
        uh1 = -3*u[x][y] - 3*u[x+1][y] …;
        uh2 = -3*u[x][y] + 3*u[x+1][y] …;
        peak[x][y] = uh1 + uh2;
      }
    }

    /* S2: feature extraction, reads peak and writes feature_x */
    for (x = 0; x < image-2; x++) {
      for (y = 0; y < image-2; y++) {
        if (peak[x][y] > threshold)
          feature_x[x][y] = x;
        else
          feature_x[x][y] = 0;
      }
    }

    /* S3: distance computation, reads feature_x and writes ssd */
    for (x = 0; x < image-2; x++) {
      for (y = 0; y < image-2; y++) {
        if (feature_x[x][y] != 0)
          ssd[x][y] = (u[x][y] - v[x][y+1]) * (u[x][y] - v[x][y+1]) …;
      }
    }

Page 17: Search Space Properties  for Pipelined FPGA Applications

Optimization Algorithm: Step 2

Find Single Loop Solutions in Isolation

[Figure: Stage 1, Stage 2, and Stage 3, connected through peak and feature_x, each with its own set of unroll factors.]

Each of the three loop nests from Step 1 is explored on its own.

Page 18: Search Space Properties  for Pipelined FPGA Applications

Optimization Algorithm: Step 3

Match Producer and Consumer Rates

[Figure: task graph with stages S1, S2, and S3. Edge CED(peak) is labeled rate_prod(peak) and rate_cons(peak); edge CED(feature_x) is labeled rate_prod(feature_x) and rate_cons(feature_x).]

rate_prod(peak) = rate_cons(peak)

rate_prod(feature_x) = rate_cons(feature_x)

Page 19: Search Space Properties  for Pipelined FPGA Applications

Optimization Algorithm: Step 4

Apply Greedy Strategy to Meet Chip Constraint

[Figure: Stage 1, Stage 2, Stage 3.]

$$\sum_{i=1}^{n} area_i \le capacity$$

If not, apply greedy strategy and then repeat steps 3 and 4.

Final Solution
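The slide does not spell out the greedy strategy, so the following is only one plausible sketch: while the summed stage areas exceed the chip capacity, halve the unroll factor of the stage that currently occupies the most area. The area model (area proportional to the unroll factor), the `stage_design` type, and the function names are assumptions. The rate matching that Step 3 would repeat after each reduction is left out.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int unroll;         /* current unroll factor (assumed a power of two) */
    int area_per_copy;  /* hypothetical area of one copy of the loop body */
} stage_design;

static int stage_area(stage_design s) { return s.unroll * s.area_per_copy; }

/* Sum of the estimated areas of all pipeline stages. */
static int total_area(const stage_design *stages, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += stage_area(stages[i]);
    return sum;
}

/* Greedily halves the unroll factor of the largest reducible stage until
   the design fits. Returns false if it cannot fit even without unrolling. */
static bool fit_to_capacity(stage_design *stages, int n, int capacity)
{
    assert(n > 0 && capacity > 0);
    while (total_area(stages, n) > capacity) {
        int pick = -1;
        for (int i = 0; i < n; i++)
            if (stages[i].unroll > 1 &&
                (pick < 0 || stage_area(stages[i]) > stage_area(stages[pick])))
                pick = i;
        if (pick < 0)
            return false;  /* every stage is already at unroll factor 1 */
        stages[pick].unroll /= 2;
    }
    return true;
}
```

Because unroll factors only ever decrease here, the sketch stays within the non-increasing search space described by Observation 2.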

Page 20: Search Space Properties  for Pipelined FPGA Applications

Related Work

Synthesizing high-level constructs: Handel-C, RaPiD, PipeRench, Babb et al.

Design space exploration: Derrien/Rajopadhye, Cameron, PICO

Program analysis on arrays: Hall et al., Amarasinghe, Balasundaram & Kennedy

Pipeline analysis: Splash 2, Weinhardt & Luk, Du et al., Goldstein et al.

Page 21: Search Space Properties  for Pipelined FPGA Applications

Conclusion

System-level compiler automatically derives a pipelined implementation with explicit communication, while partitioning the chip capacity among pipeline stages

Global optimization strategy: built upon local solution with communication

Constrain the search space: non-increasing memory accesses, non-increasing unroll factors

Page 22: Search Space Properties  for Pipelined FPGA Applications

Contact Information

Project Web Site

www.isi.edu/asd/defacto

Authors’ email addresses

ziegler, mhall, [email protected]