Page 1: Making Good Points: Application-Specific Pareto-Point Generation for Design Space Exploration using Rigorous Statistical Methods

David Sheldon, Frank Vahid*

Department of Computer Science and Engineering
University of California, Riverside

*Also with the Center for Embedded Computer Systems at UC Irvine

Page 2: Parameterized Component: Cache

[Zhang/Vahid/Najjar, ISCA 2003, ISVLSI 2003, TECS 2005]

[Diagram: configurable cache with line concatenation; labels: counter, bus, W1, off-chip memory, "16 bytes", "4 physical lines filled when line size is 32 bytes".]

[Figure: bar chart of normalized energy (0% to 120%) per benchmark (padpcm, crc, auto2, bcnt, bilv, binary, blit, brev, g3fax, fir, pjpeg, ucbqsort, v42, adpcm, epic, g721, pegwit, mpeg, jpeg, art, mcf, parser, vpr, Ave) for the configurations cnv8K4W32B, cnv8K1W32B, and cfg8Kwcwslc; off-scale bars are labeled 127%, 620%, and 126%.]

40% avg savings

Page 3: FPGA Systems are Often Built from Parameterized Components

Parameterized components include:
  Cache (e.g., size, associativity, line size)
  Processors
  Co-processors
  Buses (e.g., bit width, network-on-chip structure)

[Diagram: an FPGA containing a uP, cache, MPEG encoder, DSP, and bus, each with a "config" input.]

Page 4: Microblaze Soft-Core Processor – Design Space due to Parameters

[Figure: scatter plot of the design space; axes: cycles and equivalent LUTs, with tick labels in millions (0 to 14) and thousands (0 to 200).]

520 points, over 10 days
~35 min per point
<1 min to execute
Remaining time was in synthesis and place and route

Pareto points: points for which no other point is better in all metrics.
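The Pareto-point definition on this slide can be made concrete with a small filter. The following is a minimal sketch, not the authors' tool: it uses the standard dominance test (no worse in every metric, strictly better in at least one), which is slightly stricter than the slide's wording, and the sample points are hypothetical (size, cycles) pairs rather than data from the plot.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (size, cycles); both metrics are minimized


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every metric and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_points(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (simple O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical measurements, not taken from the slides.
designs = [(10, 9.0), (12, 7.5), (15, 7.5), (20, 4.0), (25, 4.5)]
front = pareto_points(designs)
print(front)
```

With 520 measured points, a quadratic scan like this is negligible next to the roughly 35 minutes of synthesis and place and route per point.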

Page 5: Pareto Points Differ Per Application and Per Criteria

[Diagram: a platform with configurations c1, c2, c3, ...; Designer A and Designer B; App a1 and App a2. Panels (a) and (b) plot energy versus time with the Pareto points marked; the configurations c1, c2, c3 fall in different places in the two panels.]

Page 6: Previous Work: Parameter Interdependency Graph

Platune [Givargis/Vahid 2002]: introduced the parameter interdependency graph
  Edges: parameters are dependent
  Nodes not connected: independent
  Search dependent parameters exhaustively; compose local Pareto points into global points
  Greatly reduces the search space if parameters are independent
  Good results, 44 hours

Randomized approaches
  Pareto Simulated Annealing (PSA) [Talarico 2006]: good results, 6 hours
  Genetic Algorithms [Ascia 2005]: good results, 4 hours

[Diagram: Platune's architecture: MIPS, I$, D$, MEM, CPU–I$ bus, CPU–D$ bus, $-MEM bus. Parameters shown: cache size, associativity, and line size; bus code and address code; supply voltage.]
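To see why an interdependency graph shrinks the search, the sketch below groups parameters into connected components and compares the number of configurations an exhaustive search would visit with the number visited when each group is searched on its own. The parameter names, value counts, and edges are hypothetical, and composing the local Pareto points into global ones would add further evaluations that are not counted here.

```python
from math import prod
from typing import Dict, List, Set, Tuple


def clusters(nodes: List[str], edges: List[Tuple[str, str]]) -> List[Set[str]]:
    """Group parameters into connected components of the interdependency graph."""
    adj: Dict[str, Set[str]] = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen: Set[str] = set()
    out: List[Set[str]] = []
    for n in nodes:
        if n in seen:
            continue
        comp: Set[str] = set()
        stack = [n]
        while stack:  # depth-first walk over one component
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(adj[cur] - comp)
        seen |= comp
        out.append(comp)
    return out


# Hypothetical parameters with the number of values each can take.
values = {"i$ size": 4, "i$ assoc": 3, "d$ size": 4, "d$ assoc": 3, "voltage": 5}
edges = [("i$ size", "i$ assoc"), ("d$ size", "d$ assoc")]

groups = clusters(list(values), edges)
exhaustive = prod(values.values())
per_cluster = sum(prod(values[p] for p in g) for g in groups)
print(exhaustive, per_cluster)
```

In this made-up case the count drops from a product over all parameters to a sum of much smaller products, which is the effect the slide describes for independent parameters.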

Page 7: Our Approach

We developed:
  A Design-of-Experiments (DoE)-based technique to automatically generate a parameter interdependency graph
    Relieves the designer of that burden
  A technique to generate Pareto points via an edge-weight-based algorithm on the parameter interdependency graph
    Improves speed versus Platune
  Called the DoE-Based Pareto-Point Generator (DPG)

[Figure: energy (J) versus time (sec) plot; additional labels: Time, Performance.]

Page 8: Design of Experiments (DoE)

[Table: one experiment assigns a value to each parameter: i$ size, i$ assoc, d$ size, d$ line, d$ assoc, m-i$ code, m-i$ a code, $-m code, supply voltage. Sample values shown, in order of appearance: 2k, 8, 8k, 8, 32, Bi, Bi, Bi, 4.1.]

[Diagram: Platune's architecture, as on Page 6.]

DoE generates a set of orthogonal experiments that allows for statistical analysis of the search space
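As an illustration of what "orthogonal experiments" means, the sketch below builds a two-level design from a Sylvester Hadamard matrix: every parameter column is balanced between -1 and +1, and every pair of columns is uncorrelated. The slides do not say which DoE design DPG uses, so this construction is an assumption chosen for brevity; a saturated design like this one also confounds main effects with pairwise interactions, so it only demonstrates orthogonality.

```python
from typing import List


def hadamard(n: int) -> List[List[int]]:
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def two_level_design(num_params: int) -> List[List[int]]:
    """Rows are experiments, columns are parameters, entries are -1 or +1."""
    runs = 2
    while runs - 1 < num_params:
        runs *= 2
    h = hadamard(runs)
    # Drop the all-ones first column and keep one column per parameter.
    return [row[1:num_params + 1] for row in h]


design = two_level_design(9)  # nine parameters, as listed on the slide
print(len(design), "experiments instead of", 2 ** 9)
```

Each row maps to one configuration to simulate or synthesize, with -1 and +1 standing for the low and high value of the corresponding parameter.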

Page 9: DPG Algorithm

Subsequent DoE analysis determines main effects of parameters

[Figure: "Y bar Marginal Means Plot": mean response (0 to 0.00016) at effect levels -1 and 1 for each parameter: i$ size, i$ assoc, d$ size, d$ line, d$ assoc, m-i$ code, m-i$ a code, $-m code, supply voltage.]

[Diagram: Platune's architecture, as on Page 6.]
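A marginal means plot like the one on this slide can be computed directly from a two-level design and its measured responses. The sketch below assumes the usual DoE definition of a main effect (mean response at level +1 minus mean response at level -1); the four-run design and its responses are hypothetical.

```python
from typing import List, Sequence, Tuple


def marginal_means(design: Sequence[Sequence[int]],
                   responses: Sequence[float]) -> List[Tuple[float, float]]:
    """For each parameter, return (mean response at -1, mean response at +1)."""
    means = []
    for j in range(len(design[0])):
        low = [y for row, y in zip(design, responses) if row[j] == -1]
        high = [y for row, y in zip(design, responses) if row[j] == 1]
        means.append((sum(low) / len(low), sum(high) / len(high)))
    return means


def main_effects(design: Sequence[Sequence[int]],
                 responses: Sequence[float]) -> List[float]:
    """Main effect of each parameter: high-level mean minus low-level mean."""
    return [high - low for low, high in marginal_means(design, responses)]


# Hypothetical full factorial over two parameters and made-up responses.
design = [[-1, -1], [1, -1], [-1, 1], [1, 1]]
responses = [8.0, 14.0, 6.0, 12.0]
print(marginal_means(design, responses))
print(main_effects(design, responses))
```

A steep line between the two levels in the plot corresponds to a large absolute main effect here.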

Page 10: DPG Algorithm (cont.)

Compute weight of each pair of nodes
Sort edges in decreasing weight
  DK (I$ assoc, CPU-I$ address code)
  DI (I$ assoc, CPU-I$ code)
  IK (CPU-I$ code, CPU-I$ address code)
  IQ (CPU-I$ code, $-MEM address code)
  KQ (CPU-I$ address code, $-MEM address code)
  ...

[Diagram: Platune's architecture, as on Page 6.]
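The slide does not say how an edge weight is computed. One plausible choice, assumed here purely for illustration, is the absolute two-factor interaction effect from the DoE analysis: the mean response where the two parameters sit at the same level minus the mean where they differ. The sketch below computes that weight for every pair and sorts the edges in decreasing order; the parameter names and response model are hypothetical.

```python
from itertools import combinations, product
from typing import List, Sequence, Tuple


def interaction_weight(design: Sequence[Sequence[int]],
                       responses: Sequence[float], i: int, j: int) -> float:
    """Absolute two-factor interaction effect between parameters i and j."""
    same = [y for row, y in zip(design, responses) if row[i] * row[j] == 1]
    diff = [y for row, y in zip(design, responses) if row[i] * row[j] == -1]
    return abs(sum(same) / len(same) - sum(diff) / len(diff))


def sorted_edges(names: Sequence[str], design: Sequence[Sequence[int]],
                 responses: Sequence[float]) -> List[Tuple[Tuple[str, str], float]]:
    """All parameter pairs with their weights, heaviest first."""
    edges = [((names[i], names[j]), interaction_weight(design, responses, i, j))
             for i, j in combinations(range(len(names)), 2)]
    return sorted(edges, key=lambda e: e[1], reverse=True)


# Hypothetical three-parameter full factorial with a strong a-b interaction.
names = ["a", "b", "c"]
design = [list(r) for r in product([-1, 1], repeat=3)]
responses = [5 + 2 * a + b + 4 * a * b + 0.5 * b * c for a, b, c in design]
edges = sorted_edges(names, design, responses)
print(edges)
```

In this made-up model the (a, b) pair comes first because its interaction term dominates, mirroring how the slide lists the most strongly coupled parameter pairs first.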

Page 11: DPG Algorithm (cont.)

Pairwise merge of nodes
  Creates a sparse set of Pareto points
  The designer can direct the tool to fill in the regions of interest

[Figure: energy versus time plot showing the original Pareto points and the filled-in Pareto points.]
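The slide gives only the outline of the pairwise merge, so the following is a rough sketch under stated assumptions rather than the DPG implementation: each parameter starts as its own node holding its Pareto-optimal settings, edges are visited in decreasing weight, and merging two nodes evaluates the cross product of their settings and keeps only the Pareto points. The `evaluate` function, the default values, and the two parameters are hypothetical stand-ins for a real simulator.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Config = Dict[str, int]
Metrics = Tuple[float, float]  # (time, energy); both are minimized


def pareto(configs: List[Config], evaluate: Callable[[Config], Metrics]) -> List[Config]:
    """Keep the configurations whose metrics no other configuration dominates."""
    scored = [(c, evaluate(c)) for c in configs]

    def dominated(m: Metrics) -> bool:
        return any(all(o <= v for o, v in zip(other, m)) and other != m
                   for _, other in scored)

    return [c for c, m in scored if not dominated(m)]


def merge_by_edges(values: Dict[str, Sequence[int]],
                   edges: Sequence[Tuple[str, str]],
                   evaluate: Callable[[Config], Metrics]) -> List[Config]:
    """Merge nodes pairwise; edges must be sorted by decreasing weight
    and are assumed to connect every parameter."""
    cluster_of = {name: {"configs": pareto([{name: v} for v in opts], evaluate)}
                  for name, opts in values.items()}
    for a, b in edges:
        ca, cb = cluster_of[a], cluster_of[b]
        if ca is cb:
            continue  # already merged through an earlier, heavier edge
        combined = [{**x, **y} for x, y in product(ca["configs"], cb["configs"])]
        ca["configs"] = pareto(combined, evaluate)
        for name in cluster_of:
            if cluster_of[name] is cb:
                cluster_of[name] = ca
    return next(iter(cluster_of.values()))["configs"]


# Hypothetical model: unset parameters fall back to defaults.
DEFAULTS = {"size": 1, "volt": 1}


def evaluate(cfg: Config) -> Metrics:
    full = {**DEFAULTS, **cfg}
    return (8 / (full["size"] * full["volt"]), full["size"] + full["volt"] ** 2)


result = merge_by_edges({"size": [1, 2], "volt": [1, 2]}, [("size", "volt")], evaluate)
print(result)
```

Because each merge only combines points that survived earlier filtering, the result is a sparse set of Pareto points, which is consistent with the slide's remark that the designer can ask the tool to fill in regions of interest afterwards.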

Page 12: Platune – Pareto Graph with Fill-in

[Figure: energy versus time plot for the jpeg benchmark; series: Platune, Single Factor, DoE IOT, DPG - 3 value, DPG - fill in.]

Page 13: Platune – Pareto Graph with Fill-in

[Figure: energy (J) versus time (sec) plot for the b1_histogram benchmark; series: Platune, DPG.]

Page 14: Interdependency Graph Comparison: Manual vs. Automated

[Figure: interdependency graphs for three benchmarks: jpeg, b1_histogram, g3fax.]

Page 15: Platune Results

[Figure: bar chart of runtime in hours for SF, DoE IOT, DPG, Genetic, PSA, and Platune; the axis runs from 0 to 6, with one off-scale bar labeled 44.]

DPG is 30x faster than Platune and 2.5x faster than Genetic Algorithms

Page 16: Xilinx Microblaze Soft-Core Processor

Tuned the Microblaze for various benchmarks

Exhaustive data generated for 12 benchmarks for comparison

The Microblaze also has a configurable cache, which allows for over 3,000 configurations.

For these tests we used previously generated results, which gave us only 64 configurations.

[Diagram: Microblaze with configurable units bs, FPU, div, mul, MSR, PCMP.]

Page 17: Network on Chip – Results

DPG also works on larger design spaces

Page 18: DPG Scales Well

Number of Parameters   DPG Analysis Phase   Total Design Space   Percent of Design Space
 6                      34                              64       53.13%
10                      67                           1,024       6.54%
15                     136                          32,768       0.42%
20                     234                       1,048,576       0.02%
25                     353                      33,554,432       0.001%
30                     497                   1,073,741,824       0.00005%
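The percentages in this table can be reproduced with a few lines of arithmetic. The sketch below assumes that every parameter has two values, so that the total design space is 2^n (which matches the "Total Design Space" column), and that the "DPG Analysis Phase" column counts configurations evaluated; neither assumption is stated on the slide.

```python
from typing import Tuple

# (number of parameters, DPG analysis-phase count) as listed in the table.
rows = [(6, 34), (10, 67), (15, 136), (20, 234), (25, 353), (30, 497)]


def coverage(num_params: int, analysis_points: int) -> Tuple[int, float]:
    """Return (total design space, percent of it covered by the analysis phase)."""
    total = 2 ** num_params  # assumes two values per parameter
    return total, 100.0 * analysis_points / total


for n, pts in rows:
    total, pct = coverage(n, pts)
    print(f"{n:>2} {pts:>4} {total:>13,} {pct:.5f}%")
```

The analysis count grows by roughly 15x from 6 to 30 parameters while the design space grows by a factor of 2^24, which is why the covered fraction collapses.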

Page 19: Conclusion

The DoE-Based Pareto-Point Generation (DPG) algorithm quickly finds good Pareto points

Results were better and obtained faster than with the previous Platune or randomized techniques

The approach is easier to use: no designer knowledge of parameter interdependencies is needed

Useful for FPGAs as well as other parameterized systems, such as SOCs synthesized to ASICs, parameterized SOCs, etc.