Resource Awareness FPGA Design Practices for Reconfigurable Computing: Principles and Examples Wu, Jinyuan Fermilab, PPD/EED April 2007


Resource Awareness FPGA Design Practices for

Reconfigurable Computing: Principles and Examples

Wu, Jinyuan

Fermilab, PPD/EED

April 2007


Introduction

• Short Course (1/2 day):
  – “How to Design Compact FPGA Functions: Resource awareness design practices.”
  – http://www-ppd.fnal.gov/EEDOffice-W/Projects/ckm/comadc/CompactFPGAdesign.pdf

• Refresher Course (45 min):
  – “Resource Saving in Micro-Computer Software & FPGA Firmware Designs”
  – http://www-ppd.fnal.gov/EEDOffice-W/Projects/ckm/comadc/ResourceSaving.ppt

• This Document
  – Resource Awareness FPGA Design Practices for Reconfigurable Computing: Principles and Examples

What can be done with an FPGA?


Example: ADC Using FPGA

[Figure: block diagram with AMP & Shaper channels, ADCs, and an FPGA containing TDC blocks; a passive network (R1, R1, R2, C) driven by the FPGA produces VREF.]

• Analog signals from AMP & Shapers are directly fed to FPGA pins.

• FPGA outputs and passive RC network are used to generate ramping reference voltage VREF.

• The input voltages and VREF are compared using FPGA differential input receivers.

• The times of transitions representing input voltage values are digitized by TDC blocks in FPGA.

[Figure: timing diagram relating the input voltages V1 to V4 to the transition times T1 to T4.]
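The ramp-compare scheme described in the bullets above can be sketched in a few lines. This is a minimal model, assuming an ideal exponential RC ramp; the levels `V_START` and `V_END` and the time constant `TAU_NS` are hypothetical values chosen for illustration and are not taken from the slides. It shows how a transition time recorded by a TDC maps back to an input voltage.

```python
import math

# Hypothetical ramp parameters, for illustration only.
V_START = 1.0    # volts, reference level at t = 0
V_END = 2.5      # volts, level the RC network charges toward
TAU_NS = 500.0   # nanoseconds, RC time constant

def vref(t_ns):
    """Reference voltage at time t_ns for an ideal RC charging curve."""
    return V_END + (V_START - V_END) * math.exp(-t_ns / TAU_NS)

def crossing_time(v_in):
    """Time at which the ramp crosses v_in (what the comparator and TDC record)."""
    return -TAU_NS * math.log((V_END - v_in) / (V_END - V_START))

def time_to_voltage(t_ns):
    """Convert a digitized transition time back to the input voltage."""
    return vref(t_ns)

t = crossing_time(1.8)
print(round(t, 1), round(time_to_voltage(t), 6))
```

In a real system the ramp shape would be calibrated rather than assumed, since the conversion is only as good as the knowledge of VREF versus time.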


TDC Inside FPGA

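[Figure: TDC block diagram. The input is sampled on four clock phases (c0, c90, c180, c270) in a multiple-sampling stage (Q0 to Q3), transferred across clock domains (QD, QE, QF), and passed to transition detection and encoding (DV, T0, T1). A coarse time counter supplies TS.]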

• Sampling rate: 360 MHz x4 phases = 1.44 GHz.

• LSB = 0.69 ns.

• Logic elements with critical timing are assigned as shown.

[Figure: placement view, labeled “4Ch”.]

• Logic elements with non-critical timing are placed freely by the compiler's fitter.
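The four-phase sampling scheme can be modeled in software as follows. This is a sketch only: the function names are hypothetical, and reading DV as a data-valid flag and T0, T1 as the two fine-time bits is an assumption about the diagram. The clock frequency and phase count come from the slide.

```python
CLOCK_MHZ = 360.0
PHASES = 4
LSB_NS = 1000.0 / (CLOCK_MHZ * PHASES)   # about 0.69 ns

def encode_transition(prev_sample, samples):
    """Find the first 0 -> 1 transition among the four phase samples.

    prev_sample is the last sample of the previous clock cycle.
    Returns (data_valid, fine_time), with fine_time in 0..3.
    """
    prev = prev_sample
    for fine, s in enumerate(samples):
        if prev == 0 and s == 1:
            return True, fine
        prev = s
    return False, 0

def hit_time_ns(coarse_count, fine):
    """Combine the coarse counter value and the fine time into nanoseconds."""
    return (coarse_count * PHASES + fine) * LSB_NS

valid, fine = encode_transition(0, [0, 0, 1, 1])
print(valid, fine, round(hit_time_ns(10, fine), 2))
```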


ADC Test: Waveform Digitization on BD3_19

[Figure: two plots. Raw Data: leading-ramp and trailing-ramp values (0 to 64) versus sample index (0 to 256). Converted: Input Waveform, Overlap Trigger & Reference Voltage, V (1 to 2.5) versus t (2500 to 5500 ns), with leading and trailing ramps. Inset: FPGA with two TDC blocks and the VREF network (50, 50, 100, 1000 pF).]

A lot can be done with an FPGA if one can imagine.


Micro-computing vs. Reconfigurable Computing

• In a microprocessor, users specify a program that runs on fixed logic circuits.

• In an FPGA, users specify the logic circuits (as well as the program).

• FPGA computing need not follow microprocessor architectures. (But useful experience can be borrowed.)

• The usefulness of FPGA reconfigurable computing is yet to be fully appreciated.

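Example: (100+3-4)*5+7 = ?

[Figure: the same expression evaluated two ways. CPU: a control sequence (LD and the operators +, -, *, +) is applied step by step to the data 100, 3, 4, 5, 7. FPGA: the operators are laid out as a circuit that the data flow through. Labels: CPU with Data and Program; FPGA with Data, Program and Configuration.]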
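The contrast can be illustrated with a toy model. This is only an analogy in software: `run_cpu` and `fpga_datapath` are hypothetical names, and a Python function is of course not a circuit. The first function steps through an instruction stream on one fixed accumulator; the second stands for a datapath configured for this single expression.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def run_cpu(program, data):
    """Fixed logic, one instruction per step; the program selects each operation."""
    acc = 0
    for op, value in zip(program, data):
        acc = value if op == "LD" else OPS[op](acc, value)
    return acc

def fpga_datapath(a, b, c, d, e):
    """Logic 'configured' for this one expression; there is no instruction stream."""
    return (a + b - c) * d + e

data = [100, 3, 4, 5, 7]
print(run_cpu(["LD", "+", "-", "*", "+"], data), fpga_datapath(*data))
```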


Example: Track Fitting

[Figure: a track crossing detector planes at z = z0 and at (z - z0) = ±2 and ±4, with y0 and 4h marked.]

$y = y_0 + h\,(z - z_0) + \eta\,(z - z_0)^2$


Relative Errors of Several Track Fitter Schemes

[Figure: relative errors (0 to 20) versus track half length (0 to 18) for four schemes: 3-point, next planes; 3-point, full length; FPGA fitter; Least Square. Annotations: Least Square Fitter; Multiplier-less FPGA LS Fitter.]

$y = y_0 + h\,(z - z_0) + \eta\,(z - z_0)^2$


Least Square Fitter

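$y = y_0 + h\,(z - z_0) + \eta\,(z - z_0)^2$

$y_0 = \sum_i c_i y_i, \quad h = \sum_i d_i y_i, \quad \eta = \sum_i e_i y_i$

[Figure: hit coordinates y1 to y7 and coefficients c1 to c7, d1 to d7, e1 to e7 are fed to three multipliers (X), one per parameter.]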

• The parameters can be described as inner-products.

• Hit coordinates and coefficients are fed simultaneously.

• The inner-products can be calculated with multiplier-accumulator structures.
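The inner-product form can be checked with a short sketch. It assumes seven equally spaced planes at z - z0 = -3 ... 3, which the slides do not state, and derives the coefficient sets c, d, e from the normal equations of a quadratic least-squares fit on positions symmetric about z0. The helper names are hypothetical.

```python
from fractions import Fraction

def ls_coefficients(zs):
    """Coefficients (c, d, e) for a quadratic fit; zs must be symmetric about 0."""
    s0 = len(zs)
    s2 = sum(z * z for z in zs)
    s4 = sum(z ** 4 for z in zs)
    det = s0 * s4 - s2 * s2
    c = [Fraction(s4 - s2 * z * z, det) for z in zs]   # gives y0
    d = [Fraction(z, s2) for z in zs]                  # gives h
    e = [Fraction(s0 * z * z - s2, det) for z in zs]   # gives eta
    return c, d, e

def mac(coeffs, ys):
    """Multiplier-accumulator: one multiply and one add per incoming hit."""
    acc = 0
    for k, y in zip(coeffs, ys):
        acc += k * y
    return acc

zs = list(range(-3, 4))                     # assumed plane positions z - z0
c, d, e = ls_coefficients(zs)
ys = [5 + 2 * z + 3 * z * z for z in zs]    # a noiseless test track
print(mac(c, ys), mac(d, ys), mac(e, ys))
```

Because the coefficients depend only on the plane geometry, they can be precomputed and stored, leaving one multiply-accumulate per hit and per parameter at run time.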


Multiplier-less (ML) Quasi-Least Square Fitter

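$y_0 = \sum_i c_i y_i, \quad h = \sum_i d_i y_i, \quad \eta = \sum_i e_i y_i$

[Figure: hit coordinates y1 to y7, together with per-hit control fields x1 to x7, drive three shift (<<) and add/subtract (+/-) units in place of multipliers.]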

• The coefficients are described as “two-bit” numbers, e.g.:
  – 5 = 4 + 1; 7 = 8 - 1; 112 = 128 - 16

• The multiplication is replaced with two shift & add/sub operations.

• There are two clock cycles to fetch a measurement point (i.e., y1, y2, etc.), allowing two shift & add/sub operations.
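A minimal sketch of the “two-bit” idea follows. The brute-force search in `two_term` is a hypothetical helper for picking the nearest value of the form ±2^a ± 2^b; how the coefficients were actually chosen in the original design is not stated in the slides.

```python
def two_term(coeff, max_shift=12):
    """Return two (sign, shift) terms whose sum is closest to coeff."""
    best = None
    for a in range(max_shift + 1):
        for sa in (1, -1):
            for b in range(max_shift + 1):
                for sb in (1, -1):
                    value = sa * (1 << a) + sb * (1 << b)
                    err = abs(value - coeff)
                    if best is None or err < best[0]:
                        best = (err, (sa, a), (sb, b))
    return best[1], best[2]

def shift_add(y, terms):
    """Multiply y by a two-term coefficient using only shifts and add/subtract."""
    acc = 0
    for sign, shift in terms:
        acc += sign * (y << shift)
    return acc

for coeff in (5, 7, 112):
    print(coeff, shift_add(3, two_term(coeff)))   # 3 * coeff in each case
```

A coefficient that is not exactly representable (11, for example) is rounded to the nearest two-term value, which is what makes the fit “quasi” least square.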


Inaccuracy Doesn’t Matter, a Lot of the Time

[Figure: relative error (0 to 5) versus half-length of the track (0 to 18). Curves: eta4096, hh512 and yy32, each for the Least Square fitter and for the multiplier-less quasi-least-square FPGA fitter.]

$y = y_0 + h\,(z - z_0) + \eta\,(z - z_0)^2$


Fitting is easy. Matching hits is harder.

Software                                             | FPGA Typical                    | FPGA Resource Saving Approaches
O(n^2): for(){ for(){…} }                            | O(n)*O(N): Comparator Array     | Hash Sorter, O(n)*O(N): in RAM
O(n^3): for(){ for(){ for(){…} } }                   | O(n)*O(N^2): CAM, Hough Trans.  | Tiny Triplet Finder, O(n)*O(N*logN)
O(n^4): for(){ for(){ for(){ for(){…} } } }          |                                 |
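The first row of the comparison can be illustrated in software. This is a sketch under assumptions: hits are plain integers, two hits “match” when they fall in the same bin, and `BIN_WIDTH` and `N_BINS` are hypothetical values. The nested loop compares every pair; the hash-sorter version stores one plane's hits in bins (the role played by RAM) and then reads only the bin addressed by each hit from the other plane.

```python
BIN_WIDTH = 4    # hypothetical matching granularity
N_BINS = 64      # hypothetical number of RAM bins

def bin_of(hit):
    """Bin (RAM address) for a hit; hits are assumed to lie in 0 .. N_BINS*BIN_WIDTH-1."""
    return hit // BIN_WIDTH

def match_nested(hits_a, hits_b):
    """O(n^2): compare every hit in plane A with every hit in plane B."""
    return [(a, b) for a in hits_a for b in hits_b if bin_of(a) == bin_of(b)]

def match_hashed(hits_a, hits_b):
    """Hash sorter: one write per B hit, one read per A hit."""
    bins = [[] for _ in range(N_BINS)]
    for b in hits_b:
        bins[bin_of(b)].append(b)
    return [(a, b) for a in hits_a for b in bins[bin_of(a)]]

hits_a = [5, 17, 40, 200]
hits_b = [42, 6, 7, 130, 201]
print(sorted(match_hashed(hits_a, hits_b)))
```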


Resource Saving Tricks

Loop Reduction Tricks: The number of computations in a given task is reduced by (1) using fewer iterations in loops and/or (2) using fewer operations in each iteration.

Non-Loop Reduction Tricks: The number of computations in a given task is unchanged. FPGA resources are saved by (1) reusing the resources multiple times via sequencing and/or (2) using transistor-saving resources such as RAM.


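Resource Saving Tricks: Loop-Reduction

• Multiplier-less (ML) Approaches

• Recursive Implementation of FIR Filter

• FFT: O(n)*O(log(N))

• Tiny Triplet Finder: O(n)*O(N*log(N))

[Figure: four diagrams. Multiplier-less: a multiplier (X) replaced by shift (<<) and add/subtract (+/-). FIR filter: a direct form with taps *h1, *h2, ..., *h[K] from x[n] to y[n], beside a recursive form built from adders with inputs x[n] and -x[n-K] (giving s[n]) and s[n] and -s[n-K] (giving y[n]). Tiny Triplet Finder: two bit arrays with shifters (scaled by *R1/R3 and *R2/R3) feeding bit-wise coincident logic.]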
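The recursive FIR idea can be checked numerically. The sketch below assumes the simplest case, a moving sum over K samples with all taps equal to 1, and that cascading two such stages gives a triangular response; the function names are hypothetical. The direct form needs one multiply-add per tap and per sample, the recursive form one add and one subtract per sample regardless of K.

```python
def fir_direct(x, h):
    """Direct-form FIR: len(h) multiply-adds per output sample."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

def boxcar_recursive(x, K):
    """Recursive moving sum: s[n] = s[n-1] + x[n] - x[n-K]."""
    s = []
    acc = 0
    for n, xn in enumerate(x):
        acc += xn
        if n - K >= 0:
            acc -= x[n - K]
        s.append(acc)
    return s

x = [1, 4, 2, 0, 7, 3, 3, 1, 5]
K = 3
print(boxcar_recursive(x, K) == fir_direct(x, [1] * K))
# Two cascaded stages behave like a single FIR with triangular taps.
print(boxcar_recursive(boxcar_recursive(x, K), K) == fir_direct(x, [1, 2, 3, 2, 1]))
```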


Resource Saving Tricks: Non-Loop-Reduction

• Sequencing

• Using RAM: Hash Sorter/Histogram

[Figure: sequencing diagrams. Operations OP1 to OP4 are repeated in sequence after an initialization step; a second diagram shows Initialization 1, 2 and 3 followed by repeated OP1 to OP4 sequences.]
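As an illustration of using RAM in place of dedicated logic, a histogram can be built with one memory block and one adder, performing a read-modify-write per input value, rather than with one counter per bin. This is a software sketch with a hypothetical function name; a hardware version would also have to handle back-to-back hits on the same address.

```python
def histogram_in_ram(values, n_bins):
    """Booking a histogram with a single RAM and a single incrementer."""
    ram = [0] * n_bins           # one RAM block, n_bins words
    for v in values:             # one input value per step
        count = ram[v]           # read the addressed word
        ram[v] = count + 1       # increment and write back
    return ram

print(histogram_in_ram([1, 3, 1, 0, 3, 1], 4))
```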


An Example of Implicit Computing & Hidden Resources

[Figure: hits with BCO time stamps pass through input control and a de-serializer into a RAM (ports D, W/R, WA, RA; widths 16 and 32 marked).]

• Data with random time stamps are re-ordered according to beam crossing (BCO).

• Data with the same BCO are output together, so the required bandwidth becomes smaller.

• Implicit computing (sorting) is performed with a hidden resource (RAM; it should be static RAM, not dynamic RAM).
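The re-ordering described above can be sketched as follows. It assumes each hit arrives as a (BCO, payload) pair and that the BCO number is used directly as the RAM write address; these details and the function name are illustrative, not taken from the slides. No comparison between hits is ever made: the sort falls out of the addressing.

```python
def sort_by_bco(hits, n_bco):
    """Write each hit to the RAM location addressed by its BCO, then read out in order."""
    ram = [[] for _ in range(n_bco)]      # one slot per beam crossing
    for bco, payload in hits:             # BCO values assumed to lie in 0 .. n_bco-1
        ram[bco].append(payload)          # write address = BCO
    # Sequential read-out: hits sharing a BCO come out together.
    return [(bco, ram[bco]) for bco in range(n_bco) if ram[bco]]

hits = [(2, "a"), (0, "b"), (2, "c"), (1, "d")]
print(sort_by_bco(hits, 4))
```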


Why Save Resources?

Why not?


The Fever of Moore’s Law vs. Maxwell’s Equations

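$\nabla \times \mathbf{H} = \mathbf{J} + \dfrac{\partial \mathbf{D}}{\partial t}, \quad \nabla \times \mathbf{E} = -\dfrac{\partial \mathbf{B}}{\partial t}, \quad \nabla \cdot \mathbf{D} = \rho, \quad \nabla \cdot \mathbf{B} = 0$

[Figure: Op/sec versus year (1998 to 2010), annotated “MIT, 2002” and “WRW”.]

• During the hot days of Moore’s Law, the rules of thumb were:
  – BRB – Buy Rather than Build
  – URU – Use Rather than Understand
  – WRW – Wait Rather than Work

• From fundamental principles like Maxwell’s Equations, it is known that limits to Moore’s Law exist. Technology advances should come from:
  – The I3 Law: Imagination, Innovation & Implementation.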


Total Useful Work = (Clock Frequency) x (Silicon Size) x (Efficiency)

• There is much room for improvement in computation efficiency in both micro-computer software and FPGA firmware.

• Resource awareness saves not only direct costs but also indirect costs such as power consumption, PC board layout, and cooling.

• Unnecessary artificial complexities confuse people, often including the designer.

• Resource saving helps today, when technology stalls.

• Resource saving helps in the future, when technology progresses.

[Figure: two diagrams with labels E, F and S, one part marked “Primarily Users’ Responsibility”.]