Optimal Configuration of Combined GPP/DSP/FPGA Systems for Minimal SWAP

by John K. Antonio
Department of Computer Science, College of Engineering, Texas Tech University
[email protected]

Fall ACS Meeting, November 4-6, 1997



Outline

• Program Objectives and Schedule of Milestones

• Representative Examples of Current Work

• Competing STAP Weight Solvers

• Power Prediction Model for FPGAs

• Optimal Configuration for SAR Processing

• Questions/Answers


Program Objectives

• Demonstrate advantages of combined use of GPP, DSP, and FPGA technologies for SAR and STAP applications

• Demonstrate advantages/disadvantages of different FPGA designs and implementations in terms of power consumption and real-estate requirements

• Develop and evaluate power prediction models for a GPP/DSP/FPGA prototype system

• Develop formal optimization methods for configuring GPP/DSP/FPGA systems

Program Objectives (continued)

• Incorporation of data characteristics and requirements in optimizing system configuration
– Dynamic range
– Numerical accuracy

• Incorporation of multiple GPP/DSP algorithms, FPGA designs and implementations, and data representations in optimizing system configuration
– Time-domain vs. frequency-domain convolutions
– QR vs. conjugate gradient STAP weight solver
– Fixed-point vs. block floating point vs. floating point


Schedule of Milestones

[Gantt chart; time axis marked June 1997, Dec. 1997, June 1998, Dec. 1998, June 1999, Dec. 1999]

Tasks shown:
• Design STAP Iterative Weight Solver for FPGA
• Inter-GPP/DSP Comm. Simulator for STAP
• Optimal GPP/DSP Config. for SAR
• GPP/DSP/FPGA Platform Construction and Independent Testing of GPP/DSP and FPGA Subsystems
• Implement STAP Iterative Weight Solver on FPGA
• Optimal GPP/DSP Config. for STAP
• Implement SAR Linear Filtering on FPGA
• Optimal GPP/DSP/FPGA Config. for SAR/STAP
• GPP/DSP and FPGA Subsystem Integration and Testing
• Optimal GPP/DSP/FPGA Config. for SAR
• Demonstrate Combined SAR/STAP on GPP/DSP/FPGA Platform
• Implement SAR on GPP/DSP
• Design SAR Linear Filtering for FPGA
• Implement STAP on GPP/DSP
• Implement SAR on GPP/DSP/FPGA Platform
• Optimal GPP/DSP/FPGA Config. for STAP
• Implement STAP on GPP/DSP/FPGA Platform
• Develop FPGA Power Consumption Simulator

Key (each category is split into Research/Design and Implement/Test):
• GPP/DSP Sub-System
• FPGA Sub-System
• GPP/DSP/FPGA System


References for STAP

J. Ward, “Space-Time Adaptive Processing for Airborne Radar,” Technical Report 1015, MIT Lincoln Laboratory, Lexington, MA, 1994.

K. C. Cain, J. A. Torres, and R. T. Williams, (R. A. Games, Project Leader), “RT_STAP: Real-Time Space-Time Adaptive Processing Benchmark,” MITRE Technical Report MTR 96B0000021, Feb. 1997.

MCARM Data Files, Rome Laboratory, (http://sunrise.oc.rl.af.mil).

D. G. Luenberger, Linear and Nonlinear Programming, Addison-Wesley, Reading, MA, 1984.


Formulation of STAP Weight Equation

[Figure: data cube with axes L Channels, Doppler (bins k - 1, k, k + 1), and mth Range Segment (with N_R cells). Caption: Data Matrix Needed for Calculating Weights for kth Doppler Bin and mth Range Segment Using 3rd Order Doppler-Factored STAP]

$$x(k, r):\ \hat{L} \times 1 = 3L \times 1$$

$$\psi(k, m) = \frac{1}{N_R} \sum_{r=1}^{N_R} x(k, r)\, x^H(k, r)$$
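The sample covariance matrix ψ(k, m) defined on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sample_covariance` and the toy snapshot values are hypothetical, and plain Python lists stand in for whatever array types a real implementation would use.

```python
def sample_covariance(snapshots):
    """Return psi = (1/N_R) * sum over r of x_r x_r^H for complex snapshot vectors."""
    n_r = len(snapshots)           # N_R: number of range cells in the segment
    dim = len(snapshots[0])        # 3L for 3rd order Doppler-factored STAP
    psi = [[0j] * dim for _ in range(dim)]
    for x in snapshots:
        for i in range(dim):
            for j in range(dim):
                # (x x^H)[i][j] = x[i] * conj(x[j])
                psi[i][j] += x[i] * x[j].conjugate()
    return [[value / n_r for value in row] for row in psi]


# Hypothetical toy data: N_R = 3 snapshots of dimension 2 (not from the MCARM files)
snapshots = [[1 + 1j, 2 + 0j], [0 + 1j, 1 - 1j], [2 + 0j, 0 + 0j]]
psi = sample_covariance(snapshots)
```

The result is Hermitian by construction, which is the property both weight solvers discussed in this talk rely on.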


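STAP Weight Calculation Using QR Decomposition

$$X(k, m):\ \hat{L} \times N_R = 3L \times N_R$$

$$\psi(k, m) = \frac{1}{N_R} \sum_{r=1}^{N_R} x(k, r)\, x^H(k, r) = \frac{1}{N_R} X(k, m)\, X^H(k, m)$$

The Weight Equation:

$$\psi(k, m)\, w(k, m) = \gamma s$$

QR-Decomposition:

$$X^T(k, m) = QR$$

$$\frac{1}{N_R} R^T Q^T Q^* R^*\, w(k, m) = \gamma s$$

$$\frac{1}{N_R} R^T R^*\, w(k, m) = \gamma s$$

$$R_1^T R_1^*\, w(k, m) = N_R\, \gamma s$$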
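The step that removes Q from the weight equation can be spelled out as follows. This is a sketch that assumes the usual full QR factorization, in which Q is unitary and R consists of an upper-triangular block R_1 (3L × 3L) stacked on zeros; the slide does not state that partition explicitly, and the intermediate vector y is introduced here only for illustration.

```latex
\begin{aligned}
X^T = QR \;&\Rightarrow\; X = R^T Q^T, \qquad X^H = (X^T)^* = Q^* R^* \\
X X^H &= R^T \left(Q^T Q^*\right) R^* = R^T \left(Q^H Q\right)^* R^* = R^T R^* \\
R = \begin{bmatrix} R_1 \\ 0 \end{bmatrix} \;&\Rightarrow\; R^T R^* = R_1^T R_1^* \\
R_1^T\, y = N_R\, \gamma s \ \ (\text{forward substitution}), &\qquad R_1^*\, w(k, m) = y \ \ (\text{back substitution})
\end{aligned}
```

Under these assumptions the weight vector follows from two triangular solves, so the covariance matrix never has to be formed explicitly.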


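Using Conjugate Gradient Approach to Solve the Weight Equation

CG for solving $\Psi w = s$, where $\Psi = \psi(k, m)$:

Initialization: choose $w^{(0)}$ and set

$$d^{(0)} = -g^{(0)} = s - \Psi w^{(0)}$$

Iteration:

$$w^{(k+1)} = w^{(k)} - \frac{g^{(k)T} d^{(k)}}{d^{(k)T}\, \Psi\, d^{(k)}}\, d^{(k)}$$

$$g^{(k+1)} = \Psi w^{(k+1)} - s$$

$$d^{(k+1)} = -g^{(k+1)} + \frac{g^{(k+1)T}\, \Psi\, d^{(k)}}{d^{(k)T}\, \Psi\, d^{(k)}}\, d^{(k)}$$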
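The conjugate gradient recursion on this slide can be sketched as below. This is a minimal illustration, not the implementation used in the study: it assumes a real symmetric positive definite Ψ so that the transposes match the slide as written (complex STAP data would need conjugate transposes), it starts from w(0) = 0, and it stops on the gradient norm. The names `conjugate_gradient`, `tol`, and the 2 × 2 test system are all hypothetical.

```python
def matvec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def conjugate_gradient(psi, s, tol=1e-10, max_iter=None):
    """Solve psi w = s for real symmetric positive definite psi."""
    n = len(s)
    max_iter = n if max_iter is None else max_iter
    w = [0.0] * n                                        # w(0): assumed starting point
    g = [pw - s_i for pw, s_i in zip(matvec(psi, w), s)]  # g(0) = Psi w(0) - s
    d = [-g_i for g_i in g]                              # d(0) = -g(0)
    iterations = 0
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 <= tol:                      # assumed stopping rule
            break
        psi_d = matvec(psi, d)
        d_psi_d = dot(d, psi_d)
        step = -dot(g, d) / d_psi_d
        w = [w_i + step * d_i for w_i, d_i in zip(w, d)]
        g = [pw - s_i for pw, s_i in zip(matvec(psi, w), s)]
        # g(k+1)^T Psi d(k) equals dot(g, psi_d) because psi is symmetric
        beta = dot(g, psi_d) / d_psi_d
        d = [-g_i + beta * d_i for g_i, d_i in zip(g, d)]
        iterations += 1
    return w, iterations


# Hypothetical 2 x 2 system; the exact solution is [1/11, 7/11]
psi = [[4.0, 1.0], [1.0, 3.0]]
s = [1.0, 2.0]
w, iterations = conjugate_gradient(psi, s)
```

Loosening `tol` ends the loop earlier at the cost of accuracy, which is the precision-versus-effort trade-off the talk attributes to the CG approach.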


Preliminary Numerical Studies

Relative Error and FLOP Count vs. Tolerance for Nr = 125
Data File: re050068 (32 pulses, 28 Weight Vectors Computed)

[Figure: two panels comparing QR and CG. Left: Relative Error (axis 10^-9 to 10^0) vs. Tolerance (axis 10^-8 to 10^-1). Right: FLOP Count (axis 10^8 to 10^10) vs. Tolerance (axis 10^-8 to 10^-1).]


Preliminary Numerical Studies (continued)

Relative Error and FLOP Count vs. Tolerance for Nr = 250
Data File: re050068 (32 pulses, 28 Weight Vectors Computed)

[Figure: two panels comparing QR and CG. Left: Relative Error (axis 10^-7 to 10^-1) vs. Tolerance (axis 10^-8 to 10^-1). Right: FLOP Count (axis 10^8 to 10^10) vs. Tolerance (axis 10^-8 to 10^-1).]


Implementation of Conjugate Gradient on FPGAs

• Easier and more efficient to implement on FPGA hardware than the QR decomposition approach

[Block diagram: State Machine Design with Numerical Operations and Registers blocks; signals labeled ψ, w(j), and w(j + 1)]

Design parameters annotated on the diagram:
• Floating point
• Block floating point
• Fixed point

• No. of variables
• No. of bits/variable
• Dynamic range
• Accuracy
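To make the difference between two of the listed data representations concrete, the sketch below quantizes the same block of values in fixed point and in block floating point. It is only an illustration under assumptions: the function names, the bit widths, and the sample values are hypothetical, and real FPGA designs would work on integer codes rather than Python floats.

```python
import math


def quantize_fixed(values, frac_bits):
    """Round each value to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return [round(v * scale) / scale for v in values]


def quantize_block_float(values, mantissa_bits):
    """Share one power-of-two exponent across the block, then round each mantissa."""
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return [0.0] * len(values), 0
    exponent = math.frexp(peak)[1]                 # peak = m * 2**exponent, 0.5 <= m < 1
    step = 2.0 ** (exponent - (mantissa_bits - 1))  # one sign bit, the rest magnitude
    limit = (1 << (mantissa_bits - 1)) - 1          # clamp so rounding cannot overflow
    codes = [max(-limit, min(limit, round(v / step))) for v in values]
    return [c * step for c in codes], exponent


# Hypothetical block of small-magnitude samples
values = [0.001, 0.002, -0.003]
fixed = quantize_fixed(values, frac_bits=8)
block, exponent = quantize_block_float(values, mantissa_bits=8)
```

With the same 8-bit budget, the shared exponent lets the block floating point version follow the small magnitudes of this block, while the fixed-point grid is too coarse for them. That is the dynamic range versus bit-count trade-off listed among the design parameters.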


QR Decomposition versus Conjugate Gradient

• QR Decomposition:
– Suitable for GPP/DSP implementation
– Good performance for small values of NR

• Conjugate Gradient:
– Suitable for either GPP/DSP or FPGA implementations
– Good performance for large values of NR
– Provides a way to balance desired precision and computational effort
– FPGA implementations offer many design parameters (e.g., data representation, no. of bits per variable, etc.)


Conceptual Illustration of Trade-Offs (graphs shown are hypothetical)

[Figure: two panels, Computational Complexity and Power Requirements, each plotted over the multidimensional parameter space {Precision, Accuracy, Dyn. Range, L, Nr}. Curves: CG on GPP/DSP; QR on GPP/DSP; CG on FPGA - Floating Point; CG on FPGA - Block Floating Point; CG on FPGA - Fixed Point.]


References for FPGA Power Prediction

K. P. Parker and E. J. McCluskey, “Probabilistic Treatment of General Combinatorial Networks,” IEEE Trans. Computers, Vol. C-24, June 1975, pp. 668-670.

Kaushik Roy and Sharat Prasad, “Circuit Activity Based Logic Synthesis for Low Power Reliable Operations,” IEEE Trans. VLSI Systems, Vol. 1, No. 4, Dec. 1993, pp.

Kaushik Roy, “Power Dissipation Driven FPGA Place and Route under Timing Constraints,” School of Electrical and Computer Engineering, Purdue University.

“XC4000 Series Field Programmable Gate Arrays,” Xilinx, Inc., September 18, 1996.


FPGA Power Consumption

Interconnection fabric

Logic block

Most of the logic/area in the FPGA is used to route signals.

As signals traverse this network of transistors, significant power can be consumed.


Power Dissipation in CMOS

• Leakage Current
• Dynamic Capacitance Charging Current
– Most important for CMOS
– Dependent on clock frequency
– Dependent on signal activity
• Transient Current


Time-Domain Modeling

[Figure: logic network with inputs x1, x2, x3 and output y, shown with waveforms x1(t), x2(t), x3(t), x1x2(t), and x1x2x3(t)]

• Very precise results
• Computationally expensive

Calculation of instantaneous power: p(t)


Probabilistic Modeling

p(s): the probability that signal s attains a logical value of true at any given clock cycle.

A(s): the probability that signal s transitions at any given clock cycle.

Signal   p(s)   A(s)
clock    0.50   1.0
x1       0.88   0.10
x2       0.29   0.17
x3       0.69   0.27
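The two quantities p(s) and A(s) can be estimated from a recorded bit stream as sketched below. This is a hedged illustration: the function name `signal_statistics` and both sample streams are hypothetical, and one sample per clock cycle is assumed.

```python
def signal_statistics(samples):
    """Estimate p(s) and A(s) from a list of 0/1 samples, one per clock cycle."""
    p = sum(samples) / len(samples)
    # Count cycle-to-cycle changes; there are len(samples) - 1 opportunities
    transitions = sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)
    a = transitions / (len(samples) - 1)
    return p, a


# A signal that toggles every sample reproduces p = 0.50, A = 1.0 as listed for the clock
p_clock, a_clock = signal_statistics([0, 1] * 8)

# Hypothetical data signal that is mostly true and rarely changes
p_x, a_x = signal_statistics([1, 1, 1, 1, 0, 1, 1, 1, 1, 1])
```

A mostly-true, rarely-changing stream gives a high p and a low A, the same pattern as the x1 entry on the slide.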


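Probabilistic Modeling (continued)

[Figure: the logic network with inputs x1, x2, x3 and output y, and waveforms x1(t), x2(t), x3(t), x1x2(t), x1x2x3(t). Annotations: x1: p=0.88, A=0.10; x2: p=0.29, A=0.17; x3: p=0.69, A=0.27; two further nodes: p=0.83, A=0.17 and p=0.10, A=0.13]

• Acceptable results
• Computationally inexpensive

Calculation of average power:

$$P_{avg} = \frac{1}{2} V^2 \sum_{g \,\in\, \text{all gates}} C_g A_g$$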
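The average power expression on this slide, one half of V squared times the sum of C_g A_g over all gates, can be evaluated directly. The sketch below uses two of the activity values shown on the slide, but the gate capacitances and the 3.3 V supply are hypothetical. Because A is defined per clock cycle, the result as written is arguably an energy per cycle; multiplying by the clock frequency to obtain watts is an assumption not shown on the slide.

```python
def average_power(voltage, gates):
    """Evaluate 0.5 * V^2 * sum(C_g * A_g) for gates given as (capacitance, activity)."""
    return 0.5 * voltage ** 2 * sum(c_g * a_g for c_g, a_g in gates)


# (C_g in farads, A_g): capacitances are hypothetical, activities are from the slide
gates = [(1.0e-12, 0.17), (1.5e-12, 0.13)]
p_avg = average_power(3.3, gates)
```

Only the activities depend on the input data, so a different data set changes the estimate through A_g alone.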


Probabilistic Model Implementation

p(s1), A(s1)

p(s2), A(s2)

p(s3), A(s3)

Step 1: Probabilistic information is distilled from the input data and presented to the model.

Step 2: Probabilistic data “propagates” throughout the model, depositing activity information as it does so.

Step 3: Power is estimated using activity measures and known CMOS gate capacitances.
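Step 2 can be illustrated with the simplest propagation rules, which treat gate inputs as statistically independent. This is a sketch under strong assumptions: the two-AND network, the input probabilities, and the rule A = 2p(1 - p) (which holds only if consecutive cycles are independent) are all hypothetical, and the numbers it produces are not expected to match the values annotated on the earlier slides.

```python
def and_gate(p_inputs):
    """Signal probability at an AND output, assuming independent inputs."""
    prob = 1.0
    for p in p_inputs:
        prob *= p
    return prob


def or_gate(p_inputs):
    """Signal probability at an OR output, assuming independent inputs."""
    none_true = 1.0
    for p in p_inputs:
        none_true *= 1.0 - p
    return 1.0 - none_true


def not_gate(p):
    return 1.0 - p


def activity_if_temporally_independent(p):
    """Transition probability when successive cycles are independent: 2 p (1 - p)."""
    return 2.0 * p * (1.0 - p)


# Hypothetical network: n1 = x1 AND x2, y = n1 AND x3, all inputs with p = 0.5
p_n1 = and_gate([0.5, 0.5])
p_y = and_gate([p_n1, 0.5])
a_y = activity_if_temporally_independent(p_y)
```

Each gate deposits its own (p, A) pair as the probabilities move toward the outputs, and those activities are what Step 3 combines with the gate capacitances.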


References for Optimal Configuration for SAR Processing

J. T. Muehring and J. K. Antonio, “Optimal Configuration of an Embedded Parallel System for Synthetic Aperture Radar Processing,” Proc. Int’l Conf. on Signal Processing Applications & Technology, Boston, MA, Oct. 1996, pp. 1489-1494. (http://hpcl.cs.ttu.edu/~antonio/pubs/conf033.pdf)

T. Einstein, “Realtime Synthetic Aperture Radar Processing on the RACE Multicomputer,” Application Note 203.0, Mercury Computing Systems, Inc., Chelmsford, MA, 1995.

J. C. Curlander and R. N. McDonough, Synthetic Aperture Radar: Systems and Signal Processing, John Wiley & Sons, New York, NY, 1991.

“SHARC DSP Compute Nodes (3.3-Volt),” Mercury Computing Systems, Inc., Chelmsford, MA, 1995.


GPP/DSP Approach for SAR Processing

[Figure: Range Processing (shown across 3 range processors), Distributed Corner-Turn, and Azimuth Processing (shown across 4 azimuth processors). Data arrays have axes Range Samples and Pulse No.; the dimensions Kr and Sa are marked.]

where Sa is the azimuth section length and Kr is the range reference kernel size


The Sectioned Convolution for the Azimuth Processing

[Figure labels: Kernel, Discard, Overlap, Section, FFT size]

• Large Overlap/Section ratio ⇒ small azimuth memory, large number of azimuth processors
• Small Overlap/Section ratio ⇒ large azimuth memory, small number of azimuth processors
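The sectioning idea can be sketched in the overlap-save style: each block carries a section of new samples plus an overlap of kernel length minus one, and the outputs that fall in the overlap are discarded. To keep the example short, the FFT-based circular convolution used in practice is replaced here by a direct convolution of each block; the function names and test data are hypothetical.

```python
def direct_convolution(signal, kernel):
    """Full linear convolution by direct summation."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += x * h
    return out


def sectioned_convolution(signal, kernel, section):
    """Convolve block by block; returns the first len(signal) output samples."""
    overlap = len(kernel) - 1
    padded = [0.0] * overlap + list(signal)
    out = []
    for start in range(0, len(signal), section):
        block = padded[start:start + section + overlap]
        full = direct_convolution(block, kernel)
        # The first `overlap` outputs lack history and are discarded
        out.extend(full[overlap:len(block)])
    return out


def useful_fraction(section, kernel_length):
    """Share of each block that yields kept outputs: section / (section + overlap)."""
    return section / (section + kernel_length - 1)


# Hypothetical data: 7 samples, a 3-tap kernel, sections of 3 samples
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
result = sectioned_convolution(signal, kernel, section=3)
```

A small section keeps each block short, which reduces memory per processor, but a larger share of every block is overlap that gets discarded, so more processing is needed for the same throughput. This mirrors the trade-off stated on the slide.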


Derivations for Memory and Processors for GPP/DSP Systems

[Equations: expressions for Pr, Pa, Mr, and Ma in terms of v, δ, R, S, F, α, γ, and λ (with range and azimuth subscripts)]

where Pr and Pa are the number of required processors and Mr and Ma are the memory requirements in Mbytes for range and azimuth processing, respectively


Determining Optimal Configurations for GPP/DSP Systems

• Determine configurations for the compute nodes (CNs), number of CNs of each configuration, and section size, to satisfy processor and memory requirements and minimize power consumption

• Notation and Definitions:
– CN Configuration: Specifies the daughtercard type and number of range and azimuth processors (per configured CN)
– X, Y: The two possible CN configurations
– XT, YT: Daughtercard type for each CN configuration


Determining Optimal Configurations for GPP/DSP Systems (continued)

• Notation and Definitions (continued):
– Xr, Yr: Number of range processors per CN (for each configuration)
– Xa, Ya: Number of azimuth processors per CN (for each configuration)
– NX, NY: Number of CNs of configurations X and Y
– ΠCN(•): Power per CN as a function of daughtercard type
– MCN(•): Memory per CN as a function of daughtercard type
– PCN(•): Processors per CN as a function of daughtercard type


311,0,,,,,

,....2,1,2

)()(

)()()(

)()()(

)(

)()(

≥≥

=+≥=

≤+≤+

+≥

+≥

+≤+≤

+=

aararYX

aak

a

TCNar

TCNar

aa

aaa

r

rrTCN

aa

aaa

r

rrTCN

aYaXaa

rYrXr

TCNYTCNX

SYYXXNN

kKSF

YPYYXPXX

SPSMY

PMYYM

SPSMX

PMXXM

YNXNSPYNXNP

YΠNXΠNZMinimize:

Subject to:

Optimization Formulation forGPP/DSP Systems
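For small instances the formulation can be explored by exhaustive search, as in the sketch below. Everything numeric here is hypothetical: the daughtercard catalogue, the requirement values, and the cap on the number of CNs are invented for illustration. The azimuth section length Sa is held fixed, so Pa and Ma are plain numbers and the FFT-size constraint is not enumerated. This is not the solution method used in the talk.

```python
from itertools import product

# Hypothetical daughtercard catalogue: type -> (power in W, memory in MB, processors)
CARDS = {1: (10.0, 32.0, 2), 2: (16.0, 64.0, 4)}


def cn_options(mem_per_range_proc, mem_per_azimuth_proc):
    """Per-CN configurations (type, range procs, azimuth procs) that fit the card."""
    options = []
    for card, (_, memory, procs) in CARDS.items():
        for n_r in range(procs + 1):
            for n_a in range(procs + 1 - n_r):        # X_r + X_a <= P_CN(X_T)
                needed = n_r * mem_per_range_proc + n_a * mem_per_azimuth_proc
                if needed <= memory:                  # memory constraint per CN
                    options.append((card, n_r, n_a))
    return options


def optimize(p_r, p_a, m_r, m_a, max_cns=8):
    """Return (Z, X, N_X, Y, N_Y) minimizing total power, or None if infeasible."""
    options = cn_options(m_r / p_r, m_a / p_a)
    best = None
    for x, y in product(options, repeat=2):
        for n_x, n_y in product(range(max_cns + 1), repeat=2):
            if n_x * x[1] + n_y * y[1] < p_r:         # range processor requirement
                continue
            if n_x * x[2] + n_y * y[2] < p_a:         # azimuth processor requirement
                continue
            z = n_x * CARDS[x[0]][0] + n_y * CARDS[y[0]][0]
            if best is None or z < best[0]:
                best = (z, x, n_x, y, n_y)
    return best


# Hypothetical requirements: 3 range and 5 azimuth processors, 48 MB and 160 MB in total
best = optimize(p_r=3, p_a=5, m_r=48.0, m_a=160.0)
```

For this invented instance the cheapest mix combines two CN configurations on different daughtercard types, which shows why the formulation allows two configurations, X and Y, rather than a single uniform one.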


Power Consumption in Optimal Configuration


% Power Increase of Nominal Over Optimal Configuration


Optimal CN Configurations

[Figure: regions of the δ-v plane (δ axis from 0.5 to 2, v axis from 50 to 400), labeled with configuration digits in the order XT Xr Xa YT Yr Ya]


Determining Optimal Configurations for GPP/DSP/FPGA Systems

• Assume FPGAs are used for range processing

• Additional Notation and Definitions:
– Dr: Dynamic range required for range processing
– Ar: Accuracy required for range processing
– Ir: Incoming data rate (depends on δ, v, R, Rs)
– Tu: Data type used (floating point, block floating point, or fixed point)
– Bu: Number of bits used for data representation (depends on Dr, Ar, Tu)
– Clu: Clock rate used (depends on Ir, Tu, Bu)


Determining Optimal Configurations for GPP/DSP/FPGA Systems (continued)

• Additional Notation and Definitions (continued):
– Gu: Number of FPGA chips used for range processing (depends on Ir, Tu, Bu)
– ΠG: Power consumption of FPGAs (depends on Clu, Tu, Bu, Gu)

• Ongoing Work
– Deriving precise relationships among above terms
– Extending current GPP/DSP optimization formulation to include FPGA utilization
