Dept. Electrónica y Computación, Univ. Santiago de Compostela

Lab. de l’Informatique du Parallélisme, ENS-Lyon

Faithful Powering ($X^p$) Computation using Table Look-Up and a Fused Accumulation Tree

José-Alejandro Piñeiro, Javier D. Bruguera, Jean-Michel Muller



SUMMARY

INTRODUCTION

ALGORITHM
• 2nd-degree minimax approximation
• specialized squaring unit
• fused accumulation tree
• error computation

PROPOSED ARCHITECTURES
• unfolded & pipelined architectures
• pre-layout synthesis results & comparison

CONCLUSION



INTRODUCTION

Faithfully-rounded single-precision floating-point computation of the powering function ($X^p$), with $p$ as a parameter

Computer graphics and DSP

Framework for the computation of reciprocal ($X^{-1}$, division), square root ($X^{1/2}$), inverse square root ($X^{-1/2}$), inverse squaring ($X^{-2}$), ...

2nd-degree minimax approximation

Table Look-Up + Specialized Squaring Unit + Fused Accumulation Tree

Speed of 1st-order methods with the area of 2nd-order methods



INTRODUCTION: types of methods

Direct Table Look-Up

Polynomial or rational approximations

Iterative algorithms
• digit-recurrence methods (linear convergence)
• multiplicative-based algorithms (quadratic conv.)

Table-based Methods (table look-up + low-degree polynomial)
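As an illustration of the "quadratic conv." label on multiplicative-based algorithms, the sketch below runs a Newton-Raphson reciprocal iteration and records the error after each step. The function name `newton_reciprocal`, the operand and the seed are hypothetical choices for this example, not taken from the slides.

```python
def newton_reciprocal(x, seed, iterations):
    """Refine seed ~ 1/x with y <- y * (2 - x*y); also return the error history."""
    y = seed
    errors = [abs(1.0 - x * y)]          # error measured as |1 - x*y|
    for _ in range(iterations):
        y = y * (2.0 - x * y)            # two multiplications per step
        errors.append(abs(1.0 - x * y))  # each error is about the square of the previous one
    return y, errors


y, errors = newton_reciprocal(1.5, seed=0.6, iterations=3)
for step, err in enumerate(errors):
    print(step, err)
```

Each step roughly squares the error, so the number of correct bits doubles per iteration. This is why an accurate table-based seed reduces the number of iterations needed.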



INTRODUCTION: table-based methods

bipartite tables
• $f(X) \approx a_0(X_1, X_2) + a_1(X_1, X_3)$
• table LU + addition
• table sizes $2^{2m}$ ($2^{16}$ for $m = 8$)

linear approximations
• $f(X) \approx C_0 + C_1 X_2$ or $f(X) \approx C' X'$
• table LU + multiplication
• table sizes $2^{3m/2}$ ($2^{12}$ for $m = 8$)

second-order approximations
• $f(X) \approx C_0 + C_1 X_2 + C_2 X_2^2$
• table LU + 2nd-degree polynomial evaluation
• table sizes $2^{m}$ ($2^{8}$ for $m = 8$)
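The table-size comparison can be checked with a few lines of arithmetic. The sketch below reads the three table sizes on this slide as $2^{2m}$, $2^{3m/2}$ and $2^{m}$ and interprets them as entry counts; that interpretation, and the helper name `table_entries`, are assumptions of this example.

```python
def table_entries(m):
    """Entry counts per method, reading the slide's sizes as 2^(2m), 2^(3m/2), 2^m."""
    return {
        "bipartite tables": 2 ** (2 * m),
        "linear approximations": 2 ** (3 * m // 2),
        "second-order approximations": 2 ** m,
    }


for name, entries in table_entries(8).items():
    print(f"{name}: 2^{entries.bit_length() - 1} = {entries} entries")
```

For $m = 8$ this reproduces the exponents 16, 12 and 8 shown on the slide: the second-order scheme addresses its tables with far fewer bits, at the cost of a more complex evaluation.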



ALGORITHM: 2nd-degree minimax approx.

$X^p \approx C_0 + C_1 X_2 + C_2 X_2^2$

$X_1 = [1.x_1 x_2 \ldots x_m]$

$X_2 = [.x_{m+1} \ldots x_n] \cdot 2^{-m}$

2/1p

$m = 8$

$C_0, C_1, C_2$: Look-Up Tables

$X_2^2$: spec. squaring unit

SD-4 recoding

polynomial eval.: fused acc. tree

$X = X_1 + X_2$
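A minimal software sketch of this data flow follows: split $X$ into $X_1$ (table index) and $X_2$, look up three coefficients, and evaluate $C_0 + C_1 X_2 + C_2 X_2^2$. Several things here are assumptions rather than the authors' method: $n = 23$ fraction bits, double-precision arithmetic instead of fixed-point hardware, and coefficients obtained by interpolation at three Chebyshev nodes as a simple stand-in for the minimax approximation. All function names are hypothetical.

```python
import math

M = 8    # leading fraction bits in X1 (m = 8 on the slides)
N = 23   # total fraction bits; assumed single-precision significand


def split(frac):
    """Split the N-bit fraction of X = 1.x1...xn into (table index, X1, X2)."""
    index = frac >> (N - M)
    low = frac & ((1 << (N - M)) - 1)
    return index, 1.0 + index / 2 ** M, low / 2 ** N


def quadratic_coefficients(p, x1):
    """C0, C1, C2 by interpolation at 3 Chebyshev nodes (stand-in for minimax)."""
    half = 2.0 ** -(M + 1)
    t = [half * (1 + math.cos((2 * k + 1) * math.pi / 6)) for k in range(3)]
    f = [(x1 + tk) ** p for tk in t]
    # Newton divided differences, then expansion into powers of X2
    d01 = (f[1] - f[0]) / (t[1] - t[0])
    d12 = (f[2] - f[1]) / (t[2] - t[1])
    d012 = (d12 - d01) / (t[2] - t[0])
    c0 = f[0] - d01 * t[0] + d012 * t[0] * t[1]
    c1 = d01 - d012 * (t[0] + t[1])
    return c0, c1, d012


def build_tables(p):
    """One (C0, C1, C2) entry per value of X1, i.e. 2^M entries."""
    return [quadratic_coefficients(p, 1.0 + i / 2 ** M) for i in range(2 ** M)]


def powering(frac, tables):
    """Approximate X^p for X = 1 + frac / 2^N using the looked-up coefficients."""
    index, _, x2 = split(frac)
    c0, c1, c2 = tables[index]
    return c0 + c1 * x2 + c2 * x2 * x2


tables = build_tables(-0.5)
frac = 0x2AAAAA
x = 1.0 + frac / 2 ** N
print(powering(frac, tables), x ** -0.5)
```

The sketch ignores coefficient wordlengths and truncation, which the real design must budget for; it only shows why $2^m$ table entries and one quadratic evaluation are enough to reach single-precision accuracy on $[1, 2)$.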



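ALGORITHM: 2nd-degree minimax approx.

Maple program for obtaining the coefficients

$X^p \approx C_0 + C_1 X_2 + C_2 X_2^2$

$(C_0^{*}, C_1^{*}, C_2^{*}) = \mathrm{minimax}(X^p)$, then $C_1 = \mathrm{rounding}(C_1^{*},\ j\ \mathrm{bits})$

$(C_0^{**}, C_2^{**}) = \mathrm{minimax}(X^p - C_1 X_2)$, then $C_2 = \mathrm{rounding}(C_2^{**},\ k\ \mathrm{bits})$

$C_0^{***} = \mathrm{minimax}(X^p - C_1 X_2 - C_2 X_2^2)$, then $C_0 = \mathrm{rounding}(C_0^{***},\ i\ \mathrm{bits})$

check $\mathrm{error}(i, j, k\ \mathrm{bits})$ against the error bound ??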
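The three-pass idea on this slide (fit, round one coefficient, refit the rest so they absorb the rounding error) can be sketched in Python. This is not the authors' Maple program. It assumes the passes round $C_1$ to $j$ bits, then $C_2$ to $k$ bits, then $C_0$ to $i$ bits, and it replaces true minimax fits with simpler stand-ins: Chebyshev-node interpolation in passes 1 and 2, and the midpoint of the sampled residual range in pass 3. The wordlengths in the usage line are hypothetical.

```python
import math


def round_to_bits(value, frac_bits):
    """Round to the nearest multiple of 2^-frac_bits."""
    scale = 2 ** frac_bits
    return round(value * scale) / scale


def fit_coefficients(p, x1, m, i, j, k, samples=257):
    """Return (C0, C1, C2, max sampled error) for X2 in [0, 2^-m]."""
    w = 2.0 ** -m

    def f(s):
        return (x1 + s) ** p

    # Pass 1: quadratic through 3 Chebyshev nodes gives C1*, rounded to j bits.
    t = [w / 2 * (1 + math.cos((2 * q + 1) * math.pi / 6)) for q in range(3)]
    d01 = (f(t[1]) - f(t[0])) / (t[1] - t[0])
    d12 = (f(t[2]) - f(t[1])) / (t[2] - t[1])
    d012 = (d12 - d01) / (t[2] - t[0])
    c1 = round_to_bits(d01 - d012 * (t[0] + t[1]), j)

    # Pass 2: refit X^p - C1*X2 with basis {1, X2^2} (linear in u = X2^2),
    # using 2 Chebyshev nodes in u; this gives C2**, rounded to k bits.
    def g(s):
        return f(s) - c1 * s

    u = [w * w / 2 * (1 + math.cos((2 * q + 1) * math.pi / 4)) for q in range(2)]
    c2 = round_to_bits((g(math.sqrt(u[1])) - g(math.sqrt(u[0]))) / (u[1] - u[0]), k)

    # Pass 3: the best constant for the remaining residual is the midpoint
    # of its range (sampled here); this gives C0***, rounded to i bits.
    grid = [w * q / (samples - 1) for q in range(samples)]
    resid = [f(s) - c1 * s - c2 * s * s for s in grid]
    c0 = round_to_bits((max(resid) + min(resid)) / 2, i)

    err = max(abs(r - c0) for r in resid)
    return c0, c1, c2, err


c0, c1, c2, err = fit_coefficients(-0.5, 1.0, m=8, i=28, j=18, k=10)
print(c0, c1, c2, math.log2(err))
```

Refitting after each rounding lets the remaining coefficients compensate part of the rounding error, which is what allows shorter coefficient wordlengths than rounding all three at once.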



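ALGORITHM: specialized squaring unit

math. identities
• $x_i \cdot x_i = x_i$
• $x_i x_j = x_j x_i$
• $x_i x_j + x_j x_i = 2\, x_i x_j$

leading zeros & truncation
• $X_2$: $m$ leading zeros
• $X_2^2$: $2m$ leading zeros
• truncation error of $X_2^2$: bounded in terms of $m$ and $h$

Carry-Save output

$m = 8$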
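The bit-level identities used by the squaring unit can be verified directly. The sketch below squares an unsigned integer from "folded" partial products, using $x_i x_i = x_i$ for the diagonal and $x_i x_j + x_j x_i = 2 x_i x_j$ for the symmetric pairs. The function name and the integer model are illustrative assumptions; the real unit also truncates low-order columns and produces a carry-save result, which this sketch does not model.

```python
def square_folded(v, width):
    """Square an unsigned `width`-bit integer from folded partial products."""
    bits = [(v >> i) & 1 for i in range(width)]
    total = 0
    for i in range(width):
        # x_i * x_i = x_i: the diagonal term needs no AND gate
        total += bits[i] << (2 * i)
        for j in range(i + 1, width):
            # x_i*x_j + x_j*x_i = 2*x_i*x_j: one AND term, shifted left by one
            total += (bits[i] & bits[j]) << (i + j + 1)
    return total


v = 2 ** 15 - 1                     # largest 15-bit X2 pattern (assumes n = 23, m = 8)
print(square_folded(v, 15) == v * v)
print(square_folded(v, 15) < 2 ** 30)
```

Folding roughly halves the number of partial-product bits. The second check reflects the leading zeros: with $X_2 = v \cdot 2^{-23}$ and $v < 2^{15}$, the square is below $2^{-16} = 2^{-2m}$.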



ALGORITHM: fused accumulation tree

$C_0 + \mathrm{pps}(C_1 X_2) + \mathrm{pps}(C_2 X_2^2)$

recoding to SD-4

$1 + 8 + 6 = 15$ pps $\Rightarrow$ 3 CSA levels

Delay about that of a 24x24-bit multiplier

Reduced area (coefficients wordlength)

$m = 8$
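The partial-product count and the number of CSA levels can be reproduced with a short sketch. It reads the slide's figures as 1 + 8 + 6 = 15 partial products reduced in 3 levels, and assumes 4:2 carry-save adders and a 15-bit $X_2$ (n = 23, m = 8). The radix-4 recoding shown is one simple scheme that stays inside the SD-4 digit set; the exact recoding used in the design is not given on the slide.

```python
def sd4_digits(v):
    """Recode a non-negative integer into radix-4 digits from {-1, 0, 1, 2}
    (a subset of the SD-4 digit set {-2, ..., 2}), least significant first."""
    digits = []
    while v:
        d = v & 3
        if d == 3:          # 3 = 4 - 1: emit -1 and carry into the next digit
            d = -1
        digits.append(d)
        v = (v - d) >> 2
    return digits


def csa_4to2_levels(rows):
    """Levels of 4:2 carry-save adders needed to reduce `rows` operands to 2."""
    levels = 0
    while rows > 2:
        # each full group of 4 rows becomes 2; 3 leftover rows also become 2
        rows = 2 * (rows // 4) + min(rows % 4, 2)
        levels += 1
    return levels


print(len(sd4_digits(2 ** 15 - 1)))   # digits (partial products) for a 15-bit X2
print(csa_4to2_levels(1 + 8 + 6))     # C0 + pps(C1*X2) + pps(C2*X2^2)
```

Radix-4 recoding halves the number of partial products per multiplication, and accumulating both products and $C_0$ in a single tree avoids intermediate carry-propagate additions.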



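ALGORITHM: error computation

FAITHFUL ROUNDING: the intermediate results are rounded to one of the two machine numbers that bracket the exact value

$\varepsilon_{total} = \varepsilon_{interm} + \varepsilon_{round} < 2^{-r}$

$\varepsilon_{round} \le 2^{-r-1}$

$\varepsilon_{interm} = \varepsilon_{approx} + \varepsilon_{comput} < 2^{-r-1}$

$\varepsilon_{comput}$: includes $|C_2|_{max} \cdot \varepsilon_{squaring}$, with $\varepsilon_{squaring}$ bounded in terms of $m$ and $h$

$C_0 \leftarrow C_0 + 2^{-r-1}$ (rounding constant folded into $C_0$)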
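The error budget can be checked with exact rational arithmetic. The sketch assumes the usual reading of the bounds on this slide: if the intermediate error stays below $2^{-r-1}$ and rounding to nearest adds at most $2^{-r-1}$, the total error is below one unit in the last place, $2^{-r}$, so the result is faithful. The helper names and test values are hypothetical.

```python
from fractions import Fraction


def round_nearest(value, r):
    """Round an exact Fraction to the nearest multiple of 2^-r."""
    ulp = Fraction(1, 2 ** r)
    return Fraction(round(value / ulp)) * ulp


def is_faithful(result, exact, r):
    """True if `result` is a multiple of 2^-r less than one ulp away from `exact`."""
    ulp = Fraction(1, 2 ** r)
    return (result / ulp).denominator == 1 and abs(result - exact) < ulp


r = 23
exact = Fraction(1, 3)
eps_interm = Fraction(1, 2 ** 24) - Fraction(1, 2 ** 40)   # just under 2^-(r+1)
for sign in (1, -1):
    result = round_nearest(exact + sign * eps_interm, r)
    print(is_faithful(result, exact, r))
```

Keeping the intermediate error under half an ulp is what lets a plain round-to-nearest step at the end guarantee faithfulness without further analysis.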



ARCHITECTURE: unfolded architecture

inverse square root ($p = -1/2$)

Table size: 11.75 Kb

$m = 8$

2 functions: inserting a new set of tables and multiplexers

candidate critical paths for $t_{unfolded}$:
• $t_{squaring} + t_{CS2SD} + t_{fused\_acc\_tree}$
• $t_{table\_lu} + t_{fused\_acc\_tree}$
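The quoted table size can be related to the coefficient wordlengths with simple arithmetic. The sketch assumes that "Kb" means 1024 bits and that the tables hold one entry per value of $X_1$, i.e. $2^m$ entries. Both are assumptions, and the slide does not say how the resulting width is divided among $C_0$, $C_1$ and $C_2$.

```python
def bits_per_entry(table_kbits, m):
    """Combined coefficient width per entry, assuming 2^m entries and 1 Kb = 1024 bits."""
    total_bits = table_kbits * 1024
    return total_bits / 2 ** m


print(bits_per_entry(11.75, 8))   # 47.0
```

Under these assumptions the three coefficients together occupy 47 bits per entry, and supporting a second function would add another set of tables of similar size.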



ARCHITECTURE: pipelined architecture

latency: 3 cycles

throughput: 1 result / cycle

candidate critical paths for the cycle time $t_{pipel.}$:
• $t_{squaring} + t_{CS2SD} + t_{reg}$
• $t_{table\_lu} + t_{reg}$
• $t_{CPA} + t_{reg}$
• $t_{pp\_gen} + 3\, t_{CSA\,4:2} + t_{reg}$
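How the cycle time follows from the stage delays can be sketched as a maximum over the candidate paths. The sketch reads the four delay expressions on this slide as: squaring plus CS-to-SD conversion, table look-up, partial-product generation plus three 4:2 CSA levels, and the final CPA, each followed by a register. The delay values below are made-up placeholders in arbitrary units, not synthesis results.

```python
def cycle_time(t, t_reg):
    """Return (cycle time, name of the critical path) for the pipelined design."""
    paths = {
        "squaring + CS2SD": t["squaring"] + t["cs2sd"],
        "table look-up": t["table_lu"],
        "pp gen + 3 CSA 4:2": t["pp_gen"] + 3 * t["csa_4to2"],
        "CPA": t["cpa"],
    }
    critical = max(paths, key=paths.get)
    return paths[critical] + t_reg, critical


# Hypothetical delays, chosen only to exercise the function.
delays = {"squaring": 2.0, "cs2sd": 1.0, "table_lu": 2.5,
          "pp_gen": 1.0, "csa_4to2": 1.0, "cpa": 3.5}
cycle, critical = cycle_time(delays, t_reg=0.5)
print(cycle, critical)
print("latency:", 3 * cycle)       # 3 cycles per result, 1 result per cycle
```

With a latency of 3 cycles and one result per cycle, total latency is three times the slowest stage, so balancing these paths matters more than shortening any single one.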



ARCHITECTURE: synthesis results and comparison

pre-layout synthesis; CMOS 0.35 µm; VHDL design flow; Synopsys

unfolded arch.

pipelined arch.

comparison



CONCLUSION

New method for the single-precision floating-point faithfully-rounded powering computation

Second-degree minimax approximation, look-up tables, specialized squaring unit and fused accumulation tree

Unfolded and pipelined (3:1) architectures proposed

Pre-layout synthesis results (CMOS 0.35 µm) and comparison with previous table-based methods

Speed of linear approx. & reduced area of 2nd-degree interpolations

Future work
• Generalization to any f(X)
• Use as seed generator for double-precision computations with multiplicative-based methods: a single Newton-Raphson or Goldschmidt iteration required


