CUSTOM-FPGA DESIGN AND MAPPING FOR DSP TRANSFORMS

BHARATHWAJ V SANKARA

SHUBHIKA TANEJA

Upload: dale-whitfield

Post on 01-Jan-2016


OUTLINE

Mathematics of different DSP algorithms
Generalization of DSP algorithms
Systolic array architecture
Logic block architecture & pipelining
Mapping different algorithms to hardware
Statistics

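DSP TRANSFORMS

DFT: $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N}$

DCT: $X(k) = \alpha_k \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n+1) k}{2N}\right)$

DST: $X(k) = \sum_{n=0}^{N-1} x(n) \sin\left(\frac{\pi (n+1)(k+1)}{N+1}\right)$

DHT: $X(k) = \sum_{n=0}^{N-1} x(n) \left[\cos(2\pi nk/N) + \sin(2\pi nk/N)\right]$

where $\alpha_k = 1/\sqrt{N}$ for $k = 0$ and $\alpha_k = \sqrt{2/N}$ for $k = 1, \dots, N-1$.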

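MATH BACKGROUND

Let the transform function be δ(n,k); then

$X(k) = \sum_{n=0}^{N-1} x(n)\,\delta(n,k)$

where each term of the sum is the product of the input sample x(n) and the coefficient δ(n,k).

For k = 0:

$X(0) = \sum_{n=0}^{N-1} x(n)\,\delta(n,0)$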

For a 4 point transform:

X(0) = x(0)δ(0,0) + x(1)δ(1,0) + x(2)δ(2,0) + x(3)δ(3,0)

X(1) = x(0)δ(0,1) + x(1)δ(1,1) + x(2)δ(2,1) + x(3)δ(3,1)

X(2) = x(0)δ(0,2) + x(1)δ(1,2) + x(2)δ(2,2) + x(3)δ(3,2)

X(3) = x(0)δ(0,3) + x(1)δ(1,3) + x(2)δ(2,3) + x(3)δ(3,3)

Consider the DFT:

$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N}$

$= \sum_{n=0}^{N-1} x(n) \left[\cos(2\pi nk/N) - j \sin(2\pi nk/N)\right]$

$= \sum_{n=0}^{N-1} \left[x(n)\cos(2\pi nk/N) - j\, x(n)\sin(2\pi nk/N)\right]$

This is similar to the Hartley transform, except that the second term is multiplied by the coefficient -j:

$X_h(k) = \sum_{n=0}^{N-1} \left[x(n)\cos(2\pi nk/N) + x(n)\sin(2\pi nk/N)\right]$

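Generalizing the transforms:

$X(k) = \sum_{n=0}^{N-1} x(n)\,\delta(n,k)$

where:

$\delta(n,k) = \begin{cases} \cos(2\pi nk/N) - j\sin(2\pi nk/N) & \text{DFT} \\ \alpha_k \cos[\pi(2n+1)k/2N] & \text{DCT} \\ \sin[\pi(n+1)(k+1)/(N+1)] & \text{DST} \\ \cos(2\pi nk/N) + \sin(2\pi nk/N) & \text{DHT} \end{cases}$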

LUT ENTRIES

TRANSFORM   δ1(n,k)                           δ2(n,k)
DFT         cos(2*pi*nk/N)                    -j sin(2*pi*nk/N)
DCT         αk cos[pi(2n+1)k/2N]              0
DST         cos[pi/2 - pi(n+1)(k+1)/(N+1)]    0
DHT         cos(2*pi*nk/N)                    sin(2*pi*nk/N)
SAD         1                                 -1

The DST entry is written as a cosine using sin(θ) = cos(pi/2 - θ).

δ(n,k) = δ1(n,k) + δ2(n,k)

Proposed Architecture

SYSTOLIC ARRAY ARCHITECTURE

[Figure: 4 × 4 systolic array. Cell (n,k) holds δ(n,k); input x(n) enters row n and output X(k) leaves column k.]

x(0)   δ(0,0)  δ(0,1)  δ(0,2)  δ(0,3)
x(1)   δ(1,0)  δ(1,1)  δ(1,2)  δ(1,3)
x(2)   δ(2,0)  δ(2,1)  δ(2,2)  δ(2,3)
x(3)   δ(3,0)  δ(3,1)  δ(3,2)  δ(3,3)
       X(0)    X(1)    X(2)    X(3)

δ(n,k) = δ1(n,k) + δ2(n,k)

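LOGIC BLOCK OF THE FPGA

[Figure: logic block. Components: a LUT addressed by k1 and k2, two multipliers (×), an adder/subtractor (+/-), and an adder (+) with an input labelled "From other LB". Data inputs: x(n) and x(n)/y(n). Mode signals configure the block.]

δ(n,k) = δ1(n,k1) + δ2(n,k2)

Mode = SAD selection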

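PIPELINING OF THE DFG

[Figure: data-flow graph (DFG) of the logic block: LUT (L), two multipliers (×), two adders (+) and three multiplexers (MUX). Nine delay elements (D) are inserted to pipeline the graph.]

Tcritical = max{TL, TM, TA + TMUX} = TM

With the delay elements in place, the critical path is the slowest pipeline stage: the LUT (TL), a multiplier (TM), or an adder followed by a multiplexer (TA + TMUX). The multiplier dominates.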
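The critical-path expression can be evaluated with a few lines of C. This is only a sketch: the struct `stage_delays` and the function `critical_path` are hypothetical names, and any delay values passed in are placeholders, not measurements from the slides.

```c
#include <assert.h>

/* Stage delays, all in the same time unit (placeholder values, not measured). */
typedef struct {
    double t_lut;  /* T_L: LUT access      */
    double t_mul;  /* T_M: multiplier      */
    double t_add;  /* T_A: adder           */
    double t_mux;  /* T_MUX: multiplexer   */
} stage_delays;

static double max2(double a, double b) { return a > b ? a : b; }

/* T_critical = max{T_L, T_M, T_A + T_MUX} */
static double critical_path(const stage_delays *d)
{
    assert(d != 0);
    return max2(d->t_lut, max2(d->t_mul, d->t_add + d->t_mux));
}
```

Whenever the multiplier delay is at least as large as both the LUT delay and the adder-plus-multiplexer delay, the function returns the multiplier delay, matching Tcritical = TM.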

WHAT'S NEW IN THIS?

Customized for transforms
C-code CAD: intuitive
Systolic array architecture: suited for transforms
Easier routing
Specific data path that supports transforms
Better performance
Better utilization of resources

MAPPING ALGORITHMS TO FPGA

DFT SOFTWARE CODE

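#include <math.h>

/* N-point DFT of a real input x.
   Xc[i] is X(i), the cosine sum; Xs[i] is X*(i), the sine sum.
   The DFT output is Xc[i] - j*Xs[i]. */
void dft(const double *x, double *Xc, double *Xs, int N) {
    const double pi = acos(-1.0);
    for (int i = 0; i < N; i++) {
        Xc[i] = 0; Xs[i] = 0;
        for (int j = 0; j < N; j++) {
            Xc[i] += x[j] * cos(2*pi*i*j/N);
            Xs[i] += x[j] * sin(2*pi*i*j/N);
        }
    }
}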

DFT – Higher level

[Figure: 4-point DFT mapped onto 16 logic blocks. Each block holds two LUT entries, one for each of its two inputs. The δ1 blocks produce X(k) and the δ2 blocks produce X*(k). The inputs x(0) to x(3) feed both halves of the array.]

Inputs        X(0)             X*(0)            X(1)             X*(1)
x(0), x(1)    δ1(0,0)δ1(1,0)   δ2(0,0)δ2(1,0)   δ1(0,1)δ1(1,1)   δ2(0,1)δ2(1,1)
x(2), x(3)    δ1(2,0)δ1(3,0)   δ2(2,0)δ2(3,0)   δ1(2,1)δ1(3,1)   δ2(2,1)δ2(3,1)

Inputs        X(2)             X*(2)            X(3)             X*(3)
x(0), x(1)    δ1(0,2)δ1(1,2)   δ2(0,2)δ2(1,2)   δ1(0,3)δ1(1,3)   δ2(0,3)δ2(1,3)
x(2), x(3)    δ1(2,2)δ1(3,2)   δ2(2,2)δ2(3,2)   δ1(2,3)δ1(3,3)   δ2(2,3)δ2(3,3)

DCT/DST SOFTWARE CODE

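#include <math.h>

/* N-point DCT: X[i] = alpha_i * sum_j x[j]*cos(pi*(2j+1)*i/(2N)).
   For the DST, use the kernel sin(pi*(j+1)*(i+1)/(N+1)) and drop alpha_i. */
void dct(const double *x, double *X, int N) {
    const double pi = acos(-1.0);
    for (int i = 0; i < N; i++) {
        X[i] = 0;
        for (int j = 0; j < N; j++)
            X[i] += x[j] * cos(pi*(2*j + 1)*i/(2.0*N));
        X[i] *= (i == 0) ? sqrt(1.0/N) : sqrt(2.0/N);  /* alpha_i */
    }
}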

DCT/DST – HIGHER LEVEL

[Figure: 4-point DCT/DST mapped onto 8 logic blocks, each holding two LUT entries. The figure shows two such 8-block groups, each with its own inputs x(0) to x(3) and outputs X(0) to X(3).]

Inputs        X(0)           X(1)           X(2)           X(3)
x(0), x(1)    δ(0,0)δ(1,0)   δ(0,1)δ(1,1)   δ(0,2)δ(1,2)   δ(0,3)δ(1,3)
x(2), x(3)    δ(2,0)δ(3,0)   δ(2,1)δ(3,1)   δ(2,2)δ(3,2)   δ(2,3)δ(3,3)

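DHT SOFTWARE CODE

#include <math.h>

/* N-point DHT: X[i] = sum_j x[j]*(cos(t) + sin(t)), with t = 2*pi*i*j/N. */
void dht(const double *x, double *X, int N) {
    const double pi = acos(-1.0);
    for (int i = 0; i < N; i++) {
        X[i] = 0;
        for (int j = 0; j < N; j++) {
            double t = 2*pi*i*j/N;
            X[i] += x[j] * (cos(t) + sin(t));
        }
    }
}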

DHT – HIGHER LEVEL

[Figure: 4-point DHT mapped onto 16 logic blocks, one δ(n,k) per block. Input x(n) enters row n and output X(k) leaves column k.]

x(0)   δ(0,0)  δ(0,1)  δ(0,2)  δ(0,3)
x(1)   δ(1,0)  δ(1,1)  δ(1,2)  δ(1,3)
x(2)   δ(2,0)  δ(2,1)  δ(2,2)  δ(2,3)
x(3)   δ(3,0)  δ(3,1)  δ(3,2)  δ(3,3)
       X(0)    X(1)    X(2)    X(3)

MAPPING AT LOGIC BLOCK LEVEL

LOGIC BLOCK LEVEL (DCT/DFT/DST)

[Figure: logic block data path with a LUT (L), two multipliers (×), two adders (+) and three multiplexers (MUX).]

The slides step through one evaluation of the block:

1. The angles θ1 and θ2 address the LUT; the inputs x1 and x2 enter the block.
2. The LUT outputs cos(θ1) and cos(θ2).
3. The multipliers form x1cos(θ1) and x2cos(θ2); x1 and x2 are also forwarded.
4. The adders add the two products to the running sum (labelled X+ in the figure).

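DHT – LOGIC BLOCK LEVEL

[Figure: the same logic block data path, configured for the DHT.]

Both multipliers receive the same input x1. The LUT is addressed with θ1 and 90°-θ1 and outputs cos(θ1) and cos(90°-θ1) = sin(θ1). The multipliers form x1cos(θ1) and x1cos(90°-θ1), and the adders add both products to the running sum (labelled X+ in the figure).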

FPGA OR DSP

SAMPLE RATE > MHZ? FPGA
CONTEXT SWITCH? DSP/FPGA
FLOATING POINT? DSP
C CODE? DSP

FPGA

FUTURE WORK

STATISTICS

N-POINT DFT    N^2
N-POINT DCT    N^2/2
N-POINT DST    N^2/2
N-POINT DHT    N^2
SAD            N^2

THANK YOU! QUESTIONS?