Fast Memory Addressing Scheme for Radix-4 FFT Implementation. Presented by Cheng-Chien Wu, Master Student of CSIE, CCU. Authors: Xin Xiao, Erdal Oruklu and Jafar Saniie (Illinois Institute of Technology). Source: IEEE International Conference on Electro/Information Technology, 2009 (EIT ’09).


Fast Memory Addressing Scheme for Radix-4 FFT Implementation

Presented by Cheng-Chien Wu, Master Student of CSIE, CCU

Authors: Xin Xiao, Erdal Oruklu and Jafar Saniie (Illinois Institute of Technology)

Source: IEEE International Conference on Electro/Information Technology, 2009 (EIT ’09)


Outline

– Introduction
– Radix-4 FFT
– Related Work
– Proposed Method
– Experimental Results
– Conclusion


Introduction

• The Fast Fourier Transform (FFT) is widely used in speech processing, image processing, and communication systems.

• It is one of the key components of signal processing and communications applications such as software-defined radio and OFDM.


Introduction (cont’d)

• The main objective
– This study is primarily concerned with improving the performance of the address generation unit of the FFT processor by eliminating complex components from the critical path.


Introduction (cont’d)

• Important FFT issues
– High throughput
– FFT size
– Power consumption
– Low cost
– Area


Radix-4

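• The N-point discrete Fourier transform (DFT) is defined by

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \quad k = 0, 1, \ldots, N-1, \qquad W_N = e^{-j 2\pi / N}$$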
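As a sketch of how a radix-4 form follows from the DFT, the standard decimation-in-frequency algebra is shown below. It is not reproduced from the slides, and the index symbols m, r and l are introduced here only for illustration. Assuming the usual kernel $W_N = e^{-j2\pi/N}$, split the output index as $k = 4m + r$ and the input index as $n + lN/4$:

```latex
X(4m + r) = \sum_{n=0}^{N/4-1}
  \left[ \sum_{l=0}^{3} x\!\left(n + l\,\tfrac{N}{4}\right) (-j)^{lr} \right]
  W_N^{nr}\, W_{N/4}^{nm},
\qquad r = 0, 1, 2, 3, \quad m = 0, 1, \ldots, \tfrac{N}{4} - 1 .
```

The bracketed inner sum is a 4-point DFT (one butterfly), $W_N^{nr}$ is the twiddle factor applied to its output, and the outer sum is an N/4-point DFT, so the same step can be applied again to each quarter-length sequence.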


Data Path of Radix-4


Butterfly Units

• The N-point FFT can be decomposed into repeated micro-operations called butterfly operations. When the size of the butterfly is r, the FFT operation is called a radix-r FFT.


Butterfly Units in Radix-4
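The figure for this slide did not survive extraction. As a stand-in, here is a minimal Python sketch of a twiddle-free radix-4 butterfly, which is simply a 4-point DFT. The function names `radix4_butterfly` and `dft4` are hypothetical, and the code illustrates the textbook butterfly rather than the authors' hardware unit.

```python
import cmath


def radix4_butterfly(a, b, c, d):
    """Return the four outputs of a twiddle-free radix-4 butterfly.

    Multiplying by +/-1 and +/-j needs only swaps and sign changes,
    which is why the butterfly itself requires no real multipliers.
    """
    return (
        a + b + c + d,
        a - 1j * b - c + 1j * d,
        a - b + c - d,
        a + 1j * b - c - 1j * d,
    )


def dft4(x):
    """Direct 4-point DFT, used only as a cross-check."""
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / 4) for n in range(4))
        for k in range(4)
    ]


if __name__ == "__main__":
    sample = [1 + 2j, -0.5j, 3, 2 - 1j]
    for fast, slow in zip(radix4_butterfly(*sample), dft4(sample)):
        print(fast, abs(fast - slow) < 1e-9)
```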


Memory-based FFT

• In a memory-based FFT architecture, only one butterfly unit is implemented on the chip; this unit executes all the calculations recursively.
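The idea of one butterfly unit reused over a shared memory can be sketched in software as an in-place radix-4 FFT. This is a minimal illustration assuming a decimation-in-frequency ordering (the slides do not say which ordering the authors use); the names `butterfly`, `fft_radix4_in_place` and `digit_reverse4` are hypothetical.

```python
import cmath


def butterfly(a, b, c, d, w1, w2, w3):
    """Radix-4 DIF butterfly: a 4-point DFT followed by three twiddle multiplies."""
    return (
        a + b + c + d,
        (a - 1j * b - c + 1j * d) * w1,
        (a - b + c - d) * w2,
        (a + 1j * b - c - 1j * d) * w3,
    )


def fft_radix4_in_place(mem):
    """Overwrite mem with its DFT in base-4 digit-reversed order; return the pass count."""
    n = len(mem)
    passes = 0
    while 4 ** passes < n:
        passes += 1
    if 4 ** passes != n:
        raise ValueError("length must be a power of 4")
    span = n  # size of each sub-transform in the current pass
    for _ in range(passes):
        q = span // 4  # distance between the four butterfly inputs
        for group in range(0, n, span):
            for i in range(q):
                idx = [group + i + l * q for l in range(4)]
                w = [cmath.exp(-2j * cmath.pi * i * r / span) for r in (1, 2, 3)]
                # Read four words, run the single butterfly, write back in place.
                out = butterfly(*(mem[t] for t in idx), *w)
                for t, y in zip(idx, out):
                    mem[t] = y
        span = q
    return passes


def digit_reverse4(i, digits):
    """Reverse the base-4 digits of i (digits = number of passes)."""
    r = 0
    for _ in range(digits):
        r = (r << 2) | (i & 3)
        i >>= 2
    return r
```

After the call, output bin k sits at address `digit_reverse4(k, passes)`. In hardware the three nested loops correspond to the pass counter and butterfly counter, and the index arithmetic is what the address generation unit has to produce.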


Execution Time

• If parallel and pipeline processing techniques are used, an N-point radix-r FFT can be executed in $(N/r)\log_r N$ clock cycles.

• This indicates that a radix-4 FFT can be four times faster than a radix-2 FFT.
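The four-times figure can be checked with a little arithmetic. The sketch below assumes the cycle count is (N/r)·log_r N, i.e. one butterfly per clock cycle with N/r butterflies in each of log_r N passes; the helper name `fft_cycles` is hypothetical.

```python
def fft_cycles(n, r):
    """Clock cycles for an n-point radix-r FFT, assuming one butterfly per cycle."""
    passes = 0
    size = 1
    while size < n:
        size *= r
        passes += 1
    if size != n:
        raise ValueError("n must be a power of r")
    return (n // r) * passes


if __name__ == "__main__":
    for n in (64, 1024, 4096):
        r2, r4 = fft_cycles(n, 2), fft_cycles(n, 4)
        print(n, r2, r4, r2 / r4)
```

For any N that is a power of 4, the ratio is (N/2)·log2 N divided by (N/4)·(log2 N)/2, which is exactly 4.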


Related Work

Year | Title | Source
1969 | Organization of Large Scale Fourier Processors | J. Assoc. Comput. Mach.
1976 | Simplified control of FFT hardware | IEEE Trans. Acoust., Speech, Signal Processing
1992 | Conflict free memory addressing for dedicated FFT hardware | IEEE Trans. Circuits Syst.
1999 | An effective memory addressing scheme for FFT processors | IEEE Trans. on Signal Process.
2008 | An Efficient FFT Engine With Reduced Addressing Logic | IEEE Transactions on Circuits and Systems II


Data Path of Radix-2


Data Path of Radix-4


Memory Banks

• Four memory banks are used to store the data.


Read Ports and Write Ports

• However, for pass 1 and pass 2, the four inputs and four outputs of any butterfly belong to the same memory bank.

• Since each memory bank is a two-port memory, each bank can be read once and written once per clock cycle.

• Four clock cycles are therefore necessary to perform the four read and four write accesses in pass 1 and pass 2.
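The bank conflict described above can be reproduced with a small model. Everything specific here is an assumption made for illustration, since the slides do not state the data-to-bank mapping: a 64-point decimation-in-frequency FFT, four banks of 16 words, and the bank chosen by the top two address bits. The function names are hypothetical.

```python
def butterfly_addresses(n, p, b):
    """Input addresses of butterfly b in pass p of an n-point radix-4 DIF FFT."""
    span = n // 4 ** p  # size of each sub-transform in this pass
    q = span // 4       # distance between the four inputs
    group, offset = divmod(b, q)
    base = group * span + offset
    return [base + l * q for l in range(4)]


def read_cycles(addresses, bank_size):
    """Cycles needed to read all operands when each bank serves one read per cycle."""
    banks = [a // bank_size for a in addresses]
    return max(banks.count(k) for k in set(banks))


if __name__ == "__main__":
    N, BANK_SIZE = 64, 16  # assumed sizes, not taken from the slides
    for p in range(3):
        worst = max(
            read_cycles(butterfly_addresses(N, p, b), BANK_SIZE)
            for b in range(N // 4)
        )
        print("pass", p, "cycles per butterfly:", worst)
```

Under these assumptions pass 0 draws one operand from each bank, while in passes 1 and 2 all four operands fall in a single bank and the reads have to be serialized over four cycles.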


Counter D

• The other main components of the FFT processor are Counter D and the barrel shifter. Counter D has two parts:
– Pass counter P, which is v = log_4 N bits (P_{v-1} to P_0)
– Butterfly counter B, which is m bits (B_{m-1} to B_0)


Barrel Shifter

• The barrel shifter generates all the addresses for the four memory banks based on the pass number of the FFT, which can be expressed as:

RR(counter B, 2p)

• where RR(counter B, 2p) means rotating butterfly counter B right by 2p bits, and p is the pass number of the FFT.
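The RR operation can be sketched in a few lines of Python. The counter width is a parameter here because the slides leave it unspecified; the 4-bit width in the usage example is only an assumption, and the names `rotate_right` and `bank_address` are hypothetical.

```python
def rotate_right(value, shift, width):
    """Rotate the low `width` bits of `value` right by `shift` positions."""
    if width <= 0:
        raise ValueError("width must be positive")
    shift %= width
    mask = (1 << width) - 1
    value &= mask
    return ((value >> shift) | (value << (width - shift))) & mask


def bank_address(butterfly_count, pass_number, width):
    """RR(counter B, 2p): rotate the butterfly counter right by two bits per pass."""
    return rotate_right(butterfly_count, 2 * pass_number, width)


if __name__ == "__main__":
    WIDTH = 4  # assumed counter width, for illustration only
    for p in range(3):
        print(p, format(bank_address(0b0110, p, WIDTH), "04b"))
```

A rotation is pure rewiring plus multiplexing, so no adder sits on the address path.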


Twiddle Factor

• For the twiddle factors Wb, Wc and Wd, three memory banks are used with the same address generation logic. For pass p, this address is given as:

• (2p 0’s follow)


For Larger FFT Size

• For FFTs of different lengths, the control logic of the multiplexers depends only on the last three bits of the counter, so the register and multiplexer structures are fixed, resulting in a common architecture for any N-point FFT.


Logic Minimization

• Logic minimization leaves only primitive logic gates, such as AND/OR gates, driven by the least significant bits of the butterfly counter B.


Address Sequences (R0~R15)


Address Sequences (R16~R31)


Experimental Results



Conclusions

• The proposed method for radix-4 FFT avoids any addition in the address generation, enabling a fast data path for butterfly operations.

• The same concept can be extended to an FFT of any radix, but the number of registers and multiplexers differs with the radix: for a radix-r FFT, registers and 4r multiplexers are needed.


Thanks for Listening