Implementation of Fast Fourier Transform on General Purpose Computers
Tianxiang Yang
FFT Formulation
• Basically a matrix-vector product:
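$$
\begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ \vdots \\ X_{N-1} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & 1 & \cdots & 1 \\
1 & W_N & W_N^{2} & W_N^{3} & \cdots & W_N^{N-1} \\
1 & W_N^{2} & W_N^{4} & W_N^{6} & \cdots & W_N^{2(N-1)} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & W_N^{N-1} & W_N^{2(N-1)} & W_N^{3(N-1)} & \cdots & W_N^{(N-1)(N-1)}
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_{N-1} \end{bmatrix},
\qquad W_N = e^{-j(2\pi/N)}
$$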
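The matrix-vector form can be written out directly. The following is a minimal Python sketch, assuming the usual convention W_N = e^{-j(2π/N)}; the names `dft_matrix` and `dft` are illustrative and not taken from the slides. It costs N^2 complex multiplications, which is the baseline the fast algorithms improve on.

```python
import cmath

def dft_matrix(n):
    """Build the N x N matrix whose (row, col) entry is W_N^(row*col)."""
    w = cmath.exp(-2j * cmath.pi / n)  # W_N = e^{-j 2 pi / N}
    return [[w ** (row * col) for col in range(n)] for row in range(n)]

def dft(x):
    """Compute X = F x as a plain matrix-vector product (N^2 multiplications)."""
    n = len(x)
    f = dft_matrix(n)
    return [sum(f[k][m] * x[m] for m in range(n)) for k in range(n)]
```

For example, `dft([1, 0, 0, 0])` gives four values equal to 1 (up to rounding), because a unit impulse has a flat spectrum.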
FFT - What do we already have?
• A history of theoretical ideas:
– Gauss (1805). First, but largely unnoticed.
– Cooley-Tukey (1965). Reduces the order of the number of operations from N^2 to N log2(N). Also applies to any composite transform length, not just powers of two.
– Yavne (1968). Requires the lowest known number of multiplications, as well as additions, for length-2^n FFTs.
– Countless others.
Motivation: Divide and Conquer
• Map the original problem into several sub-problems in such a way that the following inequality is satisfied:
sum(cost(subproblems)) + cost(mapping) < cost(original problem)
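The inequality can be made concrete with a radix-2 decimation-in-time split, shown here as a sketch only: it assumes a power-of-two length, and the function names are illustrative. The two sub-problems are half-length DFTs of the even- and odd-indexed samples, and the mapping is N/2 twiddle-factor multiplications.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])  # sub-problem 1: DFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # sub-problem 2: DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):     # mapping: N/2 twiddle multiplications
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def direct_mults(n):
    """Complex multiplications in the direct matrix-vector DFT."""
    return n * n

def fft_mults(n):
    """Twiddle multiplications: T(N) = 2 T(N/2) + N/2, with T(1) = 0."""
    return 0 if n == 1 else 2 * fft_mults(n // 2) + n // 2
```

With one split at N = 8, the cost is 2 * direct_mults(4) + 4 = 36 against direct_mults(8) = 64. Applying the split recursively gives fft_mults(1024) = 5120 against direct_mults(1024) = 1048576.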
Main Categories of FFT Algorithms
• Original Cooley-Tukey.
• Split-radix.
• Prime factor.
• Winograd FFT algorithms.
Many supporting techniques were invented along the way, such as expressing the DFT as a convolution and computing cyclic convolutions.
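As one illustration of recasting a DFT as a convolution, the identity below is the one used in Bluestein's chirp approach. The slide does not say which method it has in mind, so this is an assumed example, written with W_N = e^{-j(2\pi/N)}.

$$
nk = \frac{n^2 + k^2 - (k-n)^2}{2}
\quad\Longrightarrow\quad
X_k = \sum_{n=0}^{N-1} x_n W_N^{nk}
    = W_N^{k^2/2} \sum_{n=0}^{N-1} \left( x_n W_N^{n^2/2} \right) W_N^{-(k-n)^2/2}
$$

The remaining sum is a convolution of the sequence x_n W_N^{n^2/2} with the chirp W_N^{-n^2/2}, so any fast convolution routine can be used to evaluate it.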
Implementation Issues
• General Purpose Computers
• Digital Signal Processors
• Vector and Multi-Processors
• VLSI
Are fewer operations always better?
FFT implementations on GPP
• Implementations under survey include:
– FFTPACK, Temperton, SUNPERF, Sorensen, Bailey, Ooura, Krukar, QFT, Green, Singleton, NRF, FFTW
– Special interest: FFTW (Fastest Fourier Transform in the West)
Overview of FFTW
• Planner + Executor
– FFTW ships a large collection of small, combinable programs called “codelets”.
– The planner tries to minimize the actual execution time, not the number of floating-point operations.
– A dedicated FFTW compiler (genfft) generates the codelets, scheduling them to make good use of registers and memory and to take advantage of the processor pipeline; the executor then combines codelets according to the plan.
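The idea of planning by measured time rather than by operation count can be sketched in a few lines of Python. This is a toy illustration only: `make_plan` and the two candidate routines are hypothetical names, and FFTW's actual planner works on compositions of codelets rather than choosing between whole routines.

```python
import cmath
import time

def dft_direct(x):
    """Direct O(N^2) DFT."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_recursive(x):
    """Radix-2 recursive FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft_recursive(x[0::2]), fft_recursive(x[1::2])
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + tw[k] for k in range(n // 2)] +
            [even[k] - tw[k] for k in range(n // 2)])

def make_plan(n, candidates, repeats=3):
    """Hypothetical planner: time each candidate on this machine, keep the fastest."""
    probe = [complex(i % 7, 0) for i in range(n)]  # arbitrary probe input
    best, best_time = None, float("inf")
    for func in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            func(probe)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best, best_time = func, elapsed
    return best
```

A caller would plan once and then reuse the result, for example `plan = make_plan(64, [dft_direct, fft_recursive])` followed by `plan(signal)` for each length-64 input. Which candidate wins depends on the machine and the length, which is the point of measuring rather than counting.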
FFTW
• Generates code, sometimes in unexpected forms, that is specifically optimized for the current machine. An adaptive approach.
• Performance results:
– Significantly faster than most other implementations surveyed.
– Faster than or comparable to some machine-specific optimized libraries.
– Among the fastest FFT implementations on general-purpose processors at the time of the survey.
References
• A.V. Oppenheim and R.W. Schafer, Discrete-Time Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989.
• P. Duhamel and M. Vetterli, “Fast Fourier Transforms: A Tutorial Review and a State of the Art”, Signal Processing, vol. 19, Apr. 1990.
• http://www.fftw.org (official FFTW site).