The Fast Fourier Transform; Short-Time Fourier Transform
2013-11-27Dan Ellis 1
ELEN E4810: Digital Signal Processing
Topic 10: The Fast Fourier Transform
1. Calculation of the DFT
2. The Fast Fourier Transform algorithm
3. Short-Time Fourier Transform

1. Calculation of the DFT
Filter design so far has been oriented to time-domain processing - cheaper!
But: frequency-domain processing makes some problems very simple:
[Figure: x[n] → DFT → X[k] → Fourier domain processing → Y[k] → IDFT → y[n]]
  use all of x[n], or use short-time windows
Need an efficient way to calculate DFT
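
The DFT
Recall the DFT: discrete transform of discrete sequence
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \qquad W_N = e^{-j 2\pi / N}$$
⇒ W_N^r has only N distinct values (successive powers of W_N are spaced 2π/N apart on the unit circle)
Matrix form:
$$\begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ \vdots \\ X[N-1] \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & W_N^{1} & W_N^{2} & \cdots & W_N^{(N-1)} \\ 1 & W_N^{2} & W_N^{4} & \cdots & W_N^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & W_N^{(N-1)} & W_N^{2(N-1)} & \cdots & W_N^{(N-1)^2} \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \\ x[N-1] \end{bmatrix}$$
Structure ⇒ opportunities for efficiency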

Computational Complexity
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}$$
N complex multiplies + N-1 complex adds per point (k), × N points (k = 0..N-1)
  cpx mult: (a+jb)(c+jd) = ac - bd + j(ad + bc) = 4 real mults + 2 real adds
  cpx add = 2 real adds
N points: 4N^2 real mults, 4N^2 - 2N real adds
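
The operation count above can be made concrete with a direct evaluation of the DFT sum. The sketch below is illustrative only: the names `dft_direct` and `real_op_counts` are made up for this example, and it assumes the convention W_N = exp(-j2π/N). It tabulates the N distinct values of W_N^r once and then evaluates all N output points with N multiplies each.

```python
import cmath

def dft_direct(x):
    """Direct N-point DFT: X[k] = sum_n x[n] * W_N^(k*n), W_N = exp(-j*2*pi/N)."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)
    # W_N^r has only N distinct values: tabulate once, index by (k*n) mod N.
    table = [W ** r for r in range(N)]
    return [sum(x[n] * table[(k * n) % N] for n in range(N)) for k in range(N)]

def real_op_counts(N):
    """(real multiplies, real adds) for the direct DFT, as counted on the slide."""
    return 4 * N * N, 4 * N * N - 2 * N

X = dft_direct([1, 2, 3, 4])
mults, adds = real_op_counts(1024)
```

For N = 1024 the count is already over four million real multiplies, which is the motivation for the faster algorithms that follow.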
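
Goertzel's Algorithm
Now:
$$X[k] = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{-k(N-\ell)} \qquad (\text{since } W_N^{-kN} = 1)$$
looks like a convolution
i.e.
$$X[k] = y_k[N], \qquad y_k[n] = x_e[n] \ast h_k[n]$$
where
$$x_e[n] = \begin{cases} x[n] & 0 \le n < N \\ 0 & n = N \end{cases} \qquad h_k[n] = \begin{cases} W_N^{-kn} & n \ge 0 \\ 0 & n < 0 \end{cases}$$
[Figure: first-order recursive filter, input x_e[n] into an adder whose output y_k[n] is fed back through z^{-1} scaled by W_N^{-k}; x_e[N] = 0, y_k[-1] = 0, y_k[N] = X[k]]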

Goertzel's Algorithm
Separate 'filters' for each X[k]
  can calculate for just a few values of k
No large buffer, no coefficient table
Same complexity for full X[k] (4N^2 mults, 4N^2 - 2N adds)
  but: can halve multiplies by making the denominator real:
$$H(z) = \frac{1}{1 - W_N^{-k} z^{-1}} = \frac{1 - W_N^{k} z^{-1}}{1 - 2\cos\!\left(\frac{2\pi k}{N}\right) z^{-1} + z^{-2}}$$
  numerator: evaluate only for last step
  denominator: 2 real mults per step
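
The real-denominator form can be sketched as a second-order recursion followed by a single numerator step. This is a minimal illustration, not a tuned implementation: the function name `goertzel` and the state names `s1`, `s2` are chosen for this example. It runs s[n] = x_e[n] + 2cos(2πk/N) s[n-1] - s[n-2] over the N input samples plus the extra zero x_e[N] = 0, then applies the numerator (1 - W_N^k z^{-1}) once at n = N.

```python
import cmath
import math

def goertzel(x, k):
    """Return X[k] of the N-point DFT of x via the second-order Goertzel recursion."""
    N = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / N)  # real denominator coefficient
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    # Feed x_e[n]: the N input samples followed by x_e[N] = 0.
    for sample in list(x) + [0.0]:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    # Here s1 = s[N] and s2 = s[N-1]; apply the numerator once, at the last step.
    W_k = cmath.exp(-2j * math.pi * k / N)
    return s1 - W_k * s2

X1 = goertzel([1, 2, 3, 4], 1)
```

Only the loop runs per sample, and for real input it needs just the one multiply by `coeff`; the complex multiply by `W_k` happens once per output value.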

2. Fast Fourier Transform (FFT)
Reduce complexity of DFT from O(N^2) to O(N·logN)
  grows more slowly with larger N
Works by decomposing large DFT into several stages of smaller DFTs
Often provided as a highly optimized library
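
To get a feel for how much more slowly N·log2 N grows, the ratio N^2 / (N·log2 N) = N / log2 N can be tabulated for a few sizes. This is only a rough sketch that ignores constant factors; `growth_ratio` is a name made up for this example.

```python
import math

def growth_ratio(N):
    """Ratio N^2 / (N * log2 N), ignoring constant factors."""
    return N / math.log2(N)

for N in (64, 1024, 65536):
    print(N, round(growth_ratio(N), 1))
```

At N = 1024 the ratio is about 100, and it keeps widening as N grows.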

Decimation in Time (DIT) FFT
Can rearrange DFT formula in 2 halves (k = 0..N-1):
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}$$
Arrange terms in pairs...
$$= \sum_{m=0}^{N/2-1} \left( x[2m]\, W_N^{2mk} + x[2m+1]\, W_N^{(2m+1)k} \right)$$
Group terms from each pair:
$$= \sum_{m=0}^{N/2-1} x[2m]\, W_{N/2}^{mk} + W_N^{k} \sum_{m=0}^{N/2-1} x[2m+1]\, W_{N/2}^{mk}$$
  first sum: N/2-pt DFT of x for even n, X_0[\langle k \rangle_{N/2}]
  second sum: N/2-pt DFT of x for odd n, X_1[\langle k \rangle_{N/2}]

Decimation in Time (DIT) FFT
$$\mathrm{DFT}_N\{x[n]\} = \mathrm{DFT}_{N/2}\{x_0[n]\} + W_N^{k}\, \mathrm{DFT}_{N/2}\{x_1[n]\}$$
  x_0[n]: x[n] for even n; x_1[n]: x[n] for odd n
We can evaluate an N-pt DFT as two N/2-pt DFTs (plus a few mults/adds)
But if DFT_N{•} ~ O(N^2) then DFT_{N/2}{•} ~ O((N/2)^2) = 1/4 O(N^2)
⇒ Total computation ~ 2 × 1/4 O(N^2) = 1/2 the computation (+ε) of direct DFT
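
A single decimation-in-time stage can be sketched as follows. This is an illustrative example under the convention W_N = exp(-j2π/N); the helper names `dft_direct` and `dft_one_stage_dit` are not from the slides. The two half-length DFTs are still computed directly here, so only the combination step is new.

```python
import cmath

def dft_direct(x):
    """Direct DFT, used here for the two half-length transforms."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_one_stage_dit(x):
    """N-point DFT (N even) built from two N/2-point DFTs of even and odd samples."""
    N = len(x)
    half = N // 2
    X0 = dft_direct(x[0::2])  # N/2-pt DFT of x[n] for even n
    X1 = dft_direct(x[1::2])  # N/2-pt DFT of x[n] for odd n
    # X[k] = X0[<k>_{N/2}] + W_N^k * X1[<k>_{N/2}],  k = 0..N-1
    return [X0[k % half] + cmath.exp(-2j * cmath.pi * k / N) * X1[k % half]
            for k in range(N)]

x = [1, 2, 3, 4, 5, 6, 7, 8]
X = dft_one_stage_dit(x)
```

The half-length results are reused for k and k + N/2, which is where the saving comes from.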

One-Stage DIT Flowgraph
$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}]$$
Classic FFT structure
[Flowgraph, N = 8: even points from x[n] (x[0], x[2], x[4], x[6]) → DFT_{N/2} → X0[0..3]; odd points from x[n] (x[1], x[3], x[5], x[7]) → DFT_{N/2} → X1[0..3]; outputs X[0..7] formed by adding the X1 terms scaled by W_N^0 ... W_N^7]
X[4..7]: same as X[0..3] except for factors on X1[•] terms
"twiddle factors": always apply to odd-terms output, NOT mirror-image

Multiple DIT Stages
If decomposing one DFT_N into two smaller DFT_{N/2}'s speeds things up ... why not further divide into DFT_{N/4}'s?
i.e.
$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}], \qquad 0 \le k < N$$
make:
$$X_0[k] = X_{00}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{01}[\langle k \rangle_{N/4}], \qquad 0 \le k < N/2$$
  X_00: N/4-pt DFT of even points in even subset of x[n]
  X_01: N/4-pt DFT of odd points from even subset
Similarly,
$$X_1[k] = X_{10}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{11}[\langle k \rangle_{N/4}]$$

Two-Stage DIT Flowgraph
[Flowgraph, N = 8: inputs ordered x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7] feed four DFT_{N/4} blocks (X00, X01, X10, X11); a first combining stage with twiddle factors W_{N/2}^0 ... W_{N/2}^3 forms X0[0..3] and X1[0..3] (different from before); a second combining stage with twiddle factors W_N^0 ... W_N^7 forms X[0..7] (same as before)]

Multi-stage DIT FFT
Can keep doing this until we get down to 2-pt DFTs:
  DFT_2: X[0] = x[0] + x[1], X[1] = x[0] - x[1]   (1 = W_2^0, -1 = W_2^1)
  ≡ "butterfly" element
→ N = 2^M-pt DFT reduces to M stages of twiddle factors & summation (O(N^2) part vanishes)
→ real mults < M·4N, real adds < 2M·2N
→ complexity ~ O(N·M) = O(N·log2 N)
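
Carrying the decomposition all the way down can be written as a short recursion. The sketch below is a plain, unoptimized illustration (the name `fft_dit` is chosen for this example) and assumes the input length is a power of two. It relies on the identity W_N^{r+N/2} = -W_N^r, so each butterfly needs one complex multiply followed by an add and a subtract.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 0 or (N & (N - 1)):
        raise ValueError("length must be a power of two")
    if N == 1:
        return list(x)
    X0 = fft_dit(x[0::2])  # DFT of even-indexed samples
    X1 = fft_dit(x[1::2])  # DFT of odd-indexed samples
    X = [0j] * N
    for r in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * r / N) * X1[r]  # twiddle W_N^r
        X[r] = X0[r] + t
        X[r + N // 2] = X0[r] - t                      # W_N^(r+N/2) = -W_N^r
    return X

X = fft_dit([1, 2, 3, 4])
```

There are log2 N levels of recursion, each doing N/2 butterflies, which gives the O(N·log2 N) behaviour stated above.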
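
FFT Implementation Details
Basic butterfly (at any stage):
[Figure: XX0[r] and XX1[r] combine into XX[r] (XX1 scaled by W_N^r) and XX[r+N/2] (XX1 scaled by W_N^{r+N/2}): 2 cpx mults]
Can simplify:
$$W_N^{r+N/2} = e^{-j 2\pi (r + N/2)/N} = e^{-j 2\pi r / N}\, e^{-j 2\pi (N/2)/N} = -W_N^{r}$$
i.e. SUB rather than ADD
[Figure: XX1[r] is scaled once by W_N^r, then added to XX0[r] for XX[r] and subtracted (-1) for XX[r+N/2]: just one cpx mult!]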

8-pt DIT FFT Flowgraph
[Flowgraph: inputs in bit-reversed index order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7] (binary 000, 100, 010, 110, 001, 101, 011, 111); three butterfly stages with twiddle factors W_4 in the second stage and W_8, W_8^2, W_8^3 in the third; outputs X[0] ... X[7] in natural order]
-1's absorbed into summation nodes
W_N^0 disappears
'in-place' algorithm: sequential stages
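
The bit-reversed input ordering and the in-place, stage-by-stage structure of the flowgraph can be sketched as below. This is a minimal illustration rather than an optimized routine; `bit_reverse` and `fft_in_place` are names chosen for this example, and the length is assumed to be a power of two.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of index i, e.g. 001 -> 100 for bits = 3."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_in_place(a):
    """Iterative radix-2 DIT FFT that overwrites the list a (power-of-two length)."""
    N = len(a)
    bits = N.bit_length() - 1
    if N == 0 or (1 << bits) != N:
        raise ValueError("length must be a power of two")
    # Reorder the input into bit-reversed index order.
    for i in range(N):
        j = bit_reverse(i, bits)
        if i < j:
            a[i], a[j] = a[j], a[i]
    # log2(N) sequential stages; each butterfly reads and writes the same two slots.
    size = 2
    while size <= N:
        half = size // 2
        for start in range(0, N, size):
            for r in range(half):
                t = cmath.exp(-2j * cmath.pi * r / size) * a[start + r + half]
                u = a[start + r]
                a[start + r] = u + t
                a[start + r + half] = u - t
        size *= 2
    return a

order = [bit_reverse(i, 3) for i in range(8)]
X = fft_in_place([1, 2, 3, 4])
```

Because every butterfly overwrites its own two inputs, no second buffer is needed, which is what 'in-place' refers to.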

FFT for Other Values of N
Having N = 2^M meant we could divide each stage into 2 halves = "radix-2 FFT"
Same approach works for:
  N = 3^M: radix-3
  N = 4^M: radix-4 - more optimized radix-2
  etc...
Composite N = a·b·c·d → mixed radix (different N/r point FFTs at each stage)
.. or just zero-pad to make N = 2^M
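
Zero-padding to the next power of two is simple to sketch. The helper names `next_pow2` and `zero_pad_to_pow2` are made up for this example. Note that the padded transform samples the same underlying spectrum on a denser frequency grid, so its bins do not line up one-to-one with those of the original N-point DFT.

```python
def next_pow2(N):
    """Smallest power of two that is >= N (for N >= 1)."""
    return 1 << (N - 1).bit_length()

def zero_pad_to_pow2(x):
    """Return a copy of x extended with zeros to a power-of-two length."""
    return list(x) + [0] * (next_pow2(len(x)) - len(x))

padded = zero_pad_to_pow2([1, 2, 3, 4, 5])
```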
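
Inverse FFT
Recall IDFT:
$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, W_N^{-nk}$$
  (the 1/N factor and the negated exponent are the only differences from forward DFT)
Thus:
$$N x^{*}[n] = \sum_{k=0}^{N-1} \left( X[k]\, W_N^{-nk} \right)^{*} = \sum_{k=0}^{N-1} X^{*}[k]\, W_N^{nk}$$
  = Forward DFT of x'[n] = X*[k]|_{k=n}, i.e. time sequence made from spectrum
Hence, use FFT to calculate IFFT:
$$x[n] = \frac{1}{N} \left[ \sum_{k=0}^{N-1} X^{*}[k]\, W_N^{nk} \right]^{*}$$
[Pure real flowgraph: Re{X[k]} and Im{X[k]} scaled by -1 enter the DFT; the Re output scaled by 1/N gives Re{x[n]} and the Im output scaled by -1/N gives Im{x[n]}]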
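
The conjugation trick can be checked numerically with any forward transform. The sketch below uses a direct DFT to keep the example self-contained; an FFT routine could be substituted without changing `idft_via_dft`. Both function names are made up for this example.

```python
import cmath

def dft(x):
    """Forward DFT (direct form); stands in for any forward FFT routine."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft_via_dft(X):
    """Inverse DFT using only the forward transform: x = conj(DFT{conj(X)}) / N."""
    N = len(X)
    y = dft([complex(v).conjugate() for v in X])
    return [v.conjugate() / N for v in y]

x = [1 + 2j, -3, 0.5j, 4]
x_back = idft_via_dft(dft(x))
```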

DFT of Real Sequences
If x[n] is pure-real, DFT wastes mult's
Real x[n] → Conj. symm. X[k] = X*[-k]
Given two real sequences, x[n] and w[n]:
  call y[n] = j·w[n], v[n] = x[n] + y[n]
  N-pt DFT: V[k] = X[k] + Y[k]
  but: V[k] + V*[-k] = X[k] + X*[-k] + Y[k] + Y*[-k], where X*[-k] = X[k] and Y*[-k] = -Y[k]
  ⇒ X[k] = 1/2 (V[k] + V*[-k]), W[k] = -j/2 (V[k] - V*[-k])
i.e. compute DFTs of two N-pt real sequences with a single N-pt DFT
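
The separation formulas can be sketched directly. This is an illustrative example (`dft_two_real` is a name chosen here), with the index -k interpreted modulo N as usual for the DFT, and a direct DFT standing in for whatever transform routine is available.

```python
import cmath

def dft(x):
    """Forward DFT (direct form); stands in for any FFT routine."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_two_real(x, w):
    """DFTs of two equal-length real sequences x and w from one complex DFT."""
    N = len(x)
    V = dft([complex(a, b) for a, b in zip(x, w)])  # v[n] = x[n] + j*w[n]
    X, W = [], []
    for k in range(N):
        Vc = V[(-k) % N].conjugate()                # V*[-k], index modulo N
        X.append(0.5 * (V[k] + Vc))
        W.append(-0.5j * (V[k] - Vc))
    return X, W

x = [1.0, 2.0, 3.0, 4.0]
w = [0.0, 1.0, 0.0, -1.0]
X, W = dft_two_real(x, w)
```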

3. Short-Time Fourier Transform (STFT)
Fourier Transform (e.g. DTFT) gives spectrum of an entire sequence
How to see a time-varying spectrum?
  e.g. slow AM of a sinusoid carrier:
$$x[n] = \left( 1 - \cos\frac{2\pi n}{N} \right) \cos \omega_0 n$$
[Figure: x[n] vs n for 0 ≤ n ≤ 1000, amplitude between -2 and 2]

Fourier Transform of AM Sine
Spectrum of whole sequence indicates modulation indirectly...
... as cancellation between closely-tuned sines (2 c_A c_B = c_{A+B} + c_{A-B})
[Figure: |X[k]| vs k/(N/2), with values N and N/2 marked on the vertical axis; below it, three component waveforms vs n (0 to 896): N sin(2πkn/N), -(N/2) sin(2π(k-1)n/N) and -(N/2) sin(2π(k+1)n/N)]
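
The 'cancellation between closely-tuned sines' follows from the product-to-sum identity 2 cos A cos B = cos(A+B) + cos(A-B). As a worked step, assuming the AM signal has the form x[n] = (1 - cos(2πn/N)) cos ω_0 n:

```latex
\begin{aligned}
x[n] &= \left(1 - \cos\tfrac{2\pi n}{N}\right)\cos\omega_0 n \\
     &= \cos\omega_0 n
        - \tfrac{1}{2}\cos\!\left(\left(\omega_0 + \tfrac{2\pi}{N}\right) n\right)
        - \tfrac{1}{2}\cos\!\left(\left(\omega_0 - \tfrac{2\pi}{N}\right) n\right)
\end{aligned}
```

If the carrier sits on a DFT bin, ω_0 = 2πk/N, the three terms fall exactly on bins k, k+1 and k-1 with relative amplitudes 1, -1/2 and -1/2. The modulation therefore shows up only as two small neighbouring lines, not as anything that visibly varies with time.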

Fourier Transform of AM Sine
Sometimes we'd rather separate modulation and carrier: x[n] = A[n] cos ω_0 n
  A[n] varies on a different (slower) timescale
One approach:
  chop x[n] into short sub-sequences ..
  .. where slow modulator is ~ constant
  DFT spectrum of pieces → show variation
[Figure: spectrum vs ω with a peak at ω_0 whose height follows A[n]]

FT of Short Segments
Break up x[n] into successive, shorter chunks of length N_FT, then DFT each:
[Figure: x[n] for 0 ≤ n ≤ 1024 = N split into eight segments x0[n] ... x7[n] with N_FT = N/8; below each segment its DFT magnitude X0[k] ... X7[k], peaking at k_0 = ω_0 · N_FT / 2π]
Shows amplitude modulation of ω_0 energy

The Spectrogram
Plot successive DFTs in time-frequency:
[Figure: the DFT magnitudes X0[k] ... X7[k] are placed side by side as columns X_i[k] of an image X[k,n], with k on the vertical axis and n on the horizontal axis (128 to 1024); time hopsize (between successive frames) = 128 points]
This image is called the Spectrogram

Short-Time Fourier Transform
Spectrogram = STFT magnitude plotted on time-frequency plane
  intensity as a function of time & frequency
STFT is (DFT form):
$$X[k, n_0] = \sum_{n=0}^{N_{FT}-1} x[n_0 + n] \cdot w[n] \cdot e^{-j 2\pi k n / N_{FT}}$$
  k: frequency index; n_0: time index
  x[n_0 + n]: N_FT points of x starting at n_0
  w[n]: window
  e^{-j 2π k n / N_FT}: DFT kernel
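
The DFT form of the STFT translates almost line for line into code. The sketch below is illustrative only: the function name `stft`, the parameters `n_ft` and `hop`, and the Hamming-shaped window are assumptions made for this example, and each frame is transformed with a direct DFT to keep the block self-contained.

```python
import cmath
import math

def stft(x, n_ft, hop):
    """Return a list of frames; frame r holds X[k, r*hop] for k = 0..n_ft-1."""
    # Assumed window: Hamming shape on n = 0..n_ft-1.
    w = [0.54 - 0.46 * math.cos(2 * math.pi * n / n_ft) for n in range(n_ft)]
    frames = []
    for n0 in range(0, len(x) - n_ft + 1, hop):
        seg = [x[n0 + n] * w[n] for n in range(n_ft)]  # n_ft points starting at n0
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / n_ft) for n in range(n_ft))
            for k in range(n_ft)
        ])
    return frames

# Slow AM of a carrier placed on bin 4 of a 32-point frame.
N = 256
x = [(1 - math.cos(2 * math.pi * n / N)) * math.cos(2 * math.pi * 4 * n / 32)
     for n in range(N)]
frames = stft(x, 32, 32)
```

Taking `abs()` of each entry and plotting the frames as columns gives the spectrogram. In this toy signal the energy at bin 4 rises and falls from frame to frame with the modulator.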

STFT Window Shape
w[n] provides 'time localization' of STFT
  e.g. rectangular selects x[n], n_0 ≤ n < n_0 + N_W
But: resulting spectrum has same problems as windowing for FIR design:
DTFT form of STFT:
$$X(e^{j\omega}, n_0) = \mathrm{DTFT}\{ x[n_0 + n] \cdot w[n] \} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{j\theta n_0}\, X(e^{j\theta})\, W(e^{j(\omega - \theta)})\, d\theta$$
spectrum of short-time window is convolved with (twisted) parent spectrum

STFT Window Shape
e.g. if x[n] is a pure sinusoid,
[Figure: line spectrum X(e^{jω}) convolved with W(e^{jω}) gives blurring (mainlobe) + ghosting (sidelobes)]
Hence, use tapered window for w[n]
  e.g. Hamming:
$$w[n] = 0.54 + 0.46 \cos\!\left( \frac{2\pi n}{2M+1} \right), \qquad -M \le n \le M$$
  sidelobes < -40 dB
[Figure: w[n] for -10 ≤ n ≤ 10 and its spectrum W(e^{jω})]

STFT Window Length
Length of w[n] sets temporal resolution
  short window measures only local properties
  longer window averages spectral character
Window length ∝ 1/(Mainlobe width)
  shorter window → more blurred spectrum
more time detail ↔ less frequency detail
[Figure: x[n] together with x[n]·w_L[n] and x[n]·w_S[n] for 0 ≤ n ≤ 1000; the long window w_L[n] and short window w_S[n] (lengths N1 and N2 points) with their spectra W_L(e^{jω}) and W_S(e^{jω}), first zeros at 4π/N1 and 4π/N2]

STFT Window Length
Can illustrate time-frequency tradeoff on the time-frequency plane:
[Figure: time-frequency plane, n from 0 to 300 and k from 0 to 250; disks show 'blurring' due to window length; area of disk is constant]
→ Uncertainty principle: Δf·Δt ≥ k
Alternate tilings of time-freq:
  half-length window → half as many DFT samples

Spectrograms of Real Sounds
[Figure: time-domain waveform and spectrogram of a real sound, time / s (0 to 2.5) vs freq / Hz (0 to 4000), intensity / dB colour scale from -50 to 10; a zoomed panel covering 2.35 to 2.6 s shows the successive short DFTs]
individual t-f cells merge into continuous image

Narrowband vs. Wideband
Effect of varying window length:
[Figure: two spectrograms of the same sound, time / s (1.4 to 2.6) vs freq / Hz (0 to 4000), level / dB from -50 to 10: Window = 256 pt (Narrowband) and Window = 48 pt (Wideband)]

Spectrogram in Matlab
>> [d,sr]=wavread('mpgr1_sx419.wav');
>> Nw=256;              % (hann) window length
>> specgram(d,Nw,sr)    % sr: actual sampling rate (to label time axis)
>> caxis([-80 0])
>> colorbar
[Figure: spectrogram, Time (0.5 to 3) vs Frequency (0 to 8000), colour bar in dB from -80 to 0]

STFT as a Filterbank
Consider one 'row' of STFT (just one freq.):
$$X_k[n_0] = \sum_{n=0}^{N-1} x[n_0 + n] \cdot w[n] \cdot e^{-j 2\pi k n / N} = \sum_{m=0}^{-(N-1)} h_k[m]\, x[n_0 - m]$$
  i.e. convolution with complex IR
where
$$h_k[n] = w[-n] \cdot e^{j 2\pi k n / N}$$
Each STFT row is output of a filter (subsampled by the STFT hop size)
[Figure: complex impulse response, Re{x[n]} and Im{x[n]} vs n for -60 ≤ n ≤ 0]

STFT as a Filterbank
If
$$h_k[n] = w[-n] \cdot e^{j 2\pi k n / N}$$
then
$$H_k(e^{j\omega}) = W\!\left( e^{-j(\omega - 2\pi k / N)} \right) \qquad \text{(shift-in-}\omega\text{)}$$
Each STFT row is the same bandpass response defined by W(e^{jω}), frequency-shifted to a given DFT bin:
[Figure: W(e^{jω}) and its shifted copies H_1(e^{jω}), H_2(e^{jω}), ...]
A bank of identical, frequency-shifted bandpass filters: "filterbank"

STFT Analysis-Synthesis
IDFT of STFT frames can reconstruct (part of) original waveform
  e.g. if X[k, n_0] = DFT{ x[n_0 + n]·w[n] }
  then IDFT{ X[k, n_0] } = x[n_0 + n]·w[n]
Can shift by n_0, combine, to get x̂[n] = x[n]·w[n - n_0]
[Figure: x̂[n] = x[n]·w[n - n_0] vs n, located around n_0]
Could divide by w[n - n_0] to recover x[n]...

STFT Analysis-Synthesis
Dividing by small values of w[n] is bad
Prefer to overlap windows:
  i.e. sample X[k, n_0] at n_0 = r·H where H = N/2 (for example); H is the hopsize, N the window length
Then
$$\hat{x}[n] = \sum_{r} x[n]\, w[n - rH] = x[n] \qquad \text{if} \qquad \sum_{r} w[n - rH] = 1$$
[Figure: overlapping windowed segments x[n]·w[n - r·H] and their sum x̂[n]]

STFT Analysis-Synthesis
Hann or Hamming windows with 50% overlap sum to constant:
$$\left( 0.54 + 0.46 \cos\!\left( \frac{2\pi n}{N} \right) \right) + \left( 0.54 + 0.46 \cos\!\left( \frac{2\pi (n - N/2)}{N} \right) \right) = 1.08$$
[Figure: w[n], w[n - N/2] and w[n] + w[n - N/2] for 0 ≤ n ≤ 80]
Can modify individual frames of X[k,n] and then reconstruct
  complex, time-varying modifications
  tapered overlap makes things OK
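
The constant-sum property is easy to verify numerically. In the sketch below (the helper name `overlap_sum` is made up for this example) the windows are written on n = 0..N-1, so the Hamming window appears as 0.54 - 0.46 cos(2πn/N). That is the same shape as the centred form above, shifted by N/2.

```python
import math

def overlap_sum(window, hop, n_frames):
    """Sum of n_frames copies of `window`, each shifted by `hop` samples."""
    N = len(window)
    total = [0.0] * (hop * (n_frames - 1) + N)
    for r in range(n_frames):
        for n in range(N):
            total[r * hop + n] += window[n]
    return total

N = 64
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / N) for n in range(N)]
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]
ham_sum = overlap_sum(hamming, N // 2, 6)
hann_sum = overlap_sum(hann, N // 2, 6)
```

Away from the first and last half-window, where only one window is present, the Hamming sum is 1.08 and the Hann sum is 1.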

STFT Analysis-Synthesis
e.g. Noise reduction:
[Figure: three time-frequency panels, frame index r (100 to 300) vs k (20 to 120): STFT of original speech; speech corrupted by white noise; energy threshold mask]
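
The analyse, modify, resynthesise loop with an energy threshold mask can be sketched end to end. This is a toy illustration, not the processing used for the figure. The function names, the Hann window with hop N/2, and the hard threshold on |X|^2 are all assumptions made for this example, and a weak extra tone stands in for the noise so that the result is easy to check.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct DFT, or inverse DFT when inverse=True; stands in for an FFT."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out

def stft_mask_resynth(x, N, threshold):
    """Hann-windowed STFT (hop N/2), zero the cells with |X|^2 < threshold, overlap-add."""
    hop = N // 2
    w = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]
    y = [0.0] * len(x)
    for n0 in range(0, len(x) - N + 1, hop):
        X = dft([x[n0 + n] * w[n] for n in range(N)])
        # Energy threshold mask: keep a time-frequency cell only if it is strong enough.
        X = [v if abs(v) ** 2 >= threshold else 0j for v in X]
        frame = dft(X, inverse=True)
        for n in range(N):
            y[n0 + n] += frame[n].real  # overlap-add; the Hann windows sum to 1
    return y

L = 128
clean = [math.cos(2 * math.pi * 4 * n / 32) for n in range(L)]
noisy = [clean[n] + 0.01 * math.cos(2 * math.pi * 10 * n / 32) for n in range(L)]
same = stft_mask_resynth(noisy, 32, 0.0)      # nothing masked: reconstructs the input
cleaned = stft_mask_resynth(noisy, 32, 1.0)   # weak component masked out
```

With the threshold at zero the interior of the output matches the input, which confirms the overlap-add reconstruction. With the threshold raised, the weak component is removed and the interior matches the clean tone.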