the fast fourier transform; short-time fourier transform

2013-11-27Dan Ellis 1

ELEN E4810: Digital Signal ProcessingTopic 10:

The Fast Fourier Transform1. Calculation of the DFT2. The Fast Fourier Transform algorithm3. Short-Time Fourier Transform

2013-11-27Dan Ellis 2

1. Calculation of the DFT Filter design so far has been oriented to

time-domain processing - cheaper! But: frequency-domain processing

makes some problems very simple:

use all of x[n], or use short-time windows Need an efficient way to calculate DFT

Fourier domainprocessingx[n] y[n]DFT IDFT

X[k] Y[k]

2013-11-27Dan Ellis 3

The DFT Recall the DFT:

discrete transform of discrete sequence

Matrix form:�

X[k] = x[n]WNkn

n=0

N1

�

WN = e j2

N( )

⇒ WNr has only

N distinct values

WN@2º/N

Structure ⇒ opportunities

for efficiency

�

⇧⇧⇧⇧⇧⇤

X[0]X[1]X[2]

...X[N � 1]

⇥

⌃⌃⌃⌃⌃⌅=

�

⇧⇧⇧⇧⇧⇧⇤

1 1 1 · · · 11 W 1

N W 2N · · · W (N�1)

N

1 W 2N W 4

N · · · W 2(N�1)N

......

.... . .

...1 W (N�1)

N W 2(N�1)N · · · W (N�1)2

N

⇥

⌃⌃⌃⌃⌃⌃⌅

�

⇧⇧⇧⇧⇧⇤

x[0]x[1]x[2]

...x[N � 1]

⇥

⌃⌃⌃⌃⌃⌅

2013-11-27Dan Ellis 4

Computational Complexity

N complex multiplies + N-1 complex adds per point (k) × N points (k = 0.. N-1) cpx mult: (a+jb)(c+jd) = ac - bd + j(ad + bc)

= 4 real mults + 2 real adds cpx add = 2 real adds

N points: 4N2 real mults, 4N2-2N real adds

�

X[k] = x[n]WNkn

n=0

N1

2013-11-27Dan Ellis 5

Goertzel’s Algorithm Now:

i.e.where

�

X k[ ] = x [ ]WNk

=0

N1

=WN

kN x [ ] WN

k N( )

looks like aconvolution

�

X k[ ] = yk N[ ]yk n[ ] = xe n[ ] hk[n]

xe[n] = {x[n] 0 ≤ n < N0 n = N

hk[n] = {WN-kn n ≥ 0

0 n < 0+

z-1WN-k

xe[n]xe[N] = 0

yk[n]yk[-1] = 0yk[N] = X[k]

2013-11-27Dan Ellis 6

Goertzel’s Algorithm Separate ‘filters’ for each X[k]

can calculate for just a few values of k No large buffer, no coefficient table Same complexity for full X[k]

(4N2 mults, 4N2 - 2N adds) but: can halve multiplies by making the

denominator real:

�

H z( ) = 11WN

k z1= 1WN

kz1

1 2cos 2kN z1 + z2

evaluate onlyfor last step

2 real multsper step

2013-11-27Dan Ellis 7

2. Fast Fourier Transform FFT Reduce complexity of DFT

from O(N2) to O(N·logN) grows more slowly with larger N

Works by decomposing large DFT into several stages of smaller DFTs

Often provided as a highly optimized library

2013-11-27Dan Ellis 8

Decimation in Time (DIT) FFT Can rearrange DFT formula in 2 halves:

�

X k[ ] = x n[ ] WNnk

n=0

N1

= x 2m[ ] WN2mk + x 2m +1[ ] WN

2m+1( )k( )m=0

N21

= x 2m[ ] WN2

mk

m=0

N21

+WNk x 2m +1[ ] WN

2

mk

m=0

N21

N/2 pt DFT of x for even n N/2 pt DFT of x for odd nX0[<k>N/2] X1[<k>N/2]

Arrangeterms

in pairs...

Group termsfrom each

pair

k = 0.. N-1

2013-11-27Dan Ellis 9

Decimation in Time (DIT) FFT

We can evaluate an N-pt DFT as two N/2-pt DFTs (plus a few mults/adds)

But if DFTN{•} ~ O(N2)then DFTN/2{•} ~ O((N/2)2) = 1/4 O(N2)

⇒ Total computation ~ 2 1/4 O(N2)= 1/2 the computation (+") of direct DFT

�

DFTN x n[ ]{ } = DFTN2x0 n[ ]{ }+WN

kDFTN2x1 n[ ]{ }

x[n] for even n x[n] for odd n

2013-11-27Dan Ellis 10

One-Stage DIT Flowgraph

Classic FFT structure

Even pointsfrom x[n]

Odd pointsfrom x[n]

Same as X[0..3]

except forfactors on

X1[•] terms

“twiddle factors”:always apply to

odd-terms outputNOT mirror-image

�

X k[ ] = X0 k N2

[ ] +WNkX1 k N

2[ ]

x[0]x[2]x[4]x[6]

x[1]x[3]x[5]x[7]

X[0]X[1]X[2]X[3]

X[4]X[5]X[6]X[7]

X0[0]X0[1]X0[2]X0[3]

X1[0]X1[1]X1[2]X1[3]

DFTN2

DFTN2

WN0

WN1

WN2

WN3

WN4

WN5

WN6

WN7

2013-11-27Dan Ellis 11

If decomposing one DFTN into two smaller DFTN/2’s speeds things up ...Why not further divide into DFTN/4’s ?

i.e.

make:

Similarly,

Multiple DIT Stages

�

X k[ ] = X0 k N2

[ ] +WNkX1 k N

2[ ]

0 ≤ k < N

�

X0 k[ ] = X00 k N4

[ ] +WN2

k X01 k N4

[ ]0 ≤ k < N/2

N/4-pt DFT of even points in even subset of x[n]

N/4-pt DFT of odd points from even subset

�

X1 k[ ] = X10 k N4

[ ] +WN2

k X11 k N4

[ ]

2013-11-27Dan Ellis 12

Two-Stage DIT Flowgraph

x[0]x[4]x[2]x[6]

x[1]x[5]x[3]x[7]

X[0]X[1]X[2]X[3]

X[4]X[5]X[6]X[7]

X0[0]X00

X01

X10

X11

X0[1]X0[2]X0[3]

X1[0]X1[1]X1[2]X1[3]

DFTN4

DFTN4

DFTN4

DFTN4

WN0

WN1

WN2

WN3

WN4

WN5

WN6

WN7WN/2

3

WN/20WN/23

WN/20

different from before same as before

2013-11-27Dan Ellis 13

Multi-stage DIT FFT Can keep doing this until we get down

to 2-pt DFTs:

→ N = 2M-pt DFT reduces to M stages of twiddle factors & summation (O(N2) part vanishes)

→ real mults < M·4N , real adds < 2M·2N→ complexity ~ O(N·M) = O(N·log2N)

DFT2X[0] = x[0] + x[1]X[1] = x[0] - x[1] -1 = W2

1

1 = W20

“butterfly” element

≡

2013-11-27Dan Ellis 14

�

WNr+N

2 = e j2 r+N

2( )N

= e j2r

N e j2N /2

N

= WNr

FFT Implementation Details Basic butterfly (at any stage):

Can simplify:

WNr

WNr+N/2

XX[r]

XX[r+N/2]

XX0[r]

XX1[r]

••••••

2 cpx mults

XX[r]

XX[r+N/2]

XX0[r]

XX1[r]WN

r -1

just one cpx mult!

i.e. SUB rather than ADD

2013-11-27Dan Ellis 15

8-pt DIT FFT Flowgraph

-1’s absorbed into summation nodes WN

0 disappears ‘in-place’ algorithm: sequential stages

bit-r

ever

sed

inde

xing

W4

W4

W8W8W8

x[0]x[4]x[2]x[6]x[1]x[5]x[3]x[7]

000100010110001101011111

X[0]X[1]X[2]X[3]X[4]X[5]X[6]X[7]

---

--

--

--

-

-

-2

3

2013-11-27Dan Ellis 16

FFT for Other Values of N Having N = 2M meant we could divide

each stage into 2 halves = “radix-2 FFT” Same approach works for:

N = 3M radix-3 N = 4M radix-4 - more optimized radix-2 etc...

Composite N = a·b·c·d → mixed radix(different N/r point FFTs at each stage) .. or just zero-pad to make N = 2M

M

2013-11-27Dan Ellis 17

Inverse FFT Recall IDFT:

Thus:

Hence, use FFT to calculate IFFT:

�

x[n] = 1N

X[k]WNnk

k=0

N1

only differencesfrom forward DFT

�

Nx*[n] = X[k]WNnk( )*

k=0

N1

= X *[k]WNnk

k=0

N1

Forward DFT of x′[n] = X*[k]|k=ni.e. time sequence made from spectrum

DFTRe{x[n]}Im{x[n]}

Re

Im

Re

Im

Re{X[k]}Im{X[k]}

1/N

-1/N-1

pure real flowgraph

x n[ ] = 1N X * k[ ]WN

nk

k=0

N 1

*

2013-11-27Dan Ellis 18

If x[n] is pure-real, DFT wastes mult’s Real x[n] → Conj. symm. X[k] = X*[-k] Given two real sequences, x[n] and w[n]

call y[n] = j·w[n] , v[n] = x[n] + y[n] N-pt DFT V[k] = X[k] + Y[k]

but: V[k]+V*[-k] = X[k]+X*[-k]+Y[k]+Y*[-k]⇒ X[k]=1/2(V[k]+V*[-k]) , W[k]=-j/2(V[k]-V*[-k])

i.e. compute DFTs of two N-pt real sequences with a single N-pt DFT

DFT of Real Sequences

X[k] -Y[k]

2013-11-27Dan Ellis 19

3. Short-Time Fourier Transform (STFT) Fourier Transform (e.g. DTFT) gives

spectrum of an entire sequence: How to see a time-varying spectrum? e.g. slow AM of a sinusoid carrier:

�

x n[ ] = 1 cos 2nN

cos0n

0 200 400 600 800 1000-2

-1

0

1

2

n

x[n]

2013-11-27Dan Ellis 20

Fourier Transform of AM Sine Spectrum of

whole sequence indicates modulation indirectly...

... as cancellationbetween closely-tuned sines

2cAcB= cA+B +cA-B

Nsin2ºknN

-Nsin2º(k-1)nN2

-Nsin2º(k+1)nN2

0 0.02 0.04 0.06 0.080

200

400

600N

N/2

X[k]

k/(N/2)

-1

-0.5

0

0.5

1

-1

-0.5

0

0.5

1

0 128 256 384 512 640 768 896-1

-0.5

0

0.5

1

M

2013-11-27Dan Ellis 21

Fourier Transform of AM Sine Sometimes we’d rather separate

modulation and carrier:x[n] = A[n]cos!0n A[n] varies on a

different (slower) timescale One approach:

chop x[n] into short sub-sequences .. .. where slow modulator is ~ constant DFT spectrum of pieces → show variation

!

A[n]

!0

2013-11-27Dan Ellis 22

FT of Short Segments Break up x[n] into successive, shorter

chunks of length NFT, then DFT each:

Shows amplitude modulationof !0 energy

0 128 256 384 512 640 768 896 1024 = N-2

-1

0

1

2

0 640

50

100

n

k k

x0[n]

x[n]

X0[k]

x1[n]

X1[k]

x2[n]

X2[k]

x3[n]

X3[k]

x4[n]

X4[k]

x5[n]

X5[k]

x6[n]

X6[k]

x7[n]

X7[k]

k0 = 0 · NFT2

NFT = N/8

2013-11-27Dan Ellis 23

The Spectrogram Plot successive DFTs in time-frequency:

This image is called the Spectrogram

time hopsize (between successive frames)= 128 points

128 256 384 512 640 768 896 1024

k

n

k kXi[k]

X[k,n]

00

5

10

15

200

406080100120

X0[k] X1[k] X2[k] X3[k] X4[k] X5[k] X6[k] X7[k]

2013-11-27Dan Ellis 24

Short-Time Fourier Transform Spectrogram = STFT magnitude

plotted on time-frequency plane STFT is (DFT form):

intensity as a function of time & frequency�

X k,n0[ ] = x n0 + n[ ] w[n] e j2knNFT

n=0

NFT 1

frequency

index timeindex

NFT points of xstarting at n0

window DFTkernel

2013-11-27Dan Ellis 25

STFT Window Shape w[n] provides ‘time localization’ of STFT

e.g. rectangularselects x[n], n0 ≤ n < n0+NW

But: resulting spectrum has same problems as windowing for FIR design:

nw[n]

�

X e j ,n0( ) = DTFT x n0 + n[ ] w n[ ]{ }= e jn0X e j( )W e j ( )( )d

spectrum of short-time window

is convolved with (twisted) parent spectrum

DTFTform ofSTFT

2013-11-27Dan Ellis 26

STFT Window Shape e.g. if x[n] is a pure sinusoid,

Hence, use tapered window for w[n]

blurring (mainlobe)+ ghosting (sidelobes)

e.g. Hamming

�

w n[ ] =0.54 + 0.46cos(2 n

2M +1)

sidelobes< -40 dB

X(ej ) W(ej )

-10 -5 0 5 10n

w[n] W(ej )

2013-11-27Dan Ellis 27

STFT Window Length Length of w[n] sets temporal resolution

Window length ∝ 1/(Mainlobe width)

more time detail ↔ less frequency detail

short window measuresonly local properties

longer window averagesspectral character

shorter window→ more blurred

spectrum

0 200 400 600 800 1000-0.1

0

0.1

0.2

0 200 400 600 800 1000-0.1

0

0.1

0.2x[n] x[n]w [n] LS w [n]

-100 -50 0 50 1000

0.5

1

- -0.5 0 0.50

10

20

-100 -50 0 50 1000

0.5

1

- -0.5 0 0.50

10

20

n

n

wL[n]

wS[n] WS(ej )

WL(ej )

N1 pts

N2 pts

N1zero at 4π

N2zero at 4π

2013-11-27Dan Ellis 28

STFT Window Length Can illustrate time-frequency tradeoff

on the time-frequency plane:

Alternate tilingsof time-freq:

disks show ‘blurring’due to window length;area of disk is constant→ Uncertainty principle:

±f·±t ≥ k

half-length window → half as many DFT samples

0 100 200 3000

0.51

0

50

100

150

200

250

n

k

2013-11-27Dan Ellis 29

Spectrograms of Real Sounds

individual t-fcells merge

into continuousimage

time-domain

successiveshortDFTs

time / s

time / s

freq

/ Hz

intensity / dB

2.35 2.4 2.45 2.5 2.55 2.60

1000

2000

3000

4000

freq

/ Hz

0

1000

2000

3000

4000

0

0.1

-50

-40

-30

-20

-10

0

10

0 0.5 1 1.5 2 2.5

2013-11-27Dan Ellis 30

Narrowband vs. Wideband Effect of varying window length:

1.4 1.6 1.8 2 2.2 2.4 2.6

freq

/ Hz

time / s

level/ dB

0

1000

2000

3000

4000

freq

/ Hz

0

1000

2000

3000

4000

0

0.2

Win

dow

= 25

6 pt

Nar

row

band

W

indo

w =

48 p

t W

ideb

and

-50

-40

-30

-20

-10

0

10

M

2013-11-27Dan Ellis 31

Spectrogram in Matlab>> [d,sr]=wavread(’mpgr1_sx419.wav');

>> Nw=256;>> specgram(d,Nw,sr)>> caxis([-80 0])

>> colorbar

(hann) window length

actual sampling rate(to label time axis)

dB

Time

Freq

uenc

y

0.5 1 1.5 2 2.5 30

2000

4000

6000

8000

-80

-60

-40

-20

0

2013-11-27Dan Ellis

-60 -40 -20 0 -1

0

1-1

0

1

n Re{x[n]}

Im{x[n]}

32

STFT as a Filterbank Consider one ‘row’ of STFT:

where

Each STFT row is output of a filter (subsampled by the STFT hop size)�

Xk n0[ ] = x n0 + n[ ] w n[ ] e j2knN

n=0

N1

= hk m[ ]x n0 m[ ]m=0

N1( )

�

hk n[ ] = w n[ ] e j2knN

just one freq.convolution

withcomplex IR

2013-11-27Dan Ellis 33

STFT as a Filterbank If

then

Each STFT row is the same bandpass response defined by W(ej!), frequency-shifted to a given DFT bin:

�

hk n[ ] = w ( )n[ ] e j2knN

�

Hk ej( ) =W e ( ) j 2kN( )( ) shift-in-!

A bank of identical, frequency-shiftedbandpass filters:“filterbank”

W(ej )H1(ej ) H2(ej ) • • •

2013-11-27Dan Ellis 34

STFT Analysis-Synthesis IDFT of STFT frames can reconstruct

(part of) original waveform e.g. if

then Can shift by n0, combine, to get x[n]:

Could divide by w[n-n0] to recover x[n]...

�

X k,n0[ ] = DFT x n0 + n[ ] w n[ ]{ }IDFT X k,n0[ ]{ } = x n0 + n[ ] w n[ ]

^

n0n

x̂[n] x[n]·w[n-n0]

2013-11-27Dan Ellis 35

STFT Analysis-Synthesis Dividing by small values of w[n] is bad

Prefer to overlap windows:

i.e. sample X[k,n0]at n0 = r·H where H = N/2 (for example)

Thenhopsize window length

�

ˆ x n[ ] = x n[ ]w n rH[ ]r= x n[ ]

�

w n rH[ ]r =1if

n

x̂[n]

x[n]·w[n-r·H]

2013-11-27Dan Ellis 36

STFT Analysis-Synthesis Hann or Hamming windows

with 50% overlap sum to constant

Can modify individual frames of X[k,n] and then reconstruct complex, time-varying modifications tapered overlap makes things OK�

0.54 + 0.46cos(2 nN )( )

+ 0.54 + 0.46cos(2 nN2N )( ) =1.08

0 20 40 60 800

0.2

0.4

0.6

0.8

1

n

w[n] w[n-N/2]w[n] + w[n-N/2]

2013-11-27Dan Ellis 37

STFT Analysis-Synthesis e.g. Noise reduction:

STFT oforiginal speech

Speech corruptedby white noise

Energy thresholdmask

M100 200 300

20406080100120

r

k

the fast fourier transform; short-time fourier transform

Documents