6.003 signal processingof lec 05a sampling rate 8000 samples/sec, 9600 samples window_size=4096,...

Post on 22-Jan-2021

2 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

6.003 Signal Processing

Week 9, Lecture A:Short Time Fourier Transform

6.003 Fall 2020

Time-varying SignalsReal-world signals (i.e., speech, music, . . .) often have frequency content that varies with time.

Fourier Transform: events that are local in time are global in frequency (and vice versa). Sudden changes and local variations can be difficult to detect.

Example: 2 tunesโ€ข cos0.wavโ€ข cos1.wav

How to tell them apart?

FFT of the two signals:

cos0 cos1

Short-Time Fourier Transform (STFT)The short-time Fourier Transform is a tradeoff between time and frequency-domain representations, representing the frequency content of the signal at various points in time.

Conceptually, we are taking the DFT of successive windowed regions of the original signal (and these regions may overlap).

Short-Time Fourier Transform (STFT)

Formally, we define the STFT of a signal x as:

๐‘†๐‘‡๐น๐‘‡ ๐‘ฅ [๐‘š, ๐‘˜] =

๐‘›=0

๐‘โˆ’1

๐‘ฅ ๐‘› + ๐‘š ร— ๐‘  โˆ™ ๐‘ค[๐‘›]๐‘’โˆ’๐‘—2๐œ‹๐‘˜๐‘ ๐‘›

Where:โ€ข ๐‘š is a time index, ๐‘˜ is a frequency index

โ€ข ๐‘ is the length of a window

โ€ข ๐‘ค[๐‘›] is window function

โ€ข ๐‘  is โ€œstep sizeโ€

STFT and Spectrograms

๐‘†๐‘‡๐น๐‘‡ ๐‘ฅ [๐‘š, ๐‘˜] =

๐‘›=0

๐‘โˆ’1

๐‘ฅ ๐‘› +๐‘š ร— ๐‘  โˆ™ ๐‘ค[๐‘›]๐‘’โˆ’๐‘—2๐œ‹๐‘˜๐‘

๐‘›

The STFT is often visualized using a spectrogram, which is defined to be the magnitude squared of the STFT. Spectrogram:

Frequency content of the signal at various points in time.

Setting Suitable Parameters for Spectrograms

๐‘†๐‘‡๐น๐‘‡ ๐‘ฅ [๐‘š, ๐‘˜] =

๐‘›=0

๐‘โˆ’1

๐‘ฅ ๐‘› +๐‘š ร— ๐‘  โˆ™ ๐‘ค[๐‘›]๐‘’โˆ’๐‘—2๐œ‹๐‘˜๐‘

๐‘›

โ€ข What window size N should be used?

โ€ข What step size s should be used?

Setting suitable window size N:

โ€ข Larger N will give rise to better frequency resolution

โ€ข Larger N will compromise time resolution

Setting suitable step size s:

โ€ข Smaller s helps with better time resolution

โ€ข Smaller s will mean more steps to perform calculation, increase computation cost

โ€ข What type of window function should be used ?

Spectrogram of signal from cos0.wav

โ€ข Window size large, we can not detect frequency change in time

โ€ข Need to optimize the windowsize to achieve reasonable resolution both in freq. and time

โ€ข Freq. conversion: see slide #13 of Lec 05A

Sampling rate 8000 samples/sec, 9600 samples

window_size=4096, step_size=256 window_size=2048, step_size=256 window_size=512, step_size=256

window_size=256, step_size=128 window_size=128, step_size=64

Spectrogram of signal from cos1.wav

โ€ข Onset of a note tend to have frequencies across all spectrumโ€ข Reducing step size, better resolution in time, longer time to compute

Sampling rate 8000 samples/sec, 9600 samples

window_size=2048, step_size=256 window_size=256, step_size=32window_size=256, step_size=256

Spectrogram of signal from mystery1.wav

โ€ข Larger window size N to resolve frequency betterโ€ข If already know whole frequency range, can zoom into certain frequency range to see better โ€ข Now we can read music using spectrogram!

window_size=1024, step_size=64window_size=512, step_size=128 window_size=1024, step_size=64

sample_rate=22050, 214987 samples

Spectrogram of signal from mystery2.wav

โ€ข Larger window size N to resolve frequency better to see the chords more clearly

window_size=512, step_size=128 window_size=1024, step_size=64 window_size=1024, step_size=64

sample_rate=22050, 202341 samples

Spectrogram of signal from mystery3.wav

window_size=1024, step_size=64 window_size=2048, step_size=64

further zoom in in frequency

sample_rate=22050, 362386 samples

โ€ข Now we can use this to analyze all kinds of waveforms!

window_size=2048, step_size=32

Zoom in both in freq. and in time

Other Spectrograms

โ€ข Spectrogram gives visual representation of the spectrum of frequencies contained in signal, as a function of time. A very useful tool in many fields!

window_size=512, step_size=16

fs=8000, short speech

window_size=2048, step_size=32

fs=44100

window_size=2048, step_size=32

fs=44100, piano music

STFT in Processing Streaming Signals

โ€ข Breaking a signal into pieces so that it can be processed in small chunks is especially important in streaming applications where the signal may be arbitrarily long.

General procedure: โ€ข Divide the input signal into a sequence of windows, each of length N. โ€ข Process each window. โ€ข Assemble the processed pieces together to form the output.

โ€ข How to do this? How to choose N (good resolution with high speed)?

โ€ข How to assemble them smoothly?

Summaryโ€ข Short-Time Fourier Transform: analyzing a sequence of finite-length

portions of an input signal

โžขSpectrogram: visual representation of the spectrum of frequencies of a signal as it varies with time

โžขProcessing Streaming signals: (See Recitation 9B)

top related