
Digitized Sound

Telecommunications 1
P. Mathys

Sampling of Waveforms

Computers cannot directly deal with continuous-time (CT) waveforms.
A CT waveform needs to be sampled at regular time intervals before it can be stored and processed by a computer.
The sampling operation converts the CT signal into a discrete-time (DT) signal or sequence.

Sampling of Waveforms

The next slide shows a sinewave and its samples after sampling with rate Fs = 16000 Hz (16000 samples/sec).
The samples are marked with “o” in the graph.
Each sample is a real number, e.g., 0.70710678, and a sampled waveform is a sequence of such numbers.
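As a minimal sketch of the sampling step, the following Python snippet evaluates a 1000 Hz sinewave at Fs = 16000 Hz. The function name sample_sine and the choice of zero starting phase are assumptions made for illustration; they are not taken from the slides.

```python
import math

def sample_sine(freq_hz, fs_hz, num_samples, amplitude=1.0):
    """Return samples of a sinewave taken every 1/fs_hz seconds (zero phase assumed)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / fs_hz)
            for n in range(num_samples)]

# One period of a 1000 Hz tone at Fs = 16000 Hz holds 16000/1000 = 16 samples.
samples = sample_sine(1000, 16000, 16)
print(round(samples[2], 8))  # sample at n = 2 is sin(pi/4) -> 0.70710678
```

The sampled waveform is simply the list of numbers in samples; the value 0.70710678 quoted above appears at n = 2 under the zero-phase assumption.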

Sampling of 1000 Hz Sinewave

[Figure: 1000 Hz sinewave and its samples. 16000 samples/sec ==> Fs = 16000 Hz, 16 samples/period]

Sampling at Higher Rate

If we sample at a higher rate, we expect to:
Need more memory.
Need larger files to store all samples.
Take longer to transmit samples over a network.
Reduce the approximation error that results from sampling a CT signal.

[Figure: Sampling at Higher Rate]


Sampling at Lower Rate

If we sample at a lower rate, we will use less storage space and transmission, e.g., over the Internet, will be faster.
But how much will the quality degrade?
What is the minimum sampling rate that is needed?

[Figure: Sampling at Lower Rate]


Using Different Sampling Rates

Sampling Rate Fs   File Size (1 sec, 16 bits)   Sound Sample
8,000 Hz           16 kB                        sin1000_8.wav
16,000 Hz          32 kB                        sin1000_16.wav
32,000 Hz          64 kB                        sin1000_32.wav
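The file sizes listed for one second of 16-bit sound can be checked with a short sketch that builds a mono WAV file in memory using Python's standard wave module. The helper name sine_wav_bytes and the amplitude of 0.9 are illustrative assumptions, and this is not necessarily how the sound samples of the slides were produced.

```python
import io
import math
import struct
import wave

def sine_wav_bytes(freq_hz, fs_hz, seconds=1.0, amplitude=0.9):
    """Return (wav_file_bytes, payload_size) for a mono 16-bit sine tone."""
    num_samples = int(round(fs_hz * seconds))
    # Each sample is stored as a little-endian signed 16-bit integer (2 bytes).
    frames = b"".join(
        struct.pack("<h", int(round(amplitude * 32767 *
                                    math.sin(2 * math.pi * freq_hz * n / fs_hz))))
        for n in range(num_samples))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16 bits = 2 bytes per sample
        wav.setframerate(fs_hz)
        wav.writeframes(frames)
    return buf.getvalue(), len(frames)

for fs in (8000, 16000, 32000):
    _, payload = sine_wav_bytes(1000, fs)
    print(fs, "Hz:", payload, "bytes of sample data")
```

The sample data doubles each time Fs doubles (16000, 32000, and 64000 bytes); the WAV container adds only a small header on top of that.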

What Sampling Rate is Best?

The previous three examples of sampling a 1000 Hz tone at Fs=8000 Hz, Fs=16000 Hz, and Fs=32000 Hz show no difference in sound quality.
But let’s look at another example where we sample a 5000 Hz tone at Fs=16000 Hz and Fs=8000 Hz.

5000 Hz Sinewave, Fs=16000 Hz

[Figure: 16 samples in 1 msec, 3.2 samples per period. Sound sample: sin5000_16.wav]

[Figure: Leaving out every second sample (x) we have 8 samples (o) in 1 msec, 1.6 samples per period. Sound samples: Fs=16000: sin5000_16.wav; Fs=8000: sin5000_8.wav]

5000 Hz Sinewave

The 5000 Hz sinewave sounds different at Fs=8000 Hz and at Fs=16000 Hz.
The reason is that the soundcard, which converts the samples back to a CT waveform, tries to find the “smoothest” (i.e., the lowest frequency) waveform that passes through all samples as shown in the next slide.


[Figure: Green curve is “smoother” than blue (dashed) curve. Sound samples: green: sin5000_8.wav; blue: sin5000_16.wav]
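The “smoothest” waveform can be made concrete with a small sketch. The helper names alias_frequency and sample_sine are hypothetical, and the sketch assumes an ideal reconstruction that always picks the lowest frequency consistent with the samples.

```python
import math

def alias_frequency(freq_hz, fs_hz):
    """Lowest frequency (0 .. fs/2) whose samples match those of freq_hz up to sign."""
    f = freq_hz % fs_hz
    return min(f, fs_hz - f)

def sample_sine(freq_hz, fs_hz, num_samples):
    return [math.sin(2 * math.pi * freq_hz * n / fs_hz) for n in range(num_samples)]

print(alias_frequency(5000, 16000))  # 5000: the tone is preserved at Fs = 16000 Hz
print(alias_frequency(5000, 8000))   # 3000: the tone is altered at Fs = 8000 Hz

# At Fs = 8000 Hz the samples of a 5000 Hz sine equal those of a
# sign-inverted 3000 Hz sine, so the two cannot be told apart.
high = sample_sine(5000, 8000, 16)
low = sample_sine(3000, 8000, 16)
print(all(abs(h + l) < 1e-9 for h, l in zip(high, low)))  # True
```

Under these assumptions, playback at Fs=8000 Hz produces a 3000 Hz tone instead of the original 5000 Hz tone.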

Aliasing

The effect whereby tone frequencies are altered because of a reduction in sampling rate is called aliasing.
Aliasing affects all sampled sound sequences, whether they be pure tones, music or speech. Music examples:

Original, Fs=44100 Hz:        muss44.wav
w/o Aliasing, Fs=11025 Hz:    muss11.wav
with Aliasing, Fs=11025 Hz:   muss11_ali.wav

Aliasing

Aliasing folds high frequency components down to low frequencies. This is visible and audible especially well in the portions marked in red.

[Figure: panels “With Aliasing” and “Original”. Sound samples: muss11_ali_orig.wav, muss44s.wav, muss11_ali44s.wav]

Aliasing

To prevent aliasing from occurring, the sound file needs to be lowpass-filtered before the sampling rate is reduced.

[Figure: panels “With Aliasing” and “No Aliasing”. Sound samples: muss11_ali_noali.wav, muss11_ali44s.wav, muss11_44s.wav]
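A rough sketch of lowpass filtering before sampling rate reduction is given below. The filter is a generic windowed-sinc FIR design with a Hamming window and 101 taps; these choices, and the function names, are assumptions for illustration only, since the slides do not say which filter was used for the music examples.

```python
import math

def lowpass_taps(cutoff_hz, fs_hz, num_taps=101):
    """Windowed-sinc (Hamming) lowpass FIR coefficients, normalized to unit DC gain."""
    fc = cutoff_hz / fs_hz
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def fir_filter(x, taps):
    """Convolve x with taps, keeping only outputs where the filter fully overlaps x."""
    n_taps = len(taps)
    return [sum(taps[j] * x[i + n_taps - 1 - j] for j in range(n_taps))
            for i in range(len(x) - n_taps + 1)]

def decimate(x, factor, fs_hz, prefilter=True):
    """Keep every factor-th sample, optionally lowpass filtering first."""
    if prefilter:
        # Cut off at half the new sampling rate so nothing can fold down.
        x = fir_filter(x, lowpass_taps(fs_hz / (2 * factor), fs_hz))
    return x[::factor]

tone = [math.sin(2 * math.pi * 5000 * n / 16000) for n in range(800)]
aliased = decimate(tone, 2, 16000, prefilter=False)
filtered = decimate(tone, 2, 16000, prefilter=True)
print(max(abs(v) for v in aliased))   # close to 1: the tone survives as an alias
print(max(abs(v) for v in filtered))  # close to 0: the tone is removed beforehand
```

The design choice here is to remove everything above half the new sampling rate rather than let it fold down: a 5000 Hz tone cannot be represented at Fs=8000 Hz, so it is suppressed instead of reappearing at a wrong frequency.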

Nyquist Rate

Let B (in Hz) be the highest frequency contained in a sound waveform.
Sampling Theorem: To avoid distortion due to aliasing, a signal of bandwidth B must use a sampling rate Fs>2B.
The sampling rate of 2B samples/sec is called Nyquist rate.

Common Sampling Rates

Application                Sampling Rate Fs [Hz]   Bandwidth B [Hz]
Telephony                  8000                    3000
Music, Low Quality         11025                   5000
Music                      22050                   10000
Music, Hi-Fi               44100                   20000
DAT (Digital Audio Tape)   48000                   22000
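As a quick check, the condition Fs>2B can be evaluated for each row of the table above. The function and variable names below are illustrative assumptions.

```python
def nyquist_rate(bandwidth_hz):
    """Nyquist rate for a signal whose highest frequency is bandwidth_hz."""
    return 2 * bandwidth_hz

def satisfies_sampling_theorem(fs_hz, bandwidth_hz):
    return fs_hz > nyquist_rate(bandwidth_hz)

# (application, Fs in Hz, B in Hz), copied from the table of common sampling rates
common_rates = [
    ("Telephony", 8000, 3000),
    ("Music, Low Quality", 11025, 5000),
    ("Music", 22050, 10000),
    ("Music, Hi-Fi", 44100, 20000),
    ("DAT", 48000, 22000),
]

for name, fs, b in common_rates:
    print(name, nyquist_rate(b), satisfies_sampling_theorem(fs, b))
```

Every row leaves some margin above the Nyquist rate, for example 44100 Hz against 2 x 20000 = 40000 Hz for Hi-Fi music.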

Quantization

Sampling alone is not sufficient to convert a waveform to a format that a computer can handle.
The problem is that each sample is a real number that requires infinite precision for processing and storing.
Computers have finite word length and can only approximate real numbers.


For example,
pi = 3.14159265358979323846264…
is a real number. On a computer that can represent 5 decimal digits we would approximate pi as 3.1416.
Typical computer word lengths are 8, 16, 32, and 64 bits (approximately 2, 4, 9, and 19 decimal digits).


The process of approximating a real number by a number with finite wordlength is called quantization.
Unlike sampling, quantization is not a reversible process.
However, by choosing a large enough wordlength, the quantization error can be made as small as desired.

Quantization: Examples

The next slides show the quantization of a sine wave to 2, 3, and 4 bits:
2 Bits (4 levels): 00, 01, 10, 11
3 Bits (8 levels): 000, 001, 010, 011, 100, 101, 110, 111
4 Bits (16 levels): 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111

Quantization: Example (2 Bits)

[Figure: sine wave quantized to 2 bits, with levels 00, 01, 10, 11]
Digital sequence of samples (Fs=16000 Hz): 10,11,11,11,11,11,11,10,01,00,00,00,00,...
Sound samples: 16 bits: q1000_16.wav; 2 bits: q1000_2.wav

Quantization: Example (2 Bits)

Quantization is a non-linear operation that introduces new frequency components. Here the new components appear at odd multiples of the fundamental frequency.

[Figure: panels “16 bits” and “2 bits”. Sound samples: q1000h_16_2.wav; 16 bits: q1000_16.wav; 2 bits: q1000_2.wav]


[Figure: sine wave quantized to 3 bits, with levels 000, 001, 010, 011, 100, 101, 110, 111]
Digital sequence of samples (Fs=16000 Hz): 100,110,111,111,111,111,110,100,011,001,...
Sound samples: 16 bits: q1000_16.wav; 3 bits: q1000_3.wav


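[Figure: sine wave quantized to 4 bits, with levels 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111]
Digital sequence of samples (Fs=16000 Hz): 1001,1100,1110,1111,1111,1110,1100,1001,0110,...
Sound samples: 16 bits: q1000_16.wav; 4 bits: q1000_4.wav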
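The digital sequences on these slides can be reproduced with a simple uniform quantizer. The sketch below makes several assumptions that are not stated in the slides: amplitudes lie in the range -1 to 1, that range is split into 2^k equal cells numbered upward from the all-zeros code, and the 1000 Hz sinewave is sampled half a sampling interval after its zero crossing (an offset inferred only because it matches the printed sequences).

```python
import math

def quantize_code(x, bits):
    """Map x in [-1, 1] to an integer code 0 .. 2**bits - 1 using equal-width cells."""
    levels = 2 ** bits
    code = int(math.floor((x + 1.0) / 2.0 * levels))
    return min(max(code, 0), levels - 1)   # x = 1.0 falls into the top cell

def sine_codes(bits, count, freq_hz=1000, fs_hz=16000, offset=0.5):
    """Binary code words for the first count samples of a quantized sinewave."""
    codes = []
    for n in range(count):
        x = math.sin(2 * math.pi * freq_hz * (n + offset) / fs_hz)
        codes.append(format(quantize_code(x, bits), "0{}b".format(bits)))
    return codes

print(",".join(sine_codes(2, 13)))  # 10,11,11,11,11,11,11,10,01,00,00,00,00
print(",".join(sine_codes(3, 10)))  # 100,110,111,111,111,111,110,100,011,001
print(",".join(sine_codes(4, 9)))   # 1001,1100,1110,1111,1111,1110,1100,1001,0110
```

Each additional bit halves the cell width, so the code words follow the sinewave more closely as the wordlength grows.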

Quantization Error

The difference q(t)-y(t) between the quantized signal and the original signal (shown dotted in red in the previous slides) is called quantization error.
The quantization error becomes larger if fewer bits are used. Quantization is a nonlinear process that introduces new and unwanted frequency components.

Signal to Quantization Noise Ratio

The signal to quantization noise ratio (SQNR) is a measure to judge the effect of quantization.
SQNR is computed as:
SQNR = 10*log(signal power/quant err pwr)
where the logarithm is taken to the base 10 and SQNR is measured in dB (decibels).

If the signal power is 10 times higher than the quantization error power, then the SQNR is 10 dB; if it is 100 times higher, the SQNR is 20 dB; if it is 1000 times higher, the SQNR is 30 dB, etc.
For quantization of a sinusoid to k bits we have
SQNR = 6.02 x k + 1.76 dB

Signal to Quantization Noise Ratio

Quantization of Sinewaves

Quantization   SQNR [dB]   Sound
16 bits        98.1        q1000_16.wav
8 bits         49.9        q1000_8.wav
4 bits         25.8        q1000_4.wav
3 bits         19.8        q1000_3.wav
2 bits         13.8        q1000_2.wav
1 bit          7.8         q1000_1.wav
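The SQNR values listed for sinewaves follow from SQNR = 6.02 x k + 1.76 dB. The sketch below evaluates that formula and also measures the SQNR of a quantized full-scale sinewave directly. The quantizer (equal cells over -1 to 1, each value replaced by its cell midpoint) and the 997 Hz test frequency are assumptions chosen for illustration, so the measured values should only be expected to land near the formula, not exactly on it.

```python
import math

def sinusoid_sqnr_db(bits):
    """SQNR in dB for a sinusoid quantized to the given number of bits (formula)."""
    return 6.02 * bits + 1.76

def quantize(x, bits):
    """Replace x in [-1, 1] by the midpoint of its quantization cell."""
    levels = 2 ** bits
    step = 2.0 / levels
    code = min(int((x + 1.0) / step), levels - 1)
    return (code + 0.5) * step - 1.0

def measured_sqnr_db(bits, freq_hz=997.0, fs_hz=16000.0, num_samples=16000):
    """10*log10(signal power / quantization error power) for a quantized sine."""
    signal_power = 0.0
    error_power = 0.0
    for n in range(num_samples):
        y = math.sin(2 * math.pi * freq_hz * n / fs_hz)
        q = quantize(y, bits)
        signal_power += y * y
        error_power += (q - y) ** 2
    return 10 * math.log10(signal_power / error_power)

for k in (16, 8, 4, 3, 2, 1):
    print(k, "bits:", round(sinusoid_sqnr_db(k), 1), "dB by formula")
print("8 bits measured:", round(measured_sqnr_db(8), 1), "dB")
```

Each additional bit adds about 6 dB, which is why 16 bits come out roughly 48 dB above 8 bits.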

Quantization of Speech/Music

Quantization   SQNR [dB]   Sound
16 bits        ~84         tlp_16.wav
8 bits         35.0        tlp_8.wav
6 bits         22.6        tlp_6.wav
4 bits         10.1        tlp_4.wav
2 bits         -3.5        tlp_2.wav

SQNR for 16-bit Quantization

16-bit quantization (e.g., for CD) yields approximately 90 dB SQNR.
What does that mean?
10*log(signal pwr/noise pwr) = 90
Signal pwr/noise pwr = 10^9
That is, the quantization noise power is only one billionth of the signal power.
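The conversion between a power ratio and decibels used in this argument can be written as two small helper functions (the names are illustrative, not from the slides):

```python
import math

def power_ratio_to_db(ratio):
    """Convert a power ratio to decibels."""
    return 10 * math.log10(ratio)

def db_to_power_ratio(db):
    """Convert decibels back to a power ratio."""
    return 10 ** (db / 10)

print(power_ratio_to_db(1000))   # 1000 times the power is about 30 dB
print(db_to_power_ratio(90))     # 90 dB is a power ratio of about 10^9
```

Inverting the ratio gives the noise-to-signal view: at 90 dB the noise power is 1/10^9 of the signal power.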

Common Parameters for Sound

The table on the next slide shows common sampling rates Fs and quantizations Q (in bits/sample) for sound waveforms.
The uncompressed rate R (in bytes/sec) is computed using:
R = #channels x Fs x Q / 8

Common Parameters for Sound

Application            # of Channels   Fs in Hertz   Q in bits   Rate R in kB/sec
Speech (Telephony)     1               8000          8           8
Speech                 1               11025         8           11.025
Music, Low Quality     1               11025         16          22.05
Speech, High Quality   1               22050         16          44.1
Music, Stereo          2               22050         16          88.2
Music, Hi-Fi           1               44100         16          88.2
Music, Hi-Fi Stereo    2               44100         16          176.4
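The rate column can be recomputed from R = #channels x Fs x Q / 8. The sketch below assumes 1 kB = 1000 bytes, which is what the table values imply; the function and variable names are illustrative.

```python
def uncompressed_rate(channels, fs_hz, bits_per_sample):
    """Uncompressed data rate in bytes per second."""
    return channels * fs_hz * bits_per_sample / 8

# (application, channels, Fs in Hz, Q in bits), copied from the table
sound_formats = [
    ("Speech (Telephony)", 1, 8000, 8),
    ("Speech", 1, 11025, 8),
    ("Music, Low Quality", 1, 11025, 16),
    ("Speech, High Quality", 1, 22050, 16),
    ("Music, Stereo", 2, 22050, 16),
    ("Music, Hi-Fi", 1, 44100, 16),
    ("Music, Hi-Fi Stereo", 2, 44100, 16),
]

for name, channels, fs, q in sound_formats:
    print(name, uncompressed_rate(channels, fs, q) / 1000, "kB/sec")
```

Multiplying the rate by a duration in seconds gives the uncompressed file size; one minute of Hi-Fi stereo music, for instance, takes 176400 x 60 = 10,584,000 bytes.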

The MPEG Standards

MPEG stands for Moving Picture Experts Group. This group works on standards for coding of moving images and sound.
MPEG standards can be obtained from ISO (International Organization for Standardization) or, in the US, from ANSI (American National Standards Institute).


MPEG-1: Standard for compression and coding of relatively low resolution video (352x240 pixels, 30 frames/s) at 1.152 Mbits/sec (or 144 kB/sec) for CD-ROM.
MPEG-2: Is an extension of MPEG-1 for high-quality digital video using rates in the range 1.5 ... 15 Mbits/sec.
MPEG-3: Was once intended for HDTV (High Definition Television) applications, but HDTV is now part of MPEG-2. Thus MPEG-3 efforts were abandoned.
MPEG-4: Intended for very low bitrate coding of audio-visual programs, e.g., for interactive mobile multimedia applications at rates up to 64 kbits/sec.


MPEG-1 (and MPEG-2) specify a family of three audio coding schemes, called layer-1, layer-2, and layer-3.
From layer-1 to layer-3, encoder complexity and performance (sound quality per bitrate) are increasing.
Layer-1: From 32 kbit/sec to 448 kbit/sec
Layer-2: From 32 kbit/sec to 384 kbit/sec
Layer-3: From 32 kbit/sec to 320 kbit/sec


MP3: Typical Compression Ratios

Layer-1: 4:1 (typ. rate 384 kbps)
Layer-2: 6:1…8:1 (typ. rate 224 kbps)
Layer-3: 10:1…12:1 (typ. rate 128 kbps)
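These ratios are consistent with dividing an uncompressed CD-quality stereo rate by the typical coded rate. The slides do not state which reference rate was used, so the 2 x 44100 x 16 bits/sec reference in the sketch below is an assumption, as are the variable names.

```python
# Assumed reference: uncompressed Hi-Fi stereo, 2 channels x 44100 Hz x 16 bits.
reference_kbps = 2 * 44100 * 16 / 1000   # 1411.2 kbit/sec

typical_kbps = {"Layer-1": 384, "Layer-2": 224, "Layer-3": 128}

ratios = {layer: reference_kbps / rate for layer, rate in typical_kbps.items()}
for layer, ratio in ratios.items():
    print(layer, "about", round(ratio, 1), ": 1")
```

Under that assumption the typical rates give roughly 3.7:1, 6.3:1, and 11:1, in line with the ratios quoted for the three layers.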

MP3 Compression Techniques

Minimal Audition Threshold: The threshold of hearing is not the same at all frequencies; the ear is most sensitive between 2 and 5 kHz. Sounds below the threshold need not be retained and coded.
Masking Effect: During strong sounds you do not hear the weakest sounds. Thus, using psychoacoustic modeling, not all sounds need to be coded.

MP3 Compression Techniques

Reservoir of Bytes: Some passages may not be codeable at a given rate without altering musical quality. MP3 then “borrows” bytes from other passages that can be coded at a lower rate.
Huffman Coding: Is used to code the information into variable length codewords after compression.
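To illustrate what variable length codewords buy, here is a generic textbook Huffman code construction. It is only a sketch of the principle: the function name and the example data are made up, and it builds a code from the data at hand, which is not claimed to be how an MP3 encoder selects its code tables.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix-free code (symbol -> bit string) from symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:                      # degenerate case: a single symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (total count, unique tie-breaker, partial code table)
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c0, _, codes0 = heapq.heappop(heap)   # the two least frequent groups
        c1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (c0 + c1, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical quantized values: small magnitudes occur most often.
values = [0] * 8 + [1] * 4 + [-1] * 2 + [2] + [-2]
code = huffman_code(values)
total_bits = sum(len(code[v]) for v in values)
print(code)
print(total_bits, "bits with Huffman coding versus", 3 * len(values), "bits at 3 bits/value")
```

Frequent values receive short codewords and rare values long ones, so the 16 example values need 30 bits instead of the 48 bits a fixed 3-bit code would use.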

Summary

Sound waveforms need to be sampled and quantized for computers.
Sampling converts continuous time to discrete time. Aliasing must be avoided.
Quantization converts amplitude to a finite wordlength value. Keep SQNR high.
File size (uncompressed) in bytes is:
#channels x Fs x bits/sample x time(sec) / 8
