TRANSCRIPT
1
Digital Audio Compression
2
Formats
There are many different formats for storing and communicating digital audio:
  CD audio
  WAV
  AIFF
  AU
  MP3
3
The Storage Problem
CD quality recording
  44,100 Hz sampling rate
  16 bit quantization
  2 channels (stereo)
176.4 KBytes per second
1 minute is ~10.6 MBytes
74 minutes is ~780 MBytes
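The storage figures on this slide follow directly from the three recording parameters. A minimal sketch of the arithmetic, assuming decimal units (1 KByte = 1000 bytes); the function names are illustrative only:

```python
# Uncompressed CD-audio storage arithmetic.
# Assumption: decimal units (1 KByte = 1000 bytes, 1 MByte = 1,000,000 bytes).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def bytes_per_second(sample_rate=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE,
                     channels=CHANNELS):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate * bits * channels // 8


def storage_bytes(minutes, rate=None):
    """Bytes needed to store the given number of minutes of audio."""
    if rate is None:
        rate = bytes_per_second()
    return rate * 60 * minutes


if __name__ == "__main__":
    print(f"{bytes_per_second() / 1e3:.1f} KBytes per second")
    print(f"1 minute:   {storage_bytes(1) / 1e6:.2f} MBytes")
    print(f"74 minutes: {storage_bytes(74) / 1e6:.1f} MBytes")
```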
4
Psychoacoustics
The study of the psychological and physiological principles of sound perception
CDs try to accurately reproduce the original audio signal
But we do not hear all of this signal
The parts that we don’t hear are redundant
If we remove these parts we can store the signal using less data but without affecting the perceived sound
5
Threshold of Hearing & Masking
The threshold of hearing curve describes the minimum level at which the ear can detect a tone at a given frequency
Fletcher-Munson curves
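The slides do not give a formula for the threshold curve. One widely quoted closed-form approximation of the threshold in quiet, usually attributed to Terhardt, can be sketched as follows; it is an outside assumption brought in for illustration, and the function name is hypothetical:

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in dB SPL at freq_hz.

    Uses Terhardt's closed-form approximation (an assumption, not taken
    from the slides). Intended for roughly 20 Hz to 20 kHz.
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


if __name__ == "__main__":
    for f in (100, 1000, 3300, 10000, 16000):
        print(f"{f:>6} Hz: {threshold_in_quiet_db(f):6.1f} dB SPL")
```

The curve is lowest around 3 to 4 kHz, where the ear is most sensitive, and rises steeply towards both ends of the audible range.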
6
Amplitude Masking
Amplitude masking occurs when a tone shifts the threshold curve upwards in the frequency region that surrounds it
7
Critical Band
Hair cells on the Basilar membrane respond to the strongest stimulation in their local region
This local region is called the critical band
Critical bands are smaller for low frequency signals than they are for high frequency signals
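To illustrate how critical bands widen with frequency, the sketch below uses the Zwicker and Terhardt approximation of critical bandwidth. The formula is an assumption brought in from outside the slides, and the function name is hypothetical:

```python
def critical_bandwidth_hz(freq_hz):
    """Approximate critical bandwidth in Hz around centre frequency freq_hz.

    Zwicker & Terhardt approximation (an assumption, not from the slides):
    CB = 25 + 75 * (1 + 1.4 * (f / 1000)^2) ^ 0.69
    """
    khz = freq_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * khz ** 2) ** 0.69


if __name__ == "__main__":
    for f in (100, 500, 1000, 5000, 10000):
        print(f"{f:>6} Hz: about {critical_bandwidth_hz(f):7.0f} Hz wide")
```

Under this approximation the band is roughly 100 Hz wide at low frequencies and more than 2 kHz wide at 10 kHz.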
8
Critical Bands
9
Amplitude Masking & Thresholds
10
Temporal Masking
Masking can also occur when tones are sounded at slightly different times
  Premasking – signal A is masked by signal B which occurs later
  Postmasking – signal A is masked by signal B which ends before signal A has started
Temporal masking increases as time differences reduce
11
Temporal Masking
12
Masking
Amplitude and temporal masking form a masking area in the time-frequency domain
13
Perceptual Coding
Perceptual coders analyse the frequency and amplitude content of the input signal and compare it to a model of human auditory perception
Parts of the input signal which are inaudible are removed
14
Perceptual Coding
A perceptual coder uses a digital filter bank to split a short duration of audio signal into multiple frequency bands
15
Perceptual Coding
The coder analyses the energy in each of these subbands to determine which subbands contain audible information
Subbands which are not audible are not coded
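The subband analysis described here can be sketched roughly as follows. This is a simplified illustration: a naive DFT stands in for the coder's digital filter bank, the audibility threshold per band is passed in by the caller, and all names are hypothetical.

```python
import cmath
import math


def band_energies_db(block, num_bands=32):
    """Energy per frequency band in dB for one short block of samples.

    A naive DFT is used in place of a real filter bank (an assumption made
    to keep the sketch self-contained).
    """
    n = len(block)
    half = n // 2
    bins_per_band = half // num_bands
    if bins_per_band < 1:
        raise ValueError("block too short for the requested number of bands")
    power = []
    for k in range(half):
        acc = sum(block[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2 / n)
    energies = []
    for b in range(num_bands):
        e = sum(power[b * bins_per_band:(b + 1) * bins_per_band])
        energies.append(10.0 * math.log10(e + 1e-12))  # avoid log(0)
    return energies


def audible_bands(energies_db, threshold_db):
    """Indices of bands whose energy exceeds the audibility threshold."""
    return [b for b, (e, t) in enumerate(zip(energies_db, threshold_db))
            if e > t]


if __name__ == "__main__":
    # A pure tone that falls exactly in band 10 of a 64-sample block.
    tone = [math.sin(2 * math.pi * 10 * t / 64) for t in range(64)]
    energies = band_energies_db(tone)
    print(audible_bands(energies, [0.0] * 32))
```

Only the bands returned by `audible_bands` would go on to be coded; the rest are dropped.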
16
Perceptual Coding
Quantization bits are assigned according to signal strength above the audibility curve
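A minimal sketch of this kind of bit assignment, assuming the common rule of thumb that each quantization bit buys about 6 dB of signal-to-noise ratio. Real coders iterate under a fixed bit budget; the function below is a hypothetical simplification:

```python
import math

DB_PER_BIT = 6.02  # approximate SNR gained per quantization bit


def allocate_bits(signal_db, threshold_db, max_bits=16):
    """Bits per subband, based on how far the signal rises above threshold.

    Subbands at or below the audibility threshold get zero bits.
    """
    bits = []
    for s, t in zip(signal_db, threshold_db):
        margin = s - t
        if margin <= 0:
            bits.append(0)
        else:
            bits.append(min(max_bits, math.ceil(margin / DB_PER_BIT)))
    return bits


if __name__ == "__main__":
    levels = [60.0, 20.0, 5.0]       # subband signal levels in dB
    thresholds = [10.0, 15.0, 10.0]  # audibility threshold per subband in dB
    print(allocate_bits(levels, thresholds))  # [9, 1, 0]
```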
17
Perceptual Coding
The purpose of perceptual coding is to reduce the data rate
Perceptual coders maintain the sampling frequency but selectively decrease the word length
A coder's reduction ratio is the ratio of input bit rate to output bit rate
Ratios of up to 6:1 are often transparent
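As a worked example of the reduction ratio, the sketch below starts from the CD bit rate (44,100 Hz × 16 bits × 2 channels). The function names are illustrative only:

```python
CD_BIT_RATE = 44_100 * 16 * 2  # 1,411,200 bits per second


def reduction_ratio(input_bps, output_bps):
    """Ratio of input bit rate to output bit rate."""
    return input_bps / output_bps


def output_rate_for_ratio(input_bps, ratio):
    """Output bit rate that a given reduction ratio implies."""
    return input_bps / ratio


if __name__ == "__main__":
    print(f"6:1 on CD audio gives "
          f"{output_rate_for_ratio(CD_BIT_RATE, 6) / 1000:.1f} kbit/s")
    print(f"128 kbit/s corresponds to "
          f"{reduction_ratio(CD_BIT_RATE, 128_000):.1f}:1")
```

A 6:1 ratio applied to CD audio therefore corresponds to about 235 kbit/s.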
18
Perceptual Coding
Because the inaudible content of the signal is removed, the playback system’s ability to convey audible music should improve
In theory it is possible to get better reproduction after perceptual coding than the original! (In theory…)
Perceptual coders more properly code an audio signal for passage through an audio system
19
MP3
MPEG-1 Audio Layer 3
Developed to support audio coding for playback with video
Uses:
  A filterbank producing 32 subbands from 24 ms of audio data
  Perceptual coder originally produced by the Fraunhofer-Institut für Integrierte Schaltungen
  Lossless Huffman coding
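The lossless Huffman stage assigns short codes to frequent values and long codes to rare ones. The sketch below builds a generic Huffman code from symbol counts. It illustrates the idea only: MP3 itself uses predefined code tables rather than building a tree per file, and the function name is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return {symbol: bit string} for a Huffman code built from symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1, then merge them.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


if __name__ == "__main__":
    # Quantized values where small magnitudes dominate.
    data = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
    codes = huffman_code(data)
    total_bits = sum(len(codes[v]) for v in data)
    print(codes, total_bits)
```

For this sample the 16 values take 28 bits, compared with 32 bits for a fixed 2-bit code.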
20
MP3
21
MP3
Sound quality is highly dependent on the performance of the encoder
Most encoders use constant-bitrate (CBR) encoding. In this mode you choose a target bitrate (e.g. 128 kbit/s)
Codecs
  Fraunhofer
  Xing MP3 encoder
  Etc…
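With constant-bitrate encoding the size of the audio data is predictable from the target bitrate and the duration. A minimal sketch, assuming 1 kbit = 1000 bits and ignoring file headers and tags; the function name is illustrative only:

```python
def cbr_size_bytes(bitrate_kbps, seconds):
    """Approximate size of CBR-encoded audio data in bytes.

    Assumes 1 kbit = 1000 bits and ignores any container or tag overhead.
    """
    return bitrate_kbps * 1000 * seconds // 8


if __name__ == "__main__":
    one_minute = cbr_size_bytes(128, 60)
    print(f"128 kbit/s for 1 minute: {one_minute / 1e6:.2f} MBytes")
```

At 128 kbit/s, one minute of audio takes about 0.96 MBytes.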
22
Joint Stereo Coding
Takes advantage of interchannel redundancy between stereo channels
Some sounds and some components are equal in both channels
  Low frequencies: bass instruments, strings, low components of drums
  Centrally placed signals: typically vocals
Removing duplication reduces data without affecting perceived sound
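One common way to exploit this redundancy is mid/side coding, where the two channels are stored as their average and half their difference. The slides do not name a specific technique, so the following is an illustrative sketch under that assumption:

```python
def to_mid_side(left, right):
    """Convert left/right sample lists to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Invert to_mid_side: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    # A centrally placed signal is identical in both channels.
    vocals = [1.0, -2.0, 3.0]
    mid, side = to_mid_side(vocals, vocals)
    print(mid, side)
```

When both channels carry the same signal the side channel is all zeros, so it costs almost nothing to code.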
23
Fin