
Page 1: PAC/AAC audio coding standard A. Moreno antonio@ece.gatech.edu Georgia Institute of Technology ECE8873-Spring/2004 antonio@ece.gatech.edu

PAC/AAC audio coding standard

A. Moreno
antonio@ece.gatech.edu
Georgia Institute of Technology
ECE8873-Spring/2004

Page 2:

Overview

Audio Recording
Coding: ultimate goal
AAC Encoder Block Diagram
Principles of Psychoacoustics
Perceptual Entropy
Quantization and Coding
Samples

Page 3:

Introduction

"If a tree falls in the forest with no one around to hear it, does it make a sound?"

Page 4:

Audio Recording

Edison, 1877

Page 5:

Audio Recording

Philips, 1978

A/D Converter

PCM

Page 6:

Coding

Ultimate Goal: reduce the number of bits needed to represent the data.

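Bitrate = Fsa x Wordlength (bits per second, per channel), where Fsa is the sampling frequency in Hz and Wordlength is the number of bits per sample.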
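As a rough worked example of the bitrate relation on this slide, the sketch below computes uncompressed PCM bitrates. The 16-bit word length and the two-channel CD figures are assumptions chosen for illustration; the helper name `pcm_bitrate` is hypothetical.

```python
def pcm_bitrate(sample_rate_hz, word_length_bits, channels=1):
    """Uncompressed PCM bitrate in bits per second."""
    return sample_rate_hz * word_length_bits * channels

# CD-style stereo PCM: 44.1 kHz, 16 bits per sample, 2 channels (assumed values).
cd_bps = pcm_bitrate(44_100, 16, channels=2)

# 48 kHz stereo at an assumed 16 bits per sample, compared with a 128 kbps coded stream.
pcm_48k_bps = pcm_bitrate(48_000, 16, channels=2)
compression_ratio = pcm_48k_bps / 128_000

print(cd_bps)             # 1411200
print(compression_ratio)  # 12.0
```

Under these assumptions, a 128 kbps stream uses one twelfth of the bits of the 48 kHz stereo PCM source.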

Page 7:

AAC Encoder Block Diagram

[Figure: AAC encoder block diagram. Blocks shown: input s(n), Gain Control, MDCT, TNS, Multi-Channel (M/S, Intensity), Prediction (z^-1), Scale Factor Extract, Quant, Entropy Coding, Iterative Rate Control Loop, Perceptual Model, Side information coding, Bitstream to channel.]

Page 8:

Principles of Psychoacoustics

Source localization.

Two ears are necessary.

The brain uses intensity differences and time delays between the two perceived signals.

Page 9:

Principles of Psychoacoustics

Absolute Hearing Threshold

[Figure: absolute hearing threshold curve. Levels below the curve are inaudible; levels above it are audible.]
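The slide shows the threshold only as a plot. A minimal sketch follows, assuming the widely used closed-form approximation of the threshold in quiet (attributed to Terhardt and quoted in the Painter and Spanias tutorial listed in the references). The function names are hypothetical, and the formula is an approximation for a young listener, not a value taken from the slide.

```python
import math

def absolute_threshold_db_spl(freq_hz):
    """Approximate threshold in quiet, in dB SPL, for a frequency in Hz."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def is_audible(freq_hz, level_db_spl):
    """True when a tone lies above the threshold-in-quiet curve."""
    return level_db_spl > absolute_threshold_db_spl(freq_hz)

for f in (100, 1000, 3300, 10000):
    print(f, round(absolute_threshold_db_spl(f), 1))
```

The curve is lowest in the region of a few kHz, where the ear is most sensitive, and rises steeply toward low and very high frequencies.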

Page 10:

Principles of Psychoacoustics

Human Ear Loudness characteristic

Robinson and Dadson equi-loudness contours.

Page 11:

Principles of Psychoacoustics

Critical Bands

Concept introduced by Harvey Fletcher in 1940.

Frequency-to-place transform.

Critical bandwidth: a function of frequency that quantifies the cochlear filter passbands.

Example: the critical band centered at 1 kHz is about 160 Hz wide. A narrow-band noise centered at 1 kHz is perceived with the same loudness as long as its bandwidth stays below 160 Hz.

$$BW_c(f) = 25 + 75\left[1 + 1.4\,(f/1000)^2\right]^{0.69} \quad \text{(Hz)}$$
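The critical-bandwidth expression on this slide, read as BWc(f) = 25 + 75 [1 + 1.4 (f/1000)^2]^0.69 Hz, can be checked against the 1 kHz example with a few lines of code. This is a sketch; the function name is hypothetical.

```python
def critical_bandwidth_hz(freq_hz):
    """Critical bandwidth in Hz around a center frequency given in Hz."""
    f_khz = freq_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69

for f in (100, 1000, 4000, 10000):
    print(f, round(critical_bandwidth_hz(f)))
```

At 1 kHz this evaluates to roughly 162 Hz, in line with the "about 160 Hz" quoted in the example, and the bandwidth grows with center frequency.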

Page 12:

Principles of Psychoacoustics

Simultaneous Masking: Frequency

[Figure: masking threshold around a masker in frequency, with the inaudible and audible regions labeled.]

Page 13:

Principles of Psychoacoustics

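Simplified Paradigms:

Noise Masking Tone:
$$TH_T = E_N - K \quad \text{(dB)}, \qquad K = 3\,\text{dB} \ldots 5\,\text{dB (constant)}$$

Tone Masking Noise:
$$TH_N = E_T - 14.5 - B \quad \text{(dB)}$$

Here E_N and E_T are the levels of the noise and tone maskers, and B is the critical band (Bark) number.

[Figure: masker and threshold sketches for each paradigm, each drawn over a 1 Bark band.]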
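The two simplified paradigms on this slide reduce to subtracting an offset from the masker level. The sketch below assumes the readings TH_T = E_N - K for noise masking tone and TH_N = E_T - 14.5 - B for tone masking noise, with B the critical band (Bark) number; the function names and the default K = 4 dB are illustrative choices within the stated 3 to 5 dB range.

```python
def noise_masking_tone_threshold(noise_level_db, k_db=4.0):
    """Threshold (dB) for a tone masked by noise: TH_T = E_N - K."""
    return noise_level_db - k_db

def tone_masking_noise_threshold(tone_level_db, bark_band):
    """Threshold (dB) for noise masked by a tone: TH_N = E_T - 14.5 - B."""
    return tone_level_db - 14.5 - bark_band

# An 80 dB masker in critical band 10 (example values).
print(noise_masking_tone_threshold(80.0))        # 76.0
print(tone_masking_noise_threshold(80.0, 10.0))  # 55.5
```

The example shows the asymmetry between the two cases: at the same level, a noise masker hides much more than a tonal masker does.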

Page 14:

Principles of Psychoacoustics

Spread of Masking

[Figure: spread of masking across neighboring critical bands; labels: threshold (th), 1 Bark.]

Page 15:

Principles of Psychoacoustics

Masking: Temporal

Page 16:

Perceptual Entropy

Perceptual Entropy: an objective metric of perceptually relevant information, introduced by J. Johnston.

The perceived information from an audio signal is only a fraction of the total information emanated by the source.

Page 17:

Perceptual Entropy

Procedure:
1. Window and transform to frequency.
2. Masking threshold is computed using perceptual rules.
3. A determination is made of the number of bits required to quantize the spectrum without injecting perceptible noise.

Page 18:

Perceptual Entropy

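Processing chain: s(n) -> Hann Window -> MDCT -> Determine nature (Noise-like) (Tone-like) -> Apply Thresholding rules -> JND Estimates

Spectral Flatness Measure:
$$SFM_{dB} = 10 \log_{10} \frac{\mu_g}{\mu_a}$$
where \mu_g and \mu_a are the geometric and arithmetic means of the power spectrum.

Coefficient of 'Tonality':
$$\alpha = \min\left(\frac{SFM_{dB}}{-60},\; 1\right)$$

Offset:
$$O_i = \alpha\,(14.5 + i) + (1 - \alpha)\,5.5 \quad \text{(dB)}$$

JND Estimates:
$$T_i = 10^{\,\log_{10}(C_i) - (O_i/10)}$$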
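The tonality and threshold steps on this slide can be sketched as below. The sketch assumes the usual reading of the slide's formulas: SFM in dB is 10 log10 of the ratio of geometric to arithmetic mean of the power spectrum, alpha = min(SFM_dB / -60, 1), O_i = alpha (14.5 + i) + (1 - alpha) 5.5 dB, and T_i = 10^(log10(C_i) - O_i / 10). The function names and the toy spectra are hypothetical.

```python
import math

def spectral_flatness_db(power):
    """SFM in dB: 10*log10(geometric mean / arithmetic mean) of a power spectrum."""
    n = len(power)
    geometric = math.exp(sum(math.log(p) for p in power) / n)
    arithmetic = sum(power) / n
    return 10.0 * math.log10(geometric / arithmetic)

def tonality(sfm_db):
    """Coefficient of tonality: near 0 for noise-like, 1 for tone-like signals."""
    return min(sfm_db / -60.0, 1.0)

def offset_db(alpha, band_index):
    """Masking offset O_i in dB, blending the tone and noise rules."""
    return alpha * (14.5 + band_index) + (1.0 - alpha) * 5.5

def jnd_threshold(band_energy, offset):
    """Threshold T_i from band energy C_i and offset O_i (dB)."""
    return 10.0 ** (math.log10(band_energy) - offset / 10.0)

flat = [1.0] * 16              # noise-like: flat power spectrum
peaky = [1e6] + [1e-6] * 15    # tone-like: one dominant line

for name, spectrum in (("flat", flat), ("peaky", peaky)):
    alpha = tonality(spectral_flatness_db(spectrum))
    off = offset_db(alpha, band_index=10)
    print(name, alpha, off, jnd_threshold(100.0, off))
```

A flat spectrum yields an offset of 5.5 dB, while a strongly tonal one yields 14.5 + i dB, so the threshold sits further below the masker energy for tone-like signals.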

Page 19:

Perceptual Entropy

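$$PE = \sum_{i=1}^{25} \sum_{\omega = bl_i}^{bh_i} \left[ \log_2\!\left( 2\left|\mathrm{nint}\!\left(\frac{\mathrm{Re}(\omega)}{\sqrt{6T_i/k_i}}\right)\right| + 1 \right) + \log_2\!\left( 2\left|\mathrm{nint}\!\left(\frac{\mathrm{Im}(\omega)}{\sqrt{6T_i/k_i}}\right)\right| + 1 \right) \right] \quad \text{(bits/sample)}$$

i: index of critical band;
bl_i, bh_i: lower and upper bounds of band i;
k_i: number of transform components in band i;
T_i: masking threshold in band i;
nint: rounding to the nearest integer.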
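A minimal sketch of the perceptual entropy sum follows, assuming the formula on this slide reads as a double sum over critical bands i and spectral lines w in [bl_i, bh_i] of log2(2|nint(Re(w) / sqrt(6 T_i / k_i))| + 1) plus the same term for the imaginary part. The function names, the inclusive band bounds, and the toy spectrum are assumptions made for illustration.

```python
import math

def nint(x):
    """Round to the nearest integer, with halves rounded away from zero."""
    magnitude = int(math.floor(abs(x) + 0.5))
    return magnitude if x >= 0 else -magnitude

def perceptual_entropy(spectrum, bands, thresholds):
    """Sum the PE terms over bands.

    spectrum:   list of complex transform components
    bands:      list of (bl_i, bh_i) index pairs, both inclusive
    thresholds: masking threshold T_i for each band
    """
    pe = 0.0
    for (lo, hi), t in zip(bands, thresholds):
        k = hi - lo + 1                 # components in this band
        step = math.sqrt(6.0 * t / k)   # threshold-derived step size
        for w in range(lo, hi + 1):
            re = abs(nint(spectrum[w].real / step))
            im = abs(nint(spectrum[w].imag / step))
            pe += math.log2(2 * re + 1) + math.log2(2 * im + 1)
    return pe

# One band of four lines with T = 6, so the step is sqrt(36 / 4) = 3.
masked = [0.1 + 0.1j] * 4
audible = [9 + 0j, 0j, 0j, 0j]
print(perceptual_entropy(masked, [(0, 3)], [6.0]))
print(perceptual_entropy(audible, [(0, 3)], [6.0]))
```

Components that fall below the threshold-derived step quantize to zero and contribute no bits, which is the sense in which only the perceptually relevant part of the signal is counted.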

Page 20:

Returning

"If a tree falls in the forest with no one around to hear it, does it make a sound?"

From a Perceptual Coding standpoint, if no one can hear it, THERE IS NO TREE.

Page 21:

AAC Encoder Block Diagram

[Figure: AAC encoder block diagram. Blocks shown: input s(n), Gain Control, MDCT, TNS, Multi-Channel (M/S, Intensity), Prediction (z^-1), Scale Factor Extract, Quant, Entropy Coding, Iterative Rate Control Loop, Perceptual Model, Side information coding, Bitstream to channel.]

Page 22:

Quantization and Coding

Power-law quantizer
Huffman Coding (table can be chosen)

Global Gain -> Quantization step size
Scale Factors -> noise shaping factor
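A minimal sketch of a power-law quantizer of the kind named on this slide is given below. It assumes the 3/4 compression exponent with a 4/3 expansion on decoding, a rounding offset of 0.4054 as commonly quoted for AAC encoders, and a step size that changes by a factor of 2^(1/4) per unit of gain. The function names and the exact gain-to-step mapping are illustrative assumptions, not the normative definitions.

```python
ROUNDING_OFFSET = 0.4054  # assumed rounding constant, see lead-in

def step_size(global_gain, scale_factor):
    """Assumed mapping: global gain coarsens the step, a band scale factor refines it."""
    return 2.0 ** ((global_gain - scale_factor) / 4.0)

def quantize(x, step):
    """Compress the magnitude with exponent 3/4, then truncate to an integer index."""
    q = int((abs(x) / step) ** 0.75 + ROUNDING_OFFSET)
    return -q if x < 0 else q

def dequantize(q, step):
    """Expand the index with exponent 4/3 and rescale by the step size."""
    magnitude = (abs(q) ** (4.0 / 3.0)) * step
    return -magnitude if q < 0 else magnitude

step = step_size(global_gain=0, scale_factor=0)
for x in (0.0, 1.0, 16.0, -16.0, 100.0):
    q = quantize(x, step)
    print(x, q, round(dequantize(q, step), 3))
```

Because of the power law, large coefficients are quantized more coarsely than small ones, so the quantization noise grows with signal level instead of staying uniform.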

Page 23:

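Quantization and Coding

while NOISE_CTL
    FINDING_RATE = 1;                  % re-run the rate loop after any scale factor change
    while FINDING_RATE                 % inner loop: meet the bit budget
        Nr_bits = get_bits_needed();
        if (Nr_bits > max_bits)
            adjust_global_gain();      % coarser step size, fewer bits
        else
            FINDING_RATE = 0;
        end
    end
    q_noise = get_quant_noise_level(); % outer loop: keep noise below the threshold
    if (q_noise > Th(band))
        adjust_band_scale_factor();    % finer quantization in the noisy band
    else
        NOISE_CTL = 0;
    end
end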
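The nested loop on this slide can be turned into a small runnable sketch. Everything below is a toy model under stated assumptions: the bit count is a stand-in estimate (log2(2|q| + 1) per index) and not a Huffman table lookup, the step size mapping is assumed to be 2^((gain - scale factor)/4), `max_bits` is assumed non-negative, and the pass limit is added so the outer loop always terminates. All function names are hypothetical.

```python
import math

def quantize(x, step):
    q = int((abs(x) / step) ** 0.75 + 0.4054)  # power-law quantizer (assumed form)
    return -q if x < 0 else q

def dequantize(q, step):
    magnitude = (abs(q) ** (4.0 / 3.0)) * step
    return -magnitude if q < 0 else magnitude

def bits_needed(indices):
    # Stand-in for the entropy coder's bit count.
    return sum(math.log2(2 * abs(q) + 1) for band in indices for q in band)

def band_noise(coeffs, indices, step):
    # Quantization noise energy in one band.
    return sum((x - dequantize(q, step)) ** 2 for x, q in zip(coeffs, indices))

def rate_control(bands, thresholds, max_bits, max_passes=32):
    global_gain = 0
    scale = [0] * len(bands)
    for attempt in range(max_passes):
        # Inner loop: raise the global gain until the bit budget is met.
        while True:
            steps = [2.0 ** ((global_gain - s) / 4.0) for s in scale]
            indices = [[quantize(x, st) for x in band]
                       for band, st in zip(bands, steps)]
            bits = bits_needed(indices)
            if bits <= max_bits:
                break
            global_gain += 1
        # Outer loop: refine bands whose noise exceeds the masking threshold.
        noisy = [b for b in range(len(bands))
                 if band_noise(bands[b], indices[b], steps[b]) > thresholds[b]]
        if not noisy or attempt == max_passes - 1:
            break
        for b in noisy:
            scale[b] += 1
    return global_gain, scale, indices, bits

bands = [[100.0, -50.0, 25.0], [3.0, -2.0, 1.0]]
gain, scale, indices, bits = rate_control(bands, thresholds=[50.0, 0.5], max_bits=20)
print(gain, scale, indices, round(bits, 2))
```

The result always respects the bit budget, but when the budget is too tight some bands may still exceed their thresholds at the pass limit; a real encoder has to decide how to handle that case.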

Page 24:

Samples

Castanets

Original 48kHz Stereo

128kbps AAC Stereo (48kHz)

Piano

Timpani

Page 25:

References

[1] Ted Painter and Andreas Spanias. Perceptual coding of digital audio. Proceedings of the IEEE, 88(4):449-513, April 2000.

[2] Karlheinz Brandenburg. MP3 and AAC explained. AES 17th International Conference on High Quality Audio Coding, 1999.

[3] J.D. Johnston, A.J. Ferreira. Sum-Difference Stereo Transform Coding. Proc. ICASSP 1992.

[4] Deepen Sinha, James D. Johnston. Audio Compression at low bit rates using a Signal Adaptive switched Filterbank. Proc. ICASSP 1996, pp. 1053-1056.