mpeg-1 - mcgill schulich faculty of musicich/classes/mumt621_09/presentations... · 2011. 1. 2. ·...

  • MPEG-1

    Overview of MPEG-1 Standard

    Introduction to perceptual and entropy codings

  • 25 September 2009 MPEG-1 Presentation 2

    Contents

    History

    Psychoacoustics and perceptual coding

    Entropy coding

    MPEG-1

    Layer I/II

    Layer III (MP3)

    Comparison and Audio Quality

  • Introduction

    Digitizing an analog signal is (lossy) compression

    Digitizing introduces quantization noise

    Quantization noise implies loss of quality

    Linear quantization, 16 bit or more (98 dB): inaudible noise (CD)

    Linear quantization, 4 bit (26 dB)
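
    The 98 dB and 26 dB figures match the usual rule of thumb for an ideal uniform quantizer, roughly 6.02 dB per bit plus 1.76 dB. The sketch below assumes a full-scale sine-wave input, which the slide does not state; the function name `sqnr_db` is illustrative.

```python
import math


def sqnr_db(bits: int) -> float:
    """Theoretical signal-to-quantization-noise ratio of an ideal uniform quantizer.

    Assumes a full-scale sine-wave input (an assumption, not stated on the slide).
    """
    # Power ratio of a full-scale sine to uniform quantization noise is
    # 1.5 * 2^(2 * bits), which is about 6.02 dB per bit plus 1.76 dB.
    return 10 * math.log10(1.5 * 2 ** (2 * bits))


for bits in (4, 16):
    print(f"{bits:2d} bit: {sqnr_db(bits):.1f} dB")
```

    Each extra bit buys about 6 dB, which is why going from 4 to 16 bits moves the noise floor from clearly audible to inaudible.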

  • History

    Moving Picture Experts Group (MPEG)

    Created in January 1988

    Starts the development of MPEG-1 in May 1988

    Publishes the MPEG-1 standard in November 1992 (ISO/IEC 11172-3 for audio)

    MPEG-1 standard

    Defines bit-stream

    Defines decoding functions

    DOES NOT define encoding techniques

    Inspired by MUSICAM (Masking pattern adapted Universal Subband Integrated Coding And Multiplexing)

  • Psychoacoustics

    Masking effect

    Critical bands

    z/Bark   lower boundary (Hz)   higher boundary (Hz)   bandwidth (Hz)   central frequency (Hz)
    0        0                     100                    100              50
    1        100                   200                    100              150
    2        200                   300                    100              250
    3        300                   400                    100              350
    4        400                   510                    110              450
    5        510                   630                    120              570
    6        630                   770                    140              700
    7        770                   920                    150              840

    Idealized critical bands (Painter & Spanias 2000)

    (Brandenburg)

    Time domain masking (Pohlmann 2000)

    Frequency domain masking (Pohlmann 2000)
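
    A small sketch of how the critical-band table can be used: a lookup that maps a frequency to its band index from the slide's boundaries, next to the Zwicker-Terhardt closed-form approximation of the Bark scale. The closed form is not on the slides and is included only as a commonly used approximation; both function names are illustrative.

```python
import bisect
import math

# Upper band edges in Hz for Bark bands 0..7, copied from the slide's table.
UPPER_EDGES_HZ = [100, 200, 300, 400, 510, 630, 770, 920]


def bark_band(freq_hz: float) -> int:
    """Return the index of the listed critical band that contains freq_hz."""
    if not 0 <= freq_hz < UPPER_EDGES_HZ[-1]:
        raise ValueError("frequency outside the eight bands listed on the slide")
    # A frequency equal to an edge belongs to the band that starts there.
    return bisect.bisect_right(UPPER_EDGES_HZ, freq_hz)


def bark_approx(freq_hz: float) -> float:
    """Zwicker-Terhardt approximation of the Bark scale (not from the slides)."""
    return 13 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500) ** 2)


for f in (50, 440, 900):
    print(f"{f} Hz: band {bark_band(f)}, approx {bark_approx(f):.2f} Bark")
```

    The bands get wider as frequency rises, so a fixed-width filterbank only approximates them at low frequencies.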

  • Perceptual Coding

    Dividing the signal into subbands

    Ignoring masked audio information

    Introducing inaudible quantization noise

    Bit allocation according to masking threshold (Pohlmann 2000)

    Quantization noise added according to masking threshold (Pohlmann 2000)
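
    The idea of allocating bits according to the masking threshold can be sketched as follows: a subband needs just enough bits to push its quantization noise below the mask, at roughly 6 dB per bit. This is a simplified illustration, not the allocation loop of any real encoder; the function name and the example levels are hypothetical.

```python
import math

DB_PER_BIT = 6.02  # each extra bit lowers uniform quantization noise by about 6 dB


def bits_needed(signal_db: float, mask_db: float, max_bits: int = 15) -> int:
    """Bits required so that quantization noise stays under the masking threshold."""
    smr_db = signal_db - mask_db  # signal-to-mask ratio
    if smr_db <= 0:
        return 0  # the subband is fully masked, so nothing needs to be coded
    return min(max_bits, math.ceil(smr_db / DB_PER_BIT))


# Hypothetical (signal level, masking threshold) pairs in dB for three subbands.
subbands = [(60.0, 30.0), (45.0, 50.0), (70.0, 20.0)]
print([bits_needed(s, m) for s, m in subbands])
```

    Subbands whose signal lies below the threshold get no bits at all, which is where most of the saving comes from.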

  • Perceptual Coding

    Perceptual Encoder/Decoder (Kahrs & Brandenburg 1998)

  • Entropy Coding

    Use information about the signal to code efficiently

    Entropy of a signal

    Example 1: {0, 2, 2, 2, 0, 0, 0, 0, 0, 2, 0, 3, 2, 2, 0, 0, 0, 3, 0, 0}

    20 symbols – twelve 0 (0.6), zero 1 (0), six 2 (0.3), two 3 (0.1)

    Entropy H = 1.30

    Example 2: {1, 2, 3, 0, 2, 1, 1, 2, 3, 0, 0, 1, 0, 3, 3, 3, 2, 0, 1, 2}

    20 symbols – five 0 (0.25), five 1 (0.25), five 2 (0.25), five 3 (0.25)

    Entropy H = 2

    Shannon theorem

    It is impossible to code with fewer than H bits/symbol

    It is possible to code with fewer than H+1 bits/symbol
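
    The two entropy values can be reproduced with the Shannon entropy H = -Σ p_i log2(p_i), taking each p_i as the relative frequency of symbol i in the sequence. The slide does not show the formula, so this is presumably what it uses; the function name is illustrative.

```python
import math
from collections import Counter


def entropy_bits(symbols) -> float:
    """Shannon entropy in bits/symbol, from the relative frequencies in `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    # Symbols that never occur contribute nothing, so they are simply absent here.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


EXAMPLE_1 = [0, 2, 2, 2, 0, 0, 0, 0, 0, 2, 0, 3, 2, 2, 0, 0, 0, 3, 0, 0]
EXAMPLE_2 = [1, 2, 3, 0, 2, 1, 1, 2, 3, 0, 0, 1, 0, 3, 3, 3, 2, 0, 1, 2]

print(f"Example 1: H = {entropy_bits(EXAMPLE_1):.2f} bits/symbol")
print(f"Example 2: H = {entropy_bits(EXAMPLE_2):.2f} bits/symbol")
```

    The skewed distribution of Example 1 is what leaves room for a variable-length code; the uniform Example 2 already needs the full 2 bits per symbol.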

  • Entropy Coding

    Huffman coding

    Example 1: {0, 2, 2, 2, 0, 0, 0, 0, 0, 2, 0, 3, 2, 2, 0, 0, 0, 3, 0, 0}

    20 symbols – twelve 0 (0.6), zero 1 (0), six 2 (0.3), two 3 (0.1)

    Entropy H = 1.30

    Immediate (fixed-length) coding: 0 → "00", 1 → "01", 2 → "10", 3 → "11"

    "0010101000000000001000111010000000110000"

    Huffman coding: 0 → "0", 1 → "111", 2 → "10", 3 → "110"

    "010101000000100110101000011000"

    Efficiency:

    Immediate coding: 2 bits/symbol

    Huffman coding: 1.5 bits/symbol (on average)
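
    The sketch below encodes Example 1 with the two code tables from the slide and decodes the Huffman bit string again. It uses the slide's tables as given rather than building a Huffman tree, since the slide's table also reserves a codeword for the unused symbol 1; the helper names are illustrative.

```python
FIXED = {0: "00", 1: "01", 2: "10", 3: "11"}
HUFFMAN = {0: "0", 1: "111", 2: "10", 3: "110"}

EXAMPLE_1 = [0, 2, 2, 2, 0, 0, 0, 0, 0, 2, 0, 3, 2, 2, 0, 0, 0, 3, 0, 0]


def encode(symbols, table) -> str:
    """Concatenate the codeword of every symbol."""
    return "".join(table[s] for s in symbols)


def decode(bits: str, table) -> list:
    """Decode a prefix code by growing a candidate codeword bit by bit."""
    inverse = {code: symbol for symbol, code in table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the codeword
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return decoded


fixed_bits = encode(EXAMPLE_1, FIXED)
huffman_bits = encode(EXAMPLE_1, HUFFMAN)
print(len(fixed_bits) / len(EXAMPLE_1), "bits/symbol (fixed-length)")
print(len(huffman_bits) / len(EXAMPLE_1), "bits/symbol (Huffman)")
```

    The 1.5 bits/symbol sits between the entropy bound of 1.30 and the fixed-length cost of 2, as the Shannon theorem requires.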

  • MPEG-1

    Sampling rate: 32, 44.1 and 48 kHz

    Four modes:

    Mono: 1 channel

    Stereo: 2 channels

    Dual: 2 channels independent (e.g. bilingual programmes)

    Joint stereo: 2 channels coded together

    2 perceptual models

    Floating point quantization (normalization)

    Error checking: cyclic redundancy check (CRC)
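
    As a rough illustration of how the sampling rates and the four modes are signalled, the sketch below lists the 2-bit field values used in the MPEG-1 audio frame header. The numeric codes come from the ISO/IEC 11172-3 header layout, not from the slides, and the Python names are illustrative.

```python
from enum import Enum


class Mode(Enum):
    """2-bit mode field of the MPEG-1 audio frame header."""
    STEREO = 0b00
    JOINT_STEREO = 0b01
    DUAL_CHANNEL = 0b10
    SINGLE_CHANNEL = 0b11  # "mono" on the slide


# 2-bit sampling_frequency field; the value 0b11 is reserved.
SAMPLING_RATE_HZ = {0b00: 44_100, 0b01: 48_000, 0b10: 32_000}


def channel_count(mode: Mode) -> int:
    """Number of audio channels carried by a frame in the given mode."""
    return 1 if mode is Mode.SINGLE_CHANNEL else 2


for mode in Mode:
    print(f"{mode.value:02b} {mode.name}: {channel_count(mode)} channel(s)")
```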

  • MPEG-1 Layer I

    From 32 to 448 kbps

    32-subband polyphase filterbank

    Bit allocation (0-15)

    Max dynamic range > 120 dB

    Linear quantization

    1 frame = 384 samples

    Example: Philips Digital Compact Cassette

    Example of Layer I encoder (Pohlmann 2000)

    Layer I frame format (Pohlmann 2000)
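
    The 384-sample frame follows from 32 subbands with 12 samples each. The sketch below works out the frame duration and the frame length in bytes for a given bitrate; the slot arithmetic (12 × bitrate / sampling rate, 4 bytes per slot, padding ignored) is taken from the Layer I definition in the standard rather than from the slides, and the function names are illustrative.

```python
SUBBANDS = 32
SAMPLES_PER_SUBBAND = 12  # Layer I codes 12 samples per subband in each frame
SAMPLES_PER_FRAME = SUBBANDS * SAMPLES_PER_SUBBAND  # 384
SLOT_BYTES = 4  # Layer I frames are counted in 4-byte slots


def frame_duration_ms(sample_rate_hz: int) -> float:
    """Playback time covered by one Layer I frame."""
    return 1000 * SAMPLES_PER_FRAME / sample_rate_hz


def frame_bytes(bitrate_bps: int, sample_rate_hz: int) -> int:
    """Layer I frame length in bytes, ignoring the optional padding slot."""
    slots = 12 * bitrate_bps // sample_rate_hz
    return slots * SLOT_BYTES


print(frame_duration_ms(48_000), "ms per frame at 48 kHz")
print(frame_bytes(384_000, 48_000), "bytes per frame at 384 kbps, 48 kHz")
```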

  • MPEG-1 Layer II

    From 32 to 384 kbps

    Improvement of Layer I

    Improved FFT analysis

    Scale factor redundancy

    Finer quantization

    1 frame = 1152 samples

    Example: Digital Audio Broadcasting (DAB)

    Example of Layer II encoder (Pohlmann 2000)

    Layer II frame format (Pohlmann 2000)

  • MPEG-1 Layer III (MP3)

    From 32 to 320 kbps

    Improvements:

    Finer psychoacoustic model

    Alias reduction (MDCT filters)

    Nonuniform quantization

    Entropy coding

    Adaptive block size

    Only Layer with patents

    Inspired by:

    ASPEC (audio spectral perceptual entropy coding)

    OCF (optimal coding in the frequency domain)

    Example of Layer III encoder (Pohlmann 2000)

    Layer III frame format (Pohlmann 2000)

  • MPEG-1 Layer III (MP3)

    Filtering: hybrid polyphase filter/MDCT

    Steady-state signals: 18-point MDCT on every subband

    Frequency resolution: 41.67 Hz

    Time resolution: 24 ms

    Transient signals: 6-point MDCT

    Frequency resolution: 125 Hz

    Time resolution: 8 ms

    3 block modes

    Pre-echo detection

    Quantization: power 3/4

    Entropy coding:

    Huffman tables

    Run-length coding

    Filtering stage in Layer III encoder (Pohlmann 2000)

    MDCT filterbank in Layer III encoder (Pohlmann 2000)
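
    The resolution figures on this slide can be reproduced under two assumptions that the slide leaves implicit: a 48 kHz sampling rate, and a time resolution equal to the span of one MDCT window (2N subband samples, each covering 32 input samples). The same sketch also shows the power-3/4 companding in a simplified form, without the rounding offset and scale factors of the real Layer III quantizer; all names are illustrative.

```python
SUBBANDS = 32
SAMPLE_RATE_HZ = 48_000  # assumed; the slide does not state the sampling rate


def freq_resolution_hz(mdct_points: int) -> float:
    """Width of one spectral line: Nyquist band divided by the total line count."""
    return (SAMPLE_RATE_HZ / 2) / (SUBBANDS * mdct_points)


def time_resolution_ms(mdct_points: int) -> float:
    """Span of one MDCT window, taken here as 2 * N subband samples."""
    window_samples = 2 * mdct_points * SUBBANDS
    return 1000 * window_samples / SAMPLE_RATE_HZ


def quantize(value: float, step: float) -> int:
    """Simplified power-law quantizer: compress the magnitude with exponent 3/4."""
    magnitude = round((abs(value) / step) ** 0.75)
    return magnitude if value >= 0 else -magnitude


def dequantize(index: int, step: float) -> float:
    """Inverse of quantize: expand the magnitude with exponent 4/3."""
    magnitude = (abs(index) ** (4 / 3)) * step
    return magnitude if index >= 0 else -magnitude


for points in (18, 6):
    print(points, round(freq_resolution_hz(points), 2), "Hz",
          round(time_resolution_ms(points), 2), "ms")
```

    The long block trades time resolution for frequency resolution and the short block does the opposite, which is why switching to short blocks on transients limits pre-echo.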

  • Joint Stereo Coding

    Intensity coding

    Sum of left/right channels

    Coding of the sum and of left/right scale factors

    Usually only for high-frequency subbands

    Efficient for redundant audio channels

    MS (mid/side) stereo coding

    Sum and difference of left/right channels

    Coding of the two values

    Stereo masking
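
    A minimal sketch of the mid/side transform described above: the encoder codes the sum and the difference of the two channels, and the decoder recovers left and right from them. The 1/√2 scaling is one common normalization and is an assumption here, as are the function names.

```python
import math

INV_SQRT2 = 1 / math.sqrt(2)  # assumed normalization, keeps total energy unchanged


def ms_encode(left, right):
    """Turn left/right samples into mid (sum) and side (difference) samples."""
    mid = [(l + r) * INV_SQRT2 for l, r in zip(left, right)]
    side = [(l - r) * INV_SQRT2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right samples from mid/side samples."""
    left = [(m + s) * INV_SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) * INV_SQRT2 for m, s in zip(mid, side)]
    return left, right


left = [0.5, -0.25, 0.75]
right = [0.5, -0.25, 0.75]  # identical channels: the side signal vanishes
mid, side = ms_encode(left, right)
print(side)
```

    When the channels are strongly correlated the side signal is close to zero and costs very few bits, which is the redundancy that joint stereo exploits.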

  • Audio Quality

    Comparison with CD quality at 44.1 kHz (16 bit, stereo: 1.411 Mbps)

    Layer I:

    No perceptual difference for 384 kbps (stereo) – 2:1 compression

    Layer II:

    No perceptual difference for 256 kbps (stereo) – 4:1 compression

    Layer III:

    Increase of mean opinion score compared to Layer II at 256 kbps (stereo), for 128 kbps (stereo) – 8:1 compression
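
    As a rough cross-check of the compression figures, the sketch below divides the bitrate of uncompressed CD audio by each coded bitrate. It assumes the reference is 16-bit stereo PCM at 44.1 kHz; with that reference the ratios come out somewhat higher than the round 2:1, 4:1 and 8:1 quoted on the slide, which may be measured against a different baseline.

```python
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2
CD_BITRATE_BPS = CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS  # 1,411,200


def compression_ratio(coded_bitrate_bps: int) -> float:
    """Uncompressed CD bitrate divided by the coded bitrate."""
    return CD_BITRATE_BPS / coded_bitrate_bps


for layer, bitrate in (("Layer I", 384_000), ("Layer II", 256_000), ("Layer III", 128_000)):
    print(f"{layer}: {compression_ratio(bitrate):.1f}:1 at {bitrate // 1000} kbps")
```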

  • Comparison between Layers

    Layers I and II are very similar

    Each Layer has its defined decoder design

    Encoding/Decoding complexity: Layer I/II (broadcasting)

    Encoding/Decoding quality: Layer III (audio storage)

  • Conclusion

    Bibliography:

    Ambikairajah, E. et al. Auditory masking and MPEG-1 audio compression, Electronics & Communication Engineering Journal, 1997

    Brandenburg, K. & Bosi, M. Overview of MPEG Audio: Current and Future Standards for Low-Bit-Rate Audio Coding, Journal of the Audio Engineering Society, 1997, Vol. 45(No. 1/2)

    Painter, T. & Spanias, A. Perceptual Coding of Digital Audio, Proceedings of IEEE, 2000, Vol. 88(No. 4)

    Painter, T. & Spanias, A. A Review of Algorithms for Perceptual Coding of Digital Audio Signals, Digital Signal Processing, 1997

    Pan, D. A Tutorial on MPEG/Audio Compression, IEEE MultiMedia, IEEE Computer Society, 1995, Vol. 2(2), pp. 60-74

    Pan, D.Y. Digital Audio Compression, Digital Technical Journal, 1993, Vol. 5

    Kahrs, M. and Brandenburg, K. Applications of digital signal processing to audio and acoustics, Kluwer Academic Publishers, 1998

    Mallat, S. Traitement du Signal, Ecole Polytechnique, 2000

    Pohlmann, K.C. Principles of Digital Audio, McGraw-Hill Professional, 2000

    http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=620475&isnumber=13490

    http://www.aes.org/tmpFiles/elib/20090924/7871.pdf

    http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=842996&isnumber=18261

    http://medialab.hsr.ch/dm/vorlesung/dsp97.pdf

    http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00388209

    http://eceftp.niu.edu.tw/mhyeh/teach.files/media92/data/paper/Digital_Audio_Compression_1993.pdf

    http://www.springerlink.com/content/h77t14/?p=e0f426fe6ce54d0cbf51591d35ae5987&pi=0
