compression 4

Upload: rakesh-inani

Post on 01-Jun-2018


  • 8/9/2019 Compression 4

    1/24

    296.3 Page 1

    CPS 296.3: Algorithms in the Real World

    Data Compression III

    Compression Outline

    Introduction: Lossy vs. Lossless, Benchmarks, ...
    Information Theory: Entropy, etc.
    Probability Coding: Huffman + Arithmetic Coding
    Applications of Probability Coding: PPM + others
    Lempel-Ziv Algorithms:
    - LZ77, gzip
    - LZ78, compress (not covered in class)
    Other Lossless Algorithms: Burrows-Wheeler
    Lossy algorithms for images: JPEG, MPEG, ...
    Compressing graphs and meshes: BBK

    Lempel-Ziv Algorithms

    LZ77 (Sliding Window)
    Variants: LZSS (Lempel-Ziv-Storer-Szymanski)
    Applications: gzip, Squeeze, LHA, PKZIP, ZOO

    LZ78 (Dictionary Based)
    Variants: LZW (Lempel-Ziv-Welch), LZC
    Applications: compress, GIF, CC

    LZ77: Sliding Window Lempel-Ziv

    Dictionary and buffer "windows" are fixed length and slide with the cursor.

    Repeat:
        Output (p, l, c), where
            p = position of the longest match that starts in the dictionary (relative to the cursor)
            l = length of the longest match
            c = next char in the buffer beyond the longest match
        Advance the window by l + 1
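The repeat loop above can be sketched in Python. This is a minimal, unoptimized sketch: the function name `lz77_encode` and the `window` and `lookahead` sizes are assumptions made for illustration, and p is reported as a distance back from the cursor.

```python
def lz77_encode(data, window=6, lookahead=4):
    """Emit (p, l, c) triples; p is the distance back from the cursor."""
    out = []
    cursor = 0
    while cursor < len(data):
        best_p, best_l = 0, 0
        start = max(0, cursor - window)
        # Reserve one character for c, so a match may cover at most max_l characters.
        max_l = min(lookahead, len(data) - cursor) - 1
        for cand in range(start, cursor):
            l = 0
            # The match must start in the dictionary but may run into the lookahead buffer.
            while l < max_l and data[cand + l] == data[cursor + l]:
                l += 1
            if l > best_l:
                best_p, best_l = cursor - cand, l
        out.append((best_p, best_l, data[cursor + best_l]))
        cursor += best_l + 1  # advance the window by l + 1
    return out


print(lz77_encode("aacaacabcababac"))
```

The brute-force scan over every dictionary position is only there to keep the sketch short; it is the part a real encoder would replace with an index.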

    a a c a a c a b c a b a b a c

    (Figure labels: Dictionary (previously coded), Lookahead Buffer, Cursor)

    LZ77 Decoding

    The decoder keeps the same dictionary window as the encoder. For each message it looks it up in the dictionary and inserts a copy at the end of the string.

    What if l > p? (Only part of the message is in the dictionary.)
    E.g. dict = abcd, codeword = (2,9,e)

    - Simply copy from left to right:
          for (i = 0; i < length; i++)
              out[cursor+i] = out[cursor-offset+i]
    - Out = abcdcdcdcdcdce
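The left-to-right copy can be checked with a short Python version of the loop. The function name `lz77_decode` and the `initial` parameter (text already in the dictionary) are assumptions for illustration.

```python
def lz77_decode(triples, initial=""):
    """Decode (offset, length, char) triples, appending to the text in `initial`."""
    out = list(initial)
    for offset, length, char in triples:
        cursor = len(out)
        for i in range(length):
            # Copying left to right lets a character written earlier in this
            # same loop be read again, which is what handles length > offset.
            out.append(out[cursor - offset + i])
        out.append(char)
    return "".join(out)


print(lz77_decode([(2, 9, "e")], initial="abcd"))  # abcdcdcdcdcdce
```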

    Optimizations used by gzip (cont.)

    1. Huffman code the positions, lengths and chars.
    2. Non-greedy: possibly use a shorter match so that the next match is better.
    3. Use a hash table to store the dictionary.
       - Hash keys are all strings of length 3 in the dictionary window.
       - Find the longest match within the correct hash bucket.
       - Puts a limit on the length of the search within a bucket.
       - Within each bucket, store in order of position.
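Point 3 can be sketched as follows. This is an illustration of the idea only, not gzip's actual code: the names `index_position` and `longest_match` and the limits `max_chain` and `max_len` are assumptions, and evicting positions that fall out of the window is left out.

```python
from collections import defaultdict


def index_position(data, pos, table):
    """Record the 3-character string starting at pos in its bucket."""
    if pos + 3 <= len(data):
        # Positions are appended in increasing order, so each bucket stays sorted.
        table[data[pos:pos + 3]].append(pos)


def longest_match(data, cursor, table, max_chain=8, max_len=258):
    """Return (distance, length) of the best match found for data[cursor:]."""
    best_dist, best_len = 0, 0
    key = data[cursor:cursor + 3]
    # Look only in the bucket for this key, most recent positions first,
    # and examine at most max_chain candidates.
    for cand in reversed(table[key][-max_chain:]):
        l = 0
        while (l < max_len and cursor + l < len(data)
               and data[cand + l] == data[cursor + l]):
            l += 1
        if l > best_len:
            best_dist, best_len = cursor - cand, l
    return best_dist, best_len


data = "abcabcabcd"
table = defaultdict(list)
for pos in range(3):
    index_position(data, pos, table)
print(longest_match(data, 3, table))  # (3, 6)
```

Capping the number of candidates per bucket trades a little compression for a bounded search time.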

    Comparison to Lempel-Ziv 78

    Both LZ77 and LZ78 and their variants keep a "dictionary" of recent strings that have been seen.

    Lempel-Ziv Algorithms Summary

    Adapts well to changes in the file (e.g. a ...
    ... "second pass" and compress much better.

    Compression Outline

    Introduction: Lossy vs. Lossless, Benchmarks, ...
    Information Theory: Entropy, etc.
    Probability Coding: Huffman + Arithmetic Coding
    Applications of Probability Coding: PPM + others
    Lempel-Ziv Algorithms: LZ77, gzip, compress, ...
    Other Lossless Algorithms:
    - Burrows-Wheeler
    - ACB
    Lossy algorithms for images: JPEG, MPEG, ...
    Compressing graphs and meshes: BBK

    Burrows-Wheeler

    Currently near best "balanced" algorithm for text.
    Breaks the file into fixed-size blocks and encodes each block separately.

    For each block:
    - Sort each character by its full context.

    Burrows-Wheeler: Example

    Let's encode: decode
    Context "wraps" around. Last char is most significant.

    (Figure: all rotations of the input, sorted by context)
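The sort by wrapped context can be sketched in Python. The function name `bw_encode` and the 0-based start pointer it returns are assumptions for illustration, and the sort is the naive one rather than anything a production coder would use.

```python
def bw_encode(block):
    """Sort each character by its wrapped context; return (output, start)."""
    n = len(block)

    def context_key(i):
        # Context of block[i]: the preceding characters, wrapping around,
        # listed with the immediately preceding character first so that
        # it is the most significant in the comparison.
        return [block[(i - k) % n] for k in range(1, n)]

    order = sorted(range(n), key=context_key)
    output = "".join(block[i] for i in order)
    start = order.index(0)  # row that holds the first character of the block
    return output, start


print(bw_encode("decode"))  # ('oeecdd', 4)
```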

    Burrows-Wheeler Decoding

    Key idea: can construct the entire sorted table from the sorted column alone! First: sorting the output gives the last column of the context:

    Context  Output
    c        o
    d        e
    d        e
    e        c
    e        d
    o        d

    Burrows-Wheeler Decoding

    Now sort pairs in the last column of the context and the output column to form the last two columns of the context:

    Context  Output
    c        o
    d        e
    d        e
    e        c
    e        d
    o        d

    Context  Output
    ec       o
    ed       e
    od       e
    de       c
    de       d
    co       d

    Burrows-Wheeler Decoding

    Repeat until the entire table is complete. The pointer to the first character provides a unique decoding.

    The message was d in first position, preceded in wrapped fashion by ecode: decode.
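Rebuilding the table column by column can be written as a short loop. This sketch assumes the output column and a 0-based pointer to the row holding the first character are given; the name `bw_decode_by_sorting` is hypothetical.

```python
def bw_decode_by_sorting(output, start):
    """Rebuild the context table one column at a time, then read off the message."""
    n = len(output)
    contexts = [""] * n
    for _ in range(n - 1):
        # Append each row's output character to its known context, then
        # re-sort with the last character most significant. The result is
        # one more column of the sorted context table.
        extended = [ctx + ch for ctx, ch in zip(contexts, output)]
        contexts = sorted(extended, key=lambda s: s[::-1])
    # The first character sits in the output column of row `start`;
    # its context is the rest of the message, in wrapped fashion.
    return output[start] + contexts[start]


print(bw_decode_by_sorting("oeecdd", 4))  # decode
```

This sorts the whole table once per column, so it is useful for checking the idea rather than for decoding real blocks.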

    Burrows-Wheeler Decoding

    Optimization: don't really have to rebuild the whole context table.

    What character comes after the first character, d1?
    Just have to find d1 in the last column of the context and see what follows it: e1.

    Observation: instances of the same character of the output appear in the same order in the last column of the context. (Proof is an exercise.)

    Burrows-Wheeler: Decoding

    Output  Context  Rank
    o       c        6
    e       d        4
    e       d        5
    c       e        1
    d       e        2
    d       o        3

    "Rank" is the position of a character if it were sorted using a stable sort.
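The rank column can be computed without actually sorting the characters, by counting. A minimal sketch, assuming 1-based ranks as in the table; the name `stable_rank` is hypothetical.

```python
from collections import Counter


def stable_rank(s):
    """Return the 1-based position each character would take in a stable sort."""
    counts = Counter(s)
    # first[c] = rank of the first occurrence of character c.
    first, total = {}, 1
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    rank = []
    seen = Counter()
    for c in s:
        # Equal characters keep their original order, as in a stable sort.
        rank.append(first[c] + seen[c])
        seen[c] += 1
    return rank


print(stable_rank("oeecdd"))  # [6, 4, 5, 1, 2, 3]
```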

    Burrows-Wheeler Decode

    Function BW_Decode(In, Start, n)
        S = Move...

    Decode Example

    S    Rank(S)
    o4   6
    e2   4
    e6   5
    c3   1
    d1   2
    d5   3
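The table can be walked by following rank pointers from the start position. The decode function body is cut off in this transcript, so the loop below is an assumption based on the rank observation rather than the slide's own code; the name `bw_decode_with_rank` is hypothetical, and indices are 1-based to match the table.

```python
def bw_decode_with_rank(s, rank, start):
    """Follow the rank pointers; rank and start are 1-based, as in the table."""
    out = []
    j = start
    for _ in range(len(s)):
        out.append(s[j - 1])
        # The rank of this character is the row where it appears in the last
        # column of the context, and that row's output is the next character.
        j = rank[j - 1]
    return "".join(out)


# Start at row 5, which holds d1, the first character of the message.
print(bw_decode_with_rank("oeecdd", [6, 4, 5, 1, 2, 3], start=5))  # decode
```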

    Overview of

    ACB (Associate Coder of Buyanovsky)

    Context  Contents
             decode
    dec      ode
    d        ecode
    decod    e
    de       code
    deco     de

    Keep the dictionary sorted by context (the last character is the most significant).
    - Find the longest match for the context
    - Find the longest match for the contents
    - Code:
      - Distance between the matches in the sorted order
      - Length of the contents match

    Has aspects of Burrows-Wheeler and LZ77.