compression 4
TRANSCRIPT
-
8/9/2019 Compression 4
1/24
296.3 Page 1
CPS 296.3: Algorithms in the Real World
Data Compression III
-
Compression Outline
Introduction: Lossy vs. Lossless, Benchmarks, ...
Information Theory: Entropy, etc.
Probability Coding: Huffman + Arithmetic Coding
Applications of Probability Coding: PPM + others
Lempel-Ziv Algorithms:
- LZ77, gzip,
- LZ78, compress (Not covered in class)
Other Lossless Algorithms: Burrows-Wheeler
Lossy algorithms for images: JPEG, MPEG, ...
Compressing graphs and meshes: BBK
-
Lempel-Ziv Algorithms
LZ77 (Sliding Window)
Variants: LZSS (Lempel-Ziv-Storer-Szymanski)
Applications: gzip, Squeeze, LHA, PKZIP, ZOO
LZ78 (Dictionary Based)
Variants: LZW (Lempel-Ziv-Welch), LZC
Applications: compress, GIF, CCITT
-
LZ77: Sliding Window Lempel-Ziv
Dictionary and buffer "windows" are fixed length and slide with the cursor
Repeat:
  Output (p, l, c) where
    p = position of the longest match that starts in the dictionary (relative to the cursor)
    l = length of longest match
    c = next char in buffer beyond longest match
  Advance window by l + 1
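The loop above can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the name lz77_encode, the brute-force search, and the default window sizes are assumptions made here, and p is reported as 0 when no match exists.

```python
def lz77_encode(data, window=6, lookahead=4):
    """Return a list of (p, l, c) triples for the string `data`."""
    out = []
    cursor = 0
    while cursor < len(data):
        best_p, best_l = 0, 0
        start = max(0, cursor - window)  # left edge of the dictionary window
        # Keep one character in reserve so that c always exists.
        max_len = min(lookahead, len(data) - cursor - 1)
        for cand in range(start, cursor):
            length = 0
            # A match must start in the dictionary but may run past the cursor.
            while length < max_len and data[cand + length] == data[cursor + length]:
                length += 1
            if length > best_l:
                best_p, best_l = cursor - cand, length
        out.append((best_p, best_l, data[cursor + best_l]))
        cursor += best_l + 1  # advance window by l + 1
    return out


print(lz77_encode("aacaacabcababac"))
```

With a dictionary window of 6 and a lookahead of 4, this prints [(0, 0, 'a'), (1, 1, 'c'), (3, 4, 'b'), (3, 3, 'a'), (2, 2, 'c')]. Ties between equally long matches go to the candidate farthest from the cursor, which is an arbitrary choice of this sketch.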
[Figure: the string a a c a a c a b c a b a b a c, divided at the cursor into the dictionary (previously coded) and the lookahead buffer]
-
LZ77 Decoding
Decoder keeps same dictionary window as encoder.
For each message it looks it up in the dictionary and inserts a copy at the end of the string
What if l > p? (only part of the message is in the dictionary.)
E.g. dict = abcd, codeword = (2,9,e)
• Simply copy from left to right
    for (i = 0; i < length; i++)
        out[cursor+i] = out[cursor-offset+i]
• Out = abcdcdcdcdcdce
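A small Python sketch of the left-to-right copy, to show why l > p works: each copied character is appended before it is needed as a source. The function name lz77_decode and the prefix parameter (standing in for an already decoded dictionary) are assumptions of this sketch.

```python
def lz77_decode(triples, prefix=""):
    """Decode (p, l, c) triples, starting from already decoded text `prefix`."""
    out = list(prefix)
    for p, l, c in triples:
        start = len(out) - p  # p is the offset back from the cursor
        for i in range(l):
            # Copy one character at a time, so the source may overlap
            # the region currently being written (the l > p case).
            out.append(out[start + i])
        out.append(c)
    return "".join(out)


print(lz77_decode([(2, 9, "e")], prefix="abcd"))
```

For the example above (dict = abcd, codeword = (2,9,e)) this prints abcdcdcdcdcdce.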
-
Optimizations used by gzip (cont.)
1. Huffman code the positions, lengths and chars
2. Non greedy: possibly use shorter match so that next match is better
3. Use a hash table to store the dictionary.
- Hash keys are all strings of length 3 in the dictionary window.
- Find the longest match within the correct hash bucket.
- Puts a limit on the length of the search within a bucket.
- Within each bucket store in order of position
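Point 3 can be illustrated with a short Python sketch. It is not gzip's actual code: the names insert_position, longest_match and max_chain are hypothetical, a Python dict stands in for the hash table, and the sliding-window limit and maximum match length are left out for brevity.

```python
from collections import defaultdict

KEY_LEN = 3  # hash keys are strings of length 3, as on the slide


def insert_position(buckets, data, pos):
    """Record the length-3 string starting at `pos` in its bucket."""
    key = data[pos:pos + KEY_LEN]
    if len(key) == KEY_LEN:
        buckets[key].append(pos)  # buckets stay ordered by position


def longest_match(buckets, data, cursor, max_chain=4):
    """Return (offset, length) of the longest match found in one bucket."""
    key = data[cursor:cursor + KEY_LEN]
    best_offset, best_len = 0, 0
    candidates = buckets.get(key, [])
    # Search only the max_chain most recent positions, newest first.
    for cand in reversed(candidates[-max_chain:]):
        length = 0
        while (cursor + length < len(data)
               and data[cand + length] == data[cursor + length]):
            length += 1
        if length > best_len:
            best_offset, best_len = cursor - cand, length
    return best_offset, best_len


data = "abcXabcYabcY"
buckets = defaultdict(list)
for pos in range(8):  # positions before the cursor are in the dictionary
    insert_position(buckets, data, pos)
print(longest_match(buckets, data, 8))
```

Here the bucket for "abc" holds positions 0 and 4, and the search returns (4, 4). Limiting the search to max_chain entries trades match quality for speed, which is the point of the third bullet.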
-
Comparison to Lempel-Ziv 78
Both LZ77 and LZ78 and their variants keep a "dictionary" of recent strings that have been seen.
-
Lempel-Ziv Algorithms Summary
Adapts well to changes in the file (e.g. a [...] "second pass" and compress much better.
-
Compression Outline
Introduction: Lossy vs. Lossless, Benchmarks, ...
Information Theory: Entropy, etc.
Probability Coding: Huffman + Arithmetic Coding
Applications of Probability Coding: PPM + others
Lempel-Ziv Algorithms: LZ77, gzip, compress, ...
Other Lossless Algorithms:
- Burrows-Wheeler
- ACB
Lossy algorithms for images: JPEG, MPEG, ...
Compressing graphs and meshes: BBK
-
Burrows-Wheeler
Currently near best "balanced" algorithm for text
Breaks file into fixed-size blocks and encodes each block separately.
For each block:
- Sort each character by its full context.
-
Burrows Wheeler: Example
Let's encode: decode
Context "wraps" around. Last char is most significant.
[Figure: all rotations of the input, sorted by context]
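The sorting step can be sketched in Python as below. This is an illustrative sketch rather than the lecture's code: the name bw_encode and the returned start index (the row holding the first message character, counted from 0) are assumptions, and the quadratic key construction is only suitable for small examples.

```python
def bw_encode(block):
    """Sort each character by its wrapped context; return (output, start)."""
    n = len(block)

    def context_key(i):
        # Characters preceding position i, nearest first, so the last
        # character of the context is the most significant.
        return [block[(i - k) % n] for k in range(1, n + 1)]

    order = sorted(range(n), key=context_key)
    output = "".join(block[i] for i in order)
    start = order.index(0)  # row that holds the first character of the block
    return output, start


print(bw_encode("decode"))
```

For decode this prints ('oeecdd', 4): the output column reads o, e, e, c, d, d, and the first character of the message sits in row 4 (counting from 0).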
-
Burrows Wheeler Decoding
Key Idea: Can construct entire sorted table from sorted column alone!
First: sorting the output gives last column of context:
Context   Output
c         o
d         e
d         e
e         c
e         d
o         d
-
Burrows Wheeler Decoding
Now sort pairs in last column of context and output column to form last two columns of context:
Context   Output
c         o
d         e
d         e
e         c
e         d
o         d
Context   Output
ec        o
ed        e
od        e
de        c
de        d
co        d
-
Burrows Wheeler Decoding
Repeat until entire table is complete. Pointer to first character provides unique decoding.
Message was d in first position, preceded in wrapped fashion by ecode: decode.
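The table-rebuilding procedure from the last three slides can be sketched directly in Python. The function name bw_decode_by_table and the 0-based start index are assumptions of this sketch; it rebuilds one context column per pass and is meant only to mirror the slides, not to be efficient.

```python
def bw_decode_by_table(output, start):
    """Rebuild the context table by repeated sorting, then read the message."""
    n = len(output)
    contexts = [""] * n  # known suffix of each row's context, initially empty
    for _ in range(n - 1):
        # Append each row's output character to its known context suffix,
        # then sort with the last character most significant.
        extended = [ctx + ch for ctx, ch in zip(contexts, output)]
        contexts = sorted(extended, key=lambda s: s[::-1])
    # Row `start` holds the first message character; its wrapped context
    # is the remainder of the message.
    return output[start] + contexts[start]


print(bw_decode_by_table("oeecdd", 4))
```

The first pass yields the column c, d, d, e, e, o and the second yields ec, ed, od, de, de, co, matching the tables above. The call prints decode.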
-
Burrows Wheeler Decoding
Optimization: Don't really have to rebuild the whole context table.
What character comes after the first character, d1?
Just have to find d1 in last column of context and see what follows it: e1.
Observation: instances of same character of output appear in same order in last column of context. (Proof is an exercise.)
-
Burrows-Wheeler: Decoding
Context   Output   Rank
c         o        6
d         e        4
d         e        5
e         c        1
e         d        2
o         d        3
"Rank" is the position of a character if it were sorted using a stable sort.
-
Burrows-Wheeler Decode
Function BW_Decode(In, Start, n)
S = Move[...]
-
Decode Example
S     Rank(S)
o4    6
e2    4
e6    5
c3    1
d1    2
d5    3
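A Python sketch of rank-based decoding, following the example table. The names stable_rank and bw_decode are chosen here for illustration, indices are 0-based (so each rank is one less than in the table), and the sketch takes the output column S directly as its input.

```python
def stable_rank(s):
    """rank[j] = position of s[j] after a stable sort of s (0-based)."""
    order = sorted(range(len(s)), key=lambda j: s[j])  # sorted() is stable
    rank = [0] * len(s)
    for pos, j in enumerate(order):
        rank[j] = pos
    return rank


def bw_decode(s, start):
    """Follow the rank pointers from `start` to recover the block."""
    rank = stable_rank(s)
    out = []
    j = start
    for _ in range(len(s)):
        out.append(s[j])
        j = rank[j]  # row of this character in the last context column
    return "".join(out)


print(stable_rank("oeecdd"))
print(bw_decode("oeecdd", 4))
```

This prints [5, 3, 4, 0, 1, 2] and then decode. Starting at d1 (row 4), the pointers visit e2, c3, o4, d5, e6 in turn, without rebuilding the context table.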
-
Overview of
-
ACB (Associate Coder of Buyanovsky)
[Table: Context | Contents, showing the string decode split into context and contents]
Keep dictionary sorted by context (the last character is the most significant)
• Find longest match for context
• Find longest match for contents
• Code
  • Distance between matches in the sorted order
  • Length of contents match
Has aspects of Burrows-Wheeler, and LZ77