![Page 1: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/1.jpg)
Pushdown compression
Pilar Albert, Elvira Mayordomo, Philippe Moser
Sylvain Perifel
LIP, ENS Lyon
Paris, May 30, 2008
![Page 2: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/2.jpg)
Motivations
I New compression algorithms for structured documents (XML):behaviour depending on the current tag→ use of a stack to push and pop tags.
I Very simple algorithms→ pushdown automata.
I Easy to compress and decompress.Practical tests: better performances than zip (Lempel-Ziv).
I Need for a theoretical study.
![Page 3: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/3.jpg)
Motivations
I New compression algorithms for structured documents (XML):behaviour depending on the current tag→ use of a stack to push and pop tags.
I Very simple algorithms→ pushdown automata.
I Easy to compress and decompress.Practical tests: better performances than zip (Lempel-Ziv).
I Need for a theoretical study.
![Page 4: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/4.jpg)
Motivations
I New compression algorithms for structured documents (XML):behaviour depending on the current tag→ use of a stack to push and pop tags.
I Very simple algorithms→ pushdown automata.
I Easy to compress and decompress.Practical tests: better performances than zip (Lempel-Ziv).
I Need for a theoretical study.
![Page 5: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/5.jpg)
Outline
1. Introduction (LZ, FS)
2. Pushdown compression
3. Pushdown beats LZ
4. LZ beats pushdown
5. Conclusion
![Page 6: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/6.jpg)
Outline
1. Introduction (LZ, FS)
2. Pushdown compression
3. Pushdown beats LZ
4. LZ beats pushdown
5. Conclusion
![Page 7: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/7.jpg)
Compression
I Lossless compression.
I Compressor: injective and computable functionf : {0, 1}∗ → {0, 1}∗.
I Compression ratio on a finite word x:
ρf (x) =|f(x)|
|x |.
Compression ratio on an infinite sequence S:
ρf (S) = lim supn→∞
ρf (S[1..n]).
![Page 8: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/8.jpg)
Compression
I Lossless compression.
I Compressor: injective and computable functionf : {0, 1}∗ → {0, 1}∗.
I Compression ratio on a finite word x:
ρf (x) =|f(x)|
|x |.
Compression ratio on an infinite sequence S:
ρf (S) = lim supn→∞
ρf (S[1..n]).
![Page 9: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/9.jpg)
Compression
I Lossless compression.
I Compressor: injective and computable functionf : {0, 1}∗ → {0, 1}∗.
I Compression ratio on a finite word x:
ρf (x) =|f(x)|
|x |.
Compression ratio on an infinite sequence S:
ρf (S) = lim supn→∞
ρf (S[1..n]).
![Page 10: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/10.jpg)
Lempel-Ziv
Text to be compressed:
0 1 2 3 4 5 6 7 8 9 10
ε/
0
/
1
/
0 0
/
0 1
/
0 1 1
/
1 0
/
1 0 0
/
1 0 0 0
/
0 0 1
/
1 1
Compression result:
ε;(0, 0);(0, 1);(1, 0);(1, 1);(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 11: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/11.jpg)
Lempel-Ziv
Text to be compressed:
0
1 2 3 4 5 6 7 8 9 10
ε/0
/
1
/
0 0
/
0 1
/
0 1 1
/
1 0
/
1 0 0
/
1 0 0 0
/
0 0 1
/
1 1
Compression result:
ε;
(0, 0);(0, 1);(1, 0);(1, 1);(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 12: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/12.jpg)
Lempel-Ziv
Text to be compressed:
0 1
2 3 4 5 6 7 8 9 10
ε/0/1
/
0 0
/
0 1
/
0 1 1
/
1 0
/
1 0 0
/
1 0 0 0
/
0 0 1
/
1 1
Compression result:
ε;(0, 0);
(0, 1);(1, 0);(1, 1);(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 13: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/13.jpg)
Lempel-Ziv
Text to be compressed:
0 1 2
3 4 5 6 7 8 9 10
ε/0/1/0 0
/
0 1
/
0 1 1
/
1 0
/
1 0 0
/
1 0 0 0
/
0 0 1
/
1 1
Compression result:
ε;(0, 0);(0, 1);
(1, 0);(1, 1);(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 14: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/14.jpg)
Lempel-Ziv
Text to be compressed:
0 1 2 3
4 5 6 7 8 9 10
ε/0/1/0 0/0 1
/
0 1 1
/
1 0
/
1 0 0
/
1 0 0 0
/
0 0 1
/
1 1
Compression result:
ε;(0, 0);(0, 1);(1, 0);
(1, 1);(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 15: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/15.jpg)
Lempel-Ziv
Text to be compressed:
0 1 2 3 4
5 6 7 8 9 10
ε/0/1/0 0/0 1/0 1 1
/
1 0
/
1 0 0
/
1 0 0 0
/
0 0 1
/
1 1
Compression result:
ε;(0, 0);(0, 1);(1, 0);(1, 1);
(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 16: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/16.jpg)
Lempel-Ziv
Text to be compressed:
0 1 2 3 4 5
6 7 8 9 10
ε/0/1/0 0/0 1/0 1 1/1 0
/
1 0 0
/
1 0 0 0
/
0 0 1
/
1 1
Compression result:
ε;(0, 0);(0, 1);(1, 0);(1, 1);(4, 1);
(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 17: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/17.jpg)
Lempel-Ziv
Text to be compressed:
0 1 2 3 4 5 6 7 8 9 10
ε/0/1/0 0/0 1/0 1 1/1 0/1 0 0/1 0 0 0/0 0 1/1 1
Compression result:
ε;(0, 0);(0, 1);(1, 0);(1, 1);(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 18: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/18.jpg)
Lempel-Ziv
Text to be compressed:
0 1 2 3 4 5 6 7 8 9 10
ε/0/1/0 0/0 1/0 1 1/1 0/1 0 0/1 0 0 0/0 0 1/1 1
Compression result:
ε;(0, 0);(0, 1);(1, 0);(1, 1);(4, 1);(2, 0);(6, 0);(7, 0);(3, 1);(2, 1)
Lemma
I If p is the number of phrases, then |LZ(x)| = p log p.
I For all x, the compression ratio ρLZ(x) satisfies
log |x |√|x |≤ ρLZ(x) ≤ 1 + o(1).
![Page 19: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/19.jpg)
Finite-state compression (1)
Finite-state transducer: finite-state automaton that outputs lettersat each transition→ function f : {0, 1}∗ → {0, 1}∗.
0 / ε
1 / 01
0 / 0
1 / 011
00 7→ 0, 01 7→ 01, 1 7→ 011.
Example: 00 00 01 1 1 00 7→ 0 0 01 011 011 0.
Finite-state compressor: injective finite-state transducer (takinginto account the final state).
![Page 20: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/20.jpg)
Finite-state compression (1)
Finite-state transducer: finite-state automaton that outputs lettersat each transition→ function f : {0, 1}∗ → {0, 1}∗.
0 / ε
1 / 01
0 / 0
1 / 011
00 7→ 0, 01 7→ 01, 1 7→ 011.
Example: 00 00 01 1 1 00 7→ 0 0 01 011 011 0.
Finite-state compressor: injective finite-state transducer (takinginto account the final state).
![Page 21: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/21.jpg)
Finite-state compression (2)
For a finite-state compressor C: compression ratio of an infinitesequence S
ρC(S) = lim supn→∞
|C(S[1..n])|
n.
Finite-state compression ratio:
ρFS(S) = infC∈FS
ρC(S).
![Page 22: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/22.jpg)
Finite-state compression (2)
For a finite-state compressor C: compression ratio of an infinitesequence S
ρC(S) = lim supn→∞
|C(S[1..n])|
n.
Finite-state compression ratio:
ρFS(S) = infC∈FS
ρC(S).
![Page 23: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/23.jpg)
LZ better than FS
Theorem (Lempel, Ziv, 1979)
On every infinite sequence S ∈ {0, 1}N, Lempel-Ziv is better thanany finite-state compressor, that is,
ρLZ(S) ≤ ρFS(S).
![Page 24: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/24.jpg)
Outline
1. Introduction (LZ, FS)
2. Pushdown compression
3. Pushdown beats LZ
4. LZ beats pushdown
5. Conclusion
![Page 25: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/25.jpg)
Pushdown transducers
Pushdown compressor = finite-state transducer with a stack.
The transition is done according both to the symbol read and to thetopmost symbol of the stack.
Each transition either pushes or pops symbols from the stack.
PD compression ratio: ρPD(S) = infC∈PD ρC(S).
Two variants: with or without endmarkers→ C(x) or C(x#) (enables to empty the stack).
![Page 26: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/26.jpg)
Pushdown transducers
Pushdown compressor = finite-state transducer with a stack.
The transition is done according both to the symbol read and to thetopmost symbol of the stack.
Each transition either pushes or pops symbols from the stack.
PD compression ratio: ρPD(S) = infC∈PD ρC(S).
Two variants: with or without endmarkers→ C(x) or C(x#) (enables to empty the stack).
![Page 27: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/27.jpg)
Pushdown transducers
Pushdown compressor = finite-state transducer with a stack.
The transition is done according both to the symbol read and to thetopmost symbol of the stack.
Each transition either pushes or pops symbols from the stack.
PD compression ratio: ρPD(S) = infC∈PD ρC(S).
Two variants: with or without endmarkers→ C(x) or C(x#) (enables to empty the stack).
![Page 28: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/28.jpg)
Example
Proposition
Let S = 0∞.
I The compression ratio on S of a finite-state compressor withk states is ≥ 1/k .
I There exists a pushdown compressor with k states whosecompression ratio on S is ≤ 1/k 2 (with endmarkers).
Proof.
![Page 29: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/29.jpg)
Example
Proposition
Let S = 0∞.
I The compression ratio on S of a finite-state compressor withk states is ≥ 1/k .
I There exists a pushdown compressor with k states whosecompression ratio on S is ≤ 1/k 2 (with endmarkers).
Proof.
1. Let C be a FS compressor with k states.
Then C must output at least one symbol every k letters.
Otherwise there would exist u such that for all i0 ≤ i ≤ i0 + k ,all the u[1..i] have the same image.
Since there are only k states, this contradicts injectivity.
![Page 30: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/30.jpg)
Example
Proposition
Let S = 0∞.
I The compression ratio on S of a finite-state compressor withk states is ≥ 1/k .
I There exists a pushdown compressor with k states whosecompression ratio on S is ≤ 1/k 2 (with endmarkers).
Proof.
2. Let C be the following pushdown compressor on input 0n:
I it pushes 0n/k on the stack (by counting modulo k );
I at the end it pops the stack and outputs one symbol every k(by counting modulo k ).
�
![Page 31: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/31.jpg)
Remarks
I Same result as FS for pushdown without endmarkers.
I LZ on S = 0∞ has compression ratio 0. . .
I but FS also!
ρFS(S) = infC∈FS
ρC(S) ≤ 1/k for all k .
![Page 32: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/32.jpg)
Remarks
I Same result as FS for pushdown without endmarkers.
I LZ on S = 0∞ has compression ratio 0. . .
I but FS also!
ρFS(S) = infC∈FS
ρC(S) ≤ 1/k for all k .
![Page 33: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/33.jpg)
Remarks
I Same result as FS for pushdown without endmarkers.
I LZ on S = 0∞ has compression ratio 0. . .
I but FS also!
ρFS(S) = infC∈FS
ρC(S) ≤ 1/k for all k .
![Page 34: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/34.jpg)
Outline
1. Introduction (LZ, FS)
2. Pushdown compression
3. Pushdown beats LZ
4. LZ beats pushdown
5. Conclusion
![Page 35: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/35.jpg)
Pushdown beats LZ
Theorem
There exists a sequence S such that
ρPD(S) = 1/2 (without endmarkers)
andρLZ(S) = 1.
![Page 36: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/36.jpg)
The idea
Pushdown compresses palindromes with ratio ' 1/2. . .
but LZ not always.
→ build a sequence of the form
S = u1u1u2u2 . . .
with well-chosen words ui (here u stands for the mirror of u).
![Page 37: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/37.jpg)
The idea
Pushdown compresses palindromes with ratio ' 1/2. . .
but LZ not always.
→ build a sequence of the form
S = u1u1u2u2 . . .
with well-chosen words ui (here u stands for the mirror of u).
![Page 38: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/38.jpg)
Proof
Let En ⊂ {0, 1}n be the set of words of size n that are notpalindromes. Let u1, . . . , u|En |/2 be |En |/2 words of En such that∀i, j, ui , uj . Then
u1 . . . u|En |/2 u|En |/2 . . . u1
is LZ-incompressible but 1/2-PD-compressible.→ repeat this for all sizes n to obtain the infinite sequence S. �
![Page 39: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/39.jpg)
Outline
1. Introduction (LZ, FS)
2. Pushdown compression
3. Pushdown beats LZ
4. LZ beats pushdown
5. Conclusion
![Page 40: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/40.jpg)
LZ beats pushdown
Theorem
There exists a sequence S such that
ρLZ(S) = 0
andρPD(S) = 1 (with endmarkers).
![Page 41: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/41.jpg)
The idea
LZ compresses repetitions very well (ratio tends to 0). . .
but pushdown not always.
I Show that some repetitions are not compressed by pushdown(→ pumping lemma);
I build a sequence of the form
S = un11 un2
2 . . .
for well chosen ui and ni .
![Page 42: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/42.jpg)
The idea
LZ compresses repetitions very well (ratio tends to 0). . .
but pushdown not always.
I Show that some repetitions are not compressed by pushdown(→ pumping lemma);
I build a sequence of the form
S = un11 un2
2 . . .
for well chosen ui and ni .
![Page 43: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/43.jpg)
LZ and repetitions
Lemma
Let u be a word.The compression ratio of LZ on un is O( log n
√n
) (and thus tends to 0when n → ∞).
Proof.For all k , there are at most |u| different words of size k in un.Call p the number of phrases in the parsing of un by LZ algorithm.Let tk be the number of phrases of size k . We have:
|un | =∑k≥1
tk ≥p/|u|∑k=1
k |u| ≥p2
2|u|.
Thus p = O(√
n) and |LZ(x)| = p log p. �
![Page 44: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/44.jpg)
LZ and repetitions
Lemma
Let u be a word.The compression ratio of LZ on un is O( log n
√n
) (and thus tends to 0when n → ∞).
Proof.For all k , there are at most |u| different words of size k in un.Call p the number of phrases in the parsing of un by LZ algorithm.Let tk be the number of phrases of size k . We have:
|un | =∑k≥1
tk ≥p/|u|∑k=1
k |u| ≥p2
2|u|.
Thus p = O(√
n) and |LZ(x)| = p log p. �
![Page 45: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/45.jpg)
PD and repetitions
Let C be a pushdown compressor.
Suppose there is a pumping lemma: on input uvnw, C has eachtime the same behaviour on v.
If v is not compressible, then C(uvnw) ≥ n|v |,thus ρC(uvnw)→ 1.
![Page 46: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/46.jpg)
Pumping lemma
Theorem
Let A be a pushdown transducer (working without endmarkers).There exist two constants α, β > 0 such that all word w can be cutin three pieces w = tuv satisfying:
I |u| ≥ bα|w |βc;
I if C(tuv) = xyz then C(tun) = xyn.
![Page 47: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/47.jpg)
Acceptors: reminder
Equivalence pushdown automata / context-free grammars.
S
w
A
A
x y z t u
A
y z t
![Page 48: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/48.jpg)
Acceptors: reminder
Equivalence pushdown automata / context-free grammars.
S
w
A
A
x y z t u
A
y z t
![Page 49: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/49.jpg)
Acceptors: reminder
Equivalence pushdown automata / context-free grammars.
S
w
A
A
x y z t u
A
y z t
![Page 50: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/50.jpg)
Acceptors: reminder
Equivalence pushdown automata / context-free grammars.
S
w
A
A
x y z t u
A
y z t
![Page 51: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/51.jpg)
Transducers: proof (1)
I Lifetime of a column: never go below its top symbol.
I Equivalent columns: c′ in the lifetime of c and same state/topsymbol.
. . .
Z
(1)
Z
...
Y
. . .
Z Z ′
...
Z
. . .
Z ′W
Z ′
. . .
State: q1 q2 q3 q1
(2)
. . .
Lifetime of column (1)
Lifetime of column (2)
![Page 52: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/52.jpg)
Transducers: proof (1)
I Lifetime of a column: never go below its top symbol.
I Equivalent columns: c′ in the lifetime of c and same state/topsymbol.
. . .
Z
(1)
Z
...
Y
. . .
Z Z ′
...
Z
. . .
Z ′W
Z ′
. . .
State: q1 q2 q3 q1
(2)
. . .
Lifetime of column (1)
Lifetime of column (2)
![Page 53: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/53.jpg)
Transducers: proof (2)
I p: number of pairs state/top symbol;
I k : max number of symbols pushed by one rule;
I L(p, k , d): maximum lifetime of a column during which no pairof equivalent columns are at distance ≥ d.
I L(p + 1, k , d) = d + kdL(p, k , d).
Distance < d of c
d − 1
Z0 Z0
Y1
...
Ydk
. . .
Z0
Y1
...
Ydk−1. . .
Z0
Y1 . . .Z0
c c1 c2 cdk
Lifetime of cLifetime of c1 c1, . . . , cdk−1 Lifetime of cdk
≤ L(p, k , d)
≤ L(p, k , d)
![Page 54: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/54.jpg)
Transducers: proof (2)
I p: number of pairs state/top symbol;
I k : max number of symbols pushed by one rule;
I L(p, k , d): maximum lifetime of a column during which no pairof equivalent columns are at distance ≥ d.
I L(p + 1, k , d) = d + kdL(p, k , d).
Distance < d of c
d − 1
Z0 Z0
Y1
...
Ydk
. . .
Z0
Y1
...
Ydk−1. . .
Z0
Y1 . . .Z0
c c1 c2 cdk
Lifetime of cLifetime of c1 c1, . . . , cdk−1 Lifetime of cdk
≤ L(p, k , d)
≤ L(p, k , d)
![Page 55: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/55.jpg)
The endmarker
Theorem
Let A be a pushdown transducer (working with endmarkers).There exist two constants α, β > 0 such that all word w can be cutin three pieces w = tuv satisfying:
I |u| ≥ bα|w |βc;
I there are five words x, x′, y, y′, z such thatC(tunv#) = xynzy′nx′.
Remark.The same is true with an initially nonempty stack.
![Page 56: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/56.jpg)
LZ beats PD
Theorem
There exists a sequence S such that
ρLZ(S) = 0 and ρPD(S) = 1 (with endmarkers).
Proof.Let wi be a sufficiently big Kolmogorov-random word→ cut it in three pieces wi = tiuivi , with ui big enough (thusincompressible), according to the i-th pushdown transducers. Then
S = t1un11 t2un2
2 . . .
(for sufficiently large integers ni) is LZ-compressible but notPD-compressible. �
![Page 57: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/57.jpg)
LZ beats PD
Theorem
There exists a sequence S such that
ρLZ(S) = 0 and ρPD(S) = 1 (with endmarkers).
Proof.Let wi be a sufficiently big Kolmogorov-random word→ cut it in three pieces wi = tiuivi , with ui big enough (thusincompressible), according to the i-th pushdown transducers. Then
S = t1un11 t2un2
2 . . .
(for sufficiently large integers ni) is LZ-compressible but notPD-compressible. �
![Page 58: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/58.jpg)
Outline
1. Introduction (LZ, FS)
2. Pushdown compression
3. Pushdown beats LZ
4. LZ beats pushdown
5. Conclusion
![Page 59: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/59.jpg)
Summary
I Introduction of pushdown compression.
I Strictly better than finite-state compression.
I Better than Lempel-Ziv on some sequences (palindromes:compression ratio 1/2 instead of 1).
I Worse than Lempel-Ziv on some sequences (repetitions:compression ratio 1 instead of 0).
![Page 60: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/60.jpg)
Summary
I Introduction of pushdown compression.
I Strictly better than finite-state compression.
I Better than Lempel-Ziv on some sequences (palindromes:compression ratio 1/2 instead of 1).
I Worse than Lempel-Ziv on some sequences (repetitions:compression ratio 1 instead of 0).
![Page 61: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/61.jpg)
Summary
I Introduction of pushdown compression.
I Strictly better than finite-state compression.
I Better than Lempel-Ziv on some sequences (palindromes:compression ratio 1/2 instead of 1).
I Worse than Lempel-Ziv on some sequences (repetitions:compression ratio 1 instead of 0).
![Page 62: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/62.jpg)
Future work
I Lower bound on the compression ratio of a PD compressorwith k states with endmarkers?
I Better separation for “PD beats LZ”?
![Page 63: Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifelsperifel/recherche/paris.pdf · Pilar Albert, Elvira Mayordomo, Philippe Moser Sylvain Perifel LIP, ENS Lyon Paris, May](https://reader036.vdocument.in/reader036/viewer/2022081614/5fbff0d72a2704644464f7df/html5/thumbnails/63.jpg)
Outline
1. Introduction (LZ, FS)
2. Pushdown compression
3. Pushdown beats LZ
4. LZ beats pushdown
5. Conclusion