language-processing problemsdimacs.rutgers.edu/workshops/lattices/slides/backhouse2.pdf · 2003. 7....
TRANSCRIPT
![Page 1: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/1.jpg)
1
Language-Processing Problems
Roland Backhouse
DIMACS, 8th July, 2003
![Page 2: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/2.jpg)
2
Introduction
“Factors” and the “factor matrix” were introduced by Conway(1971).
He used them very effectively in, for example, constructingbiregulators.
Conway’s discussion is wordy, making it difficult to understand.There are also occasional errors which are difficult to detect and addto the confusion. (“The theorem does prevent E from occurringtwice” should read “The theorem does not prevent E from occurringtwice.”)
![Page 3: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/3.jpg)
3
KMP Failure Function (pattern aabaa)
node i 1 2 3 4 5
failure node f(i) 0 1 0 1 2
Factor Graph (language Σ∗aabaa)
b ���� ���� ���� ���� ���� ����- - - - -a a b a a
0 1 2 3 4 5��-
ε6
ε?
ε
6
ε?
ε6
![Page 4: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/4.jpg)
4
Language Problems
S ::= aSS | ε .
Is-empty
S=φ ≡ ({a}=φ ∨ S=φ ∨ S=φ) ∧ {ε}=φ .
Nullable
ε∈S ≡ (ε∈ {a} ∧ ε∈S ∧ ε∈S) ∨ ε∈ {ε} .
Shortest word length
#S = (#a + #S + #S) ↓ #ε .
Non-Example
aa ∈ S 6≡ (aa∈ {a} ∧ aa∈S ∧ aa∈S) ∨ aa ∈ {ε} .
![Page 5: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/5.jpg)
5
Fusion
Many problems are expressed in the form
evaluate ◦ generate
where generate generates a (possibly infinite) candidate set ofsolutions, and evaluate selects a best solution.
Examples:shortest ◦ path ,
(x∈) ◦ L .
Solution method is to fuse the generation and evaluation processes,eliminating the need to generate all candidate solutions.
![Page 6: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/6.jpg)
6
Conditions for Fusion
Fusion is made possible when
• evaluate is an adjoint in a Galois connection,
• generate is expressed as a fixed point.
Algorithms for solving resulting fixed point equation include
• brute-force iteration,
• Knuth’s generalisation of Dijkstra’s shortest path algorithm. .
Solution method typically involves generalising the problem.
![Page 7: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/7.jpg)
7
Galois Connections
Suppose A=(A,v) and B=(B,�) are partially ordered sets andsuppose F∈A←B and G∈B←A . Then (F ,G) is a Galois connectionof A and B iff, for all x∈B and y∈A,
F(x) v y ≡ x � G(y) .
Examples
Negation:
¬p⇒q ≡ p⇐¬q .
Ceiling function:
dxe ≤n ≡ x≤n .
Maximum:
x↑y ≤ z ≡ x≤ z ∧ y≤ z .
Even (divisible by two):
if b→2 2 ¬b→1 fi \ m ≡ b⇒ even(m) .
![Page 8: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/8.jpg)
8
Parsing
x∈S ⇒ b ≡ S ⊆ if b → Σ∗ 2 ¬b → Σ∗− {x} fi .
Shortest Word (Path)
Let Σ≥k denote the set of all words over alphabet Σ of length at leastk.
Let #S denote the length of a shortest word in the language S.
#S ≥ k ≡ S ⊆ Σ≥k .
(Most common application is when S is the set of paths from onenode to another in a graph.)
![Page 9: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/9.jpg)
9
Fusion Theorem
F(µ�g) = µvh
provided that
• F is a lower adjoint in a Galois connection of v and � (see briefsummary of definition below)
• F ◦g = h ◦ F .
Galois Connection
F(x) v y ≡ x � G(y) .
F is called the lower adjoint and G the upper adjoint.
![Page 10: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/10.jpg)
10
Language Recognition
Problem: For given word x and grammar G, determine x∈L(G).That is, implement
(x∈) ◦ L .
Language L(G) is the least fixed point (with respect to the subsetrelation) of a monotonic function.
(x∈) is the lower adjoint in a Galois connection of languages (orderedby the subset relation) and booleans (ordered by implication).(Recall,
x∈S ⇒ b ≡ S ⊆ if b → Σ∗ 2 ¬b → Σ∗− {x} fi .)
![Page 11: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/11.jpg)
11
Nullable Languages
Problem: For given grammar G, determine ε∈L(G).
(ε∈) ◦ L
Solution: Easily expressed as a fixed point computation.
Works because:
• The function (x∈) is a lower adjoint in a Galois connection (forall x, but in particular for x= ε).
• For all languages S and T ,
ε∈S·T ≡ ε∈S ∧ ε∈ T .
![Page 12: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/12.jpg)
12
Problem Generalisation
Problem: For given grammar G, determine whether all words in L(G)
have even length. I.e. implement
alleven ◦ L .
The function alleven is a lower adjoint in a Galois connection.Specifically, for all languages S and T ,
alleven(S)⇐ b ≡ S ⊆ if ¬b → Σ∗ 2 b→ (Σ·Σ)∗ fi .
Nevertheless, fusion doesn’t work (directly) because
• there is no ⊗ such that, for all languages S and T ,
alleven(S·T) ≡ alleven(S) ⊗ alleven(T) .
Solution: Generalise by tupling: compute simultaneously alleven andallodd.
![Page 13: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/13.jpg)
13
General Context-Free Parsing
Problem: For given grammar G, determine x∈L(G).
(x∈) ◦ L .
Not (in general) expressible as a fixed point computation.
Fusion fails because: for all x, x 6= ε, there is no ⊗ such that, for alllanguages S and T ,
x∈S·T ≡ (x∈S) ⊗ (x∈ T) .
CYK: Let F(S) denote the relation 〈i, j:: x[i..j)∈S〉.
Works because:
• The function F is a lower adjoint.
• For all languages S and T ,
F(S·T) = F(S) • F(T)
where B•C denotes the composition of relations B and C.
![Page 14: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/14.jpg)
14
Language Inclusion
Problem: For fixed (regular) language E and varying S, determine
S ⊆ E .
Example: Emptiness test:S ⊆ φ .
Example: Pattern Matching: given pattern P, for each prefix t of textT , evaluate:
{t} ⊆ Σ∗ · {P} .
Example: All words are of even length:
S ⊆ (Σ·Σ)∗ .
![Page 15: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/15.jpg)
15
Language Inclusion
Problem: For fixed (regular) language E and varying S, determine
S ⊆ E .
• Function (⊆E) is a lower adjoint. Specifically,
S⊆E⇐ b ≡ S ⊆ if b→E 2 ¬b→Σ∗ fi .
• But, for E 6=φ and E 6=Σ∗, there is no ⊗ such that, for alllanguages S and T ,
S·T ⊆ E ≡ (S⊆E) ⊗ (T ⊆E) .
Solution (Oege de Moor): Use factor theory to derive generalisation.
![Page 16: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/16.jpg)
16
Factors
For all languages S, T and U,
S·T ⊆ U ≡ T ⊆ S\U ,
S·T ⊆ U ≡ S ⊆ U/T .
Note:S\(U/T) = (S\U)/T .
Hence, writeS\U/T .
![Page 17: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/17.jpg)
17
Left and Right Factors
Define the functions / and . by
X/ = E/X ,
X. = X\E .
By definition, the range of / is the set of left factors of E and therange of . is the set of right factors of E.
We also have the Galois connection:
X⊆Y/ ≡ Y⊆X. .
Hence,
X/./ = X/ ,
X./. = X. ,
E/. = E = E./ .
![Page 18: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/18.jpg)
18
The Factor Matrix
Let L denote the set of left factors of E.
Define the factor matrix of E to be the binary operator \ restricted toL×L. Thus entries in the matrix take the form L0\L1 where L0 andL1 are left factors of E.
The factor matrix of E is denoted by [[E]]. It is a reflexive, transitivematrix.
[[E]] = [[E]]∗ .
The row and column containing individual factors, the left factors,the right factors, and E itself, is given by:
U\E/V = U./\V/ ,
V/ = E/\V/ ,
U. = U./\E./ ,
E = E/ \ E./ .
![Page 19: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/19.jpg)
19
Using the Factor Matrix
Problem: For fixed regular language E and varying S, determine
S ⊆ E .
Generalisation: For fixed regular language E and varying S,determine the relation
S ⊆ [[E]] .
(Formally, the relation 〈L,M:: S ⊆ L\M〉 where L and M range overthe left factors of E.)
Works because:
S·T ⊆ [[E]] ≡ (S⊆ [[E]]) • (T ⊆ [[E]]) .
where B•C denotes the composition of relations B and C.
![Page 20: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/20.jpg)
20
Proof
We have to show that
S·T ⊆ U/\W/ ≡ 〈∃V :: S ⊆ U/\V/ ∧ T ⊆ V/\W/〉 .
First,
S·T ⊆E
= { unit of conjunction }
S·T ⊆E ∧ true
= { factors, T/=E/T ; cancellation }
S⊆ T/ ∧ T/ · T ⊆ E
= { factors, T/. = T/\E }
S⊆ T/ ∧ T ⊆ T/. .
Whence:
![Page 21: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/21.jpg)
21
S·T ⊆ U/\W/
= { factors, definition of W/ }
U/ ·S · T ·W ⊆ E
= { above, with S,T := U/ ·S , T ·W }
U/ ·S ⊆ (T ·W)/ ∧ T ·W ⊆ (T ·W)/.
= { factors }
S ⊆ U/\ (T ·W)/ ∧ T ⊆ (T ·W)/./W
= { U./W = U\W/ }
S ⊆ U/\ (T ·W)/ ∧ T ⊆ (T ·W)/\W/ .⇒ { one-point rule }
〈∃V :: S ⊆ U/\V/ ∧ T ⊆ V/\W/〉⇒ { Leibniz }
〈∃V :: S·T ⊆ U/\V/ · V/\W/〉⇒ { cancellation, }
S·T ⊆ U/\W/ .
![Page 22: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/22.jpg)
22
Summary
• Use of fusion as programming method.
• Problem generalisation involves generalising the algebra inthe solution domain.
• Factor theory as basis for language inclusion problems.
Challenges
• Efficient computation of factor matrices.
• Extension to non-regular languages.
![Page 23: Language-Processing Problemsdimacs.rutgers.edu/Workshops/Lattices/slides/backhouse2.pdf · 2003. 7. 15. · References 23 J.H. Conway, “Regular Algebra and Finite Machines”, Chapman](https://reader034.vdocument.in/reader034/viewer/2022051901/5ff098ec648229572c0389d3/html5/thumbnails/23.jpg)
23
References
J.H. Conway, “Regular Algebra and Finite Machines”, Chapman andHall, London, 1971.
Backhouse, R.C and Carre, B.A. “Regular algebra applied topath-finding problems”, J. Institute of Mathematics and itsApplications, vol. 15, pp. 161–186, 1975.
Backhouse, R.C. and Lutz, R.K., “Factor graphs, failure functionsand bi-trees”, Automata, Languages and Programming, LNCS 52,pp. 61–75, 1977.
Roland Backhouse, “Fusion on Languages”, 10th EuropeanSymposium on Programming, LNCS 2028, pp. 107–121, 2001.
O. de Moor, S. Drape, D. Lacey and G. Sittampalam. Incrementalprogram analysis via language factors.(www.comlab.ox.ac.uk/oucl/work/oege.demoor/pubs.htm)
For related publications on fixed points, Galois connections andmathematics of program construction, seewww.cs.nott.ac.uk/~rcb/papers