Structurally Recursive Descent Parsing
Nils Anders Danielsson (Nottingham)
Joint work with Ulf Norell (Chalmers)
Parser combinators
- Parser combinator libraries are great!
- Elegant code.
- Executable grammars.
- Easy to abstract out recurring patterns.
- Light-weight.
- Nowadays often fast enough.
Simple example
  expr = + $ term <∗ sym '+' <∗> expr
       | term
  term = ...
Simple example
  expr = + $ expr <∗ sym '+' <∗> term
       | term
  term = ...
⊥
Risk of non-termination
- Combinator parsing is not guaranteed to terminate.
- Most combinator parsers fail for left-recursive grammars.
- Executable grammars?
- Some errors are not caught at compile-time.
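The non-termination risk can be reproduced with a few lines of Haskell. The sketch below is a minimal list-of-successes parser written for illustration only; `Parser`, `sat`, `sym`, `term`, `expr` and `exprL` are assumed names, not the API of any particular library. The right-recursive `expr` terminates, while the left-recursive `exprL` type checks but loops as soon as it is run.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (digitToInt, isDigit)

-- A minimal list-of-successes parser (illustrative, not a real library).
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [ (f a, rest) | (a, rest) <- p s ]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [ (f a, s2) | (f, s1) <- pf s, (a, s2) <- pa s1 ]

instance Alternative Parser where
  empty = Parser (const [])
  Parser p <|> Parser q = Parser $ \s -> p s ++ q s

-- Accept one character satisfying the predicate.
sat :: (Char -> Bool) -> Parser Char
sat ok = Parser $ \s -> case s of
  (c : cs) | ok c -> [(c, cs)]
  _               -> []

sym :: Char -> Parser Char
sym c = sat (== c)

-- Assumption: a term is a single digit, to keep the example short.
term :: Parser Int
term = digitToInt <$> sat isDigit

-- Right-recursive: a token is consumed before expr is re-entered.
expr :: Parser Int
expr = (+) <$> term <* sym '+' <*> expr <|> term

-- Left-recursive: accepted by the type checker, but running it re-enters
-- exprL on the same input without consuming anything, so it diverges.
exprL :: Parser Int
exprL = (+) <$> exprL <* sym '+' <*> term <|> term
```

Evaluating `runParser expr "1+2"` yields the full parse together with the shorter one that stops after the first term; `runParser exprL "1+2"` never returns.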
Another problem
- All interesting grammars are cyclic:
  expr = + $ term <∗ sym '+' <∗> expr
       | term
- Cyclic values are hard to understand and reason about.
- How do you implement combinator parsing in a language which requires structural recursion?
Our solution
- Remove cycles by representing grammars as functions from non-terminals to parsers:
  Grammar tok nt = nt res → Parser tok nt res
- Rule out left recursion by restricting the types.
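The first idea, a grammar as a non-recursive function from non-terminals to parsers, can be approximated in Haskell with GADTs. This is only a sketch under stated assumptions: the constructor names (`Return`, `Sat`, `Ap`, `Alt`, `NonTerm`), the non-terminals `Expr` and `Term`, and the interpreter `parse` are made up for illustration, and the termination indices are left out entirely, so nothing here rules out left recursion.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}

import Data.Char (digitToInt, isDigit)

-- Parsers are first-order data; NonTerm plays the role of a reference
-- to a non-terminal, so parser values themselves contain no cycles.
data Parser nt a where
  Return  :: a -> Parser nt a
  Sat     :: (Char -> Bool) -> Parser nt Char
  Ap      :: Parser nt (a -> b) -> Parser nt a -> Parser nt b
  Alt     :: Parser nt a -> Parser nt a -> Parser nt a
  NonTerm :: nt a -> Parser nt a

-- A grammar maps every non-terminal to its right-hand side.
type Grammar nt = forall a. nt a -> Parser nt a

-- Hypothetical non-terminals, indexed by their result type.
data NT a where
  Expr :: NT Int
  Term :: NT Int

-- g is not recursive: it only mentions the names Expr and Term.
g :: Grammar NT
g Expr = Alt (Return plus `Ap` NonTerm Term `Ap` Sat (== '+') `Ap` NonTerm Expr)
             (NonTerm Term)
  where
    plus :: Int -> Char -> Int -> Int
    plus x _ y = x + y
g Term = Return digitToInt `Ap` Sat isDigit

-- A naive backtracking interpreter. Without indices it is NOT structurally
-- recursive: the NonTerm case may loop on a left-recursive grammar.
parse :: Grammar nt -> Parser nt a -> String -> [(a, String)]
parse _  (Return a)  s = [(a, s)]
parse _  (Sat ok)    s = case s of
  (c : cs) | ok c -> [(c, cs)]
  _               -> []
parse gr (Ap pf pa)  s =
  [ (f a, s2) | (f, s1) <- parse gr pf s, (a, s2) <- parse gr pa s1 ]
parse gr (Alt p q)   s = parse gr p s ++ parse gr q s
parse gr (NonTerm n) s = parse gr (gr n) s
```

All recursion now lives in `parse`, which looks up a non-terminal's definition only when it is actually reached, for example in `parse g (NonTerm Expr) "1+2"`.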
Examples
Example
- Non-terminals:
  data NT : ParserType where
    expr : NT ℕ
    term : NT ℕ
- Result type: ℕ.
- Indices ensuring termination: inferred automatically.
Example
  g : Grammar Char NT
  g expr = + $ ! term <∗ sym '+' <∗> ! expr
         | ! term
  g term = ...
- Note: g is not recursive.
- ! turns a non-terminal into a parser.
- Uses applicative functor interface.
- Monadic interface also possible.
Abstraction
- Much of the flavour of ordinary combinator parsers is preserved.
- Abstraction requires a little work, though.
Abstraction
  data NT : ParserType where
    lib  : L.Nonterminal NT i r → NT r
    expr : NT ℕ
    term : NT ℕ
    op   : NT (ℕ → ℕ → ℕ)

  open L.Combinators lib

  g : Grammar Char NT
  g (lib p) = libraryGrammar p
  g expr    = chainl1 (! term) (! op)
  g term    = number | parenthesised (! expr)
  g op      = + <$ sym '+'
            | − <$ sym '-'
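For readers more familiar with Haskell, the role of `chainl1` can be illustrated with the `Text.ParserCombinators.ReadP` module from `base`. This swaps in a different library and is not the talk's Agda code; `term`, `op`, `expr` and `evalExpr` are names chosen for this sketch. It shows how left-associative operators are handled without writing a left-recursive rule.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, chainl1, char, munch1, readP_to_S, (+++))

-- A term is a non-empty run of digits.
term :: ReadP Int
term = read <$> munch1 isDigit

-- An operator parser returns the function to apply.
op :: ReadP (Int -> Int -> Int)
op = ((+) <$ char '+') +++ ((-) <$ char '-')

-- chainl1 parses term (op term)* and folds the results to the left,
-- so no left-recursive rule is needed for left associativity.
expr :: ReadP Int
expr = chainl1 term op

-- Keep only the parses that consume the whole input.
evalExpr :: String -> [Int]
evalExpr s = [ v | (v, "") <- readP_to_S expr s ]
```

With this definition `evalExpr "10-2-3"` groups the input as `(10-2)-3`.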
Running a parser
  parse : Parser tok nt i result → Grammar tok nt → [tok] → [result × [tok]]
How does it work?
Indices
Parsers are indexed on two things:
  Index = Empty × Corners
Empty: Does the parser accept the empty string?
Corners: A tree representation of the proper left corners of the parser.
Indices
Empty: Does the parser accept the empty string?
  Empty = Bool
Corners: Represents all positions in the grammar in which the parser must not recurse to itself.
  data Corners : Set where
    leaf : Corners
    step : Corners → Corners
    node : Corners → Corners → Corners
Some basic combinators
  <∗> : Parser (e1, c1) → Parser (e2, c2) →
        Parser (e1 ∧ e2, if e1 then node c1 c2 else c1)
  |   : Parser (e1, c1) → Parser (e2, c2) →
        Parser (e1 ∨ e2, node c1 c2)
  !   : nt (e, c) → Parser (e, step c)
This does not type check:
  grammar : nt (e, c) res → Parser tok nt (e, c) res
  grammar rec = ! rec
Reason: c ≠ step c.
This works, though:
  grammar : nt (e, c) res → Parser tok nt (e, c) res
  grammar rec = sym c ∗> ! rec
Reason: sym c must consume a token.
Indirect left recursion also fails:
  grammar : nt (e, c) res → Parser tok nt (e, c) res
  grammar rec   = ! other <∗> ...
  grammar other = many p <∗> ! rec
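The index arithmetic behind these examples can be replayed at the value level in Haskell. This is a sketch, not the dependently typed implementation: `seqIx`, `altIx`, `bangIx`, `symIx`, `size` and `consistent` are illustrative names, and two points are assumptions rather than facts from the slides: that `sym` has index `(False, Leaf)`, and that `∗>` is indexed like `<∗>`.

```haskell
-- Value-level copy of the Corners tree from the slides.
data Corners = Leaf | Step Corners | Node Corners Corners
  deriving (Eq, Show)

type Index = (Bool, Corners)

-- Index of p1 <*> p2 and of p1 | p2, following the types on the slides.
seqIx, altIx :: Index -> Index -> Index
seqIx (e1, c1) (e2, c2) = (e1 && e2, if e1 then Node c1 c2 else c1)
altIx (e1, c1) (e2, c2) = (e1 || e2, Node c1 c2)

-- Index of (! nt) for a non-terminal with the given index.
bangIx :: Index -> Index
bangIx (e, c) = (e, Step c)

-- Assumption: a token parser never accepts the empty string and has
-- no left corners.
symIx :: Index
symIx = (False, Leaf)

-- Number of constructors; Step c is always strictly larger than c,
-- which is why the equation c = step c has no finite solution.
size :: Corners -> Int
size Leaf       = 1
size (Step c)   = 1 + size c
size (Node l r) = 1 + size l + size r

-- A rule "nt = body" is consistent when the index declared for nt
-- equals the index computed for its body.
consistent :: Index -> (Index -> Index) -> Bool
consistent declared body = body declared == declared
```

For `grammar rec = ! rec` the check `consistent (False, Leaf) bangIx` fails, and the `size` argument shows that no other choice of tree can succeed. For `grammar rec = sym c ∗> ! rec` the first parser does not accept the empty string, so the sequence keeps only the corners of `sym c` and `consistent (False, Leaf) (\rec -> seqIx symIx (bangIx rec))` holds.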
Indices can be useful anyway
- Parsing zero or more things:
  many : Parser tok nt (false, c) r → Parser tok nt [r]
- Note that the input parser must not accept the empty string.
- Even if the backend can handle many empty, it seems reasonable to assume that it is a bug.
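The restriction on `many` can be imitated in Haskell by indexing parsers with a type-level boolean. The sketch below tracks only the emptiness index, not the corners, and its names (`Return`, `Sat`, `Many`, `run`) are assumptions made for illustration.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Char (isDigit)

-- The index e records whether the parser may accept the empty string.
data Parser (e :: Bool) a where
  Return :: a -> Parser 'True a
  Sat    :: (Char -> Bool) -> Parser 'False Char
  -- The argument must NOT accept the empty string.
  Many   :: Parser 'False a -> Parser 'True [a]

-- Backtracking interpreter; the longest match is listed first.
run :: Parser e a -> String -> [(a, String)]
run (Return a) s = [(a, s)]
run (Sat ok)   s = case s of
  (c : cs) | ok c -> [(c, cs)]
  _               -> []
run (Many p)   s =
  [ (x : xs, s2) | (x, s1) <- run p s, (xs, s2) <- run (Many p) s1 ]
    ++ [([], s)]

-- Rejected by the type checker, because Return has index 'True:
-- bad = Many (Return 'x')
```

`run (Many (Sat isDigit)) "12a"` terminates because every iteration consumes a character, whereas the commented-out `bad` is ruled out before it can loop.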
Backend
- Simple backtracking implementation. (So far.)
- Lexicographic structural recursion over:
  1. An upper bound on the length of the input string.
  2. The Corners index.
  3. The structure of the parser.
Expressive power?
- Can define grammars with an infinite number of non-terminals:
  data NT : ParserType where
    a1+ : ℕ → NT Unit

  g : Grammar Char NT
  g (a1+ zero)    = sym 'a' ∗> return unit
  g (a1+ (suc n)) = sym 'a' ∗> ! (a1+ n)
- Can use this to define non-context-free languages: aⁿbⁿcⁿ.
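One way such a family of non-terminals can recognise aⁿbⁿcⁿ is sketched below in Haskell. This is not the grammar from the talk: the non-terminals `ABC` and `Rep`, the parser constructors and the interpreter `parse` are all hypothetical, and the termination indices are omitted, which hides exactly the type-level complexity the slides warn about.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}

data Parser nt a where
  Return  :: a -> Parser nt a
  Sym     :: Char -> Parser nt Char
  Then    :: Parser nt a -> Parser nt b -> Parser nt b  -- like *>
  Alt     :: Parser nt a -> Parser nt a -> Parser nt a
  NonTerm :: nt a -> Parser nt a

-- An infinite family of non-terminals, indexed by a counter.
data NT a where
  ABC :: Int -> NT ()          -- n = number of 'a's read so far
  Rep :: Char -> Int -> NT ()  -- exactly n copies of one character

g :: NT a -> Parser NT a
g (ABC n) = Alt (Sym 'a' `Then` NonTerm (ABC (n + 1)))
                (NonTerm (Rep 'b' n) `Then` NonTerm (Rep 'c' n))
g (Rep c n)
  | n <= 0    = Return ()
  | otherwise = Sym c `Then` NonTerm (Rep c (n - 1))

-- Naive backtracking interpreter (no termination guarantee in general).
parse :: (forall b. nt b -> Parser nt b) -> Parser nt a -> String -> [(a, String)]
parse _  (Return a)  s = [(a, s)]
parse _  (Sym c)     s = case s of
  (x : xs) | x == c -> [(c, xs)]
  _                 -> []
parse gr (Then p q)  s = [ r | (_, s1) <- parse gr p s, r <- parse gr q s1 ]
parse gr (Alt p q)   s = parse gr p s ++ parse gr q s
parse gr (NonTerm n) s = parse gr (gr n) s

-- A string is accepted when some parse consumes all of it.
accepts :: String -> Bool
accepts s = any (null . snd) (parse g (NonTerm (ABC 0)) s)
```

Here `accepts "aabbcc"` holds while `accepts "aabbc"` does not; the counter carried by `ABC` does the work that a context-free grammar cannot.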
- Careful! Types can become really complicated:
  nt : (n : ℕ) → NT (f n) Unit
- The same warning applies when defining libraries.
Conclusions
- Structurally recursive descent parsing.
- Termination guaranteed.
- Errors caught at compile-time.
- Still feels like combinator parsing.
- More complicated types, but the overhead for the user is usually small.
Possible future work
- More efficient backend.
- Use backend which can handle left recursion ⇒ less complicated types.
- But the types can be nice to have anyway.
- And who needs left recursion? chainl is more high-level.
?
Extra slides
Defining a library: Non-terminals
The non-terminals are parameterised on the outer grammar's non-terminals:
  data NT (nt : ParserType) : ParserType where
    many  : Parser tok nt (false, c) r → NT nt [r]
    many1 : Parser tok nt (false, c) r → NT nt [r]
Defining a library: Combinators
Combinators parameterised on a lib constructor:
  module Combinators (lib : NT nt i r → nt i r) where
Wrappers (to ease use of the library):
    ⋆ : Parser tok nt (false, c) r → Parser tok nt [r]
    p⋆ = ! lib (many p)
    + : Parser tok nt (false, c) r → Parser tok nt [r]
    p+ = ! lib (many1 p)
Grammar (as before):
    library : NT nt i r → Parser tok nt i r
    library (many p)  = return [] | p+
    library (many1 p) = :: $ p <∗> p⋆