Parsing Expression Grammars, Aaron Hoffer, CSS 548, Autumn 2012

Parsing Expression Grammars

Aaron Hoffer

CSS 548

Autumn 2012

PEGs

• What if Flex and Yacc were one program?
• What if you could use the same regular expression patterns as Flex in your parser generator?
• What if Yacc supported…
  – ! (not “XYZ…”)
  – * (zero or more)

/* Scanning C comments with Flex */
%x IN_COMMENT WARNING
%%
<INITIAL>"/*"          { BEGIN(IN_COMMENT); }
<IN_COMMENT>[^*]*\*+   { BEGIN(WARNING); }
<WARNING>[^/]          { BEGIN(IN_COMMENT); }
<WARNING>"/"           { BEGIN(INITIAL); }

/* Scanning C comments with PEG */

Comment: "/*" ("*" !"/" / [^*])* "*/"

Let’s break it into multiple rules to see what it means:

Comment: "/*" Middle "*/"
Middle: (Asterisk / NotAsterisk)*
Asterisk: "*" !"/"
NotAsterisk: [^*]
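
The four rules above can be read as a small hand-written matcher. The following Python is only a sketch under that reading: the function name `match_comment` and its return convention (end index on success, `None` on failure) are assumptions made for illustration, not part of the slides or of any PEG tool.

```python
def match_comment(text, pos=0):
    """Return the index just past a C comment starting at pos, or None."""
    # Comment: "/*" Middle "*/"
    if not text.startswith("/*", pos):
        return None
    pos += 2

    # Middle: (Asterisk / NotAsterisk)*
    while pos < len(text):
        if text[pos] == "*":
            # Asterisk: "*" !"/"
            # The not-predicate only peeks at the next character;
            # it never consumes it.
            if text.startswith("/", pos + 1):
                break  # Asterisk fails here, so the repetition stops
            pos += 1
        else:
            # NotAsterisk: [^*]
            pos += 1

    # Closing "*/"
    if text.startswith("*/", pos):
        return pos + 2
    return None


if __name__ == "__main__":
    source = "/* a * b **/ rest"
    end = match_comment(source)
    print(repr(source[:end]))  # prints '/* a * b **/'
```

An asterisk inside the comment is consumed only when it is not followed by a slash, which is the job the WARNING start condition does in the Flex version.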

/*Nested /*comments*/ with PEG*/

• Add the non-terminal Comment into Middle
• The grammar now parses nested comments

Comment: "/*" Middle "*/"
Middle: (Comment / Asterisk / NotAsterisk)*
Asterisk: "*" !"/"
NotAsterisk: [^*]
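
Because Middle now refers back to Comment, the natural hand-written equivalent is a recursive function. This is a minimal sketch, assuming the same conventions as a simple index-based matcher (end index on success, `None` on failure); the name `match_nested_comment` is hypothetical.

```python
def match_nested_comment(text, pos=0):
    """Return the index just past a (possibly nested) comment, or None."""
    # Comment: "/*" Middle "*/"
    if not text.startswith("/*", pos):
        return None
    pos += 2

    # Middle: (Comment / Asterisk / NotAsterisk)*
    while pos < len(text):
        # First alternative: a whole nested Comment.
        inner_end = match_nested_comment(text, pos)
        if inner_end is not None:
            pos = inner_end
        elif text[pos] == "*":
            # Second alternative, Asterisk: "*" !"/"
            if text.startswith("/", pos + 1):
                break  # "*/" ends Middle
            pos += 1
        else:
            # Third alternative, NotAsterisk: [^*]
            pos += 1

    return pos + 2 if text.startswith("*/", pos) else None


if __name__ == "__main__":
    title = "/*Nested /*comments*/ with PEG*/"
    print(match_nested_comment(title) == len(title))  # prints True
```

The recursion is what a single regular expression cannot express: each opening "/*" must be paired with its own closing "*/".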

What is a PEG?

• Not a context-free grammar and not a regular expression
• Never ambiguous: a PEG parser tries the alternatives of a rule in the order they are written and takes the first one that matches
• A formal description of what a recursive descent parser with back-tracking is capable of parsing
• Support predicates like “not” and “and” because the parser can look ahead and then back-track
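
The points above about ordered choice and predicates can be illustrated with a few parser combinators. This is a sketch, not the API of any real PEG library: the names `lit`, `seq`, `choice`, `not_`, and `and_` are invented for the example, and each parser is assumed to be a function from `(text, pos)` to an end index or `None`.

```python
def lit(s):
    """Match the literal string s."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parsers):
    """Match each parser in turn; fail as a whole if any one fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None  # caller still holds the old position: back-track
        return pos
    return parse


def choice(*parsers):
    """Ordered choice: the first alternative that matches wins."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse


def not_(p):
    """Not-predicate: succeed, consuming nothing, only if p fails."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse


def and_(p):
    """And-predicate: succeed, consuming nothing, only if p matches."""
    def parse(text, pos):
        return pos if p(text, pos) is not None else None
    return parse


if __name__ == "__main__":
    # The order of alternatives decides the result, so there is no ambiguity.
    print(choice(lit("a"), lit("ab"))("ab", 0))  # prints 1
    print(choice(lit("ab"), lit("a"))("ab", 0))  # prints 2
    # Asterisk: "*" !"/"
    asterisk = seq(lit("*"), not_(lit("/")))
    print(asterisk("*x", 0), asterisk("*/", 0))  # prints 1 None
```

Swapping the two alternatives changes what is matched, which is why a PEG rule always has exactly one parse where a context-free grammar with the same alternatives might have several.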

Domain Specific Languages and PEGs

• Alan Kay’s Viewpoints Research Institute (VPRI): lots of research on PEGs
• Their vision: tiny DSLs cooperating in the same environment to accomplish big tasks (sounds like Lisp to some critics)
• VPRI’s STEPS project demonstrated an OS, graphics system, word processor, spreadsheet, etc. in 20 KLOC
• How did VPRI do it?
  – Throw everything away.
  – Create a self-hosting language and little DSLs.
  – Collapse code size by a factor of 1000.