introduction to lex - skkuarcs.skku.edu/.../p1.flex-bison.pdf · lex –a lexical analyzer...
TRANSCRIPT
![Page 1: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/1.jpg)
Introduction to Lex & Yacc
(flex & bison)
![Page 2: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/2.jpg)
Lex & Yacc (flex & bison)
lex (flex) yacc (bison)
yylex() yyparse()
lexical rules
(regular expression)
InputProcessed
output
lexical rules
(context-free grammar)
![Page 3: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/3.jpg)
call yylex()
next token: (NUM, 12)
Working Version of Automated Software
lex.yy.c
yylex() { … }
myparser.tab.c
yyparse() { … }
Input
program
12 + 26
main.c
main() { … }
call yyparse()+
12 26
syntax tree
![Page 4: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/4.jpg)
Lex – A Lexical Analyzer Generator
Source
A table of regular expressions and corresponding program fragments
Optional user codes
Output: lex.yy.c ( yylex() )
The table is translated to a program which
Reads an input stream
Copy the input to an output stream
Partition the input strings which match the given expressions
As each such string is recognized the corresponding program fragment is executed
User codes are simply copied to the output
![Page 5: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/5.jpg)
Lex – Format of Input File
%{
#include <stdio.h>
#include <stdlib.h>
void mycode(char *);
%}
LETTER [A-Za-z]
DIGIT [0-9]
%%
({LETTER})({LETTER}|{DIGIT})* printf(“ID = %s\n“, yytext);
({DIGIT})+ {
mycode(yytext);
}
%%
void mycode(char* text) {
printf(“NUMBER = %d\n“, atoi(yytext));
}
definitions
rules
user codes
headers
name definition
pattern action
![Page 6: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/6.jpg)
Flex on the run
$ ls
dn.l main.c
$ flex dn.l
$ ls
dn.l lex.yy.c main.c
$ gcc lex.yy.c main.c –lfl
$ ls
a.exe dn.l lex.yy.c main.c
$ ./a.exe
…
Ctrl-D
$
$ cat main.c
main()
{
yylex();
}
![Page 7: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/7.jpg)
Regular Expressions . matches any single character
* matches zero or more copies of preceding expression
+ matches one or more copies of preceding expression
? matches zero or one copy of preceding expression -?[0-9]+ : signed numbers including optional minus sign
[] matches any character within the brackets [Abc1], [A-Z], [A-Za-z], [^123A-Z]
^ matches the beginning of line
$ matches the end of line
\ escape metacharacter e.g. \* matches with *
| matches either the preceding expression or the following abc|ABC
() groups a series of regular expression (123)(123)*
exclude [123A-Z]
![Page 8: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/8.jpg)
Yacc – Yet Another Compiler-Compiler
Similar to Lex
Input: Context Free Grammar (CFG)
Output: <filename>.tab.h, <filename>.tab.c ( yyparse() )
%{
headers
%}
Declarations
%%
Grammar Rules
%%
User Codes
![Page 9: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/9.jpg)
Yacc with Lex - an example
<calc.l>
/* Definitions */
%{
#include <stdlib.h>
#include “calc.tab.h”
%}
NUMBER [0-9]+
%%
/* Rules */
{NUMBER} { yylval = atoi(yytext);
return NUMBER; }
“+” { return PLUS; }
“*” { return MULT; }
“\n” { return EOL; }
. { yyerror(“unexpected char”); }
%%
/* User Code */
<calc.y>
/* Definitions */
%{
#include <stdio.h>
%}
%token NUMBER PLUS MULT EOL
%%
/* Rules */
goal: eval goal {}
| eval {}
;
eval: expr EOL { printf(“= %d\n”, $1); }
;
expr: NUMBER { $$ = $1; }
| expr PLUS expr { $$ = $1 + $3; }
| expr MULT expr { $$ = $1 * $3; }
;
%%
/* User Code */
int yyerror(char *s)
{ return printf(“%s₩n”, s); }
![Page 10: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/10.jpg)
Associativity & Precedence
Specify associativity & precedence
Change the grammar rules
If-else conflict can be OK in yacc
%token NUMBER PLUS MULT EOL ASSN
%token NUMBER EOL
%left PLUS
%left MULT
%right ASSN
expr0: NUMBER { $$ = $1; }
;
expr1: expr0 { $$ = $1; }
| expr1 MULT expr0 { $$ = $1 * $3; }
;
expr2: expr1 { $$ = $1; }
| expr2 PLUS expr1 { $$ = $1 + $3; }
;
Lowest
Highest
Precedence level
![Page 11: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/11.jpg)
Tips
For Lex (flex)
Single character can be used as token
char c = yytext[0];
String should be duplicated to safely pass outside the scanner
yylval.str = strndup(yytext, yyleng);
For Yacc (bison)
Token and nonterminal symbol have type
%union {
Exp* exp;
char* str;
}
%token <str> ID
%type <exp> expr
![Page 12: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/12.jpg)
Bison on the run
$ ls
calc.l calc.y main.c
$ flex calc.l
$ ls
calc.l calc.y lex.yy.c main.c
$ bison –d calc.y
$ ls
calc.l calc.y lex.yy.c calc.tab.c calc.tab.h main.c
$ gcc -o calc.exe calc.tab.c lex.yy.c –lfl
$ ls
calc.l calc.y lex.yy.c calc.tab.c calc.tab.h main.c
calc.exe
$ ./calc.exe
<main.c>
main()
{ yyparse(); }
![Page 13: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/13.jpg)
Programming Assignment #1
Implement an engineering calculator flex & bison on Cygwin environment (http://www.cygwin.com)
Inputs: equation and command.
Outputs: calculated value
error message for illegal equations or illegal commands
Online submission (zipped file named as student_id.zip) http://www.icampus.ac.kr
student_id.zip should contain engcalc.l – flex input file
engcalc.y – bison input file
main.c – main C program
makefile – make command should generate “engcalc.exe”
README – user manual, limitations
![Page 14: Introduction to Lex - SKKUarcs.skku.edu/.../p1.flex-bison.pdf · Lex –A Lexical Analyzer Generator Source A table of regular expressions and corresponding program fragments Optional](https://reader035.vdocument.in/reader035/viewer/2022062414/5ed07ad9b0e09762e52be49a/html5/thumbnails/14.jpg)
Programming Assignment #1(cont’d)
Inputs: equations:
integer, real number (note: numbers can be signed, e.g. -2, -3.45, +2.1e23)
+, -, *, /, ^, #, (, ) e.g.) 2^3 = 23 , #4 = 4 , 3#9 = 39
^, # are right-to-left associative operators e.g.) 2^3^2 = 232 , 2#4^2 = 242
built-in functions: sin(), cos(), tan(), log(), ln(), exp()
sin(), cos(), tan() take degrees()
commands: memory: e.g.) m = 35, n = 3.14
clear [m] - Invalidate the value stored in m.
If m is not specified, invalidate all memory values.
Extra Points (+10%) user defined functions: e.g.) f(x, y) = x + 4 * y
clearf [fn] - Invalidate the user defined function fn.
If fn is not specified, invalidate all user defined functions.
base 10base e
real number divide, not integer divide