Lex

Lex is a lexical analyzer

Input:
Var = 12 + 9;
if (test > 20) temp = 0;
else while (a < 20) temp++;

Output (produced by Lex):
Ident: Var
Oper: =
Integer: 12
Oper: +
Integer: 9
Semicolon: ;
Keyword: if
Paren: (
Ident: test
Oper: >
....

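
The input/output pair above can be reproduced with a small tokenizer. This is only a sketch in Python rather than Lex: the token class names come from the slide, while the exact patterns, the `TOKEN_SPEC` table and the `tokenize` function are assumptions made for illustration. Note that Python's `re` picks the first alternative that matches, so the rule order stands in for Lex's longest-match behaviour here.

```python
import re

# Token classes as named on the slide; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("Keyword",   r"\b(?:if|else|while)\b"),
    ("Ident",     r"[A-Za-z]+"),
    ("Integer",   r"[0-9]+"),
    ("Oper",      r"\+\+|[-+=<>]"),
    ("Paren",     r"[()]"),
    ("Semicolon", r";"),
    ("Skip",      r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "Skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

source = "Var = 12 + 9;\nif (test > 20) temp = 0;\nelse while (a < 20) temp++;"
for kind, lexeme in tokenize(source):
    print(f"{kind}: {lexeme}")
```

The first printed lines are `Ident: Var`, `Oper: =`, `Integer: 12`, matching the token stream shown on the slide.
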

Regular expressions in Lex

For each kind of string there is a regular expression:

"if"  "then"        /* keywords */
"+"  "-"  "="       /* operators */


(0|1|2|3|4|5|6|7|8|9)+      /* integers */
(a|b|..|z|A|B|...|Z)+       /* identifiers */


integers:
(0|1|2|3|4|5|6|7|8|9)+ can be written more compactly as [0-9]+


identifiers:
(a|b|..|z|A|B|...|Z)+ can be written more compactly as [a-zA-Z]+

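
A quick way to convince oneself that the long alternations and the character classes describe the same strings is to compare them on a few samples. The sketch below uses Python's `re` module instead of Lex, and the sample strings are arbitrary choices, not taken from the slides.

```python
import re
import string

# Long form: one alternative per character, as on the earlier slide.
long_digits = "(" + "|".join(string.digits) + ")+"
long_letters = "(" + "|".join(string.ascii_letters) + ")+"

def agree(samples):
    """True if the long and short forms accept exactly the same samples."""
    for s in samples:
        if bool(re.fullmatch(long_digits, s)) != bool(re.fullmatch(r"[0-9]+", s)):
            return False
        if bool(re.fullmatch(long_letters, s)) != bool(re.fullmatch(r"[a-zA-Z]+", s)):
            return False
    return True

print(long_digits)
print(agree(["12", "0099", "Var", "temp", "a1", "", "9x"]))
```
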

Each regular expression has an action.

Examples:

Regular expression     Action
\n                     linenum++;
[a-zA-Z]+              printf("identifier");
[0-9]+                 printf("integer");


Default action: ECHO;
It prints the matched string to the output.


A small program

%%
[a-zA-Z]+    printf("Identifier\n");
[0-9]+       printf("Integer\n");
[ \t\n]      ;    /* skip spaces */


Input:
1234 test
var 566 78
9800

Output:
Integer
Identifier
Identifier
Integer
Integer
Integer

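
The behaviour of the small program can be imitated without Lex. The following Python sketch is an approximation, not how Lex is implemented: the `RULES` table and the `scan` function are illustrative names, and the scanner simply tries every rule at the current position, keeps the longest match, and echoes any character that no rule matches.

```python
import re

# Rules in the order of the Lex program; None means "no action" (skip).
RULES = [
    (re.compile(r"[a-zA-Z]+"), "Identifier"),
    (re.compile(r"[0-9]+"), "Integer"),
    (re.compile(r"[ \t\n]"), None),
]

def scan(text):
    out = []
    pos = 0
    while pos < len(text):
        # Longest match wins; ties go to the earliest rule, as in Lex.
        best_len, best_label = 0, None
        for pattern, label in RULES:
            m = pattern.match(text, pos)
            if m and len(m.group()) > best_len:
                best_len, best_label = len(m.group()), label
        if best_len == 0:
            out.append(text[pos])  # default action: ECHO the character
            pos += 1
            continue
        if best_label is not None:
            out.append(best_label + "\n")
        pos += best_len
    return "".join(out)

print(scan("1234 test\nvar 566 78\n9800\n"), end="")
```
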

Another program

%{
int linenum = 1;
%}
%%
[a-zA-Z]+    printf("Identifier\n");
[0-9]+       printf("Integer\n");
[ \t]        ;    /* skip spaces */
\n           linenum++;
.            printf("Error in line: %d\n", linenum);


Input:
1234 test
var 566 78
9800 +
temp

Output:
Integer
Identifier
Identifier
Integer
Integer
Integer
Error in line: 3
Identifier

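
To see how the line counter and the catch-all `.` rule interact, the program can be imitated in Python. This is a sketch under assumptions: the rule kinds (`ident`, `int`, ...) and the `run` function are invented for illustration, and the matching loop only approximates what a Lex-generated scanner does.

```python
import re

# Rules mirror the Lex program above; the kind names are illustrative.
RULES = [
    (re.compile(r"[a-zA-Z]+"), "ident"),
    (re.compile(r"[0-9]+"), "int"),
    (re.compile(r"[ \t]"), "skip"),
    (re.compile(r"\n"), "newline"),
    (re.compile(r"."), "error"),  # '.' matches any character except newline
]

def run(text):
    linenum = 1
    out = []
    pos = 0
    while pos < len(text):
        best = None
        for pattern, kind in RULES:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the best, so ties keep the earlier rule.
            if m and (best is None or m.end() > best[0].end()):
                best = (m, kind)
        m, kind = best  # never None: '\n' or '.' matches every single character
        if kind == "ident":
            out.append("Identifier")
        elif kind == "int":
            out.append("Integer")
        elif kind == "newline":
            linenum += 1
        elif kind == "error":
            out.append(f"Error in line: {linenum}")
        pos = m.end()
    return out

print("\n".join(run("1234 test\nvar 566 78\n9800 +\ntemp\n")))
```

The `+` on the third input line matches only the `.` rule, which is why the error message reports line 3.
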

Lex matches the longest input string

Regular expressions: "if", "ifend"

Input      Matches
ifend      "ifend"
if         "if"
ifn        no match for the whole string (only the prefix "if" matches)

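
The longest-match rule can be sketched for literal patterns as follows. The helper `longest_match` is a hypothetical name used only here; it returns the longest pattern that is a prefix of the remaining input, which is how a scanner decides between "if" and "ifend".

```python
def longest_match(patterns, text, pos=0):
    """Return the longest literal pattern that is a prefix of text[pos:], or None."""
    best = None
    for p in patterns:
        if text.startswith(p, pos) and (best is None or len(p) > len(best)):
            best = p
    return best

rules = ["if", "ifend"]
for word in ["ifend", "if", "ifn"]:
    print(word, "->", longest_match(rules, word))
```

For `ifn` the helper reports `if`: a scanner works on prefixes, so it would match "if" and then find no rule for the remaining `n`.
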

Internal Structure of Lex

Regular expressions -> NFA -> DFA -> Minimal DFA

The final states of the DFA are associated with actions.

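
The idea that final states carry actions can be illustrated with a tiny hand-written DFA. This sketch does not perform the regular expression to NFA to DFA construction; the state numbers, the `TRANSITIONS` table and the `next_token` function are assumptions chosen for the identifier and integer rules used earlier.

```python
def char_class(c):
    """Map a character to the input class used by the transition table."""
    if c.isascii() and c.isalpha():
        return "letter"
    if c.isascii() and c.isdigit():
        return "digit"
    return "other"

# State 0 is the start state; states 1 and 2 are final.
TRANSITIONS = {
    (0, "letter"): 1, (1, "letter"): 1,
    (0, "digit"): 2, (2, "digit"): 2,
}
ACTIONS = {1: "Identifier", 2: "Integer"}  # final state -> action label

def next_token(text, pos):
    """Run the DFA from pos; return (action, end) for the longest match, or (None, pos)."""
    state, last_final, last_end = 0, None, pos
    i = pos
    while i < len(text):
        state = TRANSITIONS.get((state, char_class(text[i])))
        if state is None:
            break  # no transition: stop and fall back to the last final state seen
        i += 1
        if state in ACTIONS:
            last_final, last_end = state, i
    if last_final is None:
        return None, pos
    return ACTIONS[last_final], last_end

print(next_token("var 566", 0))
print(next_token("var 566", 4))
```
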
Compilers

Program -> Compiler -> Machine Code

Program:
v = 5;
if (v>5)
    x = 12 + v;
while (x !=3) {
    x = x - 3;
    v = 10;
}
......

Machine Code:
Add v,v,0
cmp v,5
jmplt ELSE
THEN: add x, 12,v
ELSE:
WHILE:
cmp x,3
...


Inside the compiler:

program -> Lexical analyzer -> Parser -> machine code


The parser knows the grammar of the programming language.


Parser

PROGRAM -> STMT_LIST
STMT_LIST -> STMT STMT_LIST | STMT;
STMT -> EXPR ; | IF_STMT | WHILE_STMT | { STMT_LIST }
EXPR -> EXPR + EXPR | EXPR - EXPR | ID
IF_STMT -> if (EXPR) then STMT | if (EXPR) then STMT else STMT
WHILE_STMT -> while (EXPR) do STMT


The parser constructs the derivation for the particular input program.

Grammar (known to the parser):
E -> E + E | E * E | INT

Input:
10 + 2 * 5

Derivation:
E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5

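
Each step of the derivation replaces one occurrence of E by the right-hand side of a production. The sketch below replays exactly the steps listed on the slide; the `derive` function and the (index, replacement) encoding of a step are assumptions made for illustration.

```python
def derive(start, steps):
    """Apply (index, replacement) steps to a sentential form stored as a list of symbols."""
    form = [start]
    history = [" ".join(form)]
    for index, rhs in steps:
        assert form[index] == "E", "only the variable E can be rewritten"
        form[index:index + 1] = rhs
        history.append(" ".join(form))
    return history

steps = [
    (0, ["E", "+", "E"]),  # E -> E + E
    (2, ["E", "*", "E"]),  # E -> E * E, applied to the second E
    (0, ["10"]),           # E -> INT
    (2, ["2"]),            # E -> INT
    (4, ["5"]),            # E -> INT
]
print(" => ".join(derive("E", steps)))
```
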

derivation:
E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5

derivation tree:

          E
        / | \
       E  +  E
       |   / | \
      10  E  *  E
          |     |
          2     5


The derivation tree for 10 + 2 * 5 is translated to machine code:

mult t1, 2, 5
add t2, 10, t1

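
One common way to get from a derivation tree to such code is a post-order walk that emits one instruction per operator node. The following is a minimal sketch under assumptions: the tuple encoding of the tree, the `gen` function and the temporary names `t1`, `t2` are illustrative, and the instruction names simply follow the slide.

```python
import itertools

def gen(node, code, temps):
    """Post-order walk: emit code for both children, then for the operator itself."""
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    left_name = gen(left, code, temps)
    right_name = gen(right, code, temps)
    target = f"t{next(temps)}"
    mnemonic = "add" if op == "+" else "mult"
    code.append(f"{mnemonic} {target}, {left_name}, {right_name}")
    return target

tree = ("+", 10, ("*", 2, 5))  # tree for 10 + 2 * 5
code = []
gen(tree, code, itertools.count(1))
print("\n".join(code))
```

Because the multiplication sits deeper in the tree, its instruction is emitted first and its result `t1` feeds the addition.
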
Parsing

The parser takes a grammar and an input string and produces a derivation:

grammar, input string -> Parser -> derivation


Example:

Grammar:
S -> SS
S -> aSb
S -> bSa
S -> λ

Input: aabb

Derivation: ?


Exhaustive Search

Grammar: S -> SS | aSb | bSa | λ
Input: aabb

Phase 1 (all derivations of length 1):
S => SS
S => aSb
S => bSa
S => λ

S => bSa and S => λ cannot lead to aabb, so they are discarded.


Phase 1 (remaining derivations), for S -> SS | aSb | bSa | λ and input aabb:
S => SS
S => aSb

Phase 2:
S => SS => SSS
S => SS => aSbS
S => SS => bSaS
S => SS => S
S => aSb => aSSb
S => aSb => aaSbb
S => aSb => abSab
S => aSb => ab


Phase 2 (remaining derivations):
S => SS => SSS
S => SS => aSbS
S => SS => S
S => aSb => aSSb
S => aSb => aaSbb

Phase 3:
S => aSb => aaSbb => aabb


Final result of exhaustive search (top-down parsing)

Grammar: S -> SS | aSb | bSa | λ
Input: aabb

Derivation produced by the parser:
S => aSb => aaSbb => aabb

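
The phases above can be sketched as a breadth-first search over leftmost derivations. This is an illustration rather than a definitive implementation: the `viable` pruning test, the `max_phases` cut-off and the use of the empty string for λ are assumptions added here so that the search stays finite.

```python
GRAMMAR = {"S": ["SS", "aSb", "bSa", ""]}  # "" stands for λ

def viable(form, w):
    """Pruning test: can this sentential form still lead to w?"""
    first_var = next((i for i, c in enumerate(form) if c in GRAMMAR), None)
    if first_var is None:
        return form == w  # no variables left: it must be w itself
    n_terminals = sum(1 for c in form if c not in GRAMMAR)
    # Terminals never disappear, and the terminal prefix must agree with w.
    return n_terminals <= len(w) and w.startswith(form[:first_var])

def parse(w, max_phases=10):
    """Return one derivation of w as a list of sentential forms, or None."""
    frontier = [["S"]]  # each entry is a derivation found so far
    for _ in range(max_phases):
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            i = next(k for k, c in enumerate(form) if c in GRAMMAR)  # leftmost variable
            for rhs in GRAMMAR[form[i]]:
                new_form = form[:i] + rhs + form[i + 1:]
                if not viable(new_form, w):
                    continue
                if new_form == w:
                    return derivation + [new_form]
                next_frontier.append(derivation + [new_form])
        frontier = next_frontier  # one phase completed
    return None

print(" => ".join(parse("aabb")))
```

With this pruning, phase 1 keeps SS and aSb, phase 2 keeps five sentential forms, and phase 3 finds aabb, in line with the slides.
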

Time complexity of exhaustive search

Suppose there are no productions of the form
A -> λ
A -> B

Number of phases for string $w$: $2|w|$


For a grammar with $k$ rules:

Time for phase 1: $k$ ($k$ possible derivations)
Time for phase 2: $k^2$ ($k^2$ possible derivations)
...
Time for phase $2|w|$: $k^{2|w|}$ ($k^{2|w|}$ possible derivations)


Total time needed for string $w$:

$k + k^2 + \cdots + k^{2|w|}$

Extremely bad!!!

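
To get a feeling for how quickly this bound grows, the sum can be written in closed form as a geometric series. The values $k = 4$ and $|w| = 4$ below are hypothetical and chosen only for illustration.

$$
k + k^2 + \cdots + k^{2|w|} \;=\; \sum_{i=1}^{2|w|} k^i \;=\; \frac{k^{2|w|+1} - k}{k - 1} \qquad (k \ge 2)
$$

$$
k = 4,\ |w| = 4: \qquad \frac{4^{9} - 4}{4 - 1} \;=\; \frac{262140}{3} \;=\; 87380
$$

So even a four-symbol input with four rules may require examining tens of thousands of derivations in the worst case.
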

There exist faster algorithms for specialized grammars.

S-grammar: every production has the form A -> ax, where
a is a symbol (terminal),
x is a string of variables,
and each pair (A, a) appears at most once.


S-grammar example:

S -> aS
S -> bSS
S -> c

Derivation of abcc:
S => aS => abSS => abcS => abcc

Each string has a unique derivation.


For S-grammars:

In the exhaustive search parsing there is only one choice in each phase.

Time for a phase: 1
Total time for parsing string $w$: $|w|$

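
Because each (variable, terminal) pair selects at most one production, parsing reduces to one table lookup per input symbol. Here is a minimal sketch for the example grammar S -> aS | bSS | c; the `RULES` table and the stack-based `parse` function are assumptions made for illustration.

```python
# S-grammar S -> aS | bSS | c, keyed by (variable, terminal).
# Each value is the string of variables that follows the terminal.
RULES = {("S", "a"): "S", ("S", "b"): "SS", ("S", "c"): ""}

def parse(w, start="S"):
    """Return the leftmost derivation of w as a list of sentential forms, or None."""
    stack = [start]  # pending variables, leftmost one on top
    consumed = ""
    derivation = [start]
    for ch in w:
        if not stack:
            return None  # input left over but nothing to expand
        var = stack.pop()
        rhs = RULES.get((var, ch))
        if rhs is None:
            return None  # no production for this (variable, terminal) pair
        stack.extend(reversed(rhs))
        consumed += ch
        derivation.append(consumed + "".join(reversed(stack)))
    return derivation if not stack else None

print(" => ".join(parse("abcc")))
```

The loop runs once per input symbol and performs a single lookup each time, which is where the $|w|$ total comes from.
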

For general context-free grammars:

There exists a parsing algorithm that parses a string $w$ in time $|w|^3$.

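
The slides do not name the algorithm. One well-known algorithm with a cubic running time is CYK, which works on grammars in Chomsky normal form; it is sketched below purely as an illustration. The example grammar (for strings a^n b^n with n >= 1) and the names `cyk`, `UNARY` and `BINARY` are assumptions and do not come from the slides.

```python
def cyk(w, unary, binary, start="S"):
    """CYK membership test for a grammar in Chomsky normal form."""
    n = len(w)
    if n == 0:
        return False
    # table[i][l] holds the variables that derive the substring w[i:i+l].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(w):
        table[i][1] = {A for A, t in unary if t == ch}
    for length in range(2, n + 1):           # substring length
        for i in range(n - length + 1):      # start position
            for split in range(1, length):   # where the substring is cut in two
                left = table[i][split]
                right = table[i + split][length - split]
                for A, (B, C) in binary:
                    if B in left and C in right:
                        table[i][length].add(A)
    return start in table[0][n]

# Hypothetical CNF grammar: S -> AB | AT, T -> SB, A -> a, B -> b
UNARY = [("A", "a"), ("B", "b")]
BINARY = [("S", ("A", "B")), ("S", ("A", "T")), ("T", ("S", "B"))]

for word in ["ab", "aabb", "aab"]:
    print(word, cyk(word, UNARY, BINARY))
```

The three nested loops over length, start position and split point give the $|w|^3$ factor; the innermost loop over productions depends only on the grammar size.
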