![Page 1: Plan for Today and Beginning Next week (Lexical Analysis)cs453/yr2014/Slides/02... · 2014-01-21 · CS453 Lecture Regular Expressions and Transition Diagrams 1 Plan for Today and](https://reader030.vdocument.in/reader030/viewer/2022041107/5f09b6927e708231d42828d3/html5/thumbnails/1.jpg)
CS453 Lecture Regular Expressions and Transition Diagrams 1
Plan for Today and Beginning Next week (Lexical Analysis)
Regular Expressions
Finite State Machines
  DFAs: Deterministic Finite Automata
  Complications
  NFAs: Non-Deterministic Finite State Automata
From Regular Expressions to NFAs
From NFAs to DFAs

Structure of a Typical Compiler
[Figure: compiler pipeline. Analysis: character stream → lexical analysis → tokens ("words") → syntactic analysis → AST ("sentences") → semantic analysis → annotated AST. Synthesis: IR code generation → IR → optimization → IR → code generation → target language. An interpreter appears as an alternative consumer of the analysis output.]

Tokens for Example MeggyJava program

import meggy.Meggy;
class PA3Flower {
  public static void main(String[] whatever){{
    // Upper left petal, clockwise
    Meggy.setPixel( (byte)2, (byte)4, Meggy.Color.VIOLET );
    Meggy.setPixel( (byte)2, (byte)1, Meggy.Color.VIOLET);
    …
}}

Tokens:
  Symbol(IMPORT,null), Symbol(MEGGY,null), Symbol(SEMI,null), Symbol(CLASS,null), Symbol(ID,"PA3Flower"), Symbol(LBRACE,null), …

About The Slides on Languages and Finite Automata
Slides originally developed by Prof. Costas Busch (2004)
  – Many thanks to Prof. Busch for developing the original slide set.
Adapted with permission by Prof. Dan Massey (Spring 2007)
  – Subsequent modifications; many thanks to Prof. Massey for the CS 301 slides.
Adapted with permission by Prof. Michelle Strout (Spring 2011)
  – Adapted for use in CS 453
  – Adapted by Wim Bohm (added regular expr → NFA → DFA, Spr2012)

Languages
A language is a set of strings (sometimes called sentences)
String: a finite sequence of letters
Examples: "cat", "dog", "house", …
Defined over a fixed alphabet: $\Sigma = \{a, b, c, \ldots, z\}$

Empty String
A string with no letters: ε (sometimes λ is used)
Observations:
$|\varepsilon| = 0$
$\varepsilon w = w \varepsilon = w$
$\varepsilon\, abba = abba\, \varepsilon = abba$

Regular Expressions
Regular expressions describe regular languages
You have probably seen them in OSs / editors
Example: $(a \mid (b)(c))^*$ describes the language
$L((a \mid (b)(c))^*) = \{\varepsilon, a, bc, aa, abc, bca, \ldots\}$

Recursive Definition for Specifying Regular Expressions
Primitive regular expressions: $\emptyset$, $\varepsilon$, $\alpha$, where $\alpha \in \Sigma$, some alphabet
Given regular expressions $r_1$ and $r_2$:
  $r_1 \mid r_2$
  $r_1 r_2$
  $r_1^*$
  $(r_1)$
are regular expressions

Regular operators
choice: A | B  a string from L(A) or from L(B)
concatenation: A B  a string from L(A) followed by a string from L(B)
repetition: A*  0 or more concatenations of strings from L(A)
            A+  1 or more
grouping: ( A )
Concatenation has precedence over choice: A|B C vs. (A|B)C
More syntactic sugar, used in scanner generators:
  [abc] means a or b or c
  [\t\n ] means tab, newline, or space
  [a-z] means a, b, c, …, or z

Example Regular Expressions and Regular Definitions
Regular definition:
  name : regular expression
  name can then be used in other regular expressions
Keywords: "print", "while"
Operations: "+", "-", "*"
Identifiers:
  let : [a-zA-Z]   // choose from a to z or A to Z
  dig : [0-9]
  id : let (let | dig)*
Numbers: dig+ = dig dig*
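The regular definitions above can be tried out directly. The following is a minimal sketch that assumes Python's `re` syntax as a stand-in for a scanner generator's notation; the names `LET`, `DIG`, `ID`, `NUM` and the helper `matches` are illustrative and not part of the slides.

```python
import re

# Regular definitions from the slide, built up by name.
LET = r"[a-zA-Z]"
DIG = r"[0-9]"
ID = rf"{LET}(?:{LET}|{DIG})*"   # id : let (let | dig)*
NUM = rf"{DIG}+"                 # dig+ = dig dig*


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole string is in the language of the pattern."""
    return re.fullmatch(pattern, text) is not None


print(matches(ID, "PA3Flower"))  # True
print(matches(ID, "3flowers"))   # False: an id must start with a letter
print(matches(NUM, "1234"))      # True
print(matches(NUM, ""))          # False: dig+ needs at least one digit
```

Building `ID` from `LET` and `DIG` by name mirrors how a regular definition can be used inside other regular expressions.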

Finite Automaton
[Figure: Input String → Finite Automaton → Output String]

Finite Accepter
[Figure: Input String → Finite Automaton → Output: "Accept" or "Reject"]

State Transition Graph
abba - Finite Accepter
[Figure: transition graph with states q0 to q5. q0 is the initial state and q4 is the final ("accept") state. Transitions: q0 -a-> q1 -b-> q2 -b-> q3 -a-> q4. Every other move, including both a and b out of q4, leads to q5, which loops to itself on a, b.]

Initial Configuration
Input String: a b b a
The automaton starts in the initial state q0, before the first input symbol.

Reading the Input
[Figure sequence: on the abba accepter, each symbol read advances the automaton one state: q0 -a-> q1 -b-> q2 -b-> q3 -a-> q4.]
Input finished in q4: Output: "accept"

String Rejection
Input String: a b a
[Figure sequence: q0 -a-> q1 -b-> q2 -a-> q5.]
Input finished in q5: Output: "reject"

The Empty String
Input String: ε
[Figure: the abba accepter stays in q0, which is not a final state.]
Output: "reject"
Would it be possible to accept the empty string?

Another Example
[Figure sequence: an accepter with states q0, q1, q2 and edges labeled a, b, and a, b reads a three-symbol input containing two a's and one b.]
Input finished: Output: "accept"

Rejection
[Figure sequence: the same three-state accepter reads a three-symbol input containing one a and two b's.]
Input finished: Output: "reject"
Which strings are accepted?

Formalities
Deterministic Finite Automaton (DFA)
$M = (Q, \Sigma, \delta, q_0, F)$
$Q$ : set of states
$\Sigma$ : input alphabet
$\delta$ : transition function
$q_0$ : initial state
$F$ : set of final (accepting) states

Input Alphabet Σ
$\Sigma = \{a, b\}$

Set of States Q
$Q = \{q_0, q_1, q_2, q_3, q_4, q_5\}$

Initial State $q_0$

Set of Final States F
$F = \{q_4\}$

Transition Function δ
$\delta : Q \times \Sigma \to Q$
Examples on the abba accepter:
$\delta(q_0, a) = q_1$
$\delta(q_0, b) = q_5$
$\delta(q_2, b) = q_3$

Transition Function / table δ

δ     a     b
q0    q1    q5
q1    q5    q2
q2    q5    q3
q3    q4    q5
q4    q5    q5
q5    q5    q5
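The transition table above is all that is needed to run the abba accepter. The following is a minimal table-driven sketch in Python; the dictionary layout and the names `DELTA`, `START`, `FINAL` and `accepts` are illustrative choices, not taken from the slides.

```python
# Transition table of the abba accepter: DELTA[state][symbol] -> next state.
DELTA = {
    "q0": {"a": "q1", "b": "q5"},
    "q1": {"a": "q5", "b": "q2"},
    "q2": {"a": "q5", "b": "q3"},
    "q3": {"a": "q4", "b": "q5"},
    "q4": {"a": "q5", "b": "q5"},
    "q5": {"a": "q5", "b": "q5"},  # trap state: no way back to q4
}
START = "q0"
FINAL = {"q4"}


def accepts(word: str) -> bool:
    """Run the DFA over word and report whether it ends in a final state."""
    state = START
    for symbol in word:
        state = DELTA[state][symbol]
    return state in FINAL


print(accepts("abba"))  # True
print(accepts("aba"))   # False: ends in q5
print(accepts(""))      # False: q0 is not a final state
```

Each input symbol costs one table lookup, which is why DFAs are attractive for scanners.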

Complications
1. "1234" is a NUMBER, but what about the "123" in "1234", or the "23", etc.? Also, the scanner must recognize many tokens, not one, only stopping at end of file.
2. "if" is a keyword or reserved word IF, but "if" is also defined by the reg. exp. for identifier ID. We want to recognize IF.
3. We want to discard white space and comments.
4. "123" is a NUMBER but so is "235" and so is "0", just as "a" is an ID and so is "bcd". We want to recognize a token, but add attributes to it.

Complication 1
1. "1234" is a NUMBER, but what about the "123" in "1234", or the "23", etc.? Also, the scanner must recognize many tokens, not one, only stopping at end of file.
So: recognize the largest string defined by some regular expression, and stop consuming input only when no longer match is possible. This introduces the need to reconsider a character, as it is the first character of the next token.
e.g. fname(a,bcd ); would be scanned as ID OPEN ID COMMA ID CLOSE SEMI EOF
Scanning fname would consume "(", which would be put back and then recognized as OPEN.
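The put-back step can be sketched by hand. The following minimal Python example assumes a tiny token set (ID, OPEN, CLOSE, COMMA, SEMI) and ASCII identifiers; the names `PUNCT`, `is_letter`, `is_letter_or_digit` and `scan` are hypothetical helpers, and skipping white space is included only so that the slide's example input scans.

```python
import io

PUNCT = {"(": "OPEN", ")": "CLOSE", ",": "COMMA", ";": "SEMI"}


def is_letter(ch: str) -> bool:
    return ch.isascii() and ch.isalpha()


def is_letter_or_digit(ch: str) -> bool:
    return ch.isascii() and ch.isalnum()


def scan(text: str) -> list:
    """Return token names, reading one character at a time with one put-back."""
    src = io.StringIO(text)
    tokens = []
    pushed_back = None  # the single character that was read too far
    while True:
        ch = pushed_back if pushed_back is not None else src.read(1)
        pushed_back = None
        if ch == "":               # end of input
            tokens.append("EOF")
            return tokens
        if ch.isspace():
            continue
        if is_letter(ch):
            # Longest match: keep reading while the identifier can still grow.
            while True:
                ch = src.read(1)
                if not is_letter_or_digit(ch):
                    break
            pushed_back = ch       # first character of the next token
            tokens.append("ID")
        elif ch in PUNCT:
            tokens.append(PUNCT[ch])
        else:
            raise ValueError(f"unexpected character {ch!r}")


print(scan("fname(a,bcd );"))
# ['ID', 'OPEN', 'ID', 'COMMA', 'ID', 'CLOSE', 'SEMI', 'EOF']
```

A single character of put-back is enough here because every token in this small set can be delimited by looking one character ahead.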

Complication 2
2. "if" is a keyword or reserved word IF, but "if" is also defined by the reg. exp. for identifier ID. We want to recognize IF, so we need some way of determining which token (IF or ID) is recognized.
This can be done using priority, e.g. in scanner generators an earlier definition has a higher priority than a later one.
By putting the definition for IF before the definition for ID in the input for the scanner generator, we get the desired result.
What about the string "ifyouleavemenow"?
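One way to combine the longest-match rule from Complication 1 with rule priority is sketched below in Python. The rule list and the function `next_token` are hypothetical and only imitate what a scanner generator does internally: the longest match wins, and on a tie the earlier definition wins.

```python
import re

# Rules in priority order: IF is listed before ID.
RULES = [
    ("IF", re.compile(r"if")),
    ("ID", re.compile(r"[a-z][a-z0-9]*")),
]


def next_token(text: str, pos: int = 0):
    """Return (token, lexeme) for the longest match at pos; ties go to the earlier rule."""
    best = None  # (length, token name)
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        # Strictly longer only: an equally long later rule never replaces an earlier one.
        if m and (best is None or m.end() - pos > best[0]):
            best = (m.end() - pos, name)
    if best is None:
        raise ValueError(f"no rule matches at position {pos}")
    length, name = best
    return name, text[pos:pos + length]


print(next_token("if"))               # ('IF', 'if')
print(next_token("in"))               # ('ID', 'in')
print(next_token("ifyouleavemenow"))  # ('ID', 'ifyouleavemenow')
```

Under these two rules, "ifyouleavemenow" is a single ID, because the ID match is longer than the two-character IF match and priority only breaks ties.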

Complication 3
3. We want to discard white space and comments and not bother the parser with these. So:
In scanner generators, we can specify white space using a regular expression, e.g. [\t\n ], and return no token, i.e. move on to the next token.
Specify comments using a (NASTY) regular expression and again return no token, i.e. move on to the next token.

Complication 4
4. "123" is a NUMBER but so is "235" and so is "0", just as "a" is an ID and so is "bcd". We want to recognize a token, but add attributes to it. So:
Scanners return Symbols, not tokens. A Symbol is a (token, tokenValue) pair, e.g. (NUMBER,123) or (ID,"a").
Often more information is added to a symbol, e.g. line number and position (as we will do in MeggyJava).
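Complications 3 and 4 can be illustrated together. The following Python sketch is an assumption-laden stand-in for a generated scanner: the `Symbol` tuple, the rule list, and the `//` line-comment pattern are hypothetical choices. White space and comments match a rule but produce no Symbol, while NUMBER and ID carry a value plus line and position.

```python
import re
from typing import NamedTuple, Optional


class Symbol(NamedTuple):
    token: str
    value: Optional[object]
    line: int  # 1-based line number
    pos: int   # 1-based column within the line


# A token name of None means: match, but return no token.
RULES = [
    (None, re.compile(r"[\t\n ]+")),           # white space
    (None, re.compile(r"//[^\n]*")),           # line comment (assumed syntax)
    ("NUMBER", re.compile(r"[0-9]+")),
    ("ID", re.compile(r"[a-zA-Z][a-zA-Z0-9]*")),
]


def scan(text: str) -> list:
    symbols, i, line, line_start = [], 0, 1, 0
    while i < len(text):
        for token, pattern in RULES:
            m = pattern.match(text, i)
            if m:
                break
        else:
            raise ValueError(f"unexpected character {text[i]!r} on line {line}")
        lexeme = m.group()
        if token is not None:
            value = int(lexeme) if token == "NUMBER" else lexeme
            symbols.append(Symbol(token, value, line, i - line_start + 1))
        # Keep the line bookkeeping current, also for discarded text.
        if "\n" in lexeme:
            line += lexeme.count("\n")
            line_start = i + lexeme.rfind("\n") + 1
        i = m.end()
    return symbols


for sym in scan("123 a // note\n  bcd 0"):
    print(sym)
```

The parser then sees only (NUMBER,123), (ID,"a"), (ID,"bcd") and (NUMBER,0), each tagged with where it came from.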

(Non) Deterministic Finite State Automata
A Deterministic Finite State Automaton (DFA) has disjoint character sets on the edges leaving each state, i.e. the choice "which state is next" is deterministic.
A Non-deterministic Finite State Automaton (NFA) does NOT: the character sets on the edges leaving a state can overlap (non-empty intersection), and some edges can be labeled ε, meaning they are taken without consuming a character.
NFAs are used in the translation from regular expressions to FSAs. E.g. when we combine the reg. exp. for IF with the reg. exp. for ID by just merging the two transition graphs, we would get an NFA.
NFAs are a first step in creating a DFA for a scanner. The NFA is then transformed into a DFA.

From regular expressions to NFAs
regexp               NFA
simple letter "a"    a single edge labeled a
empty string         a single edge labeled ε
AB                   concat the NFAs: the NFA for A followed by the NFA for B
A|B                  split with ε-edges into the NFAs for A and B, then merge them
A*                   build a loop with ε-edges around the NFA for A, returning from the accept state of the NFA for A
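The five construction rules can be sketched in Python as follows. This is one common Thompson-style wiring of the ε-edges, and the slide figures may connect them slightly differently; the class `Builder` and the functions `closure` and `accepts` are hypothetical names introduced for illustration. Each rule returns a fragment `(start, accept)`.

```python
from collections import defaultdict
from itertools import count

EPS = None  # label used for ε-edges


class Builder:
    def __init__(self):
        self.edges = defaultdict(set)  # (state, label) -> set of target states
        self._ids = count()

    def _state(self):
        return next(self._ids)

    def _edge(self, src, label, dst):
        self.edges[(src, label)].add(dst)

    def letter(self, c):               # simple letter: one edge labeled c
        s, t = self._state(), self._state()
        self._edge(s, c, t)
        return s, t

    def empty(self):                   # empty string: one ε-edge
        s, t = self._state(), self._state()
        self._edge(s, EPS, t)
        return s, t

    def concat(self, a, b):            # AB: accept of A feeds start of B
        self._edge(a[1], EPS, b[0])
        return a[0], b[1]

    def choice(self, a, b):            # A|B: split, then merge
        s, t = self._state(), self._state()
        for frag in (a, b):
            self._edge(s, EPS, frag[0])
            self._edge(frag[1], EPS, t)
        return s, t

    def star(self, a):                 # A*: loop back from accept of A
        s, t = self._state(), self._state()
        self._edge(s, EPS, a[0])
        self._edge(s, EPS, t)          # zero repetitions
        self._edge(a[1], EPS, a[0])    # another repetition
        self._edge(a[1], EPS, t)
        return s, t


def closure(edges, states):
    """All states reachable from states using only ε-edges."""
    seen, todo = set(states), list(states)
    while todo:
        for t in edges.get((todo.pop(), EPS), ()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def accepts(builder, frag, word):
    current = closure(builder.edges, {frag[0]})
    for c in word:
        moved = set()
        for s in current:
            moved |= builder.edges.get((s, c), set())
        current = closure(builder.edges, moved)
    return frag[1] in current


b = Builder()
# (a | (b)(c))* from the earlier regular-expression slide
nfa = b.star(b.choice(b.letter("a"), b.concat(b.letter("b"), b.letter("c"))))
print([w for w in ["", "a", "bc", "abc", "bca", "b", "cb"] if accepts(b, nfa, w)])
# ['', 'a', 'bc', 'abc', 'bca']
```

The `accepts` helper tracks a set of NFA states instead of guessing, which is the idea behind the NFA-to-DFA conversion.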

The Problem
DFAs are easy to execute (table-driven interpretation).
NFAs are easy to build from reg. exps, but hard to execute: we would need some form of guessing, implemented by backtracking.
To build a DFA from an NFA we avoid the backtracking by taking all choices in the NFA at once: a move with a character or ε gets us to a set of states in the NFA, which will become one state in the DFA.
We keep doing this until we have exhausted all possibilities.
This mechanism is called transitive closure. (This ends because there is only a finite set of subsets of NFA states. How many are there?)

Example IF and ID
let : [a-z]
dig : [0-9]
tok : if | id
if : "i" "f"
id : let (let | dig)*

Example: NFA for IF and ID
[Figure: NFA with states 0 to 8. IF branch: 0 -ε-> 1 -i-> 2 -f-> 3, where 3 accepts IF. ID branch: 0 -ε-> 4 -[a-z]-> 5 and 6 -[a-z0-9]-> 7, with ε-edges connecting 5, 6, 7 and 8, where 8 accepts ID.]
IF has priority over ID.
From 0, with ε we can get to states 1 and 4; this is called an ε-closure.
We can now simulate the behavior of the NFA and build a table for the DFA by making character moves plus ε-closures.

NFA simulation scanning "in"
DFAstate   NFAstates   Move   Next
0          0,1,4       i      2,5,6,8
1          2,5,6,8     n      6,7,8
Only one of the states in 6,7,8 is an accepting state, an ID accepting state, so "in" is an ID.

NFA simulation scanning "if"
DFAstate   NFAstates   Move   Next
0          0,1,4       i      2,5,6,8
1          2,5,6,8     f      3,6,7,8
Two of the states in 3,6,7,8 are accepting: an IF accepting state (3) and an ID accepting state (8). IF has priority over ID, so "if" is an IF.

Definitions: edge(s,c) and closure
edge(s,c): the set of all NFA states reachable from state s following an edge with character c
closure(S): the set of all states reachable from S without consuming a character, i.e. following only ε-edges. It is the smallest set T such that
$T = S \cup \bigl(\bigcup_{s \in T} \mathrm{edge}(s, \varepsilon)\bigr)$
computed by:
T = S
repeat
  T' = T
  $T = T' \cup \bigl(\bigcup_{s \in T'} \mathrm{edge}(s, \varepsilon)\bigr)$
until T' == T
This transitive closure algorithm terminates because there is a finite number of states in the NFA.

DFAedge and NFA Simulation
Suppose we are in DFA state d = {si, sk, sl}. By moving with character c from d we reach a set of new NFA states; call this set DFAedge(d,c), a new or already existing DFA state:
$\mathrm{DFAedge}(d, c) = \mathrm{closure}\bigl(\bigcup_{s \in d} \mathrm{edge}(s, c)\bigr)$
NFA simulation: let the input string be c1…ck
d = closure({s1})   // s1 the start state of the NFA
for i from 1 to k
  d = DFAedge(d, ci)
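The definitions of edge, closure and DFAedge can be sketched in Python on the IF/ID NFA. The edge list below is a reconstruction: the character edges follow the slides, while the exact ε-edges among states 5, 6, 7 and 8 are an assumed layout chosen so that the closures match the simulation tables ({2,5,6,8} after "i", {6,7,8} after "in"). The names `EDGES`, `ACCEPTING`, `simulate` and `token` are illustrative.

```python
import string

EPS = None  # label for ε-edges
LET = frozenset(string.ascii_lowercase)
LET_DIG = LET | frozenset(string.digits)

# (source, label, target); the three ε-edges on the last row are an assumed layout.
EDGES = [
    (0, EPS, 1), (1, frozenset("i"), 2), (2, frozenset("f"), 3),
    (0, EPS, 4), (4, LET, 5), (6, LET_DIG, 7),
    (5, EPS, 8), (8, EPS, 6), (7, EPS, 8),
]
ACCEPTING = [(3, "IF"), (8, "ID")]  # priority order: IF before ID


def edge(s, c):
    """NFA states reachable from s over one edge labeled c (EPS for ε)."""
    return {dst for src, label, dst in EDGES
            if src == s and (label is EPS if c is EPS
                             else label is not EPS and c in label)}


def closure(S):
    """All states reachable from S following only ε-edges."""
    T, todo = set(S), list(S)
    while todo:
        for t in edge(todo.pop(), EPS):
            if t not in T:
                T.add(t)
                todo.append(t)
    return T


def dfa_edge(d, c):
    moved = set()
    for s in d:
        moved |= edge(s, c)
    return closure(moved)


def simulate(word, start=0):
    d = closure({start})
    for c in word:
        d = dfa_edge(d, c)
    return d


def token(d):
    """Highest-priority token accepted by the NFA-state set d, or None."""
    for state, name in ACCEPTING:
        if state in d:
            return name
    return None


for word in ("in", "if"):
    d = simulate(word)
    print(word, sorted(d), token(d))
# in [6, 7, 8] ID
# if [3, 6, 7, 8] IF
```

The worklist in `closure` computes the same fixed point as the repeat-until loop, visiting each state once.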

Constructing a DFA with closure and DFAedge
state d1 = closure({s1}), the closure of the start state of the NFA
make new states by moving from existing states with a character c, using DFAedge(d,c); record these in the transition table
mark accepts in the transition table if there is an accepting state in d; decide by priority if there is more than one accept state
Instead of characters we use non-overlapping (DFA) character classes to keep the table manageable.
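The construction steps above can be sketched as a worklist loop in Python. This sketch uses the IF/ID NFA numbered from start state 1; the ε-edges among states 5, 6, 7 and 8 are again an assumed layout consistent with the closures shown in the slides. For simplicity it iterates over individual characters rather than character classes, and all names (`build_dfa`, `accept_token`, and so on) are illustrative.

```python
import string

EPS = None  # label for ε-edges
LET = frozenset(string.ascii_lowercase)
LET_DIG = LET | frozenset(string.digits)
ALPHABET = sorted(LET_DIG)

# NFA numbered from start state 1; the ε-edges on the last row are an assumed layout.
EDGES = [
    (1, frozenset("i"), 2), (2, frozenset("f"), 3),
    (1, EPS, 4), (4, LET, 5), (6, LET_DIG, 7),
    (5, EPS, 8), (8, EPS, 6), (7, EPS, 8),
]
PRIORITY = [(3, "IF"), (8, "ID")]  # IF has priority over ID


def edge(s, c):
    return {dst for src, label, dst in EDGES
            if src == s and (label is EPS if c is EPS
                             else label is not EPS and c in label)}


def closure(S):
    T, todo = set(S), list(S)
    while todo:
        for t in edge(todo.pop(), EPS):
            if t not in T:
                T.add(t)
                todo.append(t)
    return T


def dfa_edge(d, c):
    moved = set()
    for s in d:
        moved |= edge(s, c)
    return closure(moved)


def build_dfa(start):
    """Return (DFA states as frozensets of NFA states, transition dict)."""
    d1 = frozenset(closure({start}))
    states, todo, trans = [d1], [d1], {}
    while todo:
        d = todo.pop()
        for c in ALPHABET:
            e = frozenset(dfa_edge(d, c))
            if not e:
                continue               # no move on c: left out of the table
            if e not in states:        # a new DFA state
                states.append(e)
                todo.append(e)
            trans[(d, c)] = e
    return states, trans


def accept_token(d):
    for nfa_state, name in PRIORITY:
        if nfa_state in d:
            return name
    return None


states, trans = build_dfa(1)
for d in states:
    print(sorted(d), accept_token(d))
```

The loop discovers the same five NFA-state sets as the slides' DFA, although the order in which it finds them need not match the slides' state numbering.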

NFA to DFA (let's build it)
[Figure: the NFA for IF and ID, here numbered from start state 1. IF branch: 1 -i-> 2 -f-> 3, where 3 accepts IF. ID branch: 1 -ε-> 4 -[a-z]-> 5 and 6 -[a-z0-9]-> 7, with ε-edges connecting 5, 6, 7 and 8, where 8 accepts ID.]

NFA to DFA
[Figure: the same NFA, with the resulting DFA drawn below it.]
DFA states (NFA state sets) and accepted tokens:
  1: {1,4}
  2: {2,5,6,8}  ID
  3: {3,6,7,8}  IF
  4: {6,7,8}    ID
  5: {5,6,8}    ID
DFA edges:
  1 -i-> 2
  1 -[a-h, j-z]-> 5
  2 -f-> 3
  2 -[a-e, g-z, 0-9]-> 4
  3 -[a-z, 0-9]-> 4
  4 -[a-z, 0-9]-> 4
  5 -[a-z, 0-9]-> 4

The transition table for IF ID

p   NFAstates(p)   i           f           a-h,j-z   a-e,g-z,0-9   a-z,0-9   ACPT
1   {1,4}          {2,5,6,8}   -           {5,6,8}   -             -         -
2   {2,5,6,8}      -           {3,6,7,8}   -         {6,7,8}       -         ID
3   {3,6,7,8}      -           -           -         -             {6,7,8}   IF
4   {6,7,8}        -           -           -         -             {6,7,8}   ID
5   {5,6,8}        -           -           -         -             {6,7,8}   ID

Suggested Exercise
Build an NFA and a DFA for integer and float literals
dot: "."
dig: [0-9]
int-lit: dig+
float-lit: dig* dot dig+