CDT314 FABER Formal Languages, Automata and Models of Computation, Lecture 4
CDT314
FABER
Formal Languages, Automata and Models of Computation
Lecture 4
School of Innovation, Design and Engineering Mälardalen University
2012
2
Content
Regular Expressions and Regular Languages
NFA → DFA: Subset Construction (Delmängdskonstruktion)
FA → RE: State Elimination (Tillståndseliminering)
Minimizing DFA by Set Partitioning (Särskiljandealgoritmen)
Grammar: Linear grammar. Regular grammar.
Application: Compiler. Lexical Analysis.
3
Two Basic Theorems about Finite State Automata
Based on C Busch, RPI, Models of Computation
4
I: Kleene Theorem, NFA → DFA
Let M be an NFA accepting L. Then there exists a DFA M´ that accepts L as well.
For each NFA there exists a corresponding DFA that accepts the same language.
5
II: Finite Language Theorem
Any finite language is regular.
Any finite language is FSA-acceptable.
Proof: every finite language is a finite union of single-string languages, and each single string is itself a regular expression, so the union is described by a regular expression.
Example: L = {abba, abb, abab} is expressed by the regular expression "abba|abb|abab".
FSA = Finite State Automaton = DFA/NFA
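The proof idea can be tried out directly. The following is a minimal sketch in Python, assuming the standard `re` module as the regular-expression engine; the helper name `accepts` is made up for this illustration.

```python
import re

# The finite language from the slide; any finite set of strings works the same way.
language = {"abba", "abb", "abab"}

# Union ("|") of the single strings; sorting only makes the pattern deterministic.
pattern = re.compile("|".join(re.escape(w) for w in sorted(language)))

def accepts(word: str) -> bool:
    """True if the whole word equals one of the alternatives."""
    return pattern.fullmatch(word) is not None
```

Here `accepts("abb")` holds while `accepts("ab")` does not, because `fullmatch` requires the entire word to match one alternative.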
6
Some Properties of Regular Languages,
Summary
7
Properties
For regular languages $L_1$ and $L_2$ we will prove that:
Union: $L_1 \cup L_2$
Concatenation: $L_1 L_2$
Star: $L_1^*$
are regular languages.
8
Regular languages are closed under
Union: $L_1 \cup L_2$
Concatenation: $L_1 L_2$
Star: $L_1^*$
9
Regular language $L_1$: NFA $M_1$ with a single final state, $L(M_1) = L_1$
Regular language $L_2$: NFA $M_2$ with a single final state, $L(M_2) = L_2$
10
Example
$L_1 = \{a^n b\}$, accepted by $M_1$
$L_2 = \{ba\}$, accepted by $M_2$
[Figure: NFAs $M_1$ and $M_2$]
11
Union (Thompson’s construction)
NFA for $L_1 \cup L_2$
[Figure: NFA built from $M_1$ and $M_2$]
12
Example
NFA for $L_1 \cup L_2 = \{a^n b\} \cup \{ba\}$
[Figure: NFA built from the machines for $L_1 = \{a^n b\}$ and $L_2 = \{ba\}$]
13
Concatenation (Thompson’s construction)
NFA for $L_1 L_2$
[Figure: $M_1$ followed by $M_2$]
14
Example
NFA for $L_1 L_2 = \{a^n b\}\{ba\} = \{a^n bba\}$
[Figure: the machine for $L_1 = \{a^n b\}$ followed by the machine for $L_2 = \{ba\}$]
15
Star Operation (Thompson’s construction)
NFA for $L_1^*$
[Figure: NFA built from $M_1$]
16
Example
NFA for $L_1^* = \{a^n b\}^*$
[Figure: NFA built from the machine for $L_1 = \{a^n b\}$]
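The three closure constructions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact machines drawn on the slides: the class `NFA`, the helper names and the way ε-moves are wired (label `None`) are choices made for this sketch, in the usual Thompson style with one start and one final state per machine.

```python
from itertools import count

_fresh = count()  # supplies unused state names

class NFA:
    """ε-NFA with one start and one final state; label None marks an ε-move."""
    def __init__(self, start, final, edges):
        self.start, self.final, self.edges = start, final, edges

def symbol(a):
    s, f = next(_fresh), next(_fresh)
    return NFA(s, f, [(s, a, f)])

def union(m1, m2):
    s, f = next(_fresh), next(_fresh)
    glue = [(s, None, m1.start), (s, None, m2.start),
            (m1.final, None, f), (m2.final, None, f)]
    return NFA(s, f, m1.edges + m2.edges + glue)

def concat(m1, m2):
    return NFA(m1.start, m2.final,
               m1.edges + m2.edges + [(m1.final, None, m2.start)])

def star(m):
    s, f = next(_fresh), next(_fresh)
    glue = [(s, None, m.start), (m.final, None, f),
            (s, None, f), (m.final, None, m.start)]
    return NFA(s, f, m.edges + glue)

def closure(m, states):
    """All states reachable from `states` by ε-moves only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in m.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(m, word):
    current = closure(m, {m.start})
    for ch in word:
        moved = {dst for src, label, dst in m.edges
                 if src in current and label == ch}
        current = closure(m, moved)
    return m.final in current

def make_l1():  # L1 = {a^n b}
    return concat(star(symbol("a")), symbol("b"))

def make_l2():  # L2 = {ba}
    return concat(symbol("b"), symbol("a"))
```

Each call to `make_l1()` or `make_l2()` builds a machine with fresh states, so a combined machine such as `union(make_l1(), make_l2())` never shares states with another one.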
17
Reverse of a Regular Language
18
Theorem
The reverse $L^R$ of a regular language $L$ is a regular language.
Proof idea
Construct an NFA that accepts $L^R$: invert the transitions of the NFA that accepts $L$.
19
Proof
Since $L$ is regular, there is an NFA that accepts $L$.
Example: $L = ab^* + ba$
[Figure: NFA accepting $L$]
20
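Invert Transitions
[Figure: the NFA before and after inverting its transitions]
21
Make old initial state a final state and vice versa
[Figure: the NFA before and after swapping initial and final states]
22
Add a new initial state
[Figure: the NFA before and after adding the new initial state]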
23
Resulting machine accepts $L^R$
$L = ab^* + ba$, $L^R = b^*a + ab$
[Figure: NFA accepting $L^R$]
$L^R$ is regular
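The reversal steps can be checked mechanically. The sketch below assumes a small hand-built NFA for $L = ab^* + ba$; its state numbers, the name `new_start` and the helper functions are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def reverse_nfa(edges, start, finals):
    """Return (edges, start, finals) of an NFA accepting the reversed language."""
    reversed_edges = {(dst, label, src) for src, label, dst in edges}
    new_start = "new_start"  # hypothetical name, assumed unused by the NFA
    # ε-moves (label None) from the new initial state to every old final state
    reversed_edges |= {(new_start, None, f) for f in finals}
    return reversed_edges, new_start, {start}

def accepts(edges, start, finals, word):
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({start})
    for ch in word:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return bool(current & finals)

# NFA for L = ab* + ba (state numbers are illustrative)
L_NFA = ({(0, "a", 1), (1, "b", 1), (0, "b", 2), (2, "a", 3)}, 0, {1, 3})
R_NFA = reverse_nfa(*L_NFA)

def check(max_len=5):
    """Compare L and the reversed machine on every word up to max_len."""
    return all(accepts(*L_NFA, w) == accepts(*R_NFA, w[::-1])
               for n in range(max_len + 1)
               for w in map("".join, product("ab", repeat=n)))
```

`check()` confirms, for all short words, that a word is accepted by the original machine exactly when its mirror image is accepted by the reversed one.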
24
Regular Expressions and Regular Languages
25
Theorem: Regular expressions and regular languages are equivalent: a language is regular if and only if some regular expression describes it.
Regular expressions describe regular languages in formal language theory. They have the same expressive power as regular grammars.
26
Theorem - Part 1
Languages Generated by Regular Expressions ⊆ Regular Languages
1. For any regular expression $r$ the language $L(r)$ is regular
27
Theorem - Part 2
Languages Generated by Regular Expressions ⊇ Regular Languages
2. For any regular language $L$ there is a regular expression $r$ with $L(r) = L$
28
Proof - Part 1
For any regular expression $r$ the language $L(r)$ is regular
Proof by induction on the size of $r$
29
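Induction Basis
Primitive regular expressions: $\emptyset$, $\varepsilon$, $a$ (where $a \in \Sigma$)
NFAs $M_1$, $M_2$, $M_3$ accept the corresponding regular languages:
$L(M_1) = \emptyset = L(\emptyset)$
$L(M_2) = \{\varepsilon\} = L(\varepsilon)$
$L(M_3) = \{a\} = L(a)$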
30
Inductive Hypothesis
Assume for regular expressions $r_1$ and $r_2$ that $L(r_1)$ and $L(r_2)$ are regular languages
31
Inductive Step
We will prove that
$L(r_1 + r_2)$
$L(r_1 \cdot r_2)$
$L(r_1^*)$
$L((r_1))$
are regular languages
32
By definition of regular expressions:
$L(r_1 + r_2) = L(r_1) \cup L(r_2)$
$L(r_1 \cdot r_2) = L(r_1)\,L(r_2)$
$L(r_1^*) = (L(r_1))^*$
$L((r_1)) = L(r_1)$
33
By inductive hypothesis we know: $L(r_1)$ and $L(r_2)$ are regular languages
Regular languages are closed under
union: $L(r_1) \cup L(r_2)$
concatenation: $L(r_1)\,L(r_2)$
star: $(L(r_1))^*$
34
Therefore
$L(r_1 + r_2) = L(r_1) \cup L(r_2)$
$L(r_1 \cdot r_2) = L(r_1)\,L(r_2)$
$L(r_1^*) = (L(r_1))^*$
are regular languages
And trivially $L((r_1))$ is a regular language
35
Proof – Part 2
For any regular language $L$ there is a regular expression $r$ with $L(r) = L$
Proof by construction of the regular expression
36
Since $L$ is regular, take the NFA $M$ that accepts it: $L(M) = L$
$M$ has a single final state
37
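From $M$ construct the equivalent Generalized Transition Graph, in which transition labels are regular expressions
Example
[Figure: NFA $M$ with transition labels a, b, c; the parallel transitions a, b become the single label a+b]
QED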
38
Summary: Operations on Regular Expressions
RE            Regular language description
a+b           {a, b}
(a+b)(a+b)    {aa, ab, ba, bb}
a*            {ε, a, aa, aaa, …}
a*b           {b, ab, aab, aaab, …}
(a+b)*        {ε, a, b, aa, ab, ba, aaa, bbb, …}
39
Algebraic Properties
Axiom                             Description
r+s = s+r                         + is commutative
(r+s)+t = r+(s+t)                 + is associative
(rs)t = r(st)                     concatenation is associative
r(s+t) = rs+rt, (s+t)r = sr+tr    concatenation distributes over +
εr = r, rε = r                    ε is the identity element for concatenation
r* = (r+ε)*                       relation between * and ε
r** = r*                          * is idempotent
40
Operator Precedence
1. Kleene star 2. Concatenation 3. Union
allows dropping unnecessary parentheses.
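The effect of the precedence rules can be observed with Python's `re` module, which uses the same ordering (star binds tightest, then concatenation, then union). This is only a sketch: `re` writes union as `|` instead of `+`, and the helper `language` is a name invented here.

```python
import re
from itertools import product

def language(regex, alphabet="abc", max_len=4):
    """All words up to max_len over the alphabet that the regex matches entirely."""
    return {w for n in range(max_len + 1)
            for w in map("".join, product(alphabet, repeat=n))
            if re.fullmatch(regex, w)}

implicit = language(r"ab*|c")      # slide notation: ab* + c
explicit = language(r"(a(b*))|c")  # star first, then concatenation, then union
other = language(r"(ab)*|c")       # a different grouping, a different language
```

`implicit` and `explicit` contain the same words, while `other` differs: it contains `abab`, which `ab*|c` does not match.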
41
Converting Regular Expression to a DFA
42
Example: From a Regular Expression to an NFA
Example: (a+b)*abb, step-by-step construction
[Figure: NFA fragment for (a+b), states 1 to 6]
43
Example: From a Regular Expression to an NFA
Example: (a+b)*abb
[Figure: NFA fragment for (a+b)*, states 0 to 7]
44
Example: From a Regular Expression to an NFA
Example: (a+b)*abb
[Figure: complete NFA for (a+b)*abb, states 0 to 10]
45
Converting FA to a Regular Expression
State Elimination
http://www.math.uu.se/~salling/Movies/FA_to_RegExpression.mov
46
Example
Add a new start (s) and a new final (f) state:
• From s to the ex-starting state (1)
• From each ex-final state (1, 3) to f
[Figure: automaton with states 1, 2, 3 over {a, b}]
47
Let’s remove state 1! Each combination of input/output to 1 will generate a new path once state 1 is gone.
[Figure: the original automaton, and the automaton extended with s and f]
48
When state 1 is gone we must be able to make all those transitions!
[Figure: the automaton with the replacement labels for the paths through state 1, next to the previous automaton]
49
[Figure: first path through state 1 replaced by a direct transition]
50
[Figure: further paths through state 1 replaced by direct transitions]
51
A common mistake: having several arrows between the same pair of states. Join the two arrows (by union of their regular expressions) before going on to the next step.
[Figure: two arrows between the same pair of states]
52
[Figure: the two arrows joined into one]
53
Union again..
[Figure: another pair of parallel arrows joined]
54
Without state 1...
[Figure: automaton with states s, 2, 3, f]
55
Now we repeat the same procedure for the state 2...
[Figure: automaton with states s, 2, 3, f]
56
Following the path s-2-3 we concatenate all strings on our way… don’t forget a* in 2!
[Figure: new transition from s to 3]
57
When 2 is removed, the path 3-2-3 also has to be preserved, so we concatenate the 3-2 string, take the a* loop and go back 2-3 with string b!
[Figure: new self-loop on state 3]
58
This is what the FA looks like without state 2:
[Figure: automaton with states s, 3, f]
59
Finally we remove state 3…
[Figure: automaton with states s, 3, f]
60
...so we concatenate strings s-3, loop in 3 and 3-f
[Figure: resulting label on the transition from s to f]
61
Now we can omit state 3!
From s we have two choices, empty string or the long expression
OR is represented by union, as usual
[Figure: two parallel transitions from s to f]
62
So union the arrows...
...and we are done!
[Figure: single transition from s to f labeled with the final regular expression]
63
Converting FA to a Regular Expression-
An Algorithm for State Elimination
64
• We expand our notion of ε-NFA to allow transitions on arbitrary regular expressions, not simply single symbols or ε.
• Successively eliminate states, replacing transitions that enter and leave a state with a more complicated regular expression, until only two states are left: a start state and an accepting state, with a single transition connecting them, labeled with a regular expression.
• The resulting regular expression is then our answer.
65
• To begin with, the automaton should have a start state that has no transitions into it (including self-loops), and which is not accepting.
• If your automaton does not obey this property, add a new, non-accepting start state s, and add an ε-transition from s to the original start state.
• The automaton should also have a single accepting final state with no transitions leaving it, and no self-loops.
• If your automaton does not have it, add a new final state q, change all other accepting states to non-accepting, and add ε-transitions from them to q.
These changes do not alter the language accepted by the automaton.
66
Repeat the following steps, which eliminate a state:
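1. Pick a non-start, non-accepting state q to eliminate. The state q will have i transitions in and j transitions out. Each will be labeled with a regular expression. For each of the i·j combinations of transitions into and out of q, replace the path p → q → r, where p → q is labeled A, the self-loop on q is labeled B and q → r is labeled C, with a single transition p → r labeled AB*C. (If q has no self-loop, the label is AC.)
And delete state q.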
67
2. If several transitions go between a pair of states, replace them with a single transition that is the union of the individual transitions.
E.g. replace the two transitions p → r labeled A and B with a single transition p → r labeled A+B.
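The two steps can be sketched in Python, with labels kept as strings in Python `re` syntax (so union is written `|`, and ε is the empty string). The function names, the state names `S` and `F` for the new start and final states, and the small three-state automaton used for the check are assumptions made for this illustration.

```python
import re
from itertools import product

def union(x, y):
    """Union of two labels; None means 'no transition'."""
    if x is None:
        return y
    if y is None:
        return x
    return f"(?:{x}|{y})"

def eliminate(states, start, finals, edges):
    """edges maps (p, r) to a regex label. Returns one regex for the language."""
    g = dict(edges)
    g[("S", start)] = ""            # new start state S, ε-move to the old start
    for f in finals:
        g[(f, "F")] = ""            # ε-moves from old finals to new final F
    for q in states:                # step 1: remove q
        loop = g.get((q, q))
        star = f"(?:{loop})*" if loop is not None else ""
        ins = [(p, a) for (p, t), a in g.items() if t == q and p != q]
        outs = [(r, c) for (s, r), c in g.items() if s == q and r != q]
        for (p, a), (r, c) in product(ins, outs):
            # step 2: join with an existing p -> r transition by union
            g[(p, r)] = union(g.get((p, r)), f"(?:{a}){star}(?:{c})")
        g = {k: v for k, v in g.items() if q not in k}
    return g.get(("S", "F"))

# Illustrative automaton: start q0, final q2.
EDGES = {("q0", "q1"): "b", ("q1", "q1"): "b", ("q1", "q0"): "a",
         ("q1", "q2"): "a|b", ("q2", "q2"): "b"}
RESULT = eliminate(["q0", "q1", "q2"], "q0", {"q2"}, EDGES)

def same_language(r1, r2, alphabet="ab", max_len=6):
    """Compare two regexes on every word up to max_len."""
    return all((re.fullmatch(r1, w) is None) == (re.fullmatch(r2, w) is None)
               for n in range(max_len + 1)
               for w in map("".join, product(alphabet, repeat=n)))
```

The expression in `RESULT` is heavily parenthesized, but `same_language(RESULT, "(bb*a)*bb*(a|b)b*")` can be used to compare it with a hand-derived answer on all short words.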
68
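Example
[Figure: a four-state automaton (states 1 to 4) reduced step by step, first to states 1, 3, 4 and then to a single transition from 1 to 4 labeled with the resulting regular expression]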
69
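N.B. common mistake!
[Figure: a state with self-loops labeled a and b]
Self-loops a, b on the same state mean (a+b)*, i.e. NOT an expression in which a and b are starred separately.
See the examples on pages 46 and 47 in Salling's book.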
70
Minimizing DFA by Set Partitioning
http://www.math.uu.se/~salling/Movies/SubsetConstruction.mov
71
Minimization of DFA
A deterministic finite automaton is not always the smallest possible one accepting its language.
There may be states with the same "acceptance behavior". This applies to states p and q if, for every input word, the automaton reaches a final state either from both p and q or from neither.
72
State Reduction by Set Partitioning (Särskiljandealgoritmen)
The set partitioning technique is similar to one used for partitioning people into groups based on their responses to a questionnaire.
The following slides show the detailed steps for computing equivalent state sets of the starting DFA and constructing the corresponding reduced DFA.
73
[Figure: starting DFA with states 0 to 5, and reduced DFA with states 0, {1, 2}, 3, {4, 5}]
74
State Reduction by Set Partitioning
Step 0: Partition the states into two groups, accepting and non-accepting.
P1 = {3, 4, 5}, P2 = {0, 1, 2}
[Figure: starting DFA with states 0 to 5]
75
State Reduction by Set Partitioning
Step 1: Get the response of each state for each input symbol. Notice that states 3 and 0 show different responses from the other states in the same set.
          P1 = {3, 4, 5}      P2 = {0, 1, 2}
state     3    4    5         0    1    2
on a      p1   p1   p1        p2   p1   p1
on b      p2   p1   p1        p2   p1   p1
[Figure: starting DFA with the sets P1 and P2 marked]
76
Step 2: Partition the sets according to the responses, and go to Step 1 until no partition occurs.
          P11 = {4, 5}    P12 = {3}    P21 = {1, 2}    P22 = {0}
state     4     5                      1     2
on a      p11   p11                    p12   p12
on b      p11   p11                    p11   p11
No further partition is possible for the sets P11 and P21. So the final partition is:
{4, 5} {3} {1, 2} {0}
77
{4, 5} {3} {1, 2} {0}
The minimized DFA consists of the four states of the final partition, and its transitions are the ones corresponding to the starting DFA.
[Figure: starting DFA and minimized DFA]
78
DFA Minimization Algorithm
The algorithm
P ← { F, Q−F }
while (P is still changing)
    T ← { }
    for each set s ∈ P
        partition s by the responses to each α ∈ Σ into s1, s2, …, sk
        T ← T ∪ {s1, s2, …, sk}
    if T ≠ P then P ← T
Why does this work?
Partition P ⊆ 2^Q (power set)
Start off with 2 subsets of Q: F and Q−F
While loop takes Pi → Pi+1 by splitting one or more sets
Pi+1 is at least one step closer to the partition with |Q| sets
Maximum of |Q| splits
Partitions are never combined
Initial partition ensures that final states are intact
This is a fixed-point algorithm!
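A runnable sketch of this fixed-point iteration in Python follows. The function name `minimize`, the dictionary encoding of the transition function and the sample DFA (a five-state DFA for (a+b)*abb with states A to E, final state E) are assumptions made for this illustration.

```python
def minimize(states, alphabet, delta, finals):
    """Return the coarsest partition of states into equivalent blocks."""
    partition = [b for b in (set(finals), set(states) - set(finals)) if b]
    while True:
        block_of = {q: i for i, b in enumerate(partition) for q in b}
        refined = []
        for block in partition:
            groups = {}
            for q in sorted(block):
                # response of q: the block reached on each input symbol
                response = tuple(block_of[delta[q, a]] for a in alphabet)
                groups.setdefault(response, set()).add(q)
            refined.extend(groups.values())
        if len(refined) == len(partition):  # nothing was split: fixed point
            return refined
        partition = refined

STATES = "ABCDE"
DELTA = {("A", "a"): "B", ("A", "b"): "C",
         ("B", "a"): "B", ("B", "b"): "D",
         ("C", "a"): "B", ("C", "b"): "C",
         ("D", "a"): "B", ("D", "b"): "E",
         ("E", "a"): "B", ("E", "b"): "C"}

BLOCKS = minimize(STATES, "ab", DELTA, {"E"})
```

For this DFA the loop splits twice and then stabilizes: states A and C end up in the same block, all other states stay alone.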
79
Set Partition Algorithm (Salling book)
Example 2.39: We apply the set partition algorithm step by step to the following DFA over $\Sigma = \{a, b\}$:
[Figure: DFA with states 1 to 6]
Two strings x, y are distinguished by a language (DFA) if there is a (distinguishing) string z such that only one of the strings xz, yz ends in an accepting state (belongs to the language).
80
{1, 2, 3, 4, 5, 6}
Non-accepting: {1, 2, 4, 5, 6}   Accepting: {3}
What happens when we read a?
{2, 4, 5} {1, 6} {3}
From 1 and 6 by a we reach 3 which is accepting. They form a special group.
What happens when we read b?
{2, 4} {5} {1, 6} {3}
From 5 we end in 6 by b, leaving its group.
We search for strings that are distinguished by the DFA!
Distinguishing string: a or b
[Figure: DFA with states 1 to 6]
81
The minimal automaton has four states: {1, 6}, {2, 4}, {3}, {5}
[Figure: minimal automaton next to the original DFA with states 1 to 6]
82
The Chomsky Hierarchy
83
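[Figure: Regular Languages inside Context-Free Languages]
Context-free languages that are not regular: $\{a^n b^n\}$, $\{w w^R\}$
Further non-regular languages: $\{a^{n!} : n \ge 0\}$, $\{a^n b^l c^{n+l} : n, l \ge 0\}$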
84
Some Additional Examples for Your Exercise
85
Example of subset construction for NFA → DFA
NFA N for (a+b)*abb
[Figure: NFA N with states 0 to 10]
86
Example of subset construction for NFA → DFA
Translation table for DFA
STATE    INPUT SYMBOL
         a    b
A        B    C
B        B    D
C        B    C
D        B    E
E        B    C
87
Example of subset construction for NFA → DFA
Result of applying the subset construction to the NFA for (a+b)*abb.
[Figure: DFA with states A (start), B, C, D, E]
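The subset construction for this example can be reproduced with a short Python sketch. The NFA below is an assumption: it uses the state numbering 0 to 10 of the usual textbook NFA for (a+b)*abb, with `None` as the label of ε-moves; the function names are invented for this illustration.

```python
from collections import deque

EPS = None  # label used for ε-moves

# NFA for (a+b)*abb, states 0..10 (numbering assumed, start state 0)
NFA = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4},
    (2, "a"): {3}, (4, "b"): {5},
    (3, EPS): {6}, (5, EPS): {6}, (6, EPS): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}

def closure(states):
    """ε-closure of a set of NFA states."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for t in NFA.get((q, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def subset_construction(start, alphabet):
    """Return the DFA start state and its transition table."""
    start_set = closure({start})
    table, queue = {}, deque([start_set])
    while queue:
        current = queue.popleft()
        if current in table:
            continue
        table[current] = {}
        for a in alphabet:
            moved = {t for q in current for t in NFA.get((q, a), ())}
            table[current][a] = closure(moved)
            queue.append(table[current][a])
    return start_set, table

START, TABLE = subset_construction(0, "ab")
```

Under this numbering the construction yields five DFA states, with `START` equal to {0, 1, 2, 4, 7} (state A) and the set containing 10 as the only accepting state (state E).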
88
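Another Example of State Elimination
[Figure: automaton with states $q_0$, $q_1$, $q_2$ over {a, b}; the parallel transitions a, b are first merged into the single label a+b]
89
Another Example of State Elimination
[Figure: after eliminating $q_1$, state $q_0$ has a self-loop labeled bb*a and a transition to $q_2$ labeled bb*(a+b); $q_2$ keeps its self-loop b]
90
Another Example of State Elimination
Resulting Regular Expression
r = (bb*a)* bb*(a+b) b*
$L(r) = L(M) = L$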
91
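In General
Removing states
[Figure: state $q$ between $q_i$ and $q_j$, with transitions $q_i \to q$ labeled a, $q \to q_j$ labeled b, $q_j \to q$ labeled c, $q \to q_i$ labeled d, and a self-loop on $q$ labeled e]
After removing $q$: $q_i \to q_i$ labeled ae*d, $q_i \to q_j$ labeled ae*b, $q_j \to q_i$ labeled ce*d, $q_j \to q_j$ labeled ce*b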
92
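Obtaining the final regular expression
[Figure: two remaining states $q_0$ and $q_f$ with labels $r_1$ (self-loop on $q_0$), $r_2$ ($q_0 \to q_f$), $r_3$ ($q_f \to q_0$), $r_4$ (self-loop on $q_f$)]
$r = r_1^* r_2 (r_4 + r_3 r_1^* r_2)^*$
$L(r) = L(M) = L$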
93
Example: From an NFA to a DFA
states   a    b
A        B    C
B        B    D
C        B    C
D        B    E
E        B    C
[Figure: DFA with states A, B, C, D, E]
94
Example: From an NFA to a DFA
states   a    b
A        B    A
B        B    D
D        B    E
E        B    A
[Figure: DFA with states A, B, D, E]
95
Example: DFA Minimization
[Figure: DFA with states s0 to s4 (final state s4), and the minimized DFA in which s0 and s2 are merged]
Current Partition               Split on a    Split on b
P0  {s4} {s0, s1, s2, s3}       none          {s0, s1, s2} {s3}
P1  {s4} {s3} {s0, s1, s2}      none          {s0, s2} {s1}
P2  {s4} {s3} {s1} {s0, s2}     none          none
96
Example: DFA Minimization
What about a ( b + c )* ?
First, the subset construction:
[Figure: NFA with states q0 to q9; labeled transitions q0 → q1 on a, q4 → q5 on b, q6 → q7 on c]
       NFA states                 ε-closure(move(s, *))
                                  a                         b                         c
s0     q0                         q1, q2, q3, q4, q6, q9    none                      none
s1     q1, q2, q3, q4, q6, q9     none                      q5, q8, q9, q3, q4, q6    q7, q8, q9, q3, q4, q6
s2     q5, q8, q9, q3, q4, q6     none                      s2                        s3
s3     q7, q8, q9, q3, q4, q6     none                      s2                        s3
[Figure: resulting DFA with states s0 to s3; s1, s2 and s3 are final states]
97
Example: DFA Minimization
Then, apply the minimization algorithm
Current Partition          Split on a    Split on b    Split on c
P0  {s1, s2, s3} {s0}      none          none          none
To produce the minimal DFA
[Figure: minimal DFA with states s0 and s1; s0 → s1 on a, self-loop on the final state s1 on b, c]
98
Grammars
99
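Grammars
Grammars express languages
Example: the English language
$\langle\text{sentence}\rangle \to \langle\text{noun\_phrase}\rangle\,\langle\text{predicate}\rangle$
$\langle\text{noun\_phrase}\rangle \to \langle\text{article}\rangle\,\langle\text{noun}\rangle$
$\langle\text{predicate}\rangle \to \langle\text{verb}\rangle$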
100
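$\langle\text{article}\rangle \to \text{a}$
$\langle\text{article}\rangle \to \text{the}$
$\langle\text{noun}\rangle \to \text{bird}$
$\langle\text{noun}\rangle \to \text{dog}$
$\langle\text{verb}\rangle \to \text{sings}$
$\langle\text{verb}\rangle \to \text{barks}$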
101
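A derivation of “the bird sings”:
$\langle\text{sentence}\rangle \Rightarrow \langle\text{noun\_phrase}\rangle\,\langle\text{predicate}\rangle$
$\Rightarrow \langle\text{article}\rangle\,\langle\text{noun}\rangle\,\langle\text{predicate}\rangle$
$\Rightarrow \langle\text{article}\rangle\,\langle\text{noun}\rangle\,\langle\text{verb}\rangle$
$\Rightarrow \text{the}\ \langle\text{noun}\rangle\,\langle\text{verb}\rangle$
$\Rightarrow \text{the bird}\ \langle\text{verb}\rangle$
$\Rightarrow \text{the bird sings}$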
102
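A derivation of “a dog barks”:
$\langle\text{sentence}\rangle \Rightarrow \langle\text{noun\_phrase}\rangle\,\langle\text{predicate}\rangle$
$\Rightarrow \langle\text{noun\_phrase}\rangle\,\langle\text{verb}\rangle$
$\Rightarrow \langle\text{article}\rangle\,\langle\text{noun}\rangle\,\langle\text{verb}\rangle$
$\Rightarrow \text{a}\ \langle\text{noun}\rangle\,\langle\text{verb}\rangle$
$\Rightarrow \text{a dog}\ \langle\text{verb}\rangle$
$\Rightarrow \text{a dog barks}$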
103
The language of the grammar:
L = { "a bird barks", "a bird sings", "the bird barks", "the bird sings", "a dog barks", "a dog sings", "the dog barks", "the dog sings" }
104
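Notation
$\langle\text{noun}\rangle \to \text{bird}$
$\langle\text{noun}\rangle \to \text{dog}$
Each line is a production rule: the left-hand side is a non-terminal (variable), the right-hand side here is a terminal.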
105
Example
Grammar: $S \to aSb$, $S \to \varepsilon$
Derivation of sentence $ab$:
$S \Rightarrow aSb \Rightarrow ab$
using $S \to aSb$, then $S \to \varepsilon$
106
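Grammar: $S \to aSb$, $S \to \varepsilon$
Derivation of sentence $aabb$:
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$
using $S \to aSb$ twice, then $S \to \varepsilon$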
107
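Other derivations
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaaaSbbbb \Rightarrow aaaabbbb$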
108
The language of the grammar $S \to aSb$, $S \to \varepsilon$:
$L = \{a^n b^n : n \ge 0\}$
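The derivations can be enumerated mechanically. The following Python sketch explores leftmost derivations breadth-first; the function name `generate` and the dictionary encoding of productions (keys are variables, the empty string stands for ε) are assumptions of this illustration, and the length cut-off is exact only for linear grammars such as the ones in this lecture.

```python
from collections import deque

def generate(productions, start, max_len):
    """Sentences of length <= max_len derivable from start."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost variable in the sentential form, if any
        i = next((k for k, ch in enumerate(form) if ch in productions), None)
        if i is None:                       # no variable left: a sentence
            if len(form) <= max_len:
                words.add(form)
            continue
        for rhs in productions[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            # a linear sentential form has at most one variable
            if len(new) <= max_len + 1 and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

ANBN = generate({"S": ["aSb", ""]}, "S", 6)
```

`ANBN` then contains exactly the strings $a^n b^n$ with $n \le 3$, including the empty string.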
109
Formal Definition
Grammar $G = (V, T, S, P)$
$V$: set of variables
$T$: set of terminal symbols
$S$: start variable
$P$: set of production rules
110
Example
Grammar $G$: $S \to aSb$, $S \to \varepsilon$
$G = (V, T, S, P)$
$V = \{S\}$, $T = \{a, b\}$
$P = \{S \to aSb,\ S \to \varepsilon\}$
111
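Sentential Form: a sentence that contains variables and terminals
Example
$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$
The intermediate strings are sentential forms; the last string, $aaabbb$, is a sentence (Swedish: sats).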
112
We write: $S \Rightarrow^* aaabbb$
Instead of: $S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$
113
In general we write $w_1 \Rightarrow^* w_n$
if $w_1 \Rightarrow w_2 \Rightarrow w_3 \Rightarrow \cdots \Rightarrow w_n$
By default $w \Rightarrow^* w$
114
Example
Grammar: $S \to aSb$, $S \to \varepsilon$
Derivations:
$S \Rightarrow^* \varepsilon$
$S \Rightarrow^* ab$
$S \Rightarrow^* aabb$
$S \Rightarrow^* aaabbb$
115
Example
Grammar: $S \to aSb$, $S \to \varepsilon$
Derivations:
$S \Rightarrow^* aaSbb$
$aaSbb \Rightarrow^* aaaaaSbbbbb$
116
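Another Grammar Example
Grammar $G$: $S \to Ab$, $A \to aAb$, $A \to \varepsilon$
Derivations:
$S \Rightarrow Ab \Rightarrow b$
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow abb$
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow aaAbbb \Rightarrow aabbb$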
117
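More Derivations
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow aaAbbb \Rightarrow aaaAbbbb \Rightarrow aaaaAbbbbb \Rightarrow aaaabbbbb$
$S \Rightarrow^* aaaabbbbb$
$S \Rightarrow^* aaaaaabbbbbbb$
$S \Rightarrow^* a^n b^n b$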
118
The Language of a Grammar
For a grammar $G$ with start variable $S$:
$L(G) = \{w : S \Rightarrow^* w\}$, where $w$ is a string of terminals
119
Example
For grammar $G$: $S \to Ab$, $A \to aAb$, $A \to \varepsilon$
$L(G) = \{a^n b^n b : n \ge 0\}$
Since $S \Rightarrow^* a^n b^n b$
120
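Notation
$A \to aAb$, $A \to \varepsilon$ is written as $A \to aAb \mid \varepsilon$
$\langle\text{article}\rangle \to \text{a}$, $\langle\text{article}\rangle \to \text{the}$ is written as $\langle\text{article}\rangle \to \text{a} \mid \text{the}$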
121
Linear Grammars
122
Linear Grammars
Grammars with at most one variable (non-terminal) on the right side of a production
Examples:
$S \to aSb$, $S \to \varepsilon$
$S \to Ab$, $A \to aAb$, $A \to \varepsilon$
123
A Non-Linear Grammar
Grammar $G$: $S \to SS$, $S \to \varepsilon$, $S \to aSb$, $S \to bSa$
$L(G) = \{w : n_a(w) = n_b(w)\}$
124
A Linear Grammar
Grammar $G$: $S \to A$, $A \to aB \mid \varepsilon$, $B \to Ab$
$L(G) = \{a^n b^n : n \ge 0\}$
125
Right-Linear Grammars
All productions have the form $A \to xB$ or $A \to x$
Example: $S \to abS$, $S \to a$
126
Left-Linear Grammars
All productions have the form $A \to Bx$ or $A \to x$
Example: $S \to Aab$, $A \to Aab \mid B$, $B \to a$
127
Regular Grammars
A grammar is a regular grammar if and only if it is right-linear or left-linear.
128
Regular Grammars Generate
Regular Languages
129
Theorem
Languages Generated by Regular Grammars = Regular Languages
130
Theorem - Part 1
Languages Generated by Regular Grammars ⊆ Regular Languages
Any regular grammar generates a regular language
131
Theorem - Part 2
Languages Generated by Regular Grammars ⊇ Regular Languages
Any regular language is generated by a regular grammar
132
Proof – Part 1
Languages Generated by Regular Grammars ⊆ Regular Languages
The language $L(G)$ generated by any regular grammar $G$ is regular
133
The case of Right-Linear Grammars
Let $G$ be a right-linear grammar
We will prove: $L(G)$ is regular
Proof idea: We will construct an NFA $M$ with $L(M) = L(G)$
134
Grammar $G$ is right-linear
Example: $S \to aA \mid B$, $A \to aaB$, $B \to bB \mid a$
135
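Construct NFA $M$ such that every state is a grammar variable:
$S \to aA \mid B$, $A \to aaB$, $B \to bB \mid a$
States: $S$, $A$, $B$ and a special final state $V_F$
136
Add edges for each production:
$S \to aA$: edge from $S$ to $A$ labeled a
137
$S \to B$: ε-edge from $S$ to $B$
138
$A \to aaB$: path from $A$ to $B$ labeled a, a through one intermediate state
139
$B \to bB$: self-loop on $B$ labeled b
140
$B \to a$: edge from $B$ to $V_F$ labeled a
141
$S \Rightarrow aA \Rightarrow aaaB \Rightarrow aaabB \Rightarrow aaaba$
[Figure: NFA $M$ reading aaaba along the states $S$, $A$, $B$, $B$, $V_F$]
142
NFA $M$, Grammar $G$: $S \to aA \mid B$, $A \to aaB$, $B \to bB \mid a$
$L(M) = L(G) = aaab^*a + b^*a$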
143
In General
A right-linear grammar $G$ has variables: $V_0, V_1, V_2, \ldots$
and productions: $V_i \to a_1 a_2 \cdots a_m V_j$ or $V_i \to a_1 a_2 \cdots a_m$
144
We construct the NFA $M$ such that each variable $V_i$ corresponds to a node:
$V_0, V_1, V_2, \ldots$ and a special final state $V_F$
145
For each production $V_i \to a_1 a_2 \cdots a_m V_j$ we add transitions and intermediate nodes:
[Figure: path from $V_i$ to $V_j$ labeled $a_1, a_2, \ldots, a_m$]
Example: Convert Right-Linear Grammar to FA by JFLAP, http://www.cs.duke.edu/csed/jflap/tutorial/grammar/toFA/index.html
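The construction can be sketched in Python. The triple encoding of productions, the function names and the state name `V_F` for the special final state are assumptions made for this illustration; the grammar used is $S \to aA \mid B$, $A \to aaB$, $B \to bB \mid a$.

```python
import re
from itertools import count, product

FINAL = "V_F"  # the special final state

def grammar_to_nfa(productions):
    """productions: (head, terminals, tail) triples; tail None means head -> terminals."""
    fresh, edges = count(), set()
    for head, terminals, tail in productions:
        target = FINAL if tail is None else tail
        if not terminals:                 # head -> tail (or head -> ε): an ε-move
            edges.add((head, None, target))
            continue
        src = head
        for i, a in enumerate(terminals):
            last = i == len(terminals) - 1
            dst = target if last else ("mid", next(fresh))  # intermediate node
            edges.add((src, a, dst))
            src = dst
    return edges

def accepts(edges, start, word):
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({start})
    for ch in word:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return FINAL in current

G = [("S", "a", "A"), ("S", "", "B"), ("A", "aa", "B"),
     ("B", "b", "B"), ("B", "a", None)]
M = grammar_to_nfa(G)

def agrees_with(regex, max_len=6):
    """Compare the NFA with a regex on every word over {a, b} up to max_len."""
    return all(accepts(M, "S", w) == (re.fullmatch(regex, w) is not None)
               for n in range(max_len + 1)
               for w in map("".join, product("ab", repeat=n)))
```

`agrees_with("aaab*a|b*a")` compares the constructed NFA with the expression aaab*a + b*a on all short words.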
146
Example
[Figure: a right-linear grammar $G$ over the variables $V_0, \ldots, V_4$ with terminals $a_1, \ldots, a_9$, and the corresponding NFA $M$ with states $V_0, \ldots, V_4, V_F$]
$L(G) = L(M)$
147
The case of Left-Linear Grammars
Let $G$ be a left-linear grammar
We will prove: $L(G)$ is regular
Proof idea: We will construct a right-linear grammar $G'$ with $L(G) = L(G')^R$
148
Since $G$ is a left-linear grammar, the productions look like:
$A \to B a_1 a_2 \cdots a_k$
$A \to a_1 a_2 \cdots a_k$
149
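Construct right-linear grammar $G'$
In $G$: $A \to B a_1 a_2 \cdots a_k$, i.e. $A \to Bv$
In $G'$: $A \to a_k \cdots a_2 a_1 B$, i.e. $A \to v^R B$
150
Construct right-linear grammar $G'$
In $G$: $A \to a_1 a_2 \cdots a_k$, i.e. $A \to v$
In $G'$: $A \to a_k \cdots a_2 a_1$, i.e. $A \to v^R$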
151
It is easy to see that: $L(G) = L(G')^R$
Since $G'$ is right-linear, we have: $L(G')$ is a regular language, hence $L(G')^R$ is a regular language, hence $L(G)$ is a regular language.
152
Proof - Part 2
Languages Generated by Regular Grammars ⊇ Regular Languages
Any regular language $L$ is generated by some regular grammar $G$
153
Any regular language $L$ is generated by some regular grammar $G$
Proof idea
Since $L$ is regular, there is an NFA $M$ such that $L = L(M)$
Construct from $M$ a regular grammar $G$ such that $L(M) = L(G)$
154
Example
$L = L(M)$, $L = ab^*ab(b^*ab)^*$
[Figure: NFA $M$ with states $q_0$, $q_1$, $q_2$, $q_3$]
155
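Convert $M$ to a right-linear grammar $G$:
$q_0 \to a q_1$
$q_1 \to b q_1$
$q_1 \to a q_2$
$q_2 \to b q_3$
$q_3 \to q_1$
$q_3 \to \varepsilon$
$L(G) = L(M) = L$
[Figure: NFA $M$ with states $q_0$, $q_1$, $q_2$, $q_3$]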
156
In General
For any transition from $q$ to $p$ labeled $a$, add production $q \to a p$ (variable, terminal, variable)
157
For any final state $q_f$, add production $q_f \to \varepsilon$
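Both rules fit in a few lines of Python. The triple encoding, the function names and the printed arrow `->` are choices made for this sketch; the transitions are those of a four-state NFA with states q0 to q3, an ε-move from q3 to q1 (label `None`) and final state q3.

```python
def nfa_to_grammar(edges, finals):
    """One production q -> a p per transition, plus q_f -> ε per final state."""
    productions = [(q, a or "", p) for q, a, p in edges]
    productions += [(f, "", None) for f in finals]
    return productions

def show(production):
    """Render a (head, terminal, tail) triple as text."""
    head, terminal, tail = production
    body = " ".join(x for x in (terminal, tail) if x) or "ε"
    return f"{head} -> {body}"

EDGES = [("q0", "a", "q1"), ("q1", "b", "q1"), ("q1", "a", "q2"),
         ("q2", "b", "q3"), ("q3", None, "q1")]
RULES = [show(p) for p in nfa_to_grammar(EDGES, ["q3"])]
```

`RULES` lists one production per transition followed by the ε-production of the final state.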
158
Since $G$ is a right-linear grammar, $G$ is also a regular grammar with $L(G) = L(M) = L$
159
Regular Grammars
A regular grammar is any right-linear or left-linear grammar
Examples
$G_1$: $S \to abS$, $S \to a$
$G_2$: $S \to Aab$, $A \to Aab \mid B$, $B \to a$
160
Observation
Regular grammars generate regular languages
Examples
$G_1$: $S \to abS$, $S \to a$, with $L(G_1) = (ab)^*a$
$G_2$: $S \to Aab$, $A \to Aab \mid B$, $B \to a$, with $L(G_2) = aab(ab)^*$
161
Regular Languages
Chomsky’s Language Hierarchy
Non-regular languages
162
Application: Compiler, Lexical Analysis
(Lexical analysis in compiler theory)
163
What is a compiler?
A compiler is a program that translates a source language into an equivalent target language.
C program:
while (i > 3) { a[i] = b[i]; i++; }
compiler does this
assembly program:
mov eax, ebx
add eax, 1
cmp eax, 3
jcc eax, edx
164
What is a compiler?
Java program: class foo { int bar; ... }
compiler does this
C program: struct foo { int bar; ... }
165
What is a compiler?
Java program: class foo { int bar; ... }
compiler does this
Java virtual machine program: ...
166
Phases of a compiler
Source Program → Lexical Analyzer (Scanner) → Tokens → Syntax Analyzer (Parser) → Parse Tree → Semantic Analyzer → Abstract Syntax Tree with attributes
167
Compilation in a Nutshell
Source code (character stream): if (b == 0) a = b;
Lexical analysis
Token stream: if ( b == 0 ) a = b ;
Parsing
Abstract syntax tree (AST): if with condition == (b, 0) and body = (a, b)
Semantic Analysis
Decorated AST: the same tree annotated with types (boolean for ==, int for b, 0 and a; a is an lvalue)
168
Stages of Analysis
Lexical Analysis – breaking the input up into individual words/tokens
Syntax Analysis – parsing the phrase structure of the program
Semantic Analysis – calculating the program’s meaning
169
Lexical Analyzer
Input is a stream of characters
Produces a stream of names, keywords & punctuation marks
Discards white space and comments
170
Lexical Analyzer
Recognize “tokens” in a program source code.
The tokens can be variable names, reserved words, operators, numbers, etc.
Each kind of token can be specified as an RE, e.g., a variable name is of the form [A-Za-z][A-Za-z0-9]*.
We can then construct an ε-NFA to recognize it automatically.
171
Lexical Analyzer
By putting all these ε-NFAs together, we obtain one which can recognize all different kinds of tokens in the input string.
We can then convert this ε-NFA to an NFA and then to a DFA, and implement this DFA as a deterministic program: the lexical analyzer.
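A small lexical analyzer along these lines can be sketched with Python's `re` module. Two caveats: `re` is a backtracking matcher rather than an explicit DFA, so this only illustrates the token specifications, and the token kinds, the keyword set and the function name `tokenize` are assumptions of this sketch.

```python
import re

# Token kinds and their regular expressions, tried in this order.
TOKEN_SPEC = [(kind, re.compile(pattern)) for kind, pattern in [
    ("NUMBER", r"[0-9]+"),
    ("NAME", r"[A-Za-z][A-Za-z0-9]*"),   # the variable-name RE from the slide
    ("OP", r"==|[=;(){}+\-\[\]]"),
    ("SKIP", r"\s+"),                    # white space is discarded
]]
KEYWORDS = {"if", "while"}               # reserved words (illustrative)

def tokenize(text):
    """Turn a character stream into a list of (kind, lexeme) tokens."""
    pos, tokens = 0, []
    while pos != len(text):
        for kind, regex in TOKEN_SPEC:
            m = regex.match(text, pos)
            if m:
                break
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        lexeme = m.group()
        if kind == "NAME" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        if kind != "SKIP":
            tokens.append((kind, lexeme))
        pos = m.end()
    return tokens
```

For the input `if (b == 0) a = b;` the call `tokenize(...)` yields the keyword `if`, the names `b` and `a`, the number `0` and the operators, with white space dropped.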
172
RE → NFA → DFA → Minimal DFA
Thompson’s Construction (RE → NFA)
Subset Construction (NFA → DFA)
Hopcroft Minimization (DFA → Minimal DFA)