flt course manual


Upload: haftu-hagos

Post on 07-Jul-2018


8/18/2019 FLT Course Manual

http://slidepdf.com/reader/full/flt-course-manual 1/28

SAINT MARY UNIVERSITY

Formal Language Theory

Haftu Hagos

Chapter One

The Theory of Computation

Introduction

Computer science is a practical discipline. Those who have worked in it often have a marked preference for useful and tangible problems over theoretical speculation. This is certainly true of computer science students, who are interested mainly in working on difficult applications from the real world. Theoretical questions are interesting to them only if they help in finding good solutions. This attitude is appropriate, since without applications there would be little interest in computers. But given this practical orientation, one might well ask, "Why study theory?"

The first answer is that theory provides concepts and principles that help us understand the general nature of the discipline. The field of computer science includes a wide range of special topics, from machine design to programming. The use of computers in the real world involves a wealth of specific detail that must be learned for a successful application. This makes computer science a very diverse and broad discipline. But in spite of this diversity, there are some common underlying principles. To study these basic principles, we construct abstract models of computers and computation. These models embody the important features that are common to both hardware and software and that are essential to many of the special and complex constructs we encounter while working with computers.

A second, and perhaps not so obvious, answer is that the ideas we will discuss have some immediate and important applications. The fields of digital design, programming languages, and compiler design are the most obvious examples, but there are many others. The concepts we study here run like a thread through computer science, from operating systems to pattern recognition.

The third answer is one of which we hope to convince the reader. The subject matter is intellectually stimulating and fun. It provides many challenging, puzzle-like problems that can lead to some sleepless nights.

Therefore, in this course we will look at models that represent features at the core of all computers and their applications. To model the hardware of the computer, we introduce the notion of an automaton (plural, automata). An automaton is a construct that possesses all the indispensable features of a digital computer. It accepts input, produces output, may have some temporary storage, and can make decisions in transforming the input into the output. A formal language is an abstraction of the general characteristics of programming languages.

A formal language consists of a set of symbols and some rules of formation by which these symbols can be combined into entities called sentences. A formal language is the set of all strings permitted by the rules of formation. Though some of the formal languages we study here are simpler than programming languages, they have many of the same essential features. We can learn a great deal about programming languages from formal languages. Finally, we will formalize the concept of mechanical computation by giving a precise definition of the term algorithm, and we will study the kinds of problems that are (and are not) suitable for solution by such mechanical means.

1.1 Mathematical Preliminaries and Notation

Sets

A set is a collection of elements without any structure other than membership. To indicate that $x$ is an element of the set $S$, we write $x \in S$. The statement that $x$ is not in $S$ is written $x \notin S$. A set is specified by enclosing some description of its elements in curly braces; for example, the set of integers 0, 1, 2 is shown as

$S = \{0, 1, 2\}$.

Ellipses are used whenever the meaning is clear. Thus, $\{a, b, c, \ldots, z\}$ stands for all the lowercase letters of the English alphabet, while $\{2, 4, 6, \ldots\}$ denotes the set of all positive even integers. When the need arises, we use more explicit notation, in which we write

$S = \{ i : i > 0,\ i \text{ is even} \}$

We read this as "$S$ is the set of all $i$ such that $i$ is greater than zero and $i$ is even", implying of course that $i$ is an integer.

The usual set operations are union ($\cup$), intersection ($\cap$), difference ($-$), and complementation, defined as

$S_1 \cup S_2 = \{ x : x \in S_1 \lor x \in S_2 \}$

$S_1 \cap S_2 = \{ x : x \in S_1 \land x \in S_2 \}$

$S_1 - S_2 = \{ x : x \in S_1 \land x \notin S_2 \}$

$\overline{S} = \{ x : x \in U \land x \notin S \}$

where $U$ is the universal set.

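The set with no elements, called the empty set or the null set, is denoted by $\emptyset$. From the definition of a set, it is obvious that

$S \cup \emptyset = S - \emptyset = S$

$S \cap \emptyset = \emptyset$

$\overline{\emptyset} = U$

$\overline{\overline{S}} = S$

The following useful identities, known as De Morgan's laws, hold:

$\overline{S_1 \cup S_2} = \overline{S_1} \cap \overline{S_2}$

$\overline{S_1 \cap S_2} = \overline{S_1} \cup \overline{S_2}$

together with the distribution property

$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

A set $S_1$ is said to be a subset of $S$ if every element of $S_1$ is also an element of $S$. We write this as

$S_1 \subseteq S$

• Let $A$ and $B$ be sets. When does $A = B$?

When they contain the same elements, that is, when $A \subseteq B$ and $B \subseteq A$.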
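The set operations and identities above can be tried out on small concrete sets. The following is a minimal sketch using Python's built-in set type; the universal set `U`, the sample sets, and the helper name `complement` are arbitrary choices made for illustration and do not come from the manual.

```python
# Universal set and sample sets: arbitrary illustrative values.
U = set(range(10))
S1 = {0, 1, 2, 3}
S2 = {2, 3, 4, 5}

def complement(s, universe=U):
    """Complement of s relative to the universal set."""
    return universe - s

union = S1 | S2          # {x : x in S1 or x in S2}
intersection = S1 & S2   # {x : x in S1 and x in S2}
difference = S1 - S2     # {x : x in S1 and x not in S2}

# De Morgan's laws
de_morgan_1 = complement(S1 | S2) == complement(S1) & complement(S2)
de_morgan_2 = complement(S1 & S2) == complement(S1) | complement(S2)

# Distribution property
A, B, C = {1, 2}, {2, 3}, {3, 4}
distributive = (A | (B & C)) == ((A | B) & (A | C))

# Two sets are equal exactly when each is a subset of the other.
X, Y = {1, 2, 3}, {3, 2, 1}
equal_by_subsets = (X <= Y and Y <= X) == (X == Y)

print(union, intersection, difference)
print(de_morgan_1, de_morgan_2, distributive, equal_by_subsets)
```

A check on particular sets is only evidence, of course; the identities themselves follow from the definitions.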

If $S_1 \subseteq S$, but $S$ contains an element not in $S_1$, we say that $S_1$ is a proper subset of $S$; we write this as

$S_1 \subset S$

If $S_1$ and $S_2$ have no common elements, that is, $S_1 \cap S_2 = \emptyset$, then the sets are said to be disjoint.

Theorem: If $A$ and $B$ are both finite sets, then $n(A \cup B) = n(A) + n(B) - n(A \cap B)$.

A set is said to be finite if it contains a finite number of elements; otherwise it is said to be infinite. The size of a finite set is the number of elements in it and is denoted by $|S|$.

A given set normally has many subsets. The set of all subsets of a set $S$ is called the power set of $S$ and is denoted by $2^S$. Observe that $2^S$ is a set of sets.

Example: If $S$ is the set $\{a, b, c\}$, then its power set is

$2^S = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}$

so that $|2^S| = 8$.

Sets whose elements are ordered sequences of elements from other sets are said to be the Cartesian product of those sets. For the Cartesian product of two sets, which itself is a set of ordered pairs, we write

$S_1 \times S_2 = \{(x, y) : x \in S_1, y \in S_2\}$

Example: Let $S_1 = \{2, 4\}$ and $S_2 = \{2, 3, 5, 6\}$. Then

$S_1 \times S_2 = \{(2,2), (2,3), (2,5), (2,6), (4,2), (4,3), (4,5), (4,6)\}$
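Both constructions can be reproduced mechanically. The sketch below, assuming only Python's standard `itertools` module, builds the power set of the three-element example set and the Cartesian product of the two example sets; `power_set` is a helper name introduced here for illustration.

```python
from itertools import chain, combinations, product

def power_set(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}

S = {"a", "b", "c"}
P = power_set(S)

S1 = {2, 4}
S2 = {2, 3, 5, 6}
cartesian = set(product(S1, S2))  # all ordered pairs (x, y), x in S1, y in S2

print(len(P))             # number of subsets of S
print(sorted(cartesian))
```

Subsets are stored as `frozenset` values because ordinary Python sets are not hashable and so cannot themselves be members of a set.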

1.2 Relations and Functions

A function is a rule that assigns to elements of one set a unique element of another set. If $f$ denotes a function, then the first set is called the domain of $f$ and the second set is its range. We write

$f : S_1 \to S_2$

to indicate that the domain of $f$ is a subset of $S_1$ and that the range of $f$ is a subset of $S_2$. If the domain of $f$ is all of $S_1$, we say that $f$ is a total function on $S_1$. Otherwise, $f$ is said to be a partial function.

Relations are more general than functions: in a function each element of the domain has exactly one associated element in the range; in a relation there may be several such elements in the range.

1.3 Graphs and Trees

A graph is a construct consisting of two finite sets, the set $V = \{v_1, v_2, \ldots, v_n\}$ of vertices and the set $E = \{e_1, e_2, \ldots, e_m\}$ of edges. Each edge is a pair of vertices from $V$; for instance,

$e_i = (v_j, v_k)$

is an edge from $v_j$ to $v_k$.

Figure 1.1

Graphs are conveniently visualized by diagrams in which the vertices are represented as circles and the edges as lines with arrows connecting the vertices, as shown above.

The graph with vertices $\{v_1, v_2, v_3\}$ and edges $\{(v_1, v_3), (v_2, v_3), (v_3, v_1), (v_3, v_3)\}$ is depicted in Figure 1.1.
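The example graph can be written down directly as the two sets $V$ and $E$. The following sketch does so in Python; the helper names `successors` and `has_cycle` are introduced here for illustration only. The cycle test is included because a graph containing a cycle, such as the self-loop at $v_3$, cannot be a tree.

```python
# Vertices and directed edges of the example graph.
V = {"v1", "v2", "v3"}
E = {("v1", "v3"), ("v2", "v3"), ("v3", "v1"), ("v3", "v3")}

# Every edge must be a pair of vertices from V.
assert all(u in V and w in V for (u, w) in E)

def successors(v):
    """Vertices reachable from v along a single edge."""
    return {w for (u, w) in E if u == v}

def has_cycle():
    """Detect a directed cycle by depth-first search."""
    visiting, done = set(), set()

    def visit(v):
        if v in visiting:      # reached a vertex on the current path again
            return True
        if v in done:          # already fully explored, no cycle through it
            return False
        visiting.add(v)
        found = any(visit(w) for w in successors(v))
        visiting.discard(v)
        done.add(v)
        return found

    return any(visit(v) for v in V)

print(successors("v3"))
print(has_cycle())
```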

Trees are a particular type of graph. A tree is a directed graph that has no cycles and that has one distinct vertex, called the root, such that there is exactly one path from the root to every other vertex. This definition implies that the root has no incoming edges and that there are some vertices without outgoing edges. These are called the leaves of the tree.

The height of the tree is the largest level number of any vertex, where the level of a vertex is the number of edges on the path from the root to it.

Figure 1.2

1.4 Proof Techniques

Testing programs is certainly essential. However, testing goes only so far, since we cannot try our program on every input. More importantly, if the program is complex, say a tricky recursion or iteration, we are unlikely to write it correctly unless we understand what happens on each pass through the loop or each recursive call. When testing tells us that the code is incorrect, we still need to get it right. To make our iteration or recursion correct, we need to set up an inductive hypothesis.

1.4.1 Deductive proof

As you know from your previous studies, a deductive proof consists of a sequence of statements whose truth leads us from some initial statement, called the hypothesis or the given statements, to a conclusion statement. Each step in the proof must follow, by some accepted logical principle, from either the given facts, or some of the previous statements in the deductive proof, or a combination of these.

The theorem that is proved when we go from a hypothesis H to a conclusion C is the statement "if H then C." We say that C is deduced from H. An example theorem of the form "if H then C" will illustrate these points.

Theorem 1.3: If $x \ge 4$, then $2^x \ge x^2$.

First notice that the hypothesis H is "$x \ge 4$". This hypothesis has a parameter $x$, and thus is neither true nor false. Rather, its truth depends on the value of the parameter $x$; H is true for $x = 6$ and false for $x = 2$.

Likewise, the conclusion C is "$2^x \ge x^2$". This statement also uses the parameter $x$ and is true for certain values of $x$ and not others. For example, C is false for $x = 3$, since $2^3 = 8$, which is not as large as $3^2 = 9$. On the other hand, C is true for $x = 4$, since $2^4 = 4^2 = 16$. For $x = 5$, the statement is also true, since $2^5 = 32$ is at least as large as $5^2 = 25$.

Perhaps you can see the intuitive argument that tells us that the conclusion $2^x \ge x^2$ will be true whenever $x \ge 4$.
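One way to make the intuition concrete: each time $x$ grows by 1, the left side $2^x$ doubles, while the right side grows only by the factor $((x+1)/x)^2$, which is at most $(5/4)^2 = 1.5625$ once $x \ge 4$. The sketch below spot-checks both the conclusion and this ratio step over a finite range; the bound `UPPER` is an arbitrary choice, and a finite check of this kind is evidence rather than a proof.

```python
def conclusion_holds(x):
    """C: 2**x >= x**2."""
    return 2 ** x >= x ** 2

# The values discussed in the text.
print(conclusion_holds(3))  # False: 8 < 9
print(conclusion_holds(4))  # True: 16 >= 16
print(conclusion_holds(5))  # True: 32 >= 25

# Spot-check the theorem on a finite range (evidence, not a proof).
UPPER = 200  # arbitrary bound chosen for illustration
all_hold = all(conclusion_holds(x) for x in range(4, UPPER + 1))

# The growth-ratio step of the intuitive argument:
# (x+1)^2 is less than 2 * x^2 whenever x >= 4.
ratio_ok = all((x + 1) ** 2 < 2 * x ** 2 for x in range(4, UPPER + 1))
print(all_hold, ratio_ok)
```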

Theorem 1.4: If $x$ is the sum of the squares of four positive integers, then $2^x \ge x^2$.

Proof:

Step 1: We restate one of the given statements of the theorem: that $x$ is the sum of the squares of four integers. It often helps in proofs if we name quantities that are referred to but not named, and we have done so here, giving the four integers the names $a$, $b$, $c$ and $d$, so that $x = a^2 + b^2 + c^2 + d^2$. Since each of the four integers is positive, each square is at least 1, so $x \ge 4$, and Theorem 1.3 then gives $2^x \ge x^2$.

1.4.2 Contradiction proof

Theorem 1.5: Let $S$ be a finite subset of some infinite set $U$. Let $T$ be the complement of $S$ with respect to $U$. Then $T$ is infinite.

Proof: Intuitively, this theorem says that if you have an infinite supply of something ($U$), and you take a finite amount away ($S$), then you still have an infinite amount left. Let us begin by restating the facts of the theorem: $S$ is finite, $U$ is infinite, and $T$ is the complement of $S$ with respect to $U$.

However, we are still stuck, so let us try to prove the theorem by contradiction.

The negation of the conclusion is "$T$ is finite". Let us assume $T$ is finite, along with the statement of the hypothesis that says $S$ is finite; i.e., $\|S\| = n$ for some integer $n$. Similarly, we can restate the assumption that $T$ is finite as $\|T\| = m$ for some integer $m$.

Now one of the given statements tells us that $S \cup T = U$ and $S \cap T = \emptyset$; that is, the elements of $U$ are exactly the elements of $S$ and $T$. Thus, there must be $n + m$ elements of $U$. Since $n + m$ is an integer and we have shown $\|U\| = n + m$, it follows that $U$ is finite. More precisely, we showed that the number of elements in $U$ is some integer, which is the definition of "finite". But the statement that $U$ is finite contradicts the given statement that $U$ is infinite. We have thus used the negation of our conclusion to prove the negation of one of the given statements of the hypothesis, and by the principle of "proof by contradiction" we may conclude the theorem is true.

1.5 Formal Languages

A language can be seen as a system suitable for the expression of certain ideas, facts and concepts. To formalize the notion of a language, one must cover all the varieties of languages, such as natural (human) languages and programming languages. Let us look at some common features across languages.

One may broadly see that a language is a collection of sentences; a sentence is a sequence of words; and a word is a combination of syllables. If one considers a language that has a script, then it can be observed that a word is a sequence of symbols of its underlying alphabet. The formal learning of a language has the following three steps.

1) Learning its alphabet: the symbols that are used in the language.
2) Its words: various sequences of symbols of its alphabet.
3) Formation of sentences: sequences of various words that follow certain rules of the language.

In this learning, step 3 is the most difficult part. Let us postpone discussing the construction of sentences and concentrate on steps 1 and 2. For the time being, instead of completely ignoring sentences, one may look at the common features of a word and a sentence and agree that both are just sequences of some symbols of the underlying alphabet. For example, the English sentence

"The English articles - a, an and the – are categorized into two types: indefinite and definite."

may be treated as a sequence of symbols from the Roman alphabet, along with enough punctuation marks such as comma, full stop and colon, and one more special symbol, namely the blank space, which is used to separate two words. Thus, abstractly, the terms sentence and word may be used interchangeably for a sequence of symbols from an alphabet. With this discussion, we start with the basic definitions of alphabets and strings, and then we introduce the notion of language formally.

Further, in this chapter, we introduce some of the operations on languages and discuss algebraic properties of languages with respect to those operations. We end the chapter with an introduction to finite representation of languages via regular expressions.

1.5.1 Alphabets

Definition: An alphabet is a finite set of objects called symbols.
Notation: $\Sigma = \{a, b, \ldots, z\}$

1.5.2 String

Definition: A string over an alphabet $\Sigma$ is a finite sequence of symbols from $\Sigma$.
Notation: $w, x, y, \ldots$ for strings. Instead of $w = (w_1, w_2, \ldots, w_k)$ we will simply write $w = w_1 w_2 \ldots w_k$.

Note: Strings over the binary alphabet $\{0, 1\}$ are often called binary strings.

Definition: The length of a string is the number of symbols contained in the string.
Notation: $|w|$.

1.5.3 Language

Definition: A language over an alphabet $\Sigma$ is a set of strings over $\Sigma$.
Notation: $L, M, N, \ldots$ for languages; $|L|$ for the size (number of strings) of $L$.

Notation: $\Sigma^*$ will denote the set of all strings over $\Sigma$. Then a language $L$ over $\Sigma$ is just a subset of $\Sigma^*$.

Notation: $\Sigma^k$ will denote the set of all strings of length $k$ over $\Sigma$.
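These notations are easy to experiment with. The sketch below, assuming Python strings as the representation of strings over an alphabet, enumerates $\Sigma^k$ for the binary alphabet and builds a finite slice of $\Sigma^*$. The function names and the sample language (binary strings ending in 1) are hypothetical choices made for illustration.

```python
from itertools import product

def sigma_k(alphabet, k):
    """All strings of length k over the alphabet (the set Sigma^k)."""
    return {"".join(t) for t in product(sorted(alphabet), repeat=k)}

def strings_up_to(alphabet, n):
    """Finite slice of Sigma^*: all strings of length at most n."""
    result = set()
    for k in range(n + 1):
        result |= sigma_k(alphabet, k)
    return result

SIGMA = {"0", "1"}  # the binary alphabet

print(sorted(sigma_k(SIGMA, 2)))
print(len(strings_up_to(SIGMA, 3)))

# A language is any subset of Sigma^*; here, binary strings ending in 1,
# restricted to length at most 3 (a hypothetical example language).
L = {w for w in strings_up_to(SIGMA, 3) if w.endswith("1")}
print(sorted(L, key=len))
```

Note that $\Sigma^0$ contains exactly one string, the empty string, which the code represents as `""`.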

Review Questions

1) Suppose Walter's online music store conducts a customer survey to determine the preferences of its customers. Customers are asked what type of music they like. They may choose from the following categories: Pop (P), Jazz (J), Classical (C), and none of the above (N). Of 100 customers, some of the results are as follows:

44 like Classical
27 like all three
15 like only Pop
10 like Jazz and Classical, but not Pop

How many like Classical but not Jazz? We can fill in a Venn diagram to keep track of the numbers.

There are $n(C) = 44$ total that like Classical, and $n(C \cap J) = 27 + 10 = 37$ that like both Jazz and Classical, so $44 - 37 = 7$ like Classical but not Jazz.

2) Let $U = \{1, 2, 3, 4, 5, \ldots, 10\}$, $A = \{2, 4, 6, 8, 10\}$, $B = \{3, 6, 9\}$, $C = \{1, 2, 3, 8, 9, 10\}$. Perform the indicated operations:

a) $A \cap B$
b) $A \cup B$
c) $A' \cap C$
d) $(A \cap C)'$
e) $(A \cup B) \cap C$
f) $(A \cup B) \cap A$

3) Determine if the following statements are true or false. Here $A$ represents any set.

a) $\emptyset \subseteq A$
b) $A' \subseteq A$
c) $(A')' = A$

4) Let $U = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$, $A = \{1, 3, 5, 7, 9\}$ and $B = \{1, 4, 5, 9\}$.

a) Find $A \cup B$
b) Find $A \cap B$
c) Use a Venn diagram to represent these sets.

5) One hundred students were surveyed and asked if they are currently taking math (M), English (E) and/or History (H). The survey findings are summarized here:

Survey Results

$n(M) = 45$
$n(E) = 41$
$n(H) = 40$
$n(M \cap E) = 15$
$n(M \cap H) = 18$
$n(M \cap E \cap H) = 7$
$n[(M \cap E) \cup (M \cap H) \cup (E \cap H)] = 36$

a) Use a Venn diagram to represent this data.
b) How many students are only taking math?

6) Ninety people at a Superbowl party were surveyed to see what they ate while watching the game. The following data was collected:

48 had nachos.
39 had wings.
35 had potato skins.
20 had both wings and potato skins.
19 had both potato skins and nachos.
22 had both wings and nachos.
10 had nachos, wings and potato skins.

a) Use a Venn diagram to represent this data.
b) How many had nothing?

7) If $L_1 = \{0, 1, 01\}$ and $L_2 = \{1, 00\}$, then $L_1 L_2$ is ……..?

8) If $L_1 = \{b, ba, bab\}$ and $L_2 = \{\varepsilon, b, bb, abb\}$, then we have $L_1 L_2 =$ ……..?

9) In the context of formal language theory, another important notation is the Kleene star or Kleene closure. The Kleene closure of a language $L$, denoted by $L^*$, is defined as

$L^* = \bigcup_{n \ge 0} L^n = L^0 \cup L^1 \cup L^2 \cup \cdots$

where $L^0 = \{\varepsilon\}$ and $L^n$ is $L$ concatenated with itself $n$ times.

Example: The Kleene closure of the language $\{01\}$ is

$\{\varepsilon, 01, 0101, 010101, \ldots\}$

a) If $L = \{0, 10\}$, then $L^*$ is ……..?
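For checking answers to questions of this kind, concatenation and a bounded version of the Kleene closure can be sketched in a few lines. The code below is an illustration under stated assumptions: strings are Python strings, the empty string is `""`, and because $L^*$ is usually infinite the helper `kleene_star_up_to` (a name introduced here) returns only the strings up to a chosen length. The first sample languages are hypothetical; the second call reproduces the $\{01\}$ example.

```python
def concat(L1, L2):
    """Concatenation L1 L2 = {xy : x in L1, y in L2}."""
    return {x + y for x in L1 for y in L2}

def kleene_star_up_to(L, max_len):
    """Strings of L* whose length is at most max_len.

    L* itself is infinite unless L is empty or contains only the empty
    string, so this computes just a finite slice of it.
    """
    result = {""}            # L^0 contains only the empty string
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from L.
        frontier = {w for w in concat(frontier, L)
                    if len(w) <= max_len} - result
        result |= frontier
    return result

print(sorted(concat({"a", "ab"}, {"b", "ba"})))
print(sorted(kleene_star_up_to({"01"}, 6), key=len))
```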

******************** THE END ********************

Chapter Two

Finite Automata

Introduction

We will be making use of mathematical models of physical systems called finite automata, or finite state machines, to recognize whether or not a string is in a particular language. This section introduces this idea and gives the precise definition of what constitutes a finite automaton. We look at several variations on the definition (to do with the concept of determinism) and see that they are equivalent for the purpose of recognizing whether or not a string is in a given language.

Finite automata are a useful model for many important kinds of hardware and software. Let us just list some of the most important kinds.

1) Software for designing and checking the behavior of digital circuits.
2) The "lexical analyzer" of a typical compiler, that is, the compiler component that breaks the input into logical units such as identifiers, keywords and punctuation.
3) Software for scanning large bodies of text, such as collections of web pages, to find occurrences of words, phrases or other patterns.
4) Software for verifying systems of all types that have a finite number of distinct states, such as communication protocols or protocols for secure exchange of information.

Finite automata can be classified into two types:

• Deterministic Finite Automaton (DFA)
• Non-deterministic Finite Automaton (NDFA / NFA)

In a DFA, for each input symbol, one can determine the state to which the machine will move. Hence, it is called a deterministic automaton. As it has a finite number of states, the machine is called a Deterministic Finite Machine or Deterministic Finite Automaton.

In an NDFA, for a particular input symbol, the machine can move to any combination of the states in the machine. In other words, the exact state to which the machine moves cannot be determined. Hence, it is called a non-deterministic automaton. As it has a finite number of states, the machine is called a Non-deterministic Finite Machine or Non-deterministic Finite Automaton.

2.1 Deterministic Finite Accepters

The first type of automaton we study in detail is the finite accepter that is deterministic in its operation. We start with a precise formal definition of deterministic accepters.

2.1.1 Deterministic Accepters and Their Transition Graphs

A deterministic finite accepter or dfa is defined by the quintuple

$M = (Q, \Sigma, \delta, q_0, F)$

where

$Q$ is a finite set of internal states,
$\Sigma$ is a finite set of symbols called the input alphabet,
$\delta : Q \times \Sigma \to Q$ is a total function called the transition function,
$q_0 \in Q$ is the initial state,
$F \subseteq Q$ is a set of final states.

A deterministic finite accepter operates in the following manner. At the initial time, it is assumed to be in the initial state $q_0$, with its input mechanism on the leftmost symbol of the input string. During each move of the automaton, the input mechanism advances one position to the right, so each move consumes one input symbol. When the end of the string is reached, the string is accepted if the automaton is in one of its final states. Otherwise the string is rejected. The input mechanism can move only from left to right and reads exactly one symbol on each step.

The transitions from one internal state to another are governed by the transition function $\delta$. For example, if

$\delta(q_0, a) = q_1$

then if the dfa is in state $q_0$ and the current input symbol is $a$, the dfa will go into state $q_1$.

To visualize and represent finite automata, we use transition graphs, in which the vertices represent states and the edges represent transitions. The labels on the vertices are the names of the states, while the labels on the edges are the current values of the input symbol.

For example, if $q_0$ and $q_1$ are internal states of some dfa $M$, then the graph associated with $M$ will have one vertex labeled $q_0$ and another labeled $q_1$. An edge $(q_0, q_1)$ labeled $a$ represents the transition $\delta(q_0, a) = q_1$. The initial state will be identified by an incoming unlabeled arrow not originating at any vertex. Final states are drawn with a double circle.

For every transition rule $\delta(q_i, a) = q_j$, the graph has an edge $(q_i, q_j)$ labeled $a$.

From the above graph representation,

$M = (\{q_0, q_1, q_2\}, \{0, 1\}, \delta, q_0, \{q_1\})$

where $\delta$ is given by the edges of the transition graph.

The dfa accepts the string 01. Starting in state $q_0$, the symbol 0 is read first. Looking at the edges of the graph, we see that the automaton remains in state $q_0$. Next, the 1 is read and the automaton goes into state $q_1$. We are now at the end of the string and, at the same time, in the final state $q_1$. Therefore, the string 01 is accepted. The dfa does not accept the string 00, since after reading two consecutive 0's, it will be in state $q_0$. By similar reasoning, we see that the automaton will accept the strings 101, 0111, and 11001, but not 100 or 1100.
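The operation just described can be simulated in a few lines. The sketch below is not the manual's own table: the transition graph is not reproduced in this text, so only $\delta(q_0, 0) = q_0$ and $\delta(q_0, 1) = q_1$ are taken from the walkthrough, and the remaining four entries are an assumed reconstruction chosen to agree with every accepted and rejected string mentioned above.

```python
# Transition function as a dict keyed by (state, symbol).
# Only delta(q0,0)=q0 and delta(q0,1)=q1 are stated in the text; the
# other four entries are an assumed reconstruction (see lead-in).
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
START = "q0"
FINAL = {"q1"}

def run(word):
    """Return the list of states visited while reading word."""
    state = START
    trace = [state]
    for symbol in word:
        state = DELTA[(state, symbol)]   # exactly one move per symbol
        trace.append(state)
    return trace

def accepts(word):
    """A string is accepted if the run ends in a final state."""
    return run(word)[-1] in FINAL

for w in ["01", "00", "101", "0111", "11001", "100", "1100"]:
    print(w, run(w), accepts(w))
```

Because $\delta$ is a total function, the dictionary has an entry for every (state, symbol) pair, and the lookup in `run` can never fail on a string over $\{0, 1\}$.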

2.1.2 Extended Transition Function for a DFA

It is convenient to introduce the extended transition function $\delta^* : Q \times \Sigma^* \to Q$. The second argument of $\delta^*$ is a string, rather than a single symbol, and its value gives the state the automaton will be in after reading that string. For example, if

$\delta(q_0, a) = q_1$

and

$\delta(q_1, b) = q_2$

then

$\delta^*(q_0, ab) = q_2$

Formally, we can define $\delta^*$ recursively by

$\delta^*(q, \lambda) = q$    (2.1)

$\delta^*(q, wa) = \delta(\delta^*(q, w), a)$    (2.2)

for all $q \in Q$, $w \in \Sigma^*$, $a \in \Sigma$. To see why this is appropriate, let us apply these definitions to the simple case above.

$\delta^*(q_0, ab) = \delta(\delta^*(q_0, a), b)$    (2.3)

But

$\delta^*(q_0, a) = \delta(\delta^*(q_0, \lambda), a) = \delta(q_0, a) = q_1$

Substituting in (2.3), we get

$\delta^*(q_0, ab) = \delta(q_1, b) = q_2$

as expected.
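The recursive definition translates almost literally into code. In the sketch below, the empty string $\lambda$ is represented by `""`, and `delta` holds only the two transitions used in the worked example; the function name `delta_star` and the dictionary representation are assumptions made for illustration.

```python
def delta_star(delta, q, w):
    """Extended transition function, following the recursive definition:
    delta*(q, lambda) = q
    delta*(q, wa)     = delta(delta*(q, w), a)
    """
    if w == "":                       # base case: empty string
        return q
    prefix, last = w[:-1], w[-1]      # split w as (prefix)(last symbol)
    return delta[(delta_star(delta, q, prefix), last)]

# Only the two transitions used in the worked example; a full dfa would
# define delta for every (state, symbol) pair.
delta = {("q0", "a"): "q1", ("q1", "b"): "q2"}

print(delta_star(delta, "q0", "ab"))
```

The recursion peels symbols off the right end of the string, exactly as the second equation of the definition does, so evaluating `delta_star(delta, "q0", "ab")` retraces the substitution steps shown above.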

2.1.3 Languages Accepted by DFAs

Having made a precise definition of an accepter, we are now ready to define formally what we mean by an associated language. The association is obvious: the language is the set of all strings accepted by the automaton.

Definition: The language accepted by a dfa $M = (Q, \Sigma, \delta, q_0, F)$ is the set of all strings on $\Sigma$ accepted by $M$. In formal notation,

$L(M) = \{w \in \Sigma^* : \delta^*(q_0, w) \in F\}$

Note that we require that $\delta$, and consequently $\delta^*$, be total functions. At each step a unique move is defined, so that we are justified in calling such an automaton deterministic. A dfa will process every string in $\Sigma^*$ and either accept it or reject it. Nonacceptance means that the dfa stops in a nonfinal state, so that

$\overline{L(M)} = \{w \in \Sigma^* : \delta^*(q_0, w) \notin F\}$

Linz Example 2.2

The automaton in Linz Figure 2.2 accepts all strings consisting of an arbitrary number of a's followed by a single b.

In set notation, the language accepted by the automaton is $L = \{a^n b : n \ge 0\}$.

Note that $q_2$ has two self-loop edges, each with a different label. We write this compactly with multiple labels.

A trap state is a state from which the automaton can never "escape".

Note that $q_2$ is a trap state in the dfa transition graph shown in Linz Figure 2.2. Transition graphs are quite convenient for understanding finite automata.

Linz Fig. 2.2: DFA Transition Graph with Trap State

For other purposes, such as representing finite automata in programs, a tabular representation of the transition function $\delta$ may also be convenient (as shown in Linz Fig. 2.3).

Linz Fig. 2.3: DFA Transition Table

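A tabular representation maps naturally onto a nested dictionary, with one row per state and one column per input symbol. The sketch below is a plausible table for the automaton of the example, not a copy of the figure: the text names $q_2$ as the trap state, while the roles of $q_0$ (initial) and $q_1$ (final) are assumed here from the description of the language $\{a^n b : n \ge 0\}$.

```python
# Transition table: one row per state, one column per input symbol.
# q2 is the trap state named in the text; the roles of q0 (initial) and
# q1 (final) are assumed from the language {a^n b : n >= 0}.
TABLE = {
    "q0": {"a": "q0", "b": "q1"},
    "q1": {"a": "q2", "b": "q2"},
    "q2": {"a": "q2", "b": "q2"},
}
START, FINAL = "q0", {"q1"}

def accepts(word):
    """Run the dfa by repeated table lookup."""
    state = START
    for symbol in word:
        state = TABLE[state][symbol]
    return state in FINAL

def is_trap(state):
    """A trap state loops to itself on every input symbol."""
    return all(target == state for target in TABLE[state].values())

# Print the table in the usual row/column layout.
symbols = ["a", "b"]
print("state  " + "  ".join(symbols))
for state, row in TABLE.items():
    print(f"{state:5}  " + "  ".join(row[s] for s in symbols))
```

The helper `is_trap` applies to any state; whether a trap state accepts or rejects depends only on whether it belongs to the set of final states.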

Linz Example 2.3

Find a deterministic finite accepter that recognizes the set of all strings on $\Sigma = \{a, b\}$ starting with the prefix ab. Linz Figure 2.4 shows a transition graph for a dfa for this example. The dfa must accept ab and then continue until the string ends.

This dfa has a final trap state $q_2$ (accepts) and a non-final trap state $q_3$ (rejects).

Linz Fig. 2.4

Linz Example 2.4

Find a dfa that accepts all strings on $\{0, 1\}$, except those containing the substring 001.

• need to "remember" whether last two inputs were 00
• use state changes for "memory"

Linz Fig. 2.5

Linz Figure 2.5 shows a dfa for this example.

Accepts: λ, 0, 00, 01, 010000

Rejects: 001, 000001, 001010101010
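The "state changes as memory" idea can be checked by brute force. The following sketch builds one possible dfa for this language and compares it with the defining property on every binary string up to a chosen length. The state names (`none`, `0`, `00`, `trap`) are hypothetical labels recording how much of 001 has just been seen; they are not the names used in the figure.

```python
from itertools import product

# Hypothetical state names recording how much of "001" has just been seen.
DELTA = {
    ("none", "0"): "0",    ("none", "1"): "none",
    ("0", "0"): "00",      ("0", "1"): "none",
    ("00", "0"): "00",     ("00", "1"): "trap",   # "001" completed
    ("trap", "0"): "trap", ("trap", "1"): "trap",
}
START, FINAL = "none", {"none", "0", "00"}

def accepts(word):
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in FINAL

def agrees_with_spec(max_len):
    """Compare the dfa with the defining property on all short strings."""
    for k in range(max_len + 1):
        for t in product("01", repeat=k):
            w = "".join(t)
            if accepts(w) != ("001" not in w):
                return False
    return True

print(agrees_with_spec(10))
```

An exhaustive comparison up to a fixed length does not prove the automaton correct for all strings, but it catches most construction mistakes quickly.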

2.1.4 Regular Languages

Linz Definition 2.3 (Regular Language): A language $L$ is called regular if and only if there exists a dfa $M$ such that $L = L(M)$. Thus dfas define the family of languages called regular.

Linz Example 2.5

Show that the language $L = \{awa : w \in \{a, b\}^*\}$ is regular.

• Construct a dfa.
• Check whether the string begins and ends with "a".
• Be in a final state when the second a is input.

Linz Figure 2.6 shows a dfa for this example.

Question: How would we prove that a language is not regular? We will come back to this question in chapter 4.

Linz Fig. 2.6

Linz Example 2.6

Let $L$ be the language in the previous example (Linz Example 2.5). Show that $L^2$ is regular.

$L^2 = \{a w_1 a a w_2 a : w_1, w_2 \in \{a, b\}^*\}$

• Construct a dfa.
• Use the Example 2.5 dfa as a starting point.
• Accept two consecutive strings of the form awa.
• Note that any two consecutive a's could start a second string.

Linz Figure 2.7 shows a dfa for this example.

The last example suggests the conjecture that if a language $L$ is regular, then so are $L^2$, $L^3$, etc. We will come back to this issue in chapter 4.

2.2 Nondeterministic Finite Accepters

2.2.1 Nondeterministic Accepters

Linz Definition 2.4 (NFA): A nondeterministic finite accepter or nfa is defined by the tuple

$M = (Q, \Sigma, \delta, q_0, F)$

where $Q$, $\Sigma$, $q_0$, and $F$ are defined as for deterministic finite accepters, but

$\delta : Q \times (\Sigma \cup \{\lambda\}) \to 2^Q$

Remember for dfas:

• $Q$ is a finite set of internal states.
• $\Sigma$ is a finite set of symbols called the input alphabet.
• $q_0 \in Q$ is the initial state.
• $F \subseteq Q$ is a set of final states.

The key differences between dfas and nfas are

1. dfa: $\delta$ yields a single state
   nfa: $\delta$ yields a set of states

2. dfa: consumes input on each move
   nfa: can move without input ($\lambda$)

3. dfa: moves for all inputs in all states
   nfa: some situations have no defined moves

An nfa accepts a string if some possible sequence of moves ends in a final state.

An nfa rejects a string if no possible sequence of moves ends in a final state.

Linz Example 2.7

Consider the transition graph shown in Linz Figure 2.8. Note the nondeterminism in state $q_0$, with two possible transitions for input a. Also, state $q_3$ has no transition for any input.

Linz Fig. 2.8

Linz Example 2.8

Consider the transition graph for an nfa shown in Linz Figure 2.9. Note the nondeterminism and the λ-transition. Note: Here λ means the move takes place without consuming any input symbol. This is different from accepting an empty string.

Linz Fig. 2.9

Transitions:

• for $(q_0, 0)$?
• for $(q_1, 0)$?
• for $(q_2, 0)$?
• for $(q_2, 1)$?

Accepts: λ, 10, 1010, 101010

Rejects: 0, 11

What about 110, 10100?

2.2.2 Extended Transition Function for an NFA

As with dfas, the transition function can be extended so its second argument is a string.

Requirement: $\delta^*(q_i, w) = Q_j$, where $Q_j$ is the set of all possible states the automaton may be in, having started in state $q_i$ and read string $w$.

Linz Definition 2.5 (Extended Transition Function): For an nfa, the extended transition function is defined so that $\delta^*(q_i, w)$ contains $q_j$ if there is a walk in the transition graph from $q_i$ to $q_j$ labeled $w$. This holds for all $q_i, q_j \in Q$ and $w \in \Sigma^*$.

Linz Example 2.9

Linz Figure 2.10 represents an nfa. It has several λ-transitions and some undefined transitions such as $\delta(q_2, a)$.

Linz Fig. 2.10

Suppose we want to find $\delta^*(q_1, a)$ and $\delta^*(q_2, \lambda)$. There is a walk labeled a involving two λ-transitions from $q_1$ to itself. By using some of the λ-edges twice, we see that there are also walks involving λ-transitions to $q_0$ and $q_2$. Thus

$\delta^*(q_1, a) = \{q_0, q_1, q_2\}$

Since there is a λ-edge between $q_2$ and $q_0$, we have immediately that $\delta^*(q_2, \lambda)$ contains $q_0$. Also, any state can be reached from itself by making no move, and consequently using no input symbol, so $\delta^*(q_2, \lambda)$ also contains $q_2$. Therefore

$\delta^*(q_2, \lambda) = \{q_0, q_2\}$

Using as many λ-transitions as needed, you can also check that

$\delta^*(q_2, aa) = \{q_0, q_1, q_2\}$
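The walks in this example can be traced mechanically by computing λ-closures. The sketch below is written under a stated assumption: the figure is not reproduced in this text, so the three edges in `DELTA` are a reconstruction chosen to agree with the values of $\delta^*$ given above and with the remark that $\delta(q_2, a)$ is undefined. The names `LAMBDA`, `lambda_closure` and `delta_star` are introduced here for illustration.

```python
LAMBDA = ""  # label used here for lambda-transitions

# Edges assumed for the nfa of the example: q0 -a-> q1, q1 -lambda-> q2,
# q2 -lambda-> q0. Missing keys are undefined transitions (empty set).
DELTA = {
    ("q0", "a"): {"q1"},
    ("q1", LAMBDA): {"q2"},
    ("q2", LAMBDA): {"q0"},
}

def lambda_closure(states):
    """All states reachable from `states` using only lambda-edges."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, LAMBDA), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def delta_star(q, w):
    """Set of states reachable from q by walks labeled w."""
    current = lambda_closure({q})
    for symbol in w:
        moved = set()
        for s in current:
            moved |= DELTA.get((s, symbol), set())
        current = lambda_closure(moved)
    return current

print(sorted(delta_star("q1", "a")))
print(sorted(delta_star("q2", "")))
print(sorted(delta_star("q2", "aa")))
```

Taking the closure both before the first symbol and after every symbol is what accounts for walks that use λ-edges "as many times as needed".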

2.2.3 Language Accepted by an NFA

Linz Definition 2.6 (Language Accepted by NFA): The language $L$ accepted by the nfa $M = (Q, \Sigma, \delta, q_0, F)$ is defined as

$L(M) = \{w \in \Sigma^* : \delta^*(q_0, w) \cap F \ne \emptyset\}$.

That is, $L(M)$ is the set of all strings $w$ for which there is a walk labeled $w$ from the initial vertex of the transition graph to some final vertex.

Linz Example 2.10 (Example 2.8 Revisited)

Let's again examine the automaton given in Linz Figure 2.9 (Example 2.8). This nfa, call it $M$:

• must end in $q_0$
• $L(M) = \{(10)^n : n \ge 0\}$

Note that $q_2$ is a dead configuration because $\delta^*(q_0, 110) = \emptyset$.

Linz Fig. 2.9 (Repeated)

2.2.4 Why Nondeterminism

Why study nondeterminism when computers are deterministic?

• an nfa can model a search or backtracking algorithm
• nfa solutions may be simpler than dfa solutions (can convert from nfa to dfa)
• nondeterminism may model externally influenced interactions (or abstract more detailed computations)
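The first point, that an nfa can model a search or backtracking algorithm, can be illustrated by deciding acceptance with a depth-first search over the automaton's choices. The sketch below uses the nfa of Example 2.8; since its figure is not reproduced in this text, the edges in `EDGES` are an assumption, chosen to be consistent with the accepted and rejected strings and with $L(M) = \{(10)^n : n \ge 0\}$.

```python
LAMBDA = ""

# Assumed edges (the figure is not reproduced in the text).
EDGES = {
    ("q0", "1"): ["q1"],
    ("q0", LAMBDA): ["q2"],
    ("q1", "0"): ["q0", "q2"],
    ("q1", "1"): ["q2"],
}
START, FINAL = "q0", {"q0"}

def accepts(word):
    """Depth-first search over the nfa's choices, backtracking on dead ends."""
    seen = set()  # (state, position) pairs already explored

    def search(state, pos):
        if (state, pos) in seen:
            return False
        seen.add((state, pos))
        if pos == len(word) and state in FINAL:
            return True
        # Option 1: take a lambda-move without consuming input.
        for nxt in EDGES.get((state, LAMBDA), []):
            if search(nxt, pos):
                return True
        # Option 2: consume the next symbol, trying each choice in turn.
        if pos < len(word):
            for nxt in EDGES.get((state, word[pos]), []):
                if search(nxt, pos + 1):
                    return True
        return False  # dead configuration: back up and try another choice

    return search(START, 0)

for w in ["", "10", "1010", "101010", "0", "11", "110", "10100"]:
    print(repr(w), accepts(w))
```

The `seen` set keeps the search from exploring the same (state, position) pair twice, which also guards against looping forever if an nfa contains a cycle of λ-edges.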

Chapter Three

Grammars

Introduction

In this chapter we introduce the notion of a grammar called the context-free grammar (CFG) as a language generator. The notion of derivation is instrumental in understanding how the strings are generated in a grammar.

In the context of natural languages, the grammar of a language is the set of rules that are used to construct/validate sentences of the language. We look into the general features of the grammars (of natural languages) to formalize the notion in the present context, which facilitates a better understanding of formal languages.

Consider the English sentence

The students study automata theory.

In order to observe that the sentence is grammatically correct, one may apply certain rules of English grammar to the sentence and validate it. For instance, the article "The" followed by the noun "students" forms a noun phrase, and similarly the noun "automata theory" forms a noun phrase. Further, "study" is a verb. Now choose the sentential form "subject-verb-object" of English grammar. Based upon this grammatical rule, the above sentence may be concluded to be grammatically correct. This verification/derivation is depicted in Figure 3.1. The derivation can also be represented by a tree structure, as in Figure 3.2.

Figure 3.1: Derivation of an English sentence.

Figure 3.2: Derivation Tree of an English sentence.

3.1 Context-Free Grammar

A context-free grammar is a model considered by the Chomsky school of formal linguistics. The idea is that sentences are recursively generated from internal mental symbols through a series of production rules. We now understand that a grammar should have the following components.

• A set of nonterminal symbols
• A set of terminal symbols
• A set of rules
• As a grammar is used to construct/validate sentences of a language, we distinguish a symbol $S$ in the set of nonterminals to represent a sentence, from which various sentences of the language can be generated/validated.

Definition 3.1.1: A grammar is a quadruple

$G = (N, \Sigma, P, S)$

where

1. $N$ is a finite set of nonterminals,
2. $\Sigma$ is a finite set of terminals,
3. $S \in N$ is the start symbol, and
4. $P$ is a finite subset of $N \times V^*$ called the set of production rules. Here, $V = N \cup \Sigma$.
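To see how the four components fit together, the sketch below encodes a small grammar as Python data and carries out a leftmost derivation of the English sentence used earlier. The nonterminal names and the production rules are a hypothetical fragment invented for this illustration; they are not the rules of the manual's figures.

```python
# A grammar as a quadruple (N, SIGMA, P, S). The rules below are a
# hypothetical fragment chosen to derive one example sentence.
N = {"Sentence", "NounPhrase", "Verb", "Article", "Noun"}
SIGMA = {"the", "students", "study", "automata", "theory"}
S = "Sentence"
P = [
    ("Sentence", ["NounPhrase", "Verb", "NounPhrase"]),
    ("NounPhrase", ["Article", "Noun"]),
    ("NounPhrase", ["Noun", "Noun"]),
    ("Article", ["the"]),
    ("Noun", ["students"]),
    ("Noun", ["automata"]),
    ("Noun", ["theory"]),
    ("Verb", ["study"]),
]

def is_valid_grammar():
    """Check that S is in N and every rule is a pair from N x V*."""
    V = N | SIGMA
    return S in N and all(
        head in N and all(sym in V for sym in body) for head, body in P
    )

def derive(rule_indices):
    """Leftmost derivation: apply the chosen rules, in order, to the
    leftmost nonterminal of the current sentential form."""
    form = [S]
    steps = [list(form)]
    for i in rule_indices:
        head, body = P[i]
        pos = next(k for k, sym in enumerate(form) if sym in N)
        if form[pos] != head:
            raise ValueError(f"rule {i} does not apply to {form[pos]}")
        form = form[:pos] + body + form[pos + 1:]
        steps.append(list(form))
    return steps

steps = derive([0, 1, 3, 4, 7, 2, 5, 6])
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form; the last one consists of terminals only and is therefore a sentence generated by the grammar.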

It is convenient to write $A \to \alpha$ for the production rule $(A, \alpha) \in P$.