theory of computation lecture 02
Theory of Computation
Lecture #2 Mahmoud Ali Ahmed
Languages & Grammar
Before discussing languages and grammar, let us deal with some related issues.
Alphabet: a nonempty finite set of symbols or letters, such as {a1, a2, …, ak}.
In the theory of computation, the alphabet used most often is the binary alphabet {0, 1}.
An alphabet is denoted by ∑.
A string (word or sentence): a finite sequence of symbols over the alphabet ∑, e.g. aaabbbabbb is a string over ∑ = {a, b}. A string is usually denoted by w.
A string may have no symbols at all; in this case it is called the empty string (or null string) and is denoted by ε (some texts write λ instead).
Note: Sometimes a string (word/sentence) is represented as a function. For example, a string x = x1 x2 x3 … xn can be viewed as the function
x : [n] → ∑, x(k) = xk,
where [n] = {1, 2, …, n} and n is the length of the string x, denoted by |x|, e.g. |aabbaab| = 7.
Operations on strings
For an alphabet Σ, given any two strings u: [m] → Σ and v: [n] → Σ, the concatenation u · v (also written as uv) of u and v is the string uv: [m + n] → Σ, defined such that:
uv(i) = u(i)        if 1 ≤ i ≤ m,
uv(i) = v(i − m)    if m + 1 ≤ i ≤ m + n.
In particular, uε = εu = u.
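The positional definition of concatenation can be tried out directly. The following is a minimal sketch, assuming strings are modelled as Python `str` values and positions are counted from 1 as in the definition; the name `concat_strings` is hypothetical and used only for illustration.

```python
def concat_strings(u: str, v: str) -> str:
    """Build uv position by position, following the piecewise definition."""
    m, n = len(u), len(v)

    def uv(i: int) -> str:
        # Positions are 1-based, as in the definition above.
        if 1 <= i <= m:
            return u[i - 1]          # uv(i) = u(i)
        if m + 1 <= i <= m + n:
            return v[i - m - 1]      # uv(i) = v(i - m)
        raise IndexError("position outside [m + n]")

    return "".join(uv(i) for i in range(1, m + n + 1))


if __name__ == "__main__":
    print(concat_strings("aab", "ba"))
    # The empty string acts as the identity for concatenation.
    print(concat_strings("", "ab") == "ab" == concat_strings("ab", ""))
```

In ordinary code the built-in `u + v` does the same thing; the explicit version is shown only to mirror the definition.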
Formal Languages
Formal vs. Natural Languages
Natural languages: used by humans to communicate, e.g. Arabic, English, …
Their syntax is very complicated and not completely specified.
Formal languages: specified by a well-defined set of grammatical rules, e.g. programming languages.
Their syntax is completely defined by a given grammar.
Why study Formal Languages?
Programming languages are formal languages.
Formal languages are very useful for pattern matching.
They are important for research in natural language processing (NLP) with computers.
They are closely tied to the study of abstract machines.
Formal Languages
Let ∑ be any finite set of symbols (possibly none), called an alphabet.
A word (string) over ∑ is any finite sequence of letters from ∑.
A formal language is simply a set of strings over ∑.
∑* is defined as the set of all possible words over ∑, including the empty word (denoted by ε).
Example
Let ∑ = {a, b}; then
∑* = {ε, a, b, aa, ab, ba, bb, aaa, aab, aba, …}
A language over ∑ is defined as a set of strings over ∑, i.e. a subset of ∑*.
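Examples: for ∑ = {a, b}, each of the following is a language over ∑:
{ε, a, aa, aab}
{x ∈ {a, b}* | |x| = 8}
{x ∈ {a, b}* | |x| is odd}
{x ∈ {a, b}* | Na(x) = Nb(x)}, where Na(x) is the number of a's in x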
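Since ∑* is infinite, a program can only list a finite part of it. The sketch below enumerates the words over an alphabet up to a chosen length, shortest first, and then selects the members of a language by a membership test. The names `sigma_star`, `odd_length` and `balanced` are hypothetical, chosen here for illustration.

```python
from itertools import product


def sigma_star(alphabet, max_len):
    """Yield every word over `alphabet` of length <= max_len, shortest first."""
    symbols = sorted(alphabet)
    for n in range(max_len + 1):
        for letters in product(symbols, repeat=n):
            yield "".join(letters)   # n = 0 yields the empty word ""


def odd_length(x: str) -> bool:
    """Membership test for the language of words of odd length."""
    return len(x) % 2 == 1


def balanced(x: str) -> bool:
    """Membership test for the language with as many a's as b's."""
    return x.count("a") == x.count("b")


if __name__ == "__main__":
    print(list(sigma_star({"a", "b"}, 2)))
    print([x for x in sigma_star({"a", "b"}, 2) if balanced(x)])
    print([x for x in sigma_star({"a", "b"}, 3) if odd_length(x)])
```

A language is treated here as a predicate on words rather than as a stored set, which is one convenient way to handle infinite languages.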
Example
Let ∑ = {a, b}. Then
∑* = {ε, a, b, aa, ab, ba, bb, aaa, aab, …}.
The language
L = {a, aa, aab}
is a finite language on ∑.
Note: A language can be infinite, and the most interesting languages are infinite.
New languages can be constructed using any of the set operations. For example, the union of two languages over ∑ is also a language over ∑.
Given {as, so} and {if, soon, possible} as two languages over ∑, the concatenation of the two languages gives another language over ∑, as follows:
{as, so} · {if, soon, possible} =
{asif, assoon, aspossible, soif, sosoon, sopossible}.
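For finite languages, both operations are easy to reproduce. The following is a small sketch, assuming finite languages are modelled as Python sets of strings; the function names are hypothetical.

```python
def union_languages(l1: set, l2: set) -> set:
    """Union of two languages: every word that is in l1 or in l2."""
    return l1 | l2


def concat_languages(l1: set, l2: set) -> set:
    """Concatenation: every word uv with u taken from l1 and v from l2."""
    return {u + v for u in l1 for v in l2}


if __name__ == "__main__":
    first = {"as", "so"}
    second = {"if", "soon", "possible"}
    print(sorted(concat_languages(first, second)))
    print(sorted(union_languages(first, second)))
```

Concatenation of languages is not commutative in general: swapping the arguments here would give words such as ifas instead of asif.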
Regular Expression & Language
Definition: A regular expression over ∑, which corresponds to a language, can be defined recursively as follows:
• ε is a Regular Expression denoting the language that contains only the empty string. (L(ε) = {ε})
• φ is a Regular Expression denoting the empty language. (L(φ) = { })
• Each symbol a ∈ ∑ is a Regular Expression, where L(a) = {a}.
• If x is a Regular Expression denoting the language L(x) and y is a Regular Expression denoting the language L(y), then
o x + y is a Regular Expression corresponding to the language L(x) ∪ L(y), i.e. L(x + y) = L(x) ∪ L(y).
o x . y is a Regular Expression corresponding to the language L(x) . L(y), i.e. L(x . y) = L(x) . L(y).
o x* is a Regular Expression corresponding to the language (L(x))*, i.e. L(x*) = (L(x))*.
• Any expression obtained by applying the above rules a finite number of times is a Regular Expression.
A language over ∑ is a regular language if there is some regular expression over ∑ corresponding to it.
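The recursive definition translates almost line by line into a recursive function. The sketch below is one possible illustration, not a standard library API: regular expressions are represented as nested tuples (a hypothetical encoding chosen here), and because L(x*) is usually infinite, the function only returns the words of L(r) whose length is at most `max_len`.

```python
def lang(r, max_len):
    """Return the words of L(r) that have length <= max_len."""
    kind = r[0]
    if kind == "eps":                       # L(ε) = {ε}
        return {""}
    if kind == "empty":                     # L(φ) = { }
        return set()
    if kind == "sym":                       # L(a) = {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":                     # L(x + y) = L(x) ∪ L(y)
        return lang(r[1], max_len) | lang(r[2], max_len)
    if kind == "concat":                    # L(x . y) = L(x) . L(y)
        left, right = lang(r[1], max_len), lang(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":                      # L(x*) = (L(x))*
        base = lang(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                     # stops: only finitely many short words
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")


# 1*01*: words over {0, 1} that contain exactly one 0.
ONES = ("star", ("sym", "1"))
EXACTLY_ONE_ZERO = ("concat", ONES, ("concat", ("sym", "0"), ONES))

if __name__ == "__main__":
    print(sorted(lang(EXACTLY_ONE_ZERO, 3), key=lambda w: (len(w), w)))
```

Cutting every intermediate language off at `max_len` is safe because a word of length at most `max_len` can only be built from pieces that are themselves no longer than `max_len`.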
Examples of regular expressions over ∑ = {0, 1}
(inside an expression, ∑ abbreviates (0 + 1))

Expression              Meaning
1*01*                   All words over ∑ containing exactly a single 0.
∑*1∑*                   All words over ∑ containing at least one 1.
∑*110∑*                 All words over ∑ containing the substring 110.
(00 + 01 + 10 + 11)*    All strings of 0's and 1's of even length, obtained by concatenating any combination of the strings 00, 01, 10 and 11, including ε.
Expression              Meaning
(∑∑)*                   All words over ∑ of even length.
(∑∑∑)*                  All words over ∑ whose length is a multiple of three.
0∑*0 + 1∑*1 + 0 + 1     All words over ∑ that start and end with the same symbol.
1∑*01                   All words over ∑ that begin with 1 and end with 01.
(0 + (1(01*0)*1))*      All words over ∑ that are binary numbers divisible by 3.
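The example expressions can be checked by brute force against their stated meanings. The sketch below uses Python's `re` module, so the expressions are rewritten in its syntax as an assumption of this example: `|` stands for +, and `[01]` stands for ∑. For the last expression, ε is treated as the number 0 and leading zeros are allowed, since the expression matches both. The names `CASES`, `words` and `mismatches` are hypothetical.

```python
import re
from itertools import product


def words(alphabet, max_len):
    """Yield every word over `alphabet` of length <= max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# (pattern in Python regex syntax, meaning as a predicate on words)
CASES = [
    (r"1*01*", lambda w: w.count("0") == 1),
    (r"[01]*1[01]*", lambda w: "1" in w),
    (r"[01]*110[01]*", lambda w: "110" in w),
    (r"(00|01|10|11)*", lambda w: len(w) % 2 == 0),
    (r"([01][01][01])*", lambda w: len(w) % 3 == 0),
    (r"0[01]*0|1[01]*1|0|1", lambda w: len(w) >= 1 and w[0] == w[-1]),
    (r"1[01]*01", lambda w: w.startswith("1") and w.endswith("01")),
    (r"(0|(1(01*0)*1))*", lambda w: w == "" or int(w, 2) % 3 == 0),
]


def mismatches(max_len=8):
    """List every (pattern, word) where the pattern and its meaning disagree."""
    bad = []
    for pattern, meaning in CASES:
        for w in words("01", max_len):
            matched = re.fullmatch(pattern, w) is not None
            if matched != meaning(w):
                bad.append((pattern, w))
    return bad


if __name__ == "__main__":
    print(mismatches())
```

`re.fullmatch` is used rather than `re.search` because a regular expression in the formal sense must describe the whole word, not a part of it. An empty result only shows agreement on words up to the tested length; it is not a proof.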
Formal Grammar
Formal definition
A generative grammar, first proposed by Noam Chomsky in the 1950s, is a grammar G defined as a quadruple
G = (V, T, S, P), where
V is a finite set of objects called variables.
T is a finite set of objects called terminal symbols.
S ∈ V is a special symbol called the start variable.
P is a finite set of productions.
It is assumed that V and T are nonempty and disjoint.
Definition (L(G))
Let G = (V, T, S, P) be a grammar. Then the set
L(G) = {w ∈ T* : S ⇒* w}
is the language generated by G.
Example
Consider a grammar G = (V, T, S, P), where
V = {S, B}
T = {a, b, c}
P consists of the following production rules:
S → aBSc
S → abc
Ba → aB
Bb → bb
L(G) can be shown to consist of the following strings:
L(G) = {abc, aabbcc, aaabbbccc, …}
Therefore L(G) can be represented as
L(G) = {a^n b^n c^n | n > 0}
Every string w in L(G) satisfies na(w) = nb(w) = nc(w), that is,
L(G) ⊆ {w ϵ {a, b, c}* | na(w) = nb(w) = nc(w)}
The inclusion is strict: for example, cba has equal counts of a, b and c but is not in L(G).
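The derivations of this grammar can be explored mechanically. The following is a minimal sketch, assuming sentential forms are plain Python strings with uppercase letters for variables; it searches breadth-first over all forms reachable from S and keeps the ones that consist of terminals only. The length bound is sound for this particular grammar because no production shortens a string. The names `PRODUCTIONS`, `VARIABLES` and `generate` are hypothetical.

```python
from collections import deque

# Productions of the example grammar, written as (left side, right side).
PRODUCTIONS = [("S", "aBSc"), ("S", "abc"), ("Ba", "aB"), ("Bb", "bb")]
VARIABLES = {"S", "B"}


def generate(max_len):
    """Return all terminal strings of length <= max_len derivable from S."""
    seen = {"S"}
    queue = deque(["S"])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(ch in VARIABLES for ch in form):
            words.add(form)              # only terminals left: a word of L(G)
            continue
        for lhs, rhs in PRODUCTIONS:
            start = form.find(lhs)
            while start != -1:           # try the rule at every position
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return sorted(words, key=len)


if __name__ == "__main__":
    print(generate(9))
```

For example, aabbcc is reached by S ⇒ aBSc ⇒ aBabcc ⇒ aaBbcc ⇒ aabbcc, which uses each kind of rule once.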
Representation of Language using Graphs
A transition graph T over ∑ is a finite directed graph in which every arrow (edge) is labeled by some word w ∈ ∑* (possibly the empty word ε).
There is at least one vertex labeled by a '−' sign (such vertices are called initial vertices), and a (possibly empty) set of vertices labeled by a '+' sign (called the final vertices). A vertex can be both initial and final.
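Example
Consider the following transition graphs over ∑ = {a, b}.
[Figure: transition graph with edges labeled a and b.]
All words over ∑ beginning with a followed by b's.
[Figure: transition graph with edges labeled a, b; aa; bb; a, b.]
All words over ∑ containing two consecutive a's or two consecutive b's.
[Figure: transition graph with edges labeled aa, bb and ab, ba.]
All words over ∑ containing an even number of a's and an even number of b's.
[Figure: transition graph with edges labeled b; aa; a, b; a, b.]
All words over ∑ either starting with b or containing aa.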
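A transition graph can be simulated with a short search. The sketch below assumes the usual acceptance rule, which the slides do not spell out: a word is accepted when some path from an initial vertex to a final vertex spells it out by concatenating the edge labels. The figures are not available, so the graph encoded here (vertices `q0` and `q1`, loops labeled a and b on both, and edges labeled aa and bb from `q0` to `q1`) is an assumed layout for the "two consecutive a's or two consecutive b's" example, and all names are hypothetical.

```python
def accepts(edges, initial, final, word):
    """Return True if some path from an initial to a final vertex spells `word`."""
    start = [(v, 0) for v in initial]
    seen = set(start)                    # (vertex, position) pairs already tried
    stack = list(start)
    while stack:
        vertex, pos = stack.pop()
        if pos == len(word) and vertex in final:
            return True
        for src, label, dst in edges:
            # An edge can be taken if its label occurs in `word` at `pos`.
            if src == vertex and word.startswith(label, pos):
                state = (dst, pos + len(label))
                if state not in seen:    # also prevents looping on ε-edges
                    seen.add(state)
                    stack.append(state)
    return False


# Assumed graph: q0 is the initial ('−') vertex, q1 the final ('+') vertex.
EDGES = [
    ("q0", "a", "q0"), ("q0", "b", "q0"),
    ("q0", "aa", "q1"), ("q0", "bb", "q1"),
    ("q1", "a", "q1"), ("q1", "b", "q1"),
]
INITIAL = {"q0"}
FINAL = {"q1"}

if __name__ == "__main__":
    for w in ["abba", "abab", "aa", ""]:
        print(repr(w), accepts(EDGES, INITIAL, FINAL, w))
```

Because edges carry whole words and several edges may apply at once, the graph is nondeterministic; the search therefore tracks every reachable (vertex, position) pair instead of a single current vertex.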