1 chapter 3 - 2 construction techniques. 2 section 3.3 grammars a grammar is a finite set of rules,...
TRANSCRIPT
1
Chapter 3 - 2
Construction Techniques
2
Section 3.3 Grammars
• A grammar is a finite set of rules, called productions, that are used to describe the strings of a language.
• Notational Example. The productions take the form a → ß, where a and ß are strings over an alphabet of terminals and nonterminals. Read a → ß as, “a produces ß,” “a derives ß,” or “a is replaced by ß.” The following four expressions are productions for a grammar.S → aSB Alternative Short formS → Λ S → aSB | ΛB → bB B → bB | bB → b.
3
Terms
• Terminals: {a, b}, the alphabet of the language.
• Nonterminals: {S, B}, the grammar symbols (uppercase letters), disjoint from terminals.
• Start symbol: S, a specified nonterminal alone on the left side of some production.
• Sentential form: any string of terminals and/or nonterminals.
4
Derivation
• a transformation of sentential forms by means of productions as follows: If xay is a sentential form and a→ß is a production, then the replacement of a by ß in xay to obtain xßy is a derivation step, which we denote by xay → xßy.
• Example Derivation: S ➯ aSB ➯ aaSBB ➯ aaBB ➯ aabBB ➯ aabbB ➯ aabbb. This is a leftmost derivation, where each step replaces the leftmost nonterminal.
• The symbol ➯+ means one or more steps and ➯* means zero or more steps. So we could write S ➯+ aabbb or S ➯* aabbb or aSB ➯* aSB, and so on.
5
The Language of a Grammar• The language of a grammar is the set of terminal strings derived
from the start symbol. • Example. Can we find the language of the grammar S → aSB | Λ
and B → bB | b? • Solution: Examine some derivations to see if a pattern emerges.
S ➯ Λ S ➯ aSB ➯ aB ➯ ab S ➯ aSB ➯ aB ➯ abB ➯ abbB ➯ abbb S ➯ aSB ➯ aaSBB ➯ aaBB ➯ aabB ➯ aabb S ➯ aSB ➯ aaSBB ➯ aaBB ➯ aabBB ➯ aabbBB ➯ aabbbB ➯
aabbbb. So we have a pretty good idea that the language of the grammar is {anbn+k | n, k ∊ N}.
• Quiz (1 minute). Describe the language of the grammar S → a | bcS.
• Solution: {(bc)na | n ∊ N}.
6
Construction of Grammars
• Example. Find a grammar for {anb | n ∊ N}.
• Solution: We need to derive any string of a’s followed by b. The production S → aS can be used to derive strings of a’s. The production S → b will stop the derivation and produce the desired string ending with b. So a grammar for the language is
S → aS | b.
7
Quizzes
• Quiz (1 minute). Find a grammar for {ban | n ∊ N}.
• Solution: S → Sa | b.
• Quiz (1 minute). Find a grammar for {(ab)n | n ∊ N}.
• Solution: S → Sab | Λ or S → abS | Λ.
8
Rules for Combining Grammars
• Let L and M be two languages with grammars that have start symbols A and B, respectively, and with disjoint sets of nonterminals. Then the following rules apply.
• L ∪ M has a grammar starting with S → A | B.• LM has a grammar starting with S → AB.• L* has a grammar starting with S → AS | Λ.
9
Example
• Find a grammar for {ambmcn | m, n ∊ N}.• Solution: The language is the product LM, where
L = {ambm | m ∊ N} and M = {cn | n ∊ N}.
So a grammar for LM can be written in terms of grammars for L and M as follows.
S → AB
A → aAb | Λ
B → cB | Λ.
10
Example• Find a grammar for the set Odd, of odd decimal numerals with no
leading zeros, where, for example, 305 ∊ Odd, but 0305 ∉ Odd.• Solution: Notice that Odd can be written in the form Odd = (PD*)*O,
whereO = {1, 3, 5, 7, 9}, P = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and D = {0} ∪ P. Grammars for O, P, and D can be written with start symbols A, B,
and C as: A → 1 | 3 | 5 | 7 | 9, B → A | 2 | 4 | 6 | 8, and C → B | 0. Grammars for D* and PD* and (PD*)* can be written with start
symbols E, F, and G as: E → CE | Λ, F → BE, and G → FG | Λ. So a grammar for Odd with start symbol S is S → GA.
11
Example• Find a grammar for the language L defined inductively by,
– Basis: a, b, c ∊ L. – Induction: If x, y ∊ L then ƒ(x), g(x, y) ∊ L.
• Solution: We can get some idea about L by listing some of its strings. a, b, c, ƒ(a), ƒ(b), …, g(a, a), …, g(ƒ(a), ƒ(a)), …, ƒ(g(b, c)), …, g(ƒ(a), g(b, ƒ(c))), … So L is the set of all algebraic expressions made up from the letters
a, b, c, and the function symbols ƒ and g of arities 1 and 2, respectively. A grammar for L can be written as
S → a | b | c | ƒ(S) | g(S, S). For example, a leftmost derivation of g(ƒ(a), g(b, ƒ(c))) can be
written as S ➯ g(S, S) ➯ g(ƒ(S), S) ➯ g(ƒ(a), S) ➯ g(ƒ(a), g(S, S)) ➯ g(ƒ(a), g(b, S)) ➯ g(ƒ(a), g(b, ƒ(S))) ➯ g(ƒ(a), g(b, ƒ(c))).
12
Parse Tree
• A Parse Tree is a tree that represents a derivation. The root is the start symbol and the children of a nonterminal node are the symbols(terminals, nonterminals, or Λ) on the right side of the production used in the derivation step that replaces that node.
• Example. The tree shown in the picture is the parse tree for the following derivation: S ➯ g(S, S) ➯ g(ƒ(S), S) ➯ g(ƒ(a), S) ➯ g(ƒ(a), b).
S
g ( S , S )
f ( S ) b
a
13
Ambiguous Grammar
• Means there is at least one string with two distinct parse trees, or equivalently, two distinct leftmost derivations or two distinct rightmost derivations.
• Example. Is the grammar S → SaS | b ambiguous? • Solution: Yes. For example, the string babab has two
distinct leftmost derivations: S ⇒ SaS ⇒ SaSaS ⇒ baSaS ⇒ babaS ⇒ babab. S ⇒ SaS ⇒ baS ⇒ baSaS ⇒ babaS ⇒ babab. The parse trees for the derivations are pictured in the next slide.
14
parse treesS S
S S S S
S S S S
a
a
a
ab
b b
b
b b
15
Quiz (2 minutes)
• Show that the grammar S → abS | Sab | c is ambiguous.
• Solution: The string abcab has two distinct leftmost derivations:
S ⇒ abS ⇒ abSab ⇒ abcab
S ⇒ Sab ⇒ abSab ⇒ abcab.
16
Unambiguous Grammar
• Sometimes one can find a grammar that is not ambiguous for the language of an ambiguous grammar.
• Example. The previous example showed
S → SaS | b is ambiguous. The language of the grammar is {b, bab, babab, …}. Another grammar for the language is S → baS | b. It is unambiguous because S produces either baS or b, which can’t derive the same string.
17
Example
• The previous quiz showed S → c | abS | Sab is ambiguous. Its language is
{(ab)mc(ab)n | m, n ∈ N}.
Another grammar for the language is
S → abS | cT and T → abT | ٨. • It is unambiguous because S produces either
abS or cT, which can’t derive the same string. • Similarly, T produces either abT or ٨, which can’t
derive the same string.
18
The End of Chapter 3 - 2