top down parsing(sid) (1)
Post on 28-Nov-2014
Top-Down Parsing
Structure of Compiler
Syntax Analysis
• Scanner and parser
  – Scanner looks at the lower-level part of the programming language
    • Only the language for the tokens
  – Parser looks at the higher-level part of the programming language
    • E.g., if statements, functions
    • The tokens are abstracted
• Parser = Syntax analyzer
  – Input: sequence of tokens from lexical analysis
  – Output: a parse tree of the program
    • E.g., AST
  – Process:
    • Try to derive the input string from the start symbol (How?)
    • Build the parse tree following the derivation
Top-Down Parsing
• Top-down parsing
  – Recursive descent parsing
  – Making recursive descent a predictive parsing algorithm
    • Left recursion elimination
    • Left factoring
  – Predictive parsing
    • First set and Follow set
    • Parse table construction
    • Parsing procedure
  – LL grammars and languages
    • LL(1) grammar
Predictive parsing
A predictive parser is an efficient way of implementing recursive-descent parsing by handling the stack of activation records explicitly.
First set and Follow set
The construction of a predictive parser is aided by two functions associated with a grammar G. These functions, FIRST and FOLLOW, allow us to fill in the proper entries of a predictive parsing table for G, if such a parsing table for G exists.
First: If α is any string of grammar symbols, let FIRST(α) be the set of terminals that begin the strings derived from α. If α ⇒* ε, then ε is also in FIRST(α).
Follow: Define FOLLOW(A), for nonterminal A, to be the set of terminals a that can appear immediately to the right of A in some sentential form, that is, the set of terminals a such that there exists a derivation of the form S ⇒* αAaβ for some α and β. Note that there may, at some time during the derivation, have been symbols between A and a, but if so, they derived ε and disappeared. If A can be the rightmost symbol in some sentential form, then $, representing the input right endmarker, is in FOLLOW(A).
First
Rules for computing the FIRST set:
1. If X is a terminal, then FIRST(X) is {X}.
2. If X is a nonterminal and X → aα is a production, then add a to FIRST(X). If X → ε is a production, then add ε to FIRST(X).
3. If X → Y1Y2…Yk is a production, then for all i such that Y1, Y2, …, Yi-1 are nonterminals and FIRST(Yj) contains ε for j = 1, 2, …, i-1 (i.e. Y1Y2…Yi-1 ⇒* ε), add every non-ε symbol in FIRST(Yi) to FIRST(X). If ε is in FIRST(Yj) for all j = 1, 2, …, k, then add ε to FIRST(X).
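The three rules can be applied repeatedly until no FIRST set grows any further. The following is a minimal sketch of that fixed-point iteration, not a definitive implementation. It assumes the expression grammar used in the example of this deck; the dictionary encoding and the names EPS, GRAMMAR, first_of_string and compute_first are illustrative choices, not part of the original slides.

```python
EPS = "ε"  # marker for the empty string (illustrative choice)

# Assumed encoding: nonterminal -> list of right-hand sides (tuples of symbols).
# An empty tuple stands for an ε-production. Symbols that are not keys of
# GRAMMAR are treated as terminals.
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",), ("num",)],
}


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        # Rule 1: the FIRST set of a terminal is the terminal itself.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)  # every symbol can derive ε (or the string is empty)
    return result


def compute_first(grammar):
    """Apply rules 2 and 3 until no FIRST set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
```

Under these assumptions, FIRST["E"], FIRST["T"] and FIRST["F"] each come out as the set containing (, id and num, while FIRST["E'"] and FIRST["T'"] contain + and * respectively, together with ε.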
Follow
Rules for computing the FOLLOW set:
1. $ is in FOLLOW(S), where S is the start symbol.
2. If there is a production A → αBβ with β ≠ ε, then everything in FIRST(β) except ε is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε (i.e. β ⇒* ε), then everything in FOLLOW(A) is in FOLLOW(B).
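The FOLLOW rules can also be run as a fixed-point iteration. The sketch below is one possible way to do it, under stated assumptions: it uses the expression grammar from the example in this deck, and it takes the FIRST sets of the nonterminals as given (hard-coded here to keep the block self-contained). The names EPS, END, GRAMMAR, FIRST, first_of_string and compute_follow are illustrative.

```python
EPS, END = "ε", "$"  # illustrative markers for the empty string and end of input

# Assumed encoding: nonterminal -> list of right-hand sides; () is an ε-production.
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",), ("num",)],
}

# FIRST sets of the nonterminals, taken as given for this sketch.
FIRST = {
    "E": {"(", "id", "num"}, "E'": {"+", EPS},
    "T": {"(", "id", "num"}, "T'": {"*", EPS},
    "F": {"(", "id", "num"},
}


def first_of_string(symbols):
    """FIRST of a sequence of symbols; terminals are their own FIRST set."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # the whole sequence can derive ε (or is empty)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # rule 1
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # FOLLOW is defined for nonterminals only
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}   # rule 2
                    if EPS in beta_first:      # rule 3: β is empty or β ⇒* ε
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
```

Treating an empty β as a string whose FIRST set is {ε} lets a single test cover both cases of rule 3.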
Example
Grammar:
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id | num
Note: This grammar has no left recursion, which predictive parsing requires. We compute both its FIRST and FOLLOW sets next.
First
Grammar: E → TE'   E' → +TE' | ε   T → FT'   T' → *FT' | ε   F → (E) | id | num
First set
FIRST(E)  = { }    FIRST(+)   = { + }    FIRST(id)  = { id }
FIRST(E') = { }    FIRST(*)   = { * }    FIRST(num) = { num }
FIRST(T)  = { }    FIRST('(') = { ( }
FIRST(T') = { }    FIRST(')') = { ) }
FIRST(F)  = { }    FIRST(ε)   = { ε }
The first thing we do is add each terminal to its own FIRST set by applying rule 1.
First
Grammar: E → TE'   E' → +TE' | ε   T → FT'   T' → *FT' | ε   F → (E) | id | num
First set
FIRST(E)  = { }               FIRST(+)   = { + }    FIRST(id)  = { id }
FIRST(E') = { + , ε }         FIRST(*)   = { * }    FIRST(num) = { num }
FIRST(T)  = { }               FIRST('(') = { ( }
FIRST(T') = { * , ε }         FIRST(')') = { ) }
FIRST(F)  = { ( , id , num }  FIRST(ε)   = { ε }
Next, we apply rule 2: for every production X → aα, add a to FIRST(X).
Rule 2 also says that for every production X → ε, add ε to FIRST(X).
First
Grammar: E → TE'   E' → +TE' | ε   T → FT'   T' → *FT' | ε   F → (E) | id | num
First set
FIRST(E)  = { ( , id , num }  FIRST(+)   = { + }    FIRST(id)  = { id }
FIRST(E') = { + , ε }         FIRST(*)   = { * }    FIRST(num) = { num }
FIRST(T)  = { ( , id , num }  FIRST('(') = { ( }
FIRST(T') = { * , ε }         FIRST(')') = { ) }
FIRST(F)  = { ( , id , num }  FIRST(ε)   = { ε }
For production T → FT', rule 3 says that everything in FIRST(F) goes into FIRST(T). The production has the form X → Y1Y2…Yk and FIRST(Y1) does not contain ε, so only Y1 = F contributes.
Similarly, for production E → TE', rule 3 says that everything in FIRST(T) goes into FIRST(E).
Follow
Grammar: E → TE'   E' → +TE' | ε   T → FT'   T' → *FT' | ε   F → (E) | id | num
The first thing we do is add $ to the FOLLOW set of the start symbol, the nonterminal E (rule 1).
Follow set
FOLLOW(E)  = { $ }
FOLLOW(E') = { }
FOLLOW(T)  = { }
FOLLOW(T') = { }
FOLLOW(F)  = { }
Follow
Grammar: E → TE'   E' → +TE' | ε   T → FT'   T' → *FT' | ε   F → (E) | id | num
Follow set
FOLLOW(E)  = { $ , ) }
FOLLOW(E') = { }
FOLLOW(T)  = { + }
FOLLOW(T') = { }
FOLLOW(F)  = { * }
Next, we apply rule 2 to the productions in which a nonterminal is followed by other symbols.
Applying rule 2 to E' → +TE' says that everything in FIRST(E') except ε should be in FOLLOW(T), so + goes into FOLLOW(T).
Similarly, applying rule 2 to the other two productions, T' → *FT' and F → (E), puts * into FOLLOW(F) and ) into FOLLOW(E) respectively.
Follow
Grammar: E → TE'   E' → +TE' | ε   T → FT'   T' → *FT' | ε   F → (E) | id | num
Follow set
FOLLOW(E)  = { $ , ) }
FOLLOW(E') = { $ , ) }
FOLLOW(T)  = { + , $ , ) }
FOLLOW(T') = { + , $ , ) }
FOLLOW(F)  = { * , + , $ , ) }
Applying rule 3 to E → TE', everything in FOLLOW(E) goes into FOLLOW(E') (since here β = ε).
Applying rule 3 to E' → +TE', everything in FOLLOW(E') goes into FOLLOW(T) (since here β = E' and β ⇒* ε).
In the same way, for T → FT', everything in FOLLOW(T) goes into FOLLOW(T') (since here β = ε).
And for T' → *FT', everything in FOLLOW(T') goes into FOLLOW(F) (since here β = T' and β ⇒* ε).
Parse table
Construct a parse table M[N, T ∪ {$}]
  – Nonterminals in the rows and terminals (plus $) in the columns
For each production A → α
  – For each terminal a ∈ FIRST(α), add A → α to M[A, a]
    • Meaning: when at A and seeing input a, A → α should be used
  – If ε ∈ FIRST(α), then for each terminal a ∈ FOLLOW(A), add A → α to M[A, a]
    • Meaning: when at A and seeing input a, A → α should be used
    • In order to continue the expansion to ε
    • E.g., X → AC   A → B   B → b | ε   C → cc
  – If ε ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $]
    • Same as the above
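The construction above can be sketched in a few lines. This is an illustrative sketch only: it assumes the expression grammar from the example in this deck, and it hard-codes the FIRST and FOLLOW sets listed for that grammar so the block stands on its own. The names GRAMMAR, FIRST, FOLLOW, first_of_string and build_table are hypothetical. Raising an error on a doubly filled cell is an added check (such a grammar would not be LL(1)), not something the slides spell out.

```python
EPS, END = "ε", "$"  # illustrative markers for the empty string and end of input

# Assumed encoding: nonterminal -> list of right-hand sides; () is an ε-production.
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",), ("num",)],
}
FIRST = {
    "E": {"(", "id", "num"}, "E'": {"+", EPS},
    "T": {"(", "id", "num"}, "T'": {"*", EPS},
    "F": {"(", "id", "num"},
}
FOLLOW = {
    "E": {END, ")"}, "E'": {END, ")"},
    "T": {END, ")", "+"}, "T'": {END, ")", "+"},
    "F": {"*", END, ")", "+"},
}


def first_of_string(symbols):
    """FIRST of a right-hand side; terminals are their own FIRST set."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Return M as a dict keyed by (nonterminal, lookahead)."""
    table = {}
    for head, alternatives in grammar.items():
        for rhs in alternatives:
            rhs_first = first_of_string(rhs)
            lookaheads = rhs_first - {EPS}
            if EPS in rhs_first:
                # FOLLOW(head) already contains $ when head can end the input,
                # so this also covers the M[A, $] case.
                lookaheads |= FOLLOW[head]
            for a in lookaheads:
                if (head, a) in table:
                    raise ValueError(f"conflict at M[{head}, {a}]: not LL(1)")
                table[(head, a)] = rhs
    return table


TABLE = build_table(GRAMMAR)
```

For example, TABLE[("E'", ")")] is the empty tuple, which stands for the production E' → ε, and there is no entry for ("E", "+").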
Construct a Parse Table
Grammar: E → TE'   E' → +TE' | ε   T → FT'   T' → *FT' | ε   F → (E) | id | num
First(*)  = {*}
First(F)  = {(, id, num}
First(T') = {*, ε}
First(T)  = {(, id, num}
First(E') = {+, ε}
First(E)  = {(, id, num}
Follow(E)  = {$, )}
Follow(E') = {$, )}
Follow(T)  = {$, ), +}
Follow(T') = {$, ), +}
Follow(F)  = {*, $, ), +}
E → TE':    First(TE') = {(, id, num}
E' → +TE':  First(+TE') = {+}
E' → ε:     Follow(E') = {$, )}
T → FT':    First(FT') = {(, id, num}
T' → *FT':  First(*FT') = {*}
T' → ε:     Follow(T') = {$, ), +}
      id         num        *            +            (          )         $
E     E → TE'    E → TE'                              E → TE'
E'                                       E' → +TE'               E' → ε    E' → ε
T     T → FT'    T → FT'                              T → FT'
T'                          T' → *FT'    T' → ε                  T' → ε    T' → ε
F     F → id     F → num                              F → (E)
Now we can have a predictive parsing mechanism
• Use a stack to keep track of the expanded form
• Initialization
  – Put the start symbol S and $ onto the stack
  – Add $ to the end of the input string
  – $ is for recognizing the termination configuration
• If 'a' is at the top of the stack and 'a' is the next input symbol, then
  – Simply pop 'a' from the stack and advance on the input string
• If 'A' is on top of the stack and 'a' is the next input symbol, then
  – Assume that M[A, a] = A → α
  – Replace A by α on the stack
• Termination
  – When only $ is left on the stack and in the input string
• If 'A' is on top of the stack and 'a' is the next input but M[A, a] is empty
  – Error
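The mechanism described above can be sketched as a small table-driven loop. This is a sketch under stated assumptions rather than a definitive implementation: the table is hard-coded for the expression grammar of this deck, tokens are assumed to arrive already scanned as strings such as "id" and "num", and the names TABLE, NONTERMINALS and parse are illustrative. Raising Python's built-in SyntaxError is just one convenient way to signal the error case.

```python
END = "$"  # end-of-input marker

# M[A, a] for the example grammar; an empty tuple stands for an ε-production.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "num"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): (), ("E'", END): (),
    ("T", "id"): ("F", "T'"), ("T", "num"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "*"): ("*", "F", "T'"),
    ("T'", "+"): (), ("T'", ")"): (), ("T'", END): (),
    ("F", "id"): ("id",), ("F", "num"): ("num",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {head for head, _ in TABLE}


def parse(tokens, start="E"):
    """Return the productions used, in order, or raise SyntaxError."""
    tokens = list(tokens) + [END]   # add $ to the end of the input
    stack = [END, start]            # top of the stack is the end of the list
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            used.append((top, rhs))
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1  # match a terminal; $ against $ ends the parse
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return used


derivation = parse(["id", "+", "num", "*", "id"])
```

Pushing the right-hand side in reverse keeps its leftmost symbol on top of the stack, which is what makes the sequence of productions a leftmost derivation.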
Parsing procedure
Example of Predictive Parsing: id + num * id
Stack          Input               Action
E $            id + num * id $     E → TE'
T E' $         id + num * id $     T → FT'
F T' E' $      id + num * id $     F → id
T' E' $        + num * id $        T' → ε
E' $           + num * id $        E' → +TE'
T E' $         num * id $          T → FT'
F T' E' $      num * id $          F → num
T' E' $        * id $              T' → *FT'
F T' E' $      id $                F → id
T' E' $        $                   T' → ε
E' $           $                   E' → ε
$              $                   Accept
Matching a terminal on top of the stack against the next input symbol (pop and advance) is folded into the following row.
Thank you