top down parsing(sid) (1)

Top-Down Parsing

Structure of Compiler

Syntax Analysis

• Scanner and parser
  – The scanner looks at the lower-level part of the programming language
    • Only the language for the tokens
  – The parser looks at the higher-level part of the programming language
    • E.g., if statements, functions
    • The tokens are abstracted
• Parser = syntax analyzer
  – Input: sequence of tokens from lexical analysis
  – Output: a parse tree of the program
    • E.g., AST
  – Process:
    • Try to derive the input string from the starting symbol (How?)
    • Build the parse tree following the derivation

Top-Down Parsing

• Top-down parsing
  – Recursive descent parsing
  – Making recursive descent a predictive parsing algorithm
    • Left recursion elimination
    • Left factoring
  – Predictive parsing
    • First set and Follow set
    • Parse table construction
    • Parsing procedure
  – LL grammars and languages
    • LL(1) grammar

Predictive parsing

A predictive parser is an efficient way of implementing recursive-descent parsing by handling the stack of activation records explicitly.

First set and Follow set

The construction of a predictive parser is aided by two functions associated with a grammar G. These functions, FIRST and FOLLOW, allow us to fill in the proper entries of a predictive parsing table for G, if such a parsing table for G exists.

First: If α is any string of grammar symbols, let FIRST(α) be the set of terminals that begin the strings derived from α. If α ⇒* ε, then ε is also in FIRST(α).

Follow: Define FOLLOW(A), for a nonterminal A, to be the set of terminals a that can appear immediately to the right of A in some sentential form, that is, the set of terminals a such that there exists a derivation of the form S ⇒* αAaβ for some α and β. Note that there may, at some time during the derivation, have been symbols between A and a, but if so, they derived ε and disappeared. If A can be the rightmost symbol in some sentential form, then $, representing the input right endmarker, is in FOLLOW(A).

First

Rules for computing the FIRST set:

1. If X is a terminal, then FIRST(X) is {X}.
2. If X is a nonterminal and X → aα is a production, then add a to FIRST(X). If X → ε is a production, then add ε to FIRST(X).
3. If X → Y1Y2…Yk is a production, then for all i such that Y1, Y2, …, Yi-1 are all nonterminals and FIRST(Yj) contains ε for j = 1, 2, …, i-1 (i.e. Y1Y2…Yi-1 ⇒* ε), add every non-ε symbol in FIRST(Yi) to FIRST(X). If ε is in FIRST(Yj) for all j = 1, 2, …, k, then add ε to FIRST(X).

Follow

Rules for computing the FOLLOW set:

1. $ is in FOLLOW(S), where S is the start symbol.
2. If there is a production A → αBβ with β ≠ ε, then everything in FIRST(β) except ε is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε (i.e. β ⇒* ε), then everything in FOLLOW(A) is in FOLLOW(B).

Example

Grammar

  E  → TE'
  E' → +TE' | ε
  T  → FT'
  T' → *FT' | ε
  F  → (E) | id | num

Note: This grammar is not left-recursive, so we can go on to compute both its FIRST and FOLLOW sets.

First


The first thing we do is add each terminal to its own FIRST set, by applying rule 1.

First set

  FIRST(E)  = { }        FIRST(+)   = { + }      FIRST(id)  = { id }
  FIRST(E') = { }        FIRST(*)   = { * }      FIRST(num) = { num }
  FIRST(T)  = { }        FIRST('(') = { ( }
  FIRST(T') = { }        FIRST(')') = { ) }
  FIRST(F)  = { }        FIRST(ε)   = { ε }

First


Next, we apply rule 2: for every production X → aα, add a to FIRST(X). Rule 2 also says that for every production X → ε, add ε to FIRST(X).

First set

  FIRST(E)  = { }                 FIRST(+)   = { + }      FIRST(id)  = { id }
  FIRST(E') = { + , ε }           FIRST(*)   = { * }      FIRST(num) = { num }
  FIRST(T)  = { }                 FIRST('(') = { ( }
  FIRST(T') = { * , ε }           FIRST(')') = { ) }
  FIRST(F)  = { ( , id , num }    FIRST(ε)   = { ε }

First


For the production T → FT', rule 3 says that everything in FIRST(F) goes into FIRST(T), since the production has the form X → Y1Y2…Yk and FIRST(Y1) does not contain ε. Similarly, for the production E → TE', rule 3 says that everything in FIRST(T) goes into FIRST(E).

First set

  FIRST(E)  = { ( , id , num }    FIRST(+)   = { + }      FIRST(id)  = { id }
  FIRST(E') = { + , ε }           FIRST(*)   = { * }      FIRST(num) = { num }
  FIRST(T)  = { ( , id , num }    FIRST('(') = { ( }
  FIRST(T') = { * , ε }           FIRST(')') = { ) }
  FIRST(F)  = { ( , id , num }    FIRST(ε)   = { ε }
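The three FIRST rules can be sketched as a fixed-point loop: keep applying them to every production until no set grows. The Python below is a minimal illustration, not part of the original slides; the names `GRAMMAR`, `EPS` and `first_sets` are assumptions chosen for this sketch, and an empty right-hand side is used to stand for an ε-production.

```python
EPS = "ε"

# Expression grammar from the example. Each right-hand side is a list of
# symbols; the empty list stands for an ε-production (an encoding assumed here).
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}


def first_sets(grammar):
    """Compute FIRST for every nonterminal by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:
                    if sym not in grammar:
                        # Terminal: it begins every string derived from here.
                        first[nt].add(sym)
                        all_nullable = False
                        break
                    # Nonterminal: copy its non-ε symbols (rule 3).
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        all_nullable = False
                        break
                if all_nullable:
                    # X → ε, or every symbol on the right-hand side can vanish.
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first


FIRST = first_sets(GRAMMAR)
for nt in GRAMMAR:
    print(f"FIRST({nt}) = {sorted(FIRST[nt])}")
```

The loop needs several passes because FIRST(E) depends on FIRST(T), which in turn depends on FIRST(F); the sets it ends with should agree with the final table above.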

Follow


The first thing we do is add $ to the FOLLOW set of the start symbol, the nonterminal E (rule 1).

Follow set

  FOLLOW(E)  = { $ }
  FOLLOW(E') = { }
  FOLLOW(T)  = { }
  FOLLOW(T') = { }
  FOLLOW(F)  = { }

Follow


Next we apply rule 2. Applied to E' → +TE', it says that everything in FIRST(E') except ε should be in FOLLOW(T). Similarly, applying rule 2 to the other two productions T' → *FT' and F → (E) gives FOLLOW(F) and FOLLOW(E) respectively.

Follow set

  FOLLOW(E)  = { $ , ) }
  FOLLOW(E') = { }
  FOLLOW(T)  = { + }
  FOLLOW(T') = { }
  FOLLOW(F)  = { * }

Follow


Finally we apply rule 3:

• E → TE': everything in FOLLOW(E) goes into FOLLOW(E') (since here β = ε).
• E' → +TE': everything in FOLLOW(E') goes into FOLLOW(T) (since here β = E' ⇒* ε).
• T → FT': everything in FOLLOW(T) goes into FOLLOW(T') (since here β = ε).
• T' → *FT': everything in FOLLOW(T') goes into FOLLOW(F) (since here β = T' ⇒* ε).

Follow set

  FOLLOW(E)  = { $ , ) }
  FOLLOW(E') = { $ , ) }
  FOLLOW(T)  = { + , $ , ) }
  FOLLOW(T') = { + , $ , ) }
  FOLLOW(F)  = { * , + , $ , ) }
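The FOLLOW rules can be sketched the same way, as a loop that repeats until no FOLLOW set changes. This is an illustrative Python sketch under stated assumptions: the FIRST sets are copied from the worked example rather than recomputed, and the names `first_of_string` and `follow_sets` are hypothetical helpers introduced only for this illustration.

```python
EPS, END = "ε", "$"

# Expression grammar from the example; [] stands for an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}

# FIRST sets of the nonterminals, copied from the worked example.
FIRST = {
    "E": {"(", "id", "num"}, "E'": {"+", EPS},
    "T": {"(", "id", "num"}, "T'": {"*", EPS},
    "F": {"(", "id", "num"},
}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols; a terminal's FIRST is itself."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # the string is empty or every symbol can derive ε
    return result


def follow_sets(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # rule 1
    changed = True
    while changed:
        changed = False
        for a, productions in grammar.items():
            for rhs in productions:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue                 # FOLLOW is only for nonterminals
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}     # rule 2
                    if EPS in beta_first:        # rule 3: β is empty or β ⇒* ε
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR, "E")
for nt in GRAMMAR:
    print(f"FOLLOW({nt}) = {sorted(FOLLOW[nt])}")
```

Treating an empty β as a string whose FIRST is {ε} lets rule 3 cover both A → αB and A → αBβ with a single test.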

Parse table

Construct a parse table M[N, T ∪ {$}]
  – Non-terminals in the rows and terminals (plus $) in the columns

For each production A → α:
• For each terminal a ∈ FIRST(α), add A → α to M[A, a]
  – Meaning: when at A and seeing input a, A → α should be used
• If ε ∈ FIRST(α), then for each terminal a ∈ FOLLOW(A), add A → α to M[A, a]
  – Meaning: when at A and seeing input a, A → α should be used, in order to continue the expansion to ε
  – E.g., X → AC, A → B, B → b | ε, C → cc: here c ∈ FOLLOW(A), so M[A, c] = A → B
• If ε ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $]
  – Same as the above

Construct a Parse Table


  First(*)  = {*}
  First(F)  = {(, id, num}
  First(T') = {*, ε}
  First(T)  = {(, id, num}
  First(E') = {+, ε}
  First(E)  = {(, id, num}

  Follow(E)  = {$, )}
  Follow(E') = {$, )}
  Follow(T)  = {$, ), +}
  Follow(T') = {$, ), +}
  Follow(F)  = {*, $, ), +}

  E  → TE':    First(TE')  = {(, id, num}
  E' → +TE':   First(+TE') = {+}
  E' → ε:      Follow(E')  = {$, )}
  T  → FT':    First(FT')  = {(, id, num}
  T' → *FT':   First(*FT') = {*}
  T' → ε:      Follow(T')  = {$, ), +}

      id          num         *           +           (           )           $
E     E → TE'     E → TE'                             E → TE'
E'                                        E' → +TE'               E' → ε      E' → ε
T     T → FT'     T → FT'                             T → FT'
T'                            T' → *FT'   T' → ε                  T' → ε      T' → ε
F     F → id      F → num                             F → (E)
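The table-filling rules can be sketched as follows. This is a minimal Python illustration, assuming the FIRST and FOLLOW sets from the slides as given inputs; `build_table` and `first_of_string` are names invented for this sketch, and the conflict check is an addition that simply reports when two productions land in the same cell.

```python
EPS = "ε"

# Expression grammar from the example; [] stands for an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}

# FIRST and FOLLOW sets copied from the slides.
FIRST = {
    "E": {"(", "id", "num"}, "E'": {"+", EPS},
    "T": {"(", "id", "num"}, "T'": {"*", EPS},
    "F": {"(", "id", "num"},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"$", ")", "+"}, "T'": {"$", ")", "+"},
    "F": {"*", "$", ")", "+"},
}


def first_of_string(symbols):
    """FIRST of a right-hand side; a terminal's FIRST is itself."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar, follow):
    """Return M as a dict mapping (nonterminal, lookahead) to a right-hand side."""
    table = {}
    for a, productions in grammar.items():
        for rhs in productions:
            rhs_first = first_of_string(rhs)
            lookahead = rhs_first - {EPS}
            if EPS in rhs_first:
                # FOLLOW(A) already contains $ when A can end a sentential form.
                lookahead |= follow[a]
            for t in lookahead:
                if (a, t) in table:
                    raise ValueError(f"two productions compete for M[{a}, {t}]")
                table[(a, t)] = rhs
    return table


TABLE = build_table(GRAMMAR, FOLLOW)
for (a, t), rhs in sorted(TABLE.items()):
    print(f"M[{a}, {t}] = {a} → {' '.join(rhs) or EPS}")
```

Because $ is stored in FOLLOW like any other lookahead symbol, the separate M[A, $] rule needs no special case in this encoding.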

Now we can have a predictive parsing mechanism

• Use a stack to keep track of the expanded form
• Initialization
  – Put the starting symbol S and $ into the stack
  – Add $ to the end of the input string
  – $ is for the recognition of the termination configuration
• If 'a' is at the top of the stack and 'a' is the next input symbol, then
  – Simply pop 'a' from the stack and advance on the input string
• If 'A' is on top of the stack and 'a' is the next input symbol, then
  – Assume that M[A, a] = A → α
  – Replace A by α in the stack
• Termination
  – When only $ is left in the stack and in the input string
• If 'A' is on top of the stack and 'a' is the next input symbol but M[A, a] is empty
  – Error

Parsing procedure


Example of predictive parsing: id + num * id

Stack         Input               Action
E $           id + num * id $     E → TE'
T E' $        id + num * id $     T → FT'
F T' E' $     id + num * id $     F → id
T' E' $       + num * id $        T' → ε
E' $          + num * id $        E' → +TE'
T E' $        num * id $          T → FT'
F T' E' $     num * id $          F → num
T' E' $       * id $              T' → *FT'
F T' E' $     id $                F → id
T' E' $       $                   T' → ε
E' $          $                   E' → ε
$             $                   Accept

Each row shows the configuration after any matching terminals have been popped from the stack and consumed from the input.
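The parsing procedure can be sketched as a small table-driven loop. The Python below is an illustrative sketch, not the slides' own code: the table is typed in by hand from the parse table for the expression grammar, and `parse`, `TABLE` and `NONTERMINALS` are names assumed for this example.

```python
# Parse table for the expression grammar: (nonterminal, lookahead) -> right-hand
# side. An empty list stands for an ε-production (an encoding assumed here).
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "num"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "num"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"], ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "num"): ["num"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    stack = ["$", start]            # the top of the stack is the end of the list
    tokens = list(tokens) + ["$"]   # append the end marker to the input
    pos = 0
    applied = []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            return applied          # only $ left on both sides: accept
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            stack.pop()
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
            applied.append(f"{top} → {' '.join(rhs) or 'ε'}")
        elif top == look:
            stack.pop()             # matching terminal: pop and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top}, found {look}")


for step in parse("id + num * id".split()):
    print(step)
```

The right-hand side is pushed in reverse so that its leftmost symbol is expanded first, which is what makes the sequence of applied productions a leftmost derivation of the input.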

Thank you
