
Post on 14-Mar-2020


Page 1:  · Web viewA translator is a program that takes as input a program written in one programming language and produces as output a program in another language. If the source language

CS6660-PRINCIPLES OF COMPILER DESIGN

PART-A (2 MARKS)

Unit I

1. Define compilers and translators?

A translator is a program that takes as input a program written in one programming language and produces as output a program in another language. If the source language is a high-level language and the object language is a low-level language, then such a translator is called a compiler.

2. What are the phases of a compiler?

i) Lexical analysis. ii) Syntax analysis. iii) Intermediate code generation. iv) Code optimization. v) Code generation.

3. Define passes?

In an implementation of a compiler, portions of one or more phases are combined into a module called a pass. A pass reads the source program or the output of the previous pass, makes the transformations specified by its phases, and writes its output into an intermediate file, which is read by the subsequent pass.

4. Define lexical analysis?

The lexical analyzer reads the source program one character at a time, carving the source program into a sequence of atomic units called tokens. Identifiers, keywords, constants, operators and punctuation symbols are typical tokens.
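The carving of source text into tokens can be illustrated with a minimal sketch. The token classes, their patterns, and the tiny keyword set below are assumptions chosen for illustration; they do not come from any particular language definition.

```python
import re

# Hypothetical token classes for a tiny language; the patterns are assumptions.
TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:if|then|else|while|do)\b"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),
    ("CONSTANT",   r"\d+"),
    ("OPERATOR",   r":=|<=|>=|<>|[+\-*/<>=]"),
    ("PUNCT",      r"[();,]"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Carve the source text into (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":          # white space separates tokens but is not one
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Listing KEYWORD before IDENTIFIER makes reserved words win over the more general identifier pattern, which is one common way to resolve that overlap.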

5. Write notes on syntax analysis?

Syntax analysis is also called parsing. It involves grouping the tokens of the source program into grammatical phrases that are used by the compiler to synthesize output.

6. What is meant by semantic analysis?

The semantic analysis phase checks the source program for semantic errors and gathers type information for the subsequent code generation phase. It uses the hierarchical structure determined by the syntax-analysis phase to identify the operators and operands of expressions and statements.

7. Define optimization?

Certain compilers apply transformations to the output of the intermediate code generator. These transformations produce an intermediate-language program from which a faster or smaller object program can be produced. This phase is called the optimization phase. Types of optimization include local optimization and loop optimization.

8. What is cross compiler?

A compiler that runs on one machine and produces object code for another machine is called a cross compiler.

9. Define semantics of a programming language?

The rules that tell whether a string is a valid program or not are called the syntax of the language. The rules that give meaning to programs are called the semantics of a programming language.

10. What are the data elements of a programming language? a) Numerical data. b) Logical data. c) Character data. d) Pointers. e) Labels.

11. Define binding?

The act of associating attributes with a name is referred to as binding the attributes to the name. Most binding is done at compile time and is called static binding. Some languages, such as SNOBOL, allow dynamic binding, that is, binding done at run time.

12. What is coercion of types?

The translation of the operator, which the compiler must provide, includes any necessary conversion from one type to another, and this implied change in type is called coercion.

13. What is meant by loaders and link-editors?

A program called a loader performs the two functions of loading and link-editing. The process of loading consists of taking relocatable machine code, altering the relocatable addresses, and placing the altered instructions and data in memory at the proper locations.

14. Write down the various compiler construction tools? Some of the useful compiler construction tools are a) Parser generator b) Scanner generators c) Syntax-directed translation engines d) Automatic code generators e) Data-flow engines

15. What are the possible error recovery actions in lexical analysis: a) Deleting an extraneous character b) Inserting a missing character c) Replacing an incorrect character by a correct character d) Transposing two adjacent characters

16. Define regular expressions?

Regular expressions are the notation used to define the class of languages known as regular sets. They are used to describe tokens. In regular expression notation we could write

identifier = letter ( letter | digit )*

17. Write the regular expression for denoting the set containing the string a and all strings consisting of zero or more a’s followed by a b.

a | a*b
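As a quick check of this expression, the sketch below tests a few strings against it using Python's `re` module, whose syntax for alternation and closure happens to coincide with the textbook notation here. The helper name `in_language` is an assumption made for illustration.

```python
import re

# a | a*b : the single string "a", or zero or more a's followed by one b.
PATTERN = re.compile(r"a|a*b")

def in_language(s):
    """True if the whole string s is denoted by a | a*b."""
    return PATTERN.fullmatch(s) is not None
```

For instance, `in_language("aaab")` and `in_language("b")` hold, while `in_language("aa")` does not, because a run of a's longer than one must be followed by a b.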

18. Describe the language generated by the regular expressions?

a) 0(0|1)*0

The set of all strings of 0s and 1s, of length at least two, that begin with 0 and end with 0.

19. What is a regular definition?

If Σ is an alphabet of basic symbols, then a regular definition is a sequence of definitions of the form

d1 → r1
d2 → r2
…
dn → rn

where each di is a distinct name, and each ri is a regular expression over the symbols in Σ ∪ {d1, d2, …, di-1}.

20. Define finite automata?

A better way to convert a regular expression to a recognizer is to construct a generalized transition diagram from the expression. This diagram is called a finite automaton.

21. What is Deterministic Automata?

A finite automaton is deterministic if

a. It has no transitions on input ε.
b. For each state s and input symbol a, there is at most one edge labeled a leaving s.

22. Write the algorithm for simulating a DFA?

s := s0;
c := nextchar;
while c ≠ eof do
    s := move(s, c);
    c := nextchar
end;
if s is in F then
    return “yes”
else
    return “no”;
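The simulation loop can be sketched in runnable form as follows. The transition table is an assumed example: a four-state DFA for the language (a|b)*abb, with state 3 as the only accepting state. The names `MOVE`, `START`, `ACCEPTING` and `simulate_dfa` are illustrative only.

```python
# Transition table for a DFA accepting (a|b)*abb; this table is an assumed example.
MOVE = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START = 0
ACCEPTING = {3}

def simulate_dfa(text):
    """Return "yes" if the DFA accepts text, otherwise "no"."""
    s = START
    for c in text:                 # c := nextchar, until eof
        if (s, c) not in MOVE:     # no edge for this symbol: reject
            return "no"
        s = MOVE[(s, c)]           # s := move(s, c)
    return "yes" if s in ACCEPTING else "no"
```

Because a DFA has at most one edge per state and symbol, the loop does a single table lookup per input character.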

23. Write the transition graph for an NFA that recognizes the language (a|b)*abb ?

[Figure: transition graph of the NFA. The start state 0 has self-loops labeled a and b and an edge labeled a to state 1; state 1 has an edge labeled b to state 2; state 2 has an edge labeled b to the accepting state 3.]

24. Define LEX?

LEX is a tool for automatically generating lexical analyzers. A LEX source program is a specification of a lexical analyzer, consisting of a set of regular expressions together with an action for each regular expression. The output of LEX is a lexical analyzer program.

25. Write notes on auxiliary definitions?

The auxiliary definitions are statements of the form

D1 = R1
D2 = R2
…
Dn = Rn

where each Di is a distinct name, and each Ri is a regular expression.


Unit II

26. Define context-free grammar?

The syntactic specification of a programming language can be given in a notation called a context-free grammar, which is also called a BNF (Backus-Naur Form) description. Context-free grammars are capable of describing most, but not all, of the syntax of programming languages.

27. Define parse trees?

A parse tree is a graphical representation of a derivation that filters out the choice regarding replacement order. It represents the hierarchical syntactic structure of sentences that is implied by the grammar.

28. What are the various types of errors in program? a) Lexical, such as misspelling an identifier, keyword, or operator. b) Syntactic, such as an arithmetic expression with unbalanced parentheses. c) Semantic, such as an operator applied to an incompatible operand. d) Logical, such as an infinitely recursive call.

29. What are the various error-recovery strategies?

a) Panic mode - On discovering an error, the parser discards input symbols one at a time until one of a designated set of synchronizing tokens is found.

b) Phrase level - On discovering an error, the parser performs local correction on the remaining input; that is, it may replace a prefix of the remaining input by some string that allows the parser to continue.

c) Error productions - If we have a good idea of the common errors that might be encountered, we can augment the grammar with productions that generate the erroneous constructs.

d) Global correction - Ideally, the compiler makes as few changes as possible in processing an incorrect input string.

30. Write a grammar to define simple arithmetic expression?

expr → expr op expr
expr → ( expr )
expr → - expr
expr → id
op → + | - | * | / | ^

31. Define context-free language?

Given a grammar G with start symbol S, we can use the ==> relation to define L(G), the language generated by G. We say a string of terminals w is in L(G) if and only if S ==>* w. The string w is called a sentence of G. A language that can be generated by a grammar is said to be a context-free language.

32. Define ambiguity?

A grammar that produces more than one parse tree for some sentence is said to be ambiguous. Equivalently, an ambiguous grammar is one that produces more than one leftmost or more than one rightmost derivation for some sentence.

33. What is meant by left recursion?

A grammar is left recursive if it has a nonterminal A such that there is a derivation A ==>+ Aα (in one or more steps) for some string α. Top-down parsing methods cannot handle left-recursive grammars, so a transformation that eliminates left recursion is needed.

Ex:-

E → E + T | T
T → T * F | F
F → (E) | id
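For the immediate case, where a production of A begins with A itself, the usual rewrite replaces A → Aα | β by A → βA' and A' → αA' | ε. The sketch below applies that rewrite to one nonterminal at a time. The function name and the list-of-symbols grammar encoding are assumptions made for illustration.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b1 A' | ..., A' -> a1 A' | ... | ε.

    productions is a list of right sides, each a list of symbols.
    Returns a dict mapping nonterminals to their new right sides.
    """
    # Right sides that start with A, with the leading A stripped off (the alphas).
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    # Right sides that do not start with A (the betas).
    others = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}       # nothing to do
    new_name = nonterminal + "'"
    return {
        nonterminal: [beta + [new_name] for beta in others],
        new_name: [alpha + [new_name] for alpha in recursive] + [["ε"]],
    }
```

Applied to E → E + T | T, this yields E → T E' and E' → + T E' | ε; the productions for F are returned unchanged because none of them begins with F.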

34. Write the algorithm to eliminate left recursion from a grammar?

1. Arrange the nonterminals in some order A1, A2, …, An.
2. for i := 1 to n do begin
       for j := 1 to i-1 do begin
           replace each production of the form Ai → Ajγ by the productions Ai → δ1γ | δ2γ | … | δkγ, where Aj → δ1 | δ2 | … | δk are all the current Aj productions
       end
       eliminate the immediate left recursion among the Ai productions
   end.

35. What is meant by left factoring?

Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing. The basic idea is that when it is not clear which of two alternative productions to use to expand a nonterminal A, we may be able to rewrite the A production to defer the decision until we have seen enough of the input to make the right choice.

35. Define parser?

A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or an error message indicating that w is not a sentence of G.

36. What is shift-reduce parsing?

The bottom-up style of parsing is called shift-reduce parsing. This parsing method is bottom-up because it attempts to construct a parse tree for an input string beginning at the leaves and working up towards the root.

37. Define handles?

A handle of a right-sentential form γ is a production A → β and a position of γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ.

38. What are the four possible actions of a shift-reduce parser? a) Shift action - the next input symbol is shifted to the top of the stack. b) Reduce action - the handle on top of the stack is replaced by a nonterminal. c) Accept action - successful completion of parsing. d) Error action - a syntax error is found.

39. What is an operator grammar?

A grammar with the property that no production right side is ε or has two adjacent nonterminals is called an operator grammar.

40. What are the problems in top down parsing? a) Left recursion. b) Backtracking. c) The order in which alternates are tried can affect the language accepted.


41. Define recursive-descent parser?

A parser that uses a set of recursive procedures to recognize its input with no backtracking is called a recursive-descent parser. The recursive procedures can be quite easy to write.
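To illustrate how such procedures look, here is a minimal sketch of a recursive-descent recognizer. It assumes the expression grammar with left recursion already removed (E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id) and a token list of plain strings; the class and function names are hypothetical.

```python
class Parser:
    """Recursive-descent recognizer: one procedure per nonterminal.
    The next token alone selects the production, so no backtracking is needed."""

    def __init__(self, tokens):
        self.tokens = tokens + ["$"]   # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):                       # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):                 # E' -> + T E' | ε
        if self.peek() == "+":
            self.match("+")
            self.T()
            self.E_prime()

    def T(self):                       # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):                 # T' -> * F T' | ε
        if self.peek() == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):                       # F -> ( E ) | id
        if self.peek() == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")

def accepts(tokens):
    """True if the token list is a sentence of the grammar."""
    parser = Parser(list(tokens))
    try:
        parser.E()
        parser.match("$")
        return True
    except SyntaxError:
        return False
```

The ε alternatives are handled by simply returning without consuming input, which is why E_prime and T_prime have no else branch.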

42. Define predictive parsers?

A predictive parser is an efficient way of implementing recursive-descent parsing by handling the stack of activation records explicitly. The predictive parser has an input, a stack, a parsing table and an output.

43. Define FIRST in predictive parsing?

a) If X is a terminal, then FIRST(X) is {X}.
b) If X → ε is a production, then add ε to FIRST(X).
c) If X is a nonterminal and X → Y1Y2…Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1), …, FIRST(Yi-1); that is, Y1…Yi-1 ==>* ε. If ε is in FIRST(Yj) for all j = 1, 2, …, k, then add ε to FIRST(X).
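The three rules can be applied repeatedly until no FIRST set grows any further. The sketch below does this for a grammar encoded as a dictionary; that encoding, the empty list standing for an ε-production, and the name `compute_first` are assumptions made for illustration.

```python
EPS = "ε"

def compute_first(grammar, terminals):
    """Compute FIRST(X) for every grammar symbol X by iterating to a fixed point.

    grammar maps each nonterminal to a list of right sides (lists of symbols);
    an ε-production is written as an empty list.
    """
    first = {t: {t} for t in terminals}          # rule a
    for nt in grammar:
        first[nt] = set()
    changed = True
    while changed:
        changed = False
        for nt, right_sides in grammar.items():
            for rhs in right_sides:
                before = len(first[nt])
                all_nullable = True
                for symbol in rhs:               # rule c: scan Y1 Y2 ... Yk
                    first[nt] |= first[symbol] - {EPS}
                    if EPS not in first[symbol]:
                        all_nullable = False     # Yi cannot vanish: stop here
                        break
                if all_nullable:                 # rule b, and the ε case of rule c
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first
```

For the grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id, this gives FIRST(E) = FIRST(T) = FIRST(F) = { (, id }, FIRST(E') = { +, ε } and FIRST(T') = { *, ε }.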

44. Define FOLLOW in predictive parsing?

FOLLOW(A), for a nonterminal A, is defined to be the set of terminals a that can appear immediately to the right of A in some sentential form, that is, S ==>* αAaβ for some α and β. If A can be the rightmost symbol in some sentential form, then we add $ to FOLLOW(A).

45. Write the algorithm for the construction of a predictive parsing table?

Input: Grammar G
Output: Parsing table M
Method:

a) For each production A → α of the grammar, do steps b and c.
b) For each terminal a in FIRST(α), add A → α to M[A, a].
c) If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].
d) Make each undefined entry of M be error.

46. What is LL(1) grammar?

A grammar whose parsing table has no multiply-defined entries is said to be LL(1).

47. What are LR parsers?

LR(k) parsers scan the input from left to right (L) and construct a rightmost derivation in reverse (R). LR parsers consist of a driver routine and a parsing table. The k is the number of input symbols of lookahead that are used in making parsing decisions. When k is omitted, k is assumed to be 1. LR parsing is attractive because

a) LR parsers can be constructed to recognize virtually all programming language constructs for which context-free grammars can be written.

b) It is the most general nonbacktracking shift-reduce parsing method.

c) The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers.

d) It can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.

48. Define LR grammar?

A grammar for which we can construct a parsing table in which every entry is uniquely defined is said to be an LR grammar.

49. What is augmented grammar?

If G is a grammar with start symbol S, then G’, the augmented grammar for G, is G with a new start symbol S’ and production S’ → S. Its purpose is to indicate to the parser when it should stop and announce acceptance of the input.

Unit III

1. Define procedure definition?

A procedure definition is a declaration that, in its simplest form, associates an identifier with a statement. The identifier is the procedure name, and the statement is the procedure body. Some of the identifiers appearing in a procedure definition are special and are called formal parameters of the procedure. Arguments, known as actual parameters, may be passed to a called procedure; they are substituted for the formals in the body.

2. Define activation trees?

A recursive procedure p need not call itself directly; p may call another procedure q, which may then call p through some sequence of procedure calls. We can use a tree, called an activation tree, to depict the way control enters and leaves activations. In an activation tree

a) Each node represents an activation of a procedure,
b) The root represents the activation of the main program,
c) The node for a is the parent of the node for b if and only if control flows from activation a to b, and
d) The node for a is to the left of the node for b if and only if the lifetime of a occurs before the lifetime of b.

3. Write notes on control stack?

A control stack is used to keep track of live procedure activations. The idea is to push the node for an activation onto the control stack as the activation begins and to pop the node when the activation ends.

4. Write the scope of a declaration?

The portion of the program to which a declaration applies is called the scope of that declaration. An occurrence of a name in a procedure is said to be local to the procedure if it is in the scope of a declaration within the procedure; otherwise, the occurrence is said to be nonlocal.

5. Define binding of names?

When an environment associates storage location s with a name x, we say that x is bound to s; the association itself is referred to as a binding of x. A binding is the dynamic counterpart of a declaration.

6. What is the use of run time storage? The run time storage might be subdivided to hold a) The generated target code b) Data objects, and c) A counterpart of the control stack to keep track of procedure activation.

7. What is an activation record?

Information needed by a single execution of a procedure is managed using a contiguous block of storage called an activation record or frame, consisting of a collection of fields such as

a) Return value
b) Actual parameters
c) Optional control link
d) Optional access link
e) Saved machine status
f) Local data
g) Temporaries

8. What are the storage allocation strategies? a) Static allocation lays out storage for all data objects at compile time. b) Stack allocation manages the run-time storage as a stack. c) Heap allocation allocates and deallocates storage as needed at run time from a data area known as the heap.

9. What is static allocation?

In static allocation, names are bound to storage as the program is compiled, so there is no need for a run-time support package. Since the bindings do not change at run time, every time a procedure is activated, its names are bound to the same storage locations.

10. What is stack allocation?

Stack allocation is based on the idea of a control stack; storage is organized as a stack, and activation records are pushed and popped as activations begin and end respectively.

11. What are the limitations of static allocation? a) The size of a data object and constraints on its position in memory must be known at compile time. b) Recursive procedures are restricted. c) Data structures cannot be created dynamically.

12. What is dangling references?

Whenever storage can be deallocated, the problem of dangling references arises. A dangling reference occurs when there is a reference to storage that has been deallocated.

13. What is heap allocation?

Heap allocation parcels out pieces of contiguous storage, as needed for activation records or other objects. Pieces may be deallocated in any order, so over time the heap will consist of alternate areas that are free and in use.

14. Define displays?

Faster access to nonlocals than with access links can be obtained using an array d of pointers to activation records, called a display. The display changes when a new activation occurs, and it must be reset when control returns from the new activation.

15. Write notes on call-by-value?

This is the simplest method for passing parameters. The actual parameters are evaluated and their r-values are passed to the called procedure. Call-by-value can be implemented as follows

a) A formal parameter is treated just like a local name, so the storage for the formals is in the activation record of the called procedure.

b) The caller evaluates the actual parameters and places their r-values in the storage for the formals.

16. What is meant by call-by-reference?

When parameters are passed by reference, the caller passes to the called procedure a pointer to the storage address of each actual parameter.

a) If an actual parameter is a name or an expression having an l-value, then that l-value itself is passed.

b) However, if the actual parameter is an expression that has no l-value, then the expression is evaluated in a new location, and the address of that location is passed.

17. What is meant by copy-restore?

A hybrid between call-by-value and call-by-reference is copy-restore linkage.

1. Before control flows to the called procedure, the actual parameters are evaluated. The r-values of the actuals are passed to the called procedure as in call-by-value; in addition, the l-values of those actual parameters having l-values are determined before the call.

2. When control returns, the current r-values of the formal parameters are copied back into the l-values of the actuals, using the l-values computed before the call.

18. Write notes on call-by-name?

Call-by-name is traditionally defined by the copy-rule of Algol, which is

a) The procedure is treated as if it were a macro; that is, its body is substituted for the call in the caller, with the actual parameters literally substituted for the formals. Such a literal substitution is called macro-expansion or in-line expansion.

b) The local names of the called procedure are kept distinct from the names of the calling procedure.

c) The actual parameters are surrounded by parentheses if necessary to preserve their integrity.

19. Define symbol tables?

A compiler uses a symbol table to keep track of scope and binding information about names. The symbol table is searched every time a name is encountered in the source text. Two symbol table mechanisms are linear lists and hash tables.

20. Define garbage?

Dynamically allocated storage can become unreachable. Storage that a program allocates but cannot refer to is called garbage. Lisp performs garbage collection that reclaims inaccessible storage.

21. What are the dynamic storage allocation techniques? a) Explicit allocation of fixed-sized blocks. b) Explicit allocation of variable-sized blocks - one method is the first-fit method: when a block of size s is to be allocated, we search for the first free block that is of size f ≥ s. c) Implicit deallocation - it requires cooperation between the user program and the run-time package.
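The first-fit search can be sketched as follows. The free list is modelled here as a Python list of (start, length) pairs kept in address order; that representation and the function name `first_fit` are assumptions made for illustration, not a description of any real run-time package.

```python
def first_fit(free_blocks, size):
    """Allocate `size` units from the first free block that is large enough.

    free_blocks is a list of (start, length) pairs in address order; it is
    updated in place. Returns the start address, or None if no block fits.
    """
    for i, (start, length) in enumerate(free_blocks):
        if length >= size:                      # first block with f >= s
            if length == size:
                del free_blocks[i]              # block is used up entirely
            else:
                # keep the unused tail of the block on the free list
                free_blocks[i] = (start + size, length - size)
            return start
    return None
```

With free blocks [(0, 4), (10, 8), (30, 16)], a request for 6 units skips the 4-unit block and is served from address 10, leaving a 2-unit remainder at address 16.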

22. What is memory map?

For each data area the compiler creates a memory map, which is a description of the contents of the area. This memory map might simply consist of an indication, in the symbol-table entry for each name in the area, of its offset in the area.

23. Write notes on COMMON statement?

When processing a declaration like

COMMON /BLOCK1/ NAME1, NAME2

the compiler must do the following.

a) In the table for COMMON block names, create a record for BLOCK1, if one does not already exist.

b) In the symbol table entries for NAME1 and NAME2, set a pointer to the symbol table entry for BLOCK1, indicating that these are in COMMON and are members of BLOCK1.

24. Write notes on equivalence statement?

A sequence of EQUIVALENCE statements groups names into equivalence sets whose positions relative to one another are all defined by the EQUIVALENCE statements. For example

EQUIVALENCE A, B+100
EQUIVALENCE C, D-100
EQUIVALENCE A, C+30
EQUIVALENCE E, F

groups names into the sets {A, B, C, D} and {E, F}, where E and F denote the same location.

25. Write the algorithm for adjustment of offsets?

begin
    h := offset(nk-1);
    for i := k-2 downto 1 do
    begin
        parent(ni) := nk;
        h := h + offset(ni);
        offset(ni) := h
    end
end


Unit IV

50. Define intermediate code?

In many compilers the source code is translated into a language which is intermediate in complexity between a high-level programming language and machine code. Such a language is therefore called intermediate code or intermediate text.

51. What are the benefits of using a machine-independent intermediate form?

a) Retargeting is facilitated; a compiler for a different machine can be created by attaching a back end for the new machine to an existing front end.

b) A machine-independent code optimizer can be applied to the intermediate representation.

52. What are the various kinds of intermediate representations for intermediate code generation?

a) Syntax trees b) Postfix notation c) Three address code

53. What is syntax directed translation scheme?

A syntax directed translation scheme is merely a context-free grammar in which a program fragment called an output action (or sometimes a semantic action or semantic rule) is associated with each production.

54. Write the syntax-directed translation scheme for infix-postfix translation?

PRODUCTION              SEMANTIC ACTION
E → E(1) op E(2)        E.CODE := E(1).CODE || E(2).CODE || op
E → ( E(1) )            E.CODE := E(1).CODE
E → id                  E.CODE := id
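The semantic actions of this scheme amount to a recursive concatenation, which can be sketched as below. The tuple encoding of the expression tree and the function name `code` are assumptions made for illustration; symbols are joined with a space for readability.

```python
def code(node):
    """Return E.CODE (the postfix string) for an expression tree.

    A leaf is an identifier string; an interior node is a tuple
    (op, left, right). Parentheses need no node of their own:
    E -> ( E1 ) just passes E1.CODE through.
    """
    if isinstance(node, str):                    # E -> id
        return node
    op, left, right = node                       # E -> E1 op E2
    return f"{code(left)} {code(right)} {op}"
```

For example, the tree for (a + b) * c, written as `("*", ("+", "a", "b"), "c")`, yields `a b + c *`.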

55. Define parse trees and syntax trees.

The parse tree itself is a useful intermediate-language representation for a source program. A parse tree, however, often contains redundant information which can be eliminated. A variant of a parse tree is what is called a syntax tree, a tree in which each leaf represents an operand and each interior node an operator.

56. What is a three-address code?

Three-address code is a sequence of statements, typically of the general form A := B op C, where A, B and C are either programmer-defined names, constants or compiler-generated temporary names; op stands for any operator, such as a fixed- or floating-point arithmetic operator, or a logical operator on Boolean-valued data.
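A minimal sketch of how such statements might be produced from an expression tree is given below. The tree encoding (`("neg", e)` for unary minus, `(op, e1, e2)` for binary operators), the temporary names t1, t2, … and the function name `generate_tac` are assumptions made for illustration.

```python
import itertools

def generate_tac(target, expr):
    """Translate `target := expr` into a list of three-address statements.

    expr is an identifier string, ("neg", e) for unary minus, or (op, e1, e2).
    """
    statements = []
    counter = itertools.count(1)

    def new_temp():
        return f"t{next(counter)}"               # compiler-generated temporary

    def walk(node):
        """Emit code for node and return the name that holds its value."""
        if isinstance(node, str):
            return node
        if node[0] == "neg":
            operand = walk(node[1])
            temp = new_temp()
            statements.append(f"{temp} := -{operand}")
            return temp
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        statements.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    result = walk(expr)
    statements.append(f"{target} := {result}")
    return statements
```

Each interior node of the tree gets one temporary, so every emitted statement has at most one operator on its right side.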

57. Write the three address code for the assignment statement a := b * -c + b * -c

t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5

58. Name any four types of three-address statements?

a) Assignment statements of the form x := y op z b) Assignment instruction of the form x := op y c) Copy statement of the form x := y d) The unconditional jump goto L.

59. What are the representations of three-address statements?

A three-address statement is an abstract form of intermediate code. Three representations are available. They are

a) Quadruples
b) Triples
c) Indirect triples

60. Write notes on quadruples?

A quadruple is a record structure with four fields, which we call op, arg1, arg2 and result. The op field contains an internal code for the operator. The three-address statement x := y op z is represented by placing y in arg1, z in arg2, and x in result. The quadruple representation for x := y * z is shown below

       Op      Arg1    Arg2    Result
(0)    *       y       z       x

The quadruple representation for x := -y is shown below

       Op      Arg1    Arg2    Result
(0)    -       y               x

61. Write notes on triples?

To avoid entering temporary names into the symbol table, we might refer to a temporary value by the position of the statement that computes it. Three-address statements are then represented by triples with the three fields op, arg1 and arg2. The triple representation for the assignment statement a := b * -c + b * -c is shown below

       Op      Arg1    Arg2
(0)    uminus  c
(1)    *       b       (0)
(2)    uminus  c
(3)    *       b       (2)
(4)    +       (1)     (3)
(5)    assign  a       (4)
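The idea of naming a value by the position of the statement that computes it can be sketched as follows. Triples are modelled as Python tuples, an integer argument stands for a reference such as (0), and the tree encoding and the name `to_triples` are assumptions made for illustration.

```python
def to_triples(target, expr):
    """Build the triple list for `target := expr`.

    Each triple is (op, arg1, arg2). A string argument is a name; an int
    argument refers to the position of the triple that computes the value,
    so no temporary names are needed.
    expr is a name, ("uminus", e), or (op, e1, e2).
    """
    triples = []

    def walk(node):
        """Append triples for node; return a name or the position of its value."""
        if isinstance(node, str):
            return node
        if node[0] == "uminus":
            arg = walk(node[1])
            triples.append(("uminus", arg, None))
        else:
            op, left, right = node
            arg1 = walk(left)
            arg2 = walk(right)
            triples.append((op, arg1, arg2))
        return len(triples) - 1                  # position of the new triple

    value = walk(expr)
    triples.append(("assign", target, value))
    return triples
```

Running it on the tree for a := b * -c + b * -c gives six triples whose integer arguments 0, 2, 1, 3 and 4 correspond to the parenthesized references in a triple table for that statement.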

62. Define indirect triples?

Another implementation of three-address code is that of listing pointers to triples, rather than listing the triples themselves. This implementation is naturally called indirect triples.

63. Write the translation scheme for the statement if A < B then 1 else 0

(1) if A < B goto (4)
(2) T := 0
(3) goto (5)
(4) T := 1
(5) …

64. Write a grammar for array reference?

A → L := E
L → id [ elist ] | id
elist → elist , E | E
E → E + E | ( E ) | L

65. Write a grammar for simple procedure call statement?

S → call id ( elist )
elist → elist , E
elist → E

66. Write a grammar for declaration statement?

D → integer namelist | namelist
namelist → id , namelist | id

67. Write the grammar for Boolean expression?

E → E or E | E and E | not E | ( E ) | id relop id | true | false

We use the attribute op to determine which of the comparison operators <, ≥, ≤, ≠, or > is represented by relop.

68. What are the methods of translating Boolean expression?

There are two principal methods of representing the value of a Boolean expression. The first method is to encode true and false numerically and to evaluate a Boolean expression analogously to an arithmetic expression. The second principal method of implementing Boolean expressions is by flow of control, that is, representing the value of a Boolean expression by a position reached in a program.

69. Define short-circuit code?

We can also translate a Boolean expression into three-address code without generating code for any of the Boolean operators and without having the code necessarily evaluate the entire expression. This style of evaluation is sometimes called “short-circuit” or “jumping” code.

70. Write the grammar for if-then, if-then-else, and while-do statements?

S → if E then S1
  | if E then S1 else S2
  | while E do S1

71. Write the three address code for the following statements?

while a < b do
    if c < d then
        x := y + z
    else
        x := y - z

Answer

L1:    if a < b goto L2
       goto Lnext
L2:    if c < d goto L3
       goto L4
L3:    t1 := y + z
       x := t1
       goto L1
L4:    t2 := y - z
       x := t2
       goto L1
Lnext:

72. Define back patching?

The main problem with generating code for Boolean expressions and flow-of-control statements in a single pass is that we may not know the labels that control must go to at the time the jump statements are generated. We can get around this problem by generating a series of branching statements with the targets of the jumps left unspecified. The targets are filled in when the proper labels can be determined. We call this subsequent filling in of labels back-patching.
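A minimal sketch of the idea: jumps are emitted with an empty target and their indexes are remembered, then the targets are filled in once the position of the destination is known. The `CodeBuffer` class, its methods, and the hard-coded statement `if a < b then x := 1 else x := 0` are all assumptions made for illustration.

```python
class CodeBuffer:
    """Holds generated statements; jump targets may be filled in later."""

    def __init__(self):
        self.statements = []                     # entries: [text, target or None]

    def next_index(self):
        return len(self.statements)

    def emit(self, text, target=None):
        """Append a statement and return its index."""
        self.statements.append([text, target])
        return len(self.statements) - 1

    def backpatch(self, indexes, target):
        """Fill in `target` for every jump whose index is listed."""
        for i in indexes:
            self.statements[i][1] = target

    def listing(self):
        lines = []
        for i, (text, target) in enumerate(self.statements):
            suffix = "" if target is None else f" {target}"
            lines.append(f"{i}: {text}{suffix}")
        return lines

def translate_if_else(buf):
    """Generate code for: if a < b then x := 1 else x := 0 (one pass)."""
    true_jump = buf.emit("if a < b goto")        # target not yet known
    false_jump = buf.emit("goto")                # target not yet known
    buf.backpatch([true_jump], buf.next_index())     # then-part starts here
    buf.emit("x := 1")
    exit_jump = buf.emit("goto")                 # must skip the else-part
    buf.backpatch([false_jump], buf.next_index())    # else-part starts here
    buf.emit("x := 0")
    buf.backpatch([exit_jump], buf.next_index())     # first statement after the if
    return buf.listing()
```

In a fuller translator the lists of pending jumps would be carried as attributes of grammar symbols rather than as local variables, but the fill-in step is the same.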

73. Write the three-address code for the expression given below
a < b or c < d and e < f

Answer
100 : if a < b goto 103
101 : t1 := 0
102 : goto 104
103 : t1 := 1
104 : if c < d goto 107
105 : t2 := 0
106 : goto 108
107 : t2 := 1
108 : if e < f goto 111
109 : t3 := 0
110 : goto 112
111 : t3 := 1
112 : t4 := t2 and t3
113 : t5 := t1 or t4

74. Write the grammar for procedure calls?
The grammar for a simple procedure call statement is
S -> call id (Elist)
Elist -> Elist, E
Elist -> E


Unit V

1. What is meant by code generation phase?
The final phase in the compiler model is the code generator. It takes as input an intermediate representation of the source program and produces as output an equivalent target program. In code generation the output code must be correct and of high quality.

2. What code to generate to manage activation records at run time?
Two standard storage-allocation strategies were presented, namely, static allocation and stack allocation. In static allocation, the position of an activation record in memory is fixed at compile time. In stack allocation, a new activation record is pushed onto the stack for each execution of a procedure.

3. Write the various addressing modes together with their assembly-language forms and associated costs?

MODE                FORM     ADDRESS                       ADDED COST
Absolute            M        M                             1
Register            R        R                             0
Indexed             c(R)     c + contents(R)               1
Indirect register   *R       contents(R)                   0
Indirect indexed    *c(R)    contents(c + contents(R))     1

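4. What is the instruction cost of the following instructions?
MOV b, R0
MOV c, R0
MOV R0, a
Each instruction costs 1 plus the added cost of its addressing modes. Every instruction here has one absolute (memory) operand and one register operand, so each costs 2 and the total instruction cost is 6.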
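The cost of 6 can be checked mechanically. The following is a minimal sketch, assuming that the cost of an instruction is 1 plus the added cost of each operand's addressing mode as listed in question 3. The function names and the operand patterns (registers written R0, R1, ...) are assumptions made for this illustration.

```python
import re

# Added cost per addressing mode, taken from the table in question 3.
ADDED_COST = {"absolute": 1, "register": 0, "indexed": 1,
              "indirect register": 0, "indirect indexed": 1}

def mode_of(operand):
    """Classify an operand string; registers are assumed to be named R0, R1, ..."""
    op = operand.strip()
    if re.fullmatch(r"R\d+", op):
        return "register"
    if re.fullmatch(r"\*R\d+", op):
        return "indirect register"
    if re.fullmatch(r"\*[^()]+\(R\d+\)", op):
        return "indirect indexed"
    if re.fullmatch(r"[^*()]+\(R\d+\)", op):
        return "indexed"
    return "absolute"

def instruction_cost(instr):
    """Cost = 1 for the instruction itself + added cost of each operand."""
    _opcode, operands = instr.split(None, 1)
    return 1 + sum(ADDED_COST[mode_of(o)] for o in operands.split(","))

program = ["MOV b, R0", "MOV c, R0", "MOV R0, a"]
total = sum(instruction_cost(i) for i in program)
```

Here total is the sum of the three per-instruction costs; a register-to-register move such as MOV R0, R1 costs only 1.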

5. What are the two standard storage allocation strategies? a) Static allocation. b) Stack allocation.

6. Write the three address statements related to procedure calls? a) Call b) Return c) halt d) action

7. Write notes on stack allocation?
In stack allocation the position of the record for an activation of a procedure is usually stored in a register, so words in the activation record can be accessed as offsets from the value in the register. The indexed addressing mode of the target machine is convenient for this purpose.

8. Define code optimization?
The term code optimization refers to techniques a compiler can employ in an attempt to produce a better object language program than the most obvious one for a given source program.

9. Define basic blocks?
A basic block is a sequence of consecutive statements which may be entered only at the beginning and which, when entered, is executed in sequence without halt or possibility of branch (except at the end).

10. What are the structure preserving transformations of basic blocks? a) common sub expression elimination. b) dead-code elimination. c) renaming of temporary variables. d) interchange of two independent adjacent statements.

11. Define flow graph?
The basic blocks and their successor relationships are portrayed by a directed graph called a flow graph. The nodes of the flow graph are the basic blocks.

12. Define loops?
A loop is a collection of nodes that (i) is strongly connected and (ii) has a unique entry.

13. What is a preheader?
Several transformations require us to move statements “before the header”. We therefore begin treatment of a loop L by creating a new block, called the preheader. The preheader has only the header as successor, and all edges which formerly entered the header of L from outside L instead enter the preheader.

14. Define code motion?
An important source of modification is called code motion, where we take a computation that yields the same result independent of the number of times through the loop and place it before the loop.

15. What is reduction in strength?
The replacement of an expensive operation by a cheaper one is termed reduction in strength, for example replacing a multiplication operation by an addition operation.

16. Define DAG?
A useful data structure for automatically analyzing basic blocks is a directed acyclic graph (DAG). A DAG is a directed graph with no cycles.

17. What are the applications of DAG’s?
a) Automatically detect common subexpressions. b) Determine which identifiers have their values used in the block. c) Determine which statements compute values which could be used outside the block.

18. Define Dominators?

Node d of a flow graph dominates node n, written d DOM n, if every path from the initial node of the flow graph to n goes through d. Under this definition, every node dominates itself, and the entry of a loop dominates all nodes in the loop.

19. What are reducible flow graphs?
A flow graph G is reducible if and only if we can partition the edges into two disjoint groups, forward edges and back edges, with the following two properties.
a) The forward edges form an acyclic graph in which every node can be reached from the initial node of G.
b) The back edges consist only of edges whose heads dominate their tails.

20. Define depth-first ordering?

The depth first ordering of the nodes is the reverse of the order in which we last visit the nodes in the preorder traversal.

21. Define induction variable?
An induction variable of a loop L is either a basic induction variable or a name J for which there is a basic induction variable I such that each time J is assigned in L, J’s value is the same linear function of the value of I.

22. Define loop unrolling?
Loop unrolling avoids a test at every iteration by recognizing that the number of iterations is constant and replicating the body of the loop.

23. What are the problems in code generation?
a) What instructions should we generate? b) In what order should we perform computations? c) What registers should we use?

24. What is peephole optimization?
Peephole optimization is a technique used in many compilers, in connection with the optimization of either intermediate or object code. It is really an attempt to overcome the difficulties encountered in syntax-directed generation of code.

25. Write the algorithm for constructing the natural loop?
procedure INSERT(m);
    if m is not in Loop then
    begin
        Loop := Loop U {m};
        push m onto STACK
    end;

/* main program */
STACK := empty;
Loop := {d};
INSERT(n);
while STACK is not empty do
begin
    pop m, the first element of STACK, off STACK;
    for each predecessor p of m do INSERT(p)
end


Part-B

1. Explain the phases of the compiler?
A compiler operates in phases, each of which transforms the source program from one representation to another. The various phases are a) lexical analyzer b) syntax analyzer c) semantic analyzer d) intermediate code generator e) code optimizer f) code generator

Two other activities, symbol table management and error handling, interact with the six phases.

Lexical analyzer:-
The lexical analyzer reads the source program one character at a time, carving the source program into a sequence of atomic units called tokens. Identifiers, keywords, constants, operators and punctuation symbols are typical tokens.

Syntax analyzer:-
Syntax analysis is also called parsing. It involves grouping the tokens of the source program into grammatical phrases that are used by the compiler to synthesize output.

Semantic analyzer:-
The semantic analysis phase checks the source program for semantic errors and gathers type information for the subsequent code generation phase. It uses the hierarchical structure determined by the syntax-analysis phase to identify the operators and operands of expressions and statements.

Intermediate code generator:-
After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program. This intermediate representation should have two important properties: it should be easy to produce and easy to translate into the target program.

Code optimizer:-
Certain compilers apply transformations to the output of the intermediate code generator. The aim is to produce an intermediate-language program from which a faster or smaller object program can be produced. This phase is called the optimization phase. Types of optimization are local optimization and loop optimization.

Code generator:-
The final phase of the compiler is the generation of target code, consisting normally of relocatable machine code or assembly code.

Symbol Table management:-
A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier. The data structure allows us to find the record for each identifier quickly and to store and retrieve data from that record quickly.

Error Detection and Reporting:-
The lexical phase can detect errors where the characters remaining in the input do not form any token of the language. Errors where the token stream violates the structure rules of the language are detected by the syntax analysis phase.

2. Briefly discuss the cousins of the compiler?
The cousins of the compiler are a) Preprocessors b) Assemblers c) Two-pass assembly d) Loaders and link-editors

Preprocessors:-
Preprocessors produce input to compilers. They may perform the following functions.
Macro processing: A preprocessor may allow a user to define macros that are shorthands for longer constructs.
File inclusion: A preprocessor may include header files into the program text.
“Rational” preprocessors: These processors augment older languages with more modern flow-of-control and data-structuring facilities.
Language extensions: These processors attempt to add capabilities to the language by what amounts to built-in macros.

Assemblers:-
Some compilers produce assembly code that is passed to an assembler for further processing. Assembly code is a mnemonic version of machine code, in which names are used instead of binary codes for operations, and names are also given to memory addresses.

Two-pass assembly:-
The simplest form of assembler makes two passes over the input, where a pass consists of reading an input file once. In the first pass, all the identifiers that denote storage locations are found and stored in a symbol table. Identifiers are assigned storage locations.
In the second pass, the assembler scans the input again. It translates each operation code into the sequence of bits representing that operation in machine language. The output of the second pass is usually relocatable machine code.

Loaders and link-editors:-
A program called a loader performs the two functions of loading and link-editing. The process of loading consists of taking relocatable machine code, altering the relocatable addresses and placing the altered instructions and data in memory at the proper locations.

3. Explain about input buffering?
1. Buffer pairs
2. Sentinels

Specify the tokens in a language?
Strings and languages; operations on languages; regular expressions; regular definitions; notational shorthands; non-regular sets.

4. Explain about context free grammars?
Non-terminals are special symbols that denote sets of strings. “Syntactic variable” and “syntactic category” are other names for non-terminals. One non-terminal is selected as the start symbol. The productions (rewriting rules) define the ways in which the syntactic categories may be built from one another. A production consists of a non-terminal, followed by an arrow, followed by a string of non-terminals and terminals.

Figure in fig 4.1

Production rule for expression: -

The terminal symbols are id, +, -, *, /, ^, (, ).

Productions are

expression -> expression operator expression
expression -> (expression)
expression -> -expression
expression -> id
operator -> +
operator -> -
operator -> *
operator -> /
operator -> ^

Notational convention for production rule: -

1. Non-terminals are lowercase names such as expression, capital letters such as A, B, C, and the letter S (the start symbol).
2. Terminals are lowercase letters a, b, c, ...; operator symbols such as +; punctuation symbols such as parentheses and comma; and the digits 0, 1, ..., 9.
3. X, Y, Z represent grammar symbols, that is, either non-terminals or terminals.
4. α, β, γ represent strings of grammar symbols. A production is written A -> α.
5. If A -> α1, A -> α2, ..., A -> αk are all the productions with A on the left, we may write A -> α1 | α2 | ... | αk.
6. The left side of the first production is the start symbol.

Ex: Using shorthands, the above grammar is written as

E -> EAE | (E) | -E | id
A -> + | - | * | / | ^

E and A are non-terminals and E is the start symbol. The remaining symbols are terminals.

5. Briefly explain the stack implementation of shift-reduce parsing?

Shift-reduce parsing
Bottom-up parsing is called shift-reduce parsing. At each step a string matching the right side of a production is replaced by the symbol on the left.
Ex: consider the grammar
S -> aAcBe
A -> Ab | b
B -> d
Consider the string abbcde. It is reduced to S as follows:
abbcde
A -> b: aAbcde
A -> Ab: aAcde
B -> d: aAcBe
S -> aAcBe: S
Shift-reduce parsing is a process of finding and reducing handles.

Handles:
If S =>* αAw => αβw, then A -> β in the position following α is a handle of αβw.

Handle pruning:
A rightmost derivation in reverse, often called a canonical reduction sequence, is obtained by handle pruning.

Stack implementation of shift-reduce parsing:
A convenient way to implement a shift-reduce parser is to use a stack and an input buffer. $ is used to mark the bottom of the stack and the right end of the input. The parser shifts zero or more input symbols onto the stack until a handle is on top of the stack. The parser then reduces the handle to the left side of the appropriate production.

Ex: Steps of a shift-reduce parser in parsing id1 + id2 * id3 according to the grammar

E -> E + E | E * E | (E) | id

      Stack              Input                 Action
(1)   $                  id1 + id2 * id3 $     shift
(2)   $ id1              + id2 * id3 $         reduce by E -> id
(3)   $ E                + id2 * id3 $         shift
(4)   $ E +              id2 * id3 $           shift
(5)   $ E + id2          * id3 $               reduce by E -> id
(6)   $ E + E            * id3 $               shift
(7)   $ E + E *          id3 $                 shift
(8)   $ E + E * id3      $                     reduce by E -> id
(9)   $ E + E * E        $                     reduce by E -> E * E
(10)  $ E + E            $                     reduce by E -> E + E
(11)  $ E                $                     accept

The four possible actions of a shift-reduce parser are (1) shift, (2) reduce, (3) accept and (4) error.
1. In a shift action, the next input symbol is shifted onto the top of the stack.
2. In a reduce action, the parser knows that the right end of the handle is at the top of the stack; it locates the left end of the handle and replaces the handle by a non-terminal.
3. In an accept action, the parser announces successful completion of parsing.
4. In an error action, the parser discovers that a syntax error has occurred.
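The four actions can be traced with a small sketch for the grammar E -> E + E | E * E | (E) | id. Since this grammar is ambiguous, the sketch assumes that * binds tighter than + and that both operators are left associative in order to choose between shift and reduce; the PREC table and the function name are assumptions made for this illustration.

```python
PREC = {"+": 1, "*": 2}   # assumed operator precedences

def shift_reduce(tokens):
    """Return the list of actions taken while parsing 'tokens'."""
    stack, inp, trace = ["$"], list(tokens) + ["$"], []
    while True:
        look = inp[0]
        if stack[-1] == "id":
            stack[-1] = "E"                       # handle id
            trace.append("reduce by E -> id")
        elif stack[-3:] == ["(", "E", ")"]:
            stack[-3:] = ["E"]                    # handle (E)
            trace.append("reduce by E -> (E)")
        elif (len(stack) >= 4 and stack[-1] == "E" and stack[-2] in PREC
              and stack[-3] == "E" and PREC.get(look, 0) <= PREC[stack[-2]]):
            op = stack[-2]
            stack[-3:] = ["E"]                    # handle E op E
            trace.append(f"reduce by E -> E {op} E")
        elif stack == ["$", "E"] and look == "$":
            trace.append("accept")
            return trace
        elif look == "$":
            trace.append("error")                 # input exhausted, no handle
            return trace
        else:
            stack.append(inp.pop(0))              # shift the next input symbol
            trace.append("shift")

trace = shift_reduce(["id", "+", "id", "*", "id"])
```

For the input id + id * id the trace has eleven steps and ends with the reductions by E -> E * E and E -> E + E followed by accept.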

Constructing a Parse Tree
The bottom-up tree construction has two aspects.
(1) When we shift an input symbol a onto the stack, we create a one-node tree labeled a.
(2) When we reduce X1, X2, ..., Xn to A, we create a new node labeled A whose children are the roots of the trees for X1, X2, ..., Xn.

Ex:-
[Figures (a) to (c): partial parse trees built while parsing id1 + id2 * id3. fig (a): stack $ id1, then stack $ E; fig (b): stack $ E + E; fig (c): after reducing id1 + id2 * id3 to E + E.]

6.Explain in detail Code Optimization

Code optimization refers to techniques a compiler can employ in an attempt to produce a better object language program than the most obvious one for a given source program.

(Or) Code improvement

The principal sources of optimization

Code optimizations are generally applied after syntax analysis, usually both before and during code generation.

Inner loops; removal of loop-invariant computations; elimination of induction variables; language implementation details inaccessible to the user; identification of common subexpressions.

Loop optimization
Techniques for detecting loops and optimizing them.

Basic Blocks

A basic block is a sequence of consecutive statements which may be entered only at the beginning and which, when entered, is executed in sequence without halt.

Algorithm: Partition into basic blocks

Input: A sequence of three-address statements.

Output: A list of basic blocks of three-address statements.

Method:
1. Determine the set of leaders. The rules are
(i) The first statement is a leader.
(ii) Any statement which is the target of a conditional or unconditional goto is a leader.
(iii) Any statement which immediately follows a conditional goto is a leader.
2. For each leader, construct its basic block, which consists of the leader and all statements up to but not including the next leader or the end of the program.
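The partitioning method can be sketched as follows. This is a minimal illustration that assumes statements are given as (address, text) pairs and that jump targets appear in the text as "goto N"; it also treats the statement after an unconditional goto as a leader, which is a slight extension of rule (iii).

```python
import re

def partition(stmts):
    """stmts: list of (address, text). Returns a list of blocks (lists of addresses)."""
    addrs = [a for a, _ in stmts]
    leaders = {addrs[0]}                          # rule (i): first statement
    for k, (_addr, text) in enumerate(stmts):
        m = re.search(r"goto (\d+)", text)
        if m:
            leaders.add(int(m.group(1)))          # rule (ii): target of a goto
            if k + 1 < len(stmts):
                leaders.add(addrs[k + 1])         # rule (iii): statement after a goto
    blocks = []
    for addr in addrs:
        if addr in leaders:                       # each leader opens a new block
            blocks.append([])
        blocks[-1].append(addr)
    return blocks

code = [(100, "if a < b goto 103"),
        (101, "t1 := 0"),
        (102, "goto 104"),
        (103, "t1 := 1"),
        (104, "t5 := t1")]
blocks = partition(code)
```

For this fragment the leaders are 100, 101, 103 and 104, so four basic blocks result.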

Flow graphs

The basic blocks and their successor relationships are portrayed by a directed graph called a flow graph. The nodes of the flow graph are the basic blocks.

Loops

A loop is a collection of nodes that (i) is strongly connected, that is, from any node in the loop to any other there is a path, and (ii) has a unique entry.

Code motion

Running time is reduced by decreasing the length of loops. Code motion involves finding loop-invariant computations and placing them before the loop.

Ex: The statements T2 := addr(A) - 4 and T4 := addr(B) - 4 are loop invariant, so they are placed before the loop.

Induction Variables

When there are two or more induction variables in a loop we can get rid of all but one, and we call this process induction variable elimination.

Ex: T1 := 4 * I, where I takes the values 1, 2, ..., 20. Thus T1 takes the values 4, 8, ..., 80. We replace this statement by T1 := T1 + 4, with T1 initialized to 0 before the loop. Hence the variable I can be eliminated if it has no other use.
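A quick check that the two forms of the statement produce the same values, assuming T1 starts at 0 before the loop (the function names are illustrative only):

```python
def with_multiplication():
    # T1 := 4 * I for I = 1, 2, ..., 20
    return [4 * i for i in range(1, 21)]

def with_addition():
    values, t1 = [], 0            # T1 initialised to 0 before the loop
    for _ in range(20):           # 20 iterations, no use of I in the body
        t1 = t1 + 4               # T1 := T1 + 4
        values.append(t1)
    return values
```

Both functions return the sequence 4, 8, ..., 80, so the multiplication can be replaced by the cheaper addition.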


Reduction in strength

The replacement of an expensive operation with a cheaper one is called reduction in strength.
Ex:
1. T1 := 4 * I is replaced by T1 := T1 + 4.
2. L := LENGTH(S1 || S2) is replaced by L := LENGTH(S1) + LENGTH(S2).

The DAG Representation of Basic Blocks

A data structure for automatically analyzing basic blocks is a directed acyclic graph (DAG). Constructing a DAG from three-address code is a good way of determining common subexpressions, determining which names are used inside the block but computed outside it, and determining which statements of the block could have their computed values used outside the block.

A DAG for a basic block is a directed acyclic graph with the following labels on nodes.
1. Leaves are labeled by unique identifiers, either variable names or constants.
2. Interior nodes are labeled by an operator symbol.
3. Nodes are also optionally given an extra set of identifiers as labels. The interior nodes represent computed values, and the identifiers labeling a node are deemed to have that value.

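7. Write the algorithm for constructing a DAG

Input: A basic block.
Output: A DAG with the following information
1. A label for each node: for leaves the label is an identifier, for interior nodes an operator symbol.
2. For each node, a list of attached identifiers.

Method: for each statement A := B op C of the block, in order,
i) If Node(B) is undefined, create a leaf labeled B and let Node(B) be this node. Do the same for C.
ii) Determine whether there is a node labeled op whose left child is Node(B) and whose right child is Node(C). If not, create it. Let n be the node found or created.
iii) Append A to the list of attached identifiers for the node n, deleting A from the list of the node it was previously attached to. Finally set Node(A) to n.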
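The three steps can be sketched for statements of the form A := B op C. The tuple representation of statements and nodes and the function name build_dag are assumptions made for this illustration; unary operators and copy statements are left out.

```python
def build_dag(block):
    """block: list of (A, op, B, C) tuples meaning A := B op C."""
    nodes = []        # nodes[i] = (label, left child index, right child index)
    node_of = {}      # Node(name): index of the node currently holding name
    interior = {}     # (op, left, right) -> index of the existing interior node
    attached = {}     # node index -> list of attached identifiers

    def leaf(name):
        if name not in node_of:                   # step (i): create a leaf
            nodes.append((name, None, None))
            node_of[name] = len(nodes) - 1
        return node_of[name]

    for a, op, b, c in block:
        key = (op, leaf(b), leaf(c))
        if key not in interior:                   # step (ii): find or create node
            nodes.append(key)
            interior[key] = len(nodes) - 1
        n = interior[key]
        old = node_of.get(a)                      # step (iii): move identifier A
        if old is not None and a in attached.get(old, []):
            attached[old].remove(a)
        attached.setdefault(n, []).append(a)
        node_of[a] = n
    return nodes, attached

nodes, attached = build_dag([("t1", "*", "b", "c"),
                             ("t2", "*", "b", "c"),
                             ("a", "+", "t1", "t2")])
```

Because b * c is computed twice, t1 and t2 end up attached to the same interior node, which is how the DAG exposes the common subexpression.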

Application of DAG’s

1. Detect common subexpressions. 2. Determine which identifiers have their values used in the block. 3. Determine which statements compute values which could be used outside the block.

Arrays, pointers and procedure calls

Consider the block
X := A[I]
A[J] := Y
Z := A[I]
If A[I] is treated as a common subexpression, the block is optimized to
X := A[I]
Z := X
A[J] := Y
This is correct only if I and J cannot be equal. When I = J the assignment to A[J] changes the value of A[I], so assignments to array elements must be treated with care.

Value Numbers and Algebraic Laws

Value number

A hash table is used to determine whether a node with given children and operator already exists. The numbers that identify the nodes in this scheme are called value numbers.

Algebraic Law

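Multiplication is commutative, so we use
A * B = B * A
The associative law can also be used for code improvement:
A := B + C
E := C + D + B
Here E can be evaluated as (B + C) + D, so that B + C becomes a common subexpression.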

Global Data flow analysis

Given that identifier A is used at point p, at which points could the value of A used at p have been defined? Answering this question is called use-definition (ud) chaining.

Reaching Definitions:

To determine this, we assign a distinct number to each definition. Since each definition is associated with a unique quadruple, the index of that quadruple will do.

8. Explain in detail More About Loop Optimization

A loop is a cycle in the flow graph of a program that
1. has a single entry node, and
2. is strongly connected.

Reducible flow graph: a flow graph in which every cycle has a unique entry.

Detecting loops is done by using dominators.

Dominators


Node d of a flow graph dominates node n, written d DOM n, if every path from the initial node of the flow graph to n goes through d.

Properties of DOM
1. Dominance is reflexive (a DOM a for all a), antisymmetric (a DOM b and b DOM a imply a = b) and transitive (a DOM b and b DOM c imply a DOM c).
2. The dominators of each node n are linearly ordered by the DOM relation.

Dominator tree

A useful way of presenting dominator information is in a tree called Dominator Tree, in which the initial node is the root, and the parent of each other node is its immediate dominator.

Loop detection

A good way is to search for edges in the flow graph whose heads dominate their tails. We call such edges back edges.

Algorithm: Constructing the natural loop of a back edge.

Input: A flow graph G and a back edge n -> d.
Output: The set Loop consisting of all nodes in the natural loop of n -> d.

Method:
procedure INSERT(m);
    if m is not in Loop then
    begin
        Loop := Loop U {m};
        push m onto STACK
    end;

/* main program */
STACK := empty;
Loop := {d};
INSERT(n);
while STACK is not empty do
begin
    pop m, the first element of STACK, off STACK;
    for each predecessor p of m do INSERT(p)
end
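The procedure translates almost directly into runnable form. In this sketch the flow graph is assumed to be given as a dictionary mapping each node to a list of its predecessors; the function name natural_loop is chosen for the illustration.

```python
def natural_loop(preds, n, d):
    """Nodes of the natural loop of the back edge n -> d.

    preds: dict mapping a node to the list of its predecessors.
    """
    loop = {d}                    # Loop := {d}
    stack = []                    # STACK := empty

    def insert(m):
        if m not in loop:
            loop.add(m)
            stack.append(m)

    insert(n)
    while stack:
        m = stack.pop()
        for p in preds.get(m, []):   # each predecessor p of m
            insert(p)
    return loop

# Edges: 1->2, 2->3, 3->4, 4->2 (back edge), 3->5
preds = {2: [1, 4], 3: [2], 4: [3], 5: [3]}
loop = natural_loop(preds, 4, 2)
```

Since d is placed in Loop before the search starts, the predecessors of d outside the loop (node 1 here) are never examined.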

Finding dominators

It is based on the principle that if p1, p2, ..., pk are all the predecessors of n and d ≠ n, then d DOM n if and only if d DOM pi for each i.


Algorithm: Finding dominators

Input: A flow graph G with set of nodes N, set of edges E and initial node n0.
Output: The relation DOM.
Method: See Fig 13.5.
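Since the figure with the method is not reproduced here, the following is a sketch of the usual iterative formulation of the stated principle, which may differ in detail from the figure: the dominators of n are n itself plus the intersection of the dominator sets of its predecessors. The function name and the dictionary representation of the graph are assumptions.

```python
def dominators(nodes, preds, n0):
    """Return a dict mapping each node n to the set of nodes d with d DOM n."""
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates n"
    dom[n0] = {n0}                         # only n0 dominates the initial node
    changed = True
    while changed:                         # repeat until no set changes
        changed = False
        for n in nodes:
            if n == n0:
                continue
            ps = preds.get(n, [])
            common = set.intersection(*(dom[p] for p in ps)) if ps else set()
            new = common | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Edges: 1->2, 2->3, 3->4, 4->2, 3->5
dom = dominators([1, 2, 3, 4, 5], {2: [1, 4], 3: [2], 4: [3], 5: [3]}, 1)
```

In this graph the head 2 of the edge 4 -> 2 is in dom[4], which confirms that 4 -> 2 is a back edge.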


Reducible flow graphs

A reducible flow graph is a special flow graph on which several code optimizations are easily performed.
Definition: A flow graph G is reducible if and only if we can partition the edges into two disjoint groups, forward edges and back edges, with the following properties
1. The forward edges form an acyclic graph in which every node can be reached from the initial node of G.
2. The back edges consist only of edges whose heads dominate their tails.
Ex: fig 3.1 is not reducible. Reason: the cycle 2-3 can be entered at two different places, 2 and 3.

Properties of reducible flow graphs

*Loop contain back edgeInduction variable removal not directly applicable *Place nested structure on loops.

Depth first search

Depth first search is used to detect loops in any flow graph. Start at the initial node and search the entire graph, trying to visit nodes as far away from the initial node as quickly as possible.

Depth first ordering

It is the reverse of the order in which we last visit the nodes in the preorder traversal.
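A minimal sketch of this ordering, assuming the flow graph is a dictionary from each node to its list of successors (the function name is illustrative): a node is "last visited" when the search of all its successors has finished, so the ordering is the reverse of the order in which nodes are finished.

```python
def depth_first_order(succs, n0):
    """Depth-first ordering of the nodes reachable from n0."""
    visited, finished = set(), []

    def search(n):
        visited.add(n)
        for s in succs.get(n, []):
            if s not in visited:
                search(s)
        finished.append(n)        # n is visited for the last time here

    search(n0)
    return finished[::-1]         # reverse of the order of last visits

# Edges: 1->2, 2->3, 3->4, 3->5, 4->2
order = depth_first_order({1: [2], 2: [3], 3: [4, 5], 4: [2]}, 1)
```

In the resulting order the tail 4 of the retreating edge 4 -> 2 comes after its head 2.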

Depth first spanning trees

Depth first ordering of a flow graph by constructing and tracing a tree.

Algorithm: Depth first spanning tree and depth first ordering

Input: A flow graph G.
Output: A DFST of G and an ordering of the nodes of G.
Method

The edges in a depth first presentation of a flow graph are of three kinds.
1. Retreating (backward) edges go from a node m to an ancestor of m in the tree.
2. Advancing edges go from a node m to a proper descendant of m in the tree.
3. Cross edges m -> n are those where neither m nor n is an ancestor of the other.


The depth of a flow Graph


The depth is the largest number of retreating edges on any cycle-free path. depth is d(1043)

Loop-invariant computations

It is done by creating a new block called the pre-header. Pre header has header as successor.

Detection of Loop invariant computations

Discover the statements whose values are loop invariant, that is, do not change within the loop. Some of these assignments may be moved to the preheader.

9. Brief about Algorithm: Detection of loop-invariant computations

Input: A loop L consisting of a set of basic blocks, each block containing three-address statements.
Output: Three additional statement
Method:
1. Mark ‘invariant’
3. Repeat step (3) until at some repetition no new statements are marked.
4. Mark ‘invariant’ those statements that have exactly one reaching definition.

Performing code motion

Algorithm: Code motion
Input: A loop L with ud-chaining and dominator information.
Output: A revised version of the loop with some statements moved to the preheader.
Method:
1. Find the loop-invariant statements.
2. For each statement S defining A found in step 1, check
i) that it is in a block which dominates all exits of L,
ii) that A is not defined elsewhere in L, and
iii) that all uses in L of A can only be reached by the definition of A in statement S.
3. Move

Induction Variable Elimination

Basic induction variables of a loop L are those names I whose only assignments within loop L are of the form I := I + C, where C is a constant.

10. Brief about Algorithm: Detection and elimination of induction variables

Input: A loop L.
Output: A revised loop.
Method:
1. Find all basic induction variables in L.
2. Find additional induction variables.
3. For every induction variable A in the family of B
i) Create a new name SFA(B).
ii) Replace the assignment to A by A := SFA(B).
iii) Set FA(B) = C1 * B + C2.
4. For each basic induction variable B whose values are used only to compute other induction variables, delete all assignments to B from the loop.
5. Consider each induction variable A; replace all uses of A by uses of SFA(B) and delete the statement A := SFA(B).

Some other loop optimization

Loop unrolling

Loop unrolling avoids a test at every iteration when the number of iterations is constant, by replicating the body of the loop.

Ex: The loop
begin
    I := 1
    while I <= 100 do
    begin
        A[I] := 0
        I := I + 1
    end
end
can be unrolled once, since the number of iterations (100) is even, to give
begin
    I := 1
    while I <= 100 do
    begin
        A[I] := 0
        I := I + 1
        A[I] := 0
        I := I + 1
    end
end

Loop Jamming
Merge the bodies of two loops if the two loops are executed the same number of times and their indices are the same.
Ex: Fig 13.20

11.Explain in detail Code generation

Code generation is the final phase of compilation. The input to code generation is an intermediate language program: quadruples, triples, trees or a postfix Polish string. The output is the object program.

Problems in code generation

1. What instructions should we generate?
For example, A := A + 1 can be implemented by the single instruction AOS A (or) by the sequence LOAD A, ADD #1, STORE A.
2. In what order should we perform computations?
Some computation orders require fewer registers than others.
3. What registers should we use?
Some machines require register pairs for certain manipulations.

A machine model

Good code generation requires the knowledge of the target machine.

Addressing modes
1. r: register mode
2. *r: indirect register mode
3. x(r): indexed mode
4. *x(r): indirect indexed mode
5. #x: immediate mode
6. x: absolute mode

Examples of machine instructions
1. MOV R0, R1
2. MOV R5, M
3. ADD #1, R3
4. SUB 4(R0), *5(R0)

A simple code generator
For each operator in a quadruple there is a corresponding machine code operator.

The Code generation algorithm

Register descriptor

For register allocation, we shall maintain a register descriptor that keeps track of what is currently in each register. Initially it shows all registers are empty

Address descriptor

It keeps track of the location where the current value of the name can be found at runtime.

The Code-Generation algorithm

For each quadruple A := B op C
1. Invoke GETREG() to determine the location L where the computation of B op C should be performed. L is a register or a memory location.
2. Consult the address descriptor for B to determine B’, the current location of B. If the value of B is not already in L, generate MOV B’, L to place a copy in L.
3. Generate the instruction OP C’, L, where C’ is the current location of C. Update the address descriptor of A to indicate that A is in L.
4. If B or C has no next use, alter the register descriptor to indicate that the registers no longer contain B or C.
Suppose a quadruple A := B exists. Simply change the register and address descriptors to record that the value of A is in the register holding the value of B.

The function Getreg

GETREG returns a location L to hold the value of A for the assignment A := B op C.
1. If the name B is in a register that holds the value of no other names, and B is not live and has no next use after the quadruple, then return the register of B for L.
2. Failing (1), return an empty register for L.
3. Failing (2), if A has a next use in the block, or op is an operator such as indexing that requires a register, find an occupied register R, store the value of R into a memory location and return R.
4. If A is not used in the block, or no suitable register can be found, select the memory location of A as L.

Code generation for other types of statements

1. A := B[I]     MOV I’, L
                 MOV B(L), A
2. A[I] := B     MOV I’, L
                 MOV B, A(L)
3. A := *p       MOV *p, A

Conditional statement

Machines implement conditional jumps in one of two ways.
1. Have a collection of jump instructions that depend on a condition, for example jump if the result of subtracting B from A is negative.
2. Use condition codes, for example a code that is set positive if A > B.

Register allocation and assignment
Register operations are shorter and faster. Register allocation is the assignment of quantities to registers.

Global register allocation

To save stores and corresponding loads, assign registers to frequently used variables.


Code Generation from DAGs

Generating code for a basic block from its DAG representation. The advantage is that the order of the final computation can be rearranged.

Rearranging the order

Order of computation affect the cost of resulting object code.

A heuristic ordering for DAGs

Node listing algorithm

This algorithm produces the ordering in reverse .

Optimal ordering for trees

An optimal ordering is the order that yields the shortest instruction sequence. Here we use register pairs. The algorithm works on the tree representation of quadruples. It has two parts:
1. Label each node of the tree, bottom up, with an integer that denotes the fewest number of registers required to evaluate the tree.
2. Traverse the tree to generate the output code.

12. Brief about The Labeling Algorithm

Terms used: a left leaf is a node that is a leaf and the leftmost descendant of its parent. All other leaves are "right leaves".
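The labeling pass can be sketched in Python as below. The notes do not spell out the labeling rule, so the rule used here is the usual textbook one and should be read as an assumption: a left leaf gets label 1, a right leaf gets label 0, and an interior node with child labels l1 and l2 gets max(l1, l2) if they differ and l1 + 1 if they are equal. The Node class and label_tree function are hypothetical names, and only binary operators are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str                          # operator, or the name of an operand for a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: int = 0                   # filled in by label_tree


def label_tree(node, is_left=True):
    """Label the tree bottom up and return the label of `node`."""
    if node.left is None and node.right is None:
        # Left leaf needs a register; right leaf can be used from memory.
        node.label = 1 if is_left else 0
        return node.label
    l1 = label_tree(node.left, True)
    l2 = label_tree(node.right, False)
    node.label = max(l1, l2) if l1 != l2 else l1 + 1
    return node.label


# Tree for (a + b) - (e - (c + d))
t1 = Node("+", Node("a"), Node("b"))
t2 = Node("+", Node("c"), Node("d"))
t3 = Node("-", Node("e"), t2)
root = Node("-", t1, t3)
label_tree(root)
```

For this tree the root receives label 2, so under these assumptions two registers suffice to evaluate it without storing intermediate results.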

Peephole Optimization

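Peephole optimization is an attempt to overcome the difficulties of syntax-directed code generation. It examines the object code within a small range of instructions and replaces it with a shorter or faster sequence where possible.

Redundant loads and stores
Ex:
(1) MOV R0, A
(2) MOV A, R0
Instruction (2) can be deleted, provided it has no label.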

Unreachable code
An unlabeled instruction immediately following an unconditional jump may be removed.
Ex:
IF DEBUG = 1 GOTO L1
GOTO L2
L1: print
L2:
This is replaced by the following, which eliminates the jump over a jump:
IF DEBUG # 1 GOTO L2
print
L2:

Multiple jumps

Replace jumps to jumps with a jump directly to the final target.

Algebraic simplification

Statements such as X := X * 1 or X := X + 0 can be eliminated. A := 2 * A is replaced by ADD R1, R2….

Reduction in strength

Replace an expensive operation by an inexpensive one. For example, x² is replaced by x * x.

Use of machine idioms

Use hardware instructions that implement certain operations directly.
Ex: auto-increment and auto-decrement for A := A + 1 or A := A - 1
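Two of the peephole cases above, redundant loads and algebraic simplification, can be sketched as a single pass over an instruction list. This is a minimal Python illustration, not a complete optimizer. It assumes a hypothetical two-address tuple format (label, op, src, dst), where OP src, dst means dst := dst OP src, and immediate operands written #0 and #1. It also assumes no later instruction depends on condition codes set by a deleted instruction.

```python
def peephole(code):
    """Return a copy of `code` with two kinds of useless instructions removed.

    code: list of (label, op, src, dst) tuples; label is None when absent.
    """
    out = []
    for label, op, src, dst in code:
        # Algebraic simplification: X := X + 0 and X := X * 1 do nothing.
        if label is None and ((op == "ADD" and src == "#0") or
                              (op == "MUL" and src == "#1")):
            continue
        # Redundant load: MOV A, R0 right after MOV R0, A.
        # A labeled instruction is kept, since a jump may reach it directly.
        if op == "MOV" and label is None and out:
            _, prev_op, prev_src, prev_dst = out[-1]
            if prev_op == "MOV" and prev_src == dst and prev_dst == src:
                continue
        out.append((label, op, src, dst))
    return out


example = [
    (None, "MOV", "R0", "A"),
    (None, "MOV", "A", "R0"),   # redundant, removed
    (None, "ADD", "#0", "R1"),  # no-op, removed
    ("L1", "MOV", "A", "R0"),   # labeled, kept
]
optimized = peephole(example)
```

Peephole passes are often repeated until no further change occurs, because one replacement can expose another.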

13. Explain in detail Runtime Storage Administration

The rules that define the scope and declaration of names in a programming language determine how storage is allocated to data objects.

Implementation of a simple stack-allocation scheme

Consider C as implemented on UNIX. Data may be global or local.

In one possible organization of memory, the low memory locations contain the code for the various procedures and the global data. The run-time stack starts from the highest available location.

Organization of C activation record

In addition to local data, the activation record contains:
1. The values of the actual parameters
2. The count of the number of arguments
3. The return address
4. The return value
5. The value of SP for the caller's activation record (the old stack pointer)

Procedure calls in C

The translation of a call P, n statement first stores the argument count n, the return address, and the old stack pointer, and then jumps to the first statement of the called procedure.

Implementation of a Block-Structured Language

Examples of block-structured languages are ALGOL and PL/I. Blocks and procedures define their own data. These languages allow arrays of adjustable length.

Displays

A display is a common way of providing more direct access to non-local data. It consists of an array of pointers to the currently accessible activation records.
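The idea of a display can be illustrated with a small Python simulation. This is a sketch under assumptions that the notes do not state: the display is indexed by static nesting level, a call at level k saves the old display[k] in the new activation record and restores it on return, and the names Runtime, ActivationRecord, call, ret and lookup are all hypothetical.

```python
class ActivationRecord:
    def __init__(self, name, level, saved_display):
        self.name = name
        self.level = level                  # static nesting level of the procedure
        self.locals = {}                    # local data
        self.saved_display = saved_display  # old display entry for this level


class Runtime:
    def __init__(self, max_depth=8):
        self.display = [None] * max_depth   # one pointer per nesting level
        self.stack = []                     # run-time stack of activation records

    def call(self, name, level):
        record = ActivationRecord(name, level, self.display[level])
        self.display[level] = record
        self.stack.append(record)
        return record

    def ret(self):
        record = self.stack.pop()
        self.display[record.level] = record.saved_display

    def lookup(self, level, var):
        # Non-local access: one indexing step into the display, no chain to follow.
        return self.display[level].locals[var]


rt = Runtime()
rt.call("main", 0).locals["x"] = 10
rt.call("p", 1).locals["y"] = 5
rt.call("q", 2)
rt.call("p", 1).locals["y"] = 7   # q calls p again: display[1] now points to the new record
inner = rt.lookup(1, "y")
rt.ret()                          # return from the inner p restores display[1]
outer = rt.lookup(1, "y")
```

The point of the design is that a variable declared at level k is always reached through display[k], whatever the depth of the call stack.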

Format of activation Records

An activation record contains space for local data, the return address, the return value, the argument count, the actual parameters, and the old value of SP. It also holds pointers for the display.

Procedure calls

When a procedure P1 at level l1 calls P2 at level l2, the name P2 must be defined as part of P1.

Parameter passing

Arrays

Storage Allocation in FORTRAN

Storage is needed for temporaries, COMMON blocks, and EQUIVALENCE statements.

Data areas

The size and relative position of each data object must be known at compile time. FORTRAN has one data area for each routine and one data area for each COMMON block. The compiler computes the size of each data area.

A simple Equivalence Algorithm

Input: a list of equivalence-defining statements of the form EQUIVALENCE A, B + dist
Output: a collection of trees

Method:

Storage Allocation in Block-Structured Languages

Data will be put in the static areas for each procedure. The bulk of the data is allocated on the stack. When arrays are used, the count is incremented by the size of the pointer rather than by the size of the array. Temporaries used in expression evaluation are stored on the stack.

14. Notes on Error detection and Recovery.

The compiler should be able to detect errors and also to recover from them.

Errors

A simple compiler may stop all activities other than lexical and syntactic analysis after the detection of an error.

Some compilers may attempt to correct the erroneous input by making a guess.

Reporting errors

Good error reporting can help reduce debugging and maintenance effort. The desirable properties of error messages are:

1. The message should pinpoint the error.
2. The message should be understandable by the user.
3. The message should localize the problem.
4. The message should not be redundant.

Sources of error
1. The design specification may be in error.
2. The algorithm may be incorrect.
3. The programmer may introduce errors in implementing the algorithm.
4. Keypunching errors.
5. The program may exceed a machine limit.
6. The compiler can insert errors during translation.

Syntactic Errors
1. Missing right parenthesis
2. Extraneous comma
3. Colon in place of semicolon
4. Misspelled keyword
5. Extra blank

Minimum Distance correction of Syntactic errors

One way of defining errors and their location is the minimum Hamming distance method. It means the least number of insertions, deletions, and symbol modifications needed to transform one string into another.
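The distance described above, counting insertions, deletions and symbol modifications, is more commonly called edit (Levenshtein) distance today. A minimal dynamic-programming sketch in Python follows; the function name min_distance is hypothetical, each operation is assumed to cost 1, and the inputs may be strings or lists of tokens.

```python
def min_distance(source, target):
    """Least number of insertions, deletions and modifications turning source into target."""
    # prev[j] holds the distance between the first i-1 symbols of source
    # and the first j symbols of target.
    prev = list(range(len(target) + 1))
    for i, s in enumerate(source, 1):
        curr = [i]  # transforming i symbols into the empty string takes i deletions
        for j, t in enumerate(target, 1):
            cost = 0 if s == t else 1
            curr.append(min(prev[j] + 1,          # delete s
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # keep or modify
        prev = curr
    return prev[-1]
```

A corrector based on this measure would choose, among the syntactically valid programs, one at minimum distance from the erroneous input; the sketch only computes the distance between two given sequences.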

Semantic Errors

Semantic errors can be detected both at compile time and at run time. The most common semantic errors detected at compile time are errors of declaration and scope. Type incompatibilities between operators and operands are another semantic error detected at compile time.

If the type of every name and expression can be determined at compile time, the language is called strongly typed.

Dynamic errors

Errors detected at runtime.

Ex: range checking for certain values, such as a subscript out of range.

Errors seen by each phase

Each phase of a compiler expects its input to conform to some specification. If the input does not, the phase reports an error.

Fig 11.1 plan of error detection and recovery

Lexical phase errors

If, during processing, the lexical analyzer discovers that no prefix of the remaining input fits the specification of any token, it invokes an error-recovery routine.

Minimum distance matching

It is used to correct the spelling of a token by finding the word in a given collection that is closest to a given string x.

Syntactic-Phase Errors

A parser recognizes the language specified by the grammar. A violation of the syntactic specification will be detected by the parser.

Time of detection

LL(1) and LR(1) parsers announce an error as soon as they have seen a prefix of the input for which there is no valid continuation.

Panic mode

The parser discards input symbols until a "synchronizing" token, usually a statement delimiter, is encountered.
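Panic-mode recovery can be sketched with a toy statement parser in Python. Everything specific here is an assumption made for illustration: the statement form id = num ; is hypothetical, the only synchronizing token is the semicolon, and parse_statements is an invented name.

```python
SYNC_TOKENS = {";"}  # assumed synchronizing tokens (statement delimiters)


def parse_statements(tokens):
    """Parse statements of the form  id = num ;  and return (parsed, error_positions)."""
    parsed, errors = [], []
    pos = 0
    while pos < len(tokens):
        window = tokens[pos:pos + 4]
        if (len(window) == 4 and window[0].isidentifier() and window[1] == "="
                and window[2].isdigit() and window[3] == ";"):
            parsed.append((window[0], int(window[2])))
            pos += 4
            continue
        errors.append(pos)
        # Panic mode: discard input until a synchronizing token is seen...
        while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
            pos += 1
        pos += 1  # ...then skip it and resume with the next statement
    return parsed, errors


result = parse_statements("a = 1 ; b = = 2 ; c = 3 ;".split())
```

In this run the malformed second statement is reported once and skipped, and the third statement is still parsed. The price of the method is that any further errors inside the discarded tokens go unreported.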

Error Recovery in Operator-Precedence Parsing

There are two points at which the parser can detect an error:

1. If no precedence relation holds between the terminal on top of the stack and the current input symbol.
2. If a handle has been found but there is no production with that handle as its right side.

Handling shift-reduce errors

The error routine may change symbols, insert symbols into the input or onto the stack, or delete symbols. The recovery must not send the parser into an infinite loop.

Error Recovery in LR Parsing

An LR parser detects an error when it consults the parsing action table and finds an error entry. One way to recover is to scan down the stack until a state S with a goto on a particular nonterminal A is found.

Semantic Errors

The main semantic errors are undeclared names and type incompatibilities. An undeclared name is detected by using a flag in its symbol table entry.
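One plausible use of such a flag is to report each undeclared name only once. The Python sketch below assumes that policy, which the notes do not state explicitly. The SymbolTable class and its declare and use methods are hypothetical.

```python
class SymbolTable:
    def __init__(self):
        self.entries = {}   # name -> {"type": ..., "undeclared": bool}
        self.errors = []    # collected error messages

    def declare(self, name, type_name):
        self.entries[name] = {"type": type_name, "undeclared": False}

    def use(self, name):
        """Return the type of `name`, or None if it was never declared."""
        entry = self.entries.get(name)
        if entry is None:
            # First use of an undeclared name: enter it with the flag set,
            # so that later uses find the entry and stay silent.
            self.entries[name] = {"type": None, "undeclared": True}
            self.errors.append(f"undeclared name: {name}")
            return None
        return entry["type"]


table = SymbolTable()
table.declare("x", "int")
table.use("x")
table.use("y")
table.use("y")   # second use of y produces no further message
```

After these calls the table holds a single error message for y, even though y was used twice.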