cs1352_apr08

Upload: sridharanc23

Post on 03-Apr-2018


  • 7/28/2019 CS1352_APR08


    MAY/JUNE-'09/CS1352-Answer Key

    CS1352 - Principles of Compiler Design University Question Key

    April / May 2008

    PART-A

    1. Differentiate compiler and interpreter.

    A compiler produces a target program, whereas an interpreter directly performs the operations implied by the source program.

    2. Write short notes on buffer pair.

    A buffer pair is a specialized buffering technique, motivated by efficiency, that reduces the overhead required to process an input character. The buffer is divided into two N-character halves, and two pointers are used (one marks the beginning of the current lexeme, the other scans forward). It is used when the lexical analyzer needs to look ahead several characters beyond the lexeme for a pattern before a match can be announced.

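    3. Construct a parse tree of (a+b)*c for the grammar E->E+E | E*E | (E) | id.

    With a, b and c each matching the token id, the parse tree is (children are indented under their parent):

    E
      E
        (
        E
          E
            id (a)
          +
          E
            id (b)
        )
      *
      E
        id (c)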

    4. Eliminate immediate left recursion for the following grammar E->E+T | T, T->T*F | F, F->(E) | id.

    The rule to eliminate immediate left recursion is: A->Aα | β is converted to A->βA' and A'->αA' | ε. So, the grammar after eliminating left recursion is

    E->TE'; E'->+TE' | ε; T->FT'; T'->*FT' | ε; F->(E) | id.
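    The rewriting rule for immediate left recursion (A->Aα | β becomes A->βA' and A'->αA' | ε) can be sketched in a few lines of Python. This is only an illustration: the list-of-symbols grammar encoding and the function name are assumptions, not part of the answer key.

```python
EPSILON = "ε"

def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    productions is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each (possibly new) nonterminal to its rules.
    """
    # alpha parts: bodies that start with the nonterminal itself
    alphas = [p[1:] for p in productions if p and p[0] == nonterminal]
    # beta parts: every other body
    betas = [p for p in productions if not p or p[0] != nonterminal]
    if not alphas:
        return {nonterminal: productions}   # nothing to do
    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[EPSILON]],
    }

grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

result = {}
for nt, prods in grammar.items():
    result.update(eliminate_immediate_left_recursion(nt, prods))

for nt, prods in result.items():
    print(nt, "->", " | ".join(" ".join(p) for p in prods))
```

    Running it on the grammar of this question yields the same five rule groups as the answer: E, E', T, T' and F.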

    5. Write short notes on global data flow analysis.

    Global data flow analysis collects information about the way data is used in a program, taking control flow into account. The flow may be forward or backward.

    Forward: compute OUT for given IN, GEN, KILL. Information propagates from the predecessors of a vertex. Examples: reachability, available expressions, constant propagation.

    Backward: compute IN for given OUT, GEN, KILL. Information propagates from the successors of a vertex. Example: live variable analysis.
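    As an illustration of the backward case, here is a minimal sketch of live variable analysis solved by iteration to a fixed point. For liveness, GEN is the set of names used before being defined in a block and KILL is the set of names defined in it. The block names, the example program and the function name are assumptions made for this sketch.

```python
def live_variables(blocks, successors):
    """Iterative backward analysis.

    blocks:     block name -> (gen, kill) sets
    successors: block name -> list of successor block names
    Equations:  OUT[B] = union of IN[S] over successors S of B
                IN[B]  = GEN[B] | (OUT[B] - KILL[B])
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:                      # repeat until nothing changes
        changed = False
        for b, (gen, kill) in blocks.items():
            out = set()
            for s in successors.get(b, []):
                out |= live_in[s]
            new_in = gen | (out - kill)
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_in, live_out

# Hypothetical program:
#   B1: i = 1; s = 0
#   B2: if i > n goto B4
#   B3: s = s + i; i = i + 1; goto B2
#   B4: return s
blocks = {
    "B1": (set(), {"i", "s"}),
    "B2": ({"i", "n"}, set()),
    "B3": ({"s", "i"}, {"s", "i"}),
    "B4": ({"s"}, set()),
}
successors = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}

live_in, live_out = live_variables(blocks, successors)
print(live_in["B1"])   # only n is live on entry to the program
```

    A forward problem such as reaching definitions has the same shape, with the roles of IN and OUT and of predecessors and successors exchanged.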


    http://engineerportal.blogspot.in/


    6. Define back patching with an example.

    Back patching is the activity of filling in unspecified label information using appropriate semantic actions during the code generation process. The functions used in the semantic actions are mklist(i), merge_list(p1,p2) and backpatch(p,i).

    Source:

        if a or b then
            if c then
                x = y+1

    Translation:

            if a go to L1
            if b go to L1
            go to L3
        L1: if c goto L2
            goto L3
        L2: x = y+1
        L3:

    After Backpatching:

        100: if a goto 103
        101: if b goto 103
        102: goto 106
        103: if c goto 105
        104: goto 106
        105: x=y+1
        106:
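    The three functions named in the answer can be sketched as follows. The Emitter class, its method names and the instruction encoding are assumptions introduced only for this illustration; the sequence of calls reproduces the back-patched listing for "if a or b then if c then x=y+1".

```python
class Emitter:
    """Collects instructions; a jump target of None is still unfilled."""

    def __init__(self, start=100):
        self.code = {}              # address -> [text, target]
        self.next_addr = start

    def emit(self, text, target=None):
        addr = self.next_addr
        self.code[addr] = [text, target]
        self.next_addr += 1
        return addr

    def backpatch(self, p, i):
        """Fill address i into every jump on list p."""
        for addr in p:
            self.code[addr][1] = i

    def listing(self):
        lines = []
        for addr in sorted(self.code):
            text, target = self.code[addr]
            suffix = "" if target is None else f" {target}"
            lines.append(f"{addr}: {text}{suffix}")
        return lines

def mklist(i):
    """New list containing only the instruction address i."""
    return [i]

def merge_list(p1, p2):
    """Concatenation of two lists of unfilled jumps."""
    return p1 + p2

e = Emitter()
a_true = mklist(e.emit("if a goto"))            # 100
b_true = mklist(e.emit("if b goto"))            # 101
b_false = mklist(e.emit("goto"))                # 102
# "a or b" is true if either test succeeds; the true exit is the next address
e.backpatch(merge_list(a_true, b_true), e.next_addr)
c_true = mklist(e.emit("if c goto"))            # 103
c_false = mklist(e.emit("goto"))                # 104
e.backpatch(c_true, e.next_addr)
e.emit("x=y+1")                                 # 105
# both false exits leave the statement
e.backpatch(merge_list(b_false, c_false), e.next_addr)

print("\n".join(e.listing()))
```

    The jumps at 100 to 104 are emitted with empty targets and filled in later, which is why a single pass is enough.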

    7. Give syntax directed translation for the following statement Call p1(int a, int b).

        param a
        param b
        call p1

    8. How will you find the leaders in basic block?

    Leaders are the first statements of basic blocks. They are found by the following rules:
    - The first statement is a leader.
    - Any statement that is the target of a conditional or unconditional goto is a leader.
    - Any statement that immediately follows a goto or conditional goto statement is a leader.

    9. Define code motion.

    Code motion decreases the amount of code in a loop. It takes an expression that yields the same result independent of the number of times the loop is executed (a loop-invariant computation) and places it before the loop.

    10. Define basic block and flow graph.

    A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end.

    A flow graph is a directed graph in which the flow control information is added to the basic blocks. The nodes in the flow graph are basic blocks; the block whose leader is the first statement is called the initial block. There is a directed edge from block B1 to block B2 if B2 immediately follows B1 in some execution sequence. We say that B1 is a predecessor of B2 and B2 is a successor of B1.
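    The leader rules and the flow graph definition can be sketched together. In this illustration each three-address statement is reduced to a pair (kind, target), where kind is "goto", "if" or "other" and target is a statement index; this encoding and the function names are assumptions.

```python
def find_leaders(instrs):
    leaders = {0}                                # rule 1: first statement
    for i, (kind, target) in enumerate(instrs):
        if kind in ("goto", "if"):
            leaders.add(target)                  # rule 2: target of a jump
            if i + 1 < len(instrs):
                leaders.add(i + 1)               # rule 3: statement after a jump
    return sorted(leaders)

def basic_blocks(instrs):
    """Each block runs from a leader up to, not including, the next leader."""
    leaders = find_leaders(instrs)
    bounds = leaders + [len(instrs)]
    return [(bounds[k], bounds[k + 1]) for k in range(len(leaders))]

def flow_graph(instrs):
    blocks = basic_blocks(instrs)
    block_of = {start: n for n, (start, _) in enumerate(blocks)}
    edges = set()
    for n, (_, end) in enumerate(blocks):
        kind, target = instrs[end - 1]           # last statement of the block
        if kind in ("goto", "if"):
            edges.add((n, block_of[target]))     # jump edge
        if kind != "goto" and end < len(instrs):
            edges.add((n, block_of[end]))        # fall-through edge
    return blocks, sorted(edges)

# Hypothetical program:
#   0: i = 1          1: s = 0        2: if i > n goto 6
#   3: s = s + i      4: i = i + 1    5: goto 2         6: result = s
program = [
    ("other", None), ("other", None), ("if", 6),
    ("other", None), ("other", None), ("goto", 2), ("other", None),
]

blocks, edges = flow_graph(program)
print(find_leaders(program))   # [0, 2, 3, 6]
print(blocks)                  # [(0, 2), (2, 3), (3, 6), (6, 7)]
print(edges)                   # [(0, 1), (1, 2), (1, 3), (2, 1)]
```

    Block 0 is the initial block; the edge (2, 1) is the back edge of the loop.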


    APR/MAY-'08/CS1352-Answer Key

    PART - B

    11.a. i. Explain the phases of compiler, with the neat schematic. (12)

    The process of compilation is very complex, so it is customary, from both the logical and the implementation point of view, to partition the compilation process into several phases. A phase is a logically cohesive operation that takes as input one representation of the source program and produces as output another representation. (2)

    The source program is a stream of characters, e.g. pos = init + rate * 60 (6)

    - Lexical analysis: groups characters into non-separable units, called tokens, and generates the token stream id1 = id2 + id3 * const. The information about the identifiers must be stored somewhere (the symbol table).
    - Syntax analysis: checks whether the token stream meets the grammatical specification of the language and generates the syntax tree.
    - Semantic analysis: checks whether the program has a meaning (e.g. if pos is a record and init and rate are integers, then the assignment does not make sense).

    Syntax tree after syntax analysis (children are indented under their parent):

        :=
          id1
          +
            id2
            *
              id3
              60

    Syntax tree after semantic analysis:

        :=
          id1
          +
            id2
            *
              id3
              inttoreal
                60

    - Intermediate code generation: intermediate code is something that is both close to the final machine code and easy to manipulate (for optimization). One example is three-address code:

          dst = op1 op op2

      The three-address code for the assignment statement:

          temp1 = inttoreal(60)
          temp2 = id3 * temp1
          temp3 = id2 + temp2
          id1 = temp3

    - Code optimization: produces better, semantically equivalent code.

          temp1 = id3 * 60.0
          id1 = id2 + temp1

    - Code generation: generates assembly.

          MOVF id3, R2
          MULF #60.0, R2
          MOVF id2, R1
          ADDF R2, R1
          MOVF R1, id1

    Symbol Table Creation / Maintenance

    - Contains information (storage, type, scope, arguments) on each meaningful token, typically identifiers.
    - The data structure is created and initialized during lexical analysis.
    - It is utilized and updated during later analysis and synthesis.
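    The lexical analysis step and the creation of the symbol table can be illustrated on the running example pos = init + rate * 60. This is a minimal sketch: the token pattern covers only the characters in that example, and the function name and table layout are assumptions.

```python
import re

# One alternative per token class; leading blanks are skipped.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<const>\d+)|(?P<op>[=+*]))"
)

def tokenize(source):
    """Return (token stream, symbol table) for a one-line assignment."""
    source = source.strip()
    symtab = {}                 # identifier -> index, filled in on first sight
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        pos = m.end()
        kind = m.lastgroup      # name of the alternative that matched
        if kind == "id":
            index = symtab.setdefault(m.group("id"), len(symtab) + 1)
            tokens.append(f"id{index}")
        elif kind == "const":
            tokens.append("const")
        else:
            tokens.append(m.group("op"))
    return tokens, symtab

tokens, symtab = tokenize("pos = init + rate * 60")
print(" ".join(tokens))   # id1 = id2 + id3 * const
print(symtab)             # {'pos': 1, 'init': 2, 'rate': 3}
```

    Later phases would extend each symbol table entry with type, scope and storage information.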


    Error Handling

    - Detection of different errors, which correspond to all phases.
    - Each phase should know how to deal with an error, so that compilation can proceed and further errors can be detected.

    [Figure: phases of a compiler. Source Program -> (1) Lexical Analyzer -> (2) Syntax Analyzer -> (3) Semantic Analyzer -> (4) Intermediate Code Generator -> (5) Code Optimizer -> (6) Code Generator -> Target Program. The Symbol-table Manager and the Error Handler interact with all six phases.]

    ii. Write short notes on compiler construction tools. (4)

    - Parser generators: produce syntax analyzers
    - Scanner generators: produce lexical analyzers
    - Syntax-directed translation engines: generate intermediate code
    - Automatic code generators: generate actual code
    - Data-flow engines: support optimization

    (OR)

    b. i. Explain grouping of phases. (8)

    Front and back ends: (3)

    Often, the phases are collected into a front end and a back end. The front end has those phases which depend primarily on the source language and are largely independent of the target machine. These include lexical and syntactic analysis, the creation of the symbol table, semantic analysis and the generation of intermediate code.

    The back end has those phases which depend primarily on the target machine and are largely independent of the source language; they depend only on the intermediate language. These include the code optimization and code generation phases, along with the necessary error handling and symbol table operations.

    Passes: (2)

    Several phases are implemented in a single pass consisting of reading an input file and writing an output file. The activity of those phases can be interleaved during the pass.


    Reducing the number of passes: (3)

    It is desirable to have few passes, since it takes time to read and write intermediate files. On the other hand, if we group several phases into one pass, we may have to keep the entire program in memory, because one phase may need information in a different order than a previous phase produces it.

    For some phases, grouping into one pass presents few problems:
    - The interface between the lexical and syntactic analyzers can be limited to a single token.

    For other phases it is harder:
    - It is often very hard to perform code generation until the intermediate representation has been completely generated.
    - We cannot generate target code for a construct if we do not know the types of the variables involved in the construct.
    - We cannot determine the target address of a forward jump until we have seen the intervening source code and generated target code for it.

    Intermediate and target code generation can be merged into a single pass using a technique called back patching, in which a blank slot is left for the missing information and the slot is filled in when the information becomes available.

    ii. Explain specification of tokens. (8)

    Regular expressions are the notation for specifying patterns. Each pattern matches a set of strings.

    Strings and languages: (2)

    An alphabet is a finite set of symbols. A string over an alphabet is a finite sequence of symbols from the alphabet. Terms for parts of a string: prefix, suffix, substring, proper prefix and proper suffix. A language is a set of strings over some fixed alphabet.

    Operations on languages: (2)
    - Concatenation
    - Union
    - Kleene closure
    - Positive closure

    Regular expressions: (2)
    - ε is a regular expression that denotes {ε}.
    - If a is a symbol in Σ, then a is a regular expression that denotes {a}.
    - Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then,
        o (r) | (s) is a regular expression denoting L(r) U L(s)
        o (r)(s) is a regular expression denoting L(r)L(s)
        o (r)* is a regular expression denoting (L(r))*
        o (r) is a regular expression denoting L(r)

    A language denoted by a regular expression is said to be a regular set.
    - The unary operator * has the highest precedence and is left associative.
    - Concatenation has the second highest precedence and is left associative.
    - | has the lowest precedence and is left associative.


    Regular definitions: (2)

    It is a sequence of definitions of the form d1->r1, d2->r2, ..., dn->rn, where each di is a distinct name and each ri is a regular expression over the symbols in Σ U {d1, d2, ..., di-1}.
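    A regular definition can be illustrated with Python's re module, where each name is defined only in terms of earlier names. The definitions chosen here (letter, digit, ident, number) are a common textbook example and are assumptions, not part of the answer key.

```python
import re

# d1 -> r1, d2 -> r2, ...: each definition may use only the names above it.
letter = r"[A-Za-z]"
digit = r"[0-9]"
ident = rf"{letter}(?:{letter}|{digit})*"      # letter (letter | digit)*
digits = rf"{digit}+"                          # positive closure of digit
number = rf"{digits}(?:\.{digits})?"           # digits with optional fraction

def matches(pattern, text):
    """True if the whole of text belongs to the language of pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(ident, "rate1"))    # True
print(matches(ident, "1rate"))    # False: must start with a letter
print(matches(number, "60.0"))    # True

# Precedence: * binds tighter than concatenation, which binds tighter
# than |, so a|bc* is read as a | (b(c*)).
print(matches(r"a|bc*", "bccc"))  # True
print(matches(r"a|bc*", "ac"))    # False
```

    Tools such as scanner generators accept definitions written in essentially this layered style.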

    12.a. Find the SLR parsing table for the given grammar and parse the sentence (a+b)*c. E->E+E | E*E | (E) | id.

    Given grammar:
    1. E->E+E
    2. E->E*E
    3. E->(E)
    4. E->id

    Augmented grammar:
    E'->E
    E->E+E
    E->E*E
    E->(E)
    E->id

    I0:
    E'->.E
    E->.E+E
    E->.E*E
    E->.(E)
    E->.id

    I1: goto(I0, E)
    E'->E.
    E->E.+E
    E->E.*E

    I2: goto(I0, ()
    E->(.E)
    E->.E+E
    E->.E*E
    E->.(E)
    E->.id

    I3: goto(I0, id)
    E->id.

    I4: goto(I1, +)
    E->E+.E
    E->.E+E
    E->.E*E
    E->.(E)
    E->.id

    I5: goto(I1, *)
    E->E*.E
    E->.E+E
    E->.E*E
    E->.(E)
    E->.id

    I6: goto(I2, E)
    E->(E.)
    E->E.+E
    E->E.*E

    I7: goto(I4, E)
    E->E+E.
    E->E.+E
    E->E.*E

    I8: goto(I5, E)
    E->E*E.
    E->E.+E
    E->E.*E

    I9: goto(I6, ))
    E->(E).

    Remaining transitions:
    goto(I2, ()=I2    goto(I2, id)=I3
    goto(I4, ()=I2    goto(I4, id)=I3
    goto(I5, ()=I2    goto(I5, id)=I3
    goto(I6, +)=I4    goto(I6, *)=I5
    goto(I7, +)=I4    goto(I7, *)=I5
    goto(I8, +)=I4    goto(I8, *)=I5

    First(E) = {(, id}
    Follow(E) = {+, *, ), $}


    SLR parsing table:

                                Action                          Goto
    State   +        *        (     )     id    $       E
    0                         S2          S3            1
    1       S4       S5                         Acc
    2                         S2          S3            6
    3       r4       r4             r4          r4
    4                         S2          S3            7
    5                         S2          S3            8
    6       S4       S5             S9
    7       S4, r1   S5, r1         r1          r1
    8       S4, r2   S5, r2         r2          r2
    9       r3       r3             r3          r3

    Parsing the sentence (a+b)*c:

    Stack          Input       Action
    0              (a+b)*c$    shift 2
    0(2            a+b)*c$     shift 3
    0(2a3          +b)*c$      reduce by E->id
    0(2E6          +b)*c$      shift 4
    0(2E6+4        b)*c$       shift 3
    0(2E6+4b3      )*c$        reduce by E->id
    0(2E6+4E7      )*c$        reduce by E->E+E
    0(2E6          )*c$        shift 9
    0(2E6)9        *c$         reduce by E->(E)
    0E1            *c$         shift 5
    0E1*5          c$          shift 3
    0E1*5c3        $           reduce by E->id
    0E1*5E8        $           reduce by E->E*E
    0E1            $           accept
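    The table-driven parse can be reproduced with a small shift-reduce driver. The table in the answer has two actions in states 7 and 8 on + and *, because the grammar is ambiguous; the sketch below assumes the usual resolution (* binds tighter than +, both left associative), which is an assumption of this illustration and not stated in the answer key. The function and variable names are also assumptions.

```python
# production number -> (text, length of right-hand side)
PRODUCTIONS = {
    1: ("E->E+E", 3), 2: ("E->E*E", 3), 3: ("E->(E)", 3), 4: ("E->id", 1),
}

ACTION = {
    0: {"(": "s2", "id": "s3"},
    1: {"+": "s4", "*": "s5", "$": "acc"},
    2: {"(": "s2", "id": "s3"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"(": "s2", "id": "s3"},
    5: {"(": "s2", "id": "s3"},
    6: {"+": "s4", "*": "s5", ")": "s9"},
    # conflicts resolved by assumption: + is left associative, * wins over +
    7: {"+": "r1", "*": "s5", ")": "r1", "$": "r1"},
    8: {"+": "r2", "*": "r2", ")": "r2", "$": "r2"},
    9: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
}
GOTO_E = {0: 1, 2: 6, 4: 7, 5: 8}       # goto on the only nonterminal, E

def parse(tokens):
    """Return the list of parser actions, or raise SyntaxError."""
    stack = [0]                          # stack of states only
    tokens = list(tokens) + ["$"]
    pos = 0
    actions = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            actions.append("accept")
            return actions
        number = int(act[1:])
        if act[0] == "s":                # shift: push state, consume token
            stack.append(number)
            pos += 1
            actions.append(f"shift {number}")
        else:                            # reduce: pop body, push goto state
            text, length = PRODUCTIONS[number]
            del stack[-length:]
            stack.append(GOTO_E[stack[-1]])
            actions.append(f"reduce by {text}")

# (a+b)*c with a, b and c each scanned as the token id
trace = parse(["(", "id", "+", "id", ")", "*", "id"])
print("\n".join(trace))
```

    For this sentence the driver never reaches a conflicting entry, so the sequence of actions is the same as in the trace of the answer.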

    (OR)

    b. Find the predictive parser for the given grammar and parse the sentence (a+b)*c. E->E+E | E*E | (E) | id.

    - Elimination of left recursion (2)
    - Calculation of First (3)
    - Calculation of Follow (3)
    - Predictive parsing table (6)
    - Parsing the sentence (2)


    L2:   code for S2
          goto next
          ...
    Ln-1: code for Sn-1
          goto next
    Ln:   code for Sn
          goto next
    test: if t = V1 goto L1
          if t = V2 goto L2
          ...
          if t = Vn-1 goto Ln-1
          goto Ln
    next:

    Translation 2:

          code to evaluate E into t
          if t != V1 goto L1
          code for S1
          goto next
    L1:   if t != V2 goto L2
          code for S2
          goto next
    L2:   ...
    Ln-2: if t != Vn-1 goto Ln-1
          code for Sn-1
          goto next
    Ln-1: code for Sn
    next:

    Intermediate code generated:

    int a,b;
    float c;

    SYMTAB has the following information for the above declarations (let offset = 0):

    Name  Type     Offset  Width
    a     integer  0       4
    b     integer  4       4
    c     float    8       8

    3AC:

          a := 10
          if a != 10 goto L1
          c := 1
          goto next
    L1:   if a != 20 goto next
          c := 2
    next:


    (OR)

    b. i. Generate intermediate code for the following code segment along with the required

    syntax directed translation scheme: (8)

    i=1; s=0;

    while(i


    After Backpatching:

        100: if a goto 103
        101: if b goto 103
        102: goto 106
        103: if c goto 105
        104: goto 106
        105: x=y+1
        106:

    14.a. i. Explain the various issues in the design of code generation. (6)

    - Input to the code generator: the intermediate representation of the source program, such as linear representations (postfix notation), three address representations (quadruples), virtual machine representations (stack machine code) and graphical representations (syntax trees and dags).
    - Target programs: the output, such as absolute machine language, relocatable machine language or assembly language.
    - Memory management: the mapping of names in the source program to addresses of data objects in run time memory is done by the front end and the code generator.
    - Instruction selection: the nature of the instruction set of the target machine determines the difficulty of instruction selection.
    - Register allocation: instructions involving registers are shorter and faster. The use of registers is divided into two sub problems:
        o During register allocation, we select the set of variables that will reside in registers at a point in the program.
        o During a subsequent register assignment phase, we pick the specific register that a variable will reside in.
    - Choice of evaluation order: the order in which computations are performed affects the efficiency of the target code.
    - Approaches to code generation

    ii. Explain code generation phase with simple code generation algorithm. (10)

    It generates target code for a sequence of three address statements. (2)

    Assumptions:
    - For each operator in a three address statement, there is a corresponding target language operator.
    - Computed results can be left in registers as long as possible.

    E.g. a=b+c: (2)
    - Add Rj, Ri where Ri has b and Rj has c, with the result in Ri. Cost=1
    - Add c, Ri where Ri has b, with the result in Ri. Cost=2
    - Mov c, Rj; Add Rj, Ri. Cost=3

    Register descriptor: keeps track of what is currently in each register.
    Address descriptor: keeps track of the location where the current value of the name can be found at run time. (2)


    Code generation algorithm: For x = y op z (2)

    - Invoke the function getreg to determine the location L where the result of y op z should be stored (a register or a memory location).
    - Check the address descriptor for y to determine y', the current location of y. If the value of y is not already in L, generate the instruction MOV y', L.
    - Generate the instruction op z', L where z' is the current location of z. Update the address descriptor of x to indicate that x is in L.
    - If the current values of y and/or z have no next uses, alter the register descriptor to show that those registers no longer hold y and/or z.

    Getreg: (2)
    - If y is in a register that holds the values of no other names and y is not live, return the register of y for L.
    - If that fails, return an empty register.
    - If that fails and x has a next use, find an occupied register and empty it.
    - If x is not used in the block, or no suitable register is found, select the memory location of x as L.

    (OR)

    b. i. Generate DAG representation of the following code and list out the applications of DAG representation: (12)

    i=1; s=0;

    while(i


    ii. Write short notes on next-use information with suitable example. (4)

    If the name in a register is no longer needed, then the register can be assigned to some other name. This idea of keeping a name in storage only if it will be used subsequently can be applied in a number of contexts.

    Computing next uses: (2)

    The use of a name in a three-address statement is defined as follows: suppose a three-address statement i assigns a value to x. If statement j has x as an operand and control can flow from statement i to j along a path that has no intervening assignments to x, then we say statement j uses the value of x computed at i.

    Example:

        x := i
        j := x op y    // j uses the value of x

    Algorithm to determine next use: (2)

    The algorithm to determine next uses makes a backward pass over each basic block, recording for each name x whether x has a next use in the block and, if not, whether it is live on exit from the block (using data flow analysis). Suppose we reach three-address statement i: x := y op z in our backward scan. Then do the following:

    - Attach to statement i the information currently found in the symbol table regarding the next use and the liveness of x, y, and z.
    - In the symbol table, set x to not live and no next use.
    - In the symbol table, set y and z to live and the next uses of y and z to i.
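    The backward scan can be sketched directly from the three steps. In this illustration a statement x := y op z is stored as the triple (x, y, z) with the operator omitted, statements are numbered from 0, and the symbol table is a plain dictionary; these choices and the example block are assumptions.

```python
def next_use_info(block, live_on_exit):
    """Backward pass over one basic block.

    block:        list of (x, y, z) triples for x := y op z
    live_on_exit: names known to be live after the block
    Returns, per statement, a dict name -> (live, next_use).
    """
    # Initial symbol table: liveness on exit, no next use inside the block.
    table = {}
    for x, y, z in block:
        for name in (x, y, z):
            table[name] = (name in live_on_exit, None)

    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, y, z = block[i]
        # Step 1: attach the current table entries to statement i.
        info[i] = {name: table[name] for name in (x, y, z)}
        # Step 2: x is overwritten here, so it is dead just before i.
        table[x] = (False, None)
        # Step 3: y and z are used here (done after step 2 so that
        # x := x op y still marks x as used).
        table[y] = (True, i)
        table[z] = (True, i)
    return info

# Hypothetical block:
#   0: t1 := a + b
#   1: t2 := t1 * c
#   2: d  := t2 + t1
block = [("t1", "a", "b"), ("t2", "t1", "c"), ("d", "t2", "t1")]
info = next_use_info(block, live_on_exit={"d"})

print(info[0]["t1"])   # (True, 1): t1 is next used by statement 1
print(info[2]["t2"])   # (False, None): t2 is dead after statement 2
```

    A code generator can consult this information to decide when the register holding t2 may be reused.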

    15.a. i. Explain - principle sources of optimization. (8)

    Code optimization is needed to make the code run faster or take less space or both.

    Function preserving transformations:
    - Common sub expression elimination
    - Copy propagation
    - Dead-code elimination
    - Constant folding

    Common sub expression elimination: (2)

    E is called a common sub expression if E was previously computed and the values of the variables in E have not changed since the previous computation.

    Copy propagation: (2)

    Assignments of the form f:=g are called copy statements, or copies for short. The idea here is to use g for f wherever possible after the copy statement.

    Dead code elimination: (2)

    A variable is live at a point in the program if its value can be used subsequently; otherwise it is dead. Deducing at compile time that the value of an expression is a constant and using the constant instead is called constant folding.

    Loop optimization: (2)
    - Code motion: moving code outside the loop. It takes an expression that yields the same result independent of the number of times a loop is executed (a loop-invariant computation) and places the expression before the loop.
    - Induction variable elimination
    - Reduction in strength: replacing an expensive operation by a cheaper one.
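    Two of the function preserving transformations, constant folding and common sub expression elimination, can be sketched for a single basic block. Statements are quadruples (dst, left, op, right), a copy or constant assignment is written (dst, value, None, None), and only +, - and * are handled; the encoding, names and example are assumptions made for this illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def optimize_block(block):
    """Local constant folding and common sub expression elimination."""
    available = {}          # (left, op, right) -> name currently holding it
    out = []
    for dst, left, op, right in block:
        key = (left, op, right)
        if isinstance(left, int) and isinstance(right, int):
            # constant folding: compute the value at compile time
            out.append((dst, OPS[op](left, right), None, None))
            key = None
        elif key in available:
            # common sub expression: reuse the earlier result as a copy
            out.append((dst, available[key], None, None))
            key = None
        else:
            out.append((dst, left, op, right))
        # dst changes here: forget expressions that read dst or live in dst
        available = {
            k: v for k, v in available.items()
            if v != dst and dst not in (k[0], k[2])
        }
        if key is not None and dst not in (left, right):
            available[key] = dst
    return out

block = [
    ("t1", "a", "+", "b"),
    ("t2", 4, "*", 15),        # folded to 60
    ("t3", "a", "+", "b"),     # same as t1, a and b unchanged
    ("a", "t3", "+", "t2"),    # a changes, so a + b is no longer available
    ("t4", "a", "+", "b"),     # must be recomputed
]

for stmt in optimize_block(block):
    print(stmt)
```

    Copy propagation and dead-code elimination would then clean up the copies this pass introduces.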


    ii. Write short notes on: (8)

    (1) Storage organization

    Subdivision of run time memory:

    Run time storage is the block of memory obtained by the compiler from the OS to execute the compiled program. It is subdivided into:
    - Generated target code
    - Data objects
    - Stack to keep track of the activations
    - Heap to store all other information

    [Figure: layout of run time memory, in order: Code, Static data, Stack, Heap.]

    Activation record (frame):

    It is used to store the information required by a single procedure call. Its fields are:
    - Returned value
    - Actual parameters
    - Optional control link
    - Optional access link
    - Saved machine status
    - Local data
    - Temporaries

    Temporaries are used to hold values that arise in the evaluation of expressions. Local data is the data that is local to the execution of the procedure. Saved machine status represents the status of the machine just before the procedure is called. The control link (dynamic link) points to the activation record of the calling procedure. The access link refers to non-local data in other activation records. Actual parameters are the ones passed to the called procedure. The returned value field is used by the called procedure to return a value to the calling procedure.

    Compile time layout of local data:

    The amount of storage needed for a name is determined by its type. The field for the local data is laid out as the declarations in a procedure are examined at compile time. The storage layout for data objects is strongly influenced by the addressing constraints on the target machine.

    (2) Parameter passing.

    Call by value
    - A formal parameter is treated just like a local name. Its storage is in the activation record of the called procedure.
    - The caller evaluates the actual parameters and places their r-values in the storage for the formals.

    Call by reference
    - If an actual parameter is a name or an expression having an l-value, then that l-value itself is passed.
    - However, if it is an expression that has no l-value (e.g. a+b or 2), then the expression is evaluated in a new location and the address of that location is passed.


    Copy-restore: a hybrid between call by value and call by reference (copy in, copy out)
    - The actual parameters are evaluated; their r-values are passed and the l-values of the actuals are determined.
    - When the called procedure is done, the r-values of the formals are copied back to the l-values of the actuals.

    Call by name
    - Inline expansion (procedures are treated like a macro).
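    The difference between call by value, call by reference and copy-restore shows up when the procedure also touches the actual parameter through another name. The sketch below simulates the three mechanisms for a hypothetical procedure with body "x := x + 1; a := a * 10", called with the global a as the actual for x. The dictionaries standing in for the global store and the activation record, and all function names, are assumptions for this illustration.

```python
def body(get_x, set_x, env):
    """Procedure body: x := x + 1; a := a * 10 (a is a global)."""
    set_x(get_x() + 1)
    env["a"] = env["a"] * 10

def call_by_value(env):
    frame = {"x": env["a"]}                  # r-value copied into the formal
    body(lambda: frame["x"], lambda v: frame.update(x=v), env)

def call_by_reference(env):
    # the formal x is an alias for the actual a (its l-value is passed)
    body(lambda: env["a"], lambda v: env.update(a=v), env)

def call_by_copy_restore(env):
    frame = {"x": env["a"]}                  # copy in
    body(lambda: frame["x"], lambda v: frame.update(x=v), env)
    env["a"] = frame["x"]                    # copy out on return

def run(strategy):
    env = {"a": 1}                           # global store before the call
    strategy(env)
    return env["a"]

print(run(call_by_value))          # 10: the increment of x is lost
print(run(call_by_reference))      # 20: x and a are the same location
print(run(call_by_copy_restore))   # 2: the copied-out x overwrites a
```

    The three results differ only because the procedure modifies a both through the formal and directly; without such aliasing, call by reference and copy-restore give the same answer.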

    (OR)

    b. i. Optimize the following code using various optimization technique: (12)

    i=1; s=0;for (i=1; i