SYMBOL TABLE DESIGN
7/22/2015 1
THE STRUCTURE OF A COMPILER
• Up to this point we have treated a compiler as a single box that maps a source program into a semantically equivalent target program.
[Diagram: Source Program → COMPILER → Target Program]
WHAT'S INSIDE THIS BOX?
• If we open up this box a little, we see that there are two parts to this mapping: analysis and synthesis.
[Diagram: phases of a compiler. Analysis: LEXICAL ANALYZER → (stream of tokens) → SYNTAX ANALYZER → (parse tree) → SEMANTIC ANALYZER → INTERMEDIATE CODE GENERATOR. Synthesis: CODE OPTIMIZER → CODE GENERATOR. The SYMBOL TABLE is shown alongside both parts.]
ANALYSIS
• Breaks up the source program into constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program.
• If the analysis part detects that the source program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so the user can take corrective action.
• The analysis part also collects information about the source program and stores it in a data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part.
SYNTHESIS
• The synthesis part constructs the desired target program from the intermediate representation and the information in the symbol table.
• ANALYSIS: front end of the compiler
• SYNTHESIS: back end of the compiler
COMPILER'S ROLE??
• An essential function of a compiler: record the variable names used in the source program and collect information about various attributes of each name.
• These attributes may provide information about the storage allocated for a name, its type and its scope and, for procedure names, the number and types of the arguments, the method of passing each argument and the type returned.
SO , WHAT EXACTLY IS SYMBOL TABLE??
Symbol tables are data structures that are used by compilers to hold information about source-program constructs.
A symbol table is a necessary component because:
• The declaration of an identifier appears once in a program.
• Uses of the identifier may appear in many places in the program text.
INFORMATION PROVIDED BY SYMBOL TABLE
• Given an identifier, which name is it?
• What information is to be associated with a name?
• How do we access this information?
SYMBOL TABLE - NAMES
A NAME can denote:
• Variable and labels
• Constant
• Record
• Record field
• Parameter
• Procedure
• Array and files
SYMBOL TABLE - ATTRIBUTES
• Each piece of information associated with a name is called an attribute.
• Attributes are language dependent.
• Different classes of symbols have different attributes:
  • Variables, constants: type, line number where declared, lines where referenced, scope.
  • Procedure or function: number of parameters, the parameters themselves, result type.
  • Array: number of dimensions, array bounds.
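One way to picture "different classes of symbols have different attributes" is a small record type per class. The sketch below is only an illustration: the class names, field names and the example bounds are assumptions, not something the slides prescribe.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Symbol:
    """Attributes shared by every class of symbol (hypothetical layout)."""
    name: str
    declared_line: int
    scope: int
    referenced_lines: List[int] = field(default_factory=list)


@dataclass
class VariableSymbol(Symbol):
    type_name: str = "unknown"


@dataclass
class ProcedureSymbol(Symbol):
    parameters: List[str] = field(default_factory=list)
    result_type: Optional[str] = None


@dataclass
class ArraySymbol(Symbol):
    element_type: str = "unknown"
    # One (lower, upper) pair per dimension.
    bounds: List[Tuple[int, int]] = field(default_factory=list)

    @property
    def dimensions(self) -> int:
        return len(self.bounds)


# Illustrative entries; the bounds are made up for the example.
proc = ProcedureSymbol("P", declared_line=3, scope=0, parameters=["x"])
arr = ArraySymbol("a", declared_line=2, scope=0,
                  element_type="int", bounds=[(1, 10), (0, 4)])
```

The number of parameters and the number of dimensions are derived from the stored lists here rather than stored separately, which is one possible design choice among several.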
WHO CREATES SYMBOL TABLE??
• Identifiers and attributes are entered by the analysis phases when processing a definition (declaration) of an identifier.
• In simple languages with only global variables and implicit declarations, the scanner can enter an identifier into the symbol table if it is not already there.
• In block-structured languages with scopes and explicit declarations, the parser and/or semantic analyzer enter identifiers and their corresponding attributes.
USE OF SYMBOL TABLE
• Symbol table information is used by the analysis and synthesis phases:
  • To verify that used identifiers have been defined (declared)
  • To verify that expressions and assignments are semantically correct (type checking)
  • To generate intermediate or target code
IMPLEMENTATION OF SYMBOL TABLE
• Each entry in the symbol table can be implemented as a record consisting of several fields.
• These fields depend on the information to be saved about the name.
• But since the information about a name depends on how the name is used, the records in the symbol table will not be uniform.
• Hence, to keep the symbol table records uniform, some information is kept outside the symbol table and a pointer to this information is stored in the symbol table record.
[Diagram: SYMBOL TABLE record for a (type int) with a pointer to separately stored bounds LB1 and UB1]
A pointer steers the symbol table to remotely stored information for array a.
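The idea of keeping records uniform by pointing to information held outside the table can be sketched as follows. This is a minimal illustration: the names `Record`, `ArrayInfo` and `extra`, the second entry `i`, and the bound values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArrayInfo:
    """Information stored outside the symbol table (hypothetical)."""
    lower_bound: int
    upper_bound: int


@dataclass
class Record:
    """Every record has the same three fields, whatever the name denotes."""
    name: str
    type_name: str
    extra: Optional[object] = None  # reference to remotely stored information


array_info = ArrayInfo(lower_bound=1, upper_bound=10)  # made-up bounds
symbol_table = [
    Record("a", "int", extra=array_info),  # array: extra info lives elsewhere
    Record("i", "int"),                    # plain variable: nothing extra
]
```

In Python an object reference plays the role of the pointer in the diagram.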
WHERE SHOULD NAMES BE HELD??
• If there is a modest upper bound on the length of a name, then the name can be stored in the symbol table record itself.
• But if there is no such limit, or the limit is exceeded, then an indirect scheme for storing names is used.
• A separate array of characters called a 'string table' is used to store the name, and a pointer to the name is kept in the symbol table record.
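[Diagram: SYMBOL TABLE records pointing into a STRING TABLE that holds the names A and B; the array record also points to its bounds LB1 and UB1]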
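A string table can be sketched as one flat array of characters plus a (start, length) reference kept in each symbol table record. The class name `StringTable`, the `name_ref` field and the long identifier are assumptions made for this illustration.

```python
class StringTable:
    """All names live in one character array; records keep (start, length)."""

    def __init__(self):
        self.chars = []

    def add(self, name):
        start = len(self.chars)
        self.chars.extend(name)       # append the characters of the name
        return (start, len(name))     # the "pointer" stored in the record

    def get(self, ref):
        start, length = ref
        return "".join(self.chars[start:start + length])


strings = StringTable()
records = [
    {"name_ref": strings.add("A"), "type": "int"},
    {"name_ref": strings.add("a_rather_long_identifier"), "type": "int"},
]
```

Each record stays the same size no matter how long the name is, since only the reference is stored in it.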
SYMBOL TABLE AND SCOPE
• Symbol tables typically need to support multiple declarations of the same identifier within a program.
• We shall implement scopes by setting up a separate symbol table for each scope.
• The scope of a declaration is the portion of a program to which the declaration applies.
The rules governing the scope of names in a block-structured language are as follows:
1. A name declared within a block B is valid only within B.
2. If block B1 is nested within B2, then any name that is valid for B2 is also valid for B1, unless the identifier for that name is re-declared in B1.
• These scope rules require a more complicated symbol table organization than a simple list of associations between names and attributes.
• Each table is a list of names and their associated attributes, and the tables are organized into a stack.
SYMBOL TABLE ORGANIZATION
Program:
    Var x,y : integer
    Procedure P:
        Var x,a : boolean;
        Procedure q:
            Var x,y,z : real;
            begin … end
        begin … end

Stack of symbol tables while block q is being processed:
    TOP → Symbol table for block q:  x Real, y Real, z Real
          Symbol table for P:        x Boolean, a Boolean, q Proc
          Symbol table for main:     x Integer, y Integer, P Proc
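The stack of per-scope tables can be sketched as a list of dictionaries, searched from the top down. This is a minimal sketch under assumptions: the class and method names are invented here, and each table stores only a type string per name.

```python
class ScopedSymbolTable:
    """One dictionary per open block, kept on a stack (hypothetical API)."""

    def __init__(self):
        self.stack = [{}]            # the bottom table belongs to main

    def enter_scope(self):
        self.stack.append({})        # push a new table when a block opens

    def exit_scope(self):
        self.stack.pop()             # names of the block become invalid

    def declare(self, name, type_name):
        self.stack[-1][name] = type_name

    def lookup(self, name):
        # Innermost declaration wins: search from the top of the stack down.
        for table in reversed(self.stack):
            if name in table:
                return table[name]
        return None


st = ScopedSymbolTable()
for name, type_name in (("x", "integer"), ("y", "integer"), ("P", "proc")):
    st.declare(name, type_name)
st.enter_scope()                     # entering P
for name, type_name in (("x", "boolean"), ("a", "boolean"), ("q", "proc")):
    st.declare(name, type_name)
st.enter_scope()                     # entering q
for name in ("x", "y", "z"):
    st.declare(name, "real")
inside_q = (st.lookup("x"), st.lookup("a"), st.lookup("P"))
st.exit_scope()                      # leaving q
inside_p = (st.lookup("x"), st.lookup("y"), st.lookup("z"))
```

Inside q, `x` resolves to the real declared in q while `a` and `P` are found in enclosing tables. After q is popped, `z` is no longer visible, which matches scope rule 1.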
NESTING DEPTH
• Another technique can be used to represent scope information in the symbol table.
• We store the nesting depth of each procedure block in the symbol table and use the [procedure name, nesting depth] pair as the key for accessing information in the table.
• The nesting depth of a procedure is a number obtained by starting with a value of one for the main program and adding one every time we go from an enclosing to an enclosed procedure.
• This number is basically a count of how many procedures there are in the referencing environment of the procedure.
Program:
    Var x,y : integer
    Procedure P:
        Var x,a : boolean;
        Procedure q:
            Var x,y,z : real;
            begin … end
        begin … end

Name   Nesting depth   Type
x      3               Real
y      3               Real
z      3               Real
q      2               Proc
a      2               Boolean
x      2               Boolean
P      1               Proc
y      1               Integer
x      1               Integer
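A single table keyed by name and nesting depth can be sketched with a dictionary whose keys are (name, depth) pairs. The function names below are assumptions. The lookup strategy, which tries the current depth first and then each enclosing depth, is one plausible reading of the technique rather than something the slides spell out.

```python
table = {}


def declare(name, depth, type_name):
    table[(name, depth)] = type_name


def lookup(name, depth):
    """Search from the current depth outward to depth 1 (main)."""
    for d in range(depth, 0, -1):
        if (name, d) in table:
            return d, table[(name, d)]
    return None


# Entries for the example program: main is depth 1, P is 2, q is 3.
for name, depth, type_name in [
    ("x", 1, "integer"), ("y", 1, "integer"), ("P", 1, "proc"),
    ("x", 2, "boolean"), ("a", 2, "boolean"), ("q", 2, "proc"),
    ("x", 3, "real"), ("y", 3, "real"), ("z", 3, "real"),
]:
    declare(name, depth, type_name)
```

This sketch keeps every entry forever. With sibling procedures at the same depth, the entries of a block would have to be removed when the block is closed, otherwise their keys could clash.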
SYMBOL TABLE DATA STRUCTURES
Issues to consider: operations required
• Insert: add a symbol to the symbol table
• Look up: find a symbol in the symbol table (and get its attributes)
Insertion is done only once, but look up is done many times, so fast look up is needed.
The data structure should be designed to allow the compiler to find the record for each name quickly and to store or retrieve data from that record quickly.
LINKED LIST
• A linear list of records is the easiest way to implement a symbol table.
• New names are added to the symbol table in the order they arrive.
• Whenever a new name is to be added, the list is first searched linearly (sequentially) to check whether the name is already present; if it is not, the name is added.
• Time complexity: O(n)
• Advantages: less space, additions are simple
• Disadvantage: higher access time
UNSORTED LIST
01 PROGRAM Main
02 GLOBAL a,b
03 PROCEDURE P (PARAMETER x)
04 LOCAL a
05 BEGIN {P}
06 …a…
07 …b…
08 …x…
09 END {P}
10 BEGIN {Main}
11 Call P(a)
12 END {Main}

Look up complexity: O(n)

Name   Characteristic class   Scope   Declared   Referenced   Other
Main   Program                0       Line 1
a      Variable               0       Line 2     Line 11
b      Variable               0       Line 2     Line 7
P      Procedure              0       Line 3     Line 11      1, parameter, x
x      Parameter              1       Line 3     Line 8
a      Variable               1       Line 4     Line 6
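The unsorted list can be sketched as a plain Python list that is scanned from the front on every look up. The class name and the dictionary layout of a record are assumptions for this illustration. Each entry is identified by its name together with its scope, so the two declarations of `a` can coexist.

```python
class LinearSymbolTable:
    """Records are kept in arrival order; look up is a sequential scan."""

    def __init__(self):
        self.records = []

    def lookup(self, name, scope):
        for record in self.records:          # O(n) scan
            if record["name"] == name and record["scope"] == scope:
                return record
        return None

    def insert(self, name, kind, scope, declared):
        # Search first; add only if the name is not already present.
        if self.lookup(name, scope) is not None:
            return False
        self.records.append(
            {"name": name, "class": kind, "scope": scope, "declared": declared}
        )
        return True


table = LinearSymbolTable()
table.insert("Main", "Program", 0, 1)
table.insert("a", "Variable", 0, 2)
table.insert("b", "Variable", 0, 2)
table.insert("P", "Procedure", 0, 3)
table.insert("x", "Parameter", 1, 3)
table.insert("a", "Variable", 1, 4)
duplicate_accepted = table.insert("a", "Variable", 0, 2)
```

The second insertion of `a` in scope 0 is rejected because the scan finds the existing record.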
SORTED LIST
(Same program Main as in the unsorted list.)

Look up complexity:
• O(log₂ n) if stored as an array (complex insertion)
• O(n) if stored as a linked list (easy insertion)

Name   Characteristic class   Scope   Declared   Referenced   Other
a      Variable               0       Line 2     Line 11
a      Variable               1       Line 4     Line 6
b      Variable               0       Line 2     Line 7
Main   Program                0       Line 1
P      Procedure              0       Line 3     Line 11      1, parameter, x
x      Parameter              1       Line 3     Line 8
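A sorted array with binary search can be sketched using Python's standard `bisect` module. The sort key used here, the lower-cased name followed by the scope, is an assumption chosen so that the order matches the table (a, a, b, Main, P, x). The function names are also invented for the example.

```python
import bisect

keys = []      # sorted list of (lower-cased name, scope)
records = []   # records[i] belongs to keys[i]


def insert(name, scope, kind):
    key = (name.lower(), scope)
    i = bisect.bisect_left(keys, key)      # O(log n) search for the position
    if i < len(keys) and keys[i] == key:
        return False                       # already present
    keys.insert(i, key)                    # O(n): later entries are shifted
    records.insert(i, (name, kind, scope))
    return True


def lookup(name, scope):
    key = (name.lower(), scope)
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i]
    return None


for name, scope, kind in [
    ("Main", 0, "Program"), ("a", 0, "Variable"), ("b", 0, "Variable"),
    ("P", 0, "Procedure"), ("x", 1, "Parameter"), ("a", 1, "Variable"),
]:
    insert(name, scope, kind)
```

Finding the position takes O(log₂ n) steps, but inserting into the array shifts the entries behind it, which is the "complex insertion" the slide refers to. Lower-casing the key also means that names differing only in case would collide, which is acceptable only for this illustration.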
SEARCH TREES
• An efficient approach to symbol table organisation.
• We add two links, left and right, to each record in the search tree.
• Whenever a name is to be added, the tree is first searched for the name.
• If it does not exist, a record for the new name is created and added at the proper position.
• The tree gives alphabetical access to the names.
BINARY TREE
[Diagram: binary tree whose nodes are the following symbol table records]
Main   Program     0   Line1
P      Procedure   1   Line3   Line11
x      Parameter   1   Line3   Line8
a      Variable    0   Line2   Line11
b      Variable    0   Line2   Line7
a      Variable    1   Line4   Line6

• Look up complexity if the tree is balanced: O(log₂ n)
• Look up complexity if the tree is unbalanced: O(n)
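A search tree with left and right links in each record can be sketched as below. The sketch makes several assumptions: it ignores scopes (one record per name), it compares names with Python's default string ordering (so upper-case letters sort before lower-case ones), and it does no rebalancing. All names are invented for the example.

```python
class Node:
    def __init__(self, name, info):
        self.name = name
        self.info = info
        self.left = None      # names that compare smaller
        self.right = None     # names that compare larger


def insert(root, name, info):
    """Search for name; add a new node if it is absent. Returns the root."""
    if root is None:
        return Node(name, info)
    node = root
    while True:
        if name == node.name:
            return root                       # already present
        side = "left" if name < node.name else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(name, info))
            return root
        node = child


def lookup(root, name):
    node = root
    while node is not None and node.name != name:
        node = node.left if name < node.name else node.right
    return node


def in_order(node):
    """Names in sorted order (the 'alphabetical accessibility')."""
    if node is None:
        return []
    return in_order(node.left) + [node.name] + in_order(node.right)


root = None
for name, kind in [("Main", "Program"), ("a", "Variable"), ("b", "Variable"),
                   ("P", "Procedure"), ("x", "Parameter")]:
    root = insert(root, name, kind)
```

With this insertion order the tree leans to the right, which shows how an unbalanced tree drifts toward the O(n) case.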
HASH TABLE
• A hash table is a table of k pointers, numbered from zero to k-1, that point to records within the symbol table.
• To enter a name into the symbol table, we find the hash value of the name by applying a suitable hash function.
• The hash function maps the name to an integer between zero and k-1, and this value is used as an index into the hash table.
HASH TABLE - EXAMPLE
H(Id) = (# of first letter + # of last letter) mod 11
Character codes: M = 77, n = 110, a = 97, b = 98, P = 80, x = 120
(Same program Main as in the list examples.)

Slot 0:  Main Program 0 Line1          (77 + 110) mod 11 = 0
Slot 6:  P Procedure 1 Line3           (80 + 80) mod 11 = 6
Slot 7:  a Variable 1 Line4; a Variable 0 Line2     (97 + 97) mod 11 = 7
Slot 9:  b Variable 0 Line2; x Parameter 1 Line3    (98 + 98) mod 11 = 9, (120 + 120) mod 11 = 9
Slots 1-5, 8 and 10 are empty.
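The hash function from the example, H(Id) = (code of first letter + code of last letter) mod 11, can be sketched with one chain of records per slot. Placing the newest entry at the front of its chain is an assumption made here so that the innermost declaration is found first. The function names and the (name, class, line) record layout are also assumptions.

```python
K = 11  # number of slots, as in the example


def h(identifier):
    """(code of first letter + code of last letter) mod K."""
    return (ord(identifier[0]) + ord(identifier[-1])) % K


buckets = [[] for _ in range(K)]


def insert(name, kind, line):
    # Newest entry goes to the front of the chain (assumed policy).
    buckets[h(name)].insert(0, (name, kind, line))


def lookup(name):
    for record in buckets[h(name)]:   # only one chain is scanned
        if record[0] == name:
            return record
    return None


for name, kind, line in [
    ("Main", "Program", 1), ("a", "Variable", 2), ("b", "Variable", 2),
    ("P", "Procedure", 3), ("x", "Parameter", 3), ("a", "Variable", 4),
]:
    insert(name, kind, line)
```

Both `b` and `x` land in slot 9, so that chain holds two records. A look up for `a` returns the declaration from line 4 because it was inserted last.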