Upload: ngothu

Post on 15-May-2018

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Introduction

An assembler is system software that accepts an assembly language program as its input and produces its machine language equivalent, along with information for the loader, as its output. It is a translator that converts an assembly language program into a machine language program.

The structure of the assembler is given as:

[Figure: Assembly language program -> Assembler (data structures, e.g. symbol table, opcode table) -> Machine language program and extra information for loading]

Assembly language program

The sequence of instructions given to the assembler is called an assembly language program. It is written using a set of mnemonics. The format of the instruction varies from system to system, based on the machine architecture. In SIC, the format of an assembly language instruction is:

Label    Opcode or Mnemonic    Operands

(Ex)
FIRST    STL     RETADR
CLOOP    JSUB    RDREC
         LDA     LENGTH
         RSUB

LALITHAMBIGAI.B AP/IT

Machine language program

Each line of assembly language is translated into machine language. Machine language can be written in two forms, depending on the architecture:

1. hexadecimal form
2. binary form

SIC uses the hexadecimal form of machine code.

Data structures

The assembler builds or uses one or more data structures to perform the assembly process. Two of these data structures are SYMTAB and OPTAB.

Basic assembler functions

Fundamental functions of an assembler:

(i) A simple SIC assembler
(ii) Assembler algorithm and data structures

(i) A simple SIC assembler

Section 2.1 introduces the most fundamental operations performed by a typical assembler and describes common ways of accomplishing these functions. The data structures described are shared by almost all assemblers. They can be used as a framework for designing an assembler for a new or unfamiliar machine.

[Figure: Mnemonic operation code -> Machine language; Symbolic labels -> Machine addresses]

2.1 BASIC ASSEMBLER FUNCTIONS

Figure 2.1 shows an assembly language program for the basic version of SIC. The same program is used with variations throughout this chapter to illustrate different assembler features. The line numbers in the example are for reference only and are not part of the program. The example uses mnemonic instructions.

Mnemonic instruction (definition): a word or acronym used in assembly language to represent a binary machine instruction operation code.

Indexed addressing is indicated by adding the modifier ",X" after the operand. Example (see line 160):

STCH BUFFER,X

Lines beginning with "." contain comments only.

In addition to the mnemonic instructions, the following assembler directives are used:

1. START
2. END
3. BYTE
4. WORD
5. RESB
6. RESW

Assembler directive (definition): a statement in an assembly language program that gives instructions to the assembler itself and is not translated into a machine instruction.

The assembler directives are:

- START: Specify the name and starting address of the program.
- END: Indicate the end of the source program and specify the first executable instruction in the program.
- BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
- WORD: Generate a one-word integer constant.
- RESB: Reserve the indicated number of bytes for a data area.
- RESW: Reserve the indicated number of words for a data area.

In the example program, the end of a record is marked by a null character (hexadecimal 00), and the end of the file by a zero-length record.

In addition to the line numbers, the program is written in three columns: the first column holds the labels used in the source code, the second holds the opcode field, and the third holds the operand.
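The three-column layout can be split mechanically. The following is a minimal sketch in Python, not part of the original notes: the function name `parse_line` is hypothetical, and it assumes that a label (if present) starts in column 1, that fields are separated by whitespace, and that the line carries no trailing comment.

```python
def parse_line(line):
    """Split one SIC source line into (label, opcode, operand).

    Assumptions (not from the notes): a label, if present, starts in
    column 1; fields are whitespace-separated; there is no trailing comment.
    Returns None for blank lines and for comment lines starting with '.'.
    """
    if not line.strip() or line.lstrip().startswith("."):
        return None
    has_label = not line[0].isspace()
    fields = line.split()
    label = fields.pop(0) if has_label else ""
    opcode = fields[0] if fields else ""
    operand = fields[1] if len(fields) > 1 else ""
    return label, opcode, operand


if __name__ == "__main__":
    for src in ["FIRST   STL     RETADR", "        RSUB", ". comment line"]:
        print(repr(src), "->", parse_line(src))
```

A fixed-column parser would be needed to tell an operand apart from a comment on lines such as RSUB that take no operand; that is left out of this sketch.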

Figure 2.1 Assembly language program for the basic SIC version (columns: Label, Opcode, Operand)

The example program in Figure 2.1 has a main module and two subroutines.

Purpose of the example program:

- Reads records from the input device (code F1).
- Copies them to the output device (code 05).
- At the end of the file, writes EOF on the output device, then executes RSUB to return to the operating system.

Data transfer (RD, WD):

- A buffer is used to store each record.
- Buffering is necessary because the devices have different I/O rates.
- The end of each record is marked with a null character (hexadecimal 00).
- The end of the file is indicated by a zero-length record.

The main routine calls two subroutines:

- RDREC: reads a record into the buffer.
- WRREC: writes the record from the buffer to the output device.

2.1.1 A Simple SIC Assembler

Figure 2.2 shows the same program as Figure 2.1, with the generated object code for each statement.

Figure 2.2 Assembly language program for the basic SIC version, with object code

1. The column "Loc" in the example program gives the machine address, in hexadecimal, of each line of the assembled program.

2. The program starts at address 1000. (This is only an assumption: for the simple SIC machine a starting address is always assumed, and here its value is not 0.)

3. Translating the source program to object code requires the following functions:

a. Convert mnemonic operation codes to their machine language equivalents. E.g. translate STL to 14 (line 10).
b. Convert symbolic operands to their equivalent machine addresses. E.g. translate RETADR to 1033 (line 10).
c. Build the machine instructions in the proper format.
d. Convert the data constants specified in the source program into their internal machine representations. E.g. translate EOF to 454F46 (line 80).
e. Write the object program and the assembly listing.

4. All of these functions except (b) can be accomplished by sequential processing of the source program, one line at a time.

Consider the statement

10 1000 FIRST STL RETADR 141033

5. This instruction contains a forward reference, i.e. a reference to a label (RETADR) that is defined later in the program.

6. If the program is translated line by line, this statement cannot be processed when it is first read, because the address that will be assigned to RETADR is not yet known.

7. Hence most assemblers make two passes over the source program.

8. The first pass does little more than scan the source program for label definitions and assign addresses (such as those in the Loc column).

9. The second pass performs most of the actual translation.

10. The assembler must also process statements called assembler directives (or pseudo-instructions), which are not translated into machine instructions. Instead, they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values.

11. The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.
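Functions (a) to (d) can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the helper names `sic_instruction` and `char_constant` are hypothetical, the instruction layout is the standard 24-bit SIC format (8-bit opcode, 1-bit index flag, 15-bit address), and the address used for BUFFER in the last call is an illustrative value rather than one taken from these notes.

```python
def sic_instruction(opcode, address, indexed=False):
    """Build a 24-bit SIC instruction as 6 hex digits.

    Layout assumed: 8-bit opcode, 1-bit index flag x, 15-bit address.
    """
    word = (opcode << 16) | (int(indexed) << 15) | (address & 0x7FFF)
    return f"{word:06X}"


def char_constant(text):
    """Convert a character constant such as EOF to its hexadecimal ASCII form."""
    return text.encode("ascii").hex().upper()


if __name__ == "__main__":
    # STL has opcode 14 and RETADR is at 1033 (line 10 of the example).
    print(sic_instruction(0x14, 0x1033))                 # 141033
    # The character constant EOF (line 80 of the example).
    print(char_constant("EOF"))                          # 454F46
    # Indexed addressing sets the x bit; 1039 is an illustrative address.
    print(sic_instruction(0x54, 0x1039, indexed=True))   # 549039
```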

The object program format contains three types of records:

- Header record: contains the program name, starting address, and length.
- Text record: contains the machine code and data of the program.
- End record: marks the end of the object program and specifies the address in the program where execution is to begin.

The record formats are as follows:

Header record
Col. 1       H
Col. 2-7     Program name
Col. 8-13    Starting address of object program
Col. 14-19   Length of object program in bytes

Text record
Col. 1       T
Col. 2-7     Starting address for object code in this record
Col. 8-9     Length of object code in this record in bytes
Col. 10-69   Object code, represented in hexadecimal (2 columns per byte of object code)

End record
Col. 1       E
Col. 2-7     Address of first executable instruction in object program
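The column layouts above map directly onto fixed-width string formatting. The sketch below is an illustration in Python, with hypothetical function names; it assumes that addresses and lengths are written as hexadecimal digits and that the program name is left-justified and padded with blanks to 6 columns.

```python
def header_record(name, start, length):
    """Col 1 'H', cols 2-7 name, cols 8-13 start address, cols 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_code):
    """Col 1 'T', cols 2-7 start address, cols 8-9 length in bytes,
    cols 10-69 object code (at most 60 hex digits = 30 bytes)."""
    if len(object_code) > 60 or len(object_code) % 2:
        raise ValueError("object code must be at most 30 whole bytes")
    return f"T{start:06X}{len(object_code) // 2:02X}{object_code}"


def end_record(first_exec):
    """Col 1 'E', cols 2-7 address of the first executable instruction."""
    return f"E{first_exec:06X}"


if __name__ == "__main__":
    print(header_record("COPY", 0x1000, 0x107A))
    print(text_record(0x1000, "141033"))
    print(end_record(0x1000))
```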

2.1.1.1 The assembler design can be done as:

- Single-pass assembler
- Multi-pass assembler

Single-pass Assembler

In a single-pass assembler, the whole process of scanning, parsing, and object code conversion is done in a single pass. The main problem with this method is resolving forward references, as the following example shows:

10 1000 FIRST STL RETADR 141033
--
--
--
--
95 1033 RETADR RESW 1

In line 10, the instruction STL stores the contents of the linkage register (L) in RETADR. But while this instruction is being processed, the value of the symbol RETADR is not known, because it is defined later, at line 95.

In a single-pass assembler, scanning, parsing, and object code conversion happen together. The instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. The object code is generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available.

For this reason, the design is usually done in two passes. A multi-pass assembler first resolves the forward references and then converts the program into object code. The process of the multi-pass assembler is as follows:

Pass-1

- Assign addresses to all the statements.
- Save the addresses assigned to all labels, to be used in Pass-2.
- Perform some processing of assembler directives such as RESW and RESB, to find the length of data areas for assigning address values.
- Define the symbols in the symbol table (generate the symbol table).

Pass-2

- Assemble the instructions (translating operation codes and looking up addresses).
- Generate data values defined by BYTE, WORD, etc.
- Perform the processing of assembler directives not done during Pass-1.
- Write the object program and the assembly listing.

2.1.1.2 Assembler Design

The most important aspects of the design are the generation of the symbol table and the resolution of forward references.

Symbol table

- It is created during Pass 1.
- All the labels of the instructions are symbols.
- The table has an entry for each symbol name and its address value.

Forward reference

- A reference to a symbol that is defined later in the program is called a forward reference.
- When such a reference is encountered in Pass 1, the symbol table does not yet contain an address value for the symbol.

2.1.1.3 Functions of the two passes of the assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program.
2. Save the addresses assigned to all labels for use in Pass 2.
3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object program)

1. Assemble instructions (translating operation codes and looking up addresses).
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done in Pass 1.
4. Write the object program and the assembly listing.

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures, together with a location counter:

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents.
2. Symbol Table (SYMTAB): used to store the values (addresses) assigned to labels.
3. Location Counter (LOCCTR): a variable used to help in the assignment of addresses.

Location Counter (LOCCTR)

- It is initialized to the beginning address specified in the START statement.
- After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.
- Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

- Contains each mnemonic operation code and its machine language equivalent.
- Also contains information about instruction format and length.
- In Pass 1, OPTAB is used to look up and validate operation codes in the source program.
- In Pass 2, it is used to translate the operation codes to machine language.
- During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction.
- OPTAB is usually organized as a hash table, with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.
- In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it.

Symbol Table (SYMTAB)

- Includes the name and value (address) of each label in the source program, together with flags to indicate error conditions (e.g. a symbol defined in two different places).
- During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.
- During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2. This copy of the source program can also be used to retain the results of certain operations performed during Pass 1 (such as scanning the operand field for symbols and addressing flags), so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

Begin
    read first input line
    if OPCODE = 'START' then
        begin
            save [OPERAND] as starting address
            initialize LOCCTR to starting address
            write line to intermediate file
            read next input line
        end {if START}
    else
        initialize LOCCTR to 0
    while OPCODE ≠ 'END' do
        begin
            if this is not a comment line then
                begin
                    if there is a symbol in the LABEL field then
                        begin
                            search SYMTAB for LABEL
                            if found then
                                set error flag (duplicate symbol)
                            else
                                insert (LABEL, LOCCTR) into SYMTAB
                        end {if symbol}
                    search OPTAB for OPCODE
                    if found then
                        add 3 {instruction length} to LOCCTR
                    else if OPCODE = 'WORD' then
                        add 3 to LOCCTR
                    else if OPCODE = 'RESW' then
                        add 3 * [OPERAND] to LOCCTR
                    else if OPCODE = 'RESB' then
                        add [OPERAND] to LOCCTR
                    else if OPCODE = 'BYTE' then
                        begin
                            find length of constant in bytes
                            add length to LOCCTR
                        end {if BYTE}
                    else
                        set error flag (invalid operation code)
                end {if not a comment}
            write line to intermediate file
            read next input line
        end {while not END}
    write last line to intermediate file
    save (LOCCTR - starting address) as program length
End {Pass 1}

Explanation: Pass 1 Algorithm

- The algorithm scans the first statement, START, and saves the operand field (the address) as the starting address of the program. It initializes LOCCTR to this address. The line is then written to the intermediate file.
- If there is no START statement, LOCCTR is initialized to zero.
- If a label is encountered, the symbol has to be entered in the symbol table along with its associated address value (the current LOCCTR). If the symbol is already in the table, an error flag is set to indicate a duplicate symbol.
- Next, the mnemonic code is looked up in OPTAB. If it is found, the length of the instruction (3 bytes for SIC) is added to LOCCTR so that it points to the next instruction.
- If the opcode is the directive WORD, 3 is added to LOCCTR.
- If it is RESW, 3 times the number of data words is added to LOCCTR.
- If it is RESB, the number of bytes is added to LOCCTR. If it is BYTE, the length of the constant in bytes is added.
- If it is the END directive, the end of the program has been reached. The length of the program is found by evaluating (current LOCCTR - starting address), where the starting address is the one saved from the START statement.
- Each processed line is written to the intermediate file.
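The Pass 1 algorithm can be sketched in Python as follows. This is an illustrative sketch, not the notes' own implementation: the names `pass1` and `byte_length` are hypothetical, OPTAB holds only a few SIC mnemonics, source lines are assumed to be already split into (label, opcode, operand) tuples with comment lines removed, and the intermediate file is modelled as a list in memory.

```python
# A small subset of the SIC operation code table (mnemonic -> opcode).
OPTAB = {"STL": 0x14, "JSUB": 0x48, "LDA": 0x00, "COMP": 0x28,
         "JEQ": 0x30, "J": 0x3C, "STA": 0x0C, "RSUB": 0x4C}


def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    kind, body = operand[0], operand[2:-1]
    return len(body) if kind == "C" else (len(body) + 1) // 2


def pass1(lines):
    """lines: list of (label, opcode, operand); the last one must be END.

    Returns (symtab, intermediate, program_length, errors), where
    intermediate is a list of (loc, label, opcode, operand).
    """
    symtab, intermediate, errors = {}, [], []
    start = locctr = 0
    source = iter(lines)
    label, opcode, operand = next(source)
    if opcode == "START":
        start = locctr = int(operand, 16)
        intermediate.append((locctr, label, opcode, operand))
        label, opcode, operand = next(source)
    while opcode != "END":
        loc = locctr                      # address of this statement
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
        intermediate.append((loc, label, opcode, operand))
        label, opcode, operand = next(source)
    intermediate.append((locctr, label, opcode, operand))
    return symtab, intermediate, locctr - start, errors


if __name__ == "__main__":
    program = [("COPY", "START", "1000"), ("FIRST", "STL", "RETADR"),
               ("", "RSUB", ""), ("RETADR", "RESW", "1"), ("", "END", "FIRST")]
    symtab, _, length, errors = pass1(program)
    print({k: hex(v) for k, v in symtab.items()}, hex(length), errors)
```

Note that the forward reference to RETADR causes no difficulty here, because Pass 1 only records label addresses and never looks operands up.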

The Algorithm for Pass 2

Begin
    read first input line (from intermediate file)
    if OPCODE = 'START' then
        begin
            write listing line
            read next input line
        end {if START}
    write Header record to object program
    initialize first Text record
    while OPCODE ≠ 'END' do
        begin
            if this is not a comment line then
                begin
                    search OPTAB for OPCODE
                    if found then
                        begin
                            if there is a symbol in OPERAND field then
                                begin
                                    search SYMTAB for OPERAND
                                    if found then
                                        store symbol value as operand address
                                    else
                                        begin
                                            store 0 as operand address
                                            set error flag (undefined symbol)
                                        end
                                end {if symbol}
                            else
                                store 0 as operand address
                            assemble the object code instruction
                        end {if opcode found}
                    else if OPCODE = 'BYTE' or 'WORD' then
                        convert constant to object code
                    if object code will not fit into the current Text record then
                        begin
                            write Text record to object program
                            initialize new Text record
                        end
                    add object code to Text record
                end {if not comment}
            write listing line
            read next input line
        end {while not END}
    write last Text record to object program
    write End record to object program
    write last listing line
End {Pass 2}

Explanation: Pass 2 Algorithm

- The first input line is read from the intermediate file.
- If the opcode is START, the line is written directly to the listing file.
- A Header record is written to the object program. It gives the starting address and the length of the program (which was calculated during Pass 1).
- The first Text record is then initialized. Comment lines are ignored.
- For each instruction, OPTAB is searched to find the machine code of the opcode.
- If there is a symbol in the operand field, the symbol table is searched to get its address value, which is combined with the opcode to form the object code.
- If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to indicate an undefined symbol.
- If there is no symbol in the operand field, zero is stored as the operand address and the object code instruction is assembled.
- If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, the character constant EOF is stored as its hexadecimal value 454F46).
- If the object code cannot fit into the current Text record, that record is written out and a new Text record is started for the remaining object code.
- The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.
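A corresponding Python sketch of Pass 2 for basic SIC is given below. It is an illustration under stated assumptions rather than a definitive implementation: `pass2` and `constant_code` are hypothetical names, OPTAB is a small subset, the input is a list of (loc, label, opcode, operand) tuples as an in-memory stand-in for the intermediate file, and the assembly listing is omitted. One step goes beyond the pseudocode above: the current Text record is closed whenever RESB or RESW is met, so that each Text record covers contiguous addresses.

```python
OPTAB = {"STL": 0x14, "LDA": 0x00, "STA": 0x0C, "RSUB": 0x4C}


def constant_code(opcode, operand):
    """Object code (hex string) for a WORD or BYTE constant."""
    if opcode == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"
    kind, body = operand[0], operand[2:-1]      # C'...' or X'...'
    if kind == "C":
        return body.encode("ascii").hex().upper()
    return body.upper()


def pass2(intermediate, symtab, name, start, length):
    """Return (object_program_records, errors)."""
    records = [f"H{name:<6.6}{start:06X}{length:06X}"]
    errors = []
    text_start, text = None, ""
    first_exec = start

    def flush():
        """Write out the current Text record, if it holds any object code."""
        nonlocal text_start, text
        if text:
            records.append(f"T{text_start:06X}{len(text) // 2:02X}{text}")
        text_start, text = None, ""

    for loc, _label, opcode, operand in intermediate:
        if opcode == "START":
            continue
        if opcode == "END":
            first_exec = symtab.get(operand, start)
            break
        if opcode in OPTAB:
            address = 0
            if operand:
                symbol, _, modifier = operand.partition(",")
                if symbol in symtab:
                    address = symtab[symbol]
                else:
                    errors.append(f"undefined symbol {symbol}")
                if modifier.strip() == "X":
                    address |= 0x8000           # set the index bit
            code = f"{OPTAB[opcode]:02X}{address:04X}"
        elif opcode in ("BYTE", "WORD"):
            code = constant_code(opcode, operand)
        else:
            flush()                             # RESB/RESW: no object code
            continue
        if len(text) + len(code) > 60:          # Text record holds 30 bytes
            flush()
        if text_start is None:
            text_start = loc
        text += code
    flush()
    records.append(f"E{first_exec:06X}")
    return records, errors


if __name__ == "__main__":
    inter = [(0x1000, "COPY", "START", "1000"), (0x1000, "FIRST", "STL", "RETADR"),
             (0x1003, "", "RSUB", ""), (0x1006, "RETADR", "RESW", "1"),
             (0x1009, "", "END", "FIRST")]
    recs, errs = pass2(inter, {"FIRST": 0x1000, "RETADR": 0x1006}, "COPY", 0x1000, 9)
    print("\n".join(recs), errs)
```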

Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. The SIC machine has only limited instruction formats and hence limited addressing modes. Its instructions have a single operand, and the operand is always a memory reference. Anything that has to be fetched from memory takes more time. The improved SIC/XE machine therefore provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design of an assembler is therefore usually considered in two parts: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

- Instruction formats and addressing modes
- Program relocation

This section considers the design and implementation of an assembler for the more complex XE version of SIC.

200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    CLEAR   X           CLEAR LOOP COUNTER
212            LDT     LENGTH
215   WLOOP    TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIXR    T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240            JLT     WLOOP
245            RSUB
250   OUTPUT   BYTE    X'05'
255            END     FIRST

Figure 2.4 Example program for SIC/XE

Figure 2.4 shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

- Indirect addressing is indicated by adding the prefix @ to the operand (line 70).
- Immediate operands are denoted with the prefix # (lines 25, 55, 133).
- Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.
- The assembler directive BASE (line 13) is used in conjunction with base relative addressing.
- The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.
- Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.
- Immediate and indirect addressing have also been used as much as possible.
- Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.
- With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.
- The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label   Opcode  Operand     Object Code

5           COPY    START   0
10          FIRST   STL     RETADR
12                  LDB     #LENGTH
13                  BASE    LENGTH
15          CLOOP   +JSUB   RDREC
20                  LDA     LENGTH
25                  COMP    #0
30                  JEQ     ENDFIL
35                  +JSUB   WRREC
40                  J       CLOOP
45          ENDFIL  LDA     EOF
50                  STA     BUFFER
55                  LDA     #3
60                  STA     LENGTH
65                  +JSUB   WRREC
70                  J       @RETADR
80          EOF     BYTE    C'EOF'
95          RETADR  RESW    1
100         LENGTH  RESW    1
105         BUFFER  RESB    4096
110         .
115         .       SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125         RDREC   CLEAR   X
130                 CLEAR   A
132                 CLEAR   S
133                 +LDT    #4096
135         RLOOP   TD      INPUT
140                 JEQ     RLOOP
145                 RD      INPUT
150                 COMPR   A,S
155                 JEQ     EXIT
160                 STCH    BUFFER,X
165                 TIXR    T
170                 JLT     RLOOP
175         EXIT    STX     LENGTH
180                 RSUB
185         INPUT   BYTE    X'F1'
195         .
200         .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210         WRREC   CLEAR   X
212                 LDT     LENGTH
215         WLOOP   TD      OUTPUT
220                 JEQ     WLOOP
225                 LDCH    BUFFER,X
230                 WD      OUTPUT
235                 TIXR    T
240                 JLT     WLOOP
245                 RSUB
250         OUTPUT  BYTE    X'05'
255                 END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine, memory is byte-addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes (32 KB). Accordingly, SIC supports only one instruction format. Operand addressing involves only the accumulator (register A) and the index register (X), so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). SIC/XE supports four instruction formats:

- 1-byte instruction
- 2-byte instruction
- 3-byte instruction
- 4-byte instruction

Instructions can be:

- Instructions involving register-to-register operations
- Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
- Extended-format instructions

Addressing modes are:

- PC-relative or base-relative addressing: op m
- Indirect addressing: op @m
- Immediate addressing: op #c
- Extended format: +op m
- Indexed addressing: op m,x

SIC/XE also adds:

- register-to-register instructions
- larger memory -> multi-programming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1, the register names can be entered into the symbol table itself, with their equivalent numeric codes as values. During Pass 2, these values are assembled together with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.
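A register table of the kind described above might look as follows in Python. This is a sketch with hypothetical names (`REGISTERS`, `FORMAT2_OPTAB`, `format2`); the register numbers and opcodes are the standard SIC/XE values, and only three format 2 mnemonics are included.

```python
# Standard SIC/XE register numbers.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}

# A few format 2 (register-to-register) opcodes.
FORMAT2_OPTAB = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8}


def format2(mnemonic, operand):
    """Assemble a 2-byte instruction: 8-bit opcode, then two 4-bit register
    fields. A missing second register is encoded as 0."""
    regs = [r.strip() for r in operand.split(",")]
    r1 = REGISTERS[regs[0]]
    r2 = REGISTERS[regs[1]] if len(regs) > 1 else 0
    return f"{FORMAT2_OPTAB[mnemonic]:02X}{r1:X}{r2:X}"


if __name__ == "__main__":
    print(format2("TIXR", "T"))      # B850
    print(format2("COMPR", "A,S"))   # A004
    print(format2("CLEAR", "X"))     # B410
```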

2. Translation of register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions: one operand is always in a register and the other is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two relative modes:

1) Program-counter relative
2) Base relative

Relative addressing is used with format 3 instructions. In format 3, the opcode is followed by a 12-bit displacement value in the address field. In format 4, by contrast, the opcode is followed by a 20-bit address field, which is large enough to hold the full target address.

1. Program-Counter Relative

a) This mode is normally used with the format 3 instruction format.
b) The instruction contains the opcode followed by a 12-bit displacement value.
c) The displacement is a signed value in the range -2048 to 2047, which must be small enough to fit in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.
d) The address of the operand is thus calculated relative to the program counter: the displacement of the operand is relative to the current program counter value, i.e. the address of the next instruction.
e) The address is calculated as follows: TA = (PC) + displacement value, so the assembler computes displacement = TA - (PC).
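The displacement calculation can be sketched in Python as below. The function name `pc_relative_disp` is hypothetical. The first call uses values that appear in these notes (LDA LENGTH at location 000A, LENGTH at 0033); the backward-jump call uses illustrative addresses.

```python
def pc_relative_disp(target, pc):
    """Return the 12-bit displacement field for PC-relative addressing,
    or None if the displacement does not fit in the range -2048..2047.

    pc is the address of the NEXT instruction, because the program counter
    has already been advanced when the instruction executes.
    """
    disp = target - pc
    if not -2048 <= disp <= 2047:
        return None
    return disp & 0xFFF          # 12-bit two's complement encoding


if __name__ == "__main__":
    # LDA LENGTH at 000A: the next instruction is at 000D, LENGTH is at 0033.
    print(f"{pc_relative_disp(0x0033, 0x000D):03X}")   # 026
    # A backward jump (illustrative addresses) gives a negative displacement.
    print(f"{pc_relative_disp(0x0006, 0x001A):03X}")   # FEC
    # STX LENGTH at 1056: 0033 - 1059 is out of range.
    print(pc_relative_disp(0x0033, 0x1059))            # None
```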

2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register (B). The target address is therefore

TA = (base) + displacement value

b) This addressing mode is used when the range of the PC-relative displacement is not sufficient. The operand address is then expressed relative to the base register rather than relative to the instruction, as it is in PC-relative addressing. The use of this mode is signalled by the directive BASE, which tells the assembler what value the base register will contain. From the point where the assembler encounters this directive, it can use base-relative addressing to calculate the target address of an operand.

c) The directive NOBASE indicates that the base register is no longer available for calculating the target address of an operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB #LENGTH (instruction)
BASE LENGTH (directive)
NOBASE

For example:

12   0003          LDB    #LENGTH     69202D
13                 BASE   LENGTH
100  0033  LENGTH  RESW   1
105  0036  BUFFER  RESB   4096
160  104E          STCH   BUFFER,X    57C003
165  1051          TIXR   T           B850

In the above example, the directive BASE tells the assembler that base-relative addressing can be used to calculate target addresses, with the base register holding the address of LENGTH. The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register contains this value.

For line 160, PC-relative addressing is not possible, because (PC) = 1051 and the displacement 0036 - 1051 is outside the 12-bit range. Base-relative addressing is used instead:

BUFFER is at location (0036)16
(B) = (0033)16
disp = 0036 - 0033 = (0003)16

20   000A          LDA    LENGTH      032026
175  1056  EXIT    STX    LENGTH      134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA

This value does not fit in the 12-bit displacement field, so PC-relative addressing is not applicable and base-relative addressing is used: disp = 0033 - 0033 = 0, which gives the object code 134000. For line 20, by contrast, PC-relative addressing works: disp = 0033 - 000D = 026, which gives 032026.
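The choice between the two relative modes can be sketched in Python and checked against the object codes listed above. The function name `format3` and its parameters are hypothetical; the bit layout assumed is the standard SIC/XE format 3 (6-bit opcode, flags n i x b p e, 12-bit displacement), with the e bit left at 0.

```python
def format3(opcode, target, pc, base=None, n=1, i=1, x=0):
    """Assemble a SIC/XE format 3 instruction as 6 hex digits.

    Tries PC-relative addressing first (signed displacement -2048..2047)
    and falls back to base-relative (0..4095) if a base value is known.
    pc is the address of the next instruction.
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        b, p = 0, 1
        disp &= 0xFFF                     # 12-bit two's complement
    elif base is not None and 0 <= target - base <= 4095:
        b, p = 1, 0
        disp = target - base
    else:
        raise ValueError("displacement out of range: format 4 is needed")
    first_byte = opcode | (n << 1) | i    # n and i share the opcode byte
    word = (first_byte << 16) | (x << 15) | (b << 14) | (p << 13) | disp
    return f"{word:06X}"


if __name__ == "__main__":
    LENGTH, BUFFER = 0x0033, 0x0036
    print(format3(0x68, LENGTH, pc=0x0006, n=0, i=1))             # LDB #LENGTH
    print(format3(0x00, LENGTH, pc=0x000D))                       # LDA LENGTH
    print(format3(0x54, BUFFER, pc=0x1051, base=LENGTH, x=1))     # STCH BUFFER,X
    print(format3(0x10, LENGTH, pc=0x1059, base=LENGTH))          # STX LENGTH
```

The four calls reproduce the object codes 69202D, 032026, 57C003, and 134000 from the example lines 12, 20, 160, and 175.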

3. Immediate Addressing Mode

In this mode no memory reference is involved: the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative addressing. For example:

12   0003          LDB    #LENGTH     69202D

Here the operand value is the address of LENGTH (0033), and the displacement is 0033 - 0006 = 02D.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of a location that contains the address of the operand. The address of this location is found using PC-relative addressing. For example:

70   002A          J      @RETADR     3E2003

The instruction transfers control to the address stored in location RETADR. The address of RETADR is 0030 and (PC) = 002D, so the displacement is 0030 - 002D = 003.

2.2.2 Program Relocation

The need for program relocation

- It is desirable to load and run several programs at the same time.
- The system must be able to load programs into memory wherever there is room.
- The exact starting address of the program is not known until load time.

Absolute address

- An absolute program has its starting address specified at assembly time.
- The addresses in it may be invalid if the program is loaded somewhere else.

Example

Consider an LDA instruction assembled as 00102D in a program with starting address 1000. It says that register A is to be loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address of the operand changes by the same amount as the load address of the program. Hence the address portion of the instruction must be changed so that the program can be loaded and executed at location 3000.

Some instructions undergo a change in their operand address as the load address changes, but other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler can identify for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows that if the program is loaded at the new location 5000, RDREC is at address 6036, so the address field of the JSUB instruction must be modified to 06036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.
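The arithmetic behind the three load addresses can be checked with a short sketch. It assumes the JSUB instruction was assembled as 4B101036 (opcode and flags 4B1, address field 01036) for a program starting at 0000; the function name relocate_format4 is hypothetical and used only for illustration.

```python
def relocate_format4(instruction: int, load_address: int) -> int:
    """Add load_address to the 20-bit address field of a 4-byte instruction."""
    ADDRESS_MASK = 0xFFFFF  # the low 20 bits hold the address field
    address = (instruction & ADDRESS_MASK) + load_address
    return (instruction & ~ADDRESS_MASK) | (address & ADDRESS_MASK)


JSUB_AT_0000 = 0x4B101036  # assumed object code of JSUB RDREC relative to 0000

for start in (0x0000, 0x5000, 0x7420):
    print(f"loaded at {start:04X}: {relocate_format4(JSUB_AT_0000, start):08X}")
```

Only the address field changes; the opcode and flag bits in the upper part of the instruction are left untouched.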


The rest of the instructions need not be modified because their operands are

o Not a memory address (immediate addressing)

o PC-relative or base-relative

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader which fields hold addresses.

o The object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

One Modification record is written for each address to be modified.

The length is stored in half-bytes (4 bits).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it is assumed to begin in the middle of the first byte at the starting location.


Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The records at the end of the object code are the Modification records for the instructions that must change if relocation occurs. M00000705 is the Modification record for the address field that begins at location 000007 (inside the instruction at location 0006) and is 5 half-bytes long.
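As a rough illustration of how a loader might use such a record, the sketch below applies M00000705 to a small program image. It assumes the record layout visible in the example (the letter M, six hex digits of starting location, two hex digits of length in half-bytes) and that the bytes at locations 0006 to 0009 hold the JSUB instruction 4B101036; apply_modification is a hypothetical helper, not part of any real loader.

```python
def apply_modification(memory: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field described by one Modification record."""
    start = int(record[1:7], 16)        # location of the first byte of the field
    half_bytes = int(record[7:9], 16)   # field length in half-bytes
    n_bytes = (half_bytes + 1) // 2     # bytes that contain the field
    mask = (1 << (4 * half_bytes)) - 1  # the field is right-aligned in those bytes
    word = int.from_bytes(memory[start:start + n_bytes], "big")
    field = ((word & mask) + load_address) & mask
    word = (word & ~mask) | field       # keep the leading half-byte when the length is odd
    memory[start:start + n_bytes] = word.to_bytes(n_bytes, "big")


program = bytearray(16)                    # program image, indexed from its own start
program[6:10] = bytes.fromhex("4B101036")  # assumed JSUB RDREC at location 0006
apply_modification(program, "M00000705", 0x5000)
print(program[6:10].hex().upper())
```

Because the length is odd (5 half-bytes), the first half-byte at location 0007 belongs to the flag bits and is preserved by the mask.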

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL  LDA   =C'EOF'   032010
      002D                =C'EOF'   454F46
215   1062  WLOOP   TD    =X'05'    E32011
      1076                =X'05'    05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93          LTORG
      002D  =C'EOF'   454F46

• The assembler directive LTORG tells the assembler to generate a literal pool here.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand:

BASE *
LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'
• 125 LDA =C'EOF'
• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal value, as on line 160. Recognizing such operands will complicate the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, the operand value and length, and the address assigned to the literal.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address assigned.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• Update the location counter to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
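The Pass 1 handling of literals described above can be sketched as follows. This is a minimal illustration, not the notes' own implementation: the class LiteralTable, its method names, and the dictionary layout are assumptions, and only C'...' and X'...' literals are handled. Duplicates are recognized by comparing the literal names as strings.

```python
def literal_bytes(name: str) -> bytes:
    """Return the constant denoted by a literal such as =C'EOF' or =X'05'."""
    kind, text = name[1], name[3:-1]
    return text.encode("ascii") if kind == "C" else bytes.fromhex(text)


class LiteralTable:
    def __init__(self):
        self.entries = {}  # literal name -> value, length, address

    def add(self, name: str) -> None:
        # Pass 1: enter the literal once, with no address assigned yet.
        if name not in self.entries:
            value = literal_bytes(name)
            self.entries[name] = {"value": value, "length": len(value), "address": None}

    def place_pool(self, locctr: int) -> int:
        # LTORG or END: give every unplaced literal an address, return the new LOCCTR.
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LiteralTable()
littab.add("=C'EOF'")
littab.add("=C'EOF'")  # duplicate name, so no second entry is made
next_locctr = littab.place_pool(0x002D)
eof = littab.entries["=C'EOF'"]
print(f"{eof['address']:04X} {eof['value'].hex().upper()} next LOCCTR {next_locctr:04X}")
```

A Python dictionary stands in for the hash table keyed on the literal name; a literal written as =X'454F46' would get its own entry even though its value is the same.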

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

Symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the value given by "value".

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. For example, the statement

+LDT #4096

can be written as

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers

A EQU 0

X EQU 1

L EQU 2

so that RMO 0,1 can be written as RMO A,X.

Establish and use names that reflect the logical function of the registers in the program

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to assign values to symbols indirectly.

ORG value

"Value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB (100 entries)
SYMBOL    VALUE     FLAGS
6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program, so some of the following sequences cannot be processed by an ordinary two-pass assembler.

Allowed:

ALPHA  RESW  1
BETA   EQU   ALPHA

Disallowed (ALPHA is not yet defined when BETA is processed):

BETA   EQU   ALPHA
ALPHA  RESW  1

Disallowed (neither BETA nor DELTA is defined when it is first used):

ALPHA  EQU   BETA
BETA   EQU   DELTA
DELTA  RESW  1

Disallowed (ORG refers to ALPHA before it is defined):

       ORG   ALPHA
BYTE1  RESB  1
BYTE2  RESB  1
BYTE3  RESB  1
       ORG
ALPHA  RESB  1
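The restriction on EQU can be illustrated with a small Pass 1 sketch. It is a simplified model under stated assumptions: statements are (label, opcode, operand) triples, only EQU and RESW are handled, an EQU operand is either a decimal constant or a single symbol, and the function name define_symbols is hypothetical.

```python
def define_symbols(statements):
    """One pass over (label, opcode, operand) triples; returns (SYMTAB, errors)."""
    symtab, errors, locctr = {}, [], 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand.isdigit():
                symtab[label] = int(operand)
            elif operand in symtab:
                symtab[label] = symtab[operand]  # operand was defined earlier
            else:
                errors.append(f"{label}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)
    return symtab, errors


allowed = [("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]
disallowed = [("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")]
print(define_symbols(allowed))
print(define_symbols(disallowed))
```

In the second sequence BETA cannot be given a value during Pass 1, which is exactly why an ordinary two-pass assembler rejects it.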

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term can enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term can enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
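The pairing rule can be checked mechanically by counting relative terms with their signs: a net count of 0 gives an absolute expression, a net count of +1 gives a relative one, and anything else is an error. The sketch below assumes expressions without spaces or parentheses that use only +, - and *; the table SYMBOL_TYPES copies the type column above, and classify is a hypothetical helper name.

```python
import re

SYMBOL_TYPES = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}


def classify(expression: str) -> str:
    """Return 'absolute', 'relative', or 'error' for an expression using +, -, *."""
    net_relative = 0
    # Split into signed terms, e.g. "BUFEND-BUFFER" -> ("", "BUFEND"), ("-", "BUFFER").
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expression):
        factors = term.split("*")
        relative = [f for f in factors if SYMBOL_TYPES.get(f) == "R"]
        if relative and len(factors) > 1:
            return "error"  # a relative term may not be multiplied
        if relative:
            net_relative += -1 if sign == "-" else 1
    if net_relative == 0:
        return "absolute"
    if net_relative == 1:
        return "relative"
    return "error"


for text in ("BUFEND-BUFFER", "RETADR", "BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"):
    print(text, "->", classify(text))
```

Constants and symbols of type A never change the count, which is why BUFFER+3 would still be classified as relative.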

Program blocks

The example program is divided into three blocks: the default block (block number 0), CDATA (block number 1), and CBLKS (block number 2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that occupy little memory

CBLKS (2): block for data areas that occupy larger blocks of memory

One more column, the block number, is included in the listing when the object code is generated. Whenever a new block begins, the program contains the directive USE followed by the block name, and the assembler starts a separate LOCCTR for that block at 0000. The block number is also added as a field to SYMTAB.

The program block table contains the block name, block number, starting address, and length of each block.

After finding the location counter values, the assembler generates the object code and writes the object program.
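A program block table of this kind could be built as in the sketch below. It assumes that the blocks are laid out one after another in block-number order, so each starting address is the sum of the lengths of the preceding blocks; the block lengths are illustrative values only, and both function names are hypothetical.

```python
def build_block_table(block_lengths):
    """block_lengths: list of (name, length) in block-number order."""
    table, start = [], 0
    for number, (name, length) in enumerate(block_lengths):
        table.append({"name": name, "number": number, "start": start, "length": length})
        start += length  # the next block begins where this one ends
    return table


def absolute_address(table, block_number, relative_address):
    """Address of a symbol = starting address of its block + address within the block."""
    return table[block_number]["start"] + relative_address


# Illustrative lengths, not taken from the notes.
blocks = build_block_table([("(default)", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
for row in blocks:
    print(f"{row['name']:<10}{row['number']}  {row['start']:04X}  {row['length']:04X}")
```

The block number stored with each symbol in SYMTAB is what allows absolute_address to turn a block-relative value into an address relative to the start of the program.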


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.


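Only the object code generated for instructions that refer to EXTREF symbols changes: the assembler cannot know the address of such a symbol, so the address is left to be filled in at load time.

The object program needs six types of records: Header record, Define record, Refer record, Text record, Modification record, and End record.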


Object Program for Control Section and Program Linking


Machine language program

Each line of assembly language instruction is translated to machine language. The machine language takes two forms depending on the architecture. They are

1. hexadecimal form

2. binary form

SIC takes the hexadecimal form of machine code.

Data structures

The assembler builds or uses one or more data structures to perform the assembling process. Some of the data structures are SYMTAB and OPTAB.

Basic assembler functions

Fundamental functions of an assembler

(i) A simple SIC assembler

(ii) Assembler algorithm and data structure

(i) A simple SIC assembler

Section 2.1 introduces the most fundamental operations performed by a typical assembler and describes common ways of accomplishing these functions. The data structures described are shared by almost all assemblers. These data structures can be used as a framework to design the assembler for a new or unfamiliar machine.


Mnemonic operation code Machine language

Symbolic labels Machine addresses


2.1 BASIC ASSEMBLER FUNCTIONS

Figure 2.1 shows an assembly language program for the basic version of SIC. The same program is used with different variations throughout this chapter to illustrate different assembler features. The line numbers in this example program are for reference only and are not part of the program. The program uses mnemonic instructions.

Mnemonic instruction definition: a word or acronym used in assembly language to represent a binary machine instruction operation code.

Indexed addressing is indicated by adding the modifier ",X" following the operand. Example: see line 160,

STCH BUFFER,X

Lines beginning with "." contain comments only.

In addition to the mnemonic instructions, the following assembler directives are used:

1. START
2. END
3. BYTE
4. WORD
5. RESB
6. RESW

Assembler directive definition: a statement in an assembly language program that gives instructions to the assembler and is not translated into a machine instruction is called an assembler directive.

The assembler directives are

- START: Specify the name and starting address for the program.
- END: Indicate the end of the source program and specify the first executable instruction in the program.
- BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
- WORD: Generate a one-word integer constant.
- RESB: Reserve the indicated number of bytes for a data area.
- RESW: Reserve the indicated number of words for a data area.

The example program also relies on two conventions:

- End of record: a null character (00)
- End of file: a zero-length record

In addition to the line numbers, the program is written in three columns: the first column contains the labels used in the source code, the second the opcode field, and the third the operand.

Label Opcode Operand

Figure 2.1 Assembler language program for basic SIC version

The example program in Figure 2.1 has a main module and two subroutines.

Purpose of the example program:

- Reads records from the input device (code F1)
- Copies them to the output device (code 05)
- At the end of the file, writes EOF on the output device, then executes RSUB to return to the operating system

• Data transfer (RD, WD)

- A buffer is used to store each record
- Buffering is necessary because the devices have different I/O rates
- The end of each record is marked with a null character (hexadecimal 00)
- The end of the file is indicated by a zero-length record

The main routine calls two subroutines:

RDREC - to read a record into a buffer

WRREC - to write the record from the buffer to the output device

2.1.1 A Simple SIC Assembler

Figure 2.2 shows the same program as in Figure 2.1, with the generated object code for each statement.

Figure 2.2 Assembler language program for basic SIC version with object code

1. The column "Loc" in the example program gives the machine address, in hexadecimal, of each line of the assembled program.

2. The program starts at address 1000 (this is only an assumption; for the simple SIC machine an actual, nonzero starting address is assumed).

3. The translation of the source program to object code requires the following functions:

a. Convert mnemonic operation codes to their machine language equivalents. E.g. translate STL to 14 (line 10).

b. Convert symbolic operands to their equivalent machine addresses. E.g. translate RETADR to 1033 (line 10).

c. Build the machine instructions in the proper format.

d. Convert the data constants specified in the source program into their internal machine representations. E.g. translate EOF to 454F46 (line 80).

e. Write the object program and the assembly listing.

4. All of these functions except function b can be accomplished by sequential processing of the source program, one line at a time.

Consider the statement

10 1000 FIRST STL RETADR 141033

5. This instruction contains a forward reference, i.e. a reference to a label (RETADR) that is defined later in the program.

6. If the program is processed one line at a time, this line cannot be translated, because the address that will be assigned to RETADR is not yet known.

7. Hence most assemblers make two passes over the source program.

8. The first pass does little more than scan the source program for label definitions and assign addresses, such as those in the Loc column.

9. The second pass performs most of the actual translation.

10. The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead, they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values.

11. The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.

The object program format contains three types of records:

Header record: Contains the program name, starting address, and length.

Text record: Contains the machine code and data of the program.

End record: Marks the end of the object program and specifies the address in the program where execution is to begin.

The record formats are as follows:

Header record

Col. 1       H
Col. 2-7     Program name
Col. 8-13    Starting address of object program
Col. 14-19   Length of object program in bytes

Text record

Col. 1       T
Col. 2-7     Starting address for object code in this record
Col. 8-9     Length of object code in this record in bytes
Col. 10-69   Object code, represented in hexadecimal (2 columns per byte of object code)

End record

Col. 1       E
Col. 2-7     Address of first executable instruction in object program
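The column layouts above translate directly into fixed-width string formatting. The following sketch assumes that addresses and lengths are written as hexadecimal digits and that the program name is padded with blanks to six columns; the function names and the program length 107A used in the example are illustrative assumptions.

```python
def header_record(name: str, start: int, length: int) -> str:
    # Col 1 "H", col 2-7 name (blank padded), col 8-13 start, col 14-19 length.
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start: int, object_code: bytes) -> str:
    # Col 10-69 leave room for 60 hex digits, i.e. at most 30 bytes of object code.
    if len(object_code) > 30:
        raise ValueError("a Text record holds at most 30 bytes")
    return f"T{start:06X}{len(object_code):02X}{object_code.hex().upper()}"


def end_record(first_instruction: int) -> str:
    return f"E{first_instruction:06X}"


print(header_record("COPY", 0x1000, 0x107A))           # length is an example value
print(text_record(0x1000, bytes.fromhex("141033")))    # STL RETADR from line 10
print(end_record(0x1000))
```

The 30-byte limit in text_record follows from the 60 columns (10 to 69) reserved for object code at 2 columns per byte.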


2.1.1.1 The assembler design can be done as a

Single-pass assembler

Multi-pass assembler

Single-pass Assembler

In a single-pass assembler, the whole process of scanning, parsing, and object code conversion is done in a single pass.

The only problem with this method is resolving forward references.

The problem of forward reference is shown with an example below:

10   1000   FIRST    STL    RETADR   141033
--
--
--
--
95   1033   RETADR   RESW   1

In the above example, in line number 10, the instruction STL stores the contents of the linkage register in RETADR. But during the processing of this instruction the value of this symbol is not known, as it is defined at line number 95.

In a single-pass assembler the scanning, parsing, and object code conversion happen simultaneously. The instruction is fetched, scanned for tokens, and parsed for syntax and semantic validity. If it is valid, it has to be converted to its equivalent object code. The object code is generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available.

For this reason the design is usually done in two passes.

A multi-pass assembler resolves the forward references and then converts the program into object code. The process of the multi-pass assembler is as follows:

Pass-1

Assign addresses to all the statements.

Save the addresses assigned to all labels, to be used in Pass-2.

Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas for assigning the address values.

Define the symbols in the symbol table (generate the symbol table).

Pass-2

Assemble the instructions (translating operation codes and looking up addresses).

Generate data values defined by BYTE, WORD, etc.

Perform the processing of the assembler directives not done during Pass-1.

Write the object program and assembly listing.

2.1.1.2 Assembler Design

The most important things to concentrate on are the generation of the symbol table and the resolution of forward references.

• Symbol Table

- This is created during Pass 1.
- All the labels of the instructions are symbols.
- The table has an entry for each symbol name with its address value.

• Forward reference

- A reference to a symbol that is defined later in the program is called a forward reference.
- During Pass 1 there is no address value in the symbol table for such a symbol at the point where it is first referenced.

2.1.1.3 Functions of the two passes of assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program.

2. Save the addresses assigned to all labels for use in Pass 2.

3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object programs)

1. Assemble instructions (translating operation codes and looking up addresses).

2. Generate data values defined by BYTE, WORD, etc.

3. Perform processing of assembler directives not done in Pass 1.

4. Write the object program and the assembly listing.

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures and one variable:

1. Operation Code Table (OPTAB): Used to look up mnemonic operation codes and translate them into their machine language equivalents.

2. Symbol Table (SYMTAB): Used to store values (addresses) assigned to labels.

3. Location Counter (LOCCTR): A variable used to help in the assignment of addresses.

Location Counter (LOCCTR)

It is initialized to the beginning address specified in the START statement.

After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.

Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

Contains the mnemonic operation code and its machine language equivalent.

Also contains information about instruction format and length.

In Pass 1, OPTAB is used to look up and validate operation codes in the source program.

In Pass 2, it is used to translate the operation codes to machine language.

During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction.

OPTAB is usually organized as a hash table, with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.

In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it.

Symbol Table (SYMTAB)

Includes the name and value (address) for each label in the source program, together with flags to indicate error conditions (e.g. a symbol defined in two different places).

During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.

During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations that may be performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

Begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 {instruction length} to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
End {Pass 1}

Explanation - PASS I Algorithm

The algorithm reads the first statement. If it is START, the operand field (the address) is saved as the starting address of the program and LOCCTR is initialized to this address. This line is written to the intermediate file.

If there is no START statement, LOCCTR is initialized to zero.

If a label is encountered, the symbol has to be entered in the symbol table along with its associated address value (the current LOCCTR). If the symbol is already in the table, it has been defined before, so an error flag is set indicating a duplicate symbol.

Next the mnemonic code is searched for in OPTAB. If it is found, the length of the instruction (3 bytes) is added to LOCCTR to make it point to the next instruction.

If the opcode is the directive WORD, 3 is added to LOCCTR.

If it is RESW, 3 times the number of data words is added to LOCCTR.

If it is RESB, the number of bytes is added to LOCCTR; if it is BYTE, the length of the constant in bytes is added.

If it is the END directive, the end of the program has been reached; the length of the program is found by evaluating the current LOCCTR minus the starting address saved from the START statement.

Each processed line is written to the intermediate file.
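The Pass 1 steps can be sketched in Python as follows. This is a simplified model rather than a full assembler: source lines are assumed to be already split into (label, opcode, operand) triples, comment lines and the intermediate file are left out, OPTAB holds only a few opcodes, and the sample program is a shortened, made-up variant of the COPY program, so its addresses differ from those in Figure 2.2.

```python
OPTAB = {"STL": 0x14, "JSUB": 0x48, "LDA": 0x00, "RSUB": 0x4C}  # small subset of SIC


def byte_length(operand: str) -> int:
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else len(body) // 2


def pass1(lines):
    """lines: (label, opcode, operand) triples; returns (SYMTAB, errors, program length)."""
    symtab, errors = {}, []
    start = locctr = 0
    for label, opcode, operand in lines:
        if opcode == "START":
            start = locctr = int(operand, 16)  # the operand is a hexadecimal address
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, errors, locctr - start


source = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("", "RSUB", ""),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    ("", "END", "FIRST"),
]
symtab, errors, length = pass1(source)
print({name: f"{addr:04X}" for name, addr in symtab.items()}, errors, f"{length:04X}")
```

Note that the forward reference to RETADR on the STL line causes no difficulty here, because Pass 1 only assigns addresses and never looks up operands.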

The Algorithm for Pass 2

Begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
End {Pass 2}

Explanation - PASS II Algorithm

The first input line is read from the intermediate file.

If the opcode is START, this line is written directly to the listing file.

A Header record is written to the object program; it gives the starting address and the length of the program (which was calculated during Pass 1).

Then the first Text record is initialized. Comment lines are ignored.

For each instruction, OPTAB is searched to find the machine code of the opcode.

If there is a symbol in the operand field, the symbol table is searched to get its address value, which is combined with the machine code of the opcode.

If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set indicating an undefined symbol.

If there is no symbol in the operand field, 0 is stored as the operand address and the object code instruction is assembled.

If the opcode is BYTE or WORD, the constant value is converted to its equivalent object code (for example, for the character constant EOF the equivalent hexadecimal value 454F46 is stored).

If the object code cannot fit into the current Text record, that record is written out and a new Text record is started for the remaining object code.

The Text records are written to the object program. Once the whole program is assembled and the END directive is encountered, the last Text record and the End record are written.
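The core of Pass 2, assembling one SIC instruction from OPTAB and SYMTAB, can be sketched as below. It assumes the standard SIC instruction layout (8-bit opcode, 1-bit index flag, 15-bit address). The SYMTAB contents are example values: RETADR = 1033 comes from the statement on line 10, while BUFFER = 1039 is an assumed address used only for illustration. Both function names are hypothetical.

```python
OPTAB = {"STL": 0x14, "LDA": 0x00, "STCH": 0x54, "RSUB": 0x4C}  # small subset of SIC
SYMTAB = {"RETADR": 0x1033, "BUFFER": 0x1039}                   # as left by Pass 1


def assemble(opcode: str, operand: str = ""):
    """Return (object code as 6 hex digits, error message or None)."""
    error = None
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    if not symbol:
        address = 0                      # no operand, e.g. RSUB
    elif symbol in SYMTAB:
        address = SYMTAB[symbol]
    else:
        address, error = 0, f"undefined symbol {symbol}"
    if indexed:
        address |= 0x8000                # set the x bit for indexed addressing
    return f"{(OPTAB[opcode] << 16) | address:06X}", error


def byte_constant(operand: str) -> str:
    """Object code for a BYTE operand such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return body.encode("ascii").hex().upper() if operand[0] == "C" else body.upper()


print(assemble("STL", "RETADR"))
print(assemble("STCH", "BUFFER,X"))
print(assemble("RSUB"))
print(byte_constant("C'EOF'"))
```

An undefined symbol still yields object code with a zero address, together with an error message, which mirrors the error-flag behaviour of the algorithm.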


Design and Implementation Issues

Some of the features in the program depend on the architecture of the machine. If the program is for the SIC machine, then we have only limited instruction formats and hence limited addressing modes. We have only single-operand instructions, and the operand is always a memory reference. Anything to be fetched from memory requires more time. Hence the improved SIC/XE machine provides more instruction formats and hence more addressing modes. The moment we change the machine architecture, the available instruction formats and addressing modes change. Therefore the design usually requires considering two things: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered.

200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    CLEAR   X           CLEAR LOOP COUNTER
212            LDT     LENGTH
215   WLOOP    TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIXR    T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240            JLT     WLOOP
245            RSUB
250   OUTPUT   BYTE    X'05'
255            END     FIRST

Figure 2.4 Example program of a SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label   Opcode  Operand     Object Code
5           COPY    START   0
10          FIRST   STL     RETADR
12                  LDB     #LENGTH
13                  BASE    LENGTH
15          CLOOP   +JSUB   RDREC
20                  LDA     LENGTH
25                  COMP    #0
30                  JEQ     ENDFIL
35                  +JSUB   WRREC
40                  J       CLOOP
45          ENDFIL  LDA     EOF
50                  STA     BUFFER
55                  LDA     #3
60                  STA     LENGTH
65                  +JSUB   WRREC
70                  J       @RETADR
80          EOF     BYTE    C'EOF'
95          RETADR  RESW    1
100         LENGTH  RESW    1
105         BUFFER  RESB    4096
110         .
115         .       SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125         RDREC   CLEAR   X
130                 CLEAR   A
132                 CLEAR   S
133                 +LDT    #4096
135         RLOOP   TD      INPUT
140                 JEQ     RLOOP
145                 RD      INPUT
150                 COMPR   A,S
155                 JEQ     EXIT
160                 STCH    BUFFER,X
165                 TIXR    T
170                 JLT     RLOOP
175         EXIT    STX     LENGTH
180                 RSUB
185         INPUT   BYTE    X'F1'
195         .
200         .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210         WRREC   CLEAR   X
212                 LDT     LENGTH
215         WLOOP   TD      OUTPUT
220                 JEQ     WLOOP
225                 LDCH    BUFFER,X
230                 WD      OUTPUT
235                 TIXR    T
240                 JLT     WLOOP
245                 RSUB
250         OUTPUT  BYTE    X'05'
255                 END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes (32 KB). Accordingly it supports only one instruction format. Of its registers, only the accumulator (A) and the index register (X) take part in addressing operands, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). It supports four different instruction formats:

- 1-byte instruction
- 2-byte instruction
- 3-byte instruction
- 4-byte instruction

Instructions can be:

- Instructions involving register to register
- Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
- Extended instruction format

Addressing modes are:

- PC-relative or base-relative addressing: op m
- Indirect addressing: op @m
- Immediate addressing: op #c
- Extended format: +op m
- Indexed addressing: op m,X
- Register-to-register instructions
- Larger memory -> multi-programming (program allocation)
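The notation in the list above maps directly onto the n, i, x and e flag bits of a SIC/XE instruction. The following is a minimal sketch of that mapping, not a definitive implementation; the function name addressing_flags and the dictionary layout are assumptions made for illustration. The b and p bits are not set here because the assembler chooses them later, when it computes the displacement.

```python
def addressing_flags(opcode_field, operand_field):
    """Derive the n, i, x, e flag bits from SIC/XE source notation (hypothetical helper)."""
    e = 1 if opcode_field.startswith("+") else 0      # +op m  -> extended format
    operand = operand_field.replace(" ", "")
    x = 0
    if operand.upper().endswith(",X"):                # op m,X -> indexed
        x = 1
        operand = operand[:-2]
    if operand.startswith("#"):                       # op #c  -> immediate
        n, i = 0, 1
        operand = operand[1:]
    elif operand.startswith("@"):                     # op @m  -> indirect
        n, i = 1, 0
        operand = operand[1:]
    else:                                             # op m   -> simple addressing
        n, i = 1, 1
    return {"n": n, "i": i, "x": x, "e": e, "operand": operand}


print(addressing_flags("J", "@RETADR"))
print(addressing_flags("+JSUB", "RDREC"))
```

Returning the stripped operand alongside the flags lets a later step look the symbol up in SYMTAB without parsing the prefix again.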

Translation

1. Translation of instructions involving the register-to-register addressing mode

During pass 1 the registers can be entered as part of the symbol table itself. The value of each register is its equivalent numeric code. During pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.
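As a rough illustration of the separate register table described above, the sketch below assembles a two-byte register-to-register instruction. The register numbers and the opcodes are taken from the standard SIC/XE instruction set rather than from these notes, and the names REGISTERS, FORMAT2_OPCODES and assemble_format2 are assumptions made for this example.

```python
# Register names and their numeric codes (standard SIC/XE values, assumed here).
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6, "PC": 8, "SW": 9}

# A small subset of two-byte (register-to-register) opcodes.
FORMAT2_OPCODES = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8, "RMO": 0xAC, "ADDR": 0x90}


def assemble_format2(mnemonic, operand_field):
    """Return the object code: 8-bit opcode, then one hex digit per register."""
    regs = [r.strip() for r in operand_field.split(",")]
    r1 = REGISTERS[regs[0]]
    r2 = REGISTERS[regs[1]] if len(regs) > 1 else 0   # unused second register is 0
    return f"{FORMAT2_OPCODES[mnemonic]:02X}{r1:X}{r2:X}"


print(assemble_format2("TIXR", "T"))     # line 165 of the example program
print(assemble_format2("COMPR", "A,S"))  # line 150 of the example program
```

For TIXR T this reproduces the object code B850 shown for line 165 in the base-relative example.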

2. Translation involving register-memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For the formats and addressing modes, refer to chapter 1.

Among the instruction formats, format-3 and format-4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us the way in which the operand is to be fetched from memory.

There are two ways:

1) Program-counter relative and
2) Base-relative

A register-memory instruction can be represented using either the format-3 or the format-4 instruction format.

- In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field.
- In format 4, the instruction contains the opcode followed by a 20-bit address in the address field.

1. Program-Counter Relative

a) This mode usually uses the format-3 instruction format.
b) The instruction contains the opcode followed by a 12-bit displacement value.
c) The displacement ranges from -2048 to +2047, so it must be small enough to fit in a 12-bit field. This displacement is added to the current contents of the program counter to get the target address of the operand required by the instruction.
d) The address of the operand is thus calculated relative to the program counter: the displacement of the operand is relative to the current program counter value.
e) The following example shows how the address is calculated.
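The calculation in (c) and (d) can be sketched as follows. This is an illustration under stated assumptions rather than the assembler's actual code: the helper names pc_relative_disp and format3 are hypothetical, and the opcode 00 for LDA comes from the standard SIC/XE instruction set. The first call uses line 20 of the example program (LDA LENGTH at location 000A, with LENGTH at 0033); the second uses illustrative values for a backward reference, where the displacement is negative.

```python
def pc_relative_disp(target, pc):
    """Return the 12-bit two's-complement displacement target - (PC)."""
    disp = target - pc
    if not -2048 <= disp <= 2047:
        raise ValueError("displacement does not fit in 12 bits")
    return disp & 0xFFF


def format3(opcode, n, i, x, b, p, e, disp):
    """Pack a format-3 instruction: 6-bit opcode, flags n i x b p e, 12-bit disp."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1) | e
    return f"{first_byte:02X}{flags:X}{disp:03X}"


# Line 20: the PC holds the address of the next instruction, 000A + 3 = 000D.
disp = pc_relative_disp(target=0x0033, pc=0x000D)
print(f"{disp:03X}", format3(0x00, 1, 1, 0, 0, 1, 0, disp))

# Backward reference (illustrative values): the target lies before the PC.
print(f"{pc_relative_disp(target=0x0006, pc=0x001A):03X}")
```

The first result, 032026, matches the object code listed for line 20 in the base-relative example.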


2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register. Therefore the target address is

TA = (B) + displacement

b) This addressing mode is used when the range of the PC-relative displacement is not sufficient. Hence the operand is not addressed relative to the instruction, as it is in PC-relative addressing mode. The use of this mode is made possible by the directive BASE. Once the assembler encounters this directive, the instructions that follow may use base-relative addressing mode to calculate the target address of the operand.

c) When the NOBASE directive is used, it indicates that the base register is no longer available for calculating the target address of the operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

        LDB     #LENGTH     (instruction)
        BASE    LENGTH      (directive)
        NOBASE

For example:

Line  Loc   Label   Opcode  Operand     Object Code
12    0003          LDB     #LENGTH     69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X    57C003
165   1051          TIXR    T           B850

In the above example, the directive BASE tells the assembler that base-relative addressing can be used to calculate the target address when PC-relative addressing is not possible. The address of LENGTH is held in the base register.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register contains the address of LENGTH.

For line 160:

BUFFER is at location (0036)16
(B) = (0033)16
disp = 0036 - 0033 = (0003)16

20    000A          LDA     LENGTH      032026
175   1056  EXIT    STX     LENGTH      134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA

As a signed value this is -4134 in decimal, which lies outside the 12-bit range -2048 to +2047. PC-relative addressing is therefore not applicable, so base-relative addressing is used: disp = 0033 - (B) = 0033 - 0033 = 0000, which gives the object code 134000.
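The rule that the assembler tries PC-relative addressing first and falls back to base-relative addressing can be sketched as below. This is a minimal sketch that assumes simple addressing (n = i = 1) and format 3 only; the function names are hypothetical, and the opcodes for LDA (00), STCH (54) and STX (10) come from the standard SIC/XE instruction set. The three calls correspond to lines 20, 160 and 175 of the example.

```python
def choose_displacement(target, pc, base=None):
    """Try PC-relative first, then base-relative; return (mode, 12-bit disp)."""
    disp = target - pc
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF
    if base is not None:                 # base is None when no BASE directive is active
        disp = target - base
        if 0 <= disp <= 4095:
            return "base", disp
    raise ValueError("operand not addressable in format 3; extended format needed")


def assemble_format3(opcode, target, pc, base=None, indexed=False):
    """Assemble a format-3 instruction with simple addressing (n = i = 1)."""
    mode, disp = choose_displacement(target, pc, base)
    b, p = (1, 0) if mode == "base" else (0, 1)
    first_byte = (opcode & 0xFC) | 0b11
    flags = (int(indexed) << 3) | (b << 2) | (p << 1)
    return f"{first_byte:02X}{flags:X}{disp:03X}"


print(assemble_format3(0x00, target=0x0033, pc=0x000D, base=0x0033))                # line 20
print(assemble_format3(0x54, target=0x0036, pc=0x1051, base=0x0033, indexed=True))  # line 160
print(assemble_format3(0x10, target=0x0033, pc=0x1059, base=0x0033))                # line 175
```

Passing base=None models the situation after NOBASE, in which an out-of-range PC-relative displacement becomes an error instead of falling back to the base register.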

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is referred to in the instruction as the immediate operand, then the mode is immediate combined with PC-relative. For example, in LDB #LENGTH on line 12 (object code 69202D) the displacement 02D is the address of LENGTH relative to the program counter: 0033 - 0006 = 002D.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode.

For example:

70          J       @RETADR

The instruction transfers control to the address stored in location RETADR; that is, RETADR in turn holds the address of the operand. If the address of RETADR is 0030, the displacement assembled into the instruction is 0003.

2.2.2 Program Relocation

The need for program relocation:

- It is desirable to load and run several programs at the same time.
- The system must be able to load programs into memory wherever there is room.
- The exact starting address of the program is not known until load time.

Absolute address:

- An absolute program has its starting address specified at assembly time.
- Its addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the same amount as the program is displaced. Hence we need to change the address portion of the instruction so that we can load and execute the program at location 3000.

While some instructions undergo a change in their operand address value as the program load address changes, other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program which need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example of Program Relocation

1) The above diagram shows the concept of relocation.
   a) Initially the program is loaded at location 0000.
   b) The instruction JSUB is loaded at location 0006.
   c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.
2) The second figure shows that if the program is loaded at the new location 5000, the address in the JSUB instruction is modified to the new location 6036.
3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC.
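The three cases above follow from adding the load address to the 20-bit address field of the format-4 instruction. A small sketch, assuming the object code is handled as a hexadecimal string; the function name relocate_format4 is hypothetical.

```python
def relocate_format4(object_code, load_address):
    """Add the load address to the 20-bit address field of a format-4 instruction."""
    value = int(object_code, 16)
    address = value & 0xFFFFF                        # low 20 bits: address field
    relocated = (address + load_address) & 0xFFFFF
    return f"{(value & ~0xFFFFF) | relocated:08X}"   # opcode and flag bits are unchanged


for start in (0x0000, 0x5000, 0x7420):
    print(f"{start:04X}", relocate_format4("4B101036", start))
```

Only the address field changes; the opcode byte 4B and the flag digit 1 are the same at every load address.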

- The only parts of the program that require modification at load time are those that specify direct addresses.
- The rest of the instructions need not be modified:
  o Not a memory address (immediate addressing)
  o PC-relative, base-relative
- From the object program alone it is not possible to distinguish an address from a constant:
  o The assembler must keep some information to tell the loader.
  o An object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem:

- For an address label, its address is assigned relative to the start of the program (START 0).
- Produce a Modification record to store the starting location and the length of the address field to be modified.
- The command for the loader must also be a part of the object program.

Modification record:

- One Modification record is produced for each address to be modified.
- The length is stored in half-bytes (4 bits).
- The starting location is the location of the byte containing the leftmost bits of the address field to be modified.
- If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The object code lines at the end are the Modification records for those instructions which need to change if relocation occurs. M00000705 is the Modification record for the address field that starts at location 000007, in the statement at location 0006; the field to be modified is 5 half-bytes long.
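To make the record M00000705 concrete, the sketch below builds a Modification record and shows how a loader could apply it. It is an illustration under simplifying assumptions: the program image is a bytearray indexed from the start of the program, and the names modification_record and apply_modification are hypothetical.

```python
def modification_record(start, half_bytes=5):
    """Build 'M' + 6-digit starting location + 2-digit length in half-bytes."""
    return f"M{start:06X}{half_bytes:02X}"


def apply_modification(image, record, load_address):
    """Add load_address to the address field described by a Modification record."""
    start = int(record[1:7], 16)
    length = int(record[7:9], 16)               # field length in half-bytes
    nbytes = (length + 1) // 2                  # an odd length starts mid-byte
    field = int.from_bytes(image[start:start + nbytes], "big")
    mask = (1 << (4 * length)) - 1              # selects the rightmost half-bytes
    updated = (field & ~mask) | ((field + load_address) & mask)
    image[start:start + nbytes] = updated.to_bytes(nbytes, "big")


image = bytearray(10)
image[6:10] = bytes.fromhex("4B101036")         # +JSUB RDREC at location 0006
apply_modification(image, modification_record(0x000007), load_address=0x5000)
print(image[6:10].hex().upper())
```

Because the length is 5 half-bytes, the upper half of the byte at location 0007 (the flag digit) is preserved while the 20-bit address is relocated.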

MACHINE INDEPENDENT ASSEMBLER FEATURES

These are assembler features that are not closely related to the machine architecture:

- Literals
- Symbol-defining statements
- Expressions
- Program blocks
- Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL  LDA     =C'EOF'     032010
      002D  *       =C'EOF'             454F46

215   1062  WLOOP   TD      =X'05'      E32011
      1076  *       =X'05'              05

In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand:

- With immediate addressing, the operand value is assembled as a part of the machine instruction.
- With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

- All of the literal operands used in a program are gathered together into one or more literal pools.
- Where the literal pool should be placed:

93                  LTORG
      002D  *       =C'EOF'             454F46

- The assembler directive LTORG tells the assembler to generate a literal pool at that point.

Literal for the current value of the location counter

- The value of the location counter can be denoted by a literal operand:

        BASE    *
        LDB     =*

Handling duplicate literal operands

- The assembler should avoid storing duplicate literals.
- The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

100         LDA     =C'EOF'
125         LDA     =C'EOF'
160         LDA     =X'454F46'

Literal operands with different literal names may have the same literal value, as on lines 100 and 160. Recognizing such literals will complicate the design of an assembler.

Literal table (LITTAB)

- The basic data structure needed to process literal operands is a literal table (LITTAB).
- LITTAB contains the literal name, the operand value and length, and the address assigned to the literal.
- LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

- For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.
- When a LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.
- Update the location counter to reflect the number of bytes occupied by each literal.

Pass 2

- Search LITTAB for each literal operand encountered.
- The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.
- If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
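The Pass 1 handling of literals described above can be sketched with a dictionary keyed by the literal name. This is only one possible arrangement: the class LitTab and its methods are hypothetical names, and duplicates are recognized by comparing the defining strings, so =C'EOF' and =X'454F46' would be stored as separate entries.

```python
def literal_bytes(name):
    """Convert a literal such as =C'EOF' or =X'05' into its byte value."""
    kind, body = name[1], name[3:-1]
    if kind == "C":
        return body.encode("ascii")
    if kind == "X":
        return bytes.fromhex(body)
    raise ValueError(f"unsupported literal: {name}")


class LitTab:
    def __init__(self):
        self.entries = {}          # literal name -> value, length, address

    def add(self, name):
        """Pass 1: enter a literal once, without assigning its address."""
        if name not in self.entries:
            value = literal_bytes(name)
            self.entries[name] = {"value": value, "length": len(value), "address": None}

    def place_pool(self, locctr):
        """At LTORG or END: assign addresses and return the updated LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LitTab()
littab.add("=C'EOF'")
littab.add("=C'EOF'")              # duplicate: no new entry is made
print(f"{littab.place_pool(0x002D):04X}", littab.entries["=C'EOF'"])
```

With LOCCTR at 002D when the pool is placed, =C'EOF' receives address 002D and the value 454F46, as in the LTORG example above.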

Symbol-defining statements

Assembler directives:

- EQU
- ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol  EQU     value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

- Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

        +LDT    #4096

  we can write

MAXLEN  EQU     4096
        +LDT    #MAXLEN

- Define mnemonic names for registers, so that RMO A,X can be written in place of RMO 0,1:

A       EQU     0
X       EQU     1
L       EQU     2

- Establish and use names that reflect the logical function of the registers in the program:

BASE         EQU     R1
ACCUMULATOR  EQU     R2
INDEX        EQU     R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

        ORG     value

- "value" is a constant or an expression involving constants and previously defined symbols.
- When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB            SYMBOL      VALUE       FLAGS
(100 entries)   6 bytes     3 bytes     1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

        ORG

with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1: all terms used to specify the value of a new symbol must have been defined previously in the program. Hence some of the following sequences could not be processed by an ordinary two-pass assembler.

Allowed:

ALPHA   RESW    1
BETA    EQU     ALPHA

Disallowed:

BETA    EQU     ALPHA
ALPHA   RESW    1

Disallowed:

ALPHA   EQU     BETA
BETA    EQU     DELTA
DELTA   RESW    1

Disallowed:

        ORG     ALPHA
BYTE1   RESB    1
BYTE2   RESB    1
BYTE3   RESB    1
        ORG
ALPHA   RESB    1

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

- Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

- Individual terms in the expression may be
  - constants,
  - user-defined symbols, or
  - special terms.
- The most common special term is the current value of the location counter (often designated by *).

Types of terms

- Absolute terms: the value of an absolute term is independent of the program location.
- Relative terms: the value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1. Absolute expressions

- The value of an absolute expression is independent of the program location.
- An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.
- e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

- The value of a relative expression is relative to the beginning address of the object program.
- A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
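The pairing rule can be checked mechanically by counting relative terms with their signs: a net count of 0 gives an absolute expression, a net count of +1 gives a relative expression, and anything else is an error. The sketch below applies this to the symbols in the table above. It is a simplification that handles only decimal constants and symbols joined by +, -, * and /, and the function name expression_type is an assumption.

```python
import re

# Symbol table from the example: name -> (type, value)
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def expression_type(expr, symtab=SYMTAB):
    """Return 'A' (absolute) or 'R' (relative); raise ValueError otherwise."""
    net_relative = 0
    # Split into signed terms, e.g. "BUFEND-BUFFER" -> ("", "BUFEND"), ("-", "BUFFER")
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        factors = re.split(r"[*/]", term)
        relative = [f for f in factors if not f.isdigit() and symtab[f][0] == "R"]
        if relative and len(factors) > 1:
            raise ValueError("relative term in multiplication or division")
        if relative:
            net_relative += -1 if sign == "-" else 1
    if net_relative == 0:
        return "A"
    if net_relative == 1:
        return "R"
    raise ValueError("neither absolute nor relative")


print(expression_type("BUFEND-BUFFER"), expression_type("RETADR"))
```

The three invalid expressions listed above (BUFEND+BUFFER, 100-BUFFER and 3*BUFFER) each raise ValueError in this sketch.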

Program blocks

The program is divided into three blocks: Default (0), CDATA (1) and CBLKS (2).

- Default (0): block for the executable instructions
- CDATA (1): block for data areas that need little memory
- CBLKS (2): block for data areas that need large blocks of memory

One more column, the block number, is included in the listing when the object code is generated; this field is also added to SYMTAB. Whenever a new block is to begin, the directive USE is written followed by the block name, and LOCCTR for a new block starts at 0000.

The program block table consists of the block name, block number, starting address and length.

After finding the location counter values, the assembler generates the object code and writes the object program.
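As a sketch of how the program block table could be filled in at the end of Pass 1, the code below assigns each block a starting address equal to the sum of the lengths of the blocks before it. The block lengths used here are illustrative values only, not taken from these notes, and the function names are hypothetical.

```python
def build_block_table(lengths):
    """lengths: list of (block name, length) in block-number order."""
    table, address = [], 0
    for number, (name, length) in enumerate(lengths):
        table.append({"name": name, "number": number,
                      "address": address, "length": length})
        address += length            # the next block starts where this one ends
    return table


def absolute_address(table, block_number, relative_address):
    """Pass 2: symbol address = block starting address + address within the block."""
    return table[block_number]["address"] + relative_address


# Illustrative lengths (hexadecimal) for the three blocks.
table = build_block_table([("(default)", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
for row in table:
    print(f'{row["name"]:<10}{row["number"]}  {row["address"]:04X}  {row["length"]:04X}')
```

This is why SYMTAB needs the block number: a symbol's value within its block is only turned into an address once the block's starting address is known.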

Program Block Diagram

Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.
ii. Each control section can be loaded and relocated independently of the others.
iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

- CSECT signals the start of a new control section.
- The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for instructions that refer to EXTREF symbols changes.

For the object program, six records are needed: Header record, Define record, Refer record, Text record, Modification record and End record.

Object Program for Control Section and Program Linking

2.1 BASIC ASSEMBLER FUNCTIONS

Figure 2.1 shows an assembly language program for the basic version of SIC. The same program is used with different variations throughout this chapter to illustrate different assembler features. The line numbers in this example program are for reference only and are not part of the program. Mnemonic instructions are used in this example.

Mnemonic instruction definition: a word or acronym used in assembly language to represent a binary machine instruction operation code.

Indexed addressing is indicated by adding the modifier ",X" following the operand. For example, see line 160:

        STCH    BUFFER,X

Lines beginning with "." contain comments only.

In addition to the mnemonic instructions, the following assembler directives are used:

1. START
2. END
3. BYTE
4. WORD
5. RESB
6. RESW

Assembler directive definition: a statement in an assembly language program that gives instructions to the assembler itself and is not translated into a machine instruction is called an assembler directive.

The assembler directives are:

- START: Specify the name and starting address for the program.
- END: Indicate the end of the source program and specify the first executable instruction in the program.
- BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
- WORD: Generate a one-word integer constant.
- RESB: Reserve the indicated number of bytes for a data area.
- RESW: Reserve the indicated number of words for a data area.

In the example program, the end of a record is marked by a null character (hexadecimal 00) and the end of the file by a zero-length record.

In addition to the line numbers, the program is written using three columns: the first column contains the labels used in the source code, the second the opcode field, and the third the operand.

Label   Opcode   Operand

Figure 2.1 Assembler language program for basic SIC version

The example program in Figure 2.1 has a main module and two subroutines.

Purpose of the example program:

- Reads records from the input device (code F1)
- Copies them to the output device (code 05)
- At the end of the file, writes EOF on the output device, then executes RSUB to return to the operating system

Data transfer (RD, WD):

- A buffer is used to store each record.
- Buffering is necessary because of the different I/O rates.
- The end of each record is marked with a null character (hexadecimal 00).
- The end of the file is indicated by a zero-length record.

The main routine calls the subroutines:

- RDREC: to read a record into a buffer
- WRREC: to write the record from the buffer to the output device

2.1.1 A Simple SIC Assembler

Figure 2.2 shows the same program as in Figure 2.1 with the generated object code for each statement.

Label   Opcode   Operand

Figure 2.2 Assembler language program for basic SIC version with object code

1. The column "Loc" in the example program gives the machine address, in hexadecimal, of each line of the assembled program.
2. The program starts at address 1000. This is only an assumption: for the simple SIC machine a starting address is always assumed, and here it is not 0.
3. The translation of a source program to object code requires the following functions:
   a. Convert mnemonic operation codes to their machine language equivalents, e.g. translate STL to 14 (line 10).
   b. Convert symbolic operands to their equivalent machine addresses, e.g. translate RETADR to 1033 (line 10).
   c. Build the machine instructions in the proper format.
   d. Convert the data constants specified in the source program into their internal machine representations, e.g. translate EOF to 454F46 (line 80).
   e. Write the object program and the assembly listing.
4. All of these functions except (b) can be accomplished by sequential processing of the source program, one line at a time. Consider the statement

   10    1000  FIRST   STL     RETADR      141033

5. This instruction contains a forward reference, i.e. a reference to a label (RETADR) that is defined later in the program.
6. The assembler is unable to process this line on first reading it, because the address that will be assigned to RETADR is not yet known.
7. Hence most assemblers make two passes over the source program.
8. The first pass does little more than scan the source program for label definitions and assign addresses, such as those in the Loc column.
9. The second pass performs most of the actual translation.
10. The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead, they provide instructions to the assembler itself. For example, RESB and RESW instruct the assembler to reserve memory locations without generating data values.
11. The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.
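The translation functions (a) to (d) in point 3 can be sketched for the basic SIC format, which is an 8-bit opcode followed by an index bit and a 15-bit address. This is a minimal illustration that assumes the symbol table is already complete; the table contents and the function names assemble_sic and byte_constant are assumptions, with opcode values from the standard SIC instruction set.

```python
OPTAB = {"STL": 0x14, "LDA": 0x00, "JSUB": 0x48, "STCH": 0x54}   # small subset
SYMTAB = {"RETADR": 0x1033, "BUFFER": 0x1039}                    # assumed already built


def assemble_sic(mnemonic, operand):
    """(a)-(c): opcode lookup, symbol lookup, and packing into 3 bytes."""
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    address = SYMTAB[symbol] | (0x8000 if indexed else 0)   # x bit is the top address bit
    return f"{OPTAB[mnemonic]:02X}{address:04X}"


def byte_constant(operand):
    """(d): convert C'...' or X'...' into its hexadecimal representation."""
    kind, body = operand[0], operand[2:-1]
    return body.encode("ascii").hex().upper() if kind == "C" else body.upper()


print(assemble_sic("STL", "RETADR"), byte_constant("C'EOF'"))
```

The lookup of RETADR in SYMTAB is the step that fails in a single sequential scan, since line 10 is read before RETADR is defined.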

Object program format contains three types of records:

- Header record: contains the program name, starting address and length.
- Text record: contains the machine code and data of the program.
- End record: marks the end of the object program and specifies the address in the program where execution is to begin.

The record format is as follows:

Header record
Col. 1       H
Col. 2-7     Program name
Col. 8-13    Starting address of object program
Col. 14-19   Length of object program in bytes

Text record
Col. 1       T
Col. 2-7     Starting address for object code in this record
Col. 8-9     Length of object code in this record in bytes
Col. 10-69   Object code, represented in hexadecimal (2 columns per byte of object code)

End record
Col. 1       E
Col. 2-7     Address of first executable instruction in object program
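The column layout above translates into fixed-width string formatting. A small sketch, assuming addresses and lengths are written in hexadecimal; the function names are hypothetical and the program length 107A used in the example call is an illustrative value.

```python
def header_record(name, start, length):
    """Col 1 'H', col 2-7 name (padded to 6), col 8-13 start, col 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """Col 1 'T', col 2-7 start, col 8-9 length in bytes, col 10-69 object code."""
    body = "".join(object_codes)
    if len(body) > 60:
        raise ValueError("a Text record holds at most 30 bytes of object code")
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_executable):
    """Col 1 'E', col 2-7 address of the first executable instruction."""
    return f"E{first_executable:06X}"


print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x1000, ["141033"]))
print(end_record(0x1000))
```

The limit of 60 columns of object code follows from columns 10 to 69, at two columns per byte.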


2.1.1.1 The assembler design can be done as:

- Single-pass assembler
- Multi-pass assembler

Single-pass Assembler

In a single-pass assembler the whole process of scanning, parsing and object code conversion is done in a single pass. The only problem with this method is resolving forward references, as shown in the example below:

10    1000  FIRST   STL     RETADR      141033
--
--
--
--
95    1033  RETADR  RESW    1

In the above example, the instruction STL on line 10 stores the contents of the linkage register in RETADR. But while this instruction is being processed, the value of this symbol is not known, as it is defined only on line 95.

In a single-pass assembler the scanning, parsing and object code conversion happen simultaneously. The instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. The object code is generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available.

For this reason the design is usually done in two passes. A multi-pass assembler resolves the forward references and then converts the program into object code. The process of the multi-pass assembler is as follows.

Pass-1

- Assign addresses to all the statements.
- Save the addresses assigned to all labels, to be used in Pass-2.
- Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas for assigning the address values.
- Define the symbols in the symbol table (generate the symbol table).

Pass-2

- Assemble the instructions (translating operation codes and looking up addresses).
- Generate data values defined by BYTE, WORD, etc.
- Perform the processing of the assembler directives not done during Pass-1.
- Write the object program and assembly listing.

2.1.1.2 Assembler Design

The most important things to concentrate on are the generation of the symbol table and the resolution of forward references.

- Symbol table
  - This is created during pass 1.
  - All the labels of the instructions are symbols.
  - The table has an entry for each symbol name and its address value.
- Forward reference
  - References to symbols that are defined later in the program are called forward references.
  - In pass 1 there is no address value in the symbol table for such a symbol until its definition is reached.

2.1.1.3 Functions of the two passes of the assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program.
2. Save the addresses assigned to all labels for use in Pass 2.
3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object program)

1. Assemble instructions (translating operation codes and looking up addresses).
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done in Pass 1.
4. Write the object program and the assembly listing.

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures, together with a location counter variable:

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents.
2. Symbol Table (SYMTAB): used to store values (addresses) assigned to labels.
3. Location Counter (LOCCTR): a variable used to help in the assignment of addresses.

Location Counter (LOCCTR)

- It is initialized to the beginning address specified in the START statement.
- After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.
- Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

- Contains the mnemonic operation code and its machine language equivalent.
- Also contains information about instruction format and length.
- In Pass 1, OPTAB is used to look up and validate operation codes in the source program.
- In Pass 2, it is used to translate the operation codes to machine language.
- During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction.
- OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.
- In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it.

Symbol Table (SYMTAB)

- Includes the name and value (address) of each label in the source program, together with flags to indicate error conditions (e.g. a symbol defined in two different places).
- During Pass 1, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.
- During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address, error indicators, etc. This file is used as the input to Pass 2. This copy of the source program can also retain the results of certain operations performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

Begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 {instruction length} to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
End {Pass 1}

Explanation - Pass 1 Algorithm

- The algorithm reads the first statement. If it is START, the operand field (the address) is saved as the starting address of the program, LOCCTR is initialized to this address, and the line is written to the intermediate file.
- If there is no START statement, LOCCTR is initialized to zero.
- If a label is encountered, SYMTAB is searched for it. If the symbol already exists, an error flag is set to indicate a duplicate symbol; otherwise the symbol is entered into SYMTAB with the current LOCCTR value as its address.
- The mnemonic is then searched for in OPTAB. If it is found, the length of the instruction (3 bytes on SIC) is added to LOCCTR so that it points to the next instruction.
- If the opcode is the directive WORD, 3 is added to LOCCTR.
- If it is RESW, 3 times the number of words given in the operand is added to LOCCTR.
- If it is RESB, the number of bytes given in the operand is added to LOCCTR; if it is BYTE, the length of the constant in bytes is added.
- If the opcode matches none of these, an error flag is set to indicate an invalid operation code.
- When the END directive is reached, the program length is computed as the current LOCCTR value minus the starting address saved from the START statement.
- Each processed line is written to the intermediate file.
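The Pass 1 steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: source lines are already split into (label, opcode, operand) tuples, comment lines have been removed, the START operand is hexadecimal, and the names `pass1` and `intermediate` are illustrative rather than part of any real assembler.

```python
def pass1(lines, optab):
    """Assign an address to every statement and build SYMTAB.

    Returns (symtab, program_length, intermediate, errors), where
    intermediate holds (address, label, opcode, operand) tuples.
    """
    symtab, errors, intermediate = {}, [], []
    locctr = start = 0
    index = 0
    if lines and lines[0][1] == "START":
        start = locctr = int(lines[0][2], 16)      # START operand is hexadecimal
        intermediate.append((locctr, *lines[0]))
        index = 1
    for label, opcode, operand in lines[index:]:
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in optab or opcode == "WORD":
            locctr += 3                            # SIC instructions and words are 3 bytes
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            body = operand[2:-1]                   # text between the quotes
            locctr += len(body) if operand[0] == "C" else len(body) // 2
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, locctr - start, intermediate, errors
```

The intermediate list plays the role of the intermediate file: it carries each statement together with its assigned address into Pass 2.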

The Algorithm for Pass 2

Begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
End {Pass 2}

Explanation - Pass 2 Algorithm

- The first input line is read from the intermediate file.
- If the opcode is START, this line is written directly to the listing file.
- A Header record is written to the object program; it gives the starting address and the length of the program (calculated during Pass 1).
- The first Text record is then initialized. Comment lines are ignored.
- For each instruction, OPTAB is searched to find the machine code of the opcode.
- If there is a symbol in the operand field, SYMTAB is searched for its address, which is combined with the opcode to form the object code.
- If the symbol is not found in SYMTAB, zero is stored as the operand address and an error flag is set to mark the symbol as undefined.
- If there is no symbol in the operand field, zero is stored as the operand address and the object code instruction is assembled.
- If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, the character constant EOF is stored as the hexadecimal value 454F46).
- If the object code does not fit into the current Text record, that record is written out and a new Text record is started for the remaining object code.
- The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.
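The core of Pass 2 can be sketched in Python as two small helpers. This is a hedged sketch: it assumes the standard 24-bit SIC format (8-bit opcode, 1-bit index flag, 15-bit address) and the usual Text record layout (T, 6-digit start address, 2-digit length, at most 30 bytes of object code); the function names are illustrative.

```python
def assemble_sic(opcode, operand, optab, symtab):
    """Return (6 hex digits of object code, error message or None)."""
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    error = None
    if not symbol:
        address = 0                                # no operand, e.g. RSUB
    elif symbol in symtab:
        address = symtab[symbol]
    else:
        address, error = 0, f"undefined symbol {symbol}"
    if indexed:
        address |= 0x8000                          # set the x bit
    return f"{optab[opcode]:02X}{address:04X}", error


def text_records(items, max_bytes=30):
    """Pack (address, hex object code) pairs into Text records.

    A new record starts when the current one is full or when the next
    address is not contiguous (for example after RESW or RESB).
    """
    records, start, buf = [], None, ""
    for address, code in items:
        size = len(code) // 2
        contiguous = start is not None and address == start + len(buf) // 2
        if not contiguous or len(buf) // 2 + size > max_bytes:
            if buf:
                records.append(f"T{start:06X}{len(buf) // 2:02X}{buf}")
            start, buf = address, ""
        buf += code
    if buf:
        records.append(f"T{start:06X}{len(buf) // 2:02X}{buf}")
    return records
```

With RETADR at address 1033, `assemble_sic("STL", "RETADR", ...)` yields 141033, the object code quoted for line 10 of the basic SIC program.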

Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. On the SIC machine there are only limited instruction formats and therefore limited addressing modes. Instructions have a single operand, and that operand is always a memory reference, so every operand fetch costs a memory access. The improved SIC/XE machine provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design of an assembler therefore has to consider two things: machine-dependent features and machine-independent features.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

- Instruction formats and addressing modes
- Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC (SIC/XE) is considered.

200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    CLEAR   X           CLEAR LOOP COUNTER
212            LDT     LENGTH
215   WLOOP    TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIXR    T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240            JLT     WLOOP
245            RSUB
250   OUTPUT   BYTE    X'05'
255            END     FIRST

Figure 2.4 Example program of a SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

- Indirect addressing is indicated by adding the prefix @ to the operand (line 70).
- Immediate operands are denoted with the prefix # (lines 25, 55, 133).
- Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.
- The assembler directive BASE (line 13) is used in conjunction with base relative addressing.
- The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.
- Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.
- Immediate and indirect addressing have also been used as much as possible.
- Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.
- With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.
- The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label    Opcode   Operand     Object Code

5           COPY     START    0
10          FIRST    STL      RETADR
12                   LDB      #LENGTH
13                   BASE     LENGTH
15          CLOOP    +JSUB    RDREC
20                   LDA      LENGTH
25                   COMP     #0
30                   JEQ      ENDFIL
35                   +JSUB    WRREC
40                   J        CLOOP
45          ENDFIL   LDA      EOF
50                   STA      BUFFER
55                   LDA      #3
60                   STA      LENGTH
65                   +JSUB    WRREC
70                   J        @RETADR
80          EOF      BYTE     C'EOF'
95          RETADR   RESW     1
100         LENGTH   RESW     1
105         BUFFER   RESB     4096
110         .
115         .        SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125         RDREC    CLEAR    X
130                  CLEAR    A
132                  CLEAR    S
133                  +LDT     #4096
135         RLOOP    TD       INPUT
140                  JEQ      RLOOP
145                  RD       INPUT
150                  COMPR    A,S
155                  JEQ      EXIT
160                  STCH     BUFFER,X
165                  TIXR     T
170                  JLT      RLOOP
175         EXIT     STX      LENGTH
180                  RSUB
185         INPUT    BYTE     X'F1'
195         .
200         .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210         WRREC    CLEAR    X
212                  LDT      LENGTH
215         WLOOP    TD       OUTPUT
220                  JEQ      WLOOP
225                  LDCH     BUFFER,X
230                  WD       OUTPUT
235                  TIXR     T
240                  JLT      WLOOP
245                  RSUB
250         OUTPUT   BYTE     X'05'
255                  END      FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable, the word size is 3 bytes, and the size of the memory is 2^15 bytes (32,768 bytes). Accordingly it supports only one instruction format. Operands are addressed either directly or with the help of the index register X, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

- 1-byte instructions (format 1)
- 2-byte instructions (format 2)
- 3-byte instructions (format 3)
- 4-byte instructions (format 4)

Instructions can be

- register-to-register instructions,
- instructions with one operand in memory and the other in the accumulator (single-operand instructions), or
- extended-format instructions.

Addressing modes are

- PC-relative or base-relative addressing: op m
- Indirect addressing: op @m
- Immediate addressing: op #c
- Extended format: +op m
- Indexed addressing: op m,X
- Register-to-register instructions
- Larger memory allows multiprogramming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the register names can be entered into the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.

2. Translation of register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are of the register-memory type: one operand is always in a register and the other is in memory. The addressing mode tells how the operand is to be fetched from memory.

There are two ways of addressing the operand relative to a register:

1) Program-counter relative, and
2) Base-relative

- A register-memory instruction can be written using either the format 3 or the format 4 instruction format.
- In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field.
- In format 4, the instruction contains the opcode followed by a 20-bit address in the address field, so no displacement needs to be calculated.

1. Program-Counter Relative

a) This mode uses the format 3 instruction format.
b) The instruction contains the opcode followed by a 12-bit displacement value.
c) The displacement is a signed value in the range -2048 to +2047, so it must be small enough to fit in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.
d) The address of the operand is thus calculated relative to the program counter, which at that point holds the address of the next instruction.
e) The following example shows how the address is calculated.
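The PC-relative calculation can be sketched in Python. This is an illustrative sketch: the helper names `pc_relative_disp` and `format3` are assumptions, and the bit layout used is the standard SIC/XE format 3 (6-bit opcode, flags n i x b p e, 12-bit displacement). The values checked come from the listing in these notes: LDB #LENGTH at location 0003 (object code 69202D) and LDA LENGTH at location 000A (object code 032026), with LENGTH at 0033.

```python
def pc_relative_disp(target, pc):
    """Return the 12-bit displacement field, or None if out of range.

    pc is the address of the NEXT instruction (current location + 3).
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        return disp & 0xFFF                        # two's complement in 12 bits
    return None


def format3(opcode, n, i, x, b, p, e, disp):
    """Pack opcode, nixbpe flags and displacement into 6 hex digits."""
    word = ((opcode & 0xFC) | (n << 1) | i) << 16
    word |= (x << 15) | (b << 14) | (p << 13) | (e << 12) | (disp & 0xFFF)
    return f"{word:06X}"


# LDB #LENGTH at 0003: (PC) = 0006, disp = 0033 - 0006 = 02D, immediate (n=0, i=1)
ldb = format3(0x68, 0, 1, 0, 0, 1, 0, pc_relative_disp(0x0033, 0x0006))
# LDA LENGTH at 000A: (PC) = 000D, disp = 0033 - 000D = 026, simple (n=1, i=1)
lda = format3(0x00, 1, 1, 0, 0, 1, 0, pc_relative_disp(0x0033, 0x000D))
```

A backward reference produces a negative displacement, which is stored in two's complement form in the 12-bit field.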


2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register B. The target address is

TA = (B) + disp

b) This mode is used when the displacement is too large for PC-relative addressing; the operand is then addressed relative to the base register rather than to the instruction, and disp is an unsigned value in the range 0 to 4095. The programmer tells the assembler what the base register contains by using the directive BASE. Once the assembler has seen this directive, it can use base-relative addressing to calculate the displacement of an operand.

c) The NOBASE directive informs the assembler that the base register is no longer used to calculate the target address of the operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB      #LENGTH   (instruction)
BASE     LENGTH    (directive)
NOBASE

For example

12    0003            LDB     #LENGTH     69202D
13                    BASE    LENGTH
100   0033   LENGTH   RESW    1
105   0036   BUFFER   RESB    4096
160   104E            STCH    BUFFER,X    57C003
165   1051            TIXR    T           B850

In the above example the LDB instruction loads the address of LENGTH (0033) into the base register, and the BASE directive explicitly tells the assembler that the base register holds this value. For line 160 the PC-relative displacement (0036 - 1051) does not fit in 12 bits, so base-relative addressing is used:

BUFFER is at location 0036 (hexadecimal)
(B) = 0033 (hexadecimal)
disp = 0036 - 0033 = 0003 (hexadecimal)

20    000A            LDA     LENGTH      032026
175   1056   EXIT     STX     LENGTH      134000

Consider line 175. If we use PC-relative:

disp = TA - (PC) = 0033 - 1059 = EFDA (the 16-bit two's complement form of -1026 hexadecimal, which lies outside the 12-bit range)

PC-relative addressing is not applicable here, so base-relative addressing is used instead: disp = 0033 - (B) = 0033 - 0033 = 0, giving the object code 134000. Line 20, by contrast, is close enough to LENGTH to be assembled PC-relative (032026).
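The choice between the two relative modes can be sketched in Python. This is a minimal sketch assuming the assembler tries PC-relative first and falls back to base-relative only when a BASE directive is in effect; the function name `choose_displacement` is illustrative.

```python
def choose_displacement(target, pc, base=None):
    """Return (mode, disp) using PC-relative first, then base-relative.

    pc is the address of the next instruction; base is the value declared
    with BASE, or None when no BASE directive is in effect (NOBASE).
    Returns None when neither mode can reach the target.
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF                  # signed 12-bit field
    if base is not None and 0 <= target - base <= 4095:
        return "base", target - base               # unsigned 12-bit field
    return None


# Line 175: STX LENGTH at 1056, (PC) = 1059, LENGTH = 0033, (B) = 0033
line_175 = choose_displacement(0x0033, 0x1059, base=0x0033)
# Line 160: STCH BUFFER,X at 104E, (PC) = 1051, BUFFER = 0036
line_160 = choose_displacement(0x0036, 0x1051, base=0x0033)
```

When the function returns None, the instruction cannot be assembled in format 3, and the programmer would have to use the extended format (+op) instead.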

3. Immediate Addressing Mode

In this mode no memory reference is involved: the operand is part of the instruction itself, so the target address is used directly as the operand value.

If a symbol is used as the immediate operand, its address is the operand value, and the instruction is assembled as immediate combined with PC-relative mode.

4. Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of a location that contains the address of the operand. The address of that location is found using PC-relative addressing.

For example, the instruction J @RETADR (line 70) transfers control to the address stored in location RETADR. The instruction is at location 002A, so (PC) = 002D; if the address of RETADR is 0030, the displacement is 0030 - 002D = 003.

2.2.2 Program Relocation

The need for program relocation

- It is desirable to load and run several programs at the same time.
- The system must be able to load programs into memory wherever there is room.
- The exact starting address of the program is not known until load time.

Absolute Address

- An absolute program has its starting address specified at assembly time.
- The addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

- The addresses shift by the same amount as the program itself. Hence the address portion of the instruction must be changed so that the program can be loaded and executed at location 3000.
- Only the instructions whose operand address changes with the load address need this change; other parts of the program remain the same regardless of where the program is loaded.
- Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.
- An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example: Program Relocation

1) The above diagram shows the concept of relocation.
   a) Initially the program is loaded at location 0000.
   b) The instruction JSUB is loaded at location 0006.
   c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.
2) The second figure shows the program loaded at the new location 5000. The address in the JSUB instruction is modified to the new location 6036.
3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC.

- The only parts of the program that require modification at load time are those that specify direct addresses.
- The rest of the instructions need not be modified:
  o operands that are not memory addresses (immediate addressing)
  o PC-relative and base-relative addressing
- From the object program alone it is not possible to distinguish an address from a constant.
  o The assembler must keep some information to tell the loader.
  o The object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem

- For an address label, the address is assigned relative to the start of the program (START 0).
- Produce a Modification record to store the starting location and the length of the address field to be modified.
- The command for the loader must also be a part of the object program.

Modification record

- One Modification record is generated for each address to be modified.
- The length is stored in half-bytes (4 bits each).
- The starting location is the location of the byte containing the leftmost bits of the address field to be modified.
- If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that need to change if relocation occurs. For example, M00000705 is the Modification record for the address field that begins at location 0007; the field to be modified is 5 half-bytes long.
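How a loader might apply a Modification record such as M00000705 can be sketched in Python. This is an illustrative sketch, not a real loader: the function names are assumptions, the program image is modeled as a `bytearray` indexed by program-relative address, and only the low-order half-bytes named in the record are changed.

```python
def modification_record(location, half_bytes):
    """Build an M record: 'M' + 6-digit start location + 2-digit length."""
    return f"M{location:06X}{half_bytes:02X}"


def relocate(memory, location, half_bytes, load_address):
    """Add load_address to the address field that starts at `location`.

    When half_bytes is odd, the field starts in the middle of the first
    byte, so the high-order half-byte of that byte is left unchanged.
    """
    n_bytes = (half_bytes + 1) // 2
    field = int.from_bytes(memory[location:location + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1             # bits covered by the field
    updated = (field & ~mask) | ((field + load_address) & mask)
    memory[location:location + n_bytes] = updated.to_bytes(n_bytes, "big")


# +JSUB RDREC assembled at location 0006 as 4B101036
image = bytearray(10)
image[6:10] = bytes.fromhex("4B101036")
relocate(image, 7, 5, 0x5000)                      # program loaded at 5000
```

After the call the instruction reads 4B106036, matching the relocated address 6036 in the example; loading at 7420 instead would give 4B108456.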

Machine-independent assembler features

Assembler features not closely related to machine architecture:

- Literals
- Symbol-defining statements
- Expressions
- Program blocks
- Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A   ENDFIL   LDA    =C'EOF'    032010
      002D            =C'EOF'           454F46
215   1062   WLOOP    TD     =X'05'     E32011
      1076            =X'05'            05

- In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

- With immediate addressing, the operand value is assembled as a part of the machine instruction.
- With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

- All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93           LTORG
      002D   =C'EOF'   454F46

- The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for the current value of the location counter

- The value of the location counter can be denoted by a literal operand:

      BASE   *
      LDB    =*

Handling duplicate literal operands

- The assembler should avoid storing duplicate literals.
- The easiest way to recognize duplicate literals is by comparing the character strings that define them.

100   LDA   =C'EOF'
125   LDA   =C'EOF'
160   LDA   =X'454F46'

- Literal operands with different literal names may have the same literal value, as lines 125 and 160 show. Recognizing such operands will complicate the design of an assembler.

Literal table (LITTAB)

- The basic data structure needed to process literal operands is a literal table (LITTAB).
- For each literal, LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.
- LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

- For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address.
- When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal not yet placed.
- The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

- Search LITTAB for each literal operand encountered to obtain its address.
- The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.
- If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
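The Pass 1 handling of literals can be sketched in Python. This is a hedged sketch: the class name `LiteralTable` and its methods are illustrative, literals are keyed by name (so =C'EOF' and =X'454F46' count as different literals), and only the C'...' and X'...' forms are handled.

```python
def literal_value(name):
    """Return the constant of a literal such as =C'EOF' or =X'05' as hex digits."""
    body = name[3:-1]                              # text between the quotes
    return body.encode("ascii").hex().upper() if name[1] == "C" else body.upper()


class LiteralTable:
    def __init__(self):
        self.entries = {}                          # name -> value, length, address

    def add(self, name):
        """Pass 1: record a literal once; its address stays unassigned."""
        if name not in self.entries:
            value = literal_value(name)
            self.entries[name] = {"value": value, "length": len(value) // 2, "address": None}

    def place_pool(self, locctr):
        """On LTORG or END: give each pending literal an address.

        Returns the updated location counter.
        """
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LiteralTable()
littab.add("=C'EOF'")
littab.add("=C'EOF'")                              # duplicate, stored only once
after_ltorg = littab.place_pool(0x002D)            # LTORG reached at LOCCTR = 002D
```

Because Python dictionaries preserve insertion order, literals are placed in the pool in the order in which they were first encountered.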

Symbol-defining statements

Assembler directives

- EQU
- ORG

Assembler directive EQU

- Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values:

      symbol   EQU   value

- When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

- Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

               +LDT   #4096

  we can write

      MAXLEN   EQU    4096
               +LDT   #MAXLEN

- Define mnemonic names for registers:

      A   EQU   0
      X   EQU   1
      L   EQU   2

  so that RMO 0,1 can be written as RMO A,X.

- Establish and use names that reflect the logical function of the registers in the program:

      BASE          EQU   R1
      ACCUMULATOR   EQU   R2
      INDEX         EQU   R3

Assembler directive ORG

- The assembler directive ORG is usually used to assign values to symbols indirectly:

      ORG   value

- "value" is a constant or an expression involving constants and previously defined symbols.
- When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

      STAB (100 entries)   SYMBOL    VALUE     FLAGS
                           6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

      ORG

to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1: all terms used to specify the value of a new symbol must have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

      ALPHA   RESW   1
      BETA    EQU    ALPHA        (allowed)

      BETA    EQU    ALPHA
      ALPHA   RESW   1            (disallowed)

      ALPHA   EQU    BETA
      BETA    EQU    DELTA
      DELTA   RESW   1            (disallowed)

              ORG    ALPHA
      BYTE1   RESB   1
      BYTE2   RESB   1
      BYTE3   RESB   1
              ORG
      ALPHA   RESB   1            (disallowed)

Expressions

- Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.
- Each such expression must be evaluated by the assembler to produce a single operand address or value.
- Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.
- Individual terms in the expression may be
  - constants,
  - user-defined symbols, or
  - special terms.
- The most common special term is the current value of the location counter (often designated by *).

Types of terms

- Absolute terms: the value of an absolute term is independent of the program location.
- Relative terms: the value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1. Absolute expressions

- The value of an absolute expression is independent of the program location.
- An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.
- E.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

- The value of a relative expression is relative to the beginning address of the object program.
- A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

      Symbol   Type   Value
      MAXLEN   A      1000
      BUFEND   R      1036
      BUFFER   R      0036
      RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
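The pairing rule for relative terms can be sketched in Python by counting signed relative terms. This is an illustrative sketch under stated assumptions: the function name `classify` is hypothetical, SYMTAB is modeled as a dictionary of (type, value) pairs, anything not found in it (such as a numeric constant) is treated as absolute, and the special term * is not handled.

```python
import re


def classify(expression, symtab):
    """Return 'absolute', 'relative' or 'error' for an operand expression."""
    relative = 0                                   # net count of relative terms
    terms = re.findall(r"([+-]?)([^+-]+)", expression.replace(" ", ""))
    for sign, term in terms:
        factors = re.split(r"[*/]", term)
        kinds = [symtab[f][0] if f in symtab else "A" for f in factors]
        if "R" in kinds:
            if len(factors) > 1:
                return "error"                     # relative term in * or /
            relative += -1 if sign == "-" else 1
    # 0: all relative terms pair off; +1: one unpaired positive term remains
    return {0: "absolute", 1: "relative"}.get(relative, "error")


SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}
```

With this table, BUFEND-BUFFER is classified as absolute, while BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are all reported as errors.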

Program blocks

The program is divided into three blocks: the default block (0), CDATA (1), and CBLKS (2).

- Default (0): the block for executable instructions.
- CDATA (1): the block for data areas that need little memory.
- CBLKS (2): the block for data areas that need more memory.

One more column, the block number, is included in the listing when object code is generated, and this field is added to SYMTAB. Whenever a new block begins, the source program contains a USE statement with the block name, and LOCCTR for a block seen for the first time is set to 0000.

The program block table consists of the block name, block number, starting address, and length.

After finding the location counter values, the assembler generates the object code and writes the object program.
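How the program block table might be filled in at the end of Pass 1 can be sketched in Python. This is only a sketch: the function names are illustrative, and the block lengths used in the example (0066, 000B, and 1000 hexadecimal) are assumed values chosen for illustration, not figures taken from these notes.

```python
def layout_blocks(lengths):
    """Build the block table from (name, length) pairs in block-number order.

    Each block starts where the previous one ends; block 0 starts at 0.
    """
    table, address = [], 0
    for number, (name, length) in enumerate(lengths):
        table.append({"name": name, "number": number,
                      "address": address, "length": length})
        address += length
    return table


def absolute_address(table, block_number, relative_address):
    """Convert a block-relative address (as stored in SYMTAB) for Pass 2."""
    return table[block_number]["address"] + relative_address


# Assumed block lengths, for illustration only
blocks = layout_blocks([("(default)", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
```

A symbol stored in SYMTAB as (block 1, relative address 0003) would then resolve to 0066 + 0003 = 0069 when object code is generated.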


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.
ii. Each control section can be loaded and relocated independently of the others.
iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

- CSECT signals the start of a new control section.
- The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

- The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

- The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

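Object code generation changes only for instructions that refer to EXTREF symbols: the assembler cannot know their addresses, so it inserts an address of zero and leaves the final value to the loader.

For the object program, six record types are needed: Header record, Define record, Refer record, Text record, Modification record, and End record.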


Object Program for Control Section and Program Linking


- WORD: generates a one-word integer constant.
- RESB: reserves the indicated number of bytes for a data area.
- RESW: reserves the indicated number of words for a data area.
- End of record: a null character (hexadecimal 00).
- End of file: a zero-length record.

In addition to the line numbers, the program is written using three columns: the first column holds the labels used in the source code, the second the opcode field, and the third the operand.

Label   Opcode   Operand

Figure 2.1 Assembler language program for basic SIC version

The example program considered in Figure 2.1 has a main module and two subroutines.

Purpose of the example program:

- Reads records from the input device (code F1).
- Copies them to the output device (code 05).
- At the end of the file, writes EOF on the output device, then executes RSUB to return to the operating system.

Data transfer (RD, WD):

- A buffer is used to store each record.
- Buffering is necessary because of the different I/O rates.
- The end of each record is marked with a null character (hexadecimal 00).
- The end of the file is indicated by a zero-length record.

The main routine calls two subroutines:

- RDREC: reads a record into a buffer.
- WRREC: writes the record from the buffer to the output device.

2.1.1 A Simple SIC Assembler

Figure 2.2 shows the same program as in Figure 2.1 with the generated object code for each statement.

Label   Opcode   Operand

Figure 2.2 Assembler language program for basic SIC version with object code

1. The column "Loc" in the example program gives the machine address, in hexadecimal, of each line of the assembled program.

2. The program starts at address 1000. This is only an assumption: for the simple SIC machine a starting address is always assumed, and it need not be 0.

3. Translating the source program to object code requires the following functions:
a. Convert mnemonic operation codes to their machine language equivalents. E.g., translate STL to 14 (line 10).
b. Convert symbolic operands to their equivalent machine addresses. E.g., translate RETADR to 1033 (line 10).
c. Build the machine instructions in the proper format.
d. Convert the data constants specified in the source program into their internal machine representations. E.g., translate EOF to 454F46 (line 80).
e. Write the object program and the assembly listing.

4. All of these functions except function b can be accomplished by sequential processing of the source program, one line at a time. Consider the statement

10 1000 FIRST STL RETADR 141033

5. This instruction contains a forward reference, i.e., a reference to a label (RETADR) that is defined later in the program.

6. The assembler cannot translate this line when it first reads it, because the address that will be assigned to RETADR is not yet known.

7. Hence most assemblers make two passes over the source program.

8. The first pass does little more than scan the source program for label definitions and assign addresses, such as those in the Loc column.

9. The second pass performs most of the actual translation.

10. The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values.

11. The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.
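The translation functions in item 3 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the tiny `OPTAB` and `SYMTAB` dictionaries and the helper names `assemble_sic` and `char_constant` are hypothetical, and the values come from the examples quoted above (STL = 14, RETADR = 1033, EOF = 454F46).

```python
# Hypothetical, minimal tables holding only the values quoted in the text.
OPTAB = {"STL": 0x14}
SYMTAB = {"RETADR": 0x1033}

def assemble_sic(mnemonic, operand):
    """Build a 3-byte SIC instruction: 8-bit opcode followed by a 16-bit
    field (index bit x = 0 here, then the 15-bit address)."""
    return f"{OPTAB[mnemonic]:02X}{SYMTAB[operand]:04X}"

def char_constant(text):
    """Convert a character constant to its hexadecimal (ASCII) form."""
    return text.encode("ascii").hex().upper()

stl_code = assemble_sic("STL", "RETADR")   # functions a, b and c
eof_code = char_constant("EOF")            # function d
print(stl_code, eof_code)
```

The sketch assumes RETADR is already in the symbol table, which is exactly what a forward reference prevents on a first reading of line 10.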

The object program format contains three types of records:

Header record: Contains the program name, starting address, and length.
Text record: Contains the machine code and data of the program.
End record: Marks the end of the object program and specifies the address in the program where execution is to begin.

The record formats are as follows:

Header record
Col. 1       H
Col. 2-7     Program name
Col. 8-13    Starting address of object program
Col. 14-19   Length of object program in bytes

Text record
Col. 1       T
Col. 2-7     Starting address for object code in this record
Col. 8-9     Length of object code in this record in bytes
Col. 10-69   Object code, represented in hexadecimal (2 columns per byte of object code)

End record
Col. 1       E
Col. 2-7     Address of first executable instruction in object program
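The column layout above maps directly onto fixed-width string formatting. The following Python sketch is an illustration only: the function names are hypothetical, the numeric fields are assumed to be written in hexadecimal, and the program name, length and object codes in the usage lines are made-up sample values.

```python
def header_record(name, start, length):
    """H, 6-column name (left-justified), 6-digit start, 6-digit length."""
    return f"H{name:<6}{start:06X}{length:06X}"

def text_record(start, object_codes):
    """T, 6-digit start address, 2-digit byte count, then the object code."""
    body = "".join(object_codes)
    if len(body) > 60:  # columns 10-69 hold at most 30 bytes
        raise ValueError("object code does not fit in one Text record")
    return f"T{start:06X}{len(body) // 2:02X}{body}"

def end_record(first_executable):
    """E followed by the address of the first executable instruction."""
    return f"E{first_executable:06X}"

# Sample values, chosen only to show the layout.
h = header_record("COPY", 0x1000, 0x107A)
t = text_record(0x1000, ["141033", "482039", "001036"])
e = end_record(0x1000)
print(h, t, e, sep="\n")
```

The length field of the Text record counts bytes, so it is half the number of hexadecimal columns used by the object code.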


2.1.1.1 Assembler design options

The assembler can be designed as a

- Single-pass assembler
- Multi-pass assembler

Single-pass Assembler

In a single-pass assembler, scanning, parsing, and object code generation are all done in one pass over the source program. The main problem with this method is resolving forward references, as the following example shows:

10   1000   FIRST    STL    RETADR   141033
...
95   1033   RETADR   RESW   1

In line 10, the instruction STL stores the contents of the linkage register into RETADR. While this instruction is being processed, however, the address of RETADR is not known, because the symbol is not defined until line 95.

In a single-pass assembler, scanning, parsing, and object code generation happen together. Each instruction is fetched, scanned for tokens, and checked for syntactic and semantic validity. If it is valid, it must be converted to its object code. That object code combines the opcode of STL with the address of RETADR, and the address is not yet available.

For this reason the design is usually done in two passes. A multi-pass assembler first resolves the forward references and then generates the object code. The process is as follows:

Pass-1
- Assign addresses to all the statements
- Save the addresses assigned to all labels, to be used in Pass-2
- Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas when assigning addresses
- Define the symbols in the symbol table (generate the symbol table)

Pass-2
- Assemble the instructions (translating operation codes and looking up addresses)
- Generate data values defined by BYTE, WORD, etc.
- Perform the processing of assembler directives not done during Pass-1
- Write the object program and the assembly listing

2.1.1.2 Assembler Design

The two most important concerns are generating the symbol table and resolving forward references.

- Symbol Table
  - It is created during Pass 1
  - All the labels of the instructions are symbols
  - Each entry holds the symbol name and its address value
- Forward reference
  - A reference to a symbol that is defined later in the program is called a forward reference
  - The symbol table holds no address for such a symbol until Pass 1 reaches its definition

2.1.1.3 Functions of the two passes of the assembler

Pass 1 (Define symbols)
1. Assign addresses to all statements in the program
2. Save the addresses assigned to all labels for use in Pass 2
3. Perform some processing of assembler directives

Pass 2 (Assemble instructions and generate object program)
1. Assemble instructions (translating operation codes and looking up addresses)
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done in Pass 1
4. Write the object program and the assembly listing

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures and one variable:

1. Operation Code Table (OPTAB): Used to look up mnemonic operation codes and translate them into their machine language equivalents
2. Symbol Table (SYMTAB): Used to store the values (addresses) assigned to labels
3. Location Counter (LOCCTR): A variable used to help in the assignment of addresses

Location Counter (LOCCTR)
- It is initialized to the beginning address specified in the START statement
- After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR
- Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label

Operation Code Table (OPTAB)
- Contains each mnemonic operation code and its machine language equivalent
- Also contains information about instruction format and length
- In Pass 1, OPTAB is used to look up and validate operation codes in the source program
- In Pass 2, it is used to translate the operation codes to machine language
- During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction
- OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching
- In most cases OPTAB is a static table, i.e., entries are not added to or deleted from it
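A hash-table OPTAB can be sketched with a Python dictionary, which is itself a hash table keyed here by mnemonic. This is a minimal illustration: the table holds only four SIC instructions, and the entry layout (opcode, length in bytes) and the helper name `lookup_opcode` are assumptions made for the example.

```python
# Static table: built once, never modified while assembling.
# Each entry maps a mnemonic to (machine opcode, instruction length in bytes).
OPTAB = {
    "LDA":  (0x00, 3),
    "STL":  (0x14, 3),
    "JSUB": (0x48, 3),
    "RSUB": (0x4C, 3),
}

def lookup_opcode(mnemonic):
    """Return (opcode, length) for a valid mnemonic, or None if it is invalid."""
    return OPTAB.get(mnemonic)

print(lookup_opcode("STL"))    # used in Pass 1 to validate and get the length
print(lookup_opcode("FOO"))    # an invalid operation code
```

Pass 1 only needs the length (to advance LOCCTR) and the fact that the lookup succeeded; Pass 2 uses the opcode value itself.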

Symbol Table (SYMTAB)
- Includes the name and value (address) of each label in the source program, and flags to indicate error conditions (e.g., a symbol defined in two different places)
- During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses
- During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address, error indicators, etc. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

begin
  read first input line
  if OPCODE = 'START' then
    begin
      save [OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 {instruction length} to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * [OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add [OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end {Pass 1}

Explanation - Pass 1 Algorithm

- The algorithm reads the first statement. If it is START, the operand field (the address) is saved as the starting address of the program, LOCCTR is initialized to this address, and the line is written to the intermediate file.
- If there is no START statement, LOCCTR is initialized to zero.
- If a label is encountered, the symbol is entered in the symbol table along with its address (the current value of LOCCTR). If the symbol is already in the table, an error flag is set to indicate a duplicate symbol.
- Next the mnemonic is searched for in OPTAB. If it is found, the length of the instruction is added to LOCCTR so that it points to the next instruction.
- If the opcode is the directive WORD, 3 is added to LOCCTR.
- If it is RESW, 3 times the number of words reserved is added to LOCCTR.
- If it is BYTE, the length of the constant in bytes is added to LOCCTR; if it is RESB, the number of bytes reserved is added.
- If it is the END directive, the end of the program has been reached. The program length is computed as the current LOCCTR minus the starting address saved from the START statement.
- Each processed line is written to the intermediate file.
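The Pass 1 steps can be sketched in Python as follows. This is an illustrative sketch, not the notes' own implementation: source lines are assumed to be already split into (label, opcode, operand) tuples with comment lines removed, the intermediate file is modelled as a list, the `OPTAB` set holds only the mnemonics used here, and the sample program is made up for the example.

```python
# Mnemonics known to this sketch; every SIC instruction is 3 bytes long.
OPTAB = {"STL", "LDA", "RSUB"}

def constant_length(operand):
    """Length in bytes of a BYTE operand such as C'EOF' or X'F1'."""
    kind, value = operand[0], operand[2:-1]
    return len(value) if kind == "C" else len(value) // 2

def pass1(lines):
    """Return (SYMTAB, intermediate list, program length, errors)."""
    symtab, intermediate, errors = {}, [], []
    start = int(lines[0][2], 16) if lines[0][1] == "START" else 0
    locctr = start
    for label, opcode, operand in lines:
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "START":
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += constant_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, intermediate, locctr - start, errors

program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("", "LDA", "THREE"),
    ("", "RSUB", ""),
    ("THREE", "WORD", "3"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    ("", "END", "FIRST"),
]
symtab, intermediate, length, errors = pass1(program)
print({name: hex(addr) for name, addr in symtab.items()}, hex(length))
```

Note that RETADR is used on the second line but only receives its address when its own definition is reached, which is why the operand cannot be translated in this pass.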

The Algorithm for Pass 2

begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in the OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
end {Pass 2}

Explanation - Pass 2 Algorithm

- The first input line is read from the intermediate file.
- If the opcode is START, this line is written directly to the listing file.
- A Header record is written to the object program. It gives the program name, the starting address, and the length of the program (which was calculated during Pass 1).
- The first Text record is then initialized. Comment lines are ignored.
- For each instruction, OPTAB is searched to find the object code of the opcode.
- If there is a symbol in the operand field, the symbol table is searched for its address, which is combined with the object code of the opcode.
- If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to mark the symbol as undefined.
- If there is no symbol in the operand field, zero is stored as the operand address and the object code instruction is assembled.
- If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, the character constant EOF is stored as its hexadecimal equivalent 454F46).
- If the object code does not fit into the current Text record, that record is written out and a new Text record is started for the object code that follows.
- The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.
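The Pass 2 steps can be sketched in Python in the same spirit. The sketch rests on several assumptions: the intermediate file is a list of (address, label, opcode, operand) tuples, SYMTAB is a plain dictionary, the `OPTAB` here holds only three SIC opcodes, and RESW/RESB simply end the current Text record. The input data at the bottom is a made-up sample.

```python
OPTAB = {"STL": 0x14, "LDA": 0x00, "RSUB": 0x4C}

def constant(opcode, operand):
    """Object code for WORD n, BYTE C'...' or BYTE X'...'."""
    if opcode == "WORD":
        return f"{int(operand):06X}"
    kind, value = operand[0], operand[2:-1]
    return value.encode("ascii").hex().upper() if kind == "C" else value.upper()

def pass2(intermediate, symtab, name, start, length):
    records = [f"H{name:<6}{start:06X}{length:06X}"]
    errors = []
    text_start, text = None, ""
    first_exec = start

    def flush():
        """Write the current Text record, if any, and start a new one."""
        nonlocal text_start, text
        if text:
            records.append(f"T{text_start:06X}{len(text) // 2:02X}{text}")
        text_start, text = None, ""

    for loc, label, opcode, operand in intermediate:
        if opcode == "START":
            continue
        if opcode == "END":
            first_exec = symtab.get(operand, start)
            break
        if opcode in OPTAB:
            address = 0
            if operand:
                indexed = operand.endswith(",X")
                symbol = operand[:-2] if indexed else operand
                if symbol in symtab:
                    address = symtab[symbol]
                else:
                    errors.append(f"undefined symbol {symbol}")
                if indexed:
                    address |= 0x8000  # set the x bit
            code = f"{OPTAB[opcode]:02X}{address:04X}"
        elif opcode in ("BYTE", "WORD"):
            code = constant(opcode, operand)
        else:  # RESW / RESB generate no object code
            flush()
            continue
        if len(text) + len(code) > 60:  # at most 30 bytes per Text record
            flush()
        if not text:
            text_start = loc
        text += code
    flush()
    records.append(f"E{first_exec:06X}")
    return records, errors

symtab = {"FIRST": 0x1000, "THREE": 0x1009, "EOF": 0x100C, "RETADR": 0x100F}
intermediate = [
    (0x1000, "COPY", "START", "1000"),
    (0x1000, "FIRST", "STL", "RETADR"),
    (0x1003, "", "LDA", "THREE"),
    (0x1006, "", "RSUB", ""),
    (0x1009, "THREE", "WORD", "3"),
    (0x100C, "EOF", "BYTE", "C'EOF'"),
    (0x100F, "RETADR", "RESW", "1"),
    (0x1012, "", "END", "FIRST"),
]
records, errors = pass2(intermediate, symtab, "COPY", 0x1000, 0x12)
print("\n".join(records))
```

The listing file is omitted to keep the sketch short; a fuller version would write each source line together with its object code.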


Design and Implementation Issues

Some features of the assembler depend on the architecture of the machine. If the program is for the SIC machine, there is only one instruction format and hence a limited set of addressing modes. There are only single-operand instructions, and the operand is always a memory reference. Anything that has to be fetched from memory takes more time. The improved SIC/XE machine therefore provides more instruction formats and more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design therefore has to consider two things: machine-dependent features and machine-independent features.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

- Instruction formats and addressing modes
- Program relocation

This section considers the design and implementation of an assembler for the more complex XE version of SIC.

200           SUBROUTINE TO WRITE RECORD FROM BUFFER
205
210   WRREC   CLEAR   X           CLEAR LOOP COUNTER
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT      TEST OUTPUT DEVICE
220           JEQ     WLOOP       LOOP UNTIL READY
225           LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230           WD      OUTPUT      WRITE CHARACTER
235           TIXR    T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program for SIC/XE

Figure 2.4 shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

- Indirect addressing is indicated by adding the prefix @ to the operand (line 70).
- Immediate operands are denoted with the prefix # (lines 25, 55, 133).
- Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.
- The assembler directive BASE (line 13) is used in conjunction with base relative addressing.
- The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.
- Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.
- Immediate and indirect addressing have also been used as much as possible.
- Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.
- With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.
- The use of indirect addressing often avoids the need for another instruction.

Line  Loc  Label   Opcode  Operand     Object Code

5          COPY    START   0
10         FIRST   STL     RETADR
12                 LDB     #LENGTH
13                 BASE    LENGTH
15         CLOOP   +JSUB   RDREC
20                 LDA     LENGTH
25                 COMP    #0
30                 JEQ     ENDFIL
35                 +JSUB   WRREC
40                 J       CLOOP
45         ENDFIL  LDA     EOF
50                 STA     BUFFER
55                 LDA     #3
60                 STA     LENGTH
65                 +JSUB   WRREC
70                 J       @RETADR
80         EOF     BYTE    C'EOF'
95         RETADR  RESW    1
100        LENGTH  RESW    1
105        BUFFER  RESB    4096
110
           SUBROUTINE TO READ RECORD INTO BUFFER
115
120
125        RDREC   CLEAR   X
130                CLEAR   A
132                CLEAR   S
133                +LDT    #4096
135        RLOOP   TD      INPUT
140                JEQ     RLOOP
145                RD      INPUT
150                COMPR   A,S
155                JEQ     EXIT
160                STCH    BUFFER,X
165                TIXR    T
170                JLT     RLOOP
175        EXIT    STX     LENGTH
180                RSUB
185        INPUT   BYTE    X'F1'
195
           SUBROUTINE TO WRITE RECORD FROM BUFFER
200
205
210        WRREC   CLEAR   X
212                LDT     LENGTH
215        WLOOP   TD      OUTPUT
220                JEQ     WLOOP
225                LDCH    BUFFER,X
230                WD      OUTPUT
235                TIXR    T
240                JLT     WLOOP
245                RSUB
250        OUTPUT  BYTE    X'05'
255                END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly, it supports only one instruction format. Only the index register X takes part in address calculation, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

- Format 1: 1-byte instruction
- Format 2: 2-byte instruction
- Format 3: 3-byte instruction
- Format 4: 4-byte instruction

Instructions can be:

- Instructions involving register to register
- Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
- Extended-format instructions

Addressing modes are:

- PC-relative or base-relative addressing: op m
- Indirect addressing: op @m
- Immediate addressing: op #c
- Extended format: +op m
- Indexed addressing: op m,X
- Register-to-register instructions
- Larger memory -> multiprogramming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the register names can be entered into the symbol table, with their numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. Alternatively, a separate table of register names and their numeric codes can be used.

2. Translation of register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format-3 and format-4 instructions are of the register-memory type. One operand is always in a register and the other is in memory. The addressing mode determines how the memory operand is fetched.

For format 3 there are two ways of calculating the operand address:

1) Program-counter relative
2) Base-relative

In format 3, the opcode is followed by a 12-bit displacement in the address field. In format 4, the opcode is followed by a 20-bit address field, which is large enough to hold the full target address.

1. Program-Counter Relative

a) This mode normally uses the format-3 instruction format.
b) The instruction contains the opcode followed by a 12-bit displacement value.
c) The displacement is a signed value in the range -2048 to 2047, so it must be small enough to fit in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand: TA = (PC) + disp.
d) The address of the operand is thus calculated relative to the program counter, which already points to the next instruction when the displacement is applied.
e) The following example shows how the address is calculated.
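As a worked illustration of the PC-relative calculation, the Python sketch below assembles line 20 of the program (LDA LENGTH at location 000A, with LENGTH at 0033, object code 032026). The helper names `pc_relative_disp` and `format3` are hypothetical; the bit layout assumed is the standard format-3 layout of a 6-bit opcode, the flags n, i, x, b, p, e, and a 12-bit displacement.

```python
def pc_relative_disp(target, next_loc):
    """12-bit two's complement displacement from the next instruction."""
    disp = target - next_loc
    if not -2048 <= disp <= 2047:
        raise ValueError("target out of range for PC-relative addressing")
    return disp & 0xFFF

def format3(opcode, n, i, x, b, p, e, disp):
    """Pack a format-3 instruction into 6 hexadecimal digits."""
    word = (opcode | (n << 1) | i) << 16
    word |= (x << 15) | (b << 14) | (p << 13) | (e << 12) | disp
    return f"{word:06X}"

# Line 20: LDA LENGTH at 000A. The PC holds 000D while the instruction executes.
disp = pc_relative_disp(0x0033, 0x000A + 3)
lda = format3(0x00, n=1, i=1, x=0, b=0, p=1, e=0, disp=disp)
print(f"{disp:03X}", lda)

# A backward reference gives a negative displacement in two's complement.
back = pc_relative_disp(0x0006, 0x001A)
print(f"{back:03X}")
```

The mask with 0xFFF is what turns a negative displacement into its 12-bit two's complement form.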


2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register. Therefore the target address is

TA = (B) + disp

where disp is an unsigned 12-bit value in the range 0 to 4095.

b) This mode is used when the operand is too far from the instruction for the PC-relative displacement range. The operand address is then not relative to the instruction, as it is in PC-relative mode. The programmer loads the base register (for example with LDB) and uses the BASE directive to tell the assembler what value the base register will hold. From that point on the assembler may use base-relative addressing to calculate the target address of an operand.

c) The NOBASE directive tells the assembler that the base register can no longer be used for addressing. The assembler first tries PC-relative addressing; only when the displacement does not fit does it use base-relative addressing.

LDB #LENGTH (instruction)
BASE LENGTH (directive)
NOBASE

For example:

12   0003           LDB    #LENGTH     69202D
13                  BASE   LENGTH
100  0033  LENGTH   RESW   1
105  0036  BUFFER   RESB   4096
160  104E           STCH   BUFFER,X    57C003
165  1051           TIXR   T           B850

In this example the LDB instruction loads the address of LENGTH, 0033, into the base register, and the BASE directive tells the assembler that the base register holds this value. For line 160, PC-relative addressing is not possible because BUFFER is too far from the instruction, so the assembler uses base-relative addressing:

BUFFER is at location 0036 (hexadecimal)
(B) = 0033 (hexadecimal)
disp = 0036 - 0033 = 0003 (hexadecimal)

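20   000A         LDA   LENGTH   032026
175  1056  EXIT   STX   LENGTH   134000

Consider line 175. If PC-relative addressing were used:

disp = TA - (PC) = 0033 - 1059 = -1026 (hexadecimal), which is EFDA as a 16-bit two's complement value

This is outside the range -2048 to 2047, so PC-relative addressing is not applicable and the assembler uses base-relative addressing instead: disp = TA - (B) = 0033 - 0033 = 0, which gives the object code 134000. Line 20, by contrast, is close enough to LENGTH to be assembled PC-relative: disp = 0033 - 000D = 0026.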
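The choice between the two modes can be sketched in Python: try PC-relative first and fall back to base-relative when the displacement does not fit. The function names are hypothetical, and the sketch assumes simple addressing (n = i = 1) in format 3 only; it reproduces the object codes of lines 20, 160 and 175 shown above.

```python
def choose_mode(target, next_loc, base):
    """Return ('PC', disp) or ('BASE', disp); base is None after NOBASE."""
    disp = target - next_loc
    if -2048 <= disp <= 2047:
        return "PC", disp & 0xFFF
    if base is not None and 0 <= target - base <= 4095:
        return "BASE", target - base
    raise ValueError("neither PC-relative nor base-relative is possible")

def assemble_format3(opcode, target, loc, base, indexed=False):
    """Assemble a format-3 instruction with simple addressing (n = i = 1)."""
    mode, disp = choose_mode(target, loc + 3, base)
    first_byte = opcode | 0b11                      # n = 1, i = 1
    flags = (8 if indexed else 0) | (4 if mode == "BASE" else 2)  # x b p e
    return f"{first_byte:02X}{flags:X}{disp:03X}"

LENGTH, BUFFER = 0x0033, 0x0036
base = LENGTH  # established by LDB #LENGTH and BASE LENGTH

line20 = assemble_format3(0x00, LENGTH, 0x000A, base)                 # LDA
line160 = assemble_format3(0x54, BUFFER, 0x104E, base, indexed=True)  # STCH
line175 = assemble_format3(0x10, LENGTH, 0x1056, base)                # STX
print(line20, line160, line175)
```

If `base` is passed as `None`, lines 160 and 175 raise an error, which is the situation in which the programmer would have to use the extended format instead.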

3. Immediate Addressing Mode

In this mode no memory reference is involved: the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative mode, and the address of the symbol becomes the operand. For example, LDB #LENGTH (line 12) is assembled this way.

4. Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of a location that contains the address of the operand. The address of that location is found using PC-relative addressing.

For example, the instruction on line 70, J @RETADR, transfers control to the address stored in location RETADR. If the address of RETADR is 0030, the assembler calculates the PC-relative displacement to location 0030, which in this example is 0003.
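The effect of the # and @ prefixes on the n and i bits can be sketched in Python. The sketch is illustrative: the function name is hypothetical, only format 3 with PC-relative displacements is handled, and the location 002A used for J @RETADR is an assumed value chosen so that the displacement to 0030 comes out as 0003.

```python
def assemble_operand(opcode, operand, loc, symtab):
    """Format-3 assembly of '#c', '#SYMBOL', '@SYMBOL' or 'SYMBOL'."""
    prefix = operand[0] if operand[0] in "#@" else ""
    body = operand[len(prefix):]
    # n, i bits: immediate = 0,1; indirect = 1,0; simple = 1,1
    n, i = {"#": (0, 1), "@": (1, 0), "": (1, 1)}[prefix]
    if prefix == "#" and body.isdigit():
        p, disp = 0, int(body)          # the constant itself is the operand
    else:
        p, disp = 1, (symtab[body] - (loc + 3)) & 0xFFF  # PC-relative
    first_byte = opcode | (n << 1) | i
    flags = p << 1                      # x = 0, b = 0, e = 0
    return f"{first_byte:02X}{flags:X}{disp:03X}"

symtab = {"LENGTH": 0x0033, "RETADR": 0x0030}

lda_imm = assemble_operand(0x00, "#3", 0x0020, symtab)        # LDA #3
ldb_sym = assemble_operand(0x68, "#LENGTH", 0x0003, symtab)   # LDB #LENGTH
j_ind = assemble_operand(0x3C, "@RETADR", 0x002A, symtab)     # J @RETADR
print(lda_imm, ldb_sym, j_ind)
```

A fuller version would also check that an immediate constant fits in 12 bits and switch to the extended format when it does not, as line 133 (+LDT #4096) does.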

2.2.2 Program Relocation

The need for program relocation:

- It is desirable to load and run several programs at the same time
- The system must be able to load programs into memory wherever there is room
- The exact starting address of the program is not known until load time

Absolute program:

- The starting address is specified at assembly time
- The addresses in the program may become invalid if it is loaded anywhere else

Example

The example statement loads register A with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer at address 102D.

The address of the operand changes along with the load address of the program. Hence the address portion of the instruction must be changed before the program can be loaded and executed at location 3000.

Some instructions change their operand address as the program load address changes; other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example: Program Relocation

1) The above diagram shows the concept of relocation.
a) Initially the program is loaded at location 0000
b) The instruction JSUB is at location 0006
c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC
2) The second figure shows the program loaded at a new location, 5000. The address field of the JSUB instruction is modified to 06036, the new address of RDREC.
3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.

- The only parts of the program that require modification at load time are those that specify direct addresses.
- The rest of the instructions need not be modified:
  o Operands that are not memory addresses (immediate addressing)
  o PC-relative and base-relative addressing
- From the object program alone it is not possible to distinguish an address from a constant.
  o The assembler must therefore keep some information to tell the loader
  o An object program that contains Modification records is called a relocatable program

The way to solve the relocation problem:

- For an address label, the address is assigned relative to the start of the program (START 0)
- A Modification record is produced to store the starting location and the length of the address field to be modified
- These commands for the loader must also be part of the object program

Modification record:

- There is one Modification record for each address to be modified
- The length is stored in half-bytes (4 bits each)
- The starting location is the location of the byte containing the leftmost bits of the address field to be modified
- If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location

Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that must change if relocation occurs. M00000705 is the Modification record for the JSUB statement at location 0006: the address field to be modified starts in the byte at location 000007 and is 5 half-bytes long.
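The Modification record and the loader's use of it can be sketched in Python. The function names are hypothetical, and the sketch assumes the field to be modified is the rightmost part of a single instruction given as a hexadecimal string; it uses the JSUB instruction 4B101036 and the load addresses 5000 and 7420 from the relocation example.

```python
def modification_record(start, half_bytes):
    """M, 6-digit starting location, 2-digit length in half-bytes."""
    return f"M{start:06X}{half_bytes:02X}"

def relocate(code_hex, load_address, half_bytes=5):
    """Add the load address to the rightmost `half_bytes` hex digits."""
    head, field = code_hex[:-half_bytes], code_hex[-half_bytes:]
    mask = (1 << (4 * half_bytes)) - 1
    new_field = (int(field, 16) + load_address) & mask
    return f"{head}{new_field:0{half_bytes}X}"

# +JSUB RDREC occupies bytes 0006-0009; its 20-bit address field starts
# in the middle of byte 0007, hence starting location 000007, length 05.
record = modification_record(0x000007, 5)
at_5000 = relocate("4B101036", 0x5000)
at_7420 = relocate("4B108456"[:3] + "01036", 0x7420)
print(record, at_5000, at_7420)
```

Because the length is odd, only the low half of the first byte takes part in the addition; the high half (the flag bits) is left untouched, which is why the leading digits 4B1 never change.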

MACHINE-INDEPENDENT ASSEMBLER FEATURES

These are assembler features that are not closely related to the machine architecture:

- Literals
- Symbol-defining statements
- Expressions
- Program blocks
- Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45   001A  ENDFIL  LDA   =C'EOF'   032010
     002D          =C'EOF'         454F46

215  1062  WLOOP   TD    =X'05'    E32011
     1076          =X'05'          05

- In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand:

- With immediate addressing, the operand value is assembled as a part of the machine instruction
- With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction

Literal pool

- All of the literal operands used in a program are gathered together into one or more literal pools
- Where the literal pool should be placed:

93         LTORG
     002D  =C'EOF'   454F46

- The assembler directive LTORG tells the assembler to generate a literal pool at that point

Literal for the current value of the location counter

- The value of the location counter can be denoted by a literal operand:

BASE  *
LDB   =*

Handling duplicate literal operands

- The assembler should avoid storing duplicate literals
- The easiest way to recognize duplicate literals is by comparison of the character strings defining them:

100  LDA  =C'EOF'
125  LDA  =C'EOF'

- Literal operands with different literal names may have the same literal value:

160  LDA  =X'454F46'

Recognizing literal operands that have different literal names but the same literal value complicates the design of an assembler.

Literal table (LITTAB)

- The basic data structure needed to process literal operands is a literal table (LITTAB)
- LITTAB contains the literal name, the operand value and length, and the address assigned to the operand
- LITTAB is often organized as a hash table, using the literal name or value as the key

Processing literal operands

Pass 1
- For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address
- When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal not yet placed
- The location counter is updated to reflect the number of bytes occupied by each literal

Pass 2
- Search LITTAB for each literal operand encountered
- The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements
- If a literal value represents an address in the program, the assembler must generate the appropriate Modification record
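The Pass 1 handling of literals can be sketched in Python. The class and method names are hypothetical, duplicates are recognized by comparing the defining strings (so =C'EOF' and =X'454F46' count as different literals), and the starting location 002D is taken from the LTORG example above.

```python
def literal_bytes(literal):
    """Value of a literal written as =C'...' or =X'...'."""
    kind, value = literal[1], literal[3:-1]
    return value.encode("ascii") if kind == "C" else bytes.fromhex(value)

class LiteralTable:
    def __init__(self):
        self.entries = {}  # name -> {"value", "length", "address"}

    def reference(self, literal):
        """Called in Pass 1 for each literal operand encountered."""
        if literal not in self.entries:
            value = literal_bytes(literal)
            self.entries[literal] = {
                "value": value, "length": len(value), "address": None,
            }

    def place_pool(self, locctr):
        """Called at LTORG or END; returns the updated location counter."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr

littab = LiteralTable()
for operand in ("=C'EOF'", "=X'05'", "=C'EOF'"):
    littab.reference(operand)
next_loc = littab.place_pool(0x002D)
print(len(littab.entries), hex(next_loc))
```

Literals that already have an address are skipped by `place_pool`, so a second LTORG or the END statement only places the literals used since the previous pool.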

Symbol-defining statements

Assembler directives:
- EQU
- ORG

Assembler directive EQU

- Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values:

symbol   EQU   value

- When the assembler encounters the EQU statement, it enters the symbol into SYMTAB with the specified value

Use of EQU

- Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

         +LDT   #4096

  we can write

MAXLEN   EQU    4096
         +LDT   #MAXLEN

- Define mnemonic names for registers:

A   EQU   0
X   EQU   1
L   EQU   2

  so that RMO 0,1 can be written as RMO A,X

- Establish and use names that reflect the logical function of the registers in the program:

BASE          EQU   R1
ACCUMULATOR   EQU   R2
INDEX         EQU   R3

Assembler directive ORG

- The assembler directive ORG is usually used to assign values to symbols indirectly:

ORG   value

- "value" is a constant or an expression involving constants and previously defined symbols
- When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value

Use of ORG for label definition

Suppose that we want to define a table with the following structure:

STAB           SYMBOL    VALUE     FLAGS
(100 entries)  6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR.
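How ORG lets field names be defined as offsets into a table entry can be sketched in Python. This is a simplified model: the function name is hypothetical, SYMTAB is a dictionary, the address 1000 (hexadecimal) for STAB is an assumed value, and the field sizes are the ones listed in the table above.

```python
def define_fields_with_org(symtab, locctr, table_symbol, fields):
    """Model 'ORG table_symbol', a run of RESB/RESW field definitions,
    and a bare 'ORG' that restores the remembered LOCCTR."""
    saved = locctr                 # value remembered by the assembler
    locctr = symtab[table_symbol]  # ORG table_symbol
    for name, size in fields:
        symtab[name] = locctr      # label gets the current LOCCTR
        locctr += size             # RESB size (or RESW size / 3)
    return saved                   # bare ORG: back to normal assignment

symtab = {"STAB": 0x1000}          # assumed address of the table
entry_size = 6 + 3 + 1
locctr_after_table = symtab["STAB"] + 100 * entry_size

locctr = define_fields_with_org(
    symtab, locctr_after_table, "STAB",
    [("SYMBOL", 6), ("VALUE", 3), ("FLAGS", 1)],
)
print({name: hex(addr) for name, addr in symtab.items()}, hex(locctr))
```

After this, an indexed reference such as VALUE,X addresses the VALUE field of whichever entry X points to, with X holding the entry's offset from the start of the table.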

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

Allowed:

ALPHA RESW 1
BETA EQU ALPHA

Disallowed (ALPHA is not yet defined when BETA is processed):

BETA EQU ALPHA
ALPHA RESW 1

Disallowed (resolving ALPHA would need more than two passes):

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1

Disallowed (the ORG operand ALPHA is not yet defined):

ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be
• constants,
• user-defined symbols, or
• special terms.
• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of program location

Relative terms - The value of a relative term is dependent on the beginning address of the

program

Types of expressions

By the type of value produced, expressions can be classified as

1. Absolute expressions

• The value of an absolute expression is independent of the program location.
• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.
• E.g., MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.
• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
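The pairing rule can be checked mechanically by counting relative terms with their signs: a net count of 0 gives an absolute expression, a net count of +1 gives a relative one, and anything else is an error. The sketch below is one possible way to do this in Python; the function name `expression_type` and the `(type, value)` layout of the symbol table are assumptions made for illustration. It handles only flat expressions without parentheses or unary minus.

```python
import re

# Symbol table from the notes: symbol -> (type, value); A = absolute, R = relative.
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def expression_type(expr, symtab):
    """Return 'A' or 'R' for a flat expression, or raise ValueError if it is neither."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[-+*/]", expr)
    operators = {"+", "-", "*", "/"}
    net_relative = 0          # relative terms with '+' minus relative terms with '-'
    sign = 1
    prev_op = "+"
    for pos, tok in enumerate(tokens):
        if tok in operators:
            prev_op = tok
            if tok == "+":
                sign = 1
            elif tok == "-":
                sign = -1
            continue
        if tok.isdigit():
            continue          # constants are always absolute
        if tok not in symtab:
            raise ValueError(f"undefined symbol: {tok}")
        if symtab[tok][0] == "R":
            next_op = tokens[pos + 1] if pos + 1 < len(tokens) else "+"
            if prev_op in "*/" or next_op in "*/":
                raise ValueError("relative term in multiplication or division")
            net_relative += sign
    if net_relative == 0:
        return "A"
    if net_relative == 1:
        return "R"
    raise ValueError("expression is neither absolute nor relative")
```

With the table above, `BUFEND-BUFFER` is classified as absolute, while `BUFEND+BUFFER`, `100-BUFFER` and `3*BUFFER` are rejected.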

Program blocks

In the example, the program is divided into three blocks: Default (0), CDATA (1), and CBLKS (2).

Default (0): block for the executable (mnemonic) instructions
CDATA (1): block for data areas that need little memory
CBLKS (2): block for data areas that need large blocks of memory

When the object code is generated, one more column, the block number, is included in the listing. Whenever a new block begins, the source program contains a USE directive with the block name. Each block has its own LOCCTR, which is set to 0000 when the block is first begun.

The block number is also added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address, and length.

After the location counter values have been found, the assembler generates the object code and writes the object program.
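As a rough illustration of the program block table, the Python sketch below assigns starting addresses to blocks once their lengths are known at the end of Pass 1. The function names and the block lengths (0x66, 0x0B, 0x1000) are assumed example values, not figures taken from these notes.

```python
def assign_block_addresses(block_lengths):
    """Build a block table from (name, length) pairs given in block-number order."""
    table = {}
    address = 0
    for number, (name, length) in enumerate(block_lengths):
        table[name] = {"number": number, "address": address, "length": length}
        address += length            # the next block starts where this one ends
    return table


def absolute_address(table, block_name, relative_address):
    """Address of a symbol = start of its block + its address within the block."""
    return table[block_name]["address"] + relative_address


# Assumed lengths, for illustration only.
blocks = assign_block_addresses([("DEFAULT", 0x66), ("CDATA", 0x0B), ("CBLKS", 0x1000)])
```

In Pass 2 the assembler would use `absolute_address` to turn the block-relative value stored in SYMTAB into an address relative to the start of the whole program.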


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.
ii. Each control section can be loaded and relocated independently of the others.
iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.


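Only the object code generated for instructions that refer to EXTREF symbols changes: the assembler cannot know their addresses, so it inserts an address of zero and leaves the final value to the loader.

The object program now needs six types of records: Header, Define, Refer, Text, Modification, and End.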


Object Program for Control Section and Program Linking


Figure 2.1 Assembler language program for basic SIC version

The example program in Figure 2.1 has a main module and two subroutines.

Purpose of the example program:

- Reads records from the input device (code F1)
- Copies them to the output device (code 05)
- At the end of the file, writes EOF on the output device, then executes RSUB to return to the operating system

• Data transfer (RD, WD)

- A buffer is used to store each record
- Buffering is necessary because the I/O rates of the devices differ
- The end of each record is marked with a null character (hexadecimal 00)
- The end of the file is indicated by a zero-length record

The main routine calls two subroutines:

RDREC – reads a record into the buffer
WRREC – writes the record from the buffer to the output device

2.1.1 A Simple SIC Assembler

Figure 2.2 shows the same program as Figure 2.1, with the generated object code for each statement.

Figure 2.2 Assembler language program for basic SIC version with object code

1. The column "Loc" in the example program gives the machine address, in hexadecimal, of each line of the assembled program.

2. The program starts at address 1000 (this is only an assumption; for the simple SIC machine a starting address is always assumed, and its value is not 0).

3. The translation of the source program to object code requires the following functions:

a. Convert mnemonic operation codes to their machine language equivalents. E.g., translate STL to 14 (line 10).
b. Convert symbolic operands to their equivalent machine addresses. E.g., translate RETADR to 1033 (line 10).
c. Build the machine instructions in the proper format.
d. Convert the data constants specified in the source program into their internal machine representations. E.g., translate EOF to 454F46 (line 80).
e. Write the object program and the assembly listing.

4. All of these functions except (b) can be accomplished by sequential processing of the source program, one line at a time. Consider the statement

10 1000 FIRST STL RETADR 141033

5. This instruction contains a forward reference, i.e., a reference to a label (RETADR) that is defined later in the program.

6. The assembler cannot translate this line when it is first read, because the address that will be assigned to RETADR is not yet known.

7. Hence most assemblers make two passes over the source program.

8. The first pass does little more than scan the source program for label definitions and assign addresses, such as those in the Loc column.

9. The second pass performs most of the actual translation.

10. The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead, they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values.

11. The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.

The object program format contains three types of records:

Header record: Contains the program name, starting address, and length.
Text record: Contains the machine code and data of the program.
End record: Marks the end of the object program and specifies the address in the program where execution is to begin.

The record format is as follows:

Header record
Col. 1       H
Col. 2-7     Program name
Col. 8-13    Starting address of object program
Col. 14-19   Length of object program in bytes

Text record
Col. 1       T
Col. 2-7     Starting address for object code in this record
Col. 8-9     Length of object code in this record in bytes
Col. 10-69   Object code, represented in hexadecimal (2 columns per byte of object code)

End record
Col. 1       E
Col. 2-7     Address of first executable instruction in object program
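The column layout above maps directly onto fixed-width string formatting. The following Python sketch shows one way to build the three record types; the function names are illustrative, and the program length 107A used in the usage note is an assumed value rather than one stated in these notes.

```python
def header_record(name, start, length):
    """H, program name padded to 6 columns, 6-digit hex start, 6-digit hex length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """T, 6-digit hex start, 2-digit hex byte count, then the object code."""
    body = "".join(object_codes)
    if len(body) > 60:               # columns 10-69 hold at most 30 bytes
        raise ValueError("object code does not fit in one Text record")
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_instruction):
    """E followed by the address of the first executable instruction."""
    return f"E{first_instruction:06X}"
```

For example, `header_record("COPY", 0x1000, 0x107A)` gives `HCOPY  00100000107A`, and `text_record(0x1000, ["141033"])` gives `T00100003141033` for the STL instruction on line 10.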


2.1.1.1 The assembler design can be done as

Single-pass assembler
Multi-pass assembler

Single-pass Assembler

In a single-pass assembler, the whole process of scanning, parsing, and object code conversion is done in a single pass.

The only problem with this method is resolving forward references, as the example below shows.

10 1000 FIRST STL RETADR 141033
--
--
--
--
95 1033 RETADR RESW 1

In line 10, the instruction STL stores the contents of the linkage register in RETADR. But while this instruction is being processed, the value of the symbol is not known, because it is defined only at line 95.

In a single-pass assembler, scanning, parsing, and object code conversion happen simultaneously. The instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code, which is formed from the opcode of STL and the value of the symbol RETADR; that value is not yet available.

For this reason the design is usually done in two passes. A multi-pass assembler first resolves the forward references and then converts the program into object code. The process of the multi-pass assembler is as follows.

Pass-1

Assign addresses to all the statements.
Save the addresses assigned to all labels, to be used in Pass-2.
Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas for assigning the address values.
Define the symbols in the symbol table (generate the symbol table).

Pass-2

Assemble the instructions (translating operation codes and looking up addresses).
Generate data values defined by BYTE, WORD, etc.
Perform the processing of the assembler directives not done during Pass-1.
Write the object program and assembly listing.

2.1.1.2 Assembler Design

The most important things to concentrate on are the generation of the symbol table and the resolution of forward references.

• Symbol Table
– This is created during Pass 1
– All the labels of the instructions are symbols
– The table has an entry for each symbol name and its address value

• Forward reference
– A reference to a symbol that is defined later in the program is called a forward reference
– There is no address value for such a symbol in the symbol table when it is first referenced in Pass 1

2.1.1.3 Functions of the two passes of the assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program
2. Save the addresses assigned to all labels for use in Pass 2
3. Perform some processing of assembler directives

Pass 2 (Assemble instructions and generate object programs)

1. Assemble instructions (translating operation codes and looking up addresses)
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done in Pass 1
4. Write the object program and the assembly listing

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures and one variable:

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents
2. Symbol Table (SYMTAB): used to store the values (addresses) assigned to labels
3. Location Counter (LOCCTR): a variable used to help in the assignment of addresses

Location Counter (LOCCTR)

It is initialized to the beginning address specified in the START statement.
After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.
Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

Contains the mnemonic operation code and its machine language equivalent.
Also contains information about instruction format and length.
In Pass 1, OPTAB is used to look up and validate operation codes in the source program.
In Pass 2, it is used to translate the operation codes to machine language.
During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction.
OPTAB is usually organized as a hash table, with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.
In most cases OPTAB is a static table, i.e., entries are not added to or deleted from it.
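In Python, a dictionary is already a hash table, so a static OPTAB can be sketched as below. The table holds only a handful of entries whose opcodes can be read off the object code shown in these notes (for example STL = 14 and JSUB = 48); the `(opcode, format)` tuple layout and the `lookup` helper are assumptions made for illustration.

```python
# Static OPTAB: mnemonic -> (opcode, default instruction format).
OPTAB = {
    "LDA": (0x00, 3),
    "STX": (0x10, 3),
    "STL": (0x14, 3),
    "JSUB": (0x48, 3),
    "STCH": (0x54, 3),
    "LDB": (0x68, 3),
}


def lookup(mnemonic):
    """Return (opcode, length in bytes), or None for an invalid operation code."""
    extended = mnemonic.startswith("+")      # '+' selects the 4-byte format
    entry = OPTAB.get(mnemonic.lstrip("+"))
    if entry is None:
        return None
    opcode, fmt = entry
    return opcode, 4 if extended else fmt
```

Pass 1 needs only the length returned by `lookup` to advance LOCCTR, while Pass 2 uses the opcode to build the instruction.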

Symbol Table (SYMTAB)

Includes the name and value (address) for each label in the source program, and flags to indicate error conditions (e.g., a symbol defined in two different places).
During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.
During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address, error indicators, etc. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations that may be performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end {Pass 1}

Explanation – PASS I Algorithm

The algorithm scans the first statement, START, and saves the operand field (the address) as the starting address of the program. It initializes LOCCTR to this address. This line is written to the intermediate file.

If there is no START statement, LOCCTR is initialized to zero.

If a label is encountered, the symbol has to be entered in the symbol table along with its associated address value. If the symbol already exists in the table, an error flag is set to indicate a duplicate symbol.

It next searches OPTAB for the mnemonic code. If it is found, the length of the instruction is added to LOCCTR to make it point to the next instruction.

If the opcode is the directive WORD, it adds 3 to LOCCTR.

If it is RESW, it adds 3 times the number of data words to LOCCTR.

If it is BYTE, it adds the length of the constant in bytes to LOCCTR; if it is RESB, it adds the number of bytes reserved.

If it is the END directive, the end of the program has been reached; the length of the program is found by subtracting the starting address (saved from the START statement) from the current LOCCTR.

Each processed line is written to the intermediate file.
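The Pass 1 algorithm can be turned into a short Python function almost line by line. The sketch below is a simplified illustration under several assumptions: statements arrive already split into `(label, opcode, operand)` tuples, comment lines are marked by a label of ".", the intermediate file is a plain list, and `optab` is any collection of valid mnemonics.

```python
def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    kind, text = operand[0], operand[2:-1]
    return len(text) if kind == "C" else len(text) // 2


def pass1(statements, optab):
    symtab, errors, intermediate = {}, [], []
    start = locctr = 0
    for label, opcode, operand in statements:
        if opcode == "START":
            start = locctr = int(operand, 16)     # operand is a hex address
            intermediate.append((locctr, label, opcode, operand))
            continue
        if opcode == "END":
            intermediate.append((None, label, opcode, operand))
            break
        intermediate.append((locctr, label, opcode, operand))
        if label == ".":                          # comment line: nothing to do
            continue
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol: {label}")
            else:
                symtab[label] = locctr
        if opcode in optab or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code: {opcode}")
    return symtab, locctr - start, intermediate, errors


source = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    (None, "END", "FIRST"),
]
symtab, length, intermediate, errors = pass1(source, {"STL"})
```

The short `source` program is a cut-down example, so its addresses differ from those of the full listing in Figure 2.2.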

The Algorithm for Pass 2

begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
end {Pass 2}

Explanation – PASS II Algorithm

The first input line is read from the intermediate file.

If the opcode is START, this line is written directly to the listing file.

A Header record is written to the object program; it gives the starting address and the length of the program (which was calculated during Pass 1).

Then the first Text record is initialized. Comment lines are ignored.

For each instruction, OPTAB is searched to find the object code of the opcode.

If there is a symbol in the operand field, the symbol table is searched to get its address, which is combined with the object code of the opcode.

If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to mark the symbol as undefined.

If there is no symbol in the operand field, 0 is stored as the operand address and the object code instruction is assembled.

If the opcode is BYTE or WORD, the constant value is converted to its equivalent object code (for example, the character constant EOF is stored as its hexadecimal equivalent 454F46).

If the object code cannot fit into the current Text record, that record is written out and a new Text record is started for the remaining object code.

The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.
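The core of Pass 2 for the basic SIC machine is small enough to sketch in Python: a 3-byte instruction is the 8-bit opcode followed by the index bit and a 15-bit address. The function names and the dictionary-based tables below are illustrative assumptions; RETADR = 1033 and STL = 14 come from line 10 of the listing, while the address 1039 for BUFFER and the opcodes for LDCH and RSUB are assumed from the standard SIC instruction set.

```python
def byte_constant(operand):
    """Object code for a BYTE constant: C'...' as ASCII hex, X'...' as given."""
    kind, text = operand[0], operand[2:-1]
    if kind == "C":
        return "".join(f"{ord(ch):02X}" for ch in text)
    return text.upper()


def assemble_sic(opcode, operand, optab, symtab):
    """Return (object code, error message or None) for one SIC instruction."""
    address, error = 0, None
    if operand:
        symbol, _, index = operand.partition(",")
        if symbol in symtab:
            address = symtab[symbol]
        else:
            error = f"undefined symbol: {symbol}"   # operand address stays 0
        if index == "X":
            address |= 0x8000                       # set the index (x) bit
    return f"{optab[opcode]:02X}{address:04X}", error


OPTAB = {"LDA": 0x00, "STL": 0x14, "RSUB": 0x4C, "LDCH": 0x50}
SYMTAB = {"RETADR": 0x1033, "BUFFER": 0x1039}
```

With these tables, `assemble_sic("STL", "RETADR", OPTAB, SYMTAB)` reproduces the object code 141033 of line 10, and `byte_constant("C'EOF'")` gives 454F46.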


Design and Implementation Issues

Some of the features of the program depend on the architecture of the machine. If the program is for the SIC machine, there are only limited instruction formats and hence limited addressing modes. There are only single-operand instructions, and the operand is always a memory reference. Anything fetched from memory takes more time. The improved SIC/XE machine therefore provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change too. The design of an assembler therefore has to consider two kinds of features: machine-dependent features and machine-independent features.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes
Program relocation

In this section, the design and implementation of an assembler for the more complex XE version of SIC is considered.

200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    CLEAR   X           CLEAR LOOP COUNTER
212            LDT     LENGTH
215   WLOOP    TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIXR    T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240            JLT     WLOOP
245            RSUB
250   OUTPUT   BYTE    X'05'
255            END     FIRST

Figure 2.4 Example program of a SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).
Immediate operands are denoted with the prefix # (lines 25, 55, 133).
Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode.
The assembler directive BASE (line 13) is used in conjunction with base relative addressing.
The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.
Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.
Immediate and indirect addressing have also been used as much as possible.
Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.
With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.
The use of indirect addressing often avoids the need for another instruction.

Line  Loc  Label    Opcode   Operand      Object Code
5          COPY     START    0
10         FIRST    STL      RETADR
12                  LDB      #LENGTH
13                  BASE     LENGTH
15         CLOOP    +JSUB    RDREC
20                  LDA      LENGTH
25                  COMP     #0
30                  JEQ      ENDFIL
35                  +JSUB    WRREC
40                  J        CLOOP
45         ENDFIL   LDA      EOF
50                  STA      BUFFER
55                  LDA      #3
60                  STA      LENGTH
65                  +JSUB    WRREC
70                  J        @RETADR
80         EOF      BYTE     C'EOF'
95         RETADR   RESW     1
100        LENGTH   RESW     1
105        BUFFER   RESB     4096
110        .        SUBROUTINE TO READ RECORD INTO BUFFER
115        .
120        .
125        RDREC    CLEAR    X
130                 CLEAR    A
132                 CLEAR    S
133                 +LDT     #4096
135        RLOOP    TD       INPUT
140                 JEQ      RLOOP
145                 RD       INPUT
150                 COMPR    A,S
155                 JEQ      EXIT
160                 STCH     BUFFER,X
165                 TIXR     T
170                 JLT      RLOOP
175        EXIT     STX      LENGTH
180                 RSUB
185        INPUT    BYTE     X'F1'
195        .        SUBROUTINE TO WRITE RECORD FROM BUFFER
200        .
205        .
210        WRREC    CLEAR    X
212                 LDT      LENGTH
215        WLOOP    TD       OUTPUT
220                 JEQ      WLOOP
225                 LDCH     BUFFER,X
230                 WD       OUTPUT
235                 TIXR     T
240                 JLT      WLOOP
245                 RSUB
250        OUTPUT   BYTE     X'05'
255                 END      FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly it supports only one instruction format. SIC has five registers (A, X, L, PC, SW); memory operands can be modified only by the index register X, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

1-byte instruction
2-byte instruction
3-byte instruction
4-byte instruction

Instructions can be

Instructions involving register to register
Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
Extended instruction format

Addressing modes are

PC-relative or base-relative addressing: op m
Indirect addressing: op @m
Immediate addressing: op #c
Extended format: +op m
Index addressing: op m,x
Register-to-register instructions
Larger memory -> multiprogramming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the registers can be entered as part of the symbol table itself. The value of each register is its equivalent numeric code. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.

2. Translation involving register-memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways of forming a relative address:

1) Program-counter relative and
2) Base relative

Register-memory instructions are assembled in either format 3 or format 4. In format 3, the instruction has the opcode followed by a 12-bit displacement in the address field, whereas in format 4 the opcode is followed by a 20-bit address field that is large enough to hold the full address.

1. Program-Counter Relative

a) In this mode the format 3 instruction format is used.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement ranges from -2048 to 2047. This displacement (which must be small enough to fit in a 12-bit field) is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) This is a relative way of calculating the address of the operand: the displacement of the operand is relative to the current program counter value, which is the address of the next instruction.

e) The following example shows how the address is calculated.
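As a worked calculation, consider the statement `20 000A LDA LENGTH 032026` from the SIC/XE listing in these notes, with LENGTH at 0033. When the instruction executes, the program counter already holds 000A + 3 = 000D, so disp = 0033 - 000D = 026. The Python sketch below performs this calculation in both directions; the function names are illustrative, and the backward-jump case (target 0006, PC 001A) uses assumed addresses.

```python
def pc_relative_disp(target, pc):
    """12-bit two's-complement displacement, or None if it is out of range."""
    disp = target - pc
    if not -2048 <= disp <= 2047:
        return None
    return disp & 0xFFF


def target_from_disp(disp12, pc):
    """Recover the target address: sign-extend the 12-bit field and add (PC)."""
    if disp12 & 0x800:
        disp12 -= 0x1000
    return pc + disp12
```

A negative displacement is stored in two's complement, which is why a backward reference of -14 (hexadecimal) appears in the instruction as FEC.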


2. Base-Relative Addressing Mode

a) In this mode the base register holds the address from which the displacement is measured. Therefore the target address is

TA = (base) + displacement value

b) This addressing mode is used when the displacement range of PC-relative addressing is not sufficient. The operand is then not addressed relative to the instruction, as it is in PC-relative mode. Base-relative addressing is enabled by the directive BASE, which tells the assembler what value the base register will contain; from then on the assembler can use base-relative addressing to calculate displacements. The displacement ranges from 0 to 4095.

c) The NOBASE directive indicates that the base register can no longer be used to calculate the target address of the operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB #LENGTH (instruction)
BASE LENGTH (directive)
NOBASE

For example

12   0003          LDB    #LENGTH     69202D
13                 BASE   LENGTH
100  0033  LENGTH  RESW   1
105  0036  BUFFER  RESB   4096
160  104E          STCH   BUFFER,X    57C003
165  1051          TIXR   T           B850

In the above example, the BASE directive tells the assembler that the base register will contain the address of LENGTH, so that base-relative addressing can be used wherever PC-relative addressing does not reach.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160:

BUFFER is at location (0036)16
(B) = (0033)16
disp = 0036 - 0033 = (0003)16

20   000A          LDA    LENGTH      032026
175  1056  EXIT    STX    LENGTH      134000

Consider line 175. If we use PC-relative addressing,

disp = TA - (PC) = 0033 - 1059 = EFDA

This displacement (-1026 in hexadecimal) does not fit in the 12-bit field, so PC-relative addressing is not applicable and base-relative addressing is used instead: disp = 0033 - 0033 = 000, which gives the object code 134000.
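The selection rule (try PC-relative first, then base-relative) can be sketched in Python as follows. The function name `choose_displacement` and its return convention are assumptions made for illustration; the addresses in the usage note are those of lines 20, 160 and 175 above.

```python
def choose_displacement(target, pc, base=None):
    """Return (mode, 12-bit displacement) for a format 3 instruction.

    pc   -- address of the next instruction
    base -- value declared by BASE, or None after NOBASE
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF            # two's complement in 12 bits
    if base is not None and 0 <= target - base <= 4095:
        return "base", target - base
    raise ValueError("operand out of range: format 4 is required")
```

For line 20, `choose_displacement(0x0033, 0x000D, 0x0033)` selects PC-relative with displacement 026; for lines 160 and 175 the PC-relative displacement is out of range, so base-relative is chosen with displacements 003 and 000.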

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address is the operand itself.

If a symbol is used in the instruction as the immediate operand, the instruction is assembled as immediate with PC-relative mode. Line 12 (LDB #LENGTH, object code 69202D) is an example: the displacement 02D is 0033 - 0006, so the value loaded is the address of LENGTH.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing.

For example, the instruction J @RETADR (line 70) transfers control to the address stored in RETADR. If the address of RETADR is 0030, the value 003 is the displacement stored in the instruction (0030 minus the program counter value 002D), not the target address itself.
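The object codes quoted in this section can be reproduced by packing the opcode, the n, i, x, b, p, e flag bits, and the displacement or address into a format 3 or format 4 instruction. The Python sketch below is illustrative only: the function names are assumptions, the opcodes 68, 54, 10 and 48 are those implied by the object codes in these notes, and the opcode 3C for J is assumed from the standard SIC/XE instruction set.

```python
def encode_format3(opcode, n, i, x, b, p, disp):
    """6-bit opcode + n,i | x,b,p,e (e = 0) | 12-bit displacement."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1)
    return f"{first_byte:02X}{flags:X}{disp & 0xFFF:03X}"


def encode_format4(opcode, n, i, x, address):
    """6-bit opcode + n,i | x,b,p,e (b = p = 0, e = 1) | 20-bit address."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | 1
    return f"{first_byte:02X}{flags:X}{address & 0xFFFFF:05X}"


examples = {
    "LDB #LENGTH": encode_format3(0x68, 0, 1, 0, 0, 1, 0x02D),    # immediate, PC-relative
    "STCH BUFFER,X": encode_format3(0x54, 1, 1, 1, 1, 0, 0x003),  # indexed, base-relative
    "STX LENGTH": encode_format3(0x10, 1, 1, 0, 1, 0, 0x000),     # base-relative
    "J @RETADR": encode_format3(0x3C, 1, 0, 0, 0, 1, 0x003),      # indirect, PC-relative
    "+JSUB RDREC": encode_format4(0x48, 1, 1, 0, 0x01036),        # extended format
}
```

Immediate addressing sets only i, indirect addressing sets only n, and simple addressing sets both, which is why the first byte of LDB #LENGTH is 69 while that of J @RETADR is 3E.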

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.
The system must be able to load programs into memory wherever there is room.
The exact starting address of the program is not known until load time.

Absolute address

An absolute program has its starting address specified at assembly time.
The addresses in it may be invalid if the program is loaded somewhere else.

Example

The statement in this example loads register A with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address is also displaced by the same amount as the program. Hence we need to change the address portion of the instruction so that the program can be loaded and executed at location 3000.

Some instructions undergo a change in their operand address as the program load address changes; other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example: Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.
b) The instruction JSUB is loaded at location 0006.
c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address in the JSUB instruction is modified to the new location 6036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified:

o Operands that are not memory addresses (immediate addressing)

o PC-relative and base-relative operands

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader which fields are addresses.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

One Modification record is produced for each address to be modified.

The length is stored in half-bytes (4 bits each).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.
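The effect of one Modification record on the loaded program can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `apply_modification` and the use of a `bytearray` to stand for memory are hypothetical, and the sketch simply adds the load address to the address field, as described above for the JSUB instruction.

```python
def apply_modification(memory: bytearray, start: int, half_bytes: int, load_addr: int) -> None:
    """Add load_addr to the address field described by one Modification record.

    start      -- location of the byte holding the leftmost bits of the field
    half_bytes -- length of the address field in half-bytes (4 bits each)
    """
    n_bytes = (half_bytes + 1) // 2              # bytes that contain the field
    field = int.from_bytes(memory[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1           # selects only the address bits
    new_addr = ((field & mask) + load_addr) & mask
    field = (field & ~mask) | new_addr           # keep the flag bits untouched
    memory[start:start + n_bytes] = field.to_bytes(n_bytes, "big")


# JSUB RDREC assembled at location 0006 with START 0 (object code 4B101036)
memory = bytearray(16)
memory[6:10] = bytes.fromhex("4B101036")
apply_modification(memory, start=7, half_bytes=5, load_addr=0x5000)
print(memory[6:10].hex().upper())                # 4B106036
```

With a field of 5 half-bytes, the upper half of the first byte (the flag bits of the instruction) is left unchanged, which is why the length is counted in half-bytes rather than bytes.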

Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object program are the Modification records for those instructions that must change if relocation occurs. For example, M00000705 specifies that the address field starting at location 000007 must be modified and that the field is 5 half-bytes long.

Machine independent assembler features

Assembler features not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45 001A ENDFIL LDA =C'EOF' 032010

002D =C'EOF' 454F46

215 1062 WLOOP TD =X'05' E32011

1076 =X'05' 05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93 LTORG

002D =C'EOF' 454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for the current value of the location counter

• The value of the location counter can be denoted by a literal operand:

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparing the character strings that define them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal value, as =C'EOF' and =X'454F46' do. Recognizing such operands complicates the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

For each literal, LITTAB contains the literal name, its value and length, together with the address assigned to it.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that has not yet been placed.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
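The Pass 1 handling of literals described above can be sketched in Python as follows. This is a minimal illustration, not a complete assembler: the class name `LitTab` and its methods are hypothetical, only =C'...' and =X'...' literals are handled, and duplicates are recognized by comparing the literal names as character strings.

```python
def literal_length(name: str) -> int:
    """Length in bytes of a literal written as =C'...' or =X'...'."""
    kind, body = name[1], name[3:-1]
    if kind == "C":
        return len(body)            # one byte per character
    if kind == "X":
        return len(body) // 2       # two hex digits per byte
    raise ValueError(f"unsupported literal: {name}")


class LitTab:
    def __init__(self):
        # literal name -> {"length": bytes occupied, "address": None until placed}
        self.entries = {}

    def add(self, name: str) -> None:
        """Pass 1: enter a literal without an address; duplicates are ignored."""
        if name not in self.entries:
            self.entries[name] = {"length": literal_length(name), "address": None}

    def place_pool(self, locctr: int) -> int:
        """On LTORG or END: assign addresses to unplaced literals, return new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LitTab()
littab.add("=C'EOF'")
littab.add("=C'EOF'")               # duplicate, not stored again
locctr = littab.place_pool(0x002D)  # LTORG reached with LOCCTR = 002D
print(hex(littab.entries["=C'EOF'"]["address"]), hex(locctr))   # 0x2d 0x30
```

Because the name is the key, =C'EOF' and =X'454F46' would be stored as two separate literals here, which is the simple behaviour the notes describe.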

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

Symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB and assigns it the specified value.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. For example, instead of

+LDT #4096

we can write

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers:

A EQU 0

X EQU 1

L EQU 2

so that RMO 0,1 can be written as RMO A,X.

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

ORG value

"Value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table STAB of 100 entries with the following structure:

Field   SYMBOL    VALUE     FLAGS
Size    6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program.

Allowed:

ALPHA RESW 1

BETA EQU ALPHA

The following sequences cannot be processed by an ordinary two-pass assembler.

Disallowed (ALPHA is used before it is defined):

BETA EQU ALPHA

ALPHA RESW 1

Disallowed (a chain of forward references):

ALPHA EQU BETA

BETA EQU DELTA

DELTA RESW 1

Disallowed (ORG refers to ALPHA, which is defined later):

ORG ALPHA

BYTE1 RESB 1

BYTE2 RESB 1

BYTE3 RESB 1

ORG

ALPHA RESB 1
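The restriction on EQU can be illustrated with a short Python sketch of how Pass 1 might handle the directive. The function name `define_equ` is hypothetical, SYMTAB is modelled as a plain dictionary, and the operand is assumed to be either a decimal constant or a single symbol (no expressions).

```python
def define_equ(symtab: dict, symbol: str, operand: str) -> None:
    """Pass 1 handling of 'symbol EQU operand' in an ordinary two-pass assembler."""
    if operand.isdigit():
        value = int(operand)                  # numeric constant
    elif operand in symtab:
        value = symtab[operand]               # previously defined symbol
    else:
        # forward reference: the value cannot be determined during Pass 1
        raise ValueError(f"{symbol} EQU {operand}: {operand} is not yet defined")
    symtab[symbol] = value


symtab = {"ALPHA": 0x0030}                    # ALPHA RESW 1 already processed
define_equ(symtab, "BETA", "ALPHA")           # allowed
define_equ(symtab, "MAXLEN", "4096")          # allowed
```

Calling `define_equ({}, "BETA", "ALPHA")` raises an error, which corresponds to the first disallowed sequence above.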

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as

1 Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2 Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
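The pairing rule can be checked mechanically by counting relative terms with their signs. The Python sketch below is an illustration only: the function name `expression_type` is hypothetical, symbol types are given in a dictionary (as in the Type column above), and parentheses are not supported.

```python
import re


def expression_type(expr: str, symtypes: dict) -> str:
    """Return 'A' (absolute) or 'R' (relative); raise ValueError otherwise."""
    net_relative = 0                               # +1 / -1 for each relative term
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        factors = re.split(r"[*/]", term)
        relative = [f for f in factors if symtypes.get(f) == "R"]
        if relative and len(factors) > 1:
            raise ValueError(f"relative term in multiplication or division: {term}")
        if relative:
            net_relative += -1 if sign == "-" else 1
    if net_relative == 0:
        return "A"                                 # all relative terms are paired
    if net_relative == 1:
        return "R"                                 # one unpaired positive term
    raise ValueError(f"neither absolute nor relative: {expr}")


types = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}
print(expression_type("BUFEND-BUFFER", types))     # A
print(expression_type("BUFFER+3", types))          # R
```

The three illegal expressions listed above (BUFEND+BUFFER, 100-BUFFER and 3*BUFFER) all raise an error in this sketch.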

Program blocks

The program is divided into three blocks: Default (0), CDATA (1) and CBLKS (2).

Default (0): block for the instructions of the program

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large blocks of memory

One more column, the block number, is included in the listing when object code is generated. Whenever a new block begins, the programmer writes USE followed by the block name, and the location counter of a block that is used for the first time is set to 0000.

The block number is also added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address and length.

After finding the location counter values, the assembler generates the object code and writes the object program.

Program Block Diagram

Control section and program linking

Control section

i A control section is a part of the program that maintains its identity after assembly.

ii Each control section can be loaded and relocated independently of the others.

iii Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for instructions that refer to EXTREF symbols changes.

The object program now needs six types of records: Header record, Define record, Refer record, Text record, Modification record and End record.

Object Program for Control Section and Program Linking

The main routine calls two subroutines:

RDREC - To read a record into a buffer

WRREC - To write the record from the buffer to the output device

The end of each record is marked with a null character (hexadecimal 00).

2.1.1 A Simple SIC Assembler

Figure 2.2 shows the same program as in Figure 2.1, with the generated object code for each statement.

Figure 2.2 Assembler language program for basic SIC version with object code

1 The column "Loc" in the example program gives the machine address, in hexadecimal, of each line of the assembled program.

2 The program starts at address 1000. This is only an assumption: for the simple SIC machine a starting address is always assumed, and here it is not 0.

3 The translation of the source program to object code requires the following functions:

a Convert mnemonic operation codes to their machine language equivalents. E.g. translate STL to 14 (line 10).

b Convert symbolic operands to their equivalent machine addresses. E.g. translate RETADR to 1033 (line 10).

c Build the machine instructions in the proper format.

d Convert the data constants specified in the source program into their internal machine representations. E.g. translate EOF to 454F46 (line 80).

e Write the object program and the assembly listing.

4 All of these functions except function b can be accomplished by sequential processing of the source program, one line at a time.

Consider the statement

10 1000 FIRST STL RETADR 141033

5 This instruction contains a forward reference, i.e. a reference to a label (RETADR) that is defined later in the program.

6 The assembler cannot process this line in a single sequential scan, because the address that will be assigned to RETADR is not yet known.

7 Hence most assemblers make two passes over the source program.

8 The first pass does little more than scan the source program for label definitions and assign addresses, such as those in the Loc column.

9 The second pass performs most of the actual translation.

10 The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values.

11 The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.

The object program format contains three types of records:

Header record: Contains the program name, starting address and length.

Text record: Contains the machine code and data of the program.

End record: Marks the end of the object program and specifies the address in the program where execution is to begin.

The record formats are as follows:

Header record

Col 1 H

Col 2-7 Program name

Col 8-13 Starting address of object program

Col 14-19 Length of object program in bytes

Text record

Col 1 T

Col 2-7 Starting address for object code in this record

Col 8-9 Length of object code in this record in bytes

Col 10-69 Object code, represented in hexadecimal (2 columns per byte of object code)

End record

Col 1 E

Col 2-7 Address of first executable instruction in object program
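The column layout above maps directly onto fixed-width string formatting. The following Python sketch shows one way to build the three records; the function names are hypothetical, and addresses and lengths are assumed to be written in hexadecimal, as in the object code column.

```python
def header_record(name: str, start: int, length: int) -> str:
    """Col 1 'H', col 2-7 program name, col 8-13 start address, col 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start: int, object_codes: list) -> str:
    """Col 1 'T', col 2-7 start address, col 8-9 length in bytes, col 10-69 object code."""
    body = "".join(object_codes)
    if len(body) > 60:                      # at most 30 bytes fit in columns 10-69
        raise ValueError("object code does not fit into one Text record")
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_executable: int) -> str:
    """Col 1 'E', col 2-7 address of the first executable instruction."""
    return f"E{first_executable:06X}"


print(header_record("COPY", 0x1000, 0x107A))        # HCOPY  00100000107A
print(text_record(0x1000, ["141033", "482039"]))    # T00100006141033482039
print(end_record(0x1000))                           # E001000
```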

2.1.1.1 The assembler design can be done as a

Single-pass assembler

Multi-pass assembler

Single-pass Assembler

In a single-pass assembler the whole process of scanning, parsing and object code conversion is done in a single pass.

The only problem with this method is resolving forward references.

The problem of forward references is shown with the example below:

10 1000 FIRST STL RETADR 141033

--

--

--

--

95 1033 RETADR RESW 1

In the above example, the instruction STL on line 10 stores the contents of the linkage register in RETADR. But while this instruction is being processed, the value of this symbol is not known, because it is defined on line 95.

In a single-pass assembler, scanning, parsing and object code conversion happen simultaneously. The instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. The object code is generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available.

For this reason the design is usually done in two passes. A multi-pass assembler first resolves the forward references and then converts the program into object code. Hence the process of the multi-pass assembler can be as follows:

Pass-1

Assign addresses to all the statements.

Save the addresses assigned to all labels, to be used in Pass-2.

Perform some processing of assembler directives such as RESW and RESB, to find the length of data areas for assigning the address values.

Define the symbols in the symbol table (generate the symbol table).

Pass-2

Assemble the instructions (translating operation codes and looking up addresses).

Generate data values defined by BYTE, WORD etc.

Perform the processing of the assembler directives not done during Pass-1.

Write the object program and assembly listing.

2.1.1.2 Assembler Design

The most important things to concentrate on are the generation of the symbol table and the resolution of forward references.

• Symbol Table

- This is created during Pass 1.

- All the labels of the instructions are symbols.

- The table has an entry for each symbol name with its address value.

• Forward reference

- A reference to a symbol that is defined in a later part of the program is called a forward reference.

- There is no address value for such a symbol in the symbol table until Pass 1 reaches its definition.

2.1.1.3 Functions of the two passes of assembler

Pass 1 (Define symbols)

1 Assign addresses to all statements in the program

2 Save the addresses assigned to all labels for use in Pass 2

3 Perform some processing of assembler directives

Pass 2 (Assemble instructions and generate object programs)

1 Assemble instructions (translating operation codes and looking up addresses)

2 Generate data values defined by BYTE, WORD etc.

3 Perform processing of assembler directives not done in Pass 1

4 Write the object program and the assembly listing

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures, together with a location counter:

1 Operation Code Table (OPTAB): Used to look up mnemonic operation codes and translate them into their machine language equivalents.

2 Symbol Table (SYMTAB): Used to store values (addresses) assigned to labels.

Location Counter (LOCCTR)

It is a variable used to help in the assignment of addresses.

It is initialized to the beginning address specified in the START statement.

After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.

Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

Contains the mnemonic operation code and its machine language equivalent.

Also contains information about instruction format and length.

In Pass 1, OPTAB is used to look up and validate operation codes in the source program.

In Pass 2, it is used to translate the operation codes to machine language.

During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction.

OPTAB is usually organized as a hash table, with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.

In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it.

Symbol Table (SYMTAB)

Includes the name and value (address) for each label in the source program, and flags to indicate error conditions (e.g. a symbol defined in two different places).

During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.

During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations that may be performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

begin
  read first input line
  if OPCODE = 'START' then
    begin
      save [OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end (if START)
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end (if symbol)
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * [OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add [OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end (if BYTE)
          else
            set error flag (invalid operation code)
        end (if not a comment)
      write line to intermediate file
      read next input line
    end (while not END)
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end (Pass 1)

Explanation - PASS I Algorithm

The algorithm scans the first statement, START, and saves the operand field (the address) as the starting address of the program. It initializes LOCCTR to this address. This line is written to the intermediate file.

If there is no START statement, LOCCTR is initialized to zero.

If a label is encountered, the symbol has to be entered in the symbol table along with its associated address value. If the symbol is already present in the table, an error flag is set to indicate a duplicate symbol.

It next searches OPTAB for the mnemonic code. If it is found, the length of the instruction is added to LOCCTR to make it point to the next instruction.

If the opcode is the directive WORD, it adds 3 to LOCCTR.

If it is RESW, it adds 3 * the number of data words to LOCCTR.

If it is BYTE, it adds the length of the constant in bytes to LOCCTR; if it is RESB, it adds the number of bytes reserved.

If it is the END directive, the end of the program has been reached. The length of the program is found by evaluating (current LOCCTR - starting address), where the starting address is the one saved from the START statement.

Each processed line is written to the intermediate file.
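The Pass 1 algorithm can be tried out with a small Python sketch. It is a simplified illustration under several assumptions: source lines are already split into (label, opcode, operand) tuples, comment lines are omitted, OPTAB holds only a handful of SIC mnemonics, and the function name `pass1` and its return values are hypothetical.

```python
OPTAB = {"STL": 0x14, "LDA": 0x00, "STA": 0x0C, "JSUB": 0x48, "RSUB": 0x4C}  # subset only


def constant_length(operand: str) -> int:
    """Length in bytes of a BYTE constant written as C'...' or X'...'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else len(body) // 2


def pass1(source):
    """Return (SYMTAB, program length, intermediate lines, errors)."""
    symtab, errors, intermediate = {}, [], []
    start = locctr = 0
    for i, (label, opcode, operand) in enumerate(source):
        if i == 0 and opcode == "START":
            start = locctr = int(operand, 16)          # operand is a hex address
            intermediate.append((locctr, label, opcode, operand))
            continue
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += constant_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, locctr - start, intermediate, errors


program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("", "LDA", "LENGTH"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("LENGTH", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    ("", "END", "FIRST"),
]
symtab, length, _, errors = pass1(program)
print({name: hex(addr) for name, addr in symtab.items()}, hex(length), errors)
```

The forward reference to RETADR on the second line causes no difficulty here, because Pass 1 only assigns addresses and never needs the value of an operand symbol.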

The Algorithm for Pass 2

begin
  read first input line from intermediate file
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end (if START)
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end (if symbol)
              else
                store 0 as operand address
              assemble the object code instruction
            end (if opcode found)
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code does not fit into current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end (if not comment)
      write listing line
      read next input line
    end (while not END)
  write last Text record to object program
  write End record to object program
  write last listing line
end (Pass 2)

Explanation - PASS II Algorithm

Here the first input line is read from the intermediate file.

If the opcode is START, this line is written directly to the listing file.

A Header record is written to the object program. It gives the starting address and the length of the program (which was calculated during Pass 1).

Then the first Text record is initialized. Comment lines are ignored.

For each instruction, OPTAB is searched to find the object code of the opcode.

If there is a symbol in the operand field, the symbol table is searched to get its address value, which is combined with the object code of the opcode.

If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to indicate an undefined symbol.

If there is no symbol in the operand field at all, 0 is stored as the operand address and the object code instruction is assembled.

If the opcode is BYTE or WORD, the constant value is converted to its equivalent object code (for example, for the character constant EOF the equivalent hexadecimal value 454F46 is stored).

If the object code cannot fit into the current Text record, that record is written out, a new Text record is created, and the object code of the remaining instructions is placed in it.

The Text records are written to the object program. Once the whole program is assembled and the END directive is encountered, the End record is written.
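The central step of Pass 2, building the object code of one statement, can be sketched in Python as follows. The sketch assumes the standard SIC instruction layout (8-bit opcode, 1 index bit, 15-bit address); the function names `assemble_sic` and `byte_constant` are hypothetical, and SYMTAB is modelled as a dictionary produced by Pass 1.

```python
def assemble_sic(opcode_value: int, operand, symtab: dict, indexed: bool = False):
    """Return (object code as 6 hex digits, error message or None)."""
    error = None
    if operand is None:                 # no symbol in the operand field, e.g. RSUB
        address = 0
    elif operand in symtab:
        address = symtab[operand]
    else:                               # undefined symbol: store 0 and flag the error
        address, error = 0, f"undefined symbol {operand}"
    word = (opcode_value << 16) | (0x8000 if indexed else 0) | address
    return f"{word:06X}", error


def byte_constant(operand: str) -> str:
    """Object code for a BYTE constant written as C'...' or X'...'."""
    body = operand[2:-1]
    if operand[0] == "C":
        return "".join(f"{ord(ch):02X}" for ch in body)
    return body.upper()


symtab = {"RETADR": 0x1033, "BUFFER": 0x1039}
print(assemble_sic(0x14, "RETADR", symtab))                # ('141033', None)
print(assemble_sic(0x50, "BUFFER", symtab, indexed=True))  # ('509039', None)
print(byte_constant("C'EOF'"))                             # 454F46
```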

Design and Implementation Issues

Some of the features of an assembler depend on the architecture of the machine. If the program is for the SIC machine, we have only limited instruction formats and hence limited addressing modes. We have only single-operand instructions, and the operand is always a memory reference. Anything that must be fetched from memory requires more time. Hence the improved SIC/XE machine provides more instruction formats and more addressing modes. The moment we change the machine architecture, the number of instruction formats and addressing modes available changes. Therefore the design usually requires considering two things: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered.

200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X           CLEAR LOOP COUNTER
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT      TEST OUTPUT DEVICE
220           JEQ     WLOOP       LOOP UNTIL READY
225           LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230           WD      OUTPUT      WRITE CHARACTER
235           TIXR    T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program of SIC/XE

The above figure shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Label   Opcode  Operand
5     COPY    START   0
10    FIRST   STL     RETADR
12            LDB     #LENGTH
13            BASE    LENGTH
15    CLOOP   +JSUB   RDREC
20            LDA     LENGTH
25            COMP    #0
30            JEQ     ENDFIL
35            +JSUB   WRREC
40            J       CLOOP
45    ENDFIL  LDA     EOF
50            STA     BUFFER
55            LDA     #3
60            STA     LENGTH
65            +JSUB   WRREC
70            J       @RETADR
80    EOF     BYTE    C'EOF'
95    RETADR  RESW    1
100   LENGTH  RESW    1
105   BUFFER  RESB    4096
110   .
115   .       SUBROUTINE TO READ RECORD INTO BUFFER
120   .
125   RDREC   CLEAR   X
130           CLEAR   A
132           CLEAR   S
133           +LDT    #4096
135   RLOOP   TD      INPUT
140           JEQ     RLOOP
145           RD      INPUT
150           COMPR   A,S
155           JEQ     EXIT
160           STCH    BUFFER,X
165           TIXR    T
170           JLT     RLOOP
175   EXIT    STX     LENGTH
180           RSUB
185   INPUT   BYTE    X'F1'
195   .
200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT
220           JEQ     WLOOP
225           LDCH    BUFFER,X
230           WD      OUTPUT
235           TIXR    T
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly, it supports only one instruction format. Its instructions use the accumulator (register A) as the implicit operand register and the index register (X) for indexing. Therefore the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). This machine supports four different instruction formats:

1 byte instruction

2 byte instruction

3 byte instruction

4 byte instruction

Instructions can be:

Instructions involving register-to-register operations

Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

Extended instruction format

Addressing modes are:

PC-relative or base-relative addressing: op m

Indirect addressing: op @m

Immediate addressing: op #c

Extended format: +op m

Indexed addressing: op m,x

Register-to-register instructions

Larger memory -> multiprogramming (program allocation)

Translation

1 Translation of instructions involving register-to-register addressing

During Pass 1 the registers can be entered as part of the symbol table itself. The value of each register is its equivalent numeric code. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.

2 Translation involving register-memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways:

1) Program-counter relative and

2) Base-relative

A register-memory instruction can be assembled in either format 3 or format 4.

In format 3 the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4 the instruction contains the opcode followed by a 20-bit address in the address field.

1. Program-Counter Relative

a) This mode normally uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed 12-bit value in the range -2048 to +2047. It is added to the current contents of the program counter to get the target address of the operand required by the instruction:

TA = (PC) + displacement value

d) The address of the operand is thus calculated relative to the program counter, which already points to the next instruction when the displacement is applied.

e) For example, take the instruction LDA LENGTH at location 000A (object code 032026). LENGTH is at 0033 and the PC holds 000D after the instruction is fetched, so disp = 0033 - 000D = 026.
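The PC-relative calculation can be sketched in Python as below. The function names are hypothetical, the sketch covers only simple addressing (n = 1, i = 1) without indexing, and the opcode values 00 (LDA) and 3C (J) are taken from the SIC/XE instruction set.

```python
def pc_relative_disp(target: int, loc: int, length: int = 3) -> int:
    """Displacement from the PC (address of the next instruction) to target."""
    disp = target - (loc + length)
    if not -2048 <= disp <= 2047:
        raise ValueError("target is out of PC-relative range")
    return disp


def assemble_format3_pc(opcode: int, target: int, loc: int) -> str:
    """Object code of a format 3 instruction with n=1, i=1 and p=1."""
    disp = pc_relative_disp(target, loc) & 0xFFF   # 12-bit two's complement
    first_byte = opcode | 0b11                     # n = 1, i = 1
    xbpe = 0b0010                                  # only the p bit is set
    return f"{first_byte:02X}{xbpe:X}{disp:03X}"


# LDA LENGTH at location 000A, with LENGTH at 0033
print(assemble_format3_pc(0x00, 0x0033, 0x000A))
```

A backward reference gives a negative displacement, which is stored in 12-bit two's complement form; masking with `0xFFF` does that conversion.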


2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register B. Therefore the target address is

TA = (B) + displacement value

where the displacement is an unsigned 12-bit value in the range 0 to 4095.

b) This addressing mode is used when the target is too far from the instruction for the PC-relative displacement to fit in 12 bits. The operand address is then no longer relative to the instruction, as it is in PC-relative addressing mode. Because the assembler cannot know what the program will place in register B, the programmer states it with the directive BASE. From that point on the assembler may use base-relative addressing to calculate the target address of an operand.

c) The NOBASE directive tells the assembler that the base register can no longer be relied on to calculate the target address of an operand. The assembler first tries PC-relative addressing; only when the displacement does not fit does it use base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE


For example

Line  Loc   Label   Opcode  Operand    Object code
12    0003          LDB     #LENGTH    69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X   57C003
165   1051          TIXR    T          B850

In the above example the directive BASE tells the assembler that base-relative addressing may be used to calculate target addresses when PC-relative addressing is not possible.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For the STCH instruction on line 160:

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 - 0033 = (0003)16

20    000A          LDA     LENGTH     032026

175   1056  EXIT    STX     LENGTH     134000


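Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = -1026 (hexadecimal), which is EFDA in 16-bit two's complement

This value does not fit in the 12-bit displacement field, so PC-relative addressing is not applicable and the assembler uses base-relative addressing instead: disp = TA - (B) = 0033 - 0033 = 000.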
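The order in which the assembler tries the two modes can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, only simple addressing (n = 1, i = 1) is handled, and `base` is `None` when no BASE directive is in effect.

```python
from typing import Optional


def format3_object_code(opcode: int, target: int, loc: int,
                        base: Optional[int], indexed: bool = False) -> str:
    """Try PC-relative addressing first, then base-relative addressing."""
    pc = loc + 3                           # PC points to the next instruction
    x_bit = 0b1000 if indexed else 0
    disp = target - pc
    if -2048 <= disp <= 2047:
        flags = x_bit | 0b0010             # p = 1
    elif base is not None and 0 <= target - base <= 4095:
        disp = target - base
        flags = x_bit | 0b0100             # b = 1
    else:
        raise ValueError("displacement does not fit; format 4 is needed")
    return f"{opcode | 0b11:02X}{flags:X}{disp & 0xFFF:03X}"


# Line 160: STCH BUFFER,X at 104E with (B) = 0033
print(format3_object_code(0x54, 0x0036, 0x104E, 0x0033, indexed=True))
# Line 175: STX LENGTH at 1056 with (B) = 0033
print(format3_object_code(0x10, 0x0033, 0x1056, 0x0033))
```

Raising an error when neither mode fits mirrors the situation in which the programmer has to switch to the extended format with the + prefix.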

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the value loaded is the address of that symbol, and the instruction is assembled as immediate with PC-relative mode, as in LDB #LENGTH (object code 69202D) on line 12.
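For an immediate operand that is a plain constant, the assembly step can be sketched in Python as below. The function name is hypothetical and the sketch handles constants only, not symbols; the opcode values 00 (LDA) and 74 (LDT) come from the SIC/XE instruction set.

```python
def assemble_immediate(opcode: int, value: int, extended: bool = False) -> str:
    """Object code for an immediate constant operand (n = 0, i = 1)."""
    first_byte = (opcode & 0xFC) | 0b01        # n = 0, i = 1
    if extended:
        if not 0 <= value < (1 << 20):
            raise ValueError("constant does not fit in 20 bits")
        return f"{first_byte:02X}1{value:05X}"  # e = 1, 20-bit address field
    if not 0 <= value <= 4095:
        raise ValueError("constant does not fit in 12 bits")
    return f"{first_byte:02X}0{value:03X}"      # x, b, p, e all 0


print(assemble_immediate(0x00, 3))                    # LDA #3
print(assemble_immediate(0x74, 4096, extended=True))  # +LDT #4096
```

The constant 4096 is too large for the 12-bit field, which is why such an instruction is written with the + prefix.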


4. Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of a location that contains the address of the operand. The address of this location is found using PC-relative addressing mode.

For example, consider J @RETADR on line 70.

The instruction transfers control to the address stored at location RETADR, so RETADR in turn holds the address that is the real operand. If the address of RETADR is 0030, the target address is 0030 and the displacement stored in the instruction is then 003, as calculated above.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time

The system must be able to load programs into memory wherever there is room

The exact starting address of the program is not known until load time

Absolute Address

Program with starting address specified at assembly time

The address may be invalid if the program is loaded into somewhere else


Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then address 102D no longer contains the value that needs to be loaded into register A.

The address of the operand changes by the same amount as the load address of the program. Hence the address portion of the instruction must be changed so that the program can be loaded and executed at location 3000.

Some instructions have operand addresses that must change when the program load address changes. Other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler can identify for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.


Example of Program Relocation

1) The above diagram shows the concept of relocation

a) Initially the program is loaded at location 0000

b) The instruction JSUB is loaded at location 0006

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC

2) The second figure shows the program loaded at a new location, 5000. The address field of the JSUB instruction must then be changed to 06036, the new address of RDREC

3) Likewise the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC

The only parts of the program that require modification at load time are those that specify

direct addresses


The rest of the instructions need not be modified

o Not a memory address (immediate addressing)

o PC-relative or base-relative addressing

From the object program alone it is not possible to distinguish addresses from constants

o The assembler must keep some information to tell the loader

o The object program that contains the Modification records is called a relocatable program

The way to solve the relocation problem

For an address label its address is assigned relative to the start of the program(START 0)

Produce a Modification record to store the starting location and the length of the address

field to be modified

The command for the loader must also be a part of the object program

Modification record

One modification record for each address to be modified

The length is stored in half-bytes (4 bits)

The starting location is the location of the byte containing the leftmost bits of the address

field to be modified

If the field contains an odd number of half-bytes the starting location begins in the middle of

the first byte


Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for those instructions that need to change if relocation occurs. M00000705 is the Modification record for the JSUB instruction at location 0006: the address field to be modified starts at location 000007 and is 5 half-bytes long.
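What the loader does with such a record can be sketched in Python as follows. This is a sketch only: the function name `apply_modification` is an assumption, the program is held in a `bytearray` indexed from the start of the program, and the record is taken in the compact form shown above (M, 6-digit starting location, 2-digit length in half-bytes).

```python
def apply_modification(image: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field named by one Modification record."""
    if record[0] != "M":
        raise ValueError("not a Modification record")
    start = int(record[1:7], 16)          # location of the field in the program
    half_bytes = int(record[7:9], 16)     # length of the field in half-bytes
    nbytes = (half_bytes + 1) // 2        # an odd length begins mid-byte
    word = int.from_bytes(image[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1    # selects only the address field
    relocated = ((word & mask) + load_address) & mask
    word = (word & ~mask) | relocated     # bits outside the field are kept
    image[start:start + nbytes] = word.to_bytes(nbytes, "big")


# +JSUB RDREC (4B101036) sits at location 0006 of the program.
image = bytearray(6) + bytearray.fromhex("4B101036")
apply_modification(image, "M00000705", 0x5000)
print(image[6:10].hex().upper())
```

Because the field is 5 half-bytes long, the upper half of the byte at location 0007 (the flag bits of the instruction) is left untouched.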

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL  LDA     =C'EOF'    032010
      002D          =C'EOF'            454F46

215   1062  WLOOP   TD      =X'05'     E32011
      1076          =X'05'             05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93                  LTORG
      002D          =C'EOF'            454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point.

Literal for current value of location counter

• The value of the location counter can be denoted by the literal operand =*

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal values, as =C'EOF' and =X'454F46' do. Recognizing such operands as duplicates will complicate the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

For each literal used, LITTAB contains the literal name, the operand value and length, and the address assigned to the operand when it is placed in a literal pool.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address being assigned.

• When a LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that has not yet been placed in a pool.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered, to obtain the operand address.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
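The Pass 1 handling of literals can be sketched in Python as follows. The class and method names are hypothetical, a plain dictionary stands in for the hash table, and only =C'...' and =X'...' literals are decoded.

```python
def literal_bytes(name: str) -> bytes:
    """Decode a literal such as =C'EOF' or =X'05' into its byte value."""
    kind, body = name[1], name[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


class LiteralTable:
    """LITTAB keyed by literal name (a dict plays the role of the hash table)."""

    def __init__(self) -> None:
        self.entries = {}   # name -> {"value": bytes, "address": int or None}

    def reference(self, name: str) -> None:
        """Pass 1: record a literal operand; duplicates are stored only once."""
        if name not in self.entries:
            self.entries[name] = {"value": literal_bytes(name), "address": None}

    def place_pool(self, locctr: int) -> int:
        """At LTORG or END: assign addresses and return the updated LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += len(entry["value"])
        return locctr


littab = LiteralTable()
littab.reference("=C'EOF'")
littab.reference("=C'EOF'")           # duplicate, no new entry
print(hex(littab.place_pool(0x002D)))  # LOCCTR after the pool
```

Duplicates are recognized here by comparing literal names, which is the simple approach described above; =C'EOF' and =X'454F46' would therefore be stored as two separate entries.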

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB and assigns it the specified value.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values

+LDT #4096

can be written as

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers

A EQU 0

X EQU 1

L EQU 2

RMO A,X (equivalent to RMO 0,1)

Establish and use names that reflect the logical function of the registers in the program

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols

ORG value

"Value" is a constant or an expression involving constants and previously defined symbols

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value

Use ORG for label definition

Suppose that we want to define a table with the following structure

STAB            SYMBOL     VALUE      FLAGS
(100 entries)   6 bytes    3 bytes    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered so we can

write

ORG

to return to the normal use of LOCCTR
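The effect of ORG on label definition can be sketched in Python by simulating the location counter. The sketch assumes the usual pattern (reserve the table, ORG back to its start, define one field label per field, then ORG to the end of the table); the function name and the starting address used in the example call are hypothetical.

```python
def define_table_fields(stab_address, fields, entries):
    """Simulate: STAB RESB n / ORG STAB / field RESB len ... / ORG STAB+n."""
    entry_size = sum(length for _, length in fields)
    symtab = {"STAB": stab_address}
    locctr = stab_address                  # ORG STAB resets LOCCTR
    for name, length in fields:
        symtab[name] = locctr              # the label takes the current LOCCTR
        locctr += length
    locctr = stab_address + entries * entry_size   # ORG back past the table
    return symtab, locctr


symtab, locctr = define_table_fields(
    0x1000, [("SYMBOL", 6), ("VALUE", 3), ("FLAGS", 1)], entries=100)
for name, address in symtab.items():
    print(name, format(address, "04X"))
```

The field labels end up as offsets into the first entry, so an individual entry can be reached with indexed addressing, for example by placing the entry offset in register X.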

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

ALPHA RESW 1
BETA EQU ALPHA
Allowed

BETA EQU ALPHA
ALPHA RESW 1
Disallowed

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1
Disallowed

ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1
Disallowed

Expressions

Most assemblers allow the use of expressions wherever a single operand such as a label or literal is permitted

• Each such expression must be evaluated by the assembler to produce a single operand address or value

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /

• Individual terms in the expression may be

• constants

• user-defined symbols or

• special terms

• The most common special term is the current value of the location counter (often designated by *)

Types of terms

Absolute terms - The value of an absolute term is independent of program location

Relative terms - The value of a relative term is dependent on the beginning address of the

program

Types of expressions

By the type of value produced, expressions can be classified as

1. Absolute expressions

• The value of an absolute expression is independent of the program location

• An absolute expression may contain relative terms provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation

Expressions that are neither relative nor absolute should be flagged by the assembler as errors

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions
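The pairing rule can be sketched in Python by counting signed relative terms. This is a simplified illustration: the function name and the term representation are assumptions, only + and - are modeled, and a product such as 3*BUFFER would have to be written as three +BUFFER terms.

```python
# Type flag (A = absolute, R = relative) and value of each symbol, as in the table.
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def classify(terms):
    """terms: list of (sign, operand); sign is +1 or -1, operand a symbol or int.

    Returns (type, value); raises ValueError if the expression is neither
    absolute nor relative.
    """
    value = 0
    relative = 0                       # net count of unpaired relative terms
    for sign, operand in terms:
        kind, v = ("A", operand) if isinstance(operand, int) else SYMTAB[operand]
        value += sign * v
        if kind == "R":
            relative += sign
    if relative == 0:
        return ("A", value)
    if relative == 1:
        return ("R", value)
    raise ValueError("expression is neither absolute nor relative")


print(classify([(+1, "BUFEND"), (-1, "BUFFER")]))   # BUFEND-BUFFER
```

A net count of 0 means every relative term was paired with one of opposite sign; a net count of +1 means exactly one positive relative term is left over.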

Program blocks

The program is divided into three blocks: the default block (block number 0), CDATA (1) and CBLKS (2)

Default (0): block containing the executable instructions

CDATA (1): block containing data areas that are a few words or less in length

CBLKS (2): block containing data areas that consist of larger blocks of memory

The assembler directive USE followed by a block name indicates which block the following statements belong to. Each block has its own location counter, which is set to 0000 when the block is first encountered. One more column, the block number, is therefore included in the listing when the object code is generated, and this field is also added to SYMTAB

The program block table consists of the block name, block number, starting address and length

After the location counter values have been found, the object code is generated and the object program is written
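How the program block table is filled in at the end of Pass 1 can be sketched in Python as below. The function names are hypothetical, and the block lengths used in the example call are illustrative values only, not taken from the listing.

```python
def build_block_table(lengths):
    """lengths: list of (block name, length in bytes) in block-number order.

    Blocks are laid out one after another, so each block starts where the
    previous one ends.
    """
    table, address = [], 0
    for number, (name, length) in enumerate(lengths):
        table.append({"name": name, "number": number,
                      "address": address, "length": length})
        address += length
    return table


def absolute_address(table, block_number, relative_address):
    """Pass 2: turn a block-relative symbol value into a program address."""
    return table[block_number]["address"] + relative_address


# Illustrative lengths for the default, CDATA and CBLKS blocks.
table = build_block_table([("(default)", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
for row in table:
    print(row["name"], row["number"], format(row["address"], "04X"), format(row["length"], "04X"))
```

This is why SYMTAB stores the block number with each symbol: the value recorded in Pass 1 is relative to the start of its block and is only converted once the block starting addresses are known.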


Program Block Diagram


Control section and program linking

Control section

i A control section is a part of the program that maintains its identity after assembly

ii Each control section can be loaded and relocated independently of the others

iii Different control sections are most often used for subroutines or other logical

subdivisions of a program

Assembler directive CSECT

CSECT signals the start of a new control section

The assembler establishes a separate location counter (initialized to 0) for each control section

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere

Only the object code generated for instructions that refer to EXTREF symbols changes: the assembler cannot know their addresses, so it assembles an address of zero and leaves the final value to be filled in at load time through a Modification record

For the object program six record types are needed: Header record, Define record, Refer record, Text record, Modification record and End record

Object Program for Control Section and Program Linking

Figure 2.2 Assembler language program for basic SIC version with object code

1. The column "Loc" in the example program gives the machine address (in hexadecimal) of each line of the assembled program

2. The program starts at address 1000 (this is only an assumption; for the simple SIC machine a starting address is always assumed, and it need not be 0)

3. The translation of the source program to object code requires the following functions

a. Convert mnemonic operation codes to their machine language equivalents. E.g. translate STL to 14 (line 10)

b. Convert symbolic operands to their equivalent machine addresses. E.g. translate RETADR to 1033 (line 10)

c. Build the machine instructions in the proper format

d. Convert the data constants specified in the source program into their internal machine representations. E.g. translate EOF to 454F46 (line 80)

e. Write the object program and the assembly listing

4. All of these functions except (b) can be accomplished by sequential processing of the source program, one line at a time

Consider the statement

10 1000 FIRST STL RETADR 141033

5. This instruction contains a forward reference, i.e. a reference to a label (RETADR) that is defined later in the program

6. The assembler is unable to process this line in sequence because the address that will be assigned to RETADR is not yet known

7. Hence most assemblers make two passes over the source program

8. The first pass does little more than scan the source program for label definitions and assign addresses, such as those in the Loc column

9. The second pass performs most of the actual translation

10. The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values

11. The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution

Object program format contains three types of records

Header record Contains the program name starting address and length

Text record Contains the machine code and data of the program

End record Marks the end of the object program and specifies the address in the program

where execution is to begin

Record format is as follows

Header record

Col 1 H

Col2-7 Program name

Col8-13 Starting address of object program

Col14-19 Length of object program in bytes

Text record

Col1 T

Col2-7 Starting address for object code in this record

Col8-9 Length of object code in this record in bytes

Col 10-69 Object code represented in hexadecimal (2 columns per byte of object code)

End record

Col1 E

Col2-7 Address of first executable instruction in object program
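The column layout above maps directly onto fixed-width formatting, as the following Python sketch shows. The function names are hypothetical and the program length used in the example call is an illustrative value.

```python
def header_record(name: str, start: int, length: int) -> str:
    """Col 1 'H', col 2-7 program name, col 8-13 start, col 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start: int, object_code: str) -> str:
    """Col 1 'T', col 2-7 start address, col 8-9 length in bytes, then the code.

    object_code is a string of hex digits, 2 per byte, at most 60 digits.
    """
    if len(object_code) % 2 or len(object_code) > 60:
        raise ValueError("object code must be 1 to 30 whole bytes")
    return f"T{start:06X}{len(object_code) // 2:02X}{object_code}"


def end_record(first_instruction: int) -> str:
    """Col 1 'E', col 2-7 address of the first executable instruction."""
    return f"E{first_instruction:06X}"


print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x1000, "141033"))
print(end_record(0x1000))
```

The program name is padded with blanks (or cut) to exactly six columns, so every field always begins in the column given in the record format.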


2.1.1.1 The assembler design can be done as

Single-pass assembler

Multi-pass assembler

Single-pass Assembler

In a single-pass assembler the whole process of scanning, parsing and object code conversion is done in a single pass

The only problem with this method is resolving forward references

The problem of forward reference is shown with an example below

10 1000 FIRST STL RETADR 141033

--

--

95 1033 RETADR RESW 1

In the above example, the STL instruction on line 10 stores the contents of the linkage register into RETADR. But while this instruction is being processed, the value of this symbol is not known, as it is defined on line 95

In a single-pass assembler the scanning, parsing and object code conversion happen together. The instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. The object code is generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available

For this reason the design is usually done in two passes

A multi-pass assembler resolves the forward references first and then converts the program into object code. The process of the multi-pass assembler is as follows

Pass-1

Assign addresses to all the statements

Save the addresses assigned to all labels to be used in Pass-2

Perform some processing of assembler directives such as RESW RESB to find the length of

data areas for assigning the address values

Defines the symbols in the symbol table(generate the symbol table)

Pass-2

Assemble the instructions (translating operation codes and looking up addresses)

Generate data values defined by BYTE WORD etc

Perform the processing of the assembler directives not done during pass-1

Write the object program and assembly listing

2.1.1.2 Assembler Design

The most important things to concentrate on are the generation of the symbol table and the resolution of forward references

• Symbol Table

- This is created during Pass 1

- All the labels of the instructions are symbols

- The table has an entry for each symbol name and its address value

• Forward reference

- A reference to a symbol that is defined later in the program is called a forward reference

- There will not be any address value for such a symbol in the symbol table until its definition is reached in Pass 1

2.1.1.3 Functions of the two passes of assembler

Pass 1 (Define symbols)

1 Assign addresses to all statements in the program

2 Save the addresses assigned to all labels for use in Pass 2

3 Perform some processing of assembler directives

Pass 2 (Assemble instructions and generate object program)

1 Assemble instructions (translating operation codes and looking up addresses)

2 Generate data values defined by BYTE, WORD etc

3 Perform processing of assembler directives not done in Pass 1

4 Write the object program and the assembly listing

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures and one variable

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents

2. Symbol Table (SYMTAB): used to store values (addresses) assigned to labels

3. Location Counter (LOCCTR)

It is a variable used to help in the assignment of addresses

It is initialized to the beginning address specified in the START statement

After each source statement is processed the length of the assembled instruction or data area

to be generated is added to LOCCTR

Whenever a label is reached in the source program the current value of LOCCTR gives the

address to be associated with that label

Operation Code Table (OPTAB)

Contains the mnemonic operation and its machine language equivalent

Also contains information about instruction format and length

In Pass 1 OPTAB is used to lookup and validate operation codes in the source program

In Pass 2 it is used to translate the operation codes to machine language program

During Pass 2 the information in OPTAB tells which instruction format to use in assembling

the instruction and any peculiarities of the object code instruction

OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching

In most cases OPTAB is a static table, i.e. entries are not normally added to or deleted from it

Symbol Table (SYMTAB)

Includes the name and value (address) for each label in the source program, together with flags to indicate error conditions (e.g. a symbol defined in two different places)

During Pass 1 of the assembler labels are entered into SYMTAB as they are encountered in

the source program along with their assigned addresses

During Pass 2 symbols used as operands are looked up in SYMTAB to obtain the addresses

to be inserted in the assembled instructions

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address, error indicators, etc. This file is used as the input to Pass 2

This working copy of the source program can also be used to retain the results of certain operations performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2

The Algorithm for Pass 1

Begin
  read first input line
  if OPCODE = 'START' then
    begin
      save [OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * [OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add [OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
End {Pass 1}

Explanation - PASS I Algorithm

The algorithm reads the first statement, START, and saves the operand field (the address) as the starting address of the program. It initializes LOCCTR to this address. This line is written to the intermediate file

If there is no START statement, LOCCTR is initialized to zero

If a label is encountered, the symbol has to be entered in the symbol table along with its associated address value. If the symbol already exists in the table, an error flag is set to indicate a duplicate symbol

Next the algorithm searches OPTAB for the mnemonic code. If it is found, the length of the instruction is added to LOCCTR to make it point to the next instruction

If the opcode is the directive WORD, it adds 3 to LOCCTR

If it is RESW, it adds 3 times the number of words reserved to LOCCTR

If it is BYTE, it adds the length of the constant in bytes to LOCCTR; if it is RESB, it adds the number of bytes reserved

If it is the END directive, the end of the program has been reached. The length of the program is found by evaluating current LOCCTR minus the starting address saved from the START statement

Each processed line is written to the intermediate file
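The Pass 1 steps explained above can be sketched in Python for the basic SIC machine as follows. This is a simplified sketch under stated assumptions: the source lines are already split into (label, opcode, operand) tuples with comment lines removed, OPTAB holds only a small subset of mnemonics, and the intermediate file is kept as a list in memory.

```python
OPTAB = {"STL": 0x14, "JSUB": 0x48, "LDA": 0x00, "RSUB": 0x4C}   # subset only


def byte_length(operand: str) -> int:
    """Length in bytes of a BYTE constant such as C'EOF' or X'05'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else len(body) // 2


def pass1(lines):
    """Return (SYMTAB, intermediate lines, program length, error messages)."""
    symtab, intermediate, errors = {}, [], []
    start = locctr = 0
    if lines and lines[0][1] == "START":
        start = locctr = int(lines[0][2], 16)
        intermediate.append((locctr, *lines[0]))
        lines = lines[1:]
    for label, opcode, operand in lines:
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, intermediate, locctr - start, errors


source = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("", "END", "FIRST"),
]
symtab, _, length, errors = pass1(source)
print({name: format(addr, "04X") for name, addr in symtab.items()}, length, errors)
```

The forward reference to RETADR on the second line causes no difficulty here, because Pass 1 only assigns addresses and never looks operands up.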

The Algorithm for Pass 2

Begin
  read first input line (from intermediate file)
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
End {Pass 2}

Explanation - PASS II Algorithm

Here the first input line is read from the intermediate file

If the opcode is START, this line is written directly to the listing file

A Header record is written to the object program, giving the starting address and the length of the program (which was calculated during Pass 1)

Then the first Text record is initialized. Comment lines are ignored

For each instruction, OPTAB is searched to find the machine code of the opcode

If there is a symbol in the operand field, the symbol table is searched to get its address value, which is combined with the machine code of the opcode

If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to indicate an undefined symbol

If there is no symbol in the operand field, 0 is stored as the operand address and the object code instruction is assembled

If the opcode is BYTE or WORD, the constant value is converted to its equivalent object code (for example, for the character constant EOF the equivalent hexadecimal value 454F46 is stored)

If the object code cannot fit into the current Text record, that record is written out, a new Text record is started, and the object code is added to it

The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written
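The per-line work of Pass 2 can be sketched in Python for the basic SIC instruction format (8-bit opcode, index bit, 15-bit address). The function name is hypothetical, OPTAB is a small subset, the address used for BUFFER in the example is an illustrative value, and the grouping of object code into Text records is left out.

```python
OPTAB = {"STL": 0x14, "LDCH": 0x50, "RSUB": 0x4C}   # subset only


def assemble_line(opcode, operand, symtab, errors):
    """Return the object code (hex string) for one line of the intermediate file."""
    if opcode in OPTAB:
        address = 0                           # default when there is no operand
        if operand:
            symbol, _, index = operand.partition(",")
            if symbol in symtab:
                address = symtab[symbol]
            else:
                errors.append(f"undefined symbol {symbol}")   # address stays 0
            if index.strip() == "X":
                address |= 0x8000             # set the index bit
        return f"{OPTAB[opcode]:02X}{address:04X}"
    if opcode == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"
    if opcode == "BYTE":
        body = operand[2:-1]
        return body.encode("ascii").hex().upper() if operand[0] == "C" else body.upper()
    return ""                                 # directives such as RESW, RESB


symtab = {"RETADR": 0x1033, "BUFFER": 0x1039}
errors = []
print(assemble_line("STL", "RETADR", symtab, errors))
print(assemble_line("BYTE", "C'EOF'", symtab, errors))
```

An empty string is returned for directives that generate no object code, which is also the point at which a real assembler would close the current Text record.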


Design and Implementation Issues

Some of the features of an assembler depend on the architecture of the machine. If the program is for the SIC machine, there are only a limited number of instruction formats and hence limited addressing modes. There are only single-operand instructions, and the operand is always a memory reference. Anything that has to be fetched from memory takes more time. The improved SIC/XE machine therefore provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the number of instruction formats and addressing modes available changes with it. The design of an assembler therefore has to consider two kinds of features: machine-dependent features and machine-independent features

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered

LALITHAMBIGAIB APIT Page 18

SYSTEM SOFTWARE ( CS 2304) UNIT - II

200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    CLEAR   X           CLEAR LOOP COUNTER
212            LDT     LENGTH
215   WLOOP    TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIXR    T           LOOP UNTIL ALL CHARACTERS
240            JLT     WLOOP       HAVE BEEN WRITTEN
245            RSUB
250   OUTPUT   BYTE    X'05'
255            END     FIRST

Figure 2.4 Example program for SIC/XE


The figure above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

• Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

• Immediate operands are denoted with the prefix # (lines 25, 55, 133).

• Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.

• The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

• The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

• Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

• Immediate and indirect addressing have also been used as much as possible.

• Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

• With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

• The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label    Opcode  Operand     Object Code

5           COPY     START   0
10          FIRST    STL     RETADR
12                   LDB     #LENGTH
13                   BASE    LENGTH
15          CLOOP    +JSUB   RDREC
20                   LDA     LENGTH
25                   COMP    #0
30                   JEQ     ENDFIL
35                   +JSUB   WRREC
40                   J       CLOOP
45          ENDFIL   LDA     EOF
50                   STA     BUFFER
55                   LDA     #3
60                   STA     LENGTH
65                   +JSUB   WRREC
70                   J       @RETADR
80          EOF      BYTE    C'EOF'
95          RETADR   RESW    1
100         LENGTH   RESW    1
105         BUFFER   RESB    4096
110         .
115         .        SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125         RDREC    CLEAR   X
130                  CLEAR   A
132                  CLEAR   S
133                  +LDT    #4096
135         RLOOP    TD      INPUT
140                  JEQ     RLOOP
145                  RD      INPUT
150                  COMPR   A,S
155                  JEQ     EXIT
160                  STCH    BUFFER,X
165                  TIXR    T
170                  JLT     RLOOP
175         EXIT     STX     LENGTH
180                  RSUB
185         INPUT    BYTE    X'F1'
195         .
200         .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210         WRREC    CLEAR   X
212                  LDT     LENGTH
215         WLOOP    TD      OUTPUT
220                  JEQ     WLOOP
225                  LDCH    BUFFER,X
230                  WD      OUTPUT
235                  TIXR    T
240                  JLT     WLOOP
245                  RSUB
250         OUTPUT   BYTE    X'05'
255                  END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine, memory is byte addressable and the word size is 3 bytes. The total memory size is 2^15 bytes (32 KB). SIC accordingly supports only one instruction format. Of its registers, only the index register X takes part in address calculation, so the only addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). SIC/XE supports four instruction formats:

• Format 1: 1-byte instruction

• Format 2: 2-byte instruction

• Format 3: 3-byte instruction

• Format 4: 4-byte instruction


Instructions can be

• Register-to-register instructions

• Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

• Extended-format instructions

Addressing modes are

• PC-relative or base-relative addressing: op m

• Indirect addressing: op @m

• Immediate addressing: op #c

• Extended format: +op m

• Indexed addressing: op m,X

Other SIC/XE features that affect the assembler are

• Register-to-register instructions

• Larger memory -> multiprogramming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the register names can be entered into the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created that holds the register names and their equivalent numeric values.
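A minimal Python sketch of this register translation follows. The table and function names are hypothetical, and the register numbers and opcodes are the standard SIC/XE values rather than values given in these notes.

```python
# Register names and their numeric codes (standard SIC/XE numbering, assumed here).
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}

# A few format 2 (register-to-register) opcodes, assumed from the SIC/XE instruction set.
FORMAT2_OPCODES = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8}


def assemble_format2(mnemonic: str, operands: str) -> str:
    """Assemble a 2-byte instruction: 8-bit opcode, then two 4-bit register codes."""
    regs = [REGISTERS[name.strip()] for name in operands.split(",")]
    r1 = regs[0]
    r2 = regs[1] if len(regs) > 1 else 0  # an unused second register is assembled as 0
    return f"{FORMAT2_OPCODES[mnemonic]:02X}{r1:X}{r2:X}"


print(assemble_format2("TIXR", "T"))     # B850
print(assemble_format2("COMPR", "A,S"))  # A004
```

The first result matches the object code B850 shown for line 165 (TIXR T) elsewhere in these notes.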

2. Translation involving Register-Memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to chapter 1.

Among the instruction formats, format 3 and format 4 instructions are of the register-memory type. One operand is always in a register and the other is in memory. The addressing mode specifies how the operand in memory is to be fetched.

For relative addressing there are two ways to calculate the target address:

1) Program-counter relative and

2) Base-relative

A format 3 instruction contains the opcode followed by a 12-bit displacement in the address field. PC-relative and base-relative addressing both use this displacement.

A format 4 instruction contains the opcode followed by a 20-bit address field, which is large enough to hold a full memory address.

1. Program-Counter Relative

a) This mode normally uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is treated as a signed 12-bit value, so it ranges from -2048 to +2047. It is added to the current contents of the program counter to obtain the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter. When the instruction executes, the program counter already holds the address of the next instruction, so disp = TA - (PC).

e) The following example shows how the address is calculated.
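The calculation can be sketched in Python as follows. The function name pc_relative_disp is hypothetical. The first call uses line 20 of the example program (LDA LENGTH at location 000A, with LENGTH at 0033); the second uses assumed addresses to show a backward reference.

```python
def pc_relative_disp(target: int, instr_loc: int, length: int = 3) -> int:
    """Return the 12-bit PC-relative displacement field for a format 3 instruction."""
    pc = instr_loc + length          # PC already points at the next instruction
    disp = target - pc
    if not -2048 <= disp <= 2047:
        raise ValueError("target is out of range for PC-relative addressing")
    return disp & 0xFFF              # negative values become 12-bit two's complement


# Line 20: LDA LENGTH at 000A, LENGTH at 0033 -> (PC) = 000D, disp = 026
print(f"{pc_relative_disp(0x0033, 0x000A):03X}")  # 026

# A backward reference (assumed addresses): instruction at 0017, target at 0006
print(f"{pc_relative_disp(0x0006, 0x0017):03X}")  # FEC
```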


2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register B. The target address is therefore

TA = (B) + displacement

b) This mode is used when the target is too far from the instruction for the PC-relative displacement range. The displacement is an unsigned 12-bit value from 0 to 4095, and the operand is addressed relative to the base register rather than relative to the instruction. The assembler directive BASE tells the assembler what value the base register will contain at execution time. From that point on the assembler may use base-relative addressing to calculate displacements.

c) The NOBASE directive tells the assembler that the base register can no longer be relied on for addressing. The assembler first tries PC-relative addressing; only when the displacement does not fit does it use base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE (directive)


For example

Line   Loc    Label    Opcode   Operand     Object Code
12     0003            LDB      #LENGTH     69202D
13                     BASE     LENGTH
100    0033   LENGTH   RESW     1
105    0036   BUFFER   RESB     4096
160    104E            STCH     BUFFER,X    57C003
165    1051            TIXR     T           B850

In the above example, the BASE directive tells the assembler that the base register will contain the address of LENGTH, so base-relative addressing can be used whenever PC-relative addressing is not possible.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register at execution time. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160:

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 - 0033 = (0003)16

Line   Loc    Label    Opcode   Operand     Object Code
20     000A            LDA      LENGTH      032026
175    1056   EXIT     STX      LENGTH      134000


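Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA (that is, -1026 in hexadecimal)

This is outside the PC-relative range of -2048 to +2047, so PC-relative addressing is not applicable and base-relative addressing is used instead: disp = TA - (B) = 0033 - 0033 = 0000, which gives the object code 134000.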

3. Immediate Addressing Mode

In this mode no memory reference is involved. The target address itself is used as the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative addressing: the address of the symbol, computed relative to the program counter, becomes the operand (as in line 12, LDB #LENGTH, which assembles to 69202D).
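For an immediate operand that is a constant, the encoding can be sketched in Python as below. The function name assemble_immediate is hypothetical and the opcodes are the standard SIC/XE values for LDA (00) and LDT (74). The sketch covers only constant operands, not symbols used as immediate operands.

```python
def assemble_immediate(opcode: int, value: int, extended: bool = False) -> str:
    """Assemble an instruction with a constant immediate operand (n = 0, i = 1)."""
    first_byte = (opcode & 0xFC) | 0b01
    if extended:
        # Format 4: flags x b p e = 0001, followed by a 20-bit address field.
        if not 0 <= value <= 0xFFFFF:
            raise ValueError("value does not fit in 20 bits")
        return f"{first_byte:02X}1{value:05X}"
    # Format 3: flags x b p e = 0000, followed by a 12-bit field.
    if not 0 <= value <= 4095:
        raise ValueError("value does not fit in 12 bits; use the extended format")
    return f"{first_byte:02X}0{value:03X}"


print(assemble_immediate(0x00, 3))                     # line 55, LDA #3     -> 010003
print(assemble_immediate(0x74, 4096, extended=True))   # line 133, +LDT #4096 -> 75101000
```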


4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of a location that contains the address of the operand. The address of that location is found using PC-relative addressing.

For example, in line 70 (J @RETADR) the instruction transfers control to the address stored in the location RETADR. If the address of RETADR is 0030, the target address is 0030 and the displacement stored in the instruction is 0003.

2.2.2 Program Relocation

The need for program relocation

• It is desirable to load and run several programs at the same time.

• The system must be able to load programs into memory wherever there is room.

• The exact starting address of the program is not known until load time.

Absolute Address

• An absolute program has its starting address specified at assembly time.

• Its addresses may be invalid if the program is loaded somewhere else.


Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address of the operand changes by the same amount as the program is displaced. Hence the address portion of the instruction has to be changed so that the program can be loaded and executed at location 3000.

Not every part of the program changes in this way. Some instructions need a different operand address when the load address changes, while other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler can identify for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.


Example: Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address field of the JSUB instruction must then be modified to 6036, the new address of RDREC.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC (8456).

The only parts of the program that require modification at load time are those that specify direct addresses.


The rest of the instructions need not be modified, because their operands are

o not memory addresses (immediate addressing), or

o PC-relative or base-relative displacements.

From the object program alone it is not possible to distinguish addresses from constants.

o The assembler must therefore keep some information to tell the loader which fields to modify.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

• For an address label, its address is assigned relative to the start of the program (START 0).

• Produce a Modification record to store the starting location and the length of the address field to be modified.

• The command for the loader must also be a part of the object program.

Modification record

• One Modification record is written for each address to be modified.

• The length is stored in half-bytes (4 bits each).

• The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

• If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.


Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that must change if the program is relocated. M00000705 is the Modification record for the address field that starts at location 000007 and is 5 half-bytes long.
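How a loader might apply such a record can be sketched in Python. The function name apply_modification is hypothetical, and the sketch assumes the record layout described above: M, a 6-digit starting location, and a 2-digit length in half-bytes. The instruction used is the +JSUB RDREC from the relocation example (4B101036 at location 0006).

```python
def apply_modification(code: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field named by a Modification record, in place."""
    start = int(record[1:7], 16)        # location of the first byte of the field
    half_bytes = int(record[7:9], 16)   # field length in half-bytes
    nbytes = (half_bytes + 1) // 2      # bytes that contain the field
    field = int.from_bytes(code[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1  # the rightmost half_bytes * 4 bits
    # Keep the bits to the left of the field, relocate only the field itself.
    new_value = (field & ~mask) | ((field + load_address) & mask)
    code[start:start + nbytes] = new_value.to_bytes(nbytes, "big")


# Six placeholder bytes, then +JSUB RDREC at location 0006.
program = bytearray(6) + bytes.fromhex("4B101036")
apply_modification(program, "M00000705", 0x5000)
print(program[6:10].hex().upper())  # 4B106036
```

Loading the same program at 7420 instead would give 4B108456, as stated in the relocation example.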

Machine-independent assembler features

These are assembler features that are not closely related to the machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as part of the instruction that uses it. Such an operand is called a literal.

Line   Loc    Label    Opcode   Operand    Object Code
45     001A   ENDFIL   LDA      =C'EOF'    032010
       002D   *        =C'EOF'             454F46
215    1062   WLOOP    TD       =X'05'     E32011
       1076   *        =X'05'              05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

• With immediate addressing, the operand value is assembled as part of the machine instruction.

• With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93            LTORG
       002D   *        =C'EOF'             454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand:

BASE *

LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals. The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46'

Lines 100 and 125 use the same literal name, so only one copy needs to be stored. Line 160 shows that literal operands with different literal names may have the same literal value. Recognizing literal operands that have different names but the same value complicates the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

• LITTAB contains the literal name, its value and length, and the address assigned to it.

• LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed. If it is not present, the literal is added to LITTAB without an address.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
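The Pass 1 handling of literals can be sketched in Python as follows. The class LiteralTable and its method names are hypothetical, and a plain dictionary stands in for the hash table. Duplicates are recognized by comparing literal names, as described above.

```python
def literal_bytes(literal: str) -> bytes:
    """Return the value of a literal such as =C'EOF' or =X'05' as raw bytes."""
    kind, body = literal[1].upper(), literal[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


class LiteralTable:
    def __init__(self) -> None:
        # literal name -> value, length, and address (None until a pool is placed)
        self.entries: dict[str, dict] = {}

    def note(self, literal: str) -> None:
        """Pass 1: record a literal operand; a duplicate name needs no action."""
        if literal not in self.entries:
            value = literal_bytes(literal)
            self.entries[literal] = {"value": value, "length": len(value), "address": None}

    def place_pool(self, locctr: int) -> int:
        """At LTORG or END: give every unplaced literal an address; return the new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LiteralTable()
littab.note("=C'EOF'")
littab.note("=C'EOF'")                  # duplicate, ignored
locctr = littab.place_pool(0x002D)      # LTORG with LOCCTR at 002D
print(f"{littab.entries[chr(61) + 'C' + chr(39) + 'EOF' + chr(39)]['address']:04X}", f"{locctr:04X}")  # 002D 0030
```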

Symbol-defining statements

Assembler directives

• EQU

• ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

• Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

+LDT #4096

we can write

MAXLEN EQU 4096

+LDT #MAXLEN

• Define mnemonic names for registers.

A EQU 0

X EQU 1

L EQU 2

RMO 0,1 can then be written as RMO A,X.

• Establish and use names that reflect the logical function of the registers in the program.

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

ORG value

• "value" is a constant or an expression involving constants and previously defined symbols.

• When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB (100 entries)   SYMBOL    VALUE     FLAGS
                     6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. Hence the following sequences, except the one marked as allowed, could not be processed by an ordinary two-pass assembler.

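Disallowed (ALPHA is used before it is defined):

BETA    EQU    ALPHA
ALPHA   RESW   1

Disallowed (ALPHA depends on BETA, which depends on DELTA, which is defined only later):

ALPHA   EQU    BETA
BETA    EQU    DELTA
DELTA   RESW   1

Disallowed (the operand of ORG is not yet defined):

        ORG    ALPHA
BYTE1   RESB   1
BYTE2   RESB   1
BYTE3   RESB   1
        ORG
ALPHA   RESB   1

Allowed (ALPHA is defined before it is used):

ALPHA   RESW   1
BETA    EQU    ALPHA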
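The restriction can be sketched as a Pass 1 check in Python. The function name define_equ is hypothetical, and the sketch handles only an operand that is a single decimal constant or a single symbol, not a general expression.

```python
def define_equ(symtab: dict[str, int], symbol: str, operand: str) -> None:
    """Pass 1 handling of `symbol EQU operand` in an ordinary two-pass assembler."""
    if operand.isdigit():
        value = int(operand)
    elif operand in symtab:
        value = symtab[operand]          # operand was defined earlier in the program
    else:
        # A forward reference: the value cannot be known during Pass 1.
        raise ValueError(f"{operand} is not yet defined; cannot evaluate {symbol} EQU {operand}")
    symtab[symbol] = value


symtab = {"ALPHA": 0x0030}               # ALPHA RESW 1 has already been processed
define_equ(symtab, "BETA", "ALPHA")      # allowed
print(f"{symtab['BETA']:04X}")           # 0030

try:
    define_equ({}, "BETA", "ALPHA")      # disallowed: ALPHA is defined only later
except ValueError as err:
    print(err)
```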


Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be constants, user-defined symbols, or special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

• Absolute terms: the value of an absolute term is independent of the program location.

• Relative terms: the value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol    Type   Value
MAXLEN    A      1000
BUFEND    R      1036
BUFFER    R      0036
RETADR    R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
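The pairing rule can be sketched in Python by counting the signs of the relative terms. The function name classify_expression is hypothetical. As a simplification, the sketch treats any expression that contains * or / together with a relative term as invalid, and it does not evaluate the expression.

```python
import re


def classify_expression(expr: str, symbol_types: dict[str, str]) -> str:
    """Classify an expression as 'absolute', 'relative', or 'invalid'."""
    # Each term is an optional sign followed by a symbol name or a decimal constant.
    terms = re.findall(r"([+-]?)\s*([A-Za-z_][A-Za-z0-9_]*|\d+)", expr)
    relative = [(sign, name) for sign, name in terms
                if not name.isdigit() and symbol_types[name] == "R"]
    if relative and re.search(r"[*/]", expr):
        return "invalid"                 # relative terms may not be multiplied or divided
    # Paired relative terms with opposite signs cancel; count what is left over.
    net = sum(-1 if sign == "-" else 1 for sign, _ in relative)
    if net == 0:
        return "absolute"
    if net == 1:
        return "relative"                # exactly one unpaired term, with a positive sign
    return "invalid"


types = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}
for expr in ["BUFEND-BUFFER", "RETADR", "BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"]:
    print(expr, "->", classify_expression(expr, types))
```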

Program blocks

The program is divided into three blocks, numbered as follows: default (0), CDATA (1), and CBLKS (2).

• Default (0): the block for the executable instructions

• CDATA (1): the block for data areas that need little memory

• CBLKS (2): the block for data areas that need large blocks of memory

The assembly listing gains one more column, the block number. Whenever a new block begins in the source program, it is introduced by the directive USE followed by the block name, and the location counter (LOCCTR) for that block starts at 0000. Each symbol is assigned an address relative to the start of its block, and the block number is added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address, and length.

Once the location counter values are known, the assembler generates the object code and writes the object program.
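Building the program block table can be sketched in Python as below. The function names are hypothetical and the block lengths are illustrative values, not figures taken from these notes. The blocks are assumed to be laid out one after another in order of block number.

```python
def build_block_table(block_lengths: dict[str, int], program_start: int = 0) -> dict[str, dict]:
    """Assign each block a number and a starting address, in order of first appearance."""
    table = {}
    address = program_start
    for number, (name, length) in enumerate(block_lengths.items()):
        table[name] = {"number": number, "address": address, "length": length}
        address += length                # the next block starts where this one ends
    return table


def absolute_address(table: dict[str, dict], block: str, relative: int) -> int:
    """Convert a block-relative symbol value into an address within the whole program."""
    return table[block]["address"] + relative


# Illustrative lengths (in bytes) collected at the end of Pass 1.
blocks = build_block_table({"default": 0x0066, "CDATA": 0x000B, "CBLKS": 0x1000})
for name, row in blocks.items():
    print(name, row["number"], f"{row['address']:04X}", f"{row['length']:04X}")

# A symbol at relative location 0003 in CDATA:
print(f"{absolute_address(blocks, 'CDATA', 0x0003):04X}")  # 0069
```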


Program Block Diagram


Control section and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive: CSECT

CSECT signals the start of a new control section. The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives: EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Object code generation changes only for instructions that refer to symbols named in EXTREF. The assembler cannot know the addresses of these symbols, so it leaves the actual values to be supplied at load time.

The object program needs six types of records: Header record, Define record, Refer record, Text record, Modification record, and End record.


Object Program for Control Section and Program Linking


1. The column "Loc" in the example program gives the machine address, in hexadecimal, for each line of the assembled program.

2. The program starts at address 1000 (this is only an assumption; for the simple SIC machine a starting address is always assumed, and the value is not 0).

3. The translation of the source program to object code requires the following functions:

a. Convert mnemonic operation codes to their machine language equivalents. E.g. translate STL to 14 (line 10).

b. Convert symbolic operands to their equivalent machine addresses. E.g. translate RETADR to 1033 (line 10).

c. Build the machine instructions in the proper format.

d. Convert the data constants specified in the source program into their internal machine representations. E.g. translate EOF to 454F46 (line 80).

e. Write the object program and the assembly listing.

4. All of these functions except function (b) can be accomplished by sequential processing of the source program, one line at a time.

Consider the statement

10   1000   FIRST   STL   RETADR   141033

5. This instruction contains a forward reference, i.e. a reference to a label (RETADR) that is defined later in the program.

6. The assembler is unable to process this line when it is first read, because the address that will be assigned to RETADR is not yet known.

7. Hence most assemblers make two passes over the source program.

8. The first pass does little more than scan the source program for label definitions and assign addresses, such as those in the Loc column.

9. The second pass performs most of the actual translation.

10. The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values.

11. The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.

The object program format contains three types of records:

• Header record: contains the program name, starting address, and length.

• Text record: contains the machine code and data of the program.

• End record: marks the end of the object program and specifies the address in the program where execution is to begin.

The record format is as follows:

Header record

Col 1        H
Col 2-7      Program name
Col 8-13     Starting address of object program
Col 14-19    Length of object program in bytes

Text record

Col 1        T
Col 2-7      Starting address for object code in this record
Col 8-9      Length of object code in this record in bytes
Col 10-69    Object code, represented in hexadecimal (2 columns per byte of object code)

End record

Col 1        E
Col 2-7      Address of first executable instruction in object program
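The column layout above can be sketched in Python with fixed-width formatting. The function names are hypothetical, and the program name, addresses, and length in the usage lines are illustrative values.

```python
def header_record(name: str, start: int, length: int) -> str:
    """Col 1 'H', col 2-7 name (padded to 6), col 8-13 start, col 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start: int, object_codes: list[str]) -> str:
    """Col 1 'T', col 2-7 start address, col 8-9 length in bytes, col 10-69 object code."""
    body = "".join(object_codes)
    if len(body) > 60:                   # columns 10-69 hold at most 30 bytes
        raise ValueError("object code does not fit in one Text record")
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_instruction: int) -> str:
    """Col 1 'E', col 2-7 address of the first executable instruction."""
    return f"E{first_instruction:06X}"


print(header_record("COPY", 0x1000, 0x107A))  # HCOPY  00100000107A
print(text_record(0x1000, ["141033"]))        # T00100003141033
print(end_record(0x1000))                     # E001000
```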


2.1.1.1 The assembler design can be done as

• Single-pass assembler

• Multi-pass assembler

Single-pass Assembler

In a single-pass assembler the whole process of scanning, parsing, and object code conversion is done in a single pass. The only problem with this method is resolving forward references, as the example below shows.

10   1000   FIRST    STL    RETADR   141033
--
95   1033   RETADR   RESW   1

In the above example, the instruction STL on line 10 stores the contents of the linkage register into RETADR. But while this instruction is being processed, the address of this symbol is not yet known, because it is defined on line 95.

In a single-pass assembler, scanning, parsing, and object code conversion happen simultaneously. Each instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. Here the object code would be generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available.

For this reason the design is usually done in two passes. A multi-pass assembler resolves the forward references first and then converts the program into object code. The process of the multi-pass assembler is as follows:

Pass-1

• Assign addresses to all the statements.

• Save the addresses assigned to all labels, to be used in Pass-2.

• Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas for assigning the address values.

• Define the symbols in the symbol table (generate the symbol table).

Pass-2

• Assemble the instructions (translating operation codes and looking up addresses).

• Generate data values defined by BYTE, WORD, etc.

• Perform the processing of the assembler directives not done during Pass-1.

• Write the object program and assembly listing.

2.1.1.2 Assembler Design

The most important points in the design are the generation of the symbol table and the resolution of forward references.

• Symbol Table

– This is created during Pass 1.

– All the labels of the instructions are symbols.

– The table has an entry for each symbol name and its address value.

• Forward reference

– A reference to a symbol that is defined later in the program is called a forward reference.

– There is no address value for such a symbol in the symbol table until its definition is reached in Pass 1.

2.1.1.3 Functions of the two passes of assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program.

2. Save the addresses assigned to all labels for use in Pass 2.

3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object programs)

1. Assemble instructions (translating operation codes and looking up addresses).

2. Generate data values defined by BYTE, WORD, etc.

3. Perform processing of assembler directives not done in Pass 1.

4. Write the object program and the assembly listing.

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures, together with a location counter:

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents.

2. Symbol Table (SYMTAB): used to store values (addresses) assigned to labels.

3. Location Counter (LOCCTR): a variable used to help in the assignment of addresses.

Location Counter (LOCCTR)

• It is initialized to the beginning address specified in the START statement.

• After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.

• Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

• Contains the mnemonic operation code and its machine language equivalent.

• Also contains information about instruction format and length.

• In Pass 1, OPTAB is used to look up and validate operation codes in the source program.

• In Pass 2, it is used to translate the operation codes to machine language.

• During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction.

• OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.

• In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it.

Symbol Table (SYMTAB)

• Includes the name and value (address) for each label in the source program, and flags to indicate error conditions (e.g. a symbol defined in two different places).

• During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.

• During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations that may be performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

Begin
  read first input line
  if OPCODE = 'START' then
    begin
      save [OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * [OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add [OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
End {pass 1}

Explanation - Pass 1 Algorithm

The algorithm scans the first statement. If it is START, the operand field (the address) is saved as the starting address of the program, LOCCTR is initialized to this address, and the line is written to the intermediate file.

If there is no START statement, LOCCTR is initialized to zero.

If a label is encountered, SYMTAB is searched for it. If the symbol is already present, it has been defined before, so an error flag is set to indicate a duplicate symbol. Otherwise the symbol is entered into the symbol table along with the current LOCCTR value as its address.

Next the mnemonic code is looked up in OPTAB. If it is found, the length of the instruction (3 bytes in SIC) is added to LOCCTR so that it points to the next instruction.

If the opcode is the directive WORD, 3 is added to LOCCTR.

If it is RESW, 3 times the number of words reserved is added to LOCCTR.

If it is RESB, the number of bytes reserved is added. If it is BYTE, the length of the constant in bytes is added.

If the opcode is none of these, an error flag is set to indicate an invalid operation code.

When the END directive is reached, the program is complete. The program length is computed as the current LOCCTR value minus the starting address saved from the START statement.

Each processed line is written to the intermediate file.
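The Pass 1 steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `pass1`, the tuple layout `(label, opcode, operand)`, and the small partial `OPTAB` are hypothetical, and comment lines are assumed to have been removed already.

```python
# Minimal Pass 1 sketch for a SIC-style assembler (illustrative, not a full assembler).
OPTAB = {"LDA": 0x00, "STA": 0x0C, "STL": 0x14, "JSUB": 0x48, "RSUB": 0x4C}  # partial table

def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]                      # text between the quotes
    return len(body) if operand[0] == "C" else len(body) // 2

def pass1(lines):
    """lines: list of (label, opcode, operand); returns SYMTAB, length, intermediate, errors."""
    symtab, errors, intermediate = {}, [], []
    start = locctr = 0                        # no START statement: LOCCTR begins at 0
    if lines and lines[0][1] == "START":
        start = locctr = int(lines[0][2], 16)
        intermediate.append((locctr, *lines[0]))
        lines = lines[1:]
    for label, opcode, operand in lines:
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, locctr - start, intermediate, errors
```

The intermediate list plays the role of the intermediate file: each entry carries the address assigned to the statement, so Pass 2 does not have to recompute it.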

The Algorithm for Pass 2

begin
  read first input line (from intermediate file)
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end (if START)
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end (if symbol)
              else
                store 0 as operand address
              assemble the object code instruction
            end (if opcode found)
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code does not fit into current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end (if not comment)
      write listing line
      read next input line
    end (while not END)
  write last Text record to object program
  write End record to object program
  write last listing line
end (Pass 2)

Explanation - Pass 2 Algorithm

The first input line is read from the intermediate file.

If the opcode is START, this line is written directly to the listing file.

A Header record is written to the object program. It gives the starting address and the length of the program (which was calculated during Pass 1).

Then the first Text record is initialized. Comment lines are ignored.

For each instruction, OPTAB is searched to find the machine code of the opcode.

If there is a symbol in the operand field, the symbol table is searched for its address, which is combined with the opcode to form the object code.

If the symbol is not found in SYMTAB, zero is stored as the operand address and an error flag is set to indicate an undefined symbol.

If there is no symbol in the operand field, 0 is stored as the operand address and the object code instruction is assembled.

If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, the character constant EOF is stored as the hexadecimal value '454F46').

If the object code does not fit into the current Text record, that record is written out and a new Text record is started for the remaining object code.

The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.
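Two of the Pass 2 steps, converting a BYTE constant and assembling a standard SIC instruction, can be sketched as follows. The function names are hypothetical. The layout assumed for the instruction is the standard SIC one: an 8-bit opcode, a 1-bit index flag and a 15-bit address.

```python
def constant_to_object_code(operand):
    """Convert a BYTE operand (C'...' or X'...') to hexadecimal object code."""
    body = operand[2:-1]                      # text between the quotes
    if operand[0] == "C":
        return body.encode("ascii").hex().upper()
    return body.upper()                       # X'..' is already hexadecimal

def assemble_sic(opcode_value, address, indexed=False):
    """Assemble one 3-byte SIC instruction from opcode and operand address."""
    word = (opcode_value << 16) | (0x8000 if indexed else 0) | (address & 0x7FFF)
    return f"{word:06X}"

print(constant_to_object_code("C'EOF'"))      # 454F46
print(assemble_sic(0x14, 0x1033))             # 141033 (STL RETADR with RETADR at 1033)
```

When the operand symbol is undefined, the same routine would simply be called with address 0, as the algorithm prescribes.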


Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. For the SIC machine there is only one instruction format and hence only limited addressing modes. All instructions take a single operand, and the operand is always a memory reference. Anything fetched from memory takes more time. The improved SIC/XE machine therefore provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design of an assembler is therefore usually considered under two headings: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered.

200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X          CLEAR LOOP COUNTER
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT     TEST OUTPUT DEVICE
220           JEQ     WLOOP      LOOP UNTIL READY
225           LDCH    BUFFER,X   GET CHARACTER FROM BUFFER
230           WD      OUTPUT     WRITE CHARACTER
235           TIXR    T          LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program of a SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Loc  Label    Opcode  Operand     Object Code
5          COPY     START   0
10         FIRST    STL     RETADR
12                  LDB     #LENGTH
13                  BASE    LENGTH
15         CLOOP    +JSUB   RDREC
20                  LDA     LENGTH
25                  COMP    #0
30                  JEQ     ENDFIL
35                  +JSUB   WRREC
40                  J       CLOOP
45         ENDFIL   LDA     EOF
50                  STA     BUFFER
55                  LDA     #3
60                  STA     LENGTH
65                  +JSUB   WRREC
70                  J       @RETADR
80         EOF      BYTE    C'EOF'
95         RETADR   RESW    1
100        LENGTH   RESW    1
105        BUFFER   RESB    4096
110        .
115        .        SUBROUTINE TO READ RECORD INTO BUFFER
120        .
125        RDREC    CLEAR   X
130                 CLEAR   A
132                 CLEAR   S
133                 +LDT    #4096
135        RLOOP    TD      INPUT
140                 JEQ     RLOOP
145                 RD      INPUT
150                 COMPR   A,S
155                 JEQ     EXIT
160                 STCH    BUFFER,X
165                 TIXR    T
170                 JLT     RLOOP
175        EXIT     STX     LENGTH
180                 RSUB
185        INPUT    BYTE    X'F1'
195        .
200        .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205        .
210        WRREC    CLEAR   X
212                 LDT     LENGTH
215        WLOOP    TD      OUTPUT
220                 JEQ     WLOOP
225                 LDCH    BUFFER,X
230                 WD      OUTPUT
235                 TIXR    T
240                 JLT     WLOOP
245                 RSUB
250        OUTPUT   BYTE    X'05'
255                 END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly it supports only one instruction format. Operands are handled through the accumulator (register A) and the index register (X), so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

1 byte instruction

2 byte instruction

3 byte instruction

4 byte instruction

Instructions can be:

Instructions involving register to register

Instructions with one operand in memory and the other in the accumulator (single operand instructions)

Extended instruction format

Addressing modes are:

PC-relative or base-relative addressing: op m

Indirect addressing: op @m

Immediate addressing: op #c

Extended format: +op m

Indexed addressing: op m,X

Register-to-register instructions

Larger memory -> multi-programming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the register names can be entered into the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled together with the object code of the mnemonic. If required, a separate table can be created to hold the register names and their equivalent numeric values.
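A separate register table of this kind can be sketched as below. The names `REGISTERS`, `FORMAT2_OPTAB` and `assemble_format2` are hypothetical. The register numbers and opcodes are the standard SIC/XE values, and only a few format 2 mnemonics are listed.

```python
# Register names mapped to their SIC/XE numeric codes.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6, "PC": 8, "SW": 9}

# Partial table of format 2 (register-to-register) opcodes.
FORMAT2_OPTAB = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8, "RMO": 0xAC}

def assemble_format2(mnemonic, r1, r2=None):
    """Format 2: 8-bit opcode, 4-bit r1, 4-bit r2 (0 when there is no second register)."""
    op = FORMAT2_OPTAB[mnemonic]
    n1 = REGISTERS[r1]
    n2 = REGISTERS[r2] if r2 else 0
    return f"{op:02X}{n1:X}{n2:X}"

print(assemble_format2("TIXR", "T"))          # B850
print(assemble_format2("COMPR", "A", "S"))    # A004
```

Keeping the registers in their own table rather than in SYMTAB avoids any clash with a user label that happens to be named like a register.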

2. Translation involving register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are of the register-memory type. One operand is always in a register and the other is in memory. The addressing mode specifies how the operand in memory is to be located.

There are two ways:

1) Program-counter relative and

2) Base relative

A memory operand can be specified with either a format 3 or a format 4 instruction.

In format 3 the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4 the instruction contains the opcode followed by a 20-bit address field, which is large enough to hold the full operand address.

1. Program-Counter Relative

a) This mode normally uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement ranges from -2048 to +2047, so that it fits in the 12-bit field as a two's complement number. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter, which already points to the next instruction when the displacement is applied.

e) The following example shows how the address is calculated
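The displacement calculation can be worked through in a short sketch. The function name `pc_relative_disp` is hypothetical. The first call uses the values of line 20 of the example program (LDA LENGTH at location 000A, LENGTH at 0033); the backward-jump addresses in the second call are illustrative.

```python
def pc_relative_disp(target, instr_addr, instr_len=3):
    """Return the 12-bit PC-relative displacement, or None if it does not fit."""
    pc = instr_addr + instr_len        # PC already points at the next instruction
    disp = target - pc
    if not -2048 <= disp <= 2047:
        return None                    # out of range: another mode is needed
    return disp & 0xFFF                # two's complement in 12 bits

print(f"{pc_relative_disp(0x0033, 0x000A):03X}")   # 026  (0033 - 000D)
print(f"{pc_relative_disp(0x0006, 0x0017):03X}")   # FEC  (backward reference, -20)
print(pc_relative_disp(0x0033, 0x1056))            # None (too far for 12 bits)
```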


2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register. The target address is therefore

TA = (base) + displacement value

b) This addressing mode is used when the displacement required for PC-relative addressing does not fit in the 12-bit field. The operand address is then no longer relative to the instruction, as it is in PC-relative mode. In base-relative mode the displacement is an unsigned value from 0 to 4095. The assembler directive BASE tells the assembler what value the base register will contain, so that it can compute base-relative displacements for the instructions that follow.

c) The NOBASE directive indicates that the base register can no longer be used to calculate the target address of the operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

For example:

12    0003          LDB     #LENGTH    69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X   57C003
165   1051          TIXR    T          B850

In the above example the directive BASE tells the assembler that the base register will contain the address of LENGTH, so base-relative addressing can be used wherever PC-relative addressing is not possible.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register at execution time. The BASE directive explicitly tells the assembler that the base register holds this value.

For the STCH instruction on line 160:

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 - 0033 = (0003)16

20    000A          LDA     LENGTH     032026
175   1056  EXIT    STX     LENGTH     134000

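Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA (as a 16-bit two's complement value)

This displacement is far outside the range -2048 to +2047 that fits in 12 bits, so PC-relative addressing is not applicable and base-relative addressing is used instead: disp = TA - (B) = 0033 - 0033 = 0000, which gives the object code 134000.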
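The choice between the two modes can be sketched as one routine that tries PC-relative first and falls back to base-relative. The function name `assemble_format3` and its parameters are hypothetical. The opcodes (LDA 00, STX 10, STCH 54, LDB 68) are the standard SIC/XE values, and the addresses are those of the example lines above.

```python
def assemble_format3(opcode, target, instr_addr, base=None, n=1, i=1, x=0):
    """Assemble a format 3 instruction: opcode+n,i | x,b,p,e | 12-bit displacement."""
    first = opcode | (n << 1) | i
    disp = target - (instr_addr + 3)           # try PC-relative first
    if -2048 <= disp <= 2047:
        b, p = 0, 1
    elif base is not None and 0 <= target - base <= 4095:
        disp, b, p = target - base, 1, 0       # fall back to base-relative
    else:
        raise ValueError("displacement out of range; format 4 would be needed")
    flags = (x << 3) | (b << 2) | (p << 1)     # e = 0 for format 3
    return f"{first:02X}{flags:X}{disp & 0xFFF:03X}"

print(assemble_format3(0x00, 0x0033, 0x000A))                      # 032026 (line 20)
print(assemble_format3(0x10, 0x0033, 0x1056, base=0x0033))         # 134000 (line 175)
print(assemble_format3(0x54, 0x0036, 0x104E, base=0x0033, x=1))    # 57C003 (line 160)
print(assemble_format3(0x68, 0x0033, 0x0003, n=0, i=1))            # 69202D (line 12)
```

Passing `base=None` models the state after NOBASE: the fallback is then unavailable and the routine reports an error instead.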

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative mode, as shown in the example below

4. Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing.

For example

The instruction transfers control to the address stored at location RETADR. If the address of RETADR is 0030, the displacement is 0003, as calculated above.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute Address

A program whose starting address is specified at assembly time is an absolute program.

Its addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The operand address shifts by the same amount as the program. Hence the address portion of the instruction has to be changed so that the program can be loaded and executed at location 3000.

Some instructions undergo a change in their operand address as the program load address changes. Other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address field of the JSUB instruction is modified to the new address 06036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction needs to be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified because their operands are either:

o Not a memory address (immediate addressing)

o PC-relative or base-relative

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

One Modification record is written for each address to be modified.

The length is stored in half-bytes (4 bits).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that must change if relocation occurs. M00000705 is the Modification record for the JSUB instruction at location 0006: its address field begins at location 000007 and is 05 half-bytes long.
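What a loader does with such a record can be sketched as follows. The function name `relocate` and the representation of a Modification record as a `(start, half_bytes)` pair are assumptions made for illustration. The object code 4B101036 at location 0006 and the load addresses 5000 and 7420 are the values from the relocation example.

```python
def relocate(image, mod_records, load_address):
    """Add load_address to every address field named by a Modification record.

    image: object code assembled relative to address 0 (bytes)
    mod_records: list of (start, half_bytes) pairs, e.g. (0x000007, 5)
    """
    out = bytearray(image)
    for start, half_bytes in mod_records:
        nbytes = (half_bytes + 1) // 2             # bytes that hold the field
        mask = (1 << (4 * half_bytes)) - 1         # low-order half-bytes = the field
        value = int.from_bytes(out[start:start + nbytes], "big")
        field = (value + load_address) & mask
        value = (value & ~mask) | field            # keep the bits outside the field
        out[start:start + nbytes] = value.to_bytes(nbytes, "big")
    return bytes(out)

# +JSUB RDREC sits at location 0006; its 20-bit address field starts in byte 0007.
image = bytes(6) + bytes.fromhex("4B101036")
print(relocate(image, [(0x000007, 5)], 0x5000)[6:10].hex().upper())   # 4B106036
print(relocate(image, [(0x000007, 5)], 0x7420)[6:10].hex().upper())   # 4B108456
```

With an odd length such as 5, the mask leaves the upper half of the first byte (here the flag bits of the instruction) untouched, which is exactly the "begins in the middle of the first byte" rule.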

Machine independent assembler features

Assembler features not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL  LDA  =C'EOF'  032010
      002D          =C'EOF'       454F46

215   1062  WLOOP   TD   =X'05'   E32011
      1076          =X'05'        05

• In this assembler language notation a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93          LTORG
      002D  =C'EOF'  454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand:

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46' (literal operands with different literal names may have the same literal values)

Recognizing literal operands that have different literal names but the same literal values complicates the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, its value and length, and the address assigned to it.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
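The Pass 1 handling of literals can be sketched as follows. The class `LiteralTable` and its methods `note` and `place_pool` are hypothetical names. Duplicates are detected by comparing the defining character strings, which is the simple approach described above, so =C'EOF' and =X'454F46' would be stored separately.

```python
def literal_length(name):
    """Length in bytes of a literal such as =C'EOF' or =X'05'."""
    body = name[3:-1]                  # strip the leading =C' or =X' and the closing quote
    return len(body) if name[1] == "C" else len(body) // 2

class LiteralTable:
    def __init__(self):
        self.entries = {}              # literal name -> address (None until pooled)

    def note(self, name):
        """Pass 1: record a literal operand; an existing entry is left unchanged."""
        self.entries.setdefault(name, None)

    def place_pool(self, locctr):
        """At LTORG or END: assign addresses to unplaced literals, return new LOCCTR."""
        for name in list(self.entries):
            if self.entries[name] is None:
                self.entries[name] = locctr
                locctr += literal_length(name)
        return locctr

littab = LiteralTable()
for operand in ["=C'EOF'", "=X'05'", "=C'EOF'"]:
    littab.note(operand)
new_locctr = littab.place_pool(0x002D)
print(littab.entries)                  # {"=C'EOF'": 45, "=X'05'": 48}
```

A Python dict stands in for the hash table organization mentioned above, with the literal name as the key.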

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified "value".

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. For example,

+LDT #4096

can be written as

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers

A EQU 0

X EQU 1

L EQU 2

RMO A,X

Establish and use names that reflect the logical function of the registers in the program

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is used to assign values to symbols indirectly.

ORG value

"value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB         SYMBOL    VALUE     FLAGS
100 entries  6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program. Hence the sequences marked disallowed below cannot be processed by an ordinary two-pass assembler.

Allowed:

ALPHA RESW 1
BETA  EQU  ALPHA

Disallowed:

BETA  EQU  ALPHA
ALPHA RESW 1

Disallowed:

ALPHA EQU  BETA
BETA  EQU  DELTA
DELTA RESW 1

Disallowed:

      ORG  ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
      ORG
ALPHA RESB 1

Expressions

Most assemblers allow the use of expressions wherever a single operand such as a label or literal is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

• Individual terms in the expression may be

• constants

• user-defined symbols or

• special terms

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
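The pairing rule can be checked mechanically by counting relative terms with their signs. The sketch below is a simplified illustration: the function name `classify` is hypothetical, only +, - and * are handled, numeric constants are read as decimal, and the symbol table holds the type and value columns shown above.

```python
import re

SYMTAB = {"MAXLEN": ("A", 0x1000), "BUFEND": ("R", 0x1036),
          "BUFFER": ("R", 0x0036), "RETADR": ("R", 0x0030)}

def classify(expr, symtab):
    """Return ('A' or 'R', value); raise ValueError if the expression is neither."""
    relative_count = 0                  # net number of unpaired relative terms
    value = 0
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        factors = term.split("*")
        kinds, product = [], 1
        for f in factors:
            kind, v = symtab[f] if f in symtab else ("A", int(f))
            kinds.append(kind)
            product *= v
        if "R" in kinds and len(factors) > 1:
            raise ValueError("relative term in multiplication")
        s = -1 if sign == "-" else 1
        value += s * product
        if kinds == ["R"]:
            relative_count += s         # +1 for a positive, -1 for a negative relative term
    if relative_count == 0:
        return "A", value
    if relative_count == 1:
        return "R", value
    raise ValueError("neither absolute nor relative")

print(classify("BUFEND-BUFFER", SYMTAB))     # ('A', 4096), i.e. hexadecimal 1000
```

BUFEND+BUFFER leaves two unpaired relative terms, 100-BUFFER leaves one with a negative sign, and 3*BUFFER multiplies a relative term, so all three raise an error in this sketch.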

Program blocks

The program is divided into three blocks: Default (block 0), CDATA (block 1) and CBLKS (block 2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need larger amounts of memory

One more column, the block number, is included in the listing when the object code is generated. Whenever the program switches to a different block, the directive USE is written followed by the block name. Each block has its own location counter, which is set to 0000 the first time the block is used. The block number is also added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address and length.

After the location counter values have been found, the object code is generated and the object program is written.
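Once Pass 1 knows the length of every block, the starting addresses in the block table follow by simple accumulation. The sketch below assumes hypothetical function names and illustrative block lengths (hexadecimal 66, 0B and 1000); they are not taken from the figure.

```python
def assign_block_addresses(block_lengths):
    """block_lengths: list of (name, length) in block-number order."""
    table, address = [], 0
    for number, (name, length) in enumerate(block_lengths):
        table.append({"name": name, "number": number,
                      "address": address, "length": length})
        address += length              # the next block starts where this one ends
    return table

def absolute_address(table, block_number, relative_address):
    """Address of a symbol = starting address of its block + its address within the block."""
    return table[block_number]["address"] + relative_address

blocks = assign_block_addresses([("(default)", 0x66), ("CDATA", 0x0B), ("CBLKS", 0x1000)])
for b in blocks:
    print(b["name"], b["number"], f"{b['address']:04X}", f"{b['length']:04X}")
print(f"{absolute_address(blocks, 1, 0x0003):04X}")    # 0069
```

This is why SYMTAB needs the block number for each symbol: in Pass 2 the block-relative value stored there is converted to an address relative to the start of the program.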

Program Block Diagram

Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generation for instructions that use external references changes.

The object program needs six types of records: Header record, Define record, Refer record, Text record, Modification record and End record.

Object Program for Control Section and Program Linking

The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.

The object program format contains three types of records:

Header record: Contains the program name, starting address and length.

Text record: Contains the machine code and data of the program.

End record: Marks the end of the object program and specifies the address in the program where execution is to begin.

The record format is as follows:

Header record

Col. 1      H
Col. 2-7    Program name
Col. 8-13   Starting address of object program
Col. 14-19  Length of object program in bytes

Text record

Col. 1      T
Col. 2-7    Starting address for object code in this record
Col. 8-9    Length of object code in this record in bytes
Col. 10-69  Object code, represented in hexadecimal (2 columns per byte of object code)

End record

Col. 1      E
Col. 2-7    Address of first executable instruction in object program
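The column layout above translates directly into fixed-width formatting. The three function names below are hypothetical, and the program name, addresses and object codes in the usage lines are illustrative values.

```python
def header_record(name, start, length):
    """H, program name in columns 2-7, then start address and length (6 hex digits each)."""
    return f"H{name:<6}{start:06X}{length:06X}"

def text_record(start, object_codes):
    """T, start address, length in bytes (2 hex digits), then up to 30 bytes of object code."""
    body = "".join(object_codes)
    nbytes = len(body) // 2
    if nbytes > 30:
        raise ValueError("a Text record holds at most 30 bytes (columns 10-69)")
    return f"T{start:06X}{nbytes:02X}{body}"

def end_record(first_exec):
    """E followed by the address of the first executable instruction."""
    return f"E{first_exec:06X}"

print(header_record("COPY", 0x1000, 0x107A))                   # HCOPY  00100000107A
print(text_record(0x1000, ["141033", "482039", "001036"]))     # T00100009141033482039001036
print(end_record(0x1000))                                      # E001000
```

The 30-byte limit comes from the 60 columns reserved for object code at 2 columns per byte; this is the condition Pass 2 tests before starting a new Text record.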


2.1.1.1 The assembler design can be done as:

Single-pass assembler

Multi-pass assembler

Single-pass Assembler

In a single-pass assembler the whole process of scanning, parsing and object code conversion is done in a single pass.

The only problem with this method is resolving forward references.

The problem of forward reference is shown with an example below:

10    1000  FIRST   STL   RETADR  141033
--
--
--
--
95    1033  RETADR  RESW  1

In the above example, in line number 10, the instruction STL stores the contents of the linkage register in RETADR. But while this instruction is being processed, the value of the symbol is not known, as it is defined only at line number 95.

In a single-pass assembler the scanning, parsing and object code conversion happen simultaneously. The instruction is fetched, scanned for tokens and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. The object code is generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available.

For this reason the design is usually done in two passes.

A multi-pass assembler first resolves the forward references and then converts the program into object code. The process of the multi-pass assembler is as follows:

Pass-1

Assign addresses to all the statements.

Save the addresses assigned to all labels, to be used in Pass-2.

Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas for assigning the address values.

Define the symbols in the symbol table (generate the symbol table).

Pass-2

Assemble the instructions (translating operation codes and looking up addresses).

Generate data values defined by BYTE, WORD, etc.

Perform the processing of the assembler directives not done during Pass-1.

Write the object program and the assembly listing.

2.1.1.2 Assembler Design

The most important things to concentrate on are the generation of the symbol table and the resolution of forward references.

• Symbol Table

- This is created during Pass 1.

- All the labels of the instructions are symbols.

- The table has an entry for each symbol name and its address value.

• Forward reference

- A reference to a symbol that is defined later in the program is called a forward reference.

- There is no address value for such a symbol in the symbol table until its definition is reached in Pass 1.

2.1.1.3 Functions of the two passes of assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program.

2. Save the addresses assigned to all labels for use in Pass 2.

3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object programs)

1. Assemble instructions (translating operation codes and looking up addresses).

2. Generate data values defined by BYTE, WORD, etc.

3. Perform processing of assembler directives not done in Pass 1.

4. Write the object program and the assembly listing.

212 Assembler Algorithm and Data Structures

Assembler uses two major internal data structures

1 Operation Code Table (OPTAB) Used to lookup mnemonic operation codes and translate

them into their machine language equivalents

2 Symbol Table (SYMTAB) Used to store values(Addresses) assigned to labels

3

Location Counter (LOCCTR)

It is a variable used to help in the assignment of addresses

It is initialized to the beginning address specified in the START statement

After each source statement is processed the length of the assembled instruction or data area

to be generated is added to LOCCTR

Whenever a label is reached in the source program the current value of LOCCTR gives the

address to be associated with that label

Operation Code Table (OPTAB)

Contains the mnemonic operation and its machine language equivalent

Also contains information about instruction format and length

In Pass 1 OPTAB is used to lookup and validate operation codes in the source program

In Pass 2 it is used to translate the operation codes to machine language program

During Pass 2 the information in OPTAB tells which instruction format to use in assembling

the instruction and any peculiarities of the object code instruction

OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching

In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it
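The hash-table organization of OPTAB can be illustrated with a short Python sketch. The dictionary below is only a small excerpt of the SIC/XE instruction set, and the names OPTAB and lookup are assumptions chosen for this illustration, not part of any particular assembler.

```python
# Illustrative excerpt of OPTAB: mnemonic -> (opcode, instruction format).
# A Python dict is a hash table, so lookup by mnemonic is a single probe.
OPTAB = {
    "LDA": (0x00, 3),
    "STL": (0x14, 3),
    "JSUB": (0x48, 3),
    "RSUB": (0x4C, 3),
    "COMPR": (0xA0, 2),
    "CLEAR": (0xB4, 2),
    "TIXR": (0xB8, 2),
}


def lookup(mnemonic):
    """Return (opcode, length in bytes), or None for an invalid operation code."""
    extended = mnemonic.startswith("+")
    entry = OPTAB.get(mnemonic.lstrip("+"))
    if entry is None:
        return None
    opcode, fmt = entry
    if extended and fmt != 3:
        return None  # only format 3 instructions have an extended (format 4) form
    return opcode, 4 if extended else fmt
```

Pass 1 needs only the length returned by such a lookup (to advance LOCCTR), while Pass 2 also uses the opcode value.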

Symbol Table (SYMTAB)

Includes the name and value (address) for each label in the source program, together with flags to indicate error conditions (e.g. a symbol defined in two different places)

During Pass 1 of the assembler labels are entered into SYMTAB as they are encountered in

the source program along with their assigned addresses

During Pass 2 symbols used as operands are looked up in SYMTAB to obtain the addresses

to be inserted in the assembled instructions

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2

This copy of the source program can also be used to retain the results of certain operations performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2

The Algorithm for Pass 1

begin
    read first input line
    if OPCODE = 'START' then
        begin
            save [OPERAND] as starting address
            initialize LOCCTR to starting address
            write line to intermediate file
            read next input line
        end {if START}
    else
        initialize LOCCTR to 0
    while OPCODE ≠ 'END' do
        begin
            if this is not a comment line then
                begin
                    if there is a symbol in the LABEL field then
                        begin
                            search SYMTAB for LABEL
                            if found then
                                set error flag (duplicate symbol)
                            else
                                insert (LABEL, LOCCTR) into SYMTAB
                        end {if symbol}
                    search OPTAB for OPCODE
                    if found then
                        add 3 (instruction length) to LOCCTR
                    else if OPCODE = 'WORD' then
                        add 3 to LOCCTR
                    else if OPCODE = 'RESW' then
                        add 3 * [OPERAND] to LOCCTR
                    else if OPCODE = 'RESB' then
                        add [OPERAND] to LOCCTR
                    else if OPCODE = 'BYTE' then
                        begin
                            find length of constant in bytes
                            add length to LOCCTR
                        end {if BYTE}
                    else
                        set error flag (invalid operation code)
                end {if not a comment}
            write line to intermediate file
            read next input line
        end {while not END}
    write last line to intermediate file
    save (LOCCTR - starting address) as program length
end {Pass 1}

Explanation - Pass 1 Algorithm

The algorithm scans the first statement, START, and saves its operand field (the address) as the starting address of the program. It initializes LOCCTR to this address and writes the line to the intermediate file

If there is no START statement, LOCCTR is initialized to zero

If a label is encountered, the symbol is entered in the symbol table along with its associated address value (the current LOCCTR). If the symbol is already present in the table, an error flag is set to indicate a duplicate symbol

The algorithm next searches OPTAB for the mnemonic code. If it is found, the length of the instruction (3 bytes in SIC) is added to LOCCTR so that it points to the next instruction

If the opcode is the directive WORD, 3 is added to LOCCTR

If it is RESW, 3 times the number of data words is added to LOCCTR

If it is RESB, the number of bytes reserved is added to LOCCTR; if it is BYTE, the length of the constant in bytes is added

When the END directive is reached, the program is complete. The program length is found by evaluating the current LOCCTR minus the starting address saved from the START statement

Each processed line is written to the intermediate file
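The Pass 1 steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the source program is already split into (label, opcode, operand) tuples, comment lines have been dropped, the START operand is hexadecimal, and the function name pass1 and its return values are hypothetical.

```python
def pass1(lines, optab):
    """Pass 1 for standard SIC. lines: list of (label, opcode, operand) tuples."""
    symtab, errors, intermediate = {}, [], []
    start, i = 0, 0
    if lines and lines[0][1] == "START":
        start = int(lines[0][2], 16)          # operand of START is the starting address
        intermediate.append((start, lines[0]))
        i = 1
    locctr = start
    while i < len(lines) and lines[i][1] != "END":
        label, opcode, operand = lines[i]
        intermediate.append((locctr, lines[i]))
        if label:
            if label in symtab:
                errors.append(("duplicate symbol", label))
            else:
                symtab[label] = locctr        # insert (LABEL, LOCCTR) into SYMTAB
        if opcode in optab:
            locctr += 3                       # every SIC instruction is 3 bytes
        elif opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            body = operand[2:-1]              # text between the quotes of C'..' or X'..'
            locctr += len(body) if operand[0] == "C" else len(body) // 2
        else:
            errors.append(("invalid operation code", opcode))
        i += 1
    if i < len(lines):
        intermediate.append((locctr, lines[i]))   # the END line
    return symtab, locctr - start, intermediate, errors
```

The second return value is the program length (LOCCTR minus the starting address), and the list called intermediate plays the role of the intermediate file read by Pass 2.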

The Algorithm for Pass 2

begin
    read first input line {from intermediate file}
    if OPCODE = 'START' then
        begin
            write listing line
            read next input line
        end {if START}
    write Header record to object program
    initialize first Text record
    while OPCODE ≠ 'END' do
        begin
            if this is not a comment line then
                begin
                    search OPTAB for OPCODE
                    if found then
                        begin
                            if there is a symbol in OPERAND field then
                                begin
                                    search SYMTAB for OPERAND
                                    if found then
                                        store symbol value as operand address
                                    else
                                        begin
                                            store 0 as operand address
                                            set error flag (undefined symbol)
                                        end
                                end {if symbol}
                            else
                                store 0 as operand address
                            assemble the object code instruction
                        end {if opcode found}
                    else if OPCODE = 'BYTE' or 'WORD' then
                        convert constant to object code
                    if object code will not fit into the current Text record then
                        begin
                            write Text record to object program
                            initialize new Text record
                        end
                    add object code to Text record
                end {if not comment}
            write listing line
            read next input line
        end {while not END}
    write last Text record to object program
    write End record to object program
    write last listing line
end {Pass 2}

Explanation - Pass 2 Algorithm

The first input line is read from the intermediate file

If the opcode is START, this line is written directly to the listing file

A Header record is written to the object program. It gives the starting address and the length of the program (which was calculated during Pass 1)

Then the first Text record is initialized. Comment lines are ignored

For each instruction, OPTAB is searched to find the machine code of the opcode

If there is a symbol in the operand field, the symbol table is searched for its address value, which is combined with the machine opcode to form the object code

If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to mark the symbol as undefined

If there is no symbol in the operand field, zero is stored as the operand address and the object code instruction is assembled

If the opcode is BYTE or WORD, the constant value is converted to its equivalent object code (for example, the character constant EOF is stored as its hexadecimal equivalent 454F46)

If the object code does not fit into the current Text record, that record is written out and a new Text record is started to hold the object code

The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written
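The object code generation steps of Pass 2 can be sketched as below for the standard SIC machine. The helper names are hypothetical. The Text record layout used here (T, 6-digit starting address, 2-digit length in bytes, then the object code, with at most 30 bytes per record) is the usual SIC convention and is an assumption, since the record layout is not spelled out in this section.

```python
def assemble_sic(opcode, address, indexed=False):
    """Standard SIC instruction: 8-bit opcode, 1-bit x flag, 15-bit address."""
    word = (opcode << 16) | (0x8000 if indexed else 0) | (address & 0x7FFF)
    return f"{word:06X}"


def byte_constant(operand):
    """C'...' becomes the ASCII codes in hex; X'...' is already hex digits."""
    body = operand[2:-1]
    if operand[0] == "C":
        return "".join(f"{ord(ch):02X}" for ch in body)
    return body.upper()


def text_records(codes, max_bytes=30):
    """codes: list of (address, hex object code). Returns a list of Text records."""
    records, start, body = [], None, ""
    for addr, code in codes:
        size = len(code) // 2
        used = len(body) // 2
        contiguous = start is not None and addr == start + used
        # Start a new record on an address gap or when the code will not fit.
        if start is not None and (not contiguous or used + size > max_bytes):
            records.append(f"T{start:06X}{used:02X}{body}")
            start, body = None, ""
        if start is None:
            start = addr
        body += code
    if start is not None:
        records.append(f"T{start:06X}{len(body) // 2:02X}{body}")
    return records
```

For instance, STL (opcode 14) with RETADR at address 1033 assembles to 141033, and the constant C'EOF' becomes 454F46, as described in the explanation above.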

Design and Implementation Issues

Some features of the assembler depend on the architecture of the machine. The SIC machine has only a limited set of instruction formats and hence limited addressing modes. It has only single-operand instructions, and the operand is always a memory reference. Anything fetched from memory requires more time. The improved SIC/XE machine therefore provides more instruction formats and more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design therefore has to consider two kinds of features: machine-dependent features and machine-independent features

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered

200    .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205    .
210    WRREC    CLEAR    X           CLEAR LOOP COUNTER
212             LDT      LENGTH
215    WLOOP    TD       OUTPUT      TEST OUTPUT DEVICE
220             JEQ      WLOOP       LOOP UNTIL READY
225             LDCH     BUFFER,X    GET CHARACTER FROM BUFFER
230             WD       OUTPUT      WRITE CHARACTER
235             TIXR     T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240             JLT      WLOOP
245             RSUB
250    OUTPUT   BYTE     X'05'
255             END      FIRST

Figure 2.4 Example program for SIC/XE

The figure above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set

Indirect addressing is indicated by adding the prefix @ to the operand (line 70)

Immediate operands are denoted with the prefix # (lines 25, 55, 133)

Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode

The assembler directive BASE (line 13) is used in conjunction with base relative addressing

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S

Immediate and indirect addressing have also been used as much as possible

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference

With immediate addressing the operand is already present as part of the instruction and need not be fetched from anywhere

The use of indirect addressing often avoids the need for another instruction

Line   Label    Opcode   Operand
5      COPY     START    0
10     FIRST    STL      RETADR
12              LDB      #LENGTH
13              BASE     LENGTH
15     CLOOP    +JSUB    RDREC
20              LDA      LENGTH
25              COMP     #0
30              JEQ      ENDFIL
35              +JSUB    WRREC
40              J        CLOOP
45     ENDFIL   LDA      EOF
50              STA      BUFFER
55              LDA      #3
60              STA      LENGTH
65              +JSUB    WRREC
70              J        @RETADR
80     EOF      BYTE     C'EOF'
95     RETADR   RESW     1
100    LENGTH   RESW     1
105    BUFFER   RESB     4096
110    .
115    .        SUBROUTINE TO READ RECORD INTO BUFFER
120    .
125    RDREC    CLEAR    X
130             CLEAR    A
132             CLEAR    S
133             +LDT     #4096
135    RLOOP    TD       INPUT
140             JEQ      RLOOP
145             RD       INPUT
150             COMPR    A,S
155             JEQ      EXIT
160             STCH     BUFFER,X
165             TIXR     T
170             JLT      RLOOP
175    EXIT     STX      LENGTH
180             RSUB
185    INPUT    BYTE     X'F1'
195    .
200    .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205    .
210    WRREC    CLEAR    X
212             LDT      LENGTH
215    WLOOP    TD       OUTPUT
220             JEQ      WLOOP
225             LDCH     BUFFER,X
230             WD       OUTPUT
235             TIXR     T
240             JLT      WLOOP
245             RSUB
250    OUTPUT   BYTE     X'05'
255             END      FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly, it supports only one instruction format. Its instructions work with the accumulator A and use the index register X for address modification, so the addressing modes supported by this architecture are direct and indexed

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four different instruction formats:

1-byte instruction (format 1)
2-byte instruction (format 2)
3-byte instruction (format 3)
4-byte instruction (format 4)

Instructions can be

Instructions involving register to register
Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
Extended instruction format

Addressing modes are

PC-relative or Base-relative addressing    op m
Indirect addressing                        op @m
Immediate addressing                       op #c
Extended format                            +op m
Index addressing                           op m,x
Register-to-register instructions
Larger memory -> multi-programming (program allocation)

Translation

1 Translation of instructions involving the register-to-register addressing mode

During Pass 1 the registers can be entered in the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values

2 Translation involving register-memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory

There are two ways of doing this

1) Program-counter relative and

2) Base-relative

A memory operand can be specified using either the format 3 or the format 4 instruction format

In format 3 the instruction has the opcode followed by a 12-bit displacement value in the address field

In format 4 the instruction contains the opcode followed by a 20-bit address in the address field, which is large enough to hold the full target address

1 Program-Counter Relative

a) This mode usually uses the format 3 instruction format

b) The instruction contains the opcode followed by a 12-bit displacement value

c) The displacement ranges from -2048 to +2047, so that it fits in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction

d) The address of the operand is thus calculated relative to the program counter, which holds the address of the next instruction

e) For example, for the statement LDA LENGTH at location 000A, the program counter holds 000D after the instruction is fetched and LENGTH is at 0033, so the displacement is 0033 - 000D = 026 and the object code is 032026

2 Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register. The target address is therefore

TA = (B) + displacement value

b) This addressing mode is used when the range of PC-relative displacement values is not sufficient. The operand is then not addressed relative to the instruction, as it is in PC-relative mode. The programmer uses the directive BASE to tell the assembler what the base register will contain. Once the assembler has encountered this directive, it can use base-relative addressing to calculate the displacement of an operand

c) The NOBASE directive indicates that the base register can no longer be used to calculate the target address of the operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

For example

Line   Loc    Label    Opcode   Operand     Object Code
12     0003            LDB      #LENGTH     69202D
13                     BASE     LENGTH
100    0033   LENGTH   RESW     1
105    0036   BUFFER   RESB     4096
160    104E            STCH     BUFFER,X    57C003
165    1051            TIXR     T           B850

In the above example, the BASE directive tells the assembler that the base register will contain the address of LENGTH, so base-relative addressing can be used wherever PC-relative addressing is not possible

The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value

For line 160:

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 - 0033 = (0003)16

Line   Loc    Label    Opcode   Operand     Object Code
20     000A            LDA      LENGTH      032026
175    1056   EXIT     STX      LENGTH      134000

Consider line 175. If we use PC-relative addressing

disp = TA - (PC) = 0033 - 1059 = EFDA

This value (-1026 in hexadecimal) does not fit in the 12-bit displacement field, so PC-relative addressing is not applicable and base-relative addressing is used instead
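The choice between PC-relative and base-relative addressing can be sketched as a small Python function. This is an illustrative sketch, not a complete assembler: the function name format3 and its parameters are assumptions, and pc stands for the address of the next instruction.

```python
def format3(opcode, target, pc, base=None, n=1, i=1, x=0):
    """Assemble a format 3 instruction and return it as 6 hex digits.

    pc is the address of the next instruction; base is the value declared
    with BASE, or None when no BASE directive is in effect.
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        b, p = 0, 1                  # PC-relative is tried first
        disp &= 0xFFF                # 12-bit two's complement
    elif base is not None and 0 <= target - base <= 4095:
        b, p = 1, 0                  # fall back to base-relative
        disp = target - base
    else:
        raise ValueError("displacement out of range; format 4 is required")
    flags = (x << 3) | (b << 2) | (p << 1)      # e bit is 0 for format 3
    word = ((opcode | (n << 1) | i) << 16) | (flags << 12) | disp
    return f"{word:06X}"
```

With the values from the listing above, format3(0x00, 0x33, 0x0D) gives 032026 for line 20, while for line 175 the PC-relative displacement is out of range and format3(0x10, 0x33, 0x1059, base=0x33) falls back to base-relative, giving 134000.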

3 Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address is itself the operand

If a symbol is used as the immediate operand of an instruction, the instruction combines immediate addressing with PC-relative mode. For example, LDB #LENGTH on line 12 is assembled as 69202D, where 02D is the displacement from the program counter (0006) to LENGTH (0033)

4 Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode

For example, consider the statement J @RETADR on line 70

The instruction transfers control to the address stored at location RETADR. If the address of RETADR is 0030, the displacement assembled into the instruction is 0003

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time

The system must be able to load programs into memory wherever there is room

The exact starting address of the program is not known until load time

Absolute Address

An absolute program has its starting address specified at assembly time

Its addresses may be invalid if the program is loaded somewhere else

Example

Consider an instruction that loads register A with the value stored at location 102D

Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D

The address of the value shifts by the same amount as the program. Hence we need to make some changes in the address portion of the instruction so that we can load and execute the program at location 3000

Some instructions need their operand address changed when the program load address changes. Other parts of the program remain the same regardless of where the program is loaded

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification

An object program that contains the information necessary to perform this kind of modification is called a relocatable program

Example Program Relocation

1) The diagram shows the concept of relocation

a) Initially the program is loaded at location 0000

b) The instruction JSUB is loaded at location 0006

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC

2) The second figure shows the program loaded at the new location 5000. The address in the JSUB instruction is modified to the new location 6036

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction needs to be changed to 4B108456, which corresponds to the new address of RDREC

The only parts of the program that require modification at load time are those that specify direct addresses

The rest of the instructions need not be modified

o The operand is not a memory address (immediate addressing)

o The operand is addressed PC-relative or Base-relative

From the object program alone it is not possible to distinguish an address from a constant

o The assembler must keep some information to tell the loader

o An object program that contains Modification records is called a relocatable program

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0)

Produce a Modification record to store the starting location and the length of the address field to be modified

This command for the loader must also be a part of the object program

Modification record

One Modification record is produced for each address to be modified

The length is stored in half-bytes (4 bits)

The starting location is the location of the byte containing the leftmost bits of the address field to be modified

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modification. The lines at the end of the object program are the Modification records for the instructions that need to change if relocation occurs. For example, M00000705 is the Modification record for the address field that starts at location 0007 and is 5 half-bytes long
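How a loader might apply such a Modification record can be sketched in Python. This is a simplified illustration: the function name relocate is hypothetical, the program is assumed to have been assembled for address 0 and copied into a bytearray, and each record is assumed to have the form M, 6-digit starting location, 2-digit length in half-bytes.

```python
def relocate(memory, records, load_addr):
    """Add load_addr to every address field named by a Modification record."""
    for rec in records:
        start = int(rec[1:7], 16)             # byte holding the leftmost bits
        half_bytes = int(rec[7:9], 16)        # field length in half-bytes
        nbytes = (half_bytes + 1) // 2
        field = int.from_bytes(memory[start:start + nbytes], "big")
        mask = (1 << (4 * half_bytes)) - 1
        # Keep any bits to the left of the field (odd lengths start mid-byte).
        patched = (field & ~mask) | ((field + load_addr) & mask)
        memory[start:start + nbytes] = patched.to_bytes(nbytes, "big")
    return memory


program = bytearray(10)
program[6:10] = bytes.fromhex("4B101036")     # +JSUB RDREC at location 0006
relocate(program, ["M00000705"], 0x5000)
jsub_at_5000 = program[6:10].hex().upper()
```

After the call, jsub_at_5000 holds 4B106036, which matches the relocation example above; loading at 7420 instead gives 4B108456.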

Machine-independent assembler features

These are assembler features not closely related to the machine architecture

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal

Line   Loc    Label    Opcode   Operand     Object Code
45     001A   ENDFIL   LDA      =C'EOF'     032010
       002D                     =C'EOF'     454F46
215    1062   WLOOP    TD       =X'05'      E32011
       1076                     =X'05'      05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value

The difference between a literal and an immediate operand

With immediate addressing the operand value is assembled as a part of the machine instruction

With a literal the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools

Where the literal pool should be placed

93            LTORG
       002D   =C'EOF'   454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand

BASE *
LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals

The easiest way to recognize duplicate literals is by comparison of the character strings defining them

• 100 LDA =C'EOF'
• 125 LDA =C'EOF'
• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal value, as =C'EOF' and =X'454F46' do. Recognizing such literals as duplicates complicates the design of an assembler

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB)

LITTAB contains the literal name, the value and length of the operand, and the address assigned to it

LITTAB is often organized as a hash table using the literal name or value as the key

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one

• The location counter is updated to reflect the number of bytes occupied by each literal

Pass 2

• Search LITTAB for each literal operand encountered

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record
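The Pass 1 handling of literals can be sketched with a small Python class. The class and method names are hypothetical, duplicates are recognized only by comparing literal names (the simple approach described above), and only the =C'...' and =X'...' forms are handled.

```python
class LiteralTable:
    """Minimal LITTAB: literal name -> value, length and (later) address."""

    def __init__(self):
        self.entries = {}

    def add(self, name):
        """Called in Pass 1 for each literal operand, e.g. =C'EOF'."""
        if name in self.entries:
            return                              # duplicate literal: no action
        body = name[3:-1]                       # text between the quotes
        if name[1] == "C":
            value = body.encode("ascii").hex().upper()
        else:
            value = body.upper()
        self.entries[name] = {"value": value,
                              "length": len(value) // 2,
                              "address": None}

    def place_pool(self, locctr):
        """Called at LTORG or END: assign addresses, return the new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr
```

Using the values from the listing above, adding =C'EOF' twice leaves a single entry, and place_pool(0x2D) assigns it address 002D and returns 0030 as the updated location counter.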

Symbol-defining statements

Assembler directives

EQU
ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values

+LDT #4096

can be written as

MAXLEN EQU 4096
+LDT #MAXLEN

Define mnemonic names for registers

A EQU 0
X EQU 1
L EQU 2

so that RMO A,X can be written instead of RMO 0,1

Establish and use names that reflect the logical function of the registers in the program

BASE EQU R1
ACCUMULATOR EQU R2
INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols

ORG value

"value" is a constant or an expression involving constants and previously defined symbols

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value

Use ORG for label definition

Suppose that we want to define a table with the following structure

STAB             SYMBOL    VALUE     FLAGS
(100 entries)    6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program

Allowed:

ALPHA RESW 1
BETA EQU ALPHA

Not allowed, because ALPHA is not yet defined when BETA is processed:

BETA EQU ALPHA
ALPHA RESW 1

Not allowed, because the value of ALPHA depends on two levels of forward reference:

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1

Not allowed, because ORG refers to a symbol that is defined later:

ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1
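The restriction on forward references in EQU can be illustrated with a short Python sketch. The function define_equ is hypothetical and handles only a decimal constant or a single symbol as the operand; it simply refuses a symbol that Pass 1 has not yet entered into SYMTAB.

```python
def define_equ(symtab, symbol, operand):
    """Process `symbol EQU operand` during Pass 1 of a two-pass assembler."""
    if operand.isdigit():
        value = int(operand)                 # numeric constant
    elif operand in symtab:
        value = symtab[operand]              # previously defined symbol
    else:
        # The operand is defined later in the program (or not at all).
        raise ValueError(f"{operand} is not yet defined (forward reference)")
    symtab[symbol] = value
```

With ALPHA already in SYMTAB, BETA EQU ALPHA succeeds; with the two statements in the opposite order the call raises an error, which corresponds to the disallowed sequence above.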

Expressions

Most assemblers allow the use of an expression wherever a single operand such as a label or literal is permitted

• Each such expression must be evaluated by the assembler to produce a single operand address or value

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /

• Individual terms in the expression may be
  • constants
  • user-defined symbols or
  • special terms
• The most common special term is the current value of the location counter (often designated by *)

Types of terms

Absolute terms - The value of an absolute term is independent of the program location

Relative terms - The value of a relative term depends on the beginning address of the program

Types of expressions

By the type of value produced, expressions can be classified as

1 Absolute expressions

• The value of an absolute expression is independent of the program location
• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each pair have opposite signs. No relative term may enter into a multiplication or division operation
• e.g. MAXLEN EQU BUFEND-BUFFER

2 Relative expressions

• The value of a relative expression is relative to the beginning address of the object program
• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation

Expressions that are neither relative nor absolute should be flagged by the assembler as errors

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions
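The pairing rule for relative terms can be checked mechanically. The Python sketch below is an illustration under simplifying assumptions: it handles only + and - between symbols and decimal constants (any other operator is rejected), the name classify is hypothetical, and SYMTAB is assumed to store a type flag (A or R) with each value.

```python
import re

SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def classify(expr, symtab=SYMTAB):
    """Return ('A', value) or ('R', value); raise ValueError otherwise."""
    if not re.fullmatch(r"\s*[+-]?\s*\w+(\s*[+-]\s*\w+)*\s*", expr):
        raise ValueError("only + and - are supported in this sketch")
    relative, value = 0, 0
    for sign, term in re.findall(r"([+-]?)\s*(\w+)", expr):
        factor = -1 if sign == "-" else 1
        if term.isdigit():
            value += factor * int(term)
        else:
            kind, v = symtab[term]
            value += factor * v
            if kind == "R":
                relative += factor      # +1 for a positive, -1 for a negative term
    if relative == 0:
        return "A", value               # relative terms cancel in pairs
    if relative == 1:
        return "R", value               # exactly one unpaired positive term
    raise ValueError("expression is neither absolute nor relative")
```

With the symbol table above, BUFEND-BUFFER is classified as absolute with value 1000 (hexadecimal), which matches MAXLEN, while BUFEND+BUFFER and 100-BUFFER are rejected.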

Program blocks

The program is divided into three blocks: the default block (0), CDATA (1) and CBLKS (2)

Default (0): contains the executable instructions
CDATA (1): contains the data areas that need little memory (a few words or less)
CBLKS (2): contains the data areas that consist of larger blocks of memory

The assembler directive USE, followed by a block name, marks the statements that belong to that block. Each block has its own location counter, which starts at 0000 the first time the block is named and resumes from its saved value whenever the block is resumed. The block number is recorded as an additional column in the listing and as an additional field in SYMTAB

The program block table consists of the block name, block number, starting address and length

Once the location counter values have been assigned, the object code is generated and the object program is written
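How the starting addresses in the program block table follow from the block lengths can be sketched in Python. The function names are hypothetical, and the block lengths used in the usage note are illustrative values only, not taken from the program above.

```python
def block_starts(lengths):
    """lengths: block lengths in block-number order. Returns starting addresses."""
    starts, addr = [], 0
    for length in lengths:
        starts.append(addr)     # each block begins where the previous one ends
        addr += length
    return starts


def absolute_address(starts, block, relative):
    """Address of a symbol = start of its block + its address within the block."""
    return starts[block] + relative
```

For example, with illustrative lengths of 0066, 000B and 1000 (hexadecimal) for blocks 0, 1 and 2, the starting addresses come out as 0000, 0066 and 0071, and a symbol at relative address 0003 in block 1 is placed at 0069.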


Program Block Diagram


Control sections and program linking

Control section

i A control section is a part of the program that maintains its identity after assembly
ii Each control section can be loaded and relocated independently of the others
iii Different control sections are most often used for subroutines or other logical subdivisions of a program

Assembler directive CSECT

CSECT signals the start of a new control section

The assembler establishes a separate location counter (initialized to 0) for each control section

Assembler directives EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere

Only the object code generated for instructions that refer to EXTREF symbols changes: the assembler cannot know their addresses, so it inserts zero and leaves the value to be filled in at load time

The object program uses six record types: Header record, Define record, Refer record, Text record, Modification record and End record

Object Program for Control Section and Program Linking


2.1.1.1 The assembler design can be done as a

Single-pass assembler
Multi-pass assembler

Single-pass Assembler

In a single-pass assembler the whole process of scanning, parsing and object code conversion is done in a single pass

The only problem with this method is resolving forward references

The problem of forward references is shown with the example below

Line   Loc    Label    Opcode   Operand   Object Code
10     1000   FIRST    STL      RETADR    141033
--
95     1033   RETADR   RESW     1

In the above example, the instruction STL on line 10 stores the contents of the linkage register in RETADR. But while this instruction is being processed the value of the symbol RETADR is not known, because it is defined later, on line 95

In a single-pass assembler the scanning, parsing and object code conversion happen simultaneously. The instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. The object code is generated by combining the opcode of STL with the value of the symbol RETADR, which is not yet available

For this reason the design is usually done in two passes

A multi-pass assembler first resolves the forward references and then generates the object code. The process of the multi-pass assembler is as follows

SYSTEM SOFTWARE ( CS 2304) UNIT - IIPass-1

Assign addresses to all the statements.

Save the addresses assigned to all labels, to be used in Pass-2.

Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas for assigning address values.

Define the symbols in the symbol table (that is, generate the symbol table).

Pass-2

Assemble the instructions (translating operation codes and looking up addresses).

Generate data values defined by BYTE, WORD, etc.

Perform the processing of the assembler directives not done during Pass-1.

Write the object program and the assembly listing.

2.1.1.2 Assembler Design

The two most important design concerns are the generation of the symbol table and the resolution of forward references.

• Symbol Table

– It is created during Pass 1.

– All the labels of the instructions are symbols.

– Each entry holds the symbol name and its address value.

• Forward reference

– A reference to a symbol that is defined later in the program is called a forward reference.

– Until the definition is reached in Pass 1, the symbol table holds no address value for such a symbol.

2.1.1.3 Functions of the two passes of assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program.

2. Save the addresses assigned to all labels for use in Pass 2.

3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object programs)

1. Assemble instructions (translating operation codes and looking up addresses).

2. Generate data values defined by BYTE, WORD, etc.

3. Perform processing of assembler directives not done in Pass 1.

4. Write the object program and the assembly listing.

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures and one variable:

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents.

2. Symbol Table (SYMTAB): used to store the values (addresses) assigned to labels.

3. Location Counter (LOCCTR): a variable used to help in the assignment of addresses.

Location Counter (LOCCTR)

It is initialized to the beginning address specified in the START statement.

After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.

Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

Contains the mnemonic operation code and its machine language equivalent.

Also contains information about instruction format and length.

In Pass 1, OPTAB is used to look up and validate operation codes in the source program.

In Pass 2, it is used to translate the operation codes to machine language.

During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction, and any peculiarities of the object code instruction.

OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.

In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it.
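The hash-table organization described above can be sketched with a Python dictionary. This is only an illustrative sketch: the table holds a small subset of the standard SIC mnemonics, and the function name lookup_opcode is an assumption, not something defined in these notes.

```python
# A small, static OPTAB: mnemonic -> (opcode, instruction length in bytes).
# Only a few standard SIC instructions are listed for illustration.
OPTAB = {
    "LDA": (0x00, 3),
    "STA": (0x0C, 3),
    "STL": (0x14, 3),
    "COMP": (0x28, 3),
    "JSUB": (0x48, 3),
    "RSUB": (0x4C, 3),
}


def lookup_opcode(mnemonic):
    """Return (opcode, length) for a valid mnemonic, or None if it is invalid.

    Pass 1 only needs to know whether the mnemonic exists and how long the
    instruction is; Pass 2 uses the opcode value itself.
    """
    return OPTAB.get(mnemonic.upper())


print(lookup_opcode("STL"))   # (20, 3), i.e. opcode 0x14
print(lookup_opcode("XYZ"))   # None -> invalid operation code
```

A dictionary gives the constant-time retrieval that the notes attribute to hashing, and because the table is never modified after it is built it matches the "static table" description.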

Symbol Table (SYMTAB)

Includes the name and value (address) for each label in the source program, and flags to indicate error conditions (e.g. a symbol defined in two different places).

During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.

During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.
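A minimal sketch of SYMTAB with its duplicate-symbol flag is shown below. The entry layout (a dictionary with "address" and "error" fields) and the function names enter_symbol and lookup_symbol are assumptions chosen for illustration.

```python
def enter_symbol(symtab, label, locctr):
    """Pass 1: enter a label with the current LOCCTR value.

    If the label is already present, the existing entry is flagged as a
    duplicate and its address is left unchanged. Returns True on success.
    """
    if label in symtab:
        symtab[label]["error"] = "duplicate symbol"
        return False
    symtab[label] = {"address": locctr, "error": None}
    return True


def lookup_symbol(symtab, name):
    """Pass 2: return the address of a symbol, or None if it is undefined."""
    entry = symtab.get(name)
    return None if entry is None else entry["address"]


symtab = {}
enter_symbol(symtab, "FIRST", 0x1000)
enter_symbol(symtab, "RETADR", 0x1033)
enter_symbol(symtab, "FIRST", 0x2000)      # second definition is flagged
print(hex(lookup_symbol(symtab, "RETADR")))  # 0x1033
print(symtab["FIRST"])
```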

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

Begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE != 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
End {Pass 1}

Explanation - PASS I Algorithm

The algorithm scans the first statement, START, and saves the operand field (the address) as the starting address of the program. It initializes LOCCTR to this address and writes the line to the intermediate file.

If there is no START statement, LOCCTR is initialized to zero.

If a label is encountered, the symbol is entered in the symbol table along with its associated address value. If the symbol is already in the table, an error flag is set to indicate a duplicate symbol.

It next searches OPTAB for the mnemonic code. If it is found, the length of the instruction is added to LOCCTR to make it point to the next instruction.

If the opcode is the directive WORD, it adds 3 to LOCCTR.

If it is RESW, it adds 3 times the number of data words to LOCCTR.

If it is BYTE, it adds the length of the constant in bytes to LOCCTR; if it is RESB, it adds the number of bytes reserved.

If it is the END directive, the end of the program has been reached. The program length is found by subtracting the starting address saved from the START statement from the current LOCCTR.

Each processed line is written to the intermediate file.
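The Pass 1 steps explained above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the notes' own implementation: the input is assumed to be a list of (label, opcode, operand) tuples with comment lines already removed, the intermediate file is omitted, and the mnemonic set OPCODES and the helper names pass1 and byte_length are hypothetical.

```python
# Assumed subset of SIC mnemonics; every SIC instruction is 3 bytes long.
OPCODES = {"STL", "LDA", "STA", "COMP", "JEQ", "J", "JSUB", "RSUB"}


def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else len(body) // 2


def pass1(lines):
    """Return (symtab, program_length, errors) for a list of statements."""
    symtab, errors = {}, []
    start = locctr = 0
    if lines and lines[0][1] == "START":
        start = locctr = int(lines[0][2], 16)   # operand is a hex address
        lines = lines[1:]
    for label, opcode, operand in lines:
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(("duplicate symbol", label))
            else:
                symtab[label] = locctr
        if opcode in OPCODES or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(("invalid operation code", opcode))
    return symtab, locctr - start, errors


source = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    (None, "END", "FIRST"),
]
symtab, length, errors = pass1(source)
print({name: hex(addr) for name, addr in symtab.items()}, hex(length), errors)
```

Note that the forward reference to RETADR on the first instruction causes no difficulty here, because Pass 1 only assigns addresses and does not yet need operand values.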

The Algorithm for Pass 2

Begin
  read first input line (from intermediate file)
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE != 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
End {Pass 2}

Explanation - PASS II Algorithm

The first input line is read from the intermediate file.

If the opcode is START, this line is written directly to the listing file.

A Header record is written to the object program; it gives the starting address and the length of the program (calculated during Pass 1).

Then the first Text record is initialized. Comment lines are ignored.

For each instruction, OPTAB is searched to find the machine code of the opcode.

If there is a symbol in the operand field, the symbol table is searched for its address, which is combined with the machine code of the opcode.

If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to mark the symbol as undefined.

If there is no symbol in the operand field, zero is stored as the operand address and the object code instruction is assembled.

If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, the character constant EOF is stored as the hexadecimal value '454F46').

If the object code cannot fit into the current Text record, that record is written out and a new Text record is started for the object code of the remaining instructions.

The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.
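The core of Pass 2, assembling one standard SIC instruction and converting a BYTE constant, can be sketched as below. This is an illustrative sketch only: the OPTAB subset and the function names assemble_sic and byte_constant are assumptions, and Text record handling is left out.

```python
# Assumed subset of OPTAB: mnemonic -> opcode (standard SIC values).
OPTAB = {"LDA": 0x00, "STL": 0x14, "LDCH": 0x50, "RSUB": 0x4C}


def assemble_sic(opcode, operand, symtab):
    """Return (object_code, error) for one SIC instruction.

    The 24-bit instruction is: 8-bit opcode, 1 index bit, 15-bit address.
    An undefined symbol yields address 0 and an error message.
    """
    address, indexed, error = 0, False, None
    if operand:
        if operand.endswith(",X"):
            indexed, operand = True, operand[:-2]
        if operand in symtab:
            address = symtab[operand]
        else:
            error = "undefined symbol"
    word = (OPTAB[opcode] << 16) | (0x8000 if indexed else 0) | address
    return f"{word:06X}", error


def byte_constant(operand):
    """Object code for a BYTE constant: C'...' characters or X'...' hex."""
    body = operand[2:-1]
    if operand[0] == "C":
        return "".join(f"{ord(ch):02X}" for ch in body)
    return body.upper()


symtab = {"RETADR": 0x1033}
print(assemble_sic("STL", "RETADR", symtab))  # ('141033', None)
print(byte_constant("C'EOF'"))                # 454F46
```

The first result reproduces the object code 141033 shown for line 10 in the forward-reference example earlier in these notes.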

Design and Implementation Issues

Some features of the assembler depend on the architecture of the machine. The SIC machine has only one instruction format and hence limited addressing modes. All its instructions are single-operand instructions, and the operand is always a memory reference. Anything fetched from memory takes more time. The improved SIC/XE machine therefore provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the number of instruction formats and addressing modes available changes with it. The design must therefore consider two kinds of features: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

This section considers the design and implementation of an assembler for the more complex XE version of SIC.

200           SUBROUTINE TO WRITE RECORD FROM BUFFER
205
210  WRREC    CLEAR   X          CLEAR LOOP COUNTER
212           LDT     LENGTH
215  WLOOP    TD      OUTPUT     TEST OUTPUT DEVICE
220           JEQ     WLOOP      LOOP UNTIL READY
225           LDCH    BUFFER,X   GET CHARACTER FROM BUFFER
230           WD      OUTPUT     WRITE CHARACTER
235           TIXR    T          LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240           JLT     WLOOP
245           RSUB
250  OUTPUT   BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program for SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Loc  Label   Opcode  Operand     Object Code

5          COPY    START   0
10         FIRST   STL     RETADR
12                 LDB     #LENGTH
13                 BASE    LENGTH
15         CLOOP   +JSUB   RDREC
20                 LDA     LENGTH
25                 COMP    #0
30                 JEQ     ENDFIL
35                 +JSUB   WRREC
40                 J       CLOOP
45         ENDFIL  LDA     EOF
50                 STA     BUFFER
55                 LDA     #3
60                 STA     LENGTH
65                 +JSUB   WRREC
70                 J       @RETADR
80         EOF     BYTE    C'EOF'
95         RETADR  RESW    1
100        LENGTH  RESW    1
105        BUFFER  RESB    4096
110
115                SUBROUTINE TO READ RECORD INTO BUFFER
120
125        RDREC   CLEAR   X
130                CLEAR   A
132                CLEAR   S
133                +LDT    #4096
135        RLOOP   TD      INPUT
140                JEQ     RLOOP
145                RD      INPUT
150                COMPR   A,S
155                JEQ     EXIT
160                STCH    BUFFER,X
165                TIXR    T
170                JLT     RLOOP
175        EXIT    STX     LENGTH
180                RSUB
185        INPUT   BYTE    X'F1'
195
200                SUBROUTINE TO WRITE RECORD FROM BUFFER
205
210        WRREC   CLEAR   X
212                LDT     LENGTH
215        WLOOP   TD      OUTPUT
220                JEQ     WLOOP
225                LDCH    BUFFER,X
230                WD      OUTPUT
235                TIXR    T
240                JLT     WLOOP
245                RSUB
250        OUTPUT  BYTE    X'05'
255                END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes (32,768 bytes), and the machine supports only one instruction format. Besides the accumulator A, the only register that takes part in address calculation is the index register X. Therefore the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

1-byte instruction (format 1)

2-byte instruction (format 2)

3-byte instruction (format 3)

4-byte instruction (format 4)

Instructions can be

Instructions involving register to register

Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

Extended instruction format

Addressing modes are

PC-relative or base-relative addressing: op m

Indirect addressing: op @m

Immediate addressing: op #c

Extended format: +op m

Indexed addressing: op m,X

Register-to-register instructions

Larger memory -> multiprogramming (program allocation)

Translation

1. Translation of instructions involving the register-register addressing mode

During Pass 1 the register names can be entered in the symbol table itself, with their numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.

2. Translation of register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to chapter 1.

Among the instruction formats, format 3 and format 4 instructions are of the register-memory type. One operand is always in a register and the other is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways:

1) Program-counter relative and

2) Base-relative

An instruction that refers to memory can be assembled in either the format 3 or the format 4 instruction format.

In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4, the instruction contains the opcode followed by a 20-bit address in the address field.

1. Program-Counter Relative

a) This mode normally uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed value in the range -2048 to 2047, so that it fits in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter, which already points to the next instruction when the displacement is applied.

e) The following example shows how the address is calculated.
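As a worked illustration of the PC-relative calculation (a sketch added here, not the figure from the original notes), the Python below computes the displacement and packs a format 3 instruction. The function names pc_relative_disp and format3 are assumptions. The example assumes STL RETADR is the first instruction of a program that starts at 0, so it sits at location 0000, and uses the address 0030 given for RETADR elsewhere in these notes.

```python
def pc_relative_disp(target, next_addr):
    """Signed displacement from the program counter.

    next_addr is the address of the following instruction, which is what the
    PC holds when the displacement is applied. Raises ValueError if the
    result does not fit in the signed 12-bit field (-2048..2047).
    """
    disp = target - next_addr
    if not -2048 <= disp <= 2047:
        raise ValueError(f"displacement {disp} out of PC-relative range")
    return disp


def format3(opcode, n, i, x, b, p, disp):
    """Pack a format 3 instruction as six hex digits (the e bit is 0)."""
    word = ((opcode & 0xFC) | (n << 1) | i) << 16   # opcode plus n, i flags
    word |= (x << 15) | (b << 14) | (p << 13)       # x, b, p flags
    word |= disp & 0xFFF                            # 12-bit two's complement
    return f"{word:06X}"


# STL RETADR at location 0000: PC = 0003, target = 0030, disp = 02D.
disp = pc_relative_disp(0x0030, 0x0003)
print(format3(0x14, 1, 1, 0, 0, 1, disp))  # 17202D
```

Masking with 0xFFF is what turns a negative displacement into its 12-bit two's complement form, so backward references are encoded by the same function.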

2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register B. The target address is therefore

TA = (B) + displacement

where the displacement is an unsigned value in the range 0 to 4095.

b) This addressing mode is used when the range of the PC-relative displacement is not sufficient. The operand is then located relative to the base register rather than relative to the instruction, as in PC-relative addressing. The programmer announces the contents of the base register with the BASE directive. From the point where the assembler encounters this directive, it may use base-relative addressing to calculate the target address of an operand.

c) The NOBASE directive indicates that the base register can no longer be used to calculate the target address of an operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

For example

12    0003           LDB     #LENGTH     69202D
13                   BASE    LENGTH
100   0033   LENGTH  RESW    1
105   0036   BUFFER  RESB    4096
160   104E           STCH    BUFFER,X    57C003
165   1051           TIXR    T           B850

In the above example, the BASE directive tells the assembler that the base register will contain the address of LENGTH, so base-relative addressing can be used wherever PC-relative addressing is not possible.

The LDB instruction loads the address of LENGTH, 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160:

BUFFER is at location 0036 (hex)

(B) = 0033 (hex)

disp = 0036 - 0033 = 0003 (hex)

20    000A           LDA     LENGTH      032026
175   1056   EXIT    STX     LENGTH      134000

Line 20 can use PC-relative addressing: disp = TA - (PC) = 0033 - 000D = 026.

Consider line 175. If we try PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = -1026 (hex), which is EFDA in 16-bit two's complement.

This value does not fit in the 12-bit displacement field, so PC-relative addressing is not applicable and base-relative addressing is used instead: disp = 0033 - 0033 = 000.
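The rule "try PC-relative first, then base-relative" can be sketched as a small Python function. The name choose_mode and its return convention are assumptions made for illustration; the addresses come from lines 20, 160, and 175 of the example above.

```python
def choose_mode(target, next_addr, base=None):
    """Pick the format 3 addressing mode for an operand at address target.

    next_addr is the address of the following instruction (the PC value);
    base is the value announced by BASE, or None when no base is in effect.
    Returns (mode, 12-bit displacement).
    """
    disp = target - next_addr
    if -2048 <= disp <= 2047:
        return "PC", disp & 0xFFF
    if base is not None and 0 <= target - base <= 4095:
        return "BASE", target - base
    raise ValueError("operand not reachable in format 3; format 4 is needed")


B = 0x0033  # BASE LENGTH, with LENGTH at 0033
print(choose_mode(0x0033, 0x000D, B))  # line 20:  ('PC', 38), i.e. disp 026
print(choose_mode(0x0036, 0x1051, B))  # line 160: ('BASE', 3)
print(choose_mode(0x0033, 0x1059, B))  # line 175: ('BASE', 0)
```

For the indexed instruction on line 160 the displacement is computed from the address of BUFFER alone, since the contents of X are only added when the instruction executes.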

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address is the operand itself.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative mode, as shown in the example below.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing.

For example

The instruction transfers control to the address stored at location RETADR. If the address of RETADR is 0030, the displacement is then 003, as calculated above.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute Address

A program whose starting address is specified at assembly time.

Its addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer at address 102D.

The address changes with the displacement of the program. Hence the address portion of the instruction must be changed so that the program can be loaded and executed at location 3000.

Some instructions change their operand address value as the program load address changes, while other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example of Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address field of the JSUB instruction is modified to 06036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.
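The arithmetic in the three cases above can be checked with a short Python sketch. The function name relocate_format4 is hypothetical; it assumes the instruction is a format 4 instruction whose low 20 bits hold the address field, as for the +JSUB in this example.

```python
def relocate_format4(instruction_hex, load_address):
    """Add the load address to the 20-bit address field of a format 4
    instruction given as eight hex digits, leaving the other bits unchanged."""
    word = int(instruction_hex, 16)
    address = ((word & 0xFFFFF) + load_address) & 0xFFFFF
    return f"{(word & ~0xFFFFF) | address:08X}"


print(relocate_format4("4B101036", 0x0000))  # 4B101036
print(relocate_format4("4B101036", 0x5000))  # 4B106036
print(relocate_format4("4B101036", 0x7420))  # 4B108456
```

Only the address field changes; the opcode and flag bits (4B1) are the same at every load address.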

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified:

o The operand is not a memory address (immediate addressing)

o The operand is addressed PC-relative or base-relative

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

There is one Modification record for each address to be modified.

The length is stored in half-bytes (4 bits each).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that must change if relocation occurs. M00000705 is the Modification record for the statement at location 0006: the address field to be modified starts at location 0007 and is 5 half-bytes long.
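A Modification record such as M00000705 can be generated with a few lines of Python. This is a sketch under an assumption about the layout, inferred from the example record: the letter M, a 6-digit hexadecimal starting location, and a 2-digit hexadecimal length in half-bytes. The function names are hypothetical.

```python
def modification_record(start, half_bytes):
    """Build a Modification record: 'M', start location, length in half-bytes."""
    return f"M{start:06X}{half_bytes:02X}"


def record_for_format4(instruction_location):
    """Record for the 20-bit address field of a format 4 instruction.

    The field begins in the second byte of the instruction and is
    5 half-bytes long, so it starts in the middle of that byte.
    """
    return modification_record(instruction_location + 1, 5)


print(record_for_format4(0x0006))  # M00000705
```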

Machine independent assembler features

Assembler features not closely related to machine architecture

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45   001A  ENDFIL  LDA  =C'EOF'   032010
     002D          =C'EOF'        454F46

215  1062  WLOOP   TD   =X'05'    E32011
     1076          =X'05'         05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93          LTORG
     002D   =C'EOF'   454F46

The assembler directive LTORG tells the assembler to generate a literal pool at that point.

Literal for current value of location counter

The current value of the location counter can be denoted by a literal operand, written =*.

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparing the character strings that define them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46' (literal operands with different literal names may have the same literal value)

Recognizing literal operands that have different literal names but the same literal value complicates the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, value, and length, together with the address assigned to the literal in Pass 1.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
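The Pass 1 handling of literals described above can be sketched as follows. The class name LiteralTable, its method names, and the entry layout are assumptions for illustration; duplicates are recognized by comparing the defining strings, which is the simple approach the notes recommend.

```python
def literal_value(name):
    """Hex object code for a literal written as =C'...' or =X'...'."""
    kind, body = name[1], name[3:-1]
    if kind == "C":
        return "".join(f"{ord(ch):02X}" for ch in body)
    return body.upper()


class LiteralTable:
    def __init__(self):
        self.entries = {}  # literal name -> value, length, address

    def add(self, name):
        """Pass 1: record a literal operand; a duplicate name is ignored."""
        if name not in self.entries:
            value = literal_value(name)
            self.entries[name] = {
                "value": value,
                "length": len(value) // 2,
                "address": None,
            }

    def place_pool(self, locctr):
        """At LTORG or END: assign addresses to unplaced literals.

        Returns the location counter advanced past the new literal pool.
        """
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LiteralTable()
littab.add("=C'EOF'")
littab.add("=C'EOF'")          # duplicate, stored only once
locctr = littab.place_pool(0x002D)
print(littab.entries["=C'EOF'"], hex(locctr))
```

With the pool placed at 002D, the literal =C'EOF' receives address 002D and value 454F46, matching the LTORG example above.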

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the value given by "value".

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. The statement

+LDT #4096

can be written as

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers

A EQU 0

X EQU 1

L EQU 2

RMO A,X (in place of RMO 0,1)

Establish and use names that reflect the logical function of the registers in the program

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

ORG value

"value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB (100 entries)    SYMBOL     VALUE      FLAGS
                      6 bytes    3 bytes    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

Disallowed:

BETA EQU ALPHA
ALPHA RESW 1

Disallowed:

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1

Disallowed:

ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1

Allowed:

ALPHA RESW 1
BETA EQU ALPHA

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
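The pairing rule can be sketched by counting relative terms with their signs. The Python below is a simplified illustration that handles only + and - (an expression using * or / on a relative term, such as 3*BUFFER, would be rejected separately); the function name expression_type and the symbol-type dictionary are assumptions based on the table above.

```python
import re


def expression_type(expr, symbol_types):
    """Classify an expression built from terms joined by + and -.

    Returns 'A' for absolute, 'R' for relative, or None when the expression
    is neither and should be flagged as an error. Numeric constants are
    absolute terms; symbol_types maps each symbol to 'A' or 'R'.
    """
    net_relative = 0
    for sign, term in re.findall(r"([+-]?)\s*([A-Za-z0-9]+)", expr):
        if not term.isdigit() and symbol_types[term] == "R":
            net_relative += -1 if sign == "-" else 1
    # 0 unpaired relative terms -> absolute; exactly one positive -> relative.
    return {0: "A", 1: "R"}.get(net_relative)


types = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}
print(expression_type("BUFEND-BUFFER", types))  # A
print(expression_type("RETADR+3", types))       # R
print(expression_type("BUFEND+BUFFER", types))  # None
print(expression_type("100-BUFFER", types))     # None
```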

Program blocks

The program is divided into three blocks: Default (block 0), CDATA (block 1), and CBLKS (block 2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large amounts of memory

When the object code is generated, one more column, the block number, is included in the listing, and this field is also added to SYMTAB. A new block is introduced by writing USE followed by the block name; the location counter for a block is set to 0000 when the block first begins.

The program block table consists of the block name, block number, starting address, and length.

After the location counter values have been found, the object code is generated and the object program is written.
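The program block table can be sketched in Python as below. The block lengths used here are hypothetical values chosen only to illustrate the calculation, and the function names block_table and absolute_address are assumptions. Each block's starting address is the sum of the lengths of the blocks before it, and a symbol's final address is its block's starting address plus its relative address within the block.

```python
def block_table(block_lengths):
    """Build the block table from (name, length) pairs in block-number order."""
    table, start = {}, 0
    for number, (name, length) in enumerate(block_lengths):
        table[name] = {"number": number, "start": start, "length": length}
        start += length
    return table


def absolute_address(table, block, relative_address):
    """Final address of a symbol stored as (block, relative address)."""
    return table[block]["start"] + relative_address


# Hypothetical block lengths, as found at the end of Pass 1.
table = block_table([("DEFAULT", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
for name, row in table.items():
    print(name, row["number"], f"{row['start']:04X}", f"{row['length']:04X}")
print(f"{absolute_address(table, 'CDATA', 0x0003):04X}")  # 0069
```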


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.
ii. Each control section can be loaded and relocated independently of the others.
iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive: CSECT

CSECT signals the start of a new control section.
The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives: EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Object code generation changes only for instructions that refer to EXTREF symbols.

The object program needs six types of records: Header, Define, Refer, Text, Modification, and End.

Figure: Object Program for Control Section and Program Linking


Pass-1

Assign addresses to all the statements.
Save the addresses assigned to all labels for use in Pass-2.
Perform some processing of assembler directives, such as RESW and RESB, to find the length of data areas for assigning the address values.
Define the symbols in the symbol table (generate the symbol table).

Pass-2

Assemble the instructions (translating operation codes and looking up addresses).
Generate data values defined by BYTE, WORD, etc.
Perform the processing of the assembler directives not done during Pass-1.
Write the object program and assembly listing.

2.1.1.2 Assembler Design

The most important things to concentrate on are the generation of the symbol table and the resolution of forward references.

• Symbol Table
  - This is created during Pass 1.
  - All the labels of the instructions are symbols.
  - The table has an entry for each symbol name and its address value.

• Forward reference
  - A reference to a symbol that is defined later in the program is called a forward reference.
  - No address value is available in the symbol table for such a symbol at the point where Pass 1 first meets it.

2.1.1.3 Functions of the two passes of assembler

Pass 1 (Define symbols)

1. Assign addresses to all statements in the program.
2. Save the addresses assigned to all labels for use in Pass 2.
3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object programs)

1. Assemble instructions (translating operation codes and looking up addresses).
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done in Pass 1.
4. Write the object program and the assembly listing.

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures:

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents.
2. Symbol Table (SYMTAB): used to store values (addresses) assigned to labels.

In addition, it uses a variable called the location counter (LOCCTR).

Location Counter (LOCCTR)

It is a variable used to help in the assignment of addresses.
It is initialized to the beginning address specified in the START statement.
After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.
Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

Contains the mnemonic operation code and its machine language equivalent.
Also contains information about instruction format and length.
In Pass 1, OPTAB is used to look up and validate operation codes in the source program.
In Pass 2, it is used to translate the operation codes to machine language.
During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction and any peculiarities of the object code instruction.
OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.
In most cases OPTAB is a static table, i.e., entries are not added to or deleted from it.
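A minimal sketch of OPTAB as a static hash table is given below, using a Python dictionary keyed by mnemonic. The opcode values are the standard SIC and SIC/XE ones, but the (opcode, format) tuple layout and the name lookup_opcode are assumptions made only for illustration.

```python
# OPTAB sketch: a static hash table keyed by the mnemonic operation code.
# The (opcode, format) layout and the helper name are illustrative assumptions.
OPTAB = {
    "LDA": (0x00, 3),
    "STA": (0x0C, 3),
    "STL": (0x14, 3),
    "COMP": (0x28, 3),
    "JSUB": (0x48, 3),
    "RSUB": (0x4C, 3),
    "CLEAR": (0xB4, 2),
    "TIXR": (0xB8, 2),
}

def lookup_opcode(mnemonic):
    """Return (opcode, format) for a mnemonic, or None if it is not in OPTAB."""
    return OPTAB.get(mnemonic.upper())
```

Pass 1 would call lookup_opcode only to validate the mnemonic and find the instruction length, while Pass 2 would use the opcode value itself.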

Symbol Table (SYMTAB)

Includes the name and value (address) for each label in the source program, and flags to indicate error conditions (for example, a symbol defined in two different places).
During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.
During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address, error indicators, etc. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations that may be performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

begin
  read first input line
  if OPCODE = 'START' then
    begin
      save [OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end (if START)
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end (if symbol)
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * [OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add [OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end (if BYTE)
          else
            set error flag (invalid operation code)
        end (if not a comment)
      write line to intermediate file
      read next input line
    end (while not END)
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end (Pass 1)

Explanation - PASS I Algorithm

The algorithm scans the first statement, START, and saves the operand field (the address) as the starting address of the program. It initializes the LOCCTR value to this address. This line is written to the intermediate file.
If there is no START statement, the LOCCTR is initialized to zero.
If a label is encountered, SYMTAB is searched for it. If the symbol is already present, an error flag is set to indicate a duplicate symbol; otherwise the symbol is entered in the symbol table along with its associated address value (the current LOCCTR).
It next searches OPTAB for the mnemonic code. If it is found, the length of the instruction is added to the LOCCTR to make it point to the next instruction.
If the opcode is the directive WORD, it adds 3 to the LOCCTR.
If it is RESW, it adds 3 times the number of data words to the LOCCTR.
If it is BYTE, it adds the length of the constant in bytes to the LOCCTR; if it is RESB, it adds the number of bytes reserved.
If it is the END directive, the end of the program has been reached. The length of the program is found by evaluating (current LOCCTR - starting address), where the starting address is the one saved from the START statement.
Each processed line is written to the intermediate file.
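The Pass 1 steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the input is taken to be a list of (label, opcode, operand) tuples with comment lines already removed, START is assumed to appear only as the first statement, OPTAB is reduced to a handful of SIC mnemonics, and the names pass1 and constant_length are hypothetical.

```python
# Pass 1 sketch for the basic SIC assembler (every SIC instruction is 3 bytes).
# OPTAB is reduced to a few mnemonics; names and input layout are assumptions.
OPTAB = {"STL", "LDA", "STA", "JSUB", "RSUB", "COMP", "J", "JEQ"}

def constant_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return len(body) if operand.startswith("C") else (len(body) + 1) // 2

def pass1(source):
    """source: list of (label, opcode, operand) tuples, comments removed."""
    symtab, errors, intermediate = {}, [], []
    start = locctr = 0
    for label, opcode, operand in source:
        if opcode == "START":
            start = locctr = int(operand, 16)  # START operand is hexadecimal
            intermediate.append((locctr, label, opcode, operand))
            continue
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol: {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += constant_length(operand)
        else:
            errors.append(f"invalid operation code: {opcode}")
    # Program length is the final LOCCTR minus the saved starting address.
    return symtab, locctr - start, intermediate, errors
```

The returned intermediate list plays the role of the intermediate file: each entry carries the address assigned to the statement, ready to be consumed by Pass 2.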

The Algorithm for Pass 2

begin
  read 1st input line from intermediate file
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end (if START)
  write Header record to object program
  initialize 1st Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end (if symbol)
              else
                store 0 as operand address
              assemble the object code instruction
            end (if opcode found)
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code doesn't fit into current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end (if not comment)
      write listing line
      read next input line
    end (while not END)
  write last Text record to object program
  write End record to object program
  write last listing line
end (Pass 2)

Explanation - PASS II Algorithm

Here the first input line is read from the intermediate file.
If the opcode is START, this line is written directly to the listing file.
A Header record is written to the object program; it gives the starting address and the length of the program (which was calculated during Pass 1).
Then the first Text record is initialized. Comment lines are ignored.
For each instruction, OPTAB is searched to find the machine code of the opcode.
If there is a symbol in the operand field, the symbol table is searched to get its address value, which is combined with the machine code of the opcode.
If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to indicate an undefined symbol.
If there is no symbol in the operand field, 0 is stored as the operand address and the object code instruction is assembled.
If the opcode is BYTE or WORD, the constant value is converted to its equivalent object code (for example, for the character constant EOF its equivalent hexadecimal value '454F46' is stored).
If the object code cannot fit into the current Text record, that record is written to the object program and a new Text record is started to hold the object code of the following instructions.
The Text records are written to the object program. Once the whole program is assembled and the END directive is encountered, the End record is written.
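The core of Pass 2, assembling one instruction and converting a constant, can be sketched as below. The sketch assumes the basic SIC instruction layout (8-bit opcode, 1-bit index flag, 15-bit address); the function names are hypothetical, the small OPTAB holds standard SIC opcodes, and Header, Text and End record writing is left out.

```python
# Pass 2 sketch for basic SIC: opcode (8 bits), x flag (1 bit), address (15 bits).
# Function names are illustrative; record writing is omitted.
OPTAB = {"STL": 0x14, "LDA": 0x00, "LDCH": 0x50, "RSUB": 0x4C}

def assemble_instruction(opcode, operand, symtab):
    """Return (object_code, error) for one SIC instruction."""
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    error = None
    if not symbol:
        address = 0                       # no operand, e.g. RSUB
    elif symbol in symtab:
        address = symtab[symbol]
    else:
        address = 0                       # store 0 and flag the error
        error = f"undefined symbol: {symbol}"
    word = (OPTAB[opcode] << 16) | (0x8000 if indexed else 0) | address
    return f"{word:06X}", error

def assemble_constant(opcode, operand):
    """Object code for BYTE C'...', BYTE X'...' or WORD n."""
    if opcode == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"
    body = operand[2:-1]
    if operand.startswith("C"):
        return body.encode("ascii").hex().upper()
    return body.upper()
```

With this sketch, assemble_constant("BYTE", "C'EOF'") yields 454F46, the value quoted in the explanation above.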

Design and Implementation Issues

Some of the features in the program depend on the architecture of the machine. If the program is for the SIC machine, we have only limited instruction formats and hence limited addressing modes. We have only single-operand instructions, and the operand is always a memory reference. Anything to be fetched from memory requires more time. Hence the improved version, the SIC/XE machine, provides more instruction formats and hence more addressing modes. The moment we change the machine architecture, the number of instruction formats and addressing modes available changes. Therefore the design usually requires considering two things: machine-dependent features and machine-independent features.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes
Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered.

200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X          CLEAR LOOP COUNTER
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT     TEST OUTPUT DEVICE
220           JEQ     WLOOP      LOOP UNTIL READY
225           LDCH    BUFFER,X   GET CHARACTER FROM BUFFER
230           WD      OUTPUT     WRITE CHARACTER
235           TIXR    T          LOOP UNTIL ALL CHARACTERS
240           JLT     WLOOP      HAVE BEEN WRITTEN
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program of a SIC/XE

The above figure 2.4 shows the example program from figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).
Immediate operands are denoted with the prefix # (lines 25, 55, 133).
Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode.
The assembler directive BASE (line 13) is used in conjunction with base relative addressing.
The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.
Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.
Immediate and indirect addressing have also been used as much as possible.
Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.
With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.
The use of indirect addressing often avoids the need for another instruction.

Line  Label   Opcode  Operand
5     COPY    START   0
10    FIRST   STL     RETADR
12            LDB     #LENGTH
13            BASE    LENGTH
15    CLOOP   +JSUB   RDREC
20            LDA     LENGTH
25            COMP    #0
30            JEQ     ENDFIL
35            +JSUB   WRREC
40            J       CLOOP
45    ENDFIL  LDA     EOF
50            STA     BUFFER
55            LDA     #3
60            STA     LENGTH
65            +JSUB   WRREC
70            J       @RETADR
80    EOF     BYTE    C'EOF'
95    RETADR  RESW    1
100   LENGTH  RESW    1
105   BUFFER  RESB    4096
110   .
115   .       SUBROUTINE TO READ RECORD INTO BUFFER
120   .
125   RDREC   CLEAR   X
130           CLEAR   A
132           CLEAR   S
133           +LDT    #4096
135   RLOOP   TD      INPUT
140           JEQ     RLOOP
145           RD      INPUT
150           COMPR   A,S
155           JEQ     EXIT
160           STCH    BUFFER,X
165           TIXR    T
170           JLT     RLOOP
175   EXIT    STX     LENGTH
180           RSUB
185   INPUT   BYTE    X'F1'
195   .
200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT
220           JEQ     WLOOP
225           LDCH    BUFFER,X
230           WD      OUTPUT
235           TIXR    T
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.5 Program from figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes (32 KB). Accordingly it supports only one instruction format. Address calculation involves only the address in the instruction and, optionally, the index register (X), so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

1-byte instruction
2-byte instruction
3-byte instruction
4-byte instruction

Instructions can be:

Instructions involving register to register
Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
Extended instruction format

Addressing modes are:

PC-relative or base-relative addressing: op m
Indirect addressing: op @m
Immediate addressing: op #c
Extended format: +op m
Index addressing: op m,X
Register-to-register instructions
Larger memory -> multiprogramming (program allocation)

Translation

1. Translation for instructions involving the register-register addressing mode

During Pass 1 the registers can be entered as part of the symbol table itself. The value for each register is its equivalent numeric code. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.
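As a small illustration of the register table idea, the sketch below assembles two-byte register-to-register (format 2) instructions. The register numbers and opcodes are the standard SIC/XE values; the table names and the function assemble_format2 are assumptions made for this example.

```python
# Format 2 sketch: 8-bit opcode followed by two 4-bit register numbers.
# Table and function names are illustrative assumptions.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}
FORMAT2_OPTAB = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8}

def assemble_format2(mnemonic, operands):
    """Assemble e.g. ('COMPR', 'A,S'); a missing second register is coded as 0."""
    regs = [r.strip() for r in operands.split(",")]
    r1 = REGISTERS[regs[0]]
    r2 = REGISTERS[regs[1]] if len(regs) > 1 else 0
    return f"{FORMAT2_OPTAB[mnemonic]:02X}{r1:X}{r2:X}"
```

For TIXR T this gives B850, which agrees with the object code shown for line 165 in the base-relative example of this section.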

2. Translation involving register-memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For the formats and addressing modes, refer to chapter 1.

Among the instruction formats, format-3 and format-4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways of relative addressing:

1) Program-counter relative and
2) Base-relative

A memory operand can be specified using either the format-3 or the format-4 instruction format.

In format-3, the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format-4, the instruction contains the opcode followed by a 20-bit address in the address field.

1. Program-Counter Relative

a) This mode usually uses the format-3 instruction format.
b) The instruction contains the opcode followed by a 12-bit displacement value.
c) The displacement ranges from -2048 to +2047. This displacement (which must be small enough to fit in a 12-bit field) is added to the current contents of the program counter to get the target address of the operand required by the instruction.
d) The address of the operand is thus calculated relative to the program counter. The displacement is relative to the current program counter value, which is the address of the next instruction.
e) The following example shows how the address is calculated.
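A worked sketch of the PC-relative calculation is given below, using values that appear elsewhere in these notes: LDA LENGTH at location 000A with LENGTH at 0033 (object code 032026), and LDB #LENGTH at location 0003 (object code 69202D). The function names and the bit-packing helper are assumptions for illustration; the nixbpe flag layout is the standard SIC/XE format-3 layout.

```python
# PC-relative sketch for format 3: disp = TA - (PC), where PC is the address
# of the next instruction. Function names are illustrative assumptions.
def pc_relative_disp(target, instr_loc, instr_len=3):
    pc = instr_loc + instr_len
    disp = target - pc
    if not -2048 <= disp <= 2047:
        raise ValueError("displacement does not fit in 12 bits")
    return disp & 0xFFF                    # 12-bit two's complement

def format3(opcode, n, i, x, b, p, disp):
    """Pack opcode (6 bits), n, i, x, b, p, e=0 and a 12-bit displacement."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1)  # e = 0 for format 3
    return f"{first_byte:02X}{flags:X}{disp:03X}"
```

For line 20, (PC) = 000A + 3 = 000D, so disp = 0033 - 000D = 026 and the assembled instruction is 032026.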

2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register. The target address is therefore

TA = (B) + displacement

b) This addressing mode is used when the displacement is too large for PC-relative addressing; the operand address is then not relative to the instruction, as it is in PC-relative mode. The programmer tells the assembler what the base register contains by using the directive BASE. From the point where the assembler encounters this directive, it can use base-relative addressing to calculate the displacement to an operand.

c) The NOBASE directive indicates that the base register can no longer be used to calculate the target address of the operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB #LENGTH (instruction)
BASE LENGTH (directive)
NOBASE

For example:

Line  Loc   Label   Opcode  Operand    Object code
12    0003          LDB     #LENGTH    69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X   57C003
165   1051          TIXR    T          B850

In the above example, the directive BASE tells the assembler that base-relative addressing may be used to calculate target addresses when PC-relative addressing is not possible. The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160:

BUFFER is at location 0036 (hexadecimal)
(B) = 0033 (hexadecimal)
disp = 0036 - 0033 = 0003 (hexadecimal)

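Line  Loc   Label   Opcode  Operand    Object code
20    000A          LDA     LENGTH     032026
175   1056  EXIT    STX     LENGTH     134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA (that is, -4134 in decimal)

This value is outside the range -2048 to +2047, so PC-relative addressing is not applicable and the assembler uses base-relative addressing instead: disp = TA - (B) = 0033 - 0033 = 0000, which gives the object code 134000.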
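The choice between the two relative modes can be sketched as follows: PC-relative is tried first and base-relative is used only when the PC-relative displacement does not fit. The function name and the returned (mode, disp) pair are assumptions for illustration; the addresses in the usage note are those of lines 20, 160 and 175.

```python
# Sketch of displacement selection for a format-3 instruction.
# Returns ("pc", disp) or ("base", disp); names are illustrative assumptions.
def choose_displacement(target, instr_loc, base=None):
    pc_disp = target - (instr_loc + 3)      # PC holds the next instruction address
    if -2048 <= pc_disp <= 2047:
        return "pc", pc_disp & 0xFFF         # signed 12-bit displacement
    if base is not None and 0 <= target - base <= 4095:
        return "base", target - base         # unsigned 12-bit displacement
    raise ValueError("operand not reachable with a 12-bit displacement")
```

With (B) = 0033, line 175 (STX LENGTH at 1056) falls back to base-relative with displacement 0, and line 160 (STCH BUFFER,X at 104E) uses base-relative with displacement 3, while line 20 (LDA LENGTH at 000A) stays PC-relative with displacement 026.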

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the value of the symbol (its address) is the operand, and the instruction is assembled as immediate with PC-relative mode, as shown in the example below.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode.

For example:

The instruction transfers control to the address stored at location RETADR, which in turn holds the address of the operand. If the address of RETADR is 0030, the displacement is then 0003, as calculated above.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.
The system must be able to load programs into memory wherever there is room.
The exact starting address of the program is not known until load time.

Absolute Address

A program with its starting address specified at assembly time.
The addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the amount by which the program is displaced. Hence we need to make some changes in the address portion of the instruction so that we can load and execute the program at location 3000.

Some instructions undergo a change in their operand address value as the program load address changes. Other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes in the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that has the information necessary to perform this kind of modification is called a relocatable program.

Example: Program Relocation

1) The above diagram shows the concept of relocation.
a) Initially the program is loaded at location 0000.
b) The instruction JSUB is loaded at location 0006.
c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.
2) The second figure shows that if the program is loaded at the new location 5000, the address in the JSUB instruction is modified to 6036.
3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified because their operands are:
o Not a memory address (immediate addressing)
o PC-relative or base-relative
From the object program alone it is not possible to distinguish an address from a constant.
o The assembler must keep some information to tell the loader.
o The object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).
Produce a Modification record to store the starting location and the length of the address field to be modified.
The command for the loader must also be a part of the object program.

Modification record

One Modification record is written for each address to be modified.
The length is stored in half-bytes (4 bits).
The starting location is the location of the byte containing the leftmost bits of the address field to be modified.
If the field contains an odd number of half-bytes, it is assumed to begin in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for those instructions that need to change if relocation occurs. M00000705 is the Modification record for the address field that begins at location 000007 (inside the instruction at location 0006) and is 5 half-bytes long.
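The sketch below shows, under stated assumptions, how such a Modification record could be formed and what the loader's adjustment amounts to for a format-4 instruction. The record layout used (M, a 6-digit starting location, a 2-digit length in half-bytes) is read off the example M00000705; the function names are hypothetical, and the instruction value 4B101036 is inferred from the address field 01036 and the relocated value 4B108456 quoted in the relocation example.

```python
# Relocation sketch: build a Modification record and apply a load address
# to the 20-bit address field of a format-4 instruction.
# Function names are illustrative assumptions.
def modification_record(start_location, length_half_bytes):
    return f"M{start_location:06X}{length_half_bytes:02X}"

def relocate_format4(instruction_hex, load_address):
    word = int(instruction_hex, 16)
    address = word & 0xFFFFF                       # low 20 bits: address field
    new_address = (address + load_address) & 0xFFFFF
    return f"{(word & ~0xFFFFF) | new_address:08X}"  # opcode and flags unchanged
```

Here modification_record(0x0007, 5) gives M00000705, and relocating 4B101036 by 5000 or 7420 gives 4B106036 or 4B108456 respectively, matching the addresses 6036 and 8456 in the relocation example.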

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

Line  Loc   Label   Opcode  Operand    Object code
45    001A  ENDFIL  LDA     =C'EOF'    032010
      002D          =C'EOF'            454F46
215   1062  WLOOP   TD      =X'05'     E32011
      1076          =X'05'             05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.
With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93          LTORG
      002D  =C'EOF'   454F46

The assembler directive LTORG tells the assembler to generate a literal pool at that point.

Literal for the current value of the location counter

The value of the location counter can be denoted by a literal operand:

• BASE *
• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.
The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'
• 125 LDA =C'EOF'
• 160 LDA =X'454F46' (literal operands with different literal names may have the same literal values)

Recognizing literal operands that have different literal names but the same literal values complicates the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).
LITTAB contains the literal name, the operand value and length, and the address assigned to the literal.
LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address being assigned.
• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal not yet placed in a pool.
• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.
• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.
• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
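The Pass 1 handling of literals can be sketched as below. The class name LitTab, its method names, and the dictionary layout are assumptions made for illustration; duplicates are recognized by literal name only, as the simplest approach described above.

```python
# LITTAB sketch: literals are recorded once during Pass 1 and given
# addresses when LTORG (or END) is reached. Names are illustrative.
def literal_bytes(name):
    """Convert a literal such as =C'EOF' or =X'05' to its byte value."""
    kind, body = name[1], name[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)

class LitTab:
    def __init__(self):
        self.entries = {}   # literal name -> value, length, address

    def reference(self, name):
        # Pass 1: add the literal only if it is not already in the table.
        if name not in self.entries:
            value = literal_bytes(name)
            self.entries[name] = {"value": value, "length": len(value), "address": None}

    def place_pool(self, locctr):
        # LTORG or END: assign addresses to literals that have none yet
        # and return the updated location counter.
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr
```

Calling place_pool with LOCCTR = 002D after =C'EOF' has been referenced assigns it address 002D and advances LOCCTR to 0030, in line with the LTORG listing above.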

Symbol-defining statements

Assembler directives:

EQU
ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values:

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the value specified by "value".

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. For example,

+LDT #4096

can be written as

MAXLEN EQU 4096
+LDT #MAXLEN

Define mnemonic names for registers:

A EQU 0
X EQU 1
L EQU 2

so that RMO 0,1 can be written as RMO A,X.

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1
ACCUMULATOR EQU R2
INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols:

ORG value

"value" is a constant or an expression involving constants and previously defined symbols.
When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB           SYMBOL    VALUE     FLAGS
(100 entries)  6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of the new symbol must have been defined previously in the program. Hence the following sequences could not be processed by an ordinary two-pass assembler.

Disallowed:

BETA    EQU   ALPHA
ALPHA   RESW  1

Disallowed:

ALPHA   EQU   BETA
BETA    EQU   DELTA
DELTA   RESW  1

Disallowed:

        ORG   ALPHA
BYTE1   RESB  1
BYTE2   RESB  1
BYTE3   RESB  1
        ORG
ALPHA   RESB  1

By contrast, the following sequence is allowed:

ALPHA   RESW  1
BETA    EQU   ALPHA
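The restriction on EQU can be illustrated with a small Pass 1 sketch that rejects an EQU whose operand has not been defined earlier in the program. Only EQU with a single constant or symbol, RESW and RESB are handled; the function name and the error message format are assumptions made for this example.

```python
# Sketch of the Pass 1 rule for EQU: the operand must already be in SYMTAB.
# Only EQU (constant or single symbol), RESW and RESB are handled here.
def pass1_define(statements):
    """statements: list of (label, opcode, operand) tuples."""
    symtab, errors, locctr = {}, [], 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand.isdigit():
                symtab[label] = int(operand)
            elif operand in symtab:
                symtab[label] = symtab[operand]
            else:
                errors.append(f"{label}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            symtab[label] = locctr
            locctr += int(operand)
    return symtab, errors
```

The sequence ALPHA RESW 1 followed by BETA EQU ALPHA passes without errors, whereas reversing the two statements reports that ALPHA is not yet defined.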

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be
  • constants,
  • user-defined symbols, or
  • special terms.
• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of program location.
Relative terms - The value of a relative term is dependent on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as:

1. Absolute expressions

• The value of an absolute expression is independent of the program location.
• An absolute expression may contain relative terms provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.
• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
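The rules for determining the type of an expression can be sketched as a small classifier over the symbol types in the table above. The sketch handles only flat expressions without parentheses, treats undefined symbols and constants as absolute, and approximates the pairing rule by a net count of signed relative terms; the function name classify is hypothetical.

```python
import re

# Symbol types and values from the table above (A = absolute, R = relative).
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}

def classify(expression):
    """Return 'absolute', 'relative' or 'error' for a flat expression."""
    tokens = re.findall(r"[A-Za-z]\w*|\d+|[-+*/]", expression)
    net_relative, sign = 0, 1
    for pos, tok in enumerate(tokens):
        if tok in ("+", "-"):
            sign = 1 if tok == "+" else -1
        elif tok in ("*", "/"):
            continue
        elif tok in SYMTAB and SYMTAB[tok][0] == "R":
            neighbours = tokens[max(pos - 1, 0):pos] + tokens[pos + 1:pos + 2]
            if "*" in neighbours or "/" in neighbours:
                return "error"   # relative term in multiplication or division
            net_relative += sign
    if net_relative == 0:
        return "absolute"        # all relative terms cancel in pairs
    if net_relative == 1:
        return "relative"        # one unpaired term with a positive sign
    return "error"
```

With this sketch BUFEND-BUFFER is classified as absolute and RETADR as relative, while BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are all reported as errors.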

Program blocks

The program is divided into three blocks: Default (0), CDATA (1), and CBLKS (2).

Default (0): block for the executable (mnemonic) instructions
CDATA (1): block for data areas that need little memory (a few words or less)
CBLKS (2): block for data areas that need larger blocks of memory

One more column, the block number, is then included in the listing when object code is generated. Whenever a new block begins, the USE directive is written with the block name, and the LOCCTR of a block that is begun for the first time is set to 0000. The block number field is also added to SYMTAB.

The program block table consists of the block name, block number, starting address, and length.

After the location counter values are found, the object code is generated and the object program is written.

LALITHAMBIGAIB APIT Page 39

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Program Block Diagram

LALITHAMBIGAIB APIT Page 40

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Control section and program linkingControl section

i A control section is a part of the program that maintains its identity after assembly

ii Each control section can be loaded and relocated independently of the others

iii Different control sections are most often used for subroutines or other logical

subdivisions of a program

Assembler directive CSECT

CSECT signal the start of a new control section

The assembler establishes a separate location counter (initialized as 0) for each control section

Assembler directives EXTDEF EXTREF

LALITHAMBIGAIB APIT Page 41

SYSTEM SOFTWARE ( CS 2304) UNIT - IIEXTDEF external definition

The EXTDEF statement in a control section names symbols called external symbols that are

defined in this control section and may be used by other sections Control section names are

automatically considered to be external symbols

EXTREF external reference

The EXTREF statement names symbols that are used in this control section and are defined

elsewhere

LALITHAMBIGAIB APIT Page 42

SYSTEM SOFTWARE ( CS 2304) UNIT - II

The object code generation for EXTREF is only changed

For object program there are 6 records needed They are Header Record Define

Record Refer Record Text Record Modification Record and End Record

LALITHAMBIGAIB APIT Page 43

SYSTEM SOFTWARE ( CS 2304) UNIT - II

LALITHAMBIGAIB APIT Page 44

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Object Program for Control Section and Program Linking

LALITHAMBIGAIB APIT Page 45

Page 12: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

Pass 2 (assemble instructions and generate the object program)

1. Assemble instructions (translating operation codes and looking up addresses).

2. Generate data values defined by BYTE, WORD, etc.

3. Perform processing of assembler directives not done in Pass 1.

4. Write the object program and the assembly listing.

2.1.2 Assembler Algorithm and Data Structures

The assembler uses two major internal data structures, together with a location counter variable:

1. Operation Code Table (OPTAB): used to look up mnemonic operation codes and translate them into their machine language equivalents.

2. Symbol Table (SYMTAB): used to store the values (addresses) assigned to labels.

3. Location Counter (LOCCTR): a variable used to help in the assignment of addresses.

Location Counter (LOCCTR)

• It is initialized to the beginning address specified in the START statement.

• After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.

• Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB)

• Contains each mnemonic operation code and its machine language equivalent.

• Also contains information about instruction format and length.

• In Pass 1, OPTAB is used to look up and validate operation codes in the source program.

• In Pass 2, it is used to translate the operation codes to machine language.

• During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction and any peculiarities of the object code instruction.

• OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.

• In most cases OPTAB is a static table, i.e., entries are not added to or deleted from it.
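As a rough illustration of the hash-table organization, the following Python sketch keeps OPTAB in a read-only dictionary keyed by mnemonic. The table holds only a small illustrative subset of operations, and the lookup helper and its treatment of the + prefix are assumptions made for this sketch, not something specified in these notes.

```python
from types import MappingProxyType

# Illustrative subset of OPTAB: mnemonic -> (opcode byte, instruction format).
# MappingProxyType gives a read-only view, mirroring the "static table" idea.
OPTAB = MappingProxyType({
    "LDA": (0x00, 3),
    "STA": (0x0C, 3),
    "JSUB": (0x48, 3),
    "COMPR": (0xA0, 2),
    "RSUB": (0x4C, 3),
})


def lookup(mnemonic):
    """Return (opcode, length in bytes), or None for an invalid operation code."""
    extended = mnemonic.startswith("+")
    entry = OPTAB.get(mnemonic.lstrip("+"))   # hash lookup, average O(1)
    if entry is None:
        return None
    opcode, fmt = entry
    if extended and fmt != 3:                 # only format 3 operations can be extended
        return None
    return opcode, 4 if extended else fmt


print(lookup("+JSUB"), lookup("COMPR"), lookup("FOO"))
```

Pass 1 would use only the length returned by such a lookup, while Pass 2 would also use the opcode.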

Symbol Table (SYMTAB)

• Includes the name and value (address) for each label in the source program, together with flags to indicate error conditions (e.g., a symbol defined in two different places).

• During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.

• During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE <> 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 {instruction length} to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end {Pass 1}

Explanation of the Pass 1 algorithm

• The algorithm reads the first statement. If it is START, the operand field (the address) is saved as the starting address of the program, LOCCTR is initialized to this address, and the line is written to the intermediate file.

• If there is no START statement, LOCCTR is initialized to zero.

• If a label is encountered, the symbol is entered in the symbol table along with the current LOCCTR value as its address. If the symbol is already in the table, an error flag is set to indicate a duplicate symbol.

• The algorithm next searches OPTAB for the mnemonic code. If it is found, the length of the instruction is added to LOCCTR so that LOCCTR points to the next instruction.

• If the opcode is the directive WORD, 3 is added to LOCCTR.

• If it is RESW, 3 times the number of words reserved is added to LOCCTR.

• If it is RESB, the number of bytes reserved is added to LOCCTR; if it is BYTE, the length of the constant in bytes is added.

• When the END directive is reached, the program length is computed as the current LOCCTR value minus the starting address saved from the START statement.

• Each processed line is written to the intermediate file.
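The Pass 1 steps described above can be sketched in Python as follows. This is a minimal illustration for the simple SIC case only: the statement representation as (label, opcode, operand) tuples, the reduced OPTAB, the in-memory list standing in for the intermediate file, and the sample program are all assumptions of the sketch.

```python
OPTAB = {"STL", "JSUB", "LDA", "COMP", "JEQ", "J", "STA", "RSUB"}  # illustrative subset


def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else (len(body) + 1) // 2


def pass1(lines):
    """lines: list of (label, opcode, operand) with comment lines already removed.

    Returns (symtab, intermediate, program_length, errors).
    """
    symtab, intermediate, errors = {}, [], []
    start = locctr = 0
    for label, opcode, operand in lines:
        if opcode == "START":
            start = locctr = int(operand, 16)        # operand is a hex address
            intermediate.append((locctr, label, opcode, operand))
            continue
        if opcode == "END":
            intermediate.append((locctr, label, opcode, operand))
            break
        address = locctr                              # address of this statement
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
        intermediate.append((address, label, opcode, operand))
    return symtab, intermediate, locctr - start, errors


program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("", "LDA", "THREE"),
    ("THREE", "WORD", "3"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    ("", "END", "FIRST"),
]
symtab, intermediate, length, errors = pass1(program)
print({name: hex(addr) for name, addr in symtab.items()}, hex(length), errors)
```

Forward references such as RETADR in the first instruction cause no trouble here, because Pass 1 only assigns addresses; operands are resolved in Pass 2.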

The Algorithm for Pass 2

begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE <> 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
end {Pass 2}

Explanation of the Pass 2 algorithm

• The first input line is read from the intermediate file.

• If the opcode is START, the line is written directly to the listing file.

• A Header record is written to the object program; it gives the starting address and the length of the program (calculated during Pass 1).

• The first Text record is then initialized. Comment lines are ignored.

• For each instruction, OPTAB is searched to find the machine code for the opcode.

• If there is a symbol in the operand field, the symbol table is searched for its address, which is combined with the opcode to form the object code.

• If the symbol is not found in SYMTAB, zero is stored as the operand address and an error flag is set to indicate an undefined symbol.

• If there is no symbol in the operand field, zero is stored as the operand address. The object code instruction is then assembled.

• If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, the character constant EOF is stored as its hexadecimal equivalent 454F46).

• If the object code does not fit into the current Text record, that record is written to the object program and a new Text record is started.

• After the END directive is encountered, the last Text record and the End record are written to the object program.
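The core of Pass 2 can be sketched in Python as below, for the standard SIC instruction layout (8-bit opcode, one index bit, 15-bit address). The reduced OPTAB, the sample symbol addresses, the function names, and the simplified Text record packing (consecutive addresses, at most 30 bytes of code per record) are assumptions of this sketch.

```python
OPTAB = {"STL": 0x14, "LDA": 0x00, "LDCH": 0x50, "STCH": 0x54, "RSUB": 0x4C}  # subset


def assemble(opcode, operand, symtab, errors):
    """Return the object code (hex string) for one SIC statement, or '' if none."""
    if opcode in OPTAB:
        address = 0                                   # default when no operand
        if operand:
            symbol, _, index = operand.partition(",")
            if symbol in symtab:
                address = symtab[symbol]
            else:
                errors.append(f"undefined symbol {symbol}")
            if index.strip() == "X":
                address |= 0x8000                     # x bit selects indexed addressing
        return f"{OPTAB[opcode]:02X}{address:04X}"
    if opcode == "BYTE":
        body = operand[2:-1]
        return body.encode("ascii").hex().upper() if operand[0] == "C" else body.upper()
    if opcode == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"
    return ""                                         # RESW, RESB: no object code


def text_records(items, max_bytes=30):
    """Pack (address, object_code) pairs at consecutive addresses into Text records."""
    records, start, buf = [], None, ""
    for address, code in items:
        if buf and (len(buf) + len(code)) // 2 > max_bytes:
            records.append(f"T{start:06X}{len(buf) // 2:02X}{buf}")
            buf = ""
        if not buf:
            start = address
        buf += code
    if buf:
        records.append(f"T{start:06X}{len(buf) // 2:02X}{buf}")
    return records


symtab = {"RETADR": 0x1033, "BUFFER": 0x1039}         # as if produced by Pass 1
errors = []
print(assemble("STL", "RETADR", symtab, errors))      # opcode 14 + address 1033
print(assemble("LDCH", "BUFFER,X", symtab, errors))   # address with the x bit set
print(assemble("BYTE", "C'EOF'", symtab, errors))
print(text_records([(0x1000, "141033"), (0x1003, "4C0000")]))
```

A fuller version would also start a new Text record whenever RESW or RESB leaves a gap in the addresses.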

Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. If the program is for the SIC machine, there is only one instruction format and hence only limited addressing modes. Every instruction has a single operand, and the operand is always a memory reference; anything fetched from memory takes more time. The improved SIC/XE machine provides more instruction formats and hence more addressing modes. When the machine architecture changes, the available instruction formats and addressing modes change with it. The design of an assembler therefore has to consider two kinds of features: machine-dependent features and machine-independent features.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

• Instruction formats and addressing modes

• Program relocation

This section considers the design and implementation of an assembler for the more complex XE version of SIC.

200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X          CLEAR LOOP COUNTER
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT     TEST OUTPUT DEVICE
220           JEQ     WLOOP      LOOP UNTIL READY
225           LDCH    BUFFER,X   GET CHARACTER FROM BUFFER
230           WD      OUTPUT     WRITE CHARACTER
235           TIXR    T          LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program for SIC/XE

Figure 2.4 shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

• Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

• Immediate operands are denoted with the prefix # (lines 25, 55, 133).

• Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.

• The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

• The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

• Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

• Immediate and indirect addressing have also been used as much as possible.

• Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

• With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

• The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label   Opcode  Operand     Object Code
5     0000  COPY    START   0
10    0000  FIRST   STL     RETADR      17202D
12    0003          LDB     #LENGTH     69202D
13                  BASE    LENGTH
15    0006  CLOOP   +JSUB   RDREC       4B101036
20    000A          LDA     LENGTH      032026
25    000D          COMP    #0          290000
30    0010          JEQ     ENDFIL      332007
35    0013          +JSUB   WRREC       4B10105D
40    0017          J       CLOOP       3F2FEC
45    001A  ENDFIL  LDA     EOF         032010
50    001D          STA     BUFFER      0F2016
55    0020          LDA     #3          010003
60    0023          STA     LENGTH      0F200D
65    0026          +JSUB   WRREC       4B10105D
70    002A          J       @RETADR     3E2003
80    002D  EOF     BYTE    C'EOF'      454F46
95    0030  RETADR  RESW    1
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
110         .
115         .       SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125   1036  RDREC   CLEAR   X           B410
130   1038          CLEAR   A           B400
132   103A          CLEAR   S           B440
133   103C          +LDT    #4096       75101000
135   1040  RLOOP   TD      INPUT       E32019
140   1043          JEQ     RLOOP       332FFA
145   1046          RD      INPUT       DB2013
150   1049          COMPR   A,S         A004
155   104B          JEQ     EXIT        332008
160   104E          STCH    BUFFER,X    57C003
165   1051          TIXR    T           B850
170   1053          JLT     RLOOP       3B2FEA
175   1056  EXIT    STX     LENGTH      134000
180   1059          RSUB                4F0000
185   105C  INPUT   BYTE    X'F1'       F1
195         .
200         .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210   105D  WRREC   CLEAR   X           B410
212   105F          LDT     LENGTH      774000
215   1062  WLOOP   TD      OUTPUT      E32011
220   1065          JEQ     WLOOP       332FFA
225   1068          LDCH    BUFFER,X    53C003
230   106B          WD      OUTPUT      DF2008
235   106E          TIXR    T           B850
240   1070          JLT     WLOOP       3B2FEF
245   1073          RSUB                4F0000
250   1076  OUTPUT  BYTE    X'05'       05
255                 END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine, memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly, SIC supports only one instruction format. Its instructions operate on the accumulator (register A), and the only register that can modify an operand address is the index register X. The addressing modes supported by this architecture are therefore direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). SIC/XE supports four different instruction formats:

• 1-byte instruction

• 2-byte instruction

• 3-byte instruction

• 4-byte instruction

Instructions can be:

• Instructions involving register-to-register operations

• Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

• Extended-format instructions

Addressing modes are:

• PC-relative or base-relative addressing: op m

• Indirect addressing: op @m

• Immediate addressing: op #c

• Extended format: +op m

• Indexed addressing: op m,X

Other SIC/XE features that affect the assembler:

• Register-to-register instructions

• Larger memory -> multiprogramming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the register names can be entered in the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.
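A small Python sketch of the separate register table mentioned above is given here. The register numbers follow the usual SIC/XE assignment (A=0, X=1, L=2, B=3, S=4, T=5, F=6); the three-entry operation table and the function name are assumptions of the sketch.

```python
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}
FORMAT2 = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8}   # illustrative subset


def assemble_format2(mnemonic, operand):
    """Assemble a 2-byte register instruction: opcode byte, then r1 and r2 nibbles."""
    regs = [REGISTERS[name.strip()] for name in operand.split(",")]
    r1 = regs[0]
    r2 = regs[1] if len(regs) > 1 else 0      # unused second register is encoded as 0
    return f"{FORMAT2[mnemonic]:02X}{r1:X}{r2:X}"


print(assemble_format2("COMPR", "A,S"))
print(assemble_format2("CLEAR", "X"))
print(assemble_format2("TIXR", "T"))
```

No address calculation is needed for these instructions, which is why they are the simplest SIC/XE case to translate.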

2. Translation of register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions: one operand is always in a register and the other is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways of computing a relative address:

1) Program-counter relative

2) Base relative

In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field, so one of the two relative modes must be used.

In format 4, the instruction contains the opcode followed by a 20-bit address field, which is large enough to hold the full address, so no displacement calculation is needed.

1. Program-counter relative

a) This mode normally uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed value in the range -2048 to +2047, small enough to fit in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter, which already points to the next instruction when the displacement is applied.

e) The displacement is calculated as disp = TA - (PC).
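As a worked illustration of the PC-relative calculation, the Python sketch below computes the 12-bit displacement field for a format 3 instruction. The function name is an assumption; the first and third sample calls use addresses that appear later in these notes (LDA LENGTH at 000A and STX LENGTH at 1056, with LENGTH at 0033), and the backward jump uses made-up addresses.

```python
def pc_relative_disp(target, instr_addr, length=3):
    """Return the 12-bit displacement field, or None if the target is out of range."""
    pc = instr_addr + length            # PC already points to the next instruction
    disp = target - pc
    if -2048 <= disp <= 2047:
        return disp & 0xFFF             # negative values become 12-bit two's complement
    return None


# LDA LENGTH at 000A, LENGTH at 0033: disp = 0033 - 000D
print(f"{pc_relative_disp(0x0033, 0x000A):03X}")
# A backward jump from 0017 to 0006 (illustrative addresses): negative displacement
print(f"{pc_relative_disp(0x0006, 0x0017):03X}")
# STX LENGTH at 1056: too far away for PC-relative addressing
print(pc_relative_disp(0x0033, 0x1056))
```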

2. Base-relative addressing mode

a) In this mode the displacement is taken relative to the contents of the base register. The target address is therefore

TA = (B) + disp

b) This mode is used when the displacement is too large for PC-relative addressing. The displacement is a 12-bit unsigned value in the range 0 to 4095. Because the base register is under the programmer's control, the programmer must tell the assembler what it contains by using the directive BASE. From that point on, the assembler can use base-relative addressing to calculate the target address of an operand.

c) The directive NOBASE indicates that the base register can no longer be used for addressing. The assembler first tries PC-relative addressing; if the displacement does not fit, it tries base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

For example:

Line  Loc   Label   Opcode  Operand    Object Code
12    0003          LDB     #LENGTH    69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X   57C003
165   1051          TIXR    T          B850

In this example, the LDB instruction loads the address of LENGTH (0033) into the base register, and the BASE directive explicitly tells the assembler that the base register holds this value. The assembler can then use base-relative addressing wherever PC-relative addressing does not reach.

For the STCH instruction on line 160:

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 - 0033 = (0003)16

Line  Loc   Label   Opcode  Operand    Object Code
20    000A          LDA     LENGTH     032026
175   1056  EXIT    STX     LENGTH     134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA

This displacement (-1026 in hexadecimal) does not fit in the 12-bit field, so PC-relative addressing is not applicable and the assembler uses base-relative addressing instead: disp = 0033 - 0033 = 000, giving the object code 134000.
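The choice between the two relative modes can be sketched in Python as follows. The function assembles a format 3 instruction from the opcode and the n, i, x, b, p, e flag bits, trying PC-relative first and falling back to base-relative; its name and parameters are assumptions of the sketch, and the sample calls reproduce object codes shown in the example above.

```python
def format3(opcode, target, instr_addr, base=None, indexed=False, n=1, i=1):
    """Assemble a format 3 instruction, trying PC-relative and then base-relative."""
    pc = instr_addr + 3
    disp = target - pc
    if -2048 <= disp <= 2047:
        b, p, disp = 0, 1, disp & 0xFFF               # PC-relative, signed 12 bits
    elif base is not None and 0 <= target - base <= 4095:
        b, p, disp = 1, 0, target - base              # base-relative, unsigned 12 bits
    else:
        raise ValueError("operand out of range; extended format (+op) is required")
    flags = (int(indexed) << 3) | (b << 2) | (p << 1)  # x b p e, with e = 0
    return f"{opcode | (n << 1) | i:02X}{flags:X}{disp:03X}"


BASE = 0x0033   # value announced by BASE LENGTH
print(format3(0x00, 0x0033, 0x000A, base=BASE))                 # LDA LENGTH
print(format3(0x10, 0x0033, 0x1056, base=BASE))                 # STX LENGTH
print(format3(0x54, 0x0036, 0x104E, base=BASE, indexed=True))   # STCH BUFFER,X
```

If neither mode reaches the operand, the programmer has to request the 4-byte extended format explicitly with the + prefix.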

3. Immediate addressing mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the instruction combines immediate addressing with PC-relative addressing: the displacement is computed from the symbol's address, as in LDB #LENGTH (line 12).

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing.

For example, the instruction J @RETADR (line 70) transfers control to the address stored in location RETADR. With RETADR at address 0030 and the instruction at 002A, (PC) = 002D, so the displacement is 0030 - 002D = 003.

2.2.2 Program Relocation

The need for program relocation

• It is desirable to load and run several programs at the same time.

• The system must be able to load programs into memory wherever there is room.

• The exact starting address of the program is not known until load time.

Absolute program

• A program whose starting address is specified at assembly time.

• Its addresses may be invalid if the program is loaded somewhere else.

Example

Consider an instruction that loads register A with the value stored at location 102D, in a program assembled to start at location 1000.

Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the amount the program is displaced. Hence we need to change the address portion of the instruction so that the program can be loaded and executed at location 3000.

Some instructions undergo a change in their operand address as the program load address changes; other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler can identify for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example of program relocation

1) Initially the program is loaded at location 0000.

a) The JSUB instruction is loaded at location 0006.

b) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) If the program is loaded at location 5000, the address in the JSUB instruction must be modified to the new location of RDREC, 6036.

3) Likewise, if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.

• The only parts of the program that require modification at load time are those that specify direct addresses.

• The rest of the instructions need not be modified:

o Operands that are not memory addresses (immediate addressing)

o PC-relative and base-relative operands

• From the object program alone, it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader which is which.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

• For an address label, the address is assigned relative to the start of the program (START 0).

• The assembler produces a Modification record that stores the starting location and the length of the address field to be modified.

• This command for the loader must also be part of the object program.

Modification record

• One Modification record is written for each address to be modified.

• The length is stored in half-bytes (4 bits each).

• The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

• If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable object program

[Figure: relocatable object program, with the address fields that need modification marked]

The records at the end of the object program are the Modification records for the instructions that must change if relocation occurs. For example, M00000705 is the Modification record for the address field that starts at location 000007 and is 5 half-bytes long.
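The following Python sketch shows how a Modification record of this form could be produced and how a loader might apply it. The function names are assumptions; the sample values are the JSUB instruction 4B101036 and the load addresses 5000 and 7420 used in the relocation example in these notes.

```python
def modification_record(field_addr, half_bytes=5):
    """Build a Modification record: M, 6-digit start address, 2-digit length."""
    return f"M{field_addr:06X}{half_bytes:02X}"


def relocate(instruction_hex, half_bytes, load_addr):
    """Add load_addr to the rightmost `half_bytes` hex digits of an instruction."""
    mask = (1 << (4 * half_bytes)) - 1
    value = int(instruction_hex, 16)
    field = ((value & mask) + load_addr) & mask       # relocated address field
    return f"{(value & ~mask) | field:0{len(instruction_hex)}X}"


# +JSUB RDREC is at 0006; its 20-bit address field starts in the byte at 0007.
print(modification_record(0x000007))
print(relocate("4B101036", 5, 0x5000))
print(relocate("4B101036", 5, 0x7420))
```

Only the address field is touched; the opcode and flag bits in the leading digits are preserved.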

Machine-independent assembler features

These are assembler features that are not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

Line  Loc   Label   Opcode  Operand   Object Code
45    001A  ENDFIL  LDA     =C'EOF'   032010
      002D  *       =C'EOF'           454F46
215   1062  WLOOP   TD      =X'05'    E32011
      1076  *       =X'05'            05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

• With immediate addressing, the operand value is assembled as part of the machine instruction.

• With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

• All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93          LTORG
      002D  *       =C'EOF'   454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point.

Literal for the current value of the location counter

• The current value of the location counter is denoted by *, and it can also be used as a literal operand:

BASE *

LDB =*

Handling duplicate literal operands

• The assembler should avoid storing duplicate literals.

• The easiest way to recognize duplicate literals is by comparing the character strings that define them.

100 LDA =C'EOF'
125 LDA =C'EOF'
160 LDA =X'454F46'

• Literal operands with different literal names may have the same literal value, as =C'EOF' and =X'454F46' do. Recognizing such operands would complicate the design of an assembler.

Literal table (LITTAB)

• The basic data structure needed to process literal operands is a literal table (LITTAB).

• LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.

• LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address.

• When a LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered to obtain its address.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, exactly as if they had been generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
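The Pass 1 handling of literals can be sketched in Python as below. The class and method names are assumptions of the sketch, duplicates are detected by comparing the literal strings (the simple approach described above), and the pool addresses 002D and 1076 are the ones shown in the literal examples in these notes.

```python
def literal_length(name):
    """Length in bytes of a literal such as =C'EOF' or =X'05'."""
    body = name[3:-1]                       # strip =C' or =X' and the closing quote
    return len(body) if name[1] == "C" else (len(body) + 1) // 2


class LiteralTable:
    def __init__(self):
        self.entries = {}                   # literal name -> address (None until pooled)

    def reference(self, name):
        """Record a literal operand; a repeated name is not stored twice."""
        self.entries.setdefault(name, None)

    def place_pool(self, locctr):
        """On LTORG or END: assign addresses to unplaced literals, return new LOCCTR."""
        for name, address in self.entries.items():
            if address is None:
                self.entries[name] = locctr
                locctr += literal_length(name)
        return locctr


littab = LiteralTable()
littab.reference("=C'EOF'")
littab.reference("=C'EOF'")                 # duplicate, recognized by its string
after_ltorg = littab.place_pool(0x002D)     # LTORG reached with LOCCTR = 002D
littab.reference("=X'05'")
after_end = littab.place_pool(0x1076)       # end of program with LOCCTR = 1076
print(littab.entries, hex(after_ltorg), hex(after_end))
```

Literals placed by an earlier LTORG keep their addresses, so only the literals referenced since then land in the final pool.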

Symbol-defining statements

Assembler directives

• EQU

• ORG

Assembler directive EQU

• Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

• When the assembler encounters an EQU statement, it enters "symbol" into SYMTAB with the specified value.

Uses of EQU

• Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

+LDT #4096

we can write

MAXLEN EQU 4096

+LDT #MAXLEN

• Define mnemonic names for registers:

A EQU 0

X EQU 1

L EQU 2

so that RMO A,X can be written in place of RMO 0,1.

• Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

• The assembler directive ORG is usually used to indirectly assign values to symbols.

ORG value

• "value" is a constant or an expression involving constants and previously defined symbols.

• When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Using ORG for label definition

Suppose that we want to define a symbol table with the following structure:

STAB (100 entries)   SYMBOL    VALUE     FLAGS
                     6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.
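The effect of ORG on label definition can be imitated with a short Python sketch. It mimics resetting LOCCTR to the start of the table and then reserving each field in turn, so that every field label receives the address of that field in the first entry. The function name, the table address 1000 (hex), and the use of the field sizes listed above are assumptions of the sketch.

```python
def define_fields(base, fields):
    """Mimic 'ORG base' followed by one storage reservation per field."""
    symtab, locctr = {}, base          # ORG base: LOCCTR is reset to the table start
    for name, size in fields:
        symtab[name] = locctr          # the label takes the current LOCCTR value
        locctr += size                 # the reservation advances LOCCTR
    return symtab, locctr - base       # field addresses and the size of one entry


STAB = 0x1000                          # hypothetical address of the table
symbols, entry_size = define_fields(STAB, [("SYMBOL", 6), ("VALUE", 3), ("FLAGS", 1)])
print({name: hex(addr) for name, addr in symbols.items()}, entry_size)
```

With these labels, a field of a later entry can be reached by putting the entry offset in the index register, for example with an operand such as VALUE,X.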

Restrictions on EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program.

Allowed:

ALPHA RESW 1
BETA  EQU  ALPHA

Not allowed (BETA refers to ALPHA before ALPHA is defined):

BETA  EQU  ALPHA
ALPHA RESW 1

Not allowed (ALPHA cannot be evaluated in Pass 1, because BETA and DELTA are defined later):

ALPHA EQU  BETA
BETA  EQU  DELTA
DELTA RESW 1

Not allowed (ORG refers to ALPHA before ALPHA is defined):

      ORG  ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
      ORG
ALPHA RESB 1

Expressions

• Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

• Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be

o constants,

o user-defined symbols, or

o special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

• Absolute terms: the value of an absolute term is independent of the program location.

• Relative terms: the value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g., MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above; the remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative nor absolute expressions.
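The pairing rule can be checked mechanically by counting relative terms with their signs: a net count of 0 gives an absolute expression, a net count of +1 gives a relative expression, and anything else is an error. The Python sketch below does this for expressions built only from + and -; the function name, the (type, value) layout of the symbol table, and the decision to reject * and / outright are simplifying assumptions.

```python
import re

EXPRESSION = re.compile(r"\s*[+-]?\s*\w+(\s*[+-]\s*\w+)*\s*")


def expression_type(expr, symtab):
    """Return ('A' or 'R', value) for an expression using only + and -."""
    if not EXPRESSION.fullmatch(expr):
        raise ValueError("only + and - are handled in this sketch")
    relative_count, value = 0, 0
    for sign, term in re.findall(r"([+-]?)\s*(\w+)", expr):
        factor = -1 if sign == "-" else 1
        if term.isdigit():
            value += factor * int(term)           # decimal constant: absolute term
        else:
            kind, symbol_value = symtab[term]
            value += factor * symbol_value
            if kind == "R":
                relative_count += factor          # signed count of relative terms
    if relative_count == 0:
        return "A", value
    if relative_count == 1:
        return "R", value
    raise ValueError(f"{expr} is neither absolute nor relative")


SYMTAB = {"RETADR": ("R", 0x0030), "BUFFER": ("R", 0x0036), "BUFEND": ("R", 0x1036)}
kind, value = expression_type("BUFEND-BUFFER", SYMTAB)
print(kind, f"{value:04X}")                        # the value entered for MAXLEN
print(expression_type("BUFFER+3", SYMTAB)[0])
```

Calling the function with BUFEND+BUFFER or 100-BUFFER raises an error, matching the classification above.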

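Program blocks

The example program is divided into three blocks: the default block (block number 0), CDATA (1), and CBLKS (2).

• Default (0): the executable instructions of the program

• CDATA (1): data areas that are a few words or less in length

• CBLKS (2): data areas that consist of larger blocks of memory

The assembler directive USE followed by a block name indicates which portions of the source program belong to which block. The assembler keeps a separate location counter for each block: the counter is set to 0000 when the block is first begun, saved when the assembler switches to another block, and restored when the block is resumed.

When object code is generated, one more column, the block number, is included in the listing. This block number field is also added to SYMTAB, so that each label is stored with its address relative to the start of its block.

The program block table holds, for each block, the block name, block number, starting address, and length.

After the location counter values have been found, the assembler generates the object code and writes the object program.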
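The program block table and its use can be sketched in Python as follows. Block starting addresses are obtained by laying the blocks out one after another, and a symbol's final address is its block's starting address plus its offset within the block. The block lengths, the symbol entries, and the function names are illustrative assumptions, not values taken from these notes.

```python
def block_table(lengths):
    """lengths: {block name: length}, in block-number order.

    Returns {name: (block number, starting address, length)}.
    """
    table, start = {}, 0
    for number, (name, length) in enumerate(lengths.items()):
        table[name] = (number, start, length)
        start += length                     # the next block begins where this one ends
    return table


def absolute_address(symbol, symtab, table):
    """Address relative to the program start: block start + offset within the block."""
    block, offset = symtab[symbol]
    return table[block][1] + offset


# Illustrative block lengths, as they would be known at the end of Pass 1.
table = block_table({"DEFAULT": 0x66, "CDATA": 0x0B, "CBLKS": 0x1000})
symtab = {"LENGTH": ("CDATA", 0x0003), "BUFFER": ("CBLKS", 0x0000)}
print(table)
print(hex(absolute_address("LENGTH", symtab, table)))
```

Pass 2 would use such addresses when computing displacements, so the loader never needs to know that the source was written in blocks.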

[Figure: program block diagram]

Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

• CSECT signals the start of a new control section.

• The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

• The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

• The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Object code generation changes only for instructions that refer to EXTREF symbols: the assembler does not know where such a symbol will be located, so it inserts an address of zero and leaves the actual value to be filled in at load time.

The object program uses six types of records: Header, Define, Refer, Text, Modification, and End.

LALITHAMBIGAIB APIT Page 43

SYSTEM SOFTWARE ( CS 2304) UNIT - II

LALITHAMBIGAIB APIT Page 44

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Object Program for Control Section and Program Linking

LALITHAMBIGAIB APIT Page 45

Page 13: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

OPTAB is usually organized as a hash table with the mnemonic operation code as the key. The hash table organization is particularly appropriate since it provides fast retrieval with a minimum of searching.

In most cases OPTAB is a static table, i.e. entries are not added to or deleted from it.

Symbol Table (SYMTAB)

Includes the name and value (address) of each label in the source program, and flags to indicate error conditions (e.g. a symbol defined in two different places).

During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.

During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.

Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2.

This copy of the source program can also be used to retain the results of certain operations that may be performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so that these need not be performed again during Pass 2.

The Algorithm for Pass 1

Begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 {instruction length} to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
End {Pass 1}

Explanation - Pass 1 Algorithm

The algorithm scans the first statement, START, and saves its operand field (the address) as the starting address of the program. It initializes LOCCTR to this address. This line is written to the intermediate file.

If there is no START statement, and hence no starting address, LOCCTR is initialized to zero.

If a label is encountered, the symbol is entered in the symbol table along with its associated address value (the current LOCCTR). If the symbol is already present in the table, an error flag is set to indicate a duplicate symbol.

The algorithm next searches OPTAB for the mnemonic code. If it is found, the length of the instruction (3 bytes for SIC) is added to LOCCTR so that it points to the next instruction.

If the opcode is the directive WORD, 3 is added to LOCCTR.

If it is RESW, 3 times the number of words reserved is added to LOCCTR.

If it is BYTE, the length of the constant in bytes is added to LOCCTR; if it is RESB, the number of bytes reserved is added.

If it is the END directive, the end of the program has been reached. The program length is found by evaluating (current LOCCTR - starting address), where the starting address is the one saved from the START statement.

Each processed line is written to the intermediate file.
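The Pass 1 steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the input is taken to be a list of already-parsed (label, opcode, operand) tuples with comment lines removed, OPTAB holds just a few opcodes, and the names pass1 and byte_length are hypothetical.

```python
# Sketch of Pass 1 for the basic SIC assembler described above.
# OPTAB holds only a few illustrative opcodes; all names are hypothetical.
OPTAB = {"LDA": 0x00, "STA": 0x0C, "STL": 0x14, "JSUB": 0x48, "RSUB": 0x4C}

def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else (len(body) + 1) // 2

def pass1(lines):
    """lines: list of (label, opcode, operand) tuples, comments removed.
    Returns (symtab, intermediate, program_length, errors)."""
    symtab, intermediate, errors = {}, [], []
    start = locctr = 0
    i = 0
    if lines and lines[0][1] == "START":
        start = locctr = int(lines[0][2], 16)    # operand is a hex address
        intermediate.append((locctr, lines[0]))
        i = 1
    while i < len(lines) and lines[i][1] != "END":
        label, opcode, operand = lines[i]
        intermediate.append((locctr, lines[i]))  # statement with its address
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
        i += 1
    if i < len(lines):
        intermediate.append((locctr, lines[i]))  # the END line
    return symtab, intermediate, locctr - start, errors
```

A Python dict plays the role of the hash-table SYMTAB here, so the duplicate-symbol check is a single membership test.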

The Algorithm for Pass 2

Begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code does not fit into current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
End {Pass 2}

Explanation - Pass 2 Algorithm

The first input line is read from the intermediate file.

If the opcode is START, this line is written directly to the listing file.

A Header record is written to the object program. It gives the starting address and the length of the program (the length was calculated during Pass 1).

Then the first Text record is initialized. Comment lines are ignored.

For each instruction, OPTAB is searched with the mnemonic to find its machine opcode.

If there is a symbol in the operand field, the symbol table is searched to get its address, which is combined with the opcode to form the object code.

If the symbol is not found in SYMTAB, zero is stored as the operand address and an error flag is set to indicate an undefined symbol.

If there is no symbol in the operand field, 0 is stored as the operand address and the object code instruction is assembled.

If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, for the character constant EOF the equivalent hexadecimal value '454F46' is stored).

If the object code cannot fit into the current Text record, that record is written to the object program and a new Text record is started for the remaining object code.

Once the whole program has been assembled and the END directive is encountered, the last Text record and then the End record are written.
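The two object-code steps of Pass 2 (converting constants and assembling an instruction) can be sketched as below. The sketch assumes the standard SIC instruction layout of an 8-bit opcode, one index bit and a 15-bit address; the helper names and the way errors are collected are hypothetical.

```python
# Sketch of the Pass 2 object-code steps for standard SIC (24-bit instruction:
# 8-bit opcode, 1 index bit, 15-bit address). Helper names are hypothetical.

def constant_to_object_code(opcode, operand):
    """Convert a BYTE or WORD operand to a string of hex digits."""
    if opcode == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"       # one 3-byte word
    kind, body = operand[0], operand[2:-1]
    if kind == "C":                                    # character constant
        return "".join(f"{ord(ch):02X}" for ch in body)
    return body.upper()                                # X'..' is already hex

def assemble_sic(opcode_value, operand, symtab, errors):
    """Assemble one SIC instruction; operand may be '', 'SYM' or 'SYM,X'."""
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    address = 0                                        # default operand address
    if symbol:
        if symbol in symtab:
            address = symtab[symbol]
        else:
            errors.append(f"undefined symbol {symbol}")
    return f"{(opcode_value << 16) | (indexed << 15) | address:06X}"
```

For instance, constant_to_object_code("BYTE", "C'EOF'") gives the value 454F46 mentioned in the explanation.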

Design and Implementation Issues

Some features of the assembler depend on the architecture of the machine. If the program is for the SIC machine, there is only one instruction format and hence a limited set of addressing modes. Only single-operand instructions are available, and the operand is always a memory reference. Anything that has to be fetched from memory takes more time. Hence the improved version, the SIC/XE machine, provides more instruction formats and more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design is therefore usually considered under two headings: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered.

200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X           CLEAR LOOP COUNTER
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT      TEST OUTPUT DEVICE
220           JEQ     WLOOP       LOOP UNTIL READY
225           LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230           WD      OUTPUT      WRITE CHARACTER
235           TIXR    T           LOOP UNTIL ALL CHARACTERS
                                  HAVE BEEN WRITTEN
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program for SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label    Opcode   Operand     Object Code
5           COPY     START    0
10          FIRST    STL      RETADR
12                   LDB      #LENGTH
13                   BASE     LENGTH
15          CLOOP    +JSUB    RDREC
20                   LDA      LENGTH
25                   COMP     #0
30                   JEQ      ENDFIL
35                   +JSUB    WRREC
40                   J        CLOOP
45          ENDFIL   LDA      EOF
50                   STA      BUFFER
55                   LDA      #3
60                   STA      LENGTH
65                   +JSUB    WRREC
70                   J        @RETADR
80          EOF      BYTE     C'EOF'
95          RETADR   RESW     1
100         LENGTH   RESW     1
105         BUFFER   RESB     4096
110         .
115         .        SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125         RDREC    CLEAR    X
130                  CLEAR    A
132                  CLEAR    S
133                  +LDT     #4096
135         RLOOP    TD       INPUT
140                  JEQ      RLOOP
145                  RD       INPUT
150                  COMPR    A,S
155                  JEQ      EXIT
160                  STCH     BUFFER,X
165                  TIXR     T
170                  JLT      RLOOP
175         EXIT     STX      LENGTH
180                  RSUB
185         INPUT    BYTE     X'F1'
195         .
200         .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210         WRREC    CLEAR    X
212                  LDT      LENGTH
215         WLOOP    TD       OUTPUT
220                  JEQ      WLOOP
225                  LDCH     BUFFER,X
230                  WD       OUTPUT
235                  TIXR     T
240                  JLT      WLOOP
245                  RSUB
250         OUTPUT   BYTE     X'05'
255                  END      FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine, memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly, it supports only one instruction format. Of its registers, only the index register X takes part in address calculation, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four different instruction formats:

1-byte instruction
2-byte instruction
3-byte instruction
4-byte instruction

Instructions can be:

Instructions involving register to register
Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
Extended instruction format

Addressing modes are:

PC-relative or base-relative addressing: op m
Indirect addressing: op @m
Immediate addressing: op #c
Extended format: +op m
Indexed addressing: op m,X
Register-to-register instructions
Larger memory -> multiprogramming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the register names can be entered in the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.

2. Translation of register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are of the register-memory type. One operand is always in a register and the other operand is in memory. The addressing mode specifies how the operand is to be fetched from memory.

There are two relative modes:

1) Program-counter relative and
2) Base relative

Relative addressing is used with format 3 instructions, in which the opcode is followed by a 12-bit displacement in the address field. In format 4 the opcode is followed by a 20-bit address field, which is large enough to hold the full target address.

1. Program-Counter Relative

a) Format 3 is normally used in this mode.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement ranges from -2048 to +2047. This displacement (which must be small enough to fit in the 12-bit field) is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter: disp = TA - (PC), where (PC) is the address of the next instruction.

e) For example, for the statement LDA LENGTH at location 000A (line 20), (PC) = 000D and TA = 0033, so disp = 0033 - 000D = 026 and the object code is 032026.

2. Base-Relative Addressing Mode

a) In this mode the base register supplies the address to which the displacement is added. The target address is therefore

TA = (B) + displacement

where the displacement is an unsigned value from 0 to 4095.

b) This addressing mode is used when the displacement is too large for PC-relative addressing, so the operand is located relative to the base register rather than relative to the instruction. The directive BASE tells the assembler what value the base register will contain, so that it can compute base-relative displacements for the instructions that follow.

c) The NOBASE directive indicates that the base register can no longer be used to calculate target addresses. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

        LDB     #LENGTH    (instruction)
        BASE    LENGTH     (directive)
        NOBASE

For example:

Line  Loc   Label    Opcode   Operand     Object Code
12    0003           LDB      #LENGTH     69202D
13                   BASE     LENGTH
100   0033  LENGTH   RESW     1
105   0036  BUFFER   RESB     4096
160   104E           STCH     BUFFER,X    57C003
165   1051           TIXR     T           B850

In the above example, the BASE directive tells the assembler that base-relative addressing is available for calculating a target address whenever PC-relative addressing cannot be used. The address of LENGTH is held in the base register.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160:

BUFFER is at location 0036 (hexadecimal)
(B) = 0033 (hexadecimal)
disp = 0036 - 0033 = 0003 (hexadecimal)

20    000A           LDA      LENGTH      032026
175   1056  EXIT     STX      LENGTH      134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA

This value (hexadecimal -1026) lies outside the range of a 12-bit displacement, so PC-relative addressing is not applicable and base-relative addressing is used instead, giving disp = 0033 - 0033 = 000.
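The choice between the two relative modes can be sketched as follows. This is an illustrative outline only: the function name, its parameters, and the decision to raise an error when neither mode fits are assumptions made for this example. The flag bits follow the SIC/XE format 3 layout n, i, x, b, p, e.

```python
# Sketch: choose PC-relative, then base-relative, for a SIC/XE format 3
# instruction. Function and parameter names are hypothetical.

def assemble_format3(opcode, loc, target, base=None, n=1, i=1, x=0):
    """Return the 6-hex-digit object code for one format 3 instruction.

    opcode: 8-bit opcode value (low two bits zero); loc: address of the
    instruction; target: address of the operand; base: value announced by
    the BASE directive, or None when no base is in effect (NOBASE).
    """
    pc = loc + 3                           # PC points past this instruction
    disp = target - pc
    if -2048 <= disp <= 2047:              # fits as a signed 12-bit value
        b, p = 0, 1
        disp &= 0xFFF                      # two's complement in 12 bits
    elif base is not None and 0 <= target - base <= 4095:
        b, p = 1, 0
        disp = target - base
    else:
        raise ValueError("displacement out of range; format 4 is needed")
    flags = (n << 5) | (i << 4) | (x << 3) | (b << 2) | (p << 1)   # e = 0
    return f"{(opcode << 16) | (flags << 12) | disp:06X}"
```

With the addresses from the listing above, assemble_format3(0x00, 0x000A, 0x0033) reproduces 032026 for line 20, and assemble_format3(0x10, 0x1056, 0x0033, base=0x0033) reproduces 134000 for line 175.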

3. Immediate Addressing Mode

In this mode no memory reference is involved. When immediate mode is used, the target address itself is the operand value.

If a symbol is used as the immediate operand, the address of the symbol is the operand value and the instruction is assembled as immediate with PC-relative mode. For example, for LDB #LENGTH at location 0003 (line 12) the displacement is 0033 - 0006 = 02D, giving the object code 69202D.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode. An example is J @RETADR (line 70).

The instruction transfers control to the address stored at location RETADR. If the address of RETADR is 0030, the target address is 0030 and the displacement assembled into the instruction is 0003.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute Address

A program whose starting address is specified at assembly time.

The addresses may be invalid if the program is loaded somewhere else.

Example

The statement in this example loads register A with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the same amount as the program is displaced. Hence we need to change the address portion of the instruction so that the program can be loaded and executed at location 3000.

Apart from the instructions whose operand address changes as the program load address changes, there are parts of the program that remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example of Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address field of the JSUB instruction is modified to the new location 6036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified:

o Not a memory address (immediate addressing)
o PC-relative, base-relative

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader.
o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

One Modification record for each address to be modified.

The length is stored in half-bytes (4 bits).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that need to change if relocation occurs. M00000705 is the Modification record for the address field that starts at location 000007 and is 5 half-bytes long.
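How a loader might act on a record such as M00000705 can be sketched as follows. The function name and the bytearray model of memory are assumptions made for this illustration; the record layout (6-digit starting location followed by a 2-digit length in half-bytes) is the one described above.

```python
# Sketch of how a loader might apply a Modification record such as
# M00000705. The function name and the byte-array model of memory are
# assumptions made for this example.

def apply_modification(code, record, load_address):
    """Add load_address to the address field named by one M record.

    code: bytearray holding the program as assembled for address 0.
    record: text such as 'M00000705' (start location, length in half-bytes).
    """
    start = int(record[1:7], 16)
    half_bytes = int(record[7:9], 16)
    n_bytes = (half_bytes + 1) // 2
    field = int.from_bytes(code[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1           # low-order half-bytes only
    new_low = ((field & mask) + load_address) & mask
    field = (field & ~mask) | new_low            # keep the untouched half-byte
    code[start:start + n_bytes] = field.to_bytes(n_bytes, "big")
```

Applied to the JSUB instruction 4B101036 at location 0006, a load address of 5000 (hexadecimal) changes it to 4B106036, and a load address of 7420 changes it to 4B108456, matching the relocation example.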

Machine independent assembler features

Assembler features not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL   LDA      =C'EOF'     032010
      002D           =C'EOF'              454F46

215   1062  WLOOP    TD       =X'05'      E32011
      1076           =X'05'               05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93                   LTORG
      002D           =C'EOF'              454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for current value of location counter

• The value of the location counter can be denoted by the literal operand =*.

                     BASE     *
                     LDB      =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100   LDA   =C'EOF'
• 125   LDA   =C'EOF'
• 160   LDA   =X'454F46'

Literal operands with different literal names may have the same literal value, as on line 160. Recognizing such operands will complicate the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address.

• When a LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
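The Pass 1 handling of literals can be sketched as follows. LITTAB is modelled here as a Python dict keyed by the literal name, so duplicates are recognized by comparing the defining character strings; the class and method names are hypothetical.

```python
# Sketch of literal handling in Pass 1. LITTAB is modelled as a dict keyed by
# the literal name; the class and method names are hypothetical.

def literal_value(name):
    """Hex string for a literal such as =C'EOF' or =X'05'."""
    kind, body = name[1], name[3:-1]
    if kind == "C":
        return "".join(f"{ord(ch):02X}" for ch in body)
    return body.upper()

class LitTab:
    def __init__(self):
        self.entries = {}      # name -> {"value", "length", "address"}

    def note_literal(self, name):
        """Called for each literal operand; duplicates are stored once."""
        if name not in self.entries:
            value = literal_value(name)
            self.entries[name] = {"value": value,
                                  "length": len(value) // 2,
                                  "address": None}

    def place_pool(self, locctr):
        """Called at LTORG or END; returns the updated location counter."""
        for entry in self.entries.values():
            if entry["address"] is None:       # not yet placed in a pool
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr
```

Because the key is the literal name, =C'EOF' and =X'454F46' would be stored as two separate entries, which is the simple string-comparison behaviour described above.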

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol  EQU     value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the value specified by "value".

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. The statement

        +LDT    #4096

can be written as

MAXLEN  EQU     4096
        +LDT    #MAXLEN

Define mnemonic names for registers, so that RMO 0,1 can be written as RMO A,X:

A       EQU     0
X       EQU     1
L       EQU     2

Establish and use names that reflect the logical function of the registers in the program:

BASE          EQU     R1
ACCUMULATOR   EQU     R2
INDEX         EQU     R3

Assembler directive ORG

The assembler directive ORG is usually used to assign values to symbols indirectly.

        ORG     value

"value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB (100 entries)    SYMBOL     VALUE      FLAGS
                      6 bytes    3 bytes    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

        ORG

with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program. Hence the sequences marked "disallowed" below could not be processed by an ordinary two-pass assembler.

BETA    EQU     ALPHA
ALPHA   RESW    1
(disallowed)

ALPHA   EQU     BETA
BETA    EQU     DELTA
DELTA   RESW    1
(disallowed)

        ORG     ALPHA
BYTE1   RESB    1
BYTE2   RESB    1
BYTE3   RESB    1
        ORG
ALPHA   RESB    1
(disallowed)

ALPHA   RESW    1
BETA    EQU     ALPHA
(allowed)

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

• Individual terms in the expression may be
  • constants,
  • user-defined symbols, or
  • special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as:

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
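The pairing rule can be checked mechanically by counting relative terms with their signs: a net count of 0 gives an absolute expression and a net count of +1 a relative one. The sketch below uses the symbol values from the table above; the minimal parser, which handles only terms joined by + and -, and the name evaluate are assumptions made for this illustration.

```python
# Sketch of expression typing for +/- expressions. SYMTAB entries follow the
# table above as (type, value); the parser is deliberately minimal and
# handles only terms joined by + and -, an assumption of this example.
import re

SYMTAB = {"MAXLEN": ("A", 0x1000), "BUFEND": ("R", 0x1036),
          "BUFFER": ("R", 0x0036), "RETADR": ("R", 0x0030)}

def evaluate(expr):
    """Return (type, value), type being 'A' or 'R'; raise on illegal types."""
    value = 0
    relative = 0                        # net count of relative terms
    for sign, term in re.findall(r"([+-]?)(\w+)", expr.replace(" ", "")):
        factor = -1 if sign == "-" else 1
        if term.isdigit():              # decimal constant: absolute term
            kind, term_value = "A", int(term)
        else:
            kind, term_value = SYMTAB[term]
        value += factor * term_value
        if kind == "R":
            relative += factor
    if relative == 0:
        return "A", value
    if relative == 1:
        return "R", value
    raise ValueError(f"{expr} is neither absolute nor relative")
```

With these values, BUFEND-BUFFER evaluates to the absolute value 1000 (hexadecimal), while BUFEND+BUFFER and 100-BUFFER are rejected.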

Program blocks

The program is divided into three blocks: Default (0), CDATA (1) and CBLKS (2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large blocks of memory

One more column, the block number, is included in the listing when object code is generated. Whenever a new block begins, a USE statement with the block name is written; the first time a block is used, its LOCCTR value is set to 0000. The block number is also added as a field in SYMTAB.

The program block table consists of the block name, block number, starting address and length.

After the location counter values have been found, the object code is generated and the object program is written.
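One way to picture the program block table is the sketch below. It assumes that blocks are laid out one after another in block-number order and that a symbol's final address is the start of its block plus its address within the block; the function names and the lengths used in the usage note are illustrative, not taken from the text.

```python
# Sketch: after Pass 1 each block has its own length; the block table then
# assigns consecutive starting addresses. Names and values are illustrative.

def build_block_table(block_lengths):
    """block_lengths: list of (name, length) in order of block number.
    Returns a list of dicts with name, number, start and length."""
    table, start = [], 0
    for number, (name, length) in enumerate(block_lengths):
        table.append({"name": name, "number": number,
                      "start": start, "length": length})
        start += length                 # next block follows immediately
    return table

def absolute_address(table, block_number, relative_address):
    """Address of a symbol = start of its block + address within the block."""
    return table[block_number]["start"] + relative_address
```

For example, with assumed lengths of 0x66, 0x0B and 0x1000 bytes for the three blocks, CDATA would start at 0x66 and CBLKS at 0x71.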


Program Block Diagram

Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Object code generation changes only for instructions that refer to EXTREF symbols.

The object program needs six types of record: Header record, Define record, Refer record, Text record, Modification record and End record.
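The three record types that control sections add can be sketched as simple formatters. The column layout used here (6-character names padded with blanks, 6-digit hexadecimal addresses, a 2-digit length in half-bytes and a sign before the symbol in the Modification record) is assumed from the usual SIC/XE object program format, and the function names are hypothetical.

```python
# Sketch of the three record types added for control sections. The column
# layout (6-character names, 6-digit hexadecimal addresses) is assumed from
# the usual SIC/XE object program format; function names are hypothetical.

def define_record(symbols):
    """symbols: list of (name, address) pairs declared with EXTDEF."""
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols)

def refer_record(names):
    """names: list of symbols declared with EXTREF."""
    return "R" + "".join(f"{name:<6}" for name in names)

def modification_record(start, half_bytes, sign, symbol):
    """Ask the loader to add (+) or subtract (-) symbol's address at start."""
    return f"M{start:06X}{half_bytes:02X}{sign}{symbol:<6}"
```

Since the assembler does not know where an EXTREF symbol will be located, it can only emit such a Modification record and leave the actual address arithmetic to the loader.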


Object Program for Control Section and Program Linking

LALITHAMBIGAIB APIT Page 45

Page 14: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

        initialize LOCCTR to 0
    while OPCODE ≠ 'END' do
        begin
            if this is not a comment line then
                begin
                    if there is a symbol in the LABEL field then
                        begin
                            search SYMTAB for LABEL
                            if found then
                                set error flag (duplicate symbol)
                            else
                                insert (LABEL, LOCCTR) into SYMTAB
                        end {if symbol}
                    search OPTAB for OPCODE
                    if found then
                        add 3 {instruction length} to LOCCTR
                    else if OPCODE = 'WORD' then
                        add 3 to LOCCTR
                    else if OPCODE = 'RESW' then
                        add 3 * #[OPERAND] to LOCCTR
                    else if OPCODE = 'RESB' then
                        add #[OPERAND] to LOCCTR
                    else if OPCODE = 'BYTE' then
                        begin
                            find length of constant in bytes
                            add length to LOCCTR
                        end {if BYTE}
                    else
                        set error flag (invalid operation code)
                end {if not a comment}
            write line to intermediate file
            read next input line
        end {while not END}
    write last line to intermediate file
    save (LOCCTR - starting address) as program length
end {Pass 1}

Explanation - Pass 1 Algorithm

The algorithm scans the first statement, START, and saves the operand field (the address) as the starting address of the program. It initializes LOCCTR to this address and writes the line to the intermediate file.

If no operand is given, LOCCTR is initialized to zero.

If a label is encountered, the symbol is entered in the symbol table along with its address (the current value of LOCCTR). If the symbol is already in the table, an error flag is set to indicate a duplicate symbol.

Next the mnemonic is searched for in OPTAB. If it is found, the length of the instruction is added to LOCCTR so that it points to the next instruction.

If the opcode is the directive WORD, 3 is added to LOCCTR.

If it is RESW, 3 times the number of words reserved is added to LOCCTR.

If it is BYTE, the length of the constant in bytes is added to LOCCTR; if it is RESB, the number of bytes reserved is added.

If it is the END directive, the end of the program has been reached. The length of the program is the current LOCCTR minus the starting address saved from the START statement.

Each processed line is written to the intermediate file.
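The steps above can be sketched in Python. This is only an illustration under stated assumptions, not the assembler used in these notes: the source program is given as (label, opcode, operand) tuples, OPTAB holds just a few SIC mnemonics, the intermediate file is omitted, and the names pass1 and constant_length are hypothetical.

```python
# Small subset of OPTAB; every SIC instruction is 3 bytes long.
OPTAB = {"LDA": 0x00, "STA": 0x0C, "STL": 0x14, "JSUB": 0x48, "RSUB": 0x4C}

def constant_length(operand):
    """Length in bytes of a BYTE operand such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else len(body) // 2

def pass1(lines):
    """Return (symtab, start_address, program_length, errors)."""
    symtab, errors = {}, []
    start = locctr = 0
    if lines and lines[0][1] == "START":
        start = locctr = int(lines[0][2], 16)   # START operand is hexadecimal
        lines = lines[1:]
    for label, opcode, operand in lines:
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB:
            locctr += 3
        elif opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += constant_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, start, locctr - start, errors

program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("", "LDA", "EOF"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    ("", "END", "FIRST"),
]
symtab, start, length, errors = pass1(program)
```

Keeping SYMTAB as a dictionary gives the fast lookup by label that Pass 2 relies on.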

The Algorithm for Pass 2

begin
    read first input line {from intermediate file}
    if OPCODE = 'START' then
        begin
            write listing line
            read next input line
        end {if START}
    write Header record to object program
    initialize first Text record
    while OPCODE ≠ 'END' do
        begin
            if this is not a comment line then
                begin
                    search OPTAB for OPCODE
                    if found then
                        begin
                            if there is a symbol in OPERAND field then
                                begin
                                    search SYMTAB for OPERAND
                                    if found then
                                        store symbol value as operand address
                                    else
                                        begin
                                            store 0 as operand address
                                            set error flag (undefined symbol)
                                        end
                                end {if symbol}
                            else
                                store 0 as operand address
                            assemble the object code instruction
                        end {if opcode found}
                    else if OPCODE = 'BYTE' or 'WORD' then
                        convert constant to object code
                    if object code will not fit into the current Text record then
                        begin
                            write Text record to object program
                            initialize new Text record
                        end
                    add object code to Text record
                end {if not comment}
            write listing line
            read next input line
        end {while not END}
    write last Text record to object program
    write End record to object program
    write last listing line
end {Pass 2}

Explanation - Pass 2 Algorithm

The first input line is read from the intermediate file.

If the opcode is START, this line is written directly to the listing file.

A Header record is written to the object program. It gives the starting address and the length of the program (which was calculated during Pass 1).

Then the first Text record is initialized. Comment lines are ignored.

For each instruction, OPTAB is searched to find the machine code of the opcode.

If there is a symbol in the operand field, the symbol table is searched for its address, which is combined with the machine code of the opcode.

If the symbol is not found in the symbol table, zero is stored as the operand address and an error flag is set to mark the symbol as undefined.

If there is no symbol in the operand field, zero is stored as the operand address and the object code instruction is assembled.

If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, for the character constant EOF the hexadecimal value '454F46' is stored).

If the object code does not fit into the current Text record, that record is written out and a new Text record is started for the remaining object code.

The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.
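Three of the Pass 2 steps described above can be sketched in Python: converting a BYTE constant, assembling a standard SIC instruction, and packing object code into Text records. The function names are hypothetical, the 30-byte limit per Text record and the layout T + 6-digit start + 2-digit length are assumed from the usual SIC object program format, and the sketch assumes the object codes are contiguous in memory (no RESW/RESB gaps).

```python
def byte_constant(operand):
    """Object code (hex string) for a BYTE operand: C'...' or X'...'."""
    body = operand[2:-1]
    if operand[0] == "C":
        return "".join(f"{ord(ch):02X}" for ch in body)
    return body.upper()

def assemble_sic(opcode, address, indexed=False):
    """Standard SIC format: 8-bit opcode, 1 index bit, 15-bit address."""
    word = (opcode << 16) | (0x8000 if indexed else 0) | (address & 0x7FFF)
    return f"{word:06X}"

def text_records(start, codes, max_bytes=30):
    """Pack consecutive object codes into Text records of at most max_bytes."""
    records, current, rec_start, addr = [], "", start, start
    for code in codes:
        if len(current) + len(code) > 2 * max_bytes:
            # Current record is full: write it out and start a new one.
            records.append(f"T{rec_start:06X}{len(current) // 2:02X}{current}")
            current, rec_start = "", addr
        current += code
        addr += len(code) // 2
    if current:
        records.append(f"T{rec_start:06X}{len(current) // 2:02X}{current}")
    return records

stl = assemble_sic(0x14, 0x1033)                   # STL with operand address 1033
stch = assemble_sic(0x54, 0x1039, indexed=True)    # STCH with indexed addressing
records = text_records(0x1000, [stl, byte_constant("C'EOF'")])
```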

Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. The SIC machine has only one instruction format and hence limited addressing modes. All instructions take a single operand, and the operand is always a memory reference, which takes more time to fetch than a register operand. The improved SIC/XE machine therefore provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design of an assembler therefore has to consider two things: machine-dependent features and machine-independent features.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of SIC is considered.

Line  Label   Opcode  Operand    Comment

200   .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC   CLEAR   X          CLEAR LOOP COUNTER
212           LDT     LENGTH
215   WLOOP   TD      OUTPUT     TEST OUTPUT DEVICE
220           JEQ     WLOOP      LOOP UNTIL READY
225           LDCH    BUFFER,X   GET CHARACTER FROM BUFFER
230           WD      OUTPUT     WRITE CHARACTER
235           TIXR    T          LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240           JLT     WLOOP
245           RSUB
250   OUTPUT  BYTE    X'05'
255           END     FIRST

Figure 2.4 Example program of a SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label   Opcode  Operand    Object Code

5     0000  COPY    START   0
10    0000  FIRST   STL     RETADR     17202D
12    0003          LDB     #LENGTH    69202D
13                  BASE    LENGTH
15    0006  CLOOP   +JSUB   RDREC      4B101036
20    000A          LDA     LENGTH     032026
25    000D          COMP    #0         290000
30    0010          JEQ     ENDFIL     332007
35    0013          +JSUB   WRREC      4B10105D
40    0017          J       CLOOP      3F2FEC
45    001A  ENDFIL  LDA     EOF        032010
50    001D          STA     BUFFER     0F2016
55    0020          LDA     #3         010003
60    0023          STA     LENGTH     0F200D
65    0026          +JSUB   WRREC      4B10105D
70    002A          J       @RETADR    3E2003
80    002D  EOF     BYTE    C'EOF'     454F46
95    0030  RETADR  RESW    1
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
110         .
115         .       SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125   1036  RDREC   CLEAR   X          B410
130   1038          CLEAR   A          B400
132   103A          CLEAR   S          B440
133   103C          +LDT    #4096      75101000
135   1040  RLOOP   TD      INPUT      E32019
140   1043          JEQ     RLOOP      332FFA
145   1046          RD      INPUT      DB2013
150   1049          COMPR   A,S        A004
155   104B          JEQ     EXIT       332008
160   104E          STCH    BUFFER,X   57C003
165   1051          TIXR    T          B850
170   1053          JLT     RLOOP      3B2FEA
175   1056  EXIT    STX     LENGTH     134000
180   1059          RSUB               4F0000
185   105C  INPUT   BYTE    X'F1'      F1
195         .
200         .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210   105D  WRREC   CLEAR   X          B410
212   105F          LDT     LENGTH     774000
215   1062  WLOOP   TD      OUTPUT     E32011
220   1065          JEQ     WLOOP      332FFA
225   1068          LDCH    BUFFER,X   53C003
230   106B          WD      OUTPUT     DF2008
235   106E          TIXR    T          B850
240   1070          JLT     WLOOP      3B2FEF
245   1073          RSUB               4F0000
250   1076  OUTPUT  BYTE    X'05'      05
255                 END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly it supports only one instruction format. Of its registers, only the accumulator A and the index register X are involved in operand addressing, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). It supports four different instruction formats:

1-byte instruction (format 1)

2-byte instruction (format 2)

3-byte instruction (format 3)

4-byte instruction (format 4)

Instructions can be:

Instructions involving register to register

Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

Extended instruction format

Addressing modes are:

PC-relative or base-relative addressing: op m

Indirect addressing: op @m

Immediate addressing: op #c

Extended format: +op m

Index addressing: op m,x

Register-to-register instructions

Larger memory -> multiprogramming (program allocation)

Translation

1 Translation for instructions involving register-to-register addressing

During Pass 1 the register names can be entered in the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.

2 Translation involving register-memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For formats and addressing modes, refer to chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One operand is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways of calculating the address relative to a register:

1) Program-counter relative and

2) Base relative

Relative addressing is used with format 3, where the opcode is followed by a 12-bit displacement value in the address field.

In format 4 the opcode is followed by a 20-bit address field, which is large enough to hold the full target address, so no displacement has to be calculated.

1 Program-Counter Relative

a) This mode uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement ranges from -2048 to 2047, so that it fits as a signed value in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is therefore calculated relative to the program counter: disp = TA - (PC), where (PC) is the address of the next instruction.

e) The following example shows how the address is calculated. For STL RETADR at location 0000, with RETADR at location 0030, (PC) = 0003 and disp = 0030 - 0003 = 02D.
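The displacement calculation can be checked with a few lines of Python. This is a minimal sketch; the function name pc_relative_disp is hypothetical, and the addresses are those of RETADR (0030), CLOOP (0006) and LENGTH (0033) in the example program, with the program counter values assumed to be the address of the following instruction.

```python
def pc_relative_disp(target, next_instr):
    """Return the 12-bit displacement field, or None if it does not fit.

    target     : address of the operand (TA)
    next_instr : contents of PC, i.e. address of the next instruction
    """
    disp = target - next_instr
    if -2048 <= disp <= 2047:
        return disp & 0xFFF        # negative values as 12-bit two's complement
    return None

forward = pc_relative_disp(0x0030, 0x0003)    # STL RETADR
backward = pc_relative_disp(0x0006, 0x001A)   # J CLOOP, a backward jump
too_far = pc_relative_disp(0x0033, 0x1059)    # STX LENGTH, out of range
```

A result of None is the signal for the assembler to try base-relative addressing next.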

2 Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register B. Therefore the target address is

TA = (B) + displacement value

where the displacement is an unsigned value from 0 to 4095.

b) This addressing mode is used when the range of the PC-relative displacement is not sufficient, so the operand is not addressed relative to the instruction as in PC-relative mode. The directive BASE tells the assembler what value the base register will contain. Once the assembler has encountered this directive, it can use base-relative addressing to calculate the displacement of an operand.

c) The NOBASE directive indicates that the base register can no longer be used to calculate target addresses. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

For example:

Line  Loc   Label   Opcode  Operand    Object Code

12    0003          LDB     #LENGTH    69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X   57C003
165   1051          TIXR    T          B850

In the above example the directive BASE tells the assembler that the base register will contain the address of LENGTH, so base-relative addressing can be used wherever PC-relative addressing is not possible. The LDB instruction loads the address of LENGTH, which is 0033, into the base register at execution time; the BASE directive explicitly tells the assembler that the register has this value.

For line 160:

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 - 0033 = (0003)16

Line  Loc   Label   Opcode  Operand    Object Code

20    000A          LDA     LENGTH     032026
175   1056  EXIT    STX     LENGTH     134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA (as a 16-bit value, that is, -1026 hexadecimal)

This is outside the range -2048 to 2047, so PC-relative addressing is not applicable and base-relative addressing is used instead: disp = 0033 - (B) = 0033 - 0033 = 0, which gives the object code 134000.
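The choice between the two modes can be sketched in Python as follows. This is an illustration only: the function name assemble_format3 is hypothetical, only simple addressing (n = i = 1) is handled, and the opcodes used in the calls (STX = 10, STCH = 54, LDA = 00) are taken from the SIC/XE instruction set.

```python
def assemble_format3(opcode, target, pc, base=None, indexed=False):
    """Assemble a format 3 instruction, trying PC-relative before base-relative."""
    first = opcode | 0b11                       # n = 1, i = 1 in the low bits of byte 1
    x = 0x8 if indexed else 0                   # x bit
    disp = target - pc
    if -2048 <= disp <= 2047:                   # PC-relative: p = 1
        flags, field = x | 0x2, disp & 0xFFF
    elif base is not None and 0 <= target - base <= 4095:
        flags, field = x | 0x4, target - base   # base-relative: b = 1
    else:
        raise ValueError("operand out of range; format 4 is required")
    return f"{first:02X}{flags:X}{field:03X}"

# Values from the example: (B) = 0033 after BASE LENGTH.
line_20 = assemble_format3(0x00, 0x0033, pc=0x000D, base=0x0033)
line_160 = assemble_format3(0x54, 0x0036, pc=0x1051, base=0x0033, indexed=True)
line_175 = assemble_format3(0x10, 0x0033, pc=0x1059, base=0x0033)
```

Passing base=None models the situation after NOBASE, in which an operand that is out of PC-relative range can only be reached with format 4.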

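3 Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative mode. For example, LDB #LENGTH on line 12 is assembled as 69202D: the i bit and the p bit are set, and the displacement 02D is the address of LENGTH relative to the program counter.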
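For an immediate operand that is a plain constant, the translation is simpler still, as the following Python sketch suggests. The name assemble_immediate is hypothetical, the sketch handles only non-negative constants, and the opcodes (LDA = 00, LDT = 74, COMP = 28) are taken from the SIC/XE instruction set.

```python
def assemble_immediate(opcode, value, extended=False):
    """Immediate constant operand: n = 0, i = 1, b = p = 0."""
    first = opcode | 0b01                      # only the i bit is set
    if extended:                               # format 4: e = 1, 20-bit field
        if not 0 <= value <= 0xFFFFF:
            raise ValueError("constant does not fit in 20 bits")
        return f"{first:02X}1{value:05X}"
    if not 0 <= value <= 0xFFF:
        raise ValueError("constant does not fit in 12 bits; use format 4")
    return f"{first:02X}0{value:03X}"

lda_3 = assemble_immediate(0x00, 3)                       # LDA #3
comp_0 = assemble_immediate(0x28, 0)                      # COMP #0
ldt_4096 = assemble_immediate(0x74, 4096, extended=True)  # +LDT #4096
```

The last call shows why line 133 of the example program needs the extended format: 4096 does not fit in a 12-bit field.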

4 Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing.

For example, consider J @RETADR on line 70.

The instruction transfers control to the address stored in location RETADR, so RETADR holds the address of the operand. If the address of RETADR is 0030 and the instruction is at location 002A, then (PC) = 002D and the displacement is 0030 - 002D = 003.
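What the machine does with such an instruction at execution time can be mimicked in Python. This is a sketch under assumptions: memory is modelled as a dictionary from word address to value, the stored return address 2000 is a made-up value, and the function name is hypothetical.

```python
def indirect_pc_relative_target(memory, pc, disp_field):
    """Resolve an indirect, PC-relative operand (n = 1, i = 0, p = 1).

    Returns (pointer, value): the address computed from PC and displacement,
    and the address stored there, which is the real target.
    """
    # Sign-extend the 12-bit displacement field.
    disp = disp_field - 0x1000 if disp_field & 0x800 else disp_field
    pointer = pc + disp
    return pointer, memory[pointer]

# J @RETADR at location 002A: (PC) = 002D, displacement field = 003.
memory = {0x0030: 0x2000}          # hypothetical return address saved in RETADR
pointer, destination = indirect_pc_relative_target(memory, 0x002D, 0x003)
object_code = f"{0x3C | 0b10:02X}2{0x003:03X}"   # opcode J = 3C, n = 1, i = 0, p = 1
```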

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute program

A program whose starting address is specified at assembly time.

Its addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the same amount as the program is displaced. Hence we need to change the address portion of the instruction so that the program can be loaded and executed at location 3000.

Apart from the instructions whose operand address changes when the program load address changes, there are parts of the program that remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler can identify for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example: Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address field of the JSUB instruction must be changed to 06036, the new address of RDREC, giving 4B106036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified:

o The operand is not a memory address (immediate addressing)

o The operand is addressed PC-relative or base-relative

From the object program alone it is not possible to distinguish addresses from constants:

o The assembler must keep some information to tell the loader

o The object program that contains the Modification records is called a relocatable program

The way to solve the relocation problem

For an address label, the address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

There is one Modification record for each address to be modified.

The length is stored in half-bytes (4 bits).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for those instructions that need to change if relocation occurs. M00000705 is the Modification record for the statement at location 0006 (+JSUB RDREC): the address field to be modified starts at location 0007 and is 5 half-bytes long.
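How a loader might act on such a record can be sketched in Python. This is a simplified illustration, not a complete loader: the program image is a bytearray assembled relative to address 0, the function name apply_modification is hypothetical, and the record layout (M, 6-digit starting location, 2-digit length in half-bytes) is assumed from the example M00000705.

```python
def apply_modification(image, record, load_address):
    """Add load_address to the address field named by one Modification record."""
    start = int(record[1:7], 16)                 # location of the field
    half_bytes = int(record[7:9], 16)            # field length in half-bytes
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(image[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1           # low-order half-bytes hold the address
    updated = (field & ~mask) | ((field + load_address) & mask)
    image[start:start + nbytes] = updated.to_bytes(nbytes, "big")

def relocated_jsub(load_address):
    """Relocate +JSUB RDREC (4B101036 at location 0006) and return its new code."""
    image = bytearray(10)
    image[6:10] = bytes.fromhex("4B101036")
    apply_modification(image, "M00000705", load_address)
    return image[6:10].hex().upper()

at_5000 = relocated_jsub(0x5000)
at_7420 = relocated_jsub(0x7420)
```

Because the field has an odd number of half-bytes, the mask leaves the upper half of the first byte (the flag bits) untouched.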

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL  LDA     =C'EOF'    032010
      002D          =C'EOF'            454F46

215   1062  WLOOP   TD      =X'05'     E32011
      1076          =X'05'             05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93                  LTORG
      002D          =C'EOF'            454F46

• The assembler directive LTORG tells the assembler to generate a literal pool here.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand:

BASE *

LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46' (literal operands with different literal names may have the same literal value)

Recognizing literal operands that have different literal names but the same literal value complicates the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, the value and length, and the address assigned to the literal.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.

• When an LTORG statement or the end of the program is encountered, the assembler makes a scan of LITTAB and assigns an address to each literal that does not yet have one.

• Update the location counter to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
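The Pass 1 handling of literals can be sketched in Python. The class LiteralTable and its method names are hypothetical, literals are keyed by name only (so =C'EOF' and =X'454F46' are treated as different literals, as discussed above), and the pool addresses are the ones used in the listings above.

```python
def literal_length(literal):
    """Length in bytes of a literal such as =C'EOF' or =X'05'."""
    body = literal[3:-1]
    return len(body) if literal[1] == "C" else len(body) // 2

class LiteralTable:
    def __init__(self):
        self.entries = {}          # literal name -> address (None until pooled)

    def reference(self, literal):
        """Pass 1: record a literal operand; a duplicate is stored only once."""
        self.entries.setdefault(literal, None)

    def place_pool(self, locctr):
        """At LTORG or END: give pending literals addresses, return new LOCCTR."""
        for literal, address in self.entries.items():
            if address is None:
                self.entries[literal] = locctr
                locctr += literal_length(literal)
        return locctr

littab = LiteralTable()
littab.reference("=C'EOF'")                     # line 45
littab.reference("=C'EOF'")                     # a second use of the same literal
after_ltorg = littab.place_pool(0x002D)         # LTORG on line 93
littab.reference("=X'05'")                      # line 215
after_end = littab.place_pool(0x1076)           # pool at the end of the program
```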

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

+LDT #4096

we can write

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers:

A EQU 0

X EQU 1

L EQU 2

so that RMO 0,1 can be written as RMO A,X.

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

ORG value

"Value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB            SYMBOL     VALUE      FLAGS
(100 entries)   6 bytes    3 bytes    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

Allowed:

ALPHA RESW 1
BETA  EQU  ALPHA

Disallowed:

BETA  EQU  ALPHA
ALPHA RESW 1

Disallowed:

ALPHA EQU  BETA
BETA  EQU  DELTA
DELTA RESW 1

Disallowed:

      ORG  ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
      ORG
ALPHA RESB 1
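The restriction on EQU can be demonstrated with a short Python sketch of the symbol definitions made during Pass 1. The function name define_symbols and the tuple-based statement format are assumptions, and only EQU, RESW and RESB with simple operands are handled.

```python
def define_symbols(statements):
    """Process (label, opcode, operand) tuples in order; return (symtab, errors)."""
    symtab, errors, locctr = {}, [], 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand.isdigit():
                symtab[label] = int(operand)
            elif operand in symtab:
                symtab[label] = symtab[operand]
            else:
                # Forward reference: the value is not known during Pass 1.
                errors.append(f"{label}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            symtab[label] = locctr
            locctr += int(operand)
    return symtab, errors

allowed = define_symbols([("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")])
disallowed = define_symbols([("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")])
chained = define_symbols([
    ("ALPHA", "EQU", "BETA"),
    ("BETA", "EQU", "DELTA"),
    ("DELTA", "RESW", "1"),
])
```

In the chained case both EQU statements fail, because neither BETA nor DELTA has a value at the point where it is used.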

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1 Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term can enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2 Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term can enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value

MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
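The pairing rule can be checked mechanically by counting relative terms with their signs. The Python sketch below does this for expressions without parentheses; the function name expression_type and the (type, value) layout of the symbol table are assumptions, and the symbols are those of the table above.

```python
import re

def expression_type(expr, symtab):
    """Classify expr as 'absolute', 'relative' or 'error'.

    symtab maps each symbol to (type, value), with type 'A' or 'R'.
    """
    tokens = re.findall(r"[A-Za-z]\w*|\d+|[-+*/]", expr)
    relative_count, sign = 0, +1
    for i, tok in enumerate(tokens):
        if tok in "+-":
            sign = +1 if tok == "+" else -1
        elif tok in "*/":
            continue
        elif tok[0].isalpha() and symtab[tok][0] == "R":
            neighbours = tokens[max(i - 1, 0):i] + tokens[i + 1:i + 2]
            if "*" in neighbours or "/" in neighbours:
                return "error"     # relative term in multiplication or division
            relative_count += sign
    # 0 unpaired relative terms: absolute; exactly one positive: relative.
    return {0: "absolute", 1: "relative"}.get(relative_count, "error")

SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}
```

Storing the type flag next to the value in the symbol table is what makes this check possible during assembly.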

Program blocks

The program is divided into three blocks: the default block (0), CDATA (1) and CBLKS (2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large amounts of memory

One more column, the block number, is included in the listing when the object code is generated. Whenever the program switches to another block, it writes the directive USE followed by the block name. The assembler keeps a separate location counter for each block, which starts at 0000 when the block first begins. The block number is also added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address and length.

After the location counter values have been found, the object code is generated and the object program is written.

Program Block Diagram
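Once the length of every block is known at the end of Pass 1, the block table can be filled in and symbol addresses can be converted from block-relative to program-relative values. The Python sketch below illustrates this; the function names are hypothetical and the block lengths are illustrative values, not taken from these notes.

```python
def assign_block_addresses(block_lengths):
    """block_lengths: list of (name, length) in block-number order.

    Returns {name: (number, start, length)}; blocks are laid out one after another.
    """
    table, start = {}, 0
    for number, (name, length) in enumerate(block_lengths):
        table[name] = (number, start, length)
        start += length
    return table

def absolute_address(table, block, relative):
    """Address of a symbol = starting address of its block + offset within the block."""
    return table[block][1] + relative

# Illustrative lengths (assumed): 0x66, 0x0B and 0x1000 bytes.
block_table = assign_block_addresses(
    [("(default)", 0x66), ("CDATA", 0x0B), ("CBLKS", 0x1000)]
)
symbol_address = absolute_address(block_table, "CDATA", 0x0003)
```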

Control sections and program linking

Control section

i A control section is a part of the program that maintains its identity after assembly.

ii Each control section can be loaded and relocated independently of the others.

iii Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for references to EXTREF symbols changes: the assembler does not know their addresses, so it assembles an address of zero and leaves the actual value to be inserted at load time.

The object program needs six record types: Header record, Define record, Refer record, Text record, Modification record, and End record.

Object Program for Control Section and Program Linking

Page 15: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - IIend while not END

write last line to intermediate file

Save (LOCCTR ndash starting address) as program length

End pass 1

Explanation ndash PASS I Algorithm

The algorithm scans the first statement START and saves the operand field (the address) as

the starting address of the program Initializes the LOCCTR value to this address This line is

written to the intermediate line

If no operand is mentioned the LOCCTR is initialized to zero

If a label is encountered the symbol has to be entered in the symbol table along with its

associated address value If the symbol already exists that indicates an entry of the same

symbol already exists So an error flag is set indicating a duplication of the symbol

It next checks for the mnemonic code it searches for this code in the OPTAB If found then

the length of the instruction is added to the LOCCTR to make it point to the next instruction

If the opcode is the directive WORD it adds a value 3 to the LOCCTR

If it is RESW it needs to add 3 the number of data word to the LOCCTR

If it is BYTE it adds a value one to the LOCCTR if RESB it adds number of bytes

If it is END directive then it is the end of the program it finds the length of the program by

evaluating current LOCCTR ndash the starting address mentioned in the operand field of the

END directive

Each processed line is written to the intermediate file

The Algorithm for Pass 2

Begin

read 1st input line from intermediate file

if OPCODE = lsquoSTARTrsquo then

begin

write listing line

read next input line

end if START

LALITHAMBIGAIB APIT Page 15

SYSTEM SOFTWARE ( CS 2304) UNIT - IIwrite Header record to object program

initialize 1st Text record

while OPCODE = lsquoENDrsquo do

begin

if this is not comment line then

begin

search OPTAB for OPCODE

if found then

begin

if there is a symbol in OPERAND field then

begin

search SYMTAB for OPERAND

if found then

store symbol value as operand address

else

begin

store 0 as operand address

set error flag (undefined symbol)

end

end if symbol

else

store 0 as operand address

assemble the object code instruction

end if opcode found

else if OPCODE = lsquoBYTErsquo or lsquoWORDrdquo then

convert constant to object code

if object code doesnrsquot fit into current Text record then

begin

Write text record to object code

initialize new Text record

end

LALITHAMBIGAIB APIT Page 16

SYSTEM SOFTWARE ( CS 2304) UNIT - IIadd object code to Text record

end if not comment

write listing line

read next input line

end while not END

Write last Text record to object program

Write End record to object program

Write last listing line

End Pass 2

Explanation ndash PASS II Algorithm

Here the first input line is read from the intermediate file

If the opcode is START then this line is directly written to the list file

A header record is written in the object program which gives the starting address and the

length of the program (which is calculated during pass 1)

Then the first text record is initialized Comment lines are ignored

In the instruction for the opcode the OPTAB is searched to find the object code

If a symbol is there in the operand field the symbol table is searched to get the address value

for this which gets added to the object code of the opcode

If the address not found then zero value is stored as operands address An error flag is set

indicating it as undefined

If symbol itself is not found then store 0 as operand address and the object code instruction is

assembled

If the opcode is BYTE or WORD then the constant value is converted to its equivalent

object code( for example for character EOF its equivalent hexadecimal value lsquo454f46rsquo is

stored)

If the object code cannot fit into the current text record a new text record is created and the

rest of the instructions object code is listed

The text records are written to the object program Once the whole program is assembled and

when the END directive is encountered the End record is written

Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. The SIC machine has only one instruction format and therefore few addressing modes. All of its instructions take a single operand, and that operand is always a memory reference, which is slow because every operand has to be fetched from memory. The improved SIC/XE machine provides more instruction formats and hence more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change with it. The design of an assembler therefore has to consider two kinds of features: machine-dependent features and machine-independent features.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

This section considers the design and implementation of an assembler for the more complex XE version of SIC.

200           .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205           .
210   WRREC   CLEAR    X           CLEAR LOOP COUNTER
212           LDT      LENGTH
215   WLOOP   TD       OUTPUT      TEST OUTPUT DEVICE
220           JEQ      WLOOP       LOOP UNTIL READY
225           LDCH     BUFFER,X    GET CHARACTER FROM BUFFER
230           WD       OUTPUT      WRITE CHARACTER
235           TIXR     T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240           JLT      WLOOP
245           RSUB
250   OUTPUT  BYTE     X'05'
255           END      FIRST

Figure 2.4 Example program for SIC/XE

Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label   Opcode  Operand     Object Code
5     0000  COPY    START   0
10    0000  FIRST   STL     RETADR      17202D
12    0003          LDB     #LENGTH     69202D
13                  BASE    LENGTH
15    0006  CLOOP   +JSUB   RDREC       4B101036
20    000A          LDA     LENGTH      032026
25    000D          COMP    #0          290000
30    0010          JEQ     ENDFIL      332007
35    0013          +JSUB   WRREC       4B10105D
40    0017          J       CLOOP       3F2FEC
45    001A  ENDFIL  LDA     EOF         032010
50    001D          STA     BUFFER      0F2016
55    0020          LDA     #3          010003
60    0023          STA     LENGTH      0F200D
65    0026          +JSUB   WRREC       4B10105D
70    002A          J       @RETADR     3E2003
80    002D  EOF     BYTE    C'EOF'      454F46
95    0030  RETADR  RESW    1
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
110                 .
115                 .       SUBROUTINE TO READ RECORD INTO BUFFER
120                 .
125   1036  RDREC   CLEAR   X           B410
130   1038          CLEAR   A           B400
132   103A          CLEAR   S           B440
133   103C          +LDT    #4096       75101000
135   1040  RLOOP   TD      INPUT       E32019
140   1043          JEQ     RLOOP       332FFA
145   1046          RD      INPUT       DB2013
150   1049          COMPR   A,S         A004
155   104B          JEQ     EXIT        332008
160   104E          STCH    BUFFER,X    57C003
165   1051          TIXR    T           B850
170   1053          JLT     RLOOP       3B2FEA
175   1056  EXIT    STX     LENGTH      134000
180   1059          RSUB                4F0000
185   105C  INPUT   BYTE    X'F1'       F1
195                 .
200                 .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205                 .
210   105D  WRREC   CLEAR   X           B410
212   105F          LDT     LENGTH      774000
215   1062  WLOOP   TD      OUTPUT      E32011
220   1065          JEQ     WLOOP       332FFA
225   1068          LDCH    BUFFER,X    53C003
230   106B          WD      OUTPUT      DF2008
235   106E          TIXR    T           B850
240   1070          JLT     WLOOP       3B2FEF
245   1073          RSUB                4F0000
250   1076  OUTPUT  BYTE    X'05'       05
255                 END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable, the word size is 3 bytes, and the total memory size is 2^15 bytes. It supports only one instruction format. Operands are addressed using only register A and the index register X, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

1-byte instruction

2-byte instruction

3-byte instruction

4-byte instruction

Instructions can be

Register-to-register instructions

Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

Extended-format instructions

Addressing modes are

PC-relative or base-relative addressing: op m

Indirect addressing: op @m

Immediate addressing: op #c

Extended format: +op m

Indexed addressing: op m,x

Register-to-register instructions

Larger memory -> multiprogramming (program allocation)
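As a rough illustration of how an assembler might recognize the notations listed above, the following Python sketch maps the prefixes and the `,X` suffix onto the n, i, x and e flag bits of the SIC/XE instruction word. The flag encoding is the standard SIC/XE one covered in Chapter 1; the function name `addressing_flags` and the dictionary it returns are assumptions made for this example only.

```python
def addressing_flags(opcode: str, operand: str) -> dict:
    """Derive the n, i, x, e flags from the source notation of one statement."""
    e = 1 if opcode.startswith("+") else 0      # +op m  -> extended format
    x = 0
    if operand.upper().endswith(",X"):          # op m,x -> indexed addressing
        x = 1
        operand = operand[:-2]
    if operand.startswith("#"):                 # op #c  -> immediate
        n, i = 0, 1
        operand = operand[1:]
    elif operand.startswith("@"):               # op @m  -> indirect
        n, i = 1, 0
        operand = operand[1:]
    else:                                       # op m   -> simple addressing
        n, i = 1, 1
    return {"n": n, "i": i, "x": x, "e": e, "operand": operand}


if __name__ == "__main__":
    for op, arg in [("J", "@RETADR"), ("+LDT", "#4096"), ("STCH", "BUFFER,X")]:
        print(op, arg, addressing_flags(op, arg))
```

The b and p bits are not set here because they depend on the displacement calculation, which happens later in Pass 2.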

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the register names can be entered into the symbol table itself, with their equivalent numeric codes as values. During Pass 2 these values are assembled together with the object code of the mnemonic. If required, a separate table can be created to hold the register names and their equivalent numeric values.
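A minimal sketch of this translation, assuming a separate register table as suggested above. The register numbers and the four opcodes are the standard SIC/XE values; the table names and `assemble_format2` are hypothetical names used only for this example.

```python
# Standard SIC/XE register numbers.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6, "PC": 8, "SW": 9}

# A few format-2 (register-to-register) opcodes; not a complete OPTAB.
FORMAT2_OPCODES = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8, "RMO": 0xAC}


def assemble_format2(mnemonic: str, operand: str) -> str:
    """Assemble a 2-byte instruction: 8-bit opcode, then two 4-bit register codes."""
    regs = [r.strip() for r in operand.split(",")]
    r1 = REGISTERS[regs[0]]
    r2 = REGISTERS[regs[1]] if len(regs) > 1 else 0  # unused second field is 0
    return f"{FORMAT2_OPCODES[mnemonic]:02X}{r1:X}{r2:X}"


if __name__ == "__main__":
    print(assemble_format2("COMPR", "A,S"))
    print(assemble_format2("TIXR", "T"))
```

Keeping the register codes in their own dictionary avoids mixing them with user-defined symbols in SYMTAB, at the cost of one extra lookup table.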

2. Translation of register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For details of the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 are register-memory instructions: one operand is always in a register and the other operand is in memory. The addressing mode specifies how the operand is to be fetched from memory.

There are two ways of calculating a relative address:

1) Program-counter relative and

2) Base-relative

A memory operand can be specified using either the format 3 or the format 4 instruction format.

In format 3 the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4 the instruction contains the opcode followed by a 20-bit address field, which is large enough to hold the full address, so no displacement is needed.

1. Program-Counter Relative

a) This mode normally uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed value in the range -2048 to +2047, so that it fits in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter, which already holds the address of the next instruction when the displacement is applied.

e) As an example of the calculation, take the statement LDA LENGTH at location 000A, with LENGTH at address 0033. The program counter holds 000D, so disp = 0033 - 000D = 026 and the object code is 032026.
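The displacement calculation can be written out as a short Python sketch. The function name `pc_relative_disp` is an assumption for illustration; the addresses in the usage lines come from the example program (LENGTH at 0033, CLOOP at 0006).

```python
from typing import Optional


def pc_relative_disp(target: int, loc: int, length: int = 3) -> Optional[str]:
    """Return the 12-bit displacement as 3 hex digits, or None if out of range.

    loc is the address of the instruction itself; the program counter has
    already advanced past it (loc + length) when the displacement is applied.
    """
    disp = target - (loc + length)
    if -2048 <= disp <= 2047:
        return f"{disp & 0xFFF:03X}"   # negative values in two's complement
    return None


if __name__ == "__main__":
    print(pc_relative_disp(0x0033, 0x000A))   # LDA LENGTH at 000A
    print(pc_relative_disp(0x0006, 0x0017))   # J CLOOP at 0017 (backward jump)
    print(pc_relative_disp(0x0033, 0x1056))   # STX LENGTH at 1056: too far
```

A return value of `None` signals that the assembler has to fall back to another addressing mode.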

2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register B. The target address is therefore

TA = (B) + disp

where disp is an unsigned 12-bit value in the range 0 to 4095.

b) This mode is used when the target is too far from the instruction for the PC-relative displacement range, so the operand address is no longer expressed relative to the instruction itself. The programmer tells the assembler what the base register will contain by using the BASE directive. From that statement onward the assembler may use base-relative addressing to calculate displacements.

c) The NOBASE directive informs the assembler that the base register can no longer be relied on for addressing. The assembler first tries PC-relative addressing; only when the displacement does not fit does it use base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

For example:

12    0003          LDB     #LENGTH     69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X    57C003
165   1051          TIXR    T           B850

In the above example the BASE directive tells the assembler that base-relative addressing can be used to calculate target addresses whenever PC-relative addressing is not possible.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register at run time. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160:

BUFFER is at location 0036 (hexadecimal)

(B) = 0033 (hexadecimal)

disp = 0036 - 0033 = 003 (hexadecimal)

20    000A          LDA     LENGTH      032026
175   1056  EXIT    STX     LENGTH      134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA

This is -4134 in decimal, which does not fit in the signed 12-bit displacement field, so PC-relative addressing is not applicable. Base-relative addressing is used instead: disp = 0033 - 0033 = 0, which gives the object code 134000.
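The choice between the two modes can be sketched as follows. This is a simplified illustration for simple addressing only (n = i = 1); `assemble_format3` is a hypothetical name, and the opcode values passed in the usage lines (STCH = 54, STX = 10, LDA = 00) are the standard SIC/XE opcodes.

```python
from typing import Optional


def assemble_format3(opcode: int, target: int, loc: int,
                     base: Optional[int] = None, indexed: bool = False) -> str:
    """Assemble a format-3 instruction, trying PC-relative before base-relative."""
    first_byte = opcode | 0b11            # n = 1, i = 1 (simple addressing)
    x = 0b1000 if indexed else 0
    disp = target - (loc + 3)             # PC points at the next instruction
    if -2048 <= disp <= 2047:
        flags = x | 0b0010                # p = 1
        disp &= 0xFFF
    elif base is not None and 0 <= target - base <= 4095:
        flags = x | 0b0100                # b = 1
        disp = target - base
    else:
        raise ValueError("operand out of range; format 4 would be needed")
    return f"{first_byte:02X}{flags:X}{disp:03X}"


if __name__ == "__main__":
    # Line 160: STCH BUFFER,X with (B) = 0033
    print(assemble_format3(0x54, 0x0036, 0x104E, base=0x0033, indexed=True))
    # Line 175: STX LENGTH with (B) = 0033
    print(assemble_format3(0x10, 0x0033, 0x1056, base=0x0033))
```

Passing `base=None` models a program without a BASE directive (or after NOBASE), in which case only PC-relative addressing is attempted.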

3. Immediate Addressing Mode

In this mode no memory reference is involved: the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative addressing. For example, the statement LDB #LENGTH at location 0003 has disp = 0033 - 0006 = 02D and is assembled as 69202D.

4. Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of a location that contains the address of the operand. The address of that location is found using PC-relative addressing.

For example, consider the statement J @RETADR on line 70.

The instruction transfers control to the address stored in location RETADR. If the instruction is at location 002A and RETADR is at 0030, the displacement is 0030 - 002D = 003.
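To make the difference between immediate, indirect and simple addressing concrete, the sketch below models how the target address (TA) is interpreted at run time for each setting of the n and i bits. It is an illustration only: `resolve` and `read_word` are hypothetical helpers, and the memory contents are invented for the example.

```python
def read_word(memory: bytes, address: int) -> int:
    """Read one 3-byte SIC word stored at the given address."""
    return int.from_bytes(memory[address:address + 3], "big")


def resolve(memory: bytes, ta: int, n: int, i: int):
    """Interpret the target address according to the n and i flag bits."""
    if (n, i) == (0, 1):                  # immediate: TA itself is the operand
        return ("value", ta)
    if (n, i) == (1, 0):                  # indirect: the word at TA is an address
        return ("address", read_word(memory, ta))
    return ("address", ta)                # simple: the operand is stored at TA


if __name__ == "__main__":
    memory = bytearray(0x40)
    # Invented content: RETADR (at 0030) holds the return address 000009.
    memory[0x30:0x33] = (0x000009).to_bytes(3, "big")
    print(resolve(memory, 3, n=0, i=1))        # LDA #3
    print(resolve(memory, 0x30, n=1, i=0))     # J @RETADR
    print(resolve(memory, 0x33, n=1, i=1))     # LDA LENGTH
```

For a jump such as J @RETADR, the address returned in the indirect case is where execution continues.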

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute address

The starting address of the program is specified at assembly time.

The addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes along with the displacement of the program, so the address portion of the instruction must be changed before the program can be loaded and executed at location 3000.

Some instructions have operand addresses that change as the program load address changes, while other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example of Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address field of the JSUB instruction is modified to 06036, the new address of RDREC.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.
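The arithmetic behind the three cases can be checked with a short Python sketch. `relocate_format4` is a hypothetical helper written for this illustration; it simply adds the load address to the 20-bit address field of a format-4 instruction and leaves the opcode and flag bits untouched.

```python
def relocate_format4(instruction: int, load_address: int) -> int:
    """Add the load address to the 20-bit address field of a 4-byte instruction."""
    address = (instruction & 0xFFFFF) + load_address
    if address > 0xFFFFF:
        raise ValueError("relocated address does not fit in 20 bits")
    return (instruction & 0xFFF00000) | address


if __name__ == "__main__":
    jsub = 0x4B101036   # +JSUB RDREC assembled relative to address 0000
    for load_address in (0x0000, 0x5000, 0x7420):
        print(f"{load_address:04X}: {relocate_format4(jsub, load_address):08X}")
```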

The rest of the instructions need not be modified, because their operands are either

o Not a memory address (immediate addressing)

o PC-relative or base-relative

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader which fields are addresses.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, the address is assigned relative to the start of the program (START 0).

A Modification record is produced to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

One Modification record is produced for each address to be modified.

The length is stored in half-bytes (4 bits each).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that must change if relocation occurs. For example, M00000705 is the Modification record for the instruction at location 0006: its address field begins in the byte at location 0007 and is 5 half-bytes long.
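A sketch of both sides of this mechanism: the assembler producing a Modification record for a format-4 instruction, and the loader applying it. The record layout assumed here (`M`, 6-digit starting location, 2-digit length in half-bytes) matches the M00000705 example above; the function names and the in-memory `image` are assumptions for illustration.

```python
def modification_record(instr_loc: int) -> str:
    """Assembler side: record for a format-4 instruction assembled at instr_loc.

    The 20-bit address field starts in the second byte and is 5 half-bytes long.
    """
    return f"M{instr_loc + 1:06X}05"


def apply_modification(image: bytearray, record: str, load_address: int) -> None:
    """Loader side: add the load address to the field named by the record."""
    start = int(record[1:7], 16)
    half_bytes = int(record[7:9], 16)
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(image[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1          # an odd length skips the first half-byte
    updated = (field & ~mask) | ((field + load_address) & mask)
    image[start:start + nbytes] = updated.to_bytes(nbytes, "big")


if __name__ == "__main__":
    image = bytearray(10)
    image[6:10] = bytes.fromhex("4B101036")     # +JSUB RDREC at location 0006
    record = modification_record(0x0006)
    apply_modification(image, record, 0x5000)
    print(record, image[6:10].hex().upper())
```

Masking the field keeps the flag bits that share the first byte with the address unchanged, which is why the length is counted in half-bytes.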

MACHINE-INDEPENDENT ASSEMBLER FEATURES

Assembler features not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL  LDA     =C'EOF'     032010
      002D  *       =C'EOF'             454F46
215   1062  WLOOP   TD      =X'05'      E32011
      1076  *       =X'05'              05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93                  LTORG
      002D  *       =C'EOF'             454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for the current value of the location counter

• The value of the location counter can be denoted by a literal operand:

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal value, as on line 160 above. Recognizing such operands will complicate the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address being assigned.

• When a LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that has not yet been assigned one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
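The Pass 1 handling of literals can be sketched with a small table class. This is one possible arrangement, not a prescribed design: `LitTab`, `literal_value` and the dictionary layout are hypothetical, duplicates are detected by literal name only (the simple string comparison mentioned above), and a Python dict stands in for the hash table.

```python
def literal_value(name: str):
    """Return (hex value, length in bytes) for a literal such as =C'EOF' or =X'05'."""
    kind, body = name[1], name[3:-1]
    if kind == "C":
        hex_value = "".join(f"{ord(ch):02X}" for ch in body)
    elif kind == "X":
        hex_value = body.upper()
    else:
        raise ValueError(f"unsupported literal: {name}")
    return hex_value, len(hex_value) // 2


class LitTab:
    def __init__(self):
        self.entries = {}   # literal name -> value, length, address

    def add(self, name: str) -> None:
        """Pass 1: enter a literal once, leaving its address unassigned."""
        if name not in self.entries:
            value, length = literal_value(name)
            self.entries[name] = {"value": value, "length": length, "address": None}

    def place_pool(self, locctr: int) -> int:
        """At LTORG or END: assign addresses and return the updated LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


if __name__ == "__main__":
    littab = LitTab()
    littab.add("=C'EOF'")
    littab.add("=C'EOF'")                 # duplicate, ignored
    print(hex(littab.place_pool(0x002D)), littab.entries)
```

Literals that already have an address are skipped by `place_pool`, so each LTORG only places the literals collected since the previous pool.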

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the value "value".

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. For example,

+LDT     #4096

can be written as

MAXLEN   EQU     4096
         +LDT    #MAXLEN

Define mnemonic names for registers, so that RMO 0,1 can be written as RMO A,X:

A        EQU     0
X        EQU     1
L        EQU     2

Establish and use names that reflect the logical function of the registers in the program:

BASE         EQU     R1
ACCUMULATOR  EQU     R2
INDEX        EQU     R3

Assembler directive ORG

The assembler directive ORG is usually used to assign values to symbols indirectly.

ORG value

"value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use of ORG for label definition

Suppose that we want to define a table with the following structure:

STAB (100 entries)
SYMBOL    VALUE     FLAGS
6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR.
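The effect of ORG on the location counter can be simulated in a few lines of Python. The sketch assumes the field sizes from the table above (6, 3 and 1 bytes, so 10 bytes per entry), a made-up starting address of 1000 (hex) for STAB, and hypothetical names `Pass1State`, `define` and `org`.

```python
class Pass1State:
    """Tiny model of LOCCTR and SYMTAB during Pass 1."""

    def __init__(self, locctr: int = 0):
        self.locctr = locctr
        self.saved = None      # previous LOCCTR remembered by ORG
        self.symtab = {}

    def define(self, label: str, size: int) -> None:
        """Model 'label RESB size': label takes the current LOCCTR value."""
        self.symtab[label] = self.locctr
        self.locctr += size

    def org(self, value=None) -> None:
        """ORG value resets LOCCTR; ORG with no operand restores the saved value."""
        if value is None:
            if self.saved is None:
                raise ValueError("no saved LOCCTR to restore")
            self.locctr, self.saved = self.saved, None
        else:
            self.saved, self.locctr = self.locctr, value


if __name__ == "__main__":
    state = Pass1State(0x1000)             # hypothetical address of STAB
    state.define("STAB", 100 * 10)         # reserve the whole table
    state.org(state.symtab["STAB"])        # ORG STAB
    state.define("SYMBOL", 6)
    state.define("VALUE", 3)
    state.define("FLAGS", 1)
    state.org()                            # ORG: back to the end of the table
    print({k: hex(v) for k, v in state.symtab.items()}, hex(state.locctr))
```

After the sequence, SYMBOL, VALUE and FLAGS label the fields of the first entry, so the fields of other entries can be reached with indexed addressing.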

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

Allowed:

ALPHA   RESW    1
BETA    EQU     ALPHA

Disallowed:

BETA    EQU     ALPHA
ALPHA   RESW    1

Disallowed:

ALPHA   EQU     BETA
BETA    EQU     DELTA
DELTA   RESW    1

Disallowed:

        ORG     ALPHA
BYTE1   RESB    1
BYTE2   RESB    1
BYTE3   RESB    1
        ORG
ALPHA   RESB    1
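The restriction on forward references can be demonstrated with a minimal Pass 1 sketch. It handles only EQU, RESW and RESB, and the function name `pass1_define` and the tuple format of the statements are assumptions made for this example.

```python
def pass1_define(statements):
    """Build SYMTAB from (label, opcode, operand) tuples.

    Raises ValueError when an EQU operand has not been defined yet,
    which is the case an ordinary two-pass assembler cannot handle.
    """
    symtab, locctr = {}, 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand.isdigit():
                value = int(operand)
            elif operand in symtab:
                value = symtab[operand]
            else:
                raise ValueError(f"{label} EQU {operand}: {operand} is not yet defined")
            symtab[label] = value
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            symtab[label] = locctr
            locctr += int(operand)
    return symtab


if __name__ == "__main__":
    print(pass1_define([("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]))
    try:
        pass1_define([("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")])
    except ValueError as error:
        print("error:", error)
```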

Expressions

Most assemblers allow the use of expressions wherever a single operand such as a label or literal is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - the value of an absolute term is independent of the program location.

Relative terms - the value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
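The pairing rule can be turned into a simple counting check: add 1 for every relative term with a positive sign and subtract 1 for every relative term with a negative sign. A total of 0 means the expression is absolute, a total of 1 means it is relative, and anything else is an error. The sketch below applies this to the symbol table shown above; `classify` and the table layout are hypothetical, and the parser is deliberately minimal (no parentheses, integer constants in decimal).

```python
import re

# Type and value (hexadecimal) of each symbol, taken from the table above.
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def term_value(token, symtab):
    return symtab[token][1] if token in symtab else int(token)


def classify(expr, symtab=SYMTAB):
    """Return ('absolute' | 'relative' | 'error', value or None)."""
    value = relative = 0
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        s = -1 if sign == "-" else 1
        parts = re.findall(r"[*/]|[^*/]+", term)
        operands = parts[0::2]
        is_rel = [p in symtab and symtab[p][0] == "R" for p in operands]
        if any(is_rel) and len(operands) > 1:
            return "error", None               # relative term used with * or /
        v = term_value(operands[0], symtab)
        for op, p in zip(parts[1::2], operands[1:]):
            v = v * term_value(p, symtab) if op == "*" else v // term_value(p, symtab)
        value += s * v
        relative += s * is_rel[0]
    if relative == 0:
        return "absolute", value
    if relative == 1:
        return "relative", value
    return "error", None


if __name__ == "__main__":
    for e in ("BUFEND-BUFFER", "RETADR+3", "BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"):
        print(e, classify(e))
```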

Program blocks

The program is divided into three blocks: Default (0), CDATA (1), and CBLKS (2).

Default (0): block for executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large blocks of memory

A block number column is included in the listing when object code is generated, and this field is also added to SYMTAB. A USE statement with the block name marks the point where the source program switches to a block. Each block has its own LOCCTR, which is set to 0000 when the block is first encountered.

The program block table consists of the block name, block number, starting address, and length.

Once the location counter values are known, the object code is generated and the object program is written.
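Building the block table at the end of Pass 1 amounts to laying the blocks out one after another. In the sketch below the block lengths are illustrative values chosen for the example, not figures taken from this section, and both function names are hypothetical.

```python
def assign_block_addresses(block_lengths):
    """Build the block table: name -> (block number, starting address, length).

    block_lengths maps block names to lengths, in block-number order.
    """
    table, start = {}, 0
    for number, (name, length) in enumerate(block_lengths.items()):
        table[name] = (number, start, length)
        start += length                    # next block begins where this one ends
    return table


def symbol_address(table, block, relative_address):
    """Pass 2: absolute address = block starting address + address within block."""
    return table[block][1] + relative_address


if __name__ == "__main__":
    # Illustrative lengths only.
    table = assign_block_addresses({"(default)": 0x66, "CDATA": 0x0B, "CBLKS": 0x1000})
    for name, (number, start, length) in table.items():
        print(f"{name:<10} {number}  {start:04X}  {length:04X}")
    print(f"{symbol_address(table, 'CDATA', 0x03):04X}")
```

Because SYMTAB stores each symbol's block number together with its address relative to the start of that block, this one addition is all Pass 2 needs to obtain the final address.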

Program Block Diagram

Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive: CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives: EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for instructions that refer to EXTREF symbols is changed.

The object program needs six types of record: Header record, Define record, Refer record, Text record, Modification record, and End record.
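The three record types that carry linking information can be sketched as simple string formatters. The layouts assumed here are the usual SIC/XE conventions (6-character names padded with blanks, 6-digit hexadecimal addresses, and a Modification record extended with a sign and an external symbol name); they are not spelled out in this section, and the symbol names and addresses in the usage lines are illustrative only.

```python
def define_record(symbols):
    """Define record: 'D', then a (name, relative address) pair per EXTDEF symbol."""
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols.items())


def refer_record(names):
    """Refer record: 'R', then the name of each EXTREF symbol."""
    return "R" + "".join(f"{name:<6}" for name in names)


def modification_record(start, half_bytes, sign, symbol):
    """Modification record that adds (+) or subtracts (-) an external symbol's value."""
    return f"M{start:06X}{half_bytes:02X}{sign}{symbol}"


if __name__ == "__main__":
    # Illustrative symbols and addresses.
    print(define_record({"BUFFER": 0x33, "BUFEND": 0x1033, "LENGTH": 0x2D}))
    print(refer_record(["RDREC", "WRREC"]))
    print(modification_record(0x000004, 5, "+", "RDREC"))
```

The assembler leaves the address field of an instruction that uses an external symbol as zero; the Modification record tells the loader which symbol value to add once all control sections have been placed in memory.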

Object Program for Control Section and Program Linking

LALITHAMBIGAIB APIT Page 45

Page 16: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - IIwrite Header record to object program

initialize 1st Text record

while OPCODE = lsquoENDrsquo do

begin

if this is not comment line then

begin

search OPTAB for OPCODE

if found then

begin

if there is a symbol in OPERAND field then

begin

search SYMTAB for OPERAND

if found then

store symbol value as operand address

else

begin

store 0 as operand address

set error flag (undefined symbol)

end

end if symbol

else

store 0 as operand address

assemble the object code instruction

end if opcode found

else if OPCODE = lsquoBYTErsquo or lsquoWORDrdquo then

convert constant to object code

if object code doesnrsquot fit into current Text record then

begin

Write text record to object code

initialize new Text record

end

LALITHAMBIGAIB APIT Page 16

SYSTEM SOFTWARE ( CS 2304) UNIT - IIadd object code to Text record

end if not comment

write listing line

read next input line

end while not END

Write last Text record to object program

Write End record to object program

Write last listing line

End Pass 2

Explanation ndash PASS II Algorithm

Here the first input line is read from the intermediate file

If the opcode is START then this line is directly written to the list file

A header record is written in the object program which gives the starting address and the

length of the program (which is calculated during pass 1)

Then the first text record is initialized Comment lines are ignored

In the instruction for the opcode the OPTAB is searched to find the object code

If a symbol is there in the operand field the symbol table is searched to get the address value

for this which gets added to the object code of the opcode

If the address not found then zero value is stored as operands address An error flag is set

indicating it as undefined

If symbol itself is not found then store 0 as operand address and the object code instruction is

assembled

If the opcode is BYTE or WORD then the constant value is converted to its equivalent

object code( for example for character EOF its equivalent hexadecimal value lsquo454f46rsquo is

stored)

If the object code cannot fit into the current text record a new text record is created and the

rest of the instructions object code is listed

The text records are written to the object program Once the whole program is assembled and

when the END directive is encountered the End record is written

LALITHAMBIGAIB APIT Page 17

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Design and Implementation Issues

Some of the features in the program depend on the architecture of the machine If the program is for

SIC machine then we have only limited instruction formats and hence limited addressing modes

We have only single operand instructions The operand is always a memory reference Anything to

be fetched from memory requires more time Hence the improved version of SICXE machine

provides more instruction formats and hence more addressing modes The moment we change the

machine architecture the availability of number of instruction formats and the addressing modes

changes Therefore the design usually requires considering two things Machine-dependent features

and Machine-independent features

22 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

In this section the design and implementation of an assembler for the more complex XE version of

SIC is considered

LALITHAMBIGAIB APIT Page 18

SYSTEM SOFTWARE ( CS 2304) UNIT - II

200 SUBROUTINE TO WRITE RECORD FROM BUFFER

205

210 WRREC CLEAR X CLEAR LOOP COUNTER

212 LDT LENGTH

215 WLOOP TD OUTPUT TEST OUTPUT DEVICE

220 JEQ WLOOP LOOP UNTIL READY

225 LDCH BUFFER X GET CHARACTER FROM BUFFER

230 WD OUTPUT WRITE CHARACTER

235 TIXR T LOOP UNTIL ALL CHARACTERS

HAVE BEEN WRITTEN

240 JLT WLOOP

245 RSUB

250 OUTPUT BYTE Xrsquo05rsquo

255 END FIRST

Figure 24 Example program of a SICXE

LALITHAMBIGAIB APIT Page 19

SYSTEM SOFTWARE ( CS 2304) UNIT - II

The above figure 23 shows the example program from figure 21 as it is might be rewritten to

take advantage of the SICXE instruction set

Indirect addressing is indicated by adding the prefix to the operand (line70)

Immediate operands are denoted with the prefix (lines 25 55133)

Instructions that refer to memory are normally assembled using either the program counter

relative or base counter relative mode

The assembler directive BASE (line 13) is used in conjunction with base relative addressing

The four byte extended instruction format is specified with the prefix + added to the

operation code in the source statement

Register-to-register instructions are used wherever possible For example the statement on

line 150 is changed from COMP ZERO to COMPR AS

Immediate and indirect addressing have also been used as much as possible

Register-to-register instructions are faster than the corresponding register-to-memory

operations because they are shorter and do not require another memory reference

While using immediate addressing the operand is already present as part of the instruction

and need not be fetched from anywhere

The use of indirect addressing often avoids the need for another instruction

Line Loc Label Opcode Operand Object Code

5 COPY START 0

10 FIRST STL RETADR

12 LDB LENGTH

13 BASE LENGTH

15 CLOOP +JSUB RDREC

20 LDA LENGTH

25 COMP 0

30 JEQ ENDFIL

35 +JSUB WRREC

40 J CLOOP

45 ENDFIL LDA EOF

LALITHAMBIGAIB APIT Page 20

SYSTEM SOFTWARE ( CS 2304) UNIT - II50 STA BUFFER

55 LDA 3

60 STA LENGTH

65 +JSUB WRREC

70 J RETADR

80 EOF BYTE CrsquoEOFrsquo

95 RETADR RESW 1

100 LENGTH RESW 1

105 BUFFER RESB 4096

110

SUBROUTINE TO READ RECORD INTO

BUFFER

115

120

125 RDREC CLEAR X

130 CLEAR A

132 CLEAR S

133 +LDT 4096

135 RLOOP TD INPUT

140 JEQ RLOOP

145 RD INPUT

150 COMPR A S

155 JEQ EXIT

160 STCH BUFFERX

165 TIXR T

170 JLT RLOOP

175 EXIT STX LENGTH

180 RSUB

185 INPUT BYTE XrsquoF1rsquo

195

SUBROUTINE TO WRITE RECORD

FROM BUFFER

200

205

LALITHAMBIGAIB APIT Page 21

SYSTEM SOFTWARE ( CS 2304) UNIT - II210 WRREC CLEAR X

212 LDT LENGTH

215 WLOOP TD OUTPUT

220 JEQ WLOOP

225 LDCH BUFFERX

230 WD OUTPUT

235 TIXR T

240 JLT WLOOP

245 RSUB

250 OUTPUT BYTE Xrsquo05rsquo

255 END FIRST

Figure 25 Program from figure 24(SICXE) with object code

221 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory

In SIC machine the memory is byte addressable Word size is 3 bytes So the size of the

memory is 212 bytes Accordingly it supports only one instruction format It has only two

registers register A and Index register Therefore the addressing modes supported by this

architecture are direct and indexed

Whereas the memory of a SICXE machine is 220 bytes (1 MB) This supports four different

types of instruction types they are

1 byte instruction

2 byte instruction

3 byte instruction

4 byte instruction

LALITHAMBIGAIB APIT Page 22

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Instructions can be

Instructions involving register to register

Instructions with one operand in memory the other in Accumulator (Single

operand instruction)

Extended instruction format

Addressing Modes are

PC-relative or Base-relative addressing op m

Indirect addressing op m

Immediate addressing op c

Extended format +op m

Index addressing op mx

register-to-register instructions

larger memory -gt multi-programming (program allocation)

Translation

1 Translations for the Instruction involving Register-Register addressing mode

During pass 1 the registers can be entered as part of the symbol table itself The value for these

registers is their equivalent numeric codes During pass 2 these values are assembled along with the

mnemonics object code If required a separate table can be created with the register names and their

equivalent numeric values

2 Translation involving Register-Memory instructions

In SICXE machine there are four instruction formats and five addressing modes For formats and

addressing modes refer chapter 1

LALITHAMBIGAIB APIT Page 23

SYSTEM SOFTWARE ( CS 2304) UNIT - IIAmong the instruction formats format -3 and format-4 instructions are Register-Memory type of

instruction One of the operand is always in a register and the other operand is in the memory The

addressing mode tells us the way in which the operand from the memory is to be fetched

There are two ways:

1) Program-counter relative and

2) Base-relative

A register-memory instruction can be written in either format 3 or format 4.

In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4, the instruction contains the opcode followed by a 20-bit address field, which is large enough to hold a full memory address.

1 Program-Counter Relative

a) This mode normally uses the format-3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed value in the range -2048 to 2047, so that it fits in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter, which holds the address of the next instruction.

e) For example, LDB #LENGTH at location 0003 is followed by the instruction at location 0006, and LENGTH is at location 0033, so the displacement is 0033 - 0006 = 02D.
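The PC-relative calculation can be sketched in a few lines. This is an illustrative helper, not part of any specific assembler; the name `pc_relative_disp` is an assumption. It takes the target address and the address of the next instruction (the value the program counter holds while the instruction executes) and returns the 12-bit field, or None when the displacement does not fit.

```python
def pc_relative_disp(target: int, next_instruction: int):
    """Return the 12-bit two's-complement displacement, or None if out of range."""
    disp = target - next_instruction
    if -2048 <= disp <= 2047:
        return disp & 0xFFF      # negative values wrap into 12 bits
    return None


# LDB #LENGTH at 0003 (next instruction at 0006), LENGTH at 0033.
print(f"{pc_relative_disp(0x0033, 0x0006):03X}")
# LDA LENGTH at 000A (next instruction at 000D).
print(f"{pc_relative_disp(0x0033, 0x000D):03X}")
# STX LENGTH at 1056 (next instruction at 1059): too far for PC-relative.
print(pc_relative_disp(0x0033, 0x1059))
```

The first two results match the displacement fields of the object codes 69202D and 032026 in the listing; the third case is the one that forces the assembler to fall back on base-relative addressing.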


2 Base-Relative Addressing Mode

a) In this mode the displacement is measured from the contents of the base register B. The target address is therefore

TA = (B) + disp

b) This addressing mode is used when the range of the PC-relative displacement is not sufficient. The operand is then located relative to the base register rather than relative to the instruction, and the displacement is an unsigned value from 0 to 4095. The programmer announces the contents of the base register with the directive BASE. Once the assembler has encountered this directive, it may use base-relative addressing to calculate the target address of an operand.

c) The NOBASE directive tells the assembler that the base register can no longer be used to calculate the target address of an operand. The assembler first tries PC-relative addressing; only when the displacement does not fit does it use base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE


For example

Line    Loc     Label     Opcode    Operand      Object code
12      0003              LDB       #LENGTH      69202D
13                        BASE      LENGTH
100     0033    LENGTH    RESW      1
105     0036    BUFFER    RESB      4096
160     104E              STCH      BUFFER,X     57C003
165     1051              TIXR      T            B850

In the above example the BASE directive tells the assembler that base-relative addressing can be used to calculate target addresses whenever PC-relative addressing is not possible. The address of LENGTH is held in the base register.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For the STCH instruction on line 160:

BUFFER is at location 0036 (hexadecimal)

(B) = 0033 (hexadecimal)

disp = 0036 - 0033 = 0003 (hexadecimal)

Line    Loc     Label     Opcode    Operand      Object code
20      000A              LDA       LENGTH       032026
175     1056    EXIT      STX       LENGTH       134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA

This is a negative value too large to fit in the 12-bit displacement field, so PC-relative addressing is not applicable and the assembler uses base-relative addressing instead (disp = 0033 - 0033 = 000). Line 20, by contrast, is close enough to LENGTH to be assembled PC-relative.
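The choice between the two modes can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `assemble_format3` and its parameters are hypothetical, the operand is always a memory address, and the opcodes (LDA = 00, STX = 10, STCH = 54, LDB = 68) are those of the SIC/XE instruction set. The sketch tries PC-relative first and falls back to base-relative only if a base value has been announced with BASE.

```python
def assemble_format3(opcode, target, loc, base=None, indexed=False, n=1, i=1):
    """Assemble a format-3 instruction, trying PC-relative and then base-relative."""
    pc = loc + 3                       # PC holds the address of the next instruction
    x = 1 if indexed else 0
    disp = target - pc
    if -2048 <= disp <= 2047:          # PC-relative: signed 12-bit displacement
        b, p = 0, 1
    elif base is not None and 0 <= target - base <= 4095:
        b, p = 1, 0                    # base-relative: unsigned 12-bit displacement
        disp = target - base
    else:
        raise ValueError("displacement out of range; format 4 (+op) is needed")
    first_byte = (opcode & 0xFC) | (n << 1) | i
    xbpe = (x << 3) | (b << 2) | (p << 1)          # e = 0 in format 3
    return f"{first_byte:02X}{xbpe:X}{disp & 0xFFF:03X}"


LENGTH, BUFFER = 0x0033, 0x0036
print(assemble_format3(0x00, LENGTH, 0x000A))                              # line 20
print(assemble_format3(0x54, BUFFER, 0x104E, base=LENGTH, indexed=True))   # line 160
print(assemble_format3(0x10, LENGTH, 0x1056, base=LENGTH))                 # line 175
```

The three calls reproduce the object codes 032026, 57C003 and 134000 shown in the listing. Passing `base=None` models the situation after NOBASE, where an out-of-range operand becomes an error.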

3 Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative addressing. LDB #LENGTH on line 12 is of this kind: the address of LENGTH, not the contents of LENGTH, is loaded into register B.

4 Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of a location that contains the address of the operand. The address of that location is found using PC-relative addressing.

For example, the instruction J @RETADR transfers control to the address stored in location RETADR. If the address of RETADR is 0030, the assembler stores the displacement 003, which is 0030 minus the program counter value for this instruction.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute address

An absolute program has its starting address specified at assembly time.

Its addresses may be invalid if the program is loaded anywhere else.


Example

Consider an LDA instruction whose operand address is 102D. The statement says that register A is loaded with the value stored at location 102D.

Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address of the operand changes by the same displacement as the program. Hence we need to change the address portion of the instruction so that the program can be loaded and executed at location 3000.

Some instructions undergo a change in their operand address as the program load address changes, while other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler can identify for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.


Example of Program Relocation

1) The figure shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The JSUB instruction is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second part of the figure shows the program loaded at a new location, 5000. The address in the JSUB instruction is modified to the new location of RDREC, 6036.

3) Likewise, the third part of the figure shows that if the program is relocated to location 7420, the JSUB instruction has to be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.


The rest of the instructions need not be modified, because their operands are either:

o Not a memory address (immediate addressing)

o PC-relative or base-relative

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must therefore record some information to tell the loader which fields to modify.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, the address is assigned relative to the start of the program (START 0).

The assembler produces a Modification record that stores the starting location and the length of the address field to be modified.

This command for the loader must also be part of the object program.

Modification record

There is one Modification record for each address to be modified.

The length is stored in half-bytes (4 bits).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.


Relocatable Object Program

In the object program shown in the figure, the red boxes mark the addresses that need modification. The lines at the end of the object program are the Modification records for the instructions that must change if relocation occurs. M00000705 is the Modification record for the address field that starts at location 0007 and is 5 half-bytes long.
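To make the Modification record concrete, here is a minimal sketch of both sides: the assembler producing the record for a format-4 instruction, and a loader applying it. The function names `modification_record` and `apply_modification` are assumptions for illustration, and the object program is modelled simply as a bytearray indexed by relative address.

```python
def modification_record(instr_loc: int, half_bytes: int = 5) -> str:
    """Build the record for a format-4 instruction assembled at instr_loc."""
    # The 20-bit address field starts in the byte after the opcode byte.
    return f"M{instr_loc + 1:06X}{half_bytes:02X}"


def apply_modification(obj: bytearray, record: str, load_address: int) -> None:
    """Add the load address to the field described by one Modification record."""
    start = int(record[1:7], 16)
    length = int(record[7:9], 16)            # length in half-bytes
    nbytes = (length + 1) // 2
    field = int.from_bytes(obj[start:start + nbytes], "big")
    mask = (1 << (4 * length)) - 1           # keeps only the address half-bytes
    updated = (field & ~mask) | ((field + load_address) & mask)
    obj[start:start + nbytes] = updated.to_bytes(nbytes, "big")


# +JSUB RDREC assembled at relative location 0006 as 4B101036.
program = bytearray(6) + bytes.fromhex("4B101036")
record = modification_record(0x0006)
print(record)                                # M00000705
apply_modification(program, record, 0x5000)
print(program[6:10].hex().upper())           # 4B106036
```

With a field of 5 half-bytes, the upper half of the first byte (the flag bits of the instruction) is preserved by the mask, which is why the record can start in the middle of a byte.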

Machine-independent assembler features

These are assembler features that are not closely related to the machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as part of the instruction that uses it. Such an operand is called a literal.

45     001A    ENDFIL    LDA    =C'EOF'    032010
       002D              =C'EOF'           454F46

215    1062    WLOOP     TD     =X'05'     E32011
       1076              =X'05'            05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93             LTORG
       002D    =C'EOF'     454F46

The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for the current value of the location counter

The value of the location counter can be denoted by a literal operand:

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is to compare the character strings that define them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46' (literal operands with different literal names may have the same literal value)

Recognizing literal operands that have different literal names but the same literal value would complicate the design of the assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address being assigned.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered to obtain its address.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
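The Pass 1 handling of literals can be sketched with a dictionary keyed by literal name. The class `LitTab` and the helper `literal_bytes` are hypothetical names used for illustration only; the sketch handles just the =C'...' and =X'...' forms and compares literals by name, so =C'EOF' and =X'454F46' are stored as two separate entries.

```python
def literal_bytes(name: str) -> bytes:
    """Convert a literal such as =C'EOF' or =X'05' to its byte value."""
    kind, body = name[1].upper(), name[3:-1]
    if kind == "C":
        return body.encode("ascii")
    if kind == "X":
        return bytes.fromhex(body)
    raise ValueError(f"unsupported literal: {name}")


class LitTab:
    def __init__(self):
        # literal name -> value, length and (initially unassigned) address
        self.entries = {}

    def add(self, name: str) -> None:
        """Pass 1: record a literal operand unless it is already present."""
        if name not in self.entries:
            value = literal_bytes(name)
            self.entries[name] = {"value": value, "length": len(value), "address": None}

    def place_pool(self, locctr: int) -> int:
        """On LTORG or END: assign addresses and return the updated LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LitTab()
littab.add("=C'EOF'")
littab.add("=C'EOF'")                  # duplicate, ignored
locctr = littab.place_pool(0x002D)     # pool generated at 002D
print(f"{locctr:04X}", littab.entries["=C'EOF'"]["value"].hex().upper())
```

Because `place_pool` only touches entries without an address, it can be called once per LTORG and again at END without moving literals that were already placed.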

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Uses of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. For example,

+LDT #4096

can be written as

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers, so that RMO 0,1 can be written as RMO A,X:

A EQU 0

X EQU 1

L EQU 2

RMO A,X

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to assign values to symbols indirectly.

ORG value

"value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use of ORG for label definition

Suppose that we want to define a symbol table STAB of 100 entries with the following structure:

Field     Size
SYMBOL    6 bytes
VALUE     3 bytes
FLAGS     1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

Allowed:

ALPHA RESW 1
BETA EQU ALPHA

Disallowed:

BETA EQU ALPHA
ALPHA RESW 1

Disallowed:

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1

Disallowed:

ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1

Expressions

Most assemblers allow the use of expressions wherever a single operand such as a label or literal is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

• Individual terms in the expression may be

• constants

• user-defined symbols or

• special terms

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - the value of an absolute term is independent of the program location.

Relative terms - the value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1 Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2 Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol    Type    Value
MAXLEN    A       1000
BUFEND    R       1036
BUFFER    R       0036
RETADR    R       0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
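The pairing rule can be checked by counting relative terms with their signs. The sketch below is a simplified illustration, not a full expression evaluator: the function name `expression_type` and the `SYMTAB` layout are assumptions, parentheses and the special term * for the location counter are not handled, and any term not found in the table is treated as an absolute constant.

```python
import re

# Symbol -> (type, value); types and values taken from the table above.
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def expression_type(expr: str, symtab=SYMTAB) -> str:
    """Classify an expression as 'absolute', 'relative' or 'error'."""
    expr = expr.replace(" ", "")
    net_relative = 0
    # Split into signed additive terms, e.g. 100-BUFFER -> ('', '100'), ('-', 'BUFFER').
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr):
        factors = re.split(r"[*/]", term)
        relative = [f for f in factors if symtab.get(f, ("A",))[0] == "R"]
        if relative and len(factors) > 1:
            return "error"             # relative term in a multiplication or division
        if relative:
            net_relative += -1 if sign == "-" else 1
    if net_relative == 0:
        return "absolute"              # every relative term is paired
    if net_relative == 1:
        return "relative"              # one unpaired, positive relative term
    return "error"


for e in ["BUFEND-BUFFER", "RETADR", "BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"]:
    print(e, expression_type(e))
```

A net count of 0 corresponds to all relative terms being paired with opposite signs, and a net count of 1 to exactly one positive term being left over, which are the two legal cases described above.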

Program blocks

The program is divided into three blocks: the default block (block number 0), CDATA (block number 1) and CBLKS (block number 2).

The default block (0) contains the executable (mnemonic) instructions.

The CDATA block (1) contains the data areas that need little memory.

The CBLKS block (2) contains the data areas that need larger blocks of memory.

When object code is generated, the listing carries one more column, the block number. Each new block is introduced in the source by the USE directive followed by the block name; the first time a block is begun, its location counter (LOCCTR) is set to 0000. The block number is also added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address and length.

Once the location counter values have been found, the assembler generates the object code and writes the object program.
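The program block table can be built at the end of Pass 1 once the length of each block is known. The following is a minimal sketch; the function names and the block lengths used in the example are assumptions chosen for illustration, not values taken from these notes.

```python
def assign_block_addresses(block_lengths: dict) -> list:
    """Build the block table: name, number, starting address and length."""
    table = []
    start = 0
    # Blocks are numbered in the order in which they first appear.
    for number, (name, length) in enumerate(block_lengths.items()):
        table.append({"name": name, "number": number, "address": start, "length": length})
        start += length                  # the next block begins where this one ends
    return table


def absolute_address(table: list, block_number: int, relative_address: int) -> int:
    """Pass 2: convert a block-relative symbol value to a program-relative address."""
    return table[block_number]["address"] + relative_address


# Illustrative lengths (assumed): default 0x66 bytes, CDATA 0x0B, CBLKS 0x1000.
blocks = assign_block_addresses({"(default)": 0x66, "CDATA": 0x0B, "CBLKS": 0x1000})
for row in blocks:
    print(row["number"], row["name"], f"{row['address']:04X}", f"{row['length']:04X}")

# A symbol at relative address 0003 within block 1 (CDATA).
print(f"{absolute_address(blocks, 1, 0x0003):04X}")
```

Storing the block number with each symbol in SYMTAB is what makes the second function possible: the symbol's value stays block-relative until the starting addresses are known.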


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for references to EXTREF symbols changes.

The object program needs six record types: Header record, Define record, Refer record, Text record, Modification record and End record.

Object Program for Control Section and Program Linking

add object code to Text record

end {if not comment}

write listing line

read next input line

end {while not END}

Write last Text record to object program

Write End record to object program

Write last listing line

End {Pass 2}

Explanation - Pass 2 algorithm

The first input line is read from the intermediate file.

If the opcode is START, the line is written directly to the listing file.

A Header record is written to the object program. It gives the starting address and the length of the program (which was calculated during Pass 1).

The first Text record is then initialized. Comment lines are ignored.

For each instruction, OPTAB is searched for the opcode to find its object code.

If there is a symbol in the operand field, the symbol table is searched to get its address, which is combined with the object code of the opcode.

If the symbol is not found, 0 is stored as the operand address, an error flag is set to mark the symbol as undefined, and the instruction is still assembled.

If the opcode is BYTE or WORD, the constant is converted to its equivalent object code (for example, the character constant EOF is stored as the hexadecimal value '454F46').

If the object code does not fit into the current Text record, that record is written out and a new Text record is started for the object code that follows.

The Text records are written to the object program. Once the whole program has been assembled and the END directive is encountered, the End record is written.


Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. A program for the SIC machine has only a limited set of instruction formats and hence limited addressing modes. It has only single-operand instructions, and the operand is always a memory reference. Anything that has to be fetched from memory takes more time. The improved version, the SIC/XE machine, therefore provides more instruction formats and hence more addressing modes. As soon as the machine architecture changes, the number of instruction formats and addressing modes available changes with it. The design of an assembler therefore has to consider two things: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes

Program relocation

This section considers the design and implementation of an assembler for the more complex XE version of SIC.


200    .         SUBROUTINE TO WRITE RECORD FROM BUFFER
205    .
210    WRREC     CLEAR    X           CLEAR LOOP COUNTER
212              LDT      LENGTH
215    WLOOP     TD       OUTPUT      TEST OUTPUT DEVICE
220              JEQ      WLOOP       LOOP UNTIL READY
225              LDCH     BUFFER,X    GET CHARACTER FROM BUFFER
230              WD       OUTPUT      WRITE CHARACTER
235              TIXR     T           LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240              JLT      WLOOP
245              RSUB
250    OUTPUT    BYTE     X'05'
255              END      FIRST

Figure 2.4 Example program for SIC/XE


The figure above shows the example program from figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line   Loc    Label     Opcode    Operand      Object Code

5             COPY      START     0
10            FIRST     STL       RETADR
12                      LDB       #LENGTH
13                      BASE      LENGTH
15            CLOOP     +JSUB     RDREC
20                      LDA       LENGTH
25                      COMP      #0
30                      JEQ       ENDFIL
35                      +JSUB     WRREC
40                      J         CLOOP
45            ENDFIL    LDA       EOF
50                      STA       BUFFER
55                      LDA       #3
60                      STA       LENGTH
65                      +JSUB     WRREC
70                      J         @RETADR
80            EOF       BYTE      C'EOF'
95            RETADR    RESW      1
100           LENGTH    RESW      1
105           BUFFER    RESB      4096
110           .
115           .         SUBROUTINE TO READ RECORD INTO BUFFER
120           .
125           RDREC     CLEAR     X
130                     CLEAR     A
132                     CLEAR     S
133                     +LDT      #4096
135           RLOOP     TD        INPUT
140                     JEQ       RLOOP
145                     RD        INPUT
150                     COMPR     A,S
155                     JEQ       EXIT
160                     STCH      BUFFER,X
165                     TIXR      T
170                     JLT       RLOOP
175           EXIT      STX       LENGTH
180                     RSUB
185           INPUT     BYTE      X'F1'
195           .
200           .         SUBROUTINE TO WRITE RECORD FROM BUFFER
205           .
210           WRREC     CLEAR     X
212                     LDT       LENGTH
215           WLOOP     TD        OUTPUT
220                     JEQ       WLOOP
225                     LDCH      BUFFER,X
230                     WD        OUTPUT
235                     TIXR      T
240                     JLT       WLOOP
245                     RSUB
250           OUTPUT    BYTE      X'05'
255                     END       FIRST

Figure 2.5 Program from figure 2.4 (SIC/XE) with object code

221 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory

In SIC machine the memory is byte addressable Word size is 3 bytes So the size of the

memory is 212 bytes Accordingly it supports only one instruction format It has only two

registers register A and Index register Therefore the addressing modes supported by this

architecture are direct and indexed

Whereas the memory of a SICXE machine is 220 bytes (1 MB) This supports four different

types of instruction types they are

1 byte instruction

2 byte instruction

3 byte instruction

4 byte instruction

LALITHAMBIGAIB APIT Page 22

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Instructions can be

Instructions involving register to register

Instructions with one operand in memory the other in Accumulator (Single

operand instruction)

Extended instruction format

Addressing Modes are

PC-relative or Base-relative addressing op m

Indirect addressing op m

Immediate addressing op c

Extended format +op m

Index addressing op mx

register-to-register instructions

larger memory -gt multi-programming (program allocation)

Translation

1 Translations for the Instruction involving Register-Register addressing mode

During pass 1 the registers can be entered as part of the symbol table itself The value for these

registers is their equivalent numeric codes During pass 2 these values are assembled along with the

mnemonics object code If required a separate table can be created with the register names and their

equivalent numeric values

2 Translation involving Register-Memory instructions

In SICXE machine there are four instruction formats and five addressing modes For formats and

addressing modes refer chapter 1

LALITHAMBIGAIB APIT Page 23

SYSTEM SOFTWARE ( CS 2304) UNIT - IIAmong the instruction formats format -3 and format-4 instructions are Register-Memory type of

instruction One of the operand is always in a register and the other operand is in the memory The

addressing mode tells us the way in which the operand from the memory is to be fetched

There are two ways

1) Program-counter relative and

2) Base-relative

This addressing mode can be represented by either using format-3 type or format-4

type of instruction format

In format-3 the instruction has the opcode followed by a 12-bit displacement value in

the address field

Where as in format-4 the instruction contains the mnemonic code followed by a 20-

bit displacement value in the address field

1 Program-Counter Relative

a) In this usually format-3 instruction format is used

b) The instruction contains the opcode followed by a 12-bit displacement value

c) The range of displacement values are from 0 -2048 This displacement (should be small

enough to fit in a 12-bit field) value is added to the current contents of the program counter to

get the target address of the operand required by the instruction

d) This is relative way of calculating the address of the operand relative to the program counter

Hence the displacement of the operand is relative to the current program counter value

e) The following example shows how the address is calculated

LALITHAMBIGAIB APIT Page 24

SYSTEM SOFTWARE ( CS 2304) UNIT - II

2 Base-Relative Addressing Mode

a) In this mode the base register is used to mention the displacement value Therefore the target

address is

TA = (base) + displacement value

b) This addressing mode is used when the range of displacement value is not sufficient Hence

the operand is not relative to the instruction as in PC-relative addressing mode Whenever

this mode is used it is indicated by using a directive BASE The moment the assembler

encounters this directive the next instruction uses base-relative addressing mode to calculate

the target address of the operand

c) When NOBASE directive is used then it indicates the base register is no more used to

calculate the target address of the operand Assembler first chooses PC-relative when the

displacement field is not enough it uses Base-relative

LDB LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

LALITHAMBIGAIB APIT Page 25

SYSTEM SOFTWARE ( CS 2304) UNIT - II

For example

12 0003 LDB LENGTH 69202D

13 BASE LENGTH

100 0033 LENGTH RESW 1

105 0036 BUFFER RESB 4096

160 104E STCH BUFFER X 57C003

165 1051 TIXR T B850

In the above example the use of directive BASE indicates that Base-relative addressing mode is

to be used to calculate the target address PC-relative is no longer used The value of the LENGTH is

stored in the base register

The LDB instruction loads the value of length in the base register which 0033 BASE directive

explicitly tells the assembler that it has the value of LENGTH

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 ndash 0033 = (0003)16

20 000A LDA LENGTH 032026

175 1056 EXIT STX LENGTH 134000

LALITHAMBIGAIB APIT Page 26

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Consider Line 175 If we use PC-relative

Disp = TA ndash (PC) = 0033 ndash1059 = EFDA

PC relative is no longer applicable so we try to use BASE relative addressing mode

3 Immediate Addressing Mode

In this mode no memory reference is involved If immediate mode is used the target address is the

operand itself

If the symbol is referred in the instruction as the immediate operand then it is immediate with PC-

relative mode as shown in the example below

LALITHAMBIGAIB APIT Page 27

SYSTEM SOFTWARE ( CS 2304) UNIT - II

4 Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location which

contains the address of the operand The address of this is found using PC-relative addressing mode

For example

The instruction jumps the control to the address location RETADR which in turn has the address of

the operand If address of RETADR is 0030 the target address is then 0003 as calculated above

222 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time

The system must be able to load programs into memory wherever there is room

The exact starting address of the program is not known until load time

Absolute Address

Program with starting address specified at assembly time

The address may be invalid if the program is loaded into somewhere else

LALITHAMBIGAIB APIT Page 28

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example

The above statement says that the register A is loaded with the value stored at location 102D

Suppose it is decided to load and execute the program at location 3000 instead of location 1000

Then at address 102D the required value which needs to be loaded in the register A is no more

available

The address also gets changed relative to the displacement of the program Hence we need to

make some changes in the address portion of the instruction so that we can load and execute the

program at location 3000

Apart from the instruction which will undergo a change in their operand address value as the

program load address changes There exist some parts in the program which will remain same

regardless of where the program is being loaded

Since assembler will not know actual location where the program will get loaded it cannot make

the necessary changes in the addresses used in the program However the assembler identifies

for the loader those parts of the program which need modification

An object program that has the information necessary to perform this kind of modification is

called the relocatable program

LALITHAMBIGAIB APIT Page 29

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example Program Relocation

1) The above diagram shows the concept of relocation

a) Initially the program is loaded at location 0000

b) The instruction JSUB is loaded at location 0006

c) The address field of this instruction contains 01036 which is the address of the instruction

labeled RDREC

2) The second figure shows that if the program is to be loaded at new location 5000 The address of

the instruction JSUB gets modified to new location 6036

3) Likewise the third figure shows that if the program is relocated at location 7420 the JSUB

instruction would need to be changed to 4B108456 that correspond to the new address of

RDREC

The only parts of the program that require modification at load time are those that specify direct addresses. The rest of the instructions need not be modified:
o The operand is not a memory address (immediate addressing).
o The operand is addressed using PC-relative or base-relative addressing.

From the object program alone it is not possible to distinguish an address from a constant.
o The assembler must therefore pass this information to the loader.
o An object program that contains the modification records is called a relocatable program.

The way to solve the relocation problem
For each address label, the address is assigned relative to the start of the program (START 0).
Produce a Modification record that stores the starting location and the length of the address field to be modified.
This command for the loader must also be part of the object program.

Modification record
One Modification record is produced for each address to be modified.
The length is stored in half-bytes (4 bits).
The starting location is the location of the byte containing the leftmost bits of the address field to be modified.
If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.
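As a rough sketch, a Modification record can be formatted like this. The layout (the letter M, six hexadecimal digits for the starting location, two for the length in half-bytes) is inferred from the example record M00000705 in these notes, and the function name is hypothetical.

```python
def modification_record(start: int, half_bytes: int) -> str:
    """Format 'M' + 6 hex digits of starting location + 2 hex digits of length."""
    return f"M{start:06X}{half_bytes:02X}"


# +JSUB is at location 0006; its 20-bit address field begins in the byte at 0007
# and is 5 half-bytes long, starting in the middle of that byte.
print(modification_record(0x0006 + 1, 5))       # M00000705
```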


Relocatable Object Program

In the object program above, the red boxes mark the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that must change if the program is relocated. For example, M00000705 is the Modification record for the address field that starts at location 000007; the field to be modified is 5 half-bytes long.

Machine-independent assembler features

Assembler features not closely related to machine architecture:
• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as part of the instruction that uses it. Such an operand is called a literal.

45     001A   ENDFIL   LDA   =C'EOF'     032010
       002D            =C'EOF'           454F46
215    1062   WLOOP    TD    =X'05'      E32011
       1076            =X'05'            05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand
With immediate addressing, the operand value is assembled as part of the machine instruction.
With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool
All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed
93            LTORG
       002D   =C'EOF'     454F46
• The assembler directive LTORG tells the assembler to generate a literal pool at that point, containing the literal operands used since the previous LTORG (or since the beginning of the program).

Literal for the current value of the location counter
• The current value of the location counter can be denoted by the literal operand =*.
       BASE   *
       LDB    =*

Handling duplicate literal operands
The assembler should avoid storing duplicate literals.
The easiest way to recognize duplicate literals is by comparing the character strings that define them.
• 100   LDA   =C'EOF'
• 125   LDA   =C'EOF'
• 160   LDA   =X'454F46'
Literal operands with different literal names may have the same literal value, as on lines 125 and 160. Recognizing such operands as duplicates complicates the design of an assembler.
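The two ways of detecting duplicates can be compared with a small sketch. The helper literal_bytes is hypothetical and handles only the =C'...' and =X'...' forms that appear in these notes.

```python
import re

LITERAL = re.compile(r"=([CX])'([^']*)'")


def literal_bytes(literal: str) -> bytes:
    """Return the constant bytes denoted by a literal such as =C'EOF' or =X'05'."""
    match = LITERAL.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a literal: {literal!r}")
    kind, body = match.groups()
    # C'...' is character data, X'...' is hexadecimal data
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


# Comparing names is cheap but misses this duplicate:
print("=C'EOF'" == "=X'454F46'")                                  # False
# Comparing values finds it, at the cost of evaluating every literal:
print(literal_bytes("=C'EOF'") == literal_bytes("=X'454F46'"))    # True
```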

Literal table (LITTAB)
The basic data structure needed to process literal operands is a literal table (LITTAB).
LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.
LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands
Pass 1
• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address.
• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.
• The location counter is updated to reflect the number of bytes occupied by each literal.
Pass 2
• Search LITTAB for each literal operand encountered to obtain its address.
• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.
• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.

Symbol-defining statements

Assembler directives
EQU
ORG

Assembler directive EQU
Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.
symbol   EQU   value
When the assembler encounters an EQU statement, it enters "symbol" into SYMTAB with the value specified by "value".

Use of EQU
Establish symbolic names that can be used in place of numeric values for improved readability. Instead of
         +LDT   #4096
we can write
MAXLEN   EQU    4096
         +LDT   #MAXLEN

Define mnemonic names for registers
A   EQU   0
X   EQU   1
L   EQU   2
so that RMO A,X can be written instead of RMO 0,1.

Establish and use names that reflect the logical function of the registers in the program
BASE          EQU   R1
ACCUMULATOR   EQU   R2
INDEX         EQU   R3

Assembler directive ORG
The assembler directive ORG is used to assign values to symbols indirectly.
         ORG   value
"value" is a constant or an expression involving constants and previously defined symbols.
When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition
Suppose that we want to define a table with the following structure:

STAB            SYMBOL     VALUE      FLAGS
(100 entries)   6 bytes    3 bytes    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write
         ORG
with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler
For an ordinary two-pass assembler, all symbols must be defined during Pass 1: all terms used to specify the value of a new symbol must have been defined previously in the program. Hence the sequences marked "disallowed" below cannot be processed by an ordinary two-pass assembler.

ALPHA   RESW   1
BETA    EQU    ALPHA        (allowed)

BETA    EQU    ALPHA
ALPHA   RESW   1            (disallowed)

ALPHA   EQU    BETA
BETA    EQU    DELTA
DELTA   RESW   1            (disallowed)

        ORG    ALPHA
BYTE1   RESB   1
BYTE2   RESB   1
BYTE3   RESB   1
        ORG
ALPHA   RESB   1            (disallowed)
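The restriction can be seen in a toy Pass 1 that scans the statements once, in order. This is a simplified sketch: the function pass1 and its tuple format are hypothetical, and EQU operands are limited to a decimal constant or a single symbol.

```python
def pass1(statements):
    """Build SYMTAB in one forward scan; EQU operands must already be defined."""
    symtab, locctr = {}, 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand.isdigit():
                symtab[label] = int(operand)
            elif operand in symtab:
                symtab[label] = symtab[operand]
            else:
                raise ValueError(f"{label} EQU {operand}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)          # one SIC word is 3 bytes
    return symtab


allowed = [("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]
disallowed = [("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")]

print(pass1(allowed))
try:
    pass1(disallowed)
except ValueError as err:
    print("error:", err)
```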

Expressions
Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.
• Each such expression must be evaluated by the assembler to produce a single operand address or value.
Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.
• Individual terms in the expression may be
  • constants,
  • user-defined symbols, or
  • special terms.
• The most common special term is the current value of the location counter (often designated by *).

Types of terms
Absolute terms - The value of an absolute term is independent of the program location.
Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions
By the type of value produced, expressions can be classified as follows.

1. Absolute expressions
• The value of an absolute expression is independent of the program location.
• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.
• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions
• The value of a relative expression is relative to the beginning address of the object program.
• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions
To determine the type of an expression, the assembler keeps a type flag for each symbol in SYMTAB (A = absolute, R = relative):

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
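The pairing rule can be checked by counting relative terms with their signs. The following is a minimal sketch under stated assumptions: the function name is hypothetical, parentheses are not supported, and any name not flagged R in the table (including numeric constants) is treated as absolute.

```python
import re

SYMTAB_TYPES = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}


def expression_type(expr: str, types=SYMTAB_TYPES) -> str:
    """Return 'absolute', 'relative' or 'error' for an expression over + - * /."""
    net_relative = 0
    # Split into signed terms, e.g. "BUFEND-BUFFER" -> ('', 'BUFEND'), ('-', 'BUFFER')
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        factors = re.split(r"[*/]", term)
        relative = [f for f in factors if types.get(f) == "R"]
        if relative and len(factors) > 1:
            return "error"                      # relative term inside * or /
        if relative:
            net_relative += -1 if sign == "-" else 1
    if net_relative == 0:
        return "absolute"                       # every relative term is paired
    if net_relative == 1:
        return "relative"                       # one unpaired positive term
    return "error"


for expr in ("BUFEND-BUFFER", "RETADR", "BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"):
    print(expr, "->", expression_type(expr))
```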

Program blocks
The example program is divided into three blocks: Default (block 0), CDATA (block 1), and CBLKS (block 2).
Default (0): block for the executable instructions.
CDATA (1): block for data areas that occupy little memory.
CBLKS (2): block for data areas that occupy larger blocks of memory.
The assembler directive USE followed by a block name indicates which block the statements that follow belong to. Each block has its own location counter, which is set to 0000 when the block is first encountered.
One more column, the block number, is included in the listing when object code is generated, and this field is also added to each entry in SYMTAB.
The program block table contains, for each block, the block name, block number, starting address, and length.
After the location counter values have been determined, the assembler generates the object code and writes the object program.
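A rough sketch of how the program block table can be filled in at the end of Pass 1: each block starts where the previous one ends. The function name and the block lengths below are illustrative assumptions, not values taken from these notes.

```python
def assign_block_addresses(blocks):
    """blocks: (name, length) pairs in block-number order.

    Returns rows of (name, number, starting address, length).
    """
    table, start = [], 0
    for number, (name, length) in enumerate(blocks):
        table.append((name, number, start, length))
        start += length                         # next block follows immediately
    return table


# Assumed lengths, for illustration only
table = assign_block_addresses([("(default)", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
for name, number, start, length in table:
    print(f"{name:10} {number}  {start:04X}  {length:04X}")
```

The address of a symbol is then its block's starting address plus the symbol's value relative to the start of that block.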


Program Block Diagram


Control sections and program linking

Control section
i. A control section is a part of the program that maintains its identity after assembly.
ii. Each control section can be loaded and relocated independently of the others.
iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT
CSECT signals the start of a new control section.
The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF
EXTDEF (external definition)
The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.
EXTREF (external reference)
The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for instructions that refer to EXTREF symbols changes, because the assembler does not know where such symbols will be located at load time.
The object program uses six record types: Header record, Define record, Refer record, Text record, Modification record, and End record.


Object Program for Control Section and Program Linking


Design and Implementation Issues

Some features of an assembler depend on the architecture of the machine. If the program is for the SIC machine, there is only one instruction format and hence a limited set of addressing modes. All instructions are single-operand instructions, and the operand is always a memory reference. Anything that must be fetched from memory takes more time. The improved SIC/XE machine therefore provides more instruction formats and more addressing modes. Whenever the machine architecture changes, the available instruction formats and addressing modes change too. The design of an assembler therefore has to consider two kinds of features: machine-dependent features and machine-independent features.

2.2 MACHINE DEPENDENT ASSEMBLER FEATURES

Instruction formats and addressing modes
Program relocation

In this section, the design and implementation of an assembler for the more complex XE version of SIC is considered.

200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    CLEAR   X           CLEAR LOOP COUNTER
212            LDT     LENGTH
215   WLOOP    TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIXR    T           LOOP UNTIL ALL CHARACTERS
240            JLT     WLOOP         HAVE BEEN WRITTEN
245            RSUB
250   OUTPUT   BYTE    X'05'
255            END     FIRST

Figure 2.4 Example program for SIC/XE


Figure 2.4 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).
Immediate operands are denoted with the prefix # (lines 25, 55, 133).
Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode.
The assembler directive BASE (line 13) is used in conjunction with base relative addressing.
The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.
Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.
Immediate and indirect addressing have also been used as much as possible.
Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.
With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.
The use of indirect addressing often avoids the need for another instruction.

Line   Loc   Label    Opcode   Operand      Object Code

5            COPY     START    0
10           FIRST    STL      RETADR
12                    LDB      #LENGTH
13                    BASE     LENGTH
15           CLOOP    +JSUB    RDREC
20                    LDA      LENGTH
25                    COMP     #0
30                    JEQ      ENDFIL
35                    +JSUB    WRREC
40                    J        CLOOP
45           ENDFIL   LDA      EOF
50                    STA      BUFFER
55                    LDA      #3
60                    STA      LENGTH
65                    +JSUB    WRREC
70                    J        @RETADR
80           EOF      BYTE     C'EOF'
95           RETADR   RESW     1
100          LENGTH   RESW     1
105          BUFFER   RESB     4096
110          .
115          .        SUBROUTINE TO READ RECORD INTO BUFFER
120          .
125          RDREC    CLEAR    X
130                   CLEAR    A
132                   CLEAR    S
133                   +LDT     #4096
135          RLOOP    TD       INPUT
140                   JEQ      RLOOP
145                   RD       INPUT
150                   COMPR    A,S
155                   JEQ      EXIT
160                   STCH     BUFFER,X
165                   TIXR     T
170                   JLT      RLOOP
175          EXIT     STX      LENGTH
180                   RSUB
185          INPUT    BYTE     X'F1'
195          .
200          .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205          .
210          WRREC    CLEAR    X
212                   LDT      LENGTH
215          WLOOP    TD       OUTPUT
220                   JEQ      WLOOP
225                   LDCH     BUFFER,X
230                   WD       OUTPUT
235                   TIXR     T
240                   JLT      WLOOP
245                   RSUB
250          OUTPUT   BYTE     X'05'
255                   END      FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine, memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly, SIC supports only one instruction format. Operand addressing involves only the accumulator (register A) and the index register (X), so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). SIC/XE supports four instruction formats:
1-byte instruction
2-byte instruction
3-byte instruction
4-byte instruction

Instructions can be
Instructions involving register to register
Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
Extended instruction format

Addressing modes are
PC-relative or base-relative addressing:   op m
Indirect addressing:                       op @m
Immediate addressing:                      op #c
Extended format:                           +op m
Indexed addressing:                        op m,X
Register-to-register instructions
Larger memory -> multiprogramming (program allocation)
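The source notation listed above maps onto flag bits of the instruction. The sketch below assumes the SIC/XE flag names n, i, x and e from the machine description referred to elsewhere in these notes (Chapter 1); the function name and the returned dictionary are hypothetical.

```python
def addressing_flags(opcode: str, operand: str) -> dict:
    """Derive the n, i, x, e flag bits of a format 3/4 instruction from its notation."""
    extended = opcode.startswith("+")           # +op m  -> format 4
    indexed = operand.upper().endswith(",X")    # op m,X -> indexed
    if indexed:
        operand = operand[:-2]
    if operand.startswith("#"):
        n, i = 0, 1                             # op #c  -> immediate
    elif operand.startswith("@"):
        n, i = 1, 0                             # op @m  -> indirect
    else:
        n, i = 1, 1                             # op m   -> simple addressing
    return {"n": n, "i": i, "x": int(indexed), "e": int(extended),
            "symbol": operand.lstrip("#@")}


print(addressing_flags("+JSUB", "RDREC"))
print(addressing_flags("J", "@RETADR"))
print(addressing_flags("STCH", "BUFFER,X"))
```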

Translation

1. Translation of instructions involving register-to-register addressing
During Pass 1 the registers can be entered as part of the symbol table itself. The value of each register is its equivalent numeric code. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.
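A separate register table of this kind could be used as in the sketch below. The register numbers and the three opcodes are the standard SIC/XE values for the instructions used in the example program; the function name and table layout are hypothetical.

```python
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}
OPCODES = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8}   # only the ones used here


def assemble_format2(mnemonic: str, operands: str) -> str:
    """Assemble a 2-byte instruction: 8-bit opcode, then two 4-bit register numbers."""
    regs = [REGISTERS[r.strip()] for r in operands.split(",")]
    r1 = regs[0]
    r2 = regs[1] if len(regs) > 1 else 0        # an unused second field is 0
    return f"{OPCODES[mnemonic]:02X}{r1:X}{r2:X}"


print(assemble_format2("TIXR", "T"))            # B850
print(assemble_format2("COMPR", "A,S"))         # A004
```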

2. Translation involving register-memory instructions
In the SIC/XE machine there are four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways:
1) Program-counter relative and
2) Base-relative

A register-memory instruction is assembled in either format 3 or format 4. In format 3, the opcode is followed by a 12-bit displacement value in the address field. In format 4, the opcode is followed by a 20-bit address field, which is large enough to hold the full target address.

1. Program-Counter Relative
a) This mode normally uses the format 3 instruction format.
b) The instruction contains the opcode followed by a 12-bit displacement value.
c) The displacement is a signed value in the range -2048 to +2047, so that it fits in the 12-bit field. It is added to the current contents of the program counter to obtain the target address of the operand required by the instruction.
d) The address of the operand is thus calculated relative to the program counter, which holds the address of the next instruction.
e) For example, for the statement LDA LENGTH at location 000A, (PC) = 000D and the target address of LENGTH is 0033, so disp = 0033 - 000D = 026 and the object code is 032026.
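The displacement calculation can be sketched as follows, using the locations of LDA LENGTH (000A) and STX LENGTH (1056) from the listing in these notes. The function name is hypothetical.

```python
def pc_relative_disp(target: int, location: int, length: int = 3):
    """Return the 12-bit displacement field, or None if the target is out of range."""
    pc = location + length                      # PC already points at the next instruction
    disp = target - pc
    if -2048 <= disp <= 2047:
        return disp & 0xFFF                     # two's complement in 12 bits
    return None


print(f"{pc_relative_disp(0x0033, 0x000A):03X}")    # 026
print(pc_relative_disp(0x0033, 0x1056))             # None: too far for PC-relative
```

When the function returns None, the assembler has to fall back on base-relative addressing or the extended format.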


2. Base-Relative Addressing Mode
a) In this mode the contents of the base register are used together with the displacement value. The target address is
TA = (B) + displacement
where the displacement is an unsigned value in the range 0 to 4095.
b) This addressing mode is used when the displacement is too large for PC-relative addressing. The operand is then addressed relative to the base register rather than relative to the instruction, as it is in PC-relative mode. The directive BASE tells the assembler what value the base register will contain; from the point where the assembler encounters this directive, it may use base-relative addressing to calculate the target address of an operand.
c) The directive NOBASE indicates that the base register can no longer be used to calculate the target address of an operand. The assembler first tries PC-relative addressing; when the displacement does not fit in the displacement field, it uses base-relative addressing.

LDB #LENGTH (instruction)
BASE LENGTH (directive)
NOBASE


For example:

12    0003            LDB    #LENGTH     69202D
13                    BASE   LENGTH
100   0033   LENGTH   RESW   1
105   0036   BUFFER   RESB   4096
160   104E            STCH   BUFFER,X    57C003
165   1051            TIXR   T           B850

In the above example the directive BASE indicates that base-relative addressing may be used to calculate the target address when PC-relative addressing is not possible. The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For the STCH instruction on line 160:
BUFFER is at location (0036)16
(B) = (0033)16
disp = 0036 - 0033 = (0003)16

20    000A            LDA    LENGTH      032026
175   1056   EXIT     STX    LENGTH      134000

Consider line 175. If we try PC-relative addressing:
disp = TA - (PC) = 0033 - 1059 = EFDA
This value (-1026 in hexadecimal) does not fit in the 12-bit displacement field, so PC-relative addressing is not applicable. Base-relative addressing is used instead: disp = 0033 - (B) = 0, which gives the object code 134000.
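The choice between the two modes can be sketched as below. The opcodes (LDA 00, STX 10, STCH 54, LDB 68) and the bit layout of format 3 are the standard SIC/XE ones; the function name and its parameters are hypothetical, and the sketch reports an error where a real assembler might try format 4.

```python
def assemble_format3(opcode, target, location, base=None, indexed=False, n=1, i=1):
    """Assemble a 3-byte instruction, trying PC-relative first, then base-relative."""
    pc = location + 3
    disp = target - pc
    if -2048 <= disp <= 2047:
        b, p, disp = 0, 1, disp & 0xFFF         # PC-relative, signed displacement
    elif base is not None and 0 <= target - base <= 4095:
        b, p, disp = 1, 0, target - base        # base-relative, unsigned displacement
    else:
        raise ValueError("displacement out of range for format 3")
    first = opcode | (n << 1) | i               # opcode with the n and i bits
    flags = (int(indexed) << 3) | (b << 2) | (p << 1)   # x, b, p, e with e = 0
    return f"{first:02X}{flags:X}{disp:03X}"


LENGTH, BUFFER = 0x0033, 0x0036
print(assemble_format3(0x00, LENGTH, 0x000A))                              # LDA LENGTH
print(assemble_format3(0x10, LENGTH, 0x1056, base=LENGTH))                 # STX LENGTH
print(assemble_format3(0x54, BUFFER, 0x104E, base=LENGTH, indexed=True))   # STCH BUFFER,X
print(assemble_format3(0x68, LENGTH, 0x0003, n=0, i=1))                    # LDB #LENGTH
```

The four results agree with the object codes in the listing above: 032026, 134000, 57C003 and 69202D.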

3. Immediate Addressing Mode
In this mode no memory reference is involved. If immediate mode is used, the target address is the operand itself.
If a symbol is used as the immediate operand, the instruction combines immediate addressing with PC-relative addressing, as in the statement LDB #LENGTH, where the address of LENGTH is the operand value.

4. Indirect and PC-relative mode
In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing.
For example, the instruction J @RETADR transfers control to the address stored at location RETADR. If the address of RETADR is 0030, the displacement stored in the instruction is 003, that is, the address 0030 minus the current value of the program counter.

2.2.2 Program Relocation

The need for program relocation
It is desirable to load and run several programs at the same time.
The system must be able to load programs into memory wherever there is room.
The exact starting address of the program is not known until load time.

Absolute program
A program whose starting address is specified at assembly time.
Its addresses may be invalid if the program is loaded anywhere else.

Example
The statement in this example loads register A with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the same amount as the displacement of the program. Hence we need to change the address portion of the instruction so that the program can be loaded and executed at location 3000.


LALITHAMBIGAIB APIT Page 44

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Object Program for Control Section and Program Linking

LALITHAMBIGAIB APIT Page 45


200          .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205          .
210  WRREC   CLEAR   X          CLEAR LOOP COUNTER
212          LDT     LENGTH
215  WLOOP   TD      OUTPUT     TEST OUTPUT DEVICE
220          JEQ     WLOOP      LOOP UNTIL READY
225          LDCH    BUFFER,X   GET CHARACTER FROM BUFFER
230          WD      OUTPUT     WRITE CHARACTER
235          TIXR    T          LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
240          JLT     WLOOP
245          RSUB
250  OUTPUT  BYTE    X'05'
255          END     FIRST

Figure 2.4 Example program for SIC/XE


Figure 2.3 above shows the example program from Figure 2.1 as it might be rewritten to take advantage of the SIC/XE instruction set.

Indirect addressing is indicated by adding the prefix @ to the operand (line 70).

Immediate operands are denoted with the prefix # (lines 25, 55, 133).

Instructions that refer to memory are normally assembled using either program-counter relative or base relative mode.

The assembler directive BASE (line 13) is used in conjunction with base relative addressing.

The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.

Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.

Immediate and indirect addressing have also been used as much as possible.

Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.

With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.

The use of indirect addressing often avoids the need for another instruction.

Line  Loc   Label   Opcode  Operand    Object Code
5           COPY    START   0
10          FIRST   STL     RETADR
12                  LDB     #LENGTH
13                  BASE    LENGTH
15          CLOOP   +JSUB   RDREC
20                  LDA     LENGTH
25                  COMP    #0
30                  JEQ     ENDFIL
35                  +JSUB   WRREC
40                  J       CLOOP
45          ENDFIL  LDA     EOF
50                  STA     BUFFER
55                  LDA     #3
60                  STA     LENGTH
65                  +JSUB   WRREC
70                  J       @RETADR
80          EOF     BYTE    C'EOF'
95          RETADR  RESW    1
100         LENGTH  RESW    1
105         BUFFER  RESB    4096
110         .
115         .       SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125         RDREC   CLEAR   X
130                 CLEAR   A
132                 CLEAR   S
133                 +LDT    #4096
135         RLOOP   TD      INPUT
140                 JEQ     RLOOP
145                 RD      INPUT
150                 COMPR   A,S
155                 JEQ     EXIT
160                 STCH    BUFFER,X
165                 TIXR    T
170                 JLT     RLOOP
175         EXIT    STX     LENGTH
180                 RSUB
185         INPUT   BYTE    X'F1'
195         .
200         .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210         WRREC   CLEAR   X
212                 LDT     LENGTH
215         WLOOP   TD      OUTPUT
220                 JEQ     WLOOP
225                 LDCH    BUFFER,X
230                 WD      OUTPUT
235                 TIXR    T
240                 JLT     WLOOP
245                 RSUB
250         OUTPUT  BYTE    X'05'
255                 END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes (32,768 bytes). Accordingly it supports only one instruction format. It has five registers (A, X, L, PC and SW), but only the index register X can take part in address calculation, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). It supports four instruction formats:

Format 1: 1-byte instruction

Format 2: 2-byte instruction

Format 3: 3-byte instruction

Format 4: 4-byte instruction

Instructions can be:

Register-to-register instructions

Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

Extended-format instructions

Addressing modes are:

PC-relative or base-relative addressing: op m

Indirect addressing: op @m

Immediate addressing: op #c

Extended format: +op m

Indexed addressing: op m,X

Register-to-register instructions

Larger memory -> multiprogramming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During Pass 1 the registers can be entered as part of the symbol table itself. The value for each register is its equivalent numeric code. During Pass 2 these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.
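The register-to-register translation described above can be sketched in a few lines of Python. This is only an illustration, not the assembler's actual implementation: the function name `assemble_format2` and the two lookup tables are assumptions made for the example, while the register numbers and opcodes are the standard SIC/XE values.

```python
# Register numbers defined by the SIC/XE architecture.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}

# Opcodes of a few format 2 instructions (hypothetical table for this sketch).
FORMAT2_OPCODES = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8}


def assemble_format2(mnemonic, r1, r2=None):
    """Return the 2-byte object code (as hex text) of a format 2 instruction."""
    opcode = FORMAT2_OPCODES[mnemonic]
    n1 = REGISTERS[r1]
    # A missing second register is assembled as 0.
    n2 = REGISTERS[r2] if r2 is not None else 0
    return f"{opcode:02X}{n1:X}{n2:X}"
```

With these tables, `assemble_format2("TIXR", "T")` gives the object code B850 that appears for line 165 in the listing.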

2. Translation involving Register-Memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For the formats and addressing modes, refer to Chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us the way in which the operand is to be fetched from memory.

There are two ways:

1) Program-counter relative and

2) Base-relative

A memory operand can be specified using either a format 3 or a format 4 instruction.

In format 3 the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4 the opcode is followed by a 20-bit address field, which is large enough to hold the full address of the operand.

1. Program-Counter Relative

a) Format 3 instructions are normally used in this mode.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed value in the range -2048 to 2047, so that it fits in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction: TA = (PC) + disp.

d) The address of the operand is thus calculated relative to the program counter. When the instruction is executed, the program counter already holds the address of the next instruction.

e) For example, the statement LDA LENGTH at location 000A is assembled as 032026: (PC) = 000D and LENGTH is at location 0033, so disp = 0033 - 000D = 026.
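The calculation in steps a) to e) can be written out as a small Python function. This is a minimal sketch under the assumption of a 3-byte format 3 instruction; the name `pc_relative_disp` is hypothetical and is not taken from the notes.

```python
def pc_relative_disp(target, loc, length=3):
    """Return the 12-bit displacement field for PC-relative addressing.

    target: address of the operand
    loc:    address of the instruction being assembled
    Returns None when the displacement does not fit in 12 signed bits.
    """
    pc = loc + length          # PC already points to the next instruction
    disp = target - pc
    if -2048 <= disp <= 2047:
        return disp & 0xFFF    # two's complement representation in 12 bits
    return None
```

For LDA LENGTH at location 000A with LENGTH at 0033, the function returns 0x026, the displacement found in the object code 032026. For STX LENGTH at location 1056 it returns None, because the displacement is out of range.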


2. Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register B. Therefore the target address is

TA = (B) + disp

where disp is an unsigned value in the range 0 to 4095.

b) This addressing mode is used when the range of the PC-relative displacement is not sufficient. The operand is then located relative to the base register rather than relative to the instruction, as it is in PC-relative addressing mode. The programmer tells the assembler what the base register contains by using the directive BASE. From the point where the assembler encounters this directive, it may use base-relative addressing to calculate the displacement of an operand.

c) The NOBASE directive indicates that the base register can no longer be used to calculate the target address of an operand. The assembler first tries PC-relative addressing; when the displacement does not fit in the displacement field, it uses base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE (directive)

For example:

Line  Loc   Label   Opcode  Operand    Object Code
12    0003          LDB     #LENGTH    69202D
13                  BASE    LENGTH
100   0033  LENGTH  RESW    1
105   0036  BUFFER  RESB    4096
160   104E          STCH    BUFFER,X   57C003
165   1051          TIXR    T          B850

In the above example, the directive BASE tells the assembler that base-relative addressing can be used to calculate the target address whenever PC-relative addressing is not possible. The address of LENGTH is held in the base register.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160 the PC-relative displacement 0036 - 1051 is out of range, so base-relative addressing is used:

BUFFER is at location 0036 (hexadecimal)

(B) = 0033 (hexadecimal)

disp = 0036 - 0033 = 0003 (hexadecimal)
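To see how the displacement ends up in the object code, the format 3 bit layout can be sketched as follows. This is an illustrative sketch, not the notes' own algorithm: the function name `encode_format3` is hypothetical, and the opcodes 54 (STCH) and 68 (LDB) are the standard SIC/XE values.

```python
def encode_format3(opcode, n, i, x, b, p, disp):
    """Pack a format 3 instruction: 6-bit opcode, flags n i x b p e, 12-bit disp."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1)      # e = 0 for format 3
    word = (first_byte << 16) | (flags << 12) | (disp & 0xFFF)
    return f"{word:06X}"


# Line 160: STCH BUFFER,X is simple addressing (n = i = 1), indexed (x = 1),
# base-relative (b = 1), with disp = 003.
stch_code = encode_format3(0x54, n=1, i=1, x=1, b=1, p=0, disp=0x003)

# Line 12: LDB #LENGTH is immediate (n = 0, i = 1), PC-relative (p = 1),
# with disp = 0033 - 0006 = 02D.
ldb_code = encode_format3(0x68, n=0, i=1, x=0, b=0, p=1, disp=0x02D)
```

The two results are 57C003 and 69202D, matching the object code shown for lines 160 and 12.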

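Two more statements from the same program:

Line  Loc   Label   Opcode  Operand    Object Code
20    000A          LDA     LENGTH     032026
175   1056  EXIT    STX     LENGTH     134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = -1026 (hexadecimal), which is EFDA as a 16-bit two's complement value

This is outside the range of the 12-bit displacement field, so PC-relative addressing is not applicable and base-relative addressing is used instead: disp = TA - (B) = 0033 - 0033 = 000, which gives the object code 134000.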
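The rule "try PC-relative first, then base-relative" can be sketched as a single function. This is a minimal sketch under stated assumptions: the name `choose_displacement` and the convention that `base=None` means NOBASE are inventions for this example.

```python
def choose_displacement(target, loc, base=None, length=3):
    """Return (mode, disp) for a format 3 operand.

    base is the value declared by the BASE directive, or None after NOBASE.
    """
    disp = target - (loc + length)
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF
    if base is not None:
        disp = target - base
        if 0 <= disp <= 4095:
            return "base", disp
    # Neither mode fits: the statement needs format 4 (or is an error).
    raise ValueError("operand out of range for format 3")
```

For line 175, `choose_displacement(0x0033, 0x1056, base=0x0033)` falls back to base-relative addressing with a displacement of 0, whereas line 20 is still assembled PC-relative with a displacement of 0x026.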

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is used in the instruction as the immediate operand, the instruction is assembled as immediate with PC-relative mode. An example is LDB #LENGTH on line 12, which is assembled as 69202D with disp = 0033 - 0006 = 02D.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode.

For example, the instruction J @RETADR (line 70) transfers control to the address stored in location RETADR. If the address of RETADR is 0030 and (PC) = 002D, the displacement is 0030 - 002D = 0003.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute Address

A program with its starting address specified at assembly time.

Its addresses may be invalid if the program is loaded somewhere else.

Example

The statement in this example loads register A with the value stored at location 102D.

Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

Every address in the program shifts by the same amount as the program itself. Hence we need to change the address portion of the instruction so that we can load and execute the program at location 3000.

Some instructions undergo a change in their operand address value as the program load address changes, while other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.

Example: Program Relocation

1) The diagram for this example shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at a new location, 5000. The address field of the JSUB instruction is modified to the new location 06036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC.
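The three cases above amount to adding the load address to the 20-bit address field of the format 4 instruction. A minimal Python sketch follows; the function name `relocate_format4` is hypothetical, and the starting object code 4B101036 is inferred from the address field 01036 and the relocated value 4B108456 given above.

```python
def relocate_format4(instruction_hex, load_address):
    """Add the load address to the 20-bit address field of a format 4 instruction."""
    word = int(instruction_hex, 16)
    upper = word & 0xFFF00000                  # opcode and flag bits stay unchanged
    address = (word + load_address) & 0x000FFFFF
    return f"{upper | address:08X}"
```

Loading at 0000 leaves the instruction as 4B101036, loading at 5000 gives 4B106036, and loading at 7420 gives 4B108456.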

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified, because their operands are either

o not memory addresses (immediate addressing), or

o PC-relative or base-relative displacements.

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader.

o The object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

One Modification record is produced for each address to be modified.

The length is stored in half-bytes (4 bits each).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.

Relocatable Object Program

In the object program for this example, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for those instructions that need to change if relocation occurs. M00000705 is the Modification record for the address field that begins at location 0007 (part of the JSUB statement at location 0006) and is 5 half-bytes long.
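How a loader might act on such a record can be sketched in Python. This is an illustration only: the function names and the use of a `bytearray` as program memory are assumptions made for the example, and the record layout (M, 6 hex digits of starting location, 2 hex digits of length in half-bytes) is read off the record M00000705.

```python
def parse_modification_record(record):
    """Split a record such as 'M00000705' into (start, length_in_half_bytes)."""
    if not record.startswith("M"):
        raise ValueError("not a Modification record")
    return int(record[1:7], 16), int(record[7:9], 16)


def apply_modification(memory, record, load_address):
    """Add load_address to the address field described by the record.

    memory holds the program as assembled for address 0.
    """
    start, half_bytes = parse_modification_record(record)
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(memory[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1
    # With an odd length the upper half-byte of the first byte is preserved.
    new_field = (field & ~mask) | ((field + load_address) & mask)
    memory[start:start + nbytes] = new_field.to_bytes(nbytes, "big")
```

Applied to the JSUB instruction 4B101036 stored at location 0006, the record M00000705 with a load address of 5000 changes the instruction to 4B106036.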

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

Line  Loc   Label   Opcode  Operand    Object Code
45    001A  ENDFIL  LDA     =C'EOF'    032010
      002D          =C'EOF'            454F46
215   1062  WLOOP   TD      =X'05'     E32011
      1076          =X'05'             05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93          LTORG
      002D          =C'EOF'            454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point.

Literal for current value of location counter

• The value of the location counter can be denoted by the literal operand =*.

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal value, as on lines 100 and 160. Recognizing such operands will complicate the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, its value and length, and the address assigned to it.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
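The Pass 1 handling of literals can be sketched as a small table class. This is a simplified illustration, not a full implementation: the class and method names are hypothetical, duplicates are detected only by comparing the defining strings, and only C'...' and X'...' literals are supported.

```python
def literal_length(name):
    """Length in bytes of a literal such as =C'EOF' or =X'05'."""
    kind, body = name[1], name[3:-1]
    return len(body) if kind == "C" else len(body) // 2


class LiteralTable:
    def __init__(self):
        self.entries = {}              # literal name -> address (None until pooled)

    def add(self, name):
        """Pass 1: record a literal operand; duplicates need no action."""
        self.entries.setdefault(name, None)

    def place_pool(self, locctr):
        """At LTORG or END: assign addresses and return the updated LOCCTR."""
        for name, address in self.entries.items():
            if address is None:
                self.entries[name] = locctr
                locctr += literal_length(name)
        return locctr
```

For example, adding =C'EOF' twice and then placing a pool at 002D assigns the single entry the address 002D and advances the location counter to 0030.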

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. The statement

+LDT #4096

can be written as

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers:

A EQU 0

X EQU 1

L EQU 2

so that RMO 0,1 can be written as RMO A,X.

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

ORG value

"value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table STAB of 100 entries, each with the following structure:

Field    Length
SYMBOL   6 bytes
VALUE    3 bytes
FLAGS    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.
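The effect of ORG on label definition can be illustrated by simulating the location counter. The sketch below is an assumption-laden illustration rather than the notes' own example: the function name, the tuple format of the statements, the label NEXT, and the reservation of 1000 bytes for STAB (100 entries of 6 + 3 + 1 bytes, taken from the table above) are all chosen for this example.

```python
def assign_addresses(statements, start=0):
    """Simulate Pass 1 address assignment for RESB, RESW and ORG.

    statements: list of (label, opcode, operand) tuples.
    Returns SYMTAB as a dict of label -> address.
    """
    symtab, locctr, saved = {}, start, start
    for label, opcode, operand in statements:
        if opcode == "ORG":
            if operand is None:
                locctr = saved             # ORG with no operand: restore LOCCTR
            else:
                saved = locctr             # remember the previous value
                locctr = symtab[operand]   # operand must already be defined
            continue
        if label:
            symtab[label] = locctr
        if opcode == "RESB":
            locctr += operand
        elif opcode == "RESW":
            locctr += 3 * operand
    return symtab


example = [
    ("STAB", "RESB", 1000),    # 100 entries of 10 bytes each
    (None, "ORG", "STAB"),     # go back to the start of the table
    ("SYMBOL", "RESB", 6),
    ("VALUE", "RESW", 1),
    ("FLAGS", "RESB", 1),
    (None, "ORG", None),       # return to the normal use of LOCCTR
    ("NEXT", "RESB", 1),       # hypothetical label following the table
]
symtab = assign_addresses(example)
```

After the run, SYMBOL, VALUE and FLAGS have the offsets 0, 6 and 9 within the first entry of STAB, and NEXT is placed after the whole table.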

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. Hence the following sequences could not be processed by an ordinary two-pass assembler.

Disallowed:

BETA EQU ALPHA

ALPHA RESW 1

Disallowed:

ALPHA EQU BETA

BETA EQU DELTA

DELTA RESW 1

Disallowed:

ORG ALPHA

BYTE1 RESB 1

BYTE2 RESB 1

BYTE3 RESB 1

ORG

ALPHA RESB 1

Allowed:

ALPHA RESW 1

BETA EQU ALPHA

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as:

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
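The pairing rule can be checked mechanically by counting relative terms with their signs. The Python sketch below is a simplification made for illustration: the function name `classify` is hypothetical, parentheses and the special term * for the location counter are not handled, and any relative symbol that appears in a multiplication or division is rejected.

```python
import re


def classify(expression, symbol_types):
    """Return 'A' (absolute), 'R' (relative) or 'error' for an expression.

    symbol_types maps each symbol to 'A' or 'R'; constants need no entry.
    """
    terms = re.findall(r"([+-]?)([^+-]+)", expression.replace(" ", ""))
    net = 0                                    # signed count of relative terms
    for sign, term in terms:
        factors = re.split(r"[*/]", term)
        relative = [f for f in factors if symbol_types.get(f) == "R"]
        if relative and len(factors) > 1:
            return "error"                     # relative term in * or /
        if relative:
            net += -1 if sign == "-" else 1
    # Fully paired terms give 0; one unpaired positive term gives 1.
    return {0: "A", 1: "R"}.get(net, "error")


TYPES = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}
```

With the symbol types from the table, BUFEND-BUFFER is classified as absolute, RETADR+3 as relative, and BUFEND+BUFFER, 100-BUFFER and 3*BUFFER as errors.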

Program blocks

The program is divided into three blocks: Default (0), CDATA (1) and CBLKS (2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large blocks of memory

The assembler directive USE followed by a block name marks the point at which the source program switches to that block. Each block has its own location counter: the first time a block is named, its LOCCTR is set to 0000, and when a block is resumed its LOCCTR continues from the value it had when the block was left.

When the listing with object code is generated, one more column is included that gives the block number of each statement. This block number field is also added to SYMTAB.

The program block table consists of the block name, block number, starting address and length.

After finding the location counter values, the assembler generates the object code and writes the object program.
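Once Pass 1 has found the length of every block, the starting addresses in the program block table follow by simple accumulation. The sketch below illustrates this; the function names are hypothetical and the block lengths used in the example are illustrative values, not taken from the notes.

```python
def build_block_table(block_lengths):
    """block_lengths: list of (name, length) in block-number order."""
    table, start = [], 0
    for number, (name, length) in enumerate(block_lengths):
        table.append({"name": name, "number": number,
                      "address": start, "length": length})
        start += length            # the next block begins where this one ends
    return table


def absolute_address(table, block_number, relative_address):
    """Convert a block-relative address from SYMTAB to a program address."""
    return table[block_number]["address"] + relative_address


# Illustrative lengths only.
table = build_block_table([("DEFAULT", 0x66), ("CDATA", 0x0B), ("CBLKS", 0x1000)])
```

With these lengths the blocks start at 0000, 0066 and 0071, and a symbol at relative address 0003 in block 1 lies at program address 0069.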


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

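Object code generation changes only for statements that refer to symbols named in EXTREF, because the assembler does not know the addresses of those symbols.

The object program needs six types of record: Header record, Define record, Refer record, Text record, Modification record and End record.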
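As an illustration of the two record types that are new for control sections, the sketch below formats a Define record and a Refer record. It assumes the usual SIC/XE object program layout (a record letter, then 6-character names, each name in a Define record followed by a 6-digit hexadecimal address); the function names and the symbol addresses are chosen for this example only.

```python
def define_record(symbols):
    """symbols: list of (name, relative_address) declared with EXTDEF."""
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols)


def refer_record(names):
    """names: list of symbols declared with EXTREF."""
    return "R" + "".join(f"{name:<6}" for name in names)


# Illustrative symbols and addresses only.
d_rec = define_record([("BUFFER", 0x0033), ("LENGTH", 0x002D)])
r_rec = refer_record(["RDREC", "WRREC"])
```

Here `d_rec` is "DBUFFER000033LENGTH00002D" and `r_rec` is "RRDREC WRREC ", with names shorter than six characters padded with blanks.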


Object Program for Control Section and Program Linking


Page 20: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - II

The above figure 23 shows the example program from figure 21 as it is might be rewritten to

take advantage of the SICXE instruction set

Indirect addressing is indicated by adding the prefix to the operand (line70)

Immediate operands are denoted with the prefix (lines 25 55133)

Instructions that refer to memory are normally assembled using either the program counter

relative or base counter relative mode

The assembler directive BASE (line 13) is used in conjunction with base relative addressing

The four byte extended instruction format is specified with the prefix + added to the

operation code in the source statement

Register-to-register instructions are used wherever possible For example the statement on

line 150 is changed from COMP ZERO to COMPR AS

Immediate and indirect addressing have also been used as much as possible

Register-to-register instructions are faster than the corresponding register-to-memory

operations because they are shorter and do not require another memory reference

While using immediate addressing the operand is already present as part of the instruction

and need not be fetched from anywhere

The use of indirect addressing often avoids the need for another instruction

Line Loc Label Opcode Operand Object Code

5 COPY START 0

10 FIRST STL RETADR

12 LDB LENGTH

13 BASE LENGTH

15 CLOOP +JSUB RDREC

20 LDA LENGTH

25 COMP 0

30 JEQ ENDFIL

35 +JSUB WRREC

40 J CLOOP

45 ENDFIL LDA EOF

LALITHAMBIGAIB APIT Page 20

SYSTEM SOFTWARE ( CS 2304) UNIT - II50 STA BUFFER

55 LDA 3

60 STA LENGTH

65 +JSUB WRREC

70 J RETADR

80 EOF BYTE CrsquoEOFrsquo

95 RETADR RESW 1

100 LENGTH RESW 1

105 BUFFER RESB 4096

110

SUBROUTINE TO READ RECORD INTO

BUFFER

115

120

125 RDREC CLEAR X

130 CLEAR A

132 CLEAR S

133 +LDT 4096

135 RLOOP TD INPUT

140 JEQ RLOOP

145 RD INPUT

150 COMPR A S

155 JEQ EXIT

160 STCH BUFFERX

165 TIXR T

170 JLT RLOOP

175 EXIT STX LENGTH

180 RSUB

185 INPUT BYTE XrsquoF1rsquo

195

SUBROUTINE TO WRITE RECORD

FROM BUFFER

200

205

LALITHAMBIGAIB APIT Page 21

SYSTEM SOFTWARE ( CS 2304) UNIT - II210 WRREC CLEAR X

212 LDT LENGTH

215 WLOOP TD OUTPUT

220 JEQ WLOOP

225 LDCH BUFFERX

230 WD OUTPUT

235 TIXR T

240 JLT WLOOP

245 RSUB

250 OUTPUT BYTE Xrsquo05rsquo

255 END FIRST

Figure 25 Program from figure 24(SICXE) with object code

221 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory

In SIC machine the memory is byte addressable Word size is 3 bytes So the size of the

memory is 212 bytes Accordingly it supports only one instruction format It has only two

registers register A and Index register Therefore the addressing modes supported by this

architecture are direct and indexed

Whereas the memory of a SICXE machine is 220 bytes (1 MB) This supports four different

types of instruction types they are

1 byte instruction

2 byte instruction

3 byte instruction

4 byte instruction

LALITHAMBIGAIB APIT Page 22

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Instructions can be

Instructions involving register to register

Instructions with one operand in memory the other in Accumulator (Single

operand instruction)

Extended instruction format

Addressing Modes are

PC-relative or Base-relative addressing op m

Indirect addressing op m

Immediate addressing op c

Extended format +op m

Index addressing op mx

register-to-register instructions

larger memory -gt multi-programming (program allocation)

Translation

1 Translations for the Instruction involving Register-Register addressing mode

During pass 1 the registers can be entered as part of the symbol table itself The value for these

registers is their equivalent numeric codes During pass 2 these values are assembled along with the

mnemonics object code If required a separate table can be created with the register names and their

equivalent numeric values

2 Translation involving Register-Memory instructions

In SICXE machine there are four instruction formats and five addressing modes For formats and

addressing modes refer chapter 1

LALITHAMBIGAIB APIT Page 23

SYSTEM SOFTWARE ( CS 2304) UNIT - IIAmong the instruction formats format -3 and format-4 instructions are Register-Memory type of

instruction One of the operand is always in a register and the other operand is in the memory The

addressing mode tells us the way in which the operand from the memory is to be fetched

There are two ways

1) Program-counter relative and

2) Base-relative

This addressing mode can be represented by either using format-3 type or format-4

type of instruction format

In format-3 the instruction has the opcode followed by a 12-bit displacement value in

the address field

Where as in format-4 the instruction contains the mnemonic code followed by a 20-

bit displacement value in the address field

1 Program-Counter Relative

a) In this usually format-3 instruction format is used

b) The instruction contains the opcode followed by a 12-bit displacement value

c) The range of displacement values are from 0 -2048 This displacement (should be small

enough to fit in a 12-bit field) value is added to the current contents of the program counter to

get the target address of the operand required by the instruction

d) This is relative way of calculating the address of the operand relative to the program counter

Hence the displacement of the operand is relative to the current program counter value

e) The following example shows how the address is calculated

LALITHAMBIGAIB APIT Page 24

SYSTEM SOFTWARE ( CS 2304) UNIT - II

2 Base-Relative Addressing Mode

a) In this mode the base register is used to mention the displacement value Therefore the target

address is

TA = (base) + displacement value

b) This addressing mode is used when the range of displacement value is not sufficient Hence

the operand is not relative to the instruction as in PC-relative addressing mode Whenever

this mode is used it is indicated by using a directive BASE The moment the assembler

encounters this directive the next instruction uses base-relative addressing mode to calculate

the target address of the operand

c) When NOBASE directive is used then it indicates the base register is no more used to

calculate the target address of the operand Assembler first chooses PC-relative when the

displacement field is not enough it uses Base-relative

LDB LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

LALITHAMBIGAIB APIT Page 25

SYSTEM SOFTWARE ( CS 2304) UNIT - II

For example:

12   0003          LDB    #LENGTH     69202D
13                 BASE   LENGTH
100  0033  LENGTH  RESW   1
105  0036  BUFFER  RESB   4096
160  104E          STCH   BUFFER,X    57C003
165  1051          TIXR   T           B850

In the above example, the directive BASE tells the assembler that the base register will contain the address of LENGTH, so base-relative addressing can be used wherever PC-relative addressing is not possible.

The LDB instruction loads the address of LENGTH, which is 0033, into the base register at execution time. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160 (STCH BUFFER,X):

BUFFER is at location (0036)16
(B) = (0033)16
disp = 0036 - 0033 = (0003)16

20   000A          LDA    LENGTH      032026
175  1056  EXIT    STX    LENGTH      134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA (a negative value, -1026 in hexadecimal)

This value lies outside the range -2048 to +2047 that fits in the 12-bit displacement field, so PC-relative addressing is not applicable and the assembler uses base-relative addressing instead: disp = 0033 - (B) = 0033 - 0033 = 000, which gives the object code 134000. Line 20, by contrast, is close enough to LENGTH to be assembled PC-relative (disp = 0033 - 000D = 026).
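The choice between the two relative modes can be sketched as follows. This is a simplified illustration under stated assumptions, not the assembler's actual code: the function assemble_format3 and its parameters are hypothetical, opcodes are passed in as numbers (STX = 10, STCH = 54, LDA = 00, LDB = 68 in hexadecimal, as in the standard SIC/XE opcode table), and only format 3 is handled.

```python
from typing import Optional


def assemble_format3(opcode: int, target: int, instr_address: int,
                     base: Optional[int] = None, indexed: bool = False,
                     n: int = 1, i: int = 1) -> str:
    """Assemble one format 3 instruction, trying PC-relative before base-relative."""
    pc = instr_address + 3          # format 3 instructions are 3 bytes long
    disp = target - pc
    if -2048 <= disp <= 2047:       # PC-relative: signed 12-bit displacement
        b, p = 0, 1
        disp &= 0xFFF
    elif base is not None and 0 <= target - base <= 4095:
        b, p = 1, 0                 # base-relative: unsigned 12-bit displacement
        disp = target - base
    else:
        raise ValueError("operand not reachable; extended format (+op) is needed")
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (int(indexed) << 3) | (b << 2) | (p << 1)  # x b p e, with e = 0
    return f"{first_byte:02X}{flags:X}{disp:03X}"


# Line 175, STX LENGTH at 1056 with (B) = 0033
print(assemble_format3(0x10, 0x0033, 0x1056, base=0x0033))
# Line 160, STCH BUFFER,X at 104E with (B) = 0033
print(assemble_format3(0x54, 0x0036, 0x104E, base=0x0033, indexed=True))
```

Passing base=None models the situation after NOBASE: an operand that is out of PC-relative range then raises an error instead of silently using the base register.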

3. Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is used as the immediate operand, the instruction is assembled as immediate with PC-relative addressing. Line 12 of the example program (LDB #LENGTH, object code 69202D) is such an instruction: the operand is the address of LENGTH, and it is assembled as the PC-relative displacement 02D.

4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of a location that contains the address of the operand. The address of that location is found using PC-relative addressing.

For example, the instruction J @RETADR transfers control to the address that is stored in location RETADR. If the address of RETADR is 0030, the displacement assembled into the instruction is 003, that is, the difference between 0030 and the program counter value.

2.2.2 Program Relocation

The need for program relocation:

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute address:

An absolute program has its starting address specified at assembly time.

Its addresses may be invalid if the program is loaded somewhere else.


Example

The statement in this example loads register A with the value stored at location 102D, assuming the program starts at location 1000.

Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the same amount as the load address of the program. Hence the address portion of the instruction must be changed so that the program can be loaded and executed at location 3000.

Some instructions undergo a change in their operand address as the program load address changes, but other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.


Example: Program Relocation

1) The diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at a new location, 5000. The address field of the JSUB instruction is modified to the new location 6036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.


The rest of the instructions need not be modified, because their operands are either

o not memory addresses (immediate addressing), or

o PC-relative or base-relative.

From the object program alone, it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader which fields are addresses.

o The object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem:

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record:

One Modification record is produced for each address to be modified.

The length is stored in half-bytes (4 bits).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.


Relocatable Object Program

In the object code, the red boxes indicate the addresses that need modification. The records at the end of the object program are the Modification records for the instructions that must change if relocation occurs. For example, M00000705 is the Modification record for the address field that begins at location 0007 and is 5 half-bytes long.
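What a loader does with such a record can be sketched as below. This is a minimal illustration, assuming the record layout described here (M, six hexadecimal digits of starting location, two hexadecimal digits of length in half-bytes); the function name apply_modification_records is hypothetical, and the six zero bytes in front of the JSUB instruction are placeholders for the instructions that precede it.

```python
def apply_modification_records(code: bytes, records, load_address: int) -> bytes:
    """Add load_address to every address field named by a Modification record."""
    image = bytearray(code)
    for rec in records:
        start = int(rec[1:7], 16)        # byte holding the leftmost bits of the field
        half_bytes = int(rec[7:9], 16)   # field length in half-bytes
        n_bytes = (half_bytes + 1) // 2
        mask = (1 << (4 * half_bytes)) - 1
        field = int.from_bytes(image[start:start + n_bytes], "big")
        # Change only the low half_bytes nibbles; with an odd length the
        # upper half of the first byte is left untouched.
        relocated = (field & ~mask) | ((field + load_address) & mask)
        image[start:start + n_bytes] = relocated.to_bytes(n_bytes, "big")
    return bytes(image)


# +JSUB RDREC (4B101036) sits at location 0006 of the program.
program = bytes(6) + bytes.fromhex("4B101036")
print(apply_modification_records(program, ["M00000705"], 0x5000)[6:].hex().upper())
print(apply_modification_records(program, ["M00000705"], 0x7420)[6:].hex().upper())
```

The two printed values correspond to the second and third relocation figures (load addresses 5000 and 7420).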

Machine-independent assembler features

These are assembler features that are not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45   001A  ENDFIL  LDA   =C'EOF'   032010
     002D          =C'EOF'         454F46
215  1062  WLOOP   TD    =X'05'    E32011
     1076          =X'05'          05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand:

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed:

93               LTORG
     002D        =C'EOF'   454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for the current value of the location counter:

• The value of the location counter can be denoted by a literal operand (=*).

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'
• 125 LDA =C'EOF'
• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal value, as on lines 100 and 160. Recognizing literal operands that have different names but the same value complicates the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

For each literal, LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address being assigned.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
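The Pass 1 handling of literals can be sketched as a small table class. This is an illustrative sketch only: the class LitTab, its method names and the helper literal_bytes are assumptions, duplicates are detected purely by comparing literal names, and only the =C'...' and =X'...' forms are supported.

```python
def literal_bytes(name: str) -> bytes:
    """Convert a literal such as =C'EOF' or =X'05' to its byte value."""
    kind, text = name[1], name[3:-1]
    return text.encode("ascii") if kind == "C" else bytes.fromhex(text)


class LitTab:
    """Literal table keyed by literal name (hypothetical structure)."""

    def __init__(self):
        self.entries = {}  # name -> value, length and address (None until pooled)

    def reference(self, name: str) -> None:
        # Pass 1: add the literal on first use, without assigning an address.
        if name not in self.entries:
            value = literal_bytes(name)
            self.entries[name] = {"value": value, "length": len(value), "address": None}

    def place_pool(self, locctr: int) -> int:
        # On LTORG or END: assign addresses and advance the location counter.
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LitTab()
littab.reference("=C'EOF'")
littab.reference("=C'EOF'")          # duplicate, stored only once
locctr = littab.place_pool(0x002D)   # LTORG reached with LOCCTR = 002D
print(format(locctr, "04X"), littab.entries["=C'EOF'"]["value"].hex().upper())
```

Keying the table on the literal name keeps the duplicate check to a single dictionary lookup, at the cost of not recognizing =C'EOF' and =X'454F46' as the same value.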

Symbol-defining statements

Assembler directives:

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values:

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

+LDT #4096

we can write

MAXLEN EQU 4096
+LDT #MAXLEN

Define mnemonic names for registers:

A EQU 0
X EQU 1
L EQU 2

so that RMO A,X is assembled as RMO 0,1.

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1
ACCUMULATOR EQU R2
INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols:

ORG value

"Value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a symbol table with the following structure:

STAB            SYMBOL    VALUE     FLAGS
(100 entries)   6 bytes   3 bytes   1 byte

In some assemblers, the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1: all terms used to specify the value of a new symbol must have been defined previously in the program. Hence the following sequences cannot be processed by an ordinary two-pass assembler.

Disallowed (ALPHA is used before it is defined):

BETA    EQU    ALPHA
ALPHA   RESW   1

Disallowed (the value of ALPHA depends on BETA, which in turn depends on DELTA, defined last):

ALPHA   EQU    BETA
BETA    EQU    DELTA
DELTA   RESW   1

Disallowed (ORG refers to ALPHA before it is defined):

        ORG    ALPHA
BYTE1   RESB   1
BYTE2   RESB   1
BYTE3   RESB   1
        ORG
ALPHA   RESB   1

Allowed (ALPHA is defined before it is used):

ALPHA   RESW   1
BETA    EQU    ALPHA
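The rule that every term of an EQU must already be defined in Pass 1 can be illustrated with a toy symbol-table builder. This is a deliberately reduced sketch: the function pass1_define is hypothetical, it understands only EQU, RESW and RESB, and an EQU operand is assumed to be either a decimal constant or a single symbol.

```python
def pass1_define(statements):
    """Build SYMTAB from (label, opcode, operand) triples, rejecting forward references."""
    symtab = {}
    locctr = 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand.isdigit():
                symtab[label] = int(operand)
            elif operand in symtab:
                symtab[label] = symtab[operand]
            else:
                # The operand has not been seen yet, so no value can be entered.
                raise ValueError(f"{label} EQU {operand}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)   # one word is 3 bytes
        elif opcode == "RESB":
            symtab[label] = locctr
            locctr += int(operand)
    return symtab


print(pass1_define([("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]))
```

Swapping the two statements makes the call raise ValueError, which is the situation an ordinary two-pass assembler cannot handle.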

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules, using the operators +, -, * and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
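The pairing rule amounts to counting signed relative terms: a net count of 0 gives an absolute expression, a net count of +1 gives a relative expression, and anything else is an error. The sketch below applies that count to the symbols in the table above. It is an illustration under assumptions: the function expression_type is hypothetical, and it handles only flat expressions without parentheses.

```python
import re


def expression_type(expr, symbol_types):
    """Classify expr as 'absolute', 'relative' or 'error'."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[-+*/]", expr)
    net_relative = 0   # relative terms counted with their signs
    sign = 1
    for idx, tok in enumerate(tokens):
        if tok == "+":
            sign = 1
        elif tok == "-":
            sign = -1
        elif tok in ("*", "/"):
            continue
        elif symbol_types.get(tok) == "R":
            # A relative term may not take part in multiplication or division.
            neighbours = tokens[max(idx - 1, 0):idx] + tokens[idx + 1:idx + 2]
            if "*" in neighbours or "/" in neighbours:
                return "error"
            net_relative += sign
    if net_relative == 0:
        return "absolute"
    if net_relative == 1:
        return "relative"
    return "error"


types = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}
for expr in ("BUFEND-BUFFER", "BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"):
    print(expr, expression_type(expr, types))
```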

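Program blocks

Program blocks are segments of code that are rearranged within a single object program unit. In the example program, the code is divided into three blocks: the default block (block number 0), CDATA (block number 1) and CBLKS (block number 2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory (a few words or less)

CBLKS (2): block for data areas that need more memory (larger blocks of storage)

The assembler directive USE followed by a block name indicates which block the following statements belong to. The assembler keeps a separate location counter for each block: LOCCTR is set to 0000 when a block is first begun, and its value is saved and restored when the program switches between blocks. When object code is generated, the listing therefore carries one more column, the block number. The block number is also added as a field to each SYMTAB entry.

The program block table consists of the block name, block number, starting address and length.

After the location counter values have been found, the assembler computes each target address (the starting address of the block plus the relative address within the block), generates the object code and writes the object program.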
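Filling in the block table at the end of Pass 1 can be sketched as follows. The sketch assumes that the blocks are laid out one after another in block-number order; the function names are hypothetical, and the block lengths used in the example (66, 0B and 1000 hexadecimal) are illustrative values, not figures taken from the text above.

```python
def assign_block_addresses(block_lengths):
    """Return a block table with number, starting address and length per block."""
    table = {}
    start = 0
    for number, (name, length) in enumerate(block_lengths):
        table[name] = {"number": number, "start": start, "length": length}
        start += length   # the next block begins where this one ends
    return table


def absolute_address(table, block, relative_address):
    """Address of a symbol: block starting address plus offset within the block."""
    return table[block]["start"] + relative_address


table = assign_block_addresses([("(default)", 0x66), ("CDATA", 0x0B), ("CBLKS", 0x1000)])
for name, row in table.items():
    print(name, row["number"], format(row["start"], "04X"), format(row["length"], "04X"))
```

In Pass 2 the same lookup turns the (block number, relative address) pair stored in SYMTAB into the address used to build the object code.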


Program Block Diagram


Control section and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for instructions that refer to EXTREF symbols changes: the assembler does not know the address of such a symbol, so the actual value has to be supplied at load time.

For the object program, six record types are needed: Header record, Define record, Refer record, Text record, Modification record and End record.
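As an illustration of the two record types that EXTDEF and EXTREF give rise to, the sketch below builds a Define record and a Refer record. The layout is an assumption stated for this sketch: each symbol name is padded to six characters, and each Define entry carries a six-digit hexadecimal address relative to the start of the control section. The function names are hypothetical; the addresses are those of BUFFER and LENGTH in the example program.

```python
def define_record(symbols):
    """Define record: 'D' followed by (name, relative address) pairs from EXTDEF."""
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols)


def refer_record(names):
    """Refer record: 'R' followed by the names listed in EXTREF."""
    return "R" + "".join(f"{name:<6}" for name in names)


print(define_record([("BUFFER", 0x0036), ("LENGTH", 0x0033)]))
print(refer_record(["RDREC", "WRREC"]))
```

The loader uses the Define records of all control sections to resolve the names that appear in their Refer records.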


Object Program for Control Section and Program Linking


50            STA     BUFFER
55            LDA     #3
60            STA     LENGTH
65            +JSUB   WRREC
70            J       @RETADR
80   EOF      BYTE    C'EOF'
95   RETADR   RESW    1
100  LENGTH   RESW    1
105  BUFFER   RESB    4096
110  .
115  .        SUBROUTINE TO READ RECORD INTO BUFFER
120  .
125  RDREC    CLEAR   X
130           CLEAR   A
132           CLEAR   S
133           +LDT    #4096
135  RLOOP    TD      INPUT
140           JEQ     RLOOP
145           RD      INPUT
150           COMPR   A,S
155           JEQ     EXIT
160           STCH    BUFFER,X
165           TIXR    T
170           JLT     RLOOP
175  EXIT     STX     LENGTH
180           RSUB
185  INPUT    BYTE    X'F1'
195  .
200  .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205  .
210  WRREC    CLEAR   X
212           LDT     LENGTH
215  WLOOP    TD      OUTPUT
220           JEQ     WLOOP
225           LDCH    BUFFER,X
230           WD      OUTPUT
235           TIXR    T
240           JLT     WLOOP
245           RSUB
250  OUTPUT   BYTE    X'05'
255           END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine, the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes (32,768 bytes). Accordingly, it supports only one instruction format. Only the accumulator (register A) and the index register (register X) take part in addressing operands, so the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, by contrast, is 2^20 bytes (1 MB). This machine supports four different instruction formats:

1-byte instruction (format 1)

2-byte instruction (format 2)

3-byte instruction (format 3)

4-byte instruction (format 4)

Instructions can be:

Instructions involving register to register

Instructions with one operand in memory and the other in the accumulator (single-operand instructions)

Extended instruction format

Addressing modes are:

PC-relative or base-relative addressing   op m
Indirect addressing                       op @m
Immediate addressing                      op #c
Extended format                           +op m
Index addressing                          op m,x

register-to-register instructions

larger memory -> multi-programming (program allocation)

Translation

1. Translation of instructions involving register-to-register addressing

During pass 1 the registers can be entered as part of the symbol table itself. The value for each register is its equivalent numeric code. During pass 2, these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.
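A minimal sketch of this translation for the 2-byte register-to-register format is shown below. The register numbers and opcodes are those of the standard SIC/XE instruction set; the table names REGISTERS and OPTAB2 and the function assemble_format2 are assumptions made for this illustration.

```python
# Register names and their numeric codes (separate register table).
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6, "PC": 8, "SW": 9}

# Opcodes of a few format 2 instructions.
OPTAB2 = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8, "RMO": 0xAC}


def assemble_format2(mnemonic, operands):
    """Assemble a format 2 instruction: opcode byte, then two 4-bit register codes."""
    regs = [r.strip() for r in operands.split(",")]
    r1 = REGISTERS[regs[0]]
    r2 = REGISTERS[regs[1]] if len(regs) > 1 else 0   # unused second field is 0
    return f"{OPTAB2[mnemonic]:02X}{r1:X}{r2:X}"


print(assemble_format2("TIXR", "T"))      # line 165 of the example program
print(assemble_format2("COMPR", "A,S"))   # line 150
```

Keeping the registers in their own table, rather than in SYMTAB, avoids clashes with user-defined symbols that happen to have the same names.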

2. Translation involving register-memory instructions

The SIC/XE machine has four instruction formats and five addressing modes. For the formats and addressing modes, refer to chapter 1.

Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways:

1) Program-counter relative and

2) Base-relative

A register-memory instruction can be assembled using either the format 3 or the format 4 instruction format.

In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4, the instruction contains the opcode followed by a 20-bit address in the address field.

1. Program-Counter Relative

a) In this mode, the format 3 instruction format is normally used.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The range of displacement values are from 0 -2048 This displacement (should be small

enough to fit in a 12-bit field) value is added to the current contents of the program counter to

get the target address of the operand required by the instruction

d) This is relative way of calculating the address of the operand relative to the program counter

Hence the displacement of the operand is relative to the current program counter value

e) The following example shows how the address is calculated

LALITHAMBIGAIB APIT Page 24

SYSTEM SOFTWARE ( CS 2304) UNIT - II

2 Base-Relative Addressing Mode

a) In this mode the base register is used to mention the displacement value Therefore the target

address is

TA = (base) + displacement value

b) This addressing mode is used when the range of displacement value is not sufficient Hence

the operand is not relative to the instruction as in PC-relative addressing mode Whenever

this mode is used it is indicated by using a directive BASE The moment the assembler

encounters this directive the next instruction uses base-relative addressing mode to calculate

the target address of the operand

c) When NOBASE directive is used then it indicates the base register is no more used to

calculate the target address of the operand Assembler first chooses PC-relative when the

displacement field is not enough it uses Base-relative

LDB LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

LALITHAMBIGAIB APIT Page 25

SYSTEM SOFTWARE ( CS 2304) UNIT - II

For example

12 0003 LDB LENGTH 69202D

13 BASE LENGTH

100 0033 LENGTH RESW 1

105 0036 BUFFER RESB 4096

160 104E STCH BUFFER X 57C003

165 1051 TIXR T B850

In the above example the use of directive BASE indicates that Base-relative addressing mode is

to be used to calculate the target address PC-relative is no longer used The value of the LENGTH is

stored in the base register

The LDB instruction loads the value of length in the base register which 0033 BASE directive

explicitly tells the assembler that it has the value of LENGTH

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 ndash 0033 = (0003)16

20 000A LDA LENGTH 032026

175 1056 EXIT STX LENGTH 134000

LALITHAMBIGAIB APIT Page 26

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Consider Line 175 If we use PC-relative

Disp = TA ndash (PC) = 0033 ndash1059 = EFDA

PC relative is no longer applicable so we try to use BASE relative addressing mode

3 Immediate Addressing Mode

In this mode no memory reference is involved If immediate mode is used the target address is the

operand itself

If the symbol is referred in the instruction as the immediate operand then it is immediate with PC-

relative mode as shown in the example below

LALITHAMBIGAIB APIT Page 27

SYSTEM SOFTWARE ( CS 2304) UNIT - II

4 Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location which

contains the address of the operand The address of this is found using PC-relative addressing mode

For example

The instruction jumps the control to the address location RETADR which in turn has the address of

the operand If address of RETADR is 0030 the target address is then 0003 as calculated above

222 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time

The system must be able to load programs into memory wherever there is room

The exact starting address of the program is not known until load time

Absolute Address

Program with starting address specified at assembly time

The address may be invalid if the program is loaded into somewhere else

LALITHAMBIGAIB APIT Page 28

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example

The above statement says that the register A is loaded with the value stored at location 102D

Suppose it is decided to load and execute the program at location 3000 instead of location 1000

Then at address 102D the required value which needs to be loaded in the register A is no more

available

The address also gets changed relative to the displacement of the program Hence we need to

make some changes in the address portion of the instruction so that we can load and execute the

program at location 3000

Apart from the instruction which will undergo a change in their operand address value as the

program load address changes There exist some parts in the program which will remain same

regardless of where the program is being loaded

Since assembler will not know actual location where the program will get loaded it cannot make

the necessary changes in the addresses used in the program However the assembler identifies

for the loader those parts of the program which need modification

An object program that has the information necessary to perform this kind of modification is

called the relocatable program

LALITHAMBIGAIB APIT Page 29

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example Program Relocation

1) The above diagram shows the concept of relocation

a) Initially the program is loaded at location 0000

b) The instruction JSUB is loaded at location 0006

c) The address field of this instruction contains 01036 which is the address of the instruction

labeled RDREC

2) The second figure shows that if the program is to be loaded at new location 5000 The address of

the instruction JSUB gets modified to new location 6036

3) Likewise the third figure shows that if the program is relocated at location 7420 the JSUB

instruction would need to be changed to 4B108456 that correspond to the new address of

RDREC

The only parts of the program that require modification at load time are those that specify

direct addresses

LALITHAMBIGAIB APIT Page 30

SYSTEM SOFTWARE ( CS 2304) UNIT - II

The rest of the instructions need not be modified

o Not a memory address (immediate addressing)

o PC-relative Base-relative

From the object program it is not possible to distinguish the address and constant

o The assembler must keep some information to tell the loader

o The object program that contains the modification record is called a relocatable

program

The way to solve the relocation problem

For an address label its address is assigned relative to the start of the program(START 0)

Produce a Modification record to store the starting location and the length of the address

field to be modified

The command for the loader must also be a part of the object program

Modification record

One modification record for each address to be modified

The length is stored in half-bytes (4 bits)

The starting location is the location of the byte containing the leftmost bits of the address

field to be modified

If the field contains an odd number of half-bytes the starting location begins in the middle of

the first byte

LALITHAMBIGAIB APIT Page 31

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modifications The object

code lines at the end are the descriptions of the modification records for those instructions which

need change if relocation occurs M00000705 is the modification suggested for the statement at

location 0007 and requires modification 5-half bytes

Machine dependent assembler featuresAssembler features not closely related to machine architecture

bull Literalsbull Symbol-defining statementsbull Expressionsbull Program blocksbull Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of

the instruction that uses it Such an operand is called a literal

45 001A ENDFIL LDA =ClsquoEOFrsquo 003210

002D =ClsquoEOFrsquo 454F46

LALITHAMBIGAIB APIT Page 32

SYSTEM SOFTWARE ( CS 2304) UNIT - II215 1062 WLOOP TD =Xlsquo05rsquo E32011

1076 =Xlsquo05rsquo 05

bull In this assembler language notation a literal is identified with the prefix= which is followed by a

specification of the literal value

The difference between a literal and an immediate operand

With immediate addressing the operand value is assembled as a part of the machine

instruction

With a literal the assembler generates the specified value as a constant at some other

memory location The address of this generated constant is used as the target address for the

machine instruction

Literal pool

All of the literal operands used in a program are gathered together into one or more literal

pools

Where the literal pool should be placed

93 LTORG

002D =ClsquoEOFrsquo 454F46

1048698 The assembler directive LTORG tells the assembler to generate a literal pool here

Literal for current value of location counter

1048698 The value of the location counter can be denoted by a literal operand

bull BASE

bull LDB =

Handling duplicate literal operands

The assembler should avoid storing duplicate literals

The easiest way to recognize duplicate literals is by comparison of the character

strings defining them

bull 100 LDA =ClsquoEOFrsquo

LALITHAMBIGAIB APIT Page 33

SYSTEM SOFTWARE ( CS 2304) UNIT - IIbull 125 LDA =ClsquoEOFrsquo

bull 160 LDA =Xlsquo454F46rsquo Literal operands with different

literal names may have the same

literal values

Recognizing literal operands that have different literal names but have the same literal values will

complicate the design of an assembler

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB)

LITTAB contains Literal NameValue and Length

LITTAB is often organized as a hash table using the literal name or value as the key

Processing literal operands

Pass 1

bull For each recognized literal operand search LITTAB If the literal is already present in the

table no action is need if it is not present the literal is added to LITTAB without assigning

its address

bull When a LTORG statement is encountered or the end of the program the assembler makes a

scan of LITTAB and assigns an address to each literal

bull Update the location counter to reflect the number of bytes occupied by each literal

Pass 2

bull Search LITTAB for each literal operand encountered

bull The data values specified by the literals in each literal pool are inserted at the appropriate

places in the object program

bull In the same way as these values generated by BYTE or WORD statements

bull If a literal value represents an address in the program the assembler must generate the

appropriate Modification record

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

LALITHAMBIGAIB APIT Page 34

SYSTEM SOFTWARE ( CS 2304) UNIT - II Most assemblers provide an assembler directive that allows the programmer to define

symbols and specify their values

Symbol EQU value

When the assembler encounters the EQU statement it enters ldquosymbolrdquo into SYMTAB with

the value of ldquosymbolrdquo

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric

values

+LDT 4096

MAXLEN EQU 4096

+LDT MAXLEN

Define mnemonic names for registers

A EQU 0

X EQU 1

L EQU 2

RMO 01

Establish and use names that reflect the logical function of the registers in the program

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols

ORG value

ldquoValuerdquo is a constant or an expression involving constants and previously defined

symbols

When this statement is encountered the assembler resets its location counter (LOCCTR) to

the specified value

Use ORG for label definition

Suppose that we want to define a table with the following structure

STAB SYMBOL VALUE FLAGS

100 entries 6 bytes 3 bytes 1byte

LALITHAMBIGAIB APIT Page 35

SYSTEM SOFTWARE ( CS 2304) UNIT - II

In some assemblers the previous value of LOCCTR is automatically remembered so we can

write

ORG

to return to the normal use of LOCCTR

Restrictions of EQU and ORG in an ordinary two-pass assembler For an ordinary two-pass assembler all symbols must be defined during Pass 1 Hence the

following sequences could not be processed by an ordinary two-pass assembler

All terms used to specify the value of the new symbol must have been defined previously in

the program

BETA EQU ALPHA

ALPHA RESW 1

ALPHA EQU BETA

BETA EQU DELTA

DELTA RESW 1

disallowed

ORG ALPHA

BYTE1 RESB 1

BYTE2 RESB 1

BYTE3 RESB 1

ORG

ALPHA RESB 1

disallowed

ALPHA RESW 1

BETA EQU ALPHA

Allowed

LALITHAMBIGAIB APIT Page 36

disallowed

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Expressions Most assemblers allow the use of expressions whenever a single operand such as a label or

literal is permitted

bull Each such expression must be evaluated by the assembler to produce a single

operand address or value

Assemblers generally allow arithmetic expressions formed according to the normal rule using

the operators +- and

bull Individual terms in the expression may be

bull constants

bull user-defined symbols or

bull special terms

bull The most common special term is the current value of the location

counter (often designated by )

Types of terms

Absolute terms - The value of an absolute term is independent of program location

Relative terms - The value of a relative term is dependent on the beginning address of the

program

Types of expressions

By the type of value produced, expressions can be classified as follows.

1 Absolute expressions

• The value of an absolute expression is independent of the program location.
• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term can enter into a multiplication or division operation.
• e.g. MAXLEN EQU BUFEND-BUFFER

2 Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.
• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term can enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol    Type    Value
MAXLEN    A       1000
BUFEND    R       1036
BUFFER    R       0036
RETADR    R       0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
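The pairing rule can be checked mechanically. The following Python sketch is one possible way to do it, not a definitive implementation: the function name `expression_type` is hypothetical, the symbol table repeats the entries above (values read as hexadecimal), and only the operators +, -, *, and / without parentheses are handled. It counts relative terms with their signs; a net count of 0 means absolute, a net count of +1 means relative, and anything else is an error.

```python
import re

SYMTAB = {  # symbol: (type, value); values are hexadecimal in the table
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def expression_type(expr):
    """Return 'A', 'R', or 'error' for an expression without parentheses."""
    net_relative = 0
    # Split into signed terms such as ('', 'BUFEND') and ('-', 'BUFFER').
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        factors = re.split(r"[*/]", term)
        relative = [f for f in factors if SYMTAB.get(f, ("A",))[0] == "R"]
        if relative and len(factors) > 1:
            return "error"           # relative term inside * or /
        if relative:
            net_relative += -1 if sign == "-" else 1
    return {0: "A", 1: "R"}.get(net_relative, "error")


for text in ("BUFEND-BUFFER", "RETADR", "BUFEND+BUFFER",
             "100-BUFFER", "3*BUFFER"):
    print(text, "->", expression_type(text))
```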

Program blocks

The program is divided into three blocks: Default (0), CDATA (1), and CBLKS (2).

Default (0): block for the executable (mnemonic) instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large blocks of memory

Whenever a new block begins, the programmer writes USE followed by the block name. The first time a block is named, the assembler sets the LOCCTR for that block to 0000. When object code is generated, one more column, the block number, is included in the listing. This field is also added to SYMTAB.

The program block table consists of the block name, block number, starting address, and length.

After the location counter values have been found, the assembler generates the object code and writes the object program.
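How the block table is filled in can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical and the block lengths are made-up example values, not taken from a listing in these notes. Each block starts where the previous one ends, and the final address of a symbol is its block's starting address plus its address relative to the block.

```python
def assign_block_addresses(block_lengths, program_start=0):
    """Return block table rows (name, number, start, length) in order."""
    table, address = [], program_start
    for number, (name, length) in enumerate(block_lengths.items()):
        table.append((name, number, address, length))
        address += length
    return table


def absolute_address(table, block_number, relative_address):
    """Address of a symbol: start of its block plus its relative address."""
    start = next(row[2] for row in table if row[1] == block_number)
    return start + relative_address


# Illustrative block lengths in hexadecimal (assumed values).
lengths = {"(default)": 0x0066, "CDATA": 0x000B, "CBLKS": 0x1000}
table = assign_block_addresses(lengths)
for name, number, start, length in table:
    print(f"{name:<10}{number:>2}  {start:04X}  {length:04X}")

# A symbol at relative address 0003 in block 1 (CDATA):
print(f"{absolute_address(table, 1, 0x0003):04X}")
```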


Program Block Diagram


Control section and program linking

Control section

i A control section is a part of the program that maintains its identity after assembly.

ii Each control section can be loaded and relocated independently of the others.

iii Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.


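Only the object code generated for instructions that refer to EXTREF symbols changes, because the assembler does not know the addresses of these symbols.

The object program needs six types of records: the Header record, Define record, Refer record, Text record, Modification record, and End record.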
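The three records that carry linking information can be sketched in Python. The notes list only the record names, so the column layout used here is an assumption based on the commonly described SIC/XE object format (6-character names, 6-digit hexadecimal addresses, a 2-digit length in half-bytes, and a + or - flag). The function names and example addresses are hypothetical.

```python
def define_record(symbols):
    """'D' followed by (6-char name, 6-hex-digit relative address) pairs."""
    return "D" + "".join(f"{name:<6}{addr:06X}" for name, addr in symbols)


def refer_record(names):
    """'R' followed by the 6-character names of external references."""
    return "R" + "".join(f"{name:<6}" for name in names)


def modification_record(start, half_bytes, sign, symbol):
    """'M', start address, field length in half-bytes, flag, symbol name."""
    return f"M{start:06X}{half_bytes:02X}{sign}{symbol}"


# Example values are illustrative only.
print(define_record([("BUFFER", 0x33), ("BUFEND", 0x1033), ("LENGTH", 0x2D)]))
print(refer_record(["RDREC", "WRREC"]))
print(modification_record(0x4, 5, "+", "RDREC"))
```

In this layout the Modification record tells the loader to add the address of the named external symbol to the field, which is why the assembler can leave the address field of such instructions as zero.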


Object Program for Control Section and Program Linking


210   WRREC    CLEAR   X
212            LDT     LENGTH
215   WLOOP    TD      OUTPUT
220            JEQ     WLOOP
225            LDCH    BUFFER,X
230            WD      OUTPUT
235            TIXR    T
240            JLT     WLOOP
245            RSUB
250   OUTPUT   BYTE    X'05'
255            END     FIRST

Figure 2.5 Program from Figure 2.4 (SIC/XE) with object code

2.2.1 Instruction Formats and Addressing Modes

The instruction formats depend on the memory organization and the size of the memory.

In the SIC machine the memory is byte addressable and the word size is 3 bytes. The size of the memory is 2^15 bytes. Accordingly it supports only one instruction format. Instructions work with the accumulator (register A) and the index register (X); therefore the addressing modes supported by this architecture are direct and indexed.

The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). It supports four different instruction formats:

1-byte instruction

2-byte instruction

3-byte instruction

4-byte instruction


Instructions can be:

• Instructions involving register to register
• Instructions with one operand in memory and the other in the accumulator (single-operand instructions)
• Extended instruction format

Addressing modes are:

PC-relative or base-relative addressing    op m
Indirect addressing                        op @m
Immediate addressing                       op #c
Extended format                            +op m
Index addressing                           op m,x

The assembler must also handle register-to-register instructions and the larger memory, which makes multiprogramming (and hence program relocation) possible.

Translation

1 Translation of instructions involving register-to-register addressing

During Pass 1, the registers can be entered as part of the symbol table itself. The value for each register is its equivalent numeric code. During Pass 2, these values are assembled along with the mnemonic's object code. If required, a separate table can be created with the register names and their equivalent numeric values.
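The register-to-register translation just described can be sketched as a table lookup. This Python fragment is only an illustration: the function name `assemble_format2` and the table names are hypothetical, the register numbers and opcodes are the standard SIC/XE ones, and only a few mnemonics are included.

```python
# Separate register table: register name -> numeric code.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}

# A few 2-byte (register-to-register) opcodes.
OPTAB_FORMAT2 = {"CLEAR": 0xB4, "TIXR": 0xB8, "COMPR": 0xA0, "RMO": 0xAC}


def assemble_format2(mnemonic, r1, r2=None):
    """Return the 4 hex digits: opcode byte, then one digit per register."""
    opcode = OPTAB_FORMAT2[mnemonic]
    n1 = REGISTERS[r1]
    n2 = REGISTERS[r2] if r2 else 0      # unused second register is 0
    return f"{opcode:02X}{n1:X}{n2:X}"


print(assemble_format2("TIXR", "T"))       # line 165 of the program
print(assemble_format2("CLEAR", "X"))
print(assemble_format2("COMPR", "A", "S"))
```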

2 Translation involving register-memory instructions

In the SIC/XE machine there are four instruction formats and five addressing modes. For formats and addressing modes, refer to Chapter 1.


Among the instruction formats, format 3 and format 4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways:

1) Program-counter relative and

2) Base-relative

Register-memory instructions can be encoded using either the format 3 or the format 4 instruction format.

In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format 4, the instruction contains the opcode followed by a 20-bit address field, which is large enough to hold a full memory address.

1 Program-Counter Relative

a) This mode normally uses the format 3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed value in the range -2048 to 2047, so it must be small enough to fit in the 12-bit field. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter, which already points to the next instruction when the displacement is applied.

e) For example, for line 20 of the program (LDA LENGTH at location 000A, with LENGTH at 0033), (PC) = 000D and disp = 0033 - 000D = 026.
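The displacement calculation can be sketched in Python. This is a minimal illustration: the function name `pc_relative_disp` is hypothetical, the first call uses line 20 of the program (LDA LENGTH at location 000A, with LENGTH at 0033), and the second call uses a made-up backward reference to show how a negative displacement is stored in 12-bit two's complement form.

```python
def pc_relative_disp(target, pc):
    """Return the 12-bit field for TA - (PC), or None if it does not fit."""
    disp = target - pc
    if -2048 <= disp <= 2047:
        return disp & 0xFFF          # two's complement in 12 bits
    return None


# LDA LENGTH at 000A: (PC) = 000D after the 3-byte instruction is fetched.
print(f"{pc_relative_disp(0x0033, 0x000D):03X}")

# Hypothetical backward reference: target 0006, (PC) = 001A.
print(f"{pc_relative_disp(0x0006, 0x001A):03X}")

# A target too far away cannot be encoded PC-relative.
print(pc_relative_disp(0x0033, 0x1059))
```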


2 Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register (B). Therefore the target address is

TA = (base) + displacement value

b) This addressing mode is used when the target is too far away for the range of the PC-relative displacement. Hence the operand is not relative to the instruction, as it is in PC-relative addressing mode. The programmer indicates that the base register is available by using the directive BASE. From the moment the assembler encounters this directive, it may use base-relative addressing to calculate the target address of the operand in the instructions that follow.

c) When the NOBASE directive is used, it indicates that the base register is no longer available for calculating the target address of the operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE


For example:

Line  Loc   Source statement            Object code
12    0003          LDB    #LENGTH      69202D
13                  BASE   LENGTH
100   0033  LENGTH  RESW   1
105   0036  BUFFER  RESB   4096
160   104E          STCH   BUFFER,X     57C003
165   1051          TIXR   T            B850

In the above example, the directive BASE tells the assembler that base-relative addressing may be used to calculate the target address whenever PC-relative addressing is not applicable. The LDB instruction loads the address of LENGTH, 0033, into the base register, and the BASE directive explicitly tells the assembler that the base register holds this value.

For line 160:

BUFFER is at location (0036)_16

(B) = (0033)_16

disp = 0036 - 0033 = (0003)_16

Line  Loc   Source statement            Object code
20    000A          LDA    LENGTH       032026
175   1056  EXIT    STX    LENGTH       134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = EFDA

This value (-1026 in hexadecimal) does not fit in the 12-bit displacement field, so PC-relative addressing is not applicable and the assembler uses base-relative addressing instead: disp = TA - (B) = 0033 - 0033 = 0000, which gives the object code 134000.
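The choice between the two modes, and the resulting object code, can be sketched in Python. This is a simplified illustration, not the complete assembler logic: the function name `assemble_format3` is hypothetical, the opcode values and the flag layout (n, i, x, b, p, e) are the standard SIC/XE ones, and extended format and indirect addressing are left out. The calls reproduce the object codes of lines 20, 175, 160, and 12 shown above.

```python
def assemble_format3(opcode, target, pc, base=None, indexed=False,
                     immediate=False):
    """Return 6 hex digits; try PC-relative first, then base-relative."""
    n, i = (0, 1) if immediate else (1, 1)
    x = 1 if indexed else 0
    disp = target - pc
    if -2048 <= disp <= 2047:
        b, p = 0, 1                              # PC-relative
    elif base is not None and 0 <= target - base <= 4095:
        b, p, disp = 1, 0, target - base         # base-relative
    else:
        raise ValueError("displacement out of range; format 4 is needed")
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1)       # e = 0 for format 3
    return f"{first_byte:02X}{flags:X}{disp & 0xFFF:03X}"


LENGTH, BUFFER = 0x0033, 0x0036
print(assemble_format3(0x00, LENGTH, pc=0x000D))                 # LDA LENGTH
print(assemble_format3(0x10, LENGTH, pc=0x1059, base=LENGTH))    # STX LENGTH
print(assemble_format3(0x54, BUFFER, pc=0x1051, base=LENGTH,
                       indexed=True))                            # STCH BUFFER,X
print(assemble_format3(0x68, LENGTH, pc=0x0006, immediate=True)) # LDB #LENGTH
```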

3 Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address is the operand itself.

If a symbol is referred to in the instruction as the immediate operand, then the mode is immediate combined with PC-relative. For example, line 12 of the program, LDB #LENGTH, is assembled as 69202D, where the displacement is 0033 - 0006 = 02D.


4 Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode.

For example:

The instruction transfers control through the location RETADR, which in turn holds the address of the operand. If the address of RETADR is 0030, the displacement stored in the instruction is then 0003, as calculated in the example.

2.2.2 Program Relocation

The need for program relocation

• It is desirable to load and run several programs at the same time.
• The system must be able to load programs into memory wherever there is room.
• The exact starting address of the program is not known until load time.

Absolute address

• An absolute program has its starting address specified at assembly time.
• The addresses in it may be invalid if the program is loaded somewhere else.


Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address of the operand changes by the same amount as the starting address of the program. Hence we need to make some changes in the address portion of the instruction so that we can load and execute the program at location 3000.

Some instructions undergo a change in their operand address value as the program load address changes. Other parts of the program remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.


Example: Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The instruction JSUB is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at the new location 5000. The address field of the JSUB instruction is modified to the new address 6036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC (7420 + 1036 = 8456).

The only parts of the program that require modification at load time are those that specify direct addresses.


The rest of the instructions need not be modified:

o Operands that are not memory addresses (immediate addressing)

o PC-relative and base-relative operands

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader which fields are addresses.

o The object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem

• For an address label, its address is assigned relative to the start of the program (START 0).
• Produce a Modification record to store the starting location and the length of the address field to be modified.
• This command for the loader must also be a part of the object program.

Modification record

• There is one Modification record for each address to be modified.
• The length is stored in half-bytes (4 bits each).
• The starting location is the location of the byte containing the leftmost bits of the address field to be modified.
• If the field contains an odd number of half-bytes, it is assumed to begin in the middle of the first byte at the starting location.


Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The object code lines at the end are the Modification records for those instructions that need to change if relocation occurs. M00000705 is the Modification record for the address field that starts at location 000007 and is 5 half-bytes long.
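What the loader does with such a record can be sketched in Python. This is a minimal illustration under stated assumptions: the function name `apply_modification` is hypothetical, the program image is held in a `bytearray` assembled for address 0, and the record has the layout described above (6-digit starting location followed by a 2-digit length in half-bytes). The example uses the JSUB instruction 4B101036 at location 0006.

```python
def apply_modification(memory, record, load_address):
    """Add load_address to the field named by an 'M' record, in place."""
    start = int(record[1:7], 16)             # location of the first byte
    half_bytes = int(record[7:9], 16)        # field length in half-bytes
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(memory[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1
    # For an odd length the field begins in the middle of the first byte,
    # so the leading half-byte is kept unchanged.
    new_value = (field & ~mask) | ((field + load_address) & mask)
    memory[start:start + nbytes] = new_value.to_bytes(nbytes, "big")


program = bytearray(10)
program[6:10] = bytes.fromhex("4B101036")    # JSUB RDREC at location 0006

at_5000 = bytearray(program)
apply_modification(at_5000, "M00000705", 0x5000)
print(at_5000[6:10].hex().upper())

at_7420 = bytearray(program)
apply_modification(at_7420, "M00000705", 0x7420)
print(at_7420[6:10].hex().upper())
```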

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A  ENDFIL  LDA   =C'EOF'    003210
      002D  *       =C'EOF'          454F46

215   1062  WLOOP   TD    =X'05'     E32011
      1076  *       =X'05'           05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93                  LTORG
      002D  *       =C'EOF'          454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand:

  BASE  *
  LDB   =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them:

• 100 LDA =C'EOF'
• 125 LDA =C'EOF'

Literal operands with different literal names may have the same literal value:

• 160 LDA =X'454F46'

Recognizing literal operands that have different literal names but the same literal value will complicate the design of an assembler.

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

LITTAB contains the literal name, value, and length, together with the address assigned to the literal.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.
• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that has not yet been placed.
• Update the location counter to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.
• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.
• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
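The Pass 1 handling of literals can be sketched in Python. This is a minimal model under stated assumptions: the class `LitTab`, its methods, and the helper `literal_bytes` are hypothetical names, duplicates are recognized by comparing literal names only (as described above), and a plain dictionary stands in for the hash table.

```python
def literal_bytes(name):
    """Convert =C'...' or =X'...' to the bytes that the literal denotes."""
    kind, body = name[1], name[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


class LitTab:
    def __init__(self):
        self.entries = {}    # name -> {"value", "length", "address"}

    def add(self, name):
        # Pass 1: record a literal once; its address is assigned later.
        if name not in self.entries:
            value = literal_bytes(name)
            self.entries[name] = {"value": value, "length": len(value),
                                  "address": None}

    def place_pool(self, locctr):
        # LTORG or end of program: place every literal not yet placed
        # and return the updated location counter.
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LitTab()
littab.add("=C'EOF'")
littab.add("=C'EOF'")                 # duplicate name: no new entry
after_ltorg = littab.place_pool(0x002D)

littab.add("=X'05'")
after_end = littab.place_pool(0x1076)

for name, entry in littab.entries.items():
    print(name, f"{entry['address']:04X}", entry["value"].hex().upper())
```

With the addresses used in the listing above, =C'EOF' is placed at 002D with value 454F46 and =X'05' at 1076 with value 05.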

Symbol-defining statements

Assembler directives:

• EQU
• ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values:

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

+LDT #4096

we can write

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers:

A EQU 0

X EQU 1

L EQU 2

With these definitions, an instruction such as RMO A,X is processed as RMO 0,1.

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to assign values to symbols indirectly:

ORG value

"Value" is a constant or an expression involving constants and previously defined symbols.


Page 23: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Instructions can be

Instructions involving register to register

Instructions with one operand in memory the other in Accumulator (Single

operand instruction)

Extended instruction format

Addressing Modes are

PC-relative or Base-relative addressing op m

Indirect addressing op m

Immediate addressing op c

Extended format +op m

Index addressing op mx

register-to-register instructions

larger memory -gt multi-programming (program allocation)

Translation

1 Translations for the Instruction involving Register-Register addressing mode

During pass 1 the registers can be entered as part of the symbol table itself The value for these

registers is their equivalent numeric codes During pass 2 these values are assembled along with the

mnemonics object code If required a separate table can be created with the register names and their

equivalent numeric values

2 Translation involving Register-Memory instructions

In SICXE machine there are four instruction formats and five addressing modes For formats and

addressing modes refer chapter 1

LALITHAMBIGAIB APIT Page 23

SYSTEM SOFTWARE ( CS 2304) UNIT - IIAmong the instruction formats format -3 and format-4 instructions are Register-Memory type of

instruction One of the operand is always in a register and the other operand is in the memory The

addressing mode tells us the way in which the operand from the memory is to be fetched

There are two ways

1) Program-counter relative and

2) Base-relative

This addressing mode can be represented by either using format-3 type or format-4

type of instruction format

In format-3 the instruction has the opcode followed by a 12-bit displacement value in

the address field

Where as in format-4 the instruction contains the mnemonic code followed by a 20-

bit displacement value in the address field

1 Program-Counter Relative

a) In this usually format-3 instruction format is used

b) The instruction contains the opcode followed by a 12-bit displacement value

c) The range of displacement values are from 0 -2048 This displacement (should be small

enough to fit in a 12-bit field) value is added to the current contents of the program counter to

get the target address of the operand required by the instruction

d) This is relative way of calculating the address of the operand relative to the program counter

Hence the displacement of the operand is relative to the current program counter value

e) The following example shows how the address is calculated

LALITHAMBIGAIB APIT Page 24

SYSTEM SOFTWARE ( CS 2304) UNIT - II

2 Base-Relative Addressing Mode

a) In this mode the base register is used to mention the displacement value Therefore the target

address is

TA = (base) + displacement value

b) This addressing mode is used when the range of displacement value is not sufficient Hence

the operand is not relative to the instruction as in PC-relative addressing mode Whenever

this mode is used it is indicated by using a directive BASE The moment the assembler

encounters this directive the next instruction uses base-relative addressing mode to calculate

the target address of the operand

c) When NOBASE directive is used then it indicates the base register is no more used to

calculate the target address of the operand Assembler first chooses PC-relative when the

displacement field is not enough it uses Base-relative

LDB LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

LALITHAMBIGAIB APIT Page 25

SYSTEM SOFTWARE ( CS 2304) UNIT - II

For example

12 0003 LDB LENGTH 69202D

13 BASE LENGTH

100 0033 LENGTH RESW 1

105 0036 BUFFER RESB 4096

160 104E STCH BUFFER X 57C003

165 1051 TIXR T B850

In the above example the use of directive BASE indicates that Base-relative addressing mode is

to be used to calculate the target address PC-relative is no longer used The value of the LENGTH is

stored in the base register

The LDB instruction loads the value of length in the base register which 0033 BASE directive

explicitly tells the assembler that it has the value of LENGTH

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 ndash 0033 = (0003)16

20 000A LDA LENGTH 032026

175 1056 EXIT STX LENGTH 134000

LALITHAMBIGAIB APIT Page 26

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Consider Line 175 If we use PC-relative

Disp = TA ndash (PC) = 0033 ndash1059 = EFDA

PC relative is no longer applicable so we try to use BASE relative addressing mode

3 Immediate Addressing Mode

In this mode no memory reference is involved If immediate mode is used the target address is the

operand itself

If the symbol is referred in the instruction as the immediate operand then it is immediate with PC-

relative mode as shown in the example below

LALITHAMBIGAIB APIT Page 27

SYSTEM SOFTWARE ( CS 2304) UNIT - II

4 Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location which

contains the address of the operand The address of this is found using PC-relative addressing mode

For example

The instruction jumps the control to the address location RETADR which in turn has the address of

the operand If address of RETADR is 0030 the target address is then 0003 as calculated above

222 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time

The system must be able to load programs into memory wherever there is room

The exact starting address of the program is not known until load time

Absolute Address

Program with starting address specified at assembly time

The address may be invalid if the program is loaded into somewhere else

LALITHAMBIGAIB APIT Page 28

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example

The above statement says that the register A is loaded with the value stored at location 102D

Suppose it is decided to load and execute the program at location 3000 instead of location 1000

Then at address 102D the required value which needs to be loaded in the register A is no more

available

The address also gets changed relative to the displacement of the program Hence we need to

make some changes in the address portion of the instruction so that we can load and execute the

program at location 3000

Apart from the instruction which will undergo a change in their operand address value as the

program load address changes There exist some parts in the program which will remain same

regardless of where the program is being loaded

Since assembler will not know actual location where the program will get loaded it cannot make

the necessary changes in the addresses used in the program However the assembler identifies

for the loader those parts of the program which need modification

An object program that has the information necessary to perform this kind of modification is

called the relocatable program

LALITHAMBIGAIB APIT Page 29

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example Program Relocation

1) The above diagram shows the concept of relocation

a) Initially the program is loaded at location 0000

b) The instruction JSUB is loaded at location 0006

c) The address field of this instruction contains 01036 which is the address of the instruction

labeled RDREC

2) The second figure shows that if the program is to be loaded at new location 5000 The address of

the instruction JSUB gets modified to new location 6036

3) Likewise the third figure shows that if the program is relocated at location 7420 the JSUB

instruction would need to be changed to 4B108456 that correspond to the new address of

RDREC

The only parts of the program that require modification at load time are those that specify

direct addresses

LALITHAMBIGAIB APIT Page 30

SYSTEM SOFTWARE ( CS 2304) UNIT - II

The rest of the instructions need not be modified, because their operands are either

o not a memory address (immediate addressing), or

o specified using PC-relative or base-relative addressing.

From the object program alone it is not possible to distinguish an address from a constant.

o The assembler must therefore keep some information to tell the loader which fields hold addresses.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

There is one Modification record for each address to be modified.

The length is stored in half-bytes (4 bits).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.


Relocatable Object Program

In the object code above, the red boxes mark the addresses that need modification. The lines at the end of the object program are the Modification records for those instructions that must change if relocation occurs. For example, M00000705 is the Modification record for the instruction whose address field starts at location 000007; the field to be modified is 5 half-bytes long.
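As a rough illustration of how a loader might act on such a record, the following Python sketch parses a Modification record and patches a program image. The function names and the use of a bytearray as the program image are assumptions for this example; the record layout (the letter M, six hex digits of starting location, two hex digits of length in half-bytes) is read off the record M00000705 shown above.

```python
def parse_modification_record(record: str) -> tuple[int, int]:
    """Return (starting location, length in half-bytes) from an M record."""
    if not record.startswith("M") or len(record) < 9:
        raise ValueError(f"not a Modification record: {record!r}")
    return int(record[1:7], 16), int(record[7:9], 16)


def apply_modification(image: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field described by the record."""
    start, half_bytes = parse_modification_record(record)
    n_bytes = (half_bytes + 1) // 2                # bytes that hold the field
    field = int.from_bytes(image[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1             # low-order half-bytes only
    # With an odd length the field starts in the middle of the first byte,
    # so the upper half-byte is preserved by masking.
    updated = (field & ~mask) | ((field + load_address) & mask)
    image[start:start + n_bytes] = updated.to_bytes(n_bytes, "big")


image = bytearray(16)
image[6:10] = bytes.fromhex("4B101036")            # JSUB at relative address 0006
apply_modification(image, "M00000705", 0x5000)
```

After the call, the four bytes at offset 6 hold 4B106036, which matches the relocated JSUB instruction described in the relocation example.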

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45    001A   ENDFIL   LDA   =C'EOF'    032010
      002D   *        =C'EOF'          454F46

215   1062   WLOOP    TD    =X'05'     E32011
      1076   *        =X'05'           05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93                    LTORG
      002D   *        =C'EOF'          454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand, written =*:

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

Literal operands with different literal names may have the same literal value:

• 160 LDA =X'454F46'

Recognizing literal operands that have different literal names but the same literal value complicates the design of an assembler.
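The difference between comparing names and comparing values can be illustrated with a small Python sketch. The helper name literal_value is hypothetical, and it handles only the C'...' and X'...' forms used in these notes; it is an illustration, not part of any particular assembler.

```python
def literal_value(name: str) -> bytes:
    """Return the bytes denoted by a literal such as =C'EOF' or =X'454F46'."""
    kind, body = name[1], name[3:-1]    # skip '=', the type letter and the quotes
    if kind == "C":
        return body.encode("ascii")     # character literal
    if kind == "X":
        return bytes.fromhex(body)      # hexadecimal literal
    raise ValueError(f"unsupported literal: {name}")


# Comparing names treats these two operands as different literals ...
different_names = "=C'EOF'" != "=X'454F46'"
# ... although the constants they generate are identical.
same_value = literal_value("=C'EOF'") == literal_value("=X'454F46'")
```

An assembler that compares only the defining strings would store two copies of the same three bytes here; detecting the match requires evaluating each literal first.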

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

Each LITTAB entry contains the literal name, the operand value and length, and the address assigned to the literal during Pass 1.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each literal operand recognized, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address being assigned.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered to obtain its address.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as if these values had been generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
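The Pass 1 steps above can be sketched in Python as follows. The class and method names (LiteralTable, add, place_pool) are assumptions for illustration, the table is keyed by literal name, and the caller is assumed to supply the literal's value as bytes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Literal:
    name: str                       # e.g. "=C'EOF'"
    value: bytes                    # constant to be generated
    address: Optional[int] = None   # assigned when a pool is placed


class LiteralTable:
    def __init__(self) -> None:
        self.entries: Dict[str, Literal] = {}

    def add(self, name: str, value: bytes) -> None:
        """Pass 1: enter a literal once, without assigning an address."""
        if name not in self.entries:
            self.entries[name] = Literal(name, value)

    def place_pool(self, locctr: int) -> int:
        """On LTORG or end of program: assign addresses, return new LOCCTR."""
        for literal in self.entries.values():
            if literal.address is None:
                literal.address = locctr
                locctr += len(literal.value)
        return locctr


littab = LiteralTable()
littab.add("=C'EOF'", b"EOF")
littab.add("=C'EOF'", b"EOF")             # duplicate: no new entry
locctr_after_ltorg = littab.place_pool(0x002D)
littab.add("=X'05'", b"\x05")
locctr_at_end = littab.place_pool(0x1076)
```

The addresses 002D and 1076 are taken from the listing lines quoted in the Literals section; a Python dict already behaves as the hash table mentioned above.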

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

symbol EQU value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the value specified by the operand.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

+LDT #4096

we can write

MAXLEN EQU 4096

+LDT #MAXLEN

Define mnemonic names for registers, so that an instruction written with register names is assembled with the register numbers, as in RMO 0,1.

A EQU 0

X EQU 1

L EQU 2

Establish and use names that reflect the logical function of the registers in the program.

BASE EQU R1

ACCUMULATOR EQU R2

INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

ORG value

"Value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table, STAB, with 100 entries and the following structure:

Field     Size
SYMBOL    6 bytes
VALUE     3 bytes
FLAGS     1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

with no operand to return to the normal use of LOCCTR.
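One way to picture what ORG does for label definition is the Python sketch below. It simulates resetting LOCCTR to the start of the table and then reserving each field in turn, so that the field names receive addresses inside the first entry. The function name, the table address 0x1000, and the tuple format are assumptions; the field sizes are the ones listed above.

```python
def define_table_fields(table_address, fields):
    """Simulate 'ORG table' followed by one RESB per field.

    Returns the symbol values that would be entered into SYMTAB and the
    length of one table entry.
    """
    symtab = {}
    locctr = table_address          # effect of ORG: reset LOCCTR
    for name, size in fields:
        symtab[name] = locctr       # label gets the current LOCCTR value
        locctr += size              # RESB size
    return symtab, locctr - table_address


field_addresses, entry_length = define_table_fields(
    0x1000, [("SYMBOL", 6), ("VALUE", 3), ("FLAGS", 1)]
)
```

After the fields are defined, LOCCTR would be restored to its previous value so that the rest of the program is assembled normally.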

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. Hence the sequences marked "disallowed" below could not be processed by an ordinary two-pass assembler.

Allowed:

ALPHA RESW 1
BETA EQU ALPHA

Disallowed (ALPHA is not yet defined when BETA is processed):

BETA EQU ALPHA
ALPHA RESW 1

Disallowed (ALPHA depends on BETA, which depends on DELTA, which is defined last):

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1

Disallowed (the operand of ORG is a forward reference, so BYTE1, BYTE2, and BYTE3 cannot be assigned addresses in Pass 1):

ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1
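The restriction on forward references can be sketched in Python as below. This is a minimal illustration, assuming an EQU operand that is either a decimal constant or a single symbol; the names process_equ and UndefinedSymbolError are hypothetical.

```python
class UndefinedSymbolError(Exception):
    """Raised when an EQU operand refers to a symbol not yet in SYMTAB."""


def process_equ(symtab: dict, symbol: str, operand: str) -> None:
    """Pass 1 handling of 'symbol EQU operand'."""
    if operand.isdigit():
        value = int(operand)              # numeric constant
    elif operand in symtab:
        value = symtab[operand]           # previously defined symbol
    else:
        raise UndefinedSymbolError(f"{operand} is not defined before {symbol}")
    symtab[symbol] = value


symtab = {"ALPHA": 0x0030}                # ALPHA RESW 1 was seen earlier
process_equ(symtab, "BETA", "ALPHA")      # allowed
process_equ(symtab, "MAXLEN", "4096")     # allowed

try:
    process_equ({}, "BETA", "ALPHA")      # ALPHA defined later: disallowed
    forward_reference_rejected = False
except UndefinedSymbolError:
    forward_reference_rejected = True
```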


Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be

  • constants,

  • user-defined symbols, or

  • special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as absolute or relative.

1 Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER


2 Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol    Type   Value
MAXLEN    A      1000
BUFEND    R      1036
BUFFER    R      0036
RETADR    R      0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
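The pairing rule can be sketched as a count of unpaired relative terms. The Python function below is a simplified illustration under stated assumptions: the name expression_type is hypothetical, expressions contain only symbols, decimal constants and the four operators (no parentheses), and symbol types are supplied as "A" or "R" as in the table above.

```python
import re


def expression_type(expr: str, symbol_types: dict) -> str:
    """Classify an expression as 'absolute', 'relative' or 'error'."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/]", expr.replace(" ", ""))
    net_relative = 0      # relative terms with + sign minus those with - sign
    sign = 1
    for i, token in enumerate(tokens):
        if token in ("+", "-"):
            sign = 1 if token == "+" else -1
        elif token in ("*", "/"):
            continue
        elif not token.isdigit() and symbol_types[token] == "R":
            neighbours = tokens[max(i - 1, 0):i] + tokens[i + 1:i + 2]
            if "*" in neighbours or "/" in neighbours:
                return "error"            # relative term in * or /
            net_relative += sign
    if net_relative == 0:
        return "absolute"                 # all relative terms are paired
    if net_relative == 1:
        return "relative"                 # one unpaired positive term
    return "error"


types = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}
```

With this table, BUFEND-BUFFER is classified as absolute, BUFFER+3 as relative, and the three expressions named above as errors.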

Program blocks

The program is divided into three blocks: the default block (block 0), CDATA (block 1), and CBLKS (block 2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need large blocks of memory

One more column, the block number, is then included in the listing when object code is generated. Whenever a new block begins, a USE statement with the block name is written, and the LOCCTR for that block starts at 0000. The block number field is also added to SYMTAB.

The program block table consists of the block name, block number, starting address, and length.

After the location counter values have been found, the assembler generates the object code and writes the object program.
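How the program block table might be filled in once the block lengths are known can be sketched as follows. The function name assign_block_addresses and the block lengths used in the example are assumptions for illustration; the sketch simply places the blocks one after another in block-number order.

```python
def assign_block_addresses(block_lengths: dict) -> dict:
    """Build a block table: name -> number, starting address and length.

    block_lengths maps block names to the final LOCCTR value of each block,
    listed in block-number order.
    """
    table = {}
    start = 0
    for number, (name, length) in enumerate(block_lengths.items()):
        table[name] = {"number": number, "start": start, "length": length}
        start += length               # next block begins where this one ends
    return table


# Illustrative lengths only
block_table = assign_block_addresses(
    {"(default)": 0x0066, "CDATA": 0x000B, "CBLKS": 0x1000}
)
```

The address of a symbol is then its value relative to the start of its block plus the starting address of that block.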


Program Block Diagram


Control sections and program linking

Control section

i A control section is a part of the program that maintains its identity after assembly.

ii Each control section can be loaded and relocated independently of the others.

iii Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Object code generation changes only for instructions that refer to EXTREF symbols.

The object program needs six kinds of records: Header record, Define record, Refer record, Text record, Modification record, and End record.
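As a hedged illustration of the two record types that control sections add, the Python sketch below formats a Define record and a Refer record. The layout assumed here (a one-letter record type followed by 6-column names and, for Define records, 6-digit hexadecimal relative addresses) is the usual SIC/XE textbook format and is not spelled out in these notes; the function names, symbol names, and addresses are illustrative only.

```python
def define_record(symbols: dict) -> str:
    """Format a Define record from EXTDEF symbols and their relative addresses."""
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols.items())


def refer_record(names: list) -> str:
    """Format a Refer record from the EXTREF symbol names."""
    return "R" + "".join(f"{name:<6}" for name in names)


d_record = define_record({"BUFFER": 0x0033, "BUFEND": 0x1033, "LENGTH": 0x002D})
r_record = refer_record(["RDREC", "WRREC"])
```

The Define record tells the loader where this section's external symbols are; the Refer record lists the symbols whose addresses must be supplied by other sections.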


Object Program for Control Section and Program Linking


Among the instruction formats, format-3 and format-4 instructions are register-memory instructions. One of the operands is always in a register and the other operand is in memory. The addressing mode tells us how the operand is to be fetched from memory.

There are two ways of specifying the operand address relative to a register:

1) Program-counter relative and

2) Base-relative

Relative addressing is used with format-3 instructions.

In format-3, the instruction has the opcode followed by a 12-bit displacement value in the address field.

In format-4, by contrast, the instruction has the opcode followed by a 20-bit address field, which is large enough to hold the full address.

1 Program-Counter Relative

a) This mode normally uses the format-3 instruction format.

b) The instruction contains the opcode followed by a 12-bit displacement value.

c) The displacement is a signed 12-bit value, so it ranges from -2048 to +2047. It is added to the current contents of the program counter to get the target address of the operand required by the instruction.

d) The address of the operand is thus calculated relative to the program counter: the displacement is relative to the current program counter value, which is the address of the next instruction.

e) The following example shows how the address is calculated.
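The calculation in (c) and (d) can be sketched in Python as shown here. The function name pc_relative_disp is an assumption; the sketch computes disp = TA - (PC) and encodes it as a 12-bit two's complement value, or returns None when the displacement does not fit.

```python
def pc_relative_disp(target: int, pc: int):
    """Return the 12-bit displacement field for PC-relative addressing.

    pc is the address of the next instruction. Returns None if the
    displacement is outside -2048..+2047.
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        return disp & 0xFFF           # two's complement in 12 bits
    return None


forward = pc_relative_disp(0x0033, 0x000D)    # operand after the instruction
backward = pc_relative_disp(0x0006, 0x001A)   # operand before the instruction
too_far = pc_relative_disp(0x0033, 0x1059)    # does not fit in 12 bits
```

A negative displacement, as in the backward reference, is stored in two's complement form, which is why its field begins with the hex digit F.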


2 Base-Relative Addressing Mode

a) In this mode the displacement is taken relative to the contents of the base register. Therefore the target address is

TA = (B) + displacement

b) This addressing mode is used when the displacement needed for PC-relative addressing does not fit in the displacement field. The operand is then addressed relative to the base register rather than relative to the instruction, as in PC-relative addressing mode. Use of this mode is enabled by the directive BASE, which tells the assembler what the base register contains. Once the assembler has encountered this directive, it can use base-relative addressing mode to calculate the displacement for the instructions that follow.

c) The NOBASE directive indicates that the base register can no longer be used to calculate the target address of the operand. The assembler first tries PC-relative addressing; only when the displacement does not fit does it use base-relative addressing.

LDB #LENGTH (instruction)

BASE LENGTH (directive)

NOBASE (directive)


For example:

12    0003            LDB    #LENGTH     69202D
13                    BASE   LENGTH
100   0033   LENGTH   RESW   1
105   0036   BUFFER   RESB   4096
160   104E            STCH   BUFFER,X    57C003
165   1051            TIXR   T           B850

In the above example, the directive BASE tells the assembler that base-relative addressing may be used to calculate the displacement when PC-relative addressing is not possible. The address of LENGTH is held in the base register.

The LDB #LENGTH instruction loads the address of LENGTH, which is 0033, into the base register. The BASE directive explicitly tells the assembler that the base register holds this value.

For line 160 (STCH BUFFER,X):

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 - 0033 = (0003)16

Now compare two statements that both refer to LENGTH:

20    000A            LDA    LENGTH      032026
175   1056   EXIT     STX    LENGTH      134000

Consider line 175. If we use PC-relative addressing:

disp = TA - (PC) = 0033 - 1059 = -(1026)16, which is EFDA as a 16-bit two's complement value

This is outside the range of a 12-bit displacement, so PC-relative addressing is not applicable and base-relative addressing is used instead: disp = TA - (B) = 0033 - 0033 = 0, which gives the object code 134000.
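The choice between the two modes can be sketched as a small Python function that assembles a format-3 instruction with simple addressing. The function name assemble_format3 and its parameters are assumptions for illustration; the opcodes used in the example (LDA = 00, STX = 10, STCH = 54) are the standard SIC/XE values, and the bit layout assumed is opcode plus the n and i bits, then the x, b, p, e flags, then the 12-bit displacement.

```python
def assemble_format3(opcode, target, pc, base=None, indexed=False):
    """Assemble a format-3 instruction, trying PC-relative before base-relative."""
    n, i = 1, 1                           # simple addressing
    x = 1 if indexed else 0
    disp = target - pc
    if -2048 <= disp <= 2047:
        b, p = 0, 1                       # PC-relative
    elif base is not None and 0 <= target - base <= 4095:
        b, p = 1, 0                       # base-relative
        disp = target - base
    else:
        raise ValueError("displacement out of range for format 3")
    first_byte = opcode | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1)    # e = 0 for format 3
    return f"{first_byte:02X}{flags:X}{disp & 0xFFF:03X}"


line_20 = assemble_format3(0x00, 0x0033, pc=0x000D)                  # LDA LENGTH
line_160 = assemble_format3(0x54, 0x0036, pc=0x1051, base=0x0033,
                            indexed=True)                            # STCH BUFFER,X
line_175 = assemble_format3(0x10, 0x0033, pc=0x1059, base=0x0033)    # STX LENGTH
```

The three results reproduce the object codes in the listing: line 20 fits PC-relative, while lines 160 and 175 fall back to base-relative because the operand is too far from the program counter.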

3 Immediate Addressing Mode

In this mode no memory reference is involved. If immediate mode is used, the target address itself is used as the operand value.

If a symbol is referred to in the instruction as the immediate operand, the instruction is assembled as immediate with PC-relative mode, as shown in the example below.


4 Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode.

For example:

The instruction transfers control to the address stored at location RETADR; that is, RETADR in turn holds the target address. If the address of RETADR is 0030, the displacement is then 003, as calculated above.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute Address

A program whose starting address is specified at assembly time is an absolute program.

Its addresses may be invalid if the program is loaded somewhere else.

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address changes by the displacement of the program. Hence we need to make some changes in the address portion of the instruction so that we can load and execute the program at location 3000.

Apart from the instructions whose operand address values change as the program load address changes, there are some parts of the program that remain the same regardless of where the program is loaded.


Page 25: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - II

2 Base-Relative Addressing Mode

a) In this mode the base register is used to mention the displacement value Therefore the target

address is

TA = (base) + displacement value

b) This addressing mode is used when the range of displacement value is not sufficient Hence

the operand is not relative to the instruction as in PC-relative addressing mode Whenever

this mode is used it is indicated by using a directive BASE The moment the assembler

encounters this directive the next instruction uses base-relative addressing mode to calculate

the target address of the operand

c) When NOBASE directive is used then it indicates the base register is no more used to

calculate the target address of the operand Assembler first chooses PC-relative when the

displacement field is not enough it uses Base-relative

LDB LENGTH (instruction)

BASE LENGTH (directive)

NOBASE

LALITHAMBIGAIB APIT Page 25

SYSTEM SOFTWARE ( CS 2304) UNIT - II

For example

12 0003 LDB LENGTH 69202D

13 BASE LENGTH

100 0033 LENGTH RESW 1

105 0036 BUFFER RESB 4096

160 104E STCH BUFFER X 57C003

165 1051 TIXR T B850

In the above example the use of directive BASE indicates that Base-relative addressing mode is

to be used to calculate the target address PC-relative is no longer used The value of the LENGTH is

stored in the base register

The LDB instruction loads the value of length in the base register which 0033 BASE directive

explicitly tells the assembler that it has the value of LENGTH

BUFFER is at location (0036)16

(B) = (0033)16

disp = 0036 ndash 0033 = (0003)16

20 000A LDA LENGTH 032026

175 1056 EXIT STX LENGTH 134000

LALITHAMBIGAIB APIT Page 26

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Consider Line 175 If we use PC-relative

Disp = TA ndash (PC) = 0033 ndash1059 = EFDA

PC relative is no longer applicable so we try to use BASE relative addressing mode

3 Immediate Addressing Mode

In this mode no memory reference is involved If immediate mode is used the target address is the

operand itself

If the symbol is referred in the instruction as the immediate operand then it is immediate with PC-

relative mode as shown in the example below

LALITHAMBIGAIB APIT Page 27

SYSTEM SOFTWARE ( CS 2304) UNIT - II

4 Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location which

contains the address of the operand The address of this is found using PC-relative addressing mode

For example

The instruction jumps the control to the address location RETADR which in turn has the address of

the operand If address of RETADR is 0030 the target address is then 0003 as calculated above

222 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time

The system must be able to load programs into memory wherever there is room

The exact starting address of the program is not known until load time

Absolute Address

Program with starting address specified at assembly time

The address may be invalid if the program is loaded into somewhere else

LALITHAMBIGAIB APIT Page 28

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example

The above statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 3000 instead of location 1000. Then the value that needs to be loaded into register A is no longer available at address 102D.

The address also changes according to the displacement of the program. Hence we need to change the address portion of the instruction so that we can load and execute the program at location 3000.

Apart from the instructions whose operand addresses change as the program load address changes, there are parts of the program that remain the same regardless of where the program is loaded.

Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification.

An object program that contains the information necessary to perform this kind of modification is called a relocatable program.


Example: Program Relocation

1) The above diagram shows the concept of relocation.
   a) Initially the program is loaded at location 0000.
   b) The instruction JSUB is loaded at location 0006.
   c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.
2) The second figure shows that if the program is loaded at the new location 5000, the address field of the JSUB instruction is modified to the new address 6036.
3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction needs to be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.


The rest of the instructions need not be modified:
  o Not a memory address (immediate addressing)
  o PC-relative, Base-relative

From the object program alone, it is not possible to distinguish addresses from constants.
  o The assembler must keep some information to tell the loader.
  o The object program that contains the Modification records is called a relocatable program.

The way to solve the relocation problem

• For an address label, its address is assigned relative to the start of the program (START 0).
• Produce a Modification record to store the starting location and the length of the address field to be modified.
• The command for the loader must also be a part of the object program.

Modification record

• One Modification record for each address to be modified.
• The length is stored in half-bytes (4 bits).
• The starting location is the location of the byte containing the leftmost bits of the address field to be modified.
• If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.


Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for those instructions that need to change if relocation occurs. M00000705 is the Modification record for the address field that starts at location 000007 and is 5 half-bytes long.
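How a loader might act on a record such as M00000705 can be sketched as follows. This is a minimal illustration, not a definitive loader: the function name apply_modification is an assumption, and the program is held in a bytearray indexed from the start of the program rather than in real memory.

```python
def apply_modification(memory, record, load_address):
    """Add load_address to the address field named by one Modification record.

    memory is a bytearray holding the program as assembled for address 0;
    record is text such as "M00000705".
    """
    start = int(record[1:7], 16)          # address of the first byte touched
    half_bytes = int(record[7:9], 16)     # field length in half-bytes
    n_bytes = (half_bytes + 1) // 2
    field = int.from_bytes(memory[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1    # low-order half-bytes hold the address
    relocated = (field & ~mask) | ((field + load_address) & mask)
    memory[start:start + n_bytes] = relocated.to_bytes(n_bytes, "big")


# The JSUB instruction 4B101036 sits at location 0006 of the program.
program = bytearray(6) + bytearray.fromhex("4B101036")
apply_modification(program, "M00000705", 0x5000)
print(program[6:10].hex().upper())        # 4B106036
```

Because the field is 5 half-bytes long, only the low 20 bits of the three bytes starting at 0007 are changed; the leading half-byte, which holds the flag bits, is preserved by the mask.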

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

45     001A    ENDFIL    LDA    =C'EOF'    032010
       002D              =C'EOF'           454F46

215    1062    WLOOP     TD     =X'05'     E32011
       1076              =X'05'            05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

• With immediate addressing, the operand value is assembled as a part of the machine instruction.
• With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

• All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93                       LTORG
       002D              =C'EOF'           454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at this point.

Literal for the current value of the location counter

• The value of the location counter can be denoted by a literal operand:

       BASE    *
       LDB     =*

Handling duplicate literal operands

• The assembler should avoid storing duplicate literals.
• The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

       100    LDA    =C'EOF'
       125    LDA    =C'EOF'
       160    LDA    =X'454F46'

• Literal operands with different literal names may have the same literal value, as in lines 100 and 160 above.
• Recognizing literal operands that have different literal names but the same literal value complicates the design of an assembler.
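The difference between comparing literal names and comparing literal values can be seen in a short Python sketch. The helper literal_value is a hypothetical name introduced for this illustration, and it handles only the C'...' and X'...' forms that appear above.

```python
def literal_value(literal):
    """Return the bytes generated for a literal such as =C'EOF' or =X'05'."""
    kind, text = literal[1], literal[3:-1]   # skip '=' and the quotes
    if kind == "C":
        return text.encode("ascii")          # one byte per character
    if kind == "X":
        return bytes.fromhex(text)           # two hex digits per byte
    raise ValueError(f"unsupported literal: {literal}")


a, b = "=C'EOF'", "=X'454F46'"
print(a == b)                                  # False: the names differ
print(literal_value(a) == literal_value(b))    # True: both generate 454F46
```

An assembler that compares only the defining strings would store these two literals separately; recognizing them as duplicates requires evaluating each literal first.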

Literal table (LITTAB)

• The basic data structure needed to process literal operands is a literal table (LITTAB).
• LITTAB contains the literal name, value, and length.
• LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.
• When a LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal.
• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.
• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.
• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
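The Pass 1 steps above can be sketched with a small table class. This is an illustrative model only: the class name LiteralTable, its method names, and the use of a Python dict in place of a hash table are assumptions made for this example, and literals are keyed by name.

```python
class LiteralTable:
    """A toy LITTAB keyed by literal name."""

    def __init__(self):
        self.entries = {}   # name -> {"value", "length", "address"}

    def add(self, name, value):
        """Pass 1: record a literal operand; duplicates by name are ignored."""
        if name not in self.entries:
            self.entries[name] = {"value": value, "length": len(value),
                                  "address": None}

    def place_pool(self, locctr):
        """LTORG or end of program: give each unplaced literal an address.

        Returns the location counter after the literal pool.
        """
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LiteralTable()
littab.add("=C'EOF'", b"EOF")
littab.add("=C'EOF'", b"EOF")              # second use: no action needed
locctr = littab.place_pool(0x002D)         # LTORG reached with LOCCTR = 002D
print(f"{littab.entries[chr(61) + 'C' + chr(39) + 'EOF' + chr(39)]['address']:04X}", f"{locctr:04X}")
```

In Pass 2 the assembler would look up each literal operand in this table to obtain its address, and would emit the stored value at that address when the pool is written out.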

Symbol-defining statements

Assembler directives

• EQU
• ORG

Assembler directive EQU

• Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

       symbol    EQU    value

• When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified "value".

Use of EQU

• Establish symbolic names that can be used for improved readability in place of numeric values.

                 +LDT   #4096

       MAXLEN    EQU    4096
                 +LDT   #MAXLEN

• Define mnemonic names for registers.

       A         EQU    0
       X         EQU    1
       L         EQU    2

                 RMO    0,1

• Establish and use names that reflect the logical function of the registers in the program.

       BASE           EQU    R1
       ACCUMULATOR    EQU    R2
       INDEX          EQU    R3

Assembler directive ORG

• The assembler directive ORG is usually used to indirectly assign values to symbols.

       ORG    value

• "value" is a constant or an expression involving constants and previously defined symbols.
• When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

                       SYMBOL     VALUE      FLAGS
STAB (100 entries)     6 bytes    3 bytes    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

       ORG

to return to the normal use of LOCCTR.
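The effect of ORG on LOCCTR for a table like STAB can be modelled in a few lines of Python. This is a sketch under assumptions: the function assign_symbols and the statement tuples are invented for illustration, only RESB and ORG are handled, and the field sizes (6, 3, and 1 bytes, so 1000 bytes for 100 entries) are the ones listed above.

```python
def assign_symbols(statements, locctr=0):
    """Tiny Pass 1 model that handles RESB and ORG only.

    Each statement is (label, directive, operand); the operand is an int
    or the name of a previously defined symbol. ORG with operand None
    restores the LOCCTR value saved by the previous ORG.
    """
    symtab, saved = {}, None
    for label, directive, operand in statements:
        if directive == "ORG":
            if operand is None:
                locctr = saved
            else:
                saved = locctr
                locctr = symtab[operand] if isinstance(operand, str) else operand
        elif directive == "RESB":
            if label:
                symtab[label] = locctr
            locctr += operand
        else:
            raise ValueError(f"unsupported directive: {directive}")
    return symtab, locctr


source = [
    ("STAB",   "RESB", 1000),    # 100 entries of 10 bytes each
    (None,     "ORG",  "STAB"),  # move LOCCTR back to the start of the table
    ("SYMBOL", "RESB", 6),
    ("VALUE",  "RESB", 3),
    ("FLAGS",  "RESB", 1),
    (None,     "ORG",  None),    # return to the remembered LOCCTR
]
symtab, locctr = assign_symbols(source)
print(symtab, locctr)
```

After the first ORG, the labels SYMBOL, VALUE, and FLAGS receive the addresses of the fields of the first entry, so the program can reach the same fields of any entry through indexed addressing.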

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program. Hence some of the following sequences cannot be processed by an ordinary two-pass assembler.

Disallowed:

       BETA     EQU     ALPHA
       ALPHA    RESW    1

Disallowed:

       ALPHA    EQU     BETA
       BETA     EQU     DELTA
       DELTA    RESW    1

Disallowed:

                ORG     ALPHA
       BYTE1    RESB    1
       BYTE2    RESB    1
       BYTE3    RESB    1
                ORG
       ALPHA    RESB    1

Allowed:

       ALPHA    RESW    1
       BETA     EQU     ALPHA
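The restriction on forward references can be illustrated with a toy Pass 1 in Python. The function pass1_equ and its statement format are assumptions made for this sketch; it handles only RESW and an EQU whose operand is a single symbol.

```python
def pass1_equ(statements):
    """Define symbols in order; an EQU operand must already be in SYMTAB."""
    symtab, locctr, errors = {}, 0, []
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand in symtab:
                symtab[label] = symtab[operand]
            else:
                errors.append(f"{label}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)     # one word is 3 bytes
    return symtab, errors


allowed = [("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]
disallowed = [("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")]
print(pass1_equ(allowed))
print(pass1_equ(disallowed))
```

In the disallowed order, ALPHA has no SYMTAB entry when the EQU statement is read, so BETA cannot be given a value during Pass 1.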


Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be
  • constants,
  • user-defined symbols, or
  • special terms.
• The most common special term is the current value of the location counter (often designated by *).

Types of terms

• Absolute terms - The value of an absolute term is independent of program location.
• Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1. Absolute expressions

• The value of an absolute expression is independent of the program location.
• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.
• e.g. MAXLEN EQU BUFEND-BUFFER


2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.
• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol    Type    Value
MAXLEN    A       1000
BUFEND    R       1036
BUFFER    R       0036
RETADR    R       0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
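The pairing rule for relative terms can be checked mechanically by counting signed relative terms: a net count of 0 gives an absolute expression and a net count of +1 gives a relative one. The Python sketch below illustrates this; the function classify and the symbol table layout are assumptions for this example, values are taken as hexadecimal from the table above, constants in expressions are read as decimal, and division is not handled.

```python
import re


def classify(expression, symtab):
    """Classify an expression of +/- terms as absolute ("A") or relative ("R").

    symtab maps each symbol to (type, value), with type "A" or "R".
    Raises ValueError for expressions that are neither.
    """
    relative_count, value = 0, 0
    text = expression.replace(" ", "")
    for sign, term in re.findall(r"([+-]?)([^+-]+)", text):
        factor = -1 if sign == "-" else 1
        term_value, term_is_relative = 1, False
        parts = term.split("*")
        for part in parts:
            if part.isdigit():
                term_value *= int(part)
            else:
                kind, symbol_value = symtab[part]
                term_value *= symbol_value
                if kind == "R":
                    if len(parts) > 1:
                        raise ValueError(f"relative term {part} in multiplication")
                    term_is_relative = True
        value += factor * term_value
        if term_is_relative:
            relative_count += factor
    if relative_count == 0:
        return "A", value
    if relative_count == 1:
        return "R", value
    raise ValueError("expression is neither absolute nor relative")


symtab = {"MAXLEN": ("A", 0x1000), "BUFEND": ("R", 0x1036),
          "BUFFER": ("R", 0x0036), "RETADR": ("R", 0x0030)}
kind, value = classify("BUFEND-BUFFER", symtab)
print(kind, f"{value:04X}")                # A 1000
```

BUFEND-BUFFER is absolute because its two relative terms cancel, which is why MAXLEN has type A in the table; BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER all raise ValueError in this sketch.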

Program blocks

The program is divided into three blocks: Default (0), CDATA (1), and CBLKS (2).

• Default (0): block for the executable (mnemonic) instructions
• CDATA (1): block for data areas that need little memory
• CBLKS (2): block for data areas that need large blocks of memory

One more column, the block number, is then included in the program listing when object code is generated. Whenever a new block begins, USE followed by the block name is written, and the LOCCTR value for that block is set to 0000 the first time the block is encountered. The block number is also added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address, and length.

After the location counter values are found, the object code is generated and the object program is written.
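Once the length of each block is known at the end of Pass 1, the program block table can be filled in by placing the blocks one after another. The Python sketch below shows the idea; the function name build_block_table and the block lengths used in the example are illustrative assumptions, not values taken from the text above.

```python
def build_block_table(block_lengths):
    """Assign block numbers and starting addresses from per-block lengths.

    block_lengths maps block name -> length, in order of first appearance.
    Blocks are laid out consecutively, starting at relative address 0.
    """
    table, start = {}, 0
    for number, (name, length) in enumerate(block_lengths.items()):
        table[name] = {"number": number, "start": start, "length": length}
        start += length
    return table


# Illustrative lengths (hexadecimal) for the three blocks.
table = build_block_table({"(default)": 0x0066, "CDATA": 0x000B, "CBLKS": 0x1000})
for name, row in table.items():
    print(f"{name:<10}{row['number']}  {row['start']:04X}  {row['length']:04X}")
```

In Pass 2 the address of a symbol would then be its relative address within its block plus the starting address of that block, which is why the block number is kept in SYMTAB.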


Program Block Diagram


Control section and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.
ii. Each control section can be loaded and relocated independently of the others.
iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

• CSECT signals the start of a new control section.
• The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF, EXTREF

EXTDEF: external definition

• The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF: external reference

• The EXTREF statement names symbols that are used in this control section and are defined elsewhere.


Only the object code generated for references to EXTREF symbols changes.

The object program needs 6 record types: the Header record, Define record, Refer record, Text record, Modification record, and End record.
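The Define, Refer, and Modification records can be pictured with a small formatting sketch in Python. The column layout used here (6-character names, 6-digit hexadecimal addresses, and a Modification record that carries a sign and a symbol name) is an assumption based on the usual textbook SIC/XE object format, since the notes above do not spell it out; the function names and the addresses in the example are likewise illustrative.

```python
def define_record(symbols):
    """Build a Define record from (name, relative_address) pairs."""
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols)


def refer_record(names):
    """Build a Refer record from the names of the external references."""
    return "R" + "".join(f"{name:<6}" for name in names)


def modification_record(address, half_bytes, sign, symbol):
    """Build a Modification record that adds or subtracts an external symbol."""
    return f"M{address:06X}{half_bytes:02X}{sign}{symbol}"


print(define_record([("BUFFER", 0x33), ("BUFEND", 0x1033), ("LENGTH", 0x2D)]))
print(refer_record(["RDREC", "WRREC"]))
print(modification_record(0x4, 5, "+", "RDREC"))
```

Since the assembler does not know where an EXTREF symbol will be located, it would place zeros in the address field and leave the final value to the loader, which uses records like these to fill it in.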


Object Program for Control Section and Program Linking



Page 27: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Consider Line 175 If we use PC-relative

Disp = TA ndash (PC) = 0033 ndash1059 = EFDA

PC relative is no longer applicable so we try to use BASE relative addressing mode

3 Immediate Addressing Mode

In this mode no memory reference is involved If immediate mode is used the target address is the

operand itself

If the symbol is referred in the instruction as the immediate operand then it is immediate with PC-

relative mode as shown in the example below

LALITHAMBIGAIB APIT Page 27

SYSTEM SOFTWARE ( CS 2304) UNIT - II

4 Indirect and PC-relative mode

In this type of instruction the symbol used in the instruction is the address of the location which

contains the address of the operand The address of this is found using PC-relative addressing mode

For example

The instruction jumps the control to the address location RETADR which in turn has the address of

the operand If address of RETADR is 0030 the target address is then 0003 as calculated above

222 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time

The system must be able to load programs into memory wherever there is room

The exact starting address of the program is not known until load time

Absolute Address

Program with starting address specified at assembly time

The address may be invalid if the program is loaded into somewhere else

LALITHAMBIGAIB APIT Page 28

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example

The above statement says that the register A is loaded with the value stored at location 102D

Suppose it is decided to load and execute the program at location 3000 instead of location 1000

Then at address 102D the required value which needs to be loaded in the register A is no more

available

The address also gets changed relative to the displacement of the program Hence we need to

make some changes in the address portion of the instruction so that we can load and execute the

program at location 3000

Apart from the instruction which will undergo a change in their operand address value as the

program load address changes There exist some parts in the program which will remain same

regardless of where the program is being loaded

Since assembler will not know actual location where the program will get loaded it cannot make

the necessary changes in the addresses used in the program However the assembler identifies

for the loader those parts of the program which need modification

An object program that has the information necessary to perform this kind of modification is

called the relocatable program

LALITHAMBIGAIB APIT Page 29

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example Program Relocation

1) The above diagram shows the concept of relocation

a) Initially the program is loaded at location 0000

b) The instruction JSUB is loaded at location 0006

c) The address field of this instruction contains 01036 which is the address of the instruction

labeled RDREC

2) The second figure shows that if the program is to be loaded at new location 5000 The address of

the instruction JSUB gets modified to new location 6036

3) Likewise the third figure shows that if the program is relocated at location 7420 the JSUB

instruction would need to be changed to 4B108456 that correspond to the new address of

RDREC

The only parts of the program that require modification at load time are those that specify direct addresses.

The rest of the instructions need not be modified, because their operands are either:

o Not a memory address (immediate addressing)

o PC-relative or base-relative

From the object program alone, it is not possible to distinguish an address from a constant.

o The assembler must keep some information to tell the loader which fields are addresses.

o An object program that contains Modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, its address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

One Modification record is produced for each address to be modified.

The length is stored in half-bytes (4 bits each).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.


Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for those instructions that must change if relocation occurs. M00000705 is the Modification record for the instruction at location 0006: the address field to be modified starts at location 000007 and is 5 half-bytes long.
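As an illustration of how a loader might act on a record such as M00000705, here is a minimal Python sketch. It assumes the record layout "M", a 6-digit hexadecimal starting address, and a 2-digit hexadecimal length in half-bytes, and it assumes the program image is held in a bytearray indexed by assembled address. The function names and the instruction bytes 4B101036 are illustrative, not taken from the notes.

```python
def parse_modification_record(record):
    """Split a record such as 'M00000705' into (start address, half-byte count)."""
    if not record.startswith("M") or len(record) < 9:
        raise ValueError(f"not a Modification record: {record!r}")
    return int(record[1:7], 16), int(record[7:9], 16)


def apply_modification(memory, record, load_address):
    """Add load_address to the address field named by the record, in place."""
    start, half_bytes = parse_modification_record(record)
    n_bytes = (half_bytes + 1) // 2          # 5 half-bytes span 3 bytes
    field = int.from_bytes(memory[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1       # covers only the address bits
    # With an odd length the field starts mid-byte, so the leading
    # half-byte of the first byte must be preserved.
    updated = (field & ~mask) | ((field + load_address) & mask)
    memory[start:start + n_bytes] = updated.to_bytes(n_bytes, "big")


memory = bytearray(16)
memory[6:10] = bytes.fromhex("4B101036")     # JSUB assembled at location 0006
apply_modification(memory, "M00000705", 0x5000)
print(memory[6:10].hex().upper())
```

The sketch modifies bytes 7 to 9 only, which matches the rule that the starting location is the byte containing the leftmost bits of the address field.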

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal.

    45    001A    ENDFIL    LDA    =C'EOF'    032010
          002D              =C'EOF'           454F46

    215   1062    WLOOP     TD     =X'05'     E32011
          1076              =X'05'            05

• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.

The difference between a literal and an immediate operand

With immediate addressing, the operand value is assembled as a part of the machine instruction.

With a literal, the assembler generates the specified value as a constant at some other memory location. The address of this generated constant is used as the target address for the machine instruction.

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

    93                      LTORG
          002D              =C'EOF'           454F46

• The assembler directive LTORG tells the assembler to generate a literal pool at that point in the program.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand:

          BASE    *
          LDB     =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparison of the character strings defining them.

• 100    LDA    =C'EOF'

• 125    LDA    =C'EOF'

• 160    LDA    =X'454F46'    (literal operands with different literal names may have the same literal value)

Recognizing literal operands that have different literal names but the same literal value will complicate the design of an assembler.
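To see why comparing names is simpler than comparing values, consider the following Python sketch. It decodes the two literal forms used in these notes, C'...' (characters) and X'...' (hexadecimal digits), into the bytes they denote. The function name is hypothetical and ASCII character codes are assumed.

```python
def literal_bytes(literal):
    """Return the constant denoted by a literal such as =C'EOF' or =X'454F46'."""
    if not literal.startswith("=") or len(literal) < 5:
        raise ValueError(f"not a literal: {literal!r}")
    kind, body = literal[1], literal[2:]
    if not (body.startswith("'") and body.endswith("'")):
        raise ValueError(f"missing quotes in literal: {literal!r}")
    text = body[1:-1]
    if kind == "C":
        return text.encode("ascii")    # one byte per character
    if kind == "X":
        return bytes.fromhex(text)     # two hexadecimal digits per byte
    raise ValueError(f"unsupported literal type: {kind!r}")


names = ["=C'EOF'", "=C'EOF'", "=X'454F46'"]
print(len(set(names)))                             # distinct by name
print(len({literal_bytes(n) for n in names}))      # distinct by value
```

Comparing the defining strings needs no decoding at all, whereas comparing values requires the assembler to evaluate every literal before it can search the table.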

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

For each literal, LITTAB contains the literal name, its value and length, and the address assigned to it.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
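The Pass 1 steps can be sketched in Python as follows. This is a minimal illustration, not a full assembler: the class and method names are hypothetical, literal values are passed in already decoded as bytes, and a plain dictionary stands in for the hash table.

```python
class LiteralTable:
    """LITTAB sketch: literal name -> value, length, and assigned address."""

    def __init__(self):
        self.entries = {}

    def reference(self, name, value):
        """Pass 1: called for each literal operand that is recognized."""
        if name not in self.entries:   # duplicates are detected by name
            self.entries[name] = {"value": value,
                                  "length": len(value),
                                  "address": None}

    def place_pool(self, locctr):
        """At LTORG or end of program: assign addresses, return new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LiteralTable()
littab.reference("=C'EOF'", b"EOF")
littab.reference("=C'EOF'", b"EOF")          # second use, no new entry
locctr = littab.place_pool(0x002D)           # LTORG reached at LOCCTR 002D
littab.reference("=X'05'", b"\x05")
end = littab.place_pool(0x1076)              # end of program
print(f"{locctr:04X} {end:04X}")
```

The addresses 002D and 1076 are the pool locations shown in the listing lines earlier in this section. In Pass 2 the assembler would look up each literal operand in `entries` and emit the stored values at the assigned addresses.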

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

    symbol    EQU    value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB and assigns it the value specified by "value".

Use of EQU

Establish symbolic names that can be used in place of numeric values for improved readability. Instead of

              +LDT    #4096

we can write

    MAXLEN    EQU     4096
              +LDT    #MAXLEN

Define mnemonic names for registers:

    A         EQU     0
    X         EQU     1
    L         EQU     2

With these definitions, RMO A,X is assembled as

              RMO     0,1

Establish and use names that reflect the logical function of the registers in the program:

    BASE           EQU    R1
    ACCUMULATOR    EQU    R2
    INDEX          EQU    R3

Assembler directive ORG

The assembler directive ORG is usually used to indirectly assign values to symbols.

    ORG    value

"value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

    STAB             SYMBOL     VALUE      FLAGS
    (100 entries)    6 bytes    3 bytes    1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

    ORG

to return to the normal use of LOCCTR.
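The effect of ORG on LOCCTR can be traced with a small Python sketch. It assumes a statement sequence of the usual form (reserve the table, ORG back to its start, reserve one field per label, then ORG with no operand) and uses the field sizes from the table above. The function name and the table address 1000 are hypothetical.

```python
def layout_with_org(table_address, entry_count, fields):
    """Trace LOCCTR through: STAB RESB n / ORG STAB / field RESB size ... / ORG."""
    symtab = {}
    locctr = table_address
    symtab["STAB"] = locctr
    entry_size = sum(size for _, size in fields)
    locctr += entry_size * entry_count   # STAB RESB entry_size * entry_count
    saved = locctr                       # previous LOCCTR remembered by ORG
    locctr = symtab["STAB"]              # ORG STAB
    for name, size in fields:
        symtab[name] = locctr            # each label takes the current LOCCTR
        locctr += size                   # RESB advances LOCCTR by the field size
    locctr = saved                       # ORG with no operand restores LOCCTR
    return symtab, locctr


fields = [("SYMBOL", 6), ("VALUE", 3), ("FLAGS", 1)]
symtab, locctr = layout_with_org(0x1000, 100, fields)
for name, address in symtab.items():
    print(f"{name:<7}{address:04X}")
print(f"LOCCTR {locctr:04X}")
```

Each field label ends up with the address of that field in the first table entry, so other entries can be reached by indexing from these labels.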

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must have been defined previously in the program. Hence the following sequences could not be processed by an ordinary two-pass assembler.

Disallowed (BETA is defined in terms of ALPHA before ALPHA is defined):

    BETA     EQU     ALPHA
    ALPHA    RESW    1

Disallowed (ALPHA depends on BETA, which depends on DELTA, which is defined last):

    ALPHA    EQU     BETA
    BETA     EQU     DELTA
    DELTA    RESW    1

Disallowed (ORG refers to ALPHA before ALPHA is defined):

             ORG     ALPHA
    BYTE1    RESB    1
    BYTE2    RESB    1
    BYTE3    RESB    1
             ORG
    ALPHA    RESB    1

Allowed (ALPHA is defined before it is used):

    ALPHA    RESW    1
    BETA     EQU     ALPHA
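The restriction on forward references in EQU can be demonstrated with a minimal Pass 1 sketch in Python. It handles only EQU, RESW, and RESB, treats each statement as a (label, opcode, operand) tuple, and assumes 3-byte words. The function name is hypothetical.

```python
def pass1_define(statements):
    """Build SYMTAB in one forward scan; reject EQU operands not yet defined."""
    symtab = {}
    locctr = 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand.isdigit():
                symtab[label] = int(operand)
            elif operand in symtab:
                symtab[label] = symtab[operand]
            else:
                raise ValueError(
                    f"{label} EQU {operand}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)   # one word is 3 bytes
        elif opcode == "RESB":
            symtab[label] = locctr
            locctr += int(operand)
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return symtab


print(pass1_define([("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]))
try:
    pass1_define([("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")])
except ValueError as error:
    print("error:", error)
```

Because the scan never looks ahead, the sequence that defines ALPHA first succeeds and the one that uses it first fails, which is the behaviour described for an ordinary two-pass assembler.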

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as:

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER


2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

    Symbol    Type    Value
    MAXLEN    A       1000
    BUFEND    R       1036
    BUFFER    R       0036
    RETADR    R       0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
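The pairing rule can be expressed as a count of signed relative terms, as in the following Python sketch. It is a simplification under stated assumptions: expressions have no parentheses, symbol types come from a table like the one above, and the special term * (location counter) is not handled. The function name is hypothetical.

```python
import re


def classify(expression, symbol_types):
    """Return 'A' for an absolute or 'R' for a relative expression."""
    net_relative = 0
    # Split into signed terms such as 'BUFEND', '-BUFFER', or '3*BUFFER'.
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expression.replace(" ", "")):
        factors = re.split(r"[*/]", term)
        for factor in factors:
            if not factor.isdigit() and factor not in symbol_types:
                raise ValueError(f"undefined symbol: {factor}")
        has_relative = any(symbol_types.get(f) == "R" for f in factors)
        if has_relative and len(factors) > 1:
            raise ValueError("relative term in multiplication or division")
        if has_relative:
            net_relative += -1 if sign == "-" else 1
    if net_relative == 0:
        return "A"      # every relative term is paired with an opposite sign
    if net_relative == 1:
        return "R"      # exactly one unpaired, positive relative term
    raise ValueError("neither absolute nor relative")


types = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}
print(classify("BUFEND-BUFFER", types))
print(classify("RETADR", types))
for bad in ("BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"):
    try:
        classify(bad, types)
    except ValueError as error:
        print(bad, "->", error)
```

A net count of zero corresponds to all relative terms being paired, and a net count of plus one corresponds to a single unpaired positive term. Any other count is flagged as an error.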

Program blocks

In the example, the program is divided into three blocks: the default block (block number 0), CDATA (block number 1), and CBLKS (block number 2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that occupy little memory

CBLKS (2): block for data areas that occupy larger amounts of memory

The assembler directive USE, followed by the block name, indicates which block the following statements belong to. Each block has its own location counter, which is set to 0000 the first time the block is encountered. When object code is generated, one more column, the block number, is included in the listing. This block number field is also added to SYMTAB.

The program block table consists of the block name, block number, starting address, and length.

After the location counter values have been found, the object code is generated and the object program is written.
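A minimal Python sketch of the program block table follows. It assumes that blocks are placed one after another in block-number order once their lengths are known at the end of Pass 1. The block lengths used here are illustrative values, not taken from these notes, and the function names are hypothetical.

```python
def assign_block_addresses(block_lengths):
    """Build the block table from (name, length) pairs in block-number order."""
    table = []
    address = 0
    for number, (name, length) in enumerate(block_lengths):
        table.append({"name": name, "number": number,
                      "address": address, "length": length})
        address += length    # the next block starts where this one ends
    return table


def absolute_address(table, block_number, offset):
    """Convert a (block number, offset within block) pair from SYMTAB."""
    return table[block_number]["address"] + offset


table = assign_block_addresses([("(default)", 0x0066),
                                ("CDATA", 0x000B),
                                ("CBLKS", 0x1000)])
for row in table:
    print(f"{row['name']:<10}{row['number']}  "
          f"{row['address']:04X}  {row['length']:04X}")
print(f"{absolute_address(table, 1, 0x0003):04X}")
```

During Pass 2 the assembler would use a conversion like `absolute_address` to turn each symbol's block number and relative location into an address relative to the start of the whole program.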


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Object code generation changes only for instructions that refer to symbols named in EXTREF.

The object program needs six record types: Header record, Define record, Refer record, Text record, Modification record, and End record.
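The relationship between EXTDEF and EXTREF can be illustrated with a small consistency check in Python. It assumes each control section is described by the symbols it lists in EXTDEF and EXTREF. The section and symbol names are illustrative, and the function name is hypothetical.

```python
def unresolved_references(sections):
    """Return, per control section, the EXTREF symbols that nobody defines."""
    # Control section names are automatically external symbols.
    defined = set(sections)
    for info in sections.values():
        defined.update(info["extdef"])
    missing = {}
    for name, info in sections.items():
        unresolved = sorted(set(info["extref"]) - defined)
        if unresolved:
            missing[name] = unresolved
    return missing


sections = {
    "COPY":  {"extdef": ["BUFFER", "BUFEND", "LENGTH"],
              "extref": ["RDREC", "WRREC"]},
    "RDREC": {"extdef": [], "extref": ["BUFFER", "LENGTH", "BUFEND"]},
    "WRREC": {"extdef": [], "extref": ["LENGTH", "BUFFER"]},
}
print(unresolved_references(sections))
```

In an object program, the EXTDEF symbols of a section would appear in its Define record and the EXTREF symbols in its Refer record, which gives a linking step the information that a check like this relies on.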


Object Program for Control Section and Program Linking


4. Indirect and PC-relative mode

In this type of instruction, the symbol used in the instruction is the address of the location that contains the address of the operand. The address of this location is found using PC-relative addressing mode.

For example:

The instruction jumps the control to the address location RETADR, which in turn has the address of the operand. If the address of RETADR is 0030, the target address is then 0003, as calculated above.

2.2.2 Program Relocation

The need for program relocation

It is desirable to load and run several programs at the same time.

The system must be able to load programs into memory wherever there is room.

The exact starting address of the program is not known until load time.

Absolute Address

A program with its starting address specified at assembly time.

The addresses may be invalid if the program is loaded somewhere else.


Page 29: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example

The above statement says that the register A is loaded with the value stored at location 102D

Suppose it is decided to load and execute the program at location 3000 instead of location 1000

Then at address 102D the required value which needs to be loaded in the register A is no more

available

The address also gets changed relative to the displacement of the program Hence we need to

make some changes in the address portion of the instruction so that we can load and execute the

program at location 3000

Apart from the instruction which will undergo a change in their operand address value as the

program load address changes There exist some parts in the program which will remain same

regardless of where the program is being loaded

Since assembler will not know actual location where the program will get loaded it cannot make

the necessary changes in the addresses used in the program However the assembler identifies

for the loader those parts of the program which need modification

An object program that has the information necessary to perform this kind of modification is

called the relocatable program

LALITHAMBIGAIB APIT Page 29

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Example Program Relocation

1) The above diagram shows the concept of relocation

a) Initially the program is loaded at location 0000

b) The instruction JSUB is loaded at location 0006

c) The address field of this instruction contains 01036 which is the address of the instruction

labeled RDREC

2) The second figure shows that if the program is to be loaded at new location 5000 The address of

the instruction JSUB gets modified to new location 6036

3) Likewise the third figure shows that if the program is relocated at location 7420 the JSUB

instruction would need to be changed to 4B108456 that correspond to the new address of

RDREC

The only parts of the program that require modification at load time are those that specify

direct addresses

LALITHAMBIGAIB APIT Page 30

SYSTEM SOFTWARE ( CS 2304) UNIT - II

The rest of the instructions need not be modified

o Not a memory address (immediate addressing)

o PC-relative Base-relative

From the object program it is not possible to distinguish the address and constant

o The assembler must keep some information to tell the loader

o The object program that contains the modification record is called a relocatable

program

The way to solve the relocation problem

For an address label its address is assigned relative to the start of the program(START 0)

Produce a Modification record to store the starting location and the length of the address

field to be modified

The command for the loader must also be a part of the object program

Modification record

One modification record for each address to be modified

The length is stored in half-bytes (4 bits)

The starting location is the location of the byte containing the leftmost bits of the address

field to be modified

If the field contains an odd number of half-bytes the starting location begins in the middle of

the first byte

LALITHAMBIGAIB APIT Page 31

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modifications The object

code lines at the end are the descriptions of the modification records for those instructions which

need change if relocation occurs M00000705 is the modification suggested for the statement at

location 0007 and requires modification 5-half bytes

Machine dependent assembler featuresAssembler features not closely related to machine architecture

bull Literalsbull Symbol-defining statementsbull Expressionsbull Program blocksbull Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of

the instruction that uses it Such an operand is called a literal

45 001A ENDFIL LDA =ClsquoEOFrsquo 003210

002D =ClsquoEOFrsquo 454F46

LALITHAMBIGAIB APIT Page 32

SYSTEM SOFTWARE ( CS 2304) UNIT - II215 1062 WLOOP TD =Xlsquo05rsquo E32011

1076 =Xlsquo05rsquo 05

bull In this assembler language notation a literal is identified with the prefix= which is followed by a

specification of the literal value

The difference between a literal and an immediate operand

With immediate addressing the operand value is assembled as a part of the machine

instruction

With a literal the assembler generates the specified value as a constant at some other

memory location The address of this generated constant is used as the target address for the

machine instruction

Literal pool

All of the literal operands used in a program are gathered together into one or more literal

pools

Where the literal pool should be placed

93 LTORG

002D =ClsquoEOFrsquo 454F46

1048698 The assembler directive LTORG tells the assembler to generate a literal pool here

Literal for current value of location counter

1048698 The value of the location counter can be denoted by a literal operand

bull BASE

bull LDB =

Handling duplicate literal operands

The assembler should avoid storing duplicate literals

The easiest way to recognize duplicate literals is by comparison of the character

strings defining them

bull 100 LDA =ClsquoEOFrsquo

LALITHAMBIGAIB APIT Page 33

SYSTEM SOFTWARE ( CS 2304) UNIT - IIbull 125 LDA =ClsquoEOFrsquo

bull 160 LDA =Xlsquo454F46rsquo Literal operands with different

literal names may have the same

literal values

Recognizing literal operands that have different literal names but have the same literal values will

complicate the design of an assembler

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB)

LITTAB contains Literal NameValue and Length

LITTAB is often organized as a hash table using the literal name or value as the key

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without an address assigned.

• When a LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal that does not yet have one.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
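The Pass 1 steps above can be sketched as follows. This is a minimal illustration, not a full assembler: every instruction is assumed to occupy 3 bytes, statements are reduced to (opcode, operand) pairs, and the names Literal, literal_bytes and pass1 are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Literal:
    value: bytes
    address: Optional[int] = None    # stays None until a literal pool is placed


def literal_bytes(name: str) -> bytes:
    """Decode =C'...' or =X'...' into the bytes it denotes."""
    kind, body = name[1], name[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


def pass1(statements: List[Tuple[str, str]], start: int = 0) -> Tuple[Dict[str, Literal], int]:
    littab: Dict[str, Literal] = {}  # keyed by literal name, so duplicates collapse
    locctr = start
    for opcode, operand in statements:
        if opcode in ("LTORG", "END"):
            # Place a pool here: give an address to every literal still waiting.
            for lit in littab.values():
                if lit.address is None:
                    lit.address = locctr
                    locctr += len(lit.value)
            continue
        if operand.startswith("="):
            littab.setdefault(operand, Literal(literal_bytes(operand)))
        locctr += 3                  # assumption: fixed 3-byte instructions
    return littab, locctr


program = [
    ("LDA", "=C'EOF'"),
    ("LDA", "=C'EOF'"),   # duplicate: already in LITTAB, nothing is added
    ("LTORG", ""),        # first pool: =C'EOF' is placed at 0006
    ("TD", "=X'05'"),
    ("END", ""),          # second pool at the end of the program
]
littab, end_locctr = pass1(program)
```

In Pass 2 the assembler would look up each literal operand in littab to obtain its address, and emit the stored bytes at the pool locations.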

Symbol-defining statements

Assembler directives: EQU and ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values:

    symbol   EQU   value

When the assembler encounters the EQU statement, it enters "symbol" into SYMTAB with the specified value.

Use of EQU

Establish symbolic names that can be used in place of numeric values to improve readability. Instead of

    +LDT   #4096

we can write

    MAXLEN   EQU    4096
             +LDT   #MAXLEN

Define mnemonic names for registers:

    A   EQU   0
    X   EQU   1
    L   EQU   2

so that an instruction such as RMO 0,1 can be written as RMO A,X.

Establish and use names that reflect the logical function of the registers in the program:

    BASE          EQU   R1
    ACCUMULATOR   EQU   R2
    INDEX         EQU   R3

Assembler directive ORG

The assembler directive ORG is usually used to assign values to symbols indirectly:

    ORG   value

"Value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB (100 entries)

    Field     Length
    SYMBOL    6 bytes
    VALUE     3 bytes
    FLAGS     1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can simply write

    ORG

to return to the normal use of LOCCTR.
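A small simulation may help to show how ORG assigns the field labels. It assumes the field lengths given above (6, 3 and 1 bytes, so 10 bytes per entry) and a hypothetical starting address of 1000 (hex); the function define_table is illustrative only.

```python
from typing import Dict, List, Tuple


def define_table(start: int, entries: int,
                 fields: List[Tuple[str, int]]) -> Tuple[Dict[str, int], int]:
    symtab: Dict[str, int] = {}
    entry_len = sum(length for _, length in fields)
    locctr = start

    symtab["STAB"] = locctr            # STAB  RESB  entries * entry_len
    locctr += entries * entry_len

    saved = locctr                     # value a bare ORG will restore
    locctr = symtab["STAB"]            # ORG   STAB
    for name, length in fields:        # SYMBOL RESB 6, VALUE RESB 3, FLAGS RESB 1
        symtab[name] = locctr
        locctr += length
    locctr = saved                     # ORG   (return to normal use of LOCCTR)
    return symtab, locctr


symtab, locctr = define_table(0x1000, 100, [("SYMBOL", 6), ("VALUE", 3), ("FLAGS", 1)])
```

The field labels end up as addresses inside the first entry, so a reference such as VALUE,X with the index register holding a multiple of the entry length reaches the same field of a later entry.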

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program.

Allowed:

    ALPHA   RESW   1
    BETA    EQU    ALPHA

Disallowed (ALPHA is used before it is defined):

    BETA    EQU    ALPHA
    ALPHA   RESW   1

Disallowed (neither BETA nor DELTA is defined when it is used):

    ALPHA   EQU    BETA
    BETA    EQU    DELTA
    DELTA   RESW   1

Disallowed (ORG refers to ALPHA before it is defined):

            ORG    ALPHA
    BYTE1   RESB   1
    BYTE2   RESB   1
    BYTE3   RESB   1
            ORG
    ALPHA   RESB   1
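The restriction can be illustrated with a sketch of the check an ordinary two-pass assembler might make during Pass 1. Statements are simplified to (label, opcode, operand) triples and the function name pass1_check is hypothetical.

```python
import re
from typing import List, Tuple


def pass1_check(statements: List[Tuple[str, str, str]]) -> List[str]:
    """Report EQU/ORG operands that use symbols not yet defined."""
    defined = set()
    errors = []
    for label, opcode, operand in statements:
        if opcode in ("EQU", "ORG") and operand:
            # Symbols start with a letter; numeric constants are skipped.
            for term in re.findall(r"[A-Za-z][A-Za-z0-9]*", operand):
                if term not in defined:
                    errors.append(f"{opcode} {operand}: {term} is not yet defined")
        if label:
            defined.add(label)         # the label becomes usable by later statements
    return errors


allowed = pass1_check([("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")])
forward = pass1_check([("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")])
chained = pass1_check([("ALPHA", "EQU", "BETA"), ("BETA", "EQU", "DELTA"),
                       ("DELTA", "RESW", "1")])
org_fwd = pass1_check([("", "ORG", "ALPHA"), ("BYTE1", "RESB", "1"),
                       ("", "ORG", ""), ("ALPHA", "RESB", "1")])
```

Only the first sequence produces no errors; the other three are rejected because a term is used before its definition has been seen.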

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, *, and /.

• Individual terms in the expression may be constants, user-defined symbols, or special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms: the value of an absolute term is independent of the program location.

Relative terms: the value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

    Symbol    Type    Value
    MAXLEN    A       1000
    BUFEND    R       1036
    BUFFER    R       0036
    RETADR    R       0030

BUFEND+BUFFER, 100-BUFFER, and 3*BUFFER are neither relative expressions nor absolute expressions.
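The classification rules can be sketched as a small checker that uses the type flags from the table above. It is a simplification under stated assumptions: expressions contain no parentheses, the tokenizer is a plain regular expression, and the names SYMTAB and expression_type are hypothetical.

```python
import re

# Symbol -> (type flag, value in hex), copied from the table above.
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def expression_type(expr: str, symtab=SYMTAB) -> str:
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[-+*/]", expr)
    net_relative = 0                   # relative terms with '+' minus those with '-'
    sign = 1
    for i, tok in enumerate(tokens):
        if tok in ("+", "-"):
            sign = 1 if tok == "+" else -1
        elif tok in ("*", "/") or tok.isdigit():
            continue                   # operators and constants carry no relocation
        elif symtab[tok][0] == "R":
            neighbours = tokens[max(i - 1, 0):i] + tokens[i + 1:i + 2]
            if "*" in neighbours or "/" in neighbours:
                return "error"         # relative term in multiplication or division
            net_relative += sign
    # All pairs cancelled -> absolute; one unpaired positive term -> relative.
    return {0: "absolute", 1: "relative"}.get(net_relative, "error")
```

With this table, BUFEND-BUFFER is classified as absolute, RETADR and BUFFER+100 as relative, and BUFEND+BUFFER, 100-BUFFER and 3*BUFFER as errors.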

Program blocks

The example program is divided into three blocks: the default block (0), CDATA (1), and CBLKS (2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need larger amounts of memory

When the assembly listing is generated, one more column is included: the block number. Whenever a new block begins, the programmer writes USE followed by the block name, and the assembler sets the LOCCTR for that block to 0000. The block number field is also added to SYMTAB.

The program block table holds the block name, block number, starting address, and length of each block.

After the location counter values have been found, the assembler generates the object code and writes the object program.
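The bookkeeping for program blocks can be sketched as below. The sketch assumes one location counter per block, that a USE naming an earlier block resumes that block's counter, and that blocks are laid out in order of first appearance; the statement format and the name pass1_blocks are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Statement = Tuple[str, str, Union[str, int]]   # (label, opcode, block name or length)


def pass1_blocks(statements: List[Statement]):
    locctr: Dict[str, int] = {"": 0}           # "" stands for the default block
    current = ""
    symtab: Dict[str, Tuple[str, int]] = {}    # label -> (block name, offset in block)
    for label, opcode, arg in statements:
        if opcode == "USE":
            current = str(arg)
            locctr.setdefault(current, 0)      # a new block starts at 0000
            continue
        if label:
            symtab[label] = (current, locctr[current])
        locctr[current] += int(arg)            # arg is the statement length in bytes

    # End of Pass 1: build the block table by placing the blocks one after another.
    blocktab = {}
    start = 0
    for number, (name, length) in enumerate(locctr.items()):
        blocktab[name] = {"number": number, "start": start, "length": length}
        start += length

    # Final address of a symbol = start of its block + offset within the block.
    addresses = {sym: blocktab[blk]["start"] + off for sym, (blk, off) in symtab.items()}
    return blocktab, addresses


blocktab, addresses = pass1_blocks([
    ("FIRST", "STL", 3),
    ("", "USE", "CDATA"),
    ("RETADR", "RESW", 3),
    ("LENGTH", "RESW", 3),
    ("", "USE", "CBLKS"),
    ("BUFFER", "RESB", 4096),
    ("", "USE", ""),                           # resume the default block
    ("CLOOP", "JSUB", 3),
])
```

Here CLOOP receives address 0003 even though it appears last in the source, because the default block's counter resumes where it stopped.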


Program Block Diagram


Control sections and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Object code generation changes only for instructions that refer to symbols named in EXTREF, because the assembler cannot know where those symbols will be located.

The object program needs six types of record: Header, Define, Refer, Text, Modification, and End.
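As an illustration of the Define and Refer records, the sketch below formats them using the usual textbook SIC/XE layout: 6-character symbol names, 6 hexadecimal digits per address, and a Modification record that carries a sign and the external symbol name. That layout is an assumption here, since these notes do not spell it out, and the function names are hypothetical.

```python
from typing import Dict, List


def define_record(symbols: Dict[str, int]) -> str:
    """D record: each EXTDEF symbol with its relative address in this section."""
    return "D" + "".join(f"{name:<6}{addr:06X}" for name, addr in symbols.items())


def refer_record(names: List[str]) -> str:
    """R record: each EXTREF symbol, padded to 6 characters."""
    return "R" + "".join(f"{name:<6}" for name in names)


def modification_record(start: int, half_bytes: int, sign: str, symbol: str) -> str:
    """M record asking the loader to add or subtract an external symbol's value."""
    return f"M{start:06X}{half_bytes:02X}{sign}{symbol}"


d = define_record({"BUFFER": 0x33, "BUFEND": 0x1033, "LENGTH": 0x2D})
r = refer_record(["RDREC", "WRREC"])
m = modification_record(0x4, 5, "+", "RDREC")
```

The symbol names and addresses in the usage lines are sample values chosen for illustration.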


Object Program for Control Section and Program Linking


Example Program Relocation

1) The above diagram shows the concept of relocation.

a) Initially the program is loaded at location 0000.

b) The JSUB instruction is loaded at location 0006.

c) The address field of this instruction contains 01036, which is the address of the instruction labeled RDREC.

2) The second figure shows the program loaded at a new location, 5000. The address field of the JSUB instruction is modified to the new location 6036.

3) Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses.
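The arithmetic behind these figures can be checked with a short sketch. It assumes the JSUB instruction is the extended-format word 4B101036, whose low 20 bits are the address field; the function name relocate is hypothetical.

```python
def relocate(instruction_hex: str, load_address: int) -> str:
    """Add the load address to the 20-bit address field of a 4-byte instruction."""
    word = int(instruction_hex, 16)
    address = word & 0xFFFFF                  # low 20 bits: the address field
    upper = word >> 20                        # opcode and flag bits stay unchanged
    new_address = (address + load_address) & 0xFFFFF
    return f"{(upper << 20) | new_address:08X}"


at_5000 = relocate("4B101036", 0x5000)        # 01036 + 5000 = 06036
at_7420 = relocate("4B101036", 0x7420)        # 01036 + 7420 = 08456
```

All values are hexadecimal, so 1036 + 5000 = 6036 and 1036 + 7420 = 8456, matching the figures described above.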

The rest of the instructions need not be modified, because their operands are either

o not a memory address (immediate addressing), or

o PC-relative or base-relative.

From the object program alone, it is not possible to distinguish an address from a constant.

o The assembler must therefore keep some information to tell the loader which fields to modify.

o An object program that contains the modification records is called a relocatable program.

The way to solve the relocation problem

For an address label, the address is assigned relative to the start of the program (START 0).

Produce a Modification record to store the starting location and the length of the address field to be modified.

The command for the loader must also be a part of the object program.

Modification record

There is one Modification record for each address to be modified.

The length is stored in half-bytes (4 bits each).

The starting location is the location of the byte containing the leftmost bits of the address field to be modified.

If the field contains an odd number of half-bytes, it is assumed to begin in the middle of the first byte at the starting location.

Relocatable Object Program

In the above object code, the red boxes indicate the addresses that need modification. The lines at the end of the object code are the Modification records for the instructions that must change if relocation occurs. For example, M00000705 is the Modification record for the JSUB statement at location 0006: the address field to be modified begins at location 0007 and is 5 half-bytes long.
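How a loader might act on such a record can be sketched as follows. The record layout assumed here is M, then 6 hexadecimal digits for the starting location, then 2 for the length in half-bytes, as in M00000705; the function apply_modification and the 10-byte memory image are hypothetical.

```python
def apply_modification(memory: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field described by a Modification record."""
    start = int(record[1:7], 16)              # byte holding the field's leftmost bits
    half_bytes = int(record[7:9], 16)         # field length in half-bytes
    n_bytes = (half_bytes + 1) // 2
    field = int.from_bytes(memory[start:start + n_bytes], "big")
    # For an odd length the mask leaves the high half of the first byte untouched.
    mask = (1 << (4 * half_bytes)) - 1
    new_value = ((field & mask) + load_address) & mask
    field = (field & ~mask) | new_value
    memory[start:start + n_bytes] = field.to_bytes(n_bytes, "big")


# Six filler bytes, then the JSUB instruction 4B101036 at location 0006.
memory = bytearray(6) + bytearray.fromhex("4B101036")
apply_modification(memory, "M00000705", 0x5000)
relocated = memory[6:10].hex().upper()
```

Only the low 5 half-bytes of the three bytes at locations 0007 to 0009 change; the half-byte holding the flag bits keeps its value.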

Machine-independent assembler features

Assembler features not closely related to machine architecture:

• Literals

• Symbol-defining statements

• Expressions

• Program blocks

• Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as part of the instruction that uses it. Such an operand is called a literal.

    45    001A   ENDFIL   LDA   =C'EOF'    032010
          002D            =C'EOF'          454F46

    215   1062   WLOOP    TD    =X'05'     E32011
          1076            =X'05'           05

• In this assembler language notation, a literal is identified by the prefix =, which is followed by a specification of the literal value.
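The object code of an instruction that refers to a literal can be worked out from the literal's address. The sketch below assumes a 3-byte SIC/XE instruction with simple addressing (n = i = 1) and PC-relative displacement, and the standard opcodes 00 for LDA and E0 for TD; the function name is hypothetical.

```python
def format3_pc_relative(opcode: int, instr_addr: int, target: int) -> str:
    """Assemble a 3-byte instruction that reaches target PC-relatively."""
    pc = instr_addr + 3                       # PC already points at the next instruction
    disp = target - pc
    if not -2048 <= disp <= 2047:
        raise ValueError("target is out of PC-relative range")
    first = opcode | 0b11                     # n = 1, i = 1
    flags = 0b0010                            # x = 0, b = 0, p = 1, e = 0
    return f"{first:02X}{flags:X}{disp & 0xFFF:03X}"


lda = format3_pc_relative(0x00, 0x001A, 0x002D)   # displacement 02D - 01D = 010
td = format3_pc_relative(0xE0, 0x1062, 0x1076)    # displacement 1076 - 1065 = 011
```

The target address in each case is the address assigned to the literal in the literal pool, not the literal's value.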



Page 32: Introduction - Home - Sri Venkateswara College of … · Web viewM00000705 is the modification suggested for the statement at location 0007 and requires modification 5-half bytes

SYSTEM SOFTWARE ( CS 2304) UNIT - II

Relocatable Object Program

In the above object code the red boxes indicate the addresses that need modifications The object

code lines at the end are the descriptions of the modification records for those instructions which

need change if relocation occurs M00000705 is the modification suggested for the statement at

location 0007 and requires modification 5-half bytes

Machine dependent assembler featuresAssembler features not closely related to machine architecture

bull Literalsbull Symbol-defining statementsbull Expressionsbull Program blocksbull Control sections and program linking

Literals

It is convenient for the programmer to be able to write the value of a constant operand as a part of

the instruction that uses it Such an operand is called a literal

45 001A ENDFIL LDA =ClsquoEOFrsquo 003210

002D =ClsquoEOFrsquo 454F46

LALITHAMBIGAIB APIT Page 32

SYSTEM SOFTWARE ( CS 2304) UNIT - II215 1062 WLOOP TD =Xlsquo05rsquo E32011

1076 =Xlsquo05rsquo 05

bull In this assembler language notation a literal is identified with the prefix= which is followed by a

specification of the literal value

The difference between a literal and an immediate operand

With immediate addressing the operand value is assembled as a part of the machine

instruction

With a literal the assembler generates the specified value as a constant at some other

memory location The address of this generated constant is used as the target address for the

machine instruction

Literal pool

All of the literal operands used in a program are gathered together into one or more literal pools.

Where the literal pool should be placed

93           LTORG
      002D   *        =C'EOF'   454F46

• The assembler directive LTORG tells the assembler to generate a literal pool here.

Literal for current value of location counter

• The value of the location counter can be denoted by a literal operand, written =*.

• BASE *

• LDB =*

Handling duplicate literal operands

The assembler should avoid storing duplicate literals.

The easiest way to recognize duplicate literals is by comparing the character strings that define them.

• 100 LDA =C'EOF'

• 125 LDA =C'EOF'

• 160 LDA =X'454F46'

Literal operands with different literal names may have the same literal value, as =C'EOF' and =X'454F46' do. Recognizing literal operands that have different names but the same value complicates the design of an assembler.
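The difference between comparing names and comparing values can be sketched in a few lines of Python. This is only an illustration: it assumes character literals are ASCII and handles just the C'...' and X'...' forms used in the examples. The function name `literal_value` is hypothetical.

```python
def literal_value(name: str) -> bytes:
    """Return the bytes denoted by a literal name such as =C'EOF' or =X'454F46'."""
    if len(name) < 5 or name[0] != "=" or name[2] != "'" or name[-1] != "'":
        raise ValueError(f"malformed literal: {name!r}")
    kind, body = name[1], name[3:-1]
    if kind == "C":
        return body.encode("ascii")      # one byte per character
    if kind == "X":
        return bytes.fromhex(body)       # two hex digits per byte
    raise ValueError(f"unknown literal type: {kind!r}")


a, b = "=C'EOF'", "=X'454F46'"
print(a == b)                                # prints False
print(literal_value(a) == literal_value(b))  # prints True
```

Comparing names is a plain string comparison, while comparing values requires the assembler to evaluate every literal first, which is the extra design work the text refers to.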

Literal table (LITTAB)

The basic data structure needed to process literal operands is a literal table (LITTAB).

For each literal, LITTAB contains the literal name, the operand value and length, and the address assigned to the operand.

LITTAB is often organized as a hash table, using the literal name or value as the key.

Processing literal operands

Pass 1

• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed. If it is not present, the literal is added to LITTAB without assigning its address.

• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal.

• The location counter is updated to reflect the number of bytes occupied by each literal.

Pass 2

• Search LITTAB for each literal operand encountered.

• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.

• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
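The Pass 1 steps can be sketched as a small Python class. This is a simplified illustration under stated assumptions: the table is a dictionary keyed by literal name, the caller supplies the literal's bytes, and the class and method names (`LitTab`, `add`, `assign_addresses`) are hypothetical.

```python
class LitTab:
    """Literal table keyed by literal name (a sketch, not a full assembler)."""

    def __init__(self):
        self.entries = {}  # name -> {"value": bytes, "length": int, "address": int or None}

    def add(self, name: str, value: bytes) -> None:
        # Pass 1: enter the literal only if it is not already present;
        # its address stays unassigned until a pool is generated.
        if name not in self.entries:
            self.entries[name] = {"value": value, "length": len(value), "address": None}

    def assign_addresses(self, locctr: int) -> int:
        # At LTORG or end of program: place every literal that has no address
        # yet and return the updated location counter.
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr


littab = LitTab()
littab.add("=C'EOF'", b"EOF")
littab.add("=C'EOF'", b"EOF")                    # duplicate, ignored
print(hex(littab.assign_addresses(0x002D)))      # prints 0x30
```

In Pass 2 the assembler would look up `entries[name]["address"]` for each literal operand and emit each `value` into the object program at its assigned address.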

Symbol-defining statements

Assembler directives

EQU

ORG

Assembler directive EQU

Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

Symbol EQU value

When the assembler encounters the EQU statement, it enters the symbol into SYMTAB with the specified value.

Use of EQU

Establish symbolic names that can be used for improved readability in place of numeric values. Instead of

+LDT #4096

we can write

MAXLEN EQU 4096
+LDT #MAXLEN

Define mnemonic names for registers:

A EQU 0
X EQU 1
L EQU 2

so that RMO 0,1 can be written as RMO A,X.

Establish and use names that reflect the logical function of the registers in the program:

BASE EQU R1
ACCUMULATOR EQU R2
INDEX EQU R3

Assembler directive ORG

The assembler directive ORG is usually used to assign values to symbols indirectly.

ORG value

Here "value" is a constant or an expression involving constants and previously defined symbols.

When this statement is encountered, the assembler resets its location counter (LOCCTR) to the specified value.

Use ORG for label definition

Suppose that we want to define a table with the following structure:

STAB (100 entries)   SYMBOL    VALUE     FLAGS
Size per entry       6 bytes   3 bytes   1 byte

In some assemblers the previous value of LOCCTR is automatically remembered, so we can write

ORG

to return to the normal use of LOCCTR.
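The effect of ORG on label definition can be illustrated with a short Python sketch that mimics the location counter. It uses the field sizes given in the table above (6, 3 and 1 bytes). The table address 1000 (hex) and the function name `layout_with_org` are assumptions made for the example.

```python
def layout_with_org(table_start: int, fields):
    """Mimic 'ORG table_start' followed by one storage reservation per field.

    fields is a list of (label, size_in_bytes). Returns the symbol values and
    the size of one table entry.
    """
    locctr = table_start              # ORG resets LOCCTR to the table start
    symtab = {}
    for label, size in fields:
        symtab[label] = locctr        # the label takes the current LOCCTR value
        locctr += size                # reserving storage advances LOCCTR
    return symtab, locctr - table_start


symtab, entry_size = layout_with_org(0x1000, [("SYMBOL", 6), ("VALUE", 3), ("FLAGS", 1)])
print({name: hex(addr) for name, addr in symtab.items()}, entry_size)
# prints {'SYMBOL': '0x1000', 'VALUE': '0x1006', 'FLAGS': '0x1009'} 10
```

After the field labels are defined, LOCCTR has to be moved past the whole table (here 100 entries of 10 bytes each) before assembly continues, which is what the final ORG accomplishes.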

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. Of the following sequences, only the first can be processed by an ordinary two-pass assembler.

Allowed:

ALPHA RESW 1
BETA EQU ALPHA

Disallowed (ALPHA is used before it is defined):

BETA EQU ALPHA
ALPHA RESW 1

Disallowed (ALPHA depends on BETA, which depends on DELTA, which is defined later):

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1

Disallowed (ORG refers to ALPHA before ALPHA is defined):

ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1
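The restriction on EQU can be checked mechanically during Pass 1. The Python sketch below is a simplification: it handles only RESW, RESB and EQU with a single symbol as operand, starts the location counter at 0, and uses the hypothetical function name `pass1_symbols`.

```python
def pass1_symbols(statements):
    """Build SYMTAB from (label, opcode, operand) triples in a single scan."""
    symtab = {}
    locctr = 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            # The operand must already be in SYMTAB: no forward references.
            if operand not in symtab:
                raise ValueError(f"{label} EQU {operand}: {operand} is not yet defined")
            symtab[label] = symtab[operand]
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)   # a SIC word is 3 bytes
        elif opcode == "RESB":
            symtab[label] = locctr
            locctr += int(operand)
        else:
            raise ValueError(f"unsupported opcode in this sketch: {opcode}")
    return symtab


print(pass1_symbols([("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]))
# prints {'ALPHA': 0, 'BETA': 0}
```

Reversing the two statements makes the function raise `ValueError`, which corresponds to the first disallowed sequence.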

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules, using the operators +, -, * and /.

• Individual terms in the expression may be

• constants,

• user-defined symbols, or

• special terms.

• The most common special term is the current value of the location counter (often designated by *).

Types of terms

Absolute terms - The value of an absolute term is independent of the program location.

Relative terms - The value of a relative term depends on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1 Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term may enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2 Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term may enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

Determining types of expressions

Symbol    Type   Value
MAXLEN    A      1000
BUFEND    R      1036
BUFFER    R      0036
RETADR    R      0030

BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.
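The pairing rule reduces to counting signed relative terms: a net count of 0 gives an absolute expression, a net count of +1 gives a relative expression, and anything else is an error. The Python sketch below applies this to the symbol types in the table. It is limited to + and - (a relative term under * or / would be rejected before this step), and the names `SYMTYPE` and `classify` are illustrative.

```python
SYMTYPE = {"MAXLEN": "A", "BUFEND": "R", "BUFFER": "R", "RETADR": "R"}


def classify(terms, symtype=SYMTYPE) -> str:
    """Classify a sum of signed terms as 'A' (absolute) or 'R' (relative).

    terms is a list of (sign, term) with sign +1 or -1; a term is either a
    symbol name or an integer constant (constants are always absolute).
    """
    net = 0
    for sign, term in terms:
        if isinstance(term, str) and symtype[term] == "R":
            net += sign                  # paired +R and -R terms cancel out
    if net == 0:
        return "A"
    if net == 1:
        return "R"
    raise ValueError("expression is neither absolute nor relative")


print(classify([(+1, "BUFEND"), (-1, "BUFFER")]))   # prints A
print(classify([(+1, "BUFFER"), (+1, 3)]))          # prints R
```

Calling `classify([(+1, "BUFEND"), (+1, "BUFFER")])` or `classify([(+1, 100), (-1, "BUFFER")])` raises `ValueError`, matching the two additive error cases listed in the text.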

Program blocks

The program is divided into three blocks: Default (0), CDATA (1) and CBLKS (2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that occupy little memory

CBLKS (2): block for data areas that consist of larger blocks of memory

When object code is generated, one more column, the block number, is included in the listing. Whenever a new block begins, the programmer writes USE followed by the block name, and the assembler sets the LOCCTR of a block to 0000 when that block is first begun.

The block number is also added as a field to SYMTAB.

The program block table consists of the block name, block number, starting address and length.

After the location counter values have been determined, the assembler generates the object code and writes the object program.
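How the block table is filled in at the end of Pass 1 can be sketched in Python. Each block is placed directly after the previous one, so a block's starting address is the sum of the lengths of the blocks before it. The block lengths used here (66, 0B and 1000 hex) are illustrative values, not taken from the notes, and the function names are hypothetical.

```python
def assign_block_addresses(blocks):
    """Build the block table from (name, length) pairs listed in block-number order."""
    table = {}
    start = 0
    for number, (name, length) in enumerate(blocks):
        table[name] = {"number": number, "address": start, "length": length}
        start += length               # the next block begins where this one ends
    return table


def absolute_address(table, block: str, relative: int) -> int:
    """Convert a block-relative SYMTAB value into a program-relative address."""
    return table[block]["address"] + relative


table = assign_block_addresses([("DEFAULT", 0x66), ("CDATA", 0x0B), ("CBLKS", 0x1000)])
print(hex(table["CDATA"]["address"]), hex(table["CBLKS"]["address"]))  # prints 0x66 0x71
```

In Pass 2, a symbol stored in SYMTAB as block 1, relative address 3 would be resolved with `absolute_address(table, "CDATA", 0x3)`, giving 69 (hex) in this example.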


Program Block Diagram


Control section and program linking

Control section

i A control section is a part of the program that maintains its identity after assembly.

ii Each control section can be loaded and relocated independently of the others.

iii Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.

Assembler directives EXTDEF and EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.

Only the object code generated for references to EXTREF symbols changes. The assembler cannot know the addresses of these symbols, so it inserts an address of zero and leaves the loader to supply the actual value through a Modification record.

An object program needs six types of record: Header record, Define record, Refer record, Text record, Modification record and End record.
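As an illustration of the two records that carry EXTDEF and EXTREF information, the Python sketch below builds a Define record and a Refer record. The column layout is an assumption, since the notes do not spell it out: a record letter, then names left-justified in six columns, with each name in a Define record followed by a six-digit hexadecimal address. The symbol names and addresses are hypothetical.

```python
def define_record(symbols) -> str:
    """Build a Define record from (name, relative_address) pairs (EXTDEF symbols)."""
    for name, _ in symbols:
        if len(name) > 6:
            raise ValueError(f"name longer than 6 characters: {name}")
    return "D" + "".join(f"{name:<6}{addr:06X}" for name, addr in symbols)


def refer_record(names) -> str:
    """Build a Refer record from the names listed in EXTREF."""
    for name in names:
        if len(name) > 6:
            raise ValueError(f"name longer than 6 characters: {name}")
    return "R" + "".join(f"{name:<6}" for name in names)


print(define_record([("BUFFER", 0x0036), ("BUFEND", 0x1036)]))
# prints DBUFFER000036BUFEND001036
print(refer_record(["RDREC", "WRREC"]))
```

The Define record tells the loader where this section's external symbols are, and the Refer record lists the names whose addresses must be supplied from other sections.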


Object Program for Control Section and Program Linking




In some assemblers, the previous value of LOCCTR is automatically remembered, so we can simply write

    ORG

with no operand to return to the normal use of LOCCTR.
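The save-and-restore behaviour described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular assembler; the class name `LocationCounter` and its methods are hypothetical.

```python
class LocationCounter:
    """Hypothetical LOCCTR that remembers its previous value across ORG."""

    def __init__(self, start=0):
        self.value = start
        self._saved = None  # value to return to on an operand-less ORG

    def advance(self, nbytes):
        """Move past nbytes of storage (e.g. after RESB or an instruction)."""
        self.value += nbytes

    def org(self, new_value=None):
        """ORG value: save LOCCTR and jump; ORG with no operand: restore."""
        if new_value is not None:
            self._saved = self.value
            self.value = new_value
        elif self._saved is not None:
            self.value = self._saved
            self._saved = None


lc = LocationCounter(0x1000)
table_start = lc.value
lc.advance(1100)       # reserve a 1100-byte table
lc.org(table_start)    # ORG to the table start to lay out its fields
lc.advance(6)          # a 6-byte field
lc.org()               # plain ORG: back to the normal LOCCTR
print(hex(lc.value))
```

The sketch keeps only one saved value, which matches the single "previous value" the text mentions; nested ORG statements would need a stack instead.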

Restrictions of EQU and ORG in an ordinary two-pass assembler

For an ordinary two-pass assembler, all symbols must be defined during Pass 1. All terms used to specify the value of a new symbol must therefore have been defined previously in the program. The sequences below show what can and cannot be processed by an ordinary two-pass assembler.

Allowed (ALPHA is defined before BETA uses it):

    ALPHA   RESW   1
    BETA    EQU    ALPHA

Disallowed (ALPHA is not yet defined when BETA is processed):

    BETA    EQU    ALPHA
    ALPHA   RESW   1

Disallowed (ALPHA depends on BETA, and BETA depends on DELTA, which is defined last):

    ALPHA   EQU    BETA
    BETA    EQU    DELTA
    DELTA   RESW   1

Disallowed (ORG refers to ALPHA before ALPHA is defined, so BYTE1, BYTE2 and BYTE3 cannot be given addresses during Pass 1):

            ORG    ALPHA
    BYTE1   RESB   1
    BYTE2   RESB   1
    BYTE3   RESB   1
            ORG
    ALPHA   RESB   1
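The restriction can be illustrated with a small Pass 1 sketch that scans the statements once and reports every EQU whose operand is not yet in the symbol table. The function name `pass1` and the tuple layout are assumptions made for this illustration; only RESB, RESW and EQU with a single-symbol operand are handled.

```python
def pass1(statements):
    """Assign symbol values in one forward scan, as Pass 1 would.

    statements: list of (label, opcode, operand) tuples.
    Returns (symtab, errors).
    """
    symtab, errors, locctr = {}, [], 0
    for label, opcode, operand in statements:
        if opcode == "EQU":
            if operand in symtab:
                symtab[label] = symtab[operand]
            else:
                # Forward reference: the value cannot be known in Pass 1.
                errors.append(f"{label}: {operand} is not yet defined")
        elif opcode == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)  # a SIC word is 3 bytes
        elif opcode == "RESB":
            symtab[label] = locctr
            locctr += int(operand)
    return symtab, errors


allowed = [("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]
disallowed = [("BETA", "EQU", "ALPHA"), ("ALPHA", "RESW", "1")]
print(pass1(allowed))
print(pass1(disallowed))
```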

Expressions

Most assemblers allow the use of expressions wherever a single operand, such as a label or literal, is permitted.

• Each such expression must be evaluated by the assembler to produce a single operand address or value.

Assemblers generally allow arithmetic expressions formed according to the normal rules using the operators +, -, * and /.

• Individual terms in the expression may be

    • constants,

    • user-defined symbols, or

    • special terms.

• The most common special term is the current value of the location counter (often designated by *).
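A minimal sketch of such an evaluator is given below. It assumes that `*` in a term position stands for the current location counter and that `*` between two terms is multiplication; numeric constants are read as decimal, division is integer division, and parentheses and unary minus are not handled. The function name `evaluate` and its parameters are hypothetical.

```python
import re


def evaluate(expr, symtab, locctr):
    """Evaluate a flat expression of constants, symbols and '*' (LOCCTR)."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[-+*/]", expr)
    terms, ops = [], []
    expect_term = True
    for tok in tokens:
        if expect_term:
            if tok == "*":                 # special term: location counter
                terms.append(locctr)
            elif tok.isdigit():
                terms.append(int(tok))
            elif tok in symtab:
                terms.append(symtab[tok])
            else:
                raise ValueError(f"undefined symbol or misplaced operator: {tok}")
            expect_term = False
        else:
            if tok not in ("+", "-", "*", "/"):
                raise ValueError(f"operator expected, found: {tok}")
            ops.append(tok)
            expect_term = True
    if expect_term:
        raise ValueError("expression is empty or ends with an operator")

    # First fold * and /, which bind tighter than + and -.
    reduced, pending = [terms[0]], []
    for op, term in zip(ops, terms[1:]):
        if op == "*":
            reduced[-1] *= term
        elif op == "/":
            reduced[-1] //= term
        else:
            pending.append(op)
            reduced.append(term)

    # Then apply + and - from left to right.
    result = reduced[0]
    for op, term in zip(pending, reduced[1:]):
        result = result + term if op == "+" else result - term
    return result


symbols = {"BUFEND": 0x1036, "BUFFER": 0x0036}
print(hex(evaluate("BUFEND-BUFFER", symbols, 0)))
```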

Types of terms

Absolute terms - The value of an absolute term is independent of program location.

Relative terms - The value of a relative term is dependent on the beginning address of the program.

Types of expressions

By the type of value produced, expressions can be classified as follows.

1. Absolute expressions

• The value of an absolute expression is independent of the program location.

• An absolute expression may contain relative terms, provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term can enter into a multiplication or division operation.

• e.g. MAXLEN EQU BUFEND-BUFFER

2. Relative expressions

• The value of a relative expression is relative to the beginning address of the object program.

• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term can enter into a multiplication or division operation.

Expressions that are neither relative nor absolute should be flagged by the assembler as errors.

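Determining types of expressions

To determine the type of an expression, the assembler must keep track of the type (A = absolute, R = relative) of every symbol in the symbol table:

Symbol    Type    Value
MAXLEN    A       1000
BUFEND    R       1036
BUFFER    R       0036
RETADR    R       0030

With these types, BUFEND+BUFFER, 100-BUFFER and 3*BUFFER are neither relative expressions nor absolute expressions.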
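The pairing rule can be checked mechanically by counting signed relative terms: a net count of 0 means every relative term is paired (absolute), a net count of +1 means one positive term is left over (relative), and anything else is an error. The sketch below applies this to the symbol table above; the function name `classify` and the dictionary layout are assumptions for illustration, and parentheses are not handled.

```python
import re

# Symbol table from the notes: name -> (type, value in hex).
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def classify(expr, symtab):
    """Return 'absolute', 'relative' or 'error' for a flat expression."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[-+*/]", expr)
    net_relative = 0
    sign = 1
    for i, tok in enumerate(tokens):
        if tok in ("+", "-"):
            sign = 1 if tok == "+" else -1
        elif tok in ("*", "/") or tok.isdigit():
            continue  # operators and constants never change the count
        else:
            kind, _value = symtab[tok]
            if kind == "R":
                before = tokens[i - 1] if i > 0 else ""
                after = tokens[i + 1] if i + 1 < len(tokens) else ""
                # A relative term may not take part in * or /.
                if before in ("*", "/") or after in ("*", "/"):
                    return "error"
                net_relative += sign
    if net_relative == 0:
        return "absolute"
    if net_relative == 1:
        return "relative"
    return "error"


for e in ("BUFEND-BUFFER", "RETADR", "BUFEND+BUFFER", "100-BUFFER", "3*BUFFER"):
    print(e, classify(e, SYMTAB))
```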

Program blocks

In the example, the program is divided into three blocks: Default (block 0), CDATA (block 1) and CBLKS (block 2).

Default (0): block for the executable instructions

CDATA (1): block for data areas that need little memory

CBLKS (2): block for data areas that need larger amounts of memory

One more column, the block number, is then included in the listing when object code is generated. Whenever a new block begins, the program uses the USE directive followed by the block name, and the assembler starts that block's LOCCTR at 0000. When USE switches back to a block that was begun earlier, the assembler resumes from that block's saved LOCCTR value.

The block number is also added as a field to SYMTAB.

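The program block table consists of the block name, block number, starting address and length. At the end of Pass 1, the final LOCCTR value of each block gives its length, and the starting addresses follow by placing the blocks one after another.

After the location counter values have been found, the assembler generates the object code and writes the object program.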
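As a rough sketch of how such a table might be built, the code below keeps one location counter per block and assigns starting addresses in the order the blocks first appear. The function name `build_block_table`, the statement format and the byte sizes are made up for illustration; they are not taken from the example program.

```python
def build_block_table(statements):
    """Build the program block table from a simplified statement list.

    statements: list of (opcode, operand) where ("USE", name) switches
    block (an empty name selects the default block) and any other entry
    is (opcode, size_in_bytes).
    """
    locctr = {"DEFAULT": 0}     # one location counter per block
    current = "DEFAULT"
    for opcode, operand in statements:
        if opcode == "USE":
            current = operand or "DEFAULT"
            locctr.setdefault(current, 0)   # a new block starts at 0
        else:
            locctr[current] += operand      # resume where the block left off

    table, start = [], 0
    for number, (name, length) in enumerate(locctr.items()):
        table.append({"name": name, "number": number,
                      "start": start, "length": length})
        start += length                     # next block follows this one
    return table


program = [
    ("CODE", 6),
    ("USE", "CDATA"), ("DATA", 3),
    ("USE", "CBLKS"), ("DATA", 4096),
    ("USE", ""), ("CODE", 3),
]
for row in build_block_table(program):
    print(row)
```

In Pass 2, the address of a symbol is then its address relative to the block plus the block's starting address from this table.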

Program Block Diagram

Control section and program linking

Control section

i. A control section is a part of the program that maintains its identity after assembly.

ii. Each control section can be loaded and relocated independently of the others.

iii. Different control sections are most often used for subroutines or other logical subdivisions of a program.

Assembler directive: CSECT

CSECT signals the start of a new control section.

The assembler establishes a separate location counter (initialized to 0) for each control section.
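A minimal sketch of the separate location counters is shown below: every START or CSECT statement opens a new symbol table and resets the counter to 0. The function name `assign_addresses`, the tuple layout and the instruction sizes are assumptions for illustration, and the sketch assumes the program begins with a START statement.

```python
def assign_addresses(statements):
    """Give each label an address relative to its own control section.

    statements: list of (label, opcode, size_in_bytes).
    Returns {section_name: {label: address}}.
    """
    symtabs = {}
    current = None
    locctr = 0
    for label, opcode, size in statements:
        if opcode in ("START", "CSECT"):
            current = label
            symtabs[current] = {}
            locctr = 0              # each control section starts at 0
            continue
        if label:
            symtabs[current][label] = locctr
        locctr += size
    return symtabs


program = [
    ("COPY", "START", 0),
    ("FIRST", "STL", 3),
    ("CLOOP", "JSUB", 4),
    ("RDREC", "CSECT", 0),
    ("", "CLEAR", 2),
    ("RLOOP", "TD", 3),
]
print(assign_addresses(program))
```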

Assembler directives: EXTDEF, EXTREF

EXTDEF (external definition)

The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this control section and may be used by other sections. Control section names are automatically considered to be external symbols.

EXTREF (external reference)

The EXTREF statement names symbols that are used in this control section and are defined elsewhere.
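The relationship between the two directives can be sketched as a consistency check: every symbol named in some EXTREF should be named in an EXTDEF of another section or be a control section name. The function name `unresolved_references` and the dictionary layout are hypothetical, and the section and symbol names in the usage lines are only sample data.

```python
def unresolved_references(sections):
    """Return {section: [symbols]} for EXTREF names nobody defines.

    sections: {name: {"extdef": [...], "extref": [...]}}
    """
    # Control section names count as external symbols automatically.
    defined = set(sections)
    for info in sections.values():
        defined.update(info["extdef"])

    missing = {}
    for name, info in sections.items():
        unknown = sorted(set(info["extref"]) - defined)
        if unknown:
            missing[name] = unknown
    return missing


sections = {
    "COPY": {"extdef": ["BUFFER", "BUFEND", "LENGTH"], "extref": ["RDREC", "WRREC"]},
    "RDREC": {"extdef": [], "extref": ["BUFFER", "LENGTH", "BUFEND"]},
    "WRREC": {"extdef": [], "extref": ["LENGTH", "BUFFER"]},
}
print(unresolved_references(sections))
```

In practice this check belongs to the linking loader rather than the assembler, since the assembler sees one control section at a time.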

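Only the object code generation for references to EXTREF symbols changes. The assembler cannot know where an external symbol will be loaded, so it inserts an address of zero and leaves the actual value to be supplied by the loader.

For the object program, six record types are needed: Header record, Define record, Refer record, Text record, Modification record and End record.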
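The three record types that carry linking information can be sketched as simple string formatters. The field widths used here (6-column symbol names, 6-digit hexadecimal addresses, a 2-digit length in half-bytes, and a + or - flag before the symbol in the Modification record) follow the usual SIC/XE object program layout as an assumption; the function names are hypothetical.

```python
def define_record(symbols):
    """D record: name and relative address of each EXTDEF symbol."""
    return "D" + "".join(f"{name:<6}{addr:06X}" for name, addr in symbols)


def refer_record(names):
    """R record: name of each EXTREF symbol."""
    return "R" + "".join(f"{name:<6}" for name in names)


def modification_record(start, half_bytes, sign, symbol):
    """M record: field address, length in half-bytes, flag, symbol."""
    return f"M{start:06X}{half_bytes:02X}{sign}{symbol:<6}"


print(define_record([("BUFFER", 0x33), ("BUFEND", 0x1033)]))
print(refer_record(["RDREC", "WRREC"]))
print(modification_record(0x7, 5, "+", "RDREC"))
```

The first nine characters of the last record have the same shape as the M00000705 record mentioned earlier in these notes: a field starting at address 000007 that is 05 half-bytes long.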

Object Program for Control Section and Program Linking
