CSC 3210 Computer Organization and Programming Chapter 9 EXTERNAL DATA AND TEXT D.M. Rasanjalee Himali

Post on 27-Dec-2015

CSC 3210 Computer Organization and Programming

Chapter 9

EXTERNAL DATA AND TEXT

D.M. Rasanjalee Himali

Introduction

So far, the data structure that we have used to organize memory is the stack.

The stack is well suited to variables local to a subroutine ("automatic" variables in C): a subroutine needs to allocate a block of memory when it starts and throw it away when it is done.

The LIFO characteristics of the stack are natural for this.

Introduction

However, stack variables are constrained in two ways: they have limited scope and limited extent.

Scope is the part of the code in which a variable can be referenced. Extent is the length of time for which the variable exists.

Stack variables have both scope and extent limited to an individual subroutine.

Introduction

Widening scope is a job for the compiler, and we will not discuss it further here.

Lengthening extent, however, is a question of memory allocation.

To do this we need a data structure other than the stack.

Introduction

The solution is to allocate a portion of memory when the program starts and use it for so-called "static" data.

Static variables in functions keep their values between function calls.

This chunk of memory is called a segment.

In fact, the program itself is a chunk of memory that is managed in this way; it is called the text segment.

The memory used for static data is called the data segment.

Review of static data in C

Case 1.

If a variable is declared outside of any function, it is given whole-program extent and whole-program scope.

That is, it can be referenced from any point, in any file that is compiled into the program.

If a variable from another file is going to be used, it must be declared extern to tell the compiler that it is defined elsewhere.

Review of static data in C

Case 2.

If a variable is declared outside of any function but given the keyword static, it is given whole-program extent, but its scope is restricted to the current file.

Review of static data in C

Case 3.

If a variable is declared inside a function with the keyword static, it is again given whole-program extent, but its scope is restricted to the current subroutine.
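The three cases can be seen side by side in one small C file. This is only a minimal sketch: the names shared_total, file_hits, next_id, and hits_so_far are hypothetical and do not come from the slides.

```c
#include <assert.h>

/* Case 1: declared outside any function.
   Whole-program extent and whole-program scope. Another file would
   reach it with the declaration:  extern int shared_total;          */
int shared_total = 0;

/* Case 2: outside any function, but static.
   Whole-program extent, scope restricted to this file.              */
static int file_hits = 0;

/* Case 3: static inside a function.
   Whole-program extent, scope restricted to this function.
   The variable is set up once before the program runs, not on
   every call, so it keeps its value between calls.                  */
int next_id(void)
{
    static int id = 0;          /* stored with the static data, not on the stack */
    id = id + 1;
    file_hits = file_hits + 1;
    shared_total = shared_total + id;
    return id;
}

/* Accessor so the file-local counter can be observed from other files. */
int hits_so_far(void)
{
    return file_hits;
}
```

Calling next_id() three times returns 1, 2, and 3, because id is not re-created on each call the way a stack variable would be.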

External Variables

There are two classes of external variables: zero-initialized (initialized to zero) and non-zero-initialized (initialized to a value other than zero).

When a program is loaded into memory, the program text, the non-zero-initialized variables, and the zero-initialized variables are loaded into different regions of memory called sections.

Each section generally starts at a 0x2000-byte boundary.

External Variables

Program text is read-only.

Initialized and zero-initialized variables are read-write.

These three regions of memory are called the text, data, and bss (block starting symbol) sections, respectively.

The assembler has to be told about these areas using pseudo-ops:

    .section ".text"    ! start of the text section
    .section ".data"    ! start of the non-zero-initialized variables
    .section ".bss"     ! (block starting symbol) start of the zero-initialized variables

All three sections are at low memory, leaving the stack at high memory. The stack is not one of the program sections.

External Variables

The assembler maintains three location counters, one for each of the text, data, and bss sections.

The text Section

Read-only memory region, where the machine code goes.

Code in the text section is loaded into memory starting at memory location 0x2000.

Addresses of machine instructions are relative to the beginning of the program.

The beginning of the program is marked by the label main, which is usually made global with .global main.

The data Section

Read/write memory region, where (non-zero) initialized variables go.

Variables are specified by size. Ex: .double, .word, .half, .byte

Variables can be aligned on a boundary using .align.

String/character data can be represented in three different ways. Ex: .byte, .ascii, and .asciz (which gives a null-terminated string)

Variables MUST be initialized (i.e., to a non-zero value).

The data Section

Ex:

would result in the following three constants in the data section:

Normally, such data are labeled so that they may be referred to in a program:

Notice that we have appended _m to all the memory addresses to distinguish them from stack offsets (_s) and registers (_r).

The data Section

To access such data, we need to load them into a register, or to store the contents of a register into the addressed memory.

In either case, the 32-bit address of the data must first be loaded into a register before the data can be accessed.

The data Section

For example, to compute k = i + j, we would write:

            .section ".data"
            .global i_m, j_m, k_m
    i_m:    .word   3
    j_m:    .word   9
    k_m:    .word   3+9

            .section ".text"
    define(i_r, l0)
    define(j_r, l1)
    define(k_r, l2)
            .global main
    main:   save    %sp, -96, %sp
            sethi   %hi(i_m), %o0               ! load i_m to i_r
            ld      [%o0 + %lo(i_m)], %i_r

            sethi   %hi(j_m), %o0               ! load j_m to j_r
            ld      [%o0 + %lo(j_m)], %j_r

            add     %i_r, %j_r, %o0             ! compute i_r + j_r, to be stored in k_m
            set     k_m, %o1                    ! load address of k_m to %o1
            st      %o0, [%o1]                  ! store result to k_m
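The sethi / %lo pair works because a 32-bit address is split into its upper 22 bits and its lower 10 bits. The following C sketch mimics that arithmetic only; the function names hi22, lo10, sethi, and effective_address are hypothetical, and the address used in the usage note is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* %hi(x): the upper 22 bits of a 32-bit value. */
uint32_t hi22(uint32_t x)
{
    return x >> 10;
}

/* %lo(x): the lower 10 bits of a 32-bit value. */
uint32_t lo10(uint32_t x)
{
    return x & 0x3FFu;
}

/* sethi places its 22-bit operand in the top of the register
   and clears the low 10 bits. */
uint32_t sethi(uint32_t imm22)
{
    return imm22 << 10;
}

/* The address used by  ld [%o0 + %lo(x)], ...  after
   sethi %hi(x), %o0: the two halves added back together. */
uint32_t effective_address(uint32_t x)
{
    return sethi(hi22(x)) + lo10(x);
}
```

For a made-up address 0x00020ABC, hi22 gives 0x82, lo10 gives 0x2BC, and effective_address returns 0x00020ABC again.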

The data Section

We may also initialize bytes and halfwords:

If we simply need space and are not concerned with its initialization, we may use the .skip pseudo-op, which only advances the location counter by a specified number of bytes, thus providing space.

For example:

will provide space for a 100-word uninitialized array, ary.

The data Section

External data must be aligned in memory.

For external data, the assembler provides the correct alignment by changing the contents of the location counter.

The .align pseudo-op provides for this:

The data Section

The .align n pseudo-op ensures that the location counter, the address to which the next data will be assigned, is evenly divisible by n.

If the value of the location counter is not evenly divisible by n (that is, the division leaves a remainder), the .align pseudo-op increases the location counter until it is evenly divisible by n.

If we are not sure that the alignment is correct, we need to use an .align 4 before any word data, an .align 2 before any halfword data, and an .align 8 before any doubleword data.
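The effect of .align on the location counter is simple rounding-up arithmetic. A minimal C sketch, assuming the location counter is modeled as a plain unsigned integer; the function name align_up is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a location counter to the next multiple of n.
   If the counter is already a multiple of n, it is left unchanged. */
uint32_t align_up(uint32_t counter, uint32_t n)
{
    assert(n > 0);                       /* alignment must be positive */
    uint32_t remainder = counter % n;
    if (remainder == 0) {
        return counter;                  /* already aligned */
    }
    return counter + (n - remainder);    /* skip the padding bytes */
}
```

For example, align_up(13, 4) is 16 (three bytes of padding before a word), while align_up(16, 4) stays at 16.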

The data Section

If the variables are true externals (variables whose names are to be made available to other, independently assembled program sections), the variable names must be declared global using the .global pseudo-op.

ASCII Data

We frequently make use of ASCII codes in programs.

The assembler takes a character enclosed in double quotes ("") to mean that we want the ASCII code for that character.

Ex: to load the string "hello" into five consecutive bytes of memory, we could write it as

The definition of strings is handled more directly by two other pseudo-ops, .ascii and .asciz (preferred).

These two pseudo-ops take a string enclosed in quotes and assemble the ASCII code for each character into successive bytes of memory.

ASCII Data

There are two ways of indicating the end of a string: by marking it with a zero byte, \0 (used by C), or by giving the length of the string in bytes.

Thus in C, our string "hello" should have an additional byte:

This can be achieved using the .asciz pseudo-op (preferred):

The .asciz pseudo-op appends a zero byte to the end of its string argument.
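The difference between the two pseudo-ops has a direct counterpart in C. In this sketch, the array hello_ascii plays the role of .ascii "hello" and hello_asciz plays the role of .asciz "hello"; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Like .ascii "hello": exactly five bytes, no terminator. */
const char hello_ascii[5] = { 'h', 'e', 'l', 'l', 'o' };

/* Like .asciz "hello": five characters plus a trailing zero byte,
   which is how C marks the end of a string. */
const char hello_asciz[] = "hello";

/* Number of bytes each form occupies in memory. */
size_t ascii_size(void) { return sizeof hello_ascii; }
size_t asciz_size(void) { return sizeof hello_asciz; }
```

Here ascii_size() is 5 and asciz_size() is 6; the extra byte is the zero terminator at hello_asciz[5].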

ASCII Data

Strings are frequently read-only, as in format strings. Consider the classic C program:

This translates into assembly language as:

            .file   "hello.c"
            .section ".data"
            .align  8
    .LLC0:  .asciz  "Hello world \n"
            .section ".text"
            .align  4
            .global main
            .type   main, #function
            .proc   04
    main:
            save    %sp, -112, %sp
            sethi   %hi(.LLC0), %g1
            or      %g1, %lo(.LLC0), %o0
            call    printf, 0
            nop
            mov     %g1, %i0
            ret
            restore

The bss Section

BSS: Block Starting Symbol

Read/write memory region where zero-initialized variables go.

Memory is allocated by skipping bytes.

Therefore, in the bss section we may only define labels such as:

These variables will be initialized to zero immediately before the program is executed.

Data initialized to anything other than zero may not be placed in the bss section.
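The same rule is visible from C: an object with static storage and no initializer is guaranteed to start at zero, and compilers typically place it in bss, while a non-zero initializer typically lands the object in the data section. A minimal sketch with hypothetical names (ary, seed, and the two helper functions); where each object is actually placed is up to the compiler and is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* No initializer: guaranteed to be all zeros before main runs.
   Typically placed in the bss section (space only, no stored values). */
static int ary[100];

/* Non-zero initializer: the value 42 must be stored in the program
   file, so this typically goes in the data section. */
static int seed = 42;

/* Returns 1 if every element of ary is zero, 0 otherwise. */
int ary_is_all_zero(void)
{
    for (size_t i = 0; i < sizeof ary / sizeof ary[0]; i++) {
        if (ary[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Returns the initialized value. */
int seed_value(void)
{
    return seed;
}
```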

Relocation and Linking with Other Code

Why do we need linking and relocation? Because we want separate compilation.

Separate compilation causes two problems.

The Linking Problem: Programs in one file need to access subroutines and data from another file. The assembler does not know the addresses of some symbols at assembly time, so the locations of symbols defined in other files need to be discovered.

The Relocation Problem: The separate segments created by assembling each file need to be combined. Ex: code written in separate files needs to be combined into one contiguous text segment.

Relocation and Linking with Other Code

Linking is the process of finding addresses for all the symbols used by your program.

Relocation is modifying addresses that need to change because many files are being combined into one.

Relocation and Linking with Other Code

The assembler always assembles each file as if it started at memory location zero.

When files are combined into one program, they are placed one after the other.

So when two files are combined into one program, they cannot both start at zero; at least one has to be changed, since it will start after the other.
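The address adjustment this implies is a single addition per symbol. The following C sketch models it under simplifying assumptions: each file contributes one block of text, and a symbol is just an offset from the start of its own file. The function names next_base and relocate are hypothetical, and the sizes and addresses in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Files are placed one after the other, so the next file starts
   where the previous one ends. */
uint32_t next_base(uint32_t base, uint32_t size)
{
    return base + size;
}

/* The assembler computed 'offset' as if the file started at zero.
   After combining, the final address is the base at which the file
   was actually placed plus that offset. */
uint32_t relocate(uint32_t file_base, uint32_t offset)
{
    return file_base + offset;
}
```

If a first file of 0x40 bytes is placed at 0x2000, the second file starts at next_base(0x2000, 0x40) = 0x2040, and a symbol at offset 0x10 in the second file ends up at relocate(0x2040, 0x10) = 0x2050.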

Relocation and Linking with Other Code

Ex: local symbol

foo's address can be calculated by the assembler, since foo is in the same file as main.

The address used in the "call foo" operation does not need to be changed, since subroutine calls and branches are PC-relative.

Ex: external symbol

The assembler cannot find printf anywhere in the source file.

So printf is added to the Unresolved References table, which is kept at the end of each object file.

Relocation and Linking with Other Code

Ex: global symbol

Any symbol that is declared .global will be added to the Symbol Table at the end of each object file.
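Linking then amounts to looking up each unresolved reference in the symbol tables of the other object files. A minimal C sketch of that lookup, assuming a single combined table; the struct layout, the UNRESOLVED marker, the function address_of, and the addresses are all hypothetical and do not reflect any real object-file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UNRESOLVED 0xFFFFFFFFu   /* hypothetical marker for "symbol not found" */

/* One Symbol Table entry: a global name and its address. */
struct symbol {
    const char *name;
    uint32_t    address;
};

/* Hypothetical combined table built from the .global symbols of all
   object files being linked. The addresses are made up. */
static const struct symbol symbol_table[] = {
    { "main",   0x2000u },
    { "printf", 0x3000u },
};

/* Resolve one entry of an Unresolved References table:
   return the symbol's address, or UNRESOLVED if no file defines it. */
uint32_t address_of(const char *name)
{
    size_t count = sizeof symbol_table / sizeof symbol_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(symbol_table[i].name, name) == 0) {
            return symbol_table[i].address;
        }
    }
    return UNRESOLVED;
}
```

With this table, address_of("printf") finds the made-up address 0x3000, while a name that no file declares global, such as "foo", comes back as UNRESOLVED and would be reported as a link error.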

Reference

http://packetstormsecurity.org/mag/b4b0/b4b0-03/sparc_asm_tutorial/