preprocessing, compiling, assembling, and linking introduction


Post on 01-Jul-2022

Page 1: Preprocessing, Compiling, Assembling, and Linking Introduction

- 1 of 22 -

Preprocessing, Compiling, Assembling, and Linking

Introduction
In this lesson we will examine
• Architecture of a C program
• The C preprocessor and preprocessor directives
• How to use the preprocessor's directives to manage a program

We have examined the process of creating a program
We have developed a program written in C
  Called source code
Next step
  Translate it into something the computer can use
  Called object code

Things to think about along the way
How to accommodate different
• Versions
  Adapting a program to a language or region is called localization
• Features
• Targets - Machines
• Operating Systems

The First Step
Model the process
Examine it at three levels
  Each with increasing detail

Start with the top level
  Begin with source file
  End with object or machine code
    Also called object file or machine code file
    Machine code is unique to a specific computer or microprocessor
Transformation from source to object
  Called compilation or compiling

[Figures: the translation process drawn at the Top Level, Level 1, and Level 2]

The Pieces

Let's begin with the preprocessor

Preprocessor
Simple and handy tool
Its job is to process C source code
  Before the compiler reads the source program and translates it into machine code
Question might be
  Why do we have to do this

Overview
Many useful features and capabilities of C
  Not implemented by the compiler
  Rather
    Selected by the user
    Brought in on demand by the preprocessor
When program written
  User includes various directives to the preprocessor
Preprocessor
  Reads source file
  Interprets directives
  Carries out the operations specified by the directives

Example directives tell the preprocessor
• Which library files to include
• Which user written files to include
• Which portions of the program to include or exclude
  We may want slightly different versions of program
    For different applications
  May want to conditionally include debug code
• Which constant identifiers to define
  Called symbolic constants
  Make reading and managing program easier

Structure
From above discussion we see the preprocessor has
  Input
    C source code
    Containing embedded preprocessor directives
  Output
    Preprocessed C source file
    Input to compiler

Implementation
Separate Program
  Reads original C source file
  Looks for lines beginning with # symbol
  Evaluates each such line
  Writes out C source to compiler
    Based upon directives included in line
Single Program
  Performs
    Preprocessing
    Compilation
  In single pass

Preprocessor Language
Preprocessor language specified as set of directives
Directives typically begin in column 1 of source file
  Caution: whether # may appear elsewhere on the line depends upon version of preprocessor
As it goes through source file line by line
  Preprocessor looks for lines beginning with the special character #
Syntax
  Completely independent of the C language
Number of directives
  Approximately 12 – 15
  Shown in following table

Preprocessor in Action
Examines each line in program source
Those that do not begin with #
  Viewed as source text
  Not treated as commands and sent to output
  Macro names appearing in them are still replaced
Those that begin with #
  Expanded or transformed
    As directed by the command

Assuming process runs correctly
  Must result in a C program
Preprocessor does not correct
  User design errors
  Syntactic errors
  Grammatical errors

Directive        Definition

#define          Define a preprocessor macro or symbolic constant
#undef           Undefine or remove a preprocessor macro
#include         Insert contents of another source file
#if              Conditionally include the source lines that follow if a constant expression is nonzero
#ifdef           Conditionally include the source lines that follow if macro name is defined
#ifndef          Conditionally include the source lines that follow if macro name is not defined
#elif            Conditionally include the source lines that follow if its constant expression is nonzero and previous #if, #ifdef, #ifndef, or #elif failed
#else            Alternative action if preceding #if, #ifdef, #ifndef, or #elif directive fails
#endif           Closes #if, #ifdef, #ifndef, or #elif construct
#line            Set the line number (and optionally the file name) used in compiler messages
defined name
defined(name)    Operator, used in #if and #elif expressions, that returns 1 if name is defined as preprocessor macro and 0 otherwise
# operator       Operator that replaces a macro parameter with a string constant containing the text of the actual argument
## operator      Operator that creates a single token from two adjacent tokens
#pragma          Specify implementation defined information to the compiler
#error           Produce a compile time error with associated message
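The # and ## operators in the table are easiest to see in use. A minimal sketch follows; the macro names STRINGIZE and PASTE and the two helper functions are hypothetical, chosen only for illustration.

```c
#include <string.h>

/* Hypothetical helper macros, named here only for illustration. */
#define STRINGIZE(x) #x      /* # turns the text of the argument into a string constant */
#define PASTE(a, b)  a##b    /* ## joins two adjacent tokens into a single token */

/* PASTE(count, 1) becomes the single identifier count1. */
int paste_demo(void)
{
    int PASTE(count, 1) = 41;
    return count1 + 1;
}

/* STRINGIZE(a + b) becomes the string constant "a + b". */
int stringize_demo(void)
{
    return strcmp(STRINGIZE(a + b), "a + b") == 0;
}
```

Both operators work on the text of the argument; neither evaluates it.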


Lexical Conventions
Line beginning with #
  Preprocessor command
  Name of command must follow #
  ISO C (International Organization for Standardization)
    White space can precede or follow # on the same source line
  Older versions do not permit this
Line with only #
  ISO C
    Null directive
    Treated as blank line
  Older versions
    May be different
Remainder of the line following the command
  May contain command args
  Args may be subject to macro replacement
If no args required
  Remainder of line should be empty
  White space and comments allowed
  Old compilers will often ignore extra text
Preprocessor lines are recognized
  Before macro expansion
    Will talk about macros shortly
  If macro expands into something that looks like preprocessor directive
    Directive not recognized

Example
    #define STRLIB #include<string.h>
    STRLIB
1. #define processed
   STRLIB is defined to mean #include<string.h>
2. STRLIB substitution executed
   Based upon the definition in previous line
3. Token sequence #include<string.h> passed to compiler as code
   Not treated as a directive, so the compiler reports an error


The preprocessor recognizes the line continuation character
  Commands can extend to multiple lines with the \ character
Example
    #define DOLLAR $
    #define BACKSLASH \
    #define MODULUS %
  Results in 2 lines not 3 as might be expected
    Line 2 continued on to line 3, these interpreted as single line
    BACKSLASH is defined as the text #define MODULUS %
    MODULUS is never defined
Example
    #define SWAP(a, b) { \
        a ^= b; \
        b ^= a; \
        a ^= b; \
    }
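A sketch of a multi-line macro in use. Two things here are not in the notes and are assumptions: the body is wrapped in do { } while (0) so the macro behaves as a single statement, and a temporary variable replaces the XOR trick, which sets both operands to zero when a and b name the same object. The names SWAP_INT and tmp_ are hypothetical.

```c
/* Multi-line macro: every line except the last ends with a backslash. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

int swap_demo(void)
{
    int x = 3, y = 7;
    if (x < y)
        SWAP_INT(x, y);     /* acts as one statement, so the else below still pairs with the if */
    else
        x = 0;
    return x * 10 + y;      /* x is 7 and y is 3 after the swap */
}
```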

Preprocessor Directives
File Inclusion

Directive
  #include
Simplest preprocessor directive
Has two forms

syntax
    #include <fileName>
    #include "fileName"

Either form
  Replaces the current line with
    Entire contents of the named file
If complete path not given
  Search determined by form used
< >
  Search in certain standard places
    System type places
    Determined by implementation
    Defined by search rules
    Specific location set at time of compiler installation
" "
  First search some local places
    Current directory
  Second
    Certain standard places
General intent
  < >
    Standard implementation files
  " "
    Programmer written files
Included file
  May contain #include commands
  Number of nesting levels
    Implementation dependent
    ANSI C requires support for 8 minimum
  Error if included file cannot be found

Third form of #include recognized
  The tokens undergo normal macro expansion
  Result must match one of the first two forms
Example
    #define COMMS "G:/mySystem/include/comms.h"
    #include COMMS
  Causes the preprocessor to look in the directory and for the file specified
    G:/mySystem/include/comms.h

Note: the forward slash / or back slash \ used to separate directories along a directory path depends upon operating system. Typically UNIX or LINUX derivatives use the forward slash and Windows derivatives use the back slash.

syntax
    #include preprocessor tokens
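A small sketch of the third form. The macro name IO_HEADER and the function format_port are hypothetical; a standard header is used so the sketch compiles without any project files. How the tokens between < and > are recombined into a header name is implementation defined, though common compilers accept this.

```c
/* Hypothetical macro naming the header to include. */
#define IO_HEADER <stdio.h>
#include IO_HEADER          /* macro expanded first, result is #include <stdio.h> */

/* Uses snprintf from the header pulled in above. */
int format_port(char *buf, size_t size, int port)
{
    return snprintf(buf, size, "port=%d", port);
}
```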


Macro Substitution

Directives
  #define
  #undef

syntax
    i.   #define name text
    ii.  #define name(arg1, arg2, ..., argn) text
    iii. #undef name

#define
i. The first form of #define directive
  Causes name
    To be defined as a macro to the preprocessor
  Instructs the preprocessor
    To replace all (unquoted) occurrences of name with text
name
  Must be an identifier as defined by the C language
    Upper and lower case letters, digits, and underscore, not starting with a digit
text
  Called the body of the macro
Process called macro substitution

Simple macros
Common use
  Symbolic constants
Example
    #define MAXSIZE 2048
    #define PI 3.14
  TWOPI may be written in any one of these ways
    #define TWOPI 6.28
    #define TWOPI (3.14*2.0)
    #define TWOPI (PI + PI)
  Use
    int myArray[MAXSIZE];
    circumference = TWOPI * radius;
    area = PI * pow(r, 2);


Macros with Parameters
Second form of #define directive
  Declares a formal parameter list
Parameter list
  Immediately follows macro name
    No intervening whitespace
    If whitespace
      Definition assumed to be macro with no args
  Enclosed in ()
  Separated by commas
Args in the parameter list
  Must be identifiers
  No two the same
  Need not be used in macro body
Parameter list may be empty

Using a Parameterized Macro
Macro invoked by writing
  Name
  Left parenthesis
  1 actual arg for each formal parameter
    Separated by commas
  Right parenthesis
If no formal parameters
  Must include empty arg list
Whitespace may appear
  Between
    Name
    Left parenthesis
Actual arg
  May contain
    Properly balanced parentheses
    Commas
      If within set of parentheses
  Braces and subscripting brackets
    Cannot contain commas
    Do not have to balance


Example
    #define sum(x,y) ((x) + (y))
    x = sum(2*a, b) / sum(c,d);
    x = sum(2 * g(a,b), h(a,b)) / sum(c,d);
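The parentheses in the body of sum matter because the preprocessor substitutes text, not values. A sketch comparing it with an unparenthesized variant; sum_unsafe and the two functions are hypothetical names added for the comparison.

```c
#define sum(x, y)        ((x) + (y))   /* as in the notes */
#define sum_unsafe(x, y) x + y         /* hypothetical: same body without parentheses */

/* Expands to ((6) + (2)) / ((1) + (1)), which is 8 / 2. */
int ratio_safe(void)
{
    return sum(6, 2) / sum(1, 1);
}

/* Expands to 6 + 2 / 1 + 1, which groups as 6 + (2 / 1) + 1. */
int ratio_unsafe(void)
{
    return sum_unsafe(6, 2) / sum_unsafe(1, 1);
}
```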

Example
    #define getModem() getc(modemIn)
    while ((c = getModem()) != EOF)

Example
Can define a macro that takes arbitrary statement as its argument
    #define assign(anyStatement) anyStatement
    assign({a = 1; b = 2;})
    assign(c = 0; d = 1; e = 2;)

Example
    #define max(a,b) ((a) > (b) ? (a) : (b))
    max(3, 4);
    max(6, 5);

Potential problems
Consider
    max(i++, j++);
Appears to be simple use of max()
Observe
  (a) > (b) replaced by (i++) > (j++)
  (a) : (b) replaced by (i++) : (j++)
The argument selected as the result is evaluated a second time
  The larger variable is incremented twice, the other once
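A minimal sketch of the double evaluation, next to a function that evaluates each argument once. The function max_int and the two wrappers are hypothetical and not part of the notes.

```c
#define max(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical function alternative: each argument is evaluated exactly once. */
int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Macro version: the increment of the larger operand runs in the test and again in the result. */
int macro_case(int *i, int *j)
{
    return max((*i)++, (*j)++);
}

/* Function version: both increments run once, before the call. */
int function_case(int *i, int *j)
{
    return max_int((*i)++, (*j)++);
}
```

Starting from i = 5 and j = 3, the macro version returns 6 and leaves i at 7, while the function version returns 5 and leaves i at 6.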

#undef
The #undef directive
  Companion to #define
Used to make name
  No longer defined
  Causes preprocessor to forget
    Macro definition of name
Once name is undefined
  Can be given new definition
    Using #define
Not an error
  To undefine a name that is not defined
Macro expansion
  Not performed within #undef directive
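A short sketch of #undef followed by a new definition. The names CHUNK_SIZE and NEVER_DEFINED are hypothetical.

```c
#define CHUNK_SIZE 64
int small_chunk(void) { return CHUNK_SIZE; }   /* CHUNK_SIZE is 64 at this point */

#undef CHUNK_SIZE         /* forget the old definition */
#undef NEVER_DEFINED      /* not an error: the name was never defined */

#define CHUNK_SIZE 256    /* new definition */
int large_chunk(void) { return CHUNK_SIZE; }   /* CHUNK_SIZE is 256 at this point */
```

Each function keeps the value that was in effect where its body was written, because substitution happens as the preprocessor reads the file from top to bottom.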

Conditional Compilation

Directives
  #if
  #else, #elif
  #endif

syntax
    #if constant-expression
    #elif constant-expression
    #else
    #endif

Conditional Compilation directives
  Based upon computed condition
  Allow lines of source code to be
    Passed through
    Eliminated
  Used to control the way the source code is
    Assembled
    Compiled

Semantics
As expected

#if constant-expression
  constant-expression
    Must evaluate to constant arithmetic value
    May include macro substitution
  If constant-expression is nonzero
    Subsequent C code lines
      Intended to be included in program
    All C source lines
      Sent to preprocessor output
      Until #else, #elif, or #endif encountered

#else, #elif constant-expression
  #else
    Like familiar if - else
    If previous conditions fail
      Lines following #else are included
  #elif constant-expression
    Equivalent to else if
    Like #if, constant-expression evaluated
    Consequences are the same as #if

#endif
  Closes the #if sequence

Example
Let one of the following macros identify the host system
  LINUX
  OSX
  UNIX
  WIN7
Want different header file included depending upon system
  Each defines system specific information
    #if LINUX
    #define HDR "linuxHeader.h"
    #elif OSX
    #define HDR "osxHeader.h"
    #elif UNIX
    #define HDR "unixHeader.h"
    #else
    #define HDR "win7Header.h"
    #endif
    #include HDR
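The same selection pattern, sketched so that it compiles without the system headers. Everything named here (TARGET_OS, the OS_* codes, OS_NAME, os_name) is hypothetical; the default of 1 stands in for a value a build would normally supply. The #else branch uses #error so that an unknown target stops the build instead of silently picking one.

```c
/* Hypothetical build setting; a real build would define TARGET_OS itself. */
#ifndef TARGET_OS
#define TARGET_OS 1
#endif

#define OS_LINUX 1
#define OS_OSX   2
#define OS_WIN   3

#if TARGET_OS == OS_LINUX
#define OS_NAME "linux"
#elif TARGET_OS == OS_OSX
#define OS_NAME "osx"
#elif TARGET_OS == OS_WIN
#define OS_NAME "windows"
#else
#error "TARGET_OS must be OS_LINUX, OS_OSX, or OS_WIN"
#endif

/* Only one of the #define OS_NAME lines survives preprocessing. */
const char *os_name(void)
{
    return OS_NAME;
}
```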

Conditional Directives

Directives
  #ifdef
  #ifndef

Conditional directives
  Test if an identifier is
    Defined
    Not defined
#ifdef
  Equivalent to
    #if 1 if the identifier is defined
    #if 0 if the identifier is not defined
#ifndef
  Equivalent to
    #if 0 if the identifier is defined
    #if 1 if the identifier is not defined or has been undefined

syntax
    #ifdef name
    #ifndef name
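A sketch of two common uses: #ifndef to supply a default, and #ifdef to pick between alternatives. RETRY_LIMIT, VERBOSE, and LOG_LEVEL are hypothetical names.

```c
/* Default used only when nothing has defined RETRY_LIMIT earlier. */
#ifndef RETRY_LIMIT
#define RETRY_LIMIT 3
#endif

/* VERBOSE is not defined in this file, so the #else branch is the one kept. */
#ifdef VERBOSE
#define LOG_LEVEL 2
#else
#define LOG_LEVEL 0
#endif

int retry_limit(void) { return RETRY_LIMIT; }
int log_level(void)   { return LOG_LEVEL; }
```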


Example
Want different debug code included depending upon system
  Conditionally include debug code
  Don't want to
    Include in the final version
    Take out
      For future upgrades
  Each defines system specific information
#ifdef tests only whether a name is defined, not its value
  A name defined as 0 still counts as defined
  So define only the name for the current system
    #define OSX

    #ifdef LINUX
        Linux debug code
    #endif
    #ifdef WIN7
        Win 7 debug code
    #endif
    #ifdef UNIX
        Unix debug code
    #endif
    #ifdef OSX
        OSX debug code
    #endif

Example
Program
  Multiple files
Several files share
  Common .h file
May want to
  Debug separately
  Use for multiple targets
  Use for different programs
In final build
  Will have multiple definitions for variable if .h file included multiple times
  Conditional directives inside the .h file can prevent this
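One common remedy is an include guard built from #ifndef and #define. The sketch below imitates, in a single file, what the compiler sees when a header is included twice; the header name shared.h, the guard name SHARED_H, and the type Settings are hypothetical.

```c
/* ---- first inclusion of the hypothetical header shared.h ---- */
#ifndef SHARED_H
#define SHARED_H
typedef struct { int baud; } Settings;
enum { DEFAULT_BAUD = 9600 };
#endif

/* ---- second inclusion: SHARED_H is already defined, so the body is skipped ---- */
#ifndef SHARED_H
#define SHARED_H
typedef struct { int baud; } Settings;
enum { DEFAULT_BAUD = 9600 };
#endif

Settings default_settings(void)
{
    Settings s = { DEFAULT_BAUD };
    return s;
}
```

Without the guard, the second copy of the enum and the typedef would be reported as redefinitions.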


Example
May have added debug code to source
  For use during development
Want to remove for release
  Bad style to individually comment out
    Each line of debug code
Preprocessor can help

Example
preproc0.c
    #include <string.h>
    #include <stdio.h>
    #define DEBUG   // commenting out this line will
                    // prevent debug code from inclusion in final build
    int main()
    {
        char* myString = "Hello";
    #ifdef DEBUG
        // strlen returns size_t, so the conversion is %zu rather than %d
        printf("The string length is %zu\n", strlen(myString));
    #endif
        return 0;
    }
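When there are many debug lines, wrapping each in #ifdef becomes noisy. A sketch of one alternative: a debug macro that expands to nothing in release builds. DEBUG_PRINT and checked_length are hypothetical names, and the variadic macro form assumes a C99 or later compiler.

```c
#include <stdio.h>
#include <string.h>

/* #define DEBUG */    /* define DEBUG to enable the debug output */

#ifdef DEBUG
#define DEBUG_PRINT(...) fprintf(stderr, __VA_ARGS__)
#else
#define DEBUG_PRINT(...) ((void)0)   /* arguments are discarded and never evaluated */
#endif

size_t checked_length(const char *s)
{
    size_t n = strlen(s);
    DEBUG_PRINT("The string length is %zu\n", n);
    return n;
}
```

The call sites stay in the source unchanged; only the single #define decides whether they produce code.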

Miscellaneous Directives

Directives
  #line
  #error
  #pragma

#line
If program built from
  Multiple other files
Sometimes useful to annotate
  With line numbers from original file
  Instead of normal sequential numbering
Info provided by #line directive
  Used to set the values of
    __LINE__
    __FILE__
__LINE__
  Line number of current source program line
  Decimal integer constant
__FILE__
  Name of current source file
  String constant

Example
preproc1.c
    #include <stdio.h>
    #include <string.h>
    int main()
    {
        char* myString = "Hello";
    #line 123 "myFile"
        printf("This line is %d from %s\n", __LINE__, __FILE__);
        printf("The string length is %zu\n", strlen(myString));
        return 0;
    }

syntax
    #line line-number "fileName"
    #line line-number
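A compact sketch of the effect of #line: the source line immediately after the directive takes the given number. The file name generated.c and the two function names are hypothetical.

```c
#line 200 "generated.c"
int line_here(void) { return __LINE__; }           /* this source line is now numbered 200 */
const char *file_here(void) { return __FILE__; }   /* __FILE__ is now "generated.c" */
```

Compiler diagnostics for the lines that follow are reported against the new name and numbering as well, which is why tools that generate C code use this directive.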


#error

syntax
    #error error-message

Used to write
  Compile time error message
error-message
  Macro expansion of the message is not required by the standard
    GCC, for example, prints it unexpanded
Typically used in conditionals
  Warn of inconsistencies
  Constraint violations
  Incomplete information

Example
preproc2.c
    #include <stdio.h>
    #include <string.h>
    #define SYSTEM Linux
    #ifndef SYSTEM
    #error "You must specify the system type"
    #endif
    int main()
    {
        char* myString = "Hello";
        printf("The string length is %zu\n", strlen(myString));
        return 0;
    }

#pragma

syntax
    #pragma tokens

Used to
  Add new preprocessor or compiler functionality
  Provide implementation defined information to the compiler
No restrictions on tokens
  Compilers should ignore what they do not understand
args to directive
  May be subject to macro expansion, depending upon implementation
No agreement on standard pragmas

Example
    #pragma pagesize(number of lines)
  MS Visual C++
  Sets the number of lines desired per page of source listing

    #pragma pages(<pages>)
  Generate <pages> (formfeeds) in source listing
  Default value is 1

    #pragma inline
  Compile with fast calling convention

Typedef Names
C provides facility for creating new data type names
  Called typedef

syntax
    typedef typeName identifier;

Typedef creates alias or synonym for existing type
After declaration
  identifier becomes synonym for typeName
Caution
  Typedef does not create a new type
    It is merely a synonym or alias for an existing type
  Cannot redefine the built-in meaning of a type
    typedef int double;
    is illegal
Example
    typedef int* INTPTR;
  INTPTR is not itself a pointer to an int
    It is a type name that may be used wherever int* can be used
    INTPTR myPtr;
  myPtr is now a pointer to an integer

typedef
  Very useful in simplifying complicated declarations
  Thus
    Helps to simplify program
    Makes intent more obvious
Use carefully
  Rather than clarify
    Overuse can serve to confuse
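A sketch of typedef simplifying a complicated declaration, here a function that returns a pointer to a function. The names BinaryOp, add, mul, and choose are hypothetical.

```c
/* Hypothetical alias: pointer to a function taking two ints and returning int. */
typedef int (*BinaryOp)(int, int);

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* Without the typedef this declaration would read:
   int (*choose(char symbol))(int, int)               */
BinaryOp choose(char symbol)
{
    return symbol == '*' ? mul : add;
}
```

BinaryOp is only another name for the pointer type; a BinaryOp value and an int (*)(int, int) value are interchangeable.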

The Compiler
Compiler is a tool for translating programs
  Into variety of forms
  One such form
    Assembly language, the symbolic form of the instruction set for the machine
As we saw in level 2 diagram above
  Top level program
    Can be made up of number of modules
  Module can be
    C source file
    Standard library file
      Defined as part of the language such as
        Math library
        String library
        Library that manages all input and output
    Custom library

Under the Hood
As program compiled
  Compiler has a lot of record keeping to do

Translation Unit
As compilation process proceeds
  Each .c or source file compiled individually
  Called translation unit

Symbol Table
As each source file compiled
  Table of identifiers or symbols within program created
  Called symbol table
How compiler keeps track of
  All identifiers used
  Where in memory variables placed

Allocate Memory – Yes or No
Each symbol name entered into symbol table
  Declaration – brings name into name space
    No memory allocated
  Definition – brings name into name space
    Sufficient memory allocated to hold variable
If definition appears in different translation unit
  Identify as extern
Want only single definition (memory allocation)
  For each variable or function body in system
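A sketch of the difference between declaration and definition, shown in one file for brevity. In a multi-file program the extern declaration would normally sit in a shared .h file and the definition in exactly one .c file. The names counter and next_id are hypothetical.

```c
extern int counter;     /* declaration: name enters the symbol table, no memory allocated */
int next_id(void);      /* declaration of a function (prototype), no body yet */

int counter = 100;      /* definition: memory allocated, must appear only once in the program */

int next_id(void)       /* definition of the function body */
{
    return ++counter;
}
```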

Prior to this stage
  Program did not depend upon machine
Now program in form that will execute only on particular machine

The Assembler
Assembler is tool we use for converting
  Assembly language into machine language
  Program expressed as collection of 0's and 1's machine understands

The Linker
Although program now in machine language
  Not ready to be executed
Problem
  All variables and data structures we use
    Must reside in computer memory
    Each needs an address in memory
Question
  Which address should we use
Unfortunately
  Cannot always use the same address
  What if someone else wants to use same address
To solve problem assembler generates
  Relocatable code
    Code that can be placed anywhere in memory
Second question arises at this time
  We'd like to be able to use existing code
    Our own
    Other people's
  How do we get this into our program without typing in each time
Tool called linker/loader can help with both problems
Does two jobs
  1. Links collection of program modules together
  2. Resolves address problems

Summary
In this lesson we examined
• Architecture of a C program
• The C preprocessor and preprocessor directives
• How to use the preprocessor's directives to manage a program
Should now
  Be comfortable working with basic C preprocessor directives
  Know when and how to use them
  Be aware of the tools compiler, assembler, and linker and the role they play in building a C program