Introduction to Semantics: The meaning of a language

Upload: eustacia-tucker

Post on 25-Dec-2015



2

Overview

This section will cover:
- What is the semantics of a computer language?
- Three forms of formal semantics and their purpose
- Classification of errors

3

What is semantics?

Semantics describes the meaning of the elements that make up a programming language.

Syntax describes the form of program elements.

Syntax:

ident ::= [A-Za-z_]\w*

declaration ::= datatype ident ['=' value]

Semantics:

- You can only declare an identifier once per scope!

- You must declare every identifier before you use it in a statement.

- An identifier cannot be a reserved word.

4

What is semantics? (cont'd)

Semantic definition of syntax raises more semantic questions...

What is a scope?

What is a data type?

What operations are permitted on data types?

Syntax:

term ::= factor { (*|/) factor }

Semantics (for integer arithmetic):

Division by zero raises an exception

If the result exceeds the range of 32-bit 2's complement integers, the higher bits are ignored!

5

The parts of semantics

Semantics defines many characteristics of a language; semantics is intertwined with:
- names and scope of names
- binding time
- the type system
- protocol for subprograms

These issues will be covered later.

6

Semantics: Type System

Semantics defines the type system:
- type system: data types and allowed operations
- can the programmer define his own data types or aliases for existing data types?
- range of data types and result of operations

Example:

Syntax: intvalue ::= 0 | [-](1|...|9){ digit }

Semantics: int values are stored as 32-bit binary values in 2's complement form, with a range of -2^31 to 2^31 - 1.

Syntax: expr ::= term { (+|-) term }

Semantics: integer + and - are performed modulo the range of int values. If the result of a calculation exceeds the largest value, the higher-order bits of the result are discarded.

7

Semantics: Identifiers

Identifiers are names. Names can identify:
- variables and constants
- points in the program (labels)
- functions and procedures (subprograms)
- programmer-defined data types, including classes

Syntax: identifier ::= [A-Za-z_]\w*

Semantics:
- all variable and constant names must be unique!
- cannot use reserved words as identifiers
- names are case sensitive
- can a function (method) have the same name as a variable?

8

Semantics: Scope

Scope is a range of statements.

Scope of an identifier is the range where an identifier is visible or known.

Semantics defines the scope of identifiers.

9

Scope Examples

In Pascal, Fortran, and C: variables are defined at the top of a subprogram. Their scope is the entire subprogram.

      REAL FUNCTION ADD(A, B)
      REAL X
      X = A + B
      ADD = X
      RETURN
      END

      PROGRAM MAIN
      REAL X, Y, A
      X = 20.0
      Y = 5.0
      A = ADD(X, Y)
      PRINT *, "Sum is ", A
      END

10

Scope Examples

C++, C#, Java: variables can be defined anywhere; their scope is from the point of declaration to the end of the smallest enclosing block, { ... }. Idea based on Algol.

C:

    float sum(int max) {
        float x, sum = 0;
        int k;
        for (k = 1; k <= max; k++) {
            scanf("%f", &x);
            sum = sum + x;
        }
        return sum;
    }

C++:

    float sum(int max) {
        float sum = 0;
        for (int k = 1; k <= max; k++) {
            float x;
            scanf("%f", &x);
            sum = sum + x;
        }
        return sum;
    }

11

Semantics: Binding Time

Semantics also defines when names are "bound" to properties.

C Example: int n = 1000;

"int" bound by C language definition

"int is 32-bits" bound by compiler implementation

"n is an int" bound when program is compiled

address of n is bound when program is loaded (static var) or when function is executed (stack local var)

"n = 1000" is bound when program is loaded (static) or each time function is executed

12

Semantics: Subprograms

Semantics defines the meaning of subprograms. In particular, it defines how parameters are passed and how values are returned.

    /* C and C++ default is
     * to pass parameters
     * by value
     */
    void swap(int a, int b) {
        int tmp;
        tmp = a;
        a = b;
        b = tmp;
    }

    /* C++ and C# let you
     * pass parameters
     * by reference
     */
    void swap(int& a, int& b) {
        int tmp;
        tmp = a;
        a = b;
        b = tmp;
    }

13

Formal Semantics

Three mathematical notations for semantics exist.

Operational semantics describes the effect of each semantic element on the state of a hypothetical computer.

Axiomatic semantics describes assertions (or axioms) of what must be true before and after an expression is executed.

Denotational semantics describes semantic elements as state changing functions, again using some hypothetical computer. May use recursive functions.

14

Formal Semantics Examples

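Consider assignment: target = source

Operational semantics: $\sigma$ is the state of the computer, $v$ is any value of the source, and $\overline{\cup}$ is overriding union:

$$\frac{\sigma(\mathit{source}) \Rightarrow v}{\sigma(\mathit{target} = \mathit{source}) \Rightarrow \sigma \;\overline{\cup}\; \{\langle \mathit{target}, v \rangle\}}$$

Axiomatic semantics: if $s$ is the statement target = source, then for a postcondition $Q$:

$$\frac{\mathit{true}}{\{Q[s.\mathit{target} \backslash s.\mathit{source}]\}\; s \;\{Q\}}$$

Denotational semantics: $M$ is a mapping from statements and program states to program states:

$$M : \mathit{Statement} \times \mathit{State} \rightarrow \mathit{State}$$

$$M(s, \sigma) = \sigma \;\overline{\cup}\; \{\langle s.\mathit{target},\, s.\mathit{source} \rangle\}$$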

15

Why Formal Semantics?

1. Avoid ambiguities in the implementation. Ambiguity can lead to different compilers producing different executable programs from the same source.

Ada had an ambiguity in implementation of "in out" parameters. In some programs, different compilers produced different results!

2. Enable formal proof of program correctness, at least in some situations.

3. Enable verification that a compiler adheres to language specification.

16

Static and Dynamic Characteristics

Aspects of a computer language can be defined as static or dynamic. You often hear "dynamic memory allocation" or "static binding".

Static - something that is done or known before the program executes, including things done while the program is being loaded for execution.

Dynamic - something that is done or known while the program executes.

17

Static/Dynamic Examples

Syntax checking for compiled languages is static

A division by zero error is dynamic (unless you insult the compiler by writing "x/0")

The definition of data types like "int", "float" is static.

Allocating memory for function calls is dynamic.

The scope of a variable can be static or dynamic, depending on the language... but usually static.

18

Classifying Errors

It is helpful to classify errors by type and when they are detected.

19

Classifying Errors

Lexical errors are detected by the compiler: static.

Syntax errors are detected by the compiler: static.

Semantic errors may be:
- detected by the compiler: int n = 2.5; (rejected in Java; C accepts it and truncates the value to 2)
- detected by the linker: r = SQRT(x*x+y*y);
- detected at run time: /* java */ for(k=-1; ;k++) sum += a[k];

Logic errors may be:

detected at run-time

not detected at all

20

Find 9 errors in this program

Classify each as: lexical, syntax, static semantic, dynamic semantic, or logical. Indicate when the error is detected.

    include <stdio.h>
    /* return maximum of x and y */
    int max( integer x, integer y ) {
        if (x > y) return y;
        else return x;
    }
    int main( ) {
        int a, b;
        printf("Input two integers: ");
        scanf("%f %f", a, b);
        printf("The max of %d and %d is %d\n", a, b, MAX(x,y);
        return;
    }

21

Find 9 errors in this program: solution

    include <stdio.h>
1. Syntax: missing "#"; detected by the compiler at the "<" symbol.

    int max( integer x, integer y ) {
2. Static semantic: "integer" isn't a datatype; the compiler detects this.

    if (x > y) return y; else return x;
3. Logic error, not detected: this returns the min of x and y.

    scanf("%f %f", a, b);
4. Dynamic semantic error: "%f" should be "%d"; may be a run-time error or not detected at all.
5. Semantic error: must use the addresses of a and b (&a, &b) in scanf. The compiler should detect this, but it may not (gcc did not), since an int can be an address! May be a run-time error.

22

Find 9 errors in this program: solution

    printf("The max of %d and %d is %d\n", a, b, MAX(x,y);
6. Static semantic error: "MAX" should be "max". The linker will report an "unresolved external symbol" error because it couldn't find a function named "MAX".
7. Static semantic error: (x,y) should be (a,b). The compiler will report use of undefined variables x, y.
8. Syntax error: missing ")" to close printf( ... ). The compiler reports this as a syntax error.

    return;
9. Static semantic error: declared "int main" but here there is no return value. Semantics says that the function's actual return type has to be the same as in the header. Detected by the compiler.

23

Find 7 errors in this program

Classify each as: lexical, syntax, static semantic, dynamic semantic, or logical. Indicate when the error is detected.

    #include <stdlib.h>
    /* return x modulo y, return 0 if y is 0. */
    int mod( int x, int y ) {
        if ( y = 0 ) return 0;
        else return x # y;
    }
    void main( ) {
        int a, b;
        printf("Input two integers: ");
        scanf("%d %d", a, b);
        printf("%d mod %d is %d\n", a, b, mod(b,a);
        return;
    }

24

Find 7 errors: partial solution

    #include <stdlib.h>
Static semantic error: we didn't #include <stdio.h>, so the compiler should give an error when scanf and printf are used. However, gcc ignores this.

    if ( y = 0 ) return 0;   // should be ( y == 0 )
Logic error: the C language allows any expression to be used as a test condition in "if". This sets y to 0, and the assignment expression itself has the value 0, so the "if" test is always false. The next statement will then produce a division by zero error. Java doesn't allow conversion of other datatypes to boolean, so this would be a compile-time type error in Java.

    void main( ) {
Static semantic error: the C language says that main should return an int. The compiler reports this error.

25

Attributes

Properties of language entities, especially identifiers. Examples:
- Value of an expression
- Data type of an identifier
- Number of digits in a numeric data type
- Memory location of a variable
- Code body of a function or method

Declarations ("definitions") bind attributes to identifiers. Different declarations may bind the same identifier to different sets of attributes.

26

Binding

Binding means "an association":
- associate names with values
- associate symbols with operations

Binding Time describes when this occurs.

Example: int count;
- the name "int" was bound by the C language definition (along with meanings of operators +, -, ... for int)
- the size (and set of possible values) of "int" was bound at compiler design time
- identifier "count" is bound to "int" at compile time
- the location is bound at load or execution time

27

Binding Times

Louden gives 6 possible binding times:
- language definition time: Java defines precision of int; C leaves it to the implementation. In C, an "int" can be 16 bits or 32 bits. The stdint.h header on UNIX provides typedefs, such as:
      typedef short int int16_t;
      typedef int int32_t;
- language implementation time: when the compiler or interpreter is written
- translation time (compile time)
- link time, for compiled programs
- load time
- execution time

28

Load Time versus Execution Time

How are count and sum different? C example:

    int count;      /* an external variable is static */
    int sub( ) {
        int sum;    /* a local variable, dynamically allocated */
        /* do something */
    }

count is allocated storage at load time (and exists for the life of the program)

sum is allocated storage at execution time, i.e. each time sub is executed

The scopes of count and sum are also different.

29

Static and Dynamic Binding

Static Binding - occurs before the program is run

Dynamic Binding - occurs while the program is running

a symbol can have both static and dynamic attributes

    /* Binding time example */               Type Binding   Storage Binding
    int count;           /* external var */  static         static
    int sub( ) {                             static         dynamic
        int sum = 0;                         static         dynamic
        static int last = 0;                 static         static
        int *x;                              static         dynamic
        void *p;                             dynamic        dynamic
        p = (double *)malloc(...);
    }

30

Exercise

For each of these attributes, indicate the binding time in C and Java as precisely as possible.

1. number of significant digits in a "float"

2. the meaning of "char"

3. the size of an array variable

4. the memory location of a local variable

5. the value of a constant (C "const int", Java "final")

6. the memory location of a function or method

Hint: C and Java differ at least in items 1 and 5

31

So now you know...

When someone asks, "are method names statically or dynamically bound to actual code?"

/* Java */

    class Pet {
        public void talk( ) { System.out.println("hello"); }
    }
    class Dog extends Pet {
        public void talk( ) { System.out.println("woof"); }
    }
    ...
    Pet p = new Dog( );
    p.talk( );

/* C++ */

    class Pet {
    public:
        void talk( ) { cout << "hello" << endl; }
    };
    class Dog : public Pet {
    public:
        void talk( ) { cout << "woof" << endl; }
    };
    ...
    Pet *p; Dog dog; p = &dog; p->talk( );

32

So now you know...

In C++, method names are statically bound to code, unless "virtual" is specified.

In Java, all methods are dynamically bound to actual code, except in these cases...

"private" methods are statically bound

"static" methods are statically bound

"final" methods are statically bound

33

Variables and Constants

A variable is a name for a memory location; its value can change during execution.

A constant is an object whose value does not change throughout its lifetime.

Literals are data values (no names) used in a program. Example: in int buffer[80]; the 80 is a numeric literal.

Constants may be:
- substituted for values by compiler (never allocated)
- compile-time static (compiler can set value)
- load-time static (value determined at load time)
- dynamic (value determined at run time)

34

Binding of Constants

C "const" can be compile time, load time, or run time constants:

    const int MaxSize = 80;            /* compile time */
    void mysub( const int n ) {
        const time_t now = time(0);    /* load time */
        const int LastN = n;           /* dynamic */
    }

In Java, "final" means a variable cannot be changed after the first assignment. Otherwise, it behaves the same as any other variable.

    static final int MAX = 1000;       /* class load time */
    void mysub ( int n ) {
        final int LastN = n;           /* runtime */
    }

35

Constants (2)

Compile-time constant in Java: static final int zero = 0;

Load-time constant in Java: static final Date now = new Date();

Dynamic constant in Java: any non-static final variable.

Java "final" identifiers are variables with a restriction (no reassignment).

C "const" is more strict: the compiler has the option to eliminate them during compilation.