IERG2080 Introduction to Systems Programming Chapter 2 - C Basic Data Types Professor Jack Y. B. Lee Department of Information Engineering ( Revision 1.7 14 Sep 2021)


Page 1: IERG2080 Introduction to Systems Programming Chapter 2 - C

IERG2080 Introduction to Systems Programming
Chapter 2 - C Basic Data Types

Professor Jack Y. B. Lee
Department of Information Engineering

(Revision 1.7 – 14 Sep 2021)

Page 2: IERG2080 Introduction to Systems Programming Chapter 2 - C

Copyright Jack Y. B. Lee. All Rights Reserved

IERG2080 Introduction to Systems Programming - C Basic Data Types 2

Acknowledgements

• Some materials in this set of slides are based on Chapter 5 of Jens Gustedt, Modern C, released under a Creative Commons license. Available at https://modernc.gforge.inria.fr/. Materials were reproduced here with the author's permission.
• Other images were sourced from the Internet.
• Note that the code examples presented here often omit error checking for brevity. Do not, repeat, do not omit error checking in real code.

Page 3: IERG2080 Introduction to Systems Programming Chapter 2 - C

Contents

• 1. Types, Values, and Representations
• 2. Properties of Types
• 3. Binary Representation
• 4. Alias for Data Types
• 5. Specifying Values
• 6. Variables and Initializers
• 7. Constants
• Summary

Page 4: IERG2080 Introduction to Systems Programming Chapter 2 - C

1. Types, Values, and Representations

• An Analogy:

Representation (what it is composed of):
- Bits: 01001100
- Mail analogy: atoms

Value / Instance (what it represents):
- Interpret the bits as a decimal integer: 01001100 (base 2) = 76 (base 10)
- Mail analogy: a piece of mail

Type (how it can be used):
- An 8-bit unsigned integer can represent integer values from 0 to 255.
- Can perform arithmetic with it, etc.
- Mail analogy: a mail has a letter inside an envelope; the envelope has an address, the letter has a message, etc.

Page 9: IERG2080 Introduction to Systems Programming Chapter 2 - C

1. Types, Values, and Representations

• Real-world examples:
  - Calculator is a type
    • Your calculator is an instance of type calculator
  - Smartphone is also a type
    • My smartphone is an instance of type smartphone
• A type defines what an object can do and how
  - A smartphone type instance can be used to make phone calls by dialing the number using a numeric keypad
• To actually use a type of object you need an instance
  - To make a phone call I need to bring out and use my smartphone

Page 10: IERG2080 Introduction to Systems Programming Chapter 2 - C

1. Types, Values, and Representations

• C programming examples:
  - int, float are two different types for numbers
• A type defines what an object can do and how
  - An int instance can store integer numbers (…, -2, -1, 0, 1, 2, …) within a range, e.g., from -2^31 to (2^31 - 1) for a 32-bit int
  - A float instance can store floating point numbers (3.1416, 2.54, -0.0096, etc.) within a range with certain precision
  - For both types one can perform arithmetic such as +, -, *, / on their instances

Page 11: IERG2080 Introduction to Systems Programming Chapter 2 - C

2. Properties of Types

• Define the possible values
  - Numeric data types defined by the C language:

[Figure: C data types]

Page 12: IERG2080 Introduction to Systems Programming Chapter 2 - C

2. Properties of Types

• Define the possible values
  - Integer data types defined by the C language:
    • signed (positive and negative) vs unsigned (non-negative)
      – By default it is signed (except plain char, whose signedness is platform-dependent); add the unsigned prefix for an unsigned type
      – E.g., int is a signed integer, unsigned int is an unsigned integer
    • size of type (how large the range of numbers it can represent)
      – char (e.g., range of 2^8)
      – short (e.g., range of 2^16)
      – int (e.g., range of 2^16 or 2^32, depending on platform)
      – long (e.g., range of 2^32 or 2^64, depending on platform)
      – long long (e.g., range of 2^64)

Page 13: IERG2080 Introduction to Systems Programming Chapter 2 - C

2. Properties of Types

• Define the possible values
  - Floating point data types defined by the C language:
    • Always signed (positive and negative)
    • size of type (how large the range of numbers it can represent)
      – float (e.g., range of 3.4×10^38)
      – double (e.g., range of 1.8×10^308)
      – long double (e.g., range of 1.19×10^4932)
    • real versus complex
      – complex type supports complex numbers, introduced in ISO C99 (not covered here)

Page 14: IERG2080 Introduction to Systems Programming Chapter 2 - C

2. Properties of Types

• Creating an instance of a data type
  - Example: Create an int instance

    #include <stdio.h>

    int main(void) {
        int anInt;

        anInt = 2080;           // Store a value to it.

        anInt = anInt - 940;    // Perform arithmetic

        printf("anInt=%i\n", anInt);

        return 0;
    }

This instance is a variable with the name "anInt".
This "2080" is also an instance, but without a name; such values are known as literals (more later). The compiler will infer its type as int.

When you create an instance of a type, the system (i.e., compiler) will:
(a) allocate memory (e.g., 4 bytes for an int) for storing the type of values it represents;
(b) keep a record of the type for the instance (e.g., anInt is an int) so that it knows how to work with it (e.g., perform addition) and can verify that it is being used correctly in the program code.

Page 15: IERG2080 Introduction to Systems Programming Chapter 2 - C

2. Properties of Types

• Displaying text to the console
  - printf( ): see http://man7.org/linux/man-pages/man3/printf.3.html
    • Function Prototype

    int printf(const char *format, ...);

    format: format string specifying the output format
    ...   : optional additional arguments matching the format string

    char *name = "Jack"; int n=100;
    printf("Hello %s for the %i-th time!\n", name, n);

Except for % and \, other chars are output to the stream directly.

% marks a field to be filled with a value from a function argument. The next char after % specifies the format of the value to display (aka conversion specifier). Common examples:
- %d, %i : int argument, converted to signed decimal notation for output.
- %f : double argument (float will be converted to double first), floating point output.
- %s : string argument, copy the string to output up to but not including the terminating null.
- %p : pointer argument, output the address stored by the pointer.
- %% : output '%'.

Page 16: IERG2080 Introduction to Systems Programming Chapter 2 - C

2. Properties of Types

• Define the allowed operations
  - signed integer allows arithmetic operations +, -, *, /, as well as the modulo operator %
  - float allows arithmetic operations +, -, *, /, but not %
• Required
  - All values/literals (and variables) must have an associated type, either explicitly defined by the code or implicitly (values/literals only) determined by the compiler
• Non-changing
  - Types are static: once defined for a variable, the type cannot change subsequently
    • e.g., once anInt is defined as an instance of int, it cannot later change to long int or float

Page 17: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• Platform-dependent
  - C does not fully specify how a type is to be represented in binary inside memory
  - Example: Representation of sign (+ve, -ve) in integer types

    b7 b6 b5 b4 b3 b2 b1 b0
     0  0  0  0  0  1  1  0     interpreted as unsigned 8-bit integer value = 6

How do we represent a negative value, say, -6?

    b7 b6 b5 b4 b3 b2 b1 b0
     1  0  0  0  0  1  1  0     interpreted as signed 8-bit integer value = -6

Reserve the most significant bit (MSB) as the sign bit, where b7=0 represents positive while b7=1 represents negative. (This scheme is known as sign-magnitude.)

There are other representations: look up "one's complement" and "two's complement".

Page 18: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• Compiler and Binary Representation
  - The binary representation schemes are hardware (CPU) dependent.
  - This is why the compiler must know the target CPU to correctly generate machine code to operate on numbers.
  - Normally the compiler adopts the representation of the platform it runs on
    • e.g., gcc in Linux/Intel PC will adopt Intel CPU representations in generating the executable binary code. (Try "gcc -v".)
  - A cross-compiler can generate binary code for a target CPU different from the one it runs on
    • e.g., compiling an iPhone app in OS X / Intel Mac for use in an iPhone with an A11 processor

Page 19: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• Compiler and Binary Representation
  - The size of a data type, a variable, or an expression can be determined at compile time using the sizeof() operator.
  - Note that the sizeof() operator returns a number of type size_t (see later), which is platform-dependent.

    /* sizeof.c */
    #include <stdio.h>

    int main(void) {
        printf("sizeof(char) = %zu bytes\n", sizeof(char));
        printf("sizeof(short) = %zu bytes\n", sizeof(short));
        printf("sizeof(int) = %zu bytes\n", sizeof(int));
        printf("sizeof(long) = %zu bytes\n", sizeof(long));
        printf("sizeof(long long) = %zu bytes\n", sizeof(long long));
        printf("sizeof(float) = %zu bytes\n", sizeof(float));
        printf("sizeof(double) = %zu bytes\n", sizeof(double));
        printf("sizeof(long double) = %zu bytes\n", sizeof(long double));
        printf("sizeof(size_t) = %zu bytes\n", sizeof(size_t));
        return 0;
    }

C99 supports the format specifier %zu for the size_t type. Otherwise use %lu (with a cast to unsigned long) for size_t. The gcc compiler will give a warning if the format specifier is incompatible.

Page 20: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• Compiler and Binary Representation
  - Default: gcc compiled to 64-bit target

    yblee@ubuntu:~/ierg2080/Chapter-2$ gcc -o sizeof sizeof.c
    yblee@ubuntu:~/ierg2080/Chapter-2$ ./sizeof
    sizeof(char) = 1 bytes
    sizeof(short) = 2 bytes
    sizeof(int) = 4 bytes
    sizeof(long) = 8 bytes
    sizeof(long long) = 8 bytes
    sizeof(float) = 4 bytes
    sizeof(double) = 8 bytes
    sizeof(long double) = 16 bytes
    sizeof(size_t) = 8 bytes

  - gcc compiled to 32-bit target

    yblee@ubuntu:~/ierg2080/Chapter-2$ gcc -m32 -o sizeof32 sizeof.c
    yblee@ubuntu:~/ierg2080/Chapter-2$ ./sizeof32
    sizeof(char) = 1 bytes
    sizeof(short) = 2 bytes
    sizeof(int) = 4 bytes
    sizeof(long) = 4 bytes
    sizeof(long long) = 8 bytes
    sizeof(float) = 4 bytes
    sizeof(double) = 8 bytes
    sizeof(long double) = 12 bytes
    sizeof(size_t) = 4 bytes

Page 21: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C numbers are not real mathematical numbers!
  - Integer types (short, int, long, etc.) have range limits and thus may overflow in assignment or calculations
  - Example:

    /* overflow.c */
    #include <stdio.h>

    int main(void) {
        unsigned short a = 65535;
        unsigned short b = a + 1;

        printf("a=%u, b=%u\n", a, b);
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ gcc -o overflow overflow.c
    yblee@ubuntu:~/ierg2080$ ./overflow
    a=65535, b=0

b overflows because its binary representation is 16 bits (i.e., 0 to 65535), so it cannot represent the result (i.e., 65536). It wraps around by effectively computing (65536 mod 65536) = 0.

Page 22: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C numbers are not real mathematical numbers!
  - Careful choice of integer type is needed to prevent overflow
  - Example:

    /* overflow.c */
    #include <stdio.h>

    int main(void) {
        unsigned short a = 65535;
        unsigned int b = a + 1;

        printf("a=%u, b=%u\n", a, b);
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ gcc -o overflow overflow.c
    yblee@ubuntu:~/ierg2080$ ./overflow
    a=65535, b=65536

b is now an unsigned int with a wider range (more bits), so it no longer overflows.

Page 23: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C numbers are not real mathematical numbers!
  - Gangnam Style Broke YouTube:

[Figure: screenshot in which the number of views is shown as a negative value]

Source: https://www.geeksforgeeks.org/10-famous-bugs-in-the-computer-science-world/

Page 24: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C numbers are not real mathematical numbers!
  - Limits of basic data types are defined in limits.h and float.h
  - Example:

    /* showlimits.c */
    #include <stdio.h>
    #include <limits.h>

    int main(void) {
        printf("The minimum value of SIGNED CHAR = %d\n", SCHAR_MIN);
        printf("The maximum value of SIGNED CHAR = %d\n", SCHAR_MAX);
        printf("The maximum value of UNSIGNED CHAR = %d\n", UCHAR_MAX);
        printf("The minimum value of SHORT INT = %d\n", SHRT_MIN);
        printf("The maximum value of SHORT INT = %d\n", SHRT_MAX);
        printf("The minimum value of INT = %d\n", INT_MIN);
        printf("The maximum value of INT = %d\n", INT_MAX);
        printf("The minimum value of CHAR = %d\n", CHAR_MIN);
        printf("The maximum value of CHAR = %d\n", CHAR_MAX);
        printf("The minimum value of LONG = %ld\n", LONG_MIN);
        printf("The maximum value of LONG = %ld\n", LONG_MAX);

        return 0;
    }

Source: https://www.tutorialspoint.com/c_standard_library/limits_h.htm

Page 25: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C numbers are not real mathematical numbers!
  - Limits of basic data types are defined in limits.h and float.h
  - Example: (outputs)

    yblee@ubuntu:~/ierg2080$ gcc -o showlimits showlimits.c
    yblee@ubuntu:~/ierg2080$ ./showlimits
    The minimum value of SIGNED CHAR = -128
    The maximum value of SIGNED CHAR = 127
    The maximum value of UNSIGNED CHAR = 255
    The minimum value of SHORT INT = -32768
    The maximum value of SHORT INT = 32767
    The minimum value of INT = -2147483648
    The maximum value of INT = 2147483647
    The minimum value of CHAR = -128
    The maximum value of CHAR = 127
    The minimum value of LONG = -9223372036854775808
    The maximum value of LONG = 9223372036854775807

Page 26: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C numbers are not real mathematical numbers!
  - C floating point number types have finite precision, i.e., not all real numbers can be represented exactly
  - Example:

    #include <stdio.h>

    int main(void) {
        float x = 0.2;
        float y = 0.20000000000001;
        printf("x = %.30f\n", x);   // print 30 digits after decimal point
        printf("y = %.30f\n", y);   // print 30 digits after decimal point
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ ./floatliterals
    x = 0.200000002980232238769531250000
    y = 0.200000002980232238769531250000

The differences, while small, could accumulate in repeated calculations or could affect comparisons.
To dig deeper see: http://www.cs.yale.edu/homes/aspnes/pinewiki/C(2f)FloatingPoint.html

Page 27: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• Number Conversions
  - Assignment involving different number types may either lose precision (e.g., from double to float) or overflow (e.g., from float to int)
  - Example:

    #include <stdio.h>

    int main(void) {
        float largenum1 = 1e9;
        float largenum2 = 1e10;
        int intnum1 = largenum1, intnum2 = largenum2;
        printf("float1 = %f, int1 = %i\n", largenum1, intnum1);
        printf("float2 = %f, int2 = %i\n", largenum2, intnum2);
        return 0;
    }

    u2@ub2-150:~/ierg2080/ch-02$ ./numconvert
    float1 = 1000000000.000000, int1 = 1000000000
    float2 = 10000000000.000000, int2 = -2147483648

The value of int2 is what this platform happened to produce: the C standard leaves the result of converting an out-of-range floating point value to an integer type undefined.

Page 28: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C Characters and Strings
  - ASCII (American Standard Code for Information Interchange)
    • 7 bits are required to represent all English letters, digits, punctuation, and control codes

    Dec  Char                            Dec  Char    Dec  Char   Dec  Char
    ---------                            ---------    ---------   ---------
      0  NUL (null)                       32  SPACE    64  @       96  `
      1  SOH (start of heading)           33  !        65  A       97  a
      2  STX (start of text)              34  "        66  B       98  b
      3  ETX (end of text)                35  #        67  C       99  c
      4  EOT (end of transmission)        36  $        68  D      100  d
      5  ENQ (enquiry)                    37  %        69  E      101  e
      6  ACK (acknowledge)                38  &        70  F      102  f
      7  BEL (bell)                       39  '        71  G      103  g
      8  BS  (backspace)                  40  (        72  H      104  h
      9  TAB (horizontal tab)             41  )        73  I      105  i
     10  LF  (NL line feed, new line)     42  *        74  J      106  j
     11  VT  (vertical tab)               43  +        75  K      107  k
     12  FF  (NP form feed, new page)     44  ,        76  L      108  l
     13  CR  (carriage return)            45  -        77  M      109  m
     14  SO  (shift out)                  46  .        78  N      110  n
     15  SI  (shift in)                   47  /        79  O      111  o
     16  DLE (data link escape)           48  0        80  P      112  p
     17  DC1 (device control 1)           49  1        81  Q      113  q
     18  DC2 (device control 2)           50  2        82  R      114  r
     19  DC3 (device control 3)           51  3        83  S      115  s
     20  DC4 (device control 4)           52  4        84  T      116  t
     21  NAK (negative acknowledge)       53  5        85  U      117  u
     22  SYN (synchronous idle)           54  6        86  V      118  v
     23  ETB (end of trans. block)        55  7        87  W      119  w
     24  CAN (cancel)                     56  8        88  X      120  x
     25  EM  (end of medium)              57  9        89  Y      121  y
     26  SUB (substitute)                 58  :        90  Z      122  z
     27  ESC (escape)                     59  ;        91  [      123  {
     28  FS  (file separator)             60  <        92  \      124  |
     29  GS  (group separator)            61  =        93  ]      125  }
     30  RS  (record separator)           62  >        94  ^      126  ~
     31  US  (unit separator)             63  ?        95  _      127  DEL

Page 29: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• C Characters and Strings
  - ASCII
    • The C char type is typically 8 bits (one byte)
    • A C string is a sequence of bytes, each representing one char, followed by a null-terminating char (value of 0).
  - Expanded character sets, e.g., UNICODE, MBCS
    • Support multiple languages (e.g., Chinese, Russian, etc.)
    • Use more than one byte (e.g., 2 bytes) to represent each char
    • Configured via compiler and library options
    • See:
      – https://home.unicode.org/
      – https://docs.microsoft.com/en-us/cpp/text/unicode-and-mbcs?view=vs-2019

Page 30: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• Byte Order
  - How are multi-byte numbers laid out in memory?
  - Example: a 4-byte unsigned integer (e.g., unsigned long on a 32-bit target)

    Decimal value:     287454020
    Hexadecimal value: 0x11223344

    • Most-significant-byte first (aka Big-Endian)

      Memory address:  p    p+1  p+2  p+3
      Byte value:      11   22   33   44

    • Least-significant-byte first (aka Little-Endian)

      Memory address:  p    p+1  p+2  p+3
      Byte value:      44   33   22   11

Page 31: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• Byte Order
  - The endianness of C built-in data types is determined by the CPU running the machine code
  - Big-endian CPUs
    • Not common nowadays, e.g., SPARC, POWER
  - Little-endian CPUs
    • Intel, ARM, etc.
    • Some CPUs are switchable, e.g., ARM, POWER, etc.
  - When compiling software to machine code, the endianness must be fixed according to the target CPU architecture

Page 32: IERG2080 Introduction to Systems Programming Chapter 2 - C

3. Binary Representation

• When do we need to worry about binary representation?
  - Software that exchanges data with other platforms
    • e.g., generate a binary data file in Linux/Intel to be read by an iPhone app
  - Software that performs network communications
    • e.g., an iPhone banking app carrying out transactions with the bank's server running on a Linux/Intel platform.
  - These topics and others are covered in IERG4180.

Page 33: IERG2080 Introduction to Systems Programming Chapter 2 - C

4. Alias for Data Types

• You may run into code using a type such as size_t:

    #include <stdio.h>
    #include <stddef.h>

    int main(void) {
        size_t a = 10;
        size_t b = 20;
        size_t c = a + b;
        printf("%zu + %zu = %zu\n", a, b, c);
        return 0;
    }

• size_t is not a fundamental C data type (like int), so where does it come from?
  - It is defined in stddef.h, e.g.,

    typedef unsigned long size_t;

    typedef       : C reserved word for creating an alias for an existing type
    unsigned long : the existing type (to be aliased)
    size_t        : name of the new type (it has the same representation as the existing type; C standard typedefs usually have the _t suffix)

Page 34: IERG2080 Introduction to Systems Programming Chapter 2 - C

4. Alias for Data Types

• But why?
  - Binary representations for fundamental data types such as int and long vary across platforms
    • May have different ranges, e.g.,
      16-bit unsigned int can represent a range from 0 to 65535
      32-bit unsigned int can represent a range from 0 to 4294967295
  - Choosing the right data type could become complicated if we want to prevent overflow (i.e., trying to assign a number larger than the maximum value a type can represent)
• The C Library (not the language) provides a number of convenient aliases to automatically pick the right data type for common usage
  - These are defined in header files such as <stddef.h> that come with the compiler

Page 35: IERG2080 Introduction to Systems Programming Chapter 2 - C

4. Alias for Data Types

• size_t
  - Unsigned integer type, guaranteed to represent any valid object size on the platform without overflow
  - Should be used whenever the size of an object is to be represented
  - Older C source code may use int/long or unsigned int/long

Page 36: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Literals and Values
  - A literal is just a value in the source code without being assigned a name like a variable, e.g., 2080 is an integer value

    #include <stdio.h>

    int main(void) {
        int anInt;

        anInt = 2080;                   // A literal of integer value 2080

        anInt = anInt - 940;            // Another literal of integer value 940

        printf("anInt=%i\n", anInt);    // A 3rd literal of char string (see next slide)

        return 0;
    }

Page 37: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Literals and Values
  - A literal is just a value in the source code without being assigned a name like a variable
• Integer Type Prefixes
  - Octal integer constant (i.e., base-8)
    • Begins with a '0' and is followed by digits {0,…,7}
    • E.g., 077 means 77 (base 8) or 7×8^1 + 7×8^0 = 63 (base 10)
  - Hexadecimal integer constant (i.e., base-16)
    • Begins with prefix '0x' and is followed by digits {0,…,9,A,…,F}
    • Each base-16 digit is 4 bits (2^4) so 0xA32F is 16 bits
    • E.g., 0xA32F means A32F (base 16) or 10×16^3 + 3×16^2 + 2×16^1 + 15×16^0 = 41775 (base 10)
  - Binary integer constant (i.e., base-2)
    • Begins with prefix '0b' and is followed by digits {0,1}
    • Note: '0b' is a gcc extension in older C standards; it became standard only in C23

Page 38: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Integer Type Suffixes
  - No suffix defaults to int, e.g., "int x = 2080"
  - u or U specifies unsigned int, e.g., "unsigned x = 2080U"
  - l or L specifies long, e.g., "long x = 2080L"
  - ul or UL specifies unsigned long, e.g., "unsigned long x = 2080UL"
  - ll or LL specifies long long, e.g., "long long x = 2080LL"
  - ull or ULL specifies unsigned long long, e.g., "unsigned long long x = 2080ULL"

Page 39: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Floating Point Number
  - Decimal floating point constants
    • With a decimal point, i.e., '.', e.g., "2080."
    • Support scientific notation for powers of ten (10^e), e.g., 1.7E-13 means 1.7×10^-13
    • Floating point literals default to type double (not float)
    • Types can be changed by suffix: f, F for float and l, L for long double
• Characters and Strings
  - Integer character constant
    • A single letter enclosed by single quotes: ' ', e.g., 'a'
    • Internally represented by an integer value mapped to the character (e.g., using ASCII code, 'a' has value 97)
  - String literals
    • A sequence of letters enclosed by " ", e.g., "anInt=%i\n"

Page 40: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Characters and Strings
  - '\' is called an escape character, used to specify special sequences (can be used in both char and string)
    • '\n' is a control character to switch to a new line (e.g., in printf)

With the newline:

    /* helloworld.c
     * This is my first C program in IERG2080!
     */

    #include <stdio.h>

    int main(void) {
        printf("Hello World!\n");
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ ./helloworld
    Hello World!
    yblee@ubuntu:~/ierg2080$

Without the newline:

    /* helloworld.c
     * This is my first C program in IERG2080!
     */

    #include <stdio.h>

    int main(void) {
        printf("Hello World!");
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ ./helloworld
    Hello World!yblee@ubuntu:~/ierg2080$

Without the newline '\n' the command prompt won't start on a new line.

Page 41: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Characters and Strings
  - Common C escape sequences:

    Escape Sequence   Meaning
    \b                Backspace
    \n                Newline
    \r                Carriage return (next char prints from the beginning of the current line)
    \t                Horizontal tab
    \\                Backslash \
    \' and \"         Single and double quotation marks

Page 42: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Some properties of literal values
  - Consecutive string literals are concatenated (by the compiler)

    #include <stdio.h>

    int main(void) {
        printf("Hello World!"
               " This follows in the same line.\n"
               "This prints in a new line.\n");
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ gcc -o stringliterals stringliterals.c
    yblee@ubuntu:~/ierg2080$ ./stringliterals
    Hello World! This follows in the same line.
    This prints in a new line.

Useful for writing long string literals by breaking them into multiple lines.

Page 43: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Some properties of literal values
  - Beware of mixing signed and unsigned numbers

    #include <stdio.h>

    int main(void) {
        unsigned x = 1;
        printf("x = %u\n", x);
        printf("-x = %u\n", -x);
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ gcc -o numberliterals numberliterals.c
    yblee@ubuntu:~/ierg2080$ ./numberliterals
    x = 1
    -x = 4294967295

The "%u" means unsigned int value. Negating an unsigned value wraps around instead of producing a negative number.

Page 44: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Some properties of literal values
  - A floating point literal may differ from its effective value
    • Limited by the representation's precision
    • May lead to unexpected results:

    #include <stdio.h>

    int main(void) {
        float x = 0.2;
        float y = 0.20000000000001;

        if (x == y)
            printf("x == y!\n");
        else
            printf("x != y!\n");

        printf("x = %.30f\n", x);
        printf("y = %.30f\n", y);
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ ./floatliterals
    x == y!
    x = 0.200000002980232238769531250000
    y = 0.200000002980232238769531250000

Page 45: IERG2080 Introduction to Systems Programming Chapter 2 - C

5. Specifying Values

• Some properties of literal values
  - A floating point literal may differ from its effective value
    • Choose the type with sufficient precision for the task:

    #include <stdio.h>

    int main(void) {
        double x = 0.2;
        double y = 0.20000000000001;

        if (x == y)
            printf("x == y!\n");
        else
            printf("x != y!\n");

        printf("x = %.30f\n", x);
        printf("y = %.30f\n", y);
        return 0;
    }

    yblee@ubuntu:~/ierg2080$ ./floatliterals
    x != y!
    x = 0.200000000000000011102230246252
    y = 0.200000000000010003109451872660


6. Variables and Initializers

• Variables
  - A variable has a type (e.g., int) and a value (e.g., 2080)
  - Intuitively, a variable is like a custom PO box with a name
    • Customized for the data type (e.g., int for storing signed integers, typically using 32 bits)
    • The name is just the variable name

#include <stdio.h>

int main(void) {
    double x = 0.2;
    double y = 0.20000000000001;

    if (x == y)
        printf("x == y!\n");
    else
        printf("x != y!\n");

    printf("x = %f\n", x);
    printf("y = %f\n", y);
    return 0;
}


• Variables
  - Declaring and defining a variable:

#include <stdio.h>

int main(void) {
    int x;
    int y = 10;

    if (y > x)
        printf("y > x!\n");
    else
        printf("y <= x!\n");

    printf("x = %i\n", x);
    printf("y = %i\n", y);
    return 0;
}

BAD BUG

yblee@ubuntu:~/ierg2080$ gcc -o uninitvar uninitvar.c
yblee@ubuntu:~/ierg2080$ ./uninitvar
y > x!
x = 0
y = 10


• Variables
  - Declaring and defining a variable:

#include <stdio.h>

int main(void) {
    int x;
    int y = 10;

    if (y > x)
        printf("y > x!\n");
    else
        printf("y <= x!\n");

    printf("x = %i\n", x);
    printf("y = %i\n", y);
    return 0;
}

BAD BUG

yblee@ubuntu:~/ierg2080$ gcc -o uninitvar uninitvar.c
yblee@ubuntu:~/ierg2080$ ./uninitvar
y <= x!
x = -26
y = 10

x was not initialized with a value.

An uninitialized variable has an indeterminate value: whatever bit values happen to be stored in its memory location. It’s unpredictable.


• Variables
  - Declaring, defining, and initializing a variable:

#include <stdio.h>

int main(void) {
    int x = 0;
    int y = 10;

    if (y > x)
        printf("y > x!\n");
    else
        printf("y <= x!\n");

    printf("x = %i\n", x);
    printf("y = %i\n", y);
    return 0;
}

CORRECT

yblee@ubuntu:~/ierg2080$ gcc -o uninitvar uninitvar.c
yblee@ubuntu:~/ierg2080$ ./uninitvar
y > x!
x = 0
y = 10

x is now always initialized to 0.


• Configuring Compiler Warning Levels
  - By default, the compiler does not give any warning for the uninitialized variable:

#include <stdio.h>

int main(void) {
    int x;
    int y = 10;

    if (y > x)
        printf("y > x!\n");
    else
        printf("y <= x!\n");

    printf("x = %i\n", x);
    printf("y = %i\n", y);
    return 0;
}

BAD BUG

yblee@ubuntu:~/ierg2080$ gcc -o uninitvar uninitvar.c
yblee@ubuntu:~/ierg2080$ ./uninitvar
y > x!
x = -26
y = 10

x was used before being initialized with a value.

No complaint from the compiler!


• Configuring Compiler Warning Levels
  - Raise the compiler’s warning level with the -Wall option (despite its name, -Wall does not enable every warning; -Wextra enables more)

#include <stdio.h>

int main(void) {
    int x;
    int y = 10;

    if (y > x)
        printf("y > x!\n");
    else
        printf("y <= x!\n");

    printf("x = %i\n", x);
    printf("y = %i\n", y);
    return 0;
}

BAD BUG

yblee@ubuntu:~/ierg2080$ gcc -o uninitvar uninitvar.c -Wall
uninitvar.c: In function ‘main’:
uninitvar.c:7:7: warning: ‘x’ is used uninitialized in this function [-Wuninitialized]
    if (y > x)
      ^

x was used before being initialized with a value.

Now the compiler warns about the potential bug.

To dig further: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html


7. Constants

• Repeated use of the same literals (a.k.a. hardcoding values in source code)
  - Example:

#include <stdio.h>

int main(void) {
    printf("Welcome to IERG%i\n", 2080);
    printf("IERG%i covers C Programming\n", 2080);
    printf("IERG%i is a fun course!\n", 2080);
    return 0;
}

yblee@ubuntu:~/ierg2080$ gcc -o constants constants.c
yblee@ubuntu:~/ierg2080$ ./constants
Welcome to IERG2080
IERG2080 covers C Programming
IERG2080 is a fun course!

Imagine you need to change the course code from 2080 to 3080.

Further imagine it was used in 100 different places! There must be a better way! Yes, there is.


• Repeated use of the same literals replaced by a constant
  - Example:

#include <stdio.h>

int main(void) {
    int const courseCode = 2080;

    printf("Welcome to IERG%i\n", courseCode);
    printf("IERG%i covers C Programming\n", courseCode);
    printf("IERG%i is a fun course!\n", courseCode);
    return 0;
}

yblee@ubuntu:~/ierg2080$ gcc -o constants constants.c
yblee@ubuntu:~/ierg2080$ ./constants
Welcome to IERG2080
IERG2080 covers C Programming
IERG2080 is a fun course!

const qualifier (makes a type const)

The output is exactly the same as before.


• Repeated use of the same literals replaced by a constant
  - Example: Change the constant once and apply it everywhere

#include <stdio.h>

int main(void) {
    int const courseCode = 3080;

    printf("Welcome to IERG%i\n", courseCode);
    printf("IERG%i covers C Programming\n", courseCode);
    printf("IERG%i is a fun course!\n", courseCode);
    return 0;
}

yblee@ubuntu:~/ierg2080$ gcc -o constants constants.c
yblee@ubuntu:~/ierg2080$ ./constants
Welcome to IERG3080
IERG3080 covers C Programming
IERG3080 is a fun course!

Just one change, and everything gets updated.


• Constant Type
  - const is a type qualifier to be applied to an existing type
  - A variable of a const type must not have its value changed in the source code; this is enforced by the compiler at compile time
  - Example:

#include <stdio.h>

int main(void) {
    int const courseCode = 2080;

    courseCode = 1140;
    printf("Welcome to IERG%i\n", courseCode);
    printf("IERG%i covers C Programming\n", courseCode);
    printf("IERG%i is a fun course!\n", courseCode);
    return 0;
}

yblee@ubuntu:~/ierg2080$ gcc -o constants constants.c
constants.c: In function ‘main’:
constants.c:13:15: error: assignment of read-only variable ‘courseCode’
    courseCode = 1140;
               ^


• Type Qualifiers
  - A type qualifier (e.g., const) applies to the type on its left

int const courseCode = 2080;

Base type: int

Qualifier: const. Now the base type becomes a constant integer, i.e., read-only.

Initialization: as a const variable cannot change its value, it must be initialized upon definition.

Note: Much existing C code puts the const qualifier on the left, i.e., const int. This is also valid and means the same thing, but it is not recommended for new code here because it is an exception to the rule above.


• Enum
  - An enumeration creates a new type with a set of allowable values
  - Example:

#include <stdio.h>

int main(void) {
    enum ExamResult {FAIL, PASS};
    enum ExamResult myResult = PASS;

    if (myResult == PASS)
        printf("Hurray I PASSED!\n");
    else
        printf("See you next year!\n");

    return 0;
}

yblee@ubuntu:~/ierg2080$ gcc -o enum enum.c
yblee@ubuntu:~/ierg2080$ ./enum
Hurray I PASSED!


• Enum
  - An enumeration creates a new type with a set of allowable values
  - One can achieve the same using integer values with self-defined meanings:

#include <stdio.h>

int main(void) {
    int myResult = 1; // 0 = FAIL, 1 = PASS

    if (myResult == 1) // vs "if (myResult == PASS)"
        printf("Hurray I PASSED!\n");
    else
        printf("See you next year!\n");

    return 0;
}

yblee@ubuntu:~/ierg2080$ gcc -o enum enum.c
yblee@ubuntu:~/ierg2080$ ./enum
Hurray I PASSED!


• Enum
  - An enumeration creates a new type with a set of allowable values
  - In fact, enum constants are represented by integer values:

#include <stdio.h>

int main(void) {
    enum ExamResult {FAIL, PASS};
    enum ExamResult myResult = PASS;

    if (myResult == 1) // Bad coding style!
        printf("Hurray I PASSED!\n");

    printf("FAIL = %i\n", FAIL);
    printf("PASS = %i\n", PASS);

    return 0;
}

yblee@ubuntu:~/ierg2080$ gcc -o enum enum.c -Wall
yblee@ubuntu:~/ierg2080$ ./enum
Hurray I PASSED!
FAIL = 0
PASS = 1


• Enum
  - Why use enum then?
    • Make the code easier to understand
      – "myResult == PASS" vs "myResult == 1"
    • Define the valid values for a given variable
      – Exam results should be either PASS or FAIL
    • Make it clear what type is expected
      – An int can hold any integer, but declaring a variable as enum ExamResult makes it clear what is expected

More readable code makes better code!


• C Language Preprocessor
  - Processes macros and other preprocessor directives to produce a processed C source file for the compiler to compile

[Figure: a C source file and the C header files it pulls in via #include (both with macros and preprocessor directives) go into the C Preprocessor, which outputs pure C source code for the Compiler.]


• C Language Preprocessor
  - Macros: preprocessor definitions for code substitution
  - Example: Using macros to define constants

#define EXAMRESULT int
#define FAIL 0
#define PASS 1

int main(void) {
    EXAMRESULT myResult = PASS;
    if (myResult == PASS)
        return PASS;
    else
        return FAIL;
}

# indicates it’s for the preprocessor (cf. #include)

Macro names should be in all caps.

Looks very much like enum, right?
• No new type introduced
• The macro will be expanded during preprocessing …


• C Language Preprocessor
  - Macros: preprocessor definitions for code substitution
  - Use the -E option in gcc to generate the preprocessed C source code (or invoke the preprocessor directly via cpp)

yblee@ubuntu:~/ierg2080$ gcc -o macros-preprocessed.c macros.c -E
yblee@ubuntu:~/ierg2080$ gedit macros-preprocessed.c &

macros.c:

#define EXAMRESULT int
#define FAIL 0
#define PASS 1

int main(void) {
    EXAMRESULT myResult = PASS;
    if (myResult == PASS)
        return PASS;
    else
        return FAIL;
}

macros-preprocessed.c (macros are gone now):

int main(void) {
    int myResult = 1;
    if (myResult == 1)
        return 1;
    else
        return 0;
}

Macro expansion is essentially “search and replace”!


• C Language Preprocessor
  - Macros
    • The preprocessor performs search and replace for the macro throughout the C source file, from the point of its definition onwards
    • The preprocessor won’t check the syntax of the macro-expanded code; that’s the job of the C compiler. Sometimes this results in hard-to-understand compilation errors.
    • In addition to constants, macros can also define aliases for data types (e.g., EXAMRESULT for int) and inline functions (more on that later).
  - There are many other preprocessor directives
    • #include (inserts an external file into the current source file)
    • Conditional compilation, diagnostics, etc.
    • To dig further: https://gcc.gnu.org/onlinedocs/cpp/


Summary

• In this chapter we covered:
  - C Data Types
    • C data types, their allowable values and operations
    • Built-in data types, type aliases
    • Data type representation, considerations for overflow and precision
  - Ways to specify values for various data types
    • Character strings and escape sequences
  - Variables and how/when to initialize them
    • Uninitialized variables are a BIG NO NO
    • Compiler warning levels


Summary (continued)

• In this chapter we covered:
  - Constants
    • Hardcoding vs defining constants
    • The const qualifier and the left-hand rule
    • Enum
    • C preprocessor and macros

• Final words on data types
  - Data types help the programmer clearly specify the intended use of values and variables
  - The compiler keeps track of data types and checks them in use (e.g., during assignment, passing parameters to functions, etc.)
  - Data types help catch coding errors at compile time (when they are easy to fix) rather than at runtime (when they annoy the user)