
Lecture 2

Data Representation in Computer Systems

Lecture Duration: 2 Hours

Prepared by Dr. Hassan SALTI - 2012

Lecture Overview

Introduction
Positional Numbering System
Decimal to binary conversion
Signed integer representation
Floating-point representation


Some Notifications – A reminder (1/2)

Bit: The most basic unit of information in a digital computer (on/off; 0/1 state)

Byte: A set of 8 bits

Word: Two or more adjacent bytes that are manipulated collectively

Word size: The size of a word in bits depends on the computer organization (16, 32, 64 bits, …)

Nibble (or nybble): A set of 4 bits. A byte is usually divided into two nibbles, a low-order nibble and a high-order nibble


Some Notifications – A reminder (2/2)

Example: the 16-bit word 0 1 1 0 0 1 1 1 1 0 0 0 1 1 0 1

• Word (16 bits): 01100111 10001101
• Bytes: 01100111 (first byte) and 10001101 (second byte)
• Nibbles: each byte splits into a high-order nibble (0110 and 1000) and a low-order nibble (0111 and 1101)
• Most Significant Bit (MSB): the leftmost bit of the word (0)
• Least Significant Bit (LSB): the rightmost bit of the word (1)




Positional Numbering System (1/3)

Any numeric value is represented through increasing powers of a radix (or base)

The set of valid numerals (digits) is equal in size to the radix of that system

The least numeral is 0 and the highest one is 1 smaller than the radix

Example:
• In the decimal system (base 10)
- The radix is 10
- The number of valid numerals is 10 (equal to the radix)
- The set of valid numerals is: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}



The most important radices (bases) in computer science are:
• Binary
- Radix 2 or base 2
- Numerals: {0, 1}
• Octal
- Radix 8 or base 8
- Numerals: {0, 1, 2, 3, 4, 5, 6, 7}
• Hexadecimal
- Radix 16 or base 16
- Numerals: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}



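Any numeric value is represented through increasing powers of a radix (or base)

Examples
• $243.51_{10} = 2 \times 10^{2} + 4 \times 10^{1} + 3 \times 10^{0} + 5 \times 10^{-1} + 1 \times 10^{-2}$
• $212_{3} = 2 \times 3^{2} + 1 \times 3^{1} + 2 \times 3^{0} = 23_{10}$
• $10110.01_{2} = 1 \times 2^{4} + 0 \times 2^{3} + 1 \times 2^{2} + 1 \times 2^{1} + 0 \times 2^{0} + 0 \times 2^{-1} + 1 \times 2^{-2} = 22.25_{10}$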
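The expansions above all follow one general rule. As a sketch, with symbol names that are assumptions of this note rather than the slides' ($d_i$ for the digit at position $i$, $r$ for the radix, $n$ integer digits and $m$ fractional digits):

```latex
\[
(d_{n-1} \dots d_{1} d_{0} \, . \, d_{-1} \dots d_{-m})_{r}
  \;=\; \sum_{i=-m}^{n-1} d_{i} \, r^{i},
\qquad 0 \le d_{i} \le r - 1
\]
```

Each digit is weighted by the power of the radix matching its position, and every digit stays below the radix.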


Lecture Overview

Introduction
Positional Numbering System
Decimal to binary conversion
• Converting Unsigned Whole Numbers
• Converting fractions
• Converting between Power-of-Two Radices
Signed integer representation
Floating-point representation


Some numbers to remember (1/1)

Keep in mind the following tables or how to obtain them!


Converting Unsigned Whole Numbers (1/6)

A real number can take any value (e.g. 10323.7643; -16813.5322703)

Whole number: no fractional part (e.g. 10, 1231, 3543, …, -12, -12334, …)

Unsigned number: no sign, so only non-negative values (e.g. 102313.43234, 1231.56234, 12357, …)

Unsigned whole number: no fractional part and no sign



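Convert the decimal number $113_{10}$ to binary: $113_{10} = \,?_{2}$

Method 1: Repeated subtraction. Subtract the largest power of 2 that fits; write 1 for each power used and 0 for each power skipped.

113 - 64 = 49   (2^6 used: 1)
 49 - 32 = 17   (2^5 used: 1)
 17 - 16 =  1   (2^4 used: 1)
                (2^3, 2^2, 2^1 skipped: 0 0 0)
  1 -  1 =  0   (2^0 used: 1)

$113_{10} = 1110001_{2}$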



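Method 2: Division-remainder. Divide repeatedly by 2 and record the remainders; the first remainder is the LSB and the last one is the MSB.

113 ÷ 2 = 56   Remainder 1   (LSB)
 56 ÷ 2 = 28   Remainder 0
 28 ÷ 2 = 14   Remainder 0
 14 ÷ 2 =  7   Remainder 0
  7 ÷ 2 =  3   Remainder 1
  3 ÷ 2 =  1   Remainder 1
  1 ÷ 2 =  0   Remainder 1   (MSB)

$113_{10} = 1110001_{2}$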
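The division-remainder method can be sketched in a few lines of Python. The function name `to_base` and the limit of radix 16 are choices made for this illustration, not part of the lecture.

```python
def to_base(n: int, radix: int) -> str:
    """Convert a non-negative integer to `radix` (2..16) by division-remainder."""
    if n < 0 or not 2 <= radix <= 16:
        raise ValueError("expects n >= 0 and 2 <= radix <= 16")
    symbols = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, radix)  # the quotient feeds the next division
        out.append(symbols[remainder])   # the first remainder is the least significant digit
    return "".join(reversed(out))        # the last remainder is the most significant digit


print(to_base(113, 2))  # 1110001
```

Because the radix is a parameter, the same loop converts a decimal number to any other base, not only to base 2.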



A binary number with N bits can represent $2^{N}$ unsigned integers, from 0 to $2^{N} - 1$

Example:
• With N = 4 bits, we can represent $2^{4} = 16$ unsigned integers, from 0 to $2^{4} - 1 = 16 - 1 = 15$
• The number 16 CANNOT be represented with only 4 bits!



The subtraction method is cumbersome.

The subtraction method requires familiarity with the powers of the radix being used.

The division-remainder method is faster and easier than the repeated subtraction method.

The division-remainder method can be used to convert from decimal to any other base system (not only to base 2).



Example: Convert $104_{10}$ to base 3 using the division-remainder method.

104 ÷ 3 = 34   Remainder 2   (least significant digit)
 34 ÷ 3 = 11   Remainder 1
 11 ÷ 3 =  3   Remainder 2
  3 ÷ 3 =  1   Remainder 0
  1 ÷ 3 =  0   Remainder 1   (most significant digit)

$104_{10} = 10212_{3}$






Converting fractions (1/5)

Fractions in a decimal system can be converted/approximated to fractions in any other radix system

Radix points separate the integer part of a number from its fractional part

Examples of numbers with a fractional part
• Base 10: 2390167.1208 (integer part 2390167, fractional part 1208)
• Base 3: 2012.11022 (integer part 2012, fractional part 11022)
• Base 2: 1011110.111011 (integer part 1011110, fractional part 111011)

The “radix point” is called a “decimal point” in a decimal system, a “binary point” in a binary system, and so on…



To convert fractions from decimal to any other base system we repeatedly multiply by the destination radix

Example: Convert $0.4304_{10}$ to base 5.

0.4304 × 5 = 2.1520   The integer part is 2
0.1520 × 5 = 0.7600   The integer part is 0
0.7600 × 5 = 3.8000   The integer part is 3
0.8000 × 5 = 4.0000   The integer part is 4; the fractional part is zero, so we are done

$0.4304_{10} = 0.2034_{5}$
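The repeated-multiplication procedure can be sketched as follows. This is an illustration only: the name `fraction_to_base` and the `max_digits` cut-off are assumptions of the sketch, and `fractions.Fraction` is used so that the decimal input is held exactly instead of as a binary float.

```python
from fractions import Fraction


def fraction_to_base(frac: Fraction, radix: int, max_digits: int) -> str:
    """Convert 0 <= frac < 1 to `radix` by repeated multiplication (at most max_digits digits)."""
    if not 0 <= frac < 1:
        raise ValueError("expects 0 <= frac < 1")
    symbols = "0123456789ABCDEF"
    out = []
    for _ in range(max_digits):
        frac *= radix
        digit = int(frac)        # the integer part is the next digit
        out.append(symbols[digit])
        frac -= digit            # continue with the fractional part only
        if frac == 0:            # nothing left to convert: the result is exact
            break
    return "0." + "".join(out)


print(fraction_to_base(Fraction("0.4304"), 5, 10))  # 0.2034
```

The `max_digits` argument plays the role of fixing the number of digits to the right of the radix point when the expansion never terminates.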



Some fractions in one base could be indeterminate (non-terminating)
• These are fractions that contain repeating strings of digits to the right of the radix point
• Example: $(2/3)_{10} = (0.666\ldots)_{10}$

An indeterminate fraction in one base could be determinate in another base (and vice versa).
• Example: $(2/3)_{10} = (0.666\ldots)_{10} = 0.2_{3}$
- 2/3 is indeterminate in base 10 but determinate in base 3.

When a fraction is indeterminate, an approximation is needed
• We fix the number of digits to the right of the radix point

Approximation is also needed because computing resources are limited (for example, the limited size of the processor’s registers)



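Example: Convert $0.34375_{10}$ to binary with 4 bits to the right of the binary point.

0.34375 × 2 = 0.68750   The integer part is 0
0.68750 × 2 = 1.37500   The integer part is 1
0.37500 × 2 = 0.75000   The integer part is 0
0.75000 × 2 = 1.50000   The integer part is 1. This is our fourth bit, so we stop here.

$0.34375_{10} \approx 0.0101_{2}$ (the remaining fractional part is dropped)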



Convert 26.78125 to binary: $26.78125_{10} = \,?_{2}$

By using the methods just described, we have:

$26_{10} = 11010_{2}$ and $0.78125_{10} = 0.11001_{2}$

So $26.78125_{10} = 11010.11001_{2}$


Going back to positional numbering system (1/1)

Any unsigned whole or fractional number can be converted to decimal by using the “Positional Numbering System” described previously

Examples:
• $0.0101_{2} = 0 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3} + 1 \times 2^{-4} = 0 + 0.25 + 0 + 0.0625 = 0.3125_{10}$
• $134.2034_{5} = 1 \times 5^{2} + 3 \times 5^{1} + 4 \times 5^{0} + 2 \times 5^{-1} + 0 \times 5^{-2} + 3 \times 5^{-3} + 4 \times 5^{-4} = 44.4304_{10}$
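As a rough check of conversions like these, the positional expansion can be evaluated programmatically. The helper name `to_decimal` is hypothetical, and the integer part is accumulated with Horner's rule, which gives the same value as summing the powers one by one.

```python
from fractions import Fraction


def to_decimal(text: str, radix: int) -> Fraction:
    """Evaluate an unsigned number written in `radix` (digits 0-9, A-F, optional radix point)."""
    whole, _, frac = text.partition(".")
    value = Fraction(0)
    for ch in whole:                      # integer part, most significant digit first
        value = value * radix + int(ch, radix)
    weight = Fraction(1, radix)           # weight of the first fractional digit
    for ch in frac:                       # fractional part: radix^-1, radix^-2, ...
        value += int(ch, radix) * weight
        weight /= radix
    return value


print(float(to_decimal("0.0101", 2)))
print(float(to_decimal("134.2034", 5)))
```

`int(ch, radix)` raises `ValueError` for a digit that is not valid in the given radix, which mirrors the rule that the highest numeral is one less than the radix.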






Converting between Power-of-Two Radices (1/4)

To convert from one base to another (when neither is base 10), it is easier to pass through base 10.
• Example: $3121_{4} = \,?_{3}$
• First step: $3121_{4} = 3 \times 4^{3} + 1 \times 4^{2} + 2 \times 4^{1} + 1 \times 4^{0} = 217_{10}$
• Second step: by using the division-remainder method, $217_{10} = 22001_{3}$
• So $3121_{4} = 22001_{3}$

Working between bases that are powers of two is much easier.



The most common power-of-two radices are: binary (base 2), octal (base $2^{3} = 8$) and hexadecimal (base $2^{4} = 16$).

Each octal digit is equivalent to a group of 3 binary digits, called an octet (1)

Each hexadecimal digit is equivalent to a group of 4 binary digits, called a hextet

We convert from binary to octal and from binary to hexadecimal by simply grouping bits

(1) The term “octet” is also used in the literature to describe a set of 8 bits.



Example: Convert $10110010011101_{2}$ to octal

• Make groups of 3 bits (from right to left):
- 10 110 010 011 101
• Add zero(s) on the left to complete the last octet:
- 010 110 010 011 101
• Convert each octet to its corresponding octal digit:
- 010 110 010 011 101 → 2 6 2 3 5
• Finally: $10110010011101_{2} = 26235_{8}$



Example: Convert $10110010011101_{2}$ to hexadecimal

• Make groups of 4 bits (from right to left):
- 10 1100 1001 1101
• Add zero(s) on the left to complete the last hextet:
- 0010 1100 1001 1101
• Convert each hextet to its corresponding hexadecimal digit:
- 0010 1100 1001 1101 → 2 C 9 D
• Finally: $10110010011101_{2} = 2C9D_{16}$
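The grouping procedure can be sketched as a short function. The name `binary_to_power_of_two` and its `group` parameter are assumptions of this illustration (3 bits per group for octal, 4 for hexadecimal).

```python
def binary_to_power_of_two(bits: str, group: int) -> str:
    """Convert a binary string to base 2**group by grouping bits from the right."""
    symbols = "0123456789ABCDEF"
    pad = (-len(bits)) % group            # zeros needed to complete the leftmost group
    bits = "0" * pad + bits
    groups = [bits[i:i + group] for i in range(0, len(bits), group)]
    return "".join(symbols[int(g, 2)] for g in groups)


print(binary_to_power_of_two("10110010011101", 3))  # 26235
print(binary_to_power_of_two("10110010011101", 4))  # 2C9D
```

No arithmetic on the whole number is needed: each group is converted independently, which is why these conversions are so much easier than passing through base 10.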


Lecture Overview

Introduction
Positional Numbering System
Decimal to binary conversion
Signed integer representation
• Signed Magnitude
• Complement system
Floating-point representation


Signed integer representation

An integer is a whole number

Signed integers are the set of positive and negative whole numbers

How should we encode and deal with the actual sign of the number?

Two concepts are used
• Signed Magnitude concept
• Complement concept


Signed Magnitude (1/13)

Signed magnitude is the most intuitive method

The MSB (Most Significant Bit) of a binary number is kept as the “sign” of the number
• MSB = 1: negative number
• MSB = 0: positive number

The remaining bits represent the magnitude (or absolute value) of the numeric value



Example: In an 8-bit word signed-magnitude system, give the decimal representation of the following numbers

• 00000001?
- The MSB is 0: the number is positive
- The remaining 7 bits are: $0000001_{2} = 1_{10}$
- The decimal number is +1
• 10000001?
- The MSB is 1: the number is negative
- The remaining 7 bits are: $0000001_{2} = 1_{10}$
- The decimal number is -1



Example: In an 8-bit word signed-magnitude system, give the decimal representation of the following numbers

• 10001001?
- The MSB is 1: the number is negative
- The remaining 7 bits are: $0001001_{2} = 9_{10}$
- The decimal number is -9
• 01000001?
- The MSB is 0: the number is positive
- The remaining 7 bits are: $1000001_{2} = 65_{10}$
- The decimal number is +65



In an N-bit word signed-magnitude system
• 1 bit is used for the sign of the number
• N-1 bits are used for the magnitude of the number
• The largest integer is $2^{N-1} - 1$
• The smallest integer is $-(2^{N-1} - 1)$

Example: in an 8-bit word signed-magnitude system
• The largest integer is $01111111_{2} = 2^{7} - 1 = 127_{10}$
• The smallest integer is $11111111_{2} = -(2^{7} - 1) = -127_{10}$
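A minimal sketch of decoding a signed-magnitude word and of the representable range follows. The function names are hypothetical, and bit strings are used instead of machine words purely for readability.

```python
def decode_signed_magnitude(bits: str) -> int:
    """Interpret a bit string whose first bit is the sign and the rest the magnitude."""
    sign = -1 if bits[0] == "1" else 1
    return sign * int(bits[1:], 2)


def signed_magnitude_range(n: int) -> tuple[int, int]:
    """Smallest and largest integers of an n-bit signed-magnitude word."""
    limit = 2 ** (n - 1) - 1
    return -limit, limit


print(decode_signed_magnitude("10001001"))  # -9
print(signed_magnitude_range(8))            # (-127, 127)
```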



Computers should be able to carry out mathematical operations

Signed-magnitude arithmetic is carried out using essentially the same methods as humans use
• At first we look at the signs of the two operands
• We arrange the operands in a certain way based on their signs
• We perform the calculation without regard to the signs
• Finally, we supply the sign as appropriate



Adding operands that have the same sign

Example: Add $01001111_{2}$ to $00100011_{2}$ using signed-magnitude arithmetic.

Sign
             1 1 1 1       ⇐ carries
  0      1 0 0 1 1 1 1       (79)
+ 0    + 0 1 0 0 0 1 1     + (35)
  0      1 1 1 0 0 1 0       (114)

We find $01001111_{2} + 00100011_{2} = 01110010_{2}$ in signed-magnitude representation.



Overflow condition
• In the last example, adding the seventh bits (the leftmost magnitude bits) gives no carry
• If there is a carry, we say that we have an overflow condition and the carry is discarded, resulting in an incorrect sum.

Example: Add $01000001_{2}$ to $01100001_{2}$ using signed-magnitude arithmetic

       1           1       ⇐ carries
  0      1 0 0 0 0 0 1       (65)
+ 0    + 1 1 0 0 0 0 1     + (97)
  0      0 1 0 0 0 1 0       (34)  ✗

The addition overflows
• The last carry is discarded
• The sum’s result is incorrect: 65 + 97 = 162, which does not fit in 7 magnitude bits



Signed-magnitude subtraction is carried out in a manner similar to pencil-and-paper decimal arithmetic

Example 1: Subtract $01001111_{2}$ (79) from $01100011_{2}$ (99) using signed-magnitude arithmetic.

           0 1 1 2         ⇐ borrows
  0      1 1 0 0 0 1 1       (99)
  0    - 1 0 0 1 1 1 1     - (79)
  0      0 0 1 0 1 0 0       (20)

We find $01100011_{2} - 01001111_{2} = 00010100_{2}$ in signed-magnitude representation.



Example 2: Subtract $01100011_{2}$ (99) from $01001111_{2}$ (79) using signed-magnitude arithmetic.
• Here the subtrahend, 01100011, is larger than the minuend, 01001111.
• With the result obtained in Example 1, we know that the difference of the two magnitudes is $0010100_{2}$.
• Because the subtrahend is larger than the minuend, all that we need to do is change the sign of the difference.
• So we find $01001111_{2} - 01100011_{2} = 10010100_{2}$ in signed-magnitude representation



Example 3: Add $10010011_{2}$ (-19) to $00001101_{2}$ (+13) using signed-magnitude arithmetic.
• The operand with the larger magnitude (19) is negative, so the result is negative
• We subtract 13 from 19
• The result is: $10000110_{2}$ (-6)

Example 4: Subtract $10011000_{2}$ (-24) from $10101011_{2}$ (-43) using signed-magnitude arithmetic.
• This is equivalent to adding +24 to -43
• The operand with the larger magnitude (43) is negative, so the result is negative
• We subtract 24 from 43
• The result is: $10010011_{2}$ (-19)



General rules when operands have different signs
• Determine which operand has the larger magnitude
• The sign of the result is the same as the sign of the operand with the larger magnitude
• The magnitude must be obtained by subtracting (not adding) the smaller one from the larger one
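The rules above, together with the same-sign case and its overflow condition, can be put together in one sketch. The function name `sm_add` and the choice to raise `OverflowError` are assumptions of this illustration; real hardware would set a flag instead.

```python
def sm_add(a: str, b: str) -> str:
    """Add two signed-magnitude bit strings of equal width."""
    width = len(a)
    sign_a, mag_a = a[0], int(a[1:], 2)
    sign_b, mag_b = b[0], int(b[1:], 2)
    if sign_a == sign_b:
        mag, sign = mag_a + mag_b, sign_a      # same sign: add the magnitudes
        if mag >= 2 ** (width - 1):            # carry out of the magnitude bits
            raise OverflowError("sum does not fit in the magnitude bits")
    elif mag_a >= mag_b:
        mag, sign = mag_a - mag_b, sign_a      # the larger magnitude decides the sign
    else:
        mag, sign = mag_b - mag_a, sign_b
    return sign + format(mag, f"0{width - 1}b")


print(sm_add("10010011", "00001101"))  # 10000110, i.e. -19 + 13 = -6
```

Subtraction can reuse the same function by flipping the sign bit of the subtrahend first. Note how many branches even this small sketch needs, and that adding -1 to +1 here yields 10000000, the "negative zero" of this representation.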



Problems related to signed magnitude
• Too many decisions to make (which number is larger? are borrows needed? what are the signs?)
• The number 0 has two representations: 10000000 and 00000000
• Complicated method
• Expensive circuits




Complement system (1/19)

In a complement system, only negative numbers are converted; positive numbers keep their usual binary representation

When using a complement system, subtraction is converted to an addition

Advantages of complement systems
• Simplified computer arithmetic
• No need to process sign bits separately
• The sign of a number is easily checked by looking at its high-order bit (MSB).



In base 10, “casting out 9s” was used to subtract numbers

Let’s say we wanted to find 167 - 52
• At first, 999 - 52 is calculated: 999 - 52 = 947
• 947 is then added to 167, and the last carry is added to the sum: 167 + 947 = 1114; setting aside the leading carry gives 114, and 114 + 1 = 115
• So 167 - 52 = 115

    1 1 1         ⇐ carries
      1 6 7
  +   9 4 7
  ---------
  (1) 1 1 4       the carry out (1) is added back: 114 + 1 = 115



The last method uses a “diminished radix complement”

Working in base r (radix), the diminished radix is given by: r - 1

Example: Base 10; r = 10
• The diminished radix is r - 1 = 10 - 1 = 9
• We say that a negative number is converted to its 9’s complement
• For example, $-2468_{10}$ is converted to its nine’s complement as follows: $-2468_{10} \rightarrow 9999 - 2468 = 7531_{C9}$



In a binary system r = 2
• The diminished radix is r - 1 = 1
• We say that we work in one’s complement (C1)
• To convert a negative number to its one’s complement, its magnitude is subtracted from all ones
• A positive number is directly converted to its binary representation
• Example:
- The one’s complement of $0101_{2}$ is $1111_{2} - 0101_{2} = 1010_{C1}$
- It is nothing more than switching all of the 1s with 0s and vice versa!



Example: Express $23_{10}$ and $-9_{10}$ in 8-bit binary one’s complement form.

$23_{10} = +(00010111_{2}) = 00010111_{C1}$

$-9_{10} = -(00001001_{2}) = 11110110_{C1}$



In one’s complement, subtraction is converted into addition
• Example: $23_{10} - 9_{10} = 23_{10} + (-9_{10})$

Example: Add $23_{10}$ to $-9_{10}$ using 8-bit binary one’s complement arithmetic.

    1 1 1 1 0 1 1           ⇐ carries
      0 0 0 1 0 1 1 1         23
  +   1 1 1 1 0 1 1 0     + (-9)
  -------------------
  (1) 0 0 0 0 1 1 0 1
  +                 1       the last carry is added to the sum
  -------------------
      0 0 0 0 1 1 1 0         14

The result is $00001110_{C1} = +(00001110_{2}) = 14_{10}$



Example: Add $9_{10}$ to $-23_{10}$ using 8-bit binary one’s complement arithmetic.

$-23_{10} = -(00010111_{2}) = 11101000_{C1}$

$9_{10} = +(00001001_{2}) = 00001001_{C1}$

$9_{10} + (-23_{10}) = 00001001_{C1} + 11101000_{C1}$

            1               ⇐ carries
      0 0 0 0 1 0 0 1          9
  +   1 1 1 0 1 0 0 0     + (-23)
  -------------------
      1 1 1 1 0 0 0 1        -14

Result: $11110001_{C1} = -(00001110_{2}) = -14_{10}$ (there is no carry out of the MSB, so nothing is added back)
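One’s complement addition with its end-around carry can be sketched as below. The function names are hypothetical and bit strings stand in for registers; overflow detection is left out of this sketch.

```python
def ones_complement_add(a: str, b: str) -> str:
    """Add two one's complement bit strings of equal width."""
    width = len(a)
    total = int(a, 2) + int(b, 2)
    if total >= 2 ** width:                  # a carry came out of the MSB
        total = total - 2 ** width + 1       # end-around carry: add it back to the sum
    return format(total, f"0{width}b")


def ones_complement_decode(bits: str) -> int:
    """Decimal value of a one's complement bit string."""
    if bits[0] == "0":
        return int(bits, 2)
    flipped = "".join("1" if bit == "0" else "0" for bit in bits)
    return -int(flipped, 2)


result = ones_complement_add("00010111", "11110110")   # 23 + (-9)
print(result, ones_complement_decode(result))          # 00001110 14
```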



In One’s complement, we still have two representations for zero: 00000000 and 11111111

Computer engineers long ago stopped using one’s complement

A more efficient representation for binary numbers is the two’s complement



Two’s complement is an example of a radix complement

No need to subtract one from the radix r when working in a radix complement.

Example: Base 10; r = 10
• We say that a negative number is converted to its 10’s complement
• For example, $-2468_{10}$ is converted to its ten’s complement as follows: $-2468_{10} \rightarrow 10000 - 2468 = 7532_{C10}$



In a binary system r = 2
• The radix itself (not the diminished radix) is used: r = 2
• We say that we work in two’s complement (C2)
• Let d be the number of digits
• To convert a negative number -N to its two’s complement, its magnitude N is subtracted from $r^{d} = 2^{d}$: $-N_{10} \rightarrow (2^{d} - N)_{C2}$
• A positive number is directly converted to its binary representation



Example:
• In a 4-bit system: d = 4
• All negative numbers are converted by being subtracted from $2^{d} = 2^{4} = 16_{10} = 10000_{2}$
• The two’s complement of $0011_{2}$ is $10000_{2} - 0011_{2} = 1101_{C2}$
• It is nothing more than the one’s complement incremented by 1!



Example: Express $23_{10}$, $-23_{10}$, and $-9_{10}$ in 8-bit binary two’s complement form.
• $23_{10} = +(00010111_{2}) = 00010111_{C2}$
• $-23_{10} = -(00010111_{2}) \rightarrow 11101000 + 1 = 11101001_{C2}$
• $-9_{10} = -(00001001_{2}) \rightarrow 11110110 + 1 = 11110111_{C2}$
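Encoding into two’s complement can be sketched directly from the "subtract from 2^d" definition. The name `twos_complement` and the range check are assumptions of this illustration.

```python
def twos_complement(value: int, width: int) -> str:
    """Encode a signed integer as a two's complement bit string of `width` bits."""
    if not -(2 ** (width - 1)) <= value < 2 ** (width - 1):
        raise ValueError("value does not fit in the given width")
    if value < 0:
        value = 2 ** width + value    # 2^d - N, where N is the magnitude of the negative value
    return format(value, f"0{width}b")


print(twos_complement(23, 8))    # 00010111
print(twos_complement(-23, 8))   # 11101001
print(twos_complement(-9, 8))    # 11110111
```

The results match the "flip the bits and add 1" shortcut, since flipping d bits computes (2^d - 1) - N.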



Unlike C1 arithmetic, in C2 the last carry is discarded

Example 1: Add $9_{10}$ to $-23_{10}$ using two’s complement arithmetic.

            1     1         ⇐ carries
      0 0 0 0 1 0 0 1          9
  +   1 1 1 0 1 0 0 1     + (-23)
  -------------------
      1 1 1 1 0 0 1 0        -14

The result is $11110010_{C2} = -(00001110_{2}) = -14_{10}$



Note how a negative binary number in C2 is converted to decimal
• At first, all 0s and 1s in the C2 number are switched: 11110010 → 00001101
• A 1 is then added to the result: 00001101 + 1 = 00001110
• So $11110010_{C2} = -(00001110_{2}) = -14_{10}$



Example 2: Find the sum of $23_{10}$ and $-9_{10}$ in binary using two’s complement arithmetic.

$23_{10} = +(00010111_{2}) = 00010111_{C2}$

$-9_{10} = -(00001001_{2}) = 11110111_{C2}$

$23_{10} + (-9_{10}) = 00010111_{C2} + 11110111_{C2}$

    1 1 1 1 0 1 1 1         ⇐ carries
      0 0 0 1 0 1 1 1         23
  +   1 1 1 1 0 1 1 1     + (-9)
  -------------------
  (1) 0 0 0 0 1 1 1 0         14    the last carry (1) is discarded

Result: $00001110_{C2} = +(00001110_{2}) = 14_{10}$
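Two’s complement addition and decoding can be sketched as follows. The function names are hypothetical; the modulo operation is simply a compact way to discard the carry out of the MSB.

```python
def add_twos_complement(a: str, b: str) -> str:
    """Add two two's complement bit strings of equal width, discarding the last carry."""
    width = len(a)
    total = (int(a, 2) + int(b, 2)) % (2 ** width)   # drop the carry out of the MSB
    return format(total, f"0{width}b")


def decode_twos_complement(bits: str) -> int:
    """Decimal value of a two's complement bit string."""
    if bits[0] == "0":
        return int(bits, 2)
    flipped = "".join("1" if bit == "0" else "0" for bit in bits)  # switch 0s and 1s
    return -(int(flipped, 2) + 1)                                  # add 1, then negate


total = add_twos_complement("00010111", "11110111")   # 23 + (-9)
print(total, decode_twos_complement(total))           # 00001110 14
```

Unlike the signed-magnitude case, no comparison of magnitudes or inspection of signs is needed before adding.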



Advantages of two’s complement
• It is the most popular choice for representing signed numbers
• The algorithm for adding and subtracting is quite easy
• It has the best representation for 0 (all 0 bits)
• It is self-inverting
• It is easily extended to larger numbers of bits.



Drawback
• The range of values that can be represented by N bits is asymmetric.
• Examples:
- With signed-magnitude, 4 bits allow us to represent the values -7 ($1111_{2}$) through +7 ($0111_{2}$).
- Using two’s complement, we can represent the values -8 ($1000_{C2}$) through +7 ($0111_{C2}$)



Overflow in complement systems (C1 and C2)
• An overflow occurs if two positive numbers are added and the result is negative
• or if two negative numbers are added and the result is positive.
• Overflow is not possible when a positive and a negative number are added together.



To Detect Overflow
• Check the last two carries (the carry into the MSB and the carry out of the MSB)
- If these are different: there is an overflow
- If these are equal: there is no overflow

Example 1: Find the sum of $126_{10}$ and $8_{10}$ in binary using two’s complement arithmetic.

    0 1 1 1 1               ⇐ carries
      0 1 1 1 1 1 1 0        126
  +   0 0 0 0 1 0 0 0     +    8
  -------------------
      1 0 0 0 0 1 1 0       -122  ✗

The result is $10000110_{C2} = -(01111010_{2}) = -122_{10}$, which is wrong: 126 + 8 = 134 exceeds +127, the largest 8-bit two’s complement value

Note that the last two carries are different (1 into the MSB, 0 out of it)
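The "compare the last two carries" test can be sketched with a bit-by-bit adder. This illustration covers two’s complement only (the last carry is discarded); the name `add_with_overflow` is an assumption.

```python
def add_with_overflow(a: str, b: str) -> tuple[str, bool]:
    """Add two two's complement bit strings; return (sum, overflow flag)."""
    carry = 0
    carries = []                      # carry out of each bit position, LSB first
    out = []
    for x, y in zip(reversed(a), reversed(b)):
        s = int(x) + int(y) + carry
        out.append(str(s % 2))
        carry = s // 2
        carries.append(carry)
    carry_into_msb = carries[-2] if len(carries) > 1 else 0
    carry_out_of_msb = carries[-1]
    # Overflow exactly when the last two carries differ.
    return "".join(reversed(out)), carry_into_msb != carry_out_of_msb


print(add_with_overflow("01111110", "00001000"))  # ('10000110', True)
print(add_with_overflow("00010111", "11110111"))  # ('00001110', False)
```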


Lecture Overview

Introduction
Positional Numbering System
Decimal to binary conversion
Signed integer representation
Floating-point representation
• A simple model
• Floating-point arithmetic
• Floating-point errors


Floating-point representation (1/1)

A computer is supposed to solve all problems

Huge and fractional numbers and complicated mathematical operations could be involved

Floating-point representation is an optimized solution that gives a good ratio of “biggest number / word size”


Computers use a form of scientific notation for floating-point representation

Numbers written in scientific notation have three components: a sign, a mantissa, and an exponent

Scientific notation in base 10: $+\,0.579 \times 10^{7}$

Scientific notation in base 2: $+\,0.101101 \times 2^{3}$


A simple model (1/8)

In digital computers, floating-point numbers consist of three parts:
• a sign bit,
• an exponent part, representing the exponent on a power of 2,
• a fractional part called a significand (a fancy word for a mantissa).



Using more bits for the exponent increases the range of numbers

Using more bits for the significand increases the precision

For simplicity, throughout this course we will use a simplified 14-bit model
• Sign bit: 1 bit
• Exponent: 5 bits
• Significand: 8 bits



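Exercise 1: Represent the number 17 in a 14-bit floating-point representation
• In decimal: $17 = 17.0 \times 10^{0} = 1.7 \times 10^{1} = 0.17 \times 10^{2}$
• Analogously in binary: $17_{10} = 10001_{2} \times 2^{0} = 1000.1_{2} \times 2^{1} = 100.01_{2} \times 2^{2} = 10.001_{2} \times 2^{3} = 1.0001_{2} \times 2^{4} = 0.10001_{2} \times 2^{5} = 0.010001_{2} \times 2^{6} = 0.0010001_{2} \times 2^{7} = \dots$
• As a convention, we choose the form in which the integer part is 0 and the MSB of the significand is 1: $0.10001_{2} \times 2^{5}$
• The exponent is $5_{10} = 00101_{2}$
• The significand is $10001_{2}$, padded to 8 bits: $10001000_{2}$
• So: 0 | 00101 | 10001000 (sign | exponent | significand)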



The last floating-point representation is not suitable for negative exponents
• Example:
- The number $0.25 = 0.01_{2} = 0.1_{2} \times 2^{-1}$
- How do we represent the negative exponent -1?

To solve such problems we use an excess-16 bias
• 16 is added to every exponent, negative or positive
• We say that the real exponent is replaced by a biased exponent
• All exponents are thereby converted to non-negative biased exponents



With an excess-16 bias
• Biased exponent values less than 16 indicate negative exponents
• Biased exponent values greater than 16 indicate positive exponents
• A biased exponent equal to 16 indicates an exponent of 0
• Exponents of all zeros or all ones are typically reserved for special numbers (such as zero or infinity).



Example 1: Represent the number 17 in a 14-bit floating-point form with excess-16 bias
• The number is positive: the sign bit is 0
• $17_{10} = 0.10001_{2} \times 2^{5}$
• The exponent is $5_{10} \rightarrow (5 + 16)_{10} = 21_{10} = 10101_{2}$
• The significand is $10001_{2}$, padded to 8 bits: $10001000_{2}$
• So 17 in floating-point form with excess-16 bias is: 0 | 10101 | 10001000



Example 2: Represent the number $0.25_{10}$ in a 14-bit floating-point form with excess-16 bias.
• The number is positive: the sign bit is 0
• $0.25_{10} = 0.01_{2} \times 2^{0} = 0.1_{2} \times 2^{-1}$
• The exponent is $-1_{10} \rightarrow (-1 + 16)_{10} = 15_{10} = 01111_{2}$
• The significand is $1_{2}$, padded to 8 bits: $10000000_{2}$
• So 0.25 in floating-point form with excess-16 bias is: 0 | 01111 | 10000000



Example 3: Express $-0.03125_{10}$ in normalized floating-point form with excess-16 bias.
• The number is negative: the sign bit is 1
• $0.03125_{10} = 0.00001_{2} = 0.00001_{2} \times 2^{0} = 0.0001_{2} \times 2^{-1} = \dots = 0.1_{2} \times 2^{-4}$
• The exponent is $-4_{10} \rightarrow (-4 + 16)_{10} = 12_{10} = 01100_{2}$
• The significand is $1_{2}$, padded to 8 bits: $10000000_{2}$
• So -0.03125 in floating-point form with excess-16 bias is: 1 | 01100 | 10000000
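The three examples can be reproduced with a short encoder for the 14-bit model. This is a sketch under stated assumptions: `encode14` is a hypothetical name, extra significand bits are truncated, zero and the reserved exponent patterns are not modeled, and Python's `math.frexp` is used because it already returns the normalized form 0.1xxx × 2^e used in the slides.

```python
import math

BIAS = 16  # excess-16 bias of the simplified model


def encode14(x: float) -> str:
    """Encode x as 'sign exponent significand' (1 + 5 + 8 bits), truncating extra bits."""
    if x == 0:
        raise ValueError("zero is not handled in this sketch")
    sign = "1" if x < 0 else "0"
    mantissa, exponent = math.frexp(abs(x))   # abs(x) = mantissa * 2**exponent, 0.5 <= mantissa < 1
    biased = exponent + BIAS
    if not 0 <= biased <= 31:
        raise ValueError("exponent does not fit in 5 bits")
    significand = int(mantissa * 2 ** 8)      # keep the first 8 bits after the binary point
    return f"{sign} {biased:05b} {significand:08b}"


print(encode14(17))         # 0 10101 10001000
print(encode14(0.25))       # 0 01111 10000000
print(encode14(-0.03125))   # 1 01100 10000000
```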





Floating point arithmetic (1/2)

To add/subtract two numbers in floating-point form
• Both numbers should have the same exponent
• If the exponents are different:
1. We change one of the numbers so that both of them are expressed in the same power of the base
2. We add the binary numbers
3. We represent the result in a normalized floating-point form


Floating point arithmetic (2/2)

Example: Add the following binary numbers as represented in a normalized 14-bit format with an excess-16 bias.

    0 | 10010 | 11001000     (biased exponent 18 → exponent 2)
  + 0 | 10000 | 10011010     (biased exponent 16 → exponent 0)

The first number is $0.11001000_{2} \times 2^{2} = 11.001000_{2} \times 2^{0}$

The second number is $0.10011010_{2} \times 2^{0}$

Now $11.001000_{2} + 0.10011010_{2}$:

    1 1 . 0 0 1 0 0 0 0 0
  + 0 0 . 1 0 0 1 1 0 1 0
  -----------------------
    1 1 . 1 0 1 1 1 0 1 0

The result is $11.10111010_{2} \times 2^{0} = 0.1110111010_{2} \times 2^{2}$

In floating-point form with excess-16 bias, keeping the first 8 significand bits:

    0 | 10010 | 11101110





Floating Point Errors (1/2)

Computers are finite systems

When dealing with floating-point form, we are modeling the infinite system of real numbers in a finite system of integers

What we have, in truth, is an approximation of the real number system

The more bits we use, the better the approximation

However, there is always some element of error

Such errors can propagate through a lengthy calculation, causing substantial loss of precision


Floating Point Errors (2/2)

Example:
• In our previous simple model
- We are limited to the range $-0.11111111_{2} \times 2^{15}$ through $+0.11111111_{2} \times 2^{15}$.
- We cannot store $2^{-19}$ or $2^{128}$; they simply don’t fit.
- Also, 128.5 cannot be accurately stored even though it is well within our range
→ $128.5_{10} = 10000000.1_{2} = 0.100000001_{2} \times 2^{8}$
→ The significand needs more than 8 bits!
→ In practice we store only the first 8 bits: 10000000
→ We actually store 128 and not 128.5, an absolute error of 0.5
→ The relative error is: $\frac{128.5 - 128}{128.5} \approx 0.00389 = 0.39\%$
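The error figures for 128.5 can be checked with a small sketch that keeps only 8 significand bits. The helper name `store_8bit_significand` is hypothetical, it handles positive values only, and it truncates rather than rounds, as the example above does.

```python
import math


def store_8bit_significand(x: float) -> float:
    """Value actually kept when only the first 8 significand bits of positive x are stored."""
    mantissa, exponent = math.frexp(x)                # x = mantissa * 2**exponent, 0.5 <= mantissa < 1
    kept = math.floor(mantissa * 2 ** 8) / 2 ** 8     # drop every bit beyond the eighth
    return math.ldexp(kept, exponent)


x = 128.5
stored = store_8bit_significand(x)
absolute_error = x - stored
relative_error = absolute_error / x
print(stored, absolute_error, round(relative_error, 5))  # 128.0 0.5 0.00389
```

Values whose significand already fits in 8 bits, such as 17, come back unchanged, so the error depends on the number being stored and not only on its size.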

End of lecture 2

Try to solve all exercises related to lecture 2
