data

46
Based on slides from Jones & Bartlett Data Representation in Computer Systems

Upload: didier-a

Post on 22-Mar-2016

213 views

Category:

Documents


1 download

DESCRIPTION

Data Representation in Computer Systems Based on slides from Jones & Bartlett Chapter 2 Objectives • Understand how errors can occur in computations because of overflow. Data Representation 1 COMP 228, Fall 2009

TRANSCRIPT

Based on slides from Jones & Bartlett

Data Representation in Computer Systems

COMP 228, Fall 2009 Data Representation 1

Chapter 2 Objectives

• Understand the fundamentals of alphanumeric data representation and manipulation in digital computers.

• Master the skill of converting between various radix systems.

• Understand how errors can occur in computations because of overflow.

COMP 228, Fall 2009 Data Representation 2

2.1 Introduction

• A bit is the most basic unit of information in a computer.– It is a state of “on” or “off” in a digital circuit (flip flop).– Sometimes these states are “high” or “low” voltage

instead of “on” or “off.”• A byte is a group of eight bits.

– A byte is the smallest possible addressable unit of a typical computer storage.

– The term, “addressable,” means that a particular byte can be retrieved according to its location in memory or register.

COMP 228, Fall 2009 Data Representation 3

2.2 Positional Numbering Systems

• The decimal number 947 in powers of 10 is:

• The decimal number 5836.47 in powers of 10 is:

5 × 10 3 + 8 × 10 2 + 3 × 10 1 + 6 × 10 0

+ 4 × 10 -1 + 7 × 10 -2

9 × 10 2 + 4 × 10 1 + 7 × 10 0

COMP 228, Fall 2009 Data Representation 4

2.2 Positional Numbering Systems

• The binary number 11001 in powers of 2 is:

• When the radix of a number is something other than 10, the base is denoted by a subscript. – Sometimes, the subscript 10 is added for emphasis:

110012 = 2510

1 × 2 4 + 1 × 2 3 + 0 × 2 2 + 0 × 2 1 + 1 × 2 0

= 16 + 8 + 0 + 0 + 1 = 25

COMP 228, Fall 2009 Data Representation 5

2.3 Decimal to Binary Conversions

• Fractional decimal values have nonzero digits to the right of the decimal point.

• Fractional values of other radix systems have nonzero digits to the right of the radix point.

• Numerals to the right of a radix point represent negative powers of the radix:

0.4710 = 4 × 10 -1 + 7 × 10 -2

0.112 = 1 × 2 -1 + 1 × 2 -2

= ½ + ¼= 0.5 + 0.25 = 0.75

COMP 228, Fall 2009 Data Representation 6

• The calculation to the right is an example of using the subtraction method to convert the decimal 0.8125 to binary.

– Our result, reading from top to bottom is:

0.812510 = 0.11012

– Of course, this method works with any base, not just binary.

2.3 Decimal to Binary Conversions

COMP 228, Fall 2009 Data Representation 7

2.3 Decimal to Binary Conversions

• The hexadecimal numbering (base-16) system uses the numerals 0 through 9 and the letters A through F.– The decimal number 12 is C16.– The decimal number 26 is 1A16.

• It is easy to convert between base 16 and base 2, because 16 = 24. It is used primarily for its human friendliness, bridging the gap between the human (decimal) world and computer (binary) world.- Example: 2610 = 110102 = 1A16

• Thus, to convert from binary to hexadecimal, all we need to do is group the binary digits into groups of fourbits, starting from the right.

COMP 228, Fall 2009 Data Representation 8

Unsigned Integer

• An unsigned integer is a binary represented integer whose value is always positive.

• Example: 2610 = 110102 is an unsigned integer represented in 5 bits.

• For every decimal number, there is a minimum number of bits that must be needed in order to represent it [here the decimal number 26 requires 5 bits].

• Overflow (error) will occur if the number of bits available to store an integer is smaller than the minimum required.

• Example: With 4 bits, we need to delete one of the 5 bits needed to represent 26. An error will necessarily result.

COMP 228, Fall 2009 Data Representation 9

2.4 Signed Integer Representation

• A signed integer can be negative or positive, depending on its ‘sign’.

• Two’s complement is the common signed integer representation:– If the number is positive, just convert it to binary and

you’re done.– If the number is negative, find the one’s complement of

the number (by complementing every bit) and then add 1.• Example:

– In 8-bit one’s complement, positive 3 is: 00000011– Negative 3 in one’s complement is: 11111100– Adding 1 gives us -3 in two’s complement form: 11111101.

COMP 228, Fall 2009 Data Representation 10

2.4 Signed Integer Representation

• In two’s complement addition, all we do is add our two binary numbers. Just discard any carries emitting from the high order bit.

– Example: 48 + (-19) = 29

We note that 19 in one’s complement is: 00010011, so -19 in one’s complement is: 11101100,and -19 in two’s complement is: 11101101.

COMP 228, Fall 2009 Data Representation 11

2.4 Signed Integer Representation

• 2’s complement is attractive coding for signed integers: subtraction can be easily implemented as addition with the 1’s complement of the second operand, together with a least significant carry-in bit, as evidenced in the previous example.

• Whenever the most significant bit is 1, the 2’s complement integer is negative.

COMP 228, Fall 2009 Data Representation 12

2.4 Signed Integer Representation

• Examples:• Represent -68 in 8-bit 2’s complement

68 = 010001002-68 = (10111011)2 + 1 = 101111002

• A Quick Trick:Complement every bit until the last 1-bit in the string, as underlined in the above example. This replaces the addition by simple (and partial) complementation.

COMP 228, Fall 2009 Data Representation 13

2.4 Signed Integer Representation

• When we use a finite number of bits to represent a number, we always run the risk of the result of our calculations becoming too large to be stored in the computer.

• Typical overflow occurs when the sum of two positive integers becomes negative, or the sum of two negative integers becomes positive, due to the need to truncate the most significant bit when the register that contains the result is not large enough.

• While we cannot always prevent overflow, we can always detect overflow.

• In complement arithmetic, an overflow condition is easy to detect.

COMP 228, Fall 2009 Data Representation 14

2.4 Signed Integer Representation

• Example:– Using two’s complement binary

arithmetic, find the sum of 107 and 46.

• We see that the nonzero carry from the seventh bit overflows into the sign bit, giving us the erroneous result: 107 + 46 = -103.

Rule for detecting signed two’s complement overflow: When the “carry in” and the “carry out” of the sign bit differ, overflow has occurred.

COMP 228, Fall 2009 Data Representation 15

2.4 Signed Integer Representation

• Signed and unsigned numbers are both useful.– For example, memory addresses are always unsigned.

• Using the same number of bits, unsigned integers can express twice as many values as signed numbers.

• Trouble arises if an unsigned value becomes too large and overflows.– In four bits: 1111 + 1 = 0000 [the most significant bit is

chopped off because only 4 bits are used]• Good programmers stay alert for this kind of

problem.

COMP 228, Fall 2009 Data Representation 16

2.4 Signed Integer Representation• Integer Range (for 3-bit integers b2b1b0)

b2 b1 b0 Unsigned 2’s Complement0 0 0 0 00 0 1 1 10 1 0 2 20 1 1 3 31 0 0 4 -4 1 0 1 5 -3 1 1 0 6 -2 1 1 1 7 -1

For k bits, the range of integers represented in 2’s complement is:- 2k-1 ≤ 2’s complement integer ≤ 2k-1 - 1

COMP 228, Fall 2009 Data Representation 17

Clicker 2.1

What is the decimal equivalent of the 2’s complement integer 10000001?

A. 129B. -128C. -129D. -127E. None of the above

COMP 228, Fall 2009 Data Representation 18

Clicker 2.2

When 10000001 is added to itself, carry will occur.

A. TrueB. False

COMP 228, Fall 2009 Data Representation 19

Clicker 2.3

When 10000001 is added to itself, overflow will occur.

A. TrueB. False

COMP 228, Fall 2009 Data Representation 20

Clciker 2.4

Suppose x = FFF00FF0. What is –x (in hex)?

A. 000FF00FB. 000FF010C. 000FF0F0D. None of the above

COMP 228, Fall 2009 Data Representation 21

• Computer representation of a floating-point number consists of three fixed-size fields:

• IEEE-754 single precision standard uses excess-127 code for the 8-bit exponent and signed-magnitude for the 23-bit (excluding the sign) significand

2.5 Floating-Point Representation

COMP 228, Fall 2009 Data Representation 22

Floating Point Representation

• Example:Sign = 1 [negative]Exponent = 10000001 [Excess-127: 129 – 127 = 2]Significand = 111….0 [Implied 1: 1.1112 ]Hence: 110000001110…0 = -1.1112 * 22 = -111.12 = -7.510

In Hex, this becomes: C0E00000

COMP 228, Fall 2009 Data Representation 23

Clicker 2.5

What is the floating point representation of 1.1? (expressed in hex)

A. 00800000B. 00C00000C. 3FC00000D. 3F800000E. None of the above

COMP 228, Fall 2009 Data Representation 24

2.6 Character Code (ASCII)

• Alphanumeric characters include characters, numbers, punctuations etc that you find on typical keyboards.

• Their representation in a computer may follow some standards. The most common one is ASCII (American Standard Code for Information Interchange).

• The 7-bit ASCII code employs 7 bits to encode up to 128 different alphanumeric characters [27 = 128]

COMP 228, Fall 2009 Data Representation 25

2.6 ASCII Code

• Example:• ‘COMP 228’ contains a string of 8 ASCII characters,

including the space inside. Its coded representation is:• C: 43• O: 4F• M: 4D• P: 50• : 20• 2: 32• 2: 32• 8: 38

COMP 228, Fall 2009 Data Representation 26

2.6 ASCII Code

• An Interesting Observation:– Decimal Digit ‘0’ = 3016

– Decimal Digit ‘6’ = 3616

– We can convert ‘6’ to 6 by subtraction:‘6’ – 3016 = 3616 – 3016 = 6

– Input/Output is often ASCII coded. So conversion from ASCII to integer representation may be necessary for data processing.

COMP 228, Fall 2009 Data Representation 27

2.6 ASCII Code

• Another Interesting Observation– ‘A’ = 4116 ‘a’ = 6116

– Converting ‘A’ to ‘a’ can be done by addition:‘A’ + 2016 = 4116 + 2016 = 6116 = ‘a’

– This applies to all other alphabets (A to Z) and may be simple programming tricks in text processing.

COMP 228, Fall 2009 Data Representation 28

Clicker 2.6

What is ’48’ – 48 when performed by a computer? (expressed in hex)

A. 0B. 3408C. 33F0D. 3300

COMP 228, Fall 2009 Data Representation 29

• Check digits, appended to the end of a long number can provide some protection against data input errors.– The last character of UPC barcodes and ISBNs are check

digits.

• Longer data streams require more economical and sophisticated error detection mechanisms.

• Cyclic redundancy checking (CRC) codes provide error detection for large blocks of data.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 30

• Consider a set of k data bits and an error detection parity bit P.

• Even Parity Invariant:• The k+1 bits do not have a single bit error iff the total

number of 1-bits (bits whose values are 1) is even.• Example: Data = 10010010 P =1

The total number of 1-bits is 4, satisfying theeven parity invariant.If any of the nine bits toggle (0 to 1 or vice versa), the even parity invariant will fail.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 31

• Suppose three of the data bits are toggled erroneously. Will the parity bit allow us detect this?

• A Always• B Never• C Sometimes

Clicker 2.7

COMP 228, Fall 2009 Data Representation 32

• Data transmission errors are easy to fix once an error is detected. – Just ask the sender to transmit the data again.

• In computer memory and data storage, however, this cannot be done.– Too often the only copy of something important is in

memory or on disk.

• Thus, to provide data integrity over the long term, error correcting codes are required.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 33

• Hamming codes and Reed-Soloman codes are two important error correcting codes.

• Reed-Soloman codes are particularly useful in correcting burst errors that occur when a series of adjacent bits are damaged.– Because CD-ROMs are easily scratched, they employ a type

of Reed-Soloman error correction.• Because the mathematics of Hamming codes is

much simpler than Reed-Soloman, we discuss Hamming codes in some detail.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 34

• Suppose we have a set of n-bit code words consisting of m data bits and r (redundant) parity bits.

• An error could occur in any of the n bits, so each code word can be associated with n erroneous words at a Hamming distance of 1.

• Therefore,we have n + 1 bit patterns for each code word: one valid code word, and n erroneous words.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 35

• With n-bit code words, we have 2 n possible code words consisting of 2 m data bits (where n = m + r).

• This gives us the inequality:

(n + 1) × 2 m ≤ 2 n

• Because n = m + r, we can rewrite the inequality as:

(m + r + 1) × 2 m ≤ 2 m + r or (m + r + 1) ≤ 2 r

– This inequality gives us a lower limit on the number of check bits that we need in our code words.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 36

• Suppose we have data words of length m = 4. Then:

(4 + r + 1) ≤ 2 r

implies that r must be greater than or equal to 3.• This means to build a code with 4-bit data words

that will correct single-bit errors, we must add 3 check bits.

• Finding the number of check bits is the hard part. The rest is easy.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 37

• Suppose we have data words of length m = 8. Then:

(8 + r + 1) ≤ 2 r

implies that r must be greater than or equal to 4.• This means to build a code with 8-bit data words

that will correct single-bit errors, we must add 4 check bits, creating code words of length 12.

• So how do we assign values to these check bits?

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 38

• With code words of length 12, we observe that each of the digits, 1 though 12, can be expressed in powers of 2. Thus:

1 = 2 0 5 = 2 2 + 2 0 9 = 2 3 + 2 02 = 2 1 6 = 2 2 + 2 1 10 = 2 3 + 2 1

3 = 2 1 + 2 0 7 = 2 2 + 2 1 + 2 0 11 = 2 3 + 2 1 + 2 04 = 2 2 8 = 2 3 12 = 2 3 + 2 2

– 1 (= 20) contributes to all of the odd-numbered digits.– 2 (= 21) contributes to the digits, 2, 3, 6, 7, 10, and 11.– . . . And so forth . . .

• We can use this idea in the creation of our check bits.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 39

• Using our code words of length 12, number each bit position starting with 1 in the low-order bit.

• Each bit position corresponding to an even power of 2 will be occupied by a check bit.

• These check bits contain the parity of each bit position for which it participates in the sum.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 40

• Since 2 (= 21) contributes to the digits, 2, 3, 6, 7, 10, and 11. Position 2 will contain the parity for bits 3, 6, 7, 10, and 11.

• When we use even parity, this is the modulo 2 sum of the participating bit values.

• For the bit values shown, we have a parity value of 0 in the second bit position.

2.8 Error Detection and Correction

What are the values for the other parity bits?

COMP 228, Fall 2009 Data Representation 41

• The completed code word is shown above.– Bit 1checks the digits, 3, 5, 7, 9, and 11, so its value is 1.– Bit 4 checks the digits, 5, 6, 7, and 12, so its value is 1.– Bit 8 checks the digits, 9, 10, 11, and 12, so its value is

also 1.• Using the Hamming algorithm, we can not only

detect single bit errors in this code word, but also correct them!

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 42

• Suppose an error occurs in bit 5, as shown above. Our parity bit values are:

– Bit 1 checks digits, 3, 5, 7, 9, and 11. Its value is 1, but should be zero.

– Bit 2 checks digits 2, 3, 6, 7, 10, and 11. The zero is correct.– Bit 4 checks digits, 5, 6, 7, and 12. Its value is 1, but should

be zero.– Bit 8 checks digits, 9, 10, 11, and 12. This bit is correct.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 43

• We have erroneous bits in positions 1 and 4.• With two parity bits that don’t check, we know that

the error is in the data, and not in a parity bit.• Which data bits are in error? We find out by

adding the bit positions of the erroneous bits.• Simply, 1 + 4 = 5. This tells us that the error is in

bit 5. If we change bit 5 to a 1, all parity bits check and our data is restored.

2.8 Error Detection and Correction

COMP 228, Fall 2009 Data Representation 44

Clicker 2.8

Suppose a register has a stuck-at-0 fault in its least significant bit. Suppose the register contains FC00 and it has a parity of 1.

A. We can tell the correct content is FC00.B. We can tell the correct content is FC01.C. We cannot tell the correct content.

COMP 228, Fall 2009 Data Representation 45

Clicker 2.9

• Consider the Hamming SECC with the following data:10_010_0_ _

The resulting code word should be:

A: 1000100010B: 1010101000C: 1010101010D: 1010100010E: None of the above