Post on 19-Dec-2015
![Page 1: EECE476 Lectures 4,5 –ALUs, Add, Multiply, and Floating-Point Chapter 3: Computer Arithmetic The University of British ColumbiaEECE 476© 2005 Guy Lemieux](https://reader030.vdocument.in/reader030/viewer/2022032611/56649d2b5503460f94a013d4/html5/thumbnails/1.jpg)
EECE476
Lectures 4, 5 – ALUs, Add, Multiply, and Floating-Point
Chapter 3: Computer Arithmetic
The University of British Columbia, EECE 476 © 2005 Guy Lemieux
Announcements
• Assignment 1
  – First part posted on web.
    • Do it as practice for tomorrow’s tutorial!
  – Second part coming soon.
    • Do it as practice for the quiz next week!
• Quiz Dates
  – Quiz 1: Thurs, Sept 22nd, based on Assignment 1
  – Quiz 2, etc.: TBD
Reading
• Chapter 3
  – 3.2 signed numbers
  – 3.3 addition and subtraction
  – 3.4 multiplication
  – 3.5 division
  – 3.6 floating-point (read lightly)
Computer Arithmetic
• Objective 1
  – Discover the “logic complexity” of the different types of arithmetic done by a CPU
  – The complexity will have an impact on performance later!
• Objective 2
  – Learn how to build an ALU for your project
The Conclusions
• Add is easy
  – Fast adding is not too bad either…
  – Subtraction: addition’s tricky pal
• Multiply is hard…
  – But you can add many times
• Divide is really hard…
  – Divide and be conquered!
• Anything floating-point is impossible!
  – Well, not quite, but you will get the idea…
Computer Architecture?
• Recall
  – Computer Architecture = ISA + Machine Organization
  – Machine Organization = implementation details!
• Begin to consider coupling
  – ISA ↔ Machine Organization
• Heart of computer: arithmetic calculations
  – Done by ALU: Arithmetic Logic Unit
• Some parts not done by ALU
  – Decision-making, iteration, memory/state …
  – All of these are important as well
MIPS Arithmetic Instructions
• Let’s design an ALU that MIPS can use!
• Many different operations
  – Arithmetic
    • Add, AddU, Sub, SubU
    • AddI, AddIU
    • Mult, MultU, Div, DivU
  – Logical
    • And, Or, Xor, Nor
    • AndI, OrI, XorI
  – Logical/Arithmetic
    • SLT, SLTU
    • SLTI, SLTIU
  – Shifting (Left/Right & Logical/Arithmetic & Const/Variable)
    • SLL, SRL, SRA, SLLV, SRLV, SRAV
MIPS ALU Design
• First: simplify!
  – Throw out “hard” operations
    • Mult, Div
  – Extract & group basic operations
    • Add, Sub
    • And, Or, Nor, Xor
    • SLT
    • Shifting (is this hard?)
MIPS ALU Design
• Second: simplify!
  – Identify common optimizations
    • Sub = variation of Add
    • Nor = variation of Or (why Nor?)
  – Some other CPUs have even more operations
    • Bit set, Bit test, Bit clear, etc.
ALU Design 1
• Easy way…
– Try to be more creative!
[Figure: separate +/– and * units whose outputs are selected by Instruction/operation to produce F]
ALU Design 2
• Start with single-bit operations
  – All operations share same 2 inputs
  – Small optimizations may be possible
    • E.g., Or and Nor
    • E.g., Add and And (see problem set)
    • Generally, these aren’t too helpful
[Figure: one-bit block with inputs A and B, an Operation select (Nor shown), and output F]
ALU Design 3
• Build up to larger multi-bit operations
  – Bigger & better optimizations are possible
    • E.g., Add and Sub
    • E.g., SLT
Add/AND/OR for ALU: Bit-based
[Figure, "One Bit": one-bit ALU with inputs a, b, CarryIn, an Operation select (mux inputs 0, 1, 2), and outputs Result, CarryOut]
[Figure, "Multiple Bits": ALU0..ALU31 chained CarryOut to CarryIn, sharing Operation, with inputs a0..a31, b0..b31 and outputs Result0..Result31]
Subtract for ALU
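• A – B = A + (~B + 1), where ~B is B with every bit inverted (two's complement negation)
[Figure: one-bit ALU with a Binvert mux (inputs 0, 1) selecting b or its inverse; inputs a, b, CarryIn; Operation select (mux inputs 0, 1, 2); outputs Result, CarryOut]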
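The one-bit slice and the subtract trick can be sketched in software. This is only an illustrative model, not the course's reference design: the function names, the Operation encoding (0 = AND, 1 = OR, 2 = ADD), and tying the first CarryIn to Binvert to supply the "+1" are assumptions.

```python
MASK32 = 0xFFFFFFFF

def alu_bit(a, b, carry_in, binvert, operation):
    """One-bit ALU slice. Assumed encoding: 0 = AND, 1 = OR, 2 = ADD."""
    b = b ^ binvert                  # Binvert mux: pass b or its inverse
    total = a + b + carry_in         # full adder
    carry_out = total >> 1
    result = (a & b, a | b, total & 1)[operation]
    return result, carry_out

def alu32(a, b, binvert, operation):
    """Chain 32 slices, CarryOut of slice i feeding CarryIn of slice i+1."""
    carry = binvert                  # assumption: CarryIn of slice 0 = Binvert (the "+1")
    result = 0
    for i in range(32):
        bit, carry = alu_bit((a >> i) & 1, (b >> i) & 1, carry, binvert, operation)
        result |= bit << i
    return result

def add(a, b):
    return alu32(a, b, binvert=0, operation=2)

def sub(a, b):
    return alu32(a, b, binvert=1, operation=2)   # A + ~B + 1
```

For example, `sub(13, 11)` walks the carry through all 32 slices and returns 2, while `sub(11, 13)` returns the 32-bit two's complement pattern for -2.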
Fast Adders
• We will assume fast adders are available
  – Slow O(n) vs. fast O(log n)
  – E.g., carry-lookahead, carry-skip, carry-select
• You should know “fast adders” already, but…
  – Not a part of this course
  – Fast adders are NOT on assignments, tests or exam
  – Do NOT use a fast adder in your project
    • FPGAs have their own “fast carry chain” to do adds quickly, and you will merely confuse the tool
Set Less Than
• One-bit ALU blocks
[Figure, "All other bits": one-bit ALU with inputs a, b, Less, CarryIn, Binvert; Operation select (mux inputs 0 to 3); outputs Result, CarryOut]
[Figure, "Most-significant bit": same block with overflow detection added, producing Set and Overflow outputs]
Set Less Than
• Stitching the one-bit ALU blocks together
• Notice ‘Set’ output is sign bit of A-B
  – Used as ‘Less’ input on LSB
  – Other ‘Less’ inputs forced to 0
[Figure: ALU0..ALU31 chained through CarryOut/CarryIn with shared Binvert and Operation; Set from ALU31 feeds Less of ALU0, all other Less inputs are 0; outputs Result0..Result31 and Overflow]
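A small software model of Set Less Than may help here. It is a sketch under assumptions: the function names are made up, and the overflow correction in `slt` is an addition beyond what the slide states (the slide only says Set is the sign bit of A-B, which is what `slt_sign_only` computes).

```python
MASK = 0xFFFFFFFF

def subtract32(a, b):
    """A - B computed as A + ~B + 1, truncated to 32 bits."""
    return (a + (~b & MASK) + 1) & MASK

def slt_sign_only(a, b):
    """Set = sign bit of A - B, as described on the slide."""
    return subtract32(a & MASK, b & MASK) >> 31

def slt(a, b):
    """Signed less-than that also accounts for overflow in A - B (assumed refinement)."""
    a &= MASK
    b &= MASK
    sign = subtract32(a, b) >> 31
    sign_a, sign_b = a >> 31, b >> 31
    # Subtraction overflows when the operand signs differ and the
    # result sign does not match the sign of A.
    overflow = int(sign_a != sign_b and sign != sign_a)
    return sign ^ overflow           # only the LSB of the result can be 1
```

The two versions agree whenever A - B does not overflow. They differ for cases such as the largest positive value compared against the most negative value, where the raw sign bit gives the wrong answer.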
Testing Result for Zero
[Figure: the same 32-bit ALU (ALU0..ALU31 with Binvert, Operation, Less, Set, Overflow) with an added Zero output, asserted when every bit Result0..Result31 is 0]
BIG PICTURE
• This course is about computer architecture
• Why care about ALU design details?
  – Our goal is performance
  – Some ALU designs may be faster or slower
• You must understand the impact they have on
  – Clock frequency (cycle time)
  – Instruction set design
  – More advanced things (e.g., impact of multiple ALUs)
  – Ultimately, performance!
Multiply
Shift, Add, Shift, Add, Shift, Add…
Multiplication - Decimal
• More complex than addition
  – Multiple additions (and shifts)
  – More gates/area, slower
• Gradeschool algorithm:

Multiplicand M     13
Multiplier   Q   x 11
                   13    <- 13 x 1
                  13     <- 13 x 10
Product      P    143
Multiplication - Binary
• Same algorithm, different digits

Multiplicand M        1101   (13)
Multiplier   Q      x 1011   (11)
                      1101    <- 1 Q0  Partial Product PP0
                     1101     <- 1 Q1  Partial Product PP1
                    0000      <- 0 Q2  Partial Product PP2
                   1101       <- 1 Q3  Partial Product PP3
Product      P    10001111   (143)

• M bits x N bits => M+N bit product
• Binary makes it easy:
  – Bit Qi is zero => PPi is 0
  – Bit Qi is one => PPi is M (shifted i times left)
  – Product is sum of PPs
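The partial-product rule above can be checked with a few lines of Python. This is a minimal sketch for unsigned operands; the function name and the `bits` parameter are illustrative choices, not part of the lecture.

```python
def multiply_partial_products(m, q, bits=4):
    """Form one partial product per multiplier bit, then sum them (unsigned)."""
    partial_products = []
    for i in range(bits):
        q_i = (q >> i) & 1
        # PPi is 0 when Qi is 0, otherwise M shifted left i times.
        partial_products.append((m << i) if q_i else 0)
    return sum(partial_products), partial_products

product, pps = multiply_partial_products(0b1101, 0b1011)
print(f"{product:08b}", [f"{pp:b}" for pp in pps])
```

With M = 1101 and Q = 1011 the partial products are 13, 26, 0 and 104, which sum to 143, matching the worked example.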
Multiplication – Hardware V0a
• Array multiplier
• Stage i accumulates PPi (0 or M shifted i) depending on Qi
• Answer P comes out at bottom
• Slow! Big!
[Figure: 4x4 array multiplier; four rows of M3..M0 gated by Q0..Q3, top inputs 0 0 0 0, product bits P7..P0 at the bottom]
Multiplication – Hardware V0b
• At each stage, shift M left (x 2)
• Next bit of Q determines whether to add in shifted multiplicand
• Accumulate 2n-bit partial product at each stage
• Each stage identical: need only 1 stage in hardware (use multiple cycles)
[Figure: the same 4x4 array with each row of M3..M0 padded with zeros to full width, gated by Q0..Q3, product bits P7..P0 at the bottom]
Multiplication – Hardware V1
• M: 64b shift register, Q: 32b shift register, P: 64b register
• Initially, ensure high bits of M are zero (M63..M32 = 0)
• Multiplier = datapath + control
[Figure: M: Multiplicand (64 bits, Shift Left) and P: Product (64 bits, Write) feed a 64-bit ALU (Add); Q: Multiplier (32 bits, Shift Right) supplies Q0 to Control]
Multiplication – Hardware V1
Notes
• 1 clock cycle per bit (32 total)
• 0’s are left-shifted into M
  – Lower bits of P never change once formed
• Half of bits in M are always zero
  – 64-bit ALU is wasted
Observations lead to refinement:
• Right-shift P instead of left-shifting M
Multiplication Algorithm
• Russian Peasant Algorithm

PP = M;
P = 0;
while (Q != 0) {
    if (Q & 1) {        // Q is odd, i.e., bit 0 of Q is 1
        P = P + PP;     // accumulate partial product (PP) in P
    }
    PP = PP * 2;        // shift PP left 1 position
    Q = Q / 2;          // shift Q right 1 position
}

• Compare this to the hardware just presented!
  – Each loop iteration takes one clock cycle
  – How many cycles are required?
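To experiment with the cycle-count question, the loop can be run directly in Python. This is a sketch for unsigned values; the function name and the `cycles` counter are additions for illustration.

```python
def russian_peasant(m, q):
    """Unsigned shift-and-add multiply; returns (product, loop iterations)."""
    pp = m          # partial product, doubled every iteration
    p = 0           # accumulated product
    cycles = 0
    while q != 0:
        if q & 1:           # bit 0 of Q is 1
            p += pp
        pp <<= 1            # shift PP left 1 position
        q >>= 1             # shift Q right 1 position
        cycles += 1
    return p, cycles

print(russian_peasant(13, 11))
```

Note that this software loop stops as soon as Q becomes zero, so its iteration count depends on the position of the highest set bit of Q. Hardware with a fixed counter would instead run one cycle per multiplier bit.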
Multiplication – Hardware V2
• M: 32b register, Q: 32b shift register, P: 64b shift register
• Initially, P=0. Only high bits of P (63..32) affected by a write.
[Figure: M: Multiplicand (32 bits) and the high half of P: Product (64 bits, Shift Right, Write) feed a 32-bit ALU (Add); Q: Multiplier (32 bits, Shift Right) supplies Q0 to Control]
Multiplication – Hardware V2
• What’s really going on?
[Figure: 4x4 array of M3..M0 rows gated by Q0..Q3, top inputs 0 0 0 0, product bits P7..P0, redrawn to match the right-shifting product of V2]
Multiplication – Hardware V2
Notes
• 1 clock cycle per bit (32 total)
• Lower 32 bits of P are initially unused
  – Holds zero, but unused
  – Each cycle, 1 fewer unused bit
• 0’s are right-shifted into Q
  – Initially: 32 bits used in Q
  – Each step: 1 fewer bit needed in Q
  – At end: Q is destroyed
Observations lead to refinement:
• Use lower 32 bits of P to hold Q
Multiplication – Hardware V3
• M: 32b register, P: 64b shift register (lower half represents Q)
• Initially, P=Q. Only high bits of P (63..32) are changed on write.
[Figure: M: Multiplicand (32 bits) and the high half of P: Product (64 bits, Shift Right, Write) feed a 32-bit ALU (Add); the low half of P holds Q: Multiplier, and its bit Q0 drives Control]
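The V3 datapath can be modelled in a few lines. This is a sketch under assumptions: the function name and the `n` parameter are illustrative, operands are unsigned, and the carry out of the n-bit add is assumed to be shifted into the top of P (a Python integer holds that extra bit for us).

```python
def multiply_v3(m, q, n=32):
    """Unsigned n-bit multiply using one 2n-bit register P whose low half starts as Q."""
    mask = (1 << n) - 1
    m &= mask
    p = q & mask                       # initially P = Q (high half is zero)
    for _ in range(n):                 # one clock cycle per multiplier bit
        if p & 1:                      # Q0 is the current LSB of P
            high = (p >> n) + m        # add M into the high half only
            p = (high << n) | (p & mask)
        p >>= 1                        # shift P right; a used-up Q bit falls off
    return p

p = multiply_v3(13, 11, n=4)
print(f"{p:08b}")
```

After the loop, the high and low halves of `p` correspond to what MIPS calls HI and LO.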
Multiplication – Hardware V3
Notes
• P has two halves: high, low
MIPS multiply instruction MultU
• 32 regular MIPS registers
• 2 special MIPS registers: HI, LO
  – Why special? Need to right-shift contents
• HI, LO store results of MultU
Multiplication – Signed Numbers
• Gradeschool algorithm assumes unsigned numbers

Multiplicand M        1101   (13)
Multiplier   Q      x 1011   (11)
                      1101    <- 1 Q0  Partial Product 0
                     1101     <- 1 Q1  Partial Product 1
                    0000      <- 0 Q2  Partial Product 2
                   1101       <- 1 Q3  Partial Product 3
                  10001111   (143)

• Signed numbers?
  – Example above reads (-3) * (-5) = (-113), clearly wrong!
  – Requires some adjustments
Multiplication – Signed Numbers
Two Cases For Signed Multiply: P = M*Q
• Case A: M signed, Q unsigned or Q >= 0
  – Add using sign-extension of PP

Multiplicand M        1101   (–3)
Multiplier   Q      x 1011   (11)
                  11111101    <- 1 Q0  Partial Product 0
                  1111101     <- 1 Q1  Partial Product 1
                  000000      <- 0 Q2  Partial Product 2
                  11101       <- 1 Q3  Partial Product 3
                  11011111   (–33)
Multiplication – Signed Numbers
Two Cases For Signed Multiply: P = M*Q
• Case B: M signed, Q signed and Q < 0
  – One method:
    • Note that P = M*Q = (–M)*(–Q) = (~M+1)*(~Q+1), where ~ inverts every bit
    • Now (~Q+1) is positive, follow Case A
    • How to do this in hardware?
      – Use sign bits to modify M and Q, two extra adds for +1’s
  – Alternate method: Booth encoding
    • Look it up!
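Both cases can be tried out with a short sketch. The names, the `bits` width, and the `q_is_signed` flag are assumptions for illustration. M is sign-extended to the full product width before it is negated, a choice made here so that the most negative M is still handled; the slide does not specify this detail.

```python
def sign_extend(value, bits, width):
    """Treat the low `bits` of value as two's complement and widen to `width` bits."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value & ((1 << width) - 1)

def negate(value, width):
    """Two's complement negation: invert every bit and add 1."""
    return (~value + 1) & ((1 << width) - 1)

def signed_multiply(m, q, bits=4, q_is_signed=True):
    width = 2 * bits
    mask = (1 << width) - 1
    m_ext = sign_extend(m, bits, width)
    if q_is_signed and (q >> (bits - 1)) & 1:   # Case B: Q < 0
        m_ext = negate(m_ext, width)            # use (-M) * (-Q)
        q = negate(q, bits)
    product = 0
    for i in range(bits):                       # Case A: sign-extended partial products
        if (q >> i) & 1:
            product = (product + (m_ext << i)) & mask
    return product

print(f"{signed_multiply(0b1101, 0b1011, q_is_signed=False):08b}")
print(f"{signed_multiply(0b1101, 0b1011):08b}")
```

The first call reproduces the Case A example, (–3) x 11 = –33. The second treats Q = 1011 as –5 and yields +15.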
Divide?
Forget it!
Basically, do the long division thing over multiple clock cycles:
1) Subtract divisor
2) If >= 0, put “1” in answer, do next bit
3) If < 0, put “0” in answer, add divisor back, do next bit
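The three steps above describe what is usually called restoring division. A minimal sketch for unsigned operands follows; the function name and the `bits` parameter are illustrative, and each loop iteration stands in for one clock cycle.

```python
def restoring_divide(dividend, divisor, bits=32):
    """Unsigned long division, one quotient bit per iteration."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor            # 1) subtract divisor
        if remainder >= 0:
            quotient |= 1 << i          # 2) put "1" in answer
        else:
            remainder += divisor        # 3) put "0" in answer, add divisor back
    return quotient, remainder

print(restoring_divide(143, 11))
```

The extra add in step 3 is what makes division slower than multiplication in this simple form: a cycle may need both a subtract and a restore.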
Floating-Point
Integers and Beyond
• Integers perfectly accurate, no error
  – 32-bit integer: -2,147,483,648 to 2,147,483,647
  – Integers “overflow” or wrap on +1 from 2,147,483,647 to -2,147,483,648
• What about numbers with non-integral parts?
  – Large range in values, possibly large number of significant digits…
  – Rationals
    • 0.5 => can represent as ½
    • 1/3 => 0.33333333333333333333333…
    • 63/127 => 0.4960629921259842519685039370…
  – Irrationals
    • sqrt(2) = 1.41421356237309504880168872420…
    • Transcendentals: pi = 3.14159265359…, e = 2.71828183…
  – Scientific
    • NA = 6.022 x 10^23, Avogadro’s number (atoms in one mole)
    • G = 6.67259 x 10^-11, gravitational constant (F = -GMm/r^2)
    • c = 2.99792458 x 10^8, speed of light (m/s)
Floating Point Numbers
• How to represent non-integral numbers in binary?
  – Many possible ways
    • E.g., store (numerator, denominator) => doesn’t work for irrationals
  – All ways have limitations
    • Cannot represent all real numbers: infinite number of them, finite number of bits!
  – Need a standard on how to interpret the bits
    • E.g., two’s complement for signed integers
  – Benefits of a standard:
    • Software portability: same answer on any machine
    • Data portability: binary data can be sent directly, no conversions
    • Numerical environment: defines level of mathematical precision, allows research into error analysis, avoids future problems
Floating Point Numbers: IEEE754
• A floating-point standard: IEEE 754
  – Standard published in 1985
    • Started in 1977
    • Primarily work of William Kahan (U of T graduate)
  – Based largely on development of Intel 8087
    • A floating-point coprocessor designed to work with the 8086
  – Intel’s chip was a model to follow
    • 8087 was the first commercial product to implement IEEE 754 (in draft form, since the chip predates the 1985 publication)
    • Other companies implementing IEEE 754 looked at Intel’s chip
Binary Representation of Fractional Numbers
• Example
  101.011 (binary)
  = 1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 + 1*2^-2 + 1*2^-3
  = 4 + 0 + 1 + 0 + 1/4 + 1/8
  = 5.375
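The positional expansion can be reproduced mechanically. This is a small sketch; the function name is made up, and the input is assumed to be a string of 0/1 digits with at most one binary point.

```python
def binary_to_decimal(text):
    """Evaluate a binary string such as '101.011' digit by digit."""
    whole, _, frac = text.partition(".")
    value = 0.0
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power        # weights 1, 2, 4, ...
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power       # weights 1/2, 1/4, 1/8, ...
    return value

print(binary_to_decimal("101.011"))
```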
Binary Representation of Floating-Point Numbers
• Recall scientific notation: 6.022 x 10^23
  – 6.022 is the normalized significant part
  – 10 is the base or radix
  – 23 is the exponent
• Can do the same in binary:
  1.011 x 2^3 = 1.011 x 8 = 1011 (binary) = 11 (decimal)
• Negative numbers?
  – Need to remember the sign of the significant part
• Generally: (–1)^S x M x b^e
  Where:
  – S is sign (0 or 1)
  – M is significand
  – b is base/radix
  – e is exponent
IEEE 754 Binary Single-Precision Floating-Point Representation
(–1)^S x M x b^e
1.011 x 2^3: S = 0, M = 1.011, b = 2, e = 3
Encoding into bits:
– assume b=2 (binary!), no need to store/remember
– S one bit: 0
– M 24 bits: 1011 0000 0000 0000 0000 0000
  • If normalized, first (leftmost) digit of M is always a ‘1’, never a ‘0’
  • Don’t store the leading ‘1’; instead define M = 1.F and store F
– F 23 bits: 011 0000 0000 0000 0000 0000
– convert e=3 into binary (e may be negative!):
  • Use biased notation, called Excess-N
  • Excess 127 used here: define E = e+127 = 130
– E 8 bits: 1000 0010
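The encoding can be checked against a real single-precision value using Python's standard `struct` module. The helper name `float32_fields` is an illustrative choice; the example value 11.0 is the 1.011 x 2^3 from the slide.

```python
import struct

def float32_fields(x):
    """Split the IEEE 754 single-precision encoding of x into (S, E, F)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))   # raw 32-bit pattern
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, excess 127
    fraction = bits & 0x7FFFFF         # 23 bits, hidden leading 1 not stored
    return sign, exponent, fraction

s, e, f = float32_fields(11.0)
print(s, f"{e:08b}", f"{f:023b}")
```

For 11.0 this prints S = 0, E = 10000010 (130) and F = 011 followed by twenty zeros, matching the hand encoding.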
IEEE 754 Binary Floating-Point Representation
Representation of floating point numbers in IEEE 754 standard:

                                   sign S   exponent E   fraction F
Single precision (32 bits total)   1 bit    8 bits       23 bits
Double precision (64 bits total)   1 bit    11 bits      52 bits

• Exponent: excess-127 (single) or excess-1023 (double) binary integer
• Significand: normalized binary significand with hidden integer bit: M = 1.F
IEEE 754 Precision
• Single precision
  – About 7 decimal digits of accuracy; 9 decimal digits are enough to recover the exact binary value
• Double precision
  – About 16 decimal digits of accuracy; 17 decimal digits are enough to recover the exact binary value
• Storing floating-point numbers to disk? Two options:
  – A: write binary value (32 bits or 64 bits)
    • IEEE 754 standard allows us to interchange these values!
  – B: write value as decimal digits, e.g., in ASCII
    • Need to write 9 (or 17) decimal digits
    • Need to write sign, exponent as well
    • Reading back in: convert to binary, get same binary value as before
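Option B can be tried out directly. The sketch below assumes Python's `float` is an IEEE 754 double (true on mainstream platforms) and emulates single precision by packing through `struct`; the helper names are made up for illustration.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def round_trips64(x, digits):
    """Write x with `digits` significant decimal digits, read it back, compare."""
    return float(f"{x:.{digits}g}") == x

def round_trips32(x32, digits):
    """Same check for a single-precision value."""
    return to_float32(float(f"{x32:.{digits}g}")) == x32

x = 0.1 + 0.2
print(round_trips64(x, 16), round_trips64(x, 17))
print(round_trips32(to_float32(0.1), 9))
```

For this particular double, 16 digits are not enough to get the same bits back, while 17 are.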
IEEE 754 Accuracy
• Not all real values can be represented
  – Inf. # of values between ½ and ¼
  – Inf. # of values between ½ and 3/8
  – Inf. # of values between ½ and 7/16, etc.
• All floating-point numbers are approximations
  – Calculations with approximations introduce errors
  – Reduce size of errors by proper rounding
  – 754: keep extra bits of precision during calculations for rounding
  – Cannot solve all problems: algorithm numerical stability a must!
  – You get same problems on every machine using IEEE 754
IEEE 754 Range
• See <float.h> on a Unix system (FLT_MIN, FLT_MAX, DBL_MIN, DBL_MAX; <limits.h> holds the integer limits)
• Single precision:
  – Minimum (smallest positive normal): 1.175494351E-38 (FLT_MIN)
  – Maximum: 3.402823466E+38 (FLT_MAX)
• Double precision:
  – Minimum (smallest positive normal): 2.2250738585072014E-308 (DBL_MIN)
  – Maximum: 1.7976931348623157E+308 (DBL_MAX)
IEEE 754 Range
• What happens on overflow?
  – Depends, 754 standard defines some special cases
  – Normally, you get value called Infinity
• Tiny numbers? (smaller than smallest normal)
  – Goes to zero? Called underflow
    • This is rather drastic
  – 754 standard defines denormalized numbers
    • Underflow occurs gradually…
    • Underflow/denormals make hardware hard to design
      – Not all chips support it
    • Often use software interrupts to handle denormals
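Overflow to Infinity and gradual underflow can be observed from Python, assuming its `float` is an IEEE 754 double with denormal support (the usual case on desktop and server CPUs). The variable names are illustrative.

```python
import math
import sys

largest = sys.float_info.max            # DBL_MAX
smallest_normal = sys.float_info.min    # DBL_MIN, smallest positive normal

overflowed = largest * 2                # too big: becomes Infinity
denormal = smallest_normal / 2          # below DBL_MIN but still nonzero
smallest_denormal = 5e-324              # smallest positive double
underflowed = smallest_denormal / 2     # nothing smaller exists: rounds to 0.0

print(overflowed, denormal, underflowed)
```

Without denormals, `smallest_normal / 2` would have to jump straight to zero. With them, precision is lost one bit at a time until zero is finally reached.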
IEEE 754 Special Cases
• Infinity, -Infinity
  – Caused by overflow
  – Caused by 1/0 (note: no error produced)
• +0, -0
  – This is claimed to be useful!
• NaN
  – “Not a Number”
  – 0/0
  – Infinity/Infinity, 0*Infinity, Infinity–Infinity, etc.
  – Sqrt(-number)
  – Infectious: NaN + number = NaN, NaN x number = NaN, etc.
• Comparisons (<, >, =, etc.) with Infinity? NaN?
  – Cases all defined by the standard
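A few of these special cases can be checked from Python, assuming its `float` is an IEEE 754 double. One caveat, noted in the comments: Python itself raises exceptions for `1.0 / 0.0` and `math.sqrt(-1.0)` instead of returning Infinity or NaN, so those two cases are not exercised here.

```python
import math

inf = math.inf
nan = math.nan

# Python raises ZeroDivisionError for 1.0 / 0.0 and ValueError for
# math.sqrt(-1.0), so only the cases below are evaluated.
checks = {
    "inf - inf is NaN": math.isnan(inf - inf),
    "0 * inf is NaN": math.isnan(0.0 * inf),
    "inf / inf is NaN": math.isnan(inf / inf),
    "NaN + number is NaN": math.isnan(nan + 1.0),
    "NaN x number is NaN": math.isnan(nan * 2.0),
    "NaN is not equal to itself": nan != nan,
    "NaN < 1 is false": not (nan < 1.0),
    "+0 equals -0": 0.0 == -0.0,
    "-0 keeps its sign": math.copysign(1.0, -0.0) == -1.0,
    "inf exceeds every finite value": inf > 1.7976931348623157e308,
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```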
IEEE 754 Hardware
1. Compare exponents
2. Shift smaller number right
3. Add
4. Normalize
5. Round
[Figure: two operands, each with Sign, Exponent, Fraction fields, feed a Big ALU that produces a result with Sign, Exponent, Fraction fields]
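The five steps can be sketched for the simplest case: adding two positive, normalized numbers. Everything here is an assumption made for illustration: the `(e, m)` representation (value = m x 2^(e - (bits - 1)) with the leading 1 included in m), the three extra low-order bits kept for rounding, and round-to-nearest-even. Signs, zeros, denormals, Infinity and NaN are not handled.

```python
GUARD = 3   # extra low-order bits kept for rounding (assumed: guard, round, sticky)

def fp_add(e1, m1, e2, m2, bits=24):
    """Add two positive normalized values; returns (exponent, significand)."""
    # 1. Compare exponents: make operand 1 the larger one.
    if e1 < e2:
        e1, m1, e2, m2 = e2, m2, e1, m1
    shift = e1 - e2
    # 2. Shift smaller number right, remembering whether any 1s fell off.
    a = m1 << GUARD
    b = m2 << GUARD
    sticky = 1 if b & ((1 << shift) - 1) else 0
    b = (b >> shift) | sticky
    # 3. Add.
    total = a + b
    e = e1
    # 4. Normalize: a carry out means shift right once and bump the exponent.
    if total >> (bits + GUARD):
        total = (total >> 1) | (total & 1)
        e += 1
    # 5. Round to nearest, ties to even.
    low = total & ((1 << GUARD) - 1)
    m = total >> GUARD
    half = 1 << (GUARD - 1)
    if low > half or (low == half and m & 1):
        m += 1
        if m >> bits:          # rounding overflowed the significand
            m >>= 1
            e += 1
    return e, m

# 1.011 x 2^3 (11) + 1.000 x 2^0 (1) with 4-bit significands
print(fp_add(3, 0b1011, 0, 0b1000, bits=4))
```

With 4-bit significands, 11 + 1 gives 1.100 x 2^3 (12), and 15 + 0.5 is a tie that rounds up to 1.000 x 2^4 (16).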
Shifters
• Left as an exercise...
BIG PICTURE
• This course is about computer architecture
• Why care about ALU design details?
  – Our goal is performance
  – Some ALU designs may be faster or slower
• You must understand the impact they have on
  – Clock frequency (cycle time)
  – Instruction set design
  – More advanced things (e.g., impact of multiple ALUs)
  – Ultimately, performance!