Computation for Physics (計算物理概論)
Post on 23-Feb-2016
Digital Data Representation
Numeral systems
Base  Name         Digits
2     Binary       0, 1
8     Octal        0, 1, 2, 3, 4, 5, 6, 7
10    Decimal      0, 1, 2, 3, 4, 5, 6, 7, 8, 9
16    Hexadecimal  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A=10, B=11, C=12, D=13, E=14, F=15
$(a_n a_{n-1} \cdots a_1 a_0 \,.\, c_1 c_2 c_3 \cdots)_b = \sum_{k=0}^{n} a_k b^{k} + \sum_{k=1}^{\infty} c_k b^{-k}$
$(001000101111)_2 = (1057)_8 = (22F)_{16} = (559)_{10}$
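The conversion above can be checked with a short script. This is a minimal sketch; the helper name `to_base` is made up for illustration and only handles non-negative integers in bases 2 to 16.

```python
DIGITS = "0123456789ABCDEF"

def to_base(n: int, base: int) -> str:
    """Write a non-negative integer n in the given base (2..16)."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)   # peel off the least significant digit
        out.append(DIGITS[r])
    return "".join(reversed(out))

value = int("001000101111", 2)   # parse the binary string from the slide
print(value, to_base(value, 8), to_base(value, 16))
```

Python's built-in `int(text, base)` does the parsing direction; the loop does the reverse by repeated division.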
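Bit = binary digit
• A bit takes the value 0 or 1
• Abbreviated b or bit
• Byte = eight bits, abbreviated B
• K = kilo = 1000 = 10^3, or 1024 = 2^10
• M = mega = 10^6, or 1024^2 = 2^20
• G = giga = 10^9, or 1024^3 = 2^30
• T = tera = 10^12, or 1024^4 = 2^40
• KB = kilobyte, MB = megabyte, GB = gigabyte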
Integer
Unsigned integer
• N bits map to the 2^N possible values 0, 1, …, 2^N - 1
• Unsigned byte = 8 bits: 0, …, 255
• Unsigned short integer = 16 bits: 0, …, 65535
• Unsigned integer = 32 bits: 0, …, 4,294,967,295
• Unsigned long integer = 64 bits: 0, …, 18,446,744,073,709,551,615
Most and least significant bit
1 0 0 1 1 0 1 1
An 8-bit number: the leftmost bit is the most significant bit (msb) and the rightmost bit is the least significant bit (lsb).
Signed integer
How can negative numbers be represented in N bits without a “-” sign?
• Sign-and-magnitude method
  • Use the msb as the sign and the remaining N-1 bits as the magnitude
  • Two representations for zero
• One’s complement
  • Use the bitwise complement as the arithmetic negative
  • Two representations for zero
• Two’s complement
  • Use the two’s complement of the absolute value as the arithmetic negative
  • One representation for zero
Sign-and-magnitude (first bit = sign, remaining 7 bits = magnitude)
0 1 1 1 1 1 1 1 = +127
0 0 0 0 0 0 0 0 = +0
1 0 0 0 0 0 0 0 = -0
1 1 1 1 1 1 1 1 = -127
Sign-and-magnitude
• Addition of x and y
  • If x and y have the same sign: result = sgn(x) * (|x| + |y|)
  • If x and y have different signs:
    • If |x| > |y|: result = sgn(x) * (|x| - |y|)
    • If |y| > |x|: result = sgn(y) * (|y| - |x|)
• Subtraction of y from x = addition of x and (-y)
• If the magnitude overflows, return an error message
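The addition rule above can be sketched for 8-bit patterns as follows. The function names (`sm_decode`, `sm_encode`, `sm_add`) are hypothetical, and the choice to let equal magnitudes of opposite sign keep the sign of x (which can yield -0) is an assumption the slides do not specify.

```python
N = 8
MAX_MAG = 2 ** (N - 1) - 1          # 127 for 8 bits

def sm_decode(bits: int):
    """Split an N-bit pattern into (sign, magnitude)."""
    sign = -1 if bits >> (N - 1) else 1
    return sign, bits & MAX_MAG

def sm_encode(sign: int, mag: int) -> int:
    """Build an N-bit pattern; fail if the magnitude does not fit."""
    if mag > MAX_MAG:
        raise OverflowError("magnitude does not fit in N-1 bits")
    return ((1 if sign < 0 else 0) << (N - 1)) | mag

def sm_add(x: int, y: int) -> int:
    sx, mx = sm_decode(x)
    sy, my = sm_decode(y)
    if sx == sy:                      # same sign: add magnitudes
        return sm_encode(sx, mx + my)
    if mx >= my:                      # different signs: subtract the smaller
        return sm_encode(sx, mx - my)
    return sm_encode(sy, my - mx)

print(format(sm_add(0b00010110, 0b10000011), "08b"))  # +22 plus -3
```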
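One’s complement
0 1 1 1 1 1 1 1 = +127
0 0 0 0 0 0 0 0 = +0
1 1 1 1 1 1 1 1 = -0
1 0 0 0 0 0 0 0 = -127
$x = (b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0)_2 \Leftrightarrow -x = (\bar{b}_7 \bar{b}_6 \bar{b}_5 \bar{b}_4 \bar{b}_3 \bar{b}_2 \bar{b}_1 \bar{b}_0)_2$, where $\bar{b} = 1 \Leftrightarrow b = 0$ and $\bar{b} = 0 \Leftrightarrow b = 1$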
One’s complement (4 bits)
Decimal  Binary    Decimal  Binary
+0       0000      -0       1111
+1       0001      -1       1110
+2       0010      -2       1101
+3       0011      -3       1100
+4       0100      -4       1011
+5       0101      -5       1010
+6       0110      -6       1001
+7       0111      -7       1000
±8 cannot be represented in 4-bit one’s complement: 1000 already encodes -7 and 0111 encodes +7.
One’s complement: addition
            1 1        (carries)
    0 0 0 1 0 1 1 0    +22
  + 0 0 0 0 0 0 1 1    +3
    0 0 0 1 1 0 0 1    +25
One’s complement: end-around carry
  1 1     1 1          (carries)
    0 1 0 0 0 1 1 0    +70
  + 1 1 1 0 1 1 0 0    -19
  1 0 0 1 1 0 0 1 0    +50, with a carry out of the msb
  + 0 0 0 0 0 0 0 1    +1 (the carry is added back)
    0 0 1 1 0 0 1 1    +51
One’s complement: negative zero
  1 1 1 1 1 1 1        (carries)
    0 0 0 1 0 1 1 0    +22
  + 1 1 1 1 1 1 1 1    -0
  1 0 0 0 1 0 1 0 1    +21, with a carry out of the msb
  + 0 0 0 0 0 0 0 1    +1 (the carry is added back)
    0 0 0 1 0 1 1 0    +22
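The end-around carry in the examples above can be reproduced with a few bit operations. This is a minimal 8-bit sketch; the helper names are made up for illustration and overflow of the representable range is not checked.

```python
N = 8
MASK = (1 << N) - 1                 # 0b11111111

def ones_encode(v: int) -> int:
    """Encode an integer in -127..127 as an 8-bit one's complement pattern."""
    return v if v >= 0 else (~(-v)) & MASK

def ones_decode(bits: int) -> int:
    """Decode an 8-bit one's complement pattern (both zeros decode to 0)."""
    return bits if bits >> (N - 1) == 0 else -((~bits) & MASK)

def ones_add(a: int, b: int) -> int:
    s = a + b
    if s > MASK:                    # carry out of the msb
        s = (s & MASK) + 1          # end-around carry: add it back at the lsb
    return s & MASK

r = ones_add(ones_encode(70), ones_encode(-19))
print(format(r, "08b"), ones_decode(r))
```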
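Two’s complement
0 1 1 1 1 1 1 1 = +127
0 0 0 0 0 0 0 0 = 0 (the only representation of zero)
1 0 0 0 0 0 0 1 = -127
1 0 0 0 0 0 0 0 = -128
$x = (b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0)_2 \Leftrightarrow -x = (\bar{b}_7 \bar{b}_6 \bar{b}_5 \bar{b}_4 \bar{b}_3 \bar{b}_2 \bar{b}_1 \bar{b}_0)_2 + 1$, where $\bar{b} = 1 \Leftrightarrow b = 0$ and $\bar{b} = 0 \Leftrightarrow b = 1$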
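Two’s complement (4 bits)
Decimal  Binary    Decimal  Binary
+0       0000      -0       0000 (same pattern as +0)
+1       0001      -1       1111
+2       0010      -2       1110
+3       0011      -3       1101
+4       0100      -4       1100
+5       0101      -5       1011
+6       0110      -6       1010
+7       0111      -7       1001
                   -8       1000
+8 cannot be represented in 4-bit two’s complement: the pattern 1000 encodes -8.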
Two’s complement: addition (positive + positive)
        1              (carries)
    0 0 0 1 0 0 0 0    +16
  + 0 0 0 1 1 0 0 0    +24
    0 0 1 0 1 0 0 0    +40
Two’s complement: addition (negative + positive)
  1 1 1 1              (carries)
    1 1 1 1 0 0 0 0    -16
  + 0 0 0 1 1 0 0 0    +24
  1 0 0 0 0 1 0 0 0    +8 (ignore the carry out of the msb)
Two’s complement: addition (negative + positive)
    1 1 1 0 1 0 0 0    -24
  + 0 0 0 1 0 0 0 0    +16
    1 1 1 1 1 0 0 0    -8
Two’s complement: addition (negative + negative)
  1 1 1                (carries)
    1 1 1 1 0 0 0 0    -16
  + 1 1 1 0 1 0 0 0    -24
  1 1 1 0 1 1 0 0 0    -40 (ignore the carry out of the msb)
Two’s complement: addition with overflow
    1 1 1 1            (carries)
    0 1 1 1 1 0 0 0    +120
  + 0 0 0 0 1 0 0 1    +9
    1 0 0 0 0 0 0 1    -127
Overflow: return an error message. Positive + positive = negative!
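The examples above can be replayed in code. The sketch below assumes 8-bit words and uses the common sign-bit test for overflow (both operands have a sign different from the result); the helper names are hypothetical.

```python
N = 8
MASK = (1 << N) - 1
SIGN = 1 << (N - 1)

def twos_encode(v: int) -> int:
    """Encode an integer in -128..127 as an 8-bit two's complement pattern."""
    return v & MASK

def twos_decode(bits: int) -> int:
    return bits - (1 << N) if bits & SIGN else bits

def twos_add(a: int, b: int):
    """Return (result pattern, overflow flag); the carry out of the msb is dropped."""
    s = (a + b) & MASK
    # Overflow happens only when both operands share a sign that the result lacks.
    overflow = bool((a ^ s) & (b ^ s) & SIGN)
    return s, overflow

for x, y in [(16, 24), (-16, 24), (-24, 16), (-16, -24), (120, 9)]:
    s, ovf = twos_add(twos_encode(x), twos_encode(y))
    print(x, y, format(s, "08b"), twos_decode(s), "overflow" if ovf else "")
```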
Logic gates
• AND, NAND
• OR, NOR
• NOT, XOR
• XNOR
AND gate
Input    Output
A  B     A AND B
0  0     0
0  1     0
1  0     0
1  1     1
OR gate
Input    Output
A  B     A OR B
0  0     0
0  1     1
1  0     1
1  1     1
NOT gate
Input    Output
A        NOT A
0        1
1        0
NAND gate
Input    Output
A  B     A NAND B
0  0     1
0  1     1
1  0     1
1  1     0
NOR gate
Input    Output
A  B     A NOR B
0  0     1
0  1     0
1  0     0
1  1     0
XOR gate
Input    Output
A  B     A XOR B
0  0     0
0  1     1
1  0     1
1  1     0
XNOR gate
Input    Output
A  B     A XNOR B
0  0     1
0  1     0
1  0     0
1  1     1
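The truth tables can be regenerated from one-line definitions on the bits 0 and 1. This is a small sketch; the dictionary `GATES` and the function `NOT` are illustrative names, not part of the slides.

```python
from itertools import product

def NOT(a: int) -> int:
    return 1 - a

# Two-input gates expressed with Python's bitwise operators on 0/1.
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: NOT(a & b),
    "NOR":  lambda a, b: NOT(a | b),
    "XOR":  lambda a, b: a ^ b,
    "XNOR": lambda a, b: NOT(a ^ b),
}

for name, gate in GATES.items():
    # Inputs are enumerated in the order 00, 01, 10, 11.
    outputs = [gate(a, b) for a, b in product((0, 1), repeat=2)]
    print(f"{name:5s}", outputs)
```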
Adder
• Half adder
• Full adder
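The slide only names the two circuits, so the following is a sketch of the standard textbook construction rather than a transcription of the original figures: a half adder is an XOR (sum) plus an AND (carry), and a full adder chains two half adders with an OR. The `ripple_add` helper is a hypothetical addition that chains full adders across 8 bits.

```python
def half_adder(a: int, b: int):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b

def full_adder(a: int, b: int, carry_in: int):
    """Return (sum, carry_out) for two input bits plus an incoming carry."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, carry_in)
    return s, c1 | c2

def ripple_add(x: int, y: int, n: int = 8):
    """Add two n-bit patterns bit by bit; return (n-bit sum, final carry)."""
    carry, total = 0, 0
    for i in range(n):                       # from lsb to msb
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= bit << i
    return total, carry

print(ripple_add(0b00010110, 0b00000011))    # 22 + 3
```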
Real number
Radix point
• Base-10 notation: decimal point
• Base-2 notation: binary point
$(a_n a_{n-1} \cdots a_1 a_0 \,.\, c_1 c_2 c_3 \cdots)_b = \sum_{k=0}^{n} a_k b^{k} + \sum_{k=1}^{\infty} c_k b^{-k}$
Scientific notation
$x = a \times 10^{b}, \quad 1 \le |a| < 10$
Floating-point representation
+152853.5047 = +1528535047 × 10^-4
• Significand (mantissa): 1528535047
• Exponent: -4
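As a quick check of the significand/exponent split, Python's `decimal` module stores a decimal number exactly in this form. This is only an illustration in base 10; hardware floating point uses base 2.

```python
from decimal import Decimal

# as_tuple() exposes the sign, the significand digits and the base-10 exponent.
sign, digits, exponent = Decimal("152853.5047").as_tuple()
significand = int("".join(map(str, digits)))

print(sign, significand, exponent)
```

Here `sign` is 0 for a positive number, and the value equals significand × 10^exponent.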
IEEE 754
• IEEE: Institute of Electrical and Electronics Engineers
• IEEE 754: IEEE Standard for Floating-Point Arithmetic
  • Arithmetic formats: sets of binary and decimal floating-point data, which consist of finite numbers (including signed zeros and subnormal numbers), infinities, and special "not a number" values (NaNs)
  • Interchange formats: encodings (bit strings) that may be used to exchange floating-point data in an efficient and compact form
  • Rounding rules: properties to be satisfied when rounding numbers during arithmetic and conversions
  • Operations: arithmetic and other operations on arithmetic formats
  • Exception handling: indications of exceptional conditions (such as division by zero, overflow, etc.)
IEEE 754 binary16: half precision
• Sign bit: 1 bit
• Exponent width: 5 bits
• Significand precision: 11 bits (10 explicitly stored)
• Exponent encoding: offset = 15, Emin = -14, Emax = 15
• Minimum positive normal value = 2^-14 ≈ 6.10 × 10^-5
• Maximum positive value = (2 - 2^-10) × 2^15 = 65504
• Minimum positive subnormal value = 2^-24 ≈ 5.96 × 10^-8
IEEE 754 binary32: single precision
• Sign bit: 1 bit
• Exponent width: 8 bits
• Significand precision: 24 bits (23 explicitly stored)
• Exponent encoding: offset = 127, Emin = -126, Emax = 127
• Minimum positive normal value = 2^-126 ≈ 1.18 × 10^-38
• Maximum positive value = (2 - 2^-23) × 2^127 ≈ 3.4 × 10^38
• Minimum positive subnormal value = 2^-149 ≈ 1.4 × 10^-45
IEEE 754 binary64: double precision
• Sign bit: 1 bit
• Exponent width: 11 bits
• Significand precision: 53 bits (52 explicitly stored)
• Exponent encoding: offset = 1023, Emin = -1022, Emax = 1023
• Minimum positive normal value = 2^-1022 ≈ 2.2 × 10^-308
• Maximum positive value = (2 - 2^-52) × 2^1023 ≈ 1.8 × 10^308
• Minimum positive subnormal value = 2^-1074 ≈ 4.9 × 10^-324
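The field layouts can be inspected directly with the standard `struct` module, which packs a Python float into the half, single, or double interchange format (format codes `e`, `f`, `d`). The `decode` helper and the `FORMATS` table below are illustrative names, assuming a Python version that supports the `e` code (3.6 or later).

```python
import struct
import sys

# name: (struct format code, exponent bits, stored fraction bits)
FORMATS = {
    "binary16": ("e", 5, 10),
    "binary32": ("f", 8, 23),
    "binary64": ("d", 11, 52),
}

def decode(x: float, name: str):
    """Return the raw (sign, biased exponent, fraction) fields of x."""
    code, exp_bits, frac_bits = FORMATS[name]
    raw = int.from_bytes(struct.pack(">" + code, x), "big")
    sign = raw >> (exp_bits + frac_bits)
    exponent = (raw >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = raw & ((1 << frac_bits) - 1)
    return sign, exponent, fraction

print(decode(1.0, "binary32"))       # biased exponent equals the offset 127
print(decode(65504.0, "binary16"))   # largest finite half-precision value
print(sys.float_info.max, sys.float_info.min)   # binary64 limits
```

The biased exponent printed here is the true exponent plus the offset listed on the slides.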
Representation error
Notation    Representation              Approximation        Error
1/7         0.142857142857142857…       0.142857             ≈ 1.43 × 10^-7
ln(2)       0.69314718055994530941…     0.693147             ≈ 1.81 × 10^-7
log10(2)    0.30102999566398119521…     0.3010               ≈ 3.00 × 10^-5
∛2          1.25992104989487316476…     1.25992              ≈ 1.05 × 10^-6
√2          1.41421356237309504880…     1.41421              ≈ 3.56 × 10^-6
e           2.71828182845904523536…     2.718281828459045    ≈ 2.35 × 10^-16
π           3.14159265358979323846…     3.141592653589793    ≈ 2.38 × 10^-16
Rounding errors
• In the base-10 system
  • 1/2 = 0.5
  • 1/3 = 0.333333… (does not terminate)
• In the base-2 system
  • A fraction in lowest terms terminates iff its denominator is a power of 2 (for example 1/2, 3/16)
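The power-of-two rule is easy to test with exact rational arithmetic. In this sketch, `terminates_in_base2` is a made-up helper name; `Fraction(0.1)` shows the exact binary value that a double actually stores for 0.1.

```python
from fractions import Fraction

def terminates_in_base2(q: Fraction) -> bool:
    """True if q (already in lowest terms) has a finite binary expansion."""
    d = q.denominator
    return d & (d - 1) == 0        # exactly one bit set: a power of two

for q in (Fraction(1, 2), Fraction(3, 16), Fraction(1, 3), Fraction(1, 10)):
    print(q, terminates_in_base2(q))

# 1/10 does not terminate in base 2, so the stored double is only close to 0.1.
stored = Fraction(0.1)
print(stored, stored == Fraction(1, 10))
```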
IEEE rounding modes
• Truncation (round toward zero):
  • Keep the desired number of digits unchanged and remove all less-significant digits.
  • 0.142857 ≈ 0.142 (all digits less significant than the third are removed).
• Round to nearest, ties to even (default):
  • Round to the nearest valid representation. Break ties by rounding to an even digit.
  • +23.5 → +24, +24.5 → +24; -23.5 → -24, -24.5 → -24 (symmetry between +/- numbers)
• Round to nearest, ties away from zero:
  • Round to the nearest valid representation. Break ties by rounding away from zero.
  • +23.5 → +24, +24.5 → +25; -23.5 → -24, -24.5 → -25 (symmetry between +/- numbers)
• Round to -∞:
  • Round to a value less than or equal to the original number. If the original number is positive, this is equivalent to truncation.
• Round to +∞:
  • Round to a value greater than or equal to the original number. If the original number is negative, this is equivalent to truncation.
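The five modes can be tried out in base 10 with Python's `decimal` module, whose rounding constants mirror them. This is an illustration only: it rounds decimal digits, whereas binary hardware rounds bits. The helper `round_to_int` is a hypothetical name.

```python
from decimal import (Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP,
                     ROUND_FLOOR, ROUND_CEILING)

MODES = {
    "truncation (toward zero)": ROUND_DOWN,
    "nearest, ties to even":    ROUND_HALF_EVEN,
    "nearest, ties away":       ROUND_HALF_UP,   # decimal's HALF_UP breaks ties away from zero
    "toward -infinity":         ROUND_FLOOR,
    "toward +infinity":         ROUND_CEILING,
}

def round_to_int(text: str, mode: str) -> Decimal:
    """Round a decimal string to an integer under the given mode."""
    return Decimal(text).quantize(Decimal("1"), rounding=mode)

for name, mode in MODES.items():
    row = [str(round_to_int(t, mode)) for t in ("23.5", "24.5", "-23.5", "-24.5")]
    print(f"{name:26s}", row)

# Truncation to three decimal places, as in the 0.142857 example.
print(Decimal("0.142857").quantize(Decimal("0.001"), rounding=ROUND_DOWN))
```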
IEEE 754 special values
• Positive infinity
• Negative infinity
• Positive zero (= ordinary zero)
• Negative zero: -0
• NaNs: “Not a number” values
• Subnormal numbers
Signed zero
• Sign = 0 (+) or 1 (-)
• Exponent = 0
• Significand (mantissa) = 0
• Arithmetic: (±0) × (±∞) = NaN, (±0) / (±0) = NaN
Signed infinity
• Sign = 0 (+) or 1 (-)
• Exponent = maximum value (all ones)
  • 11111_2 (binary16)
  • FF_16 (binary32)
  • 7FF_16 (binary64)
• Significand (mantissa) = 0
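NaNs
• Two kinds: quiet and signaling
• Exponent = maximum value (all ones)
  • 11111_2 (binary16)
  • FF_16 (binary32)
  • 7FF_16 (binary64)
• Significand (mantissa) ≠ 0
• Creation
  • 0/0 = NaN, ∞/∞ = NaN, 0 × ∞ = NaN
  • ∞ - ∞ = NaN
  • 0^0, 1^∞, ∞^0 = 1 or NaN (depending on the power function used)
  • Square root or logarithm of a negative number
  • Inverse sine or cosine of a number with absolute value greater than 1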
Subnormal numbers
• Sign = 0 (+) or 1 (-)
• Exponent = 0 (all zeros)
  • 00000_2 (binary16)
  • 00_16 (binary32)
  • 000_16 (binary64)
• Significand (mantissa) ≠ 0
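The special values above are all reachable from ordinary Python floats, which are binary64 on common platforms (an assumption this sketch relies on). Note that Python raises `ZeroDivisionError` for float division by zero instead of returning an infinity, so the example builds infinities with `float("inf")`.

```python
import math
import sys

inf = float("inf")
nan = float("nan")
neg_zero = -0.0

# Signed zero: compares equal to +0.0 but keeps its sign bit.
print(neg_zero == 0.0, math.copysign(1.0, neg_zero))

# NaN creation: inf - inf and 0 * inf are invalid operations.
print(inf - inf, 0.0 * inf, nan == nan)

# Subnormals fill the gap between 0 and the smallest normal number.
smallest_normal = sys.float_info.min      # 2**-1022
subnormal = smallest_normal / 2           # still positive, but subnormal
smallest_subnormal = 5e-324               # 2**-1074
print(subnormal, smallest_subnormal, smallest_subnormal / 2)
```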
Floating-point addition
• x = 123456.7 = 1.234567 × 10^5
• y = 101.7654 = 1.017654 × 10^2 = 0.001017654 × 10^5
• x + y = (1.234567 + 0.001017654) × 10^5
• x + y = 1.235584654 × 10^5
• x + y ≈ 1.235585 × 10^5 (rounded to 7 significant digits)
• Round-off error!
Floating-point addition of a small number
• x = 1.234567 × 10^5
• y = 9.876543 × 10^-3 = 0.00000009876543 × 10^5
• x + y = (1.234567 + 0.00000009876543) × 10^5
• x + y = 1.23456709876543 × 10^5
• x + y ≈ 1.234567 × 10^5 = x
• Round-off error!
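Both addition examples can be replayed with a 7-digit decimal context, which is the precision the slides appear to assume. The variable names are illustrative; real hardware works in base 2, so this is a model of the effect rather than of the actual bit-level arithmetic.

```python
from decimal import Context, Decimal

ctx = Context(prec=7)                # 7 significant decimal digits, ties to even

x = Decimal("123456.7")
y = Decimal("101.7654")
small = Decimal("0.009876543")

total = ctx.add(x, y)                # exact sum 123558.4654, then rounded
absorbed = ctx.add(x, small)         # the small addend is lost entirely

print(total, absorbed, absorbed == x)
```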
Floating-point subtraction of two similar numbers
• x = 123457.1467 ≈ 1.234571 × 10^5
• y = 123456.659 ≈ 1.234567 × 10^5
• Exact: x - y = 0.4877 = 4.877000 × 10^-1
• Rounded: (1.234571 - 1.234567) × 10^5 = 0.000004 × 10^5 = 4.000000 × 10^-1
• 4.877000 × 10^-1 vs. 4.000000 × 10^-1
• Roughly 20% round-off error!
Floating-point multiplication
• x = 4.734612 × 10^3
• y = 5.417242 × 10^5
• x*y = 25.648538980104 × 10^8
• x*y ≈ 25.64854 × 10^8 = 2.564854 × 10^9
• Round-off error!
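The cancellation and multiplication examples can be checked the same way, again assuming a 7-digit decimal model of the arithmetic. The operands are first rounded to 7 digits, as they would be when stored, and the exact difference is computed at Python's default 28-digit precision for comparison.

```python
from decimal import Context, Decimal

ctx = Context(prec=7)

# Subtraction of two nearby numbers: most stored digits cancel.
x = ctx.create_decimal("123457.1467")      # stored as 123457.1
y = ctx.create_decimal("123456.659")       # stored as 123456.7
diff = ctx.subtract(x, y)
exact = Decimal("123457.1467") - Decimal("123456.659")
rel_err = abs(diff - exact) / exact
print(diff, exact, rel_err)

# Multiplication: the exact product has more digits than can be kept.
product = ctx.multiply(Decimal("4.734612E+3"), Decimal("5.417242E+5"))
print(product)
```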
Accuracy problems
• Mathematically, 0.1 × 0.1 = 0.01
• In single precision, 0.1 ≈ 1.10011001100110011001101_2 × 2^-4
• Stored value: x = 0.100000001490116119384765625
• Exact x*x = 0.010000000298023226097399174250313080847263336181640625 ≠ 0.01
• x*x ≈ 0.010000000707805156707763671875 (rounded to single)
• 0.01 ≈ 0.009999999776482582092285156250 (single) ≠ x*x
• tan(π/2) ≈ 16331239353195370.0 (double) or -22877332.0 (single)
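The single-precision figures can be reproduced without extra libraries by round-tripping a Python float through the 4-byte `f` format of `struct`. The helper name `to_single` is made up for this sketch, and `Decimal(value)` is used only to print the exact stored value.

```python
import math
import struct
from decimal import Decimal

def to_single(v: float) -> float:
    """Round a double to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

x = to_single(0.1)
square = to_single(x * x)            # x*x is exact in double, then rounded to single
hundredth = to_single(0.01)

print(Decimal(x))
print(Decimal(square))
print(Decimal(hundredth))
print(square == hundredth)

# pi/2 is not exactly representable, so tan returns a huge finite value.
print(math.tan(math.pi / 2))
```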
Arithmetic overflow
Arithmetic underflow
(a+b)+c ≠ a+(b+c)
• a = 1234.567, b = 45.67834, c = 0.0004
• (a+b) = 1280.24534 ≈ 1280.245
• (a+b)+c = 1280.245 + 0.0004 = 1280.2454 ≈ 1280.245
• (b+c) = 45.67874
• a+(b+c) = 1280.24574 ≈ 1280.246
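The loss of associativity can be reproduced with a 7-digit decimal context, the precision these numbers appear to assume. Names such as `left` and `right` are illustrative.

```python
from decimal import Context, Decimal

ctx = Context(prec=7)
a, b, c = Decimal("1234.567"), Decimal("45.67834"), Decimal("0.0004")

left = ctx.add(ctx.add(a, b), c)     # (a+b)+c: c is lost after a+b is rounded
right = ctx.add(a, ctx.add(b, c))    # a+(b+c): c survives inside b+c

print(left, right, left == right)
```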
(a+b)*c ≠ a*c + b*c
• a = 1234.567, b = 1.234567, c = 3.333333
• (a+b) = 1235.802
• (a+b)*c = 1235.802 × 3.333333 ≈ 4119.340
• a*c = 1234.567 × 3.333333 ≈ 4115.223
• b*c = 1.234567 × 3.333333 ≈ 4.115223
• a*c + b*c = 4115.223 + 4.115223 ≈ 4119.338 ≠ 4119.340
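The same 7-digit decimal model (an assumption about the precision used on the slide) shows the distributive law failing, because each intermediate result is rounded separately.

```python
from decimal import Context, Decimal

ctx = Context(prec=7)
a, b, c = Decimal("1234.567"), Decimal("1.234567"), Decimal("3.333333")

lhs = ctx.multiply(ctx.add(a, b), c)                    # (a+b)*c
rhs = ctx.add(ctx.multiply(a, c), ctx.multiply(b, c))   # a*c + b*c

print(lhs, rhs, lhs == rhs)
```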
Numerical derivative
$f'(x) = \lim_{h \to 0} \dfrac{f(x+h) - f(x)}{h}$
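On a computer the limit cannot be taken literally: h must stay finite, and the round-off effects discussed earlier compete with the truncation error. The sketch below uses the forward difference with sin as a test function (an arbitrary choice for illustration); `forward_difference` is a hypothetical helper name.

```python
import math

def forward_difference(f, x: float, h: float) -> float:
    """Approximate f'(x) by (f(x+h) - f(x)) / h with a finite step h."""
    return (f(x + h) - f(x)) / h

x = 1.0
exact = math.cos(x)                  # derivative of sin at x
for h in (1e-1, 1e-4, 1e-8, 1e-12, 1e-17):
    approx = forward_difference(math.sin, x, h)
    print(f"h={h:.0e}  approx={approx:.12f}  error={abs(approx - exact):.3e}")
```

Making h smaller helps only up to a point: once h is tiny, f(x+h) - f(x) suffers cancellation, and for h below the spacing of doubles near x the sum x + h equals x, so the quotient collapses to zero.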