na10-02-floating_pkjjkkjoint.pdf

Upload: adsr

Post on 14-Apr-2018


  • 7/30/2019 Na10-02-floating_pkjjkkjoint.pdf

    1/27

    Chapter 3.

    Approximation and Round-Off Errors

    Speed: 48.X

    Mileage: 87324.4X


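    Significant Figures

    The number of significant figures indicates precision. The significant digits of a number are those that can be used with confidence, i.e., the certain digits plus one estimated digit.

    53,800: how many significant figures? Written this way the count is ambiguous; scientific notation makes it explicit:

    $5.38 \times 10^{4}$ has 3 significant figures

    $5.380 \times 10^{4}$ has 4 significant figures

    $5.3800 \times 10^{4}$ has 5 significant figures

    Zeros are sometimes used only to locate the decimal point; such zeros are not significant figures. Each of the following has 4 significant figures:

    0.00001753

    0.0001753

    0.001753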


    Approximations and Round-Off Errors

    For many engineering problems, no analytical solution is available.

    Numerical methods yield approximate results, and we cannot compute exactly the errors associated with them.

    Given data are only rarely exact, since they originate from measurements. Therefore there is probably error in the input information.

    The algorithm itself usually introduces errors as well, e.g., unavoidable round-off.

    The output information will then contain error from both of these sources.

    How confident are we in our approximate result?

    The question is: how much error is present in our calculation, and is it tolerable?


    Accuracy

    How close a computed or measured value is to the true value.

    Precision (or reproducibility)

    How close a computed or measured value is to previously computed or measured values.

    Inaccuracy (or bias)

    A systematic deviation from the actual value.

    Imprecision (or uncertainty)

    The magnitude of the scatter.

    Fig 3.2


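    Error Definitions

    True value = Approximation + Error

    True error (may be positive or negative):

    $$E_t = \text{true value} - \text{approximation}$$

    True fractional relative error:

    $$\frac{\text{true error}}{\text{true value}}$$

    True percent relative error:

    $$\varepsilon_t = \frac{\text{true error}}{\text{true value}} \times 100\%$$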
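As a small illustration of these definitions, the sketch below computes the true error and the true percent relative error for an approximation of pi. The function names and the choice of 22/7 as the approximation are assumptions made for this example; they do not come from the slides.

```python
import math


def true_error(true_value, approximation):
    """E_t = true value - approximation (sign is kept)."""
    return true_value - approximation


def true_percent_relative_error(true_value, approximation):
    """eps_t = (true error / true value) * 100%."""
    return true_error(true_value, approximation) / true_value * 100.0


# Illustrative case (assumption): approximate pi by 22/7.
approx_pi = 22 / 7
E_t = true_error(math.pi, approx_pi)
eps_t = true_percent_relative_error(math.pi, approx_pi)

print(f"E_t   = {E_t:.6e}")
print(f"eps_t = {eps_t:.5f} %")
```

Because 22/7 is slightly larger than pi, both quantities come out negative, which is why the slide marks the error as (+/-).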


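    For numerical methods, the true value is known only when we deal with functions that can be solved analytically (simple systems). In real-world applications, we usually do not know the answer a priori. Then the error is normalized to the best available approximation instead:

    $$\varepsilon_a = \frac{\text{approximate error}}{\text{approximation}} \times 100\%$$

    For an iterative approach (for example, Newton's method), the approximate error is estimated from successive iterates:

    $$\varepsilon_a = \frac{\text{current approximation} - \text{previous approximation}}{\text{current approximation}} \times 100\%$$

    Like the true error, $\varepsilon_a$ may be positive or negative (+/-).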


    Use the absolute value of $\varepsilon_a$. Computations are repeated until the stopping criterion is satisfied:

    $$|\varepsilon_a| < \varepsilon_s$$

    where $\varepsilon_s$ is a pre-specified percent tolerance based on your knowledge of the solution.

    If the following criterion is met,

    $$\varepsilon_s = (0.5 \times 10^{2-n})\%$$

    you can be sure that the result is correct to at least $n$ significant figures.
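The stopping rule can be sketched with a simple iteration. The example below is an assumption chosen for illustration, not taken from the slides: it sums the Maclaurin series of e^x term by term and stops once the approximate percent relative error drops below the tolerance for n significant figures. The names `stopping_tolerance` and `exp_series` are hypothetical.

```python
def stopping_tolerance(n_sig_figs):
    """eps_s = (0.5 * 10**(2 - n)) percent."""
    return 0.5 * 10 ** (2 - n_sig_figs)


def exp_series(x, n_sig_figs, max_terms=100):
    """Sum 1 + x + x^2/2! + ... until |eps_a| < eps_s.

    Returns (estimate, number of terms used, final |eps_a| in percent).
    """
    eps_s = stopping_tolerance(n_sig_figs)
    total = 1.0   # first term of the series
    term = 1.0
    for k in range(1, max_terms):
        term *= x / k                 # next term x^k / k!
        previous = total
        total += term
        eps_a = abs((total - previous) / total) * 100.0
        if eps_a < eps_s:
            return total, k + 1, eps_a
    raise RuntimeError("tolerance not reached within max_terms")


estimate, terms, eps_a = exp_series(0.5, n_sig_figs=3)
print(estimate, terms, eps_a)
```

For x = 0.5 and n = 3 the tolerance is 0.05 %, and the loop stops after six terms; the true value never enters the stopping test.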

    Fig 3.3 (decimal, binary)


    Fig 3.4 Signed binary (signed magnitude: the first bit holds the sign)

    1000 0000 0000 0001 = -1

    2's complement

    0000 0000 0000 0001 = 1

    0000 0000 0000 0000 = 0

    1111 1111 1111 1111 = -1

    1111 1111 1111 1110 = -2

    Number range? How to compute $-a$ from $a$?
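The two questions on this slide can be explored with a short sketch, assuming the 16-bit word shown in the bit patterns above. The helper names are hypothetical. In 2's complement, the negative of a pattern is obtained by inverting every bit and adding 1.

```python
BITS = 16                       # word size assumed from the slide's patterns
MASK = (1 << BITS) - 1
MIN_VALUE = -(1 << (BITS - 1))  # -32768
MAX_VALUE = (1 << (BITS - 1)) - 1  # 32767


def to_twos_complement(value):
    """Return the 16-bit pattern (as a non-negative int) for a signed value."""
    if not MIN_VALUE <= value <= MAX_VALUE:
        raise ValueError("value does not fit in the word")
    return value & MASK


def from_twos_complement(pattern):
    """Interpret a 16-bit pattern as a signed integer."""
    if pattern & (1 << (BITS - 1)):
        return pattern - (1 << BITS)
    return pattern


def negate(pattern):
    """Compute -a from a: invert all bits, then add 1."""
    return (~pattern + 1) & MASK


def show(pattern):
    """Format a pattern in groups of four bits, as on the slide."""
    bits = format(pattern, "016b")
    return " ".join(bits[i:i + 4] for i in range(0, BITS, 4))


print(MIN_VALUE, MAX_VALUE)
print(show(to_twos_complement(-1)))
print(show(negate(to_twos_complement(2))))
```

The range is asymmetric because zero takes one of the non-negative patterns, so the most negative value has no positive counterpart.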


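    Fractional numbers: decimal and binary

    $$123.456 = 1 \times 10^{2} + 2 \times 10^{1} + 3 \times 10^{0} + 4 \times 10^{-1} + 5 \times 10^{-2} + 6 \times 10^{-3}$$

    $$101.011_2 = 1 \times 2^{2} + 0 \times 2^{1} + 1 \times 2^{0} + 0 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}$$

    $$= 4 + 1 + 0.25 + 0.125$$

    $$= 5.375$$

    This is a fixed-point number: the digits after the point form the fraction.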
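The positional expansion above can be written as a few lines of code. This is a minimal sketch with a hypothetical function name; it handles bases up to 10 and does no input validation.

```python
def fixed_point_to_float(digits, base=2):
    """Evaluate a fixed-point digit string such as '101.011' in the given base."""
    whole, _, fraction = digits.partition(".")
    value = 0.0
    # Digits left of the point carry weights base^0, base^1, ...
    for position, digit in enumerate(reversed(whole)):
        value += int(digit, base) * base ** position
    # Digits right of the point carry weights base^-1, base^-2, ...
    for position, digit in enumerate(fraction, start=1):
        value += int(digit, base) * base ** -position
    return value


binary_value = fixed_point_to_float("101.011", base=2)
decimal_value = fixed_point_to_float("123.456", base=10)
print(binary_value, decimal_value)
```

The binary result is exact because every weight is a power of two; the base-10 result is only close to 123.456, since weights such as 10^-1 have no exact binary representation.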


    Floating point number (base-10)

    156.78 is written as $0.15678 \times 10^{3}$ in a floating-point base-10 system.

    Suppose only 4 decimal places can be stored:

    $$\frac{1}{34} = 0.029411765\ldots \rightarrow 0.0294 \times 10^{0}$$

    The number is normalized to remove the leading zero: multiply the mantissa by 10 and lower the exponent by 1.

    $$0.2941 \times 10^{-1}$$


    Floating point number (binary, base-2)

    $$101.011_2 = 0.101011_2 \times 2^{3}$$

    $$0.00101101_2 = 0.101101_2 \times 2^{-2}$$

    Fig 3.5


    Floating point number

    Numbers such as $\pi$, $e$, or $\sqrt{7}$ cannot be expressed by a fixed number of significant figures.

    Because computers use a base-2 representation, they cannot precisely represent certain exact base-10 numbers.

    Fractional quantities are typically represented in a computer using floating-point form, e.g.,

    $$m \cdot b^{e}$$

    where $m$ is the mantissa, $b$ is the base of the number system used, and $e$ is the exponent.
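Both points can be checked directly in Python, assuming the interpreter's `float` is an IEEE 754 double (true for CPython on common platforms). `Decimal(0.1)` prints the exact value that is stored for the literal 0.1, and `math.frexp` splits a float into a mantissa in [0.5, 1) and a base-2 exponent, which matches the normalized form m · b^e with b = 2.

```python
import math
from decimal import Decimal

# Exact binary value stored for the base-10 literal 0.1.
stored_tenth = Decimal(0.1)
print(stored_tenth)

# Because 0.1, 0.2 and 0.3 are all stored inexactly, this comparison fails.
sum_is_exact = (0.1 + 0.2 == 0.3)
print(sum_is_exact)

# 5.375 = 101.011 in binary = 0.101011 (binary) * 2**3.
mantissa, exponent = math.frexp(5.375)
print(mantissa, exponent)
```

The mantissa returned for 5.375 is 0.671875, which is 0.101011 in binary.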


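    Floating point number

    Normalization bounds the mantissa: $\frac{1}{b} \le m < 1$.

    Therefore, for a base-10 system $0.1 \le m < 1$, and for a base-2 system $0.5 \le m < 1$.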


    Chopping, Rounding

    Example:

    $\pi = 3.14159265358\ldots$ is to be stored on a base-10 system carrying 7 significant digits.

    If chopped: $\pi \approx 3.141592$, with error $E_t = 0.00000065$

    If rounded: $\pi \approx 3.141593$, with error $|E_t| = 0.00000035$

    Some machines use chopping because rounding adds to the computational overhead. As long as the number of significant figures is large enough, the resulting chopping error is negligible.
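The example can be reproduced with Python's `decimal` module, which lets the number of stored digits and the rounding mode be set explicitly. This is only a sketch of the idea: the helper name `store` is hypothetical, and a real machine applies chopping or rounding in base 2 rather than base 10.

```python
from decimal import Decimal, Context, ROUND_DOWN, ROUND_HALF_UP

PI = Decimal("3.14159265358")  # value quoted on the slide


def store(value, digits, rounding):
    """Keep only `digits` significant digits using the given rounding mode."""
    return Context(prec=digits, rounding=rounding).create_decimal(value)


chopped = store(PI, 7, ROUND_DOWN)      # extra digits are simply dropped
rounded = store(PI, 7, ROUND_HALF_UP)   # last kept digit is rounded

print(chopped, PI - chopped)
print(rounded, PI - rounded)
```

The chopped value always errs in the same direction for positive numbers, while the rounded value here overshoots, so its true error is negative and smaller in magnitude.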

    Fig 3.6 - example


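    Example 3.4 (p. 61)

    $2^{-3}(1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3}) = 0.062500$ (the smallest)

    $2^{-3}(1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 0.078125$

    $2^{-3}(1 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3}) = 0.093750$

    $2^{-3}(1 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}) = 0.109375$

    Evenly spaced by

    $2^{-3}(0 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 0.015625$

    $2^{-2}(1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3}) = 0.125000$

    $2^{-2}(1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 0.156250$

    $2^{-2}(1 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3}) = 0.187500$

    $2^{-2}(1 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}) = 0.218750$

    Evenly spaced by

    $2^{-2}(0 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 0.03125$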


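    Example 3.4 (p. 61), continued

    $2^{2}(1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3}) = 2$

    $2^{2}(1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 2.5$

    $2^{2}(1 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3}) = 3$

    $2^{2}(1 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}) = 3.5$

    Evenly spaced by

    $2^{2}(0 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 0.5$

    $2^{3}(1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3}) = 4$

    $2^{3}(1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 5$

    $2^{3}(1 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3}) = 6$

    $2^{3}(1 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}) = 7$ (the largest)

    Evenly spaced by

    $2^{3}(0 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) = 1$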
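The values listed in this example can be generated programmatically. The sketch below assumes a 3-bit normalized mantissa (leading bit always 1) and an exponent running from -3 to 3; that exponent range is an assumption chosen to cover the smallest and largest values shown, since the slides list only some of the exponents. Exact fractions are used so that the spacing can be compared without rounding.

```python
from fractions import Fraction

MANTISSA_BITS = 3
EXPONENTS = range(-3, 4)   # assumed range: -3 .. 3


def positive_values():
    """All positive numbers m * 2**e with a normalized 3-bit mantissa."""
    values = []
    for e in EXPONENTS:
        # Normalized mantissas 0.100, 0.101, 0.110, 0.111 (binary).
        for m_int in range(1 << (MANTISSA_BITS - 1), 1 << MANTISSA_BITS):
            mantissa = Fraction(m_int, 1 << MANTISSA_BITS)
            values.append(mantissa * Fraction(2) ** e)
    return values


values = positive_values()
print(float(values[0]), float(values[-1]))

# Spacing between neighbours grows with the exponent.
smallest_gap = values[1] - values[0]
largest_gap = values[-1] - values[-2]
print(float(smallest_gap), float(largest_gap))
```

Within one exponent the numbers are evenly spaced, and the spacing doubles each time the exponent increases by one, which is why the absolute round-off error grows with the magnitude of the number.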

    Fig 3.7


    IEEE Standard 754 Floating Point Numbers

    Single precision (32-bit)

    sign: 1 bit, exponent: 8 bits, mantissa: 23 bits

    About 7 significant base-10 digits, with a range of roughly $10^{-38}$ to $10^{38}$

    Double precision (64-bit)

    sign: 1 bit, exponent: 11 bits, mantissa: 52 bits

    About 15-16 significant base-10 digits, with a range of roughly $10^{-308}$ to $10^{308}$
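These layouts can be inspected from Python, assuming the interpreter's `float` is an IEEE 754 double (the usual case for CPython). The helper `double_fields` is a hypothetical name; it unpacks the sign, exponent and mantissa bit fields of a double using the standard `struct` module.

```python
import struct
import sys


def double_fields(x):
    """Split a 64-bit double into (sign, biased exponent, fraction) fields."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = raw >> 63                    # 1 bit
    exponent = (raw >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    fraction = raw & ((1 << 52) - 1)    # 52 explicitly stored mantissa bits
    return sign, exponent, fraction


print(double_fields(1.0))
print(double_fields(-2.0))

# Properties of the platform's double type.
print(sys.float_info.dig, sys.float_info.mant_dig, sys.float_info.max)

# Largest finite single-precision value, decoded from its bit pattern.
largest_single = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]
print(largest_single)
```

`sys.float_info.mant_dig` reports 53 rather than 52 because the leading 1 of a normalized mantissa is implied and not stored.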


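    Arithmetic Manipulations

    Common arithmetic operations: addition

    The mantissa of the number with the smaller exponent is modified so that the exponents are the same.

    $$0.1557 \times 10^{1} + 0.4381 \times 10^{-1}$$

    $$0.4381 \times 10^{-1} = 0.004381 \times 10^{1}$$

    $$0.1557 \times 10^{1} + 0.004381 \times 10^{1} = 0.160081 \times 10^{1}$$

    Chopped to a 4-digit mantissa:

    $$0.1600 \times 10^{1}$$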


    Arithmetic Manipulations

    Subtraction

    $$0.3641 \times 10^{2} - 0.2686 \times 10^{2} = 0.0955 \times 10^{2}$$

    Normalizing the result gives

    $$0.9550 \times 10^{1}$$

    The zero appended to the end of the mantissa is not significant.


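    Arithmetic Manipulations

    Subtraction of nearly equal numbers

    $$0.7642 \times 10^{3} - 0.7641 \times 10^{3} = 0.0001 \times 10^{3}$$

    Normalizing the result gives

    $$0.1000 \times 10^{0}$$

    The three zeros appended to the mantissa are not significant.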


    Arithmetic Manipulations

    Multiplication

    $$0.1363 \times 10^{3} \times 0.6423 \times 10^{-1}$$

    Exponents are added and mantissas are multiplied:

    $$0.08754549 \times 10^{2}$$

    Normalized:

    $$0.8754549 \times 10^{1}$$

    Chopped to a 4-digit mantissa:

    $$0.8754 \times 10^{1}$$
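The hand calculations on the arithmetic slides can be checked with a simulated 4-digit, chopping, base-10 machine. The sketch below uses Python's `decimal` module with a precision of 4 and `ROUND_DOWN`; the context name `CHOP4` is hypothetical. Note one simplification: `decimal` computes the exact result and then chops it, rather than shifting the smaller operand first, which gives the same answers for these examples.

```python
from decimal import Decimal, Context, ROUND_DOWN

# 4 significant digits, extra digits chopped (not rounded).
CHOP4 = Context(prec=4, rounding=ROUND_DOWN)

# Addition: 0.1557e1 + 0.4381e-1
total = CHOP4.add(Decimal("0.1557E1"), Decimal("0.4381E-1"))

# Subtraction: 0.3641e2 - 0.2686e2
diff = CHOP4.subtract(Decimal("0.3641E2"), Decimal("0.2686E2"))

# Subtraction of nearly equal numbers: 0.7642e3 - 0.7641e3
near = CHOP4.subtract(Decimal("0.7642E3"), Decimal("0.7641E3"))

# Multiplication: 0.1363e3 * 0.6423e-1
prod = CHOP4.multiply(Decimal("0.1363E3"), Decimal("0.6423E-1"))

print(total, diff, near, prod)
```

The third result carries only one meaningful digit even though the machine stores four, which is the loss of significance the subtraction slide points at.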


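    Errors

    Adding a large and a small number

    $$0.4000 \times 10^{4} + 0.1000 \times 10^{-2}$$

    The smaller number is shifted so that the exponents match:

    $$0.1000 \times 10^{-2} = 0.0000001 \times 10^{4}$$

    $$0.4000 \times 10^{4} + 0.0000001 \times 10^{4} = 0.4000001 \times 10^{4}$$

    Chopped to a 4-digit mantissa:

    $$0.4000 \times 10^{4}$$

    The addition has no effect on the stored result.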
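The same effect can be seen both on a simulated 4-digit machine and in ordinary double precision. In the sketch below, `CHOP4` is a hypothetical name for a 4-digit chopping context, and the double-precision pair 1e16 and 1.0 is an illustrative choice that is not taken from the slides.

```python
from decimal import Decimal, Context, ROUND_DOWN

# Simulated 4-digit base-10 machine with chopping.
CHOP4 = Context(prec=4, rounding=ROUND_DOWN)
large = Decimal("0.4000E4")
small = Decimal("0.1000E-2")
chopped_sum = CHOP4.add(large, small)
print(chopped_sum == large)

# Same effect in double precision: near 1e16 adjacent doubles are 2 apart,
# so adding 1.0 cannot change the stored value.
double_sum = 1e16 + 1.0
print(double_sum == 1e16)
```

This matters in long summations: when many small terms are added to a large running total, each one may be lost, so summing the small terms first usually gives a more accurate result.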


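    Errors

    Subtractive Cancellation

    $$x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}$$

    If $b^{2} \gg 4ac$, then

    $$\sqrt{b^{2} - 4ac} = \tilde{b} \approx b$$

    $$-b + \tilde{b} = \text{"small"}$$

    so the numerator is the difference of two nearly equal numbers. With 8 stored digits, for example:

    0.12345678 - 0.12345666 = 0.00000012

    If only the leading digits of each operand are reliable, none of the digits of the difference are:

    0.12345??? - 0.12345??? = 0.00000???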
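A short experiment shows the cancellation in double precision. The coefficients a = 1, b = 1e8, c = 1 are an illustrative assumption, chosen so that b^2 is far larger than 4ac. The second function uses a commonly used rearrangement of the quadratic formula that avoids subtracting nearly equal numbers; it is offered as one possible remedy and does not appear on the slides.

```python
import math


def roots_naive(a, b, c):
    """Textbook quadratic formula; one root suffers from cancellation."""
    d = math.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)


def roots_stable(a, b, c):
    """Rearranged formula: b and the square root are always added in magnitude.

    q = -(b + sign(b) * sqrt(b^2 - 4ac)) / 2, roots are c / q and q / a.
    Assumes real roots and a != 0.
    """
    d = math.sqrt(b * b - 4 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return c / q, q / a


a, b, c = 1.0, 1e8, 1.0       # b^2 >> 4ac
naive_small, naive_large = roots_naive(a, b, c)
stable_small, stable_large = roots_stable(a, b, c)

# The small root is very close to -1e-8; compare both estimates with it.
print(naive_small, stable_small)
print(naive_large, stable_large)
```

The large root is computed well by both versions, since its numerator adds two numbers of the same sign; only the small root, where -b and the square root nearly cancel, loses most of its significant digits in the naive form.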