Research Matters
February 25, 2009
Nick Higham
Director of Research
School of Mathematics
Exploiting Half Precision Arithmetic in Solving Ax = b
Nick Higham
School of Mathematics
The University of Manchester
www.maths.manchester.ac.uk/~higham
nla-group.org
Slides available at http://bit.ly/squeeze19
Joint work with Srikara Pranesh and Mawussi Zounon
Today’s Floating-Point Arithmetics
Type                  Bits   Range              $u = 2^{-t}$
bfloat16  half        16     $10^{\pm 38}$      $2^{-8} \approx 3.9 \times 10^{-3}$
fp16      half        16     $10^{\pm 5}$       $2^{-11} \approx 4.9 \times 10^{-4}$
fp32      single      32     $10^{\pm 38}$      $2^{-24} \approx 6.0 \times 10^{-8}$
fp64      double      64     $10^{\pm 308}$     $2^{-53} \approx 1.1 \times 10^{-16}$
fp128     quadruple   128    $10^{\pm 4932}$    $2^{-113} \approx 9.6 \times 10^{-35}$

fp* forms all IEEE standard, but fp16 storage only.
bfloat16 used by Google TPU and forthcoming Intel Nervana Neural Network Processor.
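The unit roundoff column can be checked for the formats that NumPy provides natively. This is a minimal sketch, assuming NumPy is available; the helper name `unit_roundoff` is hypothetical, and bfloat16 and fp128 are left out because NumPy has no portable built-in type for them.

```python
import numpy as np

def unit_roundoff(dtype):
    """Return u = 2^-t, which is half of machine epsilon (the float spacing at 1.0)."""
    return float(np.finfo(dtype).eps) / 2.0

for name, dtype in [("fp16", np.float16), ("fp32", np.float32), ("fp64", np.float64)]:
    info = np.finfo(dtype)
    # Print the unit roundoff and the largest finite value of each format.
    print(f"{name}: u = {unit_roundoff(dtype):.1e}, largest finite = {float(info.max):.2e}")
```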
Nick Higham Half Precision in Solving Ax = b 2 / 19
Why Use Lower Precision in Sci Comp?
Faster flops.
Less communication.
Lower energy consumption.
But need to
  prove low precision (where used) gives sufficient accuracy,
  refine low accuracy quantities.
Focus in this talk on Ax = b.
Harmonic Series
What is the harmonic sum $\sum_{k=1}^{\infty} \frac{1}{k}$?

Arithmetic   Computed sum   No. of terms
fp8          3.5000         16
bfloat16     5.0625         65
fp16         7.0859         513
fp32         15.404         2097152
fp64         34.122         $2.81\cdots \times 10^{14}$

Simulations in MATLAB with chop function (H & Pranesh, 2019).
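The fp16 row can be imitated with NumPy's native `float16` type rather than the MATLAB chop function used for the slide. This is a rough sketch under that assumption; the function name `harmonic_until_stagnation` is hypothetical, and the term count may differ by one from the table depending on how the final term is counted.

```python
import numpy as np

def harmonic_until_stagnation(dtype, max_terms=10_000):
    """Add 1/k in the given dtype until the running sum stops changing."""
    s = dtype(0)
    for k in range(1, max_terms + 1):
        new = s + dtype(1) / dtype(k)   # the division and the addition each round to dtype
        if new == s:
            return float(s), k          # term k no longer changes the sum
        s = new
    return float(s), max_terms

total, k = harmonic_until_stagnation(np.float16)
print(f"fp16: sum = {total:.4f}, stagnates at term {k}")
```

The sum stagnates once 1/k falls below half the spacing of the fp16 numbers around the running sum, which is why a divergent series appears to converge.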
Iterative Refinement in Three Precisions
A, b given in precision $u$.

Solve $Ax_0 = b$ by LU factorization in precision $u_f > u$.
  $r = b - Ax_0$          precision $u_r < u$
  Solve $Ad = r$          precision $u_f$
  $x_1 = fl(x_0 + d)$     precision $u$

$u_f$     $u$       $u_r$
half      single    double
half      double    quad
single    double    quad
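The (half, single, double) row can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the LU factorization is a plain dense loop run on `float16` arrays, NumPy may accumulate fp16 dot products in higher precision internally, and the residual is normalized before it is rounded to fp16 (an added assumption, to stop small residuals underflowing).

```python
import numpy as np

def lu_partial_pivot(A):
    """LU factorization with partial pivoting, carried out in A's dtype."""
    A = A.copy()
    n = A.shape[0]
    perm = np.arange(n)
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(A[k:, k])))       # pivot row
        if p != k:
            A[[k, p]] = A[[p, k]]
            perm[[k, p]] = perm[[p, k]]
        A[k + 1:, k] = A[k + 1:, k] / A[k, k]           # multipliers (column of L)
        A[k + 1:, k + 1:] = A[k + 1:, k + 1:] - np.outer(A[k + 1:, k], A[k, k + 1:])
    return A, perm

def lu_solve(LU, perm, b):
    """Solve with the packed factors, working in the dtype of LU."""
    n = LU.shape[0]
    y = b[perm].astype(LU.dtype)
    for i in range(n):                                  # forward substitution, unit L
        y[i] = y[i] - LU[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):                      # back substitution with U
        y[i] = (y[i] - LU[i, i + 1:] @ y[i + 1:]) / LU[i, i]
    return y

def ir3(A, b, uf=np.float16, u=np.float32, ur=np.float64, steps=10):
    """Iterative refinement: factorize in uf, residual in ur, update in u."""
    A_u, b_u = A.astype(u), b.astype(u)
    LU, perm = lu_partial_pivot(A_u.astype(uf))
    x = lu_solve(LU, perm, b_u).astype(u)               # x_0 from the low precision solve
    for _ in range(steps):
        r = b_u.astype(ur) - A_u.astype(ur) @ x.astype(ur)
        scale = np.max(np.abs(r))
        if scale == 0:
            break
        d = lu_solve(LU, perm, (r / scale).astype(uf))  # correction solve in uf
        x = (x + d.astype(u) * u(scale)).astype(u)      # update in working precision
    return x

rng = np.random.default_rng(0)
n = 20
A = rng.standard_normal((n, n)) + n * np.eye(n)         # well conditioned test matrix
b = A @ np.ones(n)
x = ir3(A, b)
print("max error:", np.max(np.abs(x - 1.0)))
```

A fixed number of refinement steps is used here for brevity; a practical code would stop on a backward error test instead.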
GMRES-IR
H & Carson (2018, 2019): to compute the update $d_i$ apply GMRES to

  $\widetilde{A} d_i \equiv U^{-1} L^{-1} A d_i = U^{-1} L^{-1} r_i.$

                                             B’error
            $u_f$  $u$  $u_r$  $\kappa_\infty(A)$   nrm   cmp   F’error
LU          H      D    Q      $10^{4}$             D     D     D
GMRES-IR    H      D    Q      $10^{12}$            D     D     D
GMRES-IR    H      D    D      $10^{8}$             D     D     D
Essentially need $\kappa_\infty(A)\,u \ll 1$ for convergence.
Haidar, Tomov, Dongarra & H (2018): on NVIDIA V100 GPU, speedup of 4 and energy reduction of 80%.
Overflow and Underflow
First step of IR3: round fp64 (range $[10^{-324}, 10^{308}]$) → fp16 (range $[10^{-8}, 10^{5}]$).
Can suffer
  overflow,
  underflow,
  elements becoming subnormal: the fp16 interval $[10^{-8}, 10^{-5}]$.
Need to squeeze the range of A, b and exploit the whole fp16 range.
Write $x_{\max} = 6.55 \times 10^{4}$ for fp16 overflow level.
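The three failure modes can be seen by rounding a few fp64 values to NumPy's `float16`. This is a small sketch; the sample values and the helper `classify` are chosen for illustration only.

```python
import numpy as np

TINY = float(np.finfo(np.float16).tiny)    # smallest normalized fp16 number, about 6.10e-5

def classify(v, h):
    """Describe what happened when the fp64 value v was rounded to the fp16 value h."""
    if np.isinf(h):
        return "overflow"
    if h == 0 and v != 0:
        return "underflow to zero"
    if abs(float(h)) < TINY:
        return "subnormal"
    return "normal"

values = np.array([1e5, 1.0, 1e-6, 1e-8])  # fp64 inputs
with np.errstate(over="ignore"):           # silence the overflow warning on conversion
    half = values.astype(np.float16)

kinds = [classify(v, h) for v, h in zip(values, half)]
for v, h, kind in zip(values, half, kinds):
    print(f"{v:8.1e} -> {float(h):12.5e}  ({kind})")
```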
Example: Vector 2-Norm in fp16
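Evaluate $\|x\|_2$ for $x = \begin{bmatrix} \alpha \\ \alpha \end{bmatrix}$ as $\sqrt{x_1^2 + x_2^2}$ in fp16.

Recall $u_h = 4.88 \times 10^{-4}$, $r_{\min} = 6.10 \times 10^{-5}$.

$\alpha$                 Relative error          Comment
$10^{-4}$                $1$                     Underflow to 0
$3.3 \times 10^{-4}$     $4.7 \times 10^{-2}$    Subnormal range
$5.5 \times 10^{-4}$     $7.1 \times 10^{-3}$    Subnormal range
$1.1 \times 10^{-2}$     $1.4 \times 10^{-4}$    Perfect rel. err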
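The pattern in the table can be reproduced approximately with NumPy's `float16`. This is a sketch with hypothetical helper names; the reference value is computed in fp64 from the fp16-rounded α, which is an assumption about how the error is measured, so the digits need not match the table exactly.

```python
import numpy as np

def norm2_fp16(alpha):
    """Compute sqrt(x1^2 + x2^2) for x = [alpha, alpha] with every step rounded to fp16."""
    a = np.float16(alpha)
    s = a * a + a * a          # squares and sum are rounded to fp16
    return np.sqrt(s)          # square root of an fp16 value, returned in fp16

def rel_err(alpha):
    """Relative error against an fp64 reference built from the fp16-rounded alpha."""
    a = float(np.float16(alpha))
    exact = np.sqrt(2.0) * a
    return abs(float(norm2_fp16(alpha)) - exact) / exact

alphas = [1e-4, 3.3e-4, 5.5e-4, 1.1e-2]
errors = [rel_err(a) for a in alphas]
for a, e in zip(alphas, errors):
    print(f"alpha = {a:.1e}: relative error = {e:.1e}")
```

The squares of the smaller values of α fall below the smallest normalized fp16 number, so they are either flushed to zero or lose digits in the subnormal range.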
Simple Conversion Strategies
Algorithm Inf: Round then replace infinities.
1: $A^{(h)} = fl_h(A)$
2: For every $|a^{(h)}_{ij}| \ge \theta x_{\max}$, set $a^{(h)}_{ij} = \mathrm{sign}(a_{ij})\,\theta x_{\max}$.

Algorithm Scale: Scale then round.
1: $a_{\max} = \max_{i,j} |a_{ij}|$
2: $\mu = \theta x_{\max} / a_{\max}$
3: $A^{(h)} = fl_h(\mu A)$

Both algs make $\max_{i,j} |a^{(h)}_{ij}| \le \theta x_{\max}$, where $\theta \in (0,1]$.
Alg Inf can make large changes to A.
Alg Scale: underflow or subnormals if $|a_{ij}| \ll x_{\max}$.
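Both conversion strategies are short enough to sketch in NumPy. The function names `alg_inf` and `alg_scale` and the 2-by-2 example matrix are hypothetical, and the cap θ·x_max is itself rounded to fp16 here, so the bound holds up to that rounding.

```python
import numpy as np

XMAX = float(np.finfo(np.float16).max)     # fp16 overflow level, 65504

def alg_inf(A, theta=0.1):
    """Round to fp16, then clamp entries of magnitude >= theta*xmax (infinities included)."""
    with np.errstate(over="ignore"):
        Ah = A.astype(np.float16)
    cap = np.float16(theta * XMAX)
    big = np.abs(Ah) >= cap
    Ah[big] = (np.sign(A[big]) * cap).astype(np.float16)
    return Ah

def alg_scale(A, theta=0.1):
    """Scale so the largest entry maps to theta*xmax, then round to fp16."""
    mu = theta * XMAX / np.max(np.abs(A))
    return (mu * A).astype(np.float16), mu

A = np.array([[1e10, 1.0],
              [2.0, 1e-3]])
print(alg_inf(A))         # the 1e10 entry is replaced by the cap: a large change to A
print(alg_scale(A)[0])    # small entries become subnormal or zero
```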
Conversion With Two-Sided Diagonal Scaling
Algorithm 2DS: Two-sided diagonal scaling then round.
1: Obtain diagonal matrices R, S.
2: $\beta = \max_{i,j} |(RAS)_{ij}|$.
3: $\mu = \theta x_{\max} / \beta$
4: $A^{(h)} = fl_h(\mu(RAS))$   % Max elt $\theta x_{\max}$.

Algorithm: Row and column equilibration.
1: $r_i = \|A(i,:)\|_\infty^{-1}$, $i = 1:n$
2: $R = \mathrm{diag}(r)$
3: $A = RA$   % A is row equilibrated.
4: $s_j = \|A(:,j)\|_\infty^{-1}$, $j = 1:n$
5: $S = \mathrm{diag}(s)$
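The two algorithms above can be combined into a few lines of NumPy. This is a sketch with hypothetical function names; it assumes A has no zero rows or columns, and it stores R and S as vectors rather than diagonal matrices.

```python
import numpy as np

XMAX = float(np.finfo(np.float16).max)     # fp16 overflow level, 65504

def equilibrate(A):
    """Row then column equilibration: return the diagonals r and s of R and S."""
    r = 1.0 / np.max(np.abs(A), axis=1)    # r_i = 1 / ||A(i,:)||_inf
    B = r[:, None] * A                     # row equilibrated matrix RA
    s = 1.0 / np.max(np.abs(B), axis=0)    # s_j = 1 / ||(RA)(:,j)||_inf
    return r, s

def alg_2ds(A, theta=0.1):
    """Two-sided diagonal scaling, scalar scaling to theta*xmax, then rounding to fp16."""
    r, s = equilibrate(A)
    RAS = r[:, None] * A * s[None, :]
    beta = np.max(np.abs(RAS))
    mu = theta * XMAX / beta
    return (mu * RAS).astype(np.float16), r, s, mu

A = np.array([[1e10, 1.0],
              [2.0, 1e-3]])
Ah, r, s, mu = alg_2ds(A)
print(Ah)   # every entry is finite, nonzero and normalized in fp16
```

For this example matrix, rounding directly to fp16 would overflow the 1e10 entry, whereas after scaling all four entries sit in the normalized range.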
Expose to the Right (ETTR) Analogy
In digital photography:
  expose image so histogram just touches the right edge.
  Maximizes use of dynamic range of sensor.
Same principle here. fp16 numbers:
[Figure: fp16 number line with regions labelled Subnormals and Normalized numbers.]
Keep computations in the yellow zone.
Likely to need to scale data up.
Need problem analysis to find appropriate scaling that avoids overflow.
Choice of θ
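In $PA = LU$ (partial pivoting):
  $|\ell_{ij}| \le 1$,
  $|u_{ij}| \le \rho_n \max_{i,j} |a_{ij}|$, where growth factor $\rho_n$ not large.
Take $\theta = 0.1$ (say).
Can show that if $u_{kk}$ underflows then

  $\kappa_\infty(A) \ge \dfrac{\theta x_{\max}}{x_{\min}^{s}}.$

For fp16 and $\theta = 0.1$, this is $\kappa_\infty(A) \ge 1.09 \times 10^{11}$.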
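The quoted figure can be checked by direct arithmetic. This sketch assumes that $x_{\min}^{s}$ denotes the smallest positive fp16 subnormal, $2^{-24}$, an interpretation that is consistent with the value quoted above; the variable names are illustrative.

```python
import numpy as np

theta = 0.1
x_max = float(np.finfo(np.float16).max)    # fp16 overflow level, 65504
x_min_sub = 2.0 ** -24                     # assumed: smallest positive fp16 subnormal

bound = theta * x_max / x_min_sub          # lower bound on kappa_inf(A) if u_kk underflows
print(f"kappa_inf(A) >= {bound:.3e}")
```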
Numerical Experiments
13 badly scaled matrices from SuiteSparse Matrix Collection with $\max_{i,j} |a_{ij}| > x_{\max}$ for fp16.
$\kappa_\infty(A) \le 10^{14}$.
$\max_{i,j} |a_{ij}| = 10^{10}$.
$\min_{i,j} \{ |a_{ij}| : a_{ij} \ne 0 \} = 10^{-25}$.
Precisions = (H,S,D) and (H,D,Q).
MATLAB using Moler’s fp16 class and Advanpix for quad precision.
IR convergence test is b’err ≤ $nu$.
GMRES-IR Iterations (IR Steps)
         (half, single, double)     (half, double, quad)
Index    Alg Inf     Alg Scale      Alg Inf     Alg Scale
1        6 (1)       2 (1)          14 (2)      7 (2)
2        4 (1)       2 (1)          12 (2)      8 (2)
3        35 (3)      6 (3)          84 (4)      14 (4)
4        24 (2)      0 (0)          214 (3)     28 (4)
5        108 (2)     0 (0)          258 (3)     6 (2)
6        37 (3)      2 (1)          180 (4)     3 (1)
7        0 (0)       1 (1)          4 (2)       2 (1)
8        0 (0)       3 (1)          25 (3)      10 (2)
9        120 (4)     0 (0)          116 (3)     16 (4)
10       – (–)       0 (0)          – (–)       18 (4)
11       255 (3)     0 (0)          686 (4)     37 (4)
12       0 (0)       – (–)          13 (2)      – (–)
13       0 (0)       – (–)          11 (2)      – (–)
GMRES-IR Iterations (IR Steps)
         (half, single, double)     (half, double, quad)
Index    Alg 2DS                    Alg 2DS
1        0 (0)                      2 (1)
2        0 (0)                      4 (2)
3        2 (1)                      6 (2)
4        0 (0)                      16 (2)
5        0 (0)                      2 (1)
6        0 (0)                      2 (1)
7        0 (0)                      2 (1)
8        0 (0)                      8 (2)
9        0 (0)                      9 (3)
10       1 (1)                      11 (3)
11       0 (0)                      36 (3)
12       0 (0)                      9 (2)
13       0 (0)                      7 (2)
Conditioning
[Figure: plot with legend entries (A), (RAS) and (MA); x-axis ticks from 2 to 12; logarithmic y-axis from $10^{0}$ to $10^{15}$.]
Notes on Scaling
The purpose of two-sided diagonal scaling is to squeeze A into fp16.
The scaled alg is mathematically equivalent to the unscaled one if the LU pivot sequence doesn’t change
... and even numerically equivalent to the unscaled one if the scaling is by powers of 2.
Scaling may change the pivot sequence, though.
Important to work with the unscaled problem as scaling changes norms!
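The remark about powers of 2 rests on the fact that multiplying a binary floating-point number by a power of 2 only changes its exponent, so no rounding error is introduced (barring overflow and underflow). A small sketch in NumPy's `float16`, with an illustrative value chosen so that the contrast is visible:

```python
import numpy as np

x = np.float16(1 + 2.0 ** -10)            # 1 + 2^-10, exactly representable in fp16

scaled_pow2 = np.float16(8) * x           # scaling by 2^3, rounded to fp16
scaled_3 = np.float16(3) * x              # scaling by 3, rounded to fp16

exact_pow2 = 8.0 * float(x)               # exact products, formed in fp64
exact_3 = 3.0 * float(x)

print(float(scaled_pow2) == exact_pow2)   # True: no rounding error
print(float(scaled_3) == exact_3)         # False: the product had to be rounded
```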
Conclusions
Must consider overflow/underflow in conversion to fp16.
Two-sided diagonal scaling works well to compress the range.
Further scalar mult needed to move data close to $x_{\max}$.
Alg 2DS greatly widens the class of problems solvable with IR3 (4x double precision solver).
Slides available at http://bit.ly/squeeze19
For more on performance of GMRES-IR Ax = b solvers see
3:05-3:25 Jack J. Dongarra, Experiments with Mixed Precision Algorithms in Linear Algebra
James H. Wilkinson (1919–1986) Centenary
https://nla-group.org/
Wilkinson page and blog posts during 2019.
Advances in Numerical Linear Algebra, Manchester, May 29-30, 2019, Celebrating the Centenary of the Birth of James H. Wilkinson.
References I
E. Carson and N. J. Higham. A new analysis of iterative refinement and its application to accurate solution of ill-conditioned sparse linear systems. SIAM J. Sci. Comput., 39(6):A2834–A2856, 2017.
E. Carson and N. J. Higham. Accelerating the solution of linear systems by iterative refinement in three precisions. SIAM J. Sci. Comput., 40(2):A817–A847, 2018.
N. J. Higham. Iterative refinement for linear systems and LAPACK. IMA J. Numer. Anal., 17(4):495–509, 1997.
References II
N. J. Higham and S. Pranesh. Simulating low precision floating-point arithmetic. MIMS EPrint 2019.xx, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, 2019. In preparation.
N. J. Higham, S. Pranesh, and M. Zounon. Squeezing a matrix into half precision, with an application to solving linear systems. MIMS EPrint 2018.37, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Nov. 2018. 15 pp.