research matters nick higham february 25, 2009 school of ...higham/talks/squeezing19.pdffp128...




Research Matters

February 25, 2009

Nick Higham, Director of Research

School of Mathematics

1 / 6

Exploiting Half Precision Arithmetic in Solving Ax = b

Nick Higham

School of Mathematics, The University of Manchester

www.maths.manchester.ac.uk/~higham

nla-group.org

Slides available at http://bit.ly/squeeze19

Joint work with Srikara Pranesh and Mawussi Zounon


Today’s Floating-Point Arithmetics

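| Type     | Name      | Bits | Range           | $u = 2^{-t}$                          |
|----------|-----------|------|-----------------|---------------------------------------|
| bfloat16 | half      | 16   | $10^{\pm 38}$   | $2^{-8} \approx 3.9 \times 10^{-3}$   |
| fp16     | half      | 16   | $10^{\pm 5}$    | $2^{-11} \approx 4.9 \times 10^{-4}$  |
| fp32     | single    | 32   | $10^{\pm 38}$   | $2^{-24} \approx 6.0 \times 10^{-8}$  |
| fp64     | double    | 64   | $10^{\pm 308}$  | $2^{-53} \approx 1.1 \times 10^{-16}$ |
| fp128    | quadruple | 128  | $10^{\pm 4932}$ | $2^{-113} \approx 9.6 \times 10^{-35}$ |

fp* forms are all IEEE standard, but fp16 is storage only.

bfloat16 is used by the Google TPU and the forthcoming Intel Nervana Neural Network Processor.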
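As a quick check of the last column, here is a minimal Python sketch (not part of the talk) that tabulates $u = 2^{-t}$ and probes fp16 rounding near 1. It assumes the standard library's `struct` format code `"e"` (IEEE binary16) as a stand-in for fp16 hardware; the names `fl16`, `PRECISION_BITS` and `unit_roundoff` are made up for illustration.

```python
import struct

def fl16(x):
    """Round a Python float (fp64) to the nearest fp16 value, returned as fp64."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Significand precision t (in bits) for each format in the table.
PRECISION_BITS = {"bfloat16": 8, "fp16": 11, "fp32": 24, "fp64": 53, "fp128": 113}

def unit_roundoff(t):
    """Unit roundoff u = 2^-t."""
    return 2.0 ** -t

for name, t in PRECISION_BITS.items():
    print(f"{name:9s} t = {t:3d}   u = {unit_roundoff(t):.1e}")

u_half = unit_roundoff(11)
# 1 + u lies halfway between 1 and the next fp16 number; the tie rounds back to 1.
print(fl16(1.0 + u_half) == 1.0)
# 1 + 2u is the next fp16 number after 1, so it survives rounding unchanged.
print(fl16(1.0 + 2 * u_half) == 1.0 + 2 * u_half)
```

The same `struct` trick with format code `"f"` gives fp32 rounding, which is enough to mimic (half, single, double) experiments without special hardware.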

Nick Higham Half Precision in Solving Ax = b 2 / 19


Why Use Lower Precision in Sci Comp?

- Faster flops.
- Less communication.
- Lower energy consumption.

But need to
- prove low precision (where used) gives sufficient accuracy,
- refine low accuracy quantities.

Focus in this talk on Ax = b.


Harmonic Series

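What is the harmonic sum $\sum_{k=1}^{\infty} \frac{1}{k}$?

| Arithmetic | Computed sum | No. of terms               |
|------------|--------------|----------------------------|
| fp8        | 3.5000       | 16                         |
| bfloat16   | 5.0625       | 65                         |
| fp16       | 7.0859       | 513                        |
| fp32       | 15.404       | 2097152                    |
| fp64       | 34.122       | $2.81\ldots \times 10^{14}$ |

Simulations in MATLAB with the chop function (H & Pranesh, 2019).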
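The fp16 row can be imitated outside MATLAB. The following is a rough Python sketch, not the chop-based code behind the table: it assumes that every operation (the reciprocal and the addition) is rounded to fp16, and that "No. of terms" counts the first term that no longer changes the sum. The helper names are hypothetical.

```python
import struct

def fl16(x):
    """Round a Python float (fp64) to the nearest fp16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def harmonic_sum(fl):
    """Add fl(1/k) in rounded arithmetic until the running sum stops changing.

    Returns the stagnated sum and the index k of the first term that had no effect.
    """
    s, k = 0.0, 0
    while True:
        k += 1
        new_s = fl(s + fl(1.0 / k))
        if new_s == s:
            return s, k
        s = new_s

total, terms = harmonic_sum(fl16)
print(f"fp16: computed sum = {total:.4f}, terms = {terms}")
```

The sum stagnates because once the partial sum lies in [4, 8) the fp16 spacing is $2^{-8}$, so any term below half that spacing is rounded away.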


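Iterative Refinement in Three Precisions

$A$, $b$ given in precision $u$.

1. Solve $Ax_0 = b$ by LU factorization in precision $u_f > u$.
2. $r = b - Ax_0$ in precision $u_r < u$.
3. Solve $Ad = r$ in precision $u_f$.
4. $x_1 = \mathrm{fl}(x_0 + d)$ in precision $u$.

| $u_f$  | $u$    | $u_r$  |
|--------|--------|--------|
| half   | single | double |
| half   | double | quad   |
| single | double | quad   |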
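The steps can be sketched in plain Python for the (half, single, double) combination. This is a toy illustration under stated assumptions, not the authors' implementation: fp16 and fp32 are simulated by rounding after every operation with `struct`, the residual uses native fp64, and the residual is scaled by its largest entry before the fp16 solve (an extra safeguard added here so that small residuals do not underflow). All function names are hypothetical.

```python
import struct

def fl16(x):
    """Round to fp16 (assumes no overflow for the small example below)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fl32(x):
    """Round to fp32."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def lu_factor(A, fl):
    """LU factorization with partial pivoting; every operation is rounded by fl."""
    n = len(A)
    LU = [[fl(a) for a in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))  # pivot row
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = fl(LU[i][k] / LU[k][k])             # multiplier l_ik
            for j in range(k + 1, n):
                LU[i][j] = fl(LU[i][j] - fl(LU[i][k] * LU[k][j]))
    return LU, perm

def lu_solve(LU, perm, b, fl):
    """Solve using the stored factors: forward then back substitution, rounded by fl."""
    n = len(b)
    y = [fl(b[p]) for p in perm]
    for i in range(n):                    # unit lower triangular solve
        for j in range(i):
            y[i] = fl(y[i] - fl(LU[i][j] * y[j]))
    for i in reversed(range(n)):          # upper triangular solve
        for j in range(i + 1, n):
            y[i] = fl(y[i] - fl(LU[i][j] * y[j]))
        y[i] = fl(y[i] / LU[i][i])
    return y

def ir3(A, b, fl_f=fl16, fl_w=fl32, steps=5):
    """Iterative refinement: factorize in u_f, residual in fp64, update in u."""
    n = len(b)
    LU, perm = lu_factor(A, fl_f)
    x = [fl_w(v) for v in lu_solve(LU, perm, b, fl_f)]
    for _ in range(steps):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        scale = max(abs(v) for v in r)
        if scale == 0.0:
            break
        # Assumption: scale r to unit size so the fp16 solve does not underflow.
        d = lu_solve(LU, perm, [v / scale for v in r], fl_f)
        x = [fl_w(x[i] + scale * d[i]) for i in range(n)]
    return x

A = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
x_true = [0.1, 0.2, 0.3]
b = [sum(A[i][j] * x_true[j] for j in range(3)) for i in range(3)]
x = ir3(A, b)
print("max error:", max(abs(x[i] - x_true[i]) for i in range(3)))
```

The fp16 factors are computed once and reused for every correction solve, which is where the potential speed advantage of the low precision comes from.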


GMRES-IR

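H & Carson (2018, 2019): to compute the update $d_i$ apply GMRES to

$$\widetilde{A} d_i \equiv \widehat{U}^{-1}\widehat{L}^{-1} A d_i = \widehat{U}^{-1}\widehat{L}^{-1} r_i.$$

| Method   | $u_f$ | $u$ | $u_r$ | $\kappa_\infty(A)$ | B'error (nrm) | B'error (cmp) | F'error |
|----------|-------|-----|-------|--------------------|---------------|---------------|---------|
| LU       | H     | D   | Q     | $10^{4}$           | D             | D             | D       |
| GMRES-IR | H     | D   | Q     | $10^{12}$          | D             | D             | D       |
| GMRES-IR | H     | D   | D     | $10^{8}$           | D             | D             | D       |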

Essentially need $\kappa_\infty(A)\,u < 1$ for convergence.

Haidar, Tomov, Dongarra & H (2018): on NVIDIA V100 GPU, speedup of 4 and energy reduction of 80%.


Overflow and Underflow

First step of IR3: round fp64 (range $[10^{-324}, 10^{308}]$) → fp16 (range $[10^{-8}, 10^{5}]$).

Can suffer
- overflow,
- underflow,
- elements becoming subnormal: the fp16 interval $[10^{-8}, 10^{-5}]$.

Need to squeeze the range of $A$, $b$ and exploit the whole fp16 range.

Write $x_{\max} = 6.55 \times 10^{4}$ for fp16 overflow level.


Example: Vector 2-Norm in fp16

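Evaluate $\|x\|_2$ for $x = \begin{bmatrix} \alpha \\ \alpha \end{bmatrix}$ as $\sqrt{x_1^2 + x_2^2}$ in fp16.

Recall $u_h = 4.88 \times 10^{-4}$, $r_{\min} = 6.10 \times 10^{-5}$.

| $\alpha$             | Relative error       | Comment          |
|----------------------|----------------------|------------------|
| $10^{-4}$            | 1                    | Underflow to 0   |
| $3.3 \times 10^{-4}$ | $4.7 \times 10^{-2}$ | Subnormal range  |
| $5.5 \times 10^{-4}$ | $7.1 \times 10^{-3}$ | Subnormal range  |
| $1.1 \times 10^{-2}$ | $1.4 \times 10^{-4}$ | Perfect rel. err |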
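The pattern in this example (total loss, then gradual loss in the subnormal range, then full accuracy) can be reproduced with a small Python sketch. It assumes each operation is rounded to fp16 via `struct` format `"e"` and measures the error against the fp64 value of $\sqrt{2}\,\alpha$; the exact digits depend on such details and need not match the slide. The function names are hypothetical.

```python
import math
import struct

def fl16(x):
    """Round a Python float (fp64) to the nearest fp16 value (may be subnormal or 0)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def norm2_fp16(alpha):
    """Evaluate sqrt(x1^2 + x2^2) for x = [alpha, alpha], rounding every step to fp16."""
    a = fl16(alpha)
    sq = fl16(a * a)          # squares can underflow or become subnormal
    s = fl16(sq + sq)
    return fl16(math.sqrt(s))

def rel_err(alpha):
    """Relative error of the fp16 result against the fp64 reference."""
    exact = math.sqrt(2.0) * alpha
    return abs(norm2_fp16(alpha) - exact) / exact

for alpha in (1e-4, 3.3e-4, 5.5e-4, 1.1e-2):
    print(f"alpha = {alpha:.1e}   relative error = {rel_err(alpha):.1e}")
```

Scaling $x$ by a suitable power of 2 before squaring, and undoing the scaling after the square root, is the usual remedy.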


Simple Conversion Strategies

Algorithm Inf: Round then replace infinities.
1: $A^{(h)} = \mathrm{fl}_h(A)$
2: For every $|a^{(h)}_{ij}| \ge \theta x_{\max}$, set $a^{(h)}_{ij} = \mathrm{sign}(a_{ij})\,\theta x_{\max}$.

Algorithm Scale: Scale then round.
1: $a_{\max} = \max_{i,j} |a_{ij}|$
2: $\mu = \theta x_{\max} / a_{\max}$
3: $A^{(h)} = \mathrm{fl}_h(\mu A)$

Both algs make $\max_{i,j} |a^{(h)}_{ij}| \le \theta x_{\max}$, where $\theta \in (0, 1]$.

Alg Inf can make large changes to $A$.

Alg Scale: underflow or subnormals if $|a_{ij}| \ll x_{\max}$.
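Both strategies fit in a few lines of Python. The sketch below is an illustration under assumptions, not the authors' code: matrices are lists of lists, fp16 rounding is simulated with `struct`, overflow is mapped to ±inf by hand, and the clamp level $\theta x_{\max}$ is itself rounded to fp16 so that it is storable (for θ = 0.1 this is slightly above θ x_max). The names `fl16`, `alg_inf` and `alg_scale` are made up here.

```python
import math
import struct

X_MAX = 65504.0  # largest finite fp16 number (about 6.55e4)

def fl16(x):
    """Round to fp16; values beyond the fp16 range become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return math.copysign(math.inf, x)

def alg_inf(A, theta=0.1):
    """Round, then clamp every entry of magnitude >= theta*x_max (infinities included)."""
    cap = fl16(theta * X_MAX)
    Ah = [[fl16(a) for a in row] for row in A]
    return [[math.copysign(cap, a) if abs(ah) >= cap else ah
             for a, ah in zip(row, row_h)]
            for row, row_h in zip(A, Ah)]

def alg_scale(A, theta=0.1):
    """Scale so the largest entry maps to theta*x_max, then round. Returns (A_h, mu)."""
    a_max = max(abs(a) for row in A for a in row)
    mu = fl16(theta * X_MAX) / a_max
    return [[fl16(mu * a) for a in row] for row in A], mu

A = [[1e10, 1.0], [-3.0, 1e-3]]
print("Alg Inf:  ", alg_inf(A))
print("Alg Scale:", alg_scale(A)[0])
```

On this matrix Alg Inf keeps the small entries but replaces $10^{10}$ by the clamp level, a large change to $A$, while Alg Scale preserves the ratios but pushes 1 and -3 into the subnormal range and flushes $10^{-3}$ to zero.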


Conversion With Two-Sided Diagonal Scaling

Algorithm 2DS: Two-sided diagonal scaling then round.
1: Obtain diagonal matrices $R$, $S$.
2: $\beta = \max_{i,j} |(RAS)_{ij}|$.
3: $\mu = \theta x_{\max} / \beta$
4: $A^{(h)} = \mathrm{fl}_h(\mu (RAS))$ % Max elt $\theta x_{\max}$.

Algorithm: Row and column equilibration.
1: $r_i = \|A(i,:)\|_\infty^{-1}$, $i = 1:n$
2: $R = \mathrm{diag}(r)$
3: $A = RA$ % $A$ is row equilibrated.
4: $s_j = \|A(:,j)\|_\infty^{-1}$, $j = 1:n$
5: $S = \mathrm{diag}(s)$
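A compact Python sketch of the two algorithms follows, written for small dense matrices stored as lists of lists. It is an illustration under assumptions rather than the authors' code: it assumes a square matrix with no zero row or column, simulates fp16 rounding with `struct`, and rounds $\theta x_{\max}$ to fp16 before use. The names `equilibrate` and `alg_2ds` are hypothetical.

```python
import struct

X_MAX = 65504.0  # largest finite fp16 number

def fl16(x):
    """Round to fp16 (no overflow expected: entries are scaled below x_max first)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def equilibrate(A):
    """Row then column equilibration; returns the diagonals r, s of R and S."""
    n = len(A)
    r = [1.0 / max(abs(a) for a in A[i]) for i in range(n)]
    RA = [[r[i] * A[i][j] for j in range(n)] for i in range(n)]  # row equilibrated
    s = [1.0 / max(abs(RA[i][j]) for i in range(n)) for j in range(n)]
    return r, s

def alg_2ds(A, theta=0.1):
    """Two-sided diagonal scaling, scalar scaling to theta*x_max, then rounding."""
    n = len(A)
    r, s = equilibrate(A)
    RAS = [[r[i] * A[i][j] * s[j] for j in range(n)] for i in range(n)]
    beta = max(abs(v) for row in RAS for v in row)
    mu = fl16(theta * X_MAX) / beta
    Ah = [[fl16(mu * v) for v in row] for row in RAS]
    return Ah, r, s, mu

A = [[1e10, 1.0], [-3.0, 1e-3]]
Ah, r, s, mu = alg_2ds(A)
print(Ah)
```

For this badly scaled example every entry of the result is a normalized fp16 number, with no overflow, underflow or subnormals.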


Expose to the Right (ETTR) Analogy

In digital photography: expose image so histogram just touches the right edge.

Maximizes use of dynamic range of sensor.

Same principle here. fp16 numbers:

[Figure: the fp16 number line, with regions labelled "Subnormals" and "Normalized numbers".]

- Keep computations in the yellow zone.
- Likely to need to scale data up.
- Need problem analysis to find appropriate scaling that avoids overflow.


Choice of θ

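In $PA = LU$ (partial pivoting):
- $|\ell_{ij}| \le 1$,
- $|u_{ij}| \le \rho_n \max_{i,j} |a_{ij}|$, where growth factor $\rho_n$ not large.

Take $\theta = 0.1$ (say).

Can show that if $u_{kk}$ underflows then

$$\kappa_\infty(A) \ge \theta \, \frac{x_{\max}}{x_{\min}^{s}}.$$

For fp16 and $\theta = 0.1$, this is $\kappa_\infty(A) \ge 1.09 \times 10^{11}$.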
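The fp16 figure can be checked with a line of arithmetic. The sketch below assumes that the denominator of the bound is the smallest positive subnormal fp16 number, $2^{-24} \approx 5.96 \times 10^{-8}$, an interpretation that reproduces the quoted value; the constant and function names are made up for illustration.

```python
X_MAX = (2.0 - 2.0 ** -10) * 2.0 ** 15  # largest finite fp16 number, 65504
X_SMIN = 2.0 ** -24                      # smallest positive subnormal fp16 number (assumed)

def kappa_lower_bound(theta):
    """Lower bound on kappa_inf(A) implied by an underflowing pivot u_kk."""
    return theta * X_MAX / X_SMIN

print(f"{kappa_lower_bound(0.1):.3e}")  # about 1.1e11
```

So with θ = 0.1 a pivot can only underflow for matrices whose condition number is around $10^{11}$ or more, far beyond the range where an fp16 factorization is useful anyway.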


Numerical Experiments

13 badly scaled matrices from SuiteSparse Matrix Collection with $\max_{i,j} |a_{ij}| > x_{\max}$ for fp16.

- $\kappa_\infty(A) \le 10^{14}$.
- $\max_{i,j} |a_{ij}| = 10^{10}$.
- $\min_{i,j} \{ |a_{ij}| : a_{ij} \ne 0 \} = 10^{-25}$.

Precisions = (H,S,D) and (H,D,Q).

MATLAB using Moler’s fp16 class and Advanpix for quad precision.

IR convergence test is b’err $\le nu$.


GMRES-IR Iterations (IR Steps)

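Entries are GMRES iterations, with IR steps in parentheses.

| Index | (half, single, double): Alg Inf | (half, single, double): Alg Scale | (half, double, quad): Alg Inf | (half, double, quad): Alg Scale |
|-------|---------|---------|---------|---------|
| 1     | 6 (1)   | 2 (1)   | 14 (2)  | 7 (2)   |
| 2     | 4 (1)   | 2 (1)   | 12 (2)  | 8 (2)   |
| 3     | 35 (3)  | 6 (3)   | 84 (4)  | 14 (4)  |
| 4     | 24 (2)  | 0 (0)   | 214 (3) | 28 (4)  |
| 5     | 108 (2) | 0 (0)   | 258 (3) | 6 (2)   |
| 6     | 37 (3)  | 2 (1)   | 180 (4) | 3 (1)   |
| 7     | 0 (0)   | 1 (1)   | 4 (2)   | 2 (1)   |
| 8     | 0 (0)   | 3 (1)   | 25 (3)  | 10 (2)  |
| 9     | 120 (4) | 0 (0)   | 116 (3) | 16 (4)  |
| 10    | – (–)   | 0 (0)   | – (–)   | 18 (4)  |
| 11    | 255 (3) | 0 (0)   | 686 (4) | 37 (4)  |
| 12    | 0 (0)   | – (–)   | 13 (2)  | – (–)   |
| 13    | 0 (0)   | – (–)   | 11 (2)  | – (–)   |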


GMRES-IR Iterations (IR Steps)

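| Index | (half, single, double): Alg 2DS | (half, double, quad): Alg 2DS |
|-------|-------|--------|
| 1     | 0 (0) | 2 (1)  |
| 2     | 0 (0) | 4 (2)  |
| 3     | 2 (1) | 6 (2)  |
| 4     | 0 (0) | 16 (2) |
| 5     | 0 (0) | 2 (1)  |
| 6     | 0 (0) | 2 (1)  |
| 7     | 0 (0) | 2 (1)  |
| 8     | 0 (0) | 8 (2)  |
| 9     | 0 (0) | 9 (3)  |
| 10    | 1 (1) | 11 (3) |
| 11    | 0 (0) | 36 (3) |
| 12    | 0 (0) | 9 (2)  |
| 13    | 0 (0) | 7 (2)  |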


Conditioning

[Figure: log-scale plot (vertical axis $10^{0}$ to $10^{15}$, horizontal axis ticks 2 to 12) with legend entries for $A$, $RAS$ and $MA$.]


Notes on Scaling

The purpose of two-sided diagonal scaling is to squeeze A into fp16.

The scaled alg is mathematically equivalent to the unscaled one if the LU pivot sequence doesn’t change

. . . and even numerically equivalent to the unscaled one if the scaling is by powers of 2.

Scaling may change the pivot sequence, though.

Important to work with the unscaled problem as scaling changes norms!
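The remark about powers of 2 can be seen on a single number. In this small Python sketch (an illustration, with fp16 rounding simulated through `struct` and a hypothetical helper `commutes`), scaling by 32 commutes with rounding to fp16, whereas scaling by 3 does not:

```python
import struct

def fl16(x):
    """Round a Python float (fp64) to the nearest fp16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def commutes(scale, a):
    """True if scaling before rounding gives the same fp16 number as scaling after."""
    return fl16(scale * a) == scale * fl16(a)

a = 0.1
# A power of 2 only shifts the exponent, so no new rounding error appears
# (barring overflow and underflow).
print(commutes(32.0, a))
# A general factor changes the significand and adds its own rounding error.
print(commutes(3.0, a))
```

This is why the entries of $R$ and $S$ are often rounded to the nearest power of 2 in practice.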


Conclusions

Must consider overflow/underflow in conversion to fp16.

Two-sided diagonal scaling works well to compress the range.

Further scalar mult needed to move data close to xmax.

Alg 2DS greatly widens the class of problems solvable with IR3 (4x double precision solver).

Slides available at http://bit.ly/squeeze19

For more on performance of GMRES-IR Ax = b solvers see 3:05-3:25, Jack J. Dongarra, Experiments with Mixed Precision Algorithms in Linear Algebra.


James H. Wilkinson (1919–1986) Centenary

https://nla-group.org/

Wilkinson page and blog posts during 2019.

Advances in Numerical Linear Algebra, Manchester, May 29-30, 2019, Celebrating the Centenary of the Birth of James H. Wilkinson.


References I

E. Carson and N. J. Higham. A new analysis of iterative refinement and its application to accurate solution of ill-conditioned sparse linear systems. SIAM J. Sci. Comput., 39(6):A2834–A2856, 2017.

E. Carson and N. J. Higham. Accelerating the solution of linear systems by iterative refinement in three precisions. SIAM J. Sci. Comput., 40(2):A817–A847, 2018.

N. J. Higham. Iterative refinement for linear systems and LAPACK. IMA J. Numer. Anal., 17(4):495–509, 1997.


References II

N. J. Higham and S. Pranesh. Simulating low precision floating-point arithmetic. MIMS EPrint 2019.xx, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, 2019. In preparation.

N. J. Higham, S. Pranesh, and M. Zounon. Squeezing a matrix into half precision, with an application to solving linear systems. MIMS EPrint 2018.37, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Nov. 2018. 15 pp.
