Likelihood Approximation with Parallel Hierarchical Matrices for Large Spatial Datasets


A. Litvinenko, Y. Sun, M. Genton, D. Keyes, CEMSE, KAUST

HIERARCHICAL LIKELIHOOD APPROXIMATION

Suppose we observe a mean-zero, stationary and isotropic Gaussian process Z with a Matérn covariance at n irregularly spaced locations. Let Z = (Z(s_1), ..., Z(s_n))^T; then Z ∼ N(0, C(θ)), where θ ∈ R^q is an unknown parameter vector of interest,

C_ij(θ) = cov(Z(s_i), Z(s_j)) = C(‖s_i − s_j‖, θ), and

C(r) := C_θ(r) = (2σ² / Γ(ν)) (r / (2ℓ))^ν K_ν(r / ℓ),   θ = (σ², ν, ℓ)^T,

is the Matérn covariance function. The MLE of θ is obtained by maximizing the Gaussian log-likelihood function:

L(θ) = −(n/2) log(2π) − (1/2) log|C(θ)| − (1/2) Z^T C(θ)^{−1} Z.
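These two formulas can be prototyped densely before any H-matrix machinery enters. The sketch below (a plain NumPy/SciPy illustration, not the poster's parallel implementation; the helper names matern_cov and log_likelihood are ours) evaluates the Matérn covariance in the parameterization given above and the log-likelihood via a Cholesky factorization — exactly the O(n³) computation that the H-matrix approximation is meant to replace.

import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.spatial.distance import cdist
from scipy.special import gamma, kv

def matern_cov(locs, sigma2, nu, ell):
    # C(r) = 2*sigma2/Gamma(nu) * (r/(2*ell))**nu * K_nu(r/ell), with C(0) = sigma2.
    r = cdist(locs, locs)
    C = np.full(r.shape, sigma2)
    m = r > 0
    C[m] = 2.0 * sigma2 / gamma(nu) * (r[m] / (2.0 * ell))**nu * kv(nu, r[m] / ell)
    return C

def log_likelihood(z, C):
    # L(theta) = -n/2 log(2 pi) - 1/2 log|C| - 1/2 z^T C^{-1} z, via Cholesky C = L L^T.
    n = len(z)
    cf = cho_factor(C, lower=True)
    logdet = 2.0 * np.sum(np.log(np.diag(cf[0])))
    quad = z @ cho_solve(cf, z)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

# tiny usage example on synthetic locations
rng = np.random.default_rng(0)
locs = rng.uniform(0.0, 1.0, size=(300, 2))
C = matern_cov(locs, sigma2=1.0, nu=0.5, ell=0.1)
z = np.linalg.cholesky(C + 1e-10 * np.eye(300)) @ rng.standard_normal(300)
print(log_likelihood(z, C))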

We approximate C ≈ C̃ in the H-matrix format with cost and storage O(kn log n), k ≪ n.

Theorem 1. Let ρ(C̃^{−1}C − I) ≤ ε < 1. Then |log|C| − log|C̃|| ≤ −n log(1 − ε). If, in addition, ‖C^{−1}‖ ≤ c₁ and ‖Z‖ ≤ c₀, then |L̃(θ; k) − L(θ)| ≤ c₀² · c₁ · ε − n log(1 − ε).
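The first bound in Theorem 1 is easy to check numerically for a generic approximation. In the sketch below (an illustration under our own assumptions: C̃ is modeled as C plus a small symmetric perturbation, standing in for an actual H-matrix approximation), ε is the spectral radius ρ(C̃^{−1}C − I) and the log-determinant difference indeed stays below −n log(1 − ε).

import numpy as np

rng = np.random.default_rng(1)
n = 200
A = rng.standard_normal((n, n))
C = A @ A.T + n * np.eye(n)              # SPD test matrix standing in for C(theta)
E = rng.standard_normal((n, n))
Ct = C + 1e-6 * (E + E.T) / 2.0          # stand-in for the H-matrix approximation C~

# epsilon = rho(C~^{-1} C - I), the quantity appearing in Theorem 1
eps = np.max(np.abs(np.linalg.eigvals(np.linalg.solve(Ct, C) - np.eye(n))))
lhs = abs(np.linalg.slogdet(C)[1] - np.linalg.slogdet(Ct)[1])
rhs = -n * np.log(1.0 - eps)
print(f"eps = {eps:.2e}:  |log|C| - log|C~|| = {lhs:.2e}  <=  -n log(1-eps) = {rhs:.2e}")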

Operation        Sequential Compl.     Parallel Compl. (shared memory)
building(C̃)     O(n log n)            O(n log n)/p + O(|V(T)\L(T)|)
storage(C̃)      O(kn log n)           O(kn log n)
C̃·z             O(kn log n)           O(kn log n)/p + n/√p
H-Cholesky       O(k²n log²n)          O(n log n)/p + O(k²n log²n / n^{1/d})

[Figure: Daily soil moisture, Mississippi basin. Box-plots of the estimated covariance length (range 0.02–0.06) for different H-matrix ranks k = {3, 7, 9}, ℓ = 0.0334.]

[Figure: log|C̃|, z^T C̃^{−1} z, and −L̃ plotted against ℓ (ν = 0.325, σ² = 0.98); moisture data, n = 66049, rank k = 11.]

PARALLEL HIERARCHICAL MATRICES (HACKBUSCH, KRIEMANN '05)

Advantages of approximating C by C̃: the H-matrix approximation is cheap; storage and matrix-vector products cost O(kn log n); LU factorization and inversion cost O(k²n log²n); efficient parallel implementations exist.
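Why these costs scale with the rank k: off-diagonal blocks of a smooth covariance between well-separated point clusters are numerically low-rank, so they can be stored as two thin factors and applied to a vector in O(kn) work instead of O(n²). A minimal sketch of this one ingredient (plain NumPy with a squared-exponential kernel and an ad-hoc block, not HLIBPro and not an admissibility-driven partition):

import numpy as np

rng = np.random.default_rng(2)
left = np.sort(rng.uniform(0.0, 1.0, 800))        # one cluster of 1-D locations
right = np.sort(rng.uniform(2.0, 3.0, 800))       # a well-separated second cluster
B = np.exp(-(left[:, None] - right[None, :])**2 / (2.0 * 0.5**2))  # off-diagonal block

U, s, Vt = np.linalg.svd(B, full_matrices=False)
k = int(np.sum(s > 1e-8 * s[0]))                  # numerical rank at tolerance 1e-8
Uk, Vk = U[:, :k] * s[:k], Vt[:k]                 # B ~= Uk @ Vk, stored in O(kn) memory

z = rng.standard_normal(800)
err = np.linalg.norm(B @ z - Uk @ (Vk @ z)) / np.linalg.norm(B @ z)
print(f"numerical rank k = {k} (block size 800), matvec relative error {err:.1e}")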

[Figures: H-matrix block structures with per-block ranks; see caption below.]

(1st) Matérn H-matrix approximation for the moisture example, n = 8000, ε = 10⁻³, ℓ = 0.64, ν = 0.325, σ² = 0.98; 29.3 MB vs. 488.3 MB for the dense matrix, set-up time 0.4 s. (2nd) Cholesky factor L̃ with accuracy ε = 10⁻⁸ in each block, 4.8 s, storage 52.8 MB. (3rd) Distribution across p processors. (4th) Kronecker product of H-matrices, n = 381K. (5th) Discretization of the Mississippi basin, [−84.8°, −72.9°] × [32.446°, 43.4044°].

NUMERICAL EXAMPLES

H-matrix approximation, ν = 0.5, domain G = [0, 1]², ‖C̃‖₂ = {212, 568} for ℓ = {0.25, 0.75}, n = 16049.

k     KLD                  ‖C − C̃‖₂            ‖C C̃⁻¹ − I‖₂
      ℓ=0.25    ℓ=0.75     ℓ=0.25    ℓ=0.75     ℓ=0.25    ℓ=0.75
10    2.6e-3    0.2        7.7e-4    7.0e-4     6.0e-2    3.1
50    3.4e-13   5e-12      2.0e-13   2.4e-13    4e-11     2.7e-9

Computing time and number of iterations for maximization of the log-likelihood L̃(θ; k), n = 66049.

k      size, GB   set-up of C̃, s   compute L̃, s   maximizing, s   # iters
10     1          7                 115             1994            13
20     1.7        11                370             5445            9
dense  38         42                657             ∞               -
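In the same dense-prototype spirit, the "maximizing" and "# iters" columns correspond to handing the negative log-likelihood to a generic optimizer. A hedged sketch, reusing the matern_cov and log_likelihood helpers from the first code example above (the optimizer, bounds, starting point, and the tiny nugget are our choices, not the poster's setup):

import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, locs, z):
    sigma2, nu, ell = theta
    C = matern_cov(locs, sigma2, nu, ell)
    C[np.diag_indices_from(C)] += 1e-6            # tiny nugget for numerical stability
    return -log_likelihood(z, C)

rng = np.random.default_rng(3)
locs = rng.uniform(0.0, 1.0, size=(500, 2))
C_true = matern_cov(locs, 1.0, 0.5, 0.2)
z = np.linalg.cholesky(C_true + 1e-8 * np.eye(500)) @ rng.standard_normal(500)

res = minimize(neg_loglik, x0=[0.5, 0.4, 0.1], args=(locs, z),
               method="L-BFGS-B", bounds=[(1e-3, 10.0), (0.05, 1.5), (1e-2, 1.0)])
print("estimate (sigma2, nu, ell):", res.x, " iterations:", res.nit)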

Moisture data. We used adaptive rank arithmetic with ε = 10⁻⁴ for each block of C̃ and ε = 10⁻⁸ for each block of C̃⁻¹. The number of processing cores is 40.

         compute C̃                              H-Cholesky L̃L̃^T                          inverse
n        compr. rate, %  time, s  size, MB      time, s  size, MB  ‖I − (L̃L̃^T)⁻¹C̃‖₂     time, s  size, MB  ‖I − C̃⁻¹C̃‖₂
10000    86              0.9      106           4.1      109       7.7e-6                 44       230       7.8e-5
30000    92.5            4.3      515           25       557       1.1e-3                 316      1168      1.1e-1

n = 512K, accuracy inside each block 10⁻⁸, matrix set-up 261 s, compression rate 99.98% (0.4 GB against 2006 GB). The H-LU factorization takes 843 s and requires 5.8 GB of RAM; the LU-based inversion error is 2 · 10⁻³.
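"Adaptive rank arithmetic" with a blockwise accuracy, as used above, means each low-rank block keeps just enough singular values to meet a prescribed tolerance ε instead of a fixed rank k. A minimal single-block sketch of that truncation rule (our own NumPy illustration of the idea, not HLIBPro's API; the kernel and block sizes are arbitrary):

import numpy as np

def truncate_block(B, eps):
    """Rank-k factors (A, Vt) with the smallest k such that ||B - A @ Vt||_2 <= eps * ||B||_2."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    k = max(1, int(np.sum(s > eps * s[0])))   # discarded singular values are <= eps * s[0]
    return U[:, :k] * s[:k], Vt[:k]

# usage: compress one smooth 400 x 300 block to relative accuracy 1e-4
x = np.linspace(0.0, 1.0, 400)[:, None]
y = np.linspace(2.0, 3.0, 300)[None, :]
B = 1.0 / (1.0 + (x - y)**2)                  # a generic smooth kernel block
A, Vt = truncate_block(B, 1e-4)
print(A.shape[1], np.linalg.norm(B - A @ Vt, 2) / np.linalg.norm(B, 2))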

(1st) −L vs. ν; (2nd) the same with nuggets {0.01, 0.005, 0.001} for a Gaussian covariance, n = 2000, k = 14, σ² = 1; (3rd) zoom of the 2nd figure; (4th) box-plots of estimated ν vs. the number of locations n.

REFERENCES AND ACKNOWLEDGEMENTS

[1] B. N. Khoromskij, A. Litvinenko, H. G. Matthies, Application of hierarchical matrices for computing the Karhunen-Loève expansion, Computing, Vol. 84, Issue 1-2, pp. 49-67, 2008.

[2] Y. Sun, M. Stein, Statistically and computationally efficient estimating equations for large spatial datasets, JCGS, 2016.

[3] A. Litvinenko, M. Genton, Y. Sun, D. Keyes, H-matrix techniques for approximating large covariance matrices and estimating their parameters, PAMM 16 (1), pp. 731-732, 2016.

[4] W. Nowak, A. Litvinenko, Kriging and spatial design accelerated by orders of magnitude: combining low-rank covariance approximations with FFT-techniques, J. Mathematical Geosciences, Vol. 45, N4, pp. 411-435, 2013.

Work supported by SRI-UQ and ECRC, KAUST. Thanks to Ronald Kriemann for HLIBPro.
