

Recent Progress on the Nearest Correlation Matrix Problem

Nick Higham
School of Mathematics
The University of Manchester

http://www.maths.manchester.ac.uk/~higham/talks

Joint work with Nataša Strabic and Vedran Šego

New Directions in Numerical Computation, August 25-28, 2015, Oxford.

In celebration of Nick Trefethen’s 60th birthday

Invalid Correlation Matrices

Correlation matrix: symmetric positive semidefinite with unit diagonal.

Sample correlation matrix from empirical data. May lack definiteness due to

- missing observations,
- asynchronous observations,
- left-censored data,
- stress testing,
- expert judgement,
- separate blocks joined together (aggregation).

How do we make the matrix (semi)definite?

Nick Higham Nearest Correlation Matrix 2 / 27

Correlation Matrix

A correlation matrix has

- ones on the diagonal,
- all eigenvalues nonnegative,
- all elements between −1 and 1.

Is this a correlation matrix?

\[
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}
\]

Spectrum: −0.4142, 1.0000, 2.4142.

Vary $a_{13}$: it must be 1 for a correlation matrix.
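The check on this slide can be reproduced numerically. The following is a minimal sketch, assuming NumPy; the helper name `is_correlation_matrix` and the rounding tolerance `tol` are illustrative choices, not part of the talk.

```python
import numpy as np

def is_correlation_matrix(A, tol=1e-12):
    """Return (verdict, eigenvalues) for a candidate correlation matrix."""
    A = np.asarray(A, dtype=float)
    symmetric = np.allclose(A, A.T)
    unit_diagonal = np.allclose(np.diag(A), 1.0)
    eigenvalues = np.linalg.eigvalsh(A)   # ascending order
    psd = eigenvalues[0] >= -tol          # tolerate rounding error (assumed tolerance)
    return bool(symmetric and unit_diagonal and psd), eigenvalues

# The 3-by-3 example from the slide.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
valid, spectrum = is_correlation_matrix(A)
print(valid, np.round(spectrum, 4))

# Setting a13 = a31 = 1 gives the all-ones matrix, which is positive semidefinite.
A[0, 2] = A[2, 0] = 1.0
valid_after, _ = is_correlation_matrix(A)
print(valid_after)
```

The elementwise bound on the entries is not tested separately because it follows from a unit diagonal together with positive semidefiniteness.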

London Finance Company Question (2000)

“Given a real symmetric matrix A which is almost a correlation matrix what is the best approximating (in Frobenius norm?) correlation matrix?”

- Massage the original data, e.g., plug gaps.
- Make ad hoc modifications to the matrix: e.g., shift negative eigenvalues up to zero, then diagonally scale.
- Find a nearest correlation matrix. ✓

Literature search: very little found.
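For comparison with the nearest correlation matrix, the ad hoc modification mentioned above (shift negative eigenvalues up to zero, then diagonally scale) might look as follows. This is only an illustrative sketch assuming NumPy; the name `adhoc_repair` is hypothetical, and the result is a valid correlation matrix but in general not the nearest one.

```python
import numpy as np

def adhoc_repair(A):
    """Clip negative eigenvalues to zero, then rescale to unit diagonal."""
    A = np.asarray(A, dtype=float)
    w, V = np.linalg.eigh(A)
    B = (V * np.maximum(w, 0.0)) @ V.T    # V diag(max(w, 0)) V^T
    d = 1.0 / np.sqrt(np.diag(B))         # assumes diag(B) > 0
    return B * np.outer(d, d)             # D B D has unit diagonal and stays psd

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
C_adhoc = adhoc_repair(A)
```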

Spherical Parametrization

Pinheiro & Bates (1996), Rebonato & Jäckel (2000).

Parametrize a correlation matrix as $R^T R$, where (for $n = 3$)

\[
R = \begin{bmatrix}
1 & \cos\theta_{12} & \cos\theta_{13} \\
0 & \sin\theta_{12} & \sin\theta_{13}\cos\theta_{23} \\
0 & 0 & \sin\theta_{13}\sin\theta_{23}
\end{bmatrix}.
\]

Hence

\[
\min\{\, \|A - R^T R\|_F^2 : \theta_{ij} \in [0, \pi],\ 1 \le i < j \le n \,\}.
\]

Nonlinear, lots of local minima!
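A short sketch, assuming NumPy, of why this parametrization always yields a correlation matrix for $n = 3$: every column of $R$ has unit 2-norm, so $R^T R$ has unit diagonal and is positive semidefinite by construction. The function name `spherical_factor_3` and the angle values are illustrative only.

```python
import numpy as np

def spherical_factor_3(t12, t13, t23):
    """Upper triangular R for n = 3, as on the slide."""
    return np.array([
        [1.0, np.cos(t12), np.cos(t13)],
        [0.0, np.sin(t12), np.sin(t13) * np.cos(t23)],
        [0.0, 0.0,         np.sin(t13) * np.sin(t23)],
    ])

R = spherical_factor_3(0.3, 1.1, 2.0)   # arbitrary angles in [0, pi]
C = R.T @ R                             # unit diagonal, positive semidefinite
```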

Problem

\[
\min\{\, \|A - X\|_F : X \text{ is a correlation matrix} \,\}
\]

$X \in S_n \cap U_n$, where

\[
S_n = \{\, X \in \mathbb{R}^{n \times n} : X \text{ is symmetric positive semidefinite} \,\},
\]
\[
U_n = \{\, X = X^T \in \mathbb{R}^{n \times n} : x_{ii} = 1,\ i = 1\colon n \,\}.
\]

The constraint set is closed and convex, so there is a unique minimizer.

H (2002):

- Characterization of solution using normal cones of convex sets.
- Alternating projections algorithm.

Alternating Projections

von Neumann (1933), for subspaces.

[Figure: alternating projections between two sets $S_1$ and $S_2$.]

Dykstra (1983) incorporated corrections for closed convex sets.

Algorithm (H, 2002)

Given $A = A^T \in \mathbb{R}^{n \times n}$, compute nearest correlation matrix.

    ΔS_0 = 0, Y_0 = A
    for k = 1, 2, ...
        R_k = Y_{k−1} − ΔS_{k−1}
        X_k = P_{S_n}(R_k)       % Project onto S_n.
        ΔS_k = X_k − R_k         % Dykstra's correction.
        Y_k = P_{U_n}(X_k)       % Project onto U_n.
    end
    Return Y_k.

- $X_k$ and $Y_k$ both converge to the solution.
- Linear convergence, at best.
- Can add further constraints/projections . . .
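The iteration above can be sketched in a few lines of NumPy. This is an unweighted illustration, not the author's nearcorr code: the function names, the relative stopping test on $\|Y_k - X_k\|_F$, and the default tolerance are assumptions made here.

```python
import numpy as np

def proj_psd(R):
    """Frobenius-norm projection onto S_n: clip negative eigenvalues."""
    w, V = np.linalg.eigh((R + R.T) / 2)
    return (V * np.maximum(w, 0.0)) @ V.T

def proj_unit_diag(X):
    """Frobenius-norm projection onto U_n: reset the diagonal to ones."""
    Y = X.copy()
    np.fill_diagonal(Y, 1.0)
    return Y

def nearcorr_sketch(A, tol=1e-8, maxit=1000):
    dS = np.zeros_like(A, dtype=float)   # Dykstra's correction, Delta S_0 = 0
    Y = np.array(A, dtype=float)         # Y_0 = A
    for k in range(1, maxit + 1):
        R = Y - dS
        X = proj_psd(R)                  # project onto S_n
        dS = X - R                       # Dykstra's correction
        Y = proj_unit_diag(X)            # project onto U_n
        # Assumed stopping test: X_k and Y_k agree to relative tolerance tol.
        if np.linalg.norm(Y - X, "fro") <= tol * np.linalg.norm(Y, "fro"):
            return Y, k
    return Y, maxit

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
X_ncm, iterations = nearcorr_sketch(A)
```

No correction is applied to the projection onto $U_n$ because that set is affine, so only the projection onto $S_n$ carries Dykstra's correction, as in the algorithm on the slide.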

Unexpected Applications

Some recent papers (all use alternating projections):

- Simulating wireless links in vehicular networks (2014)
- Analysing carbon dioxide storage resources (2013)
- Applying stochastic small-scale damage functions to German winter storms (2012)
- Predicting breeding values for eventing disciplines and grades in sport horses (2012)
- Characterisation of tool marks on cartridge cases by combining multiple images (2012)
- Experiments in reconstructing twentieth-century sea levels (2011)

Newton Method

Qi & Sun (2006): Newton method based on theory of strongly semismooth matrix functions.

Applies Newton to dual (unconstrained) of the $\min \tfrac{1}{2}\|A - X\|_F^2$ problem.

Dual problem is continuously differentiable, but not twice differentiable ⇒ use generalized Jacobian of gradient.

Globally and quadratically convergent.

Practical improvements: Borsdorf & H (2010).

NAG code g02aaf: order of magnitude faster than alternating projections. See NCM blog post (2013).

Cannot incorporate fixed elements!

Accelerating a Fixed-Point Iteration

We are looking for $x_*$ such that $g(x_*) = x_*$ for $g : \mathbb{R}^n \to \mathbb{R}^n$.

Fixed-point iteration

\[
x_{k+1} = g(x_k), \quad k \ge 0, \quad x_0 \in \mathbb{R}^n \text{ given}.
\]

Acceleration

The iteration history for some chosen $m \ge 0$ is

\[
\begin{array}{cccccc}
\dots & x_{k-m} & x_{k-(m-1)} & \dots & x_{k-1} & x_k \\
\dots & g(x_{k-m}) & g(x_{k-(m-1)}) & \dots & g(x_{k-1}) & g(x_k).
\end{array}
\]

Define $x_{k+1}$ using all of this information.

Anderson Acceleration

Given history length $m$.

    $x_1 = g(x_0)$
    for $k = 1, 2, \dots$ until convergence
        $m_k = \min(m, k)$
        Determine $\theta_1, \dots, \theta_{m_k}$ to minimize $\|u_k - v_k\|_2^2$, where
            $u_k = x_k + \sum_{j=1}^{m_k} \theta_j (x_{k-j} - x_k)$,
            $v_k = g(x_k) + \sum_{j=1}^{m_k} \theta_j \bigl( g(x_{k-j}) - g(x_k) \bigr)$.
        $x_{k+1} = v_k$
    end

If $g$ is linear, the objective function is $\|u_k - g(u_k)\|_2^2$.
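A compact sketch of the algorithm above, assuming NumPy. Writing $f_i = g(x_i) - x_i$, the objective satisfies $u_k - v_k = -\bigl(f_k + \sum_j \theta_j (f_{k-j} - f_k)\bigr)$, so $\theta$ comes from a small linear least squares problem. The function name `anderson`, the residual-based stopping test and the test problem are assumptions for illustration; production codes typically also truncate the stored history and guard against ill-conditioning.

```python
import numpy as np

def anderson(g, x0, m=2, tol=1e-10, maxit=100):
    x_hist = [np.asarray(x0, dtype=float)]
    g_hist = [g(x_hist[0])]
    x = g_hist[0].copy()                        # x_1 = g(x_0)
    for k in range(1, maxit + 1):
        gx = g(x)
        f = gx - x                              # residual f_k = g(x_k) - x_k
        if np.linalg.norm(f) <= tol:            # assumed stopping test
            return x, k
        x_hist.append(x)
        g_hist.append(gx)
        mk = min(m, k)
        # Column j holds f_{k-j} - f_k (for dF) and g(x_{k-j}) - g(x_k) (for dG).
        dF = np.column_stack([(g_hist[-1 - j] - x_hist[-1 - j]) - f
                              for j in range(1, mk + 1)])
        dG = np.column_stack([g_hist[-1 - j] - gx for j in range(1, mk + 1)])
        # Minimize ||u_k - v_k||_2 = ||f + dF @ theta||_2.
        theta, *_ = np.linalg.lstsq(dF, -f, rcond=None)
        x = gx + dG @ theta                     # x_{k+1} = v_k
    return x, maxit

# Illustrative linear contraction g(x) = M x + b with fixed point (I - M)^{-1} b.
M = np.array([[0.5, 0.2], [0.1, 0.4]])
b = np.array([1.0, 2.0])
x_fix, steps = anderson(lambda x: M @ x + b, np.zeros(2), m=2)
```

To accelerate alternating projections in this way, the matrix iterates would first have to be written in vector form, as the talk notes later.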

History of Anderson Acceleration

- Originates with Anderson (1965): integral equations.
- In quantum chemistry known as Pulay mixing or direct inversion in the iterative subspace (DIIS) (Pulay, 1980).
- Recent papers by numerical analysts, e.g., Walker & Ni (2011), Toth & Kelley (2015).

Theory

No general guarantees of convergence!

- Does not require the iteration to be linearly convergent.
- Related to multisecant quasi-Newton methods; equivalent to “bad” Broyden.
- For $Ax = b$ with $m_k = k$, essentially GMRES.
- Some analysis on the convergence for contractive mappings of Anderson acceleration with fixed $m$.

Practicalities

- AA is implemented using differences of function values and iterates.
- A linear least squares problem is solved at each step (the main cost of AA).
- Can write alternating projections for NCM in fixed-point, vector form and apply AA.

Experiment 1

Five invalid correlation matrices from the literature.
Iterations: nearcorr (it) vs. nearcorr_AA (it_AA).

    n    it     it_AA
                m = 1   m = 2   m = 3   m = 4   m = 5
    4    39     15      10      9       9       9
    5    27     17      14      12      11      10
    6    801    305     212     117     126     40
    7    33     15      10      10      10      9

Experiment 2: Fixed Elements (1)

george: 90 × 90

[Figure: sparsity (spy) plot, nz = 1060.]

Keep fixed:

- (1,1) block.
- Main diagonal (not unit).
- “Small” diagonals.

madalyn: 94 × 94

[Figure: image plot of the matrix with colour bar from 0.2 to 1.]

Keep fixed: all diagonal blocks (respective sizes 12, 5, 1, 14, 12, 1, 10, 4, 5, 9, 13 and 8).

Experiment 2: Fixed Elements (2)

nearcorr (it) vs. nearcorr_fe (it_fe) vs. nearcorr_fe_AA (it_AA_fe)

n = 90: finance.
n = 94: carbon dioxide storage.

    n     it    it_fe   it_AA_fe
                        m = 1   m = 2   m = 3   m = 4   m = 5
    90    29    169     93      70      55      45      39
    94    18    40      15      14      12      12      12

H & Strabic (2015), Anderson acceleration of the alternating projections method for computing the nearest correlation matrix, MIMS EPrint 2015.39. Codes available on Github.

Shrinking

Invalid correlation matrix $C$.
Target correlation matrix $T$.
Replace $C$ by $S(\alpha) = (1 - \alpha) C + \alpha T$, where $\alpha \in [0,1]$.

Large literature on shrinking in which

- $C$ and $T$ are both covariance/correlation matrices,
- $\alpha$ is chosen subject to statistical considerations.

Our optimal shrinking parameter:

\[
\alpha_* = \min\{\, \alpha \in [0,1] : S(\alpha) \text{ is positive semidefinite} \,\}.
\]

Additional Requirement: Fixed Block

Data: $N$ random variables, $K$ observations, $X \in \mathbb{R}^{K \times N}$.
Arrange so first $m$ columns have no missing data:

\[
X = [\, x_1 \ \dots \ x_m \ \ x_{m+1} \ \dots \ x_{m+n} \,].
\]

\[
C = \begin{bmatrix} A & Y \\ Y^T & B \end{bmatrix}, \qquad A \in \mathbb{R}^{m \times m},\ B \in \mathbb{R}^{n \times n}.
\]

- Symmetric. ✓
- Unit diagonal. ✓
- Elements in [−1, 1]. ✓
- Positive semidefinite. ✗

Transform $C$ into a valid correlation matrix, while keeping a positive semidefinite block $A$.

The Shrinking Method

\[
S(\alpha) := \alpha \underbrace{\begin{bmatrix} A & 0 \\ 0 & I \end{bmatrix}}_{\text{target}}
+ (1 - \alpha) \begin{bmatrix} A & Y \\ Y^T & B \end{bmatrix}, \qquad \alpha \in [0,1].
\]

\[
S(\alpha) = \begin{bmatrix} A & (1 - \alpha) Y \\ (1 - \alpha) Y^T & \alpha I + (1 - \alpha) B \end{bmatrix}
\]

- Symmetric. ✓
- Unit diagonal. ✓
- Upper-left block $A$. ✓

Find

\[
\alpha_* = \min\{\, \alpha \in [0,1] : S(\alpha) \text{ positive semidefinite} \,\}
= \min\{\, \alpha \in [0,1] : f(\alpha) := \lambda_{\min}(S(\alpha)) \ge 0 \,\}.
\]

Minimal uniform relative change to each unfixed element.

Properties of $f(\alpha) = \lambda_{\min}(S(\alpha))$

$A$ positive definite

[Figure: two plots of $\lambda_{\min}(S(\alpha))$ against $\alpha \in [0, 1]$.]

- $f$ is concave.
- $f(0) < 0$, $f(1) > 0$.
- $\alpha_*$ is the unique zero of $f$ in $[0,1]$.

Bisection with Cholesky

\[
C := \begin{bmatrix} A & Y \\ Y^T & B \end{bmatrix}, \qquad A \in \mathbb{R}^{m \times m},\ B \in \mathbb{R}^{n \times n},
\]

an invalid correlation matrix, with $A$ positive definite.

Algorithm

$A = R_{11}^T R_{11}$ (Cholesky decomposition).

$R_{11}^T X = Y$, $Z = X^T X$.

Interval $[\alpha_\ell, \alpha_r] \equiv [0,1]$.

Repeat until the interval is small enough:

    $\alpha_m = (\alpha_\ell + \alpha_r)/2$
    $T = \alpha_m I + (1 - \alpha_m) B - (1 - \alpha_m)^2 Z$
    Attempt Cholesky of $T$ and set $\alpha_r = \alpha_m$ (success) or $\alpha_\ell = \alpha_m$ (failure) accordingly.

Guarantee psd matrix by taking $\alpha_* \leftarrow \alpha_r$ in bisection.
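The matrix $T$ in the bisection is the Schur complement of $A$ in $S(\alpha)$: since $Z = X^T X = Y^T A^{-1} Y$, we have $T = \alpha I + (1-\alpha) B - (1-\alpha)^2 Y^T A^{-1} Y$, and with $A$ positive definite $S(\alpha)$ is positive definite exactly when $T$ is. A minimal NumPy sketch under these assumptions follows; the function name `shrink_fixed_block`, the interval tolerance, and the use of a failed `np.linalg.cholesky` call as the definiteness test are illustrative choices, not the published codes.

```python
import numpy as np

def shrink_fixed_block(C, m, tol=1e-6):
    """Bisection for alpha_* keeping the leading m-by-m block A fixed."""
    C = np.asarray(C, dtype=float)
    A, Y, B = C[:m, :m], C[:m, m:], C[m:, m:]
    L = np.linalg.cholesky(A)            # A = L L^T, so R11 = L^T
    X = np.linalg.solve(L, Y)            # R11^T X = Y
    Z = X.T @ X
    I = np.eye(B.shape[0])
    lo, hi = 0.0, 1.0                    # [alpha_l, alpha_r]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        T = mid * I + (1 - mid) * B - (1 - mid) ** 2 * Z
        try:
            np.linalg.cholesky(T)        # succeeds only if T is positive definite
            hi = mid
        except np.linalg.LinAlgError:
            lo = mid
    alpha = hi                           # right endpoint guarantees definiteness
    target = np.eye(C.shape[0])
    target[:m, :m] = A
    return alpha, alpha * target + (1 - alpha) * C

C = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
alpha, S = shrink_fixed_block(C, m=1)
```

For this small example $S(\alpha)$ has eigenvalues $1$ and $1 \pm \sqrt{2}(1-\alpha)$, so the computed value can be checked against $\alpha_* = 1 - 1/\sqrt{2}$.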

Generalized Eigenvalue Approach

Looking for

\[
\alpha_* = \min\{\, \alpha \in [0,1] : S(\alpha) \text{ is positive semidefinite} \,\},
\]

where

\[
S(\alpha) = \alpha T + (1 - \alpha) C =: E - \alpha F.
\]

$E$ and $F$ are both symmetric indefinite.

Rewrite

\[
S(\alpha) = (1 - \alpha) \Bigl( \frac{\alpha}{1 - \alpha} T + C \Bigr) =: (1 - \alpha)(-\mu T + C).
\]

$\alpha_*$ from smallest generalized eigenvalue of the definite pencil $C - \mu T$. Solve by

- $T = R^T R$,
- $G = R^{-T} C R^{-1}$,
- find smallest eigenvalue $\mu_*$ of $G$.

Easily modified to exploit fixed (1,1) block.
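From the substitution $\mu = -\alpha/(1-\alpha)$, $S(\alpha)$ is positive semidefinite exactly when $\mu \le \mu_*$, which rearranges to $\alpha \ge \mu_*/(\mu_* - 1)$; this closed form is derived here from the slide's definitions rather than quoted from the talk. A minimal NumPy sketch, assuming $T$ is positive definite so that its Cholesky factor exists (the function name `shrinking_alpha_eig` and the test matrices are illustrative):

```python
import numpy as np

def shrinking_alpha_eig(C, T):
    """alpha_* from the smallest eigenvalue of G = R^{-T} C R^{-1}, T = R^T R."""
    L = np.linalg.cholesky(T)              # T = L L^T, i.e. R = L^T
    W = np.linalg.solve(L, C)              # L^{-1} C
    G = np.linalg.solve(L, W.T)            # L^{-1} C L^{-T} = R^{-T} C R^{-1}
    mu = np.linalg.eigvalsh((G + G.T) / 2)[0]
    # If mu >= 0 then C is already positive semidefinite and no shrinking is needed.
    return 0.0 if mu >= 0 else mu / (mu - 1.0)

C = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
T = np.array([[1.0, 0.5, 0.25],
              [0.5, 1.0, 0.5],
              [0.25, 0.5, 1.0]])           # an arbitrary positive definite target
alpha_eig = shrinking_alpha_eig(C, T)
S_eig = alpha_eig * T + (1 - alpha_eig) * C
```

At the returned value the smallest eigenvalue of $S(\alpha)$ is zero up to rounding, so in floating point a slightly larger $\alpha$ may be preferable when strict definiteness is required.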

Experiment: Shrinking versus NCM

n = 1399, 3120: matrices from finance industry.
g02aa is NAG Newton NCM code.
Shrinking is done by bisection.
Times (secs).

    n       shrinking          g02aa              Distance ‖·‖_F
            1e-3     1e-6      1e-3     1e-6      shrinking   NCM
    1399    0.2      0.3       3.7      4.4       321.0       21.0
    3120    1.0      2.2       28.1     34.3      178.7       5.4
    2798    0.7      1.6       44.2     50.9      1221.2      1089.5
    4519    2.3      5.0       220.9    234.7     1761.5      1631.5
    6240    7.1      17.5      447.3    449.9     2578.1      2446.8

Shrinking Summary

Attractive way to restore definiteness.

Order of magnitude faster than computing NCM.

Can easily incorporate weighting.

Codes available on Github.

Code g02anf in NAG Library Mark 25. Weighting is being added.

H, Strabic & Šego (2014), Restoring definiteness via shrinking, with an application to correlation matrices with a fixed block, MIMS EPrint 2014.54; to appear in SIAM Review.

Conclusions

Invalid correlation matrices are ubiquitous. Frequently replaced by nearest correlation matrix.

Practitioners use alternating projections because easily available (MATLAB, R).

For NCM with fixed element constraints, recommend alternating projections + Anderson acceleration.

Anderson acceleration reduces the number of iterations by

- at least a half for standard NCM and
- at least a third for the (harder) variants.

Shrinking is an attractive alternative: order of magnitude faster than NCM.

References

D. G. Anderson. Iterative procedures for nonlinear integral equations. J. Assoc. Comput. Mach., 12(4):547–560, Oct. 1965.

R. Borsdorf and N. J. Higham. A preconditioned Newton algorithm for the nearest correlation matrix. IMA J. Numer. Anal., 30(1):94–107, 2010.

N. J. Higham. Computing the nearest correlation matrix—A problem from finance. IMA J. Numer. Anal., 22(3):329–343, 2002.

N. J. Higham and N. Strabic. Anderson acceleration of the alternating projections method for computing the nearest correlation matrix. MIMS EPrint 2015.39, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Aug. 2015. 22 pp.

N. J. Higham, N. Strabic, and V. Šego. Restoring definiteness via shrinking, with an application to correlation matrices with a fixed block. MIMS EPrint 2014.54, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Nov. 2014. 19 pp. Revised March 2015. To appear in SIAM Rev.

J. Pinheiro and D. M. Bates. Unconstrained parametrizations for variance-covariance matrices. Statistics and Computing, 6(3):289–296, 1996.

P. Pulay. Convergence acceleration of iterative sequences. The case of SCF iteration. Chem. Phys. Lett., 73(2):393–398, 1980.

H. Qi and D. Sun. A quadratically convergent Newton method for computing the nearest correlation matrix. SIAM J. Matrix Anal. Appl., 28(2):360–385, 2006.

R. Rebonato and P. Jäckel. The most general methodology for creating a valid correlation matrix for risk management and option pricing purposes. Journal of Risk, 2(2):17–27, 2000.

A. Toth and C. T. Kelley. Convergence analysis for Anderson Acceleration. SIAM J. Numer. Anal., 53(2):805–819, 2015.

H. F. Walker and P. Ni. Anderson acceleration for fixed-point iterations. SIAM J. Numer. Anal., 49(4):1715–1735, 2011.