fast relaxation methods for the matrix exponential
DESCRIPTION
The matrix exponential is a matrix computing primitive used in link prediction and community detection. We describe a fast method to compute it using relaxation on a large linear system of equations. This enables us to compute a column of the matrix exponential in sublinear time, or under a second on a standard desktop computer.

TRANSCRIPT
Relaxation methods for the matrix exponential
on large networks
David F. Gleich, Purdue University
Joint work with Kyle Kloster @ Purdue, supported by NSF CAREER 1149756-CCF
Code www.cs.purdue.edu/homes/dgleich/codes/nexpokit
Mines David Gleich · Purdue 1
Models and algorithms for high performance matrix and network computations
[Slide shows pages from a paper by P. G. Constantine, D. F. Gleich, Y. Hou, and J. Templeton on simulation data analysis, including Fig. 4.5: error in the reduced order model compared to the prediction standard deviation for bubble radii s = 0.39 cm and s = 1.95 cm.]
Tensor eigenvalues and a power method
Tensor methods for network alignment

Network alignment is the problem of computing an approximate isomorphism between two networks. In collaboration with Mohsen Bayati, Amin Saberi, Ying Wang, and Margot Gerritsen, the PI has developed a state of the art belief propagation method (Bayati et al., 2009).

FIGURE 6 – Previous work from the PI tackled network alignment with matrix methods for edge overlap: edges (i, i′) and (j, j′) matched between graphs A and B through the link graph L. This proposal is for matching triangles using tensor methods: edges (i, i′), (j, j′), and (k, k′). If xi, xj, and xk are indicators associated with the edges (i, i′), (j, j′), and (k, k′), then we want to include the product xi xj xk in the objective, yielding a tensor problem.

We propose to study tensor methods to perform network alignment with triangle and other higher-order graph moment matching. Similar ideas were proposed by Svab (2007); Chertok and Keller (2010) also proposed using triangles to aid in network alignment problems. In Bayati et al. (2011), we found that triangles were a key missing component in a network alignment problem with a known solution. Given that preserving a triangle requires three edges between two graphs, this yields a tensor problem:
maximize    Σ_{i∈L} wi xi  +  Σ_{i∈L} Σ_{j∈L} xi xj S_{i,j}  +  Σ_{i∈L} Σ_{j∈L} Σ_{k∈L} xi xj xk T_{i,j,k}   (the last sum is the triangle overlap term)
subject to  x is a matching.
Here, T_{i,j,k} = 1 when the edges corresponding to i, j, and k in L result in a triangle in the induced matching. Maximizing this objective is an intractable problem. We plan to investigate a heuristic based on a rank-1 approximation of the tensor T and a maximum-weight matching based rounding. Similar heuristics have been useful in other matrix-based network alignment algorithms (Singh et al., 2007; Bayati et al., 2009). The work involves enhancing the Symmetric-Shifted-Higher-Order Power Method due to Kolda and Mayo (2011) to incredibly large and sparse tensors. On this aspect, we plan to collaborate with Tamara G. Kolda. In an initial evaluation of this triangle matching on synthetic problems, using the tensor rank-1 approximation alone produced results that identified the correct solution whereas all matrix approaches could not.

vision for the future

All of these projects fit into the PI's vision for modernizing the matrix-computation paradigm to match the rapidly evolving space of network computations. This vision extends beyond the scope of the current proposal. For example, the web is a huge network with over one trillion unique URLs (Alpert and Hajaj, 2008), and search engines have indexed over 180 billion of them (Cuil, 2009). Yet, why do we need to compute with the entire network? By way of analogy, note that we do not often solve partial differential equations or model macro-scale physics by explicitly simulating the motion or interaction of elementary particles. We need something equivalent for the web and other large networks. Such investigations may take many forms: network models, network geometry, or network model reduction. It is the vision of the PI that the language, algebra, and methodology of matrix computations will
maximize    Σ_{ijk} T_{ijk} xi xj xk
subject to  ‖x‖2 = 1
Human protein interaction network: 48,228 triangles. Yeast protein interaction network: 257,978 triangles. The tensor T has ~100,000,000,000 nonzeros. We work with it implicitly.
[x^(next)]_i = ρ · ( Σ_{jk} T_{ijk} xj xk + γ xi ),  where ρ ensures the 2-norm

SSHOPM method due to Kolda and Mayo
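To make the update concrete, here is a small NumPy sketch of a shifted higher-order power iteration on a tiny dense symmetric tensor. This is an illustration only: the tensor, its size, and the shift γ = 1 are made up, and the talk's T is handled implicitly at scale rather than stored densely.

```python
import numpy as np

# A tiny dense symmetric 3-way tensor stands in for T; symmetrize a random one.
rng = np.random.default_rng(1)
n = 5
T = rng.random((n, n, n))
T = sum(T.transpose(p) for p in
        [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]) / 6

gamma = 1.0                     # SSHOPM-style shift (value assumed for the demo)
x = np.ones(n) / np.sqrt(n)
for _ in range(100):
    y = np.einsum('ijk,j,k->i', T, x, x) + gamma * x   # T x x + gamma x
    x = y / np.linalg.norm(y)                          # rho enforces ||x||_2 = 1

print(np.einsum('ijk,i,j,k->', T, x, x, x))            # tensor objective value
```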
Simulation data analysis  SIMAX ’09, SISC ’11, MapReduce ’11, ICASSP ’12
Network alignment  ICDM ’09, SC ’11, TKDE ’13
Fast & scalable network centrality  SC ’05, WAW ’07, SISC ’10, WWW ’10, …
Data clustering  WSDM ’12, KDD ’12, CIKM ’13 …
Massive matrix computations on multi-threaded and distributed architectures:  Ax = b,  min ‖Ax − b‖,  Ax = λx

Image from rockysprings, deviantart, CC share-alike

Everything in the world can be explained by a matrix, and we see how deep the rabbit hole goes. The talk ends, you believe -- whatever you want to.
Matrix exponentials

exp(A) is defined as  exp(A) = Σ_{k=0}^∞ (1/k!) A^k  (always converges)

dx/dt = Ax(t)  ⟺  x(t) = exp(tA) x(0)  (the evolution operator for an ODE)

A is n × n, real.

This is a special case of a function of a matrix f(A); others are f(x) = 1/x, f(x) = sinh(x), …
This talk: a column of the matrix exponential

x = exp(P) ec,  where x is the solution, P the matrix, and ec the column.
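As a concrete (and deliberately naive) illustration of the object x = exp(P) ec, here is a NumPy/SciPy sketch on a made-up four-node graph. Dense expm is exactly what the talk avoids on large networks; it is used here only because the example is tiny.

```python
import numpy as np
from scipy.linalg import expm

# A tiny undirected graph: adjacency matrix A; P = A D^{-1} is column stochastic.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)            # divide each column by its degree

c = 0
ec = np.zeros(4); ec[c] = 1.0    # indicator of column c
x = expm(P) @ ec                 # the column of the matrix exponential

# For column-stochastic P, 1^T P^k ec = 1 for every k, so the entries of x
# sum to sum_k 1/k! = e.
print(x.sum())
```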
Matrix computations in a red-pill

Solve a problem better by exploiting its structure!

This talk: a column of the matrix exponential

x = exp(P) ec,  where x is the solution (localized), P the matrix (large, sparse, stochastic), and ec the column.
Localized solutions

x = exp(P) ec with length(x) = 513,969 and nnz(x) = 513,969: every entry is nonzero, but most are tiny.
[Plots: plot(x); approximation error vs. number of nonzeros retained]
Our mission: find the solution with work roughly proportional to the localization, not the matrix.

Our algorithm: www.cs.purdue.edu/homes/dgleich/codes/nexpokit
[Plot: approximation error vs. number of nonzeros used by our algorithm]
Outline
1. Motivation and setup
2. Converting x = exp(P) ec into a linear system
3. Relaxation methods for linear systems from large networks
4. Error analysis
5. Experiments
[Slide shows the first pages of Moler and Van Loan, “Nineteen Dubious Ways to Compute the Exponential of a Matrix,” SIAM Review, Vol. 20, No. 4, 1978, and “Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five Years Later,” SIAM Review, Vol. 45, No. 1, pp. 3–49, 2003.]
Matrix exponentials on large networks

exp(A) = Σ_{k=0}^∞ (1/k!) A^k.  If A is the adjacency matrix, then A^k counts the number of length-k paths between node pairs. Large entries denote important nodes or edges. Used for link prediction and centrality. [Estrada 2000, Farahat et al. 2002, 2006]

exp(P) = Σ_{k=0}^∞ (1/k!) P^k.  If P is a transition matrix, then P^k is the probability of a length-k walk between node pairs. Used for link prediction, kernels, and clustering or community detection. [Kondor & Lafferty 2002, Kunegis & Lommatzsch 2009, Chung 2007]
Another useful matrix exponential

P column stochastic, e.g. P = A^T D^{-1}, where A is the adjacency matrix.

If A is symmetric:
exp(P^T) = exp(D^{-1} A) = D^{-1} exp(A D^{-1}) D = D^{-1} exp(P) D
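The identity exp(P^T) = D^{-1} exp(P) D is easy to check numerically; an illustrative NumPy/SciPy sketch on a made-up symmetric adjacency matrix:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # symmetric adjacency matrix
d = A.sum(axis=0)
D, D_inv = np.diag(d), np.diag(1.0 / d)
P = A.T @ D_inv                  # column stochastic, P = A^T D^{-1}

lhs = expm(P.T)                  # exp(P^T) = exp(D^{-1} A)
rhs = D_inv @ expm(P) @ D        # = D^{-1} exp(A D^{-1}) D = D^{-1} exp(P) D
print(np.allclose(lhs, rhs))
```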
Another useful matrix exponential

P column stochastic, e.g. P = A^T D^{-1}, where A is the adjacency matrix.

If A is symmetric, for the negative normalized Laplacian −L = D^{-1/2} A D^{-1/2} − I:
exp(−L) = exp(D^{-1/2} A D^{-1/2} − I)
        = (1/e) exp(D^{-1/2} A D^{-1/2})
        = (1/e) D^{-1/2} exp(A D^{-1}) D^{1/2}
        = (1/e) D^{-1/2} exp(P) D^{1/2}

This is the heat kernel of a graph: it solves the heat equation dx(t)/dt = −L x(t) at t = 1.
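The heat-kernel identity exp(−L) = (1/e) D^{-1/2} exp(P) D^{1/2} can likewise be verified numerically (illustrative NumPy/SciPy sketch; the graph is made up):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # symmetric adjacency matrix
d = A.sum(axis=0)
D_sqrt, D_inv_sqrt = np.diag(np.sqrt(d)), np.diag(1.0 / np.sqrt(d))
P = A @ np.diag(1.0 / d)                    # column stochastic (A symmetric)

L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt  # normalized Laplacian
lhs = expm(-L)
rhs = (1.0 / np.e) * D_inv_sqrt @ expm(P) @ D_sqrt
print(np.allclose(lhs, rhs))
```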
Matrix exponentials on large networks. Is a single column interesting? Yes!

exp(P) ec = Σ_{k=0}^∞ (1/k!) P^k ec
Link prediction scores for node c. A community relative to node c.

But … modern networks are large ~ O(10^9) nodes, sparse ~ O(10^11) edges, constantly changing … and so we’d like speed over accuracy.
Newman’s netscience collaboration network: 379 vertices, 1828 non-zeros.

x = exp(P) ec
[Figure: ec has a single one at node c; x is “zero” on most nodes]
The issue with existing methods

We want good results in less than one matvec. Our graphs have small diameter and fast fill-in.

Krylov methods: a few matvecs, quick loss of sparsity due to orthogonality.
exp(P) ec ≈ ρ V exp(H) e1   [Sidje 1998, ExpoKit]

Direct expansion: a few matvecs, quick loss of sparsity due to fill-in.
exp(P) ec ≈ Σ_{k=0}^N (1/k!) P^k ec
Outline
1. Motivation and setup ✓
2. Converting x = exp(P) ec into a linear system
3. Relaxation methods for linear systems from large networks
4. Error analysis
5. Experiments
Our underlying method

Direct expansion:  x = exp(P) ec ≈ Σ_{k=0}^N (1/k!) P^k ec = xN

This method is stable for stochastic P: no cancellation, no unbounded norm, etc.

Lemma: ‖x − xN‖1 ≤ 1/(N! N)
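The lemma is easy to sanity-check numerically. An illustrative NumPy/SciPy sketch on a made-up four-node graph (the dense expm is only the reference answer):

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)          # column stochastic
N, c = 11, 0
ec = np.zeros(4); ec[c] = 1.0

term = ec.copy()               # term holds P^k ec / k!
xN = ec.copy()
for k in range(1, N + 1):
    term = P @ term / k
    xN += term

err = np.abs(expm(P) @ ec - xN).sum()   # ||x - xN||_1
bound = 1.0 / (factorial(N) * N)        # the lemma's bound
print(err <= bound)
```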
Our underlying method as a linear system

Direct expansion:  x = exp(P) ec ≈ Σ_{k=0}^N (1/k!) P^k ec = xN

The expansion is the solution of the block bidiagonal system

[ I                ] [ v0 ]   [ ec ]
[ −P/1  I          ] [ v1 ]   [ 0  ]
[      −P/2  ⋱     ] [ ⋮  ] = [ ⋮  ]
[            ⋱  I  ] [ ⋮  ]   [ ⋮  ]
[         −P/N  I  ] [ vN ]   [ 0  ]

with xN = Σ_{i=0}^N vi; compactly,  (I ⊗ I_N − S_N ⊗ P) v = e1 ⊗ ec.

Lemma: we approximate xN well if we approximate v well.
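For a tiny example the block system can be formed explicitly; an illustrative NumPy/SciPy sketch (dense, made-up graph) confirming that the solution blocks sum to the truncated Taylor approximation:

```python
import numpy as np
from scipy.linalg import expm

n, N = 4, 11
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)                     # column stochastic

# S has 1/k on the subdiagonal, so I - kron(S, P) is the block bidiagonal
# system: row k reads -(P/k) v_{k-1} + v_k = 0, and v_0 = ec.
S = np.zeros((N + 1, N + 1))
for k in range(1, N + 1):
    S[k, k - 1] = 1.0 / k

c = 0
ec = np.zeros(n); ec[c] = 1.0
b = np.kron(np.eye(N + 1)[0], ec)         # e1 (x) ec
M = np.eye((N + 1) * n) - np.kron(S, P)

v = np.linalg.solve(M, b)
xN = v.reshape(N + 1, n).sum(axis=0)      # xN = sum_i v_i
err = np.abs(expm(P) @ ec - xN).sum()
print(err)                                # truncation error only, about 1/(N! N)
```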
Our mission (2): approximately solve Ax = b when A, b are sparse and x is localized.
Outline
1. Motivation and setup ✓
2. Converting x = exp(P) ec into a linear system ✓
3. Relaxation methods for linear systems from large networks
4. Error analysis
5. Experiments
Coordinate descent, Gauss-Southwell, Gauss-Seidel, relaxation & “push” methods

Be greedy. Don’t look at the whole system. Look at equations that are violated and try to fix them.
Coordinate descent, Gauss-Southwell, Gauss-Seidel, relaxation & “push” methods

Algebraically:
  Ax = b
  r^(k) = b − A x^(k)
  x^(k+1) = x^(k) + ej ej^T r^(k)
  r^(k+1) = r^(k) − rj^(k) A ej

Procedurally:
  Solve(A,b)
    x = sparse(size(A,1),1)
    r = b
    while (1)
      pick j where r(j) != 0
      z = r(j)
      x(j) = x(j) + z
      for i where A(i,j) != 0
        r(i) = r(i) - z*A(i,j)
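An illustrative Python rendering of the procedural loop (dense and greedy, i.e. the Gauss-Southwell choice of j; it assumes diag(A) = I, as on the later terminology slide):

```python
import numpy as np

def relax_solve(A, b, tol=1e-10, max_steps=100_000):
    """Relaxation ('push') for Ax = b with diag(A) = I: repeatedly pick a
    violated equation j and fix it by updating x(j)."""
    x = np.zeros_like(b)
    r = b.astype(float).copy()
    for _ in range(max_steps):
        j = int(np.argmax(np.abs(r)))   # Gauss-Southwell: largest residual
        z = r[j]
        if abs(z) < tol:
            break
        x[j] += z                       # x^{k+1} = x^k + e_j e_j^T r^k
        r -= z * A[:, j]                # r^{k+1} = r^k - r_j^k A e_j
    return x

# Example system (I - alpha*P) x = e_c for a small column-stochastic P.
alpha = 0.85
Adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
P = Adj / Adj.sum(axis=0)
M = np.eye(3) - alpha * P               # diag(M) = I since P has a zero diagonal
b = np.array([1.0, 0.0, 0.0])
x = relax_solve(M, b)
print(np.abs(M @ x - b).max())          # residual after convergence
```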
It’s called the “push” method because of PageRank

(I − αP) x = v
r^(k) = v − (I − αP) x^(k)
x^(k+1) = x^(k) + ej ej^T r^(k)
“r^(k+1) = r^(k) − rj^(k) A ej” becomes

ri^(k+1) = 0                          if i = j
ri^(k+1) = ri^(k) + α P_{i,j} rj^(k)  if P_{i,j} ≠ 0
ri^(k+1) = ri^(k)                     otherwise

  PageRankPush(links,v,alpha)
    x = sparse(size(A,1),1)
    r = v
    while (1)
      pick j where r(j) != 0
      z = r(j)
      x(j) = x(j) + z
      r(j) = 0
      z = alpha * z / deg(j)
      for i where “j links to i”
        r(i) = r(i) + z
Demo
Justification of terminology

This method is frequently “rediscovered” (3 times for PageRank!)

Let Ax = b, diag(A) = I.
It’s Gauss-Seidel if j is chosen cyclically.
It’s Gauss-Southwell if j is the largest entry in the residual.
It’s coordinate descent if A is symmetric, pos. definite.
It’s a relaxation step for any A.

Works great for other problems too! [Bonchi, Gleich, et al. J. Internet Math. 2012]
Back to the exponential

[ I                ] [ v0 ]   [ ec ]
[ −P/1  I          ] [ v1 ]   [ 0  ]
[      −P/2  ⋱     ] [ ⋮  ] = [ ⋮  ]
[            ⋱  I  ] [ ⋮  ]   [ ⋮  ]
[         −P/N  I  ] [ vN ]   [ 0  ]

xN = Σ_{i=0}^N vi,   (I ⊗ I_N − S_N ⊗ P) v = e1 ⊗ ec

Solve this system via the same method.
Optimization 1: build the system implicitly.
Optimization 2: don’t store vi, just store the sum xN.
Code (inefficient, but working) for Gauss-Southwell to solve:

function x = nexpm(P,c,tol)
n = size(P,1); N = 11; sumr = 1;
r = zeros(n,N+1); r(c,1) = 1; x = zeros(n,1);  % the residual and solution
while sumr >= tol                     % use max iteration too
  [ml,q] = max(r(:)); i = mod(q-1,n)+1; k = ceil(q/n);  % use a heap in practice for max
  r(q) = 0; x(i) = x(i)+ml; sumr = sumr-ml;    % zero the residual, add to solution
  [nset,~,vals] = find(P(:,i)); ml = ml/k;     % look up the neighbors of node i
  for j=1:numel(nset)                 % for all neighbors
    if k==N, x(nset(j)) = x(nset(j)) + vals(j)*ml;       % add to solution
    else, r(nset(j),k+1) = r(nset(j),k+1) + vals(j)*ml;  % or add to next residual
      sumr = sumr + vals(j)*ml;
    end
  end
end

Todo: use a dictionary for x, r and use a heap or queue for the residual.
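For readers without MATLAB, here is an illustrative Python port of the same sketch, with the same inefficiencies and the same caveats (dense residual array, linear-scan max instead of a heap; the four-node test graph is made up):

```python
import numpy as np
from scipy.linalg import expm

def nexpm(P, c, tol, max_steps=1_000_000):
    """Gauss-Southwell on the implicit block system for exp(P) e_c.

    Columns of r are the residual blocks for v_0 .. v_{N-1}; pushes out of
    the last block go straight into the solution, as in the MATLAB sketch."""
    n = P.shape[0]
    N = 11
    r = np.zeros((n, N))              # residual blocks
    r[c, 0] = 1.0
    x = np.zeros(n)                   # solution accumulator
    sumr = 1.0
    for _ in range(max_steps):        # max-iteration guard
        if sumr < tol:
            break
        i, k = np.unravel_index(np.argmax(r), r.shape)  # use a heap in practice
        ml = r[i, k]
        r[i, k] = 0.0
        x[i] += ml                    # zero the residual, add to the solution
        sumr -= ml
        ml /= k + 1                   # next block's equation is v_{k+1} = P v_k/(k+1)
        if k == N - 1:
            x += P[:, i] * ml         # last block: add directly to the solution
        else:
            r[:, k + 1] += P[:, i] * ml   # push to the next residual block
            sumr += P[:, i].sum() * ml
    return x

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)                 # column stochastic
x = nexpm(P, 0, 1e-7)
err = np.abs(x - expm(P) @ np.eye(4)[0]).sum()
print(err)                            # small: truncation plus the residual tolerance
```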
Outline
1. Motivation and setup ✓
2. Converting x = exp(P) ec into a linear system ✓
3. Relaxation methods for linear systems from large networks ✓
4. Error analysis
5. Experiments
Error analysis for Gauss-Southwell

(I ⊗ I_N − S_N ⊗ P) v = e1 ⊗ ec

Theorem. Assume P is column-stochastic and v^(0) = 0. Then:
(Nonnegativity, the “easy” part) the iterates and residuals are nonnegative: v^(l) ≥ 0 and r^(l) ≥ 0.
(Convergence, the “annoying” part) the residual goes to 0:
  ‖r^(l)‖1 ≤ ∏_{k=1}^{l} (1 − 1/(2dk)) ≤ l^(−1/(2d)),
where d is the largest degree.

Proof sketch: Gauss-Southwell picks the largest residual ⇒ bound the update by the average number of nonzeros in the residual (sloppy) ⇒ algebraic convergence with a slow rate, but each update is REALLY fast: O(d_max log n).

If d is log log n, then our method runs in sub-linear time (but so does just about anything).
Overall error analysis

Components: truncation to N terms; residual to error; approximate solve.

Theorem. After ℓ steps of Gauss-Southwell,
  ‖xN^(ℓ) − x‖1 ≤ 1/(N! N) + (1/e) · ℓ^(−1/(2d))
More recent error analysis

Theorem (Gleich and Kloster, 2013, arXiv:1310.3423). Consider solving personalized PageRank using the Gauss-Southwell relaxation method in a graph with a Zipf law in the degrees with exponent p = 1 and max-degree d; then the work involved in getting a solution with 1-norm error ε is
  work = O( log(1/ε) · (1/ε)^(3/2) · d^2 · (log d)^2 )
Outline
1. Motivation and setup ✓
2. Converting x = exp(P) ec into a linear system ✓
3. Relaxation methods for linear systems from large networks ✓
4. Error analysis ✓
5. Experiments
Our implementations

C++ mex implementation with a heap to implement Gauss-Southwell.
C++ mex implementation with a queue to store all residual entries ≥ 1/(tol nN); at completion, the residual norm ≤ tol.
We use the queue except for the runtime comparison.
Accuracy vs. tolerance

For the pgp social graph (pgp-cc, 10k vertices), we study the precision in finding the 100 largest nodes as we vary the tolerance. This set of 100 does not include the node’s immediate neighbors. (Boxplot over 50 trials)
[Plot: precision at 100 vs. log10 of residual tolerance, from −2 to −7]
Accuracy vs. work

For the dblp collaboration graph (dblp-cc, 225k vertices), we study the precision in finding the 100 largest nodes as we vary the work. This set of 100 does not include the node’s immediate neighbors. (One column, but representative)
[Plot: precision at 10, 25, 100, and 1000 vs. effective matrix-vector products, for tol = 10^−4 and tol = 10^−5]
Runtime

[Plot: runtime in seconds vs. |E| + |V| for TSGS, TSGSQ, EXPV, MEXPV, and TAYLOR on the Flickr social network, 500k nodes, 5M edges]
Outline
1. Motivation and setup ✓
2. Converting x = exp(P) ec into a linear system ✓
3. Coordinate descent methods for linear systems from large networks ✓
4. Error analysis ✓
5. Experiments ✓
References and ongoing work
Kloster and Gleich, Workshop on Algorithms for the Web-graph, 2013. Also see the journal version on arXiv. www.cs.purdue.edu/homes/dgleich/codes/nexpokit
• Error analysis using the queue (almost done …)
• Better linear systems for faster convergence
• Asynchronous coordinate descent methods
• Scaling up to billion node graphs (done …)
Supported by NSF CAREER 1149756-CCF www.cs.purdue.edu/homes/dgleich