adaptive solver for the p-version of finite element method


INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, VOL. 40, 1767—1784 (1997)

ADAPTIVE SOLVER FOR THE p-VERSION OF FINITE ELEMENT METHOD

JACOB FISH AND RAVI GUTTAL

Department of Civil Engineering and Scientific Computation Research Center, Rensselaer Polytechnic Institute, Troy, NY 12180-359, U.S.A.

SUMMARY

An adaptive solver for large-scale hierarchic finite element systems has been developed. A decision-making methodology aimed at selecting an optimal solution strategy on the basis of estimated conditioning, sparsity and memory requirements for a given problem has been devised. Numerical experiments have been conducted on selected shell and 3-D problems in the range of 1000 to 100 000 degrees of freedom. © 1997 by John Wiley & Sons, Ltd.

KEY WORDS: adaptive solver; sparsity; conditioning; multigrid; conjugate gradient; sparse direct solver

1. INTRODUCTION

A robust, computationally efficient solver has been an elusive goal for the finite element community solving a myriad of large-scale problems. The concept of such a solver poses a dilemma for solution method developers: ‘should one try to devise a single strategy which can handle all problems in a computationally efficient manner, or develop problem-specific computationally efficient strategies?’ This dilemma is not new; just as in other branches of computational mathematics, both approaches are being pursued by researchers. In our current investigation we have adopted the latter. The adaptive solver we propose utilizes the properties of the given problem to select a computationally optimal strategy rather than applying a single solution strategy to every problem.

This work focuses on developing an adaptive multilevel preconditioned solver for symmetric positive-definite hierarchic systems resulting from the p-version of the finite element method. Adaptive single-level preconditioned conjugate gradient solvers for such systems have been investigated by researchers at IBM¹ and by Mandel.² An ideal choice of the preconditioner that minimizes computational effort depends on computer architecture, software considerations, size of the problem, sparsity pattern and condition number of the system to be solved. The vital constituents of the multilevel preconditioned method are the number of auxiliary levels, their discretization, and the technique used to process individual levels. A methodology to determine an ideal choice of these constituents based on the problem data is explored.

Theoretical guidelines for choosing the number of levels and their discretization resulting in an optimal number of iterations have been reported in Reference 3. However, such choices may not necessarily result in optimal CPU times. In large-scale problems, besides floating point operations, matrix computations involve a large amount of integer and logical operations, which can take a significant portion of the total computer time. In addition, memory traffic can be a bottleneck. For optimal computational efficiency it is important to pay as much attention to the flow of data and the logical and integer operations as to the amount of floating point arithmetic. Hence in the present work some practical strategies which result in nearly optimal CPU times are devised.

CCC 0029-5981/97/101767-18$17.50. Received 28 May 1996. Revised 16 September 1996.

In Section 2 we elucidate the multilevel preconditioned algorithm and solution strategy for hierarchic systems. Adaptive selection strategies and numerical results are presented in Section 3. Conclusions and remarks are made in Section 4.

2. MULTILEVEL PRECONDITIONED METHODS FOR HIERARCHIC SYSTEMS

The hierarchic nature of the stiffness matrix resulting from the p-method is well suited for a multilevel preconditioned iterative method. The hierarchic levels utilized during the multilevel iteration process play the role of nested grids employed in the traditional multigrid method.⁴ Multigrid iterations reduce the errors in different frequencies by using auxiliary grids, and in case of the p-version of the finite element method the frequency decomposition is directly available in terms of spectral orders.

Consider the hierarchic system:

$$K^m d^m = f^m, \qquad m = 1, 2, \ldots$$

where

$$K^m = \begin{bmatrix} K^{m-1} & K^m_{12} \\ K^m_{21} & K^m_{11} \end{bmatrix}, \qquad d^m = \begin{Bmatrix} d^{m-1} \\ d^m_1 \end{Bmatrix}, \qquad f^m = \begin{Bmatrix} f^{m-1} \\ f^m_1 \end{Bmatrix} \tag{1}$$

Here $m$ is the maximum number of hierarchic levels used in multilevel iterations; $K^0$ is the stiffness matrix on the initial level; $K^m$ is of order $n_m > n_{m-1}$, where $n_{m-1}$ is the order of the block $K^{m-1}$; $d^m_1 \in R^{(n_m - n_{m-1})}$ and $d^{m-1} \in R^{n_{m-1}}$.

Let $Q^{m-1}_m$ and $Q^m_{m-1}$ be the restriction and prolongation operators, which transfer the data from level $m$ to level $m-1$ and vice versa. For the p-method the restriction operator has a very simple form:

$$Q^{m-1}_m = [\, I \;\; 0 \,] = (Q^m_{m-1})^T \tag{2}$$

where $I$ is the identity matrix of order $n_{m-1}$, and $0$ is the zero matrix of size $n_{m-1} \times (n_m - n_{m-1})$. A single multilevel iteration has a compact recursive definition given by

$$z^m := ML^m(r^m, K^m) \tag{3}$$

where $r^m$ is the residual vector. The details of a V-cycle multilevel iteration process are given in Table I.
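Because the restriction operator in equation (2) is simply $[I \;\; 0]$, the transfer operations need no matrix products at all. The following minimal sketch illustrates this with plain Python lists; the function names `restrict`, `prolongate` and `coarse_matrix` and the 3 x 3 sample data are illustrative assumptions, not part of the original work.

```python
def restrict(r_fine, n_coarse):
    """Apply Q = [I 0]: keep the first n_coarse entries of a fine-level vector."""
    return r_fine[:n_coarse]


def prolongate(z_coarse, n_fine):
    """Apply Q^T: pad a coarse-level vector with zeros up to length n_fine."""
    return z_coarse + [0.0] * (n_fine - len(z_coarse))


def coarse_matrix(K_fine, n_coarse):
    """Q K Q^T for Q = [I 0] is the leading n_coarse x n_coarse block of K."""
    return [row[:n_coarse] for row in K_fine[:n_coarse]]


# Hypothetical level-m data with n_m = 3 and n_{m-1} = 2
K2 = [[4.0, 1.0, 0.5],
      [1.0, 3.0, 0.2],
      [0.5, 0.2, 2.0]]
r2 = [1.0, 2.0, 3.0]

print(restrict(r2, 2))            # [1.0, 2.0]
print(prolongate([1.0, 2.0], 3))  # [1.0, 2.0, 0.0]
print(coarse_matrix(K2, 2))       # [[4.0, 1.0], [1.0, 3.0]]
```

The sketch assumes that the unknowns are ordered so that lower-order modes come first, as implied by the block structure of equation (1).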

The hierarchic multilevel preconditioned method involves three crucial steps:

(1) Coarse Level Correction (CLC),
(2) smoothing,
(3) acceleration.

In case of a two-level scheme, CLC is performed by a direct solution. In the one-level scheme, CLC corresponds to the direct solution of the entire system. The computational complexity of CLC is influenced by the sparsity pattern of the stiffness matrix. On the other hand, the rate of convergence of the iterative method is governed by the condition number of the problem. Thus the computational work of the multilevel preconditioned method is dependent on both the sparsity pattern and the condition number of the stiffness matrix. Memory considerations also play a decisive role in the selection of a solution strategy.

Table I. V-cycle multigrid

1. Loop $i = 0, 1, 2, \ldots$ until convergence; if $i = 0$, set $d^m = 0$.

2. Perform $\gamma_1$ presmoothing operations

$${}^{i}_{\gamma_1}d^m := \mathrm{smooth}(\gamma_1,\ {}^{i}_{0}d^m,\ K^m,\ f^m)$$

where the left superscript and subscript denote the cycle number and smoothing count, respectively.

3. Restrict residual from level $m$ to $m-1$

$$r^{m-1} = Q^{m-1}_m \left( f^m - K^m\, {}^{i}_{\gamma_1}d^m \right)$$

4. Coarse level correction

If $(m-1)$ is the lowest level, solve directly: $z^{m-1} = (K^{m-1})^{-1} r^{m-1}$

Else: $z^{m-1} := ML^{m-1}(r^{m-1}, K^{m-1})$

5. Prolongate from level $m-1$ to $m$

$${}^{i}_{\gamma_1+1}d^m = {}^{i}_{\gamma_1}d^m + Q^m_{m-1} z^{m-1}$$

6. Perform $\gamma_2$ postsmoothing operations

$${}^{i+1}_{0}d^m := \mathrm{smooth}(\gamma_2,\ {}^{i}_{\gamma_1+1}d^m,\ K^m,\ f^m)$$
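The V-cycle of Table I, specialised to two levels, can be sketched in a few lines. The code below is only an illustration under stated assumptions: it uses dense Python lists, symmetric Gauss-Seidel as the smoother, and a small Gaussian elimination routine (`solve_dense`) as a stand-in for the sparse direct coarse-level solver; all function names and the 4 x 4 sample matrix are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (stand-in for the CLC solver)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def sgs_sweep(K, f, d):
    """One symmetric Gauss-Seidel sweep (forward, then backward), in place."""
    n = len(f)
    for order in (range(n), reversed(range(n))):
        for i in order:
            s = sum(K[i][j] * d[j] for j in range(n) if j != i)
            d[i] = (f[i] - s) / K[i][i]
    return d


def two_level_cycle(K, f, d, n_coarse, gamma1=1, gamma2=1):
    """One two-level cycle: presmooth, restrict, coarse solve, prolongate, postsmooth."""
    for _ in range(gamma1):
        sgs_sweep(K, f, d)
    r = [fi - ki for fi, ki in zip(f, matvec(K, d))]
    K0 = [row[:n_coarse] for row in K[:n_coarse]]  # leading block, Q K Q^T
    z0 = solve_dense(K0, r[:n_coarse])             # restriction keeps first entries
    for i in range(n_coarse):                      # prolongation pads with zeros
        d[i] += z0[i]
    for _ in range(gamma2):
        sgs_sweep(K, f, d)
    return d


# Hypothetical symmetric positive-definite test system
K = [[4.0, 1.0, 0.5, 0.0],
     [1.0, 3.0, 0.2, 0.1],
     [0.5, 0.2, 2.0, 0.3],
     [0.0, 0.1, 0.3, 1.5]]
f = [1.0, 2.0, 3.0, 4.0]
d = [0.0] * 4
for _ in range(30):
    two_level_cycle(K, f, d, n_coarse=2)
residual = max(abs(fi - ki) for fi, ki in zip(f, matvec(K, d)))
print(residual)
```

In this sketch the cycle is used as a stand-alone iteration; in the method described here it serves as a preconditioner inside an acceleration scheme.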

2.1. Smoothing

The performance of the multilevel preconditioned method is influenced by the type of smoothing technique chosen. In our previous work⁵ we have identified three efficient smoothing techniques:

(1) Block diagonal smoothing,
(2) Incomplete Cholesky factorization (ICC),
(3) Symmetric Gauss-Seidel (SGS).

In case of block diagonal smoothing, a smaller than normal subset of unknowns can be updated during the smoothing phase at a given level by taking advantage of the fact that smoothing mainly affects the highest oscillatory modes of error. Thus relaxation sweeps (smoothing) can be performed block by block, keeping the rest of the degrees of freedom fixed. The multilevel method resulting from such block diagonal smoothing was named the Hierarchical Basis Multigrid (HBM) technique by Bank et al.⁶ It has been shown⁶ that the rate of convergence of HBM has a logarithmic dependence on the problem size, as opposed to the multilevel method with regular smoothing, which has an optimal rate of convergence independent of the mesh size and spectral order.

An incomplete Cholesky factor is a popular smoother and results in robust multilevel methods.⁷ Unfortunately, an incomplete factor exists only for some special cases of symmetric positive-definite matrices (M-matrices). For the general class of symmetric positive-definite matrices, rejection (dropping) of non-zero terms during incomplete factorization often leads to an unstable factorization process resulting in small pivots in the factor or in an indefinite system. In such cases the diagonal entries are modified by adding a small positive number when required, before elimination takes place, to guarantee that the resulting incomplete factor is positive definite.

Based on the type of rejection (dropping) criteria, two types of incomplete factorizations are possible:

(1) Incomplete factorization by position ICC(P): In this case any non-zero entry generated during factorization is not retained if it does not fit into the sparsity pattern adopted,⁸ which in our case coincides with that of the matrix K.

(2) Incomplete factorization by magnitude or value ICC(V): In this case the non-zero entry generated during factorization is retained only if its magnitude satisfies a specified criterion.⁹

The incomplete factorization by value ICC(V) is computationally expensive compared to that by position ICC(P), but results in a higher rate of convergence of the multilevel method, especially for ill-conditioned problems. For incomplete factorization of the dense hierarchic stiffness matrix a combination of ICC(V) with no fill-in for lower polynomial orders and ICC(P) for higher orders involves significantly less computational work than ICC(V) for the entire stiffness matrix. This strategy emulates a block diagonal preconditioner with larger blocks for lower polynomial orders.¹⁰
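To make the idea of dropping by position concrete, the following sketch computes an incomplete Cholesky factor that retains only entries inside the sparsity pattern of K. It is a dense, illustrative version and not the implementation used in this work; the function name `icc_by_position` and the uniform `shift` parameter (a simplified stand-in for the diagonal modification described above) are assumptions.

```python
import math


def icc_by_position(K, shift=0.0):
    """Incomplete Cholesky that keeps only entries where K itself is non-zero.

    Returns a lower-triangular L with L L^T approximately equal to K.
    `shift` (hypothetical) is added to every diagonal entry before elimination.
    """
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = K[j][j] + shift - sum(L[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError("non-positive pivot; a larger shift is needed")
        L[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            if K[i][j] == 0.0:
                continue  # drop by position: entry lies outside the pattern of K
            t = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = t / L[j][j]
    return L


# A tridiagonal matrix produces no fill-in, so the incomplete factor is exact.
K_tri = [[4.0, -1.0, 0.0],
         [-1.0, 4.0, -1.0],
         [0.0, -1.0, 4.0]]
L_tri = icc_by_position(K_tri)
recon = [[sum(L_tri[i][k] * L_tri[j][k] for k in range(3)) for j in range(3)]
         for i in range(3)]

# An "arrow" matrix would fill position (2, 1) in the exact factor; here it is dropped.
K_arrow = [[4.0, 1.0, 1.0],
           [1.0, 4.0, 0.0],
           [1.0, 0.0, 4.0]]
L_arrow = icc_by_position(K_arrow)
print(L_arrow[2][1])  # 0.0
```

An ICC(V) variant would replace the pattern test with a test on the magnitude of the candidate entry; the specific criterion is the one given in Reference 9 and is not reproduced here.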

The Symmetric Gauss-Seidel (SGS) smoothing is particularly suited for relatively well-conditioned problems. The SGS smoothing is very attractive from memory considerations as it requires no extra storage.

The type of smoothing (level > 1) will be adaptively selected based on the problem data.

2.2. Acceleration Schemes

For ill-conditioned problems, such as thin shells, it is desirable to accelerate the rate of convergence of the multilevel methods. In this subsection we present two acceleration schemes which require a small fraction of computational effort, but at the same time are efficient in expediting the convergence of the multilevel methods.

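Two parameter acceleration scheme. Let ${}^{i}r^m$ be the residual vector at the end of the $i$th multilevel ($m$-level) iteration. The incremental multilevel solution for the next iteration, ${}^{i}z^m = ML^m({}^{i}r^m, K^m)$, is used as a predictor in the two parameter acceleration scheme. The solution in the correction phase is then updated as follows:

$${}^{i+1}v = {}^{i}\alpha\, {}^{i}z^m + {}^{i}\beta\, {}^{i}v \tag{4}$$

$${}^{i+1}d^m = {}^{i}d^m + {}^{i+1}v \tag{5}$$

where the parameters $({}^{i}\alpha, {}^{i}\beta)$ are obtained by minimizing the potential energy functional:

$$\tfrac{1}{2}\left({}^{i}d^m + {}^{i}\alpha\, {}^{i}z^m + {}^{i}\beta\, {}^{i}v\right)^T K^m \left({}^{i}d^m + {}^{i}\alpha\, {}^{i}z^m + {}^{i}\beta\, {}^{i}v\right) - \left({}^{i}d^m + {}^{i}\alpha\, {}^{i}z^m + {}^{i}\beta\, {}^{i}v\right)^T f^m \;\to\; \min_{{}^{i}\alpha,\ {}^{i}\beta} \tag{6}$$

The resulting multilevel preconditioned algorithm is summarized in Table II.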
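For readers who wish to see where the 2 x 2 system of Table II comes from, the stationarity conditions of the functional (6) can be written out as follows. This is a short derivation sketch; the left index $i$ and the level index $m$ are omitted for brevity, $r = f - Kd$ denotes the current residual, and the symmetry of $K$ is used.

```latex
\Phi(\alpha,\beta) = \tfrac{1}{2}(d+\alpha z+\beta v)^{T}K(d+\alpha z+\beta v) - (d+\alpha z+\beta v)^{T}f

\frac{\partial\Phi}{\partial\alpha} = z^{T}K(d+\alpha z+\beta v) - z^{T}f = 0
  \;\Longrightarrow\; \alpha\, z^{T}Kz + \beta\, z^{T}Kv = z^{T}r

\frac{\partial\Phi}{\partial\beta} = v^{T}K(d+\alpha z+\beta v) - v^{T}f = 0
  \;\Longrightarrow\; \alpha\, z^{T}Kv + \beta\, v^{T}Kv = v^{T}r

\begin{bmatrix} z^{T}Kz & z^{T}Kv \\ z^{T}Kv & v^{T}Kv \end{bmatrix}
\begin{Bmatrix} \alpha \\ \beta \end{Bmatrix}
=
\begin{Bmatrix} z^{T}r \\ v^{T}r \end{Bmatrix}
```

With the auxiliary vectors $x = Kz$ and $y = Kv$, the matrix entries become the scalar products $(x, z)$, $(x, v)$ and $(y, v)$, which is the form used in equation (8).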

Conjugate gradient acceleration scheme. The conjugate gradient method can be used as an acceleration scheme for the multilevel method. The acceleration parameters ${}^{i}\alpha$ and ${}^{i}\beta$ are found from the line search and the K-orthogonality condition $(K\,{}^{i}v,\ {}^{i+1}v) = 0$, respectively. The resulting multilevel preconditioned conjugate gradient algorithm is outlined in Table III.

Table II. Two parameter acceleration scheme

Step 1

$${}^{0}d = 0, \qquad {}^{0}r = f$$

$${}^{0}z := ML({}^{0}r, K)$$

$${}^{0}v = {}^{0}y = 0$$

$${}^{0}x = K\, {}^{0}z$$

$${}^{0}\beta = 0; \qquad {}^{0}\alpha = (f, {}^{0}z) / ({}^{0}x, {}^{0}z) \tag{7}$$

Step 2. Do $i = 0, 1, 2, \ldots$ until convergence

$$\begin{Bmatrix} {}^{i}\alpha \\ {}^{i}\beta \end{Bmatrix} = \begin{bmatrix} ({}^{i}x, {}^{i}z) & ({}^{i}x, {}^{i}v) \\ ({}^{i}x, {}^{i}v) & ({}^{i}y, {}^{i}v) \end{bmatrix}^{-1} \begin{Bmatrix} ({}^{i}r, {}^{i}z) \\ ({}^{i}r, {}^{i}v) \end{Bmatrix} \qquad \forall\, i > 0 \tag{8}$$

$${}^{i+1}v = {}^{i}\alpha\, {}^{i}z + {}^{i}\beta\, {}^{i}v$$

$${}^{i+1}d = {}^{i}d + {}^{i+1}v$$

$${}^{i+1}y = {}^{i}\alpha\, {}^{i}x + {}^{i}\beta\, {}^{i}y$$

$${}^{i+1}r = {}^{i}r - {}^{i+1}y$$

$${}^{i+1}z = ML({}^{i+1}r, K)$$

$${}^{i+1}x = K\, {}^{i+1}z$$

Convergence criteria

$$\sqrt{\frac{({}^{i}r, {}^{i}r)}{({}^{0}r, {}^{0}r)}} < \varepsilon$$
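The two parameter scheme of Table II can be sketched as follows. This is a minimal illustration under assumptions: dense Python lists are used, and a diagonal (Jacobi) preconditioner `jacobi` stands in for the multilevel operator ML; the function names and the 3 x 3 test system are hypothetical.

```python
def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(A, x):
    """Dense matrix-vector product."""
    return [dot(row, x) for row in A]


def jacobi(r, K):
    """Stand-in for ML(r, K): diagonal preconditioning."""
    return [ri / K[i][i] for i, ri in enumerate(r)]


def two_parameter_solve(K, f, precond=jacobi, eps=1e-10, max_iter=200):
    """Two parameter accelerated iteration; returns (solution, iterations)."""
    n = len(f)
    d, r = [0.0] * n, f[:]
    v, y = [0.0] * n, [0.0] * n
    r0 = dot(r, r)
    if r0 == 0.0:
        return d, 0
    for i in range(max_iter):
        if (dot(r, r) / r0) ** 0.5 < eps:
            return d, i
        z = precond(r, K)
        x = matvec(K, z)                       # the only matrix-vector product
        if i == 0:
            alpha, beta = dot(r, z) / dot(x, z), 0.0      # equation (7)
        else:
            a11, a12, a22 = dot(x, z), dot(x, v), dot(y, v)
            b1, b2 = dot(r, z), dot(r, v)
            det = a11 * a22 - a12 * a12
            alpha = (a22 * b1 - a12 * b2) / det           # equation (8)
            beta = (a11 * b2 - a12 * b1) / det
        v = [alpha * zi + beta * vi for zi, vi in zip(z, v)]
        y = [alpha * xi + beta * yi for xi, yi in zip(x, y)]  # y = K v by linearity
        d = [di + vi for di, vi in zip(d, v)]
        r = [ri - yi for ri, yi in zip(r, y)]
    return d, max_iter


# Hypothetical symmetric positive-definite test system
K = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
f = [1.0, 2.0, 3.0]
d, iterations = two_parameter_solve(K, f)
residual = max(abs(fi - ki) for fi, ki in zip(f, matvec(K, d)))
print(iterations, residual)
```

Note that the vector $y = Kv$ is updated by linear combination rather than by an extra multiplication with K, which is why the scheme costs only additional scalar products.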

It can be shown that the two schemes are mathematically equivalent in the absence of round-off errors. However, for well-conditioned problems conjugate gradient acceleration is superior because it involves fewer scalar product evaluations. On the other hand, for poorly conditioned problems the two-parameter acceleration is less sensitive to round-off errors, resulting in fewer iterations (Section 3.2). Note that the two acceleration schemes require no additional matrix-vector multiplication and their benefit clearly overshadows the cost involved in vector product evaluations.⁵

2.3. Adaptive multilevel preconditioned solution method

In this subsection we outline the adaptive strategy for solving symmetric positive-definite hierarchic systems:

1. Estimate the condition number ($\kappa$) and the sparsity (average bandwidth) of the system.
2. Estimate the memory requirements ($\mu_i$) for alternative solution methods.
3. Given $\kappa$, the average bandwidth, $\mu_i$ and the maximum available memory, select an optimal multilevel method. A particular choice of the multilevel method includes selection of:
   (a) type of smoothing,
   (b) number of levels,
   (c) acceleration scheme.
4. Monitor the iterative algorithm for localized divergence.

Table III. Conjugate gradient acceleration scheme

Step 1

$${}^{0}d = 0, \qquad {}^{0}r = f$$

$${}^{0}z := ML({}^{0}r, K)$$

$${}^{0}v = {}^{0}z$$

Step 2. Do $i = 0, 1, 2, \ldots$ until convergence

$${}^{i}\alpha = \frac{{}^{i}r^T\, {}^{i}v}{{}^{i}v^T K\, {}^{i}v} \tag{9}$$

$${}^{i+1}d = {}^{i}d + {}^{i}\alpha\, {}^{i}v$$

$${}^{i+1}r = {}^{i}r - {}^{i}\alpha\, K\, {}^{i}v$$

$${}^{i+1}z = P^{-1}\, {}^{i+1}r = ML({}^{i+1}r, K)$$

$${}^{i+1}\beta = \frac{{}^{i+1}r^T\, {}^{i+1}z}{{}^{i}r^T\, {}^{i}z} \tag{10}$$

$${}^{i+1}v = {}^{i+1}z + {}^{i+1}\beta\, {}^{i}v$$

Convergence criteria

$$\sqrt{\frac{({}^{i}r, {}^{i}r)}{({}^{0}r, {}^{0}r)}} < \varepsilon$$

The iteration process is monitored by the values of the parameters ($\alpha$, $\beta$) defined in equations (7) to (10). A localized divergence is indicated by negative or very small values of the acceleration parameter $\alpha$. Localized divergence indicates that a stronger multilevel preconditioner is required. It can be circumvented either by selecting a stronger smoothing strategy or by increasing the size of the coarse level. When incomplete factorization smoothing is used in the multilevel scheme, divergence is attributed to the incomplete factor being close to singular. In this case, the incomplete factor is recalculated on a diagonally scaled system or by performing ICC(V) on a larger lower-order system. Another possibility is allowing additional fill-in during factorization. In the present case a combination of diagonal scaling and ICC(V) on a larger lower-order system is employed.
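The conjugate gradient scheme of Table III, together with the monitoring of $\alpha$ described above, might be sketched as follows. The code is illustrative only: a Jacobi preconditioner stands in for ML, the threshold `alpha_floor` for "very small" values is a hypothetical choice (the text gives no numerical value), and the function names and test data are assumptions.

```python
def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(A, x):
    """Dense matrix-vector product."""
    return [dot(row, x) for row in A]


def jacobi(r, K):
    """Stand-in for ML(r, K): diagonal preconditioning."""
    return [ri / K[i][i] for i, ri in enumerate(r)]


def pcg_monitored(K, f, precond=jacobi, eps=1e-10, max_iter=200,
                  alpha_floor=1e-12):
    """Preconditioned CG that stops when alpha signals localized divergence.

    Returns (solution, iterations, status).
    """
    d = [0.0] * len(f)
    r = f[:]
    z = precond(r, K)
    v = z[:]
    rz = dot(r, z)
    r0 = dot(r, r)
    for i in range(max_iter):
        if (dot(r, r) / r0) ** 0.5 < eps:
            return d, i, "converged"
        Kv = matvec(K, v)
        alpha = dot(r, v) / dot(v, Kv)                    # equation (9)
        if alpha < alpha_floor:
            # Negative or very small alpha: a stronger preconditioner is needed.
            return d, i, "localized divergence"
        d = [di + alpha * vi for di, vi in zip(d, v)]
        r = [ri - alpha * kvi for ri, kvi in zip(r, Kv)]
        z = precond(r, K)
        rz_new = dot(r, z)
        beta = rz_new / rz                                # equation (10)
        v = [zi + beta * vi for zi, vi in zip(z, v)]
        rz = rz_new
    return d, max_iter, "not converged"


K = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
f = [1.0, 2.0, 3.0]

d, iterations, status = pcg_monitored(K, f)
residual = max(abs(fi - ki) for fi, ki in zip(f, matvec(K, d)))
print(status, iterations, residual)

# A deliberately indefinite "preconditioner" triggers the divergence flag.
_, _, bad_status = pcg_monitored(K, f, precond=lambda r, K: [-ri for ri in r])
print(bad_status)  # localized divergence
```

In an adaptive solver the "localized divergence" status would trigger one of the remedies listed above, such as recomputing the incomplete factor on a diagonally scaled system.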

An intelligent choice of the multilevel schedule and adaptive smoothing are vital for computational efficiency of the multilevel preconditioned method. The ability to reliably predict the optimal solution strategy for a given problem by purely theoretical means is questionable, and thus numerical experiments are utilized in the decision-making process.

3. SELECTION OF OPTIMAL MULTILEVEL PRECONDITIONED METHOD

In this section we elucidate the decision-making methodology to select an optimal multilevel preconditioned method. Implementation issues of the adaptive multilevel preconditioned method are also outlined. Numerical experiments are conducted on shell problems with varying thickness, number of elements, and polynomial orders. For the 3-D problems only the number of elements and polynomial orders are changed. The large-scale shell problems considered here are the Canoe and the Car illustrated in Figures 1 and 2, respectively. These problems are modelled with hierarchic shell elements with six d.o.f.s per node (three global translations and three global rotations), with the geometry mapped by cubic Lagrange interpolation functions.⁵ The Canoe is meshed with 288 and 512 elements, the Car with 198 and 252 elements, respectively. The 3-D problems considered are the Flange and the V-block illustrated in Figures 3 and 4, respectively. Both 3-D problems are modelled with hexahedral elements. The geometry of the hexahedral elements is mapped by cubic Lagrangian functions. The V-block is meshed with 63 and 128 elements and the Flange is modelled with 82 and 135 elements, respectively.

Figure 1. RPI concrete canoe

Figure 2. RPI hybrid car body

Figure 3. Flange

Figure 4. V-Block

The degrees of freedom corresponding to interior modes are statically condensed for both shell and hexahedral elements. Static condensation of interior modes results in a better conditioned system without affecting the sparsity. The condition number of the statically condensed system grows as $O(\log^2 p)$, as compared to $O(p^2)$ for the system without static condensation, for 2-D problems.¹¹ For convergence, $\varepsilon$ is selected as $10^{-5}$.
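Static condensation eliminates the interior unknowns through a Schur complement. The sketch below shows the operation on a small dense system; it assumes, purely for illustration, that the retained unknowns are numbered first and the interior unknowns last, and the names `condense` and `solve_dense` as well as the sample matrix are hypothetical.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def condense(K, f, n_b):
    """Eliminate interior unknowns (indices n_b .. n-1) from K d = f.

    Returns K_c = K_bb - K_bi K_ii^{-1} K_ib and f_c = f_b - K_bi K_ii^{-1} f_i.
    """
    n_i = len(f) - n_b
    K_bi = [row[n_b:] for row in K[:n_b]]
    K_ib = [row[:n_b] for row in K[n_b:]]
    K_ii = [row[n_b:] for row in K[n_b:]]
    # cols[j] holds column j of K_ii^{-1} K_ib
    cols = [solve_dense(K_ii, [K_ib[r][j] for r in range(n_i)])
            for j in range(n_b)]
    y = solve_dense(K_ii, f[n_b:])
    K_c = [[K[i][j] - sum(K_bi[i][k] * cols[j][k] for k in range(n_i))
            for j in range(n_b)] for i in range(n_b)]
    f_c = [f[i] - sum(K_bi[i][k] * y[k] for k in range(n_i))
           for i in range(n_b)]
    return K_c, f_c


# Hypothetical 4 x 4 system: two retained and two interior unknowns
K = [[4.0, 1.0, 0.5, 0.0],
     [1.0, 3.0, 0.2, 0.1],
     [0.5, 0.2, 2.0, 0.3],
     [0.0, 0.1, 0.3, 1.5]]
f = [1.0, 2.0, 3.0, 4.0]

K_c, f_c = condense(K, f, n_b=2)
d_b = solve_dense(K_c, f_c)   # retained unknowns from the condensed system
d_full = solve_dense(K, f)    # reference solution of the full system
print(d_b, d_full[:2])
```

The retained unknowns obtained from the condensed system agree with those of the full system, which is what allows the condensation to be done element by element before assembly.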

An efficient multilevel preconditioned method requires a computationally efficient coarse level correction method at the lowest level. A sparse direct solver,¹² which exploits the sparsity of the stiffness matrix, is employed. In Reference 12 the equations are renumbered using the minimum degree algorithm.¹³ The sparse solver¹² is superior to most envelope methods in terms of computational work and has been widely used in industrial applications.

3.1. Estimation of condition number, sparsity and storage requirements

Estimation of the condition number, the sparsity and the storage requirements plays an important role in the decision-making process of selecting the optimal solution strategy. The exact calculation of the condition number is not practical and thus an estimate of the condition number is evaluated. The condition number estimate $\kappa$ is evaluated as follows:

$$\kappa = \hat{\lambda}_{\max} / \hat{\lambda}_{\min} \tag{11}$$

where $\hat{\lambda}_{\max}$ and $\hat{\lambda}_{\min}$ are the estimates of the maximum and minimum eigenvalues of K, respectively. For a symmetric positive-definite matrix K, the maximum eigenvalue is bounded by the maximum matrix norm $\|K\|_\infty$, which is used as an estimate:

$$\hat{\lambda}_{\max} = \|K\|_\infty = \max_i \sum_j |k_{ij}| \tag{12}$$

To evaluate $\hat{\lambda}_{\min}$, we estimate $\lambda_{\max}(K_0^{-1})$ using the Lanczos method, where $K_0$ is the stiffness matrix of a lower-order system. A plot of polynomial order versus the minimum eigenvalue ($\lambda_{\min}$), illustrated in Figure 5 for four problems (Car, Canoe, V-Block, and Flange), indicates that the smallest eigenvalue of a lower-order system is a good estimate for the entire system.
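The estimate in equations (11) and (12) can be illustrated with a short sketch. Note the substitution: the text uses the Lanczos method for $\lambda_{\max}(K_0^{-1})$, whereas the code below uses plain power iteration as a simpler stand-in, with a dense solve in place of a factorization of $K_0$. The function names and the small tridiagonal test matrices are assumptions made for the example.

```python
def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def inf_norm(K):
    """Maximum absolute row sum, the bound used in equation (12)."""
    return max(sum(abs(k) for k in row) for row in K)


def lambda_max_power(apply_op, n, iters=200):
    """Largest eigenvalue of an SPD operator by power iteration (not Lanczos)."""
    x = [float(i + 1) for i in range(n)]  # arbitrary non-zero start vector
    lam = 0.0
    for _ in range(iters):
        y = apply_op(x)
        lam = dot(x, y) / dot(x, x)       # Rayleigh quotient
        norm = dot(y, y) ** 0.5
        x = [yi / norm for yi in y]
    return lam


def estimate_condition(K, K0, iters=200):
    """kappa = ||K||_inf * lambda_max(K0^{-1}), following equations (11)-(12)."""
    lam_max = inf_norm(K)
    lam_min = 1.0 / lambda_max_power(lambda x: solve_dense(K0, x), len(K0), iters)
    return lam_max / lam_min


# Hypothetical example: K is 3 x 3, K0 is its leading 2 x 2 block
K = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
K0 = [[2.0, -1.0],
      [-1.0, 2.0]]
kappa_hat = estimate_condition(K, K0)
print(kappa_hat)
```

For this toy system the row-sum bound is 4 and the smallest eigenvalue of $K_0$ is 1, so the estimate is 4; the quality of such an estimate for realistic hierarchic systems is the subject of Figure 5.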

Figure 5. Variation of minimal eigenvalue versus polynomial order

The sparsity of the stiffness matrix K is quantified in terms of the average bandwidth. The average bandwidth of the system is defined as the ratio of the estimated total number of non-zeros in the Cholesky factor of K to the total number of equations. In case of an envelope method, such as the skyline solver, the total number of non-zeros in the Cholesky factor can be estimated as the size of the skyline storage. For a sparse solver the total number of non-zeros can be estimated from the graph model of the symmetric matrix.¹² The average bandwidth is calculated as:

$$\text{average bandwidth} = \frac{\mathrm{nonzero}(L)}{\mathrm{NDOFS}} \tag{13}$$

where L is the Cholesky factor of the stiffness matrix K, NDOFS is the total number of equations and nonzero(L) is the total number of non-zero terms in the matrix L. The computation of nonzero(L) for a sparse direct solver involves considerable symmetric graph manipulations, and hence is computationally expensive. Instead, an estimate of nonzero(L) can be expeditiously calculated. Consider $L_0$, the Cholesky factor of $K_0$; the estimate for nonzero(L) is then calculated as:

$$\mathrm{nonzero}(L) = \frac{\mathrm{nonzero}(L_0)}{\mathrm{nonzero}(K_0)} \times \mathrm{nonzero}(K) \tag{14}$$

This estimate for nonzero(L) has been found to be remarkably accurate, since for hierarchic systems the ratio nonzero(L)/nonzero(K) remains practically constant for all polynomial orders. A plot of nonzero(L)/nonzero(K) versus polynomial order for four problems (Car, Canoe, V-Block, and Flange), shown in Figure 6, corroborates this fact.
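Equations (13) and (14) amount to two lines of arithmetic, sketched below. The function names and the sample counts are hypothetical and serve only to show how the fill ratio of the lower-order system is carried over to the full system.

```python
def estimate_factor_nonzeros(nnz_L0, nnz_K0, nnz_K):
    """Equation (14): scale nonzero(K) by the fill ratio of the lower-order system."""
    return nnz_L0 / nnz_K0 * nnz_K


def average_bandwidth(nnz_L, ndofs):
    """Equation (13): non-zeros of the Cholesky factor per equation."""
    return nnz_L / ndofs


# Hypothetical counts: the coarse factor has 3 times the non-zeros of K0
nnz_L = estimate_factor_nonzeros(nnz_L0=3000, nnz_K0=1000, nnz_K=5000)
print(nnz_L)                          # 15000.0
print(average_bandwidth(nnz_L, 500))  # 30.0
```

Only the factorization of the lower-order matrix is needed, which is already available because it is used for the coarse level correction.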

The total storage requirements ($\mu_i$) for different multilevel methods are estimated as follows:

(1) Multilevel method with SGS smoothing: $\mu_1 \approx$ storage for CLC + sparse storage of K.

(2) Multilevel method with incomplete factorization smoothing (no fill-ins): $\mu_2 \approx$ storage for CLC + sparse storage of K + sparse storage for ICC.

(3) Multilevel method with block diagonal smoothing: $\mu_3 \approx$ storage for CLC + sparse storage of K + sparse storage for block diagonal K.

(4) Single-level method: $\mu_4 \approx$ storage for Cholesky factor of K + sparse storage of K.

The storage for CLC includes sparse storage of the Cholesky factor of size nonzero($L_0$) and an integer index array of a smaller size. The sparse storage of the stiffness matrix is given in terms of sorted Block Sparse Row (BSR) format, where the blocks correspond to the d.o.f.s associated with each mode. Hence the storage for K consists of storage for nonzero(K) and a node by node integer index array of size nonzero(K)/(d.o.f.s per node). Storage for ICC is given in terms of the standard Compressed Sparse Row (CSR) format and consists of sparse storage of the incomplete factor of size nonzero(K) and an integer pointer array of the same size. The estimated storage does not take into consideration the work arrays temporarily allocated during the program run. A plot of max. heap memory/estimated memory versus maximum heap memory of all methods for the four representative problems is illustrated in Figures 7(a) and 7(b). The maximum heap memory (max. heap memory) is the largest memory allocated during the program run. It is evident from the plot that the estimation of memory requirements is quite accurate, especially for large problems.
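The storage estimates above can be turned into a simple feasibility filter. The sketch below is an illustration under several assumptions that are not stated in the text: 8-byte reals and 4-byte integers, a caller-supplied size for the CLC index array (`clc_index_entries`), a caller-supplied count for the block diagonal storage (`nnz_block_diag`), and neglect of the index array of the single-level Cholesky factor. All names and sample numbers are hypothetical.

```python
REAL_BYTES = 8  # assumed size of a floating point entry
INT_BYTES = 4   # assumed size of an integer index


def storage_estimates(nnz_K, nnz_L, nnz_L0, dofs_per_node,
                      clc_index_entries, nnz_block_diag):
    """Return estimated bytes for the four method families listed above."""
    clc = nnz_L0 * REAL_BYTES + clc_index_entries * INT_BYTES
    k_bsr = nnz_K * REAL_BYTES + (nnz_K // dofs_per_node) * INT_BYTES
    icc = nnz_K * (REAL_BYTES + INT_BYTES)  # factor entries plus pointer array
    return {
        "two-level SGS": clc + k_bsr,
        "two-level ICC": clc + k_bsr + icc,
        "two-level block diagonal": clc + k_bsr + nnz_block_diag * REAL_BYTES,
        "single-level direct": nnz_L * REAL_BYTES + k_bsr,
    }


def feasible_methods(estimates, memory_limit_bytes):
    """Names of the methods whose estimate fits in the available memory."""
    return sorted(name for name, size in estimates.items()
                  if size <= memory_limit_bytes)


# Hypothetical problem data (illustrative numbers, not measured values)
estimates = storage_estimates(nnz_K=1_000_000, nnz_L=20_000_000,
                              nnz_L0=2_000_000, dofs_per_node=6,
                              clc_index_entries=100_000,
                              nnz_block_diag=300_000)
fits = feasible_methods(estimates, memory_limit_bytes=50 * 1024 * 1024)
print(estimates["two-level SGS"])  # 25066664
print(fits)
```

With these illustrative numbers the single-level direct method exceeds a 50 MB limit while the three two-level variants fit, which mirrors how the memory limit prunes the candidate set before conditioning and sparsity are considered.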

3.2. Optimal multilevel preconditioned method

To aid in selection of an optimal multilevel preconditioned method, the decision graphs are constructed by conducting numerical experiments for three factors: sparsity pattern, condition number and maximum available memory. We consider these factors simultaneously, enabling interaction between them. The decision graphs are scatter plots with superimposed discriminant functions. The discriminant functions classify the set of sample problems into subsets, each of which can be solved in optimal CPU times with a single solution method. The discriminant functions can be determined graphically, or by carrying out complete discriminant analysis.¹⁴

An important attribute of the multilevel preconditioned method is the number of hierarchic levels. For the p-version the number of levels can be chosen in the range of [1, P], where P is the maximum polynomial order of the hierarchic stiffness matrix. In our current investigation we consider only one and two level schemes.


Figure 6. Sparsity estimates versus polynomial order

First, we estimate what must be the size of the coarse level in a two-level scheme which would result in minimum total CPU time. A plot of normalized CPU time (cpu time/min. cpu time) versus normalized coarse level polynomial order (CLC polynomial order/highest polynomial order) is constructed for the four representative problems. It is evident from the plot (shown in Figure 8) that the selection of the coarse level polynomial order as one half the highest polynomial order is nearly optimal in terms of total CPU time. For the set of Legendre interpolation functions used, this results in a coarse level that has approximately one half the total number of equations for maximum polynomial order P in the range of $2 \le P \le 12$.


The selection of the acceleration scheme depends on the condition number of the system to be solved. A plot of condition number versus the ratio of the number of cycles with CG acceleration to the number of cycles with two parameter acceleration is shown in Figure 9. It is evident from the plot that the two parameter acceleration scheme results in fewer iterations for ill-conditioned problems ($\kappa > 10^7$). On the other hand, for well-conditioned problems both acceleration schemes result in the same number of iterations. However, the CG acceleration scheme involves fewer scalar product evaluations.
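The two rules of thumb reported above (coarse level at about half the highest polynomial order, and two parameter acceleration when $\kappa > 10^7$) can be captured in a small helper. This is only a sketch: the function names are hypothetical, and rounding the half order down is an assumption, since the text does not say how odd orders are treated.

```python
def coarse_polynomial_order(P):
    """Coarse level order of about P/2 (rounding down is an assumption)."""
    return max(1, P // 2)


def acceleration_scheme(kappa, threshold=1e7):
    """Two parameter acceleration for ill-conditioned systems, CG otherwise."""
    return "two-parameter" if kappa > threshold else "conjugate gradient"


print(coarse_polynomial_order(8))  # 4
print(acceleration_scheme(1e9))    # two-parameter
print(acceleration_scheme(1e3))    # conjugate gradient
```

The condition number passed to `acceleration_scheme` would be the estimate of equation (11) rather than the exact value.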

Figure 7(a). Maximum heap memory versus estimated memory for car and canoe problems


Figure 7(b). Maximum heap memory versus estimated memory for block and flange problems

Finally we develop decision graphs (Figures 10(a) and 10(b)) for selecting an optimal multilevel solution strategy for four maximum memory limits:

(1) 50 MB,
(2) 100 MB,
(3) 200 MB,
(4) 600 MB.


Figure 8. (a) Shell problems (car and canoe); (b) 3D problems (block, flange)

Figure 9. Number of cycles versus condition number for various acceleration schemes

The following 11 solution techniques have been considered:

(1) two-level scheme with block diagonal incomplete factorization [ICC(P)] smoothing,
(2) two-level method with block diagonal incomplete factorization [ICC(V)] smoothing,
(3) two-level method with block diagonal full factorization smoothing,
(4) two-level method with incomplete factorization by value [ICC(V)] smoothing up to polynomial order one and ICC(P) for higher orders (p > 1),
(5) two-level method with incomplete factorization by value [ICC(V)] smoothing up to polynomial order two and ICC(P) for higher orders (p > 2),
(6) two-level method with incomplete factorization by value [ICC(V)] smoothing up to polynomial order three and ICC(P) for higher orders (p > 3),
(7) two-level method with incomplete factorization by value [ICC(V)] smoothing for all polynomial orders,
(8) two-level method with ICC(P) smoothing for all polynomial orders,
(9) two-level scheme with Symmetric Gauss-Seidel (SGS) smoothing,
(10) one-level sparse direct solver,
(11) one-level skyline solver.

Figure 10(a). (top) 50 MB memory; (bottom) 100 MB memory

Figure 10(b). (top) 200 MB memory; (bottom) 600 MB memory

The optimal solution method in terms of total CPU time for each problem, characterized by average bandwidth and condition number, is plotted in Figures 10(a) and 10(b). Only five solution techniques amongst the eleven listed above were found to be optimal for at least one problem considered. Thus the space of the decision graph is divided into five regions corresponding to the five optimal solution strategies, with some minor overlap along the boundaries. The single-level method (sparse direct solver) is represented by a circle; the two-level method with the smoothing scheme using ICC(V) for p ≤ 2 and ICC(P) for p > 2 is represented by a square. The two-level preconditioned method with the incomplete factorization smoothing scheme using ICC(V) for p ≤ 1 and ICC(P) for p > 1 is denoted by a triangle. The two-level preconditioned method with block diagonal incomplete factorization (ICC(P)) smoothing is represented by a diamond, and finally the two-level scheme with SGS smoothing is represented by an inverted triangle. When a new problem is encountered the optimal solver is determined based on the estimates of the condition number, average bandwidth and memory considerations for the given problem. Decision graphs were implemented using isoparametric mapping in three space dimensions corresponding to memory, sparsity and conditioning.

Figure 11(a). Convergence of various solvers for car and canoe problems

Figure 11(b). Convergence of various solvers for flange and block problems

Finally, a plot of the problem size versus CPU time in seconds is illustrated in Figures 11(a) and 11(b) for the four representative problems solved using different two-level, one-level and the adaptive solver. The runs were made on a Sun Sparc 5, 110 MHz workstation with 600 MB memory.

4. CONCLUSIONS AND FUTURE RESEARCH

Research efforts were conducted to develop an adaptive multilevel solution strategy for solving hierarchic systems resulting from the p-version of finite element discretization. The decision graph methodology aimed at determining an optimal multilevel solution strategy is developed. Only in-core solution methods have been considered so far. Clearly an ultimate solution engine needs to include an efficient out-of-core solver, since for very large problems it is not usually possible to keep the stiffness matrix in RAM. Both iterative and direct out-of-core solvers are currently being investigated in the context of the p-method.

REFERENCES

1. R. B. Morris, Y. Tsuji and P. Carnevali, ‘Adaptive solution strategy for solving large systems of p-type finite element equations’, Int. j. numer. methods eng., 33, 2059-2071 (1992).

2. J. Mandel, ‘Adaptive iterative solvers in finite elements’, in M. Papadrakakis (ed.), Solving Large-scale Problems in Mechanics, Wiley, New York, 1993, pp. 65-88.

3. W. Hackbusch, Iterative Solution of Large Sparse Systems of Equations, Springer, New York, 1994.

4. A. Brandt, ‘Multi-level adaptive solutions to boundary-value problems’, Math. Comp., 31, 333-390 (1977).

5. J. Fish and R. Guttal, ‘The p-version of finite element method for shell analysis’, Comput. Mech. Int. J., 16, 328-340 (1995).

6. R. E. Bank, T. F. Dupont and H. Yserentant, ‘The hierarchical basis multigrid method’, Numer. Math., 52, 427-458 (1988).

7. O. Axelsson, ‘Analysis of incomplete matrix factorizations as multigrid smoothers for vector and parallel computers’, Appl. Math. Comp., 19, 3-22 (1986).

8. D. S. Kershaw, ‘The incomplete Choleski-conjugate gradient method for the iterative solution of systems of linear equations’, J. Comp. Phys., 26, 43-65 (1978).

9. M. A. Ajiz and A. Jennings, ‘A robust incomplete Choleski-conjugate gradient algorithm’, Int. j. numer. methods eng., 20, 949-966 (1984).

10. O. Axelsson and I. Gustafsson, ‘Preconditioning and two-level multigrid methods of arbitrary degree of approximation’, Math. Comp., 40, 219-242 (1983).

11. I. Babuska, A. Craig, J. Mandel and J. Pitkaranta, ‘Efficient preconditioning for the p-version finite element method in two dimensions’, SIAM J. Numer. Anal., 28, 624-661 (1991).

12. VSS - Sparse Direct Solver, NASA Langley, Hampton, VA, 1996.

13. W. F. Tinney, Comments on using sparsity techniques for power system problems. Sparse Matrix Proceedings, IBM Research Rept. RAI 3-12-69.

14. C. J. Huberty, Applied Discriminant Analysis. Wiley, New York, 1994.