

Math Geosci (2013) 45:799–817
DOI 10.1007/s11004-013-9459-0

An Uncertainty Quantification Framework for Studying the Effect of Spatial Heterogeneity in Reservoir Permeability on CO2 Sequestration

Zhangshuan Hou · Dave W. Engel · Guang Lin · Yilin Fang · Zhufeng Fang

Received: 26 January 2012 / Accepted: 11 April 2013 / Published online: 7 May 2013
© International Association for Mathematical Geosciences 2013

Abstract A new uncertainty quantification framework is adopted for carbon sequestration to evaluate the effect of spatial heterogeneity of reservoir permeability on CO2 migration. Sequential Gaussian simulation is used to generate multiple realizations of permeability fields with various spatial statistical attributes. In order to deal with the computational difficulties, the following ideas/approaches are integrated. First, different efficient sampling approaches (probabilistic collocation, quasi-Monte Carlo, and adaptive sampling) are used to reduce the number of forward calculations, explore effectively the parameter space, and quantify the input uncertainty. Second, a scalable numerical simulator, extreme-scale Subsurface Transport Over Multiple Phases, is adopted as the forward modeling simulator for CO2 migration. The framework has the capability to quantify input uncertainty, generate exploratory samples effectively, perform scalable numerical simulations, visualize output uncertainty, and evaluate input-output relationships. The framework is demonstrated with a given CO2 injection scenario in heterogeneous sandstone reservoirs. Results show that geostatistical parameters for permeability have different impacts on CO2 plume radius: the mean parameter has positive effects at the top layers, but affects the bottom layers negatively. The variance generally has a positive effect on the plume radius at all layers, particularly at middle layers, where the transport of CO2 is highly influenced by the subsurface heterogeneity structure. The anisotropy ratio has weak impacts on the plume radius, but affects the shape of the CO2 plume.

Z. Hou (✉) · Y. Fang · Z. Fang
Earth Systems Science Division, Pacific Northwest National Laboratory, Post Office Box 999, Richland, WA 99352, USA
e-mail: [email protected]

D.W. Engel · G. Lin
Computational Science & Mathematics Division, Pacific Northwest National Laboratory, Post Office Box 999, Richland, WA 99352, USA


Keywords Uncertainty quantification · Efficient sampling · Reservoir heterogeneity · Carbon sequestration

1 Introduction

Numerical models are essential tools in fully understanding the fate of injected CO2 for commercial-scale sequestration projects and should be included in the life cycle of a project. Common practice involves modeling the behavior of CO2 during and after injection using site-specific reservoir and caprock properties. Migration and storage of CO2 involve complex physical and chemical processes. Among the factors that affect sequestration performance is the spatial heterogeneity of injection reservoir properties such as permeability. The effects of heterogeneity in carbon sequestration in saline aquifers and other formations have been investigated extensively and are known to play important roles in the migration and storage capacity of the aquifer. Uncertainties in geological carbon sequestration stem from site characterization, site capacity, injection rate, caprock integrity, CO2 trapping mechanisms, and mineral precipitation and dissolution kinetics. Evaluation of the impact of these uncertainties on injectivity, storage, and leakage risks relies on model formulation and parameterization. Multiphase flow and reactive transport models have been used extensively for studying migration of CO2 in aquifers (Barnes et al. 2009; Izgec et al. 2008; Knauss et al. 2005; Nordbotten et al. 2005; Oldenburg and Unger 2003; Pruess 2008; Rutqvist et al. 2007). There is also uncertainty related to reaction kinetics. A sensitivity analysis for trapping through dawsonite precipitation showed that slower dawsonite kinetics resulted in increased formation of the other trapping minerals such as calcite and magnesite (Knauss et al. 2005). In addition, Izgec et al. (2008) showed that changes in formation permeability are very sensitive to kinetic mineral reaction parameters. The input uncertainty can be addressed through uncertainty analysis, which indicates which parameters have the most impact on carbon sequestration and provides guidance for the design, injection, regulation, and future data collection of proposed sequestration sites. For example, some studies (Eccles et al. 2009; Oldenburg and Unger 2003) indicate that subsurface permeability and heterogeneity may have a large effect on predicting surface CO2 leakage, implying that careful characterization and selection of a sequestration site are important. Such studies usually solve a set of nonlinear partial differential equations to predict CO2 movement as a function of time and space. Significant uncertainty may arise due to data availability and quality, model oversimplifications, and inaccuracy. Among the simplifications is the assumption of a homogeneous reservoir, or the use of a simplified model to describe the heterogeneity/structure of the field.

There are different ways to describe the spatial heterogeneity and incorporate it into the simulations, using either numerical or analytical methods. Previous efforts have simplified the spatial structure to consist of zones and/or layers with site information from well logs; one example is a spatial variogram model with variogram analyses assuming spatial stationarity (Deutsch 2002; Deutsch and Journel 1998; Kitanidis 1997). Previous work also has examined the impacts of reservoir and caprock properties on the migration, injectivity, and leakage of CO2 (Hou et al. 2012b). The results show that reservoir permeability is the dominant factor controlling CO2 plume radius and reservoir injectivity. Therefore, the study described here focuses on evaluating the heterogeneity of permeability and quantifying the associated uncertainty in CO2 plume predictions using an uncertainty quantification (UQ) framework. The most challenging issue is the computational time and demand related to the numerous simulations required to perform stable sensitivity analyses and/or parameter calibration. Adaptive sampling is a sampling technique that reduces the computation time by reducing the number of simulations required, and has been used before in similar applications (Dawson and Hall 2006; Givens and Raftery 1996). However, most techniques do not utilize a global search algorithm or try to minimize the response variance, which are at the center of the adaptive response modeling process utilized in our analysis. Using this process, one can identify strategic locations of new samples for simulation in the overall parameter space, and produce comparable results with fewer samples than other Monte Carlo sampling methods. Even with efficient and adaptive sampling methods that reduce the number of numerical simulations needed to reach reliable conclusions, the overall computational demand is still large, particularly for three-dimensional multiphase flow and transport modeling for CO2 sequestration problems. Therefore, high performance computing (HPC) should be integrated into the numerical simulators to facilitate the exploratory UQ study. Our overall goal is to provide a novel UQ framework that offers a reliable and efficient means of quantitatively predicting uncertainty and assessing the risk of CO2 geological sequestration, focusing on the well-known heterogeneity impacts on multiphase flow and transport processes. The sampling techniques and the overall framework for the simulations and UQ are described in Sect. 2. The model setup and numerical simulations for a case study are described in Sect. 3. The results of the case study and a discussion of the results are presented in Sects. 4 and 5. The last section contains our conclusions from the framework development and the simulation results.

2 Methodology: The Uncertainty Quantification Framework

2.1 Parameterization and Quasi-Monte Carlo Sampling

Assuming that the input parameters can be represented by probabilistic distribution functions, one can generate sample values from these distributions and use the parameter values for numerical evaluation. The distribution functions can be obtained from literature or derived based on entropy (Woodbury and Ulrych 1993; Hou and Rubin 2005; Hou et al. 2006) given some prior knowledge. The success of a numerical study for sensitivity analysis and/or inversion usually relies on evaluating all possibilities defined by the model space using efficient sampling approaches. Given a large number of dimensions, systematic sampling techniques such as Simpson's rule are not sufficient (Tarantola 2005). Therefore, a quasi-Monte Carlo (QMC) technique, which incorporates deterministic sequences, is adopted to guarantee good dispersion between sample points (Caflisch 1998), as shown in Fig. 1.

Fig. 1 Comparison of pseudorandom (regular Monte Carlo) sampling (left) versus quasi-random sampling (right). A and B are two fictitious variables from a multivariate uniform distribution. When A is plotted against B using pseudorandom samples, clumping of samples occurs. Quasi-random sampling alleviates this problem, and the samples are well dispersed

The QMC approach has been used successfully for exploring parameter space as well as for stochastic inversion (Dick and Pillichshammer 2010; Hou et al. 2006, 2012a, 2012b). QMC requires a choice of low-discrepancy sequence as input. It is widely acknowledged that Sobol sequences (Sobol 1967) perform well for problems of greater than six dimensions and avoid degradation effects observed in many other low-discrepancy sequences (Atanassov et al. 2010; Sobol and Shukhman 2007; Wang 2009). In practice, the number of QMC samples needs to be determined for a sampling-involved problem, such as numerical integration, exploratory sensitivity analyses, and parameter estimation/inversion. The number normally is a power of 2 and is usually chosen as a tradeoff between computational time and the numerical error. Because CO2 migration simulations are computationally demanding, it was not practical to run thousands of simulations in this study. There was also a concern about the reliability of the developed relationships between the output responses (e.g., plume radius, injectivity) and independent variables (e.g., the hyperparameters). Tests of stability of response surfaces were performed, and it was found that for the chosen study sites, 128 QMC samples are adequate to give reliable response surfaces (i.e., CO2 plume radius vs. the hyperparameters), and the significance levels of the parameters are relatively stable.
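As a rough illustration of how a low-discrepancy design of 128 points might be generated and mapped onto hyperparameter ranges, the sketch below uses a Halton sequence written with the standard library only. This is a deliberately simple stand-in: the study itself uses Sobol sequences, for which a library generator would normally be preferred. The function names and the bounds in `LOWER` and `UPPER` are illustrative assumptions, not values taken from the paper.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def halton(n_samples, bases=(2, 3, 5)):
    """First n_samples points of the Halton sequence in the unit hypercube."""
    return [[radical_inverse(i, b) for b in bases] for i in range(1, n_samples + 1)]


def scale(points, lower, upper):
    """Map unit-cube points onto the box [lower, upper] (uniform marginals)."""
    return [[lo + u * (hi - lo) for u, lo, hi in zip(p, lower, upper)] for p in points]


# Hypothetical ranges for the mean and standard deviation of Y = log10(k [mD])
# and for the anisotropy ratio Ih/Iv. These numbers are placeholders only.
LOWER = [2.0, 0.0, 1.0]
UPPER = [3.0, 0.3, 50.0]

samples = scale(halton(128), LOWER, UPPER)  # 128 = 2**7 design points
```

Each row of `samples` is one candidate combination of hyperparameters; plotting any two columns against each other should show the even coverage that Fig. 1 contrasts with pseudorandom sampling.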

2.2 Sparse-Grid-Based Probabilistic Collocation Sampling Method

Another sampling approach, the sparse-grid-based probabilistic collocation (PC) method, is adopted to explore the extreme realizations and for comparison studies. The general procedure for the PC approach is similar to that for MC simulations except for a difference in selecting the sampling points and corresponding weights. The procedure consists of three main steps:

1. Generate Nc collocation points in probability space of random parameters as independent random inputs based on a quadrature formula.

2. Solve a deterministic problem at each collocation point.

3. Estimate the solution statistics using the corresponding quadrature rule

$$
\begin{aligned}
\langle u(x,t)\rangle &= \int_{\Gamma} u(x,t,\xi)\,\rho(\xi)\,d\xi \approx \sum_{k=1}^{N_c} v(x,t,\xi_k)\,w_k, \\
\sigma(u)(x,t) &= \sqrt{\int_{\Gamma} \bigl(u(x,t,\xi) - \langle u\rangle\bigr)^2 \rho(\xi)\,d\xi} \approx \sqrt{\sum_{k=1}^{N_c} v^2(x,t,\xi_k)\,w_k - \langle v\rangle^2},
\end{aligned}
\tag{1}
$$

where ρ(ξ) is the probabilistic distribution function of random variable ξ, Nc is the number of quadrature points, {ξk} is the set of quadrature points, and {wk} is the corresponding set of weights, which are the combination of quadrature weights in each random dimension. In the second step of the PC approach, as for MC, any existing code can be used to solve deterministic flow and transport equations. Extensive reviews on the construction of quadrature formulas may be found in Cools (1999) and Cools and Rabinowitz (1993). In this work, the Smolyak formula (Smolyak 1963) is used to construct the collocation point set, which is a linear combination of tensor product formulas, and the resulting point set has a significantly smaller number of points than the full tensor product set. Recently, researchers (Xiu and Hesthaven 2005; Lin and Tartakovsky 2009) have used Lagrange polynomial interpolation to construct high-order stochastic collocation methods based on sparse grids using the Smolyak formula. Such sparse grids do not depend as strongly on the dimensionality of the random space and as such are more suitable for applications with high-dimensional random inputs.
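The three steps and the weighted sums in Eq. (1) can be sketched in a few lines. The example below is only a minimal illustration under stated assumptions: it uses a full tensor-product Gauss-Legendre rule for two independent uniform variables on [-1, 1] rather than the Smolyak sparse grid used in this work, and `toy_model` is a hypothetical stand-in for a flow and transport simulation.

```python
import itertools
import math

# Three-point Gauss-Legendre rule on [-1, 1]. Dividing the usual weights by 2
# absorbs the uniform density rho(xi) = 1/2, so the 1-D weights sum to 1.
NODES_1D = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
WEIGHTS_1D = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]


def tensor_rule(dim):
    """Step 1: full tensor-product collocation points and weights."""
    points = list(itertools.product(NODES_1D, repeat=dim))
    # Each weight is the product of the 1-D weights in every random dimension.
    weights = [math.prod(w) for w in itertools.product(WEIGHTS_1D, repeat=dim)]
    return points, weights


def collocation_statistics(model, points, weights):
    """Steps 2 and 3: evaluate the model at each point, then apply Eq. (1)."""
    values = [model(xi) for xi in points]
    mean = sum(v * w for v, w in zip(values, weights))
    second_moment = sum(v * v * w for v, w in zip(values, weights))
    return mean, math.sqrt(max(second_moment - mean ** 2, 0.0))


def toy_model(xi):
    """Hypothetical deterministic solver output for one parameter vector."""
    return xi[0] ** 2 + xi[1]


points, weights = tensor_rule(dim=2)
mean, std = collocation_statistics(toy_model, points, weights)
```

For this polynomial toy model the nine-point rule reproduces the exact mean 1/3 and standard deviation sqrt(19/45). A sparse grid would aim for comparable accuracy with fewer points as the number of random dimensions grows.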

2.3 Adaptive Sampling: Bootstrap and Response Surface-Based Sampling

Adaptive sampling is a way to identify new sample points based on previous results. The adaptive response modeling (ARM) method utilizes the response surface fit to the model results to identify new points along the surface. One method for identifying new points is to develop uncertainty estimates along the response surface and identify those points or regions where the uncertainty is high (Engel et al. 2004). First, a relatively small initial sample of design points is selected using a space-filling design or a similar design based on a joint probability distribution of the design variables. Then the numerical simulations are conducted to obtain the model response at each design point in the sample. Next, a response approximation algorithm is applied to the existing responses to develop a surrogate model, together with a measure of goodness of fit to the true response. If the fit is adequate, or a budget for computing resources is exhausted, the process ends, and the final surrogate model is implemented together with associated predictive uncertainty estimates. Otherwise, another algorithm is used to identify a set of most informative new design points based on a number of possible criteria. New model evaluations at the additional design points then augment the previous set of responses, and the approximation algorithm is applied to the augmented set of responses to refine the surrogate model.
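The loop described above (fit a surrogate, check adequacy or budget, add the most informative point, refit) might be sketched as follows. Everything here is an illustrative assumption rather than the ARM implementation: the design space is one-dimensional, the surrogate is a piecewise-linear interpolant, the informativeness score is a simple product of local response change and distance to the nearest design point, and `toy_simulator` stands in for the forward model.

```python
def fit_linear_interpolant(design, responses):
    """Surrogate: piecewise-linear interpolation through the sampled points."""
    pairs = sorted(zip(design, responses))

    def predict(x):
        for (x_lo, y_lo), (x_hi, y_hi) in zip(pairs, pairs[1:]):
            if x_lo <= x <= x_hi:
                return y_lo + (y_hi - y_lo) * (x - x_lo) / (x_hi - x_lo)
        raise ValueError("x lies outside the sampled range")

    return predict


def interval_score(design, responses, c):
    """Hypothetical criterion: response jump times distance to nearest sample."""
    pairs = sorted(zip(design, responses))
    for (x_lo, y_lo), (x_hi, y_hi) in zip(pairs, pairs[1:]):
        if x_lo <= c <= x_hi:
            return abs(y_hi - y_lo) * min(c - x_lo, x_hi - c)
    return 0.0


def adaptive_sampling(simulate, initial, candidates, tolerance, budget):
    """Augment the design until the score is small or the budget is spent."""
    design = list(initial)
    responses = [simulate(x) for x in design]
    pool = [c for c in candidates if c not in design]
    while True:
        surrogate = fit_linear_interpolant(design, responses)
        scores = [interval_score(design, responses, c) for c in pool]
        if not pool or max(scores) <= tolerance or len(design) >= budget:
            return surrogate, design, responses
        best = pool.pop(scores.index(max(scores)))  # most informative candidate
        design.append(best)
        responses.append(simulate(best))


def toy_simulator(x):
    """Hypothetical stand-in for an expensive forward simulation."""
    return x ** 3


surrogate, design, responses = adaptive_sampling(
    toy_simulator, [0.0, 0.5, 1.0], [i / 20 for i in range(21)],
    tolerance=0.01, budget=8)
```

In this toy run the new points cluster where the response changes fastest, which is the behavior the adaptive procedure is designed to produce.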

2.4 Surrogate Model Formulation and Estimation

The relationships between response variables and input parameters are evaluated by looking at the response surfaces and the corresponding reduced-order models or surrogates. In this study, the multivariate adaptive regression splines (MARS) method (Friedman 1991) is used. The MARS method falls into the category of machine learning techniques and is essentially a nonparametric piecewise polynomial regression model that was designed to perform well for high-dimensional data. The MARS predictor consists of a sum of weighted spline basis functions φi(x) = φ(x, ti) with knots ti and takes the form

$$
y(x) = \beta_0 + \sum_{i=1}^{n} \beta_i\,\phi_i(x). \tag{2}
$$

In Eq. (2), n denotes the total number of basis functions. The basis function φi is defined as

$$
\phi_i(x) = \prod_{k=1}^{K_i} \bigl[S_{k,i}\,\bigl(x_{\nu(k,i)} - t_{k,i}\bigr)\bigr]_{+}^{q}, \tag{3}
$$

where Ki is the number of factors (order of interactions) in the ith basis function, Sk,i is the basis function sign (+/−), xν(k,i) is the νth predictor variable, and tk,i is the location of the kth knot for the predictor variable xν(k,i); that is, the tk,i partition the range of that predictor variable. The basis function is a truncated power function; in the case q = 1, the MARS approximation is a piecewise linear approximation to the response surface of the computer model. At each step, the MARS algorithm searches for an optimal partition of the domain space and then solves for the basis function coefficients and the knot locations that produce the best fit to the observed responses for the selected partition of input space. The MARS algorithm adds basis functions until the observed responses are overfitted, then systematically prunes away those basis functions that contribute the least to a measure of goodness-of-fit to arrive at a final model. MARS models can be computed rapidly, exhibit reasonable fidelity to the training response surface and, therefore, adapt well to a specific response surface, have the necessary flexibility to model low-order interactions among the input variables, and allow for optional smoothing in the final model.
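To make Eqs. (2) and (3) concrete, the sketch below evaluates a MARS-type predictor from a given set of coefficients, knots, and signs. It covers evaluation only and does not implement the forward selection and pruning passes of the fitting algorithm. The model in `TERMS` is invented for illustration.

```python
def hinge(x, knot, sign, q=1):
    """Truncated power function [sign * (x - knot)]_+ ** q."""
    return max(sign * (x - knot), 0.0) ** q


def basis_function(x, factors, q=1):
    """Eq. (3): product of hinge factors, each given as (variable, knot, sign)."""
    value = 1.0
    for var, knot, sign in factors:
        value *= hinge(x[var], knot, sign, q)
    return value


def mars_predict(x, intercept, terms, q=1):
    """Eq. (2): intercept plus weighted sum of basis functions."""
    return intercept + sum(beta * basis_function(x, factors, q)
                           for beta, factors in terms)


# Hypothetical fitted model in two predictors; all numbers are made up.
INTERCEPT = 1.0
TERMS = [
    (2.0, [(0, 0.5, +1)]),                 # 2 * [x0 - 0.5]_+
    (-1.0, [(0, 0.5, -1)]),                # -1 * [0.5 - x0]_+
    (4.0, [(0, 0.5, +1), (1, 0.25, +1)]),  # interaction term with K_i = 2
]

prediction = mars_predict([1.0, 0.75], INTERCEPT, TERMS)
```

With q = 1 each term is piecewise linear, so the predictor changes slope at x0 = 0.5 and the interaction term is active only where both hinge factors are positive.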

2.5 Prediction Uncertainty Analysis

The following approach is used to create a proxy for prediction uncertainty at any given design point. In addition to the MARS predictor y, a predictor y(j) that excludes the jth observed response is computed for each of the n observed responses. The prediction interval proxy is defined to be the sample variance s2(x) of these leave-one-out predictors at any design point x

$$
s^2(x) = \frac{1}{n-1} \sum_{i=1}^{n} \left[ y^{(i)}(x) - \frac{1}{n} \sum_{j=1}^{n} y^{(j)}(x) \right]^2. \tag{4}
$$

This measure has the same qualitative structure around each linear subsection of the MARS predictor as that of a standard linear regression prediction interval. For the analysis presented in this paper, a large number of parameter sets were simulated up front (using the QMC and PC sampling techniques). The ARM process was adapted to utilize the samples previously generated. This was accomplished by using the sample point closest to each identified adaptive sample location, without replacement (i.e., once a sample point was selected by the ARM, it was removed from the pool for subsequent selections). This approach is illustrated in Fig. 2, in which the ARM method placed all of the new augmented points in the region of high mean permeability and standard deviation. The reason for this is that the response (plume radius) in this region is the largest and produces the largest uncertainties. The goal of using the ARM process is to produce response surfaces by using a reduced number of model evaluations compared to a quasi-random approach while producing similar results. To measure how well the process works, the root mean squared error (RMSE) metric is used to compare the response surface from the reduced simulation results to the true surface. Because the true surface is not known, it was approximated by fitting a response surface to the responses from all of the simulated parameter sets.

Fig. 2 Sample locations. The left plot shows all 197 samples while the right plot shows the first 16 QMC samples (red points), four new ARM sample locations extracted from the first 16 QMC samples (blue points), and the closest samples (from the remaining 181 samples, left plot) to the four new ARM locations (green points)
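The leave-one-out proxy of Eq. (4), the nearest-sample matching without replacement, and the RMSE comparison can be sketched as below. This is a minimal illustration under assumptions: a least-squares straight line replaces the MARS predictor so that the example stays self-contained, and all function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line; a simple stand-in for the MARS predictor."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def loo_variance(xs, ys, x, fit=fit_line):
    """Eq. (4): sample variance of the n leave-one-out predictions at x."""
    n = len(xs)
    preds = [fit(xs[:j] + xs[j + 1:], ys[:j] + ys[j + 1:])(x) for j in range(n)]
    mean = sum(preds) / n
    return sum((p - mean) ** 2 for p in preds) / (n - 1)


def nearest_without_replacement(targets, pool):
    """Match each target location to its closest remaining pre-computed sample."""
    remaining, chosen = list(pool), []
    for t in targets:
        best = min(remaining, key=lambda p: math.dist(p, t))
        remaining.remove(best)  # a selected sample cannot be picked again
        chosen.append(best)
    return chosen


def rmse(predicted, reference):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference))
                     / len(reference))


noisy = loo_variance([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 8.0], 4.0)
matched = nearest_without_replacement(
    [(0.0, 0.0), (0.1, 0.0)], [(0.0, 0.1), (1.0, 1.0), (0.2, 0.0)])
```

When the responses lie exactly on a line the leave-one-out predictions coincide and the proxy vanishes; scatter in the responses makes it grow, especially when extrapolating away from the samples.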

2.6 Uncertainty Quantification Framework

In order to take advantage of advanced sampling techniques, UQ capabilities, and high performance computing, a UQ framework has been developed, and is summarized in Fig. 3. This framework allows the user to select the sampling and UQ techniques and generate the necessary input files for analytical or numerical models. These input files are then ported to high performance computing (HPC) machines, where each realization is run and results are produced. The results are then analyzed and visualized through exploratory data analyses. The results produced within this framework are presented in the following sections.

Fig. 3 The overall UQ pipeline

3 CO2 Sequestration Case Study: Model Setup and Numerical Simulation

The simulation consists of a three-dimensional domain partitioned into 60 × 60 × 25 grid cells with a uniform grid size of 80 m in the horizontal directions and 4 m in the vertical direction. The bottom elevation of the formation is 1500 m below ground surface. The injection is treated as a line source located at the center of the domain, representing the injection of gas mass into 25 of the vertical cells at a rate of 0.04 MMT/yr for each cell. The simulation was executed for 10 years of injection. Initial pressure of the simulation was calculated assuming the water table is 10 m below ground surface. Temperature is 110.13 °F (43.4 °C) at the bottom of the domain and decreases with a gradient of 0.01 °F/ft (0.018 °C/m) to the top of the domain. The initial dissolved salt aqueous mass fraction was set to 0.21. An initial condition was used for temperature and salt mass fraction; a zero flux for gas and water phase was assumed for all the boundaries surrounding the domain. A uniform porosity of 0.175 and compressibility of 9.7 × 10⁻¹⁰/Pa were selected for the simulation. The permeability is assumed to be isotropic, and its distribution is described by a spatial variogram model. For describing the spatial heterogeneity in permeability, the hyperparameters include the mean, variance, and spatial integral scale of logarithmic permeability Y = log10(k), where k is in millidarcies (mD). Various combinations of model parameters are sampled using quasi-Monte Carlo sampling with 128 realizations and probabilistic collocation sampling with 69 realizations. The saturation-capillary pressure relations are described by the Brooks and Corey (1966) expression with the Fayer and Simmons (1995) extension (Eq. (5)). With this function, the actual aqueous saturations are computed in terms of effective aqueous saturations according to Eq. (6), in which the effective minimum aqueous saturation is computed as a function of aqueous–gas capillary pressure, as shown in Eq. (7). The aqueous and gas relative permeability functions use the Burdine (1953) pore-size distribution model (Eqs. (8) and (9)). The equations are

$$
\bar{s}_l = \left( \frac{(P_g - P_l)/(\rho_l g)}{\psi} \right)^{-\lambda} \quad \text{for } (P_g - P_l) \ge \psi, \qquad
\bar{s}_l = 1 \quad \text{for } (P_g - P_l) < \psi, \tag{5}
$$

$$
s_l = \bar{s}_l\,(1 - \bar{s}_m) + \bar{s}_m, \tag{6}
$$

$$
\bar{s}_m = \left( 1 - \frac{\ln\!\left( \dfrac{P_g - P_l}{\rho_l g} \right)}{\ln(h_{od})} \right) s_m, \tag{7}
$$

$$
k_{rl} = (\bar{s}_l)^{3 + 2/\lambda}, \tag{8}
$$

$$
k_{rg} = (\bar{s}_g)^2 \left[ 1 - (\bar{s}_l)^{1 + 2/\lambda} \right], \tag{9}
$$

where $P_g$ is gas pressure, $P_l$ is aqueous pressure, $\bar{s}_l$ is the effective aqueous saturation, $s_l$ is the actual aqueous saturation, $\rho_l$ is the reference aqueous density, $g$ is the gravitational acceleration, $\psi$ is the Brooks and Corey air-entry head (m), $\bar{s}_m$ is the effective residual aqueous saturation (computed from the residual saturation parameter $s_m$ in Eq. (7)), $\lambda$ is the Brooks and Corey fitting parameter, $\bar{s}_g$ is the effective gas saturation, $h_{od}$ is the oven-dried gas–aqueous capillary head (m), $k_{rl}$ is the aqueous relative permeability, and $k_{rg}$ is the gas relative permeability.
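A small numerical sketch of Eqs. (5) to (9) is given below, intended only to show how the pieces chain together. Several choices are assumptions rather than statements from the paper: the threshold test in Eq. (5) is applied to the capillary head (Pg − Pl)/(ρl g) so that it is compared with ψ in the same units; the aqueous relative permeability exponent is written as 3 + 2/λ, the standard Burdine form for the Brooks and Corey model; the effective gas saturation is taken as one minus the effective aqueous saturation; and every parameter value is invented.

```python
import math


def capillary_head(p_gas, p_liquid, rho_l=1000.0, g=9.81):
    """Capillary head (m) from gas and aqueous pressures (Pa)."""
    return (p_gas - p_liquid) / (rho_l * g)


def effective_saturation(cap_head, entry_head, lam):
    """Eq. (5): Brooks-Corey effective aqueous saturation."""
    if cap_head < entry_head:
        return 1.0
    return (cap_head / entry_head) ** (-lam)


def effective_minimum_saturation(cap_head, oven_dry_head, s_m):
    """Eq. (7): head-dependent effective minimum aqueous saturation."""
    return (1.0 - math.log(cap_head) / math.log(oven_dry_head)) * s_m


def actual_saturation(s_eff, s_min_eff):
    """Eq. (6): actual aqueous saturation from effective quantities."""
    return s_eff * (1.0 - s_min_eff) + s_min_eff


def relative_permeabilities(s_eff_l, s_eff_g, lam):
    """Eqs. (8) and (9): Burdine aqueous and gas relative permeabilities."""
    k_rl = s_eff_l ** (3.0 + 2.0 / lam)
    k_rg = s_eff_g ** 2 * (1.0 - s_eff_l ** (1.0 + 2.0 / lam))
    return k_rl, k_rg


# Hypothetical inputs: heads in metres, dimensionless lambda and s_m.
HEAD, PSI, LAMBDA, S_M, H_OD = 2.0, 0.5, 2.0, 0.1, 1.0e5

s_bar_l = effective_saturation(HEAD, PSI, LAMBDA)
s_bar_m = effective_minimum_saturation(HEAD, H_OD, S_M)
s_l = actual_saturation(s_bar_l, s_bar_m)
k_rl, k_rg = relative_permeabilities(s_bar_l, 1.0 - s_bar_l, LAMBDA)
```

In this formulation the effective minimum saturation equals the parameter s_m at a capillary head of 1 m and falls to zero at the oven-dry head, which is what lets the retention curve extend to very dry conditions.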

We adopted a simulator, STOMP (subsurface transport over multiple phases) (White and Oostrom 2006), to simulate the CO2 migration processes. The modeling domain is discretized with structured orthogonal grids for spatial discretization and a fully implicit formulation for temporal discretization. The conservation equations are solved using a set of primary variables. Secondary variables are solved from the primary variables through the constitutive equations, which generally are nonlinear functions. Nonlinearities in the discretized governing equations and associated constitutive equations are solved using Newton–Raphson iteration. The simulations in this study were conducted with a parallel version of STOMP, eSTOMP-CO2e, which solves flow and transport problems for nonisothermal CO2 sequestration systems in aqueous saline formations. One-sided communication and a global shared memory programming paradigm from the Global Array Toolkit library (Nieplocha et al. 2006) are used for scalability, performance, and extensibility on massively parallel processing computers. The approach is compatible with the more commonly used Message Passing Interface (Message Passing Interface Forum 2009) used by the PETSc (Balay et al. 2011) global implicit solver. With the scalable simulator, the simulations required approximately 48 hours of wall clock time for all 197 realizations using 6,304 64-bit AMD 2.2-GHz Opteron processor cores on a computing cluster, while an equivalent desktop simulation for one realization would have taken more than 2 weeks. The parallel capability is indispensable for UQ in the context of our understanding of the effect of reservoir spatial heterogeneity on CO2 sequestration.

4 Results

The CO2 migration process is simulated for each combination of the parameters mY, σY, and IY,H (the mean, standard deviation, and spatial anisotropy ratio Ih/Iv). Well log data from a few geological formations (e.g., Mt. Simon) are used to determine the ranges of these hyperparameters, as shown in Fig. 4. In this study, the hyperparameters are assumed to follow uniform distributions within these ranges. Note that when more information is available, the distribution functions may take the forms of truncated Gaussian/exponential pdfs (Woodbury and Ulrych 1993; Hou and Rubin 2005). The lower bound for mY corresponds to 100 mD, a threshold for a formation to qualify as a potential target for injection. An upper bound of 0.3 is assigned for σY to represent a relatively high heterogeneity for typical sandstone reservoirs. The range of anisotropy ratios covers a significant proportion of the anisotropy ratios postulated for geologically relevant depositional environments (e.g., Deutsch 2002). The sample sets are shown in Fig. 4. For each sample set, geostatistical algorithms are used to generate multiple realizations of spatially heterogeneous fields of reservoir permeability. Each realization is then used to simulate CO2 migration, and snapshots of gas saturation distributions are analyzed. The plume front is defined to be located at the numerical grid cells where the gas saturation is 0.1, with a tolerance of 0.05, and the plume radius is calculated at different depths by averaging the distance of the corresponding grid cells to the injection well.

Fig. 4 Quasi-Monte Carlo samples (top) for exploring the parameter space, and probabilistic collocation samples (bottom) for marginal parameter effect analyses (e.g., looking at the effect of one parameter by fixing other parameters). The ranges of the three parameters are for Y = log10(permeability, mD). Mean(Y) has the same unit, stdev(Y) and anisotropy(Y) are dimensionless

Fig. 5 CO2 plume radius as a function of mean(Y), std(Y), and anisotropy ratio (Ih/Iv)
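The plume-radius definition above (cells with gas saturation within 0.05 of 0.1, averaged distance to the well) can be written as a short function for a single horizontal layer. The sketch assumes the layer is stored as a list of rows, the well sits at a cell index, and cells are square with side `dx`; the 5 × 5 test layer is synthetic.

```python
import math


def plume_radius(saturation, well, dx, target=0.1, tol=0.05):
    """Mean horizontal distance from the well to cells on the plume front.

    saturation: 2-D list of gas saturations for one layer.
    well: (row, column) index of the injection well.
    dx: horizontal cell size in metres.
    Returns 0.0 when no cell lies within the saturation tolerance.
    """
    distances = []
    for i, row in enumerate(saturation):
        for j, s in enumerate(row):
            if abs(s - target) <= tol:
                distances.append(math.hypot((i - well[0]) * dx,
                                            (j - well[1]) * dx))
    return sum(distances) / len(distances) if distances else 0.0


# Synthetic layer: high saturation at the well, front cells one cell away.
layer = [[0.0] * 5 for _ in range(5)]
layer[2][2] = 0.5
for i, j in [(1, 2), (3, 2), (2, 1), (2, 3)]:
    layer[i][j] = 0.1

radius = plume_radius(layer, well=(2, 2), dx=80.0)
```

Applying the function layer by layer gives one radius per depth, which is the quantity plotted against the hyperparameters in the response curves.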

Figure 5 shows the response curves of the average plume radius at 10 years with respect to the three parameters describing spatial heterogeneity in reservoir permeability. As shown, the parameter mY has a positive effect for the top three layers but a negative effect for the bottom layers. For the top layers, it is not surprising that the higher the average permeability, the farther the plume front can advance. In contrast, for lower layers, higher permeability results in a smaller plume radius because it becomes easier for the plume to move upward instead of spreading horizontally. This behavior is seen in the 21st layer and is similar for layers below. The parameter σY generally has a positive effect on the plume radius at all layers. The effect is more obvious at middle layers, where the transport of CO2 is highly influenced by the subsurface heterogeneity structure. At the very bottom layers, the variability does not matter because the plume will not advance far horizontally before moving upward due to the buoyancy force. In addition, the level of variability is roughly proportional to the ranges in the plume radius predictions. The anisotropy ratio has almost no impact on the plume radius in the summary analyses, but later analyses show that the anisotropy corresponds well with the shape of the CO2 plume. It is intuitive that larger input uncertainty results in larger output predictive intervals/ranges. For example, in this study, cases with large differences in mean and/or variance in permeability led to large differences in CO2 plume shapes and radius. As shown in the response curves in Fig. 5, our framework enables us to evaluate the relationships between the outputs (response variables) and the input parameters, as well as the output uncertainty as shown in the boxplots. When the parameter dimensionality is high, it might be necessary to generate more samples to avoid under-sampling issues, which can be seen in the response curves as inconsistent patterns of monotonicity and/or nonlinearity.

Figure 6 shows three pairs of comparisons of gas saturation plume expansion from 45 years to 50 years. To illustrate the effect of the variance, each group possesses similar mean values and integral scales. The left panels have lower variability than the panels on the right. As the variance increases, the shape of the plume front becomes more irregular and tortuous. Assuming that the plume front is analogous to a closed circular wave, the higher variance produces a larger magnitude (i.e., stronger contrast in plume front extent). In Fig. 6, case 170 is an extreme case with zero variance, so the shape of the plume front is a perfect circle with zero wave magnitude/contrast. With increasing variance, tortuosity emerges and becomes increasingly stronger, representing the effect of preferential paths. Figure 7 also shows three pairs of comparisons of gas saturation plume expansion from 45 years to 50 years, in which each group possesses similar mean values and variances, to illustrate the effect of anisotropy ratio. The left panels have a lower Ih/Iv ratio than the right panels. Within each group, because of the similar variance, the contrasts in front extent are weak. When the anisotropy ratio increases from the left column to the right column, the plume front wave exhibits lower frequencies, because longer correlation ranges correspond to stronger spatial continuity and, therefore, the periodicity of the plume front gets weaker. When the anisotropy ratio is high enough, one can expect that the three-dimensional domain becomes a layered formation, and the plume will have a circular front shape within each observation plane, similar to the homogeneous cases. The difference is the vertical zig-zag pattern of contrasts between layers, compared to the smooth increase of plume radius from the bottom of the reservoir for the homogeneous cases. Figure 8 shows the dynamics between the permeability parameters and the plume radius. As mentioned earlier, plume radius increases as the mean increases at the high levels (e.g., z-axis = 25) but decreases as the mean increases for the lower levels (e.g., z-axis = 5). This happens because when the mean is relatively large, the buoyancy force can quickly move the injected CO2 up to the top layers, eliminating the effect of heterogeneity at lower zones. Therefore, the plume radius increases at top layers, but decreases at bottom layers.
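The closed-circular-wave analogy used for Figs. 6 and 7 can be made quantitative with a small Python sketch. It is an illustration under stated assumptions rather than an analysis performed in the study: the front radius is assumed to be sampled at equally spaced azimuths around the well, the "magnitude" is read as the peak-to-peak contrast in radius, and the "frequency" as the dominant angular wavenumber of the discrete Fourier transform. The function name `front_wave_summary` and the synthetic front are hypothetical.

```python
import numpy as np


def front_wave_summary(radii):
    """Summarise a plume front sampled at equally spaced azimuths in [0, 2*pi).

    Returns the mean radius, the peak-to-peak contrast, and the dominant
    angular wavenumber with its amplitude (from a real FFT of the
    mean-removed radius).
    """
    r = np.asarray(radii, dtype=float)
    spectrum = np.fft.rfft(r - r.mean())
    amplitude = 2.0 * np.abs(spectrum) / r.size
    k = int(np.argmax(amplitude[1:]) + 1)            # skip the zero-frequency bin
    return {
        "mean_radius": float(r.mean()),
        "contrast": float(r.max() - r.min()),
        "dominant_wavenumber": k,
        "dominant_amplitude": float(amplitude[k]),
    }


# Synthetic front (illustrative values only): 100 m mean radius with six
# lobes of 5 m amplitude around the well.
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
summary = front_wave_summary(100.0 + 5.0 * np.cos(6.0 * theta))
```

Under this reading, a zero-variance case such as case 170 would give a contrast of zero, while a higher anisotropy ratio would be expected to shift the dominant wavenumber toward smaller values.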

The response surfaces of response variables to the input parameters were analyzed, and the results using different sampling schemes were compared (Fig. 9 and Table 1). A surrogate for the true response surface for the plume radius at the top layer is shown in the upper left-hand plot of Fig. 9. In this plot, the MARS algorithm fit a surface to all 197 model runs, using the three stochastic parameters: permeability mean, standard deviation, and anisotropy ratio. Also shown in Fig. 9 are response surfaces fit to all 128 QMC runs, to the first 64 QMC runs, and to a combination of the first 32 QMC runs augmented with 32 ARM samples extracted from the


Fig. 6 CO2 gas saturation plume expansion from 45 years to 50 years with respect to the variance of Y. The two cases along each row have similar mean values and anisotropy ratios of Y. The left panels have lower variances (variability, standard deviation) than the panels on the right

remaining QMC runs. Qualitatively, all four of these response surface plots look very similar, meaning that the adaptive sampling approach combining QMC and ARM can predict the true response surface well using a significantly reduced set of samples. In Table 1, the specific sample design is defined by the method (i.e., QMC, PC or ARM) and the number of realizations/samples. The RMSE is calculated for each sample design by subtracting the surrogate true response surface from the response surface fit to the simulation results. Two different surrogate true response


Fig. 7 CO2 gas saturation plume expansion from 45 years to 50 years with respect to the anisotropy ratio (Ih/Iv). The two cases along each row have similar mean values and variances of Y. To illustrate the effect of anisotropy ratio (integral scales), the left panels have lower Ih/Iv ratios than the right panels

surfaces have been used. The first one was fit to all 197 samples (128 QMC + 69 PC). Because the PC method places duplicate sample locations, which tend to bias a response surface fit, the second true response surface was fit to only the 128 QMC samples. The ARM samples were selected by taking the first 32 QMC samples and finding new (augmented) samples in groups of 16. The column labeled A/Q is the ratio of the RMSE for the response surface fit to the ARM samples to the RMSE for the response surface fit to the QMC samples. In several cases, the ARM method produces a response surface much closer to the true response surface (e.g., the RMSE for ARM


Fig. 8 Comparison of plume radius (m) at different depths for several selected parameter combinations. Left plot represents the plume radius calculated for the 11 sample points (with the realization number and parameter values shown in table) for all 25 nodes along the z-axis. Right plot represents plume radius of the same 11 points for a selected set of depths (nodes along the z-axis)

Fig. 9 Comparison of response surfaces of plume radius (m) in the top node along the z-axis, with respect to the three parameters, across the different sampling schemes (QMC, PC, ARM)


Table 1 Comparison of estimated response surface results as defined by the root mean squared error (error = true response − estimated response) using different sampling methods and number of samples

Number of    RMSE^a                     RMSE^b
samples      QMC      ARM      A/Q      QMC      ARM      ARM^c    A/Q      A^c/Q
32           15.07    15.07    1.00     12.76    12.76    12.76    1.00     1.00
48           12.44    14.11    1.33     12.15    14.00    14.18    1.15     1.17
64           10.75     8.88    0.83     10.37     8.20     9.27    0.79     0.89
80           10.40     8.45    0.82      8.71     5.15     8.62    0.59     0.99
96            9.53    10.48    1.10      7.24     8.89     5.93    1.23     0.82
128           7.70

^a True response surface fit to 128 QMC + 69 PC samples
^b True response surface fit to 128 QMC samples
^c ARM fit using only the 128 QMC samples
A/Q = RMSE(ARM)/RMSE(QMC)

with 80 samples is 18 % smaller than for QMC with 80 samples when compared to the combined true response surface obtained using all samples and is 41 % smaller than for QMC with 80 samples when compared to the true response surface fit to the 128 QMC samples).
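The RMSE and A/Q quantities reported in Table 1 can be illustrated with a short Python sketch. It assumes that both response surfaces are evaluated at a common set of parameter points, which the text does not specify; the function names `rmse` and `a_over_q` are hypothetical. The only values taken from the paper are the 80-sample entries of Table 1 for the reference surface fit to the 128 QMC samples.

```python
import numpy as np


def rmse(true_surface, fitted_surface):
    """Root mean squared error between two surfaces evaluated at the same points."""
    err = np.asarray(true_surface, dtype=float) - np.asarray(fitted_surface, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))


def a_over_q(rmse_arm, rmse_qmc):
    """Ratio of the ARM-design RMSE to the QMC-design RMSE (A/Q in Table 1)."""
    return rmse_arm / rmse_qmc


# Toy surfaces (illustrative numbers only) to show the RMSE computation.
toy_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# Table 1, 80 samples, reference surface fit to the 128 QMC samples:
# RMSE(ARM) = 5.15 and RMSE(QMC) = 8.71.
ratio_80 = a_over_q(5.15, 8.71)
reduction_pct = 100.0 * (1.0 - ratio_80)
```

A ratio below 1 means that the ARM-augmented design reproduces the reference surface more closely than the QMC design of the same size.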

5 Discussion

The proposed uncertainty analysis framework enables us to evaluate how the input uncertainty propagates, analyze output uncertainty, and build response surfaces. It has all the necessary technical components for a typical UQ framework, but note that not all the factors controlling CO2 migration behavior are fully explored in this study; rather, we focus on reservoir permeability, which is assumed to be described by spatial variogram models with three geostatistical parameters: mean, variance (or standard deviation), and spatial macroscopic anisotropy ratio. The results could be more convincing with a more complete parameterization scheme incorporating heterogeneity in porosity and other factors. Such a scheme would need more information/data to infer parameters describing porosity heterogeneity or correlations between porosity and permeability, and the computational demand would be higher with additional parameters. It is natural to use average output statistics to study/evaluate the effects of input parameters. Similar to other studies in hydrological problems involving spatial anisotropy in field properties, the spatial anisotropy ratio seems to have negligible effects on average plume radius at all depths, but in fact it does affect the wave front shapes (the periodicity of the fronts and locations of the preferential paths). If the response variable is the CO2 concentration or plume arrival time at a particular location, or the maximum radius near the seal-reservoir interface, the anisotropy ratio could be important. The adaptive sampling method utilized samples that were already created using the QMC and PC methods. For this reason, additional/augmented samples identified by the ARM method were taken from the existing samples closest to the


ARM locations. As a result, the response surface fit to these samples tended to overfit in regions with small sample size and underfit in regions with a large number of samples. As a result of this possible over-fitting, the RMSE measure of how well the response surface approximated the true response surface (i.e., ARM fit with all samples) may not always decrease as more samples were used. In order to obtain reliable output statistics, a number of numerical simulations are needed to link the response variables to the input parameters. The proposed approach provides a means of dealing with the computational demand by using effective sampling approaches (QMC/PC), parallel computing techniques, and adaptive sampling (ARM). Computing resource management and data transfer tools have been incorporated into our UQ analysis framework to achieve optimized UQ capability with the given computational resources. Findings from the case study provide insight into the effect of spatial heterogeneity on CO2 migration; for example, the impacts of geostatistical parameters of reservoir permeability are given in a quantitative manner.
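The step of taking augmented samples from the existing runs closest to the ARM locations can be sketched as a nearest-neighbour lookup. This is a hedged illustration, not the authors' implementation: the min-max scaling of each parameter before measuring distance is an assumption made here because the three parameters have different units, and the function name `snap_to_pool` and the toy coordinates are hypothetical.

```python
import numpy as np


def snap_to_pool(targets, pool, used):
    """Pick, for each target location, the closest pool sample not yet in the design.

    targets : (m, d) locations proposed by the adaptive step
    pool    : (n, d) parameter values of runs that already exist
    used    : indices of pool samples already in the design
    Returns the list of selected pool indices, one per target.
    """
    pool = np.asarray(pool, dtype=float)
    lo, hi = pool.min(axis=0), pool.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)          # assumed min-max scaling
    pool_scaled = (pool - lo) / scale
    available = np.ones(len(pool), dtype=bool)
    available[list(used)] = False
    chosen = []
    for t in np.atleast_2d(np.asarray(targets, dtype=float)):
        d = np.linalg.norm(pool_scaled - (t - lo) / scale, axis=1)
        d[~available] = np.inf                       # never reuse a sample
        idx = int(np.argmin(d))
        chosen.append(idx)
        available[idx] = False
    return chosen


# Toy two-parameter pool (illustrative values only); sample 0 is already used.
pool = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
picked = snap_to_pool([[0.2, 0.1], [0.9, 0.2]], pool, used={0})
```

Because the selected run can sit some distance from the requested location, new samples may cluster where the pool happens to be dense, which is consistent with the uneven fit quality described above.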

6 Conclusions

A novel UQ analysis framework is introduced to analyze the effects of reservoir properties on CO2 plume evolution. The framework integrates efficient sampling approaches, high-performance computing, and adaptive sampling. It can be used to deal with computationally demanding problems such as simulating CO2 migration in a three-dimensional heterogeneous domain. In this study, the approach is demonstrated by exploring the effects of the hyperparameters including the mean, variance, and spatial integral scale of log permeability. The parameter mY has a positive effect on CO2 plume radius for the top layers but affects the bottom layers negatively. The parameter σY generally has a positive effect on the plume radius at all layers. The effect is more obvious at middle layers, where the transport of CO2 is highly influenced by the subsurface heterogeneity structure. At the very bottom layers, the variability is less important because the plume will not advance far horizontally before moving upward due to the buoyancy force. The anisotropy ratio has almost no impact on the plume radius in the summary analyses, but does affect the shape of the CO2 plume, such as the periodicity of the plume front; longer correlation ranges correspond to stronger spatial continuity and, therefore, the periodicity of the plume front becomes weaker. The adaptive sampling method showed some promising results when its response surface was compared to a surrogate true response surface. The goal with this method is to develop a response surface that is as close as possible to the true response surface with as few samples as possible. This method identifies new parameter sample points in regions of high uncertainty based on the estimated response surface and has the potential to achieve reliable response surfaces with fewer samples than an efficient sampling approach (e.g., quasi-Monte Carlo) alone.

Acknowledgements This research has been accomplished and funded through Pacific Northwest National Laboratory’s Carbon Sequestration Initiative, which is part of the Laboratory Directed Research and Development Program. This study was conducted at the Pacific Northwest National Laboratory, operated by Battelle Memorial Institute for the US Department of Energy under Contract DE-AC05-76RL01830.


References

Atanassov E, Ivanovska S, Karaivanova A (2010) Tuning the generation of Sobol sequence with Owen scrambling. In: Proceedings of LSSC09 (accepted for publication). In: LNCS, vol 5910, pp 459–466 (to appear). ISSN: 0302-9743

Balay S, Brown J, Buschelman K, Eijkhout V, Gropp WD, Kaushik D, Knepley MG, McInness LC, Smith BF, Zhang H (2011) PETSc users manual, ANL-95/11, Revision 3.2. Argonne National Laboratory, Argonne, 211 pp

Barnes DA, Bacon DH, Kelley SR (2009) Geological sequestration of carbon dioxide in the Cambrian Mount Simon sandstone: regional storage capacity, site characterization, and large scale injection feasibility; Michigan Basin, USA. Environ Geosci 16(3):163–183. doi:10.1306/eg.05080909009

Brooks RH, Corey AT (1966) Properties of porous media affecting fluid flow. J Irrig Drain Div 93(3):61–88

Burdine NT (1953) Relative permeability calculations from pore-size distribution data. Pet Trans AIME 198:71–77

Caflisch RE (1998) Monte Carlo and quasi-Monte Carlo methods. In: Iserles A (ed) Acta Numerica, vol 7. Cambridge University Press, Cambridge, pp 1–49. doi:10.1017/S0962492900002804

Cools R (1999) Monomial cubature rules since “Stroud”: a compilation—part 2. J Comput Appl Math 112:21–27. doi:10.1016/0377-0427(93)90027-9

Cools R, Rabinowitz P (1993) Monomial cubature rules since “Stroud”: a compilation. J Comput Appl Math 48:309–326. doi:10.1016/0377-0427(93)90027-9

Dawson R, Hall J (2006) Adaptive importance sampling for risk analysis of complex infrastructure systems. Proc R Soc A 462:3343–3362. doi:10.1098/rspa.2006.1720

Deutsch CV (2002) Geostatistical reservoir modeling, 1st edn. Oxford University Press, New York, 384 pp

Deutsch CV, Journel AG (1998) GSLIB: geostatistical software library and user’s guide, 2nd edn. Oxford University Press, New York, 384 pp

Dick J, Pillichshammer F (2010) Digital nets and sequences: discrepancy theory and quasi-Monte Carlo integration. Cambridge University Press, Cambridge, UK, 618 pp

Eccles JK, Pratson L, Newell NG, Jackson RB (2009) Physical and economic potential of geological CO2 sequestration in saline aquifers. Environ Sci Technol 43(6):1962–1969. doi:10.1021/es801572e

Engel DW, Liebetrau AM, Jarman KD, Ferryman TA, Scheibe TD, Didier BT (2004) An iterative uncertainty assessment technique for environmental modeling. In: Mowrer HT, McRoberts R, VanDeusen PC (eds) The joint proceedings of the accuracy and environmetrics, 2004 conferences. http://www.spatial-accuracy.org/system/files/Engel2004accuracy.pdf

Fayer MJ, Simmons CS (1995) Modified soil water retention functions for all matric suctions. Water Resour Res 31:1233–1238

Friedman JH (1991) Multivariate adaptive regression splines. Ann Stat 19(1):1–66

Givens GH, Raftery AE (1996) Local adaptive importance sampling for multivariate densities with strong nonlinear relationships. J Am Stat Assoc 91(433):132–141

Hou Z, Rubin Y (2005) On MRE concepts and prior compatibility issues in vadose zone inverse and forward modeling. Water Resour Res 41:W12425. doi:10.1029/2005WR004082

Hou Z, Rockhold ML, Murray CJ (2012a) Evaluating the impact of caprock and reservoir properties on potential risk of CO2 leakage after injection. Environ Earth Sci 66(8):2403–2415. doi:10.1007/s12665-011-1465-2

Hou Z, Huang M, Leung LR, Lin G, Ricciuto DM (2012b) Sensitivity of surface flux simulations to hydrologic parameters based on an uncertainty quantification framework applied to the Community Land Model. J Geophys Res 117:D15108

Hou Z, Rubin Y, Hoversten GM, Vasco D, Chen J (2006) Reservoir parameter identification using minimum relative entropy-based Bayesian inversion of seismic AVA and marine CSEM data. Geophysics 71(6):O77–O88

Izgec O, Demiral B, Bertin H, Akin S (2008) CO2 injection into saline carbonate aquifer formations I: laboratory investigation. Transp Porous Media 72:1–24. doi:10.1007/s11242-007-9132-5

Kitanidis PK (1997) Introduction to geostatistics: applications in hydrogeology. Cambridge University Press, Cambridge, UK, 272 pp

Knauss KG, Johnson JW, Steefel CI (2005) Evaluation of the impact of CO2, co-contaminant gas, aqueous fluid and reservoir rock interactions on the geological sequestration of CO2. Chem Geol 217:339–350


Lin G, Tartakovsky AM (2009) An efficient, high-order probabilistic collocation method on sparse grids for three-dimensional flow and solute transport in randomly heterogeneous porous media. Adv Water Resour 32(5):712–722. doi:10.1016/j.advwatres.2008.09.003

Message Passing Interface Forum (2009). http://www.mpi-forum.org/

Nieplocha J, Palmer B, Tipparaju V, Krishnan M, Trease H, Apra E (2006) Advances, applications and performance of the Global Arrays shared memory programming toolkit. Int J High Perform Comput Appl 20(2):203–231. doi:10.1177/1094342006064503

Nordbotten JM, Celia MA, Bachu S (2005) Injection and storage of CO2 in deep saline aquifers: analytical solution for CO2 plume evolution during injection. Transp Porous Media 58(3):339–360. doi:10.1007/s11242-004-0670-9

Oldenburg CM, Unger AJA (2003) On leakage and seepage from geologic carbon sequestration sites: unsaturated zone attenuation. Vadose Zone J 2:287–296. doi:10.2113/2.3.287

Pruess K (2008) On CO2 fluid flow and heat transfer behavior in the subsurface, following leakage from a geologic storage reservoir. Environ Geol 54(8):1677–1686. doi:10.1007/s00254-007-0945-x

Rutqvist J, Birkholzer JT, Cappa F, Tsang C-F (2007) Estimating maximum sustainable injection pressure during geological sequestration of CO2 using coupled fluid flow and geomechanical fault-slip analysis. Energy Convers Manag 48(6):1798–1807

Smolyak S (1963) Quadrature and interpolation formulas for tensor products of certain classes of functions. Sov Math Dokl 4:240–243

Sobol IM (1967) Distribution of points in a cube and approximate evaluation of integrals. USSR Comput Math Math Phys 7:86–112

Sobol IM, Shukhman BV (2007) Quasi-random points keep their distance. Math Comput Simul 75(3–4):80–86. doi:10.1016/j.matcom.2006.09.004

Tarantola A (2005) Inverse problem theory and methods for model parameter estimation. Society for Industrial and Applied Mathematics, Philadelphia, 352 pp

Wang X (2009) Dimension reduction techniques in quasi-Monte Carlo methods for option pricing. INFORMS J Comput 21(3):488–504

White MD, Oostrom M (2006) STOMP—subsurface transport over multiple phases—version 4.0—user’s guide. PNNL-15782, Pacific Northwest National Laboratory, Richland, Washington, 120 pp

Woodbury AD, Ulrych TJ (1993) Minimum relative entropy—forward probabilistic modeling. Water Resour Res 29(8):2847–2860

Xiu D, Hesthaven JS (2005) High order collocation methods for differential equations with random inputs. SIAM J Sci Comput 27(3):1118–1139