an adaptive spatial model for precipitation data from ...people.tamu.edu › ~k-bowman › pdf ›...

17
Stat Comput (2015) 25:389–405 DOI 10.1007/s11222-013-9439-8 An adaptive spatial model for precipitation data from multiple satellites over large regions Avishek Chakraborty · Swarup De · Kenneth P. Bowman · Huiyan Sang · Marc G. Genton · Bani K. Mallick Received: 2 December 2012 / Accepted: 20 November 2013 / Published online: 4 December 2013 © Springer Science+Business Media New York 2013 Abstract Satellite measurements have of late become an important source of information for climate features such as precipitation due to their near-global coverage. In this article, we look at a precipitation dataset during a 3-hour window over tropical South America that has information from two satellites. We develop a flexible hierarchical model to combine instantaneous rainrate measurements from those satellites while accounting for their potential heterogeneity. The research of Bani K. Mallick and Avishek Chakraborty was supported by National Science Foundation grant DMS 0914951. Research of Marc G. Genton was partially supported by NSF grants DMS-1007504 and DMS-1100492. The research in this article was also partially supported by Award No. KUSC1-016-04 made by King Abdullah University of Science and Technology (KAUST). A. Chakraborty (B ) · H. Sang · B.K. Mallick Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA e-mail: [email protected] H. Sang e-mail: [email protected] B.K. Mallick e-mail: [email protected] S. De SAS Research & Development (India) Pvt. Ltd, Pune 411013, India e-mail: [email protected] K.P. Bowman Department of Atmospheric Sciences, Texas A&M University, College Station, TX 77843-3150, USA e-mail: [email protected] M.G. Genton CEMSE Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia e-mail: [email protected] Conceptually, we envision an underlying precipitation sur- face that influences the observed rain as well as absence of it. The surface is specified using a mean function centered at a set of knot locations, to capture the local patterns in the rainrate, combined with a residual Gaussian process to ac- count for global correlation across sites. To improve over the commonly used pre-fixed knot choices, an efficient re- versible jump scheme is used to allow the number of such knots as well as the order and support of associated poly- nomial terms to be chosen adaptively. To facilitate compu- tation over a large region, a reduced rank approximation for the parent Gaussian process is employed. Keywords Large data computation · Nonstationary spatial model · Precipitation modeling · Predictive process · Random knots · Reversible jump Markov chain Monte Carlo · Satellite measurements 1 Introduction Algorithms to estimate atmospheric parameters from satel- lite measurements of upwelling radiation (Lethbridge 1967) have become invaluable for investigating global weather and climate. Satellites are routinely used to observe tem- perature, humidity, clouds, precipitation, aerosols and at- mospheric trace constituents. Applications of satellite data include weather forecasting, climate studies, ozone deple- tion, drought monitoring, crop forecasting and flood warn- ing. Data obtained from satellites are attractive because of their potential for near-global coverage and their ability to generate measurements at high spatial and temporal reso- lution compared to ground-based or airborne sources. Over the ocean and in sparsely populated regions of the land sur- face where few ground-based measurements exist, satellite observations are essential (Simpson et al. 1988; Kidd 2001).

Upload: others

Post on 02-Feb-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

  • Stat Comput (2015) 25:389–405DOI 10.1007/s11222-013-9439-8

    An adaptive spatial model for precipitation data from multiplesatellites over large regions

    Avishek Chakraborty · Swarup De ·Kenneth P. Bowman · Huiyan Sang · Marc G. Genton ·Bani K. Mallick

    Received: 2 December 2012 / Accepted: 20 November 2013 / Published online: 4 December 2013© Springer Science+Business Media New York 2013

    Abstract Satellite measurements have of late become animportant source of information for climate features suchas precipitation due to their near-global coverage. In thisarticle, we look at a precipitation dataset during a 3-hourwindow over tropical South America that has informationfrom two satellites. We develop a flexible hierarchical modelto combine instantaneous rainrate measurements from thosesatellites while accounting for their potential heterogeneity.

    The research of Bani K. Mallick and Avishek Chakraborty wassupported by National Science Foundation grant DMS 0914951.Research of Marc G. Genton was partially supported by NSF grantsDMS-1007504 and DMS-1100492. The research in this article wasalso partially supported by Award No. KUSC1-016-04 made by KingAbdullah University of Science and Technology (KAUST).

    A. Chakraborty (B) · H. Sang · B.K. MallickDepartment of Statistics, Texas A&M University, College Station,TX 77843-3143, USAe-mail: [email protected]

    H. Sange-mail: [email protected]

    B.K. Mallicke-mail: [email protected]

    S. DeSAS Research & Development (India) Pvt. Ltd, Pune 411013,Indiae-mail: [email protected]

    K.P. BowmanDepartment of Atmospheric Sciences, Texas A&M University,College Station, TX 77843-3150, USAe-mail: [email protected]

    M.G. GentonCEMSE Division, King Abdullah University of Scienceand Technology, Thuwal 23955-6900, Saudi Arabiae-mail: [email protected]

    Conceptually, we envision an underlying precipitation sur-face that influences the observed rain as well as absence ofit. The surface is specified using a mean function centeredat a set of knot locations, to capture the local patterns in therainrate, combined with a residual Gaussian process to ac-count for global correlation across sites. To improve overthe commonly used pre-fixed knot choices, an efficient re-versible jump scheme is used to allow the number of suchknots as well as the order and support of associated poly-nomial terms to be chosen adaptively. To facilitate compu-tation over a large region, a reduced rank approximation forthe parent Gaussian process is employed.

    Keywords Large data computation · Nonstationary spatialmodel · Precipitation modeling · Predictive process ·Random knots · Reversible jump Markov chain MonteCarlo · Satellite measurements

    1 Introduction

    Algorithms to estimate atmospheric parameters from satel-lite measurements of upwelling radiation (Lethbridge 1967)have become invaluable for investigating global weatherand climate. Satellites are routinely used to observe tem-perature, humidity, clouds, precipitation, aerosols and at-mospheric trace constituents. Applications of satellite datainclude weather forecasting, climate studies, ozone deple-tion, drought monitoring, crop forecasting and flood warn-ing. Data obtained from satellites are attractive because oftheir potential for near-global coverage and their ability togenerate measurements at high spatial and temporal reso-lution compared to ground-based or airborne sources. Overthe ocean and in sparsely populated regions of the land sur-face where few ground-based measurements exist, satelliteobservations are essential (Simpson et al. 1988; Kidd 2001).

    mailto:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]

  • 390 Stat Comput (2015) 25:389–405

    Precipitation is critically important in terms of economicand social impacts, but it is also one of the most difficult at-mospheric phenomena to observe and model. The complexphysical processes and large variability of precipitation posesignificant scientific challenges in modeling the atmosphere.Rain gauges are the most common technology for measur-ing surface rainrates. Compared to parameters such as tem-perature, rainfall has very short space and time correlationlength scales, so high resolution data are required to resolvespatial and temporal variations (Austin and Houze 1972;Rodríguez-Iturbe and Mejía 1974). With the exception ofa few small, dense research networks, the spacing betweenrain gauges tends to be much larger than the typical spatialscales of precipitation systems (Felgate and Read 1975). Inpractical terms, gauge networks do not resolve the variabil-ity of precipitation systems, particularly on smaller scales.In addition, there is little rain gauge data over the oceansand over large expanses of many continents. In contrast, dueto their near-global coverage, satellites have the potential toprovide precipitation estimates with high spatial resolutionwhere gauge observations or conventional earth-bound mon-itoring systems are unavailable. The principal limitation ofobservations from low-Earth-orbiting satellites is temporalsampling. A single satellite will typically observe a givenlocation on the Earth’s surface only a few times per day. Al-though satellites can provide global coverage, satellite sam-pling and measurement errors can be substantial, so effectivemethods are needed to validate and combine observationsinto consistent and useful estimates. A detailed discussionon sampling errors for satellite rainfall average can be foundin Bell et al. (1990, 2001), Bell and Kundu (1996) and Mc-Connell and North (1987).

    Over the last few decades many satellite-based precipi-tation algorithms have been developed (Wilheit 1977; Xieand Arkin 1998; Ba and Gruber 2001; Huffman et al. 2002;Joyce et al. 2004; Negri et al. 2002; Sorooshian et al. 2000;Vicente et al. 1998; Weng et al. 2003). Satellite methods canbe used to generate precipitation products at various spa-tial and temporal resolutions. Precipitation estimation fromspace is most often based on observations of infrared or mi-crowave radiation. Infrared-based techniques can have rel-atively high spatial and temporal resolution. Infrared radi-ation cannot penetrate typical dense clouds, but it is pos-sible to measure the altitude of cloud tops. Higher cloudtops indicate deeper storms, which tend to produce largerrainfall amounts. A statistical relationship between cloud-top height and surface rain gauge data can be used to es-timate surface rainrates from satellite observations. Hence,the rainrate is related only indirectly to the observed quantity(cloud top height); so uncertainties are larger. Microwaveradiation, on the other hand, can penetrate through clouds.Consequently, microwave methods are more closely tied tothe relevant physical quantity (falling raindrops), but spatial

    and temporal coverage are typically lower. Microwave meth-ods address these problems by merging precipitation obser-vations from multiple satellites to yield global precipitationestimates at reasonably high spatial and temporal resolution(e.g. 0.25◦ and a few hours) as in the Tropical Rainfall Mea-suring Mission (TRMM) Multisatellite Precipitation Analy-sis (TMPA) described by Huffman et al. (2007). The reso-lution of the TMPA grid (0.25◦) is comparable to the reso-lution of current microwave instruments. Currently mergedsatellite estimates do not take into account the different sta-tistical properties of the input data streams. These differ-ences include variations in spatial and temporal resolutionand in error characteristics. Multiple observations are usu-ally combined in the simplest possible way by averaging thevarious instantaneous estimates to produce a “best estimate”of the mean rainfall rate over the selected interval (Huff-man et al. 2007). Because statistical properties of rainfallare highly non-Gaussian and depend strongly on space andtime scale, this averaging can significantly impact highermoments of the estimates (e.g. variances and covariances).

    In this paper, our approach is to use a Bayesian hierar-chical framework for combining the available observationsfrom multiple satellites within a space-time volume. Thisenables us to develop the model in a way that can account forsome specific characteristics of a precipitation dataset. Gen-erally, precipitation patterns are highly localized and havefast time scales. As a result, at any given location, rainfall isabsent most of the time. Hence any rainfall data, aggregatedover a short time interval, is expected to have a high num-ber of zeros. The use of mixture models with a degeneratemass at zero is a common approach to model zero-inflateddata, e.g., a zero inflated Poisson (ZIP) model (Cohen 1991).Agarwal et al. (2002) discussed the use of Bayesian meth-ods to analyze spatially correlated zero-inflated count datain the presence of covariate information. In the context of anecological dataset on presence of plant species, Chakrabortyet al. (2010) used a spatial probit model to address a largenumber of absences. In the current work, we introduce a lo-cally varying bias term for that. Atmospheric convection,which drives the event of precipitation, involves turbulentinteractions across a wide range of scales and is governedby fundamentally nonlinear equations of fluid dynamics. Asa consequence, the information on whether it is raining at agiven location or not is highly localized. The bias term weintroduce here addresses this nonlinearity and local varia-tion, unlike the linearity assumption in Fuentes et al. (2008).For geospatial datasets, an additive model with nonlinearcovariate-dependence was also discussed in Kammann andWand (2003).

    For precipitation, when it rains, the probability distribu-tion of instantaneous rainrate is non-Gaussian with long tail.Hence, a lognormal model is often adopted for nonzero pre-cipitation measurements, see Lee and Zawadzki (2005) and

  • Stat Comput (2015) 25:389–405 391

    Fuentes et al. (2008). Alternative approaches also exist inthe literature such as the truncated power transformation ofBardossy and Plate (1992) and the use of skew-elliptical dis-tributions in Marchenko and Genton (2010). Our approachis motivated by Fuentes et al. (2008), where a zero-inflatedlog-Gaussian model has been used for rainfall, and, in a hier-archical framework, the distribution of no rainfall events wasmodeled to depend on the true rainfall intensity. However,the modeling of the rainfall process in this article signifi-cantly differs from the approach therein. To jointly modeltrue rainfall intensities at adjacent locations, Fuentes et al.(2008) used a Gaussian Markov random field with correla-tion parameters that do not change across locations. How-ever, in general, the amount of rainfall over a few hoursis highly localized and nonstationary on a large domain.A number of works have focussed on developing nonsta-tionary spatial covariance functions. Spatial deformationsto model nonstationary spatial processes have been used bySampson and Guttorp (1992), Schmidt and O’Hagan (2003)and Anderes and Stein (2008). On the other hand, kernelconvolution and its variants have been applied in severalpapers to create nonstationary covariance functions as inHigdon (1998), Fuentes (2002) and Paciorek and Schervish(2006). Jun and Stein (2008) and Jun (2011) proposed amethod of modeling nonstationary covariance functions ona sphere. Nonstationarity can also be introduced through co-variates. Using spatially-varying regression coefficients fora point-referenced data provides a scope of detecting sub-regional variation in the response-predictor relationship, seeGelfand et al. (2003). Specifically, in the context of analyz-ing a zero-inflated dataset like ours, Finley et al. (2011) pro-posed a hierarchical model where multivariate spatial pro-cess priors were used for two different sets of regressioncoefficients—one for controlling the abundance of zeros andthe other for the nonzero observations. Here, we also workwith a covariate-dependent nonstationary model but opt foran adaptive specification, as in Friedman (1991) and Deni-son et al. (1998). It depends on finding a set of knot loca-tions in the predictor space and developing a local poly-nomial for each one of them. This approach is relativelysimpler to interpret, estimate and, importantly, model non-Gaussian patterns and input interactions in the response sur-face. The flexibility of this specification lies in the fact thatthe functions can be constructed adaptively, i.e., the orderand support of the local polynomials and even the numberof knots is decided by the pattern of the data during model-fitting. When spatial covariates are available, the model canselect the important predictors and/or interactions, eliminat-ing the need to pre-specify the form of dependence. To mod-ify the number of polynomial terms in the function, an ef-ficient reversible jump Markov chain Monte Carlo (RJM-CMC, Richardson and Green 1997) sampler has been devel-oped in Sect. 4. The residual was modeled with a Gaussian

    process (GP) prior so as to reflect correlation across sites ona global scale.

    Although not directly relevant to the modeling and dataanalysis in this article, we like to mention that there is a sig-nificant collection of literature on modeling extreme precip-itation events. Here the quantity of interest is the right-handtail of the rainfall distribution. The common approach is touse extreme value theory to propose a probability distribu-tion for exceedance (the event that rainfall crosses a thresh-old value) rate at each site and relate the parameters of thosedistributions using spatial process models. See Cooley et al.(2007) and Sang and Gelfand (2009) for a hierarchical ap-proach to this problem for rainfall data at point and grid-level, respectively.

    Another important feature of our problem is predictionat thousands of unsampled sites. As Markov random field-based models do not have predictive property, one needs toinclude all such sites directly into the estimation, potentiallyleading to a large number of spatial random effects and slowconvergence. With the spline-GP combination used in thisarticle, prediction can be done as a post-MCMC analysis.One potential issue here is the sensitivity of GP computationto the number of locations in the dataset. There are a num-ber of approximation techniques in the literature, such asprocess convolution (Higdon 2002), approximate likelihood(Stein et al. 2004), fixed rank kriging (Cressie and Johannes-son 2008), covariance tapering (Furrer et al. 2006; Kaufmanet al. 2008), predictive process (Banerjee et al. 2008) andvery recently a combined approach involving reduced rankapproximation and covariance tapering by Sang and Huang(2012); see Sun et al. (2012) for a review of available meth-ods. We employ the fixed knot-based predictive process ap-proximation, discussed in brief in Sect. 4.1. Finally, as withany Bayesian approach, information on uncertainty aboutprocess parameters, in addition to their pointwise estimatescan be readily obtained which may be of significant interestwith rainfall being a highly variable event.

    The article is organized as follows. Measurements of pre-cipitation around the world with multiple satellites and sub-sequent processing of the raw data are discussed in Sect. 2.In Sect. 3, the formulation of a Bayesian hierarchical spa-tial model for rainfall is introduced. Implementation detailsincluding the choice of priors, large data computation andsampling scheme are outlined in Sect. 4. The details of thesimulation study and the real data analysis are presented inSect. 5. Finally, Sect. 6 summarizes the present work andpoints out further related research. In this article, the nota-tions N(μ,σ 2) and LN(μ,σ 2) have been used for denotingnormal and lognormal probability densities with location μand scale σ , respectively. Φ is used for the standard normalcumulative distribution function and Nd(μ,Σ) stands ford-dimensional multivariate normal distribution with meanvector μ and dispersion matrix Σ .

  • 392 Stat Comput (2015) 25:389–405

    Fig. 1 Observation swaths of different satellites in a typical 3-hourobserving period. The figure is from Huffman et al. (2007). Blackindicates no data. Colors indicate rainrate in mm/hr. Shades of gray in-dicate observations of zero rain from the different satellites. The white

    trajectory corresponds to the path of TMI. Observations are averagedwhere overlaps occur. Reproduced by permission of the American Me-teorological Society (Color figure online)

    2 Collection of satellite measurements on precipitation

    Precipitation observations from multiple satellites can bemerged to yield global precipitation estimates at reason-ably high spatial and temporal resolution as is done withthe Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) described by Huff-man et al. (2007). Data are collected by a variety of lowearth orbit satellites, including the TRMM Microwave Im-ager (TMI) on the TRMM satellite, Special Sensor Mi-crowave Imagers on the Defense Meteorological SatelliteProgram satellites, the Advanced Microwave Scanning Ra-diometer on the NASA Aqua satellite, and Advanced Mi-crowave Sounding Units (AMSU) on National Oceanic andAtmospheric Administration (NOAA) operational satellites.These satellites vary with respect to altitude, orbital inclina-tion, and equator crossing time. Data from multiple satellitesare aggregated in 3-hour windows, which are approximatelytwo complete orbits for low-Earth-orbiting satellites. The3-hour time window is usual in precipitation study (Huffmanet al. 2007) as a compromise between two competing goals:higher temporal resolution of the pattern of precipitation andgreater spatial coverage during each averaging window. Be-cause of the constraints of orbital motion and instrument de-sign, a single satellite provides limited coverage of the globewithin a 3-hour window. The current suite of operational and

    research satellites can, together, provide nearly global cov-erage in a 3-hour period. Longer sampling windows wouldoffer greater coverage but would have a lesser ability to re-solve the diurnal cycle, which is an important componentof precipitation variation. Figure 1 shows the available datafor a typical 3-hour period. For example, the regional 3-hourprecipitation dataset we used for analysis in Sect. 5 has in-formation from two satellites—TMI and AMSU-NOAA17.Following the TMPA practice, we choose the TMI as the ref-erence standard due to its higher spatial resolution (resultingfrom relatively lower altitude) and ongoing calibration withthe TRMM Precipitation Radar.

    A typical satellite microwave radiometer measures theupwelling microwave radiation by using a rotating antennato scan in a conical pattern as the satellite moves abovethe Earth’s surface. The spatial resolution of the measure-ments depends on the altitude of the satellite, the size ofthe microwave antenna (typically about 1 m), and the wave-lengths used. Rainrates are estimated using physical algo-rithms (Wilheit et al. 1977) based on reverse lookup tablesthat are precomputed using radiative transfer models. For ef-ficiency and convenience in processing the data, within each3-hour window the rain estimates from the instantaneousfields of view (pixels) of the satellite are first averaged ontoa 0.25◦ × 0.25◦ latitude-longitude grid. In the tropics theseboxes are nearly square and are approximately 28 × 28 km2.

  • Stat Comput (2015) 25:389–405 393

    This resolution represents a reasonable compromise amongthe differing spatial resolutions of the various instrumentsand is suitable for prediction related to climatological oragricultural studies. Rain measurements are reported as in-stantaneous rainrates in mm/hr.

    3 Hierarchical spatial model for rainrate

    In this section we describe the proposed hierarchical modelfor the multi-satellite precipitation data. Let D be the do-main of observation, T be the time window of study, X(s) =(x1(s), x2(s), . . . , xp(s))

    T be the p-dimensional vector ofcovariates measured at location s and Ỹl(s) is the observedrainrate during T from the l-th satellite at location s ∈ Dfor l = 1,2, . . . ,L. However, at any given s, the rainratemay not be available for some or all of the L satellites.The observed rainrate data is modeled as a noisy version ofan underlying unobserved potential rainrate process as fol-lows:

    Ỹl(s) =

    ⎧⎪⎨

    ⎪⎩

    0 with probability π(s),

    exp{c1l + c2lY (s) + εl(s)}with probability 1 − π(s),

    (1)

    where π(s) is the probability of zero rainfall at grid cell sand exp(Y (s)) represents the latent potential rainrate pro-cess at location s. If it rains, then the rainrate observedby satellite l, Ỹl(s), is a noisy measurement of that la-tent process. The parameters c1l , c2l are the model’s addi-tive and multiplicative bias adjustments specific to satel-lite l. For identifiability purpose, we need to set c1l0 =0, c2l0 = 1 for some l0 ∈ {1,2, . . . ,L}, so that any infer-ence from the model can be interpreted with satellite l0 asthe reference. Generally, one chooses l0 to be the satel-lite which is known to have maximum precision in mea-surements. In applications, where rainfall data are avail-able from multiple sources, the one expected to have thehighest accuracy can be used as the reference, e.g., rain-gauge data in Fuentes et al. (2008). The zero-mean noiseεl(s) characterizes the variations due to measurement er-ror and/or micro-scale spatial variations for the l-th satel-lite.

    We introduce the data augmentation approach (Tannerand Wong 1987; Albert and Chib 1993) to relate the rain-fall probability π(s) with the latent rainrate process Y(s) ina flexible way. The spatial probit model for π(s) is as fol-lows:

    π(s) = 1 − Φ{μπ(s) + βπY (s)}. (2)

    Conceptually, this amounts to modeling the zeros of rain-rate measurements to correspond to the low values of the

    latent process Y . The variable intercept μπ(·) can be re-ferred to as “bias function” that accounts for the poten-tial event of zero rainfall due to nonlinear interactions thatcannot be captured linearly through Y(s). If μπ(s) is aconstant over D, we have that E{log Ỹk(s)|Ỹk(s) > 0} andΦ−1{π(s)} are linear functions of each other for any k, sim-ilar to the assumption made in Fuentes et al. (2008). We in-troduce L latent surfaces Z1(s),Z2(s), . . . ,ZL(s) such that

    Zl(s)i.i.d.∼ N(μπ(s) + βπY (s),1), 1 ≤ l ≤ L. Then we can

    rewrite (1) as

    Ỹl(s) ={

    0 if Zl(s) ≤ 0,exp{c1l + c2lY (s) + εl(s)} if Zl(s) > 0.

    (3)

    Estimation of {Y(s) : s ∈ D} is of prime interest in thisproblem. Y(s) captures rainfall patterns over D. Since rain-fall over a small time window is a highly localized event,the usual isotropic GP-based spatial models will not sufficefor Y . There are different approaches to introduce nonsta-tionarity as outlined in Sect. 1. The approach we take here isto specify the mean surface μy(·) using multivariate adap-tive regression splines (MARS; Friedman 1991; Denisonet al. 1998). The idea is to model the function as a sum ofinteractions of varying order from a basis set of local poly-nomials as follows:

    Y(s) = μy(s) + w(s),

    μy(s) = my(s) +ky∑

    h=1νy,hφy,h

    (X(s)

    ), (4)

    φy,h(x) =nh∏

    r=1

    [uhr(xvhr − thr )

    ]

    +,

    where my(·) represents a fixed trend function (e.g. just anintercept or a linear or quadratic trend in components of s)and (·)+ = max(·,0), nh is the degree of the interaction ofthe basis function φy,h. The {uhr}, sign indicators, are ±1,the vhr gives the index of the predictor variable which is be-ing split at the value thr within its range. Thus, the func-tion φy,h represents a pattern around the knot th = {thr :r = 1,2, . . . , nh} (we refer to it as a sp-knot) and the set{φy,h(·) : h = 1,2, . . . , ky} defines an adaptive partitioningof the multidimensional space.

    Specifying μy(·) with local interactions provides greaterflexibility to model surface patterns and variable relation-ships. Most importantly, localized structures allow for non-stationarity. The flexibility of MARS lies in the fact that theinteraction functions can be constructed adaptively, i.e., theorder of interaction, knot locations, signs and even the num-ber of such terms ky is decided by the pattern of the dataduring model-fitting, eliminating the need for any prior ad-hoc or empirical judgement. In spite of having a flexible

  • 394 Stat Comput (2015) 25:389–405

    mean structure, MARS is relatively simple to fit which isan obvious advantage against other choices of nonstation-ary processes in the present application. For interpretabil-ity and to avoid overfitting, interactions up to a certain or-der are used and only up to a prefixed number of termsmay be allowed in the above sum. Any additional or higherorder global pattern is accounted for by assigning a zero-mean GP prior to the residual process w(s). The covariancefunction for w is assumed to be isotropic, i.e. for two lo-cations s and s′, cov{w(s),w(s′)} = σ 2y ρ{d(s, s′), κ} whered(·, ·) is the great circle distance for latitude-longitude data.ρ is the correlation function (validity of usual correlationfunctions such as exponential and Matérn with respect togeodetic distance is discussed in Banerjee (2005)), κ isthe parameter (vector) that controls the smoothness andrate of decay. The above specification implies that, for aset of n locations s = (s1, s2, . . . , sn), the vector Y(s) =(Y (s1), Y (s2), . . . , Y (sn))

    T is distributed as:

    Y(s) ∼ Nn(μy(s),Σ(s;κ)

    ),

    where Σ(i, i′) = σ 2y ρ{d(si, si′), κ}. If a priori νy ∼ Nky ×(0, cIky ) then marginalizing out νy , we have:

    E[Y(s)

    ] = my(s),V[Y(s)

    ] = cP (s)P (s)T + Σ(s, κ),where P(s) is a n × ky matrix with i-th column being(φy,i(s1),φy,i(s2), . . . , φy,i (sn))

    T . Let θh = {(vhr , thr , uhr ):r = 1,2, . . . , nh} be the parameters in φy,h and f (s; θh) =φy,h(s). Then at individual location level, we have:

    var[Y(si)

    ] = cky∑

    h=1f 2(si; θh) + σ 2y ,

    cov[Y(si), Y (si′)

    ] = cky∑

    h=1f (si; θh)f (si′ ; θh)

    + σ 2y ρ{d(si, si′), κ

    }.

    (5)

    This provides a very flexible model for the covariance spec-ification of the Y process that is also easy to interpret. Thetotal covariance is decomposed into a locally varying (non-stationary) component combined with a global pattern com-ing from the GP. The vector θh controls the characteristicsof the pattern associated with the h-th sp-knot and f (s, θh)represents its effect at location s. The resulting covarianceis the sum of such local effects. In practice, one starts witha global covariance term only and then the model itself se-lects the local effects as necessary. The parameter c repre-sents prior confidence (uncertainty) in these effects. The biasfunction, μπ(·) is also specified using MARS as above, witha separate set of parameters.

    It follows from (5) that MARS specification amounts tobuilding the covariance model of the output process with

    locally-supported components. In spatial literature, commonapproaches to incorporate nonstationarity by kernel mixingof process variables offers essentially similar decompositionof covariance; see Higdon et al. (1999), Fuentes (2002) andBanerjee et al. (2004). However, MARS offers significantlygreater flexibility over those methods as the shape of sucheffects around the knots can be decided independently ofeach other without being controlled by that of any chosenkernel function. Moreover, allowing each of these patternsto have its own local support encourages sparsity and avoidsthe complexity often associated with determining appropri-ate kernel bandwidth parameter(s).

    Another important advantage that this specification of-fers is the ability to let the required number of such lo-cal effects and the associated sp-knots be entirely decidedby the data, without compromising for the computationalcomplexity and interpretability. For any spatial model de-pendent on a set of knots, selection of an appropriate num-ber of knots and placing them optimally over space has al-ways been a critical issue. With too few knots, there is al-ways a possibility of overestimating the actual spatial rangeand neglecting important local patterns. On the other hand,using too many knots increases the computational demandand may lead to poor predictive performance by accountingfor even the noises or negligible variations in the observeddata. Conditional on a fixed number of knots, Gelfand et al.(2012) discussed an approach to place them optimally usinga minimum predictive variance criterion. A model-based ap-proach for random knots was introduced in Guhaniyogi et al.(2011). There, in a multi-stage structure, a point processprior was assumed for the set of knots. The intensity of thatpoint process can either be a parametric multimodal surfaceor a log-Gaussian process itself. However, when the numberof such knots is significant, efficiently updating them mayturn out to be challenging owing to a nonstandard posteriordistribution. Our specification allows for random knot selec-tion by placing a prior on the set of observed points in thespace and then, during the MCMC scheme, varies the size aswell as the members of the collection of the sp-knots via ad-dition, deletion or modification, only one at a time. Anothervery important feature of our knot selection procedure is thateven though we are working in a p-dimensional predictorspace, a typical sp-knot can be a location in a lower dimen-sional space. This can be particularly useful when the spatialprocess under consideration has different degrees of smooth-ness across coordinates. For example, an atmospheric pro-cess may exhibit significant variations with change in lon-gitude, but may remain relatively uniform with variations inlatitude (at a fixed longitude). In those situations, a more par-simonious representation can be ensured with MARS by al-lowing the data to position its sp-knots only over the range oflongitudes. Since most knot-based spatial methods have notaddressed this dimension-wise variation so far, it can pro-vide a potentially interesting direction for future research in

  • Stat Comput (2015) 25:389–405 395

    this field. For our multi-stage spatial model, the samplingprocedure is described in Sect. 4.

    4 Details of estimation and inference

    In this section, we focus on how to implement the model inSect. 3 on a (potentially massive) precipitation dataset. SinceGP computation is sensitive to the data size, we begin witha suitable approximation method to make the model capableof handling the computation. Subsequently we describe thefull hierarchical model used to fit the dataset and outline theestimation procedure via MCMC. Finally, we mention someof the quantities of interest which can be estimated by post-processing the posterior samples.

    4.1 Knot-based approximation for large dataset

    When the number of locations inside D gets large (inthousands), updating the latent spatial process parametersinside an MCMC becomes complicated due to its high-dimensional covariance matrix. We use a reduced rank rep-resentation of the original process, the predictive process, asdeveloped in Banerjee et al. (2008). Below, we present theidea in brief.

    Consider realizations from a zero-mean, unit-scale GPw(·) at a set of n locations s = (s1, s2, . . . , sn) ∈ D wheren is large. The method proceeds by first choosing m � nlocations s0 = (s01 , s02 , . . . , s0m) in D, to be referred to aspp-knots, and then replaces w(s) by an approximate pro-cess w̃(s) = E[w(s)|w(s0)] = Lw(s0) where the matrix Lis calculated from the correlation function ρ of the origi-nal process w. L depends on the correlation parameter(s)of ρ. If m � n, we gain in terms of computation time us-ing the Sherman-Woodbury-Morrison (S-W-M) type formu-lae (Banerjee et al. 2008). However the accuracy of the ap-proximation goes up with increasing m, so there has to bea trade-off. We prefer to use this method, as it is deriveddirectly from the parent process without any need for ad-hoc choice of basis functions, is easy to interpret and hasclosed form analytic expressions. This approximation is co-herent with the MARS model introduced in Sect. 3, wherethe resulting covariance was shown to depend on a set ofrandom sp-knots. We like to mention that, conditional on afixed number of pp-knots m, we can allow their locations tovary by using a point process prior on them as in Guhaniyogiet al. (2011). However, the hierarchical model described inSect. 3 already includes an adaptive spatial function basedon sp-knots and GP is used only as a model for the residualprocess. So, in this article, we choose to work with a fixedset of pp-knots only.

    We introduce bias correction, a modification discussedin Finley et al. (2009). Since var{w(sj )} > var{w̃(sj )} for

    each j , the predictive process is expected to underestimatethe spatial variance and increase the variance of the pureerror. The correction introduces an heteroscedastic indepen-dent error ε∗ so that w̃(s) = Lw(s0) + ε∗ and var{w̃(sj )} =var{w(sj )} for any j = 1,2, . . . , n. Introduction of a biascorrection term also facilitates computation when the GPunder consideration is applied to a latent-stage response (asours) that needs to be updated every iteration. The advantageof this method is illustrated in Sect. 4.3.

    4.2 MCMC from the complete hierarchical model

    We discuss parameter estimation using a MCMC schemefrom the model in Sect. 3. Let {sl1, sl2, . . . , slnl } be thelocations in D at which rainrate measurements are avail-able from the satellite l for l = 1,2, . . . ,L. Let ỹlj andxlj denote the rainrate measurement and available covari-ate information, respectively, at location slj by satellite l forj = 1,2, . . . , nl . Let s denotes the pooled set of n distinct lo-cations

    ⋃Ll=1{sl1, sl2, . . . , slnl } and s0 the set of m pp-knots

    as above. For the joint set of locations (s, s0), partition thespatial correlation matrix as Cn+m(κ) =

    ( Cn(κ) Cn,m(κ)Cn,m(κ) Cm(κ)

    ),

    where the entries of Cn+m are unit scale correlation termsunder correlation function ρ(·, κ). From Sect. 4.1, L(κ) =Cn,m(κ)C

    −1m (κ). Then we can write the full hierarchical

    model as follows:

    ỹlj ∼ 1(zlj < 0)δ0 + 1(zlj > 0)LN(c1l + c2ly(slj ), σ 20

    ),

    zlj ∼ N(μπ(slj ) + βπy(slj ),1

    ),

    y(s) = μy(s) + σyw̃(s),w̃(s) = L(κ)w(s0) + ε∗(s), (6)w

    (s0

    ) ∼ GP (0, ρ(·, κ)),ε∗(s) ∼ Nn

    (0,Diag

    {In − Cn,m(κ)C−1m (κ)Cm,n(κ)

    }),

    μd(s) = md(s) +kd∑

    h=1νd,hφd,h

    (x(s)

    ), d = π,y.

    Regarding prior specification, we set c1l0 = 0, c2l0 = 1 forsome l0 as discussed in Sect. 3. For l = l0 we use Gaus-sian priors centered at 0 and 1, respectively. For the regres-sion coefficient βπ and variance parameter σ 20 , we use Nor-mal and inverse-gamma priors, respectively, for conjugacyof posterior distributions. We choose ρ(·, κ) to be the expo-nential correlation function with decay parameter κ . It canbe easily shown that κ ≈ 3.0/R (R is the spatial range, i.e.,the distance at which the correlation falls below 0.05) soκ can be specified using a prior idea about possible valuesof R. An Inverse Gamma (aσ , bσ ) prior was used for thespatial variance σ 2y . We chose the trend function md(·) to bea constant intercept for d = π,y and, without loss of gener-ality, merge it with the MARS predictors as a constant basis

  • 396 Stat Comput (2015) 25:389–405

    function φd,0(·) ≡ 1. Conditional on kπ and ky , the numberof basis functions in the expansion, we assign Gaussian pri-ors to the coefficient vectors so that νy ∼ Nky (0, σ 2y τ 2y Iky )and νπ ∼ Nkπ (0, τ 2πIkπ ). We assign Inverse gamma priorsto both scale parameters τ 2y and τ

    2π to maintain conjugacy,

    so that we can draw them from respective Inverse gammaposterior conditional distributions.

    For d = π,y, we can control the parsimony of the non-stationary function μd in three different ways: (1) chang-ing the prior mean of kd , (2) putting an order constraint oneach φd,h and (3) setting a fixed threshold k0 for maximumvalue of kd . Accounting for the column of ones correspond-ing to the constant basis function, (kd − 1) is chosen to havea Poisson(λd ) prior truncated to the right at k0 to control thenumber of terms in the sum. For each φd,h we can eitheruse a strict upper bound (e.g. allowing only functions uptosecond order) or choose a prior that puts small probabilityon a higher order basis function. The value of k0 shouldbe chosen based on our idea of the variability in y-surface.k0 = 0 corresponds to the most parsimonious model—a per-fectly stationary surface. Increasing k0 will allow us to cap-ture more and more local patterns but risks overfitting. Inpractical examples, the choice of k0 may come from prioridea about nature of variation in the surface. In absence ofany such information, one can use a validation method, i.e.,using a subset of the data as the test set, fit the model withdifferent values of k0 and investigate its influence on the pre-dictive performance.

    During the MCMC, the vector of parameters has beenupdated in the following blocks: (i) spline coefficients{νd,h}, number of basis functions kd and parameters withineach φd,h for d = π,y; (ii) latent rainrate variables y(s);(iii) auxiliary surface z(·); (iv) regression parameters suchas βπ and {cil : i = 1,2, l = 1,2, . . . ,L, l = l0}. We rewritethe probability distribution of y(s) as y(s) ∼ Nn(μy(s) +L(κ)w(s0),Σy) where Σy = σ 2y Diag{In − Cn,m(κ) ×C−1m (κ)Cm,n(κ)}. The resulting posterior distribution for yis Gaussian and, importantly, has independence across lo-cations. So, it is easy to draw y even when n is large. Asin Albert and Chib (1993), conditional on y(slj ) and ỹlj , zljcan also be sampled independently across different satellitesand locations:

    zlj |ỹlj , y(slj ),μπ ,βπ ind∼ 1(ỹlj = 0)N(−∞,0)(μlj ,1)+ 1(ỹlj > 0)N(0,∞)(μlj ,1)

    μlj = μπ(slj ) + βπy(slj ),

    where NA(μ,1) stands for N(μ,1) distribution truncatedwithin A ⊂ R.

    Next, we discuss updating parameters related to the pos-terior distribution of y, i.e., κ,σ 2y and μy . For this, we first

    marginalize out w(s0) from the distribution of y. As ob-served in Chib and Carlin (1999), marginalizing out the ran-dom effects improves the mixing behavior of the MCMC.However, this leads to a full covariance structure for y as:

    Σ(y(s)

    ) = σ 2y[Cn,m(κ)C

    −1m (κ)Cm,n(κ)

    + Diag{In − Cn,m(κ)C−1m (κ)Cm,n(κ)}]

    .

    We can use S-W-M type matrix computations for calculat-ing the determinant and inverse of this covariance matrix anduse that in drawing from posterior distributions of κ and σ 2y .

    Posterior samples of w(s0) can be drawn afterwards froma multivariate normal distribution as evident from (6). Re-gression parameters c1k, c2k and βπ can also be updated us-ing standard sampling steps. The most important componentof this MCMC is updating the spline parameters appearingin μy and μπ which has been performed using a reversiblejump (Richardson and Green 1997) scheme described be-low.

    We start with μy , the mean function for y. We drop thesuffix y for notational simplicity. With k basis functions,let αk = {nh,uh,vh, νh, th}kh=1 be the corresponding set ofspline parameters. Marginalizing out ν and σ 2, the distri-bution for y(s), p(y(s)|k,αk, . . .), can be written in closedform (see Appendix). Now, using a suitable proposal dis-tribution q , propose a dimension changing move (k,αk) →(k′, αk′). We consider three types of possible moves (i) birth:addition of a basis function; (ii) death: deletion of an exist-ing basis function; and (iii) change: modification of an ex-isting basis function. Thus k′ ∈ {k − 1, k, k + 1}. The accep-tance ratio for such a move is given by

    pk→k′ = min(

    1,p(y(s)|k′, αk′ , . . .)p(y(s)|k,αk, . . .)

    p(αk′ |k′)p(k′)p(αk|k)p(k)

    × q{(k′, αk′) → (k,αk)}

    q{(k,αk) → (k′, αk′)})

    .

    First, we mention the prior for (k,αk) in the form ofp(αk|k)p(k). As specified above, (k + 1) has a Poisson(λ)prior truncated at some upper bound k0. Within each con-stituent local polynomial, nh controls its order, vh controlsthe set of variables involved whereas uh and th determine thesigns and the position of the sp-knot. If p is the total num-ber of covariates and we allow interactions up to 2nd order,then the number of possible choices for a basis function (ex-

    cluding the constant function) is N = p +p + (p2) = p2+3p2 .

    Accounting for rearrangement of the same set of basis func-tions, the number of distinct k-set basis functions is Nk/k!.Once we determine a basis function, we choose the individ-ual coordinates of that sp-knot uniformly from the availabledata points (since a change in pattern can only be detected atdata points) and determine its sign to be positive or negative

  • Stat Comput (2015) 25:389–405 397

    with probability 1/2 each. Thus we obtain:

    p(αk|k) ∝ Nk

    k! (1/2n)∑k

    h nh .

    Although above we assumed that all covariates have ndistinct values to locate a sp-knot, modifications can bemade easily when this is not the case.

    Next we specify the proposal distribution q(·, ·) for eachof the three moves as follows:

    (i) First decide on the type of move to be proposed withprobabilities bk (birth), dk (death) and ck (change),bk +dk +ck = 1. We put dk = 0, ck = 0 if k = 1, bk = 0if k = k0.

    (ii) For a birth move, choose a new basis function ran-domly from the N -set, calculate its order nh and choosethe location of its sp-knot and signs as before withprobability ( 12n )

    nh .(iii) The death move is performed by randomly removing

    one of the k − 1 existing basis functions (excluding theconstant basis function).

    (iv) A change move consists of choosing an existing non-constant basis function randomly and alter its sign andcorresponding sp-knot.

    From above, we have

    q((k,αk) →

    (k′, αk′

    )) =

    ⎧⎪⎨

    ⎪⎩

    bk1N

    ( 12n)nk+1 k′ = k + 1,

    dk1

    k−1 k′ = k − 1,

    ck1

    k−1 (1

    2n )nh k′ = k.

    Above, for the ‘change’ step, h denotes the index of ba-sis function that has been randomly chosen for change. Theacceptance ratios for different types of move can be workedout from this. Set k = k′, αk = αk′ if the move is accepted,leave them unchanged otherwise. Subsequently, νy can beupdated using the k-variate t distribution with degrees offreedom d = n + 2aσ , mean μk , dispersion c0kΣkd , whoseexpressions are given (with derivation) in Appendix. Theupdating scheme for the spline parameters in μπ is simi-lar except for the fact that we need to marginalize over νπonly as z has a known variance (set to 1) as in (6).

    4.3 Posterior inference

    The principal objective of this data analysis is to understandthe precipitation pattern over the region D. For that, we cre-ate spatial maps of (i) expected rainrate {π(s) exp{Y(s)} :s ∈ D}; and (ii) probability of rainfall {π(s) : s ∈ D} usingthe posterior samples. There can be locations inside D withno available measurements from any of the L satellites thatdo not have any likelihood contribution. The realizations ofthe rainrate process Y at those locations are constructed us-ing the predictive distributions of a GP. Since Gaussian pro-cesses can capture a wide range of dependencies, using them

    in a hierarchical setting enhances predictive performance forthe model. Mathematically, if only n out of N sites haveat least one satellite measurement then inference for the re-maining (N −n) sites is done from their posterior predictivedistributions. If sp = {sn+1, sn+2, . . . , sN } denotes the set oflocations with no precipitation data, the foregoing predictiveprocess approximation yields

    y(sp) = μy(sp) + σy{CN−n,m(κ)C−1m (κ)w

    (s0

    ) + ε∗(sp)},

    ε∗(sp) ∼ NN−n(0,Diag

    {IN−n − CN−n,m(κ)C−1m (κ)

    × Cm,N−n(κ)})

    .

    (7)

    As we mentioned earlier, the advantage of using bias cor-rection is evident here since conditional on w(s0) and κ , wecan draw samples from the posterior predictive distributionof Y(sp), independent of each other and also of Y(s) (con-ditional on realizations of w(s0)) due to the independenceamong ε∗s across locations. This is computationally veryefficient since we do not need to draw from a high dimen-sional multivariate Gaussian distribution if we want to studya larger region and require predicting the rainrate surface atthousands of sites with no satellite readings. Also of inter-est is the posterior estimate of {μy(s) : s ∈ D}, which pro-vides an idea of localized patterns in the rainrate over D. Allthese diagnostics are provided with the real data analysis inSect. 5.

    5 Data analysis

    We proceed to application of the variable-knot approach de-scribed in Sects. 3 and 4. First, in Sect. 5.1 we carry out sim-ulation studies to highlight the improvement in predictiveperformance under the proposed method relative to fixed-knot predictive process models. Then, in Sect. 5.2, we an-alyze an actual precipitation dataset from Northern SouthAmerica.

    5.1 Simulation study

    In the discussion following (4) and (5), we have arguedthat one of the key advantage of a MARS-based covariancemodel is its ability to capture a wide range of spatial struc-tures in the response surface. To justify that numerically, weuse synthetic datasets from two different models.

    For simulation, we fix the input space X to be the unithypercube in R4. The input points x = (x1, x2, x3, x4)T aredrawn uniformly over X and the response, Y(x), is simu-lated from a Gaussian process with a nonstationary covari-ance function as follows:

    E[Y(x)

    ] = β0;Cov

    [Y(x), Y

    (x′

    )] = σ 20 Ix=x′ + xT Ω0x′,(A)

  • 398 Stat Comput (2015) 25:389–405

    Table 1 Comparison ofpredictive performance fordifferent spatial models

    Simulationmodel

    Predictionproperty

    Model for estimation

    MARSw/pp-knots

    Predictive process with

    same # pp-knots 2× # pp-knots 3× # pp-knots

    (A) Abs. Bias 0.099 0.104 0.107 0.112

    Pred. Uncertainty 0.570 0.626 0.603 0.603

    Coverage Propn. 92.100 91.500 91.000 90.600

    (B) Abs. Bias 0.048 0.271 0.195 0.137

    Pred. Uncertainty 0.539 1.937 1.536 1.374

    Coverage Propn. 96.200 93.500 93.800 94.500

    where IA = 1 if A is true, 0 otherwise. For any σ 20 > 0 andany positive definite matrix Ω0, it is easy to verify that theabove is a valid covariance function.

    In the second example, keeping X unchanged, we moveto a more general functional form for Y(x) as follows:

    Y(x) = β0 + β1x51 + log(1 + x22

    ) + β2x3 sin(πx21

    )

    + Ix4

  • Stat Comput (2015) 25:389–405 399

    Fig. 2 (Left) Map of the study region showing number of satellite measurements available across different sites during 1.30 a.m.–4.30 a.m. onJanuary 1st, 2008. (Right) Location of the study region in South America

    America (Fig. 2) lying south of the equator, inside the rect-angle [70◦W,40◦W ]× [20◦S,0◦]. This region is chosen be-cause of its large average rainfall rate, which helps to reducesampling errors, and its regular diurnal variability. A longterm mean rainrate pattern over 13 years for this region ispresented in Fig. 3. In the current work, we select a 3-hourtime window, 1.30 a.m.–4.30 a.m. on January 1st, 2008.Data are available from two satellites—TMI and AMSU-NOAA17. Other satellites did not make any observation inthis region during the specified time interval. Ground-basedobservations by gauges or radar are very sparse in the Ama-zon basin. From Fig. 2, it is evident that some of the siteshave multiple rainrate measurements as they fell inside theintersection of trajectories of both satellites during that timewindow whereas some parts of the region were not coveredby any of the satellites. Thus, estimation of rainrate at ob-served locations, from either one or both of the satellites aswell as prediction at unmapped sites are necessary to createa complete precipitation map for this region.

    We start with some empirical summaries of the raw data.Rainrate measurements from both of the satellites are avail-able at 2026 of a total of 9600 sites in the region whereas2305 of them have no available data. During the time win-dow, TMI was able to cover 5653 sites whereas AMSU-NOAA17 covered 3668 sites. About 17 % of all satellitemeasurements recorded positive precipitation, the rest beingall zero. Figure 4 provides a comprehensive representationof the precipitation measurements collected during that timeinterval from the above region.

    We analyze the data using the model from (6). We usediffused prior specifications for model parameters. Regres-sion coefficients such as βπ, c11 and c21, are assigned a

    N(0,100) prior. For variance parameters, we use an InverseGamma(2,4) prior. For the sp-knots in μy and μπ , we alloweach of them to have at most k0 = 30 nonconstant local func-tions, i.e. each of ky and kπ is assigned an 1 + Poisson(4)prior truncated between [1,31]. As we show later, this rangeturns out to be adequate for our dataset. We consider localpolynomials of only upto second order, i.e. nh has a uni-form prior on {1,2}. We select m = 100 locations within theregion as the pp-knots. For the correlation parameter κ , wefirst select a possible set of values for the spatial range R anduse a uniform distribution over equidistant points in that set.As mentioned in Sect. 4.2, this leads to a discrete uniformprior for κ . The MCMC is run for 15000 iterations, discard-ing the first 5000 draws and thinning the rest at every 5thdraw.

    Before providing the posterior summaries for precipita-tion patterns, we perform a validation step by comparingmodel-based predictions with corresponding true observa-tions. For that, we randomly remove a “test” set that con-sists of 130 and 122 sites from the region with positive pre-cipitation records from TMI and AMSU-NOAA17, respec-tively. We treat those sites as having no available measure-ment, predict the latent process Y as in Sect. 4.3 and, trac-ing back the hierarchy in (6), regenerate the samples for the“observed” rainrates Ỹ at each of those locations for each ofthe satellites using 2000 thinned MC samples of model pa-rameters. The samples are summarized in form of predictivemean and credible set. As with the simulation studies, wecompute three measures of predictive performance: (i) abso-lute bias (ii) predictive uncertainty and (iii) coverage statusfor every Ỹ in the test set. In Table 2 we present, for each

  • 400 Stat Comput (2015) 25:389–405

    Fig. 3 Climatological meansurface rainrate for January forthe period 1998–2010 from theTRMM MultisatellitePrecipitation Analysis (Huffmanet al. 2007). The domain for thisstudy is indicated by the blackrectangle box. The meanrainrate field is relativelysmooth across the study domain,except in the southwest corner,where orographic effects fromthe Andes Mountains areapparent (Color figure online)

    Fig. 4 Grid-level rainratemeasurements from (top) TMIand (bottom) AMSU-NOAA17satellites during [1:30,4:30] a.m. on January 1, 2008(Color figure online)

  • Stat Comput (2015) 25:389–405 401

    satellite, the summary of these measures—the mean abso-lute bias, the mean predictive uncertainty and the coverageproportion.

    The results turn out to be satisfactory as the empiricalcoverage rates for both satellites exceed 89 %. As expected,the TMI measurements have significantly better estimatesof error and reduced uncertainty of prediction over those ofAMSU-NOAA17 because the former has been used as thereference standard for this data analysis. Now, we provideposterior summary statistics for some important model pa-rameters in Table 3.

    Table 2 Predictive performance of both satellites on test dataset

    Satellite Mean absolutebias (in mm/hr)

    Mean predictiveuncertainty(in mm/hr)

    Coverage proportionfor 90 % credible sets

    TMI 1.039 3.765 0.892

    AMSU-NOAA17

    1.697 5.738 0.893

    Combined 1.357 4.720 0.893

    The number of sp-knots for the latent log-rainrate processas well as for the probability of no precipitation are well-within their assigned range of [1, 31]. The multiplicativefactor for the AMSU-NOAA17, c2 is marginally above 1.The effect of the latent log rainrate process Y on the event ofrainfall is parametrized by βπ and turns out to be significant.Next, we look at the spatial maps for rainrate summaries.First, we present the pointwise estimates for the exp[Y ] andπ surfaces in Fig. 5. Then, the posterior mean and uncer-tainty estimates (90 % credible set width) of the expectedrainrate process, as defined in Sect. 4.3, are included in

    Table 3 Posterior summaries of important model parameters

    Parametertype

    Parametername

    Pointestimate

    90 % posteriorcredible interval

    Latentprocess-specific

    ky 21 [17, 26]

    kπ 8 [5, 12]

    βπ 1.392 [1.271, 1.582]

    AMSU-NOAA17-specific

    c1 0.332 [0.245, 0.429]

    c2 1.090 [0.999, 1.187]

    Fig. 5 Posterior surfaceestimates of (top) probability ofrainfall π and (bottom) potentialrainrate exp[Y ] (Color figureonline)

  • 402 Stat Comput (2015) 25:389–405

    Fig. 6 Posterior estimatedsurfaces for (top) expectedrainrate and (bottom) itsuncertainty (width of pointwise90 % credible sets) (Color figureonline)

    Fig. 6. To highlight the contribution of sp-knot based func-tions, we present the posterior maps of exp{μy(·)} and μπin Fig. 7.

    Precipitation is mostly concentrated in the west-centralpart of the region and decreases as one moves towards theocean in the east. In Fig. 5, the probability map shows higherchance of observing precipitation in the central part of theregion as a result of smoothing effect of high rainfall ob-servations in the surrounding regions from both satellites.Figure 6 shows patches of region with relatively higher ex-pected rainrate. The highs and lows of the uncertainty esti-mates are often related to those of the corresponding pointestimates. This is a natural property of the log-Gaussianmodel (also other models like Gamma) due to the inter-dependence between mean and variance. The uncertaintyestimates are reflective of typically high variability (as wellas lack of spatial smoothness) of precipitation over a shorttime window. Finally, Fig. 7 shows bands of piecewise ho-mogeneous regions created by the collection of sp-knotsand associated local polynomials. The estimated surface forexp[μy] contains relatively higher number of localized pat-terns than the surface for μπ . This is justified by Table 3,which shows that the posterior probability mass function for

    ky puts greater weight on larger values within the [1,31]range than the one for kπ .

    6 Summary and future work

    In this article, we presented a novel hierarchical model tocombine precipitation measurements from multiple satel-lites. To capture a wide range of localized as well as large-scale spatial patterns in the underlying potential rainrate pro-cess, a flexible random knot-based mean function has beenused in combination with a stationary residual in the logscale. The method was adjusted to handle a large numberof observation locations using a predictive process approxi-mation, making it applicable to studies involving larger re-gions. However, it is likely for a larger domain to containheterogeneous subregions, e.g., land-sea boundaries or re-gions separated by mountain stretches, that may experiencedifferent rainfall patterns. Whereas the MARS specification,employed in this work, can be really useful for these sit-uations (since it does not make assumptions regarding anyglobal pattern of correlation in the response surface), it isworth exploring alternative modeling ideas that can account

  • Stat Comput (2015) 25:389–405 403

    Fig. 7 Maps of sp-knot basedsurfaces—(top) exp[μy ] and(bottom) μπ constructed fromtheir respective posteriorsamples (Color figure online)

    for boundary effects between nonhomogeneous regions. An-other extension lies in extending the 3-hr window to a largertime interval like a day or a week that brings a temporal pat-tern to the data. The associated spatio-temporal process canbe developed as an extension to the current spatial model forY in a number of different ways. Limitations for such spec-ifications with respect to restricted model assumptions (e.g.separability across space and time), computational load orintuitive interpretability need to be compared for preferringone of them over the others. Another important informationthat may significantly improve rainfall prediction is the useof associated covariate data. Covariates can be of differenttypes: (i) climate features such as temperature, wind speed,wind direction etc., (ii) geographic information such as ele-vation and (iii) measures of human intervention, e.g., forestcover, emission rates of pollutants etc. which are believedto influence rainfall in the long run. Inclusion of appropri-ate local covariate information is often useful to explain thenonstationarity, thus eliminating the need for complex mod-els.

    Acknowledgements The authors acknowledge the Texas A&M Uni-versity Brazos HPC cluster that contributed to the research reportedhere (http://brazos.tamu.edu).

    Appendix: Marginalizing out νy and σ 2y for estimationof spline parameters in μy(s)

    Denote by . . . all parameters except ν,σ 2y . Let P =[φ1[x(s)], φ2[x(s)], . . . , φk[x(s)]], S = y(s) − Pνy . Wehave,

    p(y(s)| . . .)

    ∝∫

    νy

    σ 2y

    p(y(s)|νy, σ 2y , . . .

    )p(νy |σ 2y

    )p(σ 2y

    )dσ 2y dνy,

    ∝ (2πτ 2y)−k/2

    νy

    σ 2y

    (σ 2y

    )− n+k2 −aσ −1

    × exp[

    − 12σ 2y

    (ST D−1S + νTy νy/τ 2y + 2bσ

    )]

    dσ 2y dνy,

    ∝ (2πτ 2y)−k/2

    Γ

    (n

    2+ aσ

    )

    νy

    (ST D−1S + νTy νy/τ 2y

    2+ bσ

    )− n+m+k2 −aσdνy.

    http://brazos.tamu.edu

  • 404 Stat Comput (2015) 25:389–405

    Now write ST D−1S + νTy νy = νTy Aνy − 2νTy B + C, whereA = P T D−1P + Ik

    τ 2y, B = P T D−1Sy , C = STy D−1Sy . Then

    we have, ST D−1S + νTy νy + 2bσ = (νy − μk)T Σ−1k (νy −μk) + c0k , where μk = A−1B,Σk = A−1, c0k = C −bT A−1b + 2bσ . Denote d = n + 2aσ . Thenp(y(s)| . . .)

    ∝ (πτ 2y)−k/2

    c− d+k20k Γ

    (d + k

    2

    )∫

    νy

    [1

    d(νy − μk)T

    ×(

    c0kΣk

    d

    )−1(νy − μk) + 1

    ]− d+k2dνy.

    The integrand is the pdf (up to a constant) for the k-variate tdistribution with mean μk , dispersion

    c0kΣkd

    and degrees offreedom d . Hence, we obtain the closed form expression for

    p(y(s)| . . .) ∝ (τ 2y )−k/2c−d2

    0k |Σk|1/2.

    References

    Agarwal, D.K., Gelfand, A.E., Citron-Pousty, S.: Zero-inflated modelswith application to spatial count data. Environ. Ecol. Stat. 9, 341–355 (2002)

    Albert, J.H., Chib, S.: Bayesian analysis of binary and polychotomousresponse data. J. Am. Stat. Assoc. 88(422), 669–679 (1993)

    Anderes, E.B., Stein, M.L.: Estimating deformations of isotropic Gaus-sian random fields on the plane. Ann. Stat. 36, 719–741 (2008)

    Austin, P.M., Houze, R.A.: Analysis of the structure of precipitationpatterns in New England. J. Appl. Meteorol. 11, 926–935 (1972)

    Ba, M.B., Gruber, A.: Goes multispectral rainfall algorithm (gmsra). J.Appl. Meteorol. 40, 1500–1514 (2001)

    Banerjee, S.: On geodetic distance computations in spatial modeling.Biometrics 61(2), 617–625 (2005)

    Banerjee, S., Gelfand, A.E., Knight, J.R., Sirmans, C.F.: Spatial mod-eling of house prices using normalized distance-weighted sums ofstationary processes. J. Bus. Econ. Stat. 22(2), 206–213 (2004)

    Banerjee, S., Gelfand, A., Finley, A., Sang, H.: Gaussian predictiveprocess models for large spatial data sets. J. R. Stat. Soc. B 70(4),825–848 (2008)

    Bardossy, A., Plate, E.J.: Space-time model for daily rainfall using at-mospheric circulation patterns. Water Resour. Res. 28(5), 1247–1259 (1992)

    Bell, T.L., Kundu, P.K.: A study of the sampling error in satellite rain-fall estimates using optimal averaging of data and a stochasticmodel. J. Climate 9, 1251–1268 (1996)

    Bell, T.L., Abdullah, A., Martin, R.L., North, G.R.: Sampling errorsfor satellite-derived tropical rainfall: Monte Carlo study usinga space-time stochastic model. J. Geophys. Res. 95(D3), 2195–2205 (1990)

    Bell, T.L., Kundu, P.K., Kummerow, C.D.: Sampling errors of ssm/iand trmm rainfall averages: comparison with error estimates fromsurface data and a simple model. J. Appl. Meteorol. 40, 938–954(2001)

    Chakraborty, A., Gelfand, A.E., Wilson, A.M., Latimer, A.M., Silan-der, J.A.: Modeling large scale species abundance with latent spa-tial processes. Ann. Appl. Stat. 4(3), 1403–1429 (2010)

    Chakraborty, A., Mallick, B.K., McClarren, R.G., Kuranz, C.C., Bing-ham, D.R., Grosskopf, M.J., Rutter, E., Stripling, H.F., Drake,R.P.: Spline-based emulators for radiative shock experiments withmeasurement error. J. Am. Stat. Assoc. 108, 411–428 (2013)

    Chib, S., Carlin, B.P.: On mcmc sampling in hierarchical longitudinalmodels. Stat. Comput. 9(1), 17–26 (1999)

    Cohen, A.C.: Truncated and Censored Samples, 1st edn. MarcelDekker, New York (1991)

    Cooley, D., Nychka, D., Naveau, P.: Bayesian spatial modeling of ex-treme precipitation return levels. J. Am. Stat. Assoc. 102(479),824–840 (2007)

    Cressie, N., Johannesson, G.: Fixed rank kriging for very large spatialdata sets. J. R. Stat. Soc. B 70(1), 209–226 (2008)

    Denison, D.G.T., Mallick, B.K., Smith, A.F.M.: Bayesian mars. Stat.Comput. 8(4), 337–346 (1998)

    Felgate, D.G., Read, D.G.: Correlation analysis of the cellular structureof storms observed by raingauges. J. Hydrol. 24, 191–200 (1975)

    Finley, A., Sang, H., Banerjee, S., Gelfand, A.: Improving the perfor-mance of predictive process modeling for large datasets. Comput.Stat. Data Anal. 53(8), 2873–2884 (2009)

    Finley, A.O., Banerjee, S., MacFarlane, D.W.: A hierarchical model forquantifying forest variables over large heterogeneous landscapeswith uncertain forest areas. J. Am. Stat. Assoc. 106(493), 31–48(2011)

    Friedman, J.: Multivariate adaptive regression splines. Ann. Stat. 19(1),1–67 (1991)

    Fuentes, M.: Spectral methods for nonstationary spatial processes.Biometrika 89, 197–210 (2002)

    Fuentes, M., Reich, B., Lee, G.: Spatial-temporal mesoscale modellingof rainfall intensity using gauge and radar data. Ann. Appl. Stat.2, 1148–1169 (2008)

    Furrer, R., Genton, M.G., Nychka, D.: Covariance tapering for inter-polation of large spatial datasets. J. Comput. Graph. Stat. 15(3),502–523 (2006)

    Gelfand, A.E., Kim, H.J., Sirmans, C.F., Banerjee, S.: Spatial modelingwith spatially varying coefficient processes. J. Am. Stat. Assoc.98(462), 387–396 (2003)

    Gelfand, A.E., Banerjee, S., Finley, A.O.: Spatial design for knot se-lection in knot-based dimension reduction models. In: Mateu, J.,Müller, W.G. (eds.) Spatio-Temporal Design: Advances in Effi-cient Data Acquisition, pp. 142–169. Wiley, Chichester (2012)

    Guhaniyogi, R., Finley, A.O., Banerjee, S., Gelfand, A.E.: AdaptiveGaussian predictive process models for large spatial datasets. En-vironmetrics 22(8), 997–1007 (2011)

    Higdon, D.: A process-convolution approach to modelling tempera-tures in the North Atlantic Ocean. Environ. Ecol. Stat. 5(2), 173–190 (1998)

    Higdon, D.: Space and space-time modeling using process convolu-tions. In: Anderson, C., Barnett, V., Chatwin, P.C., El-Shaarawi,A.H. (eds.) Quantitative Methods for Current Environmental Is-sues, pp. 37–56. Springer, London (2002)

    Higdon, D., Swall, J., Kern, J.: Non-stationary spatial modeling. In:Bernardo, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M. (eds.)Bayesian Statistics, vol. 7, pp. 181–197. Oxford University Press,Oxford (1999)

    Huffman, G.J., Adler, R.F., Stocker, E.F., Bolvin, D.T., Nelkin, E.J.:A trmm-based system for real-time quasi-global merged precip-itation estimates. In: TRMM International Science Conference,Honolulu, pp. 22–26 (2002)

    Huffman, G.J., Adler, R.F., Bolvin, D.T., Gu, G., Nelkin, E.J., Bow-man, K.P., Hong, Y., Stocker, E.F., Wolef, D.B.: The trmm mul-tisatellite precipitation analysis (tmpa): quasi-global, multiyear,combined-sensor precipitation estimates at fine scales. J. Hydrom-eteorol. 8, 38–55 (2007)

    Joyce, R.J., Janowiak, J.E., Arkin, P.A., Xie, P.: Cmorph: a method thatproduces global precipitation estimates from passive microwaveand infrared data at high spatial and temporal resolution. J. Hy-drometeorol. 5, 487–503 (2004)

    Jun, M.: Non-stationary cross-covariance models for multivariate pro-cesses on a globe. Scand. J. Stat. 38, 726–747 (2011)

  • Stat Comput (2015) 25:389–405 405

    Jun, M., Stein, M.L.: Nonstationary covariance models for global data.Ann. Appl. Stat. 2(4), 1271–1289 (2008)

    Kammann, E.E., Wand, M.P.: Geoadditive models. J. R. Stat. Soc., Ser.C, Appl. Stat. 52(1), 1–18 (2003)

    Kaufman, C.G., Schervish, M.J., Nychka, D.W.: Covariance taperingfor likelihood-based estimation in large spatial data sets. J. Am.Stat. Assoc. 103(484), 1545–1555 (2008)

    Kidd, C.: Satellite rainfall climatology: a review. Int. J. Climatol. 21,1041–1066 (2001)

    Lee, G.W., Zawadzki, I.: Variability of drop size distributions: time-scale dependence of the variability and its effects on rain estima-tion. J. Appl. Meteorol. 44, 241–255 (2005)

    Lethbridge, M.: Precipitation probability and satellite radiation data.Mon. Weather Rev. 95(7), 487–490 (1967)

    Marchenko, Y.V., Genton, M.G.: Multivariate log-skew-elliptical dis-tributions with applications to precipitation data. Environmetrics21(3–4), 318–340 (2010)

    McConnell, A., North, G.R.: Sampling errors in satellite estimates oftropical rain. J. Geophys. Res. 92(D8), 9567–9570 (1987)

    Negri, A.J., Xu, L., Adler, R.F.: A trmm-calibrated infrared rainfallalgorithm applied over Brazil. J. Geophys. Res. 107(D20), 8048–8062 (2002)

    Paciorek, C., Schervish, M.: Spatial modelling using a new class ofnonstationary covariance functions. Environmetrics 17, 483–506(2006)

    Richardson, S., Green, P.J.: On Bayesian analysis of mixtures with anunknown number of components. J. R. Stat. Soc. B 59(4), 731–792 (1997)

    Rodríguez-Iturbe, I., Mejía, J.M.: The design of rainfall networks intime and space. Water Resour. Res. 10, 713–728 (1974)

    Sampson, P.D., Guttorp, P.: Nonparametric estimation on nonstation-ary spatial covariance structure. J. Am. Stat. Assoc. 87, 108–119(1992)

    Sang, H., Gelfand, A.E.: Hierarchical modeling for extreme values ob-served over space and time. Environ. Ecol. Stat. 16(3), 407–426(2009)

    Sang, H., Huang, J.Z.: A full scale approximation of covariance func-tions for large spatial data sets. J. R. Stat. Soc. B 74(1), 111–132(2012)

    Schmidt, A.M., O’Hagan, A.: Bayesian inference for non-stationaryspatial covariance structure via spatial deformations. J. R. Stat.Soc. B 65, 743–758 (2003)

    Simpson, J., Adler, R.F., North, G.R.: A proposed Tropical RainfallMeasuring Mission (TRMM) satellite. Bull. Am. Meteorol. Soc.69(3), 278–295 (1988)

    Sorooshian, S., Hsu, K.L., Gao, X., Gupta, H., Imam, B., Braith-waite, D.: Evaluation of Persiann system satellite-based estimatesof tropical rainfall. Bull. Am. Meteorol. Soc. 81(9), 2035–2046(2000)

    Stein, M., Chi, Z., Welty, L.: Approximating likelihoods for large spa-tial data sets. J. R. Stat. Soc. B 66, 275–296 (2004)

    Sun, Y., Li, B., Genton, M.G.: Geostatistics for large datasets. In:Porcu, E., Montero, J.M., Schlather, M. (eds.) Advances and Chal-lenges in Space-Time Modelling of Natural Events, vol. 207,pp. 55–77. Springer, Berlin (2012)

    Tanner, T.A., Wong, W.H.: The calculation of posterior distributionsby data augmentation. J. Am. Stat. Assoc. 82, 528–549 (1987)

    Vicente, G.A., Scofield, R.A., Menzel, W.P.: The operational goes in-frared rainfall estimation technique. Bull. Am. Meteorol. Soc.79(9), 1883–1898 (1998)

    Weng, F.W., Zhao, L., Ferraro, R., Pre, G., Li, X., Grody, N.C.: Ad-vanced microwave sounding unit (amsu) cloud and precipitationalgorithms. Radio Sci. 38(4), 8068–8079 (2003)

    Wilheit, T.T.: A satellite technique for quantitatively mapping rainfallrates over the ocean. J. Appl. Meteorol. 16, 551–560 (1977)

    Wilheit, T.T., Chang, A.T.C., Rao, M.S.V., Rodgers, E.B., Theon, J.S.:A satellite technique for quantitatively mapping rainfall rates overthe oceans. J. Appl. Meteorol. 16(5), 551–560 (1977)

    Xie, P., Arkin, P.A.: Global monthly precipitation estimates fromsatellite-observed outgoing longwave radiation. J. Climate 11,137–164 (1998)

    An adaptive spatial model for precipitation data from multiple satellites over large regionsAbstractIntroductionCollection of satellite measurements on precipitationHierarchical spatial model for rainrateDetails of estimation and inferenceKnot-based approximation for large datasetMCMC from the complete hierarchical modelPosterior inference

    Data analysisSimulation studyMulti-satellite precipitation data

    Summary and future workAcknowledgementsAppendix: Marginalizing out nuy and sigma2y for estimation of spline parameters in µy(s)References