


Evaluation and Transferability of the Noah Land Surface Model in Semiarid Environments

TERRI S. HOGUE

Department of Hydrology and Water Resources, The University of Arizona, Tucson, Arizona

LUIS BASTIDAS

Department of Civil and Environmental Engineering, Utah State University, Logan, Utah

HOSHIN GUPTA AND SOROOSH SOROOSHIAN*

Department of Hydrology and Water Resources, The University of Arizona, Tucson, Arizona

KEN MITCHELL

NOAA/NWS/National Centers for Environmental Prediction, Camp Springs, Maryland

WILLIAM EMMERICH

USDA, Agricultural Research Service, Tucson, Arizona

(Manuscript received 16 February 2003, in final form 10 August 2004)

ABSTRACT

This paper investigates the performance of the National Centers for Environmental Prediction (NCEP) Noah land surface model at two semiarid sites in southern Arizona. The goal is to evaluate the transferability of calibrated parameters (i.e., direct application of a parameter set to a “similar” site) between the sites and to analyze model performance under the various climatic conditions that can occur in this region. A multicriteria, systematic evaluation scheme is developed to meet these goals. Results indicate that the Noah model is able to simulate sensible heat, ground heat, and ground temperature observations with a high degree of accuracy, using the optimized parameter sets. However, there is a large influx of moist air into Arizona during the monsoon period, and significant latent heat flux errors are observed in model simulations during these periods. The use of proxy site parameters (transferred parameter set), as well as traditional default parameters, results in diminished model performance when compared to a set of parameters calibrated specifically to the flux sites. Also, using a parameter set obtained from a longer-time-frame calibration (i.e., a 4-yr period) results in decreased model performance during nonstationary, short-term climatic events, such as a monsoon or El Niño. Although these results are specific to the sites in Arizona, it is hypothesized that these results may hold true for other case studies. In general, there is still the opportunity for improvement in the representation of physical processes in land surface models for semiarid regions. The hope is that rigorous model evaluation, such as that put forth in this analysis, and studies such as the Project for the Intercomparison of Land-Surface Processes (PILPS) San Pedro–Sevilleta, will lead to advances in model development, as well as parameter estimation and transferability, for use in long-term climate and regional environmental studies.

1. Introduction

Numerous experiments have been carried out to evaluate land surface models with a goal of facilitating advances in model development. Many of these comparison studies have been undertaken by the Project for the Intercomparison of Land-Surface Processes (PILPS; Pitman et al. 1999; Henderson-Sellers et al. 1993, 1995) under the auspices of the Global Energy and Water Cycle Experiment (GEWEX). However, few of the evaluations have been carried out in semiarid regions. Understanding the interaction of land surface processes with climate is crucial for predicting the availability of water resources (i.e., groundwater and surface water sources) in arid regions. The PILPS San Pedro–Sevilleta experiment (Bastidas et al. 2004) is being undertaken using five semiarid vegetation sites in the southwestern United States (including two from this study). Length of data, seasonality, short-term climatic events, and data quality are all issues that can impact the estimation of model parameters. This analysis tests some of the hypotheses for the calibration and cross-validation schemes proposed for the PILPS intercomparison study.

* Current affiliation: Department of Civil and Environmental Engineering, University of California, Irvine, Irvine, California.

Corresponding author address: Terri S. Hogue, Department of Civil and Environmental Engineering, 5732C Boelter Hall, University of California, Los Angeles, Los Angeles, CA 90095-1593. E-mail: [email protected]

68 JOURNAL OF HYDROMETEOROLOGY VOLUME 6

© 2005 American Meteorological Society

JHM402

Growing interest in the study of semiarid systems has led to the formation of several interdisciplinary groups instituted to further the understanding of hydrologic, ecologic, and atmospheric processes in semiarid basins, including the Semi-Arid Land-Surface-Atmosphere (SALSA) program (Goodrich et al. 2000) and, more recently, the National Science Foundation (NSF)-funded Science and Technology Center on the Sustainability of Semi-Arid Hydrology and Riparian Areas (SAHRA) (Sorooshian et al. 2002). Under these programs, several flux tower and data collection sites have been set up to investigate the coupling of surface and climate processes and to investigate the interactions of surface and groundwater systems. The availability of 4 yr of data from two distinct semiarid vegetation types allows us to study the diversity of these environments and to analyze what degree of differentiation (and, hence, parameter estimation) is needed for modeling land–atmosphere interactions in semiarid regions. The objectives of our work are threefold: 1) to rigorously evaluate the performance of the Noah model in semiarid regions, 2) to analyze transferability of the model in semiarid regions (i.e., the ability to directly transfer parameters to another site of similar climatic and vegetation conditions), and 3) to evaluate the ability of the model to capture variations in the energy and water balance due to changes in climate forcings.

A background discussion is presented in the following section. The study area and details of the region are specified in section 3. Methods are described in section 4, along with a brief overview of the Noah model and the optimization program used for parameter estimation. Results are presented in section 5, with a discussion and conclusions in section 6.

2. Background

The semiarid southwestern United States has experienced significant changes within the last century. Increasing population has had dramatic and varied impacts on the region’s ecosystems. There have been significant changes in the distribution and type of vegetation found in the area, with large areas of native grasses being replaced by Chihuahuan Desert shrubs and mesquite trees. Human activities such as ranching, agriculture, urban development, fire suppression, and groundwater mining have influenced these transformations (Chehbouni et al. 2000). It is theorized that feedback influences on the local and regional climate have caused a reduction in evaporation losses from the surface to the atmosphere (Qi et al. 2000).

The scale at which surface fluxes are investigated is crucial to understanding the dynamics of land–atmosphere interactions (Rodriguez-Iturbe et al. 2001). An evaluation of a land surface model’s ability to capture high-frequency variations in the water and energy budget (i.e., diurnal processes) may be better suited for study at a finer spatial resolution, or point scale, before evolving to simulations of regional-scale water balance, or global climate. Various land surface modeling studies are occurring at the point scale, but temporal aggregations are typically made to evaluate model performance at the monthly and yearly time scales (Wood et al. 1998; Schlosser et al. 2000; Boone et al. 2001). This study focuses on the point (or flux tower) scale and short time scales to allow for an unbiased evaluation of the Noah model’s performance.

Parameter estimation for land surface models is still traditionally done via a global land surface classification scheme with standard values assigned for various land cover or vegetation types. In many cases, calibration is problematic because data may not exist in regions where comprehensive land surface studies are undertaken, and little parameter estimation, and even less validation, can be done. The validation of a model has been defined as the “process of demonstrating that a site-specific model is capable of making accurate predictions for periods outside a calibration period” (Refsgaard and Knudsen 1996). A model is considered validated if the accuracy and prediction during the validation period (outside of calibration) are within what are defined as acceptable errors. The importance of calibration (whether manual or automatic) and validation within the land surface modeling community is becoming more accepted. This study will use a testing scheme adapted from Refsgaard and Knudsen (1996), along with previously developed multicriteria techniques (Bastidas 1998; Gupta et al. 1999; Bastidas et al. 2001), to obtain parameter values and conduct an objective model performance assessment (validation) at the observational time step of 20 min.

3. Study area

a. Walnut Gulch

The study sites are part of the Walnut Gulch Experimental Watershed in southeastern Arizona, a subbasin of the Upper San Pedro River basin, located in the borderland of southeastern Arizona and northeast Sonora, Mexico. The San Pedro basin is a broad, high-desert valley representing a transition between the Sonoran and Chihuahuan deserts. There is significant topographic (1100–2900 m) and vegetation variation within the basin, and the region experiences significant variability in climate. Vegetation types in the basin include a desert shrub steppe, grassland, oak savannah, riparian corridor, and ponderosa pine at the higher elevations. The woody plants in the region tend to be most active in spring or autumn (using deeper soil moisture reserves), while the C4 grass species respond quickly to upper soil moisture during the summer monsoon periods (Kemp 1983). Two specific study sites were developed within Walnut Gulch during 1996: the Lucky Hills site, with mixed desert shrub vegetation, and the Kendall site, which is a semiarid grassland (Fig. 1). The two sites are located approximately 10 km apart, with Lucky Hills at an elevation of 1372 m and Kendall at 1526 m. Further details regarding the soils and vegetation for these sites can be found in Emmerich (2003) and Scott (1999).

b. Hydroclimatology

Annual rainfall in the Upper San Pedro ranges from 300 to 750 mm yr⁻¹ (Goodrich et al. 2000). The mean annual (1 July 1893–31 July 2003 for Tombstone, Arizona) precipitation in the watershed is estimated at 356 mm yr⁻¹, and mean annual temperature is 17°C (Western Region Climate Center 2003). There are several distinct climate periods in the Southwest: the winter period, which receives a significant portion of the annual precipitation, a distinct dry season in spring (March through June), and the summer monsoon, typically from July through September, which brings convective storms to the region and typically delivers over half of the annual rainfall. Potential evaporation rates in the lower parts of the basin are estimated to be 10 times the annual rainfall (Goodrich et al. 2000). Statistics were gathered on the temporal variability of the precipitation time series on the two sites. Some of these data are displayed in Table 1. For the purposes of this study, the monsoon period (MON) is defined as the months of July, August, and September, and the winter period (WIN) is defined as the months of December, January, and February. Values for the study periods are shown, along with the percent of annual precipitation falling in the monsoon for each year (in parentheses).

Several climatic events occurred during the period of the study data, including an El Niño period (1997/98 winter) and an extremely wet monsoon period (1999 summer). The 1997/98 El Niño significantly increased the winter rainfall totals over Arizona during this period (Buizer et al. 2000). During the 1999 monsoon, Lucky Hills received 412 mm of rain, nearly 94% of the annual rainfall during this monsoon season, while Kendall received 315 mm of rain, which is 93% of the yearly total. The precipitation time series for each year are plotted in Fig. 2. Lines are drawn at 1 July and 30 September to designate the monsoon period.
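The monsoon percentages quoted above can be checked directly against the annual and monsoon totals reported in Table 1. The short sketch below is illustrative only: the dictionary layout and the function name `monsoon_percent` are assumptions introduced here, while the numbers are the Table 1 precipitation totals.

```python
# (annual total, monsoon total) in mm, copied from Table 1.
# The data structure and names are illustrative, not from the original study.
totals_mm = {
    ("Kendall", 1997): (333, 151),
    ("Kendall", 1998): (365, 250),
    ("Kendall", 1999): (337, 315),
    ("Kendall", 2000): (451, 158),
    ("Lucky Hills", 1997): (495, 241),
    ("Lucky Hills", 1998): (338, 197),
    ("Lucky Hills", 1999): (439, 412),
    ("Lucky Hills", 2000): (649, 282),
}


def monsoon_percent(annual_mm, monsoon_mm):
    """Percent of the annual precipitation that fell in Jul-Sep."""
    return 100.0 * monsoon_mm / annual_mm


for (site, year), (annual, monsoon) in sorted(totals_mm.items()):
    print(f"{site:11s} {year}: {monsoon_percent(annual, monsoon):.0f}%")
```

Rounded to the nearest percent, the 1999 values come out at 93% for Kendall and 94% for Lucky Hills, matching the figures in the text.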

c. Flux data

Micrometeorological measurements were initiated in 1996 at both the Kendall and Lucky Hills sites using a

FIG. 1. Kendall grassland and Lucky Hills desert shrub sites within the Walnut Gulch Experimental Watershed [photos courtesy of B. Emmerich and U.S. Department of Agriculture (USDA) Agricultural Research Service (ARS)].

TABLE 1. Temperature and precipitation totals for the designated study periods at the Kendall and Lucky Hills sites. Shown in parentheses are the percent of the yearly total precipitation for the monsoon period. The monsoon period (MON) is defined as Jul, Aug, and Sep, and the winter period (WIN) is defined as Dec, Jan, and Feb.

               Temperature data (°C)      Precipitation data (mm)
Period         Kendall   Lucky Hills      Kendall      Lucky Hills
4-yr avg       16.86     16.91            372          480
1997           16.63     16.79            333          495
1998           16.43     16.56            365          338
1999           17.08     16.99            337          439
2000           17.30     17.29            451          649
1997 MON       24.25     24.72            151 (45%)    241 (49%)
1998 MON       24.38     24.94            250 (68%)    197 (58%)
1999 MON       22.64     22.88            315 (93%)    412 (94%)
2000 MON       24.26     24.63            158 (35%)    282 (43%)
1997/98 WIN    6.65      6.53             132          171
1998/99 WIN    9.99      9.55             9.2          9.9
1999/00 WIN    9.95      9.4              6.1          12.4


Bowen ratio energy balance system. Continuous, 20-min water and carbon vapor flux measurements were collected for the four study years (1997–2000). Along with these flux measurements, standard meteorological data (shortwave radiation, air temperature, precipitation, wind speed, and relative humidity) were also collected. Data for both sites (4 yr) were subject to quality control with particular attention to extraneous values and Bowen ratio values approaching −1.0.
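The attention paid to Bowen ratio values near −1.0 follows from the standard Bowen ratio energy balance relations, LE = (Rn − G)/(1 + β) and H = β·LE, in which the denominator vanishes as β approaches −1. The sketch below illustrates one way such a screen might look. It is not the quality-control procedure used for these data: the rejection window `tol`, the function name, and the example values are all assumptions made for illustration.

```python
import math


def bowen_ratio_fluxes(net_radiation, ground_heat, bowen_ratio, tol=0.3):
    """Partition available energy (W m-2) into latent and sensible heat.

    Returns (LE, H). When the Bowen ratio lies within `tol` of -1 the
    partition is numerically unstable, so (nan, nan) is returned instead.
    The width of the rejection window is a hypothetical choice.
    """
    if abs(1.0 + bowen_ratio) < tol:
        return (math.nan, math.nan)
    latent = (net_radiation - ground_heat) / (1.0 + bowen_ratio)
    sensible = bowen_ratio * latent
    return (latent, sensible)


# Hypothetical midday sample: Rn = 400, G = 50, beta = 2.5
print(bowen_ratio_fluxes(400.0, 50.0, 2.5))
# Hypothetical near-singular sample that the screen rejects
print(bowen_ratio_fluxes(50.0, 10.0, -0.95))
```

A fixed window is the simplest possible screen; operational processing often uses criteria that also depend on measurement resolution.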

Annual mean diurnal cycles of sensible and latent heat fluxes are displayed in Fig. 3. Each line in the

FIG. 2. Twenty-minute time series of precipitation for 4 yr at the (a) Kendall and (b) Lucky Hills sites. The monsoon time period is designated with gray vertical lines at 1 Jul and 30 Sep.

FIG. 3. Annual mean diurnal cycle of sensible and latent heat at the Kendall and Lucky Hills sites: (a) Kendall: latent heat, (b) Kendall: sensible heat, (c) Lucky Hills: latent heat, and (d) Lucky Hills: sensible heat. Single-year diurnal fluxes are shown (1997, 1998, 1999, 2000) along with the daily average flux for the 4-yr period (1997–2000). Note the scale on the latent heat flux axis is half that of the sensible heat flux axis.


figures represents the aggregated daily average flux for each of the 4 yr (i.e., the yearly average at each of the 72 twenty-minute time steps throughout the day), along with the 4-yr average (note the scale on the latent heat axis is half that of the sensible heat axis). The diurnal cycle of sensible heat for each of the 4 yr reveals good consistency from year to year. However, the diurnal cycles of latent heat reveal much more variability. At the Kendall site, the average maximum varies from 70 to around 100 W m⁻², and has a fairly smooth diurnal cycle. At the Lucky Hills site, three of the years are fairly similar (1998, 1999, and 2000). The 1997 year at Lucky Hills was much drier overall (lowest average precipitation). Also, the monsoon period during 1997 was problematic at this site (W. Emmerich 2003, personal communication), and was removed for purposes of this study, resulting in a lower overall diurnal average for this year. Although precipitation totals for the 1997 and 1999 years are similar, the patterns in the latent heat flux are not. During 1997, the rain was more distributed throughout the year, while in 1999 there is a much larger concentration of precipitation during the monsoon, resulting in an above-average latent heat flux for this year. The increased latent heat is most likely due to more active vegetation and possible saturation of the top soil layers during the monsoons. At Kendall, 2000 is similar to 1997, with lower annual means for latent heat. This could also be due to the slightly drier monsoon period (like 1997), and more moisture later in the fall season.
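The aggregation described above, a yearly average at each of the 72 twenty-minute steps of the day, can be sketched as follows. This is a minimal illustration rather than the authors' processing code: it assumes a regular 20-min series that starts at the first step of a day, and it skips missing values (NaN or None). The function name and the synthetic data are hypothetical.

```python
import math

STEPS_PER_DAY = 72  # 24 h x 3 twenty-minute steps per hour


def mean_diurnal_cycle(flux_series):
    """Average a regular 20-min flux series at each time-of-day slot.

    Assumes the series begins at the first step of a day; gaps given as
    None or NaN are ignored. Returns a list of 72 slot means (NaN where a
    slot has no valid data).
    """
    sums = [0.0] * STEPS_PER_DAY
    counts = [0] * STEPS_PER_DAY
    for index, value in enumerate(flux_series):
        if value is None or math.isnan(value):
            continue
        slot = index % STEPS_PER_DAY
        sums[slot] += value
        counts[slot] += 1
    return [s / c if c else math.nan for s, c in zip(sums, counts)]


# Two synthetic days: the second is offset by +2 from the first
day_one = [float(k) for k in range(STEPS_PER_DAY)]
day_two = [float(k) + 2.0 for k in range(STEPS_PER_DAY)]
cycle = mean_diurnal_cycle(day_one + day_two)
print(cycle[:3])
```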

4. Methods

a. Model

The Noah model (Ek et al. 2003) is one of the ever-evolving community, or multigroup, land surface models. The Noah model was chosen for evaluation for several reasons. The model has been used in previous studies by this research group and demonstrated the best performance for another semiarid site (Hogue 2003). More importantly, the model is currently parameterized for use over semiarid regions in the National Centers for Environmental Prediction (NCEP) North American Land Data Assimilation System (NLDAS) (Mitchell et al. 2004) of the National Weather Service (NWS). The model is updated periodically on the NCEP Web site (online at ftp://ftp/ncep.noaa.gov/pub/gcp/ldas/Noahlsm) and version 2.5.1 is used in this analysis (release date: 5 March 2002). The Noah model contains four soil layers: a thin 10-cm top layer, a second root zone layer of 20 cm, a deep root zone of 60 cm, and a subroot zone of 110 cm. It can be run for 13 vegetation covers (2 of which use the same parameter values) and nine different soil types (two of which also use the same parameters).

The Noah model has 33 parameters: 10 related to vegetation and 23 that describe soil properties (Table 2). The model also has 16 initial states (when run with four root layers). Of these 49 variables to be estimated, 32 are included throughout the parameter estimation and validation testing. Because soil moisture data were not available at this site, eight of the initial soil moisture states were included in the parameter analysis. Although optimizing initial states is not common practice in operational forecasting, it was undertaken in this study to allow for an objective “best” estimate of initial soil moisture states and to allow for a balanced comparison of model performance for short time periods (seasonal and yearly time frames). The Noah model uses a local greenness fraction from the Normalized Difference Vegetation Index (NDVI) to establish seasonality in the model for each of the 13 vegetation types. The leaf-area index (LAI) value is typically held constant (or used as a tuning parameter) instead of also being varied seasonally (Gutman and Ignatov 1998). Consequently, the LAI parameter was included as a parameter to be calibrated, while the monthly greenness fraction was obtained from NCEP and not adjusted.

b. Optimization

Calibration (or optimization) of a model involves selecting values for parameters so that the model simulates the behavior (as closely as possible) of the study site. Various automated techniques have evolved over the past few decades, with increasing success and application to hydrologic models. The multicriteria theory was more recently applied to the optimization of land surface models by Gupta et al. (1998). This work proposed that the parameter estimation problem be reformulated as a multicriteria problem that seeks a set of trade-off solutions (Pareto set) instead of a single unique solution (parameter set). Yapo et al. (1997) developed the Multiple-Objective Complex Optimization Method-University of Arizona (MOCOM-UA), which has shown success in providing improved calibrations of watershed models (Boyle et al. 2000, 2001; Beldring 2002), hydrochemistry models (Meixner et al. 2000, 2002), and land surface schemes (Gupta et al. 1999; Bastidas 1998; Bastidas et al. 2001; Leplastrier et al. 2002; Xia et al. 2002). More details on the MOCOM-UA algorithm may be found in Gupta et al. (1999) and Yapo et al. (1997). We utilize root-mean-square error (rmse) as the error function for the MOCOM-UA optimization of the Noah model. For validation, both the rmse and the Nash–Sutcliffe forecasting efficiency (nse) (Nash and Sutcliffe 1970) are used. Rmse is defined as

$$\mathrm{rmse} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(O_t - S_t\right)^2}, \qquad (1)$$

where $O_t$ is the observed value at time $t$, $S_t$ is the model simulation at time $t$, and $n$ is the number of observations. The nse measures the fraction of the variance of observed values explained by the model. Values can range from minus infinity to 1.0, with higher values indicating better agreement. The nse is represented as

$$\mathrm{nse} = 1 - \frac{\sum_{t=1}^{n}\left(O_t - S_t\right)^2}{\sum_{t=1}^{n}\left(O_t - O_{\mathrm{mean}}\right)^2}, \qquad (2)$$

where $O_t$ and $S_t$ are defined as above, and $O_{\mathrm{mean}}$ is the mean of the observed values.
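Both statistics follow directly from the definitions in Eqs. (1) and (2). The sketch below is a plain-Python illustration, not code from the study. The function names and the four-point example series are hypothetical, and no handling of missing data is attempted.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error, Eq. (1)."""
    n = len(observed)
    squared = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / n)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency, Eq. (2); 1.0 indicates a perfect match."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical series: the simulation is biased high by one unit
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
print(rmse(obs, sim), nse(obs, sim))
```

A simulation equal to the observed mean at every step gives nse = 0, which is why values below zero indicate a model that performs worse than the observed mean as a predictor.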

c. Validation

Traditional model evaluation includes a split-sample (SS) testing strategy, where parameters are estimated during one time period and are evaluated on a separate or independent time period. Few studies follow more rigorous testing schemes, such as those proposed in Klemes (1986) and Refsgaard and Knudsen (1996), including proxy basin (PB) and differential split-sample (DSS) analyses. These tests vary somewhat in approach, but, in general, parameter sets are obtained via calibration and then validated in the SS, PB, or DSS framework. The PB test involves application of a calibrated model to a similar catchment or site with no direct calibration of parameters on the validation basin.

TABLE 2. Noah model parameter descriptions, default values, optimized values for the 4-yr and 1999 monsoon calibrations, ranges for calibration, and flag for optimization (OPT) or fixed value (FIX).

No.  Parameter name  Flag  Default    Min value  Max value  Physical meaning of parameters
1    rcmin           OPT   400        40         1000       Minimum stomatal resistance (m)
2    rgl             OPT   100        30         150        Used in solar radiation term of canopy resistance Fx
3    hs              OPT   42         36.35      55         Used in vapor pressure deficit term of canopy resistance Fx
4    z0              OPT   0.011      0.01       0.99       Roughness length (m)
5    lai             OPT   4          0.05       6          Leaf-area index
6    cfactr          OPT   0.5        0.1        2          Canopy water parameter
7    cmcmax          OPT   5.00E-04   1.00E-04   2.00E-03   Second canopy water parameter (m)
8    sbeta           OPT   -2         -4         -1         Used in calculation of vegetation effect on soil heat flux
9    rsmax           OPT   5000       2000       10000      Maximum stomatal resistance (m)
10   topt            OPT   298        293        303        Optimum transpiration air temperature (K)
11   maxsmc          OPT   0.42       0.33       0.66       Porosity
12   drysmc          OPT   0.119      0.02       0.2        Air dry soil moisture content limits
13   psisat          OPT   0.62       0.04       0.62       Saturated soil potential
14   satdk           OPT   1.41E-05   5.00E-07   3.00E-05   Saturated soil hydraulic conductivity (m s⁻¹)
15   b               OPT   4.26       3.5        10.8       The “b” parameter
16   satdw           OPT   2.33E-05   5.71E-06   2.33E-05   Saturated soil diffusivity
17   quartz          OPT   0.1        0.1        0.82       Soil quartz content
18   nroot           FIX   4          4          4          Number of root layers
19   refdk           OPT   2.00E-06   5.00E-07   3.00E-05   Reference value for saturated hydraulic conductivity
20   fxexp           OPT   2          0.2        4          Bare soil evaporation exponent used in DEVAP
21   refkdt          OPT   3          0.1        10         Reference value for surface infiltration parameter
22   czil            OPT   0.2        0.05       0.8        To calculate roughness length of heat
23   csoil           OPT   2.00E+06   1.26E+06   3.50E+06   Soil heat capacity for mineral soil component
24   zbot            FIX   -8         -3         -20        Depth of lower boundary soil temperature (m)
25   frzk            OPT   0.15       0.1        0.25       Ice threshold (above frozen soil is impermeable)
26   xnup            OPT   0.025      0.025      0.08       Threshold snow depth (100% snow cover) (m)
27   snoalb          FIX   0.75       0.3        0.75       Maximum albedo over deep snow
28   salp            FIX   2.6        2.6        2.6        Shape of distribution function of snow cover
29   slope           FIX   0.1        0.001      1          Slope
30   t1              FIX   299        265        300        Initial skin temperature (K)
31   cmc             FIX   5.00E-04   0          0.001      Initial canopy water content (m)
32   snowh           FIX   0          0          0.1        Initial actual snow depth (m)
33   sneqv           FIX   0          0          0.1        Initial water equivalent snow depth (m)
34   sldpt1          FIX   0.1        0.1        0.1        Soil depth, layer 1 (m)
35   sldpt2          FIX   0.2        0.2        0.2        Soil depth, layer 2 (m)
36   sldpt3          FIX   0.6        0.6        0.6        Soil depth, layer 3 (m)
37   sldpt4          FIX   1.1        1.1        1.1        Soil depth, layer 4 (m)
38   stc1            FIX   297        260        300        Initial soil temperature (K)
39   stc2            FIX   293.7      260        300        Initial soil temperature (K)
40   stc3            FIX   291.5      260        300        Initial soil temperature (K)
41   stc4            FIX   290.4      260        300        Initial soil temperature (K)
42   smc1            OPT   --         0.05       0.56       Initial soil total moisture
43   smc2            OPT   --         0.05       0.56       Initial soil total moisture
44   smc3            OPT   --         0.05       0.56       Initial soil total moisture
45   smc4            OPT   --         0.05       0.56       Initial soil total moisture
46   sh2o1           OPT   --         0.05       0.56       Initial soil liquid moisture
47   sh2o2           OPT   --         0.05       0.56       Initial soil liquid moisture
48   sh2o3           OPT   --         0.05       0.56       Initial soil liquid moisture
49   sh2o4           OPT   --         0.05       0.56       Initial soil liquid moisture


This evaluation provides insight into the common practice of transferability of parameters within similar climatic regions. Adjustment of parameters can be undertaken to account for different conditions within the proxy basin (e.g., slightly different soil type), but the parameters are not calibrated against any observations. The DSS test involves calibration of the model based on data before a catchment change occurs, adjustment of parameters to reflect the expected changes, and validation or testing of the model after the change has occurred. These changes could involve land alterations (e.g., fire, urbanization, deforestation) or some sort of nonstationary climate event, where a climate phenomenon occurs that has not been observed in the calibration data.

Given the length of our dataset, we can only assess short-term climate phenomena (monsoon, El Niño conditions, wet and dry years, etc.) and the impact of these periods on model simulations. Longer-term climate change and variability, and how these will affect vegetation and evaporation processes in semiarid regions, can only be assessed through long-term model simulations. Taking into account the complexity of the above regime, the given data, and the climatic conditions of the basins in this study, a modified testing scheme is developed for this analysis as follows:

1) The SS tests will involve calibration on one time period and validation on independent time segments. However, to make this analysis more robust and to assess the influence of data length on the estimated parameters, calibration is performed over various time periods in the datasets, with corresponding validation on differing lengths of data. This is illustrated in Fig. 4.

2) The PB analysis will include the direct transfer of the estimated parameters from the calibrated models between the two similar study sites. This PB test is undertaken to test the capability of the model to simulate energy fluxes from a site for which no calibration data would be available. Direct application of the model is done, with no calibration of parameters. The two sites used in this analysis contain similar soils, but slightly different vegetation types (one grassland and one desert shrub), which respond differently to the atmospheric forcing in the region. Testing on the proxy basins will be done for the corresponding time period (i.e., same calibration and validation period).

3) A modified DSS is undertaken for this study. Because the data period of this analysis included a total of 4 yr of data, only short-term climatic and seasonal conditions were evaluated. The summer monsoon period is evaluated, as are the winter periods (the two typical precipitation periods in this region), along with the 1999 monsoon period and the 1997/98 winter (El Niño event). Parameters from the yearly calibrations are evaluated on these unusual climatic events. The DSS testing scheme is illustrated in Fig. 5.
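The split-sample scheme in item 1 can be enumerated programmatically. The sketch below covers only the single-year calibrations and assumes that every such calibration follows the pattern given for 1997 in the Fig. 4 caption (validation on the calibration year, on each remaining single year, and on the remaining years combined). That generalization, and the function name `validation_periods`, are assumptions rather than details confirmed by the text.

```python
YEARS = (1997, 1998, 1999, 2000)


def validation_periods(calibration_year, years=YEARS):
    """List validation periods for one single-year calibration.

    Each period is a tuple of years. The pattern mirrors the 1997 example
    in the Fig. 4 caption and is assumed to hold for the other years.
    """
    others = tuple(y for y in years if y != calibration_year)
    periods = [(calibration_year,)]       # same-period check
    periods += [(y,) for y in others]     # each independent single year
    periods.append(others)                # remaining years combined
    return periods


for year in YEARS:
    print(year, validation_periods(year))
```

For the 1997 calibration this yields the periods 1997, 1998, 1999, 2000, and the combined 1998–2000 block.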

5. Results

a. General calibration

The Noah model was calibrated for each of the listed periods (Table 3) using sensible heat (SH), ground heat flux (G), and ground temperature (Tg) data (taken at a 5-cm depth), using rmse as the objective function. Due to the variability of latent heat (LE) flux and the low values found through most of the year at the Arizona sites, latent heat was not explicitly used as one of the calibration criteria, but model simulations of latent heat flux were evaluated along with the other fluxes. This selection is supported by previous work by the authors, investigating which surface fluxes are best suited to the multicriteria estimation procedures (Bastidas 1998; Bastidas et al. 2001). The multicriteria approach produces a set of solutions, or Pareto set, with the property that moving from one solution to another results in the improvement of one criterion while causing deterioration in another. The Pareto set represents the minimal uncertainty that can be achieved for the parameters via calibration, without subjective assignment of weights to the individual model responses (Gupta et al. 1998). For each calibration a 250 parameter set, or Pareto solution,

FIG. 4. The SS testing scheme. Calibration periods are listed down the left column (designated in black in the matrix). Validation periods are listed across the top with corresponding testing periods designated as gray blocks. For example, parameters from the 1997 calibration period are evaluated on the same period (1997), along with single years 1998, 1999, and 2000, and the 3-yr period of 1998, 1999, and 2000.

74 J O U R N A L O F H Y D R O M E T E O R O L O G Y VOLUME 6


was generated using SH, G, and Tg for criteria (based on Bastidas 1998). Along with the calibrations, default parameters were also run for each of the data periods.
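The trade-off property of a Pareto set can be illustrated with a short Python sketch that filters a handful of candidate parameter sets down to the non-dominated subset. The error values and the function names `dominates` and `pareto_set` are hypothetical and chosen for illustration only; the sketch is not the multicriteria algorithm used in this study (Bastidas 1998).

```python
def dominates(a, b):
    """True if error vector a is no worse than b on every criterion
    and strictly better on at least one (lower error is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_set(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical rmse triplets (SH, G, Tg) for four parameter sets;
# the numbers are illustrative and do not come from the paper.
candidates = [
    (30.0, 20.0, 2.0),
    (28.0, 22.0, 2.1),
    (35.0, 25.0, 2.5),  # worse than the first triplet on every criterion
    (32.0, 19.0, 1.9),
]
front = pareto_set(candidates)
```

Within the retained sets, moving from one to another improves at least one criterion only at the expense of another, which is the defining property stated in the text.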

Cumulative distribution plots of the objective function solutions (from the 250 values) were generated (Fig. 6). Each surface flux is labeled across the top of the respective column. Each line represents one Pareto set (represented by its objective function values). The top row in each plot (a and b) displays the distribution of results for the 1-yr calibrations, and the second row contains the seasonal calibrations (monsoons and winter periods). Each set of functions is normalized against the 4-yr calibration (1997–2000). The 4-yr set would be represented as a zero line on the plots. Therefore, if the errors during the selected calibration periods are lower than the errors from the 4-yr set, the distributions will lie to the left of zero, and errors higher than the 4-yr period will lie to the right of zero. A more vertical line represents a set of solutions that have less variation (very similar objective functions and parameter sets), while a more curved or horizontal line indicates more variation.
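A minimal sketch of how such a normalized cumulative distribution might be built is given below. The exact normalization is not stated in this section, so the sketch assumes a relative difference from a single reference rmse taken from the 4-yr calibration; the function name `normalized_cdf` and all numbers are hypothetical.

```python
def normalized_cdf(errors, reference):
    """Return (normalized error, cumulative probability) pairs, sorted by error.

    ASSUMPTION: normalization is the relative difference from a single
    reference rmse (e.g., a representative value from the 4-yr calibration).
    The paper's actual normalization may differ.
    """
    normalized = sorted((e - reference) / reference for e in errors)
    n = len(normalized)
    return [(value, (rank + 1) / n) for rank, value in enumerate(normalized)]


# Hypothetical rmse values (W m^-2) from a 1-yr Pareto set and a 4-yr reference.
yearly_rmse = [22.0, 18.0, 24.0, 20.0]
reference_rmse = 25.0
cdf = normalized_cdf(yearly_rmse, reference_rmse)
# All normalized values are negative here, so the curve lies left of zero.
```

Under this assumption, a set whose errors equal the reference collapses onto the zero line, and a narrow spread of normalized values gives the near-vertical curve described in the text.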

Results at the Kendall site are illustrated in Fig. 6a, with Lucky Hills in Fig. 6b. In general, most of the errors for the fluxes are lower during the 1-yr calibration periods than the 4-yr period (curves lie to the left of 0). There is more variability from year to year in the parameter sets for SH and Tg, while sets are more consistent for G. Errors in the LE fluxes are higher, although, as stated previously, this flux was not used in the calibration process. The seasonal calibrations at Kendall are similar to the yearly calibrations, with slight variability in the parameter sets during the seasons, but generally similar performance for SH, G, and Tg. However, model simulations did not reproduce LE for either the 4-yr monsoon calibration (MON1997–2000) or the 1999 monsoon (MON1999). Both of these time periods have much higher errors for LE than the overall 4-yr set and are off of the normalized scale (>2) for this figure. For the other fluxes, the parameter sets reveal fairly good consistency when compared to the 1997–2000 calibration set (0 line), and all seasonal calibration sets have lower errors for Tg than over the 4-yr period.

The same set of plots is shown in Fig. 6b (Lucky Hills). Again, the yearly calibration periods have much lower errors than the 4-yr set and show more consistent behavior for most of the fluxes than at Kendall. For the seasonal calibrations, SH and G fluxes show consistency and similar behavior, and errors are much lower for the winter periods than the overall 4-yr period for Tg. Again, the latent heat flux is where the model diverges. Errors are lower for the winter calibration periods (WIN1997–2000 and WIN1997–98), but are off the scale for the MON1999 and MON1997–2000. Similar to the results at Kendall, these results tell us that the model performs well during the dry periods when SH is the dominant energy component, but during the wet monsoon periods, the model does not capture the dramatic change in the energy balance and does not capture this variability in LE as well.

b. Nash–Sutcliffe efficiency

After parameter sets were generated for each of the study periods, a “best set” was chosen to use in the cross-validation testing schemes. This single-best set was generated using an L2 norm procedure (or Euclidean distance). The Euclidean distance between two points $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in Euclidean $n$-space is computed as

$$\text{L2 norm} = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2}. \tag{3}$$
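The selection of a single compromise solution with Eq. (3) can be sketched as follows. This section does not state the reference point or any scaling used, so the sketch assumes that each criterion is scaled by its maximum over the Pareto set and that distance is measured from the zero-error origin; `l2_norm`, `best_compromise`, and the error values are hypothetical.

```python
import math


def l2_norm(x, y):
    """Euclidean distance between two points of equal dimension, as in Eq. (3)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def best_compromise(pareto):
    """Index of the solution closest to the zero-error origin.

    ASSUMPTIONS (not stated in the paper): each criterion is divided by its
    maximum over the set so that units do not dominate, and the reference
    point is the origin (zero error on every criterion).
    """
    n = len(pareto[0])
    scale = [max(p[k] for p in pareto) for k in range(n)]
    origin = [0.0] * n
    distances = [l2_norm([p[k] / scale[k] for k in range(n)], origin)
                 for p in pareto]
    return distances.index(min(distances))


# Hypothetical rmse triplets (SH, G, Tg) forming a small Pareto set.
pareto = [(30.0, 20.0, 2.0), (28.0, 22.0, 2.1), (32.0, 19.0, 1.9)]
best = best_compromise(pareto)
```

Scaling before measuring distance is a design choice made here because fluxes (W m⁻²) and temperature carry different units; other reference points or weightings would select a different member of the set.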

The nse statistics for the selected parameter sets are shown in Fig. 7, with both the Kendall and Lucky Hills results plotted (SH, LE, and Tg). The best results are arguably for SH and Tg at both sites. There is some decrease in performance during the 4-yr winter period (WIN1997–2000) at Lucky Hills, but the trend is such that greater than 80% (>0.80 nse) of the variance of both SH and Tg is captured by the model simulations. Keeping in mind that calibration was not performed using the latent heat flux, the model performance for LE is still less than ideal. Efficiency values range from 40% to 70% at Kendall, and from 10% to 70% at Lucky Hills. The high variability (and low average values) of LE at these

TABLE 3. Data periods used for calibration of the Noah model.

Calibration periods for study sites

1997–2000 (4-yr period)
1997–2000 (each yearly period)
1997–98 (2-yr period)
1999–2000 (2-yr period)
1997–2000 monsoon periods (MON1997–2000)
1997–2000 winter periods (WIN1997–2000)
1999 monsoon (MON1999)
1997/98 winter (WIN1997–98)

FIG. 5. The DSS testing scheme. Calibration periods are listed in the left column and the validation periods in which these parameters were tested are MON1997–2000, WIN1997–2000, MON1999, and WIN1997–98. The gray shading represents testing, white is no testing, and black is the same period as calibration.

FEBRUARY 2005 H O G U E E T A L . 75


sites results in poor model simulations, especially for 1997, which overall has much lower LE values throughout the year. Low nse during the 4-yr monsoon period (MON1997–2000) and the winter period (WIN1997–2000) is indicative of problems with the model capturing the unusual variability of LE using parameters from a longer-time-period calibration. The model calibration from the MON1999, on the other hand, is able to capture this variability. These results suggest that the ability of the model to capture short-term, intense changes from a dry condition to a wetter, evaporative condition is limited using parameter sets from longer time periods. We theorize that this is due to the optimization procedure finding parameters that capture the dominant sensible heat flux that occurs during most of the year in this region (typically 9–10 months each year). Parameter sets calibrated specifically on the short, highly variable monsoon period do a better job of capturing the variability in these periods.
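For reference, the Nash–Sutcliffe efficiency (Nash and Sutcliffe 1970) discussed above can be computed with the short sketch below. It uses the standard definition, one minus the ratio of the squared simulation error to the variance of the observations about their mean; the function name and sample values are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the simulation
    is no better than using the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance


# Hypothetical flux values (W m^-2); illustrative only.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 39.0]
nse = nash_sutcliffe(observed, simulated)
```

Because the denominator is the variance of the observations, the statistic depends on how variable the observed series is as well as on the size of the errors, which should be kept in mind when comparing nse values between fluxes.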

c. Model validation

1) SPLIT SAMPLE

The SS analysis consisted of applying the best parameter set, and default parameters, to each of the periods as shown in Fig. 4 (only annual or multiannual calibration periods). Results are presented as scatterplots in Figs. 8a and 8b. Each box in the figures represents absolute rmse (W m⁻²) for LE (y axis) versus SH (x axis) for each of the selected 11 validation periods. For example, in the 1997–2000 plot (upper-left corner), the parameters calibrated on this time period, along with parameters from the other calibration periods (1997, 1998, 1999, 2000, 1997–98, 1999–2000), and default, are

FIG. 6. Cumulative distribution functions (CDFs) of objective functions (or rmses) from the calibration periods for (a) Kendall and (b) Lucky Hills. Each box represents the errors from the calibration set for a specific flux. For example, in the upper-left corner in (a), four single-year calibrations (1997, 1998, 1999, and 2000) and the errors that resulted during the calibration for sensible heat are shown. The rmses are normalized against the 4-yr calibration and each yearly set is designated by a gray or black, thick or thin line. Objective function values that are more or less than a difference of 2 (normalized value) are offscale and are designated by an arrow.


tested over the 4-yr period. Each specific parameter set is designated via symbols described in the lower-right legend, and default parameter simulation errors are designated as the dashed lines on each plot.
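The split-sample bookkeeping described above amounts to scoring every calibrated parameter set on every validation period. A minimal sketch is given below; the one-parameter `toy_model` is a hypothetical stand-in for a land surface model, and the period names, parameter values, and data are invented for illustration.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between two equally long series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated))
                     / len(observed))


def split_sample(param_sets, periods, simulate):
    """Score every calibrated parameter set on every validation period."""
    table = {}
    for cal_name, params in param_sets.items():
        for val_name, (forcing, observed) in periods.items():
            table[(cal_name, val_name)] = rmse(observed,
                                               simulate(params, forcing))
    return table


def toy_model(params, forcing):
    """HYPOTHETICAL stand-in for a land surface model: flux = gain * forcing."""
    return [params["gain"] * f for f in forcing]


# Invented forcing and observed flux series for two periods.
periods = {
    "1997": ([100.0, 200.0], [50.0, 100.0]),
    "1998": ([100.0, 200.0], [60.0, 120.0]),
}
param_sets = {"1997": {"gain": 0.5}, "1998": {"gain": 0.6}}
scores = split_sample(param_sets, periods, toy_model)
```

Entries where the calibration and validation names match correspond to the black cells of Fig. 4, and the remaining entries correspond to the gray testing cells.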

At the Kendall site (Fig. 8a), simulations from the default parameters generally result in higher rms errors than the calibrated parameter sets. In many cases there is a 5–10 W m⁻² reduction in sensible heat errors, and up to 20 W m⁻², when using any calibrated parameter set rather than the default values. In the 1997–2000 period for Kendall, the default parameters resulted in rms errors of around 65 W m⁻² for sensible heat and 50 W m⁻² for latent heat over the 4-yr dataset (1997–2000). Although these errors may seem high, in relation to the normally large sensible heat values in this region (up to 600 W m⁻² during summer periods), the model errors are actually around 10% or less of this value. As has been discussed, latent heat has greater variability than sensible or ground heat fluxes in this region, with fairly low values throughout the year (near 10–15 W m⁻²) and increasing to much higher values of around 300 W m⁻² during the monsoon period; hence, the model errors are a significantly higher percentage of the average values. During the shorter time periods at the Kendall site (1997, 1998, 1999, and 2000), the parameter sets all perform fairly consistently on these validation periods, except for 1999, where there is more spread in the results. The parameter set calibrated over the 4-yr period, designated by the star (*), results in slightly larger errors for the latent heat fluxes when compared to the other parameter sets during three of the 1-yr periods. This result also suggests that the parameter sets from the longer time periods do not capture the short-term year-to-year variability in the data at these sites. The extremely wet monsoon period was during 1999, and the inconsistency in the parameter sets to capture this extreme is observed, with the 1999 and 2000 parameter sets performing better than other sets for this wetter year. Of the 2-yr validation periods tested, the 1999–2000 period shows more inconsistency, with only the parameter set calibrated specifically for this period performing well. The final four validation periods in the graphic consist of testing a 1-yr calibration set on the other 3 yr, along with a test of the 4-yr set and the default on these periods. Results from these runs reveal that the 1-yr set performs fairly well on the alternative 3-yr periods, and on some occasions, as well as, or slightly better than, the 4-yr set.

The Lucky Hills site (Fig. 8b) shows some of the same general trends as at the Kendall site. However, the default parameters perform better during some of the validation periods than at Kendall. This could be explained by the difference in vegetation at the two sites, with the Kendall site having a more dense vegetation cover, while the desert scrub at Lucky Hills, having bare ground exposed, may be more typical of the vegetation parameterized in these regions. Again, there is more spread in the results in the periods containing the 1999 and the 2000 periods, and the parameters calibrated from these periods perform better than the other sets tested for this time. For the 3-yr validation periods, the default parameters perform slightly better than the 4-yr parameter set, and similar to the calibrated set on latent heat flux. Errors for SH are still higher in most periods using the default parameters. As stated previously, the magnitude of the errors is also related to the magnitude of the fluxes, with the sensible heat fluxes being much higher at this site for most of the year.

2) PROXY BASIN

Each parameter set discussed above, along with the four seasonal calibrations, was evaluated directly against the same time period on the other study site.

FIG. 7. Nash–Sutcliffe forecasting efficiency for the various calibration periods. The Kendall site is represented by the solid lines and Lucky Hills is represented by the dashed lines. Each calibration period is listed on the x axis. Sensible heat values are represented by a (●) symbol, latent heat values are represented by a (o) symbol, and ground temperature values are represented by a () symbol.


Direct application of the parameter sets is used to simulate surface fluxes for the same time period at a proxy site (Kendall to Lucky Hills, and Lucky Hills to Kendall), with no adjustment of parameters. Results from this site-to-site cross-validation study are presented in Fig. 9. Results for the calibrated parameter sets (stars), default (dashed lines), and the proxy basin parameter set (squares) are shown in each plot.

FIG. 8. The SS results for the (a) Kendall and (b) Lucky Hills sites. Each box represents one of the validation periods and is a scatterplot of the rmse for latent heat and sensible heat flux for each parameter set. Default parameters: dotted gray lines; various calibration sets: symbols listed in legend.


When applying the Lucky Hills parameters to the Kendall site (Fig. 9a), there are a few validation periods when the default parameters perform as well as the calibrated sets. However, the default set also results in extremely poor performance for several periods (1997, 1998, 1999, 2000, 1999–2000) when compared to the calibrated or proxy parameter sets. More significantly, in most cases at these sites, proxy basin parameters result in a slight decrease in model performance when compared to the site-specific calibrated set (increase of

FIG. 9. Same as in Fig. 8, but for the PB results. Again, the (a) Kendall and (b) Lucky Hills sites are shown.


5–20 W m⁻² in flux errors). Two of the seasonal calibrations, the 1997–2000 monsoon period (MON1997–2000) and the 1999 monsoon (MON1999), have proxy basin and calibration sets that result in errors greater than the 70 W m⁻² upper limit of the given figure.

At the Lucky Hills site (Fig. 9b) the proxy basin parameters (Kendall sets) generally also result in poorer performance than the site-specific set for several time periods (1999, 1997–1998, MON1999, WIN1997–2000, and WIN1997–1998). Only during 2000 do the proxy basin parameters yield similar results to the site-specific calibration. Default parameters at Lucky Hills typically result in larger flux errors than either of the calibrated sets, but do perform well for two winter periods (WIN1997–2000 and WIN1997–98). The monsoon periods at Lucky Hills show similar problems as at the Kendall site, with generally higher errors for these periods.

3) DIFFERENTIAL SPLIT SAMPLE

Four distinct climatic periods were selected (MON1997–2000, WIN1997–2000, MON1999, and WIN1997–98) to evaluate how parameter sets from various calibration periods would perform on these atypical climatic events found in the southwest. Default parameters were also run. Results are presented in Fig. 10. Findings from the Kendall site are displayed in the top row of plots (Fig. 10a), with Lucky Hills in the bottom series (Fig. 10b). Note the difference in scale for the rmse (range of 20–100 W m⁻²) from Figs. 8 and 9 (range of 20–70 W m⁻²). The parameter sets show consistent performance during the winter periods at both sites with a clustering of errors in the same region. These sets (and default parameters) result in fairly low LE errors (20–40 W m⁻²). The parameter sets have much greater variability during the monsoon periods. For the monsoon 1997–2000 period (Kendall site), the parameter sets yield similar errors for SH, but much more variability in errors for LE. For the MON1999, all parameter sets show a large spread in both SH and LE. Interestingly, the 4-yr parameter set (1997–2000) does capture this period fairly well, along with the 1999 and MON1999 set (which would be more expected). At Lucky Hills, similar results are observed, with parameter sets performing well on the drier winter periods, and showing poorer performance on the wetter monsoon periods. Simulations during these wet periods (with high LE fluxes) seem to be very dependent on the parameter values, and calibration of parameters specific to these periods appears critical for the model to yield adequate simulations. Again, using parameter sets from longer time frames or periods where the sensible heat flux is dominant (which is most of the year in this region) contributes to poorer model performance during short, wet periods.

4) TIME SERIES

Time series of selected optimizations from the Kendall site are presented in Figs. 11 and 12, with 10-day periods from the dry season (1 June 1998) and the monsoon period (1–10 August 1999), respectively. Both figures show the default (dark solid line) simulation, the 250 trade-off or Pareto set solution (gray area), and the observed data for the period (circles). The scatterplots to the right of the time series correspond to the entire period of calibration (i.e., observed versus 4-yr calibration results). For the 10-day period during June (Fig. 11), it is observed that the model tracks well for SH, G, and Tg. The trade-off solution encompasses the observed data throughout the diurnal cycle for these three fluxes. The width of the solutions is also fairly narrow, indicating less parameter uncertainty for these simulations. The default parameters tend to overestimate the sensible heat and undersimulate Tg and G. For

FIG. 10. Same as in Fig. 8, but for the DSS results. Both the (a) Kendall and (b) Lucky Hills sites are shown.


latent heat, although the daily flux values are very low, neither the narrow trade-off solution nor the default captures the peak flux during the height of the day. The situation is much different during the monsoon period displayed in Fig. 12. Sensible heat fluxes trend much lower during the day, while latent heat fluxes are higher and more variable. The trade-off solution still tracks SH with a fairly narrow range of uncertainty, and captures G and Tg well. Default parameters oversimulate SH during this period, but match G and Tg as well as the Pareto set. However, there are obvious differences when evaluating LE. The uncertainty associated with the trade-off solution is much greater during this period. The range of solutions tracks better than the default parameters, but still slightly underestimates the peak flux. This is also evident in the scatterplots of the observed and simulated LE. Although the correlation is not extremely poor (r = 0.715), the trend of the model to undersimulate higher values is apparent. These results reinforce earlier statements regarding the inability of the model to capture latent heat fluxes during the monsoon period, especially with parameter sets from longer-time-period optimizations.

6. Discussion and conclusions

The goal of this investigation was to rigorously evaluate the performance of the Noah land surface model in a semiarid region using an objective, systematic calibration and evaluation procedure. We sought to answer several questions, including the following: 1) How well does a common land surface model, such as the NCEP Noah model, perform in semiarid regions? 2) Do parameters calibrated at a proxy site lead to reasonable simulations at other sites with similar climate and vegetation? and 3) How do parameters perform when tested under various climatic conditions that were not included as part of the calibration period? Few studies have addressed these issues with long-term datasets or in a semiarid region.

In general, the Noah model accurately reproduces the sensible heat, ground heat, and ground temperature observations. Soil moisture observations were not available for this study, and, hence, no conclusions can be directly stated regarding the tracking of this variable. Follow-up studies may be needed to address the issue of initialization of prognostic land states without observation data, and the potential impact on optimization of parameter values. However, latent heat flux values were available and it is observed that the model does not reproduce this variable as well, especially during the very significant monsoon period. The dramatic increases in LE during this period (and decrease in sensible heat) are problematic for the model, and hence, higher errors are observed. The latent heat flux in the

FIG. 11. Kendall site model simulations for an 11-day period during the dry season (Jun 1998). Gray shaded area: trade-off bounds (Pareto set) for 1997–2000 parameters (4-yr calibration); circles: observed data; and solid line: default parameters.


Noah model is strongly dependent on the greenness fraction value obtained from long-term remote sensing data (Kurkowski et al. 2003). This fraction may be a major source of poor performance during the wet monsoon summers, because the model has no mechanism to adjust to abnormal or abrupt changes in vegetation response (Kurkowski et al. 2003). This season is critical in the southwest, because vegetation processes are fully active for most of the grasses and other plant species in the region, and the evaluation of the evaporative component in these basins (and land surface models) is crucial to the ongoing consumptive use studies in the region.

Parameter sets from the optimization procedure show improved performance over nearly all of the default simulations, with model error reduced by as much as 20–40 W m⁻² using the calibrated parameters. Climatic events such as El Niño and the seasonal monsoon period require parameters that are specifically calibrated during these events (or include similar events) for adequate representation of the latent heat fluxes. This finding has significance for the range of studies comparing the performance of land surface models. The elimination of parameter identification errors and a systematic evaluation procedure allow for a more realistic and fair intercomparison of model performance.

In this study, single-year calibrations have lower errors during that period than parameters from a longer calibration period. Also, when applying shorter time period calibrations (1 yr) to longer time series (3 or 4 yr), performance does not decline significantly. This analysis suggests that, for long-term simulations, it may be acceptable to select a representative data period for application to a longer period. There is also a demonstrated need to include both a “wet” and a “dry” period in this calibration period. Using parameter sets from longer time-frame periods (2 or 4 yr) results in decreased performance during the nonstationary climatic events, especially for the more extreme monsoon periods (such as the 1999 monsoon period). To capture the variability of these climatic conditions, there is a need to include wet periods where latent heat is markedly pronounced to initiate all physical processes in the model and to improve the estimation of model parameters.

In general, it can be stated that the application of proxy site (transferred) parameters results in slightly larger errors, from 5 to 20 W m⁻², over a parameter set specific to the flux site. Results also indicate that this conclusion is specific to the validation time period, with model performance declining by as much as 40–60 W m⁻² (sensible heat) when using proxy parameters over some time periods. In most cases, however, the proxy

FIG. 12. Same as in Fig. 11, but for Kendall site model simulations for an 11-day period during the monsoon season (Aug 1998).


site parameters are still an improvement over default values at these sites. In the absence of calibration data, a proxy basin set of parameters can be applied with a moderate decline in performance. Although these results are specific to the Arizona sites and, in theory, cannot be generalized to other pairs of proxy sites, we suspect that these results may hold for other case studies. As longer-term datasets become available, further proxy site studies will allow greater insight into this hypothesis. At the minimum, this analysis provides a first attempt to quantify the range of errors that can be expected when applying proxy site parameters to other similar climatic regions.

This is one of the first studies to address the effect of data length and quality on parameter estimates in a land surface model, and the transferability of those estimates in semiarid regions. The Noah model was selected as a surrogate in this study for other common land surface models and did have some simulation problems during the wet season. However, based on our analysis of model performance in semiarid and other regions (Hogue 2003), we expect other models to behave similarly under these conditions. Land surface models need to be able to simulate long-term change, but one would also hope they capture seasonal variability. This ability was assessed in this analysis. Our goal is that rigorous calibration and validation studies performed on models, such as the NCEP Noah model, will lead to improved parameter estimates for use in long-term climate and regional environmental studies. A companion paper is in progress, further analyzing the parameter behavior and sensitivity at these two sites, as well as at five other climatic regions.

Acknowledgments. The authors would like to thank the reviewers for their thoughtful comments that have contributed greatly to the improvement of this manuscript. Partial financial support for this research was provided by grants from the NASA EOS Graduate College Fellowship and SAHRA (Sustainability of Semi-Arid Hydrology and Riparian Areas) under the STC program of the National Science Foundation, Agreement EAR-9876800.

REFERENCES

Bastidas, L. A., 1998: Parameter estimation for hydrometeorological models using multi-criteria methods. Ph.D. dissertation, The University of Arizona, 204 pp.

Bastidas, L. A., H. V. Gupta, and S. Sorooshian, 2001: Bounding the parameters of land-surface schemes using observational data. Land Surface Hydrology, Meteorology and Climate: Observations and Modeling, V. Lakshmi, J. Albertson, and J. Schaake, Eds., Water Science and Application, Vol. 3, Amer. Geophys. Union, 65–76.

Bastidas, L. A., B. Nijssen, W. Emmerich, H. V. Gupta, and E. Small, 2004: Description of the PILPS 2g experiment: Model comparison over semi-arid areas. GLASS Science Panel of GEWEX Proposal, 19 pp.

Beldring, S., 2002: Multi-criteria validation of a precipitation-runoff model. J. Hydrol., 257, 189–211.

Boone, A., F. Habets, and J. Noilhan, 2001: The Rhone-AGGregation Experiment. GEWEX News, Vol. 11, No. 3, International GEWEX Project Office, Silver Spring, MD, 3–5.

Boyle, D. P., H. V. Gupta, and S. Sorooshian, 2000: Toward improved calibration of hydrologic models: Combining the strengths of manual and automatic methods. Water Resour. Res., 36, 3663–3674.

Boyle, D. P., H. V. Gupta, S. Sorooshian, V. Koren, Z. Y. Zhang, and M. Smith, 2001: Toward improved streamflow forecasts: Value of semidistributed modeling. Water Resour. Res., 37, 2749–2759.

Buizer, J. L., J. Foster, and D. Lund, 2000: Global impacts and regional actions: Preparing for the 1997–98 El Niño climate and societal interactions. Bull. Amer. Meteor. Soc., 81, 2131–2139.

Chehbouni, A., and Coauthors, 2000: A preliminary synthesis of major scientific results during the SALSA program. Agric. For. Meteor., 105, 311–323.

Ek, M. B., K. E. Mitchell, Y. Lin, P. Grunmann, E. Rogers, G. Gayno, and V. Koren, 2003: Implementation of the upgraded Noah land-surface model in the NCEP operational mesoscale Eta model. J. Geophys. Res., 108, 8851, doi:10.1029/2002JD003296.

Emmerich, W. E., 2003: Carbon dioxide fluxes in a semiarid environment with high carbonate soils. Agric. For. Meteor., 116, 91–102.

Goodrich, D. C., and Coauthors, 2000: Preface paper to the Semi-Arid Land-Surface-Atmosphere (SALSA) Program special issue. Agric. For. Meteor., 105, 3–20.

Gupta, H. V., S. Sorooshian, and P. O. Yapo, 1998: Towards improved calibration of hydrologic models: Multiple and noncommensurable measures of information. Water Resour. Res., 34, 751–763.

Gupta, H. V., L. A. Bastidas, S. Sorooshian, W. J. Shuttleworth, and Z. L. Yang, 1999: Parameter estimation of a land surface scheme using multi-criteria methods. J. Geophys. Res., 104, 19 491–19 504.

Gutman, G., and A. Ignatov, 1998: The derivation of the green vegetation fraction from NOAA/AVHRR data for use in numerical weather prediction models. Int. J. Remote Sens., 19, 1533–1543.

Henderson-Sellers, A., Z.-L. Yang, and R. E. Dickinson, 1993: The Project for Intercomparison of Land-surface Parameterization Schemes. Bull. Amer. Meteor. Soc., 74, 1335–1349.

Henderson-Sellers, A., A. J. Pitman, P. K. Love, P. Irannejad, and T. H. Chen, 1995: The Project for Intercomparison of Land Surface Parameterization Schemes (PILPS): Phases 2 and 3. Bull. Amer. Meteor. Soc., 76, 489–503.

Hogue, T. S., 2003: A multi-criteria evaluation of land-surface models and application to semi-arid regions. Ph.D. dissertation, The University of Arizona, 247 pp.

Kemp, P. R., 1983: Phenological patterns of Chihuahuan desert plants in relation to the timing of water availability. J. Ecol., 71, 427–436.

Klemes, V., 1986: Operational testing of hydrological simulation models. Hydrol. Sci. J., 31, 13–24.

Kurkowski, N. P., D. J. Stensrud, and M. E. Baldwin, 2003: Assessment of implementing satellite-derived land cover data in the Eta Model. Wea. Forecasting, 18, 404–416.

Leplastrier, M., A. J. Pitman, H. Gupta, and Y. Xia, 2002: Exploring the relationship between complexity and performance in a land surface model using the multicriteria method. J. Geophys. Res., 107, 4443, doi:10.1029/2001JD000931.

Meixner, T., R. C. Bales, M. W. Williams, D. H. Campbell, and J. S. Baron, 2000: Stream chemistry modeling of two watersheds in the front range, Colorado. Water Resour. Res., 36, 77–87.

Meixner, T., L. A. Bastidas, H. V. Gupta, and R. C. Bales, 2002: Multi-criteria parameter estimation for models of stream chemical composition. Water Resour. Res., 38, 1027, doi:10.1029/2000WR000112.

Mitchell, K. E., and Coauthors, 2004: The multi-institution North American Land Data Assimilation System (NLDAS) project: Utilizing multiple GCIP products and partners in a continental distributed hydrological modeling system. J. Geophys. Res., 109, D07S90, doi:10.1029/2003JD003823.

Nash, J. E., and J. V. Sutcliffe, 1970: River flow forecasting through conceptual models. 1, A discussion of principles. J. Hydrol., 10, 282–290.

Pitman, A. J., and Coauthors, 1999: Key results and implications from phase 1(c) of the Project for Intercomparison of Land-surface Parameterization Schemes. Climate Dyn., 15, 673–684.

Qi, J., and Coauthors, 2000: Spatial and temporal dynamics of vegetation in the San Pedro River basin area. Agric. For. Meteor., 105, 55–68.

Refsgaard, J. C., and J. Knudsen, 1996: Operational validation and intercomparison of different types of hydrological models. Water Resour. Res., 32, 2189–2202.

Rodriguez-Iturbe, I., A. Porporato, F. Laio, and L. Ridolfi, 2001: Plants in water-controlled ecosystems: Active role in hydrologic processes and response to water stress. Adv. Water Resour., 24, 695–705.

Schlosser, C. A., and Coauthors, 2000: Simulations of a boreal grassland hydrology at Valdai, Russia: PILPS phase 2(d). Mon. Wea. Rev., 128, 301–321.

Scott, R. L., 1999: Riparian and rangeland soil-vegetation-atmosphere interactions in southeastern Arizona. Ph.D. dissertation, The University of Arizona.

Sorooshian, S., R. Bales, H. Gupta, G. Woodard, and J. Washburne, 2002: A brief history and mission of SAHRA: A National Science Foundation Science and Technology Center on “Sustainability of semi-arid hydrology and riparian areas” (invited commentary). Hydrol. Processes, 16, 3293–3295.

Western Region Climate Center, cited 2003: Historical climate information. [Available online at http://www.wrcc.dri.edu/CLIMATEDATA.html.]

Wood, E. F., X. Liang, D. Lohmann, and D. P. Lettenmaier, 1998: The Project for Intercomparison of Land-surface Parameterization Schemes (PILPS) phase-2(c) Red-Arkansas River Experiment: 1. Experiment description and summary intercomparisons. J. Global Planet. Change, 19, 115–135.

Xia, Y., A. J. Pitman, H. V. Gupta, M. Leplastrier, A. Henderson-Sellers, and L. A. Bastidas, 2002: Calibrating a land surface model of varying complexity using multicriteria methods and the Cabauw dataset. J. Hydrometeor., 3, 181–194.

Yapo, P. O., H. V. Gupta, and S. Sorooshian, 1997: Multi-objective global optimization for hydrologic models. J. Hydrol., 204, 83–97.
