PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [Florida Atlantic University] On: 23 March 2011 Access details: Access Details: [subscription number 784176984] Publisher Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37- 41 Mortimer Street, London W1T 3JH, UK The Professional Geographer Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t788352615 A Point-Based Intelligent Approach to Areal Interpolation Caiyun Zhang a ; Fang Qiu b a Florida Atlantic University, b University of Texas, Dallas First published on: 02 March 2011 To cite this Article Zhang, Caiyun and Qiu, Fang(2011) 'A Point-Based Intelligent Approach to Areal Interpolation', The Professional Geographer, 63: 2, 262 — 276, First published on: 02 March 2011 (iFirst) To link to this Article: DOI: 10.1080/00330124.2010.547792 URL: http://dx.doi.org/10.1080/00330124.2010.547792 Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf This article may be used for research, teaching and private study purposes. Any substantial or systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.



ARTICLES

A Point-Based Intelligent Approach to Areal Interpolation∗

Caiyun Zhang
Florida Atlantic University

Fang Qiu
University of Texas at Dallas

Areal interpolation is the data transfer from one zonal system to another. A survey of previous literature on this subject points out that the most effective methods for areal interpolation are the intelligent approaches, which often take two-dimensional (2-D) land use or one-dimensional (1-D) road network information as ancillary data to give insight on the underlying distribution of a variable. However, the 2-D or 1-D ancillary information is not always applicable for the variable of interest in a specific study area. This article introduces a point-based intelligent approach to the areal interpolation problem by using zero-dimensional (0-D) points as ancillary data that are locationally associated with the variable of interest. The connection between zonal variables and point locations can be modeled with a linear or a nonlinear exponential function, which incorporates the distribution of the variables in the transferring of the information from the source zone to the target zone. An experimental study interpolating the population data at a suburbanized area suggests that the proposed method is an attractive alternative to other areal interpolation solutions based on the evaluation of its resulting accuracy and efficiency.

Key Words: areal interpolation, models, population estimation.

Spanish abstract (in translation): Spatial interpolation is the transfer of data from one zonal system to another. A review of the existing literature on this subject indicates that the most effective spatial interpolation methods are the intelligent approaches, which often adopt two-dimensional (2-D) land use information or the one-dimensional (1-D) road network as ancillary data to give insight into the underlying distribution of a variable. However, 2-D or 1-D ancillary information is not always applicable to the variable of interest in a specific study area. This article introduces a point-based intelligent approach to the spatial interpolation problem, using zero-dimensional (0-D) points as ancillary data that are locationally associated with the variable of interest. The connection between zonal variables and point locations can be modeled with a linear or nonlinear exponential function, which incorporates the distribution of the variables in the transfer of information from a source zone to the target zone. An experimental study interpolating population data in a suburbanized area suggests that the proposed method is an attractive alternative to other spatial interpolation solutions, based on an evaluation of the resulting levels of accuracy and efficiency. Key words: spatial interpolation, models, population estimation.

∗We would like to extend our appreciation to the anonymous reviewers. Their constructive comments were very valuable in improving the quality of this article. Caiyun Zhang would also like to acknowledge the support of the Natural Science Foundation of China through projects 40730530 and 40706056.

The Professional Geographer, 63(2) 2011, pages 262–276 © Copyright 2011 by Association of American Geographers. Initial submission, February 2009; revised submission, October 2009; final acceptance, November 2009.

Published by Taylor & Francis Group, LLC.

Downloaded By: [Florida Atlantic University] At: 13:39 23 March 2011

Areal interpolation, the transformation of attribute data from one zonal system to another, is frequently used in economic, social, and urban studies. The zonal units for which data are available are known as source zones and those for which data are required are called target zones (Markoff and Shapiro 1973). The need for areal interpolation arises when spatially aggregated data are available for one set of geographic areal units (or zones) but not for the areal units of current interest. In the United States, population data have often been collected and aggregated every ten years in area units such as census blocks, block groups, and census tracts, which are arbitrarily defined by the U.S. Census Bureau for survey purposes. Population data, however, are often needed in socioeconomic, urban, and interdisciplinary studies, where the study units are usually different from those used by the Census Bureau. For example, many businesses need demographic data for marketing analysis zones or service or trade areas. Similarly, in the natural sciences, the geographic units of analysis are usually areas defined by land use, land cover, soil type, watershed boundaries, and a variety of other biophysical and geophysical features. Given that census geography and its concomitant demographic data seldom correspond exactly to these areas, it is crucial that population counts and other related information can be transferred into new area units so that the data from different disciplines and disparate units of analysis can be integrated.

A variety of algorithms for areal interpolation have been proposed based on different assumptions for the underlying distributions of variables. These algorithms can be grouped into two categories: simple methods that do not employ ancillary data and intelligent methods that make use of ancillary data (Hawley and Moellering 2005).

The best known simple method is the area-weighting approach, which interpolates a variable based on the area of intersection between the source and target zones without using ancillary data (Lam 1983). This method has been used in various applications, for example, to generate a consistent global georeferenced population data set known as Gridded Population of the World (GPW) in 1995 (Tobler et al. 1995), which is now updated by the Center for International Earth Science Information Network of Columbia University with improved methods (see http://sedac.ciesin.columbia.edu/gpw/). A fundamental problem with this approach is that it assumes spatial homogeneity of the variable of interest within each source zone, an assumption that is rarely true in the real world. The pycnophylactic method (Tobler 1979) is also a simple method, and it provides a solution to this problem. It fits a continuous surface to the source zone data and then uses that surface to aggregate values for the target zones. Tobler applied this method to generate population density maps for 1970 at the state level and the census tract level. A centroid-based method proposed by Martin (1989) and Bracken and Martin (1989) is another simple solution and was used for census mapping in the United Kingdom. This method places a kernel density window over a source zone centroid to allocate the population into each grid cell with a distance-based weighting strategy. This simple approach does not conserve the total value of each source zone; such conservation, known in the literature as volume preservation, is regarded as an important characteristic of areal interpolation. To solve this problem, Martin (1996) modified the original centroid-based algorithm to ensure that the populations reported for target zones are constrained to match the overall sum of the source units. Similarly, Harris and Chen (2005) employed UK postcodes to estimate population density using the population surface modeling technique proposed by Martin (1989) and Bracken and Martin (1989).

With the inclusion of additional data, various algorithms have been proposed to provide better solutions to the areal interpolation problem. These methods are known as intelligent methods, because they utilize the ancillary data to inform the distribution of the variable of interest in the source zone. These intelligent methods have been extensively applied to population interpolations (Mennis 2003; Holt, Lo, and Hodler 2004; Langford 2006; Reibel and Agrawal 2007), socioeconomic variable estimations (Goodchild, Anselin, and Deichmann 1993; Eicher and Brewer 2001; Mennis and Hultgren 2006), and time-variant population analysis with changing historical administrative boundaries (Gregory 2002a, 2002b). Among the intelligent methods within the literature, the so-called dasymetric approach is the most cited. The dasymetric method was originally designed for mapping purposes by Wright (1936) to improve the depiction of population distribution in choropleth areal maps. It was then adopted as an areal interpolation approach that divides source zones into smaller constituents possessing different but internally consistent densities of the variable (Langford, Maguire, and Unwin 1991; Langford 2006). Dasymetric approaches commonly employ two-dimensional (2-D) ancillary data, such as land use, to provide insight on the underlying distribution of the variable in the source zone (e.g., Fisher and Langford 1995, 1996; Eicher and Brewer 2001; Mennis 2003; Holt, Lo, and Hodler 2004). With the availability of one-dimensional (1-D) Topologically Integrated Geographic Encoding and Referencing (TIGER) line geographic information system (GIS) data, Xie (1995) was the first to apply a similar idea to the areal interpolation problem by using these types of data as ancillary information. Reibel and Bufalino (2005) investigated the errors in Xie's algorithm.

The ancillary data are often called control zones if they are 2-D (Goodchild, Anselin, and Deichmann 1993). Similarly, we can refer to 1-D ancillary data as control lines. Many variations of these intelligent methods, such as smart dasymetric methods (Deichmann 1996; Turner and Openshaw 2001), intelligent dasymetric methods (Mennis and Hultgren 2006), and regression dasymetric methods (Langford, Maguire, and Unwin 1991; Yuan, Smith, and Limp 1997; Langford 2006), have also been proposed.

Statistical methods have also been employed in areal interpolation. Those methods model the relationship between the distribution of the variable of interest and the ancillary data, as in the regression dasymetric method. The regression analysis is usually conducted at the aggregate level, for example, by using the total count of the variable in each source zone and the aggregate area of the control zones falling within the source zone. The resulting regression function is then applied to each target zone to estimate its total count by using the aggregate area of the control zones falling into the target zone. An innovative statistical method is the EM (expectation-maximization) algorithm proposed by Flowerdew and Green (Flowerdew 1988; Flowerdew and Green 1991). It models the relationship between estimated population density and socioeconomic variables such as the number of people or car ownership per household at the disaggregate level (i.e., control unit level). Similarly, Harvey (2002a, 2002b) used an iterated regression procedure as a least-squares approximation of the EM algorithm to correlate population with the digital number of each resident pixel of a satellite image. These methods often derive the regression formula from individual estimations with assumed statistical rules, and then apply those rules to the whole units being studied (Xie 1995). The success of these methods depends heavily on the availability of detailed control variables at the disaggregate level and the degree to which the variable of interest precisely follows a specific standard statistical distribution, a demand that might limit their wide application.

Geostatistical approaches that were originally designed for point interpolation have also recently been adopted in areal interpolation by incorporating spatial autocorrelation into the modeling process. Kyriakidis (2004) established a theoretical framework of area-to-point interpolation based on spatial cross-correlation between areal and point variables through cokriging procedures. Kyriakidis and Yoo (2005) illustrated the realization of this theoretical work through the application of a synthetic image data set. Liu, Kyriakidis, and Goodchild (2008) further extended the model to disaggregate the residuals that resulted from a regression between population density and built-up and vegetation compositions obtained from an IKONOS image. In a separate study by Wu and Murray (2005), the imperviousness fraction derived from an Enhanced Thematic Mapper Plus (ETM+) image was adopted as a secondary variable in a cokriging prediction of population density. As an emerging approach, geostatistical-based areal interpolation, with its theoretical framework being recently developed and tested by applications using simulated grid and satellite imagery, is anticipated to be a promising method if verified by more real-world applications using other ancillary data.

Hawley and Moellering (2005) conducted a detailed comparative study of the four most widely used methods: the area-weighting method (Lam 1983), the pycnophylactic method (Tobler 1979), the binary dasymetric method using 2-D remotely sensed data (Wright 1936; Fisher and Langford 1995), and the road network method using 1-D TIGER/Line data (Xie 1995). It was found that intelligent methods usually achieved better accuracy than simple methods, with the 1-D overlaid network as ancillary data resulting in the best accuracy for population interpolation in their study area.

Intelligent methods using 2-D ancillary data or 1-D road network data are not applicable to all studies. Sadahiro (2000) pointed out that 2-D land use information employed in many dasymetric methods is often derived from remote sensing images, which are not always available or might be expensive, especially for developing countries. In addition, the computational cost tends to be problematic because great quantities of detailed polygons, raster cells, or both have to be processed. The 1-D road network-based intelligent approach often employs road information such as TIGER/Line data to partition the attribute data along the street segments, which also increases computation time due to the large volume of segments to be processed. Such applications rely on the theoretical assumption that human residents are mainly located along the sides of roads. Consequently, the applicability of the 1-D network-based approach for estimating variables other than population has not yet been explored (Xie 1995). Further, the implementation of both 2-D and 1-D intelligent methods relies on the overlay operation, a complex topological process that further increases the computational expense of these methods, especially given the large quantity of data usually involved.

In this article, we propose a point-based approach that uses 0-D points as ancillary data to help transfer information from the source zone to the target zone. These point data therefore serve as control points, which help characterize the underlying distribution of the variable of interest. This proposed approach has several advantages, as follows. First, 0-D ancillary data have a much simpler data structure compared with 2-D and 1-D spatial data. Second, point data are widely available from various databases such as school sites, business centers, supermarkets, fire stations, and hazardous sites. They allow for the interpolation of other variables besides human population. Third, the proposed approach is based on a simpler geographic information systems (GIS) operation, straight-line distance analysis, rather than the overlay process, which makes the technique more computationally efficient than 2-D and 1-D intelligent methods. Finally, the new technique makes an intelligent areal interpolation possible when 2-D or 1-D ancillary data relevant to the variables to be interpolated are not available or applicable.

Previous Methods

To evaluate the proposed approach, five areal interpolation methods that have been widely cited in the past are employed as benchmarks. The basic algorithms underlying these methods can be described as follows.

Area-Weighting Method

Lam (1983) reported the area-weighting method as the simplest algorithm for performing areal interpolation. Albeit very straightforward, it is a volume-preserving algorithm that conserves the total value within each zone. This method can be formulated in the following steps. First, the density $D_s$ (e.g., population density) of a variable (e.g., population) in each source zone is calculated as:

$$D_s = \frac{U_s}{A_s} \tag{1}$$

where $U_s$ and $A_s$ refer to the value (e.g., the population count) and area of source zone $s$ (e.g., the area of a census tract), respectively. Then the total number for a target zone $t$ ($V_t$) (e.g., a ZIP code tabulation area) is estimated by performing a weighted summation of the density values of all the source zones falling in the target zone using the overlapping area between the source zone $s$ and target zone $t$ ($A_{ts}$) as the weight:

$$V_t = \sum A_{ts} D_s \tag{2}$$

where $A_{ts}$ is the overlapping area between the source zone $s$ and target zone $t$. The area-weighting method can be implemented in both vector and raster GIS environments. From Equation 1, we can see that this algorithm is based on the assumption that the density within each source zone is a constant. This method is probably the only choice when additional information is unavailable for the interpolation area, even though the assumption of homogeneity is rarely met in the real world (Xie 1995).
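Equations 1 and 2 can be illustrated with a minimal Python sketch. The function name `area_weighting`, the dictionary layout, and the toy zone values are assumptions made for this illustration; in practice the overlap areas $A_{ts}$ would come from a GIS overlay of the two zonal systems.

```python
def area_weighting(source_values, source_areas, overlap_areas):
    """Estimate target-zone totals with Equations 1 and 2.

    source_values -- {source_id: U_s}
    source_areas  -- {source_id: A_s}
    overlap_areas -- {(target_id, source_id): A_ts}
    (hypothetical data layout chosen for this sketch)
    """
    # Equation 1: one constant density per source zone.
    density = {s: source_values[s] / source_areas[s] for s in source_values}

    # Equation 2: sum overlap area times source density for each target zone.
    targets = {}
    for (t, s), a_ts in overlap_areas.items():
        targets[t] = targets.get(t, 0.0) + a_ts * density[s]
    return targets


# Toy example: two source zones split between two target zones.
source_values = {"s1": 1000.0, "s2": 600.0}
source_areas = {"s1": 10.0, "s2": 4.0}
overlap_areas = {("t1", "s1"): 6.0, ("t2", "s1"): 4.0, ("t2", "s2"): 4.0}

estimates = area_weighting(source_values, source_areas, overlap_areas)
```

Because the overlap areas of each source zone add up to its full area in this toy data, the target estimates sum to the source total, which is the volume-preserving property described above.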

Pycnophylactic Method

The pycnophylactic method was proposed by Tobler (1979) for isopleth mapping and was applied to areal interpolation by Goodchild and Lam (1980). This method was developed in a raster GIS calculation environment. It first computes the density of the variable within each source zone. The computation can be formulated as:

$$D_{si} = \frac{U_s}{N_s} \tag{3}$$

where $D_{si}$ is the value for cell $i$, and $U_s$ and $N_s$ are the value and the number of cells within source zone $s$, respectively. This process is the same as the area-weighting approach. The pycnophylactic approach, however, then smooths the density for each cell to combine the impacts of the adjacent neighbors on its grid value by using the following equation:

$$D'_{si} = \frac{\sum_{i=1}^{N_n} D_{si}}{N_n} \tag{4}$$

where $N_n$ is the number of neighbors of cell $i$. The smoothed density $D'_{si}$ for each cell is often computed as the average of its four orthogonal neighbors or all of its immediate eight neighbors in the 3-by-3 smoothing window. Theoretically, however, the size of the smoothing window can be customized with any odd or even number based on the characteristics of the underlying data. To keep the volume-preserving property or to meet the pycnophylactic condition, each $D'_{si}$ is adjusted with an iterative procedure that continues until there is either no significant difference between the estimated values and actual values within the source zones or there have been no significant changes of cell values from the previous iteration. To save computation time, Mennis (2003) and Langford (2006) suggested the following one-step approach to avoid multiple iterations of calculation:

$$D''_{si} = D'_{si} \, \frac{U_s}{\sum_{i=1}^{N_s} D'_{si}} \tag{5}$$

The interpolated value for each target zone then can be expressed as:

$$V_t = \sum_{i=1}^{N_t} D''_{si} \tag{6}$$

where $N_t$ is the number of cells within target zone $t$.

By creating a smooth surface, this method does not confine itself to the homogeneity assumption of the area-weighting approach and offers some compromise between homogeneity and heterogeneity. It also conserves the original value of each source zone and is an improvement over the area-weighting method, but it makes no attempt to use the abundant information available in ancillary data.
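The one-step variant (Equations 3 through 6) can be sketched on a small raster with NumPy. This is an illustration under stated assumptions rather than Tobler's iterative procedure: the function name `pycnophylactic_one_step`, the four-neighbor window, and the handling of raster edges (edge values are replicated outward) are choices made for this sketch only.

```python
import numpy as np


def pycnophylactic_one_step(zone_ids, zone_values):
    """One-step smoothing and rescaling (Equations 3 to 5).

    zone_ids    -- 2-D integer array giving the source zone of each cell
    zone_values -- {zone_id: U_s}
    (hypothetical inputs for this sketch)
    """
    # Equation 3: spread each zone total evenly over its cells.
    density = np.zeros(zone_ids.shape, dtype=float)
    for s, u_s in zone_values.items():
        mask = zone_ids == s
        density[mask] = u_s / mask.sum()

    # Equation 4: average of the four orthogonal neighbors.
    # Assumption: cells on the raster edge reuse the edge value as neighbor.
    padded = np.pad(density, 1, mode="edge")
    smoothed = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0

    # Equation 5: rescale within each source zone to restore its total.
    adjusted = np.zeros_like(smoothed)
    for s, u_s in zone_values.items():
        mask = zone_ids == s
        adjusted[mask] = smoothed[mask] * u_s / smoothed[mask].sum()
    return adjusted


zone_ids = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1]])
zone_values = {0: 40.0, 1: 120.0}
surface = pycnophylactic_one_step(zone_ids, zone_values)

# Equation 6: a target zone total is the sum of the cells it covers.
target_ids = np.array([[0, 0, 0, 0],
                       [1, 1, 1, 1]])
target_totals = {t: surface[target_ids == t].sum() for t in (0, 1)}
```

In this toy raster the cells of the low-density zone that border the high-density zone end up with larger values than the cells farther away, while each source zone still sums to its original total.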

Dasymetric Methods Using 2-D Ancillary Data

Wright (1936), possibly the first to propose the dasymetric method, used topographic maps as ancillary data for estimating the population of Cape Cod to address the uneven distribution problem in areal interpolation. According to Hawley and Moellering (2005), Fisher and Langford (1995) were the first to publish a dasymetric areal interpolation method using 2-D land use data as control zones. If the land use information is simply the binary divide between areas related (e.g., residential) and unrelated (e.g., nonresidential) to the variable of interest (e.g., population), the method is called the binary dasymetric model. It is formulated by the following equations:

$$D_c = \begin{cases} \dfrac{U_s}{\sum A_{sc}} \\[2ex] 0 & \left(\text{if } \sum A_{sc} = 0\right) \end{cases} \tag{7}$$

$$V_t = \sum B_{tc} D_c \tag{8}$$

where $D_c$ is the density of the variable within the control zone, $U_s$ is the value of source zone $s$, and $A_{sc}$ is the overlapping area between the source zone $s$ and control zone $c$. $V_t$ is the interpolated value for target zone $t$, and $B_{tc}$ is the overlapping area between target zone $t$ and control zone $c$. This approach assumes that the variable might not be evenly distributed in the source zone but is evenly distributed in the control zones within a source zone.
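A raster version of Equations 7 and 8 can be sketched as follows. The function name `binary_dasymetric_raster`, the boolean `related_mask` (True for cells of the related land use class, e.g., residential), and the unit cell area are assumptions of this sketch; any land use classification producing such a mask would serve.

```python
import numpy as np


def binary_dasymetric_raster(source_ids, target_ids, related_mask, source_values):
    """Binary dasymetric estimate on a raster (Equations 7 and 8).

    source_ids, target_ids -- 2-D integer arrays of zone membership
    related_mask           -- 2-D boolean array, True for related land use
    source_values          -- {source_id: U_s}
    Each cell is assumed to cover one unit of area.
    """
    density = np.zeros(source_ids.shape, dtype=float)
    for s, u_s in source_values.items():
        cells = (source_ids == s) & related_mask
        n_cells = cells.sum()  # plays the role of sum(A_sc)
        if n_cells > 0:
            # Equation 7: spread U_s over the related cells only.
            density[cells] = u_s / n_cells
        # Otherwise the density stays 0, as in the second case of Equation 7.

    # Equation 8: add up the control-zone densities inside each target zone.
    return {int(t): float(density[target_ids == t].sum())
            for t in np.unique(target_ids)}


source_ids = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1]])
target_ids = np.array([[0, 0, 0, 0],
                       [1, 1, 1, 1]])
related_mask = np.array([[True, False, False, True],
                         [True, True, False, True]])
source_values = {0: 90.0, 1: 60.0}

estimates = binary_dasymetric_raster(source_ids, target_ids,
                                     related_mask, source_values)
```

Note that a source zone with no related cells contributes nothing to any target zone under Equation 7, so its value is not carried over in that case.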

The binary dasymetric method considers only two land use types (i.e., one related to the variable of interest, and the other not), but if more than two land use types are used, a regression model is required to derive the density for each land use type. The following equation gives the regression-based dasymetric method using land use control zone $c$ with $M$ classes (Langford, Maguire, and Unwin 1991; Langford and Unwin 1994; Yuan, Smith, and Limp 1997):

$$U_s = \sum_{m=1}^{M} A_{scm} D_{cm} \tag{9}$$

where $A_{scm}$ and $D_{cm}$ are the area and the density of land use class $m$ within source zone $s$, respectively. $D_{cm}$ is derived by using the ordinary least-squares fitting method with the corresponding matrix algebra expressed by Equations 10 and 11:

$$[U_{sn}]_{N \times 1} = [A_{snm}]_{N \times M} \times [D_m]_{M \times 1} \tag{10}$$

$$[D_m]_{M \times 1} = \left( [A_{snm}]_{M \times N} \, [A_{snm}]^{T}_{N \times M} \right)^{-1} \times [A_{snm}]_{M \times N} \times [U_{sn}]_{N \times 1} \quad (N \ge M) \tag{11}$$

All of the source zones will be involved in the derivation of the density of each class in the control zone when the global regression method is used. $N$ is the total number of source zones, and it must be greater than or equal to $M$.

The derived matrix $D_c$ on the left side of Equation 11 cannot be used directly for the target zones' estimations if the volume of each source zone needs to be conserved. To achieve volume preservation for each source zone, each $D_c$ needs to be adjusted. This preservation can be implemented with a rescaling operation similar to that in Mennis (2003) and Langford (2006):

$$D'_{cm} = D_{cm} \times \frac{U_s}{U'_s} \tag{12}$$

where $D'_{cm}$ is the new density estimation for class $m$ in source zone $s$, and $U'_s$ is the estimated value of source zone $s$ using Equation 10. $D'_{cm}$ is then used to aggregate the value of the target zone, in a similar manner to that of the binary dasymetric method.
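The regression step and the rescaling of Equation 12 can be sketched with NumPy. The function name `regression_dasymetric` and the toy areas and totals are assumptions of this sketch. It calls `np.linalg.lstsq` instead of forming the inverse in Equation 11 explicitly; both give the same least-squares solution when the area matrix has full column rank.

```python
import numpy as np


def regression_dasymetric(class_areas, source_values):
    """Fit class densities by least squares, then rescale per source zone.

    class_areas   -- N x M array, area of land use class m in source zone n
    source_values -- length-N array of source zone totals U_s
    (hypothetical inputs for this sketch)
    """
    # Equations 10 and 11: ordinary least-squares fit of the M densities.
    densities, *_ = np.linalg.lstsq(class_areas, source_values, rcond=None)

    # U'_s: the source zone totals implied by the fitted densities.
    fitted = class_areas @ densities

    # Equation 12: one scaling factor U_s / U'_s per source zone,
    # giving an N x M array of zone-specific class densities.
    scale = source_values / fitted
    rescaled = scale[:, np.newaxis] * densities[np.newaxis, :]
    return densities, rescaled


# Toy data: four source zones (N = 4), two land use classes (M = 2).
class_areas = np.array([[4.0, 1.0],
                        [2.0, 3.0],
                        [1.0, 5.0],
                        [3.0, 3.0]])
source_values = np.array([430.0, 250.0, 205.0, 355.0])

densities, rescaled = regression_dasymetric(class_areas, source_values)

# With the rescaled densities each source zone reproduces its own total.
reproduced = (class_areas * rescaled).sum(axis=1)
```

The global densities alone leave a residual in every source zone; after rescaling, the class densities differ from zone to zone but each zone total is conserved, which is the purpose of Equation 12.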

Road Network Method Using 1-D Ancillary Data

The road network-based intelligent method using 1-D road network data as ancillary information was developed by Xie (1995), who proposed three algorithms: the network length method, the network hierarchical weighting method, and the network house-bearing method. The network length method is the most straightforward and widely used approach because no detailed information on road classes and housing distributions along the roads is needed. The network length method is given as follows:

$$D_s = \begin{cases} \dfrac{U_s}{L_s} \\[2ex] 0 & (\text{if } L_s = 0) \end{cases} \tag{13}$$

$$V_t = \sum L_{ts} D_s \tag{14}$$

where $D_s$ is the value per unit length, $L_s$ is the total length of the segments within a source zone, and $L_{ts}$ is the total length of the network segments within the overlapping area between the source and target zone.
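Equations 13 and 14 follow the same pattern as area weighting, with road length replacing area as the weight. The sketch below assumes the segment lengths per source zone and per source-target overlap have already been measured in a GIS; the function name `network_length` and the toy numbers are hypothetical.

```python
def network_length(source_values, source_lengths, overlap_lengths):
    """Network length method (Equations 13 and 14).

    source_values   -- {source_id: U_s}
    source_lengths  -- {source_id: L_s}, total road length in the source zone
    overlap_lengths -- {(target_id, source_id): L_ts}
    (hypothetical data layout chosen for this sketch)
    """
    # Equation 13: value per unit road length, 0 where a zone has no roads.
    density = {}
    for s, u_s in source_values.items():
        length = source_lengths[s]
        density[s] = u_s / length if length > 0 else 0.0

    # Equation 14: weight each source density by the shared road length.
    targets = {}
    for (t, s), l_ts in overlap_lengths.items():
        targets[t] = targets.get(t, 0.0) + l_ts * density[s]
    return targets


source_values = {"s1": 1000.0, "s2": 600.0, "s3": 50.0}
source_lengths = {"s1": 20.0, "s2": 10.0, "s3": 0.0}
overlap_lengths = {("t1", "s1"): 8.0, ("t2", "s1"): 12.0, ("t2", "s2"): 10.0}

estimates = network_length(source_values, source_lengths, overlap_lengths)
```

The zone `s3` has no road segments, so the second case of Equation 13 applies and its value is not allocated to any target zone.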

The notion of incorporating relevant ancillary information for characterizing the heterogeneous nature of variable distributions in areal interpolation provides the rationale for the dasymetric method that uses 2-D land uses or 1-D road networks as control zones or lines. So far, most of the dasymetric approaches in the literature are used to interpolate only human population. For other types of variables, such as toxic gas concentration caused by point sources in pollution studies, these 2-D land uses or 1-D road networks are not always applicable, because they might not be directly relevant, or may even be completely unrelated, to the variable being interpolated. As discussed previously, spatial data with 2-D or 1-D ancillary information also often involve complex topological vector data structures, which demand greater computation time for areal interpolation calculation.

Point-Based Intelligent Method

We propose an alternative approach to areal interpolation based on 0-D point data. Compared with 2-D polygon and 1-D line data, point data have a much simpler data structure and do not require extra storage of topological information to represent spatial relationships. The algorithms developed in this approach employ primarily a simple data processing operation, namely, straight-line distance analysis, to estimate the density distribution. Similar to the pycnophylactic approach, the density thus derived can be smoothed if necessary. A unique feature of our point-based intelligent approach to areal interpolation is that it utilizes discrete ancillary point data. The proposed approach might appear similar to the centroid-based (Martin 1989) and geostatistics-based (Kyriakidis 2004) areal interpolation in that point data are employed, but it differs from them in nature. Unlike the simple centroid-based approach that utilizes the population information of the source zone as weight, the proposed approach is an intelligent method that integrates the existing point ancillary data with the population information, while preserving the total population count in the source zone. The geostatistics-based approach makes use of the systematic point data derived from the grid cells or image pixels to perform area-to-point interpolation using a cokriging procedure as suggested by the publications currently available, whereas our proposed approach employs discrete point data obtained from a GIS data set to conduct a point-to-area interpolation using a simple Euclidean distance function. In our intelligent point-based method, a linear or exponential equation is used to model the relationship between the density of the variable of interest and the location of the control points. The adoption of the linear or exponential functions is based on techniques from urban population density modeling that are widely cited in the literature. According to Martori and Surinach (2002), two classical models were often used to model population density in urban areas. One is based on the linear function proposed by Stewart (1947) and another is an exponential model proposed by Clark (1951). The detailed algorithms of this approach are presented here, first without a smoothing scheme and then with a smoothing scheme.

Without Smoothing SchemeThe proposed point-based approach assumesthat control points are locationally related tothe concentration of a variable of interest. Forexample, schools, supermarkets, and businesscenters are often close to population centersto best serve the residents around them. How-ever, toxic release inventory sites, landfills, andhazardous material processing plants are of-ten situated away from neighborhoods withhigh population densities so that the pollu-tion impacts on people can be minimized. Ourapproach aims to model these locational rela-tionships to characterize the underlying distri-bution of a variable of interest. The model canbe implemented in a vector or a raster envi-ronment. For simplicity, we used a raster en-vironment only as an example to explain ourapproach in this article, but the vector imple-mentation is very similar in principle. The den-sity of a variable can be estimated by this linearfunction:

$$D_{si} = a_s W_{si} \quad (W_{si} \in [0, 1]) \tag{15}$$

The density can also be characterized by a nonlinear exponential function in the form:

$$D_{si} = a_s e^{W_{si}} \quad (W_{si} \in [0, 1]) \tag{16}$$

where $D_{si}$ is the estimated density value for cell i within source zone s, $a_s$ is a constant parameter for source zone s, and $W_{si}$ is the weight of cell i within source zone s. $W_{si}$ is calculated based on the inverse distance weighting:

$$W_{si} = \left(1 - \frac{\lambda_{si}}{\lambda_{s\max}}\right)^{q} \quad (q \ge 1) \tag{17}$$

where $\lambda_{si}$ is the distance from cell i within source zone s to the closest control point, which is assumed to have the largest influence, although its dominance might not always be absolute. This control point does not have to be inside the source zone within which cell i is located. $\lambda_{s\max}$ is the maximum value of $\lambda_{si}$ over all the cells within source zone s; q is the power parameter that controls the degree of local influence. Higher q values suggest that the rate of change in density values is higher near a cell. The smaller the distance $\lambda_{si}$, the larger the influence this control point has. If the cell is at the furthest distance ($\lambda_{s\max}$), the weight ($W_{si}$) becomes 0, with no influence in determining the density value for this cell. The distance from each cell to its closest control point can be easily derived using simple GIS techniques such as the straight-line distance function in a raster environment or the point distance function in a vector environment. If the distribution of the variable is distance weighted instead of inverse distance weighted, so that faraway control points have a larger influence on the density than near ones, a different weighting strategy can be employed. For example, garbage processing plants are often far away from residential areas. Therefore, the population interpolation using garbage processing plants as control points will use the following formula for the weight parameter:

$$W_{si} = \left(\frac{\lambda_{si}}{\lambda_{s\max}}\right)^{q} \tag{18}$$
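The two weighting schemes in Equations 17 and 18 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' GIS implementation: cells are represented by hypothetical (x, y) centre coordinates, the function names `nearest_distance` and `cell_weights` are invented for illustration, and the handling of a zone whose maximum distance is zero is an assumption that the text does not address.

```python
import math

def nearest_distance(cell, control_points):
    """Straight-line distance from one cell centre to its closest control point."""
    return min(math.dist(cell, p) for p in control_points)

def cell_weights(cells, control_points, q=2.0, inverse=True):
    """Weights W_si for the cells of a single source zone.

    inverse=True  follows Equation 17: W = (1 - d / d_max) ** q
    inverse=False follows Equation 18: W = (d / d_max) ** q
    """
    distances = [nearest_distance(c, control_points) for c in cells]
    d_max = max(distances)
    weights = []
    for d in distances:
        # Assumption: if every cell coincides with a control point (d_max == 0),
        # the ratio is taken as 0; the source text does not discuss this case.
        ratio = d / d_max if d_max > 0 else 0.0
        weights.append((1.0 - ratio) ** q if inverse else ratio ** q)
    return weights

# Hypothetical toy zone: four cells in a row, one control point on the first cell.
zone_cells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
schools = [(0.0, 0.0)]
near_weights = cell_weights(zone_cells, schools, q=2, inverse=True)   # Eq. 17
far_weights = cell_weights(zone_cells, schools, q=2, inverse=False)   # Eq. 18
```

With the inverse form, the cell on the control point receives weight 1 and the farthest cell receives weight 0; the distance-weighted form reverses that ordering.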

The constant parameter $a_s$ in Equations 15 and 16 can be derived respectively with the following formulas:

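$$\left.\begin{aligned} D_{s1} &= a_s W_{s1} \\ D_{s2} &= a_s W_{s2} \\ &\;\;\vdots \\ D_{sN_s} &= a_s W_{sN_s} \end{aligned}\right\} \Rightarrow a_s = \frac{\sum_{i=1}^{N_s} D_{si}}{\sum_{i=1}^{N_s} W_{si}} = \frac{U_s}{\sum_{i=1}^{N_s} W_{si}} \tag{19}$$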

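$$\left.\begin{aligned} D_{s1} &= a_s e^{W_{s1}} \\ D_{s2} &= a_s e^{W_{s2}} \\ &\;\;\vdots \\ D_{sN_s} &= a_s e^{W_{sN_s}} \end{aligned}\right\} \Rightarrow a_s = \frac{\sum_{i=1}^{N_s} D_{si}}{\sum_{i=1}^{N_s} e^{W_{si}}} = \frac{U_s}{\sum_{i=1}^{N_s} e^{W_{si}}} \tag{20}$$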

where $U_s$ and $N_s$ are the value of source zone s and the number of cells within it, respectively. Finally, the value for a target zone is computed as the sum of $D_{si}$, similar to Equation 6. This approach is volume preserving, a property that most users generally prefer.
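To make the volume-preserving property concrete, the following Python sketch derives $a_s$ from Equations 19 and 20, computes the per-cell densities of Equations 15 and 16, and sums them by target zone. It is a minimal illustration under assumptions: the function names, the toy weights, and the target zone labels "A" and "B" are hypothetical, and the error raised for an all-zero weight vector is a choice made here, not something the text specifies.

```python
import math

def estimate_density(weights, zone_total, model="linear"):
    """Per-cell estimates D_si for one source zone with total U_s = zone_total."""
    if model == "linear":
        basis = list(weights)                    # Eq. 15: D = a_s * W
    elif model == "exponential":
        basis = [math.exp(w) for w in weights]   # Eq. 16: D = a_s * e^W
    else:
        raise ValueError("model must be 'linear' or 'exponential'")
    denominator = sum(basis)
    if denominator == 0:
        # Assumption: an all-zero weight vector is treated as an error here.
        raise ValueError("weights sum to zero; a_s is undefined")
    a_s = zone_total / denominator               # Eq. 19 or Eq. 20
    return [a_s * b for b in basis]

def target_zone_totals(densities, target_ids):
    """Sum the cell estimates that fall in each target zone."""
    totals = {}
    for d, t in zip(densities, target_ids):
        totals[t] = totals.get(t, 0.0) + d
    return totals

# Hypothetical zone with three cells and a population of 300.
weights = [1.0, 0.5, 0.0]
linear = estimate_density(weights, 300.0, model="linear")
exponential = estimate_density(weights, 300.0, model="exponential")
by_target = target_zone_totals(linear, ["A", "A", "B"])
```

Because $a_s$ is defined as the zone total divided by the sum of the weights (or of their exponentials), the cell estimates of either model always add back up to the source zone total.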

With Smoothing Scheme

The pycnophylactic areal interpolation proposed by Tobler (1979) used a smoothing density function to incorporate the effects of adjacent source zones. To make it possible to compare with Tobler’s method, we also implemented a similar smoothing scheme using a moving window. The smoothed density value for cell i within source zone s ($D'_{si}$) can be computed as the average of the original density values ($D_{si}$) derived from Equation 15 or 16 for all the neighboring cells in the smoothing window:

$$D'_{si} = \frac{\sum_{i=1}^{N_n} D_{si}}{N_n} \tag{21}$$

where $N_n$ is the total number of cells in the smoothing window (e.g., $N_n = 9$ for a 3-by-3 window). To meet the pycnophylactic condition, a rescaling procedure similar to Equation 5 is used to derive the adjusted density $D''_{si}$. The final interpolated value for any target zone can be computed as the sum of $D''_{si}$, which is similar to Equation 6. By using the smoothing scheme, the impacts of adjacent neighbors on each cell are taken into consideration, so that outliers resulting from the interpolation process can be reduced. It is also a volume-preserving method when the rescaling operation is applied.
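A moving-window smoothing step followed by rescaling might look like the Python sketch below. Two points are assumptions rather than statements from the text: windows are truncated at the raster edge (so fewer than nine cells are averaged there), and the rescaling is taken to be a proportional adjustment that restores each source zone total, which is how the pycnophylactic condition is commonly enforced. The function names and the 2-by-2 toy raster are hypothetical.

```python
def smooth(density, size=3):
    """Mean filter over a size-by-size window (Eq. 21).

    Assumption: windows are truncated at the raster edge.
    """
    rows, cols = len(density), len(density[0])
    r = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [density[m][n]
                      for m in range(max(0, i - r), min(rows, i + r + 1))
                      for n in range(max(0, j - r), min(cols, j + r + 1))]
            out[i][j] = sum(window) / len(window)
    return out

def rescale_to_zone_totals(smoothed, zone_ids, zone_totals):
    """Proportionally rescale smoothed values so each source zone keeps its total.

    Assumption: this stands in for the rescaling of Equation 5, which is not
    reproduced in this section.
    """
    sums = {}
    for row_d, row_z in zip(smoothed, zone_ids):
        for d, z in zip(row_d, row_z):
            sums[z] = sums.get(z, 0.0) + d
    return [[d * zone_totals[z] / sums[z] if sums[z] else 0.0
             for d, z in zip(row_d, row_z)]
            for row_d, row_z in zip(smoothed, zone_ids)]

# Hypothetical 2-by-2 raster: the top row is zone 1, the bottom row is zone 2.
density = [[4.0, 0.0], [3.0, 3.0]]
zones = [[1, 1], [2, 2]]
totals = {1: 4.0, 2: 6.0}
adjusted = rescale_to_zone_totals(smooth(density), zones, totals)
```

In this toy case smoothing spreads the values evenly across the raster, and the rescaling step then pulls each zone back to its original total.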

Case Study

The typical application of areal interpolation is often the residential population estimation from one areal unit to another. The U.S. Census Bureau provides population data for a wide range of areal units, an ideal data set to test and evaluate alternative methods.

Table 1 Descriptive statistics of population in the source zone

| Statistic | Population |
|---|---|
| Minimum | 1,832 |
| Maximum | 11,868 |
| Mean (M) | 5,784 |
| SD | 2,058 |
| Total | 491,675 |

We performed areal interpolation on the population data for a suburbanized county, Collin County, Texas, as our experimental area. As part of the Dallas/Fort Worth metroplex, this area has been experiencing explosive growth in recent years. The U.S. Census Bureau estimated that the population increased about 48.6 percent from 2000 to 2007 in this region, which includes eighty-five census tracts, thirty-two ZIP code tabulation areas, 196 schools, and 31,951 road network segments (Figure 1). In the experiment, census tracts were used as source zones, and ZIP code tabulation areas were used as target zones. This is a suitable framework for an interpolation experiment, because the units are not nested but rather the source zones and target zones intersect, a situation often encountered in applied demographic work. The surveyed population values were available for both source zones and target zones, which were used to derive areal interpolation results and to evaluate model accuracy. The descriptive statistics of the population for the source zone are listed in Table 1. In 2000, there were 491,675 people living in this county. Among all census tracts, the minimum population was 1,832, the maximum was 11,868, and the mean was 5,784. Land use data from the North Central Texas Council of Governments were employed to evaluate the dasymetric method using 2-D ancillary information. To evaluate the binary dasymetric model, these land use data were grouped into two classes: residential regions and nonresidential regions. To test the dasymetric regression model, they were regrouped into three classes: high residential regions, low residential regions, and nonresidential regions. For the network-based intelligent approach, TIGER/Line network data were used as 1-D ancillary data.

To examine our proposed point-based intelligent method, we chose to use schools as the control points. The site of a school is often selected based on a variety of factors, such as location, environment, safety, and zoning, among which the location and environment factors are regarded as the most important. School sites are always located to minimize the travel distance for most students, and they must be away from sources of noise, air, water, and soil pollution, where population concentration is also sparse. Because schools are usually located close to the concentration centers of the population and serve as community centers, their locations provide ancillary information for residential population distribution (California Department of Education 1998; Georgia Department of Education Facilities Services Unit 2003; Ewing, Schroeer, and Greene 2004; Springer 2007).

To evaluate the results of these methods, five error measures of global fit were used: mean absolute error (MAE), mean absolute percentage error (MAPE; Goodchild and Lam 1980), population-weighted mean absolute percentage error (PWMAPE; Qiu, Woller, and Briggs 2003), root mean squared error (RMSE), and adjusted RMSE (Adj-RMSE; Gregory 2000). To assess the accuracy at the individual zone level, we also calculated the absolute percentage error (APE) and reported the range between the minimum and maximum APE for each method, so that the stability of the interpolation models can be evaluated.

In the literature, researchers usually used only one or two of these error measures. For example, Fisher and Langford (1995) used RMSE to quantify the error introduced by various areal interpolation methods and Gregory (2000, 2002a, 2002b) used Adj-RMSE to examine the accuracy of several areal interpolation techniques suitable for use with historical data. The use of different error measures in these studies makes comparison of their results difficult. We therefore calculated all error measures for this experimental area, with the hope that future studies can use this as a benchmark for comparison of different areal interpolation approaches. These measures are calculated as follows:

$$\mathrm{MAE} = \frac{\sum_{t=1}^{N_t} |V'_t - V_t|}{N_t} \tag{22}$$


Figure 1 The experimental area, Collin County in Texas, and related data. The top map is the census tract (source zone) boundary and ZIP code tabulation area (target zone) boundary, and the bottom map is the residential area and TIGER/Line data.


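$$\mathrm{MAPE} = 100 \times \frac{\sum_{t=1}^{N_t} \frac{|V'_t - V_t|}{V_t}}{N_t} \tag{23}$$

$$\mathrm{PWMAPE} = 100 \times \frac{\sum_{t=1}^{N_t} V'_t \frac{|V'_t - V_t|}{V_t}}{\sum_{t=1}^{N_t} V'_t} \tag{24}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{N_t} (V'_t - V_t)^2}{N_t}} \tag{25}$$

$$\mathrm{Adj\text{-}RMSE} = \sqrt{\frac{\sum_{t=1}^{N_t} \left(\frac{V'_t - V_t}{V_t}\right)^2}{N_t}} \tag{26}$$

$$\mathrm{APE} = 100 \times \frac{|V'_t - V_t|}{V_t} \tag{27}$$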

where $V_t$ and $V'_t$ are the original and interpolated value of each target zone, respectively, and $N_t$ is the number of target zones in the study area.
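The error measures in Equations 22 through 27 translate directly into code. The Python sketch below is one possible rendering; the function name `error_measures`, the dictionary keys, and the two-zone toy values are hypothetical, and the APE range is computed as the difference between the maximum and minimum APE, which is how the single "APE Range" figure is interpreted here.

```python
import math

def error_measures(observed, estimated):
    """Global and zone-level error measures for paired target zone values.

    observed  holds the original values V_t
    estimated holds the interpolated values V'_t
    """
    n = len(observed)
    abs_err = [abs(e - o) for o, e in zip(observed, estimated)]
    ape = [100.0 * a / o for a, o in zip(abs_err, observed)]       # Eq. 27
    return {
        "MAE": sum(abs_err) / n,                                    # Eq. 22
        "MAPE": sum(ape) / n,                                       # Eq. 23
        # Eq. 24: APE weighted by the estimated population of each zone.
        "PWMAPE": sum(e * p for e, p in zip(estimated, ape)) / sum(estimated),
        "RMSE": math.sqrt(sum(a * a for a in abs_err) / n),         # Eq. 25
        "Adj-RMSE": math.sqrt(
            sum((a / o) ** 2 for a, o in zip(abs_err, observed)) / n),  # Eq. 26
        # Assumption: the range is reported as max(APE) - min(APE).
        "APE range": max(ape) - min(ape),
    }

# Hypothetical two-zone example.
result = error_measures(observed=[100.0, 200.0], estimated=[110.0, 180.0])
```

Note that every measure except MAE and RMSE divides by the observed value, so target zones with an observed value of zero would need separate handling.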

Some of these measures might be affected by the extremes of the population base of the target zones. For example, MAE and RMSE are in the unit of population count and therefore are heavily impacted by the zones with a large population base, because they tend to have a large prediction error in count (Gregory 2000). On the contrary, MAPE and Adj-RMSE, in the unit of percentage, are often disproportionately affected by the zones with a small population base, because the same amount of error in count will lead to a larger percentage error when the population base is small. PWMAPE uses estimated population as a weight to normalize the absolute percentage error, which has the advantage of eliminating the effects of extreme population bases. This is believed to be the most objective error measure that is not heavily impacted by either a high or a low population base (Qiu, Woller, and Briggs 2003).

We tested our point-based intelligent method using 0-D school ancillary data, first with the linear function without a smoothing scheme (AI6), and then with a smoothing scheme using a 3-by-3 window (AI7), followed by the exponential function without smoothing (AI8) and with smoothing (AI9). The power parameter q controls the degree of local influence. Like the power parameter in inverse distance-weighted (IDW) point interpolation, this parameter is often set empirically on a trial-and-error basis. Based on several experiments using values of 0.5, 1, 2, 3, and 4, we found that a q value of 2 achieved the best result, and we report only the results for q = 2 in this article.
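The trial-and-error selection of q can be sketched as a small sweep over candidate values, scoring each by MAPE against known target zone totals. Everything in this Python sketch is hypothetical and only illustrates the procedure: the one-zone toy layout, the observed totals, and the function names are invented, the linear model with inverse distance weighting is assumed, and the best q on these toy numbers says nothing about the best q for the Collin County data.

```python
import math

def interpolate(cells, control_points, zone_total, target_ids, q):
    """Linear model (Eqs. 15, 17, 19) for one source zone, summed by target zone."""
    d = [min(math.dist(c, p) for p in control_points) for c in cells]
    d_max = max(d) or 1.0            # guard against a zero maximum distance
    w = [(1.0 - di / d_max) ** q for di in d]
    a_s = zone_total / sum(w)
    totals = {}
    for wi, t in zip(w, target_ids):
        totals[t] = totals.get(t, 0.0) + a_s * wi
    return totals

def mape(observed, estimated):
    """Mean absolute percentage error over target zones (Eq. 23)."""
    return 100.0 * sum(abs(estimated[t] - observed[t]) / observed[t]
                       for t in observed) / len(observed)

def sweep_q(cells, control_points, zone_total, target_ids, observed,
            candidates=(0.5, 1, 2, 3, 4)):
    """Score every candidate q and return the best one with all scores."""
    scores = {q: mape(observed, interpolate(cells, control_points,
                                            zone_total, target_ids, q))
              for q in candidates}
    return min(scores, key=scores.get), scores

# Hypothetical data: four cells in a row, two target zones, one control point.
cells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
schools = [(0.0, 0.0)]
target_ids = ["A", "A", "B", "B"]
observed = {"A": 85.0, "B": 15.0}
best_q, scores = sweep_q(cells, schools, 100.0, target_ids, observed)
```

In practice the sweep would run over all source zones at once and could score candidates with any of the error measures above; MAPE is used here only to keep the sketch short.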

The overall global accuracy of each method is shown in Table 2. In general, the intelligent methods using ancillary data outperformed the area-weighting and pycnophylactic methods, which do not use ancillary data, as expected. The pycnophylactic method (AI2) had a similar result to the area-weighting approach (AI1). In terms of MAPE, the binary dasymetric method (AI3) generated the best result. The regression dasymetric method with three-class land use data (AI5) produced a poor result, which agrees with the conclusions drawn by Langford (2006), who suggested that the benefits of the three-class dasymetric model over a binary model are inconclusive. The network length method (AI4) provided the best result in terms of MAE and PWMAPE. These results are consistent with those obtained by other researchers. Hawley and Moellering (2005) conducted a systematic comparison of four approaches: the area-weighting method, the pycnophylactic method, the binary dasymetric method, and the road network approach. Their results obtained from the population interpolation for three counties showed that the road network approach was the best among these four approaches. Reibel and Bufalino (2005) also suggested that the road network method is a promising areal interpolation technique because it produces more consistent errors when applied to variables at different areal units in a given study area.

Table 2 Population areal interpolation errors of different methods

| Measure | AI1 | AI2 | AI3 | AI4 | AI5 | AI6 | AI7 | AI8 | AI9 |
|---|---|---|---|---|---|---|---|---|---|
| MAE | 1,936 | 1,998 | 1,781 | 1,359 | 2,398 | 1,388 | 1,423 | 1,655 | 1,747 |
| MAPE | 13.5% | 13.8% | 9.8% | 10.5% | 11.6% | 10.1% | 10.2% | 11.9% | 12.3% |
| PWMAPE | 7.2% | 7.5% | 7.4% | 5.5% | 10.0% | 5.7% | 5.8% | 6.4% | 6.8% |
| RMSE | 2,760 | 2,718 | 2,436 | 1,916 | 2,907 | 1,639 | 1,685 | 2,253 | 2,300 |
| Adj RMSE | 0.18 | 0.18 | 0.13 | 0.15 | 0.15 | 0.13 | 0.13 | 0.16 | 0.16 |
| APE Range | 43.98% | 44.71% | 34.21% | 34.43% | 34.20% | 27.97% | 25.74% | 32.95% | 34.42% |

Note: AI1 = area-weighting method; AI2 = pycnophylactic method using a 3-by-3 smoothing window; AI3 = binary dasymetric method using 2-D land use ancillary data; AI4 = network length method using 1-D TIGER/Line ancillary data; AI5 = three-class regression-based dasymetric method using 2-D land use ancillary data; AI6 = point-based intelligent method using 0-D school ancillary data, linear function, without smoothing scheme, and q = 2; AI7 = point-based intelligent method using 0-D school ancillary data, linear function, smoothing scheme with a 3-by-3 window, and q = 2; AI8 = point-based intelligent method using 0-D school ancillary data, exponential function, without smoothing scheme, and q = 2; AI9 = point-based intelligent method using 0-D school ancillary data, exponential function, smoothing scheme with a 3-by-3 window, and q = 2; MAE = mean absolute error; MAPE = mean absolute percentage error; PWMAPE = population-weighted mean absolute percentage error; RMSE = root mean squared error; Adj RMSE = adjusted root mean squared error; APE = absolute percentage error.

When comparing the point-based methods with the preceding algorithms, we can see that our methods (AI6 and AI7) outperformed the area-weighting (AI1) and pycnophylactic (AI2) methods, suggesting that the incorporation of schools as control points did help to reflect the population distribution information for each source zone. In terms of MAE, PWMAPE, and RMSE, our methods outperformed the dasymetric methods (AI3 and AI5) using land use as ancillary data and achieved an accuracy comparable with that of the network length method (AI4). Better accuracy might be achieved by using fire stations or emergency centers that are closer to the population concentration centers. Some schools might be located at the boundary or away from the residential area, which can explain some errors in the interpolation. We did not find much difference between the results of our methods using the smoothing (AI7) and nonsmoothing (AI6) schemes based on the total error measures. Population was also interpolated using the exponential function with the two schemes (AI8 and AI9). The results illustrated that they are comparable with that of the binary dasymetric method but performed more poorly than the linear function model, suggesting that the linear function model was more suitable in this case. By examining the range of APE of each method, we can see that the proposed approach had a smaller range of APE, especially the smoothed method with the linear model, implying its stability in performance across all individual zones.

The computational cost in seconds for each method was obtained on a PC with a 3.4 GHz Pentium CPU and 1 GB of RAM (Table 3). We can see that the point-based models (AI6–AI9) are much more efficient than the 1-D and 2-D dasymetric methods (AI3–AI5) and are comparable to the simple approaches (AI1–AI2), which makes the 0-D intelligent approach a much more attractive and effective alternative when areal interpolation is needed for large areas. During our experiments, we also noticed that the 2-D dasymetric model and 1-D network-based method were unstable when executed using ArcGIS 9.3, resulting in many crashes of the system. The crashes were not always consistent, in that the same operation might work at one time but not at another, or work fine on one computer but not on another. Because map overlay operations are now standard procedures for GIS, the crashes might reflect the fact that the system implementing the procedures might not be stable when a large quantity of segments, polygons, or raster cells are involved, suggesting a demand for high computer performance for the execution of the overlay procedures and therefore the same for the overlay-based 2-D dasymetric and 1-D network methods.

Table 3 Computational cost of each areal interpolation method

| Algorithm | CPU time (seconds) |
|---|---|
| AI1 | 18 |
| AI2 | 36 |
| AI3 | 148 |
| AI4 | 158 |
| AI5 | 226 |
| AI6 | 30 |
| AI7 | 63 |
| AI8 | 49 |
| AI9 | 83 |

Note: The computer used had a 3.4 GHz Pentium processor with 1 GB RAM. AI1 = area-weighting method; AI2 = pycnophylactic method using a 3-by-3 smoothing window; AI3 = binary dasymetric method using 2-D land use ancillary data; AI4 = network length method using 1-D TIGER/Line ancillary data; AI5 = three-class regression-based dasymetric method using 2-D land use ancillary data; AI6 = point-based intelligent method using 0-D school ancillary data, linear function, without smoothing scheme, and q = 2; AI7 = point-based intelligent method using 0-D school ancillary data, linear function, smoothing scheme with a 3-by-3 window, and q = 2; AI8 = point-based intelligent method using 0-D school ancillary data, exponential function, without smoothing scheme, and q = 2; AI9 = point-based intelligent method using 0-D school ancillary data, exponential function, smoothing scheme with a 3-by-3 window, and q = 2.

Summary and Discussions

In this article, a point-based method was proposed for the areal interpolation problem using 0-D point data as ancillary information. Compared with 2-D dasymetric methods and 1-D road network-based intelligent methods that use complex data structures and rely on time-consuming GIS overlay operations, the proposed approach was theoretically more efficient because it employed 0-D data without the requirement of topological structure and was implemented with a simple straight-line distance GIS operation. An experimental study in a suburban area illustrated that the proposed areal interpolation approaches took a similar amount of computation time but generated much better accuracy than the simple approaches that do not make use of ancillary data. It also produced results comparable with those using 1-D and 2-D ancillary data but with tremendous savings in computational cost. This suggests that the point-based intelligent method might serve as an efficient and effective alternative to the current simple and intelligent areal interpolation techniques, especially in applications with a large study area. The point-based method also has the potential to be used for other types of data; for example, pollutant interpolation using 0-D hazardous sites. Lam (1983) stated that the choice of an appropriate areal interpolation method should depend largely on three factors: the type of variable being interpolated, the degree of accuracy desired, and the amount of computational effort afforded. Given the availability of a variety of digital point data sources, the simplicity of the methodology, as well as the comparable accuracy with the intelligent approaches using 1-D and 2-D ancillary data and the amount of computational time saved in our experiment, we concluded that the point-based methods can provide additional solutions to the areal interpolation problems when the choices are limited by various factors.

The encouraging results from our case study do not imply the ultimate superiority of the point-based intelligent method over other approaches being compared under all circumstances, however. The fact that different types of ancillary data are required for different approaches did not really make them comparable in a strict sense. Similar to other intelligent approaches, the accuracy of the point-based methods heavily depends on the relative sizes of the source and target zones; the quality, suitability, and strength of relevance of the control points to the variable of interest; as well as the model employed and the associated parameters specified. Sadahiro (1999, 2000) investigated the accuracy of various areal interpolation approaches and asserted that estimation accuracy is improved when source zones are relatively small compared with the target zones; circular and square zones can achieve better results than rectangular zones, and the estimates become more inaccurate as the shapes of cells and target zones become elongated. Inappropriate ancillary information that is not strongly associated with the variable of interest might actually reduce the accuracy of the areal interpolation. More rigorous studies are needed to further investigate these issues in the future.

Literature Cited

ArcGIS, Version 9.3. Redlands, CA: ESRI.

Bracken, I., and D. Martin. 1989. The generation of spatial population distribution from census centroid data. Environment and Planning A 21:537–43.

California Department of Education. 1998. Site selection criteria. Sacramento, CA: California Department of Education. http://www.cde.ca.gov/ls/fa/sf/documents/sitecrit.pdf (last accessed 8 January 2011).

Clark, C. 1951. Urban population densities. Journal of the Royal Statistical Society 114:490–96.

Deichmann, U. 1996. A review of spatial population database design and modelling. Technical Report 96–3. Santa Barbara, CA: National Center for Geographic Information and Analysis.

Eicher, C., and C. A. Brewer. 2001. Dasymetric mapping and areal interpolation: Implementation and evaluation. Cartography and Geographic Information Science 28:125–38.

Ewing, R., W. Schroeer, and W. Greene. 2004. School location and student travel: Analysis of factors affecting mode choice. Transportation Research Record 1895:55–63.

Fisher, P. F., and M. Langford. 1995. Modelling the errors in areal interpolation between zonal systems by Monte Carlo simulation. Environment and Planning A 27:11–24.


———. 1996. Modeling sensitivity to accuracy in classified imagery: A study of areal interpolation by dasymetric mapping. The Professional Geographer 48:299–309.

Flowerdew, R. 1988. Statistical methods for areal interpolation: Predicting count data from a binary variable. Research Report No. 15, Northern Regional Research Laboratory, Newcastle upon Tyne, UK.

Flowerdew, R., and M. Green. 1991. Data integration: Statistical methods for transferring data between zonal systems. In Handling geographic information: Methodology and potential applications, ed. I. Masser and M. B. Blakemore, 38–53. London: Longman Scientific and Technical.

Georgia Department of Education Facilities Services Unit. 2003. A guide to school site selection. Atlanta, GA: Georgia Department of Education. http://www.doe.k12.ga.us/documents/schools/facilities/site2 a.pdf (last accessed 8 January 2011).

Goodchild, M. F., L. Anselin, and U. Deichmann. 1993. A framework for the areal interpolation of socioeconomic data. Environment and Planning A 25:383–97.

Goodchild, M. F., and N. S.-N. Lam. 1980. Areal interpolation: A variant of the traditional spatial problem. Geo-Processing 1:297–312.

Gregory, I. N. 2000. An evaluation of the accuracy of the areal interpolation of data for the analysis of long-term change in England and Wales. Paper presented at the Fifth International Conference on GeoComputation, Greenwich, UK.

———. 2002a. The accuracy of areal interpolation techniques: Standardizing 19th and 20th century census data to allow long-term comparisons. Computers, Environment and Urban Systems 26:293–314.

———. 2002b. Time-variant GIS databases of changing historical administrative boundaries: A European comparison. Transactions in GIS 6:161–78.

Harris, R., and Z. Chen. 2005. Giving dimension to point locations: Urban density profiling using population surface models. Computers, Environment and Urban Systems 29:115–32.

Harvey, J. T. 2002a. Population estimation models based on individual TM pixels. Photogrammetric Engineering & Remote Sensing 68:1181–92.

———. 2002b. Estimating census district populations from satellite imagery: Some approaches and limitation. International Journal of Remote Sensing 23:2071–95.

Hawley, K., and H. Moellering. 2005. A comparative analysis of areal interpolation methods. Cartography and Geographic Information Science 32:411–23.

Holt, J. B., C. P. Lo, and T. W. Hodler. 2004. Dasymetric estimation of population density and areal interpolation of census data. Cartography and Geographic Information Science 31:103–21.

Kyriakidis, P. C. 2004. A geostatistical framework for area-to-point spatial interpolation. Geographical Analysis 36:259–89.

Kyriakidis, P. C., and E.-H. Yoo. 2005. Geostatistical prediction and simulation of point values from areal data. Geographical Analysis 37:124–51.

Lam, N. S.-N. 1983. Spatial interpolation methods: A review. The American Cartographer 10:129–49.

Langford, M. 2006. Obtaining population estimates in non-census reporting zones: An evaluation of the 3-class dasymetric method. Computers, Environment and Urban Systems 30:161–80.

Langford, M., D. J. Maguire, and D. J. Unwin. 1991. The areal interpolation problem: Estimating population using remote sensing in a GIS framework. In Handling geographical information, ed. I. Masser and M. Blakemore, 55–77. London: Longman.

Langford, M., and D. J. Unwin. 1994. Generating and mapping population density surfaces within a geographical information system. The Cartographic Journal 31:21–26.

Liu, X. H., P. C. Kyriakidis, and M. F. Goodchild. 2008. Population-density estimation using regression and area-to-point residual kriging. International Journal of Geographical Information Science 22:431–47.

Markoff, J., and G. Shapiro. 1973. The linkage of data describing overlapping geographical units. Historical Methods Newsletter 7:34–46.

Martin, D. 1989. Mapping population data from zone centroid locations. Transactions of the Institute of British Geographers 14:90–97.

———. 1996. An assessment of surface and zonal models of population. International Journal of Geographical Information Systems 10:973–89.

Martori, J. C., and J. Surinach. 2002. Classical models of urban population density. The case of the Barcelona metropolitan area. In Urban regions: Governing interacting, economic, housing and transport systems, ed. J. Van Dick, P. Elhorst, J. Oosterhaven, and J. E. Wever, 109–23. Utrecht, The Netherlands: Koninklijk Nederlands Aardrijkskundig Genootschap, Universiteit Utrecht.

Mennis, J. 2003. Generating surface models of population using dasymetric mapping. The Professional Geographer 55:31–42.

Mennis, J., and T. Hultgren. 2006. Intelligent dasymetric mapping and its application to areal interpolation. Cartography and Geographic Information Science 33:179–94.

Qiu, F., K. L. Woller, and R. Briggs. 2003. Modeling urban population growth from remote sensed imagery and TIGER GIS road data. Photogrammetric Engineering & Remote Sensing 69:1031–42.

Reibel, M., and A. Agrawal. 2007. Areal interpolation of population counts using pre-classified land cover data. Population Research and Policy Review 26:619–33.


Reibel, M., and M. E. Bufalino. 2005. Street-weighted interpolation techniques for demographic count estimation in incompatible zone systems. Environment and Planning A 37:127–39.

Sadahiro, Y. 1999. Accuracy of areal interpolation: A comparison of alternative methods. Journal of Geographical Systems 1:323–46.

———. 2000. Accuracy of count data estimated by the point in polygon method. Geographical Analysis 32:64–89.

Springer, D. 2007. Integrating schools into healthy community design. Issue Brief. Washington, DC: National Governor’s Association. http://www.nga.org/Files/pdf/0705schoolshealthydesign.pdf (last accessed 8 January 2011).

Stewart, J. Q. 1947. Empirical mathematical rules concerning the distribution and equilibrium of population. Geographical Review 37:461–85.

Turner, A., and S. Openshaw. 2001. Disaggregative spatial interpolation. Paper presented at GISRUK 2001, Glamorgan, Wales.

Tobler, W. 1979. Smooth pycnophylactic interpolation for geographical regions. Journal of the American Statistical Association 74:519–30.

Tobler, W., U. Deichmann, J. Gottsegen, and K. Maloy. 1995. The Global Demography Project. Technical Report TR–95–6, National Center for Geographic Information and Analysis, Santa Barbara, CA.

Wright, J. 1936. A method of mapping densities of population: With Cape Cod as an example. Geographical Review 26:103–10.

Wu, C., and A. Murray. 2005. A cokriging method for estimating population density in urban areas. Computers, Environment and Urban Systems 29:558–79.

Xie, Y. 1995. The overlaid network algorithms for areal interpolation problem. Computers, Environment and Urban Systems 19:287–306.

Yuan, Y., R. M. Smith, and W. F. Limp. 1997. Remodeling census population with spatial information from Landsat TM imagery. Computers, Environment and Urban Systems 21:245–58.

CAIYUN ZHANG is an Assistant Professor in the Department of Geosciences at the Florida Atlantic University, Boca Raton, FL 33431. E-mail: [email protected]. Her research interests include spatial analysis and modeling, GIS application software development, remote sensing digital image processing, LiDAR, and ocean remote sensing.

FANG QIU is an Associate Professor in the Program of GIS at the University of Texas at Dallas, Richardson, TX 75083. E-mail: [email protected]. His research interests include remote sensing digital image processing, spatial analysis and modeling, GIS application software development, and Web-based mapping and information processing.
