
Alex de Sherbinin, Deputy Manager, NASA Socioeconomic Data and Applications Center

Center for International Earth Science Information Network, The Earth Institute, Columbia University, Palisades, New York, USA

Acknowledgements: This presentation borrows heavily from material prepared by Deborah Balk, formerly of CIESIN and currently at Baruch College, and Gregory Yetman of CIESIN.


• Short history of gridding population data
• Why grid?
• Gridded Population of the World (GPW) Methodology
• Global Rural Urban Mapping Project (GRUMP)
• US Census Grids
• Poverty Mapping
  – Gridded Infant Mortality Rate
  – Gridded Child Malnutrition


More attention to global scope

More attention to comparability

More attention to problem-oriented science

More attention to spatial frameworks


• US Census Bureau’s Global Population Database (early 1990s)
• Africa Population Grid (UNEP/GRID, 1991)
• GPW v1 and Global Demography Project (NCGIA & CIESIN, 1994)
• 1 degree global grid (Environment Canada, 1995)
• Europe (RIVM, 1995)
• Africa update and Asia (NCGIA, UNEP/GRID & WRI, 1996)
• Latin America (CIAT)
• HYDE (RIVM/Klein Goldewijk, 1997, 2001 and 2006)
• LandScan (ORNL, 1999 and onwards)
• GPW v2 (CIESIN et al., 2000)
• GRUMP alpha (CIESIN et al., 2004)
• GPW v3 (CIESIN et al., 2005)
• In the past decade there have been far more efforts than can be listed here


http://sedac.ciesin.columbia.edu/gpw/


• Find tabular information with attributes
  – E.g., Population counts
• Match to geographic boundaries
  – Administrative
  – Urban footprints
• Estimate
  – Population to the target years (1990, 1995 and 2000)
• Transform to grids

Statistics South Africa
Descriptive - South Africa by Province and Municipality
Table 1: Geography by Gender, for Person weighted

Geography levels: Province (PR_SA), District municipality (DC_PR_SA), Municipality (MN_PR_SA), Main place (MP_SA), Sub-place (SP_SA)

Code      Name                                  Male    Female  Total
3         Northern Cape                         401094  421636  822729
6         DC6: NAMAKWA District Municipality    53424   54687   108110
301       NC061: Richtersveld                   5170    4961    10130
30101     Alexander Bay                         723     729     1452
30101001  Alexander Bay Navel Base              30      12      42
30101000  Alexander Bay SP                      738     675     1413

NB: Spatially matched population census (and survey) data generally has several data providers!


• Population and boundary data must match
  – Best available & matchable data are used
• Matching the inputs to one another is not as easy as it might seem
  – Boundaries change often and come in different scales
  – Population data may not match boundaries

• We may have population values for different years at different levels (e.g., district-level one year, state-level another)
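The matching step can be sketched as a simple code comparison. This is a minimal illustration, not the procedure used for GPW: the function name `match_report`, the boundary codes, and the idea of matching on a shared unit code are all assumptions made for the example.

```python
def match_report(population_by_code, boundary_codes):
    """Compare unit codes in a population table with codes in a boundary file."""
    pop_codes = set(population_by_code)
    bnd_codes = set(boundary_codes)
    return {
        "matched": sorted(pop_codes & bnd_codes),
        "population_without_boundary": sorted(pop_codes - bnd_codes),
        "boundary_without_population": sorted(bnd_codes - pop_codes),
    }

# Illustrative population records keyed by unit code.
population = {"30101000": 1413, "30101001": 42}
# Hypothetical boundary file codes; "30101002" has no population record.
boundaries = ["30101000", "30101002"]

report = match_report(population, boundaries)
for key, codes in report.items():
    print(key, codes)
```

Units that land in either of the two unmatched lists are the ones that need manual work, such as tracing a boundary change or falling back to a coarser administrative level.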

• Clean boundaries
  – E.g., remove slivers
• Make them consistent across borders and coasts
  – Use international standard (the DCW) with exceptions
    • Europe: most spatial data supplied by one agency (SABE) and all international boundaries are internally consistent
    • For GPW v4 we plan to use ISciences’ Global Coastline v.1
  – Coastlines matched to DCW, except where much higher quality data are supplied
    • E.g., Indonesia
• Data table needs to include the same variables, with the same variable names, formats, etc.


• Places highlighted in yellow are new municipios
• Need to find where they came from & their pop size
• Use on-line atlases or newer maps, when available
• Add new pop to unit of origin or allocate old population to new unit proportionally


• Annual rate of change calculated:
  $r = \frac{\ln(P_2 / P_1)}{t}$

• Population estimates adjusted to target years:
  $P_x = P_2 e^{rt}$

Definitions
$r$ - Annual rate of growth
$P_1$, $P_2$ - Census estimates
$t$ - Number of years between census enumerations
$P_x$ - Estimate for target year $x$
$P_{un}$ - UN estimate
$P_{adj}$ - Adjusted estimate
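A short worked sketch of the two formulas may help. The census figures below are hypothetical, and the sketch assumes that when projecting to a target year, the exponent uses the number of years from the second census to the target year (negative if the target year precedes that census).

```python
import math

def annual_growth_rate(p1, p2, years_between):
    """r = ln(P2 / P1) / t, with t the years between the two censuses."""
    return math.log(p2 / p1) / years_between

def project_population(p2, r, years_from_p2):
    """Px = P2 * exp(r * t), with t the years from the P2 census to the target year (assumed)."""
    return p2 * math.exp(r * years_from_p2)

# Hypothetical unit: 100,000 people in a 1991 census, 121,000 in a 2001 census.
r = annual_growth_rate(100_000, 121_000, 2001 - 1991)

# Target year 2000 is one year before the second census, so t = -1.
p_2000 = project_population(121_000, r, 2000 - 2001)

# Projecting back ten years recovers the first census count.
p_1991 = project_population(121_000, r, 1991 - 2001)

print(f"r = {r:.4f}, P2000 = {p_2000:.0f}, P1991 = {p_1991:.0f}")
```

Because the rate is continuous, projecting back over the full inter-census interval returns the first census value, which is a convenient sanity check on the arithmetic.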


• Adjustment factor for matching national estimates to UN estimates calculated:
  $a = (P_{un} - P_x) / P_{un}$

• Adjustment factor applied at the national level:
  $P_{adj} = P_x \cdot a$

• Differences ranged from 20% under (Somalia, 1995) to 25% over (Jordan, 1990)

Definitions
$a$ - Adjustment factor
$P_x$ - Estimate for target year $x$ (1990 or 1995)
$P_{un}$ - UN estimate
$P_{adj}$ - Adjusted estimate
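The national adjustment can be illustrated with a small sketch. It assumes the intent is for the adjusted subnational estimates to sum to the UN national total. Here a is computed as defined above and the rescaling is applied as the multiplicative factor Pun / Px, which equals 1 / (1 − a). That scaling form, the function names, and the unit populations are assumptions for illustration and are not confirmed by the source.

```python
def adjustment_factor(p_un, p_x):
    """a = (Pun - Px) / Pun: relative gap between the UN and census-based national totals."""
    return (p_un - p_x) / p_un

def adjust_units(unit_pops, p_un):
    """Rescale every unit so the adjusted units sum to the UN national estimate.

    Assumed scaling: Pun / Px, which is the same as 1 / (1 - a).
    """
    p_x = sum(unit_pops.values())
    a = adjustment_factor(p_un, p_x)
    scale = 1.0 / (1.0 - a)
    return {name: pop * scale for name, pop in unit_pops.items()}

# Hypothetical country whose three units sum to 20% under the UN total.
units = {"A": 400_000.0, "B": 300_000.0, "C": 100_000.0}
p_un = 1_000_000.0

a = adjustment_factor(p_un, sum(units.values()))
adjusted = adjust_units(units, p_un)
print(f"a = {a:.2f}", adjusted)
```

Applying one national factor to every unit preserves each unit's share of the national population while changing the total.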


• Proportional allocation used to spread the population over grid cells
• Virtually all data work completed on vector data
  – Gridding is the last step
• National grids created, global grids assembled by adding national grids together
  – Country grids are created with collars so that they start and end on even degrees; therefore the assembly of the grids without interpolation is possible
  – Replacement of country-specific grids feasible
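The collar idea can be sketched as snapping a country's bounding box outward to whole degrees, so that every national grid shares cell edges with the global grid. This is a sketch under assumptions: "even degrees" is read here as whole degrees, and the bounding box and the 24 cells per degree are hypothetical values chosen for the example.

```python
import math

def collar_extent(min_lon, min_lat, max_lon, max_lat):
    """Expand a bounding box outward so that it starts and ends on whole degrees."""
    return (math.floor(min_lon), math.floor(min_lat),
            math.ceil(max_lon), math.ceil(max_lat))

def grid_shape(extent, cells_per_degree):
    """Rows and columns of a grid covering a whole-degree extent."""
    west, south, east, north = extent
    rows = (north - south) * cells_per_degree
    cols = (east - west) * cells_per_degree
    return rows, cols

# Hypothetical country bounding box in decimal degrees.
extent = collar_extent(16.45, -34.84, 32.89, -22.13)
shape = grid_shape(extent, cells_per_degree=24)
print(extent, shape)
```

With whole-degree extents and a cell size that divides a degree evenly, national grids line up cell for cell, so they can be added into a global grid or swapped out individually without resampling.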


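Worked example (figure): proportional allocation for one administrative unit
Land Area: 458.4 square km; density: 628.5 persons per sq. km
• Area 2.6 sq. km: Pop = 628.5 persons per sq. km * 2.6 = 1,634.1 persons
• Area 16.1 sq. km: Pop = 628.5 persons per sq. km * 16.1 = 10,118 persons
• Area 0.05 sq. km: Pop = 628.5 persons per sq. km * 0.05 = 31.4 persons

Map legend: Population 2000, Persons (High: 10,123.5; Low: 7)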
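The arithmetic in this example can be reproduced with a few lines of code. The sketch below assumes each listed area is the part of the administrative unit that falls inside one grid cell; the function name is hypothetical, and a full implementation would also sum contributions where a cell overlaps several units.

```python
def allocate_to_cells(unit_population, unit_area_km2, overlap_areas_km2):
    """Spread a unit's population over cells in proportion to the overlapping area."""
    density = unit_population / unit_area_km2  # persons per sq. km, uniform within the unit
    return [density * area for area in overlap_areas_km2]

# Figures from the example: 628.5 persons per sq. km over 458.4 sq. km of land.
unit_area = 458.4
unit_pop = 628.5 * unit_area

cell_pops = allocate_to_cells(unit_pop, unit_area, [2.6, 16.1, 0.05])
for area, pop in zip([2.6, 16.1, 0.05], cell_pops):
    print(f"{area} sq. km -> {pop:.1f} persons")
```

The key assumption of proportional allocation is visible in the single `density` value: population is treated as evenly distributed within each administrative unit.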


Global Rural Urban Mapping Project (GRUMP)

http://sedac.ciesin.columbia.edu/gpw/


• Objective: To delineate urban and rural extents and populations
  – Collaboration between CIESIN, IFPRI, World Bank, & CIAT
  – Builds on GPW infrastructure
  – Adds urban areas from Nighttime lights satellite data
• Three databases:
  – Settlement Points (>70,000 w/ pop of 1k+)
  – Urban Extents (>23,500 w/ pop of 5k+)
  – Pop Grid at 1 km resolution


• Stand alone model “GRUMPe” written in C
• Combines the following pieces of information:
  – Population and boundaries of each urban area based on nighttime lights (NTLs)
    • Boundaries sometimes based on buffered points where no NTL “signature”
  – Population and boundaries of each admin area
  – Size of the intersect areas where urban and admin areas overlap
  – UN national estimates for percentage of population in urban and rural areas


• The algorithm reallocates the total pop in each admin unit into rural and urban areas based on UN estimates, with six constraints:
  1. Total admin pop remains constant
  2. Urban pop density in any admin unit must be > rural density
  3. Rural pop density cannot be lower than national minimum rural pop density threshold for country/region
  4. Rural pop density cannot be higher than the national maximum rural pop density threshold for country/region
  5. Urban pop density cannot be lower than national minimum urban pop density threshold for country/region
  6. Urban pop density cannot be higher than the national maximum urban pop density threshold for country/region
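The six constraints can be written down as a simple check on one candidate urban/rural split. This is only an illustrative sketch and not the GRUMPe code: the function and class names, the thresholds, and the single-urban-area setup are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DensityLimits:
    """National density thresholds in persons per sq. km (hypothetical values below)."""
    rural_min: float
    rural_max: float
    urban_min: float
    urban_max: float

def violated_constraints(total_pop, urban_pop, rural_pop,
                         urban_area_km2, rural_area_km2, limits):
    """Return the numbers (1-6) of the constraints that a candidate split violates."""
    urban_density = urban_pop / urban_area_km2
    rural_density = rural_pop / rural_area_km2
    checks = {
        1: abs((urban_pop + rural_pop) - total_pop) < 1e-6,  # total admin pop constant
        2: urban_density > rural_density,                     # urban denser than rural
        3: rural_density >= limits.rural_min,
        4: rural_density <= limits.rural_max,
        5: urban_density >= limits.urban_min,
        6: urban_density <= limits.urban_max,
    }
    return [number for number, ok in checks.items() if not ok]

# Hypothetical admin unit: 100,000 people, 60% urban per the UN national estimate.
total = 100_000.0
urban = 0.6 * total
rural = total - urban

tight = DensityLimits(rural_min=1, rural_max=500, urban_min=200, urban_max=2_500)
loose = DensityLimits(rural_min=1, rural_max=500, urban_min=200, urban_max=5_000)

print(violated_constraints(total, urban, rural, 20.0, 980.0, tight))
print(violated_constraints(total, urban, rural, 20.0, 980.0, loose))
```

In this example the initial split puts 3,000 persons per sq. km in the urban area, which breaks the tighter urban maximum; a reallocation step would then have to shift population to the rural part and re-check.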


• The algorithm is trivial where only one urban area is contained within an admin unit
• It is more complex when:
  – There are multiple urban areas overlapping an admin unit
  – Urban areas overlap more than one admin area
  – Large urban areas contain more than one admin area
• These are common situations and require successive iterations to meet all constraints


• Close up of Brazil using the 100K person cut off

• Note the variety of shape

• Much more than points convey


http://sedac.ciesin.columbia.edu/usgrid/


• Uses proportional allocation algorithm
• Higher resolution:
  – Resolution is 1km (30 arc-sec) for the country as a whole
  – Metropolitan areas are available at 250m (7.5 arc-sec)
• More census variables:
  – Individual data: age distribution, race, ethnicity, income, poverty, educational level, and immigrant status
  – Household data: household size, one-person households, female-headed households with children under 18, and linguistically isolated households
  – Housing unit data: occupied housing units without a vehicle, and year of construction


http://sedac.ciesin.columbia.edu/povmap/


• Infant mortality rates (IMRs):
  – Serve as a useful proxy for overall poverty levels because they are highly correlated with metrics such as income, education levels, and health status of the population
  – This metric is particularly good for distinguishing poverty levels at the lower end of the income ladder


• Sources
  – Demographic and Health Surveys (39 countries)
  – Multiple Indicator Cluster Surveys (5 countries)
  – National Human Development Reports (14 countries)
  – National Statistical Offices (18 countries)
  – 6,494 spatial units in global data base
    • Brazil and Mexico – 5,372 units
    • 74 other countries with subnational data – 22 units per country on average
    • 115 countries national level data only (UNICEF)
    • 36 countries no data
• Calibration
  – Subnational IMR values adjusted to be consistent with national UNICEF 2000 IMR values
• Gridding using proportional allocation algorithm
• We also converted rates to counts
  – For each subnational unit, estimates of live births, infant deaths calculated based on gridded population, national fertility data, and subnational IMR.
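The calibration and the rate-to-count conversion can be sketched as follows. The exact procedure is not given above, so this rests on two assumptions: calibration is a uniform rescaling so that the births-weighted mean of subnational IMRs equals the national value, and "national fertility data" is represented by a crude birth rate per 1,000 population. All names and numbers are hypothetical.

```python
def births_from_population(population, crude_birth_rate_per_1000):
    """Estimated live births from population and a national crude birth rate (assumed form)."""
    return population * crude_birth_rate_per_1000 / 1000.0

def calibrate_imr(sub_imr, sub_births, national_imr):
    """Rescale subnational IMRs so their births-weighted mean equals the national IMR."""
    total_births = sum(sub_births.values())
    weighted_mean = sum(sub_imr[u] * sub_births[u] for u in sub_imr) / total_births
    scale = national_imr / weighted_mean
    return {unit: imr * scale for unit, imr in sub_imr.items()}

def infant_deaths(births, imr_per_1000):
    """Infant deaths implied by an IMR expressed per 1,000 live births."""
    return births * imr_per_1000 / 1000.0

# Hypothetical country with two subnational units.
population = {"North": 2_000_000, "South": 3_000_000}
raw_imr = {"North": 80.0, "South": 50.0}   # deaths per 1,000 live births
national_imr = 55.8                        # hypothetical national value
crude_birth_rate = 30.0                    # births per 1,000 population

births = {u: births_from_population(p, crude_birth_rate) for u, p in population.items()}
imr = calibrate_imr(raw_imr, births, national_imr)
deaths = {u: infant_deaths(births[u], imr[u]) for u in births}
print(imr, deaths)
```

Weighting by births rather than by total population keeps the implied national death count consistent with the national rate.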


• Use anthropometric data found in household surveys
  – DHS and MICS data were aggregated to the spatial units at which the surveys report, based on raw data where it was available, and published reports otherwise.
  – These spatial units are typically equivalent to first level administrative regions or aggregations thereof.
• Geospatial boundary files that match those spatial units were located or created in order to match the reporting regions of the surveys as closely as possible.
  – In many cases, the survey reports contained maps detailing the survey regions. Elsewhere, matches were purely name-based.
• Map percent of children underweight
  – Underweight defined as being two standard deviations or more below the mean weight for a given age, as compared to an international reference population.
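The underweight definition translates directly into a z-score test. The sketch below is illustrative only: the reference means and standard deviations are made-up placeholders rather than values from any international reference population, and survey sampling weights are ignored.

```python
def weight_for_age_z(weight_kg, ref_mean_kg, ref_sd_kg):
    """Standard deviations above or below the reference mean weight for the child's age."""
    return (weight_kg - ref_mean_kg) / ref_sd_kg

def percent_underweight(children, reference):
    """Share of children two or more standard deviations below the reference mean.

    children:  list of (age_in_months, weight_kg) tuples
    reference: dict mapping age_in_months -> (mean_kg, sd_kg)
    """
    flagged = 0
    for age, weight in children:
        mean, sd = reference[age]
        if weight_for_age_z(weight, mean, sd) <= -2.0:
            flagged += 1
    return 100.0 * flagged / len(children)

# Placeholder reference values, NOT taken from a real reference population.
reference = {12: (9.6, 1.0), 24: (12.2, 1.3)}

# Hypothetical survey records for one reporting region.
children = [(12, 7.5), (12, 9.0), (24, 9.0), (24, 12.0)]

share = percent_underweight(children, reference)
print(f"{share:.1f}% underweight")
```

Running the same calculation per reporting region gives the values that would then be joined to the matching boundary file for mapping.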


• Continued emphasis on higher resolution inputs
• Effort to collect and grid more census variables
  – Age and sex distribution
  – Urban/rural distribution
• Proposed output resolution: 1km grids
• May create time series back to 1980
• Looking for data sharing partners
