Geographical Data
Types, relations, measures, classifications, dimension, aggregation
To be seen on maps
Topographic map, showing: urban, grass, water, dike, text (name, elevation)
Classified isoline map
To be seen on maps
Choropleth map: map with administrative boundaries that shows a value per region by a color or shade
Use of pesticide 1_3_D per county
Maps show ...
• Relation of place (geographic location) to a value (here 780 mm precipitation) or name (here is Minnesota)
• An abstraction (model, simplification) of reality
• A combination of themes (different sorts of data)
• Connections (subway maps)
Tokyo subway map
Scales of measurement
• Nominal scale
• Ordinal scale
• Interval scale
• Ratio scale
• (Angle/direction, vector, …)
Classification of types of data by statistical properties (Stevens, 1946)
Nominal scale
• Administrative map (names of the countries)
• Landuse map (names of landuse: urban, grass, forest, water, …)
• Geological map (names of soil types: sand, clay, rock, …)
Finite number of classes, each with a name.
Testing is possible for equivalence of name.
Ordinal scale
• School type (VMBO, HAVO, VWO)
• Wind force on the Beaufort scale (0=no wind, ... 6=heavy wind, …, 9=storm, ...)
• Questionnaire answers (disagree, partly disagree, neutral, partly agree, agree)
Finite number of classes, each with a name
Testing for equivalence of name and for order
Interval scale
• Temperature in degrees Celsius or Fahrenheit
• Time/year on Christian calendar
Unbounded number of classes, each with a value
Testing for equivalence, for order and for difference (a unit distance exists)
Ratio scale
• Measurements: concentration of lead in soil
• Counts: population, number of airports
• Percentages: unemployment percentage, percent of landuse type forest
Unbounded number of classes, each with a value
Testing for equivalence, for order, for difference and for ratio (a natural zero exists)
Examples
Overview
Scale      Classes      Testing            Statistics
nominal    Categories   equivalence        number of occurrences, mode
ordinal    Categories   … and order        … and median
interval   Unbounded    … and difference   … and average
ratio      Unbounded    … and ratio
two data collection
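The overview of scales and their permitted statistics can be sketched in code. This is a minimal illustration, not part of the original slides: the function name `summarize`, the `order` parameter, and the sample values are assumptions, and the median is taken as the lower median so that it is always an existing category.

```python
from collections import Counter
from statistics import mean

# Scales from weakest to strongest; each scale allows the statistics
# of the scales before it, plus one more.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

def summarize(values, scale, order=None):
    """Return only the statistics that are meaningful on the given scale.

    `order` (hypothetical parameter) lists ordinal categories from low to high.
    """
    level = SCALES.index(scale)
    counts = Counter(values)
    summary = {"counts": dict(counts), "mode": counts.most_common(1)[0][0]}
    if level >= 1:  # ordinal and up: order exists, so a median exists
        key = order.index if order is not None else None
        ranked = sorted(values, key=key)
        summary["median"] = ranked[(len(ranked) - 1) // 2]  # lower median
    if level >= 2:  # interval and up: differences exist, so an average exists
        summary["mean"] = mean(values)
    return summary

# Illustrative sample values (made up for this sketch).
land = summarize(["grass", "water", "grass"], "nominal")
school = summarize(["HAVO", "VMBO", "VWO", "HAVO", "VMBO"], "ordinal",
                   order=["VMBO", "HAVO", "VWO"])
temps = summarize([12.0, 15.0, 18.0], "interval")
```

On the nominal data only counts and the mode are returned; the ordinal data adds a median; the interval data adds an average.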
Other scales
• Angle (wind direction, direction of spreading)
• Vector: angle and value (primary wind direction and speed)
• Categorical scales with partial membership (fuzzy sets; points on indeterminate boundary between “plains” and “mountains”; location of coast line: tide)
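Angles need their own treatment because they wrap around at 360 degrees. The sketch below uses the circular mean, a common technique that the slides do not name, to show why a plain average fails for wind directions. The function names and sample directions are assumptions made for illustration.

```python
import math

def circular_mean_degrees(angles):
    """Average directions by averaging their unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0

def angular_distance(a, b):
    """Smallest difference between two directions, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

# Two winds blowing almost from the north (hypothetical measurements).
directions = [350.0, 10.0]
naive = sum(directions) / len(directions)    # 180.0: points south, clearly wrong
circular = circular_mean_degrees(directions)  # close to north (0, or equivalently 360)
```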
Example
Classification schemes
Data on nominal scale: hierarchical classification schemes
Figure: hierarchical classification tree with root landuse and nodes nature, water, urban, agriculture, working, living, houses, flats, cattle, plants, fruit
Classification schemes
Data on interval and ratio scales
Example data: 4, 5, 5, 8, 12, 14, 17, 23, 27
• Fixed intervals: [1-10], [11-20], [21-30]
• Fixed intervals based on spread: [4-11], [12-19], [20-27]
• Quantiles (equal representatives): [4-5], [8-14], [17-27]
• “Natural” boundaries: [4-5], [8-17], [23-27]
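Three of these schemes can be reproduced with a few lines of code. The sketch below is an illustration under simplifying assumptions: values are integers, and the range and the number of values divide evenly by the number of classes. The function names are hypothetical. “Natural” boundaries are left out because they depend on a judgment about gaps in the data.

```python
def equal_width_classes(low, high, k):
    """Split the integer range low..high into k classes of equal width.

    Assumes (high - low + 1) is divisible by k.
    """
    width = (high - low + 1) // k
    return [(low + i * width, low + (i + 1) * width - 1) for i in range(k)]

def quantile_classes(values, k):
    """Split the sorted values into k classes with equally many members.

    Assumes len(values) is divisible by k.
    """
    ordered = sorted(values)
    size = len(ordered) // k
    chunks = [ordered[i * size:(i + 1) * size] for i in range(k)]
    return [(chunk[0], chunk[-1]) for chunk in chunks]

data = [4, 5, 5, 8, 12, 14, 17, 23, 27]

fixed = equal_width_classes(1, 30, 3)                   # fixed intervals
spread = equal_width_classes(min(data), max(data), 3)   # based on data spread
quantiles = quantile_classes(data, 3)                   # three values per class
```

With the nine data values from the slide, the results match the fixed, spread-based, and quantile intervals listed above.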
Classification schemes, cont’d
• Statistical boundaries: average μ, standard deviation σ, then e.g. boundaries μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ
• Arbitrary
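Class boundaries placed at the average plus or minus multiples of the standard deviation could be computed as follows. This is a sketch: the function name is hypothetical, and the population standard deviation is an assumption, since the slides do not say whether the population or sample form is meant.

```python
from statistics import mean, pstdev

def statistical_boundaries(values, steps=(-2, -1, 0, 1, 2)):
    """Return boundaries at average + k * standard deviation for each k in steps."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [mu + k * sigma for k in steps]

data = [4, 5, 5, 8, 12, 14, 17, 23, 27]
boundaries = statistical_boundaries(data)
```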
Two classifications
Four equal intervals / Quartiles
Counties of Arizona, total population
Why is choice of classification important?
• Visualization often needs classification
• Choice of class intervals influences interpretation
Think of a report on air pollution caused by a factory: it could be written by the board of the factory or by an environmental organization.
Data: object and field view
• Object view: discrete objects in the real world
– road
– telephone pole
– lake
• Field view: geographic variable has a “value” at every location in the real world
– elevation
– temperature
– soil type
– land cover
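The two views map naturally onto two kinds of data structure. In this sketch, all names, the coordinates, and the elevation formula are made up for illustration: an object is a record with its own geometry, while a field is a function that returns a value for any location.

```python
from dataclasses import dataclass

@dataclass
class Lake:
    """Object view: a discrete thing with a name and its own geometry."""
    name: str
    outline: list  # polygon vertices as (x, y) pairs

def elevation(x, y):
    """Field view: a value exists at every location.

    Hypothetical surface used only for illustration.
    """
    return 100.0 + 0.5 * x - 0.2 * y

lake = Lake("Example Lake", [(0, 0), (4, 0), (4, 3), (0, 3)])
height_here = elevation(2, 5)  # any (x, y) can be queried
```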
Reference system
• Data according to the scales of measurement are attribute values in a reference system
• A geographical reference system is spatial, temporal or both
At 12 noon of August 26, 1999, a temperature of 17.6 degrees Celsius is measured at 5 degrees longitude and 53 degrees latitude
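Such a measurement can be stored as one record that combines the temporal reference, the spatial reference, and the attribute value. The class and field names below are assumptions chosen for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Observation:
    time: datetime      # temporal reference
    longitude: float    # spatial reference, in degrees
    latitude: float
    attribute: str      # what was measured
    value: float        # attribute value (here on the interval scale)
    unit: str

obs = Observation(
    time=datetime(1999, 8, 26, 12, 0),
    longitude=5.0,
    latitude=53.0,
    attribute="temperature",
    value=17.6,
    unit="degrees Celsius",
)
```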
Spatial objects
• Points; 0-dimensional, e.g. measurement point
• (Polygonal) line; 1-dimensional, e.g. border between Bolivia and Peru
• Polygons; 2-dimensional, e.g. Switzerland
• Sets of points, e.g. locations of accidents
• Systems of lines (trees, graphs), e.g. street network
• Sets of polygons, subdivisions, e.g. island group, provinces of the Netherlands
Dependency of dimension
• Dimension of an object can be scale dependent: the Rhine at scale 1:25,000 is 2-dimensional; at scale 1:1,000,000 it is 1-dimensional
• Dimension of an object can be application dependent: the Rhine as transport route is 1-dimensional (length is relevant, not the surface area); the Rhine as land cover in the Netherlands is 2-dimensional
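The scale dependence can be made concrete with a small calculation. The sketch below assumes a river width of 400 m and a rule that features narrower than 1 mm on the map are drawn as lines. Both numbers are hypothetical and chosen only to illustrate the idea.

```python
def width_on_map_mm(real_width_m, scale_denominator):
    """Width of a feature on paper, in millimetres, at scale 1:scale_denominator."""
    return real_width_m * 1000.0 / scale_denominator

def drawn_dimension(real_width_m, scale_denominator, min_width_mm=1.0):
    """Return 2 if the feature is wide enough to draw as an area, else 1."""
    wide_enough = width_on_map_mm(real_width_m, scale_denominator) >= min_width_mm
    return 2 if wide_enough else 1

RIVER_WIDTH_M = 400.0  # hypothetical width, not a measured value

large_scale = drawn_dimension(RIVER_WIDTH_M, 25_000)      # 16 mm on the map
small_scale = drawn_dimension(RIVER_WIDTH_M, 1_000_000)   # 0.4 mm on the map
```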
The third dimension
• Elevation can be considered an attribute on the ratio (!?) scale at (x,y)-coordinates
• For civil engineering: crossing of street and railroad can be at the same level, or one above the other
• Data on subsurface layers and their thickness
The time component
• Same region, same themes, different dates: Allows computation of change
• Trajectories give the locations at certain times for moving objects
Level of aggregation
Income of an individual
Average income in a municipality
Average income in a province
Average income in a country
Higher level of aggregation
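The levels of aggregation can be illustrated by averaging the same individual incomes over different groupings. All records and names below are made up. The sketch also shows that an average of municipal averages generally differs from the provincial average when municipalities differ in size.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individuals: (province, municipality, income)
people = [
    ("P1", "M1", 20000), ("P1", "M1", 30000),
    ("P1", "M2", 60000),
    ("P2", "M3", 40000), ("P2", "M3", 50000),
]

def average_income_by(records, key):
    """Group the records with `key` and average the income per group."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[2])
    return {name: mean(incomes) for name, incomes in groups.items()}

by_municipality = average_income_by(people, key=lambda r: r[1])
by_province = average_income_by(people, key=lambda r: r[0])
country = mean([r[2] for r in people])

# Averaging the two municipal averages of P1 ignores that M1 has more people.
mean_of_means_p1 = mean([by_municipality["M1"], by_municipality["M2"]])
```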
Various aggregations in the Netherlands
• Provinces (12)
• Municipalities (441)
• COROP regions (40)
• Water districts (39)
• Economic-geographic regions (129)
• 2- and 4-number postal codes
• Macro-regions (4 or 5; provinces joined)
• Labor exchange district (127), planning region (43), nodal region (80), ...
Aggregation: dangers
• MAUP: modifiable areal unit problem
0 - 12 - 45 -
Located occurrences of a rare disease
clustering?
• Aggregation boundaries have nothing to do with the mapped theme
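The modifiable areal unit problem can be demonstrated with the same point data counted under two different zonings. The case locations and zone boundaries below are hypothetical, and the example is reduced to one dimension to keep it short.

```python
def counts_per_zone(points, boundaries):
    """Count the points falling in each half-open zone [b[i], b[i+1])."""
    counts = [0] * (len(boundaries) - 1)
    for p in points:
        for i in range(len(counts)):
            if boundaries[i] <= p < boundaries[i + 1]:
                counts[i] += 1
                break
    return counts

# Hypothetical locations of disease cases along a line.
cases = [1.0, 4.5, 4.8, 5.2, 5.5, 9.0]

zoning_a = [0, 5, 10]      # boundary cuts straight through the cluster
zoning_b = [0, 4, 6, 10]   # middle zone contains the whole cluster

counts_a = counts_per_zone(cases, zoning_a)
counts_b = counts_per_zone(cases, zoning_b)
```

Under the first zoning the cases look evenly spread; under the second, the middle zone stands out. The data did not change, only the aggregation boundaries.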
Aggregation: dangers
• Not enough aggregation: privacy violations (e.g. AIDS-cases with complete postal code)
• Correction for population spread is necessary in case of data on people
Huntington’s disease, 1800-1900
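Correcting for population spread usually means mapping a rate instead of a raw count. In the sketch below, the region names and figures are invented, and the per-100,000 convention is an assumption. The region with fewer cases turns out to have the higher rate.

```python
def rate_per_100k(cases, population):
    """Number of cases per 100,000 inhabitants."""
    return 100_000 * cases / population

# Hypothetical regions: name -> (number of cases, population)
regions = {
    "city": (50, 1_000_000),
    "village": (5, 10_000),
}

rates = {name: rate_per_100k(cases, population)
         for name, (cases, population) in regions.items()}
```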
Summary
• Data is geometry, attribute, and time
• Data is coded in a reference system
• Attribute data is usually on one of the standard scales of measurement
• Classification of interval and ratio data is needed for mapping (isoline or choropleth) and histograms
• The object view and field view exist
• Geometric data has a dimension (point, line, area), but this may depend on scale and application
• Data is often spatially aggregated