spat mod intro lecture - the university of edinburghbmg/teaching/spatmod/sm1.pdf · directed link...

Post on 04-Oct-2020

Spatial Modelling

• Welcome to Spatial Modelling !

• Taught by:

– Bruce Gittings

– Neil Stuart

– Owen Macdonald

• The detail is in the module outline

• This module is about how geographical data is conceptualised and stored within computer systems

• These are fundamental to manipulation within a GIS

• Because data models and their associated structures are so fundamental, a clear understanding of the issues is critical

• Poor choices give rise to limited functionality or poor performance

• Geographical information can be 0,1,2,3 or even 4-dimensional

• GI is much more than maps; attributes, statistics and descriptive info too

• Representations are not always straightforward

What is a GIS?

• Essentially it is a system which includes facilities for:

– Data collection and input

– Editing and retrieval

– Manipulation and Analysis

– Output and reporting

• Critically a GIS must be able to handle a range of different types of data coming from e.g.

– Paper maps

– Surveying

– Remotely sensed satellite images

– Global Positioning System (GPS)

– Supermarket checkouts

– Cash machines and credit information

– Keyed-in paper files or records (e.g. questionnaires)

– Existing digital databases

• At its core is a spatial database

• A spatial database includes structures to optimally store both spatial and attribute data in a manner compatible with the functions outlined above

• The spatial database may be part of, or linked to, a more traditional database

• There are many definitions of a GIS

Representing Reality

REALITY (after Peuquet, 1984)

What actually exists, including all aspects which may or may not be perceived by individuals. Infinitely complex.

DATA MODEL

Abstraction of the real world incorporating only those properties thought to be relevant to the application.

DATA STRUCTURE

A representation of the data model, often expressed in terms of arrays or other programming structures which can be incorporated in computer programs.

FILE STRUCTURE

The representation of the data in storage hardware in terms of bits and bytes on disk sectors

In reality this is a gross simplification; there are multiple levels of models and structures (e.g. memory structures)

[Figure: THE REAL WORLD → DATA MODEL → DATA STRUCTURE → FILE STRUCTURE; stage labels: OBSERVATION, SIMPLIFICATION & SELECTION, ENCODING, STORAGE]
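
As a rough illustration of the levels described on this slide, the sketch below takes a hypothetical spot-height point through each stage. The class name `SpotHeight`, its fields, the sample coordinates and the binary record layout are all assumptions made for illustration; they are not taken from any particular GIS.

```python
import struct
from dataclasses import dataclass

# DATA MODEL: keep only the properties thought relevant to the application.
@dataclass
class SpotHeight:
    easting: float    # metres
    northing: float   # metres
    height: float     # metres above datum

# DATA STRUCTURE: a programming structure holding many instances (here, a list).
points = [SpotHeight(325000.0, 673000.0, 251.0),
          SpotHeight(325100.0, 673050.0, 174.5)]

# FILE STRUCTURE: the bytes as they might sit on disk.
# "<3d" = three little-endian 8-byte floats per record (an assumed layout).
RECORD = struct.Struct("<3d")

def encode(pts):
    """Pack every point into one contiguous byte string."""
    return b"".join(RECORD.pack(p.easting, p.northing, p.height) for p in pts)

def decode(blob):
    """Rebuild the list of points from the packed bytes."""
    return [SpotHeight(*RECORD.unpack_from(blob, i))
            for i in range(0, len(blob), RECORD.size)]

blob = encode(points)
```

Each record occupies 24 bytes, and `decode(blob)` recovers the original list, so the same data model survives a change of structure or file layout.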

Decomposing the Real World

The Data Model

• A critical decision is the choice of data model

• Different data models will be required for spatial and attribute data

• These are generally referred to as:

– The Spatial Data Model and

– The Database Model

• Models must be able to cope with

– Maintaining data

– Modelling tasks

– Analysis tasks

– Presentation (including cartography)

• There may not be a free choice of models

• Models used today, in a particular system, may be constrained by past models

• These are so-called legacy models

Modelling the Real World

Maps, Reports, Statistics

Attribute Data Models

• Attribute (Aspatial) information is the label / name / categorisation / description attached to each spatial object

• The attributes are as important as the spatial data themselves

• May be much more complex than the spatial data (e.g. paragraphs of text and numerous statistics attached to one point)

• May be a simple text label

• Attributes are usually stored in some form of database

• May be formal database management systems

• Rarely today are these simple text-handling systems within the GIS software

• They are rather sophisticated generic tools, identical to the core components of other information systems

• Attribute data may already exist within a corporate database or data warehouse (e.g. council tax register, customer buying information)

• Typically relational or object-oriented database management systems are used

Linking Attributes
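
One common way to link attributes to spatial objects is through a shared identifier. The sketch below is a minimal illustration of that idea using plain dictionaries; the identifiers, coordinates, names and the `describe` helper are all invented for the example, and a real system would normally hold the attribute side in a database management system.

```python
# Spatial objects: identifier -> geometry (coordinates are invented).
geometries = {
    101: ("point", (325000.0, 673000.0)),
    102: ("point", (325400.0, 673250.0)),
}

# Attribute records, as they might come from a separate corporate database.
attributes = {
    101: {"name": "Library", "tax_band": "C"},
    102: {"name": "Cafe", "tax_band": "B"},
}

def describe(object_id):
    """Join one spatial object to its attribute record via the shared key."""
    kind, coords = geometries[object_id]
    return {"id": object_id, "kind": kind, "coords": coords,
            **attributes[object_id]}

record = describe(102)
```

Because the link is only the key, either side can be maintained or replaced independently as long as the identifiers stay consistent.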

What is Spatial Data ?

• A geographical entity is defined in terms of:

– Location (spatial reference)

– Dimensions

– Attribute

– Time

• Points

• Lines

• Polygons (Areas)

• Volumes

• Examples

• Spatial data may include real features (e.g. roads and rivers) but also social or legal constructions (e.g. city limits or enumeration district boundaries) and artificial structures (e.g. contours)

• What is a spatial reference?

– Latitude / Longitude

– National Grid Reference

– A Parish, County, Ward, ED etc.

– An Address

– A Postcode etc. etc.

Spatial Objects I

0-D Objects:

– Point / Vertex

– Node

1-D Objects:

– Line Segment

– Link

– Directed Link

– String

– Chain

– Rings (figure labels: Line Segments, Chains)

Spatial Objects II

2-D Objects:

Surfaces

Polygons - the areas enclosed by rings

Complex Polygon

Grid / Pixels
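
The idea that a polygon is the area enclosed by a ring can be sketched in a few lines. The example below treats a ring as a closed list of vertices and computes the enclosed area with the shoelace formula; it also treats a complex polygon as an outer ring with holes, which is an assumption about what the slide's figure shows. The function names and coordinates are invented for illustration.

```python
def is_ring(chain):
    """A ring is a chain of vertices that closes on itself."""
    return len(chain) >= 4 and chain[0] == chain[-1]

def ring_area(ring):
    """Area enclosed by a simple ring (shoelace formula), sign dropped."""
    if not is_ring(ring):
        raise ValueError("not a closed ring")
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0

outer = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6), (4, 4)]

# A complex polygon: the outer ring minus any inner rings (holes).
complex_area = ring_area(outer) - ring_area(hole)
```

Here the outer ring encloses 100 square units and the hole removes 4, leaving 96.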

Spatial Objects III

3-D Objects:

Volumes

Representational Problems

• Computer scientists assume space can be represented as a mathematical function

• Geographers know different !

• Particular features are not always represented in the same way

• Representation may be dependent on scale e.g. cities may be polygons or points; rivers may be lines or polygons

• Representation may be dependent on the application e.g. rivers are lines for network modelling, but polygons for shading

• How should flats in a block be represented?

• Traditional maps are replete with cartographic compromises

e.g.

• Map Projections

• Addresses are complex and only partially unique (but see BS 7666)

• Postcodes were never designed as a spatial reference, but they are usually spatially allocated e.g. EH8 9XP...

...but sometimes not e.g. GIR 0AA

A Brief Word on Spatial References

• You need to understand a little about geodesy (literally dividing and measuring the earth)

• The science of measuring and mapping the earth's surface (Helmert, 1880)

• Turning a spherical earth into a flat map involves compromise, based on a Coordinate Reference System (CRS)

• A CRS involves:

– an ellipsoid (a 'best fit' to the geoid)

– a datum (the position and orientation of the ellipsoid relative to the centre of the earth)

– coordinates (the values which define the positions of real-world objects, relative to ellipsoid and datum)

• WGS84 (used by GPS) is one of thousands of datums

• WGS84 is a global system

• OSGB36 is another (used by the British National Grid), but it is intended as a local best fit

• Mathematically complex transformations define how to convert coordinates from one datum to another

• Mistakes cost lots of money !

Further information in "Surveying & Positioning Guidance Note 1" from the International Association of Oil & Gas Producers (see Spatial Modelling Resource Page)
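
To make the role of the ellipsoid concrete, the sketch below converts latitude, longitude and ellipsoidal height into Earth-centred Cartesian coordinates on the WGS84 ellipsoid. It covers only the ellipsoid-and-coordinates part of a CRS: a real transformation between datums (for example WGS84 to OSGB36) needs further parameters or grids that are not shown here. The function name and the sample location are assumptions made for illustration.

```python
import math

# WGS84 ellipsoid parameters (semi-major axis in metres, flattening).
WGS84_A = 6378137.0
WGS84_F = 1 / 298.257223563

def geodetic_to_cartesian(lat_deg, lon_deg, height_m=0.0,
                          a=WGS84_A, f=WGS84_F):
    """Convert latitude/longitude/ellipsoidal height to Earth-centred X, Y, Z."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    e2 = f * (2 - f)                                  # eccentricity squared
    nu = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)   # prime vertical radius
    x = (nu + height_m) * math.cos(lat) * math.cos(lon)
    y = (nu + height_m) * math.cos(lat) * math.sin(lon)
    z = (nu * (1 - e2) + height_m) * math.sin(lat)
    return x, y, z

x, y, z = geodetic_to_cartesian(55.95, -3.19)   # roughly Edinburgh
```

Passing a different `a` and `f` gives coordinates on another ellipsoid, which is one reason the same latitude and longitude can refer to different places on the ground under different datums.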

Richness

• Detail is (to some extent) a function of scale

• But also due to mapping policies

• UK Ordnance Survey reduced the number of feature codes available in the 1990s to ensure timely delivery of digital products

• Placename gazetteer has only ten codes and 60% of data is 'other' or 'all other'

• OS Open Data is bringing a reconsideration of this strategy

Data Quality

• Should be recorded in the data structure and form part of metadata

• Error vs. Uncertainty

– Is it wrong or are you just not sure where it is?

• Accuracy vs. precision

– Is it close to the truth, vs. number of decimal places used to represent it

– GIS are very precise, but the data are not necessarily accurate!

• Positional accuracy vs. Attribute accuracy

• Error propagation:

– Do cumulative errors get worse? Cancel out?

– Are errors in separate layers dependent or independent?

• Logical consistency (e.g. topology, consistent coding of attributes)

• Completeness

• Lineage, fit for purpose, but what purpose?

• Measuring these

• Error modelling
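
The distinction between accuracy and precision, and one simple case of error propagation, can be sketched as follows. The coordinates are invented, and combining errors in quadrature is only appropriate under the assumption that the errors in each layer are independent and random; dependent errors do not combine this way.

```python
import math

def rmse(measured, truth):
    """Root-mean-square positional error between matched coordinate pairs."""
    sq = [(mx - tx) ** 2 + (my - ty) ** 2
          for (mx, my), (tx, ty) in zip(measured, truth)]
    return math.sqrt(sum(sq) / len(sq))

truth = [(100.0, 200.0), (150.0, 250.0)]
# Stored to six decimal places (very precise) yet 3 m east and 4 m north of truth.
measured = [(103.000000, 204.000000), (153.000000, 254.000000)]

positional_error = rmse(measured, truth)   # 5 m, despite the decimal places

def combined_error(*layer_errors):
    """Combine errors from independent layers in quadrature."""
    return math.sqrt(sum(e ** 2 for e in layer_errors))

overlay_error = combined_error(3.0, 4.0)
```

The number of decimal places says nothing about how close the values are to the truth; only a comparison against better reference data does.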

The Multi-Layer Model

• Comes from the concept of different map sheets covering one area (e.g. OS map, geology map, forestry etc.)

• The base-map is a cartographic collection of individual features (e.g. rivers, roads, buildings, railways, parks, pylons etc.)

• Often additional features are added to the base-map (e.g. administrative boundaries)

• This suggests that the base-map can be decomposed into separate feature layers

• Isolation of feature layers permits effective spatial analysis

• It also permits recombination of elements to produce different cartographic products

• Registration procedures are required to ensure accurate spatial referencing between layers

• There should be no limit to the number of layers

• All of the layers do not have to be used all of the time, either in terms of analysis or drawing the map

Layer Model of GIS

Discontinuous vs. Continuous Space

• Space is continuous

• However many GIS break space into discrete pieces (tiles)

• This follows the concept of paper map sheets, which break an area up into convenient pieces

• The paper map sheets may cover differently sized areas; this can also be true of the digital equivalents

• Until recently OS data came as tiles

• Now MasterMap (DNF) should cover whatever area you require

• Discontinuous space was a reasonable approach in early systems to ensure efficient processing

• Thus we often have a tile-based GIS, where the features are separated out into layers and these layers are broken into tiles

• Tiles may need to be aggregated to give continuous coverage of a study area and study areas may need to be broken up, thus tools are required for both of these

• Again careful registration is required to maintain spatial integrity across tiles
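
A tile scheme of this kind can be sketched with simple arithmetic. The example assumes square tiles of a fixed size aligned to the coordinate origin; the tile size, function names and coordinates are hypothetical and do not describe any particular data product.

```python
import math

TILE_SIZE = 1000.0   # metres per tile side (an assumed value)

def tile_index(x, y, size=TILE_SIZE):
    """Column and row of the tile containing point (x, y)."""
    return math.floor(x / size), math.floor(y / size)

def tiles_for_area(xmin, ymin, xmax, ymax, size=TILE_SIZE):
    """All tiles that must be aggregated to cover a study-area bounding box."""
    c0, r0 = tile_index(xmin, ymin, size)
    c1, r1 = tile_index(xmax, ymax, size)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]

needed = tiles_for_area(325400.0, 673200.0, 326100.0, 673900.0)
```

A study area only 700 m across can still straddle a tile edge, so two tiles have to be fetched and joined here.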

The Object - Field Debate

• Objects are discrete, identifiable entities each with a spatial reference

• Fields are collections of spatial distributions, where the feature varies continuously in a mathematical sense across space

• The field-based view is based on the premise that space is continuous and an infinite number of locations / samples potentially exist

• Fields are used for e.g. elevation, population density, temperature, soil type etc.

• There is a value for the variable at any location (could be absence or probability)

• Rasters and TINs are field-based structures

• The object-view states that humans see the world as empty space littered with objects

• Representation can be different at different scales

• Objects are precisely defined, but encoding uncertainty is difficult

• Objects can be points, lines or polygons

• Example applications include utilities, land registration, network modelling
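
The two views can be contrasted in code. In this minimal sketch a field is a function that returns a value at any location (here by bilinear interpolation over four invented samples), while objects are discrete records that each carry their own spatial reference. All names and values are assumptions for illustration.

```python
# Field view: a value exists at every location. Here a 2 x 2 grid of sample
# elevations is turned into a continuous function by bilinear interpolation.
samples = [[10.0, 20.0],    # row y = 0: values at x = 0 and x = 1
           [30.0, 40.0]]    # row y = 1: values at x = 0 and x = 1

def elevation(x, y):
    """Elevation at any (x, y) in the unit square."""
    bottom = samples[0][0] * (1 - x) + samples[0][1] * x
    top = samples[1][0] * (1 - x) + samples[1][1] * x
    return bottom * (1 - y) + top * y

# Object view: empty space littered with discrete, identifiable entities.
objects = [
    {"id": 1, "kind": "point", "coords": [(0.2, 0.3)], "name": "hydrant"},
    {"id": 2, "kind": "line", "coords": [(0.0, 0.0), (1.0, 1.0)], "name": "pipe"},
]

def objects_of_kind(kind):
    """Names of all objects of the given geometric kind."""
    return [o["name"] for o in objects if o["kind"] == kind]
```

Asking the field for a value always succeeds inside its extent, whereas asking the object list about an arbitrary location may simply return nothing.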

The Raster - Vector Debate

• The R-V debate closely mirrors, but is not quite the same as, the O-F debate

• Rasters can be used to represent objects

RASTER ADVANTAGES

• It is a simple data structure

• Overlay operations are easily and efficiently implemented

• High spatial variability is efficiently represented in raster format

• Satellite and other image data are already in raster format

RASTER DISADVANTAGES

• Raster data is often less compact, but data compression techniques can often overcome this problem

• Topological relationships are more difficult to represent

• Graphical output is less aesthetically pleasing because boundaries have a blocky appearance (as do lines). Very large numbers of cells (high resolution rasters) can solve this problem but file sizes increase markedly

VECTOR ADVANTAGES

• Vector representation provides a more compact data structure than the raster model

• The vector model provides efficient encoding of topology and, as a result, more efficient implementation of operations that require topological information, for example network analysis

• The vector model is better suited to producing maps with crisp line-work (like hand-drawn maps)

VECTOR DISADVANTAGES

• It is a more complex structure than a simple raster

• Overlay operations are more difficult to implement

• Representing high spatial variability is inefficient

• Handling image data is not really possible in the vector domain
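
Some of these trade-offs show up when the same shape is held both ways. The sketch below stores a triangle as a vector ring and then rasterises it by testing each cell centre with a ray-casting point-in-polygon test. The grid size, cell size and function names are assumptions for illustration, and the cell-centre rule is just one of several possible rasterisation rules.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a closed list of (x, y) vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            # x position where this edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterise(ring, ncols, nrows, cell=1.0):
    """Mark each cell whose centre falls inside the polygon."""
    return [[1 if point_in_polygon((c + 0.5) * cell, (r + 0.5) * cell, ring)
             else 0
             for c in range(ncols)]
            for r in range(nrows)]

# Vector form: a triangle stored as four coordinate pairs.
triangle = [(0.0, 0.0), (8.0, 0.0), (0.0, 8.0), (0.0, 0.0)]
# Raster form: 64 cells, with a stepped (blocky) diagonal edge.
grid = rasterise(triangle, 8, 8)
filled = sum(map(sum, grid))
```

The vector form needs four coordinate pairs and has an exact area of 32 square units; the raster form needs 64 cells and marks 28 of them, so the diagonal boundary is approximated by steps.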

Vector & Raster Representations

Distinguishing Raster from Vector

Models and Structures

Vector

• Spaghetti Sub-Model (unstructured)

• Topological Sub-Model (structured)

Raster

• Different tessellations of space (e.g. square, rectangular, triangular, hexagonal)

• Nested tessellations provide variable resolution without vast space requirements (e.g. quadtrees)

Higher-Dimensional Structures

• Rasters

• TINs

• Fully 3D structures e.g. Oct-trees

• Temporal structures

• Multi-dimensional structures depend on n-dimensional indexes

• Spatial and Database Indexes are key in GIS, due to the data volumes involved
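
A region quadtree is one example of a nested tessellation. The sketch below recursively splits a small binary raster until every block is uniform; the representation (a cell value for a leaf, a four-item list for a split node) and the function names are choices made for this illustration only, and it assumes a square grid whose side is a power of two.

```python
def build_quadtree(grid, x=0, y=0, size=None):
    """Recursively split a square binary raster until each block is uniform.

    Returns a cell value for a uniform block, else a list of four subtrees
    in the order NW, NE, SW, SE (row 0 is taken as the top of the raster).
    """
    if size is None:
        size = len(grid)
    first = grid[y][x]
    if all(grid[r][c] == first
           for r in range(y, y + size) for c in range(x, x + size)):
        return first
    half = size // 2
    return [build_quadtree(grid, x, y, half),
            build_quadtree(grid, x + half, y, half),
            build_quadtree(grid, x, y + half, half),
            build_quadtree(grid, x + half, y + half, half)]

def count_leaves(node):
    """Number of uniform blocks stored in the tree."""
    return 1 if not isinstance(node, list) else sum(count_leaves(n) for n in node)

raster = [[0, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 1, 1],
          [0, 0, 1, 0]]
tree = build_quadtree(raster)
```

The 16 cells collapse to 7 leaves here, because three of the four quadrants are uniform and only the fourth needs to be stored at full resolution.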

The Third Dimension

• Surface vs. volume models; often referred to as 2.5D vs. 3D

• We are usually just interested in the surface of the earth i.e. one 'z' value for each 'x,y' location

• Some applications require more e.g. archaeology, geology, oceanography, atmospheric models

• Structures extend from 3D equivalents of the raster (voxels), through vector representations (incl. TINs) to 3D objects.

Salisbury Crags, Edinburgh

Temporal Issues

• The representation of time is a particular problem

• Representation very dependent on the application

• Different representations, for example:

– time stamp (time as an attribute)

– regular or irregular intervals (snapshots)

– continuous change

• What happens in between the time-steps above?

• Can be a complex set of overlapping spatial changes which may prove difficult to deconstruct unless an appropriate representation is used
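
The snapshot representation can be sketched as a sorted list of dated states with a lookup for the most recent one. The years and land-use values are invented, and the rule used here (nothing changes between snapshots) is an assumption that a real application might well reject.

```python
from bisect import bisect_right

# Snapshots at irregular intervals: (year, recorded land use). Values invented.
snapshots = [(1910, "farmland"), (1955, "farmland"), (1990, "reservoir")]
years = [year for year, _ in snapshots]

def state_at(year):
    """Most recent recorded state at or before `year`, or None if too early.

    Assumes nothing changes between snapshots, which is the weakness raised
    by asking what happens in between the time-steps.
    """
    i = bisect_right(years, year)
    return snapshots[i - 1][1] if i else None
```

With these snapshots, `state_at(1989)` still reports farmland even though the change may have happened decades earlier; the representation cannot say when between 1955 and 1990 it occurred.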

Textual Data as GI

• Poverty of data, evidenced by re-use of same often inadequate OS, Navtech or TeleAtlas data in multiple applications

• Weaknesses exemplified if we try to give directions

• Richness of textual description

• Text maintains the subtleties of history and details ephemeral knowledge much more easily and often more effectively than a map

• Recent trend has been towards tourist guides rather than a systematic description of places (traditional gazetteer)

• Gazetteer for Scotland attempts to address: contemporary and historical texts

• Lots of text on web which could become GI

• Issues:

– Storage and data models are (relatively) simple

– But we don't have good tools to interrogate descriptive GI or make inferences from it

– Data mining techniques represent only the beginnings of a solution

– Nor good tools to generate it

– We don't even do geo-parsing particularly well

– But then how do we differentiate the three Newbiggings within a few miles of each other in Angus?

Spatial Analysis of Texts

• Ability to compare texts

• Spatial Analysis tools

• Differences: Change Detection

• Error / Inconsistency Detection

• Deduction:

– Text A says place x is 3 miles east of place y; Text B says place z is 4 miles south of place y. We can deduce that place z is 5 miles southwest of place x.

– Text C (1910) says place x is 2 miles north of river y and place z is a quarter-mile east of river y; Text D (1990) says place x is just a mile north of Loch y and place z is not mentioned. Can we deduce that Loch y is a reservoir built between 1910-90 which flooded place z?

• Collecting similarly named features to create localities (the tower-house, hill, burn, East and West Farms etc.)
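
The first deduction above (Texts A and B) can be checked with a little vector arithmetic. The sketch assumes a flat plane, exact distances and exact compass directions, none of which a real text would guarantee; the function names are invented for illustration.

```python
import math

COMPASS = ["north", "northeast", "east", "southeast",
           "south", "southwest", "west", "northwest"]

def offset(distance, direction):
    """East/north displacement for a distance along a compass direction."""
    bearing = math.radians(COMPASS.index(direction) * 45)
    return distance * math.sin(bearing), distance * math.cos(bearing)

def relation(from_xy, to_xy):
    """Distance and nearest 8-point compass direction from one place to another."""
    dx, dy = to_xy[0] - from_xy[0], to_xy[1] - from_xy[1]
    bearing = math.degrees(math.atan2(dx, dy)) % 360
    return math.hypot(dx, dy), COMPASS[round(bearing / 45) % 8]

# Place y is taken as the origin of a local grid (units: miles).
x = offset(3, "east")          # Text A: x is 3 miles east of y
z = offset(4, "south")         # Text B: z is 4 miles south of y
distance, direction = relation(x, z)
```

The distance comes out at 5 miles by Pythagoras. The bearing from x to z is about 217 degrees, so "southwest" is the nearest of the eight compass points rather than an exact direction.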

Data Exchange

• Different structures for different systems

• Proprietary (e.g. ArcGIS shape file)

• De Facto Standards (e.g. AutoCAD DXF)

• De Jure Standards (e.g. BSi NTF)

• Costs of data structure translation (time, money, inconvenience)

• Direct translation vs. translation through intermediate format

– standard intermediate format

– proprietary intermediate format

• The concept of a translation switch-yard, based on a hub structure and one bi-directional translator to take each structure in and out of the hub format

• Increasingly XML structures are being used to provide translations (GML, SVG, MasterMap etc.)

• Spatial metadata

– Effective description? Lineage? Accuracy?

– Dublin Core, Gemini, FGDC, ISO etc.

– How is it presented? (e.g. gigateway)
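
The appeal of the hub approach is easy to see by counting translators. The sketch below counts one-way conversion routines in both schemes, assuming the hub format is an additional format and not one of the n being exchanged; the function names are hypothetical.

```python
def direct_translators(n_formats):
    """One-way translators needed if every format converts directly to every other."""
    return n_formats * (n_formats - 1)

def hub_translators(n_formats):
    """One-way translators needed if each format only converts in and out of a hub."""
    return 2 * n_formats

for n in (3, 5, 10):
    print(n, direct_translators(n), hub_translators(n))
```

Direct translation grows with the square of the number of formats while the hub scheme grows linearly, at the cost of two conversions (and two chances to lose information) per exchange.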

Two Levels of Data Model

• Two different implementation models are regularly used for the combination of spatial data and database models

• These are a lower level of data model

• The Hybrid Model (or GeoRelational Model) is based on the pragmatic realisation that different forms of data should be stored separately for optimal performance (e.g. ArcInfo - not ArcGIS)

• The Integrated Model enforces a theoretically better solution with the potential of lost performance (e.g. Laserscan)

• The industry is moving from Hybrid to Integrated models

[Figure: THE REAL WORLD → DATA MODEL → DATA STRUCTURE → FILE STRUCTURE, where the SPATIAL DATA MODEL and DATABASE MODEL are combined in EITHER an INTEGRATED MODEL OR a HYBRID MODEL]
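
The difference between the two implementation models can be sketched with an in-memory SQLite database. This is only an illustration of the idea: the table names, the parcel data, the dictionary standing in for a separate geometry file, and the use of well-known text for geometry are all assumptions, not a description of how ArcInfo or Laserscan work internally.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hybrid (geo-relational): geometry lives outside the database, in its own
# store (a dict standing in for a geometry file); the table holds attributes.
geometry_store = {1: [(0, 0), (10, 0), (10, 10), (0, 0)]}
db.execute("CREATE TABLE parcel_attrs (id INTEGER PRIMARY KEY, owner TEXT)")
db.execute("INSERT INTO parcel_attrs VALUES (1, 'Smith')")

def hybrid_lookup(parcel_id):
    """Fetch attributes from the table and geometry from the separate store."""
    owner = db.execute("SELECT owner FROM parcel_attrs WHERE id = ?",
                       (parcel_id,)).fetchone()[0]
    return owner, geometry_store[parcel_id]      # two stores, joined by id

# Integrated: geometry and attributes share one table in the same database
# (geometry held here as well-known text for simplicity).
db.execute("CREATE TABLE parcel (id INTEGER PRIMARY KEY, owner TEXT, geom TEXT)")
db.execute("INSERT INTO parcel VALUES "
           "(1, 'Smith', 'POLYGON((0 0, 10 0, 10 10, 0 0))')")

def integrated_lookup(parcel_id):
    """Fetch attributes and geometry together with a single query."""
    return db.execute("SELECT owner, geom FROM parcel WHERE id = ?",
                      (parcel_id,)).fetchone()
```

In the hybrid case the application must keep the two stores consistent itself; in the integrated case the database can manage both together.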

Choosing Models and Structures

1. GENERALITY

– Should support spatial databases developed at a variety of scales for a variety of purposes

2. SIMPLICITY

– Should be as simple as is consistent with other goals. Simplicity is the key to efficient and reliable information

3. EFFICIENCY

– Should support efficient geo-processing functions directly, that is without converting data to special analysis or edit formats e.g. should not have to convert from vector to raster to do polygon overlay

4. ADAPTABILITY

– Should be possible to adapt the data model for specific applications e.g. by adding or removing feature attributes

5. FREEDOM FROM RESTRICTIONS

– Should be no inherent limitations on size or content, particularly where large production applications are envisaged, e.g. ESRI coverage has a limit of 500 points per line. Shapefiles have restrictions of max size (2 GB) and 255 fields.

Conclusions

• Data Models, and the Data Structures used to represent them, are crucial to successful implementation of GIS

• The real world is infinitely complex; and careful selection of properties is required

• There is a significant range of models and structures used within GIS

• There are no right or wrong models and structures, just better and worse ones for a particular application

• Transfer standards are also a big (and potentially expensive) issue

• GIS includes spatial data and attribute data; both are of equal importance, but are very different in form

• The Integrated - Hybrid Debate is one which is particularly current, and one we will come back to in this module

• Haven't said much about structures, that comes later...

It's all about how you model space

References

+ Many other books and papers specified in the reference list on the module outline relating to individual lectures
