Representation of spatial data
GIS thematic layers, raster and vector, conversion, subdivision representation, continuous data: contours, DEMs, TINs
Thematic map layers
• Separate storage of data according to theme: map layers (or data layers)
• GIS typically use tens to hundreds of map layers
• For example: municipality borders, land use, cadastral boundaries, water pipes, churches, etc.
Example map layers
Census data, 1995 (U.S.A.)
Geometry, topology and attributes
• Geometry: coordinates
• Topology: adjacency relations of objects
• Attributes: properties, values
Example: country map of South America
– Geometry: coordinates of the borders
– Topology: which countries border which
– Attributes: names of countries, population, etc.
Representation of geometry
• Two main approaches: raster and vector
• Can also be mixed in a GIS: any map layer can use either
• Conversion raster-vector and vice versa possible
• Representation depends on type of data, way of acquisition, desired operations, etc.
Raster structure
• Division of space into equal-size cells (squares, pixels)
• Theme gives cells a value (nominal, ordinal, interval, ratio, vector, …)
• Cells should not contain any further spatial information (more detail)
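The raster structure can be sketched as a 2-dim array plus the information needed to map a coordinate to a cell. This is a minimal illustration only: the class name `Raster`, its fields, and the convention that row 0 is the bottom row are assumptions, not part of the slides.

```python
from dataclasses import dataclass


@dataclass
class Raster:
    """Hypothetical minimal raster layer: equal-size square cells, one value per cell."""
    origin_x: float   # x of the lower-left corner of the grid (assumed convention)
    origin_y: float   # y of the lower-left corner of the grid
    cell_size: float  # side length of a cell
    values: list      # values[row][col]; row 0 is the bottom row (assumed convention)

    def cell_of(self, x, y):
        """Return (row, col) of the cell that contains the point (x, y)."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((y - self.origin_y) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise IndexError("point lies outside the raster")
        return row, col

    def value_at(self, x, y):
        """Return the theme value of the cell that contains (x, y)."""
        row, col = self.cell_of(x, y)
        return self.values[row][col]


# A nominal land-use theme on a 2 x 3 grid with cells of size 10.
land_use = Raster(0.0, 0.0, 10.0, [["forest", "forest", "water"],
                                   ["forest", "urban", "water"]])
```

Every point inside a cell gets the same value, which is why a cell cannot carry any finer spatial detail.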
Data in raster form
Point object in raster form
Line object in raster form
Plane object in raster form
Raster maps
Raster: pros and cons
Pros:
• Simple structure
• Simple operations
• Obtained after scanning, remote sensing
Cons:
• Less suitable for point and line objects: representation does not follow intuition
• Network analysis difficult
• Not adaptive: no difference in detail possible in different regions
• Either expensive in memory, or little precision
• Not obtained after digitizing
Raster: memory reduction
• Run-length encoding: instead of a 2-dim array, store each run as start pixel, value and length of run
• Block encoding: 2-dim version
• Disadvantage: makes structure and operations much more complex
Example run: (34,67) forest 9
Example block: (34,67) forest 4,6
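Run-length encoding of a single raster row might look as follows. This is a sketch under assumptions: the function names are illustrative, and the start pixel is written as (row, column), which the slides do not specify.

```python
def run_length_encode(row_values, row_index):
    """Encode one raster row as (start pixel, value, run length) triples."""
    runs = []
    start = 0
    for col in range(1, len(row_values) + 1):
        # Close the current run at the end of the row or when the value changes.
        if col == len(row_values) or row_values[col] != row_values[start]:
            runs.append(((row_index, start), row_values[start], col - start))
            start = col
    return runs


def run_length_decode(runs, width):
    """Rebuild the row of the given width from its runs."""
    row = [None] * width
    for (_, start), value, length in runs:
        row[start:start + length] = [value] * length
    return row


row = ["forest"] * 3 + ["water"] * 2 + ["forest"]
runs = run_length_encode(row, 67)
```

The saving depends on the data: long uniform runs compress well, while a row in which every pixel differs needs more storage than the plain array.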
Vector structure
• Objects stored as points, lines and areas
• Points have coordinates; lines connect points; areas are delimited by lines
• Attributes are stored with the objects (point, line or areal)
Vector: pros and cons
Pros:
• Elegant structure; fits point, line and areal objects alike
• Small storage consumption
• Precise
• Adaptive: additional control points possible
• Network and cluster analysis possible
• Obtained after digitizing
Cons:
• Relatively complex
• Map overlay and buffer computation complex
Vector representation of a region
• Not necessarily simply-connected:
– NL has islands
– NL has holes (Baarle-Nassau / Baarle-Hertog); there are even regions in these holes
Representation of subdivisions
Subdivisions: spaghetti model
• Every chain is represented by a list with coordinate pairs
• Split nodes are doubly stored
• Areas are not present explicitly
C1: (..,..), (..,..), (..,..), ...
C2: (..,..), (..,..), (..,..), ...
C3: (..,..), (..,..), (..,..), ...
[Figure: subdivision with chains C1 to C6]
Subdivisions: polygon ring structure
• Every area is represented by a list with coordinate pairs
• Control points are doubly stored
• Neighbor areas are difficult to determine
• Consistency is difficult to maintain
P1: (..,..), (..,..), (..,..), ...
P2: (..,..), (..,..), (..,..), ...
P3: (..,..), (..,..), (..,..), ...
[Figure: subdivision with polygons P1, P2, P3]
Subdivisions: topological structure (node-link structure)
• Nodes are objects with coordinates
• Edges are connections of nodes
• Sequences of edges along polygon boundaries form cycles
• Polygons are objects that can access their boundaries
Doubly-connected edge list
Subdivisions: topological structure
• Edges are split into directed half-edges
• Half-edges have pointers to
– Twin half-edge
– Origin vertex
– Next and Prev half-edges of incident polygon
– Incident polygon
• Polygons have pointers to half-edges, one in each bounding cycle
[Figure: a half-edge with its Twin, Origin, Next and Prev pointers and the polygons on either side]
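The pointer structure of the doubly-connected edge list can be sketched with a few small classes. The class and function names (`Vertex`, `Face`, `HalfEdge`, `cycle`, `neighbors`, `build_triangle`) are illustrative assumptions, and the example builds only a single triangle with its unbounded outer face.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Vertex:
    x: float
    y: float


@dataclass(eq=False)
class Face:
    name: str
    boundary: list = field(default_factory=list)  # one half-edge per bounding cycle


@dataclass(eq=False)
class HalfEdge:
    origin: Vertex
    twin: Optional["HalfEdge"] = None
    next: Optional["HalfEdge"] = None
    prev: Optional["HalfEdge"] = None
    face: Optional[Face] = None  # incident polygon


def cycle(start):
    """Yield the half-edges of the bounding cycle that contains `start`."""
    e = start
    while True:
        yield e
        e = e.next
        if e is start:
            break


def neighbors(face):
    """Polygons adjacent to `face`, found through the Twin pointers."""
    result = []
    for start in face.boundary:
        for e in cycle(start):
            other = e.twin.face
            if other is not face and other not in result:
                result.append(other)
    return result


def build_triangle():
    """Build one triangle: an inner face and the unbounded outer face."""
    verts = [Vertex(0, 0), Vertex(1, 0), Vertex(0, 1)]
    inner, outer = Face("inner"), Face("outer")
    ins = [HalfEdge(origin=v, face=inner) for v in verts]
    outs = [HalfEdge(origin=verts[(i + 1) % 3], face=outer) for i in range(3)]
    for i in range(3):
        ins[i].twin, outs[i].twin = outs[i], ins[i]
        ins[i].next, ins[i].prev = ins[(i + 1) % 3], ins[(i - 1) % 3]
        # The outer cycle runs the opposite way round.
        outs[i].next, outs[i].prev = outs[(i - 1) % 3], outs[(i + 1) % 3]
    inner.boundary.append(ins[0])
    outer.boundary.append(outs[0])
    return inner, outer


inner, outer = build_triangle()
```

Walking `next` retrieves a polygon boundary, and following `twin` crosses to the adjacent polygon, so both retrievals need only local pointer steps.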
Subdivisions: topological chain structure
• Splitting nodes are objects with coordinates
• Chains are connections between splitting nodes and contain zero or more nodes with coordinates
• Sequences of chains along polygon boundaries form cycles
• Polygons are objects that can access their boundaries
Doubly-connected chain list
[Figure: doubly-connected chain list with half-chains]
Vector structures
                Memory  Duplication  Polygon retrieve  Topology retrieve
Spaghetti       ++      +            --                -
Polygon ring    -       --           ++                -
DC edge list    --      ++           -                 +
DC chain list   ++      ++           +                 ++
Raster-vector conversion
• Vector-to-raster: as in computer graphics, by scan-conversion of lines, etc.
• Raster-to-vector: treat pixel sides between pixels with different values as boundaries and store them in vector representation; follow with thinning and line simplification
• Needed e.g. for data integration
[Figure: raster-vector conversion with thinning]
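Scan-conversion of a line segment can be sketched with Bresenham's line algorithm, a standard technique from computer graphics. The slides do not name a specific method, so this choice is an assumption; the function name `rasterize_segment` and the integer (column, row) cell coordinates are illustrative as well.

```python
def rasterize_segment(x0, y0, x1, y1):
    """Return the grid cells (col, row) on a segment, using Bresenham's line algorithm."""
    cells = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1  # step direction in x
    sy = 1 if y0 < y1 else -1  # step direction in y
    err = dx + dy              # running error term
    x, y = x0, y0
    while True:
        cells.append((x, y))
        if x == x1 and y == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step in x
            err += dy
            x += sx
        if e2 <= dx:  # step in y
            err += dx
            y += sy
    return cells
```

For example, `rasterize_segment(0, 0, 3, 0)` returns the four cells of a horizontal run. Areas need an additional fill step, which is not shown here.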
Line simplification
• Douglas-Peucker algorithm from 1973
• Input: chain p1, …, pn and error ε
[Figure: chain from p1 to pn]
DP-algorithm
• Draw line segment between first and last point
• If all points in between are within error: ready
• Otherwise, determine farthest point and recursively continue on the part up to the farthest point and the part after it
DP-algorithm
DP-standard(i, j, ε)
  Determine farthest point pk between pi and pj
  If distance(pk, pi pj) > ε then
    DP-standard(i, k, ε)
    DP-standard(k, j, ε)
    Return the concatenation of the simplifications
  Else return the segment pi pj
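The recursion can be sketched in Python as follows. The function names are illustrative, and the code assumes that distance(pk, pi pj) means the distance from pk to the segment pi pj (not to the infinite line through it). The function returns indices into the chain rather than coordinates.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dp_standard(points, i, j, eps):
    """Return the indices kept when simplifying points[i..j] (inclusive, i < j)."""
    if j <= i + 1:
        return [i, j]  # no points in between
    # Farthest point pk strictly between pi and pj.
    k = max(range(i + 1, j),
            key=lambda m: point_segment_distance(points[m], points[i], points[j]))
    if point_segment_distance(points[k], points[i], points[j]) > eps:
        left = dp_standard(points, i, k, eps)
        right = dp_standard(points, k, j, eps)
        return left[:-1] + right  # concatenate, keeping pk only once
    return [i, j]  # all points in between are within the error


chain = [(0, 0), (1, 0), (2, 3), (3, 0), (4, 0)]
kept = dp_standard(chain, 0, len(chain) - 1, 1.0)
```

With error 1.0 only the two endpoints and the peak at index 2 survive; a larger error removes the peak too, and a smaller one keeps every point.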
Properties of the DP-algorithm
• DP-algorithm does not minimize the number of points in the simplification
[Figure: output of the DP-algorithm compared with an optimal simplification]
Properties of the DP-algorithm
• Determining farthest point takes O(n) time
• Whole algorithm takes T(n) = T(m) + T(n-m+1) + O(n), T(2) = O(1) time, splitting in m and n-m+1 points
• “Fair” split gives O(n log n) time
• Worst case gives quadratic time
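The two bounds follow from the recurrence by substituting the split size. The worked steps below are a sketch; taking m ≈ n/2 for the fair split and m = 2 for the worst case is the usual reading, not something the slides spell out.

```latex
\begin{align*}
\text{fair split } (m \approx n/2):\quad
  & T(n) = 2\,T(n/2) + O(n) \;\Rightarrow\; T(n) = O(n \log n) \\
\text{worst case } (m = 2):\quad
  & T(n) = T(2) + T(n-1) + O(n) = T(n-1) + O(n) \\
  & \Rightarrow\; T(n) = O(1) + \sum_{k=3}^{n} O(k) = O(n^2)
\end{align*}
```

The worst case occurs when the farthest point is always next to an endpoint, so each recursive call removes only one point from the problem.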
Properties of the DP-algorithm
• DP-algorithm may give self-intersections in the output
• Solution: test output for self-intersections and continue adding control points if necessary
Improved DP-algorithm
DP-improved(i, j, ε)
  Simp = DP-standard(i, j, ε)
  V = set of intersecting segments of Simp
  Repeat
    For all segments s ∈ V:
      Refine(s) in Simp; do 1 refinement à la DP by adding the farthest point, giving a new Simp
    V = set of intersecting segments of Simp
  Until V is empty
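The step "V = set of intersecting segments of Simp" can be sketched with an orientation test. This is a brute-force illustration under assumptions: the function names are hypothetical, only proper crossings are reported (touching endpoints and collinear overlaps are ignored), and all pairs are compared, which takes quadratic time.

```python
def orientation(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def properly_intersect(p, q, r, s):
    """True if segments p-q and r-s cross in a single interior point."""
    return (orientation(p, q, r) * orientation(p, q, s) < 0 and
            orientation(r, s, p) * orientation(r, s, q) < 0)


def intersecting_segments(chain):
    """Return index pairs (a, b) of non-adjacent chain segments that cross.

    Segment a connects chain[a] and chain[a + 1].
    """
    pairs = []
    n = len(chain) - 1  # number of segments
    for a in range(n):
        for b in range(a + 2, n):  # adjacent segments share an endpoint; skip them
            if properly_intersect(chain[a], chain[a + 1], chain[b], chain[b + 1]):
                pairs.append((a, b))
    return pairs
```

For the chain `[(0, 0), (2, 2), (2, 0), (0, 2)]` the first and third segments cross at (1, 1), so the result is `[(0, 2)]`.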
Continuous data representation
• Data on interval or ratio measurement scale
• Data values of nearby points will usually not be very different
• Representation is necessarily an approximation: finite representation of information with infinite detail
• Raster (one model) or vector (two models)
Digital Elevation Model (DEM)
Elevation models
• (Elevation) grid: raster
• Contour line model: vector
• Triangulation (TIN; triangulated irregular network): vector
[Figure: example elevation grid, contour lines and TIN]
Grid elevation model
TIN elevation model
Elevation models
• Contour model well-suited for visualisation, not for representation or storage
• Interpretations of a grid:
– elevation holds for the whole cell: not a continuous model
– elevation holds at the middle of the cell: interpolation needed; how?
• Advantage grid: simple storage, operations simple too
• Advantage TIN: more efficient in storage, adaptive
Interpolation for grid
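[Figure: four neighboring grid values, 20 and 18 in the top row, 22 and 18 in the bottom row]
• Linear interpolation: saddle point problem
• Linear interpolation with an additional point
• Non-linear interpolation
• Value at the center as the average of the four: (20 + 18 + 18 + 22) / 4 = 19.5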
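The numbers above can be checked with a short sketch. Bilinear interpolation is used here as one common non-linear choice; the slides do not say which non-linear method they mean, so that is an assumption, as are the function names and the placement of the four values at the corners of a unit square.

```python
def bilinear(z00, z10, z01, z11, u, v):
    """Bilinear interpolation at (u, v) in the unit square.

    z00 is the value at (0, 0), z10 at (1, 0), z01 at (0, 1), z11 at (1, 1).
    """
    bottom = z00 * (1 - u) + z10 * u
    top = z01 * (1 - u) + z11 * u
    return bottom * (1 - v) + top * v


def center_by_diagonal(z_a, z_b):
    """Center value when the square is split into two triangles along the
    diagonal from the corner with value z_a to the corner with value z_b."""
    return (z_a + z_b) / 2


# Values from the example: top row 20, 18; bottom row 22, 18.
z00, z10, z01, z11 = 22, 18, 20, 18

center_bilinear = bilinear(z00, z10, z01, z11, 0.5, 0.5)  # 19.5
center_diag_1 = center_by_diagonal(z01, z10)              # diagonal 20 to 18: 19.0
center_diag_2 = center_by_diagonal(z00, z11)              # diagonal 22 to 18: 20.0
```

Linear interpolation on two triangles gives 19.0 or 20.0 at the center depending on which diagonal is chosen, which shows why the choice of diagonal matters. The bilinear value at the center equals the average of the four values, 19.5.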
Topological TIN structure
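• With explicit vertex and triangle representation
• Vertices store x, y-coordinates and elevation
[Figure: triangle t with vertices u, v, w and neighboring triangles t1, t2, t3; the record for t lists t1, t2, t3 and u, v, w]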
Because t1 has pointers to two of the same vertices as t, we can determine their shared edge, even though it is not represented explicitly
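A minimal sketch of the structure with explicit vertices and triangles, and of recovering a shared edge from the vertex pointers. The class names `TinVertex` and `Triangle`, the `neighbors` field and the function `shared_edge` are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class TinVertex:
    x: float
    y: float
    z: float  # elevation


@dataclass(eq=False)
class Triangle:
    vertices: tuple  # pointers to three TinVertex objects
    # Pointers to the up to three adjacent triangles (None on the outer boundary).
    neighbors: list = field(default_factory=lambda: [None, None, None])


def shared_edge(t, t1):
    """Return the two vertices that t and t1 have in common, or None.

    The edge itself is not stored; it is implied by the two shared vertices.
    """
    common = [p for p in t.vertices if any(p is q for q in t1.vertices)]
    return tuple(common) if len(common) == 2 else None


u, v, w = TinVertex(0, 0, 10), TinVertex(1, 0, 12), TinVertex(0, 1, 15)
p = TinVertex(1, -1, 9)
t = Triangle((u, v, w))
t1 = Triangle((v, u, p))
t.neighbors[0], t1.neighbors[0] = t1, t
```

Here `shared_edge(t, t1)` returns the pair `(u, v)`. Vertices are compared by identity, since two distinct vertices could have equal coordinates in a hand-built example.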
• Alternatively, edges have an explicit representation too
[Figure: triangle t with edges e1, e2, e3, vertices u, v, w and neighboring triangles t1, t2, t3]
Summary representation
• Objects have geometry and attributes; at least the attributes are in a database
• Geometry can be stored in raster or vector form; each has advantages and disadvantages
• Important geometric types of representations are those for subdivisions and for elevation models
• For subdivisions, the doubly-connected chain list is the most suitable structure
• For elevation models, grids or TINs are most useful