gis - lecture 4
TRANSCRIPT
8/20/2010
GIS Data Structures and Models
GE 517 – Geographic Information Systems
Engr. Ablao
Digital Geographic Data
Numerical representations that describe real-world features and phenomena
Must be encoded in digital form and organized as a geographic database to be useful to GIS
This digital database creates our perception of the real world
8/20/2010 Geographic Information System
Conventional Data vs. Digital Geographic Data
Conventional Data: Paper map
  Static representation
  Represents a general-purpose snapshot of the real world at a given time only
Digital Geographic Data:
  Dynamic representation
  Allows a range of functions for storing, processing, analyzing, and visualizing spatial data
  Has tools that allow users to interact with the data to meet their objectives
FIGURE 1. Bringing real-world data into a GIS or a map requires several abstractions and/or conversions of the original data. (Freely adapted from Bernhardsen, 1992.)
Physical reality → Real world model → Data model → Database → Map / Report
  Physical reality: Actual phenomenon (properties, connections)
  Real world model: Entity (type, attributes, relationship)
  Data model: Object (type, attributes, relationship, geometry, quality)
  Database: Object (type, attributes, relationship, geometry, quality)
  Map / Report: Symbol, line, text
Figure 2
What is a Data Model?
The heart of any GIS
A way of representing data
A set of constructs for describing and representing selected aspects of the real world in a computer.
Longley, Goodchild, Maguire & Rhind
Geographic Data Models
REAL WORLD
  Field-based model → Raster data model
  Object-based model → Vector data model
Figure 3
Raster and Vector Data Models
Figure 4
Raster Data Model
Treats geographic space as populated by one or more spatial phenomena, which vary continuously over space and have no obvious boundaries.
Best used to represent geographic features that are continuous over a large area (e.g. soil type, vegetation, etc.)
Uses an array of rectangular cells/pixels/grids to represent real-world objects.
Each cell is defined by a coordinate location and an attribute that identifies the feature.
The cell’s linear dimensions define the spatial resolution.
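The cell definition above (a coordinate location plus an attribute, with the cell size as the resolution) can be sketched as follows. This is only an illustration: the class name RasterGrid, the upper-left origin convention, and the sample values are assumptions, not something given in the slides.

```python
from dataclasses import dataclass


@dataclass
class RasterGrid:
    """Hypothetical minimal raster: a 2-D list of attribute codes plus georeferencing."""
    origin_x: float   # x of the upper-left corner of the grid (assumed convention)
    origin_y: float   # y of the upper-left corner of the grid (assumed convention)
    cell_size: float  # linear dimension of a square cell, i.e. the spatial resolution
    cells: list       # cells[row][col] holds the attribute that identifies the feature

    def cell_center(self, row, col):
        """Coordinate location of the centre of cell (row, col)."""
        x = self.origin_x + (col + 0.5) * self.cell_size
        y = self.origin_y - (row + 0.5) * self.cell_size
        return (x, y)

    def value_at(self, x, y):
        """Attribute of the cell that covers the location (x, y)."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((self.origin_y - y) // self.cell_size)
        if not (0 <= row < len(self.cells) and 0 <= col < len(self.cells[0])):
            raise ValueError("location falls outside the grid")
        return self.cells[row][col]


# Made-up 2 x 3 grid with 10-unit cells
grid = RasterGrid(origin_x=1000.0, origin_y=2000.0, cell_size=10.0,
                  cells=[["forest", "forest", "grass"],
                         ["forest", "grass", "grass"]])
print(grid.cell_center(0, 0))
print(grid.value_at(1025.0, 1985.0))
```

Any location inside a cell returns the same attribute, which is why the cell size sets the finest detail the grid can record.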
Raster Data Model
Figure 5
Raster Data Model
Figure 6
Raster Data Model
Figure 7
A point is represented as a single cell.
Figure 8 (labels: well, tree, rock)
A line is represented as a string of cells.
Figure 9 (labels: river, road, canal)
An area is a contiguous block of cells.
Figure 10 (labels: grassland, barn, forest)
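The three raster representations (a point as a single cell, a line as a string of cells, an area as a contiguous block of cells) can be sketched on one small grid. The grid size, the one-letter codes, and the feature positions are made up for illustration.

```python
EMPTY = "."
grid = [[EMPTY] * 6 for _ in range(5)]  # 5 rows x 6 columns, all empty

# Point: a single cell (a hypothetical well)
grid[0][5] = "W"

# Line: a string of cells (a hypothetical road running along row 2)
for col in range(6):
    grid[2][col] = "R"

# Area: a contiguous block of cells (a hypothetical forest)
for row in range(3, 5):
    for col in range(0, 3):
        grid[row][col] = "F"

rendered = ["".join(row) for row in grid]
for line in rendered:
    print(line)
```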
Raster Data Model
The cell size determines the resolution of the data.
The grid cell has an extent.
When we map features onto the grid, there is sometimes an imperfect fit.
Each grid cell can usually be “owned” by one feature, that is, the one whose attribute it holds (mixed pixels).
Figure 11
Figure 12
Raster Data Compression
A single raster file usually contains several million grid cells, leading to a very large data file
Therefore, data compression is very important in the digital representation of raster data!
Raster Data Compression
Based on the realization that cells representing the same feature type have identical values and that they tend to be spatially clumped
Examples of raster data compression techniques are:
  Run length encoding
  Wavelet compression
Raster Data Compression
Run length encoding
  Adjacent cells along a row having the same value are treated as a group known as a ‘run’
  Instead of storing the similar cell values individually, the value is stored just once, together with the number of cells that comprise the run
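A minimal sketch of run length encoding for one raster row, following the description above. The function names and the (value, count) pair layout are illustrative assumptions; real raster formats store runs in their own binary layouts.

```python
def rle_encode_row(row):
    """Collapse adjacent equal cell values into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            # Same value as the previous cell: extend the current run
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # Value changed: start a new run of length 1
            runs.append((value, 1))
    return runs


def rle_decode_row(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row


row = ["A", "A", "A", "B", "B", "A", "C", "C", "C", "C"]
encoded = rle_encode_row(row)
print(encoded)  # [('A', 3), ('B', 2), ('A', 1), ('C', 4)]
```

Ten cell values become four runs here. The saving depends on how spatially clumped the values are: a row in which every cell differs from its neighbour would need more storage after encoding, not less.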
Run length encoding
Figure 13
Raster Data Compression
Wavelet Compression
  More efficient compression
  Can achieve a ratio between 15:1 and 20:1 for 8-bit grayscale images and between 30:1 and 40:1 for 24-bit color images
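To get a feel for what these ratios mean in storage terms, the arithmetic can be sketched as below. The 4000 x 4000 image size is a made-up example and the helper names are assumptions; only the ratios come from the slide.

```python
def raw_size_bytes(rows, cols, bits_per_pixel):
    """Uncompressed size of a raster with the given dimensions and bit depth."""
    return rows * cols * bits_per_pixel // 8


def compressed_size_bytes(raw_bytes, ratio):
    """Size after compression at a ratio of ratio:1."""
    return raw_bytes / ratio


raw_gray = raw_size_bytes(4000, 4000, 8)     # 8-bit grayscale
raw_color = raw_size_bytes(4000, 4000, 24)   # 24-bit color

print(raw_gray, compressed_size_bytes(raw_gray, 15), compressed_size_bytes(raw_gray, 20))
print(raw_color, compressed_size_bytes(raw_color, 30), compressed_size_bytes(raw_color, 40))
```

For this hypothetical image, 16 million bytes of grayscale data shrink to 0.8 million bytes at 20:1, and 48 million bytes of color data shrink to 1.2 million bytes at 40:1.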
Raster File Formats
BMP (bitmaps) – used by graphics in Microsoft Windows applications; no compression
DIB (Device Independent Bitmaps)
GIF (Graphics Interchange Format) – widely used for images to be used on the World Wide Web
TIFF (Tagged Image File Format) – non-proprietary, system-independent
JPEG (Joint Photographic Experts Group) – primarily for storage of photographic images and for the World Wide Web
GeoTIFF – extension of the TIFF format that contains georeferencing information
PNG (Portable Network Graphics) – patent-free; intended to replace the GIF format
PCX – supported by many image scanners
MrSID (Multi-resolution Seamless Image Database)
GRID – proprietary format used by ESRI in ArcInfo and ArcView GIS
Quadtree Data Model
A hierarchical field-based model that uses grid cells of variable sizes
Uses finer subdivisions in areas where finer detail occurs
Geographic space is divided by the process of recursive decomposition
More difficult compared to the simple raster model
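The recursive decomposition can be sketched as follows, assuming a square grid whose side is a power of two. The nested-dictionary representation and the NW/NE/SW/SE keys are illustrative choices, not a standard storage format.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Recursively split a square block into quadrants until each block is uniform.

    Returns the cell value for a homogeneous block (a leaf), or a dict of
    four child quadrants for a mixed block.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(grid[r][c] == first
           for r in range(row, row + size)
           for c in range(col, col + size)):
        return first  # homogeneous block: no further subdivision needed
    half = size // 2
    return {
        "NW": build_quadtree(grid, row, col, half),
        "NE": build_quadtree(grid, row, col + half, half),
        "SW": build_quadtree(grid, row + half, col, half),
        "SE": build_quadtree(grid, row + half, col + half, half),
    }


def count_leaves(node):
    """Number of stored blocks in the quadtree."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1


grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(grid)
print(tree)
print(count_leaves(tree))  # 7 blocks instead of 16 cells
```

Only the south-east quadrant, where the values are mixed, is subdivided down to single cells; the three uniform quadrants are each stored as one large cell.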
Quadtree Data Model
Figure 14
Digital Elevation Model
A raster-based representation of a surface
Figure 15. Southwest Corner of the Morrison Quadrangle, Colorado
Remotely sensed data are a special type of raster data.
Figure 16
Other forms of raster data
Aerial photos
Scanned maps
Pictures
Converted data
Figure 17
Vector Data Model
Treats geographic space as populated by discrete objects, which have identifiable boundaries or spatial extent
Each object in the real world is represented as either a point, line, or polygon feature.
Characterized by the use of sequential points or vertices to define a linear segment, each vertex consisting of an X coordinate and a Y coordinate.
Useful for representing discrete objects such as roads, buildings, rivers, boundaries, etc.
Vector Data Model
Vector Data Model Raster Data Model
Vector Data Model
Vector data model
The vector is composed of a string of points, each represented by its exact spatial coordinates.
Points are represented by their coordinates.
Vector data model
Lines are defined as a string of x- and y- coordinates.
Vector data model
Areas are defined as a string of x- and y-coordinates that terminates at its origin, i.e. the last coordinate pair repeats the first.
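The three vector primitives can be sketched with plain coordinate tuples. The feature names and coordinates are made up, and the helper functions are illustrative rather than part of any GIS package.

```python
import math

# Point: one coordinate pair
well = (3.0, 4.0)

# Line: a string of x, y coordinate pairs
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]

# Area: a string of coordinate pairs that terminates where it started
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def line_length(vertices):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def is_closed(vertices):
    """True if the string of coordinates ends at its origin."""
    return len(vertices) >= 4 and vertices[0] == vertices[-1]


def polygon_area(ring):
    """Area of a closed ring by the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


print(line_length(road))     # 8.0
print(is_closed(parcel))     # True
print(polygon_area(parcel))  # 12.0
```

Because the coordinates are stored exactly, length and area come straight from the geometry rather than from counting cells.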
Types of Vector Data Model
1. Simple (spaghetti) data model
2. Topologic data model
Simple (Spaghetti) Data Model
A one-for-one translation of the analog map
Features (lines and polygons) may overlap
No relationships between features/entities
Generally used where analysis is not important (e.g. plotting)
Each entity is a single, logical record in the computer, coded as variable-length strings of x- and y-coordinate pairs
Examples are digitized maps and CAD data
Simple (Spaghetti) Data Model
Advantages:
  Easy to create and store
  Can be retrieved and rendered on screen very quickly
Simple (Spaghetti) Data Model
Disadvantages:
  Spatial analysis (e.g. shortest-path network analysis) cannot be performed easily due to the lack of any connectivity relationships
  Redundant, since the boundary segment between two polygons can be stored twice (once for each feature)
  Inflexible, since it is difficult to dissolve common boundaries when joining polygons
  Cannot be used in certain applications such as land management because of overlapping features
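The redundancy point can be made concrete with two adjacent parcels stored spaghetti-style, each as its own independent coordinate string. The parcel coordinates and the segments helper are hypothetical.

```python
def segments(ring):
    """Boundary segments of a closed ring, ignoring drawing direction."""
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:])}


# Two adjacent parcels, each stored as an independent coordinate string
parcel_a = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
parcel_b = [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)]

# The common boundary appears in both records
shared = segments(parcel_a) & segments(parcel_b)
print(shared)
```

The segment from (2, 0) to (2, 2) is stored once per parcel, and nothing in the data says the two copies are the same boundary. This comparison only finds it because the vertices happen to match exactly; in digitized data the two copies often differ slightly, which is one reason dissolving common boundaries is difficult in this model.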
Topological Data Model
Spaghetti model but with topology!
Essentially just a simple data model structured using topologic rules
Often referred to as an intelligent data structure because spatial relationships between geographic features are easily derived when using it
The most commonly used variant is the arc-node data model
Uses nodes (intersection of two lines; end point of a line; any point on a line) and arcs (line connecting two nodes)
Topological Data Model Components
Point (vertex) – where a line originates or terminates
Node – where a link originates or terminates
Line – a segment between two points (vertices)
Link (or arc or chain) – a connection between two nodes
  – may consist of several lines which are joined at points (vertices)
  – can only originate, terminate or be connected at nodes
Polygon (area) – composed of links
  – adjacent polygons have only one link between them
Elements of Topological Relationship
Connectivity
  Information about linkages among spatial objects
  Keep track of which links are connected at a node
Adjacency
  Information about neighbourhood among spatial objects
  A link can determine the polygon to its left and its right
Containment
  Information about inclusion of one spatial object within another spatial object
  What nodes and links and other polygons are within a polygon
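Connectivity and adjacency can be sketched with a small link (arc) table. The table layout, the IDs, and the "OUT" label for the area outside all polygons are assumptions made for illustration; containment is not covered by this sketch.

```python
from collections import defaultdict

# Hypothetical link table: link id -> (from_node, to_node, left_polygon, right_polygon)
# Two polygons, P1 (west) and P2 (east), share link a2, which runs north from N1 to N2.
arcs = {
    "a1": ("N2", "N1", "P1", "OUT"),  # outer boundary of P1
    "a2": ("N1", "N2", "P1", "P2"),   # boundary shared by P1 and P2
    "a3": ("N1", "N2", "P2", "OUT"),  # outer boundary of P2
}


def arcs_at_node(arcs):
    """Connectivity: which links are connected at each node."""
    index = defaultdict(set)
    for arc_id, (start, end, _left, _right) in arcs.items():
        index[start].add(arc_id)
        index[end].add(arc_id)
    return index


def neighbours(arcs, polygon):
    """Adjacency: polygons on the other side of the given polygon's links."""
    result = set()
    for _start, _end, left, right in arcs.values():
        if left == polygon:
            result.add(right)
        elif right == polygon:
            result.add(left)
    return result - {"OUT"}


print(sorted(arcs_at_node(arcs)["N1"]))  # ['a1', 'a2', 'a3']
print(neighbours(arcs, "P1"))            # {'P2'}
```

Both answers come from table lookups alone; no coordinates are compared.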
Elements of Topological Relationship
Topological Map
Explains the concept of topological relationships
A map that contains explicit topological information on top of the geometric information expressed in coordinates (Masry and Lee, 1988)
All spatial entities are decomposed and represented in the form of the three basic graphical elements (point, line, polygon)
Concept of a Topological Map
Arc-node Data Model
Stores graphical elements rather than graphical entities
Explicitly stores spatial relationships between the graphical elements, and relationships between the arcs and respective nodes
Stored topological relationships allow graphical entities to be constructed using the basic graphical elements
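A minimal sketch of how a graphical entity (a polygon) might be constructed from stored elements (nodes and arcs). The table layouts, IDs, and the +1/-1 direction flag are hypothetical and do not reproduce the actual ArcInfo structure.

```python
# Node table: node id -> coordinates
nodes = {"N1": (2.0, 0.0), "N2": (2.0, 2.0)}

# Arc table: arc id -> (from_node, to_node, intermediate vertices)
arcs = {
    "a1": ("N2", "N1", [(0.0, 2.0), (0.0, 0.0)]),
    "a2": ("N1", "N2", []),
    "a3": ("N1", "N2", [(4.0, 0.0), (4.0, 2.0)]),
}

# Polygon table: polygon id -> ordered (arc id, direction); -1 means traverse the arc reversed
polygons = {
    "P1": [("a2", +1), ("a1", +1)],
    "P2": [("a3", +1), ("a2", -1)],
}


def arc_coords(arc_id, direction):
    """Full coordinate string of an arc, in the requested direction."""
    start, end, middle = arcs[arc_id]
    coords = [nodes[start], *middle, nodes[end]]
    return coords if direction > 0 else coords[::-1]


def build_ring(polygon_id):
    """Chain a polygon's arcs end to end into one closed coordinate ring."""
    ring = []
    for arc_id, direction in polygons[polygon_id]:
        coords = arc_coords(arc_id, direction)
        if ring:
            coords = coords[1:]  # skip the node already added by the previous arc
        ring.extend(coords)
    return ring


print(build_ring("P1"))
print(build_ring("P2"))
```

Arc a2 is stored once and used by both polygons, forward for P1 and reversed for P2, so the shared boundary is not duplicated.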
Arc-node Data Model
Network Data Model
A special type of topologic model
Shows how lines connect with each other at nodes
Applications:
  Load analysis over an electric network
  Vehicle routing over a street network
  Pollution tracing over a river network
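As an illustration of vehicle routing over a street network, the sketch below finds a shortest route with Dijkstra's algorithm. The slides do not name an algorithm, so this choice is an assumption, and the street network and its lengths are made up.

```python
import heapq

# Hypothetical street network: node -> {neighbouring node: street length in metres}
streets = {
    "A": {"B": 400, "C": 200},
    "B": {"A": 400, "C": 100, "D": 300},
    "C": {"A": 200, "B": 100, "D": 700},
    "D": {"B": 300, "C": 700},
}


def shortest_path(network, source, target):
    """Dijkstra's algorithm: returns (total length, list of nodes) or None."""
    queue = [(0, source, [source])]  # (distance so far, current node, route taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length in network[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + length, neighbour, path + [neighbour]))
    return None


print(shortest_path(streets, "A", "D"))  # (600, ['A', 'C', 'B', 'D'])
```

The search relies only on which streets meet at which nodes, which is exactly the connectivity a network model stores.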
Use of Topological Relationship
Areas in GIS wherein topological relationships are applied are:
1. Data input and representation
2. Spatial search
3. Construction of complex spatial objects from basic graphical elements
4. Integrity checks in database creation
Data input and representation
Creating a topologic geographic database is extremely time-consuming and error-prone
It is possible to adopt spaghetti digitization for graphical data input and then later let the computer refine and structure the data using topological relationships
Data input and representation
Spatial search
GIS uses topological relationships to perform spatial searches efficiently
For example, the land parcels surrounding a particular parcel can be located immediately using adjacency information
  Polygons that share a common boundary with the parcel of interest are identified
  Their IDs are used to retrieve the descriptive data from the database
Spatial search
Construction of complex spatial objects
Complex objects are represented as complex polygons in a geographic database
Two types of complex polygons are:
  Those containing one or more holes (or islands/enclaves), constructed from vector lines using topological information
  Those made up of two or more polygons that are not physically connected, constructed using a common identifier
Construction of complex spatial objects
Integrity checks
Graphical data must be topologically cleaned before being used in GIS applications
  Must not contain any topological errors
During topology building, the computer detects topological errors and marks them immediately
  The operator can then decide what to do with the errors
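One check of this kind can be sketched as a search for dangling nodes, i.e. nodes where only one arc end arrives. The slides do not list specific error types, so this particular check, the arc table, and the function name are illustrative assumptions.

```python
from collections import Counter


def dangling_nodes(arcs):
    """Return nodes touched by exactly one arc end; these are candidates for review."""
    degree = Counter()
    for start, end in arcs.values():
        degree[start] += 1
        degree[end] += 1
    return sorted(node for node, count in degree.items() if count == 1)


# Hypothetical arc table: arc id -> (from_node, to_node)
arcs = {
    "a1": ("N1", "N2"),
    "a2": ("N2", "N3"),
    "a3": ("N3", "N1"),
    "a4": ("N3", "N4"),  # ends at N4, where nothing else connects
}

print(dangling_nodes(arcs))  # ['N4']
```

The check only marks N4. Whether it is a digitizing error or a legitimate dead end is left for the operator to decide.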
Integrity checks
Triangulated Irregular Network (TIN) Data Model
Creates and represents surfaces in GIS
Surface is represented as contiguous, non-overlapping triangles
Size of triangles may be adjusted to reflect the relief of the surface being modelled
  Larger triangles for relatively flat terrain
  Smaller triangles for rugged terrain
Manages information about the nodes that comprise each triangle and the neighbors of each triangle
Applications:
  Volumetric calculations for road design
  Drainage studies
  Landslide studies
Triangulated Irregular Network (TIN) Data Model
Advantages of TIN Data Model
Incorporates original sample points so the accuracy of the model can be checked
Efficient way of storing surface representations because the size of triangles can be varied according to the terrain
Data structure makes it easy to calculate elevation, slope, aspect, and line-of-sight between points
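Why elevation and slope are easy to calculate can be sketched for a single triangle: its three nodes define a plane z = a*x + b*y + c. The node coordinates are made up and the function names are illustrative; aspect and line-of-sight are left out of this sketch.

```python
import math


def plane_coefficients(p1, p2, p3):
    """Coefficients (a, b, c) of the plane z = a*x + b*y + c through three (x, y, z) nodes."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("triangle nodes are collinear")
    a = ((z2 - z1) * (y3 - y1) - (z3 - z1) * (y2 - y1)) / det
    b = ((x2 - x1) * (z3 - z1) - (x3 - x1) * (z2 - z1)) / det
    c = z1 - a * x1 - b * y1
    return a, b, c


def elevation_at(triangle, x, y):
    """Elevation on the triangle's plane at (x, y); the caller ensures (x, y) lies inside."""
    a, b, c = plane_coefficients(*triangle)
    return a * x + b * y + c


def slope_degrees(triangle):
    """Steepest slope of the triangle's plane, in degrees from horizontal."""
    a, b, _c = plane_coefficients(*triangle)
    return math.degrees(math.atan(math.hypot(a, b)))


# Hypothetical triangle: three sample points with (x, y, elevation)
triangle = ((0.0, 0.0, 100.0), (10.0, 0.0, 105.0), (0.0, 10.0, 100.0))

print(elevation_at(triangle, 4.0, 3.0))  # 102.0
print(slope_degrees(triangle))           # about 26.57
```

Since the original sample points are the triangle nodes, the plane reproduces their elevations exactly, and slope is constant within each triangle.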
Vector File Formats
GBF/DIME (Geographic Base File/Dual Independent Map Encoding) – originally used for storing street maps
TIGER (Topologically Integrated Geographic Encoding and Referencing System) – improvement on GBF/DIME
DLG (Digital Line Graphs) – USGS topographic maps
AutoCAD DXF (Drawing Exchange Format) – widely used as an export format in many GIS
IGDS (Intergraph Design System) – widely used in mapping
ArcInfo Coverage – stores vector graphical data using topological structure explicitly defining spatial relationships
ArcInfo E00 – export format of ArcInfo
Shapefiles – format of ArcView GIS; defines geometry and attribute of geographically referenced objects using the main file, index file and database table
CGM (Computer Graphics Metafile) – ISO standard for vector data format; widely used in PC-based computer graphics applications
Page Description Languages (PDLs)
Considerations in choosing between Raster and Vector
Source and type of data
Analytical procedures to be used
Intended use of data
Raster vs. Vector
Raster Data Model
Advantages:
  Simple data structure
  Easy to conceptualize space representation
  Easy data collection and processing
  Easy and efficient overlaying
  High spatial variability is efficiently represented
  Easier and efficient for surface modelling, wherein individual features are not that important
  Allows easy integration of image data (satellite, remotely-sensed, etc.)
  Pixel values can be modified individually or in a group by using a palette
  Simple for writing one's own programs
Raster vs. Vector
Raster Data Model
Disadvantages:
  Does not provide precise locational and area computation information due to grid cells
  Requires large storage capacity, based on size and number of colors
  No topological processing
  Limited attribute data handling
  Inefficient projection transformations
  Loss of information when using large cells
  “Blocky” appearance when image is viewed in detail
Raster vs. Vector
Vector Data Model
Advantages:
  Far more efficient in storage than grids
  Compact data structure
  Provides precise locational information of points
  Can represent point, line, and area features very accurately, leading to more accurate measurements and map outputs
  Topological models enable many types of analysis
  Sophisticated attribute data handling
  Efficient for network analysis
  Efficient projection transformation
  Works well with pen and light-plotting devices and tablet digitizers
Raster vs. Vector
Vector Data Model
Disadvantages:
  Complex data structure
  Difficult overlay operations
  High spatial variability is inefficiently represented
  Not compatible with RS imagery
  Expensive data collection
  Not good at continuous coverages or plotters that fill areas
Representation of a line using raster and vector model
Raster representation Vector representation
Raster vs. Vector
                     Raster    Vector
Data collection      Rapid     Slow
Data volume          Large     Small
Graphic treatment    Average   Good
Data structure       Simple    Complex
Area analysis        Good      Average
Network analysis     Poor      Good
Generalization       Simple    Complex
Integrating raster and vector data