gis - lecture 4

37
8/20/2010 1 GIS Data Structures and Models GIS Data Structures and Models GE 517 – Geographic Information Systems Engr. Ablao Digital Geographic Data N i l i h d ib l ld f Numerical representations that describe real-world features and phenomena Must be encoded in digital form and organized as a geographic database to be useful to GIS This digital database creates our perception of the real 8/20/2010 Geographic Information System world

Upload: sorbi

Post on 11-May-2015

12.226 views

Category:

Education


1 download

TRANSCRIPT

Page 1: GIS - Lecture 4

8/20/2010

1

GIS Data Structures and ModelsGIS Data Structures and Models

GE 517 – Geographic Information SystemsEngr. Ablao

Digital Geographic DataN i l i h d ib l ld f Numerical representations that describe real-world features and phenomenaMust be encoded in digital form and organized as a geographic database to be useful to GIS

This digital database creates our perception of the real

8/20/2010Geographic Information System

world

Page 2: GIS - Lecture 4

8/20/2010

2

Conventional Data vs. Digital Geographic Data

Conventional Data: Paper mapConventional Data: Paper mapStatic representation

Represents a general-purpose snapshot of the real world at a given time only

Digital Geographic Data:Dynamic representation

Allows a range of functions for storing processing analyzing

8/20/2010Geographic Information System

Allows a range of functions for storing, processing, analyzing, and visualizing spatial dataHas tools that allow users to interact with the data to meet their objectives

Physical Real world Map /

FIGURE 1. Bringing real-world data into a GIS or a map requires several abstractions and/or conversions of the original data. (Freely adapted from Bernhardsen, 1992).

Physical reality

→Real world

model→ Data model → Database →

Map / Report

Actual phenomenon

Entity Object Object Symbol, line, text

• Properties• Connection

• Type• Attributes

• Type• Attributes

• Type• Attributes

s • Relationship • Relationship• Geometry• Quality

• Relationship• Geometry• Quality

8/20/2010Geographic Information System

Page 3: GIS - Lecture 4

8/20/2010

3

8/20/2010Geographic Information System

Figure 2

What is a Data Model?Th h f GISThe heart of any GISA way of representing data

A set of constructs for describing and representing selected aspects of the l ld in a computer

8/20/2010Geographic Information System

selected aspects of the real world in a computer.

Longley, Goodchild, Maguire & Rhind

Page 4: GIS - Lecture 4

8/20/2010

4

Geographic Data Models

REAL WORLD

Field-based model Object-based model

8/20/2010Geographic Information System

Vector data modelRaster data model

Figure 3

Raster and Vector Data Models

8/20/2010Geographic Information SystemFigure 4

Page 5: GIS - Lecture 4

8/20/2010

5

Raster Data ModelTreats geographic space as populated by one or more spatial phenomena which vary continuously over space and having phenomena, which vary continuously over space and having no obvious boundaries.Best used to represent geographic features that are continuous over a large area (e.g. soil type, vegetation, etc.) Uses an array of rectangular cells/pixels/grids to represent real world objects

8/20/2010Geographic Information System

real-world objects.Each cell is defined by a coordinate location and an attribute that identifies the feature.Cell’s linear dimensions defines the spatial resolution

Raster Data Model

8/20/2010Geographic Information System

Figure 5

Page 6: GIS - Lecture 4

8/20/2010

6

Raster Data Model

8/20/2010Geographic Information System

Figure 6

Raster Data Model

8/20/2010Geographic Information SystemFigure 7

Page 7: GIS - Lecture 4

8/20/2010

7

A point is represented as a single cell.

well

tree

8/20/2010Geographic Information System

rock

Figure 8

A line is represented as a string of cells.

riverroad

canal

8/20/2010Geographic Information SystemFigure 9

Page 8: GIS - Lecture 4

8/20/2010

8

An area is a contiguous block of cells.

grassland

barn

forest

8/20/2010Geographic Information SystemFigure 10

The cell size determines the resolution of the data.The grid cell has an extent.

Raster Data Model

The grid cell has an extent.When we map features onto the grid, there is sometimes an imperfect fit.Each grid cell can usually be “owned” by one feature, that is, the one whose attribute it holds (mixed pixels).

8/20/2010Geographic Information System

Page 9: GIS - Lecture 4

8/20/2010

9

8/20/2010Geographic Information SystemFigure 11

8/20/2010Geographic Information System

Figure 12

Page 10: GIS - Lecture 4

8/20/2010

10

Raster Data Compression

A single raster file usually contains several million grid ll l d l d f lcells leading to a very large data file

Therefore, data compression is very important in the digital representation of raster data!

8/20/2010Geographic Information System

Raster Data Compression

Based on the realization that cells representing the f h d l l d h hsame feature type have identical values and that they

tend to be spatially clumpedExamples of raster data compression techniques are:

Run length encodingWavelet compression

8/20/2010Geographic Information System

Page 11: GIS - Lecture 4

8/20/2010

11

Raster Data Compression

Run length encodingg gAdjacent cells along a row having the same value are treated as a group known as a ‘run’instead of storing the similar cell values individually, the value is just stored once together with the

b f ll th t i th

8/20/2010Geographic Information System

number of cells that comprise the run

Run length encoding

8/20/2010Geographic Information System

Figure 13

Page 12: GIS - Lecture 4

8/20/2010

12

Raster Data Compression

Wavelet CompressionpMore efficient compressionCan achieve a ratio between 15:1 to 20:1 for 8-bit gray scale images and between 30:1 to 40:1 for 24-bit color images

8/20/2010Geographic Information System

Raster File FormatsBMP (bitmaps) – used by graphics in Microsoft Windows applications; no

compressionpDIB (Device Independent Bitmaps)GIF (Graphical Interchange Format) – widely used for images to be used on the

World Wide WebTIFF (Tagged Image File Format) – non-proprietary, system-independentJPEG (Joint Photographic Experts Group) – primarily for storage of photographic

images and for the World Wide Web GeoTIFF – extension of the TIFF format that contains georeferencing information

8/20/2010Geographic Information System

PNG (Portable Network Graphics) – patent-free; intended to replace the GIF formatPCX – supported by many image scannersMrSID (Multi-resolution Seamless Image Database)GRID – proprietary format used by ESRI in ArcInfo and ArcView GIS

Page 13: GIS - Lecture 4

8/20/2010

13

Quadtree Data Model

A hierarchical field-based model that uses grid cells f i bl iof variable sizes

Uses finer subdivisions in areas where finer detail occursGeographic space is divided by the process of recursive decompositionMore difficult compared to the simple raster model

8/20/2010Geographic Information System

More difficult compared to the simple raster model

Quadtree Data Model

8/20/2010Geographic Information System

Figure 14

Page 14: GIS - Lecture 4

8/20/2010

14

Digital Elevation Model

A raster-based representation of surface

8/20/2010Geographic Information System Southwest Corner of the Morrison Quadrangle, Colorado

Figure 15

Remotely sensed data are a special type of raster data.

8/20/2010Geographic Information System

Figure 16

Page 15: GIS - Lecture 4

8/20/2010

15

Other forms of raster data

Aerialphotos

Scannedmaps

8/20/2010Geographic Information System

Pictures

Converted data

Figure 17

Vector Data Model

Treats geographic space as populated by discrete objects, which h id tifi bl b d i ti l t thave identifiable boundaries or spatial extentEach object in the real-world is represented as either point, line, or polygon features.characterized by the use of sequential points or vertices to define a linear segment, each vertex consisting of an X coordinate and a Y coordinate.

8/20/2010Geographic Information System

Useful for representing discrete objects such as roads, buildings, rivers, boundaries, etc.

Page 16: GIS - Lecture 4

8/20/2010

16

Vector Data Model

8/20/2010Geographic Information System

Vector Data Model Raster Data Model

Vector Data Model

8/20/2010Geographic Information System

Page 17: GIS - Lecture 4

8/20/2010

17

Vector data model

The vector is composed of a pstring of points, each represented by their exact spatial coordinates.

8/20/2010Geographic Information System

Points are represented by their coordinates.

Vector data model

Lines are defined as a string of x- and y- coordinates.

8/20/2010Geographic Information System

Page 18: GIS - Lecture 4

8/20/2010

18

Vector data model

Areas are a string of x- and y-coordinates that terminate at the origin.

8/20/2010Geographic Information System

Types of Vector Data Model

1. Simple (spaghetti) data modelp ( p g )2. Topologic data model

8/20/2010Geographic Information System

Page 19: GIS - Lecture 4

8/20/2010

19

Simple (Spaghetti) Data Model

a one-for-one translation of the analog mapFeatures (lines and polygons) may overlapNo relationships between features/entitiesGenerally used where analysis is not important (eg. Plotting)each entity is a single, logical record in the computer,

d d i bl l th t i f d di t

8/20/2010Geographic Information System

coded as variable length strings of x- and y- coordinate pairsExamples are digitized maps and CAD data

Simple (Spaghetti) Data Model

Advantages:gEasy to create and storeCan be retrieved and rendered on screen very quickly

8/20/2010Geographic Information System

Page 20: GIS - Lecture 4

8/20/2010

20

Simple (Spaghetti) Data Model

Disadvantages:Spatial analysis (eg. Shortest path network analysis) cannot be performed easily due to lack of any connectivity relationshipsRedundant since the boundary segment between two polygons can be stored twice (once for each feature)

8/20/2010Geographic Information System

Inflexible since it is difficult to dissolve common boundaries when joining polygonsCannot be used in certain applications such as land management because of overlapping features

Topological Data Model

spaghetti model but with topology!E ll l d d l d Essentially just a simple data model structured using topologic rulesoften referred to as an intelligent data structurebecause spatial relationships between geographic features are easily derived when using them Most commonly used variants is the arc node data

8/20/2010Geographic Information System

Most commonly used variants is the arc-node data modeluses nodes (intersection of two lines; end point of line, and any point on line) and arcs (line connecting two nodes)

Page 21: GIS - Lecture 4

8/20/2010

21

Topological Data Model Components

Point (vertex) – where a line originates or terminatesNode – where a link originates or terminatesLine – a segment between two points (vertices)Link (or arc or chain) – a connection between two nodes

– may consist of several lines which are joined at points (vertices)

8/20/2010Geographic Information System

(vertices)– can only originate, terminate or be connected at nodes

Polygon (area) – composed of links– adjacent polygons have only one link between them

8/20/2010Geographic Information System

Page 22: GIS - Lecture 4

8/20/2010

22

Elements of Topological RelationshipConnectivity

I f ti b t li k ti l bj tInformation about linkages among spatial objectsKeep track of which links are connected at a node

AdjacencyInformation about neighbourhood among spatial objectsA link can determine the polygon to its left and its right

Containment

8/20/2010Geographic Information System

Information about inclusion of one spatial object within another spatial objectWhat nodes and links and other polygons are within a polygon

Elements of Topological Relationship

8/20/2010Geographic Information System

Page 23: GIS - Lecture 4

8/20/2010

23

Topological Map

Explains the concept of topological relationshipsA map that contains explicit topological information on top of the geometric information expressed in coordinates (Masry and Lee, 1988)All spatial entities are decomposed and represented in the form of the three basic graphical elements (point, line,

8/20/2010Geographic Information System

polygon)

Concept of a Topological Map

8/20/2010Geographic Information System

Page 24: GIS - Lecture 4

8/20/2010

24

8/20/2010Geographic Information System

8/20/2010Geographic Information System

Page 25: GIS - Lecture 4

8/20/2010

25

8/20/2010Geographic Information System

Arc-node Data Model

Stores graphical elements rather than graphical entitiesExplicitly stores spatial relationships between the graphical elements, and relationships between the arcs and respective nodesSt d t l i l l ti hi ll hi l

8/20/2010Geographic Information System

Stored topological relationships allow graphical entities to be constructed using the basic graphical elements

Page 26: GIS - Lecture 4

8/20/2010

26

Arc-node Data Model

8/20/2010Geographic Information System

Network Data Model

A special type of topologic modelp yp p gShows how lines connect with each other at nodesApplications:

Load analysis over an electric network

8/20/2010Geographic Information System

yVehicle routing over a street networkPollution tracing over a river network

Page 27: GIS - Lecture 4

8/20/2010

27

Use of Topological Relationship

Areas in GIS wherein topological relationships are applied are:

1. Data input and representation2. Spatial search3. Construction of complex spatial objects from

b h l l

8/20/2010Geographic Information System

basic graphical elements4. Integrity checks in database creation

Data input and representation

Creating a topologic geographic database is extremely time-consuming and error-pronePossible to adopt the spaghetti digitization for graphical data input and then later on let the computer refine and structure them using topological relationships

8/20/2010Geographic Information System

topological relationships

Page 28: GIS - Lecture 4

8/20/2010

28

Data input and representation

8/20/2010Geographic Information System

Spatial searchGIS uses topological relationships to perform spatial

h ffi i tlsearches efficientlyFor example, the land parcels surrounding a particular parcel can be located immediately using adjacency information

Polygons that share a common boundary with the parcel of interest are identified

8/20/2010Geographic Information System

Their IDs are used to retrieve the descriptive data from the database

Page 29: GIS - Lecture 4

8/20/2010

29

Spatial search

8/20/2010Geographic Information System

Construction of complex spatial objects

Complex objects are represented as complex l h d bpolygons in a geographic database

Two types of complex polygons are:Those containing one or more holes (or islands/enclaves) constructed from vector lines using topological informationTh d f t l th t t

8/20/2010Geographic Information System

Those made up of two or more polygons that are not physically connected constructed using a common identifier

Page 30: GIS - Lecture 4

8/20/2010

30

Construction of complex spatial objects

8/20/2010Geographic Information System

Integrity checks

Graphical data must be topologically cleaned before using them in GIS applications

Must not contain any topological errors

During topology building, the computer detects topological errors and marks them immediatelyOperator can then decide hat to do ith the

8/20/2010Geographic Information System

Operator can then decide what to do with the errors

Page 31: GIS - Lecture 4

8/20/2010

31

Integrity checks

8/20/2010Geographic Information System

Triangulated Irregular Network (TIN) Data Model

Creates and represents surfaces in GIS

S f i t d ti l i t i lSurface is represented as contiguous, non-overlapping triangles

Size of triangles may be adjusted to reflect the relief of the surface being modelled

Larger triangles for relatively flat terrain Smaller triangles for rugged terrain

Manages information about the nodes that comprise each triangle and the neighbors of each triangle

8/20/2010Geographic Information System

neighbors of each triangle

Applications:Volumetric calculations for road designDrainage studies Landslide studies

Page 32: GIS - Lecture 4

8/20/2010

32

Triangulated Irregular Network (TIN) Data Model

8/20/2010Geographic Information System

Advantages of TIN Data Model

Incorporates original sample points so the accuracy of the model can be checkedEfficient way of storing surface representations because the size of triangles can be varied according to the terrainD t t t k it t l l t l ti

8/20/2010Geographic Information System

Data structure makes it easy to calculate elevation, slope, aspect, and line-of-sight between points

Page 33: GIS - Lecture 4

8/20/2010

33

Vector File FormatsGDF/DIME (Geographic Base File/Dual Independent Map Encoding) – originally used for storing street mapsTIGER (Topologically Integrated Geographic Encoding and Referencing System) –improvement for GDF/DIMEDLG (Digital Line Graphs) – USGS topographic mapsAutoCAD DXF (Data Exchange Format) – widely used as an export format in many GISIGDS (Intergraph Design System) – widely used in mappingArcInfo Coverage – stores vector graphical data using topological structure explicitly defining spatial relationshipsArcInfo E00 – export format of ArcInfo

8/20/2010Geographic Information System

pShapefiles – format of ArcView GIS; defines geometry and attribute of geographically referenced objects using the main file, index file and database tableCGM (Computer Graphics Metafile) – ISO standard for vector data format; widely used in PC-based computer graphics applicationsPage Description Languages (PDL’s)

Considerations in choosing between Raster and Vector

source and type of datasource and type of dataanalytical procedures to be usedintended use of data

8/20/2010Geographic Information System

Page 34: GIS - Lecture 4

8/20/2010

34

Raster vs. VectorRaster Data ModelAdvantages:

Simple data structure Easy to conceptualize space representationEasy data collection and processingEasy and efficient overlaying High spatial variability is efficiently represented Easier and efficient for surface modelling, wherein individual features are not that importantAll i i f i d ( lli l d

8/20/2010Geographic Information System

Allows easy integration of image data (satellite, remotely-sensed, etc.)Pixel values can be modified individually or in a group by using a paletteSimple for own programming

Raster vs. VectorRaster Data Model

Disadvantages:Disadvantages:Do not provide precise locational and area computation information due to grid cellsRequires large storage capacity, based on size and number of colors No topological processingLi it d tt ib t d t h dli

8/20/2010Geographic Information System

Limited attribute data handlingInefficient projection transformations Loss of information when using large cells“Blocky” appearance when image is viewed in detail

Page 35: GIS - Lecture 4

8/20/2010

35

Raster vs. VectorVector Data Model

Advantages:Far more efficient in storage than grids Compact data structureProvides precise locational information of pointsCan represent point, line, and area features very accurately, leading to more accurate measurements and map outputsTopological models enable many types of analysisSophisticated attribute data handling

8/20/2010Geographic Information System

Sophisticated attribute data handlingEfficient for network analysis Efficient projection transformation Work well with pen and light-plotting devices and tablet digitizers

Raster vs. VectorVector Data Model

Disadvantages:Disadvantages:Complex data structure Difficult overlay operations High spatial variability is inefficiently represented Not compatible with RS imagery

8/20/2010Geographic Information System

Expensive data collectionNot good at continuous coverages or plotters that fill areas

Page 36: GIS - Lecture 4

8/20/2010

36

Representation of a line using raster and vector model

8/20/2010Geographic Information System

Raster representation Vector representation

Raster vs. VectorRaster Vector

Data collection Rapid Slow

Data volume Large SmallData volume Large Small

Graphic treatment Average Good

Data structure Simple Complex

Area analysis Good Average

Network analysis Poor Good

Generalization Simple Complex

8/20/2010Geographic Information System

Page 37: GIS - Lecture 4

8/20/2010

37

Integrating raster and vector data

8/20/2010Geographic Information System