Intro to advanced GIS and a review of basic GIS
Outline
- About the class setting
- Materials to be covered and schedule
- Quick review of GIS basics
- First lab (Lab 1)
What was covered in Intro GIS
- Geospatial technology
- GIS
- GIS data
- GIS data types
- GIS data formats
- GIS: a simplified view of Earth
- Two types of coordinate systems
  - Geographic coordinate system
  - Projected coordinate system
    - Conic, cylindrical, azimuthal
    - Distortions (shape, size, distance, direction)
- Two important things
  - Define
  - Project
Geographic Coordinate System (Unprojected)
Projected Coordinate System
What is GIS?
• A computer system for
- collecting,
- storing,
- manipulating,
- analyzing,
- displaying, and
- querying geographically related information.
In general, GIS covers 3 components
- Computer system
  - Hardware: computer, plotter, printer, digitizer
  - Software and appropriate procedures
- Spatially referenced or geographic data
- People to carry out various management and analysis tasks
Geographic Data
Geospatial data tells you where it is and attribute data tells you what it is. Metadata describes both geospatial and attribute data.
In GIS, geographic data is also called GIS data or spatial data.
1. Geospatial data
Traditional method
The traditional way to represent geographic data is the paper-based map
- Geology map
- Topographic map
- City street map (we still use it a lot)
- ...
Characteristics of spatial data
“Mappable” characteristics:
- Location (coordinate system, to be covered in a later lecture)
- Size, calculated as the amount (length, area, perimeter) of the data
- Shape, defined as the shape (point, line, area) of the feature
- Discrete or continuous
- Spatial relationships
Discrete and continuous
- Discrete data are distinct features that have definite boundaries and identities
  - A district, houses, towns, agricultural fields, rivers, highways, …
- Continuous data have no defined borders or distinctive values; instead, values transition from one to another
  - Temperature, precipitation, elevation, ...
GIS: a simplified view of the real world
Discrete features
- Points
- Lines
- Areas
- Networks: a series of interconnecting lines
  - Road network
  - River network
  - Sewage network
Continuous features
- Surfaces
  - Elevation surface
  - Temperature surface
Problems caused by these simplified features remain, but we will live with them
- Dynamic nature (not static)
  - Forests grow
  - River channels change
  - Cities expand or decline
- Identification of discrete and continuous features
  - Should a road be a line or an area?
- Scale
- Some things may not fit any feature type: fuzzy boundaries
  - Transition area between woodland and grassland
Let's not worry about these problems now! Just keep them in mind.
Topology needed
A collection of numeric data which clearly describes adjacency, containment (coincidence), and connectivity between map features and which can be stored and manipulated by a computer.
A set of rules on how objects relate to each other
Major difference in file formats
Higher level objects have special topology rules
Two basic data models to represent these features
Raster spatial data model
- Defines space as an array of equally sized cells arranged in rows and columns. Each cell contains an attribute value and location coordinates
- Individual cells are the building blocks for creating images of point, line, area, network, and surface features
- Continuous raster: numeric values range smoothly from one location to another, for example, DEM, temperature, remote sensing images, etc.
- Discrete raster: relatively few possible values, which repeat themselves in adjacent cells, for example, land use, soil types, etc.
Vector spatial data model
- Uses x, y coordinates to represent point, line, area, network, and surface features
- A point is a single coordinate pair; lines and polygons are ordered lists of vertices; attributes are associated with each feature
- Usually used for discrete features
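To make the two data models concrete, here is a minimal Python sketch. It is not tied to any GIS package: the feature names, coordinates, cell size, and grid origin are all made up for illustration. Vector features are stored as coordinate pairs or ordered vertex lists with attributes, and the raster is a grid of equally sized cells whose locations follow from the row and column.

```python
# Vector model: each feature is geometry (coordinates) plus attributes.
# All names and values below are hypothetical.
point = {"geometry": (300.0, 250.0), "attributes": {"type": "house"}}
line = {
    "geometry": [(100.0, 500.0), (250.0, 400.0), (400.0, 350.0)],
    "attributes": {"type": "river"},
}
polygon = {
    "geometry": [(100.0, 100.0), (200.0, 100.0), (200.0, 200.0), (100.0, 200.0)],
    "attributes": {"type": "trees"},
}

# Raster model: an array of equally sized cells in rows and columns.
# This is a discrete raster: a few category codes repeat in adjacent cells.
cell_size = 100.0
origin_x, origin_y = 0.0, 300.0  # assumed upper-left corner of the grid
land_cover = [
    ["G", "G", "B"],
    ["G", "K", "B"],
    ["T", "T", "B"],
]


def cell_center(row, col):
    """Return the (x, y) location of a cell center from its row and column."""
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size
    return (x, y)


def polygon_area(vertices):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(cell_center(0, 0))
print(polygon_area(polygon["geometry"]))
```

In the vector model, size (here, area) is computed from the vertices. In the raster model, location is implied by the cell's position in the grid.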
DIGITAL SPATIAL DATA
• RASTER
• VECTOR
• Real World
Source: Defense Mapping School National Imagery and Mapping Agency
Raster and Vector Data Models
Figure: Real world (river, house, trees) shown in vector representation (X-axis and Y-axis, 100 to 600) and in raster representation (rows and columns 1 to 10, cells coded B, G, K). Source: Defense Mapping School, National Imagery and Mapping Agency
Example: Discrete raster
Xie et al. 2005
Example: continuous raster
Raster, real world, vector (Heywood et al. 2006)
Effects of changing resolution (Heywood et al. 2006)
Vector – Advantages and Disadvantages
Advantages
- Good representation of reality
- Compact data structure
- Topology can be described in a network
- Accurate graphics
Disadvantages
- Complex data structures
- Simulation may be difficult
- Some spatial analysis is difficult or impossible to perform
Raster – Advantages and Disadvantages
Advantages
- Simple data structure
- Easy overlay
- Various kinds of spatial analysis
- Uniform size and shape
- Cheaper technology
Disadvantages
- Large amount of data
- Less “pretty”
- Projection transformation is difficult
- Different scales between layers can be a nightmare
- May lose information due to generalization
GIS data formats (file formats)
Vector data
- Shapefiles
- Coverages
- TIN (Triangulated Irregular Network), e.g. elevation can be stored as TIN
Raster data
- Grid (e.g. elevation can be stored as Grid)
- Image (e.g. elevation can be stored as image; all remote sensing images)
Shapefiles
- Nontopological
- Advantages: no overhead to process topology
- Disadvantages: polygons are double digitized, no topological data checking
- At least 3 files: .shp, .shx, .dbf
Coverages
- Original ArcInfo format
- Directory with several files
- Database files are stored in the Info directory
- Uses arc-node topology
  - Containment (coincidence)
  - Connectivity
  - Adjacency
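As a rough illustration of how arc-node topology lets a computer derive connectivity and adjacency, the Python sketch below stores each arc with its from-node, to-node, left polygon, and right polygon. The arc table is hypothetical (two neighboring polygons A and B sharing one arc) and the layout is a simplification, not the actual coverage file structure.

```python
from collections import defaultdict

# Hypothetical arc table: arc id -> (from_node, to_node, left_polygon, right_polygon).
# Polygons A and B share arc a3; "outside" marks the area beyond the map features.
arcs = {
    "a1": ("n2", "n1", "A", "outside"),
    "a2": ("n1", "n2", "B", "outside"),
    "a3": ("n1", "n2", "A", "B"),
}


def adjacent_polygons(arc_table):
    """Adjacency: two polygons are adjacent if they lie on either side of an arc."""
    pairs = set()
    for _, _, left, right in arc_table.values():
        if "outside" not in (left, right) and left != right:
            pairs.add(frozenset((left, right)))
    return pairs


def arcs_at_node(arc_table):
    """Connectivity: arcs are connected where they share a node."""
    node_arcs = defaultdict(set)
    for arc_id, (start, end, _, _) in arc_table.items():
        node_arcs[start].add(arc_id)
        node_arcs[end].add(arc_id)
    return dict(node_arcs)


print(adjacent_polygons(arcs))
print(arcs_at_node(arcs))
```

Because the shared boundary a3 is stored once, with A on one side and B on the other, the polygons are not double digitized. This is the contrast with the nontopological shapefile.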
Evolution of Vector Data Model
ESRI, Inc.
- Arc/Info: coverages
- ArcView: shapefiles
- ArcGIS: geodatabase
Geodatabase components: vector data and tables
Primary (basic) components:
- feature classes
- feature datasets
- nonspatial tables
Complex components building on the basic components:
- topology
- relationship classes
- geometric networks
Geodatabase components: raster data
- Raster data are only referenced in a personal geodatabase
- Raster data are physically stored in a multiuser geodatabase
- Raster datasets and raster catalogs
A raster dataset is created from one or more individual rasters. When creating a raster dataset from multiple rasters, the data is mosaicked, or aggregated, into a single, seamless dataset in which areas of overlap have been removed. The input rasters must be contiguous (adjacent) and have the same properties, including the same coordinate system, cell size, and data format. For each raster dataset (.img, grid, JPEG, MrSID, TIFF), ArcGIS creates an ERDAS IMAGINE file (.img).
A raster catalog is defined as a table in the geodatabase which you can view like any other table in ArcCatalog. Each raster in the catalog is represented by a row in the table. It contains a collection of rasters that can be noncontiguous, stored in different formats, and have other different properties. In order to view all the rasters in the catalog, they must have the same coordinate system and a common geographic extent
2. Attribute data
- Attribute data is about the “what” of spatial data and is a list or table of data arranged as rows and columns
- Rows are records (map features)
  - Each row represents a map feature, which has a unique label ID or object ID
- Columns are fields (characteristics)
- The intersection of a column and a row shows the value of an attribute, such as color, ownership, magnitude, classification, …
examples
Relational database
A relational database is a collection of tables, also called relations, which can be connected to each other by keys.
A primary key represents one or more attributes whose values can uniquely identify a record in a table. Its counterpart in another table for the purpose of linkage is called a foreign key
Advantages
- Each table in the database can be prepared, maintained, and edited separately from other tables
- Efficient data management and processing, since linking tables for query and/or analysis is often temporary
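The primary key and foreign key idea can be tried out with Python's built-in sqlite3 module. The sketch below is only an illustration: the table names, column names, and values are hypothetical and do not come from the lecture data. It keeps two tables separate and links them temporarily in a query.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# soil_id is the primary key of soil_type and a foreign key in parcel.
conn.execute("CREATE TABLE soil_type (soil_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE parcel ("
    "parcel_id INTEGER PRIMARY KEY, "
    "owner TEXT, "
    "soil_id INTEGER REFERENCES soil_type(soil_id))"
)

conn.executemany("INSERT INTO soil_type VALUES (?, ?)", [(1, "loam"), (2, "clay")])
conn.executemany(
    "INSERT INTO parcel VALUES (?, ?, ?)",
    [(101, "Smith", 1), (102, "Lee", 2), (103, "Garcia", 1)],
)

# The link exists only for the duration of the query; the tables stay separate.
rows = conn.execute(
    "SELECT p.parcel_id, p.owner, s.name "
    "FROM parcel AS p JOIN soil_type AS s ON p.soil_id = s.soil_id "
    "ORDER BY p.parcel_id"
).fetchall()

for row in rows:
    print(row)
```

Each soil name is stored once in soil_type, so renaming a soil type means editing one row rather than every parcel record.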
Join and relate tables
Once tables are separated as relational tables, then two operations can be used to link those tables during query and analysis
Join, brings together two tables based on a common key.
Relate, connects two tables (based on keys) but keeps the tables separate.
Keys do not have to have the same name but must be of the same data type
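The difference between the two operations can be sketched in plain Python. This is a conceptual illustration with hypothetical tables and helper functions, not how ArcGIS implements them. A join produces one combined table, while a relate leaves both tables as they are and records only which rows are linked. The key fields have different names (zone_code and code) but the same data type.

```python
# Hypothetical attribute tables stored as lists of records.
parcels = [
    {"parcel_id": 101, "zone_code": "R1"},
    {"parcel_id": 102, "zone_code": "C2"},
    {"parcel_id": 103, "zone_code": "R1"},
]
zones = [
    {"code": "R1", "description": "residential"},
    {"code": "C2", "description": "commercial"},
]


def join(left, right, left_key, right_key):
    """Join: bring two tables together into one, matching on the keys."""
    lookup = {row[right_key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[left_key], {})
        joined.append({**row, **match})  # new record; inputs are not modified
    return joined


def relate(source, target, source_key, target_key):
    """Relate: keep the tables separate and record which target rows
    are linked to each source key value."""
    links = {row[source_key]: [] for row in source}
    for row in target:
        if row[target_key] in links:
            links[row[target_key]].append(row)
    return links


joined_table = join(parcels, zones, "zone_code", "code")
related = relate(zones, parcels, "code", "zone_code")

print(joined_table[0])
print([p["parcel_id"] for p in related["R1"]])
```

Here the join suits the parcels-to-zones direction, where each parcel matches one zone. The relate suits the zones-to-parcels direction, where one zone is linked to several parcels.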
The joined table
The join is preserved only within the map document (the tables remain separate on disk) and can be removed at any time
Related tables
The relate is preserved only within the map document (the tables remain separate on disk) and can be removed at any time
3. Metadata
Meta is defined as a change or transformation. Data is described as the factual information used as a basis for reasoning. Put these two definitions together and metadata would literally mean "factual information used as a basis for reasoning which describes a change or transformation."
In GIS, Metadata is data about the data. It consists of information that describes spatial data and is used to provide documentation for data products. Metadata is the who, what, when, where, why, and how about every facet of the spatial data.
According to the Federal Geographic Data Committee (FGDC), metadata is data about the content, quality, condition, and other characteristics of data.
Why use and create metadata
To help organize and maintain an organization's spatial data
- Employees may come and go but metadata can catalogue the changes and updates made to each spatial data set and how each employee implemented them
To provide information to other organizations and clearinghouses to facilitate data sharing and transfer
- It makes sense to share existing data sets rather than producing new ones if they are already available
To document the history of a spatial data set
- Metadata documents what changes have been made to each data set, such as changes in geographic projection, adding or deleting attributes, editing line intersections, or changing file formats. All of these could have an effect on data quality.
Metadata Should Include Data about
- Date data were collected
- Date coverage was generated
- Bounding coordinates
- Processing steps
  - Software used
  - RMSE, etc.
- Where the original data came from
- Who did the processing
- Projection
- Coordinate system
- Datum
- Units
- Spatial scale
- Attribute definitions
- Who to contact for more information
See an example of non-standard metadata
Federal Geographic Data Committee’s (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM)
The FGDC is developing the National Spatial Data Infrastructure (NSDI) in cooperation with organizations from State, local and tribal governments, the academic community, and the private sector. The NSDI encompasses policies, standards, and procedures for organizations to cooperatively produce and share geographic data.
The objectives of the CSDGM are to provide a common set of terminology and definitions for the documentation of digital geospatial data.
CSDGM (FGDC-STD-001-1998)
Metadata =
- Identification_Information
- Data_Quality_Information
- Spatial_Data_Organization_Information
- Spatial_Reference_Information
- Entity_and_Attribute_Information
- Distribution_Information
- Metadata_Reference_Information
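As a small illustration of how the seven section names above can be used, the Python sketch below reports which sections a metadata record does not yet contain. The record and its contents are hypothetical. This is not a compliance check (tools such as mp do that), and it makes no assumption about which sections the standard treats as mandatory.

```python
# The seven top-level CSDGM sections, as listed above.
CSDGM_SECTIONS = [
    "Identification_Information",
    "Data_Quality_Information",
    "Spatial_Data_Organization_Information",
    "Spatial_Reference_Information",
    "Entity_and_Attribute_Information",
    "Distribution_Information",
    "Metadata_Reference_Information",
]


def missing_sections(record):
    """Return the section names, in standard order, that the record lacks."""
    return [name for name in CSDGM_SECTIONS if name not in record]


# Hypothetical, partially filled metadata record.
record = {
    "Identification_Information": {"Title": "Hypothetical roads layer"},
    "Spatial_Reference_Information": {"Datum": "hypothetical datum name"},
    "Metadata_Reference_Information": {"Contact": "hypothetical contact"},
}

print(missing_sections(record))
```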
Connect to http://www.fgdc.gov/metadata/csdgm/
Metadata tools
Metadata editors:
- tkme / USGS
- ArcCatalog / ESRI
- SMMS / Intergraph
- FGDCMETA / Illinois State Geological Survey
- xtme / USGS
Metadata utilities (check compliance and export to text, HTML, XML, or SGML):
- mp / USGS
- MP batch / Intergraph
- ArcCatalog powered by mp / ESRI
Metadata servers:
- Isite / FGDC
- GeoConnect Geodata Management Server / Intergraph
- ArcIMS Metadata Server / ESRI
mp: Metadata Parser
4. Geodatabase
Before the geodatabase, the many GIS files (spatial and nonspatial data) in a GIS project were stored separately, so a large GIS project could involve hundreds of files.
With a geodatabase, all GIS files (spatial and nonspatial data) in a project can be stored in one geodatabase, using a relational database management system (RDBMS).
Types of geodatabases
- Personal
- Enterprise
Personal Geodatabase
The personal geodatabase is stored as a file named filename.mdb that can be browsed and edited in ArcGIS and can also be opened with Microsoft Access. It can be read by multiple people at the same time, but edited by only one person at a time. The maximum size is 2 GB.
Multiuser Geodatabase
Multiuser (ArcSDE or enterprise) geodatabases are stored in IBM DB2, Informix, Oracle, or Microsoft SQL Server.
They can be edited through ArcSDE by many users at the same time and are suitable for large workgroups and enterprise GIS implementations. There is no size limit, and raster data are supported.
In a 3-tier ArcSDE client/server architecture, both ArcSDE and the Oracle RDBMS run on the same server, which minimizes network traffic and client load while increasing the server load compared to a 2-tier system, in which the clients connect directly to the RDBMS.
Personal and Multiuser Geodatabase Comparison
source: www.esri.com
5. Geometric transformation
Geometric transformation is the process of using a set of control points and transformation equations to register a (2D) digitized map, a satellite image, or an air photograph onto a (2D) projected coordinate system.
In GIS, geometric transformation includes map-to-map transformation, image-to-map transformation, image-to-image transformation.
The root mean square (RMS) error is a quantitative measure of location accuracy that can determine the quality of a geometric transformation.
Image-to-map (or image-to-image) transformation needs an additional step, resampling, to fill in the cell values of the new image from the original image.
A projection and coordinate system projects the 3D Earth onto a 2D plane, so the 3D Earth can be represented in different GIS data models (2D digital format) in a GIS.
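The RMS error mentioned above can be computed from the control points once a transformation has been applied. The Python sketch below assumes the common definition: the square root of the mean squared distance between each control point's estimated (transformed) location and its actual location. The coordinates are made up for illustration.

```python
import math


def rms_error(estimated, actual):
    """Root mean square error between estimated and actual control point
    locations, each given as a list of (x, y) pairs in the same order."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("need two non-empty point lists of equal length")
    squared = [
        (xe - xa) ** 2 + (ye - ya) ** 2
        for (xe, ye), (xa, ya) in zip(estimated, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical control points: where the transformation placed them
# versus where they really are, in the units of the projected coordinate system.
estimated_points = [(3.0, 4.0), (10.0, 10.0)]
actual_points = [(0.0, 0.0), (13.0, 14.0)]

print(rms_error(estimated_points, actual_points))
```

A smaller value means the transformed control points fall closer to their true locations. The acceptable threshold depends on the accuracy of the source data and the map scale.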
6. Data accuracy and quality
Raster data quality
- Geolocation accuracy
- Estimation accuracy
Vector data quality
- Location errors
- Topological errors
7. Vector data analysis
Vector data analysis uses the spatial features of point, line, and polygon as inputs.
The accuracy of analysis results depends on the accuracy of spatial features in terms of location and shape.
Topology can also be a factor for some vector data analyses such as buffering and overlay.
Pattern Analysis
Overlay: Intersect (AND), Union (OR), Symmetrical difference (XOR), Identity (AND, OR)
Point pattern: nearest neighbor, Ripley’s K-function
Moran’s I
G-statistic
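The pairing of overlay methods with the Boolean connectors AND, OR, and XOR can be illustrated with Python sets. This is a simplification: each layer is reduced to the set of hypothetical unit cells it covers, and attributes are ignored. The expression used for Identity, (input AND identity layer) OR input, is an assumption about how the "AND, OR" label should be read.

```python
# Each layer is represented by the set of unit cells it covers (hypothetical).
input_layer = {(0, 0), (0, 1), (1, 0), (1, 1)}
overlay_layer = {(1, 1), (1, 2), (2, 1), (2, 2)}

# Intersect keeps the area common to both layers (AND).
intersect = input_layer & overlay_layer

# Union keeps the area covered by either layer (OR).
union = input_layer | overlay_layer

# Symmetrical difference keeps the area in one layer but not both (XOR).
symmetrical_difference = input_layer ^ overlay_layer

# Identity keeps the full extent of the input layer (assumed AND-then-OR form).
identity = (input_layer & overlay_layer) | input_layer

print(sorted(intersect))
print(len(union), len(symmetrical_difference))
print(identity == input_layer)
```

In a real overlay the output also carries attributes from both layers. Identity therefore differs from the plain input layer in its attributes even though the area extent is the same.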
8. Raster data analysis
Raster data analysis is based on cells and rasters.
Raster data analysis can be performed at the level of individual cells, or groups of cells, or cells within an entire raster.
Some raster data operations use a single raster, while others use two or more rasters.
Raster data analysis is also related to the type of cell value (numeric or categorical values) in the input raster(s).
- Local, focal, zonal
- Allocation and direction
- Clip and mosaic
- Aggregate and regiongroup
- Map algebra
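The three levels of raster analysis (individual cells, groups of neighboring cells, and cells grouped by zone) can be sketched in plain Python. The rasters and function names below are hypothetical. A 3 x 3 neighborhood clipped at the raster edges is assumed for the focal operation; a GIS package offers more options.

```python
# Hypothetical 3 x 3 rasters stored as lists of rows.
elevation = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
zones = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 3],
]


def local_sum(a, b):
    """Local operation: combine two rasters cell by cell."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def focal_mean(raster, row, col):
    """Focal operation: mean of the 3 x 3 neighborhood around one cell,
    clipped at the raster edges."""
    values = []
    for r in range(max(0, row - 1), min(len(raster), row + 2)):
        for c in range(max(0, col - 1), min(len(raster[0]), col + 2)):
            values.append(raster[r][c])
    return sum(values) / len(values)


def zonal_mean(values, zone_raster):
    """Zonal operation: mean of the value raster within each zone."""
    groups = {}
    for value_row, zone_row in zip(values, zone_raster):
        for value, zone in zip(value_row, zone_row):
            groups.setdefault(zone, []).append(value)
    return {zone: sum(v) / len(v) for zone, v in groups.items()}


print(local_sum(elevation, elevation))
print(focal_mean(elevation, 1, 1), focal_mean(elevation, 0, 0))
print(zonal_mean(elevation, zones))
```

The local operation uses two rasters with numeric cell values. In the zonal operation the zone raster holds categorical values, which is why the type of cell value matters when choosing an operation.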
9. Lab 1
Getting Started With the Geodatabase
Copy the result map of your last step into your homework
Copy your exam questions and results into your homework