Introduction to Geographic Information Systems, Fall 2013 (INF 385T-28620)
Geodatabases
Dr. David Arctur, Research Fellow, Adjunct Faculty
University of Texas at Austin
Lecture 4, September 19, 2013
Outline
  Tables
  Geocodes
  Data table joins
  Spatial joins
  Spatial data formats
  Geodatabases
  Calculating geometry
INF385T(28620) – Fall 2013 – Lecture 4
TABLES
Two kinds of tables in ArcGIS
  Feature attribute table of a map layer
    Attribute data is part of the map layer
  Data table with geocodes (such as census IDs)
    Can be added as a table to ArcMap
    Can be joined to a map layer to add more attributes to the layer
    Join via the same geocode values in both the data table and the map layer’s attribute table
    Census data example: there are too many census variables to supply in the feature attribute table, so download a custom table and join it to the appropriate polygon layer
Data table format
  Rectangular table with one value per cell
  Columns (fields) are attributes
  Rows are observations (records)
Data table format
  First row must have column names that are self-documenting labels
    E.g., Shape, POP2000
  First character of an attribute name must be a letter
  Remaining characters can be any letter, digit, or the underscore character (but no blanks)
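The naming rule above can be checked mechanically. The following is a minimal Python sketch, not ArcGIS functionality: the function name `is_valid_field_name` is hypothetical, and it encodes only the rule stated on the slide (it ignores any length limits or reserved words that a particular file format may add).

```python
import re

# Slide rule: first character is a letter; remaining characters are
# letters, digits, or underscores; no blanks.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_field_name(name):
    """Return True if `name` follows the naming rule from the slide."""
    return FIELD_NAME.fullmatch(name) is not None


for candidate in ["Shape", "POP2000", "2000POP", "POP 2000", "_POP"]:
    print(candidate, is_valid_field_name(candidate))
```

Only the first two candidates pass; the others start with a non-letter or contain a blank.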
Data table format
  All additional rows of a data table must contain only attribute values (raw data)
  None of the rows can be sums, averages, or other statistics for raw data rows
Primary keys
  Each table has a primary key attribute with two properties
    Each value is unique
    There are no null values
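The two primary key properties can be tested before attempting a join. This is a minimal sketch in plain Python, assuming the table has been read into a list of dictionaries; the function name `is_primary_key`, the field names, and the sample rows are all hypothetical. Treating an empty string as null is also an assumption made here.

```python
def is_primary_key(rows, field):
    """Return True if `field` is unique and never null across `rows`."""
    seen = set()
    for row in rows:
        value = row.get(field)
        if value is None or value == "":
            return False  # null (or empty) value
        if value in seen:
            return False  # duplicate value
        seen.add(value)
    return True


# Made-up rows for illustration only.
tracts = [
    {"TRACT_ID": "1917", "NAME": "A"},
    {"TRACT_ID": "1918", "NAME": "A"},
    {"TRACT_ID": "1919", "NAME": None},
]

print(is_primary_key(tracts, "TRACT_ID"))  # unique and non-null
print(is_primary_key(tracts, "NAME"))      # duplicates and a null
```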
Field calculator
  Add computed columns in ArcGIS
    ArcGIS does not have the query capacity of relational database packages to compute new columns on the fly
    So, you must create permanent new columns
  Full range of computation
    Can add, multiply, etc.
    Has numeric and text functions
    Can concatenate text values
Field calculator (numeric)
Field calculator (text)
  Concatenate house number and street fields
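Outside ArcGIS, the same two kinds of calculation look like this in plain Python. This is only a sketch of the idea, not Field Calculator syntax; the function names, the field names (`POP2000`, `AREA_SQMI`, `POP_DENS`, `HOUSE_NUM`, `STREET`, `ADDRESS`), and the sample values are made up for illustration.

```python
def add_density(rows):
    """Numeric calculation: store population density as a new column."""
    for row in rows:
        row["POP_DENS"] = row["POP2000"] / row["AREA_SQMI"]
    return rows


def add_address(rows):
    """Text calculation: concatenate house number and street."""
    for row in rows:
        row["ADDRESS"] = row["HOUSE_NUM"] + " " + row["STREET"]
    return rows


# Hypothetical sample records.
tracts = add_density([{"POP2000": 1200, "AREA_SQMI": 0.5}])
houses = add_address([{"HOUSE_NUM": "1690", "STREET": "Seaton St"}])

print(tracts[0]["POP_DENS"])
print(houses[0]["ADDRESS"])
```

As on the slide, the results are written into new permanent columns rather than computed on the fly at query time.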
External table file formats for import into ArcGIS
  Plain ASCII text with comma-separated values (.csv)
    Very transportable format, very large files
    Each table record is a row terminated with a line-break character (invisible, nonprinting value)
    Values are separated by a delimiter, usually a comma
    For data values that contain the delimiter, enclose the value in double quotes
    Sometimes columns get the wrong data type on import (use double quotes to force the text data type for digits, say for house numbers)
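The quoting rules above can be seen with Python's standard `csv` module. This sketch writes a small table with every text value in double quotes and reads it back; the helper names `write_table` and `read_table` and the sample record are hypothetical. It shows how the file is laid out, not how ArcGIS decides column types on import.

```python
import csv
import io


def write_table(header, records):
    """Write rows as CSV text, quoting every non-numeric value."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(header)
    writer.writerows(records)
    return buf.getvalue()


def read_table(text):
    """Read CSV text into a list of dictionaries (all values as text)."""
    return list(csv.DictReader(io.StringIO(text)))


# The house number is stored as text so its leading zeros survive;
# the owner value contains the delimiter, so it must be quoted.
text = write_table(["HOUSE_NUM", "OWNER"], [["0096", "Smith, Jane"]])
rows = read_table(text)

print(text)
print(rows[0])
```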
External table file formats for import into ArcGIS
  Excel (.xls, .xlsx)
    Excel 2003: up to 65,536 rows and 256 columns
    Excel 2007: up to 1,048,576 rows and 16,384 columns
  dBase database table (.dbf)
    Legacy format
    ArcMap truncates field names to the first 10 characters
    dBase IV has a maximum of 255 columns
    Can open a dBase file in Excel but cannot save dBase from Excel
  Microsoft Access database (.mdb)
    Up to 2 GB file size
    See the following for other limits: http://www.databasedev.co.uk/access_specifications.html
GEOCODES
Geocodes (2000)
  Federal Information Processing Standards (FIPS)
    Developed by the National Institute of Standards and Technology
    Codes for place-names throughout the world
      – Countries
      – States/provinces
      – Counties
      – Metropolitan statistical areas (MSAs)
      – Cities
      – Places: Indian reservations, airports, and post offices in the US
  See http://www.genesys-sampling.com/pages/Template2/site2/61/default.aspx for additional geocodes.
Geocodes: hierarchical
  FIPS codes (political boundaries)
    Country: US
    State: 42 (Pennsylvania)
    County: 003 (Allegheny)
    Minor civil division: 4200361000 (Pittsburgh)
  Census codes (statistical boundaries)
    Tract: 1917
    Block group: 003
    Block: 005 (US420031917003005)
  Local government cadastral data (legal boundaries)
    Parcel block & lot number: 0096-P-00210000000 (1690 Seaton St, Pittsburgh, PA 15226)
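The block identifier on this slide is simply the codes of each level concatenated from the largest area to the smallest. The sketch below rebuilds and splits that identifier in Python. The function names are hypothetical, and the field widths are taken from this slide's example only; they are an assumption for illustration, not an official Census identifier layout.

```python
# (level name, number of characters), following the slide's example.
LEVELS = [
    ("country", 2),
    ("state", 2),
    ("county", 3),
    ("tract", 4),
    ("block_group", 3),
    ("block", 3),
]


def build_block_id(codes):
    """Concatenate the level codes from largest area to smallest."""
    return "".join(codes[name] for name, _ in LEVELS)


def split_block_id(block_id):
    """Split an identifier back into its level codes."""
    codes, start = {}, 0
    for name, width in LEVELS:
        codes[name] = block_id[start:start + width]
        start += width
    return codes


example = {
    "country": "US", "state": "42", "county": "003",
    "tract": "1917", "block_group": "003", "block": "005",
}
block_id = build_block_id(example)
print(block_id)
print(split_block_id(block_id)["county"])
```

Because the codes are stored as text, leading zeros such as those in county `003` are preserved.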
World and US
US and state 42
State 42 and county 003
County 003 and municipality 61000
Municipality 61000 and tract 1917
Tract 1917 and block group 003
Block group 003 and block 005
Geocodes (2010)
  ANSI codes
    American National Standards Institute codes
    Replace the Federal Information Processing Standards (FIPS)
  The entities covered include:
    – States and statistically equivalent entities
    – Counties and statistically equivalent entities
    – Named populated and related location entities (such as places and county subdivisions)
    – American Indian and Alaska Native areas
  See http://www.census.gov/geo/www/ansi/ansi.html
DATA TABLE JOINS
Review: Table joins
  A join puts two tables together, on the fly, to make one table
    One-to-one join (e.g., join state attribute data to a state shapefile by StateName)
    One-to-many join (e.g., join a code table to a feature attribute table to add code descriptions; many records can use the same code value)
  Each table in a join must have a key attribute for matching
    Must have the same values and data types for the key in both tables
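The join described above can be sketched in plain Python as a lookup on the key attribute. This is an illustration of the idea, not how ArcGIS implements joins; the function name `join_tables`, the field names, and the sample rows are hypothetical. The example is the one-to-many case: several parcels share the same land-use code.

```python
def join_tables(layer_rows, table_rows, key):
    """Append matching table attributes to each layer row via `key`."""
    lookup = {row[key]: row for row in table_rows}
    joined = []
    for row in layer_rows:
        merged = dict(row)
        match = lookup.get(row[key], {})  # no match: no extra attributes
        for field, value in match.items():
            if field != key:
                merged[field] = value
        joined.append(merged)
    return joined


# Made-up feature attribute table and code table.
parcels = [
    {"PARCEL": "A", "LU_CODE": "R"},
    {"PARCEL": "B", "LU_CODE": "C"},
    {"PARCEL": "C", "LU_CODE": "R"},
]
codes = [
    {"LU_CODE": "R", "LU_DESC": "Residential"},
    {"LU_CODE": "C", "LU_DESC": "Commercial"},
]

for row in join_tables(parcels, codes, "LU_CODE"):
    print(row)
```

The lookup only works because both tables hold the key as the same type with the same values.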
Example join
  (Figure: table + table = joined table)
Problems with joins
  Field types are different (e.g., one is numeric and one is text)
    Text values left align while numeric values right align
Solution
  Create a new field of the same type and use Field Calculator
  Result: both tables have the same field type
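The type mismatch and its fix can be shown in a few lines of Python. This is a sketch under stated assumptions: the function name `add_text_key`, the field names `FIPS_NUM` and `FIPS_TXT`, and the sample rows are hypothetical, and padding to five characters assumes a state-plus-county code like the 42 and 003 used earlier.

```python
def add_text_key(rows, source_field, new_field, width):
    """Add a text copy of a numeric key, padded with leading zeros."""
    for row in rows:
        row[new_field] = str(row[source_field]).zfill(width)
    return rows


# A text key never equals a numeric key, even when the digits agree.
print("42003" == 42003)

# Made-up table whose key was imported as a number.
census_rows = [{"FIPS_NUM": 1001}, {"FIPS_NUM": 42003}]
add_text_key(census_rows, "FIPS_NUM", "FIPS_TXT", 5)
print(census_rows)
```

The new text field can then serve as the join key against a layer whose key is stored as text.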
Problems with joins
  Data format varies
    Must remove dashes so the key values match
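Removing dashes is a simple text operation. The sketch below uses the parcel number from the earlier geocode slide as sample input; the function name `normalize_key` is hypothetical, and stripping surrounding blanks as well is an added assumption.

```python
def normalize_key(value):
    """Return the key with dashes and surrounding blanks removed."""
    return value.replace("-", "").strip()


print(normalize_key("0096-P-00210000000"))
```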
SPATIAL JOINS
Spatial joins
  Joins using shape (not an attribute field)
  Enables data aggregation (counting or summing points by polygon)
  Common spatial joins
    Points to polygons (counts)
    Polygons to points (adds text)
    Points to points (distances)
Points to polygons
  How many businesses are in each neighborhood?
  Start with:
    Business points
    Neighborhood polygons
Points to polygons
  Right-click neighborhoods > Joins and Relates > Join
Spatial join result
  New polygon layer with count of points (number of architects and engineers)
Spatial join result
  Show as a choropleth map with labels, or table
    Neighborhood Name          Count
    Central Business District     53
    Southside Flats               14
    Shadyside                      9
    Bloomfield                     8
    Lower Lawrenceville            8
    North Shore                    8
    Squirrel Hill South            6
    Strip District                 6
    Point Breeze                   4
    Squirrel Hill North            4
    Garfield                       3
    South Oakland                  3
    Friendship                     2
    North Oakland                  2
    Carrick                        2
    Central Lawrenceville          2
    East Allegheny                 2
    Mount Washington               2
    East Liberty                   1
    Central Northside              1
    Westwood                       1
    Banksville                     1
    Brookline                      1
    Perry North                    1
    Highland Park                  1
    Larimer                        1
    Allegheny West                 1
    Middle Hill                    1
    Bluff                          1
    Southside Slopes               1
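The counting done by a points-to-polygons join can be sketched with a standard ray-casting point-in-polygon test. This is a simplified illustration, not the ArcGIS algorithm: it assumes simple polygons without holes, planar coordinates, and that each point is counted in at most one polygon. The function names and the two square "neighborhoods" are made up.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_by_polygon(points, polygons):
    """Return {polygon name: number of points inside it}."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break
    return counts


# Hypothetical neighborhoods and business locations.
neighborhoods = {
    "West": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "East": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
businesses = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]

print(count_points_by_polygon(businesses, neighborhoods))
```

The last business lies outside both polygons, so it is not counted anywhere.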
Polygons to points
  What neighborhood is a business in?
  Start with:
    Business points
    Neighborhood polygons
Polygons to points
  Right-click business points > Joins and Relates > Join
Spatial join result
  Point shapefile with neighborhood data on each business
Points to points
  How close is the nearest bus stop to a business?
  Start with:
    Business points
    Bus stop points
Points to points
  Right-click business points > Joins and Relates > Join
Result
  Distance field added to new layer of businesses and stops joined
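The distance field produced by a points-to-points join can be approximated with a brute-force nearest-neighbor search. This sketch assumes projected (planar) coordinates in consistent units, so straight-line distance is meaningful; the function name `nearest` and the sample coordinates are hypothetical.

```python
import math


def nearest(point, candidates):
    """Return (name, distance) of the candidate closest to `point`."""
    best_name, best_dist = None, math.inf
    for name, (cx, cy) in candidates.items():
        dist = math.hypot(cx - point[0], cy - point[1])
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Made-up bus stops and one business location.
bus_stops = {"Stop A": (0, 0), "Stop B": (10, 0)}
business = (3, 4)

print(nearest(business, bus_stops))
```

With latitude/longitude coordinates, this planar formula would not give a distance in ground units.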
SPATIAL DATA FORMATS
Esri legacy format: Coverage
  Folder with multiple files
  Can have points, lines, and/or polygons
  Has several intermediate data products (topology) to speed up processing (now calculated on the fly)
Esri legacy format: Shapefile
  Multiple files, all with the same name but different file extensions
  No intermediate data products, but has indices to speed data processing
  Widely used to share spatial data files
Shapefiles
  ArcView native format
  Minimum files
    .shp – stores feature geometry
    .shx – stores index of features
    .dbf – stores attribute data
  Additional files
    .prj – projection data
    .xml – metadata
    .sbn and .sbx – store additional indices
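Because a shapefile is really a set of files, it is worth checking that the minimum files are all present before sharing one. The sketch below works on a list of file names; the function name `check_shapefile` and the sample names are hypothetical, and accepting a metadata file named `<name>.shp.xml` is an assumption about a common naming pattern.

```python
import os

REQUIRED = (".shp", ".shx", ".dbf")
OPTIONAL = (".prj", ".xml", ".sbn", ".sbx")


def check_shapefile(base_name, file_names):
    """Return (missing required extensions, optional extensions found)."""
    present = set()
    for file_name in file_names:
        stem, ext = os.path.splitext(file_name)
        # Assumption: metadata may be named <name>.shp.xml.
        if stem in (base_name, base_name + ".shp"):
            present.add(ext.lower())
    missing = [ext for ext in REQUIRED if ext not in present]
    extras = [ext for ext in OPTIONAL if ext in present]
    return missing, extras


# Made-up folder listing: the .shx index for "parcels" is absent.
files = ["parcels.shp", "parcels.dbf", "parcels.prj", "roads.shx"]
print(check_shapefile("parcels", files))
```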
CAD drawings
  CAD software
    Autodesk AutoCAD (.dwg, .dxf)
    Bentley MicroStation (.dgn)
  Often used by engineering companies
  Better digitizing precision
GEODATABASES
Geodatabases
  A geodatabase is a container used to hold a collection of datasets (GIS features, tables, raster images, and other objects)
  (Figure: World.gdb with a Country layer and a Graticule layer)
Enterprise geodatabases
  Practically unlimited size and multiple simultaneous users
  Use enterprise data management systems
  Store spatial datasets in a number of DBMSs: IBM DB2, Microsoft SQL Server, Oracle, or PostgreSQL
Personal geodatabase
  Parallels enterprise geodatabase but on a PC
  Stores datasets in a Microsoft Access .mdb file
  Limited to 2 GB
  Much overhead in space and extra structure
  Tempting to apply one’s own Access skills, but needs the ArcCatalog utility for manipulation
File geodatabase
  An Esri replacement for shapefiles
    Vector and raster map layers
    Other objects (tables)
  Stores one or more datasets in a folder of files with .gdb extension
  Can be up to 1 TB in size
  Can be used across platforms
  Can be compressed and encrypted for read-only, secure use
View geodatabases
  Cannot identify names in Windows Explorer
  Must use ArcCatalog
Non-Esri vector formats
  Interoperability
    Ability of different vendors’ hardware and software to share data
    Driven by the Internet with standards evolving for open data access (International Organization for Standardization, Open Geospatial Consortium, US Federal Geographic Data Committee)
  Over 110 vector file formats available in ArcGIS Data Interoperability extension (http://www.esri.com/library/fliers/pdfs/data-interop-formats.pdf)
KML (Keyhole Markup Language)
  XML schema for Internet-based maps
    Originally created by Keyhole, Inc. for satellite images; Keyhole was purchased by Google and its viewer became Google Earth
    Provides a set of features (points, lines, polygons, images, text, etc.) with lat/long coordinates plus altitude for 3D viewing
    KMZ is zipped KML and associated files, needed for upload to Google Maps
  Portability
    Can import and export KML/KMZ via ArcToolbox in ArcGIS
    Can upload to Google Maps from your computer
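To make the structure concrete, the sketch below builds a minimal KML document with one point feature using Python's standard library. It uses the KML 2.2 namespace and KML's longitude,latitude,altitude coordinate order; the function name `point_placemark_kml`, the placemark name, and the coordinates are illustrative assumptions. ArcGIS and Google tools normally generate much richer files than this.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def point_placemark_kml(name, lon, lat, alt=0.0):
    """Return a KML string containing a single point placemark."""
    ET.register_namespace("", KML_NS)  # write KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinate order is longitude, latitude, altitude.
    coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
    coords.text = f"{lon},{lat},{alt}"
    return ET.tostring(kml, encoding="unicode")


# Illustrative coordinates only.
doc = point_placemark_kml("Sample point", -97.74, 30.28)
print(doc)
```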
X,y data
  Point data table with x and y attributes
  Increasingly popular to include x and y with data
  Commonly used for GPS data
CALCULATING GEOMETRY
Point centroids
  When displaying or analyzing small polygons, it is often better to use point centroids
Calculate x,y fields
  Add new x and y fields in the attribute table
Calculate x,y fields
  Calculate geometry for the x field, then repeat for y
X,y field results
  Results are x and y values based on map properties (e.g., Long/Lat or x,y feet)
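For intuition about what the centroid x and y values represent, here is the standard area-weighted centroid formula for a simple polygon, written in Python. It is a sketch only: it assumes a single ring with no holes, planar coordinates, and nonzero area, and it is not a description of how ArcGIS computes the values. The function name and the sample rectangle are hypothetical.

```python
def polygon_centroid(ring):
    """Return the (x, y) centroid of a simple polygon given as a vertex list."""
    twice_area = 0.0
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        cross = x1 * y2 - x2 * y1  # shoelace term for this edge
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    # Centroid = sum / (6 * area), and 6 * area = 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# A 4-by-2 rectangle; its centroid is its center.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_centroid(rectangle))
```

The resulting pair corresponds to the values that would be stored in the new x and y fields.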
Export table with x,y values
Add x,y data table
Export features
  X,y events should be exported as a permanent shapefile or feature class
Count point centroids
  Population can be spatially joined to a buffer around polluting companies
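The idea on this slide can be sketched as summing the population of every centroid that falls within a fixed distance of a facility. The sketch assumes a circular buffer and planar coordinates; the function name `population_within`, the buffer radius, and the centroid records are all made up for illustration.

```python
import math


def population_within(center, radius, centroids):
    """Sum the population of centroids inside a circular buffer."""
    total = 0
    for x, y, population in centroids:
        if math.hypot(x - center[0], y - center[1]) <= radius:
            total += population
    return total


# Hypothetical (x, y, population) centroid records.
block_centroids = [(0, 0, 100), (3, 4, 50), (10, 0, 200)]
facility = (0, 0)

print(population_within(facility, 5, block_centroids))
```

Each polygon's whole population is assigned to its centroid, so a polygon only partly covered by the buffer is counted either fully or not at all.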
Other geometry calculations
  Area
  Perimeter
  Length
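These three measures follow directly from vertex coordinates. The sketch below uses the shoelace formula for area and summed segment lengths for perimeter and length. It assumes planar coordinates in consistent units and simple polygons without holes; the function names and sample shapes are hypothetical.

```python
import math


def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def line_length(path):
    """Length of a polyline: sum of its segment lengths."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(path, path[1:])
    )


def polygon_perimeter(ring):
    """Perimeter: length of the ring closed back to its first vertex."""
    return line_length(list(ring) + [ring[0]])


rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_area(rectangle))
print(polygon_perimeter(rectangle))
print(line_length([(0, 0), (3, 4)]))
```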
Summary
  Tables
  Geocodes
  Data table joins
  Spatial joins
  Spatial data formats
  Geodatabases
  Calculating geometry