Database Management Issues of interest to Address Databases

Upload: brynn-bicker

Post on 15-Dec-2015




Database Overview Agenda

• Database Components

• Example Data Types

• Table Indexes

• Domains

• Joins and Views

• Foreign and Primary Keys


Database Components

A database is the sum of all information you have obtained.

[Diagram: a database made up of Table 1 through Table 8. Table 3 is expanded to show its columns/fields (col1 through col5) and its records/rows (Record/Row 1 through Record/Row 4).]


Sample Column Data Types

• Character: stores a maximum of 240 ASCII characters.

• Integer: stores an integer in the range -2,147,483,648 to 2,147,483,647.

• Smallint: stores an integer in the range -32,768 to 32,767.

• Double: stores a real value in double-precision floating-point format.

• Real: stores a real value in single-precision floating-point format.

• Decimal: stores a fixed-point decimal number with an optional precision and scale.

• Timestamp: stores a timestamp in ‘yyyy-mm-dd:hh:mm:ss’ format.
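The Integer and Smallint ranges listed above are what a signed 32-bit and a signed 16-bit value can hold. The slides do not state the bit widths, so treat those as an assumption. A minimal sketch for checking whether a value fits a column before loading it (the function names are illustrative only):

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Return (minimum, maximum) for a signed two's-complement integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Assumed widths: Integer = 32 bits, Smallint = 16 bits.
INTEGER_RANGE = signed_range(32)
SMALLINT_RANGE = signed_range(16)


def fits(value: int, value_range: tuple[int, int]) -> bool:
    """True when value lies inside the inclusive range."""
    low, high = value_range
    return low <= value <= high


# A traffic count of 40000 does not fit a Smallint column but fits an Integer.
print(fits(40000, SMALLINT_RANGE), fits(40000, INTEGER_RANGE))
```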


Table Indexes

• A table index contains information from a specified table and column

• The index allows you to sort information by column and place this information in a table

• Indexes can be placed on columns that are frequently used in queries and have few repeating values

• Indexes help to improve performance on queries

• A unique index can be created on a column that will have unique values for each record
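As a rough illustration of the points above, here is a sketch using SQLite through Python's `sqlite3` module. The slides do not name a DBMS, and the `address` table and index names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (id INTEGER, street TEXT, zip TEXT)")

# Ordinary index on a column that is queried often.
conn.execute("CREATE INDEX idx_address_street ON address (street)")
# Unique index on a column that must hold a different value in every record.
conn.execute("CREATE UNIQUE INDEX idx_address_id ON address (id)")

conn.execute("INSERT INTO address VALUES (1, 'MAIN ST', '35601')")
try:
    conn.execute("INSERT INTO address VALUES (1, 'OAK AVE', '35603')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # the unique index refused the repeated id

# Ask SQLite how it would run a query that filters on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM address WHERE street = 'MAIN ST'"
).fetchall()
uses_index = any("idx_address_street" in row[-1] for row in plan)
print(duplicate_rejected, uses_index)
```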


Domains and DB Integrity

• A domain allows you to check the validity of an entry into a column in a database table against a corresponding set of allowable values for that column

• Two types of domains exist

– Range domain: used with numeric data and consists of one or more inclusive minimum-maximum ranges

– List domain: used with character data and consists of a set of character strings

• Domains are stored in a series of domain tables
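The two domain types can be sketched in a few lines. The domain tables below are hypothetical stand-ins keyed by column name. The allowed values are invented for illustration and are not taken from any real domain table.

```python
def in_range_domain(value: float, ranges: list[tuple[float, float]]) -> bool:
    """Range domain: value must fall inside one of the inclusive min-max ranges."""
    return any(low <= value <= high for low, high in ranges)


def in_list_domain(value: str, allowed: set[str]) -> bool:
    """List domain: value must be one of the allowed character strings."""
    return value in allowed


# Hypothetical domain tables (assumed values).
RANGE_DOMAINS = {"num_l": [(1, 8)]}
LIST_DOMAINS = {"county": {"morgan", "madison"}}

print(in_range_domain(2, RANGE_DOMAINS["num_l"]))        # inside the range
print(in_list_domain("morgn", LIST_DOMAINS["county"]))   # misspelling is caught
```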


Relational Join

• A Join is a linkage between two tables in the database

• Columns from each table with like data types are used to establish the join relationship

• There must be at least one identical value in the joined columns in each table to complete the join

[Diagram: two joined tables, with the labels "parcel_id" and "parcel". One table has the columns mslink, mapid, parcel_no, county_name, area_sqft, owner; the other has the columns parcel_id, assessed_value, zone_class, school_district, land_use.]
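A join along these lines can be sketched with SQLite. The column names come from the slide. The table name `assessment`, the sample rows, and the choice to join `parcel_no` to `parcel_id` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel (mslink INTEGER, mapid INTEGER, parcel_no INTEGER,
                         county_name TEXT, area_sqft REAL, owner TEXT);
    CREATE TABLE assessment (parcel_id INTEGER, assessed_value REAL,
                             zone_class TEXT, school_district TEXT, land_use TEXT);
    INSERT INTO parcel VALUES (1, 12, 100, 'morgan', 8500.0, 'SMITH');
    INSERT INTO parcel VALUES (2, 12, 101, 'morgan', 9200.0, 'JONES');
    INSERT INTO assessment VALUES (100, 75000.0, 'R1', 'D1', 'residential');
""")

# Only rows whose join columns hold an identical value appear in the result,
# so parcel 101 (no assessment record) is left out.
rows = conn.execute("""
    SELECT p.parcel_no, p.owner, a.assessed_value
    FROM parcel AS p
    JOIN assessment AS a ON a.parcel_id = p.parcel_no
""").fetchall()
print(rows)
```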


Database Views

• A view is a window that allows you to analyze selected columns of joined tables.

• A view can be defined using either a single join or multiple join relationships (i.e., using several DB columns).

• Views are used for Query, Analysis and Reporting of Database Values.

• Views make huge DB Tables more user-friendly.
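A small sketch of a view over one join, again using SQLite. The `street_name` and `segment` tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE street_name (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE segment (id INTEGER PRIMARY KEY, street_id INTEGER,
                          from_addr INTEGER, to_addr INTEGER, length_ft REAL);
    INSERT INTO street_name VALUES (1, 'MAIN', 'ST');
    INSERT INTO segment VALUES (10, 1, 100, 198, 400.0);
    INSERT INTO segment VALUES (11, 1, 200, 298, 410.0);

    -- The view exposes only selected columns of the joined tables.
    CREATE VIEW segment_report AS
        SELECT s.id AS segment_id,
               n.name || ' ' || n.type AS street,
               s.from_addr,
               s.to_addr
        FROM segment AS s
        JOIN street_name AS n ON n.id = s.street_id;
""")

# Users query the view like a table, without writing the join themselves.
report = conn.execute(
    "SELECT * FROM segment_report ORDER BY segment_id"
).fetchall()
print(report)
```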


Primary Keys

General Guidelines

• Should be numeric

• Must be Unique

• Do not change

• The shorter the better

• Automatically Generated is best...

Just call them the “Record IDs” in a Database Table...

[Diagram: a table's Column Names, with one column marked by an asterisk (*).]
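The guidelines above (numeric, unique, automatically generated) can be sketched with SQLite, where an `INTEGER PRIMARY KEY` column is filled in automatically when no value is supplied. The table and column names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE street_name (record_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# No record_id is supplied; the database generates it.
first = conn.execute("INSERT INTO street_name (name) VALUES ('MAIN')").lastrowid
second = conn.execute("INSERT INTO street_name (name) VALUES ('OAK')").lastrowid

# Reusing an existing key is refused, which keeps the key unique.
try:
    conn.execute(
        "INSERT INTO street_name (record_id, name) VALUES (?, 'ELM')", (first,)
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(first, second, duplicate_rejected)
```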


Foreign Keys

• Should/Must have a matching column in another table with at least some matching values.

• Require extensive planning during Database Development Phase.

• Should be unique and numeric, but don’t have to be...

Essentially the “Linkage Columns” between Database Tables...
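A sketch of a foreign key acting as the linkage column between two tables, using SQLite. The table names are hypothetical. SQLite normally enforces foreign keys only after `PRAGMA foreign_keys = ON`, and other database systems behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is usually off by default
conn.execute("CREATE TABLE street_name (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE segment ("
    " id INTEGER PRIMARY KEY,"
    " street_id INTEGER REFERENCES street_name (id))"
)

conn.execute("INSERT INTO street_name VALUES (1, 'MAIN')")
# Two segments link to the same street through the foreign key column.
conn.execute("INSERT INTO segment VALUES (10, 1)")
conn.execute("INSERT INTO segment VALUES (11, 1)")

# A segment that points at a street that does not exist is refused.
try:
    conn.execute("INSERT INTO segment VALUES (12, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

segment_count = conn.execute("SELECT COUNT(*) FROM segment").fetchone()[0]
print(orphan_rejected, segment_count)
```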


What’s a Cartographic Feature?

• Something from the real world represented in your digital map: streets, streams, houses, trees, etc.

• A graphic element that contains a pointer to a record in the Feature table

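Feature table record (example):

mslink  fname  fcode  ftype  category  fweight  flevel  fcolor  fstyle  table  digcmd  other...
1       road   rd1    line   3         2        29      0       0       22

[Diagram: a graphic element in the Digital Map (CAD Graphics) pointing to this record in the Table.]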


Database Linkages on CAD Graphics

CAD File Graphics

feature link: DMRS 8000 0004 0005 0000
attribute link: DMRS 8000 0022 0014 0000

[Diagram callouts on the linkage codes: “Old” DB Table ID; record ID in “Old” Table.]

The Database software will interpret the “old linkage code” to determine what table the graphic element “points” to.


Relational Databases (example)

Feature Table

mslink  fname  fcode  ftype  category  fweight  flevel  fcolor  fstyle  table  digcmd  other...
1       road   rd1    line   3         2        29      0       0       22

Category Table

cname  mslink  indexname  indexlevel
trans  3

Maps Table

mslink  mapname    category
12      road1.dgn  3

roads Table

mslink  mapid  rd_name  num_l  traffic  county
1       12     test     2      10100    morgan

MSCATALOG

tablename  entitynum  nextocc
feature    4          7
roads      22         2

[Diagram: CAD File Graphics linked to the tables above.]


Joining Tables (DBMS)

Foreign Keys in each table are used to complete the Join Relationships from Table to Table.

[Diagram: Street Name Table, Segment Table, and Sides Table, with relationships labeled 1:Many and 1:2.]


Joining Tables (example)

[Diagram: Street Name Table, Segment Table, Segment Table in Graphics, and Master Address File, with one relationship labeled 1:1.]


Address Database Design Issues

• Determine your “Audience” and their needs.

• What’s your Geographic Extent?

• What Partnerships should be established?

• Establish Standards Early!

• Re-evaluate those Standards Regularly


Address Tables

• Parse the entire Address Record

– Always easier for “us average” DB users to “merge” columns, rather than “split” them.

• Pay close attention to Primary Keys and Foreign Keys during the Design and Testing Phase.

• Conduct a Pilot Study for the entire database structure before going “live”.
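The "merge rather than split" point can be shown with a short sketch. When each address component lives in its own column, rebuilding the full address is a one-line join. The column names below are hypothetical and do not follow any prescribed standard.

```python
# Hypothetical parsed columns, in the order they appear in a street address.
PARSED_COLUMNS = (
    "number", "prefix_dir", "street_name", "street_type", "suffix_dir", "unit",
)


def merge_address(record: dict) -> str:
    """Merge the parsed columns into one address line, skipping empty ones."""
    return " ".join(str(record[c]) for c in PARSED_COLUMNS if record.get(c))


record = {
    "number": 123,
    "prefix_dir": "N",
    "street_name": "MAIN",
    "street_type": "ST",
    "suffix_dir": "",
    "unit": "APT 4",
}
print(merge_address(record))
```

Going the other way, from "123 N MAIN ST APT 4" back to separate columns, needs rules for every ambiguous case, which is why parsing at design time is the easier path.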


Address Tables (continued)

• Use Domains to control user input at EVERY opportunity!

– List Domain (valid Street Names)

– Range Domains (valid numeric ranges)

• If gathering new addresses from more than one source, collect them in “dummy” tables before the DB “gatekeeper” cleans them up and dumps them into the “master database”.
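One way to picture the staging ("dummy") table and the gatekeeper step is sketched below. The domain values, record layout, and function names are all assumptions made for illustration. A real gatekeeper would also review the rejects by hand.

```python
# Hypothetical domains used by the gatekeeper.
VALID_STREET_NAMES = {"MAIN", "OAK"}   # list domain
VALID_HOUSE_NUMBERS = [(1, 9999)]      # range domain


def passes_domains(record: dict) -> bool:
    """True when the record satisfies both the range and the list domain."""
    in_range = any(lo <= record["number"] <= hi for lo, hi in VALID_HOUSE_NUMBERS)
    return in_range and record["street"] in VALID_STREET_NAMES


def gatekeeper(staging: list[dict], master: list[dict]) -> list[dict]:
    """Load clean, new records into master; return records that failed a domain.

    Valid records already present in master are skipped as duplicates.
    """
    rejects = []
    for record in staging:
        if not passes_domains(record):
            rejects.append(record)
        elif record not in master:
            master.append(record)
    return rejects


# Records collected from more than one source land in staging first.
staging = [
    {"number": 12, "street": "MAIN"},
    {"number": 12, "street": "MAIN"},   # duplicate from a second source
    {"number": 0, "street": "OAK"},     # fails the range domain
    {"number": 5, "street": "ELM"},     # fails the list domain
]
master: list[dict] = []
rejects = gatekeeper(staging, master)
print(len(master), len(rejects))
```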


Address Data Entry

Specific data entry recommendations include:

1) Zip code entry first, with automatic fill of State and (optionally) locality data.

2) Support on-line entry with help screens, pop-up valid values access, and immediate edits.

3) Secondary unit data entry separate from street address (optionally before street for emphasis).

4) Addresses entered with manual overrides of edits should be flagged for future review.

5) Allow search for Zip code given City and State (optional).
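Recommendations 1, 4, and 5 can be sketched as follows. The `ZIP_TABLE` lookup and all function and field names are hypothetical. A production system would load the table from a maintained source.

```python
# Hypothetical zip code table (assumed contents).
ZIP_TABLE = {"35601": {"state": "AL", "locality": "DECATUR"}}


def start_entry(zip_code: str) -> dict:
    """Recommendation 1: take the zip code first, then fill State and locality."""
    known = ZIP_TABLE.get(zip_code, {})
    return {
        "zip": zip_code,
        "state": known.get("state", ""),
        "locality": known.get("locality", ""),
        "needs_review": False,
    }


def set_state(record: dict, state: str, override: bool = False) -> dict:
    """Recommendation 4: a conflicting value is accepted only as a flagged override."""
    expected = ZIP_TABLE.get(record["zip"], {}).get("state")
    if expected is not None and state != expected:
        if not override:
            raise ValueError(f"state {state!r} does not match zip {record['zip']}")
        record["needs_review"] = True  # flag for future review
    record["state"] = state
    return record


def zips_for(locality: str, state: str) -> list[str]:
    """Recommendation 5: search for zip codes given City and State."""
    return sorted(
        z for z, v in ZIP_TABLE.items()
        if v["locality"] == locality and v["state"] == state
    )


entry = start_entry("35601")
print(entry["state"], entry["locality"], zips_for("DECATUR", "AL"))
```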


Recommended Address Edits

Several types and levels of edits may be practical, depending on circumstances and business purpose.

1) Check entered data for valid abbreviations. (Abbreviation standards used by the USPS are included in Appendix B.)

2) Compare entered location (City) and State to Zip code (based on GCS or equivalent table information).

3) Check Zip code for validity (based on GCS or equivalent table information).

4) Compare entered address against valid addresses: against an existing database containing addresses (within the enterprise).


Recommended Address Edits (continued)

5) Verify and correct the standard use of state code, standard spelling for city; and presence of standard street type.

6) Inspect Street numbers that seem to represent ranges of addresses, such as street numbers in a range or the use of terms such as "scattered sites". (This only applies for those applications that receive addresses representing, for example, blocks of apartments).

7) Identify and correct building name substitutions for street addresses to the extent possible. Using COTS software modules, against a postal-service database of 140 million valid addresses.

8) If County Code is missing, generate County Code.


Recommended Address Edits (continued)

9) Identify where the range of latitude or longitude is more than 5 miles. Inspect and correct.

(This is a way to detect whether the geocode is the center of a Zip code rather than a specific street address. This is unnecessary if the geocoding level is specified in a code, as is recommended.)

10) Identify and delete official verbiage. For example: "Township of", "The Commonwealth of", "The Great State of".

11) Comma Check. The USPS recommends not using commas or other dividers within addresses, except the hyphen in Zip+4. The USPS further recommends all capital letters, to aid machine readability.


Recommended Address Edits (continued)

12) Enforce Business Rules.

For example, it may be a rule that P.O. Box numbers (and equivalent) may not substitute for Street names (and equivalent) if the address is for a property in which the enterprise holds an interest (as opposed to the mailing address of an individual or organization).
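A few of the edits in this list (2, 3, 10, 11, and 12) can be sketched in code. The lookup table stands in for "GCS or equivalent table information". Its contents, the P.O. Box pattern, and all names are assumptions for illustration, not a complete validator.

```python
import re

# Hypothetical zip-to-state table (assumed contents).
ZIP_TO_STATE = {"35601": "AL", "17101": "PA"}

OFFICIAL_VERBIAGE = ("TOWNSHIP OF", "THE COMMONWEALTH OF", "THE GREAT STATE OF")
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*BOX\b")


def clean_line(text: str) -> str:
    """Edits 10 and 11: upper-case, drop commas, delete official verbiage."""
    text = text.upper().replace(",", " ")
    for phrase in OFFICIAL_VERBIAGE:
        text = text.replace(phrase, " ")
    return " ".join(text.split())  # collapse the leftover spaces


def find_problems(street: str, state: str, zip_code: str, is_property: bool) -> list[str]:
    """Return a list of problems found by edits 2, 3, and 12."""
    problems = []
    zip5 = zip_code[:5]  # ignore the +4 part when looking up the zip code
    if zip5 not in ZIP_TO_STATE:
        problems.append("unknown zip code")                       # edit 3
    elif ZIP_TO_STATE[zip5] != state:
        problems.append("state does not match zip")               # edit 2
    if is_property and PO_BOX.search(clean_line(street)):
        problems.append("P.O. Box used for a property address")   # edit 12
    return problems


print(clean_line("Township of Springfield, PA"))
print(find_problems("P.O. Box 12", "AL", "35601", is_property=True))
```

Returning a list of problems, rather than rejecting the record outright, leaves room for the flagged manual overrides recommended under Address Data Entry.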


Database Loading Tools useful for enhancing your GBF data.

• Bulk Update – Attributes

• Area Loader – Polygons

• Length Loader – Lines

• Point Loader – X, Y coords. from DB

• Label Loader – from graphics to the database!


Third-Party Database “Scrubbers”

• Clean up un-parsed Address Databases.

• Remove duplicate records or misspellings.

• Can even Geocode database records for you.

• Some provide CASS certified services for Address clean-up.


Web Sites of Interest

• http://www.nonprofitmailers.org/vendors/page6old.htm

• http://www.census.gov/geo/www/tiger/vendors.html

• http://www.nena.org/ads/prodvend.htm