Colorado State Address Dataset Automated Processing

Colorado State Address Dataset Automated Processing. Nathan Lowry, GIS Outreach Coordinator, State of Colorado. September 23, 2014

Upload: geco-in-the-rockies

Post on 19-Jun-2015


DESCRIPTION

Nathan Lowry

TRANSCRIPT

Page 1: Colorado State Address Dataset Automated Processing

Colorado State Address Dataset Automated Processing

Nathan Lowry, GIS Outreach Coordinator, State of Colorado

September 23, 2014

Page 2: Colorado State Address Dataset Automated Processing
Page 3: Colorado State Address Dataset Automated Processing

Common Data Model

● Allows local and state-wide querying, analysis, and integration …

● Accommodates information exchanges

▪ Hierarchical - City to County, County to Region, Region to State

▪ Among neighboring jurisdictions (e.g. County to County, etc.)

● Allows profiles to provide data in standard forms for specific objectives

▪ NENA CLDXF for NG-911

▪ USPS Pub-28 for CASS

▪ ArcGIS Geocoding (for quality comparisons, etc.)

● It’s more efficient (less work) and assures more quality (less loss)
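The idea of "profiles" on this slide can be sketched in a few lines of Python: one fully spelled-out record in the common model, rendered on demand into an abbreviated, postal-style form. This is only an illustration. The field names, the sample address, and the `abbreviated_profile` function are hypothetical, and the abbreviation table is a tiny subset chosen for the example, not the full USPS Pub-28 tables or the CLDXF schema.

```python
# Illustrative subset only; a real profile would use the full published tables.
ABBREVIATIONS = {
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
    "STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD",
}

# Hypothetical common-model record: every element is stored fully spelled out.
# The address itself is made up for this example.
record = {
    "address_number": "100",
    "pre_directional": "NORTH",
    "street_name": "EXAMPLE",
    "post_type": "STREET",
    "place_name": "EXAMPLE CITY",
    "state": "CO",
    "zip_code": "80000",
}


def abbreviated_profile(rec):
    """Render a record as (delivery line, last line) in an abbreviated form."""
    parts = [
        rec["address_number"],
        ABBREVIATIONS.get(rec["pre_directional"], rec["pre_directional"]),
        rec["street_name"],  # the street name itself is left untouched
        ABBREVIATIONS.get(rec["post_type"], rec["post_type"]),
    ]
    delivery_line = " ".join(p for p in parts if p)
    last_line = f'{rec["place_name"]} {rec["state"]} {rec["zip_code"]}'
    return delivery_line, last_line


print(abbreviated_profile(record))
```

Because the stored record keeps the spelled-out values, other profiles can be derived from the same data without re-parsing.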

Page 4: Colorado State Address Dataset Automated Processing

FGDC-STD-016-2011 United States Thoroughfare, Landmark, and Postal Address Data Standard

Of Greatest Significance:

1. Everything* is ‘fully explicit’ (fully spelled-out)

No abbreviations allowed; no ambiguity

*The only exception is two-letter state postal codes (e.g. “CO” = Colorado)

2. You will express exactly how each address will be parsed

Parsing is no longer subject to interpretation

The break-down is stored in the data for each record

3. Each Address must be assigned a Unique Identifier (UID)

Multiple representations of the same address can be “tied together” if and only if (iff) addresses are assigned UIDs.

These are big changes that few have yet implemented

• Our common data model is designed to accommodate both:

‒ your current state and

‒ this “to be” state
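A minimal Python sketch of the three points above: elements stored spelled out, the parse kept explicitly with each record, and a shared UID tying together two representations of the same address. The class, its field names, the sample addresses, and the use of UUIDs are all assumptions made for illustration; they are not the standard's schema, and the standard is not claimed here to require any particular identifier format.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedAddress:
    """Hypothetical record: each parsed element is stored in its own field."""
    uid: str
    address_number: str
    pre_directional: str
    pre_type: str
    street_name: str
    post_type: str
    place_name: str
    state: str  # two-letter postal code, the one abbreviation allowed

    def complete_address(self) -> str:
        parts = [self.address_number, self.pre_directional, self.pre_type,
                 self.street_name, self.post_type]
        street = " ".join(p for p in parts if p)
        return f"{street}, {self.place_name}, {self.state}"


def group_by_uid(records):
    """Collect every representation that shares a UID."""
    groups = {}
    for rec in records:
        groups.setdefault(rec.uid, []).append(rec)
    return groups


# Two made-up representations of one address share a UID; a third address does not.
uid = str(uuid.uuid4())
primary = ParsedAddress(uid, "100", "NORTH", "", "EXAMPLE", "STREET",
                        "EXAMPLE CITY", "CO")
alias = ParsedAddress(uid, "100", "", "STATE HIGHWAY", "99", "",
                      "EXAMPLE CITY", "CO")
other = ParsedAddress(str(uuid.uuid4()), "200", "", "", "SAMPLE", "AVENUE",
                      "EXAMPLE CITY", "CO")

groups = group_by_uid([primary, alias, other])
for rec in groups[uid]:
    print(rec.complete_address())
```

Without the shared UID, nothing in the two street strings would indicate that they name the same place.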

Page 5: Colorado State Address Dataset Automated Processing

Presuppositions:

● SQL Server Integration Services (SSIS)

o Parallel processing - fast translations - True.

o Most Compatible with SQL Server - Irrelevant*

o Developed by DBAs for DBAs - No, developed by app developers for app developers

▪ (i.e. Normalization tools) - Hah, hah, hah, hah, hah!

o No Additional Cost - (This one bore out)

o I learned French instead of Spanish - (SSIS instead of Python)

● No Parsing

o I will translate, but it’ll be the locals’ responsibility to pre-parse... - No parsing, no geocoding*

o In addition, no last lines, no geocoding

● 6-8 Weeks Processing - 6-8 Months of Processing
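The "no last lines, no geocoding" point can be illustrated with a short Python sketch that builds a single-line geocoder input from a delivery line plus a last line (place, state, ZIP) and refuses to build one when the last line cannot be formed. The function name and the rule used here (either a ZIP code, or both a place name and a state) are assumptions for the example, not the rule applied in the Colorado processing.

```python
def geocoder_input(delivery_line, place_name="", state="", zip_code=""):
    """Return 'delivery line, last line', or None if either part is unusable."""
    # Assumed rule: a last line needs a ZIP code, or a place name plus a state.
    has_last_line = bool(zip_code) or bool(place_name and state)
    if not delivery_line or not has_last_line:
        return None
    last_line = " ".join(p for p in (place_name, state, zip_code) if p)
    return f"{delivery_line}, {last_line}"


# Made-up sample values.
print(geocoder_input("100 NORTH EXAMPLE STREET", "EXAMPLE CITY", "CO"))
print(geocoder_input("100 NORTH EXAMPLE STREET"))
```

Records that return `None` here are the ones that would need "last lining" before they can be geocoded.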

Page 6: Colorado State Address Dataset Automated Processing

Automating Processes

Page 7: Colorado State Address Dataset Automated Processing

Colorado State Address Dataset

Automated and Manual Processes

Page 8: Colorado State Address Dataset Automated Processing

Automating Processes

Page 9: Colorado State Address Dataset Automated Processing

Observations

● SQL Server Integration Services (SSIS)

○ SSIS is quirky

○ SSIS Expression Language is Swahili

○ A modeling canvas may be more effective for design

○ SSIS can integrate with many other server processes (FTP)

● Parsing and “Last Lining” will give CO jurisdictions a leg up

○ The level of effort can be significant

○ CLDXF Street Naming and Address Numbering Conventions

● Standards

○ Jurisdictional pretypes, sequencers - minor tweaks

○ Subaddress conventions need ... something

Page 10: Colorado State Address Dataset Automated Processing

Opportunities

● Standards

○ Improvement via implementation

○ Coalescence on Subaddresses

● Common implementations of data models

○ Reduce the cost of development

○ Makes sharing of code useful and possible

● Common code

○ Shared parsing tools

○ Shared applications

Page 11: Colorado State Address Dataset Automated Processing

Questions?

Thank You!