Philadelphia, May 2–4, 2005
www.locationintelligence.net
Effective use of spatial databases for enterprise data integration
Dr Paul Watson
Laser-Scan
Overview
• Motivation - review key drivers for Data Integration
• Obstacles to Data Integration
• Data Quality Management Principles
– Data Stewardship
– Knowledge Management
– Enterprise metadata
– Rules based processing
• Methodology – Data Quality Improvement cycle
• Spatial Business Rules
• Spatial Knowledge Management – Radius Studio
– Rules browser/ rules repository
– Conformance check/ report
– Reconciliation
– Certification
Drivers for Data Integration
Agile Principle | Business Needs | Information Needs
Enrich the Customer | Adapt products & services | Integrate, extend
Master Change | React, forecast | Transform, model, test, extrapolate
Mobilise Resources | Plan, empower, build virtual teams | Analyse, build knowledge, collaboration platforms
Co-operate to Compete | Focus, partner | Transform, publish, exchange
Source: Agile Competitors and Virtual Organisations - Goldman, Nagel and Preiss
Obstacles to Data Integration
• Proprietary Data Silos - data cul de sac
• Monolithic Information Systems – embedded logic
• Built-in, Private Data Models – structure/lifecycle
• Unknown/Unproven Data Quality – KR/KM
Enablers For Enterprise Information Architectures
• Electronic Data Interchange
• Straight-through Processing
• Service Oriented Architectures
• Rules-based Processing
• Knowledge Representation Standards
• Data & service integration is the goal
Technological Building Blocks
• Open data access standards
• Extensible, interoperable platforms
• Metadata publication/retrieval
• Data stewardship
Data Quality Impact
One third of companies have been forced to delay or scrap new systems because of faulty data, and a full 75% have experienced significant problems resulting from data quality issues - PwC
55-70% of CRM and 70% of Data Warehouse project failures are due to data quality issues - Gartner, Meta
US business loses $600 billion each year due to data quality problems - Data Warehousing Institute
When is data “fit for purpose”?
Principles, Tools & Method for DQ Improvement
• Knowledge Management
– Describe the domain/problem independently (from data and systems)
• Rules-based paradigm
– Decouple the problem definition from problem discovery & resolution (what to do about it)
• Data Quality Improvement Cycle
– Employ a process of continuous monitoring & improvement - sustainable interoperability™
Knowledge Management
• Where is knowledge now?
– Embedded in data
– Hidden in point applications
– Inside people’s heads
• Store the knowledge/expertise of the organisation where everyone can contribute to it and share it - as enterprise metadata in a database
Benefits of a knowledge management approach
• Explicit/unambiguous – not embedded
• Inclusive – accommodates domain experts
• Non-technical – not just for developers
• Open, distributable – location/applications
• Auditable – regulatory/governance issues
• Evolutionary – incremental acquisition, knowledge is refined/grows over time
• Structured – machine-readable
Knowledge Representation
• RDF
• OWL
• SWRL
Data Quality as Knowledge
• Express data requirements as rules (e.g. SWRL)
• Data quality rules – enterprise metadata - DB
• Rules metadata can be shared (interpreted and enforced) by many different applications
• Rules can be used to measure Data Quality - % conforming data instances
• Rules guide data reconciliation - prioritise
• Rules can be used to measure quality improvement reliably
• “Fit for Purpose” = satisfies the DQ rules
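As an illustration of rules held as shared metadata and used to measure quality, the following Python sketch stores two rules as plain data and reports the percentage of conforming records for each. The rule names, the record fields and the `RULES` structure are assumptions made for this example; they are not taken from Radius Studio or from SWRL.

```python
# Hypothetical rule metadata. In practice this would live in a shared
# database so that many applications can interpret the same rules.
RULES = [
    {"name": "percentage_in_range", "field": "coverage", "min": 0, "max": 100},
    {"name": "area_is_positive", "field": "area", "min": 0.000001, "max": None},
]


def conforms(record, rule):
    """Return True if the record's field lies within the rule's bounds."""
    value = record.get(rule["field"])
    if value is None:
        return False  # a missing value cannot satisfy a range rule
    if rule["min"] is not None and value < rule["min"]:
        return False
    if rule["max"] is not None and value > rule["max"]:
        return False
    return True


def conformance_percent(records, rule):
    """Percentage of records that satisfy one rule (0.0 for an empty set)."""
    if not records:
        return 0.0
    passed = sum(1 for r in records if conforms(r, rule))
    return 100.0 * passed / len(records)


def fit_for_purpose(records, rules):
    """'Fit for purpose' here means every record satisfies every rule."""
    return all(conforms(r, rule) for r in records for rule in rules)


# Made-up sample data for the example.
records = [
    {"coverage": 40, "area": 12.5},
    {"coverage": 120, "area": 3.0},   # violates percentage_in_range
    {"coverage": 75, "area": 0.0},    # violates area_is_positive
    {"coverage": 10, "area": 8.0},
]

for rule in RULES:
    print(rule["name"], conformance_percent(records, rule))
```

Because the rules are data rather than code, the same list could be read by a reporting tool and by a reconciliation tool without either embedding the logic.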
Rules-based Processing Paradigm
• Fact – Pattern – Action
• Given some facts, if they meet any of the patterns/rules, perform the defined action
– Declarative – rule separated from processing
– Pluggable actions – reporting/ reconciliation
[Diagram labels: Data, Rule A, Report Action A, Reconcile Action A]
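The fact, pattern, action idea can be sketched in a few lines of Python. This is a minimal illustration rather than the design of any particular rules engine: the `Rule` class, the `run` function and both actions are hypothetical names, and the pattern here is written to match facts that violate the rule.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Fact = Dict[str, Any]


@dataclass
class Rule:
    name: str
    pattern: Callable[[Fact], bool]  # True when the fact violates the rule


def report_action(rule: Rule, fact: Fact, log: List[str]) -> None:
    """Reporting action: record the violation without changing the data."""
    log.append(f"{rule.name}: {fact['id']}")


def make_clamp_action(field: str, low: float, high: float):
    """Build a reconciliation action that clamps a field into [low, high]."""
    def reconcile_action(rule: Rule, fact: Fact, log: List[str]) -> None:
        fact[field] = max(low, min(high, fact[field]))
        log.append(f"{rule.name}: fixed {fact['id']}")
    return reconcile_action


def run(facts, rules, action):
    """Apply one pluggable action to every fact that matches a rule pattern."""
    log: List[str] = []
    for fact in facts:
        for rule in rules:
            if rule.pattern(fact):
                action(rule, fact, log)
    return log


# The rule is declared once; the action is chosen at run time.
rule_a = Rule("rule_a", lambda f: not 0 <= f["pct"] <= 100)
facts = [{"id": 1, "pct": 50}, {"id": 2, "pct": 130}]

report = run(facts, [rule_a], report_action)
fixes = run(facts, [rule_a], make_clamp_action("pct", 0, 100))
```

The rule definition never changes between the two runs; only the action plugged into `run` differs, which is the decoupling the slide describes.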
Data Quality Improvement Cycle
[Cycle diagram, stages: define quality mission, define rule metadata, make baseline assessment, refine rule metadata, check data conformance, perform data reconciliation, data certification, data publication]
[Three bar charts comparing Rule A and Rule B on a 0 to 100 scale; other labels: SP, ☺]
e.g. address cleaning
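One pass through the cycle can be sketched for the address cleaning case as follows. Everything specific here is an assumption for illustration: the two rules, the sample addresses, the automatic fixes and the certification threshold are invented, and a real cycle would also include refining the rules between passes.

```python
import re

# Hypothetical address rules, simplified for illustration.
RULES = {
    "Rule A": lambda rec: bool(re.fullmatch(r"\d{5}", rec["zip"])),
    "Rule B": lambda rec: rec["street"] == rec["street"].strip().upper(),
}


def check_conformance(records):
    """Return the percentage of records conforming to each rule."""
    total = len(records) or 1  # avoid dividing by zero on an empty set
    return {
        name: 100.0 * sum(1 for rec in records if test(rec)) / total
        for name, test in RULES.items()
    }


def reconcile(records):
    """Apply simple automatic fixes; anything else needs manual review."""
    for rec in records:
        rec["zip"] = re.sub(r"\D", "", rec["zip"])      # strip non-digits
        rec["street"] = rec["street"].strip().upper()  # normalise case/space
    return records


def certify(scores, threshold=100.0):
    """Certify for publication only if every rule meets the threshold."""
    return all(score >= threshold for score in scores.values())


# Made-up sample data.
records = [
    {"street": "12 market st", "zip": "19106"},
    {"street": "1500 WALNUT ST", "zip": "19 102"},
    {"street": " 30 S 17TH ST", "zip": "1910"},   # zip cannot be auto-fixed
    {"street": "400 ARCH ST", "zip": "19106"},
]

baseline = check_conformance(records)            # baseline assessment
after = check_conformance(reconcile(records))    # reconcile, then re-check
print(baseline, after, certify(after))
```

Comparing `baseline` with `after` gives the kind of before-and-after measurement per rule that the charts on this slide appear to show; the record that cannot be fixed automatically keeps the data from being certified at the default threshold.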
Enterprise Integration Gap for Location
• Spatial data - complex, brittle and surprising
• Maintenance - manual, expensive, error-prone
• Adherence to a spatial model is often business critical (e.g. land & property management, utilities)
• Tools - significant bespoke development, inflexible, built by developers not domain experts
• Location data - key for BI, data quality chasm, mining
• Rudimentary IT standards for spatial data
• Spatial semantics never explicit – KR/KM
Spatial Knowledge Management
• Same methodology
– Spatial business rules as metadata
– Conformance checking/reporting
– Data Reconciliation – rules driven
– Certification/publication
– Only detailed data operations change
• Business Rule - Percentage: 0 ≤ x ≤ 100
• Location Rule - Spatial (a,b): coveredBy, contains, withinDistance etc.
• Empower the domain expert – short “dev.” cycles, stay agile
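To show that a location rule is checked in the same way as an ordinary business rule, the sketch below implements the percentage rule alongside simplified versions of the three spatial predicates named above. It is a toy model under stated assumptions: geometries are reduced to points and axis-aligned boxes, the `Point` and `Box` classes are hypothetical, and the boundary subtleties that separate these predicates in a real spatial database are ignored.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle standing in for a real polygon geometry."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def percentage_ok(x: float) -> bool:
    """The non-spatial business rule: 0 <= x <= 100."""
    return 0 <= x <= 100


def contains(a: Box, b: Box) -> bool:
    """True if box b lies entirely inside (or on the boundary of) box a."""
    return (a.min_x <= b.min_x and a.min_y <= b.min_y
            and b.max_x <= a.max_x and b.max_y <= a.max_y)


def covered_by(a: Box, b: Box) -> bool:
    """True if a is covered by b; the converse of contains in this sketch."""
    return contains(b, a)


def within_distance(p: Point, q: Point, limit: float) -> bool:
    """True if two points are no further apart than limit."""
    return math.hypot(p.x - q.x, p.y - q.y) <= limit


# Made-up features: a building footprint inside a zoning area.
zoning = Box(0, 0, 100, 100)
building = Box(10, 10, 20, 20)
print(covered_by(building, zoning), within_distance(Point(0, 0), Point(3, 4), 5))
```

Each predicate returns a plain boolean, so a rules engine can treat spatial and non-spatial conditions uniformly and only the detailed data operation differs.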
Spatial Rules Discovery
[Diagram: sources of Spatial Data Quality Rules]
• Intra-feature Constraints – physical, geometric
• Inter-feature Constraints – proximity, topological, directional
• Data Specifications
• Ad hoc Rules
Spatial Rule Types
Spatial Rules Authoring
[Screenshot labels: curbline, building]
Spatial Rule Builder
Add Rule Clauses
[Screenshot labels: building, zoning]
Spatial & Non-spatial Conditions
Fire station building
Street centreline
“There’s no such thing as a spatial rule”
Rules as Knowledge
For each Road, the set of Road Segments having the same name as the Road must have the same aggregate geometry as the Road
[Diagram labels: Road "High St."; Road Segment "High St."]
<ruleml:imp>
  <ruleml:_body>
    <swrlx:classAtom>
      <owlx:Class owlx:name="Road"/>
      <ruleml:var>r</ruleml:var>
    </swrlx:classAtom>
    <swrlx:individualPropertyAtom swrlx:property="hasGeom">
      <ruleml:var>r</ruleml:var>
      <ruleml:var>g</ruleml:var>
    </swrlx:individualPropertyAtom>
    <swrlx:datavaluedPropertyAtom swrlx:property="hasName">
      <ruleml:var>r</ruleml:var>
      <ruleml:var>n</ruleml:var>
    </swrlx:datavaluedPropertyAtom>
    <swrlx:classAtom>
      <owlx:IntersectionOf>
        <owlx:Class owlx:name="Geometry"/>
        <owlx:ObjectRestriction owlx:property="hasFeature">
          <owlx:SomeValuesFrom>
            <owlx:IntersectionOf>
              <owlx:Class owlx:name="RoadSegment"/>
              <owlx:DataRestriction owlx:property="hasName">
                <owlx:hasValue>n</owlx:hasValue>
              </owlx:DataRestriction>
            </owlx:IntersectionOf>
          </owlx:SomeValuesFrom>
        </owlx:ObjectRestriction>
      </owlx:IntersectionOf>
      <ruleml:var>j</ruleml:var>
    </swrlx:classAtom>
    <swrlx:individualPropertyAtom swrlx:property="hasAggregate">
      <ruleml:var>j</ruleml:var>
      <ruleml:var>a</ruleml:var>
    </swrlx:individualPropertyAtom>
  </ruleml:_body>
  <ruleml:_head>
    <swrlx:individualPropertyAtom swrlx:property="equals">
      <ruleml:var>g</ruleml:var>
      <ruleml:var>a</ruleml:var>
    </swrlx:individualPropertyAtom>
  </ruleml:_head>
</ruleml:imp>
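The Road and Road Segment rule can also be read procedurally. The Python sketch below checks it on toy data under a strong simplifying assumption: a geometry is a list of vertices, and two geometries count as equal when they consist of the same set of undirected edges, so segments digitised with different vertices would fail even if they trace the same line. The function names and the "Low Rd." record are hypothetical.

```python
from collections import defaultdict


def edges(polyline):
    """Break a polyline (list of (x, y) vertices) into undirected edges."""
    return {frozenset(pair) for pair in zip(polyline, polyline[1:])}


def check_roads(roads, segments):
    """Return names of roads whose geometry differs from the aggregate
    geometry of the road segments that share the road's name."""
    aggregate = defaultdict(set)
    for seg in segments:
        aggregate[seg["name"]] |= edges(seg["geom"])
    return [r["name"] for r in roads
            if edges(r["geom"]) != aggregate[r["name"]]]


# Made-up sample data.
roads = [
    {"name": "High St.", "geom": [(0, 0), (1, 0), (2, 0)]},
    {"name": "Low Rd.", "geom": [(0, 1), (1, 1), (2, 1)]},
]
segments = [
    {"name": "High St.", "geom": [(0, 0), (1, 0)]},
    {"name": "High St.", "geom": [(2, 0), (1, 0)]},  # reversed direction
    {"name": "Low Rd.", "geom": [(0, 1), (1, 1)]},   # second half missing
]

violations = check_roads(roads, segments)
print(violations)
```

The declarative form states only what must hold; a checker like this is one of many possible ways to evaluate it, which is why keeping the rule separate from the processing matters.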
Conformance Checking
[Architecture diagram components: Spatial Rules Engine; Enterprise Messaging; Enterprise Spatial DB; Enterprise Metadata - Spatial Rules; Web Services Clients; Asynchronous Messaging Client; Browser Client; Reporting Solution]
Data Reconciliation
[Architecture diagram components: Spatial Rules Engine; Enterprise Spatial DB; Enterprise Metadata - Spatial Rules; Web Services Clients]
• Update: A-B-A
• Clone: A-B-A’
• Schema Map: A-B-C
Warehouse Certification
[Architecture diagram components: Spatial Rules Engine; Enterprise Spatial DB; Enterprise Metadata - Spatial Rules; Web Services Clients]
[Two bar charts comparing Rule A and Rule B on a 0 to 100 scale]
Summary
• Data quality issues in integration are best addressed using knowledge management/rules-based approaches
• Spatial data quality is no different to any other data quality – standard, interoperable rules are key
• Enterprise spatial rules engines form a secure base from which to develop open, distributable location enabled applications