www.spatialdbadvisor.com.au
[email protected]
Simon Greener, The SpatialDB Advisor
An Introduction to Spatial Databases
Presentation to “Border and District Spatial Information Users Group” (BADSIG)
November 11th 2008



The Talk

• I will start with a chunk of standard database theory.
• Then outline spatial database theory and how spatial databases are implemented.
• Before introducing and talking about the current crop of commercial and open source spatial database offerings, including the relevant standards that apply to them.
• After “theorising” I will move on to outlining some practical examples of spatial database use at Forestry and other sites I have worked at.
• And finally, present some tips and tricks before taking questions.

(Throughout I will try to show how spatially enabled databases “geo-enable” the whole organisation.)


Business-centric “GIS”

• GIS is a data-centric, (quality) data-demanding, large data-volume industry;
• Camp 1: “GIS is different”
  – The traditional view of GIS industry;
  – Special software is required to:
    • Store, manage, edit, access, analyse and display geospatial data;
  – Result (data management):
    • Geospatial data described and managed via software that is external to an organisation’s enterprise data model(s) and information management infrastructure;
    • Inward looking, small, niche technology segment.
• Camp 2: “Geospatial data and processing is NOT different”
  – An “enterprise geo-database” is not a product but, first and foremost, a corporate data management philosophy (“It’s about the business, stupid!”)
  – Trend is to horizontal market, commodity, open systems, fully IT integrated geospatial data and services.


Data Storage and Computers

• The two main traditional methods.
• Simple Files
  – Containing text or binary in a format that is known only to the application accessing it.
    • Examples
      – ESRI Shapefiles, CAD files, MapInfo TAB files, Manifold Map files
      – GeoTIFFs
      – ESRI Coverages
  – Some are documented “standards” but most are proprietary
    • eg Shapefile physical file format is documented at http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf
• Databases
  – A collection of (binary) files (database) under common, open, management control (DBMS).
  – Applications that use databases do not need to understand how the data is stored in these files.
  – DBMSs: Access, SQLServer, Oracle, Informix IDS, IBM DB2, PostgreSQL. (The internal data structures that these products use are not documented.)


Files - application access

• Applications access files through the following direct “architecture”
• Application / user needs to know exactly where the file is physically stored

[Diagram: Map Viewer Application (Display & Query; File handling routines; Shapefile Definition) reading Shapefile(s)]


Files - Limitations

• Rigid format that is locked to particular application/use.
• Change in file format requires all programs that access it to be modified and recompiled.
• No accessible metadata
• Data consistency
  – Attributes in shapefile dbf are not self-checking via integrity rules that are application independent.
  – What is the definition of a correct shape?
  – Bad shapefiles are very common!
• No security.
• Multi-user access limited to read only at best (files are often locked to specific applications).
  – Cannot update a shapefile that is being read by another application!
• Access limited to proprietary systems that understand structure.
  – Not all the shapefile format is published eg spatial index files. Why?
• File format created to satisfy particular functions.
  – New requirements need new programs to create/access.
  – End user not empowered to modify structure eg topological shapefile!
• Size Limitations: Often 32-bit programming limits or assumptions stop file size being greater than 2 GB (as is the case with the .shp file)


Databases - Definitions

• Data:
  – Known facts that can be recorded and have an implicit meaning.
• Database:
  – A collection of interrelated data.
• Database Management System (DBMS):
  – A software package/system to facilitate the creation and maintenance of a computerized database.
• Database System:
  – The DBMS software together with the data itself. Sometimes, the applications are also included.


Key DBMS Features...

• “Self-describing”
  – A DBMS catalog stores the description (meta-data) of the database.
  – Allows DBMS software to work with different databases (sales vs finance vs operations vs planning vs marketing).
• Data Abstraction
  – A data model is used to hide storage details and present the users with a conceptual view of the database.
• Multiple views of the data
  – User may be presented with views of a database that describe only the data of interest to that user.
• Provide multiple levels of security
• Standardised backup and recovery


DBMS Features (2)

• Share data through multiuser transaction processing
  – Allows sets of users to read data from, and to update data in, the database.
  – Concurrency control within the DBMS guarantees that each transaction is correctly executed or completely aborted.
• Two main types of processing
  – OLTP (OnLine Transaction Processing).
  – OLAP (OnLine Analytical Processing) eg data warehousing.
• Enable DBMS2DBMS Replication
  – Subscriber/Publisher, Master/Slave


DBMS Features (3)

• Insulate programs and data
  – Allows changing data storage structures (new indexing, data partitioning methods) and operations without having to change the DBMS access programs.
• Thus databases:
  – Manage data independently of applications and,
  – Hide physical implementation details from them.

“logical abstraction from physical implementation”.


Databases - So how do I access my data?

• Because “how” or “where” the data is stored physically on disk is not known to the application developer or casual user, data is accessed via “abstracted” languages and APIs.
• These enable a user/application to:
  – Define things through a Data Definition Language (DDL) ie create objects (tables, queries, indexes etc)
  – Manipulate data through a Data Manipulation Language (DML) ie update, delete and query data.
• These are then implemented in standardised application programmer’s interfaces such as ODBC/OLEDB/JDBC through drivers.
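To make the DDL/DML split concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for an ODBC/JDBC style driver. The `pipe` table, its columns and the index name are hypothetical and chosen only for illustration; the point is that the program never needs to know how or where the rows are stored.

```python
import sqlite3

# In-memory database: the application never learns how or where rows are stored.
conn = sqlite3.connect(":memory:")

# DDL: define objects (a table and an index). All names here are hypothetical.
conn.execute(
    "CREATE TABLE pipe (id INTEGER PRIMARY KEY, material TEXT, diameter_mm INTEGER)"
)
conn.execute("CREATE INDEX pipe_material_ix ON pipe (material)")

# DML: insert, update, delete and query data.
conn.executemany(
    "INSERT INTO pipe (id, material, diameter_mm) VALUES (?, ?, ?)",
    [(1, "PVC", 150), (2, "Steel", 300), (3, "PVC", 225)],
)
conn.execute("UPDATE pipe SET diameter_mm = 250 WHERE id = 3")
conn.execute("DELETE FROM pipe WHERE id = 2")

rows = conn.execute(
    "SELECT id, material, diameter_mm FROM pipe ORDER BY id"
).fetchall()
print(rows)
```

The same statements would be sent unchanged through any other driver; only the connection call differs.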


Applications and Databases

[Diagram labels: Master DBMS, Database, Business Systems, Map Viewer, Casual User, DDL & DML, WAN, Remote Office, Local DBMS, Database, DDL & DML]


DBMS - Advantages

• Minimises Data Redundancy by enabling Data Integration thus providing Data Consistency
• Data Sharing
• Data Independence
• Data Replication
• DDL & DML
• Data Accessibility
• Uniform Security, Privacy and Integrity Controls
• Eases application development
• Reduced program maintenance


Database Disadvantages

• Product Expense
  – Commercial products are expensive (though there are some excellent “open source” databases available eg PostgreSQL, MySQL etc).
• High Overheads
  – Maintenance and Support specialists
  – High end hardware
• Skills
  – Require trained users
• DBMS unnecessary when
  – Database and applications are simple, well defined, and not expected to change.
  – Stringent real-time requirements that may not be met because of DBMS overhead.
  – Multi-user access and editing is not required.
  – GIS data is sourced externally and has no role to play in transactional edits (eg auto-computation)
  – Specialist file formats provide for more efficient analysis of static data eg ESRI coverage for overlays for resource analysis


From Shadows to Reality: The role of Models...


Models of Reality

• But I’m getting ahead of myself…
• We can't just assume:
  – database == relational database
• All science and thus computing is based on observing the real world, and abstracting it through data models that attempt to represent it.


Model lexicon...

• Three types of models are pertinent to databases
  – Physical (low-level, internal)
    • Provide concepts that describe details of how data is stored in the computer. (Files implement this.)
  – Logical or Conceptual (high-level, semantic)
    • Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)
  – Implementation (representational)
    • Provide concepts that fall between the above two, balancing user views with some computer storage details.


Logical Modelling ...

• A set of building blocks for (logically) modelling the world is that of "Entities", "Relationships" and "Attributes"
  – Entity
    • A person, place, event or concept about which information is recorded.
    • More precisely, an entity is a set or collection of like individual objects called instances. An instance is a single occurrence of a given entity.
  – Attribute
    • Describes a characteristic of an entity
  – Relationship
    • An association between entities


Relationships...

• A relationship is described using the concepts of degree and existence:
  – Degree describes whether the relationship between two entities is 1:1, 1:Many or Many:Many
    eg a pipe is connected to zero, one or more other pipes;
  – Existence describes whether entities in a relationship are contingent on each other
    eg water in a pipe may be connected to another pipe; a valve must be connected to a pipe.

Topological relationships...

• However, with spatial entities, one extra relational property exists: topology:
  – A stop valve occurs at the end of a pipe, ie a “meet” relationship between 0-1 objects.

[Diagram: Pipe (#ID: integer, GEOMETRY: BIT) and Valve (#ID: Integer, GEOMETRY: bit), linked by “May Have Valves” / “Must Be Connected to” with cardinalities 1..4 and 0..2, annotated “Topological Relationship”]

Example Model...

• Example model: Tree Inventory Systems

Incarnating the model...


Realising Logical Models in Databases ....

• Databases use Implementation Models that represent and manage our logical models in computers.
  – Implemented models can range from simple (restrictive) to complicated.
  – Some differ in the method in which relationships (and data methods) are implemented.
  – Main ones are:
    • Hierarchical*
    • Network*
    • Relational
    • Object-Relational
    • Object Oriented*

*Not covered in this talk


Implementation Models of Interest

• Relational Model
  – Proposed in 1970 by E.F. Codd (IBM), first commercial system in 1981-82.
  – Now in several commercial products (DB2, ORACLE, SQL Server, SYBASE, INFORMIX).
  – MySQL (ISAM engine) is not a true relational database though it implements the basic model.
• Object-Relational Models
  – Most Recent Trend.
  – Commercial offerings started with Informix Universal Server. Exemplified in Oracle8i-11g, DB2 and (lately) SQL Server etc.
  – The “open source” database, PostgreSQL, is one of the pioneer O-RDBMSs through the work of Professor Michael Stonebraker.


Implementation Models: Relational

• The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper: "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970. (Earned him the ACM Turing Award.)

• The relational Model of Data is based on the concept of a Relation.

• A Relation is a mathematical concept based on the ideas of sets.

• The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations.


Relational: Key Concepts

Formal Terms            Informal Terms
Relation                Table
Attribute/Domain        Column
Tuple                   Row
Domain                  Values in a column
Schema of a Relation    Table Definition
Extension               Populated Table


Relational Model - Key Concepts (2)

• Tuples (rows) in a relation (table) are not considered to be ordered; nor is attribute order in a tuple.
• All values of attributes in a tuple are considered atomic (ie indivisible).
  – A special null value is used to represent values that are unknown or inapplicable to certain tuples.
• Constraints are conditions that must hold on all valid relation instances.
  – Three main types of constraints:
    • Key constraints
    • Entity integrity constraints
    • Referential integrity constraints


Relational Model: Constraints

• Key (single relation)
  – Is a set of attributes that uniquely identify tuples in a relation.
  – A relation can have more than one possible key ⇒ candidate keys. If a relation has several candidate keys, one is chosen arbitrarily to be the primary key.
• Entity Integrity (single relation)
  – Check or Column constraints
• Referential Integrity
  – Specifies a relationship among tuples in two relations: the referencing relation and the referenced relation.
  – Tuples in the referencing relation have attributes (called foreign key attributes) that reference the primary key of the referenced relation.
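The three kinds of constraint can be seen rejecting bad data in a small sketch. It uses Python's sqlite3 module purely for convenience, and borrows the earlier "a valve must be connected to a pipe" rule as its example; the table and column names are hypothetical. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys` is switched on, which is a quirk of that engine rather than of the relational model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE pipe (
    id          INTEGER PRIMARY KEY,                      -- key constraint
    diameter_mm INTEGER NOT NULL CHECK (diameter_mm > 0)  -- column (check) constraint
);
CREATE TABLE valve (
    id      INTEGER PRIMARY KEY,
    pipe_id INTEGER NOT NULL REFERENCES pipe (id)         -- referential integrity
);
""")
conn.execute("INSERT INTO pipe VALUES (1, 150)")
conn.execute("INSERT INTO valve VALUES (10, 1)")


def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    "duplicate key": rejected("INSERT INTO pipe VALUES (1, 200)"),
    "check failed": rejected("INSERT INTO pipe VALUES (2, -5)"),
    "orphan valve": rejected("INSERT INTO valve VALUES (11, 99)"),
}
print(results)
```

Because the rules live in the database, every application that writes to these tables is held to them, which is exactly what a shapefile's dbf cannot offer.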


Relational Model: Diagram


Relational Model: SQL

• Structured Query Language
  – Non-procedural language for defining and manipulating relations.
• NOTE: SQL is NOT necessarily relational
  – For example on previous slide:

    SELECT cus_fname, cus_initial, cus_lname, cus_phone
      FROM agent, customer
     WHERE ( agent.agent_fname   = 'John'
         AND agent.agent_initial = 'T'
         AND agent.agent_lname   = 'Okon' )
       AND customer.agent_code = agent.agent_code


Relational Model: Summary

• Advantages
  – Formal mathematical theory
  – Structural independence
  – Improved conceptual simplicity
  – Easier database design, implementation, management, and use
  – Ad hoc query capability with SQL
  – Powerful database management system
• Disadvantages
  – Substantial hardware and system software overhead
  – Poor design and implementation is made easy
  – May promote “islands of information” problems
  – Difficulties with strict relational model as applied to complex objects like vector shapes!


Spatial and Relational: How ?

• There is one method open to those who want to implement spatial data in a relational database.
  – The Hybrid Approach
    • Now an historical legacy
    • GIS examples: Autodesk Vision and ESRI ArcSDE
• Is equivalent to file based implementation
• Appropriate OpenGIS Standards:
  – Simple Features - SQL - Binary* Geometry (1.1)
  – Simple Features - SQL - Normalized Geometry (1.1)

*WKB is required storage format only for this schema: not for “SQL with Geometry Types” (see later)


Spatial Databases (1)

• Attributes in database
• Geometries in proprietary data files
• Poor integration
• No access to spatial data from IT

[Diagram labels: Attributes, Geometries, GIS, IT, SQL, blobs]

• Attributes in database
• Geometries in database - but in proprietary binary format
• IT can access geometries via proprietary interfaces only (ie APIs)
• Poor integration


Object-Relational described using spatial ...

Object-Relational Model

• Early relational database implementations did not handle arbitrary domains; in fact, they did not conceive of, nor implement, a flexible domain system based on data types + methods (an Object-Orientation concept).
  – Traditional Data Types
    • Numbers, Dates and Text
  – Made up of a, potentially, unknown storage format (cf Physical Model) and an “implied” set of methods or operators.
    • For example, the operators for numbers are:
      – + * / mod etc
• Object-Relational Databases are based on support for User Defined Types (UDT) and/or Abstract Data Types (ADT).


User Defined Data Types

• User-Defined Types are often called user-defined functions in other systems.
  – These are new datatypes that are, in essence, calculated functions of existing datatypes. Thus SpatialWare is built by extending the existing *varbinary* datatype. (Hence why the breaking of size limits in SQL Server 2005 was so important to MapInfo.)

Abstract Data Types

• The second one, Abstract Data Types, offers a way to add totally new datatypes into a relational system, with their own data access methods (e.g., index structures).
  – This is where the term post-relational or object-relational (O-RDBMS) comes from: the ability to "extend" the basic type system of a relational database to be more than numbers, dates and strings.
  – Though, even ADT developers are forced to use more fundamental data types for their implementation (eg Oracle's Sdo_Geometry uses Oracle Objects and VArrays).
• ADTs are found mainly in Oracle, DB2, Informix and PostgreSQL.


Spatial Databases (2)

• So, true Spatial Databases are based on ADT/UDT implementation of data types:
  – Often called: “SQL with Geometry Types”

[Diagram labels: GIS, eBusiness, Geometries, Attributes, SQL]

• Spatial is native DBMS type
• Attributes and geometries integrated in database
• Supported by all GIS
• Supported by eBusiness applications
• Spatial data queried using SQL, Java

What's in a name

• In essence, UDT or ADT refers to a data type that extends the SQL type system (the terms are often used interchangeably).
• Both can be used to define the column types for tables; this allows values stored in the columns of a table to be instances of the ADT/UDT.
• SQL functions may be declared to take ADT/UDT values as arguments, and return ADT/UDT values as results.
• However, the difference between UDT and ADT implementations is important where you want to inherit from the geometry object (needed for proper implementation of the "ISO Geometry Object Model").
  – ADTs allow sub-typing, UDTs do not.

(Concrete examples soon...)


Spatially Typed Databases

• All the following databases have “native” spatial implementations:
  – Oracle (Locator/Spatial)
  – PostgreSQL (Generally PostGIS is accepted as “official” implementation)
  – MySQL
  – IBM Informix Spatial DataBlade and IBM Extender
  – SQL Server 2008 (including Express Edition)
• Most of these are “free” and come with the database itself.
• There also exist some proprietary GIS data type implementations for these databases:
  – ESRI Spatial Type for SQL Server, Oracle, Informix, PostgreSQL
  – MapInfo Spatialware for SQL Server, Informix
  – Most of these cost money and are, invariably, heavily tied to their vendor's GIS technology stack.

Lies, Damned Lies and Standards...

Standards Bodies ...

• Like all things, there are some standards bodies that are defining applicable standards to control/support design and uptake of spatial databases:
  – Open Geospatial Consortium, Inc (OGC, formerly the Open GIS Consortium)
  – International Organization for Standardization (ISO)
  – World Wide Web Consortium (W3C) (XML/SVG…)

Applicable Spatial Standards...

• OpenGIS Standards (Latest)

  OpenGIS Document Title: OpenGIS Implementation Specification for Geographic Information - Simple Feature Access (ISO 19125)
  Version: 1.2
  Type: IS
  – Part 1: Common Architecture
    Supplies the common feature model for use by applications that will use the Simple Features data stores and access interfaces.
  – Part 2: SQL option
    Provides a standard SQL implementation of the abstract model in Part 1 that supports storage, retrieval, query and update of features. Includes Normalised, Binary and “SQL with Geometry Types”1 implementation options.
    1 Says nothing about physical storage format.

• ISO Standards (Latest)

  ISO Document Title
  – ISO/IEC CD 13249-3:2006(E) - Information technology - Database languages - SQL Multimedia and Application Packages - Part 3: Spatial, May 15, 2006.
  – ISO 19107, Geographic information - Spatial schema
  – ISO 19111, Geographic information - Spatial referencing by coordinates (Implemented in the EPSG collection of geodetic systems)

IS - Implementation Specification
DIS - Deprecated Implementation Specification
SAP - Specification Application Profile

Standards define: Schema for Geometry Types

Note: Many implementations have additional tables, e.g., Oracle provides the above but also includes CS_SRS, SDO_GEOM_METADATA_TABLE etc.

Standards Define: Geometry Type Hierarchy / Object Model

Note: Inheritance
Note: Class names in SQL/MM carry an "ST_" prefix. This is optional and implementations may choose to drop this prefix.

Standards: Orientation & Organisation

• SQL standards also define things like orientation of vertices in a polygon:
  – Anti-clockwise for all outer-shells
  – Clockwise for all inner-shells
• And polygon inversion/exversion and bowties
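One common way to test ring orientation is the signed area from the shoelace formula: a positive result means the vertices run anti-clockwise, a negative result clockwise. The sketch below is a plain Python illustration of that check, not any particular database's implementation; the function names and sample rings are made up for the example.

```python
def signed_area(ring):
    """Shoelace formula: positive for anti-clockwise rings, negative for clockwise.

    `ring` is a list of (x, y) vertices; the closing vertex need not be repeated.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_anticlockwise(ring):
    return signed_area(ring) > 0


outer = [(0, 0), (10, 0), (10, 10), (0, 10)]  # anti-clockwise outer shell
inner = [(2, 2), (2, 4), (4, 4), (4, 2)]      # clockwise inner shell (hole)

print(signed_area(outer), is_anticlockwise(outer))
print(signed_area(inner), is_anticlockwise(inner))
```

A ring that fails the expected test can simply be reversed before it is stored.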

ISO SQL/MM Part 3 Spatial ADT

CREATE TYPE ST_Geometry AS (
    ST_PrivateDimension           SMALLINT DEFAULT -1,
    ST_PrivateCoordinateDimension SMALLINT DEFAULT 2,
    ST_PrivateIs3D                SMALLINT DEFAULT 0,
    ST_PrivateIsMeasured          SMALLINT DEFAULT 0 )
  NOT INSTANTIABLE
  NOT FINAL
  METHOD ST_Dimension()
    RETURNS SMALLINT
    LANGUAGE SQL
    DETERMINISTIC
    CONTAINS SQL
    RETURNS NULL ON NULL INPUT

CREATE TYPE ST_Point UNDER ST_Geometry AS (
    ST_PrivateX DOUBLE PRECISION DEFAULT NULL,
    ST_PrivateY DOUBLE PRECISION DEFAULT NULL,
    ST_PrivateZ DOUBLE PRECISION DEFAULT NULL,
    ST_PrivateM DOUBLE PRECISION DEFAULT NULL )
  INSTANTIABLE
  NOT FINAL
  METHOD ST_X()
    RETURNS DOUBLE PRECISION
    LANGUAGE SQL
    DETERMINISTIC
    CONTAINS SQL
    RETURNS NULL ON NULL INPUT
etc

NOTE: No mention of WKB as storage format!

“standard does not prescribe a particular ADT mechanism, but specifies the behavior of the ADT's through a specification of interfaces that must be supported”

CREATE TYPE ST_LineString UNDER ST_Curve AS (
    ST_PrivatePoints ST_Point ARRAY [ST_MaxGeometryArrayElements] DEFAULT ARRAY[] )
  INSTANTIABLE
  NOT FINAL
  METHOD ST_NumPoints()
    RETURNS INTEGER
    LANGUAGE SQL
    DETERMINISTIC
    CONTAINS SQL
    RETURNS NULL ON NULL INPUT,
etc
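For readers more at home with object-oriented languages, the NOT INSTANTIABLE supertype with INSTANTIABLE subtypes UNDER it behaves much like an abstract base class. The following Python sketch is only an analogy: the class and method names mirror the SQL/MM names loosely and are otherwise hypothetical.

```python
from abc import ABC, abstractmethod


class STGeometry(ABC):
    """Analogue of ST_Geometry: NOT INSTANTIABLE, NOT FINAL."""

    @abstractmethod
    def st_dimension(self):
        """Topological dimension of the geometry (0 for points)."""


class STPoint(STGeometry):
    """Analogue of ST_Point UNDER ST_Geometry: INSTANTIABLE."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def st_dimension(self):
        return 0

    def st_x(self):
        return self.x


def describe(geom: STGeometry) -> str:
    # Any subtype can be passed where the supertype is expected (sub-typing).
    return f"{type(geom).__name__} with dimension {geom.st_dimension()}"


p = STPoint(1.0, 2.0)
print(describe(p))
```

Trying to construct `STGeometry()` directly raises a `TypeError`, which is the Python counterpart of the supertype not being instantiable.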

Oracle's UDT Implementation

SQL> desc mdsys.sdo_geometry
 Name                                      Null?    Type
 ----------------------------------------- -------- -------------------------
 SDO_GTYPE                                          NUMBER
 SDO_SRID                                           NUMBER
 SDO_POINT                                          MDSYS.SDO_POINT_TYPE
 SDO_ELEM_INFO                                      MDSYS.SDO_ELEM_INFO_ARRAY
 SDO_ORDINATES                                      MDSYS.SDO_ORDINATE_ARRAY

METHOD
------
 MEMBER FUNCTION GET_GTYPE RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION GET_DIMS RETURNS NUMBER
... etc …
METHOD
------
 MEMBER FUNCTION ST_COORDDIM RETURNS NUMBER

METHOD
------
 FINAL CONSTRUCTOR FUNCTION SDO_GEOMETRY RETURNS SELF AS RESULT
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 WKT                            CLOB                    IN
 SRID                           NUMBER                  IN     DEFAULT
... etc ...

SQL> desc sdo_ordinate_array
 SDO_ORDINATE_ARRAY VARRAY(1048576) OF NUMBER

Oracle's SQL/MM ADT Implementation

SQL> desc mdsys.st_curve
 mdsys.st_curve extends MDSYS.ST_GEOMETRY
 mdsys.st_curve is NOT FINAL
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------------
 GEOM                                               MDSYS.SDO_GEOMETRY
… etc …
METHOD
------
 MEMBER FUNCTION GET_WKB RETURNS BLOB

METHOD
------
 MEMBER FUNCTION ST_COORDDIM RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_ISVALID RETURNS NUMBER
… etc …
METHOD
------
 MEMBER FUNCTION ST_DISTANCE RETURNS NUMBER
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 G2                             ST_GEOMETRY             IN

METHOD
------
 MEMBER FUNCTION ST_INTERSECTS RETURNS NUMBER
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 G2                             ST_GEOMETRY             IN

Spatial Database Data Type Facts

Oracle
  Storage Format: Oracle Object consisting of numbers and arrays
  Storage Format Compliance: Arguably SQL3 components (arrays, numbers and objects)
  Dimensions: 3D + Measure
  Spatial Reference System: EPSG
  Geometric Organisation: OGC SFS; SQL/MM (circular arcs)
  Access Standard: Proprietary but open eg sdo_util.GetVertices; SDO_INSIDE(). Also SQL/MM

SQL Server 2008
  Storage Format: Proprietary (Actually .NET Common Language Runtime - CLR - Data Types)
  Storage Format Compliance: None
  Dimensions: 2.5D + Measure
  Spatial Reference System: EPSG
  Geometric Organisation: OGC SFS 1.1
  Access Standard: OGC 1.1 + Extensions eg MakeValid(), BufferWithTolerance(), Reduce() (Douglas-Peucker). Not Spatial Schema tables (eg GEOMETRY_COLUMNS and SPATIAL_REF_SYS)

PostgreSQL
  Storage Format: Extended WKB
  Storage Format Compliance: OGC SFS 1.1
  Spatial Reference System: EPSG
  Geometric Organisation: OGC SFS; SQL/MM (partially)
  Access Standard: SFS 1.1.0; SQL/MM + proprietary extensions eg ST_AddPoint(linestring, point, [<position>]); ST_Affine(geometry, float8, float8, float8, float8, float8, float8)

MySQL
  Storage Format: Proprietary
  Storage Format Compliance: None
  Spatial Reference System: ISO/EPSG
  Geometric Organisation: OGC SFS
  Access Standard: OGC 1.1

IBM DB2 Spatial Extender / Informix
  Storage Format: Smart Large Object (BLOB)
  Storage Format Compliance: Unknown
  Dimensions: 2.5D
  Spatial Reference System: ISO/EPSG
  Geometric Organisation: OGC SFS 1.1
  Access Standard: OGC 1.1 + extensions eg SE_VertexUpdate(), SE_VertexAppend(), SE_VertexDelete()

ISO vs OGC

ISO vs OGC (2)


Spatial Database Storage...

• Databases are not proprietary in the same sense that GIS-centric file formats are.
  – No one really worries about how a number is stored (ie IEEE) within a database as long as we can create, modify, delete and access the data via appropriate languages and standards to a desired precision.
  – This is called “logical separation from physical implementation”

Well Known Binary (WKB)

• Like Well Known Text (WKT), WKB is provided primarily as an interchange (exchange) format, not a storage format.
• From Standard (SFS 1.2 Part 1 Common Architecture):
  – “The Well-known Binary Representation for Geometry (WKBGeometry) provides a portable representation of a geometric object as a contiguous stream of bytes.”
  – “The Well-known Binary Representation for Geometry is obtained by serializing a geometric object as a sequence of numeric types drawn from the set {Unsigned Integer, Double} and then serializing each numeric type as a sequence of bytes using one of two well defined, standard, binary representations for numeric types...”
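As an illustration of that serialisation, a 2D point in WKB is a one-byte byte-order flag, an unsigned 32-bit geometry type code (1 for Point) and two doubles. The sketch below packs and unpacks such a point with Python's struct module; the helper names are hypothetical and only the plain 2D Point case is handled.

```python
import struct

WKB_POINT = 1  # geometry type code for Point in the Simple Features specification


def point_to_wkb(x, y, little_endian=True):
    """Serialise a 2D point: byte-order flag, uint32 type code, two doubles."""
    order = "<" if little_endian else ">"
    # Flag 1 = little endian (NDR), 0 = big endian (XDR).
    return struct.pack(order + "BIdd", 1 if little_endian else 0, WKB_POINT, x, y)


def wkb_to_point(data):
    """Read back a 2D point, honouring the byte-order flag in the first byte."""
    order = "<" if data[0] == 1 else ">"
    _, geom_type, x, y = struct.unpack(order + "BIdd", data)
    if geom_type != WKB_POINT:
        raise ValueError("not a WKB Point")
    return x, y


wkb = point_to_wkb(1.0, 2.0)
print(wkb.hex())
print(wkb_to_point(wkb))
```

The two byte orders are the "two well defined, standard, binary representations" the quotation refers to, which is why a reader must check the flag before decoding anything else.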


Effects of Spatial Databases

• The most immediate effect of the uptake of spatial databases is that the binary divide between enterprise and “GIS” data management, visualisation and processing has gone.
  – It is no longer external to the core information model and architecture of an organisation!
• Result:
  – We are freed from the myriad issues that come with a duality of data management and processing systems;
  – We can concentrate on building systems that address real business and customer needs delivering better outcomes for all.

Database use case studies ....

Business Need: Wildfire Management ....

Business Process: Fire Fighting

• Need live reporting of area burnt by a wildfire.
  – Statistic: Area (ha) of Tenure & Vegetation burned.
  – Data must be computed automatically as part of transaction.
  – Output inserted into “Daily Burn” table “attached” to wildfire “operation”
  – Email sent to user to view data in “Fires” application.
• Approach:
  – Traditional GIS: Program run within “fat GIS client”.
  – Ours: “Declarative” SQL & database-centric transaction processing….

Live Fire Mapping

• New fire boundary
  – Created by flying a helicopter around the fire edge recording boundary using GPS
  – Fly low to pick up CDMA access.
  – Open connection to database and insert area into database as a single record transaction.
  – Tenure/Vegetation that has been burned is then computed by “cookie cutter” SQL on next slide.

Live Fire Mapping: SQL

SELECT /*+ORDERED ORDERED_PREDICATES */
       a.OPERATION_TYPE_ID,
       a.OPERATION_ID,
       a.BURN_DATE,
       b.TENURE_CODE,
       b.VEGETATION_TYPE_CODE,
       ROUND(SUM(
         MDSYS.SDO_GEOM.SDO_AREA(
           MDSYS.SDO_GEOM.SDO_INTERSECTION(b.shape,c_diminfo,a.shape,c_diminfo),
           c_diminfo)) / 10000,2)
  FROM ops.burn_daily_incident a,
       ops.fire_tenveg b
 WHERE a.featureid = c_featureid
   AND MDSYS.SDO_RELATE(b.shape,a.shape,'mask=ANYINTERACT querytype=WINDOW') = 'TRUE'
   AND ( MDSYS.SDO_GEOM.SDO_INTERSECTION(b.shape,c_diminfo,a.shape,c_diminfo) is not null
         AND ROUND(
               MDSYS.SDO_GEOM.SDO_AREA(
                 MDSYS.SDO_GEOM.SDO_INTERSECTION(b.shape,c_diminfo,a.shape,c_diminfo),
                 c_diminfo),2) > 0.0 )
 GROUP BY a.OPERATION_TYPE_ID,
          a.OPERATION_ID,
          a.BURN_DATE,
          b.TENURE_CODE,
          b.VEGETATION_TYPE_CODE;

Live Fire Mapping (cont)

● Actual process:
  ● SQL statement is hosted inside an Oracle PL/SQL procedure BurnTenureVegtype()
  ● AFTER INSERT trigger on the fire boundary table places a request to execute procedure for specific record into the Oracle Job Queue/Scheduler:

      job_string := 'ops.BurnTenureVegtype('||to_char(v_featureid)||','''||USER||''');';
      DBMS_JOB.SUBMIT(jobno,job_string,SYSDATE,NULL);

● DBMS_JOB After Insert request makes the processing asynchronous to the actual insert of the fire shape.
  ● Generation of computed data is separated from transaction data.
  ● Because the clip out can take a bit of time when the shape is large... but mostly these clip outs take < a few seconds!
  ● When the procedure completes it sends an email to the person who put in the fire boundary with its performance statistics (Yes, straight out of the database!). An example of an email is:

      Operation: 12345
      Description: Burn area computations completed for recently updated burn area shape.
      Records: 15 written to burn_area table.
      Processing Time: 0 minutes 9 seconds.
      Action: Please use FIRES application to view tenure/veg/area values.
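The pattern (a cheap trigger that only queues a request, with the heavy work done later by a scheduler) can be imitated outside Oracle. The sketch below uses SQLite through Python as a stand-in: the `job_queue` table and the `run_pending_jobs` function are hypothetical substitutes for DBMS_JOB, and the "computation" is a placeholder rather than a real clip-out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fire_boundary (featureid INTEGER PRIMARY KEY, boundary_wkt TEXT);
CREATE TABLE job_queue (
    jobno     INTEGER PRIMARY KEY AUTOINCREMENT,
    featureid INTEGER,
    done      INTEGER DEFAULT 0
);
CREATE TABLE burn_area (featureid INTEGER, note TEXT);

-- The trigger only queues a request; it does no heavy computation itself.
CREATE TRIGGER fire_boundary_ai AFTER INSERT ON fire_boundary
BEGIN
    INSERT INTO job_queue (featureid) VALUES (NEW.featureid);
END;
""")


def run_pending_jobs(conn):
    """Stand-in for the job scheduler: process requests after the insert has committed."""
    jobs = conn.execute(
        "SELECT jobno, featureid FROM job_queue WHERE done = 0"
    ).fetchall()
    for jobno, featureid in jobs:
        # Placeholder for the real tenure/vegetation clip-out computation.
        conn.execute("INSERT INTO burn_area VALUES (?, ?)", (featureid, "computed"))
        conn.execute("UPDATE job_queue SET done = 1 WHERE jobno = ?", (jobno,))
    conn.commit()
    return len(jobs)


# The user's transaction: a single, fast insert of the fire boundary.
conn.execute("INSERT INTO fire_boundary VALUES (12345, 'POLYGON((0 0, 1 0, 1 1, 0 0))')")
conn.commit()

queued = conn.execute("SELECT COUNT(*) FROM job_queue WHERE done = 0").fetchone()[0]
processed = run_pending_jobs(conn)
print(queued, processed)
```

The person inserting the boundary waits only for the insert and the queue entry, never for the overlay itself.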

Techno Stuff: Language & Framework

• Fires
  – Database Tier: Oracle Spatial, PL/SQL email package.
  – Middle Tier: Apache Struts; JSP Business Components for Java (BC4J)
  – Client Tier: HTML (JSP)
• GIS license and deployment costs: $0

Business and Technical Need: Cool Views...

Trick – A Cool View

• Database stores polygons.
  – However, there can often be a need, for example in the generation of title diagrams, to annotate the sides of the polygon with bearings and distances
  – And that these bearings and distances change as the base polygon is edited.
  – This can be done with a trivial bit of PL/SQL and a view...

Types and Functions

CREATE OR REPLACE PACKAGE COGO
AS
  FUNCTION PI RETURN NUMBER;

  FUNCTION Bearing( dE1 in number,
                    dN1 in number,
                    dE2 in number,
                    dN2 in number)
    RETURN NUMBER DETERMINISTIC;

  FUNCTION Distance( dE1 in number,
                     dN1 in number,
                     dE2 in number,
                     dN2 in number)
    RETURN NUMBER DETERMINISTIC;
  ...
END COGO;

CREATE OR REPLACE PACKAGE GEOM
AS
  FUNCTION GetVector2D ( p_geometry in mdsys.sdo_geometry)
    RETURN CODESYS.Vector2DSetType DETERMINISTIC;
END GEOM;

CREATE OR REPLACE TYPE Coord2DType AS OBJECT ( x NUMBER, y NUMBER );
CREATE OR REPLACE TYPE Vector2DType AS OBJECT ( startCoord Coord2DType, endCoord Coord2DType );
CREATE OR REPLACE TYPE Vector2DSetType AS TABLE OF Vector2DType;

View

CREATE OR REPLACE VIEW apex_demo
AS
SELECT rownum AS gid,
       codesys.Cogo.DD2DMS(
         codesys.Cogo.Bearing(startx,starty,endx,endy) * (180/codesys.Cogo.PI) ) AS bearing,
       ROUND(codesys.Cogo.Distance(startx,starty,endx,endy),2) AS distance,
       MDSYS.sdo_geometry(2002,NULL,NULL,
                          MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),
                          MDSYS.SDO_ORDINATE_ARRAY(startx,startY,endX,endY)) AS geometry
  FROM ( SELECT DISTINCT
                c.StartCoord.X AS startX,
                c.StartCoord.Y AS startY,
                c.EndCoord.X   AS endX,
                c.EndCoord.Y   AS endY
           FROM ( SELECT geom FROM ProjPoly2D WHERE gid = 5 ) a,
                TABLE(CAST(codesys.Geom.GetVector2D(a.geom) AS codesys.Vector2DSetType)) c );
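The bodies of the COGO functions are not shown in the slides. As a rough illustration of what Bearing, Distance and DD2DMS might compute, here is a Python sketch. It assumes bearings are whole-circle, measured clockwise from grid north and returned in radians (which the view's multiplication by 180/PI suggests); the function names and the `side_annotations` helper are hypothetical.

```python
import math


def bearing(e1, n1, e2, n2):
    """Whole-circle bearing in radians, clockwise from grid north."""
    return math.atan2(e2 - e1, n2 - n1) % (2 * math.pi)


def distance(e1, n1, e2, n2):
    """Plane distance between two easting/northing pairs."""
    return math.hypot(e2 - e1, n2 - n1)


def dd_to_dms(dd):
    """Split non-negative decimal degrees into (degrees, minutes, seconds)."""
    degrees = int(dd)
    minutes_full = (dd - degrees) * 60
    minutes = int(minutes_full)
    seconds = round((minutes_full - minutes) * 60, 2)
    return degrees, minutes, seconds


def side_annotations(ring):
    """Return (bearing in degrees, distance) for each side of a closed ring."""
    out = []
    for (e1, n1), (e2, n2) in zip(ring, ring[1:]):
        out.append((round(math.degrees(bearing(e1, n1, e2, n2)), 6),
                    round(distance(e1, n1, e2, n2), 2)))
    return out


# A 100 x 100 square, closed by repeating the first vertex.
square = [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)]
for annotation in side_annotations(square):
    print(annotation)
```

Because the view recomputes these values from the stored geometry on every query, the annotations follow the polygon automatically whenever it is edited.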

Result

Techno Stuff: Language & Framework

● Property Bearing and Distance View
  ● Database Tier: Oracle Spatial
● GIS license and deployment costs: $0

Business Need: Viewpoint Editor...

Business System: Viewpoints

• High value (social, environmental, aesthetic) placed on landscape.

• Key tourism lookouts and other high quality landscape views represented by “viewpoints”.

• Simple Oracle table.
  – Data recorded by typing in northing/easting.
  – Sdo_Geometry (consumed by GIS viewing/mapping applications) constructed by triggers.
  – Other data values (eg district) computed by triggers.

Viewpoints: Trigger

Create or Replace Trigger Viewpoints_Bef_Upd
Before Update of Northing, Easting
On Viewpoints
For Each Row
When ( ( old.northing <> new.northing ) or ( old.easting <> new.easting ) )
Begin
  -- Rebuild the point geometry from the typed-in coordinates
  :new.shape := Sdo_Geometry(2001,NULL,
                             Sdo_Point_Type(:new.easting,:new.northing,NULL),
                             NULL,NULL);
  -- Derive other values (eg district) from the new location
  :new.district_Code := gis.geography.Which('DISTRICT', :new.easting,:new.northing);
  ….
End;

Techno Stuff: Language & Framework

• Viewpoints:
  – Database Tier: Oracle Spatial, PL/SQL packages & trigger processing.
  – Middle Tier: Apache Web Server, HtmlDB (now APEX)
  – Client Tier: HTML, Adobe Scalable Vector Graphics (SVG) plugin (spatial editing)
• GIS license and deployment costs: $0

Business Need: Centroid inside Polygon...

What do you mean?

• Many, many people have the need to generate centroids that are guaranteed to fall within a polygon.

• All standards-based implementations return a mathematically weighted centroid, which can fall outside a concave or multi-part polygon.

• How do we create these centroids such that everyone accessing the database can consume them?

Implementation: Pure SQL

To get Return On (your database) Investment – ROI - you need to “Know your SQL”!
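One common way to guarantee an interior point, sketched here in Python rather than SQL and not necessarily the method used in the presentation: compute the area-weighted centroid and, if it falls outside the polygon, run a horizontal line through it and take the midpoint of the widest span that lies inside. The function names are hypothetical and the sketch assumes a single simple ring with no holes.

```python
def weighted_centroid(ring):
    """Area-weighted centroid of a simple ring (shoelace formula)."""
    area2 = cx = cy = 0.0  # area2 accumulates twice the signed area
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)

def interior_spans(ring, y):
    """Sorted (left, right) x-intervals where the line at height y is inside."""
    xs = []
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the line
            xs.append(x1 + (y - y1) * (x2 - x1) / (y2 - y1))
    xs.sort()
    return list(zip(xs[0::2], xs[1::2]))

def centroid_inside(ring):
    """Weighted centroid if it is interior, else midpoint of the widest span."""
    cx, cy = weighted_centroid(ring)
    spans = interior_spans(ring, cy)
    for left, right in spans:
        if left < cx < right:
            return cx, cy
    left, right = max(spans, key=lambda s: s[1] - s[0])
    return (left + right) / 2.0, cy
```

For a U-shaped parcel the weighted centroid lands in the gap between the arms; the fallback moves it sideways into one arm while keeping the same y value.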

Deployment...

• Encapsulate inside a PL/SQL function/package.
• Wrap in a view (perhaps a materialised view) that can be consumed by any client.
• Use in a trigger to generate as each line/polygon is inserted/updated.

Techno Stuff: Language & Framework

• Centroid inside Polygon:
– Database Tier: Oracle Spatial, PL/SQL packages
– Client Tier: Independent

• GIS license and deployment costs: $0

Business Need: Zone of Influence...

Zone of Influence – Policy
• “... will not give approval for structures ... built over a sewer rising main or water main ... within distances specified ...”
• The result is the need for a Zone of Influence, calculated as the depth of trench plus half the trench width, adjusted by soil type.

[Figure: Zone of Influence cross-sections, each showing trench depth, trench width and a 300mm dimension. Clay Soils: zone extends 1 x Trench Depth either side of the trench width. Sand, Filled Ground and Loam: zone extends 2 x Trench Depth either side of the trench width.]

SELECT ....
       sdo_geom.sdo_convexhull(
         sdo_geom.sdo_arc_densify(
           sdo_geom.sdo_union(
             sdo_geom.sdo_buffer(c.input_geom, (c.input_trench_width /2.0),0.01),
             sdo_geom.sdo_buffer(c.output_geom,(c.output_trench_width/2.0),0.01),
             0.01),
           0.01,'arc_tolerance=0.1'),
         0.01) as geom
  FROM (SELECT ...
               ROUND(case when (b.pipe_diameter + (2.0 * b.input_trench_depth * b.input_soil_factor)) < 3.0
                          then 3.0
                          else (b.pipe_diameter + (2.0 * b.input_trench_depth * b.input_soil_factor))
                      end,3) as input_trench_width,
               .....) as output_trench_width,
               ...
          FROM ( SELECT ...
                        (SELECT /*+INDEX(s SP_SOIL_COHESIVENESS_GEOM)*/
                                case when s.cohesiveness = 'Cohesive' then 1 else 2 end
                                /* We use SDO_NN in case point is not within a soil polygon */
                           FROM mcw_gis.sp_soil_cohesiveness s
                          WHERE SDO_NN(s.geometry,
                                       sdo_geometry(2001,82469,sdo_point_type(v.startCoord.x,v.startCoord.y,NULL),NULL,NULL),
                                       'sdo_num_res=1') = 'TRUE'
                        ) as input_soil_factor,
                        ... ) as output_soil_factor,
                        (SELECT /*+INDEX(leb LM_EASEMENT_BOUNDARY_S)*/ 1
                           FROM mcw_land.lm_easement_boundary leb
                          WHERE ( SDO_ANYINTERACT(leb.geom,sdo_geometry(2001,82469,sdo_point_type(v.startCoord.x,v.startCoord.y,NULL),NULL,NULL)) = 'TRUE'
                               OR SDO_ANYINTERACT(leb.geom,sdo_geometry(2001,82469,sdo_point_type(v.startCoord.x,v.startCoord.y,NULL),NULL,NULL)) = 'TRUE' )
                            AND leb.id_easement_type = 3
                        ) as inEasement,
                        (SELECT case when m.diameter_outside is null or m.diameter_outside <= 225
                                     then 0.600
                                     else 1.000
                                 end
                           FROM MCW_SEWER.WW_SECTION_MODEL m
                          WHERE m.FID = p.FID_MODEL) as pipe_diameter,
                        ROUND((abs(v.startCoord.z) + 0.3),3) as input_trench_depth,
                        ROUND((abs(v.endCoord.z  ) + 0.3),3) as output_trench_depth,
                        sdo_geometry(2001,82469,sdo_point_type(v.startCoord.x,v.startCoord.y,NULL),NULL,NULL) as input_geom,
                        sdo_geometry(2001,82469,sdo_point_type(v.endCoord.x,  v.endCoord.y,  NULL),NULL,NULL) as output_geom
                   FROM mcw_sewer.ww_section p,
                        TABLE(mcw_sewer.zoi.GetVector(
                                SDO_LRS.CONVERT_TO_LRS_GEOM(
                                  (SELECT /*+ INDEX(l WW_LINE_IX2) */
                                          SDO_AGGR_CONCAT_LINES(l.geom)
                                     FROM mcw_sewer.ww_line l
                                    WHERE l.FID_ATTR = p.FID ),
                                  p.input_depth,
                                  p.output_depth))) v
                  WHERE p.id_function in (10004 /*gm*/,10007/*rmain*/,10008/*vmain*/)
                    AND p.input_depth is not null
                    AND p.output_depth is not null
               ) b
         WHERE b.inEasement is null
       ) c;

Algorithm in SQL

• Not processing those already in easements
• Compute soil type of input end
• Output soil type computation not shown
• Depth Interpolation function
• Custom Vectorisation function
• Predicates to select valid sewer lines
• WW_SECTION to WW_LINE (1:M) aggregation
• Start_Measure
• End_Measure
• Trench Width Calculations
• Compute trench depth
• Create point geometries
• Create final Zone of Influence polygon
• Compute pipe diameter
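The scalar arithmetic inside the statement (soil factor, pipe diameter, trench depth and trench width) can be followed more easily in isolation. The Python sketch below mirrors the CASE and ROUND expressions in the SQL; the function names are hypothetical, the units are whatever the source tables use, and the spatial steps (buffer, union, convex hull) are left out.

```python
def soil_factor(cohesiveness):
    """1 for cohesive soils, 2 otherwise (mirrors the CASE on s.cohesiveness)."""
    return 1 if cohesiveness == "Cohesive" else 2

def pipe_diameter(diameter_outside):
    """0.6 when the outside diameter is unknown or <= 225, else 1.0."""
    if diameter_outside is None or diameter_outside <= 225:
        return 0.6
    return 1.0

def trench_depth(z):
    """Absolute depth at one end of the pipe plus the 0.3 added in the SQL."""
    return round(abs(z) + 0.3, 3)

def trench_width(diameter, depth, factor, minimum=3.0):
    """Pipe diameter plus twice depth times soil factor, never below 3.0."""
    width = diameter + 2.0 * depth * factor
    return round(max(width, minimum), 3)

def buffer_radius(width):
    """Radius passed to the buffer around each end point (half the width)."""
    return width / 2.0
```

For a pipe end 1.2 below the surface in cohesive soil with a small pipe, the depth is 1.5 and the width 3.6, so the end point is buffered by 1.8; the same pipe in non-cohesive soil gives a width of 6.6.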

Output of SQL

Deployment Framework

• We now have a single SQL statement that generates all zone of influence polygons. (It takes around 10 minutes to create 20,000 polygons.)
• Deployment framework created:
– PL/SQL package developed

• Triggers handle attribute and spatial edits

Techno Stuff: Language & Framework

• Zone of Influence:
– Database Tier: Oracle Spatial, PL/SQL packages
– Client Tier: Independent

• GIS license and deployment costs: $0

Wrap up ...

Tips...

• Always design and deploy fully constrained data models using the database vendor's functionality in preference to external, third-party technology:
– Primary Key, Foreign Key and Unique constraints;
– Column and Table Check constraints, eg (PostGIS):
  • ALTER TABLE rbasin_point ADD CONSTRAINT contained_within_basin CHECK (AssertContains(geom, 'rbasin_area', 'geom'));
    (where AssertContains is a user-created function).
• Ensure all geometric data that is stored is valid:
– Consider validating each day's edits via an in-database scheduled task outside work hours.
• Replicate transactional data to an agreed proprietary file format in folders on shared servers via scheduled tasks (and advertise the dates/times when done).

Tips (continued)...

• Think about the deployment of a solution:
– Keep data tier things in the database to maximise reuse by multiple clients (eg Cool Views)
• Look to general IT frameworks, technologies and software to build geo-spatial solutions before adopting geo-spatial software vendor proprietary solutions.

• Keep spatial indexes healthy by regular drop/create via scheduled task.

• Not just know thy SQL, but also:
– Learn about your database's architecture;
– Become expert in the database's programming language:
  • PL/pgSQL; PL/SQL; T-SQL etc
– Collaborate with IT Department through integration on shared terms and goals.
• Ensure GIS Strategy aligns tightly with IT Strategy.

The End (thankfully!)

Thank you for being so patient.... Any Questions?

Simon Greener