integrating xquery and relational database systems

44
Integrating XQuery and Relational Database Systems

Post on 22-Dec-2015

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Integrating XQuery and Relational Database Systems

Integrating XQuery and Relational Database

Systems

Page 2: Integrating XQuery and Relational Database Systems

Xml and relational databases

Xml Advantages:

• simple

• unicode based

• platform independent syntax

• many parsers

• is able to represent structured data, semi-structured data and markup data.

Page 3: Integrating XQuery and Relational Database Systems

Structured data<customer>

<ID>C1</ID><FirstName>Janine</FirstName><LastName>Smith</LastName><Address>

<Street>1 Broadway Way</Street><City>Seattle</City><Zip>WA 98000</Zip>

</Address><Order>

<ID>01</ID><OrederDate>2003-01-21</OrederDate><Amount>7</Amount><ProductID>P3</ProductID>

</Order><Order>

<ID>02</ID><OrederDate>2003-06-24</OrederDate><Amount>3</Amount><ProductID>P3</ProductID>

</Order></customer>

Page 4: Integrating XQuery and Relational Database Systems

Semi-structured data<PatientRecord pid="P1">

<FirstName>Janine</FirstName><LastName>Smith</LastName><Address>

<Street>1 Broadway Way</Street><City>Seattle</City><Zip>WA 98000</Zip>

</Address><visit date="2002-01-05">

Janine came in with a <symptom>rash</symptom>. We idetified a <diagnosis>antibiotics allergy</diagnosis> and <remedy>Changed her cold prescription</remedy>.</visit>

</PatientRecord><PatientRecord>

<pid>P2</pid><Name>Nils Soerensen</Name><Address>23 NE 40th Street, New York</Address>

</PatientRecord>

Page 5: Integrating XQuery and Relational Database Systems

Markup data

<visit date="2002-01-05">

Janine came in with a <symptom>rash</symptom>. We idetified a <diagnosis>antibiotics allergy </diagnosis> and <remedy>Changed her cold prescription</remedy>.

</visit>

Page 6: Integrating XQuery and Relational Database Systems

Xml and realtional databases

• Relational DBs manage structured data and thay are being used by 80% of the market.

• Most relational DBs developped capabilities that fit XML structured data concept.

• Recently there have been attempts to deal with data that doesn’t fit that concept.

Page 7: Integrating XQuery and Relational Database Systems

Overview

• Relational Storage of XML: The Xml type

• Integrating Xquery and SQL: Querying XML datatypes

• Top-level XQuery

Page 8: Integrating XQuery and Relational Database Systems

Relational Storage of XML: The Xml type

• All LOB-based storage mechanisms provide a built in XML datatype. The SQL name for that built in datatype is XML.

• There are different logical models and physical representations for the XML datatype.

Page 9: Integrating XQuery and Relational Database Systems

Logical Models for the XML Datatype

• The standard doesn’t define the impelmentation of the XML datatype, only the requirement it should satisfy:Representing any element content, multiple top-level element nodes and top-level text nodes.

• It is possible to cast the XML datatype to string (serializing) and vice versa (parse) using SQL.

Page 10: Integrating XQuery and Relational Database Systems

Physical Models for the XML Datatype

• All the storage mechanisms assume that the data has been verified to be a well-formed instance of the XML datatype.

• XML storage fidelities:– String-level fidelity: code-point for code-point.– Infoset-level fidelity: the stored XML shares the

same inforamtion set as the original XML document.– Relational fidelity level: preserves information from

relational point of view and disgards XML-specific properties, such as document order.

Page 11: Integrating XQuery and Relational Database Systems

Character LOB (CLOB)With CLOB, the XML is stored in a character

representation.• The may preserve the original XML data exactly

(string-level fidelity).• The data may have been transformed by

changing the encoding of the content (infoset-level fidelity).

This method is not efficient:• It requires extensive indexing or requires every

query to parse the data before executing the query.

• Hard to update on a node level.

Page 12: Integrating XQuery and Relational Database Systems

Binary LOB (BLOB)

• Binary format provides more efficient index processing and compression, such as pre processed xml in the form of a DOM tree.

• BLOB could provide string-level fidelity, but it usually provides only infoset level.

• A BLOB provides an additional level of abstracion over CLOBs.

• Node-level updates are costly since the BLOB support in relational systems is not designed for logical updates.

Page 13: Integrating XQuery and Relational Database Systems

User-Defined TypeRelational database systems allow user to add

user defined physical presentation.• The format of the XML is defined by the user

(binary vs text etc..).• There are two major approaches for

supporting user-defined types:1. Deep integration approach: provides type

integration into the actual query processor. 2. Virtual Machine approach: provides an

external co-engine to help processing XML querries.

Page 14: Integrating XQuery and Relational Database Systems

Relational table mapping

XML data is being mapped to relational data and indices.

It often only provides relational fidelity.

• Node tables: represents every node in the XML instance as a row in the table.

• Table shredding: Can be used if additioanal structured information is available in the form of a schema.

Page 15: Integrating XQuery and Relational Database Systems

Node table exampleXML (untyped)<doc>

<customer cid="ALFKI"><Oreder oid="01"/>

</customer><customer cid="BONDO"></customer>

</doc>

Page 16: Integrating XQuery and Relational Database Systems

Shredded table exampleXML with a schema that describes Customer and Order

<Customer cid="ALFKI" customername="Alfred's Futterkiste"><Oreder oid="01" orderdate="2002-0-10"/>

</Customer><Customer cid="BONDO" customername="Bon Odessa"/>

Page 17: Integrating XQuery and Relational Database Systems

Typing an XML Datatype• Typing xml values according to xml schemas.• In order to provide type information from an xml

schema, the type relevant information must be managed in the metadata componant of the database.

• Relational databases enable mapping of xml namespace URIs to SQL idetifiers.

• example:CREATE XML SCHEMA POSchema NAMESPACE

N'<schema xmlns="http://www.w3.org/2001/XMLSchema"

targetNamespase=http://www.example.com/po-schema>

...

</schema>'

Page 18: Integrating XQuery and Relational Database Systems

Typing an XML Datatype• XML datatype instances are validated

according to the schemata in the database repository.

• There are three modes of validation:– Skip validation: no validation.– Lax validation: validation of a subtree only if there

is an applicable schema component.– Strict validation: requires that all data conform to the

schema.

• isvalid(XML instance, SchemaComponent,validation mode) boolean

Page 19: Integrating XQuery and Relational Database Systems

Typing an XML Datatype

• Static association vs. dynamic association – CREATE TABLE Customers (

CustomerID int PRIMARY KEY,CustomerName nvarchar(100),PurchaseOrders XML TYPED AS POSchema,OrderForm XML,OrderFormType nvarchar(1000)CHECK isvalid(OrderForm,OrderFormType,’strict’))

Page 20: Integrating XQuery and Relational Database Systems

Integrating XQuery and SQL: Querying XML datatypes

• SQL can retrieve the XML-typed column, but it can’t query it. That is where XQuery comes in. XQuery allows us to query and transforn the XML data.

• This section explains the integration of the two, so that we can invoke XQuery functionality from SQL, and provide information from the relational environment to the Xquery context.

Page 21: Integrating XQuery and Relational Database Systems

XQuery Functionality in SQL• We must be able to transform an XML

datatype instance to another XML datatype.

• We must have a way to extract information from an XML instance that fits into the SQL type system (scalar SQL types/XML datatype).

• Testing the XML structure for existence– value(XML, XQueryString, SQLType) SQLType

– exists(XML, XQueryString) boolen

– query(XML, XQueryString) XML

Page 22: Integrating XQuery and Relational Database Systems

XQuery Functionality in SQL

• Compiling the XQuery Expressions– The XQuery expressions could be given

dynamically or as constant strings. If the XQuery expression is given as a constant string, then relational systems can compile the XQuery at the same time as the SQL statement.

Page 23: Integrating XQuery and Relational Database Systems

Augmenting the XQuery Static Context

• The XQuery default collation is implicitly set to the relational collation of the XML datatype.

• The relational database system’s built in functions are added to the static function context, as are the SQL constructors for the built in types.

• The SQL built in types are added to the static type context, as are the types of the schema components in case of a statically constrained XML datatype.

Page 24: Integrating XQuery and Relational Database Systems

Providing Access to SQL Data inside XQuery

• Sometimes, an XQuery expression needs to access data from the relational realm.

• One way to access SQL variables in XQuery is by mapping the variables into the XQuery static variable context.Since there are more valid name characters in SQL than in XML, different encoding should take place. This may be done using the variable() built-in pseudo-function.

• Accessing an SQL column may be done using the column() built-in pseudo-function.

Page 25: Integrating XQuery and Relational Database Systems

Mapping SQL Types to XML Schema Types

• In order to be able to import the relational values into the XQeury context, the SQL types must be mapped to XML Schema.Example: Relational Type: INTEGER

maps to:<xs:simpleType name=“INTEGER”>

<xs:restriction base=“xsd:int”/><xs:simpleType>

• Since many relational systems provide additional built-in SQL types, they provide mappings that describe the implementation types.

Page 26: Integrating XQuery and Relational Database Systems

Mapping SQL Types to XML Schema Types

• The user must only provide the mapping for implementation specific namespaces.

• Problems:– Mapping of strings: one XML Schema type for

each length.– There are certain SQL character type values

that are invalid in XML (most of the low-range ASCII control characters).

Page 27: Integrating XQuery and Relational Database Systems

Adding XQuery Function Libraries

• Importing externally defined XQuery function libraries is similar to the XML Schema import model.

• A relational system can store XQuery function libraries by extending the metadata to allow them to be stored according to their namespace URI or an SQL idetifier. Such libraries would then be loaded and stored and imported into the XQuery static context when reffered to.

Page 28: Integrating XQuery and Relational Database Systems

CREATE XML FUNCTION NAMESPACEN'module "http://www.example.com.myfns"

declare namespace myf="http://www.example.com.myfns"define function myf:in-King-Country-WA

($zip as xs:integer)as xs:boolean

($zip < 90210 and $zip >= 90110)

define function myf:King-Country-WA-salestax($x as xs:decimal)as xs:decimal

{$x * 0.88}

SELECT CustomerNAme, query(PurchaseOrder, 'import module namespace myf="http://www.example.com.myfns" declare namespace po = "http://www.example.com.po-schema" for $p in /po:purchase-order, $d in $p/po:details let $net as xs:decimal := $d/@qty * $d/@price where myf:in-King-Country-WA($p/po:shipTo/op:zip) return <po-detail id="{$p/@id}"

total="{$net + myf:King-Country-WA-salestax($net)}"/>')

as po-detail-priceFROM Customers

Page 29: Integrating XQuery and Relational Database Systems

Physical mapping of XQuery

• General discussion on the different mapping strategies of XQuery into physical execution plans.

• We assume that the XQuery expressions are provided as constants; thus their compilation and SQL expressions’ can occur simultaniously.

Page 30: Integrating XQuery and Relational Database Systems

Physical mapping of XQuery

• Two major compiling and executing XQuery expressions methods:– Decoupled approach: works with a standalone

XQuery engine that is added to the relational system.The XQuery expression is passed to the XQuery processor, the XQuery is processed and the result is returned to the SQL environment.

Page 31: Integrating XQuery and Relational Database Systems

Physical mapping of XQuery

– Intergrated approach: Maps the XQuery expressions into logical operator trees that are integrated with the logical operator tree of the SQL statement. Complications:

• Xquery has a more complex, nested and sequence based data-model.

• Types and their operations are not mapped one to one.

Thus, relational systems must extend their expression services to deal with the above.

• Preserving input order in execution.

Page 32: Integrating XQuery and Relational Database Systems

Physical mapping of XQuery

Intergrated approach• cost based optimizer will have to choose

execution plans that preserve the XQuery semantics.

• Optimizer will often choose a bottom-up evaluation strategy, while the naïve execution startegy of XQuery is described top down. That may cause dynamic errors that would have been avoided in the top-down strategy.

Page 33: Integrating XQuery and Relational Database Systems

Example of the previous situation:for $i in //Awhere $i/@a castable as xs:integerreturn

for $j in $i/Bwhere xs:integer($i/@a)>10return $j

Optimization:for $i in //A, $j in $i/B

where $i/@a castable as xs:integer and xs:integer($i/@a)>10return $j

Page 34: Integrating XQuery and Relational Database Systems

Issues of combining SQL, XML Datatype and XQuery

• The need to know the two languages: SQL and XQuery.– XQuery users interested in querying across multiple

documents need to use SQL to iterate over the collection of documents.

– SQL users interested in querying into XML documents need to use XQuery.

Page 35: Integrating XQuery and Relational Database Systems

Issues of combining SQL, XML Datatype and XQuery

• Solutions, more problems and more solutions…

– The second problem can be avoided by shredding, but this mapping approach often only provides relational fidelity and thus does not address order preservation and markup scenarios. Thus either the SQL model should be extended, or The SQL model will be subsummed under the XML and Xquery model.

Page 36: Integrating XQuery and Relational Database Systems

Top-Level XQuery• Top-Level XQuery provides a way to query

collections of XML documents and forests without the need to use SQL.

• One advantage is the abillity to provide concurency and locking.

• The mid-tier model can do the above, however, if there’s already an XML datatype, it seems natural to extend the programming model to provide top-level Xquery support:XQUERY {XQuery expression}.

Page 37: Integrating XQuery and Relational Database Systems

XML Document (Fragmet) Collections

• There are two ways to provide XML collections.

– An XQuery function collection(). Used to refer to a column that is typed as XML, or the same function provided by the database.

• Another approach is to extend the notion of a table to allow the creation of a table of the XML datatype instead of a rowtype.

Page 38: Integrating XQuery and Relational Database Systems

CREATE TABLE PurchaseOrders OF TYPE XML TYPED AS POSchema

INSERT INTO PurchaseOrders SELECT PurchaseOrders FROM Customers

XQUERY-MODIFY { Insert

<po:purchase-order xmlns:po=http://www.example.com/po-schema id=“2001”> <po:shipTo> <po:address>42 2nd Avenue</ po:address>

<po:city>Bellevue</ po:city> <po:state>WA</ po:state> <po:zip>98006</ po:zip> <po:shipTo> </po:shipTo>

<po:details qty=“4” price=“3.44”/> </po:purchase-order>

at last intosql:collection(‘PurchaseOrders’)

}

XQUERY {

declare namespace po = http://www.example.com/po-schemasql:collection(‘PurchaseOrders’)/po:purchase-order/po:details

}

Page 39: Integrating XQuery and Relational Database Systems

XML Views over Relational Data

• There are two main mapping approaches from relational to XML:– Table-to-doc: Maps every table to an XML

document, where the root’s name is the table’s, and every row is mapped to an element named “row”.

– Table-to-forest: Maps a table into an XML element forest.

Page 40: Integrating XQuery and Relational Database Systems

XML Views over Relational Data

• The top-level XQuery environment can access these views, by doc(), collection() or other built in functions.

• Any of these views can be represented as a single XML document by making the top-level element(s) become the children of a document node.

• The result may not necessarily be well formed.

Page 41: Integrating XQuery and Relational Database Systems

XML Views over Relational Data

• Example of a possible built-in XQuery function provided by top-level XQuery:

sql:table($table as xs:string[,$table_map_option as xs:string[,$null_map_option as xs:string] ]) as

element*

– Use:• Sql:table(‘Order’,’table-to-forest’,’xsinil’)

Page 42: Integrating XQuery and Relational Database Systems

Conclusion and Issues

• Overview of how to integrate relational database systems with Xquery

– The XML datatype

– Mixed approach of using XQuery and SQL

– Top-level XQuery approach

– Insight into the imapct of the actual physical data model of the XML datatype on the processing of Xquery in the context of relational systems.

Page 43: Integrating XQuery and Relational Database Systems

Open Issues

• Which of the physical mappings of the XML data support XML inside a relational content the best?

• How should the query-processing model be extended?

• Concurancy control

Page 44: Integrating XQuery and Relational Database Systems

THE END!