
SQL Server 2005: Deep Dive On XML And XQuery

Michael Rys
DAT405
Program Manager, SQL Server XML Technologies
Microsoft Corporation

2

XML And Relational Data Today

[Diagram: XML documents held in the File System alongside Relational Data stores; label: Query and Combine]

3

XML Scenarios

Data Exchange
    Business-to-business (B2B), business-to-consumer (B2C), application-to-application (A2A)
    XML is a ubiquitous, extensible, platform-independent transport format
Document Management
    XHTML, Office XML Documents
Messaging
    Simple Object Access Protocol (SOAP), RSS
Mid-Tier Collaboration
Ad-hoc modeling of semistructured data
    Storing objects with sparse or multi-valued properties that do not fit well into traditional relational schemata

→ Transport, Store, and Query XML data

4

XML Or Relational?

Data Characteristics         | XML | Relational
Flat Structured Data         |     |
Hierarchical Structured Data |     | Not First Class: PK-FK with cascading delete
Semi-structured Data         |     | Not First Class
Mark-up Data                 |     | Not First Class: FTS
Order preservation           |     | Not First Class
Recursion                    |     | (Recursive query)

5

XML And Relational!

Scenarios                | XML                                  | Relational
Relational Data Exchange | Use as transport, shred to relational | Storage and Query
Document Management      | Use as markup, store natively        | Provides framework to manage collections and relationships; provides Full-text search
Semi-structured Data     | Represent semi-structured parts      | Represent structured parts
Message audit            | Store natively                       | Used for querying over promoted properties
Object serialization     | Store natively                       | Used for querying over promoted properties

6

SQL Server 2005 XML Architecture

[Architecture diagram. XML side: XML Parser, Validation against a Schema Collection (XML Schemata), XML data type (binary XML), XQuery, XML-DML. Relational side: Rowsets, Node Table, PRIMARY XML INDEX, PATH Index, PROP Index, VALUE Index. Bridges between the two: OpenXML/nodes() and FOR XML with TYPE directive.]

7

Why XQuery?

SQL does not understand XML
XPath 1.0
    W3C Recommendation
    Used in SQL Server 2000: SQLXML and OpenXML
    Navigation, no reshaping
    Limited knowledge about types
XSLT
    W3C Recommendation
    Data-driven reshaping (uses XPath)
    MSXML, System.XML
    Hard to author and optimize for large amounts of data
No XML data modification language (DML)

8

What Is XQuery?

Queries and transforms trees
Functional, declarative query language
Combines XPath with node construction
Operates on (XML Schema-)typed and unconstrained XML
Designed to operate on large amounts of data
    Optimizable
Current Status: In final Last Call
    Recommendations in H2 CY2006
    Fulltext and DML extensions will follow later

XQuery Introduction

10

Key XQuery Features

FLWOR: FOR / LET / WHERE / ORDER BY / RETURN
Includes XPath 2.0 (/doc[@id = 123])
Element constructors (<topic>{…}</topic>)
Order-preserving operators
    Input order (FLWR)
    Document order (XPath, union)
Statically (or dynamically) typed
    Strong typing with schema, weak typing without schema

[Diagram: the FLWOR clauses (FOR, LET, WHERE, ORDER BY, RETURN) set against the SQL clauses SELECT, FROM, WHERE, ORDER BY, and WITH & SET]
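The features listed on this slide can be sketched with the query() method of the XML data type. This is a minimal illustration, not taken from the deck: the sample document and its element and attribute names are assumptions, and LET is left out because SQL Server 2005 is understood not to implement that clause.

```sql
-- Hypothetical sample document; the names docs, doc, id and title are illustrative only.
DECLARE @doc xml;
SET @doc = N'<docs>
  <doc id="123"><title>Indexing</title></doc>
  <doc id="7"><title>Typing</title></doc>
</docs>';

-- FOR iterates in input order, WHERE filters, ORDER BY sorts,
-- and RETURN builds a new element with an element constructor.
SELECT @doc.query('
  for $d in /docs/doc
  where $d/@id > 10
  order by $d/title[1]
  return <topic>{ data($d/title) }</topic>
') AS topics;
```

The path expressions inside the FLWOR body are ordinary XPath steps, so the same query mixes navigation (document order) with iteration (input order).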

11

XQuery Type System

3 Classes of Item Types:
    Node types: element(), attribute(), comment() etc.
    Element content types: xs:anyType, user-defined (e.g., my:CustomerT)
    Atomic types: built-in and user-defined (e.g., xs:int, my:hatSize)
XQuery uses XML Schema for content and atomic types
    "Untyped" data have special types (e.g., xdt:untypedAtomic)
XML Schema (W3C standard)
    Rich mechanism for type definitions and validation constraints
    Can be used to constrain XML documents
    XML Schema Collections will be used for typing (meta-data)
Benefits of typed data
    Guarantees shape of data
    Provides type-specific semantics
    Allows storage and query optimizations

12

Static Typing In XQuery

Type Inference: Infers type of Expression during compilation
Type Check: Inferred Type is subtype of expected type
Benefits:
    Compile-time type error discovery
    Guarantees correct type at runtime
    More efficient execution
Costs:
    Sometimes type inference is less precise than the data will be (inferring a list on /a[1]/b, although there will always be only one b)
    Requires more explicit casts and "pick first" (/a[1]/b[1])
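The cost described above shows up most often with value(), which needs a statically provable singleton. A small sketch, assuming an untyped variable with made-up element names a and b:

```sql
-- Hypothetical untyped instance; without a schema the compiler cannot know
-- that <a> has exactly one <b> child.
DECLARE @x xml;
SET @x = N'<a><b>1</b></a>';

-- Expected to be rejected at compile time, because /a[1]/b is inferred
-- as a possible list (kept commented out so the batch runs):
-- SELECT @x.value('/a[1]/b', 'int');

-- "Pick first" makes the singleton explicit, so static typing succeeds.
SELECT @x.value('/a[1]/b[1]', 'int') AS b_step_by_step;
SELECT @x.value('(/a/b)[1]', 'int')  AS b_first_of_all;
```

The two accepted forms differ in meaning when there are several a elements: the first picks the first b of the first a, the second picks the first b in the whole document.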

13

XML Data Modification

XQuery extensions: Insert, update, and delete
XML sub-tree modification:
    Add or delete XML sub-trees
    Update values
Generate consistent state

14

XML-DML: insert, delete, replace value of

[Diagram: Customer element with a child name: xs:string ("Janine") and a child Order with id: xs:int (42); notes elements mark the possible insert positions; the name value becomes "Nils"]

insert <notes/> into /Customer
insert <notes/> as last into /Customer
insert <notes/> as first into /Customer
insert <notes/> before /Customer/name
insert <notes/> after /Customer/name

delete /Customer/Order[id = 42]

replace value of (/Customer/name)[1] with "Nils"

Target needs to be statically one node
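The three statements can be tried on an XML variable through modify(). The sketch below is an assumption-laden adaptation rather than the slide's exact text: the instance is untyped, so the insert target is wrapped as (/Customer)[1] to be statically one node, and the replace targets the text node of name, since an untyped element has no simple-typed value to replace.

```sql
-- Untyped stand-in for the Customer tree on the slide.
DECLARE @c xml;
SET @c = N'<Customer><name>Janine</name><Order><id>42</id></Order></Customer>';

-- insert: the target must be statically a single node, hence (/Customer)[1].
SET @c.modify('insert <notes/> as first into (/Customer)[1]');

-- replace value of: on untyped XML the text node is addressed directly.
SET @c.modify('replace value of (/Customer/name/text())[1] with "Nils"');

-- delete: accepts any node sequence, so no singleton wrapper is needed.
SET @c.modify('delete /Customer/Order[id = 42]');

SELECT @c AS customer_after_dml;
```

With a schema collection that declares name as xs:string, the slide's shorter form (/Customer/name)[1] would be the natural target instead.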

XQuery And XML-DML In SQL Server 2005

16

XQuery And XML-DML In SQL Server 2005

Subset of XQuery implemented
    Is aligned with July 2004 XQuery working draft
Added XML Data Modification
Applies to single XML data type instance
    Methods on XML data type: query(), value(), exist(), modify(), nodes()
Use SQL to iterate over collection of instances (XML-typed column)
Can refer to relational data
Takes advantage of Schema-collection information to operate on typed XML data
Will make use of XML indices for optimization

17

XQuery Methods

query() creates a new, untyped XML data type instance
value() extracts an XQuery value into the SQL value and type space
    Expression has to statically be a singleton
    String value of atomized XQuery item is cast to SQL type
    SQL type has to be SQL scalar type (no XML or CLR UDT)
exist() returns 1 if the XQuery expression returns at least one item, 0 otherwise
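A short sketch of the three read-only methods side by side, assuming a made-up Customer document (none of the names come from a real schema):

```sql
-- Hypothetical untyped instance used only for illustration.
DECLARE @x xml;
SET @x = N'<Customer><name>Janine</name><Order><id>42</id></Order></Customer>';

SELECT
  -- query(): returns a new, untyped XML instance
  @x.query('/Customer/Order')                     AS order_fragment,
  -- value(): needs a static singleton and a SQL scalar target type
  @x.value('(/Customer/name)[1]', 'nvarchar(50)') AS customer_name,
  -- exist(): 1 if the expression yields at least one item, otherwise 0
  @x.exist('/Customer/Order[id = 42]')            AS has_order_42;
```

exist() is usually the better choice in a WHERE clause, because it only has to establish that a match is present rather than extract and convert a value.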

18

XQuery: nodes()

Provides OpenXML-like functionality on XML data type column in SQL Server 2005
Returns a row per selected node
Each row contains a special XML data type instance that
    References one of the selected nodes
    Preserves the original structure and types
    Can only be used with the XQuery methods (not modify()), count(*), and IS (NOT) NULL
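As a sketch of the row-per-node behavior, the following shreds a hypothetical Customer document into one row per Order. The table and column aliases o and ord are arbitrary names chosen for this example.

```sql
-- Hypothetical instance with two Order children.
DECLARE @x xml;
SET @x = N'<Customer>
  <Order><id>42</id></Order>
  <Order><id>43</id></Order>
</Customer>';

-- nodes() yields one row per selected node; the ord column can only be
-- consumed through the XQuery methods, here value() and query().
SELECT o.ord.value('(id)[1]', 'int') AS order_id,
       o.ord.query('.')              AS order_xml
FROM @x.nodes('/Customer/Order') AS o(ord);
```

Paths inside value() and query() are evaluated relative to the node each row references, which is why (id)[1] needs no leading slash.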

19

sql:column()/sql:variable()

Map SQL values and types into XQuery values and types in the context of XQuery or XML-DML

sql:variable(): accesses a SQL variable/parameter
    declare @value int
    set @value = 42
    select * from T
    where T.x.exist('/a/b[@id=sql:variable("@value")]') = 1

sql:column(): accesses another column value
    tables: T([key] int, x xml), S([key] int, val int)
    select * from T join S on T.[key] = S.[key]
    where T.x.exist('/a/b[@id=sql:column("S.val")]') = 1

Restrictions in SQL Server 2005: No XML, CLR UDT, datetime, or deprecated text/ntext/image
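A self-contained variant of the two queries, using table variables so that it runs without any existing schema. The table variables @T and @S, their id columns, and the sample rows are assumptions made for this sketch.

```sql
-- Hypothetical stand-ins for the tables T and S on the slide.
DECLARE @T TABLE (id int PRIMARY KEY, x xml);
DECLARE @S TABLE (id int PRIMARY KEY, val int);
INSERT INTO @T VALUES (1, N'<a><b id="42"/></a>');
INSERT INTO @T VALUES (2, N'<a><b id="7"/></a>');
INSERT INTO @S VALUES (1, 42);
INSERT INTO @S VALUES (2, 99);

DECLARE @value int;
SET @value = 42;

-- sql:variable(): the SQL variable is visible inside the XQuery predicate.
SELECT t.id
FROM @T AS t
WHERE t.x.exist('/a/b[@id = sql:variable("@value")]') = 1;

-- sql:column(): the predicate compares against a column of the joined row.
SELECT t.id
FROM @T AS t
JOIN @S AS s ON t.id = s.id
WHERE t.x.exist('/a/b[@id = sql:column("s.val")]') = 1;
```

Passing the value through sql:variable() keeps the XQuery string constant, which avoids building query text by string concatenation.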

20

XQuery: modify()

Used with SET:
    declare @xdoc xml
    set @xdoc.modify('delete /a/b[@id="42"]')

    update T set T.xdoc.modify('insert <b/> into /a')
    where T.id = 1

Relational row-level concurrency: whole XML instance is locked
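The UPDATE form can be sketched against a table variable. This is an illustration under assumptions: @T and its rows are invented, the column is referenced without a table prefix in SET, and the insert target is written as (/a)[1] so that it is statically one node on untyped XML.

```sql
-- Hypothetical table with an untyped XML column.
DECLARE @T TABLE (id int PRIMARY KEY, xdoc xml);
INSERT INTO @T VALUES (1, N'<a><b id="42"/></a>');

-- modify() appears in the SET clause; only the row with id = 1 is touched.
UPDATE @T
SET xdoc.modify('insert <b id="43"/> as last into (/a)[1]')
WHERE id = 1;

UPDATE @T
SET xdoc.modify('delete /a/b[@id = "42"]')
WHERE id = 1;

SELECT xdoc FROM @T WHERE id = 1;
```

Each UPDATE is a normal relational statement, so the usual row-level locking and transaction rules apply to the whole XML instance in that row.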

21

Combined SQL And XQuery/DML Processing

[Processing diagram for SELECT x.query('…'), y FROM T WHERE …
Static Phase: SQL Parser, Algebrization, Static Typing on the SQL side; XQuery Parser, Static Typing, Algebrization on the XQuery side, using XML Schema Collection metadata; then Static Optimization of the combined Logical and Physical Operation Tree.
Dynamic Phase: Runtime Optimization and Execution of the physical Op Tree, using XML and rel. Indices.]

22

XML Indices

Create XML index on XML column
    CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc)
Create secondary indexes on tags, values, paths
Speed up queries
    Results can be served directly from index
    SQL's cost-based optimizer will consider index
Primary and Secondary Indices will be efficiently maintained during updates
    Only subtree that changes will be updated
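For reference, a sketch of the full DDL around the slide's statement. The docs table definition and the secondary index names are assumptions; the slide only gives the primary index statement.

```sql
-- Hypothetical base table; XML indexes require a clustered primary key.
CREATE TABLE docs (
  id   int PRIMARY KEY,
  xDoc xml
);

-- Primary XML index, as on the slide.
CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc);

-- Secondary XML indexes are built on top of the primary one.
CREATE XML INDEX idx_1_path  ON docs (xDoc) USING XML INDEX idx_1 FOR PATH;
CREATE XML INDEX idx_1_value ON docs (xDoc) USING XML INDEX idx_1 FOR VALUE;
CREATE XML INDEX idx_1_prop  ON docs (xDoc) USING XML INDEX idx_1 FOR PROPERTY;
```

Whether all three secondary indexes pay off depends on the query workload, since each one adds maintenance cost on every modification.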

23

Example Index Contents

insert into Person values (42,
'<book ISBN="1-55860-438-3">
  <section>
    <title>Bad Bugs</title>
    Nobody loves bad bugs.
  </section>
  <section>
    <title>Tree Frogs</title>
    All right-thinking people <bold>love</bold> tree frogs.
  </section>
</book>')

24

Primary XML Index

CREATE PRIMARY XML INDEX PersonIdx ON Person (Pdesc)

Assumes typed data; Columns and Values are simplified, see VLDB 2004 paper for details

PK | XID   | TAG ID      | Node      | Type-ID       | VALUE                     | HID
42 | 1     | 1 (book)    | Element   | 1 (bookT)     | null                      | #book
42 | 1.1   | 2 (ISBN)    | Attribute | 2 (xs:string) | 1-55860-438-3             | #@ISBN#book
42 | 1.3   | 3 (section) | Element   | 3 (sectionT)  | null                      | #section#book
42 | 1.3.1 | 4 (title)   | Element   | 2 (xs:string) | Bad Bugs                  | #title#section#book
42 | 1.3.3 | --          | Text      | --            | Nobody loves bad bugs.    | #text()#section#book
42 | 1.5   | 3 (section) | Element   | 3 (sectionT)  | null                      | #section#book
42 | 1.5.1 | 4 (title)   | Element   | 2 (xs:string) | Tree Frogs                | #title#section#book
42 | 1.5.3 | --          | Text      | --            | All right-thinking people | #text()#section#book
42 | 1.5.5 | 7 (bold)    | Element   | 4 (boldT)     | love                      | #bold#section#book
42 | 1.5.7 | --          | Text      | --            | tree frogs                | #text()#section#book

25

Architectural Blueprint: Indexing

[Diagram.
XML Column in table T(id, x): rows with id 1, 2, 3, each holding Binary XML.
Primary XML Index (1 per XML column), clustered on Primary Key (of table T), XID; columns: PK, XID, NID, TID, VALUE, LVALUE, HID, xsinil, …
Non-clustered Secondary Indices (n per primary Index): Value Index, Path Index, Property Index.]

XQuery Optimizations With XML Indices

27

Take-Away: XML Indices

PRIMARY XML Index: use when there is a lot of XQuery
FOR VALUE: useful for queries where values are more selective than paths, such as //*[.="Seattle"]
FOR PATH: useful for path expressions; avoids joins by mapping paths to hierarchical index (HID) numbers. Example: /person/address/zip
FOR PROPERTY: useful when the optimizer chooses another index (e.g., on a relational column, or an FT Index) in addition, so the row is already known
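The three query shapes named above might look as follows. The table person_docs, its columns, and the index names are hypothetical, and which index the optimizer actually picks depends on its cost estimates, so this only shows the kind of query each secondary index is aimed at.

```sql
-- Hypothetical table and indexes, defined here so the sketch is self-contained.
CREATE TABLE person_docs (id int PRIMARY KEY, x xml);
CREATE PRIMARY XML INDEX px_person ON person_docs (x);
CREATE XML INDEX sx_person_value ON person_docs (x) USING XML INDEX px_person FOR VALUE;
CREATE XML INDEX sx_person_path  ON person_docs (x) USING XML INDEX px_person FOR PATH;
CREATE XML INDEX sx_person_prop  ON person_docs (x) USING XML INDEX px_person FOR PROPERTY;

-- Value-driven search with an imprecise path: the FOR VALUE shape.
SELECT id FROM person_docs
WHERE x.exist('//*[. = "Seattle"]') = 1;

-- Fully specified path: the FOR PATH shape.
SELECT id FROM person_docs
WHERE x.exist('/person/address/zip') = 1;

-- Row already located through the relational key, then a property is read:
-- the FOR PROPERTY shape.
SELECT x.value('(/person/address/zip)[1]', 'nvarchar(10)') AS zip
FROM person_docs
WHERE id = 42;
```

Comparing the execution plans of these queries with and without each secondary index is the practical way to confirm that an index is being used.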

29

Session Summary

SQL Server 2005 provides XQuery and XML DML on XML datatype
    XQuery subset based on July 2004 WD
    Typing provided by XML Schema collections on XML datatype
    Node-based Data Manipulation Language (DML)
    Integrates with relational processing
Optimization:
    Using extended relational algebra and query optimizer
    Indexing of XML datatype

30

Community Resources

At PDC
    DAT Track lounge: I'll be there daily
After PDC
    MSDN dev center: http://msdn.microsoft.com/SQL/2005
    XML and Databases whitepapers: http://msdn.microsoft.com/XML/BuildingXML/XMLandDatabase/
    Online WebCasts: http://msdn.microsoft.com/sql/2005/2005webcasts/
    Newsgroups & Forum:
        news:microsoft.public.sqlserver.xml
        http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=89
    My E-mail: [email protected]
    My Weblog: http://www.sqljunkies.com/weblog/mrys

Please fill out Session Evaluation

31

© 2005 Microsoft Corporation. All rights reserved.

This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.