XPath

Tao Wan

March 04, 2002

What is XPath?

- A language designed to be used by XSL Transformations (XSLT), XLink, XPointer and XML Query.
- Primary purpose: to address parts of an XML document, and to provide basic facilities for manipulating strings, numbers and booleans.

Outline

- Introduction
- Data Model
- XPath Syntax
  - Location Path
  - General XPath Expressions
  - Core Function Library
- XPath Utilities
- Conclusion

Introduction

- W3C Recommendation, November 16, 1999.
- Latest version: http://www.w3.org/TR/xpath
- XPath uses a compact, string-based syntax rather than an XML element-based syntax.
- It operates on the abstract, logical structure of an XML document rather than on its surface syntax.
- It uses a path notation (as in URLs) to navigate through this hierarchical tree structure.

Introduction (cont.)

- XPath models an XML document as a tree of nodes and defines a way to compute a string-value for each type of node.
- Supports namespaces.
- The expression (Expr) is the primary syntactic construct of XPath.
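To make the idea of a string-value concrete: for an element node, XPath 1.0 defines it as the concatenation of the text of all its text-node descendants in document order. The sketch below reproduces that rule with Python's standard xml.etree.ElementTree module. The AUTHOR_XML fragment and the string_value helper are illustrative names, not part of the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, written without whitespace between tags so that
# no whitespace-only text contributes to the string-value.
AUTHOR_XML = (
    "<author><first-name>Rick</first-name>"
    "<last-name>Hull</last-name></author>"
)


def string_value(element):
    """Concatenate all descendant text of an element in document order."""
    return "".join(element.itertext())


author = ET.fromstring(AUTHOR_XML)
print(string_value(author))                     # RickHull
print(string_value(author.find("first-name")))  # Rick
```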

Data Model

- The data model is the way XPath represents an XML document: as a tree of nodes.
- The tree contains seven types of nodes:
  - root node
  - element nodes
  - attribute nodes
  - namespace nodes
  - processing instruction nodes
  - comment nodes
  - text nodes
- The nodes of the tree are ordered by document order: the order in which each node begins in the XML document (for an element, the position of its start-tag).

Data Model Example

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="bib.xsl"?>
<!-- simple XML document -->
<bib>
  <book price="25.00" pages="400">
    <publisher>IDG books</publisher>
    <author>
      <first-name>Rick</first-name>
      <last-name>Hull</last-name>
    </author>
    <author>Simon North</author>
    <title>XML complete</title>
    <year>1997</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database</title>
    <year>1998</year>
  </book>
</bib>
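As a rough illustration of document order, the sample document can be loaded with Python's standard xml.etree.ElementTree module and its elements listed in the order their start-tags occur. This is only a sketch: ElementTree exposes just part of the XPath data model (its default parser drops comments and processing instructions), and the prolog is omitted from the BIB_XML string used here.

```python
import xml.etree.ElementTree as ET

# The sample document from the slide, without the prolog.
BIB_XML = """<bib>
  <book price="25.00" pages="400">
    <publisher>IDG books</publisher>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Simon North</author>
    <title>XML complete</title>
    <year>1997</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database</title>
    <year>1998</year>
  </book>
</bib>"""

root = ET.fromstring(BIB_XML)

# Element.iter() yields the element itself and then its descendants
# in document order, i.e. in the order of their start-tags.
document_order = [element.tag for element in root.iter()]
print(document_order)
```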

XPath Syntax

- The expression is the primary syntactic construct in XPath.
- An expression is evaluated to yield an object of one of four basic types:
  - node-set (an unordered collection of nodes without duplicates)
  - boolean (true or false)
  - number (a floating-point number)
  - string (a sequence of UCS characters)
- Expression evaluation occurs with respect to a context, which is specified by the host language (XSLT or XPointer).
- The location path is one important kind of expression. A location path selects a set of nodes relative to the context node.

Location Path

- A location path provides the mechanism for addressing parts of an XML document, similar to file system addressing.
  - Ex: /book/year selects the year element children of a book document element. In the sample document the document element is bib, so the corresponding path there is /bib/book/year.
- Every location path can be expressed using a straightforward but rather verbose syntax; common cases also have an abbreviated form:
  - Unabbreviated (verbose) syntax. Ex: child::* selects all element children of the context node.
  - Abbreviated syntax. Ex: * is equivalent to the unabbreviated child::*.

Location Path (cont.)

- A location path is composed of a series of location steps.
- There are two types of paths, relative and absolute:
  - Relative location path: a sequence of one or more location steps separated by /.
    Ex: child::bib/child::book selects the book element children of the bib element children of the context node.
  - Absolute location path: / optionally followed by a relative location path.
    Ex: / selects the root node of the document containing the context node.

Location Path Examples

- The verbose syntax has syntactic abbreviations for common cases.
- Examples (unabbreviated):
  - child::book selects the book element children of the context node
  - child::* selects all element children of the context node
  - attribute::price selects the price attribute of the context node
  - descendant::book selects all book descendants of the context node
  - self::book selects the context node if it is a book element (otherwise selects nothing)
  - child::*/child::book selects all book grandchildren of the context node
  - / selects the document root (which is always the parent of the document element)
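Several of these paths can be tried against the sample document with Python's standard xml.etree.ElementTree module. Note the assumptions: ElementTree implements only a subset of XPath and accepts only the abbreviated syntax, so each call below is annotated with the unabbreviated path it corresponds to, and attributes are read with get() rather than selected as nodes.

```python
import xml.etree.ElementTree as ET

BIB_XML = """<bib>
  <book price="25.00" pages="400">
    <publisher>IDG books</publisher>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Simon North</author>
    <title>XML complete</title>
    <year>1997</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database</title>
    <year>1998</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB_XML)  # context node: the bib element
first_book = bib[0]

books = bib.findall("book")         # child::book
children = bib.findall("*")         # child::*
price = first_book.get("price")     # attribute::price, context node: first book
authors = bib.findall(".//author")  # descendant::author
titles = bib.findall("*/title")     # child::*/child::title

print(len(books), len(children), price, len(authors))
print([title.text for title in titles])
```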

Location Steps

- A location step has three parts:
  - axis (specifies the relationship between the selected nodes and the context node)
  - node test (specifies the node type and expanded-name of the selected nodes)
  - predicates (arbitrary expressions that refine the selected set of nodes)
- The syntax for a location step is the axis name and node test separated by a double colon, followed by zero or more expressions, each in square brackets.
  - Ex: in child::book[position()=1], child is the name of the axis, book is the node test, and [position()=1] is a predicate.
- To evaluate a location step, generate an initial node-set from the axis (relationship to the context node) and the node test (node type and expanded-name), then filter that node-set by each of the predicates in turn.
  - Ex: descendant::book[position()=1] first selects all book element descendants of the context node, then keeps only the first of them.
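The evaluation order described above (axis, then node test, then each predicate in turn) can be sketched in a few lines of Python. This is a toy evaluator written for illustration, not an XPath implementation: it covers only element nodes and three forward axes, and the names axis_nodes and location_step, as well as the convention of passing predicates as Python callables, are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

BIB_XML = """<bib>
  <book price="25.00" pages="400"><title>XML complete</title></book>
  <book><title>Principles of Database</title></book>
</bib>"""


def axis_nodes(context, axis):
    """Return the element nodes on the given axis, in document order."""
    if axis == "self":
        return [context]
    if axis == "child":
        return list(context)
    if axis == "descendant":
        # iter() yields the context element first, then its descendants.
        return list(context.iter())[1:]
    raise ValueError(f"axis not covered by this sketch: {axis}")


def location_step(context, axis, name, predicates=()):
    """Evaluate axis::name[p1][p2]... relative to one context element."""
    # 1. The axis and the node test produce the initial node-set.
    nodes = [n for n in axis_nodes(context, axis) if name == "*" or n.tag == name]
    # 2. Each predicate filters the output of the previous one; the
    #    position and size are recomputed for every predicate.
    for predicate in predicates:
        size = len(nodes)
        nodes = [
            node
            for position, node in enumerate(nodes, start=1)
            if predicate(node, position, size)
        ]
    return nodes


bib = ET.fromstring(BIB_XML)

# descendant::book[position()=1]
first = location_step(
    bib, "descendant", "book", [lambda node, position, size: position == 1]
)
print([book.find("title").text for book in first])  # ['XML complete']
```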

Location Steps: Axes and Node Tests

- Axes: 13 axes are defined in XPath:
  - ancestor, ancestor-or-self
  - attribute
  - child
  - descendant, descendant-or-self
  - self
  - following
  - preceding
  - following-sibling, preceding-sibling
  - namespace
  - parent
- So far, the examples have used only the child, attribute, descendant and self axes.
- Node test:
  - Identifies the type and expanded-name of the nodes to select.
  - Can use a name, a wildcard or a node type test such as text() to check the type and name.
  - Ex: child::text() selects the text node children of the context node.
  - Ex: child::book selects the book element children of the context node.
  - Ex: attribute::* selects all attributes of the context node.
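A few of the axes not used in the earlier examples can be approximated in Python to show what they select. The sketch assumes xml.etree.ElementTree, whose elements hold no parent pointer, so a child-to-parent map is built first. It handles element nodes only, has no object for the XPath root node, and returns every result in document order. The helper names are illustrative.

```python
import xml.etree.ElementTree as ET

BIB_XML = """<bib>
  <book price="25.00" pages="400">
    <publisher>IDG books</publisher>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Simon North</author>
    <title>XML complete</title>
    <year>1997</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB_XML)

# Map every child element to its parent element.
parent_of = {child: parent for parent in bib.iter() for child in parent}


def ancestors(node):
    """ancestor axis: parent, grandparent, ... up to the document element."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result


def following_siblings(node):
    """following-sibling axis: later children of the same parent."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]


def preceding_siblings(node):
    """preceding-sibling axis: earlier children of the same parent."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)]


last_name = bib.find("book/author/last-name")
title = bib.find("book/title")

print([n.tag for n in ancestors(last_name)])       # ['author', 'book', 'bib']
print([n.tag for n in following_siblings(title)])  # ['year']
print([n.tag for n in preceding_siblings(title)])  # ['publisher', 'author', 'author']
```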

Location Step (cont.)

- Predicate:
  - A predicate filters a node-set with respect to an axis to produce a new node-set.
  - It is written as an XPath expression (normally a boolean expression) in square brackets following the basis (axis and node test).
  - Ex: child::book[attribute::price="25"] selects all book children of the context node that have a price attribute with the value 25. Because "25" is a string, the comparison is on strings, so a book with price="25.00" does not match; attribute::price=25 compares numerically and does match it.
- A PredicateExpr is evaluated by evaluating the Expr and converting the result to a boolean (true or false). If the result is a number, the predicate is true when that number equals the context position, so [3] means [position()=3].
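The difference between comparing an attribute as a string and as a number can be checked against the sample document. The sketch below assumes Python's xml.etree.ElementTree, whose attribute predicates compare strings only; the numeric comparison is therefore emulated in plain Python with float().

```python
import xml.etree.ElementTree as ET

BIB_XML = """<bib>
  <book price="25.00" pages="400"><title>XML complete</title></book>
  <book><title>Principles of Database</title></book>
</bib>"""

bib = ET.fromstring(BIB_XML)

# String comparison: "25.00" is not the same string as "25".
string_match = bib.findall("book[@price='25']")
exact_match = bib.findall("book[@price='25.00']")

# Numeric comparison, emulated: convert the attribute value to a number first.
numeric_match = [
    book for book in bib.findall("book[@price]")
    if float(book.get("price")) == 25
]

print(len(string_match), len(exact_match), len(numeric_match))  # 0 1 1
```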

Examples

- Axis and node test:
  - descendant::publisher selects the publisher elements that are descendants of the context node
  - attribute::* selects all attributes of the context node
- Basis and predicate:
  - child::book[3] selects the third book child of the context node
  - child::*[self::author or self::year][position()=last()] selects the last author or year child of the context node
  - child::book[attribute::page="400"][5] selects the fifth book child of the context node that has a page attribute with value 400
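Because predicates are applied one after another, their order matters. The following sketch emulates the second predicate example with ordinary Python list operations on the first book of the sample document, then swaps two predicates to show the effect. It is an emulation for illustration only; BOOK_XML and the variable names are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

BOOK_XML = (
    '<book price="25.00" pages="400">'
    "<publisher>IDG books</publisher>"
    "<author><first-name>Rick</first-name><last-name>Hull</last-name></author>"
    "<author>Simon North</author>"
    "<title>XML complete</title>"
    "<year>1997</year>"
    "</book>"
)

book = ET.fromstring(BOOK_XML)  # context node
children = list(book)           # child::*

# child::*[self::author or self::year][position()=last()]
author_or_year = [n for n in children if n.tag in ("author", "year")]
last_author_or_year = author_or_year[-1:]

# child::*[self::author][position()=last()]: the last of the author children
last_author = [n for n in children if n.tag == "author"][-1:]

# child::*[position()=last()][self::author]: the last child, if it is an author
last_child_if_author = [n for n in children[-1:] if n.tag == "author"]

print([n.tag for n in last_author_or_year])  # ['year']
print([n.text for n in last_author])         # ['Simon North']
print(last_child_if_author)                  # []
```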

Abbreviated Syntax

- The abbreviated syntax is the simpler way to express a location path.
- Abbreviations exist for the common cases, which can then be written concisely (but not for every case).
- Every abbreviation can be converted to its unabbreviated form:
  - child:: can be omitted from a location step (child is the default axis).
    Ex: bib/book is equivalent to child::bib/child::book
  - A location step of . is short for self::node().
    Ex: .//book is short for self::node()/descendant-or-self::node()/child::book
  - A location step of .. is short for parent::node().
    Ex: ../title is short for parent::node()/child::title
  - // is short for /descendant-or-self::node()/.
    Ex: book//author is short for child::book/descendant-or-self::node()/child::author
  - attribute:: can be abbreviated to @.
    Ex: book[@price="25"] is short for child::book[attribute::price="25"]
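The five rewriting rules above are mechanical enough to sketch as string manipulation. The expand function below is a deliberately naive illustration, not a parser: it assumes that predicates contain no / and that string literals contain no @ or //, and it does not add child:: to names that appear inside predicates. The function names are illustrative.

```python
def expand_step(step):
    """Expand one abbreviated location step."""
    if step == "":
        return ""  # empty piece produced by a leading "/"
    if step == ".":
        return "self::node()"
    if step == "..":
        return "parent::node()"
    head, bracket, rest = step.partition("[")
    if head.startswith("@"):
        head = "attribute::" + head[1:]
    elif "::" not in head:
        head = "child::" + head  # child is the default axis
    # Inside predicates, only the @ abbreviation is expanded.
    return head + bracket + rest.replace("@", "attribute::")


def expand(path):
    """Expand an abbreviated location path (simple cases only)."""
    path = path.replace("//", "/descendant-or-self::node()/")
    return "/".join(expand_step(step) for step in path.split("/"))


for abbreviated in ["bib/book", ".//book", "../title", "book//author",
                    'book[@price="25"]']:
    print(abbreviated, "->", expand(abbreviated))
```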

Expressions

- Function calls
- Node-sets
- Booleans
- Numbers
- Strings

Function Calls

- A function call expression is evaluated by using the FunctionName to identify a function in the function library of the expression evaluation context.
- Each argument is converted to the type the function requires:
  - to type string, as if by calling the string function
  - to type boolean, as if by calling the boolean function
  - to type number, as if by calling the number function
  - an argument that is not of type node-set cannot be converted to a node-set
- Ex: the position() function returns the position of the context node in the context node list, as a number.
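The boolean and number conversions follow simple rules in XPath 1.0, which the sketch below restates in Python. It is an approximation under stated assumptions: a node-set is modelled as a list of the string-values of its nodes in document order, the function names to_boolean and to_number are illustrative, and the conversion to string is left out because its number formatting rules are more involved.

```python
import math
import re

# An XPath 1.0 number literal: optional minus, digits, optional fraction,
# optionally surrounded by whitespace. No exponent and no leading "+".
_NUMBER = re.compile(r"[ \t\r\n]*-?(\d+(\.\d*)?|\.\d+)[ \t\r\n]*")


def to_boolean(value):
    """Convert as the boolean() function would."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0 and not math.isnan(value)  # zero and NaN are false
    # Strings and node-sets (lists here) are true when non-empty.
    return len(value) > 0


def to_number(value):
    """Convert as the number() function would."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        return float(value) if _NUMBER.fullmatch(value) else math.nan
    # Node-set: use the string-value of the first node in document order.
    return to_number(value[0]) if value else math.nan


print(to_number("25.00"))   # 25.0
print(to_boolean("false"))  # True, because the string is not empty
print(to_boolean([]))       # False, because the node-set is empty
```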


Node-sets

- A location path can be used as an expression.
- The expression returns the set of nodes selected by the path.


Booleans

- A boolean can have only two values: true or false.
- The following operators can be used in boolean expressions, or to combine two boolean expressions according to the usual rules of boolean logic:
  - or
  - and
  - =, !=
  - <=, <, >=, >
- Ex: book='XML complete' or book='Principles of Database'


Numbers

- A number represents a floating-point number; there are no pure integers in XPath.
- The basic arithmetic operators are +, -, *, div and mod.
- Ex: @id div 10
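Since every XPath number is a floating-point value, div and mod behave slightly differently from Python's operators: div never raises on division by zero, and mod truncates, so the result takes the sign of the left operand. The sketch below emulates both under those XPath 1.0 rules; xpath_div and xpath_mod are illustrative names.

```python
import math


def xpath_div(a, b):
    """Floating-point division that yields Infinity or NaN instead of raising."""
    a, b = float(a), float(b)
    try:
        return a / b
    except ZeroDivisionError:
        if a == 0.0 or math.isnan(a):
            return math.nan
        return math.copysign(math.inf, a) * math.copysign(1.0, b)


def xpath_mod(a, b):
    """Remainder of a truncating division (sign follows the left operand)."""
    a, b = float(a), float(b)
    if math.isnan(a) or math.isnan(b) or math.isinf(a) or b == 0.0:
        return math.nan
    return math.fmod(a, b)


print(xpath_div(25, 10))  # 2.5, e.g. @id div 10 with id="25"
print(xpath_div(1, 0))    # inf
print(xpath_mod(5, -2))   # 1.0
print(xpath_mod(-5, 2))   # -1.0, whereas Python's -5 % 2 is 1
```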


Strings

- A string consists of a sequence of zero or more characters.
- String literals may be enclosed in either single or double quotes.
- Comparison operators: =, !=


Core Function Library

- XPath defines a core set of functions for use in expressions.
- All implementations of XPath must implement the core function library.
- There are four types of functions:
  - Node-set functions: operate on or return information about node-sets.
  - String functions: used for basic string operations.
    Ex: substring("12345", 0, 3) returns "12"
  - Boolean functions: all return true or false.
  - Number functions: used for basic number operations.
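The substring result above looks surprising until the rule is spelled out: character positions start at 1, and the function keeps every character whose position p satisfies round(start) <= p < round(start) + round(length). With a start of 0 and a length of 3, only positions 1 and 2 qualify. The sketch below restates that rule in Python; xpath_round and xpath_substring are illustrative names.

```python
import math


def xpath_round(x):
    """Round to the nearest integer, with halves going towards +infinity."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.floor(x + 0.5))


def xpath_substring(text, start, length=None):
    """Emulate substring(); positions are 1-based."""
    first = xpath_round(float(start))
    if length is None:
        return "".join(ch for pos, ch in enumerate(text, start=1) if pos >= first)
    end = first + xpath_round(float(length))
    # Comparisons with NaN are always false, which yields an empty string.
    return "".join(ch for pos, ch in enumerate(text, start=1) if first <= pos < end)


print(xpath_substring("12345", 0, 3))      # 12
print(xpath_substring("12345", 2, 3))      # 234
print(xpath_substring("12345", 1.5, 2.6))  # 234
```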


XPath Utilities

- Miscellaneous utilities related to XPath: http://www.xmlsoftware.com/xpath/
- XPath Visualiser:
  - A powerful tool for evaluating an XPath expression and presenting the resulting node-set visually.
  - It lets you experiment with XPath to find the correct expression.
  - The display of the XML source document is similar to the default IE display, with the same syntax coloring and collapsible and expandable container nodes.
  - It makes learning XPath very straightforward.


XPath Visualiser

[Screenshot of XPath Visualiser. Labeled regions: tree view of the XML document, XPath input, XPath evaluation result, context node. The result is highlighted.]

Conclusion

- XPath is a complete pattern-matching language.
- It provides a concise way of addressing parts of an XML document.
- It is the base for XSLT, XPointer and the work of the XML Query WG.
- It is supported by the W3C.
- Using XPath mainly requires learning the abbreviated syntax of location path expressions and the functions of the core library.

Reference

- XML Path Language (XPath) Version 1.0: http://www.w3.org/TR/xpath
- XML in a Nutshell: http://www.oreilly.com/catalog/xmlnut/chapter/ch09.html
- Managing XML and Semistructured Data: http://www.cs.washington.edu/homes/suciu/COURSES/590DS/06xpath.htm
- XPath utilities: http://www.xmlsoftware.com/xpath/