1
Quick Intro to XPath
Roger L. Costello, 14 December 2012
2
Objective
• XML Schema 1.1 uses XPath a lot, so if you don't know XPath then you're at a disadvantage.
• The purpose of this short tutorial is to teach you enough XPath that you won't be at a disadvantage.
3
XPath is not a standalone language
• XPath requires a host language. There are currently several XML languages that host XPath.
4
XPath is not a standalone language
XPath
XSLT
XQuery
XML Schema
XPointer
Schematron
5
<?xml version="1.0"?>
<FitnessCenter>
    <Member>
        <Name>Jeff</Name>
        <FavoriteColor>lightgrey</FavoriteColor>
    </Member>
    <Member>
        <Name>David</Name>
        <FavoriteColor>lightblue</FavoriteColor>
    </Member>
    <Member>
        <Name>Roger</Name>
        <FavoriteColor>lightyellow</FavoriteColor>
    </Member>
</FitnessCenter>

This XML document can be represented as a tree, as shown below.

Document /
    PI <?xml version="1.0"?>
    Element FitnessCenter
        Element Member
            Element Name
                Text Jeff
            Element FavoriteColor
                Text lightgrey
        Element Member
            Element Name
                Text David
            Element FavoriteColor
                Text lightblue
        Element Member
            Element Name
                Text Roger
            Element FavoriteColor
                Text lightyellow
6
Terminology - node
[Figure: the slide 5 tree with each kind of node labeled]
The tree contains four kinds of nodes:
Document node
Processing Instruction (PI) node
Element nodes
Text nodes
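The node kinds can be seen by walking the tree in code. This is a minimal sketch using Python's standard xml.dom.minidom module, which is an assumption of this example and not a tool used in the tutorial; the XML is written without indentation so that no whitespace-only text nodes appear. Note that minidom reports no node for the <?xml version="1.0"?> line, which the slides draw as a PI: strictly it is the XML declaration, not a true processing instruction.

```python
from xml.dom import Node, minidom

# The slide 5 document, compacted onto one logical line.
XML = (
    '<?xml version="1.0"?>'
    "<FitnessCenter>"
    "<Member><Name>Jeff</Name><FavoriteColor>lightgrey</FavoriteColor></Member>"
    "<Member><Name>David</Name><FavoriteColor>lightblue</FavoriteColor></Member>"
    "<Member><Name>Roger</Name><FavoriteColor>lightyellow</FavoriteColor></Member>"
    "</FitnessCenter>"
)

# Human-readable names for the node types that occur in this document.
KIND = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
}


def walk(node, depth=0, out=None):
    """Return one indented 'Kind label' line per node, in document order."""
    out = [] if out is None else out
    label = node.data if node.nodeType == Node.TEXT_NODE else node.nodeName
    out.append("    " * depth + KIND[node.nodeType] + " " + label)
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out


lines = walk(minidom.parseString(XML))
print("\n".join(lines))
```

The printed outline has the same shape as the tree on slide 5, minus the PI line.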
7
[Figure: the slide 5 tree with one node and its children highlighted]
With respect to this node, these are its children.
8
[Figure: the slide 5 tree with a node's descendants highlighted]
These are its descendant nodes.
9
[Figure: the slide 5 tree with the context node highlighted]
This is the context node (the first Member element, as the sibling counts on slides 12 to 14 imply).
10
[Figure: the slide 5 tree with the context node's parent highlighted]
That's its parent.
11
[Figure: the slide 5 tree with the context node's ancestors highlighted]
Those are its ancestors.
12
[Figure: the slide 5 tree with the context node's siblings highlighted]
It has 2 siblings.
13
[Figure: the slide 5 tree with the context node's following siblings highlighted]
They are following-siblings.
14
[Figure: the slide 5 tree with the context node highlighted]
It has no preceding-siblings.
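The relationships from slides 9 to 14 can be checked in code. This is a sketch under two assumptions: that the context node is the first Member element (the only node in the tree with two following siblings and no preceding ones), and that Python's xml.dom.minidom is an acceptable stand-in for an XPath engine. The variable names are illustrative.

```python
from xml.dom import minidom

# The slide 5 document, compacted so there are no whitespace text nodes.
XML = (
    "<FitnessCenter>"
    "<Member><Name>Jeff</Name><FavoriteColor>lightgrey</FavoriteColor></Member>"
    "<Member><Name>David</Name><FavoriteColor>lightblue</FavoriteColor></Member>"
    "<Member><Name>Roger</Name><FavoriteColor>lightyellow</FavoriteColor></Member>"
    "</FitnessCenter>"
)

doc = minidom.parseString(XML)
context = doc.getElementsByTagName("Member")[0]  # assumed context node

# parent: one step up
parent = context.parentNode.nodeName

# ancestors: keep stepping up until the document node has been passed
ancestors = []
node = context.parentNode
while node is not None:
    ancestors.append(node.nodeName)
    node = node.parentNode

# following siblings: later children of the same parent
following = []
node = context.nextSibling
while node is not None:
    following.append(node.nodeName)
    node = node.nextSibling

# preceding siblings: earlier children of the same parent
preceding = []
node = context.previousSibling
while node is not None:
    preceding.append(node.nodeName)
    node = node.previousSibling

print(parent)     # FitnessCenter
print(ancestors)  # ['FitnessCenter', '#document']
print(following)  # ['Member', 'Member']
print(preceding)  # []
```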
15
Here are the capabilities of XPath
• XPath provides a syntax for:
    – navigating around an XML document
    – selecting nodes and values
    – comparing node values
    – performing arithmetic on node values
• XPath provides some functions (e.g., concat(), substring(), etc.) to facilitate the above.
16
<?xml version="1.0"?>
<Document classification="secret">
    <Para classification="unclassified">
        One if by land, two if by sea;
    </Para>
    <Para classification="confidential">
        And I on the opposite shore will be, Ready to ride and spread the alarm
    </Para>
    <Para classification="unclassified">
        Ready to ride and spread the alarm Through every Middlesex, village and farm,
    </Para>
</Document>

This XML document can be represented as a tree, as shown below.

Document /
    PI <?xml version="1.0"?>
    Element Document
        Attribute classification="secret"
        Element Para
            Attribute classification="unclassified"
            Text One if …
        Element Para
            Attribute classification="confidential"
            Text And I …
        Element Para
            Attribute classification="unclassified"
            Text Ready to

See document.xml in the xpath folder, within the examples folder.
17
Execute XPath using Oxygen XML
Type your XPath expression here
Change this to XPath 1.0
18
Use XPath Builder for long XPath expressions
19
Please Run the XPath Expressions
• The following slides contain XPath expressions.
• It's important that you copy the expression on the slide and paste it into Oxygen XML to see what the expression does.
• First, copy the XML document on slide 16, save it to a file, then drag and drop the file into Oxygen XML.
20
Select all Para Elements
/Document/Para
21
/Document/Para
This is an absolute XPath expression
22
Establish a Context Node
Click on this to establish it as the "context node" (any XPath expressions will be relative to it)
23
Relative XPath Expression
In Oxygen XML click on <Document> to establish the "context node" and then type this in the XPath box:
Para
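A relative expression gives different results depending on the context node. This is a small sketch using Python's xml.etree.ElementTree, which is an assumption of this example: it supports only a subset of XPath and always evaluates relative to an element, so the element you call findall() on plays the role of the context node. The document text is compacted from slide 16.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<Document classification="secret">'
    '<Para classification="unclassified">One if by land, two if by sea;</Para>'
    '<Para classification="confidential">And I on the opposite shore will be, '
    "Ready to ride and spread the alarm</Para>"
    '<Para classification="unclassified">Ready to ride and spread the alarm '
    "Through every Middlesex, village and farm,</Para>"
    "</Document>"
)

# Context node: the <Document> element.
document = ET.fromstring(DOC)
paras_from_document = document.findall("Para")

# New context node: the first <Para>. It has no Para children,
# so the same relative expression now selects nothing.
first_para = paras_from_document[0]
paras_from_para = first_para.findall("Para")

print(len(paras_from_document))  # 3
print(len(paras_from_para))      # 0
```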
24
Select all Para Elements
//Para
descendants (the // searches at any depth below the document node)
25
Select the first Para
//Para[1]
26
Select the last Para
//Para[last()]
27
Select the classification attribute of the first Para
//Para[1]/@classification
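The positional predicates can be tried outside Oxygen XML as well. This sketch assumes Python's xml.etree.ElementTree, which understands [1] and [last()] but cannot evaluate an attribute step such as /@classification, so the attribute is read with get() instead. The leading "." is required by ElementTree for a // search.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<Document classification="secret">'
    '<Para classification="unclassified">One if by land, two if by sea;</Para>'
    '<Para classification="confidential">And I on the opposite shore will be, '
    "Ready to ride and spread the alarm</Para>"
    '<Para classification="unclassified">Ready to ride and spread the alarm '
    "Through every Middlesex, village and farm,</Para>"
    "</Document>"
)

document = ET.fromstring(DOC)

first = document.find(".//Para[1]")      # //Para[1]
last = document.find(".//Para[last()]")  # //Para[last()]

# Stand-in for //Para[1]/@classification
first_classification = first.get("classification")

print(first.text)            # One if by land, two if by sea;
print(last.text)
print(first_classification)  # unclassified
```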
28
Is the Document element’s classification top-secret?
/Document/@classification = 'top-secret'
29
Is the Document element’s classification top-secret or secret?
(/Document/@classification = 'top-secret') or (/Document/@classification='secret')
30
Logical Operators
A or B
A and B
not(A)
31
Select all Para’s with a secret classification
//Para[@classification = 'secret']
32
Check that no Para has a top-secret classification
not(//Para[@classification = 'top-secret'])
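The comparisons and attribute predicates on slides 28 to 32 can be mirrored in a few lines. This is a sketch assuming Python's xml.etree.ElementTree; it supports the [@attribute='value'] predicate, while the boolean logic (or, not) is done in Python, not inside the path expression.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<Document classification="secret">'
    '<Para classification="unclassified">One if by land, two if by sea;</Para>'
    '<Para classification="confidential">And I on the opposite shore will be, '
    "Ready to ride and spread the alarm</Para>"
    '<Para classification="unclassified">Ready to ride and spread the alarm '
    "Through every Middlesex, village and farm,</Para>"
    "</Document>"
)

document = ET.fromstring(DOC)
doc_class = document.get("classification")

# /Document/@classification = 'top-secret'
is_top_secret = doc_class == "top-secret"

# (... = 'top-secret') or (... = 'secret')
is_top_secret_or_secret = doc_class == "top-secret" or doc_class == "secret"

# //Para[@classification = 'secret']
secret_paras = document.findall(".//Para[@classification='secret']")

# not(//Para[@classification = 'top-secret']): true when nothing is selected
no_top_secret_para = not document.findall(".//Para[@classification='top-secret']")

print(is_top_secret)            # False
print(is_top_secret_or_secret)  # True
print(len(secret_paras))        # 0
print(no_top_secret_para)       # True
```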
33
Establish a New Context Node
Make the second Para the context node
34
Select the Following Siblings
following-sibling::*
35
Select the First Following Sibling
following-sibling::*[1]
36
Add Another Element
Add this <Test> element after the last Para
37
Select the Following Para Siblings
following-sibling::Para
38
Select all Following Siblings
following-sibling::*
39
Select all Preceding Siblings
preceding-sibling::*
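The sibling axes on slides 34 to 39 amount to slicing the parent's list of children around the context node. This sketch assumes Python's xml.etree.ElementTree, which has no sibling axes of its own, and an empty <Test/> element as a hypothetical stand-in for the element added on slide 36 (its real content is not shown in the transcript).

```python
import xml.etree.ElementTree as ET

DOC = (
    '<Document classification="secret">'
    '<Para classification="unclassified">One if by land, two if by sea;</Para>'
    '<Para classification="confidential">And I on the opposite shore will be, '
    "Ready to ride and spread the alarm</Para>"
    '<Para classification="unclassified">Ready to ride and spread the alarm '
    "Through every Middlesex, village and farm,</Para>"
    "<Test/>"  # hypothetical content for the element added on slide 36
    "</Document>"
)

document = ET.fromstring(DOC)
children = list(document)
context = children[1]            # the second Para is the context node
i = children.index(context)

following = children[i + 1:]                              # following-sibling::*
first_following = following[0] if following else None     # following-sibling::*[1]
following_paras = [e for e in following if e.tag == "Para"]  # following-sibling::Para
preceding = children[:i]                                  # preceding-sibling::*

print([e.tag for e in following])        # ['Para', 'Test']
print([e.tag for e in following_paras])  # ['Para']
print([e.tag for e in preceding])        # ['Para']
```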
40
Make Document the Context
Click on Document to make it the context node.
41
Equivalent!
Para[1]
child::Para[1]
42
Make Para[2] the context
Establish this as the context node.
43
Get parent element's classification
../@classification
44
Equivalent!
../@classification
parent::*/@classification
45
Axis
following-sibling
preceding-sibling
child
parent
ancestor
descendant
self
46
Count the number of Para elements
count(//Para)
47
Count the number of Para elements with secret classification
count(//Para[@classification = 'secret'])
48
Does the first Para element contain the string “SCRIPT”?
contains(//Para[1], 'SCRIPT')
49
Select all nodes containing the string “SCRIPT”
//node()[contains(., 'SCRIPT')]
The node() test matches these nodes:
- element
- text
- comment
- processing instructions (PIs)
Note that it does not match these nodes:
- attribute
- document
50
Count the number of nodes containing the string “SCRIPT”
count(//node()[contains(., 'SCRIPT')])
51
Select the first 20 characters of the first Para
substring(//Para[1], 1, 20)
52
What's the length of the content of the first Para?
string-length(//Para[1])
53
Convert Document’s classification to lowercase
translate(/Document/@classification, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')
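For readers more familiar with a general-purpose language, the counting and string functions on slides 46 to 53 have close Python analogues. This is only a sketch of the correspondence, assuming Python's xml.etree.ElementTree to load the document; the Para text here is compacted, so the lengths differ from what Oxygen XML reports for document.xml, where each Para also contains leading and trailing whitespace.

```python
import string
import xml.etree.ElementTree as ET

DOC = (
    '<Document classification="secret">'
    '<Para classification="unclassified">One if by land, two if by sea;</Para>'
    '<Para classification="confidential">And I on the opposite shore will be, '
    "Ready to ride and spread the alarm</Para>"
    '<Para classification="unclassified">Ready to ride and spread the alarm '
    "Through every Middlesex, village and farm,</Para>"
    "</Document>"
)

document = ET.fromstring(DOC)
paras = document.findall(".//Para")

para_count = len(paras)                   # count(//Para)

# String value of the first Para: all of its text concatenated.
first_text = "".join(paras[0].itertext())

has_script = "SCRIPT" in first_text       # contains(//Para[1], 'SCRIPT')
first_20 = first_text[:20]                # substring(//Para[1], 1, 20)
length = len(first_text)                  # string-length(//Para[1])

# translate(@classification, 'ABC...', 'abc...') maps each uppercase letter
# to the lowercase letter at the same position.
table = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
lowered = document.get("classification").translate(table)

print(para_count)  # 3
print(has_script)  # False
print(first_20)    # One if by land, two
print(length)      # 30
print(lowered)     # secret
```

XPath positions start at 1, so substring(x, 1, 20) corresponds to the Python slice [0:20].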
54
Add a new <Cost> element
Add this element and establish Document as the context node.
55
Multiply Cost by 2
Cost * 2
56
N mod X = the remainder of dividing N by X
Cost mod 2
57
Arithmetic Operators
+
- (leave space on either side)
*
div
mod
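The arithmetic operators can be compared with their Python counterparts. This is a sketch with a hypothetical Cost value of 25, since the element added on slide 54 is not shown in the transcript. One difference worth knowing: XPath's mod takes the sign of the dividend, like math.fmod, whereas Python's % takes the sign of the divisor.

```python
import math
import xml.etree.ElementTree as ET

# The value 25 is hypothetical; the slide's actual <Cost> content is not shown.
document = ET.fromstring("<Document><Cost>25</Cost></Document>")

cost = float(document.findtext("Cost"))  # number(Cost)

doubled = cost * 2                # Cost * 2
halved = cost / 2                 # Cost div 2
remainder = math.fmod(cost, 2)    # Cost mod 2

# Sign behaviour for negative operands:
xpath_style = math.fmod(-25, 2)   # what XPath's -25 mod 2 yields
python_style = -25 % 2            # Python's % gives a different sign

print(doubled)       # 50.0
print(halved)        # 12.5
print(remainder)     # 1.0
print(xpath_style)   # -1.0
print(python_style)  # 1
```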
58
Set this to XPath 2.0
59
Does Document’s classification match one in Classifications.xml?
/Document/@classification = doc('Classifications.xml')/Classifications/li
60
Do the first two Para's have the same classification?
Para[1]/@classification eq Para[2]/@classification
61
Boolean Operators
eq means equal
ne means not equal
lt means less than
gt means greater than
le means less than or equal to
ge means greater than or equal to
62
If Document's classification is top-secret then there can be no Para with a classification not equal to top-secret
if (/Document/@classification eq 'top-secret') then not(//Para[@classification ne 'top-secret']) else true()
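The conditional rule reads naturally as an if/else in a general-purpose language. This is a sketch assuming Python's xml.etree.ElementTree; the function name rule_holds and the second, top-secret test document are hypothetical and not from the tutorial.

```python
import xml.etree.ElementTree as ET


def rule_holds(xml_text):
    """If Document is top-secret, every Para must be top-secret too."""
    document = ET.fromstring(xml_text)
    para_classes = [p.get("classification") for p in document.iter("Para")]
    if document.get("classification") == "top-secret":
        # not(//Para[@classification ne 'top-secret'])
        return not any(c != "top-secret" for c in para_classes)
    return True  # else true()


# Shape of the slide 16 document: classification is secret, so the rule
# holds trivially.
SECRET_DOC = (
    '<Document classification="secret">'
    '<Para classification="unclassified">One if by land</Para>'
    '<Para classification="confidential">And I on the opposite shore</Para>'
    "</Document>"
)

# Hypothetical document that violates the rule.
BAD_DOC = (
    '<Document classification="top-secret">'
    '<Para classification="top-secret">first</Para>'
    '<Para classification="unclassified">second</Para>'
    "</Document>"
)

print(rule_holds(SECRET_DOC))  # True
print(rule_holds(BAD_DOC))     # False
```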
63
Two built-in functions
true()
false()
64
Cast a value to a numeric type
number(Cost)
65
Check that Document's children are: multiple Para's, 1 Test, and 1 Cost (and nothing else)
Para[2] and Test and Cost and empty(* except (Para, Test[1], Cost[1]))
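The same check can be spelled out step by step to show what each part of the expression contributes. This is a sketch assuming Python's xml.etree.ElementTree; the function name children_ok and the sample documents are hypothetical. Para[2] requires at least two Para children, and the except clause allows only the first Test and the first Cost, which forces exactly one of each.

```python
import xml.etree.ElementTree as ET


def children_ok(xml_text):
    """Mirror: Para[2] and Test and Cost and empty(* except (Para, Test[1], Cost[1]))."""
    document = ET.fromstring(xml_text)
    tags = [child.tag for child in document]
    return (
        tags.count("Para") >= 2                    # Para[2] exists
        and tags.count("Test") == 1                # a Test, and no second one
        and tags.count("Cost") == 1                # a Cost, and no second one
        and set(tags) <= {"Para", "Test", "Cost"}  # nothing else
    )


# Hypothetical documents; element contents are placeholders.
GOOD = "<Document><Para/><Para/><Para/><Test/><Cost>25</Cost></Document>"
ONE_PARA = "<Document><Para/><Test/><Cost>25</Cost></Document>"
TWO_COSTS = "<Document><Para/><Para/><Test/><Cost>1</Cost><Cost>2</Cost></Document>"
EXTRA = "<Document><Para/><Para/><Test/><Cost>25</Cost><Note/></Document>"

print(children_ok(GOOD))       # True
print(children_ok(ONE_PARA))   # False
print(children_ok(TWO_COSTS))  # False
print(children_ok(EXTRA))      # False
```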
66
The sum() function
<?xml version="1.0"?>
<numbers>
    <number>23</number>
    <number>5</number>
    <number>-41</number>
    <number>50</number>
    <number>12</number>
</numbers>
sum(//number) returns 49.0
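The result is easy to confirm by hand or in code. This is a minimal sketch assuming Python's xml.etree.ElementTree; each number element's text is converted to a float, much as XPath converts untyped values to doubles before summing.

```python
import xml.etree.ElementTree as ET

NUMBERS = (
    "<numbers>"
    "<number>23</number>"
    "<number>5</number>"
    "<number>-41</number>"
    "<number>50</number>"
    "<number>12</number>"
    "</numbers>"
)

root = ET.fromstring(NUMBERS)

# sum(//number): add up the numeric value of every number element
total = sum(float(n.text) for n in root.iter("number"))

print(total)  # 49.0
```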
67
Check that every Publisher has a string-length le 140
<BookStore>
    <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>1998</Date>
        <ISBN>1-56592-235-2</ISBN>
        <Publisher>McMillin Publishing</Publisher>
        <Author>John Ghostwriter</Author>
    </Book>
    <Book>
        <Publisher>Dell Publishing Co.</Publisher>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    </Book>
    <Book>
        <ISBN>0-06-064831-7</ISBN>
        <Title>The First and Last Freedom</Title>
        <Author>J. Krishnamurti</Author>
        <Date>1954</Date>
        <Publisher>Harper &amp; Row</Publisher>
    </Book>
</BookStore>
68
Check that every Publisher has a string-length le 140
every $i in //Publisher satisfies string-length($i) le 140
69
The XPath every expression
• The form of the every expression is:
    every variable in sequence satisfies boolean expression
• The result of the expression is either true or false.
70
Equivalent
every $i in //Publisher satisfies string-length($i) le 140
not(//Publisher[string-length(.) gt 140])
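The equivalence is the familiar one between "all items pass" and "no item fails". This sketch assumes Python's xml.etree.ElementTree and a trimmed copy of the BookStore document that keeps only the Title and Publisher elements; all() plays the role of every, and not any() plays the role of not(...[...]).

```python
import xml.etree.ElementTree as ET

# Trimmed from slide 67; note the ampersand must be escaped as &amp;.
BOOKSTORE = (
    "<BookStore>"
    "<Book><Title>My Life and Times</Title>"
    "<Publisher>McMillin Publishing</Publisher></Book>"
    "<Book><Publisher>Dell Publishing Co.</Publisher>"
    "<Title>Illusions The Adventures of a Reluctant Messiah</Title></Book>"
    "<Book><Title>The First and Last Freedom</Title>"
    "<Publisher>Harper &amp; Row</Publisher></Book>"
    "</BookStore>"
)

root = ET.fromstring(BOOKSTORE)
publishers = [p.text for p in root.iter("Publisher")]

# every $i in //Publisher satisfies string-length($i) le 140
every_form = all(len(name) <= 140 for name in publishers)

# not(//Publisher[string-length(.) gt 140])
not_form = not any(len(name) > 140 for name in publishers)

print(publishers)
print(every_form)  # True
print(not_form)    # True
```

Both forms also agree when there are no Publisher elements at all: every is true for an empty sequence, and not() of an empty selection is true.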