Object Oriented Programming III 2
Processing XML using XSLT
• XSLT is available on a number of platforms.
• Next week – how C# interacts with XSLT
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="demo1.xsl"?>
<book>
  <title>The Catcher in the Rye</title>
  <author>J. D. Salinger</author>
  <publisher>Little, Brown and Company</publisher>
</book>
Input
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="book">
    <HTML><BODY><xsl:apply-templates/></BODY></HTML>
  </xsl:template>
  <xsl:template match="title">
    <H1><xsl:apply-templates/></H1>
  </xsl:template>
  <xsl:template match="author">
    <H3><xsl:apply-templates/></H3>
  </xsl:template>
  <xsl:template match="publisher">
    <P><I><xsl:apply-templates/></I></P>
  </xsl:template>
</xsl:stylesheet>
Processing
<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <P><I>Little, Brown and Company</I></P> </BODY></HTML>
Output
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="demo1.xsl"?>
<library>
  <block>
    <book>
      <title>The Catcher in the Rye</title>
      <author>J. D. Salinger</author>
      <publisher>Little, Brown and Company</publisher>
    </book>
  </block>
</library>
Input
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match = "book">
<HTML><BODY><xsl:apply-templates/></BODY></HTML>
</xsl:template>
<xsl:template match = "title">
<H1><xsl:apply-templates/></H1>
</xsl:template>
<xsl:template match = "author">
<H3><xsl:apply-templates/></H3>
</xsl:template>
<xsl:template match = "publisher">
<P><I><xsl:apply-templates/></I></P>
</xsl:template>
</xsl:stylesheet>
The default rules match the root, library, and block elements.
<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <P><I>Little, Brown and Company</I></P> </BODY></HTML>
The output is the same.
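The default rules that make this work can be written out explicitly. The sketch below restates the three built-in template rules of XSLT 1.0; a processor behaves as if they were present, so you never have to write them yourself, and any template in your own stylesheet takes precedence over them.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Built-in rule 1: for the root and for every element, keep recursing into the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>
  <!-- Built-in rule 2: copy the value of text nodes and attributes to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>
  <!-- Built-in rule 3: comments and processing instructions produce nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Rule 1 is why library and block are passed through silently, and rule 2 is why the text inside an element still appears when no template of ours matches that element.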
<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?>
<book>
  <title>The Catcher in the Rye</title>
  <author>J. D. Salinger</author>
  <publisher>Little, Brown and Company</publisher>
  <book>Cliff Notes on The Catcher in the Rye</book>
</book>
Two books in the input
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>
<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>
<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>
<xsl:template match = "publisher"> <P><I><xsl:apply-templates/></I></P> </xsl:template>
</xsl:stylesheet>
What’s the output?
<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <P><I>Little, Brown and Company</I></P> <HTML><BODY>Cliff Notes on The Catcher in the Rye</BODY></HTML> </BODY></HTML>
Illegal HTML
<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?>
<book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book>
Input
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>
<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>
<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>
<!-- <xsl:template match = "publisher"> <P><I><xsl:apply-templates/></I></P> </xsl:template> -->
</xsl:stylesheet>
We are not matching on publisher.
<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> Little, Brown and Company </BODY></HTML>
The default rule matches the publisher element and then prints its child text node.
<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?>
<book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book>
Input
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>
<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>
<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>
<xsl:template match = "publisher"> <!-- Skip the publisher --> </xsl:template>
</xsl:stylesheet>
We can skip the publisher by matching it and stopping the recursion.
<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> </BODY></HTML>
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="demo1.xsl"?>
<shelf>
  <book>
    <title>The Catcher in the Rye</title>
    <author>J. D. Salinger</author>
    <publisher>Little, Brown and Company</publisher>
  </book>
  <book>
    <title>The Catcher in the Rye</title>
    <author>J. D. Salinger</author>
    <publisher>Little, Brown and Company</publisher>
  </book>
  <book>
    <title>The Catcher in the Rye</title>
    <author>J. D. Salinger</author>
    <publisher>Little, Brown and Company</publisher>
  </book>
</shelf>
A shelf has many books.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>
<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>
<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>
<xsl:template match = "publisher"> <i><xsl:apply-templates/></i> </xsl:template>
</xsl:stylesheet>
Will this do the job?
<HTML> <BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <i>Little, Brown and Company</i> </BODY></HTML>
<HTML> <BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <i>Little, Brown and Company</i> </BODY></HTML>
<HTML> <BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <i>Little, Brown and Company</i> </BODY></HTML>
This is not what we want: each book produces its own HTML document.
Same input.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match = "shelf"> <HTML><BODY>Found a shelf</BODY></HTML> </xsl:template>
</xsl:stylesheet>
Checks for a shelf and quits.
Same input.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="shelf">
    <HTML>
      <BODY>
        <b>These are a few of my favorite books</b>
        <table width="640" border="5">
          <xsl:apply-templates/>
        </table>
      </BODY>
    </HTML>
  </xsl:template>
  <xsl:template match="book">
    <tr>
      <td><xsl:number/></td>
      <xsl:apply-templates/>
    </tr>
  </xsl:template>
  <xsl:template match="title | author | publisher">
    <td><xsl:apply-templates/></td>
  </xsl:template>
</xsl:stylesheet>
Produce a table of books.
<HTML><BODY>
<b>These are a few of my favorite books</b>
<table width="640" border="5">
  <tr><td>1</td> <td>The Catcher in the Rye</td> <td>J. D. Salinger</td> <td>Little, Brown and Company</td></tr>
  <tr><td>2</td> <td>The XSLT Programmer's Reference</td> <td>Michael Kay</td> <td>Wrox Press</td></tr>
  <tr><td>3</td> <td>Computer Organization and Design</td> <td>Patterson and Hennessy</td> <td>Morgan Kaufmann</td></tr>
</table>
</BODY></HTML>
XPATH
• A non-XML language used to identify particular parts of an XML document.
• Used by XSLT for matching and selecting particular elements to be copied into the result tree.
• Used by XPointer to identify a particular point in, or part of, an XML document that an XLink links to.
Slides adapted from “XML in a Nutshell” by Harold
XPATH
First, we’ll look at three commonly used XSLT instructions:
• xsl:value-of
• xsl:template
• xsl:apply-templates
XPATH
<xsl:value-of select = "XPathExpression" />
The xsl:value-of element computes the string value of an XPath expression and inserts it into the result tree. XPath allows us to select nodes in the tree, and different node types produce different values.
XPATH
<xsl:value-of select = "XPathExpression" />
element => the text content of the element after all tags are stripped
text => the text of the node
attribute => the value of the attribute
root => the text content of the whole document (all text nodes concatenated)
processing-instruction => the processing instruction data (<?, ?>, and the target are not included)
comment => the text of the comment (no comment symbols)
namespace => the namespace URI
node set => the value of the first node in the set
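A few of these rules can be tried against the single-book input from the first example. The stylesheet below is only a sketch: it assumes that input (a book element with title, author, and publisher children, preceded by an xml-stylesheet processing instruction) and uses text output so the values appear one per line.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Node set: book/* selects title, author and publisher, but only the first node (title) is used. -->
    <xsl:value-of select="book/*"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Element: all text inside book, with the tags stripped. -->
    <xsl:value-of select="book"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Processing instruction: the data only, without the target name xml-stylesheet. -->
    <xsl:value-of select="processing-instruction()"/>
  </xsl:template>
</xsl:stylesheet>
```

The first value-of should print only The Catcher in the Rye, because in XSLT 1.0 a node set is reduced to its first node in document order. The XML declaration is not a processing instruction in the XPath data model, so the last value-of sees only the xml-stylesheet instruction.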
XPATH
<xsl:template match = "pattern" />
The xsl:template top-level element is the key to all of XSLT. The match attribute contains a pattern (location path) against which nodes are compared as they’re processed. If the pattern matches a node, then the contents of the template are instantiated.
XPATH
<xsl:apply-templates select = "XPath node set expression" />
For each node selected by the node set expression, find and apply the highest-priority template that matches it.
If the select attribute is not present, then all children of the context node are processed.
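Both points can be seen in one small stylesheet. This is an illustrative sketch that assumes the single-book input from the first example; the output strings are made up for the demonstration. It relies on the XSLT 1.0 default priorities, where a pattern that names an element gets priority 0 and the pattern * gets -0.5.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <!-- With select, only the title child is processed; author and publisher are never visited. -->
  <xsl:template match="book">
    <xsl:apply-templates select="title"/>
  </xsl:template>
  <!-- Matches any element; default priority -0.5. -->
  <xsl:template match="*">
    <xsl:text>some element</xsl:text>
  </xsl:template>
  <!-- Matches title by name; default priority 0, so it wins over "*" for title elements. -->
  <xsl:template match="title">
    <xsl:text>Title: </xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

On that input the result should be the single line Title: The Catcher in the Rye. The "*" template also matches book and title, but it loses to the more specific named templates both times.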
The Tree Structure of an XML Document
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="pi.xsl" ?>
<people>
  <person born="1912" died="1954" id="p342">
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
    <!-- Did the word "computer scientist" exist in Turing's day? -->
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
See Harold Pg. 147
  <person born="1918" died="1988" id="p4567">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>M</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>
Unicode ‘M’
Partial tree for the document above:
/
    <?xml-stylesheet type="text/xsl" href="pi.xsl" ?>
    people
        person    born="1912"  died="1954"  id="p342"
            name
                first_name
                    Alan
            <!-- Did the word "computer scientist" exist in Turing's day? -->
            profession
        person
Nodes seen by XPath:
• The root
• Element nodes
• Text nodes
• Attribute nodes
• Comment nodes
• Processing instructions
• Namespace nodes
Constructs not seen by XPath:
• CDATA sections
• Entity references
• Document type declarations
Note
The following wrapper appears in each example below, so it has been removed from the slides.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  :
  :
</xsl:stylesheet>
Location Paths
• The root
<xsl:template match="/"><a>matched the root</a>
</xsl:template>
<?xml version="1.0" encoding="utf-8"?><a>matched the root</a>
Location Paths
• Child element location paths (relative to context node)
<xsl:template match="/"> <xsl:value-of select = "people/person/profession" /></xsl:template>
computer scientist
Location Paths
• Attribute location paths (relative to context node)
<xsl:template match="/"> <xsl:value-of select = "people/person/@born" /></xsl:template>
<?xml version="1.0" encoding="utf-8"?>1912
Location Paths
• Attribute location paths (relative to context node)
<xsl:template match="/"> <xsl:apply-templates select = "people/person" /></xsl:template>
<xsl:template match = "person"> <date> <xsl:value-of select = "@born" /> </date></xsl:template>
<date>1912</date><date>1918</date>
Location Paths
• Comment Location Step (comments don’t have names)
<xsl:template match="/"> <xsl:value-of select = "people/person/comment()" /></xsl:template>
<?xml version="1.0" encoding="utf-8"?> Did the word "computer scientist" exist in Turing's day?
Location Paths
• Comment Location Step
<xsl:template match = "comment()" > <i>comment deleted</i></xsl:template>
Output: the document content, with each comment replaced as shown. By default, comments are not output.
Location Paths
• Text Location Step (Text nodes don’t have names)
<xsl:template match="/"> <xsl:value-of select = "people/person/profession/text()" /></xsl:template>
computer scientist
Location Paths
• Processing Instruction Location Step
<xsl:template match="/"> <xsl:value-of select = "processing-instruction()" /></xsl:template>
<?xml version="1.0" encoding="utf-8"?>type="text/xsl" href = "pi.xsl"
Location Paths
• Wild cards
There are three wild cards: *, node(), @*
The * matches any element node. It will not match attributes, text nodes, comments, or processing-instruction nodes.
Location Paths
• Matching with *
<xsl:template match = "*" > <xsl:apply-templates select ="*" /></xsl:template>
Matches every element and applies templates to child elements only. Nothing is displayed, because the text nodes are never reached.
Location Paths
• Matching with node()
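The node() wild card matches any node: element nodes, text nodes, attribute nodes, processing-instruction nodes, namespace nodes, and comment nodes. Which of these it actually reaches depends on the axis: on the default child axis it finds elements, text, comments, and processing instructions, while attributes must be requested explicitly (for example with @*).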
Matching with node()
<xsl:template match="node()">
<xsl:apply-templates/>
</xsl:template>
What is the output?
Matching with node() - Output
<?xml version="1.0" encoding="UTF-8"?>
Location Paths
• Matching with @*
The @* wild card matches all attribute nodes.
Matching with @*
<xsl:template match="@*">
Found an attribute <xsl:value-of select="."/>
</xsl:template>
<xsl:template match="node()">
<xsl:apply-templates select="@*"/> <xsl:apply-templates/>
</xsl:template>
What is the output?
Matching with @* - Output
<?xml version="1.0" encoding="UTF-8"?> Found an attribute 1912 Found an attribute 1954 Found an attribute p342 Found an attribute 1918 Found an attribute 1988 Found an attribute p4567
Matching with @*
<xsl:template match = "person" > <b> <xsl:apply-templates select = "@*" /> </b></xsl:template>
<?xml version="1.0" encoding="utf-8"?>
<b>19121954p342</b>
<b>19181988p4567</b>
Location Paths
• Multiple matches with |
<xsl:template match = "profession|hobby" > <activity> <xsl:value-of select = "text()"/> </activity></xsl:template>
<xsl:template match = "*" > <xsl:apply-templates /></xsl:template>
<xsl:template match = "text()" ></xsl:template>
Matches all the elements. Skips the text nodes unless they describe a profession or hobby.
Location Paths
• Selecting from all descendants with //
// selects from all descendants of the context node as well as the context node itself. At the beginning of an XPath expression, it selects from all descendants of the root node.
Location Paths
• Selecting from all descendants with //
<xsl:template match = "//name/last_name/text()" > <xsl:value-of select = "." /></xsl:template>
<xsl:template match = "text()" ></xsl:template>
<?xml version="1.0" encoding="utf-8"?>TuringFeynman
Location Paths
• Selecting from all descendants with //
<xsl:template match = "/" >
<xsl:value-of select = "//first_name/text()" />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>Alan
Location Paths
• Selecting from all descendants with //
<xsl:template match = "/" >
<xsl:apply-templates select = "//first_name/text()" />
</xsl:template>
<xsl:template match = "text()" >
<xsl:value-of select = "." />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>AlanRichard
Location Paths
• Selecting from all descendants with //
<xsl:template match = "/" >
<xsl:apply-templates select = "//middle_initial/../first_name" />
</xsl:template>
<xsl:template match = "text()" >
<xsl:value-of select = "." />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>Richard
Specifying the Child Axis
Consider the following path:
/Envelope/Header/Signature
The above is an abbreviation for
/child::Envelope/child::Header/child::Signature
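The same correspondence holds for the other abbreviations: @name is short for attribute::name, .. is short for parent::node(), and // is short for /descendant-or-self::node()/. The sketch below applies this to the people document from earlier (an assumption; the Envelope document above is not shown in these slides) and writes each value on its own line.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Abbreviated: two child steps and an attribute step. -->
    <xsl:value-of select="people/person/@born"/>
    <xsl:text>&#10;</xsl:text>
    <!-- The same path with every axis spelled out. -->
    <xsl:value-of select="child::people/child::person/attribute::born"/>
    <xsl:text>&#10;</xsl:text>
    <!-- parent::node() is the long form of ".." -->
    <xsl:value-of select="//middle_initial/parent::node()/child::first_name"/>
  </xsl:template>
</xsl:stylesheet>
```

Against the people document this should print 1912 twice and then Richard, showing that the abbreviated and unabbreviated paths select the same nodes.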
Using an Axis
<xsl:template match="people">
<xsl:apply-templates select="person"/>
</xsl:template>
<xsl:template match = "person" >
<xsl:if test="position() = last()">
<xsl:value-of select="preceding-sibling::person/name"/>
</xsl:if>
<xsl:if test="position() != last()">
<xsl:value-of select="following-sibling::person/name"/>
</xsl:if>
</xsl:template>
What is the output?
<?xml version="1.0" encoding="UTF-8"?> Richard M Feynman Alan Turing
Axis Example - Output
Writing Output to an Attribute
<xsl:template match="@*">
<someTag id="{.}"></someTag>
</xsl:template>
<xsl:template match="node()">
<xsl:apply-templates select="@*"/> <xsl:apply-templates/>
</xsl:template>
Writing Output to an Attribute
<?xml version="1.0" encoding="UTF-8"?><someTag id="1912"/><someTag id="1954"/><someTag id="p342"/><someTag id="1918"/><someTag id="1988"/><someTag id="p4567"/>
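The braces in id="{.}" form an attribute value template: the XPath expression inside { } is evaluated and its string value becomes the attribute's value. As a sketch of what that shorthand stands for, the stylesheet below builds the same attribute with xsl:attribute instead; it is an equivalent spelling for comparison, not a different technique.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="@*">
    <someTag>
      <!-- Same effect as id="{.}": the id attribute receives the value of the current attribute node. -->
      <xsl:attribute name="id">
        <xsl:value-of select="."/>
      </xsl:attribute>
    </someTag>
  </xsl:template>
  <xsl:template match="node()">
    <xsl:apply-templates select="@*"/>
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

The brace form is shorter and is usually preferred when the attribute name is fixed; xsl:attribute is the tool to reach for when the name itself has to be computed.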
Predicates
In general, an XPath expression may refer to more than one node. Predicates allow us to reduce the number of nodes we are interested in.
Each step in a location path may have a predicate that selects from the node list that is current at that step in the expression.
The boolean expression in the predicate is tested against each node in the context node list. If the expression is false, then that node is deleted from the list.
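As a minimal sketch of both kinds of predicate, the stylesheet below filters the people document from earlier, first with a boolean test and then with a number. A number in a predicate is shorthand for a position test, so [2] means [position() = 2].

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Boolean test: keep only person elements whose name contains a middle_initial. -->
    <xsl:value-of select="//person[name/middle_initial]/name/last_name"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Position tests: the first person child, then its second profession child. -->
    <xsl:value-of select="//person[1]/profession[2]"/>
  </xsl:template>
</xsl:stylesheet>
```

Against the people document this should print Feynman and then mathematician. In the first path a node set used as a test counts as true when it is non-empty, which is why no comparison operator is needed.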
Predicates
<xsl:template match = "/" >
<xsl:apply-templates select = "//profession[.='physicist']/../name" />
</xsl:template>
<xsl:template match = "text()" >
<xsl:value-of select = "." />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>
Richard M Feynman
Predicates
<xsl:template match = "/" >
<xsl:apply-templates select = "//person[@id='p4567']" />
</xsl:template>
<xsl:template match = "text()" >
<xsl:value-of select = "." />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>
Richard M Feynman
physicist Playing the bongoes
Predicates
<xsl:template match = "/" >
<xsl:apply-templates select = "//person[@born &lt;= 1915]" />
</xsl:template>
<xsl:template match = "text()" >
<xsl:value-of select = "." />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>
Alan Turing
computer scientist mathematician cryptographer
Predicates
<xsl:template match = "/" >
<xsl:apply-templates select = "//person[@born &lt;= 1919 and @born >= 1917]" />
</xsl:template>
<xsl:template match = "text()" >
<xsl:value-of select = "." />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>
Richard M Feynman
physicist Playing the bongoes
Predicates
<xsl:template match = "/" >
<xsl:apply-templates select = "/people/person[@born &lt; 1950]/name[first_name='Alan']" />
</xsl:template>
<?xml version="1.0" encoding="utf-8"?>
Alan Turing
General XPath Expressions
XPath expressions that do not return node sets can’t be used in the match attribute of an xsl:template element.
They can be used as values of the select attribute of xsl:value-of elements and in location path predicates.
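XPath expressions can also return numbers, strings, and booleans. The sketch below uses two such expressions in xsl:value-of against the people document from earlier: count() returns a number, and concat() returns a string built from a node's text and the result of attribute arithmetic. The output layout is an illustrative choice.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- A number: how many person elements the document contains. -->
    <xsl:value-of select="count(//person)"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:apply-templates select="people/person"/>
  </xsl:template>
  <xsl:template match="person">
    <!-- A string: the last name followed by the difference of two attribute values. -->
    <xsl:value-of select="concat(name/last_name, ': ', @died - @born)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

On the people document this should print 2, then Turing: 42, then Feynman: 70. Neither count(//person) nor the concat() call could appear in a match attribute, since neither is a pattern that a node can be compared against.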
General XPath Expressions
<xsl:template match = "/" > <xsl:apply-templates select = "/people/person" /></xsl:template>
<xsl:template match = "person"> <xsl:value-of select="@born div 10" /></xsl:template>
<xsl:template match = "text()"></xsl:template>
<?xml version="1.0" encoding="utf-8"?>191.2191.8
General XPath Expressions
XPath Functions
<xsl:template match = "/" > <xsl:apply-templates select = "/people/person" /></xsl:template>
<xsl:template match = "person"> Person <xsl:value-of select="position()" /></xsl:template>
<xsl:template match = "text()"></xsl:template>
<?xml version="1.0" encoding="utf-8"?>
Person 1
Person 2
General XPath Expressions
XPath Functions
<xsl:template match = "/" > <xsl:apply-templates select = "//name[starts-with(last_name,'T')]"/></xsl:template>
<xsl:template match = "name"> Mr. T. <xsl:value-of select="." /></xsl:template>
<?xml version="1.0" encoding="utf-8"?>
Mr. T. Alan Turing
Node set converted to string