Tech Talk: BIBFRAME Working Group, University of Florida
17 November 2015
XPath, XSLT, and XQuery
Notes from the Library Juice Academy courses, “Introduction to XML” and
“Transforming and Querying XML with XSLT and XQuery”
Allison Jai O’Dell | [email protected] || Hikaru Nakano | [email protected]
Douglas Smith | [email protected] || Gerald Langford | [email protected]
XML
“The Extensible Markup Language (XML) is a simple text-based format for
representing structured information: documents, data, configuration, books,
transactions, invoices, and much more.” -- http://www.w3.org/standards/xml/core
XML Example
<?xml version="1.0" encoding="UTF-8"?>
<bunnies xmlns:food="http://www.example.com/food">
  <bunny>
    <name>Frances</name>
    <breed>mini lop</breed>
    <gender>female</gender>
    <color>white with brown spots</color>
    <birth>January 10, 2009</birth>
    <food:fave>strawberries, parsley, cilantro, carrots</food:fave>
  </bunny>
  <bunny status="RBB">
    <name>Howard</name>
    <breed>mixed, dwarf</breed>
    <gender>male</gender>
    <color>light brown agouti</color>
    <birth>March 15, 2009</birth>
    <death>September 1, 2012</death>
  </bunny>
</bunnies>
• opening and closing tag
• case sensitive
• properly nested
• quoted attribute values
• opening XML declaration
• character encoding
• root element
• namespace declaration
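Several of the rules above can be checked by handing text to an XML parser. The sketch below is a minimal illustration in Python using the standard-library xml.etree.ElementTree module. The sample strings and the is_well_formed helper are assumptions made up for this note, not part of the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, modeled on the bunny example above.
WELL_FORMED = """<?xml version="1.0" encoding="UTF-8"?>
<bunnies xmlns:food="http://www.example.com/food">
  <bunny status="RBB">
    <name>Howard</name>
    <food:fave>carrots</food:fave>
  </bunny>
</bunnies>"""

# Hypothetical broken snippets, one per rule from the list above.
BROKEN = {
    "improper nesting": "<bunny><name>Howard</bunny></name>",
    "case mismatch": "<bunny><Name>Howard</name></bunny>",
    "unquoted attribute value": "<bunny status=RBB/>",
    "more than one root element": "<bunny/><bunny/>",
    "undeclared namespace prefix": "<bunny><food:fave>carrots</food:fave></bunny>",
}


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


results = {label: is_well_formed(text) for label, text in BROKEN.items()}
print(is_well_formed(WELL_FORMED))
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```

A parser only checks well-formedness here. Checking a document against a schema or DTD (validity) is a separate step that this sketch does not cover.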
XPath
Selects nodes from an XML document
http://www.w3schools.com/xsl/xpath_intro.asp
XPath Examples
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
Select all the title nodes:
/bookstore/book/title
Select all the year nodes, regardless of path:
//year
Select the title of the first book:
/bookstore/book[1]/title
Select price nodes with price>35:
/bookstore/book[price>35]/price
Select the titles of books whose category attribute is "WEB":
/bookstore/book[@category="WEB"]/title
XPath 2.0 and later also support regular expressions, through functions such as matches(), replace(), and tokenize().
http://www.w3schools.com/xsl/xpath_examples.asp
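To try the path expressions above without a dedicated XML tool, here is a minimal sketch in Python with the standard-library xml.etree.ElementTree module. This module supports only a subset of XPath: paths are written relative to the root element, and comparisons such as price>35 are not supported, so that filter is done in Python. The BOOKSTORE string is an abbreviated copy of the example document (authors left out), and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the bookstore example (authors omitted for brevity).
BOOKSTORE = """
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><year>2005</year><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><year>2005</year><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><year>2003</year><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><year>2003</year><price>39.95</price></book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE)  # root is the <bookstore> element

# /bookstore/book/title
all_titles = [t.text for t in root.findall("./book/title")]

# //year
years = [y.text for y in root.findall(".//year")]

# /bookstore/book[1]/title
first_title = [t.text for t in root.findall("./book[1]/title")]

# /bookstore/book[price>35]/price
# ElementTree has no numeric comparison in predicates, so filter in Python.
expensive = [
    b.findtext("price")
    for b in root.findall("./book")
    if float(b.findtext("price")) > 35
]

# /bookstore/book[@category="WEB"]/title
web_titles = [t.text for t in root.findall("./book[@category='WEB']/title")]

print(all_titles, years, first_title, expensive, web_titles, sep="\n")
```

A full XPath engine evaluates all five expressions directly. The Python filter here is a workaround for the subset that ElementTree implements.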
XSLT
eXtensible Stylesheet Language Transformations
Transforms XML documents into other documents
http://www.w3schools.com/xsl/default.asp
XSLT Overview
XSLT transforms an XML document into another document: another XML document, a web
document (for example, HTML or XHTML), or plain text.
How it works: The XSLT processor uses XPath to navigate the source tree and select nodes. For
each selected node, the processor checks whether the node matches a template defined in the
stylesheet. If it does, the processor carries out the transformation defined in that template.
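The match-then-transform cycle can be mimicked in a few lines of Python. This is a toy analogue, not XSLT: the "templates" are ordinary functions keyed by element name, there are no XPath match patterns, and text nodes without a template are simply skipped. The sample input is modeled on the image catalog used later in these notes, and every function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input, modeled on the image catalog example.
SOURCE = (
    "<catalog><image>"
    "<title>Hypnos and Thanatos</title><year>1874</year>"
    "</image></catalog>"
)


def apply_templates(node, templates):
    """Run the template registered for node.tag, else descend to the children."""
    rule = templates.get(node.tag)
    if rule is not None:
        return rule(node, templates)
    # No template matched: keep walking down the tree.
    return "".join(apply_templates(child, templates) for child in node)


def catalog_template(node, templates):
    # Wrap whatever the child templates produce in an HTML shell.
    body = "".join(apply_templates(child, templates) for child in node)
    return "<html><body>" + body + "</body></html>"


def image_template(node, templates):
    # Pull values out of the matched node, as xsl:value-of would.
    title = node.findtext("title")
    year = node.findtext("year")
    return "<h1>" + title + "</h1><p>Date: " + year + "</p>"


TEMPLATES = {"catalog": catalog_template, "image": image_template}

html = apply_templates(ET.fromstring(SOURCE), TEMPLATES)
print(html)
```

The point of the sketch is the dispatch step: the tree walk decides which nodes are visited, and the lookup decides which template runs for each one.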
Transformations using XSL
“A transformation in the XSLT language is expressed in the form of a stylesheet, whose syntax is
well-formed XML …
“The term stylesheet reflects the fact that one of the important roles of XSLT is to add styling
information to an XML source document, by transforming it into a document consisting of XSL
formatting objects (see [Extensible Stylesheet Language (XSL)]), or into another presentation-oriented
format such as HTML, XHTML, or SVG. However, XSLT is used for a wide range of
transformation tasks, not exclusively for formatting and presentation applications.”
-- http://www.w3.org/TR/xslt20/#what-is-xslt
How is XSLT used?
“If you make a purchase on eBay, or buy a book at Amazon, chances are that pretty much everything
you see on every Web page has been processed with XSLT. Use XSLT to process multiple XML
documents and to produce any combination of text, HTML and XML output. XSLT support is
shipped with all major computer operating systems today, as well as being built in to all major Web
browsers.”
-- http://www.w3.org/standards/xml/transformation
XSLT Example: XML to XML
Input:
\\ad.ufl.edu\uflib\deptdata\Cataloging\Authorities_&_Metadata_Quality\BibFrame\Meeting20151117\pubmed_sample_xml_rev.docx
Output:
\\ad.ufl.edu\uflib\deptdata\Cataloging\Authorities_&_Metadata_Quality\BibFrame\Meeting20151117\pubmed_sample_xslt_out.docx
XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>
  <!-- Rebuild the ArticleSet, keeping only selected fields from each Article -->
  <xsl:template match="/ArticleSet">
    <xsl:element name="ArticleSet">
      <xsl:for-each select="Article">
        <xsl:element name="title"><xsl:value-of select="ArticleTitle"/></xsl:element>
        <xsl:copy-of select="AuthorList"/>
        <xsl:element name="pages"><xsl:text>pages </xsl:text><xsl:value-of select="FirstPage"/><xsl:text>-</xsl:text><xsl:value-of select="LastPage"/></xsl:element>
        <!-- The link element is always written; the DOI URL is filled in only when a DOI exists -->
        <xsl:element name="link"><xsl:if test="ArticleIdList/ArticleId[@IdType='doi']"><xsl:text>http://dx.doi.org/</xsl:text><xsl:value-of select="ArticleIdList/ArticleId[@IdType='doi']"/></xsl:if></xsl:element>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
XSLT Example: XML to HTML
Input:
<?xml-stylesheet type="text/xsl" href="quiz1.xsl"?>
<catalog>
  <type>Image Catalog</type>
  <image>
    <id>entry.0001</id>
    <preview>http://upload.wikimedia.org/wikipedia/commons/9/93/Waterhouse-sleep_and_his_half-brother_death-1874.jpg</preview>
    <title>Hypnos and Thanatos</title>
    <artist>John William Waterhouse</artist>
    <country>UK</country>
    <medium>Painting</medium>
    <year>1874</year>
    <subject>Greek Mythology</subject>
  </image>
</catalog>
XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <xsl:apply-templates select="catalog"/>
  </xsl:template>
  <xsl:template match="catalog">
    <html>
      <head>
        <title>Quiz 1</title>
        <style>
          body {background-color: #000000}
          h1 {color: #ffffff;
              font-family: verdana}
          h2 {color: #F6CEEC;
              font-family: verdana}
          p {color: #F6CEEC;
             font-family: verdana}
        </style>
      </head>
      <body>
        <xsl:for-each select="image">
          <p>
            <xsl:variable name="preview" select="preview"></xsl:variable>
            <img src="{$preview}" width="400px"/>
          </p>
          <h1>
            <xsl:value-of select="title"/>
          </h1>
          <h2>
            <b><xsl:value-of select="artist"/></b>
          </h2>
          <p>
            Country: <xsl:value-of select="country" /><br />
            Medium: <xsl:value-of select="medium" /><br />
            Date: <xsl:value-of select="year" /><br />
            Subject: <xsl:value-of select="subject" /><br />
          </p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
Output:
http://allisonjai.com/lja/quiz1.xml
XQuery
Queries XML data
http://www.w3schools.com/xsl/xquery_intro.asp
XQuery Overview
XQuery has many uses, including:
• querying XML documents and data sources that can output XML documents
• combining data from multiple sources
• transforming data
• generating reports from XML data
• building web and application services over XML data
XSLT and XQuery overlap in what they can do, but in general XQuery is better suited to
querying large structured and unstructured data sets and deriving data from them.
FLWOR
XQuery combines path expressions (XPath), which access parts or fragments of XML data, with
FLWOR ("for", "let", "where", "order by", "return") expressions, which process, join, and
return that data.
http://www.w3ctutorial.com/xquery-basic/xquery-flwor
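To make the five FLWOR clauses concrete, the sketch below mirrors a small query in Python with the standard-library xml.etree.ElementTree module. It is an analogy, not an XQuery engine. The query shown in the comment is illustrative, the file name books.xml is hypothetical, and BOOKSTORE is an abbreviated copy of the bookstore example from the XPath section.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Abbreviated copy of the bookstore example (authors and years omitted).
BOOKSTORE = """
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""

# The query being mirrored ("books.xml" is a hypothetical file name):
#   for $b in doc("books.xml")/bookstore/book
#   let $p := xs:decimal($b/price)
#   where $p > 30
#   order by $b/title
#   return string($b/title)

root = ET.fromstring(BOOKSTORE)

matches = []
for b in root.findall("./book"):            # for: iterate over each book
    p = Decimal(b.findtext("price"))        # let: bind a value to a name
    if p > 30:                              # where: keep only some items
        matches.append(b)

matches.sort(key=lambda b: b.findtext("title"))    # order by: sort what is left
titles = [b.findtext("title") for b in matches]    # return: build the result

print(titles)
```

The book priced at exactly 30.00 is left out because the condition is strictly greater than 30.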