1
XML & related technologies
2
Outline
Markup Language
  XML
  DTD
API for XML
  DOM
  SAX
Related Technologies
  Namespace
  XPath
  XLink
  XSL
Query Languages for XML
  Quilt
3
A critique of HTML
Extraordinarily flexible, but low on structure
Fixed tag set (vocabulary)
No automatic validation
Unreliable use of the syntax
Nobody uses the data model
4
How XML solves this
Define your own tags (vocabulary)
Validate against the definition
Error handling and strict definition of the syntax
Smaller and simpler than SGML
Standardized APIs for working with it
A data model specification is coming
5
XML background
A subset of SGML
Simplifies SGML by:
  leaving out many syntactical options and variants
  leaving out some DTD features
  leaving out some troublesome features
Recommendation approved by the W3C
6
Elements
A simple and complete element:
<address>
<street> 33, Terry Dr.</street>
<city> Morristown </city>
</address>
Markup: the start tag (<street>) and the end tag (</street>)
Content: the text between them (33, Terry Dr.)
7
Elements
Attach a meaning to a piece of a document
Have an element type ('example', 'name') represented by markup (a tag).
Can be nested at any depth
8
Elements
Can contain:
other elements (sub-elements)
<address>
<street> 33, Terry Dr.</street><city> Morristown </city>
</address>
text (data content)
<street> 33, Terry Dr.</street>
a combination of them (mixed content)
<par>Today, <date>05-06-2000</date> Mr. <name>Bill Gates</name> is in California to talk to ... </par>
9
Document Element
It is the outermost element, containing all the other elements in a document
example:
<employee> … </employee>
It must always exist
10
Empty Elements
Elements without content
They do not have end tags
Particular representation of start tags
example:
<medical-dossier …/>
11
Attributes
Used to annotate the element with extra information
Always attached to start tags:
<el-name attr-name1="v1" ... attr-nameN="vN"> … </el-name>
Elements can have any number of attributes, but all distinct
<Orders>
<SalesOrder SONumber="12345">
<Customer CustNumber="543">
<CustName>ABC Industries</CustName>
<Street>123 Main St.</Street>
<City>Chicago</City>
<State>IL</State>
<PostCode>60609</PostCode>
</Customer>
<OrderDate>981215</OrderDate>
<Line LineNumber="1">
<Part PartNumber="123">
<Description> Turkey wrench: Stainless steel,
one piece construction, lifetime guarantee.
</Description>
<Price>9.95</Price>
</Part>
<Quantity>10</Quantity>
</Line>
</SalesOrder>
</Orders>
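As a minimal sketch of how an application might read the attributes and element content of such a document, the following uses Python's standard xml.etree.ElementTree module on an abridged copy of the sales order. The variable names are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sales-order document shown above.
ORDERS_XML = """
<Orders>
  <SalesOrder SONumber="12345">
    <Customer CustNumber="543">
      <CustName>ABC Industries</CustName>
    </Customer>
    <Line LineNumber="1">
      <Part PartNumber="123"><Price>9.95</Price></Part>
      <Quantity>10</Quantity>
    </Line>
  </SalesOrder>
</Orders>
"""

root = ET.fromstring(ORDERS_XML)           # root is the <Orders> document element
order = root.find("SalesOrder")            # first <SalesOrder> child
so_number = order.get("SONumber")          # attribute value, always a string
cust_name = order.findtext("Customer/CustName")  # element content
part_numbers = [part.get("PartNumber") for part in order.iter("Part")]

print(so_number, cust_name, part_numbers)
```

Attributes come back as plain strings attached to the start tag, while element content is reached by navigating to the sub-element.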
13
An XML document
<?xml version="1.0"?>
<books>
<book>
<entry isbn="1-55860-622-X">
<title>Data on the Web:...</title><publisher>Morgan Kaufmann</publisher>
</entry>
<author> Serge Abiteboul</author> <author> Peter Buneman</author><author> Dan Suciu</author>
<bookRef to="0-201-53771-O 1-55860-463-4"/>
<articleLink href="http://…/articles.xml#id(Abi97)"/>
</book>
<book>
<entry isbn="0-201-53771-O"> <title>Foundation of Databases</title>
<publisher>Addison Wesley</publisher></entry>
<author> Serge Abiteboul</author>...
</book>...
</books>
14
Elements Vs Attributes
Do I use an element or an attribute to store semantic info?
An element, when:
  I need a fast searching process
  it is visible to everyone
  it is relevant for the meaning of the document
An attribute, when:
  it is a choice
  it is visible only to the system
  it is not relevant for the meaning of the document
15
Other Stuff
Processing instructions, used mainly for extension purposes (<?target data?>)
Comments (<!-- … -->)
Character references (&#163; for £)
Entities: named files or pieces of markup; can be referred to recursively, inserted at point of reference
16
Document Types
Basic idea: we need a type associated with a document, just like objects and values
A document type is a class of documents with similar structure and semantics
Examples: slide presentations, journal articles, meeting agendas, method calls, etc.
17
DTDs
DTDs provide a standardized means for declaratively describing the structure of a document type
This means:
  which (sub-)elements an element can contain
  whether it can contain text or not
  which attributes it can have
  some typing and defaulting of attributes
18
DTD
A DTD can be
  Internal: the DTD is in the document
  External: the DTD is in an external file and included in the document
  Mixed: part in the document and part outside
A DTD is logically composed of 2 parts:
  Element Type Definition
  Attribute List Declaration
19
Element Type Definition
The element type definition specifies:
  structure of the document
  allowed contents (content model)
  allowed attributes (by means of attribute list declarations)
20
Element Type Definition
The following are examples of possible declarations:
<!ELEMENT A (B*, C, D?)>
<!ELEMENT A (B | C+)>
<!ELEMENT A (#PCDATA)>
<!ELEMENT A EMPTY>
<!ELEMENT A ANY>
<!ELEMENT A (#PCDATA | B | C)*>
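Element content models behave like regular expressions over the sequence of child element names. The sketch below illustrates that reading by translating a simple content model into a Python regex; it is an illustration only, assuming models built from names, parentheses, ",", "|", "*", "?" and "+". The function names are hypothetical, and #PCDATA, EMPTY and ANY are not handled.

```python
import re

# A token is either an element name or one of the content-model operators.
TOKEN = re.compile(r"[A-Za-z_][\w.-]*|[()|,*?+]")

def content_model_to_regex(model):
    """Translate e.g. "(B*, C, D?)" into a regex over space-terminated child names."""
    parts = []
    for token in TOKEN.findall(model):
        if token == "(":
            parts.append("(?:")          # grouping only, no capture needed
        elif token == ",":
            continue                     # sequence is plain concatenation in a regex
        elif token in ")|*?+":
            parts.append(token)          # same meaning in DTDs and in regexes
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def children_match(model, child_names):
    """Check whether a list of child element names satisfies the content model."""
    sequence = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None

print(children_match("(B*, C, D?)", ["B", "B", "C", "D"]))
print(children_match("(B | C+)", ["B", "C"]))
```

A validating parser does considerably more than this (mixed content, attribute checks, determinism rules), but the operator semantics are the same.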
21
Attribute-List Declarations
It is the list of allowed attributes for each element.
For each attribute: name, type, and other information.
Attribute types. Three groups:
  string types (CDATA)
  tokenized types (ID, IDREF, IDREFS, ...)
  enumerated types (like the ones in Pascal)
22
Attribute-List Declarations
For an element declared as <!ELEMENT A (#PCDATA)>:
<!ATTLIST A a CDATA #IMPLIED>
<!ATTLIST A a CDATA #IMPLIED b CDATA #REQUIRED>
<!ATTLIST A a CDATA "aaa">
<!ATTLIST A a CDATA #REQUIRED>
<!ATTLIST A a CDATA #FIXED "aaa">
<!ATTLIST A a (aaa|bbb) "aaa">
<!ATTLIST A id ID #REQUIRED>
<!ATTLIST A ref IDREF #IMPLIED>
(A default value takes the place of #IMPLIED or #REQUIRED; only #FIXED is followed by a value.)
A DTD
<!DOCTYPE Orders [
<!ELEMENT Orders (SalesOrder)+>
<!ELEMENT SalesOrder (Customer,OrderDate,Line*)>
<!ELEMENT Customer (CustName,Street,City,State,PostCode,tel*)>
<!ELEMENT CustName (#PCDATA)>
<!ELEMENT Street (#PCDATA)> <!ELEMENT City (#PCDATA)> <!ELEMENT State (#PCDATA)>
<!ELEMENT PostCode (#PCDATA)> <!ELEMENT tel (#PCDATA)>
<!ELEMENT OrderDate (#PCDATA)> <!ELEMENT Line (Part,Quantity)>
<!ELEMENT Part (Description,Price)> <!ELEMENT Quantity (#PCDATA)>
<!ELEMENT Description (#PCDATA)> <!ELEMENT Price (#PCDATA)>
<!ATTLIST SalesOrder SONumber CDATA #REQUIRED>
<!ATTLIST Customer CustNumber CDATA #REQUIRED>
<!ATTLIST Line LineNumber CDATA #IMPLIED>
<!ATTLIST Part PartNumber CDATA #REQUIRED>
]>
24
A DTD
<!DOCTYPE books [
<!ELEMENT books (book)+>
<!ELEMENT book (entry, author+, bookRef, articleLink*)>
<!ELEMENT entry (title, publisher)>
<!ELEMENT bookRef EMPTY>
<!ELEMENT articleLink EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ATTLIST entry isbn ID #REQUIRED>
<!ATTLIST bookRef to IDREFS #IMPLIED>
<!ATTLIST articleLink
xmlns:xlink CDATA #FIXED "http://w3c.org/xlink"
xlink:type CDATA #FIXED "simple"
xlink:href CDATA #REQUIRED>
]>
25
Well-formed and Valid Docs
A document is well-formed if it follows the grammar rules provided by W3C.
A document is valid if it conforms to a DTD which specifies the allowed structure of the document
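The well-formedness half of this distinction can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree, which is a non-validating parser: it reports syntax errors but does not check a document against its DTD. The helper name is_well_formed is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<address><street>33, Terry Dr.</street><city>Morristown</city></address>"
bad = "<par>Mr. <name>Bill Gates<name> is in California</par>"  # <name> never closed

print(is_well_formed(good), is_well_formed(bad))
```

Checking validity additionally requires a validating parser and the DTD, which the standard library module used here does not provide.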
26
Uses of XML Entities
Physical partition: size, reuse, "modularity", … (both XML docs & DTDs)
Non-XML data: unparsed entities (binary data)
Non-standard characters: character entities
Shorthand for phrases & markup: effectively macros
27
Types of Entities
Internal (to a doc) vs. External (referenced by URI)
General (in XML doc) vs. Parameter (in DTD)
Parsed (XML) vs. Unparsed (non-XML)
28
Entities & Physical Structure
A logical element can be split into multiple physical entities
[Figure: Mylife.xml holds the DTD and the <mylife> element; its content comes from Chap1.xml (<teen>yada yada</teen>) and Chap2.xml (<adult>blah blah..</adult>)]
29
External Text Entities
DTD (external text entity declaration, with a URL):
<!ENTITY chap1 SYSTEM "http://...chap1.xml">
XML (entity references):
<mylife> &chap1; &chap2;</mylife>
Logically equivalent to inlining the file contents:
<mylife> <teen>yada yada</teen><adult> blah blah</adult>
</mylife>
30
Internal Text Entities
DTD (internal text entity declaration):
<!ENTITY WWW "World Wide Web">
XML (entity reference):
<p>We all use the &WWW;.</p>
Logically equivalent to the text actually appearing:
<p>We all use the World Wide Web.</p>
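A small sketch of this expansion, assuming Python's standard xml.etree.ElementTree parser: the entity is declared in an internal DTD subset and the parser replaces the reference while reading the document.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE's internal subset declares the entity used in the body.
DOC = """<!DOCTYPE p [
  <!ENTITY WWW "World Wide Web">
]>
<p>We all use the &WWW;.</p>"""

paragraph = ET.fromstring(DOC)
print(paragraph.text)  # the reference has been replaced by the entity's text
```

The application never sees the entity reference itself, only the replacement text.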
31
Unparsed (& "Binary") Entities
DTD:
<!NOTATION ps SYSTEM "ghostview.exe">
  NOTATION declaration (helper app.)
<!ENTITY fusion SYSTEM "http://... fusion.ps" NDATA ps>
  Declare external and unparsed entity
<!ATTLIST fullPaper source ENTITY #REQUIRED>
  Declare attribute type to be entity
XML:
<fullPaper source="fusion"/>
  Element with ENTITY attribute
32
Processing XML
Non-validating parser: checks that XML doc is syntactically well-formed
Validating parser: checks that XML doc is also valid w.r.t. a given DTD
Parsing yields tree/object representation: Document Object Model (DOM) API
Or a stream of events (open/close tag, data): Simple API for XML (SAX)
33
API for handling XML Documents
DOM, SAX
34
DOM Structure Model and API
hierarchy of Node objects: document, element, attribute, text, comment, ...
language-independent programming DOM API:
  get... first/last child, prev/next sibling, childNodes
  insertBefore, replace
  getElementsByTagName
  ...
alternative event-based SAX API (Simple API for XML)
  does not build a parse tree (reports events when encountering begin/end tags)
  for (partially) parsing very large documents
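The DOM calls listed above can be tried with Python's standard xml.dom.minidom, one of many DOM bindings. This is a minimal sketch on the address element from the earlier slides; the inserted <note> element is a hypothetical addition used only to show insertBefore.

```python
from xml.dom.minidom import parseString

# No whitespace between tags, so the tree contains no whitespace-only text nodes.
dom = parseString(
    "<address><street>33, Terry Dr.</street><city>Morristown</city></address>"
)
root = dom.documentElement

# Navigation: first/last child and sibling links.
first_name = root.firstChild.tagName
last_name = root.lastChild.tagName
sibling_name = root.firstChild.nextSibling.tagName

# Lookup by tag name, then read the text node below the element.
city = dom.getElementsByTagName("city")[0]
city_text = city.firstChild.data

# Manipulation: create a new element and insert it before <city>.
note = dom.createElement("note")
root.insertBefore(note, city)
child_names = [node.tagName for node in root.childNodes]

print(first_name, last_name, sibling_name, city_text, child_names)
```

The whole document is held in memory as a tree, which is what makes this kind of navigation and in-place update possible.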
35
DOM Summary
Object-Oriented approach to traverse the XML node tree
Automatic processing of XML docs
Operations for manipulating XML tree
Manipulation & Updating of XML on client & server
Database interoperability mechanism
36
SAX Event-Based API
Pros:
  The whole file doesn't need to be loaded into memory
  XML stream processing
  Simple and fast
  Allows you to ignore less interesting data
Cons:
  limited expressive power (query/update) when working on streams
  => application needs to build (some) parse tree when necessary
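A minimal sketch of the event-based style, using Python's standard xml.sax module. The handler class TitleCollector is an assumption for this example: it reacts to start-tag, character-data and end-tag events and keeps only the titles, ignoring everything else in the stream.

```python
import xml.sax

class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element; no tree is built."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

BOOKS = (
    b"<books>"
    b"<book><title>Data on the Web:...</title><author>Serge Abiteboul</author></book>"
    b"<book><title>Foundation of Databases</title><author>Serge Abiteboul</author></book>"
    b"</books>"
)

handler = TitleCollector()
xml.sax.parseString(BOOKS, handler)
print(handler.titles)
```

Any state the application needs, such as "am I inside a title?", has to be tracked by the handler itself, which is the limitation noted under Cons.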
37
Related XML technologies
Namespace, Xlink, Xpath, XSL
38
Namespace
39
Namespace
Through namespaces it is possible to declare a set of names whose meaning is not ambiguous, i.e. everyone agrees on their meaning.
In other words, namespaces make it possible to distinguish two elements with the same name but different meanings
Element/attribute name is a combination of 2 parts: prefix:name
40
Example: Namespace
<person><name>Rosalie Panelli</name><address>33 Terry Dr.</address>
</person><webSite>
<name>XML Italia</name><address>http://www.xml.it</address>
</webSite>
<person xmlns:person="http://namespaces.xml.it/person">
<person:name>Rosalie Panelli</person:name><person:address>33 Terry Dr.</person:address>
</person>
<webSite xmlns:webSite="http://namespaces.xml.it/webSite">
<webSite:name>XML Italia</webSite:name><webSite:address>http://www.xml.it</webSite:address>
</webSite>
41
Namespace declaration
For now, in order to validate a document containing a namespace against a DTD, it is necessary to include the prefixes in the element declarations:
<!ELEMENT person (person:name, person:address)>
<!ATTLIST person xmlns:person CDATA #FIXED "http://namespaces.xml.it/person">
<!ELEMENT person:name (#PCDATA)>
<!ELEMENT person:address (#PCDATA)>
Note: the address http://namespaces.xml.it/person need not point to an actual resource
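To see how a namespace-aware parser tells the two name elements apart, here is a sketch using Python's standard xml.etree.ElementTree. The wrapping <doc> element and the short prefixes p and w are assumptions added for the example; the parser identifies elements by namespace URI, not by the prefix used in the document.

```python
import xml.etree.ElementTree as ET

PERSON_NS = "http://namespaces.xml.it/person"
SITE_NS = "http://namespaces.xml.it/webSite"

# <doc> is a hypothetical wrapper so that both fragments share one root.
DOC = f"""<doc>
<person xmlns:person="{PERSON_NS}">
  <person:name>Rosalie Panelli</person:name>
  <person:address>33 Terry Dr.</person:address>
</person>
<webSite xmlns:webSite="{SITE_NS}">
  <webSite:name>XML Italia</webSite:name>
  <webSite:address>http://www.xml.it</webSite:address>
</webSite>
</doc>"""

root = ET.fromstring(DOC)
prefixes = {"p": PERSON_NS, "w": SITE_NS}  # local prefixes, chosen freely

person_name = root.find(".//p:name", prefixes)
site_name = root.find(".//w:name", prefixes)

print(person_name.tag, person_name.text)
print(site_name.tag, site_name.text)
```

The URI is used purely as an identifier here; nothing is fetched from it.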
42
Xlink
43
Xlink
Only internal links can be represented by means of ID/IDREF(S) attributes
Xlink is a language that allows the definition of links among documents (external links) through Xlink elements
Unidirectional links can be defined as in HTML, but also other kinds
Based on a namespace specifically tailored by W3C
44
The Xlink Namespace
It consists of the following attributes: type, href, role, title, show, actuate, from, to
By means of these attributes it is possible to describe the different kinds of links
45
Optional and Required Attributes
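[Table: XLink attributes that are required (R) or optional (O) for each type of link element (simple, extended, resource, locator, arc, title). type is required for all six; href is required for one type and optional for another; role and title are optional for five types; show and actuate are optional for two types; from and to are optional for one type.]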
46
Description of attributes
type: the type of the XLink element
href: URI address of the resource used
role: link description used by the application
title: link description used by the human user
show: specifies the behavior of an application when it crosses the link
actuate: specifies when the behavior selected by the show attribute must be executed
from/to: specify the role attributes of the sources and of the targets in an extended link
47
A simple link
<dsi xmlns:xlink="http://www.w3c.org/"
     xlink:type="simple"
     xlink:href="http://www.dsi.unimi.it"
     xlink:show="new"
     xlink:actuate="onRequest"
     xlink:role="DSI"
     xlink:title="Dipartimento di Scienze dell’Info..">
  Dipartimento di Scienze dell’Informazione
</dsi>
48
Actuate and Show
By means of these attributes it is possible to specify when a particular link should be crossed and the behavior that an application should show when the link is actually crossed
49
The values of show
New when the link is crossed, the resource is loaded in a new page
Replace when the link is crossed, the target resource is substituted in the current page
Embed the resource is included in the current document
Undefined the application is free to apply the behavior it likes
50
The values of actuate
onLoad the link is crossed when the document is loaded into the application
onRequest the link is crossed when the user explicitly requests to cross the link
undefined the application crosses the link as it likes
51
The type of links
Simple: a link is simple when it is between two resources (one local, the other remote)
Extended: a link is extended when it is among different sources (both local and remote)
52
The simple links
They are similar to those of HTML:
<html>
...
<a href="http://www.dsi.unimi.it/Index.html"> Computer Science Dept </a>
...
</html>
53
Example: Xlink declaration
<!ELEMENT a (#PCDATA)>
<!ATTLIST a
  xmlns:xlink   CDATA         #FIXED "http://www.w3c.org"
  xlink:type    (simple)      #FIXED "simple"
  xlink:href    CDATA         #REQUIRED
  xlink:show    (new|replace) #FIXED "replace"
  xlink:actuate (onRequest)   #FIXED "onRequest"
  xlink:role    CDATA         #IMPLIED
  xlink:title   CDATA         #IMPLIED
>
54
Example: Xlink Instance
<xml>
...
<a xlink:href="http://www.disi.unige.it/Index.html"> Computer Science Dept </a>
...
</xml>
55
Xpath
56
Xpath
It is a language for expressing path expressions on the hierarchical structure of XML documents
Based on the concept of context node.
A context node is the node from which an expression is evaluated
Examples:
  /books/book/author
  /child::books/child::book/child::author
57
Xpath: Location Path
The central construct is the location path, which is a sequence of location steps separated by /.
A location step is evaluated with respect to some context, resulting in a set of nodes.
A location path is evaluated compositionally, left-to-right, starting with some initial context.
Each node resulting from the evaluation of one step is used as context for evaluation of the next, and the results are combined together.
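The compositional evaluation described above can be sketched in a few lines of Python. This is an illustration, not an XPath engine: it assumes an absolute path made only of child steps with plain element names, and the function name eval_child_path is hypothetical. The sample data is an abridged version of the books document.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = (
    "<books>"
    "<book><author>Serge Abiteboul</author><author>Peter Buneman</author></book>"
    "<book><author>Serge Abiteboul</author></book>"
    "</books>"
)

def eval_child_path(root, path):
    """Evaluate a path such as "/books/book/author" one step at a time."""
    steps = [step for step in path.split("/") if step]
    if not steps:
        return []
    # Initial context: the document node, whose only element child is the root.
    context = [root] if root.tag == steps[0] else []
    for name in steps[1:]:
        next_context = []
        for node in context:
            # Each node from the previous step becomes the context for this step,
            # and the per-node results are combined.
            next_context.extend(child for child in node if child.tag == name)
        context = next_context
    return context

authors = [a.text for a in eval_child_path(ET.fromstring(BOOKS_XML), "/books/book/author")]
print(authors)
```

Other axes and predicates change what each step selects, but the left-to-right threading of the context stays the same.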
58
Location steps
A location step has the form
axis :: node-test [ predicate ]
Example: child::section[position()<6]/descendant::cite/attribute::href
selects all href attributes in cite elements in the first 5
sections of an article.
59
Axes
60
Node tests
Testing by node type:
  text()                    character data nodes
  comment()                 comment nodes
  processing-instruction()  processing instruction nodes
  node()                    all nodes (not including attributes and namespace declarations)
Testing by node name:
  name                      nodes with that name
  *                         any node
61
Predicates
Expressions coerced to type boolean
A predicate filters a node-set by evaluating the predicate expression on each node in the set, with that node as the context node
Different predefined predicates are available
Predicates can be conjunctions, disjunctions, or negations of other predicates
62
Examples of Predicates
position() returns the position of the node being tested within the node-set being filtered
  ex: author[position()=1], abbreviated author[1]
last() returns the size of that node-set, so author[last()] selects the last author
id(value) returns the element whose ID is equal to "value"
  ex: entry[id("1-55860-622-X")]
text() selects the text content of a node
  ex: author[text()="Serge Abiteboul"]
63
Abbreviations
Normal syntax Abbreviation child:: nothing
attribute:: @
/descendant-or-self::node()/ //
self::node() .
parent::node() ..
Syntactic sugar: convenient notation for common situations
64
Examples
/books/book/author
//author
//author[1]
//book[entry/@isbn="1-5586.." or author="..."]
//book[author][last()>2]
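Some of these expressions can be tried with Python's standard xml.etree.ElementTree, which supports only a limited subset of XPath (relative paths, //, positional predicates, attribute and child-text tests, but no "or"). The sketch below is therefore an approximation under that assumption, run on an abridged copy of the books document.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<books>
<book>
  <entry isbn="1-55860-622-X"><title>Data on the Web:...</title></entry>
  <author>Serge Abiteboul</author><author>Peter Buneman</author><author>Dan Suciu</author>
</book>
<book>
  <entry isbn="0-201-53771-O"><title>Foundation of Databases</title></entry>
  <author>Serge Abiteboul</author>
</book>
</books>"""

root = ET.fromstring(BOOKS_XML)  # root is <books>, so paths are relative to it

# //author : every author element in the document
all_authors = [a.text for a in root.findall(".//author")]

# //author[1] : the first author child of each parent
first_authors = [a.text for a in root.findall(".//author[1]")]

# /books/book/author[last()] : the last author of each book
last_authors = [a.text for a in root.findall("./book/author[last()]")]

# Attribute test followed by a further step
title_by_isbn = root.findtext(".//entry[@isbn='1-55860-622-X']/title")

print(all_authors, first_authors, last_authors, title_by_isbn)
```

Note that //author[1] yields one author per book, not a single node, because the position is counted among the siblings of each parent.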
65
XSL: eXtensible Stylesheet Language
66
Why Stylesheets?
XML is not a fixed tag set (like HTML) and has no (application) semantics
XML markup does not (usually) include formatting information
Reuse: the same content can look different in different contexts
Multiple output formats: different media (paper, online), different sizes (manuals, reports), different classes of output devices (palmtops, workstations, cellular phones)
67
Why Style sheets? (Cont.)
Standardized styles: corporate style sheets can be applied to the content at any time
Freedom from style issues for content authors: technical writers need not be concerned with layout issues because the correct style can be applied later
Therefore there must be something in addition to the XML document that provides information on how to present or otherwise process the XML
68
XSL
XSL stands for eXtensible Style Sheet Language
XSL is currently being prepared by the W3C
XSL is a style sheet language that allows one to address the previous issues
69
XSL - More than a Style Sheet
XSL consists of two parts:
  a method for transforming XML documents
  a method for formatting XML documents
70
Transforming XML documents
An input document can be transformed into another document by:
  generation of constant text
  suppression of content
  moving text (e.g. exchanging the order of the first and last name)
  duplicating text (e.g. copying titles to make a table of contents)
  sorting
  more complex transformations that "compute" new information in terms of the existing information
71
Formatting XML documents
A description of how to present the transformed information (i.e. specification of what properties to associate to each of the various parts of the transformed info)
This includes:
  Specification of the general screen or page layout
  Assignment of the transformed content into basic "content container types" (e.g. lists, paragraphs, inline text)
  Specification of formatting properties (spacing, margins, alignment, fonts, etc.) for each resulting "container"
72
Components of XSL
The full XSL language logically consists of 3 component languages, which are described in 3 W3C Recommendations:
  XPath
  XSLT: XSL Transformations, a language for describing how to transform one XML document into another
  XSL: Extensible Stylesheet Language, i.e. XSLT plus a description of a set of Formatting Objects and Formatting Properties
73
XSLT Processing Model
[Figure: an XML source tree and an XSLT stylesheet are the inputs of the transformation, which produces a result tree (XML, HTML, text, …)]
74
XSLT Elements
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  root element of an XSLT stylesheet "program"
<xsl:template match=pattern name=qname priority=number mode=qname>
  ...template...
</xsl:template>
  declares a rule: (pattern => template)
<xsl:apply-templates select = node-set-expression mode = qname>
  apply templates to selected children (default=all); optional mode attribute
<xsl:call-template name=qname>
75
XSLT Processing Model
XSL stylesheet: collection of template rules
template rule: (pattern => template)
main steps:
  match pattern against source tree (expressed in XPath)
  instantiate template (replace current node "." by the template in the result tree)
  select further nodes for processing (expressed in XPath)
control can be a mix of
  recursive processing (<xsl:apply-templates> ...)
  program-driven (<xsl:for-each> ...)
76
Template Rule: Example
<xsl:template match="product">
  <table> <xsl:apply-templates select="sales/domestic"/> </table>
  <table> <xsl:apply-templates select="sales/foreign"/> </table>
</xsl:template>
The match attribute holds the pattern; the content of xsl:template is the template.
(i) match pattern: process <product> elements
(ii) instantiate template: replace each product element with two HTML tables
(iii) select the <product> grandchildren ("sales/domestic", "sales/foreign") for further processing
77
An XML document
<?xml version="1.0"?>
<books>
  <book category="reference">
    <author>Nigel Rees</author><title>Sayings of the Century</title><price>8.95</price>
  </book>
  <book category="fiction">
    <author>Evelyn Waugh</author><title>Sword of Honour</title><price>12.99</price>
  </book>
  <book category="fiction">
    <author>Herman Melville</author><title>Moby Dick</title><price>8.99</price>
  </book>
  <book category="fiction">
    <author>J. R. R. Tolkien</author><title>The Lord of the Rings</title><price>22.99</price>
  </book>
</books>
78
An XSL stylesheet
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="books">
    <html><body><h1>A list of books</h1><table width="640"><xsl:apply-templates/></table></body></html>
  </xsl:template>
  <xsl:template match="book">
    <tr><td><xsl:number/></td><xsl:apply-templates/></tr>
  </xsl:template>
  <xsl:template match="author | title | price">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
</xsl:stylesheet>
79
The result (textual version)
<html>
  <body>
    <h1>A list of books</h1>
    <table width="640">
      <tr> <td>1</td> <td>Nigel Rees</td> <td>Sayings of the Century</td> <td>8.95</td> </tr>
      <tr> <td>2</td> <td>Evelyn Waugh</td> <td>Sword of Honour</td> <td>12.99</td> </tr>
      …...
    </table>
  </body>
</html>
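To make the processing model concrete, the following Python sketch mimics by hand what the three template rules do on the first two books. It is a hand-written analogue using the standard xml.etree.ElementTree module, not an XSLT processor, and the function name transform is an assumption.

```python
import xml.etree.ElementTree as ET

SOURCE = """<books>
<book category="reference"><author>Nigel Rees</author><title>Sayings of the Century</title><price>8.95</price></book>
<book category="fiction"><author>Evelyn Waugh</author><title>Sword of Honour</title><price>12.99</price></book>
</books>"""

def transform(books):
    # Rule for "books": emit the page skeleton, then process the children.
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = "A list of books"
    table = ET.SubElement(body, "table", width="640")
    for number, book in enumerate(books.findall("book"), start=1):
        # Rule for "book": one row, starting with the book's position (xsl:number).
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = str(number)
        for child in book:
            # Rule for "author | title | price": one cell holding the text value.
            if child.tag in ("author", "title", "price"):
                ET.SubElement(row, "td").text = child.text
    return html

result = transform(ET.fromstring(SOURCE))
first_row = [cell.text for cell in result.find("body/table/tr")]
print(ET.tostring(result, encoding="unicode"))
```

In XSLT the nesting of the loops is implicit: each xsl:apply-templates hands the children to whichever rule matches them.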
80
The result (shown on a browser)
81
Query Languages for XML
82
Querying XML
Different XML QL paradigms depending on the community:
  (relational, oo, semistructured) database perspective
    Lorel, YaTL, XML-QL, XMAS, FLORA/FLORID, ...
  document processing perspective
    XQL, XSL(T), XPath, ...
  functional programming perspective
    QLs with structural recursion, …
Patching desirable features together: Quilt
83
Important QL Features (DB Perspective)
typical parts of a query:
  (match) pattern (selects parts of the source XML tree without looking at data)
  filter condition (selects further, now looking at the data)
  answer construction (putting the results together, possibly reordered, grouped, etc.)
reordering based on nested queries, grouping, sorting
tag variables, path expressions for defining the patterns without requiring knowledge of the DTD
84
Querying XML
No "official" W3C XML QL yet (but bits and pieces)
Numerous, quite different XML QLs are popping up
Some XML QL overviews, comparisons, and resources:
  XML Query Languages: Experiences and Exemplars (co-authored by several XML QL gurus)
    http://www-db.research.bell-labs.com/user/simeon/xquery.html
  XML and Query Languages (Oasis Cover Pages)
    http://www.oasis-open.org/cover/xmlQuery.html
  Comparative Analysis of Five XML Query Languages (A. Bonifati, S. Ceri)
    http://www.acm.org/sigmod/record/issues/0003/bonifati.pdf.gz
  A Data Model and Algebra for XML Query (Philip Wadler et al., "functional (Haskell) perspective")
    http://cm.bell-labs.com/cm/cs/who/wadler/topics/xml.html
  XML-QL vs XSLT queries (Geert Jan Bex and Frank Neven)
    http://alpha.luc.ac.be/~gjb/xml-ql2xslt.html
Children of the "(semistructured) database(s) crowd": XML-QL, YaTL, Lorel, …
… from the "functional crowd":
… from the "document processing folks": XQL, XSL(T), XPath, ...
XPath: W3C Recommendation
  Powerful pattern language for selecting parts of XML docs
  Used by XSL(T), XPointer, and XQL
XQL: based on XPath
  Browser: IE5
  XML DBs: Excelon, Tamino, Perl, …
85
Quilt
Source: Daniela Florescu, Jérôme Siméon, VLDB 2000
Goals:
  Put together the most effective features of several existing and proposed query languages
  Design a small, clean, implementable language
Antecedents:
  XPath and XQL
  XML-QL by A. Deutsch, M. Fernandez, D. Florescu, A. Levy, D. Suciu
  SQL and OQL
86
Antecedents: Xpath and XQL
Closely-related languages for navigating in a hierarchy
A path expression is a series of steps
Each step moves along an axis (children, ancestors, attributes, etc.) and may apply a predicate
XQL has some additional operators: BEFORE, AFTER
87
Antecedents: XML-QL
WHERE-clause binds variables according to a pattern, CONSTRUCT-clause generates output document
WHERE
  <part pno = $pno> $pname </> in "parts.xml",
  <supp sno = $sno> $sname </> in "supp.xml",
  <sp pno = $pno sno = $sno> $numofpart </> in "sp.xml"
CONSTRUCT
  <purchase>
    <partname> $pname </>
    <suppname> $sname </>
    <numofparts> $numofpart </>
  </purchase>
88
Antecedents SQL and OQL
SQL and OQL are database query languages
SQL derives a table from other tables by a stylized series of clauses: SELECT-FROM-WHERE
OQL is a functional language
  a query is an expression
  Expressions can take several forms
  Expressions can be nested and combined
  SELECT-FROM-WHERE is one form of OQL expression
89
An Example of Document: bib.xml
<bib>
  <book>
    <title>…</title>
    <author>…</author>
    ...
    <publisher>…</publisher>
    <year>…</year>
    <price>…</price>
  </book>
  ….
</bib>
90
Some examples of queries
“Find all the books published in 1998 by Jackson”
FOR $b IN document("bib.xml")//book
WHERE $b/year = "1998" AND $b/publisher = "Jackson"
RETURN $b SORTBY(author, title)
“Find title of books that have no authors”
<orphan_books>
  FOR $b IN document("bib.xml")//book
  WHERE empty($b/author)
  RETURN $b/title
</orphan_books>
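For readers who think more easily in a general-purpose language, the two queries above can be paraphrased in Python over an in-memory bib document. This is only an analogue of the Quilt semantics, written with the standard xml.etree.ElementTree module; the sample data below is hypothetical and exists only to make the sketch runnable.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data following the bib.xml structure.
BIB_XML = """<bib>
<book><title>Beta</title><author>Smith</author><publisher>Jackson</publisher><year>1998</year></book>
<book><title>Alpha</title><author>Jones</author><publisher>Jackson</publisher><year>1998</year></book>
<book><title>Gamma</title><publisher>Jackson</publisher><year>1997</year></book>
</bib>"""

bib = ET.fromstring(BIB_XML)

# "Find all the books published in 1998 by Jackson", sorted by author, then title.
jackson_1998 = [
    b for b in bib.iter("book")                      # FOR $b IN ...//book
    if b.findtext("year") == "1998"                  # WHERE ...
    and b.findtext("publisher") == "Jackson"
]
jackson_1998.sort(key=lambda b: (b.findtext("author", ""), b.findtext("title", "")))
jackson_titles = [b.findtext("title") for b in jackson_1998]

# "Find title of books that have no authors", wrapped in <orphan_books>.
orphan_books = ET.Element("orphan_books")
for b in bib.iter("book"):
    if b.find("author") is None:                     # WHERE empty($b/author)
        ET.SubElement(orphan_books, "title").text = b.findtext("title")
orphan_titles = [t.text for t in orphan_books]

print(jackson_titles, orphan_titles)
```

The Quilt versions state the same logic declaratively and leave the iteration strategy to the query processor.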
91
Some examples of queries (cont.)
“Invert the hierarchy from publishers inside books to books inside publishers”
FOR $p IN distinct(//publisher)
RETURN
  <publisher>
    <name> $p/text() </name>,
    FOR $b IN //book[publisher = $p]
    RETURN <book> $b/title, $b/price </book> SORTBY(price DESCENDING)
  </publisher> SORTBY(name)
92
Some examples of queries (cont.)
“Find the description and average price of each red part that has at least 10 orders”
FOR $p IN document("parts.xml")//part[color="Red"]
LET $o := document("orders.xml")//order[partno = $p/partno]
WHERE count($o) >= 10
RETURN
  <important_red_part>
    $p/description,
    <avg_price> avg($o/price) </avg_price>
  </important_red_part>
93
Quilt data model
A document in Quilt is a tree
In order to model parts of a document and collections of documents, Quilt supports the concept of a forest of trees
[Figure: a document drawn as a single tree and a forest drawn as several trees; E denotes an element, T a text node, A an attribute]
94
Quilt Expressions
Like OQL, Quilt is a functional language (a query is an expression, and expressions can be composed)
Some types of Quilt expressions:
  A path expression (using XPath syntax):
    document("bids.xml")//bid[itemno="47"]/bid_amount
  An expression using operators and functions:
    ($x + $y) * foo($z)
  An element constructor:
    <bid>
      <userid> $u </userid>
      <bid_amount> $a </bid_amount>
    </bid>
  A "FLWR" expression
95
Expression results
The result of the evaluation of a Quilt expression can be:
  an XML document (i.e. a tree)
  a set of elements (i.e. a forest)
  a primitive value (e.g. a string)
96
A FLWR Expression
A FLWR expression binds some variables, applies a predicate, and constructs a new result
FOR … LET … WHERE … RETURN …
[Figure: syntax diagram combining the FOR clause, LET clause, WHERE clause and RETURN clause]
97
FOR clause
FOR is used for iterating over one or more collections
Each expression evaluates to a collection of nodes
The FOR clause produces many binding-tuples from the Cartesian product of these collections
In each tuple, the value of each variable is one node and its descendants
The order of the tuples preserves document order unless some expression contains a non-order-preserving function such as distinct()
Syntax: FOR var IN expr, var IN expr, ...
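The way a FOR clause with several variables produces binding-tuples can be pictured with a Cartesian product. The sketch below uses Python's itertools.product on two hypothetical collections; the names parts and suppliers and their contents are assumptions for illustration only.

```python
from itertools import product

# Hypothetical collections that two FOR expressions might evaluate to.
parts = ["part-1", "part-2"]
suppliers = ["supp-A", "supp-B", "supp-C"]

# FOR $p IN parts, $s IN suppliers: one binding-tuple per combination.
binding_tuples = [{"$p": p, "$s": s} for p, s in product(parts, suppliers)]

for binding in binding_tuples:
    print(binding)
```

The tuples come out in the order of the input collections, with the first variable varying slowest, which mirrors the order-preservation rule stated above.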
98
LET clause
Syntax: LET var := expr, var := expr, ...
LET is used for binding variables (without iteration)
A LET clause produces one binding for each variable (therefore it does not affect the number of binding-tuples)
The variable is bound to the value of the expression, which may contain many nodes (i.e. the set of nodes returned by the expression evaluation is linked into a forest of nodes)
Document order is preserved among the nodes in each bound collection, unless some expression contains a non-order-preserving function such as distinct()
99
WHERE clause
Syntax: WHERE boolean-expression
Applies a predicate to the tuples of bound variables
Retains only tuples that satisfy the predicate
Preserves order of tuples, if any
May contain AND, OR, NOT
Applies scalar conditions to scalar variables:
  $color = "Red"
Applies set conditions to variables bound to sets:
  avg($emp/salary) > 1000
100
RETURN clause
Syntax: RETURN expr
Constructs the result of the FLWR expression
Executed once for each tuple of bound variables
Preserves order of tuples, if any, or can impose a new order using a SORTBY clause
Often uses an element constructor:
  <item>
    $item/itemno,
    <avg_bid> avg($b/bid_amount) </avg_bid>
  </item> SORTBY itemno
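Putting the four clauses together, a FLWR expression reads much like a loop with a local binding, a filter, and a constructed result. The Python sketch below is an analogue under stated assumptions: the items and bids data are hypothetical, and the threshold of two bids is chosen only so the filter has a visible effect.

```python
from statistics import mean

# Hypothetical data standing in for item and bid elements.
items = [{"itemno": "47"}, {"itemno": "12"}]
bids = [
    {"itemno": "47", "bid_amount": 10.0},
    {"itemno": "47", "bid_amount": 20.0},
    {"itemno": "12", "bid_amount": 5.0},
]

results = []
for item in items:                                             # FOR $item IN ...
    b = [bid for bid in bids if bid["itemno"] == item["itemno"]]  # LET $b := ...
    if len(b) >= 2:                                            # WHERE count($b) >= 2
        results.append({                                       # RETURN <item>...</item>
            "itemno": item["itemno"],
            "avg_bid": mean(bid["bid_amount"] for bid in b),
        })
results.sort(key=lambda r: r["itemno"])                        # SORTBY itemno

print(results)
```

FOR multiplies the number of tuples, LET only adds a binding to each tuple, WHERE discards tuples, and RETURN runs once per surviving tuple.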