
Post on 14-Dec-2015


Page 1: Dr. Alexandra I. Cristea acristea/ XML

Dr. Alexandra I. Cristea

http://www.dcs.warwick.ac.uk/~acristea/

XML

Page 2: Dr. Alexandra I. Cristea acristea/ XML

XML history
• Inception: circa 1996
• The Extensible Markup Language (XML) became a W3C Recommendation on 10 February 1998.
• It is currently used in very many places (see HESA)

Page 3: Dr. Alexandra I. Cristea acristea/ XML

What is XML?
• XML stands for EXtensible Markup Language
• XML was designed to describe data
• XML is more of a standard and supporting structure than a standalone programming language
• "XML is a markup language much like HTML" is wrong: XML is a meta-language, used to define markup languages

Page 4: Dr. Alexandra I. Cristea acristea/ XML

How does XML work?
• XML tags are not predefined. You must define your own tags
• XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
• XML with a DTD or XML Schema is designed to be self-descriptive

Page 5: Dr. Alexandra I. Cristea acristea/ XML

XML is Free and Extensible
• XML tags are not predefined. You must "invent" your own tags.
• The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.).
• XHTML is XML but not vice-versa.

Page 6: Dr. Alexandra I. Cristea acristea/ XML

XML does not DO anything
• XML was created to structure, store and send information

<note>
  <to>John</to>
  <from>Jane</from>
  <heading>Reminder</heading>
  <body>Don't forget the book!</body>
</note>
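Since XML itself does nothing, some program has to read the note. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module as the consuming software; the helper name read_note is hypothetical and not part of the slides.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
  <to>John</to>
  <from>Jane</from>
  <heading>Reminder</heading>
  <body>Don't forget the book!</body>
</note>"""


def read_note(xml_text):
    """Return the children of the root element as a tag -> text dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


note = read_note(NOTE_XML)
print(note["from"], "->", note["to"], ":", note["body"])
```

The XML only carries the data; what happens with it (printing, storing, sending) is decided entirely by the program.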

Page 7: Dr. Alexandra I. Cristea acristea/ XML

Main Difference XML, HTML
• XML was designed to carry data.
• XML is not a replacement for HTML. XML and HTML were designed with different goals:
  – XML was designed to describe data and to focus on what data is.
  – HTML was designed to display data and to focus on how data looks.
• HTML is about displaying information, while XML is about describing information.
• Syntax: XML documents must be well formed, just like XHTML documents

Page 8: Dr. Alexandra I. Cristea acristea/ XML

XML is a Complement to HTML
• XML is not a replacement for HTML.
  – In Web development XML is used to describe the data, while HTML is used to format and display the same data.
• XML is a cross-platform, software and hardware independent tool for transmitting information.

Page 9: Dr. Alexandra I. Cristea acristea/ XML

Benefits XML
• The extensibility and structured nature of XML allow it to be used for communication between different systems
• From one source of XML-based information you can format and distribute content via a multitude of different channels
  – XSL files act as templates, allowing a single stylesheet to be used to format multiple pages, or the same content for multiple distribution channels

Page 10: Dr. Alexandra I. Cristea acristea/ XML

XML in Web Development
• XML is everywhere.
• The XML standard has been developed quickly, and a large number of software vendors have adopted it.
• XML might be the most common tool for all data manipulation and data transmission.

Page 11: Dr. Alexandra I. Cristea acristea/ XML

XML Can be Used to Create New Languages
• XML is the mother of WML, the markup language used with WAP.
  – WAP (Wireless Application Protocol): a standard for web access from mobile devices
  – The Wireless Markup Language (WML), used to mark up Internet applications for handheld devices like mobile phones, is written in XML.
• And many others … search for more as homework

Page 12: Dr. Alexandra I. Cristea acristea/ XML

• Question: When should I use XML?
• Answer: When you need a buzzword in your resume.

Page 13: Dr. Alexandra I. Cristea acristea/ XML

Viewing XML
• To view XML documents hierarchically, or to view their output, you need an XML parser and processor.
• A number of these tools are available. See examples at:
  – http://www.stylusstudio.com/xml_download.html
  – http://www.w3schools.com/xml/xml_parser.asp
• Please note, however: XML was not designed to display data.

Page 14: Dr. Alexandra I. Cristea acristea/ XML


The basic XML flow

Page 15: Dr. Alexandra I. Cristea acristea/ XML

XML-based languages

• RSS
• Twitter API
• MathML
• SVG
• SOAP
• WSDL
• Microsoft Office (pptx, docx, xlsx)
• Open Office XML
• SMIL
• RDF

Page 16: Dr. Alexandra I. Cristea acristea/ XML

XML Rules
1. Every start-tag must have a matching end-tag.
2. Tags cannot overlap. Proper nesting is required.
3. XML documents can only have one root element.
4. Element names must obey the following XML naming conventions:
   a) Names must start with a letter or the "_" character. Names cannot start with digits or other punctuation characters.
   b) After the first character, digits, hyphens, periods and underscores are also allowed.

Page 17: Dr. Alexandra I. Cristea acristea/ XML

XML Rules (cont.)
   c) Names cannot contain spaces.
   d) Names should not contain the ":" character, as it is a "reserved" character.
   e) Names cannot start with the letters "xml" in any combination of case.
   f) The element name must come directly after the "<" without any spaces between them.
5. XML is case sensitive.
6. XML preserves white space within text.
7. Elements may contain attributes. If an attribute is present, it must have a value, even if it is an empty string "".

Page 18: Dr. Alexandra I. Cristea acristea/ XML

Spot the error!

<?xml version="1.0" encoding="ISO-8859-1"?>
<note date=12/11/2002>
  <to>Tove</to>
  <from>Jani</from>
</note>

Page 19: Dr. Alexandra I. Cristea acristea/ XML

Spot the error! (corrected: the attribute value must be quoted)

<?xml version="1.0" encoding="ISO-8859-1"?>
<note date="12/11/2002">
  <to>Tove</to>
  <from>Jani</from>
</note>
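A parser can do the error spotting for us. The sketch below assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper name, and the XML declaration is left out of the test strings for brevity. It only checks well-formedness, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it reports a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Attribute value without quotes: violates rule 7 (an attribute must have a quoted value).
BROKEN = "<note date=12/11/2002><to>Tove</to><from>Jani</from></note>"
# Same document with the attribute value quoted.
FIXED = '<note date="12/11/2002"><to>Tove</to><from>Jani</from></note>'

print("broken:", is_well_formed(BROKEN))
print("fixed: ", is_well_formed(FIXED))
```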

Page 20: Dr. Alexandra I. Cristea acristea/ XML

With XML, CR / LF is converted to LF
• Windows: CR + LF
• Unix: LF
• Macintosh: CR
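The line-ending normalization can be observed directly. This is a small sketch assuming Python's standard xml.etree.ElementTree parser (which is built on expat); the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The same two-line text written with the three line-ending conventions.
windows_doc = "<body>line one\r\nline two</body>"   # CR + LF
old_mac_doc = "<body>line one\rline two</body>"     # CR
unix_doc = "<body>line one\nline two</body>"        # LF

# After parsing, the application sees the text content of each document.
texts = [ET.fromstring(doc).text for doc in (windows_doc, old_mac_doc, unix_doc)]
for text in texts:
    print(repr(text))
```

Whatever convention the file used, the application receives a single LF per line break, so XML-processing code does not need to special-case platforms.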

Page 21: Dr. Alexandra I. Cristea acristea/ XML

There is Nothing Special About XML
• XML is plain text with XML tags
• Software that can handle plain text can also handle XML.
• In an XML-aware application, the XML tags can be handled specially:
  – Visibility,
  – Functional meaning, etc.

Page 22: Dr. Alexandra I. Cristea acristea/ XML


Is this an error?

<note>

<to>Tove</to>

<from>Jani</from>

<body>Don't forget me this weekend!</body>

</note>

<heading>Reminder</heading>
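One way to answer the question is to hand the text to a parser. The sketch below assumes Python's standard xml.etree.ElementTree module and uses shortened versions of the elements above; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# <heading> sits after </note>, so the text has two top-level elements.
TWO_ROOTS = "<note><to>Tove</to><from>Jani</from></note><heading>Reminder</heading>"
# Moving <heading> inside <note> leaves a single root element.
ONE_ROOT = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading></note>"

try:
    ET.fromstring(TWO_ROOTS)
    two_roots_ok = True
except ET.ParseError as err:
    two_roots_ok = False
    print("not well formed:", err)

one_root = ET.fromstring(ONE_ROOT)
print("root element:", one_root.tag)
```

The parser rejects the first text because an XML document can only have one root element (rule 3).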

Page 23: Dr. Alexandra I. Cristea acristea/ XML

XML Elements have Relationships
• Elements are related as parents and children.
• Root element / Parents
• Children / Siblings

Page 24: Dr. Alexandra I. Cristea acristea/ XML

Elements
• An element consists of all the information from the beginning of a start-tag to the end of an end-tag, including everything in between.
• E.g. from (X)HTML, all of the following is one element, named h1:
  <h1>This is a heading.</h1>
  – Here <h1> is the start tag, </h1> is the end tag, and the content is in between.
• Each XML document has a root element within which all other elements are nested.

Page 25: Dr. Alexandra I. Cristea acristea/ XML

Examples
• See at:
  – http://www.dcs.warwick.ac.uk/~acristea/courses/CS253/2009/books.xml
  – http://feeds.bbci.co.uk/news/rss.xml
  – Search for more by yourself and familiarize yourself with the syntax!

Page 26: Dr. Alexandra I. Cristea acristea/ XML

XML Attributes
• XML elements can have attributes.
• From HTML you will remember this: <IMG SRC="computer.gif">
• The SRC attribute provides additional information about the IMG element.

Page 27: Dr. Alexandra I. Cristea acristea/ XML

Attributes versus Elements
• As an attribute:
  <person sex="female">
    <firstname>Anna</firstname>
    <lastname>Smith</lastname>
  </person>
• As a child element:
  <person>
    <sex>female</sex>
    <firstname>Anna</firstname>
    <lastname>Smith</lastname>
  </person>
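Both encodings carry the same information, but a program reads them differently. A minimal sketch, assuming Python's standard xml.etree.ElementTree module:

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring(
    '<person sex="female">'
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)
as_element = ET.fromstring(
    "<person><sex>female</sex>"
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)

# An attribute is looked up on the element itself ...
sex_from_attribute = as_attribute.get("sex")
# ... while a child element is found by navigating the tree.
sex_from_element = as_element.findtext("sex")

print(sex_from_attribute, sex_from_element)
```

The choice between the two is a design decision; child elements can later be given structure or repeated, whereas an attribute holds one plain value.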

Page 28: Dr. Alexandra I. Cristea acristea/ XML

Comments
• As in other languages, comments are line(s) whose sole purpose is to give the developer, and anyone reading the code in the future, information about the code.

<!-- all the comments go in here -->

Page 29: Dr. Alexandra I. Cristea acristea/ XML


XML Validation: Well Formed-ness

• An XML document is well formed, if all the XML rules are obeyed.

(with 7 XML rules as defined in slides 16-17)

Page 30: Dr. Alexandra I. Cristea acristea/ XML

XML declaration
• An XML document should begin with an XML declaration (not mandatory, but good practice):
  <?xml version="1.0"?>
• Or, using optional attributes:
  <?xml version="1.0" encoding="UTF-16" standalone="yes"?>

Page 31: Dr. Alexandra I. Cristea acristea/ XML

Document Type Definition (DTD)
A DTD defines:
• which tags and attributes are allowed,
• where they can be placed, and
• whether or not they can be nested
within a given document.

Page 32: Dr. Alexandra I. Cristea acristea/ XML

Document Type Declaration (DOCTYPE)
• <!DOCTYPE MovieCatalog SYSTEM "movie_catalog.dtd">
  – MovieCatalog: root document element
  – "movie_catalog.dtd": URL to DTD (external subset via a system identifier)

Page 33: Dr. Alexandra I. Cristea acristea/ XML

Internal vs External DTD declaration

Internal:
<!DOCTYPE foo [ <!ENTITY greeting "hello"> ]>

External, public:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">

Page 34: Dr. Alexandra I. Cristea acristea/ XML

Valid XML Documents
• A "Valid" XML document is a "Well Formed" XML document which also conforms to the rules of a Document Type Definition (DTD):

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
  <to>Tom</to>
  <from>Jane</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 35: Dr. Alexandra I. Cristea acristea/ XML

Validators
• http://xmlvalidator.new-studio.org/
• Also at: http://validator.w3.org/#validate_by_input

Page 36: Dr. Alexandra I. Cristea acristea/ XML

Internal DTD

<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 37: Dr. Alexandra I. Cristea acristea/ XML


External DTD

<!ELEMENT note (to,from,heading,body)>

<!ELEMENT to (#PCDATA)>

<!ELEMENT from (#PCDATA)>

<!ELEMENT heading (#PCDATA)>

<!ELEMENT body (#PCDATA)>

>> saved as file Note.dtd

Page 38: Dr. Alexandra I. Cristea acristea/ XML

Attributes in DTDs
• Syntax:
  <!ATTLIST element-name attribute-name attribute-type attribute-value>
• Example:
  <!ATTLIST body lang CDATA "EN">
• Usage (in XML doc):
  <body lang="EN"/>
• attribute-value can be an actual (default) value, #REQUIRED, #IMPLIED, or #FIXED followed by a value

Page 39: Dr. Alexandra I. Cristea acristea/ XML

XML Schema (XSD)
• XML Schema is an XML-based alternative to DTD.
• It is supported by the W3C: http://www.w3.org/XML/Schema

Page 40: Dr. Alexandra I. Cristea acristea/ XML

Displaying your XML Files with CSS?
• It is possible to use CSS to format an XML document.
• Example:
  – XML file: the CD catalog
  – Style sheet: the CSS file
  – Product: the CD catalog formatted with the CSS file
• The second line of the XML file links it to the CSS file:
  <?xml-stylesheet type="text/css" href="cd_catalog.css"?>

Page 41: Dr. Alexandra I. Cristea acristea/ XML

Displaying XML with XSL
• XSL is the preferred style sheet language of XML.
• XSL (the eXtensible Stylesheet Language) is far more sophisticated than CSS.
• Examples:
  – View the XML file, the XSL style sheet, and the result.

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>

Page 42: Dr. Alexandra I. Cristea acristea/ XML

XML Conclusions
• We have learned:
  – XML history
  – What it is
  – How it works
  – Differences to (X)HTML
  – XML flow
  – XML Rules
  – XML Elements, Relationships, Attributes, Comments
  – Well-formed-ness concept
  – XML supporting frame: XML Schema or DTD
  – Generics on displaying XML

Page 43: Dr. Alexandra I. Cristea acristea/ XML

Why an XML Editor?
• XML Schema to define XML structures and data types
• XSLT to transform XML data
• SOAP to exchange XML data between applications
• WSDL to describe web services
• RDF to describe web resources
• XPath and XQuery to access XML data
• SMIL to define multimedia presentations
• Altova's XMLSpy
  – 30 days free trial
  – http://www.altova.com/xmlspy.html
  – http://www.altova.com/products/xmlspy/xsl_xslt_editor.html

Page 44: Dr. Alexandra I. Cristea acristea/ XML

• Next:
  – We look at how to access elements and attributes inside the XML
  – This can be done via …
  – XPATH

Page 45: Dr. Alexandra I. Cristea acristea/ XML

• Previously we looked at:
  – XML
• Next:
  – XPath
  – Namespaces

Page 46: Dr. Alexandra I. Cristea acristea/ XML

Validity versus well-formedness
• Checking for well-formedness:
  – http://www.xmlvalidation.com/
  – http://www.w3schools.com/xml/xml_validator.asp
• Checking for validity:
  – http://www.validome.org/xml/validate/
• As homework, check the validity and well-formedness of various XML documents in this course and on the web. Pay attention to the difference.

Page 47: Dr. Alexandra I. Cristea acristea/ XML

XPath
http://www.w3.org/TR/xpath20/

Page 48: Dr. Alexandra I. Cristea acristea/ XML

XPath
• XPath is a syntax for defining parts of an XML document
• XPath uses path expressions to navigate in XML documents
• XPath contains a library of standard functions
• XPath is a major element in XSLT
• XPath is a W3C Recommendation, thus a standard (XPath 1.0: 16 November 1999; newest: XPath 2.0, December 2010)

Page 49: Dr. Alexandra I. Cristea acristea/ XML

XPath Path Expressions
• XPath uses path expressions to select nodes or node-sets in an XML document.
  – These path expressions look very much like the expressions you see when you work with a traditional computer file system.

Page 50: Dr. Alexandra I. Cristea acristea/ XML

XPath Standard Functions
• Over 100 built-in functions for:
  – string values,
  – numeric values,
  – date and time comparison,
  – node and QName manipulation,
  – sequence manipulation,
  – Boolean values,
  – and more.

Page 51: Dr. Alexandra I. Cristea acristea/ XML

XPath Terminology
• Nodes
• Atomic values
• Items (atomic values or nodes)
• Relationships of nodes
  – Parent
  – Children
  – Siblings
  – Ancestors
  – Descendants

Page 52: Dr. Alexandra I. Cristea acristea/ XML

XPath Nodes
• 7 kinds of nodes:
  1. element,
  2. attribute,
  3. text,
  4. namespace,
  5. processing-instruction,
  6. comment, and
  7. document (root) nodes.
• XML documents are treated as trees of nodes. The root of the tree is called the document node (or root node).

Page 53: Dr. Alexandra I. Cristea acristea/ XML

Nodes Examples

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>

Labelled on the slide: document (root) node, element node, attribute node

Page 54: Dr. Alexandra I. Cristea acristea/ XML

Atomic values Examples*

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>

* Atomic values: nodes with no children or parent

Page 55: Dr. Alexandra I. Cristea acristea/ XML


Selecting nodes

Expression Description

nodename Selects all child nodes with this name

/ Selects from the root node

// Selects nodes in the document from the current node down that match the selection no matter where they are

. Selects the current node

.. Selects the parent of the current node

@ Selects attributes

Page 56: Dr. Alexandra I. Cristea acristea/ XML

Examples of selecting nodes

Path Expression    Result
bookstore          Selects all the bookstore elements
/bookstore         Selects the root element bookstore. Note: If the path starts with a slash ( / ) it always represents an absolute path to an element!
bookstore/book     Selects all book elements that are children of bookstore
//book             Selects all book elements no matter where they are in the document
bookstore//book    Selects all book elements that are descendants of the bookstore element, no matter where they are under the bookstore element
//@lang            Selects all attributes that are named lang
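Some of these path expressions can be tried without a dedicated XPath tool. The sketch below assumes Python's standard xml.etree.ElementTree module, which implements only a subset of XPath and evaluates paths relative to an element, so the expressions are approximations of the ones in the table. The second book is made-up sample data.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)  # root is the <bookstore> element

# bookstore/book: book elements that are children of bookstore
child_books = root.findall("book")
# //book: book elements anywhere below the current element
all_books = root.findall(".//book")
# //@lang is not supported by ElementTree; reading the attribute stands in for it
langs = [title.get("lang") for title in root.findall(".//title")]

print(len(child_books), len(all_books), langs)
```

A full XPath engine (for example the online testers listed later in these slides) accepts the expressions exactly as written in the table.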

Page 57: Dr. Alexandra I. Cristea acristea/ XML

Predicates
• Predicates are used to find a specific node or a node that contains a specific value.
• Predicates are always embedded in square brackets.

Page 58: Dr. Alexandra I. Cristea acristea/ XML

Example predicates

Path Expression                  Result
/bookstore/book[1]               Selects the first book element that is the child of the bookstore element
/bookstore/book[last()]          Selects the last book element that is the child of the bookstore element
/bookstore/book[last()-1]        Selects the last but one book element that is the child of the bookstore element
/bookstore/book[position()<3]    Selects the first two book elements that are children of the bookstore element

Page 59: Dr. Alexandra I. Cristea acristea/ XML

Example predicates – cont.

Path Expression                       Result
//title[@lang]                        Selects all the title elements that have an attribute named lang
//title[@lang='eng']                  Selects all the title elements that have an attribute named lang with a value of 'eng'
/bookstore/book[price>35.00]          Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title    Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
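A few of these predicates can be tried in code. This is a sketch assuming Python's standard xml.etree.ElementTree module, which supports positional and attribute predicates but not numeric comparisons such as price>35.00; that filter is therefore written in plain Python. The three books are sample data chosen for illustration.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

first = root.find("book[1]").findtext("title")               # /bookstore/book[1]
last = root.find("book[last()]").findtext("title")           # /bookstore/book[last()]
last_but_one = root.find("book[last()-1]").findtext("title") # /bookstore/book[last()-1]

with_lang = [t.text for t in root.findall(".//title[@lang]")]  # //title[@lang]
english = [t.text for t in root.findall(".//title[@lang='en']")]

# /bookstore/book[price>35.00]/title, with the comparison done in Python
expensive = [
    book.findtext("title")
    for book in root.findall("book")
    if float(book.findtext("price")) > 35.00
]

print(first, last, last_but_one, with_lang, expensive)
```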

Page 60: Dr. Alexandra I. Cristea acristea/ XML


Selecting Unknown Nodes

Wildcard Description

* Matches any element node

@* Matches any attribute node

node() Matches any node of any kind

Page 61: Dr. Alexandra I. Cristea acristea/ XML

Example: selecting several paths

Path Expression                     Result
//book/title | //book/price         Selects all the title as well as price elements of all book elements
//title | //price                   Selects all the title as well as price elements in the document
/bookstore/book/title | //price     Selects all the title elements of the book element of the bookstore element as well as all the price elements in the document

Page 62: Dr. Alexandra I. Cristea acristea/ XML

XPath Axes

self

child parent

ancestor descendant

ancestor-or-self descendant-or-self

preceding-sibling following-sibling

preceding following

attribute

namespace

Page 63: Dr. Alexandra I. Cristea acristea/ XML

axisname::nodetest[predicate]
• //DDD/parent::*

<AAA>
  <BBB>
    <DDD>
    </DDD>
  </BBB>
</AAA>

Page 64: Dr. Alexandra I. Cristea acristea/ XML

axisname::nodetest[predicate]
• //BBB/child::*

<AAA>
  <BBB>
    <DDD>
    </DDD>
  </BBB>
</AAA>

Note: /AAA is equivalent to /child::AAA
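The parent and child axes can be approximated in code. The sketch assumes Python's standard xml.etree.ElementTree module, which has no axisname:: syntax but offers the abbreviated forms ".." (parent) and "*" (any child element).

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<AAA><BBB><DDD></DDD></BBB></AAA>")

# //DDD/parent::*  ->  the parent of every DDD element
parents_of_ddd = [elem.tag for elem in root.findall(".//DDD/..")]

# //BBB/child::*   ->  every child element of every BBB element
children_of_bbb = [elem.tag for elem in root.findall(".//BBB/*")]

print(parents_of_ddd, children_of_bbb)
```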

Page 66: Dr. Alexandra I. Cristea acristea/ XML

XPath Try-out
• XPath online: http://www.freeformatter.com/xpath-tester.html#ad-output
• You can use an existing XML:
  – http://www.dcs.warwick.ac.uk/~acristea/courses/CS253/2008/books.xml
• Use XPath snippets from the module or create your own:
  – //title | //price
  – /bookstore/book[last()]
  – /bookstore/book[last()-1]
  – /bookstore/book[position()<3]
  – etc.

Page 67: Dr. Alexandra I. Cristea acristea/ XML

XPath Conclusion
• We have learned:
  – XPath definition
  – Path expressions
  – Standard functions
  – Terminology
  – Predicates
  – Location paths
  – Axes
  – Some operators

Page 68: Dr. Alexandra I. Cristea acristea/ XML


• Before we go on, one more thing about XML:

• XML Namespaces

Page 69: Dr. Alexandra I. Cristea acristea/ XML


Naming ambiguity

Page 70: Dr. Alexandra I. Cristea acristea/ XML

The Idea to Solve it
• Assign a URI (~ URL) to every sub-language:
  – E.g., for XHTML 1.0: http://www.w3.org/1999/xhtml
• QNames: qualify element names with URIs:
  – {http://www.w3.org/1999/xhtml}head
• See: Web Naming and Addressing Overview (URIs, URLs, ...)

Page 71: Dr. Alexandra I. Cristea acristea/ XML

The actual solution
• Namespace declarations bind URIs to prefixes
• Default namespace (no prefix) declared with: xmlns="…"
• Lexical scope
• Attribute names can also be prefixed
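Namespace-qualified names can be seen directly in a parser's output. The sketch below assumes Python's standard xml.etree.ElementTree module, which reports every element name in the {URI}localname form. The second namespace, http://example.org/widgets, and the prefix w are hypothetical and only serve to create a naming clash on "head".

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
WIDGETS = "http://example.org/widgets"  # hypothetical second vocabulary

DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:w="http://example.org/widgets">
  <head><title>Demo</title></head>
  <body><w:head>widget header</w:head></body>
</html>"""

root = ET.fromstring(DOC)

# The default namespace applies to the unprefixed <head> ...
xhtml_head = root.find("{%s}head" % XHTML)
# ... while the prefixed <w:head> belongs to the other namespace.
widget_head = root.find(".//w:head", {"w": WIDGETS})

# Both elements are called "head", but their qualified names differ.
qualified_heads = [e.tag for e in root.iter() if e.tag.endswith("}head")]
print(qualified_heads)
```

The prefix itself is only a local shorthand; what identifies the element is the URI it is bound to.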

Page 72: Dr. Alexandra I. Cristea acristea/ XML


Applying namespaces

Page 73: Dr. Alexandra I. Cristea acristea/ XML

• Next we look at how to query XML
• This can be done, to some extent, as we have seen, within XPath (or XSLT),
• but the main language developed for this purpose is …

Page 74: Dr. Alexandra I. Cristea acristea/ XML

XQuery
http://www.w3.org/TR/xquery/

Page 75: Dr. Alexandra I. Cristea acristea/ XML

• Previously we looked at:
  – XPath
  – Namespaces
• Next:
  – XQuery

Page 76: Dr. Alexandra I. Cristea acristea/ XML

What is XQuery?
• XQuery is the language for querying XML data
• XQuery for XML is like SQL for databases
• XQuery is built on XPath expressions
• XQuery is defined by the W3C
• XQuery is supported by all the major database engines (IBM, Oracle, Microsoft, etc.)
• XQuery is a W3C Recommendation (Jan 2007; latest 14 Dec 2010), thus a standard

Page 77: Dr. Alexandra I. Cristea acristea/ XML

XQuery - Examples of Use
• Extract information to use in a Web Service
• Generate summary reports
• Transform XML data to XHTML
• Search Web documents for relevant information

Page 78: Dr. Alexandra I. Cristea acristea/ XML

XQuery compared to XPath
• XQuery 1.0 and XPath 2.0 share the same data model and support the same functions and operators.
• XQuery 1.0 is a strict superset of XPath 2.0
• Every XPath 2.0 expression is directly an XQuery 1.0 expression (a query)
• The extra expressive power is the ability to:
  – Join information from different sources and
  – Generate new XML fragments

Page 79: Dr. Alexandra I. Cristea acristea/ XML

XQuery ‘compilers’
• Download: http://www.altova.com/altovaxml.html
• Syntax check at: http://www.w3.org/2007/01/applets/xqueryApplet.html
• XPath, XQuery tester:
  – http://www.xpathtester.com/xquery
  – http://brettz9.github.io/xqueryeditor/

Page 80: Dr. Alexandra I. Cristea acristea/ XML

XQuery query makeup
• Prolog
  – Like XPath, XQuery expressions are evaluated relative to a context
  – The context is explicitly provided by a prolog (~ a header with definitions)
• Body
  – The actual query
    • Select
    • Join
    • Generate

Page 81: Dr. Alexandra I. Cristea acristea/ XML


XQuery Ex.: Prolog + Query

Page 82: Dr. Alexandra I. Cristea acristea/ XML

XQuery Prolog (i.e., header(s))
• Settings define various parameters for the XQuery processor, such as:

xquery version "1.0";
declare base-uri "http://example.org";
declare default element namespace "http://example.org/names";
declare namespace xs = "http://www.w3.org/2001/XMLSchema";
import module "http://www.w3.org/2003/05/xpath-functions" at "logo.xq";
declare variable $x as xs:integer := 7;
(: user-declared functions need a namespace prefix such as local: and a body expression :)
declare function local:addLogo($root as node()) as node()* { () };
(: etc :)

Page 83: Dr. Alexandra I. Cristea acristea/ XML

Module definition

xquery version "1.0";
module namespace mylib = "http://www.example.com/test_library";

declare variable $mylib:foo as xs:string := "foo";

declare function mylib:foobar() as xs:string
{
  concat($mylib:foo, "bar")
};

Page 84: Dr. Alexandra I. Cristea acristea/ XML


Body: Constructors

Direct constructors in XQuery:

<XMLfragment>my fragment </XMLfragment>

– Evaluates to the given XML fragment

Page 85: Dr. Alexandra I. Cristea acristea/ XML


Explicit constructors

computed constructors

Page 86: Dr. Alexandra I. Cristea acristea/ XML

Variable bindings (implicit constructors)

<employee empid="{$id}">
  <name>{$name}</name>
  {$job}
  <deptno>{$deptno}</deptno>
  <salary>{$SGMLspecialist+100000}</salary>
</employee>

Page 87: Dr. Alexandra I. Cristea acristea/ XML

How to Select Nodes with XQuery?
• Functions
  – XQuery uses functions to extract data from XML documents.
• (X)Path Expressions
  – XQuery uses path expressions to navigate through elements in an XML document.
• Predicates
  – XQuery uses predicates to limit the extracted data from XML documents.

Page 88: Dr. Alexandra I. Cristea acristea/ XML

Functions
• doc()
  – function to open a file
• Example:
  – doc("books.xml")
• Note: A call to a function can appear wherever an expression may appear.

Page 89: Dr. Alexandra I. Cristea acristea/ XML

Path Expressions
• Example: select all the title elements in the "books.xml" file:

doc("books.xml")/bookstore/book/title

Page 90: Dr. Alexandra I. Cristea acristea/ XML

Predicates
• Example: select all the book elements under the bookstore element that have a price element with a value that is less than 30:

doc("books.xml")/bookstore/book[price<30]

Page 91: Dr. Alexandra I. Cristea acristea/ XML


At a glance: function, path, predicate

Page 92: Dr. Alexandra I. Cristea acristea/ XML

FLWOR
• For, Let, Where, Order by, Return
  = main engine
  ~ SQL syntax: SELECT-FROM-WHERE(-GROUP BY-HAVING)-ORDER BY
  ~ programs and function calls

Page 93: Dr. Alexandra I. Cristea acristea/ XML

FLWOR by comparison with Path expressions
• Select all the title elements under the book elements that are under the bookstore element and that have a price element with a value higher than 30.
• Path expression:
  doc("books.xml")/bookstore/book[price>30]/title
• FLWOR expression:
  for $x in doc("books.xml")/bookstore/book
  where $x/price>30
  return $x/title

Page 94: Dr. Alexandra I. Cristea acristea/ XML

Sorting in FLWOR

for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
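For readers more familiar with general-purpose languages, the for / where / order by / return clauses map closely onto a sorted comprehension. This is only an analogy sketched in Python with the standard xml.etree.ElementTree module, not how an XQuery processor works; the four books are sample data standing in for books.xml.

```python
import xml.etree.ElementTree as ET

BOOKS = """<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKS)

titles = sorted(                               # order by $x/title
    book.findtext("title")                     # return $x/title (text only)
    for book in root.findall("book")           # for $x in .../bookstore/book
    if float(book.findtext("price")) > 30      # where $x/price > 30
)

print(titles)
```

Note that the book priced exactly 30.00 is excluded, because the condition is strictly "greater than".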

Page 95: Dr. Alexandra I. Cristea acristea/ XML


Present the Result as HTML List

<ul>

{

for $x in doc("books.xml")/bookstore/book/title

order by $x

return <li>{$x}</li>

}

</ul>

Page 96: Dr. Alexandra I. Cristea acristea/ XML

Result HTML List

<ul>
  <li><title lang="en">Everyday Italian</title></li>
  <li><title lang="en">Harry Potter</title></li>
  <li><title lang="en">Learning XML</title></li>
  <li><title lang="en">XQuery Kick Start</title></li>
</ul>

Page 97: Dr. Alexandra I. Cristea acristea/ XML


Eliminate element (here: title)

<ul>

{

for $x in doc("books.xml")/bookstore/book/title

order by $x

return <li>{data($x)}</li> (: also text{} :)

}

</ul>

Page 98: Dr. Alexandra I. Cristea acristea/ XML


New result HTML List

<ul>

<li>Everyday Italian</li>

<li>Harry Potter</li>

<li>Learning XML</li>

<li>XQuery Kick Start</li>

</ul>
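The same list can be produced from a general-purpose language, which makes the role of data($x) visible: only the text of each title is copied into the new li element. A sketch in Python with the standard xml.etree.ElementTree module; the input is sample data standing in for books.xml.

```python
import xml.etree.ElementTree as ET

BOOKS = """<bookstore>
  <book><title lang="en">XQuery Kick Start</title></book>
  <book><title lang="en">Everyday Italian</title></book>
  <book><title lang="en">Learning XML</title></book>
  <book><title lang="en">Harry Potter</title></book>
</bookstore>"""

root = ET.fromstring(BOOKS)

ul = ET.Element("ul")
# Sort by title text, then emit one <li> per title containing only the text.
for title_text in sorted(t.text for t in root.findall("book/title")):
    ET.SubElement(ul, "li").text = title_text

html = ET.tostring(ul, encoding="unicode")
print(html)
```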

Page 99: Dr. Alexandra I. Cristea acristea/ XML

Another FLWOR Expression

<doubles>
{
  for $s in doc("students.xml")//student
  let $m := $s/major
  where count($m) ge 2
  order by $s/@id
  return <double>{ $s/name/text() }</double>
}
</doubles>

Page 100: Dr. Alexandra I. Cristea acristea/ XML

The Difference between for and let

(Slides 100-103: figures not preserved in this transcript. Recoverable labels: "in", used by the for clause, and ":=", used by the let clause.)

Page 104: Dr. Alexandra I. Cristea acristea/ XML


FLWOR Basic Building Blocks

Page 105: Dr. Alexandra I. Cristea acristea/ XML

General rules
• for and let may be used many times in any order
• only one where is allowed
• many different sorting criteria can be specified (descending, ascending, etc.)

Page 106: Dr. Alexandra I. Cristea acristea/ XML

Reversing order
• reverse() reverses the order of a sequence of nodes or atomic values
• reverse((1, 2, 3))
  -> (3, 2, 1)

Page 107: Dr. Alexandra I. Cristea acristea/ XML

Joining documents

for $p in doc("www.irs.gov/taxpayers.xml")//person
for $n in doc("neighbors.xml")//neighbor[ssn = $p/ssn]
return
  <person>
    <ssn> { $p/ssn } </ssn>
    { $n/name }
    <income> { $p/income } </income>
  </person>
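The join is a nested iteration with a matching condition. The following Python sketch mirrors that logic with the standard xml.etree.ElementTree module; the two small documents and the function name join_on_ssn are invented for illustration, since the real taxpayers.xml and neighbors.xml are not part of the slides.

```python
import xml.etree.ElementTree as ET

TAXPAYERS = """<taxpayers>
  <person><ssn>111</ssn><income>50000</income></person>
  <person><ssn>222</ssn><income>70000</income></person>
</taxpayers>"""

NEIGHBORS = """<neighbors>
  <neighbor><ssn>222</ssn><name>Bob</name></neighbor>
  <neighbor><ssn>333</ssn><name>Eve</name></neighbor>
</neighbors>"""


def join_on_ssn(taxpayers_xml, neighbors_xml):
    """Pair every person with every neighbor that has the same ssn."""
    persons = ET.fromstring(taxpayers_xml).findall(".//person")
    neighbors = ET.fromstring(neighbors_xml).findall(".//neighbor")
    joined = []
    for p in persons:                                   # for $p in ...//person
        for n in neighbors:                             # for $n in ...//neighbor
            if n.findtext("ssn") == p.findtext("ssn"):  # [ssn = $p/ssn]
                joined.append({
                    "ssn": p.findtext("ssn"),
                    "name": n.findtext("name"),
                    "income": p.findtext("income"),
                })
    return joined


result = join_on_ssn(TAXPAYERS, NEIGHBORS)
print(result)
```

Only the person who appears in both documents survives, which is the behaviour of an inner join.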

Page 108: Dr. Alexandra I. Cristea acristea/ XML

Two-way join in a where Clause

for $item in doc("ord.xml")//item,
    $product in doc("cat.xml")//product
where $item/@num = $product/number
return
  <item num="{$item/@num}"
        name="{$product/name}"
        quan="{$item/@quantity}" />

Page 109: Dr. Alexandra I. Cristea acristea/ XML

Aggregating
• Make summary calculations on grouped data
• Functions:
  – sum, avg, max, min, count
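To make the five aggregates concrete, here they are computed over the price elements of a small sample catalogue. This is a Python sketch using the standard library (xml.etree.ElementTree and statistics), not XQuery; in XQuery the corresponding calls would be sum(), avg(), max(), min() and count() over a sequence such as //price.

```python
import xml.etree.ElementTree as ET
from statistics import mean

BOOKS = """<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>Harry Potter</title><price>29.99</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKS)
prices = [float(p.text) for p in root.iter("price")]

summary = {
    "count": len(prices),
    "sum": sum(prices),
    "avg": mean(prices),
    "max": max(prices),
    "min": min(prices),
}

for name, value in summary.items():
    print(name, round(value, 2))
```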

Page 110: Dr. Alexandra I. Cristea acristea/ XML

Conditionals

for $b in doc("bib.xml")/book
return
  <short>
    {$b/title}
    <author>
      {
        if (count($b/author) < 3)
        then $b/author
        else ($b/author[1], <author>and others</author>)
      }
    </author>
  </short>

Page 111: Dr. Alexandra I. Cristea acristea/ XML

Nesting Conditional Expressions
• Conditional expressions can be nested
• ‘else if’ functionality is provided

if (count($b/author) = 1)
then $b/author
else if (count($b/author) = 2)
then (: .. :)
else ($b/author[1], <author>and others</author>)

Page 112: Dr. Alexandra I. Cristea acristea/ XML

Logical Expressions
• and, or operators:
  – and has precedence over or
  – Parentheses can change precedence
  if ($isDiscounted and ($discount > 5 or $discount < 0)) then 5 else $discount
• not function for negations:
  if (not($isDiscounted)) then 0 else $discount

Page 113: Dr. Alexandra I. Cristea acristea/ XML

XQuery Built-in Functions
• The XQuery function namespace URI is: http://www.w3.org/2005/xpath-functions
• Default prefix: fn:
• E.g.: fn:string() or fn:concat().
• Since fn: is the default prefix of the namespace, function names do not need to be prefixed when called.

Page 114: Dr. Alexandra I. Cristea acristea/ XML

Built-in Functions
• String-related
  – substring, contains, matches, concat, normalize-space, tokenize
• Date-related
  – current-date, month-from-date, adjust-time-to-timezone
• Number-related
  – round, avg, sum, ceiling
• Sequence-related
  – index-of, insert-before, reverse, subsequence, distinct-values

Page 115: Dr. Alexandra I. Cristea acristea/ XML

Built-in Functions (2)
• Node-related
  – data, empty, exists, id, idref
• Name-related
  – local-name, in-scope-prefixes, QName, resolve-QName
• Error handling and trapping
  – error, trace, exactly-one
• Document and URI-related
  – collection, doc, root, base-uri

Page 116: Dr. Alexandra I. Cristea acristea/ XML


Function calls

doc("books.xml")//book[substring(title,1,5)='Harry']

let $name := (substring($book/title,1,4))

<name>{upper-case($book/title)}</name>

Page 117: Dr. Alexandra I. Cristea acristea/ XML

for $x in doc("http://www.dcs.warwick.ac.uk/~acristea/courses/CS253/2009/books.xml")//book/title
for $y in data($x)
for $name in (substring($y,1,4))
return $name

Page 118: Dr. Alexandra I. Cristea acristea/ XML

User Defined Functions

declare function prefix:function_name($parameter as datatype)
  as returnDatatype
{ (: ...function code here... :) };

Page 119: Dr. Alexandra I. Cristea acristea/ XML

User-defined Functions

declare function local:depth($e as node()) as xs:integer
{
  (: an element without child elements has depth 1 :)
  if (empty($e/*)) then 1
  else max(for $c in $e/* return local:depth($c)) + 1
};

(: usage :)
for $b in doc("bib.xml")/book
return local:depth($b)
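The recursion is easier to check when it can be run. The sketch below restates the same depth computation in Python with the standard xml.etree.ElementTree module; the sample book element is made up for illustration.

```python
import xml.etree.ElementTree as ET


def depth(elem):
    """Return 1 for an element without child elements, else 1 + the deepest child."""
    children = list(elem)
    if not children:
        return 1
    return max(depth(child) for child in children) + 1


book = ET.fromstring(
    "<book>"
    "<title>Harold and the Purple Crayon</title>"
    "<author><first>Crockett</first><last>Johnson</last></author>"
    "</book>"
)

print(depth(book))
```

The longest chain here is book, author, first, so the function counts three levels.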

Page 120: Dr. Alexandra I. Cristea acristea/ XML

Existential and Universal Quantifiers
• Return books where at least one author is "Ullman":
  for $b in doc("bib.xml")/book
  where some $author in $b/author satisfies $author/text() = "Ullman"
  return $b
• Return books where all authors are "Ullman":
  for $b in doc("bib.xml")/book
  where every $author in $b/author satisfies $author/text() = "Ullman"
  return $b
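The difference between some and every can be checked on a small example. This Python sketch uses any() and all() as stand-ins for the two quantifiers, with the standard xml.etree.ElementTree module; the bibliography (titles A, B, C and the co-author Widom) is invented sample data.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book><title>A</title><author>Ullman</author><author>Widom</author></book>
  <book><title>B</title><author>Ullman</author></book>
  <book><title>C</title><author>Knuth</author></book>
</bib>"""

books = ET.fromstring(BIB).findall("book")


def authors(book):
    """Return the text of every author element of a book."""
    return [a.text for a in book.findall("author")]


# some $author in $b/author satisfies $author/text() = "Ullman"
some_ullman = [b.findtext("title") for b in books
               if any(name == "Ullman" for name in authors(b))]

# every $author in $b/author satisfies $author/text() = "Ullman"
every_ullman = [b.findtext("title") for b in books
                if all(name == "Ullman" for name in authors(b))]

print(some_ullman, every_ullman)
```

As with every in XQuery, all() is true for an empty sequence, so a book without any author element would also pass the universal test.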

Page 121: Dr. Alexandra I. Cristea acristea/ XML


Comments

Page 122: Dr. Alexandra I. Cristea acristea/ XML

Comparisons
• Value comparisons: eq, ne, lt, le, gt, ge
  – Used to compare individual values
  – Each operand must be a single atomic value (or a node containing a single atomic value)
• General comparisons: =, !=, <, <=, >, >=
  – Can be used with sequences of multiple items

Page 123: Dr. Alexandra I. Cristea acristea/ XML


Example

Page 124: Dr. Alexandra I. Cristea acristea/ XML

Node comparisons
• Used to compare two nodes, by their identity or by their document order
• Example: the following comparison is true only if the left and right sides each evaluate to exactly the same single node:
  /books/book[isbn="1558604820"] is /books/book[call="QA76.9 C3845"]

Page 125: Dr. Alexandra I. Cristea acristea/ XML

Remember computed element constructors?

element book {
  attribute isbn { "isbn-0060229357" },
  element title { "Harold and the Purple Crayon" },
  element author {
    element first { "Crockett" },
    element last { "Johnson" }
  }
}

Page 126: Dr. Alexandra I. Cristea acristea/ XML

Using computed element constructors interestingly

let $e := <length units="inches">{5}</length>
return element {fn:node-name($e)} {$e/@*, 2 * fn:data($e)}

Page 127: Dr. Alexandra I. Cristea acristea/ XML

Other examples of node comparison
• The following comparison is false because each constructed node has its own identity:
  <a>5</a> is <a>5</a>
• The following comparison is true only if the node identified by the left side occurs before the node identified by the right side in document order:
  /transactions/purchase[parcel="28-451"] << /transactions/sale[parcel="33-870"]

Page 128: Dr. Alexandra I. Cristea acristea/ XML

XQuery Syntax
• Declarative, functional language ~ SQL
• Nested expressions
• Case sensitive
• White spaces:
  – Tabs, space, CR, LF
  – Ignored between language constructs
  – Significant in quoted strings
• No special EOL character

Page 129: Dr. Alexandra I. Cristea acristea/ XML

Keywords and names
• Keywords and operators
  – Case-sensitive, generally lower case
  – May have several meanings depending on the context
    • E.g. "*" or "in"
  – No reserved words
• All names must be valid XML names
  – variables, functions, elements, attributes
  – Can be associated with a namespace

Page 130: Dr. Alexandra I. Cristea acristea/ XML

XQuery gives you a choice:
• Path Expressions:
  – If you just want to copy certain elements and attributes as is
• FLWOR Expressions:
  – Allow sorting
  – Allow adding elements/attributes
  – Verbose, but can be clearer

Page 131: Dr. Alexandra I. Cristea acristea/ XML

XQuery tools
• Stylus Studio 2007: http://www.stylusstudio.com/xml_download.html (free trial version)
  – See also the short XQuery intro at: http://www.stylusstudio.com/xquery_primer.html

Page 132: Dr. Alexandra I. Cristea acristea/ XML

Other info:
  – XQuery on Distributed Resources
  – Extensions for generic programming with XML

Page 133: Dr. Alexandra I. Cristea acristea/ XML


XQuery on Distributed Sources


Page 139: Dr. Alexandra I. Cristea acristea/ XML

XML and programming
• XSLT, XPath and XQuery provide tools for specialized tasks.
• But many applications are not covered:
  – domain-specific tools for concrete XML languages
  – general tools that nobody has thought of yet

Page 140: Dr. Alexandra I. Cristea acristea/ XML

XML in general-purpose programming languages
• A general-purpose language needs to be able to:
  – parse XML documents into XML trees
  – navigate through XML trees
  – construct XML trees
  – output XML trees as XML documents
• DOM and SAX are corresponding APIs that are language independent and supported by numerous languages. JDOM is an API that is tailored to Java.
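The four tasks and the two API styles can be illustrated in one short program. This sketch assumes Python's standard library bindings, xml.dom.minidom for DOM and xml.sax for SAX; the class name TagCounter and the sample document are made up for the example.

```python
import xml.sax
from xml.dom import minidom

SOURCE = "<bookstore><book><title>Harry Potter</title></book></bookstore>"

# DOM: parse the whole document into a tree, then navigate and modify it.
dom = minidom.parseString(SOURCE)                                   # parse
first_title = dom.getElementsByTagName("title")[0].firstChild.data  # navigate

new_book = dom.createElement("book")                                # construct
new_title = dom.createElement("title")
new_title.appendChild(dom.createTextNode("Learning XML"))
new_book.appendChild(new_title)
dom.documentElement.appendChild(new_book)

output = dom.documentElement.toxml()                                # output


# SAX: the parser streams events to a handler; no tree is kept in memory.
class TagCounter(xml.sax.ContentHandler):
    """Count how often each element name starts."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


counter = TagCounter()
xml.sax.parseString(output.encode("utf-8"), counter)

print(first_title)
print(output)
print(counter.counts)
```

DOM is convenient when the tree must be changed or visited repeatedly; SAX suits large documents that only need one pass.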

Page 141: Dr. Alexandra I. Cristea acristea/ XML

XQuery Conclusion
• We have learned:
  – XQuery definition
  – Usage scenarios
  – Comparison with XSLT and XPath
  – Capabilities
  – Functions, path expressions and predicates
  – FLWOR