craig stewart dr. alexandra i. cristea (acristea/) xml

65
Craig Stewart Dr. Alexandra I. Cristea (http://www.dcs.warwick.ac.uk/~acristea/) XML

Upload: adriel-rowbotham

Post on 01-Apr-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

Craig Stewart

Dr. Alexandra I. Cristea(http://www.dcs.warwick.ac.uk/~acristea/)

XML

Page 2: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

2

XML history• Inception: circa 1996 • The eXtensible Markup Language (XML)

v1.0 became a W3C Recommendation 10. February 1998. – Currently v1.0 is in it’s fifth version– v1.1 published 2004

• End of line issues• > Unicode v2.0 character sets• Other non-Unicode special characters

Page 3: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

3

What is XML?• XML stands for EXtensible Markup

Language • XML is a markup language much like

HTML• XML was designed to describe data • XML is more of a standard and

supporting structure than a standalone programming language

– wrong!

Page 4: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

4

How does XML work?• XML tags are not predefined. You must

define your own tags • XML uses a Document Type

Definition (DTD) or an XML Schema to describe the data – Also Relax NG (ISO DSDL)

• XML with a DTD or XML Schema is designed to be self-descriptive

Page 5: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

5

Main Difference XML, HTML• XML was designed to carry data.• XML is not a replacement for HTML.

XML and HTML were designed with different goals:– XML was designed to describe data and to focus on what

data is.

– HTML was designed to display data and to focus on how data looks.

• HTML is about displaying information, while XML is about describing information.

• Syntax: XML is well formed, just like XHTML

Page 6: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

6

XML does not DO anything• XML was created to structure, store and to send

information

<note> <to>John</to> <from>Jane</from>

<heading>Reminder</heading> <body>Don't forget the book!</body>

</note>

Page 7: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

7

XML is Free and Extensible• XML tags are not predefined. You must

"invent" your own tags.• The tags used to mark up HTML documents

and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.).

• XHTML is an application of XML but not vice-versa.

Page 8: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

8

Benefits XML• extensibility and structured nature of XML

allows it to be used for communication between different systems

• from one source of XML-based information you can format and distribute it via a multitude of different channels – XSL files act as templates, allowing a single

stylesheet to be used to format multiple pages or the same content for multiple distribution channels

Page 9: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

9

XML is a Complement to HTML

• XML is not a replacement for HTML.– In future Web development it is most likely

that XML will be used to describe the data, while HTML will be used to format and display the same data.

• XML is a cross-platform, software and hardware independent tool for transmitting information.

Page 10: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

10

XML in Future Web Development

• XML is going to be everywhere.

• the XML standard has been developed quickly and a large number of software vendors have adopted it.

• XML might be the most common tool for all data manipulation and data transmission.

Page 11: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

11

XML Can be Used to Create New Languages• XML is the mother of WAP and WML.

• The Wireless Markup Language (WML), used to markup Internet applications for handheld devices like mobile phones, is written in XML.

• And many others …

Page 12: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

12

Viewing XML• to view XML documents hierarchically or view

their output, you need an XML parser and processor.

• there are a number of these tools available:• See examples at: • http://www.stylusstudio.com/xml_download.html• http://www.w3schools.com/xml/xml_parser.asp• Please note, however: XML was not designed to

display data.

Page 13: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

13

The basic XML flow

Page 14: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

14

XML Rules1. Every start-tag must have a matching end-tag.

2. Tags cannot overlap. Proper nesting is required.

3. XML documents can only have one root element.

4. Element names must obey the following XML naming conventions: a) Names must start with letters or the "_" character.

Names cannot start with numbers or punctuation characters.

b) After the first character, numbers and punctuation characters are allowed.

Page 15: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

15

XML Rules (cont.)c) Names cannot contain spaces. d) Names should not contain the ":" character as it is a

"reserved" character. e) Names cannot start with the letters "xml" in any

combination of case. f) The element name must come directly after the "<" without

any spaces between them.

5. XML is case sensitive. 6. XML preserves white space within text. 7. Elements may contain attributes. If an attribute is

present, it must have a value, even if it is an empty string "".

Page 16: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

16

Spot the error!

<?xml version="1.0" encoding="ISO-8859-1"?>

<note date=12/11/2002>

<to>Tove</to>

<from>Jani</from>

</note>

Page 17: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

17

Spot the error!

<?xml version="1.0" encoding="ISO-8859-1"?>

<note date="12/11/2002">

<to>Tove</to>

<from>Jani</from>

</note>

Page 18: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

18

With XML, CR / LF is converted to LF

• Windows: CR + LF

• Unix: LF

• Macintosh: CR

Page 19: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

19

There is Nothing Special About XML

• plain text with XML tags

• Software that can handle plain text can also handle XML.

• In an XML-aware application, the XML tags can be handled specially: – Visibility,– Functional meaning, etc.

Page 20: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

20

Is this an error?

<note>

<to>Tove</to>

<from>Jani</from>

<body>Don't forget me this weekend!</body>

</note>

<heading>Reminder</heading>

Page 21: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

21

XML Elements have Relationships

• Elements are related as parents and children.

• Root element / Parents

• Children / Siblings

Page 22: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

22

Elements• An element consists of all the information from the

beginning of a start-tag to the end of an end-tag including everything in between.

• E.g. from (X)HTML, all of the following would be the equivalent of one element, named h1:

<h1>This is a heading.</h1> – Where, <h1> is the start tag, </h1> is the end tag, and the

content is in between.

• Each XML document has a root element within which all other elements are nested.

Page 23: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

23

Examples• See at:

– http://www.intranetjournal.com/articles/200402/ij_02_10_04a.html

– http://prolearn.dcs.warwick.ac.uk/caf/gipfBegIntAdv.xml

– http://www.intranetjournal.com/articles/200402/ij_02_10_04a.html

– http://prolearn.dcs.warwick.ac.uk/caf/gipfBegIntAdv.xml

– Search more by yourself and familiarize yourself with the syntax!

Page 24: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

24

XML Attributes

• XML elements can have attributes.

• From HTML you will remember this: <IMG SRC="computer.gif">

• The SRC attribute provides additional information about the IMG element.

Page 25: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

25

Attributes versus Elements

• <person sex="female"> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>

• <person><sex>female</sex> <firstname>Anna</firstname> <lastname>Smith</lastname></person>

Page 26: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

26

Comments

• same as in any other languages with line(s) of code whose sole purpose is to provide the developer, and anyone reading the code in the future, information about the code.

<!-- all the comments go in here -->

Page 27: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

27

XML declaration

• Every XML document begins with a declaration (not mandatory, but good practice) <?xml version=“1.0”?>

• Or, using optional attributes:<?xml version=“1.0” encoding=“UTF-16” standalone=“yes”?>

Page 28: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

28

Document Type Definition (DTD)

• which tags and attributes are allowed,

• where they can be placed,

• whether or not they can be nested within a given document and

• what additional entity definitions are required

Page 29: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

29

Document Type Declaration (DOCTYPE)

• <!DOCTYPE MovieCatalog SYSTEM "movie_catalog.dtd">

Root document

URL to DTD(external subset via a system identifier)

Page 30: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

30

Internal vs External DTD declaration

Internal:

<!DOCTYPE foo [ <!ENTITY greeting "hello"> ]>

External, public:

<!DOCTYPE html PUBLIC "//W3C//DTD HTML 4.01//EN” >

Page 31: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

31

Valid XML Documents

• A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type Definition (DTD):

<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE note SYSTEM "InternalNote.dtd">

<note> <to>Tom</to> <from>Jane</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body>

</note>

Page 32: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

32

Validator

• syntax-check any XML file

• Also at: http://www.validome.org/xml/validate/

http://www.w3.org/2001/03/webdata/xsv

Page 33: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

33

Internal DTD<?xml version="1.0"?> <!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> <!ENTITY cs “Craig Stewart”> ]> <note> <to>Tove</to> <from>&cs;</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Page 34: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

34

External DTD<?xml version="1.0"?> <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)><!ELEMENT body (#PCDATA)> <!ENTITY cs “Craig Stewart”>

>> saved as file note.dtd

Page 35: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

35

Character Entities• What are they?

• How would you write an XML element called ‘summary’ for the following data:– The result is <17% of the original

• <summary>The result is <17% of the original

</summary>

• ??

Page 36: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

36

Character Entities• Character entities are a way to solve this problem

and get around the limitations of computer character sets (old ones) and keyboards.

• &lt; <• &gt; >• &apos; ‘• &quot; “• &amp;&

• Are all standard XML entities and can be used without fear of compatibility issues.

Page 37: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

37

Numeric Character Reference• “A numeric character reference (NCR) is a

common markup construct used in SGML and other SGML-based markup languages such as HTML and XML. It consists of a short sequence of characters that, in turn, represent a single character from the Universal Character Set (UCS) of Unicode”– Wikipedia

• Eg:– &#931; &#0931; &#x3A3; &#x3a3; – All represent "Σ"

Page 38: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

38

Defined Entities in DTDs• Three types:

– Internal<!ENTITY cs “Craig Stewart”>

– External<!ENTITY mypicture SYSTEM "pic01.gif" GIF>

– Parameter• For parameterizing the DTD • Start with a % not a &• Entirely different to other entities

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

Page 39: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

39

XML Schema (XSD)

• XML Schema is an XML based alternative to DTD.

• W3C supports an alternative to DTD called XML Schema:http://www.w3.org/XML/Schema

Page 40: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

40

Displaying your XML Files with CSS?• It is possible to use CSS to format an XML document.• Example:• XML file: The CD catalog• style sheet: The CSS file• product: The CD catalog formatted with the CSS file• Below is a fraction of the XML file. The second line,

<?xml-stylesheet type="text/css" href="cd_catalog.css"?>, links the XML file to the CSS file

Page 41: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

41

Displaying XML with XSL

• XSL is the preferred style sheet language of XML.

• XSL (the eXtensible Stylesheet Language) is far more sophisticated than CSS.

• examples:– View the XML file, the XSL style sheet, and

View the result.

<?xml version="1.0" encoding="ISO-8859-1"?> <?xml-stylesheet type="text/xsl" href=“simple.xsl"?>

Page 42: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

42

XML Conclusions• We have learned:

– XML history– What it is– How it works– Differences to (X)HTML– XML flow– XML Rules– XML Elements, Relationships, Attributes,

Comments– Well-formed-ness concept– XML supporting frame: XML Schema or DTD– Generics on displaying XML

Page 43: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

43

• Next we are looking into more specific information about how to display XML and more, with …

Page 44: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

44

XSL

Page 45: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

45

XSL• XSL is an XML-based language used

for stylesheets that can be used to transform XML documents into other document types and formats.

• XSL is a family of recommendations for defining XML document transformation and presentation.

• It consists of three parts.

Page 46: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

46

XSL parts:• XSL Transformations (XSLT)

– a language for transforming XML

• XML Path Language (XPath)– an expression language used by XSLT to access

or refer to parts of an XML document. (XPath is also used by the XML Linking specification)

• XSL Formatting Objects (XSL-FO)– an XML vocabulary for specifying formatting

semantics

Page 47: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

47

Conclusion XSL

• We have learned:– What is XSL– What are its parts

• Next: – Not all parts are equally important– We look at the most important one …

Page 48: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

48

XSLT

Page 49: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

49

XSLT• XSLT became a W3C Recommendation 16. November 1999. • most important part of XSL• transforms input document (source tree) into a particular way in

a specified output document (result tree). • built on a structure known as an XSL template: <xsl:template> • e.g., <xsl:template match=“/movie/title"> <xsl:value-of select="."/></xsl:template>

this selects one/all movies• Multiple templates: first the root template; if that doesn’t match,

the next, etc.

XPath

Page 50: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

50

XSLT Browsers• nearly all major browsers support XML and XSLT.• Mozilla Firefox

– v 1.0.2, Firefox has support for XML and XSLT (and CSS).

• Mozilla– XML + CSS. Namespaces. Available with an XSLT

implementation.

• Netscape– v 8, uses the Mozilla engine.

• Opera– v 9, XML, XSLT (and CSS). V 8 only XML + CSS.

• Internet Explorer– v 6, XML, Namespaces, CSS, XSLT, and XPath. V 5 NOT !

compatible

Page 51: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

51

XSLT Elements in common use

• <xsl:stylesheet>used as root element of nearly all XSLT stylesheets.

• <xsl:stylesheet version="version number" xmlns="path to W3C namespace">

• Version number: current number of XSLT specification from W3C.

• xmlns is the path to the XML Namespace defined by the W3C for the XSLT Transformation language. Currently, that path is http://www.w3.org/1999/XSL/Transform.

Page 52: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

52

Example Correct Style Sheet Declaration

<xsl:stylesheet version="1.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:transform version="1.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

Page 53: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

53

Example use of template<?xml version="1.0" encoding="ISO-8859-1"?>

<xsl:transform version="1.0“ xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">

<html>

<body> <h2>My CD Collection</h2>

<table border="1"> <tr bgcolor="#9acd32"> <th>Title</th>

<th>Artist</th> </tr> <tr> <td>.</td> <td>.</td> </tr> </table>

</body>

</html>

</xsl:template>

</xsl:stylesheet>

XML: http://www.w3schools.com/xsl/cdcatalog.xmlResult:http://www.w3schools.com/xsl/cdcatalog_with_ex1.xml

Page 54: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

54

XSLT Elements in common use (2)

• <xsl:apply-templates>call other templates repeatedly

• <xsl:value-of>insert specified content from source tree into result tree

• <xsl:output>output method to use, e.g., xml, text, or html.

Page 55: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

55

Pull method: value-of<?xml version="1.0" encoding="ISO-8859-1"?>

<xsl:stylesheet version="1.0“ xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">

<html>

<body> <h2>My CD Collection</h2>

<table border="1"> <tr bgcolor="#9acd32"> <th>Title</th>

<th>Artist</th> </tr> <tr> <td>.</td> <td>.</td> </tr> </table>

</body>

</html>

</xsl:template>

</xsl:stylesheet>

<xsl:value-of select="catalog/cd/title"/>

<xsl:value-of select="catalog/cd/artist"/>

View the XML file, View the XSL file, and View the result

Page 56: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

56

Push method: apply-templates<xsl:template match="/">

<html> <body> <h2>My CD Collection</h2> <xsl:apply-templates/> </body> </html>

</xsl:template> <xsl:template match="cd">

<p> <xsl:apply-templates select="title"/> <xsl:apply-templates select="artist"/> </p>

</xsl:template> <xsl:template match="title">

Title: <span style="color:#ff0000"> <xsl:value-of select="."/></span> <br />

</xsl:template> <xsl:template match="artist">

Artist: <span style="color:#00ff00"> <xsl:value-of select="."/></span> <br />

</xsl:template> View the XML file, View the XSL file, and View the result.

Page 57: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

57

XSLT Elements in common use (3)

• <xsl:element>dynamically create generic elements during the transformation process

• <xsl:text>insert some specified text (PCDATA)

Page 58: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

58

XSLT Elements in common use (4)

• <xsl:if>evaluate an expression; if true, the contents are evaluated.

<xsl:if test="boolean expression">Some content or XSL elements here</xsl:if>

• <xsl:choose>more flexibility: one of any number of choices + default choice

Page 59: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

59

Format choose:

<xsl:choose><xsl:when test="element name[test criteria 1]">

Result number one.</xsl:when><xsl:when test="element name[test criteria 2]">

Result number two.</xsl:when><xsl:otherwise>The default result.</xsl:otherwise></xsl:choose>

Page 60: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

60

Example: choose <xsl:choose> <xsl:when test="price &gt; 10"> <td bgcolor="#ff00ff"> <xsl:value-of select="artist"/></td> </xsl:when> <xsl:otherwise> <td><xsl:value-of select="artist"/></td> </xsl:otherwise> </xsl:choose>

resultXLS fileXML file

Page 61: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

61

XSLT Elements in common use (5)

• <xsl:for-each>loop for a set of nodes; to form template within template.

<xsl:for-each select="expression">

• <xsl:copy-of>take sections of source tree and copy them directly to the result tree

Page 62: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

62

Example: for-each

<xsl:for-each select="catalog/cd">

<tr>

<td><xsl:value-of select="title"/></td>

<td><xsl:value-of select="artist"/></td>

</tr>

</xsl:for-each>

resultXLS fileXML file

Page 63: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

63

Why an XML Editor?• XML Schema to define XML structures and data

types • XSLT to transform XML data • SOAP to exchange XML data between applications • WSDL to describe web services • RDF to describe web resources • XPath and XQuery to access XML data • SMIL to define graphics • Altova's XMLSpy

– 30 days free trial– http://www.altova.com/products/xmlspy/xsl_xslt_editor.html

Page 64: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

64

Conclusions XSLT• We have learned:

– XSLT Definition– Correct style sheet declaration– XSLT common elements– Looked at some examples– Browsers and their support for XSLT and more– About XML editors for editing XSLT and more

Page 65: Craig Stewart Dr. Alexandra I. Cristea (acristea/) XML

65

• Next:– We look at how to access elements and

attributes inside the XML (for XSLT and more)

– This can be done via …– XPATH