Post on 19-Jan-2016

XML and XSL Overview

by Alex Chaffee
alex@jguru.com, http://www.purpletech.com/
Purple Technology: Open source development
jGuru: Java online resource
FAQs and News and other cool stuff

XML

• eXtensible Markup Language
• Replacement for HTML
• Metalanguage - used to create other languages
• Has become a universal data-exchange format

Advantages of XML

• Human-readable
• Machine-readable (easy to parse)
• Standard format for data interchange
• Possible to validate
• Extensible
  – can represent any data
  – can add new tags for new data formats
• Hierarchical structure (nesting)

Why not HTML?

• Browsers are too lenient
• Led to sloppy HTML code all over the Web
  – <imG src="foo.gif> is "legal" HTML
• Told HTML, "go to your room and don't come out until it's clean"
  – Out came XML

XML Searching and Agents

• An early motivation for XML
• Allows detailed queries of disparate data sources
  – Find best price for certain product
  – Search for properties with different real estate brokers
• HTML insufficient
  – Good for humans, bad for computers
  – Doesn't scale

XML Example

<?xml version="1.0"?>
<!DOCTYPE menu SYSTEM "menu.dtd">
<menu>
  <meal name="breakfast">
    <food>Scrambled Eggs</food>
    <food>Hash Browns</food>
    <drink>Orange Juice</drink>
  </meal>
</menu>

XML Languages

• MML - musical scores
• CML - chemicals
• HRMML - Human Resource Management (???)
• MathML - equations
• RSS - web syndication

Tag vs. Element

• A tag is a name, enclosed by angle brackets, with optional attributes
  – <foo id="123">
• An element is a tree, containing an open tag, contents, and a close tag
  – <foo id="123">This is <bar>an element</bar></foo>
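To make the tag/element distinction concrete, here is a minimal sketch that parses the element from the slide. It assumes Python and its standard xml.etree.ElementTree module, which the slides themselves do not use; any tree-based XML API would show the same split between the open tag (name and attributes) and the element (the tag plus everything it contains).

```python
import xml.etree.ElementTree as ET

# The element from the slide: an open tag with one attribute, mixed content,
# and a close tag.
foo = ET.fromstring('<foo id="123">This is <bar>an element</bar></foo>')

print(foo.tag)                  # name from the open tag
print(foo.attrib)               # attributes carried by the open tag
print(foo.text)                 # text content before the first child element
print(foo[0].tag, foo[0].text)  # the nested <bar> element and its text
```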

XML Syntax

• Tags properly nested
• Tag names case-sensitive
• All tags must be closed
  – or self-closing
  – <foo/> is the same as <foo></foo>
• Attributes enclosed in quotes
• Document consists of a single (root) element
• A few other details

Well-Formed vs. Valid

• Well-Formed:
  – Structure follows XML syntax rules
• Valid:
  – Well-formed, and structure also conforms to a DTD
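The syntax rules can be checked mechanically: a parser either accepts a document or reports an error. The sketch below assumes Python's standard xml.etree.ElementTree parser and a hypothetical helper name, is_well_formed. That parser is non-validating, so it tests well-formedness only; checking validity against a DTD needs a validating parser, which this example does not attempt.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per syntax rule from the slide (sample strings are illustrative).
samples = {
    "nested and closed": "<a><b/></a>",
    "improperly nested": "<a><b></a></b>",
    "case mismatch": "<foo></Foo>",
    "unquoted attribute": "<foo id=123/>",
    "two root elements": "<a/><b/>",
}
for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample should be accepted; each of the others breaks exactly one rule.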

DTD

• Document Type Definition
• A grammar for XML documents
• Defines
  – which elements can contain which other elements
  – which attributes are allowed or required on which elements

DTD and Data Exchange

• Both sides must agree on DTD ahead of time

• DTD can be part of document or stored separately

DTD Example

<?xml encoding="US-ASCII"?>
<!ELEMENT menu (meal)*>
<!ATTLIST menu name CDATA #IMPLIED>
<!ELEMENT meal (food|drink)*>
<!ATTLIST meal name CDATA #REQUIRED>
<!ELEMENT food (#PCDATA)*>
<!ELEMENT drink (#PCDATA)*>
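To show what these declarations constrain, the sketch below hand-codes two of the menu DTD's rules in Python: which children each element may contain, and the required name attribute on meal. It is an illustration only, not a DTD validator; the tables ALLOWED_CHILDREN and REQUIRED_ATTRS and the function violations are hypothetical names, and a real project would use a validating parser instead.

```python
import xml.etree.ElementTree as ET

# Hand-written tables mirroring the menu DTD; not a general DTD validator.
ALLOWED_CHILDREN = {
    "menu": {"meal"},
    "meal": {"food", "drink"},
    "food": set(),
    "drink": set(),
}
REQUIRED_ATTRS = {"meal": {"name"}}

def violations(elem):
    """Return a list of messages describing where elem breaks the rules."""
    if elem.tag not in ALLOWED_CHILDREN:
        return [f"unknown element <{elem.tag}>"]
    problems = []
    for attr in sorted(REQUIRED_ATTRS.get(elem.tag, set())):
        if attr not in elem.attrib:
            problems.append(f"<{elem.tag}> is missing required attribute {attr!r}")
    for child in elem:
        if child.tag not in ALLOWED_CHILDREN[elem.tag]:
            problems.append(f"<{child.tag}> is not allowed inside <{elem.tag}>")
        else:
            problems.extend(violations(child))
    return problems

good = ET.fromstring('<menu><meal name="breakfast"><food>Eggs</food></meal></menu>')
bad = ET.fromstring('<menu><meal><food>Eggs</food><menu/></meal></menu>')
print(violations(good))
print(violations(bad))
```

Both documents are well-formed; only the second breaks the grammar, once for the missing name and once for the misplaced menu.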

Why isn't a DTD in XML?

• It will be someday: XSchema

XML Namespaces

• A single document can use multiple DTDs

• But! Two DTDs can use the same element name with different rules

• Solution: Namespaces
• Must prefix tag name with a namespace prefix (bound to the namespace by an xmlns declaration)
  – e.g. <xsl:apply-templates select="."/>
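A short sketch of how a parser keeps same-named elements apart. It assumes Python's standard xml.etree.ElementTree, which rewrites every prefixed name as "{namespace-URI}local-name", and it uses the namespace URI from the XSLT 1.0 Recommendation. The unprefixed template element is invented for the example.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

doc = ET.fromstring(
    '<xsl:stylesheet version="1.0" xmlns:xsl="%s">'
    '<xsl:template match="/"><template>not XSL</template></xsl:template>'
    '</xsl:stylesheet>' % XSL_NS
)

# The prefix is only shorthand; the parser stores the full namespace URI.
print(doc.tag)

# Both elements have the local name "template", but only one is in the XSL namespace.
xsl_templates = doc.findall("{%s}template" % XSL_NS)
plain_templates = doc.findall(".//template")
print(len(xsl_templates), len(plain_templates))
```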

Entities

• Macros / constants
• Values defined once, used in document
  <!DOCTYPE foo SYSTEM "foo.dtd" [
    <!ENTITY background "#99FFFF">
  ]>
  <BODY BGCOLOR="&background;">
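The sketch below shows the expansion happening at parse time. It assumes Python's standard xml.etree.ElementTree and simplifies the slide's fragment: the external foo.dtd reference is dropped so the example is self-contained, and the BODY element is closed and given text content so that the entity is used in both an attribute and in text.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD subset, so the parser can expand
# it without fetching an external file.
page = ET.fromstring(
    '<!DOCTYPE BODY [\n'
    '  <!ENTITY background "#99FFFF">\n'
    ']>\n'
    '<BODY BGCOLOR="&background;">Color is &background;</BODY>'
)

print(page.get("BGCOLOR"))  # entity reference replaced inside the attribute value
print(page.text)            # and inside the text content
```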

SML / Minimal XML

• Simplified Markup Language
• Subset of XML, but stripped down
• Easier to understand, parse
• No
  – DTDs
  – Attributes
  – Processing instructions
  – etc.

XSL: XML Transformation

XSL

• The eXtensible Stylesheet Language
• Transforms XML into HTML
• Actually, transforms XML into a tree, then turns that tree into another tree, then outputs that tree as XML

XSL Architecture

[Diagram: XML Source + XSL Stylesheet -> XSL Processor -> HTML Output]

XML is a Tree

<?xml version="1.0"?>
<!DOCTYPE menu SYSTEM "menu.dtd">
<menu>
  <meal name="breakfast">
    <food>Scrambled Eggs</food>
    <food>Hash Browns</food>
    <drink>Orange Juice</drink>
  </meal>
  <meal name="snack">
    <food>Chips</food>
  </meal>
</menu>

[Tree diagram]
menu
  meal
    name: "breakfast"
    food: "Scrambled Eggs"
    food: "Hash Browns"
    drink: "Orange Juice"
  meal

XML Is A Tree

• Nodes
  – Branch nodes contain children
  – Leaf nodes contain content

• Attributes, Values, Entities, etc.

• DOM provides API-based access to tree models

• XSL turns one tree into a different tree
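As a rough illustration of API-based access to the tree, the sketch below walks the menu document and prints one line per node. It assumes Python's standard xml.etree.ElementTree rather than the W3C DOM API the slide refers to, and outline is a hypothetical helper name; the DOCTYPE line is omitted to keep the sample short.

```python
import xml.etree.ElementTree as ET

MENU = """<menu>
  <meal name="breakfast">
    <food>Scrambled Eggs</food>
    <food>Hash Browns</food>
    <drink>Orange Juice</drink>
  </meal>
  <meal name="snack">
    <food>Chips</food>
  </meal>
</menu>"""

def outline(elem, depth=0):
    """Return one indented line per node: branches by name, leaves with content."""
    indent = "  " * depth
    attrs = "".join(f' {k}="{v}"' for k, v in elem.attrib.items())
    if len(elem) == 0:  # leaf node: show its text content
        return [f"{indent}{elem.tag}{attrs}: {elem.text}"]
    lines = [f"{indent}{elem.tag}{attrs}"]  # branch node: recurse into children
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(ET.fromstring(MENU))))
```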

Command Line Invocation

• Apache Xalan
  java org.apache.xalan.xslt.Process -IN faq.xml -XSL faq.xsl -OUT faq.html
• IBM LotusXSL
  java com.lotus.xsl.xml4j.ProcessXSL -in servletfaq.xml -xsl faq.xsl -out faq.html

• And so on…

Formatting Objects

• Forget about it for now

XSLT

• The meat of XSL
• Syntax for making XSL template files
• Pattern matching
• Output formatting
• Rules-based (like Prolog)

XPath

• The stuff inside the quotes in XSL patterns
  – "/person/name/firstname"

• A sensible way to locate content in an XML document

• More straightforward than walking a DOM tree or waiting for a SAX callback

XPath Syntax

• book/title
  – title child of book child of current node
• /book/title
  – title child of book child of document root
• @language
  – language attribute of current node
• chapter/@language
  – language attribute of chapter child of current node

XPath Syntax (cont.)

• chapter[3]/para
  – all the para children of the third chapter
• book/*/title
  – all title children of all children of book (but not of their children)
• chapter//para
  – all para descendants of chapter, at any depth
• ../../title
  – title child of parent of parent
  – parent::node()/parent::node()/child::title
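The path expressions above can be tried out on a small document. This sketch assumes Python's standard xml.etree.ElementTree, which implements only a subset of XPath: paths are always relative to the element they are called on, and attributes cannot be selected as a final step, so they are read with get() instead. The sample document is invented for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<library>
  <book>
    <title>Book Title</title>
    <chapter language="en"><title>One</title><para>a</para></chapter>
    <chapter language="fr"><title>Two</title><section><para>b</para></section></chapter>
    <chapter language="en"><title>Three</title><para>c</para><para>d</para></chapter>
  </book>
</library>""")
book = root.find("book")

# book/title: title child of book child of the current node (here, <library>)
print([t.text for t in root.findall("book/title")])
# chapter/@language: read the attribute from each chapter child of <book>
print([c.get("language") for c in book.findall("chapter")])
# chapter[3]/para: the para children of the third chapter
print([p.text for p in book.findall("chapter[3]/para")])
# book/*/title: title grandchildren of book, but nothing deeper
print([t.text for t in root.findall("book/*/title")])
# chapter//para: para descendants of chapter at any depth
print([p.text for p in book.findall("chapter//para")])
# ../../title: from each para, up two levels, then down to title
print([t.text for t in book.findall("chapter/para/../../title")])
```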

XPath Abbreviations

.    self::node()
..   parent::node()
//   /descendant-or-self::node()/
@    attribute::

XPath Functions

• para[1] or para[position()=1]
  – the first para child of the current node
• para[last()]
  – the last para child of the current node
• para[count(child::note)>0]
  – all paragraphs with one or more notes
• para[@id="abstract"]
  – selects all child nodes like <para id="abstract">
  – id("abstract") on its own selects the element whose DTD-declared ID is "abstract"
• para[@type='secret'] or para[attribute::type='secret']
  – selects all child nodes like <para type="secret">

XPath Functions (cont.)

• para[not(title)]
  – selects all child paragraphs with no title elements
• para[position() >= 2 and position() < last()]
  – selects all but the first and last paragraphs
• para[lang("en")]
  – matches <para xml:lang="en-uk">…</para>
• note[contains(., "alex")]
  – . is the node's string value, so the text of all its descendants is tested too
• note[starts-with(., "hello")]
  – same idea, but the text must begin with "hello"
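A sketch of several of these predicates in action. It assumes Python's standard xml.etree.ElementTree, whose XPath subset covers position, last(), child tests, and attribute tests; not(), contains(), and starts-with() are outside that subset, so the equivalent filters are written in plain Python. The sample document and its n attribute are invented to make the results easy to read.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""<doc>
  <para n="1" type="secret"><title>Intro</title></para>
  <para n="2"><note>alex was here</note></para>
  <para n="3"/>
  <para n="4"><note>hello world</note></para>
</doc>""")

def numbers(elems):
    """Return the n attribute of each element, to identify what was selected."""
    return [e.get("n") for e in elems]

def string_value(elem):
    """All text inside elem, including descendants (what '.' means in a test)."""
    return "".join(elem.itertext())

print(numbers(doc.findall("para[1]")))               # first para child
print(numbers(doc.findall("para[last()]")))          # last para child
print(numbers(doc.findall("para[note]")))            # same set as para[count(child::note)>0]
print(numbers(doc.findall("para[@type='secret']")))  # attribute test

paras = doc.findall("para")
print(numbers(p for p in paras if p.find("title") is None))  # para[not(title)]
print(numbers(paras[1:-1]))  # position() >= 2 and position() < last()
print([n.text for n in doc.iter("note") if "alex" in string_value(n)])
print([n.text for n in doc.iter("note") if string_value(n).startswith("hello")])
```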

XPath Disadvantages

• Not XML
  – Not hierarchical
  – New syntax rules
  – Weird mix of /,[],(),*,:,::,.,..,@
• New function set
  – Not Java
• Concepts like "axis" not always clear

XSLT Syntax

XSL Rules

• XSL is a series of rules or templates
• Each template matches an element
• Templates can contain XML commands

XSL Commands: apply-templates

• Main rule: apply-templates
  – looks for a template match
  – applies it

• Usually the template calls apply-templates recursively on its children

• If not, then processing stops at that node (but continues for its other siblings that matched this template)

Default Rule

• For a leaf node, output its contents
• For a branch node, apply templates (recursively) (including default rule)
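The interplay between apply-templates and the default rule can be sketched outside XSLT. The Python below is an analogy only, not an XSLT processor (Python's standard library does not include one): rules are plain functions registered per element name, and every name in it (TEMPLATES, template, apply_templates) is hypothetical. An element without a rule, such as the invented <extra>, falls through to the default behavior of outputting its text and recursing into its children.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TEMPLATES = {}  # element name -> rule function (hypothetical registry)

def template(match):
    """Register a function as the rule for elements named `match`."""
    def register(rule):
        TEMPLATES[match] = rule
        return rule
    return register

def apply_templates(elem):
    """Process the children of elem: text is output, elements go to their rule."""
    out = escape(elem.text or "")
    for child in elem:
        rule = TEMPLATES.get(child.tag, apply_templates)  # default rule: recurse
        out += rule(child) + escape(child.tail or "")
    return out

@template("meal")
def meal(elem):
    return "<H2>%s</H2><UL>%s</UL>" % (escape(elem.get("name")), apply_templates(elem))

@template("food")
@template("drink")
def item(elem):
    return "<LI>%s</LI>" % apply_templates(elem)

menu = ET.fromstring(
    '<menu><meal name="breakfast"><food>Scrambled Eggs</food>'
    '<drink>Orange Juice</drink><extra>Toast</extra></meal></menu>'
)
print(apply_templates(menu))
```

A rule that does not call apply_templates itself would stop processing at its node, which is the behavior described on the apply-templates slide.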

Some XSL Commands

• value-of
  – grabs raw value, good for text elements and attributes
• if
  – executes conditionally
• number
  – counts position of element in group
  – good for ordered list numbering, table of contents, etc.

XSL Example

<?xml version="1.0"?>
<!DOCTYPE xsl:stylesheet [
  <!ENTITY background "#99FFFF">
]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/REC-html40">

Example (cont.)

<xsl:template match="menu">
  <HTML>
    <HEAD>
      <TITLE>Menu: <xsl:value-of select="@name"/></TITLE>
    </HEAD>
    <BODY BGCOLOR="&background;">
      <H1>Menu <xsl:value-of select="@name"/></H1>

[Note: Can reuse contents, unlike CSS]

Example (cont.)

      <xsl:apply-templates/>
    </BODY>
  </HTML>
</xsl:template>

Example (cont.)

<xsl:template match="meal">
  <H2><xsl:value-of select="@name"/></H2>
  <br />
  <UL>
    <xsl:apply-templates/>
  </UL>
</xsl:template>

Example (cont.)

<xsl:template match="food">
  <LI><xsl:apply-templates/></LI>
</xsl:template>

<xsl:template match="drink">
  <LI><xsl:apply-templates/></LI>
</xsl:template>

</xsl:stylesheet>

Outputting Attributes

• From This:
  – <link>
      <name>Stinky</name>
      <url>http://www.stinky.com/</url>
    </link>
• We Want This:
  – <A href="http://www.stinky.com/">Stinky</A>

Outputting Attributes

• The Hard Way:
  – <xsl:element name="A">
      <xsl:attribute name="href">
        <xsl:value-of select="url" />
      </xsl:attribute>
      <xsl:value-of select="name" />
    </xsl:element>
• The Easy Way:
  – <A href="{url}">
      <xsl:value-of select="name"/>
    </A>

Copying Subtrees

• <xsl:template match="*|@*|text()">
    <xsl:copy>
      <xsl:apply-templates select="*|@*|text()"/>
    </xsl:copy>
  </xsl:template>
• No, I don't understand it either
• Default copy rule strips all tags/attributes
• Also copy-of

XSL conditionals: if

• <xsl:if test="author"> by <xsl:apply-templates select="author" /></xsl:if>

• Note: no else (?!?)

XSL Conditionals: choose

• Case 1
  – <link>
      <name>Stinky</name>
      <url>http://www.stinky.com/</url>
    </link>
  – <a href="http://www.stinky.com/">Stinky</a>
• Case 2
  – <link>
      <url>http://www.stinky.com/</url>
    </link>
  – <a href="http://www.stinky.com/">http://www.stinky.com/</a>
• Case 3
  – <link>
      <name>Stinky</name>
    </link>
  – Stinky

XSL Conditionals: choose

• <xsl:choose>
    <xsl:when test="url">
      <A href="{url}">
        <xsl:choose>
          <xsl:when test="name">
            <xsl:value-of select="name" />
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="url" />
          </xsl:otherwise>
        </xsl:choose>
      </A>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="name" />
    </xsl:otherwise>
  </xsl:choose>
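For readers more at home in a conventional language, the nested choose reads as an ordinary if/else. The Python sketch below mirrors its logic for the three link cases; it is an analogy rather than XSLT, and render_link is a hypothetical name. It assumes Python's standard xml.etree.ElementTree for parsing and xml.sax.saxutils for escaping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def render_link(link):
    """Mirror the nested xsl:choose: anchor if <url> exists, else plain name."""
    url = link.findtext("url")    # None when there is no <url> child
    name = link.findtext("name")  # None when there is no <name> child
    if url is not None:                            # outer xsl:when test="url"
        label = name if name is not None else url  # inner xsl:choose
        return "<A href=%s>%s</A>" % (quoteattr(url), escape(label))
    return escape(name or "")                      # outer xsl:otherwise

cases = [
    "<link><name>Stinky</name><url>http://www.stinky.com/</url></link>",
    "<link><url>http://www.stinky.com/</url></link>",
    "<link><name>Stinky</name></link>",
]
for xml_text in cases:
    print(render_link(ET.fromstring(xml_text)))
```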

XSL Looping: for-each

• <xsl:for-each select="chapter">
    <h2><xsl:value-of select="@title"/></h2>
  </xsl:for-each>
• Functional overlap with apply-templates
  – Difference in programming style
  – Use it inside a given template rule

Template Modes

• Same element name, different context -> different template, different output
• Can invoke apply-templates with a mode, matches corresponding moded template
• <h1>Table of Contents</h1>
  <ol><xsl:apply-templates select="chapter" mode="toc"/></ol>
• <xsl:template match="chapter" mode="toc">
    <li><xsl:value-of select="@title"/></li>
  </xsl:template>
• <xsl:template match="chapter">
    <h1><xsl:value-of select="@title"/></h1>
    <xsl:apply-templates/>
  </xsl:template>
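The idea behind modes is that the rule lookup uses two keys instead of one. The Python sketch below illustrates that with a table keyed by (element name, mode); it is an analogy, not XSLT, and the names RULES and apply_templates as well as the sample book are invented for the example.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book><chapter title="Intro"/><chapter title="Syntax"/></book>'
)

# Rules are keyed by (element name, mode); None stands for "no mode".
RULES = {
    ("chapter", "toc"): lambda e: "<li>%s</li>" % e.get("title"),
    ("chapter", None): lambda e: "<h1>%s</h1>" % e.get("title"),
}

def apply_templates(parent, select, mode=None):
    """Apply the rule registered for (select, mode) to each matching child."""
    rule = RULES[(select, mode)]
    return "".join(rule(child) for child in parent.findall(select))

# Same chapters, two passes, two different outputs.
toc = "<ol>%s</ol>" % apply_templates(book, "chapter", mode="toc")
body = apply_templates(book, "chapter")
print(toc)
print(body)
```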

XSL vs. CSS

• Similar problem, different solutions
• CSS takes HTML and applies fonts, styles, positions
• XSL takes any XML and turns it into anything else
• XSL more powerful than CSS
  – e.g. can use same content in multiple places in result document

XSL Disadvantages

• Confusing syntax and semantics
  – Like Prolog+C+XML
  – It's really a programming language, but using markup language syntax - yuck!
• Hard to debug
  – XSL Trace helps a little
• Don't have full power of, say, Java inside templates
  – No database access, hashtables, methods, objects, etc.
• Still need separate .xsl file for each client device

Other XSL-Based Products

• LotusXSL
• Resin by Caucho
• Cocoon
• IBM XSL Trace
• Xalan (Apache)
• XT
• Lots more

Links: XML

• XML Spec
  – http://www.w3.org/TR/REC-xml
• XML FAQ
  – http://www.ucc.ie/xml/
• Café con Leche
  – http://metalab.unc.edu/xml/
• XML.com
  – http://www.xml.com/
• Servlet FAQ in XSL
  – http://www.purpletech.com/servlet-faq/

References

• McLaughlin, "Java and XML", O'Reilly
• Eckstein, "XML Pocket Reference", O'Reilly
• Harrold, "XML Bible"
• Bradley, "The XML Companion", Addison-Wesley

Q&A
