An Introduction to XML
Paul Donohue
May 8th 2002
Hotel Senator, Zürich
Overview
• What Is XML?
• Why Use XML?
• Defining Rules with XML
• Related Technologies
• Demonstration
• Summary
• Questions
Topics we will cover
What is XML?
• XML = Extensible Markup Language
• Not a programming language
• An open standard for representing structured data
• Describes data structure and content
• Separates data from its presentation
XML In A Nutshell
What is XML?
A woman without her man is nothing
A woman without her man, is nothing
A woman: without her, man is nothing
Markup Clarifies Data
What is XML?
Some History
1969 Generalized Markup Language (GML)
1980 Standard Generalized Markup Language (SGML)
1986 SGML becomes ISO standard
1991 Hypertext Markup Language (HTML)
1996 W3C begin work on a language to combine SGML & HTML
1998 XML standard is published
What is XML?
<?xml version="1.0"?>
<talk code="ID352">
<!--Example XML file-->
<title>A PB / XML Messaging System</title>
<presenter>Mr Paul Donohue</presenter>
<audience>PowerBuilder Developers</audience>
<time>13:30</time>
<date>2001-08-13</date>
</talk>
An XML Document
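As a minimal sketch of how a program might read the document above, the following uses Python's standard xml.etree.ElementTree module. The choice of Python and of this parser is an assumption for illustration (the talk does not prescribe either), and summarise_talk is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The example document from the slide, held as a string.
TALK_XML = """<?xml version="1.0"?>
<talk code="ID352">
<!--Example XML file-->
<title>A PB / XML Messaging System</title>
<presenter>Mr Paul Donohue</presenter>
<audience>PowerBuilder Developers</audience>
<time>13:30</time>
<date>2001-08-13</date>
</talk>"""


def summarise_talk(xml_text):
    """Return the root's code attribute and a dict of child element name -> text."""
    root = ET.fromstring(xml_text)
    # The default parser skips comments, so only child elements are iterated.
    fields = {child.tag: child.text for child in root}
    return root.get("code"), fields


code, fields = summarise_talk(TALK_XML)
print(code, "|", fields["title"], "|", fields["time"])
```

The attribute is read with get() and the element contents with .text, which mirrors the split between attributes and elements described on the following slides.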
What is XML?
Parts of an XML document
• XML Declaration
• Prolog
• Elements
• Attributes
• Comments
• Other Parts
<?xml version="1.0"?>
<!DOCTYPE talk SYSTEM "DEMO.DTD">
<talk code="ID352">
<!--Example XML file-->
<title>A PB / XML Messaging System</title>
<presenter>Mr Paul Donohue</presenter>
<audience>PowerBuilder Developers</audience>
<time>13:30</time>
<date>2001-08-13</date>
</talk>
What is XML?
• Elements are the basic building blocks of XML
• XML’s nouns
• Elements consist of start tag, contents and end tag… <animal>cat</animal>
• Empty elements can be shown as <animal/>
• Contents can be data or other elements
• Elements can own attributes
Elements
What is XML?
• Attributes give further information about an element
• XML’s adjectives
• Attributes consist of name, equals and value… cat_type="persian"
• Attribute values are quote delimited strings
• Attributes are placed inside an element’s start tag
Attributes
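The element and attribute rules can be sketched by building the slides' animal example in code. This is an illustration only, assuming Python's standard xml.etree.ElementTree module; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# An element with text content and one attribute in its start tag.
cat = ET.Element("animal", {"cat_type": "persian"})
cat.text = "cat"

# An element with no content; it is serialised in the short empty form.
empty = ET.Element("animal")

cat_xml = ET.tostring(cat, encoding="unicode")
empty_xml = ET.tostring(empty, encoding="unicode")
print(cat_xml)
print(empty_xml)
```

The serialiser takes care of quoting the attribute value and of choosing the empty-element form, so the program never assembles tags by hand.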
What is XML?
• XML messages are text files
• XML is case sensitive
• XML uses the Unicode character set (ISO/IEC 10646)
• Special codes (entity references such as &lt; and &amp;) stand in for markup characters such as < and &
Characters
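A small sketch of the special codes for markup characters, assuming Python and its standard xml.sax.saxutils.escape helper. The sample text is invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Text containing two characters that would otherwise be read as markup.
raw = "Fish & Chips < 5.00"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
document = "<item>" + escaped + "</item>"

# A parser turns the entity references back into the original characters.
round_trip = ET.fromstring(document).text
print(escaped)
print(round_trip)
```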
What is XML?
• All elements and attributes are named
• Must begin with a letter, underscore or colon
• Must continue with valid name characters: letter, underscore, colon, digit, hyphen or full stop
• May not begin with "xml" in any mix of upper and lower case (these names are reserved)
Names
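The naming rules can be approximated with a regular expression. The sketch below is deliberately simplified and is an assumption, not the full XML grammar: it only accepts ASCII letters, whereas real XML names may also contain many non-ASCII letters. is_valid_name is a hypothetical helper.

```python
import re

# First character: letter, underscore or colon.
# Later characters: the same, plus digit, hyphen and full stop.
# ASCII only; the real XML rules allow many more letters.
NAME_PATTERN = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_valid_name(name):
    """Apply the slide's rules: valid characters, and no reserved 'xml' prefix."""
    if not NAME_PATTERN.fullmatch(name):
        return False
    return not name.lower().startswith("xml")


for candidate in ["cat_type", "a.b-c", "2cats", "-x", "xmlData"]:
    print(candidate, is_valid_name(candidate))
```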
What is XML?
• XML uses tags, as HTML does
• XML complements HTML
• XML is for “smart data”
• XML is both data and document
XML vs HTML
What is XML?
<table width="500" border="0" cellspacing="0" cellpadding="0">
<tr bordercolor="#FFFFFF" bgcolor="#6666FF">
<td width="214">Sybase</td>
<td width="150"><b>PowerBuilder 7.1</b></td>
<td width="136">£145.00</td>
</tr>
</table>
Data encoded as HTML
What is XML?
<Software>
<Publisher>Sybase</Publisher>
<Title>PowerBuilder
<MajorVersion>7</MajorVersion>
<MinorVersion>1</MinorVersion>
</Title>
<Price currency="GBP">145.00</Price>
</Software>
Data encoded as XML
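One way to see why the XML version counts as "smart data": a program can ask for each value by name rather than by table position. A minimal sketch, assuming Python's standard xml.etree.ElementTree module:

```python
import xml.etree.ElementTree as ET

# The XML from the slide above.
SOFTWARE_XML = """<Software>
<Publisher>Sybase</Publisher>
<Title>PowerBuilder
<MajorVersion>7</MajorVersion>
<MinorVersion>1</MinorVersion>
</Title>
<Price currency="GBP">145.00</Price>
</Software>"""

root = ET.fromstring(SOFTWARE_XML)
title = root.find("Title")

# Title mixes text and child elements, so strip the trailing line break.
name = title.text.strip()
version = title.findtext("MajorVersion") + "." + title.findtext("MinorVersion")

price = root.find("Price")
amount = float(price.text)
currency = price.get("currency")

print(name, version, amount, currency)
```

In the HTML version the same values can only be found by knowing which table cell holds which field, and the currency is fused into the price text.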
Why Use XML?
• Royalty free
• Industry standard
• Platform & vendor independent
• Self describing
• Flexible
• Caters for nested & repeating data
Six Good Reasons
Why Use XML?
• Nobody “owns” XML
• No software to purchase
• No licensing fees
Royalty Free
Why Use XML?
• XML version 1.0 became a W3C standard in 1998
• Good support from vendors
• Low risk technology
• Large community of developers
Industry Standard
Why Use XML?
• XML documents are text based
• Perfect for messaging
• There are no vendor-specific extensions
• PDA to Mainframe
Platform & Vendor Independent
Why Use XML?
• Descriptive element & attribute names
• The name / data combination is easy to understand… <price>11.50</price> <currency>GBP</currency>
• XML can be viewed with a text editor
Self Describing
Why Use XML?
• XML can handle any structured data
• XML can be easily transformed
• Direct access to the required data
Flexible
Why Use XML?
• Nested data… An employee has an address and that address has a street and a post code
• Repeating data… An invoice has one or more items on it
• Hard to do with traditional file formats
Nested & Repeating Data
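Nested and repeating data can be sketched with an invoice. The document below is a made-up example (its element names and values are assumptions, not from the talk), read with Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice: a nested address and a repeating item element.
INVOICE_XML = """<invoice number="1001">
  <customer>
    <name>A. Example</name>
    <address>
      <street>1 High Street</street>
      <postcode>AB1 2CD</postcode>
    </address>
  </customer>
  <item><description>Widget</description><price>11.50</price></item>
  <item><description>Gadget</description><price>3.25</price></item>
</invoice>"""

root = ET.fromstring(INVOICE_XML)

# Nested data: follow the path customer -> address -> postcode.
postcode = root.findtext("customer/address/postcode")

# Repeating data: findall returns every item, however many there are.
items = root.findall("item")
total = sum(Decimal(item.findtext("price")) for item in items)

print(postcode, len(items), total)
```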
Defining Rules
• Conform to the grammar of XML
One root element
Non-empty elements have start & end tags
Elements are nested correctly
Attributes are not repeated within elements
Attribute values are quoted
• Can be parsed by any parser
• The data may be nonsense
Well-Formed XML Documents
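A well-formedness check can be sketched by handing the text to a parser and seeing whether it objects. This assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = [
    "<a><b>x</b></a>",        # correctly nested
    "<a><b>x</a></b>",        # elements overlap
    "<a/><b/>",               # two root elements
    '<a x="1" x="2"/>',       # attribute repeated
    "<a x=1/>",               # attribute value not quoted
    "<price>banana</price>",  # well formed, although the data is nonsense
]
for sample in samples:
    print(sample, is_well_formed(sample))
```

The last sample passes, which is the point of the final bullet: well-formedness says nothing about whether the data makes sense.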
Defining Rules
• Must contain a valid document type declaration
• Must obey the constraints of that declaration
Element sequence is valid
Required attributes are provided
Attribute values are valid
• Ensures data is valid for the application domain
• Rules are in a DTD or schema
Valid XML Documents
Defining Rules
• DTDs define validation rules for XML documents
Elements – contents, order & occurrence
Attributes – valid & default values
• DTDs are optional
• DTDs can be internal or external
• DTDs are written in XML Declaration Syntax
Document Type Definitions (DTD)
Defining Rules
• Schemas are more powerful than DTDs
Data types
Improved occurrence constraints
• Schemas are written in XML
• Schemas can refer to other schemas
• Get your DTD or schema correct before you code
Schemas
Defining Rules
• Standard terms facilitate data exchange
• Industry-wide standards have emerged
MathML : Mathematical Markup Language
CML : Chemical Markup Language
FpML : Financial products Markup Language
CDF : Channel Definition Format
• Check if your industry or organisation has a standard
Semantics
Related Technologies
• Parsers : SAX & DOM
• Searching : XPath
• Formatting : CSS, XSL & XSLT
• Linking : XLink & XPointer
• Resource Description Framework (RDF)
Overview
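To give a flavour of searching with XPath, the sketch below uses the limited XPath subset built into Python's standard xml.etree.ElementTree module (an assumption for illustration; it is not a full XPath engine). The catalogue document and the second product in it are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue with two Software entries.
CATALOGUE_XML = """<catalogue>
<Software><Publisher>Sybase</Publisher><Title>PowerBuilder</Title><Price currency="GBP">145.00</Price></Software>
<Software><Publisher>Example Corp</Publisher><Title>Sample Tool</Title><Price currency="USD">99.00</Price></Software>
</catalogue>"""

root = ET.fromstring(CATALOGUE_XML)

# Every Price element, at any depth, whose currency attribute is GBP.
gbp_prices = [p.text for p in root.findall(".//Price[@currency='GBP']")]

# Every Software element that has a Publisher child with the text Sybase.
sybase_titles = [
    s.findtext("Title") for s in root.findall("./Software[Publisher='Sybase']")
]

print(gbp_prices, sybase_titles)
```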
Related Technologies
• Event driven
• Can handle large files
• No random access
• Read only
• Originated as a Java API; ports exist for other languages
Simple API for XML (SAX)
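The event-driven style can be sketched with Python's standard xml.sax module (used here as an assumption for illustration). ItemCounter is a hypothetical handler and the invoice document is invented.

```python
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Counts item elements and records each start/end event as it arrives."""

    def __init__(self):
        super().__init__()
        self.item_count = 0
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser each time a start tag is read.
        self.events.append(("start", name))
        if name == "item":
            self.item_count += 1

    def endElement(self, name):
        # Called by the parser each time an end tag is read.
        self.events.append(("end", name))


handler = ItemCounter()
xml.sax.parseString(b"<invoice><item/><item/><item/></invoice>", handler)
print(handler.item_count, len(handler.events))
```

The handler keeps only a counter and whatever it chooses to record, so no tree is built. That is why this style suits large files, and also why it offers no random access.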
Related Technologies
• Standard set of function calls
• XML loaded into memory
• Best for smaller files
• Data is parsed into a tree of nodes
• Language and platform neutral
Document Object Model (DOM)
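For contrast, a DOM sketch using Python's standard xml.dom.minidom module (again an assumption for illustration). The whole document is loaded into a tree of nodes, which can then be visited in any order and changed. The sample document and the replacement values are invented.

```python
from xml.dom.minidom import parseString

dom = parseString(
    '<talk code="ID352"><title>An Introduction to XML</title>'
    "<time>13:30</time></talk>"
)
root = dom.documentElement

# Random access: jump straight to the title element and read its text node.
title = root.getElementsByTagName("title")[0].firstChild.data

# The tree is writable: change an attribute and a text node in memory.
root.setAttribute("code", "ID999")
root.getElementsByTagName("time")[0].firstChild.data = "14:00"

updated = root.toxml()
print(title)
print(updated)
```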
Demonstration
Summary
Summary
• What XML is
• Why we should use XML
• How to define rules in XML
• XML’s related technologies
What have we learnt?
Summary
Recommended reading
Title : Professional XML
Author : Mark Birbeck et al
Publisher : Wrox Press Inc
ISBN : 1861003110
Summary
Recommended reading
Title : Fast Track To XML
Author : Eric Zenor
Publisher : Sybase
Article : 1003388
Summary
Useful Web Sites
XML Org : www.xml.org
W3C : www.w3.org/xml
XML FAQ : www.ucc.ie/xml
XML Cover Pages : www.oasis-open.org/cover/sgml-xml
XML Journal : www.sys-con.com/xml
Summary
• MS XML Notepad http://msdn.microsoft.com
• XML Spy http://www.xmlspy.com
Useful XML Tools
Questions
If you have any questions about this presentation please email me or visit my web site.
Email : [email protected]
Web : www.pauldonohue.com