An Introduction to XML
Based on the W3C XML Recommendations
Agenda
• XML Syntax
– XML vs HTML
– Data Types – Elements, Attributes
– White Space – Optional, Mandatory, & Preserved
– Empty Content
– Valid vs Well Formed
• XML Schema
– Used to Validate XML data
– Before XML Schema – DTDs
– Simple Types vs Complex Types
– Restricting data with Regular Expressions
• Namespaces – Avoiding Tag Name Conflicts
• XML Tools
– XML Spy and Other Tools
• Corresponding Sample XML, XSD, DTD, XSL, and XHTML files
• XML Resources on the web
– http://www.w3schools.com - an excellent site
– http://www.xml.com
– http://www.w3.org
XML vs HTML
As you can see, XML looks similar to HTML.
<?xml version="1.0"?>
<root>
  <child>
    <subchild attribute="metadata">Data</subchild>
  </child>
</root>
XML vs HTML
• Unlike HTML:
– XML is case sensitive
– Tags must be properly nested
– Every start tag must have a corresponding end tag that closes the element
– Every XML document must have exactly one root element
– Attribute values must be quoted (single or double quotes)
– White space between tags is preserved
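The rules above can be checked by handing small documents to any XML parser. The sketch below uses Python's standard-library parser, which is an assumption of this example (the slides themselves rely on XML Spy); the helper name is_well_formed and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Each sample breaks (or obeys) one of the rules listed above.
samples = {
    "properly nested": "<root><child>Data</child></root>",
    "case mismatch": "<root><Child>Data</child></root>",
    "missing end tag": "<root><child>Data</root>",
    "overlapping tags": "<root><a><b>Data</a></b></root>",
    "two root elements": "<root/><root/>",
    "unquoted attribute": "<root id=1/>",
    "single-quoted attribute": "<root id='1'/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first and last samples are accepted; an HTML browser would typically tolerate several of the others.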
XML vs HTML
• Special Characters
– Handled the same way as in HTML, using entity references
– For Example:
• < > ' " &
• &lt; &gt; &apos; &quot; &amp;
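As a rough illustration of how the five special characters round-trip, the sketch below escapes a string and parses it back with Python's standard library. The sample text and the note element are made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'Tom & Jerry <"quoted"> \'text\''

# escape() replaces &, < and > by default; the extra mapping adds the
# two quote characters so that all five special characters are covered.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# The parser turns the entity references back into the original characters.
element = ET.fromstring(f"<note>{escaped}</note>")
print(element.text == raw)
```

Escaping the quote characters is only strictly needed inside attribute values, but it is harmless in element content.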
Elements
• XML Elements are extensible and they have parent/child relationships.
• XML elements must follow these naming rules:
– Names can contain letters, numbers, underscores, periods, colons, and hyphens (the last three are not normally used in element names; the colon is reserved for namespace prefixes)
– Names must start with a letter or an underscore, not with a number, period, or hyphen
– Names must not start with the letters xml (or XML or Xml), which are reserved
– Names cannot contain spaces
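The naming rules can be approximated with a short check. This is a simplified, ASCII-only sketch of the rules on this slide, not the full Name production of the XML Recommendation (which also allows many Unicode characters); the function name follows_naming_rules is hypothetical.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, underscores, periods, colons, hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.:\-]*")

def follows_naming_rules(name: str) -> bool:
    """Check a name against the simplified rules listed on the slide."""
    if name.lower().startswith("xml"):
        return False  # names beginning with xml (any case) are reserved
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["subchild", "_private", "order.item-1",
                  "2ndChild", "-dash", "xmlData", "first name"]:
    print(candidate, follows_naming_rules(candidate))
```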
Attributes
• Attributes are normally used to store metadata (data about data); the real data is stored in elements, between the start and end tags.
• Attribute values can be enclosed in single or double quotes.
White Space
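• White Space Includes:
– Carriage returns, Line feeds, Spaces, Horizontal Tabs
• Optional White Space – White space between tags, and before the closing > of a tag, is optional in XML files
• Mandatory White Space – White space must separate an element name from its attributes, and one attribute from the next
• Preserved White Space – White space in the content between start/end tag pairs is preserved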
Optional White Space
Valid
<?xml version="1.0"?>
<root >
  <child >
    <subchild>Data</subchild>
  </child>
</root>
(White space may precede the closing > of a tag, but may not follow the opening <.)
Valid
<?xml version="1.0"?><root><child><subchild>Data</subchild></child>
</root>
Mandatory White Space
Valid
<?xml version="1.0"?><root> <child
attribute="metadata"/></root>
Invalid
<?xml version="1.0"?><root>
<childattribute="metadata"/>
</root>
Must have white space between child and attribute
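A minimal sketch of the mandatory white space rule, again assuming Python's standard-library parser as the checker. Any white space character, including a line break, is enough to separate the element name from the attribute; without it the document is rejected.

```python
import xml.etree.ElementTree as ET

with_space = '<root><child attribute="metadata"/></root>'
with_newline = '<root><child\nattribute="metadata"/></root>'
without_space = '<root><childattribute="metadata"/></root>'

# A space or a line break both separate the name from the attribute.
for text in (with_space, with_newline):
    child = ET.fromstring(text)[0]
    print(child.tag, child.attrib)

# Without white space the parser reports an error.
try:
    ET.fromstring(without_space)
    rejected = False
except ET.ParseError as error:
    rejected = True
    print("rejected:", error)
```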
Preserved White Space
Valid
<?xml version="1.0"?>
<root>
<child>
<subchild>White space between
start/end tag pairs will be preserved</subchild>
</child>
</root>
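To see what "preserved" means in practice, the sketch below parses a similar document with Python's standard library (an assumption of this example) and prints the text exactly as the parser hands it over. Collapsing the white space, as an HTML browser would, is left to the application.

```python
import xml.etree.ElementTree as ET

document = """<root>
  <child>
    <subchild>White space between
        start/end tag pairs   will be preserved</subchild>
  </child>
</root>"""

root = ET.fromstring(document)
subchild = root.find("child/subchild")

# The line break and the runs of spaces are still present in the text.
print(repr(subchild.text))

# The indentation between <root> and <child> is handed over as well.
print(repr(root.text))

# An application that wants HTML-style collapsing has to do it itself.
collapsed = " ".join(subchild.text.split())
print(collapsed)
```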
Empty Content
• If no data is “held” between a start/end tag pair, two formats may be used:
<tag></tag>
<tag/>
• The second format is called an empty-element tag (aka Empty Tag or Null tag) and is commonly used when only an attribute is needed:
<tag attribute="data"/>
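The two formats are interchangeable as far as a parser is concerned. A small sketch, assuming Python's standard-library parser:

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<tag></tag>")
short_form = ET.fromstring("<tag/>")
with_attribute = ET.fromstring('<tag attribute="data"/>')

# Both forms produce an element with no text and no children.
print(long_form.text, len(long_form))
print(short_form.text, len(short_form))

# Once parsed, the two forms serialize identically.
print(ET.tostring(long_form) == ET.tostring(short_form))

# An empty element can still carry attributes.
print(with_attribute.attrib)
```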
Valid vs Well Formed
• XML data is defined and validated most commonly by:
– XML Schemas
– DTDs (Document Type Definitions)
• XML data is well formed if it follows the syntax rules of the W3C XML Recommendation, Version 1.0
– This includes:
• The start/end tags matching up
• White space used properly
• XML data is valid if it is well formed and also conforms to its XML Schema or DTD
• NOTE: XML Spy does both checks
XML Schema
• Used to Define and Validate XML
– In order for the XML file to be validated by a schema, the schema’s location is referenced in the schemaLocation attribute of the root element; its value pairs the namespace name with the location of the schema file
<FirmOrder … schemaLocation="http://www.telcordia.com/SGG/FO
C:\13.0\Documentation\xsd\firmOrder.xsd"/>
XML Schema
• Before XML Schema, most XML documents were validated against a DTD
XML Schema
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.sample.org/xml"
    xmlns="http://www.sample.org/xml"
    elementFormDefault="qualified">
  <xsd:element name="File">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Record" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Record">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Fill" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Fill" type="xsd:string"/>
</xsd:schema>
DTD
<!ELEMENT File (Record)*>
<!ELEMENT Record (Fill)>
<!ELEMENT Fill (#PCDATA)>
[Figure: Mercator Type Tree]
XML Schema
• Element Data Types
– XML Schema’s Simple Types
• Similar to Items in Mercator
• 44 simple types are built into XML Schema
– XML Schema’s Complex Types
• Similar to Groups in Mercator
• Defined by the schema author; the only built-in complex type is xsd:anyType
– XML Schema’s Attributes
• Similar to the Properties of Items and Groups in Mercator
XML Schema
• Simple Types can be restricted using Regular Expressions:
<xsd:simpleType name="alphaString">
<xsd:restriction base="xsd:string">
<xsd:pattern value="([A-Z]|[a-z]|[ ])*"/>
</xsd:restriction>
</xsd:simpleType>
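The pattern above accepts any string made up only of upper-case letters, lower-case letters, and spaces, including the empty string. As a rough way to experiment with it outside a schema validator, the sketch below applies the same expression in Python. Two assumptions are worth stating: XML Schema patterns always apply to the whole value, which is mirrored here with fullmatch, and the XML Schema regular expression dialect differs from Python's in general, although this particular pattern uses only constructs the two share.

```python
import re

# Same expression as the xsd:pattern value in the alphaString type.
ALPHA_STRING = re.compile(r"([A-Z]|[a-z]|[ ])*")

def is_alpha_string(value: str) -> bool:
    """Return True if the whole value consists of letters and spaces only."""
    return ALPHA_STRING.fullmatch(value) is not None

for value in ["Tea Table", "", "Table 80", "Coffee-Table"]:
    print(repr(value), is_alpha_string(value))
```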
Namespaces
• XML Namespaces provide a method to avoid element name conflicts.
– Since element names are not predefined as in HTML, a name conflict can often occur when combining two documents that use the same name for two different elements
Namespaces, cont.
• If the following two XML documents were added together, there would be an element name conflict because both documents contain a <table> element with different content and definition.
<table>
  <name>Tea Table</name>
  <width>80</width>
  <length>120</length>
</table>
<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
Solving Name Conflicts using a Prefix
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
Using Namespaces
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
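A namespace-aware parser identifies each element by its namespace name plus its local name, so the two table elements no longer collide. The sketch below shows this with Python's standard library; the catalog wrapper element and the html and furn prefixes used for searching are hypothetical, added only so that both tables can live in one document.

```python
import xml.etree.ElementTree as ET

document = """<catalog>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</catalog>"""

root = ET.fromstring(document)

# Each tag is reported as {namespace name}local-name, so the tables differ.
for element in root:
    print(element.tag)

# The prefixes used for searching need not match those in the document;
# only the namespace names have to agree.
namespaces = {
    "html": "http://www.w3.org/TR/html4/",
    "furn": "http://www.w3schools.com/furniture",
}
cells = [td.text for td in root.findall("html:table/html:tr/html:td", namespaces)]
name = root.find("furn:table/furn:name", namespaces).text
print(cells, name)
```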
Namespaces
• URIs are used as the namespace name
– The most commonly used kind of URI is a URL
– URLs are based on domain names, which are unique to the companies that own them
– The URL does NOT need to point to an actual resource
• URIs are used for creating uniqueness, not for validating your tags
• Most companies put “help” documentation about their namespace, tags, and/or XML Schemas at the namespace URL
XML Samples
• The next five slides have different types of XML files that correspond to each other:
– XML Data Document
– XML Schema
– DTD (these are not written in XML)
– XSL – style sheet
– XHTML generated by the style sheet
XML Data Sample
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="xmlxsl.xsl"?>
<root>
<child>
<name>Optional name tag used in this child tag</name>
<description>First description start/end tag pair in child tag</description>
</child>
<child>
<name>Optional name tag used in this child tag</name>
<description>First description start/end tag pair in child tag</description>
<description>Second description start/end tag pair in child tag</description>
</child>
</root>
XML Schema Sample
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.sample.org/xml"
    xmlns="http://www.sample.org/xml"
    elementFormDefault="qualified">
  <xsd:element name="root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="child" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="child">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="name" minOccurs="0"/>
        <xsd:element ref="description" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="name" type="xsd:string"/>
  <xsd:element name="description" type="xsd:string"/>
</xsd:schema>
XML DTD Sample
<!ELEMENT root (child)*>
<!ELEMENT child (name?, description+)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT description (#PCDATA)>
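The content models in this DTD read like regular expressions over child element names: * means zero or more, ? means optional, and + means one or more. The sketch below makes that reading concrete by checking a document against hand-translated expressions in Python. It is an illustration of the idea only, not a DTD validator; the CONTENT_MODELS table and the function name are hypothetical, and stray text inside root or child is not checked.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD, rewritten as regular expressions over
# space-terminated child element names (an encoding chosen for this sketch).
CONTENT_MODELS = {
    "root": re.compile(r"(child )*"),                 # (child)*
    "child": re.compile(r"(name )?(description )+"),  # (name?, description+)
}

def matches_content_model(element: ET.Element) -> bool:
    """Check an element and its descendants against the models above."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # name and description are #PCDATA: text only, no child elements
        return len(element) == 0
    child_names = "".join(child.tag + " " for child in element)
    return model.fullmatch(child_names) is not None and all(
        matches_content_model(child) for child in element
    )

good = ET.fromstring(
    "<root><child><name>n</name><description>d1</description>"
    "<description>d2</description></child><child><description>d</description>"
    "</child></root>"
)
missing_description = ET.fromstring("<root><child><name>n</name></child></root>")
wrong_order = ET.fromstring(
    "<root><child><description>d</description><name>n</name></child></root>"
)

print(matches_content_model(good))
print(matches_content_model(missing_description))
print(matches_content_model(wrong_order))
```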
Style Sheet (XSL) Sample
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>XHTML Sample</h2>
<table border="1">
<tr bgcolor="gray">
<th>Name</th>
<th>Description</th>
</tr>
<xsl:for-each select="root/child">
<tr>
<td><xsl:value-of select="name" /></td>
<td><xsl:value-of select="description" /></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
XHTML Generated
<html>
<body>
<h2>XHTML Sample</h2>
<table border="1">
<tr bgcolor="gray">
<th>Name</th>
<th>Description</th>
</tr>
<tr>
<td>Optional name tag used in this child tag</td>
<td>First description start/end tag pair in child tag</td>
</tr>
<tr>
<td>Optional name tag used in this child tag</td>
<td>First description start/end tag pair in child tag</td>
</tr>
</table>
</body>
</html>
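Note that the second row shows only the first description, even though the second child element in the data sample has two. In XSLT 1.0, xsl:value-of outputs the value of the first node that its select expression matches. The sketch below is a rough Python analogue of the style sheet's for-each loop, not an XSLT processor; build_rows is a hypothetical helper, and findtext plays the role of value-of because it also returns the first match.

```python
import xml.etree.ElementTree as ET

data = """<root>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
  </child>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
    <description>Second description start/end tag pair in child tag</description>
  </child>
</root>"""

def build_rows(xml_text: str) -> list[tuple[str, str]]:
    """Return one (name, description) pair per child element."""
    rows = []
    for child in ET.fromstring(xml_text).findall("child"):  # for-each root/child
        name = child.findtext("name", default="")            # first name
        description = child.findtext("description", default="")  # first description only
        rows.append((name, description))
    return rows

for name, description in build_rows(data):
    print(f"<tr><td>{name}</td><td>{description}</td></tr>")
```

Listing every description would need a nested loop over the description elements, in the style sheet as well as in this sketch.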
XML Spy
• Accomplishes several XML tasks including:
– Editing a variety of XML data graphically
– Allowing multiple views including:
• Text, browser, grid, structure (schema design)
– Creating test data from XML Schemas
– Generating XML Schemas from XML files
– Validating data
– Checking data for well-formedness
Other Tools
• Internet Explorer
– Displays XML data using a default style sheet
– Checks XML for well-formedness and displays an error message for troubleshooting
• UltraEdit
XML Resources on the web
• There are hundreds of XML resources on the web.
– http://www.w3schools.com (an excellent site)
– http://www.xml.com
– http://www.w3.org
• The easiest way to find information about a specific XML topic or syntax is to type it into google.com
Contact Us
Barry DeBruin
debruinconsulting.com
919-434-5399