XML for Dummies (and managers), by Mark Pascall, Technical Architect


Upload: osborn-hopkins

Post on 18-Dec-2015


XML for Dummies (and managers)

Mark Pascall, Technical Architect

Overview

What is XML?

Extensible Markup Language

Many pieces in the XML puzzle: XML Schema, SOAP, Namespace, XML, XSLT, XQuery, HTML, XPointer, XLink

Very fast moving

First – back to basics…

How the Internet works…

Web client (browser) to web server: "Get me that text file"

Web server to web client: "OK, here it is"

Hypertext Mark-up Language

<h1>Here is the title</h1>

<p>This is a piece of <b>Text</b> </p>

A formatting language

Browser knows how to interpret the tags.

What is Extensible Markup Language?

NOT a markup language

Meta-markup Language

Set of very simple rules

XML provides a uniform method for describing and exchanging structured data

Describes structure and semantics, not formatting

Meta Language

Natural languages (French, English, German, …) share a common set of rules:
1. Use letters of the alphabet.
2. Spaces between words.
3. Read from left to right.
…

XML languages (SVG, WML, MathML, …) share the rules of the XML Specification:
1. Tags must not overlap
2. Tags are case-sensitive
3. Must have a root tag
…

The XML Rules…

1. Single, unique root element
2. Matching open/close tags
3. Consistent capitalisation
4. Correctly nested elements (no overlapping elements)
5. Attribute values enclosed in quotes
6. No repeating attributes in an element

<?xml version="1.0"?>
<company id="4859">
  <name>3Months.com</name>
  <type>Web Development</type>
  <address>
    <street>Wakefield st</street>
    <city>Wellington</city>
    <country>New Zealand</country>
  </address>
</company>

Well Formed
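The rules above can be tried out with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the deliberately broken snippets are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# A shortened version of the company document from the slide.
WELL_FORMED = """<?xml version="1.0"?>
<company id="4859">
  <name>3Months.com</name>
  <address>
    <city>Wellington</city>
  </address>
</company>"""

# Hypothetical snippets, each breaking exactly one of the six rules.
BROKEN = {
    "two root elements": "<a/><b/>",
    "missing close tag": "<a><b></a>",
    "inconsistent capitalisation": "<Name>x</name>",
    "overlapping elements": "<a><b><c></b></c></a>",
    "unquoted attribute value": "<a id=4859/>",
    "repeated attribute": '<a id="1" id="2"/>',
}


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("company document:", is_well_formed(WELL_FORMED))
    for rule, snippet in BROKEN.items():
        print(f"{rule}: {is_well_formed(snippet)}")
```

A parser that meets a rule violation stops with an error rather than guessing, which is the main practical difference from the forgiving way browsers treat HTML.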

History of XML

Standard Generalised Markup Language (SGML)

Been around since the 1980s (ISO standard since 1986)

XML is a sub-set of SGML (SGML-lite)

XML has a smaller and simpler syntax

SGML’s development provides the foundation for XML

XML is therefore not “bleeding edge”

Why XML is so powerful

Provides an international, vendor-independent standard for describing information

Any information – structured or unstructured

TCP/IP

HTTP

XML

XML Markup languages/vocabularies

Remember XML is a meta-language

Anybody can create their own language

Why would you want to?

Each language designed for a specific purpose…

Mathematical Markup Language (MathML)

x² + 4x + 4 = 0

<apply><plus/>
  <apply><power/>
    <ci>x</ci>
    <cn>2</cn>
  </apply>
  <apply><times/>
    <cn>4</cn>
    <ci>x</ci>
  </apply>
  <cn>4</cn>
</apply>
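Because the markup describes the structure of the expression rather than how it looks, a program can work with it directly. The sketch below walks the `<apply>` tree and evaluates the left-hand side for a given x. It is only an illustration: the `evaluate` function is a made-up helper that handles just the three operators used on the slide, and it assumes the un-namespaced element names shown there (real MathML documents normally declare a namespace).

```python
import xml.etree.ElementTree as ET
from math import prod

# The slide's markup for x^2 + 4x + 4 (the left-hand side of the equation).
MATHML = """<apply><plus/>
  <apply><power/><ci>x</ci><cn>2</cn></apply>
  <apply><times/><cn>4</cn><ci>x</ci></apply>
  <cn>4</cn>
</apply>"""


def evaluate(node, variables):
    """Evaluate a tiny subset of content markup: cn, ci, plus, times, power."""
    if node.tag == "cn":  # a number
        return float(node.text)
    if node.tag == "ci":  # an identifier such as x
        return variables[node.text.strip()]
    # An <apply> element: the first child names the operator,
    # the remaining children are its operands.
    operator, *operands = list(node)
    values = [evaluate(child, variables) for child in operands]
    if operator.tag == "plus":
        return sum(values)
    if operator.tag == "times":
        return prod(values)
    if operator.tag == "power":
        base, exponent = values
        return base ** exponent
    raise ValueError(f"unsupported operator <{operator.tag}>")


if __name__ == "__main__":
    expression = ET.fromstring(MATHML)
    # x = -2 is the root of x^2 + 4x + 4 = 0, so this should print 0.0
    print(evaluate(expression, {"x": -2}))
```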

Synchronized Multimedia Integration Language (SMIL)

<DIV CLASS="time" t:timeline="seq">
  <P class="time" t:dur="1">
    This appears for one second and goes away
  </P>
  <P class="time" t:dur="1">
    This appears after one second, remains visible for one second and goes away
  </P>
  <P class="time" t:dur="1">
    This appears after two seconds, remains visible for one second and goes away
  </P>
</DIV>

Vector Markup Language

<v:shape style='top: 0; left: 0; width: 250; height: 250'
         stroke="true" strokecolor="red" strokeweight="2"
         fill="true" fillcolor="green"
         coordorigin="0 0" coordsize="175 175">
  <v:path v="m 8,65 l 72,65,92,11,112,65,174,65,122,100,142,155,92,121,42,155,60,100 x e"/>
</v:shape>

Wireless Markup Language

<wml> <!-- root element -->
  <card id="card1" title="Example 1">
    <p> <!-- a card can only contain P blocks or DO blocks -->
      <do type="accept" label="go to card 2">
        <go href="#card2"/>
      </do>
      This is the first card.
    </p>
  </card>
  <card id="card2" title="Example 1">
    <p> This is the second card. </p>
  </card>
</wml>

Hypertext Markup Language (HTML)

<h1>The Title</h1>

<p>This is a piece of <b>HTML </b> </p>

Or is it??

Next version of HTML = XHTML

XML Schemas and DTDs

XML is about communication

Need to speak the same language

Schemas/DTDs describe the vocabulary of the language

i.e. what tags are used and how they can be organised

Schemas will replace DTDs

An Example Schema

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xsd:element name="PressRelease">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Date" type="xsd:date"/>
        <xsd:element name="Content" type="xsd:string"/>
        <xsd:element name="Author" type="xsd:string" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

<?xml version="1.0" encoding="UTF-8"?>
<PressRelease>
  <Title>Student Loan Problems</Title>
  <Date>2001-07-20</Date>
  <Content>Bla Bla Bla</Content>
</PressRelease>
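To make concrete what the schema promises, here is a rough, hand-written sketch of the checks a validator would perform on a PressRelease document. It is not a schema validator: Python's standard library has none, and real XSD validation needs a third-party package. The `check_press_release` function and the `EXPECTED` table are assumptions written for this illustration, and the date check only approximates `xsd:date`, which expects the form YYYY-MM-DD, so a value such as 20/7/01 would be rejected.

```python
import xml.etree.ElementTree as ET
from datetime import date

# (element name, required?) in the order given by the schema's xsd:sequence.
# Author has minOccurs="0", so it is optional.
EXPECTED = [("Title", True), ("Date", True), ("Content", True), ("Author", False)]


def check_press_release(text: str) -> list:
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(text)
    if root.tag != "PressRelease":
        return [f"unexpected root element <{root.tag}>"]

    problems = []
    children = list(root)
    position = 0
    # Walk the expected sequence, consuming children that match in order.
    for name, required in EXPECTED:
        if position < len(children) and children[position].tag == name:
            position += 1
        elif required:
            problems.append(f"missing required element <{name}>")
    if position < len(children):
        problems.append(f"unexpected element <{children[position].tag}>")

    # Approximate the xsd:date type with an ISO 8601 date check.
    date_node = root.find("Date")
    if date_node is not None:
        try:
            date.fromisoformat((date_node.text or "").strip())
        except ValueError:
            problems.append(f"<Date> is not an ISO date: {date_node.text!r}")
    return problems


if __name__ == "__main__":
    sample = (
        "<PressRelease><Title>T</Title><Date>20/7/01</Date>"
        "<Content>C</Content></PressRelease>"
    )
    print(check_press_release(sample))
```

The point of agreeing on a schema is that both sender and receiver can run this kind of check automatically before trusting a document.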

Introducing XSL-T

Extensible Stylesheet Language

Standard ratified this year by W3C

Way of transforming an XML document into another document

Transformation

XML document + XSLT document → XSLT Processor → Text, HTML or XML
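The idea of one source document feeding several output formats can be sketched without an XSLT processor. The code below is not XSLT: it is a hand-written Python stand-in for what a stylesheet would declare, using a sample press release made up for this illustration. The function names `to_html` and `to_text` are assumptions.

```python
import xml.etree.ElementTree as ET

# Illustrative source document (sample data, not from a real system).
SOURCE = """<PressRelease>
  <Title>Student Loan Problems</Title>
  <Date>2001-07-20</Date>
  <Content>Bla Bla Bla</Content>
</PressRelease>"""


def to_html(xml_text: str) -> str:
    """Build an HTML page from the press release fields."""
    doc = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = doc.findtext("Title")
    ET.SubElement(body, "p", {"class": "date"}).text = doc.findtext("Date")
    ET.SubElement(body, "p").text = doc.findtext("Content")
    return ET.tostring(html, encoding="unicode", method="html")


def to_text(xml_text: str) -> str:
    """Produce a plain-text rendering of the same source document."""
    doc = ET.fromstring(xml_text)
    return f"{doc.findtext('Title')} ({doc.findtext('Date')})\n{doc.findtext('Content')}"


if __name__ == "__main__":
    print(to_html(SOURCE))
    print(to_text(SOURCE))
```

With XSLT the two functions would instead be two stylesheets, and the same processor would apply either one to the unchanged source document.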

Summary

The XML Specification = Meta-language for describing XML Mark-up languages

XML Schemas (and DTDs) describe the structure of a particular XML Mark-up language

XSL-T documents transform an XML document into another format (HTML, XML etc)

XML in Action – a case study

Agenda

Case study background

Choosing a DTD/Schema

Creating XML – the options

Storing XML – the options

Presenting XML – the options

Solution Benefits

Demo

Case study background

October 2001 – won contract to redevelop E-government website (www.e-government.govt.nz)

Business requirements
– Usual stuff (accessible, usable etc)
– Guidelines compliant (squeaky clean)
  “Content must be made available in a standard HTML format. Where information is provided in a proprietary format an alternative HTML version must also be made available.”
  Can’t just put it up as a PDF anymore
– Simple publishing process
– Future-proofed

Constraints
– Limited budget
– Tight timeframe

Traditional options

Static site
– Did not give simple publishing process for “unskilled” people

Content Management System
– Store information in RDBMS
– Not good for “document centric” applications
– Cost, timeframe, simple workflow requirements

The challenge
– To create a system that allows non-technical authors to publish to guidelines compliant HTML (and eventually other formats)

The solution
– XML

Choosing a DTD/Schema

Make up your own

Don’t reinvent the wheel!

Xml.org

We selected a subset of DocBook
– Could handle all the information we needed to store
– Supported by a growing number of products

NZETC uses TEI

Creating XML – the Options

Use an XML editor
– E.g. XML Spy, XMetaL, FrameMaker etc
– Allow you to create a document that conforms to a specified DTD/Schema
– Problem: everybody potentially an author

Convert Word documents to XML
– Styles

Word (using Styles) → DocBook XML → HTML (Internet), eBook, anything you want!!

XSL-T for conversion

Word is authoritative source

Xcon Demo

Storing XML – the Options

Relational database

Native XML Repository
– E.g. Excelon, Tamino, Ipedo, Xindice
– First generation products
– At the time too immature/expensive

On the file system

Presenting XML – the Options

Publishing to humans

Need to “transform” XML to a format appropriate for humans

Physical print – out of scope

HTML obvious choice

XSL-T to transform XML to HTML

Not the only way to present to humans
– SMIL, SVG, MathML etc
– Audience must have software

What about publishing “Raw” XML?

XML document

Organisation A Website

Organisation B Website

HTML document

Organisation C Website

HTML document

XML document

Demo