practical use of xml - cerncern.ch/computing.seminars/2004/1201/slides.pdfbusiness section practical...

36
CERN – European Organization for Nuclear Research IT Department – e Business Section Practical Use of Practical Use of XML XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

Upload: others

Post on 15-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERN – European Organization for Nuclear ResearchIT Department – e–Business Section

Practical Use ofPractical Use of XMLXML

Rostislav TitovIT-AIS-EB (e-Business) Section

CERN – Geneva, Switzerland

Page 2: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML

eXtensible Markup LanguageeXtensible Markup Language

l SGML (ISO standard, 1986)Mainly for technical documentation

l XML (W3C recommendation, 1998)Simplification and enhancement of SGML, wide area of use

Page 3: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

<book lang=“Hungarian”><chapter>

<section> </section><section> </section>

</chapter><chapter>

<section> </section><section> </section>

</chapter> </book>

IntroductionTextMarkup

More document markupReserved attributesProcessing instructions

Why Markup?Why Markup?

Markup allows to add information about data

structure

Markup allows to add information about data

structure????????

?????????????

?????????????? ?????? ? ????????????????????????? ????????? ????????? ?? ?????????

Page 4: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

<?xml version="1.0" encoding="UTF-8"?><presentation>

<author><firstname>Rostislav</firstname><lastname>Titov</lastname>

</author><chapter number="1" title="What is XML">

XML (Extensible Markup Language) is …</chapter><conclusion/>

</presentation>

XMLXML: Rules: Rules

l Headerl One root elementl Tag hierarchyl Attributes

Some rules

l Element names are case-sensitive

l Every opening tag should have a closing tag

l Tags cannot intersect (<a><b></a></b>)

l Attributes values – in quotes or apostrophes

l Text elementsl Empty elements

Page 5: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML: Data Transfer: Data Transfer

l Platform and language independent

l Easy to write, easy to process

l Understandable for humans and computers

l Open standard

– Many libraries exist

– Lots of literature available

– Specialized XML-editors

l Possibility to check the document structure

Page 6: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML: Data Transfer (2): Data Transfer (2)

ExternalProgram

EDH

XML

l Automatic form generation from external programs

l XML as data transfer format

l Schema checkup as a warranty of data consistency

Example: EDH Transport Request

Page 7: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

Web ServicesWeb Services

Webservice

WSDLWSDL

XML

SOAPSOAP

XML

l Data transfer between programs on Internetl Open Standardl Platform and language independent (Java, .Net, …)

WSDL – Web Service Definition Language

SOAP – Simple Object Access Protocol

Page 8: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML: Data Storage: Data Storage

l Data structure is kept together with the data

l Object “addendum” to relational RDBMS

l Structure checkup

l Supported by many modern RDBMS

– Microsoft SQL Server 2005, Oracle 9i +,

– XML Data Type

– XML indexes

– XML Queries (XQuery etc.)

– Data output in XML format

Page 9: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML: Data Storage (2): Data Storage (2)

Example: EDH Search System

Our solution:

l All documents are stored in XML

l Context-specific XML search (Oracle InterMedia)

Example: «Find documents created by Slava»:

Select DOC_ID from DOC_XML where Contains(XML, “Slava within creator”) > 0;

Problem: Effective search using arbitrarynumber of criteria is problematic

Page 10: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML: Data Transformations: Data Transformations

l XML can be transformed into HTML, text, PDF, ...

– No need for special program solutions

– Commercial visual editors exist

– Platform independent

Page 11: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML--based Standardsbased Standards

l Possibility to formally define the structure

l Platform and language independent

l Understandable for humans and computers

l Possibility to use XML technologies (XSLT transformations, XQuery queries)…

– WSDL (Web Services Definition Language)

– SOAP (Simple Object Access Protocol)

– XHTML (HTML that complies to XML rules)

– SVG (Scalable Vector Graphics)

– ebXML (XML for e-Business)

– …

Page 12: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

Formal Structure DefinitionFormal Structure Definition

l There are ways to define XML structure formally

• DTD (Document Type Definition)

• XML Schema

Obsolete!Not for new

development

Obsolete!Not for new

development

Page 13: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XML SchemaXML Schema: : PossibilitiesPossibilities

l Check element presence and their order

l Sequences and choices

l Number of repetitions for elements and groups

l Attributes and their presence

l Type of elements and attributes

l Restrictions for elements and attributes

l Default values

l Unique constraints

l ...

Page 14: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML--schemaschema: : when it is neededwhen it is needed??

l Formal structure definition for future reference

l Programmers may rely on data consistence

l Authors may check XML validness in advance

Page 15: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML--schemaschema: : when NOT neededwhen NOT needed??

lWhen we know in advance that XML is valid

lWhen we do not care about document validness

lWhen maximum processing speed is required

l Small “throw away” projects

Page 16: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XPathXPath: : XML NavigationXML Navigation

l Access to XML elements

l Result of an XPATH-expression can be:

C:\presentation\author\firstname /presentation/author/firstname

l XML Nodel Node Setl Boolean

l Stringl Numberl Empty Set

Page 17: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XXPathPath: Examples: Examples

l Find the DG’s name

/cern/dg/person/text()

l Find all departments

/cern/department/@name

l Find all people

//person

l Find the name of DH of IT

/cern/department[@name=“IT”]/dh/person/text()

l Find how many groups has a department where R. Martens workscount(//gl/person[starts-with(., 'R. Martens')]/../../../group)

<cern><dg><person>R. Aymar</person></dg><department name=“PH”>

<dh><person>W-D. Schlatter</person></dh></department><department name=“IT”>

<dh><person>W. von Rueden</person></dh><group name=“IT-AIS”>

<gl><person>R. Martens</person></gl></group><group name=“IT-CO”>

<gl><person>D. Myers</person></gl></group><group name=“IT-IS”>

<gl><person>A. Pace</person></gl></group>

</department></cern>

Page 18: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XPathXPath: Examples: Examples ((88))

Example: Event Handling System

Check eventsagainst XPath

XMLXML

XML

Events Subscriptions

XPath XPath

Handling System

Notifications

«I want to see all documents for more than 600 CHF»

/ document [amount > 600]

Page 19: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XPathXPath: Program Use: Program Use

Element root = xml.getDocumentElement();

Node child;

for (child = root.getFirstChild(); child != null; child = child.getNextSibling())

if (child.getNodeName().equals("report") && ( (Element)child ).getAttribute("name").equals("Slava"))

break;

for (child = ((Element)child).getFirstChild(); child != null; child = child.getNextSibling())

{

if (child.getNodeName().equals("title") )

{

for (Node child2 = child.getFirstChild(); child2 != null; child2 = child2.getNextSibling())

if ( child2 instanceof Text )

System.out.println(( (Text)child2 ).getData().trim());

}

}

System.out.println(((XMLDocument)xml).selectSingleNode("/config/report[@name='Slava']/title/text()").getNodeValue());

XPath

DOM Model

Page 20: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XQuerXQuery y ––XML XML Query LanguageQuery Language

l XQuery is SQL for XML

– Database independent

– Easy to use

l Supported by popular RDBMS(Microsoft SQL Server 2005, Oracle 9i and10g)

l Based on XPath, supports document sets

Page 21: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLT: XML TransformationsXSLT: XML Transformations

l Transforms XML to HTML, text or other XMLl XSLT 1.0 (Current), XSLT 2.0 (Draft)l XSLT is a “Human Interface” to XMLl Supported by Web Browsers

XSLT

Page 22: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLT: Simplified StructureXSLT: Simplified Structure

xsl:stylesheet

xsl:template

xsl:template

xsl:value-of

xsl:value-of

xsl:apply-templates

<html><body>

… </body>

<html>

l XSLT is an XML filel Active usage of XPath expressions

Apply a templateto the given element

Evaluate XPath and print value

Apply templatesto other elements

Page 23: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLT: PossibilitiesXSLT: Possibilities

l Conditions (<xsl:if>)l Loops (<xsl:for-each>)l Variables (<xsl:variable>)l Sorting (<xsl:sort>)l Numbering [1., 1.1., 1.1.?, 2.,] (<xsl:number>)l Number formatting (format-number())l Multiple step processing (mode)l String manipulations (via XPath)

XSLT 2.0 (Draft)l XPath 2.0l Custom functionsl Regular expressionsl Date and time formattingl Groupings

Page 24: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLT: ExampleXSLT: Example<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="presentation"><html>

<body bgcolor="#FFCCFF"><h1><font color="darkblue"><xsl:value-of select="title"/></font></h1><h4><font color="green"><i>Author: <xsl:value-of select="author"/></i></font></h4><b>Table of Contents</b><br/><br/><xsl:apply-templates select="chapter" mode="contents"/><br/><br/><xsl:apply-templates select="chapter" mode="normal"/>

</body></html>

</xsl:template>

<xsl:template match="chapter" mode="normal"><b>Chapter <xsl:value-of select="@number"/>. <xsl:value-of select="@title"/></b><br/><br/><i><xsl:value-of select="text()"/></i><br/><br/>

</xsl:template>

<xsl:template match="chapter" mode="contents"><xsl:value-of select="@number"/>. <xsl:value-of select="@title"/><br/>

</xsl:template></xsl:stylesheet>

Page 25: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLTXSLT:: Web Web “Skins”“Skins”

<aissearchscreen><head><title>Person Search</title></head><body>

<input type="hidden" name="isAdvanced" value="false"/><input show="always" type="text" label="Keyword" value="titov"/><input type="checkbox" label="Fuzzy search" value="No"/><result>

<header><tablecell>Full Name</tablecell>…

</header><row>

<tablecell>Maksym TITOV</tablecell><tablecell>71169</tablecell><tablecell>40-3-C08</tablecell>…

</row><row>

<tablecell>Oleg TITOV</tablecell><tablecell>EXT</tablecell>…

</row>…<rowcount>4</rowcount>

</result></body>

</aissearchscreen>

Page 26: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLTXSLT:: Web Web “Skins” “Skins” -- 22

XSLT

Page 27: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLTXSLT:: User InterfacesUser Interfaces

CERN Stores Catalog

l Data loaded through XML

l Data stored in XML

l XSLT for data output

l 150000 items

l +10000 users

l ~15-20K XML for each page

l Custom formatting (through XSLT redefinition)

Page 28: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLT: XML to TextXSLT: XML to Text

Example:l Automatic code generation

<document><input type=“person” name=“A”/><input type=“number” name=“B”/>…

</document>InterfaceInterface

XML-description

Program

Business LogicBusiness Logic

SQLSQL

...

Did you know…that 1 EDH document is:

l At least 20 source files (code, HTML templates, resources, SQL, …)

l About 250K of source code

Page 29: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLT: XML to XMLXSLT: XML to XML

l Generate XML from another XML source

l “Configuration files update”

l XSL:FO

Page 30: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLXSL--FO: Formatting ObjectsFO: Formatting Objects

l FO: XML-description of document layout

l XSL-FO: XSLT transformation of XML document to FO document

l FO Processor: program that converts the FO definition into a printable format (PDF, PS, ...)

<?xml version="1.0"?><presentation>

<title>XXX

</title></presentation>

<?xml version="1.0"?><presentation>

<title>XXX

</title></presentation>

<fo:root><fo:page-sequence>

<fo:flow>...

</fo:flow></fo:page-sequence></fo:root>

<fo:root><fo:page-sequence>

<fo:flow>...

</fo:flow></fo:page-sequence></fo:root>

XMLDocument

FODocument

PDFDocument

XSL:FOTransformation

FOProcessor

Page 31: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLXSL--FO: Formatting ObjectsFO: Formatting Objects

l Fontsl Paginationl Headers and footersl Page numberingl Odd/even page distinctionl Margins and intervalsl Keep paragraphs togetherl Hangout linesl Tablesl Graphicsl …

FO has all capabilities of moderntext editors:

FO Processor:Apache FOP

Page 32: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XSLXSL--FO: ExampleFO: Example

XMLXML

e-MAPS

XSLT

Web Interface

Printable Version

XSL:FO

FOP Processor

l No extra code requiredl RTF to XSL:FO converters are goodl Can be written by a studentl Output format independent

Page 33: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML EditorsEditors

l Specially designed for XML editing

l XML well-formedness and validity check

l DTD and Schema visual editing

l XML generation accordingly to DTD/Schema

l Creation and debugging of XSLT and XSL:FO

l Visual XSLT editing

Example: Altova XML Spy (www.xmlspy.com)

- Available from NICE

- License can be obtained from the SDT service

XMLSpy 2005

Page 34: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

XMLXML: Program Handling: Program Handling

l DOM (Document Object Model)

– Tree building

l SAX

– Event handling– startElement()– endElement()

Java, C++:

– Apache Xalan

– Oracle XML Parser

...

PERL, .Net:

– Built-in support

SAX - much faster, DOM – more versatile

SAX - much faster, DOM – more versatile

Page 35: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

New TechnologiesNew Technologies

l InfoPath 2003– Corporate system for electronic form

handling– XML-based– Business rules defined by XML schema– Data validation using XML schemas

l Adobe Intellegent Document Platform– Similar ideas

Page 36: Practical Use of XML - CERNcern.ch/Computing.Seminars/2004/1201/slides.pdfBusiness Section Practical Use of XML Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland

CERNe–Business

ConclusionConclusion

«XML is one of the biggest inventions in IT area in the last few years. There is a lot of XML applications around the world today, and this amount will grow every year»

«XML is one of the biggest inventions in IT area in the last few years. There is a lot of XML applications around the world today, and this amount will grow every year»

W3C Consortium Web Site:http://www.w3c.org

Questions:[email protected]