Intro XML for archivists (2011)

archives hub workshop 2011 An Introduction to XML

Upload: jane-stevenson

Post on 18-Nov-2014

DESCRIPTION

A short introduction to XML, intended to be used as part of a course on EAD (XML for archives).

TRANSCRIPT

Page 1: Intro XML for archivists (2011)

archives hub workshop 2011

An Introduction to XML

Page 2: Intro XML for archivists (2011)

XML

eXtensible Markup Language

Define XML

XML syntax and rules

XML DTDs and Schemas

Displaying XML

Why use XML?

Page 3: Intro XML for archivists (2011)

What is XML?

XML is a grammatical system for creating languages… a meta-language

Use XML to design your own markup language, consisting of meaningful tags that describe the data they contain

Create a language for describing anything: archives, books, government services, properties…

Page 4: Intro XML for archivists (2011)

What is interoperability?

the ability to exchange/share data

provides advantages of cross-searching, so user can easily search across and retrieve resources from a variety of different systems

allows users to move beyond individual websites for individual resources

integrates information resources presented in different formats

XML facilitates interoperability

Page 5: Intro XML for archivists (2011)

Something to remember about XML

XML does not do anything itself. It is pure information wrapped in XML tags.

You must use other means to send, receive or display the data

XML is used by XML technologies to create:

Detailed description to view in a browser

Summary entry to view in a browser

PDF for print

Page 6: Intro XML for archivists (2011)

XML: elements

<language> English </language>

<tag>content</tag>

Page 7: Intro XML for archivists (2011)

XML attributes

Attributes are simple name/value pairs associated with an element

<tag attribute_name="attribute_value">content</tag>

<language …………….. >English</language>

<language langcode="eng">English</language>

<date>20 Sept 2004</date>

<date normal="2004">20 Sept 2004</date>
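
As a rough illustration of how a program sees elements and attributes, the sketch below reads the two attribute examples from this slide with Python's standard xml.etree.ElementTree module. The <record> wrapper element is an assumption added only so the snippet has a single root; it is not part of the slides.

```python
import xml.etree.ElementTree as ET

# The two attribute examples from the slide, wrapped in a hypothetical <record> root.
snippet = """<record>
  <language langcode="eng">English</language>
  <date normal="2004">20 Sept 2004</date>
</record>"""

record = ET.fromstring(snippet)
language = record.find("language")
date = record.find("date")

# .text is the element content; .get() reads an attribute value by name.
print(language.text, language.get("langcode"))
print(date.text, date.get("normal"))
```

The content stays readable for people ("20 Sept 2004") while the attribute carries a normalised value ("2004") that software can sort or search on.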

Page 8: Intro XML for archivists (2011)

XML and Content

XML is essentially about structure. It focuses on what the data is

The structure enables content to be identified by machines so they can process the data

XML is not primarily about content, though there might be some restrictions on content

Page 9: Intro XML for archivists (2011)

Sample Content

Papers of John Ruskin

1864-1888

10 boxes

Held at the University of London Library

Page 10: Intro XML for archivists (2011)

Table

Title Papers of John Ruskin

Dates 1864-1888

Extent 10 boxes

Held At University of London Library

Page 11: Intro XML for archivists (2011)

XML: Structure

<catalog>

<title>Papers of John Ruskin</title>

<date>1864-1888</date>

<extent>10 boxes</extent>

<location>University of London Library</location>

</catalog>
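
One possible way to get from the table on the previous slide to this structure is sketched below, using Python's standard xml.etree.ElementTree module. The tag names follow the slide; the fields list is simply the sample content retyped for illustration.

```python
import xml.etree.ElementTree as ET

# The sample content, paired with the tag names used on the slide.
fields = [
    ("title", "Papers of John Ruskin"),
    ("date", "1864-1888"),
    ("extent", "10 boxes"),
    ("location", "University of London Library"),
]

# <catalog> is the root element; each field becomes a child element.
catalog = ET.Element("catalog")
for tag, value in fields:
    child = ET.SubElement(catalog, tag)
    child.text = value

xml_text = ET.tostring(catalog, encoding="unicode")
print(xml_text)
```

The output is a single line of XML; the line breaks on the slide are only there for human readers.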

Page 12: Intro XML for archivists (2011)

Well-formed XML

a root element is required

<catalog> all content </catalog>

closing tags are required

elements must be properly nested

case must be consistent

attribute values must be in quotation marks
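
The rules above can be tried out with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper name, and each sample deliberately breaks one rule from the list.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "good": "<catalog><title>Papers of John Ruskin</title></catalog>",
    "no closing tag": "<catalog><title>Papers of John Ruskin</catalog>",
    "bad nesting": "<catalog><title><date>1864-1888</title></date></catalog>",
    "inconsistent case": "<catalog><Title>Papers of John Ruskin</title></catalog>",
    "unquoted attribute": "<catalog><date normal=1864>1864-1888</date></catalog>",
    "two root elements": "<catalog/><catalog/>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(name, ok)
```

Only the first sample parses; the parser rejects every other one before any of the data can be used.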

Page 13: Intro XML for archivists (2011)

Create tags for your data

Hands-On

Page 14: Intro XML for archivists (2011)

Valid XML (1)

Valid XML: rules specify elements and attributes and how they are used

Valid XML provides consistency and facilitates the exchange of data

Valid XML is important for displaying, processing and exchanging XML in a wider environment

Page 15: Intro XML for archivists (2011)

Valid XML (2)

Must conform to a Document Type Definition (DTD) or Schema

Archives: Encoded Archival Description - EAD version 1; EAD 2002

e-learning: IEEE Learning Object Metadata Schema (LOM)

Government: Council Roadworks Schema

Page 16: Intro XML for archivists (2011)

DTDs/Schemas

A Document Type Definition or Schema defines the building blocks of an XML document

It specifies elements and attributes and defines how they can be used

People can agree to use a common DTD/schema for interchanging data

Usually point to an external DTD/schema from the XML document

Page 17: Intro XML for archivists (2011)

Schemas

Schemas perform the same task as DTDs

Schemas use XML syntax

Schemas support complex data types

Schemas are extensible

One XML document can point to more than one schema

Page 18: Intro XML for archivists (2011)

A simple XML document

<?xml version="1.0"?>

<note>

<to>Rachel</to>

<from>John</from>

<heading>Reminder</heading>

<body>Don't forget the concert!</body>

</note>

Page 19: Intro XML for archivists (2011)

Example of a simple Schema

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">

  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
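
To give a feel for what the sequence rule in this schema asks for, the sketch below hand-codes the same check in Python. This is not schema validation: Python's standard library has no XML Schema validator, so in practice a dedicated validating tool would be used. The sketch also ignores the target namespace, and follows_note_sequence and EXPECTED_SEQUENCE are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Child elements the schema requires inside <note>, in this order.
EXPECTED_SEQUENCE = ["to", "from", "heading", "body"]

def follows_note_sequence(text: str) -> bool:
    """Rough stand-in for validation: root is <note> and children match the sequence."""
    root = ET.fromstring(text)
    return root.tag == "note" and [child.tag for child in root] == EXPECTED_SEQUENCE

valid_note = """<note>
  <to>Rachel</to><from>John</from>
  <heading>Reminder</heading><body>Don't forget the concert!</body>
</note>"""

# Well-formed, but <heading> is missing, so it breaks the sequence rule.
invalid_note = "<note><to>Rachel</to><from>John</from><body>Hi</body></note>"

print(follows_note_sequence(valid_note))
print(follows_note_sequence(invalid_note))
```

The second note shows the difference between the two ideas: it is well-formed XML, yet it would not be valid against the schema.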

Page 20: Intro XML for archivists (2011)

What about display?

[Diagram: XML file + DTD or Schema → Valid XML → displays labelled "Blue Elephant Papers" and "Blue Elephant Papers Browse List"]

Page 21: Intro XML for archivists (2011)

Displaying XML

XML technologies – for displaying, retrieving, transforming, manipulating

DOM, SAX, XForms, XLink, XPointer

XSL FO – Extensible Stylesheet Language Formatting Objects

XSLT – Extensible Stylesheet Language for Transformations

CSS – a less sophisticated way to display XML

Page 22: Intro XML for archivists (2011)
Page 23: Intro XML for archivists (2011)
Page 24: Intro XML for archivists (2011)

Transformation of XML

Transformation involves the reading in of an XML file and an XSLT file to a processor, which can then generate some output – typically HTML

[Diagram: XML + XSLT → processor → HTML output]
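
The idea of reading XML in and writing HTML out can be sketched without an XSLT processor. The Python below is only a stand-in for one: it hard-codes the display rules that a real XSLT stylesheet would hold. LABELS and catalog_to_html are hypothetical names, and the record is the sample catalog entry from earlier in the slides.

```python
import xml.etree.ElementTree as ET
from html import escape

# Display labels for each tag, taken from the table slide.
LABELS = {"title": "Title", "date": "Dates", "extent": "Extent", "location": "Held At"}

def catalog_to_html(xml_text: str) -> str:
    """Turn a <catalog> record into a simple HTML table, one row per element."""
    catalog = ET.fromstring(xml_text)
    rows = []
    for child in catalog:
        label = LABELS.get(child.tag, child.tag)
        rows.append(f"<tr><th>{escape(label)}</th><td>{escape(child.text or '')}</td></tr>")
    return "<table>" + "".join(rows) + "</table>"

record = """<catalog>
  <title>Papers of John Ruskin</title>
  <date>1864-1888</date>
  <extent>10 boxes</extent>
  <location>University of London Library</location>
</catalog>"""

html_out = catalog_to_html(record)
print(html_out)
```

The XML record is left untouched; changing how it looks only means changing the transformation.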

Page 25: Intro XML for archivists (2011)

HTML vs. XML

HTML is ONLY for display, typically in a Web browser

Browsers display XML but not necessarily as HTML (http://www.w3schools.com/xml/simple.xml)

HTML tags do not describe the content

Data marked up in HTML cannot easily be extracted

Store the data separately as XML files and change the presentation with HTML

Page 26: Intro XML for archivists (2011)

Why use XML?

International standard, supported by the W3C

One of the most common means of transmitting data

XML is open, licence free and platform neutral

XML is human and machine readable

XML documents are text documents: independent of hardware and software

Page 27: Intro XML for archivists (2011)

More reasons to use XML

Separation of content and presentation

With proprietary systems content is inextricably bound up with format

Use XSLT (Extensible Stylesheet Language for Transformations) to present XML data

Flexibility to manipulate and customise

Page 28: Intro XML for archivists (2011)

…and hierarchy

Hierarchical structure

<collection>
  <part>
    <item> One item </item>
  </part>
</collection>
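
Because the hierarchy is explicit in the markup, software can walk it level by level. A minimal sketch using Python's standard xml.etree.ElementTree module follows; outline is a hypothetical helper that indents each tag name by its depth.

```python
import xml.etree.ElementTree as ET

doc = "<collection> <part> <item> One item </item> </part></collection>"

def outline(element, depth=0):
    """Return the tag names in document order, indented two spaces per level."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

lines = outline(ET.fromstring(doc))
print("\n".join(lines))
```

The same walk works however deep the nesting goes, which suits multi-level archival descriptions.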

Page 29: Intro XML for archivists (2011)

…as well as sharing data

XML is the main basis for defining data exchange languages

Meaningful/consistent tags facilitate extraction

Different incompatible systems can access and use the same data

Page 30: Intro XML for archivists (2011)

Summary

XML must be well-formed; valid XML also conforms to a DTD or Schema

DTDs and Schemas provide tags, attributes and rules

XML requires other XML technologies

XSLT can transform XML

XML is simple, flexible and great for data exchange

It is a more efficient route to a sustainable system