XML Data and Schemas – DSA Week 16

Post on 20-Dec-2015


DSA week 16 1

XML Data and Schemas

DSA

Week 16

DSA week 16 2

News

• Bloglines wall of images – http://www.bloglines.com/about/wallofimages

• XML in 2007 – http://www-128.ibm.com/developerworks/xml/library/x-xml2007predictions.html

DSA week 16 3

Agenda

• Help on Coursework 2
• Data / schema distinction
• Well-formed and valid
• Different schema languages
• Top-down – using QSEE to create a schema
• Bottom-up – using Trang to generalize
• Guidelines for a good schema
• Linking multiple documents

DSA week 16 4

Coursework: Icons in KML

• Each Placemark has an icon – by default it is the pin symbol, but it's easy to change this

• The full set of Google Earth supplied icons is here

• Define one or more styles in the Folder

• Link to one of these styles from the Placemark to be styled

DSA week 16 5

Example KML with icon

<Folder>
  …
  <Style id="home">
    <IconStyle>
      <Icon>
        <href>http://maps.google.com/mapfiles/kml/pal3/icon21.png</href>
      </Icon>
    </IconStyle>
  </Style>
  …
  <Placemark>
    ….
    <styleUrl>#home</styleUrl>
  </Placemark>
</Folder>
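The link from a Placemark to its Style can be followed programmatically. The sketch below uses Python's standard xml.etree.ElementTree and is only illustrative: the placemark name "My house" and the helper name icon_for are hypothetical, and the KML namespace is left out for brevity (a real KML file declares one on its root element).

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free KML fragment mirroring the slide's example.
KML = """
<Folder>
  <Style id="home">
    <IconStyle>
      <Icon><href>http://maps.google.com/mapfiles/kml/pal3/icon21.png</href></Icon>
    </IconStyle>
  </Style>
  <Placemark>
    <name>My house</name>
    <styleUrl>#home</styleUrl>
  </Placemark>
</Folder>
"""

def icon_for(folder, placemark):
    """Follow a Placemark's styleUrl to the icon href of the matching Style."""
    style_url = placemark.findtext("styleUrl")
    if not style_url or not style_url.startswith("#"):
        return None  # no local style reference
    style_id = style_url[1:]  # drop the leading '#'
    for style in folder.findall("Style"):
        if style.get("id") == style_id:
            return style.findtext("IconStyle/Icon/href")
    return None  # the styleUrl points at a style that is not defined

folder = ET.fromstring(KML)
placemark = folder.find("Placemark")
print(icon_for(folder, placemark))
```

The "#home" value works like a fragment identifier: the part after "#" must match the id attribute of a Style defined in the same document.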

DSA week 16 6

[Diagram relating data, schema and conceptual model. Data: RDBMS table, XML document. Schema: SQL DDL; DTD, XML Schema. Conceptual model: ERM. Arrows: defines, realises, infer.]

DSA week 16 7

Well-formed and Valid

• Well-formed
  – Applies to all XML documents
  – Tag naming, nesting, attributes and values
  – An XML-aware programme should be able to read any well-formed XML

• Valid
  – The document conforms to a given definition of its structure
    • The names of tags
    • How tags are nested
    • What kinds of values are allowable
    • What multiplicities of children are allowed
  – A document may conform to any number of different schemas
  – Validity is not an absolute property
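Well-formedness can be tested with any XML parser, without reference to a schema. A minimal sketch in Python, using the standard xml.etree.ElementTree; the function name is_well_formed and the sample strings are made up for illustration:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = '<room id="2P2"><type>Staff Room</type></room>'
bad_nesting = '<room id="2P2"><type>Staff Room</room></type>'   # tags overlap
bad_attribute = '<room id=2P2><type>Staff Room</type></room>'   # value not quoted

for sample in (good, bad_nesting, bad_attribute):
    print(is_well_formed(sample), sample)
```

This says nothing about validity: the parser accepts any tag names and any nesting as long as the syntax is right. Checking validity needs a schema and a validating tool, which this standard-library module does not provide.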

DSA week 16 8

Map layout

• Top-down
  – Draw XML model
  – Generate schema in a schema language

• Bottom-up
  – Create some data following an informal pattern or prototype
  – Use Trang (James Clark) to infer the schema

DSA week 16 9

<MapSet>
  <Map id="P2" desc="P Block level 2">
    <room id="2P2">
      <area shape="rect" coords="118,39,138,68"/>
      <type>Staff Room</type>
      <occupant>Tony Solomonides</occupant>
    </room>
    <room id="2P3">
      <area shape="rect" coords="141,40,162,69"/>
      <type>Staff Room</type>
      <occupant>Richard Lawson</occupant>
    </room>
    <room id="2P4">
      <area shape="poly" coords="201,40,234,40,234,118,164,119,163,71,200,71"/>
      <type>Office</type>
      <occupant>Eleanor Gibbons</occupant>
      <occupant>Dee Evans</occupant>
      <occupant>Ali Jack</occupant>
    </room>

DSA week 16 10

Using QSEE to create the schema

• Start QSEE
• Select XML data model
• Name the root entity and then add children
• When a node is a text node (a leaf), define the data type
• Add attributes
• Multiplicities are added to the parent-child relationship
• Save the project
• Generate the XML schema or DTD

DSA week 16 11

DSA week 16 12

Languages for defining schemas

• DTD

• XML Schema (.xsd)

• RelaxNG – XML syntax

• RelaxNG – compact syntax

• Schematron

DSA week 16 13

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="MapSet">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Map"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Map">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="room"/>
      </xs:sequence>
      <xs:attribute name="desc" use="required"/>
      <xs:attribute name="id" use="required" type="xs:NCName"/>
    </xs:complexType>
  …
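To make the rules in the visible part of this schema concrete, here is a rough hand-written check in Python. It is not a schema validator and does not read the .xsd file; the function name check_mapset is hypothetical, and only the rules shown above are tested: MapSet contains exactly one Map, Map requires desc and id attributes, and Map contains one or more room elements. The NCName type of id and the elided definition of room are not checked.

```python
import xml.etree.ElementTree as ET

def check_mapset(text):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    root = ET.fromstring(text)
    if root.tag != "MapSet":
        return ["root element is %s, expected MapSet" % root.tag]
    children = list(root)
    # <xs:element ref="Map"/> has default minOccurs=1 and maxOccurs=1.
    if len(children) != 1 or children[0].tag != "Map":
        problems.append("MapSet must contain exactly one Map")
    for map_el in root.findall("Map"):
        # Both attributes are declared with use="required".
        for attr in ("desc", "id"):
            if attr not in map_el.attrib:
                problems.append("Map is missing required attribute " + attr)
        # maxOccurs="unbounded" with default minOccurs=1: one or more rooms.
        rooms = list(map_el)
        if not rooms or any(child.tag != "room" for child in rooms):
            problems.append("Map must contain one or more room elements only")
    return problems

valid = '<MapSet><Map id="P2" desc="P Block level 2"><room id="2P2"/></Map></MapSet>'
invalid = '<MapSet><Map id="P2"/></MapSet>'

print(check_mapset(valid))
print(check_mapset(invalid))
```

In practice a schema-aware validator does this work from the schema itself; the point of the sketch is only to show what each declaration constrains.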

DSA week 16 14

RelaxNG

default namespace = ""

start =
  element MapSet {
    element Map {
      attribute desc { text },
      attribute id { xsd:NCName },
      element room {
        attribute id { xsd:NMTOKEN },
        element area {
          attribute coords { text },
          attribute shape { xsd:NCName }
        },
        (element type { text },
         element occupant { text }+)?
      }+
    }
  }

DSA week 16 15

Schema by induction

• It is possible to infer some of the structure of a file by induction
  – If all the children of a node have the same tag name, and the same (or compatible) structure, then they are instances of the same element.

• InfoPath (which creates data entry forms) can induce the structure from an example file

• Trang is an open source programme written in Java by James Clark
  – http://www.thaiopensource.com/relaxng/trang.html

• Built into XML IDEs e.g. oXygen
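The idea of induction can be illustrated with a toy sketch in Python. This is not Trang's algorithm and its output is much cruder: it only looks at how often each child tag occurs under each parent tag and reports a multiplicity marker ('' for exactly one, '?' for optional, '+' for one or more, '*' for zero or more). The function name infer_multiplicities and the cut-down sample data are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

SAMPLE = """
<MapSet>
  <Map id="P2" desc="P Block level 2">
    <room id="2P2"><type>Staff Room</type><occupant>Tony Solomonides</occupant></room>
    <room id="2P4"><type>Office</type><occupant>Eleanor Gibbons</occupant><occupant>Ali Jack</occupant></room>
    <room id="2P6"/>
  </Map>
</MapSet>
"""

def infer_multiplicities(root):
    """Map (parent tag, child tag) to an occurrence marker: '', '?', '+' or '*'."""
    counts = defaultdict(list)   # (parent, child) -> child count per parent instance
    instances = Counter()        # parent tag -> number of instances seen
    for parent in root.iter():
        instances[parent.tag] += 1
        per_parent = Counter(child.tag for child in parent)
        for child_tag, n in per_parent.items():
            counts[(parent.tag, child_tag)].append(n)
    markers = {
        (False, False): "", (True, False): "?",
        (False, True): "+", (True, True): "*",
    }
    result = {}
    for (parent_tag, child_tag), seen in counts.items():
        optional = len(seen) < instances[parent_tag]   # absent from some parents
        repeated = max(seen) > 1                       # more than one under some parent
        result[(parent_tag, child_tag)] = markers[(optional, repeated)]
    return result

inferred = infer_multiplicities(ET.fromstring(SAMPLE))
for (parent, child), marker in sorted(inferred.items()):
    print("%s -> %s%s" % (parent, child, marker))
```

The result is only as good as the sample: a multiplicity that never shows up in the example data cannot be inferred, which is why an induced schema usually needs reviewing by hand.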

DSA week 16 16

Designing a Schema

• My advice for this coursework:
  – Use elements only, not attributes, to simplify the design
  – Common text values are indicative of a simple enumerated property
  – Create meaningful names for the tags
  – Names don’t have to be unique in the hierarchy, only within the parent node

• If there is redundancy in the data, consider normalising into several files (but you will need to work out how to link the files in PHP)
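One way to spot a candidate enumerated property is to count how often each text value occurs for a given tag. A small sketch in Python, assuming a cut-down version of the room data; the helper name value_counts is hypothetical:

```python
import xml.etree.ElementTree as ET
from collections import Counter

SAMPLE = """
<Map id="P2" desc="P Block level 2">
  <room id="2P2"><type>Staff Room</type></room>
  <room id="2P3"><type>Staff Room</type></room>
  <room id="2P4"><type>Office</type></room>
  <room id="2P5"><type>Staff Room</type></room>
</Map>
"""

def value_counts(root, tag):
    """Count how often each text value occurs in elements named tag."""
    return Counter((el.text or "").strip() for el in root.iter(tag))

counts = value_counts(ET.fromstring(SAMPLE), "type")
print(counts.most_common())
```

A tag with a handful of distinct values repeated many times, such as type here, is a reasonable candidate for an enumeration in the schema; a tag whose values are nearly all different, such as a person's name, is not.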

DSA week 16 17

Multiple XML documents

• Simple XML models have a single hierarchy – OK if children are contained within the parent (a composition relationship)

• More generally, links from one node in a document to other nodes in the same or other documents are required
  – Emp to dept
  – Timetable to Bus stop

• Need to be able to represent associations as well
• Can use any element or attribute as primary key and foreign key
• Links can be made between any atomic items (as in SQL)
• Relationships can be defined and constrained, but it's not easy

DSA week 16 18

Rooms and People

• Normalise the room data to:

[Diagram: Room and Person entities; the multiplicities of the relationship between them are shown as "?"]

DSA week 16 19

<MapSet>
  <Map id="P2" desc="P Block level 2">
    <room id="2P2">
      <area shape="rect" coords="118,39,138,68"/>
      <type>Staff Room</type>
    </room>
    <room id="2P3">
      <area shape="rect" coords="141,40,162,69"/>
      <type>Staff Room</type>
    </room>
    <room id="2P4">
      <area shape="poly" coords="201,40,234,40,234,118,164,119,163,71,200,71"/>
      <type>Office</type>
    </room>
    <room id="2P5">
      <area shape="rect" coords="165,40,199,68"/>
      <type>Staff Room</type>
    </room>
    <room id="2P6">
      <area shape="rect" coords="236,83,308,121"/>
    </room>
    <room id="2P7">
      <area shape="rect" coords="236,39,270,68"/>
    </room>
  </Map>
</MapSet>

DSA week 16 20

<People>
  <Person>
    <name>Tony Solomonides</name>
    <ext>81111</ext>
    <room>2P2</room>
  </Person>
  <Person>
    <name>Eleanor Gibbons</name>
    <ext>85555</ext>
    <room>2P4</room>
  </Person>
  <Person>
    <name>Dee Allan</name>
    <ext>85566</ext>
    <room>2P4</room>
  </Person>
  <Person>
    <name>Ali Jack</name>
    <ext>85566</ext>
    <room>2P4</room>
  </Person>
</People>
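With the data split into two documents, the room element of each Person acts as a foreign key to the id attribute of a room. The coursework does this linking in PHP; the sketch below shows the same join in Python purely for illustration, on cut-down copies of the two documents. The function name occupants_by_room is hypothetical.

```python
import xml.etree.ElementTree as ET

ROOMS = """
<MapSet>
  <Map id="P2" desc="P Block level 2">
    <room id="2P2"><type>Staff Room</type></room>
    <room id="2P4"><type>Office</type></room>
  </Map>
</MapSet>
"""

PEOPLE = """
<People>
  <Person><name>Eleanor Gibbons</name><ext>85555</ext><room>2P4</room></Person>
  <Person><name>Ali Jack</name><ext>85566</ext><room>2P4</room></Person>
</People>
"""

def occupants_by_room(rooms_root, people_root):
    """Join Person/room (foreign key) to room/@id (primary key)."""
    # Start with every known room so that empty rooms still appear.
    result = {room.get("id"): [] for room in rooms_root.iter("room")}
    for person in people_root.iter("Person"):
        room_id = person.findtext("room")
        if room_id in result:
            result[room_id].append(person.findtext("name"))
    return result

joined = occupants_by_room(ET.fromstring(ROOMS), ET.fromstring(PEOPLE))
print(joined)
```

Nothing in either document enforces the link: the join simply skips a Person whose room value matches no room id.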

DSA week 16 21

What problems?

• Could remove a room, or change its id when it is referenced by a person

• RDBMS can detect this and can take action – e.g. forbid the change

• XML is like the web – no protection

• Some advances in this area
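Because nothing stops a room from being removed or renamed, an application has to look for broken links itself. A minimal sketch in Python, assuming the two-document layout above; the room id 2P9 is made up to simulate a reference to a room that no longer exists, and the function name dangling_references is hypothetical.

```python
import xml.etree.ElementTree as ET

ROOMS = '<MapSet><Map id="P2" desc="P Block level 2"><room id="2P2"/><room id="2P4"/></Map></MapSet>'

PEOPLE = """
<People>
  <Person><name>Eleanor Gibbons</name><room>2P4</room></Person>
  <Person><name>Ali Jack</name><room>2P9</room></Person>
</People>
"""

def dangling_references(rooms_root, people_root):
    """Return (name, room id) pairs whose room id matches no room in the map."""
    known = {room.get("id") for room in rooms_root.iter("room")}
    return [
        (person.findtext("name"), person.findtext("room"))
        for person in people_root.iter("Person")
        if person.findtext("room") not in known
    ]

broken = dangling_references(ET.fromstring(ROOMS), ET.fromstring(PEOPLE))
print(broken)
```

This only detects the problem after the fact; unlike an RDBMS foreign key constraint, it cannot forbid the change that caused it.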

DSA week 16 22

Workshop

• To create a simple XML schema for whisky distilleries

• Use QSEE to create the model.

• Create a couple of entries

• Generate the kml