DSA week 16: XML data and schemas
Post on 20-Dec-2015
DSA week 16 2
News
• Bloglines wall of images – http://www.bloglines.com/about/wallofimages
• XML in 2007 – http://www-128.ibm.com/developerworks/xml/library/x-xml2007predictions.html
DSA week 16 3
Agenda
• Help on Coursework 2
• Data / schema distinction
• Well-formed and valid
• Different Schema languages
• Top-down – Using QSEE to create a Schema
• Bottom-up – Using Trang to generalize
• Guidelines for a good schema
• Linking Multiple documents
DSA week 16 4
Coursework: Icons in KML
• Each Placemark has an icon. By default it is the pin symbol, but it's easy to change this
• The full set of Google Earth supplied icons is here
• Define one or more styles in the Folder
• Link to one of these styles from the Placemark to be styled
DSA week 16 5
Example KML with icon
<Folder>
  …
  <Style id="home">
    <IconStyle>
      <Icon>
        <href>http://maps.google.com/mapfiles/kml/pal3/icon21.png</href>
      </Icon>
    </IconStyle>
  </Style>
  …
  <Placemark>
    ….
    <styleUrl>#home</styleUrl>
  </Placemark>
</Folder>
DSA week 16 6
[Diagram: the data / schema distinction. Boxes: Conceptual Model (ERM); Schema (SQL DDL; DTD, XML Schema); Data (RDBMS table; XML document). Arrow labels: realises, defines, defines, infer.]
DSA week 16 7
Well-formed and Valid
• Well-formed
  – Applies to all XML documents
  – Tag naming, nesting, attributes and values
  – An XML-aware programme should be able to read any well-formed XML
• Valid
  – The document conforms to a given definition of its structure
    • The names of tags
    • How tags are nested
    • What kinds of values are allowable
    • What multiplicities of children are allowed
  – A document may conform to any number of different schemas
  – Validity is not an absolute property
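The difference between the two properties can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not a real schema validator: `is_valid_room` is a hypothetical, hand-written rule standing in for a schema, and the sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML at all."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_valid_room(text: str) -> bool:
    """Hypothetical validity rule (stands in for a schema):
    the root is <room>, it has an id attribute, and exactly one <area> child."""
    if not is_well_formed(text):
        return False
    root = ET.fromstring(text)
    return (
        root.tag == "room"
        and "id" in root.attrib
        and len(root.findall("area")) == 1
    )


good = '<room id="2P2"><area shape="rect" coords="118,39,138,68"/></room>'
unclosed = '<room id="2P2"><area shape="rect"></room>'   # <area> is never closed
other = '<person><name>Ali Jack</name></person>'         # well-formed, different structure

for sample in (good, unclosed, other):
    print(is_well_formed(sample), is_valid_room(sample))
```

The third sample shows why validity is not absolute: it is well-formed and could be valid against a schema for people, but it is not valid against the room rule.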
DSA week 16 8
Map layout
• Top-down
  – Draw XML model
  – Generate schema in a schema language
• Bottom-up
  – Create some data following an informal pattern or prototype
  – Use trang (Jim Clark) to infer the schema
DSA week 16 9
<MapSet>
  <Map id="P2" desc="P Block level 2">
    <room id="2P2">
      <area shape="rect" coords="118,39,138,68"/>
      <type>Staff Room</type>
      <occupant>Tony Solomonides</occupant>
    </room>
    <room id="2P3">
      <area shape="rect" coords="141,40,162,69"/>
      <type>Staff Room</type>
      <occupant>Richard Lawson</occupant>
    </room>
    <room id="2P4">
      <area shape="poly" coords="201,40,234,40,234,118,164,119,163,71,200,71"/>
      <type>Office</type>
      <occupant>Eleanor Gibbons</occupant>
      <occupant>Dee Evans</occupant>
      <occupant>Ali Jack</occupant>
    </room>
DSA week 16 10
Using QSEE to create the schema
• Start QSEE
• Select XML data model
• Name the root entity and then add children
• When a node is a text node (a leaf) define the data type.
• Add attributes
• Multiplicities are added to the parent-child relationship
• Save the project
• Generate the XML schema or DTD
DSA week 16 12
Languages for defining schemas
• DTD
• XML Schema (.xsd)
• RelaxNG – XML syntax
• RelaxNG – compact syntax
• Schematron
DSA week 16 13
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="MapSet">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Map"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Map">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="room"/>
      </xs:sequence>
      <xs:attribute name="desc" use="required"/>
      <xs:attribute name="id" use="required" type="xs:NCName"/>
    </xs:complexType>
  …
DSA week 16 14
RelaxNG
default namespace = ""

start =
  element MapSet {
    element Map {
      attribute desc { text },
      attribute id { xsd:NCName },
      element room {
        attribute id { xsd:NMTOKEN },
        element area {
          attribute coords { text },
          attribute shape { xsd:NCName }
        },
        (element type { text },
         element occupant { text }+)?
      }+
    }
  }
DSA week 16 15
Schema by induction
• It is possible to deduce some of the structure of a file by induction
  – If all the children of a node have the same tag name, and the same (or compatible) structure, then they are instances of the same element.
• InfoPath (which creates data entry forms) can induce the structure from an example file
• Trang is an open source programme written in Java by James Clark
  – http://www.thaiopensource.com/relaxng/trang.html
• Built into XML IDEs e.g. oXygen
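To give a feel for what induction involves, here is a minimal Python sketch that looks at every instance of each element in an example file and guesses the multiplicity of each child. It is not Trang's algorithm and it ignores attributes, ordering and data types; the function name `infer_children` and the cut-down sample are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Cut-down example data (assumed for this sketch)
SAMPLE = """
<Map id="P2">
  <room id="2P2"><area/><type>Staff Room</type><occupant>Tony Solomonides</occupant></room>
  <room id="2P4"><area/><type>Office</type><occupant>Eleanor Gibbons</occupant><occupant>Ali Jack</occupant></room>
  <room id="2P6"><area/></room>
</Map>
""".strip()


def infer_children(root):
    """For each element name, guess the multiplicity of each child name:
    "" = exactly one, "?" = optional, "+" = one or more, "*" = zero or more."""
    per_tag = {}  # element name -> one Counter of child names per instance
    for elem in root.iter():
        per_tag.setdefault(elem.tag, []).append(Counter(child.tag for child in elem))

    schema = {}
    for tag, instances in per_tag.items():
        child_names = []
        for counts in instances:
            for name in counts:
                if name not in child_names:
                    child_names.append(name)
        rules = {}
        for name in child_names:
            low = min(counts[name] for counts in instances)   # Counter gives 0 if absent
            high = max(counts[name] for counts in instances)
            if low == 0:
                rules[name] = "?" if high == 1 else "*"
            else:
                rules[name] = "" if high == 1 else "+"
        schema[tag] = rules
    return schema


schema = infer_children(ET.fromstring(SAMPLE))
for tag, rules in schema.items():
    print(tag, rules)
```

The result is only as general as the examples: a schema induced from one file may reject a later file that uses a structure the first one never showed.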
DSA week 16 16
Designing a Schema
• My advice for this coursework:
  – Use elements only, not attributes, to simplify the design
  – Common text values are indicative of a simple enumerated property
  – Create meaningful names for the tags
  – Names don’t have to be unique in the hierarchy, only within the parent node
• If there is redundancy in the data, consider normalising into several files (but you will need to work out how to link the files in PHP)
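One way to spot a candidate enumerated property is to count how often each text value of a tag occurs. The sketch below does this in Python for the room types; the helper name `value_counts` and the cut-down data are assumptions for the illustration, and deciding whether a repeated value really is an enumeration remains a design judgement.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Cut-down room data (assumed for this sketch)
ROOMS = """<Map id="P2" desc="P Block level 2">
  <room id="2P2"><type>Staff Room</type></room>
  <room id="2P3"><type>Staff Room</type></room>
  <room id="2P4"><type>Office</type></room>
  <room id="2P5"><type>Staff Room</type></room>
</Map>"""


def value_counts(root, tag):
    """Count the distinct text values found in every element called `tag`."""
    return Counter(
        elem.text.strip()
        for elem in root.iter(tag)
        if elem.text and elem.text.strip()
    )


type_counts = value_counts(ET.fromstring(ROOMS), "type")
enumeration = sorted(type_counts)   # the distinct values, as a candidate enumeration

print(type_counts)
print(enumeration)
```

A small set of values that repeats across many elements suggests an enumerated type in the schema rather than free text.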
DSA week 16 17
Multiple XML documents
• Simple XML models have a single hierarchy
  – OK if children are contained within parent – composition relationship
• More generally, links from one node in a document to other nodes in the same or other documents are required
  – Emp to dept
  – Timetable to Bus stop
• Need to be able to represent associations as well
• Can use any element or attribute as primary key and foreign key
• Links can be made between any atomic items (as in SQL)
• Relationships can be defined and constrained, but it's not easy
DSA week 16 19
<MapSet>
  <Map id="P2" desc="P Block level 2">
    <room id="2P2">
      <area shape="rect" coords="118,39,138,68"/>
      <type>Staff Room</type>
    </room>
    <room id="2P3">
      <area shape="rect" coords="141,40,162,69"/>
      <type>Staff Room</type>
    </room>
    <room id="2P4">
      <area shape="poly" coords="201,40,234,40,234,118,164,119,163,71,200,71"/>
      <type>Office</type>
    </room>
    <room id="2P5">
      <area shape="rect" coords="165,40,199,68"/>
      <type>Staff Room</type>
    </room>
    <room id="2P6">
      <area shape="rect" coords="236,83,308,121"/>
    </room>
    <room id="2P7">
      <area shape="rect" coords="236,39,270,68"/>
    </room>
  </Map>
</MapSet>
DSA week 16 20
<People>
  <Person>
    <name>Tony Solomonides</name>
    <ext>81111</ext>
    <room>2P2</room>
  </Person>
  <Person>
    <name>Eleanor Gibbons</name>
    <ext>85555</ext>
    <room>2P4</room>
  </Person>
  <Person>
    <name>Dee Allan</name>
    <ext>85566</ext>
    <room>2P4</room>
  </Person>
  <Person>
    <name>Ali Jack</name>
    <ext>85566</ext>
    <room>2P4</room>
  </Person>
</People>
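The two documents are linked by value: the text of a Person's room element acts as a foreign key to the id attribute of a room in the map document. A minimal Python sketch of following that link in both directions is shown here. The coursework itself uses PHP, so this only illustrates the idea; the function names and the cut-down data are assumptions.

```python
import xml.etree.ElementTree as ET

# Cut-down versions of the two documents (assumed for this sketch)
maps = ET.fromstring("""<MapSet><Map id="P2" desc="P Block level 2">
  <room id="2P2"><area shape="rect" coords="118,39,138,68"/><type>Staff Room</type></room>
  <room id="2P4"><area shape="poly" coords="201,40,234,40,234,118,164,119,163,71,200,71"/><type>Office</type></room>
</Map></MapSet>""")

people = ET.fromstring("""<People>
  <Person><name>Eleanor Gibbons</name><ext>85555</ext><room>2P4</room></Person>
  <Person><name>Ali Jack</name><ext>85566</ext><room>2P4</room></Person>
</People>""")

# "Primary key" index: room id -> room element
rooms_by_id = {room.get("id"): room for room in maps.iter("room")}


def occupants_of(room_id):
    """Follow the link from a room to the people who reference it."""
    return [
        person.findtext("name")
        for person in people.iter("Person")
        if person.findtext("room") == room_id
    ]


def room_type_for(person_name):
    """Follow the link from a person to their room and return its type."""
    for person in people.iter("Person"):
        if person.findtext("name") == person_name:
            room = rooms_by_id.get(person.findtext("room"))
            return room.findtext("type") if room is not None else None
    return None


print(occupants_of("2P4"))
print(room_type_for("Ali Jack"))
```

Nothing in either document declares this relationship; it exists only because the application code chooses to match the two values.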
DSA week 16 21
What problems?
• Could remove a room, or change its id when it is referenced by a person
• RDBMS can detect this and can take action – e.g. forbid the change
• XML is like the web – no protection
• Some advances in this area
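Because XML itself will not stop a referenced room from being removed, the application has to check for dangling references. A minimal Python sketch of such a check follows; `dangling_references` is a hypothetical helper and the data is cut down for the example.

```python
import xml.etree.ElementTree as ET

# Cut-down versions of the two documents (assumed for this sketch)
maps = ET.fromstring(
    '<MapSet><Map id="P2" desc="P Block level 2">'
    '<room id="2P2"><type>Staff Room</type></room>'
    '<room id="2P4"><type>Office</type></room>'
    '</Map></MapSet>'
)
people = ET.fromstring(
    '<People>'
    '<Person><name>Tony Solomonides</name><ext>81111</ext><room>2P2</room></Person>'
    '<Person><name>Eleanor Gibbons</name><ext>85555</ext><room>2P4</room></Person>'
    '</People>'
)


def dangling_references(maps_root, people_root):
    """Return (name, room id) for every Person whose room id is not in the map."""
    known_ids = {room.get("id") for room in maps_root.iter("room")}
    return [
        (person.findtext("name"), person.findtext("room"))
        for person in people_root.iter("Person")
        if person.findtext("room") not in known_ids
    ]


before = dangling_references(maps, people)

# Remove room 2P4: nothing in XML prevents this, even though a Person refers to it
map_elem = maps.find("Map")
map_elem.remove(map_elem.find("room[@id='2P4']"))

after = dangling_references(maps, people)
print(before)
print(after)
```

An RDBMS with a foreign key constraint would reject the deletion; here the broken link is only found if the application remembers to look for it.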