1 dr alexiei dingli xml technologies xml advanced
TRANSCRIPT
2
• A Uniform Resource Identifier (URI) is a string of characters which identifies an Internet Resource
• The most common URI is the Uniform Resource Locator (URL) which identifies an Internet domain address
• Another, not so common type of URI is the Universal Resource Name (URN)– urn:isbn:0451450523
URIs
3
• Since XML element names can be defined by anyone, this often results in a conflict
• Such as …
– How would you add XML related to furniture (tables and chairs) to an HTML document?
Conflicts !!!
4
More conflicts !!!
HTML
<table>
<tr>
<td>Apples</td> <td>Bananas</td>
</tr>
</table>
XML
<table>
<name>Coffee Table</name> <width>8</width> <length>10</length>
</table>
5
• If these XML fragments were added together, there would be a name conflict
• Both contain a <table> element, but the elements have different content and meaning
• An XML parser will not know how to handle these differences
Even more conflicts !!!
6
• Name conflicts in XML can easily be avoided using a name prefix
• This XML carries information about an HTML table, and a piece of furniture:
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>Coffee Table</f:name>
<f:width>8</f:width>
<f:length>10</f:length>
</f:table>
• In the example above, there will be no conflict because the two <table> elements have different names.
Solving conflicts ...
7
• The namespace is defined by the xmlns attribute in the start tag of an element
• The namespace declaration has the following syntax
xmlns:prefix="URI"
XML Namespaces
8
<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture"> <f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
Namespaces Ex 1
9
<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
Namespaces Ex 2
10
• The xmlns attribute in the <table> tag gave the h: and f: prefixes a qualified namespace
• When a namespace is defined for an element, all child elements with the same prefix are associated with the same namespace
• Namespaces can be declared in the elements where they are used or in the XML root element
• The namespace URI is not used by the parser to look up information
• The purpose is to give the namespace a unique name, however, often companies use the namespace as a pointer to a web page containing namespace information
More on Namespaces
11
• Defining a default namespace for an element saves us from using prefixes in all the child elements
• <table xmlns="http://www.w3schools.com/furniture"> – <name>African Coffee Table</name> – <width>80</width> – <length>120</length>
• </table>
Default Namespaces
13
• XML parsers parse all text in a document• When an XML element is parsed, the text between the XML tags is
also parsed:– <message>This text is also parsed</message>
• The parser does this because XML elements can contain other elements:
– <name><first>Bill</first><last>Gates</last></name>
• The parser will break it up into sub-elements like this:– <name>
• <first>Bill</first> • <last>Gates</last>
– </name>
• Parsed Character Data (PCDATA) is a term used about text data that will be parsed by the XML parser.
Parsed Character Data
14
• The term CDATA is used about text data that should not be parsed by the XML parser
• Characters like "<" and "&" are illegal in XML elements
• "<" will generate an error because the parser interprets it as the start of a new element
• "&" will generate an error because the parser interprets it as the start of an character entity
• Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors script code can be defined as CDATA
• Everything inside a CDATA section is ignored by the parser
• A CDATA section starts with "<![CDATA[" and ends with "]]>"
Un-Parsed Character Data (1)
16
• A CDATA section cannot contain the string "]]>"
• Nested CDATA sections are not allowed
• The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks
Un-Parsed Character Data (3)
17
• XML documents can contain non ASCII characters, like French ê è é.
• To avoid errors, specify the XML encoding
• Two encoding errors:– An invalid character was found in text content.– Switch from current encoding to specified encoding not
supported.
XML Encoding
18
<?xml version="1.0" encoding="windows-1252"?>
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="UTF-16"?>
Different encodings ...
19
• DOM (Document Object Model) defines a standard way for accessing and manipulating documents
• Consider XML documents as a tree-structure
• All elements can be accessed through the DOM tree
• Their content (text and attributes) can be modified or deleted, and new elements can be created
• The elements, their text, and their attributes are all known as nodes
DOM
20
document.getElementById("to").innerHTML=""
• document - the HTML document• getElementById("to") - the HTML element
where id="to"• innerHTML - the inner text of the HTML element
HTML DOM example
21
xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue
• xmlDoc - the XML document created by the parser.
• getElementsByTagName("to")[0] - the first <to> element
• childNodes[0] - the first child of the <to> element (the text node)
• nodeValue - the value of the node (the text itself)
XML DOM example
22
x=xmlDoc.getElementsByTagName("title")[0].childNodes[0];
Retrieves the text value of the first <title> element
txt=xmlDoc.getElementsByTagName("title")[0].getAttribute("lang");
retrieves the text value of the "lang" attribute of the first <title> element
Advanced DOM examples
23
x=xmlDoc.getElementsByTagName("book");
for(i=0;i<x.length;i++) {
x[i].setAttribute("edition","first");
}
Change value of attributes
24
newel=xmlDoc.createElement("edition");
newtext=xmlDoc.createTextNode("First");
newel.appendChild(newtext);
x=xmlDoc.getElementsByTagName("book");
x[0].appendChild(newel);
Show using trees ...
What is happening here ... ?
27
• We check the browser, and load the XML using the correct parser
• We create an HTML table with <table border="1">
• We use getElementsByTagName() to get all XML CD nodes
• For each CD node, we display data from ARTIST and TITLE as table data.
• We end the table with </table>
XML to HTML explanation
28
• Provides a way to communicate with a server after a web page has loaded
• Allows website to – Update a web page with new data without reloading the page– Request data from a server after the page has loaded – Receive data from a server after the page has loaded– Send data to a server in the background
XMLHttpRequest Object
29
function loadXMLDoc(url) {
xmlhttp=new XMLHttpRequest();
xmlhttp.onreadystatechange=state_Change;
xmlhttp.open("GET",url,true);
xmlhttp.send();
}
function state_Change() {
...
x=xmlhttp.responseXML.documentElement.getElementsByTagName("CD");
...
}
<body>
<button onclick="loadXMLDoc('cd_catalog.xml')">Get CD info</button>
</body>
Using XMLHttpRequest
30
xmlhttp.open("GET",url,true);
• Our examples use "true" in the third parameter of open()• This parameter specifies whether the request should be handled
asynchronously• True means that the script continues to run after the send() method,
without waiting for a response from the server• The onreadystatechange event complicates the code. But it is the
safest way if you want to prevent the code from stopping if you don't get a response from the server.
• By setting the parameter to "false", your can avoid the extra onreadystatechange code. Use this if it's not important to execute the rest of the code if the request fails.
XMLHttpRequest Async