Sheet 1XML Technology in E-Commerce 2001 Lecture 3
XML Technology in E-Commerce
Lecture 3
DOM and SAX
Lecture Outline
• General Model for XML Processing;
• Document Object Model (DOM):
  – Logical Model;
  – DOM Interfaces;
  – Example;
• Simple API for XML (SAX):
  – Parser Architecture;
  – Events;
  – Java Classes and Interfaces;
  – Example. Error Handling;
• Summary.
General Model for XML Processing
DOM - Document Object Model
SAX - Simple API for XML
DOM
• DOM defines:
  – A logical model for XML documents;
  – Platform- and language-independent application programming interfaces for manipulating that model;
• DOM allows:
  – accessing document content;
  – modifying document content;
  – creating new documents in memory;
• DOM homepage:
  – http://www.w3.org/DOM/
DOM Logical Model
XML Document:
  <competition>
    <results>
      <name>John Smith</name>
      <name>Derek Warwick</name>
      <name>Mik Douglas</name>
    </results>
    <photos>
      <img src="img1.gif"/>
      <img src="img2.gif"/>
    </photos>
  </competition>
DOM Tree Structure:
  competition
    results
      name: John Smith
      name: D. Warwick
      name: M. Douglas
    photos
      img
      img
An XML document is a set of nodes that form a tree structure. There are different node types: for elements, attributes, text content, etc.
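The tree structure can be made visible with a short program. The following is a minimal sketch, not part of the lecture demo: it assumes the JAXP implementation bundled with a current JDK (rather than JAXP 1.0.1), and the class and method names (DomTreeSketch, describe, dump) are hypothetical. It parses the competition document and lists element, attribute and text nodes with indentation that follows the tree depth.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomTreeSketch {

    // Parses the XML text and returns an indented listing of its DOM nodes.
    static String describe(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        dump(document.getDocumentElement(), 0, out);
        return out.toString();
    }

    // Writes one line per element, attribute or non-blank text node,
    // then recurses into the children one level deeper.
    static void dump(Node node, int depth, StringBuilder out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.append(indent).append("ELEMENT ").append(node.getNodeName()).append('\n');
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append(indent).append("  ATTR ").append(attr.getNodeName())
                   .append('=').append(attr.getNodeValue()).append('\n');
            }
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.append(indent).append("TEXT ").append(text).append('\n');
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            dump(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<competition><results><name>John Smith</name>"
                + "<name>Derek Warwick</name><name>Mik Douglas</name></results>"
                + "<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos></competition>";
        System.out.print(describe(xml));
    }
}
```

Note that the text inside each name element is a separate Text node below the Element node, which is why the demo later replaces a child of the message element rather than a value on the element itself.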
DOM Programming Interfaces
• DOM interfaces are defined in the Interface Definition Language (IDL);
• There are bindings for different languages:
  IDL → Java, JavaScript, C++, Python
DOM Interface Hierarchy
[Figure: the more important interfaces defined in the Java package org.w3c.dom]
DOM Interface Details
• DOM provides two groups of interfaces:
  – Generic: Node, NodeList, NamedNodeMap;
  – Specialized: Node subinterfaces for elements, attributes, text nodes, etc.
• Interfaces:
  – Node - Deitel 8.5, Fig. 8.7, page 201;
  – Document - Deitel 8.5, Fig. 8.5, page 200;
  – Element - Deitel 8.5, Fig. 8.9, page 200;
  – Attr;
  – Text;
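The difference between the two groups can be illustrated by reading the same attribute in two ways. This is a sketch under assumptions: it uses the JAXP implementation of a current JDK, and the class and method names (DomInterfacesSketch, srcViaGeneric, srcViaSpecialized) are made up for illustration. The first method uses only the generic interfaces Node, NodeList and NamedNodeMap; the second casts to the specialized Element and Attr interfaces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomInterfacesSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Generic route: every node is handled as a plain Node.
    static List<String> srcViaGeneric(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList imgs = doc.getElementsByTagName("img");
        for (int i = 0; i < imgs.getLength(); i++) {
            Node img = imgs.item(i);
            NamedNodeMap attrs = img.getAttributes();
            Node src = attrs.getNamedItem("src");
            if (src != null) {
                result.add(src.getNodeValue());
            }
        }
        return result;
    }

    // Specialized route: cast to Element and work with Attr directly.
    static List<String> srcViaSpecialized(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList imgs = doc.getElementsByTagName("img");
        for (int i = 0; i < imgs.getLength(); i++) {
            Element img = (Element) imgs.item(i);
            Attr src = img.getAttributeNode("src");
            if (src != null) {
                result.add(src.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos>");
        System.out.println(srcViaGeneric(doc));
        System.out.println(srcViaSpecialized(doc));
    }
}
```

Both routes return the same values; the specialized interfaces mainly save casts and lookups when the node type is already known.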
DOM Demo
• Demo - Example in Fig. 8.10, Deitel 8.5, page 202;
• Tools:
  – Java 1.2.2;
    • http://java.sun.com/products/jdk/1.2/
  – Java API for XML Processing (JAXP) 1.0.1;
    • http://java.sun.com/xml/archive.html
    • Classes: jaxp.jar and parser.jar;
• Demo files:
  – ReplaceText.java;
  – MyErrorHandler.java;
  – intro.xml;
DOM Demo Explained (1)
• Importing packages:
    import java.io.*;                      // File and FileOutputStream, used later in the demo
    import org.w3c.dom.*;
    import org.xml.sax.*;
    import javax.xml.parsers.*;
    import com.sun.xml.tree.XmlDocument;   // Sun-specific class, used only for saving
• Instantiation of the parser. DOM does not specify how a parser is instantiated, so this is an implementation-specific detail:
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating( true );
    DocumentBuilder builder = factory.newDocumentBuilder();
DOM Demo Explained (2)
• Loading and parsing the XML file:
    Document document = builder.parse( new File( "intro.xml" ) );
• Getting the root element (myMessage):
    Node root = document.getDocumentElement();
• Casting the root to the Element type:
    Element myMessageNode = ( Element ) root;
• Finding the message elements:
    NodeList messageNodes = myMessageNode.getElementsByTagName( "message" );
• Getting the first message element:
    Node message = messageNodes.item( 0 );
DOM Demo Explained (3)
• Creating new text content and replacing the old one:
    Text newText = document.createTextNode( "New Changed Message!!" );
    Text oldText = ( Text ) message.getChildNodes().item( 0 );
    message.replaceChild( newText, oldText );
• Writing the changed document to a new file. DOM does not specify how to save the DOM structure, so this is an implementation-specific detail:
    ( ( XmlDocument ) document ).write( new FileOutputStream( "intro1.xml" ) );
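The steps of the demo can be put together in one small program. The sketch below is not the ReplaceText.java from the book. It makes several assumptions that should be read as such: it runs against the JAXP implementation in a current JDK, it saves the result with the standard javax.xml.transform API instead of the Sun-specific XmlDocument class, it works on an XML string instead of intro.xml, it leaves validation off because the sample string carries no DTD, and the sample content ("Welcome to XML!") and the names ReplaceTextSketch and replaceFirstMessage are invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

public class ReplaceTextSketch {

    // Replaces the text of the first <message> element and returns the serialized document.
    static String replaceFirstMessage(String xml, String replacement) throws Exception {
        // Parser instantiation and parsing, as in the demo (validation left off here).
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        // Root element, then the first message element below it.
        Element root = document.getDocumentElement();
        NodeList messageNodes = root.getElementsByTagName("message");
        Node message = messageNodes.item(0);
        if (message == null) {
            throw new IllegalArgumentException("document contains no <message> element");
        }

        // Create the new text node and swap it for the old first child.
        Text newText = document.createTextNode(replacement);
        Node oldText = message.getFirstChild();
        if (oldText == null) {
            message.appendChild(newText);
        } else {
            message.replaceChild(newText, oldText);
        }

        // Saving: the identity Transformer copies the DOM tree to a character stream.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<myMessage><message>Welcome to XML!</message></myMessage>";
        System.out.println(replaceFirstMessage(xml, "New Changed Message!!"));
    }
}
```

The guard around getFirstChild() covers an empty message element, a case the slide code would fail on with a null old child.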
DOM Specification Levels
• DOM Level 1 (discussed here);
• DOM Level 2:
  – Namespace support;
  – Stylesheets interface;
  – Model for events;
  – Views, Range and Traversal interfaces;
• DOM Level 3 (work in progress):
  – Loading and saving documents;
  – Model for DTD and Schema;
SAX
• SAX - Simple API for XML;
• Developed by the members of the XML-DEV list in 1998;
• SAX is event-based:
  – The parser reports parsing events: start and end of the document, start and end of an element, errors, etc.
  – When an event occurs, the parser invokes a method on an event handler;
  – The application handles the events accordingly;
• SAX home page: http://www.megginson.com/SAX/
SAX Parser Architecture
[Figure: XML Source → SAX Parser → Document Handler, Error Handler, DTD Handler, Entity Resolver (inside the Application)]
DocumentHandler, ErrorHandler, DTDHandler and EntityResolver are interfaces that the Application can implement.
SAX DocumentHandler Interface
• Java package org.xml.sax;
• DocumentHandler interface. The more important methods:
    public abstract void startDocument()
    public abstract void endDocument()
    public abstract void startElement(String name, AttributeList atts)
    public abstract void endElement(String name)
    public abstract void characters(char ch[], int start, int length)
    public abstract void processingInstruction(String target, String data)
SAX Demo
• Demo - Example in Fig. 9.3, Deitel 9.6, page 235;
• Tools:
  – Java 1.2.2;
    • http://java.sun.com/products/jdk/1.2/
  – Java API for XML Processing (JAXP) 1.0.1;
    • http://java.sun.com/xml/archive.html
    • Classes: jaxp.jar and parser.jar;
• Demo files:
  – Tree.java;
  – Sample XML documents;
SAX Demo Explained (1)
• Importing packages:
    import java.io.*;   // File is used when the parser is started
    import org.xml.sax.*;
    import javax.xml.parsers.SAXParserFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;
• Class HandlerBase:
  – Provides default implementations of the four handler interfaces (DocumentHandler, ErrorHandler, DTDHandler, EntityResolver). Applications usually extend it and override some of its methods:
      public class Tree extends HandlerBase
  – The Tree class overrides the methods of the DocumentHandler interface;
  – The handler is registered with the parser before parsing starts;
SAX Demo Explained (2)
• Factory instantiation:
    SAXParserFactory saxFactory = SAXParserFactory.newInstance();
    saxFactory.setValidating( validate );
• Obtaining the parser and starting the parse (the Tree instance is the registered handler):
    SAXParser saxParser = saxFactory.newSAXParser();
    saxParser.parse( new File( args[ 1 ] ), new Tree() );
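The event flow can be traced with a handler that simply records each callback. The sketch below is not the Tree.java from the book, and it swaps in a different base class: it extends the SAX 2.0 class org.xml.sax.helpers.DefaultHandler instead of HandlerBase, because HandlerBase and DocumentHandler are deprecated in current JDKs. The factory and parser steps are the same as above. The class and method names (SaxTreeSketch, trace) are hypothetical, and the input is an XML string rather than a file.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTreeSketch extends DefaultHandler {

    private final StringBuilder out = new StringBuilder();
    private int depth = 0;

    // Appends one indented line to the trace.
    private void line(String text) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(text).append('\n');
    }

    @Override
    public void startDocument() {
        line("startDocument");
    }

    @Override
    public void endDocument() {
        line("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        line("startElement " + qName);
        depth++;
        for (int i = 0; i < atts.getLength(); i++) {
            line("attribute " + atts.getQName(i) + "=" + atts.getValue(i));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
        line("endElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver text in several chunks; each non-blank chunk is logged.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            line("characters \"" + text + "\"");
        }
    }

    // Factory instantiation, parser creation and parsing, as on the slide.
    static String trace(String xml) throws Exception {
        SAXParserFactory saxFactory = SAXParserFactory.newInstance();
        saxFactory.setValidating(false);
        SAXParser saxParser = saxFactory.newSAXParser();
        SaxTreeSketch handler = new SaxTreeSketch();
        saxParser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(trace("<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos>"));
    }
}
```

The handler keeps only a depth counter and a log, which shows that the application, not the parser, decides what state to retain between events.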
SAX Error Handling
• Three error types:
  – Fatal errors: usually violations of well-formedness constraints. The parser must stop processing;
  – Errors: usually violations of validity rules;
  – Warnings: related to the DTD;
• Errors are handled by implementing the ErrorHandler interface;
• The Tree class overrides the default implementation of the methods for warnings and errors;
• The same mechanism is used with DOM parsers;
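A minimal sketch of the three error levels follows, here attached to a DOM parser to show that the mechanism is shared. It is not the MyErrorHandler.java from the book; the class name ErrorHandlingSketch, the check method and the small inline DTDs in main are assumptions made for illustration, and the code targets the JAXP implementation of a current JDK. Warnings and errors are logged and parsing continues, while a fatal error is logged and rethrown so that parsing stops.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class ErrorHandlingSketch implements ErrorHandler {

    final List<String> log = new ArrayList<>();

    private static String describe(String level, SAXParseException e) {
        return level + ": line " + e.getLineNumber() + ": " + e.getMessage();
    }

    @Override
    public void warning(SAXParseException e) {
        log.add(describe("warning", e));
    }

    @Override
    public void error(SAXParseException e) {
        // Validity problem: record it and let the parser continue.
        log.add(describe("error", e));
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        // Well-formedness problem: record it and stop the parse.
        log.add(describe("fatal", e));
        throw e;
    }

    // Parses with validation on and returns everything the handler was told.
    static List<String> check(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        ErrorHandlingSketch handler = new ErrorHandlingSketch();
        builder.setErrorHandler(handler);
        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Already recorded by fatalError; nothing more to do here.
        }
        return handler.log;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (#PCDATA)>]>";
        System.out.println(check(dtd + "<a>ok</a>"));      // valid document
        System.out.println(check(dtd + "<a><b/></a>"));    // violates the DTD
        System.out.println(check(dtd + "<a><b></a>"));     // not well-formed
    }
}
```

With a SAX parser the same handler could be passed as part of the handler object given to parse(), since HandlerBase and DefaultHandler both implement ErrorHandler.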
SAX 2.0
• Main changes:
  – Namespace support;
  – Introduction of a filter mechanism;
  – Interface DocumentHandler is replaced by ContentHandler;
  – New exception classes;
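The effect of namespace support on the handler signature can be sketched as follows. In SAX 2.0, ContentHandler.startElement receives the namespace URI and the local name in addition to the qualified name. The example assumes the JAXP implementation of a current JDK; the class name Sax2NamespaceSketch, the method elementNames and the namespace URI urn:example:orders are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// DefaultHandler implements the SAX 2.0 ContentHandler interface.
public class Sax2NamespaceSketch extends DefaultHandler {

    final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Record each element as {namespace-URI}local-name.
        names.add("{" + uri + "}" + localName);
    }

    static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // off by default in JAXP
        XMLReader reader = factory.newSAXParser().getXMLReader();
        Sax2NamespaceSketch handler = new Sax2NamespaceSketch();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<o:order xmlns:o=\"urn:example:orders\"><o:item/><note/></o:order>";
        System.out.println(elementNames(xml));
    }
}
```

Elements without a namespace are reported with an empty URI, so the prefix used in the document never needs to be inspected by the application.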
SAX and DOM Comparison
• DOM:
  – maintains an internal structure for the document;
  – possibly high memory usage for large documents;
  – enables traversing;
• SAX:
  – doesn’t maintain an internal structure;
  – enables building of a custom structure;
  – low memory usage;
  – usually faster than DOM;
  – traversing is impossible without an internal structure;
• Usually a DOM implementation is built on top of a SAX parser;
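The trade-off can be seen on a small task, counting the elements with a given tag name. This is only a sketch with hypothetical names (CountSketch, countWithDom, countWithSax), written against the JAXP implementation of a current JDK. The DOM version builds the whole tree and then queries it; the SAX version keeps a single counter and never stores the document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CountSketch {

    // Tree-based: the complete document is held in memory, then searched.
    static int countWithDom(String xml, String tag) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return document.getElementsByTagName(tag).getLength();
    }

    // Event-based: only the running count survives between events.
    static int countWithSax(String xml, final String tag) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<results><name>John Smith</name><name>Derek Warwick</name>"
                + "<name>Mik Douglas</name></results>";
        System.out.println(countWithDom(xml, "name"));
        System.out.println(countWithSax(xml, "name"));
    }
}
```

If the application later needed to revisit or modify the name elements, the DOM version could do so directly, while the SAX version would have to parse again or build its own structure during the first pass.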
Summary
• Two approaches for XML processing:
  – Tree-based (DOM);
  – Event-based (SAX);
• Tools:
  – JDK 1.2.2;
  – JAXP 1.0.1 (used in the book);
  – JAXP 1.1 is also available;
  – See also http://xml.apache.org;
Read: Deitel 8, 9
Assignment: Modify the case study in Deitel 8.8. In the new version, the query should be based only on year, month and day (time is excluded). Add new functionality for making a new appointment for a meeting on the found day at a specified time.
For a more detailed explanation and some hints, see the course site.