xml internet engineering spring 2015 bahador bakhshi ce & it department, amirkabir university of...

Upload: kelley-oconnor

Post on 23-Dec-2015

TRANSCRIPT

  • Slide 1
  • XML Internet Engineering Spring 2015 Bahador Bakhshi CE & IT Department, Amirkabir University of Technology
  • Slide 2
  • Questions Q6) How to define the data that is transferred between web server and client? Q6.1) Which technology? Q6.2) Is data correctly encoded? Q6.3) How to access the data in web pages? Q6.4) How to present the data? 2
  • Slide 3
  • Homework Homework 3 Will be announced 3
  • Slide 4
  • Outline Introduction Namespaces Validation Presentation XML Processing (using JavaScript) Conclusion 4
  • Slide 5
  • Outline Introduction Namespaces Validation Presentation XML Processing (using JavaScript) Conclusion 5
  • Slide 6
  • Introduction HTML + CSS + JavaScript Interactive Web pages Web server is not involved after page is loaded JavaScript reacts to user events However, most web applications need data from the server after the page is loaded e.g., new email data in Gmail A mechanism for communication: AJAX A common (standard) format to exchange data In most applications, the data is structured 6
  • Slide 7
  • Introduction (contd) In general (not only in web) to store or transport data, we need a common format to specify the structure of data; e.g., Documents: PDF, HTML, DOCx, PPTx,... Objects: Java Object Serialization/Deserialization How to define the data structure? Binary format (similar to binary files) Difficult to develop & debug, machine dependent, Text format (similar to text files) Human readable, machine independent & easier 7
  • Slide 8
  • Introduction (contd) Example: Data structure of a class Course name, teacher, # of students, each student information 8 IE Bakhshi 48 Ali Hassani 1111 Babak Hosseini 2222 . Student num: 48 Name: IE Teacher: Bakhshi Ali Hassani 1111 Babak Hosseini 2222 . class Course{ string name; string teacher; integer num; Array st of Students; } c = new Course(); c.name = IE; c.teacher = Bakhshi; c.num = 48 st[1] = new Student(); st[1].name=Ali; st[2].fam=Hassani .
  • Slide 9
  • Introduction (contd) W3C's approach XML: eXtensible Markup Language A meta-markup language to describe data structure In each application, a markup language (a set of tags & attributes) is defined 9 IE 48 Bakhshi Ali Hassani 1111
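The idea of describing the course data with application-defined tags can be sketched in JavaScript. This is only a minimal sketch: the tag names (course, name, teacher, num, student, family, id) and the helper names `text` and `courseToXml` are assumptions, because the original tags did not survive in this transcript.

```javascript
// Escape the characters that would otherwise be read as markup in text content.
function text(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Serialize a course object into XML text (all tag names are hypothetical).
function courseToXml(course) {
  const students = course.students
    .map((s) =>
      `  <student><name>${text(s.name)}</name>` +
      `<family>${text(s.family)}</family><id>${text(s.id)}</id></student>`)
    .join("\n");
  return [
    "<course>",
    `  <name>${text(course.name)}</name>`,
    `  <teacher>${text(course.teacher)}</teacher>`,
    `  <num>${text(course.num)}</num>`,
    students,
    "</course>",
  ].join("\n");
}

const xml = courseToXml({
  name: "IE",
  teacher: "Bakhshi",
  num: 48,
  students: [{ name: "Ali", family: "Hassani", id: 1111 }],
});
console.log(xml);
```

The output is plain text, so it is human readable and machine independent, which is the point the slides make about text formats.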
  • Slide 10
  • Introduction (contd) Standard Generalized Markup Language (SGML) Expensive, complex to implement XML: a subset of SGML Goals: generality with simplicity and usability Simplifies SGML by: leaving out many syntactical options and variants SGML ~ 600pp, XML ~ 30pp XML = SGML - {complexity, document perspective} + {simplicity, data exchange perspective} 10
  • Slide 11
  • Why to Study XML: Benefits Simplify data sharing & transport XML is text based and platform independent Extensive tools to process XML To validate, to present, to search, In web application, data separation from HTML E.g., table structure by HTML, table data by XML Extensible for different applications A powerful tool to model/describe complex data E.g., MS Office!!! 11
  • Slide 12
  • XML Document Elements Markup Elements Tag + Content Attributes Comments Processing instructions Content Parsed Character Data Unparsed Character Data (CDATA) 12
  • Slide 13
  • XML Elements XML element structure Tag + content Content No predefined tags If the content is not CDATA, it is parsed by the parser A value for this element Child elements of this element 13
  • Slide 14
  • XML Elements Attributes Tags (elements) are customized by attributes No predefined attributes Windows Linux Attribute vs. Tags (elements) Attributes can be replaced by elements An attribute cannot be repeated for an element An attribute cannot have children Attributes are mainly used for metadata, e.g., ID, class 14
  • Slide 15
  • Processing Instructions Processing instructions pass information (instruction) to the application that processes the XML file They are not a part of user data Common usage XML Declaration is a special PI XML Declaration is always first line in file 15
  • Slide 16
  • Basic XML Document Structure Data 16
  • Slide 17
  • Example ThinkPad T500 4GB Linux, FC21 17
  • Slide 18
  • Example (CDATA) + - * / % != ]]> 18
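As a sketch of how unparsed character data can be produced programmatically, the helper below wraps raw text in a CDATA section. The function name `toCdata` and the element name `ops` are assumptions made for illustration. The only sequence that cannot appear inside CDATA is `]]>`, so the helper splits it across two sections.

```javascript
// Wrap raw text in a CDATA section so the parser does not interpret & or <.
// "]]>" would end the section early, so it is split across two sections.
function toCdata(raw) {
  return "<![CDATA[" + String(raw).split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

const operators = toCdata("+ - * / % != < > &&");
console.log(`<ops>${operators}</ops>`);
```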
  • Slide 19
  • XML vs. HTML Tags HTML: Predefined fixed tags XML: No predefined tags (meta-language) User defined tags & attributes Purpose HTML: Information display XML: Data structure & transfer (which is displayed) Rules strictness HTML (not XHTML): loose XML: strong/strict rule checking 19
  • Slide 20
  • XML in General Application XML by itself does not do anything XML just describes the structure of the data Other applications parse XML and use it A similar approach is used for other formats (even user-defined formats); so, what are the advantages of XML?!!! XML is standard Available XML tools & technologies 20 XML document XML processor (aka. XML Parser) application SAX, DOM Apache Xerces
  • Slide 21
  • XML Technology Components Data structure (tree) representation XML document (a text file) Validation & Conformance Document Type Definition (DTD) or XML Schema Element access & addressing XPath, DOM Display and transformation XSLT or CSS Programming, Database, Query, 21
  • Slide 22
  • Outline Introduction Namespaces Validation Presentation XML Processing (using JavaScript) Conclusion 22
  • Slide 23
  • Namespaces In XML, element names are defined by developers Results in a conflict when trying to mix XML documents from different XML applications XML file 1 Apples Bananas XML file 2 Dinner Table 80 120 23
  • Slide 24
  • Namespaces Name conflicts in XML can easily be avoided by using qualified names with a prefix A qualified name is the prefixed name The prefix stands for the namespace Step 1: Namespace declaration Defines a label (prefix) for the namespace and associates it with the namespace identifier A URI/URL is used so that the identifier is universally unique Step 2: Qualified name namespace prefix:local name 24
  • Slide 25
  • Namespaces Computer Engineering & Information Technology Internet Engineering 25 Actual name of this tag of parser: http://ceit.aut.ac.ir:department
  • Slide 26
  • Default Namespaces Apples Bananas Dinner Table 80 120 26
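The two steps of namespace handling (declare a prefix, then use qualified names) can be illustrated with a small resolver. This is a sketch under assumptions: `expandName`, the prefix `ceit`, and the `http://example.com/furniture` URI are hypothetical; only `http://ceit.aut.ac.ir` and the local name `department` come from the slide. The empty-string key stands for a default namespace, and only element names are considered.

```javascript
// Resolve a qualified element name ("prefix:local" or "local") against the
// namespace declarations in scope. The key "" holds the default namespace.
function expandName(qname, declarations) {
  const colon = qname.indexOf(":");
  const prefix = colon === -1 ? "" : qname.slice(0, colon);
  const local = colon === -1 ? qname : qname.slice(colon + 1);
  const declared = Object.prototype.hasOwnProperty.call(declarations, prefix);
  if (prefix !== "" && !declared) {
    throw new Error(`Undeclared namespace prefix: ${prefix}`);
  }
  return { uri: declared ? declarations[prefix] : null, local };
}

// Prefixed name: what xmlns:ceit="http://ceit.aut.ac.ir" would declare.
const dept = expandName("ceit:department", { ceit: "http://ceit.aut.ac.ir" });
console.log(dept.uri + ":" + dept.local);

// Unprefixed name under a default namespace (xmlns="...").
const table = expandName("table", { "": "http://example.com/furniture" });
console.log(table.uri, table.local);
```

Two elements named `table` from different applications end up with different URIs, so a parser can tell them apart even though the local names are equal.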
  • Slide 27
  • Outline Introduction Namespaces Validation Presentation XML Processing (using JavaScript) Conclusion 27
  • Slide 28
  • Valid XML XML is used to describe structured data The description must be correct A valid XML file Correctness Syntax Syntax error: parser fails to parse the file Syntax rules: e.g., all XML tags must be closed Semantic (structure) Application specific rules, e.g. student must have ID Error: Application failure 28
  • Slide 29
  • XML Syntax Rules (Well-Formed) Start-tag and End-tag, or self-closing tag Tags can't overlap XML documents can have only one root element XML naming conventions Names can start with letters or the underscore (_) character After the first character, numbers, hyphens, and periods are allowed Names can't start with xml, in uppercase or lowercase There can't be a space after the opening < character XML is case sensitive Value of attributes must be quoted White-spaces are preserved &, <, > are represented by &amp; &lt; &gt; 29
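Two of the well-formedness rules lend themselves to a short sketch: the naming convention and the entity escaping of special characters. The helper names `isXmlName` and `escapeXml` are assumptions, and the name check is deliberately simplified to ASCII letters (real XML names also allow many Unicode characters, and the colon is left out here because it is reserved for namespace prefixes).

```javascript
// Simplified (ASCII-only) check of the naming rules listed on the slide:
// start with a letter or underscore, continue with letters, digits,
// periods, hyphens or underscores, and do not start with "xml" in any case.
function isXmlName(name) {
  return /^[A-Za-z_][A-Za-z0-9._-]*$/.test(name) && !/^xml/i.test(name);
}

// Replace markup characters with the predefined entities.
// "&" must be handled first so already-produced entities are not re-escaped.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

console.log(isXmlName("student"), isXmlName("-id"), isXmlName("XmlData"));
console.log(escapeXml('a < b & "c"'));
```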
  • Slide 30
  • How to Validate XML (structure)? 1) Application specific programs need to check the structure of the XML document Different applications need different programs A change in the data structure requires code modification 2) General XML parser + reference document Reference document Tag names, attributes, tree structure, tag relations, Different reference documents DTD, XML Schema, RELAX NG 30
  • Slide 31
  • XML Validation (contd) Document Type Definition (DTD) or XML Schema A language to define document type The rules of the structure of XML Internal or External 31 XML data parser interface XML-based application DTD / Schema
  • Slide 32
  • DTD DTD is a set of structural rules called declarations, which specify A set of elements and attributes that can be in XML Where these elements and attributes may appear ELEMENT : to define tags For leaf nodes: Character pattern For internal nodes: List of children ATTLIST : to define tag attributes Includes: name of the element, the attribute's name, its type, and a default option 32
  • Slide 33
  • ELEMENT Declaration General form of internal nodes To control the number of times a child may appear + : One or more * : Zero or more ? : Zero or one General form of leaf nodes Where, types #PCDATA : Most commonly used; the content is parsed, so a raw & or < is not allowed ANY : Any well-formed content, including child elements, is allowed EMPTY : No content 33
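To make the occurrence indicators concrete, the sketch below checks a list of child element names against a DTD-style child list such as `(from, to+, body?)`. It is a toy under stated assumptions: `parseChildren` and `matchesSequence` are hypothetical helpers, only comma-separated sequences are supported (no `|` choices or nested groups), and the element names are borrowed from the message example later in the deck.

```javascript
// Parse a DTD-style child list like "(from, to+, body?)" into rules.
function parseChildren(spec) {
  return spec.replace(/[()\s]/g, "").split(",").map((item) => {
    const m = /^([A-Za-z_][\w.-]*)([?*+]?)$/.exec(item);
    if (!m) throw new Error(`Unsupported child spec: ${item}`);
    return { name: m[1], indicator: m[2] };
  });
}

// Check that the children appear in order and respect +, * and ?.
function matchesSequence(rules, childNames) {
  let i = 0;
  for (const { name, indicator } of rules) {
    const min = indicator === "?" || indicator === "*" ? 0 : 1;
    const max = indicator === "*" || indicator === "+" ? Infinity : 1;
    let count = 0;
    while (count < max && i < childNames.length && childNames[i] === name) {
      i++;
      count++;
    }
    if (count < min) return false; // a required child is missing
  }
  return i === childNames.length; // no unexpected children left over
}

const rules = parseChildren("(from, to+, body?)");
console.log(matchesSequence(rules, ["from", "to", "to", "body"]));
console.log(matchesSequence(rules, ["from", "body"]));
```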
  • Slide 34
  • ATTLIST Declaration element_name: The name of the corresponding element attribute_name: The name of attribute attribute_type: Commonly CDATA is used default_option: A value: The default value of the attribute #REQUIRED : The attribute is mandatory #IMPLIED : The attribute is optional 34
  • Slide 35
  • ENTITY Declaration Defines an entity by a given name and associated value A reference to the entity (&name; in the document, %name; inside the DTD) is replaced by the value Very similar to #define in C Example 35
  • Slide 36
  • Example: Internal DTD John Johansen Smith 36
  • Slide 37
  • External DTD System Public Common format is FPI (Defined in the document ISO 9070) Example 37
  • Slide 38
  • Example: External DTD sample.dtd -------------------------------------------------------------------------- external-dtd.xml Ali Hassan Babak This is message 38
  • Slide 39
  • XML Schema XML Schema describes the structure of an XML file Also referred to as XML Schema Definition (XSD) XML Schema's benefits (DTD disadvantages) Created using basic XML syntax (DTD has its own syntax) Validate text element content based on built-in and user-defined data types (DTD does not fully support data types) Similar to OOP Schema is a class & XML files are instances Schema specifies Elements and attributes, where and how often Data type of every element and attribute 39
  • Slide 40
  • Schema (contd) XML schema is itself an XML-based language Has its own predefined tags & namespace xmlns:xs="http://www.w3.org/2001/XMLSchema" Two categories of data types Simple: Cannot have nested elements or attributes (i.e., itself is a leaf or attribute) Primitive: string, boolean, decimal, float,... Derived: integer, byte, long, unsignedInt, User defined: restriction of base types Complex: Can have attributes and/or nested elements 40
  • Slide 41
  • XML Schema (contd) Simple element declaration Complex element declaration or or or or 41
  • Slide 42
  • XML Schema (contd) Notes on minOccurs & maxOccurs Using the all indicator minOccurs & maxOccurs indicator can only be 0 or 1 The default value for minOccurs is 1 To allow an element to appear an unlimited number of times, use the maxOccurs="unbounded" statement 42
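The occurrence rules can be expressed as a small check. This is a sketch, not a schema validator: `occursOk` is a hypothetical helper that only compares a count with the two indicators. Both indicators default to 1 when omitted, and `"unbounded"` removes the upper limit.

```javascript
// Decide whether an element that appears `count` times satisfies
// minOccurs/maxOccurs. Both indicators default to 1 when they are omitted.
function occursOk(count, { minOccurs = 1, maxOccurs = 1 } = {}) {
  const max = maxOccurs === "unbounded" ? Infinity : maxOccurs;
  return count >= minOccurs && count <= max;
}

console.log(occursOk(1));                               // defaults: exactly once
console.log(occursOk(0, { minOccurs: 0 }));             // optional element
console.log(occursOk(100, { maxOccurs: "unbounded" })); // repeated element
```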
  • Slide 43
  • XML Schema Example: note.xsd 43
  • Slide 44
  • XML Schema Example: note.xml Ali Reza 1391/1/1 44
  • Slide 45
  • XML Validation Tools Online validators validator.w3.org www.xmlvalidation.com XML tools & commands xmllint commands in Linux xmllint xmlfile --valid --dtdvalid DTD xmllint xmlfile --schema schema XML libraries LibXML2 for C Java & C# XML libraries 45
  • Slide 46
  • Outline Introduction Namespaces Validation Presentation XML Processing (using JavaScript) Conclusion 46
  • Slide 47
  • XML Presentation By default, browsers parse & display XML files Tree structure of XML Syntax checking Well-formed XML Other presentations of XML 1) Browsers support CSS for XML files CSS is used to format the representation of XML 2) Transform to HTML + CSS using XSLT A powerful tool to separate data from HTML 3) Use JavaScript to generate HTML for XML Parse the XML and create HTML elements 47
  • Slide 48
  • XML & CSS Attach styling instructions directly to XML Can style but not rearrange elements Block or inline style Bold, italic, underline, font, color, etc. Tag_name {color:red; font-weight:bold; font-family:serif;} 48
  • Slide 49
  • CSS Example Good programming books: The C Programming Language Ritchie Thinking in Java Eckel ========================================== *{display: block;} programming{font-family: Arial; font-size:20pt;} C{color: blue;} Java{color: green;} author{ font-style:italic;} 49
  • Slide 58
  • XSLT Structure XSLT commands HTML tags + Other contents 58 The templates are run for every node that matches the given expression The default template: If there is a child node, current node = child If there is no child node, print the value Copied to output
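The default template behaviour described above (descend into children, print the value at the leaves) can be mimicked on a plain object tree. This is a sketch in JavaScript rather than XSLT itself: `builtInTemplate` and the `{ name, children }` / `{ text }` node shape are assumptions, and the sample data is one student from the XSLT example later in the deck.

```javascript
// Mimic the default template rule: an element node hands processing to its
// children, and a text node is copied to the output.
function builtInTemplate(node) {
  if (typeof node.text === "string") return node.text;
  return (node.children || []).map(builtInTemplate).join("");
}

const student = {
  name: "student",
  children: [
    { name: "name", children: [{ text: "Ali" }] },
    { name: "family", children: [{ text: "Alizadeh" }] },
    { name: "grade", children: [{ text: "18.0" }] },
  ],
};
console.log(builtInTemplate(student));
```

This is why a stylesheet with no matching templates still outputs the concatenated text of the document.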
  • Slide 59
  • XSLT Commands (elements) <xsl:output> To define the format of the output created by the stylesheet In this course, we want to create HTML from XML method="html" HTML is also the default output method when the root element of the result is html <xsl:template> To define a template for the specified nodes <xsl:apply-templates> If select is given Processes all the elements specified by the expression (current node loops over the selected elements) If select is not given Processes all child nodes of the current node (current node loops over the children of this node) 59
  • Slide 60
  • XSLT Commands (elements) <xsl:value-of> The value of the specified nodes <xsl:text> Copy the given text to the output If escaping is not disabled, the output is escaped; i.e., > becomes &gt; <xsl:for-each> Create a loop on the array of selected nodes <xsl:if> Conditional rules Comparison Operators Equal: = Not equal: != Less than: &lt; Greater than: &gt; 60
  • Slide 61
  • XSLT Commands (elements) <xsl:choose> Switch-case (if-else) rules The template applying algorithm selects the best matched template if the current node matches multiple templates The best is defined according to the priority of templates The priorities are defined in the standard (we do not cover them here) E.g., in the case of multiple templates with an identical match attribute, the last template has the highest priority 61
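The selection rule can be sketched as follows. This is a simplification under assumptions: `pickTemplate` is a hypothetical helper, each candidate carries an already computed numeric `priority`, the array is in stylesheet order, and import precedence is ignored.

```javascript
// Pick the template to run among those that matched the current node.
// Highest priority wins; on a tie the template that appears last wins.
function pickTemplate(candidates) {
  let best = null;
  for (const template of candidates) {
    if (best === null || template.priority >= best.priority) best = template;
  }
  return best;
}

const chosen = pickTemplate([
  { name: "generic", priority: -0.5 },
  { name: "first", priority: 0 },
  { name: "second", priority: 0 },
]);
console.log(chosen.name);
```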
  • Slide 62
  • XSLT Example: XML Data file Internet Engineering Spring 2012 Ali Alizadeh 18.0 123 Babak Babaki 7.0 234 Hassan Hassani 19.0 345 62
  • Slide 63
  • XSLT Example: XSLT file Course: Semester: Students: " Student # Name Family Grade 63
  • Slide 64
  • 64
  • Slide 65
  • XSLT Example: Result 65
  • Slide 66
  • XSLT Example: apply-templates (1) 66
  • Slide 67
  • XSLT Example: apply-templates (2) Student # Name Family Grade 67
  • Slide 68
  • XSLT Example: apply-templates (2) Course: Semester: Students: " " 68
  • Slide 69
  • XSLT Example: apply-templates (2) 69
  • Slide 70
  • XSLT Example: xsl:text style="background-color:green"> 70
  • Slide 71
  • Where XSLT? Server Side 71 XSLT Server transforms XML to HTML/CSS; Ship to client browser for display http HTML +CSS Browser/ Interface XMLStylesheet
  • Slide 72
  • Where XSLT? Client-Side 72 XSLT Server sends XML & Stylesheet to client Client transforms XML to HTML & CSS http HTML +CSS Browser/ Interface XMLStylesheet
  • Slide 73
  • Where XSLT? Client-side (browser) Reduces the load on your servers Search engines do not process the XML & XSLT Server-side Performance could be a problem on busy servers Search engines can process the HTML (we have SEO!) Offline Pre-conversion E.g. xsltproc command in Linux Best performance Not good for dynamic documents 73
  • Slide 74
  • Outline Introduction Namespaces Validation Presentation XML Processing (using JavaScript) Conclusion 74
  • Slide 75
  • XML Processor: Parsers There are two basic types of XML parsers: Tree (DOM)-based parser: Whole document is analyzed to create a DOM tree Advantages: Multiple & random access to elements, easier to validate the structure of XML Event-based parser (SAX): XML document is interpreted as a series of events When a specific event occurs, a function is called to handle it (we will see it later in PHP) Advantages: less memory usage, and faster because there is no need to wait for the whole document 75
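The event-based style can be illustrated with a toy scanner that calls a handler for each start tag, end tag, and text run instead of building a tree. This is not a real SAX parser: `scan` and the handler names are assumptions, and the regular expression only copes with plain tags and text (no comments, CDATA, processing instructions, or `>` inside attribute values).

```javascript
// Toy event-based (SAX-style) scanner: reports events as it walks the input
// and keeps no tree in memory.
function scan(xml, handlers) {
  const token = /<(\/?)([A-Za-z_][\w.-]*)[^>]*?(\/?)>|([^<]+)/g;
  let m;
  while ((m = token.exec(xml)) !== null) {
    const [, closing, name, selfClosing, text] = m;
    if (text !== undefined) {
      if (text.trim() !== "") handlers.text(text.trim());
    } else if (closing) {
      handlers.end(name);
    } else {
      handlers.start(name);
      if (selfClosing) handlers.end(name); // <br/> is a start plus an end
    }
  }
}

const events = [];
scan("<msg><from>Ali</from><to>Babak</to><br/></msg>", {
  start: (name) => events.push("start:" + name),
  end: (name) => events.push("end:" + name),
  text: (value) => events.push("text:" + value),
});
console.log(events.join(" "));
```

A tree-based parser would instead return one object for the whole document, which allows random access at the cost of holding everything in memory.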
  • Slide 76
  • XML Parsing in Browser Web browsers have a built-in XML parser XML parser output: XML DOM XML DOM is accessible through JavaScript How to get an XML file in JavaScript? Using AJAX We will see later (Input) string 76
  • Slide 77
  • XML DOM XML DOM is similar to HTML DOM A tree of nodes (with different types: element, text, attr, ) Nodes are accessed by getElementsByTagName Nodes are objects (have methods & fields) DOM can be modified, e.g., create/remove nodes However There are no predefined attributes like id / class getElementById or similar methods are not applicable Since XML is not for presentation Nodes do not have event handler functions Nodes do not have a style field 77
  • Slide 78
  • XML DOM in JavaScript DOMParser can parse an input XML string Each node has parentNode, children, childNodes, Access to value of a node In the DOM, everything is a node (with different types) Element nodes do not have a content value The content of an element is stored in a child node To get the content of a leaf element, read the value of its first child node (the text node) 78
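Reading the text of a leaf element can be wrapped in a small helper. This is a sketch: `leafText` is a hypothetical name, and it relies only on `getElementsByTagName`, `childNodes`, and `nodeValue`, so it works on a document produced in the browser by `new DOMParser().parseFromString(xmlString, "text/xml")`. The stand-in object at the end merely imitates that shape so the snippet can run outside a browser.

```javascript
// Return the text of the first <tagName> descendant of `parent`,
// or null when no such element exists.
function leafText(parent, tagName) {
  const element = parent.getElementsByTagName(tagName)[0];
  if (!element) return null;
  // The text lives in the first child (a text node), not in the element.
  const firstChild = element.childNodes[0];
  return firstChild ? firstChild.nodeValue : "";
}

// Stand-in for a parsed <msg><from>Ali</from></msg> element (illustration only).
const fakeMsg = {
  getElementsByTagName: (tag) =>
    tag === "from" ? [{ childNodes: [{ nodeValue: "Ali" }] }] : [],
};
console.log(leafText(fakeMsg, "from"));
```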
  • Slide 79
  • Example: Message Parser Enter Your XML: 79
  • Slide 80
  • Example: Message Parser function parse(){ let output = ""; const input = document.getElementsByName("inputtext")[0].value; const parser = new DOMParser(); const xmlDoc = parser.parseFromString(input,"text/xml"); const messages = xmlDoc.getElementsByTagName("root")[0].children; for(let i=0; i < messages.length; i++){ const msg = messages[i]; const fromNode = msg.getElementsByTagName("from")[0]; const fromText = fromNode.childNodes[0].nodeValue; const toNode = msg.getElementsByTagName("to")[0]; const toText = toNode.childNodes[0].nodeValue; const bodyNode = msg.getElementsByTagName("body")[0]; const bodyText = bodyNode.childNodes[0].nodeValue; output = output + fromText +" sent following message to " + toText + " ''" + bodyText +"'' "; } document.getElementsByName("outputdiv")[0].innerHTML = output; } 80
  • Slide 81
  • Outline Introduction Namespaces Validation Presentation XML Processing (using JavaScript) Conclusion 81
  • Slide 82
  • Conclusion: XML Technologies XHTML A stricter and cleaner XML based version of HTML XSL (Extensible Stylesheet Language) XSL consists of three parts: XSLT (XSL Transformations) - transforms XML into other formats XPath - a language for navigating XML documents XSL-FO - a language for formatting XML documents DTD (Document Type Definition) A standard for defining the legal elements XSD (XML Schema) An XML-based alternative to DTD 82
  • Slide 83
  • Conclusion: XML Technologies SVG (Scalable Vector Graphics) Defines graphics in XML format XQuery (XML Query Language) An XML based language for querying XML data XLink (XML Linking Language) A language for creating hyperlinks in XML documents XPointer (XML Pointer Language) Allows the XLink hyperlinks to point to more specific parts in the XML document 83
  • Slide 84
  • Conclusion XML is an easy and powerful technology To describe data structure To exchange To process & transform XML applications beyond the web Image format: Scalable Vector Graphics (SVG) Everywhere to save/restore structured data Microsoft Office, Other related technologies in data exchange JSON (JavaScript), Protocol Buffers (Google), Thrift (Apache & FB) 84
  • Slide 85
  • Answers Q6.1) Which technology? Text based! XML: A markup meta-language with user defined tags Q6.2) Is data correctly encoded? XML Validation: DTD and Schema Q6.3) How to access the data in web pages? Parse XML to DOM, we know how to work with DOM Q6.4) How to present the data? CSS for XML XPath to select elements, XSLT to translate XML to HTML 85
  • Slide 86
  • References Reading Assignment: Chapter 7 of Programming the World Wide Web David Hunter, et.al, Beginning XML, chapters 1-5 & chapter 8 Andrew H. Watt, Sams Teach Yourself XML in 10 Minutes, chapters 1-4, chapters 8-10 http://w3schools.com/xsl/default.asp 86