Intro to XML - Washington and Lee University · 2019-05-15
Post on 31-May-2020
Intro to XML
Borrowed, with author’s permission, from:
http://business.unr.edu/faculty/ekedahl/IS389/Topic3AndroidIntroduction/IS389AndroidBasics.aspx
Part 1: XML Basics
Why XML Here?
You need to understand the basics of XML to do much with Android
All of the layout and configuration files are XML files
Overview
XML is a very simple language
The entire specification is only about 35 printed pages
XML IS CASE SENSITIVE
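A minimal sketch of what case sensitivity means in practice. It uses Python's standard-library xml.etree.ElementTree parser purely for illustration (an assumption; the slides themselves target Java and Android), and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

matching = "<answer>42</answer>"
mismatched = "<Answer>42</answer>"  # start and end tags differ only in case

print(is_well_formed(matching))    # True
print(is_well_formed(mismatched))  # False: <Answer> is never closed
```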
The XML Document
At the heart of XML is the XML document
An XML document is a logical entity rather than a physical one
One logical document can be stored across several physical files, locally or on the Web, and assembled when it is processed
A well-formed subset of an XML document is called a document fragment
An XML document contains markup (tags) and data
Looks like HTML without predefined tags
Elements contain 0, 1 or many attributes
Elements (Introduction)
XML tags resemble HTML tags but they are not predefined
That is, you give names to tags just as you give names to variables and other identifiers
Starting tags appear as <tag>
Ending tags appear as </tag>
Empty elements (elements without data) can appear as <tag/>
Every other XML element MUST have an ending tag
A starting tag and an ending tag (along with any text between them) make up an element
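The tag rules above can be checked with a short sketch. It assumes Python's xml.etree.ElementTree as the parser; the element name note is made up for the example.

```python
import xml.etree.ElementTree as ET

# The empty-element form and an explicit start/end pair describe the same element.
short_form = ET.fromstring("<note/>")
long_form = ET.fromstring("<note></note>")
print(short_form.tag, short_form.text, len(short_form))  # note None 0
print(long_form.tag, long_form.text, len(long_form))     # note None 0

# Unlike lenient HTML parsers, an XML parser rejects a missing ending tag.
try:
    ET.fromstring("<note>remember the milk")
    missing_end_tag_accepted = True
except ET.ParseError:
    missing_end_tag_accepted = False
print(missing_end_tag_accepted)  # False
```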
Elements (Nesting)
Elements can be nested
These are hierarchical nodes
Elements MUST be nested correctly though
A child element must completely reside in its parent element
There must be exactly one root element
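As a rough illustration of the nesting rules, the sketch below feeds three small documents to a parser and records which ones it accepts. Python's xml.etree.ElementTree is assumed here only as a convenient well-formedness checker.

```python
import xml.etree.ElementTree as ET

samples = {
    "properly nested": "<universe><answer>42</answer></universe>",
    "overlapping tags": "<universe><answer>42</universe></answer>",
    "two root elements": "<universe/><universe/>",
}

results = {}
for label, text in samples.items():
    try:
        ET.fromstring(text)
        results[label] = True       # parsed without error
    except ET.ParseError:
        results[label] = False      # violates a nesting rule

for label, accepted in results.items():
    print(label, "->", "accepted" if accepted else "rejected")
```

Only the first document is accepted; the other two break the child-inside-parent rule and the single-root rule respectively.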
Elements (Nesting – Example)
<?xml version="1.0" encoding="utf-8" ?>
<!-- What is the answer? -->
<universe>
<answer>
42 <!-- It's 42. -->
</answer>
</universe>
Root element is <universe>
Nested element is <answer>
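To confirm the root and nested elements programmatically, a sketch along these lines could be used (Python's xml.etree.ElementTree is an assumption; any XML parser exposes the same hierarchy).

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0" encoding="utf-8" ?>
<!-- What is the answer? -->
<universe>
<answer>
42 <!-- It's 42. -->
</answer>
</universe>"""

root = ET.fromstring(document)
children = [child.tag for child in root]
answer_text = root.find("answer").text.strip()  # comments are not part of the data

print(root.tag)     # universe
print(children)     # ['answer']
print(answer_text)  # 42
```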
XML Attributes (Introduction)
Elements can have zero or more attributes
The syntax resembles HTML attributes, but XML is stricter: every attribute needs a quoted value
Attributes appear after the starting tag name
One element cannot have two attributes of the same name
One element can have many attributes though
XML Attributes (Syntax 1)
Attributes appear as key=value pairs
An attribute must have both a key and a value
Attribute keys do not appear in quotation marks
Attribute values must appear in quotation marks
Remember, single and double quotes are interchangeable, as long as the opening and closing quotes match
A space separates each key=value pair
Attributes can appear in starting and empty element tags but cannot appear in an ending tag
For a given element, each attribute key must differ
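The syntax rules above can be exercised one at a time. This is a sketch that assumes Python's xml.etree.ElementTree as the parser; the dictionary labels are only descriptive names chosen for the example.

```python
import xml.etree.ElementTree as ET

candidates = {
    "double quotes": '<universe id="1" size="Infinite"/>',
    "single quotes": "<universe id='1' size='Infinite'/>",
    "unquoted value": "<universe id=1/>",
    "key without value": "<universe id/>",
    "duplicate key": '<universe id="1" id="2"/>',
    "attribute in end tag": '<universe></universe id="1">',
}

accepted = {}
for label, text in candidates.items():
    try:
        ET.fromstring(text)
        accepted[label] = True
    except ET.ParseError:
        accepted[label] = False

for label, ok in accepted.items():
    print(label, "->", "accepted" if ok else "rejected")
```

Only the first two candidates parse; each of the remaining four breaks exactly one of the rules listed above.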
XML Attributes (Example)
In the following example, id and size are attributes of the <universe> element
<?xml version="1.0" encoding="utf-8" ?>
<universe id="1" size="Infinite">
<answer>42</answer>
</universe>
Comparing Attributes and Elements
The following 2 documents are (roughly) equivalent
<?xml version="1.0" encoding="utf-8" ?>
<student id="12345">
<name>Bill</name>
</student>
<?xml version="1.0" encoding="utf-8" ?>
<student>
<id>12345</id>
<name>Bill</name>
</student>
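The "roughly" matters to the code that reads the data: the same value is reached through a different call depending on where it is stored. A sketch, assuming Python's xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

attribute_version = '<student id="12345"><name>Bill</name></student>'
element_version = "<student><id>12345</id><name>Bill</name></student>"

# id stored as an attribute: read it from the element's attribute map
id_from_attribute = ET.fromstring(attribute_version).get("id")

# id stored as a child element: read the child's text instead
id_from_element = ET.fromstring(element_version).findtext("id")

print(id_from_attribute, id_from_element)  # 12345 12345
```

One practical difference: an attribute can occur only once per element and holds plain text, while a child element can repeat and can itself contain further elements.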
Part 2: XML Namespaces
Namespace Caveats
XML namespaces are the source of criticism and confusion
XML namespaces are the source of many myths
Java namespaces and XML namespaces are not the same thing
In fact, they have nothing to do with each other
The Problem of Name Ambiguity
Names can be ambiguous
When we say “Las Vegas” do we mean “Las Vegas, Nevada” or “Las Vegas, New Mexico”?
How many cities have the name “Springfield”?
When we say “address”, do we mean street address or Internet address?
Name ambiguity occurs when two XML documents have elements of the same name but with different meanings (context)
The Problem of Name Ambiguity (Example)
<?xml version="1.0" encoding="utf-8" ?>
<office>
<address>123 Oak street</address>
<address>121.216.39.3</address>
</office>
The meaning of <address> differs between the two contexts
The Purpose of XML Namespaces (1)
They eliminate name ambiguity in element names between documents
Namespaces answer the question “Are we talking about the same thing?”
XML namespaces were released about 1 year after XML itself was released
The Purpose of Namespaces (2)
According to the W3C namespace specification:
We envision applications of Extensible Markup Language (XML) where a single XML document may contain elements and attributes (here referred to as a "markup vocabulary") that are defined for and used by multiple software modules.
Implementation of XML Namespaces
XML namespaces use URIs (IRIs) to qualify element names
This allows us to use the same element name with a different URI thereby qualifying the element name
The URI is just a globally unique name
Formally, XML namespaces define a vocabulary or universe of names
The Implementation of Namespaces (1)
Namespace names are defined through ‘special’ attributes
Any attribute whose name starts with xmlns: is considered a prefix-defining attribute
The name following xmlns: is the local abbreviation (the prefix) for the namespace
The prefix name itself has no significance, although there are common naming conventions
The attribute value is a unique URI
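A sketch of how prefixes resolve the earlier <address> ambiguity. The two namespace URIs (http://example.org/postal and http://example.org/network) and the prefixes are hypothetical, and Python's xml.etree.ElementTree is assumed as the parser; it reports each qualified name as {URI}localname.

```python
import xml.etree.ElementTree as ET

office = (
    '<office xmlns:post="http://example.org/postal" '
    'xmlns:net="http://example.org/network">'
    "<post:address>123 Oak street</post:address>"
    "<net:address>121.216.39.3</net:address>"
    "</office>"
)
tags = [child.tag for child in ET.fromstring(office)]
print(tags[0])  # {http://example.org/postal}address
print(tags[1])  # {http://example.org/network}address

# Renaming the prefixes changes nothing: only the URIs identify the names.
renamed = (
    '<office xmlns:a="http://example.org/postal" '
    'xmlns:b="http://example.org/network">'
    "<a:address>123 Oak street</a:address>"
    "<b:address>121.216.39.3</b:address>"
    "</office>"
)
renamed_tags = [child.tag for child in ET.fromstring(renamed)]
print(tags == renamed_tags)  # True
```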
XML Namespace Example (Android)
The prefix android is given to http://schemas.android.com/apk/res/android
The prefix tools is given to http://schemas.android.com/tools
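As an illustration of these two prefixes in use, the sketch below parses a one-element layout. The two URIs come from the slide; the specific attributes shown (android:orientation, tools:context) and their values are assumptions chosen for the example, and Python's xml.etree.ElementTree stands in for whatever parser Android uses.

```python
import xml.etree.ElementTree as ET

ANDROID = "http://schemas.android.com/apk/res/android"
TOOLS = "http://schemas.android.com/tools"

layout = (
    f'<LinearLayout xmlns:android="{ANDROID}" xmlns:tools="{TOOLS}" '
    'android:orientation="vertical" tools:context=".MainActivity"/>'
)
root = ET.fromstring(layout)

# Attributes are looked up by namespace URI plus local name, not by prefix.
orientation = root.get(f"{{{ANDROID}}}orientation")
context = root.get(f"{{{TOOLS}}}context")
unqualified = root.get("orientation")  # no such unqualified attribute

print(orientation)  # vertical
print(context)      # .MainActivity
print(unqualified)  # None
```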
Namespaces (Quotable Quotes)
What Does a Namespace URL Locate?
There is nothing at all at the end of a namespace URI, except perhaps a 404 Not Found error
Namespaces (Myths)
IT’S NOT NECESSARILY A WEB SITE
IT’S NOT A REFERENCE TO A PROGRAM ON THE WEB
IT’S NOT A POINTER TO A RESOURCE
IT’S JUST A NAME
Applications using XML namespaces
XML namespaces are widely used in
The Schema Definition Language (XSDL)
The Extensible Stylesheet Language (XSL)
The namespace can represent a vocabulary
Characteristics of Namespace URIs (Example)
The following URIs ARE different:
http://www.example.org/foo
http://www.Example.org/foo
http://www.example.org/Foo
And so are the following:
http://www.example.org/~foo
http://www.example.org/%7efoo
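The reason is that namespace names are compared as plain strings, character by character, with no URL normalization. A sketch of that comparison, assuming Python's xml.etree.ElementTree for the parsing half:

```python
import xml.etree.ElementTree as ET

uris = [
    "http://www.example.org/foo",
    "http://www.Example.org/foo",
    "http://www.example.org/Foo",
    "http://www.example.org/~foo",
    "http://www.example.org/%7efoo",
]
all_distinct = len(set(uris)) == len(uris)
print(all_distinct)  # True: no two strings are identical

# A parser therefore treats two such URIs as two different namespaces.
doc = (
    '<root xmlns:a="http://www.example.org/foo" '
    'xmlns:b="http://www.Example.org/foo"><a:item/><b:item/></root>'
)
first, second = ET.fromstring(doc)
print(first.tag == second.tag)  # False
```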
Part 3: XML Parsing
Parsing: Natural Language
Sentences aren’t just sequences of words: we need to know how words group together to form meaningful units.
E.g., in “The book for this course costs a fortune”, we need to represent the following facts:
“The book for this course” serves to identify a single book (as would “my book”, “Moby Dick”, etc.)
The verb “costs” has to agree with the noun “book” in number; in other languages, they would have to agree in gender or other properties.
Parsing: Natural Language
S
  NP
    NP
      Det: The
      N: book
    PP
      P: for
      NP
        Det: this
        N: course
  VP
    V: costs
    NP
      Det: a
      N: fortune
Parsing: XML
LinearLayout
  TextView
  LinearLayout
    Button
    Button
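A parser recovers exactly this kind of tree from the flat text of a layout file. The sketch below assumes a stripped-down layout with no attributes and uses Python's xml.etree.ElementTree; the helper name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

layout = (
    "<LinearLayout>"
    "<TextView/>"
    "<LinearLayout><Button/><Button/></LinearLayout>"
    "</LinearLayout>"
)

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(layout))
print("\n".join(tree_lines))
```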
Parsing: Programming languages
Natural languages give us some cues (intonation, punctuation) for parsing, but in general it is a very difficult computational problem (O(N^3) in the number of words N), and an active research area.
Natural language words also don’t come with explicit tags (N, V, P): Part-Of-Speech (POS) tagging algorithms (Brill 1995) are the typical solution.
Programming languages (and markup languages like XML) are designed to make parsing easier, but we still want to use an existing package, like javax.xml.parsers, rather than write our own parser.
XML Parsing: DOM vs. SAX
Document Object Model (DOM): Build up a tree representation of the entire document at once; then traverse the tree to find relevant tags. (This is the one I use!)
http://stackoverflow.com/questions/5059224/which-is-the-best-library-for-xml-parsing-in-java
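A minimal sketch of the DOM style. It uses Python's xml.dom.minidom, which follows the same W3C DOM interface that Java's javax.xml.parsers.DocumentBuilder produces; the layout string and its text attributes are assumptions made up for the example.

```python
from xml.dom import minidom

layout = (
    '<LinearLayout><TextView text="Hello"/>'
    '<LinearLayout><Button text="OK"/><Button text="Cancel"/></LinearLayout>'
    "</LinearLayout>"
)

# Step 1: build the whole tree in memory.
document = minidom.parseString(layout)

# Step 2: traverse or query the finished tree.
root_name = document.documentElement.tagName
buttons = document.getElementsByTagName("Button")
labels = [button.getAttribute("text") for button in buttons]

print(root_name)  # LinearLayout
print(labels)     # ['OK', 'Cancel']
```

The trade-off is memory: the complete tree exists before the first query runs, which is convenient for small files such as layouts but costly for very large documents.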
XML Parsing: DOM vs. SAX
Simple API for XML (SAX): Doesn’t build a big tree; instead, allows you to write callback methods (like listeners) that get called automatically as the parser reads through the document.
http://stackoverflow.com/questions/5059224/which-is-the-best-library-for-xml-parsing-in-java
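A minimal sketch of the SAX style, using Python's xml.sax as a stand-in for the Java SAX API. The handler class name ButtonCollector, the layout string, and its text attributes are assumptions made up for the example.

```python
import xml.sax

layout = (
    '<LinearLayout><TextView text="Hello"/>'
    '<LinearLayout><Button text="OK"/><Button text="Cancel"/></LinearLayout>'
    "</LinearLayout>"
)

class ButtonCollector(xml.sax.ContentHandler):
    """Collects the text attribute of every Button start tag."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def startElement(self, name, attrs):
        # Called by the parser each time it reads a start tag.
        if name == "Button":
            self.labels.append(attrs.get("text"))

handler = ButtonCollector()
xml.sax.parseString(layout.encode("utf-8"), handler)
print(handler.labels)  # ['OK', 'Cancel']
```

No tree is ever stored: once the parser moves past a tag, only what the callbacks chose to keep (here, the labels list) remains.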
Addendum: JSON
JavaScript Object Notation
XML requires a lot of “syntactic noise”, which makes it difficult to distinguish the contents (data) from the notation (metadata tags).
JavaScript Object Notation (JSON) replaces tag words with curly braces …
https://izlooite.files.wordpress.com/2010/05/ad1.jpg
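To make the comparison concrete, the sketch below converts the earlier student record from XML to JSON. The mapping used (each child element becomes a key, its text becomes the value) is one simple convention assumed for this example, not a standard; Python's json and xml.etree.ElementTree modules are used for illustration.

```python
import json
import xml.etree.ElementTree as ET

xml_text = "<student><id>12345</id><name>Bill</name></student>"
root = ET.fromstring(xml_text)

# One key per child element; every closing tag disappears in the JSON form.
record = {child.tag: child.text for child in root}
json_text = json.dumps({root.tag: record})

print(xml_text)
print(json_text)  # {"student": {"id": "12345", "name": "Bill"}}
```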