Intro to XML - Washington and Lee University · 2019-05-15

Intro to XML

Borrowed, with author’s permission, from:

http://business.unr.edu/faculty/ekedahl/IS389/Topic3AndroidIntroduction/IS389AndroidBasics.aspx

Part 1: XML Basics

Why XML Here?

You need to understand the basics of XML to do much with Android

All of the layout and configuration files are XML files

Overview

XML is a very simple language

The entire specification is only about 35 printed pages

XML IS CASE SENSITIVE

The XML Document

At the heart of XML is the XML document

An XML document is a logical entity rather than a physical one

One logical document can be spread across several physical files, stored locally or on the Web, and assembled into a whole when it is processed

A well-formed subset of an XML document is called a document fragment

An XML document contains markup (tags) and data

Looks like HTML without predefined tags

Elements contain 0, 1 or many attributes

Elements (Introduction)

XML tags resemble HTML tags but they are not predefined

That is, you give names to tags just as you give names to variables and other identifiers

Starting tags appear as <tag>

Ending tags appear as </tag>

Empty elements (elements without data) can appear as <tag/>

Every other XML element MUST have an ending tag

A starting and ending tag (along with any text) makes up an element

Elements (Nesting)

Elements can be nested

These are hierarchical nodes

Elements MUST be nested correctly though

A child element must completely reside in its parent element

There must be exactly one root element
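The nesting rules above can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree module purely for illustration (the slides themselves target Java and Android); the helper name is_well_formed and the sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document.

    Hypothetical helper for this example, not part of any library.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = "<universe><answer>42</answer></universe>"
crossed = "<universe><answer>42</universe></answer>"  # child closes after its parent
two_roots = "<universe/><universe/>"                  # more than one root element

for label, text in [("good", good), ("crossed", crossed), ("two_roots", two_roots)]:
    print(label, is_well_formed(text))
```

Only the first string is accepted: the second breaks the rule that a child must reside completely inside its parent, and the third has two root elements.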

Elements (Nesting – Example)

<?xml version="1.0" encoding="utf-8" ?>
<!-- What is the answer? -->
<universe>
  <answer>
    42 <!-- It's 42. -->
  </answer>
</universe>

Root element is <universe>

Nested element is <answer>
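To see the root and nested element from a program's point of view, here is a minimal sketch that parses the same document with Python's standard xml.etree.ElementTree (chosen only for illustration; the variable names are assumptions). Note that this parser drops comments by default, so only the text 42 remains inside <answer>.

```python
import xml.etree.ElementTree as ET

# The document from the slide, passed as bytes so the encoding declaration is honoured.
DOC = b"""<?xml version="1.0" encoding="utf-8" ?>
<!-- What is the answer? -->
<universe>
  <answer>
    42 <!-- It's 42. -->
  </answer>
</universe>"""

root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]
answer_text = root.find("answer").text.strip()  # surrounding whitespace removed

print(root.tag)      # universe
print(child_tags)    # ['answer']
print(answer_text)   # 42
```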

XML Attributes (Introduction)

Elements can have zero or more attributes

Similar to HTML attributes, but with stricter syntax rules

Attributes appear after the starting tag name

One element cannot have two attributes of the same name

One element can have many attributes though

XML Attributes (Syntax 1)

Attributes appear as key=value pairs

An attribute must have both a key and a value

Attribute keys do not appear in quotation marks

Attribute values must appear in quotation marks

Single and double quotes are interchangeable, as long as the opening and closing quote match

Whitespace separates each key=value pair

Attributes can appear in starting and empty-element tags but cannot appear in an ending tag

For a given element, each attribute key must differ
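The attribute rules listed on this slide are enforced by any conforming parser. As a rough illustration, the sketch below feeds a few hand-written tags to Python's standard xml.etree.ElementTree; the helper parses and the sample tags are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if text is accepted as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


candidates = {
    "double quotes": '<universe id="1"/>',
    "single quotes": "<universe id='1'/>",
    "unquoted value": "<universe id=1/>",
    "key without value": "<universe id/>",
    "duplicate key": '<universe id="1" id="2"/>',
    "attribute in ending tag": '<universe></universe id="1">',
}
results = {label: parses(text) for label, text in candidates.items()}

for label, accepted in results.items():
    print(label, "->", "accepted" if accepted else "rejected")

# Both quoting styles produce exactly the same attribute.
same = (ET.fromstring('<universe id="1"/>').attrib
        == ET.fromstring("<universe id='1'/>").attrib)
```

Only the first two candidates are accepted; each of the others violates one of the rules above.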

XML Attributes (Example)

In the following example, id and size are attributes of the <universe> element

<?xml version="1.0" encoding="utf-8" ?>
<universe id="1" size="Infinite">
  <answer>42</answer>
</universe>
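A program usually reads attributes as a key-to-value mapping. The following sketch does this for the document above with Python's standard xml.etree.ElementTree (an illustrative choice, not the tooling used in the course); it also shows that attribute keys are case sensitive.

```python
import xml.etree.ElementTree as ET

DOC = b"""<?xml version="1.0" encoding="utf-8" ?>
<universe id="1" size="Infinite">
  <answer>42</answer>
</universe>"""

universe = ET.fromstring(DOC)
attributes = dict(universe.attrib)  # copy of the element's attributes

print(attributes)               # {'id': '1', 'size': 'Infinite'}
print(universe.get("size"))     # Infinite
print(universe.get("Size"))     # None, because "Size" and "size" are different keys
```

Attribute values always arrive as strings; converting "1" to a number is left to the application.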

Comparing Attributes and Elements

The following 2 documents are (roughly) equivalent:

<?xml version="1.0" encoding="utf-8" ?>
<student id="12345">
  <name>Bill</name>
</student>

<?xml version="1.0" encoding="utf-8" ?>
<student>
  <id>12345</id>
  <name>Bill</name>
</student>
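"Roughly equivalent" means the same data is present, but code has to look for it in a different place. A minimal sketch, assuming Python's standard xml.etree.ElementTree and a hypothetical helper student_id that accepts either form:

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = '<student id="12345"><name>Bill</name></student>'
AS_ELEMENT = "<student><id>12345</id><name>Bill</name></student>"


def student_id(student):
    """Return the id whether it is stored as an attribute or as a child element.

    Hypothetical helper written for this example.
    """
    if "id" in student.attrib:
        return student.attrib["id"]
    return student.findtext("id")  # None if there is no <id> child either


first = ET.fromstring(AS_ATTRIBUTE)
second = ET.fromstring(AS_ELEMENT)

print(student_id(first), first.findtext("name"))
print(student_id(second), second.findtext("name"))
```

Both calls yield the same id and name, which is the sense in which the two documents carry the same information.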

Part 2: XML Namespaces

Namespace Caveats

XML namespaces are the source of criticism and confusion

XML namespaces are the source of many myths

Java packages (Java's form of namespaces) and XML namespaces are not the same thing

In fact, they have nothing to do with each other

The Problem of Name Ambiguity

Names can be ambiguous

When we say “Las Vegas”, do we mean “Las Vegas, Nevada” or “Las Vegas, New Mexico”?

How many cities have the name “Springfield”?

When we say “address”, do we mean street address or Internet address?

Name ambiguity occurs when two XML documents have elements of the same name but with different meanings (context)

The Problem of Name Ambiguity (Example)

<?xml version="1.0" encoding="utf-8" ?>
<office>
  <address>123 Oak street</address>
  <address>121.216.39.3</address>
</office>

The meaning of <address> differs between the two contexts

The Purpose of XML Namespaces (1)

They eliminate name ambiguity in element names between documents

Namespaces answer the question “Are we talking about the same thing?”

The Namespaces in XML recommendation was published in January 1999, about 1 year after XML 1.0 itself (February 1998)

The Purpose of Namespaces (2)

According to the W3C namespace specification:

We envision applications of Extensible Markup Language (XML) where a single XML document may contain elements and attributes (here referred to as a "markup vocabulary") that are defined for and used by multiple software modules.

Implementation of XML Namespaces

XML namespaces use URIs (IRIs) to qualify element names

This allows us to use the same element name with a different URI thereby qualifying the element name

The URI is just a globally unique name

Formally, XML namespaces define a vocabulary or universe of names

The Implementation of Namespaces (1)

Namespace names are defined through ‘special’ attributes

Any attribute whose name starts with xmlns: is considered a prefix-defining attribute

The name following xmlns: is the prefix, the local abbreviation for the namespace

The prefix name itself has no significance, although there are common naming conventions

The attribute value is a unique URI
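As an illustration of these rules, the ambiguous <office> document from earlier can be disambiguated with two prefixes. In the sketch below the two namespace URIs (http://example.org/postal and http://example.org/network) and the prefixes post and net are invented for this example; Python's standard xml.etree.ElementTree is used only as a convenient parser.

```python
import xml.etree.ElementTree as ET

POSTAL = "http://example.org/postal"    # hypothetical namespace URI
NETWORK = "http://example.org/network"  # hypothetical namespace URI

DOC = f"""
<office xmlns:post="{POSTAL}" xmlns:net="{NETWORK}">
  <post:address>123 Oak street</post:address>
  <net:address>121.216.39.3</net:address>
</office>
"""

root = ET.fromstring(DOC)

# The parser replaces each prefix with its URI: the qualified name is {URI}local-name.
qualified_tags = [child.tag for child in root]
print(qualified_tags)

# The searching code may pick different prefixes (p, n); only the URIs have to match.
prefixes = {"p": POSTAL, "n": NETWORK}
street = root.find("p:address", prefixes).text
ip = root.find("n:address", prefixes).text
print(street, "|", ip)
```

The lookup succeeds with prefixes p and n even though the document uses post and net, which shows that the prefix is only a local abbreviation.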

XML Namespace Example (Android)

The prefix android is given to http://schemas.android.com/apk/res/android

The prefix tools is given to http://schemas.android.com/tools
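A sketch of how these two prefixes look to a parser. The layout fragment below is a made-up, simplified example (the attribute values and the .MainActivity name are assumptions, not taken from the slides); only the two namespace URIs come from the slide. Python's standard xml.etree.ElementTree is used for illustration.

```python
import xml.etree.ElementTree as ET

ANDROID = "http://schemas.android.com/apk/res/android"
TOOLS = "http://schemas.android.com/tools"

# Hypothetical layout fragment, shortened for the example.
LAYOUT = f"""<LinearLayout xmlns:android="{ANDROID}" xmlns:tools="{TOOLS}"
    android:layout_width="match_parent"
    android:orientation="vertical"
    tools:context=".MainActivity">
  <TextView android:text="Hello" />
</LinearLayout>"""

root = ET.fromstring(LAYOUT)

# Prefixed attributes are stored under {namespace-URI}local-name.
width = root.get(f"{{{ANDROID}}}layout_width")
context = root.get(f"{{{TOOLS}}}context")
text = root.find("TextView").get(f"{{{ANDROID}}}text")

print(width, context, text)
print(root.get("layout_width"))  # None: the unqualified name is a different attribute
```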

Namespaces (Quotable Quotes)

What Does a Namespace URL Locate?

There is nothing at all at the end of a namespace URI, except perhaps a 404 Not Found error

Namespaces (Myths)

IT’S NOT NECESSARILY A WEB SITE

IT’S NOT A REFERENCE TO A PROGRAM ON THE WEB

IT’S NOT A POINTER TO A RESOURCE

IT’S JUST A NAME

Applications using XML namespaces

XML namespaces are widely used in

The Schema Definition Language (XSDL)

The Extensible Stylesheet Language (XSL)

The namespace can represent a vocabulary

Characteristics of Namespace URIs (Example)

The following URIs ARE different:

http://www.example.org/foo

http://www.Example.org/foo

http://www.example.org/Foo

And so are the following:

http://www.example.org/~foo

http://www.example.org/%7efoo
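These URIs differ because namespace names are compared as plain strings, character by character, with no URL normalization. A small sketch of that comparison, using Python's standard xml.etree.ElementTree (the element name item and the prefix x are assumptions made for this example):

```python
import xml.etree.ElementTree as ET

uris = [
    "http://www.example.org/foo",
    "http://www.Example.org/foo",
    "http://www.example.org/Foo",
    "http://www.example.org/~foo",
    "http://www.example.org/%7efoo",
]

# Plain string comparison: every entry is a distinct namespace name.
distinct_count = len(set(uris))
print(distinct_count)  # 5

# A parser behaves the same way: the element belongs to the first namespace only.
element = ET.fromstring(f'<x:item xmlns:x="{uris[0]}"/>')
matches = [element.tag == f"{{{uri}}}item" for uri in uris]
print(matches)  # [True, False, False, False, False]
```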

Part 3: XML Parsing

Parsing: Natural Language

Sentences aren’t just sequences of words: we need to know how words group together to form meaningful units.

E.g., in The book for this course costs a fortune, we need to represent the following facts:

The book for this course serves to identify a single book (as would my book, Moby Dick, etc.)

The verb costs has to agree with the noun book in number; in other languages, they would have to agree in gender or other properties.

Parsing: Natural Language

S
  NP
    NP
      Det  The
      N    book
    PP
      P    for
      NP
        Det  this
        N    course
  VP
    V    costs
    NP
      Det  a
      N    fortune

Parsing: XML

LinearLayout
  TextView
  LinearLayout
    Button
    Button
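A parser produces exactly this kind of tree from the nested tags. The sketch below assumes a bare layout with the same shape (attributes omitted) and prints its element hierarchy; it uses Python's standard xml.etree.ElementTree for illustration, and the function name outline is made up for this example.

```python
import xml.etree.ElementTree as ET

# Assumed layout with the same nesting as the tree on the slide.
LAYOUT = """<LinearLayout>
  <TextView/>
  <LinearLayout>
    <Button/>
    <Button/>
  </LinearLayout>
</LinearLayout>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(LAYOUT))
print("\n".join(tree_lines))
```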

Parsing: Programming languages

Natural languages give us some cues (intonation, punctuation) for parsing, but in general it is a very difficult computational problem (O(N^3)), and an active research area.

Natural language words also don’t come with explicit tags (N, V, P): Part-Of-Speech (POS) tagging algorithms (Brill 1995) are the typical solution.

Formal languages such as programming languages and XML are designed to make parsing easier, but we still want to use an existing package, like javax.xml.parsers, rather than write a parser ourselves.

XML Parsing: DOM vs. SAX

Document Object Model (DOM): Builds a tree representation of the entire document at once; you then traverse the tree to find relevant tags. (This is the one I use!)

http://stackoverflow.com/questions/5059224/which-is-the-best-library-for-xml-parsing-in-java
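A minimal DOM sketch, written with Python's standard xml.dom.minidom rather than the Java javax.xml.parsers package the slides refer to; the DOM method names used here (getElementsByTagName, getAttribute) are part of the language-neutral DOM interface. The layout and its text attributes are assumptions made for this example.

```python
from xml.dom import minidom

# Assumed layout; the text attributes are invented for the example.
LAYOUT = """<LinearLayout>
  <TextView text="Title"/>
  <LinearLayout>
    <Button text="OK"/>
    <Button text="Cancel"/>
  </LinearLayout>
</LinearLayout>"""

# Step 1: the whole document is parsed into an in-memory tree.
document = minidom.parseString(LAYOUT)

# Step 2: the finished tree is searched for the tags of interest.
buttons = document.getElementsByTagName("Button")
labels = [button.getAttribute("text") for button in buttons]

print(document.documentElement.tagName)  # LinearLayout
print(labels)                            # ['OK', 'Cancel']
```

Because the full tree stays in memory, it can be traversed repeatedly and in any order, at the cost of memory proportional to the document size.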

XML Parsing: DOM vs. SAX

Simple API for XML (SAX): Doesn’t build a big tree; instead, it lets you write callback methods (like listeners) that are called automatically as the parser reads through the document.

http://stackoverflow.com/questions/5059224/which-is-the-best-library-for-xml-parsing-in-java
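The callback style can be sketched with Python's standard xml.sax module (again an illustrative stand-in for the Java API). The handler class ButtonCollector and the layout with its text attributes are assumptions made for this example.

```python
import xml.sax

# Assumed layout; the text attributes are invented for the example.
LAYOUT = """<LinearLayout>
  <TextView text="Title"/>
  <LinearLayout>
    <Button text="OK"/>
    <Button text="Cancel"/>
  </LinearLayout>
</LinearLayout>"""


class ButtonCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records the text attribute of every Button."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def startElement(self, name, attrs):
        # Called by the parser each time it reads a starting tag.
        if name == "Button":
            self.labels.append(attrs.get("text"))


handler = ButtonCollector()
xml.sax.parseString(LAYOUT.encode("utf-8"), handler)  # no tree is built
print(handler.labels)  # ['OK', 'Cancel']
```

Nothing is kept in memory except what the handler chooses to store, which is why this style suits very large documents.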

Addendum: JSON

JavaScript Object Notation

XML requires a lot of “syntactic noise”, which makes it difficult to distinguish the contents (data) from the notation (metadata tags).

JavaScript Object Notation (JSON) replaces tag words with curly braces …

https://izlooite.files.wordpress.com/2010/05/ad1.jpg
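For a rough side-by-side comparison, the sketch below converts the small student document from Part 1 into JSON using Python's standard json and xml.etree.ElementTree modules. The flat tag-to-text mapping is a simplification chosen for this example; it would not handle attributes, repeated tags or deeper nesting.

```python
import json
import xml.etree.ElementTree as ET

XML_DOC = "<student><id>12345</id><name>Bill</name></student>"

# Simplified mapping: each child tag becomes a key, its text becomes the value.
student = {child.tag: child.text for child in ET.fromstring(XML_DOC)}
JSON_DOC = json.dumps(student)

print(XML_DOC)
print(JSON_DOC)  # {"id": "12345", "name": "Bill"}
```

Each name appears once in the JSON text, whereas the XML form repeats it in the starting and ending tag.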