August 2006 1 Chapter 2 - Markup and Core Concepts Learning XML by Erik T. Ray Slides were developed by Jack Davis College of Information Science and Technology Radford University


Chapter 2 - Markup and Core Concepts Learning XML

by Erik T. Ray

Slides were developed by Jack Davis

College of Information Science and Technology

Radford University


XML Syntax

• “Syntax” refers to the rules of a language

• Syntax is needed with any language so that the documents created with that language are consistent

• Programs that process documents expect the syntax rules to be followed; otherwise, the document may not be interpreted correctly
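
As a rough illustration of how a processing program reacts when the syntax rules are broken, the sketch below feeds a small document with a mismatched end tag to Python's standard `xml.etree.ElementTree` parser. The document text and the `try_parse` helper are made up for this example and are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <to> is closed with </from>, which breaks the syntax rules.
BROKEN = "<note>\n  <to>Ann</from>\n</note>"


def try_parse(text):
    """Return (root, None) on success, or (None, (line, column)) on a syntax error."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        return None, err.position


root, where = try_parse(BROKEN)
if where is not None:
    print(f"Parser stopped at line {where[0]}, column {where[1]}")
```

The parser does not try to guess what the author meant; it reports where the rules were violated and returns no document at all.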


Components of an XML Document

• XML Declaration

• Elements

• Attributes

• Entities

• Comments


Components: The XML Declaration

• The XML declaration:

– Tells the processing program that the document is an XML document, along with other optional information

– If present, the declaration must be the very first thing in the document; nothing, not even whitespace, may precede it

– Attributes that can be used in the declaration:

• version

• encoding

• standalone

– Example: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
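
The effect of the declaration can be sketched with Python's standard `xml.etree.ElementTree` module. The sample document, the ISO-8859-1 encoding choice, and the `declaration_is_accepted` helper are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# The declaration names the encoding, so the parser knows how to decode the bytes that follow.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>'
    "<name>José</name>"
).encode("iso-8859-1")

name = ET.fromstring(latin1_doc)
print(name.text)


def declaration_is_accepted(data):
    """Return True if the parser accepts the document, False if it raises a syntax error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Same declaration, once at the very start and once after a single leading space.
at_start = declaration_is_accepted(b'<?xml version="1.0"?><name/>')
after_space = declaration_is_accepted(b' <?xml version="1.0"?><name/>')
print(at_start, after_space)
```

The second call fails because the declaration is only recognized when nothing precedes it.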


Document Type Declaration

• A document type declaration serves two purposes. First, it defines entities or default attribute values. Second, it supports validation, a special mode of parsing that checks the grammar and vocabulary of the markup. A validating parser must read the list of element declarations before it can begin to parse. In both cases, the declarations are supplied in the document type declaration section.

• A document type declaration consists of:

– Delimiter: <!DOCTYPE

– Element name: identifies the document (root) element

– DTD identifier: local path or URL of the DTD

– Entity declarations: optional list of entity declarations

• The DTD identifier supports two methods of identification: system-specific and public

<!DOCTYPE doc SYSTEM "/usr/simple.dtd">

<!DOCTYPE html PUBLIC "-//w3c//DTD HTML 3.2//EN" "http://www.w3.org/TR … >
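
To see the parts of a document type declaration from a program's point of view, the following sketch reads the system-identifier example above with Python's standard `xml.dom.minidom`. The empty `<doc/>` body is an assumption added so that the snippet is a complete document; the DTD file itself is never opened by this code.

```python
from xml.dom import minidom

# DOCTYPE line taken from the slide, followed by a minimal (assumed) document element.
doc = minidom.parseString('<!DOCTYPE doc SYSTEM "/usr/simple.dtd"><doc/>')

doctype = doc.doctype
print(doctype.name)      # element name given after <!DOCTYPE
print(doctype.systemId)  # system identifier: local path or URL of the DTD
print(doctype.publicId)  # None here, because no PUBLIC identifier was given
```

Note that the name in the declaration and the name of the document element match; a validating parser would require this.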


Components: XML Elements

• Elements:

– Used to describe the data. Consist of:

• A start tag

• Content

• An end tag

– Example: <element>Content</element>

– The “root” element of a document is the outermost element, and contains all of the other elements in the document. There can be only one root element in a single document

• An element that does not contain any content is known as an “empty element”
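
A minimal sketch of how these pieces look to a program, using Python's standard `xml.etree.ElementTree`. The element names `order`, `item` and `gift_wrap` are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# <order> is the single root element; it contains every other element.
root = ET.fromstring("<order><item>Widget</item><gift_wrap/></order>")

item = root.find("item")
gift_wrap = root.find("gift_wrap")

print(root.tag)        # name of the root element
print(item.text)       # content between the <item> start tag and </item> end tag
print(gift_wrap.text)  # None: an empty element has no content
```

The empty element is written here with the shorthand `<gift_wrap/>`, which is equivalent to `<gift_wrap></gift_wrap>`.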


Element Nesting

• The term “nesting” refers to the process of containing elements within other elements

• Terminology:

– Child elements – elements that are contained within other elements

– Parent elements – elements that contain other elements

– Sibling elements – elements that share the same parent element


Nesting Example

<family_tree>
  <mother>Sally</mother>
  <father>Joe</father>
  <children>
    <child>Larry</child>
    <child>Curly</child>
    <child>Mo</child>
  </children>
</family_tree>
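
The parent, child and sibling relationships in this example can be walked programmatically. The sketch below uses Python's standard `xml.etree.ElementTree`; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

FAMILY = """
<family_tree>
  <mother>Sally</mother>
  <father>Joe</father>
  <children>
    <child>Larry</child>
    <child>Curly</child>
    <child>Mo</child>
  </children>
</family_tree>
"""

root = ET.fromstring(FAMILY)

# <family_tree> is the parent of mother, father and children,
# so those three elements are siblings of one another.
top_level = [child.tag for child in root]

# <children> is in turn the parent of the three <child> elements.
children = root.find("children")
names = [child.text for child in children]

print(top_level)
print(names)
```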


Components: XML Attributes

• Attributes help to describe XML elements

• Attributes are always contained in the start tag of the element they are describing

• Attributes are known as “name-value pairs”

• Example: address="123 Main Street"
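
As a small sketch of attributes as name-value pairs, the code below places the example attribute in the start tag of a hypothetical `customer` element (the element name, the `id` attribute and the content are assumptions) and reads it back with Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Attributes live in the start tag of the element they describe.
customer = ET.fromstring('<customer address="123 Main Street" id="c42">Pat</customer>')

print(customer.attrib)          # all name-value pairs, as a dictionary
print(customer.get("address"))  # value looked up by attribute name
print(customer.get("phone"))    # None: this attribute is not present
```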


Components: XML Entities

• Two types of entities:

– General – placeholders for information contained in the XML document

– Parameter – used within a DTD to reference a grouping of elements

• Three types of general entities:

– Character – used in place of special characters

– Content – used for blocks of frequently used text

– Unparsed – used for binary or non-text data, like image files


Examples of Entities

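• Character entity:

– Character: >

– Entity reference: &gt; or &#62;

– Usage: <formula> x &gt; y </formula>

• Content entity:

– Declaration: <!ENTITY address "123 Main St">

– Usage: <ship_address> &address; </ship_address>

• Unparsed entity:

– Declaration: <!ENTITY image SYSTEM "sunset.gif" NDATA GIF>

– Usage: <picture src="image"/>, where src is, for example, an attribute declared in the DTD with type ENTITY. An unparsed entity is named in such an attribute rather than referenced as &image; in element content, and the GIF notation must also be declared in the DTD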
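
The character and content entity examples can be checked with a short sketch in Python's standard `xml.etree.ElementTree`. The wrapping `order` element and the internal DTD subset around the `address` declaration are assumptions added to make a complete document.

```python
import xml.etree.ElementTree as ET

ORDER = """<!DOCTYPE order [
  <!ENTITY address "123 Main St">
]>
<order>
  <formula>x &gt; y</formula>
  <formula>x &#62; y</formula>
  <ship_address>&address;</ship_address>
</order>"""

order = ET.fromstring(ORDER)

# Both the named reference and the numeric reference stand for the character ">".
formulas = [formula.text for formula in order.findall("formula")]

# The content entity is replaced by the text given in its declaration.
ship_to = order.find("ship_address").text

print(formulas)
print(ship_to)
```

By the time the program sees the data, the references are gone and only the replacement text remains.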


Components: Comments

• An XML comment is ignored by applications that process XML

• Comments are commonly used for documentation, or to add information for others viewing the document

• The content of the comment is surrounded by special comment tags: <!-- and -->

• Example: <!-- This is a comment -->
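
A brief sketch of a processing program ignoring a comment, using Python's standard `xml.etree.ElementTree` with its default settings. The `note` element and its text are made up for the example.

```python
import xml.etree.ElementTree as ET

note = ET.fromstring("<note><!-- This is a comment -->Call Joe</note>")

print(note.text)  # only the real content is reported
print(len(note))  # 0: the comment did not become a child of <note>
```

Other tools, such as editors or DOM-based libraries, may choose to keep comments; the point is only that an application is free to skip them.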


Well-Formed XML Documents

• A “well-formed” document is one which adheres to the syntax rules for XML:

– An XML document contains one root element

– All elements must have start and end tags, except for empty elements

– Elements must be properly nested

– All attributes must have a value, and the value must be quoted

– Attributes can only appear in the start tag and must be unique to that element

– Element names are case-sensitive

– Special characters such as < and & must be written as entity references (&lt; and &amp;)

– Element names can start only with a letter or an underscore, and can contain letters, digits, hyphens, periods and underscores
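
The rules above can be tried out one at a time. The following sketch runs a handful of made-up snippets through Python's standard `xml.etree.ElementTree` parser; the `is_well_formed` helper and all sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it reports a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


SAMPLES = {
    "one root, properly nested": "<a><b>ok</b></a>",
    "two root elements": "<a/><b/>",
    "missing end tag": "<a><b>text</a>",
    "improper nesting": "<a><b><c></b></c></a>",
    "attribute without a value": "<a checked></a>",
    "duplicate attribute": '<a id="1" id="2"/>',
    "case mismatch in tags": "<Item></item>",
    "unescaped special character": "<a>Q & A</a>",
    "escaped special character": "<a>Q &amp; A</a>",
    "name starting with a digit": "<1st/>",
}

results = {label: is_well_formed(xml) for label, xml in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample and the escaped-character sample are accepted; each of the others breaks exactly one rule from the list.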


XML Parsers

• A “parser” is a program that checks the syntax of an XML document to ensure that the document is well-formed

• Two types of parsers:

– Non-validating – checks only that the document is well-formed

– Validating – checks syntax and verifies the document against a DTD or Schema
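
The difference between the two kinds of parser can be sketched from the non-validating side. Python's standard `xml.etree.ElementTree` is built on Expat, a non-validating parser, so it accepts a document that is well-formed even when it contradicts its own DTD. The sample document below is an assumption made for this example; a validating parser (not available in the Python standard library, so not shown) would be expected to reject it.

```python
import xml.etree.ElementTree as ET

# The internal DTD says <doc> must contain exactly one <title>,
# but the document body contains a <para> instead.
INVALID_BUT_WELL_FORMED = """<!DOCTYPE doc [
  <!ELEMENT doc (title)>
  <!ELEMENT title (#PCDATA)>
]>
<doc><para>No title here</para></doc>"""

doc = ET.fromstring(INVALID_BUT_WELL_FORMED)
child_tags = [child.tag for child in doc]
print(child_tags)
```

The parse succeeds because only the syntax was checked; the element rules in the DTD were read but not enforced.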