XML and the Semi-Structured Data Model
Motivation
• We have seen that relational databases are very convenient to query. However:
  – There is a LOT of data that is not in relational databases!
• Perhaps the most widely accessed database is the web, and it certainly isn’t a relational database.
Documents Vs. Databases
| Documents | Databases |
|---|---|
| Paragraphs, sentences | Tables, tuples |
| Easy for people to understand | Easy for computers to understand |
| Static | Dynamic |
Querying the Web
• The web can be queried using a search engine. However, we can’t ask questions like:
  – What is the weather in Zanzibar today?
  – What is the lowest price for which a Jaguar is sold on the web?
• Problems:
  – There are no facilities for asking complex questions, such as aggregation of data
  – Words have overloaded meanings (Jaguar)
Understanding the Web
• In order to query the web, we must be able to understand it.
• Two computer science approaches:
  – Artificial Intelligence Approach
  – Database Approach
Artificial Intelligence Approach
“The web is unstructured and we must deal with it”
• Use machine learning techniques to understand the web.
• Example: To understand the word “Jaguar”, check whether it appears on a page with the words “car” or “automobile”, or rather with “jungle” and “South America”
• Problem: Such techniques tend to be inexact and make a large percentage of mistakes
Database Approach
“The web is unstructured and we will structure it”
• Sometimes problems that are very difficult can be solved easily by enforcing a standard
• Encourage the use of XML as a standard for data exchange on the web
Example XML Document
<?xml version="1.0"?>
<transaction>
  <account>89-344</account>
  <buy shares = "100">
    <ticker exch = "NASDAQ">WEBM</ticker>
  </buy>
  <sell shares = "30">
    <ticker exch = "NYSE">GE</ticker>
  </sell>
</transaction>
Parts labeled on the slide: Opening Tag, Attribute Name, Attribute Value, Element, Closing Tag
XML Representation of a Table
<?xml version="1.0"?>
<ROWSET>
  <ROW num = "1">
    <ENAME>KING</ENAME>
    <SAL>5000</SAL>
  </ROW>
  <ROW num = "2">
    <ENAME>SCOTT</ENAME>
    <SAL>3000</SAL>
  </ROW>
</ROWSET>
The table being represented:
| ENAME | SAL |
|---|---|
| KING | 5000 |
| SCOTT | 3000 |
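The mapping between the ROWSET document and the table can be sketched in a few lines of Python with the standard library's xml.etree.ElementTree. The helper name rowset_to_rows is made up for this illustration and is not part of any XML tool.

```python
import xml.etree.ElementTree as ET

ROWSET_XML = """<?xml version="1.0"?>
<ROWSET>
  <ROW num="1">
    <ENAME>KING</ENAME>
    <SAL>5000</SAL>
  </ROW>
  <ROW num="2">
    <ENAME>SCOTT</ENAME>
    <SAL>3000</SAL>
  </ROW>
</ROWSET>"""


def rowset_to_rows(xml_text):
    """Turn each ROW element into a dict mapping column name to text."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.findall("ROW"):
        # Each child of ROW is one column; its tag is the column name.
        rows.append({col.tag: (col.text or "").strip() for col in row})
    return rows


rows = rowset_to_rows(ROWSET_XML)
for r in rows:
    print(r["ENAME"], r["SAL"])
```

Every value comes back as text; the XML itself carries no column types, which is one difference from the relational table.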
Very Unstructured XML
<?xml version="1.0"?>
<DamageReport>
  The insured’s <Vehicle Make = "Volks"> Beetle </Vehicle> broke through the guard rail and plummeted into the ravine. The cause was determined to be <Cause>faulty brakes </Cause>. Amazingly there were no casualties.
</DamageReport>
XML Vs. HTML
• XML and HTML are brothers. Both derive from SGML: HTML was defined as an application of SGML, and XML as a simplified subset of it.
• HTML has specific tag and attribute names. These are associated with a specific meaning
• XML can have any tag and attribute name. These are not associated with any meaning
• HTML is used to specify visual style
• XML is used to specify meaning
Rules for Creating XML Documents
Rule 1 – XML Declaration
• An XML document should begin with an XML declaration. A simple declaration is:
<?xml version="1.0"?>
• Other things can be specified in the declaration, such as the character encoding.
Rule 2 – Document Element
• Use exactly one top-level document element.
Legal example:
<?xml version="1.0"?>
<Question> This is legal </Question>
Illegal example (two top-level elements):
<?xml version="1.0"?>
<Question> Is this legal? </Question>
<Answer> No. </Answer>
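One way to try Rule 2 out is to hand both documents to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree; is_well_formed is a helper name chosen for this illustration.

```python
import xml.etree.ElementTree as ET

LEGAL = '<?xml version="1.0"?>\n<Question> This is legal </Question>'
ILLEGAL = ('<?xml version="1.0"?>\n'
           '<Question> Is this legal? </Question>\n'
           '<Answer> No. </Answer>')


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as a single XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(LEGAL))    # one document element
print(is_well_formed(ILLEGAL))  # a second top-level element follows the first
```

The parser stops at the second top-level element, so the second document is rejected as a whole rather than read as two documents.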
Rule 3 – Match Opening and Closing Tags
• Every opening tag must have a matching closing tag, and elements must be properly nested.
• XML is case sensitive, so the names in the opening and closing tags must match exactly.
Illegal example (closing tag differs in case):
<Question> Is this legal? </QUESTION>
Illegal example (tags are not properly nested):
<Question> <B> Is this legal? </Question> </B>
16
Rule 4 – Comments
• Comments are between <!-- and --> characters. Comments can’t appear as attribute values or within a tag.
Example:<!-- This is a legal comment -->
<Question <!-- This is illegal -->>
Why is this illegal
<!-- This is a legal comment -->
</Question>
Rule 5 – Element Names
• Element and attribute names must be unbroken sequences (no spaces) of letters, digits, hyphens, underscores or periods, and must start with a letter or an underscore.
Legal names:
<_legal> <This-is-OK>
Illegal names:
<2-Part-Question> <Two Part Question>
<Question 4You = "Yes">
Rule 6 – Attribute Values
• Attribute values:
  – go in opening tags
  – must be enclosed in matching quotes (' or ")
  – must contain only text, not tags
Legal examples:
<Question Poster = "Yitzchak">Do you like XML? </Question>
<Answer Poster = 'Yaakov'>I do.</Answer>
Rule 6 – Continued
Illegal examples:
Mismatched quotes:
<Question Poster = "Yitzchak'>Do you like XML? </Question>
Attribute in a closing tag:
<Question>Do you like XML? </Question Poster = "Yitzchak">
Tags inside an attribute value:
<Question Poster = "<first>Yitzchak</first>">Do you like XML? </Question>
Rule 7 – Empty Elements
• Empty elements are elements that do not contain text or nested elements. They can be written in a compact syntax:
<Person First = "Shmuel" Last = "Levy"></Person>
is the same as
<Person First = "Shmuel" Last = "Levy" />
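Rules 3 to 7 can be checked the same way, by feeding small snippets to a parser and seeing which ones it accepts. This is only a sketch using Python's standard xml.etree.ElementTree; the snippets are shortened versions of the slide examples and the labels are ours.

```python
import xml.etree.ElementTree as ET

SNIPPETS = {
    "case mismatch (Rule 3)": "<Question> Is this legal? </QUESTION>",
    "bad nesting (Rule 3)": "<Question> <B> Is this legal? </Question> </B>",
    "comment inside a tag (Rule 4)": "<Question <!-- illegal -->>text</Question>",
    "name starts with a digit (Rule 5)": "<2-Part-Question/>",
    "mismatched quotes (Rule 6)": "<Question Poster=\"Yitzchak'>Do you like XML? </Question>",
    "tag in attribute value (Rule 6)": '<Question Poster="<first>Yitzchak</first>">Hi</Question>',
    "legal attribute (Rule 6)": "<Answer Poster='Yaakov'>I do.</Answer>",
}


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


results = {label: is_well_formed(text) for label, text in SNIPPETS.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")

# Rule 7: the long and the compact form parse to the same element.
long_form = ET.fromstring('<Person First="Shmuel" Last="Levy"></Person>')
compact = ET.fromstring('<Person First="Shmuel" Last="Levy" />')
same = (long_form.tag, long_form.attrib, long_form.text, list(long_form)) == \
       (compact.tag, compact.attrib, compact.text, list(compact))
print("empty element forms are equivalent:", same)
```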
Abstract View of XML
A Different Data Model
| | Relational | Semi-Structured |
|---|---|---|
| Abstract Model | Sets of tuples | Labeled Directed Graph |
| Concrete Model | Tables | XML Documents |
| Standard for | Storing Data | Data Exchange |
An Example
<?xml version="1.0"?>
<transaction>
  <account>89-344</account>
  <buy shares = "100">
    <ticker exch = "NASDAQ">WEBM</ticker>
  </buy>
  <sell shares = "30">
    <ticker exch = "NYSE">GE</ticker>
  </sell>
</transaction>
Corresponding Tree
transaction
  account
    89-344
  buy
    shares
      100
    ticker
      exch
        NASDAQ
      WEBM
  sell
    shares
      30
    ticker
      exch
        NYSE
      GE
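The tree can also be produced mechanically from the document. The following Python sketch walks the parsed transaction document and lists each element, attribute and text value as an indented line; tree_lines is a helper name made up for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<transaction>
  <account>89-344</account>
  <buy shares="100">
    <ticker exch="NASDAQ">WEBM</ticker>
  </buy>
  <sell shares="30">
    <ticker exch="NYSE">GE</ticker>
  </sell>
</transaction>"""


def tree_lines(elem, depth=0):
    """Return the element, its attributes, text and children as indented lines."""
    pad = "  " * depth
    lines = [pad + elem.tag]
    for name, value in elem.attrib.items():
        # An attribute becomes an edge labeled with its name, leading to its value.
        lines.append(pad + "  " + name)
        lines.append(pad + "    " + value)
    text = (elem.text or "").strip()
    if text:
        lines.append(pad + "  " + text)
    for child in elem:
        lines.extend(tree_lines(child, depth + 1))
    return lines


lines = tree_lines(ET.fromstring(DOC))
print("\n".join(lines))
```

Elements and attributes are both treated as labeled edges here, which is the view the semi-structured model takes; a different sketch could just as well mark attributes separately.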
Using XML
• Querying XML: There are query languages that query XML and return XML. Examples: XQuery, XPath, SQL4X
• Displaying XML: An XML document can have an associated style sheet, which specifies how the document should be displayed or translated to HTML. Examples: CSS, XSL
Namespaces
• Namespaces are used to attach an accepted meaning to a set of tags.
• Syntax for defining a namespace:
<SomeElement xmlns:prefixname="namespaceURL">
• The namespace will be recognized within the SomeElement element.
Example Namespace
<irs:Form id="1040" xmlns:irs="http://www.irs.gov">
  <irs:Name>Tina Wells</irs:Name>
  <PhoneNumber>03-5655666</PhoneNumber>
</irs:Form>
• In order for the namespace to be recognized in all elements, the declaration should be in the document element
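To see what a parser makes of the prefix, here is a small Python sketch using the standard xml.etree.ElementTree. The document is the form example with its closing tag written as </irs:Form>; the variable names are ours.

```python
import xml.etree.ElementTree as ET

FORM = """<irs:Form id="1040" xmlns:irs="http://www.irs.gov">
  <irs:Name>Tina Wells</irs:Name>
  <PhoneNumber>03-5655666</PhoneNumber>
</irs:Form>"""

root = ET.fromstring(FORM)

# ElementTree expands each prefix to "{namespace-URI}local-name".
tags = [root.tag] + [child.tag for child in root]
print(tags)

# A prefix-to-URI map lets us search using the short prefixed form.
ns = {"irs": "http://www.irs.gov"}
name = root.find("irs:Name", ns).text
print(name)
```

Note that PhoneNumber carries no prefix, so it stays outside the irs namespace even though it sits inside the irs:Form element.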
XSQL Pages
What are XSQL Pages?
• XSQL pages are XML documents that have SQL queries embedded in them.
• When a user requests to view an XSQL page, the web server:
  1. Dynamically computes the embedded queries
  2. Translates the query results into XML
  3. Inserts the results in the proper places in the document
  4. Transforms the result to HTML if a stylesheet is given
A Simple Example
<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql">
  SELECT sname
  FROM Sailors
</xsql:query>
• You should specify the connection and the namespace on the document element
Page Seen in Browser
<?xml version="1.0"?>
<ROWSET>
  <ROW num = "1">
    <SNAME>Rusty</SNAME>
  </ROW>
  <ROW num = "2">
    <SNAME>Justin</SNAME>
  </ROW>
</ROWSET>
• A ROWSET element encloses the query result
• Each ROW element encloses one row of the result
• Each column value in a row is enclosed in a tag named after its column
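The ROWSET/ROW layout is easy to imitate, which may help in seeing what the XSQL processor does with a query result. The sketch below is not Oracle's implementation: it uses Python's sqlite3 as a stand-in database, made-up sample rows, and a helper named query_to_rowset invented for this illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_rowset(conn, sql):
    """Run a query and wrap its rows in ROWSET/ROW elements."""
    cur = conn.execute(sql)
    columns = [d[0].upper() for d in cur.description]
    rowset = ET.Element("ROWSET")
    for num, row in enumerate(cur, start=1):
        row_elem = ET.SubElement(rowset, "ROW", num=str(num))
        for col, value in zip(columns, row):
            # One child element per column, named after the column.
            ET.SubElement(row_elem, col).text = str(value)
    return rowset


# Stand-in data; the sid values are assumptions for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?)",
                 [(13, "Rusty"), (14, "Justin")])

rowset = query_to_rowset(conn, "SELECT sname FROM Sailors ORDER BY sid")
xml_text = ET.tostring(rowset, encoding="unicode")
print(xml_text)
```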
Another Example
<?xml version="1.0"?>
<RESULTS connection="scott" xmlns:xsql="urn:oracle-xsql">
  Here is something interesting:
  <xsql:query>
    SELECT sname, age + rating as ra
    FROM Sailors
    WHERE sid = 13
  </xsql:query>
</RESULTS>
Resulting Document
<?xml version="1.0"?>
<RESULTS>
  Here is something interesting:
  <ROWSET>
    <ROW num = "1">
      <SNAME>Rusty</SNAME>
      <RA>55</RA>
    </ROW>
  </ROWSET>
</RESULTS>
Using Parameters
• Your page can use parameters. The value of a parameter param is determined in the following fashion:
  1. The value of the URL parameter param, if supplied
  2. The value of the HTTP session object param, if supplied
  3. The value of the closest ancestor’s attribute named param, if present
  4. An empty string
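The lookup order above can be written down as a short function. This is a sketch of the rule as stated on the slide, not the XSQL processor's code; resolve_param and its arguments are names we made up, and we assume the enclosing elements are given closest first.

```python
def resolve_param(name, url_params, session, ancestor_attrs):
    """Return a parameter's value using the four-step lookup order.

    ancestor_attrs is a list of attribute dicts for the enclosing
    elements, closest first (an assumption of this sketch).
    """
    if name in url_params:            # 1. URL parameter
        return url_params[name]
    if name in session:               # 2. HTTP session object
        return session[name]
    for attrs in ancestor_attrs:      # 3. closest ancestor attribute
        if name in attrs:
            return attrs[name]
    return ""                         # 4. empty string


attrs = [{"sname": "Joe"}]
print(resolve_param("sname", {"sname": "Jim"}, {}, attrs))
print(resolve_param("sname", {}, {}, attrs))
print(resolve_param("age", {}, {}, attrs))
```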
Example with Parameters
<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
            sname = "Joe">
  SELECT *
  FROM Sailors
  WHERE sname = '{@sname}'
</xsql:query>
Evaluating the Query
• Suppose the XSQL document is at:
http://cs.huji.ac.il/~db/query1.xsql
• Then, requesting the URL:
http://cs.huji.ac.il/~db/query1.xsql?sname=Jim
will return all the details of Jim.
• Requesting
http://cs.huji.ac.il/~db/query1.xsql
will return all the details of Joe (the default value)
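The two requests can be traced with a small Python sketch that pulls the parameters out of the URL and substitutes them into the query text. It only models the textual {@name} replacement described in the slides; url_params and substitute are hypothetical helper names, and the session step of the lookup is left out for brevity.

```python
import re
from urllib.parse import urlparse, parse_qs

QUERY = "SELECT * FROM Sailors WHERE sname = '{@sname}'"
DEFAULTS = {"sname": "Joe"}   # from the sname = "Joe" attribute on xsql:query


def url_params(url):
    """Return the URL's query-string parameters as a flat dict."""
    return {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}


def substitute(template, params, defaults):
    """Replace each {@name} with the URL value, else the default, else ''."""
    def lookup(match):
        name = match.group(1)
        return params.get(name, defaults.get(name, ""))
    return re.sub(r"\{@([\w-]+)\}", lookup, template)


with_jim = substitute(QUERY, url_params("http://cs.huji.ac.il/~db/query1.xsql?sname=Jim"), DEFAULTS)
with_default = substitute(QUERY, url_params("http://cs.huji.ac.il/~db/query1.xsql"), DEFAULTS)
print(with_jim)
print(with_default)
```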
A Strange Example
<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
            select = "*" where = "1=1" order="1">
  SELECT {@select}
  FROM {@from}
  WHERE {@where}
  ORDER BY {@order}
</xsql:query>
Customizing Results
• The query tag can have different attributes that customize the query results. Here are some of the important options:
  – max-rows: The maximum number of rows returned
  – skip-rows: The number of rows to skip before returning rows
  – rowset-element: The name of the rowset element
  – row-element: The name of the row element
Customizing Results
<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
            skip = "0" max-rows="2" skip-rows="{@skip}">
  SELECT *
  FROM Program
  ORDER BY url
</xsql:query>
• By calling the same page with different values for skip, we can page through the programs, two at a time
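The combined effect of skip-rows and max-rows is plain paging over an ordered result. A minimal Python sketch of that behaviour follows, with a made-up list of program URLs and a helper named page that is ours, not part of XSQL.

```python
def page(rows, skip_rows=0, max_rows=None):
    """Skip the first skip_rows rows, then return at most max_rows rows."""
    remaining = rows[skip_rows:]
    return remaining if max_rows is None else remaining[:max_rows]


# Made-up values, already sorted as ORDER BY url would leave them.
programs = ["a.html", "b.html", "c.html", "d.html", "e.html"]

first = page(programs, skip_rows=0, max_rows=2)
second = page(programs, skip_rows=2, max_rows=2)
last = page(programs, skip_rows=4, max_rows=2)
print(first, second, last)
```

The ORDER BY in the query matters: without a fixed order, successive pages are not guaranteed to line up.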
Notes
• An XSQL document can have many queries.
• The queries can appear within arbitrary XML tags
• We can produce XML that has a more nested structure using the CURSOR function...
Remembering Subqueries in the SELECT Clause
• Subqueries in the SELECT clause must return a single value. What do we do if we want, for each boat, all the sailors who reserved that boat?
• We want each bid to be associated with a table of Sailors data!
Using the CURSOR Function
<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql">
  SELECT bid,
         CURSOR(SELECT S.sid, sname
                FROM Sailors S, Reserves R
                WHERE S.sid = R.sid and R.bid = B.bid) as Reservers
  FROM Boats B
</xsql:query>
<?xml version="1.0"?>
<ROWSET>
  <ROW num = "1">
    <BID>113</BID>
    <RESERVERS>
      <RESERVERS_ROW num = "1">
        <SID> 13 </SID>
        <SNAME> Joe </SNAME>
      </RESERVERS_ROW>
      <RESERVERS_ROW num = "2">
        ....
      </RESERVERS_ROW>
    </RESERVERS>
  </ROW>
</ROWSET>
• Note that the alias from the SELECT clause (Reservers) names the inner elements: RESERVERS and RESERVERS_ROW take the place of the inner ROWSET and ROW tags.
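The nested shape of this result can be imitated with a short Python sketch that builds one inner element per boat and names it after the alias. The data values and the helper nested_rowset are made up for illustration; this is not how the XSQL processor is implemented.

```python
import xml.etree.ElementTree as ET

# Made-up sample data standing in for the Boats, Reserves and Sailors tables.
boats = [113, 114]
reservations = {113: [(13, "Joe"), (15, "Ann")], 114: []}


def nested_rowset(boats, reservations, alias="RESERVERS"):
    """Build a ROWSET in which each ROW holds a nested list named after the alias."""
    rowset = ET.Element("ROWSET")
    for num, bid in enumerate(boats, start=1):
        row = ET.SubElement(rowset, "ROW", num=str(num))
        ET.SubElement(row, "BID").text = str(bid)
        inner = ET.SubElement(row, alias)
        for inner_num, (sid, sname) in enumerate(reservations.get(bid, []), start=1):
            # Inner rows are named alias + "_ROW", as in the output above.
            inner_row = ET.SubElement(inner, alias + "_ROW", num=str(inner_num))
            ET.SubElement(inner_row, "SID").text = str(sid)
            ET.SubElement(inner_row, "SNAME").text = sname
    return rowset


rowset = nested_rowset(boats, reservations)
print(ET.tostring(rowset, encoding="unicode"))
```

A boat with no reservations still gets a row, with an empty inner element, which is something a flat join of the three tables would not give.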
Setting Page Level Parameters
• The following statement defines a parameter pname. The value of pname is the value in the first column of the first row of the query
• The variable pname will be recognized in the page
<xsql:set-page-param name="pname">
  SELECT Statement
</xsql:set-page-param>
Example
<?xml version="1.0"?>
<page connection="scott" xmlns:xsql="urn:oracle-xsql">
  <xsql:set-page-param name="num-stories">
    SELECT headings_num
    FROM user_prefs
    WHERE userid={@user}
  </xsql:set-page-param>
  <xsql:query max-rows="{@num-stories}">
    SELECT title, url
    FROM latest_news
  </xsql:query>
</page>
Another Way to Define a Page Level Parameter
• Page level parameters can also be set with the statement:
<xsql:set-page-param name="pname" value="val"/>
• For example:
<xsql:set-page-param name="num-stories" value="10"/>
Additional Options
• The set-page-param element can have the following attributes:
  – only-if-unset: If the value is “yes” then the parameter will be set only if it has no value
  – ignore-empty-value: If value is “yes” then the parameter will be set only if its value will not be an empty string
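The two options can be modelled with a small function over a dictionary of page parameters. This is a sketch of the behaviour as described above, under the assumption that a parameter "has no value" when it is missing or empty; set_param is a hypothetical name.

```python
def set_param(params, name, value, only_if_unset=False, ignore_empty_value=False):
    """Set params[name] = value unless one of the two options blocks it."""
    if only_if_unset and params.get(name):
        return params            # already has a value: leave it alone
    if ignore_empty_value and value == "":
        return params            # do not set the parameter to an empty string
    params[name] = value
    return params


params = {}
set_param(params, "num-stories", "10", only_if_unset=True)      # sets 10
set_param(params, "num-stories", "20", only_if_unset=True)      # keeps 10
set_param(params, "num-stories", "", ignore_empty_value=True)   # keeps 10
print(params)
```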
Setting Cookie Values
• The following statement defines a parameter pname. The value of pname is the value in the first column of the first row of the query
• The variable pname will be recognized until the cookie expires
<xsql:set-cookie name="pname">
  SELECT Statement
</xsql:set-cookie>
Additional Attributes for Set-Cookie
• The set-cookie element can have the following attributes:
  – max-age: The number of seconds before the cookie expires (defaults to expire when user exits current browser instance)
  – only-if-unset
  – ignore-empty-value
Example
<?xml version="1.0"?>
<page connection="scott" xmlns:xsql="urn:oracle-xsql">
  <xsql:set-cookie name="siteuser" max-age="31536000"
                   only-if-unset="yes" ignore-empty-value="yes">
    SELECT username
    FROM site_users
    WHERE username= '{@username}' and password='{@password}'
  </xsql:set-cookie>
  <!-- Other Actions Here -->
</page>
DML or PL/SQL
• We can do DML (update, insert, delete) or call PL/SQL procedures with the following basic syntax:
<xsql:dml>
  DML Statement
</xsql:dml>
or
<xsql:dml>
  BEGIN
    Any valid PL/SQL Statement
  END;
</xsql:dml>
Example
<xsql:dml>
  INSERT INTO page_requests_log(page,userid)
  VALUES('page12.xsql', '{@siteuser}')
</xsql:dml>
• If successful, the following element is added to the page:
<xsql-status action="xsql:dml" rows="n" />
• Otherwise, an error element is added:
<xsql-error action="xsql:dml"> ... </xsql-error>