Post on 11-Jan-2016
1
XML INTEROPERABILITY
Manjusha Ravindranath
2
CONTENTS
- Introduction
- Interoperability
- XSSQL syntax
- Use cases document
- Group By
  - Without aggregation
  - With aggregation
  - Multiple XML Databases
- Restructuring Queries
- Implementation
- Conclusion
3
INTRODUCTION
The goals of this research are
- to study XML interoperability.
- to develop XSSQL, a SQL-oriented query language for querying XML documents, and to compare it with procedure-oriented languages like XQuery.
- to study the mapping between the flat representation of relational data and the hierarchical representation of XML data.
4
INTEROPERABILITY
Interoperability is the ability to uniformly share, interpret, query and manipulate data across component databases.
XSSQL supports the key features of an interoperable language by
- being independent of XML schemas and Document Type Definitions (DTDs).
- permitting restructuring of one XML document into another through view definition capabilities.
5
XSSQL SYNTAX
select <tag_Name attrib_Name> {$var/Qname} </tag_Name>
from document("doc_name.xml")//Qname $var
where whereConditions
Variables are declared in the from clause as
- document("doc_name.xml")/Qname $var for an element at the top level of the document.
- document("doc_name.xml")//Qname $var for an element at an intermediate level of the document.
Group by is given inside the <tag_name> in the select clause as <tag_name group by $var>
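The difference between the two binding forms can be illustrated outside XSSQL. The following is a minimal Python sketch, not part of XSSQL itself: the miniature document and the helper names bind_top_level and bind_any_level are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, used only for this illustration.
DOC = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title></book>
  <shelf><book year="2000"><title>Data on the web</title></book></shelf>
</bib>
"""

def bind_top_level(root, qname):
    """Mimic document(...)/Qname: match only the top-level element."""
    return [root] if root.tag == qname else []

def bind_any_level(root, qname):
    """Mimic document(...)//Qname: match elements at any depth."""
    return list(root.iter(qname))

root = ET.fromstring(DOC)
top_bib = bind_top_level(root, "bib")      # the root element matches
top_book = bind_top_level(root, "book")    # no top-level <book>
all_books = bind_any_level(root, "book")   # both nested <book> elements
print(len(top_bib), len(top_book), len(all_books))
```

Each element returned by a binding plays the role of one instantiation of $var in the from clause.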
6
QUERIES FROM THE USE CASES DOCUMENT
1. “XMP” Queries
2. Tree Queries
3. “SEQ” Queries
4. “R” Queries
5. “SGML” Queries
6. “STRING” Queries
7. “NS” Queries
8. “PARTS” Queries
9. “STRONG” Queries
7
“XMP” QUERY
Sample Data - “bib.xml”
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author>
      <last>Stevens</last><first>W.</first>
    </author>
    <publisher>Addison-Wesley</publisher>
  </book>
  <book year="2000">
    <title>Data on the web</title>
    <author>
      <last>Suciu</last><first>Dan</first>
    </author>
    <publisher>Morgan Kaufmann</publisher>
  </book>
</bib>
8
QUERY (XSSQL)
Solution in XSSQL
Q1. List books published by Addison-Wesley after 1991, including their year and title.
select <book year="{$b/@year}"> {$b/title} </book>
from document("bib.xml")//book $b
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
Expected Result
<book year="1994"> <title>TCP/IP Illustrated</title> </book>
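To make the from / where / select reading of Q1 concrete, here is a minimal Python sketch of the same filter over the sample data. It is only an illustration of the query's meaning, not how XSSQL is implemented; the function name q1 is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Sample data from the "bib.xml" slide.
BIB = """
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
  </book>
  <book year="2000">
    <title>Data on the web</title>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann</publisher>
  </book>
</bib>
"""

def q1(bib_xml):
    """Books published by Addison-Wesley after 1991, with year and title."""
    root = ET.fromstring(bib_xml)
    results = []
    for b in root.iter("book"):                               # from ...//book $b
        publisher = (b.findtext("publisher") or "").strip()
        if publisher == "Addison-Wesley" and int(b.get("year")) > 1991:  # where
            out = ET.Element("book", {"year": b.get("year")})            # select
            title = ET.SubElement(out, "title")
            title.text = (b.findtext("title") or "").strip()
            results.append(out)
    return results

q1_result = [ET.tostring(b, encoding="unicode") for b in q1(BIB)]
print(q1_result)
```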
9
QUERY (XQUERY)
Solution in XQuery
XQuery uses the “FLWR” expression, which consists of FOR, LET, WHERE and RETURN clauses.
for $b in document("bib.xml")//book
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
return
  <book year="{$b/@year}">
    {$b/title}
  </book>
The XQuery solution produces the same expected result as the XSSQL solution above.
10
TREE QUERY - FUNCTIONS IN XSSQL
Sample Data - “book.xml”
<book>
  <title>Data on the web</title>
  <author>Dan Suciu</author>
  <section id="intro" difficulty="easy">
    <title>Introduction</title> <p>Text….</p>
    <section>
      <title>Audience</title> <p>Text….</p>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
      <figure height="400" width="400">
        <image source="pic.gif"/>
      </figure>
    </section>
  </section>
</book>
11
QUERY (XSSQL)
Solution in XSSQL
Q2. Prepare a nested table of contents for Book1, listing all the sections and their titles and preserving the attributes of each <section> element, if any.
create function toc ($e as element)
return element * as
begin {
  declare $n = local-name($e)
  if ($n = "section")
    select <section> {$e/@*} {toc($e/*)} </section>
  if ($n = "title")
    select <title> {$e/text()} </title>
} end
<toc> { toc(document("book.xml")/book) } </toc>
12
EXPECTED RESULT
<toc>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <section>
      <title>Audience</title>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
    </section>
  </section>
</toc>
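The recursion behind the table of contents can be sketched in Python as follows. This is an illustration of the idea only, under one stated assumption: toc is applied to the <section> children of <book>, so the book's own <title> is left out, as in the expected result. The helper name build_toc is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample data from the "book.xml" slide (paragraph text abbreviated).
BOOK = """
<book>
  <title>Data on the web</title>
  <author>Dan Suciu</author>
  <section id="intro" difficulty="easy">
    <title>Introduction</title><p>Text....</p>
    <section><title>Audience</title><p>Text....</p></section>
    <section>
      <title>Web Data and the Two Cultures</title>
      <figure height="400" width="400"><image source="pic.gif"/></figure>
    </section>
  </section>
</book>
"""

def toc(e):
    """Return the table-of-contents elements for e as a (possibly empty) list."""
    if e.tag == "section":
        out = ET.Element("section", dict(e.attrib))  # preserve the attributes
        for child in e:
            out.extend(toc(child))                   # recurse into the children
        return [out]
    if e.tag == "title":
        out = ET.Element("title")
        out.text = (e.text or "").strip()
        return [out]
    return []                                        # every other element is dropped

def build_toc(book_xml):
    """Wrap the result in <toc>. Assumption: start from the <section> children of <book>."""
    book = ET.fromstring(book_xml)
    result = ET.Element("toc")
    for section in book.findall("section"):
        result.extend(toc(section))
    return result

toc_tree = build_toc(BOOK)
print(ET.tostring(toc_tree, encoding="unicode"))
```

Elements such as <p> and <figure> match neither branch, so they disappear from the output while the section nesting is kept.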
13
QUERY (XQUERY)
Solution in XQuery
define function toc ($e as element)
as element * {
  let $n := local-name($e)
  return
    if ($n = "section")
    then <section> {$e/@*} {toc($e/*)} </section>
    else if ($n = "title")
    then <title> {$e/text()} </title>
    else ()
}
<toc> { toc(document("book.xml")/book) } </toc>
14
GROUP BY
In XSSQL, each node has its own grouping, and each child inherits the grouping of its parent.
The cases studied under group by are
- Group by without aggregation
- Group by with aggregation
- Multiple XML Databases
The following queries are based on the document “sales.xml”, which gives the daily sales of stores in two cities for each month, starting from January of the current year.
For simplicity, only two stores in each of two NC cities are considered, with sales for a couple of days in January and February.
15
WITHOUT AGGREGATION
Sample Data - “sales.xml”
<entries>
  <entry>
    <state>NC</state>
    <city>Greensboro</city>
    <store>Harris Teeter</store>
    <month>January</month>
    <day>1</day><sales>100.00</sales>
    <day>2</day><sales>110.00</sales>
  </entry>
  <entry>
    <state>NC</state>
    <city>Greensboro</city>
    <store>Food Lion</store>
    <month>January</month>
    <day>1</day><sales>100.00</sales>
    <day>2</day><sales>200.00</sales>
  </entry>
</entries>
16
QUERY (XSSQL)
Q3. List all stores in each city.
<root>
select
  <city group by $c> distinct($c/text())
    <store group by $s> distinct($s/text()) </store>
  </city>
</root>
from document("sales.xml")/entries/entry $e,
     $e/city $c, $e/store $s
17
SEMANTICS OF GROUP BY IN XSSQL
The instantiations after the group by would look like the following:
$c            $s
Greensboro    Harris Teeter
Greensboro    Harris Teeter
Greensboro    Food Lion
Greensboro    Food Lion
Raleigh       Harris Teeter
Raleigh       Harris Teeter
Raleigh       Lowes
Raleigh       Lowes
18
SEMANTICS OF GROUP BY IN XSSQL
The output instance is shown graphically below. With <city group by $c>, $c binds to every <city>…</city> in the document. Duplicate city names are eliminated by distinct($c/text()).
root
  Greensboro (Gso)
    Harris Teeter (HT)
    Food Lion (FL)
  Raleigh
    Harris Teeter (HT)
    Lowes
19
EXPECTED RESULT
<root>
<city>Greensboro
<store>Harris Teeter </store>
<store>Food Lion </store>
</city>
<city> Raleigh
<store> Harris Teeter</store>
<store> Lowes</store>
</city>
</root>
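The nested grouping can be sketched in Python, starting from the ($c, $s) instantiations listed on the semantics slide. This is only an illustration of the grouping semantics, not the XSSQL engine; the helper names group_stores_by_city and to_xml are hypothetical.

```python
import xml.etree.ElementTree as ET

# The ($c, $s) instantiations listed on the semantics slide.
BINDINGS = [
    ("Greensboro", "Harris Teeter"), ("Greensboro", "Harris Teeter"),
    ("Greensboro", "Food Lion"), ("Greensboro", "Food Lion"),
    ("Raleigh", "Harris Teeter"), ("Raleigh", "Harris Teeter"),
    ("Raleigh", "Lowes"), ("Raleigh", "Lowes"),
]

def group_stores_by_city(bindings):
    """Outer group on city; the inner store group inherits the city grouping."""
    groups = {}                         # dicts keep first-seen order
    for city, store in bindings:
        stores = groups.setdefault(city, [])
        if store not in stores:         # plays the role of distinct(...)
            stores.append(store)
    return groups

def to_xml(groups):
    """Render the groups in the nested <root>/<city>/<store> shape."""
    root = ET.Element("root")
    for city, stores in groups.items():
        city_el = ET.SubElement(root, "city")
        city_el.text = city
        for store in stores:
            ET.SubElement(city_el, "store").text = store
    return root

groups = group_stores_by_city(BINDINGS)
grouped_xml = ET.tostring(to_xml(groups), encoding="unicode")
print(grouped_xml)
```

Because the store list lives inside its city's entry, Harris Teeter appears once under each city rather than once overall.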
20
QUERY (XQUERY)
<root>
for $c in distinct-values(document("sales.xml")//city)
return
  <city> $c/text() {
    for $e in document("sales.xml")/entries/entry
    where some $ca in $e/city satisfies
      deep-equal($ca, $c)
    for $s in distinct-values($e/store)
    return
      <store> $s/text() </store> }
  </city>
</root>
21
NEW GROUP BY PROPOSAL
The above example can be written in XQuery using a new GROUP BY proposal provided by Prof. Dan Suciu.
<root>
for $e in document("sales.xml")/entries/entry,
$c in $e/city, $s in $e/store
return GROUPBY $c IN
<city>$c/text()
GROUPBY $s IN $s
</city>
</root>
22
QUERY (XSSQL)
Q4. Give the monthly sales in all stores in each city.
<root>
select
  <city group by $c> distinct($c/text())
    <store group by $s> distinct($s/text())
      <month group by $m> distinct($m/text())
        <total_sales> SUM($i) </total_sales>
      </month>
    </store>
  </city>
</root>
from document("sales.xml")/entries/entry $e,
     $e/city $c, $e/store $s, $e/month $m, $e/sales $i
23
EXPECTED RESULT
<root>
  <city>Greensboro
    <store>Harris Teeter
      <month>January
        <total_sales>210</total_sales>
      </month>
      <month>February
        <total_sales>730</total_sales>
      </month>
    </store>
    <store>Food Lion
      <month>January
        <total_sales>300</total_sales>
      </month>
      <month>February
        <total_sales>830</total_sales>
      </month>
    </store>
  </city>
</root>
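The aggregation step can be checked against the January figures with a short Python sketch. It uses only the two entries shown on the “sales.xml” slide, so the February totals are not reproduced here; the function name monthly_sales is a hypothetical helper and the code illustrates the semantics rather than the XSSQL implementation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The two entries shown on the "sales.xml" slide (January only).
SALES = """
<entries>
  <entry>
    <state>NC</state><city>Greensboro</city><store>Harris Teeter</store>
    <month>January</month>
    <day>1</day><sales>100.00</sales>
    <day>2</day><sales>110.00</sales>
  </entry>
  <entry>
    <state>NC</state><city>Greensboro</city><store>Food Lion</store>
    <month>January</month>
    <day>1</day><sales>100.00</sales>
    <day>2</day><sales>200.00</sales>
  </entry>
</entries>
"""

def monthly_sales(sales_xml):
    """Sum <sales> per (city, store, month), mirroring the nested group by with SUM."""
    totals = defaultdict(float)
    for e in ET.fromstring(sales_xml).iter("entry"):
        key = (
            e.findtext("city").strip(),
            e.findtext("store").strip(),
            e.findtext("month").strip(),
        )
        totals[key] += sum(float(s.text) for s in e.findall("sales"))
    return dict(totals)

totals = monthly_sales(SALES)
for key, value in totals.items():
    print(key, value)
```

The grouping key is the full (city, store, month) path, which is what inheriting the parent's grouping amounts to at the innermost level.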
24
QUERY (XQUERY)
<root>
for $c in distinct-values(document("sales.xml")//city),
    $e in document("sales.xml")/entries/entry
where some $ca in $e/city satisfies deep-equal($ca, $c)
return
  <city> distinct($c/text()) {
    for $s in distinct-values(document("sales.xml")//store)
    where some $sa in $e/store satisfies deep-equal($sa, $s)
    return
      <store> distinct($s/text()) {
        for $m in distinct-values(document("sales.xml")//month)
        let $i := $e/sales
25
QUERY (XQUERY) contd...
where some $ma in $e/month
satisfies deep-equal($ma,$m)
return
<month> distinct($m/text()) {
<total_sales>SUM($i) </total_sales> }
</month>}
</store>}
</city>
</root>
26
MULTIPLE XML DATABASES
Suppose we have multiple XML databases with similar and possibly overlapping data.
Sample Data - “Univ1.xml” and “Univ2.xml” hold student information for different majors.
“Univ1.xml”
<entries>
  <entry>
    <major>Mathematics</major>
    <student>Stephen Providence</student>
    <student>Dale Borget</student>
  </entry>
  <entry>
    <major>Computer Science</major>
    <student>Barbara McMasters</student>
  </entry>
</entries>
27
MULTIPLE XML DATABASES
Sample Data - “Univ2.xml”
<entries>
  <entry>
    <major>Mathematics</major>
    <student>Dale Borget</student>
    <student>Mary Rierson</student>
  </entry>
  <entry>
    <major>English</major>
    <student>Robin Mooney</student>
  </entry>
</entries>
28
QUERY (XSSQL)
define function students ($a as element entry)
as xs:string {
  declare $b = $a/student
  return $b }
<merge>
select
<entry>
  { $e1/major }
  { students($e1) }
  { $e2[student NOT IN (select $sa
        from document("Univ1.xml")//entry $ea, $ea/major $ma, $ea/student $sa
        where $ma/text() = $m2/text() ) ]/student }
</entry>
29
QUERY (XSSQL) contd...
from document("Univ1.xml")//entry $e1,
     document("Univ2.xml")//entry $e2, $e2/major $m2
UNION
select
  {$a}
</merge>
from document("Univ2.xml")//entry $a,
     $a/major $n
where $n not in document("Univ1.xml")//entry/major
30
EXPECTED RESULT
<merge>
  <entry>
    <major>Mathematics</major>
    <student>Stephen Providence</student>
    <student>Dale Borget</student>
    <student>Mary Rierson</student>
  </entry>
  <entry>
    <major>Computer Science</major>
    <student>Barbara McMasters</student>
  </entry>
  <entry>
    <major>English</major>
    <student>Robin Mooney</student>
  </entry>
</merge>
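The merge semantics (keep every Univ1 entry, add Univ2 students not already listed for the same major, then append majors found only in Univ2) can be sketched in Python. This is an illustration under those assumptions, not the XSSQL evaluation strategy; read_majors and merge are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Sample data from the "Univ1.xml" and "Univ2.xml" slides.
UNIV1 = """
<entries>
  <entry><major>Mathematics</major>
    <student>Stephen Providence</student><student>Dale Borget</student></entry>
  <entry><major>Computer Science</major>
    <student>Barbara McMasters</student></entry>
</entries>
"""
UNIV2 = """
<entries>
  <entry><major>Mathematics</major>
    <student>Dale Borget</student><student>Mary Rierson</student></entry>
  <entry><major>English</major>
    <student>Robin Mooney</student></entry>
</entries>
"""

def read_majors(xml_text):
    """Map each major to its list of students, in document order."""
    majors = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        major = entry.findtext("major").strip()
        students = [s.text.strip() for s in entry.findall("student")]
        majors.setdefault(major, []).extend(students)
    return majors

def merge(univ1_xml, univ2_xml):
    """Union of both documents per major, without duplicate students."""
    merged = read_majors(univ1_xml)
    for major, students in read_majors(univ2_xml).items():
        known = merged.setdefault(major, [])   # a new major is appended at the end
        for student in students:
            if student not in known:           # the NOT IN test of the query
                known.append(student)
    return merged

merged = merge(UNIV1, UNIV2)
for major, students in merged.items():
    print(major, students)
```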
31
RESTRUCTURING QUERIES
The following two documents, “doc1.xml” and “doc2.xml”, contain the same information about company stocks but have different hierarchical structures.
Views have been created to demonstrate the restructuring capabilities of XSSQL.
32
RESTRUCTURING QUERIES
Sample Data - “doc1.xml”
<entries>
<stock>
<date>8/8/03</date>
<ticker>IBM</ticker>
<value>5881</value>
</stock>
<stock>
<date>8/8/03</date>
<ticker>MSFT</ticker>
<value>6681</value>
</stock> <stock>
33
RESTRUCTURING QUERIES
Sample Data contd..
<date>8/9/03</date>
<ticker>IBM</ticker>
<value>5981</value>
</stock>
<stock>
<date>8/9/03</date>
<ticker>MSFT</ticker>
<value>6981</value>
</stock>
</entries>
34
RESTRUCTURING QUERIES
Sample Data - “doc2.xml”
<entries>
  <stock>
    <date>8/8/02</date>
    <IBM>5681</IBM>
    <MSFT>6681</MSFT>
  </stock>
  <stock>
    <date>8/9/02</date>
    <IBM>5981</IBM>
    <MSFT>6981</MSFT>
  </stock>
</entries>
35
RULES OF RESTRUCTURING
a. /doc2/entries/stock/IBM is a /doc1/entries/stock/ticker
b. If x is a ticker then /doc2/entries/stock/x/text() corresponds to /doc1/entries/stock/value.
36
QUERY (XSSQL)
create view doc1_to_doc2 as
<entries>
select
  <stock group by $d>
    <date> distinct($d/text()) </date>
    <$t/text()> $v/text() </$t/text()>
  </stock>
</entries>
from document("doc1.xml")//stock $s,
     $s/date $d, $s/ticker $t, $s/value $v
Expected Result
“doc2.xml”
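The two restructuring rules amount to a pivot: group the stock rows by date and turn each ticker's text into an element name whose content is the value. A minimal Python sketch of that pivot over the “doc1.xml” sample data follows. It illustrates the view's intent only; the function name doc1_to_doc2 is borrowed from the view for readability, and the output carries the dates and values of “doc1.xml” as given on its slides.

```python
import xml.etree.ElementTree as ET

# Sample data from the "doc1.xml" slides.
DOC1 = """
<entries>
  <stock><date>8/8/03</date><ticker>IBM</ticker><value>5881</value></stock>
  <stock><date>8/8/03</date><ticker>MSFT</ticker><value>6681</value></stock>
  <stock><date>8/9/03</date><ticker>IBM</ticker><value>5981</value></stock>
  <stock><date>8/9/03</date><ticker>MSFT</ticker><value>6981</value></stock>
</entries>
"""

def doc1_to_doc2(doc1_xml):
    """Group stock rows by date and promote each ticker to an element name."""
    by_date = {}                                   # date -> [(ticker, value), ...]
    for s in ET.fromstring(doc1_xml).iter("stock"):
        date = s.findtext("date").strip()
        pair = (s.findtext("ticker").strip(), s.findtext("value").strip())
        by_date.setdefault(date, []).append(pair)

    out = ET.Element("entries")
    for date, pairs in by_date.items():
        stock = ET.SubElement(out, "stock")        # one <stock> per distinct date
        ET.SubElement(stock, "date").text = date
        for ticker, value in pairs:
            ET.SubElement(stock, ticker).text = value  # rules a and b: data becomes a tag
    return out

restructured = ET.tostring(doc1_to_doc2(DOC1), encoding="unicode")
print(restructured)
```

The step where ticker data becomes a tag name is the part a fixed-schema query cannot express, which is why the view needs a computed element name.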
37
IMPLEMENTATION
XSSQL queries are translated into XQuery using naive algorithms.
General algorithm used to translate XSSQL into XQuery:
- Read the input XSSQL string and tokenize it on white space (for example, with Java's StringTokenizer class).
- Translate the XSSQL tokens into XQuery tokens using translation functions.
- Finally, concatenate the XQuery tokens to produce the output string.
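The three steps above can be sketched for the simplest query shape. The following Python code is a hypothetical illustration, not the authors' translator: it assumes a single select / from / where query with exactly one variable binding in the from clause, no white space inside the path, and no group by. All function names are invented for this sketch.

```python
def tokenize(query):
    """Step 1: split the XSSQL string on white space."""
    return query.split()

def split_clauses(tokens):
    """Collect the tokens that follow each of the keywords select, from, where."""
    clauses = {"select": [], "from": [], "where": []}
    current = None
    for tok in tokens:
        if tok.lower() in clauses:
            current = tok.lower()
        elif current is not None:
            clauses[current].append(tok)
    return clauses

def translate_from(tokens):
    """Step 2 (from clause): 'path $var' becomes 'for $var in path'."""
    if len(tokens) != 2:
        raise ValueError("this sketch supports exactly one binding in the from clause")
    path, var = tokens
    return ["for", var, "in", path]

def translate(query):
    """Steps 2 and 3: translate each clause, then concatenate the XQuery tokens."""
    clauses = split_clauses(tokenize(query))
    out = translate_from(clauses["from"])
    if clauses["where"]:
        out += ["where"] + clauses["where"]      # conditions carry over unchanged
    out += ["return"] + clauses["select"]        # the select template becomes the return
    return " ".join(out)

xssql = ('select <book year="{$b/@year}"> {$b/title} </book> '
         'from document("bib.xml")//book $b '
         'where $b/publisher = "Addison-Wesley" and $b/@year > 1991')
xquery = translate(xssql)
print(xquery)
```

Applied to Q1, this reorders the clauses into the for / where / return form shown on the XQuery slide. Group by, multiple bindings and string literals containing white space or keywords would need a real parser instead of white-space tokenization.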
38
CONCLUSION
In conclusion
- We introduced XSSQL as a SQL-oriented query language for querying XML documents.
- We developed a formal syntax for XSSQL akin to SQL and provided novel algorithms for translating XSSQL to XQuery.
- We showed that XSSQL handles group by extensively, with and without aggregation, in single and multiple XML documents, using several levels of nesting.
This work leads to several important directions for future work, such as
- optimization of views in XML documents.
- merging of multiple (more than two) XML documents.
- developing a standalone engine for XSSQL.
39
REFERENCES
- Lakshmanan, L.V.S., Sadri, F., and Subramanian, S.N. - 2001. SchemaSQL - An Extension to SQL for Multi-database Interoperability.
- W3C Working Draft. XML Query Use Cases. http://www.w3.org/TR/xmlquery-use-cases/
- Cotton, P., Robie, J. - Jan 30, 2002. Querying XML Documents. Unicode Conference.
- Berners-Lee, T., Hendler, J., Lassila, O. - May 17, 2001. The Semantic Web. Scientific American.
- W3C Recommendation - May 2, 2001. XML Schema Part 0: Primer. http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/