module 3: xml query and manipulation · module 3: xml query and manipulation ... xpath xquery xslt...

28
Module 3: XML Query and Manipulat Key XML query and manipulation languages include XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 Metaphors for Handling XML: 1 How we conceptualize XML documents determines our approach for handling them Text: an XML document is text Ignore any structure and perform simple pattern matches Tags: an XML document is text interspersed with tags Treat each tag as an “event” during reading a document, as in SAX (Simple API for XML) Construct regular expressions as in screen scraping c Munindar P. Singh, CSC 513, Spring 2010 p.46

Upload: hoangquynh

Post on 16-Aug-2018

251 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

Module 3: XML Query and Manipulation

Key XML query and manipulation languagesinclude

XPathXQueryXSLTSQL/XML

c©Munindar P. Singh, CSC 513, Spring 2010 p.45

Metaphors for Handling XML: 1

How we conceptualize XML documentsdetermines our approach for handling them

Text: an XML document is textIgnore any structure and perform simplepattern matches

Tags: an XML document is text interspersedwith tags

Treat each tag as an “event” duringreading a document, as in SAX (SimpleAPI for XML)Construct regular expressions as inscreen scraping

c©Munindar P. Singh, CSC 513, Spring 2010 p.46

Page 2: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

Metaphors for Handling XML: 2

Tree: an XML document is a treeWalk the tree using DOM (DocumentObject Model)

Template: an XML document has regularstructure

Let XPath, XSLT, XQuery do the workThought: an XML document represents aninformation model

Access knowledge via RDF or OWL

c©Munindar P. Singh, CSC 513, Spring 2010 p.47

XPath

Used as part of XPointer, SQL/XML, XQuery,and XSLT

Models XML documents as trees with nodesElementsAttributesText (PCDATA)CommentsRoot node: above root of document

c©Munindar P. Singh, CSC 513, Spring 2010 p.48

Page 3: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

Achtung!

Parent in XPath is like parent as traditionallyin computer scienceChild in XPath is confusing:

An attribute is not a child of its parentMakes a difference for recursion (e.g., inXSLT apply-templates)

Our terminology follows computer science:e-children, a-children, t-childrenSets via et-, ta-, and so on

c©Munindar P. Singh, CSC 513, Spring 2010 p.49

XPath Location Paths: 1

Relative or absoluteReminiscent of file system paths, but muchmore subtle

Name of an element to walk downLeading /: root/: indicates walking down a tree.: currently matched (context) node..: parent node

c©Munindar P. Singh, CSC 513, Spring 2010 p.50

Page 4: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XPath Location Paths: 2

@attr: to check existence or access value ofthe given attributetext(): extract the textcomment(): extract the comment

[ ]: generalized array accessors

Variety of axes, discussed below

c©Munindar P. Singh, CSC 513, Spring 2010 p.51

XPath Navigation

Select children according to position, e.g., [j],where j could be 1 . . . last()Descendant-or-self operator, //

.//elem finds all elems under the currentnode//elem finds all elems in the document

Wildcard, *:collects e-children (subelements) of thenode where it is applied, but omits thet-children@*: finds all attribute values

c©Munindar P. Singh, CSC 513, Spring 2010 p.52

Page 5: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XPath Queries (Selection Conditions)Attributes: //Song[@genre="jazz"]Text: //Song[starts-with(.//group, "Led")]

Existence of attribute: //Song[@genre]Existence of subelement: //Song[group]Boolean operators: and, not, orSet operator: union (|), analogous to choiceArithmetic operators: >, <, . . .String functions: contains(), concat(),length(), starts-with(), ends-with()distinct-values()Aggregates: sum(), count()

c©Munindar P. Singh, CSC 513, Spring 2010 p.53

XPath Axes: 1

Axes are addressable node sets based on thedocument tree and the current node

Axes facilitate navigation of a treeSeveral are definedMostly straightforward but some of themorder the nodes as the reverse of othersSome captured via special notation

current, child, parent, attribute, . . .

c©Munindar P. Singh, CSC 513, Spring 2010 p.54

Page 6: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XPath Axes: 2

preceding: nodes that precede the start ofthe context node (not ancestors, attributes,namespace nodes)following: nodes that follow the end of thecontext node (not descendants, attributes,namespace nodes)preceding-sibling: preceding nodes thatare children of the same parent, in reversedocument orderfollowing-sibling: following nodes that arechildren of the same parent

c©Munindar P. Singh, CSC 513, Spring 2010 p.55

XPath Axes: 3

ancestor: proper ancestors, i.e., elementnodes (other than the context node) thatcontain the context node, in reversedocument orderdescendant: proper descendantsancestor-or-self: ancestors, including self (ifit matches the next condition)descendant-or-self: descendants, includingself (if it matches the next condition)

c©Munindar P. Singh, CSC 513, Spring 2010 p.56

Page 7: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XPath Axes: 4

Longer syntax: child::SongSome captured via special notation

self::*:child::node(): node() matches all nodespreceding::*descendant::text()ancestor::Songdescendant-or-self::node(), whichabbreviates to //Compare /descendant-or-self::Song[1](first descendant Song) and //Song[1](first Songs (children of their parents))

c©Munindar P. Singh, CSC 513, Spring 2010 p.57

XPath Axes: 5

Each axis has a principal node kindattribute: attributenamespace: namespaceAll other axes: element

* matches whatever is the principal nodekind of the current axisnode() matches all nodes

c©Munindar P. Singh, CSC 513, Spring 2010 p.58

Page 8: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XPointer

Enables pointing to specific parts of documentsCombines XPath with URLsURL to get to a document; XPath to walkdown the documentCan be used to formulate queries, e.g.,

Song-URL#xpointer(//Song[@genre="jazz"])The part after # is a fragment identifier

Fine-grained addressability enhances theWeb architecture

High-level “conceptual” identification of node sets

c©Munindar P. Singh, CSC 513, Spring 2010 p.59

XQuery

The official query language for XML, now aW3C recommendation, as version 1.0Given a non-XML syntax, easier on thehuman eye than XMLAn XML rendition, XqueryX, is in the works

c©Munindar P. Singh, CSC 513, Spring 2010 p.60

Page 9: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery Basic Paradigm

The basic paradigm mimics the SQL(SELECT–FROM–WHERE) clausef o r $x i n doc ( ’ q2 . xml ’ ) / / Songwhere $x / @lg = ’ en ’r e t u r n

4 <Engl ish−Sgr name= ’ { $x / Sgr /@name} ’ t i = ’ { $x / @ti } ’ / >

c©Munindar P. Singh, CSC 513, Spring 2010 p.61

FLWOR Expressions

Pronounced “flower”For: iterative binding of variables over rangeof valuesLet: one shot binding of variables over vectorof valuesWhere (optional)Order by (sort: optional)Return (required)

Need at least one of for or let

c©Munindar P. Singh, CSC 513, Spring 2010 p.62

Page 10: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery For Clause

The for clauseIntroduces one or more variablesGenerates possible bindings for eachvariableActs as a mapping functor or iterator

In essence, all possible combinations ofbindings are generated: like a Cartesianproduct in relational algebraThe bindings form an ordered list

c©Munindar P. Singh, CSC 513, Spring 2010 p.63

XQuery Where Clause

The where clauseSelects the combinations of bindings that aredesiredBehaves like the where clause in SQL, inessence producing a join based on theCartesian product

c©Munindar P. Singh, CSC 513, Spring 2010 p.64

Page 11: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery Return Clause

The return clauseSpecifies what node-sets are returned basedon the selected combinations of bindings

c©Munindar P. Singh, CSC 513, Spring 2010 p.65

XQuery Let Clause

The let clauseLike for, introduces one or more variablesLike for, generates possible bindings foreach variableUnlike for, generates the bindings as a list inone shot (no iteration)

c©Munindar P. Singh, CSC 513, Spring 2010 p.66

Page 12: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery Order By Clause

The order by clauseSpecifies how the vector of variable bindingsis to be sorted before the return clauseSorting expressions can be nested byseparating them with commas

Variants allow specifyingdescending or ascending (default)empty greatest or empty least toaccommodate empty elementsstable sorts: stable order bycollations: order by $t collationcollation-URI: (obscure, so skip)

c©Munindar P. Singh, CSC 513, Spring 2010 p.67

XQuery Positional Variables

The for clause can be enhanced with a positionalvariable

A positional variable captures the position ofthe main variable in the given for clause withrespect to the expression from which themain variable is generated

Introduce a positional variable via the at $varconstruct

c©Munindar P. Singh, CSC 513, Spring 2010 p.68

Page 13: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery Declarations

The declare clause specifies things likeNamespaces: declare namespacepref=’value’

Predefined prefixes include XML, XMLSchema, XML Schema-Instance, XPath,and local

Settings: declare boundary-space preserve(or strip)Default collation: a URI to be used forcollation when no collation is specified

c©Munindar P. Singh, CSC 513, Spring 2010 p.69

XQuery Quantification: 1

Two quantifiers some and everyEach quantifier expression evaluates to trueor falseEach quantifier introduces a bound variable,analogous to for

1 f o r $x i n . . .where some $y i n . . .s a t i s f i e s $y . . . $xr e t u r n . . .

Here the second $x refers to the same variableas the first

c©Munindar P. Singh, CSC 513, Spring 2010 p.70

Page 14: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery Quantification: 2

A typical useful quantified expression would usevariables that were introduced outside of itsscope

The order of evaluation isimplementation-dependent: enablesoptimizationIf some bindings produce errors, this canmattersome: trivially false if no variable bindingsare found that satisfy itevery: trivially true if no variable bindings arefound

c©Munindar P. Singh, CSC 513, Spring 2010 p.71

Variables: Scoping, Bound, and Free

for, let, some, and every introduce variablesThe visibility variable follows typical scopingrulesA variable referenced within a scope is

Bound if it is declared within the scopeFree if it not declared within the scope

1 f o r $x i n . . .where some $x i n . . .s a t i s f i e s . . .r e t u r n . . .

Here the two $x refer to different variables

c©Munindar P. Singh, CSC 513, Spring 2010 p.72

Page 15: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery Conditionals

Like a classical if-then-else clauseThe else is not optionalEmpty sequences or node sets, written ( ),indicate that nothing is returned

c©Munindar P. Singh, CSC 513, Spring 2010 p.73

XQuery Constructors

Braces { } to delimit expressions that areevaluated to generate the content to be included;analogous to macros

document { }: to create a document nodewith the specified contentselement { } { }: to create an element

element foo { ’bar’ }: creates<foo>Bar</foo>element { ’foo’ } { ’bar’ }: also evaluatesthe name expression

attribute { } { }: likewisetext { body}: simpler, because anonymous

c©Munindar P. Singh, CSC 513, Spring 2010 p.74

Page 16: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery Effective Boolean Value

Analogous to Lisp, a general value can betreated as if it were a Boolean

A xs:boolean value maps to itselfAn empty sequence maps to falseA sequence whose first member is a nodemaps to trueA numeric that is 0 or NaN maps to false,else to trueAn empty string maps to false, others to true

c©Munindar P. Singh, CSC 513, Spring 2010 p.75

Defining Functions

1 dec lare f u n c t i o n l o c a l : i t em f top ( $ t ){

l o c a l : i t em f ( $t , ( ) )} ;

Here local: is the namespace of the queryThe arguments are specified in parentheses

All of XQuery may be used within thedefining bracesSuch functions can be used in place ofXPath expressions

c©Munindar P. Singh, CSC 513, Spring 2010 p.76

Page 17: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

Functions with Types

1 dec lare f u n c t i o n l o c a l : i t em f top ( $ t as element ( ) )as element ( ) ∗

{l o c a l : i t em f ( $t , ( ) )

} ;

Return types as aboveAlso possible for parameters, but ignore suchfor this course

c©Munindar P. Singh, CSC 513, Spring 2010 p.77

XSLT

A programming language with a functional flavorSpecifies (stylesheet) transforms fromdocuments to documentsCan be included in a document (best not to)

<?xml vers ion ="1.0"? ><?xml−s t y l eshee t type =" t e x t / x s l "

h r e f ="URL−to−xs l−sheet "?><main−element >

5 . . .</main−element >

c©Munindar P. Singh, CSC 513, Spring 2010 p.78

Page 18: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery versus XSLT: 1

Competitors in some ways, butShare a basis in XPathConsequently share the same data modelSame type systems (in the type-sensitiveversions)XSLT got out first and has a sizablefollowing, but XQuery has strong backingamong vendors and researchers

c©Munindar P. Singh, CSC 513, Spring 2010 p.79

XQuery versus XSLT: 2

XQuery is geared for querying databasesSupported by major relational DBMSvendors in their XML offeringsSupported by native XML DBMSsOffers superior coverage of processingjoinsIs more logical (like SQL) and potentiallymore optimizable

XSLT is geared for transforming documentsIs functional rather than declarativeBased on template matching

c©Munindar P. Singh, CSC 513, Spring 2010 p.80

Page 19: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XQuery versus XSLT: 3

There is a bit of an arms race between themTypes

XSLT 1.0 didn’t support typesXQuery 1.0 doesXSLT 2.0 does too

XQuery presumably will be enhanced withcapabilities to make updates, but XSLT couldtoo

c©Munindar P. Singh, CSC 513, Spring 2010 p.81

XSLT Stylesheets

A programming language that follows XMLsyntax

Use the XSLT namespace (conventionallyabbreviated xsl)Includes a large number of primitives,especially:

<copy-of> (deep copy)<copy> (shallow copy)<value-of><for-each select="..."><if test="..."><choose>

c©Munindar P. Singh, CSC 513, Spring 2010 p.82

Page 20: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XSLT Templates: 1

A pattern to specify where the giventransform should apply: an XPath expression

This match only works on the root:< x s l : template match =" / " >

. . .</ x s l : template >

Example: Duplicate text in an element< x s l : template match=" t e x t ( ) " >

2 < x s l : value−of s e l e c t = ’ . ’ / >< x s l : value−of s e l e c t = ’ . ’ / >

</ x s l : template >

c©Munindar P. Singh, CSC 513, Spring 2010 p.83

XSLT Templates: 2

If no pattern is specified, apply recursively onet-children via <xsl:apply-templates/>By default, if no other template matches,recursively apply to et-children of currentnode (ignores attributes) and to root:

1 < x s l : template match = "∗ | / " >< x s l : apply−templates / >

</ x s l : template >

c©Munindar P. Singh, CSC 513, Spring 2010 p.84

Page 21: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XSLT Templates: 3

Copy text node by defaultUse an empty template to override thedefault:< x s l : template match="X" / >

2 <!−− X = des i red pa t t e r n −−>

Confine ourselves to the examples discussed inclass (ignore explicit priorities, for example)

c©Munindar P. Singh, CSC 513, Spring 2010 p.85

XSLT Templates: 4

Templates can be namedTemplates can have parameters

Values for parameters are supplied atinvocationEmpty node sets by defaultAdditional parameters are ignored

c©Munindar P. Singh, CSC 513, Spring 2010 p.86

Page 22: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XSLT Variables

Explicitly declaredValues are node setsConvenient way to document templates

c©Munindar P. Singh, CSC 513, Spring 2010 p.87

Integrity Constraints in XML

Entity: xsd:unique and xsd:keyReferential: xsd:keyrefData type: XML Schema specificationsValue: Solve custom queries using XPath orXQuery

Entity and referential constraints are based onXPath

c©Munindar P. Singh, CSC 513, Spring 2010 p.88

Page 23: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XML Keys: 1

Keys serve as generalized identifiers, and arecaptured via XML Schema elements:

Unique: candidate keyThe selected elements yield unique fieldtuples

Key: primary key, which means candidatekey plus

The tuples exist for each selected elementKeyref: foreign key

Each tuple of fields of a selected elementcorresponds to an element in thereferenced key

c©Munindar P. Singh, CSC 513, Spring 2010 p.89

XML Keys: 2

Two subelements built using restrictedapplication of XPath from within XML Schema

Selector: specify a set of objects: this is thescope over which uniqueness appliesField: specify what is unique for eachmember of the above set: this is the identifierwithin the targeted scope

Multiple fields are treated as ordered toproduce a tuple of values for eachmember of the setThe order matters for matching keyref tokey

c©Munindar P. Singh, CSC 513, Spring 2010 p.90

Page 24: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

Selector XPath Expression

A selector finds descendant elements of thecontext node

The sublanguage of XPath used allowsChildren via ./child or ./* or childDescendants via .// (not within a path)Choice via |

The subset of XPath used does not allowParents or ancestorstext()AttributesFancy axes such as preceding,preceding-sibling, . . .

c©Munindar P. Singh, CSC 513, Spring 2010 p.91

Field XPath Expression

A field finds a unique descendant element(simple type only) or attribute of the context node

The subset of XPath used allowsChildren via ./child or ./*Descendants via .// (not within a path)Choice via |Attributes via @attribute or @*

The subset of XPath used does not allowParents or ancestorstext()Fancy axes such as preceding, . . .

An element yields its text()c©Munindar P. Singh, CSC 513, Spring 2010 p.92

Page 25: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

XML Foreign Keys

<keyre f name = " . . . " r e f e r =" pr imary−key−name">< s e l e c t o r xpath = " . . . " / >

3 < f i e l d name = " . . . " / ></ keyref >

Relational requirement: foreign keys don’thave to be unique or non-null, but if onecomponent is null, then all components mustbe null.

c©Munindar P. Singh, CSC 513, Spring 2010 p.93

Document Object Model (DOM)

Basis for parsing XML, which provides anode-labeled tree in its API

Conceptually simple: traverse by requestingelement, its attribute values, and its childrenProcessing program reflects documentstructure, as in recursive descentCan edit documentsInefficient for large documents: parses themfirst entirely even if a tiny part is neededCan validate with respect to a schema

c©Munindar P. Singh, CSC 513, Spring 2010 p.94

Page 26: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

DOM Example

1 DOMParser p = new DOMParser ( ) ;p . parse ( " f i lename " ) ;Document d = p . getDocument ( )Element s = d . getDocumentElement ( ) ;NodeList l = s . getElementsByTagName ( " member " ) ;

6 Element m = ( Element ) l . i tem ( 0 ) ;i n t code = m. g e t A t t r i b u t e ( " code " ) ;NodeList k ids = m. getChildNodes ( ) ;Node k id = k ids . i tem ( 0 ) ;S t r i n g elemName = ( ( Element ) k i d ) . getTagName ( ) ; . . .

c©Munindar P. Singh, CSC 513, Spring 2010 p.95

Simple API for XML (SAX)

Parser generates a sequence of events:startElement, endElement, . . .

Programmer implements these as callbacksMore control for the programmer

Processing program does not necessarilyreflect document structure

c©Munindar P. Singh, CSC 513, Spring 2010 p.96

Page 27: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

SAX Example: 1

c lass MemberProcess extends Defau l tHand ler {p u b l i c vo id s ta r tE lement ( S t r i n g u r i , S t r i n g n ,

S t r i n g qName, A t t r i b u t e s a t t r s ) {i f ( n . equals ( " member " ) ) code = a t t r s . getValue ( " code " ) ;

5 i f ( n . equals ( " p r o j e c t " ) ) i n P r o j e c t = t r ue ;b u f f e r . rese t ( ) ;

}

. . .

c©Munindar P. Singh, CSC 513, Spring 2010 p.97

SAX Example: 2

1

. . .

p u b l i c vo id endElement ( S t r i n g u r i , S t r i n g n ,S t r i n g qName) {

6 i f ( n . equals ( " p r o j e c t " ) ) i n P r o j e c t = f a l s e ;i f ( n . equals ( " member " ) && ! i n P r o j e c t )

. . . do something . . .}

}

c©Munindar P. Singh, CSC 513, Spring 2010 p.98

Page 28: Module 3: XML Query and Manipulation · Module 3: XML Query and Manipulation ... XPath XQuery XSLT SQL/XML c Munindar P. Singh, CSC 513, Spring 2010 p.45 ... A numeric that is 0 or

SAX Filters

A component that mediates between anXMLReader (parser) and a client

A filter would present a modified set ofevents to the clientTypical uses:

Make minor modifications to the structureSearch for patterns efficiently

What kinds of patterns, though?

Ideally modularize treatment of differentevent patternsIn general, a filter can alter the structure ofthe document

c©Munindar P. Singh, CSC 513, Spring 2010 p.99

Programming with XML

LimitationsDifficult to construct and maintaindocumentsInternal structures are cumbersome;hence the criticisms of DOM parsers

Emerging approaches provide superiorbinding from XML to

Programming languagesRelational databases

Check pull-based versus push-based parsers

c©Munindar P. Singh, CSC 513, Spring 2010 p.100