A Tool Kit for Implementing XML Schema Naming and Design Rules. OASIS Symposium: The Meaning of Interoperability, May 9, 2006. Josh Lubell, National Institute of Standards and Technology, Manufacturing Systems Integration Division.


Page 1: Josh Lubell,  lubell@nist

A Tool Kit for Implementing XML Schema Naming and Design Rules

OASIS Symposium: The Meaning of Interoperability
May 9, 2006

Josh Lubell
National Institute of Standards and Technology
Manufacturing Systems Integration Division

Page 2: Josh Lubell,  lubell@nist

XML Exchange Schemas are Bridges

Page 3: Josh Lubell,  lubell@nist

But Bridges Must Be Designed Properly

Page 4: Josh Lubell,  lubell@nist

A Solution: Naming and Design Rules
• Encode XML schema best practices
• Enforce a particular modeling methodology
• Ensure common naming conventions
    • Use of camel case
    • Allowable acronyms
    • …
• But NDRs can be difficult to apply
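
A naming convention such as camel case with a list of allowable acronyms is among the easier rules to automate. The sketch below is one possible reading, not taken from any NDR: it assumes upper camel case for element and type names, and the acronym list and function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# Hypothetical list of acronyms an NDR might allow inside names.
ALLOWED_ACRONYMS = {"ID", "URI", "UBL"}

def is_upper_camel_case(name, acronyms=ALLOWED_ACRONYMS):
    """True if name is a run of capitalized words and allowed acronyms."""
    words = re.findall(r"[A-Z][a-z0-9]+|[A-Z]+(?![a-z])", name)
    if "".join(words) != name:
        return False  # leftovers such as '_' or a leading lowercase letter
    # An all-uppercase word must be a single letter or an allowed acronym.
    return all(w in acronyms or not w.isupper() or len(w) == 1 for w in words)

def naming_violations(xsd_text):
    """List element and type names in a schema that break the convention."""
    root = ET.fromstring(xsd_text)
    bad = []
    for tag in ("element", "complexType", "simpleType"):
        for node in root.iter(f"{{{XSD}}}{tag}"):
            name = node.get("name")
            if name is not None and not is_upper_camel_case(name):
                bad.append(name)
    return bad
```

Under these assumptions `PartyName` and `BuyerID` are accepted, while `party_name` and `BuyerXYZ` are reported.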

Page 5: Josh Lubell,  lubell@nist

Barriers to NDR Usefulness
• Proliferation
    • How do I decide which NDR set to adopt?
    • Should I develop my own NDR?
• Lack of structure
    • NDR documents usually in proprietary word processor formats
    • Inhibits rule reuse
    • Limited versioning and traceability
• Ambiguity
    • Rules written in English rather than computer-interpretable language
    • NDR enforcement not automatic

Page 6: Josh Lubell,  lubell@nist

Schematron as an NDR Implementation Method
• Advantages
    • XML-native (based on XPath)
    • Rule-based
    • Can test for co-occurrence constraints
    • User-configurable diagnostic messages
    • ISO standard
• Disadvantage
    • Less versatile than a general purpose programming language
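
The rule-based shape of Schematron (a context, an assertion tested in that context, and a configurable diagnostic message) can be imitated in a few lines of Python. This is only an analogy using the standard library's limited XPath support, not Schematron itself. The co-occurrence constraint shown (an element that declares maxOccurs must also declare minOccurs) is a hypothetical rule chosen for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

def run_rule(root, context, test, message):
    """Schematron-like rule: apply `test` to each node selected by `context`
    and return the diagnostic message for every node that fails."""
    return [message.format(name=node.get("name", "?"))
            for node in root.findall(context, NS)
            if not test(node)]

def occurrence_bounds_cooccur(node):
    """Hypothetical co-occurrence constraint: maxOccurs implies minOccurs."""
    return node.get("maxOccurs") is None or node.get("minOccurs") is not None

def check_schema(xsd_text):
    root = ET.fromstring(xsd_text)
    return run_rule(root, ".//xsd:element", occurrence_bounds_cooccur,
                    "element '{name}' declares maxOccurs without minOccurs")
```

Keeping the context, test, and message as separate arguments mirrors how a Schematron rule separates its context from its assertions and diagnostics.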

Page 7: Josh Lubell,  lubell@nist

Example from Universal Business Language NDR
[ELD1] Each UBL:DocumentSchema MUST identify one and only one global element declaration that defines the document ccts:AggregateBusinessInformationEntity being conveyed in the Schema expression. That global element MUST include an xsd:annotation child element which MUST further contain an xsd:documentation child element that declares “This element MUST be conveyed as the root element in any instance document based on this Schema expression.”

Page 8: Josh Lubell,  lubell@nist

Implementation Observations
[ELD1] Each UBL:DocumentSchema MUST identify one and only one global element declaration that defines the document ccts:AggregateBusinessInformationEntity being conveyed in the Schema expression. That global element MUST include an xsd:annotation child element which MUST further contain an xsd:documentation child element that declares “This element MUST be conveyed as the root element in any instance document based on this Schema expression.”
Callouts on the rule text:
• Rule label
• Namespace dependence (marked twice)
• Subrule 1, Context 1
• Subrule 2, Context 2
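
To make the two subrules concrete, here is a minimal Python sketch of one possible reading of ELD1: among the global element declarations of a schema, exactly one must carry an xsd:annotation/xsd:documentation child containing the required sentence. The function name and the interpretation are assumptions; the rule text leaves room for other readings, and the slides implement such rules in Schematron rather than Python.

```python
import xml.etree.ElementTree as ET

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}
REQUIRED_TEXT = ("This element MUST be conveyed as the root element in any "
                 "instance document based on this Schema expression.")

def check_eld1(xsd_text):
    """Return a list of ELD1 diagnostics (empty if the schema passes)."""
    root = ET.fromstring(xsd_text)
    marked = []
    # Only direct children of xsd:schema are global element declarations.
    for elem in root.findall("xsd:element", NS):
        docs = elem.findall("xsd:annotation/xsd:documentation", NS)
        # Normalize whitespace so line breaks in the schema do not matter.
        texts = [" ".join("".join(d.itertext()).split()) for d in docs]
        if any(REQUIRED_TEXT in t for t in texts):
            marked.append(elem.get("name"))
    if len(marked) == 1:
        return []
    return ["ELD1: expected exactly one global element marked as root, "
            f"found {len(marked)}"]
```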

Page 9: Josh Lubell,  lubell@nist

UBL Lessons Learned
• Implementation non-trivial even for a seemingly simple rule
• Some rules require a general purpose programming language for implementation
    • [GNR1] UBL XML element, attribute and type names MUST be in the English language, using the primary English spellings provided in the Oxford English Dictionary.
    • [GNR7] UBL XML element, attribute and type names MUST be in singular form unless the concept itself is plural.
• Some rules cannot be implemented at all
    • [NMS6] UBL published namespaces MUST never be changed.
    • [VER10] UBL Schema and schema module minor version changes MUST not break semantic compatibility with prior versions.
• MUST versus SHOULD versus MAY
    • More on MAY later…

Page 10: Josh Lubell,  lubell@nist

Dept. of Navy (DON) NDR Case Study
• 128 rules
• Based on UBL NDR
• Why choose the DON NDR?
    • Help developers write better schemas for Federal government applications
    • Gain insight into best practices for NDR development (particularly reuse of existing NDRs)
    • Publicly available
    • A Navy standard

Page 11: Josh Lubell,  lubell@nist

DON NDR Testability (using Schematron)

Page 12: Josh Lubell,  lubell@nist

Issue: Use of MAY
• A rule saying that something MAY occur, strictly speaking, will always pass
    • But this may not be the rule creator’s intent
• Example: [CTD8] Code and ID ccts:BBIE Property complex types MAY use the xsd:choice element to reference global elements defined in standardized ID Scheme or Code List Schema modules.
• Approaches
    • Consider rule as guidance only (don’t implement)
    • Interpret MAY as discouragement, e.g. “warning: referencing global element using xsd:choice”
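
The two approaches can be expressed as a policy switch. In this hedged Python sketch, `may_policy="ignore"` treats CTD8 as guidance only, while `may_policy="warning"` reports each named complex type whose xsd:choice references a global element. The function and parameter names are hypothetical, and the check does not verify that the type is a Code or ID BBIE property type.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

def check_ctd8(xsd_text, may_policy="warning"):
    """Return (severity, message) pairs for uses of xsd:choice with element refs."""
    if may_policy == "ignore":
        return []  # approach 1: guidance only, nothing to test
    root = ET.fromstring(xsd_text)
    findings = []
    for ctype in root.findall(f"{{{XSD}}}complexType"):
        for choice in ctype.iter(f"{{{XSD}}}choice"):
            refs = [e for e in choice.findall(f"{{{XSD}}}element") if e.get("ref")]
            if refs:
                # approach 2: MAY read as discouragement, so report it
                findings.append((may_policy,
                                 f"{ctype.get('name')}: referencing global "
                                 "element using xsd:choice"))
                break  # one finding per complex type is enough
    return findings
```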

Page 13: Josh Lubell,  lubell@nist

Issue: Requirement for External Resources

[GNR1] UBL XML element, attribute and type names MUST be in the English language, using the primary English spellings provided in the Oxford English Dictionary.

• Implementation requires access to electronic OED
• And the DON adaptation of this rule has additional requirements:
    [GNR1] XML element, attribute, and type names MUST be in the English language, using the Oxford English Dictionary for Writers and Editors (Latest Ed.). Where both American and English spellings of the same word are provided, the American spelling MUST be used.
• Electronic OED must be fully up to date
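
The dependency on an external resource shows up clearly once the rule is coded. In the Python sketch below, a tiny in-memory word set stands in for the electronic dictionary and a small mapping stands in for its American/English spelling pairs. Both are placeholders invented for illustration, and a real check would need access to the actual dictionary data.

```python
import re

# Placeholder for the electronic dictionary (assumption, not real OED data).
ENGLISH_WORDS = {"color", "colour", "code", "party", "name"}
# Placeholder pairs where both spellings are listed: English -> American.
AMERICAN_PREFERRED = {"colour": "color"}

def split_camel_case(name):
    """Split a camel case name into lowercase words; acronyms stay together."""
    return [w.lower() for w in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)]

def check_gnr1(name, words=ENGLISH_WORDS, american=AMERICAN_PREFERRED):
    """Return diagnostics for the DON reading of GNR1 on a single name."""
    problems = []
    for word in split_camel_case(name):
        if word in american:
            problems.append(f"{name}: use American spelling "
                            f"'{american[word]}' instead of '{word}'")
        elif word not in words:
            problems.append(f"{name}: '{word}' not found in dictionary")
    return problems
```

Passing the dictionary in as a parameter makes the external dependency explicit and lets the UBL and DON variants of the rule share the same code with different data.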

Page 14: Josh Lubell,  lubell@nist

Issue: Rule Proliferation
• Illustrated by UBL rule GNR1 versus DON rule GNR1
• DON rule same as UBL rule, but with added constraints
    • American spelling favored
    • Latest OED edition required
• But no explicit relationship specified in DON NDR!
    • Both rules have same ID, even though they are different rules
    • Improved traceability and reusability would reduce the confusion

Page 15: Josh Lubell,  lubell@nist

Issue: Ambiguous Terminology

• More rigor needed in NDR definitions
• Example: “xsd:SchemaExpression”
    • Not defined in W3C XML Schema recommendation
    • Used but not defined in DON NDR
    • Defined in UBL NDR to mean “a concept”

Page 16: Josh Lubell,  lubell@nist

Issue: Mixed Content
• Essential for representing semi-structured data
• But allowing it makes the NDR more complicated
    • UBL NDR forbids mixed content
    • DON NDR allows it, but only if defined by a namespace from a Navy-approved standard (e.g. XHTML)
    • But XHTML element and attribute names violate rule GNR1!
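
Both policies can be sketched with one function that takes the set of approved namespaces as a parameter: an empty set gives a UBL-style prohibition, and a non-empty set gives a DON-style exception. The function name is hypothetical, the approved list is supplied by the caller, and the check looks only at the schema's own targetNamespace, which is a simplification.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

def mixed_content_violations(xsd_text, approved_namespaces=()):
    """Names of complex types that allow mixed content outside approved namespaces."""
    root = ET.fromstring(xsd_text)
    if root.get("targetNamespace") in approved_namespaces:
        return []  # the whole schema belongs to an approved standard
    violations = []
    for ctype in root.iter(f"{{{XSD}}}complexType"):
        flags = [ctype.get("mixed")]
        content = ctype.find(f"{{{XSD}}}complexContent")
        if content is not None:
            flags.append(content.get("mixed"))
        # xsd:boolean accepts both "true" and "1"
        if any(flag in ("true", "1") for flag in flags):
            violations.append(ctype.get("name", "(anonymous)"))
    return violations
```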

Page 17: Josh Lubell,  lubell@nist

Quality of Design (QoD) Tool
• Contains rules based on naming and design guidelines (NDRs) from a number of sources
• Stores executable test cases written in Schematron and Java Expert System Shell (Jess)
• Executes tests against user-provided schemas and reports results
• Rules grouped into test profiles
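
The idea of grouping executable tests into profiles and running them against a user-provided schema can be illustrated with a small Python sketch. It is not the QoD tool's design (which stores tests in Schematron and Jess). The profile names, rule IDs, and checks below are all hypothetical.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

def has_target_namespace(root):
    return root.get("targetNamespace") is not None

def single_global_element(root):
    return len(root.findall(f"{{{XSD}}}element")) == 1

# Hypothetical test profiles: each groups (rule ID, check) pairs.
PROFILES = {
    "document-schema": [("NS-1", has_target_namespace),
                        ("EL-1", single_global_element)],
    "library-schema": [("NS-1", has_target_namespace)],
}

def run_profile(profile_name, xsd_text):
    """Run every check in a profile and report pass/fail per rule ID."""
    root = ET.fromstring(xsd_text)
    return {rule_id: ("pass" if check(root) else "fail")
            for rule_id, check in PROFILES[profile_name]}
```

Because a check can appear in more than one profile, the same rule implementation is reused across rule sets, which is the kind of reuse the tool is meant to support.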

Page 18: Josh Lubell,  lubell@nist

Why QoD?
• Addresses proliferation of NDRs
    • Overlapping NDR standards
• Supports reusability of rules
• Highlights ambiguous rules
• Provides an explicit structure for rules in NDRs
• Automates rule enforcement
• Enables versioning and traceability of rules

Page 19: Josh Lubell,  lubell@nist

Characteristics of Rules
• Coverage: full, partial, none
• Applicability: indicates type of schema (document, low, or aggregate) the rule applies to
• Rationale: reason for rule from a list of justifications
• Requirement: text from the NDR document
• Implementation File: URI of the file containing the implementation of the rule

Page 20: Josh Lubell,  lubell@nist

Example XML Description of a Rule using QoD Exchange Schema

<testProfile>
  <source id="ubl">
    <organization>OASIS</organization>
    <orgURL>http://www.oasis-open.org</orgURL>
    <title>Universal Business Language (UBL) Naming and Design Rules</title>
    <version>1.0</version>
    <date>2004-11-15</date>
    <docURL>http://docs.oasis-open.org/ubl/cd-UBL-NDR-1.0.1</docURL>
  </source>
  <ruleSet id="ELD">
    <name>Element Declaration Rules</name>
    <rule id="ELD1">
      <coverage>full</coverage>
      <schema>D</schema>
      <rationale>structural clarity</rationale>
      <requirement>Each UBL:DocumentSchema MUST identify one and ... </requirement>
      <implementation file="example.scmt#eld1" type="schematron"/>
    </rule>
    ...
</testProfile>
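
A profile in this shape is straightforward to load into a structured form. The Python sketch below reads the element and attribute names shown in the example into a dataclass. The `Rule` class and `load_rules` function are hypothetical and not part of the QoD tool, and the input must be a complete, well-formed profile (the slide's version is elided).

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class Rule:
    rule_id: str
    rule_set: str
    coverage: str
    schema: str
    rationale: str
    requirement: str
    implementation_file: str
    implementation_type: str

def load_rules(profile_xml):
    """Parse a testProfile document into a list of Rule records."""
    root = ET.fromstring(profile_xml)
    rules = []
    for rule_set in root.findall("ruleSet"):
        for rule in rule_set.findall("rule"):
            impl = rule.find("implementation")
            rules.append(Rule(
                rule_id=rule.get("id"),
                rule_set=rule_set.get("id"),
                coverage=rule.findtext("coverage", default=""),
                schema=rule.findtext("schema", default=""),
                rationale=rule.findtext("rationale", default=""),
                requirement=rule.findtext("requirement", default="").strip(),
                implementation_file=impl.get("file") if impl is not None else "",
                implementation_type=impl.get("type") if impl is not None else "",
            ))
    return rules
```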

Page 21: Josh Lubell,  lubell@nist

QoD Test Profile Exchange

Page 22: Josh Lubell,  lubell@nist
Page 23: Josh Lubell,  lubell@nist

Application to Developing XML Schemas
• Currently a limited set of rules is implemented
    • Recently implemented subset of DON NDR in Schematron
• Tested with a small but varied set of sample schemas
    • Navy – IETM Schema Q70:IETM (Interactive Electronic Technical Manual)
    • Grants.gov
    • AEX (building and construction industry)
    • US Dept. of Defense

Provided meaningful results to schema developers

Page 24: Josh Lubell,  lubell@nist

Examples of types of warnings found in developing XML Schemas

Global elements declared in non-desirable places

Anonymous/local types defined in non-desirable places

“Global” schemas that do not declare a default namespace

Document/Transaction level schemas that define multiple global elements

Re-declaration of elements and types (e.g. programType) in different namespaces
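
Two of these warning types are simple structural checks. As a rough Python sketch (the function name and message wording are assumptions, and the tool described in these slides uses Schematron for such tests), the following flags schemas with more than one global element and elements that define an anonymous type inline.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

def schema_warnings(xsd_text):
    """Return warning strings for two of the patterns listed above."""
    root = ET.fromstring(xsd_text)
    warnings = []
    # Direct children of xsd:schema are the global element declarations.
    global_names = [e.get("name") for e in root.findall(f"{{{XSD}}}element")]
    if len(global_names) > 1:
        warnings.append("multiple global elements: " + ", ".join(global_names))
    # An element with an inline complexType or simpleType child has an anonymous type.
    for elem in root.iter(f"{{{XSD}}}element"):
        for kind in ("complexType", "simpleType"):
            if elem.find(f"{{{XSD}}}{kind}") is not None:
                warnings.append(f"anonymous {kind} inside element '{elem.get('name')}'")
    return warnings
```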

Page 25: Josh Lubell,  lubell@nist

Lessons Learned in Coding NDRs
• NDR documents need to be regarded as rigorous technical documentation
    • More review needed
    • Better authoring tools needed
• Rules that cannot be implemented are non-enforceable
• Definition of NDRs is non-trivial
    • Many rules cannot be tested
    • Many rules are more difficult to implement than thought
    • Difficult to reuse rules due to namespace definitions
    • Often rules are ambiguous or unclear
• Implementation of rules is non-trivial
    • Testing of rules is complex
    • All boundary conditions need to be thought of and covered
• Legacy data and 3rd party schemas need to be addressed in NDRs

Page 26: Josh Lubell,  lubell@nist

What’s Next
• Continue to expand our NDR rule-base
• Continue to enhance software based on user requirements
• Produce a tool kit for NDR developers
    • Enhance QoD schema to represent entire NDR document
    • Provide authoring templates
• Identify collaborators for future work
    • If interested, contact me!

Page 27: Josh Lubell,  lubell@nist

Summary
• A process for XML schema development is necessary
• Tools can automate the process, thereby reducing labor and deployment time
• Definition and implementation of NDRs is non-trivial but necessary to support reuse of schemas
• Enforcing NDRs will ultimately make XML schemas more interoperable