Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Tara Athan
Athan Services, West Lafayette, IN, USA
taraathan AT gmail.com
presented at RuleML 2012
August 29, 2012
Montpellier, France
Situation: Information Overload
Information Abundance: Datasets, Texts, Metadata, Knowledge Bases
Information Diversity
Characteristics: Numerical, Textual, Hierarchical
Uses: Quantitative Processing, Search, Reasoning
Response: Representation Format Overload
Geographical Data Formats
Flat: Shapefiles, Tabular Data
Structured Exchange Formats
XML: GPX, ArcGIS XML, KML, GeoRSS, GML2, GML3
SQL: GDF
JSON: GeoJSON
Geodatabases
Knowledge Representation Formats
RDF (XML, N3, Turtle)
OWL (RDF/XML, OWL/XML, Manchester)
RuleML (XML, POSL)
Common Logic (CLIF, CGIF)
F-Logic
Semantic Interoperability: Translation or Coexistence (I)
Between disparate KR languages
  Translation: RDF to OWL, KIF to CLIF, ...
    Issues: incompatibility, redundancy, inefficiency, loss of information
  Coexistence: OntoIOp's DOL aims to allow heterogeneous knowledge bases to interoperate without translation
Between disparate data formats
  Translation for computation, optimized in domain-specific applications such as GIS
  Coexistence for visualization (mashups, mapping)
Semantic Interoperability: Translation or Coexistence (II)
Between data and KR
  Translation: GML3 transforms shapefiles, databases, ...
    Has conceptual models, not model-theoretic, not grounded in ontology
    Not in general invertible
  Coexistence: RDF, RuleML, CLIF allow embedded structured data
    XMLLiteral, <Data>, '...'
    IRI with XPath or XPointer provides fine-grained access, but can't quantify
Need: Metalanguage for Quantification within Embedded Data
Metalanguage Strategy: Semantic or Syntactic
Semantic
  RDF Graph
    Is a set of RDF triples
    Model-theoretic semantics
    (Limited) inferences may be drawn
  RuleML OrdLab Tree
    Herbrand Semantics
Syntactic
  XPath Data Model (XDM)
    Is also a graph
    Each node of the graph has an information set (infoset)
    No interpretations
    No inferences
  XSLT uses XDM
XSLT: W3C Recommendation
Turing-complete language
Takes well-formed XML input
Produces arbitrary character string output
Expressed in XML
May be validated by
  Relax NG (https://github.com/ndw/xslt-relax-ng)
  XSD (http://www.w3.org/2007/schema-for-xslt20.xsd)
Processed by Libxslt, Apache Xalan, Saxon XSLT, ...
Metalanguage MXSL Goals
Enable reasoning over large, structured legacy datasets (e.g. in XML or flat files) without large-scale conversion of compact representations into the more verbose RuleML format;
Allow user-specified deductive systems for reasoning over RuleML knowledge bases via axiom schemas;
Enhance the RuleML query capability.
MXSL Approach
Use XSLT to represent the transformation of a set of parameter bindings into the corresponding instances of an axiom schema
Interpret the case when a set of bindings is not specified as an axiom schema
  Syntactically indicated by a processing instruction
Example: Captured String Syntax
Inspired by IKL: (= ('a') a)
stringParam stringParam
XSLT Element Construction
stringParam
XSLT for-each: MXSL Quantification
... ...
Two-Phase Interpretation
1. The application of a metainterpretation to transform the metalanguage instance into another instance that is purely in the base language;
2. The application of the base language's model-theoretic interpretation to transform the derived instance into truth values.
Truth-Valuation (I)
Assume an interpretation I, with vocabulary V and metalanguage formula F.
Each metainterpretation transformation
  uses one binding of the metalanguage parameters to literals, constrained only by datatype
  produces a preformula
Preformulas are discarded that either
  Are not syntactically valid RuleML
  Contain names that are not in vocabulary V
Truth-Valuation (II)
Call the remaining set of RuleML formulas S
The metalanguage formula F is valid in an interpretation I if all RuleML formulas in S are valid in the interpretation I.
In (exactly) that case, I is a model of F.
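The two-phase valuation can be sketched in Python as follows. This is only an illustrative toy, not the MXSL implementation: formulas are modeled as plain dictionaries, the syntactic-validity check is a pluggable predicate that accepts everything by default, and the names `preformulas`, `retained`, `is_model`, `captured_string_schema`, `name_denotation` and `string_denotation` are all hypothetical.

```python
from itertools import product

def preformulas(schema, params, literals):
    """Phase 1: apply the schema to every binding of parameters to literals."""
    for values in product(literals, repeat=len(params)):
        yield schema(dict(zip(params, values)))

def retained(schema, params, literals, vocabulary, well_formed=lambda f: True):
    """Build S: drop preformulas that are ill-formed or use names outside V."""
    return [f for f in preformulas(schema, params, literals)
            if well_formed(f) and f["names"] <= vocabulary]

def is_model(schema, params, literals, vocabulary, holds):
    """Phase 2: I is a model of F exactly when every formula in S holds in I."""
    return all(holds(f) for f in retained(schema, params, literals, vocabulary))

def captured_string_schema(binding):
    """Toy stand-in for the captured string axiom schema (= ('s') s)."""
    s = binding["stringParam"]
    return {"text": f"(= ('{s}') {s})", "names": {s}, "string": s}

# Hypothetical interpretation I: what names and captured strings denote.
vocabulary = {"a", "b"}
literals = ["a", "b", "c"]
name_denotation = {"a": 1, "b": 2}
string_denotation = {"a": 1, "b": 2, "c": 3}

def holds(formula):
    """A captured-string equation holds when both sides denote the same object."""
    s = formula["string"]
    return string_denotation[s] == name_denotation[s]

S = retained(captured_string_schema, ["stringParam"], literals, vocabulary)
result = is_model(captured_string_schema, ["stringParam"], literals, vocabulary, holds)
```

In this toy run the binding to "c" produces a preformula that is discarded, because "c" is not in the vocabulary, so only the instances for "a" and "b" are evaluated.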
Metalanguage Requirements (I)
add element node to output tree, given local name and namespace as parameters;
add attribute node to element node, given local name, namespace and value as parameters;
add text node as child of element;
iteratively add, in specified order, a number of nodes to output tree, based on set of parameter bindings represented in structured XML;
Metalanguage Requirements (II)
express the hierarchical and sequential structure of the output tree isomorphically according to the corresponding structure of the metalanguage instance.
In addition, we have the following operational requirements:
Instances in metalanguage must:
  validate against the XSLT schema,
  be self-contained,
  be self-executing.
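The four tree-building requirements can be illustrated outside XSLT with Python's standard `xml.etree.ElementTree` module. This is a rough sketch of the operations only, not MXSL itself; the namespace `http://example.org/kr`, the element names `Output` and `Item`, the attribute `index`, and the helper functions are all hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/kr"  # hypothetical namespace, for illustration only

def add_element(parent, local_name, namespace):
    """Add an element node, given its local name and namespace."""
    return ET.SubElement(parent, f"{{{namespace}}}{local_name}")

def add_attribute(element, local_name, value, namespace=None):
    """Add an attribute node, given local name, optional namespace and value."""
    key = f"{{{namespace}}}{local_name}" if namespace else local_name
    element.set(key, value)

def add_text(element, text):
    """Add text as a child of the element."""
    element.text = (element.text or "") + text

def for_each(parent, bindings, build):
    """Iteratively add nodes, in the order of the given parameter bindings."""
    for binding in bindings:
        build(parent, binding)

def build_item(parent, binding):
    """Pattern instantiated once per binding; its nesting mirrors the output."""
    item = add_element(parent, "Item", NS)
    add_attribute(item, "index", str(len(parent)))
    add_text(item, binding["stringParam"])

root = ET.Element(f"{{{NS}}}Output")
bindings = [{"stringParam": "a"}, {"stringParam": "b"}]
for_each(root, bindings, build_item)
```

The nesting of calls inside `build_item` follows the nesting of the nodes it produces, which is the isomorphism requirement in miniature.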
MXSL Syntax (I)
<xsl:for-each>, for iteration
<xsl:element>, for construction of XML elements
<xsl:attribute>, for addition of attributes to XML elements
<xsl:value-of> with @select, for generation of the values of attributes and contents of elements
MXSL Syntax (II)
(once as the root element)
(once as a header to define the parameter bindings and also as the first children of , once for each parameter in the iteration)
(once for the MXSL body)
(zero to many times as the first children of )
MXSL Syntax (III)
(once as the last child of to define the root for the output document)
(arbitrarily within the element)
Example: Structured Data
Axiom Before Generalization
Tara Athan Athan Services
Tara Athan Athan Services
Example: Structured Data (cont.)
MXSL Axiom Schema
Overview of Infinitary Logic (I)
Given a base (finitary) language L
First parameter: cardinality of sequences of conjunctions and disjunctions, also cardinality of set of variable symbols
Second parameter: cardinality of sequences of quantifiers
$L(\omega, \omega)$ -> complete, compact
Assert/Rulebase/{And, Or}/xsl:for-each
Disallowing mapClosure on Assert or Rulebase, and closure on And -> $L(\omega_1, \omega)$ -> complete, not compact*
Allowing mapClosure on Assert or Rulebase, or closure on And -> $L(\omega_1, \omega_1)$ -> neither complete nor compact*
*Follows if RuleML is given FOL semantics, assumes countable literals
Infinitary Case: In practice
Nonliteral vocabulary is typically finite
Literals can be introduced only as needed
Consider the captured string axiom
  One binding needed for each nonliteral in vocabulary
  (= ('a') a) is discarded by the semantics if "a" is not in the vocabulary
So finite model property can be retained even with MXSL axiom schemas
Extensions: xsl:for-each/Query
P
Interpret as implicit disjunction of multiple queries
Success = inconsistency of rulebase appended with negations of query (Rulebase/xsl:for-each/Neg) - L(, )
Bindings are lost; extended query capability is needed
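A small Python sketch may help illustrate the refutation-style reading of a quantified query. It assumes a deliberately simplified setting that is not from the slides: the rulebase contains only ground atomic facts, encoded as tuples, so inconsistency reduces to finding a fact together with its negation. The names `inconsistent` and `query_succeeds` and the tuple encoding are hypothetical.

```python
def inconsistent(literals):
    """Ground literals are inconsistent iff some atom p occurs with ('not', p)."""
    return any(("not", p) in literals for p in literals if p[0] != "not")

def query_succeeds(facts, query_schema, bindings):
    """Append the negation of every query instance, then test for inconsistency."""
    negated = {("not", query_schema(b)) for b in bindings}
    return inconsistent(set(facts) | negated)

# Toy rulebase of ground facts and a query schema P(x) over candidate bindings.
facts = {("P", "a"), ("Q", "b")}

def p_query(x):
    """Hypothetical query schema: instantiate P with one binding."""
    return ("P", x)

found = query_succeeds(facts, p_query, ["b", "a"])
not_found = query_succeeds(facts, p_query, ["b", "c"])
```

The function returns only a truth value, so it does not report which binding made the query succeed. That is the loss of bindings noted above.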
Metalanguage Abstract Syntax
Grouping symbols entered as pairs through one command
Mirrors the structure of the output document.
Iteration command generates a set of sibling items based on a pattern, instantiated with parameters.
Text always added through a metalanguage command, implementing output-escaping of punctuation characters
MXSL: For CLIF
MCLIF
(mcltext captured-string-axiom
  (for-each (($stringParam xs:string))
    (= (apply (squote $stringParam)) (dquote $stringParam))))
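One possible reading of the MCLIF schema above, sketched in Python. It rests on assumptions not confirmed by the slides: that both bare parentheses and `apply` emit a parenthesized group, that `squote` and `dquote` wrap text in single and double quotes, and that escaping is done by prefixing a backslash to any embedded quote or backslash. The helper names `escape`, `paren` and `captured_string_axiom` are hypothetical.

```python
def escape(text, quote):
    """Backslash-escape backslashes and the given quote character (assumed rule)."""
    return text.replace("\\", "\\\\").replace(quote, "\\" + quote)

def squote(text):
    """Emit the text as a single-quoted string."""
    return "'" + escape(text, "'") + "'"

def dquote(text):
    """Emit the text as a double-quoted name."""
    return '"' + escape(text, '"') + '"'

def paren(*terms):
    """Emit an opening and closing parenthesis as a pair around the terms."""
    return "(" + " ".join(terms) + ")"

def captured_string_axiom(bindings):
    """One sentence per binding of $stringParam, in the order given."""
    return [paren("=", paren(squote(s)), dquote(s)) for s in bindings]

sentences = captured_string_axiom(["a", "it's"])
for sentence in sentences:
    print(sentence)
```

Entering parentheses and quotes only through paired commands means the generated text cannot contain an unbalanced delimiter, whatever characters the bound string holds.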
Conclusions
MXSL: a (small) subset of the XSLT language, which can be used as a syntactic metalanguage for XML-based KR languages
MXSL expresses
  axiom schemas
  semantics of structured data
  generalized queries.
Paradigm for non-XML-based formats
Future investigations: reasoning