QUESTIONS AND ANSWERS:
REASONING AND QUERYING IN
DESCRIPTION LOGIC
A THESIS SUBMITTED TO THE UNIVERSITY OF MANCHESTER
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
IN THE FACULTY OF SCIENCE AND ENGINEERING
April 2001
By
Sergio Tessaris
Department of Computer Science
Contents
Abstract
Declaration
Copyright
Acknowledgements
1 Introduction
1.1 Background
1.1.1 Logic based knowledge representation
1.1.2 Description logics
1.2 Reasoning with DL knowledge bases
1.3 Thesis outline
2 Preliminaries
2.1 Description Logics
2.1.1 Syntax and Semantics
2.1.2 DL based systems
2.2 Reasoning with DLs
2.2.1 Terminological reasoning
2.2.2 Hybrid reasoning
2.3 Role axioms
2.4 The description logic $\mathcal{SH}f$
3 Model characterisation of $\mathcal{SH}f$
3.1 Quasi transitive shrub interpretations
3.1.1 Canonical interpretations
3.2 Transforming interpretations into q.t. shrubs
3.2.1 Unravelling the interpretation
3.2.2 Properties of unravelled interpretations
3.3 Completeness of q.t. shrub interpretations
3.4 Application to terminological reasoning
3.5 Remarks
4 KB satisfiability algorithms
4.1 Alternative Abox reasoning techniques
4.1.1 Direct tableaux
4.1.2 Encoding
4.1.3 Resolution based methods
4.2 Precompletion
4.2.1 KB satisfiability algorithm
4.2.2 Correctness and completeness of the algorithm
4.2.3 Pros and cons of precompletion
5 Precompletion algorithm for $\mathcal{SH}f$
5.1 Precompletions of knowledge bases
5.2 Satisfiability of precompletions
5.2.1 Interpretation of roles
5.2.2 Interpretation of concepts
5.2.3 Satisfiability of a precompleted KB
5.3 Knowledge base satisfiability
6 Querying Aboxes
6.1 Introduction
6.2 Conjunctive Queries in DL
6.2.1 Queries with multiple terms
6.2.2 Queries with variables
6.2.3 Extending the framework
6.3 Relation with other works
7 Answering conjunctive queries in $\mathcal{SH}f$
7.1 Query language
7.1.1 Syntax
7.1.2 Semantics
7.2 Completeness of quasi transitive shrub interpretations
7.3 Answering boolean queries
7.3.1 Query normal form
7.3.2 Conjunctive queries
7.3.3 Disjunction of conjunctive queries
7.4 Answering queries in normal form
7.4.1 Query transformation rules
7.4.2 Notes on transformation rules
7.4.3 Limitation of the technique
7.4.4 Example of query answering
7.4.5 Termination
7.4.6 Correctness and completeness
7.5 Speeding up the answer
8 Implementation and testing
8.1 Description of the system
8.2 Optimising the algorithm
8.2.1 Evaluation strategy
8.2.2 Axiom absorption and lazy expansion
8.2.3 Lexical normalisation
8.2.4 Backjumping
8.2.5 Abox partitioning
8.2.6 Using the terminological reasoner
8.2.7 Other techniques
8.3 Experiments
8.3.1 DL benchmark suite
8.3.2 Measurements
8.3.3 Notes on results
9 Conclusions
9.1 Thesis contributions
9.1.1 KB satisfiability
9.1.2 Query answering
9.2 Future work
Bibliography
Abstract
Description Logics (DLs) are a family of formal languages for describing complex structured classes. These languages contain boolean operators and quantification over class attributes, as well as the specification of elements of the classes and their properties. DL knowledge bases (KBs) consist of a terminological part which describes the general organisation of the classes, and an assertional part for describing the properties of individuals. In spite of its inherent intractability, practical algorithms have recently been developed for purely terminological reasoning; when individuals are introduced there is still a lack of results.
This thesis constitutes an advance in the direction of the development of DL systems providing efficient and powerful reasoning in the presence of individuals. We choose an expressive DL ($\mathcal{SH}f$), which has been previously studied from the terminological perspective (i.e. without individuals), and we investigate the problem of reasoning with individuals.
$\mathcal{SH}f$ extends the standard DL $\mathcal{ALC}$ with transitive roles, role hierarchy, and attributes. This extension enables the representation of general inclusion axioms using a technique called internalisation. This thesis investigates two automated reasoning problems: KB satisfiability and query answering.
KB satisfiability is the fundamental problem of deciding whether a given KB does not contain any contradiction. This problem is not only important for delivering consistent KBs, but most of the reasoning tasks for DL systems can be reduced to KB satisfiability. We show how the technique of precompletion, which has previously only been used in less expressive DLs, can be used in our expressive DL.
A serious shortcoming of many Description Logic based knowledge representation systems is the inadequacy of their query languages. We present a novel technique that can be used to provide an expressive query language for such systems. One of the main advantages of this approach is that, being based on a reduction to knowledge base satisfiability, it can easily be adapted to most existing (and future) Description Logic implementations. We believe that providing Description Logic systems with an expressive query language for interrogating the knowledge base will significantly increase their utility.
Declaration
No portion of the work referred to in this thesis has been
submitted in support of an application for another degree or
qualification of this or any other university or other institution
of learning.
Copyright
Copyright in text of this thesis rests with the Author. Copies (by any process) either
in full, or of extracts, may be made only in accordance with instructions given by the
Author and lodged in the John Rylands University Library of Manchester. Details may
be obtained from the Librarian. This page must form part of any such copies made.
Further copies (by any process) of copies made in accordance with such instructions
may not be made without the permission (in writing) of the Author.
The ownership of any intellectual property rights which may be described in this
thesis is vested in the University of Manchester, subject to any prior agreement to the
contrary, and may not be made available for use by third parties without the written
permission of the University, which will prescribe the terms and conditions of any
such agreement.
Further information on the conditions under which disclosures and exploitation
may take place is available from the head of Department of Computer Science.
Acknowledgements
I would like to thank all the people who helped and inspired me during the period of my PhD: my supervisors (in order of appearance) Carole Goble, Graham Gough, and Ian Horrocks, the present and past members of Information Management Group, and several other people in the Department of Computer Science (the list would be too long).
I am particularly grateful to Ian Horrocks and Graham Gough for their support, both scientific and human. Moreover, without their tireless proofreading this thesis would have been hardly readable.
I would like to mention Enrico Franconi as the principal cause for my involvement in the DL community, and to thank him for his advice.
This work has been supported by a grant from the Engineering and Physical Sciences Research Council.
Chapter 1
Introduction
Description logics are knowledge representation systems evolved from early semantic
networks and frame systems. They have been developed as a logical reconstruction
of the KL-ONE–like knowledge representation systems (see Woods and Schmolze
[1992]). They differ from their ancestors in that they place a strong emphasis on a
precise semantic characterisation of the modelling language. The need for an unequivocally
defined semantics stems from the fact that having a clear semantics enables the
description and justification of the automated deduction processes.
Description logics have proved effective in several application fields including
automated language understanding and generation, configuration and data modelling
(see Franconi [1994], Calvanese et al. [1998b]). In addition, their formal semantics
enables the exact characterisation of the expressiveness of a DL system against its
computational properties.
Although good results have recently been obtained in the development of optimised
algorithms for DL knowledge bases without individuals, little progress has been made
with reasoning involving individuals. This research is an attempt to fill this gap, by
exploring the possibility of developing optimised algorithms.
A serious shortcoming of many Description Logic based knowledge representation
systems is the inadequacy of their query language. In this thesis we present a novel
technique that can be used to provide an expressive query language for such systems.
This chapter presents an overview of the background of the thesis, together with an
overall description of the document.
1.1 Background
1.1.1 Logic based knowledge representation
This thesis concerns knowledge representation systems that represent the domain using
a formal language. In these systems, a precise semantics unambiguously defines the
meaning of a knowledge base written using the language. In addition, the knowledge
representation system provides a set of automated reasoning services to manipulate
and/or interrogate knowledge bases.
In particular it is assumed that the knowledge can be partitioned into terminological
and assertional parts. The terminological part (or simply terminology) describes the
structural properties of the terms of the domain, while the assertional part depicts a
particular configuration of the domain by introducing individuals and asserting their
properties using the definitions in the terminology.
Example 1.1
Terminology: Birds are Animals.
             Penguins are Birds and not Flyers.
Assertions:  tweety is-a Bird and not Flyer and
             has FRIEND which-is Flyer.
This example shows a simple knowledge base written using a natural language–
like syntax. The terminological part defines the properties of Birds and Penguins,
while the assertional part states some properties of a particular individual (tweety).
This individual is described as a bird which cannot fly, having at least one friend which
is a flyer.
Partitioning knowledge in this way is quite a common methodology in computer
science. For instance, in the database setting the distinction between schemata and an
actual database could be seen as the dichotomy between the terminology and the as-
sertional part. Generally speaking there are two main reasons to adopt the partitioning
technique; firstly, the same terminology could be used for several knowledge bases,
and secondly, different algorithms can be used on each part, exploiting the different
characteristics of the two components.
Two other important features of knowledge representation systems are the possi-
bility of adopting an open world semantics for the knowledge base (see Reiter [1977]),
as well as the ability to represent incomplete information.
The adoption of an open world semantics leaves a degree of uncertainty regarding
facts that are not explicitly asserted. For example, individuals not mentioned in the
knowledge base could be part of the domain and each individual mentioned could have
more properties than the asserted ones. Referring to Example 1.1, tweety could
either be a penguin or not, and there could also be additional birds or animals in that
little world.
Incomplete information can appear in different forms. Firstly, the implied existence
of individuals without explicitly mentioning them: the knowledge base in Example 1.1
implies the existence of at least one flyer which is a friend of tweety. Secondly, in
the form of disjunctive information, for example it could be specified that tweety’s
friend should be either an auk or a puffin.
Description logics knowledge representation systems take advantage of the above
properties. The next section gives an overview of the main features of this Description
Logic formalism, while a more formal description is provided in Chapter 2.
1.1.2 Description logics
There are several ways to give an intuition about what a description logic is. For a
knowledge representation audience, DLs could be viewed as a logical reconstruction
of frame systems and semantic networks. An object oriented database audience can
look at DLs as the static (declarative) part of a data definition language, while a lo-
gician could look at them as a syntactic variant of a multi-modal modal logic. Given
this intuition, dipping into the literature reveals a strong emphasis on the formal logic
nature of DLs (see KRSS). In fact a DL consists of a formal language, described by a
clear set theoretical semantics, together with a calculus (see Chapter 2).
Language expressions of a DL describe classes (concepts) of individuals that share
some properties. Properties can also be specified by means of relations (roles) between
individuals. The language is compositional, i.e. the concept descriptions are built by
combining different subexpressions using constructors. For instance, if the concepts
STAFF and TECHNICIAN are defined, the notion of technical staff can be represented
by putting together the two concepts in the expression $\mathsf{STAFF} \sqcap \mathsf{TECHNICIAN}$. The kinds
of expression that can be built using a DL language include
$\mathsf{Bird} \sqcap \neg\mathsf{Flyer} \sqcap (\exists \mathsf{FRIEND}.\mathsf{Flyer})$
$\mathsf{Vehicle} \sqcap (\exists \mathsf{PART\text{-}OF}.\mathsf{Engine}) \sqcap (\ldots\ \mathsf{PART\text{-}OF}.\mathsf{Wheel})$
where elements in typewriter style are terms of the language (concepts in small letters
and roles in capital letters), while symbols such as $\sqcap$, $\neg$ and $\exists$ are language constructors.¹
The semantics of the language is given in a set theoretical way, over a domain which
is a set of elements. A concept expression corresponds to a subset of the domain, while
a role expression corresponds to a binary relation over the domain. As their shape
suggests, some of these constructors have a strong relationship with boolean operators
and logical quantifiers.
A DL knowledge base consists of a terminology, traditionally called the Tbox,
and an assertional part, called Abox. The Tbox contains the definitions of the terms
(concept definitions), while the Abox contains a set of membership and role assertions.
Membership assertions relate an individual to a concept, stating that the individual
involved is an instance of the concept. Role assertions state that two individuals are
linked by a given role. Example 1.1 could be expressed by the DL knowledge base in
Example 1.2.
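Example 1.2
Tbox: $\mathsf{Bird} \sqsubseteq \mathsf{Animal}$
      $\mathsf{Penguin} \sqsubseteq \mathsf{Bird} \sqcap \neg\mathsf{Flyer}$
Abox: $\mathsf{tweety} : (\mathsf{Bird} \sqcap \neg\mathsf{Flyer} \sqcap (\exists \mathsf{FRIEND}.\mathsf{Flyer}))$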
Reasoning with DL knowledge bases is a deduction process which extracts not only
the facts explicitly asserted in a knowledge base, but also their logical consequences.
For instance, the fact that tweety is an Animal can be derived from the knowledge base of
Example 1.2. When only the terminology is involved in a deduction, the
reasoning is said to be terminological; otherwise it is said to be hybrid.
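As a rough illustration of this deduction process, the sketch below brute-forces a propositional simplification of Examples 1.1 and 1.2. It is not the reasoning procedure studied in this thesis: the FRIEND role is ignored, only the single individual tweety is considered, and the names `ATOMS`, `TBOX`, `ABOX`, `models` and `entails` are hypothetical, introduced only for this example.

```python
from itertools import product

ATOMS = ["Bird", "Animal", "Penguin", "Flyer"]

# Terminology of Example 1.1, read as constraints on one individual.
TBOX = [
    # Birds are Animals.
    lambda v: (not v["Bird"]) or v["Animal"],
    # Penguins are Birds and not Flyers.
    lambda v: (not v["Penguin"]) or (v["Bird"] and not v["Flyer"]),
]

# Assertion about tweety; the FRIEND part is dropped (simplifying assumption).
ABOX = [lambda v: v["Bird"] and not v["Flyer"]]


def models():
    """Yield every truth assignment for tweety that satisfies Tbox and Abox."""
    for values in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(axiom(v) for axiom in TBOX + ABOX):
            yield v


def entails(query):
    """A fact is a logical consequence if it holds in every model."""
    return all(query(v) for v in models())


is_animal = entails(lambda v: v["Animal"])
is_penguin = entails(lambda v: v["Penguin"])
is_not_penguin = entails(lambda v: not v["Penguin"])
print(is_animal, is_penguin, is_not_penguin)
```

Under these assumptions the sketch reports that tweety being an Animal is a consequence of the knowledge base, while neither "tweety is a Penguin" nor its negation is: both possibilities remain open, which matches the open world reading discussed in Section 1.1.1.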
DL systems provide automated reasoning services for interacting with the knowledge
base. Typical reasoning services provided are: verifying the non–contradiction of
assertions in the knowledge base, checking that an individual satisfies a concept expression,
retrieval of all the individuals satisfying a concept expression, and subsumption
checking, which verifies whether one expression is more general than another.² Nevertheless,
it has been proved that all these services can be reduced in polynomial time to the
problem of knowledge base satisfiability (see Schaerf [1994]), that is, checking for
the absence of contradicting assertions in a given knowledge base.
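The usual shape of these reductions can be sketched as follows. The notation is assumed for this illustration only: $\Sigma$ stands for a knowledge base, $a$ for an individual, $C$ and $D$ for concept expressions, and $b$ for a fresh individual name not occurring in $\Sigma$; the sketch also presupposes a language in which arbitrary concepts can be negated.

```latex
\begin{align*}
\Sigma \models a : C
  &\iff \Sigma \cup \{\, a : \neg C \,\} \text{ is unsatisfiable} \\
\Sigma \models C \sqsubseteq D
  &\iff \Sigma \cup \{\, b : (C \sqcap \neg D) \,\} \text{ is unsatisfiable}
\end{align*}
```

Retrieval can then be obtained by repeating the first test for each individual name occurring in the knowledge base.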
¹ The meaning of the different elements of the language will be clarified in Chapter 2.
² An expression is more general than another if its corresponding set always includes that of the
second concept.
The term complexity of reasoning in this context refers to the complexity of the
knowledge base satisfiability problem, and it characterises the computational proper-
ties of a DL system. The complexity actually depends on the expressiveness of the
underlying DL language. Moreover, the language (and its expressiveness) is defined
by the kinds of constructors it provides. Therefore the complexity is determined by the
constructors provided by the language.
One example of the influence of the constructors on the computational properties
is the interaction of the role conjunction constructor with the number restriction constructor.
The role conjunction constructor builds a new role expression from the intersection
of atomic roles, while the number restriction constructor constrains the number of
individuals related by a given role. These two kinds of constructor do not cause any
problem separately, but when they are taken together (i.e. allowing role conjunction in
the role of the number restriction) they cause the undecidability of the KB satisfiability
problem (see Baader and Sattler [1996]).
Different applications need different language constructors, and the modularity of
the language facilitates the addition of new constructors. The result is that there is not
a single DL language, but a family of different languages. Considerable effort has been spent
in extending the expressiveness of DL languages, while maintaining the soundness and
decidability of the reasoning services (see for example Buchheit et al. [1993]).
Nowadays the undecidability boundaries and the complexity of the reasoning are
well understood on the theoretical side (see Donini et al. [1997]). However, there is still
a lot of work to do on the definition of “practical” algorithms, and the analysis of the interaction
between the different constructors is still an active topic in the DL community
(see Franconi et al. [1998a]).
1.2 Reasoning with DL knowledge bases
Description logics have proved to be effective in various applications, both those that
require an assertional part and those that do not. In fact, while DLs were initially
developed to be the knowledge representation part of an AI system, recent work has
shown that their ability to reason in an abstract way about a domain³ could be extremely
useful for knowledge modelling tasks. For example, DLs could be used as
tools for checking the consistency of entity–relationship or object oriented schemata
(see Calvanese et al. [1998b]).
³ Without the necessity to populate the domain with individuals.
Recent years have seen significant advances in the design of sound and complete
reasoning algorithms for DLs with both expressive logical languages and unrestricted
Tboxes, i.e., those allowing arbitrary concept inclusion axioms (see Baader [1991],
Horrocks and Sattler [1999], De Giacomo and Massacci [1998]). Moreover, systems
using highly optimised implementations of (some of) these algorithms have also been
developed, and have been shown to work well in realistic applications (see Horrocks
[1998], Patel-Schneider [1998]). While most of these have been restricted to terminological
reasoning (i.e., the Abox is assumed to be empty), attention is now turning to
the development of both algorithms and (optimised) implementations that also support
Abox reasoning (see Haarslev and Moller [1999], Tessaris and Gough [1999]).
This research starts with the premise that there are still several key applications
for a DL hybrid system. Generally speaking, applications that require the ability to
state incomplete knowledge about the individuals in the domain are good candidates
for the use of a hybrid DL; one example is natural language processing (see Franconi
[1994]). The optimisations recently applied to Tbox algorithms (see Horrocks and
Patel-Schneider [1998a]) give pointers towards the improvement or redesign of current
Abox algorithms in order to enhance performance in realistic applications.
In this thesis we discuss techniques and algorithms used to deal with two essential
reasoning tasks: knowledge base satisfiability and query answering. We focus on a
particular DL language, which is implemented in the terminological system⁴ FaCT
(see Horrocks [1998]). This DL, called $\mathcal{SH}f$,⁵ possesses most of the relevant features
of an expressive DL language. In addition, experience with the FaCT system has shown
that $\mathcal{SH}f$ is well suited for knowledge representation modelling.
Several techniques have been proposed for the verification of the satisfiability of a
KB, and many of them have led to the development of DL systems. However, some
of these techniques have been developed for DLs less expressive than is required for a
modern system. We investigate the extension of one of these techniques, the so-called
precompletion technique (see Hollunder [1996]), for reasoning with $\mathcal{SH}f$.
Although modern DL systems provide sound and complete Abox reasoning for
very expressive logics, their utility is limited compared to earlier DL systems by their
very weak Abox query languages. Typically, these only support instantiation (is an
individual $a$ an instance of a concept $C$), realisation (what are the most specific concepts
$a$ is an instance of) and retrieval (which individuals are instances of $C$). This is in
contrast to a system such as Loom where a full first order query language is provided,
although based on incomplete reasoning algorithms (see MacGregor and Brill [1992]).
⁴ DL system without the Abox.
⁵ $\mathcal{SH}f$ has been previously called $\mathcal{ALCH}f_{R^+}$, but the new name follows a more compact naming
scheme recently introduced in Horrocks et al. [1999b].
We present a technique for answering conjunctive queries over DL KBs. This
work has been inspired by the use of Abox reasoning to decide conjunctive query
containment (see Horrocks et al. [1999a], Calvanese et al. [1998a]). However, the
characteristics of $\mathcal{SH}f$ force the use of an algorithm which is much more complicated
than those presented in these previous works.
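To give an intuition of what a conjunctive query asks for, the following sketch evaluates a query naively against the explicitly asserted facts of a small Abox. It is only an illustration and not the technique of this thesis, which is based on a reduction to KB satisfiability: the sketch ignores the Tbox and any individuals whose existence is merely implied. The individual `woody`, the sets `CONCEPTS`, `ROLES`, `INDIVIDUALS` and the function `answers` are all hypothetical.

```python
from itertools import product

# Hypothetical Abox containing explicit assertions only.
CONCEPTS = {("Bird", "tweety"), ("Flyer", "woody")}
ROLES = {("FRIEND", "tweety", "woody")}
INDIVIDUALS = {"tweety", "woody"}

# Query: Bird(x) and FRIEND(x, y) and Flyer(y); variables start with "?".
QUERY = [("Bird", "?x"), ("FRIEND", "?x", "?y"), ("Flyer", "?y")]


def answers(query):
    """Return every variable binding that makes all query atoms asserted facts."""
    variables = sorted(
        {t for atom in query for t in atom[1:] if t.startswith("?")}
    )
    found = []
    for values in product(sorted(INDIVIDUALS), repeat=len(variables)):
        binding = dict(zip(variables, values))
        # Substitute the variables of each atom with the candidate individuals.
        ground = [(a[0], *[binding.get(t, t) for t in a[1:]]) for a in query]
        if all((g in CONCEPTS) if len(g) == 2 else (g in ROLES) for g in ground):
            found.append(binding)
    return found


print(answers(QUERY))
```

With the knowledge base of Example 1.1 the flying friend of tweety is not named at all, so a lookup of this kind would miss it; handling such implied individuals is one reason why a more elaborate algorithm is needed.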
Our main contributions can be summarised in the following points.
• The extension of the already known precompletion technique for KB satisfiability
to the DL language $\mathcal{SH}f$. Previously the technique was applied to DLs
without general inclusion axioms, transitive roles, or role hierarchies.
• A query answering algorithm for disjunctions of conjunctive queries for $\mathcal{SH}f$
KBs with Aboxes.
• A prototype software system performing KB satisfiability for $\mathcal{SH}f$ has been
developed. The system has been used for verifying the practical feasibility of the
precompletion technique, and the impact of different optimisations on the
performance of the system.
• A characterisation of the class of models of the logic $\mathcal{SH}f$. This was previously
done (implicitly) for the problem of KB satisfiability, but to our knowledge, it is
the first result for query answering.
The next section gives an overview of the structure of this thesis.
1.3 Thesis outline
The thesis contains both introductory and technical chapters, in particular the reader
can get an overview of the work presented by reading the even numbered chapters only.
Chapter 2 gives a formal description of Description Logics in general, and in
particular of the DL $\mathcal{SH}f$, which is used in the following chapters.
Chapter 3 describes the class of interpretations which characterise the logic $\mathcal{SH}f$.
The results from this chapter are used in the proofs presented in Chapter 5 and Chapter 7.
This model–theoretic characterisation of $\mathcal{SH}f$ is not only essential for most of
the proofs, but it also provides an insight into the expressiveness of $\mathcal{SH}f$.
Chapter 4 and Chapter 5 investigate the problem of Knowledge Base satisfiability
for $\mathcal{SH}f$. The first gives an overview of the methods which have been used for tackling
the problem, and provides an introduction to the technique we investigated. Chapter 5
provides a formal description of the precompletion algorithm, and shows the proofs
for its correctness and completeness.
Chapter 6 and Chapter 7 introduce and provide a solution to the problem of answering
conjunctive queries over $\mathcal{SH}f$ knowledge bases. The first chapter gives an intuitive
presentation of the problem and our solution, while Chapter 7 provides the full details.
Chapter 8 describes the experiments we performed with an implementation of the
algorithm for KB satisfiability. Finally, Chapter 9 draws conclusions from this work
and presents further lines of research suggested by the results achieved so far.
Chapter 2
Preliminaries
In this chapter Description Logics (DLs) are formally introduced, together with a
description of what we mean by the term DL based system. In Section 2.1 we present
the syntax and semantics of the logics belonging to the class of DLs, followed by a
description of DL based systems. Section 2.2 introduces common reasoning problems
over DL knowledge bases (KBs), and the automation of the reasoning itself. Section 2.3
describes an extension of DL KBs with the introduction of axioms on roles.
Finally, we describe the DL language we investigate, which will be used throughout
this thesis.
2.1 Description Logics
The core of Description Logics is the concept language, a formal language designed to
describe classes and relationships between elements of the classes. DL based knowledge
bases are built using concept language expressions, and they are usually divided
into two distinct parts: intensional and extensional. The intensional part describes the
general schema of the classes and relationships, while the extensional part constitutes
a (partial) instantiation of the schema, since it contains assertions about a set of individuals.
Historically the intensional part takes the name of terminology or Tbox, and
the extensional part the name of assertional part or Abox. Queries on a DL knowledge
base are answered by an inferential process involving both the Tbox and Abox
parts. The answer to a query is deduced as a logical consequence of the content of the
knowledge base according to the formal semantics.
2.1.1 Syntax and Semantics
All description logic systems are based on a common family of languages, called con-
cept languages, for describing structured classes of objects. The foundations of con-
cept languages are concepts and roles: a concept represents a class of objects sharing
some common characteristics, while a role represents a binary relation between objects
or, in other words, attributes attached to objects.
The language is completely described by a formal syntax and a Tarski-like semantics.
A formal definition of the language is essential for the characterisation of knowledge
bases, and for the definition of reasoning services.
A concept language is composed of an alphabet of distinct concept names (�� ),
role names (�� ) and individual names ( ); together with a set of constructors for
building concept and role expressions. Concept expressions (or simply concepts) describe
subsets of the domain; for example $\mathsf{Motorcycles} \sqcup \mathsf{Cars}$ denotes the union
of the elements in Motorcycles and in Cars. Describing the abstract syntax, concept
names are indicated by letter $A$ or $B$, role names by $P$, and individual names by
lowercase letters $a$, $b$. Concept expressions are indicated by letters $C$ or $D$, and role
expressions by $R$ or $S$.
Description logics form a family of different logics, distinguished by the set of constructors
they provide. Each language is named according to a convention introduced
in Schmidt-Schauss and Smolka [1991]: each constructor is associated with a different
capital letter (e.g. $\mathcal{U}$ with disjunction and $\mathcal{C}$ with negation) and the name of a language is
composed of the prefix $\mathcal{AL}$ (acronym for attributive language) followed by the letters
corresponding to the constructors in the language (e.g. $\mathcal{ALU}$ or $\mathcal{ALC}$). One of the reasons
for having such a fine granularity in the classification of concept languages is that
the computational properties of the reasoning change dramatically with the presence
or absence of a particular constructor; therefore the ability to indicate precisely the
constructors which are in a DL is very useful.
The very basic language, denoted by the prefix $\mathcal{AL}$, is close to the expressivity
of frame based representation systems. It enables the specification of hierarchies of
concepts by means of the conjunction of two concepts ($\sqcap$). The expression $C \sqcap D$
denotes the intersection of the elements in $C$ and in $D$. The hierarchy comes from the
fact that $C \sqcap D$ is more specific than both $C$ and $D$, because it denotes a smaller set of
elements. A second group of constructors in $\mathcal{AL}$ enables the specification of attributes
by the unqualified existential ($\exists R$) and qualified universal ($\forall R.C$) quantifiers. These
expressions build new concepts in terms of the roles: expression $\exists R$ denotes the set
of elements related to some other element by the role $R$, while $\forall R.C$ denotes the set
of elements which are related by the role $R$ exclusively to elements of the concept
$C$. Using frame systems nomenclature, the existential quantifier represents the fact
of having attribute $R$, while the universal quantifier specifies the type restriction of
that attribute. In addition, the concept language $\mathcal{AL}$ has the ability to represent the
complement of concept names by the expression $\neg A$,¹ and it provides the symbols $\top$
and $\bot$ for the full domain and the empty set respectively.
The semantics of DL constructors is defined in terms of an interpretation I =
(Δ^I, ·^I) consisting of a nonempty domain Δ^I and an interpretation function ·^I. The
interpretation function maps concept names into subsets of the domain, role names into
subsets of the cartesian product of the domain (Δ^I × Δ^I), and individual names into
elements of the domain. The only restriction on the interpretations is the so called unique
name assumption, which imposes that different individual names must be mapped into
distinct elements of the domain. Given a concept name A (role name P) the set denoted
by A^I (P^I) is called the interpretation, or extension, of A (P) w.r.t. I.
Example 2.1
Consider the concept names Even and Odd, the role SUCC, and the individuals
one, two, three and four. A possible interpretation is composed of the natural
numbers Δ^I = {1, 2, 3, 4, 5} and the interpretation function defined by:
Odd^I = {1, 3, 5}
Even^I = {2, 4}
SUCC^I = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)}
one^I = 1
two^I = 2
three^I = 3
four^I = 4
Note that at this stage there is no way to say that an interpretation is correct or
wrong. In this example the interpretation (Δ^I, ·^I) “looks correct” because of the
natural language semantics we associate with the names, but a different interpretation where
the role SUCC is mapped to ���� ��� ��� ��� would be a valid interpretation as well.
¹ But not the negation of a general concept expression.
Syntax           Semantics                                        Description
A                A^I ⊆ Δ^I                                        concept name
⊤                Δ^I                                              top
⊥                ∅                                                bottom
C ⊓ D            C^I ∩ D^I                                        conjunction
∀R.C             {x ∈ Δ^I | ∀y. (x, y) ∈ R^I → y ∈ C^I}           universal quantification
∃R.C             {x ∈ Δ^I | ∃y. (x, y) ∈ R^I ∧ y ∈ C^I}           existential quantification (E)
¬C               Δ^I \ C^I                                        general negation (C)
C ⊔ D            C^I ∪ D^I                                        disjunction (U)
(≥ n R)          {x ∈ Δ^I | #{y | (x, y) ∈ R^I} ≥ n}              number restriction (N)
(≤ n R)          {x ∈ Δ^I | #{y | (x, y) ∈ R^I} ≤ n}
(≥ n R.C)        {x ∈ Δ^I | #{y | (x, y) ∈ R^I ∧ y ∈ C^I} ≥ n}    qualified number restriction (Q)
(≤ n R.C)        {x ∈ Δ^I | #{y | (x, y) ∈ R^I ∧ y ∈ C^I} ≤ n}
{a_1, ..., a_n}  {a_1^I, ..., a_n^I}                              one-of (O)
Table 2.1: Syntax and Semantics of concept expression constructors.
Syntax   Semantics                              Description
P        P^I ⊆ Δ^I × Δ^I                        role name
R ⊓ S    R^I ∩ S^I                              role conjunction (R)
R ⊔ S    R^I ∪ S^I                              role disjunction
R⁻       {(x, y) ∈ Δ^I × Δ^I | (y, x) ∈ R^I}    converse role (I)
R ∘ S    R^I ∘ S^I                              role composition
id(C)    {(x, x) | x ∈ C^I}                     test
R*       (R^I)*                                 reflexive transitive closure
Here (R^I)* is the union of (R^I)^n for all n ≥ 0, where (R^I)^0 = {(x, x) | x ∈ Δ^I},
and (R^I)^(n+1) = (R^I)^n ∘ R^I.
Table 2.2: Syntax and Semantics of role expression constructors.
Interpretations can be naturally seen as directed labeled graphs; the nodes are the
elements of the interpretation domain, while the interpretation function provides both
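the labeled edges and the set of node labels. For example, the interpretation I of
Example 2.1 is represented by the graph
[Figure: directed graph whose nodes are the elements of the domain, labelled {Odd} or
{Even}, connected by edges labelled {SUCC}; dotted lines link the individual names one,
two, three and four to their nodes.]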
Note that the node and edge labels are sets of concept and role names respectively,
because a single element of the domain can be in the extension of more than one con-
cept name (and analogously a pair of elements in the extension of several roles). In the
graph we indicated the mapping for the individual names with the dotted lines, but in
general this will be omitted when we are interested in the structure of the interpretation
only.
Expressions are interpreted according to the semantics given in Table 2.1 and
Table 2.2; for instance, with respect to Example 2.1 the expression (∃SUCC.Even)^I
is interpreted as the set {1, 3}. The interpretation of a concept expression can be the
empty set as well; for instance, with respect to the interpretation of Example 2.1 the
set (Even ⊓ (∃SUCC.Even))^I is empty.
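The interpretation of expressions over a finite interpretation can be sketched in a few lines of Python. The nested-tuple encoding of concept expressions, the function name extension and the concrete five-element interpretation are assumptions made for this illustration (the interpretation is chosen in the spirit of Example 2.1); only the constructors ⊓, ⊔, ¬, ∃R.C and ∀R.C of Table 2.1 are covered.

```python
# Minimal sketch: evaluate concept expressions over a finite interpretation.
# Expressions are nested tuples, e.g. ("some", "SUCC", "Even") for ∃SUCC.Even.
# This encoding is an assumption of the sketch, not notation from the thesis.

def extension(expr, domain, concepts, roles):
    """Return the extension of a concept expression as a set of domain elements."""
    if isinstance(expr, str):
        if expr == "TOP":
            return set(domain)
        if expr == "BOTTOM":
            return set()
        return set(concepts[expr])  # concept name
    op = expr[0]
    if op == "not":
        return set(domain) - extension(expr[1], domain, concepts, roles)
    if op == "and":
        return (extension(expr[1], domain, concepts, roles)
                & extension(expr[2], domain, concepts, roles))
    if op == "or":
        return (extension(expr[1], domain, concepts, roles)
                | extension(expr[2], domain, concepts, roles))
    if op in ("some", "all"):
        role = roles[expr[1]]
        filler = extension(expr[2], domain, concepts, roles)
        if op == "some":
            # elements with at least one role successor in the filler
            return {x for (x, y) in role if y in filler}
        # elements all of whose role successors are in the filler
        return {x for x in domain
                if all(y in filler for (z, y) in role if z == x)}
    raise ValueError(f"unknown constructor: {op!r}")


# Assumed interpretation, in the spirit of Example 2.1.
domain = {1, 2, 3, 4, 5}
concepts = {"Odd": {1, 3, 5}, "Even": {2, 4}}
roles = {"SUCC": {(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)}}

has_even_successor = extension(("some", "SUCC", "Even"), domain, concepts, roles)
even_with_even_successor = extension(
    ("and", "Even", ("some", "SUCC", "Even")), domain, concepts, roles)
```

Under these assumptions has_even_successor is the set {1, 3} and even_with_even_successor is empty, which agrees with the two cases discussed above.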
An interpretation is said to be a model for a concept expression if the extension of
that expression is nonempty. The existence of a model for a concept expression defines
the satisfiability of the concept itself; i.e. a concept expression C is satisfiable if and
only if there exists an interpretation I such that C^I is nonempty.
Different concept expressions can be compared with respect to their models. Two
concepts are equivalent if and only if their extensions are equal with respect to every
interpretation. In a similar way, a concept is subsumed by another if the extension
of the first one is included in that of the second with respect to every interpretation.
Formally, concept C is subsumed by concept D (written as C ⊑ D) if and only if for
any interpretation I the inclusion C^I ⊆ D^I holds.
2.1.2 DL based systems
Traditionally, the knowledge base of a DL-based system is composed of two distinct
parts: the intensional (Tbox) and the extensional (Abox) components.
Terminology
The intensional part describes the relation between concept and role expressions. It
can be seen under two different lights: as a collection of definitions for role and concept
names, or as a set of axioms that restrict the models for the knowledge base. The Tbox
is composed of a set of statements of the forms:
C ≐ D (2.1)
C ⊑ D (2.2)
The first statement asserts that the concept expressions C and D are equivalent,
while the second asserts that concept expression C is more specific than (or included in)
expression D. When the left hand side of the statement is a concept name (A ≐ C or
A ⊑ C), the statement is said to be definitional, because it defines the characteristics
of the concept name. Examples of general statements are �Odd � Even� � and
� �� �SUCC�, while Integer�� �Odd � Even� � �� �SUCC� is a definition.
The semantics of Tbox statements is given in terms of interpretations analogously
to the subsumption and equivalence seen in Section 2.1.1. An interpretation I satisfies
(is a model of) the statement C ≐ D if and only if C^I = D^I, and it satisfies C ⊑ D if and
only if C^I ⊆ D^I. The generalisation to a full Tbox is straightforward: an interpretation
is a model for a Tbox if it satisfies all the statements contained in the Tbox.
DL systems are also characterised by the kind of statements they allow on the
Tbox. Systems with free Tboxes allow any kind of assertions, others restrict to the
definitional statements or to acyclic assertions only. A terminological cycle in a Tbox
is a recursive definition, or one or more mutually recursive definitions. Examples of
recursive definitions are Human �OFFSPRINGS�Human or the pair of definitions�Odd
�� ��Even � �SUCC�Even��
Even �SUCC�Odd
��
Allowing terminological cycles in the Tbox raises several problems, of both a
semantic and a computational nature (see Nebel [1991]). For this reason, most of the early
DL systems do not allow cycles in the terminology (see Borgida and Patel-Schneider
[1994], Baader and Hollunder [1991b]). On the other hand, recursive definitions are
ubiquitous in computer science (e.g. data structures), and a very natural way to model
application domains (e.g. the Human definition above). For these reasons recursive
definitions have been widely studied (see Nebel [1991], De Giacomo and Lenzerini
[1997]), and all modern systems provide unrestricted concept inclusion assertions.
The first problem for cyclic definitions is providing a semantics; the different
proposed semantics are either fixed point based or descriptive. Under the descriptive
semantics, all the interpretations satisfying the Tbox statements are models for the Tbox
itself, while under the fixed point semantics interpretations are filtered according to a
fixed point operator (see Nebel [1991], De Giacomo and Lenzerini [1997] for more
details). The descriptive semantics is adopted here because of its wide acceptance as
the most appropriate one (see Donini et al. [1996b]).
We distinguish three different types of terminologies: definitional acyclic,
definitional, and free. The latter (free) type is the most general one: every kind of
terminological axiom of the form C ⊑ D can be put in the terminology.² The definitional
approach restricts the form of the axioms in such a way that only concept names are
allowed on the left hand side of an axiom, and each name can appear only once in a
left hand side. Under the definitional restriction the axioms are called definitions.³ The
most restrictive approach imposes that the definitions must be acyclic.
Many systems take the acyclic definitional approach because it is the simplest case,
and it exhibits good computational properties. In addition, it can be shown that a
knowledge base containing only acyclic definitions can be transformed into an equivalent
one with an empty terminology.
The transformation proceeds in two phases. First, all the inclusion definitions
of the form A ⊑ C are transformed into equivalent equality definitions A ≐ C ⊓ A′,
introducing a new concept name A′ for each inclusion. Then, all the defined concept
names in concept expressions are substituted with their definitions. After the
transformation the Tbox can be ignored and all the reasoning can be carried out with the fully
expanded expressions.
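The two phases can be sketched as follows. This is an illustrative sketch only: the tuple encoding of concepts, the function names normalise and unfold, the "_prim" suffix used for the fresh concept names and the sample Tbox are all assumptions of the example. The sketch assumes an acyclic definitional Tbox; on a cyclic one the recursion would not terminate.

```python
# Minimal sketch of Tbox unfolding for an acyclic definitional terminology.
# Concepts are nested tuples such as ("and", C, D) or ("some", "ROLE", C);
# this encoding and all names below are assumptions of the sketch.

def normalise(tbox):
    """Phase 1: turn every inclusion A ⊑ C into the definition A ≐ C ⊓ A'.

    tbox is a list of (kind, name, expr) with kind "equiv" or "sub".
    The fresh name A' is built with a hypothetical "_prim" suffix.
    """
    definitions = {}
    for kind, name, expr in tbox:
        if kind == "sub":
            expr = ("and", expr, name + "_prim")
        definitions[name] = expr
    return definitions


def unfold(expr, definitions):
    """Phase 2: replace every defined concept name by its unfolded definition."""
    if isinstance(expr, str):
        if expr in definitions:
            return unfold(definitions[expr], definitions)
        return expr  # primitive concept name, TOP or BOTTOM
    op = expr[0]
    if op in ("some", "all"):
        # the second component is a role name and must not be unfolded
        return (op, expr[1], unfold(expr[2], definitions))
    return (op,) + tuple(unfold(arg, definitions) for arg in expr[1:])


# Hypothetical sample terminology.
tbox = [
    ("sub", "Woman", ("and", "Person", "Female")),
    ("equiv", "Mother", ("and", "Woman", ("some", "HAS_CHILD", "Person"))),
]
definitions = normalise(tbox)
expanded_mother = unfold("Mother", definitions)
```

After unfolding, expanded_mother mentions only primitive names (Person, Female, Woman_prim), so reasoning about it no longer needs the Tbox. The expanded expression can be exponentially larger than the original one, which is a known cost of this approach.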
² The inclusion axiom is general enough for representing definitions (C ≐ D) as well. In fact, a single
definition C ≐ D can be expressed by the pair of inclusion axioms C ⊑ D and D ⊑ C.
³ Actually, in DLs which provide the disjunctive constructor, free terminologies can always be
transformed into (possibly cyclic) definitional terminologies (see Buchheit et al. [1993]).
Abox assertions
The extensional part describes a particular configuration of a domain, introducing
individual names and their properties. Aboxes are composed of statements of the form:
a:C (2.3)
⟨a, b⟩:R (2.4)
The assertion a:C expresses the fact that the interpretation of individual a is in the
extension of a concept expression C. The role assertion ⟨a, b⟩:R states that the pair
composed of the interpretations of individuals a and b is in the interpretation of role
expression R. The semantics of an assertion is given with respect to interpretations.
An interpretation I satisfies a:C (⟨a, b⟩:R) if and only if a^I ∈ C^I ((a^I, b^I) ∈ R^I).
Analogously to the Tbox, an interpretation I is a model for an Abox if it satisfies
every assertion in the Abox.
DL knowledge bases
A DL knowledge base is a pair Σ = ⟨𝒯, 𝒜⟩, where 𝒯 is a Tbox and 𝒜 an Abox.
Given the definition of models for a Tbox and an Abox, a model for a knowledge base
Σ = ⟨𝒯, 𝒜⟩ is a model for both 𝒯 and 𝒜. The definition of models of a knowledge
base naturally introduces the concept of logical implication: a statement α is logically
implied (or entailed) by the knowledge base Σ (written as Σ ⊨ α) if and only if α is
true in every model of �. For example the statement ��� is a logical implication of the
knowledge base � � ������� ��� �����.
Two knowledge bases are equivalent if the models for one are models for the
other and vice versa. This equivalence is a powerful tool for manipulating knowl-
edge bases: as long as the equivalence is guaranteed, a knowledge base can be modi-
fied without affecting its logical implications. For example the knowledge base � �
������ � ���� � ������ is equivalent to �� � ���
��� � ���� � ���� � �����,
and the statement ��� is logically implied by both � and � �.
2.2 Reasoning with DLs
Reasoning with DL knowledge bases is the fundamental process of discovering the
facts entailed by the knowledge base. The interaction with a DL system is performed
by a collection of reasoning services implemented employing automated reasoning
techniques. The reasoning services may differ according to the purpose of the DL
system, for example checking the truth value of boolean queries, or verifying the con-
sistency of the knowledge base.
The reasoning services provided by a DL system are formally defined in terms of
the logical consequence. Reasoning services can be classified as basic services, which
involve the checking of the truth value for a statement, and complex services. Among
complex reasoning services are tasks such as finding all the individuals being in a
concept expression, or organising the concept names appearing in the terminology in a
taxonomy. Basic services are for instance the verification of the subsumption between
two concepts or the satisfiability of a concept. The principal basic services are:
Knowledge base satisfiability written as Σ ⊭ ⊤ ≡ ⊥, is the problem of checking
whether there is at least one model for Σ.
Concept satisfiability written as Σ ⊭ C ≡ ⊥, is the problem of checking whether
there exists a model of Σ in which the extension of the concept C is not empty.
Subsumption written as Σ ⊨ C ⊑ D, is the problem of verifying that in every model
of Σ the extension of C is included in that of D.
Instance Checking written as Σ ⊨ a:C, is the problem of verifying that in every
model of Σ the interpretation of a is in the extension of C.
Reasoning services are not independent of one another; on the contrary, each can
often be reduced to another. The reducibility actually depends on the underlying concept
language: using a language closed under negation, all the basic reasoning services can
be reduced to knowledge base satisfiability (see Schaerf [1994] for the details).
The complex reasoning tasks provided vary from system to system, and are
defined on top of the basic services. The most common are classification and retrieval.
Classification consists of explicitly representing the concept taxonomy entailed by the
knowledge base. The concept taxonomy is a graph whose nodes are the concept
names that appear in the knowledge base, and whose edges represent the subsumption
relation between them. This graph can be constructed by checking the subsumption
between every pair of concept names. Retrieval (or query answering) consists of collecting
all the individuals in the knowledge base that are instances of a given concept in every
model of the knowledge base. It is formally defined by the set
{a | a is an individual name and Σ ⊨ a:C},
where C is the querying concept.
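Both complex services are thin loops over the basic ones, as the following sketch suggests. The basic services are passed in as plain Python callables; the names classify and retrieve, the callables and the "told" data standing in for a real reasoner are all hypothetical.

```python
# Minimal sketch: classification and retrieval built on top of basic services.
# The basic services are supplied as callables; the toy oracles below only
# look up hand-written facts and stand in for a real reasoner (assumption).
from itertools import permutations


def classify(concept_names, is_subsumed_by):
    """Return all pairs (C, D) of distinct concept names with C subsumed by D."""
    return {(c, d) for c, d in permutations(concept_names, 2)
            if is_subsumed_by(c, d)}


def retrieve(individuals, concept, is_instance_of):
    """Return the individuals for which instance checking succeeds."""
    return {a for a in individuals if is_instance_of(a, concept)}


# Purely illustrative data.
TOLD_SUBSUMPTIONS = {("Elephant", "Mammal"), ("Mammal", "Animal"),
                     ("Elephant", "Animal")}
TOLD_INSTANCES = {("dumbo", "Elephant"), ("dumbo", "Mammal"),
                  ("dumbo", "Animal"), ("felix", "Mammal"), ("felix", "Animal")}

taxonomy = classify(["Animal", "Mammal", "Elephant"],
                    lambda c, d: (c, d) in TOLD_SUBSUMPTIONS)
mammals = retrieve(["dumbo", "felix"], "Mammal",
                   lambda a, c: (a, c) in TOLD_INSTANCES)
```

The pairwise loop performs a quadratic number of subsumption tests; practical systems reduce this number by exploiting the structure of the taxonomy built so far, which the sketch does not attempt.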
2.2.1 Terminological reasoning
Terminological reasoning involves only the terminology (i.e. without considering Abox
assertions). In spite of the fact that all the reasoning services can be reduced to knowl-
edge base satisfiability, terminological reasoning is important because of the fact that
for most of the concept languages the terminology does not depend on the assertional
part. This independence relies on the fact that some services which require dealing
with concept expressions only (e.g. subsumption) can be performed ignoring the asser-
tions. This assumption is extremely useful; for example, it enables the verification that
an Elephant is a Mammal without considering the complete knowledge base about
wild life.
Unfortunately, not all concept languages enjoy this property; among these are the
concept languages in which individuals can be used in concept expressions (e.g. the
ones with one-of). In fact, while in general the expression ��� �� � � is not sub-
sumed by ���, with respect to a knowledge base containing the assertions ����� ����
the subsumption actually holds:
� � ����� ����� �� ��� �� � � ���
Since most of the languages enjoy the property of terminology independence, and
terminological reasoning services are fundamental for knowledge representation
systems (classification, for example), a big effort has been spent on the development of
efficient concept satisfiability algorithms (see Horrocks and Patel-Schneider [1998a]).
The concept satisfiability problem is general enough, because concept subsumption
can be reduced to the unsatisfiability of a concept expression: C ⊑ D if and only if
C ⊓ ¬D is unsatisfiable.⁴
The main idea of the concept satisfiability algorithm, presented in Baader and
Hollunder [1991a], is based on a notational variant of the first-order tableaux calculus.
Since presenting a model is a way of showing that the
concept is satisfiable, a tableau can be used to build a model of the given concept.
⁴ If C ⊓ ¬D is unsatisfiable, C intersected with ¬D is empty in each model, so C has to be included
in D. The other way around is analogous.
Example 2.2
Looking for a model of the expression A ⊓ ∃R.B means building an interpretation I in
which the expression’s extension is not empty. The process starts by “creating” an object
x in the domain Δ^I which belongs to the extension of the concept A ⊓ ∃R.B:
{x:(A ⊓ ∃R.B)}
The object x should be in both A^I and (∃R.B)^I because of the semantics of ⊓, so two more
constraints are induced:
{x:A, x:∃R.B}
The semantics of the existential constructor guides the generation of a new individual
y ∈ Δ^I (potentially different from x) such that:
{⟨x, y⟩:R, y:B}
From the set of constraints above an interpretation can be derived:
Δ^I = {x, y}
A^I = {x}
B^I = {y}
R^I = {(x, y)}
which can be easily verified as being a model for the initial concept expression
A ⊓ ∃R.B.
Example 2.2 can be generalised by the notion of constraint system as defined in
Schmidt-Schauss and Smolka [1991]. A constraint system is a set of constraints, which
are syntactic elements of the forms:
x:C concept constraint
⟨x, y⟩:R role constraint
x ≠ y inequality constraint
(2.5)
Although the syntax is similar to that of Abox assertions, there is a difference in
the meaning of the x and y items. These are not individuals but variables, so the unique
name assumption does not apply to them unless stated by an inequality constraint. The
similarity with Abox assertions is that both kinds of expressions restrict their arguments
to be in a concept or role expression.
When an inequality constraint holds between two variables in a constraint system,
these two variables are said to be separated. A constraint system can be seen as a graph
where the edges are labelled with the roles, and the nodes are the variable names; when
a variable x is connected by a role R to a variable y, y is said to be an R-successor
of x. Given a constraint system S the notation S[y/z] represents the constraint system
obtained by substituting each occurrence of variable y with variable z in the constraint
system S.
The process of looking for a model proceeds by extending (or completing) a
constraint system using a set of completion rules. These rules modify a constraint system,
adding or rewriting the constraints contained in it. For example the rule for the
existential quantification (see Example 2.2) is defined as:
S →∃ {⟨x, y⟩:R, y:C} ∪ S
if x:∃R.C is in S, y is a new variable
and there is no z s.t. both ⟨x, z⟩:R and z:C are in S.
The meaning of the rule is that the constraint system S should be expanded by
adding the constraints ⟨x, y⟩:R and y:C, if the conditions in the following lines are
verified. The set of completion rules varies according to the different concept
languages; for example the rules for ALCN are shown in Figure 2.1.
The concept satisfiability algorithm starts with the constraint system {x:C}, where
C is the concept to check, then it applies the completion rules as long as their
preconditions are satisfied. When no rule is applicable the constraint system is said to be
completed, and if there are no contradictory constraints, a model for the concept C
can be derived (see Donini et al. [1996b]). Contradictions in a constraint system are
detected by the so-called clashes, which are constraint systems containing constraints
of one of these forms:
1. {x:⊥}
2. {x:A, x:¬A}
3. {x:(≤ n R)} ∪ {⟨x, y_i⟩:R | 1 ≤ i ≤ n+1}
   ∪ {y_i ≠ y_j | 1 ≤ i < j ≤ n+1}
S →⊓ {x:C_1, x:C_2} ∪ S
    if x:(C_1 ⊓ C_2) is in S,
    and x:C_1 and x:C_2 are not both in S.
S →⊔ {x:D} ∪ S
    if x:(C_1 ⊔ C_2) is in S,
    neither x:C_1 nor x:C_2 is in S,
    and D = C_1 or D = C_2.
S →∀ {y:C} ∪ S
    if x:∀R.C is in S, ⟨x, y⟩:R is in S,
    and y:C is not in S.
S →∃ {⟨x, y⟩:R, y:C} ∪ S
    if x:∃R.C is in S, y is a new variable,
    and there is no z s.t. both ⟨x, z⟩:R and z:C are in S.
S →≥ {⟨x, y_i⟩:R | 1 ≤ i ≤ n} ∪ {y_i ≠ y_j | 1 ≤ i < j ≤ n} ∪ S
    if x:(≥ n R) is in S, y_1, ..., y_n are new variables,
    and x does not have n pairwise separated R-successors in S.
S →≤ S[y/z]
    if x:(≤ n R) is in S, x has more than n R-successors in S,
    ⟨x, y⟩:R and ⟨x, z⟩:R are in S,
    and y ≠ z is not in S.
Figure 2.1: Completion rules for ALCN
Not all the completion rules are deterministically applicable (i.e. applicable in only
one way). For example in the disjunction rule →⊔ one of the two concepts is
nondeterministically chosen to be inserted in the augmented constraint system; the choice
can result in two different constraint systems. One is chosen and, if it leads to a clash, the
algorithm backtracks and tries the second one. In the case of ALCN the nondeterministic
rules are those associated with disjunction, and with the upper number restriction
(the variables to be merged are nondeterministically chosen).
The algorithm based on the completion rules has been proven to solve the concept
consistency checking problem; i.e. it produces a clash-free completed constraint system
if and only if the concept is satisfiable (see Donini et al. [1996b]). Presenting the
formal proof of that result is outside the scope of this work, but an overview of how it is
structured is useful because it can be generalised to other concept languages
with different sets of rules. The first step consists in showing that the application of the
deterministic rules preserves the satisfiability of the constraint system, and that the
nondeterministic rules can be applied in such a way that satisfiability is preserved. The
second step consists in showing that rule application always terminates; i.e. all
the possible completed constraint systems which can be generated are finite. Finally,
the last step shows that a model can be constructed from a completed clash-free
constraint system.
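The completion procedure can be sketched in Python for the fragment without number restrictions (the rules →⊓, →⊔, →∀ and →∃ of Figure 2.1 and the first two kinds of clash). This is a minimal sketch under several assumptions: concepts are encoded as nested tuples, variables are integers, the function names are invented for the illustration, and the terminology is empty. It is not the algorithm of any particular system.

```python
# Minimal tableau sketch for concept satisfiability without number restrictions
# and with an empty Tbox. Concepts are nested tuples: ("not", C), ("and", C, D),
# ("or", C, D), ("some", R, C), ("all", R, C); names are plain strings.
from itertools import count


def nnf(c):
    """Push negations inwards until they occur only in front of concept names."""
    if isinstance(c, str):
        return c
    op = c[0]
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    if op in ("some", "all"):
        return (op, c[1], nnf(c[2]))
    d = c[1]  # op == "not"
    if d == "TOP":
        return "BOTTOM"
    if d == "BOTTOM":
        return "TOP"
    if isinstance(d, str):
        return c
    if d[0] == "not":
        return nnf(d[1])
    if d[0] in ("and", "or"):
        dual = "or" if d[0] == "and" else "and"
        return (dual, nnf(("not", d[1])), nnf(("not", d[2])))
    dual = "all" if d[0] == "some" else "some"
    return (dual, d[1], nnf(("not", d[2])))


def has_clash(cs):
    """Clash: x:BOTTOM, or both x:A and x:not A."""
    return any(c == "BOTTOM"
               or (isinstance(c, tuple) and c[0] == "not" and (x, c[1]) in cs)
               for (x, c) in cs)


def expand(cs, rs, fresh):
    """cs: concept constraints (x, C); rs: role constraints (x, R, y)."""
    if has_clash(cs):
        return False
    for (x, c) in cs:
        if isinstance(c, str) or c[0] == "not":
            continue
        op = c[0]
        if op == "and":
            new = {(x, c[1]), (x, c[2])}
            if not new <= cs:
                return expand(cs | new, rs, fresh)
        elif op == "or":
            if (x, c[1]) not in cs and (x, c[2]) not in cs:
                # nondeterministic rule: try one disjunct, backtrack to the other
                return (expand(cs | {(x, c[1])}, rs, fresh)
                        or expand(cs | {(x, c[2])}, rs, fresh))
        elif op == "all":
            new = {(y, c[2]) for (z, r, y) in rs if z == x and r == c[1]}
            if not new <= cs:
                return expand(cs | new, rs, fresh)
        elif op == "some":
            if not any(z == x and r == c[1] and (y, c[2]) in cs
                       for (z, r, y) in rs):
                y = next(fresh)  # new variable
                return expand(cs | {(y, c[2])}, rs | {(x, c[1], y)}, fresh)
    return True  # completed and clash-free


def satisfiable(concept):
    return expand(frozenset({(0, nnf(concept))}), frozenset(), count(1))


def subsumed_by(c, d):
    """C is subsumed by D iff C and not D is unsatisfiable."""
    return not satisfiable(("and", c, ("not", d)))
```

For instance, satisfiable(("and", "A", ("some", "R", "B"))) corresponds to Example 2.2 and succeeds, while subsumed_by(("some", "R", ("and", "A", "B")), ("some", "R", "A")) holds because every expansion of the negated test ends in a clash.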
2.2.2 Hybrid reasoning
Hybrid reasoning takes account of both the parts of a knowledge base. We consider al-
gorithms for solving the problem of knowledge base satisfiability. In principle, this ap-
proach is general enough because all reasoning services can be reduced to knowledge
base satisfiability (see Section 2.2). Indeed, most of the DL systems available pro-
vide different reasoning services by reducing them to KB satisfiability, and although
alternative approaches have been suggested (see for example Rousset [1999]), no other
satisfactory alternative has yet been found.
For a large class of concept languages, including ���� , the algorithm can be
a generalisation of that used for terminological reasoning. The notion of constraint
system is extended, allowing the presence of constraints for individuals as well as vari-
ables. In addition, due to the unique name assumption, each pair of different individu-
als implicitly introduces an inequality constraint which forces them to be separated.
The initial constraint system is not composed of a single constraint as in the
terminological case; instead, the full Abox is included in the constraint system. Each role or
concept assertion is added to the constraint system as a corresponding role or concept
constraint. Then the algorithm proceeds, completing the constraint system as in the
terminological case. The proof of correctness and completeness of the technique
is provided in Buchheit et al. [1993]. Example 2.3 below shows a knowledge
base completion for an instance checking problem.
Example 2.3
Consider the following knowledge base with an empty terminology:
Σ = ⟨∅, { susan:Female, bill:¬Female,
          ⟨sarah, susan⟩:FRIEND, ⟨sarah, andrea⟩:FRIEND,
          ⟨susan, andrea⟩:LOVES, ⟨andrea, bill⟩:LOVES }⟩
This is one of the classical examples showing the necessity of reasoning by case
in DL. The example pivots on the fact that the sex of the individual andrea is not
specified, so the KB contains incomplete knowledge.⁵
We want to check whether sarah has a female friend loving a male, i.e. to verify
the instance checking problem
Σ ⊨ sarah:∃FRIEND.(Female ⊓ ∃LOVES.¬Female)
It is easy to verify the instance checking problem using reasoning by case: we can
assume that andrea is either in the concept Female or in ¬Female. In the first case,
andrea is the female friend of sarah loving a male (bill). In the second case,
susan is the female friend of sarah loving a male (andrea).
This problem can be reformulated in terms of KB satisfiability by verifying that the
knowledge base augmented by the assertion
sarah:¬∃FRIEND.(Female ⊓ ∃LOVES.¬Female)
is unsatisfiable (see Donini et al. [1994]). The corresponding initial constraint system
is:⁶
{ sarah:∀FRIEND.(¬Female ⊔ ∀LOVES.Female),
  susan:Female, bill:¬Female,
  ⟨sarah, susan⟩:FRIEND, ⟨sarah, andrea⟩:FRIEND,
  ⟨susan, andrea⟩:LOVES, ⟨andrea, bill⟩:LOVES }
Using the universal rule (→∀) applied to the first constraint and the role constraints,
new constraints for susan and andrea are introduced:
{ susan:(¬Female ⊔ ∀LOVES.Female),
  andrea:(¬Female ⊔ ∀LOVES.Female) }
With the new constraint for susan the nondeterministic →⊔ rule can be applied.
However, choosing the susan:¬Female constraint leads to a clash with the initial
susan:Female assertion; therefore the second concept is chosen, and the constraint
system is extended with:
{ susan:∀LOVES.Female }
⁵ The name Andrea is considered a male name in Italy, and female in most of the rest of the world.
⁶ All the negations have been pushed in front of concept names.
This constraint, together with the role constraint ⟨susan, andrea⟩:LOVES, can be
used with the universal rule, leading to the new constraint:
{ andrea:Female }
Again the →⊔ rule can be applied to the not yet considered constraint
andrea:(¬Female ⊔ ∀LOVES.Female)
containing a disjunction. As in the previous case, choosing andrea:¬Female leads
to a clash, so the added constraint is:
{ andrea:∀LOVES.Female }
which can be used with the ⟨andrea, bill⟩:LOVES constraint to add the new
constraint bill:Female, which generates a new clash with the initial bill:¬Female
assertion. At this point all the possible
nondeterministic choices have been explored, and all the possibilities led to a clash.
Therefore the constraint system is not satisfiable, and the answer to the instance
checking problem is affirmative.
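The case analysis behind Example 2.3 can be illustrated with a short script. It only enumerates the possible memberships in Female of the two individuals whose sex is unspecified (andrea and sarah), over interpretations whose domain is assumed to contain exactly the four named individuals, and checks the query in each case. This is an illustration of reasoning by case, not a decision procedure: entailment in general is established by the refutation described above. All names in the code are assumptions of the sketch.

```python
# Minimal sketch of the case analysis in Example 2.3 (illustration only).
from itertools import product

FRIEND = {("sarah", "susan"), ("sarah", "andrea")}
LOVES = {("susan", "andrea"), ("andrea", "bill")}


def witnesses(female):
    """Friends of sarah that are in Female and love someone outside Female."""
    return sorted(f for (s, f) in FRIEND
                  if s == "sarah" and f in female
                  and any(x == f and y not in female for (x, y) in LOVES))


results = {}
for andrea_female, sarah_female in product([True, False], repeat=2):
    female = {"susan"}  # asserted in the Abox: susan:Female, bill:¬Female
    if andrea_female:
        female.add("andrea")
    if sarah_female:
        female.add("sarah")
    results[(andrea_female, sarah_female)] = witnesses(female)

query_holds_in_every_case = all(results.values())
```

In every case the list of witnesses is nonempty: andrea is the witness when andrea is in Female, and susan is the witness otherwise. No single individual is a witness in all cases, which is why the entailment cannot be found without splitting on the disjunction.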
2.3 Role axioms
The DLs we have introduced up to now are powerful languages for describing classes
of elements, but the possibility of specifying “global” properties of the roles is limited.
For instance, a DL with the transitive closure constructor (see Table 2.2) can “talk”
about the transitive closure of a role but it cannot restrict the interpretation of a role to
be transitive.
Most of the recently developed DL systems have introduced the possibility of
adding axioms about roles into the terminology (see Horrocks [1998], Haarslev and
Moller [2000b]). Usually these axioms state the inclusion relationship between role
names; but, in addition, transitivity or functional restrictions can be imposed on role
names. The choice of role axioms provided by a DL system depends on the balance
between the representational usefulness of the construct and the practical tractability of
the KB satisfiability problem.
Although some of the role restrictions can be imposed by appropriate concept
axioms, having an algorithm which handles them directly is usually more efficient. A
typical case is the functional restriction; i.e. the fact that every element has at most one
successor. This restriction can be enforced by a role axiom which states that a role
P is functional, or by using the concept axiom ⊤ ⊑ (≤ 1 P) (in DLs providing the
number restriction constructor). Although the two are equivalent, the latter
method may introduce unnecessary nondeterminism in the tableau construction (see
Section 2.2.1).
In our work we concentrate on three kinds of role restrictions: transitivity, inclusion
(or role hierarchy), and functional. They are indicated in the name of the logic respectively
by the letters R+, H, and f.
Transitivity This restriction on role names states that the transitive closure of a role
must be contained in the role itself. Formally, this can be expressed as a requirement
for interpretations, in the same way as in the case of concept axioms (see Section 2.1.2).
For example, if the role R is transitive then an interpretation I = (Δ^I, ·^I) satisfies the
transitivity of R iff
{(x, y), (y, z)} ⊆ R^I implies (x, z) ∈ R^I
Transitivity can be extremely useful for modelling part-of relations (see Sattler [1996]
for more details). The reader may have noticed the similarity of the transitive
restriction with the transitive closure constructor on roles (see Table 2.2). In fact, the
latter can be used in place of the transitive restriction (the converse is not true).
However, the transitive restriction has proved to be more tractable in practice than the
transitive closure constructor, leading to simpler satisfiability algorithms (see Horrocks et al.
[1999b]).
Generally, the transitivity restrictions are not directly represented as axioms in the
terminology; instead, we distinguish a subset of the role names as the set of transitive
roles (i.e. � �� � �� ).
Functional If a role name is functional, in every interpretation satisfying the
restriction the relation associated with the role name must be a partial function (i.e. it relates
any element of the domain to no more than one element). It is easy to see that this
is equivalent to a concept axiom of the form ⊤ ⊑ (≤ 1 P). However, functional
restrictions are common in domain modelling, therefore they are usually handled by the
satisfiability algorithm directly.
Analogously to the transitivity restrictions, functional roles are considered a subset
of the set of role names (i.e. %�� � �� ). In addition, we impose the limitation (as
in the FaCT system) that the sets of functional roles and transitive roles are disjoint.
The reason for this limitation lies in the fact that it is not clear whether transitive
functional roles are of any use in the modelling process; moreover, the decidability
of such a DL is still an open problem (we know that allowing transitive roles in number
restrictions leads to undecidability, see Horrocks et al. [1999b]).
Inclusion This kind of axiom is the role counterpart of the concept axioms shown
in Section 2.1.2, with the difference that they relate sets of pairs of elements of the
domain instead of sets of elements. Role axioms are expressed by statements of the
form R ⊑ S, where R and S are role expressions. An interpretation I = (Δ^I, ·^I)
satisfies the role axiom iff
R^I ⊆ S^I
We consider only simple axioms where the role expressions are simply role names; in
this way the role axioms express only a taxonomy of role names. It is worth
noticing that axioms hide an implicit negation;⁷ therefore, without restrictions on the role
axioms, full boolean role expressions can be obtained if the DL has conjunction or
disjunction role constructors. The problem is that having a language for role expressions
closed w.r.t. negation pushes the DL close to the decidability border (see Lutz and Sattler
[2000]).
Obviously, sub-roles of functional roles must be functional as well; therefore if the
DL provides these three restrictions a transitive role cannot be included in a functional
role (see the discussion about transitive functional roles above).
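The three restrictions are conditions on the relations that an interpretation assigns to role names, so on a finite interpretation they can be checked directly. The following sketch does so for relations given as sets of pairs; the function names and the sample relations are assumptions made for the illustration.

```python
# Minimal sketch: checking role restrictions on finite relations (sets of pairs).

def is_transitive(rel):
    """(x, y) and (y, z) in rel must imply (x, z) in rel."""
    return all((x, z) in rel
               for (x, y) in rel for (w, z) in rel if w == y)


def is_functional(rel):
    """Every element has at most one successor (rel is a partial function)."""
    successor = {}
    for x, y in rel:
        if successor.setdefault(x, y) != y:
            return False
    return True


def satisfies_inclusion(sub_rel, super_rel):
    """The role axiom R ⊑ S holds iff R^I is a subset of S^I."""
    return sub_rel <= super_rel


# Hypothetical sample relations.
part_of = {("spoke", "wheel"), ("wheel", "car")}
part_of_closed = part_of | {("spoke", "car")}
has_mother = {("ann", "mary"), ("bob", "mary")}
has_parent = has_mother | {("ann", "john")}
```

With these sample relations part_of is not transitive while part_of_closed is, has_mother is functional while has_parent is not, and has_mother is included in has_parent. Any subset of a functional relation is again functional, which is the observation made above about sub-roles of functional roles.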
Role hierarchy can be “simulated” by using either role conjunction or disjunction.
Roughly speaking, the idea is to substitute each node of the role taxonomy by either
the conjunction of all the top level role names “above” the node, or the disjunction of
the leaves below it. Each role name is then replaced by the corresponding expression;
this is more or less the same process as the “unfolding” of the concept definitions. This
role unfolding can be performed only if the DL does not provide the qualified num-
ber restriction constructor (i.e. $ ���) because for keeping decidability complex
role expressions are not allowed as argument of this constructor (see De Giacomo and
Lenzerini [1994]).
An example of the expressivity that role restrictions can provide is given by the fact
that arbitrary concept axioms can be represented by the combination of transitivity and
role inclusion (see Baader [1991]).
7 Just consider the concept case, in which the axiom $C \sqsubseteq D$ is equivalent to asserting that every element of the domain must belong to the interpretation of the concept $\neg C \sqcup D$.
2.4 The description logic $\mathcal{SH}f$
In the rest of the thesis we consider the DL $\mathcal{SH}f$; this is the very same logic handled
by the DL system FaCT (see Horrocks [1998]) with the addition of the Abox.
$\mathcal{SH}f$ is an extension of the logic $\mathcal{ALC}$ to include transitive roles, role hierarchy and
functional restriction. Following the naming convention introduced in Horrocks et al.
[1999b], the DL $\mathcal{ALC}$ extended with transitive roles (i.e. $\mathcal{ALC}_{R^+}$) is abbreviated as $\mathcal{S}$.
This is due to the relationship with the propositional (multi) modal logic $\mathbf{S4}_{(m)}$ (see
Schild [1991]).
Summarising, the DL $\mathcal{SH}f$ is built over a signature of distinct sets of concept ($\mathit{NC}$), role
($\mathit{NR}$) and individual ($O$) names; in addition we distinguish two non–overlapping
subsets $\mathit{TNR}$, $\mathit{FNR}$ of $\mathit{NR}$ which denote the transitive and the functional roles.
Valid concept expressions are defined by the abstract syntax:
� � � � � � � � � � �� � �� � �� � �� � �� � ��� � ���
where � is a concept name chosen from the set �� and a role name from �� .
A $\mathcal{SH}f$ knowledge base is composed of a Tbox containing axioms of the form
$C \sqsubseteq D$ or $R \sqsubseteq S$, and an Abox containing individual assertions of the form $a{:}C$ or
$\langle a, b \rangle{:}R$. The semantics is the one described in the previous sections.
Since role inclusion axioms contain role names only, a taxonomy of role names can be
trivially built from the set of role inclusion assertions. In addition, if there is a
cycle in the axioms, then all the names involved in the cycle must correspond to the
same binary relation in every interpretation (satisfying the axioms). For this reason,
we assume that the role taxonomy is acyclic, and we represent it by a partial order &
defined over the set of role names.
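The remark about cycles can be illustrated with a small computation. The following Python sketch is only an illustration under stated assumptions: role inclusion axioms are encoded as pairs (sub, sup) of hypothetical role names, the reflexive-transitive closure of the asserted inclusions is computed naively, and role names lying on a common cycle are grouped together. Collapsing each group to a single representative would yield an acyclic taxonomy.

```python
from itertools import product


def role_order(role_names, inclusions):
    """Reflexive-transitive closure of the asserted inclusions (sub, sup)."""
    leq = {(r, r) for r in role_names} | set(inclusions)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(leq), repeat=2):
            if b == c and (a, d) not in leq:
                leq.add((a, d))
                changed = True
    return leq


def cycle_classes(role_names, inclusions):
    """Group role names that lie on a common inclusion cycle."""
    leq = role_order(role_names, inclusions)
    return {
        frozenset(s for s in role_names if (r, s) in leq and (s, r) in leq)
        for r in role_names
    }


# Hypothetical axioms: r included in s, s included in r (a cycle), s included in t.
names = ["r", "s", "t"]
axioms = [("r", "s"), ("s", "r"), ("s", "t")]
classes = cycle_classes(names, axioms)
print(classes)
```

Here r and s end up in the same class, since every interpretation satisfying the axioms must give them the same extension, while t stays on its own.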
In the next chapter we characterise the structure of the models for a $\mathcal{SH}f$ KB. The
idea is to recognise a class of interpretations which completely describes $\mathcal{SH}f$, in the
sense that whenever a KB has a model it has a model of this class as well.
This characterisation of the logic will be crucial for the following chapters describing
the KB satisfiability and query answering algorithms.
Chapter 3
Model characterisation of $\mathcal{SH}f$
As anticipated in the previous chapters, many of the convenient properties of Descrip-
tion Logics stem from their tree model property; i.e. the fact that every satisfiable
formula has a model that is a tree (see Vardi [1997], Gradel [1999]). In the logic we
consider the picture is slightly more complicated by the fact that Tbox assertions on
roles and the Abox can force non-tree shaped models. In particular, the transitivity
forces the presence of “shortcuts” corresponding to the transitive closure of parts of
the tree; while the Abox assertions can force any graph shape among the nodes corre-
sponding to individual names. However, we still have a tree–like model property; the
models look like a cloud containing the nodes corresponding to individual names with
(possibly transitive) trees hanging off the nodes in the cloud (see Figure 3.1).
Figure 3.1: Structure of interpretations for $\mathcal{SH}f$
We call this kind of interpretations shrubs, because we distinguish a central inter-
connected part with branches coming out of it.1 There are a few properties of these
interpretations which are fundamental for the proofs we are going to present in the
following chapters. The first one is that elements in distinct trees are not connected by
any link; i.e. elements of the trees cannot “see” elements of other trees. The second
is that each tree is “attached” to a single node corresponding to an individual; if there
are links starting from other nodes in the cloud, then these edges must be the result of
a transitive closure. This means that all the connections of the tree with the rest of the
graph are “mediated” by the individual the tree is attached to.
This chapter will formally present the properties of this class of interpretations
which we have called quasi transitive shrubs. In addition, we define a transformation
for arbitrary interpretations which enables us to present a completeness result for both
the problem of satisfiability (see Section 3.3) and logical implication (see Section 7.2)
for $\mathcal{SH}f$.
Note that completeness w.r.t. KB satisfiability can be proved by using a technique
simpler than the one we are presenting in this chapter. In fact, it is well known that
the DL $\mathcal{SH}f$ has the finite model property (see Horrocks [1998]),2 while our technique
always generates infinite interpretations (even uncountable if the starting domain is
uncountable). So the natural question is whether we need a different mechanism. Our
answer to this question is obviously positive, because we need to show the completeness
w.r.t. a problem not directly reducible to KB satisfiability (see Section 7.2). We will
come back to this latter point in the last section of this chapter.
3.1 Quasi transitive shrub interpretations
As explained above we are interested in interpretations being structured as a collection
of trees whose roots are possibly interconnected by links. We start from a set of trees,
then we add the required links connecting the roots to the rest of elements.
1 Strictly speaking a proper shrub does not have the interconnected core, but from an adequate distance they look like balls of leaves with a few branches coming off them.
2 Without Abox assertions; however, the Abox does not affect the finiteness.
There are different ways of building a tree; the one we have chosen consists of
selecting an alphabet (without any restriction of cardinality), and then considering the set
of all finite sequences of elements of that alphabet. When it comes to actually building
the tree by adding the edges, we connect two sequences only if the second
one is equal to the first one with one more element of the alphabet (i.e. a sequence
$[d_1 \cdots d_n]$ can have only sequences $[d_1 \cdots d_n\, d_{n+1}]$ as successors). This process naturally
leads to a forest where each element of the selected alphabet (or more precisely the
sequence containing a single element) can be the root of one of the trees. Note that
any node of the trees can have as many successors as the cardinality of the alphabet
(which is not restricted in any way). Note also that this method does not completely satisfy
our requirements, because there are still the problems of transitive relations and Abox
role assertions. These two aspects are indeed considered in Definition 3.2.
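The forest of sequences can be made concrete with a short sketch. The Python code below is an illustration only: it assumes a finite, hypothetical alphabet and truncates the sequences at a fixed length, whereas the construction in the text places no bound on either. Sequences are encoded as tuples, and the successor test follows the "one more element" rule stated above.

```python
from itertools import product


def sequences(alphabet, max_len):
    """All nonempty sequences over alphabet of length at most max_len."""
    return [seq for n in range(1, max_len + 1)
            for seq in product(alphabet, repeat=n)]


def is_successor(parent, child):
    """child is a tree successor of parent iff it extends parent by one element."""
    return len(child) == len(parent) + 1 and child[:len(parent)] == parent


nodes = sequences(["a", "b"], 3)  # truncated at length 3 for illustration
roots = [w for w in nodes if len(w) == 1]
edges = [(w, v) for w in nodes for v in nodes if is_successor(w, v)]
print(len(nodes), len(roots), len(edges))
```

Each sequence of length one is the root of its own tree, and every longer sequence has exactly one predecessor, so the resulting graph is a forest.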
We have chosen this method because what we want to do is to take an arbitrary
interpretation and transform it into a quasi transitive shrub interpretation still having the
same properties (in terms of satisfiability of a KB). In addition, we would like to
maintain a strict relation between the elements of the original interpretation domain and
those of the new interpretation; the idea is to take the original interpretation domain as
an alphabet (see Section 3.2.1 for the details).
Let us start with the definition of the set of sequences of elements from a given
domain.
Definition 3.1. Let � be an arbitrary domain then'#� is the set of all the finite se-
quences of elements of �. We indicate the empty sequence as �, which by assumption
is not in'#� .
To distinguish the elements of the “alphabet” from the sequences we are going to
use the letters � � for elements of the “alphabet” domain, and the letters �� �� � for
sequences. We indicate with � �� � � � the sequence containing the elements �� � � �� �of �; while � represents the sequence resulting from the concatenation of the the
element to the sequence �.
We use the set of finite sequences as the domain for the quasi transitive shrub
interpretations; in addition, we impose some restrictions on the interpretation function.
In particular, we require that individual names are mapped to sequences of length 1,
and we add a few requirements on the interpretation of roles, which we will detail
below.
Definition 3.2 (Quasi transitive shrubs). Let � be an arbitrary domain. An interpre-
tation � � ��� � ��� over a set of individual names , role names �� , and concept
names �� is a quasi transitive shrub interpretation iff it satisfies the following prop-
erties. Let �� � be arbitrary elements of �� , an element of �, and a name in �� :
�� �'#� (3.2a)
if � � � then there is � � � such that � � �� (3.2b)
if ��� � � � � then ��� � � � � (3.2c)
if ��� � � � � then either � � � or ���� ��� ��� � �� � � (3.2d)
if ���� ��� ��� � �� � � ��� � � then ��� � �� � � �� � � (3.2e)
The first two restrictions of Definition 3.2 correspond to the two above mentioned
properties: the domain is composed of finite sequences of given elements, and
the individual names are always interpreted as “singleton” sequences. The third
property (3.2c) ensures that there is no link going back from one of the trees to the root
of any other tree; i.e. the interaction between different trees is always mediated by the
roots, and is restricted to elements mapped from individual names.
The last two properties ((3.2d) and (3.2e)) ensure the tree–like shape, extended to
the transitive case. Property (3.2d) extends the tree–shape idea to cover the transitive
parts of the interpretations. In fact, if the transitive closure is not considered, the
presence of an edge ��� � � implies that � � � (the successors of a node are sequences
containing just one more element). However, we need to consider the case in which
a role is transitive or contains a transitive sub-role; therefore we must allow the case
in which ��� � � is a shortcut for the two edges ��� �� and ��� � �. The last property
relates the elements contained into a subtree with elements outside the given subtree.
In fact, given a node �, all the nodes in the form � �� � � � (with � � �) are in
the subtree rooted in �. The property describes the case in which there is an edge
connecting a node outside this subtree (�) to both the root (�) and an element of the
subtree.3 In this case it is ensured that the edge ��� � �� � � �� is a shortcut for the path
��� �� and ��� � �� � � ��.
Clearly, the structures characterised by Definition 3.2 are neither trees nor
acyclic graphs. However, we can distinguish the tree–like structures defined by all
the elements beginning with the same sequence. With an abuse of terminology we call
these substructures trees. In fact, if we consider their intuitive support tree, the added
edges are only the result of the transitive closure of the support tree. This restriction does
not apply to the roots of these trees, which can be connected in any way (even with
cycles). The idea is that the relation among the roots reflects the role assertions in the
Abox.
3 The case in which � � � is trivially satisfied.
The restrictions on the structure of q.t. shrub interpretations are strong enough to
characterise two sequences connected by a role. This is shown by the next lemma,
which is used in Chapter 7 for showing the completeness of the proposed algorithm
for query answering (see Proposition 7.16).
Lemma 3.3. Let � � ��� � ��� be a quasi transitive shrub interpretation built from the
domain � (i.e. �� �'#�). Let be an arbitrary role name, � and � � � �� � � � be
elements of �� ( � � � for � � �� � � � � �) for � � �.4 Then the pair ��� �� is in �
only if:
there is an index � � � s.t. � � � �� � � � (3.3a)
or ��� � �� � � and ���� � ��� �� �� ��� � � (3.3b)
Proof. Since the length of the sequence � is finite, then we prove the lemma by induc-
tion on the length of �.
Let be � � � � �, then by (3.2d) either � � � � or ���� � ��� �� �� � � ��� � �;
indeed, these are exactly the two cases (3.3a) and (3.3b).
Let us assume that the lemma holds for any sequence whose length is less than
�, and let us consider ��� � �� � � �� � � . By (3.2d) either � � � �� � � ��� or
���� � �� � � ����� �� �� � � ���� � �� � � ��� ��� � � . In the first case (3.3a) is sat-
isfied by � � � ' �, while for the latter case we can use the inductive hypothesis
for the pair ��� � �� � � ���� � � . Therefore, either there is an index � � � ' �
such that � � � �� � � � or ���� � ��� �� �� � �� � � ����� � � . In the first case
(3.3a) is satisfied, while in the second we have ��� � �� � � . We need to show
that �� �� � �� � � ��� �� � � . This is verified by (3.2e) because ��� � �� � � and
��� � �� � � �� � � by hypothesis.
The previous lemma describes the situation in which there is a relation � be-
tween an arbitrary element � and an element � which is “inside” one of the trees (it
cannot be one of the roots because it is a sequence longer than a single element by
assumption). When this is the case then either � is contained in the subtree rooted in
� (condition (3.3a)), or the relation is “coming” via the root of the tree containing �
(i.e. the sequence � � (condition (3.3b)). Note that by Definition (3.2c), when the latter
condition is verified the element � must correspond to one of the individual names.
4We exclude the case � � ��, because this is trivially covered by the definition in (3.2c).
This property is important because it guarantees that inside one of the trees links
can only go downward. Therefore there cannot be any link going back to one of the
roots, or to elements contained in any other tree.
3.1.1 Canonical interpretations
The definition of a quasi transitive shrub interpretation is independent of any given
knowledge base. In particular it does not take into account restrictions on the
interpretations imposed by a particular KB, like transitivity restrictions on role names, or
inclusion relations between roles. Therefore, by using Definition 3.2 we cannot prevent
the transitive closure of relations or multiple relations connecting two elements, even
when they are not required by the KB. The point is that we really need to minimise
the non–tree characteristics of the structure of an interpretation (for reasons which will
become clear in Chapter 6 and Chapter 7).
Let us consider for example the assertion $R \sqsubseteq S$. In terms of the structure of
interpretations, the axiom guarantees that whenever there is an edge labelled with $R$
between two nodes then there must be an edge labelled $S$ as well. This is not a general
property of the $\mathcal{SH}f$ logic, but only of interpretations satisfying the knowledge base
containing the given assertion.
This problem is specifically related to the possibility of specifying terminological
axioms about roles. For example with $\mathcal{ALC}$, or even a DL as expressive as PDL, the
tree–model property can be stated independently from any particular KB.
In our case the DL $\mathcal{SH}f$ allows the presence of role axioms; therefore, what we
really need is a definition characterising an interpretation w.r.t. a given KB. For this
reason we introduce the notion of canonical interpretations w.r.t. a given KB
(Definition 3.6). Roughly speaking, the definition considers the transitivity and the role
ordering induced by the KB in order to restrict the structure of allowed interpretations.
It is important to notice that a canonical interpretation does not necessarily satisfy the
knowledge base; the two notions are orthogonal to each other.
Given a knowledge base �, the role inclusion axioms in the terminology define a
partial order over the set of role names �� , we use the symbol &� to denote this
reflexive order (by definition). For the sake of simplicity, wherever there is only
a single terminology the KB is omitted and the ordering of role names is denoted by
the symbol & alone. Moreover, some of the role names have additional restrictions:
namely being functional or transitive. Note that the functionality restriction propagates
w.r.t. the role ordering, forcing each subrole of a functional role to be functional as
well. We distinguish these two non-overlapping subsets of �� as %�� (functional)
and � �� (transitive).
Let us consider the requirement that two elements of the domain of a canonical q.t.
shrub interpretation are related by two different roles only if this is required in every
interpretation. Suppose that we have two elements � � in the original interpretation �
related by two different roles (� � �� � � � ��); in the q.t. shrub interpretation, all
the sequences of the form � and � � would be related by the same pair of roles. In
order to avoid this, we can duplicate the element � into �� and ��, then add the pair
�� � � ��� to the interpretation of and �� � � ��� to the interpretation of �. This
technique cannot be applied in every case; for example, if the role � is included in
(i.e. � & ), whenever a pair � � �� is in �� it must be in � as well.
Therefore we must consider groups of role names that must “stay together”.
Fortunately, this can be discovered by looking at the structure of the relation & induced by
a knowledge base, together with the functional restrictions. We identify a set of
subsets of $\mathit{NR}$, calling its elements labels. Intuitively, we are going to make
as many copies of elements of the original domain as the number of possible labels.
The following definition provides the formal description of the set of labels w.r.t. a
given knowledge base. It is important to stress the fact that this set is induced by the
assertions about the roles in a knowledge base, therefore it is not intrinsic to the
logic $\mathcal{SH}f$.
The first thing we need to do is to formalise the notion of the fact that two roles
can “stay together”; i.e. the fact that when we have different roles connecting two
elements they cannot be split. The simplest case is when a role name is included in an
other (i.e. & �), but the interaction between role hierarchy and functional restriction
makes the story slightly more complicated. For example, let us consider two role
names � and � both included in a functional role � . In an arbitrary interpretation
�, whenever � � �� � �� and � � �� � �
� we know that both � � �� and � � ��
must be in � � because of the inclusion; in addition, � � is functional therefore � � �.
This scenario may be even more elaborated by considering three roles �, �, �
and two unrelated functional roles ��, �� having the inclusion relation like � & ��,
� & ��, � & ��, and � & ��. With arguments which are similar to the simpler
case, we can conclude that if in all three relations an element has a successor (i.e.
� � ��� � �� , � � ��� � �
� , and � � ��� � ��) then this successor must be the same
(i.e. �� � �� � ��).
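The argument for the simpler case can be checked mechanically on a small domain. The following Python sketch is only an illustration and not part of the original development: it enumerates every interpretation of two roles included in a functional role over a two-element domain, and verifies that successors reached through the two sub-roles always coincide. The function names and the encoding of relations as sets of pairs are assumptions made for the example.

```python
from itertools import combinations, product


def all_relations(domain):
    """Every binary relation over domain, as a set of pairs."""
    pairs = list(product(domain, repeat=2))
    return [set(c) for n in range(len(pairs) + 1)
            for c in combinations(pairs, n)]


def is_functional(rel):
    """Each element has at most one successor."""
    return all(y == z for (x, y) in rel for (x2, z) in rel if x == x2)


def successors_coincide(r, s):
    """Whenever an element has an r-successor and an s-successor, they are equal."""
    return all(y == z for (x, y) in r for (x2, z) in s if x == x2)


def check_claim(domain):
    """Check the claim for all r, s included in some functional f over domain."""
    rels = all_relations(domain)
    for f in filter(is_functional, rels):
        included = [r for r in rels if r <= f]
        for r, s in product(included, repeat=2):
            if not successors_coincide(r, s):
                return False
    return True


print(check_claim([0, 1]))
```

The exhaustive check succeeds because both sub-role edges are also edges of the functional role. Without the functional super-role the property fails, e.g. for the relations {(0, 0)} and {(0, 1)}.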
Note that we are not saying that � � �� � �� implies the presence of � � �� � �
�
as well (like the case in which � & �), but only that if they are in the interpretation
then they must coincide. This property is used in the proofs for the query answering
algorithm presented in Chapter 7 (see Proposition 7.15).
The broad idea in the definition is the fact that for a non–functional role every
“preceding” role in the hierarchy must stay with it; while when is functional we
must take the “successors” as well. This should be propagated for covering the cases
shown above. So we define a relation � describing the “staying together” of the roles.
Note that this relation is not always symmetric because if it is the case that � &
and !& �, then � � but not the converse, while when there are functional roles
involved it becomes symmetric (e.g. in the first of the example above, we have that
� � � and the converse as well).
Definition 3.4. The relation � is recursively defined by
� � � iff
���������������
� & �� or
� & � and � is functional, or
� � and � � for some role if neither
� & � nor � & �
(3.4a)
The relation is well defined because it can be built by using the structure of the role
hierarchy. We use the newly defined relation for introducing the set of labels of a KB.
A label � is a subset of �� (� � �� ). For a fixed ordering & and set of func-
tional roles %�� we define the set of labels of �� as a maximal5 subset � of ��
satisfying the properties that for any label � � �
if �� �� � � then � � or � � ; (3.4b)
if � � then �� � � �� � �. (3.4c)
First we show a few properties of the sets of labels of a given KB.
5Maximal w.r.t. set inclusion.
Proposition 3.5. Let us consider a set of labels � of a given KB:
� �� (3.5a)
for all � �� there is � � � s.t. � �; (3.5b)
the set of labels is unique; (3.5c)
if two labels share a functional role, then they are equal. (3.5d)
Proof. 3.5a Let assume that !� �, clearly satisfies the properties in Definition 3.4.
Therefore � � � � is a set of labels which strictly includes �, contradicting the
hypothesis that � is maximal.
3.5b Let us assume that there is a role s.t. !� � for any label � in �. Let us
build the new label �� � �� � � ��, note that �� satisfies both the properties
in Definition 3.4; therefore ���� � � is a set of labels which strictly includes �,
contradicting the hypothesis that � is maximal.
3.5c Let ��, �� be two sets of labels satisfying the Definition 3.4. We are going to
show that �� � ��, by choosing a label � in �� and proving that it must be in
�� as well.
If � is the empty set, then it is contained in �� by (3.5a), so we consider the case
in which � !� . Let us assume that � !� ��; the label � satisfies the properties
in Definition 3.4 because �� is a set of labels. Therefore ��� � �� is a set of
labels which strictly includes �, contradicting the hypothesis that � is maximal.
Using the very same arguments we can show that �� � ��, therefore they are
equal.
3.5d Let ��,�� be two labels in � sharing a functional role � .
First we show that �� � ��. Let us consider an arbitrary role name in ��; if
is equal to � , then it is in �� by definition. Let us assume that is different
from � ; by Property (3.4b) we have that either � � or � � . If � �
then � �� by Property (3.4c). Let us suppose that � � , since & we
can use Definition (3.4a) for showing that � � as well (substitute � with � ,
and �� � with . Therefore � �� by Property (3.4c).
Analogously, we can show that �� � ��, therefore �� � ��.
Definition 3.6 (Canonical Interpretations). A quasi transitive shrub interpretation
� � ��� � ��� is said to be canonical w.r.t. a KB � if the following two conditions
apply. Given arbitrary role names ��� � � �� �, and different elements �� �� � of ��
with � !� � :
if ���� ��� ��� ��� ��� ��� � � then
there is a transitive role � & such that ���� ��� ��� ��� � ��(3.6a)
if ���� ��� � �� � � � � � �
� then there is a label � s.t. ��� � � �� �� � � (3.6b)
The definition of canonical interpretations is given in such a way that it tries
to minimise the non–tree characteristics of the structure of an interpretation. In fact,
given the $\mathcal{SH}f$ DL we cannot restrict ourselves to tree structures; for example the role
hierarchy can force two different edges connecting the very same pair of elements, or
the transitivity can force “shortcuts” of role paths.
The first restriction (3.6a) ensures that a canonical interpretation contains only the
necessary shortcuts; i.e. those required for satisfying the transitivity restrictions. The
property states that if the interpretation contains a shortcut (i.e. ���� ��� ��� ��� ��� ��� �
�), this must be caused by a transitive role. Because of the interaction between tran-
sitivity and role inclusion, it is not necessary that the role is transitive: a transitive
role included in can force the shortcut as well.
This is not enough, in fact the restriction (3.6a) does not require the expected
��� �� � �� . The reason comes from the interaction of assertions about the individuals
with the role structure; in fact, we know that Abox assertions can impose any shape in
the part of interpretations concerning the individuals.
Example 3.1
Let us consider the Abox assertions
��� ���� ��� ����� ��� ���� ��
���� ���� ���� � �� �����
together with the inclusion assertions � � �, � , and the role names �,� � being
transitive. One interpretation (�) satisfying the assertions is depicted as the graph
below.
[Figure: graph of the interpretation]
where the element � corresponds to � (i.e. �� � �), � to �, and �� to �; in addition
� is an anonymous element. It is not difficult to realise that � is a sort of “minimal”
interpretation satisfying the set of assertions; in addition, it does not force any transitive
subrole of connecting the elements corresponding to � and �.
The restriction (3.6b) concerns the case of multiple edges connecting the very same
pair. Again, these cases are restricted to only those which are necessarily imposed by
the knowledge base.
3.2 Transforming interpretations into q.t. shrubs
In this section we show how to build a canonical q.t. shrub interpretation from an
arbitrary interpretation satisfying a given knowledge base. First we define the
transformation, leading to what we call the unravelled interpretation and its transitive closure
(Section 3.2.1), then we highlight some of their properties (Section 3.2.2). These
results are used to show the completeness of the class of q.t. shrub interpretations w.r.t.
the problem of KB satisfiability (Section 3.3), and conjunctive query answering as
presented in Chapter 6 and Chapter 7 (see Section 7.2).
The underlying idea of the transformation is simple: we take the domain of the
interpretation as the starting alphabet for the sequences, then we define the interpre-
tation function according to the properties, w.r.t. the original interpretation, of the last
element of each sequence. Roughly speaking, a sequence is in the extension of a con-
cept name if the last element is in the extension of the same concept w.r.t. the original
interpretation. Analogously, two sequences are related if the second one is equal to the
first one with the addition of an element (see Definition 3.2), and their last elements
are related in the original interpretation (see Definition 3.7).
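The underlying idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the transformation of Definition 3.7: it assumes a finite interpretation, ignores individual names and labels, and truncates the sequences at a fixed length. The concept name A, the role name R and the domain elements are hypothetical.

```python
from itertools import product


def unravel(domain, concepts, roles, max_len):
    """Bounded, simplified unravelling: no labels, no individual names."""
    domain = list(domain)
    nodes = [w for n in range(1, max_len + 1)
             for w in product(domain, repeat=n)]
    # A sequence belongs to a concept iff its last element does.
    new_concepts = {
        name: {w for w in nodes if w[-1] in ext}
        for name, ext in concepts.items()
    }
    # w is related to w extended by d iff the last element of w is related to d.
    new_roles = {
        name: {(w, w + (d,))
               for w in nodes if len(w) < max_len
               for d in domain if (w[-1], d) in ext}
        for name, ext in roles.items()
    }
    return nodes, new_concepts, new_roles


# Hypothetical finite interpretation with one concept A and one role R.
nodes, cs, rs = unravel({"x", "y"}, {"A": {"y"}},
                        {"R": {("x", "y"), ("y", "y")}}, 3)
print(len(nodes), len(cs["A"]), len(rs["R"]))
```

Every edge produced this way goes from a sequence to one of its one-step extensions, so within the bound the result is a forest.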
Example 3.2
Let us consider the following simple interpretation without individual names:
� � ���� ������� ���
���
The idea is to transform it into an interpretation containing an infinite number of
trees like:
[Figure: trees of the unravelled interpretation]
where � is an arbitrary sequence of elements from the set ��� �� �� or the empty se-
quence (�).6
As shown in the example above, even starting from a finite interpretation we always
build a new infinite interpretation (because of the arbitrary sequence � in the example).
It is conceivable to impose more restrictions in order to obtain smaller interpretations
(e.g. in the example we may restrict the prefix � to the empty sequence �); however,
more restrictions means more complications and the construction we use is sufficient
for our purpose.
The construction we sketched makes the new interpretation violate any transitive
restriction. For instance, the original interpretation of the example above satisfies a
transitive restriction on the role �, but the newly built interpretation does not satisfy it.
Note that this is intrinsic to the construction; therefore in a subsequent step
the appropriate relations are transitively closed (see Definition 3.8). In addition, this
simple initial idea is complicated by the fact that the interpretation must be canonical
w.r.t. a given knowledge base, together with the pernicious interaction of transitive and
functional restrictions with the role hierarchy.
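The loss of transitivity, and its repair by closing the relation, can be seen on a path of three sequences. The Python sketch below is an illustration under simplifying assumptions: it computes the plain transitive closure of a single relation and ignores the interaction with the role hierarchy. The sequences are hypothetical and encoded as tuples.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing pairs."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


# Hypothetical unravelled edges of a transitive role: a path of three nodes.
edges = {(("x",), ("x", "y")), (("x", "y"), ("x", "y", "y"))}
closed = transitive_closure(edges)
print((("x",), ("x", "y", "y")) in edges)
print((("x",), ("x", "y", "y")) in closed)
```

The unravelled edges only link a sequence to its one-step extensions, so the shortcut from the first to the third node is missing until the closure adds it.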
6 Actually, the transformation is more involved, as we are going to show in the following sections. This example gives the intuition behind the transformation.
3.2.1 Unravelling the interpretation
We assume that the interpretation we are transforming is defined over a set of individual
names , of concept names �� , and role names �� . The set of role names includes
two non-overlapping sets: the functional (%�� ) and transitive (� �� ) names. In
addition, a partial ordering & is defined on �� ; this ordering must satisfy the con-
ditions that names included in functional names are functional as well, and transitive
roles cannot be included into functional roles. Formally, given two names �� in ��
if � & then � %�� implies � � %�� , and � � � �� implies !� %�� .
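The conditions assumed on the role ordering can be checked directly on a given set of role declarations. A minimal sketch in Python, assuming the ordering is given as a set of pairs (sub, sup) that is already closed under transitivity; the function name, the role names and the wording of the messages are hypothetical:

```python
def check_role_restrictions(order, functional, transitive):
    """Return the violations of the assumed conditions on the role ordering.

    order: set of pairs (sub, sup), meaning that sub is included in sup.
    functional, transitive: sets of role names.
    """
    problems = []
    for role in sorted(functional & transitive):
        problems.append(f"{role} is both functional and transitive")
    for sub, sup in sorted(order):
        if sup in functional and sub not in functional:
            problems.append(f"{sub} is included in functional {sup} "
                            "but is not functional")
        if sup in functional and sub in transitive:
            problems.append(f"transitive {sub} is included in functional {sup}")
    return problems


ok = check_role_restrictions({("r", "f")}, {"r", "f"}, {"t"})
bad = check_role_restrictions({("t", "f")}, {"f"}, {"t"})
print(ok)
print(bad)
```

The first declaration set satisfies all the conditions, while the second violates both the propagation of functionality and the ban on transitive roles below functional ones.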
Now we define the unravelled interpretation of �. We start by building as many
copies of � as the number of labels in � (�� for any � � �). For each label � there
is a bijection � mapping each copy in �� to the original element in �. We define a
new domain �� by taking the union of all the copies of � (i.e. �� ��
�� ��). For
any element of this new set we define two mappings: the first (Æ) maps any element
to the original element of � it was copy of (i.e. Æ ��
�� �); while the second ( )
maps any element to the label corresponding to the given copy of � (i.e. � � � # �,
and � � � � iff � ��). Using the domain �� we define the set'#�� of all the finite
nonempty sequences of elements of �� (see Definition 3.1). The mappings Æ and can
easily be extended to'#�� by using the value of the last element of the given sequence
(i.e. �� � � � � and �� � � � � for any � �'#� � ���).
Elements of'#�� can be naturally associated to nodes in a forest, where � are the
roots and � is a successor of �. We then use the relation structure in the original
interpretation � for adding the necessary edges between nodes. Since we did not make
any assumption on the original domain �, we cannot fix the width of the resulting tree.
In fact it can be even uncountable. The following definition formally describes the
unravelled interpretation of �.
To simplify the notation we indicate with � the set of elements of � to which
the individual names in are mapped by the interpretation � � ��� ��� (i.e. � �� � � � �! � �!� �
�).
Definition 3.7 (Unravelled interpretation). The unravelled interpretation �� � ����� �
���
of � � ��� ��� is defined as:
the domain ��� is the set of all the finite nonempty sequences of elements of ��
��� �
'#���
given an individual name ! �
!�� � � for some � �� s.t. � � � !� and � � � ;
given a role name � �� ,
�� �
���� �� �
�� � �� � ����� ���� � �
(3.7a)
� ���� � � � ��� ��
�� � ����� � �� � � � � � ��
and � !� �� or � � !� ���
(3.7b)
Note that the two sets (3.7a) and (3.7b) are distinct because in the first one all
the pairs are of the form �� � ���, while in the second ��� � �. The condition
“� !� �� or Æ� � !� �” in the (3.7b) part prevents pairs of the form �� � � ��
being added to the interpretation when an analogous pair can be added by the
(3.7a) part. Without the condition, an assertion like ��� ���� in the Abox with
� functional would cause the functional restriction on � to be violated in the
unravelled interpretation.
given an atomic concept � � ��
��� �
�� � �
�� � ��� � �� �
The mapping ��� is well defined; in particular the only problem can arise from the
non–uniqueness of the element in the interpretation of individual names. However,
it is easy to realise that we map an individual name ! to a single copy of the original
element !� (the one in the domain ��).
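As an illustration only, the construction can be sketched in Python on a small finite interpretation. This is a simplified variant of Definition 3.7, not a transcription of it: labels are ignored, the images of the individual names are taken as the only roots, and the depth is bounded so that the result stays finite. The names `unravel`, `roots` and `max_depth` are hypothetical.

```python
from typing import Dict, Set, Tuple

Path = Tuple[str, ...]
Relation = Set[Tuple[str, str]]


def unravel(roots: Set[str], roles: Dict[str, Relation], max_depth: int):
    """Return (domain, unravelled_roles, last) for a depth-bounded unravelling.

    Domain elements are non-empty tuples (paths); `last` sends each path
    to its final element, i.e. back to the original interpretation.
    """
    domain: Set[Path] = {(r,) for r in roots}
    out: Dict[str, Set[Tuple[Path, Path]]] = {name: set() for name in roles}

    # Analogue of (3.7a): pairs between roots are copied from the original.
    for name, rel in roles.items():
        for (x, y) in rel:
            if x in roots and y in roots:
                out[name].add(((x,), (y,)))

    # Analogue of (3.7b): extend a path s by one element y, giving s + (y,),
    # unless the pair is a root-to-root pair already handled above.
    frontier = set(domain)
    for _ in range(max_depth):
        new_frontier: Set[Path] = set()
        for s in frontier:
            for name, rel in roles.items():
                for (x, y) in rel:
                    if x != s[-1]:
                        continue
                    if len(s) == 1 and y in roots:
                        continue  # covered by the root-to-root case
                    t = s + (y,)
                    out[name].add((s, t))
                    new_frontier.add(t)
        domain |= new_frontier
        frontier = new_frontier

    last = {s: s[-1] for s in domain}
    return domain, out, last
```

In this sketch every pair of paths in an unravelled role projects, through `last`, onto a pair of the original role, which is the intuition behind Property (3.9b) below.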
Unravelling is a well-known technique for showing that a modal logic has the
tree model property (see Vardi [1997], Streett [1982], Grädel [1999]). Unfortunately,
the standard technique is not enough in our case: the unravelling of an interpretation
would not satisfy the transitive restrictions on roles. It is easy to see the problem
by considering the fact that only pairs like ��� � � are added to the interpretation of a
role by the (3.7b) set; therefore if both ��� � � and �� � � �� are in the interpreta-
tion of a transitive role, the required ��� � �� pair is obviously missing. We overcome
this shortcoming by simply adding the transitive closure of relations corresponding to
transitive role names. We call this operation the transitive closure of the interpretation.
Definition 3.8. Let �� � ����� �
��� be the unravelled interpretation of �. The transitive
closure of �� , written as ��� � ������ �
����, is defined as follows:
the domain ���� is equal to the unravelled domain �
��;
individuals are mapped as in ��!��� � !
���
given an atomic concept � � ��
���� � �
�� �
given a non–transitive role name � �� ( � ��
��� �
�� ��
��� � ���������
�����
given a transitive role name � � ��
��� � ���� �� � � ���� � � � � ��� � �
�� s.t. ���� � � � � ��� is finite and
�� � �� �� � �� ����� ����� � � � �� � � � � �' �� � ���
The definition of transitive closure of �� is recursive and well defined because we
assume an acyclic ordering &, therefore �� can be built “bottom-up” w.r.t. the ordering
over role names. It is easy to realise that the transitive closure satisfies the transitive
restrictions on role names, while it does not lose the other properties of the unravelled
interpretation. This will enable us to conclude that given an arbitrary interpretation
we have an effective way of providing an example of an interpretation satisfying the
required properties. This is the fundamental brick which provides the foundations for
the completeness proof as we show in the following sections.
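The “bottom-up” construction can be sketched as follows. This is a simplified reading of Definition 3.8 under stated assumptions: role extensions are finite sets of pairs, `subroles[r]` (a hypothetical name) lists the roles strictly below `r` in the acyclic ordering, and each role simply inherits the completed extensions of its subroles before being closed when it is transitive.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]


def transitive_closure(pairs: Set[Pair]) -> Set[Pair]:
    """Smallest transitive relation containing `pairs` (naive fixpoint)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def close_roles(ext: Dict[str, Set[Pair]],
                transitive: Set[str],
                subroles: Dict[str, Set[str]]) -> Dict[str, Set[Pair]]:
    """Complete every role, visiting lower roles first (assumes acyclicity)."""
    result: Dict[str, Set[Pair]] = {}

    def build(role: str) -> Set[Pair]:
        if role in result:
            return result[role]
        pairs = set(ext[role])
        for sub in subroles.get(role, set()):
            pairs |= build(sub)  # lower roles are completed first
        if role in transitive:
            pairs = transitive_closure(pairs)
        result[role] = pairs
        return pairs

    for role in ext:
        build(role)
    return result
```

The recursion terminates only because the ordering is acyclic, which mirrors the well-definedness argument given above.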
3.2.2 Properties of unravelled interpretations
In this section we present three results which constitute the foundation for the completeness
proof. The first two propositions highlight some of the properties of the (transitive
closure of) unravelled interpretations, while the last one shows that these generated
interpretations are indeed canonical q.t. shrubs.
We start by showing that the unravelled interpretation �� satisfies some fundamental
properties; then in Proposition 3.10 we show that the same properties are maintained
even after taking the transitive closure.
Proposition 3.9. Let �� � ���� � �
��� be the unravelled interpretation of � � ��� ���.
Then, �� satisfies the following properties:
For any ! � , �!��� � !� (3.9a)
For any ��� �� � ��� and role , if ��� �� �
��� then ����� ���� � � (3.9b)
For any � � ��� , � �, and role , if ����� � � � � then
there is � � ��� s.t. ��� �� �
�� and ��� � (3.9c)
For any � � ��� and concept �, � � �
�� iff ��� � �� (3.9d)
For any � � ��� and functional role � %�� ,
"�� � ��� �� �
�� � "� � ����� � � �
� (3.9e)
For any �� � in ��� , and role names � � s.t. � & ,
if ��� �� � ��� and ����� ���� � � , then ��� �� �
��(3.9f)
Proof.
(3.9a) This is true by definition, because �!��� � � � � !� .
(3.9b) If ��� �� � �� then by Definition 3.7, in both sets (3.7a) and (3.7b) the
condition ����� ���� � � must be satisfied.
(3.9c) Let us assume that ����� � � � with � �. We distinguish the two cases in
which � is or is not mapped from an individual name (i.e. � � ��).
– Let us assume that � � �� . If � � , then there is an element � � �� such
that � �� � , and � �� � ( � � by Proposition (3.5a) ). In addition,
� � � �� by definition; therefore ��� � �� �
�� because it is in the set
(3.7a). The element � we are looking for is � �.
If !� � , then there is an element � � �� such that � � �� and
� �� � (by Proposition (3.5b) there is at least a label containing ). We
are going to show that the element � � is the � we need. Clearly � � !�
�� because !� �; in addition, ����� �� ��� � � because �� �� �
� �� � . Therefore the pair ��� � �� is in the set (3.7b), so it is in �� as
well.
– Now we assume that � !� �� . By definition there is an element � � ��
such that � � �� and � �� � . We can proceed as in the previous case
by considering � �. Clearly �� �� � � �� � , so ����� �� ��� � �;
therefore all the conditions for the set (3.7b) are satisfied (we assumed that
� !� ��) and ��� � �� �
�� .
(3.9e) Let us consider the mapping Æ restricted to the domain�� � ��� �� �
��
. First
we show that the codomain is equal to the set� � ����� � � �
�, and then that
the restriction of Æ is bijective. This is enough to prove that the two sets
have the same cardinality.7
If � ��� � ��� �� �
��
then ����� ���� � � by the already shown Propo-
sition (3.9b), therefore ��� �� � ����� � � �
�. For the other direction,
let be an element of� � ����� � � �
�. By Proposition (3.9c) there is an
element � such that ��� �� � �� and ��� � ; therefore � �
�� � ��� �� �
��
(and the restriction of Æ we are considering is surjective).
For showing that the restriction of Æ is bijective we only need to show that it is in-
jective. This is trivially true in the case that the cardinality of�� � ��� �� �
��
is less than 2. Therefore we assume that it contains at least two elements. Let
�� and �� be two elements of the first set such that �� !� ��. We reason by
contradiction by assuming that ���� � ����.
The pair ��� ��� either belongs to the set (3.7a) or (3.7b), and the same applies
to ��� ���. If both of them belong to (3.7a), then �� � � � and �� � � �, with
both � �� and � �� equal to . This means that � and � belong to ��; in
addition �� �� � �� �� because ���� � ����. Therefore � � � because � is a bijection:
this is clearly in contradiction with the hypothesis that �� !� ��.
If ��� ��� belongs to the set in (3.7a), then � � �� ; therefore ��� ��� cannot be
in the set in (3.7b). The converse applies if we assume ��� ��� in the set (3.7a).
Therefore both ��� ��� and ��� ��� must be in the set in (3.7b).
This means that �� � � � and �� � � � with � !� � because �� !� �� by
hypothesis. By definition of the domain �� there should be two labels �� and
�� such that � � ��� and � � ��� . In addition, ��� �� � ��� �� because
7 Note that we do not make any assumption on the initial interpretation.
���� � ���� therefore �� !� �� because both �� and �� are bijective. By
definition of the set (3.7b) the role must be in both the labels �� and ��.
The role is functional by hypothesis, and we already showed that if two labels
have a functional role in common they must coincide. This is in contradiction
with the fact that �� and �� are different. Therefore, the assumption that ����
can be equal to Æ���� is false, so Æ restricted to the two given sets of (3.9e) is
injective as well as surjective.
The existence of a bijection between two sets guarantees that the cardinality of
the two sets is the same.
(3.9f) If ��� �� is in ��� then, by Definition 3.7, the pair ��� �� is either in the set (3.7a)
or in the set (3.7b). In the first case ��� �� � �� �
�� and ����� ���� � � by
assumption8 therefore ��� �� � �� .
In the second case � � � for some in ��, � � ���, and it is not the case that
� � �� and � � � � . Note that � � (see Definition (3.4a)), so � ��� as
well by Definition (3.4c). This enables us to conclude that ��� �� � �� by using
Definition 3.7.
(3.9d) We prove this point by induction on the structure of the concept expression �.
For atomic concept names it is true by definition.
For the inductive step we assume that (3.9d) holds for the concepts ���, then
we show that it holds also for ��, � ��, and ���.
�� For any �, � � ������ iff � !� �
�� . By the induction hypothesis this is true
iff ��� !� �� , which is the case iff ��� � ����� .
� �� For any �, � � �� � ���� iff � � �
�� and � � ��� . As in the previous
case this is the case iff ��� � �� and ��� � �� , which is true iff ��� �
�� ���� .
��� If � � ������� , then there is a � � �
�� such that ��� �� � �� and � � �
�� .
By (3.9b) ����� ���� � � , and ��� � �� by the inductive hypothesis.
Therefore ��� � ������ .
For the other direction, let us assume that ��� � ������ . Then there is
� � such that ����� � � � and � �� . By (3.9c) there is � � ���
8 Note that, since we do not assume that the interpretation satisfies the role inclusion, we explicitly
add ����� ���� �� as a prerequisite.
such that ��� �� � �� and ��� � . Since � � �
�� by induction hypothesis
� � ������� .
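The induction above follows the recursive definition of the semantics of concepts. As a hedged illustration, the sketch below evaluates concepts by structural recursion and checks the analogue of (3.9d) on a small acyclic interpretation, whose full unravelling from one root is finite. Concepts are encoded as nested tuples; this encoding and the names `extension`, `unravel_from` and `agrees` are assumptions made for the example, and labels and individual names are ignored.

```python
def extension(concept, domain, atoms, roles):
    """Set of elements of `domain` satisfying `concept` (structural recursion)."""
    kind = concept[0]
    if kind == "atom":
        return set(atoms.get(concept[1], set())) & set(domain)
    if kind == "not":
        return set(domain) - extension(concept[1], domain, atoms, roles)
    if kind == "and":
        return (extension(concept[1], domain, atoms, roles)
                & extension(concept[2], domain, atoms, roles))
    if kind == "some":  # existential restriction: ("some", role, filler)
        filler = extension(concept[2], domain, atoms, roles)
        return {x for (x, y) in roles.get(concept[1], set())
                if x in domain and y in filler}
    raise ValueError(f"unknown constructor: {kind}")


def unravel_from(root, roles):
    """Full unravelling from `root`; terminates only on acyclic interpretations."""
    domain, todo = set(), [(root,)]
    unr = {name: set() for name in roles}
    while todo:
        s = todo.pop()
        domain.add(s)
        for name, rel in roles.items():
            for (x, y) in rel:
                if x == s[-1]:
                    t = s + (y,)
                    unr[name].add((s, t))
                    todo.append(t)
    return domain, unr


def agrees(concept, root, domain, atoms, roles):
    """Check: a path satisfies `concept` iff its last element does."""
    udomain, uroles = unravel_from(root, roles)
    uatoms = {a: {s for s in udomain if s[-1] in ext} for a, ext in atoms.items()}
    orig = extension(concept, domain, atoms, roles)
    unr = extension(concept, udomain, uatoms, uroles)
    return all((s in unr) == (s[-1] in orig) for s in udomain)
```

A finite check of this kind is of course no substitute for the proof; it only makes the statement of (3.9d) concrete.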
Now we turn our attention to the transitive closure of the unravelled interpretation,
showing that this operation does not destroy the properties described in Proposition 3.9,
provided that the original interpretation satisfies the transitive and inclusion
restrictions on role names imposed by the knowledge base. This assumption is necessary
because Definition 3.8 adds to the interpretation of roles exactly the pairs needed to
satisfy these restrictions. If the restrictions are not satisfied in the original
interpretation, then the original and the resulting interpretations no longer
correspond, which may invalidate some of the properties we stated in
Proposition 3.9.
Proposition 3.10. Let ��� � ������ �
���� be the transitive closure of the unravelled
interpretation �� � ���� � �
���. If the original interpretation � satisfies the transitive and
inclusion restrictions then ��� satisfies all the properties in Proposition 3.9.
Proof.
(3.9a) Since !���� !
�� then �!���� � !� .
(3.9b) If ��� �� � ��� then either ��� �� �
�� or there is a transitive role � &
such that there is a sequence � � ��� � � � � �� � � of elements of ��� such that
����� ����� � � � �� � � � � �� � ��� . In the first case ����� ���� � � by (3.9b). In
the second ������� ������� � � � �� � � � � �� � ��; therefore ����� ���� � ��
because �� is transitive by hypothesis (transitive restrictions are satisfied). In
addition, �� � � because the inclusion restrictions are satisfied; therefore
����� ���� � � .
(3.9c) This is trivially satisfied because �� � ��� for any .
(3.9e) If a role is functional, then by the restriction on the language it does not have any
transitive subroles (i.e. � !& for any � � � �� ). Therefore ��� �
�� by
definition, so the property is satisfied because �� satisfies (3.9e).
(3.9f) Let ��� �� � ����, we distinguish the two cases in which � is or is not transitive.
If � is transitive and is not transitive, then ���� �
��� by Definition 3.8; so
let us assume that is transitive as well. By Definition 3.8 there is a sequence
� � ��� � � � � �� � � of elements of ��� such that ����� ����� � � � �� � � � � �� � ��� .
By Proposition (3.9b)
������� ������� � � � �� � � � � �� � ��� and
������� ������� � � � �� � � � � �� � �
because � & and the inclusion restrictions are satisfied by hypothesis. We
can use Proposition (3.9f) for showing that ����� ����� � � � �� � � � � �� � �� ,
therefore ��� �� � ��� by construction (see Definition 3.8).
If � is not transitive and ��� �� � ���� , then either ��� �� � �
�� or there is a
transitive role � � s.t. � � & �, � !& � �, and ��� �� � � ���� . In the first case
��� �� � �� by Proposition (3.9f) and
�� � ��� . In the second case � � &
because � & and we can proceed as in the previous case in which � was
transitive, by using � � in place of �.
(3.9d) We can proceed exactly as in the corresponding proof in Proposition 3.9. In fact,
the proof relies on the previously shown (3.9b) and (3.9c).
Before verifying that whenever an interpretation satisfies a KB, the transitive closure
of its unravelled interpretation satisfies the very same KB, we want to be sure that
the interpretation we defined is a canonical quasi transitive shrub. This is what we are
going to prove with the next proposition.
Proposition 3.11. Let ��� � ������ �
���� be the transitive closure of the unravelled
interpretation �� � ���� � �
���. Then if � satisfies the role ordering, ��� is a canonical
quasi transitive shrub interpretation.
Proof. First we show that ��� is a quasi transitive shrub (Definition 3.2). The first
two properties, (3.2a) and (3.2b), are trivially satisfied by definition.
(3.2c) Let ��� � � be a pair in ��� for an arbitrary role name . Summarising the
Definition 3.8, either ��� � � � �� or there is a transitive role � & and a finite
sequence of elements ��� � � � � �� of ��� such that
����� ����� � � � �� � � � � �' �� �� � �� �� � � � � ��� �
In the first case ��� � � must be in the set (3.7a) (because � !� � by definition);
therefore ��� � � � �� . In the second case, the same property can be shown for
all the elements ����� � � � � ��.
(3.2d) Let ��� � � be a pair in ��� for an arbitrary role name .
If is transitive then there is a finite sequence of elements ��� � � � � �� of ��� such
that ����� ����� � � � �� � � � � �' �� �� � �� �� � � � � �� . Note that ���� � �
because � cannot be the empty sequence (see (3.7b)). If � � � then � � ���� �
�; otherwise ������ ��� � ��� � � is in �� and ��� ����� � ��� �� is in
�� by
definition (take the sequence ��� � � � � ���� in Definition 3.8).
If is not transitive, then ��� � � � �� or ��� � � �
���� � ������ ��� �
���.
In the first case � � � by definition (see (3.7b)). In the latter case, there is a
transitive role � s.t. � & , !& � and ��� � � � ����. We just showed that
if the role is transitive then either � � � or ���� ��� ��� � �� � ���� . Moreover,
���� &
��� because � & .
(3.2e) Suppose ���� ��� ��� � �� � � ��� � ���, we distinguish two cases where � � �
and � � �.
– If � � � then ��� � �� � ��� . Since we already showed that (3.2d) holds,
we can use it to conclude that either � � � (i.e. ��� � �� � ��� � �� � ���)
or ��� � �� � ���. In both cases the property is satisfied.
– Let us assume � � �. Since ��� �� � ���, then � !� � �� � � � for all
� � �� � � � � � by construction (see Definitions 3.8 and 3.7).
Firstly we show that for any role , when � !� � �� � � � for all � �
�� � � � � � and ��� � �� � � �� � ���, then �� �� � � ���� � �� � � �� �
��� for
� � �� � � � � �. Since ��� � �� � � �� � ���, then either � � � �� � � ���
or ���� � �� � � ����� �� �� � � ���� � �� � � ��� � ��� . The first case can
be ruled out because � !� � �� � � � for all � � �� � � � � �; therefore both
��� � �� � � ���� and �� �� � � ���� � �� � � �� are in ���. We can apply the
same arguments to ��� � �� � � ���� to conclude that both ��� � �� � � ����
and �� �� � � ���� � �� � � ���� are in ��� as well; this inductive argument
can be carried on for each � �� � � ���, as long as � ' # � �. Therefore
���� � �� � � ����� �� �� � � ���� � �� � � ��� � ��� for any � � �� � � � � �.
In addition, we can use the fact that ��� � � �� � ��� to conclude that
��� � �� by using the same arguments.
If is transitive then, ��� � �� � �� and �� �� � � ���� � �� � � �� �
��
for any � � �� � � � � � (by construction, see Definition 3.8). Therefore
���� � �� � � �� � � � �� � � � � �� � ���, so ��� � �� � � �� �
���.
If is not transitive, then there is a transitive role � & such that
��� �� � � �� � ��� because � � � by assumption, so ��� � �� � � �� can-
not be in �� . By using the same arguments as the transitive case we can
conclude that ��� � �� � � �� � ����, therefore ��� � �� � � �� �
��� as well.
The next step is to show that ��� is canonical (Definition 3.6).
(3.6b) Let �� � � �� � be role names and �� � elements of ���� such that � !� � . We
have to show that if ���� ��� � �� � � � � � �
� then there is a label � s.t.
��� � � �� �� � �. Since � !� � and ���� ��� � �� , then there are two
elements � in ���� and in �� such that � � � (by Definition (3.2c), because
it has been proved above that ��� is a q.t. shrub). By using Definition (3.2d)
together with the fact that ��� � � � ����, for each role �, we can conclude
that either � � � or ���� ��� ��� � �� � ����. Note that if � � �, this must be
true for all the roles �� � � �� � and the hypothesis would become ���� � �� �
�� � � � � ��
�; on the other hand, � would be the same for all the roles again,
therefore ���� � �� � �� � � � ���
� . This shows that we can restrict ourselves
to the case ���� � �� � �� � � � ���
� without loss of generality, which allows
us to consider only the Definition 3.7 of unravelled interpretation, by ignoring
the transitive closure.9
By the fact that the pair has the structure ��� � �, we know that for every role �
��� � � ����� � � � �
��� ����� � ����� � �� � �
� and � � � �
therefore � � � � for all roles �� � � �� �, which shows that there is a label
( � �) which includes all the roles.
9 The role hierarchy does not add any pair ��� ��� to a role because of Property (3.9c) and the
fact that the original interpretation satisfies the role ordering by hypothesis.
(3.6a) Let �� �� � be different elements in ���� such that � !� � and
���� ��� ��� ��� ��� ��� � ����
If is transitive then (3.6a) is verified by definition because & . So let
be non–transitive: this means that either ��� �� � �� or there is a transitive role
� s.t. � & , !& �, and ��� �� � ���� (see Definition 3.8). Analogously,
either ��� �� � �� or there is a transitive role � � s.t. � � & , !& � �, and
��� �� � � ���� . Let us consider this case by case.
– If ��� �� is in �� , then � � � for some � �� because � !�
��� (see
Definition 3.7). If ��� �� � �� then � � � as well, therefore � � �. This
is in contradiction with the assumption that � and � are distinct.
– Let ��� �� be in �� and ��� �� be in � �
���. Then � � � and there are
��� � � � � �� in ��� s.t. �� � �, �� � �, and ����� ����� � � � �� � � � � �' �� �
� ��� . Since � � � then ���� � �, therefore ��� �� � ������ ��� is in � �
��
which is included in � ����.
– If ��� �� is in ����, and ��� �� is in �� , then we can conclude that ��� �� is
in ���� as well by using the very same arguments as for the previous case.
– Let ��� �� be in ����and ��� �� be in � �
��� . Then there are ��� � � � � �� and
���� � � � � ���� in �
�� s.t. �� � �, ��� � �, �� � ���� � �,
����� ����� � � � �� � � � � �' �� � ����
and
������ ������ � � � �� � � � � �� ' �� � � �
���
As in the previous cases we note that ���� � ������; therefore ������ �� �
��� � � �
�� . Since ��� � ���� by definition (see Definition 3.8), and � �
�� �
� ���� , then ������ �� � �
��� � � ����.
By using the already proved Property (3.6b), we know that there is a label
� containing both � and � �; by Definition (3.4b), either � � � � or � �
� �. Note that by assumption neither � nor � � can have any functional role
which includes them (or be functional themselves). Let us assume that
� � � �; since � � is not functional then either � � & � or there is a role �
such that � � & � and � � �. The first case is our goal, while in the
3.3. COMPLETENESS OF Q.T. SHRUB INTERPRETATIONS 59
latter case we can assume that � must not be functional because � � cannot
be included into a functional role. Since the ordering � is well defined,
sooner or later we are going to hit � without reaching any functional role;
therefore the result is a chain of inclusions which shows that � � & � by the
transitivity of &.10 Assuming that � � � � we can conclude that � & � � by
using the same arguments we used for the dual case.
Therefore we have just shown that either � � & � or � & � �. Let us consider
the case in which � & � �. We assumed that � satisfies the role ordering,
then by using (3.9b) and (3.9f) it is easy to show that if � & � � then ���� �
� ���� (see the proof for Lemma 3.12). Therefore both ��� �� and ��� �� are
in � ����. If we consider the case where �� & � we obtain the same result
with � instead of � �.
3.3 Completeness of q.t. shrub interpretations
Now all the pieces are in place for showing the completeness of the described models.
Obviously, if a knowledge base has a quasi transitive shrub model it has a model.
The difficult bit is showing that having an unrestricted model implies that there is a
quasi transitive shrub model as well. Clearly, the complex translation which has been
described in the previous sections provides the basis for such a proof. We start by
showing that given an arbitrary interpretation satisfying a KB, the transitive closure of
its unravelled interpretation satisfies the KB as well.
Lemma 3.12. Let � be a knowledge base, & be the role ordering induced by �, and
� �� %�� be the transitive and functional role names. We assume that � ��
and %�� are consistent w.r.t. &. If an arbitrary interpretation � � ��� ��� satisfies
�, then the transitive closure of its unravelled interpretation ��� � ������ �
����
satisfies � as well.
Proof. We prove the lemma by considering all the assertions in � and showing that
they are satisfied by ���. Since � satisfies �, the requirements for Proposition 3.10 are
satisfied. Therefore we are guaranteed that the properties in Proposition 3.9 hold for ���.
10 The last point can be demonstrated more formally by induction on �, but we think that it is clear
enough.
Let �� be a concept axiom in �. Assume that the axiom is not satisfied, then
there is an element � such that � � ���� and � !� �
���. By (3.9d), ��� � ��
and ��� !� �� therefore � does not satisfy �. This is in contradiction with the
hypothesis, therefore � � must be satisfied.
Let � be a role axiom in �; then � & by definition. Let ��� �� be a pair
in ����, then ����� ���� � �� by (3.9b). We can conclude that ��� �� � ��� by
using (3.9f).
If � � �� , then ��� is transitive by construction (see Definition 3.8). This
can be proved by considering that if ���� ��� ��� ��� � ���, then we have two
paths in �� : one connecting � to �, and a second connecting � to �. Therefore
there is a path in �� connecting � to �.
If is functional, then for any � � ���� ,
"�� � ��� �� �
��� � "� � ����� � � �
�by (3.9e). In addition, "
� � ����� � � �
�$ � because � is functional;
therefore ��� is functional as well.
Let ��� be a concept assertion in �. It is easy to see that ����� ��� , therefore
������ � �� . The assumption that ���
�!� �
��� implies that ������ !� �� by
(3.9d). Obviously this is a contradiction with the hypothesis that � satisfies �;
therefore ����� �
��� .
Let ��� ��� be a role assertion in �. Let us consider the pair ������ �
����: by
Definition 3.8 ���� � �
�� and ���� � �
�� . Note that ���� � ���� � �� �
�� , and
������� ��
���� � ���� ���. Moreover, ��� � ��� � � because � satisfies the role
assertion. Therefore, ������ �
���� � �� because the pair is in the set (3.7a), and
������ �
���� � ��� because �� �
��� .
We can now conclude by stating the completeness of quasi transitive shrub models
w.r.t. our logic. In Chapter 7 we present a slightly different version of the present
theorem for deduction instead of satisfiability.
Theorem 3.13. A knowledge base � is satisfiable iff there is a canonical quasi transi-
tive shrub interpretation satisfying it.
3.4. APPLICATION TO TERMINOLOGICAL REASONING 61
Proof. The “if” direction is trivial because if there is a quasi transitive
shrub interpretation satisfying �, then � is satisfiable by definition. For the “only if”
direction, let � be an interpretation satisfying �, and ��� � ������ �
���� the transitive closure
of its unravelled interpretation. According to Lemma 3.12 ��� satisfies � as well; in
addition ��� is a canonical quasi transitive shrub interpretation by Proposition 3.11.
3.4 Application to terminological reasoning
We can use the very same technique to establish the completeness of the class of inter-
pretations we described w.r.t. terminological reasoning. In particular we concentrate
on the problem of verifying the satisfiability of a concept expression w.r.t. a given ter-
minology (see Chapter 2). We have already shown that the structure of models for a
generic kb is a set of quasi transitive trees “glued” together through their roots. For
terminological reasoning (i.e. without individuals) models are much simpler because
they are single transitive trees. This property is exploited in Chapter 5 to show the
completeness of the presented kb satisfiability algorithm.
A given concept � is satisfiable w.r.t. a terminology � iff there is an interpretation
� satisfying � , such that the set �� is non empty. The property we want to show is
that this is the case iff there is a quasi transitive shrub interpretation � s.t.: it satisfies
� , it is a single quasi transitive tree, and the root of this tree is in � �� . These properties
will be used in Chapter 5.
We notice that the given satisfiability problem can easily be reduced to knowledge
base satisfiability (see Schaerf [1994]). In fact,� is satisfiable w.r.t. a terminology � iff
the knowledge base � � �� � ������ is satisfiable (where � is an arbitrary individual
name). This can easily be proved by considering that if � is an interpretation satisfying
� , such that �� is not empty; then, we can build an interpretation for � by mapping
� to one of the elements of �� . Conversely, if � satisfies � then �� is an element of
��; therefore � is satisfiable w.r.t. � .
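The two halves of this reduction can be sketched as data manipulations. The sketch assumes a deliberately naive representation (axioms and assertions as pairs of strings); `KnowledgeBase`, `concept_sat_as_kb` and `witness_for_kb` are hypothetical names, and no satisfiability test is implemented here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class KnowledgeBase:
    tbox: FrozenSet[Tuple[str, str]]  # axioms as (sub, super) concept strings
    abox: FrozenSet[Tuple[str, str]]  # concept assertions as (individual, concept)


def concept_sat_as_kb(tbox, concept, individual="a"):
    """Build the KB made of the terminology plus one assertion on a fresh name."""
    return KnowledgeBase(frozenset(tbox), frozenset({(individual, concept)}))


def witness_for_kb(concept_extension, individuals=None, name="a"):
    """Extend an individual mapping by sending `name` to some element of the
    (non-empty) extension of the concept in a model of the terminology."""
    if not concept_extension:
        raise ValueError("the extension is empty: no witness available")
    mapping = dict(individuals or {})
    mapping[name] = min(concept_extension)  # any element works; min is deterministic
    return mapping
```

The choice of `min` is arbitrary: the argument above only needs some element of the extension.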
We use this reduction to establish the completeness result. Let us start from an
arbitrary interpretation � satisfying �. As we have just shown, there is a direct way
of obtaining an interpretation � � satisfying � , by taking � � as equal to � but for the
addition of the mapping from � to a given element of � � . As shown in Section 3.2.1
we build �� �� as the transitive closure of the unravelled interpretation of � �.
The main idea is “extracting”, from �� �� the single tree rooted in �����
. Since in
� there is no role assertion, all the q.t. trees in �� �� are unconnected, and there
3.5. REMARKS 62
cannot be any loop on the root � ����
(see Definition 3.7). Formally, we extract an
interpretation � � ����� ���� by selecting all the elements of � ����
connected to � ����
:
��� � �� � � � �����
, or there is a finite sequence ���� � � � � ��� � �����
( ����
and roles �� � � � � ��� s.t.
�� � �����
� �� � � and ���� ����� � � for � � �� � � � � �' ���
and then we define the new interpretation function ��� restricting the original � ����
to the
new domain:
��� � �����
����
�� � ����
���� ����
By construction � is connected, in the sense that there is a path from the root � ����
to any other individual. It is not difficult to verify that � is still a canonical quasi
transitive shrub interpretation satisfying � , and the root is in � �� .
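On a finite interpretation the extraction step amounts to a reachability computation followed by a restriction of the interpretation function. The following is a minimal sketch under that finiteness assumption; `extract_rooted` is a hypothetical name, and atomic concepts and roles are given as plain sets.

```python
from collections import deque


def extract_rooted(root, atoms, roles):
    """Keep only the elements reachable from `root` through any role."""
    reachable = {root}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for rel in roles.values():
            for (s, t) in rel:
                if s == x and t not in reachable:
                    reachable.add(t)
                    queue.append(t)
    # Restrict the interpretation of concepts and roles to the new domain.
    new_atoms = {a: set(ext) & reachable for a, ext in atoms.items()}
    new_roles = {r: {(s, t) for (s, t) in rel if s in reachable and t in reachable}
                 for r, rel in roles.items()}
    return reachable, new_atoms, new_roles
```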
3.5 Remarks
At this point we need to clarify why the complicated mechanism that has
been put in place was really necessary. An alternative technique can be derived from
the algorithm used for showing the satisfiability of DL formulae w.r.t. a terminology
(see Horrocks [1998]). It can be shown that if the formula is satisfiable the algorithm
produces a pseudo–model from which an interpretation can be trivially built. It is not
difficult to show that the interpretations generated by the pseudo–models satisfy the
properties we are interested in. This method is very good for showing the completeness
of the class of interpretations w.r.t. the problem of kb satisfiability (Section 3.3), but it
has the disadvantage that the connection between an arbitrary interpretation and one of
the interpretations belonging to the restricted class cannot easily be established.
This connection is not necessary for kb satisfiability, but we need it for the prob-
lem of query answering (see Horrocks and Tessaris [2000]) when the query language is
different from the assertional language (see Section 7.1). As we will see in Chapter 6,
query answering is strongly related to logical deduction, which, roughly speaking, is
the problem of deciding whether a formula is verified in all the interpretations satisfy-
ing a knowledge base (written as � �� $).
In general, when the knowledge base and the formulae share a common language
closed under negation the problem � �� $ is simply transformed into the kb (un)satisfiability
problem � � ��$�. In the case of our language, this is not directly possible
because the formula $ is written in a language strictly more expressive than the one
of �. However, our goal is to reduce the problem to KB (un) satisfiability anyway, by
introducing a transformation of the original formula $ into a kind of negation $, such
that � �� $ iff � � $ is not satisfiable.
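The reduction of deduction to unsatisfiability can be illustrated in a much simpler setting. The sketch below is a propositional analogue only, not the transformation developed in this thesis: formulas are Python predicates over truth assignments, satisfiability is decided by brute force, and entailment is tested by adding the negated formula to the knowledge base. All names are hypothetical.

```python
from itertools import product


def satisfiable(formulas, variables):
    """True iff some truth assignment makes every formula true."""
    for values in product([False, True], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(f(v) for f in formulas):
            return True
    return False


def entails(kb, phi, variables):
    """kb entails phi iff kb together with the negation of phi is unsatisfiable."""
    def negated(v):
        return not phi(v)
    return not satisfiable(list(kb) + [negated], variables)
```

For instance, with the assertions `p` and `p implies q`, the test succeeds for `q` and fails for an unrelated variable `r`.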
The required transformation is inspired by the “tree–model property” of the DLs
we consider (see Section 6.2.2); therefore we show a restricted version of the logical
deduction problem in which we consider only interpretations belonging to a given class
�.11 I.e. deciding whether a formula is verified in all the interpretations belonging to
the class � and satisfying a knowledge base (written as � ��� $).
The class of interpretations cannot be chosen arbitrarily; in fact we are interested in
a class � such that � �� $ iff � ��� $. Our candidate class is the one presented in
this chapter, therefore we must show that it satisfies the required property. The “only
if” direction is the easy bit, since � �� $ takes into account all the interpretations; what remains
is to show that if � ��� $ then � �� $.
The technique we will use for this proof (which will be presented in detail in Chap-
ter 7) relies on the interpretation transformation presented in Section 3.2; in particular,
on the ability to relate the original and the transformed interpretations in both directions.
If we were interested in kb satisfiability only, we would need only the first direction;
therefore the mere existence of an arbitrary interpretation belonging to the required
class would have been enough.
11In particular we consider the class of q.t. shrub canonical w.r.t. the kb (see Section 3.1).
Chapter 4
KB satisfiability algorithms
In this chapter we present a brief overview of the different methods for KB satisfiability
presented in the literature, then we describe the precompletion technique, investigated
in this thesis. In addition, we provide the motivation for our decision to concentrate on
the precompletion method.
4.1 Alternative Abox reasoning techniques
In our analysis we concentrate on complete algorithms suitable for DLs whose expres-
sivity is at least ��� with general axioms. This choice is dictated by the acknowledge-
ment that the ability to express general axioms is essential for modern DL systems. For
this reason, for example, we are not going to describe the structural algorithm used by
the CLASSIC DL system.
4.1.1 Direct tableaux
In Section 2.2.2 we provided an example of verification for the satisfiability of a KB
based on an extension of the tableaux–based technique presented in Section 2.2.1. In
general, tableaux algorithms for terminological reasoning can be extended to hybrid
reasoning without major changes.
The main idea is that the expansion starts with an initial set of constraints (see
Equation 2.5) corresponding to the assertions in the Abox (see Buchheit et al. [1993],
4.1. ALTERNATIVE ABOX REASONING TECHNIQUES 65
Haarslev and Moller [2000b], Horrocks et al. [2000b]).1 Individual names are sub-
stituted by different variables in the constraint system; therefore if the unique name
assumption holds for individuals, inequality constraints are added for imposing that
two different individual names are never merged.
It is worth mentioning that up until now the fastest DL system providing Abox
reasoning (described in Haarslev and Moller [2000a]) uses a direct tableaux technique.
We are interested in experimenting with a technique which maintains the algorithmic distinction between Abox and Tbox. In addition, we want to reuse terminological reasoners as they are (if possible). We therefore decided not to investigate the direct tableaux technique itself but a variation of it, called the precompletion technique, which is described in Section 4.2.
4.1.2 Encoding
The encoding technique reduces the KB satisfiability problem to the problem of veri-
fying the satisfiability of a concept w.r.t. a terminology (see De Giacomo and Lenzerini
[1996]). This is achieved by treating individuals as mutually disjoint newly introduced
concept names, and encoding the Abox assertions as terminological axioms. For example, the assertions $a{:}C$ and $\langle a,b\rangle{:}R$ are encoded as the two axioms $A_a \sqsubseteq C$ and $A_a \sqsubseteq \exists R.A_b$ respectively.
This idea alone does not provide a complete solution, since when concepts are used
instead of individuals we are relaxing the fundamental property of an individual of be-
ing unique. In fact we cannot restrict the cardinality of the concept the individual has
been substituted with, therefore it is possible that a knowledge base which is unsatisfi-
able because of this uniqueness restriction results in a satisfiable one once encoded.
A similar idea was exploited in Borgida and Patel-Schneider [1994] and Era and
Donini [1992]; however in the first case a different semantics was adopted for the
individual name,2 while the second approach works only for not very expressive DLs.
In De Giacomo and Lenzerini [1996] the proposed algorithm deals with very ex-
pressive DLs, providing correct and complete reasoning. The trick they use is to ensure
that, given an interpretation for the encoded KB, all the elements belonging to the same
individual representative concept are indistinguishable. To guarantee this property they
use terminology axioms to impose that if an element of a representative concept $A_a$ has a property (i.e. belongs to a concept) $C$ then all the elements of $A_a$ have the same property.
1 Note that terminological reasoning always starts with a single concept formula $C$ to check, therefore corresponding to the single constraint $x{:}C$.
2 In fact they relaxed the assumption that an individual is unique.
Note that the concept $C$ is not specified, and in principle there are infinitely many different expressions; therefore, infinitely many new axioms for each representative concept. However, they show that the number of different concept expressions they need is finite and polynomially bounded by the size of the original knowledge base.
Although the technique is a very powerful theoretical mechanism for investigating the computational properties of very expressive DLs, this method has not led to any practical implementations. The encoding they suggest requires a very expressive underlying language (i.e. Converse PDL), for which no efficient reasoner is yet available;3 moreover, the size of the generated terminology, although polynomial, soon becomes unmanageable when the number of individuals grows.
4.1.3 Resolution based methods
In spite of the fact that the DL community is strongly focused on tableaux–based meth-
ods, recently there have been some attempts to apply resolution–based methods to the
KB satisfiability problem. We will not spend much time here on this approach since
reasoning with Aboxes using resolution seems to be still at an early stage. However,
some results have been obtained by translational methods4 applied to terminological
and Abox reasoning (see Hustadt and Schmidt [2000], Tammet [1995]). Note that a
resolution–based terminological reasoner can be used in conjunction with the tech-
nique we are investigating (see Section 4.2).
Recently, a direct resolution–based method has been proposed for KB satisfiability
(see Areces et al. [1999]). Although the DL they cover lacks some important fea-
tures (for example the possibility of expressing general inclusion axioms) we think
that more work in this subject can produce interesting results, since several modern
computational logic tools are based on resolution methods rather than tableaux.
3 In fact, no attempts have been made to investigate whether the encoding technique is applicable to different DLs.
4 Modal or Description logics are translated into First Order Logic and then resolution methods are applied with an appropriate strategy for guaranteeing termination.
4.2 Precompletion
The main idea behind the precompletion technique is to split the KB satisfiability algorithm into two parts. In the first part all the information implicit in the role assertions is made explicit, generating what we call a “precompletion” of the knowledge base; in the second part a terminological reasoner is used for verifying the consistency of the concept assertions for each individual name (see Hollunder [1996], Donini et al. [1994]).
The precompletion approach has been successfully used for both providing cor-
rect and complete algorithms, and analysing the complexity of the KB satisfiability
problem. The works cited above focused on DL knowledge bases with empty terminologies, and languages not including transitive or functional roles.5 With the work presented in this thesis we generalise the precompletion technique to KBs with general inclusion axioms, transitive roles, and functional roles (i.e. the DL $\mathcal{SH}f$).
Let us consider for example a very simple Abox containing only the assertions:
$$\mathcal{A} = \{\, a{:}\forall R.C,\ \langle a,b\rangle{:}R,\ b{:}\neg C \,\}$$
The first two assertions can be used to derive the new assertion $b{:}C$; it is easy to realise that $\mathcal{A}$ is satisfiable iff $\mathcal{A}' = \mathcal{A} \cup \{b{:}C\}$ is satisfiable as well (which is not the case, because $b$ cannot be in $C$ and $\neg C$ at the same time). The interesting point is that when we check the satisfiability of $\mathcal{A}'$ we do not need to consider the role assertion, because its effects have been made “explicit” in the new assertion. Therefore we can verify the KB satisfiability by checking the satisfiability of the concepts $\forall R.C$ and $C \sqcap \neg C$ separately.
This technique consists of a correctness–preserving process which eliminates spe-
cific information regarding dependencies between individuals, while maintaining the
consequences of such information. Once these dependencies are eliminated, the as-
sertions about a single individual can be independently verified, ignoring the fact that
an individual is involved. The precompletion, or elimination of the dependencies, is
performed by adding new assertions using a set of nondeterministic syntactic rules.
Because of the nondeterminism of the rules, many different precompletions can be de-
rived from a single knowledge base, which is satisfiable if and only if at least one of
these precompletions is satisfiable.
A subset of the tableaux rules for $\mathcal{SH}f$ (see Horrocks [1998]) is used to transform the KB into one of its precompletions (see Figure 5.1). In the case of $\mathcal{SH}f$, the only “dropped” rule is the $\exists$-rule (see Figure 2.1), and this means that new variables are never generated.
5 Even if the DLs included number restriction constructors, general axioms are needed for simulating functional restrictions.
The nondeterminism is introduced by the $\sqcup$-rule and possibly an exponential number of precompletions can be generated. On the other hand, since the number of variables is constant, the size of a precompletion is always polynomial in the size of the original KB, and, most importantly, we do not need to worry about the possible non–termination of the algorithm (see Section 5.1).
The main advantage of this technique lies in the fact that the concept satisfiability
tests can be performed by using any available terminological reasoner (providing a suf-
ficiently expressive DL language); in particular, impressive results have been recently
obtained by optimised tableau–based systems like FaCT (see Franconi et al. [1998b]).
Indeed, one of the primary motivations for reexamining the precompletion approach is
the availability of such optimised systems.
The formal proof of correctness and completeness of the algorithm is provided in Chapter 5; here we sketch the actual algorithm and the technique used for proving its completeness.6
4.2.1 KB satisfiability algorithm
The input consists of a set containing the constraints (see Formulae 2.5) corresponding to the assertions in the Abox. In this section we introduce the algorithm for the less expressive language $\mathcal{SH}$ (i.e. without functional roles). The rules for this language are simpler and more intuitive; the full details for $\mathcal{SH}f$ are presented in the next chapter.
The rules in Figure 4.1 are repeatedly applied to the initial set until either no rule is
applicable or a contradictory combination of constraints is detected (a so-called clash).
The $\sqcup$-rule generates several alternative branches; these are exhaustively explored by means of backtrack points, where the algorithm restarts in case of a failure.
Intuitively, a clash corresponds to an unsatisfiable combination of constraints in
the constraint set. When the precompletion process encounters a clash there is no need to continue adding new constraints, since the precompletion will be unsatisfiable anyway;
therefore the algorithm backtracks to the most recent backtrack point and continues
the precompletion from that state.
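$\mathcal{A} \rightarrow_{\mathcal{T}} \{o{:}C\} \cup \mathcal{A}$
    if $o$ is an individual name of the KB, $\top \sqsubseteq C$ is in $\mathcal{T}$, and $o{:}C$ is not in $\mathcal{A}$.
$\mathcal{A} \rightarrow_{\sqcap} \{o{:}C_1, o{:}C_2\} \cup \mathcal{A}$
    if $o{:}C_1 \sqcap C_2$ is in $\mathcal{A}$, and either $o{:}C_1$ or $o{:}C_2$ is not in $\mathcal{A}$.
$\mathcal{A} \rightarrow_{\sqcup} \{o{:}D\} \cup \mathcal{A}$
    if $o{:}C_1 \sqcup C_2$ is in $\mathcal{A}$, and $D = C_1$ or $D = C_2$, and $o{:}D$ is not in $\mathcal{A}$.
$\mathcal{A} \rightarrow_{\forall} \{o'{:}C\} \cup \mathcal{A}$
    if $o{:}\forall R.C$ is in $\mathcal{A}$, and $\langle o,o'\rangle{:}S$ is in $\mathcal{A}$, and $S \sqsubseteq R$, and $o'{:}C$ is not in $\mathcal{A}$.
$\mathcal{A} \rightarrow_{\forall_+} \{o'{:}\forall R.C\} \cup \mathcal{A}$
    if $o{:}\forall Q.C$ is in $\mathcal{A}$, $\langle o,o'\rangle{:}S$ is in $\mathcal{A}$, and there is a transitive role $R$ such that $S \sqsubseteq R \sqsubseteq Q$, and $o'{:}\forall R.C$ is not in $\mathcal{A}$.
Figure 4.1: Precompletion rules for $\mathcal{SH}$
6 In this case the correctness is the easy part.
If none of the rules is applicable a precompletion has been generated, and can be checked for its satisfiability. This is done by gathering all the concept assertions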
concerning an individual, the concepts are then conjoined generating a new single
concept associated to each single individual. These concepts are then verified by using
an external call to a terminological reasoner; if they are all satisfiable then the algorithm
terminates with success, otherwise it backtracks as in the case of clash detection.
If the process fails to find a satisfiable precompletion after all the nondeterministic
branches have been explored, then it terminates with failure.
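The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions, not the implementation evaluated in this thesis: it covers an empty terminology with no role hierarchy and no transitive roles, it applies the disjunction rule only when neither disjunct is already present, and the tuple encoding of concepts, the function names (`precompletions`, `kb_satisfiable`, `concept_sat`) and the naive stand-in for the external terminological reasoner are all hypothetical.

```python
# Concepts in negation normal form, encoded as nested tuples (an assumption):
#   "A", ("not", "A"), ("and", C, D), ("or", C, D), ("all", R, C), ("some", R, C)
# Role assertions are triples (individual, role, individual).

def is_op(c, op):
    return isinstance(c, tuple) and c[0] == op

def has_clash(pairs):
    """True if some individual is asserted to be in both A and not A."""
    return any((x, ("not", c)) in pairs for (x, c) in pairs if isinstance(c, str))

def concept_sat(concepts):
    """Naive stand-in for the terminological reasoner (empty Tbox only)."""
    cs = set(concepts)
    todo = list(cs)
    while todo:                                   # conjunction rule
        c = todo.pop()
        if is_op(c, "and"):
            for d in c[1:]:
                if d not in cs:
                    cs.add(d)
                    todo.append(d)
    if any(("not", c) in cs for c in cs if isinstance(c, str)):
        return False                              # clash
    for c in cs:                                  # disjunction rule
        if is_op(c, "or") and c[1] not in cs and c[2] not in cs:
            return concept_sat(cs | {c[1]}) or concept_sat(cs | {c[2]})
    return all(                                   # one successor per existential
        concept_sat({c[2]} | {d[2] for d in cs if is_op(d, "all") and d[1] == c[1]})
        for c in cs if is_op(c, "some"))

def precompletions(concept_assertions, role_assertions):
    """Yield each clash-free precompletion as a set of (individual, concept) pairs."""
    abox = set(concept_assertions)
    changed = True
    while changed:                                # deterministic rules
        changed = False
        for (x, c) in list(abox):
            new = set()
            if is_op(c, "and"):
                new = {(x, c[1]), (x, c[2])}
            elif is_op(c, "all"):
                new = {(y, c[2]) for (s, r, y) in role_assertions
                       if s == x and r == c[1]}
            if not new <= abox:
                abox |= new
                changed = True
    if has_clash(abox):
        return                                    # dead branch: the caller backtracks
    for (x, c) in abox:                           # nondeterministic rule
        if is_op(c, "or") and (x, c[1]) not in abox and (x, c[2]) not in abox:
            for d in c[1:]:
                yield from precompletions(abox | {(x, d)}, role_assertions)
            return
    yield abox

def kb_satisfiable(concept_assertions, role_assertions, reasoner=concept_sat):
    """Succeed as soon as one precompletion passes every per-individual test."""
    for abox in precompletions(concept_assertions, role_assertions):
        individuals = {x for (x, _) in abox}
        if all(reasoner({c for (x, c) in abox if x == i}) for i in individuals):
            return True
    return False

# A small Abox about three individuals (illustrative data).
D = ("or", ("not", "Female"), ("all", "LOVES", "Female"))
ABOX = [("sarah", ("all", "FRIEND", D)), ("susan", "Female"),
        ("andrea", ("some", "LOVES", ("not", "Female")))]
ROLES = [("sarah", "FRIEND", "susan"), ("sarah", "FRIEND", "andrea"),
         ("susan", "LOVES", "andrea")]
print(kb_satisfiable(ABOX, ROLES))  # prints False
```

The `reasoner` parameter mirrors the separation discussed in this section: the precompletion code only needs a yes/no answer for each conjunction of concepts, so any terminological reasoner with that interface could be plugged in.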
As presented here the algorithm sounds rather naive and prone to inefficient explo-
ration of the search space. In fact, we need a few “tricks” to avoid a hopelessly slow
algorithm. In Chapter 8 we will come back to this point and show how to improve the
algorithm to obtain acceptable behaviour in most of the cases. Note that the theoretical
worst case complexity of reasoning is EXPTIME, therefore there can be pathological
KBs manifesting this complexity.
Example 4.1
We use a slightly modified version of Example 2.3 to illustrate the precompletion algorithm. Let us consider the problem of checking the satisfiability of the Abox
$$\left\{\begin{array}{l}
\mathsf{sarah}{:}\forall\mathsf{FRIEND}.(\neg\mathsf{Female} \sqcup \forall\mathsf{LOVES}.\mathsf{Female}),\\
\mathsf{susan}{:}\mathsf{Female},\\
\mathsf{andrea}{:}\exists\mathsf{LOVES}.\neg\mathsf{Female},\\
\langle\mathsf{sarah},\mathsf{susan}\rangle{:}\mathsf{FRIEND},\\
\langle\mathsf{sarah},\mathsf{andrea}\rangle{:}\mathsf{FRIEND},\\
\langle\mathsf{susan},\mathsf{andrea}\rangle{:}\mathsf{LOVES}
\end{array}\right\}$$
with an empty terminology. This Abox corresponds to the one of Example 2.3 where the two constraints $\mathsf{bill}{:}\neg\mathsf{Female}$ and $\langle\mathsf{andrea},\mathsf{bill}\rangle{:}\mathsf{LOVES}$ have been replaced by the constraint $\mathsf{andrea}{:}\exists\mathsf{LOVES}.\neg\mathsf{Female}$. Note that this Abox is still unsatisfiable; this can be easily verified by using arguments similar to those used for the previously seen example. However, we want to show its unsatisfiability by using the precompletion algorithm.
Using the $\forall$-rule with the first concept constraint and the two role constraints $\langle\mathsf{sarah},\mathsf{susan}\rangle{:}\mathsf{FRIEND}$ and $\langle\mathsf{sarah},\mathsf{andrea}\rangle{:}\mathsf{FRIEND}$ we add the two constraints
$$\left\{\begin{array}{l}
\mathsf{susan}{:}(\neg\mathsf{Female} \sqcup \forall\mathsf{LOVES}.\mathsf{Female}),\\
\mathsf{andrea}{:}(\neg\mathsf{Female} \sqcup \forall\mathsf{LOVES}.\mathsf{Female})
\end{array}\right\}$$
Both the newly added constraints are possible branching points where the $\sqcup$-rule is applicable; however if we choose to add the concept constraint $\mathsf{susan}{:}\neg\mathsf{Female}$ we obtain a contradiction straight away. Therefore we ignore that branch and we add the constraint
$$\{\mathsf{susan}{:}\forall\mathsf{LOVES}.\mathsf{Female}\}$$
We can carry on with the deterministic rules and apply the $\forall$-rule with the latest added constraint and the role constraint $\langle\mathsf{susan},\mathsf{andrea}\rangle{:}\mathsf{LOVES}$; this results in the further addition of the constraint
$$\{\mathsf{andrea}{:}\mathsf{Female}\}$$
At this point, the only option is applying the $\sqcup$-rule to the constraint $\mathsf{andrea}{:}(\neg\mathsf{Female} \sqcup \forall\mathsf{LOVES}.\mathsf{Female})$. The choice of adding the constraint $\mathsf{andrea}{:}\neg\mathsf{Female}$ can be immediately ruled out by the fact that it is in contradiction with the previously added $\mathsf{andrea}{:}\mathsf{Female}$, therefore we select the other disjunct and we add the constraint
$$\{\mathsf{andrea}{:}\forall\mathsf{LOVES}.\mathsf{Female}\}$$
It is easy to verify that we have obtained a precompletion, since there are no further applicable rules; in addition, all the search space has been explored. Therefore, the original Abox is satisfiable iff this precompletion is satisfiable. The satisfiability check is performed by verifying the satisfiability of the concept constraints associated to each individual:
Individual   Concept
sarah        $\forall\mathsf{FRIEND}.(\neg\mathsf{Female} \sqcup \forall\mathsf{LOVES}.\mathsf{Female})$
susan        $\mathsf{Female} \sqcap (\neg\mathsf{Female} \sqcup \forall\mathsf{LOVES}.\mathsf{Female}) \sqcap (\forall\mathsf{LOVES}.\mathsf{Female})$
andrea       $(\exists\mathsf{LOVES}.\neg\mathsf{Female}) \sqcap (\neg\mathsf{Female} \sqcup \forall\mathsf{LOVES}.\mathsf{Female}) \sqcap \mathsf{Female} \sqcap (\forall\mathsf{LOVES}.\mathsf{Female})$
The first two concepts are satisfiable, while the third one is not because of the conjunction of $\exists\mathsf{LOVES}.\neg\mathsf{Female}$ and $\forall\mathsf{LOVES}.\mathsf{Female}$. Therefore the initial Abox is unsatisfiable.
4.2.2 Correctness and completeness of the algorithm
This section presents an overview of the proof for correctness and completeness of the
precompletion technique; the full formal proof is provided in the next chapter.
The proof is divided in two parts: in the first we show that the set of precompletions
characterises the satisfiability of the original knowledge base. The second part gives a
method for checking the satisfiability of such a precompletion using a terminological
reasoner.
The definition of a precompletion for a knowledge base $\Sigma = \langle\mathcal{T},\mathcal{A}\rangle$ is given in a procedural way as a new KB $\Sigma' = \langle\mathcal{T},\mathcal{A}'\rangle$ where the ABox $\mathcal{A}'$ is obtained by extending $\mathcal{A}$ using the nondeterministic syntactic rules in Figure 5.1 as long as they are applicable.
The precompletion rules are designed in such a way that, whatever strategy of application is chosen, the process of completing a knowledge base always terminates, with the same set of precompletions. In fact the only rule that does not introduce a smaller assertion in the knowledge base is the $\forall_+$-rule, but its applicability is bounded by the number of role assertions, which is invariant. The number of precompletions of a KB can be exponential because of the presence of a nondeterministic rule; however, the size of each precompletion is polynomial with respect to the size of the original KB.
The set of all precompletions of a knowledge base characterises its satisfiability.
If one of the precompletions is satisfiable then the original KB is trivially satisfiable
because the process never removes constraints, therefore the assertions in the Abox are
still in the precompletion. The other direction is proved by using a technique similar
to that used in Horrocks and Sattler [1998]. A model $\mathcal{I}$ of $\Sigma$ which witnesses its satisfiability is used to deterministically guide the application of precompletion rules to a unique satisfiable precompletion. For this purpose the only nondeterministic rule is transformed into its deterministic counterpart:
$\mathcal{A} \rightarrow_{\sqcup} \{o{:}D\} \cup \mathcal{A}$
    if $o{:}C_1 \sqcup C_2$ is in $\mathcal{A}$, and $D = C_1$ if $o^{\mathcal{I}} \in C_1^{\mathcal{I}}$, $D = C_2$ otherwise, and $o{:}D$ is not in $\mathcal{A}$.
Satisfiability is preserved by rule application since if $\mathcal{I}$ is a model for the ABox before the application of the rule, then it is a model for the extended ABox as well. Therefore, given the fact that the terminology does not change, $\mathcal{I}$ is a model for the generated precompletion.
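The model-guided choice of a disjunct can be illustrated with a small Python sketch. It assumes, purely for illustration, that the witnessing model is finite and given explicitly as dictionaries; the names `holds` and `guided_disjunct` and the tuple encoding of concepts are hypothetical and do not come from the thesis.

```python
# Concepts in negation normal form as nested tuples (an assumption):
#   "A", ("not", "A"), ("and", C, D), ("or", C, D), ("all", R, C), ("some", R, C)
# An interpretation is a pair (concepts, roles):
#   concepts: concept name -> set of elements
#   roles:    role name    -> set of (element, element) pairs

def holds(interp, x, c):
    """Does element x belong to the extension of concept c in interp?"""
    concepts, roles = interp
    if isinstance(c, str):
        return x in concepts.get(c, set())
    op = c[0]
    if op == "not":
        return not holds(interp, x, c[1])
    if op == "and":
        return holds(interp, x, c[1]) and holds(interp, x, c[2])
    if op == "or":
        return holds(interp, x, c[1]) or holds(interp, x, c[2])
    successors = [y for (s, y) in roles.get(c[1], set()) if s == x]
    if op == "all":
        return all(holds(interp, y, c[2]) for y in successors)
    if op == "some":
        return any(holds(interp, y, c[2]) for y in successors)
    raise ValueError(f"unknown constructor: {op!r}")

def guided_disjunct(interp, names, individual, disjunction):
    """Deterministic disjunction rule: take the first disjunct if the model
    puts the individual in it, and the second disjunct otherwise."""
    _, first, second = disjunction
    return first if holds(interp, names[individual], first) else second
```

Because the interpretation satisfies the disjunction, the disjunct returned by `guided_disjunct` is always one that the interpretation satisfies as well, which is why the extended Abox keeps the same model.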
Now we know that we can concentrate on precompleted KBs without loss of generality. In $\mathcal{SH}$ all the contradictions can be detected by the terminological reasoner, but in $\mathcal{SH}f$ there can be clashes involving role assertions (see Definition 5.7). For this reason in Chapter 5 we will assume clash–free precompletions, since the presence of a clash makes a precompletion trivially unsatisfiable.
Given a precompleted knowledge base � �, for each individual ! in the KB we
consider the individual concept���� �� !�, which is defined as the conjunction of
all the concept expressions in the set �� � !�� � ��, or � if there are no assertions
about !. It is clear that a model for the knowledge base is a model for every individual
concept. We now show how, given models for the individual concepts, we can build a
model for the knowledge base.
If each individual concept���� �� !� is separately satisfiable with respect to
the terminology, then for every individual name ! there is an individual model �� �
���� ����, which witnesses the satisfiability. Interpretations can be infinite, but with-
out loss of generality it can be assumed that their interpretation domains are pairwise
disjoint, as well as having tree structures,7 whose root elements (indicated by ���) are
in the extension of the corresponding individual concept���� �� !� (as shown in
Section 3.4).
The idea is to build a union interpretation � � ��� �� � by “gluing” together all the
tree interpretations, and adding new pairs to the interpretation of roles according to the
role assertions.
The fact that the union interpretation is a model for the precompleted KB is in-
tuitive for the propositional part (concept conjunction, disjunction and negation), but
not so for the modal part. In particular, trouble can arise from the interaction between
the universal quantification and the newly added pairs. However, the domain disjoint-
ness and tree structure assumptions guarantee that the properties of role interpretations
overcome this problem.
First, new links are added only to root nodes (those associated to individual
names); in fact for each pair in � � �� � � , � ��� (�!� � implies that � � ��
is in �� . An important consequence of this property is that if a role connects
two individuals of different individual domains, then the first element of the pair
must be a root individual.
Second, there cannot be a connection between elements from different individual
domains without a path passing through a root node. That is, given two distinct
individuals � and �, whenever there is a pair ��� � � � � with � ��, there
should be a transitive role � included in such that both pairs ��� � ���, ���� �
are in �� , or � �� .
This property ensures that elements in different individual domains can interact
only via individual names in the Abox (i.e. the roots). Therefore, restrictions
like ��� holding inside an individual model cannot affect elements belonging
to other individual domains.
It is easy to see that interpretation of roles satisfies the role assertions in the KB,
because of their simple form; however concept assertions are more problematic. In
fact, although the interpretation of a concept name is simply the union of the interpre-
tations of individual models, the extension of some concept expressions can violate the
7 Actually, the structure is more tree–like, as explained in Chapter 3, but here we can consider it simply as a tree.
semantics of concept forming constructors because of the presence of new pairs in the
interpretation of roles.
Example 4.2
For example, given the simple precompleted knowledge base � � ���� �����, we can
build an individual model �� for the label of �. We choose the interpretation domain
�� � ��� with the interpretation function ��� � �, and ��� � �� � . The union
interpretation maps to ���� ��� and � to �.
Let us consider the concept expression ���, by the semantics of the universal
constructor (see Table 2.1) ��� � ������� because ��� is not related to any other
element via . However, in the union interpretation this is no longer true because ���
is related to itself via �� , but ��� cannot be in ��� � .
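The effect described in this example can be reproduced with a small Python sketch. It assumes, for illustration only, a knowledge base whose single assertion relates an individual `a` to itself via a role `R`, and a concept name `A` with empty extension; these concrete names, the element name `ra`, and the functions `glue` and `forall_extension` are hypothetical. The sketch also ignores role hierarchies and transitive roles.

```python
def forall_extension(domain, role_pairs, concept_ext):
    """Extension of (forall R.A): elements all of whose R-successors are in A."""
    return {x for x in domain
            if all(y in concept_ext for (s, y) in role_pairs if s == x)}

def glue(models, roots, role_assertions):
    """Build the union interpretation.

    models: individual name -> (domain, concepts, roles), pairwise disjoint domains
    roots:  individual name -> root element of its individual model
    role_assertions: triples (individual, role, individual)
    """
    domain, concepts, roles = set(), {}, {}
    for dom, cons, rls in models.values():
        if domain & dom:
            raise ValueError("individual domains must be pairwise disjoint")
        domain |= dom
        for name, ext in cons.items():
            concepts.setdefault(name, set()).update(ext)
        for name, ext in rls.items():
            roles.setdefault(name, set()).update(ext)
    for (a, r, b) in role_assertions:          # new pairs link root elements only
        roles.setdefault(r, set()).add((roots[a], roots[b]))
    return domain, concepts, roles

# Individual model for a: one element, A and R both empty.
models = {"a": ({"ra"}, {"A": set()}, {"R": set()})}
domain, concepts, roles = glue(models, {"a": "ra"}, [("a", "R", "a")])

before = forall_extension({"ra"}, set(), set())              # in the individual model
after = forall_extension(domain, roles["R"], concepts["A"])  # in the union
print(before, after)  # prints {'ra'} set()
```

The root element satisfies the universal restriction vacuously in its individual model, and loses it once the role assertion adds a pair starting from the root.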
This problem is localised to root individuals, because the interpretation of roles re-
stricted to non–root individuals does not change. In fact, the interpretation of arbitrary
concept expressions, restricted to non–root individuals is a monotonic extension of the
one in individual models; i.e. for any concept expression �, ��� (�!�
� �� for
each individual name !.
For root individuals this property does not hold for arbitrary concept expressions;
however, the property is still valid for concept expressions which appear as assertions
in the precompleted KB. In fact, for each assertion !�� � � � the individual �� is in the
extension �� . This restricted property (together with the more general applicability to
non–roots) is sufficient to prove that all the axioms and assertions in the precompleted
knowledge base are satisfied.
4.2.3 Pros and cons of precompletion
The initial motivation of our research was exploring the feasibility of providing an
Abox reasoner for the DL of the FaCT system ($\mathcal{SH}f$). In particular we are interested in
the differences in developing DL systems with and without Abox. For this reason we
decided to investigate the precompletion technique, since it enables a clear distinction
between the hybrid reasoning and the terminological component of the algorithm.
The algorithm is completely independent from the terminological reasoner which
is used for the concept satisfiability test. In this way, an Abox can be easily added to
most existing DL systems without re-implementing the whole system. Moreover, op-
timisation strategies implemented at the terminological level do not adversely interact
with the precompletion algorithm.
A further advantage of precompletion is that it confines the exponential blow-up of the complexity to the terminological reasoning. In fact, the size of precompletions is polynomial in the size of the KB. This may indicate that a system based on precompletion can run with a smaller memory footprint compared to a system based on an extension of the Tableaux method.8
On the other hand, precompletion presents two big disadvantages w.r.t. other tech-
niques. The first one is related to the fact that the precompletion process cannot obtain
useful information from the terminological reasoner (just a binary yes/no). This means
that in case of failure of a concept satisfiability test, the precompletion process cannot
use additional information for improving heuristics in the exploration of the search
space.
However, we think that the major drawback lies in the fact that the limits on the ex-
pressiveness of DLs suitable for this approach are not clear. For example, the extension
of precompletion to the inverse role constructor is problematic.
Example 4.3
Consider for example the Abox
������������������� ����� ��� ���
��
This Abox is unsatisfiable, as can be seen by considering the following diagram
[Diagram of the Abox]
The key point is the combination of the existential and universal quantification
���������� � �� which pushes “new” restrictions on the individuals � and � (i.e. the
concept ��� on �, and consequently the concept � on �). Note that the Abox is
already precompleted, and the concept ��������������� is satisfiable; therefore a
naive application of precompletion leads to a wrong result.
Precompletion is failing in this case because it does not take into account the fact
that new concepts may have to be added to the label of an individual because of the
terminological reasoning.
8 This actually depends on the kind of optimisation implemented in the DL system. As we are going to suggest later on, the precompletion technique can be used as a testbed for improving strategies for Tableaux based systems.
In the next chapter we prove that the precompletion technique can be used with
the DL $\mathcal{SH}f$. From the precise description of the technique an actual algorithm can
be easily devised. We have implemented a prototype of an Abox reasoner based on
precompletion, and we performed some evaluation experiments with it. The system
and the results are described in Chapter 8.
Chapter 5
Precompletion algorithm for $\mathcal{SH}f$
The proof for correctness and completeness of the technique is in two parts: in Sec-
tion 5.1 we present a method for deriving a set of simpler knowledge bases called
precompletions, and we show that this set characterises the satisfiability of the original knowledge base. The second part, in Section 5.2, shows that the satisfiability of a precompletion can be verified using a terminological reasoner.
The purpose of the precompletion process is to generate a simpler (not smaller)
knowledge base where the role assertions about individuals can be ignored. Precom-
pletions of knowledge bases are built using a set of nondeterministic syntactic rules
which extend the Abox of the original knowledge base. It will be proved that a knowl-
edge base is satisfiable if and only if a satisfiable precompletion can be derived.
In Section 5.2 we show that for checking the satisfiability of a precompletion the
role assertions can be ignored. In a precompletion the relevant elements are the concept
assertions and the terminology. Each individual is associated to the conjunction of
the concepts appearing in its concept assertions. If all those individual concepts are
satisfiable with respect to the terminology, then we show that their models can be
combined in an interpretation satisfying the precompleted knowledge base.
5.1 Precompletions of knowledge bases
Without loss of generality we assume that all the concept axioms in the Tbox are in the form $\top \sqsubseteq C$, where $C$ is a concept expression. This assumption is not restrictive at all in DLs closed under negation. An arbitrary axiom $C_1 \sqsubseteq C_2$ can be transformed into the equivalent axiom $\top \sqsubseteq (\neg C_1 \sqcup C_2)$, which is in the required form.
A second assumption we adopt is that concept expressions are in negation normal form, where the $\neg$ constructor can appear only in front of concept names. Any concept expression can be transformed into an equivalent expression in normal form using the following rewriting rules.
$$\begin{array}{rcl}
\neg\neg C & \rightsquigarrow & C \\
\neg(C \sqcap D) & \rightsquigarrow & \neg C \sqcup \neg D \\
\neg(C \sqcup D) & \rightsquigarrow & \neg C \sqcap \neg D \\
\neg\forall R.C & \rightsquigarrow & \exists R.\neg C \\
\neg\exists R.C & \rightsquigarrow & \forall R.\neg C
\end{array}$$
This assumption simplifies the description of the algorithm and the proofs detailed in this chapter.
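Both normalisation steps are easy to mechanise. The following Python sketch is an illustration only: the tuple encoding of concepts and the function names `nnf` and `internalise` are assumptions made for the example and are not part of the thesis.

```python
# Concepts as nested tuples (an assumption):
#   "A", ("not", C), ("and", C, D), ("or", C, D), ("all", R, C), ("some", R, C)

DUAL = {"and": "or", "or": "and", "all": "some", "some": "all"}

def nnf(c):
    """Push negation inwards until it occurs only in front of concept names."""
    if isinstance(c, str):
        return c
    op = c[0]
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    if op in ("all", "some"):
        return (op, c[1], nnf(c[2]))
    if op != "not":
        raise ValueError(f"unknown constructor: {op!r}")
    d = c[1]
    if isinstance(d, str):
        return c                                   # negated concept name
    if d[0] == "not":
        return nnf(d[1])                           # double negation
    if d[0] not in DUAL:
        raise ValueError(f"unknown constructor: {d[0]!r}")
    if d[0] in ("and", "or"):                      # De Morgan
        return (DUAL[d[0]], nnf(("not", d[1])), nnf(("not", d[2])))
    return (DUAL[d[0]], d[1], nnf(("not", d[2])))  # quantifier duality

def internalise(c, d):
    """Right-hand side of the axiom (top subsumed by ...) equivalent to c subsumed by d."""
    return nnf(("or", ("not", c), d))
```

For instance, `nnf(("not", ("and", "A", ("all", "R", "B"))))` yields `("or", ("not", "A"), ("some", "R", ("not", "B")))`.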
Intuitively, a knowledge base is precompleted if all the information entailed by
the presence of role assertions is exhibited in the form of concept assertions. The
definition of a precompletion for a knowledge base is given in a procedural way as a
new KB where the Abox is extended using the syntactic rules of Figure 5.1 as long as
they are applicable.
We define a set of new binary operators � )� � (one for each individual name !), to
simplify the formulae in the rules. Intuitively, they take into account the possible in-
teraction between the role hierarchy and the functional restrictions (see Section 3.1.1).
Two role names and � are � )� � related if they are functional, and the Abox asser-
tions force the and � successors of the individual name ! to be the same element.
Strictly speaking, the operator depends on the KB as well, but for simplifying the no-
tation we are omitting it. In places where the relation is used, it will be clear which is
the KB we are referring to.
Definition 5.1. Given two roles and �, an individual !, and a KB � � �� ���, the
relation )� � holds iff:
there is a role � & s.t. either !������ or �!� !���� is in �; and
there is a role �� & � s.t. either !������� or �!� !������ is in �; and
there is a set of roles ��� � � � ����, and a set of functional roles ���� � � � � ���,
s.t.
– either !������ or �!� !���� is in �, for any � � �� � � � � �' � and
– & ��, � & ��, � & ��, � & ��, � & ��, . . . , ��� & ��.
The � )� � relation can be better understood by considering that it is describing a
situation in which part of the role taxonomy looks like
�� �� ���� ��
� � � � ��� �
� ��
and the individual name ! has a successor for each of the role names��� � � � � ���� �.
In this case, the functional restrictions cause all these successors to be interpreted as
the very same element. By the way it is defined, the relation � )� � is symmetric (i.e.
)� � * � )� ).
Proposition 5.2. Let � � �� ��� be a KB, and � � ��� � ��� be an interpretation
satisfying �. For any individual name !, role name , �, and elements � � of �� . If
�!�� � � � , �!�� �� � �� , and )� �, then � �.
Proof. All the roles � ���� � � � � ��� of Definition 5.1 are functional because they
are included in functional roles (��� � � � � ��); therefore !� has at most one successor
for any of these roles. We are going to show that all these successors are equal.
Note that for any �, the constraint !������ (or �!� !����) implies the existence of
an element � in �� s.t. �!�� �� � �� .
Let us consider �!�� �� � �� , and �!�� � � � . Since �
� � ��� and � �
��� , then
��!� � ��� �!
�� ��� ��
� . From the functionality of ��� we can conclude
that � � .
The very same arguments can be applied to all pair of roles �, ���, including the
last ���, �; therefore � � � � � � � � � ��� � �.
Given this introductory definition, the transformation rules of Figure 5.1 should be
clear, and we are going to introduce the formal definition of a precompletion of a KB.
Definition 5.3. A precompletion of a knowledge base $\Sigma = \langle\mathcal{T},\mathcal{A}\rangle$ is defined as a knowledge base $\Sigma' = \langle\mathcal{T},\mathcal{A}'\rangle$, where $\mathcal{A}'$ is an extension of $\mathcal{A}$ built by applying the syntactic rules in Figure 5.1 as long as their pre–conditions are satisfied.
The precompletion rules are designed in such a way that, whatever strategy of
application is chosen, the process of completing a knowledge base always terminates
(Proposition 5.4).
� #� �!��� � �if ! is in , � � is in �and !�� is not in �.
� #� �!���� !���� � �if !��� � �� is in �,and either !��� or !��� if not in �.
� #� �!��� � �if !��� � �� is in �,and � � �� or � � ��
and !�� is not in �.
� #�� �!���� � �if !���� and �!� !���� are in �, )� �, and !��� is not in �.
� #�� �!���� � �if !���� and �!� !���� are in �,there is � & s.t. � )� �and !��� is not in �.
� #� �!���� � �if !���� is in �, and �!� !���� is in �, and � & ,and !��� is not in �.
� #�� �!������ � �if !��%�� in �, �!� !���� is in �,and there is � � �� such that � & & % ,and !����� is not in �.
Figure 5.1: Precompletion rules for ��f
Proposition 5.4. The precompletion process always terminates, and any precomple-
tion has a size which is polynomial w.r.t. the size of the knowledge base.
(Sketched). The number of new assertions introduced by the terminology via the #�
rule is equal to the number of individual names in the KB multiplied by the number
of concept axioms in the Tbox. The rules #�, #�, #�� , #��, and #� always in-
troduce assertions smaller than the original ones. The only rule that introduces non
decreasing assertions is #��; however, its applicability is bounded by the number of
role assertions, which is invariant.
For estimating the size of each precompletion we can use the argument that the
number of different concept expressions that can be generated is polynomial w.r.t. the
size of the KB.1 Therefore the size of a precompletion cannot exceed the number of
individual names multiplied by the number of concept expressions, and this number is
still polynomial w.r.t. the size of the KB.
The precompletion algorithm generates a finite, but possibly exponential, number
of precompletions, each of which can be checked individually for consistency. The
advantage over the original knowledge base is that they are simpler, enabling the use of
techniques based on terminological reasoning.
The satisfiability of a knowledge base and the satisfiability of its precompletions
are strictly related. In fact, the knowledge base is satisfiable if and only if at least one of
its precompletions is satisfiable (Proposition 5.5). Since Proposition 5.4 ensures that
the precompletions of a given knowledge base can be enumerated, the satisfiability
checking can be performed on precompleted knowledge bases.
Proposition 5.5. A knowledge base � � �� ��� is satisfiable if and only if it has a
satisfiable precompletion.
Proof. + Since � is included in all its precompletions, a model for a precompletion
� � is a model for � as well.
* Given a model � � ��� � ��� for �, a satisfiable precompletion of � can be built.
This precompletion � � � �� �� �� is built by extending � using a set of rules
constrained by the model �. The rules are the same as Figure 5.1 apart from
the nondeterministic #� rule, which is transformed into a deterministic one by
using the model �:
1 Again, the only problem may come from the ��� rule, but the number of formulae that it can generate is limited by the number of transitive role names.
� #� �!��� � �
if !��� � �� in �,
and � � �� if !� � ��� and � � �� otherwise
and !�� is not in �.
All the rules preserve satisfiability, in the sense that if � is a model for the Abox
before the application of the rule, then it is a model for the extended Abox as
well:
#� If � is a model for �� ���, then every element of the domain is in ��
(including !� for each ! � ). Therefore the assertion !�� is satisfied.
#� If � is a model for �� ���, then !� � ��� � ���� . Therefore !� � ��
� and
!� � ��� .
#� If � is a model for �� ���, then !� � ��� � ����; therefore either !� � ��
�
or !� � ��� . Suppose that !� � ��
� , then � is extended by adding the
assertion !��� which is satisfied by �, and analogously for the case in which
!� � ��� .
#�� If � is a model for �� ���, then �!�� !��� � �� , and there is an element
� �� s.t. �!�� � � � . By Proposition 5.2 � !�� ; therefore the
interpretation � satisfies the assertion !��� as well.
#�� If � is a model for �� ���, then �!�� !��� � �� , and every element s.t.
�!� � � � � must be in �� .
Since � )� �, there is a role� & � and an element such that �!�� � �
�� . In addition, � & therefore �!�� � � � , which means that �
�� .
Finally, we can use Proposition 5.2 with � and � for concluding that �
!�� , so � satisfies the assertion !���.
#� If � is a model for �� ���, then !� � ������ , �!� � !��� � �� and �� �
� . So �!�� !��� � � as well, therefore !�� � �� .
#�� If � is a model for �� ���, then !� � ��%���� , �!� � !��� � �� � � �
% � and � � ����. If there is in �� such that �!�� � � � � , then
�!� � � � � as well because � is transitive. The element should be in
�� because !� � ��%���� and � � % � ; therefore !�� � ������ .
Because of the preserved satisfiability, � must be a model for the precompleted
knowledge base � � as well.
The following Proposition 5.6 makes explicit the logical properties of precomple-
tions implicit in the rules of Figure 5.1.
Proposition 5.6. A precompleted ��f knowledge base � � � �� ��� ��, containing
assertions about the set of individual names , satisfies all the following conditions:
#� if ! � and � � � � � then !�� � � �;
#� if ���� � �� � � � then ���� � � � and ���� � � �;
#� if ���� � �� � � � then ���� � � � or ���� � � �;
#�� if �!����� ��� ����� � � �, and )� �, then ��� � � �;
#�� if �!����� ��� ����� � � �, and there is � & s.t. � )� �, then ��� � � �;
#� if ������� ��� ����� � � � and � & then ��� � � �;
#�� if ����%��� ��� ����� � � � and there is a transitive role � � �� such that
� & & % , then ����� � � �.
Proof. Each property corresponds to the pre–conditions of one of the rules of Fig-
ure 5.1. So if the property is not satisfied, the corresponding rule is still applicable and
� � is not yet a precompletion.
At this point we have reduced the problem of checking the satisfiability of a ��f
knowledge base to the problem of verifying whether one of its precompletions is satisfi-
able. Now we turn our attention to precompleted KBs. The advantage in concentrating
on such KBs is that the consistency can be verified using a terminological reasoner
(Theorem 5.20).
5.2 Satisfiability of precompletions
In this section we show how the satisfiability of a precompletion can be reduced to the
concept satisfiability problem. This reduction enables us to close the circle and leads
to the main result for a general KB in Section 5.3.
Since we are going to ignore the role assertions of a precompleted KB, first we
must make sure that a precompleted KB does not contain any contradiction caused by
role assertions. In ��f this case is restricted to assertions involving functional roles.
Definition 5.7. A precompletion � � � �� �� �� contains a clash iff there are two
roles �� � �� , and individual names �� �� � with � !� �, such that )� �, and
���� ���� ��� ����� � � ��
A precompletion containing a clash is not satisfiable because it violates some of the
functional restrictions; in fact, by Proposition 5.2 the interpretations of � and � must
coincide, which is in contradiction with the unique name assumption. Precompletions
not containing clashes are said to be clash-free.
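Definition 5.7 can be read as a check over pairs of role assertions that start from the same individual. The Python sketch below assumes role assertions are encoded as triples `(a, b, role)` and that the relation between roles used in the definition (Definition 5.1) is supplied as a predicate `related`. The toy predicate shown here, which relates two roles when they share a functional super-role, is only a simplifying assumption and not the thesis definition.

```python
def has_clash(role_assertions, related):
    """True if one individual has two different fillers through related roles."""
    assertions = list(role_assertions)
    for (a, b, r) in assertions:
        for (a2, c, s) in assertions:
            # Same source, different targets (unique name assumption), related roles.
            if a == a2 and b != c and related(r, s):
                return True
    return False


# Toy role hierarchy (hypothetical names): both roles are below a functional role.
supers = {"r": {"r", "f"}, "s": {"s", "f"}, "t": {"t"}}
functional = {"f"}

def related(r, s):
    return bool(supers[r] & supers[s] & functional)

clashing = [("a", "b", "r"), ("a", "c", "s")]
harmless = [("a", "b", "r"), ("a", "c", "t")]
```

Here `has_clash(clashing, related)` holds because `a` would need two distinct fillers for the functional role `f`, while `harmless` contains no clash.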
Given a knowledge base, the label of an individual is the set of concept expressions
in the assertions on the individual itself. This is formally defined in the next definition.
Definition 5.8. Given a knowledge base � � �� ���, the label of an individual ! �
with respect to � (written as ���� !�) is defined as the set of all concept expressions in
the assertions about the individual name !:
���� !� �
��� � !�� � �� if this set is not empty, or
��� otherwise
By using the label of an individual, the concept expression����� !� is defined as the
conjunction of all the concept expressions in ���� !�:
����� !� � �� � � � � � ��
where ��� � � � �� � � � � �� � ���� !�.
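As a small illustration of Definition 5.8, the label and the associated conjunction can be computed directly from the concept assertions. The sketch assumes concept assertions are pairs `(individual, concept)`, that the fallback for an individual without assertions is the top concept (written `"TOP"` here), and that conjunctions are nested `("and", C1, C2)` tuples. All of these encodings are illustrative choices.

```python
TOP = "TOP"

def label(concept_assertions, a):
    """Set of concepts asserted for individual `a`, or {TOP} if there are none."""
    concepts = {c for (ind, c) in concept_assertions if ind == a}
    return concepts if concepts else {TOP}

def label_concept(concept_assertions, a):
    """Conjunction of all the concepts in the label of `a`."""
    concepts = sorted(label(concept_assertions, a), key=repr)
    result = concepts[0]
    for c in concepts[1:]:
        result = ("and", result, c)
    return result


assertions = {("a", "A"), ("a", "B"), ("b", "C")}
```

With these assertions the label of `a` is `{"A", "B"}`, while an individual with no concept assertion, such as a hypothetical `c`, receives the label `{TOP}`.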
Since in precompleted knowledge bases the information carried by role assertions
is made explicit, all the relevant properties of an individual are in the form of concept
assertions. The label of an individual completely characterises the properties of the
individual, and it can be used to verify that these properties are not contradictory.
If each individual concept���� �� !� is separately satisfiable with respect to
the terminology, then for every individual name ! there is an individual model �� �
���� ����, which witnesses the satisfiability. As seen in Section 3.4, it can be assumed
that each individual model has a quasi transitive tree structure.
The domains of the models can be considered pairwise disjoint without loss of
generality. Let �� and �� be two models of����� �� and
����� �� respectively,
whose domains �� and �� overlap. A new domain ��� can be chosen having the
same cardinality as ��, and being disjoint from ��, and a bijection can be defined
between the sets �� and ���. Moreover the defined bijection can be used to build a
new interpretation in which every element of �� is substituted by the mapped element
in ���. It is easy to show that that new interpretation � �� � ���
�� ����� is a model for the
corresponding concept����� �� with respect to the given terminology.
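One simple way to obtain such a bijection in practice is to tag every domain element with the name of the individual whose model it belongs to. The following sketch assumes finite models stored as triples `(domain, concept_extensions, role_extensions)`; the function name and the encoding are hypothetical.

```python
def make_disjoint(models):
    """Rename each model's elements to (individual_name, element) pairs.

    models: dict mapping an individual name to (domain, concept_ext, role_ext),
            where concept_ext maps concept names to sets of elements and
            role_ext maps role names to sets of pairs.
    """
    renamed = {}
    for name, (domain, concept_ext, role_ext) in models.items():
        def tag(x, name=name):
            # The map x -> (name, x) is a bijection onto a fresh domain.
            return (name, x)
        renamed[name] = (
            {tag(x) for x in domain},
            {c: {tag(x) for x in ext} for c, ext in concept_ext.items()},
            {r: {(tag(x), tag(y)) for (x, y) in ext} for r, ext in role_ext.items()},
        )
    return renamed


models = {
    "a": ({0, 1}, {"C": {1}}, {"R": {(0, 1)}}),
    "b": ({0}, {"C": {0}}, {"R": set()}),
}
disjoint = make_disjoint(models)
```

Since the renaming only changes the identity of the elements, every concept and role extension keeps its shape, which is why satisfaction of the label concept is preserved.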
Starting from the individual models ��, an interpretation for the overall knowledge
base can be built by combining them. The domain of the union interpretation is the
union of all the domains from each individual model. The interpretation function is
defined in terms of the interpretation functions of each individual model:
• Each individual name is interpreted as the root of its corresponding model (i.e.
!��), which is always defined because every individual name has a label (see
Definition 5.8).
• Concept names are interpreted as the union of their interpretations in the different
models.
• Interpretation of role names is slightly more involved because the union of the
single interpretations is not enough. In fact, the pairs corresponding to the role
assertions in the Abox must be added as well (including those induced by both
the role hierarchy and the transitive restrictions on roles).
As anticipated, the interpretation of roles involves something more than the union
of individual interpretations. First, the role assertions should be added; then, we must
ensure that functional roles are still functional. The latter point can be understood
by considering the precompletion of the KB � � � � ���� ����� ��������, where �
is a functional role. The precompletion of � is � � ���� ����� ������� �����, which
is satisfiable. From the precompletion, two individual models can be derived: �� for
the concept ���� and �� for �. If we try to merge these two models, together with
the pair generated by the role assertion ��� ���� , the resulting interpretation would not
satisfy the functional restriction on � . The solution to this problem relies on a more
careful definition of the union interpretation, which takes into account the functional
restrictions on role extensions. For this purpose we introduce the notion of restricted
interpretation of roles (Definition 5.9).
The idea is to remove the links that will be added later on by means of role asser-
tions in the Abox. We perform this operation only on functional roles, because they
are the problematic ones.
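The problem and the proposed remedy can be replayed on a tiny finite instance. The sketch below is one reading of the example above, with hypothetical element names: the individual model for a has root `a0` with a link through the functional role to a witness `x`, the model for b consists of its root `b0` only, and the role assertion contributes the pair `(a0, b0)`. Dropping the links that leave the root is a simplified stand-in for Definition 5.9.

```python
def is_functional(pairs):
    """True if no element has two distinct successors."""
    successor = {}
    for x, y in pairs:
        if successor.setdefault(x, y) != y:
            return False
    return True


# Functional role extension in the individual model for a.
f_in_model_a = {("a0", "x")}
# The individual model for b has no links for the functional role.
f_in_model_b = set()
# Pair forced by the role assertion between a and b.
abox_pairs = {("a0", "b0")}

# Naive union: a0 ends up with two successors.
naive_union = f_in_model_a | f_in_model_b | abox_pairs

# Restricted interpretation: remove the links leaving the root a0, because
# the Abox already supplies a successor for a through the functional role.
restricted_a = {(x, y) for (x, y) in f_in_model_a if x != "a0"}
careful_union = restricted_a | f_in_model_b | abox_pairs
```

The witness `x` can be dropped safely because the precompletion already asserts the required concept for b, so `b0` takes over its role.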
Definition 5.9. Let � � � �� �� �� be a clash–free precompleted KB, and �� �
���� ���� be the individual model for the individual ! in � �. The restricted individ-
ual interpretation function ��� for ! is defined as:
!�� � !��
��� � ���
�� �
����������� (
��!��� � � �!��� � � ��
�if there are �� �� s.t. & ��
�!� !����� � � �� and � )� ���
�� otherwise.
Since the restricted interpretation is different from the original one only for the role
names, in the following we are going to use it only with role names.
Note that if is transitive, then�� � �� because does not have any functional
super–role (see Definition 5.1).
Note that the restricted role interpretation depends on the knowledge base as well
as on each individual model. It is used instead of the original interpretation function
in Definition 5.10, where the extension of role names is defined in such a way that
pairs removed by means of Definition 5.9 are substituted by pairs induced by Abox
assertions. This is necessary only for non–transitive roles, because transitive roles
cannot have functional super–roles by assumption, so the � )� � relations do not hold
among transitive roles.
Then, the following Sections provide the formal proofs showing that this union
interpretation is indeed a model for the precompleted KB.
Definition 5.10. Given a precompleted knowledge base � � � �� �� �� and the indi-
vidual models �� � ���� ���� for each individual ! � , then the union interpretation
� � ��� ��� is defined as:
� �����
��
!� � !��
�� �����
���
�
���������������
��
��� �� �
����� ��� � ��� ���� � � �� )�
�
������ ��� � ��� ���� � � ��
� &
��
������� � ��
if is not transitive,
� ��
��� �� �
����� ��� � ��� ���� � � ��
� & �� otherwise.
For each ! � , � � �� , � �� . The operator $\cdot^{+}$ builds the transitive closure
of a relation. For an arbitrary binary relation $R$, it is defined as $R^{+} = \bigcup_{i \geq 1} R^{i}$, where
$R^{1} = R$, and $R^{i+1}$ is the composition of $R$ and $R^{i}$, i.e. $R^{i+1} = R \circ R^{i}$.
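For finite relations the transitive closure operator described above can be computed by accumulating successive powers of the relation until no new pair appears. The following is a minimal Python sketch over sets of pairs; the function names are illustrative.

```python
def compose(r, s):
    """Pairs (x, z) such that (x, y) is in r and (y, z) is in s for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def transitive_closure(r):
    """Union of all the powers of the finite relation r."""
    closure = set(r)
    power = set(r)
    while True:
        power = compose(r, power)   # next power of r
        if power <= closure:        # nothing new: all later powers are covered too
            return closure
        closure |= power


chain = {(1, 2), (2, 3), (3, 4)}
closed = transitive_closure(chain)
```

The loop stops as soon as one power adds nothing new, since every later power is then contained in the accumulated union as well.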
Note that the definition is recursive because the union interpretation is used to
build the interpretation for roles. Given the fact that we assume that the role hierarchy
is acyclic (see Section 2.4), the interpretation of the roles is well defined.
The fact that the union interpretation is a model for the precompleted KB is intuitive
for the propositional part (concept conjunction, disjunction and negation), but not so
for the modal part. In particular, troubles can arise from the interaction between the
universal quantification and the newly added pairs. However, we are going to prove
that the domain disjointness and tree–like structure of the individual interpretations
guarantee that the union interpretation is indeed a model for the precompleted KB.
5.2.1 Interpretation of roles
The disjointness of the domains guarantees that the individual model trees do not over-
lap, so the role interpretations �� are pairwise disjoint. This property is crucial for
the structure of the role extension in the union interpretation, since it ensures that new
links are added only to root nodes (i.e. to those associated to individual names).
First we are going to show that restricted interpretations still satisfy the role hier-
archy.
Proposition 5.11. Let � � � �� �� �� be a clash–free precompleted KB, and �� �
���� ���� be the individual model for the individual ! in � �. For any pair of role
names � and , individual name !, if � & then ��� � �� .
Proof. Since �� satisfies the role hierarchy by hypothesis, ��� � ��; therefore the
only case in which ��� !� �� is the one in which a pair is removed from �� but
not from ��� . We are going to reason by contradiction, assuming that there is a pair
�!��� � � �� � ��� s.t. there are �� ��, & �, �!� !����� � � �, and � )� ��
(see Definition 5.9). This generates a contradiction because � & & �, therefore
the pair should have been eliminated from ��� as well.
Now we consider the relation between the union interpretation and the individual
interpretations. We start by verifying that no pair of elements, coming from the same
individual domain, is in the union interpretation of a role unless it is in the individual
interpretation of the role itself. Naturally, we must exclude the case of a pair induced
by an Abox assertion like ��� ���, which forces the pair ����� ���� in � .
Proposition 5.12. Given a precompleted knowledge base � � � �� �� ��, and the
union interpretation � � ��� �� � from the individual models �� � ���� ���� with ! � .
For each role � �� , individual name ! � , and elements � � �� � ��:
� � �� � � and � !� !� implies � � �� � ��
Proof. We prove this proposition by induction using the partial order & over the set
of roles �� . By the structure of Definition 5.10 we can start from transitive roles
(no induction is necessary in this case), and then prove the property for non–transitive
roles.
If is transitive, we consider the set
�����
�� ������ ��� � ��� ���� � � ��
� & �
and we show that for any integer �, � � �� � ��, � � �� � �, and � !� !� implies
� � �� � �� . We show this by induction on �.
If � � � then either
� � �� �����
��� or
� � �� ����� � ��� � ��� ���� � � ��
� & �
The second case can be ruled out because � !� � . In addition, the sets ��� and
�� are disjoint for any !� � different from !, because the individual domains
are disjoint by assumption; therefore � � �� � ��.
As an inductive hypothesis we assume that the property holds for any number
less than or equal to �, and we show that it holds for � � �.
If � � �� � ���
there is an element � such that � � �� � and ��� �� � �. Note
that � must be different from !� , because we assumed that �� is a quasi transitive
shrub (see Section 3.1). We can use the inductive hypothesis to conclude that
both � � �� and ��� �� are in ��; therefore � � �� � ��, because �� � ��
and �� is transitive by assumption. In addition, �� � �� because cannot
have any functional super–role (see Definition 5.9 and Definition 5.1). Therefore
� � �� � ��.
If is not transitive then we have four cases:
� � �� �����
���
� � �� ����� � ��� � ��� ���� � � �� )�
� �
� � �� ����� � ��� � ��� ���� � � ��
� & �
� � �� ��
������� �
���
The three initial cases are analogous to the one for transitive roles and � � �: the
second and third cases can be ruled out because � !� � , while the first case satisfies
the proposition because of the disjointness of the individual domains.
In the fourth case there is a transitive role � included in (and different, because
is not transitive) s.t. � � �� � �� . Since we already proved the property for transitive
roles, then � � �� � ���; therefore � � �� � �� by Proposition 5.11.
The following proposition shows that pairs of root elements come from Abox as-
sertions only.
Proposition 5.13. Given a precompleted knowledge base � � � �� �� ��, and the
union interpretation � � ��� �� � from the individual models �� � ���� ���� with ! � .
For any role � �� and elements � � of �, if � � �� � � and � � � then:
� � �� ����� � ��� � ��� ���� � � ��
� & � or
� � �� ����� � ��� � ��� ���� � � �� )�
� � or
� � �� � ����� � ��� � ��� ���� � � ��
� & � �� for some transitive role � & .
Proof. If is transitive, let us consider the set
�����
�� ������ ��� � ��� ���� � � ��
� & �
We show by induction on � that if � � �� � �
and � � � then
� � �� � ������ ��� � ��� ���� � � ��
� & ���
For � � � it is true because � � �� !��
��� �� since �� is a quasi transitive tree
interpretation for any ! (see Section 3.3). Let us assume the property for any integer
less than or equal to �, and we consider ���
. If � � �� � ���
, then there is an element
� s.t. � � �� � and ��� �� � �. By the inductive hypothesis applied to ��� �� �
�
we conclude that
��� �� � ����� � ��� � ��� ���� � � ��
� & ���
therefore � � � . We can then use the inductive hypothesis with � � �� � conclud-
ing that
� � �� � ������ ��� � ��� ���� � � ��
� & ��
as well; therefore
� � �� � ������ ��� � ��� ���� � � ��
� & ���
If is not transitive and � � �� � � , then one of the following cases must be
satisfied:
� � �� �����
���
� � �� ����� � ��� � ��� ���� � � �� )�
� �
� � �� ����� � ��� � ��� ���� � � ��
� & �
� � �� ��
������� �
���
We can rule out the first case because of the structure of the individual interpretations.
The second and third cases trivially satisfy the proposition, while the fourth case falls
into the previously proved property for transitive roles.
The second important property of role extensions in union interpretation is that
there cannot be a connection between elements from different individual domains with-
out a path passing through a root node (Proposition 5.14). This property ensures that
all the restrictions applying to a non–root element can only be induced through the root
of its own individual domain, instead of by being directly propagated from different
individual domains.
Proposition 5.14. Given a precompleted knowledge base � � � �� �� ��, and the
union interpretation � � ��� �� � from the individual models �� � ���� ���� with ! � .
For any role � �� , different individual names �� �, and elements � ��, � � ��,
if � � �� � � then � �� , and � � �� or there is a role � � � �� such that � &
and�� � ���� ���� ��
� �� .
Proof. The proof follows the same structure as that of Proposition 5.12.
If is transitive let us consider the set
�����
�� ������ ��� � ��� ���� � � ��
� & �
We show by induction on � that, for any integer, if � ��, � � �� with � !� � and
� � �� is in �, then � �� , and � � �� or � � ���, ��� � �� are in � (this satisfies the
condition, because is transitive and & ).
If � � � then either
� � �� �����
��� or
� � �� ����� � ��� � ��� ���� � � ��
� & �
The first case can be ruled out because of the disjointness of the individual do-
mains, so � � . In addition, �� is the only element in both �� and �;
therefore � �� (and analogously � � �� ).
As an inductive hypothesis we assume that the property holds for any number
less than or equal to �, and we show that it holds for � � �.
If � � �� � ���
there is an element � such that � � �� � and ��� �� � �.
If � � ��, we can use the inductive hypothesis with � � � for concluding that
� �� . We then distinguish the two cases � � �� and � !� �� . In the first
case the condition is trivially satisfied (note that �� � for any �). In the
second case,�� � ���� ���� ��
� � by the inductive hypothesis; in addition
the transitivity of � and the fact that���� � ��� ��� ��
� � , guarantees the
satisfiability of the condition.
If � !� ��, then there is an individual ! different from � s.t. � � ��. We can use
the induction hypothesis with ! and � for ��� �� � �, concluding that � � !�
and either � � �� or���� ���� ��� � ��
� � . Since � � !�� � � , � �
by Proposition 5.13; therefore � �� . If � � �� the condition is trivially
satisfied, while if���� ���� ���� ��
� � ,
�� � ���� ��� � ��
� � because�
� � ��� ��� ��� � � and � is transitive.
If is not transitive then we have four cases:
� � �� �����
���
� � �� ����� � ��� � ��� ���� � � �� )�
� �
� � �� ����� � ��� � ��� ���� � � ��
� & �
� � �� ��
������� �
���
The first case can be ruled out because of the disjointness of the individual domains.
In the second and third case � �� because �� is the only element in both �� and
� (and analogously � � �� ). Finally, for the fourth case there is a transitive role �
included in s.t. � � �� � �� , and we can conclude that the proposition is true because
we have already shown that it holds for transitive roles.
Before considering the interpretation of concepts, we must verify that the func-
tional restrictions, and the role hierarchy are still satisfied in the union interpretation.
Proposition 5.15. Given a precompleted knowledge base � � � �� �� ��, and the
union interpretation � � ��� �� � from the individual models �� � ���� ���� with ! � .
If a role is included in a functional role � (i.e. & � ), then "���� � �� � �
$ �
for any element of �.
Proof. Note that there are no transitive roles included in (and !� � �� ), since
this is forbidden by the language definition. First of all we are going to show that if
is included in a functional role � then the definition of � (see Definition 5.10) can be
simplified by the formula:
� �����
�� ������ ��� � ��� ���� � � �� )�
� � (,)
Since is included in a functional role, then it is not transitive and it does not include
any transitive role; therefore the component�
������� � �� is empty. We just need
to show that����� ��� � ��� ���� � � ��
� & ������ ��� � ��� ���� � � �� )�
� �
by proving that if � & , then � )� � implies )� � for any role � (see
Definition 5.1). But this is actually the case because � & , and by assumption there
is a functional role (� ) including both � and. The sequence of roles in Definition 5.1
can be extended for by using � and � , and if there is a constraint ������� (or
��� !�����) with �� & �, this is good for as well because �� & � & .
Now that we proved Equation (,) we are going to show that the functional restric-
tion on is satisfied. We proceed by contradiction, assuming that there are � � ��,
� � �� s.t. � !� � and �� � ��� � � ��� � � . For doing this we use the alternative
definition of � shown in Equation (,).
Note that it cannot be the case that �� � ��� � � ��� ��
��� �� because �� is
functional for any !, and the individual domains are disjoint. If
�� � ��� � � ��� ������ ��� � ��� ���� � � �� )�
�
then there is ! � s.t. � !� , and )� because is included in a functional
role (see Definition 5.1); therefore, there is a contradiction between the assumption
that � !� � and Proposition 5.2.
Therefore
� � �� �����
��� and
� � �� ����� � ��� � ��� ���� � � �� )�
�
(or the converse, since using � or � is symmetric). Note that there must be two individ-
ual names ! and !�, s.t. � !� , � � !�� , �!� !���� is in � �, and )� �.
Since �!�� �� ��
��� �� , then �!� � �� � �� because of the disjointness of the
individual domains. This is in contradiction with the fact that )� � (see Defini-
tion 5.9).
Now we show that the inclusion restrictions among role names are satisfied as well.
Proposition 5.16. Given a precompleted knowledge base � � � �� �� ��, and the
union interpretation � � ��� �� � from the individual models �� � ���� ���� with ! � .
For two arbitrary roles and �, if � & then �� � � .
Proof. Let � � �� � �� ; we need to show that � � �� � � as well.
If � � � , then � � �� � � because of Proposition 5.13 and the way the union
interpretation is constructed (see Definition 5.10).
We assume that � !� � , and we distinguish the two cases in which � � �� � ��
for some individual !, and in which � ��, � � ��� with ! !� !�. In the first case
� � �� � ��� by Proposition 5.12, therefore � � �� � �� by Proposition 5.11. If
� �� and � � ��� with ! !� !�, then by Proposition 5.14 either � � !�� (which is
in contradiction with the assumption that � !� �), or there is a transitive role � � & �
s.t.�� � !���� �!�� � ��
� � �� . If is not transitive, then � �� & � by construction
(see Definition 5.10); therefore we assume that is transitive. In addition, by using
Proposition 5.12 we can conclude that �!�� � �� � � ���� , therefore �!�� � �� � ��� by
Proposition 5.11. By Proposition 5.13
� � !��� � �
���� � ��� � ��� ���� � � ��
� & � � ���
because � � does not have any functional super–role; therefore � � !��� � � as well.
Finally, we can conclude that � � �� � � because � is transitive by construction (see
Definition 5.10).
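On a finite interpretation, the property established by Proposition 5.16 amounts to a containment test between role extensions. A minimal sketch, assuming role extensions are stored as sets of pairs and the role hierarchy is given as a list of `(sub_role, super_role)` inclusions (both encodings, and the role names, are hypothetical):

```python
def satisfies_hierarchy(role_ext, inclusions):
    """True if the extension of every sub-role is contained in its super-role's."""
    return all(role_ext[sub] <= role_ext[sup] for (sub, sup) in inclusions)


role_ext = {
    "s": {("x", "y")},
    "r": {("x", "y"), ("y", "z")},
}
inclusions = [("s", "r")]
```

Adding a pair to the extension of `s` without also adding it to `r` would make the check fail, which is the situation the construction of the union interpretation has to avoid.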
Now we are going to look at the concepts, by considering the properties of the
union interpretation.
5.2.2 Interpretation of concepts
It is easy to prove that the interpretation of roles satisfies the role assertions in the
KB (because of their relatively low expressivity); however concept assertions are more
problematic. In fact, the presence of new pairs in the interpretation of roles can modify
the extension of concept forming constructors (see Example 4.2).
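The effect of a new pair on the two quantifiers can be observed on a very small finite interpretation. The sketch below is a generic illustration and does not reproduce Example 4.2; the element names and helper functions are hypothetical.

```python
def exists_ext(domain, role, concept_ext):
    """Elements with at least one role successor in concept_ext."""
    return {x for x in domain if any(y in concept_ext for (x2, y) in role if x2 == x)}

def forall_ext(domain, role, concept_ext):
    """Elements whose role successors all belong to concept_ext."""
    return {x for x in domain if all(y in concept_ext for (x2, y) in role if x2 == x)}


domain = {"a", "b"}
c_ext = set()                      # no element is in C
role_before = set()                # no links at all
role_after = {("a", "b")}          # one pair added, e.g. by a role assertion

forall_before = forall_ext(domain, role_before, c_ext)
forall_after = forall_ext(domain, role_after, c_ext)
exists_before = exists_ext(domain, role_before, c_ext)
exists_after = exists_ext(domain, role_after, c_ext)
```

Before the pair is added, `a` satisfies the universal restriction vacuously; afterwards it does not, because its new successor `b` is not in C. Existential restrictions, in contrast, can only gain elements when pairs are added.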
This “non-monotonicity” problem does not affect arbitrary elements of the domain:
it is localised to root elements. Roughly speaking, the reason lies in the fact
that the interpretation of roles, restricted to non–root elements, does not change (see
Proposition 5.12). Proposition 5.17 shows that, if we restrict attention to non–root elements, the
extension of arbitrary concept expressions w.r.t. the union interpretation is monotonic
w.r.t. the individual models.
Proposition 5.17. For any individual name � � and concept expression �, ��� (��� � �� .
Proof. The proposition is proved by induction on the structure of the concept expres-
sion �. The basic case of a concept name is true by construction of � (see Defini-
tion 5.10). If it is true for the concept expressions �� and ��, then for each more
complex expression:
�� Let � �� (���
be such that !� ��� , then !� �� by construction (the
individual domains are disjoint), so � ����� .
��� � ��� By definition ��� � ���� � ��
� � ��� and ���
� � ��� , ���
� � ��� for the
induction hypothesis; therefore ��� � ����� � ��� � ���
� .
��� � ��� It is analogous to the � case.
������ If � �������� (
���
, then there exists � � �� such that � � �� � ��
and � � ���� . Note that � � �� � �� , because only pairs whose first element is
the root element are removed in the restricted individual interpretation function
(see Definition 5.9). Therefore � � �� � � (see Definition 5.10). In addition,
� !� �� because �� is a q.t. tree, so � � ��� by the induction hypothesis; therefore
� ������� .
������ Let be such that � �������� (
���
, and � � �� � � . By using
Proposition 5.13 and Proposition 5.12 we can conclude that � � �� is in �� so
� � ���� , and � !� �� . Therefore � � ��
� by the induction hypothesis.
As shown in Example 4.2, this monotonicity property does not hold for arbitrary
concept expressions applied to root elements. However, it is still valid for concept
expressions which appear explicitly as assertions in the precompleted KB (Proposi-
tion 5.18). In fact, the concept in Example 4.2 does not appear in the KB as an assertion
involving the individual �. Moreover, this restricted property of the union interpreta-
tion (together with the more general one applying to non–roots) is sufficient to prove
that all the axioms and assertions in the precompleted knowledge base are satisfied by
the union interpretation (Proposition 5.19).
Proposition 5.18. For any individual name � � , ��� � � � implies �� � �� .
Proof. The Proposition is proved by induction on the structure of the concept expres-
sion �.
The basic case is a concept name: If ��� is in � � then � � ��� �� �� so � �
���; therefore �� � �� because ��� � �� .
If it is true for the concept expressions �� and ��, then for each more complex
expression:
�� If ���� � � �, then �� � ��� �� �� so �� !� ��� . From the disjointness
of the individual domains �� !� �� (see Definition 5.10), therefore �� �
�����
��� � ��� If ����� � ��� � � �, then both ���� and ���� are in � � (Proposition 5.6).
By the induction hypothesis �� is in both ��� and ��
� , so it is in
��� � ���� as well.
��� � ��� If ����� � ��� � � �, then either ���� or ���� is in � � (Proposition 5.6).
If ���� is in � �, then by the induction hypothesis �� � ��� ;
therefore �� � ��� � ���� as well. The same holds if ���� is in � �.
������ If �������� is in � � then ������ � ��� �� ��; therefore there is
an element � in �� s.t. ��� � �� � �� and � � ���� . The problem is that
�� � �� , so the pair ��� � �� can be removed in the restricted individual
interpretation function (see Definition 5.9).
If ��� � �� � �� , then ��� � �� � � as well, and since � � �� (���
because of the tree–like structure of ��, then � � ��� by Proposition 5.17.
This concludes that �� � ������� .
If ���� �� !� �� , then there are �� �� s.t. & �, ��� !����� � � � and
� )� ��. Since � )�
�� and & �, there is a functional role � such
that & � and � & � ; in addition, the constraint �������� is in � �, so
)� �� (see Definition 5.1). This means that
��� � �� ������ ��� � ��� ����� � � �� )�
��
therefore there is an individual � s.t. � � �� and ��� � ��� � � (see Defini-
tion 5.10).
The only remaining thing is to show that �� � ��� . However, the constraint
��� must be in � � because of the #�� rule (by Proposition 5.6); therefore
we can use the inductive hypothesis to conclude that �� � ��� .
������ Suppose that ��� � � � � , then we distinguish three different cases
according to the element : is one of the root elements (i.e. � �),
� �� (���
, or there is � !� � s.t. � �� (���
.
– If � !� then the pair ��� � � must be introduced by means of Abox
assertions (see Proposition 5.13). Let us consider the three different
cases.
, If
��� � � ����� � ��� � ��� ���� � � ��
� &
there is a constraint ��� ���� with � & s.t. � �� . The con-
straint ���� must be in � � because of the #� rule (by Proposi-
tion 5.6). Therefore we can use the inductive hypothesis for con-
cluding that � ��� .
, If
��� � � ����� � ��� � ��� ���� � � �� )�
�
there is a constraint ��� ���� with )� � s.t. � �� . The
constraint ���� must be in � � because of the #�� rule (by Propo-
sition 5.6). Therefore we can use the inductive hypothesis for con-
cluding that � ��� .
, If
��� � � � ����� � ��� � ��� ���� � � ��
� & � ��
there is a set of constraints
���� !������ �!�� !������ � � � � �!���� !������ � � �
and a transitive role � s.t. �� & � & for all � � �� � � � � �, and
� !�� . By Proposition 5.6, for each � � �� � � � � � the constraints
!��������� and !���� are in � � (because of the #�� rule and
the #� rule). Therefore !���� is in � � and � ��� by the inductive
hypothesis.
– If � �� (���
then ��� � � � �� , because both �� and are in the
same individual domain (see Definition 5.10). In addition, ������ �
��� �� �� so � ���� because �� is a model for the label ��� �� ��.
Using Proposition 5.17 we can conclude that � ��� .
– If there is � !� � s.t. � �� (���
, then by Proposition 5.14 there is
a transitive role � s.t. � & and both ��� � ���, ��� � � are in �� . By
using the very same arguments as the first case we can show that the
constraint ��������� is in � �; therefore we can conclude that � ���
by using the same arguments as in the previous case.
5.2.3 Satisfiability of a precompleted KB
It is clear that a model for the knowledge base is a model for every individual concept. Given the results of the previous sections, Proposition 5.19 shows that the union
interpretation is a model for the precompleted knowledge base.
Proposition 5.19. A clash-free ��f precompleted knowledge base � � � �� ��� ��
is satisfiable if and only if for each individual name ! � the concept expression ���� �� !� (see Definition 5.8) is satisfiable with respect to the terminology � �.
Proof. * Let � be a model for � �, then for any individual name ! and for any concept
� in ��� �� !� the individual !� is in the extension �� (see Definition 5.8).
Therefore !� � ����� �� !��
� so its interpretation is non–empty with respect
to the terminology � �.
+ If for every named individual ! the concept expression���� �� !� is satisfiable
with respect to the terminology � �, there is a quasi transitive tree interpreta-
tion �� which satisfies � � and having the root element in ����� �� !��
�� (see
Section 3.4). Therefore the union interpretation � � ��� ��� can be defined as in
Definition 5.10. By verifying that the interpretation � satisfies every assertion in � �, it
can be proved that it is a model for � �.
Let ��� be a concept assertion in � �, then by Proposition 5.18 �� � �� .
Let ��� ��� be a role assertion in � �, then ��� � ��� � � by construction
of � in Definition 5.10.
Let � � be an axiom in � �, then � satisfies the axiom if and only if
every element in the domain � is in the extension of � with respect to �.
Let be an element of �, then there exists � � such that � ��, and
� ��� because the model �� satisfies � �. There are two cases: � ��
or � �� (���
– If � �� then � �� because of Proposition 5.18 (��� is in the
precompleted ABox � � by Proposition 5.6).
– If !� �� , then � �� by Proposition 5.17.
If � � �� then � is transitive by construction (see Definition 5.10).
If the role is functional (i.e. � %�� , or there is a role � � %��
such that � � ), then its functional restriction is satisfied by Proposi-
tion 5.15.
If � is an axiom in � �, then � & and �� � � by Proposition 5.16.
5.3 Knowledge base satisfiability
As summarised in Theorem 5.20, the satisfiability of a general knowledge base can
be verified by checking the satisfiability of its precompletions via a terminological
reasoner.
Theorem 5.20. The satisfiability checking problem for an $\mathcal{SH}f$ knowledge base can be
reduced to concept satisfiability with respect to a terminology.
Proof. A knowledge base � is satisfiable if and only if one of the precompletions
��� � � � ��� generated by the algorithm in Definition 5.3 is satisfiable (Proposition 5.5).
Moreover, the problem of checking the satisfiability of each one of the precompletions
can be solved by a concept satisfiability checking for each named individual in the
knowledge base (Proposition 5.19).
This means that we now have a KB satisfiability algorithm: the precompletion
rules of Figure 5.1 provide an algorithmic way to enumerate the precompletions of
a knowledge base, while the satisfiability of each precompletion is verified using an
external terminological system such as FaCT (see Horrocks [1998]).
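The reduction stated in Theorem 5.20 can be pictured as two nested loops. The following is only a minimal sketch: `precompletions` and `concept_satisfiable` are hypothetical stand-ins for the precompletion algorithm of Definition 5.3 and for an external terminological reasoner, and they do not correspond to any actual FaCT interface. The sketch also assumes that only clash-free precompletions are enumerated, as required by Proposition 5.19.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical representation: a precompletion is a pair (terminology, labels),
# where `labels` maps every individual name to the concept expression that
# collects its concept assertions (the concept of Definition 5.8).
Precompletion = Tuple[str, Dict[str, str]]


def kb_satisfiable(
    precompletions: Iterable[Precompletion],
    concept_satisfiable: Callable[[str, str], bool],
) -> bool:
    """Return True iff some precompletion passes every per-individual test."""
    for terminology, labels in precompletions:
        # One concept satisfiability test per named individual.
        if all(concept_satisfiable(c, terminology) for c in labels.values()):
            return True
    return False
```

Because `precompletions` is consumed lazily, the enumeration stops as soon as one satisfiable precompletion is found.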
Chapter 6
Querying Aboxes
In earlier chapters we have shown how to verify the satisfiability of an $\mathcal{SH}f$ KB (Chapter 5) and how to answer simple DL queries (Section 2.2). This chapter introduces
the reader to a novel query answering technique for DL KBs. This technique provides
correct and complete answers for disjunctions of conjunctive queries. The aim of this
chapter is to introduce the ideas underlying this technique for the simpler DL $\mathcal{ALC}$.
The full formal discussion of the extension to the language $\mathcal{SH}f$ is in the next chapter.
6.1 Introduction
Typically, the only kinds of query that DL systems support are instantiation (is an individual
$a$ an instance of a concept $C$?), realisation (what are the most specific concepts $a$ is an
instance of?) and retrieval (which individuals are instances of $C$?). The reason for this
weakness is that, in these expressive logics, all reasoning tasks are reduced to that of
determining KB satisfiability (consistency). In particular, instantiation is reduced to
KB (un)satisfiability by transforming the query into a negated assertion; however, this
technique cannot be used (directly) for queries involving roles because these logics do
not support role negation.
For example, it can be inferred that John is an instance of Animal if and only if
the KB is not satisfiable when an axiom is added to the Abox asserting that John is not
an instance of Animal (i.e., that John is an instance of the negation of Animal). Realisation and retrieval can, in turn, be achieved through repeated application of instantiation tests. However, this technique cannot be used (directly) to infer from the above
axioms that the pair $\langle \text{John}, \text{Bill} \rangle$ is an instance of the transitive role Brother, because these logics do not support role negation, i.e., it is not possible to assert that
$\langle \text{John}, \text{Bill} \rangle$ is an instance of the negation of Brother.
In this chapter we introduce a technique for answering such queries using a more
sophisticated reduction to KB satisfiability. We then show how this technique can be
extended to determine if an arbitrary tuple of individuals (i.e., not just a singleton or
pair) satisfies a disjunction of conjunctions of concept and role membership assertions
that can contain both constants (i.e., individual names) and variables. This provides
a powerful query language, similar to the conjunctive queries typically supported by
relational databases,1 that allows complex Abox structures (e.g., cyclical structures)
to be retrieved by using variables to enforce co-reference (see Chandra and Merlin
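[1977]). For example, the query
$$\langle x, y \rangle \leftarrow \exists z.(\langle z, \text{Bill} \rangle : \text{Parent} \wedge \langle z, x \rangle : \text{Parent} \wedge \langle z, y \rangle : \text{Parent} \wedge \langle x, y \rangle : \text{Hates}) \tag{6.1}$$
would retrieve all the pairs of hostile siblings in Bill’s family.2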
In our work we focus on answering boolean queries, i.e., determining if a query is
true with respect to a KB. Retrieval can be easily (although inefficiently) turned into
a set of boolean queries for all candidate tuples. Note that in available systems, the
retrieval problem is similarly reduced to instantiation.3 It is important to stress the
fact that, given the expressivity of DLs, query answering cannot simply be reduced to
model checking as in the database framework (we illustrate this point in Example 6.1
below). This is because KBs may contain nondeterminism and/or incompleteness,
making it infeasible to use an approach based on minimal models. In fact, query an-
swering in the DL setting requires the same reasoning machinery as logical derivation.
Example 6.1
To use a model checking technique for query answering we must be able to associate a “preferred” model with a given KB, and this is quite difficult for arbitrary
DL KBs. Let us consider for example a simple KB containing the single axiom
$\text{Elephant} \sqsubseteq \neg\text{Mouse}$, stating that elephants and mice are disjoint, and the Abox
assertion $\text{hathi} : (\text{Elephant} \sqcup \text{Mouse})$.
We can identify two minimal (w.r.t. inclusion) interpretations satisfying the KB:
1 The work is inspired by the use of Abox reasoning to decide conjunctive query containment (see Horrocks et al. [1999a], Calvanese et al. [1998a]).
2 Note that a sound and complete KB satisfiability algorithm will guarantee sound and complete query answers.
3Apart from not very expressive DL languages (see for example Rousset [1999]).
in the first the element mapped from the individual hathi is in the extension of the
concept Elephant, while in the second it is in the extension of the concept Mouse.
Which of the two interpretations can be considered the “preferred” one? The point is
that there is not any general mechanism for choosing one, even with this trivial KB.
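The point of Example 6.1 can be checked mechanically on a propositional simplification. The sketch below is an illustration only: it considers just the two memberships of the individual hathi and enumerates all truth assignments, which is not how a DL reasoner works. It shows that both models are minimal and that neither membership holds in every model, so neither is a logical consequence of the KB.

```python
from itertools import product

ATOMS = ("Elephant", "Mouse")  # concept memberships of the individual hathi


def is_model(world):
    """Check the two statements of the example KB in one truth assignment."""
    disjoint = not (world["Elephant"] and world["Mouse"])  # Elephant is disjoint from Mouse
    asserted = world["Elephant"] or world["Mouse"]         # hathi is an Elephant or a Mouse
    return disjoint and asserted


models = [
    world
    for world in (dict(zip(ATOMS, bits))
                  for bits in product((False, True), repeat=len(ATOMS)))
    if is_model(world)
]


def entailed(atom):
    """A membership is a logical consequence only if it holds in every model."""
    return all(world[atom] for world in models)
```

Evaluating a query against only one of the two models would answer "yes" to a membership that the KB does not entail, which is why a single preferred model cannot be used here.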
An important advantage of the technique presented here is that it is quite generic,
and can be used with any DL providing general inclusion axioms where instantiation
can be reduced to KB satisfiability. It could therefore be used to significantly increase
the utility of Abox reasoning in a wide range of existing (and future) DL implementa-
tions.
6.2 Conjunctive Queries in DL
In this chapter we will focus on conjunctive queries only (i.e. queries containing only
conjunctions of terms), and the DL $\mathcal{ALC}$. This choice has been made to simplify
the description of the underlying ideas. In the next chapter we are going to formally
present the technique for a wider range of queries and the more expressive DL $\mathcal{SH}f$.
A key feature of conjunctive queries is that they may contain variables, and we
will assume the existence of a set of variables that is disjoint from the set of individual
names. A conjunctive query is a formula � �� � � ��'� - � � � - '��, where '�� � � � � '�
are concept or role query terms. Concept and role terms are syntactically equal to
Abox assertions of the form $x : C$ or $\langle x, y \rangle : R$, where $C$ is a concept, $R$ is a role and
$x, y$ are either individual names or variables.4 A query formula generally contains free
variables which correspond to the tuple the query is meant to retrieve; these variables
are also called distinguished. Queries without free variables are said to be boolean,
and the answer to these queries is not a set of tuples but a true or false value.
In this chapter we are going to introduce the semantics of query answering in an
intuitive way; a formal definition will be presented in the next chapter. The semantics
relies on the definition of interpretations for a DL: an interpretation $\mathcal{I}$ satisfies a
query $q$ iff the interpretation function can be extended to the variables in $q$ in such a
way that $\mathcal{I}$ satisfies every term in $q$. In addition, we require that free variables in $q$ are
interpreted as elements corresponding to individual names; this restriction ensures that
the set corresponding to a query answer contains only individual names. Using this
definition we can introduce the knowledge base into the framework: a query $q$ is true
4 We decided to avoid the more common notation $C(x)$ and $R(x, y)$ because it is confusing when $C$ and $R$ are complex concept and role expressions.
w.r.t. $\Sigma$ (written $\Sigma \models q$) iff every interpretation that satisfies $\Sigma$ also satisfies $q$. The
answer to a non-boolean query will be the set of tuples of individual names that, once
substituted for the distinguished variables, make the query true w.r.t. the knowledge
base. For example, the query
$$\langle \text{Bill}, y \rangle : \text{Parent} \wedge \langle y, z \rangle : \text{Parent} \wedge z : \text{Male} \tag{6.2}$$
is true w.r.t. a KB $\Sigma$ iff it can be inferred from $\Sigma$ that Bill has a grandson. Note that
query truth value and the idea of logical consequence are strictly related. In fact, a
boolean query is true w.r.t. a KB iff it is a logical consequence of the KB.
In the following, we will only consider how to answer boolean queries. Retrieving
sets of tuples can be achieved by repeated application of boolean queries with different tuples of individual names substituted for variables. For example, the answer to
the retrieval query $\langle x, y \rangle \leftarrow q$ w.r.t. a KB $\Sigma$ is the set of tuples $\langle a, b \rangle$, where $a, b$ are
individual names occurring in $\Sigma$, such that $\Sigma \models q'$ for the boolean query $q'$ obtained
by substituting $a, b$ for $x, y$ in $q$. Of course the naive evaluation of such a retrieval
could be prohibitively expensive, but would clearly be amenable to optimisation. For
example, let us consider the query (6.1) of the previous section. In many DLs the expressivity of the Abox for roles is very limited: in $\mathcal{ALC}$, for example, a KB implies a
query term like $\langle z, \text{Bill} \rangle : \text{Parent}$, where the second argument is an individual name,
only if there is an explicit assertion of this form in the Abox. Therefore we may use
role assertions in the Abox to reduce the number of candidates among the individual
names.
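The naive reduction of retrieval to boolean queries can be sketched as follows. This is an unoptimised illustration: `boolean_query_holds` is a hypothetical callback standing for the test that the query obtained by applying a substitution is true w.r.t. the KB, and no pruning of candidates through Abox role assertions is attempted.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


def retrieve(
    distinguished: Sequence[str],
    individuals: Iterable[str],
    boolean_query_holds: Callable[[Dict[str, str]], bool],
) -> List[Tuple[str, ...]]:
    """Try every tuple of individual names for the distinguished variables."""
    names = sorted(individuals)
    answers = []
    for candidate in product(names, repeat=len(distinguished)):
        substitution = dict(zip(distinguished, candidate))
        if boolean_query_holds(substitution):
            answers.append(candidate)
    return answers
```

With n individual names and k distinguished variables this issues n^k boolean queries, which is the cost that the candidate reduction mentioned above is meant to cut down.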
We will show how to answer boolean queries in two steps. Firstly, we will consider
conjunctions of terms containing only individual names appearing in the KB; secondly,
we will show how this basic technique can be extended to deal with variables. To
simplify the notation, we represent a knowledge base as a single set containing
both the Tbox and Abox assertions (instead of a pair as in Chapter 2).
6.2.1 Queries with multiple terms
In this section we will consider queries expressed as a conjunction of concept and
role terms built using only names appearing in the KB (i.e. without variables); e.g.,
$\text{Tom} : \text{Student}$ or $\langle \text{Tom}, \text{CS710} \rangle : \text{Enrolled}$.
As we have already seen, logical consequence can easily be reduced to a KB satisfiability problem if the query contains only a single concept term (this is the standard
instantiation problem). For example, $\text{Tom} : \text{Person}$ is a logical consequence of the KB
$$\{\text{Student} \sqsubseteq \text{Person},\ \text{Tom} : \text{Student}\}$$
iff the KB
$$\{\text{Student} \sqsubseteq \text{Person},\ \text{Tom} : \text{Student},\ \text{Tom} : \neg\text{Person}\}$$
is not satisfiable. This can be generalised to queries containing conjunctions of concept
terms simply by transforming the query test into a set of (un)satisfiability problems: a
conjunction $a_1 : C_1 \wedge \dots \wedge a_n : C_n$ is a logical consequence of a KB iff each $a_i : C_i$ is a
logical consequence of the KB.
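The reduction of instantiation to unsatisfiability can be illustrated on a deliberately tiny fragment. The sketch below assumes a toy KB language with only atomic inclusions and assertions of atomic or negated atomic concepts; `satisfiable` is a hypothetical stand-in for a real DL reasoner and is correct only for that fragment.

```python
def satisfiable(inclusions, assertions):
    """Toy satisfiability test.

    inclusions: set of pairs (A, B) meaning that A is included in B.
    assertions: set of triples (individual, concept, positive), where
    positive=False asserts the negation of the atomic concept.
    """
    def superconcepts(concept):
        seen, todo = {concept}, [concept]
        while todo:
            current = todo.pop()
            for sub, sup in inclusions:
                if sub == current and sup not in seen:
                    seen.add(sup)
                    todo.append(sup)
        return seen

    for individual, concept, positive in assertions:
        if positive and any((individual, derived, False) in assertions
                            for derived in superconcepts(concept)):
            return False  # a derived membership contradicts a negated assertion
    return True


def is_instance(inclusions, assertions, individual, concept):
    """a:C follows from the KB iff the KB plus the negated assertion is unsatisfiable."""
    return not satisfiable(inclusions, assertions | {(individual, concept, False)})


def conjunction_entailed(inclusions, assertions, terms):
    """A conjunction of concept terms is entailed iff every conjunct is."""
    return all(is_instance(inclusions, assertions, a, c) for a, c in terms)
```

For the KB of the example, `is_instance({("Student", "Person")}, {("Tom", "Student", True)}, "Tom", "Person")` adds the negated assertion and detects the contradiction.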
However, this simple approach cannot be used in our case since a query may also
contain role terms. Instead, we will show how simple transformations can be used to
convert every role term into a concept term. We call this procedure rolling up a query.
The rationale behind rolling up can easily be understood by imagining the availability of the DL one-of operator, which allows the construction of a concept containing
only a single named individual (see Schaerf [1994]). The standard notation for such
a concept is $\{a\}$, where $a$ is an individual name, and the semantics is given by the
equation $\{a\}^{\mathcal{I}} = \{a^{\mathcal{I}}\}$. For example, the expression $\{\text{Bill}\}$ represents a concept
containing only the individual Bill (i.e., $\{\text{Bill}\}^{\mathcal{I}} = \{\text{Bill}^{\mathcal{I}}\}$).
Using the one-of constructor, the role term $\langle \text{John}, \text{Bill} \rangle : \text{Brother}$ can be
transformed into the equivalent concept term $\text{John} : \exists \text{Brother}.\{\text{Bill}\}$. Furthermore, other concept terms asserting additional facts about the individual being rolled
up (Bill in this case) can be absorbed into the rolled up concept term. For example,
the conjunction
$$\langle \text{John}, \text{Sally} \rangle : \text{Parent} \wedge \text{Sally} : \text{Female} \wedge \text{Sally} : \text{PhD}$$
can be transformed into $\text{John} : \exists \text{Parent}.(\{\text{Sally}\} \sqcap \text{Female} \sqcap \text{PhD})$. The absorption transformation is not strictly necessary for queries without variables, but it
serves to reduce the number of satisfiability tests needed to answer the query (by reducing the number of conjuncts), and it will be required with queries containing variables. By applying rolling up to each role term, an arbitrary query can be reduced to an
equivalent one which contains only concept terms, and which can be answered using a
set of satisfiability tests as described above.
However, the logic we are using does not include the one-of constructor, and
most of the state of the art DL systems do not provide the constructor (with the exception of the latest version of the DLP reasoner, see Patel-Schneider [1998]). Fortunately,
we do not need the full expressivity of one-of, and in our case it can be “simulated”
without the need to add new axioms (see Section 4.1.2). The technique used is to
substitute each occurrence of one-of with a new concept name not appearing in the
knowledge base. These new concept names must be different for each individual in
the query, and are called the representative concepts of the individuals (written $P_a$,
where $a$ is the individual name). In addition, assertions which ensure that each individual is an instance of its representative concept must be added to the knowledge
base (e.g., $\text{Bill} : P_{\text{Bill}}$). In general, a representative concept cannot be used in place
of one-of because it can have instances other than the individual which it represents
(i.e., $P_a^{\mathcal{I}} \supseteq \{a^{\mathcal{I}}\}$). However, representative concepts can be used instead of one-of in our reduced setting. In particular, the conjunction $\langle a, b \rangle : R \wedge b : C$ is a logical
consequence of a given knowledge base if and only if $a : \exists R.(P_b \sqcap C)$ is a logical consequence of the very same knowledge base augmented by the assertion $b : P_b$. Note
that the trick of using a concept as representative for an individual has been used in a
few other cases in the literature (see De Giacomo and Lenzerini [1996], Borgida and
Patel-Schneider [1994], Era and Donini [1992]).
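The rolling up of variable-free role terms with representative concepts can be sketched as a purely syntactic transformation. The encoding below is an assumption made for illustration: terms are tuples, concepts are plain strings, concept terms are assumed to mention atomic concept names only, and the representative concept of an individual `b` is written `P_b`.

```python
def roll_up_ground(concept_terms, role_terms):
    """Roll up role terms that mention only individual names.

    concept_terms: list of (individual, concept) pairs.
    role_terms:    list of (subject, role, object) triples.
    Returns the rolled up concept terms and the assertions b:P_b that
    must be added to the knowledge base.
    """
    targets = {obj for _, _, obj in role_terms}
    # Concept terms about rolled up individuals are absorbed below.
    rolled = [(a, c) for a, c in concept_terms if a not in targets]
    extra_assertions = []
    for subject, role, obj in role_terms:
        absorbed = [c for individual, c in concept_terms if individual == obj]
        filler = " ⊓ ".join([f"P_{obj}"] + absorbed)
        if absorbed:
            filler = f"({filler})"
        rolled.append((subject, f"∃{role}.{filler}"))
        if (obj, f"P_{obj}") not in extra_assertions:
            extra_assertions.append((obj, f"P_{obj}"))
    return rolled, extra_assertions
```

Applied to the conjunction about John and Sally above, the sketch returns the single concept term `John : ∃Parent.(P_Sally ⊓ Female ⊓ PhD)` together with the assertion `Sally : P_Sally`.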
6.2.2 Queries with variables
In this section we show how variables can be introduced in this framework by using
a more complex rolling up procedure in order to obtain a similar reduction to the KB
(un)satisfiability problem. Variables can be used exactly as individual names, but their
meaning is as “place-holders” for unknown elements of the domain. Because variables
may be interpreted as any element of the domain, they cannot simply be considered as
individual names to which the unique name assumption does not apply; nor can they
be treated as referring only to named individuals, giving the possibility of nondeterministically substituting them with names in the KB. In fact the query
$$\langle \text{Bill}, y \rangle : \text{Parent} \wedge \langle y, z \rangle : \text{Parent} \wedge z : \text{Male} \tag{6.3}$$
is true w.r.t. both the KBs
$$\{\langle \text{Bill}, \text{Mary} \rangle : \text{Parent},\ \langle \text{Mary}, \text{Tom} \rangle : \text{Parent},\ \text{Tom} : \text{Male}\} \quad \text{and} \quad \{\text{Bill} : \exists \text{Parent}.(\exists \text{Parent}.\text{Male})\},$$
but for the first KB the variables can be substituted by the individual names Mary and
Tom, while in the second case the variables may need to be interpreted as elements of
the domain that are not the interpretations of any named individuals.
Answering queries containing variables involves a more sophisticated rolling up
technique. For example, let us consider the terms $\langle y, z \rangle : \text{Parent}$ and $z : \text{Male}$ of
query (6.3). If $z$ were an individual name, then the terms could be rolled up as
$y : \exists \text{Parent}.(P_z \sqcap \text{Male})$, but this is not an equivalent query when $z$ is a variable
name because $z$ can be interpreted as any element of the domain, not just an element
of $P_z^{\mathcal{I}}$. However, since in this case $z$ is no longer referred to in any other place in
the query, there is no other constraint on how an interpretation can be extended w.r.t.
$z$, so the concept $\top$ (whose interpretation is always the whole domain) can be used
instead of $P_z$. The resulting concept term is $y : \exists \text{Parent}.(\top \sqcap \text{Male})$, which can be
simplified to $y : \exists \text{Parent}.\text{Male}$. The same procedure can now be applied to $y$, thereby
reducing query (6.3) to the single concept term $\text{Bill} : \exists \text{Parent}.(\exists \text{Parent}.\text{Male})$.
In order to show how this procedure can be more generally applied, it will be useful
to consider the directed graph induced by the query, i.e., a graph in which there is a
node for each individual or variable in the query, and an edge from node $x$ to
node $y$ for each role term $\langle x, y \rangle : R$ in the query. For instance, Query (6.3) corresponds
to the graph5
[Figure: query graph Bill → y → z, with both edges labelled Parent and node z labelled Male]
It is easy to see that the rolling up procedure can be used to eliminate variables from
any tree-shaped part of a query by starting at the leaves and working back towards the
root (this is similar to the notion of descriptive support described in Rousset [1999]).
The fact that rolling up should start from the leaves is essential for correctness: for example, rolling up query (6.3) in the reverse order would lead to the non-equivalent
$\text{Bill} : \exists \text{Parent}.\top \wedge y : \exists \text{Parent}.\top \wedge z : \text{Male}$.
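The leaf-first rolling up of a tree-shaped query can be sketched as a recursive traversal of the query graph. The representation is assumed for illustration only: `edges` maps a node to its outgoing (role, child) pairs, `concepts` maps a node to its atomic concept terms, and the query is assumed to be a tree below the chosen root (no cycles, no node with more than one incoming edge). Variables contribute no representative concept, which corresponds to using ⊤ in their place.

```python
def describe(node, concepts, edges, variables, is_root=True):
    """Return the concept obtained by rolling up the subtree below `node`."""
    parts = []
    if not is_root and node not in variables:
        parts.append(f"P_{node}")          # representative concept of an individual
    parts.extend(concepts.get(node, []))   # absorbed concept terms
    for role, child in edges.get(node, []):
        filler = describe(child, concepts, edges, variables, is_root=False)
        if " " in filler or filler.startswith("∃"):
            filler = f"({filler})"
        parts.append(f"∃{role}.{filler}")
    return " ⊓ ".join(parts) if parts else "⊤"


# Query (6.3): <Bill,y>:Parent, <y,z>:Parent, z:Male
edges_63 = {"Bill": [("Parent", "y")], "y": [("Parent", "z")]}
concepts_63 = {"z": ["Male"]}
rolled_63 = describe("Bill", concepts_63, edges_63, variables={"y", "z"})
```

Because the recursion reaches the leaves before building the concept of their parent, the order required for correctness is obtained automatically.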
However, this simple procedure cannot be applied to parts of the query that contain
cycles, or where more than one edge enters a node corresponding to a variable (i.e.,
with terms like $\langle x, z \rangle : R \wedge \langle y, z \rangle : S$).
Let us consider the case where a variable is involved in a cycle, e.g., the simple
5More details are provided in the next chapter.
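query
$$\langle x, y \rangle : \text{Path} \wedge \langle y, z \rangle : \text{Path} \wedge \langle z, x \rangle : \text{Path} \tag{6.4}$$
[Figure: query graph forming the cycle x → y → z → x, with every edge labelled Path]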
which tests the KB for the presence of a particular type of loop involving the role
Path. Rolling up one of the terms does not help, because the resulting query
$$\langle x, y \rangle : \text{Path} \wedge \langle y, z \rangle : \text{Path} \wedge z : \exists \text{Path}.P_x$$
[Figure: query graph with edges x → y and y → z labelled Path, and node z labelled $\exists \text{Path}.P_x$]
still contains another reference to the variable $x$, and replacing $P_x$ with $\top$ would result
in a non-equivalent query that no longer contained a cycle. Moreover, it is obvious that
there is no way to roll up the query in order to obtain a single occurrence of any of the
three variables.
This problem can be solved by exploiting the tree model property of the logic.
Given this property, we know that Tbox assertions alone cannot constrain all mod-
els to be cyclical (if there is a model, then there is a tree model), so any cycle that
might satisfy a cyclical query must be explicitly asserted in the Abox. Moreover, given
the restricted expressivity of role assertions (i.e., that they apply only to atomic role
names), cycles enforced in every interpretation must be composed only of elements
interpreting individual names occurring in the knowledge base. Therefore, before ap-
plying the rolling up procedure, a variable occurring in a cycle should be substituted
by the individual names in the KB. This procedure generates a number of alternative
queries corresponding to each individual in the KB. In the next chapter we will see
how they are handled. The intuition behind this property can be understood by considering that, given an arbitrary interpretation satisfying the cycle only with elements not
corresponding to individual names, a new interpretation can be built where the cycle
is split by duplicating one or more of the involved elements. This new interpretation
can be defined in such a way that it still satisfies the KB, but no longer contains the
required cycle. This duplication can be performed only if the elements are not “fixed”
by individual names and by assertions in the Abox. For example, if in the query (6.4) the
variable $x$ is substituted by the individual name $a$, then it can be transformed into the
query
$$\langle a, y \rangle : \text{Path} \wedge \langle y, z \rangle : \text{Path} \wedge z : \exists \text{Path}.P_a$$
which no longer contains a cycle composed only of variables. Consequently, it can be
rolled up into the single concept term
$$a : \exists \text{Path}.(\exists \text{Path}.(\exists \text{Path}.P_a))$$
where the concept $P_a$ is used to close the cycle.
For DLs which can enforce more restrictions on role structure (for example the
transitivity or hierarchy in $\mathcal{SH}f$) the simple formulation of the tree model property is
no longer valid and a more involved definition must be adopted (see Chapter 3). This
fact complicates the basic algorithm we are sketching in this chapter, as we show in
Chapter 7.
A similar argument can be used when variables appear as the second argument of
more than one role term, e.g., the variable $z$ in the query $\langle x, z \rangle : R \wedge \langle y, z \rangle : S$. Such variables can also be dealt with by nondeterministically substituting them with individual
names occurring in the Abox.
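The nondeterministic substitution can be made explicit by generating one alternative query per individual name. The following sketch assumes role terms encoded as (subject, role, object) triples; the function names are illustrative and are not taken from the formal treatment in the next chapter.

```python
def substitute(role_terms, variable, individual):
    """Replace every occurrence of `variable` with `individual`."""
    return [
        (individual if subject == variable else subject,
         role,
         individual if obj == variable else obj)
        for subject, role, obj in role_terms
    ]


def alternatives(role_terms, variable, individuals):
    """One alternative query for each individual name occurring in the KB."""
    return {name: substitute(role_terms, variable, name)
            for name in sorted(individuals)}


# Query (6.4), with the cycle variable x to be replaced.
cycle_query = [("x", "Path", "y"), ("y", "Path", "z"), ("z", "Path", "x")]
```

The original query is true w.r.t. the KB if at least one of the alternatives is; each alternative no longer has a cycle made of variables only and can be rolled up from the leaves.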
We have seen how role terms containing variables can be rolled up into concept
terms, but these may still be of the form $x : C$, where $x$ is a variable. For example, the
query $\langle x, y \rangle : \text{Parent}$, where $x$ and $y$ are variables, can only be reduced to the single
term $x : \exists \text{Parent}.\top$. In this case we need to verify that the interpretation of the concept
$\exists \text{Parent}.\top$ is nonempty in every interpretation that satisfies the KB. In general, the
interpretation of a concept $C$ is nonempty in every interpretation that satisfies the KB
$\Sigma$ iff adding the assertion $C \sqsubseteq \bot$ to $\Sigma$ makes it unsatisfiable.
Summarising, the procedure for answering an arbitrary boolean conjunctive query
is divided into two phases. Firstly, the role terms are eliminated by repeatedly applying
the following rules: (i) if the graph induced by the query contains a leaf node $y$, then
the role term $\langle x, y \rangle : R$ is rolled up, and the edge $\langle x, y \rangle$ is removed from the graph; (ii)
otherwise, if the graph contains a node $y$ with multiple incoming edges, then all role
terms $\langle x, y \rangle : R$ are rolled up,6 and the corresponding edges are removed from the graph;
6 If $y$ is a variable, then it is first replaced with an individual name chosen nondeterministically from those occurring in the KB.
(iii) if the graph still contains edges but no leaf nodes and no confluent nodes, then it
must contain a cycle. In this case a node $y$ in a cycle is chosen (preferably an individual
as this reduces nondeterminism) and rolled up as in case (ii) above. Secondly, the
query (which now contains only concept terms) evaluates to true iff there is at least
one nondeterministic replacement of variables with individual names such that every
term is a logical consequence of the KB.
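The first phase can be sketched as a loop over the query graph. The code below is an assumption-laden illustration rather than the formal algorithm of the next chapter: it only plans the elimination, returning the order in which role terms are rolled up and the set of variables that have to be replaced by individual names, and it takes a leaf to be a node with exactly one incoming edge and no outgoing edges.

```python
def plan_rolling_up(role_terms, variables):
    """Apply rules (i)-(iii) to triples (subject, role, object)."""
    edges = list(role_terms)
    order, grounded = [], set()

    def incoming(node):
        return [e for e in edges if e[2] == node]

    def outgoing(node):
        return [e for e in edges if e[0] == node]

    def roll_up_all_incoming(node):
        if node in variables:
            grounded.add(node)  # must be replaced by an individual name first
        for edge in incoming(node):
            order.append(edge)
            edges.remove(edge)

    while edges:
        targets = sorted({e[2] for e in edges})
        leaf = next((n for n in targets
                     if not outgoing(n) and len(incoming(n)) == 1), None)
        if leaf is not None:                                   # rule (i)
            edge = incoming(leaf)[0]
            order.append(edge)
            edges.remove(edge)
            continue
        confluent = next((n for n in targets if len(incoming(n)) > 1), None)
        if confluent is not None:                              # rule (ii)
            roll_up_all_incoming(confluent)
            continue
        # Rule (iii): every remaining target has an outgoing edge, so
        # following outgoing edges must eventually revisit a node.
        walk, node = [], targets[0]
        while node not in walk:
            walk.append(node)
            node = outgoing(node)[0][2]
        cycle = walk[walk.index(node):]
        chosen = min(cycle, key=lambda n: (n in variables, n))  # prefer individuals
        roll_up_all_incoming(chosen)
    return order, grounded
```

For query (6.3) the plan grounds no variable, while for the cyclic query (6.4) exactly one variable is marked for substitution before the remaining terms are rolled up from the leaves.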
6.2.3 Extending the framework
In the previous sections we sketched the very basic idea of the query answering algorithm. This intuitive procedure cannot be applied straight away to the $\mathcal{SH}f$ logic
we are considering. In fact, role hierarchy and transitivity complicate the structure of
interpretations (see Chapter 3). In addition, the presence of potential cycles introduces
implicit disjunctions which we have not yet shown how to handle.
We need to add a further note about the one-of constructor (or the so called
nominals in the Modal Logic community). From Section 6.2.1 the reader may have
drawn the erroneous conclusion that this constructor can be used to represent variables
as well as individual names. This is false, and providing a logic with one-of would
not eliminate the creation of alternative queries in presence of cycles as described in
Section 6.2.2. The reason for this is that, even though both variables and one-of
represent a single element in the interpretation, the name of the individual in the one-
of is more restrictive than the name of a variable.
Let us consider for example the query � � ��� � ��� - � � �����. We may be
tempted to use the one-of constructor for representing the joining variable � by in-
troducing a new individual name !� and “rolling up” as in Section 6.2.1, giving the
new query � � ����� !� �� - ������ !� ���. The semantics of the new query is
completely different from the original one, because it is satisfied by an interpretation
only if the interpretation function maps !� to the appropriate element.7 On the other
hand, if the actual name !� does not appear in the knowledge base, the element mapped
from !� by an interpretation function is arbitrary. Therefore, there is always at least
one interpretation where !� is mapped to a “wrong” element. This means that the new
query would never be satisfied, even if the original one would be.8
7 I.e. if the interpretation is , satisfying the condition that �� ���� �� and �� �� �� �� for some �� .
8 The query must be satisfied in every interpretation satisfying the KB.
6.3 Relation with other works
In the DL community soundness and completeness of reasoning services are consid-
ered essential; therefore we consider only approaches which satisfy this requirement.
This assumption excludes for example the work done with LOOM which provides a
first order query language with an incomplete algorithm (see MacGregor and Brill
[1992]).
As we mentioned earlier, our work on conjunctive query answering has been inspired by the application of Abox reasoning for solving the problem of query contain-
ment under constraints (see Horrocks et al. [2000a] and Calvanese et al. [1998a]). The
similarity between the two problems comes from the fact that the query containment
problem can be reformulated into a query answering problem. The trick, well known
in the database community (see Chandra and Merlin [1977]), consists in transforming
the less general query into a database by considering the variables as constants. If
querying this simple database with the more general query provides an answer, then
the inclusion is verified. This simple technique can be generalised to the case where
the valid databases are restricted by a DL based constraint language (see Calvanese
et al. [1998a]). The query containment problem is reduced to query answering in the
DL setting where the constraints are transformed into the terminology.
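The database trick mentioned above can be sketched for plain conjunctive queries, without any DL constraints. The encoding is assumed for illustration: a query is a pair (head, body), atoms are (predicate, arguments) pairs, and variables are the strings that start with "?". The contained query is frozen into a database by treating its variables as constants, and the containing query is then evaluated over it by brute-force search.

```python
from itertools import product


def is_var(term):
    return term.startswith("?")


def contained_in(q1, q2):
    """Test whether q1 is contained in q2 by evaluating q2 over the frozen q1."""
    head1, body1 = q1
    head2, body2 = q2
    database = set(body1)  # variables of q1 are now treated as constants
    constants = sorted({t for _, args in body1 for t in args})
    variables2 = sorted({t for _, args in body2 for t in args if is_var(t)}
                        | {t for t in head2 if is_var(t)})
    for values in product(constants, repeat=len(variables2)):
        mapping = dict(zip(variables2, values))

        def image(term):
            return mapping.get(term, term)

        if tuple(image(t) for t in head2) != tuple(head1):
            continue  # the answer tuple must be the frozen head of q1
        if all((pred, tuple(image(t) for t in args)) in database
               for pred, args in body2):
            return True
    return False


grandparent = (("?x",), [("Parent", ("?x", "?y")),
                         ("Parent", ("?y", "?z")),
                         ("Male", ("?z",))])
parent = (("?x",), [("Parent", ("?x", "?y"))])
```

Here `contained_in(grandparent, parent)` holds because the more general query has an answer over the frozen database, while the converse test fails. In the constrained case the evaluation step is replaced by query answering w.r.t. a terminology.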
The relation between the two problems is also exploited in Calvanese et al. [2000].
In their framework the query answering problem is reduced to satisfiability of Con-
verse PDL formulae (see Chapter 4). Their technique is very similar to the one we
are presenting; in fact, we have been inspired by their work, particularly the idea of
representing a query as a graph and using a “rolling-up” technique.
The difference lies in the underlying DL they use and the scope of their work. They
rely heavily on Converse PDL logic for the transformation, therefore their approach is
only applicable to systems providing a Converse PDL reasoner. In contrast, we want to
present a general technique which is applicable to a wide range of DL systems with a
minimum effort. For this reason, for example, we develop a mechanism for simulating
the inverse role constructor which is provided by Converse PDL but not implemented
in most of the available DL reasoners.
Even if Converse PDL is, in most respects, more expressive than the logic $\mathcal{SH}f$
we are focusing on, the presence of transitivity and functional restrictions as well as
role hierarchy in $\mathcal{SH}f$ makes the structure of its interpretations more convoluted (see
Chapter 3). This complexity is reflected in the query answering algorithm, as shown
in the next chapter.
The problem of enhancing the query language for Abox KBs has also been attacked
in the context of the system CARIN (see Levy and Rousset [1996a]). This DL based
system integrates a DL with a query language which is essentially (recursive) Datalog.
Their combination provides a very powerful query language which subsumes conjunc-
tive queries;9 however, its applicability is limited by the fact that combining recursive
Datalog with an expressive DL leads to undecidability (see Levy and Rousset [1996b]).
In addition, the technique for query answering uses an ad hoc algorithm which is not
easily portable across different DL reasoners.
On the same track as the CARIN approach, a proposal has been made to use backward-
chaining for evaluating conjunctive queries based on query expansion (see Rousset
[1999]). The problem with this approach is the extreme restriction on the expressive-
ness of the DL language (��� with empty terminologies). The advantage of this
technique is that the query answer is built bottom-up, leading to much better performance. The downside is that the procedure relies on the fact that the underlying DL
has the property of having some sort of “minimal model” against which the query can
be evaluated. As soon as the DL expressivity is enriched by constructors enabling the
representation of disjunctive information, the idea underpinning the technique
can no longer be applied.
In the next chapter we give a formal description of the technique outlined in this
chapter. Formal proofs are provided for supporting the correctness and completeness
of the described technique.
9 Using non-recursive Datalog yields a language which is essentially equivalent to the one presented in Calvanese et al. [2000].
Chapter 7
Answering conjunctive queries in $\mathcal{SH}f$
7.1 Query language
The query language we consider is an adaptation of the basic conjunctive query lan-
guage as defined in the database setting (see Abiteboul et al. [1995], Chandra and
Merlin [1977]). The main difference in the DL case lies in the fact that in DL there are
only relations of arity one (concepts) or two (roles). Individual names can be viewed
as the constants in the database case.
In the following we adopt the same notation as the one for the conjunctive calculus
as described in [Abiteboul et al., 1995, section 4.2].
7.1.1 Syntax
Definition 7.1. Let �� be the set of valid concept expressions built over a set �� of
atomic concept names, and a set �� of role names. We assume a set of individual
names, and a distinct set / of variables. The set 0����� of conjunctive formulae over
�� is the set of all formulae defined by the following abstract syntax:
$ 1 �� � � � ���� � � � � �� � � ! � � � �
$� - $� � $� 2 $� � � $
where � is a concept expression in ��, �� � � � � � are role names in �� , ! is an
individual name in ), and � � are variable names in / .
The atomic formulae of the form ��, � � ���� � � � � ��, � !, and �
� are called terms. We distinguish the terms into concept terms ( ��), role terms
(� � ���� � � � � ��), and equality terms.
Given a formula of 0����� we use the natural definition of free variables as
those not bound by an existential quantifier. We indicate the set of free variables of
a given formula $ as �����$�. For the sake of simplicity we introduce the notation
� � � � � � �$ as equivalent to the formula � ��� ��� � � �� �$� � � ���.
Given the conjunctive set of formulae 0����� we define a query as a formula
containing free variables. Intuitively, the free variables correspond to the “results” of
the query. We distinguish a special form of query when the formula does not contain
any free variable.
Definition 7.2. A conjunctive query over a DL �� is an expression
� �� � � � � � � $�
where $ is a conjunctive formula ($ � 0�����), and �� � � � � � are called dis-
tinguished variables and are exactly the free variable names of $ (� �� � � � � �� �
�����$�). Among the conjunctive queries we distinguish the boolean conjunctive queries
as those without free variables.
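
The abstract syntax and the notion of free variables can be made concrete with a small data structure. The following is only a sketch under our own assumptions: the class names, the field names, and the use of plain strings for concept expressions, role names, individuals and variables are illustrative choices and are not part of the definitions above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class ConceptTerm:
    """A variable constrained by a concept expression (kept as a string here)."""
    var: str
    concept: str


@dataclass(frozen=True)
class RoleTerm:
    """A pair of variables related by a conjunction of role names."""
    subj: str
    obj: str
    roles: Tuple[str, ...]


@dataclass(frozen=True)
class EqTerm:
    """A variable equated either to an individual name or to another variable."""
    var: str
    other: str
    other_is_individual: bool


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Exists:
    var: str
    body: "Formula"


Formula = Union[ConceptTerm, RoleTerm, EqTerm, And, Or, Exists]


def free_vars(f: Formula) -> FrozenSet[str]:
    """Variables of f that are not bound by an existential quantifier."""
    if isinstance(f, ConceptTerm):
        return frozenset({f.var})
    if isinstance(f, RoleTerm):
        return frozenset({f.subj, f.obj})
    if isinstance(f, EqTerm):
        # Individual names are constants, so they are never free variables.
        if f.other_is_individual:
            return frozenset({f.var})
        return frozenset({f.var, f.other})
    if isinstance(f, (And, Or)):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Exists):
        return free_vars(f.body) - {f.var}
    raise TypeError(f"not a conjunctive formula: {f!r}")


def is_boolean(f: Formula) -> bool:
    """A boolean query is a formula without free variables."""
    return not free_vars(f)
```

In this encoding the distinguished variables of a query are exactly the result of `free_vars` applied to its formula, and quantifying all of them yields a boolean query.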
7.1.2 Semantics
The semantics of the conjunctive formulae is given in the same way as for the DL, by
means of interpretations composed of a domain and an interpretation function (see
Chapter 2).
Definition 7.3. Let � � ��� ��� be an interpretation over the set of individual names
, role names �� , and concept names �� . Given a formula $ � 0�����, and a
mapping ( from the names in �����$� to the domain �, we say that the interpretation
� models $ w.r.t. ( (� ��� $) when:
� ��� �� iff (� � � ��
� ��� � � ���� � � � � � � iff �(� �� (���� � �� � � � � ��
�
� ��� � ! iff (� � � !�
� ��� � � iff (� � � (���
� ��� $� - $� iff � ��� $� and � ��� $�
� ��� $� 2 $� iff � ��� $� or � ��� $�
� ��� � $ iff there is ) � �� s.t. (� � (� �) and � ���� $
Where the notation (� �) represents the mapping ( extended by the pair � � )� if
is not in the domain of (, otherwise the original value for is replaced by ) (i.e.
(� �) � �� � )�� � ���� )�� � ( � � !" �). With the simplified notation � �� � � �$,
we say that � ��� � � � � � � �$ iff there is an extension (� of ( with a mapping for
all the variable names �� � � �� � such that � ���� $.
When there is at least one mapping ( such that � ��� $, then the interpretation �
models $.
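
For a finite interpretation, the clauses of Definition 7.3 can be read directly as a recursive evaluation procedure. The sketch below is an illustration under simplifying assumptions of ours: formulae are nested tuples, concept expressions are restricted to atomic names, and the interpretation is a dictionary with the hypothetical keys `domain`, `concepts`, `roles` and `individuals`. It checks a single given interpretation and says nothing about logical implication.

```python
def models(interp, formula, sigma):
    """Check whether a finite interpretation models a formula w.r.t. the mapping sigma."""
    tag = formula[0]
    if tag == "concept":                       # the value of x belongs to the concept
        _, x, concept = formula
        return sigma[x] in interp["concepts"].get(concept, set())
    if tag == "role":                          # the pair belongs to every role of the conjunction
        _, x, y, roles = formula
        pair = (sigma[x], sigma[y])
        return all(pair in interp["roles"].get(r, set()) for r in roles)
    if tag == "eq_ind":                        # x denotes the same element as the individual
        _, x, name = formula
        return sigma[x] == interp["individuals"][name]
    if tag == "eq_var":                        # x and y denote the same element
        _, x, y = formula
        return sigma[x] == sigma[y]
    if tag == "and":
        return models(interp, formula[1], sigma) and models(interp, formula[2], sigma)
    if tag == "or":
        return models(interp, formula[1], sigma) or models(interp, formula[2], sigma)
    if tag == "exists":
        # Extend (or overwrite) the mapping for x with each element of the domain.
        _, x, body = formula
        return any(models(interp, body, {**sigma, x: d}) for d in interp["domain"])
    raise ValueError(f"unknown formula tag: {tag!r}")


# A toy interpretation with two elements.
toy = {
    "domain": {1, 2},
    "concepts": {"Person": {1}},
    "roles": {"hasChild": {(1, 2)}},
    "individuals": {"mary": 1},
}
```

The dictionary update `{**sigma, x: d}` plays the role of the extended mapping used in the clause for the existential quantifier: it adds a value for the variable, or replaces the previous one.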
We can use the above defined semantics for specifying the meaning of answering
a query w.r.t. a given knowledge base �. We start from boolean queries; then we are
going to extend the definition to general queries.
Definition 7.4. Let � be a DL knowledge base, and $ a conjunctive formula in 0�����
without free variables. We say that � answers (implies) $ (written as � �� $) iff any
interpretation satisfying � models $: i.e. for any interpretation �, � �� � implies
� �� $.
The definition of logical implication for boolean queries provides the basis for
the formal semantics of query answering. Analogously to the database setting, the
answer to a query is a set of tuples of individual names (see Reiter [1984]). Each tuple
corresponds to an instantiation of the conjunctive query into a boolean query where
the free variables are bound to the corresponding names in the tuple. In other words, a
tuple �!�� � � � � !�� belongs to the answer of the query � �� � � � � � � $� w.r.t. the kb �
iff � �� � � � � � �� � � !� - � � � - � � !� - $�.
It is obvious from the definition that query answering can be reduced to logical im-
plication by repeated application of boolean queries with different tuples of individual
names substituted for variables. Of course the naive evaluation of such a retrieval could
be prohibitively expensive, but would clearly be amenable to optimisation (see Sec-
tion 7.5 for more details). However, from now on we concentrate on boolean queries
by showing how to decide if a query formula without free variables is a logical impli-
cation of a given knowledge base.
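
The reduction of query answering to boolean queries can be sketched as a plain enumeration of tuples. In the sketch below the parameter `entails` is a hypothetical oracle of ours: it receives a binding of the distinguished variables to individual names and is assumed to decide the corresponding boolean query against the knowledge base. No real reasoner API is implied, and the enumeration is exactly the naive, potentially very expensive, strategy mentioned above.

```python
from itertools import product


def answers(individuals, distinguished, entails):
    """Naive retrieval: test every tuple of individual names with a boolean query.

    individuals   -- iterable of individual names occurring in the knowledge base
    distinguished -- list of distinguished (free) variable names of the query
    entails       -- hypothetical oracle: binding (dict) -> bool
    """
    result = []
    for candidate in product(sorted(individuals), repeat=len(distinguished)):
        binding = dict(zip(distinguished, candidate))
        if entails(binding):
            result.append(candidate)
    return result
```

With n individual names and k distinguished variables this performs n^k implication tests, which is why the optimisations referred to in the text matter in practice.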
7.2 Completeness of quasi transitive shrub interpreta-
tions
As anticipated in Section 6.2.2 we need to show that cycles can be enforced only by
means of Abox assertions (see Proposition 7.15 and Proposition 7.16). For this purpose
we are going to use a completeness result similar to the one already presented in
Chapter 3. Canonical transitive shrub interpretations provide a formal characterisation
of this “tree-like” structure of the parts of an interpretation which are not constrained
by Abox assertions.
Note that we cannot just reuse the completeness result presented in Section 3.3
because the problem of logical implication is different from the one of kb satisfiability.
In many logics these two problems can be trivially reduced to each other (for example
using negation), but in our case the fact that the query language is different from the
assertional language prevents us from taking this simple shortcut.
Our goal is to show that, in order to establish that a given knowledge base implies
a formula, we do not need to consider arbitrary interpretations but only quasi
transitive shrubs. For this purpose we are going to use the very same transformation
of interpretations defined in Section 3.2.1.
Theorem 7.5. Let � be a ��f knowledge base, and $ a conjunctive formula in
0����� without free variables. Then � �� $ iff � �� $ w.r.t. canonical quasi transi-
tive shrub interpretations.
Proof. The “only if” direction is trivial because quasi transitive shrub interpretations
are interpretations as well.
For the “if” direction let us assume that � �� $ w.r.t. canonical quasi transitive
shrub interpretations. Let � � ��� ��� be an arbitrary interpretation satisfying � (i.e.
� �� �); we have to show that � models $ as well (Definition 7.3).
Let ��� � ������ �
���� be the transitive closure of the unravelled interpretation of
�; in addition, the mapping Æ maps elements of ���� to � (See Definition 3.7 and
Definition 3.8). Since we assumed that � �� $ w.r.t. canonical quasi transitive shrub
interpretations, then ��� models $ (��� is a canonical quasi transitive interpretation by
Proposition 3.11).
Let ( be an arbitrary mapping ( � / # ����; the mapping ( � is a mapping from
/ to � defined by (� � ���� ���� � ��� �� � (�. We are going to show by induction
on the structure of the formula $ that if ��� ��� $ then � ���� $. Note that since �
satisfies �, then ��� satisfies the properties in Proposition 3.9 (see Proposition 3.10).
First let us consider the basic cases.
�� If ��� ��� �� then (� � � ����, and (�� � � �(� �� � �� by (3.9d); therefore
� ���� ��.
� � ���� � � � � �� If ��� ��� � � ���� � � � � � � then �(� �� (���� � ���
� for
� � �� � � � � � and �(�� �� (����� � ��(� ��� �(����� � �� by (3.9b); therefore
� ���� � � ���� � � � � ��.
� ! If ��� ��� � ! then (� � � !���, and (�� � � �(� �� � !� by (3.9a);
therefore � ���� � !.
� � If ��� ��� � � then (� � � (���, and (�� � � �(� �� � �(���� � (����;
therefore � ���� � �.
As inductive hypothesis let us assume that, for an arbitrary mapping � � / # ����,
if ��� ��� $ then � ���� $ (and the same holds for $�, $�). We are going to show that
the very same property holds for the formulae $� - $�, $� 2 $�, and � $.
$� - $� If ��� ��� $� - $� then ��� ��� $� and ��� ��� $�. By inductive hypothesis
� ���� $� and � ���� $�; therefore � ���� $� - $�.
$� 2 $� If ��� ��� $� 2$� then ��� ��� $� or ��� ��� $�. If ��� ��� $� then � ���� $�
by inductive hypothesis; therefore � ���� $� 2 $�. Analogously, we come to the
very same result if we assume that ��� ��� $�.
� $ If ��� ��� � $ there is an element � of ���� such that ��� ��� $with � � (� �� .
For the inductive hypothesis � ���� $, where �� � ���� ���� � ��� �� � ��.
It is easy to show that � � � (�� ���� : if ��� )� � ��, then either � " and
) � ���, therefore ��� )� � (�� ���� ; or ��� �� � � with ��� � ), therefore
��� )� � (�� ���� because � � (� �� . Analogously, it can be shown that if
��� )� � (�� ���� then ��� )� � ��.
This means that � ��������� $, therefore � ���� � $.
By Definition 7.3 there is a mapping ( � / # ���� such that ��� ��� $. We have just
shown that there is a mapping ( � � / # � such that � ���� $. Therefore from � �� $
for the arbitrariness of � we can conclude that � �� $.
7.3 Answering boolean queries
We are going to present a technique which enables us to decide whether a query formula
without free variables is a logical consequence of a given knowledge base. Our
algorithm cannot answer arbitrary query formulae but only a restricted class repre-
sentable in a normal form we are going to define. This normal form is general enough
for answering the kind of queries we are focusing on: conjunctive queries and disjunc-
tions of conjunctive queries.
First we are going to present the normal form, then we show how connected and
unconnected conjunctive queries can be verified by using the normal form.1 Finally,
we show how disjunctions of conjunctive queries can be verified by a slightly more
sophisticated reduction to the normal form.
7.3.1 Query normal form
We consider a disjunctive normal form
���� �$� 2 � � � 2 � �$�� (7.1)
where each $� is a conjunction of terms of the form of ��, � � ���� � � � � � �,
� ! and � � (i.e. it does not contain any disjunction or existential quantification).
In addition, we require that its free variables are exactly � and �, and all the variables
in each $� are connected. As the normal form suggests, we differentiate a special
variable � whose purpose is “joining” all the different disjuncts, which otherwise do
not share any variable.
It is easy to realise that this normal form is not general enough to cover all the valid
query formulae as described in Section 7.1.1. The simple formula � ��� � ��� 2
� � ����� for example cannot be transformed into the normal form. However, conjunc-
tive queries (i.e. not containing any disjunction) or disjunctions of conjunctive queries
(not sharing any variable) can be transformed into the normal form by renaming of
variables.2
The reduction to boolean queries shown in Section 7.1.2 can generate queries that
apparently cannot be transformed into the normal form; e.g. the query � �� � � $� 2 $��,
once instantiated into a boolean query, becomes � � �� � � !� - � � !� - �$� 2$��
for some individual names !� and !�. However, variables being identified with an indi-
vidual name can be distributed among the disjuncts without altering the semantics, be-
cause the uniqueness of the individual name ensures that they will be identified with the
1 Intuitively, being connected means that there is a path constituted by the role terms connecting each pair of variables (we provide a more formal definition at the end of next section).
2 Actually the transformation is a little more involved in the case of non connected or disjunctive queries, as described in the next sections.
same element of the domain. Therefore the formula in the example can be transformed
into the equivalent � � �� � � !� - � � !� - $�� 2 � � �� � � !� - � � !� - $��,
which is a disjunction of conjunctive queries.
For a better explanation of the algorithm we use an alternative equivalent represen-
tation of the formula (7.1) by using a set of sets of query terms. The idea behind this
alternative representation is that since all the variables are existentially quantified we
do not really need to specify the quantification operator for binding a variable name.
Therefore we drop the quantification at the beginning of each $� and we use the set
of terms in it instead of the formula itself; i.e. the formula ���� �$� 2 � � � 2 � �$��
is represented by �*�� � � � � *�� ��, where *� is the set of all the conjuncts in $�. The
notation �� � �� �� specifies that � is the special variable joining the disjuncts.
Given the one-to-one mapping between the set and explicit forms of the query for-
mula, we use the same notation for describing the notion that an interpretation models
a formula (see Definition 7.3). Given the formula �*�� � � � � *�� �� we say that an in-
terpretation � models the formula w.r.t a mapping ( (� ��� �*�� � � � � *�� ��) iff there
is an element (disjunct) *� such that the domain of ( includes all the variables in *�
and for each term + in *�, � ��� +. It is easy to see that, for the class of queries rep-
resentable in the given normal form, the two notions of model for a formula coincide.
To simplify the notation, when there is no ambiguity we are going to write formulae
having a single element (disjunct) in the form � ��� * instead of � ��� �*� ��.
The set representation enables us to provide a formal definition of the property of
connectedness for a formula. First we introduce the notion of a path connecting two
variables.
Definition 7.6. A path connects two variables and it is defined in the following recur-
sive way:
the set �� � ���� � � � � ��� is a path connecting to �;
if the set $ is a path connecting to �, the set $� is a path connecting � to �, and
$ � $� is empty, then $ � $� is a path connecting to �;3
if $ is a path from to �, then it is a path connecting � to as well.
A cycle is a path connecting a variable to itself. A set of terms contains a path (cycle)
if a subset of it is a path (cycle).
3We require the disjointness of the two paths because we want to prevent cases like � � �� �� ���� from being wrongly classified as paths starting and ending with the same variable.
By using the previous definition we can now provide a formal characterisation of
a connected formula. A query formula in the normal form is connected iff in every
element (disjunct) there is a path between any pair of variables.
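
Since paths are built from role terms only and may be traversed in either direction, connectedness of a single disjunct amounts to connectivity of an undirected graph whose nodes are the variables. The following sketch rests on that reading of Definition 7.6; the function name and the representation of role terms as pairs of variable names are assumptions made for the illustration.

```python
def is_connected(variables, role_edges):
    """Check that every pair of variables is linked by a path of role terms.

    variables  -- iterable of variable names occurring in the disjunct
    role_edges -- iterable of (x, y) pairs, one for each role term on x and y
    """
    adjacency = {v: set() for v in variables}
    for x, y in role_edges:
        # A path from x to y is also a path from y to x, so edges are undirected.
        adjacency.setdefault(x, set()).add(y)
        adjacency.setdefault(y, set()).add(x)
    if len(adjacency) <= 1:
        return True
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        for neighbour in adjacency[stack.pop()] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return len(seen) == len(adjacency)
```

A formula in normal form would then be connected when this check succeeds for each of its disjuncts.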
7.3.2 Conjunctive queries
Since nested existential quantifiers can easily be moved to the front of the formula by renaming
the variables, a connected conjunctive boolean query is trivially in query normal
form, containing a single element. In this section we consider boolean queries in the form
� ����� � � ��
���$��� - � � � - � ����� � � ��
���$����
where the formulae $���� � � �� $��� are connected and do not contain any existential
quantifier nor disjunctions. Without loss of generality we can assume that all the vari-
able names are distinct in each formula.
Intuitively, since the formulae do not interact with each other, such
queries can be answered by considering one formula at a time. This is indeed the case,
and we are going to show it formally in the following paragraphs.
Since we can prove a more general result about unconnected conjunction, which
will be used in the next section, we are going to relax the constraint on the form of the
query. Let us consider the query answering problem
� �� � ����� � � ��
���$��� - � � � - � ����� � � ��
���$���� (,)
where there is not any restriction on the form of $���� � � �� $��� other than the fact that
they do not share any free variables. We are going to show that the query answering
problem (,) is verified iff � �� � ����� � � �����$��� for each � � �� � � � � �. The uncon-
nected conjunctive queries we are considering in this section are a particular case of
(,).
Since the “only if” direction (i.e. *) is trivially satisfied by definition, we are going
to concentrate on the other direction. Let us assume that the variable names are distinct
in each formula � ����� � � �����$���, and � �� � ����� � � ��
���$��� for each � � �� � � � � �.
Let � be an arbitrary interpretation satisfying � (� �� �); we have to show that �
models the query formula in (,) as well.
Since � �� � ����� � � �����$��� for each � � �� � � � � �, then � �� � ����� � � ��
���$���;
therefore there are � mappings (�� � � � � (� for the variable names such that � ���� $���.
Let us consider the formula
� ����� � � ��
���� � � ����� � � ��
����$��� - � � � - $����
which is equivalent to the original query formula in (,), because the variable names are
distinct. Since the variable names are distinct, making the union of the mappings does
not alter the variables relevant for each single formula $���; therefore, the mapping
( � (�� � � ��(� is such that � models �$���- � � �-$���� w.r.t. (. Therefore � models
the query formula in (,) as well.
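
The result just shown suggests a simple procedure for unconnected conjunctive queries: split the terms into groups that share no variable and test each group on its own. The sketch below illustrates this under assumptions of ours: only concept and role terms are handled, terms are tuples, and `entails_connected` is a hypothetical oracle deciding a single connected conjunctive query against the knowledge base.

```python
def term_vars(term):
    """Variables of a term: ("concept", x, C) or ("role", x, y, R)."""
    return (term[1],) if term[0] == "concept" else (term[1], term[2])


def split_components(terms):
    """Group terms so that different groups share no variable (union-find)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for term in terms:
        vs = term_vars(term)
        for v in vs:
            find(v)
        if len(vs) == 2:                    # a role term joins its two variables
            parent[find(vs[0])] = find(vs[1])

    groups = {}
    for term in terms:
        groups.setdefault(find(term_vars(term)[0]), []).append(term)
    return list(groups.values())


def entails_all(terms, entails_connected):
    """The conjunction is implied iff every variable-disjoint group is implied."""
    return all(entails_connected(group) for group in split_components(terms))
```

Each group produced by `split_components` is connected in the sense of Definition 7.6, because in this restricted encoding only role terms mention two variables.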
7.3.3 Disjunction of conjunctive queries
The last class of queries we consider is the disjunction of conjunctive queries. In this
class, the query formula is a disjunction of formulae like *� 2 � � � 2 *�, where each *� is in the form
� ����� � � ��
���$��� - � � � - � ����� � � ��
���$����
and the formulae $���� � � �� $��� are conjunctive queries; i.e. do not contain any ex-
istential quantifier nor disjunctions. A disjunctive query answering problem can be
formulated as
� �� �$��� - � � � - $�����
2 � � �
2 �$��� - � � � - $�����
where each $��� is a connected conjunctive formula. We can assume without loss
of generality that all the existential quantifiers appear at the beginning of each sub-
formula $���. First the query formula is transformed into the conjunctive normal form
by using the distributive property of boolean constructors (i.e. (A ∧ B) ∨ C is equivalent
to (A ∨ C) ∧ (B ∨ C)). In this way we obtain an equivalent query formula where
the conjunctions appear at the top level, like
� �� �$�������� 2 � � � 2 $���������
- � � �
- �$�������� 2 � � � 2 $����������
(,)
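
The distribution step can be sketched as a Cartesian product: every conjunct of the resulting formula picks exactly one sub-formula from each of the original disjuncts. The code below is an illustration only; sub-formulae are represented by opaque labels, and the function name is ours. Note that the number of resulting conjuncts is the product of the sizes of the original disjuncts.

```python
from itertools import product


def dnf_to_cnf(disjuncts):
    """Distribute a disjunction of conjunctions into a conjunction of disjunctions.

    disjuncts -- list of lists; each inner list holds the (labels of the)
                 connected sub-formulae that are conjoined in one disjunct
    returns   -- list of frozensets; each frozenset is one disjunction
    """
    clauses, seen = [], set()
    for choice in product(*disjuncts):
        clause = frozenset(choice)
        if clause not in seen:              # drop syntactic duplicates
            seen.add(clause)
            clauses.append(clause)
    return clauses
```

For instance, the two disjuncts `["q11", "q12"]` and `["q21"]` yield the two conjuncts `{q11, q21}` and `{q12, q21}`, mirroring the distributive law used in the text.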
By using the result shown in Section 7.3.2 we can consider each conjunct �$�������� 2
� � � 2 $��������� independently from the rest of the formula. Therefore, we need to show
how to answer a query of the form
� �� $� 2 � � � 2 $�
where $�� � � �� $� are connected conjunctive formulae. The formula $� 2 � � � 2 $� has
a close resemblance with the query normal form described in Section 7.3.1. What is
missing is a common “joining” variable connecting the disjuncts (i.e. the variable � in
Formula (7.1)).
The idea is to transform the query into the normal form by choosing arbitrarily one
variable from each disjunct and “promoting” it as the joining variable name. In the
next Proposition we are going to show that this approach is indeed correct and allows
us to solve the query answering problem.
Proposition 7.7. Let � be a knowledge base and ���$� 2 � � �2���$� a query formula
such that ���$�� � � � � ���$� are connected conjunctive query formulae not containing
the variable name �; then � �� ���$� 2 � � � 2 ���$� iff � �� ���$������ 2 � � � 2
$������ �.4
Proof. Let us assume that � �� ���$� 2 � � � 2 ���$� and let � be an interpretation
satisfying �. By assumption � �� ���$� 2 � � � 2 ���$�; therefore there is a mapping
( such that � ��� ���$� 2 � � � 2 ���$�. By definition there is a disjunct
���$� such that � ��� ���$�; therefore � ���� $�, where (� � (����) for some
) � �� . Let (�� be a mapping such that ( �� � (���(����� . The mapping (�� satisfies
� ����� $������ because $� does not contain the variable �, so � ����� �$������ 2 � � �2
$������ �, therefore � ����� ���$������ 2 � � � 2 $������ � by definition. This means
that � �� ���$������ 2 � � �2$������ �. By the arbitrariness of the choice of � we can
conclude that � �� ���$������ 2 � � � 2 $������ �.
For the other direction, let us assume that � �� ���$������ 2 � � � 2 $������ �,
and � be an interpretation satisfying �. By assumption there is a mapping ( such
that � ��� ���$������ 2 � � � 2 $������ �; therefore there is a disjunct $������ and
a mapping (� � (���) , for some ) � �� , such that � ���� $������ . Let (�� be a
new mapping such that ( �� � (����(���� . Since �� is not appearing as a free variable
in $������ , then � ����� �$������ ������ . In addition, since � is not appearing in $�,
4The formula ���� indicates the formula in which all the free occurrences of the variable �, �itself is substituted by .
�$������ ������ " $� so � ��� ���$� (the only differences between ( and (�� are the
mapping of variables � and ��). Therefore � ��� ���$�2 � � �2���$� and consequently
� �� ���$� 2 � � �2���$�. By the arbitrariness of the choice of � we can conclude that
� �� ���$� 2 � � � 2 ���$�.
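
The transformation justified by Proposition 7.7 is a plain renaming: in every disjunct one chosen variable is replaced by the common joining variable. The following sketch shows the renaming on a set-of-terms representation. The tuple encoding of terms, the function names and the default name `z` for the joining variable are assumptions made for the example; as in the proposition, the joining variable must not already occur in any disjunct.

```python
def rename(term, old, new):
    """Rename a variable in a ("concept", x, C) or ("role", x, y, R) term."""
    if term[0] == "concept":
        _, x, concept = term
        return ("concept", new if x == old else x, concept)
    _, x, y, roles = term
    return ("role", new if x == old else x, new if y == old else y, roles)


def promote_joining_variable(disjuncts, joining="z"):
    """Turn disjuncts into normal form by promoting one variable of each.

    disjuncts -- list of (chosen_variable, terms) pairs
    returns   -- list of frozensets of terms, all sharing the joining variable
    """
    result = []
    for chosen, terms in disjuncts:
        used = set()
        for term in terms:
            used.update(term[1:2] if term[0] == "concept" else term[1:3])
        if joining in used:
            raise ValueError(f"variable {joining!r} already occurs in a disjunct")
        result.append(frozenset(rename(t, chosen, joining) for t in terms))
    return result
```

Which variable is chosen in each disjunct is arbitrary, so a caller may simply pick the first variable it finds.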
7.4 Answering queries in normal form
In this section we describe the actual algorithm for answering queries in the normal
form described in Section 7.3.1. The algorithm is described in terms of a set of non-
deterministic rules which transform the initial implication by modifying the query for-
mulae and possibly the KB on the left hand side. The goal of the rules is producing an
implication test of the form � �� ������� � � � � � �����; when such a form is reached
the rules are no longer applied. Such a simple implication can be verified by checking
whether the knowledge base �, extended with the axiom �� ���� � � � � � ����� is
unsatisfiable. In fact, the formula ������� is implied by the KB � iff in every inter-
pretation satisfying � the extension of the concept � is non-empty. We can rephrase
the question by verifying whether there is an interpretation in which the extension is
empty (i.e. the negation of the concept itself is equal to the whole domain).
We require that the algorithm is correct, complete, and terminating. We are going
to build these transformation rules according to these requirements. In the next sections
we prove that this is actually the case.
7.4.1 Query transformation rules
Rules transform query problems of the form � �� �*�� � � � � *�� �� into “simpler”5 prob-
lems of the same form. An element *� is selected such that it matches the preconditions
of one or more rules. One of the matching rules is selected according to a strategy, and
the query is transformed as described in the rule. The procedure is repeated until no
elements of the query formula satisfy the preconditions of any rule. At this point the
resulting query formula will be of the form
������� � � � � � ������� ��
which is equivalent to the formula ������� � � � � � ���.
5 Simpler in the sense that the query after the application of a rule is closer to the target formula ������� � � � � � ������� ��.
In general, rules substitute the element (disjunct) satisfying the precondition with
a new disjunct, and possibly they modify the KB � in the left hand side:
� �� � * � *� � � � � � *� ���
rule
�� �� � *� � *� � � � � � *� ���
In some cases, the selected element * may be substituted by more than one element
*��� � � � � *��; but since this transformation is performed by the last two rules only, we
maintain the simpler single element substitution for the rest of the rules.
Before presenting the rules we need to introduce some notation which is going to
simplify the following exposition. As introduced in Chapter 6, we are considering
knowledge bases as sets of Tbox and Abox assertions instead of a pair of sets. This
simplifies the notation, when new axioms are added to the KB (see nominal elimination
rule).
A further simplification is made to the role syntax in the role terms, which are
represented in a set-like notation instead of a conjunction; i.e. the expression �� � � ��
� is represented by ���� � � � � ��. In cases where the actual elements of the sets
are not used, the set will be indicated by a calligraphic character like � or �; i.e. the
conjunction is �� (see for example the role collapsing rule).
New concept names are introduced in some of the rules; we assume that these new
names do not appear in the knowledge base to which the rule is applied. Some of these
new names are uniquely associated to individual names and are called representative
concepts; they are denoted as ��, where ! is the associated individual name (see the
nominal elimination rule for an example). It is important to note that the representative
concepts are unique w.r.t. each individual name. Therefore, if the assertion !��� is
already in �, then the knowledge base �� � � � �!���� is not modified. New concept
names, different from the representative concept, can be introduced by appropriate
rules (e.g. the simple inverse rolling up rule); they are unique w.r.t. the actual rule
application.
Finally, in some of the rules the presence of a concept term of the form ��� may be
required among the other conditions (e.g. in the rolling up rules). If such a term is not in
the formula, we assume an implicit ��� (see the simple rolling up rule).
The order in which the rules are presented is not arbitrary: the strategy
for their application gives precedence according to this order. This assumption is
essential for some of the proofs we are going to show.
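
The application strategy described so far (try the rules in their order of presentation, apply the first one whose precondition holds, repeat until none applies) can be sketched as a small rewriting loop. The code below only illustrates this control flow. The function `run_rules` and its arguments are our own names, a rule is modelled as a function returning a new state or `None` when it is not applicable, and the two arithmetic rules in the example are placeholders unrelated to the actual transformation rules.

```python
def run_rules(state, rules, max_steps=10_000):
    """Apply priority-ordered rules until no rule is applicable.

    state -- the current problem (for the algorithm: a KB together with a query)
    rules -- list of (name, rule) pairs in decreasing order of precedence;
             rule(state) returns the transformed state, or None if not applicable
    """
    trace = []
    for _ in range(max_steps):
        for name, rule in rules:
            new_state = rule(state)
            if new_state is not None:
                state = new_state
                trace.append(name)
                break                       # restart from the highest-priority rule
        else:
            return state, trace             # no rule matched: final form reached
    raise RuntimeError("rule application did not terminate within max_steps")


# Placeholder rules on integers, used only to show how precedence is resolved.
toy_rules = [
    ("halve", lambda n: n // 2 if n > 0 and n % 2 == 0 else None),
    ("decrement", lambda n: n - 1 if n > 1 else None),
]
```

Starting from 6, the toy rules are applied in the order halve, decrement, halve: the second rule fires only when the first one does not match, which is the behaviour the ordering assumption relies on. The step bound is merely a safeguard of the sketch.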
1. Equality elimination: if � � �� � * (or �� � � � *), ,� are variable
names, !" �, and *� �� indicates the set * in which all the variables are
substituted by �; then
�� � �
*� � *� �� ( �� � ��
Note that � � � is used because has been substituted by � in *� �� .
2. Conjunction elimination: if � ���� ���� � *, then
�� � �
*� � � ���� � ���� � �* ( � ���� ������
3. Role collapsing: if �� �� ������ � �� ������ � * then
�� � �
*� � �� �� ������� ���
� �* ( �� �� ������ � �� ��������
4. Contradiction elimination: if � � !�� � !�� � *, and !� is different from
!�; then
�� � �
*� � ����� �
5. Nominal elimination: if �� � !� � * and there is no term of the form � �� �����
in *; then
�� � � � �!����
*� � ������ � �* ( �� � !���
Where �� is the representative concept name associated to the individual !.
6. Role elimination: if �� �� ������� � � � � ��� � *, � � �, and there are two
indexes #� � s.t. � & �; then
�� � �
*� � �� �� �������� � � � � �� ( �����
� �* ( �� �� ������� � � � � �����
7. Shortcut elimination: If
�� �� ������� � � � � � ���� ������� � �� ������� � � � � ��� � *�
and there is an index # $ � and a transitive role � such that � & �, and in every
set of roles ��, there is at least a role �� such that �� & �; then
�� � �
*� �
����������� �� �������� � � � � �� ( ����� if � � �, or
� �* ( �� �� ������� � � � � ����
* ( �� �� ����� if � � �.
8. Simple rolling up: if �� � ���� ���� � *, � !" �,6 and * does not contain any
other term involving �; then
�� � �
*� � � ����� � �* ( �� � ���� ������
9. Simple inverse rolling up: if ���� ��� ���� � *, � !" �, and * does not
contain any other term involving �; then
�� � � � �� �������
*� � � ������ � �* ( ���� ��� ������
Where ���� is a new concept name not appearing in �.
10. Functional rolling up: if �� � ������� � � � � ��� ���� � *, � � �, � !" �,
6 We are going to use the symbol �, instead of �, to denote syntactic equivalence for avoiding confusion with the use of � in query terms (see section 7.1.1).
there is a label � s.t. ��� � � � � �� � �,7 and * does not contain any other term
involving �; then
�� � �
*� � � ������ � ���� � � � � � ������
� �* ( �� � ������� � � � � ��� ������
11. Functional inverse rolling up: if ���� ������ � � � � ��� ���� � *, � � �,
� !" �, there is a label � s.t. ��� � � � � �� � �, and * does not contain any
other term involving �; then
�� � �
*� � ����� � ���� � � � � � ������ ��� ����
� �* ( ���� ������ � � � � ��� ������
12. Nominal rolling up: if �� � !� � � ������� � � � � ��� ���� � *, � !" �, and *
does not contain any other term involving �; then
�� � � � �!����
*� � � ������� � ��� � ����� � � � � � �������
� �* ( �� � !� � � ������� � � � � ��� ������
Where �� is the representative concept name associated to the individual !.
13. Nominal inverse rolling up: if �� � !� ��� ������ � � � � ��� ���� � *, � !"
�, and * does not contain any other term involving �; then
�� � � ��!���� �� � ��� �������
�� � � � � � ����������
*� �� �����
�� � � � � � �������
� �* ( �� � !� ��� ������ � � � � ��� ������
Where �� is the representative concept name associated to the individual !, and
������ � � � � ���
�� are new concept names not appearing in �.
7The set of labels of a given KB are defined in Definition 3.4.
14. Cycle breaking: if �� � !� � � ������� � � � � ��� � *, and in
�* ( �� � ������� � � � � ����
there is a path from to � (see Definition 7.6), or " �; then
�� � � � �!����
*� � � ������� � � � � � �������
� �* ( �� � ������� � � � � �����
Where �� is the representative concept name associated to the individual !.
15. Simple cycle breaking: if �� � !� � � ������� � � � � ��� � *, and � � �;
then
�� � � � �!����
*� � �� � ����� ������� � � � � � �������
� �* ( �� � ������� � � � � �����
Where �� is the representative concept name associated to the individual !.8
16. Inverse cycle breaking: if �� � !� ��� ������ � � � � ��� � *, and in �* (
���� ������ � � � � ���� there is a path from to �, or " �; then
�� � � ��!���� �� �������
�� � � � � � ����������
*� �� �����
�� � � � � � �������� �* ( ���� ������ � � � � �����
Where �� is the representative concept name associated to the individual !, and
�����,. . . ,���
�� are new concept names not appearing in �.
17. Simple nominal introduction: if �� � ������� � � � � ��� � *, � � �, there
is not any label � such that ��� � � � � �� � �, and there is not any term like
� � ! in *, and !�� � � � � !� are the individual names appearing in �; then � new
disjuncts are added to the query formula in place of *:
� �� �* � �� � !�� � � � � � * � �� � !�� � *�� � � � � *����
8 The term �� ���� is left in the query for avoiding the problem of breaking the connectedness of the formula.
where *�� � � � � *� are the original disjuncts.
18. Cyclic nominal introduction: if * contains a cycle involving the variable �,
none of the variables �� involved in the cycle appear in a term like �� � ! in *,
and !�� � � � � !� are the individual names appearing in �, then new disjuncts are
added to the query formula in place of *:
� �� �* � �� � !�� � � � � � * � �� � !�� �
)'��� � *� � � � � )'��� � *� *�� � � � � *����
where *�� � � � � *� are the original disjuncts, and )'���� � � � � )'��� are all the possi-
ble combinations of equality terms among pairs of variable names ��� �� involved
in the cycle; i.e. )'��� � ��� � ���.
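
As an illustration of how a single rule acts on a disjunct, the sketch below implements one reading of the equality elimination rule (rule 1): when an equality term between two different variables is present, the first variable is replaced by the second everywhere, and the trivial equality produced by the substitution is dropped. This is a sketch under assumptions of ours: the tuple encoding of terms is hypothetical, the knowledge base is left untouched, and any special treatment of the joining variable is ignored.

```python
def substitute(term, old, new):
    """Replace variable old by new in a term.

    Terms are ("concept", x, C), ("eq_ind", x, a), ("eq_var", x, y)
    or ("role", x, y, roles) with roles given as a tuple of names.
    """
    swap = lambda v: new if v == old else v
    kind = term[0]
    if kind in ("concept", "eq_ind"):
        return (kind, swap(term[1]), term[2])
    if kind == "eq_var":
        return (kind, swap(term[1]), swap(term[2]))
    return (kind, swap(term[1]), swap(term[2]), term[3])


def equality_elimination(disjunct):
    """Apply the rule once; return the new disjunct, or None if not applicable."""
    for term in sorted(disjunct):           # sorted only to make the choice deterministic
        if term[0] == "eq_var" and term[1] != term[2]:
            x, y = term[1], term[2]
            replaced = {substitute(t, x, y) for t in disjunct}
            # The selected equality has become y = y, which carries no information.
            replaced.discard(("eq_var", y, y))
            return frozenset(replaced)
    return None
```

Returning `None` when the precondition fails makes the function usable as one entry of a priority-ordered rule list.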
7.4.2 Notes on transformation rules
In this section we explain in detail the rationale behind the rules introduced in Sec-
tion 7.4.1; we provide the intuition on their correctness, while the formal proofs are in
the following sections.
The underlying idea is that a conjunctive formula, represented by a set of terms,
can be visualised as a graph. The nodes of the graph are the variable names, labeled by
the set of concept terms applying to the variable node. Nodes are connected by edges
representing the role terms. In addition, equality terms are explicitly attached to the
appropriate node. For example, the query expression
����� ��� �� � � � � !� ��� ����� ��� �����
is represented as the graph
� � !
����
�
�
�
A formula in normal form is a set of conjunctive formulae sharing a common variable;
by the graph analogy it corresponds to a set of graphs “hooked” on the joining variable
(the � in Formula (7.1)). Each graph corresponds to one of the elements in the normal
form.
Note the similarity between the graph representation of a query and of an interpre-
tation (see Chapter 2). The main difference is that, in the case of interpretations, nodes
are elements of the domain and are labelled by concept names only (in interpretation
graphs, edges are labelled by sets of names, but this is just a different notation for the
conjunction constructor). The parallel between the two representations enables us to
view query answering under the different perspective of graph matching. In fact, given
an interpretation and a conjunctive query, we can see the question of whether the in-
terpretation satisfies the query as a problem of verifying if the graph corresponding to
the query can be matched against the graph representing the interpretation. The pres-
ence of disjunction in the query language slightly blurs this perspective; however, since
we restrict ourselves to disjunctions of conjunctive queries we can consider the non-
deterministic matching with one of the graphs in the disjunctive formula. The graph
matching perspective is very useful for understanding the rationale behind the algo-
rithm. In particular, it enables us to build and explain examples9 in a very intuitive
way.
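The graph matching view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary encoding of queries and interpretations, the function name `matches`, and the names `C`, `R`, `x`, `y` are hypothetical and not taken from the thesis, and the brute-force search over a finite interpretation is a stand-in for the reasoning the algorithm actually performs.

```python
from itertools import product

def matches(query, interp):
    """Return True if the conjunctive query graph maps into the interpretation graph.

    query:  {"concepts": [(var, concept_name)], "roles": [(var1, role_name, var2)]}
    interp: {"domain": set, "concepts": {name: set of elements},
             "roles": {name: set of (element, element) pairs}}
    (Hypothetical encoding, used only for this illustration.)
    """
    variables = sorted({v for v, _ in query["concepts"]}
                       | {v for a, _, b in query["roles"] for v in (a, b)})
    domain = sorted(interp["domain"])
    # Try every assignment of domain elements to query variables.
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        concepts_ok = all(assignment[v] in interp["concepts"].get(c, set())
                          for v, c in query["concepts"])
        roles_ok = all((assignment[a], assignment[b]) in interp["roles"].get(r, set())
                       for a, r, b in query["roles"])
        if concepts_ok and roles_ok:
            return True
    return False

# A query with one edge R from x to y, where y is labelled by C.
query = {"concepts": [("y", "C")], "roles": [("x", "R", "y")]}
good = {"domain": {1, 2}, "concepts": {"C": {2}}, "roles": {"R": {(1, 2)}}}
bad = {"domain": {1, 2}, "concepts": {"C": set()}, "roles": {"R": {(1, 2)}}}
```

For a disjunction of conjunctive queries, the same check would be repeated for each graph in the disjunctive formula, succeeding as soon as one of them matches.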
The purpose of the transformation rules is to collapse each graph contained in a
disjunctive formula into a graph composed of a single node and without any edges.
The rules are grouped according to the kind of transformation they perform on the
graph.
Normalisation rules
Due to the role conjunction constructor, more than one formula may correspond to the
very same graph; e.g. both the formulae ���� �� � �� and ���� ��� ��� ���� are
represented by the very same graph. Moreover, there are valid formulae that do not
fit nicely with the graph idea; e.g. formulae having multiple concept or equality terms
applied to the very same variable name (like ����� ���� or � � �� � � !�). For this
reason, the first five rules are designed to transform the formula into a normal form which
corresponds univocally to a graph representation. We call these five rules normalisation
rules.
The equality elimination rule eliminates equality terms by renaming the appropriate
variables. Note that the rule is written in such a way that it never eliminates the joining
9 And counterexamples as well. Most of the problems discovered in the development of the algorithm have been spotted by using the graph matching analogy.
variable � (for this reason, both the � � and � � cases must be considered).
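A minimal sketch of the renaming step, assuming a hypothetical encoding in which a disjunct is a frozenset of tuples and `("eq", u, v)` stands for an equality between two variables (equalities with individual names are deliberately left out). The point illustrated is only the one made above: whichever side of the equality the joining variable is on, it is the other variable that gets renamed away.

```python
def eliminate_equality(terms, joining_var):
    """Remove one variable-variable equality by renaming, never dropping `joining_var`.

    terms: frozenset of tuples; ("eq", u, v) is an equality between variables
    (hypothetical encoding, not the notation of the thesis).
    """
    for term in terms:
        if term[0] == "eq":
            _, left, right = term
            # Keep the joining variable: rename the other side, in both orientations.
            old, new = (right, left) if left == joining_var else (left, right)
            rest = terms - {term}
            return frozenset(
                tuple(new if part == old else part for part in t) for t in rest
            )
    return terms  # no equality term: nothing to do

# Both orientations of the equality give the same result.
d1 = frozenset({("eq", "x", "y"), ("concept", "C", "y")})
d2 = frozenset({("eq", "y", "x"), ("concept", "C", "y")})
```

The sketch assumes that variable names never coincide with concept or role names, since the renaming is applied to every component of each tuple.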
The conjunction elimination and role collapsing rules serve the same purpose of
gathering together concept (role) terms applying to the very same variable (pair of
variables) by means of the DL conjunction constructor.
The contradiction elimination rule takes care of formulae which cannot be satisfied
by any interpretation because of two individual names. In fact, by the unique name
assumption two different individual names cannot be interpreted by the very same
element of the domain. The contradictory formula is substituted by the simpler �����
which has no models either. The choice of the substituting formula is driven by the
fact that the resulting formula must contain the connecting variable �, and should be
connected (see the definition of normal form in Section 7.3.1). An alternative choice
would have been the elimination of the element (i.e. the disjunct * is simply eliminated
from the query). However, this choice has been ruled out to avoid the necessity of
taking into account cases in which the formula contains a single disjunct (e.g. like
��� � �� � � ���).
The last normalisation rule, the nominal elimination rule, is needed for transform-
ing a formula in which the “linking” variable is identified with an individual name (we
will come back to this case later on, when we describe the rest of the rules). It is easy
to see that the number of times the normalisation rules can be sequentially applied to
the same disjunct is bounded by the number of terms plus one (because of the nominal
elimination rule).
Role elimination rules
The role elimination and shortcut elimination rules remove redundant edges from the
graph. This is necessary because parts of the graph that may appear as a cycle (or as
multiple edges connecting two nodes) may not really be cycles. If these “false” cycles
are not correctly handled then the nominal introduction rules would be incomplete (see
the proofs in Proposition 7.15 and Proposition 7.16). The following example shows
why the absence of the role elimination rule would cause problems.
Example 7.1
Let us consider for example the query ��� �� � � against the knowledge base � �
�� � ���������. Clearly, each interpretation satisfying� is a model for ��� �� � �;
therefore � �� � ���� �� � �� (i.e. � �� ����� �� � ���). On the other hand, the
formula ���� �� � �� is a candidate for the simple nominal introduction rule, which
transforms the query into � �� � ���� �� � � - � ��. However, this new query
formula is not implied by �. The problem is that, even though the query graph shows
two entering edges into the node , the fact that � is included in implies that an edge
labeled � is always an edge as well.
The shortcut elimination rule is conceptually similar, but it covers the cases in
which the transitivity restriction forces the existence of edges not explicitly appearing
in the query formula itself.
In Figure 7.1 these “phantom” edges are depicted as dotted lines. Solid lines
denote explicit edges, while dashed lines represent implied edges. Finally, undirected
edges represent arbitrary connections with the rest of the graph.
Figure 7.1: Prerequisites for shortcut elimination rule.
Since the proposed technique works only when cycles contain no transitive
roles (or roles including transitive ones), the shortcut elimination rule can transform a
query which is apparently not covered by the algorithm into a suitable one.
Rolling up rules
The next three pairs of rules (rolling up and its inverse version) are the core of the graph
collapsing procedure. They act on “leaves” of the graph,10 eliminating the leaf node
and the edge(s) connecting it to the parent node. We call them “rolling up” because the
information about the removed node and edge(s) is incorporated into the parent node
via a concept term.
The simple rolling up rule is the most intuitive one, and the easiest way of understanding
it is to consider the first order translation of the DL constructor ���, which
is � � ���� � �� - ������ (see Borgida [1994]). The translation corresponds exactly
to the part of the query �� � ���� ���� involved in the rule.
10 The condition of being leaves is enforced by the requirement “. . .� does not contain any other term involving ”.
Figure 7.2: Application of simple rolling up rule.
Note that any reference of the fact that the edge was connecting to � is lost in the
application of the rule; therefore the rule can be applied only if there are not any other
terms involving � itself (i.e. � is a leaf node). Rolling up a non–leaf node may produce
a wrong answer to the query, as shown in the following example.
Example 7.2
To show why this can cause problems, let us consider the simple query
����� ��� � � ������
and the kb � � ������� � ������. Clearly, � does not imply the given query
because there is a model satisfying � and not having a path with the two roles in
sequence (e.g. the interpretation �� � �, � � ���� ���, and �� � ���� ��� satisfies
�). However, if the simple rolling up rule is applied to the term ��� ��, the resulting
query formula is ���������� � � ������. It is easy to realise that the latter query is
a logical implication of � (i.e. identifying both � and with �); therefore the rule
application would not be correct.
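The leaf condition can be made concrete with a small sketch of rolling up a tree-shaped query. Everything here is an illustrative assumption: concept expressions are plain strings written in an ad hoc ASCII notation (`exists R.(C)`, `and`, `Top`), the function name `roll_up` is hypothetical, and only the simple (non-inverse) rule on edges directed away from the joining variable is covered. The sketch refuses to roll up a variable that still occurs in another role term, which is the situation shown to be unsound in the example above.

```python
def roll_up(root, concepts, edges):
    """Collapse a tree-shaped query into one concept expression on `root`.

    concepts: {var: [concept expression strings]}
    edges:    [(parent_var, role_name, child_var)], assumed to form a tree
              directed away from `root` (no cycles, no inverse edges).
    """
    concepts = {v: list(cs) for v, cs in concepts.items()}
    edges = list(edges)
    while edges:
        # A leaf is a non-root variable occurring in exactly one role term, as its target.
        leaf_edge = next(
            (e for e in edges
             if e[2] != root
             and sum(1 for p, _, c in edges if e[2] in (p, c)) == 1),
            None)
        if leaf_edge is None:
            raise ValueError("no leaf to roll up: the query is not a tree")
        parent, role, leaf = leaf_edge
        # The information about the removed node and edge moves into the parent.
        filler = " and ".join(concepts.pop(leaf, [])) or "Top"
        concepts.setdefault(parent, []).append(f"exists {role}.({filler})")
        edges.remove(leaf_edge)
    return " and ".join(concepts.get(root, [])) or "Top"

rolled = roll_up("x", {"y": ["C"], "z": ["D"]}, [("x", "R", "y"), ("y", "S", "z")])
```

In this run the variable `z` is rolled up first, because `y` still has an outgoing edge and is therefore not yet a leaf.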
The simple rolling up rule is applied only if there is a single edge connecting the
leaf node; e.g. the rule cannot be applied to the term � � ��� � �. Again,
the reason lies in the loss of information which would hide the fact that there is only one
confluent node (i.e. �). However, there are cases in which this uniqueness condition is redundant
because there are other constraints that guarantee it. This is the rationale behind the
functional rolling up and nominal rolling up rules.
Let us consider the case when the leaf node � is connected to the “father” node
by a set of roles (i.e. � � ���� � � � � � �) all being subroles of a common functional
role � (see Figure 7.3 (a)). Variable � can be duplicated � ' � times, because the
common functional role � forces � and its copies to be interpreted as the very same
value (Figure 7.3 (b)). All the generated nodes can then be rolled up as described for
the simple rolling up rule.
Figure 7.3: Functional rolling up rule (panels (a), (b) and (c)).
The nominal rolling up rule works in the same way (Figure 7.4); but in this case the
uniqueness is guaranteed by the equality term. Note that the basic rolling up procedure
cannot be used directly in this case because there is an equality term (i.e. � � !).
To cover these cases we use the representative concept of the individual name in-
volved. As introduced in Section 6.2.1, the representative concept is used in place
of the DL one-of constructor. In fact, using the one-of constructor, the terms
�� � !� � � ���� � � � � ��� ���� (when � is a leaf node) can be substituted by the
single term assertion ������� � � ! �� � ���� ! � � � � � � ���� ! ��. The nomi-
nal rolling up rule uses the concept �� in place of the concept expression � ! �. In
addition, an assertion which ensures that the individual is an instance of its representative
concept is added to the knowledge base (e.g., !���) if it is not already there.
Figure 7.4: Nominal rolling up rule (panels (a), (b) and (c)).
Since our underlying language does not allow the use of the inverse role constructor,
all the rolling up rules have their inverse counterpart. This is necessary because
we cannot guarantee that the graph can be collapsed by following the direction of the
edges. In fact, it is quite easy to come up with an example of a graph in which all nodes
have both entering and exiting edges. The idea is not dissimilar to the one already exploited
for the representative concept: the introduction of a new concept representing
the existence of the appropriate edge and node.
Consider the basic simple inverse rolling up rule for a disjunct having the terms
��� ��� ��� (node � is a leaf). If the underlying DL provided the inverse role
constructor, the pair of terms could have been substituted by the single concept term
������ in the same way as the simple rolling up rule. In order to overcome the defi-
ciency of the logic we are going to substitute the expression with a new concept name
(����). The technique is similar to the one used for reducing the satisfiability problem
of Converse PDL to the very same problem in PDL (see De Giacomo [1996]). In our
case we are not interested in enriching the language with the inverse role constructor,
but only in representing a particular “shape” in the interpretations. Therefore, we can
localise the scope of the newly introduced concept name to the given expression � by
using the axiom � ������ (see Proposition 7.18).
The nominal inverse rolling up rule is essentially a combination of the simple inverse
rolling up rule and the duplicating arguments we already presented
in the case of the nominal rolling up rule. The functional inverse rolling up rule takes
a different approach, exploiting the functional restriction on the role names. Consider
the transformation shown in Figure 7.5, which is exactly the same pattern as seen in
the functional rolling up rule (see Figure 7.3).
Figure 7.5: Functional inverse rolling up rule (panels (a), (b) and (c)).
In this case we duplicate the variable ; the new nodes �� � � � � � can then be rolled
up by using the same arguments as the simple rolling up rule. The resulting structure
shown in Figure 7.5 (c) is then suitable for the application of a simple inverse rolling
up rule in a subsequent step.
The rolling up rules eliminate nodes and edges from the graph, but we should
make sure that the two conditions, maintaining the joining variable � and maintaining
the connectedness, are still satisfied by the resulting graph. The second condition is
guaranteed by the fact that rules only operate on leaves, while the first one is explicitly
stated on each rule (with the condition � !" �). However, this latter requirement for
the rules clashes with the necessity of eliminating the equality terms as well as the
role terms. In fact, equalities with individual names are eliminated by the rolling up
procedure (by the nominal rolling up and nominal inverse rolling up rules), but these rules
never apply to the joining variable �. For this reason we have the nominal elimination
rule which applies to � only. Note that this rule must be applied only when there are
no role terms in the conjunctive formula, otherwise the fact that � must be identified
with a particular individual name would be lost.
Cycles in query formulae
Until this point we have shown the rules which handle tree–like structures; in fact,
when the graph is a tree we can simply roll it up starting from the leaves.11 When the
query graph contains cycles we need a means for “breaking” these cycles and rolling
up the graph using the rules described above. This means is provided by the cycle
breaking rules together with the nominal introduction rules.
Let us first examine the three cycle breaking rules, starting from the cycle breaking
and inverse cycle breaking rules. They work using the very same principle which has
been exploited for the nominal rolling up rules. The fact that the variable � is identified
with an individual name allows us to introduce a new variable forced to have the very
same interpretation as � (because the new variable is identified with the very same
individual name). In addition, the path still connecting the two variables guarantees
the connectedness of the resulting graph (see Figure 7.6). The inverse cycle breaking
rule is similar, the only difference being the direction of the edges connecting the two
variables. The two rules also cover the case in which the two variables coincide; in
that case the graph contains a loop centered on the variable.
Figure 7.6: Cycle breaking rule (panels (a), (b) and (c)).
The simple cycle breaking rule is similar, in principle, to the cycle breaking rule; but
it does not assume that removing the edges will leave the graph connected; therefore,
11 This is not completely precise because, even in tree–like graphs, we may still need to use the simple nominal introduction rule.
one of the role names (�) is left unchanged in order to maintain the connectedness.
At first sight, simple cycle breaking looks redundant because of the nominal rolling
up rule. However, the rule is necessary when the rolling up is contrary to the direction
of the edges (for example when � " �, or the leaf node is ) because the nominal
inverse rolling up rule covers only cases in which the starting node is identified with
an individual name.
The rule is part of the ones handling cycles because, although the role term (i.e.
� � ������� � � � � ��) is not necessarily part of a cycle (according to Definition 7.6)
when role names appearing in a role term are unrelated (see the role labels in Defini-
tion 3.4), the term itself can be considered a sort of a cycle. In fact, we can traverse
one of the edges in one direction and a different edge in the other direction, coming
back to the very same variable. This is not true for the “false cycles” as explained in
the normalisation rules.
In the query there can be cycles (even the “cycles” composed by a single term)
composed by variables only; i.e. in which none of the variables are identified with an
individual name by an equality term. In these cases we cannot use the cycle breaking
rules, and one of the nominal introduction rules must be used.
These two rules exploit the completeness of the quasi transitive shrub interpreta-
tions (Section 3.1), since the properties of these interpretations guarantee that cycles
can only be enforced by assertions about the individual names. Consider the simple
nominal introduction rule applied to a term � � ���� � � � � � �. The idea consists
in identifying the second variable (�) with one of the individual names. The choice
cannot simply be nondeterministic, as shown in Example 7.3; therefore we must be
able to try all the possibilities at once. This is the reason why maintaining a connect-
ing variable (the � variable in Section 7.3.1) across all the disjuncts is essential for the
completeness of the algorithm.
The problem with the nondeterministic approach is that once the (nondeterminis-
tic) choice is made, it must be the correct choice for all the possible interpretations
satisfying the knowledge base. In fact, we are going to show an example of knowledge
base in which for some interpretations a variable must be identified with one individ-
ual, while for others a different individual must be chosen. Trying all the possibilities
at once allows all the different cases to be covered.
Example 7.3
Let us consider the knowledge base
� � ���� ���� ��� ���� ��� ����� ��� ����� ���� � ������ �
� forces two different cycles with the same role names:
���� ���� ��� �����
and
���� ���� ��� ����� �
in addition, the assertion ���� � ����� forces either � or � to be an instance of � in
every model. Therefore the query formula � �� �� - � � ��� - ��� ���� is a logical
consequence of �. On the other hand, we cannot nondeterministically choose one of
the two individual names to be identified with (or �). In fact, in this way all the
possible choices will lead to query formulae which are not logical consequences of �.
When a true cycle is present in the query, the cyclic nominal introduction rule is
used; its rationale relies on the very same arguments as for the simple nominal in-
troduction rule. However, in this case we must be more careful in order to avoid a
different kind of “false cycle”, as shown in the next example.
Example 7.4
Let us consider the formula * � ���� ���� ��� ���� � �� ����� � �� ����� having a
diamond shape:
�
�
�
�
�
�
and the knowledge base � � �����������������. By interpreting both variables �
and � as the very same element, it is easy to see that � �� �*�. In addition, the graph
is cyclic and there are no applicable rules other than the cyclic nominal introduction
one.
Suppose that we are going to apply the rule to the variable � (choosing a different
variable would not make any real difference), without adding all the possible combi-
nations of equality terms among the variables in the query. In this case we obtain the
new formula
*� � �� � �� ��� ���� ��� ���� � �� ����� � �� ����� �
because � is the only individual name in �. It is easy to see that �* �� is not a logical
consequence of � (i.e. � !�� *�).
Applying the rule as it is defined, we obtain several disjuncts and the formula
� � � �� ��� ���� ��� ���� � �� ����� � �� ����� among them. This disjunct is sim-
plified into the formula ���� ���� � �� ����� by the equality elimination rule, and
subsequently rolled up into the formula *�� � ��������������. It is easy to see that
the query �� � � � *��� � � �� is a logical consequence of �.
As shown by the example above, the cases of false cycles are covered by the intro-
duction of the equalities in the cyclic nominal introduction rule. If two (or more) of
the variable names in a cycle must be interpreted as the very same element, there is at
least one of the disjuncts which has the required equality terms explicitly stated.
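How the cyclic nominal introduction rule multiplies disjuncts can be sketched as follows. The encoding is hypothetical (a disjunct is a frozenset of tuples, `("eq", u, v)` stands for an equality term, and the function name is invented), and the sketch reads "all the possible combinations of equality terms among pairs of variable names" as one new disjunct per unordered pair of cycle variables; that reading is an assumption, not something the extracted text settles.

```python
from itertools import combinations

def cyclic_nominal_introduction(disjunct, var, cycle_vars, individuals):
    """Return the disjuncts that replace `disjunct` (a frozenset of terms).

    One disjunct identifies `var` with each individual name; one further
    disjunct is added for each pair of variables in the cycle (assumed reading).
    """
    by_individual = [disjunct | {("eq", var, a)} for a in individuals]
    by_pair = [disjunct | {("eq", u, v)}
               for u, v in combinations(sorted(cycle_vars), 2)]
    return by_individual + by_pair

# A diamond-shaped cycle on hypothetical variables, with one individual name "a".
diamond = frozenset({("role", "R", "x", "y"), ("role", "R", "x", "z"),
                     ("role", "S", "y", "w"), ("role", "S", "z", "w")})
new_disjuncts = cyclic_nominal_introduction(diamond, "x", {"x", "y", "z", "w"}, ["a"])
```

With four cycle variables and a single individual name the call produces one disjunct for the individual and six for the variable pairs; the disjunct equating `y` and `z` is the kind that covers a "false cycle" such as the diamond of Example 7.4.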
7.4.3 Limitation of the technique
Unfortunately the presented technique works only under some restrictions on the form
of the queries and/or terminologies. We have two main sources of problems, namely
the transitivity restrictions and the interaction between functionality and the role hierarchy.
This is symptomatic of the fact that the restriction on role interpretation complicates
the structure of the models, as shown in Chapter 3. In the following two sections
we are going to describe the limitations of the query answering algorithm.
Functionality and hierarchy
As highlighted by the Definition 3.6 of a canonical interpretation, we can assume that if
a pair is in the interpretation of two different roles these two roles must be in a common
label (let us ignore Abox assertions). However, the contrary is not necessarily true; i.e.
interpretations of roles do not have to coincide if two roles are in the very same label.
For the application of the simple nominal introduction rule it is essential to know
whether a variable must be identified with an individual in the presence of two variables
connected by more than one edge (i.e. the node � with a term like � � ���� � � � � � �).
Trivial cases are covered by the role elimination rule, since the inclusion relation be-
tween two roles guarantees that every pair in the included role must be in the including
role as well. However, the interaction between functional restrictions and role hierar-
chy may cause more subtle “terminological” coreferences.
For example, in the case in which two unrelated (via &) roles have a common func-
tional super–role. Let � and � be the two roles both included in a functional role
� ; in an arbitrary interpretation � satisfying the restrictions on the roles, whenever
there is a pair ��� �� in �� and a pair ��� �� in �
� then � � � because they must be
both in the interpretation of � . Note that this is different from saying that whenever a
pair is in one of the two roles it must be in the second as well (as in the case of inclusion).
This property allows us to “roll up” terms like � � ���� �� into something
like ������ � ����� without losing the information that the two successors must
be the same element.
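The argument can be spelled out in one line. The symbols are assumptions introduced here because the original names are not recoverable from the extracted text: $S$ and $T$ stand for the two unrelated roles, $F$ for their common functional super-role, and $\mathcal{I}$ for an arbitrary interpretation satisfying the role restrictions.

```latex
\[
(a,b) \in S^{\mathcal{I}},\ (a,c) \in T^{\mathcal{I}},\quad
S \sqsubseteq F,\ T \sqsubseteq F
\;\Longrightarrow\;
(a,b),\,(a,c) \in F^{\mathcal{I}}
\;\Longrightarrow\;
b = c \quad (F \text{ functional}).
\]
```

Note that nothing in this chain yields $(a,b) \in T^{\mathcal{I}}$ from $(a,b) \in S^{\mathcal{I}}$ alone, which is the difference from the inclusion case.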
The cases shown until now have in common the fact that the “terminological”
coreference can be discovered by looking at the relation among the role names. Un-
fortunately there are cases in which the role structure is not enough and the single
interpretation makes the difference.
Example 7.5
Let ��, �� be two unrelated functional roles and � �� % three different roles s.t. &
��, � & ��, � & ��, and % & �� (assuming that the converse relations do not
hold). Given an arbitrary interpretation �, and two pairs ��� �� in � and ��� �� in
% � , whether or not � and � must be the same actually depends on the existence of an
�–successor of �.
If there is an element �� s.t. ��� ��� � �� , then by using the same arguments as in
the previous case we can show that � � �� and �� � �; therefore � � �. Otherwise,
nothing is forcing � and� to be the same. Here we have a “terminological” coreference
that does not depend on the roles structure alone.
How this affects the presented algorithm can be shown by a simple example. Let
us consider the following three Aboxes sharing the same role terminology described
above
���� ���� ���%��� ��� ���%� (7.5a)
���� ���� ���%��� ������� (7.5b)
���� ���� ���%��� � (7.5c)
We want to decide whether the formula ��� �� � % is a logical consequence of the
assertions.12
It is easy to realise that the given formula is a logical consequence of both the first
two Aboxes (7.5a) and (7.5b), but not of the third one. In fact, an interpretation �
where ���� ��� � � � ��� , ���� ��� � % � � ��
� , �� � �, and �� � �), satisfies the
Abox assertions in (7.5c) but not the formula � ���� �� � % �.
The problem in devising an algorithm to discern the different cases comes down to
figuring out whether the variable should or should not be identified with one of the in-
dividuals of the Abox. For the Abox (7.5a) we can use a transformation rule like simple
nominal introduction which adds (amongst others) the disjunct ���� �� � % � � ��.
However, this is not sufficient to detect the case of the Abox (7.5b); this last case can
be covered by a rule like functional rolling up, which transforms the formula into
������ � �%���.
On the other hand, if the latter transformation is used indiscriminately, the Abox
(7.5c) would be wrongly recognised as satisfying ��� �� � % . In this particular case,
the right transformation should be something that takes into account the necessity of an
�–successor for �, like the term ������ � �%�� � �����.
The last example suggests that a combination of the two mentioned rules can be
used to detect these tricky cases of interaction between role hierarchy and functional
restrictions. However, we still have to generalise this intuition to the general case, and
for the time being we are forced to restrict the expressivity of the role terminology.
Roughly speaking, we forbid cases like the described one by ensuring that for any
pair of roles in a single label either they are included one in the other, or they share
a common functional super–role. As shown in the proof of Proposition 7.14, this
restriction is enough to prove correctness and completeness of the functional rolling
up rules.
Transitivity and cycles
The kind of queries we are able to answer using this technique must not contain any
cycle which includes a transitive role name.13 The problem with cycles including tran-
sitive roles is that there may be variables in the cycle that are not identified with one
of the individual names in the knowledge base. The reason is that transitivity pro-
vides a way of creating shortcuts for arbitrarily long directed paths. Let us consider for
12 With an abuse of notation we used the individual � instead of a variable for simplifying the formula.
13 In fact, the requirement can be less restrictive, as highlighted by the proof of Proposition 7.16. However, this formulation is much simpler and clearer.
example a knowledge base forcing interpretations having a structure like:
�
�
� �
��
��
��
��
��
where �, �, � represent elements of the domain corresponding to individual names
and the dot an “anonymous” element. When the role name is transitive the dotted
shortcuts are present whenever the non–dotted structure is present (for example by
means of an assertion like �����). Therefore there is always a cycle which includes
an element not corresponding to an individual name (the “anonymous” element). The
scenario can be complicated by the role hierarchy, since these transitive shortcuts can
be forced into non–transitive roles as well.14
7.4.4 Example of query answering
For a better understanding of the algorithm we show the rule application process on a
concrete example. Let � � � �� ��� !��� ��� ����� ���� ������� be a knowledge
base, and suppose we want to know whether the answer to the boolean query
�� ����� - ��� �� � � - � ! - ��� ���� - ��� �����
is positive or negative. We can guess that the answer must be negative (i.e. � does not
imply the given formula). The reason for this is that the constraint ������ does not
force a loop on � but only the existence of an element related to � by role � (it may
be � itself, �, !, or even a fourth one not specified).
Since the query formula is conjunctive and connected, it is already in normal form.
We choose � as the joining variable:
������ ��� �� � �� � !� ��� ����� ��� ������ ���
14 We conjecture that the problem can be solved by introducing more disjuncts with the cyclic nominal introduction rule. The idea is that, since the underlying DL does not provide the inverse role constructor, at least one of the variables in a cycle must be coreferenced with one of the individual names in the KB. However, this hypothesis is still being investigated.
To simplify the exposition we start from a knowledge base � � already containing the
assertions for the representative concepts (i.e. �� � �������� !���� �����). In this way
the knowledge base will be unmodified by the rules we are going to use. Initially the
query formula contains a single graph, to which are applied first the role elimination,
followed by the nominal rolling up rules:
� � !
��
�
�
�
� � !
��
�
�
�
�� � ����
�
�
At this point we can only apply the cyclic nominal introduction rule to the loop in-
volving the variable �. The rule generates three different isomorphic graphs where
the variable � is identified with each one of the three different individual names in the
knowledge base. The query now is composed of the three graphs:
� ��
�� � ����
�
�
� ��
�� � ����
�
�
� �!
�� � ����
�
�
Since the graphs have the same shape, they are transformed in the very same way, the
only difference being the representative concept used for the variable �. First, the cycle
breaking rule is applied to the loop on �, then the nominal rolling up rule concludes
the collapsing:
� ��
�� � ����
�
�
� ��
�� � ����
�
�����
�
� � ����
� ������ � ������
The query formula we are now going to verify is composed of three disjuncts in the
form of ����� � ���� � ����� � � ���� ����; where � � is substituted by ��, ��, or ��.
Since the variable � is the same we can collapse the disjunction into a single concept
expression by using the � constructor.
����� � ���� � ������ � �������
� �� � ���� � ������ � �������
� �� � ���� � ������ � ��������
By using simple transformations on the concept expressions we obtain the query an-
swering problem
�� �� ������� � ���� � ������� � ������ �
������ � ������ � ������ � ����������
which can be verified by checking whether the knowledge base
�� � �� ��� � ���� � ������� � ������ �
������ � ������ � ������ � ���������
is unsatisfiable. We can push the negation down into the concept expression to obtain
the kb
�� � �� ��� � ����� �
�������� � ������� � ���� � ������� � ���� � ����������
This kb is indeed satisfiable, and this can be verified by using a KB satisfiability al-
gorithm. However, here we demonstrate its satisfiability by exhibiting an example of
interpretation satisfying the constraints. Let us consider the following interpretation �:
���� ���
�����
��
!����
����
������
Interpretation � has been constructed in such a way that it satisfies all the constraints in
�, and all the assertions for the representative concepts. To satisfy the added inclusion
assertion derived from the query formula, every element of the domain must be in the
interpretation of the concept. The elements !, �, and �� satisfy the constraint because
they are not in �� (therefore they are in �����). The element � obviously is not in
����� , nor in ��������; therefore we should verify that � is in ����� � ������� �
���� � �������� ���� � ��������� . Clearly, this is the case because it is in �����
� ,
in ������ , and in ��������
� .15
Since the transformed KB is satisfiable, the conclusion is that the query formula is
not a logical consequence of the KB �.
7.4.5 Termination
Transformation rules are applied by following a simple “ordering” strategy: if more
than one rule can be applied to the same disjunct, then the first one (in the order they have
been presented) is applied.
Given this strategy the rule application process terminates; i.e. the transformation
always leads to a formula where no rules are applicable, and the resulting formula has
the form � �� �������� � � � � ������� ��.
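The ordering strategy amounts to a simple rewriting loop, sketched below. The representation is an assumption made for illustration: each rule is a hypothetical Python function that takes one disjunct and returns a list of replacement disjuncts, or `None` when it is not applicable; `drop_first_role` is a toy stand-in and not one of the eighteen transformation rules. The step budget is only a safety net for experimenting with rule sets that lack a termination argument.

```python
def apply_rules(disjuncts, rules, max_steps=10_000):
    """Rewrite until no rule applies; `rules` is an ordered list of functions."""
    pending, done = list(disjuncts), []
    steps = 0
    while pending:
        current = pending.pop()
        for rule in rules:            # the first applicable rule, in order, wins
            replacement = rule(current)
            if replacement is not None:
                pending.extend(replacement)
                steps += 1
                if steps > max_steps:
                    raise RuntimeError("step budget exhausted")
                break
        else:
            done.append(current)      # no rule applies: the disjunct is final
    return done

def drop_first_role(disjunct):
    """Toy rule: remove the first term tagged as a role term, if any."""
    for i, term in enumerate(disjunct):
        if term.startswith("role:"):
            return [disjunct[:i] + disjunct[i + 1:]]
    return None

final = apply_rules([("role:R(x,y)", "C(x)", "role:S(x,z)")], [drop_first_role])
```

The loop stops exactly when every disjunct is free of applicable rules, which for the toy rule means free of role terms.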
The proof of termination is divided into three parts. First we show that the application
of any rule maintains the connectivity of the query formula. Secondly, we show that
the variable � is never eliminated; finally, we prove that in a finite number of steps
we are going to obtain a query formula without role terms. If an element (disjunct) of
the query is connected, contains the variable �, and does not contain any role term it
must be in the form ������ � � � � ����� (possibly containing a term in the form � � !
as well). From that form it is collapsed into ����� by the conjunction elimination rule (and
the nominal elimination rule if � � ! is present).
Proposition 7.8. Let � �� �*�� � � � � *�� �� be a query where all the elements *�� � � � � *�are connected, and �� �� �*��� � � � � *
���� �� be the resulting query after the application
of one of the rules. Then all the disjuncts *��� � � � � *��� are connected.
Proof. Given the assumption that each one of the starting disjuncts is connected, we
have to show that the connectivity is maintained by the rules which remove role terms.
15 Note that the interpretation also provides an example of an interpretation satisfying the KB but not modelling the query formula. It is therefore an alternative way of showing that the formula is not a logical consequence of the KB.
Equality elimination: even if all the occurrences of variable are renamed to
�, the disjunct is still connected because no role terms are removed.
Role collapsing: two role terms are eliminated between � and �, but a new role
term is inserted connecting the very same variables. Therefore if the disjunct
was connected before the rule application, then it is still connected after the
application.
Contradiction elimination: the new disjunct *� � ����� is trivially connected.
Role elimination: since � � �, then the role term is substituted with a different
role term between the very same variables.
Shortcut elimination: even if a role term between variables � and � is re-
moved, there still is a path connecting the two of them by hypothesis.
Simple rolling up: the term � � ��� is removed from *; but since the variable
� no longer appears in *�, then if the disjunct * is connected, so must be the new
*�.
Simple inverse rolling up: this case is analogous to the previous one.
Functional rolling up: again, this case is analogous to the simple rolling up.
Functional inverse rolling up: a role term connecting variables � and is re-
moved from *; but a new role term connecting the very same variables is added.
Nominal rolling up: this case is analogous to simple rolling up.
Nominal inverse rolling up: this case is analogous to simple rolling up.
Cycle breaking, inverse cycle breaking: a role term connecting and � is
removed from *; however, in the remaining formula there is a path connecting
and � by hypothesis (or � " ). Therefore *� is still connected.
Simple cycle breaking: a role term connecting and � still remains in *�, so the
connectedness is maintained.
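The connectivity invariant used in this proof can be checked mechanically on small examples. The following sketch assumes that a disjunct is connected when the undirected graph whose nodes are its variables and whose edges are its role terms is connected; the tuple encoding `(x, roles, y)` and the function name `is_connected` are illustrative choices, not the notation of this chapter.

```python
from collections import defaultdict


def is_connected(variables, role_terms):
    """Check connectivity of the graph induced by the role terms.

    variables:  iterable of variable names occurring in the disjunct
    role_terms: iterable of (x, roles, y) triples (assumed encoding)
    """
    variables = set(variables)
    if not variables:
        return True
    adjacency = defaultdict(set)
    for x, _roles, y in role_terms:
        adjacency[x].add(y)  # direction is ignored for connectivity
        adjacency[y].add(x)
    start = next(iter(variables))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return variables <= seen


# A shortcut x -> z next to the path x -> y -> z: dropping the shortcut
# leaves the path, so the disjunct stays connected.
path = [("x", {"R"}, "y"), ("y", {"R"}, "z")]
with_shortcut = path + [("x", {"R"}, "z")]
```

Removing the shortcut mirrors the shortcut elimination case above, while removing a term that lies on the only path between two variables would break the invariant.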
Proposition 7.9. Let � �� �*�� � � � � *�� �� be a query answering problem where the
variable � appears in all the disjuncts *�, and �� �� �*��� � � � � *���� �� be the resulting
query after the application of one of the rules. Then the variable � appears in all the
disjuncts *��.
Proof. Analogously to the previous proposition we consider the rules that might delete
terms.
Equality elimination: � is never renamed by definition.
Conjunction elimination: the variable is still in *� because of the fact that a
new concept term is added.
Role collapsing: both � and � are still kept in *�.
Nominal elimination: this rule does not affect � by definition.
Contradiction elimination: variable � is still in *� by definition.
Nominal elimination: a concept term containing � is still in the disjunct * �.
Role elimination, shortcut elimination: both � ( �) and � ( �) are still in *�.
Simple rolling up, simple inverse rolling up, functional rolling up, functional
inverse rolling up, nominal inverse rolling up, and nominal rolling up: these
rules do not remove � by definition.
Cycle breaking, simple cycle breaking, and inverse cycle breaking: even
though a role term containing � is removed, in *� there is at least one equality
term containing �.
Proposition 7.10. Let � �� �*�� � � � � *�� �� be a query where all the disjuncts *� are
connected, and � is appearing in all of them. Then, there is a strategy for applying the
rules such that a new query �� �� ������� � � � � � ������� �� containing only concept
terms can be obtained in a finite number of applications of the rules.
Proof. The idea behind the proof is showing that we only eliminate role terms by
using the rules, but never introduce new ones; therefore we obtain a query without role
terms. Moreover each disjunct is connected and contains the variable � as shown in the
previous propositions; therefore, if the query does not contain any role term, then each
disjunct must contain only terms like � � ! or ���. Elements containing only concept
terms can be normalised into the form ���� �� by using the nominal elimination and
conjunction elimination rules.
Given a set of terms *, we define the number of edges of * as the total number of
different role names connecting each pair of variable names in role terms. Formally,
this can be defined as the cardinality of the set
�� � ��� � � � ���� � � � � � � � *� � �� �
Note that the rules never increase this number per disjunct; at most they multiply the
number of disjuncts in the query (the nominal introduction rules). Given a query for-
mula �*�� � � � � *�� ��, we consider the number of edges of the formula as the maximum
among the number of edges of each disjunct *�. We are going to show that this number
decreases in a finite number of applications of the rules.
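The measure can be made concrete with a short sketch. It assumes that a role term is encoded as a triple `(x, roles, y)`, where `roles` is the set of role names in the term, and that pairs of variables are ordered; both the encoding and the function names are illustrative.

```python
def edge_count(role_terms):
    """Number of distinct (variable, role name, variable) triples in a disjunct."""
    return len({(x, r, y) for x, roles, y in role_terms for r in roles})


def query_edge_count(disjuncts):
    """Edge count of a query: the maximum over its disjuncts (0 if there are none)."""
    return max((edge_count(d) for d in disjuncts), default=0)


# Two role names between x and y plus one between y and z give three edges;
# a repeated role term is counted once.
d1 = [("x", {"R", "S"}, "y"), ("y", {"R"}, "z")]
d2 = [("x", {"R"}, "y"), ("x", {"R"}, "y")]
```

Taking the maximum rather than the sum matches the definition in the text: rules that multiply disjuncts do not raise the measure as long as no single disjunct gains an edge.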
A given disjunct in the query can be transformed (by the substitutions performed by
the rules) into a form in which no rules are applicable or only the nominal introduction
rules are applicable, in a number of steps which is linear w.r.t. the size of the disjunct
itself.
Let us consider the query answering problem � �� �*�� � � � � *�� ��; we can assume
that none of the normalisation rules can be applied to any of the *� (if it is not the
case this can be achieved in a number of steps linear w.r.t. the sum�
��������� "*�, where
"*� denotes the cardinality of the set *�). If the number of edges of �*�� � � � � *�� �� is
greater than 0, there should be a subset of the disjuncts having that number of edges.
We are going to take one of those disjuncts and show that the number of edges of
disjuncts “derived” from the chosen one can be decreased in a finite number of steps.
In addition we are going to show that we can limit the number of “derivable” disjuncts.
We can apply these arguments to all the “maximal” disjuncts in the original
query; therefore in a finite number of steps we obtain a new query answering problem
with a smaller number of edges.
First we must specify precisely what we mean by saying that a disjunct is “derived”
from another one in the query. As we described in Section 7.4.1, most of the
rules take one of the disjuncts * from the query and replace it with a modified version
*� of it; we say that *� is “derived” from *. The case of nominal introduction rules is
different; the disjunct * is substituted by a finite number of new disjuncts, and all these
are said to be derived from *.
Let * be one of the disjuncts in the query �*�� � � � � *�� ��. Clearly all the derived
disjuncts have a number of edges which is less than or equal to that of *. We assumed
that none of the normalisation rules is applicable; in addition none of the role elimination
rules is applicable either, otherwise we would already have obtained a derived disjunct with
a smaller number of edges.
First we must show that, under these assumptions, when there is at least one role
term in *, one of the nominal introduction rules is applicable. We distinguish two cases
according to whether or not there is a cycle in *.
If there is no cycle in *, then there is a role term � � ���� � � � � � � being the
last term of a path not terminating with �; i.e. either there is not any different role
term containing � and � !" �, or there is not any different role term containing
and !" �.
Let us assume that in * there is not any different role term containing � !" �.
Since none of the normalisation rules is applicable, there can be only two other
terms containing �: � � ! for some individual name ! and ���. If � � ! is
in *, then the nominal rolling up would be applicable; which is not the case
by assumption, therefore � � ! is not in *. If , � � (i.e. � � ����), then the
simple rolling up would be applicable, so , � �. Finally, there is not any label
� s.t. ��� � � � � �� � �, otherwise the functional rolling up rule would be
applicable. Therefore the simple nominal introduction rule is applicable because
all the preconditions are met.
Let us assume that in * there is not any different role term containing !" �. As
before we can exclude the case in which , � � and all the roles are included in
a single label (otherwise the simple inverse rolling up or the functional inverse
rolling up rule would be applicable). In the same way we should exclude the
case in which either a term like � ! or � � !� are in *, otherwise the inverse
cycle breaking or simple cycle breaking rule would be applicable. Therefore the
simple nominal introduction rule is again applicable.
If there are cycles in *, let us consider each role term � � ���� � � � � � � in-
volved in at least a cycle. Since neither the cycle breaking nor the inverse cycle
breaking rules are applicable, then there are no terms like � ! or � � ! in *;
therefore the cyclic nominal introduction rule is applicable.
Now we consider the cases in which the nominal introduction rules must be ap-
plied. We show that in a finite number of steps a new query is obtained where all the
disjuncts derived from * have a number of edges less than that of * (i.e. a rule which
eliminates edges is applied). We proceed by induction on the number of variables, not
identified with any individual name (by a term like � !), which are involved in at
least a cycle or in a term like � � ���� � � � � � � where ��� � � � � �� are not included
in a single label.
First let us consider the case in which there is only one variable � satisfying the
above conditions. Then, either the simple nominal introduction rule is applicable
or there is a loop on � and the cyclic nominal introduction is applied. In both
cases the disjuncts introduced in place of * are of the form *��� � !�� (because
adding an equality term between the same variable does not make any sense). In
the case of the simple nominal introduction rule either the nominal rolling up
or simple cycle breaking rules are then applicable. If the cyclic nominal intro-
duction rule has been applied, then there is the possibility of applying the cycle
breaking rule as well. In all three cases the rules eliminate at least one
edge from the derived disjunct * � �� � !��.
If there is no cycle, then there must be a term like � � ���� � � � � ��, and we
can apply the very same arguments we used in the basic case. So let us assume
that there is a cycle involving more than one variable (not identified with any
individual name) and one of the nominal introduction rules is applied to one of
these variables �. We consider the two cases separately.
Simple nominal introduction Let us consider one of the disjuncts *� � * �
�� � !�� inserted in place of *. Let � � ���� � � � � �� be the role term
matching the rule application. In this case the nominal rolling up rule is
not applicable because � is part of a cycle, but one of the cycle breaking or
simple cycle breaking rules can be applied to any of the resulting disjuncts,
diminishing their number of edges.
Cyclic nominal introduction Since the cycle is longer than �, either there is
a role term � � ���� � � � � �� or ��� ��� � � � � �� in the path, with
!" �. In addition, the derived disjuncts are either in the form *��� � !��
or )'��� � *, where )'��� is a set of equalities between variables.
Let us consider the disjunct *� � * � �� � !��. As we mentioned, either
� � ���� � � � � � � or ��� ��� � � � � �� are in the path.
If � � ���� � � � � � � is in the path then the preconditions of the cycle
breaking rule are satisfied (and none of the preceding ones); therefore in the
next step the (unique) disjunct derived from *� will have a smaller number
of edges. Let us now consider the case in which ��� ��� � � � � � � is in
the path. The cycle breaking rule is not applicable because by assumption
*� does not contain any term like � !, and the same is true for the simple
cycle breaking rule. However, the inverse cycle breaking rule is applicable,
which is a role elimination rule.
Let us consider a disjunct in the form *� � )'��� � *, where )'��� is a
set of equalities between variables. The query answering problem can be
transformed into a new problem in which *� is substituted by a single new
disjunct at each step, by applying the normalisation or role elimination
rules. In a number of steps which is linear w.r.t. the size of *� we obtain a
single disjunct *�� derived from *� to which only the nominal introduction
rules are applicable (or none of the rules).
Note that the number of variables involved in a cycle in *�� is decreased be-
cause of the fact that two of them have been explicitly made equal by a term
� � � (so one of them has been eliminated by the nominal elimination
rule). Therefore, we can use the inductive hypothesis and conclude that in
a finite number of steps we obtain a formula in which all the disjuncts that
are derived from *�� have a number of edges smaller than *� (and conse-
quently than *). The arguments are valid for all the disjuncts *� � )'��� �*
derived from *.
Worst case complexity analysis
The complexity16 of the actual rolling up procedure is polynomial in the size of the
query formula. This can be seen by considering the graph corresponding to each dis-
junct; the rolling up eliminates the leaves of the graph (after a linear normalisation
by the first five rules). Even if new assertions are added to the knowledge base, these
are linearly bounded by the sum of the number of edges of each graph representing a
disjunct in the query formula.
Unfortunately, the nominal introduction rules multiply the number of disjuncts
(graphs) in the query. As the reader can guess from the last section, the number of
16 In this section, we use the word complexity to mean the worst case complexity.
newly introduced disjuncts can be exponential in the size of the initial problem.17 The
worst case is related to the cyclical nominal introduction rule, because the other rule
introduces fewer elements (the equalities are not introduced). Therefore,
the worst case complexity can be calculated by analysing the cyclical nominal intro-
duction rule.
The proof for termination (see Proposition 7.10) provides the hints for the analysis
of the complexity. The point of the proof to look at is where the inductive arguments
are used for the cases in which the cyclical nominal introduction rule can possibly be
applied recursively. At each application of this rule one of the variables involved in a
cycle is “eliminated” from the derived new disjuncts.18 Therefore, if at the next step
the cyclical nominal introduction rule is applied again, the number of equalities to be
introduced will be smaller than the one at the previous step.
The complexity is of the order of the number of graphs that can be generated by
recursive application of the nominal introduction rules. This number can be calculated
by considering the tree whose root is a disjunct and the nodes the new elements gener-
ated by successive applications of the rule from the root element. The number of all the
generated graphs is equal to the number of leaves in this tree; so we need to estimate
the maximum number of successors for each node.
By looking at the rule we can estimate the number of successors of each node by
considering the sum of the number of individual names in the knowledge base (we
denote this number by " ), with the number of equalities that can be generated by
the number of variables in the cycle (we can consider this number as the total number
of variables). Although the number of variables can be tightened to a more precise
lower value, we assume that it is equal to the size of the single disjunct (we denote this
number as �). The number of equalities generated from � variables can be calculated
by considering a square matrix of size � where each row (column) is marked by a
variable. Each element of this matrix denotes an equality between the two corresponding
variables. We do not need the whole matrix because we are not interested in the
principal diagonal (terms like � ), and we need only half of the matrix (there is no
point in having both � � and � � ); therefore the number of equalities is �������
.
The maximum number of successor of the root are " � �������
, at the next level
we have �" � �������
��" � �����������
�, and so on until the number of variables is
17 Unfortunately it depends not only on the size of the formula, because of the identification of the variables with the individual names in the knowledge base.
18 Or it is identified with an individual, which is the same from the perspective of the cyclical nominal introduction rule.
equal to one:
�" ����' ��
���" �
��' ����' ��
�� � � � �" �
���' ��
��" � �� �
� times.
Therefore, the number of disjuncts is bounded by the formula �" � ����. The size of
each disjunct is bounded by the size of the original formula, and in addition the rolling
up procedure can add to the knowledge base as many new assertions as the number of
inverse rolling up rule applications. This does not alter the order, which is still-�. ��
w.r.t. the size of the query problem.
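The counting argument can be reproduced numerically. The sketch below assumes that `individuals` stands for the number of individual names in the knowledge base and `n` for the size of a single disjunct; the names, the exact shape of the last factor, and the coarse bound `(individuals + n**2) ** n` are assumptions made for illustration. Only the half-matrix count `n * (n - 1) // 2` follows directly from the description above.

```python
from math import prod


def equalities(n: int) -> int:
    """Pairs of distinct variables: half of an n x n matrix without its diagonal."""
    return n * (n - 1) // 2


def max_leaves(individuals: int, n: int) -> int:
    """Product of the per-level successor counts while variables drop from n to 1."""
    return prod(individuals + equalities(k) for k in range(n, 0, -1))


def coarse_bound(individuals: int, n: int) -> int:
    """A simple upper bound: each of the n factors is at most individuals + n**2."""
    return (individuals + n * n) ** n


leaves = max_leaves(2, 3)    # (2 + 3) * (2 + 1) * (2 + 0)
bound = coarse_bound(2, 3)   # (2 + 9) ** 3
```

Each factor `individuals + k * (k - 1) // 2` with `k <= n` is at most `individuals + n * n`, so the product of the `n` factors never exceeds the coarse bound, whatever the exact form of the final factor.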
At the end of the collapsing procedure the algorithm uses the knowledge base sa-
tisfiability procedure whose complexity is EXPTIME (see Chapter 4). Note that we
already estimated that the size of the input formula is of the order of -�.��, therefore
the worst case complexity for the whole problem is double exponential time (2EXP-
TIME).
This is more than intractable; the question at this point is whether we can do better than
that. The answer can only be found by the reduction of a different problem known to be
2EXPTIME–hard to our query answering problem. However, there is good evidence
that this is the intrinsic complexity of the problem, since it is the same order as the
results presented in Calvanese et al. [1998a] for a similar problem. Although the DL
presented in that paper is more expressive than the one we are presenting here, we
already know that the complexity of the satisfiability problem is the same for both the
logics. The reader may object that in the same work they do not show hardness results;
however, the nondeterministic nature of the problem suggests that the addition of an
exponential step to the basic satisfiability problem cannot be avoided.
7.4.6 Correctness and completeness
The correctness and completeness of a rule are defined in the usual way. A rule is cor-
rect if a positive answer to the query problem is still positive for the new query problem
generated by the rule itself; i.e. if� �� �*� *�� � � � � *�� �� then�� �� �*�� *�� � � � � *�� ��.
A rule is complete if negative answers are preserved; i.e. if �� �� �*�� *�� � � � � *�� ��
then � �� �*� *�� � � � � *�� ��).
For rules that do not modify the KB on the left hand side (� � ��) the pattern of
the proof is “simple” because the set of interpretations satisfying the KB before and
after the rule application does not change. On the other hand, when a rule modifies the
KB the set of satisfying interpretations is reduced because new constraints are added to
the KB. In this case we must pay attention to the introduction of new concept names;
in particular regarding the representative concept. In fact, a representative concept
is an effective way of representing a single element only if the reasoning mechanism
is unable to distinguish the case where the interpretation of a representative concept
contains more than one element from the case in which it contains a single element.
Therefore we want to show a completeness result for the class of interpretations in
which the representative concepts are restricted to contain a single element (this is
analogous to the completeness of a class of models as seen in Section 7.2).
We call an interpretation nominal representative (n.r.) if the interpretation func-
tion maps each representative concept into the set containing only the interpretation
of the corresponding individual name (i.e. ��� �
�!��
). Using the definition of n.r.
interpretations we introduce the notion of nominal safe queries, as those completely
characterised by n.r. interpretations. A query answering problem � �� $ is nominal
safe when � �� $ w.r.t. n.r. interpretations implies that � �� $ (note that the converse
is trivially true).
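The definition of n.r. interpretations can be illustrated on finite interpretations. The sketch below assumes a dictionary encoding in which `individuals` maps individual names to domain elements, `concepts` maps concept names to sets of domain elements, and `representative_of` maps each representative concept name to its individual name. The encoding, the function names and the concept names `P_a`, `P_b` are hypothetical.

```python
def is_nominal_representative(individuals, concepts, representative_of):
    """True when every representative concept denotes exactly the singleton
    containing the interpretation of its individual name."""
    return all(
        concepts.get(concept, set()) == {individuals[name]}
        for concept, name in representative_of.items()
    )


def extend_with_representatives(individuals, concepts, representative_of):
    """Return a copy of the concept mapping in which every representative
    concept is mapped to the singleton of its individual."""
    extended = dict(concepts)
    for concept, name in representative_of.items():
        extended[concept] = {individuals[name]}
    return extended


individuals = {"a": 1, "b": 2}
concepts = {"C": {1, 2}}
representative_of = {"P_a": "a", "P_b": "b"}
extended = extend_with_representatives(individuals, concepts, representative_of)
```

The extension leaves the mappings of all other concept names untouched, which is why an interpretation of a KB without representative concepts can always be turned into an n.r. one.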
We have to show that the manipulations of the query answering problem done by
the rules keep the result nominal safe. Otherwise, the use of representative concepts
can produce wrong answers. The initial query answering problem is trivially nominal
safe because no representative concepts appear either in the knowledge base or
in the query itself.19
Some of the proofs we provide rely on the fact that the query problem remains
nominal safe after the application of any rule; this is an additional requirement to the
completeness and correctness. Let � �� $ be the query answering problem matching
a rule, and �� �� $� the problem resulting from the application of the rule itself. We
want to show that:
1. the rule is complete: �� �� $� implies � �� $;
2. the rule is correct: � �� $ implies �� �� $�;
3. the resulting problem �� �� $� is nominal safe: �� �� $� w.r.t. n.r. interpretations
implies �� �� $�.
19 An interpretation satisfying the KB can simply be extended by adding the representative concepts and mapping each of them to the set containing the element of the domain corresponding to the appropriate individual (i.e. the new interpretation agrees with the original one and additionally maps the representative concept of every individual name to the singleton containing the interpretation of that name).
If we start from the assumption that � �� $ is nominal safe, we can simplify the proofs
by showing that the two conditions
�� �� $� w.r.t. n.r. interpretations implies
� �� $ w.r.t. n.r. interpretations, and(7.3a)
� �� $ implies �� �� $� (7.3b)
hold for every rule. Note that the second condition is just the correctness as defined
above; the real difference is the first restricted version of completeness. Firstly, we
show that the two conditions are sufficient when the problem � �� $ is nominal safe.
1. Completeness: if �� �� $� then �� �� $� w.r.t. n.r. interpretations. We can
then use the Condition (7.3a) to conclude that � �� $ w.r.t. n.r. interpretations.
In addition � �� $ is nominal safe by hypothesis, therefore � �� $ w.r.t. n.r.
interpretations implies that � �� $.
2. Correctness: it is exactly the Condition (7.3b).
3. Safeness: if �� �� $� w.r.t. n.r. interpretations, then � �� $ w.r.t. n.r. interpreta-
tions by Condition (7.3a). In addition, � �� $ w.r.t. n.r. interpretations implies
that � �� $ because the problem is nominal safe by assumption. Therefore, we
can use Condition (7.3b) to conclude that �� �� $�.
In the following propositions we assume that the query problem before the application
of the rules is nominal safe, and by completeness we mean Condition (7.3a)
instead of the classical definition. The proofs are organised by rules: first the rules
which do not modify the knowledge base, and then the ones which do modify it.
The pattern of all the proofs is similar, so we are going to introduce it to help the
reader in following the proofs. The rules transform the problem � �� �*� *�� � � � � *����
into the new problem �� �� �*��� � � � � *��� *�� � � � � *����, where the element * is substi-
tuted by one or more elements *��� � � � � *��. Let us consider completeness (correctness
is similar), and a rule that substitutes the element with a single new element.
We have to show that assuming �� �� �*�� *�� � � � � *����, it is the case that � ��
�*� *�� � � � � *����. We choose an arbitrary n.r. interpretation � satisfying � and we
show that it models the formula �*� *�� � � � � *���� (see Definition 7.3). If necessary
for a rule modifying the kb, we first extend � to a new interpretation � � satisfying ��.
In general, this new interpretation is equal to the original one with mappings for the
newly introduced concept names; we can then easily go back to � by discarding the
appropriate mapping. Since � � satisfies ��, then � � �� �*�� *�� � � � � *���� by hypothesis;
so there is a mapping ( such that � � ��� *� or � � ��� �*�� � � � � *����. In the latter case,
we have that � � ��� �*� *�� � � � � *���� as well, and in general it is easy to show that
if � � ��� �*�� � � � � *����, then � ��� �*�� � � � � *����. This is because either the new
names in � � do not appear in any of the formulae *�� � � � � *�, or they are already in
� (representative concept) and � � � �. Therefore in the actual proof we show that if
� � ��� *� then there is a mapping ( � such that � ���� *.
Rules that do not modify the KB
Proposition 7.11. The equality elimination, conjunction elimination, role collapsing,
role elimination, and shortcut elimination rules are complete and correct.
Proof. Correctness and completeness of these rules are trivially verified by the fact
that for every interpretation � and mapping (, � ��� * iff � ��� *�; therefore � ��
�*� *�� � � � � *���� iff � �� �*�� *�� � � � � *����.
Let us consider the shortcut elimination rule as an example. Since the two query
formulae differ only for the disjuncts * and *�, we just need to show that � ��� * iff
� ��� *� for an arbitrary mapping (.
If � ��� * then � ��� � �� ������� � � � � ��, therefore
� ��� � �� �������� � � � � �� ( ����
as well. The rest of the terms in *� are the same as those in *, therefore � ��� *� as
well.
Let us assume that � ��� *�, we need to show that � ��� � �� ������� � � � � ��.
We consider the case in which � � �, the other is not really different. Since � ��� *�,
then � ��� � �� �������� � � � � �� ( ����; therefore we must verify that � ���
� �� ����. But this is true because
�� �� ������� � � � � � ���� ������� � *�
therefore in every set of roles ��, there is at least a role �� such that �� & �, which
means that � ��� � �� ������ for � � �� � � � � �' �. For the transitivity restriction on �
� ��� � �� ����, therefore � ��� � �� ���� because � & �.
Proposition 7.12. The contradiction elimination rule is correct and complete.
Proof. If � � !�� � !�� � *, and !� is different from !�, then there is no interpreta-
tion satisfying *. The very same applies to *� " �����. If � �� �*� *�� � � � � *�� ��, then
an arbitrary interpretation satisfying � must model �*�� � � � � *�� �� (without the dis-
junct *); therefore it models �*�� *�� � � � � *�� ��. The other direction is analogous.
Proposition 7.13. The simple rolling up rule is correct and complete.
Proof. Let � �� �*� *�� � � � � *���� be the query before the application of the simple
rolling up rule, such that * satisfies the rule conditions, and � �� �* �� *�� � � � � *���� is
the resulting query.
Completeness Let � be an n.r. interpretation satisfying �. We assume that � ��
�*�� *�� � � � � *����, so there is a mapping ( such that � ��� �*�� *�� � � � � *����. If
� ��� *� for � � �� � � � � � then � ��� �*� *�� � � � � *���� as well; so, we are going to
show that if � ��� *�, then there is a mapping ( � s.t. � ���� *.
If � ��� *� then � ��� ����, therefore there is an element � � �� s.t.
�(� �� �� � �� and � � �� . The mapping (� � (���� in which � is mapped to
� satisfies the required property that � ���� � � ��� and � ���� ���. The rest of terms
in * are satisfied because � does not appear in any other term, therefore � ���� *.
Correctness Let � be an interpretation satisfying �; for reasons analogous to those
in the completeness proof, we are going to show that if � ��� * then � ��� *�.
If � ��� * then � ��� � � ��� and � ��� ���, it is easy to see that this implies that
� ��� ���� (the element (��� is the “witness” for the property). For the remaining
terms in *�, since they are in * as well they are modeled by � w.r.t. (; therefore
� ��� *�.
Proposition 7.14. If the role hierarchy is such that for every unrelated pair of roles20 in
the same label there is a functional role that includes both of them, then the functional
rolling up and functional inverse rolling up rules are correct and complete.
Proof. The proofs for these two rules are very similar to the previous one for the simple
rolling up rule. In particular, the correctness part is almost identical and is left to the
intuition of the reader. So we are going to concentrate on the completeness.
Let � �� �*� *�� � � � � *���� be the query before the application of the functional
rolling up rule, such that * satisfies the rule conditions, and � �� �* �� *�� � � � � *���� is
the resulting query.
20 That is, neither is included in the other.
Completeness Again, the structure is the same as in the previous completeness proof for the
simple rolling up rule. The crucial point here is that although we “split” the role con-
junction into several existential assertions, the interpretations are such that all the suc-
cessors must be the very same element.
The point we need to show is that, in any interpretation satisfying �, if an element
is related to two elements via two different role names being in the very same label
and not being included one in the other, then the two related elements must coincide.
Formally, let � be an interpretation satisfying �. Let � be an element of �� , and �,
� two role names in the same label s.t. neither � & � nor � & �. Then, if
��� ��� � �� and ��� ��� � �
� , �� � ��.
Note that this is not true if � & �, because this implies that �� � �
� but
not the converse. The intuition behind the proof is that if the two roles are not related,
then there must be an interaction with a functional role that forces the collapsing of
successors of � (i.e. there is a functional role � s.t. � & � and � & � ).
If � and � belong to the very same label, then either � � � or � � � (by
Definition 3.4b); let us assume that � � �. We prove the claim by induction on the
definition of � (Definition 3.4a).
The two basic cases � & � and � & � are ruled out by assumption, so there
must be a functional role � such that � & � and � & � . This means that both
��� ��� and ��� ��� are in � � , and �� � �� because � satisfies the functional restriction
on � .
The property we have just shown applies to the functional (inverse) rolling up rule
because we assumed that the role elimination rule is not applicable. Therefore, if the
set of role names ��� � � � � �� is included in a label and there is more than one role
in it, they must be unrelated.
Proposition 7.15. The simple nominal introduction rule is correct and complete.
Proof. We assume that the query before the rule application is � �� �*� *�� � � � � *����,
* satisfies the conditions of simple nominal introduction rule, and none of the rules
with a higher priority can be applied to *.
Completeness Let � be a canonical (w.r.t. �) quasi transitive shrub interpretation
satisfying �, which is n.r. as well. We assume that � �� �* � �� � !�� � � � � � * �
�� � !�� � *�� � � � � *����, so there is a mapping( such that � ��� �*��� � !�� � � � � � *�
�� � !�� � *�� � � � � *����. By definition there is one of the disjuncts such that � ��� +
for each term + contained in it. Moreover, each of the disjuncts in �*��� � !�� � � � � � *�
�� � !�� � *�� � � � � *���� includes at least one of the disjuncts in �*� *�� � � � � *�� ��;
therefore � ��� �*� *�� � � � � *�� �� as well.
Correctness Let � be a canonical (w.r.t. �) quasi transitive shrub interpretation sat-
isfying �. We assume that � �� �*� *�� � � � � *�� ��, so there is a mapping ( such that
� ��� �*� *�� � � � � *�� ��. We proceed by contradiction by assuming that �!����* �
�� � !�� � � � � � * � �� � !�� � *�� � � � � *����.
If � ��� *� for some � � �� � � � � �, then
�����* � �� � !�� � � � � � * � �� � !�� � *�� � � � � *����
as well; so we must assume only that � ��� *. If (��� � � , there is a contradiction
with the assumption that none of the disjuncts *� �� � !�� is satisfied. If (��� !� � ,
and since we assumed that � � �, then �(� �� (���� � �� � �
� � � � � � �� .
Therefore, by (3.6b) there is a label � s.t. ��� � � �� �� � �, which is in contradiction
with the conditions for the applicability of the rule itself.
Given the fact that we exhausted all the possibilities, finding only contradictions,
we must reject the given assumption that
�!����* � �� � !�� � � � � � * � �� � !�� � *�� � � � � *�����
Therefore we conclude that
� �� �* � �� � !�� � � � � � * � �� � !�� � *�� � � � � *�����
Proposition 7.16. If the roles involved in the cycle are not transitive and do not have
any transitive sub-role, then the cyclic nominal introduction rule is correct and com-
plete.
Proof. We assume that the query before the rule application is � �� �*� *�� � � � � *����,
* satisfies the conditions of cyclic nominal introduction rule, and none of the rules with
a higher priority can be applied to *. The query after the application of the rule is
� �� �* � �� � !�� � � � � � * � �� � !�� �
)'��� � *� � � � � )'��� � *� *�� � � � � *����� (,)
Completeness We assume that (,) is satisfied. Let � be a canonical (w.r.t. �) quasi
transitive shrub interpretation satisfying �, which is n.r. � models the query formula in
(,) by hypothesis; therefore there is a mapping ( such that � models the query formula
w.r.t. (. By definition there should be one of the disjuncts such that � is a model for
it w.r.t. (. However, all the disjuncts in (,) are supersets of at least one disjunct in
�*� *�� � � � � *����. Therefore � models �*� *�� � � � � *���� w.r.t. ( as well.
Correctness We must show that if � �� �*� *�� � � � � *����, then (,) is satisfied. Let
� be a canonical (w.r.t. �) quasi transitive shrub interpretation satisfying �. By hy-
pothesis � �� �*� *�� � � � � *����, therefore there is a mapping ( such that � ���
�*� *�� � � � � *��. We proceed by contradiction assuming that � does not model the
query formula in (,) w.r.t. (.
By definition either � ��� �*� or � ��� �*�� for � � �� � � � � � (since no *� contains
an existential quantifier, we can assume that ( is a mapping for all the variables). If it
is the latter case, then � models the query formula in (,) w.r.t. ( as well, leading to a
contradiction; therefore we must assume that � ��� �*�.
Note that, since � is not a model for (,), then the mapping ( must not satisfy
any term + in �� � ! � ! � � or in�
� )'���, otherwise � would be a model for the
corresponding disjunct * � �+� in (,). Therefore
(��� !� �;
(� �� !� (� �� for any pair of variables �� � occurring in a cycle with �.
Let us consider a cycle *� � * involving the variable � in *. Obviously � ��� �*��;
we are going to show that this is in contradiction with the assumptions we made.
Before analysing the cases we show a property that will be used in the rest of the
proof. Given the assumptions, it is the case that for each term � �� ���� � � � � � �
in *� either �(� ��� (� ��� � � or (� �� � (� ��) for some element ) (see
Definition 3.2). If (� �� � � and �(� ��� (� ��� � �� then (� �� � � as
well by (3.2c). Let us assume that (� �� !� � , then (� �� � �) for some el-
ement ) of � and � of �� (See Definition 3.2). By (3.2d) either � � (� �� or
��(� ��� ��� ��� (� ���� � �� . The first case is exactly the result we want to prove;
therefore we assume that ��(� ��� ��� ��� (� ���� � �� , and we show that this is a
contradiction. Since � is canonical, then we can use the property (3.6a) to infer that
there is a transitive role � & �. This is in contradiction with the fact that we assumed
that none of the role names involved in any cycle in * is transitive or has a transitive
subrole.
We consider three cases according to the length � of the cycle (cardinality of *�).
� � � If the cycle contains a single term then it must be like ��� ���� � � � � � �. This
means that �(���� (���� � �� , which is impossible by Lemma 3.3 because we
assumed that (��� !� �
� � � If the cycle contains two terms, then they must be of the form
���� ��� � � � � � �� � � ����� � � � � � ��� �
This is because if there is not any variable different from �, then there is a smaller
cycle. On the other hand, with more than two different variables and two role
terms, having a cycle is impossible. Moreover, we assumed that no other rules
can be applied to *, therefore there cannot be any equality like � � in *;
nor can the variables and � appear in the same order in the two role terms in
*� (i.e. like ��� ��� � � � � ��, and ��� ���� � � � � � ��), otherwise the role
collapsing rule would be applicable. Since (��� !� � we already proved that
(��� � (� �) and (� � � (���)�, which is a contradiction.
� � � Let � and � be the two “neighbour” variables of � (i.e. there is a role term
connecting � and � and a different term connecting � and �). The two variables
must be different, otherwise there will be a cycle of length �. We can distinguish
four cases according to the ordering of the variables in the terms; let us consider
each one of them.
– Let us assume that the two terms are
��� ���� � � � � � �� and ��� ����� � � � � � ���
We showed that (� �� � (���)�� and (� �� � (���)�� for some )�� , )��;
therefore (� �� is not in � . This applies to the next variable connected
to �: either (� �� � (� �)�� or (� � � (� ��)�. The first case is ruled
out because (� � cannot be equal to (��� (we assumed this at the begin-
ning); therefore (� � � (� ��)�. The same arguments can be carried on
for all the subsequent variables in the cycle, and this is clearly in contra-
diction with the fact that, for the last variable �, it must be the case that
(� �� � (���)�� .
– Let us assume that the two terms are
��� ���� � � � � � �� and � �� ����� � � � � � ���
Therefore (� �� � (���)�� and (��� � (� ��)� for some )�� , )�. Using
the same argument as in the previous case we can show that every variable in
the cycle has the property of being in the form (� � � (���)�� � �)�)�. This
is in contradiction with the fact that (��� � (� ��)�.
– Let us assume that the two terms are
� �� ���� � � � � � �� and ��� ����� � � � � � ���
Therefore (� �� � (���)�� and (��� � (� ��)� for some )�� , )�. Using
the arguments of the previous case starting from � instead of � we get a
similar contradiction.
– Let us assume that the two terms are
� �� ���� � � � � � �� and � �� ����� � � � � � ���
Therefore (��� � (� ��)� and (��� � (� ��)� for some )�. This is in
contradiction with the fact that (� �� and (� �� must be different.
At this point we exhausted all the possibilities, finding only contradictions. There-
fore we must reject the original assumption that � does not model (,) w.r.t. (. Given
the arbitrariness of the choice of interpretation � we conclude that (,) is satisfied.
Rules that modify the KB
Proposition 7.17. The nominal elimination rule is correct and complete.
Proof. Let � �� �*� *�� � � � � *���� be the query before the application of the nominal
elimination rule, such that * satisfies the rule conditions, and � � �� �*�� *�� � � � � *����
is the resulting query.
Completeness We assume that �� �� �*�� *�� � � � � *���� w.r.t. n.r. interpretations.
Let � be a n.r. canonical (w.r.t. �) quasi transitive shrub interpretation satisfying �.
We have to show that � �� �*� *�� � � � � *����. We can distinguish the case in which
�!���� � � from the case in which it is not.
If �!���� � � then �� � �; therefore there is a mapping ( such that � ���
�*�� *�� � � � � *���� by the assumption that �� �� �*�� *�� � � � � *����. As seen in
the previous proofs we concentrate on the case in which � ��� *�; we just need
to show that � ��� � � !. Since we assumed that � is n.r., then ��� �
�!��
because �!���� � �. Therefore (��� � !� because � ��� ����, which proves
that � ��� � � !.
If �!���� !� � then the concept name �� does not appear in �, because the
assertion !��� is added to the left hand side whenever the representative concept
�� is introduced in the query formula. We define a new interpretation � � by
adding (or redefining) the mapping ���� �
�!��
to �.21 � � �� � because � does
not contain any reference to ��; so � � �� � � �!����. By assumption there is a
mapping ( such that � � ��� �*�� *�� � � � � *����. Let us consider the case in which
� � ��� *�: as in the previous case (��� � !��
� !� . In addition we can use the
very same mapping ( with �, therefore � ��� � � !; this shows that � ��� *.
Correctness We assume that � �� �*� *�� � � � � *����. Let � be a canonical (w.r.t.
��) quasi transitive shrub interpretation satisfying ��. We have to show that � ��
�*�� *�� � � � � *����. Again we distinguish the two cases according to whether !��� is
already in �. However, in both cases � �� � because � � ��; therefore there is a
mapping ( such that � ��� �*� *�� � � � � *����. We just need to show that if � ��� *,
then � ��� *�, by showing that � ��� ����. However, this is trivially verified because
� ��� � � ! (by the assumption that � ��� *), and !� � ��� because � �� ��.
Proposition 7.18. The simple inverse rolling up rule is correct and complete.
Proof. Let � �� �*� *�� � � � � *���� be the query before the application of the simple in-
verse rolling up rule, such that * satisfies the rule conditions, and � � �� �*�� *�� � � � � *����
is the resulting query.
21The domain of the interpretation function �� contains � by definition; therefore the value �� is defined.
Completeness We assume that �� �� �*�� *�� � � � � *���� w.r.t. n.r. interpretations. Let
� be a n.r. canonical (w.r.t. �) quasi transitive shrub interpretation satisfying �. We
build a new n.r. interpretation � � equal to � with the addition/substitution of the inter-
pretation for the new concept name ���� such that
������ �
�� � �� � ��� �� � � and � � ��
��
Note that since ���� does not appear either in � or in �*� *�� � � � � *����, � � still sat-
isfies �, and if � � �� �*� *�� � � � � *���� then � �� �*� *�� � � � � *����. It is easy to see
that � � satisfies � ������ ; i.e. �� ��� � �� � ��� �� � � and � � ��
�(note
that ��� � �� because ���� does not appear in �). Therefore � � satisfies ��, and by
assumption there is a mapping ( such that � � ��� �*�� *�� � � � � *����. We just need to
show that if � � ��� *�, then there is a mapping ( � s.t. � � ���� *.
Let us assume that � � ��� *�, then (� � � ������; therefore there is an element
� of �� such that ��� (� �� � �� and � � ��� . We define (� as (���� ; note that
still � � ���� *� because � does not appear in *� by hypothesis. By construction � � ����
��� �� and � � ���� ���; therefore � � ���� *.
Correctness We assume that � �� �*� *�� � � � � *����. Let � be a canonical (w.r.t.
��) quasi transitive shrub interpretation satisfying ��. Obviously � satisfies � as
well because it is a subset of ��; therefore there is a mapping ( such that � ���
�*� *�� � � � � *����. We need to show that if � ��� *, then � ��� *�.
Let us assume that � ��� *, then (��� � �� ; therefore (��� � ������ because
� satisfies ��. In addition, �(���� (� �� � � because � ��� *; therefore (� � �
����� , which means that � ��� *
�.
Proposition 7.19. The nominal rolling up rule is correct and complete.
Proof. Let � �� �*� *�� � � � � *���� be the query before the application of the nominal
rolling up rule, such that * satisfies the rule conditions, and � � �� �*�� *�� � � � � *���� is
the resulting query.
Completeness We assume that �� �� �*�� *�� � � � � *���� w.r.t. n.r. interpretations. Let
� be a n.r. canonical (w.r.t. �) quasi transitive shrub interpretation satisfying �. We
build a new interpretation � � by adding (or redefining) the mapping ���� �
�!��
to
�. The interpretation � � is still n.r. In addition, if !��� is already in �, then � � � �
(see the proof of Proposition 7.17). If on the other hand !��� is not in �, it is easy to
see that if � � �� �*� *�� � � � � *���� then � �� �*� *�� � � � � *���� as well because �� does
not appear either in � or in �*� *�� � � � � *����. The interpretation � � trivially satisfies
��; therefore there is a mapping ( such that � � ��� �*�� *�� � � � � *����. We just need to
show that if � � ��� *�, then there is a mapping ( � s.t. � ���� *.
Let us assume that � � ��� *�. Since � � ��� ������� � ��� � � � � � ����� � ����,
then (� � � ������ � ������; therefore there is an element � of ��� � �� such that
�(� �� �� � ��� � �
� , � � ��� � �� (because either � does not include �� or
���� � ��
�) and � � ���� . Note that ��
�� ��!��
, therefore � � !� � !��
. The
very same arguments, applied to (� � � ��������� for � � �� � � � � �, allow us to con-
clude that �(� �� �� � ��� . We define (� as (���� . By construction � ���� � � !,
� ���� ���, and � ���� � � ���� � � � � � �; therefore � � ���� *.
Correctness We assume that � �� �*� *�� � � � � *����. Let � be a canonical (w.r.t.
��) quasi transitive shrub interpretation satisfying ��. Obviously � satisfies � as
well because it is a subset of ��; therefore there is a mapping ( such that � ���
�*� *�� � � � � *����. We need to show that if � ��� * then � ��� *�.
Let us assume that � ��� *, therefore there is an element (��� of �� such that
(��� � !� (because � ��� � � !), (��� � �� (because � ��� ���), and �(� �� (���� �
�� � � � � �
� (because � ��� � � ���� � � � � ��). In addition, (��� � ��� because
� satisfies �� (!��� is in ��). Therefore (��� � ������ � ��� � ����� � � � � �
������� and � ��� *
�.
Proposition 7.20. The cycle breaking and simple cycle breaking rules are correct and
complete.
Proof. Let � �� �*� *�� � � � � *���� be the query before the application of the rule, such
that * satisfies the rule conditions, and �� �� �*�� *�� � � � � *���� is the resulting query
after the application of one of the two rules.
Completeness We proceed as in the previous Proposition 7.19. Let � be a n.r.
canonical (w.r.t. �) quasi transitive shrub interpretation satisfying �, and � � defined as � with
the mapping ���� �
�!��
. As in the previous proposition, � � is still n.r. and it satisfies
��. Therefore by hypothesis there is a mapping ( such that � � ��� �*�� *�� � � � � *����.
As before we assume that � � ��� *�. We are going to show that � ��� * as well. Note
that in the application of both the rules the variables and � are still in * � (in the cycle
breaking rule either because of the path or because " �); therefore both (� � and
(��� are defined.
Let us consider the simple cycle breaking rule; since � � ��� *�, then �(� �� (���� �
��� � �
� and there are � ' � elements )�� � � � � )� of ��� � �� such that )� � ����
and �(� �� )�� � ��� � �
� . Since � � is n.r., then )� � � � � � )� � (���, so
�(� �� (���� � �� � � � � � �
�; therefore � ��� � � ���� � � � � ��. The argument for the cycle
breaking rule is the same.
Correctness The proof is exactly the same as that of Proposition 7.19 for the nominal
rolling up rule.
Proposition 7.21. The nominal inverse rolling up rule is correct and complete.
Proof. Let � �� �*� *�� � � � � *���� be the query before the application of the nominal
inverse rolling up rule, such that * satisfies the rule conditions, and � � �� �*�� *�� � � � � *����
is the resulting query.
Completeness We assume that �� �� �*�� *�� � � � � *���� w.r.t. n.r. interpretations.
Let � be a n.r. canonical (w.r.t. �) quasi transitive shrub interpretation satisfying �.
Again, we proceed as in Proposition 7.19 and Proposition 7.18 by considering � � as
equal to � with the mapping ���� �
�!��
, and for each � � �� � � � � � ������� ��
� � �� � �!� � �� � ���
if !� � �� or ������� � otherwise. This interpreta-
tion is still n.r.; in addition, it satisfies � because all the ����� do not appear in �,
and either �� appears in � and ���� � ��
� because � is n.r., or �� does not ap-
pear in �. The interpretation � � satisfies the assertion !��� by definition; now we
are going to show that it satisfies �� � ��� ��������� � � � � � ������
��� as
well. Note that ��� � �� because either �� does not syntactically appear in � or
���� � ��
� ; therefore either �� � ����� �
�!�
��
or �� � ����� � . In the first
case, let us consider the role name � for some � � �� � � � � �. If ) � ��� is such that
�!��
� )� � ��� then ) �
�� � �� � �!�� �� � �
��
; therefore ) � �����
�� by defini-
tion, and !��
� ����������
�� because of the arbitrariness of ). Given the fact that this
is true for all � � �� � � � � � then �� ������ � �������
�� � � � �����������
�� . In the
case that �� � ����� � the inclusion constraint is trivially satisfied. This shows that
� � satisfies ��. By assumption there is a mapping ( such that � � ��� �*�� *�� � � � � *����.
We need to show that if we assume � � ��� *�, then there is a mapping ( � s.t. � ���� *.
Since we assumed that � � ��� *�, then (� � � ����
�� � � � � � ������
�� . Therefore
�!�� (� �� � ��� � �
� for all � � �� � � � � � and !� � ��� � �� because �������
is not empty. The new mapping ( � � (���!� is clearly such that � ���� * because �
does not appear in other terms in * by hypothesis.
Correctness We assume that � �� �*� *�� � � � � *����. Let � be a canonical (w.r.t. ��)
quasi transitive shrub interpretation satisfying ��. Obviously � satisfies � as well
because it is a subset of ��; therefore there is a mapping ( such that � ��� �*� *�� � � � � *����.
We need to show that if we assume that � ��� * then � ��� *�.
Since � ��� *, then (��� � !� , (��� � �� , and �(���� (� �� � �� for all
� � �� � � � � �. In addition, (��� � ��������� � � � ��������
���� because � satisfies
the inclusion �� � ��� ��������� � � � � � ������
���, and (��� � �� � ���� .
Therefore for all � � �� � � � � �we have the combination �(���� (� �� � �� and (��� �
����������
� , which implies that (� � � �����
� . This allows us to conclude that
� ��� *�.
Proposition 7.22. The inverse cycle breaking rule is correct and complete.
Proof. Let � �� �*� *�� � � � � *���� be the query before the application of the inverse cy-
cle breaking rule, such that * satisfies the rule conditions, and �� �� �*�� *�� � � � � *����
is the resulting query.
Completeness Let us assume that �� �� �*�� *�� � � � � *���� w.r.t. n.r. interpretations,
and let � be a n.r. canonical (w.r.t. �) quasi transitive shrub interpretation satisfying
�. We proceed as in Proposition 7.21 by considering � � as equal to � with the
mapping ���� �
�!��
, and for each � � �� � � � � � �����
�� ��� � �� � �!�� �� � �
��
.
Again, interpretation � � is still n.r. and satisfies �. Using the same arguments as for
Proposition 7.21 we can show that � � satisfies �� as well. By assumption there is a
mapping ( such that � � ��� �*�� *�� � � � � *����. We are going to show that if
we assume � � ��� *�, then � ��� * (note that both the variables and � matching the
precondition are still in *�).
Since we assumed that � � ��� *�, then (� � � ������ � � � � � ���
����� , so
�!�� (� �� � ��� � �
� for all � � �� � � � � �. In addition, (��� � !� because
� � ! is still in *�; therefore � ��� ��� ��� � � � � � � (i.e. � ��� * because
* � *� � ���� ��� � � � � ���).
Correctness We assume that � �� �*� *�� � � � � *����. Let � be a canonical (w.r.t. ��)
quasi transitive shrub interpretation satisfying ��. Obviously � satisfies � as well
because it is a subset of ��; therefore there is a mapping ( such that � ��� �*� *�� � � � � *����.
We need to show that if we assume that � ��� *, then � ��� *�.
Since � ��� * then (��� � !� , and �(���� (� �� � �� for all � � �� � � � � �. In ad-
dition, since (��� � !� � ��� we derive that (��� � �������
���� for all � � �� � � � � �,
7.5. SPEEDING UP THE ANSWER 168
because � satisfies the inclusion �� ��������� � � � � � ������
���. This means
that (� � � �����
� for all � � �� � � � � �; therefore � ��� ������� � � � � � ���
��� (i.e.
� ��� *�).
7.5 Speeding up the answer
The major performance problem of the query algorithm lies in the nondeterministic
substitution of variables with individual names. Note that this is true both for
the transformation of a “retrieval” query into a boolean query (see Section 7.1.2)
and for the actual boolean query answering (see Section 7.4). Therefore, a better
performing algorithm must reduce the number of variable substitutions to perform
and the number of individuals which take part in each substitution.
The number of variables which need to be substituted by individual names cannot be
reduced while transforming a query into a boolean query, but there is considerable
freedom in the application of the transformation rules.
Example 7.6
Let us consider a cyclical query like
�� � ����� ��� ����� ��� ���� � � ���� �
This query contains two cycles: one involving the variables , �, and �, and a second
constituted by the loop in (i.e. � � ���). The cyclic nominal introduction rule can
be applied to any of the three variables. However, if it is applied to , both the cycles
are “eliminated” at once; while applying the rule to either � or � leaves the loop in
in each one of the newly generated disjuncts.
Note that, in this example, choosing reduces the size of the transformed query by
a logarithmic factor.
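The choice discussed in Example 7.6 can be illustrated with a small sketch. It is not part of the algorithm described in this chapter: it only assumes that a query is given as a list of role terms over variables (the names x, y and z are hypothetical), ignores the direction of the roles, and ranks each variable by how many independent cycles would remain once that variable is replaced by an individual name.

```python
def remaining_cycles(role_terms, removed):
    """Number of independent cycles left once `removed` no longer acts as a variable.

    role_terms is a list of (variable, variable) pairs; direction is ignored.
    The result is the cyclomatic number: edges - nodes + connected components.
    """
    edges = [(u, v) for u, v in role_terms if removed not in (u, v)]
    nodes = {n for edge in edges for n in edge}
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in edges:
        parent[find(u)] = find(v)
    components = len({find(n) for n in nodes})
    return len(edges) - len(nodes) + components


def best_variable(role_terms):
    """Pick the variable whose substitution leaves the fewest cycles."""
    variables = sorted({n for edge in role_terms for n in edge})
    return min(variables, key=lambda v: remaining_cycles(role_terms, v))


# Hypothetical encoding of a query with a three-variable cycle and a loop on x.
query = [("x", "y"), ("y", "z"), ("z", "x"), ("x", "x")]
print(best_variable(query))
```

On this toy query the variable carrying the loop is selected, because substituting it removes both cycles, whereas substituting either of the other two leaves the loop in place.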
To reduce the number of candidate individuals for a variable name, we consider
two different techniques. The first one relies on the concept terms contained
in the query, while the second relies on the structure of role terms.
They both rely on the observation that if a subset of the terms in a disjunct can never be
satisfied, then the disjunct cannot be satisfied either. Therefore, if a query contains a
concept term like ��, and the variable must be substituted by an individual name, we
can just consider the subset of individual names which can be instances of � (verifiable
by instance checking tests like � �� ����, see Section 2.2). Note that for a variable
name different from the joining variable � (see Section 7.3.1) we can even consider the
more restricted set of individual names which are instances of �.
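A minimal sketch of this filtering step follows. The two oracle functions stand in for calls to a reasoner and are purely hypothetical; here they are simulated with explicit tables of positive and negative knowledge, and the individual and concept names are invented for illustration.

```python
def candidates(individuals, concept, oracle):
    """Keep only the individual names accepted by the given instance oracle."""
    return [a for a in individuals if oracle(a, concept)]


# Toy knowledge: what is provably true and provably false; the rest is unknown.
KNOWN_POSITIVE = {("anna", "Student")}
KNOWN_NEGATIVE = {("carl", "Student")}


def may_be_instance(a, concept):
    """Weak test: the individual is not provably a non-instance."""
    return (a, concept) not in KNOWN_NEGATIVE


def is_instance(a, concept):
    """Strong test: the individual is provably an instance."""
    return (a, concept) in KNOWN_POSITIVE


names = ["anna", "bob", "carl"]
loose = candidates(names, "Student", may_be_instance)   # usable for any variable
strict = candidates(names, "Student", is_instance)      # non-joining variables only
print(loose, strict)
```

The weaker test discards only names that can never satisfy the concept term, while the stronger one keeps just the names known to satisfy it, which is the more restricted set mentioned for variables other than the joining variable.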
We can make an analogous consideration for role terms, with the difference that the
role assertions allowed in a ��f KB are not very expressive. In particular, they
cannot express incomplete information as concept assertions do. Using concept
assertions we can state something like ��� �� which leaves a degree of uncertainty
about the properties of �.22 This is impossible to do for role assertions, which are
usually limited to simple statements like ��� ���. This limited expressivity for roles is
shared by most of the DLs studied and/or implemented.
This limitation can be used in order to restrict the number of candidate individu-
als. Let us consider for example a term like � � ���, where the variable � must be
substituted by an individual name. In a DL like ��f we can restrict the candidates to names which
appear explicitly in assertions like ��� ��� as the second element of the pair.23 Note that
this argument is no longer valid for DLs providing the inverse role constructor; on the
other hand we are confident that effective optimisations can be devised in most of the
cases.
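The restriction based on role assertions, including the sub-roles mentioned in the footnote, might be sketched as follows. The representation of the role hierarchy as (sub-role, super-role) pairs and of role assertions as (role, first, second) triples is an assumption made for the example, as are all the names; the optional `subject` argument covers the case in which the first argument of the term is already an individual name.

```python
def sub_roles(role, hierarchy):
    """All roles included in `role` (reflexive, transitive closure).

    hierarchy is a list of direct inclusions given as (sub_role, super_role).
    """
    found, frontier = {role}, [role]
    while frontier:
        current = frontier.pop()
        for sub, sup in hierarchy:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def role_fillers(role, assertions, hierarchy, subject=None):
    """Names occurring as second element of an assertion on `role` or a sub-role."""
    roles = sub_roles(role, hierarchy)
    return {
        second
        for r, first, second in assertions
        if r in roles and (subject is None or first == subject)
    }


HIERARCHY = [("hasSon", "hasChild")]
ASSERTIONS = [
    ("hasSon", "a", "b"),
    ("hasChild", "a", "c"),
    ("hasChild", "e", "f"),
    ("likes", "a", "d"),
]
print(role_fillers("hasChild", ASSERTIONS, HIERARCHY, subject="a"))
```

As noted in the text, this shortcut is only sound when role fillers cannot be implied in other ways, for instance through inverse roles.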
We have not yet investigated these ideas, and all these optimisations must be carefully
verified in order to avoid incomplete or incorrect answers. This will be the subject of our
future work.
22Note that the incomplete information may be stated in more subtle ways and/or implied by the terminology.
23Note that the role hierarchy must be taken into account as well, therefore considering role assertions with sub–roles of � as well.
Chapter 8
Implementation and testing
The main goal of our DL system implementation is not the delivery of a complete and
robust system, but the development of a prototype to be used as test bed for verifying
the validity of our approach and for suggesting new directions to investigate.
The first objective was the analysis of the overall performance of the system compared
to other state of the art DL reasoners. This point is important because we know
that the naive implementation of the tableaux–like method for DLs leads to unaccept-
able performance (see Horrocks [1997]). Modern DL systems, on the other hand,
adopt a series of optimization techniques which provide exceptional results (see Hor-
rocks and Patel-Schneider [1999], Haarslev and Moller [1999, 2000c]). Unsurpris-
ingly, these techniques provide better results if the algorithm has control over the whole
reasoning process. In our case, on the other hand, we have a sharp separation between
the precompletion phase (dealing with Abox assertions) and the invocation of the ter-
minological reasoner. The other goal of the experimentation was to understand the
impact of different optimisation techniques applied at the Abox precompletion level.
The focus of this chapter is not the actual software artifact we have
developed, but rather the underlying ideas and techniques used.
8.1 Description of the system
The DL system has been written in Common Lisp because it is an excellent language
for symbolic manipulation, and it enables the programmer to easily extend and modify
the code to experiment with different approaches.
Most available DL systems have been written using Lisp, mainly for the above
mentioned reasons, but also because we are dealing with highly intractable algorithms;
8.2. OPTIMISING THE ALGORITHM 171
therefore, twiddling with a low level programming language for gaining an order of
magnitude (which is not that easy against modern Lisp compilers) is pointless without
first investigating higher level optimisations, which can lead to gains of several orders
of magnitude.
Interaction with the system is performed by means of Lisp functions and macros
designed to conform to the KRSS standard (see KRSS), as well as to be compatible with
the syntax used by the FaCT system (see Horrocks [1997]).
The terminological reasoner used is FaCT. Since it is written in Common Lisp as
well, its integration in the code as a library was very easy, and a natural choice since it
is the system used in our research group. However, in principle, any DL terminological
reasoner can be integrated as a library.
8.2 Optimising the algorithm
The experience with previous DL systems shows that the direct implementation of the
tableaux–based satisfiability algorithms provides very poor performance, unaccept-
able for any real application. Even if the experiments with other DL systems have
been mainly at the terminological level, we can try to extract some lessons from those
experiences (see Haarslev and Moller [1999], Horrocks and Patel-Schneider [1999]).
One of the advantages of the precompletion technique is that the precompletion
phase is completely separated from the terminological reasoning, so optimisation tech-
niques implemented at the two levels do not interact adversely. The terminological
reasoner we are using (i.e. FaCT) is already optimised, therefore we focus on the opti-
misation we can perform during the precompletion.
Among the answers we are looking for from the experiments is whether optimisa-
tions made at the precompletion level make a real difference, and which ones produce
the best results. Given the fact that during the precompletion no new individuals are
created, the source of complexity must be related to the nondeterminism in the �–rule.
8.2.1 Evaluation strategy
According to the formal description of the algorithm in Chapter 5, the order in which
the rules are selected is irrelevant to the correctness and completeness of the algo-
rithm.1 However, for specifying the actual algorithm we decided to consider all the
1Note that with ��f this is no longer true for terminological reasoning. The difference is the absence of the �–rule in the generation of precompletions.
constraints associated to a given individual before moving to a different individual.
Note that there is always the possibility that constraints associated with different in-
dividuals cause the addition of new constraints for the individual which has been pre-
viously considered. In this case, eventually the individual in question will be selected
again.
Intuitively, this strategy may give good results when there is not a strong interde-
pendency between individuals (i.e. role assertions). On the other hand, it may cause an
unnecessary overhead if there are “easy” contradictions in individuals not yet consid-
ered.
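The scheduling policy described above can be sketched as a simple worklist. This is only an illustration of the order in which individuals are visited, not the implemented precompletion algorithm: it has no nondeterministic choices or clash detection, and the `expand` function, the constraint encoding and the role assertions are hypothetical.

```python
from collections import deque


def precomplete(constraints, expand):
    """Process every pending constraint of one individual before moving on.

    constraints: dict mapping each individual to its initial set of constraints.
    expand(a, c): yields (individual, constraint) pairs implied by c on a.
    Returns the processed constraints and the order in which individuals were visited.
    """
    pending = {a: set(cs) for a, cs in constraints.items()}
    done = {a: set() for a in constraints}
    queue = deque(constraints)
    visits = []
    while queue:
        a = queue.popleft()
        visits.append(a)
        while pending[a]:
            c = pending[a].pop()
            done[a].add(c)
            for b, d in expand(a, c):
                seen = done.setdefault(b, set()) | pending.setdefault(b, set())
                if d not in seen:
                    pending[b].add(d)
                    # An individual considered earlier is selected again later.
                    if b != a and b not in queue:
                        queue.append(b)
    return done, visits


ROLES = [("r", "a", "b"), ("s", "b", "a")]


def expand(a, c):
    """Toy rule: a universal constraint is propagated along matching role assertions."""
    if isinstance(c, tuple) and c[0] == "all":
        _, role, filler = c
        for r, first, second in ROLES:
            if r == role and first == a:
                yield second, filler


done, visits = precomplete(
    {"a": {("all", "r", "C")}, "b": {("all", "s", "D")}}, expand
)
print(visits)
```

In this toy run the individual a is visited a second time, because processing b adds a new constraint to it, which mirrors the remark that a previously considered individual will eventually be selected again.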
8.2.2 Axiom absorption and lazy expansion
General axioms are one of the major sources of nondeterminism; in fact, every
axiom � � hides the disjunctive formula � � �� applied to every individual
name. One of the most effective ways of reducing the effects of axioms is the so-called
absorption technique, used in conjunction with lazy expansion of concept names (see
Horrocks and Tobies [2000]).
Roughly speaking, the idea behind this technique is to transform a general axiom
into the special form � � where � is a concept name. Then the axiom is treated as
a sort of definition for the name � and ignored until a concept constraint of the form
!�� is examined; at this point the “definition” � of � is added to the label of ! (i.e.
the new constraint !��). This basic idea can be extended to negated concept names as
well; i.e. having definitions of the form �� �. However, their combination must
be used carefully to avoid incorrect results. We implemented the absorption algorithm
described in Horrocks and Tobies [2000].
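The lazy expansion half of the technique can be illustrated with a short sketch. It assumes that absorption has already produced a table mapping concept names (or negated names) to the concepts they imply; the absorption step itself is not shown, and the table contents and the tuple encoding of concepts are invented for the example.

```python
def lazy_unfold(label, absorbed):
    """Expand a label using absorbed axioms, only for names that actually occur.

    label: the set of concepts currently asserted for one individual.
    absorbed: dict mapping a name (or a negated name) to the list of concepts
              that the absorbed axioms attach to it.
    """
    todo, result = list(label), set()
    while todo:
        concept = todo.pop()
        if concept in result:
            continue
        result.add(concept)
        # The "definition" is added only when the name is examined.
        todo.extend(absorbed.get(concept, ()))
    return result


ABSORBED = {
    "Parent": [("some", "hasChild", "Person")],
    "Doctor": [("some", "hasDegree", "Medicine")],
    ("not", "Adult"): ["Minor"],
}

print(lazy_unfold({"Parent"}, ABSORBED))
```

The axiom attached to Doctor is never touched for an individual whose label does not mention Doctor, which is where the saving comes from compared with applying every axiom as a disjunction to every individual.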
8.2.3 Lexical normalisation
Concept expressions are normalised and encoded according to the transformation rules
described in Horrocks and Patel-Schneider [1999]. In the normal form, concept expres-
sions can be concept names, conjunctions of normal form concepts, universal quantifi-
cation constructors, and the negation of a normal form. Conjunctions are represented
as sets, so the order of the conjuncts does not affect the syntactic equivalence of dif-
ferent expressions; in addition, nested conjunctions are flattened (e.g. expressions like
���� � ��� ��� are transformed into ��� � �� ���).
Normal form expressions are uniquely associated with an identifier which is used
Concept expression Normal form� ���� � � � � � �� �������� � � � ��������� ���������� ��� � � � � � �� ����� � � � � ���� �� ���� � � � � ��� � ��� � � � � ��� � ���� � � � � ��� ��� � � � � ���� ��� ���� ����� ��� � � � � ��� � ���� � � � � ���� ���� ��� � � � � ��� ������������ � � � � ��� ��
Table 8.1: Lexical normalisation rules
in place of the complex expression itself. A table is used to relate the normalised
expressions to their respective identifiers. In this way, every time an expression is
encountered a second time during the parsing, the very same identifier is used to rep-
resent it. This mechanism enables the possibility of detecting trivial inconsistencies at
an early stage, without the necessity of expanding the concept expressions.
For example, let us consider the expression (∃R.(C ⊓ D)) ⊓ (∀R.¬(D ⊓ C)). The
parser is recursively invoked over its subexpressions: first the expression ∃R.(C ⊓ D)
is transformed into the negation of a universal quantification, ¬(∀R.¬⊓{C, D}).
During the transformation each subexpression is associated with an identifier, and the
identifier table will contain the mappings id1 ↦ ⊓{C, D} and id2 ↦ ∀R.¬id1 (see footnote 2).
During the parsing of the second top level conjunct ∀R.¬(D ⊓ C), the parsed subexpressions
are recognised as stored in the table and substituted by their identifiers. So
the resulting normal form is ⊓{¬id2, id2}, which is immediately normalised as ⊥.
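The identifier table can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the encoding used in the implementation: expressions are nested tuples such as ("some", "R", "C"), a negated concept is represented by the negative of its identifier, and the class name Normaliser is hypothetical. Conjunctions are stored as sets of identifiers, so the order of the conjuncts is irrelevant and nested conjunctions are flattened.

```python
class Normaliser:
    """Toy lexical normaliser; a negated concept is the negative identifier."""

    BOTTOM = 1  # identifier reserved for ⊥, so ⊤ is -1

    def __init__(self):
        self.forms = [None, "BOTTOM"]         # identifier -> normal form
        self.table = {"BOTTOM": self.BOTTOM}  # normal form -> identifier

    def _intern(self, form):
        # The same normal form always receives the same identifier.
        if form not in self.table:
            self.forms.append(form)
            self.table[form] = len(self.forms) - 1
        return self.table[form]

    def norm(self, expr):
        if isinstance(expr, str):                      # concept name
            return self._intern(expr)
        op, args = expr[0], expr[1:]
        if op == "not":                                # ¬¬C collapses to C
            return -self.norm(args[0])
        if op == "all":
            return self._intern(("all", args[0], self.norm(args[1])))
        if op == "some":                               # ∃R.C becomes ¬∀R.¬C
            return -self.norm(("all", args[0], ("not", args[1])))
        if op == "or":                                 # C ⊔ D becomes ¬(¬C ⊓ ¬D)
            return -self.norm(("and",) + tuple(("not", a) for a in args))
        if op == "and":
            ids = set()
            for a in args:
                i = self.norm(a)
                form = self.forms[i] if i > 0 else None
                if isinstance(form, tuple) and form[0] == "and":
                    ids |= form[1]                     # flatten nested conjunctions
                else:
                    ids.add(i)
            ids.discard(-self.BOTTOM)                  # ⊤ is neutral in a conjunction
            if self.BOTTOM in ids or any(-i in ids for i in ids):
                return self.BOTTOM                     # trivial inconsistency
            if not ids:
                return -self.BOTTOM
            if len(ids) == 1:
                return next(iter(ids))
            return self._intern(("and", frozenset(ids)))
        raise ValueError(f"unknown constructor: {op!r}")


n = Normaliser()
# A hypothetical expression whose two conjuncts contradict each other.
clash = n.norm(("and", ("some", "R", ("and", "C", "D")),
                       ("all", "R", ("not", ("and", "D", "C")))))
flat_a = n.norm(("and", "C", ("and", "D", "E")))
flat_b = n.norm(("and", ("and", "C", "D"), "E"))
```

Because both conjuncts of the first expression reduce to the same identifier with opposite signs, the contradiction is found by a set lookup, without expanding any concept.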
8.2.4 Backjumping
Inherent unsatisfiability concealed in sub-problems can lead to large amounts of un-
productive backtracking search known as thrashing.
2 There will also be identifiers for the other subexpressions, but these are sufficient to show our point.
Example 8.1
Consider the set of constraints
{ x:(C1 ⊔ D1), ..., x:(Cn ⊔ Dn), x:∀R.∀S.B,
  y:∃S.¬B, ⟨x, y⟩:R }
which may cause the exploration of 2^n alternative combinations of constraints on x
deriving from the concepts (C1 ⊔ D1), ..., (Cn ⊔ Dn), while the true cause for the
failure is related to the individual y. For example, this will happen if the rule application
strategy forces the evaluation of all the constraints associated with an individual before
considering a different individual.
This problem is addressed by adapting a form of dependency directed backtracking
called backjumping, which has been used in solving constraint satisfiability problems
(see Baker [1995]). Backjumping works by labelling concept constraints with a dependency
set indicating the branching points on which they depend. A concept constraint
x:C depends on a branching point if x:C was added to the label by the ⊔–rule generating
the branching point, or if x:C was generated by a different rule and the concept
constraints involved in the rule depend on the branching point.3
For example, the constraints {x:∀R.C, ⟨x, y⟩:R} generate the new constraint y:C
by means of the ∀–rule. The constraint y:C inherits the very same dependency set as
the constraint x:∀R.C.
When a clash is discovered, the dependency sets of the clashing concept constraints
can be used to identify the most recent branching point where exploring the other
branch might alleviate the cause of the clash. The algorithm can then jump back over
intervening branching points without exploring alternative branches.
In more detail, when a clash is detected the union of the dependency sets of the
clashing concepts is taken, and backtracking is performed. During backtracking, each
branching point encountered is checked against the dependency set to see if it is a
member. If it is not in the dependency set, then the other branch is ignored and back-
tracking continues. If the branching point is in the dependency set, and the other branch
has not been explored, then backtracking stops and the algorithm proceeds with the ex-
ploration of the second branch. If both branches have already been explored, then the
dependency sets from the two branches are unioned and backtracking continues.
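The backtracking scheme just described can be sketched as a small recursive search in Python. This is an illustrative reconstruction, not the code of our system: the function search, its parameters, and the convention that check returns the set of branching points a clash depends on are all assumptions made for the example.

```python
def search(choices, check, level=0, picked=()):
    """Return (model, None) on success or (None, dependency_set) on failure.

    choices[i] lists the alternatives of branching point i; check(picked)
    returns None if the partial choice is consistent, otherwise the set of
    branching-point indices the clash depends on (empty set: top level).
    """
    deps = check(picked)
    if deps is not None:
        return None, set(deps)
    if level == len(choices):
        return picked, None
    union = set()
    for alternative in choices[level]:
        model, deps = search(choices, check, level + 1, picked + (alternative,))
        if model is not None:
            return model, None
        if level not in deps:
            return None, deps           # not responsible: jump over this point
        union |= deps - {level}         # both branches failed: merge the sets
    return None, union


# n disjunctions, and a clash that does not depend on any of them.
n = 10
choices = [[f"C{i}", f"D{i}"] for i in range(1, n + 1)]
calls = 0

def leaf_clash(picked):
    global calls
    calls += 1
    return set() if len(picked) == n else None

model, deps = search(choices, leaf_clash)
```

With a clash that depends only on the top level, the search gives up after a single descent (n + 1 consistency checks) instead of visiting all 2^n combinations.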
Note that when a contradiction is discovered by the verification of a label, there
is no way for our algorithm to know the source of the contradiction; therefore, the
union of the dependency sets of all the constraints of the label is returned. As we are
going to show in Section 8.2.6, we can modify the algorithm in order to make the
backjumping more effective.
3 Since new role assertions are never introduced, the dependency is only related to concept expressions.
With respect to the given example, the precompletion algorithm generates the first
precompletion
{ x:(C1 ⊔ D1), ..., x:(Cn ⊔ Dn), x:∀R.∀S.B,
  x:C1, ..., x:Cn,
  y:∃S.¬B, y:∀S.B, ⟨x, y⟩:R }
by choosing the first concept for each constraint containing the disjunction, and applying
the ∀–rule to the constraints x:∀R.∀S.B and ⟨x, y⟩:R. In this process, the
algorithm generates n different branching points, and every constraint x:C1, ..., x:Cn
is labelled with a different branching point. The constraints associated with y have no
branching point associated with them (this indicates a dependency on the top level),
so when the terminological reasoner discovers the inconsistency of the label associated
with y there is no reason to explore the alternative options x:D1, ..., x:Dn.
8.2.5 Abox partitioning
Analysing the precompletion algorithm, it is easy to realise that individuals can in-
fluence each other only by means of role assertions. In addition, these assertions are
“static”, in the sense that they do not change during the precompletion process. This
guarantees that unconnected parts of the Abox can be precompleted independently.
The whole KB is satisfiable iff each connected component is independently satisfiable.
This property makes little difference in the case of KB satisfiability, since the only
advantage is in memory occupation.4 However, it can be a significant improvement
in the case of the instance checking problem (i.e. verifying whether an individual is a
member of a concept in every model of the KB, see Section 2.2). In fact, this problem is usually
reduced to KB (un)satisfiability: given a KB ⟨T, A⟩, an individual a, and a concept C,
⟨T, A⟩ ⊨ a:C iff the KB ⟨T, A ∪ {a:¬C}⟩ is unsatisfiable. Knowing that the initial
KB ⟨T, A⟩ is satisfiable (this is usually the case) allows us to verify the satisfiability
of the KB by considering only the assertions about individuals “connected” with the
name in question (the individual a). The rest of the Abox can be ignored.
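Splitting the Abox amounts to computing the connected components of the graph induced by the role assertions. A minimal Python sketch follows; the function partition_abox and the tuple encoding of the assertions are assumptions made for the illustration, and the components are computed with a simple union-find structure.

```python
from collections import defaultdict


def partition_abox(concept_assertions, role_assertions):
    """Split an Abox into independent parts, one per connected component.

    concept_assertions: list of (individual, concept) pairs
    role_assertions:    list of (subject, role, object) triples
    Returns a list of (concept_assertions, role_assertions) pairs.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for individual, _ in concept_assertions:
        find(individual)
    for subject, _, obj in role_assertions:
        parent[find(subject)] = find(obj)   # role assertions connect individuals

    parts = defaultdict(lambda: ([], []))
    for individual, concept in concept_assertions:
        parts[find(individual)][0].append((individual, concept))
    for subject, role, obj in role_assertions:
        parts[find(subject)][1].append((subject, role, obj))
    return list(parts.values())


concepts = [("a", "C"), ("b", "D"), ("c", "E")]
roles = [("a", "R", "b")]
parts = partition_abox(concepts, roles)
# Only the part mentioning "a" matters for an instance check about "a".
relevant = next(p for p in parts if any(i == "a" for i, _ in p[0]))
```

For an instance check on a given individual, only the part containing it needs to be precompleted.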
4 Even though one can use the property for loading into the main memory only the part of the Abox relevant for the current precompletion (see for example Elhaik and Rousset [1998]).
8.2.6 Using the terminological reasoner
According to the described algorithms the terminological reasoner is used only when
a precompletion is generated. However, it can be used in different phases of the algo-
rithm in a more sophisticated way. The starting assumption is that the terminological
reasoner is more efficient than the precompletion phase; therefore it can be used not
only for verifying the consistency of a precompletion but also for guiding the algorithm.
In addition, it is reasonable to expect that the Abox is much bigger than the Tbox.
Result caching As pointed out in Donini et al. [1996a], caching the satisfiability
results for the tested concept expressions is essential for keeping the complexity of
the actual algorithm within the theoretical EXPTIME bound. In addition, in our case it
allows us to avoid the overhead of converting a concept expression from the internal
representation to a format suitable for the terminological reasoner.
As described in Section 8.2.3, the system keeps a table of all the normalised con-
cept expressions the system comes across during the precompletion. When the ter-
minological reasoner is invoked for checking the satisfiability of a concept expression
(the conjunction of one or more concepts in a label), the result is stored in the table
as well. If a concept expression, having a cached satisfiability value, must be checked
again, its cached value is returned without invoking the terminological reasoner again.
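The caching scheme can be sketched as a thin wrapper around the terminological reasoner. In this Python illustration the class CachingReasoner and the stand-in toy_reasoner are hypothetical; in the system itself the key would be the identifier of the normalised expression rather than a set of concepts.

```python
class CachingReasoner:
    """Remember the satisfiability result of every label already tested."""

    def __init__(self, is_satisfiable):
        self._is_satisfiable = is_satisfiable   # the external reasoner
        self._cache = {}
        self.calls = 0                          # real reasoner invocations

    def check(self, concepts):
        key = frozenset(concepts)               # a conjunction is order-insensitive
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._is_satisfiable(key)
        return self._cache[key]


def toy_reasoner(concepts):
    # Stand-in for the real reasoner: it only spots a name with its negation.
    return not any(("not", c) in concepts for c in concepts)


reasoner = CachingReasoner(toy_reasoner)
first = reasoner.check(["A", ("not", "B")])
second = reasoner.check([("not", "B"), "A"])    # same label, different order
third = reasoner.check(["A", ("not", "A")])
```

The second request is answered from the cache, so the reasoner is invoked only twice for the three checks.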
Early inconsistency detection The first alternative use of the terminological rea-
soner is as an early inconsistency detector. This is based on the observation that if the
constraints applied to an individual are inconsistent they will be inconsistent in any
generated precompletion. Therefore the label of an individual can be verified even
before an actual precompletion is generated. If an individual is found having an incon-
sistent label, the algorithm stops exploring the current search branch and backtracks to
the appropriate saved state of the precompletion process.
Taking this approach too far can be damaging if the terminological reasoner is
called too often. We decided to use the focusing/un-focusing of the precompletion
process on an individual as a trigger for the label verification. The label is verified when
all the constraints associated with an individual have been considered, before considering
a different individual. If during subsequent precompletion of the rest of the KB no new
constraints are added to an individual which has already been precompleted, there is no
reason to verify its label again.
Modal verification Backjumping is an essential technique for pruning the search
space; however, it works properly only if we have precise knowledge of the constraints
which generate a clash. Our problem is that in case of a clash detected by the termi-
nological reasoner, we cannot get the information about the constraints responsible for
the clash. We could assume that any concept in the label can be the cause of the clash,
but this would make the backjumping technique much less effective.
For example, the label {∃R.A, ∀R.B, ∃S.C, D} is inconsistent if the terminology
contains an axiom like A ⊑ ¬B. However, the inconsistency can only be detected
by the terminological reasoner. Once the unsatisfiability of the label is discovered, the
system cannot tell which element of the label caused the inconsistency; in principle it
can even be among the set {∃S.C, D}.
We can improve the algorithm by observing that if a clash has not been detected during
the precompletion, the only possible cause for an inconsistency must be associated
with an “anonymous” element whose existence is enforced by an existential quantification.
This is because the tree–like model property of the logic guarantees that no
constraints can be “pushed back” from these “anonymous” elements, and contradictions
generated in the propositional part of the labels (e.g. A and ¬A) are detected
during the precompletion process.
When the satisfiability of a label must be verified, the standard
algorithm builds a new concept by conjoining all the concepts in the label. With the
modal verification technique, the only concepts considered are the universal and existential
quantifications (i.e. ∀R.C and ∃R.C). The algorithm selects the smallest subsets
of the label which can be independently verified without compromising the completeness
of the algorithm. In the previous example the terminological reasoner is used to
verify the two concepts ∃R.A ⊓ ∀R.B and ∃S.C independently.
The basic idea for selecting these subsets is to consider each existential concept in a
different subset, adding the universal restrictions which can apply to it (i.e. those having
the same or a more general role name). However, this simple approach does not work
with SHf, because of the interaction between functional roles and the role hierarchy. In
fact, two (or more) of those “anonymous” elements may be forced to co-refer by a
common functional super-role (see Section 3.1.1, and Example 7.5). For example,
a label containing the two concepts ∃R.C and ∃S.¬C is not satisfiable if there is a
functional role F such that R ⊑ F and S ⊑ F. This contradiction would not be
detected by considering the two existential concepts separately.
The solution for this problem is to check together the existential concepts whose roles
share common functional super–roles. This must be extended to “chains” of roles like
R ⊑ F1, S ⊑ F1, S ⊑ F2, and P ⊑ F2 as well (see Example 7.5). Universal concepts
alone do not generate contradictions without the actual existence of a successor; i.e.
two concepts like ∀R.C and ∀R.¬C are not contradictory, unless they are combined
with a corresponding existential concept (or Abox assertion). Therefore they do not
need to be verified by themselves (they are considered by means of existential concepts,
or of the corresponding precompletion rule).
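The grouping of the modal part of a label can be sketched in Python as follows. The sketch rests on assumptions of our own: the function modal_groups is hypothetical, concepts are opaque values, and super_roles is taken to be the precomputed reflexive and transitive closure of the role hierarchy. Existential concepts are merged through a union-find structure whenever they share a functional super-role, which also covers “chains” of roles.

```python
def modal_groups(existentials, universals, super_roles, functional):
    """Split the modal part of a label into independently checkable groups.

    existentials, universals: lists of (role, concept) pairs
    super_roles: role -> set of its super-roles, the role itself included
    functional:  set of functional role names
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Existentials sharing a functional super-role may denote the same successor.
    for i, (role, _) in enumerate(existentials):
        find(("some", i))
        for sup in super_roles[role] & functional:
            parent[find(("some", i))] = find(("func", sup))

    groups = {}
    for i, (role, concept) in enumerate(existentials):
        group = groups.setdefault(find(("some", i)), {"some": [], "all": []})
        group["some"].append((role, concept))
    for group in groups.values():
        # A universal applies if its role is the same or more general.
        reachable = set().union(*(super_roles[r] for r, _ in group["some"]))
        group["all"] = [(r, c) for r, c in universals if r in reachable]
    return list(groups.values())


super_roles = {"R": {"R", "F1"}, "S": {"S", "F1", "F2"},
               "P": {"P", "F2"}, "T": {"T"}}
groups = modal_groups(
    existentials=[("R", "C"), ("S", "¬C"), ("P", "D"), ("T", "E")],
    universals=[("T", "G"), ("F1", "H")],
    super_roles=super_roles,
    functional={"F1", "F2"},
)
```

Each resulting group would then be conjoined and passed to the terminological reasoner on its own, so that an unsatisfiable answer points to a small set of constraints.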
Using this technique, in case of unsatisfiability we can narrow the set of involved
constraints. Therefore it can be extremely useful in conjunction with backjumping,
because it enables a more precise identification of the backtrack point responsible for
the clash.
Example 8.2
Let us consider the following variation of Example 8.1
{ x:(C1 ⊔ D1), ..., x:(Cn ⊔ Dn),
  x:∀R.A, x:∃R.(¬A ⊓ B) }
In this case backjumping alone would not provide any improvement because the generated
precompletion
{ x:(C1 ⊔ D1), ..., x:(Cn ⊔ Dn),
  x:C1, ..., x:Cn,
  x:∀R.A, x:∃R.(¬A ⊓ B) }
is associated with the single individual x. Therefore, the terminological reasoner is
invoked with the concept
(C1 ⊔ D1) ⊓ ... ⊓ (Cn ⊔ Dn) ⊓ C1 ⊓ ... ⊓ Cn ⊓ ∀R.A ⊓ ∃R.(¬A ⊓ B)
and the algorithm is unable to detect which constraints are the cause of the unsatisfiability.
By checking the modal part only, the terminological reasoner is invoked with the
concept ∀R.A ⊓ ∃R.(¬A ⊓ B). When the unsatisfiability is reported, the algorithm
knows that the clash is generated by one of the two constraints x:∀R.A, x:∃R.(¬A ⊓ B),
and the backjumping is more effective.
Instance checking by subsumption The terminological reasoner can sometimes be
used to avoid precompletion in case of instance checking, by using an approximation
of the Most Specific Concept technique (see Era and Donini [1992]). The idea is to
build a concept expression which is guaranteed to contain the individual, and is as
specific as possible. The trivial way of doing it is by conjoining all the concepts in the
label of the individual, but the role constraints can be considered as well to narrow the
expression by using existential quantification.
For example, from the set of constraints {x:A, ⟨x, y⟩:R, y:B} we know that the
individual x must be contained in the expression A ⊓ ∃R.B. This process can be carried
on for different levels of role assertions (e.g. y can be related to a third individual z or
even to x itself); we decided to build the MSC by stopping when a cycle is detected.
Now, let us assume that we calculated the MSC associated with the individual x and
we call it MSC(x). If we are interested in verifying whether the individual x is in the
extension of the concept C in every interpretation, we can first check if MSC(x) is
subsumed by C or by ¬C before performing any precompletion. In fact, if MSC(x) is
subsumed by C then x is an instance of C, while if it is subsumed by ¬C we know for
sure that x cannot be an instance of C in any interpretation satisfying the KB.
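A possible shape for this approximation is sketched below in Python. The functions msc and instance_by_subsumption, the tuple encoding of concepts, and the choice of closing a cycle with ∃R.⊤ are assumptions made for the illustration; the subsumption test itself is left to an external reasoner, passed in as subsumed_by.

```python
def msc(individual, labels, roles, visited=frozenset()):
    """Approximate the Most Specific Concept of an individual.

    labels: individual -> list of concepts asserted for it
    roles:  individual -> list of (role, successor) pairs
    """
    conjuncts = list(labels.get(individual, []))
    visited = visited | {individual}
    for role, successor in roles.get(individual, []):
        if successor in visited:
            # Cycle detected: stop, keeping only the existence of a successor.
            conjuncts.append(("some", role, "TOP"))
        else:
            conjuncts.append(("some", role, msc(successor, labels, roles, visited)))
    return ("and", *conjuncts)


def instance_by_subsumption(msc_expr, query, subsumed_by):
    """True or False when the MSC settles the question, None otherwise."""
    if subsumed_by(msc_expr, query):
        return True
    if subsumed_by(msc_expr, ("not", query)):
        return False
    return None                      # fall back to precompletion


labels = {"x": ["A"], "y": ["B"]}
roles = {"x": [("R", "y")], "y": [("S", "x")]}
msc_x = msc("x", labels, roles)

# Toy subsumption test: the query is one of the top-level conjuncts.
toy_subsumed_by = lambda expr, query: query in expr[1:]
settled = instance_by_subsumption(msc_x, "A", toy_subsumed_by)
unsettled = instance_by_subsumption(msc_x, "B", toy_subsumed_by)
```

When neither subsumption holds, the answer is unknown and the instance check must be solved by precompleting the relevant part of the Abox.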
8.2.7 Other techniques
Query caching If an instance check like ⟨T, A⟩ ⊨ a:C is verified, the assertion
a:C is a logical consequence of the KB; therefore the new KB ⟨T, A ∪ {a:C}⟩ is
equivalent to the original one. In addition, the proof for the original instance checking
problem can be rather involved.
The idea behind query caching is to add the results of positive instance
checking problems as new assertions in the KB. This can be useful in cases in which
the same query is issued again, or when a contradiction can be detected earlier because
of the new assertions.
Semantic branching Semantic branching is a technique for exploring disjoint branches
of the search space, when a nondeterministic rule must be applied (see Horrocks
and Patel-Schneider [1999]). In the standard approach, when a constraint containing a
disjunction constructor (i.e. x:(C1 ⊔ ... ⊔ Cn)) must be expanded, the ⊔–rule selects the first
disjunct and adds it to the constraint system (i.e. x:C1). If the new constraint causes a
contradiction, the algorithm backtracks and the second disjunct C2 is chosen. The process
continues until either a satisfiable precompletion is found, or all the possibilities have been exhausted.
This method is called syntactic branching, because it follows the syntactic order of the
disjuncts in the concept expression. However, the options are not necessarily disjoint,
so there is nothing to prevent the recurrence of unsatisfiable constraints in different
branches.
We have not implemented semantic branching in the way it is described in Horrocks
and Patel-Schneider [1999]. This is because it requires a completely different
backtracking mechanism, and we wanted a technique which could be easily activated
or deactivated. What we do instead is a modified version of the syntactic branching
where, every time a new disjunct is tried, the conjunction of the negations of the previous
failed choices is added as well. For example, if the first option x:C1 fails, the new
constraints x:C2 and x:¬C1 are tried. If the second choice fails as well, the third option
is x:C3 with the constraint x:(¬C1 ⊓ ¬C2), and so on until the last option. The idea is
that the negated concepts inserted would cause contradictions to be detected at an early
stage, pruning the useless part of the search space.
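The constraints added at each attempt can be enumerated with a short Python generator. The name branch_alternatives and the tuple encoding of negation are hypothetical; the negations of the failed disjuncts are emitted as separate constraints, which is equivalent to adding their conjunction.

```python
def branch_alternatives(disjuncts):
    """Yield, for each disjunct in syntactic order, the constraints to add."""
    failed = []
    for disjunct in disjuncts:
        # The new choice plus the negation of every choice that already failed.
        yield [disjunct] + [("not", f) for f in failed]
        failed.append(disjunct)


alternatives = list(branch_alternatives(["C1", "C2", "C3"]))
```

Being a plain extension of the syntactic order, the behaviour can be switched off simply by yielding the disjunct alone.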
Unfortunately, this implementation is far from optimal, because it may suffer the
overhead of evaluating the newly introduced constraints; in fact, with the tests we used
the optimisation did not improve the performance of the system (although it did not
degrade it either).
8.3 Experiments
The main purpose of the experiments we performed with the system was to verify the
feasibility of the approach in terms of performance. We compared our implementation
with Race (see Haarslev and Möller [2000a]) because it is the fastest available Abox
reasoner, and it provides a superset of SHf (it has unqualified number restrictions as
well). The comparison of Race with other DL systems can be found in Franconi et al.
[1998b].
8.3.1 DL benchmark suite
The main problem in performing experiments with an Abox reasoner is the lack of
real data (i.e. KBs). For our experiments we used the synthetic Abox tests of the DL
benchmark suite.5 The DL benchmark suite is a collection of tests for DL systems
which was introduced for the DL’98 workshop (see Horrocks and Patel-Schneider
[1998b]). The test suite was designed according to a structure which was first
used for the Comparison of Theorem Provers for Modal Logics at Tableaux ’98 (see
de Swart [1998]).
5 It is available at the URL http://kogs-www.informatik.uni-hamburg.de/~moeller/dl-benchmark-suite.html.
The test suite is mainly oriented towards terminological reasoning, because the
tests originated in the modal logic community. There are 9 classes of tests
(k_branch_n, k_d4_n, k_dum_n, k_grz_n, k_lin_n, k_path_n, k_ph_n, k_poly_n, k_t4p_n),
each one containing different instances of problems of increasing difficulty. Each
instance is automatically generated according to a schema related to the class, and it
consists of a Tbox, an Abox and a set of Abox queries.
Each instance is evaluated by loading the knowledge base (i.e. Tbox and Abox),
then the KB satisfiability is checked and all the Abox queries are evaluated. If the
system is unable to evaluate completely all the queries within a CPU time limit of 500
seconds, the test for the current instance is aborted.
For each class the benchmark starts with the easiest instance (the first) and con-
tinues until all the instances for the class are evaluated, or a timeout interrupts the
evaluation. The number of the latest instance the system has been able to evaluate is
recorded together with the CPU time spent to evaluate it, and a new class of problems
is considered.
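The evaluation procedure for one class of tests can be summarised by the following Python sketch. The function run_class and the evaluate callback are hypothetical names: evaluate stands for loading the KB, checking its satisfiability and answering all the queries, and is assumed to return the CPU seconds used or None on timeout.

```python
def run_class(instances, evaluate, time_limit=500.0):
    """Return (last_solved_instance_number, cpu_seconds) for one test class.

    instances: problem instances ordered by increasing difficulty
    evaluate:  callable(instance, time_limit) -> CPU seconds, or None on timeout
    """
    last, duration = 0, 0.0
    for number, instance in enumerate(instances, start=1):
        seconds = evaluate(instance, time_limit)
        if seconds is None:          # timeout: the class is abandoned
            break
        last, duration = number, seconds
    return last, duration


# Toy stand-in: each instance is just the CPU time it would need.
costs = [10.0, 100.0, 400.0, 900.0, 50.0]
fake_evaluate = lambda cost, limit: cost if cost <= limit else None
result = run_class(costs, fake_evaluate)
nothing_solved = run_class([900.0], fake_evaluate)
```

Note that later instances are never attempted once one of them exceeds the limit, even if they would have been easier.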
Abox test instances derive from the concepts generated for the Concept Satisfia-
bility Tests (see Horrocks and Patel-Schneider [1998b]). The Tbox is generated by
naming all the sub-concepts appearing in the test concept, while the Abox is generated
from the model generated by a satisfiability test over the concept itself.
This approach has the advantage of being well defined, so the result of each Abox
query is known in advance; this is crucial for verifying the correctness of the DL system.
On the other hand, it has several disadvantages that make the tests unsatisfactory
as examples of realistic problems. Firstly, the originating concepts contain a single role
name, and this is very unlikely in any real KB. The second problem is the terminology.
Since all the sub–concepts are recursively named the resulting Tbox is unfoldable (i.e.
acyclic and definitional, see Section 2.1.2), and this means that it can be considered
as an empty terminology. The last problem of the synthetic Abox tests is the fact that
Aboxes are built starting from a model generated by a tableaux based DL system for a
concept. The models these systems generate are not unstructured: they are
always tree shaped and mostly fully connected. Obviously they cannot be taken as a
representative sample for generic KBs.
As we are going to show in the next section, the inadequacy of the test data is
highlighted by some of the results of our experiments. Unfortunately, these synthetic
Abox tests are the only coherent suite for testing DL systems, so we must make the
most of them. In performing the experiments we were not interested in the absolute
performance, but in the actual behaviour of the system with different optimisation
features activated. However, the results must be considered in the light of their limitations.
We do not think that the available Abox tests reflect realistic KBs, so they cannot give
an estimate of the size/type of KB our system can handle.
8.3.2 Measurements
Our objective is to establish a correlation between the performance of the system on
different tests and the behaviour of the optimisation techniques. Some of the different
optimisations mentioned in Section 8.2 can be turned on and off by flags, and several
counters have been placed in crucial parts of the code to obtain a picture of the
behaviour of the system during the evaluation of a test. For example, we counted
the number of times the knowledge base is precompleted, in contrast with the times
the MSC test is sufficient to answer an instance check. Another counter registers the
number of backjumps which have been performed.
We assumed that each class of tests presents a different pattern, therefore we took
separate measurements for each class. The results are summarised in Table 8.2, Ta-
ble 8.3, and Table 8.4.
Results of the experiments are grouped by the class of tests, and each row shows
the results for the system with a particular set of optimisations. As a reference, the first
row of each class shows the results of the Race system. The following rows contain the
results obtained by our system with different configurations of optimisation techniques.
The results obtained by our system appear in decreasing order of performance, so the
first row always contains the best result, and the last the worst. We chose this order
because it highlights the impact of the different optimisations on the performance of
the system.
All the experiments have been performed on a machine equipped with a Celeron
450 MHz processor, with 256 MB of main memory and running Linux with kernel
version 2.2.17. Both systems are written in Common Lisp and run using Allegro
Common Lisp version 5.0.1.
We will first explain the meaning of each column in the tables, then we will try to
draw some conclusions from the results of the experiments.
The System field shows the active optimisations with our system (aFaCT) or the
version of the Race system that has been used. The six different optimisations which
can be activated are:6
LABEL-MODAL-CHECK enable the modal verification discussed in Section 8.2.6;
CHECK-BEFORE-PRECOMP enable the verification of the satisfiability of the la-
bel of an individual before starting to precomplete it (the verification at the end
is always performed);
MSC-CHECKING the Most Specific Concept of an individual is built and used for
the instance checking before the actual precompletion of the Abox;
BACKJUMPING enable the backjumping;
QUERY-CACHING enable query caching (see Section 8.2.7);
SEMANTIC-BRANCHING activate the semantic branching (see Section 8.2.7).
We could have tried every combination of the six flags, but it would have taken
too much time to run and produced a rather unmanageable volume of result data. Therefore we
decided to start with all the optimisations turned on, and disable a feature at each step.
This gives us seven different configurations of optimisations, named from opt6 to opt0.
With opt6, all the optimisations are active, while with opt0 they are all switched off.
In the configuration opt5 the MSC-CHECKING is turned off, in opt4 the QUERY-
CACHING, in opt3 the SEMANTIC-BRANCHING, in opt2 the CHECK-BEFORE-
PRECOMP, in opt1 the LABEL-MODAL-CHECK, and finally in opt0 the BACK-
JUMPING is turned off as well. The order is based on our experience with the
system: we first eliminated the options that we thought did not impact the performance.
The first flag MSC-CHECKING is an exception, because the use of subsumption to
perform instance checking is not an optimisation which impacts on the behaviour of
the algorithm.
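The seven configurations can be generated mechanically from the order in which the flags are disabled, as in the following Python sketch. The flag names are those listed above; the function configurations and the constant FLAGS_IN_REMOVAL_ORDER are introduced only for the illustration. Trying every combination would instead require 2^6 = 64 runs per class.

```python
FLAGS_IN_REMOVAL_ORDER = [
    "MSC-CHECKING", "QUERY-CACHING", "SEMANTIC-BRANCHING",
    "CHECK-BEFORE-PRECOMP", "LABEL-MODAL-CHECK", "BACKJUMPING",
]


def configurations(flags=FLAGS_IN_REMOVAL_ORDER):
    """Map opt6 ... opt0 to the set of flags still active in each one."""
    active = set(flags)
    configs = {f"opt{len(flags)}": frozenset(active)}
    for step, flag in enumerate(flags, start=1):
        active.discard(flag)         # one more optimisation switched off
        configs[f"opt{len(flags) - step}"] = frozenset(active)
    return configs


configs = configurations()
```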
6 The remaining techniques we mentioned in the previous section are always active.
The next two columns Last Instance and Duration show the last instance of the
class that has been solved, and how many seconds of CPU time have been used for
solving that instance. The following fields are the values of the different counters.
All-Instance-Checks shows the total number of instance checks that have been
performed, while Prec-Instance-Checks indicates the number of them which have
been performed by precompletion. Note that unless the MSC-CHECKING is active,
these two counters have the same value.
Backjumping counts the number of times the following options of a backtracking
point have been ignored because of the backjumping optimisation. The counter is
incremented at every backtrack point, so for a single failure more than one backtrack
point can be “skipped” (see the example in Section 8.2.4). Therefore its quantitative
value is not very significant; however, the order of magnitude gives an idea of how
much the backjumping is “active”. Its value compared with the number of clashes de-
tected gives an idea of how “deep” the backjumps are (how many backtracking points
are ignored in a single backtrack).
Call-Reasoner-Full-Label and Call-Reasoner-Modal-Part count the number of
times the terminological reasoner has been invoked to verify the satisfiability of a label.
Sat-Label on the other hand counts the total number of times the label has been ver-
ified. This number is typically higher than the sum of the two previous ones, because
the satisfiability value of the label could have been cached (counters Cached-Clash-
Label and Cached-Sat-Label), or a contradiction is immediately detected among the
concepts of the label (Simple-Clash or Bottom-Detect; the latter detects the
presence of an explicit ⊥ constraint in the label, usually added by concept
normalisation). Finally, the counter Label-Clash shows how many times the verification of the
satisfiability of a label returned a failure.
8.3.3 Notes on results
The first thing worth noticing is the fact that most of the instance checking tests can
be solved by terminological reasoning only, without precompleting the KB. This can
be observed by considering the value of the counter Prec-Instance-Checks when
the system has the MSC-CHECKING enabled (configuration opt6). In all classes
except k_d4_n and k_grz_n all the instance checks are solved without precompleting
the Abox, and even in those two classes over 98% of the instance checks are solved
8.3. EXPERIMENTS 185
Datas
et
Syste
m
Last
Insta
nce
Durat
ion
All-Ins
tanc
e-Che
cks
Prec-
Insta
nce-
Check
s
Backju
mpin
gCall
-Rea
sone
r-Full
-Lab
el
Call-R
easo
ner-M
odal-
Part
Sat-L
abel
Labe
l-Clas
h
Cache
d-Clas
h-La
bel
Cache
d-Sat
-Lab
elSim
ple-C
lash
Botto
m-D
etec
t
race
1.1r
93
159.
21aF
aCT
(opt
5)2
4.82
191
191
4715
016
615
876
1607
415
949
032
7651
50
aFaC
T(o
pt4)
28.
3919
119
146
854
184
1571
515
935
1576
60
3675
064
0aF
aCT
(opt
3)2
8.67
191
191
5351
518
418
920
1914
018
972
036
9125
70
aFaC
T(o
pt6)
29.
8319
10
5272
046
1857
518
651
1854
70
3090
105
0aF
aCT
(opt
2)2
1519
119
154
223
019
330
1934
219
251
012
9301
10
aFaC
T(o
pt1)
224
7.81
191
191
485
2988
60
2989
829
811
012
1016
060
kbranchn
aFaC
T(o
pt0)
10.
7851
510
1647
90
1649
115
932
012
6939
30
race
1.2
219
.57
aFaC
T(o
pt6)
27.
0510
71
2410
024
1869
645
7295
866
582
089
514
0880
0aF
aCT
(opt
5)2
63.7
610
710
721
164
3010
6809
472
262
6447
20
1158
1402
200
aFaC
T(o
pt4)
281
.21
107
107
2862
242
8994
608
1004
3489
689
015
3419
1498
2aF
aCT
(opt
3)2
81.3
510
710
726
903
4144
8472
590
360
7998
60
1489
1683
062
aFaC
T(o
pt2)
210
1.5
107
107
8155
032
585
3262
130
954
034
6524
92
aFaC
T(o
pt1)
210
6.51
107
107
2655
3281
40
3285
031
441
034
6538
02
kd4n
aFaC
T(o
pt0)
10.
2528
280
8482
90
8486
368
165
034
1098
500
race
1.2
1338
3.31
aFaC
T(o
pt6)
1617
7.79
2105
015
8574
866
047
6752
865
397
073
371
754
0aF
aCT
(opt
3)10
120.
870
970
912
8487
1186
713
2468
1547
2411
3318
2110
368
1176
260
aFaC
T(o
pt4)
1012
3.41
706
706
1219
7311
339
1184
1813
9571
1003
4621
9793
1042
430
aFaC
T(o
pt1)
1032
0.07
698
698
3891
317
9444
017
9634
1643
020
190
9100
20
aFaC
T(o
pt5)
819
0.19
418
418
6999
956
9285
812
9610
577
357
545
9610
8043
0aF
aCT
(opt
2)2
0.83
5151
1358
050
253
027
00
2831
50
kdumn
aFaC
T(o
pt0)
10.
9822
220
9554
095
6675
270
1239
190
Tabl
e8.
2:E
xper
imen
tssu
mm
ary
8.3. EXPERIMENTS 186
Datas
et
Syste
m
Last
Insta
nce
Durat
ion
All-Ins
tanc
e-Che
cks
Prec-
Insta
nce-
Check
s
Backju
mpin
gCall
-Rea
sone
r-Full
-Lab
el
Call-R
easo
ner-M
odal-
Part
Sat-L
abel
Labe
l-Clas
h
Cache
d-Clas
h-La
bel
Cache
d-Sat
-Lab
elSim
ple-C
lash
Botto
m-D
etec
t
race
1.2
1031
.49
aFaC
T(o
pt6)
894
.38
890
1529
7128
1321
453
772
1027
3625
080
016
133
1254
319
617
aFaC
T(o
pt3)
721
3.86
648
648
5328
3131
807
1471
6723
7731
8765
50
2981
015
3477
2894
7aF
aCT
(opt
4)7
217.
5965
665
675
9159
3812
518
6553
3017
1211
2041
038
531
1971
9338
503
aFaC
T(o
pt5)
725
3.73
688
688
4176
2117
823
8070
913
9834
4456
40
1998
261
967
2132
0aF
aCT
(opt
2)7
275.
0864
964
968
5282
017
7157
2085
8311
2345
019
623
9458
3123
0aF
aCT
(opt
1)6
214.
3246
046
059
124
6132
70
6488
247
533
015
766
190
3398
kgrzn
aFaC
T(o
pt0)
214
7.08
6161
014
8493
015
6337
9469
10
3022
9499
7814
race
1.2
439
5.28
aFaC
T(o
pt3)
450
.48
9393
076
1223
212
369
1223
40
6160
370
0aF
aCT
(opt
4)4
51.6
793
930
7612
782
1291
912
785
061
6310
10
aFaC
T(o
pt2)
452
.32
9393
840
1428
314
309
1423
90
2670
352
0aF
aCT
(opt
6)4
53.5
193
00
4414
881
1498
614
872
061
7340
50
aFaC
T(o
pt5)
454
.32
9393
076
1352
513
662
1353
00
6166
774
0aF
aCT
(opt
1)3
25.8
177
770
1785
70
1788
317
821
026
1411
70
klinn
aFaC
T(o
pt0)
328
.03
7777
014
345
014
371
1430
90
2613
922
0ra
ce1.
23
199.
12aF
aCT
(opt
3)4
122.
8331
4731
4710
805
2762
2528
6139
4507
084
957
880
aFaC
T(o
pt4)
412
7.4
3147
3147
1080
527
6225
2861
3945
070
849
5788
0aF
aCT
(opt
5)4
135.
1631
4731
4710
805
2762
2528
6139
4507
084
957
880
aFaC
T(o
pt2)
311
419
2119
2116
6280
028
724
2924
128
227
040
253
397
115
aFaC
T(o
pt6)
313
5.82
1455
010
805
611
2528
3988
2364
084
947
840
aFaC
T(o
pt0)
19.
2114
014
00
1217
50
1529
715
172
050
6031
3072
kpathn
aFaC
T(o
pt1)
19.
4614
114
130
1407
50
1412
814
003
050
6029
3
Tabl
e8.
3:E
xper
imen
tssu
mm
ary
8.3. EXPERIMENTS 187
[Table 8.4: Experiments summary. Columns: Dataset, System, Last Instance, Duration,
All-Instance-Checks, Prec-Instance-Checks, Backjumping, Call-Reasoner-Full-Label,
Call-Reasoner-Modal-Part, Sat-Label, Label-Clash, Cached-Clash-Label, Cached-Sat-Label,
Simple-Clash, Bottom-Detect. Rows: Race 1.2 and aFaCT (opt0 to opt6) on the datasets
k_ph_n, k_poly_n and k_t4p_n.]
by using the MSC.7 This is a clear indication that the tests are not adequate for fully
evaluating an Abox reasoner.
Note that using the MSC is not always the most efficient way of performing instance
checking. Although in some cases it provides remarkably good results (e.g. k_dum_n
and k_t4p_n), in others it causes a significant loss of performance (e.g. k_poly_n
and k_ph_n).
Unsurprisingly, no particular configuration of optimisations is best in all experiments.
Different classes of problems seem to require different optimisations for better
performance, which suggests that a good approach could be an adaptive system that
activates or deactivates optimisations according to the behaviour of the system with
a given KB.
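The adaptive idea can be pictured with a minimal sketch. It is only an illustration under stated assumptions, not something implemented in the system described here: `choose_config`, `probe` and the cost values are hypothetical, and a real probe would have to run the reasoner on the given KB for a bounded time and derive a cost from the elapsed time or from counters such as Backjumping.

```python
from typing import Callable, Iterable

# Hypothetical representation: a configuration is the set of enabled optimisations.
Config = frozenset


def choose_config(
    candidates: Iterable[Config],
    probe: Callable[[Config], float],
) -> Config:
    """Return the candidate with the lowest probed cost (first one wins ties)."""
    best, best_cost = None, float("inf")
    for config in candidates:
        cost = probe(config)  # assumed: a short, bounded trial run on the KB
        if cost < best_cost:
            best, best_cost = config, cost
    if best is None:
        raise ValueError("no candidate configurations given")
    return best


# Toy usage with made-up costs; these numbers are NOT taken from the experiments.
toy_costs = {
    frozenset({"backjumping"}): 5.0,
    frozenset({"backjumping", "label-modal-check"}): 2.0,
    frozenset(): 9.0,
}
selected = choose_config(toy_costs, toy_costs.__getitem__)
print(sorted(selected))
```

Selecting once per KB keeps the sketch simple; a system that re-evaluates the choice while reasoning would need a way to switch optimisations without invalidating cached results.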
Having said that, nondeterminism is definitely the main source of slowdown in the
system, and optimisations directed at reducing nondeterminism (such as backjumping)
seem to provide uniformly better results. This is clearly indicated by the fact that
the “activeness” of the backjumping (counter Backjumping) is one of the few indicators
that can be clearly related to performance (see for example k_branch_n, but the same
pattern can be spotted in other classes). Moreover, as predicted in Section 8.2.4,
backjumping alone (configuration opt1) does not seem to provide good results unless
LABEL-MODAL-CHECK is activated as well. This is clearly indicated by the ratio of the
two counters Call-Reasoner-Full-Label and Call-Reasoner-Modal-Part w.r.t. the counter
Backjumping.
The rest of the counters do not seem to show a direct correlation with performance.
This probably indicates that the optimisations they are monitoring do not significantly
influence the behaviour of the system (at least with the available Abox tests).
The last thing we would like to mention is that we really did not expect our system
to be faster than the Race system, even in a few classes. We had this impression
because we know that Race is no slower than FaCT as a terminological reasoner, and
it uses the very same engine for both Abox and terminological reasoning. On close
inspection, the precompletion technique uses transformation rules which are very
similar to the ones implemented in Race; the difference lies in the fact that our
system makes a sharp distinction between rules working with the Abox (the
precompletion rules) and rules working at the terminological level (delegated to the
terminological reasoner FaCT).
7 The reader may be puzzled by the fact that the rest of the counters are not equal to zero; this is explained by the fact that the satisfiability of the KB is verified before performing any instance checking tests.
A system without this separation (like Race) has the possibility to maximise the
effect of optimisations like backjumping because it has better information about con-
straints causing contradictions (something we tried to simulate by verifying the modal
part of the label only). We think that with classes like k branch n it is this fact that
makes the real difference in performance. In addition, Race is a more mature system
and it has more optimisation techniques implemented (for example model merging,
see Haarslev and Möller [2000a]).
For these reasons, we did not expect to outperform Race for any test having a small
and almost fully connected Abox, where partitioning and memory occupation do not
make any difference. Obviously, it could be the case that some of the results exhibited
by our system depend on the peculiar structure of the test problems. However, if this
behaviour is exhibited even with general KBs, it means that there is some unneces-
sary overhead in the hybrid reasoning in Race which can be reduced by separating the
treatment of Abox and “terminological” constraints. This would suggest that the per-
formance of tableaux–based algorithms, such as the one implemented in Race, can be
improved by using an evaluation strategy more similar to the precompletion strategy
(e.g. working on individuals first).
Chapter 9
Conclusions
This chapter summarises the results presented in this thesis, and suggests some
directions for future work.
9.1 Thesis contributions
The aim of our work is to study basic reasoning techniques for DL knowledge bases
with Abox, and to determine whether they can lead to practical tractability. This thesis
investigates two reasoning problems, namely KB satisfiability and query answering.
9.1.1 KB satisfiability
Different techniques have been considered, and we chose to investigate the possibility
of extending the precompletion technique in order to provide an algorithm for the DL
SHf.
By exploiting the tree–like model property of SHf, it has been proved that the technique
leads to a correct and complete algorithm for KB satisfiability. An experimental
DL reasoner based on the studied algorithm has been developed, and some experiments
have been performed to evaluate the behaviour of the system.
The experiments provided evidence of the fact that, even with an optimised termi-
nological reasoner, different optimisation techniques must be used in the precomple-
tion phase in order to obtain acceptable performance. Although we have not provided
formal proofs of correctness and completeness of the optimisations used, we have pro-
vided arguments supporting the fact that their use does not affect the correctness and
completeness of the underlying algorithm.
The system developed, although experimental, exhibits good performance compared to a
state-of-the-art DL system. As pointed out in Chapter 8, we think that the ideas
behind the precompletion technique may be used to improve the performance of
tableaux-based systems like Race.
9.1.2 Query answering
The language for querying KBs has always been one of the weakest points of DL
systems; recently, there have been some attempts to find a solution to this inadequacy.
In this thesis we develop a technique for answering conjunctive queries over SHf KBs.
Again, the technique relies on a careful use of the model–theoretic properties of
SHf. A class of interpretations has been identified which is complete w.r.t. the
problem of conjunctive query answering in SHf. The properties of this class have been
used to devise an algorithm for query answering and to prove its correctness and
completeness.
Although the results which have been presented here are specific to the DL SHf,
the technique is general enough to provide query answering algorithms for a wide
range of DLs. In addition, since the problem is reduced to KB satisfiability, the algo-
rithm described can easily be implemented on top of most of the available DL systems.
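As a reminder of what such a reduction looks like in the simplest case, where the query is a single concept assertion, the standard equivalence is sketched below. The symbols K, C and a are illustrative names, and the reduction developed in this thesis for full conjunctive queries involves more than this special case.

```latex
\mathcal{K} \models C(a)
\quad\Longleftrightarrow\quad
\mathcal{K} \cup \{\, \neg C(a) \,\} \text{ is unsatisfiable}
```

Any system offering a KB satisfiability service can therefore answer this kind of query with one satisfiability test.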
9.2 Future work
There are several research directions which stem from the work presented in this thesis.
KB satisfiability On the KB satisfiability front it is natural to ask ourselves whether
the precompletion technique can be extended to more expressive DLs. We think that
this technique can easily be adapted to most of the DLs having the tree–model property
(or a tree–like property like the one detailed in Chapter 3). For example, the addition of
(qualified) number restrictions should not be too difficult. On the other hand, there are
DL constructors that appear to be much more difficult (or even impossible) to handle
in the precompletion framework, for example the inverse role constructor.
The experiments with the developed DL system show that there is still a need for
an adequate test suite for Abox reasoners. The fact that DL systems
providing Abox reasoning are becoming fast enough to be used in applications may
encourage the production of such tests.
Since the experiments show that different techniques must be incorporated in the
basic algorithm to obtain acceptable performance, there is still some theoretical work
to do. We provided arguments supporting the correctness and completeness of the
modified algorithm, but ideally this should be shown by means of formal proofs.
Query answering Concerning the query answering problem, there is still a long way
to go before having usable systems. Moreover, there are still some restrictions on the
query language for SHf which should be lifted (if possible). Some conjectures about
the solution to these problems have been made, and they will be investigated in the
near future.
Providing a completeness result for the complexity of query answering is an impor-
tant task that would provide a useful theoretical tool for evaluating query algorithms.
Before the actual implementation of an experimental system, we think that the pos-
sibility of optimising the nondeterminism in the query transformation must be carefully
considered. In fact, although the algorithm as it is can probably handle small Aboxes,
it is not suitable for large ones. In addition, the experience with the results obtained by
optimising the terminological and hybrid reasoning encourages research in this promis-
ing field.
Bibliography
S. Abiteboul, R. Hull, and V. Vianu. Foundations of databases. Addison-Wesley, 1995.
C. Areces, H. de Nivelle, and M. de Rijke. Prefixed resolution: A resolution method for
modal and description logics. In Proceedings of CADE 99, pages 187–201, January
1999.
F. Baader. Augmenting concept languages by transitive closure of roles: An alternative
to terminological cycles. In Proceedings of the 12th International Joint Conference
on Artificial Intelligence, pages 446–451. Morgan Kaufmann, 1991.
F. Baader and B. Hollunder. A terminological knowledge representation system with
complete inference algorithm. In Proceedings of the Workshop on Processing
Declarative Knowledge, pages 67–86, 1991a.
F. Baader and B. Hollunder. KRIS: Knowledge representation and inference system.
SIGART Bulletin, 2(3):8–14, 1991b.
F. Baader and U. Sattler. Number restrictions on complex roles in description logics:
A preliminary report. In Luigia Carlucci Aiello, Jon Doyle, and Stuart C. Shapiro,
editors, Proc. of the fifth International Conference on Principles of Knowledge Rep-
resentation and Reasoning, Cambridge, Massachusetts, USA, 1996. Morgan Kauf-
mann.
A. B. Baker. Intelligent Backtracking on Constraint Satisfaction Problems: Experi-
mental and Theoretical Results. PhD thesis, University of Oregon, 1995.
A. Borgida. On the relationship between description logic and predicate logic. In Pro-
ceedings of the Third International Conference on Information and Knowledge Man-
agement (CIKM’94), Gaithersburg, Maryland, November 29 - December 2, 1994,
pages 219–225. ACM, 1994.
A. Borgida and P. F. Patel-Schneider. A semantic and complete algorithm for subsump-
tion in the classic description logic. Journal of Artificial Intelligence Research, 1,
1994.
M. Buchheit, F. M. Donini, and A. Schaerf. Decidable reasoning in terminological
knowledge representation systems. Journal of Artificial Intelligence Research, 1:
109–138, 1993.
D. Calvanese, G. De Giacomo, and M. Lenzerini. On the decidability of query contain-
ment under constraints. In Proceedings of the Seventeenth ACM SIGACT SIGMOD
SIGART Symposium on Principles of Database Systems (PODS-98), 1998a.
D. Calvanese, G. De Giacomo, and M. Lenzerini. Answering queries using views over
description logics knowledge bases. In Proceedings of the 16th National Conference
on Artificial Intelligence (AAAI 2000), pages 386–391, 2000.
D. Calvanese, M. Lenzerini, and D. Nardi. Description logics for conceptual data
modeling. In Jan Chomicki and Gunter Saake, editors, Logics for Databases and
Information Systems, pages 229–264. Kluwer Academic Publisher, 1998b.
A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in
relational data bases. In Conference Record of the Ninth Annual ACM Symposium
on Theory of Computing, pages 77–90, 1977.
G. De Giacomo. Eliminating “converse” from Converse PDL. Journal of Logic, Language
and Information, 5:193–208, 1996.
G. De Giacomo and M. Lenzerini. Boosting the correspondence between description
logics and propositional dynamic logics. In AAAI-Press/MIT-Press, editor, Pro-
ceedings of the 12th National Conference on Artificial Intelligence (AAAI’94), pages
205–212, 1994.
G. De Giacomo and M. Lenzerini. TBox and ABox reasoning in expressive descrip-
tion logics. In Proceedings of the Fifth International Conference on the Principles of
Knowledge Representation and Reasoning (KR-96), pages 316–327. Morgan Kauf-
mann Publishers, 1996.
G. De Giacomo and M. Lenzerini. A uniform framework for concept definitions in
description logics. Journal of Artificial Intelligence Research, 6:87–110, 1997.
G. De Giacomo and F. Massacci. Combining deduction and model checking into
tableaux and algorithms for Converse-PDL. Information and Computation, 1998.
H. de Swart, editor. Automated Reasoning with Analytic Tableaux and Related Meth-
ods: International Conference Tableaux’98, number 1397 in LNAI, 1998. Springer.
F. Donini, G. De Giacomo, and F. Massacci. EXPTIME tableaux for ALC. In
L. Padgham, E. Franconi, M. Gehrke, D. L. McGuinness, and P. F. Patel-Schneider,
editors, Collected Papers from the International Description Logics Workshop
(DL’96), number WS-96-05 in AAAI Technical Report, pages 107–110. AAAI
Press, Menlo Park, California, 1996a.
F. M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. Reasoning in description logics.
In Foundation of Knowledge Representation, pages 191–236. CSLI-Publications,
1996b.
F.M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. Deduction in concept languages:
From subsumption to instance checking. Journal of Logic and Computation, 4(4):
423–452, 1994.
Francesco M. Donini, M. Lenzerini, D. Nardi, and W. Nutt. The complexity of concept
languages. Information and Computation, 134(1):1–58, 1997.
Q. Elhaik and M. C. Rousset. Making an abox persistent. In Franconi et al. [1998b].
A. Era and F. Donini. Most specific concepts for knowledge bases with incomplete
information. In International Conference on Information and Knowledge Manage-
ment, Baltimore, November 1992.
E. Franconi. Description logics for natural language processing. In Working Notes of
the 1994 AAAI Fall Symposium on Knowledge Representation for Natural Language
Processing in Implemented Systems, 1994.
E. Franconi, G. De Giacomo, I. R. Horrocks, D. L. McGuinness, W. Nutt, P. F. Patel-
Schneider, and C. A. Welty. Report on the 1998 intl. workshop on description logics
(DL’98), 1998a.
Enrico Franconi, G. De Giacomo, Robert M. MacGregor, Werner Nutt, and
Christopher A. Welty, editors. Proceedings of the 1998 International Workshop on
Description Logics (DL’98), 1998b. CEUR Publication.
E. Grädel. Why are modal logics so robustly decidable? Bulletin of the European
Association for Theoretical Computer Science, 68:90–103, 1999.
V. Haarslev and R. Möller. An empirical evaluation of optimization strategies for abox
reasoning in expressive description logics. In Lambrix et al. [1999], pages 115–119.
V. Haarslev and R. Möller. Consistency testing: The RACE experience. In Proceedings
of TABLEAUX 2000, volume 1847 of Lecture Notes in Computer Science, pages
57–61. Springer, 2000a.
V. Haarslev and R. Möller. Expressive abox reasoning with number restrictions, role
hierarchies, and transitively closed roles. In A.G. Cohn, F. Giunchiglia, and B. Selman,
editors, Proceedings of the Seventh International Conference on Knowledge Repre-
sentation and Reasoning (KR2000), pages 273–284. Morgan Kaufmann, 2000b.
V. Haarslev and R. Möller. High performance reasoning with very large knowledge
bases. In Franz Baader and Ulrike Sattler, editors, Proceedings of the 2000 Interna-
tional Workshop on Description Logics (DL2000), pages 143–152, 2000c.
B. Hollunder. Consistency checking reduced to satisfiability of concepts in terminolog-
ical systems. Annals of Mathematics and Artificial Intelligence, 18:95–131, 1996.
I. Horrocks. Optimising Tableaux Decision Procedures for Description Logics. PhD
thesis, University of Manchester, 1997.
I. Horrocks. Using an expressive description logic: FaCT or fiction? In Proc. of the 6th
Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR’98),
1998.
I. Horrocks and P. F. Patel-Schneider. Comparing subsumption optimizations. In Fran-
coni et al. [1998b].
I. Horrocks and P. F. Patel-Schneider. DL system comparison. In Franconi et al.
[1998b].
I. Horrocks and P. F. Patel-Schneider. Optimising description logic subsumption. Jour-
nal of Logic and Computation, 9(3):267–293, 1999.
I. Horrocks and U. Sattler. A description logic with transitive and inverse roles and
role hierarchies. In Franconi et al. [1998b].
I. Horrocks and U. Sattler. A description logic with transitive and inverse roles and
role hierarchies. Journal of Logic and Computation, 9(3):385–410, 1999.
I. Horrocks, U. Sattler, S. Tessaris, and S. Tobies. Query containment using a DLR
ABox. LTCS-Report 99-15, LuFG Theoretical Computer Science, RWTH Aachen,
Germany, 1999a.
I. Horrocks, U. Sattler, S. Tessaris, and S. Tobies. How to decide query containment un-
der constraints using a description logic. In Logic for Programming and Automated
Reasoning (LPAR 2000), volume 1955 of Lecture Notes in Computer Science, pages
326–343. Springer, 2000a.
I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for expressive description
logics. In H. Ganzinger, D. McAllester, and A. Voronkov, editors, Proceedings of the
6th International Conference on Logic for Programming and Automated Reasoning
(LPAR’99), number 1705 in Lecture Notes in Artificial Intelligence, pages 161–180.
Springer-Verlag, 1999b.
I. Horrocks, U. Sattler, and S. Tobies. Reasoning with individuals for the description
logic SHIQ. In David McAllester, editor, Proceedings of the 17th International
Conference on Automated Deduction (CADE-17), number 1831 in Lecture Notes In
Artificial Intelligence, pages 482–496. Springer-Verlag, 2000b.
I. Horrocks and S. Tessaris. A conjunctive query language for description logic aboxes.
In National Conference on Artificial Intelligence (AAAI 2000), pages 399–404.
American Association for Artificial Intelligence, 2000.
I. Horrocks and S. Tobies. Reasoning with axioms: Theory and practice. In Proceed-
ings of the 7th International Conference on the Principles of Knowledge Represen-
tation and Reasoning (KR’2000), pages 285–296, 2000.
U. Hustadt and R. A. Schmidt. Issues of decidability for description logics in the
framework of resolution. In R. Caferra and G. Salzer, editors, Automated Deduction
in Classical and Non-Classical Logics, volume 1761 of Lecture Notes in Artificial
Intelligence, pages 191–205. Springer, 2000.
KRSS. Description-logic knowledge representation system specification, November
1993.
Patrick Lambrix, Alex Borgida, M. Lenzerini, Ralf Möller, and P. Patel-Schneider,
editors. Proceedings of the 1999 International Workshop on Description Logics (DL’99),
1999. CEUR Publication.
A. Y. Levy and M. C. Rousset. CARIN: A representation language combining Horn
rules and description logics. In Proceedings of the Twelfth European Conference on
Artificial Intelligence (ECAI-96). John Wiley and Sons, 1996a.
A. Y. Levy and M. C. Rousset. The limits on combining recursive Horn rules and
description logics. In Proceedings of the AAAI Thirteenth National Conference on
Artificial Intelligence 1996, 1996b.
C. Lutz and U. Sattler. The complexity of reasoning with boolean modal logic. In
Advances in Modal Logic 2000 (AiML 2000), Leipzig, Germany, 2000.
R. M. MacGregor and D. Brill. Recognition algorithms for the LOOM classifier. In
Proceedings of AAAI-92, pages 774–779. AAAI Press, 1992.
B. Nebel. Terminological cycles: Semantics and computational properties. In J. Sowa,
editor, Principles of Semantic Networks. Morgan Kaufmann, 1991.
P. F. Patel-Schneider. DLP system description. In Franconi et al. [1998b], pages 87–89.
R. Reiter. On closed world data bases. In H. Gallaire and J. Minker, editors, Logic and
Data Bases, pages 55–76, 1977.
R. Reiter. Towards a logical reconstruction of relational database theory. In Michael L.
Brodie, John Mylopoulos, and Joachim W. Schmidt, editors, On Conceptual Mod-
elling, Perspectives from Artificial Intelligence, Databases, and Programming Lan-
guages, Topics in Information Systems, pages 191–233. Springer, 1984.
M. C. Rousset. Backward reasoning in aboxes for query answering. In Enrico
Franconi and Michael Kifer, editors, Knowledge Representation meets Databases
(KRDB’99), volume CEUR-WS/Vol-21. CEUR, 1999.
U. Sattler. A concept language extended with different kinds of transitive roles. In
G. Görz and S. Hölldobler, editors, 20. Deutsche Jahrestagung für Künstliche
Intelligenz, number 1137 in Lecture Notes in Artificial Intelligence. Springer Verlag,
1996.
A. Schaerf. Reasoning with individuals in concept languages. Data and Knowledge
Engineering, 13(2):141–176, 1994.
K. Schild. A correspondence theory for terminological logics: Preliminary report. In
Proceedings of the 12th International Joint Conference on Artificial Intelligence,
pages 466–471, 1991.
M. Schmidt-Schauss and G. Smolka. Attributive concept descriptions with comple-
ments. Artificial Intelligence, 48:1–26, 1991.
R. S. Streett. Propositional dynamic logic of looping and converse is elementarily
decidable. Information and Control, 54(1/2):121–141, July/August 1982.
T. Tammet. Using resolution for extending KL-ONE-type languages. In CIKM ’95,
Proceedings of the 1995 International Conference on Information and Knowledge
Management, November 28 - December 2, 1995, Baltimore, Maryland, USA, pages
326–332. ACM, 1995.
S. Tessaris and G. Gough. Abox reasoning with transitive roles and axioms. In Lambrix
et al. [1999].
M. Y. Vardi. Why is modal logic so robustly decidable? Technical Report TR97-274,
Rice University, 1997.
W. A. Woods and J. G. Schmolze. The KL-ONE family. Computers and Mathematics
with Applications, special issue: Semantic Networks in Artificial Intelligence, 23
(2-5):133–177, March-May 1992.