
Learning from Ordinal Data with ILP in Description Logic

Nunung Nurul Qomariyah and Dimitar Kazakov

Computer Science, University of York, UK

Presented at the 27th International Conference on Inductive Logic Programming, Orléans, France, 4-6 September 2017



Page 2

THE OUTLINE

1. Introduction
2. Problem representation
3. Proposed algorithm
4. Evaluation
5. Conclusions and further work

Page 3

INTRODUCTION

ILP algorithms using DL representation:
– Have the potential to be applied to large volumes of linked open data
– Benefit from the tools available for such data
• e.g. IDEs such as Protégé, DB systems such as triple stores, and ontology reasoners

Page 4

INTRODUCTION

Previous work on ILP in DL includes:
– DL-Learner [1], YinYang [2], DL-FOIL [3], and the work of Kietz [4] and Konstantopoulos [5]
– These systems mostly aim to learn concept descriptions

Page 5

INTRODUCTION

The application area: Preference Learning (PL)
– PL [6] aims to induce predictive user preference models from empirical data.

…several benefits of representing both data and models in a logic-based language, as this allows for the use of reasoner tools that can infer logical consequences from a given knowledge base. Our method employs an ontology reasoner to recommend items consistent with the user preference hypotheses produced by ILP. This approach has the potential to make suggestions about items that have never been explicitly discussed with the user.

In summary, our contribution in this paper is as follows:
• We propose a new approach to learning preferences from multi-attribute items for recommender systems based on Inductive Logic Programming (ILP).
• We propose a new architecture for knowledge representation and inference using Semantic Web Rule Language (SWRL) and an ontology reasoner.
• We also describe a way to tune the settings of our learning algorithm based on experiments with a real-world dataset in order to improve system performance.

We divide the rest of the paper as follows. In Section 2, we explain the details of our proposed approach. We describe the dataset and the experiments carried out in Section 3. Then, we discuss the results in Section 4. We explore related work in Section 5. Finally, we conclude and suggest future work in Section 6.

2. PROPOSED APPROACH

We have guided our choice of learning algorithm by the need for an expressive representation formalism and a learning algorithm capable of handling a variety of hypotheses based on the user preferences, along with the desire to be able to learn robust hypotheses from a limited number of examples and express the result in a human-readable form. While other researchers [9] have used linear SVM to approximate user preferences, we opted for the flexibility of Inductive Logic Programming. In addition, the performance of our system is boosted through the use of constraints on the range of hypotheses considered, which reduces the time complexity of the learning task. We divide this section into four subsections explaining each step in more detail.

2.1 Problem Formalization

Arguably, the use of data that genuinely reflects the user's preferences is essential for the success of any recommender system. Therefore we have opted for a form of knowledge elicitation that minimises the subjectivity of the user's replies by limiting the complexity of the query asked and restricting the feedback provided to qualitative information alone. In practice, this is achieved through queries consisting of pairs of items along with their descriptions, where the user only needs to select the better of the two items. Such pairwise comparisons are used to learn which items will be classified as "Good", i.e. ones that the user would consider buying. We use the user's answers to classify the unlabelled data and make a prediction about classes. We illustrate the general annotation process in Figure 1.

The figure shows how we derive conclusions about preferences regarding individual attributes from data pairs of the form "Car 1 is-better-than Car 2". The bold arrow represents the annotation from the user and the dotted arrows show possible implications about individual attributes that the learning algorithm will consider. Note that in general, ILP makes it possible to compare combinations of attributes, e.g. ⟨price1, mileage1⟩ vs. ⟨price2, mileage2⟩, through the use of appropriately defined relations (so-called background knowledge), but this aspect of ILP is not explored here. The way in which we build hypotheses is explained in more detail in Section 2.2.

Figure 1: User annotation (car 1 and car 2, each described by its price, mileage, year and body type, linked by a bold better-than arrow)

Definition 1 (Items). An item I is described by a set of attribute names and their values: {A1 = v1, A2 = v2, ..., An = vn}.

Definition 2 (Comparison Pair). Given a set of items E, we define a comparison pair P as any (e, e′) ∈ E × E, e ≠ e′. We shall then refer to the variable represented by the attribute A of the first element of the pair as A_first, while A_second will refer to the variable represented by the attribute A of the second element of the pair. The values of these variables will be denoted as value(A_first), resp. value(A_second).

Definition 3 (User Annotation). The annotation provided by the user is binary: a given pair (e, e′) is given the class label 1 if the user considers e to be better than e′, or the class label is set to 0 if the user considers e′ to be better than e. The relationship better than represents strict inequality, and the user is forced to choose between these two alternatives. We therefore define a predicate C, such that:

C(⟨e, e′⟩) = 1 if e is better than e′, and C(⟨e, e′⟩) = 0 if e′ is better than e.   (1)

Definition 4 (Training Examples). The set of training examples S consists of the union of all pairs ⟨e, e′⟩ such that C(⟨e, e′⟩) = 1, along with all pairs ⟨e′, e⟩ such that C(⟨e, e′⟩) = 0.

Now we state our main learning task as:

Definition 5 (Learning Problem). Find a model T that is consistent with the set of training examples S.
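Definitions 1-4 are straightforward to operationalise; the following minimal Java sketch (illustrative names, not the authors' code) shows how annotated pairs yield the ordered training set S of Definition 4:

import java.util.*;

public class TrainingSetBuilder {

    // Definition 2: a comparison pair (e, e'), identified here by item IDs.
    record Pair(String first, String second) {}

    // Definition 4: pairs labelled 1 are kept as-is; pairs labelled 0 are
    // flipped so that the preferred item always comes first.
    static List<Pair> buildTrainingSet(Map<Pair, Integer> annotations) {
        List<Pair> s = new ArrayList<>();
        for (Map.Entry<Pair, Integer> a : annotations.entrySet()) {
            Pair p = a.getKey();
            s.add(a.getValue() == 1 ? p : new Pair(p.second(), p.first()));
        }
        return s;
    }

    public static void main(String[] args) {
        Map<Pair, Integer> annotations = new LinkedHashMap<>();
        annotations.put(new Pair("car1", "car2"), 1); // user: car1 better than car2
        annotations.put(new Pair("car1", "car3"), 0); // user: car3 better than car1
        System.out.println(buildTrainingSet(annotations));
        // [Pair[first=car1, second=car2], Pair[first=car3, second=car1]]
    }
}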

2.2 Learning Algorithm

We use the annotated data as input for our learning algorithm. We build an algorithm that searches the space of possible hypotheses starting from the most general hypotheses, i.e. the ones based on the least number of constraints, and progresses towards the most specific rule possible, given by a Progol-like bottom clause [8]. The difference between the ILP system Progol and ours is that Progol searches the hypothesis space in a greedy way, throwing away all positive examples that are already covered by the hypothesis, while we derive all parts of the hypothesis in a cautious way…

Page 6

INTRODUCTION

The application area: Recommender Systems
– We have previously published a workshop paper at ACM RecSys 2017, Como, Italy

Page 7

PROBLEM REPRESENTATION

The objective of this work: to learn transitive anti-reflexive relations (see the sketch after this list):

– Transitive:
• We use the examples provided by the user, along with their transitive closure, in their correct order (e.g. "car A is better than car B") as positive examples
• Example of transitive closure:
– User provides: "car A is better than car B"
– User also provides: "car B is better than car C"
– We add a closure: "car A is better than car C" as a positive example.

– Anti-reflexive:
• We use the same examples in reverse order as negative examples (e.g. "car B is better than car A")
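A minimal Java sketch of this example-expansion step (a hypothetical helper, not the authors' implementation; in the actual system the transitive pairs are inferred by the ontology reasoner):

import java.util.*;

public class PreferenceClosure {

    record Pair(String better, String worse) {}

    // Repeatedly add (a,c) whenever (a,b) and (b,c) are present, until fixpoint.
    static Set<Pair> transitiveClosure(Set<Pair> pairs) {
        Set<Pair> closed = new HashSet<>(pairs);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Pair p : new ArrayList<>(closed))
                for (Pair q : new ArrayList<>(closed))
                    if (p.worse().equals(q.better())
                            && closed.add(new Pair(p.better(), q.worse())))
                        changed = true;
        }
        return closed;
    }

    public static void main(String[] args) {
        // Positive examples: the user's pairs plus their transitive closure.
        Set<Pair> positives = transitiveClosure(Set.of(
                new Pair("carA", "carB"), new Pair("carB", "carC")));
        // Negative examples: every positive pair in reverse order.
        Set<Pair> negatives = new HashSet<>();
        for (Pair p : positives) negatives.add(new Pair(p.worse(), p.better()));
        System.out.println(positives); // contains (carA, carC) as well
        System.out.println(negatives);
    }
}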

Page 8

PROBLEM REPRESENTATION

Hypothesis language

In Aleph, and as an object property in RDF/XML:


…quantification. DL-Learner is quite close to our work: while it learns about concepts, our goal is to learn Domain and Range axioms of a specific object property. DL-Learner is an improvement over previous ILP implementations in DL, such as YinYang [9] and DL-FOIL [10]. Kietz [11] and Konstantopoulos [12] also performed similar work in DL.

5.1.3 Problem Representation

We develop an algorithm that adapts ILP to learn relations in DLs. In DL terms, our problem is defined as follows:

Given: a set of individuals and classes (represented in an ontology hierarchy) and a set of preference values (represented as an asymmetric transitive object property),

Find: the best axioms (in the form of a conjunction of classes) for both the Domain and the Range of a specific object property (betterthan) that is complete and consistent with the preferences given.

This is a supervised learning problem, where the user labels each item pair by comparing their attributes. We then use the labels as guidance when searching for a class description. The class description that maps the left side of the object property to the right side of it is known as the Domain and Range axioms. In our preference learning problem, any individual that belongs to the class description in the Domain is preferred by the user over any individual belonging to the class in the Range. The order of preferences here is anti-symmetric, which means that if item A is better than item B, then item B cannot be better than item A. It is also transitive, which means that whenever item A is better than item B, and item B is better than item C, then item A is better than item C. We use the reasoner to retrieve the full transitive chain of the object properties; we treat these as additional pairs and include them in the positive examples. For the anti-symmetric property, we handle it by treating all relations in the opposite order as negative examples. Figure 5.1 shows how we represent our problem in Protégé.

5.1.3.1 Hypothesis language

The aim of ILP is to find a theory that is complete (it covers all the given positive examples) and consistent (it covers no negative examples). We show how the hypothesis language is represented as mode declarations in Aleph:

:- modeh(1,betterthan(+car,+car)).

:- modeb(1,hasbodytype(+car,#bodytype)).

We build the hypothesis by specifying which object property in the given ontology we want to learn. For example, to learn the object property betterthan:

<owl:ObjectProperty rdf:ID="betterthan"/>
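Since the system is implemented in Java with the OWL API (as stated in Section 5.2), declaring this property programmatically could look like the sketch below (assumed namespace and variable names, not the authors' code; the transitivity axiom is what lets the reasoner supply the closure pairs, while the anti-symmetric side is handled by the learner through reversed negative examples):

import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.*;

public class DeclareBetterThan {
    public static void main(String[] args) throws OWLOntologyCreationException {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        OWLDataFactory factory = manager.getOWLDataFactory();
        String ns = "http://example.org/cars#"; // hypothetical namespace

        OWLOntology ontology = manager.createOntology(IRI.create(ns));
        OWLObjectProperty betterthan =
                factory.getOWLObjectProperty(IRI.create(ns + "betterthan"));

        // Transitivity allows the reasoner to infer the closure pairs that
        // are added to the positive examples; reversed pairs are generated
        // separately as negatives.
        manager.addAxiom(ontology,
                factory.getOWLTransitiveObjectPropertyAxiom(betterthan));
    }
}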



Page 9

PROBLEM REPRESENTATION

Hypothesis language

In Protégé

Figure 5.1: Problem representation in Protégé. (a) Representation of background knowledge as a class hierarchy; (b) representation of examples as individuals; (c) representation of the hypothesis language as an object property.

5.1.3.2 Background knowledge

The di↵erence how we represent the attributes in Aleph is that they are written as predicate with

two arguments, while in our algorithm, we treat them as classes and their individual member (e.g.

a class of car with sedan body type; we specify any individual that has a body type of sedan is a

member of that class). In Aleph, the background knowledge is written as below:

car(car1). car(car2).

bodytype(sedan). bodytype(suv).

hasbodytype(car1,sedan). hasbodytype(car2,suv).

In our algorithm, the background knowledge is written in RDF/XML as below:

<owl:Class rdf:ID="sedan"/>

<owl:Class rdf:ID="suv"/>

<sedan rdf:ID="car1"></sedan>

<suv rdf:ID="car2"></suv>
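With the OWL API, the same background knowledge could be asserted as class membership axioms; a sketch (hypothetical method) reusing the manager, factory, ontology and ns variables from the property-declaration snippet above:

static void addBackgroundKnowledge(OWLOntologyManager manager,
                                   OWLDataFactory factory,
                                   OWLOntology ontology, String ns) {
    // sedan(car1) and suv(car2) as OWL class assertions.
    OWLClass sedan = factory.getOWLClass(IRI.create(ns + "sedan"));
    OWLClass suv = factory.getOWLClass(IRI.create(ns + "suv"));
    OWLNamedIndividual car1 = factory.getOWLNamedIndividual(IRI.create(ns + "car1"));
    OWLNamedIndividual car2 = factory.getOWLNamedIndividual(IRI.create(ns + "car2"));
    manager.addAxiom(ontology, factory.getOWLClassAssertionAxiom(sedan, car1));
    manager.addAxiom(ontology, factory.getOWLClassAssertionAxiom(suv, car2));
}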

5.1.3.3 Examples

In ILP, the examples are represented as ground facts with the predicate betterthan/2, where the arguments are of type car. A positive example is written as betterthan(car1,car2), while…


Page 10

PROBLEM REPRESENTATION


Background knowledge

In Aleph, and as a class hierarchy in RDF/XML (see the listings above)

Page 11

PROBLEM REPRESENTATION


Background knowledge

In Protégé

Page 12

PROBLEM REPRESENTATION

Examples

In Aleph, and as relations between individuals in RDF/XML:


…the negative example is generated by reversing the positive example: :- betterthan(car2,car1). In our algorithm, the set of examples given by the user's labels is translated into a set of individuals which have a betterthan relationship with another individual. We show a representation of the above example in RDF/XML below:

<sedan rdf:ID="car1">

<betterthan> <suv rdf:ID="car2"></suv> </betterthan>

</sedan>
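In OWL API terms (again a sketch with the same assumed variables as in the earlier snippets), each labelled pair becomes one object property assertion:

// One positive example, car1 betterthan car2, as an ABox axiom:
manager.addAxiom(ontology,
        factory.getOWLObjectPropertyAssertionAxiom(betterthan, car1, car2));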

5.2 The Proposed Algorithm

In this section we describe our proposed algorithm. We implement our algorithm in Java with the OWL API library for the DL implementation. Progol is an implementation of ILP in the C language by Muggleton [13]. Aleph [14] is an ILP implementation that uses the Prolog language and follows the same procedure as Progol. We follow the four basic procedures used in Progol/Aleph, as below (a sketch of step 2 follows the list):

1. Select a positive example. Select an example to be generalized based on its order in the examples file. Each instance of the relation can be seen as a pair of object IDs.

2. Build the bottom clause. The bottom clause is the conjunction of all non-disjoint class memberships for each object in the pair.

3. Search. Find a clause more general than the bottom clause. This step uses greedy best-first search to find a clause consistent with the data.

4. Remove covered examples. Our algorithm is greedy, removing all covered examples once each highest-scoring clause is added to the current theory.
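Step 2 can be phrased directly against the OWL API's reasoner interface; a sketch (hypothetical method, any OWLReasoner implementation such as HermiT or Pellet would do), where the bottom clause is the pair of class-membership sets of the two individuals:

import org.semanticweb.owlapi.model.*;
import org.semanticweb.owlapi.reasoner.OWLReasoner;
import java.util.*;

// ...
static List<Set<OWLClass>> bottomClause(OWLReasoner reasoner,
                                        OWLNamedIndividual domainSide,
                                        OWLNamedIndividual rangeSide) {
    // Direct (most specific) class memberships of each individual; their
    // conjunctions form the Domain and Range sides of the bottom clause.
    Set<OWLClass> domainClasses = reasoner.getTypes(domainSide, true).getFlattened();
    Set<OWLClass> rangeClasses = reasoner.getTypes(rangeSide, true).getFlattened();
    return List.of(domainClasses, rangeClasses);
}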

5.2.1 Search and refinement operator

We use a top-down approach similar to Aleph. The algorithm starts the generalization from each class in the bottom clause. The bottom clause will contain a conjunction of n classes on the Domain side, and a conjunction of n classes on the Range side. This produces n × n possible pairs of class combinations on the first level of generalization. We evaluate all of them, except those that compare the same classes and hypotheses that have already been checked. This is illustrated in Figure 5.2.

We use a common ILP scoring function, P × (P − N), where P is the number of positive examples covered, and N the number of negative examples covered. In the case that a solution has the same score as another alternative, Aleph will only return the first solution found. In our algorithm, we consider all the non-redundant hypotheses that are consistent with the examples (i.e. that cover zero negative examples and more than 2 positive ones). The search will not stop until all the possible combinations have been considered. (A sketch of the scoring and consistency test follows.)
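The scoring function and the consistency test described above, as a small self-contained sketch (names are illustrative):

// Score a hypothesis by its coverage: P * (P - N).
static int score(int positivesCovered, int negativesCovered) {
    return positivesCovered * (positivesCovered - negativesCovered);
}

// A hypothesis is kept as consistent when it covers no negative examples
// and more than 2 positive ones.
static boolean isConsistent(int positivesCovered, int negativesCovered) {
    return negativesCovered == 0 && positivesCovered > 2;
}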



Page 13

PROBLEM REPRESENTATION

Examples

In Protégé


Page 14

PROPOSED ALGORITHM

We follow the four basic steps used in the Progol [7] / Aleph [8] greedy learning approach (a schematic sketch of the loop follows the list):

1. Select a positive example.
• Each instance of the relation can be seen as a pair of object IDs.

2. Build the bottom clause.
• The bottom clause is the conjunction of all non-disjoint class memberships for each object in the pair.

3. Search.
• This step uses greedy best-first search to find a clause consistent with the data.

4. Remove covered positive examples.
• Our algorithm is greedy, removing all covered examples once each highest-scoring clause is added to the current theory.
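A schematic Java sketch of this covering loop (hypothetical types; buildBottomClause and searchBestClause stand in for steps 2 and 3, and the search is assumed to always return a clause covering the seed, falling back to the bottom clause itself for exceptions, so the loop terminates):

import java.util.*;
import java.util.function.Function;
import java.util.function.UnaryOperator;

public class GreedyCovering {
    record Pair(String better, String worse) {}
    interface Hypothesis { boolean covers(Pair p); }

    static List<Hypothesis> learn(List<Pair> positives,
                                  Function<Pair, Hypothesis> buildBottomClause,
                                  UnaryOperator<Hypothesis> searchBestClause) {
        List<Hypothesis> theory = new ArrayList<>();
        List<Pair> uncovered = new ArrayList<>(positives);
        while (!uncovered.isEmpty()) {
            Pair seed = uncovered.get(0);                      // 1. select a positive example
            Hypothesis bottom = buildBottomClause.apply(seed); // 2. build the bottom clause
            Hypothesis best = searchBestClause.apply(bottom);  // 3. greedy best-first search
            theory.add(best);
            uncovered.removeIf(best::covers);                  // 4. remove covered examples
        }
        return theory;
    }
}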


Page 16

PROPOSED ALGORITHM

…the bottom clause contains the conjunction of n constraints (of type class membership) on the Domain side, and the same number of constraints again on the Range side of the relation. This will produce n × n possible pairs on the first level of generalisation. (We have chosen not to consider hypotheses constraining only one of the arguments.) We evaluate all combinations of constraints, except the ones that imply the same class membership for both arguments (i.e. X is better than Y because they both share the same property/class membership) and those that have already been considered. This is illustrated in Figure 1.

Fig. 1: Refinement Operator — the search starts from (Thing) betterthan (Thing), refines to single-class pairs such as (Manual) betterthan (LargeCar) or (Manual) betterthan (NonHybrid), then to conjunctions such as (Manual ⊓ NonHybrid) betterthan (LargeCar ⊓ Manual), down to the bottom clause (Manual ⊓ NonHybrid ⊓ SmallCar ⊓ Sedan) betterthan (LargeCar ⊓ Manual ⊓ NonHybrid ⊓ Suv).

We use a common ILP scoring function, P × (P − N), where P is the number of positive examples covered, and N the number of negative examples covered. In the case that a solution has the same score as another alternative, Aleph will only return the first solution found. In our algorithm, we consider all the non-redundant hypotheses that are consistent with the examples (i.e. that cover zero negative examples and more than 2 positive ones). The search will not stop until all the possible combinations have been considered.

If we have not yet found a consistent hypothesis, we continue to refine the one with the highest non-negative score, which means that we add a pair of literals to constrain each of the two objects in the relation. We stop at 2 literals each for the Domain and Range (this is the same as Aleph's default clause length of 5). Similarly to Aleph, we also treat any examples for which we cannot find a consistent generalisation as exceptions. In this case, we add the bottom clause as the consistent rule.

4 Algorithm complexity

We implement our algorithm in one of the DL family of languages, namely ALC (attributive language with complement) [14], the basic DL language which has…

Page 17

PROPOSED ALGORITHM

Search and refinement operator (a refinement sketch follows):

! Refinement:
– If we have not yet found a consistent hypothesis, we continue to refine the one with the highest non-negative score.
– We add a pair of literals to constrain each of the two objects in the relation.
– We stop at 2 literals each for Domain and Range (this is the same as Aleph's default clause length of 5).

! Exception:
– Similarly to Aleph, we treat any examples for which we cannot find a consistent generalisation as exceptions. In this case, we add the bottom clause as the consistent rule.
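A compact sketch of one refinement step (hypothetical types, not the authors' code; each refinement extends a candidate with one class from the bottom clause on the Domain side and one on the Range side, skips same-class pairs, and respects the cap of 2 literals per side):

import org.semanticweb.owlapi.model.OWLClass;
import java.util.*;

// ...
record Candidate(Set<OWLClass> domain, Set<OWLClass> range) {}

static List<Candidate> refine(Candidate h,
                              Set<OWLClass> bottomDomain,
                              Set<OWLClass> bottomRange) {
    List<Candidate> out = new ArrayList<>();
    if (h.domain().size() >= 2 || h.range().size() >= 2) return out; // length cap
    for (OWLClass d : bottomDomain)
        for (OWLClass r : bottomRange) {
            if (d.equals(r)) continue;                // same class on both sides
            if (h.domain().contains(d) || h.range().contains(r)) continue;
            Set<OWLClass> nd = new HashSet<>(h.domain()); nd.add(d);
            Set<OWLClass> nr = new HashSet<>(h.range());  nr.add(r);
            out.add(new Candidate(nd, nr));
        }
    return out;
}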

Page 18

PROPOSED ALGORITHM

Search and refinement operator:

! We evaluate all combinations of constraints, except:
– the ones that imply the same class membership of both arguments
• e.g. car A has body type sedan and car B has body type sedan
– those that have already been considered.


Page 20

EVALUATION

Dataset
We use two publicly available datasets, car preferences [9] and sushi preferences [10], with the statistics below:

! Number of items: 10 items
! Number of pairs: 45 preference pairs (2-combinations of 10)
! Number of participants: 60 users
! Number of attributes:
– the car dataset has 4 attributes (body type, transmission, fuel consumption and engine size)
– the sushi dataset has 7 attributes (style, major, minor, heaviness, how frequently consumed by a user, price and how frequently sold)

Page 21

EVALUATION

Evaluation method
! The goal:
– to assess the accuracy of the predictive power of each algorithm to solve the preference learning problem.
! We compare our algorithm with three other machine learning algorithms:
– SVM, the Matlab CART Decision Tree (DT) learner, and Aleph

Page 22

EVALUATION


…the least expressivity. ALC allows one to construct complex concepts from simpler ones using various language constructs. Its capabilities include the direct or indirect expression of, e.g., concept disjointness and the domain and range of roles, including the empty role.

The most expensive process is the membership checking performed for every possible hypothesis, which is used for scoring: for every single hypothesis, the reasoner needs to check its coverage. One possible way to reduce the complexity is to minimise the search tree and check for redundancy without reducing the accuracy.

5 Evaluation

Dataset. We use two publicly available preference datasets [3][4]. Both the sushi and the car datasets have 10 items to rank, which leads to 45 preference pairs per user. We take 60 users from each dataset and perform 10-fold cross validation for each user's individual preferences. The car dataset has 4 attributes: body type, transmission, fuel consumption and engine size, while the sushi dataset has 7 attributes: style, major, minor, heaviness, how frequently consumed by a user, price and how frequently sold. Despite the difference in the number of attributes in the two datasets, we found that a maximum clause length of 4 (in Aleph and in our algorithm) is sufficient to produce consistent hypotheses.

Evaluation method. The goal of this evaluation is to assess the accuracy of the predictive power of each algorithm to solve the preference learning problem. We compare our algorithm with three other machine learning algorithms: SVM, the Matlab CART Decision Tree (DT) learner, and Aleph. SVM is a very common statistical classification algorithm used in many domains; similar work on pairwise preference learning by Qian et al. [8] shows that SVM can also be used to learn in this domain. Both DT and Aleph are included in the evaluation since both are logic-based learners, the first in propositional logic and the latter in First Order Logic.

We learn each individual's preferences and test them using 10-fold cross validation. The results are shown in Table 1 and Figure 2a. According to an ANOVA test, there is a significant difference amongst the algorithms, with a p-value of 2.0949 × 10⁻²¹ for the car dataset and 7.3234 × 10⁻³⁶ for the sushi dataset.

Table 1: Mean and standard deviation of 10-fold cross validation test

                 SVM           DT            Aleph         Our algorithm
car dataset      0.8317±0.12   0.7470±0.10   0.7292±0.08   0.8936±0.05
sushi dataset    0.7604±0.09   0.8094±0.06   0.7789±0.06   0.9302±0.03

Page 23

(a) 10-fold cross validation accuracy: bar chart of accuracy for SVM, Decision Tree, Aleph and our algorithm on the car and sushi datasets.

(b) Accuracy by varying the number of training examples: accuracy against the proportion of training examples for SVM, Decision Tree, Aleph and our algorithm.

Fig. 2: Evaluation results

We also performed several experiments with the algorithms, varying the proportion of training examples and testing on 10% of the examples. For a more robust result, we validate each cycle with 10-fold cross validation. The results of these experiments are shown in Figure 2b. We show that our algorithm still works better even with a smaller number of training examples.

Sample solutions found. Our algorithm can produce more readable results for a novice user compared to Aleph. An example of a consistent hypothesis found by our algorithm is shown below:

Automatic ⊓ Hybrid betterthan MediumCar ⊓ Suv

While Aleph produces rules, such as:

betterthan(A,B) :- hasfuelcons(B,nonhybrid), hasbodytype(B,suv).

6 Conclusion and Further Work

In this paper, we have shown that the implementation of ILP in DL can be useful to learn a user's preferences from pairwise comparisons. We are currently working to address the following limitations of our algorithm:

– We only consider a one-level class hierarchy in the ontology, for simplicity. In the real world, the class hierarchy can be more complex.
– Currently, our algorithm uses the Closed World Assumption, which makes it easier to find a consistent hypothesis. This is not in line with the fact that most DL-based knowledge bases and their reasoners operate under the Open World Assumption.

EVALUATION

Results

Experiment 2: We set different proportions of training examples and test on 10% of test data. For a more robust result, we validate each cycle with 10-fold cross validation.

Page 24

EVALUATION

Sample solutions

! By using a DL representation, we can produce more readable results for a novice user:

Automatic ⊓ Hybrid betterthan MediumCar ⊓ Suv

! Aleph produces rules, such as:

betterthan(A,B) :- hasfuelcons(B,nonhybrid), hasbodytype(B,suv).


Page 25

CONCLUSIONS

! We have shown that the implementation of ILP in DL can be useful to learn a user's preferences from pairwise comparisons.
! Currently, our algorithm uses the Closed World Assumption, which makes it easier to find a consistent hypothesis and test the coverage.
! In terms of accuracy, we have shown that our proposed algorithm outperforms the other algorithms even with a smaller number of training examples.

Page 26

FUTURE WORK

We are currently working to address the following limitations of our algorithm:
– Working with more complex class hierarchies (including negations, unions and quantifiers)
– Working under the Open World Assumption (OWA)
– Improving system performance and scalability, e.g. by implementing a triple store

! We plan to implement our algorithm in a recommender system application and invite real users to evaluate it using a larger real dataset from Autotrader (7,361 cars)

Page 27

REFERENCES

[1] Lehmann, J.: DL-Learner: Learning Concepts in Description Logics. In: The Journal of Machine Learning Research, vol. 10, pp. 2639-2642 (2009)
[2] Iannone, L., Palmisano, I., Fanizzi, N.: An Algorithm Based on Counterfactuals for Concept Learning in the Semantic Web. In: Applied Intelligence, vol. 26, no. 2, pp. 139-159. Springer, Heidelberg (2007)
[3] Fanizzi, N., d'Amato, C., Esposito, F.: DL-FOIL Concept Learning in Description Logics. In: Proceedings of Inductive Logic Programming, ser. LNCS, vol. 5194, pp. 107-121. Springer, Heidelberg (2008)
[4] Kietz, J.: Learnability of Description Logic Programs. In: Proceedings of Inductive Logic Programming, ser. LNCS, vol. 2583, pp. 117-132. Springer, Heidelberg (2002)
[5] Konstantopoulos, S., Charalambidis, A.: Formulating Description Logic Learning as an Inductive Logic Programming Task. In: Proceedings of IEEE World Congress on Computational Intelligence, pp. 1-7 (2010)
[6] Fürnkranz, J., Hüllermeier, E.: Preference Learning: An Introduction. Springer (2010)
[7] Muggleton, S.: Inverse Entailment and Progol. In: New Generation Computing, vol. 13, no. 3-4, pp. 245-286 (1995)
[8] Srinivasan, A.: The Aleph Manual. Technical Report, Computing Laboratory, Oxford University (2000), http://web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/
[9] Abbasnejad, E., Sanner, S., Bonilla, E.V., Poupart, P.: Learning Community-based Preferences via Dirichlet Process Mixtures of Gaussian Processes. In: Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI) (2013)
[10] Kamishima, T.: Nantonac Collaborative Filtering: Recommendation Based on Order Responses. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 583-588. ACM, New York (2003)

Page 28

Thank you

Any questions or feedback?