

Exploiting visual similarities for ontology alignment

Charalampos Doulaverakis, Stefanos Vrochidis, Ioannis Kompatsiaris
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece

{doulaver, stefanos, ikom}@iti.gr

Keywords: Ontology alignment, Visual similarity, ImageNet, Wordnet

Abstract: Ontology alignment is the process where two different ontologies that usually describe similar domains are 'aligned', i.e. a set of correspondences between their entities, regarding semantic equivalence, is determined. In order to identify these correspondences, several methods and metrics that measure semantic equivalence have been proposed in the literature. The most common features that these metrics employ are string-, lexical-, structure- and semantic-based similarities, for which several approaches have been developed. However, what has not been investigated is the use of visual features for determining entity similarity in cases where images are associated with concepts. Nowadays, the existence of several resources (e.g. ImageNet) that map lexical concepts onto images allows visual similarities to be exploited for this purpose. In this paper, a novel approach for ontology matching based on visual similarity is presented. Each ontological entity is associated with sets of images, retrieved through ImageNet or web-based search, and state-of-the-art visual feature extraction, clustering and indexing are employed for computing the similarity between entities. An adaptation of a popular Wordnet-based matching algorithm to exploit this visual similarity is also proposed. Our method is compared with traditional metrics against a standard ontology alignment benchmark dataset and demonstrates promising results.

1 INTRODUCTION

The Semantic Web provides shared ontologies and vocabularies in different domains that can be openly accessed and used for tasks such as semantic annotation of information, reasoning and querying. The Linked Open Data (LOD) paradigm shows how the different exposed datasets can be linked in order to provide a deeper understanding of information. As each ontology is engineered to describe a particular domain for use in specific tasks, it is common for ontologies to express equivalent domains using different terms or structures. These equivalences have to be identified and taken into account in order to enable seamless knowledge integration. Moreover, as an ontology can contain hundreds or thousands of entities, there is a need to automate this process. An example comes from the cultural heritage domain, where two ontologies are used as standards: CIDOC-CRM [1], used for semantically annotating museum content, and the Europeana Data Model [2], used to semantically index and interconnect cultural heritage objects.

[1] CIDOC-CRM, http://www.cidoc-crm.org
[2] Europeana Data Model, http://labs.europeana.eu

While these two ontologies have been developed for different purposes, they are both used in the cultural heritage domain, and correspondences between their entities should exist and be identified.

In ontology alignment the goal is to automatically or semi-automatically discover correspondences between the ontological entities, i.e. their classes, properties or instances. An 'alignment' is a set of mappings that define the similar entities between two ontologies. These mappings can be expressed, e.g., using the owl:equivalentClass or owl:equivalentProperty properties, so that a reasoner can automatically access both ontologies during a query.
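As an illustration, a minimal sketch of how such a discovered mapping could be materialized, assuming Python's rdflib and two hypothetical ontology IRIs (not from the paper):

```python
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

# Hypothetical IRIs for two entities discovered to be equivalent
src = URIRef("http://example.org/onto1#Author")
tgt = URIRef("http://example.org/onto2#Writer")

g = Graph()
g.add((src, OWL.equivalentClass, tgt))  # materialize the discovered correspondence
print(g.serialize(format="turtle"))
```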

While the methodologies proposed in the literature have proven quite effective, either alone or combined, in dealing with the alignment of ontologies, there has been little progress in defining new similarity metrics that take advantage of features that have not been considered so far. In addition, existing benchmarks for evaluating the performance of ontology alignment systems, such as the Ontology Alignment Evaluation Initiative [3] (OAEI), have shown that there is still room for improvement in ontology alignment.

In the last five years the proliferation of multimedia has generated several annotated resources and datasets that are associated with concepts, such as ImageNet [4] or Flickr [5], thus making their visual representations easily available and retrievable so that they can be further exploited, e.g. for image recognition.

[3] OAEI, http://oaei.ontologymatching.org

In this paper we propose a novel ontology matching metric that is based on visual similarities between ontological entities. The visual representations of the entities are crafted from different multimedia sources, namely ImageNet and web-based image search, thus assigning to each entity descriptive sets of images. State-of-the-art visual features are extracted from these images and vector representations are generated. The entities are compared in terms of these representations and a similarity value is extracted for each pair of entities; the pair with the highest similarity value is considered a valid match. The approach is validated in experimental results, where it is shown that, when combined with other known ontology alignment metrics, it increases the precision and recall of the discovered mappings.

The main contribution of the paper is the introduction of a novel similarity metric for ontology alignment based on visual features. To the best of the authors' knowledge, this is the first attempt to exploit visual features for ontology alignment purposes. We also propose an adaptation of a popular lexical-based matching algorithm where lexical similarity is replaced with visual similarity.

The paper is organized as follows: Section 3 describes the methodology in detail, while Section 5 presents the experimental results on the popular OAEI conference track dataset. In Section 4 a metric that exploits the proposed visual similarity together with lexical features is described. Related work in ontology alignment is documented in Section 2. Finally, Section 6 concludes the paper and outlines a future work plan.

2 RELATED WORK

In order to accomplish the automatic discovery of mappings, numerous approaches have been proposed in the literature that rely on various features. Among the most common are methods that compare the similarity of two strings, e.g. comparing hasAuthor with isAuthoredBy; these are the most used and the fastest to compute, as they operate on raw strings. Existing string similarity metrics are employed, such as the Levenshtein distance, edit distance and Jaro-Winkler similarity, while string similarity algorithms such as (Stoilos et al., 2005) have been developed especially for ontology matching. Other mapping discovery methods rely on lexical processing in order to find synonyms, hypernyms or hyponyms between concepts, e.g. Author and Writer, where Wordnet is most commonly used. In (Lin and Sandkuhl, 2008) a survey of methods that use Wordnet (Miller, 1995) for ontology alignment is carried out. Approaches that exploit other external knowledge sources have also been presented (Sabou et al., 2006; Pesquita et al., 2014; Chen et al., 2014; Faria et al., 2014). Other similarity measures rely on the structure of the ontologies, such as the Similarity Flooding algorithm (Melnik et al., 2002), which stems from the relational databases world but has been successfully used for ontology alignment, while others exploit both schema and ontology semantics for mapping discovery. A comprehensive study of such methods can be found in (Shvaiko and Euzenat, 2005). In terms of matching systems, numerous approaches have been proposed that combine matchers or include external resources for the generation of a valid mapping between ontologies. Most available systems have been evaluated in the OAEI benchmarks that are held annually. In (Jean-Mary et al., 2009) the authors use a weighted approach to combine several matchers in order to produce a final matching score between the ontological entities. In (Ngo and Bellahsene, 2012) the authors go a step further and propose a novel approach to combine elementary matching algorithms using a machine learning approach with decision trees. The system is trained on prior ground truth alignments in order to find the best combination of matchers for each pair of entities. Other systems, such as AML (Faria et al., 2013) and (Kirsten et al., 2011), make use of external knowledge resources or lexicons to obtain ground truth structure and entity relations. This is especially useful when matching ontologies in specialized domains such as biomedicine.

[4] ImageNet, http://www.image-net.org/
[5] Flickr, https://www.flickr.com/

In contrast to the above, we propose a novel ontology matching algorithm that associates entities with images and makes use of visual features in order to compute similarity between entities. To the authors' knowledge, this is the first approach in the literature where a visual-based ontology matching algorithm is proposed. Throughout the paper, the term "entity" is used to refer to ontology entities, i.e. classes, object properties, datatype properties, etc.


Figure 1: Images for different synsets: (a) images for "boat", (b) images for "ship", (c) images for "motorbike". (a) and (b) are semantically more similar to each other than to (c). The visual similarity between (a) and (b), and their difference from (c), is apparent.

3 VISUAL SIMILARITY FOR ONTOLOGY ALIGNMENT

The idea for the development of a visual similarity algorithm for ontology alignment originated from the structure of ImageNet, where images are assigned to concepts. For example, Figure 1 shows a subset of the images found in ImageNet for the words boat, ship and motorbike. Obviously, boat and ship are more semantically related than boat and motorbike. It is also clear from Figure 1 that the images that correspond to boat and ship are much more similar in terms of visual appearance than the images of motorbike. One can then assume that it is possible to estimate the semantic relatedness of two concepts by comparing their visual representations.

In Figure 2 the proposed architecture for visual-based ontology alignment is presented. The source and target ontologies are the ontologies to be matched. For every entity in the ontologies, sets of images are assigned through ImageNet by identifying the relevant Wordnet synsets. A synset is a set of words that have the same meaning, and these are used to query ImageNet. A single entity might correspond to a number of synsets, e.g. "track" has a different meaning in transport and in sports, as can be seen in Figure 3. Thus for each entity a number of image sets are retrieved. For each image in a set, low-level visual features are extracted and a numerical vector representation is formed. Therefore, for each concept different sets of vectors are generated. Each set of vectors is called a "visual signature". All visual signatures between the source and target ontology are compared in pairs using a modified Jaccard set similarity in order to come up with a list of similarity values assigned to each entity pair. The final list of mappings is generated by employing an assignment optimization algorithm such as the Hungarian method (Kuhn, 1955), as sketched below.
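To illustrate the final assignment step, a minimal sketch assuming a precomputed similarity matrix and SciPy's implementation of the Hungarian method; the entity names and values here are hypothetical:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical pairwise similarities: rows = source entities, cols = target entities
src = ["Author", "Paper", "Conference"]
tgt = ["Writer", "Article", "Meeting"]
sim = np.array([[0.81, 0.10, 0.05],
                [0.12, 0.77, 0.08],
                [0.07, 0.11, 0.69]])

# linear_sum_assignment solves the assignment problem (Kuhn's Hungarian method)
rows, cols = linear_sum_assignment(sim, maximize=True)

threshold = 0.5  # discard weak correspondences, as done with the matching threshold
mappings = [(src[r], tgt[c], sim[r, c]) for r, c in zip(rows, cols) if sim[r, c] >= threshold]
print(mappings)  # [('Author', 'Writer', 0.81), ('Paper', 'Article', 0.77), ('Conference', 'Meeting', 0.69)]
```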

3.1 Assigning images to entities

The main source of images in the proposed work is ImageNet, an image database organized according to the WordNet noun hierarchy, in which each node of the hierarchy is associated with a number of images. Users can search the database through a text-search web interface where the user inputs the query words, which are then mapped to Wordnet indexed words, and a list of relevant synsets (synonym sets, see (Miller, 1995)) is presented. The user selects the desired synset and the corresponding images are displayed. In addition, ImageNet provides a REST API for retrieving the image list that corresponds to a synset by entering the Wordnet synset id as input, and this is the access method we used.

For every entity of the two ontologies to be matched, the following process is followed. A preprocessing procedure is executed where each entity name is first tokenized in order to split it into meaningful words, as it is common for names to be in the form of isAuthorOf or is author of; after tokenization, isAuthorOf will be split into the words is, Author and of. The next step is to filter out stop words, i.e. words that do not carry important significance or are very common. In the previous example, the words is and of are removed, thus after this preprocessing the name that is produced is Author.
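A minimal sketch of this preprocessing step; the helper names and the stop-word list are ours, not from the paper:

```python
import re

STOP_WORDS = {"is", "of", "the", "a", "an", "has", "by"}  # assumed illustrative list

def tokenize(name: str) -> list[str]:
    # Split camelCase, snake_case and space-separated entity names into words
    name = name.replace("_", " ")
    name = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return name.split()

def preprocess(name: str) -> list[str]:
    # Tokenize, then filter out stop words
    return [t for t in tokenize(name) if t.lower() not in STOP_WORDS]

print(preprocess("isAuthorOf"))    # ['Author']
print(preprocess("is_author_of"))  # ['author']
```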

Figure 2: Architecture of the proposed ontology alignment algorithm

Figure 3: Images that correspond to different meanings of the concept "track": (a) images for "track (running)", (b) images for "track (train)". Since we cannot be certain of a word's meaning (word sense), each concept is associated with all relevant synsets and the corresponding image sets from ImageNet.

After the preprocessing step, the next procedure is to identify the relevant Wordnet synset(s) of the entity name and get their ids, which is a rather straightforward procedure. Using these ids, ImageNet is queried in order to retrieve a fixed number of relevant images. However, trying to retrieve these images might fail, mainly for two reasons: either the name does not correspond to a Wordnet synset, e.g. due to misspellings, or the relevant ImageNet synset is not assigned any images, something which is not uncommon since ImageNet is still under development and is not complete. So, in order not to end up with empty image collections, in the above cases the entity name is used to query Yahoo image search [6] in order to find relevant images. The idea of using web-based search results has been employed in computer vision, as in (Chatfield and Zisserman, 2013), where web image search is used to train an image classifier.

The result of the above-described process is that each ontological entity C is associated with n sets of images I_i^C, with i = 1, ..., n, where n is the number of synsets that correspond to entity C.

[6] Yahoo image search, https://images.search.yahoo.com
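A sketch of this retrieval step under stated assumptions: NLTK's Wordnet interface supplies the synset ids, the ImageNet text API endpoint is recalled from the period of the paper and should be treated as an assumption, and the web-search fallback is a placeholder stub:

```python
import requests
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

# Historical ImageNet text API listing a synset's image URLs; treat the exact
# endpoint as an assumption, as the service has changed over the years.
IMAGENET_URLS = "http://www.image-net.org/api/text/imagenet.synset.geturls?wnid={}"

def web_image_search(query: str, k: int) -> list:
    # Placeholder for the paper's Yahoo image-search fallback; a real
    # implementation would call an image search API here.
    return []

def image_sets_for(name: str, k: int = 50) -> list:
    """One set of up to k image URLs per Wordnet noun synset of `name`,
    falling back to web image search when ImageNet returns nothing."""
    sets = []
    for syn in wn.synsets(name, pos=wn.NOUN):
        wnid = "n{:08d}".format(syn.offset())  # WordNet 3.0 offset -> ImageNet synset id
        urls = requests.get(IMAGENET_URLS.format(wnid), timeout=30).text.split()
        if not urls:
            urls = web_image_search(name, k)
        sets.append(urls[:k])
    return sets
```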

3.2 Extracting the visual signatures of entities

To allow a visual-based comparison of the ontological entities, each image set I_i^C has to be represented using appropriate visual descriptors. For this purpose, a state-of-the-art approach is followed where images are represented as compact numerical vectors. For extracting these vectors, the approach described in (Spyromitros-Xioufis et al., 2014) is used, as it has been shown to outperform other approaches on standard image retrieval benchmarks and is quite efficient. In short, SURF (Speeded Up Robust Features) descriptors (Bay et al., 2008) are extracted for each image in a set. SURF descriptors are numerical representations of important image features and are used to compactly describe image content. These are then aggregated using the VLAD (Vector of Locally Aggregated Descriptors) representation (Jegou et al., 2010), where four codebooks of size 128 each were used. The resulting VLAD vectors are PCA-projected to reduce their dimensionality to 100 coefficients, thus ending up with a standard numerical vector representation v_j for each image j in a set. At the end of this process, each image set I_i^C is numerically represented by a corresponding vector set. This vector set is termed the "visual signature" V_i^C, as it conveniently and descriptively represents the visual content of I_i^C; thus V_i^C = {v_j}, with j = 1, ..., k and k being the total number of images in I_i^C.

The whole processing workflow is depicted in Figure 4.
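To make the pipeline concrete, a compact sketch under stated assumptions: OpenCV's SURF stands in for the paper's extractor (it requires an opencv-contrib build with non-free modules enabled), a single 128-word codebook replaces the four codebooks of (Spyromitros-Xioufis et al., 2014), and scikit-learn's PCA, fitted offline on training VLADs, provides the projection to 100 dimensions:

```python
import cv2
import numpy as np
from sklearn.decomposition import PCA

def surf_descriptors(image_path: str) -> np.ndarray:
    # SURF lives in opencv-contrib; each row is a 64-dim local descriptor
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, desc = surf.detectAndCompute(img, None)
    return desc if desc is not None else np.zeros((0, 64), np.float32)

def vlad(desc: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    # codebook: (128, 64) cluster centers learned offline with k-means.
    # Assign each local descriptor to its nearest center and accumulate residuals.
    assign = np.argmin(((desc[:, None, :] - codebook[None]) ** 2).sum(-1), axis=1)
    v = np.zeros_like(codebook)
    for i, d in zip(assign, desc):
        v[i] += d - codebook[i]
    v = np.sign(v) * np.sqrt(np.abs(v))          # power-law normalization
    return (v / (np.linalg.norm(v) + 1e-12)).ravel()

def visual_signature(image_paths, codebook, pca: PCA) -> np.ndarray:
    # Visual signature: one PCA-reduced VLAD vector per image in the set
    vlads = np.array([vlad(surf_descriptors(p), codebook) for p in image_paths])
    return pca.transform(vlads)                  # shape: (len(image_paths), 100)
```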

Figure 4: Block diagram of the process for extracting the visual signatures of an entity

Algorithm 1 outlines the steps to create the visual signatures V_C of entities in an ontology.

Algorithm 1 Pseudocode for extracting the visual signature V_C of an entity C in ontology O
Ensure: V_C = ∅, C is an entity of ontology O
  C_t ← removeStopWords(tokenize(C))
  W ← find Wordnet synsets of C_t
  for all synsets W_i in W do
      I_i^C ← download k images from ImageNet
      if I_i^C = ∅ then
          I_i^C ← download k images from web
      end if
      V_i^C ← ∅
      for all images j in I_i^C do
          v_j ← extractVisualDescriptors(j)
          V_i^C ← V_i^C ∪ {v_j}
      end for
      V_C ← V_C ∪ {V_i^C}
  end for
  return V_C

3.3 Comparing visual signatures for computing entity similarity

Having the visual signatures for each entity, the next step is to use an appropriate metric in order to compare these signatures and estimate the similarity between image sets. Several vector similarity and distance metrics exist, such as cosine similarity or Euclidean distance; however, these are mostly suitable for comparing individual vectors. In the current work we are interested in establishing a similarity value between vector sets, so the Jaccard set similarity measure is more appropriate, as it has been defined exactly for this purpose. Its definition is

$$J_{V_i^{C_s}, V_j^{C_t}} = \frac{|V_i^{C_s} \cap V_j^{C_t}|}{|V_i^{C_s} \cup V_j^{C_t}|} \qquad (1)$$

where V_i^{C_s} and V_j^{C_t} are the i-th and j-th visual signatures of entities C_s and C_t, |V_i^{C_s} ∩ V_j^{C_t}| is the intersection size of the two sets, i.e. the number of identical images between the sets, and |V_i^{C_s} ∪ V_j^{C_t}| is the total number of images in both sets. It holds that 0 ≤ J_{V_i^{C_s}, V_j^{C_t}} ≤ 1. For deciding whether two images A and B are identical, we compute the angular similarity of their vector representations:

$$\mathrm{AngSim}_{A,B} = 1 - \frac{\arccos(\mathrm{cosineSim}(A,B))}{\pi} \qquad (2)$$

with cosineSim(A, B) equal to

$$\mathrm{cosineSim}(A,B) = \frac{\sum_{k=1}^{n=100} A_k \cdot B_k}{\sqrt{\sum_{k=1}^{n=100} A_k^2} \cdot \sqrt{\sum_{k=1}^{n=100} B_k^2}} \qquad (3)$$

where n = 100 is the dimensionality of the PCA-projected vectors.

For AngSim, a value of 0 means that the two images are completely irrelevant and 1 means that they are identical. However, two images might not have AngSim_{A,B} = 1 even if they are visually the same, e.g. when they are acquired from different sources with differences in resolution, compression or stored format; thus we risk having |V_i^{C_s} ∩ V_j^{C_t}| = ∅. For this reason, instead of aiming to find truly identical images, we introduce the concept of "near-identical images", where two images are considered identical if they have a similarity value above a threshold T:

$$\mathrm{Identical}_{A,B} = \begin{cases} 0 & \text{if } \mathrm{AngSim}_{A,B} < T \\ 1 & \text{if } \mathrm{AngSim}_{A,B} \geq T \end{cases} \qquad (4)$$

T is experimentally defined. Using the above, we are able to establish the Jaccard set similarity value of two ontological entities by associating each entity with its image sets, extracting the visual signature of each set and comparing these signatures. The Jaccard set similarity value J_{V_i, U_j} is computed for every pair i, j of synsets that correspond to the examined entities V, U. Visual similarity is then defined as

$$\mathrm{VisualSim}(C_s, C_t) = \max_{i,j}\left(J_{V_i^{C_s}, V_j^{C_t}}\right) \qquad (5)$$
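A minimal sketch of Eqs. (2)-(5) in numpy; the greedy pairing of near-identical images is our reading of the modified Jaccard intersection, and the threshold value is illustrative:

```python
import numpy as np

def ang_sim(a: np.ndarray, b: np.ndarray) -> float:
    # Eq. (2)-(3): angular similarity derived from cosine similarity
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi

def jaccard_near_identical(V: np.ndarray, U: np.ndarray, T: float = 0.9) -> float:
    # Eq. (1) with Eq. (4): the intersection counts near-identical pairs (AngSim >= T),
    # each image matched at most once (greedy pairing)
    matched_u = set()
    inter = 0
    for v in V:
        for j, u in enumerate(U):
            if j not in matched_u and ang_sim(v, u) >= T:
                matched_u.add(j)
                inter += 1
                break
    union = len(V) + len(U) - inter
    return inter / union if union else 0.0

def visual_sim(sigs_s, sigs_t, T: float = 0.9) -> float:
    # Eq. (5): maximum over all synset pairs of the two entities
    return max(jaccard_near_identical(V, U, T) for V in sigs_s for U in sigs_t)
```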

4 COMBINING VISUAL AND LEXICAL FEATURES

The visual similarity algorithm can either be exploited as a standalone measure or be used as a complement to other ontology matching measures. Since Wordnet is used in order to construct the visual representation of entities, one natural approach is to combine visual with lexical-based features. Lexical-based measures have been used in ontology matching systems in recent OAEI benchmarks, such as in (Ngo and Bellahsene, 2012), where, among others, the Wu-Palmer (Wu and Palmer, 1994) Wordnet-based measure has been integrated. The Wu-Palmer similarity value between concepts C_1 and C_2 is defined as

$$\mathrm{WuPalmer}_{C_1,C_2} = \frac{2 \cdot N_3}{N_1 + N_2 + 2 \cdot N_3} \qquad (6)$$

where C_3 is defined as the least common superconcept (or hypernym) of both C_1 and C_2, N_1 and N_2 are the numbers of nodes from C_1 and C_2 to C_3, respectively, and N_3 is the number of nodes on the path from C_3 to the root. The intuition behind this metric is that, since concepts closer to the root have a broader meaning which becomes more specific as one moves towards the leaves of the hierarchy, if two concepts have a common hypernym close to them and far from the root, then it is likely that they have a close semantic relation.
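For reference, NLTK exposes this measure directly over Wordnet; a sketch with illustrative synset choices:

```python
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

boat, ship = wn.synset("boat.n.01"), wn.synset("ship.n.01")
print(boat.wup_similarity(ship))           # high: their common hypernym is nearby
print(boat.lowest_common_hypernyms(ship))  # e.g. [Synset('vessel.n.02')]
```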

Based on this intuition, we have defined a new similarity metric that takes into account the visual features of both concepts and of their least common superconcept. Using the same notation and meaning for C_1, C_2, C_3, the measure we have defined is expressed as

$$\mathrm{LexiVis}_{C_1,C_2} = \frac{V_3}{3 - (V_1 + V_2)} \qquad (7)$$

where V_3 is the visual similarity value between C_1 and C_2, and V_1, V_2 are the visual similarity values between C_1, C_3 and C_2, C_3 respectively. V_1, V_2 and V_3 are calculated according to Eq. 5. In all cases, 0 ≤ LexiVis_{C_1,C_2} ≤ 1. The intuition behind this measure is that semantically related concepts will be highly visually similar to each other and also highly visually similar to their closest common hypernym. The incorporation of the closest hypernym in the overall similarity estimation of two concepts allows for corrections in cases where concepts are visually similar but semantically irrelevant: e.g. "boat" and "hydroplane" pictures both depict an object surrounded by a body of water, but when they are visually compared against their common superconcept, which in this example is the concept "craft", their pairwise visual similarity values with it will be low, thus lowering the concepts' overall similarity. This example is depicted in Figure 5.
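Plugging the values reported in Figure 5 (Boat-Hydroplane V_3 = 0.49, Boat-Craft V_1 = 0.24, Hydroplane-Craft V_2 = 0.35) into Eq. (7) verifies the reported score:

$$\mathrm{LexiVis}_{\mathrm{Boat,Hydroplane}} = \frac{0.49}{3 - (0.24 + 0.35)} = \frac{0.49}{2.41} \approx 0.20$$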

5 EXPERIMENTAL RESULTS

Figure 5: Visual similarity values between the concepts "Boat" and "Hydroplane", which are semantically irrelevant but visually similar; their common hypernym is "Craft". Boat-Hydroplane = 0.49, Boat-Craft = 0.24, Hydroplane-Craft = 0.35, giving LexiVis(Boat, Hydroplane) = 0.20. The LexiVis measure, by taking advantage of lexical features, lowers their similarity value.

For analyzing the performance of the visual similarity ontology matching algorithm, we ran it against

the Ontology Alignment Evaluation Initiative (OAEI) conference track of 2014 (Dragisic et al., 2014) [7]. The OAEI benchmarks are organized annually and have become a standard in the evaluation of ontology alignment tools. In the conference track, a number of ontologies that are used for the organization of conferences have to be aligned in pairs. The conference track was chosen because, by design, the proposed algorithm requires meaningful entity names that can be visually represented. Other tracks, such as benchmark and anatomy, were not considered due to this limitation, which is further discussed in Section 6. Reference alignments are available and are used for the actual evaluation in an automated manner. The reference alignment that was used is "ra1", since this was readily available from the OAEI 2014 website.

The VisualSim and LexiVis ontology matching algorithms were integrated into the Alignment API (Euzenat, 2004), which offers interfaces and sample implementations for integrating matching algorithms. The API is recommended by OAEI for participating in the benchmarks. In addition, algorithms to compute standard information retrieval measures, i.e. precision, recall and F-measure, against reference alignments can be found in the API, so these were used for the evaluation of the test results. In these tests we varied the threshold, i.e. the value under which an entity matching is discarded, and recorded the precision, recall and F1 measure values.

In order to have a better understanding of the proposed algorithms, we compared them against other popular matching algorithms. Ideally, their performance would be evaluated against other matching algorithms that make use of similar modalities, i.e. visual or other. This was not feasible, as the proposed algorithms are the first that make use of visual features, so we compare them with standard algorithms that exploit traditional features such as string-based and Wordnet-based similarity.

[7] OAEI 2014, http://oaei.ontologymatching.org/2014/

Figure 6: Precision (a), recall (b) and F1 measure (c) diagrams for different threshold values, using the conference track ontologies of OAEI 2014.

For this purpose we implemented the ISub string similarity matcher (Stoilos et al., 2005) and the Wu-Palmer Wordnet-based matcher described in Section 4. These matchers have been used in the YAM++ ontology matching system (Ngo and Bellahsene, 2012), which was one of the top ranked systems in OAEI 2012.

All the aforementioned algorithms (ISub, Wu-Palmer, VisualSim and LexiVis) are evaluated using precision, recall and the F1 measure, with

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (8)$$

The results of this evaluation are displayed in Figure 6.

It can be seen from Figure 6 that the VisualSim and LexiVis algorithms perform better on all measures than the Wu-Palmer alignment algorithm, which confirms our initial assumption that the semantic similarity between entities can be reflected in their visual representation using imaging modalities. This allows a new range of matching techniques, based on modalities that have not been considered so far, to be investigated. However, the string-based ISub matcher displays superior performance, which was expected, as string-based matchers are very effective in ontology alignment and matching problems; this points out that the aforementioned new range of matchers should work complementarily to the existing and established matchers, as these have proven their reliability through time.

An additional performance factor that should be mentioned is the computational complexity and overall execution time of the visual-based algorithm, which is much greater than that of the simpler string-based algorithms. Analyzing Figure 4, by far the most time consuming of all the documented steps are the image download and the visual descriptor extraction. However, ImageNet already offers visual descriptors which are extracted from the synset images and are freely available to download [8]. The range of images that have been processed is not yet complete but, as ImageNet is still in development, the plan is to have the whole image database processed and the visual descriptors extracted. This availability will make the calculation of the proposed visual-based ontology alignment algorithms faster.

5.1 In combination with other ontology alignment algorithms

As a further test, using the Alignment API we integrated the LexiVis matching algorithm and aggregated its matching results with other available matching algorithms, in order to get an understanding of how it would perform in a real ontology matching system. We used the LexiVis algorithm as it was shown to perform better than the original visual similarity algorithm (Figure 6). The other algorithms that were used are the ISub and Similarity Flooding matchers, in addition to the baseline NameEq matcher. These were chosen in order to have a combination of matchers that exploit different features, i.e. string, structural and visual. The matchers were combined using an adaptive weighting approach similar to (Cruz et al., 2009). For this test we again used the conference track benchmark dataset of OAEI 2014, for which results regarding the performance of the participating matching systems are published on OAEI's website and in (Dragisic et al., 2014).

[8] ImageNet visual features download, http://image-net.org/download-features

Table 1: Performance of the LexiVis matching algorithm in combination with other matching algorithms (ISub, Name Equality, Similarity Flooding (Melnik et al., 2002)), compared to the matching systems that participated in the OAEI 2014 conference track. Our combination that includes LexiVis is marked with *.

System                               Precision  Recall  F1-measure
AML                                  0.85       0.64    0.73
LogMap                               0.80       0.59    0.68
LogMap-C                             0.82       0.57    0.67
XMap                                 0.87       0.49    0.63
NameEq + ISub + SimFlood + LexiVis * 0.71       0.53    0.60
NameEq + ISub + SimFlood             0.81       0.47    0.59
OMReasoner                           0.82       0.46    0.59
Baseline (NameEq)                    0.80       0.43    0.56
AOTL                                 0.77       0.43    0.55
MaasMtch                             0.64       0.48    0.55

It can be seen from Table 1 (row marked with *) that the inclusion of the LexiVis ontology matching algorithm in the matching system results in better overall performance than running the system without it. The added value of 0.01 in F1 results in an overall F1 value of 0.60, which brings our matching system into the top five performances. The rather small gain of 0.01 is mainly due to the fact that the benchmark is quite challenging, as can be seen from the results of Table 1: for example, the XMap system, which is ranked 4th, managed to score only 0.07 more in F1 than the baseline NameEq matcher, which simply compares strings and produces a valid pair if the names are equal. Even this small increase in F1, obtained just by including the LexiVis algorithm, shows that it can improve results in such a challenging benchmark, demonstrating its benefit.

6 CONCLUSIONS

In this paper a novel ontology matching algorithm based on visual features is presented. The algorithm exploits ImageNet's structure, which is based on Wordnet, in order to associate image sets with the ontological entities, and state-of-the-art visual processing is employed, involving visual feature descriptor extraction, codebook-based feature representation, dimensionality reduction and indexing. The visual-based similarity value is obtained by calculating a modified version of the Jaccard set similarity. A new matcher is also proposed which combines visual and lexical features in order to determine entity similarity. The proposed algorithms have been evaluated using the established OAEI benchmark and have been shown to outperform Wordnet-based approaches. A limitation of the proposed visual-based matching algorithm is that, since it relies on visual depictions of entities, in cases where entity names are not words, e.g. alphanumeric codes, its performance will be poor, as no images can be associated with them. A way to tackle this is to extend the approach to include other data, such as rdfs:label, which are more descriptive. Another limitation of this approach is the mapping of concepts that are hard to express visually, e.g. "Idea" or "Freedom"; however, this is partly alleviated by employing web-based search, which is likely to retrieve relevant images for almost any concept.

The current version of the algorithm only uses entity names. Future work will focus on optimizing the processing pipeline in order to produce visual similarity results in a more timely manner, using processing optimizations and other approaches such as word sense disambiguation in order to reduce the number of image sets that correspond to each entity.

ACKNOWLEDGEMENTS

This work was supported by the MULTISENSOR (contract no. FP7-610411) and KRISTINA (contract no. H2020-645012) projects, partially funded by the European Commission.

REFERENCES

Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. (2008). Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3):346-359.

Chatfield, K. and Zisserman, A. (2013). VISOR: Towards on-the-fly large-scale object category retrieval. In Asian Conference on Computer Vision - ACCV 2012, pages 432-446. Springer Berlin Heidelberg.

Chen, X., Xia, W., Jimenez-Ruiz, E., and Cross, V. (2014). Extending an ontology alignment system with BioPortal: a preliminary analysis. Poster at the International Semantic Web Conference (ISWC).

Cruz, I. F., Palandri Antonelli, F., and Stroe, C. (2009). Efficient selection of mappings and automatic quality-driven combination of matching methods. In ISWC International Workshop on Ontology Matching (OM), CEUR Workshop Proceedings, volume 551, pages 49-60.

Dragisic, Z., Eckert, K., Euzenat, J., Faria, D., Ferrara, A., Granada, R., Ivanova, V., Jimenez-Ruiz, E., Kempf, A., Lambrix, P., et al. (2014). Results of the ontology alignment evaluation initiative 2014. In International Workshop on Ontology Matching, pages 61-104.

Euzenat, J. (2004). An API for ontology alignment. In The Semantic Web - ISWC 2004, pages 698-712. Springer.

Faria, D., Pesquita, C., Santos, E., Cruz, I. F., and Couto, F. M. (2014). Automatic background knowledge selection for matching biomedical ontologies. PLoS ONE, 9(11):e111226.

Faria, D., Pesquita, C., Santos, E., Palmonari, M., Cruz, I. F., and Couto, F. M. (2013). The AgreementMakerLight ontology matching system. In On the Move to Meaningful Internet Systems: OTM 2013 Conferences, pages 527-541. Springer.

Jean-Mary, Y. R., Shironoshita, E. P., and Kabuka, M. R. (2009). Ontology matching with semantic verification. Web Semantics: Science, Services and Agents on the World Wide Web, 7(3):235-251.

Jegou, H., Douze, M., Schmid, C., and Perez, P. (2010). Aggregating local descriptors into a compact image representation. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 3304-3311. IEEE.

Kirsten, T., Gross, A., Hartung, M., and Rahm, E. (2011). GOMMA: a component-based infrastructure for managing and analyzing life science ontologies and their evolution. Journal of Biomedical Semantics, 2(6).

Kuhn, H. W. (1955). The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2(1-2):83-97.

Lin, F. and Sandkuhl, K. (2008). A survey of exploiting Wordnet in ontology matching. In Artificial Intelligence in Theory and Practice II, pages 341-350. Springer.

Melnik, S., Garcia-Molina, H., and Rahm, E. (2002). Similarity flooding: A versatile graph matching algorithm and its application to schema matching. In Data Engineering, 2002. Proceedings. 18th International Conference on, pages 117-128. IEEE.

Miller, G. A. (1995). WordNet: a lexical database for English. Communications of the ACM, 38(11):39-41.

Ngo, D. and Bellahsene, Z. (2012). YAM++: A multi-strategy based approach for ontology matching task. In Knowledge Engineering and Knowledge Management, pages 421-425. Springer.

Pesquita, C., Faria, D., Santos, E., Neefs, J.-M., and Couto, F. M. (2014). Towards visualizing the alignment of large biomedical ontologies. In Data Integration in the Life Sciences, pages 104-111. Springer.

Sabou, M., d'Aquin, M., and Motta, E. (2006). Using the semantic web as background knowledge for ontology mapping. In Proc. of the Int. Workshop on Ontology Matching (OM-2006).

Shvaiko, P. and Euzenat, J. (2005). A survey of schema-based matching approaches. In Journal on Data Semantics IV, pages 146-171. Springer.

Spyromitros-Xioufis, E., Papadopoulos, S., Kompatsiaris, I., Tsoumakas, G., and Vlahavas, I. (2014). A comprehensive study over VLAD and product quantization in large-scale image retrieval. IEEE Transactions on Multimedia, 16(6):1713-1728.

Stoilos, G., Stamou, G., and Kollias, S. (2005). A string metric for ontology alignment. In Gil, Y., editor, Proceedings of the International Semantic Web Conference (ISWC 05), volume 3729 of LNCS, pages 624-637. Springer-Verlag.

Wu, Z. and Palmer, M. (1994). Verb semantics and lexical selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, pages 133-138. Association for Computational Linguistics.