A COMPARISON OF SAWSDL BASED SEMANTIC WEB SERVICE DISCOVERY
ALGORITHMS
by
SHIVA SANDEEP GARLAPATI
(Under the Direction of John A. Miller)
ABSTRACT
The advent of Web Services has revolutionized the way web-based applications
communicate. However, an ongoing problem has been how to discover the appropriate web
services. There have been techniques based purely on the syntactic aspects of web
services. This approach is often not adequate because it ignores the semantics of the web
services. Hence, there has been significant research on semantically discovering web
services. In this thesis, we compare and analyze three SAWSDL based semantic
web service discovery algorithms, namely SAWSDL-MX, MWSDI and TVERSKY.
INDEX TERMS: Semantic Web Service, SAWSDL, Ontology, Web Service discovery and
WSDL.
A COMPARISON OF SAWSDL BASED SEMANTIC WEB SERVICE DISCOVERY
ALGORITHMS
by
SHIVA SANDEEP GARLAPATI
B.E., ANNA UNIVERSITY, INDIA, 2007
A Thesis Submitted to the Graduate Faculty of The University of Georgia in Partial
Fulfillment of the Requirements for the Degree
MASTER OF SCIENCE
ATHENS, GEORGIA
2010
© 2010
SHIVA SANDEEP GARLAPATI
All Rights Reserved
A COMPARISON OF SAWSDL BASED SEMANTIC WEB SERVICE DISCOVERY
ALGORITHMS
by
SHIVA SANDEEP GARLAPATI
Major Professor: John A. Miller
Committee: Krzysztof J. Kochut
Thiab R. Taha
Electronic Version Approved:
Maureen Grasso
Dean of the Graduate School
The University of Georgia
August 2010
DEDICATION
To my parents, my brother, my relatives and all my friends.
ACKNOWLEDGEMENTS
I thank my professor Dr. John A. Miller for the encouragement and support
provided throughout.
TABLE OF CONTENTS
Page
ACKNOWLEDGEMENTS .................................................................................................v
LIST OF TABLES ............................................................................................................ vii
LIST OF FIGURES ......................................................................................................... viii
CHAPTER
1. INTRODUCTION ............................................................................................1
2. BACKGROUND ...............................................................................................4
SEMANTIC WEB SERVICES ...................................................................4
SAWSDL .....................................................................................................4
3. DISCOVERY ALGORITHMS .........................................................................6
SAWSDL-MX MATCHMAKER ...............................................................6
TVERSKY MODEL BASED MATCHING ALGORITHM ......................9
METEOR-S DISCOVERY MATCHING ALGORITHM ........................14
4. EVALUATION AND DISCUSSION OF RESULTS .....................................21
EVALUATION OF ALGORITHMS BASED ON RESULTS .................34
5. CONCLUSION AND FUTURE WORK ........................................................37
REFERENCES ..................................................................................................................39
APPENDICES
A. APPENDIX .....................................................................................................44
LIST OF TABLES
Page
Table 3-1: Tversky model Similarity Scores for common ontology ......................10
Table 3-2: Similarity of Operations ...................................................................................15
Table 3-3: Concept Similarity ............................................................................................17
Table 3-4: Property Similarity ...........................................................................................19
Table 4-1: Precision, Recall and F-Measure for SAWSDL-M0 ........................................28
Table 4-2: Precision, Recall and F-Measure for SAWSDL-M1 ........................................28
Table 4-3: Precision, Recall and F-Measure for SAWSDL-M2 ........................................29
Table 4-4: Precision, Recall and F-Measure for SAWSDL-M3 ........................................29
Table 4-5: Precision, Recall and F-Measure for SAWSDL-M4 ........................................30
Table 4-6: Precision, Recall and F-Measure for TVERSKY .............................................30
Table 4-7: Precision, Recall and F-Measure for MWSDI .................................................31
Table 4-8: Average Precision, Recall and F-Measure for SAWSDL ................................31
Table 4-9: Syntactic similarity using Extended Jaccard measure ......................................32
LIST OF FIGURES
Page
Figure 4-1: Precision Graph for SAWSDL-M0, TVERSKY, MWSDI and Average
SAWSDL-MX Hybrid ...........................................................................................33
Figure 4-2: Recall Graph for SAWSDL-M0, TVERSKY, MWSDI and Average
SAWSDL-MX Hybrid ...........................................................................................33
Figure 4-3: F-Measure Graph for SAWSDL-M0, TVERSKY, MWSDI and Average
SAWSDL-MX Hybrid ...........................................................................................34
Figure A-1: novel_author_service.wsdl ............................................................................45
Figure A-2: novel_author_service.wsdl (continued) ........................................................46
Figure A-3: surfing_destination_service.wsdl ..................................................................47
Figure A-4: surfing_destination_service.wsdl (continued) ...............................................48
Figure A-5: novel_authorbook-type_service.wsdl ............................................................49
Figure A-6: novel_authorbook-type_service.wsdl (continued) ........................................50
Figure A-7: activity_destination_service.wsdl .................................................................51
Figure A-8: activity_destination_service.wsdl (continued) ..............................................52
Figure A-9: activity_beach_service.wsdl...........................................................................53
Figure A-10: activity_beach_service.wsdl (continued) ....................................................54
CHAPTER 1
INTRODUCTION
Web services have revolutionized the way web-based applications communicate.
The World Wide Web Consortium (W3C) defines a Web Service as "a software system
designed to support interoperable machine-to-machine interaction over a network. It has
an interface described in a machine-processable format (specifically Web Services
Description Language WSDL [4]). Other systems interact with the web service in a
manner prescribed by its description using SOAP [3] messages, typically conveyed using
HTTP with an XML serialization in conjunction with other web-related standards." [1].
Web services provide a way for web-based applications to interact using
the Extensible Markup Language (XML) [2], Simple Object Access Protocol (SOAP) and
Web Service Description Language (WSDL). The Simple Object Access Protocol is a
protocol for communication of information (messages in XML) between web services
and usually relies on other protocols (for example, HTTP [5]). Web services are
described using an XML language called WSDL. Lately, the emphasis has been moving
towards Representational State Transfer (REST) [6] services, which do not require
SOAP but use HTTP or similar protocols directly.
Universal Description, Discovery and Integration (UDDI) [7] is an XML based
registry to publish and discover web services (UDDI technical specification committee,
OASIS has been closed and there has been no further development after 2007 [8]).
UDDI enables the service providers to publish their web services to the world and the
service requesters to find and locate appropriate web services. Web service discovery
[9, 10] by UDDI is purely keyword and taxonomy based, which is often not adequate in
terms of correctness. A better way to enhance web service discovery is to add
semantics to service descriptions and use algorithms or mechanisms that perform
semantic matching. Many approaches that add such semantics have emerged, such as
OWL-S [11], WSMO [12], WSDL-S [13] and SAWSDL [14]. Adding semantics can be
achieved by annotating the WSDL [15] with references to concepts in ontologies. An
ontology [16] is a formal representation of a set of concepts within a domain and the
relationships between those concepts. This set of extensions to WSDL is called
Semantic Annotations for Web Service Description Language (SAWSDL).
This thesis focuses on three SAWSDL based semantic web service discovery
algorithms. SAWSDL-MX [17] is a hybrid semantic matchmaker which uses logic-based
and text-based similarity information to determine the match. The Tversky [18] model
finds the similarity match by identifying the relationship between ontological concepts
and the similarity of properties corresponding to them. The MWSDI [19] discovery
algorithm is based on finding the similarity match between the request and service by
comparing the ontological concepts of the operation, inputs and outputs.
The thesis is structured as follows: Chapter 2 briefly explains what semantic web
services are and how to semantically annotate web services with SAWSDL. Chapter 3
gives insight into each of the discovery algorithms. Chapter 4 evaluates each of the
discovery engines and calculates and compares the precision, recall and F-Measure of
the experiments. In chapter 5, we present the conclusions and future work.
CHAPTER 2
BACKGROUND
2.1 SEMANTIC WEB SERVICES
Semantic Web Services (SWS) [20] are extensions of traditional web services
obtained by adding references to ontologies. This makes the web services more
machine-understandable, since the ontologies are standardized. The semantic
annotations make a web service unambiguous. When multiple services use common
ontologies, they follow the same vocabulary, which allows machines to perform
automated discovery and composition of web services. SAWSDL is one of the
approaches to semantically annotate web services. It is discussed in detail below.
2.2 SAWSDL
SAWSDL is an extension of WSDL that adds semantics to service
descriptions. This is achieved by model references and schema mappings. A
modelReference points to a concept in an ontology with a similar intended meaning. A
model reference is commonly added to interfaces, operations, faults, elements, simple
types and complex types, and each of these may carry multiple references to multiple
ontologies. A schema mapping is used to tackle data mismatches between the request and
the service. The schema mappings are of two types: a liftingSchemaMapping describes
the transformation from the lower level (XML) to the upper level (ontology), and a
loweringSchemaMapping describes the transformation from the upper level to the lower level.
The model references help in automated discovery of services, while the schema
mappings help in automated service execution.
Here is an example SAWSDL file [21]. The model references and schema
mapping are shown in bold.
<xsd:element name="Author" type="AuthorType"/>
<xsd:element name="Novel"
  sawsdl:liftingSchemaMapping="http://127.0.0.1/services/liftingSchemaMappings/novel_author_service_Novel_liftingSchemaMapping.xslt"
  type="NovelType"/>
<xsd:complexType name="NovelType"
  sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Novel">
  <xsd:sequence>
    <xsd:element name="hasSize" type="Medium"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:simpleType name="Medium"
  sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Medium">
  <xsd:restriction base="xsd:string"/>
</xsd:simpleType>
<xsd:simpleType name="AuthorType"
  sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Author">
  <xsd:restriction base="xsd:string"/>
</xsd:simpleType>
CHAPTER 3
DISCOVERY ALGORITHMS
The main feature of these semantic matching algorithms is the degree of match
(match score) between the operations, inputs and outputs of requests and services. The
semantic annotations for each of these are taken and the relationships between concepts
of ontology are considered to find the degree of match. Another main feature is how the
services are ranked once they are discovered.
In the following sections, we discuss the three discovery algorithms in detail.
3.1 SAWSDL-MX MATCHMAKER
SAWSDL-MX is a hybrid semantic service matchmaker. There are multiple
variants, namely SAWSDL-MX1, SAWSDL-M0+WA and SAWSDL-MX2.
SAWSDL-MX1 is logic based, with five degrees of match: Exact, Plug-in,
Subsumes, Subsumed-by and Fail. It then applies text similarity
filters to rank the services with the same degree of match.
SAWSDL-M0+WA is very similar to SAWSDL-MX1 and does the same
logic based matching, but ranks the services using the WSDL Analyzer (WA) tool [22],
which calculates the structural similarity of the entire WSDL. SAWSDL-MX2
computes all three kinds of matching: logic based, text similarity and structural
similarity. Moreover, it ranks the services using a machine learning approach, a binary
SVM (Support Vector Machine) [23] classifier which, given a training set of services
relevant to the request, ranks the services. However, the WSDL-Analyzer tool and
SVM-based classifiers are beyond the scope of this thesis. SAWSDL-MX1 is the only
variant that relies solely on the semantic annotations in the SAWSDL and text
similarities, and it is described in more detail below.
LOGIC-BASED MATCHING
The SAWSDL-MX matchmaker uses the terminology of OWL-DL: ≡ and ⊑
denote concept equivalence and concept subsumption, respectively. Let O be the
ontology used by the matchmaker and H(O) be the concept hierarchy/taxonomy in the
ontology. Then, for concepts C, D ∈ H(O),
D ∈ LSC(C) if D ⊑ C and ∄ E s.t. D ⊑ E ⊑ C
where LSC(C) is the set of immediate sub-concepts of C. Analogously, LGC(C) is the
set of immediate super-concepts of C.
Let R.I be the set of input concepts for the request R and S.I be the set of input
concepts of the service S. R.O is the set of output concepts for R and S.O is the set of
output concepts for S. For simplicity, let us, for the moment, assume that all the lists of
inputs and outputs are singletons. In this notation, the various types of matches supported
by SAWSDL-MX may be given as below:
Exact match: R exactly matches with S.
o R.I ≡ S.I ∧ R.O ≡ S.O
Plug-in match: S plugs into R.
o R.I ⊑ S.I ∧ S.O ∈ LSC(R.O)
Subsumes match: R subsumes S.
o R.I ⊑ S.I ∧ S.O ⊑ R.O
Subsumed-by match: R is subsumed by S.
o R.I ⊑ S.I ∧ (R.O ≡ S.O ∨ S.O ∈ LGC(R.O))
Fail: S fails to match with R according to the logic based semantic filter criteria.
These types of matches were designed for the OWLS-MX matchmaker [24]. Services
are ranked in the same order as the degrees listed above. This logic based variant is
called SAWSDL-M0. Services with the same degree of match are initially ranked in
random order, but follow the same ranking principle thereafter.
TEXT SIMILARITY MATCHING
SAWSDL-MX calculates the text similarity of the request and service using the
following token-based similarity measures: Loss-of-Information [24], Extended Jaccard
[34], Cosine [35] and Jensen-Shannon [36]. To calculate the text similarity, the input
and output concepts of both the request and the service are unfolded from the ontology
into term vectors, as in the vector space model of information retrieval. Each term is
given a tf-idf (term frequency-inverse document frequency) [25] weight, computed
separately for inputs and outputs. Each pair of inputs and outputs is compared and the
average is taken as the text similarity for the request and service pair.
The SAWSDL-M1 to M4 variants use hybrid matching, which calculates the
degree of match using the logic-based matching and then ranks the services based on
one of the text similarity measures: Loss-of-Information (M1), Extended Jaccard (M2),
Cosine (M3) or Jensen-Shannon (M4).
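To make the text-similarity step concrete, here is a small sketch of two of the four measures over token bags. It is illustrative only: SAWSDL-MX weights terms with tf-idf over the whole collection, whereas this sketch uses raw term frequencies to stay self-contained.

```python
# Hedged sketch of two token-based similarity measures over term-frequency
# vectors (real SAWSDL-MX uses tf-idf weights instead of raw counts).
from collections import Counter
from math import sqrt

def cosine_sim(tokens_a, tokens_b):
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def extended_jaccard(tokens_a, tokens_b):
    # dot(a, b) / (|a|^2 + |b|^2 - dot(a, b))
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    denom = (sum(v * v for v in a.values())
             + sum(v * v for v in b.values()) - dot)
    return dot / denom if denom else 0.0
```

Both measures return 1 for identical token bags and 0 for disjoint ones, which is what lets them serve as interchangeable ranking filters in the M1-M4 variants.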
3.2 TVERSKY MODEL BASED MATCHING ALGORITHM
The Tversky algorithm uses the Tversky model [27] to calculate the degree of
match between requests and services. The services are annotated in SAWSDL. The
approach is a similarity based model in which the match score is calculated by
comparing the input concepts, output concepts and functionality concepts of the
requests and the services referenced in the ontologies.
MATCHING ALGORITHM
The Tversky model matches services by individually finding SimI
(Similarity of Inputs), SimO (Similarity of Outputs) and SimF (Similarity of
Functionality), each of which analyzes the number of common properties (which may
be inherited) between the pairs of input, output and functionality concepts of request R
and service S conceptualized in the ontology. We say that two properties RP and SQ of
concepts R and S are a common property if the weighted average of their name match
and range match is greater than a chosen threshold α. This is discussed in more detail
later in the section. Given a concept C, p(C) is defined as the set of properties of the
concept C in the ontology.
MATCHING BASED ON COMMON ONTOLOGY
Case 1: For SimI(R, S) (refer to Table 3-1), the similarity or degree of match
score for the input is: (a) 1 if R.I ≡ S.I; the two concepts are said to be equivalent if R.I
⊑ S.I ∧ S.I ⊑ R.I, i.e., the concepts subsume each other. (b) 1 if R.I ⊑ S.I; instances of
R.I are necessarily instances of S.I, in other words R.I has a subclass relationship with
S.I. (c) The ratio of the number of common properties to the number of properties of
S.I if S.I ⊑ R.I, i.e., the request concept subsumes the service concept; in this case
p(R.I) ∩ p(S.I) = p(R.I). (d) The ratio of the number of common properties to the
number of properties of S.I if R.I ∩ S.I ≠ ∅, i.e., the two intersecting concepts have at
least one property in common.
Table 3-1: Tversky model Similarity Scores for common ontology

SimI(R,S):
  1                                 if R.I ≡ S.I
  1                                 if R.I ⊑ S.I
  |p(R.I)| / |p(S.I)|               if S.I ⊑ R.I
  |p(R.I) ∩ p(S.I)| / |p(S.I)|      if R.I ∩ S.I ≠ ∅

SimO(R,S):
  1                                 if R.O ≡ S.O
  |p(S.O)| / |p(R.O)|               if R.O ⊑ S.O
  1                                 if S.O ⊑ R.O
  |p(R.O) ∩ p(S.O)| / |p(R.O)|      if R.O ∩ S.O ≠ ∅

SimF(R,S):
  1                                 if R.F ≡ S.F
  |p(S.F)| / |p(R.F)|               if R.F ⊑ S.F
  1                                 if S.F ⊑ R.F
  |p(R.F) ∩ p(S.F)| / |p(R.F)|      if R.F ∩ S.F ≠ ∅
Case 2: For SimO(R, S) (refer to Table 3-1), the similarity or degree of match
for the output is: (a) 1 if R.O ≡ S.O, i.e., the two concepts are equivalent. (b) The ratio
of the number of common properties to the number of properties of R.O if R.O ⊑ S.O,
i.e., the service concept subsumes the request concept; in this case p(R.O) ∩ p(S.O) =
p(S.O). (c) 1 if S.O ⊑ R.O; instances of S.O are necessarily instances of R.O, in other
words R.O has a superclass relationship with S.O. (d) The ratio of the number of
common properties to the number of properties of R.O if R.O ∩ S.O ≠ ∅; the two
intersecting concepts must have properties in common.
Case 3: For SimF(R, S) (refer to Table 3-1), the similarity or degree of match
for the functionality is: (a) 1 if R.F ≡ S.F, i.e., the two concepts are equivalent. (b) The
ratio of the number of common properties to the number of properties of R.F if R.F ⊑
S.F, i.e., the service concept subsumes the request concept; in this case p(R.F) ∩ p(S.F)
= p(S.F). (c) 1 if S.F ⊑ R.F; instances of S.F are necessarily instances of R.F, in other
words R.F has a superclass relationship with S.F. (d) The ratio of the number of
common properties to the number of properties of R.F if R.F ∩ S.F ≠ ∅; the two
intersecting concepts must have properties in common.
MATCHING BASED ON MULTIPLE ONTOLOGIES
Not all web services reference the same ontology. Different web services can be
described by different ontologies. This gives rise to certain issues which can be tackled
by using a feature-based similarity measure, which compares concepts based on their
common and distinguishing features (properties). It takes into account the features or
properties of concepts which are transparently represented by their inherited properties.
The similarity functions MOSimI(R, S), MOSimO(R, S), MOSimF(R, S) for inputs,
outputs and functionality for multiple ontologies are defined as follows:
MOSimI(R, S) = sqrt( [M(p(R.I), p(S.I)) / |p(R.I)|] × [M(p(R.I), p(S.I)) / |p(S.I)|] )

MOSimO(R, S) = sqrt( [M(p(R.O), p(S.O)) / |p(R.O)|] × [M(p(R.O), p(S.O)) / |p(S.O)|] )

MOSimF(R, S) = sqrt( [M(p(R.F), p(S.F)) / |p(R.F)|] × [M(p(R.F), p(S.F)) / |p(S.F)|] )
The above formulas can be summarized as the geometric distance between the
ratio of the best mapping between the properties and the number of properties present in
the service or request. Function M establishes a mapping between the properties of the
two concept classes S and R. It establishes the best matching or mapping between two
sets of properties, P = {p_1, p_2, ..., p_u} and Q = {q_1, q_2, ..., q_v}, and is
determined by using the Hungarian algorithm [28] for weighted bipartite graph matching.
M(P, Q) = Max( Σ_{i=1..u} Σ_{j=1..v} m_ij · x_ij )

where x_ij = 1 if the pair (p_i, q_j) is selected and x_ij = 0 otherwise,

subject to Σ_{j=1..v} x_ij ≤ 1 for all i and Σ_{i=1..u} x_ij ≤ 1 for all j,

and m_ij = w1 · namematch(p_i, q_j) + w2 · rangematch(p_i, q_j).
The weight m_ij is calculated using the similarity between the properties. A
property has three parts: name, domain and range. Name match can be done using several
string matching algorithms such as N-gram, stemming, etc. Currently, our
implementation uses the N-gram algorithm [29] to calculate the similarity between the
names of properties. Range match can be divided into two parts: data type match and
name match. Data type match is done by checking whether both properties have the
same data type; if they do, the value of the match is 1, otherwise it is 0 (certainly this
could be refined to give a value from 0 to 1, as done by the MWSDI algorithm).
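The name-match and data-type match steps can be sketched as below. The thesis does not fix the exact N-gram variant, so a Dice coefficient over character trigrams is assumed here; `datatype_match` is the binary data-type test described in the text.

```python
# Hedged sketch of the name-match and data-type match steps. A Dice
# coefficient over character trigrams is assumed for the N-gram measure.

def ngrams(s, n=3):
    s = s.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def ngram_sim(a, b, n=3):
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:                 # names shorter than n characters
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))

def datatype_match(type_a, type_b):
    # binary data-type match, as described in the text: 1 if equal, else 0
    return 1.0 if type_a == type_b else 0.0
```

The weight m_ij of the bipartite graph would then be a weighted average of `ngram_sim` on the property names and a range match built from `datatype_match` and `ngram_sim`.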
3.3 MWSDI DISCOVERY MATCHING ALGORITHM
The MWSDI discovery algorithm supports both semantic and syntactic discovery
of services. The request R is matched against a service S using a matching function
which returns a similarity value in the range 0 to 1. This similarity value has two
different dimensions: syntactic similarity and semantic similarity. These similarities
correspond to the inputs, outputs and operations of the R, S pairs. A match score is
calculated for each pair of request and service. The pairs are ranked in descending
order of the match score and presented for the selection of a proper web service.
MWSDI is a more general approach which supports matching at the interface (port
type) level, which may include several operations.
MATCHING ALGORITHM
The similarity score is obtained by matching the operations of R and S which
include a set of inputs, a set of outputs and a functionality. The semantic similarity of
operations (refer to Table 3-2) is used as the matching unit while matching R and S.
SIMILARITY OF OPERATIONS
The function OPSim (refer to Table 3-2) calculates the similarity for an individual
operation pair. OPSim comprises input similarity, output similarity, syntactic similarity
and concept similarity.
Input Similarity (inpSim): The input similarity calculates the similarity between
the set of inputs of the request and the service. Given the sets of inputs, we use the
Hungarian algorithm to find the best mapping. This is best explained using a bipartite
graph. Consider a graph G = (R.I, S.I, M) where R.I is the set of request inputs, S.I is
the set of service inputs and M is the set of concept match scores m_ij (the concept
similarity match score for (R.I_i, S.I_j)). We want to find the best mapping with the
maximum match score.
Output Similarity (oupSim): The output similarity calculates the similarity
between the set of outputs of the request and the service. Given the sets of outputs, we
use the Hungarian algorithm to find the best mapping. In terms of a bipartite graph,
G = (R.O, S.O, M) where R.O is the set of request outputs, S.O is the set of service
outputs and M is the set of concept match scores m_ij (the concept similarity match
score for (R.O_i, S.O_j)). We want to find the best mapping with the maximum match score.
Syntactic Similarity (synSim): The syntactic similarity is the similarity of the
names of the operations in the request and service. To find the syntactic similarity of
the operation names, we have used the N-gram similarity algorithm, one of numerous
string similarity measures.
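The best-mapping step shared by inpSim and oupSim can be sketched as follows. For clarity this sketch brute-forces all assignments, which is exponential; a real implementation would use the Hungarian algorithm the text names. Normalizing by the number of request concepts u is an assumption consistent with Table 3-2.

```python
# Best bipartite mapping between request and service concepts: maximize
# the sum of selected pairwise scores m[i][j], using each row and column
# at most once, then divide by the number of request concepts u. Brute
# force over permutations stands in for the Hungarian algorithm.
from itertools import permutations

def best_mapping_sim(m):
    """m: u x v matrix of pairwise concept match scores."""
    u, v = len(m), len(m[0])
    best = 0.0
    if u <= v:
        for cols in permutations(range(v), u):
            best = max(best, sum(m[i][cols[i]] for i in range(u)))
    else:
        for rows in permutations(range(u), v):
            best = max(best, sum(m[rows[j]][j] for j in range(v)))
    return best / u
```

For a 2x2 score matrix [[0.9, 0.2], [0.3, 0.8]], the best assignment pairs the diagonal entries for a normalized score of (0.9 + 0.8) / 2.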
Table 3-2: Similarity of Operations

OPSim(R, S) = w1 · inpSim(R.I, S.I) + w2 · oupSim(R.O, S.O)
              + w3 · synSim(R.name, S.name) + w4 · conSim(R.F, S.F)

inpSim(R.I, S.I) = (1/u) · Max( Σ_{i=1..u} Σ_{j=1..v} m_ij · x_ij )
  where x_ij = 1 if the pair (R.I_i, S.I_j) is selected and 0 otherwise,
  subject to Σ_{j=1..v} x_ij ≤ 1 for all i and Σ_{i=1..u} x_ij ≤ 1 for all j,
  R.I = {I_1, I_2, ..., I_u} and S.I = {I_1, I_2, ..., I_v} are the lists of inputs
  for request R and service S respectively, and m_ij = conSim(R.I_i, S.I_j)

oupSim(R.O, S.O) = (1/u) · Max( Σ_{i=1..u} Σ_{j=1..v} m_ij · x_ij )
  where x_ij = 1 if the pair (R.O_i, S.O_j) is selected and 0 otherwise,
  subject to Σ_{j=1..v} x_ij ≤ 1 for all i and Σ_{i=1..u} x_ij ≤ 1 for all j,
  R.O = {O_1, O_2, ..., O_u} and S.O = {O_1, O_2, ..., O_v} are the lists of outputs
  for request R and service S respectively, and m_ij = conSim(R.O_i, S.O_j)

synSim(R, S) = nGramSim(R.label, S.label)
Concept Similarity (conSim): The concept similarity is the similarity of the
operation concepts referenced in the ontology. The concept similarity is also used to
calculate m_ij in the input and output similarities. So, in general, the concept similarity
describes the similarity of two concepts in a given ontology.
The concept similarity is a combination of concept syntactic similarity, coverage
similarity and property similarity.
Concept Syntactic Similarity (conSynSim): The concept syntactic similarity is the
similarity of the ontological name of two concepts. To find the syntactic similarity of the
ontological names, we have used the N-gram similarity algorithm.
Coverage Similarity (cvrgSim): The coverage similarity measures the extent to
which one concept covers the other. If the concepts have compropSim ≥ 0.8 (the
number of common properties over the number of unique properties), then we say the
concepts have a coverage similarity of 1. If not, for the service concept we loop up
through the hierarchy to find a parent concept that has compropSim ≥ 0.8. The
coverage similarity is then penalized to 1 − 0.1·2^x, where x is the number of levels up
the hierarchy at which the parent concept is found. The same process is repeated to
find a child concept in the hierarchy with compropSim ≥ 0.8; if found, the similarity is
1 − 0.05·x, where x is the number of levels down the hierarchy at which the child
concept is found. If none of the above cases applies, the coverage similarity is 0.
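A sketch of this coverage rule, under the assumption that the parent and child chains are supplied as property sets ordered from the nearest level outward (all names here are illustrative, not the thesis implementation):

```python
# Sketch of compropSim and cvrgSim. `parent_props`/`child_props` are the
# property sets of the service concept's ancestors/descendants, ordered
# from nearest (one level away) to farthest.

def comprop_sim(props_r, props_s):
    # common properties over unique properties (a Jaccard ratio)
    union = props_r | props_s
    return len(props_r & props_s) / len(union) if union else 0.0

def cvrg_sim(props_r, props_s, parent_props, child_props):
    if comprop_sim(props_r, props_s) >= 0.8:
        return 1.0
    for x, p in enumerate(parent_props, start=1):     # x levels up
        if comprop_sim(props_r, p) >= 0.8:
            return 1.0 - 0.1 * (2 ** x)
    for x, c in enumerate(child_props, start=1):      # x levels down
        if comprop_sim(props_r, c) >= 0.8:
            return 1.0 - 0.05 * x
    return 0.0
```

The exponential penalty going up and the smaller linear penalty going down reflect that a more general parent concept loses information faster than a more specific child.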
Table 3-3: Concept Similarity

conSim(R, S) = w5 · cvrgSim(R, S) + w6 · propSim(R.p, S.q) + w7 · conSynSim(R, S)
  where R.p is the set of properties of R, {p_1, p_2, ..., p_m},
  and S.q is the set of properties of S, {q_1, q_2, ..., q_m}

conSynSim(R, S) = nGramSim(R.label, S.label)

cvrgSim(R, S) =
  1                if compropSim(R, S) ≥ 0.8
  1 − 0.1·2^x      if compropSim(R, parent(S)) ≥ 0.8
  1 − 0.05·x       if compropSim(R, child(S)) ≥ 0.8
  0                otherwise
  where x is the difference in levels within the hierarchy

compropSim(R, S) = |p(R) ∩ p(S)| / |p(R) ∪ p(S)|
Property Similarity (propSim): The property similarity is the similarity of the
sets of properties of two concepts. Given the sets of properties, we use the Hungarian
algorithm to find the best mapping. In terms of a bipartite graph, G = (R.p, S.q, M)
where R.p is the set of properties of the request, S.q is the set of properties of the
service and M is the set of property match scores m_ij (the property similarity match
score for (R.p_i, S.q_j)). We want to find the best mapping with the maximum match
score. The property similarity has multiple parts. It has yet another contributor, a
constant C. The value of C is 1 when both properties are inverse functional properties
or when both properties are not inverse functional properties. Otherwise, the value of
C is 0.8.
Range Similarity (rangeSim): The range similarity has multiple parts. It takes
the weighted average of the property syntactic similarity and the property range
similarity. The property syntactic similarity is calculated using the N-gram algorithm.
The property range similarity is the similarity of the ranges of the two properties. If
the properties are object type properties, then their ranges are ontological concepts.
The property range similarity is the number of common properties of the range
concepts divided by the number of properties of the range concept of the service property.
Property Syntactic similarity (propSynSim): The property syntactic similarity is
the similarity of the ontological name of two properties. To find the syntactic similarity
of the ontological names, we have used the N-gram similarity algorithm.
Property Range similarity (propRangeSim): The property range similarity is the
ratio of common properties of the range concepts of the property to the list of properties
of the range concept of service property.
Cardinality Match (cardSim): The cardinality provides information for
determining whether the properties match. The cardinality match is: (a) 1 if both the
request property and the service property have the same cardinality; (b) 1 if both the
request property and the service property are functional properties; (c) 0.9 if the
request needs more than one value and the service property has only one value, since
the match is weaker when the request requirement is not met; and (d) 0.7 if the service
needs more than one value and the request property has only one value.
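These cardinality rules can be sketched as a small function; the cardinality values and functional-property flags are assumed to be read from the ontology.

```python
# Sketch of cardSim: compare declared cardinalities of a request property
# and a service property. Cardinalities are positive integers;
# `req_functional`/`svc_functional` mark functional properties.

def card_sim(req_card, svc_card, req_functional, svc_functional):
    if req_card == svc_card:
        return 1.0            # same cardinality
    if req_functional and svc_functional:
        return 1.0            # both are functional properties
    # request wants more values than the service provides -> 0.9,
    # service wants more values than the request provides -> 0.7
    return 0.9 if req_card > svc_card else 0.7
```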
Unmatched Properties (unMatchedProp): When matching two concepts, we can
encounter situations wherein the concepts do not have the same number of properties.
The request may have an equal, greater or smaller number of properties compared to
the service concept. In cases where the advertised concept has fewer properties than
the request, we penalize the concept by 0.05 for each unmatched property.
Table 3-4: Property Similarity

propSim(R.p, S.q) = (1/m) · Max( Σ_{i=1..m} Σ_{j=1..n} m_ij · x_ij )
  where x_ij = 1 if the pair (R.p_i, S.q_j) is selected and 0 otherwise,
  subject to Σ_j x_ij ≤ 1 for all i and Σ_i x_ij ≤ 1 for all j

m_ij = c · ( rangeSim(R.p_i, S.q_j) + cardSim(R.p_i, S.q_j) + propSynSim(R.p_i, S.q_j) ) / 3
       − 0.05 · unMatchedProp(R.p_i, S.q_j)

c =
  1    if R.p_i, S.q_j are inverse functional properties
  1    if R.p_i, S.q_j are not inverse functional properties
  0.8  otherwise

rangeSim(R.p_i, S.q_j) = w8 · propSynSim(R.p_i, S.q_j) + w9 · propRangeSim(R.p_i, S.q_j)

propRangeSim(R.p_i, S.q_j) = |p(R.p_i.Range) ∩ p(S.q_j.Range)| / |p(S.q_j.Range)|

cardSim(R.p_i, S.q_j) =
  1    if cardinality(R.p_i) = cardinality(S.q_j)
  1    if R.p_i, S.q_j are functional properties
  0.9  if cardinality(R.p_i) > cardinality(S.q_j)
  0.7  if cardinality(R.p_i) < cardinality(S.q_j)

propSynSim(R.p_i, S.q_j) = nGramSim(R.p_i.label, S.q_j.label)
The weights are all normalized, (w1, w2, w3, w4) = {3/10, 3/10, 3/10, 1/10}, (w5,
w6, w7) = {1/3, 1/3, 1/3} and (w8, w9) = {1/2, 1/2}.
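A sketch of the per-pair score m_ij from Table 3-4 follows. The component similarities are passed in precomputed, and the placement of the unmatched-property penalty inside the bracket follows our reading of the table:

```python
def m_score(range_sim, card_sim, prop_syn_sim, unmatched, inv_functional_match=True):
    """Per-pair property score m_ij.

    inv_functional_match is True when the two properties agree on being
    (or not being) inverse functional; otherwise the score is scaled by 0.8.
    unmatched is the count of unmatched properties, penalized 0.05 each.
    """
    c = 1.0 if inv_functional_match else 0.8
    w5 = w6 = w7 = 1 / 3  # normalized weights from the thesis
    return c * (w5 * range_sim + w6 * card_sim + w7 * prop_syn_sim
                - 0.05 * unmatched)
```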
CHAPTER 4
EVALUATION AND DISCUSSION OF RESULTS
The evaluations of the algorithms are based on the first SAWSDL collection,
SAWSDL-TC1 [30]. It is publicly available and contains 895 services with annotations
spread over 24 ontologies. The collection has 26 request files, each associated with a
binary relevance set listing all the relevant services; relevance is binary, i.e., a service is
either "relevant" or "not relevant". Each service in the collection is restricted to a single
interface and a single operation, and all annotations refer to OWL ontologies. From the
SAWSDL-TC1 collection we carefully picked 36 services and 4 request services whose
annotations refer to a single ontology. For each request, we have a relevance set drawn
from the 36 services, obtained from SAWSDL-TC1. Request R1 (shown in Appendix A)
has 4 relevant services; the other three requests R2, R3 and R4 have 7, 4 and 7,
respectively. The operations of the services in the collection are not annotated, i.e., there
are no model references describing the operations' functionality. We modified the
services to refer to ontologies on the Web rather than on a local HTTP server, as they did
before modification.
IMPLEMENTATION OF THE ALGORITHMS
SAWSDL-MX GUI TOOL
The SAWSDL-MX tool [31] is publicly available. It is implemented in Java and
provides a GUI to run the algorithm. In this thesis, we did not re-implement the
algorithm; we ran and tested the GUI version of SAWSDL-MX 2.1. Running the tool is a
three-step process.
The first step is the Test Collection. We can either load an existing collection or
create a new one. To create a collection, we add the set of services in the service offers
section, either by URL or from the local system; we added 36 services. Similarly, we add
the set of requests in the service request section; we added 4 requests and, for each
request, its relevance set.
The second step is the Matchmaker, which is divided into SAWSDL-MX1 and
SAWSDL-MX2. In SAWSDL-MX1, we can choose one of the variants SAWSDL-
(M0-M4) in the configuration dropdown. Based on the configuration selected, we can
change the minimum degree of match or the text similarity threshold. After selecting the
variant and the threshold, pressing the Match button starts the matchmaking process.
The third and final step is the Evaluation, which is divided into two sections: one
showing the ranking of services for the request and the other showing evaluation charts
of precision, recall and response time. Along with the ranking, the degree of match and
score for each service are shown. Each service in the ranking is color-coded "Black" or
"Red", meaning relevant or irrelevant according to the relevance set defined for each
request in the first step.
In the implementation of SAWSDL-MX, only the top-level annotations of the
complex types in the SAWSDL are considered. If there are multiple annotations, i.e.,
multiple model references on a single element, only one randomly chosen model
reference is used. The algorithm is not restricted to one ontology reference; a SAWSDL
file may reference multiple ontologies.
TVERSKY MODEL PROTOTYPE IMPLEMENTATION
The Tversky Model Prototype API [32] is implemented in Java. Requests are
created in the program by constructing a WS_Spec object, which stores service
information such as the service name, ontology path, operation name, operation model
reference, input parameters, input model references, input types, output parameters,
output model references and output types. The services are stored in eXist [33], a native
XML database. Each service in the database is retrieved and stored locally when the
algorithm runs. We created a SAWSDL parser to parse the SAWSDL service files and
extract the information needed to build the WS_Spec object.
We implemented the N-gram similarity algorithm used by the Tversky model,
along with the Hungarian algorithm.
The following steps explain the setup and execution of the algorithm.
The API uses a native XML database as the registry for the services. Install the
eXist XML database, then create a collection in the database and upload the services.
Edit the configuration file (Config.java) to point to the location of the registry
and to a local directory into which the services are downloaded.
Create a request by giving the path to the ontology, an operation with its model
reference, a set of inputs with name, type and model reference, and a set of outputs
with name, type and model reference.
Running the algorithm gives the match score for each operation pair of the
request and the service.
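The annotation-extraction step performed by our SAWSDL parser can be sketched as follows. This is a minimal stand-in, assuming only that annotations appear as sawsdl:modelReference attributes; the element name and ontology URI in the usage example are invented:

```python
import xml.etree.ElementTree as ET

SAWSDL_NS = "{http://www.w3.org/ns/sawsdl}"

def model_references(wsdl_text):
    """Collect (element name, modelReference) pairs from a SAWSDL document."""
    refs = []
    for elem in ET.fromstring(wsdl_text).iter():
        ref = elem.get(SAWSDL_NS + "modelReference")
        if ref:
            # fall back to the tag when the element carries no name attribute
            refs.append((elem.get("name", elem.tag), ref))
    return refs
```

A real parser would also separate inputs, outputs and operations to populate the WS_Spec fields; this sketch only gathers the raw annotations.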
In our implementation of the Tversky model, both the top-level and bottom-level
annotations of the complex types in the SAWSDL are considered. If there are multiple
annotations, i.e., multiple model references on a single element, only the first model
reference is used. If the request lacks inputs, outputs or a functional annotation, the
implementation removes the corresponding weight; for example, our test cases have no
concept references on the operation, so w4 is set to zero and the other weights are
renormalized. The implementation is restricted to one ontology reference per service;
however, the algorithm has conditions to deal with the case in which the request and
service reference different ontologies.
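The weight renormalization described above can be sketched as follows; the weight labels are illustrative, and the values mirror the normalized weights (3/10, 3/10, 3/10, 1/10) given in Chapter 3:

```python
def renormalize(weights, present):
    """Zero out the weights of missing parts and rescale the rest to sum to 1.

    weights: mapping of weight name to value, e.g. {"w1": 0.3, ..., "w4": 0.1}
    present: mapping of weight name to whether that part exists in the request
    """
    kept = {k: w for k, w in weights.items() if present.get(k, False)}
    total = sum(kept.values())
    return {k: (kept[k] / total if k in kept else 0.0) for k in weights}
```

With w4 dropped, the three remaining weights of 3/10 each rescale to 1/3 each.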
MWSDI API IMPLEMENTATION
The MWSDI API [32] is implemented in Java. As with the Tversky prototype,
requests are created in the program by constructing a WS_Spec object, which stores
service information such as the service name, ontology path, operation name, operation
model reference, input parameters, input model references, input types, output
parameters, output model references and output types. The services are stored in the
native XML database eXist. Each service in the database is retrieved and stored locally
when the algorithm runs. We created a SAWSDL parser to parse the SAWSDL service
files and extract the information needed to build the WS_Spec object.
We used the OWL API to find the concepts that are model-referenced in the
service, and we implemented the Hungarian algorithm and N-gram algorithm used by
MWSDI.
The following steps explain the setup and execution of the algorithm.
The API uses a native XML database as the registry for the services. Install the
eXist XML database, then create a collection in the database and upload the services.
Edit the configuration file (Config.java) to point to the location of the registry
and to a local directory into which the services are downloaded.
Create a request by giving the path to the ontology, an operation with its model
reference, a set of inputs with name, type and model reference, and a set of outputs
with name, type and model reference.
Running the algorithm gives the match score for each operation of the service.
In our implementation of the MWSDI algorithm, both the top-level and bottom-level
annotations of the complex types in the SAWSDL are considered. If there are
multiple annotations, i.e., multiple model references on a single element, only the first
model reference is used. The implementation is restricted to one ontology reference per
service. If the request lacks inputs, outputs or a functional annotation, the implementation
removes the corresponding weight; for example, our test cases have no concept
references on the operation, so w4 is set to zero and the other weights are renormalized.
The algorithm has no conditions for the case in which the request and service reference
different ontologies, so the individual scores may be low in that case.
The evaluation in this thesis goes a step further by comparing these algorithms
quantitatively using the match score, a number in the range [0, 1], where 0 means no
match and 1 an exact match.
For the statistical evaluation, we compared the precision, recall and F-measure of
the algorithms.
Recall (r) = number of correctly discovered services / number of all correct services

Precision (p) = number of correctly discovered services / number of discovered services

F-Measure (F) = 2 · precision · recall / (precision + recall)
Precision is the ratio of the number of true positives (correctly discovered
services) to the sum of true positives and false positives (the total number of discovered
services); it says nothing about whether all relevant services were found. Recall is the
ratio of true positives to the sum of true positives and false negatives (relevant services
that were not discovered); it says nothing about how many irrelevant services were
discovered. The F-Measure gives a balanced score for assessing the accuracy of the
algorithms.
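The three measures can be computed directly from the returned and relevant sets. The service names in the usage example are invented, mirroring request R1 in Table 4-1 (4 relevant services, 2 returned, both relevant):

```python
def precision_recall_f(returned, relevant):
    """Precision, recall and F-measure for sets of service identifiers."""
    tp = len(returned & relevant)                    # true positives
    p = tp / len(returned) if returned else 0.0      # precision
    r = tp / len(relevant) if relevant else 0.0      # recall
    f = 2 * p * r / (p + r) if p + r else 0.0        # harmonic mean
    return p, r, f
```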
The results for each of the algorithms with their Precision, Recall and F-measure
are shown in tables below. The experimental data used to test the algorithms are shown in
the Appendix. R1-R4 are the 4 requests used in the evaluations.
Table 4-1 gives the Precision, Recall and F-Measure of SAWSDL-M0, which is
logic-based matching with the minimum degree of match set to subsumed-by.
Table 4-2 gives the Precision, Recall and F-Measure of SAWSDL-M1, which is
hybrid logic-based matching with the Loss of Information syntactic similarity measure
and a syntactic threshold of 0.3.
Table 4-3 gives the Precision, Recall and F-Measure of SAWSDL-M2, which is
hybrid logic-based matching with the Extended Jaccard syntactic similarity measure and
a syntactic threshold of 0.3.
Table 4-4 gives the Precision, Recall and F-Measure of SAWSDL-M3, which is
hybrid logic-based matching with the Cosine syntactic similarity measure and a syntactic
threshold of 0.3.
Table 4-5 gives the Precision, Recall and F-Measure of SAWSDL-M4, which is
hybrid logic-based matching with the Jensen-Shannon syntactic similarity measure and a
syntactic threshold of 0.3.
Table 4-6 gives the Precision, Recall and F-Measure of the TVERSKY model
with a similarity threshold of 0.3.
Table 4-7 gives the Precision, Recall and F-Measure of MWSDI with a similarity
threshold of 0.3.
Table 4-8 gives the average Precision, Recall and F-Measure of the SAWSDL-MX
variants.
Table 4-9 gives the Precision, Recall and F-Measure for syntactic matching alone,
using the Extended Jaccard measure.
Table 4-1 Precision, Recall and F-Measure for SAWSDL-M0
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 2(2) 1.0 0.5 0.67
R2(7) 6(6) 1.0 0.86 0.92
R3(4) 6(4) 0.67 1.0 0.80
R4(7) 6(6) 1.0 0.86 0.92
Average 0.94 0.82 0.85
Table 4-2 Precision, Recall and F-Measure for SAWSDL-M1
Threshold = 0.3
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 17(4) 0.23 1.0 0.37
R2(7) 0(0) 0.0 0.0 0.0
R3(4) 14(4) 0.28 1.0 0.44
R4(7) 0(0) 0.0 0.0 0.0
Average 0.09 0.36 0.15
Table 4-3 Precision, Recall and F-Measure for SAWSDL-M2
Threshold = 0.3
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 7(3) 0.43 0.75 0.55
R2(7) 0(0) 0.0 0.0 0.0
R3(4) 8(4) 0.5 1.0 0.67
R4(7) 0(0) 0.0 0.0 0.0
Average 0.17 0.32 0.22
Table 4-4 Precision, Recall and F-Measure for SAWSDL-M3
Threshold = 0.3
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 17(4) 0.23 1.0 0.37
R2(7) 0(0) 0.0 0.0 0.0
R3(4) 18(4) 0.22 1.0 0.36
R4(7) 0(0) 0.0 0.0 0.0
Average 0.08 0.36 0.13
Table 4-5 Precision, Recall and F-Measure for SAWSDL-M4
Threshold = 0.3
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 17(4) 0.23 1.0 0.37
R2(7) 0(0) 0.0 0.0 0.0
R3(4) 16(4) 0.25 1.0 0.4
R4(7) 0(0) 0.0 0.0 0.0
Average 0.09 0.36 0.14
Table 4-6 Precision, Recall and F-Measure for Tversky Model
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 0(0) 0.0 0.0 0.0
R2(7) 1(1) 1.0 0.14 0.25
R3(4) 5(4) 0.80 1.0 0.89
R4(7) 1(1) 1.0 0.14 0.25
Average 0.78 0.27 0.32
Table 4-7 Precision, Recall and F-Measure for MWSDI
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 0(0) 0.0 0.0 0.0
R2(7) 2(2) 1.0 0.28 0.43
R3(4) 4(4) 1.0 1.0 1.0
R4(7) 1(1) 1.0 0.14 0.25
Average 0.82 0.31 0.40
Table 4-8 Average Precision, Recall and F-Measure for SAWSDL-MX Hybrid
Request Precision Recall F-Measure
R1(4) 0.28 0.94 0.41
R2(7) 0.0 0.0 0.0
R3(4) 0.31 1.0 0.47
R4(7) 0.0 0.0 0.0
Average 0.11 0.35 0.16
Table 4-9 Syntactic similarity using Extended Jaccard measure
Threshold = 0.3
Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4) 18(4) 0.22 1.0 0.36
R2(7) 0(0) 0.0 0.0 0.0
R3(4) 17(4) 0.23 1.0 0.37
R4(7) 0(0) 0.0 0.0 0.0
Average 0.08 0.36 0.13
The following figures give graphical representations of the Precision, Recall and
F-Measure values for TVERSKY, MWSDI, SAWSDL-M0 and the average of the
SAWSDL-MX hybrid variants.
Figure 4-1 Precision graph for SAWSDL M0, Tversky, MWSDI and average SAWSDL-
MX Hybrid
Figure 4-2 Recall graph for SAWSDL M0, Tversky, MWSDI and average SAWSDL-
MX Hybrid
Figure 4-3 F-Measure graph for SAWSDL M0, Tversky, MWSDI and average
SAWSDL-MX Hybrid
4.1 EVALUATION OF ALGORITHMS BASED ON THE RESULTS
The experiments were conducted on 4 requests and 36 advertisement services.
The SAWSDL-MX algorithm was executed in its 5 variants: SAWSDL-M0 with the
minimum degree of match set to Subsumed-by, and SAWSDL-(M1-M4) with a syntactic
similarity threshold of 0.3. The Tversky and MWSDI algorithms were also run with a
threshold of 0.3. The relevant services for each request are provided in the SAWSDL-TC
(Test Collection), which eliminated the need for human evaluators to identify them.
Based on the results for each algorithm, we made the following observations:
The hybrid approach of SAWSDL-MX, which integrates syntactic similarity into
the discovery and ranking of services, increased the number of false positives.
The results for SAWSDL-M0, the logic-based-only variant, show better precision
than the hybrid variants.
MWSDI matches concepts by taking a weighted average of syntactic similarity,
property similarity and coverage similarity, which is a more thorough matching
technique than that of the Tversky model, which computes the match score based only
on common-property matching.
MWSDI finds the best mapping between the sets of inputs and outputs using the
Hungarian algorithm to obtain the maximum overall match, whereas the Tversky model
averages the match scores of the individual input or output pairs.
MWSDI uses the Hungarian algorithm and performs a deeper hierarchy of
comparisons, which makes it slower.
The Tversky model and SAWSDL-MX handle operation matching between a
request and a service even if they are annotated with separate ontologies. MWSDI's
coverage similarity functions are dedicated to matching concepts from the same
ontology, which affects its match scores considerably.
SAWSDL-MX offers a choice of four syntactic similarity measures (Loss of
Information, Extended Jaccard, Cosine and Jensen-Shannon), whereas Tversky and
MWSDI rely only on the N-gram syntactic similarity measure.
The Extended Jaccard syntactic similarity measure gives better results than the
other similarity measures used in the SAWSDL-MX variants.
The Test Collection's services have model references only on inputs and outputs,
not on operations, which biases the comparison toward SAWSDL-MX, since TVERSKY
and MWSDI also rely on the functional similarity of the operation concept.
TVERSKY and MWSDI have better precision than SAWSDL-MX, which in turn
has better recall.
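The contrast noted above between the Tversky model's pairwise averaging and MWSDI's optimal mapping can be sketched as follows. The score matrix is invented, and `mwsdi_style` brute-forces the assignment that the Hungarian algorithm computes efficiently, assuming the request has no more parameters than the service:

```python
from itertools import permutations

def tversky_style(scores):
    """Average over all request/service pair scores (no explicit mapping);
    one simple reading of the Tversky model's aggregation."""
    flat = [s for row in scores for s in row]
    return sum(flat) / len(flat)

def mwsdi_style(scores):
    """Best one-to-one mapping of request parameters to service parameters."""
    m, n = len(scores), len(scores[0])  # assumes m <= n
    return max(sum(scores[i][j] for i, j in enumerate(p))
               for p in permutations(range(n), m)) / m
```

For two parameters that each match exactly one counterpart, the optimal mapping scores 1.0 while blanket averaging scores only 0.5, illustrating why the mapping strategy matters.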
CHAPTER 5
CONCLUSION AND FUTURE WORK
The goal of this work was to evaluate the effectiveness of several SAWSDL-based
semantic web service discovery algorithms, in particular, how well they fared in
precision and recall. The MWSDI algorithm was modified slightly to improve matching
accuracy: the original version included a context similarity component, which was
removed due to the difficulty of implementing it. MWSDI concentrates on matching the
functionality of the operation; unfortunately, the SAWSDL files in SAWSDL-TC carry
no annotations on the operation. The limitations of the SAWSDL-MX1 matchmaker are
that it relies only on top-level annotations, ignoring bottom-level annotations, and that it
performs operation matching based only on inputs and outputs, not on the functionality
of the operation. The findings from our evaluation of the three algorithms are as follows:
SAWSDL-MX1 performs well with the logic-based matching filter and has good recall.
The Tversky model handles the matching of services from different ontologies and
performed well, with good precision. The MWSDI approach also has good precision,
since it makes use of property similarity and, in addition, coverage similarity to handle
concepts at different levels of the same ontology.
In the future, we intend to do a comprehensive comparison of these algorithms
using SAWSDL test collections that have annotations on the operations of the services.
The Tversky algorithm could be improved by using the Hungarian algorithm when
matching the sets of inputs and outputs, and by using N-gram similarity to calculate the
syntactic similarity of the different parts of the WSDL. In addition, information retrieval
techniques such as the Loss of Information, Extended Jaccard, Cosine and Jensen-Shannon
similarity measures will be considered for the Tversky and MWSDI algorithms
instead of N-gram similarity alone. The weights in MWSDI are currently set arbitrarily;
in the future, machine learning techniques could be integrated to determine them. We
also intend to evaluate the SAWSDL-M0+WA and SAWSDL-MX2 matchmakers as part
of a future evaluation of SAWSDL-based semantic web service discovery algorithms.
REFERENCES
1. Web service glossary http://www.w3.org/TR/ws-gloss/.
2. Extensible Markup Language (XML) 1.0 (Fifth Edition) [Available at
http://www.w3.org/XML/].
3. SOAP Version 1.2 Part 1: Messaging Framework (Second Edition) [ Available at:
http://www.w3.org/TR/soap12-part1/].
4. Erik Christensen, Francisco Curbera, Greg Meredith, Sanjiva Weerawarana, W3C
Web Services Description Language (WSDL). [Available at:
http://www.w3c.org/TR/wsdl.].
5. HTTP - Hypertext Transfer Protocol [Available at:
http://www.w3.org/Protocols/].
6. Fielding, Roy Thomas (2000), Architectural Styles and the Design of Network-
based Software Architectures, Doctoral dissertation, University of California,
Irvine.
7. UDDI. Universal Description, Discovery, and Integration (UDDI v3.0). 2005
[Available at: http://www.uddi.org/].
8. Message announcing closure of Technical Committee. [Available at:
http://lists.oasis-open.org/archives/uddi-spec/200807/msg00000.html].
9. Web Services Discovery [Available at: http://msdn.microsoft.com/en-
us/library/f9t5yf68(VS.80).aspx.].
10. Web service Discovery [Available at:
http://en.wikipedia.org/wiki/Web_Services_Discovery].
11. OWL-S. OWL-based Web Service Ontology. 2004 [Available at:
http://www.daml.org/services/owl-s/].
12. WSMO. Web Services Modeling Ontology (WSMO). 2004 [Available at:
http://www.wsmo.org/].
13. Rama Akkiraju, Joel Farrell, John Miller, Meenakshi Nagarajan, Marc-Thomas
Schmidt, Amit Sheth, Kunal Verma, Web Service Semantics - WSDL-S,
http://www.w3.org/Submission/WSDL-S, Retrieved 10 Oct 2006.
14. Joel Farrell, Holger Lausen. Semantic Annotations for WSDL. 2007 [Available
at: http://www.w3.org/TR/sawsdl/].
15. Roberto Chinnici, Jean-Jacques Moreau, Arthur Ryman, Sanjiva Weerawarana,
Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language
[Available at: http://www.w3.org/TR/wsdl20/].
16. OWL. OWL Web Ontology Language Reference, W3C Recommendation.
[Available at: http://www.w3.org/TR/owl-features/].
17. Matthias Klusch, Patrick Kapahnke and Ingo Zinnikus: Hybrid Adaptive Web
Service Selection with SAWSDL-MX and WSDL Analyzer. The 6th Annual
European Semantic Web Conference (ESWC 2009).
18. Jorge Cardoso, John A. Miller and Savitha Emaini, "Web Services Discovery
Utilizing Semantically Annotated WSDL," in Reasoning Web 2008, Lecture
Notes in Computer Science (LNCS), Vol. 5224, Baroglio et al., Editors
(September 2008) pp. 240-268.
19. Kunal Verma, Amit P. Sheth, Swapna Oundhakar, Kaarthik Sivashanmugam, and
John A. Miller, "Allowing the Use of Multiple Ontologies for Discovery of Web
Services in Federated Registry Environment," Technical Report #UGA-CS-
LSDIS-TR-07-011, Department of Computer Science, University of Georgia,
Athens, Georgia (February 2007) pp. 1-27.
20. Semantic Web Services (SWS) [Available at:
http://en.wikipedia.org/wiki/Semantic_Web_Services].
21. SAWSDL file [Available at:
http://cs.uga.edu/~shiva/activivtyDestination_sawsdl.wsdl].
22. Ingo Zinnikus, H.J. Rupp, K. Fischer, "Detecting Similarities between Web
Service Interfaces: The WSDL Analyzer." In: Second International Workshop on
Web Services and Interoperability (WSI 2006), Pre-conference Workshop of the
Conference on Interoperability for Enterprise Software and Applications, I-ESA
2006, March 20-21, Bordeaux (2006).
23. Support Vector Machines (SVM) [Available at:
http://en.wikipedia.org/wiki/Support_vector_machine].
24. Matthias Klusch , B. Fries, and K. Sycara. Automated Semantic Web Service
Discovery with OWLS-MX. In 5th International Conference on Autonomous
Agents and Multi-Agent Systems (AAMAS). 2006. Hakodate, Japan: ACM Press.
25. Term Frequency-Inverse Document Frequency. [Available at:
http://en.wikipedia.org/wiki/Tf–idf].
26. Bipartite Graph. [Available at: http://en.wikipedia.org/wiki/Bipartite_graph].
27. Tversky, A., Features of Similarity. Psychological Review, 1977.
84(4): p. 327-352.
28. Hungarian Algorithm. [Available at:
http://en.wikipedia.org/wiki/Hungarian_algorithm].
29. Ngram Algorithm. [Available at :
http://74.125.47.132/search?q=cache:wXdDByJaveoJ:www.cs.ualberta.ca/~kondr
ak/papers/spire05.ps+n-GRam+algorithm+.ps&hl=en&ct=clnk&cd=10&gl=us].
30. SAWSDL-TC1 test collection. [Available at:
http://projects.semwebcentral.org/projects/sawsdl-tc/].
31. SAWSDL-MX tool. [Available at:
http://projects.semwebcentral.org/projects/sawsdl-mx/].
32. Discovery algorithms, Tversky and MWSDI. [Available at:
http://cs.uga.edu/~shiva/Discovery.zip].
33. Native XML database eXist. [Available at:
http://www.exist.com/].
34. Jaccard index. [Available at: http://en.wikipedia.org/wiki/jaccard_index].
35. Cosine similarity. [Available at: http://en.wikipedia.org/wiki/Cosine_similarity].
36. Jensen-Shannon divergence. [Available at: http://en.wikipedia.org/wiki/Jensen-
Shannon_divergence].
A. APPENDIX
TEST CASES
The SAWSDL-TC is publicly available. The 5 request files and the 36 services
we used for testing are available at http://cs.uga.edu/~shiva/sawsdl-tc1.zip. Below are
examples of a request service and an advertisement service.
REQUESTS
Here are the samples of requests that have been used. Two of the used requests
are shown in the following pages.
Figure A-1 novel_author_service.wsdl
Figure A-2 novel_author_service.wsdl (continued)
Figure A-3 surfing_destination_service.wsdl
Figure A-4 surfing_destination_service.wsdl (continued)
SERVICES
The following are some of the advertisement SAWSDL services used.
Figure A-5 novel_authorbook-type_service.wsdl
Figure A-6 novel_authorbook-type_service.wsdl (continued)
Figure A-7 activity_destination_service.wsdl
Figure A-8 activity_destination_service.wsdl (continued)
Figure A-9 activity_beach_service.wsdl
Figure A-10 activity_beach_service.wsdl (continued)