combining semantic and multimedia query routing techniques for unified data retrieval in a pdms*

24
Combining Semantic and Multimedia Combining Semantic and Multimedia Query Routing Techniques for Query Routing Techniques for Unified Data Retrieval in a PDMS* Unified Data Retrieval in a PDMS* Claudio Gennaro Claudio Gennaro 1 , Federica Mandreoli , Federica Mandreoli 2,4 2,4 , Riccardo , Riccardo Martoglia Martoglia 2 , , Matteo Mordacchini Matteo Mordacchini 1 , Wilma Penzo , Wilma Penzo 3,4 3,4 , and , and Simona Simona Sassatelli Sassatelli 2 1 ISTI – PI/CNR, Pisa ISTI – PI/CNR, Pisa 2 DII - Università degli Studi di Modena e Reggio Emilia DII - Università degli Studi di Modena e Reggio Emilia 3 3 DEIS - Università degli Studi di Bologna DEIS - Università degli Studi di Bologna 4 IEIIT – BO/CNR, Bologna IEIIT – BO/CNR, Bologna * This work is partially supported by the Italian co-founded Project NeP4B * This work is partially supported by the Italian co-founded Project NeP4B ISDSI 2009 – June 25th, 2009 ISDSI 2009 – June 25th, 2009

Upload: karen-harding

Post on 30-Dec-2015

29 views

Category:

Documents


2 download

DESCRIPTION

ISDSI 2009 – June 25th , 2009. Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*. Claudio Gennaro 1 , Federica Mandreoli 2,4 , Riccardo Martoglia 2 , Matteo Mordacchini 1 , Wilma Penzo 3,4 , and Simona Sassatelli 2 - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Combining Semantic and Multimedia Combining Semantic and Multimedia Query Routing Techniques for Unified Query Routing Techniques for Unified

Data Retrieval in a PDMS*Data Retrieval in a PDMS*

Claudio GennaroClaudio Gennaro11, Federica Mandreoli, Federica Mandreoli2,42,4, Riccardo Martoglia, Riccardo Martoglia22,,

Matteo MordacchiniMatteo Mordacchini11, Wilma Penzo, Wilma Penzo3,43,4, and , and Simona SassatelliSimona Sassatelli22

11 ISTI – PI/CNR, Pisa ISTI – PI/CNR, Pisa22 DII - Università degli Studi di Modena e Reggio Emilia DII - Università degli Studi di Modena e Reggio Emilia

3 3 DEIS - Università degli Studi di BolognaDEIS - Università degli Studi di Bologna44 IEIIT – BO/CNR, Bologna IEIIT – BO/CNR, Bologna

Claudio GennaroClaudio Gennaro11, Federica Mandreoli, Federica Mandreoli2,42,4, Riccardo Martoglia, Riccardo Martoglia22,,

Matteo MordacchiniMatteo Mordacchini11, Wilma Penzo, Wilma Penzo3,43,4, and , and Simona SassatelliSimona Sassatelli22

11 ISTI – PI/CNR, Pisa ISTI – PI/CNR, Pisa22 DII - Università degli Studi di Modena e Reggio Emilia DII - Università degli Studi di Modena e Reggio Emilia

3 3 DEIS - Università degli Studi di BolognaDEIS - Università degli Studi di Bologna44 IEIIT – BO/CNR, Bologna IEIIT – BO/CNR, Bologna

* This work is partially supported by the Italian co-founded Project NeP4B* This work is partially supported by the Italian co-founded Project NeP4B* This work is partially supported by the Italian co-founded Project NeP4B* This work is partially supported by the Italian co-founded Project NeP4B

ISDSI 2009 – June 25th, 2009 ISDSI 2009 – June 25th, 2009 ISDSI 2009 – June 25th, 2009 ISDSI 2009 – June 25th, 2009

Page 2: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Motivation ICTs over the Web have become a strategic asset Internet-based global market place where automatic cooperation

and competition are allowed and enhanced

The aim: development of an advanced technological infrastructure for SMEs to allow them to search for partners, exchange data and negotiate without limitations and constraints

The architecture: inspired by Peer Data Management Systems (PDMSs)

NeP4B Project(Networked Peers for Business)

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 3: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Peer1

Peer2

Peer3

Peer5

Peer4

Peer6

TXT MM

TXT MM

TXT MM

TXT MM

TXT MM

TXT MM

The Reference Scenario A peer: a single SME or a mediator

It is queried by exploiting a SPARQL-like query language similarity predicates (FILTER function LIKE )

It keeps its data in an OWL ontology Multimedia objects multimedia

attributes in the ontology (e.g. image)

Tile

pricecompany

Image

imagePeer1

domdom dom ran

size

dom

materialdom

origindom

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

SELECT ?companyWHERE { ?tile a Tile .

?tile company ?company . ?tile price ?price . ?tile origin ?origin . ?tile image ?image . FILTER ( (?price < 30) && (?origin = ‘Italy’) &&

LIKE (?image, ‘myimage.jpg’, 0.3))}LIMIT 100

Page 4: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Peer1

Peer2

Peer3

Peer5

Peer4

Peer6

TXT MM

TXT MM

TXT MM

TXT MM

TXT MM

TXT MM

The Reference Scenario (cont.) Peers are connected by means of semantic

mapping (with scores) Peers collaborate in solving users queries

How to effectively and efficiently answer a query?

By adopting effective and efficient query routing techniques Flooding is not adequate

Overloads the network (traffic & computational effort) Overwelms the querying peer (irrelevant results)

Q

Q’

Q’’

Queries are formulated on the peer’s ontology

Answers can come from any peer that is connected through a semantic path of mappings

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 5: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Combining Semantic and Multimedia Query Routing Techniques We leverage our distinct experiences on semantic [Mandreoli et al.

WIDM 2006, Mandreoli et al. WISE 2007] and multimedia [Gennaro et al. DBISP2P 2007, Gennaro et al. SAC 2008] query routing

We propose to combine our approaches in order to design an innovative mechanism for a unified data retrieval

We pursue: Effectiveness by selecting the semantically best suited subnetworks Efficiency by promoting the network’s zones where potentially

matching objects are more likely to be found, while pruning the others

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Two main aspects characterize our scenario: the semantic heterogeneity of the peers’ ontologies the execution of multimedia predicates

Page 6: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Outline

Motivation

Query Answering Semantics

Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing

Routing Strategies

Experimental Evaluation

Conclusions and Future Works

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 7: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Query Answering Semantics Peer pi has an ontology Oi ={Ci1 , … , Cim}

A semantic mapping: a fuzzy relation M(Oi,Oj) OiOj where each instance (C,C’) has a membership grade μ(C,C’) [0,1]

evaluation of f on a local data instance i: s(f,i) [0,1] s(f,i) = s( f(φ1, …, φn), i ) = sfunφ ( s(φ1,i), … , s(φn,i) )

Boolean semantics for relational predicates non-Boolean semantics for similarity predicates t-norm ( resp. t-conorm) for scoring conjunctions (resp. disjunctions)

Local query answers: Ans(f,pi) = { (i,s(f,i)) | s(f,i)>0 }

Local query execution

Query formula:f ::= <triple_pattern> <filter_pattern>

<triple_pattern> ::= triple | <triple_pattern> ∧ <triple_pattern>

<filter_pattern> ::= φ | <filter_pattern> <filter_pattern> |∧ <filter_pattern> ∨ <filter_pattern> |

(<filter_pattern>) φ is a relational (=,<,>, <=, >=) or similarity (~t) predicate

pi

Oi

f

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Ans(f,pi)

Page 8: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Query Answering Semantics (cont.)

f undergoes a chain of reformulations ff1…fm

s(f, Pp…pm ) = sfunr (s(f, p1), s(f1, p2), … , s(fm-1, pm))

sfunr is a t-norm

One-step query reformulation (pipj) according to the mapping M(Oi,Oj)

s(f,pj) = sfunc ( μ(C1 ,C’1), … , μ(Cn ,C’n) )

sfunc is a t-norm

Query answering semantics f submitted to p

P’ = {p1,…,pm} : set of accessed peers

Pp…pi : path used to reformulate f over each pi in P’

Ans (f, {p} U P’) = Ans(f,p) U Ans(f, Pp…p1 ) U … U Ans(f, Pp1…pm )

where : Ans(f, Pp…pi ) = (Ans(f,pi), s(f, Pp…pi ))

Multi-step query reformulation (pp1…pm = Pp1…pm )

pi pj

M(Oi,Oj)

Oi Oj

f

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

p p1 pm…

ff1

f2fm

(Ans(f,pj), s(f,pj))

(Ans(f,pm), s(f,Pp…pm))

Page 9: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Outline

Motivation

Query Answering Semantics

Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing

Routing Strategies

Experimental Evaluation

Conclusions and Future Works

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 10: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Peer1

Peer2

Peer3

Peer5

Peer4

Peer6

Semantic Query Routing

A Generalized Semantic Mapping relates each each concept C in Oi to the set of concepts Cs in Oj

s taken from the mappings in Ppi…pjs according to an aggregated score which expresses the semantic similarity between C and Cs.

Whenever pi forwards a query to one of its neighbors pj , the query might follow any of the semantic paths originating at pj, , i.e. in pj’s subnetwork

Main idea: introduction of a ranking approach for query routing which promotes pi’ neighbors whose subnetworks are the most semantically related to the query.

Preliminaries: pj

s : the set of peers in the subnetwork rooted at pj

Ojs: set of schemas {Ojk | pjk in pj

s}

Ppi…pjs : set of paths from pi to any peer in pj

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Peer2s

Page 11: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

When a peer receives a query formula f, it exploits its SRI scores to determine a ranking for its neighborhood: Rp

sem (f)[pi] = sfunc(μ(C1,C1s), …, μ(Cn,Cn

s))

Each peer p maintains a matrix named Semantic Routing Index (SRI) containing the membership grades given by the generalized semantic mappings between itself and its neighborhood Nb(p)

Peer1

Peer2

Peer3

Peer5

Peer4

Peer6

Semantic Query Routing (cont.)

SRISRIPeer1Peer1 TileTile originorigin ……

Peer1Peer1 1.01.0 1.01.0 ……

Peer2Peer2 0.850.85 0.700.70 ……

Peer3Peer3 0.650.65 0.850.85 ……

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

SRI[i,j] represents how the j-th concept is semantically approximated by the subnetwork rooted at the i-th neighbor

SRI-based Query Processing

Page 12: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Outline

Motivation

Query Answering Semantics

Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing

Routing Strategies

Experimental Evaluation

Conclusions and Future Works

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 13: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Multimedia Query Routing

Main idea: introduction of a ranking approach for query routing which promotes pi’ neighbors whose subnetworks contain the highest number of potentially matching objects.

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

The execution of multimedia predicates is inherently costly (CPU and I/O)

Preliminaries Each peer’s object is classified w.r.t. its distance (dissimilarity) to some

reference objects (pivots)E.g.: object O, pivot P, with d(O, P) = 42

Each peer maintains a condensed description of such a classification of its objects by using histograms (Peer Indices)

0 10042

0 0 0 01 Bit-vector

0 0 0 01

0 1 0 00

0 0 0 01} 0 1 0 02

Page 14: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Multimedia Query Routing (cont.) Each peer p also maintains a Multimedia Routing Index (MRI) containing

the aggregated description of the resources available in its neighbors’ subnetworks

Similarity-based Range Queries over metrics objects For each query Q, a vector representation QueryIdx(Q) is built by setting to

1 all the intervals covered by the requested range

MRI-based Query Processing

Rpmm(f)[pi]=

mink [QueryIdx(Q) * MRI(p,pis)]

Σ j=1…n mink [QueryIdx(Q) * MRI (p,pjs)]

When a peer p receives Q, it computes the percentage of potential matching objects in each neighbor’s subnetwork w.r.t the total objects in Nb(p):

Peer1

Peer2

Peer3

Peer5

Peer4

Peer6

Any MRI row MRI(p,pis) is built by summing up the Peer

Indices in the i-th neighbor’s subnetwork

MRIMRIPeer1Peer1 [0.0,0.2][0.0,0.2] [0.2,0.4][0.2,0.4] ……

Peer2Peer2 88 2929 ……

Peer3Peer3 300300 1,5211,521 ……

Page 15: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Outline

Motivation

Query Answering Semantics

Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing

Routing Strategies

Experimental Evaluation

Conclusions and Future Works

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 16: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Combined Query Routing

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Optimal aggregation algorithms can only work with monotone aggregation functions E.g. min, mean, sum…

Both the semantic and multimedia scores induce a total order They can be combined by means of a meaningful aggregate

function in order to obtain a global ranking:

We inspire to Fagin, Lotem & Naor: “Optimal Aggregation Algorithms for

Middleware”. Journal of Computer and System Sciences, 66: 47-58, 2003.

Rpcomb (f) = α Rp

sem (f) β Rpmm(f)

Page 17: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Combined Query Routing (cont.)

SRIPeer1 MRIPeer1 min( )

Peer2 0.70 0.53 0.53

Peer3 0.65 0.76 0.65

Peer1

Peer2

Peer3

Peer5

Peer4

Peer6

E.g.

SELECT ?companyWHERE { ?tile a Tile .

?tile company ?company . ?tile price ?price . ?tile origin ?origin . ?tile image ?image . FILTER ( (?price < 30) && (?origin = ‘Italy’) &&

LIKE (?image, ‘myimage.jpg’, 0.3))}LIMIT 100

Final ranking: Peer3-Peer2 Peer3 roots the most promising subnetwork!

Page 18: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Outline

Motivation

Query Answering Semantics

Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing

Routing Strategies

Experimental Evaluation

Conclusions and Future Works

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 19: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Routing Strategies

Efficiency (Depth First Model) the best peer in one hop!

o it exploits the only information provided by Rpcomb (f)

Effectiveness (Global Model) the best known peer!

o it exploits the information provided by Up is visited Rpcomb (f)

The adopted routing strategy determines the set of visited peers and induces an order on it

When a peer p receives Q it computes the ranking Rpcomb(f) on its

neighbors

Different routing strategies relying on such ranking and having different performance priorities are possible:

Page 20: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Outline

Motivation

Query Answering Semantics

Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing

Routing Strategies

Experimental Evaluation

Conclusions and Future Works

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 21: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Experimental Evaluation

Simulation environment for SRI and MRI Peers belong to different semantic categories

Ontologies: consisting of a small number of classes Multimedia contents: hundreds of images taken from the Web,

characterized by two MPEG7 standard features (scalable color & edge histogram)

Network topology: randomly generated with the BRITE tool in the size of few dozens of nodes

Queries: on randomly selected peers Routing strategy: DF search Aggregation function: mean Stopping condition: number of retrieved results

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Experimental setting

Page 22: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Effectiveness Evaluation

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

We measure the semantic quality of the obtained results (satisfaction)

Page 23: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Outline

Motivation

Query Answering Semantics

Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing

Routing Strategies

Experimental Evaluation

Conclusions and Future Works

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS

Page 24: Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*

Conclusions & Future Works We presented a novel approach for processing queries

effectively and efficiently in a distributed and heterogeneous environment, like the one of the NeP4B Project

We proposed an innovative query routing approach which exploits the semantic of the concepts in the peers’ontologies and the multimedia contents in the peers’repositories

We experimentally prove the effectiveness of our techniques with a series of exploratory tests

(In the future) we will: perform new tests on larger and more complex scenarios integrate our techniques in a more general framework for

query routing (including latency, costs, etc.)

ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS