Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues
By A. Kementsietsidis, M. Arenas and R.J. Miller
Presented by Md. Anisur Rahman: 3558643
Anahit Martirosyan: 100628480
LianXiang Qiu: 3603336
University of Ottawa
Winter 2004
Outline
- P2P Data-Sharing System
- Mapping Table
- Alternative Semantics for Mapping Tables
- Mapping Tables as Constraints
- An algorithm for checking consistency of the existing mappings and inferring new mappings from them
- Conclusion and Future Work
Peer-to-Peer Data-Sharing System
What is a Mapping Table?
Relation GDB
GDB_id  Gene_Name
G1      NF1
G2      NF2
G3      NGFB
Relation SwissProt
SwissProt_id  Protein_name
P9            NF1
P40           MERL
Mapping Table
GDB_id  SwissProt_id
G1      P9
G1      Q62
G2      P40
G3      P38
A mapping table m from a set of attributes X to a set of attributes Y is a finite set of mappings over $X \cup Y$.
Alternative Semantics for Mapping Tables
Closed-Closed-World Semantics
GDB_id  SwissProt_id
G2      P40
Closed-Open-World Semantics
GDB_id    SwissProt_id
G2        P40
v - {G2}  v' - {P40}
Valuation over a mapping table
A valuation $\rho$ over a mapping table m is a function that maps each constant value in m to itself and each variable v of m to a value in the domain of the attribute where v appears. If v appears in an expression of the form v - S, then $\rho(v) \notin S$.
Mapping table m
Attr1      Attr2
a          3
b          2
v - {a,b}  1
dom(Attr1) = {a, b, c, d}
dom(Attr2) = {1, 2, 3}
$\rho(a) = a$, $\rho(3) = 3$, and either $\rho(v) = c$ or $\rho(v) = d$
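The valuation rules can be checked mechanically. The following is a minimal sketch, not code from the paper: the `Var` class, the `is_valuation` function, and the dictionary encoding of a valuation are all hypothetical names chosen for illustration, and the sketch assumes that variable names never collide with constant values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable cell such as v - {a, b}; `excluded` is the set S (hypothetical encoding)."""
    name: str
    excluded: frozenset = frozenset()


def is_valuation(rho, table, domains):
    """Return True if the dict `rho` is a valuation over `table`.

    Constants must map to themselves (a constant missing from `rho` is
    treated as mapped to itself); a variable v - S must map to a value
    in the domain of its attribute that is not in S.
    """
    for row in table:
        for cell, dom in zip(row, domains):
            if isinstance(cell, Var):
                value = rho.get(cell.name)
                if value not in dom or value in cell.excluded:
                    return False
            elif rho.get(cell, cell) != cell:
                return False
    return True


# Mapping table m and the attribute domains from the slide.
m = [("a", 3), ("b", 2), (Var("v", frozenset({"a", "b"})), 1)]
domains = [{"a", "b", "c", "d"}, {1, 2, 3}]

ok_c = is_valuation({"v": "c"}, m, domains)            # True
ok_d = is_valuation({"v": "d"}, m, domains)            # True
bad_excluded = is_valuation({"v": "a"}, m, domains)    # False: a is in S
bad_constant = is_valuation({"v": "c", "a": "b"}, m, domains)  # False: a must map to a
```

Only c and d are accepted as images of v, which agrees with the two valuations listed on the slide.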
Mapping Constraint
Relation GDB
GDB_id  Gene_Name
G1      NF1
G2      NF2
G3      NGFB
Relation SwissProt
SwissProt_id  Protein_name
P9            NF1
P40           MERL
Mapping table m
GDB_id    SwissProt_id
G2        P40
v - {G2}  v' - {P40}
Mapping constraint: $\mu : \text{GDB\_id} \xrightarrow{m} \text{SwissProt\_id}$
A relation having attributes from both GDB and SwissProt
GDB_id  Gene_Name  SwissProt_id  Protein_name
G1      NF1        P9            NF1
G2      NF2        P40           MERL
G3      NGFB       P9            NF1
G2      NF2        P9            NF1
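To make the constraint on this slide concrete, here is a small sketch that tests each tuple of the combined relation against mapping table m. It rests on an assumption that the slides do not spell out: a tuple is taken to satisfy the constraint when its (GDB_id, SwissProt_id) projection matches at least one row of m, where a constant matches itself and a variable v - S matches any value outside S. The names `Var`, `matches`, and `violations` are hypothetical.

```python
from typing import NamedTuple


class Var(NamedTuple):
    """A variable cell v - S (hypothetical encoding)."""
    name: str
    excluded: frozenset


def matches(row, values):
    """True if every value is compatible with the corresponding cell of the row."""
    return all(
        (value not in cell.excluded) if isinstance(cell, Var) else (cell == value)
        for cell, value in zip(row, values)
    )


def violations(relation, attrs, table):
    """Return the tuples whose projection on `attrs` matches no row of `table`."""
    return [
        t for t in relation
        if not any(matches(row, [t[a] for a in attrs]) for row in table)
    ]


# Mapping table m from the slide.
m = [
    ("G2", "P40"),
    (Var("v", frozenset({"G2"})), Var("v'", frozenset({"P40"}))),
]

# The relation with attributes from both GDB and SwissProt.
relation = [
    {"GDB_id": "G1", "Gene_Name": "NF1", "SwissProt_id": "P9", "Protein_name": "NF1"},
    {"GDB_id": "G2", "Gene_Name": "NF2", "SwissProt_id": "P40", "Protein_name": "MERL"},
    {"GDB_id": "G3", "Gene_Name": "NGFB", "SwissProt_id": "P9", "Protein_name": "NF1"},
    {"GDB_id": "G2", "Gene_Name": "NF2", "SwissProt_id": "P9", "Protein_name": "NF1"},
]

bad = violations(relation, ["GDB_id", "SwissProt_id"], m)
```

Under this reading, only the last tuple (G2 paired with P9) is reported: it fails the first row because P9 is not P40, and fails the second row because G2 is excluded from v.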
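Extension of a mapping constraint
Given a mapping constraint $\mu : X \xrightarrow{m} Y$, $\mathrm{ext}(\mu) = \{\rho(t) \mid t \in m \text{ and } \rho \text{ is a valuation over } m\}$
Mapping table m, with $\mu : \text{Attr1} \xrightarrow{m} \text{Attr2}$
Attr1      Attr2
a          3
b          2
v - {a,b}  1
dom(Attr1) = {a, b, c, d}
dom(Attr2) = {1, 2, 3}
$\mathrm{ext}(\mu)$
Attr1  Attr2
a      3
b      2
c      1
d      1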
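With finite domains, the extension can be enumerated directly from the definition. The sketch below is illustrative only: the function name `ext` mirrors the slide's notation, a variable cell is encoded as a hypothetical `(name, excluded_set)` pair, and each variable is assumed to occur only once per row.

```python
from itertools import product


def ext(table, domains):
    """Enumerate the extension of a mapping table over finite domains.

    A constant cell contributes only itself; a variable cell
    (name, excluded_set) contributes every domain value outside the set.
    """
    tuples = set()
    for row in table:
        choices = []
        for cell, dom in zip(row, domains):
            if isinstance(cell, tuple):   # variable cell: (name, excluded set)
                choices.append(dom - cell[1])
            else:                         # constant cell
                choices.append({cell})
        tuples.update(product(*choices))
    return tuples


# Mapping table m and the domains from the slide.
m = [("a", 3), ("b", 2), (("v", {"a", "b"}), 1)]
domains = [{"a", "b", "c", "d"}, {1, 2, 3}]

extension = ext(m, domains)
print(sorted(extension))  # [('a', 3), ('b', 2), ('c', 1), ('d', 1)]
```

The variable row expands to (c, 1) and (d, 1), giving the four tuples shown in the extension table.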
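Cover of a set of mapping constraints
A mapping constraint $\mu : X \xrightarrow{m} Y$ is called the cover of a set of mapping constraints $\Sigma$ if
- $\Sigma$ is consistent if and only if there exists $t \in \mathrm{ext}(\mu)$
- For every mapping constraint $\mu' : X \xrightarrow{m'} Y$, $\Sigma \models \mu'$ if and only if $\mathrm{ext}(\mu) \subseteq \mathrm{ext}(\mu')$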
Example of Cover
Relation r1
A1  A2
p   x
q   y
r   z
Relation r2
B1  B2
px  1
qy  2
rz  3
rx  4
Relation r3
C1  C2
a   i
b   j
c   k
Mapping table m1
A1         A2  B1
p          x   px
q          y   qy
v - {p,q}  v'  v'' - {px,qy}
Mapping table m2
B1           C1  C2
px           a   i
qy           b   j
v - {px,qy}  v'  v'' - {i,j}
Mapping table m
A1  A2  C1  C2
p   x   a   i
q   y   b   j
$\mu_1 : A_1, A_2 \xrightarrow{m_1} B_1$
$\mu_2 : B_1 \xrightarrow{m_2} C_1, C_2$
$\mu : A_1, A_2 \xrightarrow{m} C_1, C_2$
$\Sigma = \{\mu_1, \mu_2\}$
The Algorithm
Input:
- A path P1, P2, ..., Pn of peers
- A set $\Sigma$ of mapping constraints over the path
- Two sets of attributes X and Y in peers P1 and Pn, respectively
Output: A mapping constraint $\mu : X \xrightarrow{m} Y$ that is a cover of $\Sigma$
How is the Algorithm useful?
To check whether $\Sigma \models \mu'$:
- Run the algorithm to find the cover $\mu$
- Check whether $\mathrm{ext}(\mu) \subseteq \mathrm{ext}(\mu')$
To check whether $\Sigma$ is consistent:
- Run the algorithm to find the cover $\mu$
- Check whether $\mathrm{ext}(\mu)$ is nonempty
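Once a cover is available, both checks reduce to simple set operations on extensions. The sketch below assumes the extensions are finite and already materialized as Python sets. The function names and the second extension `ext_other` are hypothetical and chosen only for illustration; `ext_cover` reuses the extension from the earlier Attr1/Attr2 example.

```python
def is_consistent(ext_cover):
    """The constraint set is consistent iff the extension of its cover is nonempty."""
    return len(ext_cover) > 0


def implies(ext_cover, ext_candidate):
    """The constraint set implies a candidate constraint iff the cover's
    extension is a subset of the candidate's extension."""
    return ext_cover <= ext_candidate


# Extension from the Attr1/Attr2 example, treated here as the cover's extension.
ext_cover = {("a", 3), ("b", 2), ("c", 1), ("d", 1)}

# A hypothetical, more permissive candidate that also allows (a, 1) and (b, 1).
ext_other = ext_cover | {("a", 1), ("b", 1)}

consistent = is_consistent(ext_cover)            # True
implied = implies(ext_cover, ext_other)          # True
not_implied = implies(ext_cover, {("a", 3)})     # False
```

The expensive part in practice is computing the cover itself; the checks that follow are comparatively cheap.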
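An Example
Path: P1, P2, P3, P4
$\Sigma = \{\mu_1, \mu_2, \ldots, \mu_{11}\}$
P1: $\{A_1, A_2, \ldots, A_6\}$
P2: $\{B_1, B_2, \ldots, B_6\}$
P3: $\{C_1, C_2, C_3, C_4\}$
P4: $\{D_3, D_4\}$
$\mu_1 : A_1 \xrightarrow{m_1} B_1$
$\mu_2 : A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$\mu_3 : A_3 \xrightarrow{m_3} B_2, B_3$
$\mu_4 : A_4 \xrightarrow{m_4} B_4$
$\mu_5 : A_5 \xrightarrow{m_5} B_5$
$\mu_6 : A_6 \xrightarrow{m_6} B_6$
$\mu_7 : B_1, B_2 \xrightarrow{m_7} C_1$
$\mu_8 : B_3 \xrightarrow{m_8} C_2$
$\mu_9 : B_5 \xrightarrow{m_9} C_3$
$\mu_{10} : C_3 \xrightarrow{m_{10}} D_3$
$\mu_{11} : C_4 \xrightarrow{m_{11}} D_4$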
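Partitions
Partition 1:
$\mu_1 : A_1 \xrightarrow{m_1} B_1$
$\mu_2 : A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$\mu_3 : A_3 \xrightarrow{m_3} B_2, B_3$
Partition 2:
$\mu_4 : A_4 \xrightarrow{m_4} B_4$
Partition 3:
$\mu_5 : A_5 \xrightarrow{m_5} B_5$
Partition 4:
$\mu_6 : A_6 \xrightarrow{m_6} B_6$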
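The grouping on this slide is consistent with treating partitions as connected components: two constraints land in the same partition when they are linked through shared attributes. The sketch below works under that assumption, which the slides do not state explicitly. The function name `partitions` and the `mu1` to `mu6` labels are hypothetical, and the attribute sets are as read from the slide.

```python
def partitions(constraints):
    """Group constraint names into connected components via shared attributes.

    `constraints` maps a constraint name to the set of attributes it mentions.
    Uses a small union-find structure.
    """
    parent = {name: name for name in constraints}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    owner = {}  # attribute -> first constraint seen that uses it
    for name, attrs in constraints.items():
        for attr in attrs:
            if attr in owner:
                parent[find(name)] = find(owner[attr])
            else:
                owner[attr] = name

    groups = {}
    for name in constraints:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


# Constraints between peers P1 and P2.
p1_p2 = {
    "mu1": {"A1", "B1"},
    "mu2": {"A1", "A2", "B1", "B2"},
    "mu3": {"A3", "B2", "B3"},
    "mu4": {"A4", "B4"},
    "mu5": {"A5", "B5"},
    "mu6": {"A6", "B6"},
}

result = partitions(p1_p2)
print(result)  # [['mu1', 'mu2', 'mu3'], ['mu4'], ['mu5'], ['mu6']]
```

The first three constraints are chained through A1, B1, and B2, while the remaining three share no attributes with any other constraint, which reproduces the four partitions shown.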
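Inferred Partitions
Peer P1
Partition 1: $A_1 \xrightarrow{m_1} B_1$; $A_1, A_2 \xrightarrow{m_2} B_1, B_2$; $A_3 \xrightarrow{m_3} B_2, B_3$
Partition 2: $A_4 \xrightarrow{m_4} B_4$
Partition 3: $A_5 \xrightarrow{m_5} B_5$
Partition 4: $A_6 \xrightarrow{m_6} B_6$
Peer P2
Partition 5: $B_1, B_2 \xrightarrow{m_7} C_1$
Partition 6: $B_3 \xrightarrow{m_8} C_2$
Partition 7: $B_5 \xrightarrow{m_9} C_3$
Inferred partition over P1 and P2
$A_4 \xrightarrow{m_4} B_4$
$A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$A_3 \xrightarrow{m_3} B_2, B_3$
$B_1, B_4 \xrightarrow{m_7} C_1$
$B_3 \xrightarrow{m_8} C_2$
$A_1 \xrightarrow{m_1} B_1$
$B_5 \xrightarrow{m_9} C_3$
$A_5 \xrightarrow{m_5} B_5$
$A_6 \xrightarrow{m_6} B_6$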
Advantages of Partitioning
While computing the cover, partitioning reduces computational cost as fewer constraints are considered at a time.
Different partitions can be processed in parallel.
Description of the Algorithm
The algorithm has two phases:
- The Information Gathering Phase
- The Computation Phase
Information Gathering Phase
Peers on the path: P1, P2, P3, P4
P1:
- Compute partitions
- For each partition, send to P2 the set of attributes in the partition
P2:
- Compute own partitions
- Compute inferred partitions using the information about the partitions of P1
P3:
- Compute own partitions
- Compute inferred partitions using the information about the inferred partitions propagated from P2
Computation Phase
Peers on the path: P1, P2, P3, P4
P3:
- Using the local constraints of the inferred partition, computes a cover between P3 and P4
- The mappings belonging to the cover are streamed to peer P2
P2:
- Determines with which of its own partitions the incoming stream of mappings should be associated
- With this information, it generates a cover between itself and P4
P1:
- Uses the incoming stream of mappings to generate a cover between its own attributes and those of peer P4
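The backward flow of this phase can be sketched with generators. This is a heavily simplified illustration, not the paper's algorithm: it assumes ground mapping tables (no variables), a single attribute per peer, and a single partition, so a cover is obtained by joining on the shared attribute. The partition-association step at P2 is omitted. The function names and the sample values (`a1`, `b1`, and so on) are hypothetical.

```python
def compose(local_table, incoming):
    """Join a local ground table of (x, y) pairs with a stream of (y, z) pairs.

    Yields (x, z) pairs, which play the role of the cover between the
    local peer and the last peer on the path.
    """
    by_source = {}
    for y, z in incoming:
        by_source.setdefault(y, set()).add(z)
    for x, y in local_table:
        for z in sorted(by_source.get(y, ())):
            yield (x, z)


def cover_along_path(tables):
    """tables[i] holds the mappings between peer i+1 and peer i+2.

    Start from the last pair of peers and stream the result backward
    toward the first peer, composing at each step.
    """
    stream = iter(tables[-1])
    for table in reversed(tables[:-1]):
        stream = compose(table, stream)
    return list(stream)


# Hypothetical ground tables along the path P1, P2, P3, P4.
t12 = [("a1", "b1"), ("a2", "b2")]
t23 = [("b1", "c1"), ("b2", "c2"), ("b3", "c3")]
t34 = [("c1", "d1"), ("c3", "d3")]

cover_p1_p4 = cover_along_path([t12, t23, t34])
print(cover_p1_p4)  # [('a1', 'd1')]
```

Each peer consumes the stream produced by its successor and emits its own, so no peer needs to see the mapping tables of peers further down the path.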
Conclusion and Future Scope
This paper showed that by treating mapping tables as constraints on the exchange of information between peers it is possible to reason about them and check their consistency.
There is scope for investigating the use of mapping tables in support of query answering.
Thank You