HEPTOX¹: Marrying XML and Heterogeneity in Your P2P Databases
Angela Bonifati (Icar CNR, Italy), Elaine Q. Chang, Laks V.S. Lakshmanan, Terence Ho, Rachel Pottinger, Shuan Wang and Ting Wang (UBC, Canada)
¹ HEPTOX stands for “Heterogeneous Peers Talk!”
See our demo! This afternoon 14:00-15:30 & Thursday 14:00-15:30
Motivations for Marrying XML and Heterogeneity in P2P Databases
Peers contain similar and related XML data
Each peer wants to keep its own schema and yet needs to be mapped to others’ schemas [cf. LenzeriniTutorial@PODS02]
– autonomy, flexibility are important in P2P
– a global mediated schema is unfeasible
Queries are still formulated against one (e.g. local) schema
– Need to transparently cross the different schemas
Previous work [Clio, Hyperion, Piazza] could only handle limited heterogeneity
A P2P Network of Heterogeneous Hospitals
[Figure: a P2P network of peers Peer1, Peer2, ..., Peern, each with its own DTD (DTD1, DTD2, ..., DTDn). Schema fragments shown: Event (Date, Problem); Admission (Coronary, Pulmonary); ID, InsName, ...]
Heterogeneity and XML data: an example
Consider a P2P network of hospitals and an unfortunate patient moving among them:
– Option #1: the patient carries his/her own files and query translation is done manually
  It is error-prone, and unfeasible with several moves and with frequent joining/leaving of peers
– Option #2: the hospital db admin manually writes the mappings
  It is not that easy to find a person who knows the rule machinery that well!
Heterogeneity and XML data: an example
– Option #3: the hospital db admin provides informal arrows/boxes correspondences w.r.t. a set of acquaintances:
  Users/applications do not need to know the underlying mapping machinery and can keep it simple
  A peer’s entry into the network is a lightweight operation
What do input mappings look like?
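[Figure: two hospital schemas drawn as trees and connected by arrows and boxes, with the labels Source Schema and Target Schema.
Montreal Hospital: Patient; AdmissionID, MedCr#, Name, Hist; Event; Problem, Date; Treat; Desc; Doc; AdmDate, DisDate, PatRef.
Boston Hospital: Pulmonary; Admission; Coronary; ID, InsName, Policy#, Enter, Leave, Patient; Progress; PatRef, Symptom, Treatment; Date, Desc.
Edges and nodes carry the markers *, +, ? and @; ID/IDREF links are shown in both schemas.]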
Mappings in hospital schemas
XML mappings in HePToX are specified informally by arrows and boxes which encode:
– Data/Metadata correspondences
– Structure correspondences
The informal mappings are translated to Datalog-like mapping rules.
Data/Metadata conflicts are not dealt with in previous works:
– Addressed in HePToX
[Figure: Event (Date, Problem) on one side; Admission (Pulmonary, Coronary) on the other.]
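To make the data/metadata conflict concrete, here is a minimal sketch in Python. It assumes, based only on the element names in the figure, that one schema stores the kind of problem as a data value (Problem) while the other uses it as an element name (Coronary, Pulmonary). The nesting, the sample dates and the function name events_to_admission are hypothetical, and the code is not HePToX's Datalog-like rule language, only an illustration of what such a mapping has to accomplish.

```python
import xml.etree.ElementTree as ET

# Hypothetical source fragment: the kind of problem is stored as DATA.
SOURCE = """
<Patient>
  <Event><Problem>Coronary</Problem><Date>2005-03-01</Date></Event>
  <Event><Problem>Pulmonary</Problem><Date>2005-04-12</Date></Event>
</Patient>
"""

def events_to_admission(patient):
    """Turn each Problem value into an element name under Admission."""
    admission = ET.Element("Admission")
    for event in patient.findall("Event"):
        tag = event.findtext("Problem").strip()   # a data value ...
        ward = ET.SubElement(admission, tag)      # ... becomes metadata (a tag)
        ET.SubElement(ward, "Date").text = event.findtext("Date")
    return admission

admission = events_to_admission(ET.fromstring(SOURCE))
print(ET.tostring(admission, encoding="unicode"))
```

The point of the sketch is that no fixed renaming of tags can express this mapping: the target element name is computed from source data.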
Demo scenarios
Doctors: track patients
Patients: access their data
Insurance Companies: define the policy for a set of patients
Etc.
HePToX Contributions
GUI for specifying correspondences (arrows/boxes)
Datalog-like mapping language for working with complex XML trees
Rule Inference algorithm for producing the Datalog-like mapping rules
Query Translation algorithm based on those mappings that works for a significant subset of XQuery (tree patterns, TPs, with joins)
Our “Data Exchange” semantics, which differs from GAV and LAV mappings
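As a toy illustration of what query translation across schemas means, the sketch below rewrites one fixed query shape. It assumes a hypothetical layout in which a peer stores Admission elements whose children are named after the problem (Coronary, Pulmonary), while the user asks the question in terms of Event/Problem/Date. The functions translate and dates_for are made up for this example; this is not the HePToX translation algorithm, which works on tree patterns and mapping rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical data held by the remote peer (problem kind is an element name).
TARGET = ET.fromstring(
    "<Admission>"
    "<Coronary><Date>2005-03-01</Date></Coronary>"
    "<Pulmonary><Date>2005-04-12</Date></Pulmonary>"
    "</Admission>"
)

def translate(problem):
    """Rewrite the local-schema query  Event[Problem = <problem>]/Date
    into a path over the remote layout, where the Problem value
    has become an element name."""
    return f"{problem}/Date"

def dates_for(problem, admission):
    """Answer the local query on the remote peer's data."""
    return [d.text for d in admission.findall(translate(problem))]

print(dates_for("Coronary", TARGET))   # ['2005-03-01']
```

The user never sees the remote schema: the selection on a data value in the local query turns into a navigation step in the translated one.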
Demo Screenshots 1/2
Schema Mappings By Boxes/Arrows and Corresponding Datalog-like Rules
Demo Screenshots 2/2
Details of Query Translation Algorithm (for each pair <TP, MR>)
On HEPTOX implementation
The demo shows the following features:
– Draw mappings and show the generation of rules
– Show the query translation algorithm at work
– Show a real network emulation with Emulab
HEPTOX is implemented in Java:
– Uses QizX [QizX] as the underlying XQuery engine
– FreePastry as the underlying P2P protocol
– Emulab as the real network emulation environment
– It consists of ~10,000 lines of code
Come visit our demo booth: This afternoon 14:00-15:30 & Thursday 14:00-15:30