data integration
Information Systems Group Candidacy Exam, Jan. 2010
Clio: Schema Mapping Creation and
Data Exchange
Presented by
Leila Jalali
Information Systems Group Leila Jalali, Candidacy Exam
The Clio project
Source schema S
Target schema T
• Wants data from S
• Understands T
• May not understand S
[Figure: a schema mapping relates source schema S to target schema T; the source data "conforms to" S, and data exchange transforms it into data that "conforms to" T.]
Clio addresses two main problems: how to generate schema mappings, and how to use them for data exchange.
Outline
The Motivating Example
1. Schema Mapping Generation
   Mapping generation algorithm
2. Data Exchange
   Query generation algorithm
Conclusions
A Motivating Example
Schema S:
Companies: Set of Rcd
  Name
  Address
  Year
Grants: Set of Rcd
  Gid
  Recipient
  Amount
  Supervisor
  Manager
Contacts: Set of Rcd
  Cid
  Phone
Schema T:
Organizations: Set of Rcd
  Code
  Year
  Fundings: Set of Rcd
    FId
    FinId
Finances: Set of Rcd
  FinId
  Budget
  Phone
Correspondences (given by a "schema matcher" or a "user"): v1, v2, v3, v4
Referential constraints (diagram labels): f1, f2, f3, f4
Correspondences
Using a tuple-generating dependency (tgd):
v1: ∀ n,d,y  Companies(n,d,y) → ∃ y',F  Organizations(n,y',F)
Equivalently, in the mapping language:
foreach c in companies
exists o in organizations
with o.code = c.name
More complex mappings
∀ n,d,y,g,a,s,m  Companies(n,d,y), Grants(g,n,a,s,m) → ∃ y',F,f,p  Organizations(n,y',F), F(g,f), Finances(f,a,p)
foreach c in companies, g in grants
where c.name = g.recipient
exists o in organizations, f in o.fundings, i in finances
where f.finId = i.finId
with o.code = c.name and f.fId = g.gId and i.budget = g.amount
More complex mappings (cont.)
In the mapping above, the foreach ... where part is a query on the source (QS), the exists ... where part is a query on the target (QT), and the with clause holds the correspondences relating QS to QT.
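To make the foreach / exists / with reading concrete, here is a minimal Python sketch that applies the example mapping to the small Companies and Grants tables used later in the deck. It is an illustration only, not Clio's generated query: the function names `skolem` and `exchange`, the choice of `g.gid` as the Skolem argument, the use of `None` for unmapped elements, and the grouping of fundings by organization code are all assumptions made for this sketch.

```python
# Source instance (values taken from the deck's sample tables).
companies = [
    {"name": "MS", "address": "SA", "year": 1976},
    {"name": "AT&T", "address": "TX", "year": 1980},
    {"name": "IBM", "address": "NY", "year": 1955},
]
grants = [
    {"gid": 301, "recipient": "MS", "amount": 30},
    {"gid": 302, "recipient": "MS", "amount": 40},
    {"gid": 303, "recipient": "IBM", "amount": 30},
]


def skolem(name, *args):
    """Hypothetical Skolem term: the same arguments always give the same invented value."""
    return f"{name}({', '.join(map(str, args))})"


def exchange(companies, grants):
    organizations = {}  # code -> organization record, so fundings are grouped (assumption)
    finances = []
    # foreach c in companies, g in grants where c.name = g.recipient
    for c in companies:
        for g in grants:
            if c["name"] != g["recipient"]:
                continue
            # exists o in organizations with o.code = c.name
            o = organizations.setdefault(
                c["name"], {"code": c["name"], "year": None, "fundings": []}
            )
            # f.finId = i.finId must hold, but no source value determines it,
            # so an invented value is shared by the funding and the finance record.
            fin_id = skolem("FinId", g["gid"])
            o["fundings"].append({"fid": g["gid"], "finId": fin_id})
            finances.append({"finId": fin_id, "budget": g["amount"], "phone": None})
    return list(organizations.values()), finances


organizations, finances = exchange(companies, grants)
```

AT&T has no grant, so this particular mapping produces no organization for it; only a mapping built from v1 alone would.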
Mapping Generation
Source schema: Companies (Name, Address, Year); Grants (Gid, Recipient, Amount, Supervisor, Manager); Contacts (Cid, Email, Phone)
Target schema: Organizations (Code, Year, Fundings (FId, FinId)); Finances (FinId, Budget, Phone)
Structural associations
• Generate all possible associations within the source.
• Generate all possible associations within the target.
• Examples: from g in grants; from p in companies; from o in organizations
Logical associations
• Build larger associations in the source (AS) and the target (AT), starting with a structural association and "chasing" constraints.
Clio mappings
• Use a pair <AS, AT> and the correspondences covered by <AS, AT> to generate a Clio mapping: foreach AS exists AT with W, where W is the conjunction of equalities h(eS) = h'(eT) (captured from the correspondences).
Clio mapping, example
AS: from g in grants, c in companies, s in contacts, m in contacts
    where g.recipient = c.name and g.supervisor = s.cid and g.manager = m.cid
AT: from o in organizations, f in o.fundings, i in finances
    where f.finId = i.finId
v1, v2, v3 are covered.
Generate a Clio mapping: foreach AS exists AT with W, where W is the conjunction of equalities h(eS) = h'(eT):
foreach g in grants, c in companies, s in contacts, m in contacts
where g.recipient = c.name and g.supervisor = s.cid and g.manager = m.cid
exists o in organizations, f in o.fundings, i in finances
where f.finId = i.finId
with c.name = o.code and g.gId = f.fId and g.amount = i.budget
Dominance
A2 dominates A1 (A1 ≤ A2) if the from and where clauses of A1 are subsets of those of A2 (after suitable renaming).
A2: from g in grants, c in companies, s in contacts, m in contacts
    where g.recipient = c.name and g.supervisor = s.cid and g.manager = m.cid
A1: from g in grants, c in companies
    where g.recipient = c.name
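The dominance test can be sketched as a brute-force search for a renaming. The representation below (a dict with a "from" list of (variable, set) pairs and a "where" list of equality pairs), the helper names `rename` and `dominates`, and the restriction to injective renamings are assumptions of this sketch, not something the slides specify.

```python
from itertools import permutations


def rename(expr, h):
    """Apply variable renaming h to an expression such as 'g.recipient'."""
    head, dot, rest = expr.partition(".")
    return h.get(head, head) + dot + rest


def dominates(a2, a1):
    """Return True if a1 <= a2 under some (injective) renaming of a1's variables."""
    vars1 = [v for v, _ in a1["from"]]
    vars2 = [v for v, _ in a2["from"]]
    gens2 = set(a2["from"])
    where2 = {frozenset(eq) for eq in a2["where"]}
    for image in permutations(vars2, len(vars1)):
        h = dict(zip(vars1, image))
        gens_ok = all((h[v], rename(e, h)) in gens2 for v, e in a1["from"])
        where_ok = all(
            frozenset((rename(left, h), rename(right, h))) in where2
            for left, right in a1["where"]
        )
        if gens_ok and where_ok:
            return True
    return False


A2 = {
    "from": [("g", "grants"), ("c", "companies"), ("s", "contacts"), ("m", "contacts")],
    "where": [("g.recipient", "c.name"), ("g.supervisor", "s.cid"), ("g.manager", "m.cid")],
}
A1 = {
    "from": [("g", "grants"), ("c", "companies")],
    "where": [("g.recipient", "c.name")],
}
# The same association as A1, written with different variable names.
A1_renamed = {
    "from": [("x", "grants"), ("y", "companies")],
    "where": [("y.name", "x.recipient")],
}
```

Trying every assignment of variables is exponential in general; it is used here only because the associations in the example are tiny.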
Coverage of a correspondence
A correspondence v: foreach PS exists PT with eS = eT is covered by a pair of associations <AS, AT> if PS ≤ AS and PT ≤ AT with some renamings h, h'.
Example:
AS: from c in companies
AT: from o in organizations
v: foreach c in companies exists o in organizations with c.name = o.code
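A small Python sketch of the coverage check follows. It is a simplification under stated assumptions: associations are reduced to their generator lists (where clauses are ignored, since the paths PS and PT of a correspondence carry none), and the names `renamings`, `coverages`, and the dictionary keys are hypothetical. For each pair of renamings h, h' it returns the equality h(eS) = h'(eT) that would go into the with clause.

```python
from itertools import permutations


def rename(expr, h):
    """Apply variable renaming h to an expression such as 'c.name'."""
    head, dot, rest = expr.partition(".")
    return h.get(head, head) + dot + rest


def renamings(path, assoc):
    """Yield every renaming h that maps the generators of `path` into `assoc`."""
    path_vars = [v for v, _ in path]
    assoc_vars = [v for v, _ in assoc]
    generators = set(assoc)
    for image in permutations(assoc_vars, len(path_vars)):
        h = dict(zip(path_vars, image))
        if all((h[v], rename(e, h)) in generators for v, e in path):
            yield h


def coverages(v, a_s, a_t):
    """List the equalities h(eS) = h'(eT) by which the pair <a_s, a_t> covers v."""
    return [
        (rename(v["eS"], h), rename(v["eT"], h2))
        for h in renamings(v["PS"], a_s)
        for h2 in renamings(v["PT"], a_t)
    ]


a_s = [("c", "companies")]
a_t = [("o", "organizations")]
v1 = {"PS": [("c", "companies")], "PT": [("o", "organizations")],
      "eS": "c.name", "eT": "o.code"}
v3 = {"PS": [("g", "grants")], "PT": [("i", "finances")],
      "eS": "g.amount", "eT": "i.budget"}

print(coverages(v1, a_s, a_t))  # v1 is covered by this pair
print(coverages(v3, a_s, a_t))  # v3 is not: the pair mentions neither grants nor finances
```

An empty list means the pair does not cover the correspondence; more than one entry means there are several coverages to choose from.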
Mapping Generation (cont.)
• Add each generated Clio mapping to the set of mappings.
Finds maximal sets of correspondences that can be interpreted together
Discard the “larger” mapping
Generate a Clio mapping
Logical associations are meaningful combinations of correspondences
Query generation for data exchange
Mapping generation
Query generation
Target schema
Source schema
Overview of Query Generation
Input: a Clio mapping
1. Construct a query graph that represents the key portions of the query.
2. Annotate the graph to generate Skolem terms.
3. Traverse the graph and produce the query.
Output: the data exchange query (in SQL, XQuery, or XSLT)
[Figure: the annotated query graph for the example mapping.]
1. Constructing the Query Graph
• Add a node for each variable in the exists clause: y0 (organizations), y1 (fundings), y2 (finances).
• Add nodes for all the atomic-type elements reachable from these nodes via record projection: y0.code, y0.year, y1.fid, y1.finId, y2.finId, y2.budget, y2.phone.
• Add structural edges to reflect the relationships between nodes.
• Add the source nodes for all source expressions in the with clause: x0.name, x1.gid, x1.amount, x2.phone.
• Attach the source nodes to the target nodes to which they are "equal".
• Use the equalities in the where clause to add edges between target nodes.
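The construction steps can be sketched in Python as follows. The dictionary layout, the function name `build_query_graph`, and the decision to make y1 a structural child of y0 (because y1 ranges over y0.fundings) are assumptions of this sketch; the node and expression names come from the slides.

```python
from collections import defaultdict

# Target side of the example mapping, as drawn in the query graph.
exists = [("y0", "organizations"), ("y1", "y0.fundings"), ("y2", "finances")]
atomic = {
    "y0": ["code", "year"],
    "y1": ["fid", "finId"],
    "y2": ["finId", "budget", "phone"],
}
where = [("y1.finId", "y2.finId")]
with_clause = [
    ("y0.code", "x0.name"),
    ("y1.fid", "x1.gid"),
    ("y2.budget", "x1.amount"),
    ("y2.phone", "x2.phone"),
]


def build_query_graph(exists, atomic, where, with_clause):
    graph = {
        "nodes": set(),
        "structural": defaultdict(set),  # parent node -> child nodes
        "equality": set(),               # unordered pairs of target nodes
        "sources": defaultdict(set),     # target node -> attached source expressions
    }
    for var, generator in exists:
        graph["nodes"].add(var)  # one node per variable in the exists clause
        parent = generator.split(".")[0]
        if parent != generator:  # nested set, e.g. y1 in y0.fundings (assumed child of y0)
            graph["structural"][parent].add(var)
        for attr in atomic[var]:  # atomic elements reachable via record projection
            node = f"{var}.{attr}"
            graph["nodes"].add(node)
            graph["structural"][var].add(node)
    for target, source in with_clause:  # attach source nodes to their "equal" target nodes
        graph["sources"][target].add(source)
    for left, right in where:  # equality edges between target nodes
        graph["equality"].add(frozenset((left, right)))
    return graph


graph = build_query_graph(exists, atomic, where, with_clause)
```

Keeping structural edges, equality edges, and source attachments in separate collections mirrors the three kinds of propagation used in the annotation step.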
2. Annotating the Graph
Each node is annotated with a set of source expressions.
• Upward propagation: every expression that a node acquires is propagated to its parent node, unless the (acquiring) node is a variable.
• Downward propagation: every expression that a node acquires is propagated to its children.
• Eq. propagation: every expression that a node acquires is propagated to the nodes related to it through equality edges.
Apply the rules until no more rules can be applied.
[Figure: the query graph after each propagation step, with the source expressions x0.name, x1.gid, x1.amount, and x2.phone annotating the target nodes.]
3. Generation of Transformation Queries
The foreach clause is converted to a query fragment.
[Figure: the generated query fragment.]
Perform a depth-first traversal of the graph.
Finally we have the query:
[Figure: the final query.]
Clio: Conclusion
Clio provides tools that help automate and manage the problem of data conversion.
The key contributions of Clio:
• Schema mapping generation
  • Mapping as a query discovery problem
  • Capable of mapping between relational and nested schemas
• Query generation for data exchange
  • SQL, XQuery, XSLT, generating Skolems, ...
Thanks
Backups
• Clio requirements
• Complex mappings: using associations
• Definitions: mapping language, paths, schema and types, dominance
• Query generation challenges; the problem of recursion in XML schema
• Nested Referential Integrity (NRI) constraints
• The Chase
The Clio project: overview of the requirements
• No assumptions about the schemas
• A general mapping language
• Capable of mapping between relational schemas and nested schemas
• Mapping at different levels of granularity
• Incremental mapping algorithms
Formalize correspondences
Using tuple-generating dependencies (tgds):
v1: ∀ n,d,y  Companies(n,d,y) → ∃ y',F  Organizations(n,y',F)
v2: ∀ n,d,y,g,a,s,m  Companies(n,d,y), Grants(g,n,a,s,m) → ∃ y',F,f  Organizations(n,y',F), F(g,f)
v3: ∀ g,r,a,s,m  Grants(g,r,a,s,m) → ∃ f,p  Finances(f,a,p)
v4: ∀ c,e,p  Contacts(c,e,p) → ∃ f,b  Finances(f,b,p)
Correspondences alone are not enough
Companies
Name   Address   Year
MS     SA        1976
AT&T   TX        1980
IBM    NY        1955
Grants
GId   Rec.t   Amt
301   MS      30
302   MS      40
303   IBM     30
[Figure: the target Organizations (Code, Year, Fundings (FId, FinId)) populated from the correspondences alone, with codes MS, AT&T, IBM and funding ids 301, 302 not yet connected to one another.]
How should individual data values be connected in the target?
More complex mappings are needed
The "association" between companies and grants in the source is suggested by f1 (a foreign key).
∀ n,d,y,g,a,s,m  Companies(n,d,y), Grants(g,n,a,s,m) → ∃ y',F,f  Organizations(n,y',F), F(g,f)
[Figure: the same Companies and Grants tables; in the target, each organization's fundings now hold the ids of its own grants (301 and 302 for MS, 303 for IBM).]
Yet more complex...
∀ n,d,y,g,a,s,m  Companies(n,d,y), Grants(g,n,a,s,m) → ∃ y',F,f,p  Organizations(n,y',F), F(g,f), Finances(f,a,p)
• Three tuples are generated for each pair of related companies and grants.
• The mapping specifies that there exists an f, appearing in two places, without saying what its value must be.
v3: ∀ g,r,a,s,m  Grants(g,r,a,s,m) → ∃ f,p  Finances(f,a,p)
Yet more complex...
v4: ∀ c,e,p  Contacts(c,e,p) → ∃ f,b  Finances(f,b,p)
• How do we obtain the phone to be put in finances?
• Is it the supervisor's or the manager's?
• The foreign keys suggest either (or even both).
• Human intervention is needed to choose.
The Mapping Language: Syntax
foreach x1 in g1, ..., xn in gn
where B1
exists y1 in g'1, ..., ym in g'm
where B2
with e1 = e'1 and ... and ek = e'k
• xi in gi is a generator: xi is a variable, and gi is a set (either the root or a set nested within it).
• B1 is a conjunction of equalities over the xi variables.
• e1 = e'1, ... are equalities between a source expression and a target expression.
The example:
foreach c in companies, g in grants
where c.name = g.recipient
exists o in organizations, f in o.fundings, i in finances
where f.finId = i.finId
with o.code = c.name and f.fId = g.gId and i.budget = g.amount
Primary and relative paths
Primary path (given a schema root R, that is, a first-level element in the schema):
x1 in g1, x2 in g2, ..., xn in gn
where g1 is an expression on R (just R?) and each gi (for i ≥ 2) is an expression on xi-1.
Examples:
c in companies
o in organizations, f in o.fundings
Relative path with respect to a variable x:
x1 in g1, x2 in g2, ..., xn in gn
where g1 is an expression on x and each gi (for i ≥ 2) is an expression on xi-1.
Example:
f in o.fundings
Schema and types
A schema: a sequence of labels (roots), each with an associated type, defined by this grammar:
[Figure: the type grammar.]
• Atomic types
• A set type
• Complex types
• Repeated elements
• All and choice model-groups
Instances: an instance associates a value with each schema root.
• A value for atomic types
• An unordered tuple of pairs
• A pair
• setID
Correspondences
The data exchange problem
Query generation challenges
1. Creation of new values in the target
• Optional: null
• Not nullable: one-to-one Skolem function
• Example elements: name, salary, spouse, dateofbirth
• But if it is emp ID
• Referential constraints
2. Grouping nested elements
3. Value creation interacts with grouping
Recursion in XML schema
The Chase
Given an association, repeatedly apply a chase rule to the "current" association (initialized as the input one):
If there is an NRI constraint
foreach X exists Y where B
such that the "current" association contains X and does not contain a Y that satisfies B, then add Y to the generators and B to the where clause.
Example: if we start with
from g in grants
then we have to add various components and obtain
from g in grants, c in companies, s in contacts, m in contacts
where g.recipient = c.name and g.supervisor = s.cid and g.manager = m.cid
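The chase rule can be sketched in Python for the simple case where X and Y each consist of a single generator and B is a single equality. That restriction, the tuple encoding of constraints, the function name `chase`, and the fresh variable names v0, v1, ... are assumptions of this sketch. It terminates on the acyclic constraints of the example; cyclic constraints could make a loop like this run forever.

```python
from itertools import count

# NRI constraints in the simplified form
#   foreach x in x_set exists y in y_set where x.x_attr = y.y_attr
CONSTRAINTS = [
    ("grants", "recipient", "companies", "name"),
    ("grants", "supervisor", "contacts", "cid"),
    ("grants", "manager", "contacts", "cid"),
]


def chase(generators, where, constraints):
    """Extend an association until every constraint is satisfied."""
    generators, where = list(generators), list(where)
    fresh = count()  # source of fresh variable names (assumed not to clash)
    changed = True
    while changed:
        changed = False
        for x_set, x_attr, y_set, y_attr in constraints:
            for x_var, x_src in list(generators):
                if x_src != x_set:
                    continue
                # Is there already a Y in the association that satisfies B?
                satisfied = any(
                    y_src == y_set
                    and (f"{x_var}.{x_attr}", f"{y_var}.{y_attr}") in where
                    for y_var, y_src in generators
                )
                if not satisfied:
                    # Add Y to the generators and B to the where clause.
                    y_var = f"v{next(fresh)}"
                    generators.append((y_var, y_set))
                    where.append((f"{x_var}.{x_attr}", f"{y_var}.{y_attr}"))
                    changed = True
    return generators, where


generators, where = chase([("g", "grants")], [], CONSTRAINTS)
```

Starting from "from g in grants", the result has one companies generator and two contacts generators, which matches the logical association in the example up to variable names.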
Clio: Analysis and Conclusion
Termination and complexity of the chase:
• The chase with general dependencies may not terminate (cyclic dependencies).
• NRIs: for a weakly acyclic set, the number of chase steps is polynomial.
Conclusion
Clio mapping
A Clio mapping: foreach AS exists AT with E
• AS, AT: logical associations (on source and target, resp.)
• E: a conjunction of equalities. For each correspondence v in C covered by <AS, AT>, E includes the equality h(eS) = h'(eT) that results from the coverage, for one of the coverages.
Structural Association
Structural association: from P (with P a primary path)
• Starts from the root of the schema
Nested Referential Integrity (NRI) constraints
The basis for the discovery of associations: NRI constraints capture relational foreign key and referential constraints as well as XML keyref constraints:
foreach P1 exists P2 where B
• P1 is a primary path.
• P2 is a primary path or a relative path with respect to a variable in P1.
• B is a conjunction of equalities between an expression on a variable of P1 and an expression on a variable of P2.
Example paths: o in organizations, f in o.fundings (primary); f in o.fundings (relative)
Example constraint (f4):
foreach o in organizations, f in o.fundings exists i in finances
where f.finId = i.finId
Logical Association
Logical association: semantic relationships between schema elements, obtained by starting with a structural association and "chasing" NRI constraints.
Logical Association: the Chase
Start with a structural association.
[Figure: the schema diagram with correspondences v1 to v4 and constraints f1 to f4; f2 and f3 are highlighted.]
Logical Association Relationships
A2 dominates A1 (A1 ≤ A2) if the from and where clauses of A1 are subsets of those of A2 (after suitable renaming).
A2: from g in grants, c in companies, s in contacts, m in contacts
    where g.recipient = c.name and g.supervisor = s.cid and g.manager = m.cid
A1: from g in grants, c in companies
    where g.recipient = c.name
Mapping Generation Algorithm
Inputs: S, T, correspondences
Logical associations are meaningful combinations of correspondences.
1. Generate all logical associations: AS, AT.
2. Which correspondences can be interpreted together? For each suitable pair <AS, AT>, find the correspondences covered by the pair with some renaming <h, h'>, and check for dominance.
3. Generate a Clio mapping: foreach AS exists AT with W, where W is the equality h(eS) = h'(eT).
4. Add the Clio mapping to the set of mappings.
Output: the set of schema mappings
Example:
AS: from c in companies
AT: from o in organizations
M: foreach c in companies exists o in organizations with c.name = o.code