Post on 19-Dec-2015
Merging Models Based on Given Correspondences
Rachel A. Pottinger, Philip A. Bernstein
Introduction
“A model is a formal description of a complex application artifact, such as a database schema, an application interface, a UML model, an ontology, or a message format. The problem of merging such models lies at the core of many meta data applications.”
Introduction
Combining models requires two steps:
  1. Determining correspondences between two models (schema matching)
  2. Merging the models based on those correspondences
Determining correspondences is a major topic of ongoing research and is not covered in this paper.
Model Management
Proposed by Bernstein in “Applying Model Management to Classical Meta Data Problems”
Operators: Match, Merge, Apply, Diff
Model Management
Presented solution: Merge(A, B, MapAB)
  A and B: the models to merge
  MapAB: a mapping of the correspondences between A and B
  Returns the “duplicate-free union” of A and B
Example - Conflict
Conflict Resolution
Conflict resolution is independent of representation
Existing similarities among solutions offer an opportunity for abstraction
Buneman, Davidson, and Kosky (BDK) algorithm
  Uses pair-wise correspondences that have “Is-a” and “Has-a” relationships
Representation of Models
Representation requires (at least) 3 meta-levels:
  Model = database schema, etc.
  Meta-model = type definitions
  Meta-meta-model = representation language in which models and meta-models are expressed
Inputs: Merge (A, B, MapAB)
Two models: A and B
Mapping: MapAB
  A first-class model, with its own elements and relationships
  Mapping elements, the origins of additional mapping relationships
  Non-mapping elements
  Equality and similarity mapping elements
Optional designation of a preferred model
Optional overrides for Merge behavior
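The inputs listed above could be held in memory roughly as follows. This is a minimal sketch, assuming a plain graph representation; the class names (Element, Relationship, Model), the field how_related, the element ids and the relationship kinds "Contains" and "Mapping" are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Element:
    id: str
    name: str
    # Assumed convention: "Equality" or "Similarity" for mapping elements,
    # None for ordinary (non-mapping) elements.
    how_related: Optional[str] = None


@dataclass(frozen=True)
class Relationship:
    kind: str         # e.g. "Contains" or "Mapping" (hypothetical kinds)
    origin: str       # id of the origin element
    destination: str  # id of the destination element


@dataclass
class Model:
    elements: dict = field(default_factory=dict)
    relationships: set = field(default_factory=set)

    def add(self, element):
        self.elements[element.id] = element
        return element

    def relate(self, kind, origin, destination):
        self.relationships.add(Relationship(kind, origin, destination))


# Model A and model B, each with a tiny Actor schema.
model_a = Model()
model_a.add(Element("A.Actor", "Actor"))
model_a.add(Element("A.ActorName", "ActorName"))
model_a.relate("Contains", "A.Actor", "A.ActorName")

model_b = Model()
model_b.add(Element("B.Actor", "Actor"))
model_b.add(Element("B.FirstName", "FirstName"))
model_b.relate("Contains", "B.Actor", "B.FirstName")

# MapAB is itself a model: an equality mapping element that is the
# origin of one mapping relationship into each input model.
map_ab = Model()
map_ab.add(Element("m1", "ActorEq", how_related="Equality"))
map_ab.relate("Mapping", "m1", "A.Actor")
map_ab.relate("Mapping", "m1", "B.Actor")
```

Because MapAB is an ordinary Model here, it can carry its own non-mapping elements and relationships in the same way as A and B.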
Complicated Mapping
Non-mapping element
Similarity Mapping
Mapping Result
Result = “a schema that presents all the information of the schemas being merged, but no additional information”
Resulting model, G, satisfies the Generic Merge Requirements
Conflict Resolution
Conflicts are categorized by meta-level:
  Representation conflicts
  Meta-model conflicts
  Fundamental conflicts
Representation Conflicts
Occur when two models describe the same concept in different ways
  Example: a name represented as ActorName vs. FirstName and LastName
  Different outputs are possible
Solutions:
  The concepts are the same, based on equality mapping elements
  The concepts are related through meta-meta-model relationships and elements, e.g. FirstName is a sub-element of ActorName
  The concepts are related in a more complex fashion, beyond what the meta-meta-model can represent, e.g. ActorName equals the concatenation of FirstName and LastName
Meta-Model Conflicts
The merge result violates a meta-model-specific constraint
  Example: if a SQL table and an XML database are merged into a SQL model, there is no concept of a sub-column
The EnforceConstraints operator requires merge results to conform to a given meta-model
Fundamental Conflicts
Meta-meta-model conflicts
The merge result violates meta-meta-model rules and cannot be considered a model
Fundamental Conflicts Example
Meta-meta-model rule: one-type restriction
Actions Merge allows:
  Specify an alternative function to apply for each conflict resolution category
  Resolve the conflict manually
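The one-type restriction and the pluggable resolution function can be sketched as below. This is an illustration under stated assumptions: the (element, type) pair representation, the function names, the ZipCode example and the prefer_first strategy are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def find_one_type_conflicts(type_of):
    """Return {element: sorted list of types} for every element that
    ends up with more than one type after merging.

    type_of is an iterable of (element, type) pairs, an assumed
    stand-in for the type relationships of the merged model.
    """
    types = defaultdict(set)
    for element, type_name in type_of:
        types[element].add(type_name)
    return {e: sorted(t) for e, t in types.items() if len(t) > 1}


def resolve(conflicts, resolver):
    """Apply a caller-supplied resolution function to each conflict."""
    return {e: resolver(e, candidates) for e, candidates in conflicts.items()}


def prefer_first(element, candidates):
    # Hypothetical strategy: keep the alphabetically first type.
    # A manual strategy would prompt the user here instead.
    return candidates[0]


# Hypothetical merged model: ZipCode was an Integer in one input
# and a String in the other, so it violates the one-type restriction.
merged_types = [
    ("ZipCode", "Integer"),
    ("ZipCode", "String"),
    ("City", "String"),
]
conflicts = find_one_type_conflicts(merged_types)
resolved = resolve(conflicts, prefer_first)
```

Passing the resolver as a parameter mirrors the idea that each conflict category can be given an alternative function, with manual resolution as one possible choice.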
Cardinality Constraints
The maximum and minimum number of occurrences of a relationship are often restricted
Acyclicity
Models are often required to be acyclic
Cycles introduced in merging are collapsed into a single element by default
User can override the default behavior
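Collapsing a cycle into a single element can be sketched as finding strongly connected components and replacing each one by a representative. This sketch uses Kosaraju's algorithm as a stand-in; the paper's actual procedure is not given in these slides, and the function names and the Person/Actor/Employee example are assumptions.

```python
def strongly_connected_components(nodes, edges):
    """Map each node to a representative of its strongly connected
    component (Kosaraju's algorithm, written iteratively)."""
    graph = {n: [] for n in nodes}
    reverse = {n: [] for n in nodes}
    for origin, destination in edges:
        graph[origin].append(destination)
        reverse[destination].append(origin)

    # First pass: record nodes in order of depth-first completion.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Second pass: walk the reversed graph in reverse completion order.
    component = {}
    for root in reversed(order):
        if root in component:
            continue
        component[root] = root
        todo = [root]
        while todo:
            node = todo.pop()
            for prev in reverse[node]:
                if prev not in component:
                    component[prev] = root
                    todo.append(prev)
    return component


def collapse_cycles(nodes, edges):
    """Replace every cycle by one element and drop the edges inside it."""
    component = strongly_connected_components(nodes, edges)
    merged_nodes = sorted(set(component.values()))
    merged_edges = sorted(
        {(component[o], component[d]) for o, d in edges
         if component[o] != component[d]}
    )
    return merged_nodes, merged_edges


# Hypothetical merged model with a cycle of length 2:
# Actor is-a Person (from one input) and Person is-a Actor (from the other).
nodes = ["Person", "Actor", "Employee"]
edges = [("Actor", "Person"), ("Person", "Actor"), ("Employee", "Person")]
component = strongly_connected_components(nodes, edges)
merged_nodes, merged_edges = collapse_cycles(nodes, edges)
```

A user override would replace collapse_cycles with another strategy, for example dropping one relationship of the cycle instead of merging the elements.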
The Merge Algorithm
1. Initialize result G to null
2. Include elements with equivalence relation
3. Combine element properties
4. Combine and include relationships
5. Fundamental conflict resolution
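The first four steps can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes models are plain dictionaries, that the mapping has been reduced to a list of equality pairs, and that the preferred model's property values win. Similarity mapping elements and step 5 are left out, and all names and example data are hypothetical.

```python
def merge(a, b, equalities, preferred="A"):
    """Sketch of steps 1 to 4.

    a, b: {"elements": {id: {property: value}},
           "relationships": [(kind, origin_id, destination_id)]}
    equalities: list of (id_in_a, id_in_b) pairs declared equal.
    """
    # Step 1: start with an empty result G.
    g = {"elements": {}, "relationships": set()}

    # Step 2: group elements that the mapping declares equal (union-find).
    parent = {("A", e): ("A", e) for e in a["elements"]}
    parent.update({("B", e): ("B", e) for e in b["elements"]})

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for id_a, id_b in equalities:
        parent[find(("B", id_b))] = find(("A", id_a))

    # Step 3: combine properties; the preferred model is visited first,
    # so setdefault keeps its value whenever both models set a property.
    if preferred == "A":
        ordered = (("A", a), ("B", b))
    else:
        ordered = (("B", b), ("A", a))
    for tag, model in ordered:
        for element_id, properties in model["elements"].items():
            merged = g["elements"].setdefault(find((tag, element_id)), {})
            for key, value in properties.items():
                merged.setdefault(key, value)

    # Step 4: re-point relationships at the merged elements; the set
    # removes relationships that have become duplicates.
    for tag, model in (("A", a), ("B", b)):
        for kind, origin, destination in model["relationships"]:
            g["relationships"].add(
                (kind, find((tag, origin)), find((tag, destination)))
            )

    # Step 5 (fundamental conflict resolution) is not shown.
    return g


model_a = {
    "elements": {"Actor": {"Name": "Actor"}, "ActorID": {"Name": "ActorID"},
                 "ActorName": {"Name": "ActorName"}, "Bio": {"Name": "Bio"}},
    "relationships": [("Contains", "Actor", "ActorID"),
                      ("Contains", "Actor", "ActorName"),
                      ("Contains", "Actor", "Bio")],
}
model_b = {
    "elements": {"Actor": {"Name": "Actor"}, "Bio": {"Name": "Biography"},
                 "FirstName": {"Name": "FirstName"},
                 "LastName": {"Name": "LastName"}},
    "relationships": [("Contains", "Actor", "Bio"),
                      ("Contains", "Actor", "FirstName"),
                      ("Contains", "Actor", "LastName")],
}
g = merge(model_a, model_b, equalities=[("Actor", "Actor"), ("Bio", "Bio")])
```

In this example the two Actor elements and the two Bio elements are each merged into one, and the duplicate Contains relationship between them appears only once in G.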
Merge Steps
[Figure: merge steps illustrated on the Actor example. Labels shown: Actor, ActorID, ActorName, Bio, FirstName, LastName, Sim. Each element carries the properties ID, History, HowRelated, Name, etc.]
Contributions
Technical requirements for a generic merge operator
Use of a first-class input mapping model, enabling richer correspondences
Characterization of when Merge can be automatic
Taxonomy of conflicts and a definition of conflict resolution strategies
Experimental evaluation and results
Evaluation
Merged the Foundational Model of Anatomy (FMA) and the GALEN Common Reference Model
  FMA contains 895,307 elements and 2,032,020 relationships
  GALEN contains 155,307 elements and 569,384 relationships
Significant structural differences
Mapping contained 6265 1-to-1 correspondences
Evaluation goals:
  Limited changes to Merge would be needed
  Merge would function on models this large
  The merged results would not simply be read from the mapping (i.e., the anticipated conflicts would occur)
Evaluation
Few non-fundamental changes had to be made
Merging took approx. 20 hours
Merge results:
  1,045,411 elements with 9,096 duplicates
  2,590,969 relationships
338 cycles, most of length 2, were found
1 cycle of length 18 was found
Merged correspondences:
  3 element merges: 2344
  3+ element merges: 623
  1 element merges: 1215
Conclusions
Algorithm is well designed
Merge() is implemented in a generic way that allows for different models
Definitions of conflict management are given
Implementation and execution were very labor intensive
Slow: 13 weeks of expert work, 20 hours of processor time
Relies on other systems with unknown results
Questions?
Generic Merge Requirements
1. Element preservation
2. Equality preservation
3. Relationship preservation
4. Similarity preservation
5. Meta-meta-model constraint satisfaction
6. Extraneous item prohibition
7. Property preservation
8. Value preference
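Two of these requirements lend themselves to a simple mechanical check. The sketch below reads requirement 1 as "every input element has a corresponding element in G" and requirement 6, restricted to elements, as "G contains no element without an input counterpart". That reading, the corresponds_to dictionary and the function names are assumptions made for illustration only.

```python
def check_element_preservation(input_elements, corresponds_to, merged_elements):
    """Requirement 1 (as read here): every input element maps to an
    element that is actually present in G."""
    merged = set(merged_elements)
    return all(
        e in corresponds_to and corresponds_to[e] in merged
        for e in input_elements
    )


def check_no_extraneous_elements(corresponds_to, merged_elements):
    """Requirement 6, elements only (as read here): every element of G
    is the counterpart of at least one input element."""
    return set(merged_elements) <= set(corresponds_to.values())


# Hypothetical example: both Actor elements map to one merged Actor.
input_elements = ["A.Actor", "B.Actor", "B.FirstName"]
corresponds_to = {
    "A.Actor": "G.Actor",
    "B.Actor": "G.Actor",
    "B.FirstName": "G.FirstName",
}
good_g = ["G.Actor", "G.FirstName"]
bad_g = ["G.Actor", "G.FirstName", "G.Unexpected"]

preserved = check_element_preservation(input_elements, corresponds_to, good_g)
clean = check_no_extraneous_elements(corresponds_to, good_g)
extraneous = not check_no_extraneous_elements(corresponds_to, bad_g)
```

The remaining requirements concern relationships, properties and mapping semantics and would need a richer model representation than this sketch assumes.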