a methodology for view integration in logical database design

Upload: ticianne

Post on 29-May-2018

219 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    1/23

    A Methodoloqy for View Inteqrationin Loqical Database DesiqnShamkant 6. NavatheDatabase Systems Research and Development Center512 Weil Hall

    University of Florida, Gainesville, Florida 32611Suresh G. GadqilMetropolitan LifeCorporate S.Ystem Planninq1 Madison AvenueNew York, New York 10010

    AbstractThis paper is based on a conceptual frame-work for the design of databases consisting ofview modeling, view integration, schema analysisand mapping and physical design and optimiza-

    tion. View modeling involves the modeling ofthe user views using a conceptual/semantic datamodel; the view integration phase merqes userviews into a global model which is then mappedto an existing database environment and subse-quently optimized. Here we use the data modelproposed by Navathe and Schkolnick to model userviews and develop a methodo1og.y for view inte-gration. View integration issues are examinedin detail; operations for merging views aredefined and an approach to automating the viewintegration process is described. The proposedapproach is being partially implemented at theUniversity of Florida.

    1. INTRODUCTIONIt can be said without any doubt that oneof the major obstacles to the use of databasemanagement software is the initial preparatoryeffort during logical database design. It iscertainly a difficult task to design a "com-munity view" of a single database which trulyreflects the aggregation of views with differentexpectations, backgrounds and technical exper-tise. For realistic databases used in business,industry and government with thousands ofpotential users, an individual user or usergroup cannot be expected to be aware of the

    needs of the rest of the user community and adesigner cannot be knowledgeable enouqh tocomprehend the requirements of a spectrum ofusers.A natural way to start, therefore is b.ycollecting the views of user groups or applica-tion areas individually. The problem of copingwith thousands of data elements/attributes orhundreds of data objects is ver.y difficult todeal with manually [12]. In this paper wepresent an approach to alleviating this prol?lem.

    proceedings of the Eighth International Conferenceon Very Large Data Bases

    It is assumed that user views would be explicit-ly represented using some data model. The ViewRepresentation Model of Navathe and Schknlnick[ 111 is used as a representative model and amethodology for inteqratinq such views isdiscussed.

    1.1 THE DATA BASE DESIGN PROCESSThe conceptual framework for the desiqn ofdata bases as described in some of Navathe's

    previous work [11,16] and as qenerally acceptedat the 1978 New Orleans data base desiqn work-shop [;I] may be described in terms of foursteps. The input to these four steps stems froma Requirements Analysis step which provides aspecification of data and processing require-ments. The four steps are:

    A. View modelinrl: in this first sten theuser's view of the real world is abstracted andrepresented.

    8. View Integration: the user views arecombined into a qlobal model of the data and anyconflicts in the process are presented forresolution.C. Schema analysis and mapping: the globalmodel is mapped to the logical structure of an

    existing database management system. lD. Physical schema design and optimization:the logical structure is analyzed with respectto physical design alternatives and an "optimal"

    physical schema is constructed .This conceotual framework is illustrated inFiqure 1.

    1.2 OBJECTIVE AND SCOPEA major problem in database design is thelack of a structured desiqn methodoloqv and lackof automatic aids for developing complex data-bases. Research has been conducted with the

    142 Mexico City, September,W8

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    2/23

    goal of structuring this process [12,15,17-J.Further, a few automated techniques whichaddress parts of the database design process andothers oriented towards particular database man-agement systems are available; examples arePSL/PSA [14] used primarily for requirementsspecification; and Database Design Aid (DBDA)[12] used for a specific database managementsystem. We had proposed in [16] the frameworkfor a structured database design methodologyconsisting of the above 4 steps and surveyedresearch done to date related to those steps.In [ll] we described how the first step of viewmodeling may be accomplished.This paper addresses the View Integrationstep. It assumes the use of the specific datamodel namely, the Navathe and Schkolnick model[ll], for representing views and analyzes theprocess of view integration. The aaper dis-cusses the different problems related to theview integration process and proposes anapproach to the development of a software systemfor automating the construction of integratedviews. We are presently implementing theseideas in the database design aid in conjunctionwith a data dictionary system being developed atthe University of Florida. Efforts are alsounder way to consolidate the research on viewintegration by collaborating with the Universityof Rome where similar work is being pursued onthe extended E-R model [2]. To our knowledqe,very little work has been done on the view

    integration problem, barring a few exceptionsc51.2. MODELING USER VIEWS

    For ease of discussion and because o f someof its desirable features we have chosen themodel proposed by Navathe and Schkolnick [ll](henceforth abbreviated as the N-S model) as avehicle for modeling user views. A briefdescription of the model and its advantagesfollows.

    2.1 SUMMARY OF THE NAVATHE-SCHKOLNICK MODELThe N-S model includes the two as ects ofuser requirements identified by Kahn [6 : "theiInformation Structure perspective" describing

    data by means of entity types and associationtypes, and "the usage perspective" giving adescription of processing and insertion/deletionof instances of these types. The discussion ofthe view integration process itself is mostlyinvariant to the model selected and will be donein a general way. The N-S model allows themodeling of user views in explicit terms. Themodel uses two type constructs: objects andconnectors. The object type is divided into twosubtypes: entity type and association type.

    Proceedings of the Eighth International Conferenceon Very Large Data Baser

    Instances of an entity tvpe are called entities.Entities refer to physical things, persons,concepts. An entity type may either be self-identified or externally identified. An asso-ciation type refers to a collection of associa-tions. Associations are n-ar.y relationshipsdefined over entity types and association types.They are subdivided into two subtypes: simpleassociations and identifier associations. Thesubtypes of a simple association type are:cateqorization, subsettinq and partitioninqtypes.

    Connecter types in the N-S model connect anassociation type to some object tvoe which par-ticipates in the association type. They aredivided into two subtypes: directed and undi-rected. A directed connector type impliescertain rules of insertion and deletion in thecontext of an association type. In qeneral,connector types are used to show three types o fdependencies among object types: ownership,deletion and null dependency. They are illus-trated in Fiqure 3.

    The N-S model representation conventionsallow view diagrams to be drawn. These diaqramsare supplemented with assertions to state com-plicated interdependencies among instances. Anassertion lanquage may be defined to state theseassertions. We are presently investigating theadequacy of a language based on the first orderpredicate calculus. The language must be capa-ble of expressing the following kind of asser-tions: For the view diagram in Fig. 18 -"Procedures performed by the SERVICEinstance called Hospital-trust are alwaysperformed free" is an assertion in naturallanguage. The same in an assertionlanguage could be stated as -

    s.Service-Name = Hospital-trust & e PROCEDURE-IDENTIFIER & E PERFORMED ==> E PERFORMED-FREE.where: s,p,c,r are respectively instances ofentity types SERVICE, PROCEDURE,

    SCHEDULE and PERSONNEL.Assertions pertaining to a single view are

    termed intra-view assertions whereas thosepertaining to multiple views are termed inter-view assertions.

    For a further clarification of the N-Smodel the reader is referred to the Appendix inthe paper and to Clll.Note: The words 'type' and 'instance' will beused in the subsequent discussion onlywhen they seem necessary.

    143 .,MexitiCity, September,1969

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    3/23

    2.2 ADVANTAGESOF THE N-S MODELa) It allows association types to bedefined over entity types, over associationtypes, or a combination of both unlike the E-Rmodel [23.b) Models the existence-dependencies among

    entities separately (in terms of identifierassociations) from the relationships (or associ-ations) among entities (unlike Smith/SmithC131) l It thus defines identification pathsclearly. (e.g. PROCEDUREn Fig. 19 is exis-tence-dependent on SERVICE b.y virtue of identi-fier association PROCEDURE-IDENTIFIER),

    C 1 Describes a view of terms of objectWe which are either entities or associa-tions). The internal hierarchy of descriptorelements (attribute types) within an object typeis kept separate from the object type interac-tion since .it is less important. For examplethe internal structure of A in Figure 4 namely,A(A,b,(c,d),(e,(f))) is a further level ofdetail about entity type A.d) Models data relationships as seen b.yuser both at schema and instance level. Proper-ties of views with respect to insertion/deletionrules etc. are clearly distinguished (unlikeChen [2] and Smith/Smith [13]) by use of differ-ent subtypes of association types and by employ-ing directed connector types.

    2.3 GLOSSARY OF VIEW MODELING AND INTEGRATIONTERMSA user view is a representation of informa-

    tion structure and processing requirements asseen by a group of users. A subview is a subsetof a user view. The subsequemussion aboutuser views is equally applicable to subviews.The View Integration process has as itsobjective the development of a conceptual modelsupporting all users in the organization. Thisconceptual model will be called as a GlobalView. The Global View ma.y in fact con-num6er of views that cannot be further inte-grated. The software system to accomplish viewintegration will be called a View Integrator.The Enterprise View forms a nucleus for

    development of the m View. It describesthe basic entities and associations appropriatefor the organization and is based on the assump-tion that some characteristics of data structureand usage can be deduced from the nature of theorganization.

    Proceedings of the Eighth International Conferenceon Very Latgs Data Bases 144

    Equivalent Views are defined as views whichhave the same information content but differentstructures. The term information content refersto the functional and non-functional associa-tions among attributes. Information content hasbeen defined formally for relational databasesin terms of functional dependence and directtable look-up associations [l]. In this paperwe will assume that an acceptable measure ofinformation content has been defined for theuser c0mmunit.v in question.

    A tarqet of integration is an object typeor a c-o?;-type which is compared amongdifferent views for possible integration.A view integration operation is a processthat is applied to the targets of integrationgiving rise to new object types, connectortypes, attribute types etc. These operationsare discussed in greater detail in a later sec-tion. A preferred view is that view which isselected from amonq two or more equivalent viewsfor inclusion in the Global View over a lesspreferred view. The inteqration policy isdefined as the set of rules which dictate how anumber of local views must be integrated into anEnterprise View. The policy contains severalrules; examples of these rules are:

    "The user views in Finance Dept. are tobe preferred over user views in Market-ing Dept. at all times."or "In categorizing patients, select thosecategorizations which correspond to thephysical allocation of wings in thehospital (e.g. Psychiatric, Obstetric,Intensive Care, etc.)."

    The integration polic.y, after all, is subject tointerpretation by the designer. It should becouched in the form of a scale of preferences ofviews and intra-view and inter-view assertions.

    2.4 EQUIVALENCEOF VIEWSA comparison of two views can be made todetermine whether the.v are identical. Identicalviews have objects and connectors that matchcompletely. In some cases the identical natureof two views is made apparent only af ter namingcorrespondences are established.Views which are not identical are referredto as non-identical and may be classifiedfurther into Equivalent and Non-equivalentviews. A pairwise matching of views can lead tothe grouping of a set of views into subsetsshown schematically in the following hierarchy:

    Mexico City, Sap$wnbar, 19

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    4/23

    Views

    Identical Non-identical(may have naming conflicts)

    AEquivalent Non-equivalent

    ARestructureepresentationequivalent Equivalent

    Equivalence of two views arises from twosources: the use of different modeling cons-tructs and conventions resulting in Representa-tion Equivalence; genuine differences in how thesame data are viewed by the modeler/designerresulting in Restructure Equivalence. Restruc-ture Equivalence is so named because each of theequivalent views can be transformed into theother by a set of restructuring operations suchas compression, expansion, assembly merging andassembly partitioninq [lo]. The distinctionbetween these two types of equivalences ismotivated by the fact that the complexity ofview integration operations is greater in deal-ing with Restructure Equivalent views. Sincethe boundary between these two is at best arbi-trary, some convention must be adopted to sepa-rate the two. In 'this paper our convention willbe as follows: if two views have objects andconnectors which match except for differences inprime and candidate keys, they will be consi-dered representation equivalent. Figure 5 showstwo Representation equivalent views. Figure 6shows two Restructure equivalent views whereView 2 has entity type DEPTS categorized intocategories MEDICAL, SURGICAL, NON-MEDICAL-NON-SURGICAL but contains the same informationas in View 1. The equivalence is not apparentunless it is specified.Restructure equivalence among views to beintegrated generally requires identification ofthe preferred views(s) and gives rise to thedevelopment of mapping rules during integration.The preferred view is incorporated into theGlobal View with as little change as possible.Less preferred views are accommodated by meansof mapping rules which enable the non-preferredview to be reconstructed from the Global V iew.

    3. A MODEL FOR THE VIEW INTEGRATION PROCESSA Global View is developed from the inputviews and assertions. The input views consist

    of the Enterprise View and local views; theInter-view assertions describe certain charac-teristics of views which need to be taken intoaccount in the inteqration process.Other outputs of the View Inteqrationprocess are - Mappinq Rules and a Statement ofConflicts. The Mapping Rules describe how indi-vidual local views can be generated and how userprocessing requirements can be satisfied. Thenature of mapping rules depend upon the way inwhich processing requirements are specified.E.g. if an access path specification is used todescribe processing requirements, a mapping .rulemay state that the original access path isobtained b.y a combination of access paths in theglobal view. The statement of conflicts pre-sents conditions which could not be resolvedduring the integration.The view integration model is presentedschematically in Figure 7.The following assumptions and statementsfurther define the scope of view integration inthe present paper:a) a sinqle result: the Global View whichreally consists of multiple views is to beproduced. The model does not consider develop-ment or presentation of alternative GlobalViews.b) due to errors or inadequate definitionof input views or assertions the integrationstep produces a statement of conflicts wherevernecessary after designer intervention.

    cl it is assumed that a "normalization ofinput user views" is carried out correspondingto the first normalization of the relationalmode l [4] to eliminate all Identifier Associa-tions. The normalization process propagateskeys within the data structure so that allobjects become self-identified. Figure 8 illus-trates how the identifying association linkingLABORATORYand CHIEF is removed by propagatingLabName into CHIEF, i.e., expanding the key ofCHIEF from ChiefName, to Labname, ChiefName tomake it self-identified. Navathe LgJ has dis-cussed an intuitive procedure to normalizenetwork structured data which can be appliedwhen a network of identification paths ispresent.

    3.1 INPUTS TO THE VIEW INTEGRATORThe inputs to the View Integrator are theEnterprise View, the local views, assertions and*(processing requirements. In this paper we donot elaborate on how processing requirementswithin each view are specified.

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 146 Mexico CiW, September, 1982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    5/23

    A. THE ENTERPRISE VIEWThe Enterprise View is constructed by thedesigner based on his knowledge of the orqaniza-

    tion. It will include many of the entities inthe final database design. For example, in adatabase for a Medical Center an Enterprise Viewmay consist of entity types DOCTORS, PATIENTS,TESTS, MEDICATIONS, NURSES and PROCEDURES.

    8. LOCAL VIEWSLocal Views are views of data seen by theindividual users or user groups in the organi-zation. They are modeled using the N-S model

    with entities and associations. The.v yvprEesupplemented b.y intra-view assertions.ferred local view and its intra-view asser-tion(s) are carried forward unchanged as far aspossible during integration; some local viewsundergo changes during integration. However,unless deliberately deleted, no information fromany view is supposed to be lost.

    C. INTER-VIEW ASSERTIONSInter-view Assertions circumscribe the viewintegration process by specifying views which

    are equivalent. They state in what way viewsare equivalent and the data correspondence fromone view to another. Data correpsondence isstated in terms of names of entities, associa-tions and attributes and, where appropriate, therestructuring operations to convert one view toanother. An example of an inter-view assertionfor Restructure equivalent views shown in Figure6 is:

    1. View 2 = RSTR[View 112. Preferred View = View 23. View 2 [MEDICAL, SURGICAL,

    NON-MEDICAL-NON-SURGICAL]==> CATEGORIZE [View 1[DEPARTMENT]/TYPE]

    The first two statements above are self-explanatory. Statement 3 specifies the restruc-turing operation called CATEGORIZE which trans-forms entities from one view to another. (seeassembly partitioning in [lo]). In this examplethe entities MEDICAL, SURGICAL, NON-MEDICAL-NON-SURGICAL are categories of the DEPARTMENT entityof View 1 by the attribute type called Type.

    Inter-view assertions are also used tospecify prime and candidate keys in representa-tion equivalent views. The inter-view asser-tions for Figure 15 are:

    1. View 1 = REP[View 21 '_.2. Prime key: ServiceNo.; Candidate key:ServiceName.

    Proceedings of the Eighth international Conferenceon Very Carge Date Bases 146

    _ .~

    3.2 OUTPUTS OF THE VIEW INTEGRATORA. THE GLOBAL VIEWA Global View is a conceptual model of datawhich subsumes the Enterprise View and all

    pertinent user views within an organization.The Global View is characterized by a minimum ofdata redundancy and inclusion of all accesspaths for data retrieval of object instancesfrom all user views. A Global View is the mostsignificant output of the View Integrator; it isinput to the next phase in logical databasedesign namely, Schema Analysis and Mapping,which maps the database design to a targetdatabase management system.

    B. MAPPING RULESMapping rules state the access paths fordata and include the deletion and insertion

    rules for views which are equivalent. MappingRules are derived from inter-view assertions.They represent the means by which data specifiedin a non-preferred view may be obtained from theGlobal View. The complexity of mapping rulesdepends upon the number and type of restructur-ing operations required to transform one viewinto its equivalent.

    Mapping rules may be illustrated withreference to Fiqure 6 which shows two restruc-ture equivalent views. Inter-view assertionsfor these views have been stated earlier. Amapping rule for these views would be:

    DEPTS ==> DEPTS t MEDICAL t SURGICALt NON-MEDICAL/NON-SURGICALwhere t indicates a union operation for accesspaths.

    The mapping rule siqnifies that referencesto the entity DEPTS of the non-preferred view -View 1 requires access to all the entities namedon the right hand side of the mapping rule inthe Global View. For Fig. 10, a maopinq rule toobtain View 1 would indicate that XCeSSiWDEPARTMENT by Oeptname would involve a table-look-up operation of looking up a Deptno forDeptname first.

    3.3

    tion

    A GENERAL PROCEDURE FOR VIEW INTEGRATIONIn a semi-formal manner the view integra-process may be described as follows.

    InputsAssume an enterprise view E and local viewsVI, . V, as given.

    Mexico City, Septem&$B2

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    6/23

    P = given integration policy: = a set of intra-view assertions= a set of inter-view assertionsR = a set of processing requirements

    outputsG= global view setM = a set of mapping rulesS = a set of conflictsA modified set of assertions

    A step-by-step procedure for view integration:Step 1: Divide views V

    classes CI, . . . v l * * v n i nouch that eachclass contains eitker- a set of equivalent views(either restructure or represen-tation equivalent), or- a set of identical views, or- a sinqle view (which is notidentical nor equivalent to aview in another class).

    Verif.y the classes b.y checking withthe designer.Set class index i to 1.

    Step 2: Perform an inteqration on class Ci.If the class contains identicalviews, select one as the integratedresult: IVi.If a class contains equivalentviews, attempt to generate anintegrated view IVi,

    - by applying P- by treating views in descendingorder of preference

    - by reporting back conflicts to thedesigner and asking the designer tomake a choice (of names, of keysetc.) or making new assertions ormodifyinq old ones.

    - by considering I and J duringinteqration.If a single integrated view cannot begenerated, multiple intermediate viewsmay be generated and passed to the nextstep.

    Proceedings of the Eighth International Conference

    Step 3: Provide feedback to the desiqner onthe intermediate results ofintegration IVI, . . . IV . (Theseinclude local views whichPwere in aclass by themselves). Ask designerto provide a preference ordering ofthese semi-integrated views.

    Step 4: Integrate views IVi into E indescendinq order of preference in away similar to step 2.- Modifv assertions in I and J onthe basis of how the inteqration

    orocess affects the componentsof an assertion.- Report conflicts back to thedesigner and ask the designer tomake a choice, add or modify anassertion etc. If a conflictcannot be resolved add it to set

    S.- Repeat step 4 as long as viewinteqration operations areapplicable.specific The applicatlP,;i;;operationsintegration is extremely diff i-cult to figure out automaticallyon the basis of naming, struc-tural similarities and asser-tions. The designer may berequired to control this process(see [21).The result of step 4 is the set Gof global views. Ideally G shouldcontain a single view but practi-cally it may not. Step 4 may needto be reapplied after some modifi-cations.

    Step 5: Generate M b.y determininq how eachof the requirements in R is satis-fied by using G. (Generation of Mmay alternately be carried outduring Steps 2 and 4).

    4. THE MATCHING PROCESS IN VIEW INTEGRATIONThe view integration operations to beapplied to views are determined b.v comparing theviews to be inteqrated. At the object typelevel, simi1arit.y of objects is determined bycomparison of attributes and keys; at the viewlevel, the comparison is between objects andconnectors from one view with another. Thematching process is used in Step 1 of the viewintegration procedure to divide views intoclasses. There are three possible outcomes of

    on Very Large Data Bases 1 r-1 Mexico City, September,. 1982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    7/23

    the matching process: complete match, partialmatch or mismatch.It is assumed that two completely matchingobject types will have completely matchinginstances. The reason behind such an assumptionis that at the logical database design stage noconclusions can be drawn regarding actual values

    in instances of data. If details about thevalues are available, they may be supplied bymeans of inter-view assertions:e.g., Value-set (View 1 l PLACE-OF-ORIGIN)I 'US','OTHER' , -

    Value-set (View 2 l PLACE-OF-ORIGIN)= 'US' ,' EUROPE','ASIA','OTHER'

    4.1 OBJECT SIMILARITYObject similarity is determined first bycomparing the names of the object types and thenby comparing the attribute names and definitionsof the object types. This matching results ineither a complete match or a partial match or amismatch . The conditions of complete match andcomplete mismatch are easy to visualize. Match-ing names are reported to the designer for con-firmation that they have identical meanings.The partial match conditions arise out of a com-bination of complete match, partial match, andmismatch when key attributes and non-key attri-butes from different objects are compared .The range of outcomes of the matchingprocess is illustrated in Table 1.The general rules fo r resolving partial

    matches and mismatches are described below.Let 0 be an object type being matched; 0is from a preferred local view, O2 is from anon-preferred view and OS is the result ofintegrating OI and 02.

    Let Key(Oi) = set of key attribute types inOi'

    N-Key(Oi) = set of non-key attributestypes in Oi.a) Partial match on key -

    W(Og) = W(Ol)N-Key(03) = N-Key(OI)u N-KeY(02)u{KeY(O2) - Key(OI)}

    b) Mismatch on key -Kw(Og) = @Y(Ol)

    N-Kw(03) = N-Key(OI) u {N-Key(02)- IVY) u Key(02)

    Proceedings of the Eighth International Conferenceon Very Large Data Bases

    The union operation on the attribute setsabove is supposed to eliminate duplicateattributes although their names may not beidentical. The duplication is inferred fromassertions.Figure 9 illustrates a partial match onkey; the entity type *DEPARTMENT as key attri-

    bute UivNo, UaptNo in view 1 which is preferred.After integration, the same key applies toDEPARTMENT. Figure 10 illustrates a mismatch onkey. Deptno from the preferred view 2 isselected as a key in the target view. Ueptnameremains as a non-key attribute type. Figure 11illustrates a case with no match on name betweenentity types NURSE-STATION and RECEPTION-STATION. However, an inter-view assertion(possibly input by the designer after ascertain-ing that they mean the same) forces a matchamong them. Matronname and Head-nurse-name arealso supposed to be declared as identical attri-bute types. The names, key etc. from the p re-ferred view are carried to the integrated view.Table 1: Outcomes of Object type matching

    Match Type Match OnName Key attributes attributes

    Complete 1 Completer Complete 1 \y;:eate

    Partial

    Partial No Match _CompletePartialNo MatchNo Match CompletePartialNo MatchNo Match Complete CompletePartialNo MatchPartial CompletePartialNo MatchNo MatchI CompletePartialMismatch 1 No Match No Match No Match

    4.2 SUBVIEW OR VIEW SIMILARITYAssume a situation in which local views arematched by extracting subviews made up of one ormore unary, binary or n-ary associations and

    matching them among themselves. A subview thushas at least two objects and may have ownershipand deletion dependency characteristics byvirtue of directed connectors. Subview similar-ity may be determined by a comparison of objecttypes and a comparison of dependencies. At aminimum we can consider two object types and theownership and deletion dependency, whichever areapplicable, between them. Object matching hasalready been discussed in terms of name, keyattributes and non-key attributes of the object

    148 Mexico City, September, 198

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    8/23

    types. As regards different dependencies, wehave three states, viz., ownership, deletiondependency and null dependenc.v in each of twoviews, giving a total of nine combinations. Inthe matching process we consider the combination[Ownership (View l), Deletion Dependency (View2)] the same as [Deletion Dependency (View l),Ownership (View 2)]. Therefore, we have toconsider only the following six cases:Table 2: Different combinations of dependenciesType of DependencyCombination # V iew 1 View 2 Match Outcome

    : Ownership OwnershipDeletion Deletion Complete Match3 Null Null4 Ownership Null Partial Match5 Deletion Null6 Ownership Deletion ConflictCombinations 1, 2 and 3 represent condi-tions of complete match and the merging processcan continue. Combinations 4 and 5 represents apartial match and require resolution in one oftwo ways: maintain both relationships withre ndant objec t types or provide mapping rulesso that data for each of the two views can beob ained from the Global View without any redun-

    /da cy. The latter method is chosen in keepinqwith one of the objectives of inteqrationn mely, to minimize redundancy. This is in con-trast to El Masri and Wiederhold's approach [5]where they require that partially matchedobjects from equivalent views in the GlobalModel be maintained AND consequent insertion anddeletion of instances of these objects behandled consistently. The approach in [5] hasthe advantage of preserving the same view ofdata as originally defined by each user in theGlobal View but results in a heavy data redun-dancy as well as increases processing overhead.

    As a general policy, the integrated viewfor Combinations 4 and 5 will include the moreconstrained of the views being integratednamely, View 1 in Table 2. Combination 6 showsa conflict in two views which cannot be resolvedautomatically. This condition requires designerintervention followed by a specification as tothe preferred view.

    4.3 VIEW INTEGRATION OPERATIONSView Integration operations permit themerging of user views to develop a Global View.Operations are performed on targets of integra-tion namely, object types and connector types.One operation on a target of integration maytrigger several operations with the object andconnector types which were involved in theformer operation as targets. View integration

    operations can be defined at two levels: schema(type) operations and instance operations.Instance operations would apply to individualinstances of objects (e.g., delete instanceswhere age > 75). A view integrator need notreally support such operations during desiqn.They may be provided in the form of modifica-tions of-assertions.Schema operations consist of operations onassociations and connectors. The former operateon the three kinds of special associations:categorization, partitioning and subsetting.The latter operate on connector types.

    4.3.1. ASSOCIATION TYPE OPERATIONSCategorization Addition Is the additon of anew categorlzatlon of an entity-type. Figure 12shows the integration of two views with differ-ent categorizations of the same object PATIENT.The integrated view is obtained b.y merging the

    PATIENT entity and the addition of categoriza-tion ADMISSION-STATUS to an existing cateqorira-tion AGE-GROUP. Assume that mutual exclusivftywas given as an intra-view assertion. Theimplication on instances is that a given patienthas one instance of type PATIENT, one of eithertype INPATIENT or type OUTPATIENT and one ofeither type PEDIATRIC or type ADULT.Category Enhancement is the addition of anew category type supplementinq an existinqcategorization. Category, enhanckent is illus-trated in Figure 13. The two input views showdifferent categorizations of the PROCEDUREentity type. The integratlon of these views

    results in one categorization association of thePROCEDURES ntity type into four entity types.This operation is similar to the categorizationaddition except that (i) no new categorizationassociation is added here; only new categorytypes are added to an existing categorization.If mutual exclusivity is to be enforced, cate-gorization addition is preferred.Partition Enhancement involves the additionof partitions to an existing partitioning asso-ciation. Partition enhancement is illustratedin Figure 14 where View 1 and View 2 have a par-titioning of the EMPLOYEEby EMPLOYEETYPE. Theentity type called EMPLOYEE-TYPE is shown in

    terms of its instances, e.g., NURSE and DOCTORin View 1. Each of these instances is supposedto be associated with a partition of the set ofinstances of EMPLOYEE. In the integrated viewthe partitioning association includes EMPLOYEE-TYPE partitions from View 1 enhanced byEMPLOYEE-TYPE partitions from View 2. Mutualexclusivity must be obeyed, otherwise partitionenhancement is disallowed. In this example anEMPLOYEE nstance must map into one of the fourpartitions defined b.y the four instances ofEMPLOYEE-TYPE.Proceedings of the Eighth International Conferenceols very Large Data Bases 149 MsxicoCity,G&embr,1982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    9/23

    Partition Addition is the addition of a newaartitioninq association for a given entitytvne. In the above example of EMPLOYEE-TYPEpartitions, if the partitions from two viewscannot be disjoint (e.g. a Nurse can also be aSurgical-assistant), then the two EMP-TYPE-SELpartitionings need to be kept separate in theinteqrated view. From any single preferredview's standpoint, this amounts to adding a newpartitioning association to that view.

    Subset Addition is the addition of one ormore subsets to an existing association to allowa different subset of association instances tobe a named subset. Figure 15 illustrates subsetaddition. The integrated view shows the fiveassociation types which are subsets of theassociation type DRUG-SCHEDULErom View 1 andView 2.

    4.3.2 ATTRIBUTE TYPE OPERATIONSAttribute Enhancement is the addition ofnew attribute types to an object type to accountfor the partial match or mismatch of that objecttype with other views. Fiqure 10 shows howattribute enhancement is applied to the entitytype DEPARTMENT. Considering that each viewcontributes instances of an entity type, theattributes which are not present in a particularview are supposed to be given null values in theintegrated view.Attribute Creation is an operation where anew attribute is created in an objec t tvoe to_convey the same information as is contained inanother view by means of a categorization or

    partitioning association. Fiqure 16 shows anexample where object-type STUDENT is beingmatched in two views. To convey the categorita-tion information about a STUDENT, a newattribute called Category is created in theintegrated view. The value set for Categorywill be assigned (by the designer); e.g.,Category = 0 for graduate, 1 for undergradu-ate. Note that the attributes, Degree and Rank,which are a part of the entity type GRAD-STUD'and UG-STUD are used for an attribute enhance-ment of STUDENT.

    4.3.3 CONNECTORYPE OPERATIONSRestriction is defined as the-selection ofthe most restrictive connector specification forrepresentation in the integrated view. Connec-tor types show three types of dependency:ownership, deletion and null dependency. Whenmerging two views the combinations of dissimilarconnector types may be integrated as follows:

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 150

    Conflict in connector tvpes calls fordesiqner intervention.

    5. CONFLICTS IN VIEW INTEGRATIONConflicts in the integration process ariseduring the matching and integration of views forseveral reasons. They are presented to thedesigner for resolution. If the designer isable to resolve them (possibly with the help ofusers) integration continues with the new infor-mation; otherwise the conflict is added to theset of unresolved conflicts. Conflicts maypartly be caused b.v incomplete or erroneous

    specification of input and partly due to differ-ences in modelinq. Conflicts may be classifiedby considerinq their source into the followingclasses:a) Object descriptions, includinq naminqand dependency conflictsb Equivalent View conflictsC Modelinq conflicts

    5.1 OBJECT DESCRIPTION CONFLICTSThese can occur in the fo rm of synonyms(different names referring to the same objects)

    or homonyms (one name referring to differentob.jects). In the proposed matching schemesynonyms will cause a complete match on key andnon-key attributes but no match on name; and.homon.vms will be shown up b,v a condition wheretwo objects have the same name but the ke.y andnon-key attributes do not match completely. Ineither case, our current implementation schemerequires designer confirmation before any merg-ing of object types is attempted. A designershould allow two objec t types, whether with thesame name or different names, to be integrated(merged) only after ascertaining that they havethe same set of instances. A wronq choice ofname could cause the reportinq of more conflictsas additional views are inteqrated.

    As discussed previously, several dependencyconflicts may occur. These conflicts are gener-ally resolved by adopting the most restrictivespecification out of those in different viewsand then providing mapping rules wherever necei-sary for individual views. One case where the

    Mexico Cii, September;7982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    10/23

    conflict cannot be resolved (see Restri'Operation) is when the same object is anof association A in view 1 and is del'dependent on association A in view 2.

    5.2 EQUIVALENT VIEW CONFLICTS

    ctionowneretion

    The integration process assumes that equiv-alent views will be accompanied by inter-viewassertions stating equivalence. When equivalentviews are not accompanied by inter-view asser-tions, the integration process cannot detectequivalence and hence yields redundancies ofdata. In addition, partial match conditions maybe detected and the designer warned of possibleconflicts. Figure 6 shows two views which areequivalent. However, without. an inter-viewassertion about equivalence, the integrator willreport that object-type DEPTS from View 1partially matches with DEPTS as well as withMEDICAL, SURGICAL, etc. from View 2. The con-flict may arise whenever no equivalence isstated by an assertion, yet there is a totalmatch or a partial match among either key ornon-key attributes which indicates a possibilityof equivalence. This signals an equivalent viewconflict and warrants designer intervention.

    5.3 MODELING CONFLICTSDifferences in representation of views canpotentially lead to redundant data after inte-gration. However, genuine cases exist where thesame situation may be modeled differently bydifferent users. Figure 17 illustrates a model-ing conflict involving the use of partitioningand categorization associations. The instanceimplications of Views 1 and 2 are drasticallydifferent, Consider the following situation.Entities 'of type PATIENT are partitioned in twoways in View 1: by admission status and b.v age-group. ADMISSION-STATUS is an entity type withtwo instances and so is AGE-GROUP. In View 2the infbrmation about patients is more detailed- there is an instance of one of the categorytypes INPATIENT or OUTPATIENT for every PATIENT;similarly there is an instance of category typesPEDIATRIC or ADULT for every PATIENT too. To.support both views, the designer must allow thecategorization association as well as the parti-tioning associations to be preserved.Between the conditions of a complete matchand a complete mismatch of objects we have arange of partial matches. Table 1 shows thematch conditions when matching on Object names,key attributes and non-key attributes. Anintegrator nust provide for a set of actions ineach of those cases.

    6. CONCLUSIONS

    which We are currently implementing a design aidallows user views to be input using the N-S model, builds a dictionary of views andperforms view integration using the approachdescribed above. Emphasis is being placed ondesiqner interaction and ouraim is to produce aworkable solution for dealing with large prob-lems of integration which designers cannot copewith manually.The conclusions of our investigation intothe view integration process are as follows:a) The success or failure of view inteqra-tion is largely dependent on getting as explicita statement of user views as possible. The jobof extracting all relevant information from eachuser (or user group) is a major challenge to thedesigner. It is not clear whether the designershould do an ad hoc analysis of user views todetect equivalences (e.g. restructure equiva-lence) and then try to specify them on his ownor whether he should expect different user areasto know of the similarities and differences intheir data needs.b) From the detailed discussion of objectand connector matching it is clear that the taskof matching and integrating views represented byobjects and connectors is non-trivial.Decisions made in the automatic View Integratorshould be confirmed at every step and thedesigner/user should bring in his instancelevel knowledge about data to evaluate thematches or mismatches. User help may be soughtfrom time to time to guide this process.c) In the conflict resolution area, consi-derable human involvement is necessary. It ispossible to classify conflicts and help the userin an understanding of the conflict, but theultimate repsonsibility to resolve them will beup to the users and management who set policies.d) Given that users are kept actively/interactively involved in the view integrationeffort, the main advantage a machine can provideis to deal with a large number of integrationalternatives and present them to the user. Anautomated or semi-automated approach is stillbeneficial when the volume of comparisons ofdata names and data characteristics is too highto deal with manually. The actual view integra-tion operations at the object and connectorlevel were shown to be straightforward and canbe easily automated.The overall conclusion from the above,therefore, is that a View Integrator can be a

    Proceedings of the Eighth International Conferenceon Very Large Data Bases

    151Mexico City, September, 1982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    11/23

    valuable aid in design for realisticintegrated databases used in most organiza-tions. It is however futile to expect that sucha system could solve the problem completely onits own and produce a tailored design of thedatabase. The system has to be an interactiveone where the user/designer team will navigatethe course to an acceptable design.

    ACKNOWLEDGEMENTThe authors wish to thank Melanie Smith forher comments on an earlier version of the paper.Discussions with Maurizio Lenzerini of theUniversity of Rome have been useful. This workwas supported in part by DOE grant No. DE-ASOS-81ERl0977.

    APPENDIX:View modeling in the Navathe-Schkolnick Model

    This appendix describes the essentials ofthe View Representation model [ll] which wasintroduced as a vehicle for expressing the viewsof data held by user groups or individual usersof a database prior to its design. Most of theterms used below are identical with those inc111. Some terminology has been changed toimprove claritiy and understandability. Thereader is referred to [ll] for a comprehensivediscussion about the model.General Description

    The model expresses a user's view of databy employing the following type constructs:entity, association, attribute, connector.These types are divided into subtypes as shownin Figure 2. Assertions are used additionallyto state interrelationships and constraintswhich cannot be otherwise expressed. The termobject type is used to stand for either anentity type or an association type. In general,the model allows the user to start with theentity types of interest, describe each entitytype with a nested list of attribute types andbuild any number of levels of association types.No instance information is captured in a viewdiagram besides that in the form of assertions.The model provides conventions for a visualdescription of a user view. This visualdescription, called a view diagram, is in theform of a graph with object types as nodes andconnector types as edges. Assertions are notgraphically represented.The semantics of the model include rules ofinsertion and deletion at both the type andinstance level. Some of them are covered belowand in Fig. 3. In the following we define each

    important concept in the model and provide someexplanation. References are made to theexamples used in the bod.y of the paper.

    TerminologyAn entity ty e refers to a physical thing,

    object, &son document etc whoseexistence is of inte:est to tke usi'r. Anassociation type is an n-ary relationship amongn object types. An object type is either anentity type or an association type. Anassociation type is .a fact, a concept o r arelationship among its components.Each object type has a descriptor set whichis a list of attribute types used to describethe object type. An attribute type may' beatomic in which case it is a propertv. aualitvor characteristic of the object types-which itcan describe. Otherwise, an attribute type maybe defined as a list of other attribute types.

    The attribute type structure within an objecttype can thus be hierarchically orqanized and isshown by a nested parenthetical notation (e.g.see Figure 4).A connector type shows the linkage or thelogical access path among two object types whereone of them is an association type and the otheris a participant in that association type. Aconnector type, if undirected, is represented bya two tuple. When directed it is represented byan ordered two tuple. An assertion is astatement (in some assertion specificationlanguage) which describes a specfic relationshipamong instances of object types from one view or

    multiple views; the term intra-view and inter-view assertions are used correspondingly.Assertions are used when the semantics of thedirected connectors is inadequate to modelinstance interdependencies.A list of key attribute types of an entitytype provide a .total identification to it wheneach instance omnique valuelist of the ke.v attribute types. Such entitytypes are called self-identified. Partialidentification of an entity type by itsattributes implies the need for external identi-fication. External identification is defined asthe process of augmenting the partial internal

    identifier (p1D) of an entity with the keyattributes of other objects . E.g., in Fig. 8,CHIEF has a key attribute Chiefname (key-attributes are always underlined in viewdiagrams) which is a partial identifier. CHIEFis externally identified from entity type LAB.The mode l allows an object type to have severalexternal identifications.Association types are divided into twosubtypes: simple and identifier. An identifierassociation is an n-ar.y association which

    Proceedings of the Eighth International Conferenceon Very Large Dete Beses 152

    J _Mexico Citi, Seqtemk, 1962

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    12/23

    provides external identification to uniquelyidentify an instance of a partially identifiedobject type from the remaining (n-l) objectstypes. An association is called a simpleassociation if it is not an identifier associa-tion. (See [8] for a detailed discussion ofidentification of data).

    Diagrammatic NotationThe visual description of a view is real-ized in the form of an acyclic graph called aview diagram where object types are nodes andconnector types are edges. All of the conceptsmentioned above, except intra-view assertions,are used in the view diagram. Object types!;xc;;;,identifier association) are represented,In the view diagram, associationtypes are'grouped on one side of the diagram andentity types are grouped on the other. Objecttypes are described by descriptor sets withinthe diagram underneath the object type if

    possible. A self identified object type isrepresented by placing a # symbol next to theobject type in the view diagram (e.g. SERVICE inFig 18). An identifier association is repre-sented as a rectangle with undirected connectorsfrom the identifying object type(s) to the iden-tifier association and a directed connector fromthe identifier association to the identifiedobject type.

    Simple Association TypeA simple association type is a way ofrelating instances of one object type withinstances of another object type, The contentof a simple association contains tuples composedof the total internal identfiers TIDs) of allt-bject types involved in the assoc ation type,as well as any attribute types describing thesimple association type. This type of associa-tion is represented with undirected connectorsbetween the objects (involved in the associ-ation) and the simple association. Identifica-tion of simple association types is achieved byconcatenating the TIDs of all of the objecttypes involved in the association type. Thereis no restriction on the number of simpleassociation types that an object type can beinvolved in.

    Directed Connector TypesDirected connector types express differentkinds of dependence among object types to whichthey relate. In an n-ary simple associationthere is a directed connector from an entity tothe association if it plays the role of an Ownerof the association. If deletion of the associa-tion implies that certan entity instances getdeleted, this is also indicated by a directedconector type from the association type to those

    Proceedings of the Eighth International Conference

    entity types. Lack of directed connectors in asimple association implies that there is nospecification of the ownership of the associa-tion nor of any deletion dependency. Fig. 3shows the above possibilities and explains theinsertion/deletion rules. In an identifierassociation type a directed connector tvoepoints to the entity type which is identified bythe rest (see Fig. 8).

    Three special types of simple associationsare defined:1. categorization is an n-ary relationbetween an owner object type and memberobject types called category t,vpes.2. partitioning is a binary relation amongobject types; an instance of the objectWe which causes partitioning isassociated with a set of instances (apartition) of the other object type.

    3. subsettinq is a unar.y association overan object type.Cateqorization A of an object type X intoobjet ttypesassociation ty& A i,, . ..X is denoted by anX2"1 . ..X ) which ismarked "cat" in a view diagram. Ffqures 12, 13show examples of categorization. CategorizationA implies that there exists a mapping f whichmaps every instance of X into a subset oP cate-gories XI, X2, . ..X..A binary partitioning association typeS(X,Y) associates an instance of X with a subsetof the instances of Y. Typically, X and Y areentity types but either one or both may be asso-ciation types. Partitions of the instances of Yare non-overlapping, such that each partitionhas a different instance of X associated with itvia S. A partitioning association is marked"partition" in a view diagram.

    Mathematically,let Iy be the set of instances of Y.

    2'y denotes the power set of Iy,i.e. the set of all its subsets.

    Then there is a function G, associatedwith S:

    Gs: Ix + 2Iy, such that

    IY = iCl Gs(xi)and Gs(Xijn Gs(Xj) = * for i f 5,

    where XI x2 . ..xm are all instances of X.

    on Very Large Data Bases 153 Mexico City, Septemkr, 1geZ

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    13/23

    The partitioning association EMP-TYPE-SEL inView 2 of Figure 14 implies that the instancesof entity type EMPLOYEEare partitioned into 2sets. One is associa ted with the first instanceof the entity type EMPLOYEE-TYPE representingsurgeons,' the other with the second instance ofthe entity type EMPLOYEE-TYPE representingsurgical assistants.A unary subsetting association type T(A)provides a means of givfng the name T to a'sub-set of the instances of A. In the view diagramsuch an association is marked "sub". A is typi-cally another association type (e.g., DRUG-SCHEDULE in Figure 15) but could also be anentity type. There is no mutual exclusivenessconstraint among the subsets of instances of Adefined by subsetting associations TI(A), T2(A),. . . . etc.View Representation using the N-S model isillustrated with the example in Figure 18: Thefigure shows six entity types: SERVICE,

    PROCEDURE, SCHEDULE, PERSONNEL, INPATIENT-PROCEDURES,OUTPATIENT-PROCEDURES; n identifierassociation PROCEDURE-IDENTIFIER which identi-fies instances of SERVICE with instances ofPROCEDURE;a simple association type PERFORMED;the categorization association PROCEDURE-CATEGORIES which categorizes PROCEDURE. ntoINPATIENT-PROCEDURES and OUTPATIENT-PROCEDURESand subsetting association types PERFORMED-FREEand PERFORMED-REPEATEDLY.The figure also showswith directed connectors that the objec t typePROCEDURE is identified by .association typePROCEDURE-IDENTIFIER, and owns the categoriza-tion PROCEDURE-CATEGORIES, nd that the entitytype SCHEDULE is deletion dependent on theassociation type PERFORMED, PERFORMEDs shownto be the owner of the subsetting associationtypes T

    REFERENCES1. Arora, A.K. and Robert C. Carlson, "Theinformation preserving properties of relationaldatabase transformations", Proceedings of the3rd Very'Large Database Conference, Berlin, WestGermany, Ittt, 1978.2. Batini, C., M. Lenterini and G.Santucci, "A Computer-aided Methodology for,Conceptual Database Design", InformationSystems, Vol.'appear). 7, No. 3, Pergam mon Press (to

    3. Chen, P.P.S., "The entity-relationshipmodel: towards a unified view of data". ACMTransactions on Database Systems, Vol. 1, ho7 16. Yao, S.B., S.B. Navathe, and(March 1976). J.L. Weldon, "An integrated approach to logicaldatabase design", Proceedings of the NYU Sympo-sium on Database Design, May 1978, [in 171;. Codd, E. F., "Normalized data baSestructure: a brief tutorial", Proceedings of

    the 1971 ACM-SIGFIDET Workshop on DataDescription, Access and Control, Ann Arbor, MIMay 1974.5. El-Masri, R., and G. Wiederhold, "Datamodel integration using the structural mode l",Proceedings ACM SIGMOD International Conference,Boston, MA, June 1979, pp. 191-202.6. Kahn, B.K., "A method for describinginformation required by the database designprocess", Proceedingstional Conference on Management-of Data, June1976, ACM, New York.7. Lum. V.Y. et al.. 1978 New Orleansdata base design working report", Proceedings ofthe 5th International Conference on Ver.y LargeData Bases, Rio de Janiero, Brazil, October1979, pp. 328-339.8. Navathe, S.B., "Schema analysis fordatabase restructuring", ACM Transactions on

    Database Systems, 5, 2, June 1980.9. Navathe, S.B., "An intuitive procedureto normalize network structured data", P roceed-ings of the 6th Very Large Database Conference,Montreal, Canada, October 1980, pp. 350-358.10. Navathe, S.B., and J.P. Fry, "Restruc-turing for large databases: three levels ofabstraction", ACM Transactions on DatabaseSystems, 1, 2, June 1976.11. Navathe S.B., and M. Schkoln ick, "Viewrepresentation in logical database design",Proceedings of the ACM-SIGMOD nternational Con-

    -ference, Austin, TX, Junea.;I9!?, ACM, New York.12. Raver, N. and G.U. Hubbard, "Automatedlogical database design: concepts and applica-tions", IBM Systems Journal, No. 3, 1977.13. Smith, J.M. and D.C.P. Smith, "Data-base abstractions: aggregation and generaliza-tion", ACM Transactions on Database Systems, 2,2, June 1971.14. Teichroew, D. and E.A. Hershey, III,"PSL/PSA: A computer-aided technique for struc-.tured documentation and analysis of informationprocessing systems", IEEE Transactions on Soft-ware Engineering, Vol. S.t-3, No. 1, January1977.15. Weldon, J.L., "Integrating databaselogical views", Unpublished Working Paper, NewYork University, 1979.

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 154 Mexico City, September; l-98

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    14/23

    17. Yao, S.B., S.B. Navathe, J.L. Weldon ,T.L. Kunii [eds.], Database Design Techniques I:Requirements and Logical Structures, SpringerVerlag, New York, 1982.

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 155 Mexico City, September, 1982

    ~~~_

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    15/23

    attribute

    I IInformationStructure(lS) ProcessingRequirements (PR)

    User View, Ehterprise View

    Global Model

    Logical FunctionalDBMS Design ofSChpi3 Applications

    tPhysical tDetailedDBMS TransactionSchema ProcessingFig. 1: Conceptual Framework for Database Design [ll]

    /ObiO\ntity association/\aAyg?on ;~$& I \&e*ingategorization partitioning

    / LLtedirectedconnector connectorFigure 2: The Type Hierarchy in the View Representation Model

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 156 Mexico City, September; 1982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    16/23

    Debtion Dapmdmq

    C073 E..c--- ----: i- - --*tD

    Null Dapmtd~q

    F

    ---- ---- ---cl%G

    Airthe-cbjectty~inassohtiitvp B.C and D an owner dqmncbnt on A.(iI instances of A can be frmly in-.

    (ii) instanar of C or D annot ba insermdulCS suodatd with mm8 A.(iii) if an instance u of A is d&ted, itsits anociatiin instances in 6 a. .containin g u am also Ill leted;instancaofCuldofDwldchwanot refermad in any other instantof B an also d&tad.

    DiSthOmankrObjscttypin8WCi4tiitvp c.D b daktiin &pandmt on C and E.Dalatiin of n instance cc of C implieswtomatic &lotion of the amxiaalimtanaddofD,rubjucttotwocondi-tiiS:(i) ~~tio;cxwnnwmt of any other.(ii1 dd is not a componmt in anothatasswiation instance (differant from C,Slldl~E~

    No ownership and no deletio n depabncyfwwociation(ypeFandrmongobjacttwms E and 0.Implies that E and G on b frnhl inwtadand &lomd.

    Fig. 3: Diiemnt Typm of apandencia

    A - h. b. (c. dl. (e, (f)l) a, b, c, d, ., f w l ttrii m.lhomtiooumbftlhomeNirnatal rtructu~ withii w-typa A.Clmton such ill (cd) mq have thdtOW8lSS.

    Fig. 4: Vii Repmmtati on with N-3 maW thawing npmmtbn ofamity-types from assacinbn tvpl l d . hiuuchy ofatwlbum typa within an mtity

    huisHi&. Srvico-Nam. Typ) mavi~No., a&a&m, Typo)Fii. 6: R wnmnmtkmEquivafana-&mob&ctsintwov*mbutwitbdiffaantprinnLyr.

    Proceedings of the Eighth International Conferenceon Very Large Data Bases

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    17/23

    View 1

    CONTAINS(7(Dept-No., Dept-name,Dept-type, Budget)

    View 2

    bntername, City) (Dept-No., Dept-name) (Budget) (Budget) (Budget)

    Fig. 6: Restructure Equivalenw

    Enterprise LocalView Views

    Intra-view InterviewAssertions Assertions

    Integrator

    1 :iq, i..:,. ,>it

    Fig. 7: A Model for View Integration

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 158 Mexico City, September, 1982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    18/23

    I AB-CHIEFIDENTIFIER I

    Lab-No., Lab-Name Type), , (Chief-Name, Title) Building-Name, Floor 34

    (Lab-No.. Lab-Name, Type) (Lab-Name Chief-Name Title) (9uildin9-Nwnq %#IA-,Fig. 9: Normalization of Idantifiw Associations

    View 1(Preferred)

    View 2

    (CONSISTS-OF\

    (DivNo, DeptNo, LOCI--

    (@AZ) (+zz--WeptNo, Lot)

    (e-j (ii&h&) (-&-i&-j(DivNo, DeptNo, Lot)

    F@ 9: lntsgation of vials when them is a complete match onname, partial match on key attributes and complete matchon non-key attributes.

    the Eighth international ConferenceVery Large Date Bases

    159Mexico City, September, 1982

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    19/23

    View 1

    Integrated View

    (CenterName, City) WeptName, Sire) f-- --View 2 (Preferred)

    (DeptNo., DeptName,Lot, Size)

    (OEP/\RTMENT)G](DeptNo., Lot)

    Fig. 10: Integration of views when there is a complete match on name,no metch on key or non-key attributes.

    View 1 (Preferred)

    MAINTAINSs=?----f---t--(CLINICAL-UNIT)NURSESTATION)

    (w, Flood,Matronname)

    View 2

    (CLINICAL-UNIT)W)(E&&g Unit-name,Heed-nurse-name)Integreted View: Same as View 1

    Fii.11: lntegretion of views with object typeshaving different names.

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 160 Mexico Cii, Sf1ptembm;-~W 8

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    20/23

    (Room Q Admissionda te) (Date-last-visit)

    View 2 ;gjsii?;Immuniz-cod4 (BPhigh, BPlow)

    Integrated View

    Fig. 12: Ca tegorization Addition

    lntegmted View

    Fig. 13: Category E nhancenttmt (type dim)

    Proceedings of the Eighth International Conferenceon Very Large Data Bases 161 Mexico City, September, 1962

    .__-__~ .~~

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    21/23

    V*rr 1: Medial Smio Empkwmpartition.

    Intapatd View: Y diol and Surgical Employrr

    partition.EMP-TYPE-SEL

    Fii. 14: Putiiion Enhm-t Mow immca of an ntitytype afled EMPLOYEE-TYPE)

    rub. sub. rub.O.D. B.I.D. T.I.D.

    I/y*rrl DRUG SCHEDULE

    PATIENT DRUG

    sub. sub.

    KAY:O.D. = Ona , dayE.I.D. = Twice . damT.I.D. - Thrr rimes a dqO.I.D. - Four times a dyPRN = Talc. m mqui,.,,

    ub. sub. sub. rub. sub.E.I.D. T.I.D. D.I.D. PRN

    l/g2DRUG-SCHEDULE

    DRUG

    Fig. 16: SubsM Addition ItYP dim)

    Procwdings of the Eighth International Conference4Mrvery Large Data Bases 162

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    22/23

    View 1 wt.ST-CATEGORIES\

    ggg2y-&;STUDENT = (SS#i GRAD-STUD = UG-STUD = (Rank)Name, Grade-pt-avg.) (Deer4Interview constraint: s < STUDENT = is. g. - > 6 ST-CATEGORY pB(s. -, u > t ST-CATEGORY

    (mutual exclusion)

    View 2 (preferred) (STUDENT)STUDENT = ($j& Name, Grade-pt-avg.)

    Integrated View STUDENTSTUDENT = (E+ Name, Grade-pt-avg ., Category, De gree, Rank)

    Fig. 16: Creation of a new attribute type calledCategory dur ing integration.

    View 1 (Higher Administration View)

    \~ --/ . /(Status-type, Total-no.,Capacity) (Gro~iads,

    View 2 (Operational View)

    .Fig. 17: Two different views of patient information; which may coexist.

    Proceedings of the Eighth International Conferenceal Very Large Data Bases 163 Mexico City, SW&amber, Q#@2

  • 8/8/2019 A Methodology for View Integration in Logical Database Design

    23/23

    - ___

    sub.,f PERFORMED- \

    sub.f PERFORMED- \FREE \ REPEATEDLY /

    PROCEDURE-IDENTIFIER

    Entity TypesSERVICE (Service-Name, Location)PROCEDURE (Procedure-No., Procedure-Name, Type)PERSONNEL (Emolovee-No., Employee-Name, Title, Sex)SCHEDULE (Qg&, Time-start, Time-finish)INPATIENT-PROCEDURE (Labname)OUTPATIENT-PROCEDURE (Outpatient-unit-no.)Association TypesPROCEDURE-IDENTIFIER (identifier-association tvoe)PERFORMED . .PROCEDURE-CATEGORIES fcategorizationessociation type)PERFORMED-FREE (subsettingessocietion tvpa)PERFORMED-REPEATEDLY (s&setting-association tvpa)

    Fig. 18: View Representation using the NS model.