base resumen

Upload: odette-garcia

Post on 03-Apr-2018


  • 7/28/2019 Base Resumen


    Logical design

    The aim of logical design is to construct a logical schema that correctly and efficiently represents all of the information described by an Entity-Relationship schema produced during the conceptual design phase. This is not just a simple translation from one model to another for two reasons. First, there is not a close correspondence between the models involved because not all the constructs of the Entity-Relationship model can be translated naturally into the relational model. For example, while an entity can easily be represented by a relation, there are various options for the generalizations. Secondly, the aim of conceptual design is to represent the data accurately and naturally from a high-level, computer-independent point of view. Logical design is instead the basis for the actual implementation of the application, and must take into account, as far as possible, the performance of the final product. The schema must therefore be restructured in such a way as to make the execution of the projected operations as efficient as possible. In sum, we must plan a task that is not only a translation (from the conceptual model to the logical) but also a reorganization. Since the reorganization can for the most part be dealt with independently of the logical model, it is usually helpful to divide the logical design into two steps, as shown in Figure 7.1.

    • Restructuring of the Entity-Relationship schema, which is independent of the chosen logical model and is based on criteria for the optimization of the schema and the simplification of the following step.

    • Translation into the logical model, which refers to a specific logical model (in our case, the relational model) and can include a further optimization, based on the features of the logical model itself.

    Figure 7.1 Logical database design.

    The input for the first step is the conceptual schema produced in the preceding phase and the estimated database load, in terms of the amount of data and the operational requirements. The result obtained is a restructured E-R schema, which is no longer a conceptual schema in the strict sense of the term, in that it constitutes a representation of the data that takes into account implementation aspects.

    This schema and the chosen logical model constitute the input of the second step, which produces the logical schema of our database. In this step, we can also carry out quality controls on the schema and, possibly, further optimizations based on the characteristics of the logical model. An analysis technique used for the relational model (called normalization) will be presented separately in the next chapter. The final logical schema, the integrity constraints defined on it and the relevant documentation constitute the final product of logical design.

    In the remainder of this chapter, we will present the two steps that make up the logical design of a database. We will first discuss the techniques that can be used to analyze the efficiency of a database by referring to its conceptual schema.

    7.1 Performance analysis on E-R schemas

    An E-R schema can be modified to optimize some performance indicators. We use the term indicator, because the efficiency of a database cannot be precisely evaluated with reference to a conceptual schema. The reason is that the actual behaviour is also dependent on physical aspects that are not pertinent to a conceptual representation. It is possible, however, to carry out evaluations of the two parameters that influence the performance of any software system. These are:

    • cost of an operation: this is evaluated in terms of the number of occurrences of entities and relationships that are visited to execute an operation on the database; this is a coarse measure and it will sometimes be necessary to refer to more detailed criteria;

    • storage requirement: this is evaluated in terms of the number of bytes necessary to store the data described by the schema.

    In order to study these parameters, we need the following information.

    • Volume of data. That is:
      ◦ number of occurrences of each entity and relationship of the schema;
      ◦ size of each attribute.

    • Operation characteristics. That is:
      ◦ type of operation (interactive or batch);
      ◦ frequency (average number of executions in a certain time span);
      ◦ data involved (entities and/or relationships).

    To give a practical example, we will look at an already familiar schema, which is shown again for convenience in Figure 7.2.

    Figure 7.2 An E-R schema on the personnel of a company.


    Typical operations for this schema can be:

    • operation 1: assign an employee to a project;

    • operation 2: find the data for an employee, for the department in which he or she works and for the projects in which he or she is involved;

    • operation 3: find the data for all the employees of a certain department;

    • operation 4: for each branch, find its departments with the surnames of the managers and the list of the employees in each department.

    Although the operations above might look oversimplified with respect to the actual database load, we can note that database operations follow the so-called 'eighty-twenty rule'. This rule states that eighty percent of the load is generated by twenty percent of the operations. This fact allows us to concentrate only on some operations and still give an adequate indication of the workloads for the subsequent analysis.

    The volume of data and the general characteristics of the operations can be described by using tables such as those in Figure 7.3. In the table of volumes, all the concepts of the schema are shown (entities and relationships) with their estimated volumes. In the table of operations we show, for each operation, the expected frequency and a symbol that indicates whether the operation is interactive (I) or batch (B).

    Table of volumes
    Concept         Type   Volume
    Branch          E      10
    Department      E      80
    Employee        E      2000
    Project         E      500
    Composition     R      80
    Membership      R      1900
    Management      R      80
    Participation   R      6000

    Table of operations
    Operation   Type   Frequency
    Op 1        I      50 per day
    Op 2        I      100 per day
    Op 3        I      10 per day
    Op 4        B      2 per week

    Figure 7.3 Examples of volume table and operations table.

    Note that, in the volumes table, the number of occurrences of a relationship depends on two parameters. These are (i) the volume of the entities involved in the relationship and (ii) the number of times an occurrence of these entities participates on average in an occurrence of the relationship. The latter depends in turn on the cardinalities of the relationship. For example, the number of occurrences of the COMPOSITION relationship is equal to the number of departments, since the cardinalities dictate that each department belongs to one and only one branch. On the other hand, the number of occurrences of the relationship MEMBERSHIP is a little less than the number of employees, since few employees belong to no department. Finally, if an employee is involved on average in three projects, we have 2000 × 3 = 6000 occurrences for the relationship PARTICIPATION (and thus 6000 ÷ 500 = 12 employees on average for each project).
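
    The arithmetic behind these relationship volumes can be written out as a small Python sketch. The dictionary and the function name relationship_volume are illustrative names, not part of the original text; the entity volumes are those quoted in the discussion of Figure 7.3.

```python
# Entity volumes as quoted in the text for the personnel schema.
entity_volume = {"Branch": 10, "Department": 80, "Employee": 2000, "Project": 500}


def relationship_volume(entity, avg_participations):
    """Estimate relationship occurrences as entity volume times the
    average number of participations of one entity occurrence."""
    return entity_volume[entity] * avg_participations


# Each department belongs to exactly one branch.
composition = relationship_volume("Department", 1)
# Each employee works on three projects on average.
participation = relationship_volume("Employee", 3)
# Average number of employees per project.
employees_per_project = participation / entity_volume["Project"]

print(composition, participation, employees_per_project)
```

    The same helper applies to any relationship whose average participation can be read off the cardinalities or estimated from the application.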

    For each operation we can, moreover, describe the data involved by means of a navigation schema that consists of the fragment of the E-R schema relevant to the operation. On this schema, it is useful to draw the 'logical path' that must be followed to access the required information. An example of a navigation schema is proposed in Figure 7.4 with reference to operation 2. To obtain the required information, we begin with the EMPLOYEE entity and we gain access to his department by means of the MEMBERSHIP relationship, and to the projects in which he is involved by means of the PARTICIPATION relationship.

    Figure 7.4 Example of a navigation schema.

    Once this information is available, we can estimate the cost of an operation on the database by counting the number of accesses to occurrences of entities and relationships necessary to carry out the operation. Look again at operation 2. According to the navigation schema, we must first access an occurrence of the EMPLOYEE entity in order then to access an occurrence of the MEMBERSHIP relationship and, by this means, to an occurrence of the DEPARTMENT entity. Following this, to obtain the data of the projects on which he or she is working, we must access on average three occurrences of the PARTICIPATION relationship (because we have said that on average an employee works on three projects). Then, through this, we access on average three occurrences of the PROJECT entity. All this can be summed up in a table of accesses such as that shown in Figure 7.5. In the last column of this table, the type of access is shown: R for read access, and W for write access. It is necessary to make this distinction because, as we shall see in Chapter 9, write operations are generally more onerous than read ones.

    In the next section, we will see how these simple analysis tools can be used to make decisions during the restructuring of E-R schemas.
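
    As a rough illustration, the access count for operation 2 can be tabulated in Python. The list layout and the helper count_accesses are assumptions made for this sketch; the access counts themselves follow the navigation path described above.

```python
# (concept, construct, average accesses, access type) along the
# navigation path of operation 2; E = entity, R = relationship.
accesses_op2 = [
    ("Employee", "E", 1, "R"),
    ("Membership", "R", 1, "R"),
    ("Department", "E", 1, "R"),
    ("Participation", "R", 3, "R"),
    ("Project", "E", 3, "R"),
]


def count_accesses(table):
    """Return (read accesses, write accesses) for one execution."""
    reads = sum(n for _, _, n, kind in table if kind == "R")
    writes = sum(n for _, _, n, kind in table if kind == "W")
    return reads, writes


reads, writes = count_accesses(accesses_op2)
print(reads, writes)
```

    Keeping reads and writes apart makes it easy to weight them differently later on.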


    • Removing generalizations replaces all the generalizations in the schema by other constructs.

    • Partitioning and merging of entities and relationships decides whether it is convenient to partition concepts in the schema into more than one concept or to merge several separate concepts into a single one.

    • Selection of primary identifiers chooses an identifier for those entities that have more than one.

    Later in the section, we will examine separately the various restructuring tasks using practical examples.

    7.2.1 Analysis of redundancies

    A redundancy in a conceptual schema corresponds to a piece of information that can be derived (that is, obtained by a series of retrieval operations) from other data. An Entity-Relationship schema can contain various forms of redundancy. The most frequent examples are as follows.

    • Attributes whose value can be derived, for each occurrence of an entity (or a relationship), from values of other attributes of the same occurrence. For example, the first schema in Figure 7.7 consists of an entity INVOICE in which one of the attributes can be deduced from the others by means of arithmetic operations.

    • Attributes that can be derived from attributes of other entities (or relationships), usually by means of aggregate functions. An example of such a redundancy is present in the second schema in Figure 7.7. In this schema, the attribute TotalAmount of the PURCHASE entity is a derived one. It can be computed from the values of the attribute Price of the PRODUCT entity, by summing the prices of the products of which a purchase is made up, as specified by the COMPOSITION relationship.

    • Attributes that can be derived from operations of counting occurrences. For example, in the third schema in Figure 7.7, the attribute NumberOfInhabitants of a town can be derived by counting the occurrences of the relationship RESIDENCE in which the town participates. This is actually a variant of the previous example, which is discussed separately, as it occurs frequently.

    • Relationships that can be derived from other relationships in the presence of cycles. The last schema in Figure 7.7 contains an example of this type of redundancy: the TEACHING relationship between students and lecturers can be derived from the relationships ATTENDANCE and ASSIGNMENT. It must be clearly stated that the presence of cycles does not necessarily generate redundancies. If, for example, instead of the TEACHING relationship, this schema had contained a relationship SUPERVISION representing the link between students and supervisors then the schema would not have been redundant.

    Figure 7.7 Examples of schemas with redundancies. Attribute labels visible in the figure: NetAmount, Tax, GrossAmount, TotalAmount, Price, NumberOfInhabitants.

    The presence of a derived piece of information in a database presents an advantage and some disadvantages. The advantage is a reduction in the number of accesses necessary to obtain the derived information. The disadvantages are a larger storage requirement (which is, however, a negligible cost) and the necessity for carrying out additional operations in order to keep the derived data up to date. The decision to maintain or delete a redundancy is made by comparing the cost of operations that involve the redundant information and the storage needed, in the case of presence and absence of redundancy.

    Using a practical example, let us look at how the evaluation tools described above can be used to make a decision of this type. Consider the schema about people and towns in Figure 7.7 and imagine that it refers to a regional electoral roll application for which the following main operations are defined:

    • operation 1: add a new person with the person's town of residence.

    • operation 2: print all the data of a town (including the number of inhabitants).


    Let us suppose, moreover, that for this application, the load is that shown in Figure 7.8.

    Table of volumes
    Concept     Type   Volume
    Town        E      200
    Person      E      1000000
    Residence   R      1000000

    Table of operations
    Operation   Type   Frequency
    Op 1        I      500 per day
    Op 2        I      2 per day

    Figure 7.8 Tables of volumes and operations for the schema in Figure 7.7.

    Let us first try to evaluate the indicators of performance in the case of presence of redundancy (attribute NumberOfInhabitants in the TOWN entity).

    Assume that the number of inhabitants of a town requires four bytes. We can see that the redundant data requires 4 × 200 = 800 bytes, that is, less than one Kbyte of additional storage. Let us now move on to estimating the cost of the operations. As described in the access tables in Figure 7.9, operation 1 requires the following. A write access to the PERSON entity (to add a new person), a write access to the RESIDENCE relationship (to add a new person-town pair) and finally a read access (to find the relevant town) and another write access to the TOWN entity (to update the number of inhabitants of that occurrence). This is all repeated 500 times per day, for a total of 1500 write accesses and 500 read accesses. The cost of operation 2 is almost negligible, as it requires a single read access to the TOWN entity to be repeated twice a day. Supposing a write access to cost twice as much as a read access, we have a total of 3500 accesses a day when there is redundant data.

    Tables of accesses in presence of redundancy

    Operation 1
    Concept     Type   Accesses   Access type
    Person      E      1          W
    Residence   R      1          W
    Town        E      1          R
    Town        E      1          W

    Operation 2
    Concept     Type   Accesses   Access type
    Town        E      1          R

    Tables of accesses in absence of redundancy

    Operation 1
    Concept     Type   Accesses   Access type
    Person      E      1          W
    Residence   R      1          W

    Operation 2
    Concept     Type   Accesses   Access type
    Town        E      1          R
    Residence   R      5000       R

    Figure 7.9 Tables of accesses for the schema on electoral roll data in Figure 7.7.

    Let us now consider what happens when there is no redundant data.


    For operation 1 we need a write access to the PERSON entity and a write access to the RESIDENCE relationship for a total of 1000 write accesses per day. (There is no need to access the TOWN entity since there is no derived information.) For operation 2, however, we need a read access to the TOWN entity (to obtain the data for the town), which we can neglect, and 5000 read accesses to the RESIDENCE relationship on average (obtained by dividing the number of people by the number of towns) to calculate the number of inhabitants of this town. This gives a total of 10000 read accesses per day. Counting twice the write accesses, we have a total of 12000 accesses per day when there is no redundant data. Thus, approximately 8500 more accesses per day are required where there is no redundant data in order to save a mere Kbyte. This depends on the fact that the read accesses needed to compute the derived data are much more numerous than the write accesses needed to keep the derived data up to date.

    It is obvious that, in this case, it is worth maintaining the redundant data.
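
    The comparison above can be reproduced with a short Python sketch. The function daily_cost and the tuple layout are illustrative assumptions; the access counts, the frequencies and the rule that a write costs twice as much as a read are those stated in the text. Unlike the text, the sketch does not neglect the single read of TOWN in operation 2, so its totals are 3502 and 12002 rather than the rounded 3500 and 12000.

```python
WRITE_WEIGHT = 2  # the text assumes a write costs twice as much as a read


def daily_cost(operations):
    """operations: list of (reads, writes, executions per day).
    Returns the weighted number of accesses per day."""
    return sum((r + WRITE_WEIGHT * w) * freq for r, w, freq in operations)


with_redundancy = daily_cost([
    (1, 3, 500),  # operation 1: read Town; write Person, Residence, Town
    (1, 0, 2),    # operation 2: read Town
])
without_redundancy = daily_cost([
    (0, 2, 500),       # operation 1: write Person, Residence
    (1 + 5000, 0, 2),  # operation 2: read Town, then 5000 Residence occurrences
])

extra_storage_bytes = 4 * 200  # four bytes for each of the 200 towns
saved_accesses = without_redundancy - with_redundancy

print(with_redundancy, without_redundancy, saved_accesses, extra_storage_bytes)
```

    Changing the frequencies in the two lists shows how quickly the balance shifts when operation 2 becomes rarer or operation 1 more frequent.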

    7.2.2 Removing generalizations

    The relational model does not allow the direct representation of generalizations of the E-R model. We need, therefore, to transform these constructs into other constructs that are easier to translate. As we shall see later in Section 7.4.2, the E-R constructs for which the translation into the relational model is easy are entities and relationships.

    To represent a generalization using entities and relationships there are essentially three possible options. We will demonstrate these by referring to the generic E-R schema in Figure 7.10.

    Figure 7.10 Example of a schema with generalization.

    The possible outcomes are shown in Figure 7.11 and are obtained by means of the following restructurings.

    Figure 7.11 Possible restructurings of the schema in Figure 7.10.

    1. Collapse the child entities into the parent entity. The entities E1 and E2 are deleted and their properties are added to the parent entity E0. To this entity, a further attribute Atype is added, which serves to distinguish the 'type' (E1 or E2) of an occurrence of E0. For example, a generalization between the parent entity PERSON and the child entities MAN and WOMAN can be collapsed into the entity PERSON by adding to it the attribute Sex. Note that the attributes A11 and A21 of the child entities can assume null values (because they are inapplicable) for some occurrences of E0. In addition, the relationship R2 will have a minimum cardinality equal to zero for the E0 entity (because the occurrences of E2 are only a subset of the occurrences of E0).

    2. Collapse the parent entity into the child entities. The parent entity E0 is deleted and, by the property of inheritance, its attributes, its identifier and the relationships in which this entity was involved are added to both the child entities E1 and E2. The relationships R11 and R12 represent respectively the restriction of the relationship R1 on the occurrences of the entities E1 and E2. Consider, for example, a generalization between the entity PERSON, having Surname and Age as attributes and SSN (Social Security Number) as an identifier, and the entities MAN and WOMAN. If this is restructured in this way, then the attributes Surname and Age and the identifier SSN are added to both the entities MAN and WOMAN.

    3. Substitution of the generalization with relationships. The generalization is transformed into two one-to-one relationships that link the parent entity with the child entities E1 and E2. There are no transfers of attributes or relationships and the entities E1 and E2 are identified externally by the entity E0. Additional constraints hold in the new schema: each occurrence of E0 cannot participate in both RG1 and RG2; moreover, if the generalization is complete, each occurrence of E0 must participate in exactly one of RG1 and RG2.
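
    The three options can be sketched in Python on the PERSON, MAN and WOMAN example. This is only an illustration under stated assumptions: the dictionary representation of a schema, the function names and the child attributes MilitaryStatus and MaidenName are hypothetical and do not come from the text.

```python
parent = {"name": "Person", "identifier": ["SSN"], "attributes": ["Surname", "Age"]}
children = [
    {"name": "Man", "attributes": ["MilitaryStatus"]},  # hypothetical attribute
    {"name": "Woman", "attributes": ["MaidenName"]},    # hypothetical attribute
]


def collapse_into_parent(parent, children, type_attribute):
    """Option 1: a single entity; child attributes become nullable and a
    type attribute distinguishes the kind of each occurrence."""
    return {parent["name"]: {
        "identifier": list(parent["identifier"]),
        "attributes": parent["attributes"] + [type_attribute],
        "nullable": [a for c in children for a in c["attributes"]],
    }}


def collapse_into_children(parent, children):
    """Option 2: the parent disappears; every child inherits its
    identifier and attributes."""
    return {c["name"]: {
        "identifier": list(parent["identifier"]),
        "attributes": parent["attributes"] + c["attributes"],
        "nullable": [],
    } for c in children}


def substitute_with_relationships(parent, children):
    """Option 3: all entities are kept; each child is identified
    externally through a one-to-one relationship with the parent."""
    schema = {parent["name"]: {
        "identifier": list(parent["identifier"]),
        "attributes": list(parent["attributes"]),
        "nullable": [],
    }}
    for c in children:
        schema[c["name"]] = {
            "identifier": ["external: " + parent["name"]],
            "attributes": list(c["attributes"]),
            "nullable": [],
        }
    return schema


option1 = collapse_into_parent(parent, children, "Sex")
option2 = collapse_into_children(parent, children)
option3 = substitute_with_relationships(parent, children)
print(option1, option2, option3, sep="\n")
```

    The nullable lists make the storage trade-off visible: only the first option introduces attributes that may be null.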

    The choice among the various options can be made in a manner similar to that used for derived data. That is, by considering the advantages and disadvantages of each of the possible choices in terms of storage needed and cost of the operations involved. We can, however, establish some general rules.

    Option 1 is convenient when the operations involve the occurrences and the attributes of E0, E1 and E2 more or less in the same way. In this case, even if we waste storage for the presence of null values, the choice assures fewer accesses compared to the others in which the occurrences and the attributes are distributed among the various entities.

    Option 2 is possible only if the generalization is total, otherwise the occurrences of E0 that are occurrences of neither E1 nor E2 would not be represented. It is useful when there are operations that refer only to occurrences of E1 or of E2, and so they make distinctions between these entities. In this case, storage is saved compared to Option 1, because, in principle, the attributes never assume null values. Further, there is a reduction of accesses compared to Option 3, because it is not necessary to visit E0 in order to access some attributes of E1 and E2.

    Option 3 is useful when the generalization is not total and the operations refer to either occurrences and attributes of E1 (E2) or of E0, and therefore make distinctions between child and parent entities. In this case, we can save storage compared to Option 1 because of the absence of null values, but there is an increase of the number of accesses to keep the occurrences consistent.

    There is an important aspect that must be clarified about the above. For the restructuring of the generalizations, the simple counting of instances and accesses is not always sufficient for choosing the best possible option. Given these factors, it would seem that Option 3 would hardly ever be suitable, as it usually requires more accesses in order to carry out operations on the data. This restructuring, however, has the great advantage of generating entities with fewer attributes. As we shall see, this translates into logical and then physical structures of small dimensions for which a physical access allows the retrieval of a greater amount of data (tuples) at once. Therefore, in some critical cases, a more refined analysis needs to be carried out. This might take into account other factors, such as the quantity of data that can be retrieved by means of a single access to secondary storage. These aspects will be discussed in more detail in Chapter 9.

    The options presented are not the only ones allowed, but it is possible to carry out restructurings that combine them. An example is given in Figure 7.12, which consists of another possible transformation of the schema given in Figure 7.10. In this case, based on considerations similar to those discussed above, it was decided to incorporate E0 and E1 and to leave the entity E2 separate from the others. The attribute Atype was added to distinguish the occurrences of E0 from those of E1.

    Figure 7.12 Possible restructuring of the schema in Figure 7.10.

    Finally, regarding generalizations on more than one level, we can proceed in a similar way, analyzing a generalization at a time, starting from the bottom of the entire hierarchy. Based on the above, various configurations are possible, which can be obtained by combining the basic restructurings on the various levels of the hierarchy.

    7.2.3 Partitioning and merging of entities and relationships

    Entities and relationships of an E-R schema can be partitioned or merged to improve the efficiency of operations, using the following principle. Accesses are reduced by separating attributes of the same concept that are accessed by different operations and by merging attributes of different concepts that are accessed by the same operations. The same criteria as those discussed for redundancies are valid in making a decision about this type of restructuring.

    Partitioning of entities. An example of entity partitioning is shown in Figure 7.13: the EMPLOYEE entity is substituted by two entities, linked by a one-to-one relationship. One of them describes personal information of an employee. The other describes information about the employment of an employee. This restructuring is useful if the operations that frequently involve the original entity require, for an employee, either only information of a personal nature or only information relating to his or her employment.

    Figure 7.13 Example of partitioning of entities.

    This is an example of vertical partitioning of an entity, in the sense that the concept is sub-divided according to its attributes. It is also possible, however, to carry out horizontal partitioning in which the sub-division works on the occurrences of entities. For example, there could be some operations that relate only to the analysts and others that operate only on the salespeople. In this case, too, it could be useful to partition the EMPLOYEE entity into two distinct entities, ANALYST and SALESPERSON, having the same attributes as the original entity. Note that horizontal partitioning corresponds to the introduction of hierarchies at the logical level.

    Horizontal partitioning has the side effect of having to duplicate the relationships in which the original entity participated. This phenomenon can have negative effects on the performance of the database. On the other hand, vertical partitioning generates entities with fewer attributes. These can therefore be translated into physical structures from which we can retrieve a great deal of data with a single access. Partitioning operations will be further discussed in Chapter 10, when dealing with the fragmentation of distributed databases.
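
    The difference between the two kinds of partitioning can be illustrated with a minimal Python sketch. The sample occurrences, the Role attribute and both function names are assumptions introduced for the example; the remaining attribute names are those visible in Figure 7.13.

```python
# Hypothetical sample occurrences of EMPLOYEE; "Role" is an assumed attribute.
employees = [
    {"EmployeeNumber": 1, "Name": "Ada", "Address": "1 Main St",
     "DateOfBirth": "1980-05-01", "Level": 3, "Salary": 52000, "Role": "analyst"},
    {"EmployeeNumber": 2, "Name": "Bob", "Address": "9 Oak Ave",
     "DateOfBirth": "1975-11-23", "Level": 2, "Salary": 41000, "Role": "salesperson"},
]


def vertical_partition(rows, key, attribute_groups):
    """Split each occurrence by attributes; every part keeps the identifier."""
    return [
        [{key: row[key], **{a: row[a] for a in group}} for row in rows]
        for group in attribute_groups
    ]


def horizontal_partition(rows, attribute):
    """Split the set of occurrences according to the value of one attribute."""
    parts = {}
    for row in rows:
        parts.setdefault(row[attribute], []).append(row)
    return parts


personal_data, employment_data = vertical_partition(
    employees, "EmployeeNumber",
    [["Name", "Address", "DateOfBirth"], ["Level", "Salary"]],
)
by_role = horizontal_partition(employees, "Role")
print(personal_data, employment_data, by_role, sep="\n")
```

    In the vertical case both parts contain every employee but fewer attributes; in the horizontal case each part keeps all attributes but only some employees.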


    Deletion of multi-valued attributes. One type of partitioning that should be discussed separately deals with the deletion of multi-valued attributes. This restructuring is necessary because, as with generalizations, the relational model does not allow the direct representation of these attributes.

    The restructuring required is quite simple and is illustrated in Figure 7.14.

    Figure 7.14 Example of deletion of multi-valued attributes.

    The AGENCY entity is separated into two entities: an entity having the same name and attributes as the original entity, apart from the multi-valued attribute Telephone, and a new TELEPHONE entity, with the attribute Number. These entities are linked by a one-to-many relationship. Obviously, if the attribute had also been optional, then the minimum cardinality for the AGENCY entity in the resulting schema would have been zero.

    Merging of entities. Merging is the reverse operation of partitioning. An example of merging of entities is shown in Figure 7.15 in which the PERSON and APARTMENT entities, linked by a one-to-one relationship OWNER, are merged into a single entity having the attributes of both.

    Figure 7.15 Example of merging of entities.

    This restructuring can be suggested by the fact that the most frequent operations on the PERSON entity always require data relating to the apartment that the person possesses. Thus, we wish to avoid the accesses necessary to retrieve this data by means of the OWNER relationship. A side-effect of this restructuring is the possible presence of null values because, according to the cardinalities, there are people who do not own apartments. Therefore, there are no values for them for the attributes AptAddress and AptNumber.
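
    A small Python sketch can make the null-value side-effect of merging concrete. The identifiers p1, p2 and a1, the sample values and the function merge_entities are hypothetical; only the attribute names come from Figure 7.15.

```python
# Hypothetical occurrences; the keys p1, p2 and a1 are illustrative identifiers.
people = {
    "p1": {"Name": "Ann", "Address": "3 Elm Rd", "Age": 40},
    "p2": {"Name": "Ben", "Address": "7 Pine St", "Age": 29},
}
apartments = {"a1": {"AptNumber": 12, "AptAddress": "5 High St"}}
owner = {"p1": "a1"}  # one-to-one relationship, optional on the person side


def merge_entities(people, apartments, owner):
    """Merge PERSON and APARTMENT into one entity; people who own no
    apartment get None (null) for the apartment attributes."""
    merged = {}
    for person_id, person in people.items():
        apartment = apartments.get(owner.get(person_id), {})
        merged[person_id] = {
            **person,
            "AptNumber": apartment.get("AptNumber"),
            "AptAddress": apartment.get("AptAddress"),
        }
    return merged


merged = merge_entities(people, apartments, owner)
print(merged)
```

    After merging, reading a person and the related apartment takes a single access, at the price of the null values held for non-owners.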

    Merging is generally carried out on one-to-one relationships, rarely on one-to-many relationships and hardly ever on many-to-many relationships. This is because merging of entities linked by a one-to-many relationship or a many-to-many relationship generates redundancies. In particular, it is easy to verify that redundancies can appear in non-key attributes of the entity that participates in the original relationship with a maximum cardinality equal to N. We will come back to illustrate this point in Chapter 8.

    Other types of partitioning and merging. Partitioning and merging operations can also be applied to relationships. This can be done for two reasons. Firstly, in order to separate occurrences of a relationship that are always accessed separately. Secondly, to merge two (or more) relationships between the same entities into a single relationship, when their occurrences are always accessed together. An example of partitioning of relationships is given in Figure 7.16 in which the current players of a basketball team are distinguished from past players.

    Figure 7.16 Example of partitioning of a relationship.

    We should mention that the decisions about partitioning and merging can be postponed until the physical design phase. Many of today's database management systems allow the specification of clusters of logical structures, that is, grouping of tables, carried out at the physical level. Clusters allow rapid access to data distributed throughout different logical structures.

    7.2.4 Selection of primary identifiers
    The choice of an identifier for each entity is essential to the translation into the relational model, because of the major role keys play in a value-based data model, as we discussed in Chapter 2. Furthermore, database management systems require the specification of a primary key on which auxiliary structures for fast access to data, known as indices, are automatically constructed. Indices are discussed in more detail in Section 9.5.5. Thus, where there are entities for which many identifiers (or none) have been specified, it is necessary to decide which attributes constitute the primary identifier. The criteria for this decision are as follows.

    • Attributes with null values cannot form primary identifiers. These attributes do not guarantee access to all the occurrences of the corresponding entity, as we pointed out while discussing keys for the relational model.

    • One or few attributes are preferable to many attributes. This ensures that the indices are of limited size, less storage is needed for the creation of logical links among the various relations, and join operations are facilitated.

    • For the same reason, an internal identifier with few attributes is preferable to an external one, possibly involving many entities. This is because external identifiers are translated into keys comprising the identifiers of the entities involved in the external identification. Thus, keys with many attributes would be generated.

    • An identifier that is used by many operations to access the occurrences of an entity is preferable to others. In this way, these operations can be executed efficiently, since they can take advantage of the indices automatically built by the DBMS.

    At this stage, if none of the candidate identifiers satisfies the above requirements, it is possible to introduce a further attribute to the entity. This attribute will hold special values (often called codes) generated solely for the purpose of identifying occurrences of the entity.

    It is advisable to keep track of the identifiers that are not selected as primary but that are used by some operations for access to data. As we will discuss in Chapter 9, for these identifiers we can explicitly define efficient access structures, generally known as secondary indices. Secondary indices can be used to access data as an alternative to those generated automatically on the primary identifiers.
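    The two closing remarks (a generated code as primary identifier, and a secondary index on an identifier that was not chosen) can be illustrated with a minimal sketch. It assumes SQLite through Python's standard `sqlite3` module; the table `Trainee`, its column types and the index name `TraineeBySSN` are hypothetical choices made only for this illustration.

```python
import sqlite3


def build_trainee_table():
    """Create a table whose primary identifier is a generated code."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Trainee (
        Code    INTEGER PRIMARY KEY,  -- generated code chosen as primary identifier
        SSN     TEXT NOT NULL,        -- identifier that was not chosen as primary
        Surname TEXT NOT NULL
    );
    -- Secondary index, defined explicitly on the alternative identifier.
    CREATE UNIQUE INDEX TraineeBySSN ON Trainee(SSN);
    """)
    return conn


def index_names(conn):
    """Return the names of the indices defined on Trainee."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'Trainee'"
    )
    return sorted(name for (name,) in rows)
```

    In this sketch the DBMS assigns the code when a row is inserted without one, while lookups by SSN can still use the secondary index.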

    7.3 Translation into the relational model
    The second step of logical design corresponds to a translation between different data models. Starting from an E-R schema, an equivalent relational schema is constructed. By equivalent, we mean a schema capable of representing the same information. According to the restructuring made on the E-R schema in the first step of logical design, it is sufficient to consider a simplified version of the E-R model. In this version, a schema contains no generalizations or multi-valued attributes and has only primary identifiers.

    We will deal with the translation problem systematically, beginning with the fundamental case, that of entities linked by many-to-many relationships. This example demonstrates the general principle on which the whole translation technique is based.

    7.3.1 Entities and many-to-many relationships
    Consider the schema in Figure 7.17.

    Figure 7.17 An E-R schema with a many-to-many relationship.

    Its natural translation into the relational model allows the following:

    • for each entity, a relation with the same name, having as attributes the same attributes as the entity and having its identifier as key;

    • for the relationship, a relation with the same name, having as attributes the attributes of the relationship and the identifiers of the entities involved; these identifiers, taken together, form the key of the relation.

    If the original attributes of entities or relationships are optional, then the corresponding attributes of relations can assume null values.

    The relational schema obtained is thus as follows:

    EMPLOYEE(Number, Surname, Salary)
    PROJECT(Code, Name, Budget)
    PARTICIPATION(Number, Code, StartDate)

    To make the meaning of the schema clearer it is helpful to do some renaming. For example, in our case we can clarify the contents of the PARTICIPATION relation by defining it as follows:

    PARTICIPATION(Employee, Project, StartDate)

    The domain of the Employee attribute is a set of employee numbers and that of the Project attribute is a set of project codes. There are referential constraints between these attributes and the EMPLOYEE relation and the PROJECT relation respectively.
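    The translation of the many-to-many relationship can be sketched in executable form. The following is a minimal illustration, assuming SQLite through Python's standard `sqlite3` module; the column types, the `NOT NULL` choices and the helper names `build_database` and `projects_of` are assumptions made for this sketch and are not part of the schema above.

```python
import sqlite3

# Column types are hypothetical: the schema above gives attribute names only.
SCHEMA = """
CREATE TABLE Employee (
    Number  INTEGER PRIMARY KEY,
    Surname TEXT    NOT NULL,
    Salary  INTEGER NOT NULL
);
CREATE TABLE Project (
    Code   TEXT    PRIMARY KEY,
    Name   TEXT    NOT NULL,
    Budget INTEGER NOT NULL
);
-- The relationship becomes a relation whose key is the pair of identifiers.
CREATE TABLE Participation (
    Employee  INTEGER NOT NULL REFERENCES Employee(Number),
    Project   TEXT    NOT NULL REFERENCES Project(Code),
    StartDate TEXT    NOT NULL,
    PRIMARY KEY (Employee, Project)
);
"""


def build_database():
    """Create the three relations in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce referential constraints
    conn.executescript(SCHEMA)
    return conn


def projects_of(conn, employee_number):
    """Names of the projects in which the given employee participates."""
    rows = conn.execute(
        "SELECT p.Name FROM Participation AS pa "
        "JOIN Project AS p ON p.Code = pa.Project "
        "WHERE pa.Employee = ? ORDER BY p.Name",
        (employee_number,),
    )
    return [name for (name,) in rows]
```

    The two `REFERENCES` clauses play the role of the referential constraints mentioned in the text, and the composite primary key prevents the same employee from being recorded twice on the same project.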

    Renaming is essential in some cases. For example, when we have recursive relationships such as that in Figure 7.18.

    Figure 7.18 E-R schema with recursive relationship.

    This schema is translated into two relations:

    PRODUCT(Code, Name, Cost)
    COMPOSITION(Part, Subpart, Quantity)

    In this schema, both the attributes Part and Subpart have product codes as domain. There is in fact a referential constraint between each of them and the PRODUCT relation.
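    The need for renaming in the recursive case shows up directly when the two relations are declared: both attributes of COMPOSITION refer to the same relation, so they cannot share a name. The sketch below assumes SQLite through Python's `sqlite3` module; the column types and the helper names `build_bill_of_materials` and `subparts_of` are hypothetical.

```python
import sqlite3


def build_bill_of_materials():
    """Create PRODUCT and COMPOSITION in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE Product (
        Code TEXT PRIMARY KEY,
        Name TEXT NOT NULL,
        Cost REAL NOT NULL
    );
    -- Part and Subpart both range over product codes.
    CREATE TABLE Composition (
        Part     TEXT    NOT NULL REFERENCES Product(Code),
        Subpart  TEXT    NOT NULL REFERENCES Product(Code),
        Quantity INTEGER NOT NULL,
        PRIMARY KEY (Part, Subpart)
    );
    """)
    return conn


def subparts_of(conn, part_code):
    """(name, quantity) pairs for the direct subparts of a product."""
    rows = conn.execute(
        "SELECT s.Name, c.Quantity FROM Composition AS c "
        "JOIN Product AS s ON s.Code = c.Subpart "
        "WHERE c.Part = ? ORDER BY s.Name",
        (part_code,),
    )
    return rows.fetchall()
```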

    The translation of a relationship involving more than two entities is similar to the translation of a binary relationship. For example, consider the schema with a ternary relationship in Figure 7.19.

    Figure 7.19 E-R schema with ternary relationship.

    This schema is translated into the following four relations:

    [...]

    PLAYER(Surname, DateOfBirth, Position, Team, Salary)
    TEAM(Name, Town, TeamColours)

    In this schema, there is obviously the referential constraint between the attribute Team of the PLAYER relation and the TEAM relation.

    Note that the participation of the PLAYER entity is mandatory. If it were optional (it is possible to have players who have no contract with a team), then both of the translations with three relations and with two relations would be valid. Even if in the second translation we have fewer relations, it is in fact possible to have null values in the PLAYER relation on the attributes Team and Salary. Conversely, in the first translation, this cannot happen.

    We mentioned in Section 5.2.2 that n-ary relationships are usually many-to-many. However, when an entity participates with a maximum cardinality of one, we can save a relation, as happens with the translation of one-to-many binary relationships. The entity that participates in the relationship with maximum cardinality of one is translated into a relation that includes the identifiers of the other entities involved in the relationship (as well as possible attributes of the relationship itself). There is, therefore, no longer any need to represent the original relationship with a separate relation. For example, assume that the PRODUCT entity participated in the relationship in Figure 7.19 with a minimum and maximum cardinality of one. This means that, for each product there is a sole supplier who supplies it and a sole department that is supplied. Then the schema is translated as follows.

    SUPPLIER(SupplierID, SupplierName)
    DEPARTMENT(Name, Telephone)
    PRODUCT(Code, Type, Supplier, Department, Quantity)

    Here there are referential constraints between the attribute Supplier of the PRODUCT relation and the SUPPLIER relation, and between the attribute Department of the PRODUCT relation and the DEPARTMENT relation.

    7.3.3 Entities with external identifiers
    Entities with external identifiers give rise to relations having keys that contain the identifier of the 'identifying' entities. Consider, for example, the E-R schema shown in Figure 7.21.

    Figure 7.21 E-R schema with external identifier.

    The corresponding relational schema is as follows:

    STUDENT(RegistrationNumber, University, Surname, EnrolmentYear)
    UNIVERSITY(Name, Town, Address)

    in which there is a referential constraint between the attribute University of the STUDENT relation and the UNIVERSITY relation.

    As we can see, by representing the external identifier, we also represent the relationship between the two entities. Remember that entities identified externally always participate in the relationship with a minimum and maximum cardinality of one. This type of translation is valid independently of the cardinality with which the other entities participate in the relationship.
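    The effect of an external identifier on the key can be sketched as follows, again assuming SQLite through Python's `sqlite3` module. The column types and the helper names `build_university_db` and `students_at` are hypothetical; the composite key (RegistrationNumber, University) follows the rule stated above.

```python
import sqlite3


def build_university_db():
    """Create UNIVERSITY and STUDENT in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE University (
        Name    TEXT PRIMARY KEY,
        Town    TEXT NOT NULL,
        Address TEXT NOT NULL
    );
    -- The key of Student contains the identifier of the identifying entity.
    CREATE TABLE Student (
        RegistrationNumber INTEGER NOT NULL,
        University         TEXT    NOT NULL REFERENCES University(Name),
        Surname            TEXT    NOT NULL,
        EnrolmentYear      INTEGER NOT NULL,
        PRIMARY KEY (RegistrationNumber, University)
    );
    """)
    return conn


def students_at(conn, university):
    """Surnames of the students enrolled at the given university."""
    rows = conn.execute(
        "SELECT Surname FROM Student WHERE University = ? ORDER BY Surname",
        (university,),
    )
    return [surname for (surname,) in rows]
```

    With this key, the same registration number may be reused by different universities but not within one university.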

    7.3.4 One-to-one relationships
    For one-to-one relationships, there are generally many possibilities for translation. We will begin with one-to-one relationships with mandatory participation for both the entities, such as that in the schema in Figure 7.22.

    Figure 7.22 E-R schema with one-to-one relationships.

    There are two symmetrical and equally valid possibilities for this type of relationship:

    HEAD(Number, Name, Salary, Department, StartDate)
    DEPARTMENT(Name, Telephone, Branch)

    with the referential constraint between the attribute Department of the HEAD relation and the DEPARTMENT relation, or:

    HEAD(Number, Name, Salary)
    DEPARTMENT(Name, Telephone, Branch, Head, StartDate)

    for which there is the referential constraint between the attribute Head of the DEPARTMENT relation and the HEAD relation.

    Since there is a one-to-one correspondence between the occurrences of the two entities, a further option would seem possible in which we have a single relation containing all the attributes in the schema. This option should be discarded, however, because we must not forget that the schema that we are translating is the result of a process in which precise choices were made regarding the merging and the partitioning of entities. This means that, if the restructured E-R schema has two entities linked by a one-to-one relationship, we found it convenient to keep the two concepts separate. It is therefore not appropriate to merge them during the translation into the relational model.


    Let us now consider the case of a one-to-one relationship with optional participation for one of the entities, such as that in the schema in Figure 7.23.

    Figure 7.23 E-R schema with one-to-one relationship.

    In this case we have one preferable option:

    EMPLOYEE(Number, Name, Salary)
    DEPARTMENT(Name, Telephone, Branch, Head, StartDate)

    for which there is the referential constraint between the attribute Head of the DEPARTMENT relation and the EMPLOYEE relation. This option is preferable to the one in which the relationship is represented in the EMPLOYEE relation through the name of the department managed, because, for this attribute, we could have null values.

    Finally, consider the case in which both the entities have optional participation. For example, assume that, in the schema in Figure 7.23, there can be departments with no head (and thus the minimum cardinality of the DEPARTMENT entity is equal to zero). In this case, there is a further possibility that allows for three separate relations:

    EMPLOYEE(Number, Name, Salary)
    DEPARTMENT(Name, Telephone, Branch)
    MANAGEMENT(Head, Department, StartDate)

    Note that the key of the MANAGEMENT relation could be the attribute Department as well. Here, we have referential constraints between the attributes Head and Department of the MANAGEMENT relation and the EMPLOYEE and DEPARTMENT relations, respectively.

    This solution has the advantage of never having null values on the attributes that implement the relationship. On the other hand, we need an extra relation, with a consequent increase in the complexity of the database. Therefore, the three-relation solution is to be considered only if the number of occurrences of the relationship is very low compared to the occurrences of the entities involved in the relationship. In this case, there is the advantage of avoiding the presence of many null values.
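    The three-relation solution can be sketched as follows, assuming SQLite through Python's `sqlite3` module. Column types and the helper names `build_management_db` and `head_of` are hypothetical. Since either Head or Department could serve as the key of MANAGEMENT, this sketch picks Head as primary key and declares Department `UNIQUE`, so that the relationship stays one-to-one.

```python
import sqlite3


def build_management_db():
    """Create EMPLOYEE, DEPARTMENT and MANAGEMENT in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE Employee (
        Number INTEGER PRIMARY KEY,
        Name   TEXT    NOT NULL,
        Salary INTEGER NOT NULL
    );
    CREATE TABLE Department (
        Name      TEXT PRIMARY KEY,
        Telephone TEXT NOT NULL,
        Branch    TEXT NOT NULL
    );
    -- One row per occurrence of the relationship: no attribute is ever null.
    CREATE TABLE Management (
        Head       INTEGER PRIMARY KEY REFERENCES Employee(Number),
        Department TEXT NOT NULL UNIQUE REFERENCES Department(Name),
        StartDate  TEXT NOT NULL
    );
    """)
    return conn


def head_of(conn, department):
    """Name of the head of a department, or None if it has no head."""
    row = conn.execute(
        "SELECT e.Name FROM Management AS m "
        "JOIN Employee AS e ON e.Number = m.Head "
        "WHERE m.Department = ?",
        (department,),
    ).fetchone()
    return row[0] if row else None
```

    Departments without a head and employees who manage nothing simply have no row in Management, which is how this variant avoids null values.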

    7.3.5 Translation of a complex schema
    To see how to proceed in a complex case, we will carry out an example of a translation based on the schema shown in Figure 7.24.

    Figure 7.24 An E-R schema for translation.

    In the first phase, we translate each entity into a relation. The translation of the entities with internal identifiers is immediate:

    E3(A31, A32)
    E4(A41, A42)
    E5(A51, A52)
    E6(A61, A62, A63)

    Now we translate the entities with external identifiers. We obtain the following relations:

    E1(A11, A51, A12)
    E2(A21, A11, A51, A22)

    Note how E2 takes the attribute A11 and (transitively) the attribute A51, which, together with A21, identify E2. Some referential constraints are defined for the relations produced (for example, there is a referential constraint between the attribute A51 of E1 and E5).

    We now move on to the translation of relationships. Relationships R1 and R6 have already been translated because of the external identification of E2 and E1 respectively. We assume we have decided to obtain a minimum number of relations in the final schema and we will therefore try to merge relations where possible. We obtain the following modifications to be carried out on the initial relations:

    • in order to translate R3, we introduce, with appropriate renaming, the attributes that identify E6, among those of E5, as well as the attribute AR3 of R3. Thus, we introduce A61R3, A62R3 and AR3 in E5;

    [...]

    Type                                      Possible translation

    Binary many-to-many relationship          E1(AE11, AE12)
                                              E2(AE21, AE22)
                                              R(AE11, AE21, AR)

    Ternary many-to-many relationship         E1(AE11, AE12)
                                              E2(AE21, AE22)
                                              E3(AE31, AE32)
                                              R(AE11, AE21, AE31, AR)

    One-to-many relationship with             E1(AE11, AE12, AE21, AR)
    mandatory participation                   E2(AE21, AE22)

    One-to-many relationship with             E1(AE11, AE12)
    optional participation                    E2(AE21, AE22)
                                              R(AE11, AE21, AR)
                                              Alternatively:
                                              E1(AE11, AE12, AE21*, AR*)
                                              E2(AE21, AE22)

    Relationship with external identifiers    E1(AE12, AE21, AE11, AR)
                                              E2(AE21, AE22)

    Figure 7.25 Translations from the E-R model to the relational.

    [...] Figure 7.27, with reference to the translation of the schema in Figure 7.17. In this representation keys of relations appear in bold face, arrows describe referential constraints and the presence of asterisks on the attributes denotes the possibility of having null values on them.

    In this way, we can keep track of the relationships of the original E-R schema. This is useful to identify easily the join paths, that is, the join operations necessary to reconstruct the information represented by the original relationship. Thus, in the case in the example, the projects in which the employees participate can be retrieved by means of the PARTICIPATION relation.

    Type                                      Possible translation

    One-to-one relationship with mandatory    E1(AE11, AE12, AE21, AR)
    participation for both entities           E2(AE21, AE22)
                                              Alternatively:
                                              E2(AE21, AE22, AE11, AR)
                                              E1(AE11, AE12)

    One-to-one relationship with optional     E1(AE11, AE12, AE21, AR)
    participation for one entity              E2(AE21, AE22)

    One-to-one relationship with optional     E1(AE11, AE12)
    participation for both entities           E2(AE21, AE22, AE11*, AR*)
                                              Alternatively:
                                              E1(AE11, AE12, AE21*, AR*)
                                              E2(AE21, AE22)
                                              Alternatively:
                                              E1(AE11, AE12)
                                              E2(AE21, AE22)
                                              R(AE11, AE21, AR)

    Figure 7.26 Translations from the E-R model to the relational.

    Figure 7.27 Graphical representation of a translation of the schema in Figure 7.17. (EMPLOYEE: Number, Surname, Salary; PROJECT: Code, Name, Budget; PARTICIPATION: Employee, Project, StartDate.)

    Another example of the use of this notation is given in Figure 7.28, with reference to the translation of the schema in Figure 7.20.

    Figure 7.28 Graphical representation of a translation of the schema in Figure 7.20.

    Note that this method also allows us to represent explicitly the relationships of the original E-R schema to which, in the equivalent relational schema, no relation corresponds (the CONTRACT relationship in the example in question).

    As a final example, Figure 7.29 shows the representation of the relational schema obtained in Section 7.3.5. Now, the logical links between the various relations can be easily identified.

    Figure 7.29 Graphical representation of the relational schema obtained in Section 7.3.5.

    In Appendix A, we will see that a variant of the graphical formalism shown is actually adopted by the Access database management system, both to represent relational schemas and to express join operations.

    7.4 An example of logical design
    Let us return to the example in the preceding chapter regarding the training company. The conceptual schema is shown again, for convenience, in Figure 7.30.

    Figure 7.30 The E-R schema of a training company.

    The following operations were planned on the data described by this schema:

    • operation 1: insert a new trainee indicating all his or her data;

    • operation 2: assign a trainee to an edition of a course;

    • operation 3: insert a new instructor indicating all his or her data and the courses he or she is qualified to teach;

    • operation 4: assign a qualified instructor to an edition of a course;

    • operation 5: display all the information on the past editions of a course with title, class timetables and number of trainees;

    • operation 6: display all the courses offered, with information on the instructors who are qualified to teach them;

    • operation 7: for each instructor, find the trainees for all the courses he or she is teaching or has taught;

    • operation 8: carry out a statistical analysis of all the trainees with all the information on them, on the edition of courses they have attended and on the marks obtained.

    7.4.1 Restructuring phase
    The database load is shown in Figure 7.31. We will now carry out the various restructuring tasks. The various transformations are shown together in the final schema in Figure 7.33.

    Table of volumes
    Concept              Type   Volume
    Class                E       8000
    CourseEdition        E       1000
    Course               E        200
    Instructor           E        300
    Freelance            E        250
    Permanent            E         50
    Trainee              E       5000
    Employee             E       4000
    Professional         E       1000
    Employer             E       8000
    PastAttendance       R      10000
    CurrentAttendance    R        500
    Composition          R       8000
    Type                 R       1000
    PastTeaching         R        900
    CurrentTeaching      R        100
    Qualification        R        500
    CurrentEmployment    R       4000
    PastEmployment       R      10000

    Table of operations
    Operation   Type   Frequency
    Op 1        I      40 per day
    Op 2        I      50 per day
    Op 3        I      2 per day
    Op 4        I      15 per day
    Op 5        I      10 per day
    Op 6        I      20 per day
    Op 7        I      5 per day
    Op 8        B      10 per month

    Figure 7.31 Tables of volumes and operations for the schema in Figure 7.30.

    Analysis of redundancies
    There is only one redundant piece of data in the schema: the attribute NumberOfParticipants in COURSEEDITION, which can be derived from the relationships CURRENTATTENDANCE and PASTATTENDANCE. The storage requirement is 4 × 1000 = 4000 bytes, having assumed that four bytes are necessary for every occurrence of COURSEEDITION to store the number of participants. The operations involved with this information are 2, 5 and 8. The last of these can be left out because it deals with an infrequent operation that is carried out in batch mode. We will therefore evaluate the cost of operations 2 and 5 in the cases of the presence or absence of redundant data. We can deduce from the table of volumes that each edition of the course has, on average, eight classes and 10 participants. From this data we can easily derive the access tables shown in Figure 7.32.

    Accesses with redundancy

    Operation 2
    Concept           Construct   Accesses   Type
    Trainee           E           1          R
    CurrentAtt'nce    R           1          W
    CourseEdition     E           1          R
    CourseEdition     E           1          W

    Operation 5
    Concept           Construct   Accesses   Type
    CourseEdition     E           1          R
    Type              R           1          R
    Course            E           1          R
    Composition       R           8          R
    Class             E           8          R

    Accesses without redundancy

    Operation 5
    Concept           Construct   Accesses   Type
    CourseEdition     E           1          R
    Type              R           1          R
    Course            E           1          R
    Composition       R           8          R
    Class             E           8          R
    PastAttendance    R           10         R

    Figure 7.32 Access table for the schema in Figure 7.30.

    From the access tables we obtain:

    • with redundancy: for operation 2 we have 2 × 50 = 100 read accesses and as many again in write accesses per day, while, for operation 5, we have 19 × 10 = 190 read accesses per day, for a total of 490 accesses per day (having given double weight to the write accesses);

    • without redundancy: for operation 2 we have 50 read accesses per day and as many again in write accesses per day, while, for operation 5, we have 29 × 10 = 290 read accesses per day, for a total of 440 accesses per day (having given double weight to the write accesses).
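    The two totals can be rechecked with a few lines of arithmetic. The sketch below is only a restatement of the figures given in the text (accesses per execution, daily frequencies, and the double weight for writes); the names `daily_cost`, `WITH_REDUNDANCY` and `WITHOUT_REDUNDANCY` are introduced here for illustration.

```python
WRITE_WEIGHT = 2  # the text gives double weight to write accesses


def daily_cost(operations):
    """Weighted accesses per day.

    Each operation is a tuple (reads, writes, executions per day),
    where reads and writes are counted per execution.
    """
    return sum((reads + WRITE_WEIGHT * writes) * freq
               for reads, writes, freq in operations)


# Operation 2 (50 per day) and operation 5 (10 per day), as in the text.
WITH_REDUNDANCY = [(2, 2, 50), (19, 0, 10)]
WITHOUT_REDUNDANCY = [(1, 1, 50), (29, 0, 10)]

# Four bytes for each of the 1000 occurrences of CourseEdition.
STORAGE_BYTES = 4 * 1000

print(daily_cost(WITH_REDUNDANCY))     # 490
print(daily_cost(WITHOUT_REDUNDANCY))  # 440
```

    Keeping the redundant attribute therefore costs 50 more weighted accesses per day, in addition to the 4000 bytes of storage.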

    Accesses without redundancy, Operation 2
    Concept           Construct   Accesses   Type
    Trainee           E           1          R
    CurrentAtt'nce    R           1          W

    Thus, when the redundancy is present, we have disadvantages both in terms of storage and access time. We will therefore delete the attribute NumberOfParticipants from the entity COURSEEDITION.

    Removing generalizations
    There are two generalizations in the schema: that relating to the instructors and that relating to the trainees. For the instructors, it can be noted that the relevant operations, that is, 3, 4, 6 and 7, make no distinction between freelance instructors and those employed on a permanent basis by the company. Furthermore, the corresponding entities have no specific attributes for them. Therefore, we decide to delete the child entities of the generalization and add an attribute Type to the INSTRUCTOR entity. This attribute has a domain made up of the symbols F (for freelance) and P (for permanent).

    For the trainees, we observe that in this case too, the operations involving this data (operations 1, 2 and 8) make no substantial difference between the various types of occurrence. We can see, however, that professionals and employees both have specific attributes. We therefore leave the entities EMPLOYEE and PROFESSIONAL, adding two one-to-one relationships between these entities and the TRAINEE entity. In this way, we can avoid having attributes with possible null values on the parent entity of the generalization and we can reduce the dimension of the relations. The result of the restructuring can be seen in the schema in Figure 7.33.

    Partitioning and merging of concepts
    From the analysis of data and operations, many potential restructurings of this type can be identified. The first relates to the COURSEEDITION entity: we can see that operation 5 relates only to the past editions and that the relationships PASTTEACHING and PASTATTENDANCE refer only to these editions of the course. Thus, in order to make the above operation more efficient, we could decompose the entity horizontally to distinguish the current editions from the past ones. The disadvantage of this choice, however, is that the relationships COMPOSITION and TYPE would be duplicated. Furthermore, operations 7 and 8 do not make great distinctions between current editions and past ones and would be more expensive, because they would require visits to two distinct entities. Therefore, we will not partition this entity.

    Two other possible restructurings that we could consider are the merging of the relationships PASTTEACHING and PRESENTTEACHING and the similar relationships PASTATTENDANCE and PRESENTATTENDANCE. In both cases, we are dealing with two similar concepts between which some operations make no difference (7 and 8). The merging of these relationships would produce another advantage: it would no longer be necessary to transfer occurrences from one relationship to another at the end of a course edition. A negative factor is the presence of the attribute Mark, which does not apply to the current editions and could thus produce null values. For the rest, the table of volumes tells us that the estimated number of occurrences of the CURRENTATTENDANCE relationship is 500. Therefore, supposing that we need four bytes to store the marks, the waste of storage would be only two Kbytes. We can decide therefore to merge the two pairs of relationships as described in Figure 7.33. We must add a constraint that is not expressible by the schema, which requires that an instructor cannot teach more than one edition of a course in any one period. Similarly, a participant cannot attend more than one edition of a course at a particular time.

    Finally, we need to remove the multi-valued attribute Telephone from the INSTRUCTOR entity. To do this we must introduce a new entity TELEPHONE linked by a one-to-many relationship with the INSTRUCTOR entity, from which the attribute will be removed.

    It is interesting to note that some decisions made in this phase reverse, in some way, decisions made during the conceptual design phase. This is not surprising however: the aim of conceptual design is merely to represent the requirements in the best way possible, without considering the efficiency of the application. In logical design we must instead try to optimize the performance and re-examining earlier decisions is inevitable.

    Selection of primary identifiers
    Only the TRAINEE entity presents two identifiers: the social security number and the internal code. It is far preferable to choose the second. A social security number can require several bytes whilst an internal code, which serves to distinguish 5000 occurrences (see volume table), requires no more than two bytes.

    There is another pragmatic consideration to be made regarding identifiers, to do with the COURSEEDITION entity. This entity is identified by the StartDate attribute and by the COURSE entity. This gives a composite identifier that, in a relational representation, must be used to implement two relationships (ATTENDANCE and TEACHING). We can see, however, that each course has a code and that the average number of editions of a course is five. This means that it is sufficient to add a small integer to the course code to have an identifier for the course editions. This operation can be carried out efficiently and accurately during the creation of a new edition. It follows that it is convenient to define a new identifier for the editions of the courses that replaces the preceding external identifier. This is an example of analysis and restructuring that is not in any of the general categories we have seen, but in practice can be encountered.

    This is the end of the restructuring phase of the original E-R schema. The resulting schema is shown in Figure 7.33.

    Figure 7.33 The E-R schema of Figure 7.30 after the restructuring phase.

    7.4.2 Translation into the relational model
    By following the translation techniques described in this chapter, the E-R schema in Figure 7.33 can be translated into the following relational schema:

    COURSEEDITION(Code, StartDate, EndDate, Course, Instructor)
    CLASS(Time, Room, Date, Edition)
    INSTRUCTOR(SSN, Surname, Age, TownOfBirth, Type)
    TELEPHONE(Number, Instructor)
    COURSE(Code, Name)
    QUALIFICATION(Course, Instructor)
    TRAINEE(Code, SSN, Surname, Age, TownOfBirth, Sex)
    ATTENDANCE(Trainee, Edition, Marks*)
    EMPLOYER(Name, Address, Telephone)
    PASTEMPLOYMENT(Trainee, Employer, StartDate, EndDate)
    PROFESSIONAL(Trainee, Expertise, ProfessionalTitle*)
    EMPLOYEE(Trainee, Level, Position, Employer, StartDate)

    The logical schema will naturally be completed by a support document that describes, among other things, all the referential constraints that exist between the various relations. This can be done using the graphical notation introduced in Section 7.3.7.
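    As an illustration of how some of these referential constraints might be written down, the following sketch declares a subset of the relations, assuming SQLite through Python's `sqlite3` module. The choice of keys, the column types and the omission of several attributes and relations are assumptions of this sketch, since the listing above does not show which attributes form each key. The helper `participants` shows how the number of participants, removed as a redundant attribute during restructuring, can be derived from ATTENDANCE.

```python
import sqlite3


def build_training_db():
    """Create a subset of the training-company relations (assumed keys)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE Course (
        Code TEXT PRIMARY KEY,
        Name TEXT NOT NULL
    );
    CREATE TABLE CourseEdition (
        Code      TEXT PRIMARY KEY,
        StartDate TEXT NOT NULL,
        EndDate   TEXT NOT NULL,
        Course    TEXT NOT NULL REFERENCES Course(Code)
    );
    CREATE TABLE Trainee (
        Code    INTEGER PRIMARY KEY,
        SSN     TEXT NOT NULL UNIQUE,
        Surname TEXT NOT NULL
    );
    CREATE TABLE Attendance (
        Trainee INTEGER NOT NULL REFERENCES Trainee(Code),
        Edition TEXT    NOT NULL REFERENCES CourseEdition(Code),
        Marks   INTEGER,  -- nullable: no mark yet for current editions
        PRIMARY KEY (Trainee, Edition)
    );
    """)
    return conn


def participants(conn, edition_code):
    """Number of trainees attending an edition, derived by counting."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM Attendance WHERE Edition = ?",
        (edition_code,),
    ).fetchone()
    return count
```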

    7.5 Logical design using CASE tools

    The logical design phase is generally supported by all the CASE tools for database development available on the market. In particular, since the translation to the relational model is based on precise criteria, it is carried out by these systems almost automatically. On the other hand, the restructuring step, which precedes the actual translation, is difficult to automate and the various products provide little or no support for it. For example, some systems automatically translate all the generalizations according to just one of the methods described in Section 7.2.2. We have seen, however, that the restructuring of an E-R schema is a fundamental design activity, because it can provide solutions to efficiency problems that should be resolved before carrying out the translation and that are not relevant to conceptual design. The designer should therefore take care to handle this aspect without putting too much confidence in the tool available.

    An example of the output of the translation step using a database design tool is shown in Figure 7.34. The example refers to the conceptual schema of Figure 6.14. The resulting schema is shown in graphical form, which represents the relational tables together with the relationships of the original schema. Note how the many-to-many relationship between EMPLOYEE and PROJECT has been translated into a relation. Also, note how new attributes have been added to the relations originating from entities to represent the one-to-many and one-to-one relationships. In the figure, the SQL code also appears, generated automatically by the system. It allows the designer to define the database using a specific database management system. Some systems allow direct connection with a DBMS and can construct the corresponding database automatically. Other systems provide tools to carry [...]


    Figure 7.34 Logical design with a CASE tool.

    is particularly useful for the analysis of a legacy system, possibly oriented towards a migration to a new database management system.
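
    The near-automatic character of the translation step can be illustrated with a small sketch. The generator below is hypothetical and is not the algorithm of any actual CASE tool: it takes entity descriptions and a many-to-many relationship and emits CREATE TABLE statements in which the relationship becomes a relation keyed by the identifiers of the participating entities, as in the EMPLOYEE and PROJECT example. The function names and the (columns, key) layout are assumptions made for illustration; the column names and types follow the SQL shown in the figure.

```python
def entity_ddl(name, columns, key):
    """Build a CREATE TABLE statement for one relation.

    columns maps column name -> SQL type; key lists the identifier columns.
    Identifier columns are declared NOT NULL, the others are left nullable.
    """
    lines = []
    for col, sql_type in columns.items():
        nullability = "NOT NULL" if col in key else "NULL"
        lines.append(f"  {col} {sql_type} {nullability}")
    lines.append(f"  PRIMARY KEY ({', '.join(key)})")
    return f"CREATE TABLE {name} (\n" + ",\n".join(lines) + "\n);"


def many_to_many_ddl(name, left, right):
    """Translate a many-to-many relationship into its own relation.

    left and right are (columns, key) pairs of the participating entities.
    The key of the new relation is the union of the two identifiers.
    """
    columns, key = {}, []
    for ent_columns, ent_key in (left, right):
        for col in ent_key:
            if col in columns:
                raise ValueError(f"column name clash: {col}")
            columns[col] = ent_columns[col]
            key.append(col)
    return entity_ddl(name, columns, key)


# Entities as (columns, key) pairs, with names and types taken from the figure.
employee = ({"Emp_Id": "NUMBER", "Name": "VARCHAR2(20)",
             "Salary": "NUMBER", "Age": "NUMBER"}, ["Emp_Id"])
project = ({"Name": "VARCHAR2(20)", "Budget": "NUMBER",
            "Deadline": "DATE"}, ["Name"])

print(entity_ddl("Employee", *employee))
print(entity_ddl("Project", *project))
print(many_to_many_ddl("Employee_Project", employee, project))
```

    The sketch does not model one-to-many relationships, so the Dept_Id column that the figure adds to Employee is left out here. Restructuring decisions, such as how to treat generalizations, are likewise outside its scope, which reflects the point made above that this step is hard to automate.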

    7.6 Bibliography

    Logical design is covered in detail in the books by Batini, Ceri and Navathe [7], Teorey [84] and Teorey and Fry [85]. The problem of translating an E-R schema into the relational model is discussed in the original paper by Chen [23] and in a paper by Teorey, Yang and Fry [86], which considers a detailed list of cases.

    7.7 Exercises

    Exercise 7.1 Consider the E-R schema in Exercise 6.4. Make hypotheses on the volume of data and on the operations possible on this data and, based on these hypotheses, carry out the necessary restructuring of the schema. Then carry out the translation to the relational model.

    Exercise 7.2 Translate the E-R schema on the personnel of a company (shown again for convenience in Figure 7.35) into the relational model.

    [Figure 7.34, table boxes: columns Emp_Id: NUMBER, Dept_Id: NUMBER, Name: VARCHAR2(20), Salary: NUMBER, Age: NUMBER, Telephone: NUMBER, City: VARCHAR2(20), Address: VARCHAR2(20).]

    [Figure 7.34, generated SQL code:]

    CREATE TABLE Employee (
        Emp_Id NUMBER NOT NULL,
        Dept_Id NUMBER NOT NULL,
        Name VARCHAR2(20) NULL,
        Salary NUMBER NULL,
        Age NUMBER NULL,
        PRIMARY KEY (Emp_Id));

    CREATE TABLE Project (
        Name VARCHAR2(20) NOT NULL,
        Budget NUMBER NULL,
        Deadline DATE NULL,
        PRIMARY KEY (Name));

    CREATE TABLE Employee_Project (
        Emp_Id NUMBER NOT NULL,
        Name VARCHAR2(20) NOT NULL,
        PRIMARY KEY (Emp_Id, Name));


    [Figure 7.35 diagram labels: entity EMPLOYEE; attributes Code, Surname, Salary, Age, Phone, Name, StartDate, Budget, Number, Street, ReleaseDate; cardinalities (0,1), (1,N), (0,N), (1,1).]

    Figure 7.35 An E-R schema on the personnel of a company.

    Exercise 7.3 Translate the E-R schema obtained in Exercise 6.6 into a relational schema.

    Exercise 7.4 Define a relational schema corresponding to the E-R schema obtained in Exercise 6.10. For the restructuring phase, indicate the possible options and choose one, making assumptions on the quantitative parameters. Assume that the database relates to certain apartment blocks, having on average five buildings each, and that each building has on average twenty apartments. The main operations are the registration of a tenant (50 per year per block) and recording the payment of rent.

    Exercise 7.5 Translate the E-R schema of Figure 7.36 into a relational database schema. For each relation indicate the key and, for each attribute, specify if null values can occur (supposing that the attributes of the E-R schema do not admit null values).

    [Diagram labels: relationships MANAGEMENT, MEMBERSHIP, PARTICIPATION, COMPOSITION; entities DEPARTMENT, PROJECT; attributes Code, Address.]


    [Figure 7.38 diagram labels: entities CLIENT, PERSON, COMPANY, ACCOUNT, TRANSACTION; relationships ACCOUNTHOLDER, OPERATION; attributes ClientNumber, Name, Address, TotalBalance, NumberOfAccounts, CreditLimit, TaxNumber, AccountNumber, Balance, TransactionNumber, Date, Type, Amount; cardinality (1,N).]

    Figure 7.38 An E-R schema to translate.

    • Operation 9: find the number of accounts held by a client;
    • Operation 10: show the transactions for the last three months of accounts of each client with a negative balance.

    Finally, suppose that in the operation stage, the database load for this application is that shown in Figure 7.39.

    Volumes
    Concept         Type   Volume
    Client          E      15000
    Account         E      20000
    Transaction     E      600000
    Person          E      14000
    Company         E      1000
    AccountHolder   R      30000
    Operation       R      800000

    Operations
    Operation   Type    Frequency
    Op 1        [...]   100 per day
    Op 2        [...]   500 per day
    Op 3        [...]   1000 per day
    Op 4        [...]   2000 per day
    Op 5        [...]   1000 per day
    Op 6        [...]   200 per day
    Op 7        [...]   1500 per day
    Op 8        [...]   1 per month
    Op 9        [...]   75 per day
    Op 10       [...]   20 per day

    Figure 7.39 Volumes and operations tables for the schema in Figure 7.38.

    Carry out the logical design phase on the E-R schema, taking into account the data provided. In the restructuring phase, keep in mind the fact that there are two redundancies on the schema: the attributes TotalBalance and NumberOfAccounts in the entity CLIENT. These can be derived from the relationship ACCOUNTHOLDER and from the ACCOUNT entity.
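
    The remark on the two redundancies can be made concrete with a small sketch. It assumes a hypothetical relational rendering in which ACCOUNT has the attributes AccountNumber and Balance, and ACCOUNTHOLDER pairs a client number with an account number. These table layouts, the sample rows and the function name derived_attributes are our assumptions and are not offered as the solution of the exercise. Under these assumptions both TotalBalance and NumberOfAccounts can be computed with a join, which is what makes them redundant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (
    AccountNumber INTEGER PRIMARY KEY,
    Balance       REAL NOT NULL
);
CREATE TABLE AccountHolder (
    ClientNumber  INTEGER NOT NULL,
    AccountNumber INTEGER NOT NULL REFERENCES Account(AccountNumber),
    PRIMARY KEY (ClientNumber, AccountNumber)
);
-- Made-up sample data: account 2 is shared by clients 10 and 20.
INSERT INTO Account VALUES (1, 100.0), (2, -30.0), (3, 50.0);
INSERT INTO AccountHolder VALUES (10, 1), (10, 2), (20, 2), (20, 3);
""")


def derived_attributes(conn, client_number):
    """Compute (NumberOfAccounts, TotalBalance) for one client from the base tables."""
    row = conn.execute(
        """
        SELECT COUNT(*), COALESCE(SUM(a.Balance), 0)
        FROM AccountHolder AS h
        JOIN Account AS a ON a.AccountNumber = h.AccountNumber
        WHERE h.ClientNumber = ?
        """,
        (client_number,),
    ).fetchone()
    return row[0], row[1]
```

    Whether to keep the two attributes stored in CLIENT then depends on comparing the cost of running this computation for the frequent operations with the cost of keeping the stored values up to date, using the volumes and frequencies of Figure 7.39.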