BİL 354 – Veritabanı Yönetim Sistemleri

BİL 354 – Veritabanı Sistemleri Relational Database Design (İlişkisel Veri Tabanı Tasarımı)


Post on 19-Oct-2021


Page 1: BİL 354 – Veritabanı Yönetim Sistemleri

BİL 354 – Veritabanı Sistemleri

Relational Database Design (İlişkisel Veri Tabanı Tasarımı)

Page 2: BİL 354 – Veritabanı Yönetim Sistemleri

Integrity Constraints (Bütünlük Kısıtlamaları)

Domain Constraints (Alan Kısıtlamaları)

Referential Integrity (Referans Kısıtlamaları)

Others (Nitelikler Arası Bağımlılıklar)

Functional Dependencies (İşlevsel Bağımlılıklar)

Page 3: BİL 354 – Veritabanı Yönetim Sistemleri

Redundancy

Redundancy is at the root of several problems associated with relational schemas:

Redundant storage: Some information is stored repeatedly.

Update anomalies: If one copy is updated, all copies must be similarly updated.

Delete anomalies: It may not be possible to delete some information without losing some other information as well.

Insert anomalies: It may not be possible to store some information unless some other information is stored as well.

Page 4: BİL 354 – Veritabanı Yönetim Sistemleri

Update Anomaly

Page 5: BİL 354 – Veritabanı Yönetim Sistemleri

Insertion Anomaly

Page 6: BİL 354 – Veritabanı Yönetim Sistemleri

Deletion Anomaly

Page 7: BİL 354 – Veritabanı Yönetim Sistemleri

How about null values?

Hourly_Emps(ssn, name, lot, rating, hourly_wages, hours_worked)

Null values cannot eliminate redundant storage or update anomalies!

However, they can address insertion and deletion anomalies to some extent.

Deletion: to keep the wage for a rating after its last employee is deleted, we would need a tuple in which rating and hourly_wages are non-null but ssn is null. Is that acceptable?

Page 8: BİL 354 – Veritabanı Yönetim Sistemleri

Use of Decompositions

Hourly_Emps2(ssn, name, lot, rating, hours_worked)

Wages(rating, hourly_wages)
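The effect of this decomposition can be sketched in a few lines of Python. The rows below are hypothetical sample data (they are not from the slides), and the sketch assumes that rating determines hourly_wages, which is what storing the pair in Wages relies on.

```python
# Hypothetical Hourly_Emps rows, made up for illustration:
# (ssn, name, lot, rating, hourly_wages, hours_worked)
hourly_emps = {
    ("111", "Ada", 48, 8, 10, 40),
    ("222", "Bora", 22, 8, 10, 30),
    ("333", "Cem", 35, 5, 7, 30),
    ("444", "Deniz", 35, 5, 7, 32),
}

# Hourly_Emps2 drops hourly_wages; Wages keeps one row per rating.
hourly_emps2 = {(ssn, name, lot, rating, hours)
                for ssn, name, lot, rating, _wage, hours in hourly_emps}
wages = {(rating, wage)
         for _ssn, _name, _lot, rating, wage, _hours in hourly_emps}

# Joining the two pieces on rating gives back the original rows.
rejoined = {(ssn, name, lot, rating, wage, hours)
            for ssn, name, lot, rating, hours in hourly_emps2
            for w_rating, wage in wages
            if rating == w_rating}

print(sorted(wages))            # each (rating, hourly_wages) pair is stored once
print(rejoined == hourly_emps)  # nothing is lost for this sample data
```

In the decomposed form a wage change for a rating touches a single Wages row instead of one row per employee.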

Page 9: BİL 354 – Veritabanı Yönetim Sistemleri

Decomposition (Ayrıştırma)

Decomposition should be used carefully, or it may create more problems.

Two questions should be asked repeatedly:

Is there a reason to decompose a relation? Normal forms have been proposed to address this question.

What problems (if any) does the decomposition cause? Two properties matter here: lossless join and dependency preservation.

Page 10: BİL 354 – Veritabanı Yönetim Sistemleri

Decomposition (Ayrıştırma)

Lossless-join

Enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations.

Dependency-preservation

o Enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.

o Thus, we need not perform joins of the smaller relations to check whether a constraint on the original relation is violated.

Page 11: BİL 354 – Veritabanı Yönetim Sistemleri

Example

Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)

Page 12: BİL 354 – Veritabanı Yönetim Sistemleri

Example : Decomposition

Decompose the relation schema Lending-schema into:

branch-customer = (branch-name, branch-city, assets, customer-name)

customer-loan = (customer-name, loan-number, branch-name, amount)

All attributes of the original schema (R) must appear in the decomposition (R1, R2):

R = R1 ∪ R2

Lossless-join decomposition: for all possible relations r on schema R,

r = π_R1(r) ⋈ π_R2(r)

Page 13: BİL 354 – Veritabanı Yönetim Sistemleri

Example : Decomposition

Page 14: BİL 354 – Veritabanı Yönetim Sistemleri

Lossy-join Decomposition

The relation branch-customer ⋈ customer-loan

Page 15: BİL 354 – Veritabanı Yönetim Sistemleri

Desirable Properties of Decomposition

It may become necessary to decompose a relation into several smaller relations.

Lossless-Join

Dependency Preservation

Non-Repetition of Information

Using functional dependencies, we can define normal forms that represent "good" database design.

Page 16: BİL 354 – Veritabanı Yönetim Sistemleri

Desirable Properties of Decomposition 2

Lossless-Join: enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations.

Dependency Preservation: enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.

We need not perform joins of the smaller relations to check whether a constraint on the original relation is violated.

Non-Repetition of Information

Page 17: BİL 354 – Veritabanı Yönetim Sistemleri

Functional Dependencies (İşlevsel Bağımlılık)

A functional dependency (FD) X → Y holds over relation R if, for every allowable instance r of R:

t1 ∈ r, t2 ∈ r, π_X(t1) = π_X(t2) implies π_Y(t1) = π_Y(t2)

i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)

X → Y iff each X value is associated with precisely one Y value. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding values of the attributes in Y.

Page 18: BİL 354 – Veritabanı Yönetim Sistemleri

Functional Dependencies

A functional dependency (FD) is a kind of IC that generalizes the concept of a key.

Example:

StudentId → Name

but not Name → StudentId

If X → Y, then

if X is updated, Y must also be updated
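The definition can be turned into a small check on a single instance. This is only a sketch: the function name `satisfies_fd` and the sample rows are hypothetical, and passing the check on one instance never proves that the FD holds over the relation, since an FD is a statement about all allowable instances.

```python
from typing import Iterable, Mapping, Sequence

Row = Mapping[str, object]

def satisfies_fd(rows: Iterable[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if this instance does not violate the FD lhs -> rhs."""
    seen = {}  # maps each X value to the Y value first seen with it
    for row in rows:
        x_val = tuple(row[a] for a in lhs)
        y_val = tuple(row[a] for a in rhs)
        if seen.setdefault(x_val, y_val) != y_val:
            return False  # two tuples agree on X but differ on Y
    return True

# Hypothetical instance, made up for illustration only.
students = [
    {"StudentId": 1, "Name": "Ayşe"},
    {"StudentId": 2, "Name": "Mehmet"},
    {"StudentId": 3, "Name": "Ayşe"},
]

print(satisfies_fd(students, ["StudentId"], ["Name"]))  # True
print(satisfies_fd(students, ["Name"], ["StudentId"]))  # False
```

Two students share a name here, so Name → StudentId is violated while StudentId → Name is not.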

Page 19: BİL 354 – Veritabanı Yönetim Sistemleri

Functional Dependencies

An FD is a statement about all allowable relations.

Must be identified based on semantics of application.

Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R!

K is a candidate key for R means that K → R

However, K → R does not require K to be minimal!

Consider Enrolled(sid, cid, cname, semester)

(sid,cid) and (sid, cname) are candidate keys

(sid, cid, cname) is a superkey

Page 20: BİL 354 – Veritabanı Yönetim Sistemleri

Functional Dependency

At the conceptual level

FNO → FADI, FADRESİ

ÜKODU, FNO → SFİYATI

If FADI is unique in the real world:

FADI → FNO, FADRESİ

At the instance (example) level

FNO → FADI, FADRESİ

FADI → FNO, FADRESİ

ÜKODU, FNO → SFİYATI

FADRESİ → FNO, FADI

Page 21: BİL 354 – Veritabanı Yönetim Sistemleri

Example

TAŞIT (PLAKANO, MARKA, MODEL, YIL, AĞIRLIK, RENK)

The functional dependencies in this example are:

PLAKANO → MARKA, MODEL, YIL, AĞIRLIK, RENK

MARKA, MODEL → AĞIRLIK

Page 22: BİL 354 – Veritabanı Yönetim Sistemleri

Example

Functional dependencies of the relation instance R:

D → C

A, B → C, D

A, C → B, D

A, D → B, C

B, C → D

Page 23: BİL 354 – Veritabanı Yönetim Sistemleri

Types of Functional Dependencies 1/2

R (ÖNO, ÖADI, BNO, BADI, FAKNO, DKODU, DADI, KRD, NOTU)

Partial functional dependency. If X determines A and at least one proper subset of X also determines A (X → A and ∃ Z ⊂ X : Z → A), then X → A is called a partial functional dependency.

ÖNO, ÖADI → BNO

ÖNO, DKODU → KRD

Full functional dependency. If X determines A and no proper subset of X determines A (X → A and ∀ Z ⊂ X : Z ↛ A), then X → A is called a full functional dependency.

ÖNO → ÖADI, BNO

ÖNO, DKODU → NOTU

Page 24: BİL 354 – Veritabanı Yönetim Sistemleri

Types of Functional Dependencies 2/2

Trivial functional dependency. If X determines A and A is a subset of X (X → A and A ⊆ X), then X → A is called a trivial functional dependency.

ÖNO, ÖADI → ÖADI

BNO, BADI, FAKNO → BNO, FAKNO

Nontrivial functional dependency. If X determines A and A is not a subset of X (X → A and A ⊄ X), then X → A is called a nontrivial functional dependency.

ÖNO, ÖADI, DKODU → BNO, KRD

ÖNO, DKODU → NOTU

Transitive functional dependency. If X determines Y and Y determines Z (X → Y and Y → Z), then X → Z is called a transitive functional dependency.

ÖNO → FAKNO (ÖNO → BNO, BNO → FAKNO)

Page 25: BİL 354 – Veritabanı Yönetim Sistemleri

Armstrong’s Axioms (İ.B. Türetme Kuralları)

Reflexivity (dönüşlülük):

If X ⊆ Y, then Y → X

Augmentation (arttırma):

If X → Y, then XZ → Y for any Z (more generally, XZ → YZ)

Transitivity (geçişlilik):

If X → Y and Y → Z, then X → Z

Page 26: BİL 354 – Veritabanı Yönetim Sistemleri

Other Rules

Union (Birleşim):

If X → Y and X → Z, then X → YZ

Decomposition (Ayrışım):

If X → YZ, then X → Y and X → Z

Pseudo Transitivity (Sözde Geçişlilik):

If X → Y and YZ → W, then XZ → W

Accumulation rule:

If X → YZ and Z → W, then X → YZW

Page 27: BİL 354 – Veritabanı Yönetim Sistemleri

Example

R (A, B, C, D, E, G)

F: A → BD

BC → EG

D → E

Derived FDs:

DG → E (by the augmentation rule)

A → E (by the decomposition and transitivity rules)

A → ABD (by the reflexivity and union rules)

A → B and A → D (by the decomposition rule)

AC → EG (by the decomposition and pseudo-transitivity rules)

Page 28: BİL 354 – Veritabanı Yönetim Sistemleri

Closure of a Set of Functional Dependencies

Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.

E.g. If A → B and B → C, then we can infer that A → C

The set of all functional dependencies logically implied by F is the closure of F.

We denote the closure of F by F+.

We can find all of F+ by applying Armstrong's Axioms:

Page 29: BİL 354 – Veritabanı Yönetim Sistemleri

Computing F+

F+ = F;
loop {
    For each f in F+,
        apply the reflexivity and augmentation rules
        and add the new FDs to F+.
    For each pair of FDs in F+,
        apply the transitivity rule (where it applies)
        and add the new FDs to F+.
} until F+ does not change any more.

Page 30: BİL 354 – Veritabanı Yönetim Sistemleri

Redundancy Algorithm

To test whether an FD f : X → Y in F is redundant:

1. Initially set T = X

2. For each FD W → Z in "F – { f }":

if W ⊆ T then set T = T ∪ Z

3. Repeat step 2 as long as T changes

4. At the end, if Y ⊆ T then the FD (f : X → Y) is redundant in F.

• Right-hand sides are first decomposed into single attributes. • Every redundant f is removed from F.

Page 31: BİL 354 – Veritabanı Yönetim Sistemleri

Example 1/2

R (A, B, C, D, E, G)

F: A → BCDE

G → BD

BC → E

CG → A

BDE → ACG

F after decomposing the right-hand sides:

A → B     G → B      BDE → A
A → C     G → D      BDE → C
A → D     BC → E     BDE → G
A → E     CG → A

Page 32: BİL 354 – Veritabanı Yönetim Sistemleri

Example 2/2

Is A → B redundant in F?

T = {A, C, D, E} is obtained: no

Is A → C redundant in F?

T = {A, B, D, E, C, G} is obtained: yes

F1 = F – {A → C}

Is A → D redundant in F1?

T = {A, B, E} is obtained: no

Is A → E redundant in F1? T = {A, B, D} is obtained: no

Is G → B redundant in F1? T = {G, D} is obtained: no

Is G → D redundant in F1? T = {G, B} is obtained: no

Is BC → E redundant in F1? T = {B, C} is obtained: no

Is CG → A redundant in F1?

T = {C, G, B, D, E, A} is obtained: yes

F2 = F1 – {CG → A}
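The redundancy test and the first steps of this example can be reproduced with a short Python sketch. The helper names `fd` and `is_redundant` are hypothetical, and attributes are assumed to be single letters so that a string such as "BDE" can stand for a set of attributes.

```python
from typing import FrozenSet, List, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]  # (left side, right side)

def fd(lhs: str, rhs: str) -> FD:
    """Build an FD from strings of single-letter attribute names."""
    return frozenset(lhs), frozenset(rhs)

def is_redundant(f: FD, fds: List[FD]) -> bool:
    """f = X -> Y is redundant if Y can be reached from X using F - {f}."""
    x, y = f
    t = set(x)                            # step 1: T = X
    others = [g for g in fds if g != f]   # F - {f}
    changed = True
    while changed:                        # step 3: repeat while T changes
        changed = False
        for w, z in others:               # step 2: W -> Z with W inside T
            if w <= t and not z <= t:
                t |= z
                changed = True
    return y <= t                         # step 4

# F from the example, with right-hand sides already decomposed.
F = [fd("A", "B"), fd("A", "C"), fd("A", "D"), fd("A", "E"),
     fd("G", "B"), fd("G", "D"), fd("BC", "E"), fd("CG", "A"),
     fd("BDE", "A"), fd("BDE", "C"), fd("BDE", "G")]

print(is_redundant(fd("A", "B"), F))    # False
print(is_redundant(fd("A", "C"), F))    # True

F1 = [g for g in F if g != fd("A", "C")]
print(is_redundant(fd("CG", "A"), F1))  # True
```

As in the slides, each redundant FD is removed before the next one is tested, which is why the last check runs against F1 rather than F.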

Page 33: BİL 354 – Veritabanı Yönetim Sistemleri

Functional Dependencies

Computing the closure of a set of FDs can be expensive. (Size of closure is exponential in # attrs!)

Typically, we just want to check if a given FD X → Y is in the closure of a set of FDs F. An efficient check:

Compute the attribute closure of X (denoted X+) with respect to F:

Set of all attributes A such that X → A is in F+

There is a linear time algorithm to compute this.

Check if Y is in X+

Does F = {A → B, B → C, CD → E} imply A → E?

i.e., is A → E in the closure F+? Equivalently, is E in A+?

Page 34: BİL 354 – Veritabanı Yönetim Sistemleri

Computation of Attribute Closure X+ (with respect to F)

closure := X;        -- since X ⊆ X+
repeat until no change in closure:
    if there is an FD Z → V in F such that Z ⊆ closure
    then closure := closure ∪ V

If T ⊆ closure then X → T is entailed by F

Page 35: BİL 354 – Veritabanı Yönetim Sistemleri

Example

AB → C

AD → E

B → D

AF → B

{A, B}+ = {A, B, C, D, E}

{A, F}+ = {A, F, B, D, C, E}

Page 36: BİL 354 – Veritabanı Yönetim Sistemleri

Example

F: AB → C

A → D

D → E

AC → B

X      X+
A      {A, D, E}
AB     {A, B, C, D, E}   (hence AB is a key)
B      {B}
D      {D, E}

Is AB → E entailed by F? Yes

Is D → C entailed by F? No

Result: X+ allows us to determine the FDs of the form X → Y entailed by F
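The closure computation and the table above can be checked with a minimal Python sketch. The function name `attribute_closure` is hypothetical, attributes are assumed to be single letters written as strings, and this is the simple repeat-until-no-change loop, not the linear-time algorithm mentioned earlier.

```python
def attribute_closure(attrs, fds):
    """Return X+ for the attribute set `attrs` under the FDs in `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is an iterable of
    attribute names (a string works for single-letter attributes).
    """
    closure = set(attrs)          # X is always contained in X+
    changed = True
    while changed:                # repeat until no change in closure
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

F = [("AB", "C"), ("A", "D"), ("D", "E"), ("AC", "B")]

for x in ["A", "AB", "B", "D"]:
    print(x, sorted(attribute_closure(x, F)))

print("E" in attribute_closure("AB", F))  # True: AB -> E is entailed by F
print("C" in attribute_closure("D", F))   # False: D -> C is not entailed
```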

Page 37: BİL 354 – Veritabanı Yönetim Sistemleri

What is the attribute closure good for?

Test if X is a superkey

compute X+, and check if X+ contains all attrs of R

Check if X → Y holds

by checking if Y is contained in X+

Another (not so clever) way to compute the closure S+ of a set S of FDs

for each subset of attributes X in relation R, compute X+ with respect to S

for each subset of attributes Y in X+, output the FD X → Y
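Both uses can be sketched on top of an attribute-closure helper. The names below (`is_superkey`, `fd_closure`, `nonempty_subsets`) and the tiny schema R(A, B, C) with S = {A → B, B → C} are hypothetical, chosen only to keep the output small; the enumeration is exponential in the number of attributes, which is why the slide calls it not so clever.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    """X+ under `fds`, a list of (lhs, rhs) pairs of attribute collections."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def nonempty_subsets(attrs):
    """Yield every nonempty subset of `attrs` as a frozenset."""
    attrs = sorted(attrs)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            yield frozenset(combo)

def is_superkey(x, schema, fds):
    """X is a superkey if X+ contains all attributes of the relation."""
    return attribute_closure(x, fds) >= set(schema)

def fd_closure(schema, fds):
    """Enumerate S+ as (X, Y) pairs: every nonempty Y inside X+, for every X."""
    return {(x, y)
            for x in nonempty_subsets(schema)
            for y in nonempty_subsets(attribute_closure(x, fds))}

schema = "ABC"                      # hypothetical relation R(A, B, C)
S = [("A", "B"), ("B", "C")]

print(is_superkey("A", schema, S))  # True
print(is_superkey("B", schema, S))  # False
print((frozenset("A"), frozenset("C")) in fd_closure(schema, S))  # True
```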

Page 38: BİL 354 – Veritabanı Yönetim Sistemleri

Normal Forms (Normal Biçimler)

Returning to the issue of schema refinement, the first question to ask is whether any refinement is needed!

If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/minimized. This can be used to help us decide whether decomposing the relation will help.

Page 39: BİL 354 – Veritabanı Yönetim Sistemleri

Normal Forms

First Normal Form (1NF): A relation is in 1NF if every field contains only atomic values. This requirement is implicit in our definition of the relational model.

Second Normal Form (2NF): (obsolete) each nonkey attribute in the relation must be fully functionally dependent on the primary key.

Third Normal Form (3NF)

Boyce-Codd Normal Form (BCNF)

Fourth Normal Form (4NF) (for the interested)

Increasing requirements

Page 40: BİL 354 – Veritabanı Yönetim Sistemleri

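1NF

ÖĞRENCİ (ÖNO, ÖADI, DERS (DADI, NOTU)) is not in 1NF, because DERS is a nested (non-atomic) attribute.

ÖĞRDERS (ÖNO, ÖADI, DADI, NOTU) is the flattened version, which is in 1NF.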

Page 41: BİL 354 – Veritabanı Yönetim Sistemleri

2NF

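Prime attribute: an attribute that appears in some key of the relation.

Non-prime attribute: an attribute that appears in no key.

A relation is in 2NF if it is in 1NF and no non-prime attribute is partially functionally dependent on any key (i.e., every non-prime attribute is fully dependent on every key).

SATICI (ÜKODU, FNO, FADI, FADRESİ, SFİYATI)

FNO → FADI, FADRESİ

With (ÜKODU, FNO) as the key, FADI and FADRESİ depend on only part of the key, so SATICI is not in 2NF.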

Page 42: BİL 354 – Veritabanı Yönetim Sistemleri

3NF

Reln R with FDs F is in 3NF if, for all X → A in F+:

A ∈ X (called a trivial FD), or

X is a superkey, or

A is part of some key for R.

Minimality of a key is crucial in the third condition above!

If R is in BCNF, it is obviously in 3NF.

If R is in 3NF, some redundancy is possible.

A lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations is always possible.

Page 43: BİL 354 – Veritabanı Yönetim Sistemleri

BCNF (Boyce Codd Normal Form)

Reln R with FDs F is in BCNF if, for all X → A in F+:

A ∈ X (called a trivial FD), or

X contains a key for R. (X is a superkey)

In other words, R is in BCNF if the only non-trivial FDs that hold over R are key constraints.

If the example relation is in BCNF, the 2 tuples must be identical (since X is a key).
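The 3NF and BCNF conditions can be compared side by side with a brute-force Python sketch. All function names are hypothetical. Two assumptions are made: candidate keys are found by trying every attribute subset (fine only for small schemas), and only the FDs given in F are checked rather than all of F+, which is known to be sufficient when testing the relation on which F was stated. The relation used is CSJDPQV with C as a key, JP → C and SD → P, taken from the dependency-preservation slide of this deck.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    """X+ under `fds`, a list of (lhs, rhs) pairs of attribute collections."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):       # smallest sets first
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue                          # contains a key: not minimal
            if attribute_closure(x, fds) >= schema:
                keys.append(x)
    return keys

def is_bcnf(schema, fds):
    """Every given FD is trivial or has a superkey on its left side."""
    schema = set(schema)
    return all(set(rhs) <= set(lhs) or attribute_closure(lhs, fds) >= schema
               for lhs, rhs in fds)

def is_3nf(schema, fds):
    """As BCNF, but an FD may also pass if its new attributes are all prime."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if attribute_closure(lhs, fds) >= schema:
            continue                              # left side is a superkey
        if not (set(rhs) - set(lhs)) <= prime:
            return False                          # a non-prime attribute is determined
    return True

schema = "CSJDPQV"
F = [("C", "SJDPQV"), ("JP", "C"), ("SD", "P")]

print([sorted(k) for k in candidate_keys(schema, F)])
print(is_bcnf(schema, F))  # False: SD -> P has a non-superkey left side
print(is_3nf(schema, F))   # True: P is part of the key JP
```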

Page 44: BİL 354 – Veritabanı Yönetim Sistemleri

BCNF doesn't always have a dependency-preserving decomposition.

Third normal form may be preferable to having to take a join to check dependencies after an update.

Page 45: BİL 354 – Veritabanı Yönetim Sistemleri

Lossless-Join (Yitimsiz Birleştirme)

A decomposition should not lose information

Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if, for every instance r that satisfies F:

π_X(r) ⋈ π_Y(r) = r

It is always true that r ⊆ π_X(r) ⋈ π_Y(r)

In general, the other direction does not hold! If it does, the decomposition is lossless-join.

Page 46: BİL 354 – Veritabanı Yönetim Sistemleri

Lossy Decomposition

The following is always the case:

r ⊆ r1 ⋈ r2 ⋈ ... ⋈ rn

But the following is not always true:

r ⊇ r1 ⋈ r2 ⋈ ... ⋈ rn

Example: r ⊉ r1 ⋈ r2

r                             r1               r2
SSN    Name    Address        SSN    Name      Name    Address
1111   Joe     1 Pine         1111   Joe       Joe     1 Pine
2222   Alice   2 Oak          2222   Alice     Alice   2 Oak
3333   Alice   3 Pine         3333   Alice     Alice   3 Pine

The tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) are in the join, but not in the original

Page 47: BİL 354 – Veritabanı Yönetim Sistemleri

Lossy Decompositions: What is Actually Lost?

In the previous example, the tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) were gained, not lost!

Why do we say that the decomposition was lossy?

What was lost is information:

That 2222 lives at 2 Oak: In the decomposition, 2222 can live at either 2 Oak or 3 Pine

That 3333 lives at 3 Pine: In the decomposition, 3333 can live at either 2 Oak or 3 Pine
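The spurious tuples can be reproduced by projecting the example instance and joining the projections back. This is a small sketch with hypothetical helper names (`project`, `natural_join`); relations are modelled as Python sets of tuples with an explicit list of column names.

```python
def project(rows, schema, attrs):
    """Projection of `rows` (tuples laid out as `schema`) onto `attrs`."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(r1, schema1, r2, schema2):
    """Natural join; result columns are schema1 followed by the new columns of schema2."""
    common = [a for a in schema1 if a in schema2]
    extra = [a for a in schema2 if a not in schema1]
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[schema1.index(a)] == t2[schema2.index(a)] for a in common):
                out.add(t1 + tuple(t2[schema2.index(a)] for a in extra))
    return out

schema = ("SSN", "Name", "Address")
r = {(1111, "Joe", "1 Pine"), (2222, "Alice", "2 Oak"), (3333, "Alice", "3 Pine")}

r1 = project(r, schema, ("SSN", "Name"))
r2 = project(r, schema, ("Name", "Address"))
joined = natural_join(r1, ("SSN", "Name"), r2, ("Name", "Address"))

print(r <= joined)         # True: the original tuples always survive
print(sorted(joined - r))  # [(2222, 'Alice', '3 Pine'), (3333, 'Alice', '2 Oak')]
```

The join is larger than r, so from r1 and r2 alone it is no longer possible to tell which Alice lives at which address.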

Page 48: BİL 354 – Veritabanı Yönetim Sistemleri

Is this decomposition lossless?

Tournament            Year  Winner          Winner Date of Birth
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Cleveland Open        1999  Bob Albertson   28 September 1968
Des Moines Masters    1999  Al Fredrickson  21 July 1975
Indiana Invitational  1999  Chip Masterson  14 March 1977

Decomposed into:

Tournament            Year  Winner
Indiana Invitational  1998  Al Fredrickson
Cleveland Open        1999  Bob Albertson
Des Moines Masters    1999  Al Fredrickson
Indiana Invitational  1999  Chip Masterson

Winner          Winner Date of Birth
Al Fredrickson  21 July 1975
Bob Albertson   28 September 1968
Al Fredrickson  21 July 1975
Chip Masterson  14 March 1977

Page 49: BİL 354 – Veritabanı Yönetim Sistemleri

Is this decomposition lossless?

StudentId  CourseId  Dept  Grade
12         CS101     IE    B-
12         CS102     IE    C+
12         IE201     IE    C-
7          CS201     CS    A
7          CS352     CS    D

Decomposed into:

StudentId  CourseId  Grade
12         CS101     B-
12         CS102     C+
12         IE201     C-
7          CS201     A
7          CS352     D

StudentId  Dept
12         IE
7          CS

Page 50: BİL 354 – Veritabanı Yönetim Sistemleri

Testing Decompositions for Losslessness
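The body of this slide did not survive in the transcript, so the sketch below is not the slide's own procedure. It shows the standard textbook test for a decomposition into two relations: R1 and R2 form a lossless-join decomposition with respect to F if the common attributes R1 ∩ R2 functionally determine all of R1 or all of R2. The function names are hypothetical, and the FDs assumed for the three example relations are my reading of the data, not something the slides state.

```python
def attribute_closure(attrs, fds):
    """X+ under `fds`, a list of (lhs, rhs) pairs of attribute sets."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def is_lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2)+ covers R1 or covers R2."""
    common = set(r1) & set(r2)
    closure = attribute_closure(common, fds)
    return set(r1) <= closure or set(r2) <= closure

# Assumed FD for the SSN example: SSN -> Name, Address.
fds_person = [({"SSN"}, {"Name", "Address"})]
print(is_lossless_binary({"SSN", "Name"}, {"Name", "Address"}, fds_person))  # False

# Assumed FDs for the tournament example.
fds_tournament = [({"Tournament", "Year"}, {"Winner"}),
                  ({"Winner"}, {"Winner Date of Birth"})]
print(is_lossless_binary({"Tournament", "Year", "Winner"},
                         {"Winner", "Winner Date of Birth"}, fds_tournament))  # True

# Assumed FDs for the student example.
fds_student = [({"StudentId"}, {"Dept"}),
               ({"StudentId", "CourseId"}, {"Grade"})]
print(is_lossless_binary({"StudentId", "CourseId", "Grade"},
                         {"StudentId", "Dept"}, fds_student))  # True
```

Under these assumed FDs the two "Is this decomposition lossless?" examples pass the test, while the SSN/Name/Address split fails it because Name alone determines nothing.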

Page 51: BİL 354 – Veritabanı Yönetim Sistemleri

Dependency Preserving Decomposition

(İşlevsel Bağımlılıkların Korunması)

Consider CSJDPQV, C is key, JP → C and SD → P.

BCNF decomposition: CSJDQV and SDP

Problem: Checking JP → C requires a join!

Dependency preserving decomposition (Intuitive):

If R is decomposed into X, Y and Z, and we enforce the FDs that hold on X, on Y and on Z, then all FDs that were given to hold on R must also hold.

Page 52: BİL 354 – Veritabanı Yönetim Sistemleri

Dependency Preserving Decompositions

Decomposition of R into X and Y is dependency preserving if (F_X ∪ F_Y)+ = F+, where F_X is the set of FDs in F+ that involve only attributes of X (and similarly for F_Y).

ABC, A → B, B → C, C → A, decomposed into AB and BC.

Is this dependency preserving? Is C → A preserved?

Dependency preserving does not imply lossless join:

ABC, A → B, decomposed into AB and BC.

And vice-versa!
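The condition (F_X ∪ F_Y)+ = F+ can be tested without materialising either closure. The sketch below, with hypothetical function names, projects F onto each part by computing attribute closures of every subset of that part (exponential, so only suitable for small schemas) and then checks that every FD of F follows from the projected FDs. Attributes are assumed to be single letters.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    """X+ under `fds`, a list of (lhs, rhs) pairs of attribute collections."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def project_fds(fds, attrs):
    """FDs of F+ that mention only `attrs`, one FD per left-hand side."""
    attrs = set(attrs)
    projected = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            x = set(combo)
            rhs = (attribute_closure(x, fds) & attrs) - x
            if rhs:
                projected.append((x, rhs))
    return projected

def preserves_dependencies(fds, decomposition):
    """True if every FD in F is implied by the union of the projected FD sets."""
    local = [g for part in decomposition for g in project_fds(fds, part)]
    return all(set(rhs) <= attribute_closure(lhs, local) for lhs, rhs in fds)

F = [("A", "B"), ("B", "C"), ("C", "A")]
print(preserves_dependencies(F, ["AB", "BC"]))               # True

contracts = [("C", "SJDPQV"), ("JP", "C"), ("SD", "P")]
print(preserves_dependencies(contracts, ["CSJDQV", "SDP"]))  # False
```

In the first case C → A is preserved because C → B holds on BC and B → A holds on AB; in the second, JP → C cannot be recovered from FDs local to CSJDQV and SDP, which matches the "requires a join" remark on the previous slide.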

Page 53: BİL 354 – Veritabanı Yönetim Sistemleri

Dependency Preservation Algorithm

Page 54: BİL 354 – Veritabanı Yönetim Sistemleri

Decomposition into BCNF

Input: relation R, set S of FDs over R.

Output: a set of relations in BCNF.

1. Compute S+.

2. Compute keys for R (from ER or from S+).

3. Use S+ and keys to check if R is in BCNF. If not:

a. Pick a violation FD A → B.

b. Expand B as much as possible, by computing A+.

c. Create R1 = A ∪ B, and R2 = R − B (keeping A in R2; equivalently R2 = R − (A+ − A)).

d. Find the FDs over R1, using S+. Repeat for R2.

e. Recurse on R1 & its set of FDs. Repeat for R2.

4. Else R is already in BCNF; add R to the output.

Page 55: BİL 354 – Veritabanı Yönetim Sistemleri

Decomposition into BCNF

Given: relation R with FD's F.

Look among the given FD's for a BCNF violation X → B.

Compute X+.

Replace R by relations with schemas:

1. R1 = X+.

2. R2 = R – (X+ – X).

Continue with the two new relations.
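The decomposition step can be sketched recursively in Python. The names `find_violation` and `bcnf_decompose` are hypothetical. One deliberate difference from the slide: instead of looking only among the given FD's, the sketch searches every attribute subset X of the current relation and treats X as a violation when X+ (restricted to that relation) is neither X itself nor the whole relation. That also catches violations in the smaller relations, at exponential cost. The input is the Drinkers relation from the example slides of this deck.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    """X+ under `fds`, a list of (lhs, rhs) pairs of attribute sets."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def find_violation(rel, fds):
    """Return (X, X+ restricted to rel) for some BCNF violation, or None."""
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = frozenset(combo)
            x_plus = frozenset(attribute_closure(x, fds)) & rel
            if x_plus != x and x_plus != rel:
                return x, x_plus   # non-trivial FD whose left side is not a superkey
    return None

def bcnf_decompose(rel, fds):
    """Split `rel` until no relation has a BCNF violation."""
    rel = frozenset(rel)
    violation = find_violation(rel, fds)
    if violation is None:
        return [rel]                # already in BCNF
    x, x_plus = violation
    r1 = x_plus                     # R1 = X+
    r2 = rel - (x_plus - x)         # R2 = R - (X+ - X)
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)

drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
F = [({"name"}, {"addr"}), ({"name"}, {"favBeer"}), ({"beersLiked"}, {"manf"})]

for schema in bcnf_decompose(drinkers, F):
    print(sorted(schema))
```

The sketch picks violations in a different order than the slides do, and in general the result of a BCNF decomposition depends on that order, but for this input it ends with the same three schemas: (name, addr, favBeer), (beersLiked, manf) and (name, beersLiked).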

Page 56: BİL 354 – Veritabanı Yönetim Sistemleri

Example

Drinkers(name, addr, beersLiked, manf, favBeer)

F = {name → addr, name → favBeer, beersLiked → manf}

Pick BCNF violation name → addr.

Close the left side: {name}+ = {name, addr, favBeer}.

Decomposed relations:

Drinkers1(name, addr, favBeer)

Drinkers2(name, beersLiked, manf)

Page 57: BİL 354 – Veritabanı Yönetim Sistemleri

Example, Continued

We are not done; we need to check Drinkers1 and Drinkers2 for BCNF.

For Drinkers1(name, addr, favBeer), the relevant FD's are name → addr and name → favBeer. Thus, {name} is the only key and Drinkers1 is in BCNF.

Page 58: BİL 354 – Veritabanı Yönetim Sistemleri

Example, Continued

For Drinkers2(name, beersLiked, manf), the only FD is beersLiked → manf, and the only key is {name, beersLiked}. Violation of BCNF.

{beersLiked}+ = {beersLiked, manf}, so we decompose Drinkers2 into:

1. Drinkers3(beersLiked, manf)

2. Drinkers4(name, beersLiked)

Page 59: BİL 354 – Veritabanı Yönetim Sistemleri


Example, Concluded

The resulting decomposition of Drinkers :

1. Drinkers1(name, addr, favBeer)

2. Drinkers3(beersLiked, manf)

3. Drinkers4(name, beersLiked)

Notice: Drinkers1 tells us about drinkers, Drinkers3 tells us about beers, and Drinkers4 tells us the relationship between drinkers and the beers they like.

Page 60: BİL 354 – Veritabanı Yönetim Sistemleri

Decomposition into 3NF

Reading…