BİL 354 – Database Management Systems

Relational Database Design (İlişkisel Veri Tabanı Tasarımı)

Integrity Constraints (Bütünlük Kısıtlamaları)

Domain Constraints (Alan Kısıtlamaları)

Referential Integrity (Referans Kısıtlamaları)

Others (dependencies among attributes)

Functional Dependencies (İşlevsel Bağımlılıklar)

Redundancy

Redundancy is at the root of several problems associated with relational schemas:

Redundant storage: Some information is stored repeatedly.

Update anomalies: If one copy of such repeated data is updated, an inconsistency is created unless all copies are similarly updated.

Delete anomalies: It may not be possible to delete some information without losing some other information as well.

Insert anomalies: It may not be possible to store some information unless some other information is stored as well.

Update Anomaly

Insertion Anomaly

Deletion Anomaly

How about null values?

Hourly_Emps(ssn, name, lot, rating, hourly_wages, hours_worked)

Null values cannot eliminate redundant storage or update anomalies!

However they can address insertion and deletion anomalies to some extent.

Deletion: can we keep a tuple in which rating and hourly_wages are non-null but ssn is null?

Use of Decompositions

Hourly_Emps2(ssn, name, lot, rating, hours_worked)

Wages(rating, hourly_wages)

Decomposition (Ayrıştırma)

Decomposition should be used carefully, or it may create more problems.

Two questions that should be asked repeatedly:

Is there reason to decompose a relation? Normal forms have been proposed to address this question.

What problems (if any) does the decomposition cause? Two properties are of interest: lossless join and dependency preservation.

Decomposition (Ayrıştırma)

Lossless-join

Enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations.

Dependency-preservation

o Enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.

o Thus, we need not perform joins of the smaller relations to check whether a constraint on the original relation is violated.

Example

Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)

Example : Decomposition

Decompose the relation schema Lending-schema into:

branch-customer = (branch-name, branch-city, assets, customer-name)

customer-loan = (customer-name, loan-number, branch-name, amount)

All attributes of the original schema R must appear in the decomposition (R1, R2):

R = R1 ∪ R2

Lossless-join decomposition: for all possible relations r on schema R,

r = π_R1(r) ⋈ π_R2(r)

Example : Decomposition

Lossy-join Decomposition

The relation branch-customer ⋈ customer-loan

Desirable Properties of Decomposition

It may become necessary to decompose a relation into several smaller relations.

Lossless-Join

Dependency Preservation

Non-Repetition of Information

Using functional dependencies we can define normal forms that represent «good» db design.

Desirable Properties of Decomposition 2

Lossless-Join: enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations.

Dependency Preservation: enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.

We need not perform joins of the smaller relations to check whether a constraint on the relation is violated.

Non-Repetition of Information

Functional Dependencies (İşlevsel Bağımlılık)

A functional dependency (FD) X → Y holds over relation R if, for every allowable instance r of R:

t1 ∈ r, t2 ∈ r, π_X(t1) = π_X(t2) implies π_Y(t1) = π_Y(t2)

i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)

X → Y iff each X value is associated with precisely one Y value. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding values of the attributes in Y.

Functional Dependencies

A functional dependency (FD) is a kind of IC that generalizes the concept of a key.

Example:

StudentId → Name

but not Name → StudentId

If X → Y then

if X is updated, Y must also be updated


An FD is a statement about all allowable relations.

Must be identified based on semantics of application.

Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R!

If K is a candidate key for R, then K → R.

However, K → R alone does not require K to be minimal; it only says that K is a superkey.

Consider Enrolled(sid, cid, cname, semester)

(sid,cid) and (sid, cname) are candidate keys

(sid, cid, cname) is a superkey

Functional Dependency

At the conceptual level:

FNO → FADI, FADRESİ

ÜKODU, FNO → SFİYATI

If FADI is unique in the real world:

FADI → FNO, FADRESİ

At the instance (example) level:

FNO → FADI, FADRESİ

FADI → FNO, FADRESİ

ÜKODU, FNO → SFİYATI

FADRESİ → FNO, FADI

Example

TAŞIT (PLAKANO, MARKA, MODEL, YIL, AĞIRLIK, RENK), i.e. VEHICLE (plate number, make, model, year, weight, color)

The functional dependencies in this example are:

PLAKANO → MARKA, MODEL, YIL, AĞIRLIK, RENK

MARKA, MODEL → AĞIRLIK

Example

Functional dependencies of the relation instance R:

D → C

A, B → C, D

A, C → B, D

A, D → B, C

B, C → D

Types of Functional Dependencies 1/2

R (ÖNO, ÖADI, BNO, BADI, FAKNO, DKODU, DADI, KRD, NOTU)

Partial functional dependency. If X determines A and at least one proper subset of X also determines A (X → A and ∃ Z ⊂ X : Z → A), the functional dependency X → A is called a partial functional dependency.

ÖNO, ÖADI → BNO

ÖNO, DKODU → KRD

Full functional dependency. If X determines A and no proper subset of X determines A (X → A and ∄ Z ⊂ X : Z → A), the functional dependency X → A is called a full functional dependency.

ÖNO → ÖADI, BNO

ÖNO, DKODU → NOTU

Types of Functional Dependencies 2/2

Trivial functional dependency. If X determines A and A is a subset of X (X → A and A ⊆ X), the functional dependency X → A is called a trivial functional dependency.

ÖNO, ÖADI → ÖADI

BNO, BADI, FAKNO → BNO, FAKNO

Non-trivial functional dependency. If X determines A and A is not a subset of X (X → A and A ⊈ X), the functional dependency X → A is called a non-trivial functional dependency.

ÖNO, ÖADI, DKODU → BNO, KRD

ÖNO, DKODU → NOTU

Transitive functional dependency. If X determines Y and Y determines Z (X → Y and Y → Z), the functional dependency X → Z is called a transitive functional dependency.

ÖNO → FAKNO (ÖNO → BNO, BNO → FAKNO)

Armstrong’s Axioms (İ.B. Türetme Kuralları)

Reflexivity (dönüşlülük):

If Y ⊆ X, then X → Y

Augmentation (arttırma):

If X → Y, then XZ → YZ for any Z (and hence also XZ → Y)

Transitivity (geçişlilik):

If X → Y and Y → Z, then X → Z

Other Rules

Union (Birleşim):

If X → Y and X → Z, then X → YZ

Decomposition (Ayrışım):

If X → YZ, then X → Y and X → Z

Pseudo Transitivity (Sözde Geçişlilik):

If X → Y and YZ → W, then XZ → W

Accumulation rule:

If X → YZ and Z → W, then X → YZW
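As an illustration of how the additional rules follow from the three axioms, the union rule can be derived as sketched below. The sketch assumes augmentation in its standard form (from X → Y infer XZ → YZ); the step numbers are only for this illustration.

```latex
\begin{align*}
1.\;& X \to Y   && \text{(given)}\\
2.\;& X \to Z   && \text{(given)}\\
3.\;& X \to XY  && \text{(augmentation of 1 with } X\text{)}\\
4.\;& XY \to YZ && \text{(augmentation of 2 with } Y\text{)}\\
5.\;& X \to YZ  && \text{(transitivity on 3 and 4)}
\end{align*}
```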

Example

R (A, B, C, D, E, G)

F: A → BD

BC → EG

D → E

Some FDs that can be derived from F:

DG → E (by augmentation)

A → E (by decomposition and transitivity)

A → ABD (by reflexivity and union)

A → B and A → D (by decomposition)

AC → EG (by decomposition and pseudo-transitivity)

Closure of a Set of Functional Dependencies

Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.

E.g. if A → B and B → C, then we can infer that A → C.

The set of all functional dependencies logically implied by F is the closure of F.

We denote the closure of F by F+.

We can find all of F+ by applying Armstrong’s Axioms:

Computing F+

F+ = F;

loop {

    for each FD f in F+,
        apply the reflexivity and augmentation rules to f
        and add the new FDs to F+.

    for each pair of FDs f1, f2 in F+,
        if f1 and f2 can be combined by the transitivity rule,
        add the new FD to F+.

} until F+ does not change any more.

Redundancy Algorithm

To test whether an FD f : X → Y is redundant in F:

1. Initially set T = X.

2. For each functional dependency W → Z in F – { f }: if W ⊆ T, set T = T ∪ Z.

3. Repeat step 2 as long as T changes.

4. If Y ⊆ T at the end, the functional dependency f : X → Y is redundant in F.

• The FDs are first decomposed (one attribute on each right-hand side). • Every redundant f is removed from F.

Example 1/2

R (A, B, C, D, E, G)

F: A → BCDE

G → BD

BC → E

CG → A

BDE → ACG

After decomposition:

F: A → B, A → C, A → D, A → E

G → B, G → D

BC → E

CG → A

BDE → A, BDE → C, BDE → G

Example 2/2

Is A → B redundant in F? T = {A, C, D, E} is obtained: no.

Is A → C redundant in F? T = {A, B, D, E, C, G} is obtained: yes.

F1 = F – {A → C}

Is A → D redundant in F1? T = {A, B, E} is obtained: no.

Is A → E redundant in F1? T = {A, B, D} is obtained: no.

Is G → B redundant in F1? T = {G, D} is obtained: no.

Is G → D redundant in F1? T = {G, B} is obtained: no.

Is BC → E redundant in F1? T = {B, C} is obtained: no.

Is CG → A redundant in F1? T = {C, G, B, D, E, A} is obtained: yes.

F2 = F1 – {CG → A}
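The redundancy test can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `is_redundant` and the encoding of an FD as a pair of strings (one character per attribute) are assumptions made for this sketch. The data is the decomposed FD set from the example.

```python
def is_redundant(fds, f):
    """Return True if FD f = (X, Y) is implied by the other FDs in fds."""
    x, y = f
    others = [g for g in fds if g != f]      # F - {f}
    t = set(x)                               # step 1: T = X
    changed = True
    while changed:                           # step 3: repeat while T changes
        changed = False
        for w, z in others:                  # step 2: each W -> Z in F - {f}
            if set(w) <= t and not set(z) <= t:
                t |= set(z)
                changed = True
    return set(y) <= t                       # step 4: redundant iff Y is in T


# Decomposed FD set of the example, as (left side, right side) strings.
F = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
     ("G", "B"), ("G", "D"), ("BC", "E"), ("CG", "A"),
     ("BDE", "A"), ("BDE", "C"), ("BDE", "G")]

print(is_redundant(F, ("A", "B")))     # False
print(is_redundant(F, ("A", "C")))     # True

F1 = [g for g in F if g != ("A", "C")]
print(is_redundant(F1, ("A", "D")))    # False
print(is_redundant(F1, ("CG", "A")))   # True
```

Because removing one FD can change whether another is redundant, the order in which FDs are tested matters; the sketch tests one FD at a time and leaves the removal to the caller.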


Computing the closure of a set of FDs can be expensive. (Size of closure is exponential in # attrs!)

Typically, we just want to check if a given FD X → Y is in the closure of a set of FDs F. An efficient check:

Compute the attribute closure of X (denoted X+) with respect to F:

Set of all attributes A such that X → A is in F+

There is a linear time algorithm to compute this.

Check if Y is in X+

Does F = {A → B, B → C, C D → E} imply A → E?

i.e., is A → E in the closure F+? Equivalently, is E in A+?

Computation of Attribute Closure X+F

closure := X; -- since X ⊆ X+F

repeat until no change in closure:

    if there is an FD Z → V in F such that Z ⊆ closure

    then closure := closure ∪ V

If T ⊆ closure then X → T is entailed by F
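The closure computation above might look like this in Python. It is a sketch under stated assumptions: the name `attribute_closure` and the representation of FDs as pairs of strings (one character per attribute) are choices made here for illustration. The simple loop below is not the linear-time algorithm; it rescans all FDs until nothing changes.

```python
def attribute_closure(attrs, fds):
    """Compute the closure of the attribute set attrs under the FDs in fds."""
    closure = set(attrs)                     # closure := X
    changed = True
    while changed:                           # repeat until no change
        changed = False
        for lhs, rhs in fds:                 # FD Z -> V
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)          # closure := closure ∪ V
                changed = True
    return closure


# Does F = {A -> B, B -> C, CD -> E} imply A -> E?
F = [("A", "B"), ("B", "C"), ("CD", "E")]
a_plus = attribute_closure("A", F)
print(sorted(a_plus))    # ['A', 'B', 'C']
print("E" in a_plus)     # False, so A -> E is not implied by F
```

Here E is never reached because CD → E needs D, and D is not in the closure of A.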

Example

AB → C

AD → E

B → D

AF → B

{A, B}+ = {A, B, C, D, E}

{A, F}+ = {A, F, B, D, C, E}

Example

F: AB → C

A → D

D → E

AC → B

X : X+F

A : {A, D, E}

AB : {A, B, C, D, E} (hence AB is a key)

B : {B}

D : {D, E}

Is AB → E entailed by F? Yes

Is D → C entailed by F? No

Result: X+F allows us to determine the FDs of the form X → Y entailed by F


What is the attribute closure good for?

Test if X is a superkey

compute X+, and check if X+ contains all attrs of R

Check if X → Y holds

by checking if Y is contained in X+

Another (not so clever) way to compute the closure S+ of a set S of FDs:

for each subset of attributes X in relation R, compute X+ with respect to S

for each subset of attributes Y in X+, output the FD X → Y
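The superkey test can be sketched in Python, together with a brute-force search for candidate keys that enumerates attribute subsets in the same "not so clever" spirit. The names `is_superkey` and `candidate_keys` and the string encoding of attribute sets are assumptions of this sketch. The relation is assumed to be R(A, B, C, D, E) with F = {AB → C, A → D, D → E, AC → B}, as in the earlier closure table. The search is exponential in the number of attributes.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Closure of attrs under fds (FDs are pairs of attribute strings)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_superkey(attrs, schema, fds):
    """X is a superkey iff X+ contains all attributes of the schema."""
    return attribute_closure(attrs, fds) >= set(schema)


def candidate_keys(schema, fds):
    """Enumerate subsets by increasing size; keep the minimal superkeys."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue                     # contains a smaller key: not minimal
            if is_superkey(x, schema, fds):
                keys.append(x)
    return keys


R = "ABCDE"
F = [("AB", "C"), ("A", "D"), ("D", "E"), ("AC", "B")]
print(is_superkey("AB", R, F))                                    # True
print(is_superkey("A", R, F))                                     # False
print(sorted("".join(sorted(k)) for k in candidate_keys(R, F)))   # ['AB', 'AC']
```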


Normal Forms (Normal Biçimler)

Returning to the issue of schema refinement, the first question to ask is whether any refinement is needed!

If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/minimized. This can be used to help us decide whether decomposing the relation will help.


Normal Forms

First Normal Form (1NF): A relation is in 1NF if every field contains only atomic values. This requirement is implicit in our definition of the relational model.

Second Normal Form (2NF): (obsolete) each nonkey attribute in the relation must be fully functionally dependent on every candidate key, i.e. no nonkey attribute depends on only part of a key.

Third Normal Form (3NF)

Boyce-Codd Normal Form (BCNF)

Fourth Normal Form (4NF) (for the interested)

Increasing requirements

http://databases.about.com/library/glossary/bldef-attribute.htm

http://databases.about.com/library/glossary/bldef-primarykey.htm

1NF

ÖĞRENCİ (ÖNO, ÖADI, DERS (DADI, NOTU)) is not in 1NF, because DERS is a nested (non-atomic) attribute.

ÖĞRDERS (ÖNO, ÖADI, DADI, NOTU) is in 1NF.

2NF

Prime attribute: an attribute that appears in some key of the relation.

Non-prime attribute: an attribute that appears in no key.

A relation is in 2NF if

it is in 1NF, and

no non-prime attribute is partially functionally dependent on any key (every non-prime attribute is fully dependent on each key).

SATICI (ÜKODU, FNO, FADI, FADRESİ, SFİYATI)

FNO → FADI, FADRESİ

3NF

Reln R with FDs F is in 3NF if, for all X → A in F+:

A ∈ X (called a trivial FD), or

X is a superkey, or

A is part of some key for R.

Minimality of a key is crucial in the third condition above!

If R is in BCNF, it is obviously in 3NF.

If R is in 3NF, some redundancy is possible.

A lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations is always possible.

BCNF (Boyce Codd Normal Form)

Reln R with FDs F is in BCNF if, for all X → A in F+:

A ∈ X (called a trivial FD), or

X contains a key for R. (X is a superkey)

In other words, R is in BCNF if the only non-trivial FDs that hold over R are key constraints.

If example relation is in BCNF, the 2 tuples must be identical (since X is a key).

BCNF doesn’t always have a dependency-preserving decomposition.

Third normal form may be preferable to having to take a join to check dependencies after an update.

Lossless-Join (Yitimsiz Birleştirme)

A decomposition should not lose information

Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if, for every instance r that satisfies F:

π_X(r) ⋈ π_Y(r) = r

It is always true that r ⊆ π_X(r) ⋈ π_Y(r)

In general, the other direction does not hold! If it does, the decomposition is lossless-join.

Lossy Decomposition

The following is always the case:

r ⊆ r1 ⋈ r2 ⋈ ... ⋈ rn

But the following is not always true:

r ⊇ r1 ⋈ r2 ⋈ ... ⋈ rn

Example: r ⊉ r1 ⋈ r2

r:
SSN   Name   Address
1111  Joe    1 Pine
2222  Alice  2 Oak
3333  Alice  3 Pine

r1:
SSN   Name
1111  Joe
2222  Alice
3333  Alice

r2:
Name   Address
Joe    1 Pine
Alice  2 Oak
Alice  3 Pine

The tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) are in the join, but not in the original.

Lossy Decompositions: What is Actually Lost?

In the previous example, the tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) were gained, not lost!

Why do we say that the decomposition was lossy?

What was lost is information:

That 2222 lives at 2 Oak: In the decomposition, 2222 can live at either 2 Oak or 3 Pine

That 3333 lives at 3 Pine: In the decomposition, 3333 can live at either 2 Oak or 3 Pine
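The effect can be reproduced with a few lines of Python. This is only a sketch: relations are modelled as sets of tuples, and the variable names are chosen here for illustration.

```python
# Original relation r(SSN, Name, Address) from the example.
r = {
    (1111, "Joe", "1 Pine"),
    (2222, "Alice", "2 Oak"),
    (3333, "Alice", "3 Pine"),
}

# Projections onto (SSN, Name) and (Name, Address).
r1 = {(ssn, name) for ssn, name, _ in r}
r2 = {(name, addr) for _, name, addr in r}

# Natural join of r1 and r2 on the shared attribute Name.
joined = {(ssn, n1, addr) for ssn, n1 in r1 for n2, addr in r2 if n1 == n2}

print(r <= joined)           # True: the original is always contained in the join
print(len(joined))           # 5
print(sorted(joined - r))    # [(2222, 'Alice', '3 Pine'), (3333, 'Alice', '2 Oak')]
```

The two extra tuples are exactly the spurious ones: after the join it is no longer known which Alice lives at which address.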

Is this decomposition lossless?

Tournament            Year  Winner          Winner Date of Birth
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Cleveland Open        1999  Bob Albertson   28 September 1968
Des Moines Masters    1999  Al Fredrickson  21 July 1975
Indiana Invitational  1999  Chip Masterson  14 March 1977

Tournament            Year  Winner
Indiana Invitational  1998  Al Fredrickson
Cleveland Open        1999  Bob Albertson
Des Moines Masters    1999  Al Fredrickson
Indiana Invitational  1999  Chip Masterson

Winner          Winner Date of Birth
Al Fredrickson  21 July 1975
Bob Albertson   28 September 1968
Al Fredrickson  21 July 1975
Chip Masterson  14 March 1977

Is this decomposition lossless?

StudentId  CourseId  Dept  Grade
12         CS101     IE    B-
12         CS102     IE    C+
12         IE201     IE    C-
7          CS201     CS    A
7          CS352     CS    D

StudentId  CourseId  Grade
12         CS101     B-
12         CS102     C+
12         IE201     C-
7          CS201     A
7          CS352     D

StudentId  Dept
12         IE
7          CS

Testing Decompositions for Losslessness
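For a decomposition into two relations R1 and R2, a standard test is that the decomposition is lossless-join if the common attributes R1 ∩ R2 functionally determine all of R1 or all of R2. The sketch below applies that test to the three example decompositions above. The function names, the attribute name `BirthDate`, and in particular the FD sets are assumptions made for this illustration; the slides do not state the FDs explicitly.

```python
def attribute_closure(attrs, fds):
    """Closure of a set of attribute names under fds (pairs of sets)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def is_lossless(r1, r2, fds):
    """Binary test: (R1 ∩ R2)+ must contain all of R1 or all of R2."""
    common_plus = attribute_closure(set(r1) & set(r2), fds)
    return common_plus >= set(r1) or common_plus >= set(r2)


# Assumed FD: SSN -> Name, Address (Name determines nothing).
person_fds = [({"SSN"}, {"Name", "Address"})]
print(is_lossless({"SSN", "Name"}, {"Name", "Address"}, person_fds))   # False

# Assumed FDs: Tournament, Year -> Winner and Winner -> BirthDate.
winner_fds = [({"Tournament", "Year"}, {"Winner"}),
              ({"Winner"}, {"BirthDate"})]
print(is_lossless({"Tournament", "Year", "Winner"},
                  {"Winner", "BirthDate"}, winner_fds))                # True

# Assumed FDs: StudentId -> Dept and StudentId, CourseId -> Grade.
student_fds = [({"StudentId"}, {"Dept"}),
               ({"StudentId", "CourseId"}, {"Grade"})]
print(is_lossless({"StudentId", "CourseId", "Grade"},
                  {"StudentId", "Dept"}, student_fds))                 # True
```

The test covers only binary decompositions under functional dependencies; decompositions into more than two relations need a more general procedure.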

Dependency Preserving Decomposition

(İşlevsel Bağımlılıkların Korunması)

Consider CSJDPQV, C is key, JP → C and SD → P.

BCNF decomposition: CSJDQV and SDP

Problem: Checking JP → C requires a join!

Dependency preserving decomposition (Intuitive):

If R is decomposed into X, Y and Z, and we enforce the FDs that hold on X, on Y and on Z, then all FDs that were given to hold on R must also hold.

Dependency Preserving Decompositions

Decomposition of R into X and Y is dependency preserving if (F_X ∪ F_Y)+ = F+

ABC, A → B, B → C, C → A, decomposed into AB and BC.

Is this dependency preserving? Is C → A preserved?

Dependency preserving does not imply lossless join:

ABC, A → B, decomposed into AB and BC.

And vice-versa!

Dependency Preservation Algorithm
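One common way to test dependency preservation without computing F+ is sketched below. For each FD X → Y in F, start from X and repeatedly add whatever can be derived inside a single relation of the decomposition; the FD is preserved if Y is eventually reached. This is an assumption about which algorithm is meant here, since the slide body is not available. The function names and the string encoding of attribute sets are likewise choices of this sketch.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds (FDs are pairs of attribute strings)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def preserves(fd, parts, fds):
    """Check whether FD X -> Y can be enforced using only the projected FDs."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in parts:
            attrs = set(schema)
            # Attributes derivable from result using FDs that live inside schema.
            gained = attribute_closure(result & attrs, fds) & attrs
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(parts, fds):
    return all(preserves(fd, parts, fds) for fd in fds)


# ABC with A -> B, B -> C, C -> A, decomposed into AB and BC.
F1 = [("A", "B"), ("B", "C"), ("C", "A")]
print(preserves(("C", "A"), ["AB", "BC"], F1))            # True
print(is_dependency_preserving(["AB", "BC"], F1))         # True

# CSJDPQV with key C, JP -> C and SD -> P, decomposed into CSJDQV and SDP.
F2 = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
print(preserves(("JP", "C"), ["CSJDQV", "SDP"], F2))      # False
print(is_dependency_preserving(["CSJDQV", "SDP"], F2))    # False
```

In the first example C → A is preserved indirectly: C → B holds in BC and B → A holds in AB, because both follow from F.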

Decomposition into BCNF

Input: relation R, set S of FDs over R.

Output: a set of relations in BCNF.

1. Compute S+.

2. Compute keys for R (from ER or from S+).

3. Use S+ and keys to check if R is in BCNF. If not:

a. Pick a violation FD A → B.

b. Expand B as much as possible, by computing A+.

c. Create R1 = A ∪ B, and R2 = R − B (B here contains no attribute of A, so R2 still contains A).

d. Find the FDs over R1, using S+. Repeat for R2.

e. Recurse on R1 & its set of FDs. Repeat for R2.

4. Else R is already in BCNF; add R to the output.


Decomposition into BCNF

Given: relation R with FD’s F.

Look among the given FD’s for a BCNF violation X -> B.

Compute X +.

Replace R by relations with schemas:

1. R1 = X +.

2. R2 = R – (X + – X ).

Continue with the two new relations.


Example

Drinkers(name, addr, beersLiked, manf, favBeer)

F = name->addr,

name -> favBeer,

beersLiked->manf

Pick BCNF violation name->addr.

Close the left side: {name}+ = {name, addr, favBeer}.

Decomposed relations:

Drinkers1(name, addr, favBeer)

Drinkers2(name, beersLiked, manf)


Example, Continued

We are not done; we need to check Drinkers1 and Drinkers2 for BCNF.

For Drinkers1(name, addr, favBeer), relevant FD’s are

name->addr and name->favBeer. Thus, {name} is the only key and Drinkers1 is in BCNF.


Example, Continued

For Drinkers2(name, beersLiked, manf), the only FD is beersLiked->manf, and the only key is {name, beersLiked}. Violation of BCNF.

beersLiked+ = {beersLiked, manf}, so we decompose Drinkers2 into:

1. Drinkers3(beersLiked, manf)

2. Drinkers4(name, beersLiked)


Example, Concluded

The resulting decomposition of Drinkers :

1. Drinkers1(name, addr, favBeer)

2. Drinkers3(beersLiked, manf)

3. Drinkers4(name, beersLiked)

Notice: Drinkers1 tells us about drinkers, Drinkers3 tells us about beers, and Drinkers4 tells us the relationship between drinkers and the beers they like.
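The Drinkers decomposition can be reproduced with a short recursive sketch. The function names are hypothetical, and instead of looking only among the given FDs the sketch searches all attribute subsets X for a violation (X+ restricted to the schema is neither X itself nor the whole schema). That search is exponential, and it may pick violations in a different order than the slides; on this example it still arrives at the same three relations.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Closure of a set of attribute names under fds (pairs of sets)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def find_violation(schema, fds):
    """Return (X, X+ within schema) for some BCNF violation, or None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = frozenset(combo)
            x_plus = attribute_closure(x, fds) & schema
            if x_plus != x and x_plus != schema:   # non-trivial, X not a superkey
                return x, x_plus
    return None


def bcnf_decompose(schema, fds):
    """Split on a violation X: R1 = X+, R2 = R - (X+ - X); recurse on both."""
    schema = frozenset(schema)
    violation = find_violation(schema, fds)
    if violation is None:
        return [schema]
    x, x_plus = violation
    r1 = frozenset(x_plus)
    r2 = schema - (x_plus - x)
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)


drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
F = [({"name"}, {"addr"}),
     ({"name"}, {"favBeer"}),
     ({"beersLiked"}, {"manf"})]

for relation in bcnf_decompose(drinkers, F):
    print(sorted(relation))
# ['beersLiked', 'manf']
# ['addr', 'favBeer', 'name']
# ['beersLiked', 'name']
```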

Decomposition into 3NF

Reading…
