BİL 354 – Database Management Systems (Veritabanı Yönetim Sistemleri)
Relational Database Design (İlişkisel Veri Tabanı Tasarımı)
Integrity Constraints (Bütünlük Kısıtlamaları)
Domain Constraints (Alan Kısıtlamaları)
Referential Integrity (Referans Kısıtlamaları)
Others: inter-attribute dependencies (Nitelikler Arası Bağımlılıklar)
Functional Dependencies (İşlevsel Bağımlılıklar)
Redundancy
Redundancy is at the root of several problems associated with relational schemas:
Redundant storage: Some information is stored repeatedly.
Update anomalies: If one copy is updated, all copies must be similarly updated.
Delete anomalies: It may not be possible to delete some information without losing some other information as well.
Insert anomalies: It may not be possible to store some information unless some other information is stored as well.
Update Anomaly
Insertion Anomaly
Deletion Anomaly
How about null values?
Hourly_Emps(ssn, name, lot, rating, hourly_wages, hours_worked)
Null values cannot eliminate redundant storage or update anomalies!
However they can address insertion and deletion anomalies to some extent.
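Deletion: to keep the hourly_wages of a rating after its last employee is deleted, we would need a tuple with non-null rating and hourly_wages but a null ssn. That is not allowed, because ssn is the primary key.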
Use of Decompositions
Hourly_Emps2(ssn, name, lot, rating, hours_worked)
Wages(rating, hourly_wages)
Decomposition (Ayrıştırma)
Decomposition should be used carefully, or it may create more problems.
Two questions that should be asked repeatedly:
Is there reason to decompose a relation? Normal forms were proposed to address this question.
What problems (if any) does the decomposition cause? The properties to check are:
Lossless join
Dependency preserving
Decomposition (Ayrıştırma)
Lossless-join
Enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations.
Dependency-preservation
Enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.
Thus, we need not perform joins of the smaller relations to check whether a constraint on the original relation is violated.
Example
Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)
Example : Decomposition
Decompose the relation schema Lending-schema into:
Branch-customer = (branch-name, branch-city, assets, customer-name)
customer-loan = (customer-name, loan-number, branch-name, amount)
All attributes of the original schema (R) must appear in the decomposition (R1, R2):
R = R1 ∪ R2
Lossless-join decomposition: for all possible relations r on schema R,
r = π_R1(r) ⋈ π_R2(r)
Example : Decomposition
Lossy-join Decomposition
The relation branch-customer ⋈ customer-loan
Desirable Properties of Decomposition
It may become necessary to decompose a relation into several smaller relations.
Lossless-Join
Dependency Preservation
Non-Repetition of Information
Using functional dependencies we can define normal forms that represent «good» db design.
Desirable Properties of Decomposition 2
Lossless-Join: enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations.
Dependency Preservation: enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.
We need not perform joins of the smaller relations to check whether a constraint on the relation is violated.
Non-Repetition of Information
Functional Dependencies (İşlevsel Bağımlılık)
A functional dependency (FD) X → Y holds over relation R if, for every allowable instance r of R:
∀ t1 ∈ r, ∀ t2 ∈ r: π_X(t1) = π_X(t2) implies π_Y(t1) = π_Y(t2)
i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)
X → Y iff each X value is associated with precisely one Y value. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding values of the attributes in Y.
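A minimal Python sketch of this definition, assuming an instance is given as a list of dictionaries. The helper name `satisfies_fd` is illustrative, and the sample rows reuse the StudentId/CourseId/Dept/Grade table that appears later in these slides. Such a check can only show whether one particular instance violates an FD; it cannot establish that the FD holds over R.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each X value to the Y value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X value, different Y value
    return True


rows = [
    {"StudentId": 12, "CourseId": "CS101", "Dept": "IE", "Grade": "B-"},
    {"StudentId": 12, "CourseId": "CS102", "Dept": "IE", "Grade": "C+"},
    {"StudentId": 12, "CourseId": "IE201", "Dept": "IE", "Grade": "C-"},
    {"StudentId": 7, "CourseId": "CS201", "Dept": "CS", "Grade": "A"},
    {"StudentId": 7, "CourseId": "CS352", "Dept": "CS", "Grade": "D"},
]

print(satisfies_fd(rows, ["StudentId"], ["Dept"]))   # True
print(satisfies_fd(rows, ["StudentId"], ["Grade"]))  # False
```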
Functional Dependencies
A functional dependency (FD) is a kind of IC that generalizes the concept of a key.
Example:
StudentId → Name
but not Name → StudentId
If X → Y, then
if the X value of a tuple is updated, its Y value may have to be updated too so that the FD still holds
Functional Dependencies
An FD is a statement about all allowable relations.
Must be identified based on semantics of application.
Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R!
K is a candidate key for R means that K → R
However, K → R does not require K to be minimal!
Consider Enrolled(sid, cid, cname, semester)
(sid,cid) and (sid, cname) are candidate keys
(sid, cid, cname) is a superkey
Functional Dependency
At the conceptual level
FNO → FADI, FADRESİ
ÜKODU, FNO → SFİYATI
If FADI is unique in the real world:
FADI → FNO, FADRESİ
At the instance (example) level
FNO → FADI, FADRESİ
FADI → FNO, FADRESİ
ÜKODU, FNO → SFİYATI
FADRESİ → FNO, FADI
Example
TAŞIT (PLAKANO, MARKA, MODEL, YIL, AĞIRLIK, RENK)
(TAŞIT: vehicle; PLAKANO: plate number; MARKA: make; YIL: year; AĞIRLIK: weight; RENK: color)
The functional dependencies in this example are:
PLAKANO → MARKA, MODEL, YIL, AĞIRLIK, RENK
MARKA, MODEL → AĞIRLIK
Example
Functional dependencies of the relation instance R:
D → C
A, B → C, D
A, C → B, D
A, D → B, C
B, C → D
Types of Functional Dependencies 1/2
R (ÖNO, ÖADI, BNO, BADI, FAKNO, DKODU, DADI, KRD, NOTU)
Partial functional dependency (Kısmi İşlevsel Bağımlılık). If X determines A and at least one proper subset of X also determines A (X → A and ∃ Z ⊂ X : Z → A), the FD X → A is called a partial functional dependency.
ÖNO, ÖADI → BNO
ÖNO, DKODU → KRD
Full functional dependency (Tam İşlevsel Bağımlılık). If X determines A and no proper subset of X determines A (X → A and ∄ Z ⊂ X : Z → A), the FD X → A is called a full functional dependency.
ÖNO → ÖADI, BNO
ÖNO, DKODU → NOTU
Types of Functional Dependencies 2/2
Trivial functional dependency (Önemsiz İşlevsel Bağımlılık). If X determines A and A is a subset of X (X → A and A ⊆ X), the FD X → A is called a trivial functional dependency.
ÖNO, ÖADI → ÖADI
BNO, BADI, FAKNO → BNO, FAKNO
Non-trivial functional dependency (Önemli İşlevsel Bağımlılık). If X determines A and A is not a subset of X (X → A and A ⊈ X), the FD X → A is called a non-trivial functional dependency.
ÖNO, ÖADI, DKODU → BNO, KRD
ÖNO, DKODU → NOTU
Transitive functional dependency (Geçişli İşlevsel Bağımlılık). If X determines Y and Y determines Z (X → Y and Y → Z), the FD X → Z is called a transitive functional dependency.
ÖNO → FAKNO (ÖNO → BNO, BNO → FAKNO)
Armstrong’s Axioms (İ.B. Türetme Kuralları)
Reflexivity (dönüşlülük):
If X ⊆ Y, then Y → X
Augmentation (arttırma):
If X → Y, then XZ → YZ for any Z (and hence also XZ → Y)
Transitivity (geçişlilik):
If X → Y and Y → Z, then X → Z
Other Rules
Union (Birleşim):
If X → Y and X → Z, then X → YZ
Decomposition (Ayrışım):
If X → YZ, then X → Y and X → Z
Pseudo Transitivity (Sözde Geçişlilik):
If X → Y and YZ → W, then XZ → W
Accumulation rule:
If X → YZ and Z → W, then X → YZW
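As a worked illustration, the union rule can be derived from Armstrong's axioms alone. The derivation below is a sketch that uses augmentation in its usual form (from X → Y infer XZ → YZ):

```latex
\begin{align*}
&1.\ X \to Y   && \text{given}\\
&2.\ X \to XY  && \text{augmentation of 1 with } X \ (XX = X)\\
&3.\ X \to Z   && \text{given}\\
&4.\ XY \to YZ && \text{augmentation of 3 with } Y\\
&5.\ X \to YZ  && \text{transitivity of 2 and 4}
\end{align*}
```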
Example
R (A, B, C, D, E, G)
F: A → BD
BC → EG
D → E
Derived FDs:
DG → E (by augmentation)
A → E (by decomposition and transitivity)
A → ABD (by reflexivity and union)
A → B and A → D (by decomposition)
AC → EG (by decomposition and pseudo-transitivity)
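The derived FDs in this example can be cross-checked mechanically. The sketch below does not replay the inference rules; it instead uses attribute closure (a technique these slides introduce later) and assumes single-letter attribute names, so an FD is written as a pair of strings. The names `attribute_closure` and `implies` are illustrative.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) strings."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= attribute_closure(lhs, fds)


F = [("A", "BD"), ("BC", "EG"), ("D", "E")]
derived = [("DG", "E"), ("A", "E"), ("A", "ABD"),
           ("A", "B"), ("A", "D"), ("AC", "EG")]

for lhs, rhs in derived:
    print(lhs, "->", rhs, implies(F, lhs, rhs))
```

Each of the six derived FDs is reported as implied, whereas an FD such as A → G is not, because G never enters the closure of {A}.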
Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
E.g. if A → B and B → C, then we can infer that A → C
The set of all functional dependencies logically implied by F is the closure of F.
We denote the closure of F by F+.
We can find all of F+ by applying Armstrong's Axioms:
Computing F+
F+ = F;
repeat {
    for each FD f in F+,
        apply the reflexivity and augmentation rules
        and add the new FDs to F+.
    for each pair of FDs in F+,
        apply the transitivity rule (where it applies) and add the new FDs
        to F+
} until F+ does not change any more.
Redundancy Algorithm (Artıklık Algoritması)
Tests whether an FD f : X → Y is redundant in F.
1. Initially set T = X
2. For each FD W → Z in "F – { f }":
if W ⊆ T then set T = T ∪ Z
3. Repeat step 2 as long as T changes
4. At the end, if Y ⊆ T, then the FD (f : X → Y) is redundant in F.
• The FDs are decomposed first (one attribute on each right-hand side). • Every redundant f is removed from F.
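The steps above can be sketched in Python as follows. The helper names `fd` and `is_redundant` are illustrative, attribute names are assumed to be single letters, and the FD set is the decomposed F from the example on the next slides. Redundant FDs are removed one at a time, so each later test runs against the already reduced set.

```python
def fd(lhs, rhs):
    """Build an FD as a pair of frozensets from two strings of attributes."""
    return (frozenset(lhs), frozenset(rhs))


def is_redundant(fds, f):
    """True if f = X -> Y can be inferred from the other FDs in fds."""
    x, y = f
    t = set(x)                       # step 1: T = X
    changed = True
    while changed:                   # step 3: repeat while T changes
        changed = False
        for g in fds:
            if g == f:               # work on F - {f}
                continue
            w, z = g
            if w <= t and not z <= t:
                t |= z               # step 2: T = T ∪ Z
                changed = True
    return y <= t                    # step 4


F = [fd("A", "B"), fd("A", "C"), fd("A", "D"), fd("A", "E"),
     fd("G", "B"), fd("G", "D"), fd("BC", "E"), fd("CG", "A"),
     fd("BDE", "A"), fd("BDE", "C"), fd("BDE", "G")]

remaining = list(F)
for f in F:
    if is_redundant(remaining, f):
        remaining.remove(f)

for lhs, rhs in remaining:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```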
Example 1/2
R (A, B, C, D, E, G)
F: A → BCDE
G → BD
BC → E
CG → A
BDE → ACG
F after decomposing the right-hand sides:
A → B, A → C, A → D, A → E
G → B, G → D
BC → E
CG → A
BDE → A, BDE → C, BDE → G
Example 2/2
Is A → B redundant in F? T = {A,C,D,E} is obtained: no
Is A → C redundant in F? T = {A,B,D,E,C,G} is obtained: yes
F1 = F – {A → C}
Is A → D redundant in F1? T = {A,B,E} is obtained: no
Is A → E redundant in F1? T = {A,B,D} is obtained: no
Is G → B redundant in F1? T = {G,D} is obtained: no
Is G → D redundant in F1? T = {G,B} is obtained: no
Is BC → E redundant in F1? T = {B,C} is obtained: no
Is CG → A redundant in F1? T = {C,G,B,D,E,A} is obtained: yes
F2 = F1 – {CG → A}
Functional Dependencies
Computing the closure of a set of FDs can be expensive. (Size of closure is exponential in # attrs!)
Typically, we just want to check if a given FD X → Y is in the closure of a set of FDs F. An efficient check:
Compute the attribute closure of X (denoted X+) wrt F:
Set of all attributes A such that X → A is in F+
There is a linear time algorithm to compute this.
Check if Y is in X+
Does F = {A → B, B → C, C D → E} imply A → E?
i.e., is A → E in the closure F+? Equivalently, is E in A+?
Computation of Attribute Closure X+F
closure := X; -- since X ⊆ X+F
repeat until no change in closure:
    if there is an FD Z → V in F such that
    Z ⊆ closure
    then closure := closure ∪ V
If T ⊆ closure then X → T is entailed by F
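The closure computation can be sketched in Python as below. FDs are represented as pairs of sets, and the function name `attribute_closure` is an illustrative choice. This simple version rescans F until nothing changes, so it is quadratic rather than the linear-time variant mentioned in the slides. It is applied to the question posed earlier, whether F = {A → B, B → C, CD → E} implies A → E.

```python
def attribute_closure(x, fds):
    """Return X+ under fds, where fds is a list of (lhs_set, rhs_set) pairs."""
    closure = set(x)                  # closure := X
    changed = True
    while changed:                    # repeat until no change in closure
        changed = False
        for z, v in fds:
            if z <= closure and not v <= closure:
                closure |= v          # closure := closure ∪ V
                changed = True
    return closure


F = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"E"})]

a_plus = attribute_closure({"A"}, F)
print(sorted(a_plus))   # ['A', 'B', 'C']
print("E" in a_plus)    # False
```

Since E is not in A+, the FD A → E is not entailed by this F: D is never reached, so CD → E cannot fire.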
Example
AB → C
AD → E
B → D
AF → B
{A, B}+ = {A, B, C, D, E}
{A, F}+ = {A, F, B, D, C, E}
Example
F: AB → C
A → D
D → E
AC → B
X     X+F
A     {A, D, E}
AB    {A, B, C, D, E}  (hence AB is a key)
B     {B}
D     {D, E}
Is AB → E entailed by F? Yes
Is D → C entailed by F? No
Result: X+F allows us to determine the FDs of the form X → Y entailed by F
What is the attribute closure good for?
Test if X is a superkey
    compute X+, and check if X+ contains all attrs of R
Check if X → Y holds
    by checking if Y is contained in X+
Another (not so clever) way to compute the closure S+ of a set of FDs S
    for each subset of attributes X in relation R, compute X+ with respect to S
    for each subset of attributes Y in X+, output the FD X → Y
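Both uses can be sketched in Python. The example relation R(A, B, C, D, E) with F = {AB → C, A → D, D → E, AC → B} is the one from the preceding slide. The function names are illustrative, attribute names are assumed to be single letters, and the enumeration is restricted to non-empty X and Y. The brute-force enumeration is exponential in the number of attributes, which is why it is "not so clever".

```python
from itertools import combinations


def attribute_closure(x, fds):
    """Return X+ under fds, a list of (lhs, rhs) attribute strings."""
    closure = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_superkey(x, relation, fds):
    """X is a superkey if X+ contains all attributes of the relation."""
    return attribute_closure(x, fds) >= set(relation)


def all_implied_fds(relation, fds):
    """Yield every FD X -> Y with non-empty X within relation and Y within X+."""
    attrs = sorted(relation)
    for n in range(1, len(attrs) + 1):
        for x in combinations(attrs, n):
            x_plus = sorted(attribute_closure(x, fds))
            for m in range(1, len(x_plus) + 1):
                for y in combinations(x_plus, m):
                    yield frozenset(x), frozenset(y)


R = "ABCDE"
F = [("AB", "C"), ("A", "D"), ("D", "E"), ("AC", "B")]

print(is_superkey("AB", R, F))   # True
print(is_superkey("A", R, F))    # False

F_plus = set(all_implied_fds(R, F))
print((frozenset("AB"), frozenset("E")) in F_plus)  # True
print((frozenset("D"), frozenset("C")) in F_plus)   # False
```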
Normal Forms (Normal Biçimler)
Returning to the issue of schema refinement, the first question to ask is whether any refinement is needed!
If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/minimized. This can be used to help us decide whether decomposing the relation will help.
Normal Forms
First Normal Form (1NF): A relation is in 1NF if every field contains atomic values. This requirement is implicit in our definition of the relational model.
Second Normal Form (2NF): (obsolete) each nonkey attribute in the relation must be fully functionally dependent on the primary key, not on just a part of it.
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF) (for the interested)
Increasing requirements
1NF
ÖĞRENCİ (ÖNO, ÖADI, DERS (DADI, NOTU)): not in 1NF, because DERS is a nested (non-atomic) field
ÖĞRDERS (ÖNO, ÖADI, DADI, NOTU): the flattened relation, in 1NF
2NF
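Prime attribute (asal nitelik): an attribute that appears in some key of the relation.
Non-prime attribute (asal olmayan nitelik): an attribute that appears in no key.
A relation is in 2NF if it is in 1NF and no non-prime attribute is partially functionally dependent on any key (every non-prime attribute is fully dependent on each key).
SATICI (ÜKODU, FNO, FADI, FADRESİ, SFİYATI)
FNO → FADI, FADRESİ
Assuming (ÜKODU, FNO) is the key, as the FD ÜKODU, FNO → SFİYATI given earlier suggests, FADI and FADRESİ depend on only part of the key, so SATICI is not in 2NF.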
3NF
Reln R with FDs F is in 3NF if, for all X → A in F+:
A ∈ X (called a trivial FD), or
X is a superkey, or
A is part of some key for R.
Minimality of a key is crucial in the third condition above!
If R is in BCNF, it is obviously in 3NF.
If R is in 3NF, some redundancy is possible.
A lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations is always possible.
BCNF (Boyce Codd Normal Form)
Reln R with FDs F is in BCNF if, for all X → A in F+:
A ∈ X (called a trivial FD), or
X contains a key for R (X is a superkey).
In other words, R is in BCNF if the only non-trivial FDs that hold over R are key constraints.
If the example relation is in BCNF, the 2 tuples must be identical (since X is a key).
BCNF doesn't always have a dependency-preserving decomposition.
Third normal form may be preferable to having to take a join to check dependencies after an update.
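The two definitions can be compared with a small Python sketch. It uses the schema CSJDPQV with C → CSJDPQV, JP → C and SD → P that appears in these slides. The function names are illustrative and attribute names are assumed to be single letters. The sketch relies on the fact that, for the relation R itself, it is enough to test the FDs given in F rather than all of F+. Candidate keys are found by brute force, which is only practical for small schemas.

```python
from itertools import combinations


def attribute_closure(x, fds):
    """Return X+ under fds, a list of (lhs, rhs) attribute strings."""
    closure = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def candidate_keys(relation, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    rel = set(relation)
    keys = []
    for n in range(1, len(rel) + 1):          # smallest sets first
        for combo in combinations(sorted(rel), n):
            x = set(combo)
            if any(k <= x for k in keys):
                continue                       # contains a key: not minimal
            if attribute_closure(x, fds) >= rel:
                keys.append(x)
    return keys


def is_bcnf(relation, fds):
    """Every FD in fds is trivial or has a superkey on the left."""
    rel = set(relation)
    return all(set(rhs) <= set(lhs) or attribute_closure(lhs, fds) >= rel
               for lhs, rhs in fds)


def is_3nf(relation, fds):
    """Like BCNF, but an FD is also allowed if its new attributes are prime."""
    rel = set(relation)
    prime = set().union(*candidate_keys(rel, fds))
    for lhs, rhs in fds:
        if attribute_closure(lhs, fds) >= rel:
            continue
        if all(a in set(lhs) or a in prime for a in rhs):
            continue
        return False
    return True


R = "CSJDPQV"
F = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]

print([sorted(k) for k in candidate_keys(R, F)])
print(is_bcnf(R, F))   # False: SD -> P, and SD is not a superkey
print(is_3nf(R, F))    # True: P is part of the key JP
```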
Lossless-Join (Yitimsiz Birleştirme)
A decomposition should not lose information
Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if, for every instance r that satisfies F:
π_X(r) ⋈ π_Y(r) = r
It is always true that r ⊆ π_X(r) ⋈ π_Y(r)
In general, the other direction does not hold! If it does, the decomposition is lossless-join.
Lossy Decomposition
The following is always the case:
r ⊆ r1 ⋈ r2 ⋈ ... ⋈ rn
But the following is not always true:
r ⊇ r1 ⋈ r2 ⋈ ... ⋈ rn
Example:
r:
SSN   Name   Address
1111  Joe    1 Pine
2222  Alice  2 Oak
3333  Alice  3 Pine
r1:
SSN   Name
1111  Joe
2222  Alice
3333  Alice
r2:
Name   Address
Joe    1 Pine
Alice  2 Oak
Alice  3 Pine
r1 ⋈ r2 ⊈ r
The tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) are in the join, but not in the original
Lossy Decompositions: What is Actually Lost?
In the previous example, the tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) were gained, not lost!
Why do we say that the decomposition was lossy?
What was lost is information:
That 2222 lives at 2 Oak: In the decomposition, 2222 can live at either 2 Oak or 3 Pine
That 3333 lives at 3 Pine: In the decomposition, 3333 can live at either 2 Oak or 3 Pine
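The example can be reproduced with a few lines of Python. This is a sketch that models relations as sets of tuples and writes the two projections and the natural join on Name as set comprehensions; the variable names are illustrative.

```python
# Original relation r(SSN, Name, Address)
r = {
    (1111, "Joe", "1 Pine"),
    (2222, "Alice", "2 Oak"),
    (3333, "Alice", "3 Pine"),
}

# Projections onto (SSN, Name) and (Name, Address)
r1 = {(ssn, name) for ssn, name, _ in r}
r2 = {(name, addr) for _, name, addr in r}

# Natural join of r1 and r2 on the shared attribute Name
joined = {(ssn, n1, addr)
          for ssn, n1 in r1
          for n2, addr in r2
          if n1 == n2}

spurious = joined - r
print(len(joined))        # 5
print(sorted(spurious))   # [(2222, 'Alice', '3 Pine'), (3333, 'Alice', '2 Oak')]
```

The join contains every original tuple plus two spurious ones, so the original instance can no longer be told apart from other instances with the same projections.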
Is this decomposition lossless?
Original relation:
Tournament            Year  Winner          Winner Date of Birth
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Cleveland Open        1999  Bob Albertson   28 September 1968
Des Moines Masters    1999  Al Fredrickson  21 July 1975
Indiana Invitational  1999  Chip Masterson  14 March 1977
Decomposed into:
Tournament            Year  Winner
Indiana Invitational  1998  Al Fredrickson
Cleveland Open        1999  Bob Albertson
Des Moines Masters    1999  Al Fredrickson
Indiana Invitational  1999  Chip Masterson
and:
Winner          Winner Date of Birth
Al Fredrickson  21 July 1975
Bob Albertson   28 September 1968
Al Fredrickson  21 July 1975
Chip Masterson  14 March 1977
Is this decomposition lossless?
Original relation:
StudentId  CourseId  Dept  Grade
12         CS101     IE    B-
12         CS102     IE    C+
12         IE201     IE    C-
7          CS201     CS    A
7          CS352     CS    D
Decomposed into:
StudentId  CourseId  Grade
12         CS101     B-
12         CS102     C+
12         IE201     C-
7          CS201     A
7          CS352     D
and:
StudentId  Dept
12         IE
7          CS
Lossless-Join Test for Decompositions (Ayrıştırmaların Yitimsizlik Sınaması)
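One standard test for a decomposition into two relations R1 and R2 is that it is lossless-join with respect to F if the common attributes R1 ∩ R2 functionally determine all of R1 or all of R2. The Python sketch below assumes this binary test (it does not cover decompositions into three or more relations) and uses illustrative function names. The inputs are the ABC and Drinkers examples from these slides, plus a hypothetical split of ABC into AB and AC for contrast.

```python
def attribute_closure(x, fds):
    """Return X+ under fds, a list of (lhs_set, rhs_set) pairs."""
    closure = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def is_lossless_binary(r1, r2, fds):
    """Binary test: (R1 ∩ R2)+ must cover R1 or cover R2."""
    common_plus = attribute_closure(set(r1) & set(r2), fds)
    return common_plus >= set(r1) or common_plus >= set(r2)


# ABC with A -> B
abc_fds = [({"A"}, {"B"})]
print(is_lossless_binary({"A", "B"}, {"B", "C"}, abc_fds))  # False
print(is_lossless_binary({"A", "B"}, {"A", "C"}, abc_fds))  # True

# Drinkers split on name -> addr, favBeer
drinkers_fds = [({"name"}, {"addr"}),
                ({"name"}, {"favBeer"}),
                ({"beersLiked"}, {"manf"})]
drinkers1 = {"name", "addr", "favBeer"}
drinkers2 = {"name", "beersLiked", "manf"}
print(is_lossless_binary(drinkers1, drinkers2, drinkers_fds))  # True
```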
Dependency Preserving Decomposition
(İşlevsel Bağımlılıkların Korunması)
Consider CSJDPQV, C is key, JP → C and SD → P.
BCNF decomposition: CSJDQV and SDP
Problem: Checking JP → C requires a join!
Dependency preserving decomposition (Intuitive):
If R is decomposed into X, Y and Z, and we enforce the FDs that hold on X, on Y and on Z, then all FDs that were given to hold on R must also hold.
Dependency Preserving Decompositions
Decomposition of R into X and Y is dependency preserving if (F_X ∪ F_Y)+ = F+
ABC, A → B, B → C, C → A, decomposed into AB and BC.
Is this dependency preserving? Is C → A preserved?
Dependency preserving does not imply lossless join:
ABC, A → B, decomposed into AB and BC.
And vice-versa!
Dependency Preservation Algorithm (İşlevsel Bağımlılıkların Korunması Alg.)
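A common way to test dependency preservation without computing F+ is sketched below in Python. For each FD X → Y in F, it grows a set starting from X by repeatedly taking closures restricted to each relation of the decomposition; the FD is preserved if Y ends up inside that set. This is a sketch of that standard procedure under the assumption of single-letter attribute names, not necessarily the algorithm shown on the original slide, and the function names are illustrative. The two inputs are the ABC and CSJDPQV examples from these slides.

```python
def attribute_closure(x, fds):
    """Return X+ under fds, a list of (lhs, rhs) attribute strings."""
    closure = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def preserves(fd, decomposition, fds):
    """True if fd can be enforced using only FDs local to the sub-relations."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            # attributes of ri that are determined by the part of result in ri
            gained = (attribute_closure(result & ri, fds) & ri) - result
            if gained:
                result |= gained
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(decomposition, fds):
    return all(preserves(fd, decomposition, fds) for fd in fds)


abc = [("A", "B"), ("B", "C"), ("C", "A")]
print(preserves(("C", "A"), ["AB", "BC"], abc))      # True
print(is_dependency_preserving(["AB", "BC"], abc))   # True

contracts = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
print(preserves(("JP", "C"), ["CSJDQV", "SDP"], contracts))      # False
print(is_dependency_preserving(["CSJDQV", "SDP"], contracts))    # False
```

In the first example C → A is preserved even though no single sub-relation contains both C and A: C → B holds on BC and B → A holds on AB.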
Decomposition into BCNF
Input: relation R, set S of FDs over R.
Output: a set of relations in BCNF.
1. Compute S+.
2. Compute keys for R (from ER or from S+).
3. Use S+ and keys to check if R is in BCNF. If not:
a. Pick a violation FD A → B.
b. Expand B as much as possible, by computing A+.
c. Create R1 = A ∪ B, and R2 = R − B (B taken without the attributes of A, so that R2 keeps A).
d. Find the FDs over R1, using S+. Repeat for R2.
e. Recurse on R1 & its set of FDs. Repeat for R2.
4. Else R is already in BCNF; add R to the output.
Decomposition into BCNF
Given: relation R with FD’s F.
Look among the given FD's for a BCNF violation X → B.
Compute X+.
Replace R by relations with schemas:
1. R1 = X+.
2. R2 = R – (X+ – X).
Continue with the two new relations.
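The procedure can be sketched recursively in Python. One assumption differs from the wording above: instead of looking only among the given FDs, the sketch searches every attribute subset X of the current relation for a violation (X+ restricted to the relation is neither X itself nor the whole relation), which also catches FDs that are implied on a sub-relation. The function names are illustrative, and the input is the Drinkers example from the next slides.

```python
from itertools import combinations


def attribute_closure(x, fds):
    """Return X+ under fds, a list of (lhs_set, rhs_set) pairs."""
    closure = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def find_violation(rel, fds):
    """Return (X, X+ within rel) for some BCNF violation in rel, or None."""
    for n in range(1, len(rel)):
        for combo in combinations(sorted(rel), n):
            x = frozenset(combo)
            x_plus = frozenset(attribute_closure(x, fds)) & rel
            if x_plus != x and x_plus != rel:
                return x, x_plus      # non-trivial FD, X not a superkey
    return None


def bcnf_decompose(rel, fds):
    """Split rel until no sub-relation contains a BCNF violation."""
    rel = frozenset(rel)
    violation = find_violation(rel, fds)
    if violation is None:
        return [rel]                  # already in BCNF
    x, x_plus = violation
    r1 = x_plus                       # R1 = X+
    r2 = rel - (x_plus - x)           # R2 = R - (X+ - X)
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)


drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
F = [({"name"}, {"addr"}),
     ({"name"}, {"favBeer"}),
     ({"beersLiked"}, {"manf"})]

for schema in bcnf_decompose(drinkers, F):
    print(sorted(schema))
```

The sketch may pick violations in a different order than the slides do, but for this input it ends with the same three schemas: (name, addr, favBeer), (beersLiked, manf) and (name, beersLiked).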
Example
Drinkers(name, addr, beersLiked, manf, favBeer)
F = name → addr,
name → favBeer,
beersLiked → manf
Pick BCNF violation name → addr.
Close the left side: {name}+ = {name, addr, favBeer}.
Decomposed relations:
Drinkers1(name, addr, favBeer)
Drinkers2(name, beersLiked, manf)
Example, Continued
We are not done; we need to check Drinkers1 and Drinkers2 for BCNF.
For Drinkers1(name, addr, favBeer), the relevant FD's are name → addr and name → favBeer.
Thus, {name} is the only key and Drinkers1 is in BCNF.
Example, Continued
For Drinkers2(name, beersLiked, manf), the only FD is beersLiked → manf, and the only key is {name, beersLiked}. Violation of BCNF.
{beersLiked}+ = {beersLiked, manf}, so we decompose Drinkers2 into:
1. Drinkers3(beersLiked, manf)
2. Drinkers4(name, beersLiked)
Example, Concluded
The resulting decomposition of Drinkers :
1. Drinkers1(name, addr, favBeer)
2. Drinkers3(beersLiked, manf)
3. Drinkers4(name, beersLiked)
Notice: Drinkers1 tells us about drinkers, Drinkers3 tells us about beers, and Drinkers4 tells us the relationship between drinkers and the beers they like.
Decomposition into 3NF
Reading…