basics of database tuning · • decomposition in database design • functional dependencies •...

28
Functional Dependencies

Upload: others

Post on 25-Jun-2020

15 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

Functional Dependencies

Page 2: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 2

Redundancy in Database Design

• A table Students-take-courses (stud-id, name, address, phone, crs-id, instructor-name, office)– Students(stud-id, name, address, phone, …)– Instructors(name, office, …)

• Redundant information– If a student takes 20 courses, her/his name, address,

phone number have to be repeated 20 times– If an instructor teaches 2 courses with 120 students in

total, her/his office number is repeated 120 times

Page 3: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 3

Why Redundancy Could Be Bad?

• Space cost• Maintenance overhead

– If a student updates her/his address, 20 records need to be updated

– If an instructor moves to a new office, 120 records need to be updated

– What if inconsistency happens during the update?

Page 4: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 4

Why Redundancy Could Be Good?

• Students-take-courses(stud-id, name, crs-id)– Student name is redundant if we have table

Students(stud-id, name, address, phone, …)– Only need Students-take-courses(stud-id, crs-

id)• What if often we need to generate class

rosters?– Fast query answering: avoid joining two tables

many times

Page 5: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 5

Requirements of Good Design

• Correctness: no information loss– Must be guaranteed

• Efficiency– Minimum (or, as less as possible) redundant

(repeated) information– Good performance with respect to (expected)

typical workload– May have to trade off between space and query

answering time• Redundant information may help query answering

Page 6: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 6

Atomic Domains

• Domain is atomic if its elements are considered to be indivisible units– Course-id consisting of department code and course

number, e.g., CMPT 354– Bad examples: a customer’s all accounts, all owners of

an account• Non-atomic values complicate storage and query

answering, and encourage redundant (repeated) storage of data– Storage and redundancy: a set of accounts stored with

each customer, and a set of owners stored with each account

Page 7: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 7

First Normal Form

• Normal form: a quality criteria that the database design should meet

• A relational schema R is in first normal form if the domains of all attributes of R are atomic– All relations are assumed in first normal form

• A property of how the elements of the domain are used– Strings would normally be considered indivisible – Course-id is not atomic since two pieces of information

are encoded

Page 8: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 8

Combine Schemas?

• Combine borrow and loan to get bor_loan = (customer_id, loan_number, amount )

• Result is possible repetition of information (L-100 in example below)

Page 9: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 9

Why Decomposition?

• Suppose we had started with bor_loan, how would we know to split up (decompose) it into borrower and loan?

• Write a rule “if there were a schema (loan_number, amount), then loan_number would be a candidate key”

Page 10: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 10

Why Decomposition?

• Denote as a functional dependency loan_number → amount

• In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated– This indicates the need to decompose bor_loan

Page 11: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 11

Combined Schema w/o Repetition

• Consider combining loan_branch and loanloan_amt_br = (loan_number, amount,

branch_name)• No repetition

Page 12: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 12

Decomposition Is Not Always Good

• Suppose we decompose employee intoemployee1 = (employee_id, employee_name)employee2 = (employee_name,

telephone_number, start_date)• We cannot reconstruct the original

employee relation if there are two employees having the same name

Page 13: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 13

A Lossy Decomposition

More tuples after re-joining the tables is considered loss of information instead of gain

Page 14: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 14

Designing by Decomposition

• Start from a wide table – the universal table– Containing all pieces of information

• Decide whether a particular relation R is in “good” form

• In the case that a relation R is not in a “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that – Each relation is in good form– The decomposition does not lose information

Page 15: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 15

Functional Dependencies

• Constraints on the set of legal relations• Require that the value for a certain set of

attributes determines uniquely the value for another set of attributes

• A functional dependency is a generalization of the notion of a key

Page 16: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 16

Functional Dependencies

• Let R be a relation schema, α ⊆ R and β⊆ R

• The functional dependency α → βholds on R if and only if for any legal relations r(R), whenever any two tuples t1and t2 of r agree on the attributes α, they also agree on the attributes β– t1[α] = t2 [α] ⇒ t1[β ] = t2 [β ]

Page 17: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 17

Example

• Example: Consider r(A,B ) with the following instance of r.

• On this instance, A → B does NOT hold, but B → A does hold

A B1 41 53 7

Page 18: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 18

Super Keys and Candidate Keys

• K is a superkey for relation schema R if and only if K → R

• K is a candidate key for R if and only if K →R and for no α ⊂ K, α → R

Page 19: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 19

Dependencies and Constraints

• Functional dependencies can express constraints that cannot be expressed using superkeys

• Consider the schemabor_loan = (customer_id, loan_number, amount )– We expect loan_number → amount– We do not expect amount → customer_id

Page 20: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 20

Use of Functional Dependencies• Testing relations to see if they are legal under a given set

of functional dependencies– If a relation r is legal under a set F of functional dependencies, we

say that r satisfies F

• Specifying constraints on the set of legal relations– We say that F holds on R if all legal relations on R satisfy the set of

functional dependencies F

• A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances– For example, a specific instance of loan may, by chance, satisfy

amount → customer_name

Page 21: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 21

Trivial Functional Dependencies

• A functional dependency is trivial if it is satisfied by all instances of a relation– Example:

• customer_name, loan_number → customer_name• customer_name → customer_name

• In general, α → β is trivial if β ⊆ α

Page 22: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 22

Closure

• A set of functional dependencies may logically imply other functional dependencies– If A → B and B → C, then A → C

• The set of all functional dependencies logically implied by F is the closure of F

• We denote the closure of F by F+

– F+ is a superset of F

Page 23: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 23

Armstrong’s Axioms

• Finding F+

– (reflexivity) If β ⊆ α, then α → β– (augmentation) If α → β, then γ α → γ β– (transitivity) If α → β, and β → γ, then α → γ

• These rules are – Sound: generate only functional dependencies

that actually hold – Complete: generate all functional dependencies

that hold

Page 24: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 24

Example

• R = (A, B, C, G, H, I)F = { A → B

A → CCG → HCG → I

B → H}• some members of F+

A → H – By using transitivity

from A → B and B → H

AG → I – By augmenting A → C

with G, to get AG → CG and then using transitivity with CG → I

CG → HI – By augmenting CG → I

to infer CG → CGI, and augmenting of CG → H to infer CGI → HI, and then using transitivity

Page 25: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 25

Procedure for Computing F+

F + = Frepeat

for each functional dependency f in F+

apply reflexivity and augmentation rules on fadd the resulting functional dependencies to F +

for each pair of functional dependencies f1and f2 in F +if f1 and f2 can be combined using transitivity

then add the resulting functional dependency to F +until F + does not change any further

Page 26: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 26

Auxiliary Rules

• We can further simplify manual computation of F+ by using the following additional rules– (union) If α → β holds and α → γ holds, then α→ β γ holds

– (decomposition) If α → β γ holds, then α → βholds and α → γ holds

– (pseudotransitivity) If α → β holds and γ β → δholds, then α γ → δ holds

• The above rules can be inferred from Armstrong’s axioms

Page 27: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 27

Summary

• First normal form• Decomposition in database design• Functional dependencies• Armstrong’s axioms and auxiliary rules for

closure computation

Page 28: Basics of Database Tuning · • Decomposition in database design • Functional dependencies • Armstrong’s axioms and auxiliary rules for closure computation. CMPT 354: Database

CMPT 354: Database I -- Functional Dependencies 28

To-Do-List

• Please prove the auxiliary rules using Armstrong’s Axioms