ZEIT2301- Database Design Normalisation School of Engineering and Information Technology UNSW@ADFA Dr Kathryn Merrick Bldg 16, Rm 212 (Thursdays and Fridays only) [email protected]

Upload: lesley-knight

Post on 23-Dec-2015


TRANSCRIPT

Page 1

ZEIT2301- Database Design

Normalisation

School of Engineering and Information Technology

UNSW@ADFA

Dr Kathryn Merrick

Bldg 16, Rm 212 (Thursdays and Fridays only)

[email protected]

Page 2

Topic 08: Database Normalisation

• Designing a ‘good’ relational database
• Normal forms
  • First normal form
  • Second normal form
  • Third normal form

Page 3

A Motivating Example

Suppose we want to develop a database of bike statistics for a program that permits users to find out whether or not each bike can stoppie.

Page 4

A file for such data might be…

Bike description  Wheelbase  Centre of mass height  Coefficient of friction  Can stoppie?
Harley            1.588      0.724                  0.9                      false
Harley one pax    1.588      0.775                  0.9                      false
Honda             1.458      0.831                  0.9                      true
Honda one pax     1.458      0.881                  0.9                      true

Problems with this file:

• The Bike description column contains two types of data (bike type and number of passengers)
• The Wheelbase column contains duplicate (redundant) data
• Data in the Coefficient of friction column is not even bike dependent

Page 5

Relations

Relations are data tables with rows and columns

Row

Column

Page 6

Attributes

Attributes are named columns of relations

Our bike example has five attributes:

Bike description

Wheelbase

Centre of mass height

Coefficient of friction

Can stoppie

Page 7

Domains

The domain of an attribute is the set of allowable values for that attribute

Attribute domains in our bike example:

Attribute                Domain
Bike description         Alphanumeric: size 60
Wheelbase                Numeric: range [0-5]
Centre of mass height    Numeric: range [0-5]
Coefficient of friction  Numeric: range [0-1]
Can stoppie              Boolean: one of [true, false]
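The domains above can be read as validation rules. The following is a minimal sketch of that idea, not part of the original slides: the names `in_range`, `DOMAINS` and `domain_violations` are hypothetical, and "Alphanumeric: size 60" is assumed to mean a string of at most 60 characters.

```python
def in_range(low, high):
    """Build a check for numeric values in the closed range [low, high]."""
    def check(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        return is_number and low <= value <= high
    return check


# One check per attribute, mirroring the domain table (illustrative names).
DOMAINS = {
    "bike_description": lambda v: isinstance(v, str) and len(v) <= 60,  # assumed meaning of "size 60"
    "wheelbase": in_range(0, 5),
    "centre_of_mass_height": in_range(0, 5),
    "coefficient_of_friction": in_range(0, 1),
    "can_stoppie": lambda v: isinstance(v, bool),
}


def domain_violations(record):
    """Return the names of attributes whose values fall outside their domain."""
    return [name for name, check in DOMAINS.items() if not check(record[name])]


harley = {
    "bike_description": "Harley",
    "wheelbase": 1.588,
    "centre_of_mass_height": 0.724,
    "coefficient_of_friction": 0.9,
    "can_stoppie": False,
}
print(domain_violations(harley))
```

A record that respects every domain yields an empty list; a coefficient of friction of 1.5, for example, would be reported by name.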

Page 8

Records (Tuples)

A record or tuple is one row of data in a relation

Our bike example has four records

Bike description  Wheelbase  Centre of mass height  Coefficient of friction  Can stoppie
Harley            1.588      0.724                  0.9                      false
Harley one pax    1.588      0.775                  0.9                      false
Honda             1.458      0.831                  0.9                      true
Honda one pax     1.458      0.881                  0.9                      true

Page 9

Relational Databases

A relational database is a collection of normalised relations

Normalisation is a technique for producing a set of tables that keep redundancy low and satisfy integrity constraints

There are three common normal forms:

• First normal form
• Second normal form
• Third normal form

Page 10

First Normal Form (1NF)

A table is in 1NF if:

The intersection of every column and record contains only one value

and

It has a primary key attribute that uniquely identifies every record

Page 11

Keys (Revision)

Superkey: a column or set of columns that uniquely identifies a record in a relation

Candidate key: a minimal superkey, i.e. a superkey from which no column can be removed without losing uniqueness

Primary key: the candidate key selected for identification purposes

Foreign key: a column or set of columns in one table that matches a candidate key of another table
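The difference between a superkey and a candidate key can be sketched in a few lines. This is an illustrative example only: the helper names `is_superkey` and `candidate_keys` are hypothetical, and the sample rows are the memberHobby data used in the exercise later in this lecture. Note that a data sample can only refute a key; whether a set of columns really is a key depends on the meaning of the data.

```python
from itertools import combinations


def is_superkey(rows, columns):
    """True if no two rows share the same values in the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows):
    """Return the minimal superkeys of this sample, smallest first."""
    columns = sorted(rows[0])
    found = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_superkey(rows, combo):
                found.append(combo)
    return found


member_hobby = [
    {"memberNo": 833, "hobby": "coin collecting"},
    {"memberNo": 834, "hobby": "drag racing"},
    {"memberNo": 834, "hobby": "video games"},
    {"memberNo": 927, "hobby": "video games"},
    {"memberNo": 968, "hobby": "coin collecting"},
    {"memberNo": 968, "hobby": "drag racing"},
    {"memberNo": 972, "hobby": "tiddlywinks"},
]
print(candidate_keys(member_hobby))
```

Neither column is unique on its own in this sample (member 834 and "video games" both repeat), so the only candidate key reported is the pair of columns together.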

Page 12

Decomposition to 1NF

Remove multi-valued attributes

Add a primary key

Page 13

The bike example in 1NF

Bike name*  Number of riders*  Wheelbase  Centre of mass height  Coefficient of friction  Can stoppie
Harley      1                  1.588      0.724                  0.9                      false
Harley      2                  1.588      0.775                  0.9                      false
Honda       1                  1.458      0.831                  0.9                      true
Honda       2                  1.458      0.881                  0.9                      true

* marks the primary key columns (Bike name and Number of riders together)
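The step from the original file to the 1NF table can be sketched as follows. This is a rough illustration rather than the lecturer's method: `split_description` and `to_1nf` are hypothetical names, and the rule that a " one pax" suffix means two riders is an assumption read off the two tables.

```python
PASSENGER_SUFFIX = " one pax"  # assumption: rider plus one passenger = 2 riders


def split_description(description):
    """Split the mixed 'Bike description' value into (bike name, number of riders)."""
    if description.endswith(PASSENGER_SUFFIX):
        return description[: -len(PASSENGER_SUFFIX)], 2
    return description, 1


def to_1nf(record):
    """Replace the description column with two single-valued columns."""
    description, *rest = record
    name, riders = split_description(description)
    return (name, riders, *rest)


original = [
    ("Harley", 1.588, 0.724, 0.9, False),
    ("Harley one pax", 1.588, 0.775, 0.9, False),
    ("Honda", 1.458, 0.831, 0.9, True),
    ("Honda one pax", 1.458, 0.881, 0.9, True),
]
bikes_1nf = [to_1nf(record) for record in original]

# The composite primary key (bike name, number of riders) must be unique.
keys = [(name, riders) for name, riders, *_ in bikes_1nf]
assert len(keys) == len(set(keys))
```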

Page 14

Second Normal Form (2NF)

A table is in 2NF if:

It is in 1NF and

The values in each non-primary-key column depend on the values in all primary key columns (i.e. on the whole primary key, not just a subset of its columns)

Page 15

Decomposition for 2NF

Remove non-primary-key attributes that are not fully functionally dependent on the primary key

Place them in a new relation with the part of the primary key on which they are functionally dependent (i.e. their determinant)

Consider replacing compound primary keys with non-compound keys

Page 16

Full Functional Dependency

creditRecord(customer, creditCard, address, employer, limit, interestRate)

Examine the non-key attributes in “creditRecord”: address, employer, interestRate, limit

From the FDs given, the attributes address, employer and interestRate are not dependent on the whole primary key

The attribute limit is fully functionally dependent on the primary key
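The check described above can be sketched in code. The transcript does not preserve the primary key or the FDs for creditRecord, so the ones below are assumptions chosen to be consistent with the slide's conclusions: a primary key of (customer, creditCard), with address and employer determined by customer alone and interestRate by creditCard alone. The function name `partial_dependencies` is also hypothetical.

```python
def partial_dependencies(primary_key, fds):
    """Map each non-key attribute to the proper subset of the key that determines it."""
    key = frozenset(primary_key)
    partial = {}
    for determinant, dependents in fds:
        det = frozenset(determinant)
        if det < key:  # proper subset of the key: a partial dependency
            for attribute in dependents:
                if attribute not in key:
                    partial[attribute] = tuple(sorted(det))
    return partial


# ASSUMED key and FDs (not given in the transcript).
CREDIT_RECORD_KEY = ("customer", "creditCard")
CREDIT_RECORD_FDS = [
    (("customer",), ("address", "employer")),
    (("creditCard",), ("interestRate",)),
    (("customer", "creditCard"), ("limit",)),
]

partial = partial_dependencies(CREDIT_RECORD_KEY, CREDIT_RECORD_FDS)
print(partial)
```

Under these assumptions address, employer and interestRate are reported as partially dependent, while limit is not, which matches the statements on the slide.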

Page 17

The bike example in 2NF

Bike name*  Wheelbase
Harley      1.588
Honda       1.458

Scenario ID*  Bike name  Number of riders  Centre of mass height  Coefficient of friction  Can stoppie
1             Harley     1                 0.724                  0.9                      false
2             Harley     2                 0.775                  0.9                      false
3             Honda      1                 0.831                  0.9                      true
4             Honda      2                 0.881                  0.9                      true
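The 2NF decomposition above amounts to two projections of the 1NF table. A minimal sketch, with illustrative variable names that are not from the slides:

```python
# 1NF rows: (bike name, riders, wheelbase, centre of mass height, friction, can stoppie)
bikes_1nf = [
    ("Harley", 1, 1.588, 0.724, 0.9, False),
    ("Harley", 2, 1.588, 0.775, 0.9, False),
    ("Honda", 1, 1.458, 0.831, 0.9, True),
    ("Honda", 2, 1.458, 0.881, 0.9, True),
]

# Wheelbase depends on the bike name alone, so it moves to its own relation.
# The set removes the duplicate (name, wheelbase) pairs.
bike = sorted({(name, wheelbase) for name, _riders, wheelbase, *_ in bikes_1nf})

# The remaining attributes stay together; a surrogate Scenario ID replaces
# the compound key (bike name, number of riders).
scenario = [
    (scenario_id, name, riders, com_height, friction, can_stoppie)
    for scenario_id, (name, riders, _wheelbase, com_height, friction, can_stoppie)
    in enumerate(bikes_1nf, start=1)
]

print(bike)
print(scenario[0])
```

Each wheelbase value is now stored once per bike rather than once per scenario.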

Page 18

Third Normal Form (3NF)

A table is in 3NF if:

It is in 1NF

and

It is in 2NF

and

The values in each non-primary-key column depend on values in only the primary key columns

Page 19

Transitive Functional Dependency

Consider a relation with attributes A, B, and C.

If B is functionally dependent on A (A → B), and C is functionally dependent on B (B → C), then C is transitively dependent on A.

A → B, B → C

If any non-key attribute is transitively dependent on the primary key, the relation is not in 3NF.
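The definition above can be turned into a small check. This is a simplified sketch, not a full 3NF test: it only looks for FDs whose determinant consists entirely of non-key attributes that are themselves reachable from the key. The names `closure` and `transitive_dependents` are hypothetical.

```python
def closure(attributes, fds):
    """All attributes functionally determined by the given set of attributes."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependents in fds:
            if set(determinant) <= result and not set(dependents) <= result:
                result |= set(dependents)
                changed = True
    return result


def transitive_dependents(key, fds):
    """Non-key attributes that depend on the key via a non-key determinant."""
    key = set(key)
    reachable = closure(key, fds)
    found = set()
    for determinant, dependents in fds:
        det = set(determinant)
        if det & key:
            continue  # determinant involves the key: not a transitive step
        if det <= reachable:
            found |= set(dependents) - key - det
    return found


# A -> B and B -> C, with A as the primary key.
fds = [(("A",), ("B",)), (("B",), ("C",))]
print(transitive_dependents({"A"}, fds))
```

Here C is reported, so a relation with key A and these FDs is not in 3NF. If both B and C depended directly on A only, the result would be empty.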

Page 20

The bike example in 3NF

Bike name*  Number of riders*  Centre of mass height
Harley      1                  0.724
Harley      2                  0.775
Honda       1                  0.831
Honda       2                  0.881

Road conditions*  Coefficient of friction
Icy               0.1
Wet               0.5
Dry               0.9

Scenario ID*  Bike name  Number of riders  Road conditions  Can stoppie
1             Harley     1                 Dry              false
2             Harley     2                 Dry              false
3             Honda      1                 Dry              true
4             Honda      2                 Dry              true

Bike name*  Wheelbase
Harley      1.588
Honda       1.458
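One way to realise the four 3NF tables above is sketched below using Python's built-in sqlite3 module. The table and column names (`bike`, `bike_load`, `road`, `scenario` and so on) are illustrative choices, not names from the slides, and booleans are stored as 0/1 because SQLite has no separate boolean type. The join at the end reassembles the original rows to show that no information was lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE bike (
    bike_name TEXT PRIMARY KEY,
    wheelbase REAL NOT NULL
);
CREATE TABLE bike_load (
    bike_name  TEXT NOT NULL REFERENCES bike(bike_name),
    riders     INTEGER NOT NULL,
    com_height REAL NOT NULL,
    PRIMARY KEY (bike_name, riders)
);
CREATE TABLE road (
    road_conditions TEXT PRIMARY KEY,
    friction        REAL NOT NULL
);
CREATE TABLE scenario (
    scenario_id     INTEGER PRIMARY KEY,
    bike_name       TEXT NOT NULL,
    riders          INTEGER NOT NULL,
    road_conditions TEXT NOT NULL REFERENCES road(road_conditions),
    can_stoppie     INTEGER NOT NULL,
    FOREIGN KEY (bike_name, riders) REFERENCES bike_load(bike_name, riders)
);
""")

conn.executemany("INSERT INTO bike VALUES (?, ?)",
                 [("Harley", 1.588), ("Honda", 1.458)])
conn.executemany("INSERT INTO bike_load VALUES (?, ?, ?)",
                 [("Harley", 1, 0.724), ("Harley", 2, 0.775),
                  ("Honda", 1, 0.831), ("Honda", 2, 0.881)])
conn.executemany("INSERT INTO road VALUES (?, ?)",
                 [("Icy", 0.1), ("Wet", 0.5), ("Dry", 0.9)])
conn.executemany("INSERT INTO scenario VALUES (?, ?, ?, ?, ?)",
                 [(1, "Harley", 1, "Dry", 0), (2, "Harley", 2, "Dry", 0),
                  (3, "Honda", 1, "Dry", 1), (4, "Honda", 2, "Dry", 1)])

# Rebuild the original wide rows by following the foreign keys.
rows = conn.execute("""
    SELECT s.bike_name, s.riders, b.wheelbase, l.com_height, r.friction, s.can_stoppie
    FROM scenario AS s
    JOIN bike_load AS l ON l.bike_name = s.bike_name AND l.riders = s.riders
    JOIN bike AS b ON b.bike_name = s.bike_name
    JOIN road AS r ON r.road_conditions = s.road_conditions
    ORDER BY s.scenario_id
""").fetchall()

for row in rows:
    print(row)
```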

Page 21

“Codd’s Law of Normalization”

Thou shalt depend upon the key (1NF), the whole key (2NF), and nothing but the key (3NF)!

Page 22

Normalization Exercise: 1NF?

memberNo  name    homeCity  hobby                         sport    sportHQ
833       Chang   Sydney    coin collecting               Netball  Lyneham
834       Jones   Canberra  drag racing, video games      AFL      Essendon
927       Wicken  Perth     video games                   AFL      Essendon
968       Aparti  Darwin    coin collecting, drag racing  Rugby    Randwick
972       Nixon   Perth     tiddlywinks                   Netball  Lyneham

member(memberNo, name, homeCity, hobby, sport, sportHQ)

Page 23

Session 1, 2009

Normalization Exercise: 1NF

memberNo  name    homeCity  sport    sportHQ
833       Chang   Sydney    Netball  Lyneham
834       Jones   Canberra  AFL      Essendon
927       Wicken  Perth     AFL      Essendon
968       Aparti  Darwin    Rugby    Randwick
972       Nixon   Perth     Netball  Lyneham

memberNo  hobby
833       coin collecting
834       drag racing
834       video games
927       video games
968       coin collecting
968       drag racing
972       tiddlywinks

member(memberNo, name, homeCity, sport, sportHQ)
memberHobby(memberNo, hobby)

Both tables are in 1NF: all attributes are single-valued and depend on the PK
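The split into member and memberHobby can be sketched as follows. This is an illustration only: the function name `to_1nf` is hypothetical, and it assumes the multi-valued hobby column is stored as a comma-separated string, as it appears in the unnormalised table.

```python
unnormalised = [
    (833, "Chang", "Sydney", "coin collecting", "Netball", "Lyneham"),
    (834, "Jones", "Canberra", "drag racing, video games", "AFL", "Essendon"),
    (927, "Wicken", "Perth", "video games", "AFL", "Essendon"),
    (968, "Aparti", "Darwin", "coin collecting, drag racing", "Rugby", "Randwick"),
    (972, "Nixon", "Perth", "tiddlywinks", "Netball", "Lyneham"),
]


def to_1nf(rows):
    """Move the multi-valued hobby column into its own relation."""
    member, member_hobby = [], []
    for member_no, name, home_city, hobbies, sport, sport_hq in rows:
        member.append((member_no, name, home_city, sport, sport_hq))
        # One memberHobby row per individual hobby.
        for hobby in hobbies.split(","):
            member_hobby.append((member_no, hobby.strip()))
    return member, member_hobby


member, member_hobby = to_1nf(unnormalised)
for row in member_hobby:
    print(row)
```

Every cell in both resulting relations now holds a single value, and memberHobby is identified by the pair (memberNo, hobby).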

Page 24

Normalization Exercise: 2NF?

memberNo  name    homeCity  sport    sportHQ
833       Chang   Sydney    Netball  Lyneham
834       Jones   Canberra  AFL      Essendon
927       Wicken  Perth     AFL      Essendon
968       Aparti  Darwin    Rugby    Randwick
972       Nixon   Perth     Netball  Lyneham

member(memberNo, name, homeCity, sport, sportHQ)

FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ

Do all non-key attributes depend on the whole PK?

Page 25

Normalization Exercise: 2NF

member(memberNo, name, homeCity, sport, sportHQ)

FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ

Do all non-key attributes depend on the whole PK?

• PK is not composite.
• All attributes depend on the PK.
• Table is in 2NF

Page 26

Normalization Exercise: 3NF?

member(memberNo, name, homeCity, sport, sportHQ)

FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ

Are there any transitive dependencies between non-key attributes?

Page 27

Normalization Exercise: 3NF

memberNo  name    homeCity  sport
833       Chang   Sydney    Netball
834       Jones   Canberra  AFL
927       Wicken  Perth     AFL
968       Aparti  Darwin    Rugby
972       Nixon   Perth     Netball

sport    sportHQ
Netball  Lyneham
AFL      Essendon
Rugby    Randwick

member(memberNo, name, homeCity, sport)
sport(sport, sportHQ)

FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ

Decompose based on transitive dependencies.
NB. Maintain a link between tables
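The decomposition above, and the link that must be maintained between the two tables, can be sketched as follows. Variable names such as `member_2nf` and `hq_by_sport` are illustrative only. The final comparison checks that joining member and sport on the sport column gives back exactly the rows we started with.

```python
member_2nf = [
    (833, "Chang", "Sydney", "Netball", "Lyneham"),
    (834, "Jones", "Canberra", "AFL", "Essendon"),
    (927, "Wicken", "Perth", "AFL", "Essendon"),
    (968, "Aparti", "Darwin", "Rugby", "Randwick"),
    (972, "Nixon", "Perth", "Netball", "Lyneham"),
]

# member keeps sport as the link (foreign key) to the new sport relation.
member = [(no, name, city, sport) for no, name, city, sport, _hq in member_2nf]

# sport -> sportHQ moves to its own relation; the set removes repeated pairs.
sport = sorted({(sport_name, hq) for *_, sport_name, hq in member_2nf})

# Rejoin on the sport column to confirm nothing was lost.
hq_by_sport = dict(sport)
rejoined = [(no, name, city, s, hq_by_sport[s]) for no, name, city, s in member]

print(sport)
print(rejoined == member_2nf)
```

Each sport's headquarters is now stored once, so changing it requires updating a single row.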

Page 28

Normalization Exercise: 3NF

sport(sport, sportHQ)

FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ

sport table:

• 1NF (all attributes single-valued and dependent on PK)
• 2NF (attribute depends on the whole key)
• 3NF (no transitive dependencies between non-key attributes)

Page 29

Normalization Exercise: 2NF and 3NF

memberHobby(memberNo, hobby)

memberNo  hobby
833       coin collecting
834       drag racing
834       video games
927       video games
968       coin collecting
968       drag racing
972       tiddlywinks

memberHobby table:

• 1NF: All attributes are single-valued
• 2NF: Table is all key. No non-key attributes so table is in 2NF
• 3NF: No non-key attributes so no transitive dependencies between them!

Page 31

Summary

After today’s lecture you should be able to:

Design a normalised relational database in:

• First normal form
• Second normal form
• Third normal form