Denis Reznik "Relational Database Design. Normalize till it hurts, then Denormalize...

Golden Database Design Rule: Normalize till it hurts Denormalize till it works Denis Reznik Data Architect at Intapp, Inc. Microsoft Data Platform MVP http:/reznik.uneta.com.ua @denisreznik

Upload: fwdays

Post on 15-Apr-2017

TRANSCRIPT

Page 1: Denis Reznik "Relational Database Design. Normalize till it hurts, then Denormalize till it works"

Golden Database Design Rule:
Normalize till it hurts
Denormalize till it works

Denis Reznik
Data Architect at Intapp, Inc.
Microsoft Data Platform MVP
http:/reznik.uneta.com.ua
@denisreznik

Page 2

Database History

• 1960s: CODASYL, IMS

• 1970s: E.F. Codd's Paper, System R, RDBMS Ingres, SQL

• 1980s: RDBMS Commercial Success

• 1990s: Object Databases

• 2000s: Google BigTable Paper, Amazon Dynamo Paper, NoSQL (Johan Oskarsson)

• Nowadays: (?)

Page 3

Relational Model

Matter

Id  UserId  Name  Date
1   1       Work  07/03/2014
2   2       Test  09/09/2015
4   2       Rest  12/08/2015

Client

Id  Name  Phone
1   Bill  +380678455732
2   John  NULL
3   Mike  +380501233427

• Relation (Table)

• Attribute (Column)

• Tuple (Row)
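The three terms map directly onto what a SQL engine exposes. The sketch below is only an illustration: it uses Python's built-in sqlite3 module rather than any engine named in the talk, and the column types and the assignment of the two tables to the names Matter and Client are assumptions read off the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Relations (tables); column types are assumed, the slide shows only values.
CREATE TABLE Client (
    Id    INTEGER PRIMARY KEY,
    Name  TEXT NOT NULL,
    Phone TEXT              -- NULL when no phone is recorded
);
CREATE TABLE Matter (
    Id     INTEGER PRIMARY KEY,
    UserId INTEGER NOT NULL,
    Name   TEXT NOT NULL,
    Date   TEXT
);
INSERT INTO Client VALUES
    (1, 'Bill', '+380678455732'),
    (2, 'John', NULL),
    (3, 'Mike', '+380501233427');
INSERT INTO Matter VALUES
    (1, 1, 'Work', '07/03/2014'),
    (2, 2, 'Test', '09/09/2015'),
    (4, 2, 'Rest', '12/08/2015');
""")

cursor = conn.execute("SELECT * FROM Matter ORDER BY Id")
attributes = [column[0] for column in cursor.description]  # attribute = column
tuples = cursor.fetchall()                                  # tuple = row

print(attributes)
for row in tuples:
    print(row)
```

Each fetched row comes back as a Python tuple, which is a convenient reminder of where the relational term comes from.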

Page 4

Normalization

• Normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy

Chart labels: Redundancy, Complexity, Table Count

Page 5

First Normal Form (1NF)

• Each cell contains an atomic value

Matters (not in 1NF)

Company       User            Phone
Microsoft     John Dow        Tel1: +380969785732, Tel2: +32345409123
Microsoft     Larry McGregor  Tel: +45678904692
Oracle Corp.  John Snow       +380988958371
Amazon        Jack Snack      Home: +23348902385 Work: +69058763287

Matters (in 1NF)

Company       User            Phone          Phone Type
Microsoft     John Dow        +380969785732  NULL
Microsoft     John Dow        +32345409123   NULL
Microsoft     Larry McGregor  +45678904692   NULL
Oracle Corp.  John Snow       +380988958371  NULL
Amazon        Jack Snack      +23348902385   Home
Amazon        Jack Snack      +69058763287   Work
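One way to picture the move to 1NF is as a function that takes each packed Phone cell and emits one row per phone number. The following is a minimal sketch, not the speaker's method: the regular expression, the helper name to_1nf, and the rule that only Home and Work count as phone types are all assumptions made for this example.

```python
import re

# Rows as shown before normalization: the Phone cell packs several values.
raw_rows = [
    ("Microsoft", "John Dow", "Tel1: +380969785732, Tel2: +32345409123"),
    ("Microsoft", "Larry McGregor", "Tel: +45678904692"),
    ("Oracle Corp.", "John Snow", "+380988958371"),
    ("Amazon", "Jack Snack", "Home: +23348902385 Work: +69058763287"),
]

# An optional "Label:" prefix followed by a phone number.
PHONE = re.compile(r"(?:(\w+):\s*)?(\+\d+)")

# Assumption: labels such as Tel or Tel1 carry no type information.
KNOWN_TYPES = {"Home", "Work"}


def to_1nf(rows):
    """Return one (company, user, phone, phone_type) row per phone number."""
    result = []
    for company, user, phones in rows:
        for label, number in PHONE.findall(phones):
            phone_type = label if label in KNOWN_TYPES else None
            result.append((company, user, number, phone_type))
    return result


atomic_rows = to_1nf(raw_rows)
for row in atomic_rows:
    print(row)
```

After the split every cell holds a single value, so a phone number can be searched, indexed, or updated without string parsing.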

Page 6

Second Normal Form (2NF)

• Table has a Key (Key = Primary Key)

• All non-key columns of the relation depend on the whole Key

Matters

Company       User            Company Address  Manager
Microsoft     John Dow        Redmond          Jane Dow
Microsoft     Duncan MacLeod  Redmond          John Dow
Microsoft     John Snow       Redmond          Tony Stark
Oracle Corp.  John Dow        California       Rick Brick
Amazon        Jack Snack      Seattle          George Black
Google        Dale Cooper     California       Diana Smith

Key: (Client, Matter)

User            Company       Manager
John Dow        Microsoft     Jane Dow
Duncan MacLeod  Microsoft     John Dow
John Snow       Microsoft     Tony Stark
John Dow        Oracle Corp.  Rick Brick
Jack Snack      Amazon        George Black
Dale Cooper     Google        Diana Smith

Clients

Company       Address
Microsoft     Redmond
Oracle Corp.  California
Amazon        Seattle
Google        California
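The 2NF split can be checked mechanically: the company address depends on Company alone, so it moves to its own table, and joining the two tables back must reproduce the original rows. This is a minimal sketch using Python's sqlite3. The table names Wide, Users, and Companies, the column name CompanyAddress, and the composite key on the company and user columns are assumptions for illustration, and [User] is bracketed because USER is a reserved word in some engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Wide table: CompanyAddress depends on Company only, not on the whole key.
CREATE TABLE Wide (
    Company TEXT, [User] TEXT, CompanyAddress TEXT, Manager TEXT,
    PRIMARY KEY (Company, [User])
);
INSERT INTO Wide VALUES
    ('Microsoft',    'John Dow',       'Redmond',    'Jane Dow'),
    ('Microsoft',    'Duncan MacLeod', 'Redmond',    'John Dow'),
    ('Microsoft',    'John Snow',      'Redmond',    'Tony Stark'),
    ('Oracle Corp.', 'John Dow',       'California', 'Rick Brick'),
    ('Amazon',       'Jack Snack',     'Seattle',    'George Black'),
    ('Google',       'Dale Cooper',    'California', 'Diana Smith');

-- 2NF decomposition: the partially dependent column gets its own table.
CREATE TABLE Companies (Company TEXT PRIMARY KEY, Address TEXT);
CREATE TABLE Users (
    [User] TEXT, Company TEXT, Manager TEXT,
    PRIMARY KEY ([User], Company)
);
INSERT INTO Companies SELECT DISTINCT Company, CompanyAddress FROM Wide;
INSERT INTO Users SELECT [User], Company, Manager FROM Wide;
""")

original = conn.execute("""
    SELECT Company, [User], CompanyAddress, Manager
    FROM Wide ORDER BY Company, [User]
""").fetchall()

# Joining the two smaller tables should give back exactly the wide rows.
rejoined = conn.execute("""
    SELECT u.Company, u.[User], c.Address, u.Manager
    FROM Users u JOIN Companies c ON c.Company = u.Company
    ORDER BY u.Company, u.[User]
""").fetchall()

address_rows = conn.execute("SELECT COUNT(*) FROM Companies").fetchone()[0]
print(rejoined == original, address_rows)
```

The address of Microsoft is now stored once instead of three times, which is the redundancy the second rule removes.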

Page 8

Third Normal Form (3NF)

• Every non-prime attribute of Relation is non-transitively dependent on every Key of Relation

Matters

Company       User            Manager       Manager Age
Microsoft     John Dow        Peter Parker  23
Microsoft     Patrik Jones    Steven Wu     45
Microsoft     Jackie Adams    Steven Wu     45
Oracle Corp.  Ashley Grey     Jean Claude   67
Amazon        Scott McMillan  John Smith    34
Amazon        Mary Smith      John Smith    34

Key: (Client, Matter)

Matters

Company       User            Manager
Microsoft     John Dow        Peter Parker
Microsoft     Patrik Jones    Steven Wu
Microsoft     Jackie Adams    Steven Wu
Oracle Corp.  Ashley Grey     Jean Claude
Amazon        Scott McMillan  John Smith
Amazon        Mary Smith      John Smith

Attorneys

Manager       Manager Age
Peter Parker  23
Steven Wu     45
Jean Claude   67
John Smith    34
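The practical payoff of removing the transitive dependency (the key determines Manager, and Manager determines Manager Age) is that a manager's age lives in exactly one row. The sketch below illustrates this with Python's sqlite3. The column name ManagerAge and the key choices are assumptions, and where the slide's tables disagree on a name the sketch simply picks one spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Attorneys (Manager TEXT PRIMARY KEY, ManagerAge INTEGER);
CREATE TABLE Matters (
    Company TEXT, [User] TEXT, Manager TEXT,
    PRIMARY KEY (Company, [User])
);
INSERT INTO Attorneys VALUES
    ('Peter Parker', 23), ('Steven Wu', 45),
    ('Jean Claude', 67), ('John Smith', 34);
INSERT INTO Matters VALUES
    ('Microsoft',    'John Dow',       'Peter Parker'),
    ('Microsoft',    'Patrik Jones',   'Steven Wu'),
    ('Microsoft',    'Jackie Adams',   'Steven Wu'),
    ('Oracle Corp.', 'Ashley Grey',    'Jean Claude'),
    ('Amazon',       'Scott McMillan', 'John Smith'),
    ('Amazon',       'Mary Smith',     'John Smith');
""")

# One row changes, however many users report to this manager.
updated = conn.execute(
    "UPDATE Attorneys SET ManagerAge = 46 WHERE Manager = 'Steven Wu'"
).rowcount

# Every user of that manager sees the new age through the join.
ages = conn.execute("""
    SELECT m.[User], a.ManagerAge
    FROM Matters m JOIN Attorneys a ON a.Manager = m.Manager
    WHERE m.Manager = 'Steven Wu'
    ORDER BY m.[User]
""").fetchall()

print(updated, ages)
```

In the single wide table the same change would have to touch every row that mentions Steven Wu, and missing one of them would leave two different ages for the same person.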

Page 9

Fourth Normal Form (4NF)

• Eliminates independent many-to-many relationships between columns

Matters

Id  Company    Consultant
1   Microsoft  Peter Partner
2   Microsoft  John Dow
3   Microsoft  Amy Chen
4   Oracle     Jim Beam
5   Amazon     John Snow
6   Google     John Snow

Matters

Id  Company
1   Microsoft
2   Oracle
3   Amazon
4   Google

Attorneys

Id  Consultant
1   Peter Partner
2   John Dow
3   Amy Chen
4   Jim Beam
5   John Snow

MatterAttorneys

CompanyId  ConsultantId
1          1
1          2
1          3
2          4
3          5
4          5
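With the link table in place, the relationship can be read in either direction with the same pair of joins. This is a small sketch in Python's sqlite3 using the table and column names from the slide; the helper functions consultants_for and companies_for and the composite primary key on the link table are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Matters   (Id INTEGER PRIMARY KEY, Company TEXT);
CREATE TABLE Attorneys (Id INTEGER PRIMARY KEY, Consultant TEXT);
-- Link table: one row per (company, consultant) pair.
CREATE TABLE MatterAttorneys (
    CompanyId INTEGER, ConsultantId INTEGER,
    PRIMARY KEY (CompanyId, ConsultantId)
);
INSERT INTO Matters VALUES
    (1, 'Microsoft'), (2, 'Oracle'), (3, 'Amazon'), (4, 'Google');
INSERT INTO Attorneys VALUES
    (1, 'Peter Partner'), (2, 'John Dow'), (3, 'Amy Chen'),
    (4, 'Jim Beam'), (5, 'John Snow');
INSERT INTO MatterAttorneys VALUES
    (1, 1), (1, 2), (1, 3), (2, 4), (3, 5), (4, 5);
""")

JOINED = """
    FROM MatterAttorneys ma
    JOIN Matters m   ON m.Id = ma.CompanyId
    JOIN Attorneys a ON a.Id = ma.ConsultantId
"""


def consultants_for(company):
    """Names of all consultants linked to the given company."""
    query = "SELECT a.Consultant" + JOINED + "WHERE m.Company = ? ORDER BY a.Id"
    return [row[0] for row in conn.execute(query, (company,))]


def companies_for(consultant):
    """Names of all companies linked to the given consultant."""
    query = "SELECT m.Company" + JOINED + "WHERE a.Consultant = ? ORDER BY m.Id"
    return [row[0] for row in conn.execute(query, (consultant,))]


print(consultants_for("Microsoft"))
print(companies_for("John Snow"))
```

Company and consultant names are each stored once, and adding a new pairing is a single insert into MatterAttorneys.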

Page 10

Foreign Keys

Users

User            Company       Manager
John Dow        Microsoft     Jane Dow
Duncan MacLeod  Microsoft     John Dow
John Snow       Microsoft     Tony Stark
John Dow        Oracle Corp.  Rick Brick
Jack Snack      Amazon        George Black
Dale Cooper     Google        Diana Smith

Key: (User, Company)

Companies

Company       Address
Microsoft     Redmond
Oracle Corp.  California
Amazon        Seattle
Google        California

FK_USERS_COMPANY (Users.Company references Companies.Company)

Page 11

Foreign Keys

UPDATE Users
SET Company = 'Microsoft'
WHERE [User] = 'Dale Cooper' AND Company = 'Google'
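What the constraint buys can be seen by running the update against an engine that enforces it. The sketch below uses Python's sqlite3 as a stand-in, which is an assumption (the talk does not name an engine here). SQLite only checks foreign keys after PRAGMA foreign_keys = ON, and the company name Initech is a made-up value that is deliberately absent from Companies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Companies (Company TEXT PRIMARY KEY, Address TEXT);
CREATE TABLE Users (
    [User] TEXT, Company TEXT, Manager TEXT,
    PRIMARY KEY ([User], Company),
    CONSTRAINT FK_USERS_COMPANY
        FOREIGN KEY (Company) REFERENCES Companies (Company)
);
INSERT INTO Companies VALUES
    ('Microsoft', 'Redmond'), ('Google', 'California');
INSERT INTO Users VALUES ('Dale Cooper', 'Google', 'Diana Smith');
""")

# Moving Dale Cooper to a company that exists passes the FK check.
conn.execute("""
    UPDATE Users SET Company = 'Microsoft'
    WHERE [User] = 'Dale Cooper' AND Company = 'Google'
""")

# Pointing at a company that is not in Companies is rejected.
try:
    conn.execute(
        "UPDATE Users SET Company = 'Initech' WHERE [User] = 'Dale Cooper'"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

current_company = conn.execute(
    "SELECT Company FROM Users WHERE [User] = 'Dale Cooper'"
).fetchone()[0]
print(current_company, rejected)
```

The first update succeeds because Microsoft is present in Companies; the second fails before any row changes, so the Users table can never reference a company that does not exist.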

Page 12

The Law of Diminishing Returns