database management system review

Database Management System CS157A SJSU Fall 2015 Kaya

Upload: kaya-ota

Post on 13-Apr-2017


TRANSCRIPT

Page 1: Database Management System Review

Database Management System CS157A SJSU Fall 2015 Kaya

Page 2: Database Management System Review

What is a DB? Definition of a database

A collection of information organized to afford efficient retrieval.

** This definition is not limited to relational databases (RDBs) **

Page 3: Database Management System Review

Why do we need DB?

1. Sharing = support concurrent access (reads and writes) by multiple users

2. Data Model Enforcement = make sure all apps see clean, organized data

3. Scale = work with datasets too large to fit in memory

4. Flexibility = use data in new and unanticipated ways

Page 4: Database Management System Review

Data Models

Page 5: Database Management System Review

Database Model: kinds of database model

1. Relational data model
2. Object-oriented relational data model
3. Semi-structured data model

Page 6: Database Management System Review

Relational data model: Excel-like, i.e. working with tables

Has operations: union, intersection, difference, selection, projection, product, join, renaming

Person ID | Last Name | First Name | Date Of Birth | Home Addr Street | Home Addr City | Home Addr Zip | Home Addr State | Work Addr Street | Work Addr City | Work Addr Zip | Work Addr State
1 | Yamada | taro | 4/15 | aaa | bbb | 111 | CA | eee | fff | 222 | CA

Page 7: Database Management System Review

Object Oriented Relational Data Model

Similar to the relational model. Added: objects, classes, and inheritance, supported directly in the DB schema and the query language.

OBJ: Person (Last Name, First Name, Date of Birth, Home Address, Work Address)

OBJ: Address (Street, City, Zip, State)

Home Address and Work Address in Person refer to Address objects.

Page 8: Database Management System Review

Object Oriented Relational Data Model

OBJ: Person (Last Name, First Name, Date of Birth, Home Address, Work Address)

OBJ: Address (Street, City, Zip, State), referred to by Person

Instance: Person (yamada, taro, 4/15, Home, Work)

Instance: Home Address (aaa, bbb, 111, CA)

Instance: Work Address (ccc, ddd, 222, CA)

Page 9: Database Management System Review

Semi Structured Data Model

Data are represented by a graph or tree. To implement, use XML.

Movies
  movie: title = Gone with the wind, Year = 1939, Length = 281, Genre = drama
  movie: title = Star Wars, Year = 1977, Length = 124, Genre = scifi

Page 10: Database Management System Review

XML representation

<Movies>
  <movie title="Gone with the wind">
    <year>1939</year><length>281</length><genre>drama</genre>
  </movie>
  <movie title="Star Wars">
    <year>1977</year><length>124</length><genre>scifi</genre>
  </movie>
</Movies>

Page 11: Database Management System Review

Other Data Model: Hierarchical model

Can be used to represent a taxonomy

☆ Each record has its parent's ID as metadata

[Figure: pictorial representation and relational representation]

Page 12: Database Management System Review

Other Data Model: Network model

Differs from the relational model in that data are represented by a collection of records, and relationships among data are represented by links.

[Figure: schema linking Customer and Account]

Page 13: Database Management System Review

Defining Schema in SQL

Page 14: Database Management System Review

DATA TYPE -letters-

Character string
CHAR(n): a fixed-length string of n characters is stored. Use this if you KNOW the number of characters that will be stored.
VARCHAR(n): up to n characters are stored. Use this if you do NOT know the number of characters that will be stored.

Bit string
BIT(n): like CHAR(n), a fixed-length string of n bits
BIT VARYING(n): like VARCHAR(n), up to n bits

Page 15: Database Management System Review

Data types -math-

BOOLEAN = {TRUE, FALSE}

INTEGER

SHORTINT: range is smaller than INTEGER (standard SQL names this type SMALLINT)

FLOAT

DOUBLE

DECIMAL(n, d): fixed-point number with n digits in total, d of them after the decimal point

NUMERIC(n, d): same as DECIMAL

Page 16: Database Management System Review

Data type -time-

DATE: written as 'yyyy-mm-dd'

TIME: written as 'HH:mm:ss' or 'HH:mm:ss.d', where d is a fraction of a second

TIMESTAMP: written as 'yyyy-mm-dd HH:mm:ss'

Page 17: Database Management System Review

Creating Tables: syntax in SQL

In general:
CREATE TABLE table_name (
    Attribute1 data_type PRIMARY KEY,
    Attribute2 data_type DEFAULT value,
    Attribute3 data_type,
    ...
);

In example:
CREATE TABLE Movie (
    title VARCHAR(50) PRIMARY KEY,
    year INT DEFAULT 0,
    length INT
);

Reserved words are written in uppercase.

DEFAULT 0 sets the initial value of year to 0.

PRIMARY KEY makes title the unique key.
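A minimal sketch of the Movie table in action, run through Python's built-in sqlite3 module (the choice of SQLite and the inserted rows are assumptions for illustration; other DBMSs may treat types slightly differently). It shows the default filling in a missing year and the primary key rejecting a duplicate title.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE Movie (
        title  VARCHAR(50) PRIMARY KEY,
        year   INT DEFAULT 0,
        length INT
    )
""")

# No year is given, so the DEFAULT value is stored.
conn.execute("INSERT INTO Movie (title, length) VALUES ('Star Wars', 124)")
row = conn.execute("SELECT title, year, length FROM Movie").fetchone()
print(row)

# A second row with the same title violates the PRIMARY KEY.
try:
    conn.execute("INSERT INTO Movie (title, length) VALUES ('Star Wars', 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```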

Page 18: Database Management System Review

Relational Operations

Page 19: Database Management System Review

UNION

Page 20: Database Management System Review

Union

Basic rules of UNION
The number of columns and the order of columns MUST be the SAME in each query
The data types of the corresponding columns in each query MUST be the SAME or compatible
The column names of the result are usually taken from the first query

Compatible: (Title varchar(), Year int, Length int) U (Title varchar(), Year int, Length int)

Not compatible: (Title varchar(), Year time, Length int) U (Title varchar(), Year int, Length int)

Page 21: Database Management System Review

Syntax

In general:
SELECT attribute1, attribute2 FROM Table1
UNION
SELECT attribute1, attribute2 FROM Table2

In example:
SELECT prod_code, prod_name FROM Product
UNION
SELECT prod_code, prod_name FROM Purchase

Page 22: Database Management System Review

Example: tables

Products:
PROD_CODE | PROD_NAME | COM_NAME | LIFE
PR001 | TV | SONY | 7
PR002 | DVD player | LG | 9
PR003 | iPod | PHILIPS | 9
PR004 | Sound system | CREATIVE | 8
PR005 | mobile | NOKIA | 6

Purchase:
PUR_# | PROD_CODE | PROD_NAME | COM_NAME | PUR_QTY | PUR_AMOUNT
2 | PR001 | TV | SONY | 15 | 450000
1 | PR003 | iPod | PHILIPS | 20 | 60000
3 | PR007 | laptop | HP | 6 | 240000
4 | PR005 | mobile | NOKIA | 100 | 300000
5 | PR002 | DVD player | LG | 10 | 30000
6 | PR006 | Sound system | CREATIVE | 8 | 40000

Page 23: Database Management System Review

Example: output

Products UNION Purchase:
PROD_CODE | PROD_NAME | COM_NAME
PR001 | TV | SONY
PR002 | DVD player | LG
PR003 | iPod | PHILIPS
PR004 | Sound system | CREATIVE
PR005 | mobile | NOKIA
PR006 | Sound system | CREATIVE
PR007 | laptop | HP
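The union of the two example tables can be reproduced with a short sketch using Python's sqlite3 module. The table and column names follow the slides, except that PUR_# is renamed to pur_no here (an assumption, since # is awkward in an identifier). UNION removes rows that appear in both inputs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (prod_code TEXT, prod_name TEXT, com_name TEXT, life INT);
    CREATE TABLE purchase (pur_no INT, prod_code TEXT, prod_name TEXT,
                           com_name TEXT, pur_qty INT, pur_amount INT);
    INSERT INTO product VALUES
        ('PR001', 'TV', 'SONY', 7), ('PR002', 'DVD player', 'LG', 9),
        ('PR003', 'iPod', 'PHILIPS', 9), ('PR004', 'Sound system', 'CREATIVE', 8),
        ('PR005', 'mobile', 'NOKIA', 6);
    INSERT INTO purchase VALUES
        (2, 'PR001', 'TV', 'SONY', 15, 450000),
        (1, 'PR003', 'iPod', 'PHILIPS', 20, 60000),
        (3, 'PR007', 'laptop', 'HP', 6, 240000),
        (4, 'PR005', 'mobile', 'NOKIA', 100, 300000),
        (5, 'PR002', 'DVD player', 'LG', 10, 30000),
        (6, 'PR006', 'Sound system', 'CREATIVE', 8, 40000);
""")

# Same number, order, and types of columns in both queries.
union_rows = conn.execute("""
    SELECT prod_code, prod_name, com_name FROM product
    UNION
    SELECT prod_code, prod_name, com_name FROM purchase
    ORDER BY prod_code
""").fetchall()

for r in union_rows:
    print(r)
```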

Page 24: Database Management System Review

Union with different column names

SELECT prod_code, prod_name, life FROM product WHERE life > 6
UNION
SELECT prod_code, prod_name, pur_qty FROM purchase WHERE pur_qty < 20

product: PROD_CODE | PROD_NAME | COM_NAME | LIFE (int)

purchase: PUR_# | PROD_CODE | PROD_NAME | COM_NAME | PUR_QTY (int) | PUR_AMOUNT

The two queries use two different criteria (life and pur_qty) on different columns.

BUT NOTE that both columns hold INTEGER values, so the union is allowed.

Page 25: Database Management System Review

Union with different column names

PROD_CODE | PROD_NAME | LIFE | value comes from
PR001 | TV | 7 | PRODUCT.LIFE
PR001 | TV | 15 | PURCHASE.PUR_QTY
PR002 | DVD player | 9 | PRODUCT.LIFE
PR002 | DVD player | 10 | PURCHASE.PUR_QTY
PR003 | iPod | 9 | PRODUCT.LIFE
PR004 | Sound system | 8 | PRODUCT.LIFE
PR006 | Sound system | 8 | PURCHASE.PUR_QTY
PR007 | laptop | 6 | PURCHASE.PUR_QTY

BE CAREFUL: in most cases, this is an unwelcome result.
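A small sketch (Python's sqlite3, with only the columns this query needs; the reduced schemas are an assumption) shows how the result ends up with a single column named life that silently mixes product lifetimes and purchase quantities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (prod_code TEXT, prod_name TEXT, life INT);
    CREATE TABLE purchase (prod_code TEXT, prod_name TEXT, pur_qty INT);
    INSERT INTO product VALUES
        ('PR001', 'TV', 7), ('PR002', 'DVD player', 9), ('PR003', 'iPod', 9),
        ('PR004', 'Sound system', 8), ('PR005', 'mobile', 6);
    INSERT INTO purchase VALUES
        ('PR001', 'TV', 15), ('PR003', 'iPod', 20), ('PR007', 'laptop', 6),
        ('PR005', 'mobile', 100), ('PR002', 'DVD player', 10),
        ('PR006', 'Sound system', 8);
""")

cur = conn.execute("""
    SELECT prod_code, prod_name, life FROM product WHERE life > 6
    UNION
    SELECT prod_code, prod_name, pur_qty FROM purchase WHERE pur_qty < 20
    ORDER BY prod_code, life
""")
# Column names come from the first query, so pur_qty values appear under "life".
column_names = [d[0] for d in cur.description]
mixed_rows = cur.fetchall()

print(column_names)
for r in mixed_rows:
    print(r)
```

Nothing in the output marks which rows came from which table, which is why this kind of union is usually a mistake.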

Page 26: Database Management System Review

INTERSECTION

Page 27: Database Management System Review

Selection and Projection

Page 28: Database Management System Review

Selection

Notation: R1 := σ_C(R2)

C is a condition (as in an if-statement) that refers to attributes of R2

R1 is all those tuples of R2 that satisfy C

SQL form: SELECT * FROM R2 WHERE C

Page 29: Database Management System Review

Selection

R2:
Bar | Beer | Price
Joe's | Bud | 2.50
Joe's | Miller | 2.75
Sue's | Bud | 2.50
Sue's | Miller | 3.00

C: Bar = Joe's

R1:
Bar | Beer | Price
Joe's | Bud | 2.50
Joe's | Miller | 2.75
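The selection above can be sketched with Python's sqlite3 module. The column names bar, beer, and price and the REAL type for price are assumptions; the rows follow the slide, reading the beer in Sue's first row as Bud.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R2 (bar TEXT, beer TEXT, price REAL)")
conn.executemany("INSERT INTO R2 VALUES (?, ?, ?)", [
    ("Joe's", "Bud", 2.50), ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50), ("Sue's", "Miller", 3.00),
])

# Selection keeps whole rows that satisfy the condition C (bar = Joe's).
r1 = conn.execute(
    "SELECT * FROM R2 WHERE bar = ? ORDER BY beer", ("Joe's",)
).fetchall()
print(r1)
```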

Page 30: Database Management System Review

Projection

Notation: R1 := π_L(R2)

R1 is constructed by looking at each tuple of R2, extracting the attributes on list L in the order specified, and creating from those components a tuple for R1

Eliminate duplicate tuples, if any

SQL form: SELECT L FROM R2 (add DISTINCT to eliminate duplicates)

Page 31: Database Management System Review

Projection

R2:
Bar | Beer | Price
Joe's | Bud | 2.50
Joe's | Miller | 2.75
Sue's | Bud | 2.50
Sue's | Miller | 3.00

Extract Beer and Price:
Beer | Price
Bud | 2.50
Miller | 2.75
Bud | 2.50
Miller | 3.00

Delete duplicates:
Beer | Price
Bud | 2.50
Miller | 2.75
Miller | 3.00
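A sketch of the same projection in Python's sqlite3 (column names are assumptions). A plain SELECT keeps the duplicate row, so DISTINCT is what matches the set-based relational algebra result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R2 (bar TEXT, beer TEXT, price REAL)")
conn.executemany("INSERT INTO R2 VALUES (?, ?, ?)", [
    ("Joe's", "Bud", 2.50), ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50), ("Sue's", "Miller", 3.00),
])

# Projection onto L = (beer, price); SQL keeps duplicates by default.
projected = conn.execute("SELECT beer, price FROM R2").fetchall()

# DISTINCT removes the repeated (Bud, 2.50) tuple.
distinct = conn.execute(
    "SELECT DISTINCT beer, price FROM R2 ORDER BY beer, price"
).fetchall()

print(len(projected), distinct)
```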

Page 32: Database Management System Review

PRODUCT and JOIN

Page 33: Database Management System Review

CROSS PRODUCT

Consider ALL possible combinations of rows from two or more tables.

# of rows in Table1 = x

# of rows in Table2 = y

# of rows in the result table = x * y

Page 34: Database Management System Review
Page 35: Database Management System Review

Syntax

In general:
SELECT T1.A1, T1.A2, T2.A1, T2.A2, ...
FROM T1 CROSS JOIN T2

In example:
SELECT Eats.pizza, Eats.name, Person.age, Person.gender, Person.name
FROM Eats CROSS JOIN Person

Eats has 9 rows and Person has 20 rows, so the result has 9 * 20 = 180 rows
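The row count can be checked with a quick sketch in Python's sqlite3. The contents of Eats and Person are not given in the slides, so the rows below are made-up placeholders; only the sizes (9 and 20) and the column names in the query come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Eats (name TEXT, pizza TEXT)")
conn.execute("CREATE TABLE Person (name TEXT, age INT, gender TEXT)")

# Placeholder data: 9 Eats rows and 20 Person rows (values are hypothetical).
conn.executemany("INSERT INTO Eats VALUES (?, ?)",
                 [(f"eater{i}", f"pizza{i}") for i in range(9)])
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)",
                 [(f"person{i}", 20 + i, "f" if i % 2 else "m") for i in range(20)])

rows = conn.execute("""
    SELECT Eats.pizza, Eats.name, Person.age, Person.gender, Person.name
    FROM Eats CROSS JOIN Person
""").fetchall()
print(len(rows))  # every Eats row is paired with every Person row
```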

Page 36: Database Management System Review

EQUI-JOIN

An equi-join joins tables on equality of matching column values of the associated tables.

An equal sign (=) is used as the comparison operator in the WHERE clause to express equality:
SELECT * FROM t1, t2 WHERE t1.attr1 = t2.attr2

An equi-join can also be written using JOIN followed by ON, specifying the names of the columns along with their associated tables to check for equality.

Page 37: Database Management System Review

EQUI-JOIN

T1:
ID | Attribute1
2 | A2
5 | A5
3 | A3
1 | -----
4 | A4

T2:
ID | Attribute2
5 | B5
1 | B1
3 | -----
6 | B6
2 | B2
5 | C4

SELECT * FROM T1 JOIN T2 ON T1.ID = T2.ID

SELECT * FROM T1, T2 WHERE T1.ID = T2.ID

Result:
ID | Attribute1 | ID | Attribute2
1 | ----- | 1 | B1
2 | A2 | 2 | B2
3 | A3 | 3 | -----
5 | A5 | 5 | B5
5 | A5 | 5 | C4

NOTE: Neither ID column is eliminated; both appear in the result.

ID 5 in T1 matches two rows with ID 5 in T2, so the T1 row with ID 5 appears twice.
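Both forms of the equi-join can be run side by side in a sketch with Python's sqlite3. The "-----" cells are stored as NULL here, which is an assumption about what the dashes mean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INT, Attribute1 TEXT);
    CREATE TABLE T2 (ID INT, Attribute2 TEXT);
    INSERT INTO T1 VALUES (2, 'A2'), (5, 'A5'), (3, 'A3'), (1, NULL), (4, 'A4');
    INSERT INTO T2 VALUES (5, 'B5'), (1, 'B1'), (3, NULL), (6, 'B6'),
                          (2, 'B2'), (5, 'C4');
""")

cur = conn.execute(
    "SELECT * FROM T1 JOIN T2 ON T1.ID = T2.ID ORDER BY T1.ID, T2.Attribute2"
)
columns = [d[0] for d in cur.description]  # ID shows up twice
join_rows = cur.fetchall()

where_rows = conn.execute(
    "SELECT * FROM T1, T2 WHERE T1.ID = T2.ID ORDER BY T1.ID, T2.Attribute2"
).fetchall()

print(columns)
for r in join_rows:
    print(r)
```

IDs 4 (only in T1) and 6 (only in T2) have no partner and drop out of the result.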

Page 38: Database Management System Review

Natural Join

Natural join is a type of EQUI-JOIN

It is structured in such a way that columns with the same name in the associated tables appear only once in the result: no duplicated column names

Guidelines
The associated tables have one or more pairs of identically named columns
The columns MUST be of the same data type
Do not use an ON clause in a natural join

Page 39: Database Management System Review

Natural-JOIN

T1:
ID | Attribute1
2 | A2
5 | A5
3 | A3
1 | -----
4 | A4

T2:
ID | Attribute2
5 | B5
1 | B1
3 | -----
6 | B6
2 | B2
5 | C4

SELECT * FROM T1 NATURAL JOIN T2;

Result:
ID | Attribute1 | Attribute2
1 | ----- | B1
2 | A2 | B2
3 | A3 | -----
5 | A5 | B5
5 | A5 | C4

NOTE: This matches the same rows as SELECT * FROM T1, T2 WHERE T1.ID = T2.ID, but one of the ID columns IS eliminated.
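A sketch of the natural join on the same two tables, using Python's sqlite3 (which supports NATURAL JOIN; storing "-----" as NULL is an assumption). The shared ID column appears only once in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INT, Attribute1 TEXT);
    CREATE TABLE T2 (ID INT, Attribute2 TEXT);
    INSERT INTO T1 VALUES (2, 'A2'), (5, 'A5'), (3, 'A3'), (1, NULL), (4, 'A4');
    INSERT INTO T2 VALUES (5, 'B5'), (1, 'B1'), (3, NULL), (6, 'B6'),
                          (2, 'B2'), (5, 'C4');
""")

# No ON clause: the join condition is implied by the shared column name ID.
# ORDER BY 1, 3 sorts by the first and third result columns.
cur = conn.execute("SELECT * FROM T1 NATURAL JOIN T2 ORDER BY 1, 3")
natural_columns = [d[0] for d in cur.description]
natural_rows = cur.fetchall()

print(natural_columns)
for r in natural_rows:
    print(r)
```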

Page 40: Database Management System Review

Theta-Join

Theta join allows an arbitrary comparison relation, such as {<=, >=, <, >, =, !=}

Relational algebra notation: R3 := R1 ⋈_C R2, where C = any Boolean-valued condition

Take R1 × R2, then apply selection with condition C

Page 41: Database Management System Review

Theta Join

R1:
Bar | Beer | Price
Joe's | Bud | 2.50
Joe's | Miller | 2.75
Sue's | Bud | 2.50
Sue's | Coors | 3.00

R2:
Name | ADDR
Joe's | Maple St
Sue's | River St

C: R1.Bar = R2.Name

Result:
Bar | Beer | Price | Name | ADDR
Joe's | Bud | 2.50 | Joe's | Maple St
Joe's | Miller | 2.75 | Joe's | Maple St
Sue's | Bud | 2.50 | Sue's | River St
Sue's | Coors | 3.00 | Sue's | River St

Page 42: Database Management System Review

Other Join --

Page 43: Database Management System Review

Normalization

Page 44: Database Management System Review

Normalization Why do we need to normalize data?

To reduce redundancy and dependency

Page 45: Database Management System Review

No normalization: problems without normalization

Anomalies (contradictions / inconsistencies) can happen:
Update anomaly
Insertion anomaly
Deletion anomaly

Solution: normalization!

We need data normalization to reduce anomalies

Page 46: Database Management System Review

Update anomaly

An update anomaly is a data inconsistency that results from data redundancy and a partial update.

Page 47: Database Management System Review

Update anomaly

Table: Employee
EmployeeID | Name | Department | Student group
123 | J. Longfellow | Accounting | Beta Alpha Psi
234 | B. Rech | Marketing | Marketing Club
234 | B. Rech | Marketing | Marketing Manage Club
456 | A.Bruchs | CIS | Technology Org
456 | A.Bruchs | CIS | Beta Alpha Psi

What happens if you update like below?

UPDATE Employee SET Department = 'ECON' WHERE StudentGroup = 'Technology Org'

Page 48: Database Management System Review

Update anomaly

EmployeeID | Name | Department | Student group
123 | J. Longfellow | Accounting | Beta Alpha Psi
234 | B. Rech | Marketing | Marketing Club
234 | B. Rech | Marketing | Marketing Manage Club
456 | A.Bruchs | ECON | Technology Org
456 | A.Bruchs | CIS | Beta Alpha Psi

When A.Bruchs's department is updated, say from CIS to ECON, the 5th row's department has to be updated too. Otherwise, the data are no longer consistent.

The two rows can no longer describe the same person!!!
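The update anomaly can be reproduced with a short sketch in Python's sqlite3. The column name StudentGroup (no space) follows the UPDATE statement in the slides; the untyped schema is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (EmployeeID INT, Name TEXT, Department TEXT, StudentGroup TEXT)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    (123, "J. Longfellow", "Accounting", "Beta Alpha Psi"),
    (234, "B. Rech", "Marketing", "Marketing Club"),
    (234, "B. Rech", "Marketing", "Marketing Manage Club"),
    (456, "A.Bruchs", "CIS", "Technology Org"),
    (456, "A.Bruchs", "CIS", "Beta Alpha Psi"),
])

# Partial update: only one of A.Bruchs's two rows matches the WHERE clause.
conn.execute(
    "UPDATE Employee SET Department = 'ECON' WHERE StudentGroup = 'Technology Org'"
)

departments = {
    row[0] for row in
    conn.execute("SELECT Department FROM Employee WHERE EmployeeID = 456")
}
print(departments)  # the same employee now has two different departments
```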

Page 49: Database Management System Review

Another Update Anomaly

S_id | S_name | S_address | Suj_opted
401 | Adam | Noida | Bio
402 | Alex | Panipat | Math
403 | Stuart | Jammu | Math
404 | Adam | Noida | Physics

Updating the address of a student who appears in two or more rows:

We need to check ALL ROWS for the update.

If any row is not updated, Adam lives in two different places: an inconsistency.

Page 50: Database Management System Review

Insertion Anomaly

Insertion anomaly: the inability to add data to the DB due to the absence of other data

Page 51: Database Management System Review

Insertion Anomaly

This company hires Roy, who has not decided on a student group yet

INSERT INTO Employee (EmployeeID, Name, Department, StudentGroup) VALUES (125, 'Roy', 'Math', )   ERROR

We need a smaller table that only holds employees, not employees AND their student group, department, etc.

EmployeeID | Name | Department | Student group
123 | J. Longfellow | Accounting | Beta Alpha Psi
234 | B. Rech | Marketing | Marketing Club
234 | B. Rech | Marketing | Marketing Manage Club
456 | A.Bruchs | CIS | Technology Org
456 | A.Bruchs | CIS | Beta Alpha Psi

Page 52: Database Management System Review

Deletion Anomaly

Deletion anomaly is the unintended loss of data due to deletion of other data.

Page 53: Database Management System Review

Deletion anomaly

EmployeeID | Name | Department | Student group
123 | J. Longfellow | Accounting | Beta Alpha Psi
234 | B. Rech | Marketing | Marketing Club
234 | B. Rech | Marketing | Marketing Manage Club
456 | A.Bruchs | CIS | Technology Org
456 | A.Bruchs | CIS | Beta Alpha Psi

What happens if you execute:
DELETE FROM Employee WHERE StudentGroup = 'Beta Alpha Psi'

Page 54: Database Management System Review

Deletion Anomaly

J. Longfellow no longer exists (as data)!!!

EmployeeID | Name | Department | Student group
123 | J. Longfellow | Accounting | Beta Alpha Psi (deleted)
234 | B. Rech | Marketing | Marketing Club
234 | B. Rech | Marketing | Marketing Manage Club
456 | A.Bruchs | CIS | Technology Org
456 | A.Bruchs | CIS | Beta Alpha Psi (deleted)
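The deletion anomaly can be checked with a sketch in Python's sqlite3 (same assumed schema as the Employee table in the slides, with the column named StudentGroup). Deleting a student group also deletes the only record of an employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (EmployeeID INT, Name TEXT, Department TEXT, StudentGroup TEXT)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    (123, "J. Longfellow", "Accounting", "Beta Alpha Psi"),
    (234, "B. Rech", "Marketing", "Marketing Club"),
    (234, "B. Rech", "Marketing", "Marketing Manage Club"),
    (456, "A.Bruchs", "CIS", "Technology Org"),
    (456, "A.Bruchs", "CIS", "Beta Alpha Psi"),
])

conn.execute("DELETE FROM Employee WHERE StudentGroup = 'Beta Alpha Psi'")

remaining_ids = sorted(
    {row[0] for row in conn.execute("SELECT EmployeeID FROM Employee")}
)
remaining_rows = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(remaining_ids, remaining_rows)  # employee 123 has vanished entirely
```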

Page 55: Database Management System Review

Functional Dependencies

Trivial functional dependency
Relation (A, B, C): B determines B == knowing B, we can find B

Partial functional dependency
Relation (A, B, C): B determines C == knowing B, we can find C

Page 56: Database Management System Review

Functional Dependencies

Fully functional dependency
Relation (A, B, C): A determines B AND C == knowing A, we can find every non-key attribute

Transitive dependency
Relation (A, B, C): A determines B and B determines C
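To make "X determines Y" concrete, here is a minimal Python sketch that tests whether a functional dependency holds in a given set of rows. The helper name holds and the sample rows are hypothetical; a check on sample data can only refute a dependency, not prove it for every possible instance of the relation.

```python
def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical instance of relation (A, B, C) with A as the key.
rows = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 2, "B": "x", "C": 10},
    {"A": 3, "B": "y", "C": 20},
]

print(holds(rows, ["B"], ["B"]))       # trivial dependency
print(holds(rows, ["A"], ["B", "C"]))  # A determines B and C
print(holds(rows, ["B"], ["C"]))       # with A -> B this gives a transitive chain
print(holds(rows, ["B"], ["A"]))       # fails: "x" maps to both 1 and 2
```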

Page 57: Database Management System Review

First Normalization Form

Definition of 1NF: a relation is in 1NF if it satisfies the following condition:
No two rows of data may contain a repeating group of information, i.e. each column must hold an atomic value, such that multiple columns cannot be used to fetch the same row

Page 58: Database Management System Review

2nd normalization form

Definition: a relation is in 2NF if it satisfies the following conditions:
It is in 1NF
All non-key attributes are fully functionally dependent on the primary key. The primary key has to be able to determine all other attributes.

A functional dependency that holds in a relation is partial when removing one of the determining attributes gives a functional dependency that still holds in the relation.

If {A,B} → {C} but also {A} → {C}, then {C} is partially functionally dependent on {A,B}

☆ A relation in 2NF can still contain transitive dependencies

Page 59: Database Management System Review

3rd Normalization Form

A relation is in 3NF if it satisfies the following conditions:
It is in 2NF
There is no transitive dependency

Transitive dependency in a relation (A, B, C):
A determines B: B = f(A)
B determines C: C = h(B)
Transitive: C = h(f(A))

Page 60: Database Management System Review

BCNF

Determinant: any attribute (simple or composite) on which some other attribute is fully functionally dependent.

BCNF definition: a relation R is in BCNF if and only if every determinant is a candidate key

Note: 3NF does not deal with the case where:
A relation has multiple candidate keys
Those candidate keys are composite
The candidate keys overlap

BCNF eliminates the anomalies of those cases, which 3NF cannot deal with.

Page 61: Database Management System Review

BCNF-Example

Table = Supplies(supplier_no, supplier_name, city, zip)

supplier_no and supplier_name are each unique

h1(supplier_no) = city = g1(supplier_name)

h2(supplier_no) = zip = g2(supplier_name)

h3(supplier_no) = supplier_name

g3(supplier_name) = supplier_no

Page 62: Database Management System Review

Possible Anomaly in BCNF

INSERT: We cannot record the city for a supplier_no without also knowing the supplier_name

DELETE: If we delete the row for a given supplier_name, we lose the information that the supplier_no is associated with a given city.

UPDATE: Since supplier_name is a candidate key (unique), there are none.

http://www2.york.psu.edu/~lxn/IST_210/normal_form_definitions.html

Page 63: Database Management System Review

Possible solution

Decompose Supplies into two tables.

SUPPLIER_INFO (supplier_no, city, zip)

SUPPLIER_NAME (supplier_no, supplier_name)
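A sketch of this decomposition in Python's sqlite3. The supplier rows are invented for illustration, and building the two tables with CREATE TABLE ... AS SELECT is just one convenient way to do it. Joining the two tables on supplier_no gives back the original rows, and a city can now be recorded before the supplier's name is known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplies (supplier_no INT, supplier_name TEXT, city TEXT, zip TEXT);
    -- Hypothetical sample rows, not from the slides.
    INSERT INTO Supplies VALUES
        (1, 'Acme', 'San Jose', '95112'),
        (2, 'Globex', 'Fresno', '93650'),
        (3, 'Initech', 'San Jose', '95112');

    CREATE TABLE SUPPLIER_INFO AS SELECT supplier_no, city, zip FROM Supplies;
    CREATE TABLE SUPPLIER_NAME AS SELECT supplier_no, supplier_name FROM Supplies;
""")

original = conn.execute("""
    SELECT supplier_no, supplier_name, city, zip FROM Supplies ORDER BY supplier_no
""").fetchall()

# Joining the two smaller tables on supplier_no rebuilds the original table.
rejoined = conn.execute("""
    SELECT n.supplier_no, n.supplier_name, i.city, i.zip
    FROM SUPPLIER_NAME n JOIN SUPPLIER_INFO i ON n.supplier_no = i.supplier_no
    ORDER BY n.supplier_no
""").fetchall()

# The INSERT anomaly is gone: a city can be stored without a supplier_name.
conn.execute("INSERT INTO SUPPLIER_INFO VALUES (4, 'Oakland', '94601')")
info_count = conn.execute("SELECT COUNT(*) FROM SUPPLIER_INFO").fetchone()[0]
print(rejoined == original, info_count)
```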

Page 64: Database Management System Review

Representation

Page 65: Database Management System Review

Representation

SQL representation:

SELECT movietitle
FROM (SELECT starname, movietitle FROM starln) a,
     (SELECT name FROM moviestar WHERE birthdate LIKE '%1974%') b
WHERE a.starname = b.name

SQL, relational algebra, and the query tree are 3 different representations that show the same query

Page 66: Database Management System Review

Query Tree

π movietitle
  ⋈ movietitle
    π starname
      Starln
    π name
      σ moveyear like '%1974%'
        MovieStar

Page 67: Database Management System Review

Disk

Page 68: Database Management System Review

Structure of disk

Reads one piece of data at a time, because reading magnetically is not fully reliable. If a read fails, the disk has to go back to recover.

Cylinder(non-physical)