Database Systems Lecture #6, Yan Pan, School of Software, SYSU, 2011


Database Systems
Lecture #6

Yan Pan

School of Software, SYSU

2011

Agenda
  Basic SQL
  RA…

Recap: You are here
First part of the course is done: conceptual foundations
You now know:
  E/R Model
  Relational Model
  Relational Algebra (a little)
You now know how to:
  Capture part of the world as an E/R model
  Convert E/R models to relational models
  Convert relational models to good (normal) forms
Next:
  Create, update, query tables with R.A./SQL
  Write SQL/DB-connected applications

3-minute Normalization Review
1. Q: What’s required for BCNF?
2. Q: How do we fix a non-BCNF relation?
3. Q: If As → Bs violates BCNF, what do we do?
4. Q: Can BCNF decomposition ever be lossy?
5. Q: How do we combine two relations?
6. Q: Can BCNF decomp. lose FDs?
7. Q: Why would you ever use 3NF?

Next topic: SQL
Standard language for querying and manipulating data
  Structured Query Language
Many standards: ANSI SQL, SQL92/SQL2, SQL3/SQL99
  Originally: Structured English Query Language (SEQUEL)
  Vendors support various subsets/extensions
  We’ll do Oracle/MySQL/generic
    “No one ever got fired for buying Oracle.”
Basic form (many more bells and whistles in addition):

  SELECT attributes
  FROM relations (possibly multiple, joined)
  WHERE conditions (selections)

Data Types in SQL
Characters:
  CHAR(20) -- fixed length
  VARCHAR(40) -- variable length
Numbers:
  BIGINT, INT, SMALLINT, TINYINT
  REAL, FLOAT -- differ in precision
  MONEY
Times and dates:
  DATE
  DATETIME -- SQL Server

“Tables”

Table name: Product
Attribute names: PName, Price, Category, Manufacturer
Tuples or rows:

PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

Simple SQL Query

Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

  SELECT *
  FROM Product
  WHERE category='Gadgets'

Result (“selection”):
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks

Simple SQL Query

Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

  SELECT PName, Price, Manufacturer
  FROM Product
  WHERE Price > 100

Result (“selection” and “projection”):
PName        Price    Manufacturer
SingleTouch  $149.99  Canon
MultiTouch   $203.99  Hitachi
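The two queries above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the column types (TEXT, REAL) and the ORDER BY clauses are assumptions added so the output is deterministic; the rows are the ones shown in the Product table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the slides only show attribute names.
conn.execute(
    "CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)

# Selection only: keep whole rows whose Category is 'Gadgets'.
gadgets = conn.execute(
    "SELECT * FROM Product WHERE Category = 'Gadgets' ORDER BY PName"
).fetchall()

# Selection plus projection: filter on Price, keep three of the four columns.
expensive = conn.execute(
    "SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100 ORDER BY PName"
).fetchall()

print(gadgets)
print(expensive)
```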

A Notation for SQL Queries

Input relation:
  Product(PName, Price, Category, Manufacturer)

  SELECT PName, Price, Manufacturer
  FROM Product
  WHERE Price > 100

Output relation:
  (PName, Price, Manufacturer)

The WHERE clause
Guifeng Zheng, DBMS, SS/SYSU
Contains a boolean expression
  Each literal is a test: x = y, x < y, x <= y, etc.
  For numbers, they have the usual meanings
  For CHARs/VARCHARs: lexicographic ordering
    Expected conversion between CHAR and VARCHAR
  For dates and times, what you expect

Complex RA Expressions
Schema: Movies(Title, year, length, inColor, studioName, Prdcr#)
Q: How long was Star Wars (1977)?
  Strategy: find the row with Star Wars; then project the length field

Title      Year  Length  inColor  Studio     Prdcr#
Star Wars  1977  124     True     Fox        12345
M.Ducks    1991  104     True     Disney     67890
W.World    1992  95      True     Paramount  99999
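One way to write the strategy above as a single expression, using the selection (sigma) and projection (pi) operators that the deck defines later. The attribute names follow the Movies schema; the exact condition is an illustration, not taken from the slide.

```latex
\pi_{\text{length}}\bigl(\sigma_{\text{title} = \text{'Star Wars'} \,\wedge\, \text{year} = 1977}(\text{Movies})\bigr)
```

On the three sample rows, the selection keeps only the Star Wars row and the projection leaves the single value 124.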

Combining operations
Query: Which Fox movies were >= 100 minutes long?

Title          Year  Length  Filmtype  Studio
Star wars      1977  124     Color     Fox
Mighty ducks   1991  104     Color     Disney
Wayne’s world  1992  85      Color     Paramount
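A possible expression for this query combines both tests in one selection and then projects the title. The column names are read off the table above; this is a sketch rather than the slide's own answer.

```latex
\pi_{\text{Title}}\bigl(\sigma_{\text{Studio} = \text{'Fox'} \,\wedge\, \text{Length} \ge 100}(\text{Movies})\bigr)
```

On the sample table only Star wars satisfies both conditions.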

Next (parallel) topic: relational algebra
  Projection
  Selection
  Cartesian Product
  Joins: natural joins, theta joins
  Set operations: union, intersection, difference
  Combining operations to form queries
  Dependent and independent operations

What is relational algebra?
An algebra for relations
“High-school” algebra: an algebra for numbers
Algebra = formalism for constructing expressions
  Operations
  Operands: variables, constants, expressions
Expressions:
  Vars & constants
  Operators applied to expressions
  They evaluate to values

Algebra      Vars/consts                   Operators                        Eval to
High-school  Numbers                       + * - / etc.                     Numbers
Relational   Relations (= sets of tuples)  union, intersection, join, etc.  Relations

Why do we care about relational algebra?
1. The exprs are the form that questions about the data take
     The relations these exprs cash out to are the answers to our questions
2. RA ~ more succinct rep. of many SQL queries
3. DBMSs parse SQL into something like RA

First proofs of concept for RDBMS/RA:
  System R at IBM
  Ingres at Berkeley
“Modern” implementation of RA: SQL
  Both state of the art, mid-70s

Relation operators
Basic operators:
  Selection: $\sigma$
  Projection: $\pi$
  Cartesian Product: $\times$
Other set-theoretic ops:
  Union: $\cup$
  Intersection: $\cap$
  Difference: $-$
Additional operators:
  Joins (natural, equijoin, theta join, semijoin)
  Renaming: $\rho$
  Grouping…

Selection op
Selects all tuples satisfying a condition
Notation: $\sigma_{c}(R)$
Examples:
  $\sigma_{\text{salary} > 100000}(\text{Employee})$
  $\sigma_{\text{name} = \text{“Smith”}}(\text{Employee})$
The condition c can have
  comparison ops: $=, <, \le, >, \ge, <>$
  boolean ops: and, or
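A minimal sketch of selection in plain Python, treating a relation as a list of dicts. The function name select and the sample Employee rows are made up for illustration; only the two conditions come from the slide.

```python
def select(relation, condition):
    """sigma_c(R): keep the tuples of `relation` for which `condition` holds."""
    return [row for row in relation if condition(row)]


# Hypothetical Employee instance (not from the slides).
employee = [
    {"name": "Smith", "salary": 120000},
    {"name": "Jones", "salary": 90000},
    {"name": "Smith", "salary": 80000},
]

high_paid = select(employee, lambda t: t["salary"] > 100000)
smiths = select(employee, lambda t: t["name"] == "Smith")
# Conditions can be combined with boolean ops such as `and` / `or`.
both = select(employee, lambda t: t["name"] == "Smith" and t["salary"] > 100000)
```

Selection never changes the schema: every output tuple has exactly the attributes of the input relation.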

Selection example

Select the movies at Sunshine: $\sigma_{\text{Theater} = \text{“Sunshine”}}(\text{Showings})$

Showings
Title       Theater     N’hood
Annie Hall  Sunshine    Village
Bad Edu.    Sunshine    Village
Masc. Fem.  Film Forum  Village

Result
Title       Theater     N’hood
Annie Hall  Sunshine    Village
Bad Edu.    Sunshine    Village

Projection op
Keep only certain columns
Projection: op we used for decomposition
  Eliminates other columns, then removes duplicates
Notation: $\pi_{A_1,\dots,A_n}(R)$
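A small sketch of projection with duplicate removal, again treating a relation as a list of dicts. The function name project is an assumption; the rows are the Product tuples from the earlier slides.

```python
def project(relation, attributes):
    """pi_{A1..An}(R): keep only `attributes`, then drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:  # set semantics: each output tuple appears once
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


product = [
    {"PName": "Gizmo", "Price": 19.99, "Category": "Gadgets", "Manufacturer": "GizmoWorks"},
    {"PName": "Powergizmo", "Price": 29.99, "Category": "Gadgets", "Manufacturer": "GizmoWorks"},
    {"PName": "SingleTouch", "Price": 149.99, "Category": "Photography", "Manufacturer": "Canon"},
    {"PName": "MultiTouch", "Price": 203.99, "Category": "Household", "Manufacturer": "Hitachi"},
]

categories = project(product, ["Category"])
pairs = project(product, ["Category", "Manufacturer"])
```

Four input rows give three categories, because the two Gadgets rows collapse into one tuple once the other columns are gone.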

Cartesian product op
Cross product again
  “Cartesian Product”
Each tuple in R1 combines w/each tuple in R2
Algebraic notation: $R_1 \times R_2$
  Not actual SQL!
If R1, R2 fields overlap, include both and disambiguate: R1.A, R2.A
Q: Where does the name come from?
Q: If R1 has n1 rows and R2 has n2, how large is R1 x R2?
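A short sketch of the cross product that prefixes every attribute with its relation name, as in the R1.A / R2.A convention above. The function name cross and the sample data are assumptions for illustration.

```python
from itertools import product


def cross(r1, r2, name1="R1", name2="R2"):
    """R1 x R2: pair every tuple of r1 with every tuple of r2.

    Attribute names are prefixed (R1.A, R2.A) so overlapping names stay distinct.
    """
    return [
        {
            **{f"{name1}.{k}": v for k, v in t1.items()},
            **{f"{name2}.{k}": v for k, v in t2.items()},
        }
        for t1, t2 in product(r1, r2)
    ]


# Hypothetical relations: both have an attribute called A.
r1 = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}]
r2 = [{"A": 10}, {"A": 20}, {"A": 30}]

pairs = cross(r1, r2)
print(len(pairs))  # n1 * n2 rows
```

With n1 = 2 and n2 = 3 the result has 2 * 3 = 6 rows, which is the general answer to the size question: n1 * n2.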

Cartesian product example

Hillary-addresses
Street           City
333 Some Street  Chappaqua
444 Embassy Row  Washington

Hillary-jobs
Job
Senator
First Lady
Lawyer

Hillary-addresses x Hillary-jobs
Street           City        Job
333 Some Street  Chappaqua   Senator
444 Embassy Row  Washington  Senator
333 Some Street  Chappaqua   First Lady
444 Embassy Row  Washington  First Lady
333 Some Street  Chappaqua   Lawyer
444 Embassy Row  Washington  Lawyer

Join op
Corresponds to SQL query doing cross & equality test
Specifically:
  $R_1 \bowtie R_2 = \pi_{\text{every att once}}(\sigma_{\text{shared atts } =}(R_1 \times R_2))$
I.e., first compute the cross product R1 x R2
Next, select the rows in which shared fields agree
Finally, project onto the union of R1 and R2’s fields (remove duplicates)

Natural join example

Addresses
Name    Street           City
Hilary  333 Some Street  Chappaqua
Hilary  444 Embassy Row  Washington
Bill    Somewhere else   ddd

Jobs
Name    Job
Hilary  Senator
Hilary  First Lady
Hilary  Lawyer

$\text{Addresses} \bowtie \text{Jobs}$
Name    Street           City        Job
Hilary  333 Some Street  Chappaqua   Senator
Hilary  444 Embassy Row  Washington  Senator
Hilary  333 Some Street  Chappaqua   First Lady
Hilary  444 Embassy Row  Washington  First Lady
Hilary  333 Some Street  Chappaqua   Lawyer
Hilary  444 Embassy Row  Washington  Lawyer

Natural Join

R
A  B
X  Y
X  Z
Y  Z
Z  V

S
B  C
Z  U
V  W
Z  V

$R \bowtie S$ = ?

Unpaired tuples called dangling
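One way to work this exercise is to code the natural join as cross product, then selection on the shared attributes, then projection onto each attribute once. The sketch below does that for the R and S instances shown; the function name natural_join is an assumption, and it assumes both relations are non-empty.

```python
def natural_join(r, s):
    """R join S: pair tuples that agree on all shared attributes, each attribute kept once."""
    shared = set(r[0]) & set(s[0])  # assumes r and s are non-empty
    result = []
    for t1 in r:  # cross product ...
        for t2 in s:
            if all(t1[a] == t2[a] for a in shared):  # ... selection on shared attributes ...
                merged = {**t1, **t2}  # ... projection: every attribute once
                if merged not in result:  # remove duplicates
                    result.append(merged)
    return result


R = [{"A": "X", "B": "Y"}, {"A": "X", "B": "Z"}, {"A": "Y", "B": "Z"}, {"A": "Z", "B": "V"}]
S = [{"B": "Z", "C": "U"}, {"B": "V", "C": "W"}, {"B": "Z", "C": "V"}]

joined = natural_join(R, S)
# A tuple of R is dangling if no tuple of S shares its B value.
dangling = [t for t in R if not any(t["B"] == u["B"] for u in S)]

for row in joined:
    print(row["A"], row["B"], row["C"])
```

The join has five tuples: (X, Z, U), (X, Z, V), (Y, Z, U), (Y, Z, V) and (Z, V, W). The R tuple (X, Y) is dangling because no S tuple has B = Y.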

Natural Join
Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of $R \bowtie S$?
Given R(A, B, C), S(D, E), what is $R \bowtie S$?
Given R(A, B), S(A, B), what is $R \bowtie S$?

Join on arbitrary test (“Theta-join”)

U
A  B  C
1  2  3
6  7  8
9  7  8

V
B  C  D
2  3  4
2  3  5
7  8  10

$U \bowtie_{A<D} V$
A  U.B  U.C  V.B  V.C  D
1  2    3    2    3    4
1  2    3    2    3    5
1  2    3    7    8    10
6  7    8    7    8    10
9  7    8    7    8    10
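The theta-join above can be reproduced with a small sketch: take every pair of tuples, keep the pairs that pass the test A < D, and prefix only the attribute names that occur in both relations. The function name theta_join is an assumption for illustration.

```python
def theta_join(r, s, theta, rname="U", sname="V"):
    """Keep every pair (t1, t2) for which theta(t1, t2) holds; no columns are merged."""
    result = []
    for t1 in r:
        for t2 in s:
            if theta(t1, t2):
                row = {}
                # Prefix only the attribute names that appear on both sides.
                for k, v in t1.items():
                    row[f"{rname}.{k}" if k in t2 else k] = v
                for k, v in t2.items():
                    row[f"{sname}.{k}" if k in t1 else k] = v
                result.append(row)
    return result


U = [{"A": 1, "B": 2, "C": 3}, {"A": 6, "B": 7, "C": 8}, {"A": 9, "B": 7, "C": 8}]
V = [{"B": 2, "C": 3, "D": 4}, {"B": 2, "C": 3, "D": 5}, {"B": 7, "C": 8, "D": 10}]

rows = theta_join(U, V, lambda u, v: u["A"] < v["D"])
for row in rows:
    print(tuple(row.values()))
```

Unlike the natural join, the shared attributes B and C are not compared and are not merged, which is why both U.B and V.B appear in the result.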

Rename op
Changes the schema, not the instance
Notation: $\rho_{B_1,\dots,B_n}(R)$
  $\rho$ is spelled “rho”, pronounced “row”
Example: Employee(ssn, name)
  $\rho_{(\text{social},\ \text{name})}(\text{Employee})$
  Or just: $\rho_{\dots}(\text{Employee})$

RA ↔ SQL
  SQL SELECT ↔ RA Projection
  SQL WHERE ↔ RA Selection
  SQL FROM ↔ RA Join/cross
    Comma-separated list…
  SQL renaming ↔ RA rho
More ops later
Keep RA in the back of your mind…

Next: Joins in SQL
Connect two or more tables:

Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

Company
CName       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan

What is the connection between them?

Joins in SQL

Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)

Find all products under $200 manufactured in Japan; return their names and prices.

  SELECT PName, Price
  FROM Product, Company
  WHERE Manufacturer=CName AND Country='Japan' AND Price <= 200

Manufacturer=CName is the join between Product and Company.

Joins in SQL

Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

Company
Cname       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan

  SELECT PName, Price
  FROM Product, Company
  WHERE Manufacturer=CName AND Country='Japan' AND Price <= 200

Result
PName        Price
SingleTouch  $149.99
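The join can be checked against the two sample tables with Python's sqlite3 module. This is a sketch: the column types are assumptions, and prices are stored as plain numbers without the dollar sign.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slides give only names and sample rows.
conn.executescript("""
CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT);
CREATE TABLE Company (CName TEXT, StockPrice INTEGER, Country TEXT);
INSERT INTO Product VALUES
  ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
  ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
  ('SingleTouch', 149.99, 'Photography', 'Canon'),
  ('MultiTouch', 203.99, 'Household', 'Hitachi');
INSERT INTO Company VALUES
  ('GizmoWorks', 25, 'USA'),
  ('Canon', 65, 'Japan'),
  ('Hitachi', 15, 'Japan');
""")

rows = conn.execute("""
    SELECT PName, Price
    FROM Product, Company
    WHERE Manufacturer = CName      -- join condition
      AND Country = 'Japan'         -- selection on Company
      AND Price <= 200              -- selection on Product
""").fetchall()
print(rows)
```

MultiTouch is also made in Japan but costs 203.99, so only SingleTouch survives all three conditions.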

Joins in SQL

Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)

Find all countries that manufacture some product in the ‘Gadgets’ category.

  SELECT Country
  FROM Product, Company
  WHERE Manufacturer=CName AND Category='Gadgets'

Joins in SQL

Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

Company
Cname       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan

  SELECT Country
  FROM Product, Company
  WHERE Manufacturer=CName AND Category='Gadgets'

Result
Country
??
??

What is the problem? What’s the solution?
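Running the query on the sample data makes the issue visible. In the sketch below (column types assumed, prices stored as numbers) the same country comes back once per matching product, and adding DISTINCT is one standard way to collapse the repeats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT);
CREATE TABLE Company (CName TEXT, StockPrice INTEGER, Country TEXT);
INSERT INTO Product VALUES
  ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
  ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
  ('SingleTouch', 149.99, 'Photography', 'Canon'),
  ('MultiTouch', 203.99, 'Household', 'Hitachi');
INSERT INTO Company VALUES
  ('GizmoWorks', 25, 'USA'),
  ('Canon', 65, 'Japan'),
  ('Hitachi', 15, 'Japan');
""")

# Without DISTINCT: one output row per (product, company) pair that matches.
with_duplicates = conn.execute("""
    SELECT Country
    FROM Product, Company
    WHERE Manufacturer = CName AND Category = 'Gadgets'
""").fetchall()

# With DISTINCT: duplicate rows are removed from the result.
without_duplicates = conn.execute("""
    SELECT DISTINCT Country
    FROM Product, Company
    WHERE Manufacturer = CName AND Category = 'Gadgets'
""").fetchall()

print(with_duplicates, without_duplicates)
```

Both Gizmo and Powergizmo are Gadgets made by GizmoWorks, so USA is listed twice unless DISTINCT is used.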

Joins

Product (pname, price, category, manufacturer)
Purchase (buyer, seller, store, product)
Person (name, phone, city)

Find names of Seattleites who bought Gadgets, and the names of the stores they bought such products from.

  SELECT DISTINCT name, store
  FROM Person, Purchase, Product
  WHERE name=buyer AND product = pname AND city='Seattle' AND category='Gadgets'

SQL Query Semantics

Parallel assignment – all tuples
Doesn’t impose any order

  Answer = {}
  for all assignments x1 in R1, …, xn in Rn do
      if Conditions
          then Answer = Answer ∪ {(a1,…,ak)}
  return Answer

  SELECT a1, a2, …, ak
  FROM R1 AS x1, R2 AS x2, …, Rn AS xn
  WHERE Conditions

SQL Query Semantics

Nested loops:

  Answer = {}
  for x1 in R1 do
      for x2 in R2 do
          …..
              for xn in Rn do
                  if Conditions
                      then Answer = Answer ∪ {(a1,…,ak)}
  return Answer

  SELECT a1, a2, …, ak
  FROM R1 AS x1, R2 AS x2, …, Rn AS xn
  WHERE Conditions
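The nested-loop reading can be turned into a few lines of Python. This is a sketch of the semantics only, not of how a DBMS actually executes a query. The function name evaluate is an assumption, and the result is collected in a list, so duplicates are kept, as SQL does when DISTINCT is absent.

```python
from itertools import product as all_assignments


def evaluate(relations, condition, select_list):
    """Nested-loop meaning of SELECT ... FROM R1 AS x1, ..., Rn AS xn WHERE Conditions."""
    answer = []
    # itertools.product iterates exactly like n nested for-loops over R1, ..., Rn.
    for xs in all_assignments(*relations):
        if condition(*xs):
            answer.append(select_list(*xs))
    return answer


product_rows = [
    {"PName": "Gizmo", "Category": "Gadgets", "Manufacturer": "GizmoWorks"},
    {"PName": "Powergizmo", "Category": "Gadgets", "Manufacturer": "GizmoWorks"},
    {"PName": "SingleTouch", "Category": "Photography", "Manufacturer": "Canon"},
]
company_rows = [
    {"CName": "GizmoWorks", "Country": "USA"},
    {"CName": "Canon", "Country": "Japan"},
]

# SELECT Country FROM Product AS p, Company AS c
# WHERE p.Manufacturer = c.CName AND p.Category = 'Gadgets'
countries = evaluate(
    [product_rows, company_rows],
    lambda p, c: p["Manufacturer"] == c["CName"] and p["Category"] == "Gadgets",
    lambda p, c: c["Country"],
)
print(countries)
```

The loop touches all 3 * 2 = 6 combinations; two of them pass the condition, one per Gadgets product.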

Multiple join syntaxes
Old-style syntax simply lists tables separated by commas:

  SELECT *
  FROM A,B
  WHERE …;

New-style makes the join explicit:

  SELECT *
  FROM A JOIN B ON …
  WHERE …;

Functionally equivalent to old-style, but perhaps more elegant
Introduced in Oracle 8i, MySQL 3.x/4.x
  Older versions / other DBMSs may not support this

New-style join types
Cross joins (simplest):
  FROM A CROSS JOIN B
Inner joins (regular joins):
  FROM A [INNER] JOIN B ON …
Natural join:
  FROM A NATURAL JOIN B;
  Joins on common fields and merges
Outer joins (later)
  No dangling rows

CROSS JOIN e.g.

MovieStar
Name    Address       Gender  Birthdate
Hanks   123 Palm Rd   M       01/01/60
Taylor  456 Maple Av  F       02/02/40
Lucas   789 Oak St    M       03/03/55

MovieExec
Name       Address       Networth
Spielberg  246 Palm Rd   10M
Taylor     456 Maple Av  20M
Lucas      789 Oak St    30M

CROSS JOIN e.g.

  SELECT *
  FROM MovieStar CROSS JOIN MovieExec

Row  MovieStar.Name  MovieStar.Address  MovieStar.Gender  MovieStar.Birthdate  MovieExec.Name  MovieExec.Address  MovieExec.Networth
1    Hanks           123 Palm Rd        M                 01/01/60             Spielberg       246 Palm Rd        10M
2    Hanks           123 Palm Rd        M                 01/01/60             Taylor          456 Maple Av       20M
3    Hanks           123 Palm Rd        M                 01/01/60             Lucas           789 Oak St         30M
4    Taylor          456 Maple Av       F                 02/02/40             Spielberg       246 Palm Rd        10M
5    Taylor          456 Maple Av       F                 02/02/40             Taylor          456 Maple Av       20M
6    Taylor          456 Maple Av       F                 02/02/40             Lucas           789 Oak St         30M
7    Lucas           789 Oak St         M                 03/03/55             Spielberg       246 Palm Rd        10M
8    Lucas           789 Oak St         M                 03/03/55             Taylor          456 Maple Av       20M
9    Lucas           789 Oak St         M                 03/03/55             Lucas           789 Oak St         30M

JOIN … ON e.g.

  SELECT *
  FROM MovieStar JOIN MovieExec
  ON MovieStar.Name <> MovieExec.Name

Row  MovieStar.Name  MovieStar.Address  MovieStar.Gender  MovieStar.Birthdate  MovieExec.Name  MovieExec.Address  MovieExec.Networth
1    Hanks           123 Palm Rd        M                 01/01/60             Spielberg       246 Palm Rd        10M
2    Hanks           123 Palm Rd        M                 01/01/60             Taylor          456 Maple Av       20M
3    Hanks           123 Palm Rd        M                 01/01/60             Lucas           789 Oak St         30M
4    Taylor          456 Maple Av       F                 02/02/40             Spielberg       246 Palm Rd        10M
5    Taylor          456 Maple Av       F                 02/02/40             Taylor          456 Maple Av       20M
6    Taylor          456 Maple Av       F                 02/02/40             Lucas           789 Oak St         30M
7    Lucas           789 Oak St         M                 03/03/55             Spielberg       246 Palm Rd        10M
8    Lucas           789 Oak St         M                 03/03/55             Taylor          456 Maple Av       20M
9    Lucas           789 Oak St         M                 03/03/55             Lucas           789 Oak St         30M

Rows 5 and 9 pair a star with an exec of the same name, so they fail the ON condition and drop out of the result; the other seven rows remain.

NATURAL JOIN
MovieStar(name, address, gender, birthdate)
MovieExec(name, address, networth)
Natural Join syntax:
  FROM MovieStar NATURAL JOIN MovieExec
Results: list of movie stars who are also execs:
  (Name, address, gender, birthdate, networth)

NATURAL JOIN e.g.

MovieStar
Name    Address       Gender  Birthdate
Hanks   123 Palm Rd   M       01/01/60
Taylor  456 Maple Av  F       02/02/40
Lucas   789 Oak St    M       03/03/55

MovieExec
Name       Address       Networth
Spielberg  246 Palm Rd   10M
Taylor     456 Maple Av  20M
Lucas      789 Oak St    30M

  SELECT * FROM MovieStar NATURAL JOIN MovieExec

Name    Address       Gender  Birthdate  Networth
Taylor  456 Maple Av  F       02/02/40   20M
Lucas   789 Oak St    M       03/03/55   30M
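The three new-style joins from the MovieStar/MovieExec slides can be compared side by side in SQLite, which accepts CROSS JOIN, JOIN … ON and NATURAL JOIN. This is a sketch: every column is stored as TEXT, which is an assumption, and other DBMSs may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MovieStar (Name TEXT, Address TEXT, Gender TEXT, Birthdate TEXT);
CREATE TABLE MovieExec (Name TEXT, Address TEXT, Networth TEXT);
INSERT INTO MovieStar VALUES
  ('Hanks', '123 Palm Rd', 'M', '01/01/60'),
  ('Taylor', '456 Maple Av', 'F', '02/02/40'),
  ('Lucas', '789 Oak St', 'M', '03/03/55');
INSERT INTO MovieExec VALUES
  ('Spielberg', '246 Palm Rd', '10M'),
  ('Taylor', '456 Maple Av', '20M'),
  ('Lucas', '789 Oak St', '30M');
""")

# Every star paired with every exec.
cross = conn.execute("SELECT * FROM MovieStar CROSS JOIN MovieExec").fetchall()

# Inner join with an arbitrary boolean test in the ON clause.
different = conn.execute("""
    SELECT * FROM MovieStar JOIN MovieExec
    ON MovieStar.Name <> MovieExec.Name
""").fetchall()

# Natural join: equality on the common columns (Name, Address), each kept once.
natural = sorted(conn.execute("SELECT * FROM MovieStar NATURAL JOIN MovieExec").fetchall())

print(len(cross), len(different))
print(natural)
```

The cross join has 3 * 3 = 9 rows, the <> join drops the two same-name pairs, and the natural join keeps only Taylor and Lucas, who appear in both tables with the same name and address.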

Another complex example
People(ssn, name, street, city, state)
Q: Who lives on George’s street?
A: First, generate pairs of (renamed) people:
     $\rho_{p1}(\text{People}) \times \rho_{p2}(\text{People})$
   Then pick out pairs with George:
     $\sigma_{\text{p1.name='George'}}(\rho_{p1}(\text{People}) \times \rho_{p2}(\text{People}))$
   And refine to rows with George and someone else on the same street:
     $\sigma_{\text{p1.name='George' AND p1.name<>p2.name AND p1.street=p2.street}}(\rho_{p1}(\text{People}) \times \rho_{p2}(\text{People}))$
   Finally, project out the names:
     $\pi_{\text{p2.name}}(\sigma_{\text{p1.name='George' AND p1.name<>p2.name AND p1.street=p2.street}}(\rho_{p1}(\text{People}) \times \rho_{p2}(\text{People})))$
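The same query can be sketched in SQL as a self-join, with the two aliases p1 and p2 playing the role of the renamed copies. The sample rows are invented for illustration. The WHERE clause includes a same-street test (p1.street = p2.street), since without it every non-George person would qualify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical People instance; only the attribute names come from the slide.
conn.executescript("""
CREATE TABLE People (ssn TEXT, name TEXT, street TEXT, city TEXT, state TEXT);
INSERT INTO People VALUES
  ('1', 'George', 'Main St', 'Springfield', 'IL'),
  ('2', 'Martha', 'Main St', 'Springfield', 'IL'),
  ('3', 'John',   'Elm St',  'Springfield', 'IL');
""")

neighbors = conn.execute("""
    SELECT p2.name                      -- projection onto p2.name
    FROM People AS p1, People AS p2     -- cross product of two renamed copies
    WHERE p1.name = 'George'            -- pairs with George
      AND p1.name <> p2.name            -- ... and someone else
      AND p1.street = p2.street         -- ... on the same street
""").fetchall()
print(neighbors)
```

A fuller version would also compare city and state, since two different cities can each have a street with the same name.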

Live examples
Q: produce a list of employees and their bosses
  What if no boss? Or no subordinate?
Joins on emp, emp man:
  Comma-based
  Inner
  Natural
  Cross
  Outer – left, right, full
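A sketch of the employees-and-bosses query as a self-join. The slides do not give the schema of emp, so the columns empno, ename and mgr and the three rows are assumptions. It contrasts an inner join, which drops the employee with no boss, with a left outer join, which keeps that employee and fills in NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: mgr holds the empno of the employee's boss (NULL if none).
conn.executescript("""
CREATE TABLE emp (empno INTEGER, ename TEXT, mgr INTEGER);
INSERT INTO emp VALUES (1, 'King', NULL), (2, 'Blake', 1), (3, 'Allen', 2);
""")

inner = conn.execute("""
    SELECT e.ename, b.ename
    FROM emp AS e JOIN emp AS b ON e.mgr = b.empno
    ORDER BY e.empno
""").fetchall()

left = conn.execute("""
    SELECT e.ename, b.ename
    FROM emp AS e LEFT OUTER JOIN emp AS b ON e.mgr = b.empno
    ORDER BY e.empno
""").fetchall()

print(inner)
print(left)
```

King has no boss, so King is a dangling row: absent from the inner join, present with a NULL boss (None in Python) in the left outer join.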

More live examples
Inner joins require an ON clause
  Like a where clause
  Arbitrary boolean expression
  If always true (1=1), reduces to cross join
New compar op: BETWEEN
  a between 5 and 10 is equivalent to a >= 5 and a <= 10
Q: produce a list of employees with their salary grades
  emp, salgrade
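A sketch of the salary-grade query. The slides name only the tables emp and salgrade, so the columns (ename, sal, grade, losal, hisal) and all rows here are assumptions. The same join is written once with BETWEEN and once with the expanded pair of comparisons, to show they return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas and data; each grade covers an inclusive salary range.
conn.executescript("""
CREATE TABLE emp (ename TEXT, sal INTEGER);
CREATE TABLE salgrade (grade INTEGER, losal INTEGER, hisal INTEGER);
INSERT INTO emp VALUES ('Smith', 800), ('Allen', 1600), ('King', 5000);
INSERT INTO salgrade VALUES (1, 700, 1200), (2, 1201, 2000), (3, 2001, 9999);
""")

with_between = conn.execute("""
    SELECT ename, grade
    FROM emp JOIN salgrade ON sal BETWEEN losal AND hisal
    ORDER BY ename
""").fetchall()

expanded = conn.execute("""
    SELECT ename, grade
    FROM emp JOIN salgrade ON sal >= losal AND sal <= hisal
    ORDER BY ename
""").fetchall()

print(with_between)
```

The ON clause here is not an equality test at all, which illustrates the point above that an inner join accepts an arbitrary boolean expression.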

Review
Examples from sqlzoo.net

  SELECT L
  FROM R1, …, Rn
  WHERE C

  $\pi_{L}(\sigma_{C}(R_1 \times \dots \times R_n))$