1 database systems lecture #6 yan pan school of software, sysu 2011
1
Database Systems
Lecture #6
Yan Pan
School of Software, SYSU
2011
2
Agenda Basic SQL RA…
3
Recap: You are here
First part of course is done: conceptual foundations
You now know:
  E/R Model
  Relational Model
  Relational Algebra (a little)
You now know how to:
  Capture part of world as an E/R model
  Convert E/R models to relational models
  Convert relational models to good (normal) forms
Next:
  Create, update, query tables with R.A/SQL
  Write SQL/DB-connected applications
4
3-minute Normalization Review
1. Q: What’s required for BCNF?
2. Q: How do we fix a non-BCNF relation?
3. Q: If As → Bs violates BCNF, what do we do?
4. Q: Can BCNF decomposition ever be lossy?
5. Q: How do we combine two relations?
6. Q: Can BCNF decomp. lose FDs?
7. Q: Why would you ever use 3NF?
5
Next topic: SQL
Standard language for querying and manipulating data
Structured Query Language
Many standards: ANSI SQL, SQL92/SQL2, SQL3/SQL99
Originally: Structured English Query Language (SEQUEL)
Vendors support various subsets/extensions
We’ll do Oracle/MySQL/generic
“No one ever got fired for buying Oracle.”
Basic form (many more bells and whistles in addition):
SELECT attributes
FROM relations (possibly multiple, joined)
WHERE conditions (selections)
6
Data Types in SQL
Characters:
  CHAR(20) -- fixed length
  VARCHAR(40) -- variable length
Numbers:
  BIGINT, INT, SMALLINT, TINYINT
  REAL, FLOAT -- differ in precision
  MONEY
Times and dates:
  DATE
  DATETIME -- SQL Server
7
“Tables”
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Product
Attribute names
Table name
Tuples or rows
8
Simple SQL Query
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
SELECT *
FROM Product
WHERE category='Gadgets'
Product
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
“selection”
9
Simple SQL Query
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
SELECT PName, Price, Manufacturer
FROM Product
WHERE Price > 100
Product
PName Price Manufacturer
SingleTouch $149.99 Canon
MultiTouch $203.99 Hitachi
“selection” and “projection”
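The two queries on these slides can be tried against an in-memory database. This is a minimal sketch using Python's standard `sqlite3` module; the column types (`TEXT`, `REAL`) and the `ORDER BY` clauses are assumptions added here so the output order is predictable, and SQLite stands in for the Oracle/MySQL systems the lecture targets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)

# Selection only: keep whole rows whose category is 'Gadgets'.
gadgets = conn.execute(
    "SELECT * FROM Product WHERE Category = 'Gadgets' ORDER BY PName"
).fetchall()

# Selection and projection: filter rows, then keep three of the columns.
expensive = conn.execute(
    "SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100 ORDER BY PName"
).fetchall()

print(gadgets)
print(expensive)
```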
10
A Notation for SQL Queries
SELECT PName, Price, Manufacturer
FROM Product
WHERE Price > 100
Input Relation: Product(PName, Price, Category, Manufacturer)
Output Relation: (PName, Price, Manufacturer)
Guifeng Zheng, DBMS, SS/SYSU
11
The WHERE clause
Contains a boolean expression
Each literal is a test: x = y, x < y, x <= y, etc.
For numbers, they have the usual meanings
For CHARs/VARCHARs: lexicographic ordering
Expected conversion between CHAR and VARCHAR
For dates and times, what you expect
12
Complex RA Expressions
Schema: Movies (Title, year, length, inColor, studioName, Prdcr#)
Q: How long was Star Wars (1977)?
Strategy: find the row with Star Wars; then project the length field
Title Year Length inColor Studio Prdcr#
Star Wars 1977 124 True Fox 12345
M.Ducks 1991 104 True Disney 67890
W.World 1992 95 True Paramount 99999
13
Combining operations
Query: Which Fox movies were >= 100 minutes long?
Title Year Length Filmtype Studio
Star wars 1977 124 Color Fox
Mighty ducks 1991 104 Color Disney
Wayne’s world 1992 85 Color Paramount
14
Next (parallel) topic: relational algebra
Projection
Selection
Cartesian Product
Joins: natural joins, theta joins
Set operations: union, intersection, difference
Combining operations to form queries
Dependent and independent operations
15
What is relational algebra?
An algebra for relations
“High-school” algebra: an algebra for numbers
Algebra = formalism for constructing expressions
Operations
Operands: variables, constants, expressions
Expressions:
  Vars & constants
  Operators applied to expressions
  They evaluate to values
Algebra      Vars/consts                  Operators                        Eval to
High-school  Numbers                      + * - / etc.                     Numbers
Relational   Relations (=sets of tuples)  union, intersection, join, etc.  Relations
16
Why do we care about relational algebra?
1. The exprs are the form that questions about the data take
   The relations these exprs cash out to are the answers to our questions
2. RA ~ more succinct rep. of many SQL queries
3. DBMSs parse SQL into something like RA
First proofs of concept for RDBMS/RA:
  System R at IBM
  Ingres at Berkeley
“Modern” implementation of RA: SQL
Both state of the art, mid-70s
17
Relation operators
Basic operators:
  Selection: $\sigma$
  Projection: $\pi$
  Cartesian Product: $\times$
Other set-theoretic ops:
  Union: $\cup$
  Intersection: $\cap$
  Difference: $-$
Additional operators:
  Joins (natural, equijoin, theta join, semijoin)
  Renaming: $\rho$
  Grouping…
18
Selection op
Selects all tuples satisfying a condition
Notation: $\sigma_{c}(R)$
Examples:
  $\sigma_{salary > 100000}(Employee)$
  $\sigma_{name = \text{“Smith”}}(Employee)$
The condition c can have
  comparison ops: =, <, ≤, >, ≥, <>
  boolean ops: and, or
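To make the selection operator concrete, here is a minimal Python sketch that treats a relation as a list of dicts and the condition c as a predicate. The function name `select` and the `employee` rows are hypothetical, chosen only to mirror the two examples on this slide.

```python
def select(relation, condition):
    """sigma_c(R): keep the tuples of `relation` for which `condition` holds."""
    return [t for t in relation if condition(t)]

# Hypothetical Employee instance, for illustration only.
employee = [
    {"name": "Smith", "salary": 120000},
    {"name": "Jones", "salary": 90000},
    {"name": "Smith", "salary": 80000},
]

high_paid = select(employee, lambda t: t["salary"] > 100000)
smiths = select(employee, lambda t: t["name"] == "Smith")
# Conditions can be combined with boolean ops such as `and`.
both = select(employee, lambda t: t["name"] == "Smith" and t["salary"] > 100000)

print(high_paid, smiths, both, sep="\n")
```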
19
Selection example
Select the movies at Sunshine: $\sigma_{Theater = \text{“Sunshine”}}(Showings)$
Showings
Theater     Title       N’hood
Sunshine    Annie Hall  Village
Sunshine    Bad Edu.    Village
Film Forum  Masc. Fem.  Village
Result
Theater     Title       N’hood
Sunshine    Annie Hall  Village
Sunshine    Bad Edu.    Village
20
Projection op
Keep only certain columns
Projection: op we used for decomposition
Eliminates other columns, then removes duplicates
Notation: $\pi_{A_1,\ldots,A_n}(R)$
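The "then removes duplicates" step is easy to overlook, so here is a small Python sketch of projection over a list of dicts. The function name `project` is an assumption made for illustration; the rows are the Product table from the earlier slides.

```python
def project(relation, attributes):
    """pi_{A1..An}(R): keep the named columns, then drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:          # duplicate elimination
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

product = [
    {"PName": "Gizmo", "Price": 19.99, "Category": "Gadgets", "Manufacturer": "GizmoWorks"},
    {"PName": "Powergizmo", "Price": 29.99, "Category": "Gadgets", "Manufacturer": "GizmoWorks"},
    {"PName": "SingleTouch", "Price": 149.99, "Category": "Photography", "Manufacturer": "Canon"},
    {"PName": "MultiTouch", "Price": 203.99, "Category": "Household", "Manufacturer": "Hitachi"},
]

# Four input rows, but GizmoWorks appears only once in the result.
manufacturers = project(product, ["Manufacturer"])
print(manufacturers)
```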
21
Cartesian product op
Cross product again
“Cartesian Product”
Each tuple in R1 combines w/each tuple in R2
Algebraic notation: $R_1 \times R_2$
Not actual SQL!
If R1, R2 fields overlap, include both and disambiguate: R1.A, R2.A
Q: Where does the name come from?
Q: If R1 has n1 rows and R2 has n2, how large is R1 x R2?
22
Cartesian product example
Street City
333 Some Street Chappaqua
444 Embassy Row Washington
Hillary-addresses
Job
Senator
First Lady
Lawyer
Hillary-jobs
Street City Job
333 Some Street Chappaqua Senator
444 Embassy Row Washington Senator
333 Some Street Chappaqua First Lady
444 Embassy Row Washington First Lady
333 Some Street Chappaqua Lawyer
444 Embassy Row Washington Lawyer
Hillary-addresses x Hillary-jobs
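The example above can be reproduced with `itertools.product` from the Python standard library. This is only a sketch in which relations are lists of tuples; it also lets you check the row count of a cross product against the sizes of its inputs.

```python
from itertools import product

addresses = [("333 Some Street", "Chappaqua"), ("444 Embassy Row", "Washington")]
jobs = [("Senator",), ("First Lady",), ("Lawyer",)]

# Each tuple of the first relation is combined with each tuple of the second.
cross = [a + j for a, j in product(addresses, jobs)]

for row in cross:
    print(row)
print(len(addresses), "x", len(jobs), "=", len(cross))
```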
23
Join op
Corresponds to SQL query doing cross & equality test
Specifically: $R_1 \bowtie R_2 = \pi_{\text{every att once}}(\sigma_{\text{shared atts} =}(R_1 \times R_2))$
I.e., first compute the cross product R1 x R2
Next, select the rows in which shared fields agree
Finally, project onto the union of R1 and R2’s fields (remove duplicates)
24
Natural join example
Name Street City
Hilary 333 Some Street Chappaqua
Hilary 444 Embassy Row Washington
Bill Somewhere else ddd
Addresses
Name Job
Hilary Senator
Hilary First Lady
Hilary Lawyer
Jobs
Addresses $\bowtie$ Jobs
Name Street City Job
Hilary 333 Some Street Chappaqua Senator
Hilary 444 Embassy Row Washington Senator
Hilary 333 Some Street Chappaqua First Lady
Hilary 444 Embassy Row Washington First Lady
Hilary 333 Some Street Chappaqua Lawyer
Hilary 444 Embassy Row Washington Lawyer
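The natural join can be sketched directly from its three-step definition: cross product, select where shared fields agree, then keep each attribute once. The function name `natural_join` and the list-of-dicts representation are assumptions made for illustration; a real DBMS would not materialize the full cross product.

```python
from itertools import product

def natural_join(r1, r2):
    """Natural join of two relations given as lists of dicts."""
    if not r1 or not r2:
        return []
    shared = [a for a in r1[0] if a in r2[0]]
    result = []
    for t1, t2 in product(r1, r2):                # step 1: cross product
        if all(t1[a] == t2[a] for a in shared):   # step 2: shared fields agree
            merged = {**t1, **t2}                 # step 3: every attribute once
            if merged not in result:              # remove duplicates
                result.append(merged)
    return result

addresses = [
    {"Name": "Hilary", "Street": "333 Some Street", "City": "Chappaqua"},
    {"Name": "Hilary", "Street": "444 Embassy Row", "City": "Washington"},
    {"Name": "Bill", "Street": "Somewhere else", "City": "ddd"},
]
jobs = [
    {"Name": "Hilary", "Job": "Senator"},
    {"Name": "Hilary", "Job": "First Lady"},
    {"Name": "Hilary", "Job": "Lawyer"},
]

joined = natural_join(addresses, jobs)
for row in joined:
    print(row)
```

Bill has no matching row in Jobs, so his address does not appear in the result.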
25
Natural Join $R \bowtie S$
R
A B
X Y
X Z
Y Z
Z V
S
B C
Z U
V W
Z V
$R \bowtie S$ = ?
Unpaired tuples called dangling
26
Natural Join
Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of $R \bowtie S$?
Given R(A, B, C), S(D, E), what is $R \bowtie S$?
Given R(A, B), S(A, B), what is $R \bowtie S$?
27
Join on arbitrary test
U
A B C
1 2 3
6 7 8
9 7 8
V
B C D
2 3 4
2 3 5
7 8 10
$U \bowtie_{A<D} V$
A U.B U.C V.B V.C D
1 2 3 2 3 4
1 2 3 2 3 5
1 2 3 7 8 10
6 7 8 7 8 10
9 7 8 7 8 10
“Theta-join”
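A theta-join is a cross product followed by a selection on an arbitrary test. The sketch below recomputes the A<D example with relations as lists of tuples; the helper name `theta_join` and the positional column access are choices made for this illustration.

```python
from itertools import product

U = [(1, 2, 3), (6, 7, 8), (9, 7, 8)]     # columns A, B, C
V = [(2, 3, 4), (2, 3, 5), (7, 8, 10)]    # columns B, C, D

def theta_join(r1, r2, theta):
    """Cross product of r1 and r2, keeping only pairs that satisfy theta."""
    return [t1 + t2 for t1, t2 in product(r1, r2) if theta(t1, t2)]

# A is position 0 of a U tuple; D is position 2 of a V tuple.
result = theta_join(U, V, lambda u, v: u[0] < v[2])
for row in result:
    print(row)
```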
28
Rename op
Changes the schema, not the instance
Notation: $\rho_{B_1,\ldots,B_n}(R)$
$\rho$ is spelled “rho”, pronounced “row”
Example: Employee(ssn, name)
$\rho_{(social,\ name)}(Employee)$
Or just: (Employee)
29
RA ↔ SQL
SQL SELECT ↔ RA Projection
SQL WHERE ↔ RA Selection
SQL FROM ↔ RA Join/cross
  Comma-separated list…
SQL renaming ↔ RA rho
More ops later
Keep RA in the back of your mind…
30
Next: Joins in SQL Connect two or more tables:
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Product
Company
CName StockPrice Country
GizmoWorks 25 USA
Canon 65 Japan
Hitachi 15 Japan
What is the connection between them?
31
Joins in SQL
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
Find all products under $200 manufactured in Japan; return their names and prices.
SELECT PName, Price
FROM Product, Company
WHERE Manufacturer=CName AND Country='Japan' AND Price <= 200
Join between Product and Company
32
Joins in SQL
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Product
Company
Cname StockPrice Country
GizmoWorks 25 USA
Canon 65 Japan
Hitachi 15 Japan
PName Price
SingleTouch $149.99
SELECT PName, Price
FROM Product, Company
WHERE Manufacturer=CName AND Country='Japan' AND Price <= 200
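The join above can be run end to end with Python's `sqlite3` module. This is a sketch under assumptions: the column types are chosen here, prices are stored as plain numbers without the dollar sign, and SQLite stands in for Oracle/MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT);
CREATE TABLE Company (CName TEXT, StockPrice REAL, Country TEXT);
INSERT INTO Product VALUES
  ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
  ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
  ('SingleTouch', 149.99, 'Photography', 'Canon'),
  ('MultiTouch', 203.99, 'Household', 'Hitachi');
INSERT INTO Company VALUES
  ('GizmoWorks', 25, 'USA'), ('Canon', 65, 'Japan'), ('Hitachi', 15, 'Japan');
""")

# Manufacturer = CName is the join condition; the other two tests are selections.
rows = conn.execute("""
    SELECT PName, Price
    FROM Product, Company
    WHERE Manufacturer = CName AND Country = 'Japan' AND Price <= 200
""").fetchall()
print(rows)
```

MultiTouch is also made in Japan, but its price is above 200, so only SingleTouch is returned.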
33
Joins in SQL
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
Find all countries that manufacture some product in the ‘Gadgets’ category.
SELECT Country
FROM Product, Company
WHERE Manufacturer=CName AND Category='Gadgets'
34
Joins in SQL
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Product
Company
Cname StockPrice Country
GizmoWorks 25 USA
Canon 65 Japan
Hitachi 15 Japan
Country
??
??
What is the problem? What’s the solution?
SELECT Country
FROM Product, Company
WHERE Manufacturer=CName AND Category='Gadgets'
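One way to explore the question on this slide is to run the query and look at what comes back. The sketch below uses Python's `sqlite3` module with the same two tables; the column types are assumptions, and the second query adds `DISTINCT`, which is one possible fix rather than necessarily the one intended in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT);
CREATE TABLE Company (CName TEXT, StockPrice REAL, Country TEXT);
INSERT INTO Product VALUES
  ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
  ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
  ('SingleTouch', 149.99, 'Photography', 'Canon'),
  ('MultiTouch', 203.99, 'Household', 'Hitachi');
INSERT INTO Company VALUES
  ('GizmoWorks', 25, 'USA'), ('Canon', 65, 'Japan'), ('Hitachi', 15, 'Japan');
""")

# SQL keeps one output row per matching (product, company) pair.
with_duplicates = conn.execute("""
    SELECT Country FROM Product, Company
    WHERE Manufacturer = CName AND Category = 'Gadgets'
""").fetchall()

# DISTINCT collapses repeated rows in the output.
distinct = conn.execute("""
    SELECT DISTINCT Country FROM Product, Company
    WHERE Manufacturer = CName AND Category = 'Gadgets'
""").fetchall()

print(with_duplicates)
print(distinct)
```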
35
Joins
Product (pname, price, category, manufacturer)
Purchase (buyer, seller, store, product)
Person (name, phone, city)
Find names of Seattleites who bought Gadgets, and the names of the stores they bought such products from.
SELECT DISTINCT name, store
FROM Person, Purchase, Product
WHERE name=buyer AND product=pname AND city='Seattle' AND category='Gadgets'
36
SQL Query Semantics
Parallel assignment – all tuples
Doesn’t impose any order
Answer = {}
for all assignments x1 in R1, …, xn in Rn do
    if Conditions then Answer = Answer ∪ {(a1,…,ak)}
return Answer
SELECT a1, a2, …, ak
FROM R1 AS x1, R2 AS x2, …, Rn AS xn
WHERE Conditions
37
SQL Query Semantics
Nested loops:
Answer = {}
for x1 in R1 do
    for x2 in R2 do
        …..
            for xn in Rn do
                if Conditions then Answer = Answer ∪ {(a1,…,ak)}
return Answer
SELECT a1, a2, …, ak
FROM R1 AS x1, R2 AS x2, …, Rn AS xn
WHERE Conditions
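The nested-loop reading of SELECT-FROM-WHERE can be written as a few lines of Python. This is a sketch of the semantics only, not of how a DBMS actually executes a query; the function name `evaluate` is hypothetical, and the answer is collected in a list, so duplicates are kept (as SQL does without DISTINCT) rather than merged as in the set union on the slide.

```python
from itertools import product as all_assignments

def evaluate(relations, conditions, select_list):
    """Try every combination of one tuple per relation (nested loops)."""
    answer = []
    for xs in all_assignments(*relations):   # x1 in R1, ..., xn in Rn
        if conditions(*xs):                  # WHERE Conditions
            answer.append(select_list(*xs))  # SELECT a1, ..., ak
    return answer

# Columns: PName, Price, Category, Manufacturer
Product = [
    ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
    ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
    ("SingleTouch", 149.99, "Photography", "Canon"),
    ("MultiTouch", 203.99, "Household", "Hitachi"),
]
# Columns: CName, StockPrice, Country
Company = [("GizmoWorks", 25, "USA"), ("Canon", 65, "Japan"), ("Hitachi", 15, "Japan")]

result = evaluate(
    [Product, Company],
    lambda p, c: p[3] == c[0] and c[2] == "Japan" and p[1] <= 200,
    lambda p, c: (p[0], p[1]),
)
print(result)
```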
38
Multiple join syntaxes
Old-style syntax simply lists tables separated by commas
New-style makes the join explicit:
Functionally equivalent to old-style, but perhaps more elegant
Introduced in Oracle 9i, MySQL 3.x/4.x
Older versions / other DBMSs may not support this
SELECT *
FROM A,B
WHERE …;
SELECT *
FROM A JOIN B ON …
WHERE …;
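To see that the two syntaxes return the same rows, the sketch below runs the same join both ways in SQLite via Python's `sqlite3` module. The tables reuse the Product/Company data from earlier slides; the column types and the choice of SQLite are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT);
CREATE TABLE Company (CName TEXT, StockPrice REAL, Country TEXT);
INSERT INTO Product VALUES
  ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
  ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
  ('SingleTouch', 149.99, 'Photography', 'Canon'),
  ('MultiTouch', 203.99, 'Household', 'Hitachi');
INSERT INTO Company VALUES
  ('GizmoWorks', 25, 'USA'), ('Canon', 65, 'Japan'), ('Hitachi', 15, 'Japan');
""")

# Old style: comma-separated tables, join condition in WHERE.
old_style = conn.execute("""
    SELECT PName, Country FROM Product, Company
    WHERE Manufacturer = CName
""").fetchall()

# New style: explicit JOIN with the join condition in ON.
new_style = conn.execute("""
    SELECT PName, Country FROM Product JOIN Company
    ON Manufacturer = CName
""").fetchall()

print(sorted(old_style) == sorted(new_style))
```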
39
New-style join types
Cross joins (simplest): FROM A CROSS JOIN B
Inner joins (regular joins): FROM A [INNER] JOIN B ON …
Natural join: FROM A NATURAL JOIN B
  Joins on common fields and merges
Outer joins (later)
  No dangling rows
40
CROSS JOIN e.g.
Name Address Gender Birthdate
Hanks 123 Palm Rd M 01/01/60
Taylor 456 Maple Av F 02/02/40
Lucas 789 Oak St M 03/03/55
Name Address Networth
Spielberg 246 Palm Rd 10M
Taylor 456 Maple Av 20M
Lucas 789 Oak St 30M
MovieStar
MovieExec
41
CROSS JOIN e.g.
Row  MovieStar.Name  MovieStar.Address  MovieStar.Gender  MovieStar.Birthdate  MovieExec.Name  MovieExec.Address  MovieExec.Networth
1    Hanks           123 Palm Rd        M                 01/01/60             Spielberg       246 Palm Rd        10M
2    Hanks           123 Palm Rd        M                 01/01/60             Taylor          456 Maple Av       20M
3    Hanks           123 Palm Rd        M                 01/01/60             Lucas           789 Oak St         30M
4    Taylor          456 Maple Av       F                 02/02/40             Spielberg       246 Palm Rd        10M
5    Taylor          456 Maple Av       F                 02/02/40             Taylor          456 Maple Av       20M
6    Taylor          456 Maple Av       F                 02/02/40             Lucas           789 Oak St         30M
7    Lucas           789 Oak St         M                 03/03/55             Spielberg       246 Palm Rd        10M
8    Lucas           789 Oak St         M                 03/03/55             Taylor          456 Maple Av       20M
9    Lucas           789 Oak St         M                 03/03/55             Lucas           789 Oak St         30M
SELECT *
FROM MovieStar CROSS JOIN MovieExec
42
JOIN … ON e.g
Row  MovieStar.Name  MovieStar.Address  MovieStar.Gender  MovieStar.Birthdate  MovieExec.Name  MovieExec.Address  MovieExec.Networth
1    Hanks           123 Palm Rd        M                 01/01/60             Spielberg       246 Palm Rd        10M
2    Hanks           123 Palm Rd        M                 01/01/60             Taylor          456 Maple Av       20M
3    Hanks           123 Palm Rd        M                 01/01/60             Lucas           789 Oak St         30M
4    Taylor          456 Maple Av       F                 02/02/40             Spielberg       246 Palm Rd        10M
5    Taylor          456 Maple Av       F                 02/02/40             Taylor          456 Maple Av       20M
6    Taylor          456 Maple Av       F                 02/02/40             Lucas           789 Oak St         30M
7    Lucas           789 Oak St         M                 03/03/55             Spielberg       246 Palm Rd        10M
8    Lucas           789 Oak St         M                 03/03/55             Taylor          456 Maple Av       20M
9    Lucas           789 Oak St         M                 03/03/55             Lucas           789 Oak St         30M
Rows 5 and 9 pair a star with an exec of the same name, so they fail the ON condition; the query returns the other seven rows.
SELECT *
FROM MovieStar JOIN MovieExec
ON MovieStar.Name <> MovieExec.Name
43
NATURAL JOIN
MovieStar(name, address, gender, birthdate)
MovieExec(name, address, networth)
Natural Join syntax: FROM MovieStar NATURAL JOIN MovieExec
Results: list of movie stars who are also execs: (Name, address, gender, birthdate, networth)
44
NATURAL JOIN e.g.
Name Address Gender Birthdate
Hanks 123 Palm Rd M 01/01/60
Taylor 456 Maple Av F 02/02/40
Lucas 789 Oak St M 03/03/55
Name Address Networth
Spielberg 246 Palm Rd 10M
Taylor 456 Maple Av 20M
Lucas 789 Oak St 30M
MovieStar
MovieExec
Name Address Gender Birthdate Networth
Taylor 456 Maple Av F 02/02/40 20M
Lucas 789 Oak St M 03/03/55 30M
SELECT * FROM MovieStar NATURAL JOIN MovieExec
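The three new-style joins on MovieStar and MovieExec can be compared side by side. The sketch below uses Python's `sqlite3` module; storing every column as `TEXT` is an assumption, and SQLite is used only because it accepts all three syntaxes out of the box.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MovieStar (Name TEXT, Address TEXT, Gender TEXT, Birthdate TEXT);
CREATE TABLE MovieExec (Name TEXT, Address TEXT, Networth TEXT);
INSERT INTO MovieStar VALUES
  ('Hanks', '123 Palm Rd', 'M', '01/01/60'),
  ('Taylor', '456 Maple Av', 'F', '02/02/40'),
  ('Lucas', '789 Oak St', 'M', '03/03/55');
INSERT INTO MovieExec VALUES
  ('Spielberg', '246 Palm Rd', '10M'),
  ('Taylor', '456 Maple Av', '20M'),
  ('Lucas', '789 Oak St', '30M');
""")

def run(sql):
    """Execute one query and return all of its rows."""
    return conn.execute(sql).fetchall()

cross_rows = run("SELECT * FROM MovieStar CROSS JOIN MovieExec")
theta_rows = run(
    "SELECT * FROM MovieStar JOIN MovieExec ON MovieStar.Name <> MovieExec.Name"
)
# NATURAL JOIN matches on the shared columns (Name and Address) and lists them once.
natural_rows = run("SELECT * FROM MovieStar NATURAL JOIN MovieExec")

print(len(cross_rows), len(theta_rows), len(natural_rows))
```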
45
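Another complex example
People(ssn, name, street, city, state)
Q: Who lives on George’s street?
A: First, generate pairs of (renamed) people: $\rho_{p1}(People) \times \rho_{p2}(People)$
Then pick out pairs with George: $\sigma_{p1.name='George'}(\rho_{p1}(People) \times \rho_{p2}(People))$
And refine to rows with George and someone else at the same street, city, and state: $\sigma_{p1.name='George' \text{ AND } p1.name<>p2.name \text{ AND } p1.street=p2.street \text{ AND } p1.city=p2.city \text{ AND } p1.state=p2.state}(\rho_{p1}(People) \times \rho_{p2}(People))$
Finally, project out the names: $\pi_{p2.name}(\sigma_{p1.name='George' \text{ AND } p1.name<>p2.name \text{ AND } p1.street=p2.street \text{ AND } p1.city=p2.city \text{ AND } p1.state=p2.state}(\rho_{p1}(People) \times \rho_{p2}(People)))$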
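In SQL, the renaming step becomes a pair of table aliases on the same table (a self-join). The sketch below uses Python's `sqlite3` module; the People rows are made up for illustration, and the query compares street, city, and state explicitly, since "the same street" only makes sense within the same city and state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (ssn INTEGER, name TEXT, street TEXT, city TEXT, state TEXT);
-- Hypothetical rows, for illustration only.
INSERT INTO People VALUES
  (1, 'George', '1 Main St', 'Springfield', 'IL'),
  (2, 'Alice',  '1 Main St', 'Springfield', 'IL'),
  (3, 'Bob',    '9 Elm St',  'Springfield', 'IL'),
  (4, 'Carol',  '1 Main St', 'Portland',    'OR');
""")

neighbors = conn.execute("""
    SELECT p2.name
    FROM People AS p1, People AS p2          -- rho_p1(People) x rho_p2(People)
    WHERE p1.name = 'George'
      AND p1.name <> p2.name                 -- someone other than George
      AND p1.street = p2.street
      AND p1.city = p2.city
      AND p1.state = p2.state
""").fetchall()
print(neighbors)
```

Carol shares the street name but lives in a different city, so only Alice is returned.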
46
Live examples
Q: produce a list of employees and their bosses
What if no boss? Or no subordinate?
Joins on emp, emp man:
  Comma-based
  Inner
  Natural
  Cross
  Outer – left, right, full
47
More live examples
Inner joins require an ON clause
  Like a WHERE clause
  Arbitrary boolean expression
  If always true (1=1), reduces to cross join
New compar op: BETWEEN
  a BETWEEN 5 AND 10 is equivalent to a >= 5 AND a <= 10
Q: produce a list of employees with their salary grades
  emp, salgrade
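The salary-grade question is a join whose ON clause is a range test rather than an equality. The sketch below is an assumption-heavy illustration: the slide names only the tables emp and salgrade, so the columns (`ename`, `sal`, `grade`, `losal`, `hisal`) and all rows are hypothetical, and SQLite is used through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schemas and rows; the lecture does not list them.
CREATE TABLE emp (ename TEXT, sal INTEGER);
CREATE TABLE salgrade (grade INTEGER, losal INTEGER, hisal INTEGER);
INSERT INTO emp VALUES ('SMITH', 800), ('ALLEN', 1600), ('KING', 5000);
INSERT INTO salgrade VALUES
  (1, 700, 1200), (2, 1201, 1400), (3, 1401, 2000),
  (4, 2001, 3000), (5, 3001, 9999);
""")

with_between = conn.execute("""
    SELECT ename, grade
    FROM emp JOIN salgrade ON sal BETWEEN losal AND hisal
    ORDER BY ename
""").fetchall()

# Same test spelled out with two comparisons.
with_comparisons = conn.execute("""
    SELECT ename, grade
    FROM emp JOIN salgrade ON sal >= losal AND sal <= hisal
    ORDER BY ename
""").fetchall()

print(with_between)
```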
48
Review
Examples from sqlzoo.net
SELECT L
FROM R1, …, Rn
WHERE C
$\pi_{L}(\sigma_{C}(R_1 \times \cdots \times R_n))$