Principles of Database Systems CSE 544p
Lecture #1 September 28, 2011
Dan Suciu -- p544 Fall 2011
Staff
• Instructor: Dan Suciu
  – CSE 662, [email protected]
  – Office hours: Wednesdays, 5:30-6:20
• TAs:
  – Sandra Fan, [email protected]
Communications
• Web page: http://www.cs.washington.edu/p544
  – Lectures will be available here
  – Homework will be posted here
  – Announcements may be posted here
• Mailing list:
  – Announcements, group discussions
  – If you registered, you are automatically subscribed
Textbook(s)
Main textbook:
• Database Management Systems, Ramakrishnan and Gehrke
Second textbook:
• Database Systems: The Complete Book, Garcia-Molina, Ullman, Widom
Course Format
• Lectures Wednesdays, 6:30-9:20
• 7 Homework Assignments
• Take-home Final
Grading
• Homework: 70%
• Take-home Final: 30%
Homework Assignments
1. SQL
2. Conceptual design
3. Java/SQL
4. Transactions
5. Database tuning
6. XML/XPath/XQuery
7. Pig Latin, on AWS
Due: Mondays, by 11:59pm. Three late days per person
Take-home Final
• Posted on December 8, at 11:59pm
• Due on December 10, by 10:00pm
• No late days/hours/minutes/seconds
Software Tools
• Postgres:
  – Preferred usage: download from http://www.postgresql.org/download/
  – Other option: use Postgres on lab machines
• SQL Server 2008
  – Download client from http://msdnaa.cs.washington.edu
  – Username is your full @cs.washington.edu email address
  – Doesn’t work ? Email ms-sw-[email protected]
  – Connect to IPROJSRV (may need tunneling)
  – OK to use your own server, just import IMDB
• XQuery: download one interpreter from
  – Preferred: Saxon: http://saxon.sourceforge.net/ (from apache; very popular)
  – Others:
    • Zorba: http://www.zorba-xquery.com/ (I used this one: ½ day installation)
    • Galax: http://galax.sourceforge.net/ (great in the past, seems less well maintained)
• Pig Latin:
  – We will run it on Amazon Web Services
  – You may download from http://hadoop.apache.org/pig/, but you won’t need it
Accessing SQL Server
• SQL Server Management Studio
• Server Type = Database Engine
• Server Name = IPROJSRV
• Authentication = SQL Server Authentication
  – Login = your UW email address (not the CSE email)
  – Password = [in class]
• Must connect from within CSE, or must use tunneling
• Alternatively: install your own, get it from MSDNAA (see earlier slide)
• Then play with IMDB, start working on HW 1
Rest of Today’s Lecture
• Overview of DBMS
• Overview of the course content
• SQL
Database
What is a database ?
Give examples of databases
Database
What is a database ?
• A collection of files storing related data
Give examples of databases
• Accounts database; payroll database; UW’s students database; Amazon’s products database; airline reservation database
Database Management System
What is a DBMS ?
Give examples of DBMS
Database Management System
What is a DBMS ?
• A big C program written by someone else that allows us to manage efficiently a large database and allows it to persist over long periods of time
Give examples of DBMS
• DB2 (IBM), SQL Server (MS), Oracle, Sybase
• MySQL, Postgres, …
SQL for Nerds, Greenspun, http://philip.greenspun.com/sql/ (Chap 1)
Market Shares
From 2006 Gartner report:
• IBM: 21% market with $3.2BN in sales
• Oracle: 47% market with $7.1BN in sales
• Microsoft: 17% market with $2.6BN in sales
An Example
The Internet Movie Database http://www.imdb.com
• Entities: Actors (800k), Movies (400k), Directors, …
• Relationships: who played where, who directed what, …
Key concept 1: Relational Data Model

Actor:

| id | fName | lName | gender |
|---|---|---|---|
| 195428 | Tom | Hanks | M |
| 645947 | Amy | Hanks | F |
| . . . | | | |

Movie:

| id | Name | year |
|---|---|---|
| 337166 | Toy Story | 1995 |
| . . . | | |

Cast:

| pid | mid |
|---|---|
| 195428 | 337166 |
| . . . | |
Key concept 2: Declarative Query Language

SQL

    SELECT * FROM Actor

    SELECT count(*) FROM Actor

    SELECT * FROM Actor WHERE lName = 'Hanks'

We write what we want, not how we want it.
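To make the "what, not how" contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. That choice is an assumption made for illustration; the course itself uses Postgres and SQL Server. The two Hanks rows come from the Actor table in the slides, and the third row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Actor (id INTEGER PRIMARY KEY, fName TEXT, lName TEXT, gender TEXT)"
)
conn.executemany(
    "INSERT INTO Actor VALUES (?, ?, ?, ?)",
    [
        (195428, "Tom", "Hanks", "M"),
        (645947, "Amy", "Hanks", "F"),
        (1, "Jane", "Doe", "F"),  # hypothetical row, not from the slides
    ],
)

# Declarative: state which rows we want; the DBMS decides how to find them.
declarative = conn.execute(
    "SELECT id FROM Actor WHERE lName = 'Hanks' ORDER BY id"
).fetchall()

# Imperative equivalent: we spell out the scan and the test ourselves.
imperative = []
for actor_id, last_name in conn.execute("SELECT id, lName FROM Actor ORDER BY id"):
    if last_name == "Hanks":
        imperative.append((actor_id,))

total = conn.execute("SELECT count(*) FROM Actor").fetchone()[0]
```

Both versions return the same rows; only the declarative one leaves the choice of access method to the system.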
Key concept 3: Data Independence
    SELECT *
    FROM Actor, Casts, Movie
    WHERE lname='Hanks' and Actor.id = Casts.pid
      and Casts.mid=Movie.id and Movie.year=1995
817k actors, 3.5M casts, 380k movies; How can it be so fast ?
Physical data independence: query is independent of physical storage
How Can We Evaluate the Query ?
Actor (id, fName, lName, gender): . . . Hanks . . .
Movie (id, Name, year): . . . 1995 . . .
Cast (pid, mid): . . .
Plan 1: . . . . [ in class ]
Plan 2: . . . . [ in class ]
Alternative query plans:
[Figure: two query plan trees over Actor, Cast and Movie, each using the selections σ lName='Hanks' and σ year=1995]
Indexes: on Actor.lName, on Movie.year
Query optimization
Database statistics: histograms, synopses, etc.
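The plans discussed in class are not in the transcript, so the following sketch shows two hypothetical plans for the Hanks/1995 query in plain Python. The data and both function names are assumptions; dictionaries stand in for the indexes on Actor.lName and Movie.year. The point is only that different strategies give the same answer.

```python
# Tiny hypothetical instances; only Tom Hanks and Toy Story come from the slides.
actors = [(195428, "Tom", "Hanks"), (645947, "Amy", "Hanks"), (7, "Jane", "Doe")]
movies = [(337166, "Toy Story", 1995), (2, "Other Movie", 2001)]
casts = [(195428, 337166), (7, 337166), (645947, 2)]


def plan_join_then_filter():
    """Form every Actor x Casts x Movie combination, then test all conditions."""
    out = []
    for a in actors:
        for c in casts:
            for m in movies:
                if a[2] == "Hanks" and a[0] == c[0] and c[1] == m[0] and m[2] == 1995:
                    out.append((a[1], m[1]))
    return sorted(out)


def plan_filter_then_lookup():
    """Apply both selections first, then scan Casts once with dictionary lookups."""
    hanks = {a[0]: a for a in actors if a[2] == "Hanks"}
    from_1995 = {m[0]: m for m in movies if m[2] == 1995}
    out = []
    for pid, mid in casts:
        if pid in hanks and mid in from_1995:
            out.append((hanks[pid][1], from_1995[mid][1]))
    return sorted(out)
```

The first plan examines every combination of tuples, while the second touches each Casts tuple once. Choosing between such plans is the optimizer's job, not the query writer's.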
Key concept 4: Transactions

Recovery from systems failures:
Transfer $100 from account 1 to account 2:

    X = Read(Account_1);
    X.amount = X.amount - 100;
    Write(Account_1, X);
    Y = Read(Account_2);
    Y.amount = Y.amount + 100;
    Write(Account_2, Y);

CRASH !

What is the problem ?
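One way to see the problem, and how a transaction avoids it, is the sketch below. It uses Python's built-in sqlite3 module as a stand-in DBMS (an assumption) and simulates the crash with an exception raised between the two writes. A real crash would kill the process and the DBMS would recover from its log on restart; here an explicit rollback plays that role. The table layout, balances and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 500), (2, 300)])
conn.commit()


def transfer(conn, src, dst, amount, crash_midway=False):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        conn.execute(
            "UPDATE Account SET amount = amount - ? WHERE id = ?", (amount, src)
        )
        if crash_midway:
            raise RuntimeError("simulated crash")  # stands in for a system failure
        conn.execute(
            "UPDATE Account SET amount = amount + ? WHERE id = ?", (amount, dst)
        )
        conn.commit()
    except RuntimeError:
        conn.rollback()  # undo the partial debit so no money disappears


def balances(conn):
    return conn.execute("SELECT amount FROM Account ORDER BY id").fetchall()


transfer(conn, 1, 2, 100, crash_midway=True)
after_crash = balances(conn)      # unchanged: the debit was rolled back
transfer(conn, 1, 2, 100)
after_success = balances(conn)    # both updates applied together
```

Without the rollback, the failed run would leave account 1 debited and account 2 never credited.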
Concurrency Control

Overdrafting an account:

User 1:

    X = Read(Account);
    if (X.amount >= 100) {
        dispense_money( );
        X.amount = X.amount - 100;
    }
    else error("Insufficient funds");

User 2:

    X = Read(Account);
    if (X.amount >= 100) {
        dispense_money( );
        X.amount = X.amount - 100;
    }
    else error("Insufficient funds");

What can go wrong ?
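The sketch below replays the two users' steps in a chosen order, in plain Python. It is a deterministic simulation rather than real threads, and it assumes each withdrawal also writes the new balance back, a step the slide's pseudocode leaves implicit. The `run` helper and the schedules are hypothetical.

```python
def run(schedule, balance=100):
    """Execute the two users' steps in the given order.

    'read' copies the shared balance into the user's private variable X.
    'withdraw' checks that private copy and, if it allows, dispenses $100
    and writes the reduced balance back to the account.
    """
    account = {"amount": balance}
    local = {}
    dispensed = 0
    for user, step in schedule:
        if step == "read":
            local[user] = account["amount"]
        elif step == "withdraw" and local[user] >= 100:
            dispensed += 100
            account["amount"] = local[user] - 100
    return dispensed, account["amount"]


# One user finishes before the other starts.
serial = [(1, "read"), (1, "withdraw"), (2, "read"), (2, "withdraw")]
# Both users read the balance before either one withdraws.
interleaved = [(1, "read"), (2, "read"), (1, "withdraw"), (2, "withdraw")]

serial_outcome = run(serial)
interleaved_outcome = run(interleaved)
```

In the interleaved schedule both users see a balance of $100, so $200 is dispensed from an account that only held $100. Isolation requires the outcome to match some serial order.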
Transactions
ACID =
• Atomicity ( = recovery)
• Consistency
• Isolation ( = concurrency control)
• Durability
Client/Server Database Architecture
• Single server that stores the database
• Many clients running apps and connecting to DBMS
• Performance bottlenecks:
  – Client/server communication
  – Transactional semantics
• Other architectures:
  – main memory database
  – replicated databases
Two Types of Database Usage
• OLTP (online transaction processing)
  – Many updates
  – Many simple “point queries”
  – Few (or no) complex aggregate queries
• Decision-Support
  – Many aggregate/group-by queries
  – Few (or no) updates
Trends in Data Management
• Large scale data analytics: Map/Reduce, Pig, …
• Cloud based database service: AWS, Azure, …
• NoSQL: sacrifice ACID for performance
• Data privacy
• Data provenance
• Complex data analytics: probabilistic databases
Outline of Course Content
1. SQL
2. Relational Calculus, Database Design
3. Constraints, Views
4. Transactions: recovery
5. Transactions: concurrency control
6. XML, XPath, XQuery
7. Data storage, indexes, physical tuning
8. Query execution
9. Query optimization
10. Big Data: Parallel databases, Map/Reduce, Pig Latin
11. Advanced topics: privacy, provenance, probabilistic dbs
Announcement: Homework 1
• Homework 1 is posted
• Due on Monday, Oct. 10
• Tools:
  – Postgres: install on your computer (PREFERRED) or use the installation in the lab
  – SQL Server, for testing only; connect to IPROJSRV: login: your UW email address; password: ……..
• Tasks: create db, import data, create indices, write 11 SQL queries
Outline for rest of today
• Basic SQL (Chapters 5.2, 5.3)
• Aggregates (Chapter 5.5)
• Nulls, Outer joins (Chapter 5.6)
• Subqueries (Chapter 5.4)
  – This is tough ! Next lecture we will discuss Relational Calculus (a.k.a. Tuple Calculus, Chapter 4.3). See supplementary text Three Query Language Formalisms
SQL
• Data Definition Language (DDL)
  – Create/alter/delete tables and their attributes
  – Read from the book
• Data Manipulation Language (DML)
  – Query tables, Insert/delete/modify
  – Discussed in class
Tables in SQL
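Product

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
| SingleTouch | $149.99 | Photography | Canon |
| MultiTouch | $203.99 | Household | Hitachi |

Slide callouts: table name (Product), attribute names (the header row), tuples or rows (the data rows), key.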
The Relational Data Model
Data is stored in tables, a.k.a. relations
Each relation has:
1. A schema = name + attributes
  – Product(PName, Price, Category, Manufacturer)
  – Each relation has a key, which we underline
2. An instance = set of rows
SQL departs from the pure relational model in that it allows duplicate tuples
• Set semantics → bag semantics
  {1, 2, 3} → {1, 1, 2, 3, 3, 3}
Data Types in SQL
• Atomic types:
  – Characters: CHAR(20), VARCHAR(50)
  – Numbers: INT, BIGINT, SMALLINT, FLOAT
  – Others: MONEY, DATETIME, …
• Record (aka tuple)
  – Has atomic attributes
• Table (relation)
  – A set of tuples
Simple SQL Query
Product

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
| SingleTouch | $149.99 | Photography | Canon |
| MultiTouch | $203.99 | Household | Hitachi |

    SELECT *
    FROM Product
    WHERE category='Gadgets'

“selection”

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
Simple SQL Query

Product

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
| SingleTouch | $149.99 | Photography | Canon |
| MultiTouch | $203.99 | Household | Hitachi |

    SELECT PName, Price, Manufacturer
    FROM Product
    WHERE Price > '$100'

“selection” and “projection”

| PName | Price | Manufacturer |
|---|---|---|
| SingleTouch | $149.99 | Canon |
| MultiTouch | $203.99 | Hitachi |
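The two simple queries can be tried out with the sketch below, which loads the Product table from the slides into an in-memory SQLite database via Python's sqlite3 module. Two things are assumptions made for the sketch: SQLite stands in for the course's DBMS, and Price is stored as a plain number instead of a MONEY value written as '$100'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product ("
    "PName TEXT PRIMARY KEY, Price REAL, Category TEXT, Manufacturer TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)

# Selection: keep only the rows that satisfy the WHERE condition.
gadgets = conn.execute(
    "SELECT * FROM Product WHERE Category = 'Gadgets' ORDER BY PName"
).fetchall()

# Selection and projection: filter the rows, then keep only the listed columns.
expensive = conn.execute(
    "SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100 ORDER BY Price"
).fetchall()
```

The ORDER BY clauses are not in the slide queries; they are added only so the row order is predictable.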
Details
• Case insensitive:
  SELECT = Select = select
  Product = product
  BUT: 'Seattle' ≠ 'seattle'
• Constants:
  'abc' - yes
  "abc" - no
Eliminating Duplicates

    SELECT DISTINCT category
    FROM Product

| Category |
|---|
| Gadgets |
| Photography |
| Household |

Compare to:

    SELECT category
    FROM Product

| Category |
|---|
| Gadgets |
| Gadgets |
| Photography |
| Household |
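A small sketch of the difference, again using Python's sqlite3 module as a stand-in DBMS (an assumption). Only the two columns needed here are created, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (PName TEXT PRIMARY KEY, Category TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [
        ("Gizmo", "Gadgets"),
        ("Powergizmo", "Gadgets"),
        ("SingleTouch", "Photography"),
        ("MultiTouch", "Household"),
    ],
)

# Bag semantics: one output row per input row, so 'Gadgets' appears twice.
with_duplicates = [
    row[0] for row in conn.execute("SELECT Category FROM Product ORDER BY Category")
]

# DISTINCT restores set semantics by removing the repeated value.
without_duplicates = [
    row[0]
    for row in conn.execute("SELECT DISTINCT Category FROM Product ORDER BY Category")
]
```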
Ordering the Results
    SELECT pname, price, manufacturer
    FROM Product
    WHERE category='Gadgets' AND price > '$10'
    ORDER BY price, pname
Ties are broken by the second attribute on the ORDER BY list. Ordering is ascending, unless you specify the DESC keyword.
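The tie-breaking and DESC rules can be checked with a sketch like the following. It assumes Python's sqlite3 module as the DBMS and numeric prices; the Widget row is hypothetical and exists only to create a price tie with Gizmo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (PName TEXT PRIMARY KEY, Price REAL)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [
        ("Widget", 19.99),  # hypothetical row that ties with Gizmo on price
        ("Powergizmo", 29.99),
        ("Gizmo", 19.99),
    ],
)

# Ascending by price; the tie at 19.99 is broken by PName.
ascending = [
    row[0] for row in conn.execute("SELECT PName FROM Product ORDER BY Price, PName")
]

# DESC applies only to the attribute it follows; PName stays ascending.
descending = [
    row[0]
    for row in conn.execute("SELECT PName FROM Product ORDER BY Price DESC, PName")
]
```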
Product

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
| SingleTouch | $149.99 | Photography | Canon |
| MultiTouch | $203.99 | Household | Hitachi |

    SELECT Category
    FROM Product
    ORDER BY PName

?

    SELECT DISTINCT category
    FROM Product
    ORDER BY category

?

    SELECT DISTINCT category
    FROM Product
    ORDER BY PName

?
Keys and Foreign Keys
Product

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
| SingleTouch | $149.99 | Photography | Canon |
| MultiTouch | $203.99 | Household | Hitachi |

Company

| CName | Country |
|---|---|
| GizmoWorks | USA |
| Canon | Japan |
| Hitachi | Japan |

Slide callouts: Key, Foreign key
Joins
Product (PName, Price, Category, Manufacturer)
Company (CName, Country)
Find all products under $200 manufactured in Japan; return their names and prices.

    SELECT PName, Price
    FROM Product, Company
    WHERE Manufacturer=CName AND Country='Japan' AND Price <= '$200'

Join between Product and Company
Joins
Product

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
| SingleTouch | $149.99 | Photography | Canon |
| MultiTouch | $203.99 | Household | Hitachi |

Company

| Cname | Country |
|---|---|
| GizmoWorks | USA |
| Canon | Japan |
| Hitachi | Japan |

    SELECT PName, Price
    FROM Product, Company
    WHERE Manufacturer=CName AND Country='Japan' AND Price <= '$200'

| PName | Price |
|---|---|
| SingleTouch | $149.99 |
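The join can be reproduced with the sketch below, which loads both slide tables into SQLite through Python's sqlite3 module. SQLite and the numeric Price column are assumptions made for the sketch. The second query uses the explicit JOIN ... ON syntax, which is not on the slides but expresses the same join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Company (CName TEXT PRIMARY KEY, Country TEXT)")
conn.execute(
    "CREATE TABLE Product ("
    "PName TEXT PRIMARY KEY, Price REAL, Category TEXT, "
    "Manufacturer TEXT REFERENCES Company(CName))"
)
conn.executemany(
    "INSERT INTO Company VALUES (?, ?)",
    [("GizmoWorks", "USA"), ("Canon", "Japan"), ("Hitachi", "Japan")],
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)

# The slide's form: list both tables and put the join condition in WHERE.
implicit_join = conn.execute(
    "SELECT PName, Price FROM Product, Company "
    "WHERE Manufacturer = CName AND Country = 'Japan' AND Price <= 200"
).fetchall()

# Equivalent explicit form of the same join.
explicit_join = conn.execute(
    "SELECT PName, Price FROM Product JOIN Company ON Manufacturer = CName "
    "WHERE Country = 'Japan' AND Price <= 200"
).fetchall()
```

MultiTouch is made in Japan but costs more than $200, so only SingleTouch qualifies.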
Tuple Variables
Product (PName, Price, Category, Manufacturer)
Company (CName, Country)
Person (name, Country, Worksfor)

    SELECT DISTINCT name, country
    FROM Person, Company
    WHERE worksfor = cname

Which country ?

    SELECT DISTINCT Person.name, Company.country
    FROM Person, Company
    WHERE Person.worksfor = Company.cname

    SELECT DISTINCT x.name, y.country
    FROM Person AS x, Company AS y
    WHERE x.worksfor = y.cname
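The sketch below shows how one DBMS reacts to the ambiguous query. It assumes SQLite via Python's sqlite3 module, which rejects the unqualified `country`; other systems may word the error differently. The Person row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Company (cname TEXT PRIMARY KEY, country TEXT)")
conn.execute("CREATE TABLE Person (name TEXT, country TEXT, worksfor TEXT)")
conn.executemany(
    "INSERT INTO Company VALUES (?, ?)", [("GizmoWorks", "USA"), ("Canon", "Japan")]
)
# Hypothetical person who lives in one country and works for a company in another.
conn.execute("INSERT INTO Person VALUES ('Alice', 'Canada', 'Canon')")

# 'country' exists in both tables, so the unqualified query is rejected.
try:
    conn.execute(
        "SELECT DISTINCT name, country FROM Person, Company WHERE worksfor = cname"
    )
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True

# Tuple variables x and y say exactly whose country is meant.
qualified = conn.execute(
    "SELECT DISTINCT x.name, y.country "
    "FROM Person AS x, Company AS y WHERE x.worksfor = y.cname"
).fetchall()
```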
In Class
Product (pname, price, category, manufacturer)
Company (cname, country)
Find all Chinese companies that manufacture products in the ‘toy’ category

    SELECT cname
    FROM
    WHERE
In Class
Product (pname, price, category, manufacturer)
Company (cname, country)
Find all Chinese companies that manufacture products both in the ‘electronic’ and ‘toy’ categories

    SELECT cname
    FROM
    WHERE
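The answer worked out in class is not in the transcript. One possible approach, sketched here as an assumption, uses two tuple variables over Product so that one can match an 'electronic' product and the other a 'toy' product of the same company. All rows are hypothetical, and SQLite via Python's sqlite3 module stands in for the course DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Company (cname TEXT PRIMARY KEY, country TEXT)")
conn.execute(
    "CREATE TABLE Product (pname TEXT PRIMARY KEY, category TEXT, manufacturer TEXT)"
)
conn.executemany(
    "INSERT INTO Company VALUES (?, ?)",
    [("ToyCo", "China"), ("DollCo", "China"), ("PhoneCo", "Japan")],
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [
        ("Robot", "toy", "ToyCo"),
        ("Radio", "electronic", "ToyCo"),
        ("Doll", "toy", "DollCo"),
        ("Phone", "electronic", "PhoneCo"),
    ],
)

# A single Product tuple cannot be in two categories at once: empty result.
single_variable = conn.execute(
    "SELECT DISTINCT cname FROM Company, Product "
    "WHERE country = 'China' AND manufacturer = cname "
    "AND category = 'electronic' AND category = 'toy'"
).fetchall()

# Two tuple variables over Product, one per category.
two_variables = conn.execute(
    "SELECT DISTINCT c.cname "
    "FROM Company AS c, Product AS p1, Product AS p2 "
    "WHERE c.country = 'China' "
    "AND p1.manufacturer = c.cname AND p1.category = 'electronic' "
    "AND p2.manufacturer = c.cname AND p2.category = 'toy'"
).fetchall()
```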
The Nested Loop Semantics of SQL Queries

    SELECT a1, a2, …, ak
    FROM R1 AS x1, R2 AS x2, …, Rn AS xn
    WHERE Conditions

    Answer = {}
    for x1 in R1 do
      for x2 in R2 do
        …..
          for xn in Rn do
            if Conditions
              then Answer = Answer ∪ {(a1,…,ak)}
    return Answer
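The pseudocode translates almost line for line into Python. In this sketch `itertools.product` plays the role of the n nested loops, and the answer is a list and not a set, because SQL keeps duplicates unless DISTINCT is used. The function name and the sample rows (a trimmed Product and Company) are assumptions.

```python
from itertools import product


def nested_loop_query(relations, condition, select):
    """Evaluate SELECT-FROM-WHERE by trying every combination of tuples."""
    answer = []  # a list, since SQL results are bags
    for combo in product(*relations):  # equivalent to n nested for-loops
        if condition(*combo):
            answer.append(select(*combo))
    return answer


# Product(PName, Price, Manufacturer) and Company(CName, Country)
product_rows = [
    ("Gizmo", 19.99, "GizmoWorks"),
    ("SingleTouch", 149.99, "Canon"),
    ("MultiTouch", 203.99, "Hitachi"),
]
company_rows = [("GizmoWorks", "USA"), ("Canon", "Japan"), ("Hitachi", "Japan")]

# SELECT PName, Price FROM Product, Company
# WHERE Manufacturer = CName AND Country = 'Japan' AND Price <= 200
result = nested_loop_query(
    [product_rows, company_rows],
    condition=lambda p, c: p[2] == c[0] and c[1] == "Japan" and p[1] <= 200,
    select=lambda p, c: (p[0], p[1]),
)
```

This defines what the query means; a real DBMS is free to compute the same answer with a much faster plan.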
Using the Formal Semantics

What do these queries compute ?

    SELECT DISTINCT R.A
    FROM R, S
    WHERE R.A=S.A

Returns R ∩ S

    SELECT DISTINCT R.A
    FROM R, S, T
    WHERE R.A=S.A OR R.A=T.A

If S ≠ ∅ and T ≠ ∅ then returns R ∩ (S ∪ T) else returns ∅
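The surprising empty-table case is easy to confirm experimentally. The sketch below assumes SQLite through Python's sqlite3 module and uses a hypothetical helper, `run_query`, that builds single-column tables R, S and T from Python lists.

```python
import sqlite3


def run_query(r, s, t):
    """Load R, S, T (one integer column A each) and run the three-table query."""
    conn = sqlite3.connect(":memory:")
    for name, values in (("R", r), ("S", s), ("T", t)):
        conn.execute(f"CREATE TABLE {name} (A INTEGER)")
        conn.executemany(f"INSERT INTO {name} VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        "SELECT DISTINCT R.A FROM R, S, T WHERE R.A = S.A OR R.A = T.A"
    ).fetchall()
    conn.close()
    return sorted(a for (a,) in rows)


# S and T both non-empty: the result is R ∩ (S ∪ T).
both_nonempty = run_query([1, 2, 3], [1], [2])

# T empty: the innermost loop never runs, so nothing is returned,
# even though 1 is in both R and S.
t_empty = run_query([1, 2, 3], [1], [])
```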
Aggregation

Product (pname, price, category, manufacturer)
Company (cname, country)

    SELECT count(*)
    FROM Product

    SELECT sum(price)
    FROM Product
    WHERE manufacturer='GizmoWorks'

SQL supports several aggregation operations: sum, count, min, max, avg
Except count, all aggregations apply to a single attribute
Aggregation: Count

Product (pname, price, category, manufacturer)
Company (cname, country)

COUNT applies to duplicates, unless otherwise stated:

    SELECT count(category)
    FROM Product
    WHERE price > '$20'

If category has no nulls, then count(category)=count(*)

We probably want:

    SELECT count(DISTINCT category)
    FROM Product
    WHERE price > '$20'
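A short sketch of these aggregates on the Product table from the slides, assuming SQLite via Python's sqlite3 module and numeric prices. The price threshold for the count queries is lowered from 20 to 10 here (an adjustment made only for this sketch) so that both Gadgets rows qualify and the effect of DISTINCT becomes visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product ("
    "pname TEXT PRIMARY KEY, price REAL, category TEXT, manufacturer TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)


def scalar(sql):
    """Run a query that returns a single value and unwrap it."""
    return conn.execute(sql).fetchone()[0]


count_all = scalar("SELECT count(*) FROM Product")
gizmoworks_total = scalar(
    "SELECT sum(price) FROM Product WHERE manufacturer = 'GizmoWorks'"
)

# count(category) counts rows, duplicates included.
count_with_duplicates = scalar("SELECT count(category) FROM Product WHERE price > 10")
# count(DISTINCT category) counts the different categories.
count_distinct = scalar(
    "SELECT count(DISTINCT category) FROM Product WHERE price > 10"
)
```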
Grouping and Aggregation

Product (pname, price, category, manufacturer)
Company (cname, country)

For each manufacturer, find total number of its products under $200.

    SELECT manufacturer, count(*) AS total
    FROM Product
    WHERE price < '$200'
    GROUP BY manufacturer

Let’s see what this means…
Grouping and Aggregation
1. Compute the FROM and WHERE clauses.
2. Group by the attributes in the GROUP BY
3. Compute the SELECT clause, including aggregates.
1&2. FROM-WHERE-GROUPBY

    SELECT manufacturer, count(*) AS total
    FROM Product
    WHERE price < '$200'
    GROUP BY manufacturer

| PName | Price | Category | Manufacturer |
|---|---|---|---|
| Gizmo | $19.99 | Gadgets | GizmoWorks |
| Powergizmo | $29.99 | Gadgets | GizmoWorks |
| SingleTouch | $149.99 | Photography | Canon |
| MultiTouch | $203.99 | Household | Hitachi |
3. SELECT
Dan Suciu -- p544 Fall 2011
SELECT manufacturer, count(*) AS total FROM Product WHERE price < ‘$200’ GROUP BY manufacturer
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
count(*) Manufacturer
2 GizmoWorks
1 Canon
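The result above can be reproduced with a small script. This is only a sketch under stated assumptions: it uses SQLite through Python's `sqlite3` module (not necessarily the system used in class) and stores prices as plain numbers rather than the `'$…'` strings shown on the slides.

```python
import sqlite3

# In-memory database with the Product table from the slides.
# Assumption: price is stored as a number, not as a '$19.99' string.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
""")

# WHERE filters rows first, then GROUP BY forms one group per manufacturer.
rows = conn.execute("""
    SELECT manufacturer, count(*) AS total
    FROM Product
    WHERE price < 200
    GROUP BY manufacturer
    ORDER BY manufacturer
""").fetchall()
print(rows)  # [('Canon', 1), ('GizmoWorks', 2)]
```

The `ORDER BY` is added only to make the printed order deterministic.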

HAVING Clause

SELECT manufacturer, count(*) AS total
FROM Product
WHERE price < '$200'
GROUP BY manufacturer
HAVING min(price) > '$20'

Same query, except that we return only those manufacturers whose products under $200 all cost more than $20.

The HAVING clause contains conditions on aggregates.

General form of Grouping and Aggregation

SELECT S
FROM R1,…,Rn
WHERE C1
GROUP BY a1,…,ak
HAVING C2

S  = may contain attributes a1,…,ak and/or any aggregates, but NO OTHER ATTRIBUTES
C1 = any condition on the attributes in R1,…,Rn
C2 = any condition on aggregate expressions

Why?

General form of Grouping and Aggregation

SELECT S
FROM R1,…,Rn
WHERE C1
GROUP BY a1,…,ak
HAVING C2

Evaluation steps:
1. Evaluate FROM-WHERE, apply condition C1
2. Group by the attributes a1,…,ak
3. Apply condition C2 to each group (may have aggregates)
4. Compute aggregates in S and return the result
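The four evaluation steps can be mimicked directly in Python, without a database. The sketch below follows the steps for the HAVING example (C1 is `price < 200`, C2 is `min(price) > 20`). Prices are plain numbers here, which is an assumption made for illustration.

```python
from collections import defaultdict

# Product rows: (pname, price, category, manufacturer).
product = [
    ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
    ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
    ("SingleTouch", 149.99, "Photography", "Canon"),
    ("MultiTouch", 203.99, "Household", "Hitachi"),
]

# Step 1: evaluate FROM-WHERE, applying condition C1 (price < 200).
filtered = [row for row in product if row[1] < 200]

# Step 2: group by the GROUP BY attribute (manufacturer).
groups = defaultdict(list)
for row in filtered:
    groups[row[3]].append(row)

# Step 3: apply condition C2 to each group (min(price) > 20).
kept = {m: rows for m, rows in groups.items()
        if min(r[1] for r in rows) > 20}

# Step 4: compute the aggregates in S (count(*)) and return the result.
result = sorted((m, len(rows)) for m, rows in kept.items())
print(result)  # [('Canon', 1)]
```

GizmoWorks is dropped in step 3 because its cheapest product (19.99) does not exceed 20. Hitachi never forms a group because its only product is removed in step 1.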

NULLs in SQL

• Whenever we don't have a value, we can put a NULL
• Can mean many things:
  – Value does not exist
  – Value exists but is unknown
  – Value not applicable
  – Etc.
• The schema specifies for each attribute whether it can be null (nullable attribute) or not
• How does SQL cope with tables that have NULLs?

Null Values

• If x = NULL then 4*(3-x)/7 is still NULL
• If x = NULL then x = 'Joe' is UNKNOWN
• In SQL there are three boolean values:
  FALSE   = 0
  UNKNOWN = 0.5
  TRUE    = 1

Null Values

• C1 AND C2 = min(C1, C2)
• C1 OR C2 = max(C1, C2)
• NOT C1 = 1 - C1

SELECT *
FROM Person
WHERE (age < 25) AND (height > 6 OR weight > 190)

E.g. age=20, height=NULL, weight=200

Rule in SQL: include only tuples that yield TRUE
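The min/max/1-x rules can be checked by hand on the example tuple. The sketch below encodes them in Python. The helper names (`compare`, `and3`, `or3`, `not3`) are made up for illustration, and `None` stands in for NULL.

```python
import operator

FALSE, UNKNOWN, TRUE = 0.0, 0.5, 1.0

def compare(op, x, y):
    """A comparison is UNKNOWN as soon as either operand is NULL (None)."""
    if x is None or y is None:
        return UNKNOWN
    return TRUE if op(x, y) else FALSE

def and3(a, b):
    return min(a, b)

def or3(a, b):
    return max(a, b)

def not3(a):
    return 1 - a

def where_clause(p):
    """(age < 25) AND (height > 6 OR weight > 190) in three-valued logic."""
    return and3(compare(operator.lt, p["age"], 25),
                or3(compare(operator.gt, p["height"], 6),
                    compare(operator.gt, p["weight"], 190)))

heavy = {"age": 20, "height": None, "weight": 200}
light = {"age": 20, "height": None, "weight": 100}

print(where_clause(heavy))  # 1.0: TRUE, so the tuple is included
print(where_clause(light))  # 0.5: UNKNOWN, so the tuple is not included
```

For the slide's tuple, `height > 6` is UNKNOWN but `weight > 190` is TRUE, so the OR is TRUE and the whole condition is TRUE. With a weight of 100 the OR is only UNKNOWN and the tuple is dropped.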

Null Values

Unexpected behavior:

SELECT *
FROM Person
WHERE age < 25 OR age >= 25

Some Persons are not included! If age is NULL, both comparisons are UNKNOWN, so the tuple is dropped.

Null Values

Can test for NULL explicitly:
– x IS NULL
– x IS NOT NULL

SELECT *
FROM Person
WHERE age < 25 OR age >= 25 OR age IS NULL

Now it includes all Persons
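A quick way to see both behaviors side by side is to run the two queries on a tiny table. This sketch assumes SQLite via Python's `sqlite3`, and the `Person` rows and the `name` column are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (name TEXT, age INTEGER);
    INSERT INTO Person VALUES ('Ann', 20), ('Bob', 30), ('Cat', NULL);
""")

# Without the explicit test, the row with a NULL age yields UNKNOWN and is dropped.
without_test = conn.execute(
    "SELECT name FROM Person WHERE age < 25 OR age >= 25 ORDER BY name"
).fetchall()

# With IS NULL, the remaining row satisfies the condition as well.
with_test = conn.execute(
    "SELECT name FROM Person "
    "WHERE age < 25 OR age >= 25 OR age IS NULL ORDER BY name"
).fetchall()

print(without_test)  # [('Ann',), ('Bob',)]
print(with_test)     # [('Ann',), ('Bob',), ('Cat',)]
```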

Outerjoins

Product (pname, price, category, manufacturer)
Company (cname, country)

Normally, joins are "inner joins":

SELECT x.country, y.pname
FROM Company x JOIN Product y ON x.cname = y.manufacturer

Same as:

SELECT x.country, y.pname
FROM Company x, Product y
WHERE x.cname = y.manufacturer

But countries whose companies don't manufacture anything will not be listed!

Outerjoins

If we want to see the companies that don't produce anything, then we use an outer join:

SELECT x.country, y.pname
FROM Company x LEFT OUTER JOIN Product y ON x.cname = y.manufacturer

Product

PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

Company

Cname       Country
GizmoWorks  USA
Canon       Japan
Hitachi     Japan
MuseumPass  Vatican

Result of the left outer join (x.country, y.pname)

Country  PName
USA      Gizmo
USA      Powergizmo
Japan    SingleTouch
Japan    MultiTouch
Vatican  NULL

Application

Compute the total number of products made by each country

SELECT x.country, count(*)
FROM Company x, Product y
WHERE x.cname = y.manufacturer
GROUP BY x.country

What's wrong?

Application

Compute the total number of products made by each country

SELECT x.country, count(y.pname)
FROM Company x LEFT OUTER JOIN Product y ON x.cname = y.manufacturer
GROUP BY x.country

Now we also get the countries that make 0 products

Note: we don't use count(*)

WHY?
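One way to answer the question is to run both variants on the example tables. The sketch below assumes SQLite via Python's `sqlite3` and numeric prices. It compares `count(*)`, which counts the NULL-padded row produced by the outer join, with `count(y.pname)`, which skips NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    CREATE TABLE Company (cname TEXT, country TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
    INSERT INTO Company VALUES
        ('GizmoWorks', 'USA'), ('Canon', 'Japan'),
        ('Hitachi', 'Japan'), ('MuseumPass', 'Vatican');
""")

OUTER = """
    SELECT x.country, {agg}
    FROM Company x LEFT OUTER JOIN Product y ON x.cname = y.manufacturer
    GROUP BY x.country
    ORDER BY x.country
"""

# count(*) counts rows, including the NULL-padded row for MuseumPass.
with_star = conn.execute(OUTER.format(agg="count(*)")).fetchall()
# count(y.pname) counts only non-NULL product names.
with_pname = conn.execute(OUTER.format(agg="count(y.pname)")).fetchall()

print(with_star)   # [('Japan', 2), ('USA', 2), ('Vatican', 1)]
print(with_pname)  # [('Japan', 2), ('USA', 2), ('Vatican', 0)]
```

With `count(*)` the Vatican would be reported as making one product, which is wrong.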

Outer Joins

• Left outer join:
  – Include the left tuple even if there's no match
• Right outer join:
  – Include the right tuple even if there's no match
• Full outer join:
  – Include both the left and right tuples even if there's no match

Subqueries

• A subquery is another SQL query nested inside a larger query
• Such inner-outer queries are called nested queries
• A subquery may occur in:
  1. A SELECT clause
  2. A FROM clause
  3. A WHERE clause

Rule of thumb: avoid writing nested queries when possible; sometimes it's impossible to avoid them

1. Subqueries in SELECT

For each product return the country that manufactures it

SELECT X.pname, (SELECT Y.country
                 FROM Company Y
                 WHERE Y.cname = X.manufacturer)
FROM Product X

What happens if a subquery returns more than one country?

1. Subqueries in SELECT

Whenever possible, don't use nested queries:

SELECT X.pname, (SELECT Y.country
                 FROM Company Y
                 WHERE Y.cname = X.manufacturer)
FROM Product X

=

SELECT pname, country
FROM Product, Company
WHERE cname = manufacturer

We have "unnested" the query

1. Subqueries in SELECT

Compute the number of products made by each country

SELECT DISTINCT x.country, (SELECT count(*)
                            FROM Company y, Product
                            WHERE y.cname = manufacturer
                              AND y.country = x.country)
FROM Company x

Better: we can unnest by using a GROUP BY

SELECT x.country, count(*)
FROM Company x, Product z
WHERE x.cname = z.manufacturer
GROUP BY x.country

GROUP BY vs. Nested Queries

SELECT manufacturer, count(*) AS total
FROM Product
WHERE price < '$200'
GROUP BY manufacturer

SELECT DISTINCT x.manufacturer, (SELECT count(*)
                                 FROM Product y
                                 WHERE x.manufacturer = y.manufacturer
                                   AND price < '$200') AS total
FROM Product x
WHERE price < '$200'

Why does the condition price < '$200' appear twice?

2. Subqueries in FROM

Find all products whose price is > 20 and < 30

SELECT *
FROM (SELECT *
      FROM Product AS Y
      WHERE Y.price > '$20') AS x
WHERE x.price < '$30'

Unnest this query!
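One possible answer to the exercise is to fold both conditions into a single WHERE clause. The sketch below checks that the nested and flat forms agree on the example data. It assumes SQLite via Python's `sqlite3` and numeric prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
""")

# Nested form, as on the slide (with numeric prices).
nested = conn.execute("""
    SELECT *
    FROM (SELECT * FROM Product AS Y WHERE Y.price > 20) AS x
    WHERE x.price < 30
""").fetchall()

# Unnested form: both conditions in one WHERE clause.
flat = conn.execute("""
    SELECT * FROM Product WHERE price > 20 AND price < 30
""").fetchall()

print(nested == flat, flat)
```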

3. Subqueries in WHERE

Existential quantifiers

Find all countries that make some products with price < 100

Using EXISTS:

SELECT DISTINCT x.country
FROM Company x
WHERE EXISTS (SELECT *
              FROM Product y
              WHERE y.manufacturer = x.cname AND y.price < '$100')

Correlated subquery: uses x from the outer query

3. Subqueries in WHERE

Find all countries that make some products with price < 100

Predicate Calculus (a.k.a. First Order Logic)

{ y | ∃x. Company(x,y) ∧ (∃z. ∃p. ∃c. Product(z,p,c,x) ∧ p < 100) }

3. Subqueries in WHERE

Find all countries that make some products with price < 100

Using IN:

SELECT DISTINCT country
FROM Company
WHERE cname IN (SELECT Product.manufacturer
                FROM Product
                WHERE Product.price < '$100')

De-correlated subquery

3. Subqueries in WHERE

Find all countries that make some products with price < 100

Using ANY:

SELECT DISTINCT Company.country
FROM Company
WHERE '$100' > ANY (SELECT price
                    FROM Product
                    WHERE manufacturer = cname)

3. Subqueries in WHERE

Find all countries that make some products with price < 100

Now let's unnest it:

SELECT DISTINCT x.country
FROM Company x, Product y
WHERE x.cname = y.manufacturer AND y.price < '$100'

Existential quantifiers are easy! :)
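The EXISTS, IN, and unnested join formulations should all return the same countries. The sketch below runs the three on the example tables, assuming SQLite via Python's `sqlite3` and numeric prices. The ANY form is left out because SQLite, the engine assumed here, does not implement quantified comparisons.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    CREATE TABLE Company (cname TEXT, country TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
    INSERT INTO Company VALUES
        ('GizmoWorks', 'USA'), ('Canon', 'Japan'),
        ('Hitachi', 'Japan'), ('MuseumPass', 'Vatican');
""")

queries = {
    "EXISTS": """SELECT DISTINCT x.country FROM Company x
                 WHERE EXISTS (SELECT * FROM Product y
                               WHERE y.manufacturer = x.cname AND y.price < 100)""",
    "IN": """SELECT DISTINCT country FROM Company
             WHERE cname IN (SELECT manufacturer FROM Product WHERE price < 100)""",
    "JOIN": """SELECT DISTINCT x.country FROM Company x, Product y
               WHERE x.cname = y.manufacturer AND y.price < 100""",
}

# Run each formulation and sort the rows so the results are comparable.
results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in queries.items()}
print(results)  # every entry is [('USA',)]
```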

3. Subqueries in WHERE

Universal quantifiers

Find the countries of all companies that make only products with price < 100

Universal quantifiers are hard! :(

3. Subqueries in WHERE

Find the countries of all companies that make only products with price < 100

Predicate Calculus (a.k.a. First Order Logic)

{ y | ∃x. Company(x,y) ∧ (∀z. ∀p. ∀c. Product(z,p,c,x) ⇒ p < 100) }

3. Subqueries in WHERE

De Morgan's Laws:
¬(A ∧ B) = ¬A ∨ ¬B
¬(A ∨ B) = ¬A ∧ ¬B
¬∀x. P(x) = ∃x. ¬P(x)
¬∃x. P(x) = ∀x. ¬P(x)
¬(A ⇒ B) = A ∧ ¬B

  { y | ∃x. Company(x,y) ∧ (∀z. ∀p. ∀c. Product(z,p,c,x) ⇒ p < 100) }
= { y | ∃x. Company(x,y) ∧ ¬(∃z. ∃p. ∃c. Product(z,p,c,x) ∧ p ≥ 100) }
= { y | ∃x. Company(x,y) ∧ x ∉ { x' | ∃z. ∃p. ∃c. Product(z,p,c,x') ∧ p ≥ 100 } }

The last form removes from the companies those that make some product with price ≥ 100.

3. Subqueries in WHERE

1. Find the other companies: i.e. s.t. some product ≥ 100

SELECT DISTINCT country
FROM Company
WHERE cname IN (SELECT manufacturer
                FROM Product
                WHERE price >= '$100')

2. Find all companies s.t. all their products have price < 100

SELECT DISTINCT country
FROM Company
WHERE cname NOT IN (SELECT manufacturer
                    FROM Product
                    WHERE price >= '$100')

3. Subqueries in WHERE

Find the countries of all companies that make only products with price < 100

Using EXISTS:

SELECT DISTINCT x.country
FROM Company x
WHERE NOT EXISTS (SELECT *
                  FROM Product y
                  WHERE y.manufacturer = x.cname AND y.price >= '$100')

3. Subqueries in WHERE

Find the countries of all companies that make only products with price < 100

Using ALL:

SELECT DISTINCT Company.country
FROM Company
WHERE '$100' > ALL (SELECT price
                    FROM Product
                    WHERE manufacturer = cname)
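The NOT EXISTS and NOT IN formulations of the universal query can be compared on the example tables. This is a sketch assuming SQLite via Python's `sqlite3` and numeric prices. The ALL form is omitted because SQLite, the engine assumed here, does not implement quantified comparisons.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    CREATE TABLE Company (cname TEXT, country TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
    INSERT INTO Company VALUES
        ('GizmoWorks', 'USA'), ('Canon', 'Japan'),
        ('Hitachi', 'Japan'), ('MuseumPass', 'Vatican');
""")

# "Only products under 100" = "no product at 100 or above".
not_exists = conn.execute("""
    SELECT DISTINCT x.country
    FROM Company x
    WHERE NOT EXISTS (SELECT * FROM Product y
                      WHERE y.manufacturer = x.cname AND y.price >= 100)
    ORDER BY x.country
""").fetchall()

not_in = conn.execute("""
    SELECT DISTINCT country
    FROM Company
    WHERE cname NOT IN (SELECT manufacturer FROM Product WHERE price >= 100)
    ORDER BY country
""").fetchall()

print(not_exists)  # [('USA',), ('Vatican',)]
print(not_in)      # [('USA',), ('Vatican',)]
```

The Vatican appears because MuseumPass makes no products at all, so the universal condition holds vacuously.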

Question for Database Fans and their Friends

• Can we unnest this query?

Find the countries of all companies that make only products with price < 100

Monotone Queries

• A query Q is monotone if:
  – Whenever we add tuples to one or more of the tables…
  – … the answer to the query cannot contain fewer tuples
• Fact: all unnested queries are monotone
  – Proof: using the "nested for loops" semantics
• Fact: a query with a universal quantifier is not monotone
• Consequence: we cannot unnest a query with a universal quantifier
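The non-monotonicity of the universal query can be observed directly: adding one tuple to Product makes a country disappear from the answer. The sketch below assumes SQLite via Python's `sqlite3` and numeric prices. The product `MegaGizmo` is invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    CREATE TABLE Company (cname TEXT, country TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
    INSERT INTO Company VALUES
        ('GizmoWorks', 'USA'), ('Canon', 'Japan'),
        ('Hitachi', 'Japan'), ('MuseumPass', 'Vatican');
""")

# Countries of companies that make only products with price < 100.
UNIVERSAL = """
    SELECT DISTINCT x.country
    FROM Company x
    WHERE NOT EXISTS (SELECT * FROM Product y
                      WHERE y.manufacturer = x.cname AND y.price >= 100)
"""

before = set(conn.execute(UNIVERSAL))

# Add one tuple (hypothetical product) to an input table.
conn.execute("INSERT INTO Product VALUES ('MegaGizmo', 500.0, 'Gadgets', 'GizmoWorks')")

after = set(conn.execute(UNIVERSAL))

print(before)  # contains ('USA',) and ('Vatican',)
print(after)   # contains only ('Vatican',)
```

The input grew but the answer shrank, which a monotone query can never do.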

Queries that must be nested

Rule of Thumb: Non-monotone queries cannot be unnested. In particular, queries with a universal quantifier cannot be unnested

More SQL

Read the following commands in the book:
• CREATE TABLE
• INSERT
• DELETE
• UPDATE

They are easy, but we use them all the time in class and in the homework assignments

Advanced SQLizing

1. Unnesting Aggregates
2. Finding witnesses

Unnesting Aggregates

For each category, find the maximum price

SELECT DISTINCT X.category, (SELECT max(Y.price)
                             FROM Product Y
                             WHERE X.category = Y.category)
FROM Product X

SELECT category, max(price)
FROM Product
GROUP BY category

Equivalent queries

Note: no need for DISTINCT (DISTINCT is the same as GROUP BY)

Unnesting Aggregates

Find the number of products made in each country

SELECT DISTINCT X.country, (SELECT count(*)
                            FROM Company Y, Product Z
                            WHERE Y.cname = Z.manufacturer
                              AND Y.country = X.country)
FROM Company X

SELECT X.country, count(*)
FROM Company X, Product Y
WHERE X.cname = Y.manufacturer
GROUP BY X.country

They are NOT equivalent!

(WHY?)
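Running both queries on the example tables suggests an answer. The sketch below assumes SQLite via Python's `sqlite3` and numeric prices. The nested form reports a count of 0 for a country with no products, while the GROUP BY form omits that country.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    CREATE TABLE Company (cname TEXT, country TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
    INSERT INTO Company VALUES
        ('GizmoWorks', 'USA'), ('Canon', 'Japan'),
        ('Hitachi', 'Japan'), ('MuseumPass', 'Vatican');
""")

# Nested form: one output row per country in Company, even with no products.
nested = conn.execute("""
    SELECT DISTINCT X.country,
           (SELECT count(*)
            FROM Company Y, Product Z
            WHERE Y.cname = Z.manufacturer AND Y.country = X.country)
    FROM Company X
    ORDER BY X.country
""").fetchall()

# GROUP BY form: groups exist only for countries that survive the join.
grouped = conn.execute("""
    SELECT X.country, count(*)
    FROM Company X, Product Y
    WHERE X.cname = Y.manufacturer
    GROUP BY X.country
    ORDER BY X.country
""").fetchall()

print(nested)   # [('Japan', 2), ('USA', 2), ('Vatican', 0)]
print(grouped)  # [('Japan', 2), ('USA', 2)]
```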

More Unnesting

Author(login, name)
Wrote(login, url)

• Find authors who wrote ≥ 10 documents:
• Attempt 1: with nested queries

SELECT DISTINCT Author.name
FROM Author
WHERE (SELECT count(Wrote.url)
       FROM Wrote
       WHERE Author.login = Wrote.login) >= 10

This is SQL by a novice

More Unnesting

• Find all authors who wrote at least 10 documents:
• Attempt 2: SQL style (with GROUP BY)

SELECT Author.name
FROM Author, Wrote
WHERE Author.login = Wrote.login
GROUP BY Author.name
HAVING count(Wrote.url) >= 10

This is SQL by an expert

Finding Witnesses

Product (pname, price, category, manufacturer)
Company (cname, country)

For each country, find its most expensive products

Finding Witnesses

For each country, find its most expensive products

Finding the maximum price is easy…

SELECT x.country, max(y.price)
FROM Company x, Product y
WHERE x.cname = y.manufacturer
GROUP BY x.country

But we need the witnesses, i.e. the products with max price

Finding Witnesses

To find the witnesses, compute the maximum price in a subquery

SELECT u.country, v.pname, v.price
FROM Company u, Product v,
     (SELECT x.country, max(y.price) AS mprice
      FROM Company x, Product y
      WHERE x.cname = y.manufacturer
      GROUP BY x.country) AS p
WHERE u.cname = v.manufacturer
  AND u.country = p.country
  AND v.price = p.mprice
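The subquery approach can be tried on the example tables. This is a sketch assuming SQLite via Python's `sqlite3` and numeric prices. The join condition `u.cname = v.manufacturer` is written out explicitly so that each product is paired only with its own manufacturer's country.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, category TEXT, manufacturer TEXT);
    CREATE TABLE Company (cname TEXT, country TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
    INSERT INTO Company VALUES
        ('GizmoWorks', 'USA'), ('Canon', 'Japan'),
        ('Hitachi', 'Japan'), ('MuseumPass', 'Vatican');
""")

# The derived table p holds the maximum price per country;
# the outer query picks the products that attain it (the witnesses).
witnesses = conn.execute("""
    SELECT u.country, v.pname, v.price
    FROM Company u, Product v,
         (SELECT x.country, max(y.price) AS mprice
          FROM Company x, Product y
          WHERE x.cname = y.manufacturer
          GROUP BY x.country) AS p
    WHERE u.cname = v.manufacturer
      AND u.country = p.country
      AND v.price = p.mprice
    ORDER BY u.country
""").fetchall()

print(witnesses)  # [('Japan', 'MultiTouch', 203.99), ('USA', 'Powergizmo', 29.99)]
```

The Vatican does not appear because it has no products, so there is no witness to report.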

Finding Witnesses

There is a more concise solution here:

SELECT x.country, y.pname, y.price
FROM Company x, Product y
WHERE x.cname = y.manufacturer
  AND y.price >= ALL (SELECT z.price
                      FROM Company w, Product z
                      WHERE w.cname = z.manufacturer
                        AND w.country = x.country)

The subquery ranges over all products made in x's country, so the comparison is per country rather than per company.