CS4432: Database Systems II
Query Operator & Algebraic Expressions
Why SQL
• SQL is a very-high-level language.
- It says “what to do” rather than “how to do it.”
- It avoids many of the data-manipulation details needed in procedural languages like C++ or Java.
• The database management system figures out the “best” way to execute a query. This is called “query optimization.”
Query Processing
SELECT pNumber, count(*) AS CNT
FROM Student
WHERE sNumber > 1
GROUP BY pNumber;
[Figure: the SQL query above, labeled “SQL Query”, shown together with its “Query Plans”]
Query Example
SELECT B, D
FROM R, S
WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C;
One idea
• How do we execute the query?
- Form the Cartesian product of all tables in the FROM clause
- Select the tuples that match the WHERE clause
- Project the columns that occur in the SELECT clause
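The three steps above can be sketched in a few lines of Python. This is only an illustration of the naive strategy: the sample rows are made up, and only the schemas R(A, B, C) and S(C, D, E) and the query itself come from the slides.

```python
from itertools import product

# Hypothetical sample data; only the column names follow the slides.
R = [
    {"A": "a", "B": 1, "C": 10},
    {"A": "c", "B": 2, "C": 10},
]
S = [
    {"C": 10, "D": "x", "E": 2},
    {"C": 20, "D": "y", "E": 2},
]

def naive_query(r_rows, s_rows):
    """Evaluate SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C."""
    result = []
    # Step 1: Cartesian product of all tables in the FROM clause.
    for r, s in product(r_rows, s_rows):
        # Step 2: keep only the pairs that satisfy the WHERE clause.
        if r["A"] == "c" and s["E"] == 2 and r["C"] == s["C"]:
            # Step 3: project the columns named in the SELECT clause.
            result.append((r["B"], s["D"]))
    return result

print(naive_query(R, S))
```

The loop visits every pair of rows, so the work grows with |R| times |S| even when very few pairs qualify.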
R X S
R.A  R.B  R.C  S.C  S.D  S.E
a    1    10   10   x    2
a    1    10   20   y    2
...
c    2    10   10   x    2    <- Bingo! Got one...
...
SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C
But?
Performance would be unacceptable!
We need a better approach for reasoning about queries, their execution orders, and their respective costs.
Formal Relational Query Languages
Relational Algebra: More operational, very useful for representing execution plans.
Operators working on relations
Core Relational Algebra (Recap)
• Union, intersection, and difference: the usual set operations; both operands must have the same relation schema.
• Selection: picking certain rows.
• Projection: picking certain columns.
• Products and joins: compositions of relations.
• Renaming of relations and attributes.
• Grouping and aggregation: grouping matching tuples and aggregating over each group.
• Duplicate elimination: eliminates all identical copies of a tuple except one.
• Sorting: orders tuples based on a given criterion.
Relational Algebra Expresses Query Plans
π B,D
|
σ (R.A=“c”) ^ (S.E=2) ^ (R.C=S.C)
|
X
/ \
R S
As a single expression: π B,D (σ (R.A=“c”) ^ (S.E=2) ^ (R.C=S.C) (R X S))
Recap on Relational Algebra & Operators
Algebra Behind the Query Language
Relational Algebra
• A set of operators that operate on relations
• Operator semantics are based on set or bag theory
• Relational algebra forms the underlying basis (and optimization rules) for SQL
SELECT pNumber, count(*) AS CNT
FROM Student
WHERE sNumber > 1
GROUP BY pNumber;
Relational Algebra Basic operators
• Set operations (union: ∪, intersection: ∩, difference: –)
• Select: σ
• Project: π
• Cartesian product: X
• Rename: ρ
• More advanced operators, e.g., grouping and joins
• The operators take one or two relations as inputs and produce a new relation as an output
- One input: unary operator; two inputs: binary operator
Union over sets: ∪ (Binary Op.)
Consider two relations R and S that are union-compatible (same schema).

R
A B
1 2
3 4

S
A B
1 2
3 4
5 6

R ∪ S
A B
1 2
3 4
5 6
Difference over sets: – (Binary Op.)
• R – S contains the tuples that appear in R and not in S
• R and S must be union-compatible
• Defined as: R – S = {t | t ∈ R and t ∉ S}
• In general, R – S ≠ S – R

R
A B
1 2
3 4

S
A B
1 2
5 6

R – S
A B
3 4
Intersection over sets: ∩ (Binary Op.)
Consider two relations R and S that are union-compatible.

R
A B
1 2
3 4

S
A B
1 2
3 4
5 6

R ∩ S
A B
1 2
3 4
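Under set semantics, the three set operators map directly onto Python's built-in set type. The sketch below replays the slide examples, representing each relation as a set of (A, B) tuples; that representation is an assumption made for illustration.

```python
# Relations over the schema (A, B), stored as sets of tuples.
R = {(1, 2), (3, 4)}
S = {(1, 2), (3, 4), (5, 6)}

union = R | S          # R ∪ S
intersection = R & S   # R ∩ S

# The difference example on the slides uses a different S.
R_diff = {(1, 2), (3, 4)}
S_diff = {(1, 2), (5, 6)}
difference = R_diff - S_diff          # R – S
reverse_difference = S_diff - R_diff  # S – R, which differs from R – S

print(union, intersection, difference, reverse_difference)
```

Union-compatibility is not checked here; Python would happily mix tuples of different widths, so the same-schema requirement is left to the caller.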
Selection: σ (Unary Op.)
• Select: σc (R), where c is a condition on R’s attributes
• Selects the subset of tuples from R that satisfy the selection condition c

R
A B C
1 2 5
3 4 6
1 2 7

σ(C ≥ 6) (R)
A B C
3 4 6
1 2 7
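Selection amounts to filtering rows with a predicate. A minimal sketch, assuming rows are stored as dictionaries keyed by column name (the helper name `select` is illustrative, not from the slides):

```python
def select(rows, condition):
    """σ_condition(rows): keep the tuples for which condition(row) is true."""
    return [row for row in rows if condition(row)]

# Relation R from the slide.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
]

# σ(C ≥ 6)(R)
result = select(R, lambda t: t["C"] >= 6)
print(result)
```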
Selection: Example
[Figure: relation R]
σ ((A=B) ^ (D>5)) (R)
σ (D > C) (R)
Project: π (Unary Op.)
• πA1, A2, …, An (R), with A1, A2, …, An attributes of R
• Returns all tuples in R, but only columns A1, A2, …, An
• A1, A2, …, An are called the projection list

R
A B C
1 2 5
3 4 6
1 2 7
1 2 8

πA, C (R)
A C
1 5
3 6
1 7
1 8
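Projection keeps every row but only the listed columns. A minimal sketch under the same dictionary-per-row assumption (the helper name `project` is illustrative):

```python
def project(rows, attributes):
    """π_attributes(rows): keep every tuple, but only the listed columns."""
    return [{a: row[a] for a in attributes} for row in rows]

# Relation R from the slide.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
    {"A": 1, "B": 2, "C": 8},
]

# πA, C (R)
result = project(R, ["A", "C"])
print(result)
```

This version follows bag semantics: if two rows became identical after dropping columns, both copies would stay in the output.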
Extended Projection: πL (R)
Example
π C, V←A, X←C*3+B (R)
• V←A: rename column A to V
• X←C*3+B: compute this expression and call it X

R
A B C
1 2 5
3 4 6
1 2 7
1 2 8

π C, V←A, X←C*3+B (R)
C V X
5 1 17
6 3 22
7 1 23
8 1 26
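Extended projection can be sketched by pairing each output column with a function of the input row. The helper name `extended_project` and the dictionary-of-lambdas interface are assumptions for illustration; the data and the expected columns C, V, X come from the slide.

```python
def extended_project(rows, outputs):
    """π_L(rows), where outputs maps each output column to a function of the row."""
    return [{name: expr(row) for name, expr in outputs.items()} for row in rows]

# Relation R from the slide.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
    {"A": 1, "B": 2, "C": 8},
]

# π C, V←A, X←C*3+B (R)
result = extended_project(R, {
    "C": lambda t: t["C"],               # plain column
    "V": lambda t: t["A"],               # rename A to V
    "X": lambda t: t["C"] * 3 + t["B"],  # computed column
})
print(result)
```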
Cross Product (Cartesian Product): X (Binary Op.)
• Each tuple in R is joined with each tuple in S
• R X S = {t q | t ∈ R and q ∈ S}
[Figure: example relations R and S and their product R X S]
Natural Join: R ⋈ S (Binary Op.)
• An implicit equality condition on the common columns
• In the slide's example, the implicit condition is (R.B = S.B and R.D = S.D)
[Figure: example relations R and S and their natural join R ⋈ S]
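A natural join can be sketched as a nested loop that pairs rows agreeing on every shared column. The example tables from the slide are not in the transcript, so the data below is hypothetical; it only mirrors the fact that R and S share columns B and D.

```python
def natural_join(r_rows, s_rows):
    """R ⋈ S: pair tuples that agree on every column the two relations share."""
    result = []
    for r in r_rows:
        for s in s_rows:
            common = r.keys() & s.keys()
            if all(r[c] == s[c] for c in common):
                # Merging the dictionaries keeps each common column once.
                result.append({**r, **s})
    return result

# Hypothetical data: R(A, B, D) and S(B, D, E) share columns B and D.
R = [{"A": 1, "B": 2, "D": 3}, {"A": 4, "B": 5, "D": 6}]
S = [{"B": 2, "D": 3, "E": 7}, {"B": 2, "D": 9, "E": 8}]

joined = natural_join(R, S)
print(joined)
```

When the two relations share no columns, the condition is vacuously true and the function degenerates to the cross product.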
Theta Join: R ⋈C S (Binary Op.)
• A join based on any arbitrary condition C
• It is defined as: R ⋈C S = σC (R X S)

R
A B
1 2
3 2

S
D C
2 3
4 5
4 5

R ⋈(R.A >= S.C) S
A B D C
3 2 2 3

Recommendation: always use theta join (more explicit and clearer).
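The definition R ⋈C S = σC (R X S) can be followed literally: build the cross product, then filter it. The sketch below replays the slide example and assumes the two relations have disjoint column names, as they do here.

```python
from itertools import product

def theta_join(r_rows, s_rows, condition):
    """R ⋈_C S computed literally as σ_C(R X S); assumes disjoint column names."""
    cross = [{**r, **s} for r, s in product(r_rows, s_rows)]  # R X S
    return [t for t in cross if condition(t)]                 # σ_C

# Relations from the slide.
R = [{"A": 1, "B": 2}, {"A": 3, "B": 2}]
S = [{"D": 2, "C": 3}, {"D": 4, "C": 5}, {"D": 4, "C": 5}]

# R ⋈(R.A >= S.C) S
result = theta_join(R, S, lambda t: t["A"] >= t["C"])
print(result)
```

Materializing the whole cross product first is the simplest reading of the definition, not an efficient join strategy.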
Duplicate Elimination: δ (R) (Unary Op.)
• Deletes all duplicate records
• Converts a bag (allows duplicates) to a set (does not allow duplicates)

R
A B
1 2
3 4
1 2
1 2

δ (R)
A B
1 2
3 4
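Duplicate elimination can be sketched with a set of already-seen tuples. Rows are stored as tuples here so they can be hashed; the helper name `distinct` is illustrative.

```python
def distinct(rows):
    """δ(rows): keep the first copy of each tuple, preserving input order."""
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result

# Bag R from the slide, over the schema (A, B).
R = [(1, 2), (3, 4), (1, 2), (1, 2)]

print(distinct(R))
```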
Grouping & Aggregation Operator: γ (Unary Op.)
Grouping & aggregate operation in relational algebra:
γ g1, g2, …, gm, F1(A1), F2(A2), …, Fn(An) (R)
• g1, g2, …, gm: group by these columns (can be empty)
• F1(A1), F2(A2), …, Fn(An): aggregation functions applied over each group
- avg: average value
- min: minimum value
- max: maximum value
- sum: sum of values
- count: number of values
Grouping & Aggregation Operator: Example
γ sum(c) (R)
γ branch_name, sum(balance) (S)
[Figure: example relations R and S and the results of the two expressions above]
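Grouping and aggregation can be sketched in two passes: bucket the rows by their grouping columns, then apply each aggregate function to every bucket. The helper `group_aggregate`, its interface, and the sample rows are all assumptions for illustration; the slide's own tables are not in the transcript.

```python
from collections import defaultdict

def group_aggregate(rows, group_cols, aggregates):
    """Grouping operator: aggregates maps output name -> (function, input column)."""
    groups = defaultdict(list)
    for row in rows:
        # Rows with equal values in the grouping columns share a bucket.
        key = tuple(row[c] for c in group_cols)
        groups[key].append(row)
    result = []
    for key, members in groups.items():
        out = dict(zip(group_cols, key))
        for name, (func, col) in aggregates.items():
            out[name] = func([m[col] for m in members])
        result.append(out)
    return result

# Hypothetical rows for a relation S(branch_name, balance).
S = [
    {"branch_name": "Downtown", "balance": 500},
    {"branch_name": "Perryridge", "balance": 400},
    {"branch_name": "Downtown", "balance": 300},
]

# Grouped sum of balance per branch_name.
per_branch = group_aggregate(S, ["branch_name"], {"sum_balance": (sum, "balance")})
# Empty grouping list: all rows of a non-empty input fall into one group.
overall = group_aggregate(S, [], {"sum_balance": (sum, "balance")})
print(per_branch, overall)
```

Passing `len`, `min`, or `max` in place of `sum` gives count, min, and max in the same way.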
Assignment Operator: ←
• Write the query as a sequence of lines consisting of:
- A series of assignments
- A result expression containing the final answer
• A variable may be used multiple times in subsequent expressions
Example:
R1 ← (σ ((A=B) ^ (D>5)) (R – S)) ∩ W
R2 ← R1 ⋈(R.A = T.C) T
Result ← R1 ∪ R2
Banking Example
branch (branch_name, branch_city, assets)
customer (customer_name, customer_street, customer_city)
account (account_number, branch_name, balance)
loan (loan_number, branch_name, amount)
depositor (customer_name, account_number)
borrower (customer_name, loan_number)
Example Queries
Find customer names having account balance below 100 or above 10,000
πcustomer_name (depositor ⋈ πaccount_number (σbalance < 100 OR balance > 10,000 (account)))
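The expression above can be traced step by step on a small instance. The rows below are invented for illustration, and the sketch assumes account_number is unique within account, so collecting the selected numbers in a set does not change the result.

```python
# Hypothetical instances of account and depositor from the banking schema.
account = [
    {"account_number": "A-1", "branch_name": "Downtown", "balance": 50},
    {"account_number": "A-2", "branch_name": "Downtown", "balance": 5000},
    {"account_number": "A-3", "branch_name": "Uptown", "balance": 20000},
]
depositor = [
    {"customer_name": "Ann", "account_number": "A-1"},
    {"customer_name": "Bob", "account_number": "A-2"},
    {"customer_name": "Cal", "account_number": "A-3"},
]

# Inner part: select on balance, then project account_number.
selected = {
    a["account_number"]
    for a in account
    if a["balance"] < 100 or a["balance"] > 10000
}

# Outer part: natural join with depositor on account_number, then project customer_name.
names = [d["customer_name"] for d in depositor if d["account_number"] in selected]
print(names)
```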
Example Queries
Find customers’ names who have neither accounts nor loans
πcustomer_name (customer) – (πcustomer_name (borrower) ∪ πcustomer_name (depositor))
Example Queries
For branches that gave loans > 100,000 or hold accounts with balances > 50,000, report the branch name along with whether it is reported because of a loan or an account
R1 ← πbranch_name, ‘Loan’ As Type (σamount > 100,000 (loan))
R2 ← πbranch_name, ‘Account’ As Type (σbalance > 50,000 (account))
Result ← R1 ∪ R2
Example Queries
Find customer names having loans with sum > 20,000
πcustomer_name (σsum > 20,000 (γcustomer_name, sum←sum(amount) (loan ⋈ borrower)))
Example Queries
Find the branch name with the largest number of accounts
R1 ← γbranch_name, countAccounts←count(account_number) (account)
R2 ← γMax←max(countAccounts) (R1)
Result ← πbranch_name (R1 ⋈countAccounts = Max R2)
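The three assignment steps can be mirrored one-to-one in Python. The account rows are hypothetical, and counting rows per branch stands in for count(account_number) on the assumption that account_number is never missing.

```python
from collections import Counter

# Hypothetical instance of account(account_number, branch_name, balance).
account = [
    {"account_number": "A-1", "branch_name": "Downtown", "balance": 50},
    {"account_number": "A-2", "branch_name": "Downtown", "balance": 5000},
    {"account_number": "A-3", "branch_name": "Uptown", "balance": 20000},
]

# R1: number of accounts per branch_name.
r1 = Counter(a["branch_name"] for a in account)

# R2: the maximum of those counts (requires a non-empty account relation).
r2 = max(r1.values())

# Result: branch names whose count equals the maximum.
result = [branch for branch, n in r1.items() if n == r2]
print(result)
```

If several branches tie for the largest count, all of them are returned, which matches the join on countAccounts = Max.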
Summary of Relational-Algebra Operators
• Set operators: union, intersection, difference
• Selection, projection, and extended projection
• Joins: natural, theta, outer join
• Rename and assignment
• Duplicate elimination
• Grouping and aggregation