overview - carnegie mellon university


Faloutsos - Pavlo CMU SCS 15-415/615

CMU SCS

Carnegie Mellon Univ. School of Computer Science

15-415/615 - DB Applications

C. Faloutsos & A. Pavlo

Lecture #4: Relational Algebra

Overview

•  history
•  concepts
•  Formal query languages
   –  relational algebra
   –  rel. tuple calculus
   –  rel. domain calculus

History

•  before: records, pointers, sets, etc.
•  relational model introduced by E.F. Codd in 1970
•  revolutionary!
•  first systems: 1977-8 (System R; Ingres)
•  Turing award in 1981

Concepts - reminder

•  Database: a set of relations (= tables)
•  rows: tuples
•  columns: attributes (or keys)
•  superkey, candidate key, primary key

Example

Database: [figure: example tables, annotated with the terms below]

•  rel. schema (attr+domains)
•  tuple
•  k-th attribute (Dk domain)
•  instance
•  Di: the domain of the i-th attribute (e.g., char(10))

Overview

•  history
•  concepts
•  Formal query languages
   –  relational algebra
   –  rel. tuple calculus
   –  rel. domain calculus

Formal query languages

•  How do we collect information?
•  E.g., find ssn’s of people in 415
•  (recall: everything is a set!)
•  One solution: rel. algebra, i.e., set operators
•  Q1: Which ones?
•  Q2: What is a minimal set of operators?

Relational operators

•  .
•  .
•  .
•  set union: R ∪ S
•  set difference: R - S

Example:

•  Q: find all students (part or full time)
•  A: PT-STUDENT ∪ FT-STUDENT
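The union query above can be sketched in Python by treating each relation as a set of tuples. This is only an illustration: the rows, the `(ssn, name)` layout, and the `union` helper are assumptions, not part of the lecture. The helper also checks, crudely, that both inputs have the same number of attributes.

```python
# Hypothetical data: each relation is a set of (ssn, name) tuples.
PT_STUDENT = {(123, "smith"), (234, "jones")}
FT_STUDENT = {(234, "jones"), (345, "lee")}


def union(r, s):
    """Return R ∪ S for two relations stored as sets of tuples."""
    # Crude compatibility check: every tuple must have the same arity.
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("relations do not have the same attributes")
    # Sets drop duplicates, so (234, "jones") appears only once.
    return r | s


ALL_STUDENTS = union(PT_STUDENT, FT_STUDENT)
```

Because a relation is a set, a student who appears in both inputs shows up once in the result.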

Observations:

•  two tables are ‘union compatible’ if they have the same attributes (‘domains’)
•  Q: how about intersection?

Observations:

•  A: redundant:
•  STUDENT ∩ STAFF = STUDENT - (STUDENT - STAFF)

[Figure: the sets STUDENT and STAFF]

Double negation: we’ll see it again, later…
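The identity STUDENT ∩ STAFF = STUDENT - (STUDENT - STAFF) can be checked on a small example. The sketch below uses made-up ssn values and hypothetical helper names; it builds intersection from set difference alone and compares it with Python's built-in `&`.

```python
# Hypothetical relations, reduced to sets of ssn values.
STUDENT = {123, 234, 345}
STAFF = {345, 456}


def difference(r, s):
    """R - S: the tuples of R that do not appear in S."""
    return {t for t in r if t not in s}


def intersection(r, s):
    """R ∩ S expressed with difference only (the double negation)."""
    only_in_r = difference(r, s)      # in R but not in S
    return difference(r, only_in_r)   # in R, and not "only in R"


BOTH = intersection(STUDENT, STAFF)
```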

Relational operators

•  .
•  .
•  .
•  set union: R ∪ S
•  set difference: R - S

Other operators?

•  e.g., find all students on ‘Main street’
•  A: ‘selection’

Other operators?

•  Notice: selection (and the rest of the operators) expect tables and produce tables (-> can be cascaded!)
•  For selection, in general: σ_condition (RELATION)

Selection - examples

•  Find all ‘Smiths’ on ‘Forbes Ave’
•  ‘condition’ can be any boolean combination of ‘=’, ‘>’, ‘>=’, ...
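As a rough sketch of selection, the ‘Smiths on Forbes Ave’ query can be written as a filter over rows. The table contents and the `select` helper are assumptions for illustration; the condition is an ordinary boolean function, so any combination of comparisons works.

```python
# Hypothetical STUDENT relation; each row is a dict keyed by attribute name.
STUDENT = [
    {"ssn": 123, "name": "smith", "address": "forbes ave"},
    {"ssn": 234, "name": "smith", "address": "main str"},
    {"ssn": 345, "name": "jones", "address": "forbes ave"},
]


def select(relation, condition):
    """Selection: keep the rows for which condition(row) is true."""
    return [row for row in relation if condition(row)]


# Boolean combination of two equality tests.
smiths_on_forbes = select(
    STUDENT,
    lambda row: row["name"] == "smith" and row["address"] == "forbes ave",
)
```

The result is again a table (a list of rows), so it can be fed to another operator.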

Relational operators

•  selection
•  .
•  .
•  set union: R ∪ S
•  set difference: R - S

Relational operators

•  selection picks rows - how about columns?
•  A: ‘projection’ - e.g., projecting on ‘ssn’ finds all the ‘ssn’ values, removing duplicates
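A minimal sketch of projection, assuming rows are stored as dicts: keep only the requested columns and drop any duplicates that appear as a result. The `TAKES` rows and the `project` helper are hypothetical.

```python
# Hypothetical TAKES relation: student 123 takes two courses.
TAKES = [
    {"ssn": 123, "cid": "415"},
    {"ssn": 123, "cid": "413"},
    {"ssn": 234, "cid": "415"},
]


def project(relation, attributes):
    """Projection: keep the listed attributes, removing duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:          # first time we see this combination
            seen.add(key)
            result.append({a: row[a] for a in attributes})
    return result


ssns = project(TAKES, ["ssn"])
```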

Relational operators

Cascading: ‘find ssn of students on ‘forbes ave’’
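Cascading works because every operator takes a table and returns a table. The sketch below nests a selection inside a projection for the ‘forbes ave’ query; the data and both helper functions are assumptions made for this example.

```python
# Hypothetical STUDENT relation.
STUDENT = [
    {"ssn": 123, "name": "smith", "address": "forbes ave"},
    {"ssn": 234, "name": "jones", "address": "main str"},
    {"ssn": 345, "name": "lee", "address": "forbes ave"},
]


def select(relation, condition):
    """Selection: rows satisfying the condition."""
    return [row for row in relation if condition(row)]


def project(relation, attributes):
    """Projection: chosen columns as tuples, duplicates removed, sorted."""
    return sorted({tuple(row[a] for a in attributes) for row in relation})


# Inner step picks rows, outer step picks columns.
on_forbes = select(STUDENT, lambda row: row["address"] == "forbes ave")
ssn_on_forbes = project(on_forbes, ["ssn"])
```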

Relational operators

•  selection
•  projection
•  .
•  set union: R ∪ S
•  set difference: R - S

Relational operators

Are we done yet?

Q: Give a query we cannot answer yet!

A: any query across two or more tables, e.g., ‘find names of students in 15-415’

Q: what extra operator do we need?

A: surprisingly, cartesian product is enough!

Cartesian product

•  e.g., dog-breeding: MALE × FEMALE
•  gives all possible couples

[Figure: MALE × FEMALE = table of all pairs]

so what?

•  E.g., how do we find names of students taking 415?

Cartesian product

•  A: take the cartesian product of the student table and the enrollment table, keep only the rows where the two ssn values match and the course is 415, then project on the name
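One way to answer ‘names of students taking 415’ with a cartesian product is sketched below: product first, then a selection, then a projection. The rows are invented, and the column layouts `STUDENT (ssn, name, address)` and `TAKES (ssn, cid, grade)` are borrowed from the join slides; treat the whole block as an illustration rather than the slide's own formula.

```python
from itertools import product

# Hypothetical data. STUDENT: (ssn, name, address); TAKES: (ssn, cid, grade).
STUDENT = [(123, "smith", "main str"), (234, "jones", "forbes ave")]
TAKES = [(123, "415", "A"), (234, "413", "B")]

# STUDENT x TAKES: every student row glued to every takes row (6 columns).
pairs = [s + t for s, t in product(STUDENT, TAKES)]

# Selection: the two ssn columns (positions 0 and 3) agree, and cid is 415.
matching = [row for row in pairs if row[0] == row[3] and row[4] == "415"]

# Projection on name (position 1), duplicates removed.
names = sorted({row[1] for row in matching})
```

The product alone pairs unrelated rows; it is the selection on matching ssn values that keeps only the meaningful combinations.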

FUNDAMENTAL Relational operators

•  selection
•  projection
•  cartesian product: MALE × FEMALE
•  set union: R ∪ S
•  set difference: R - S

Relational ops

•  Surprisingly, they are enough to help us answer almost any query we want!
•  derived/convenience operators:
   –  set intersection
   –  join (theta join, equi-join, natural join)
   –  ‘rename’ operator
   –  division

Joins

•  Equijoin: a cartesian product followed by a selection that equates attributes of the two tables
•  theta-joins: generalization of equi-join - any condition

Joins

•  very popular: natural join: R ⋈ S
•  like equi-join, but it drops duplicate columns:
   STUDENT (ssn, name, address)
   TAKES (ssn, cid, grade)
•  nat. join has 5 attributes; equi-join: 6

Natural Joins - nit-picking

•  if no attributes in common between R, S: nat. join -> cartesian product
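The column counts and the ‘no common attributes’ corner case can be checked with a small sketch. Rows are dicts; `natural_join` and `equi_join` are hypothetical helpers, and the equi-join keeps both ssn columns by prefixing attribute names with the table name, which is an assumption of this example.

```python
# Hypothetical data using the schemas STUDENT (ssn, name, address)
# and TAKES (ssn, cid, grade).
STUDENT = [
    {"ssn": 123, "name": "smith", "address": "main str"},
    {"ssn": 234, "name": "jones", "address": "forbes ave"},
]
TAKES = [
    {"ssn": 123, "cid": "415", "grade": "A"},
    {"ssn": 123, "cid": "413", "grade": "B"},
]
COLOR = [{"color": "red"}, {"color": "blue"}]  # shares no attribute with STUDENT


def natural_join(r, s):
    """Join on every attribute name the two tables share; shared columns appear once."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    # With no common attributes, all([]) is True, so every pair is kept.
    return [{**x, **y} for x in r for y in s if all(x[a] == y[a] for a in common)]


def equi_join(r, s, r_attr, s_attr, r_name, s_name):
    """Equality join that keeps the columns of both tables, prefixed by table name."""
    return [
        {**{f"{r_name}.{k}": v for k, v in x.items()},
         **{f"{s_name}.{k}": v for k, v in y.items()}}
        for x in r for y in s if x[r_attr] == y[s_attr]
    ]


nat = natural_join(STUDENT, TAKES)
equi = equi_join(STUDENT, TAKES, "ssn", "ssn", "STUDENT", "TAKES")
no_common = natural_join(STUDENT, COLOR)
```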

Overview - rel. algebra

•  fundamental operators
•  derived operators
   –  joins etc
   –  rename
   –  division
•  examples

Rename op.

•  Q: why?
•  A: shorthand; self-joins; …
•  for example, find the grand-parents of ‘Tom’, given PC (parent-id, child-id)

Rename op.

•  first, WRONG attempt:
•  (why? how many columns?)
•  Second WRONG attempt:
•  we clearly need two different names for the same table - hence, the ‘rename’ op.
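A sketch of the grand-parent query with two renamed copies of PC follows. The rows, the copy names `PC1` and `PC2`, and the `rename` helper are all assumptions; the point is only that each copy gets its own attribute names, so the join condition is unambiguous.

```python
# Hypothetical PC (parent-id, child-id) rows.
PC = [("alice", "bob"), ("bob", "tom"), ("carol", "dan")]


def rename(relation, name, attributes):
    """Give a table a new name: rows become dicts keyed by 'name.attribute'."""
    return [
        {f"{name}.{attr}": value for attr, value in zip(attributes, row)}
        for row in relation
    ]


PC1 = rename(PC, "PC1", ["parent_id", "child_id"])
PC2 = rename(PC, "PC2", ["parent_id", "child_id"])

# Self-join: PC1's child is PC2's parent, and PC2's child is Tom.
grandparents = sorted({
    x["PC1.parent_id"]
    for x in PC1
    for y in PC2
    if x["PC1.child_id"] == y["PC2.parent_id"] and y["PC2.child_id"] == "tom"
})
```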

Overview - rel. algebra

•  fundamental operators
•  derived operators
   –  joins etc
   –  rename
   –  division
•  examples

Division

•  Rarely used, but powerful.
•  Example: find suspicious suppliers, i.e., suppliers that supplied all the parts in A_BOMB

Division

•  Observations: ~reverse of cartesian product
•  It can be derived from the 5 fundamental operators (!!)
•  How?

Division

•  Observation: find ‘good’ suppliers, and subtract! (double negation)
•  Answer, built up from these pieces:
   –  all suppliers
   –  all bad parts
   –  all possible suspicious shipments
   –  all possible suspicious shipments that didn’t happen
   –  all suppliers who missed at least one suspicious shipment, i.e., ‘good’ suppliers
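The pieces listed above can be assembled step by step. In this sketch the shipment rows and the name `SHIPMENT` are invented, while `A_BOMB` stands for the set of bad parts; only product, projection, and difference are used, mirroring the double negation.

```python
from itertools import product

# Hypothetical SHIPMENT (supplier, part) rows and the set of bad parts.
SHIPMENT = {
    ("s1", "p1"), ("s1", "p2"),
    ("s2", "p1"),
    ("s3", "p1"), ("s3", "p2"), ("s3", "p3"),
}
A_BOMB = {"p1", "p2"}

# All suppliers (projection on the supplier column).
all_suppliers = {s for s, _ in SHIPMENT}

# All possible suspicious shipments: every supplier paired with every bad part.
all_possible = set(product(all_suppliers, A_BOMB))

# The suspicious shipments that didn't happen.
missing = all_possible - SHIPMENT

# 'Good' suppliers: those who missed at least one suspicious shipment.
good = {s for s, _ in missing}

# Suspicious suppliers: everyone else, i.e., SHIPMENT divided by A_BOMB.
suspicious = all_suppliers - good
```

Supplier s3 ships an extra part (p3), which does not matter: division only asks that every part in A_BOMB is covered.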

Overview - rel. algebra

•  fundamental operators
•  derived operators
   –  joins etc
   –  rename
   –  division
•  examples

Sample schema

[Figure: sample schema]

Examples

•  find names of students that take 15-415
•  find course names of ‘smith’

Examples

•  find ssn of ‘overworked’ students, i.e., that take 412, 413, 415
•  almost correct answer:
•  correct answer: [formula; the recoverable parts are the selection conditions c-name=413 and c-name=415]
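One formulation that does work for the ‘overworked’ query is to select each course separately, project on ssn, and intersect the three results. The sketch assumes a hypothetical `TAKES` table reduced to `(ssn, c-name)` pairs; it is not a transcription of the slide's formula.

```python
# Hypothetical TAKES rows as (ssn, c-name) pairs.
TAKES = {
    (123, "412"), (123, "413"), (123, "415"),
    (234, "412"), (234, "415"),
}


def ssn_taking(course):
    """Select rows with the given c-name, then project on ssn."""
    return {ssn for ssn, c_name in TAKES if c_name == course}


# A student is overworked only if they appear in all three result sets.
overworked = ssn_taking("412") & ssn_taking("413") & ssn_taking("415")
```

Each selection looks at one row at a time, so the "takes all three" requirement has to be expressed across rows, here by intersection.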

Examples

•  find ssn of students that work at least as hard as ssn=123, i.e., they take all the courses of ssn=123, and maybe more

Sample schema

[Figure: sample schema]
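This query has the shape of a division: students whose courses cover every course of ssn=123. A minimal sketch, assuming a hypothetical `TAKES` table of `(ssn, cid)` pairs, applies the same "find the ones who miss something, then subtract" idea.

```python
# Hypothetical TAKES rows as (ssn, cid) pairs.
TAKES = {
    (123, "413"), (123, "415"),
    (234, "412"), (234, "413"), (234, "415"),
    (345, "415"),
}

# Courses taken by ssn=123 (selection, then projection on cid).
courses_of_123 = {cid for ssn, cid in TAKES if ssn == 123}

# All students that appear in TAKES.
all_ssn = {ssn for ssn, _ in TAKES}

# Enrollments that would be required but are absent.
missed = {(ssn, cid) for ssn in all_ssn for cid in courses_of_123} - TAKES

# Subtract every student who misses at least one of those courses.
at_least_as_hard = all_ssn - {ssn for ssn, _ in missed}
```

Student 123 is trivially in the answer, and 234 qualifies despite taking an extra course.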

Conclusions

•  Relational model: only tables (‘relations’)
•  relational algebra: powerful, minimal: 5 operators can handle almost any query!