slides 7 relational algebra

Upload: alphexxx

Post on 07-Apr-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/3/2019 Slides 7 Relational Algebra

    1/16

    1

    Relational Algebra

    Lecture 7

    2

    Outline

    Relational Algebra (Section 6.1)

  • 8/3/2019 Slides 7 Relational Algebra

    2/16

    3

    Relational Algebra

    Formalism for creating new relationsfrom existing ones

    Its place in the big picture:

    Declarative

    querylanguage

    Algebra Implementation

    SQL,

    relational calculus

    Relational algebra

    4

    Relational Algebra Five operators:

    Union: !

    Difference: -

    Selection:s Projection: P Cartesian Product: "

    Derived or auxiliary operators: Intersection, complement

    Joins (natural,equi-join, theta join, semi-join)

    Renaming:r

  • 8/3/2019 Slides 7 Relational Algebra

    3/16

    5

    1. Union and 2. Difference

    R1 ! R2Example:

    ActiveEmployees ! RetiredEmployees

    R1 R2

    Example: AllEmployees ! RetiredEmployees

    6

    What about Intersection?

    It is a derived operatorR1 # R2 = R1 (R1 R2)

    Also expressed as a join (will see later)

    Example

    UnionizedEmployees# RetiredEmployees

  • 8/3/2019 Slides 7 Relational Algebra

    4/16

    7

    3. Selection

    Returns all tuples which satisfy a condition

    Notation: sc(R) (sometimes also !)

    Examples

    s Salary > 40000(Employee)

    s name = Smith(Employee)

    The condition c can be =, , %,

    [in SQL: SELECT * FROM Employee

    WHERE Salary > 40000]

    condition

    8

    Selection Example

    Employee

    SSN Name DepartmentID Salary

    999999999 John 1 30,000

    777777777 Tony 1 32,000888888888 Alice 2 45,000

    SSN Name DepartmentID Salary

    888888888 Alice 2 45,000

    Find all employees with salary more than $40,000.s Salary > 40000(Employee)

  • 8/3/2019 Slides 7 Relational Algebra

    5/16

    9

    4. Projection

    Eliminates columns, then removes duplicates

    Notation: PA1,,An(R) (sometimes !) Example:

    project to social-security number and names:

    PSSN, Name (Employee)

    Output schema: Answer(SSN, Name)

    [In SQL: SELECT DISTINCT SSN, Name FROM Employee]

    attributes

    10

    Projection Example

    Employee

    SSN Name DepartmentID Salary

    999999999 John 1 30,000

    777777777 Tony 1 32,000

    888888888 Alice 2 45,000

    SSN Name

    999999999 John

    777777777 Tony888888888 Alice

    PSSN, Name (Employee)

  • 8/3/2019 Slides 7 Relational Algebra

    6/16

    11

    5. Cartesian Product

    Combine each tuple in R1 with each tuple inR2

    Notation: R1 " R2

    Example: Employee " Dependents

    Very rare in practice; mainly used to expressjoins

    [In SQL: SELECT * FROM R1, R2]

    12

    Cartesian Product Example

    Employee

    Name SSN

    John 999999999

    Tony 777777777

    Dependents

    EmployeeSSN Dname999999999 Emily

    777777777 Joe

    Employee x Dependents

    Name SSN EmployeeSSN Dname

    John 999999999 999999999 Emily

    John 999999999 777777777 Joe

    Tony 777777777 999999999 Emily

    Tony 777777777 777777777 Joe

    Note the output

  • 8/3/2019 Slides 7 Relational Algebra

    7/16

    13

    Cartesian Product and SQLselect * fromEmployee, Dependents;

    +------+-----------+-------------+-------+| Name | SSN | EmployeeSSN | Dname |

    +------+-----------+-------------+-------+

    | John | 999999999 | 999999999 | Emily |

    | Tony | 777777777 | 999999999 | Emily |

    | John | 999999999 | 777777777 | Joe |

    | Tony | 777777777 | 777777777 | Joe |

    +------+-----------+-------------+-------+

    select * from Employee, Dependentswhere SSN=EmpoyeeSSN;

    +------+-----------+-------------+-------+| Name | SSN | EmployeeSSN | Dname |

    +------+-----------+--------------+-------+

    | John | 999999999 | 999999999 | Emily |

    | Tony | 777777777 | 777777777 | Joe |

    +------+-----------+-------------+-------+

    14

    Relational Algebra Five operators:

    Union: !

    Difference: -

    Selection:s Projection: P Cartesian Product: "

    Derived or auxiliary operators: Intersection, complement

    Joins (natural,equi-join, theta join, semi-join)

    Renaming:r

  • 8/3/2019 Slides 7 Relational Algebra

    8/16

    15

    Renaming

    Changes the schema, not the instance

    Schema: R(A1, , An )

    Notation: r B1,,Bn (R)

    Example:

    r LastName, SocSocNo (Employee)

    Output schema: Answer(LastName, SocSocNo)

    [in SQL: SELECT Name AS LastName, SSN AS SocSocNo

    FROM Employee]

    16

    Renaming Example

    Employee

    Name SSNJohn 999999999

    Tony 777777777

    LastName SocSocNoJohn 999999999Tony 777777777

    rLastName, SocSocNo (Employee)

  • 8/3/2019 Slides 7 Relational Algebra

    9/16

    17

    Natural Join

    Notation: R1 ! R2

    Meaning: R1 ! R2 = PA(sC(R1 " R2)) Equality on common attributes names

    Where: The selection sC checks equality of all common

    attributes

    The projection eliminates the duplicate commonattributes

    [in SQL: SELECT DISTINCT R1.A, R1. B, R2.C FROM R1, R2WHERE R1.B = R2.BSchema: R1(A,B), R2(B,C)]

    18

    Natural Join Example

    Employee

    Name SSNJohn 999999999Tony 777777777

    Dependents

    SSN Dname999999999 Emily777777777 Joe

    Name SSN DnameJohn 999999999 EmilyTony 777777777 Joe

    Employee Dependents =

    PName, SSN, Dname(s SSN=SSN2(Employee x r SSN2, Dname(Dependents))

  • 8/3/2019 Slides 7 Relational Algebra

    10/16

    19

    Natural Join

    R= S=

    R ! S=

    VZ

    ZY

    ZX

    YX

    BA

    VZ

    WV

    UZ

    CB

    WVZ

    VZY

    UZY

    VZX

    UZX

    CBA

    20

    Natural Join

    Given the schemas R(A, B, C, D), S(A, C,

    E), what is the schema of R ! S ?

    Given R(A, B, C), S(D, E), what is R ! S ?

    Given R(A, B), S(A, B), what is R ! S ?

  • 8/3/2019 Slides 7 Relational Algebra

    11/16

    21

    Theta Join

    A join that involves a predicate

    R1 !q R2 = sq (R1 " R2)

    Here qcan be any condition

    22

    Eq-join

    A theta join where qis an equalityR1 !A=B R2 = s A=B (R1 " R2)

    Equality on two different attributes

    Example:

    Employee !SSN=SSN Dependents

    Most useful join in practice(difference to natural join?)

  • 8/3/2019 Slides 7 Relational Algebra

    12/16

    23

    Semijoin

    R " S = PA1,,An (R ! S)

    Where A1, , An are the attributes in R

    Example:

    Employee " Dependents

    24

    Employee

    Name EmpId DeptNameHarry 3415 Finance

    Sally 2241 Sales

    George 3401 FinanceHarriet 2202 Sales

    Dept

    DeptName Manager

    Sales HarrietProduction Charles

    Semijoin - Example

    Employee!DeptName EmpId DeptNameSally 2241 Sales

    Harriet 2202 Sales

    Only attributes of Employee are selected

  • 8/3/2019 Slides 7 Relational Algebra

    13/16

    25

    Semijoins in DistributedDatabases

    Semijoins are used in distributed databases

    . . .. . .

    NameSSN

    Dname

    . . .. . .

    AgeSSNEmployee

    Dependents

    network

    Employee !ssn=ssn (s age>71 (Dependents))T = PSSNs age>71 (Dependents)

    R = Employee "!TAnswer = R! Dependents

    26

    Complex RA Expressions

    Person Purchase Person Product

    sname=fred sname=gizmoPpidPssn

    seller-ssn=ssn

    pid=pid

    buyer-ssn=ssn

    Pname

  • 8/3/2019 Slides 7 Relational Algebra

    14/16

    27

    Application:Query Rewriting for Optimization

    Reserves Sailors

    sid=sid

    bid=100 rating > 5

    sname

    Reserves Sailors

    sid=sid

    bid=100

    sname

    rating > 5(Scan;write totemp T1)

    (Scan;write totemp T2)

    The earlier we process selections, the fewer tuples we need tomanipulate higher up in the tree (predicate pushdown)Disadvantages?

    28

    Algebraic Laws (Examples)

    Commutative and Associative Laws R "S = S "R, R "(S "T) = (R "S) "T

    R S = S R, R (S T) = (R S) T

    Laws involving selection s C AND C"(R) = sC(s C"(R)) = sC(R) "sC"(R)

    s C (R S) = sC (R) S When C involves only attributes of R

    Laws involving projections PM(PN(R)) = PM,N(R)

    ! ! ! ! ! !

    ! !

  • 8/3/2019 Slides 7 Relational Algebra

    15/16

    29

    Operations on Bags

    A bag = a set with repeated elementsAll operations need to be defined carefully on bags

    {a,b,b,c}!{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f}

    {a,b,b,b,c,c} {b,c,c,c,d} = {a,b,b}

    sC(R): preserve the number of occurrences

    PA(R): no duplicate elimination

    Cartesian product, join: no duplicate elimination

    Important: Relational Engines work on bags, not sets!

    30

    Finally: RA has Limitations !

    Cannot compute transitive closure

    Find all direct and indirect relatives of Fred

    Cannot express in RA !!! Need to write Cprogram

    SisterLouNancy

    SpouseBillMary

    CousinJoeMary

    FatherMaryFred

    RelationshipName2Name1

  • 8/3/2019 Slides 7 Relational Algebra

    16/16

    31

    Conclusion

    Relational algebra is important for SQL

    We mainly need projects, selections,

    joins

    Joins are typically time consuming for

    large databases