4541.564; Spring 2008

ADVANCED DATABASES

Prof. Sang-goo Lee (14:30, Mon & Wed: Room 302-208)


Syllabus

Text Books
  Database System Concepts, 5th Edition, A. Silberschatz, H. F. Korth, and S. Sudarshan, McGraw Hill, 2006.
  Relational Database Theory, Atzeni & De Antonellis, Benjamin/Cummings, 1993. (Chap 2)
  Database Systems, Atzeni, et al, McGraw Hill, 2000. (Chap 6~7)

Lecture Notes
  will be posted before class at http://europa.snu.ac.kr
  username & password required
  Please use only for personal use

Exams (tentative dates)
  Exam 1: 4/7 (Mon)
  Exam 2: 5/19 (Mon)
  Exam 3: 6/14 (Sat, 14:30)

Term Project/Paper
2~3 programming assignments
To be announced later

Grades
  Exams 1, 2, & 3: 20% each
  Term Project/Paper: 30% total
  Quizzes, Assignments, and others: 10%

** A score of 0 in
• any one of the exams,
• any one of your term project, term paper, or
• more than 50% of your assignments/quizzes
will result in F.

Original Slides: © Silberschatz, Korth and Sudarshan
Advanced DB (2008-1) Copyright © 2006 - 2008 by S.-g. Lee

REVIEWS

Data, Database

Data
  A formal description of an entity, event, phenomenon, or idea that is worth recording

Database
  An integrated collection of persistent data representing the information of interest for various programs that compose the computerized information system of an organization.
  Data are separated from the programs that use them.

DBMS

Database Management System
  Collection of interrelated data and a set of programs to access those data

Information System
  DB + DBMS + Application programs + utilities

File System
  Part of OS
  Stores programs, data, documents, or anything (in disk)

Data Independence

ability to modify a schema in one level without affecting a schema definition in the next higher level
  physical data independence: physical level - conceptual level
  logical data independence: conceptual level - view level

[Figure: three schema levels. View level: View 1 ‥‥ View n. Conceptual level: Customer, Account. Physical level: pointer or table?]
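As a rough illustration of data independence at the view level, the sketch below uses Python's built-in sqlite3 module: a view keeps returning the same rows after the underlying table gains a column. The table customer, the view customer_names, and the added phone column are hypothetical names chosen for this example and do not come from the slides.

```python
import sqlite3


def view_survives_schema_change():
    """Query a view before and after the underlying table schema changes."""
    conn = sqlite3.connect(":memory:")
    # Conceptual level: a hypothetical customer table.
    conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [("HS Kim", "Suwon"), ("KS Lee", "Busan")],
    )
    # View level: applications only see customer names.
    conn.execute("CREATE VIEW customer_names AS SELECT name FROM customer")
    before = conn.execute(
        "SELECT name FROM customer_names ORDER BY name"
    ).fetchall()
    # Change the conceptual schema; the view definition is left untouched.
    conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
    after = conn.execute(
        "SELECT name FROM customer_names ORDER BY name"
    ).fetchall()
    conn.close()
    return before, after


before, after = view_survives_schema_change()
print(before == after)
```

A program written against customer_names does not need to change when the table underneath it is extended, which is the property the slide describes.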

Instances and Schemas

Similar to types and variables in programming languages

Schema – the logical structure of the database
  e.g., the database consists of information about a set of customers and accounts and the relationship between them
  Analogous to type information of a variable in a program
  Physical schema: database design at the physical level
  Logical schema: database design at the logical level

Instance – the actual content of the database at a particular point in time
  Analogous to the value of a variable

Data Models

The underlying structure of a database
Collection of conceptual tools for describing
  data
  data relationships
  data semantics
  consistency constraints

Entity Relationship Model
Relational Model
Object Oriented Model

Database Languages

Data Definition Language (DDL)
  Specifies the DB Schema
  create table
  drop column

Data Manipulation Language (DML)
  Operate on the contents of the DB
  retrieve, insert, delete, change, etc.

Query
  a statement requesting the retrieval of information
  query language: part of DML
  data model dependent

Storage Management

DBMS must effectively and efficiently manage storage (disk) space

Storage manager
  a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system

Physical Storage
  Physical storage media hierarchy
  RAID
  Storage Access
  File Organization
  Storage Structures for Object-Oriented Databases

DB users

DBA (DB Administrator)
  schema definition
  storage structure, access method definition
  schema & physical organization
  security & authorization
  backup and recovery

Application programmers

Sophisticated users
  use DML

Naïve users
  Use interfaces provided by application programs

Overall System Structure

[Figure: overall system structure. Labels recoverable from the diagram: application interface, application programs, query, database scheme; embedded DML precompiler, DML compiler, DDL interpreter, application programs object code, query evaluation engine (query processor); transaction manager, buffer manager, file manager (storage manager); indices, statistical data, data files, data dictionary.]

Entity Relationship Model

entity
  entity set vs. entity (instance)
  weak entity sets

relationship
  relationship cardinality
  binary, ternary, n-ary

attribute
  multivalued attributes, derived attributes

generalization/specialization
  total-partial, exclusive-overlap

aggregation

ER Diagram

[Figure: ER diagram]

Relational Model

E. F. Codd, "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970, pp. 377-387.
Uses a single structure called relation
Set (& math) oriented model
Physical data independence

Definition of a relation R:
  Let D1, ..., Dn be domains, then
  R ⊆ D1 × ... × Dn
  that is, R is a set of tuples <d1, ..., dn> with d1 ∈ D1, ..., dn ∈ Dn
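The definition above can be checked mechanically on small domains. This is a minimal sketch, assuming two made-up domains D1 (names) and D2 (cities); it builds the Cartesian product with itertools.product and confirms that a candidate relation is a subset of it.

```python
from itertools import product

# Hypothetical domains, chosen only for illustration.
D1 = {"HS Kim", "KS Lee"}
D2 = {"Suwon", "Busan"}

# D1 x D2: every possible pair <d1, d2>.
cartesian = set(product(D1, D2))

# A relation is any subset of the Cartesian product.
R = {("HS Kim", "Suwon"), ("KS Lee", "Busan")}
is_relation = R <= cartesian

# A tuple with a value outside D2 is not in the product.
not_a_relation = {("HS Kim", "Seoul")} <= cartesian

print(len(cartesian), is_relation, not_a_relation)
```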

Relations and Tables

a tuple in a relation represents a relationship among a set of values
Implemented as tables
Relation R = { <a1, b1, c1>, <a2, b2, c2>, ..., <an, bn, cn> }
=> Table R

  Name      Address   Telephone
  HS Kim    Suwon     323-3232
  KS Lee    Busan     323-5454
  MH Choi   Seoul     553-3235
  KH Na     Yongin    545-5488
  …

column (field, attribute)
row (record, tuple)

Relational Database

A relational database
  a set of relations
  a collection of tables

Keys
  superkey, candidate key, and primary key
  keys are constraints on allowable relation instances for a given schema

Relational Algebra

Query language
  Network, Hierarchical : navigational language
  Relational
    relational algebra
    relational calculus
    SQL
    QUEL

Algebra : operators and operands

Relational algebra
  operands : relations
  operators : fundamental operators + additional operators

Formal Definition

A basic expression in the relational algebra consists of either one of the following:
  A relation in the database
  A constant relation

Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:
  E1 ∪ E2
  E1 − E2
  E1 × E2
  σP(E1), P is a predicate on attributes in E1
  ΠS(E1), S is a list consisting of some of the attributes in E1
  ρN(E1), N is the new name for the result of E1
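The six fundamental operators listed above can be sketched in a few lines of Python. This is an illustrative toy, not a DBMS implementation: a relation is assumed to be a frozenset of rows, each row a sorted tuple of (attribute, value) pairs; rename is shown in its attribute-renaming form; product assumes the two relations have disjoint attribute names. The loan data is made up.

```python
def rel(rows):
    """Build a relation (set of rows) from a list of dicts."""
    return frozenset(tuple(sorted(r.items())) for r in rows)


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


def product(r, s):
    # Assumes r and s have disjoint attribute names.
    return frozenset(tuple(sorted(t + u)) for t in r for u in s)


def select(r, pred):
    # sigma_P: keep the rows that satisfy predicate P.
    return frozenset(t for t in r if pred(dict(t)))


def project(r, attrs):
    # Pi_S: keep only the listed attributes; duplicates collapse in the set.
    return frozenset(
        tuple(sorted((a, dict(t)[a]) for a in attrs)) for t in r
    )


def rename(r, mapping):
    # rho: rename attributes according to mapping (old name -> new name).
    return frozenset(
        tuple(sorted((mapping.get(a, a), v) for a, v in t)) for t in r
    )


loan = rel([
    {"b_name": "Seoul", "amount": 100},
    {"b_name": "Busan", "amount": 300},
])

# Pi_{b_name}(sigma_{amount > 200}(loan))
big = project(select(loan, lambda t: t["amount"] > 200), ["b_name"])
print(big == rel([{"b_name": "Busan"}]))
```

Because every operator takes relations and returns a relation, expressions nest freely, which is the closure property the formal definition relies on.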

Additional Operations

We define additional operations that do not add any power to the relational algebra, but that simplify common queries.
  Set intersection
  Natural join
  Division
  Assignment
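To see why these operations add no expressive power, the sketch below writes intersection using only set difference, and natural join as "pair the rows, keep those that agree on the common attributes, merge them". Relations are assumed to be plain Python sets (for intersection) or lists of dicts (for the join); the account and branch rows are hypothetical.

```python
def intersection(r, s):
    # r ∩ s expressed with difference only: r − (r − s)
    return r - (r - s)


def natural_join(r, s):
    """Join two lists of dict rows on all attributes they share."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            # Selection step: rows must agree on every common attribute.
            if all(t[a] == u[a] for a in common):
                row = {**t, **u}  # Projection step: one copy of each attribute.
                if row not in out:
                    out.append(row)
    return out


account = [
    {"b_name": "Seoul", "a_no": "A-1"},
    {"b_name": "Busan", "a_no": "A-2"},
]
branch = [{"b_name": "Seoul", "city": "Seoul"}]

print(intersection({1, 2, 3}, {2, 3, 4}))
print(natural_join(account, branch))
```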

SQL

Structured Query Language
IBM's System R project : Sequel
RC-based: SQL is declarative

DML & DDL
  SELECT / UNION / INTERSECT / EXCEPT
  INSERT / DELETE / UPDATE
  CREATE / DROP / ADD

Integrity Constraints

Integrity Constraints (IC) are rules that the data in the DB must abide by
IC defines the semantics of the DB

Domain Constraints
  restricts the values of a column

Referential Integrity
  Foreign Key Constraint
  Let r1(R1), r2(R2) be relations with primary keys k1 & k2, respectively. α ⊆ R2 is a foreign key referencing k1 if it is required that for every t2 ∈ r2, there must be a tuple t1 ∈ r1 such that t1[k1] = t2[α]
  Πα(r2) ⊆ Πk1(r1)
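The foreign key condition is a subset test between two projections, which the following sketch checks directly. Relations are assumed to be lists of dicts, and the branch and account rows are invented for the example.

```python
def satisfies_foreign_key(r1, k1, r2, alpha):
    """Return True if every alpha-value in r2 appears as a k1-value in r1."""
    referenced = {tuple(t[a] for a in k1) for t in r1}      # projection of r1 on k1
    referencing = {tuple(t[a] for a in alpha) for t in r2}  # projection of r2 on alpha
    return referencing <= referenced


branch = [{"b_name": "Seoul"}, {"b_name": "Busan"}]
account_ok = [{"a_no": "A-1", "b_name": "Seoul"}]
account_bad = [{"a_no": "A-2", "b_name": "Yongin"}]

print(satisfies_foreign_key(branch, ["b_name"], account_ok, ["b_name"]))
print(satisfies_foreign_key(branch, ["b_name"], account_bad, ["b_name"]))
```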

Integrity Constraints

Assertion
  a general constraint expressed as ∀x P(x)
  but in SQL as ¬∃x (¬P(x)), i.e., NOT EXISTS

  CREATE ASSERTION sum_constraint CHECK
    (NOT EXISTS (SELECT * FROM branch
                 WHERE (SELECT SUM(amount) FROM loan
                        WHERE loan.b_name = branch.b_name)
                    >= (SELECT SUM(amount) FROM account
                        WHERE account.b_name = branch.b_name)))

Trigger
  Action tied to a DB event (insert/delete/update)

  DEFINE TRIGGER overdraft ON UPDATE OF account T
    (IF NEW T.balance < 0
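For a runnable counterpart to the trigger idea, here is a small sketch using SQLite through Python's sqlite3 module. SQLite does not implement CREATE ASSERTION, and the action of the slide's overdraft trigger is cut off in the transcript, so the action below (rejecting the update) is an assumption made for this example, as are the account table and its columns.

```python
import sqlite3


def try_overdraft():
    """Attempt an update that would make a balance negative."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (a_no TEXT PRIMARY KEY, balance INTEGER)"
    )
    conn.execute("INSERT INTO account VALUES ('A-101', 50)")
    # Assumed action: abort any update that leaves a negative balance.
    conn.execute(
        """
        CREATE TRIGGER overdraft
        BEFORE UPDATE OF balance ON account
        WHEN NEW.balance < 0
        BEGIN
            SELECT RAISE(ABORT, 'overdraft not allowed');
        END
        """
    )
    rejected = False
    try:
        conn.execute(
            "UPDATE account SET balance = balance - 80 WHERE a_no = 'A-101'"
        )
    except sqlite3.DatabaseError:
        rejected = True
    (balance,) = conn.execute(
        "SELECT balance FROM account WHERE a_no = 'A-101'"
    ).fetchone()
    conn.close()
    return rejected, balance


print(try_overdraft())
```

The trigger fires on the update event, evaluates its condition against the new row, and the offending statement is undone, so the stored balance is unchanged.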

Functional Dependencies

Basic Concept
  R: relation scheme. Let α ⊆ R, β ⊆ R
  Functional dependency α → β holds on R, if in any legal relation r(R), for all pairs of tuples t1 and t2 in r, if t1[α] = t2[α] then t1[β] = t2[β]

Keys
Trivial FD
Inference Rules
  Reflexivity, Transitivity, Augmentation
Closure & Cover
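Two small helpers make these definitions concrete. The first tests whether α → β is satisfied by one given instance (satisfaction on a single instance does not prove the dependency holds on the schema); the second computes the closure of an attribute set under a set of FDs by applying them repeatedly until nothing new is added. Rows are assumed to be dicts, FDs are (left side, right side) pairs of sets, and the sample data is invented.

```python
def fd_holds(rows, alpha, beta):
    """Check alpha -> beta on one relation instance (a list of dict rows)."""
    seen = {}
    for t in rows:
        key = tuple(t[a] for a in alpha)
        val = tuple(t[b] for b in beta)
        # Two tuples agreeing on alpha must also agree on beta.
        if seen.setdefault(key, val) != val:
            return False
    return True


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


rows = [
    {"name": "HS Kim", "city": "Suwon", "phone": "323-3232"},
    {"name": "KS Lee", "city": "Busan", "phone": "323-5454"},
    {"name": "KS Lee", "city": "Busan", "phone": "545-5488"},
]
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(fd_holds(rows, ["name"], ["city"]))
print(fd_holds(rows, ["name"], ["phone"]))
print(sorted(closure({"A"}, fds)))
```

If the closure of α contains every attribute of R, then α is a superkey, which is how the "Keys" item connects to closure.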

Enforcing IC

DB states
  DB state = DB instance
  DB is in a valid state if it satisfies all integrity constraints
  All assertions are tested when created
  Modification to db is only allowed if it does not cause violation

[Figure: DB state 1 (at t0) → Update → DB state 2 (at t1)]

Object-Oriented Data Model

Object Structure
  an object corresponds to an entity in ER model
  an object encapsulates data and behavior
  all interactions to an object are made via messages.

An object has
  variables : contain data for object (attribute in ER model)
  messages : interaction with rest of the world
  methods : procedures that implement the messages

Classes & Inheritance

Object Class
  a group of similar objects
  same messages, methods, variables
  ⇒ class : object (instance) = entity set : entity (instance)

Inheritance
  Person
    Employee
      Officer
      Teller
      Secretary
    Customer

Complex (Composite) Objects
  Objects that contain other objects
  Aggregation (containment) hierarchy
  Use references (OID) as containment links

Object Identity & Persistence

Object Identity
  identity in relational model - key (by value)
  identity in file system - file name (by name)
  identity in OO system - system-generated id (built-in)
  An object retains its identity even if some or all variables or definitions of methods change over time

Storage and access of persistent objects
  How do you store methods?
  How do you find objects in a database?
  How do you store a composite object?
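The difference between identity by value and built-in identity can be seen in any object-oriented language. The sketch below uses Python, where == compares values and `is` compares identity; the Account class is a hypothetical example, and Python's id() only stands in for an OID (it is an in-memory identifier, not a persistent one).

```python
class Account:
    """Hypothetical class; equality is defined by value."""

    def __init__(self, number, balance):
        self.number = number
        self.balance = balance

    def __eq__(self, other):
        return (
            isinstance(other, Account)
            and (self.number, self.balance) == (other.number, other.balance)
        )


a = Account("A-101", 500)
b = Account("A-101", 500)
alias = a

same_value = a == b      # equal by value, like matching key values
same_identity = a is b   # two distinct objects, so different identities

# Identity survives a change to the object's variables.
oid_before = id(a)
a.balance = 700
oid_after = id(a)

print(same_value, same_identity, oid_before == oid_after, alias.balance)
```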

Object-Relational Database

Extended relational model to support
  Nested relations
  Complex types
  Object orientation

Most commercial DBMS claim to be OR
  Oracle, DB2, Informix, …

Nested Relation

Relational model
  First Normal Form: all attributes have atomic domains

Nested relational model
  Domains may be atomic or relation valued
    tuple (complex structure)
    set (multiset)
    list (special set)

Querying with Complex Types

Relation valued attributes
  An expression evaluating to a relation can appear anywhere that a relation name may appear

  select B.name, Y.name
  from docs as B, B.author_list as Y

Path expression (dot notation) is used
  composite attributes / references

  create table phd_students
    (advisor ref(people)) under people

  select phd_students.advisor.name
  from phd_students
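The first query above iterates over each document B and, within it, over each member Y of its relation-valued attribute author_list. A nested loop over in-memory data gives the same shape of result; the docs rows below are invented, and only the attribute names docs, name, and author_list come from the slide.

```python
# Hypothetical nested relation: each document has a set-valued author_list.
docs = [
    {"name": "salesplan", "author_list": [{"name": "Smith"}, {"name": "Jones"}]},
    {"name": "status report", "author_list": [{"name": "Jones"}]},
    {"name": "draft", "author_list": []},
]


def unnest_authors(docs):
    """Pairs (B.name, Y.name) for B in docs, Y in B.author_list."""
    return [(B["name"], Y["name"]) for B in docs for Y in B["author_list"]]


for pair in unnest_authors(docs):
    print(pair)
```

A document with an empty author_list contributes no rows, just as it would in the SQL form.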

OO vs OR

OR
  declarative and limited power of (extended) SQL (compared to PL)
  data protection and good optimization
  extends the Rel. model to make modeling and querying easier

OO
  efficient in complex main mem. operations of persistent data
  susceptible to data corruption

Relational
  simple data types, good query language, high protection

Indexing

Ordered Indices
B+-Tree Index Files
B-Tree Index Files
Static Hashing
Dynamic Hashing
Comparison of Ordered Indexing and Hashing
Index Definition in SQL
Multiple-Key Access

Query Processing & Optimization

Measures of Query Cost
Evaluation of individual operations
  Select, sort, join
Evaluation of Expressions
Equivalence of expressions
Cost based optimization

Transactions

Transaction
  a collection of operations that performs a single logical function in a database application
  programmer is responsible for writing "correct" transactions

DBMS must ensure the atomicity and durability of each transaction (at least)
  Atomicity: all-or-nothing
  Consistency: should not introduce inconsistencies
  Isolation: should not interfere with each other
  Durability: effect should be persistent
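Atomicity is easiest to appreciate with a funds transfer. The sketch below, using Python's sqlite3 module, wraps two updates in one transaction and rolls both back if the source account would go negative. The account table, the ids, and the helper names make_bank, balances, and transfer are assumptions made for this example.

```python
import sqlite3


def make_bank():
    """Create a hypothetical in-memory bank with two accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; all-or-nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (bal,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if bal < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


conn = make_bank()
print(transfer(conn, 1, 2, 500), balances(conn))
print(transfer(conn, 1, 2, 30), balances(conn))
```

In the failed transfer both updates had already been executed when the check fired, yet neither is visible afterwards; that is the all-or-nothing behavior. The consistency check itself is written by the programmer, matching the slide's point about "correct" transactions.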

Concurrency Control & Recovery

Serializability
Lock-based protocols
Time-stamp based protocols
Multiple granularity
Log-Based Recovery
Shadow Paging