4541.564; Spring 2008
ADVANCED DATABASES
Prof. Sang-goo Lee (14:30, Mon & Wed; Room 302-208)
Syllabus
Text Books
  • Database System Concepts, 5th Edition, A. Silberschatz, H. F. Korth, and S. Sudarshan, McGraw Hill, 2006.
  • Relational Database Theory, Atzeni & De Antonellis, Benjamin/Cummings, 1993. (Chap 2)
  • Database Systems, Atzeni, et al, McGraw Hill, 2000. (Chap 6~7)
Lecture Notes
  • will be posted before class at http://europa.snu.ac.kr
  • username & password required
  • Please use only for personal use
Exams (tentative dates)
  • Exam 1: 4/7 (Mon)
  • Exam 2: 5/19 (Mon)
  • Exam 3: 6/14 (Sat, 14:30)
Term Project/Paper
  • To be announced later
2~3 programming assignments
Grades
  • Exams 1, 2, & 3: 20% each
  • Term Project/Paper: 30% total
  • Quizzes, Assignments, and others: 10%
** A score of 0 in
  • any one of the exams,
  • any one of your term project, term paper, or
  • more than 50% of your assignments/quizzes
will result in F.
Original Slides: © Silberschatz, Korth and Sudarshan
Advanced DB (2008-1) Copyright © 2006 - 2008 by S.-g. Lee
Data, Database,
Data
  • A formal description of an entity, event, phenomenon, or idea that is worth recording
Database
  • An integrated collection of persistent data representing the information of interest for various programs that compose the computerized information system of an organization.
  • Data are separated from the programs that use them
DBMS
Database Management System
  • Collection of interrelated data and a set of programs to access those data
Information System
  • DB + DBMS + Application programs + utilities
File System
  • Part of OS
  • Stores programs, data, documents, or anything (in disk)
Data Independence
ability to modify a schema in one level without affecting a schema definition in the next higher level
  • physical data independence: physical level - conceptual level
  • logical data independence: conceptual level - view level
[Figure: three schema levels. View level: View 1 ‥‥ View n. Conceptual Level: Customer, Account. Physical Level: pointer or table?]
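The two kinds of independence can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the table, view, and index names are hypothetical and not from the slides. The same view-level query is run before and after a physical change (a new index) and a conceptual change (a new column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (acc_no INTEGER PRIMARY KEY, owner TEXT, balance INTEGER);
    INSERT INTO account VALUES (1, 'Kim', 500), (2, 'Lee', 50);
    -- View level: an external view defined over the conceptual schema.
    CREATE VIEW rich_owner AS SELECT owner FROM account WHERE balance >= 100;
""")

query = "SELECT owner FROM rich_owner ORDER BY owner"
before = conn.execute(query).fetchall()

# Physical-level change: add an access path. The conceptual schema is untouched.
conn.execute("CREATE INDEX idx_balance ON account (balance)")
after_index = conn.execute(query).fetchall()

# Conceptual-level change: add a column. The view definition is untouched.
conn.execute("ALTER TABLE account ADD COLUMN branch TEXT")
after_column = conn.execute(query).fetchall()
```

The query text never changes, which is the point: the level above is not rewritten when the level below is modified.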
Instances and Schemas
Similar to types and variables in programming languages
Schema – the logical structure of the database
  • e.g., the database consists of information about a set of customers and accounts and the relationship between them
  • Analogous to type information of a variable in a program
  • Physical schema: database design at the physical level
  • Logical schema: database design at the logical level
Instance – the actual content of the database at a particular point in time
  • Analogous to the value of a variable
Data Models
The underlying structure of a database
Collection of conceptual tools for describing
  • data
  • data relationships
  • data semantics
  • consistency constraints
Entity Relationship Model
Relational Model
Object Oriented Model
Database Languages
Data Definition Language (DDL)
  • Specifies the DB Schema
  • create table, drop column
Data Manipulation Language (DML)
  • Operates on the contents of the DB
  • retrieve, insert, delete, change, etc.
Query
  • a statement requesting the retrieval of information
  • query language: part of DML
  • data model dependent
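The DDL/DML split can be seen in a few statements. This is a minimal sketch assuming Python's built-in sqlite3 module; the customer table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the schema.
conn.execute("CREATE TABLE customer (name TEXT PRIMARY KEY, city TEXT)")

# DML: operates on the contents (insert, change, delete).
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [("Kim", "Suwon"), ("Lee", "Busan"), ("Choi", "Seoul")])
conn.execute("UPDATE customer SET city = 'Seoul' WHERE name = 'Lee'")
conn.execute("DELETE FROM customer WHERE name = 'Kim'")

# Query: the retrieval part of the DML.
seoul_names = [row[0] for row in conn.execute(
    "SELECT name FROM customer WHERE city = 'Seoul' ORDER BY name")]
```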
Storage Management
DBMS must effectively and efficiently manage storage (disk) space
Storage manager
  • a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system
Physical Storage
  • Physical storage media hierarchy
  • RAID
Storage Access
File Organization
Storage Structures for Object-Oriented Databases
DB users
DBA (DB Administrator)
  • schema definition
  • storage structure, access method definition
  • schema & physical organization
  • security & authorization
  • backup and recovery
Application programmers
Sophisticated users
  • use DML
Naïve users
  • Use interfaces provided by application programs
Overall System Structure
[Figure: overall system structure. Labels: application interface, application programs, query, database scheme; embedded DML precompiler, DML compiler, DDL interpreter, application programs object code, query evaluation engine (query processor); transaction manager, buffer manager, file manager (storage manager); data files, data dictionary, indices, statistical data.]
Entity Relationship Model
entity
  • entity set vs. entity (instance)
  • weak entity sets
relationship
  • relationship cardinality
  • binary, ternary, n-ary
attribute
  • multivalued attributes, derived attributes
generalization/specialization
  • total-partial, exclusive-overlap
aggregation
ER Diagram
[Figure: ER diagram]
Relational Model
E. F. Codd, "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970, pp. 377-387.
  • Uses a single structure called relation
  • Set (& math) oriented model
  • Physical data independence
Definition of a relation R:
  Let D1, ..., Dn be domains, then
  R ⊆ D1 × ... × Dn, where
  D1 × ... × Dn = { <d1, ..., dn> | d1 ∈ D1, ..., dn ∈ Dn }
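The definition can be made concrete with a tiny sketch: a relation is any subset of the Cartesian product of its domains. The two domains and the relation R below are hypothetical examples.

```python
from itertools import product

D1 = {"Kim", "Lee"}        # hypothetical domain of names
D2 = {"Suwon", "Busan"}    # hypothetical domain of cities

# D1 × D2: every possible <d1, d2> pair.
cartesian = set(product(D1, D2))

# A relation over (D1, D2) is any subset of that product.
R = {("Kim", "Suwon"), ("Lee", "Busan")}
is_relation = R <= cartesian
```

A tuple such as ("Kim", "Seoul") could not belong to any relation over these domains, because "Seoul" is not in D2.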
Relations and Tables
A tuple in a relation represents a relationship among a set of values
Implemented as tables
Relation R = { <a1, b1, c1>, <a2, b2, c2>, ..., <an, bn, cn> }
=> Table R

  Name      Address   Telephone
  HS Kim    Suwon     323-3232
  KS Lee    Busan     323-5454
  MH Choi   Seoul     553-3235
  KH Na     Yongin    545-5488
  …

  column = field, attribute
  row = record, tuple
Relational Database
A relational database
  • a set of relations
  • a collection of tables
Keys
  • superkey, candidate key, and primary key
  • keys are constraints on allowable relation instances for a given schema
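The sketch below tests whether a set of attributes behaves as a superkey on one instance, and lists the minimal such sets. Since keys constrain every allowable instance, a check like this can only refute a key, never prove one. The rows are hypothetical (one address is repeated on purpose), and the function names are illustrative.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal attribute sets that are superkeys of this instance."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            minimal = not any(set(k) <= set(attrs) for k in found)
            if minimal and is_superkey(rows, attrs):
                found.append(attrs)
    return found

rows = [
    {"name": "HS Kim", "address": "Suwon", "telephone": "323-3232"},
    {"name": "KS Lee", "address": "Busan", "telephone": "323-5454"},
    {"name": "MH Choi", "address": "Suwon", "telephone": "553-3235"},
]
keys = candidate_keys(rows, ("name", "address", "telephone"))
```

A primary key would then be whichever candidate key the designer picks.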
Relational Algebra
Query language
  • Network, Hierarchical: navigational language
  • Relational
      relational algebra
      relational calculus
      SQL
      QUEL
Algebra: operators and operands
Relational algebra
  • operands: relations
  • operators: fundamental operators + additional operators
Formal Definition
A basic expression in the relational algebra consists of one of the following:
  • A relation in the database
  • A constant relation
Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:
  • E1 ∪ E2
  • E1 – E2
  • E1 × E2
  • σP(E1), P is a predicate on attributes in E1
  • ΠS(E1), S is a list consisting of some of the attributes in E1
  • ρN(E1), N is the new name for the result of E1
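The six fundamental operators can be sketched over Python sets. This is an illustration, not the textbook's notation. It assumes a relation is a set of tuples, each tuple a frozenset of (attribute, value) pairs. The helper names and sample data are hypothetical, and rename here renames attributes rather than the whole result.

```python
def rel(*rows):
    """Build a relation from dicts; duplicates collapse (set semantics)."""
    return {frozenset(r.items()) for r in rows}

def select(r, pred):       # σP(r): keep tuples satisfying predicate P
    return {t for t in r if pred(dict(t))}

def project(r, attrs):     # ΠS(r): keep only the attributes in S
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def rename(r, mapping):    # ρ: rename attributes via an old-to-new mapping
    return {frozenset((mapping.get(a, a), v) for a, v in t) for t in r}

def product(r, s):         # r × s: attribute names are assumed disjoint
    return {t | u for t in r for u in s}

# Union (r ∪ s) and difference (r – s) are Python's own set operators | and -.

account = rel(
    {"acc": 1, "b_name": "Suwon", "balance": 500},
    {"acc": 2, "b_name": "Busan", "balance": 50},
    {"acc": 3, "b_name": "Suwon", "balance": 900},
)
branch = rel({"branch": "Suwon"}, {"branch": "Busan"})

# Π b_name (σ balance > 100 (account))
big = project(select(account, lambda t: t["balance"] > 100), {"b_name"})
pairs = product(account, branch)
```

Because every operator returns a relation, expressions nest freely, which is what the recursive definition above states.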
Additional Operations
We define additional operations that do not add any power to the relational algebra, but that simplify common queries.
  • Set intersection
  • Natural join
  • Division
  • Assignment
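That these operations add no power can be checked by writing them in terms of the fundamental ones. The sketch assumes the same hypothetical representation as before: a relation is a set of frozensets of (attribute, value) pairs. The takes, required, and teaches relations are made-up examples.

```python
def row(**attrs):
    return frozenset(attrs.items())

def project(r, attrs):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def product(r, s):
    return {t | u for t in r for u in s}

def intersection(r, s):
    # r ∩ s = r – (r – s)
    return r - (r - s)

def natural_join(r, s):
    # Selection over r × s keeping pairs equal on the common attributes;
    # the union t | u merges the duplicated columns.
    return {t | u for t in r for u in s
            if all(dict(u).get(a, v) == v for a, v in t)}

def divide(r, s, r_attrs, s_attrs):
    # r ÷ s = Π_{R–S}(r) – Π_{R–S}((Π_{R–S}(r) × s) – r)
    rest = set(r_attrs) - set(s_attrs)
    candidates = project(r, rest)
    missing = product(candidates, s) - r
    return candidates - project(missing, rest)

takes = {row(student="Kim", course="DB"), row(student="Kim", course="OS"),
         row(student="Lee", course="DB")}
required = {row(course="DB"), row(course="OS")}
teaches = {row(course="DB", prof="Lee SG")}

took_all = divide(takes, required, {"student", "course"}, {"course"})
joined = natural_join(takes, teaches)
```

Assignment corresponds to the intermediate variables (candidates, missing) inside divide.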
SQL
Structured Query Language
IBM's System R project: Sequel
Relational calculus (RC) based: SQL is declarative
DML & DDL
  • SELECT / UNION / INTERSECT / EXCEPT
  • INSERT / DELETE / UPDATE
  • CREATE / DROP / ADD
Integrity Constraints
Integrity Constraints (IC) are rules that the data in the DB must abide by
IC defines the semantics of the DB
Domain Constraints
  • restricts the values of a column
Referential Integrity
  • Foreign Key Constraint
  • Let r1(R1), r2(R2) be relations with primary keys k1 & k2, respectively. α ⊆ R2 is a foreign key referencing k1 if it is required that for every t2 ∈ r2, there must be a tuple t1 ∈ r1 such that t1[k1] = t2[α]
  • Πα(r2) ⊆ Πk1(r1)
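A foreign key constraint can be exercised with a short sketch. It assumes Python's built-in sqlite3 module (SQLite enforces foreign keys only after PRAGMA foreign_keys = ON); branch plays the role of r1 and account the role of r2, and the table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # off by default in SQLite
conn.executescript("""
    CREATE TABLE branch (b_name TEXT PRIMARY KEY);
    CREATE TABLE account (
        acc_no INTEGER PRIMARY KEY,
        b_name TEXT REFERENCES branch (b_name)
    );
    INSERT INTO branch VALUES ('Suwon');
    INSERT INTO account VALUES (1, 'Suwon');
""")

# No branch tuple t1 has t1[b_name] = 'Busan', so this insert must be rejected.
try:
    conn.execute("INSERT INTO account VALUES (2, 'Busan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The subset form of the constraint: Π b_name (account) ⊆ Π b_name (branch).
fk_values = {r[0] for r in conn.execute("SELECT b_name FROM account")}
pk_values = {r[0] for r in conn.execute("SELECT b_name FROM branch")}
```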
Integrity Constraints
Assertion
  • a general constraint expressed as ∀x P(x)
  • but in SQL as ¬∃x (¬P(x))
    CREATE ASSERTION sum_constraint CHECK
      (NOT EXISTS (SELECT * FROM branch
                   WHERE (SELECT SUM(amount) FROM loan
                          WHERE loan.b_name = branch.b_name)
                      >= (SELECT SUM(amount) FROM account
                          WHERE account.b_name = branch.b_name)))
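Few systems implement CREATE ASSERTION, but the NOT EXISTS pattern can still be evaluated as an ordinary query. The sketch below assumes Python's built-in sqlite3 module and hypothetical table contents. It runs the CHECK body before and after an insert that makes one branch's loans reach its account total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch  (b_name TEXT PRIMARY KEY);
    CREATE TABLE loan    (b_name TEXT, amount INTEGER);
    CREATE TABLE account (b_name TEXT, amount INTEGER);
    INSERT INTO branch  VALUES ('Suwon'), ('Busan');
    INSERT INTO loan    VALUES ('Suwon', 100), ('Busan', 300);
    INSERT INTO account VALUES ('Suwon', 500), ('Busan', 400);
""")

# The CHECK body of the assertion, evaluated as a query returning 1 or 0.
SUM_CONSTRAINT = """
    SELECT NOT EXISTS (
        SELECT * FROM branch
        WHERE (SELECT SUM(amount) FROM loan
               WHERE loan.b_name = branch.b_name)
           >= (SELECT SUM(amount) FROM account
               WHERE account.b_name = branch.b_name))
"""

def assertion_holds(connection):
    return bool(connection.execute(SUM_CONSTRAINT).fetchone()[0])

holds_before = assertion_holds(conn)
conn.execute("INSERT INTO loan VALUES ('Busan', 200)")   # Busan loans: 500 >= 400
holds_after = assertion_holds(conn)
```

The inner SELECT searches for a counterexample x with ¬P(x); the assertion holds exactly when none exists.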
Trigger
  • Action tied to a DB event (insert/delete/update)
    DEFINE TRIGGER overdraft ON UPDATE OF account T
      (IF NEW T.balance < 0
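A trigger of the same event-plus-condition shape can be sketched in SQLite syntax through Python's built-in sqlite3 module. The slide's own action is cut off in this transcript, so the action used here (writing a row to a hypothetical overdraft_log table) is an assumption chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE overdraft_log (acc_no INTEGER, balance INTEGER);
    INSERT INTO account VALUES (1, 100);

    -- Event: update of account.balance. Condition: the new balance is negative.
    -- Action (assumed for illustration): record the overdraft.
    CREATE TRIGGER overdraft AFTER UPDATE OF balance ON account
    WHEN NEW.balance < 0
    BEGIN
        INSERT INTO overdraft_log VALUES (NEW.acc_no, NEW.balance);
    END;
""")

conn.execute("UPDATE account SET balance = balance - 30 WHERE acc_no = 1")    # 70: condition false
conn.execute("UPDATE account SET balance = balance - 100 WHERE acc_no = 1")   # -30: trigger fires
log = conn.execute("SELECT acc_no, balance FROM overdraft_log").fetchall()
```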
Functional Dependencies
Basic Concept
  • R: relation scheme. Let α ⊆ R, β ⊆ R
  • Functional dependency α→β holds on R, if in any legal relation r(R), for all pairs of tuples t1 and t2 in r, if t1[α]=t2[α] then t1[β]=t2[β]
Keys
Trivial FD
Inference Rules
  • Reflexivity, Transitivity, Augmentation
Closure & Cover
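Two small routines illustrate the definitions: one checks α→β against a single instance, the other computes the closure of an attribute set by repeatedly applying the given FDs. The FD set F and the sample rows are hypothetical examples; attribute sets are represented as Python sets.

```python
def holds(rows, lhs, rhs):
    """Check lhs→rhs on one instance: tuples equal on lhs must be equal on rhs."""
    seen = {}
    for r in rows:
        a = tuple(r[x] for x in sorted(lhs))
        b = tuple(r[x] for x in sorted(rhs))
        if seen.setdefault(a, b) != b:
            return False
    return True

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Hypothetical FD set over R = (A, B, C, G, H, I).
F = [({"A"}, {"B"}), ({"A"}, {"C"}), ({"C", "G"}, {"H"}),
     ({"C", "G"}, {"I"}), ({"B"}, {"H"})]

ag_plus = closure({"A", "G"}, F)
a_plus = closure({"A"}, F)

rows = [{"A": 1, "B": 2}, {"A": 1, "B": 2}, {"A": 2, "B": 2}]
```

Since the closure of AG is all of R, AG is a superkey of R under F, which is how the closure connects back to keys.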
Enforcing IC
DB states
  • DB state = DB instance
  • DB is in a valid state if it satisfies all Integrity constraints
  • All assertions are tested when created
  • Modification to db is only allowed if it does not cause violation
[Figure: DB state 1 (at t0) → Update → DB state 2 (at t1)]
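A rejected modification leaves the database in its previous valid state. The sketch below shows this with a CHECK constraint, assuming Python's built-in sqlite3 module; the account table and amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

state_1 = conn.execute("SELECT * FROM account").fetchall()   # valid state at t0

# This update would produce an invalid state, so the DBMS refuses it.
try:
    conn.execute("UPDATE account SET balance = balance - 500 WHERE acc_no = 1")
    update_allowed = True
except sqlite3.IntegrityError:
    update_allowed = False

state_2 = conn.execute("SELECT * FROM account").fetchall()   # state at t1
```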
Object-Oriented Data Model
Object Structure
  • an object corresponds to an entity in ER model
  • an object encapsulates data and behavior
  • all interactions to an object are made via messages.
An object has
  • variables: contain data for object (attribute in ER model)
  • messages: interaction with rest of the world
  • methods: procedures that implement the messages
Classes & Inheritance
Object Class
  • a group of similar objects
  • same messages, methods, variables
  ⇒ class : object (instance) = entity set : entity (instance)
Inheritance
  [Figure: class hierarchy]
  Person
    Employee
      Officer, Teller, Secretary
    Customer
Complex (Composite) Objects
  • Objects that contain other objects
  • Aggregation (containment) hierarchy
  • Use references (OID) as containment links
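The class, inheritance, and containment ideas map directly onto an object-oriented language. The sketch uses Python classes named after the hierarchy on the slide; the attributes, the describe method, and the Branch composite are hypothetical additions for illustration.

```python
class Person:
    def __init__(self, name):
        self.name = name               # variable (attribute in the ER model)

    def describe(self):                # method implementing the "describe" message
        return f"{self.name} ({type(self).__name__})"

class Employee(Person):                # inherits variables and methods of Person
    def __init__(self, name, salary):
        super().__init__(name)
        self.salary = salary

class Customer(Person):
    pass

class Teller(Employee):
    pass

class Branch:
    """Composite object: contains other objects through references."""
    def __init__(self, name, staff):
        self.name = name
        self.staff = list(staff)       # references, not copies

teller = Teller("Kim", 3000)
branch = Branch("Suwon", [teller])
```

The staff list holds references to the same objects, which is the in-memory counterpart of using OIDs as containment links.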
Object Identity & Persistence
Object Identity
  • identity in relational model - key (by value)
  • identity in file system - file name (by name)
  • identity in OO system - system-generated id (built-in)
An object retains its identity even if some or all variables or definitions of methods change over time
Storage and access of persistent objects
  • How do you store methods?
  • How do you find objects in a database?
  • How do you store a composite object?
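The difference between identity by value and built-in identity shows up in any object-oriented runtime. The sketch uses Python's id() as an in-memory stand-in for an OID; a real OODBMS would need identifiers that survive across program runs, which id() does not provide. The Account class is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Account:
    owner: str
    balance: int

a = Account("Kim", 100)
b = Account("Kim", 100)

same_value = (a == b)      # identity by value, as with a relational key
same_object = (a is b)     # built-in identity: two distinct objects

oid_before = id(a)
a.balance = 0              # the object's state changes...
oid_after = id(a)          # ...but its identity does not
```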
Object-Relational Database
Extended relational model to support
  • Nested relations
  • Complex types
  • Object orientation
Most commercial DBMS claim to be OR
  • Oracle, DB2, Informix, …
Nested Relation
Relational model
  • First Normal Form: all attributes have atomic domains
Nested relational model
  • Domains may be atomic or relation valued
  • tuple (complex structure)
  • set (multiset)
  • list (special set)
Querying with Complex Types
Relation valued attributes
  • An expression evaluating to a relation can appear anywhere that a relation name may appear
    select B.name, Y.name
    from docs as B, B.author_list as Y
Path expression (dot notation) is used
  • composite attributes / references
    create table phd_students
      (advisor ref(people))
      under people
    select phd_students.advisor.name
    from phd_students
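Both queries can be imitated over nested Python structures. This is only an analogy, with hypothetical data: the inner loop plays the role of ranging Y over B.author_list, and the dictionary lookup plays the role of following the advisor reference.

```python
# A nested relation: author_list is itself relation valued.
docs = [
    {"name": "salesplan", "author_list": [{"name": "Smith"}, {"name": "Jones"}]},
    {"name": "status report", "author_list": [{"name": "Jones"}]},
]

# select B.name, Y.name from docs as B, B.author_list as Y
pairs = [(B["name"], Y["name"]) for B in docs for Y in B["author_list"]]

# References: advisor holds an identifier of a tuple in people.
people = {"p1": {"name": "Lee"}}
phd_students = [{"name": "Kim", "advisor": "p1"}]

# select phd_students.advisor.name from phd_students
advisor_names = [people[s["advisor"]]["name"] for s in phd_students]
```

The first query effectively unnests the relation into flat (document, author) pairs, which is the form a First Normal Form schema would store.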
OO vs OR
OR
  • declarative and limited power of (extended) SQL (compared to PL)
  • data protection and good optimization
  • extends the Rel. model to make modeling and querying easier
OO
  • efficient in complex main mem. operations of persistent data
  • susceptible to data corruption
Relational
  • simple data types, good query language, high protection
Indexing
  • Ordered Indices
  • B+-Tree Index Files
  • B-Tree Index Files
  • Static Hashing
  • Dynamic Hashing
  • Comparison of Ordered Indexing and Hashing
  • Index Definition in SQL
  • Multiple-Key Access
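The comparison between ordered indexing and hashing comes down to which lookups each structure supports well. In this sketch a dict stands in for a hash index and a sorted list with binary search stands in for the leaf level of an ordered index; neither is a real B+-tree or an on-disk structure, and the records are hypothetical.

```python
from bisect import bisect_left, bisect_right

records = [("Kim", 500), ("Lee", 50), ("Choi", 900), ("Na", 300), ("Park", 300)]

# Hash index: search key -> record positions. Suited to equality lookups.
hash_index = {}
for pos, (_, balance) in enumerate(records):
    hash_index.setdefault(balance, []).append(pos)

# Ordered index: (search key, position) entries kept in key order.
ordered_index = sorted((balance, pos) for pos, (_, balance) in enumerate(records))
keys = [k for k, _ in ordered_index]

def lookup_eq(balance):
    return [records[p][0] for p in hash_index.get(balance, [])]

def lookup_range(lo, hi):
    """Names with lo <= balance <= hi, located by two binary searches."""
    start, end = bisect_left(keys, lo), bisect_right(keys, hi)
    return [records[p][0] for _, p in ordered_index[start:end]]

equal_300 = lookup_eq(300)
between = lookup_range(100, 500)
```

A range query against the hash index would have to probe every possible key or scan all buckets, which is why ordered indices are preferred when range queries are common.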
Query Processing & Optimization
  • Measures of Query Cost
  • Evaluation of individual operations
      Select, sort, join
  • Evaluation of Expressions
  • Equivalence of expressions
  • Cost based optimization
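Equivalent expressions can differ greatly in cost, which is what an optimizer exploits. The sketch compares applying a selection after a join with pushing it below the join. The relations are hypothetical, and the number of tuple pairs compared by a nested-loop join is used as a crude stand-in for a real cost measure such as block transfers.

```python
def join(r, s, key):
    """Nested-loop join on a shared attribute; returns (result, pairs compared)."""
    out = [{**t, **u} for t in r for u in s if t[key] == u[key]]
    return out, len(r) * len(s)

def select(r, pred):
    return [t for t in r if pred(t)]

account = [{"acc": i, "balance": i * 100} for i in range(1, 9)]
depositor = [{"acc": i, "owner": f"c{i}"} for i in range(1, 9)]

def rich(t):
    return t["balance"] >= 700

# Plan A: σ applied after the join.
joined, cost_a = join(account, depositor, "acc")
plan_a = select(joined, rich)

# Plan B: σ pushed below the join (legal because it uses only account attributes).
plan_b, cost_b = join(select(account, rich), depositor, "acc")
```

Both plans return the same tuples, but plan B compares a quarter as many pairs in this example.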
Transactions
Transaction
  • a collection of operations that performs a single logical function in a database application
  • programmer is responsible for writing "correct" transactions
  • DBMS must ensure the atomicity and durability of each transaction (at least)
Atomicity: all-or-nothing
Consistency: should not introduce inconsistencies
Isolation: should not interfere with each other
Durability: effect should be persistent
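Atomicity can be observed with a funds transfer that fails halfway. The sketch assumes Python's built-in sqlite3 module with its default transaction handling; the account table, owners, and the transfer helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("Kim", 100), ("Lee", 100)])
conn.commit()

def transfer(connection, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing unit."""
    try:
        connection.execute(
            "UPDATE account SET balance = balance + ? WHERE owner = ?", (amount, dst))
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE owner = ?", (amount, src))
        connection.commit()
        return True
    except sqlite3.IntegrityError:
        connection.rollback()      # undo the credit that had already succeeded
        return False

def balances(connection):
    return dict(connection.execute("SELECT owner, balance FROM account"))

ok_small = transfer(conn, "Kim", "Lee", 30)    # both updates succeed
ok_big = transfer(conn, "Kim", "Lee", 500)     # debit violates the CHECK; all undone
final = balances(conn)
```

Without the rollback, the failed transfer would leave Lee credited with 500 that Kim never paid, which is the inconsistency atomicity rules out.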