dbms and implementation all(full)
TRANSCRIPT
-
7/28/2019 DBMS and Implementation All(Full)
1/69
1
Database Management Systemsand Their Implementation
By Xu LizhenSchool of Computer Science and Engineering, Southeast University
Nanjing, China
Course Goal and its Preliminary Courses
The Preliminary Courses are:
Data Structure Database Principles Database Design and Application
The students should already have the basic concepts
2Database Management Systems and Their Implementation, Xu Lizhen
a ou a a ase sys em, suc as a a mo e , a aschema, SQL, DBMS, transaction, database design, etc.
Now we will introduce the implementationtechniques of Database Management Systems.
The goal is to build the foundation of furtherresearch in database field and to use databasesystem better through the study of this course.
Main Contents
Introduce the inner implementationtechnique of every kind of DBMS,including the architecture of DBMS, thesupport to data model and the
3Database Management Systems and Their Implementation, Xu Lizhen
implementation of DBMS core, userinterface, etc. The emphasis is the basicconcepts, the basic principles and theimplementation methods related toDBMS core.
-
7/28/2019 DBMS and Implementation All(Full)
2/69
2
Main Contents
Because the relational data model is the mainstreamdata model, and distributed DBMS includes all aspectsof classical centralized DBMS, the main thread of thiscourse is relational distributed database managements stem. The im lementation of ever as ects of DBMS
4Database Management Systems and Their Implementation, Xu Lizhen
are introduced according to distributed DBMS. Somecontents of other kinds of DBMS are also introduced,including federated database systems, parallel databasesystems and object-oriented database systems, etc.Along with the continuous progress of databasetechnique, new contents will be added at any time.
Course History
Database System Principles (before 1984) ----relational model theory, some query optimizationalgorithms
Distributed Database Systems (1985~1994) ----introduce implementation techniques in DDBMS
5
oroug y an sys em ca y
Database Management Systems and Their
Implementation (after 1995) ---- not limited toDDBMS, hope to introduce the implementationtechniques of DBMS more thoroughly. Along withthe progress of database technique, new contents canbe added without changing the course name.
Database Management Systems and Their Implementation, Xu Lizhen
References
1) Stefano Ceri, Distributed Databases
2) Wang Nengbin, Principles of Database Systems
3) Raghu Ramakrishnan, Johannes Gehrke, DatabaseManagement Systems , 3rd Edition, McGraw-HillCompanies, 2002
6Database Management Systems and Their Implementation, Xu Lizhen
4) Hector Garcia-Molina, Jeffrey.D.Ullman, DatabaseSystems: the Complete Book
5) S.Bing Yao et al, Query Optimization in DDBS
6) Courseware:http://cse.seu.edu.cn/people/lzxu/resource
-
7/28/2019 DBMS and Implementation All(Full)
3/69
3
Table of Contents
1. Introduction
The history, classification, and main research contents ofdatabase systems; Distributed database system
2. DBMS Architecture
The com osition of DBMS and its rocess structure; The
7Database Management Systems and Their Implementation, Xu Lizhen
architecture of distributed database systems
3. Access Management of Database
Physical file organization, index, and access primitives
4. Data Distribution
The fragmentation and distribution of data, distributeddatabase design, federated database design, parallel databasedesign, data catalog and its distribution
Table of Contents
5. Query Optimization
Basic problems; Query optimization techniques; Queryoptimization in distributed database systems; Queryoptimization in other kinds of DBMS
6. Recovery Mechanism
8Database Management Systems and Their Implementation, Xu Lizhen
Basic problems; Updating strategies and recovery techniques;Recovery mechanism in distributed DBMS
7.
Concurrency ControlBasic problems; Concurrency control techniques; Concurrencycontrol in distributed DBMS; Concurrency control in otherkinds of DBMS
1. Introduction
Database Management Systemsand Their Implementation, Xu
Lizhen
-
7/28/2019 DBMS and Implementation All(Full)
4/69
4
1.1 The History of Database Technologyand its Classification
(1) According to the development of data model No management(before 1960): Scientific computing
File system: Simple data management
Demand of data management growing continuously,DBMS emer ed.
10Database Management Systems and Their Implementation, Xu Lizhen
.
1964, the first DBMS (American): IDS,network
1969, the first commercialDBMS of IBM, hierarchical
1970, E.F.Codd(IBM) bring forwardrelational data model
Other data model: Object Oriented, deductive, ER, ...
(2) According to the development of DBMSarchitectures
Centralized database systems
Parallel database systems
Distributed database systems (and Federateddatabase systems)
Mobile database systems
3 Accordin to the develo ment of architectures of
11Database Management Systems and Their Implementation, Xu Lizhen
application systems based on databases
Centralized structure : HostTerminal
Distributed structure Client/Server structure
Three tier/multi-tier structure
Mobile computing
Grid computing (Data Grid), Cloud Computing
(4) According to the expanding of application fields
OLTP
Engineering Database
Deductive Database
Multimedia Database
Temporal Database
12Database Management Systems and Their Implementation, Xu Lizhen
Spatial Database
Data Warehouse, OLAP, Data Mining
XML Database
Big Data, NoSQL, NewSQL
-
7/28/2019 DBMS and Implementation All(Full)
5/69
5
13Database Management Systems and Their Implementation, Xu Lizhen
1.2 Distributed Database Systems
What is DDB?
A DDB is a collection of correlated data which arespread across a network and managed by a softwarecalled DDBMS.
14Database Management Systems and Their Implementation, Xu Lizhen
Two kinds:(1) Distributed physically, centralized logically (general DDB)
(2) Distributed physically, distributed logically too (FDBS)
We take the first as main topic in this course.
Distribution
Correlation
DDBMS
Features of DDBS :
15Database Management Systems and Their Implementation, Xu Lizhen
-
7/28/2019 DBMS and Implementation All(Full)
6/69
6
The advantages of DDBS:
Local autonomy
Good availability (because support multi copies) Good flexibility
Low system cost
High efficiency (most access processed locally, lesscommunication comparing to centralized database
16Database Management Systems and Their Implementation, Xu Lizhen
system)
Parallel process
The disadvantages of DDBS: Hard to integrate existing databases
Too complex (system itself and its using, maintenance,etc. such as DDB design)
The main problems in DDBS:
Compared to centralized DBMS, thedifferences of DDBS are as follows:
Query Optimization (different optimizinggoal)
Concurrenc control should consider whole
17Database Management Systems and Their Implementation, Xu Lizhen
network)
Recovery mechanism (failure combination)
Another problem specially for DDBS:
Data distribution
2. The Architecture of DBMS
Database Management Systemsand Their Implementation, Xu
Lizhen
-
7/28/2019 DBMS and Implementation All(Full)
7/69
7
Main Contains
The components of DBMS core The process structure of DBMS
The components of DDBMS core
19
Database Management Systems and Their Implementation, Xu Lizhen
2.1 The Components of DBMS Core
Database statement
(such as SQL)
App1
Grant checking
Parser
App i
interfacem
Appj
interface1
Statement / Program
Grammar tree
Message or data
Formatted message or data
App n. . . .. . .. .
u f i /API . . .
20Database Management Systems and Their Implementation, Xu Lizhen
DBMS coreOperating system
Disk
Semantic analysis and query treatment
( DDL QL DML DCL )
Concurrency
control
Recovery
mechanism
Access
management
Access pr imiti ve
System call
I/Ocommand
Message or data
Message or data
State info. or physical
data block
2.2 The Process Structure of DBMS
Single process structure
Multi processes structure
Multi threads structure
21Database Management Systems and Their Implementation, Xu Lizhen
processes / threads
-
7/28/2019 DBMS and Implementation All(Full)
8/69
8
Single process structure
The application program is compiled with DBMS core
as a single .exe file, running as a single process.
App licat ion cod es
22Database Management Systems and Their Implementation, Xu Lizhen
DBMS core (as a function)
SQLstatements Result
.exe file
Multi processes structure
One application process corresponding to one DBMScore process
App lic atio nprocess 1
DBMScore process 1pipe
SQLstatements
results
23Database Management Systems and Their Implementation, Xu Lizhen
App lic atio nprocess 2
DBMScore process 2pipe
SQLstatements
results
App lic atio nprocess n
DBMScore process npipe
SQLstatements
results
.
.
.
.
.
.
Multi threads structure
Only one DBMS process, every application processcorresponding to a DBMS core thread.
c at al og l oc k t ab le b uf fer DAEMON
Appl icati oni e/socket
SQLs tatements
DBMScore thread 1
24Database Management Systems and Their Implementation, Xu Lizhen
DBMSprocess
processresults
pipe/socketAppl icati on
process 2DBMScore thread 2
SQLs tatements
results
pipe/socketAppl icati on
process nDBMScore thread n
SQLs tatements
results
. . . . . .
-
7/28/2019 DBMS and Implementation All(Full)
9/69
9
Communication protocols between processes / threads
Application programs access databases through API
or embedded SQL offered by DBMS, according tocommunication protocol to realize synchronizingcontrol:
Ad-hoc i nterfac e orDBMS core
Pipe0
25Database Management Systems and Their Implementation, Xu Lizhen
app ca on programPipe1
State TupNum AttNum AttName AttType AttLen . . . . . . TmpFileName
Def in it iono fonea t t r ibu te Def in it ion o fo thera t tr ibutes
Pipe0: Send SQL statements, inner commands;Pipe1: return results. The result format:
Communication protocols between processes / threads
State: 0 -- error, 1 -- success for insert,delete,update,
2 -- query success, need to treat result further.
TupNum: tuple number in result.
AttNum: attribute number in result table.
AttName: attribute name.
26Database Management Systems and Their Implementation, Xu Lizhen
.
AttType: attribute type.
AttLen: byte number of this attribute.
TmpFileName: name of the temporary file whichstore the result data, need the above metadata toexplain it.
2.3 The Components of DDBMS Core
DB DC
DDDDBK
LDB1DB: database management
DC: communication control
27Database Management Systems and Their Implementation, Xu Lizhen
DB DC
DDDDBK
LDB2
Site1
Site2
DD: catalog management
DDBK: core, responsible for
parsing, distributed
transaction management,
concurrency control, recovery
and global query optimization.
-
7/28/2019 DBMS and Implementation All(Full)
10/69
10
An example of global query optimization
Global query optimization mayget an execution plan based oncost estimation, such as:
(1)send R2 to site1, R
R1 R2
Site1 Site2
28Database Management Systems and Their Implementation, Xu Lizhen
execute on site :
Select *
From R1, R
Where R1.a = R.b;
Select *From R1,R2Where R1.a = R2.b;
2.4 The Process Structure of DDBMS
Daemon Daemon Daemon
DDBMS core thread
Appl icati on pr ocess 2Appl icati on pr ocess 1
SQL/result SQL/result
SQL/result
SQL/resultDDBMS core thread 1
DDBMS core thread
29Database Management Systems and Their Implementation, Xu Lizhen
LDBMS process
Local database 1
LDBMS process
Local database 2
LDBMSprocess
Local database 3
Site 1 Site 2 Site 3
SQL/result SQL/result SQL/result SQL/result
DDBMS core thread 2
3. Database Access Management
Database Management Systemsand Their Implementation, Xu
Lizhen
-
7/28/2019 DBMS and Implementation All(Full)
11/69
11
Main Contains
The access to database is transferred to theoperations on files (of OS) eventually. The filestructure and access route offered on it willaffect the speed of data access directly. It isim ossible that one kind of file structure will
31Database Management Systems and Their Implementation, Xu Lizhen
be effective for all kinds of data access
Access types
File organization
Index technique
Access primitives
Access Types
Query all or most records of a file (>15%)
Query some special record
Query some records (
-
7/28/2019 DBMS and Implementation All(Full)
12/69
12
Index Technique
B+ Tree () Clustering index ( ) Inverted file
Dynamic hashing
Grid structure file and artitioned hash function
34Database Management Systems and Their Implementation, Xu Lizhen
Bitmap index (used in data warehouse)
Others
Bitmap index index itself is data
D at e S to r e S ta te C la ss S al es
3/1/96 32 NY A 6
3/1/96 36 MA A 9
3/1/96 38 NY B 5
3/1/96 41 CT A 11
3/1/96 43 NY A 9
3/1/96 46 RI B 3
Bitmap Index for Sales
8bit 4bit 2bit 1bit
0 1 1 0
1 0 0 1
0 1 0 1
1 0 1 1
1 0 0 1
0 0 1 1
Bitmap Index for State
AK AR CA CO CT MA NY RI
0 0 0 0 0 0 1 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 1
35Database Management Systems and Their Implementation, Xu Lizhen
3/1/96 4 7 CT B 7
3/1/96 49 NY A 12
1 1 0 0
for Class
A B C
1 0 0
1 0 0
0 1 0
1 0 0
1 0 0
0 1 0
0 1 0
1 0 0
0 0 0 0 0 0 1 0
Total sales = ? (4*8+4*4+4*2+6*1=62)How manyclassA store inNY ? (3)Salesof class A store in NY = ? (2*8+2*4+1*2+1*1=27)How manystores inCT ? (2)Join operation (query product list of class A store in NY)
Access Primitives (examples)
int dbopendb(char * dbname)
Function: open a database.
int dbclosedb(unsigned dbid)
Function: close a database.
int dbTableInfo(unsigned rid, TableInfo * tinfo)
36Database Management Systems and Their Implementation, Xu Lizhen
unc on: ge e n orma on o e a e re erence y r .
int dbopen(char * tname,int mode, int flag)
Function: open the table tname and assign a rid for it.
int dbclose(unsigned rid)Function: close the table referenced by rid and release the rid.
int dbrename(oldname, newname)
Function: rename the table.
-
7/28/2019 DBMS and Implementation All(Full)
13/69
13
Access Primitives (examples)
int dbcreateattr (unsigned rid , sstree * attrlist)
Function: create some attributes in the table referenced by rid.
int dbupdateattrbyidx(unsigned rid, int nth, sstree attrinfo)
Function: update the definition of the nth attribute in the tablereferenced by rid.
*
37Database Management Systems and Their Implementation, Xu Lizhen
, ,attrinfo)
Function: update the definition of attribute attrname in the tablereferenced by rid.
int dbinsert(unsigned rid, char * tuple, int length, int flag)
Function: insert a tuple into the the table referenced by rid.
Access Primitives (examples)
int dbdelete(unsigned rid, long offset, int flag)
Function: delete the tuple specified by offset in the tablereferenced by rid.
int dbupdate(unsigned rid, long offset, char * newtuple, int flag)
Function: update the tuple specified by offset in the tablereferenced b rid with newtu le.
38Database Management Systems and Their Implementation, Xu Lizhen
.
int dbgetrecord(unsigned rid, int nth, char* buf)
Function: fetch out the nth tuple from the table referenced by rid
and put it into buffer buf. int dbopenidx(unsigned rid, indexattrstruct * attrarray, int flag)
Function: open the index of the table referenced by rid andassign a iid for it.
Access Primitives (examples)
int dbcloseidx(unsigned iid)
Function: close the index referenced by iid.
int dbfetch(unsigned rid, char * buf, long offset)
Function: fetch out the tuple specified by offset from the tablereferenced by rid and put it into buffer buf.
* *
39Database Management Systems and Their Implementation, Xu Lizhen
, , ,
Function: fetch out the TIDs of tuples whose value on indexedattribute has the flag relation withpvalue, and put them intooffsetbuf. iid is the reference of the index used.
int dbpack(unsigned rid)
Function: re-organize the relation, delete the tuples havingdeleted flag physically.
-
7/28/2019 DBMS and Implementation All(Full)
14/69
14
4. Data Distribution
Database Management Systemsand Their Implementation, Xu
Lizhen
4.1 Strategies of Data Distribution
(1) Centralized: distributed system, but the dataare still stored centralized. It is simplest, butthere is not any advantage of DDB.
(2) Partitioned: data are distributed withoutre etition. no co ies
41Database Management Systems and Their Implementation, Xu Lizhen
(3) Replicated: a complete copy of DB at eachsite. Good for retrieval-intensive system.
(4) Hybrid (mix of the above): an arbitraryfraction of DB at various sites. The mostflexible and complex distributing method.
Comparison of four strategies
1 2 3 4
flexibility
42Database Management Systems and Their Implementation, Xu Lizhen
complexity
Advantage of DDBS
Problems with DDBS
-
7/28/2019 DBMS and Implementation All(Full)
15/69
15
4.2 Unit of Data Distribution
(1) According to relation(or file), that meansnon partition
(2) According to fragments
43Database Management Systems and Their Implementation, Xu Lizhen
Horizontal fragmentation: tuple partition
Vertical fragmentation: attribute partition
Mixed fragmentation: both
The criteria of fragmentation:
(1) Completeness: every tuple or attribute musthas its reflection in some fragments.
(2) Reconstruction: should be able to
44Database Management Systems and Their Implementation, Xu Lizhen
reconstruct t e origina g o a re ation.
(3) Disjointness: for horizontal fragmentation.
(1) Horizontal Fragmentation
Defined by selection operation with predicate, andreconstructed by union operation.
Fragmentation Methods
Rn fragments (use P1, , P2. . .Pn)SELECT *
45Database Management Systems and Their Implementation, Xu Lizhen
Fulfill: PiPj=false (ij)P1P2. . .Pn=true
FROM R
WHERE P ;
Derived Fragmentation: relation is fragmented notaccording to itselfs attribute, but to another relationsfragmentation.
-
7/28/2019 DBMS and Implementation All(Full)
16/69
16
An example of Derived Fragmentation
TEACHER(TNAME, DEPT)
COURSE(CNAME,TNAME)
Suppose TEACHER has been fragmented according toDEPT, we want to fragment COURSE even if there is noDEPT attribute in it. This will be the fragmentation
46Database Management Systems and Their Implementation, Xu Lizhen
derived from TEACHERs fragmentation.
Semi join : R S = R (R S)
TEACHER9 = SELECT * FROM TEACHER
WHERE DEPT=9th;
COURSE9=COURSE TEACHER9
(2) Vertical Fragmentation
Defined by project operation, and reconstructed byjoin operation. Note:
Completeness: each attribute should appear in atleast one fragment.
Reconstruction: should fulfill the condition of
47Database Management Systems and Their Implementation, Xu Lizhen
lossless join decomposition when fragmentizing.
a. Include a key of original relation in every fragment.
b. Include a TID of original relation produced bysystem in every fragment.
(3) Mixed Fragmentation
Apply fragmentation operations recursively.
Can be showed with a fragmentation tree:
R Global relation
48Database Management Systems and Their Implementation, Xu Lizhen
R13
R2R1
R12R11 Fragments
V
H
-
7/28/2019 DBMS and Implementation All(Full)
17/69
17
4.3 Different Transparency Level
We can simplify a complex problem throughinformation hiding method
Level 1: Fragmentation Transparency
User only need to know global relations, he
49Database Management Systems and Their Implementation, Xu Lizhen
dont have to know if they are fragmentizedand how they are distributed. In thissituation, user can not feel the distribution ofdata, as if he is using a centralized database.
Level 2: Location Transparency
User need to know how the relations arefragmentized, but he dont have to worry thestore location of each fragment.
50Database Management Systems and Their Implementation, Xu Lizhen
User need to know how the relations arefragmentized and how they are distributed,but he dont have to worry every localdatabase managed by what kind of DBMS,using what DML, etc.
Level 4No Transparency
4.4 Problems Caused by Data Distribution
1) Multi copies consistency2) Distribution consistency
Mainly the change of tuples store locationresulted by updating operation. Solution:
(1) Redistribution
51Database Management Systems and Their Implementation, Xu Lizhen
After Update: SelectMoveInsertDelete(2) PiggybackingCheck tuple immediately while updating, if
there is any inconsistency it is sent back alongwith ACK information and then sent to the rightplace.
-
7/28/2019 DBMS and Implementation All(Full)
18/69
18
3) Translation of Global Queries to FragmentQueries and Selection of Physical Copies.
4) Design of Database Fragments and Allocation
52Database Management Systems and Their Implementation, Xu Lizhen
o ragments.
Above 1)~3) should be solved in DDBMS.
While 4) is a problem of distributed databasedesign.
4.5 Distributed Database Design
1) Distributed database
The design of fragments
The design of fragment distribution solution
To understand users requirements, we
53Database Management Systems and Their Implementation, Xu Lizhen
should ask the following questions to everyapplication while requirement analysis:
The sites where this application may occur
The frequency of this application
The data object accessed by this application
Design flow of centralized database
Requirement Analysis
Concept Design
Informationrequirement
Processrequirement
Requirementindication
DBMSfeature
A
54Database Management Systems and Their Implementation, Xu Lizhen
Logic Design
Physical Design
Hardware, OS feature
DBMS independentdata schema
Outer schema,concept schema
Inner schema
-
7/28/2019 DBMS and Implementation All(Full)
19/69
19
Design flow of distributed database
A
Fragmentation Design
Information
requirement
Process
requirement
Concept schema DDBMSfeature
B
55Database Management Systems and Their Implementation, Xu Lizhen
Distribution Design
Physical design of each site
Hardware, OS feature of each site
Distribution units
Local schema ofevery site
Inner schema
(1) Fragmentation design
In DDB, it is not true that the fragments should bedivided as fine as possible. It should be fragmentizedaccording to the requirement of application. Forexample, there are following two applications:
App1: SELECT GRADE FROM STUDENT
= th
56Database Management Systems and Their Implementation, Xu Lizhen
App2: SELECT AVG(GRADE)
FROM STUDENT
WHERE SEX=Male;
Problem:
if STUDENT should be fragmentized horizontallyaccording to DEPT?
General rules:
Select some important typical applications whichoccur often.
Analyze the local feature of the data accessed bythese applications.
For horizontal fragmentation:
57Database Management Systems and Their Implementation, Xu Lizhen
Select suitable predicate to fragmentize the globalrelations to fit the local requirement of each site. Ifthere is any contradiction, consider the need of moreimportant application.
Analyze the join operations in applications to decideif derived fragmentation is needed.
-
7/28/2019 DBMS and Implementation All(Full)
20/69
20
For vertical fragmentation:
Analyze the affinity between attributes, andconsider:
Save storage space and I/O cost
58Database Management Systems and Their Implementation, Xu Lizhen
.by some users.
(2) Distribution design
Through cost estimation, decide the suitablestore location (site) of each distribution unit.p252
2) Parallel database
What is parallel database system?
Share Noting (SN) structure
Vertical parallel and horizontal parallel
A complex query can be decomposed into severaloperation steps, the parallel process of these steps is
59Database Management Systems and Their Implementation, Xu Lizhen
ca e ver ca para e .
For the scan operation, if the relation to be scanned isfragmentized beforehand into several fragments, andstored on different disks of a SN structured parallelcomputer, then the scan can be processed on thesedisk in parallel. This kind of parallel is calledhorizontal parallel.
Example:
SELECT *
FROM R,S
WHERE R.a=S.a ANDR.b>20 AND
R=b>20(R)
S=c
-
7/28/2019 DBMS and Implementation All(Full)
21/69
21
Design flow of parallel database
A
Fragmentation Design
Information
requirement
Process
requirement
Concept schema
PDBMSfeature andthe feature ofparallel computer
system used
61Database Management Systems and Their Implementation, Xu Lizhen
Distribution design of data on different
disks of the parallel computer
Physical design of each disk
Distribution units
Local schema of
every disk
Inner schema
C
Data fragmentation mode in PDB
(1)Arbitrary
Fragmentize relation R in arbitrary mode,then stored these fragments on the disk ofdifferent processor. For example, R may bedivided avera el , or hashed into several
62Database Management Systems and Their Implementation, Xu Lizhen
fragments, etc.
(2)Based on expression
Put the tuples fulfill some condition into afragment. Suitable for the situation in which themost query are based on fragmentationconditions. --- excluded respectively.
The difference between PDB and DDB about datafragmentation and distribution
PDB DDB
The goal offragmentationanddistribution
Promote parallel processdegree, use the parallelcomputers ability asadequately as possible
Promote the local degree ofdata access, reduce the datatransferred on network
63Database Management Systems and Their Implementation, Xu Lizhen
in accordancewith
feature of parallel computersystem used, combiningapplication requirements.
,combining the feature ofDDBMS used.
Distributionmode
On multi disks of a parallelcomputer
On multi sites in thenetwork
-
7/28/2019 DBMS and Implementation All(Full)
22/69
22
Design flow of DDB with PDB as local database
B
If site i is a parallel
computer?
N
64Database Management Systems and Their Implementation, Xu Lizhen
C
Local physical design
of general DDB
Inner schema
Y
3) Federated database
In practical applications, there are strongrequirements for solving the integration of multiexisting, distributed and heterogeneous databases.
The database system in which every member isautonomic and collaborate each other based on
65Database Management Systems and Their Implementation, Xu Lizhen
nego a on --- e era e a a ase sys em.
No global schema in federated database system,
every federated member keeps its own data schema. The members negotiate each other to decide
respective input/output schema, then, the datasharing relations between each other are established.
The schema structure in federated database System
FS1 FS3FS2
IS3IS2IS1
66Database Management Systems and Their Implementation, Xu Lizhen
ES3ES2ES1
CS3CS2CS1
DB1 DB2 DB3
Site1 Site3Site2
-
7/28/2019 DBMS and Implementation All(Full)
23/69
23
FSi
= CSi
+ ISi
FSi is all of the data available for the users on sitei.
ISi is gained through the negotiation with ESj of othersites (ji).
-
67Database Management Systems and Their Implementation, Xu Lizhen
i i i the sub-queries on corresponding ESj.
The results gained from ESj the result forms ofcorresponding ISi, and combined with the results getfrom the sub-queries on CSi, then synthesized to theeventual result form of FS i.
4.6 The Distribution of Catalog
Catalog --- Data about data (Metadata).
Its main function is to transfer the users operatingdemands to the physical targets in system.
4.6.1 Contents of Catalogs
(1) The type of each data object (such as base table,
68Database Management Systems and Their Implementation, Xu Lizhen
v ew, . . . an s sc ema
(2) Distribution information (such as fragmentlocation, . . .)
(3) Access routing (such as index, . . .)
(4) Grant information
(5) Some statistic information used in queryoptimization
(1)~(4) are not changed frequently, while (5) willchange on every update operation. For it:
Update it periodically
Update it after every update operation
4.6.2 The features of catalog
69Database Management Systems and Their Implementation, Xu Lizhen
(1) mainly read operation on it
(2) very important to the efficiency and the datadistribution transparency of the system
(3) very important to site autonomy
(4) the distribution of catalog is decided by thearchitecture of DDBMS, not by applicationrequirements
-
7/28/2019 DBMS and Implementation All(Full)
24/69
24
4.6.3 Distribution strategies of catalog
(1)Centralized
A complete catalog stored at one site.
Extended centralized catalog: centralized atfirst; saved after being used; notify while
70Database Management Systems and Their Implementation, Xu Lizhen
(2) Fully replicated: catalogs are replicated ateach site. Simple in retrieval. Complex inupdate. Poor in autonomy.
(3) Local catalog: catalogs for local data arestored at each site. That means catalogs arestored along with data.
If want the catalog information about data on other site(look for through broadcast):
Master catalog: store a complete catalog on some site.Make every catalog information has two copies.
Cache: after getting and using the catalog informationabout data on other site through broadcast, save it forfuture use (cache it). Update the catalog cached
71Database Management Systems and Their Implementation, Xu Lizhen
roug e compar son o vers on num er.
R
Rs catalog(Ver No.) Cache of Rs catalog
(Ver No.)
(4) Different Combination of the above
Use different strategy to different contents ofcatalog, and then get different combinationstrategies. Such as:
a. Use full re licated strate to distribution
72Database Management Systems and Their Implementation, Xu Lizhen
information (2), use local catalog strategy toother parts.
b. Use local catalog strategy to statisticinformation, use fully replicated strategy toother parts.
-
7/28/2019 DBMS and Implementation All(Full)
25/69
25
4.6.4 Catalog management in R*--- Site autonomy
Characteristics:
There is no global catalog
Independent naming and data definition
The catalog grows reposefully
---
73Database Management Systems and Their Implementation, Xu Lizhen
---(SWN)
::[email protected]@BirthSite
ObjectName: the name given by user for the dataobject
User: users name. With this, different users canaccess different data object using the same name.
UserSite: the ID of the site where the User is. Withthis, different users on different sites can use the sameuser name.
BirthSite: the birth site of the data object. There is noglobal catalog in R* system. At the BirthSite the
74Database Management Systems and Their Implementation, Xu Lizhen
information about the data is always kept even thedata is migrated to other site.
Print Name (PN): user used normally when theyaccess a data object.
::=[User[@UserSite].]ObjectName[@BirthSite]
Name Resolution --- Mapping PN to SWN
Establish a synonym table for eachuser using Define Synonym ...statement.
ObjectName SWN
Mapping PN in different forms according tofollowing rules:
75Database Management Systems and Their Implementation, Xu Lizhen
1) PrintName = SWN, need not transform
2) Only have ObjectName: search ObjectName inthe synonym table of current user on current site.
3) User.ObjectName: search the synonym table ofuser User on current site.
-
7/28/2019 DBMS and Implementation All(Full)
26/69
26
5) ObjectName@BirthSite
If no match for the ObjectName is found in (2), (3) orprint name is in the form of (4) or (5), namecompletion is used.
Name Resolution --- Mapping PN to SWN
76Database Management Systems and Their Implementation, Xu Lizhen
A missing User is replaced by current User.
A missing UserSite or BirthSite is replaced by currentsite ID.
5. Query Optimization
Database Management Systemsand Their Implementation, Xu
Lizhen
Introduction
Rewrite the query statementssubmitted by user first, and thendecide the most effective operatingmethod and steps to get the result.
78Database Management Systems and Their Implementation, Xu Lizhen
The goal is to gain the result of usersquery with the lowest cost and inshortest time.
-
7/28/2019 DBMS and Implementation All(Full)
27/69
27
5.1 Summary of Query Optimization inDDBMS
Global Query: a query over globalrelation.
Fragment Query: a query over
79
ragments.
Database Management Systems and Their Implementation, Xu Lizhen
General flow
Global Query
Transfer it to fragment queries
1) Transform the queries tree into the most effective form
2) Query decomposition (into several sub queries which can be
Query tree
GlobalOptimiz
Algebr a optimi zation
80Database Management Systems and Their Implementation, Xu Lizhen
executed locally)3) Decide the order and site of the operations
Execute each operation according to the schedule in query plan
Query result
Query plan
tion
LocalOptimization
Operation optimization
Example
S(SNUM, SNAME, CITY)
SP(SNUM, PNUM, QUAN)
P(PNUM, PNAME, WEIGHT, SIZE)
Suppose the fragmentation is as following:
81Database Management Systems and Their Implementation, Xu Lizhen
S1 = CITY=Nanjing(S)S2 = CITY=Shanghai(S)
SP1 = SP S1
SP2 = SP S2
-
7/28/2019 DBMS and Implementation All(Full)
28/69
28
Global Query:
SELECT SNAME
FROM S, SP, P
WHERE S.SNUM=SP.SNUM AND
SP.PNUM=P.PNUM AND
82Database Management Systems and Their Implementation, Xu Lizhen
S.CITY=Nanjing AND
P.PNAME=Bolt AND
SP.QUAN>1000;
Query tree
(S.SNAME)
3 P.PNAME=Bolt
SP.PNUM=P.PNUM
S.SNUM=SP.SNUM
83Database Management Systems and Their Implementation, Xu Lizhen
S1 S2
1 CITY=Nanjing 2 QUAN>1000 P
S SP
SP1 SP2
After equivalent transform (Algebra optimization) :
(S.SNAME)
3
SP.PNUM=P.PNUM
S.SNUM=SP.SNUM3
84Database Management Systems and Their Implementation, Xu Lizhen
S1 S2
P
SP1 SP2
11 222211
1= (S.SNUM,S.SNAME)
2= (SP.SNUM,SP.PNUM)
3= (P.PNUM)S1 S2 SP1 SP2
P
-
7/28/2019 DBMS and Implementation All(Full)
29/69
29
The result of equivalent transform
(S.SNAME)
P
SP.PNUM=P.PNUM
S.SNUM=SP.SNUM
85Database Management Systems and Their Implementation, Xu Lizhen
S1 S2
SP1 SP2
The operation optimization of the sub tree (yellow) :
First consider distributed JN: (1) (S1 S2) (SP1 SP2)
(2) Distributed Join
Then consider Site Selection, may produce many combination
For every join operation, there are many computing method:
R Site , R S
86Database Management Systems and Their Implementation, Xu Lizhen
SRSite jSite i
S Site i, R S
JN Attr.(S) Site i, R S Site j, (R S) S
The goal of query optimization is to select a good solution fromso many possible execution strategies. So it is a complex task.
5.2 The Equivalent Transform of a Query
That is so called algebra optimization. It takes a series oftransform on original query expression, and transform itinto an equivalent, most effective form to be executed.For example: NAME,DEPTDEPT=15(EMP)DEPT =15NAME,DEPT (EMP)
(1)Query tree
87Database Management Systems and Their Implementation, Xu Lizhen
For example: SNUMAREA=NORTH(SUPPLYDEPTNUM DEPT)
(SNUM)
AREA = North
DEPT
DEPTNUM=DEPTNUM
Supply
Leaves: global relationMiddle nodes: unary/binaryoperationsLeaves root: the executingorder of operations
-
7/28/2019 DBMS and Implementation All(Full)
30/69
30
(2) The equivalent transform rules of relational algebra
1) Exchange rule of /: E1 E2 E2 E12) Combination rule of /: E1(E2E3) (E1E2)E33) Cluster rule of : A1An(B1Bm(E)) A1An(E),
legal when A1An is the sub set of {B1Bm}
88Database Management Systems and Their Implementation, Xu Lizhen
5) Exchange rule of and : F(A1An(E)) A1An (F(E))if F includes attributes B1Bm which dont belong toA1An, then A1An (F(E)) A1AnF(A1AnB1Bm(E))
6) If the attributes in F are all the attributes in E1, then
F(E1E2) F(E1)E2
if F in the form of F1F2, and there are only E1sattributes in F1, and there are only E2s attributes inF2, then F(E1E2) F1(E1)F2(E2)if F in the form of F1F2, and there are only E1sattributes in F1, while F2 includes the attributes both
89Database Management Systems and Their Implementation, Xu Lizhen
, F F2 F17) F(E1 E2) F(E1) F(E2)
8) F(E1 - E2)F(E1) - F(E2)9) Suppose A1An is a set of attributes, in which
B1Bm are E1s attributes, and C1Ck are E2sattributes, then
A1An(E1E2) B1Bm(E1)C1Ck(E2)
10) A1An(E1 E2) A1An(E1) A1An(E2)From the above we can see, the goal of algebraoptimization is to simplify the execution of the query,and the target is to make the scale of the operandswhich involved in binary operations be as small aspossible.
90Database Management Systems and Their Implementation, Xu Lizhen
(SNUM)
(DEPTNUM)
DEPTNUM=DEPTNUM
(SNUM, DEPTNUM)
Supply DEPT
AREA = North
(3) The generalprocedure of algebraoptimization pleaserefer to p118.
-
7/28/2019 DBMS and Implementation All(Full)
31/69
31
5.3 Transform Global Queries intoFragment Queries
Methods:
For horizontal fragmentation: R = R1 R2 Rn For vertical fragmentation: S = S1 S2 Sn
Replace the global relation in query expression with the above.The expression we get is called canonical expression
Transform the canonical ex ression with the e uivalent
91Database Management Systems and Their Implementation, Xu Lizhen
transform rules introduced above. Principles:
1) Push down the unary operations as low as possible
2) Look for and combine the common sub-expression
Definition: the sub-expression which occurs more than once inthe same query expression. If find this kind of sub-expressionand compute it only once, it will promote query efficiency.
General method:
(1) combine the same leaves in the query tree
(2) combine the middle nodes corresponding to thesame operation with the same operands.
example: emp.name(emp (mgrnum=373 dept)
92Database Management Systems and Their Implementation, Xu Lizhen
sal>35000 mgrnum=373
emp.name
deptnum = deptnum
dept
-
mgrnum=373
empdept
mgrnum=373sal>35000
emp
combinecombine
move up
combine
combine
emp.name
deptnum
-
sal>35000
Properties: R R R R R R RR R F(R) F(R)
93Database Management Systems and Their Implementation, Xu Lizhen
dept
mgrnum=373
emp
Common sub-expression:emp (mgrnum=373 dept)
F R F(R) not F R F1(R) F2(R) F1F2(R) F1(R) F2(R) F1F2(R) F1(R)F2(R) F1not F2(R)
-
7/28/2019 DBMS and Implementation All(Full)
32/69
32
emp.name
deptnum
sal
-
7/28/2019 DBMS and Implementation All(Full)
33/69
33
Decomposition method :
Traverse the query tree in post-order, until j become 2,
then get the first sub-tree. The rest may be deduced byanalogy, so we can get all of the sub-trees.
R1 R2 R3
97Database Management Systems and Their Implementation, Xu Lizhen
R41
R11 R2
1
R31
21 434321
R5
55
R6
66
R1
R2 R3
Site1 Site3Site2 Assembling tree
5.5 The Optimization of BinaryOperations
The executions of local sub-queries are responsible bylocal DBMS. The query optimization of DDBMS isresponsible for the global optimization, that is theexecution of assembling tree.
Because the executions of unary operations are
98Database Management Systems and Their Implementation, Xu Lizhen
respons e y oca a er a ge ra op m za onand query decomposition, the global optimization ofDDBMS only need to consider the binary operations,mainly the join operation.
How to find a good access strategy to compute thequery improved by algebra optimization is introducedin this section.
I. Main problems in global optimization
1) Materialization: select suitable copies offragments which involved in the query
2) Strategies for join operation
Non distributed oin:
99Database Management Systems and Their Implementation, Xu Lizhen
_(R2 U R3) R1
Distributed join:(R1 R2) (R1R3)
irect join
Use of semi_join
3) Select execution of each operation (mainly todirect join)
-
7/28/2019 DBMS and Implementation All(Full)
34/69
34
Nested loop: one relation acts as outer loop relation
(O), the other acts as inner loop relation (I). Forevery tuple in O, scan I one time to check joincondition.
Because the relation is accessed from disk in the
100Database Management Systems and Their Implementation, Xu Lizhen
un o oc , we can use oc u er o mproveefficiency. For R S, if let R as O, S as I, bR isphysical block number of R, bS is physical blocknumber of S, there are nB block buffers in system(nB>=2), and nB-1 buffers used for O, one bufferused for I, then the total disk access times neededto compute R S is:
bRbR/(nB-1)
bS
Merge scan: order the relation R and S on disk inahead, then we can compare their tuples in order,and both relation only need to scan one time. If Rand S have not ordered in ahead, must consider theordering cost to see if it is worth to use this method(p122)
101Database Management Systems and Their Implementation, Xu Lizhen
Using in ex or as to oo or mapping tup es: innested loop method, if there is suitable access routeon I (say B+ tree index), it can be used to substitutesequence scan. It is best when there is cluster index orhash on join attributes.
Hash join: because the join attributes of R and S havethe same domain, R and S can be hashed into thesame hash file using the same hash function, then R S can be computed based on the hash file.
The above are all classical algorithms in centralizedDBMS. The idea is similar when computing join inDDBMS using direct join strategy.
II. Three general optimization methods:1) By cost comparison (also called exhaustive search)2 B heuristic rule: enerall used in small s stems.
102Database Management Systems and Their Implementation, Xu Lizhen
(p124)3) Combination of 1, 2: eliminate the solutions
unsuitable obviously by using 2 to reduce solutionspace, then use 1 to compare cost of the restsolutions carefully.
III. Cost estimationTotal query cost=Processing cost+Transmission cost
-
7/28/2019 DBMS and Implementation All(Full)
35/69
35
According to different environment:
For wide area network: the transfer rate is about100bps~50Kbps, far slow than processing speed incomputer, so Processing cost can be omitted.
For local area network: the transfer rate will reach
103Database Management Systems and Their Implementation, Xu Lizhen
1000Mbps, both items should be considered.
1) Transmission cost
TC(x)=C0 + C1x
x: the amount of data transferred; C0: cost ofinitialization; C1: cost of transferring one data unit onnetwork. C0 , C1 rely on the features of the networkused.
2)Processing cost
Processing cost = costcpu + costI/Ocostcpu can be omitted generally.
cost of one I/O = D0 + D1D0: the average time looking for track (ms);
D : time of one data unit I/O s, can be omitted
104Database Management Systems and Their Implementation, Xu Lizhen
costI/O = no. of I/OD0
Notice: calculate query cost accurately is unnecessaryand unpractical. The goal is to find a good solutionthrough the comparison between different solutions,so only need to estimate the execution cost ofdifferent solutions under the same executionenvironment.
5.6 Implement Join Operation WithSemi_join
I. The role of semi_join
Semi_join is used to reduce transmission cost. So it issuitable for WAN only.
R S = R(R S)
if R and S are stored on site 1 and 2 respectively, the
105Database Management Systems and Their Implementation, Xu Lizhen
steps to realize R S with is as following:
1) TransferA(S) site1, A is join attribute
2) Execute R A(S)=R S on site1 (compress R)
3) Transfer R S site2
4) Execute (R S) S = R S on site2
-
7/28/2019 DBMS and Implementation All(Full)
36/69
36
Cost comparison
Cost of direct join = C0 + C1min(r, s) ------
r, s --- |R|, |S| (size of the relations) Cost of join via semi_join =
min(2C0 + C1s + C1r, 2C0 + C1r + C1s) =
2C0+C1min(s + r, r + s) ------
106Database Management Systems and Their Implementation, Xu Lizhen
s, r --- |A(S)|, |A(R)|
s, r --- |S R|, |R S|
Only when
-
7/28/2019 DBMS and Implementation All(Full)
37/69
37
Query graph:
Link the two relations with a line if there is
between them, then we can get thequery graph. The query whose querygraph like the left graph is called treequery (TQ)
R1
R2 R3
R5 R6R4
= = = =
109Database Management Systems and Their Implementation, Xu Lizhen
1. 2. 2. 3. 3. 1.
R1
R2 R3
The query whose query graph like the leftgraph is called cyclic query (CQ)
Example 3: q = (R1.A=R2.B)(R2.B=R3.C)(R3.C=R1.A)
This is a TQ, not a CQ, because R3.C=R1.A can be obtainedfrom transfer relation, it is not a independent condition.
Can be proved:
1) Full reducer exists for TQ.
2) No full reducer exists for CQ in many cases.
R1 A B
0 1
R2 C D
1 2
R3 E F
2 3
110Database Management Systems and Their Implementation, Xu Lizhen
q = (R1.B=R2.C)(R2.D=R3.E)(R3.F=R1.A)Even if the result of this query is empty, the size of anyone of R1, R2 and R3 can not be decreased through . Sothere is not full reducer for this query.
Another example about full reducer of CQ
R A B
1 a
2 b
3 c
S B C
a x
b y
c z
T C A
x 2
y 3
z 4
111Database Management Systems and Their Implementation, Xu Lizhen
q = . = . . = . . = . , s ere u re ucer
R=R T=A B
2 b
3 c
S=S R=B C
b y
c z
T=T S=C A
y 3
z 4
R=R T=A B
3 cS=S R= T=T S=
B C
c z
C A
z 4
-
7/28/2019 DBMS and Implementation All(Full)
38/69
38
R=R T=, So the full reducer of relation R inquery q is :
R=R T S=S R T=T S R=R T
S=S R T=T S R=R T= Conclusion:
For CQ: there is no full reducer in general. Even ifthere is, its length increases linearly along with the
112Database Management Systems and Their Implementation, Xu Lizhen
number of tuples of some relation in the query. (about3(m-1))
For TQ: the length of full reducer < n-1 (n is thenumber of nodes in the query graph).
So the full reducer can not be the optimization targetwhen execute join operation under distributedenvironment through .
SDD-1 Algorithm: heuristic rule (p189~190)
5.7 Direct Join
--- Implementation method of join operation in R*
I. Two basic implementation method of join operation
113Database Management Systems and Their Implementation, Xu Lizhen
algorithm in centralized DBMS
cost = (bR
+bR/(n
B- 1)* b
S) * D
0
Merge Scan: the extension of correspondingalgorithm in centralized DBMS
cost = (bR + bS) * D0 + Costsort(R) + Costsort(S)
II. Transmission of relations in these two methods
1) shipped whole : The whole relation is shippedwithout selections.
For I, a temporary relation is established at thedestination site for further use.
For O, the relation doesnt need storing.
114Database Management Systems and Their Implementation, Xu Lizhen
2) Fetch as need : The whole relation is not shipped.The tuples needed by the remote site are sent at itsrequest. The request message usually contains the valueof join attribute.
Usually there is an index on the join attribute at therequested site.
-
7/28/2019 DBMS and Implementation All(Full)
39/69
39
III. Six implementation strategies of join in R*
Nested Loop O: shipped whole
I : fetch as neededMerge Scan O: shipped whole
I : shipped whole
fetch as needed
115Database Management Systems and Their Implementation, Xu Lizhen
Shipping whole O and I to a 3rd site (NL or MS)
It is obvious that O should be shipped whole.
In NL, if I is shipped whole, index cant be shippedalong with it, moreover temporary relation isrequired. Both processing cost and storage cost arehigh.
Six strategies dont include:
Multiple join --- transformed into multibinary joins.
Copy selection --- because R* doesntsupport multi copies.
116Database Management Systems and Their Implementation, Xu Lizhen
5.8Distributed Grouping & Aggregate FunctionEvaluation
SELECT PNUM, SUM(QUAN)
FROM SP
GROUP BY PNUM;
That is : GBPNUM, SUM(QUAN)SP
There are the following conclusions about grouping &aggregate function evaluation in distributed computingenvironment :
1) Suppose Gi is a group gotten through grouping to R1 R2according to some attribute set, iff G iRj OR GiRj = for all i, j------(SNC), then :
GBG, AF(R1 R2) = (GBG, AFR1) (GBG, AFR2)
For example:
117Database Management Systems and Their Implementation, Xu Lizhen
, ;
If SP is derived fragmented according to the suppliers city:conform to SNC, so the grouping & aggregate can be evaluateddistributed.
If SP is derived fragmented according to the parts type: dontconform to SNC, because the same supplier may provide morethan one kinds of part at same time. In this situation thegrouping & aggregate can not be evaluated distributed.
-
7/28/2019 DBMS and Implementation All(Full)
40/69
40
2) If SNC does not hold, it is still possible to computesome aggregate functions of global relationdistributed
Suppose global relation: S
fragments: S1, S2, , Snthen:
SUM S = SUM SUM S SUM S SUM S
118Database Management Systems and Their Implementation, Xu Lizhen
COUNT(S) = SUM(COUNT(S1), COUNT(Sn))
AVG(S) = SUM(S)/COUNT(S)
MIN(S) = MIN(MIN(S1), MIN(S2), , MIN(Sn))
MAX(S) = MAX(MAX(S1), MAX(S2), , MAX(Sn))
5.9 Update Strategies
The consistency between multi copies must beconsidered while executing update, because anydata may have multi copies in DDB.
1) Updating all strategy
119Database Management Systems and Their Implementation, Xu Lizhen
unavailable.
p --- probability of availability of a copy.n --- No. of copies
The probability of success of the update=pn
0lim
n
n
p
2) Updating all available sites immediately andkeeping the update data at spooling site forunavailable sites, which are applied to that siteas soon as it is up again.
3) Primary copy updating strategy
120Database Management Systems and Their Implementation, Xu Lizhen
ss gn a copy as pr mary copy. e rema n ngcopies called secondary copies.Update : update P.C, then P.C broadcast theupdate to S.Cs at sometimes.P.C maybe inconsistent with S.C temporarily.There is no problem if the next operation is stillupdate. While if the next operation is a read tosome S.C, then:
-
7/28/2019 DBMS and Implementation All(Full)
41/69
41
Compare the version No. of S.C with that of
P.C, if version No. are equal, read S.C directly;else:
(1) redirect the read to P.C
(2) wait the update of S.C
121Database Management Systems and Their Implementation, Xu Lizhen
4) Snapshot
Snapshot is a kind of copy image notfollowed the changes in DB.
Master copy at one site, many snapshots aredistributed at other sites.
Update: master copy only.
Read: master copy
snapshotsis indicated by users
The snapshot can be refreshed:
(1) periodically
122Database Management Systems and Their Implementation, Xu Lizhen
(2) forced refreshing by REFRESH command
Snapshot is suitable for the application
systems in which there is less update, such ascensus system, etc.
6. Recovery
Database Management Systemsand Their Implementation, Xu
Lizhen
-
7/28/2019 DBMS and Implementation All(Full)
42/69
42
6.1 Introduction
The main roles of recovery mechanism in DBMS are:
(1) Reducing the likelihood of failures (prevention)
(2) Recover from failures (solving)
Restore DB to a consistent state after some failures.
124Database Management Systems and Their Implementation, Xu Lizhen
Redundancy is necessary.
Should inspect all possible failures.
General method:
1) Periodical dumping
Variation : Backup + Incremental dumping
I.D --- updated parts of DB
tdumping dumping failure
Update lost
125Database Management Systems and Their Implementation, Xu Lizhen
tBackup I.D failureI.D
Update lost
This method is easy to be implemented and theoverhead is low, but the update maybe lost after failureoccurring. So it is often used in file system or smallDBMS.
2) Backup + Log
Log : record of all changes on DB since the lastbackup copy was made.
Change:Old value (before image --- B.I)New value (after image --- A.I)
Recordedinto Log
126Database Management Systems and Their Implementation, Xu Lizhen
or up ate op. : . .
insert op. : ---- A.I
delete op. : B.I ----
tBackup failure
Log
-
7/28/2019 DBMS and Implementation All(Full)
43/69
43
While recovering:
Some transactions maybe half done, shouldundo them with B.I recorded in Log.
Some transactions have finished but the
127Database Management Systems and Their Implementation, Xu Lizhen
results have not been written into DB in time,should redo them with A.I recorded in Log.(finish writing into DB)
It is possible to recover DB to the most recentconsistent state with Log.
3) Multiple Copies
Advantages:
1 increase reliabilit
There are multi copies for every data object. Recoveredwith other copies while failure occurs. The systemcannot be collapsed because of some copys failure.
128Database Management Systems and Their Implementation, Xu Lizhen
(2) recovery is very easy
Problems:
(1) difficult to acquire independent failure modes incentralized database systems.
(2) waste in storage space
So this method is not suitable for Centralized DBMS.
6.2 Transaction
A transaction T is a finite sequence of actions onDB exhibiting the following effects:
Atomic action: Nothing or All.
Consistency preservation: consistency state of
129Database Management Systems and Their Implementation, Xu Lizhen
ano er cons s ency s a e o .
Isolation: concurrent transactions should runas if they are independent each other.
Durability:The effects of a successfullycompleted transaction are permanentlyreflected in DB and recoverable even failureoccurs later.
-
7/28/2019 DBMS and Implementation All(Full)
44/69
44
Example: transfer money s from account A to account B
Begin transaction
read A
A:=A-s
if A
-
7/28/2019 DBMS and Implementation All(Full)
45/69
45
6.4 Commit Rule and Log Ahead Rule
6.4.1 Commit Rule
A.I must be written to nonvolatile storage beforecommit of the transaction.
6.4.2 Log Ahead Rule
If A.I is written to DB before commit then B.I must
133Database Management Systems and Their Implementation, Xu Lizhen
first written to log.
6.4.3 Recovery strategies
(1)The features of undo and redo (are idempotent) :
undo(undo(undo undo(x) )) = undo(x)
redo(redo(redo redo(x) )) = redo(x)
(2) Three kinds of update strategy
a) A.IDB before commit
TIDactive list
B.ILog (Log Ahead Rule)
A.IDB
134Database Management Systems and Their Implementation, Xu Lizhen
TIDcommit listdelete TID from active list
commit
The recovery after failure in this situation
Check two lists for every TID while restartingafter failure:
Commitlist
Activelist
135Database Management Systems and Their Implementation, Xu Lizhen
undo, delete TID fromactive list
delete TID from active list
nothing to do
-
7/28/2019 DBMS and Implementation All(Full)
46/69
46
b) A.IDB after commit
TIDactive list
A.ILog (Commit Rule)
136Database Management Systems and Their Implementation, Xu Lizhen
TIDcommit list
A.IDB
delete TID from active list
commit
The recovery after failure in this situation
Check two lists for every TID while restartingafter failure:
Commitlist
Activelist
137Database Management Systems and Their Implementation, Xu Lizhen
delete TID from active list
redo, delete TID fromactive list
nothing to do
c) A.IDB concurrently with commit
TIDactive list
A.I, B.ILog (Two Rules)
A.IDB (partially done)
138Database Management Systems and Their Implementation, Xu Lizhen
TIDcommit list
A.IDB (completed)
delete TID from active listcommit
-
7/28/2019 DBMS and Implementation All(Full)
47/69
47
The recovery after failure in this situation
Check two lists for every TID while restartingafter failure:
Commitlist
Activelist
139Database Management Systems and Their Implementation, Xu Lizhen
undo, delete TID fromactive list
redo, delete TID fromactive list
nothing to do
Conclusion :
redo undo
a)
140Database Management Systems and Their Implementation, Xu Lizhen
b)
c)
d)?
6.5 Update Out of Place
Keep two copies for every page of a relation
Keep a page table (PT) for every relation
When updating some page, produce a newpage out of place, change the correspondingpointer in page table while the transaction
141Database Management Systems and Their Implementation, Xu Lizhen
committing, let it point to new page.
new
PT
old
Suppose relation Rhas N pages, then thelength of its PT is N
-
7/28/2019 DBMS and Implementation All(Full)
48/69
48
P141 : lories approach
commit
Master Record (in memory)
kth
Master Record (on disk)
142Database Management Systems and Their Implementation, Xu Lizhen
New
PTji
PT1
Old
PTj PTn
6.6 Recovery Procedures
Failure types :
1) Transaction failure: because of some reason beyondexpectation, the transaction has to be aborted.
2) System failure: the operating system collapse, but theDB on disk is not damaged. Such as power cutsuddenl .
143Database Management Systems and Their Implementation, Xu Lizhen
.
3) Media failure: disk failure, the DB on the disk isdamaged.
Solutions :1) Transaction failure: because it must occur before
committing :
Undo if necessary
Delete TID from active list
2) System failure:
Restore the system
Undo or redo if necessary
3) Media failure:
144Database Management Systems and Their Implementation, Xu Lizhen
Load the latest dump
Redo according the log
-
7/28/2019 DBMS and Implementation All(Full)
49/69
49
6.7 System Start Up
1) Emergency restart
Start after system or media failure. Recoveryis needed before start.
2) Warm start
145Database Management Systems and Their Implementation, Xu Lizhen
Start after system shutdown. Recovery is notrequired.
3) Cold start
Start the system from scratch. Start after acatastrophic failure or start a new DB.
6.8 Two Phase Commit
The transactions in DDBMS are distributed transactions, the keyof distributed transaction management is how to assure all sub-transactions either commit together or abort together.
Realize the sub-transactions harmony with each other relies oncommunication, while the communication is not reliable.
Two general paradox : No fixed length protocol exists.
146Database Management Systems and Their Implementation, Xu Lizhen
Solution : number the messages.
A BMarch i
OK i
March i+1
OK i+1
If A has not received OK i+11) B had not received March i+12) OK i+1 has been lost
A send March i+1 again; for B :1) If has received, send OK i+1 again
2) or, , send OK i+1
When there are multi generals, select one of them ascoordinator
coordinator
147Database Management Systems and Their Implementation, Xu Lizhen
participantparticipant
-
7/28/2019 DBMS and Implementation All(Full)
50/69
50
Two Phase Commit
coordinato
r
participate
Broadcast prepare
OK / not OK
Commit/Abort
Ack
If the sub-transaction successes andcan commit, it answers OK, or not OK
If all of the sub-transactions are OK,Commit command is sent out, orAbort command is sent out
I
II
148Database Management Systems and Their Implementation, Xu Lizhen
OK --- if success
Commit --- all OK Abort --- any one not OK
not OK --- if fail
Every participant is self-determining before answering OK, it canabort by itself. Once answers OK, it can only wait for the commandcome from the coordinator.
If the coordinator has failure after the participates answer OK, theparticipates have to wait, and is in blocked state. This is thedisadvantage of 2PC.
Three Phase Commit
Broadcast prepare
Ready / Abort
Prepare to commit
OK
If the sub-transaction successes and cancommit, it answers Ready, or Abort
Broadcast to all participants and thisMSG is recorded on the disk
Commit/Abort
I
II
oordinator
participate
149Database Management Systems and Their Implementation, Xu Lizhen
If the coordinator has not any failure, phase II is wasted If the coordinator has failure after the participates answer OK, the
participates communicate each other and check the MSG recordedon disk in phase II, and a new coordinator is elected. If the newcoordinator finds the Prepare to commit MSG on any participate, itsends out Commit command, or sends out Abort command. So theblocked problem can be solved in 3PC.
AckIII
7. Concurrency Control
Database Management Systemsand Their Implementation, Xu
Lizhen
-
7/28/2019 DBMS and Implementation All(Full)
51/69
51
7.1 Introduction
In multi users DBMS, permit multitransaction access the database concurrently.
7.1.1 Why concurrency?
151Database Management Systems and Their Implementation, Xu Lizhen
1) Improving system utilization & response time.
2) Different transaction may access to differentparts of database.
7.1.2 Problems arise from concurrent executions
T1
Read(x)
x:=x+1
Write(x)
T2
Read(x)
x:=2*x
T1
Write(t)
T2
Read(t[x])
Read(t[y])
T1
Read(x)
Read(x)
T2
Write(x)
152Database Management Systems and Their Implementation, Xu Lizhen
r e xt
Lost update
ro ac
Dirty read Unrepeatable read
So there maybe three kinds of conflict when transactions executeconcurrently. They are write write, write read, and read writeconflicts. Write write conflict must be avoided anytime. Write readand read write conflicts should be avoided generally, but they areendurable in some applications.
7.1.3 Serialization --- the criterion forconcurrency consistency
Definition: suppose {T1,T2,Tn} is a set of transactionsexecuting concurrently. If a schedule of {T1,T2,Tn}produces the same effect on database as some serialexecution of this set of transactions, then the schedule isserializable.
153Database Management Systems and Their Implementation, Xu Lizhen
serial execution different result? (yes, n!)
TA
Read R2
TB
Read R1
Write R2
TC
Write R1
The result of this scheduleis the same as serialexecution TA TB TC, soit is serializable. Theequivalent serial executionis TATBTC .
-
7/28/2019 DBMS and Implementation All(Full)
52/69
52
7.1.4 View equivalent and conflict equivalent
Schedules --- sequences that indicate thechronological order in which instructions ofconcurrent transactions are executed
a schedule for a set of transactions must
154Database Management Systems and Their Implementation, Xu Lizhen
consist o a instructions o t osetransactions
must preserve the order in which theinstructions appear in each individualtransaction.
Example Schedules
Let T1 transfer $50 fromA to B, and T2 transfer 10% of the balance fromA to B. The following is a serial schedule, in which T1 is followed by T2.
T1 T2
read(A)A :=A 50write(A)
155Database Management Systems and Their Implementation, Xu Lizhen
rea (B)B := B + 50write(B)
read(A)temp :=A*0.1;A :=A tempwrite(A)read(B)B := B + tempwrite(B)t
Example Schedules
Let T1 and T2 be the transactions defined previously. Thefollowing schedule is not a serial schedule, but it is equivalentto the above.
T1 T2
read(A)A :=A 50write(A)
156Database Management Systems and Their Implementation, Xu Lizhen
read (A)temp := A*0.1
A = A tempwrite(A)
read(B)B := B + 50write(B)
read(B)B := B + tempwrite(B)t
-
7/28/2019 DBMS and Implementation All(Full)
53/69
53
View equivalentand Conflict equivalent
Let S and S be two schedules with the same set of
transactions. S and S are view equivalent if theyproduce the same effect on database based on thesame initial execution condition.
Conflict operations : R-WW-W. The sequence of
157Database Management Systems and Their Implementation, Xu Lizhen
con c opera on w a ec e e ec o execu on.
Non-conflicting operations: R-R Even if thereare write operation, the data items operated aredifferent. Such as Ri(x) and Wj(y).
If a schedule S can be transformed into a schedule Sby a series of swaps of non-conflicting operations, wesay that S and S are conflict equivalent.
Property: if schedule S and S are conflict equivalent, theymust be view equivalent. It is not right contrarily.
Serialization can be divided into view serialization andconflict serialization.
Example 1: for the schedule s of transaction set {T 1,T2,T3}
s = R2 x)W3 x)R1 )W2 ) R1 )R2 x)W2 )W3 x) = s
158Database Management Systems and Their Implementation, Xu Lizhen
s is conflict serialization because s is a serial execution.
Example 2: s = R1(x)W2(x)W1(x)W3(x)
There is no conflict equivalent schedule of s, but we can find aschedule s
s = R1(x)W1(x)W2(x)W3(x)
It is view equivalent with s, and s is a serial execution, so s isview serialization.
The test algorithm of view equivalent is a NPproblem, while conflict serialization coversthe most instances of serializable schedule, sothe serialization we say in later parts will
oint to conflict serialization if without
159Database Management Systems and Their Implementation, Xu Lizhen
special indication.
-
7/28/2019 DBMS and Implementation All(Full)
54/69
54
7.1.5 Preceding graph
Directed graph G =
V --- set of vertexes, including all transactionsparticipating in schedule.
E --- set of edges, decided through the analysis ofconflict operations. If any of the following conditionsis fulfilled add an ed e TT into E:
160Database Management Systems and Their Implementation, Xu Lizhen
i j
Ri(x) precedes Wj(x)
Wi(x) precedes Rj(x)
Wi(x) precedes Wj(x)
Finally, check if there is cycle in the preceding graph.If there is cycle in it, the schedule is not serializable,or it is serializable.
Find equivalent serial execution while serialization
1) Because there is no cycle, there must be somevertexes whose in-degree is 0. Remove these vertexesand relative edges from the preceding graph, andstore these vertexes into a queue.
2) Process the left graph in the same way as above, butthe vertexes removed should be stored behind the
161Database Management Systems and Their Implementation, Xu Lizhen
existing vertexes in the queue.
3) Repeat 1 and 2 until all vertexes moved into the
queue.Example: for schedule s on {T1,T2,T3,T4}, suppose :
s = W3(y)R1(x)R2(y)W3(x)W2(x)W3(z)R4(z)W4(x)
Is it serializable? Find out the equivalent serialexecution if it is.
s = W3
(y)R1
(x)R2
(y)W3
(x)W2
(x)W3
(z)R4
(z)W4
(x)
T1
T2
T3
T4
T2
T3
T4
Queue : T1
T2
T4
Queue : T1,T3
162Database Management Systems and Their Implementation, Xu Lizhen
The commission of concurrency control is to enforce theconcurrent transactions executed in a serializableschedule.
T4
Queue : T1,T3,T2Equivalent serial execution :T1T3T2T4
Empty
-
7/28/2019 DBMS and Implementation All(Full)
55/69
55
7.2 Locking Protocol
Locking method is the most basic concurrency control
method. There maybe many kinds of locking protocols.7.2.1 X locks
Only one type of lock, for both read and write.
163Database Management Systems and Their Implementation, Xu Lizhen
--- --- Y --- compatible N --- incompatible
NL X
NL Y Y
X Y N
X_lock RUpdate RX_unlock REOT
TAX_lock Rwait
X_lock RRead R
TB
*Two Phase Locking
Definiton1: In a transaction, if all locks precede allunlocks, then the transaction is called two phasetransaction. This restriction is called two phaselocking protocol.
Definition2: In a transaction, if it first acquires a lockon the ob ect before o eratin on it, it is called well-
164Database Management Systems and Their Implementation, Xu Lizhen
formed. T1Lock A
Lock BLock CUnlock AUnlock BUnlock C
T2Lock A
Lock BUnlock AUnlock BLock CUnlock C
2PL not 2PL
Growing
phase
Shrinkingphase
Theorem: If S is anyschedule of well-formed and twophase transactions,then S is serializable.
(proving is on p151)
Conclusions :
1) Well-formed + 2PL : serializable
2) Well-formed + 2PL + unlock update at EOT:serializable and recoverable. (without dominophenomena)
3) Well-formed + 2PL + holding all locks to EOT: strict
165Database Management Systems and Their Implementation, Xu Lizhen
two phase locking transaction.
7.2.2 (S,X) locks
S lock --- if read access isintended.
X lock --- if update accessis intended.
NL S X
NL Y Y Y
S Y Y N
X Y N N
-
7/28/2019 DBMS and Implementation All(Full)
56/69
56
7.2.3 (S,U,X) locks
U lock --- update lock. For
an update access thetransaction first acquiresa U-lock and thenpromote it to X-lock.
Purpose: shorten the time
NL S U X
NL Y Y Y Y
S Y Y Y N
U Y Y N N
166Database Management Systems and Their Implementation, Xu Lizhen
of exclusion, so as toboost concurrency degree,and reduce deadlock.
X Y N N N
X (S,X) (S,U,X)
Concurrency degree
Overhead
7.3 Deadlock & Live Lock
Dead lock: wait in cycle, no transaction can obtain all of resourcesneeded to complete.
Live lock: although other transactions release their resource inlimited time, some transaction can not get the resources needed fora very long time.
TA
TB R
167Database Management Systems and Their Implementation, Xu Lizhen
X_ ock R1
X_lock R2
wait
_ ocX_lock R1wait
T1: S-lockT2: S-lock
T: x-lock
Live lock is simpler, only need to adjust schedule strategy, suchas FIFO
Deadlock: (1) Prevention(dont let it occur); (2) Solving(permit itoccurs, but can solve it)
7.3.1 Deadlock Detection
1) Timeout: If a transaction waits for some specifiedtime then deadlock is assumed and the transactionshould be aborted.
2) Detect deadlock by wait-for graph G=
V : set of transactions {T i|Ti is a transaction in DBS
168Database Management Systems and Their Implementation, Xu Lizhen
(i=1,2,n)}
E : {|Ti waits for Tj (i j)} If there is cycle in the graph, the deadlock occurs.
When to detect?
1) whenever one transaction waits.
2) periodically
-
7/28/2019 DBMS and Implementation All(Full)
57/69
57
What to do when detected?
1) Pick a victim (youngest, minimum abort cost, )
2) Abort the victim and release its locks and resources
3) Grant a waiter
4 Restart the victim automaticall or manuall
169Database Management Systems and Their Implementation, Xu Lizhen
7.3.2 Deadlock avoidance1) Requesting all locks at initial time of transaction.
2) Requesting locks in a specified order of resource.
3) Abort once conflicted.
4) Transaction Retry
Every transaction is uniquely time stamped. If TArequires a lock on a data object that is already locked byTB, one of the following methods is used:
a) Wait-die: TA waits if it is older than TB, otherwise itdies, i.e. it is aborted and automatically retriedwith original timestamp.
b) Wound-wait: TA waits if it is younger than TB,
170Database Management Systems and Their Implementation, Xu Lizhen
otherwise it wound TB, i.e. TB is aborted andautomatically retried with original timestamp.
In above, both have only one direction wait, eitherolder younger or younger older. It is impossibleto occur wait in cycle, so the dead lock is avoided.
7.4 Lock Granularities
7.4.1 Locking in multi granularities
To reduce the overhead of locking, the lock unit should be thebigger, the better; To boost the concurrency degree oftransactions, the lock unit should be the smaller, the better.
In large scale DBMS, the lock unit is divided into several levels:
DBFileRecordField
171Database Management Systems and Their Implementation, Xu Lizhen
In this situation, if a transaction acquires a lock on a data objectof some level then it acquires implicitly the same lock on eachdescendant of that data object.
So, there are two kinds of locks in multi granularity lock method:
Explicit lock
Implicit lock
-
7/28/2019 DBMS and Implementation All(Full)
58/69
58
7.4.2 Intention lock
How to check conflicts on implicit locks
Intention lock: provide three kinds of intension lockswhich are IS, IX and SIX. For example, if a transactionadds a S lock on some lower level data object, all thehigher level data object which contains it should beadded an IS lock as a warning information. If another
172Database Management Systems and Their Implementation, Xu Lizhen
transaction want to apply an X lock on a higher leveldata object later, it can find the implicit conflictthrough IS lock.
IS --- Intention share lock IX --- Intention exclusive lock SIX --- SIX
DB
File
Record
Field
S
IS
IS
Compatibility matrix while lock in multi granularities :
NL IS IX S SIX X
NL Y Y Y Y Y Y
IS Y Y Y Y Y N
IX Y Y Y N N N
X
SIX
IXS
strong
Exclusion
173Database Management Systems and Their Implementation, Xu Lizhen
S Y Y N Y N N
SIX Y Y N N N N
X Y N N N N N
The lock with strong exclusion degree can substitutethe lock with weak exclusion degree while locking, butit is not right contrarily.
IS
NL weak
egree
Locking Rules:
Locks are requested from root to leaves and releasedfrom leaves to root.
Locking order DB
Example:
174Database Management Systems and Their Implementation, Xu Lizhen
record file DB
(1) S IS IS
(2) X IX IX
(3) All read and some write SIX IX, IS
Request X lock to records need updating Substitute with stronger exclusive lock
File
Record
Field
-
7/28/2019 DBMS and Implementation All(Full)
59/69
59
7.5 Locking on Index (B+ Tree)
Transactions access concurrently not only the data in DB,
but also the indexes on these data, and apply operationson indexes, such as search, insert, delete, etc. So indexesalso need concurrency control while multi granularitylocking is supported.
175Database Management Systems and Their Implementation, Xu Lizhen
ow can we e c en y o c a par cu ar ea no e Btw, dont confuse this with multiple granularity locking!
One solution: Ignore the tree structure, just lock pageswhile traversing the tree, following 2PL.
This has terrible performance! Root node (and many higher level nodes) become bottlenecks
because every tree access begins at the root.
Some Useful Observations
Every access to B+ tree need a traverse from root tosome leaf. Only leaves have detail information aboutdata, such as TID, while higher levels of the tree onlydirect searches for leaf pages.
One node occupies one page generally, so the lockunit of B+ tree is page. Dont need multi granularity
176Database Management Systems and Their Implementation, Xu Lizhen
locks, only S, X lock on page level are enough.
B+ tree is key resource accessed frequently, liable to
be the bottleneck of system. Performance is veryimportant in index concurrency control.
If occur conflict while traverse the tree, discard alllocks applied, and search from root again after somedelay. Avoid deadlock resulted by wait.
Some Useful Observations Originally, locks on index are only used to keep the
consistency of index itself. The correctness ofconcurrent transactions is responsible by 2PL. Fromthis sense, the locks on index dont need keeping toEOT, they can release immediately after finish the
177Database Management Systems and Their Implementation, Xu Lizhen
.But in 7.6 we will know, even the strict 2PL has leakwhile multi granularity locks are permitted. The lockon leaf of B+ tree should be kept to EOT in order tomake up the leak of strict 2PL in this situation, whilelocks on other nodes of the tree can be released afterfinishing of search.
-
7/28/2019 DBMS and Implementation All(Full)
60/69
60
ExampleROOT
A
B
20
35
Traversing
the tree
178Database Management Systems and Their Implementation, Xu Lizhen
C
D E
F
G H I20*
38 44
22* 23* 24* 35* 36* 38* 41* 44*
23
Tree Locking Algorithm
1. While traversing, apply S lock on root first, then apply S lock onthe child node selected. Once get S lock on the child, the S lockon parent can be released, because traversing cant go back.Search like this until arrive leaf node. After traversing, only Slock on wanted leaf is left. Keep this S lock till EOT.
2. While inserting new index item, traverse first, find the leaf node
179Database Management Systems and Their Implementation, Xu Lizhen
w ere e new em s ou e nser e n. pp y oc on sleaf node.
If it is not full, insert directly.
If it is full, split node according to the rule of B+ tree. Whilesplitting, besides the original leaf, the new leaf and their parentshould add X lock. If parent is also full, the splitting will continue.
In every splitting, must apply X lock on each node to be changed.These X locks can be released when the changes are finished.
After the all inserting process is completed, the X lock on leaf nodeis changed to S lock and kept to EOT.
Tree Locking Algorithm3. While deleting an index item from the tree, the
procedure is similar as inserting. Deleting may causethe combination of nodes in B+ tree. The nodechanged must be X locked first and X lock releasedafter finishing change. The X lock on leaf node is also
180Database Management Systems and Their Implementation, Xu Lizhen
.
-
7/28/2019 DBMS and Implementation All(Full)
61/69
61
7.6 Phantom and Its Prevention
The assumption that the DB is a fixed
collection of objects is not true when multigranularity locking is permitted. Then evenStrict 2PL will not assure serializability:
T1 locks all a es containin sailor records with
181Database Management Systems and Their Implementation, Xu Lizhen
rating = 1, and finds oldest sailor (say, age = 71).
Next, T2 inserts a new sailor; rating = 1, age = 96.
T2 also deletes oldest sailor with rating = 2 (and,say, age = 80), and commits.
T1 now locks all pages containing sailor recordswith rating = 2, and finds oldest (say, age = 63).
No consistent DB state where T1 is correct!
The Problem
T1 implicitly assumes that it has locked the set of allsailor records with rating = 1.
Assumption only holds if no sailor records are added whileT1 is executing!
Need some mechanism to enforce this assumption. (Indexlocking and predicate locking)
182Database Management Systems and Their Implementation, Xu Lizhen
Example shows that conflict serializability guaranteesserializability only if the set of objects is fixed!
If the system dont support multi granularity locking,or even if support multi granularity locking, thequery need to scan the whole table and add S lock onthe table, then there is not this problem. For example :
select s#, average(grade) from SC group by s#;
Index Locking
If there is a dense index on the rating field, T1should lock the index node containing the dataentries with rating = 1 and keep it until EOT.
If there are no records with rating = 1, T1 must lock theindex node where such a data entry would be, if it existed!
r=1
Data
Index
183Database Management Systems and Their Implementation, Xu Lizhen
W en T2 wants to insert a new sai or rating = 1,age = 96 ), he cant get the X lock on the index nodecontaining the data entries with rating = 1, so hecant insert the new index item to realize the insertof a new sailor.
If there is no suitable index, T1 must lock thewhole table, no new records can be added beforeT1 commit of course.
-
7/28/2019 DBMS and Implementation All(Full)
62/69
62
Predicate Locking
Grant lock on all records that satisfy some
logical predicate, e.g. age > 2*salary. Index locking is a special case of predicate
locking for which an index supports efficient
184Database Management Systems and Their Implementation, Xu Lizhen
. What is the predicate in the sailor example?
In general, predicate locking has a lot oflocking overhead. It is almost impossible torealize it.
7.7 Isolation Level of Transaction
Support for isolation level of transaction is addedfrom SQL-92. Each transaction has an access mode, adiagnostics size, and an isolation level.
SET TRANSACTION statement
Possible result
185Database Management Systems and Their Implementation, Xu Lizhen
so ationLevel
Lock demandDirtyRead
UnrepeatableRead
PhantomProblem
Read
Uncommitted Maybe Maybe Maybe
No lock when read; X lock when write,
keep until EOTReadCommitted
No Maybe MaybeS lock when read, release after read; Xlock when write, keep until EOT
RepeatableR