© Prof. Dr.-Ing. Wolfgang Lehner
INTELLIGENT DATABASE GROUP
DBMS Architecture 5
Foundations of Information Systems
© A. Behrend, Foundations of Information Systems
What is in the Lecture
1. Database Usage: Query, Programming, Design
2. Database Architecture: Indexes, Transactions, Query Processing
How is a Database System Built?
SELECT s.firstname, s.lastname, COUNT(l.name)
FROM Student s
  INNER JOIN Program p ON s.programId = p.id
  INNER JOIN Attendance a ON a.studentId = s.studentId
  INNER JOIN Lecture l ON a.lectureId = l.id
WHERE p.name = 'DSE'
GROUP BY s.firstname, s.lastname
byte[] b = read(File f, int pos, int length)
Architectural Blue Print
[Figure: layered architecture of a database system, from top to bottom]
- Application: SQL, JDBC, ODBC, …
- Data System: Query processing (parsing, plan generation, plan optimization, plan execution)
- Access System: Data model semantics (system catalog, record format, logical access paths)
- Storage System: Storage structures (record management, free space management, physical access paths)
- Buffer: Buffered pages (page replacement strategy, materialization strategy, logging, backup, recovery)
- File System: Paged files
- Hardware: Disks, Flash, RAID, SAN, …
Example objects shown in the figure: the query above (Run); Table Person(id INT, name VARCHAR, birthday DATE); Index P_id_IX on Person.id; record (1, 'Smith', 15.06.1982); TID lists
Architectural Trends
Different Access Characteristics
OLTP (On-line Transaction Processing)
- Mix of read-only and update queries
- Minor analysis tasks
- Used for data preservation and lookup
- Typically reads only a few records at a time
- High performance by storing contiguous records in disk pages
OLAP (On-line Analytical Processing)
- Query-intensive DBMS applications
- Infrequent, batch-oriented updates
- Complex analysis on large data volumes
- Typically reads only a few attributes of large amounts of historical data in order to partition them and compute aggregates
- High performance by storing contiguous values of a single attribute
Hardware Developments
Hardware improvements are not equally distributed
- Advances in CPU speed have outpaced advances in RAM latency
- Main-memory access has become a performance bottleneck for many computer applications: bandwidth, latency, address translation (TLB)
→ Memory Wall
- Cache memories can reduce the memory latency when the requested data is found in the cache
- Vertically fragmented data structures optimize memory cache usage
Row Storage vs Column Storage
Row Storage
+ easy to add/modify a record
- might read unnecessary data
Column Storage
+ only need to read in relevant data
- tuple writes require multiple accesses
→ suitable for read-mostly, read-intensive, large data repositories
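The trade-off between the two layouts can be made concrete with a small sketch. The table Pers(PID, NAME, AGE, SALARY) is taken from the index slides; the concrete values and the function names are made up for illustration, and "values touched" is only a rough stand-in for the amount of data read.

```python
# Row storage: one complete record after the other (values are hypothetical).
row_store = [
    (1, "Smith", 41, 52000),
    (2, "Jones", 48, 61000),
    (3, "Meier", 33, 48000),
]

# Column storage: all values of one attribute are stored contiguously.
column_store = {
    "pid": [1, 2, 3],
    "name": ["Smith", "Jones", "Meier"],
    "age": [41, 48, 33],
    "salary": [52000, 61000, 48000],
}


def avg_salary_rows(rows):
    """Average salary from the row layout; every record is read completely."""
    touched = sum(len(rec) for rec in rows)
    return sum(rec[3] for rec in rows) / len(rows), touched


def avg_salary_columns(cols):
    """Average salary from the column layout; only one column is read."""
    salary = cols["salary"]
    return sum(salary) / len(salary), len(salary)


def insert_columns(cols, rec):
    """A tuple write has to touch every column once; returns the number of writes."""
    for name, value in zip(cols, rec):
        cols[name].append(value)
    return len(cols)


avg_r, touched_r = avg_salary_rows(row_store)
avg_c, touched_c = avg_salary_columns(column_store)
```

Both layouts return the same average, but the row layout touches all four attributes of every record while the column layout touches only the salary values. The price shows up on insert, where the column layout needs one write per attribute.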
Processing Models
[Marcin Zukowski, Peter A. Boncz, Niels Nes, Sándor Héman: MonetDB/X100 - A DBMS In The CPU Cache. IEEE Data Eng. Bull. 28(2), pp. 17-22, 2005]
Transaction Management
Principle of a transaction
- Sequence of successive DB operations that transforms a database from one consistent state into another consistent state, surrounded by BOT and EOT (Commit, Abort)
Properties: ACID (Atomicity, Consistency, Isolation, Durability)
A transaction will always come to an end
- Normal (commit): changes are permanently stored within the DB
- Abnormal (abort, rollback): changes already made are taken back
Note: The EOT state need not be different from the BOT state
[Figure: consistent database → BOT (begin of transaction) → DML operations on a possibly inconsistent database → EOT (end of transaction) → consistent database]
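The commit and rollback paths can be tried out with a minimal sketch using Python's built-in sqlite3 module. The account table, the function names and the "no negative balance" rule are assumptions made up for this illustration; they are not part of the lecture.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one transaction."""
    try:
        # The first UPDATE implicitly opens the transaction (BOT).
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Hypothetical integrity rule: balances must not become negative.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.commit()      # normal EOT: changes become permanent
        return True
    except ValueError:
        conn.rollback()    # abnormal EOT: changes already made are taken back
        return False


def balances(conn):
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

A successful transfer moves the database from one consistent state to another. A failed transfer ends with a rollback, so the EOT state equals the BOT state.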
ACID Properties of Transactions
Atomicity
- Indivisibility due to the transaction definition (Begin - End)
- All-or-nothing principle, i.e., the DBS guarantees
  - either the complete execution of a transaction …
  - … or the ineffectiveness of the whole transaction (and of all associated operations)
Consistency
- A successful transaction guarantees that all consistency requirements (integrity requirements) have been met
Isolation
- Multiple transactions run isolated from each other and do not use (inconsistent) intermediate results from other transactions
Durability
- All results of successful transactions have to be made persistent
Motivation
Atomicity → UNDO Recovery
- Part of the transaction is done, but we want to cancel it: ABORT/ROLLBACK
- System crashes during the transaction: some changes made it to the disk, some did not
Durability → REDO Recovery
- Transaction finished, user notified: COMMIT
- System crashes before the changes are sent successfully to disk (asynchronous write)
Consistency → UNDO Recovery for consistency-related rollbacks
- Physical consistency
  - Correctness of the storage and access structures
  - Completely executed modification operations preserve the physical consistency
- Logical consistency
  - Correctness of data contents: they correspond to a (possible) state of the real world
  - Completely executed transactions preserve the logical consistency
    - All modifications of finished transactions are included
    - No modifications of open transactions are included
Remember: Logical consistency requires physical consistency in the first place
Reasons for crashes
Transaction error
- Violation of system restrictions: violation of security regulations, excessive resource requirements, deadlocks
- Application-related errors, e.g., wrong operations and values → ROLLBACK
System error
- System crash with loss of main-memory contents
- Database system, operating system, hardware, power failure
Device error (especially storage-medium error)
- Destruction of secondary storage systems
Catastrophes
- Destruction of the computing center
Guarantee Atomicity & Durability
Assumptions
- The system may crash, but the disk is durable
- The only atomicity guarantee is that a disk block write is atomic
Materialization strategy
- Preferred policy: Steal/No-Force
- This combination is the most complicated one, but allows for the highest performance
No-Force complicates enforcing Durability
- What if the system crashes before a modified page written by a committed transaction makes it to disk?
- Write as little as possible, in a convenient place, at commit time to support REDOing modifications
Steal complicates enforcing Atomicity
- What if the transaction that performed updates aborts?
- What if the system crashes before the transaction is finished?
- Must remember the old value of P (to support UNDOing the write to page P)
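How the remembered old and new values support UNDO and REDO can be illustrated with a toy model. This is a strongly simplified sketch and not a real recovery protocol: the class TinyDB and its methods are hypothetical, the log is assumed to reach stable storage on every append, all pages are assumed to exist up front, and concurrent transactions are assumed not to write the same page.

```python
class TinyDB:
    """Toy model: `disk` and `log` survive a crash, `buffer` does not."""

    def __init__(self, pages):
        self.disk = dict(pages)   # durable pages
        self.buffer = {}          # volatile buffer pool
        self.log = []             # assumed to be durable after every append

    def write(self, txn, page, new_value):
        # Remember old and new value before changing the buffered page.
        old_value = self.buffer.get(page, self.disk[page])
        self.log.append(("UPDATE", txn, page, old_value, new_value))
        self.buffer[page] = new_value

    def flush(self, page):
        """Steal: a dirty page may reach the disk before its transaction commits."""
        self.disk[page] = self.buffer[page]

    def commit(self, txn):
        """No-Force: only a commit record is written, no data pages."""
        self.log.append(("COMMIT", txn))

    def crash(self):
        """Main-memory contents are lost."""
        self.buffer = {}

    def recover(self):
        committed = {rec[1] for rec in self.log if rec[0] == "COMMIT"}
        for rec in self.log:              # REDO committed updates, oldest first
            if rec[0] == "UPDATE" and rec[1] in committed:
                self.disk[rec[2]] = rec[4]
        for rec in reversed(self.log):    # UNDO unfinished updates, newest first
            if rec[0] == "UPDATE" and rec[1] not in committed:
                self.disk[rec[2]] = rec[3]


db = TinyDB({"A": 10, "B": 20})
db.write("T1", "A", 11)
db.commit("T1")            # committed, but page A is still only in the buffer
db.write("T2", "B", 99)
db.flush("B")              # uncommitted change of T2 is stolen to disk
db.crash()
disk_after_crash = dict(db.disk)
db.recover()
```

After the crash the disk holds the old value of A (committed change lost) and the new value of B (uncommitted change present). Recovery repairs both: REDO reapplies the change of T1, UNDO restores the old value written by T2.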
Record Management
Record
- Package of fields that together describe a thing, a person, a fact, etc.
- Each field represents one property of the entity described by the record
- Similar to a struct in C
- Variable length (in contrast to pages)
Record Manager
- Organizes the physical storage of records in pages
- Operations: Get, Insert, Update, Delete, Scan
- Agnostic to record structure and semantics: records are considered as byte strings of variable length
- Structure and content of a record are defined by the Access System and the application
Challenges
- Record addressing
- Free space management
Record Addressing
Record address
- Identifier used to address records, e.g., in indexes or in query processing
- Assigned when a record is inserted
Goals
- Stability of the identifier
- Fast and direct access
- Little organizational overhead
Direct addressing
- Byte address or position number in file or page
- Unstable
  - Byte address: if a record grows in length, the following records get new addresses
  - Position number: insert and delete operations change the sequence of records
Indirect addressing
- Surrogate with mapping table (complete indirection)
- Tuple Identifier (TID concept)
Surrogate with Mapping Table
Surrogate
- Record type + serial number
- Serial number remains constant during the record's lifetime
Mapping table
- Maps surrogate to page
Problems
- Where to store the mapping table?
- How can it be extended?
- How to search the mapping table efficiently?
→ H2 use B-Tree to store mapping table
[Figure: Mapping Table with columns Surrogate | Page ID]
TID Concept
Record addressing with indirection inside the page
- Each page contains an array with record positions
- The TID of a record consists of the page id and the index in the position array
Pros
- Access with one page access (two pages in case of overflow)
- Stable
- No mapping table required
Operations
- Insert: reuse an unused position or add a position
- Delete: mark the position as unused in the array
- Update: update all positions in the array
- Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position (no double overflow: update the TID at the original position)
[Figure: page with position array pointing to a Record and to an Overflow Record]
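The TID operations can be sketched in a few lines. This is an illustrative model only: the names Page and RecordFile, the tiny PAGE_SIZE and the representation of a position entry (record bytes, a forwarding TID, or None for unused) are assumptions for this sketch, and every record is assumed to fit into a single page.

```python
from typing import List, Tuple, Union

TID = Tuple[int, int]            # (page id, index in that page's position array)
Slot = Union[bytes, TID, None]   # record bytes, forwarding TID, or unused
PAGE_SIZE = 64                   # hypothetical; tiny so that overflow is easy to trigger


class Page:
    def __init__(self) -> None:
        self.slots: List[Slot] = []   # the position array

    def used(self) -> int:
        """Bytes occupied by records stored in this page."""
        return sum(len(s) for s in self.slots if isinstance(s, bytes))

    def put(self, entry: Slot) -> int:
        """Reuse an unused position or add a new one."""
        for i, s in enumerate(self.slots):
            if s is None:
                self.slots[i] = entry
                return i
        self.slots.append(entry)
        return len(self.slots) - 1


class RecordFile:
    def __init__(self) -> None:
        self.pages: List[Page] = []

    def insert(self, record: bytes) -> TID:
        for pid, page in enumerate(self.pages):
            if page.used() + len(record) <= PAGE_SIZE:
                return (pid, page.put(record))
        self.pages.append(Page())
        return (len(self.pages) - 1, self.pages[-1].put(record))

    def get(self, tid: TID) -> Slot:
        entry = self.pages[tid[0]].slots[tid[1]]
        if isinstance(entry, tuple):      # overflow: one extra page access
            entry = self.pages[entry[0]].slots[entry[1]]
        return entry

    def delete(self, tid: TID) -> None:
        page = self.pages[tid[0]]
        entry = page.slots[tid[1]]
        if isinstance(entry, tuple):      # also free the overflow record
            self.pages[entry[0]].slots[entry[1]] = None
        page.slots[tid[1]] = None         # mark the position as unused

    def update(self, tid: TID, record: bytes) -> None:
        page = self.pages[tid[0]]
        entry = page.slots[tid[1]]
        if isinstance(entry, tuple):      # no double overflow: drop the old one
            self.pages[entry[0]].slots[entry[1]] = None
            entry = None
        old_len = len(entry) if entry is not None else 0
        if page.used() - old_len + len(record) <= PAGE_SIZE:
            page.slots[tid[1]] = record   # fits into the home page
        else:
            # Store as overflow record; the home position keeps only its TID.
            page.slots[tid[1]] = self.insert(record)
```

For example, two 30-byte records share page 0. Growing the first one to 50 bytes moves it to an overflow page, yet its TID (0, 0) stays valid; shrinking it again brings it back to its home page.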
Free Space Management
Problem: Which page has enough space for a new record?
Solution: A free space table lists for all pages how much space is left
Free space value
- Precise value: Ceil(Log2(page size)) bits ⇒ 2 bytes for a common page size of 4K
- Rough value: use fewer bytes, free space = (value · page size) / 2^(bits per value)
Free space table
- With direct page addressing
  - Assuming a single page can take n free space entries
  - The first page and each (n+1)-th page take free space entries
- With indirect page addressing
  - Free space information is stored in the page table
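The two encodings of the free space value can be checked with a short calculation. The function names are made up for this sketch, and rounding down plus clamping to the largest representable value are assumptions chosen so that the table never promises more space than a page really has.

```python
import math

PAGE_SIZE = 4096  # common page size of 4K


def precise_bits(page_size=PAGE_SIZE):
    """Bits needed for a byte-precise free space value."""
    return math.ceil(math.log2(page_size))


def encode_rough(free_bytes, bits=8, page_size=PAGE_SIZE):
    """Rough value: scale the free space down to `bits` bits (rounded down)."""
    value = (free_bytes << bits) // page_size
    return min(value, (1 << bits) - 1)


def decode_rough(value, bits=8, page_size=PAGE_SIZE):
    """free space = (value * page size) / 2^(bits per value)"""
    return value * page_size // (1 << bits)
```

With one byte per entry and 4K pages, every step of the rough value stands for 16 bytes. A page with 1000 free bytes is stored as 62 and reported as 992 bytes, which is slightly pessimistic but safe.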
Physical Access Paths – Index Structures
Overview Indexes
Table scan
- Read all pages and evaluate the search criteria for each record
- Pre-fetching
Index scan
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of index attributes
- Logical/physical sorting of values of key attributes (depending on index structure)
- Enforcing uniqueness
Types of indexes
- Primary (clustered) index: determines the physical organization, used for the PK
- Secondary (non-clustered) index: redundant access path
[Figure: Primary Index and Secondary Index for Pers(PID, NAME, AGE, SALARY); labels: Age, Salary]
Overview Indexes (2)
Choice of access paths
Index scan
- Only useful for low selectivity (low number of result tuples)
- Break-even point depends on the ratio of result tuples to all tuples (usually max. 5%)
- Requires statistics about the data
- Additional costs for index storage and updating
Table scan
- Adequate/efficient for small tables (e.g., 5 pages)
- Queries with high selectivity (large result sets)
- 100-200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time over hit rate for Index Scan and Table Scan]
Classification of Index Structures
Classification
[Figure: classification tree of one-dimensional index structures. Key comparison: sequential (sequential lists, physically sequential; linked lists, logically sequential) and tree-based (binary search trees; multiway trees, example: B-Tree). Key transformation: hash-based (static, dynamic). Also shown: prefix trees (tries).]
Multiway Trees
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size
B-Tree
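[Figure: B-Tree node with entries, pointers and free space]
- (Ki, Di, Pi) = entry
- min |P| = k+1, max |P| = 2k+1
- The pointer before K1 leads to keys < K1, the pointer between Ki and Ki+1 leads to keys with Ki < keys < Ki+1, the pointer after Kp leads to keys > Kp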
B-Tree (2)
Keys
- Agnostic to specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length
Operations
- Search for the data of a given key value
- Insertion and deletion of key-data pairs
Payload
- Agnostic to specific data semantics
- Can be a record, a reference (TID), or a mix
[Figure: Example B-Tree with k = 2, h = 3]
Search in the B-Tree
Starting at the root node, each node is searched from left to right:
1) If Ki matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which Pi-1 points)
2) If Ki is larger than the desired value, the search continues in the root of the sub-tree identified by Pi-1
3) If Ki is smaller than the desired value, the comparison is repeated with Ki+1
4) If K2k is also smaller than the desired value, the search continues in the sub-tree of P2k
If it is impossible to descend further into a sub-tree in case 2) or 4) (leaf node), the search is aborted: no record with the desired key value exists
[Figure: Search for 38, 20, 6]
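The four search cases can be sketched as follows. The Node class and the example tree are hypothetical (the tree is not the one from the slide figure), payload data is omitted, and the second return value simply counts visited nodes as a stand-in for page reads.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                 # K1..Kp in sorted order
        self.children = children or []   # P0..Pp, empty for a leaf node


def search(root, wanted):
    """Return (found, pages_read) for the key `wanted`."""
    node, pages_read = root, 0
    while True:
        pages_read += 1
        i = 0
        # Case 3: Ki is smaller than the wanted key, compare with the next key.
        while i < len(node.keys) and node.keys[i] < wanted:
            i += 1
        # Case 1: Ki matches.
        if i < len(node.keys) and node.keys[i] == wanted:
            return True, pages_read
        # Leaf node reached: no further descent possible.
        if not node.children:
            return False, pages_read
        # Case 2 (Ki is larger) or case 4 (all keys are smaller): descend.
        node = node.children[i]


# Hypothetical example tree with three levels.
tree = Node([25], [
    Node([10, 20], [Node([2, 5, 7]), Node([13, 15, 18]), Node([21, 22, 24])]),
    Node([30, 40], [Node([26, 27, 28]), Node([32, 35, 38]), Node([41, 42, 45])]),
])
```

In this tree, key 38 is found in a leaf after three node visits, key 20 is found in an inner node after two, and the search for key 6 ends unsuccessfully in a leaf.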
Insertions in the B-Tree (1)
Insertion rule: insert only into leaf nodes
At non-leaf nodes: descend the tree as for the search
- S ≤ Ki: follow Pi-1
- S > Ki: check Ki+1
- S > K2k: follow P2k
At a leaf node
- Insert the data record according to the sort order
- Special case: the leaf node is full (2k records) → split the leaf node
Splitting
- Generate a new leaf node
- Split the 2k+1 entries (in order) into two leaf nodes
  - first k entries → left node
  - last k entries → right node
- The middle entry ((k+1)-th) is used as the new "discriminator" (branching) and inserted into the parent node
Insertions in the B-Tree (2)
Node splitting during insertion
- Two possible situations after a split
  - The parent node is full → repeat the split on this level
  - Enough space → FINISHED
- Special case: root split
  - Split of the root node → new root with two successor nodes
  - The height of the tree grows by 1
  - The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing)
- No unloading or loading necessary
- The tree is always balanced
- But: in case of many insertions/deletions, a reorganization can be beneficial
Insertions in the B-Tree (3)
Insertion example
- Order k = 1 (n = 2k)
- Keys: 1, 5, 2, 6, 7, 4, 8, 3
- Finally: h = 3
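The insertion example can be replayed with a compact sketch of the insertion and splitting rules. The class and method names are made up for this illustration, payload data is omitted, and the code covers insertion only (no deletion).

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty for a leaf node


class BTree:
    def __init__(self, k):
        self.k = k                       # order: at most 2k keys per node
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                        # root split: the tree grows by one level
            mid, right = split
            self.root = Node([mid], [self.root, right])

    def _insert(self, node, key):
        # First position i with keys[i] >= key, i.e. S <= Ki -> follow P(i-1).
        i = bisect_left(node.keys, key)
        if not node.children:
            node.keys.insert(i, key)     # insert only into leaf nodes
        else:
            split = self._insert(node.children[i], key)
            if split:                    # child was split: take over discriminator
                mid, right = split
                node.keys.insert(i, mid)
                node.children.insert(i + 1, right)
        if len(node.keys) > 2 * self.k:  # overflow: 2k+1 entries
            return self._split(node)
        return None

    def _split(self, node):
        k = self.k
        mid = node.keys[k]               # (k+1)-th entry moves to the parent
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys = node.keys[:k]        # first k entries stay in the left node
        node.children = node.children[:k + 1]
        return mid, right

    def height(self):
        h, node = 1, self.root
        while node.children:
            h += 1
            node = node.children[0]
        return h


tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

Inserting the last key 3 overflows a leaf and then the root, so the split propagates from the bottom to the top and the tree ends with height 3, as stated in the example.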
Insertion and Deletion in the B-Tree
Problem
- Insertion can create overflow
- Deletion can create underflow and overflow
Example
- Insertion of key 22 → Overflow → Split
- Deletion of key 22 → Underflow; all four nodes need to be accessed; finally the tree is the same as the input
[Figure: Insert 22]
Deletion in the B-Tree
Example: Order k = 1 (n = 2k), delete key 3
[Figure: Underflow → Merge]
Deletion in the B-Tree (2)
Example: Order k = 1 (n = 2k), delete key 3
[Figure: Underflow → Merge]
Remember: Each path from the root to a leaf has the same length h
Deletion in the B-Tree (3)
Example: Order k = 1 (n = 2k), delete key 3
[Figure: Underflow → Merge, Overflow → Split]
Deletion in the B-Tree (4)
Example: Order k = 1 (n = 2k), delete key 3
[Figure: Overflow → Split, Root Split]
Deletion Algorithm
Example (there are different algorithms)
- Search for the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successors
  - Analyze which successor node of K has more elements, the left or the right one
  - If both have the same number of elements, decide for one
  - Replace the key K to be deleted with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)
Note: Major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)
B-Trees, B+-Trees and B*-Trees
B+-Trees and B*-Trees
- Data is only in leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees
- Modified valid node sizes from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
[Figure: Example of a non-unique secondary index; B-Tree with k = 2, h = 3]
Indexing Low Cardinality Columns
Problem
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
[Figure: index entries F and M, each with a long TID list (TID, TID, TID, TID, …)]
Bitmap Index
Idea (long history, since the 1960s)
- Create a bitmap/bitlist for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set
Necessary condition
- Sequential numbering of the tuples (TIDs)
Example: bitmaps for attribute Sex
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
Querying Bitmap Indexes
Main advantage of bitmap indexes
- Simple and efficient logical combination of bitmaps possible
- Read only data that is relevant for the predicates
- Example: σ Sex='f' ∧ Region='n' (R)
- Bitmaps B1 and B2 in conjunction:
  for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation for σ Sex='f' ∧ Region='n' ∧ Race='Asian' (R) ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1,000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
F:             1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N:             0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A:             0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
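Both the bitmap conjunction and the cost estimate can be reproduced with a short sketch. The rows are the (Sex, Region, Race) values of the example table from the Bitmap Index slide; the function and variable names are made up, and the bitmaps are derived from those rows here instead of being copied from the figure.

```python
from fractions import Fraction

# (sex, region, race) of the 15 example tuples, in TID order.
rows = [
    ("f", "n", "white"), ("m", "e", "black"), ("f", "e", "asian"),
    ("f", "ne", "white"), ("m", "se", "hisp"), ("f", "e", "white"),
    ("f", "sw", "asian"), ("f", "w", "black"), ("f", "n", "asian"),
    ("m", "e", "hisp"), ("m", "se", "black"), ("f", "s", "white"),
    ("m", "nw", "black"), ("f", "s", "white"), ("f", "w", "black"),
]


def build_bitmaps(rows, column):
    """One bitmap per attribute value; bit i belongs to the tuple at position i."""
    bitmaps = {}
    for pos, row in enumerate(rows):
        bitmaps.setdefault(row[column], [0] * len(rows))[pos] = 1
    return bitmaps


def bitmap_and(*bitmaps):
    """Bitwise conjunction of bitmaps of equal length."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


sex = build_bitmaps(rows, 0)
region = build_bitmaps(rows, 1)
race = build_bitmaps(rows, 2)

hits = bitmap_and(sex["f"], region["n"], race["asian"])
matching_positions = [pos for pos, bit in enumerate(hits) if bit]

# I/O cost estimation with the numbers from the slide.
selectivity = Fraction(1, 2) * Fraction(1, 8) * Fraction(1, 4)
n_tuples, tuple_bytes, page_bytes = 10_000, 400, 4096
tuples_per_page = page_bytes // tuple_bytes        # about 10 tuples per page
table_scan_pages = n_tuples // tuples_per_page
bitmap_access_pages = int(n_tuples * selectivity)  # worst case: one page per hit
```

Only the positions with a 1 in the result bitmap have to be fetched from the table. In the estimate, that is roughly 156 data pages (plus the bitmap page) compared with 1,000 pages for the table scan.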
- DBMS Architecture
- What is in the Lecture
- How is a Database System Built?
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths – Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
-
| 264
gt
copy A Behrend Foundations of Information Systems |
What is in the Lecture
1 Database Usage Query Programming Design
2 Database Architecture Indexes Transactions Query Processing
| 265
gt
copy A Behrend Foundations of Information Systems |
How is Database System build
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
byte[] b = read(File f int pos int length)
| 266
gt
copy A Behrend Foundations of Information Systems |
Storage System
Buffer
File System
Hardware
Data System
Application
Architectural Blue Print
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
copy A Behrend Foundations of Information Systems | 267
gt
|
Architectural Trends
| 268
gt
copy A Behrend Foundations of Information Systems |
Different Access Characteristics
OLTP (On-line Transaction Processing) Mix between read-only and update queries Minor analysis tasks Used for data preservation and lookup Read typically only a few records at a time High performance by storing contiguous records in disk pages
OLAP (On-line Analytical Processing) Query-intensive DBMS applications Infrequent batch-oriented updates Complex analysis on large data volumes Read typically only a few attributes of large amounts of historical data in order to
partition them and compute aggregates High performance by storing contiguous values of a single attribute
| 269
gt
copy A Behrend Foundations of Information Systems |
Hardware Developments
Hardware improvements not equally distributed Advances in CPU speed have outpaced advances
in RAM latency Main-memory access has become a performance
bottleneck for many computer applications Bandwidth Latency Address translation (TLB)
rarr Memory Wall Cache memories can reduce the memory latency
when the requested data is found in the cache Vertically fragmented data structures optimize
memory cache usage
| 270
gt
copy A Behrend Foundations of Information Systems |
Row Storage vs Column Storage
Row Storage + easy to addmodify a record - might read unnecessary data
Column Storage + only need to read in relevant data - tuple writes require multiple accesses -gt suitable for read-mostly read-intensive large data repositories
| 271
gt
copy A Behrend Foundations of Information Systems |
Processing Models
[Marcin Zukowski Peter A Boncz Niels Nes Saacutendor Heacuteman MonetDBX100 - A DBMS In The CPU Cache IEEE Data Eng Bull 28(2) p17-22 2005]
| 272
gt
copy A Behrend Foundations of Information Systems |
Transaction Management
Principle of a transaction Sequence of successive DB operations that transform a database from a consistent
state into another consistent state surrounded by BOT EOT (Commit Abort)
Properties ACID Atomicity Consistency Isolation Durability A transaction will always come to an end Normal (commit) changes are permanently stored within the DB Abnormal (abort rollback) already composed changes are taken back
Note EOT state must not be different from BOT state
BOT(begin of transaction)
EOT(end of transaction)
possibly inconsistent database
consistentdatabase
consistentdatabase
DB DB
DML operations
| 273
gt
copy A Behrend Foundations of Information Systems |
ACID Properties of Transactions
Atomicity Indivisibility due to the transaction definition (Begin - End) All-or-nothing principle ie the DBS guarantees Either the complete execution of a transaction hellip hellip or the ineffectiveness of the whole transaction (and of all associated operations)
Consistency A successful transaction guarantees that all consistency requirements (integrity
requirements) have been met
Isolation Multiple transactions run isolated from each other and do not use (inconsistent)
intermediate results from other transactions
Durability All results of successful transactions have to be made persistent
| 274
gt
copy A Behrend Foundations of Information Systems |
Motivation
Atomicity Part of the transaction is done but we want to cancel it ABORTROLLBACK System crashes during transaction some changes made it to the disk some did not
Durability Transaction finished user notified COMMIT System crashes before changes sent successfully to disk (asynchronous write)
Consistency Physical consistency Correctness of the storage and access structures Completely executed modification operations preserve the consistency
Logical consistency Correctness of data contents ndash correspond to a (possible) state of the real world Completely executed transactions preserve the logical consistency
- All modifications of finished transactions are included - No modifications of open transactions are included
Remember Logical consistency requires physical consistency in the first place
UNDO Recovery
REDO Recovery
UNDO Recovery for consistency-related rollbacks
| 275
gt
copy A Behrend Foundations of Information Systems |
Reasons for crashes
Transaction error Violation of system restrictions Violation of security regulations Excessive resource requirements deadlocks
Application-related errors eg wrong operations and values ROLLBACK
System error System crash with loss of main-memory contents Database system operating system hardware power failure
Device error (especially storage-medium error) Destruction of secondary storage systems
Catastrophes Destruction of the computing center
| 276
gt
copy A Behrend Foundations of Information Systems |
Guarantee Atomicity amp Durability
Assumptions System may crash but the disk is durable The only atomicity guarantee is that a disk block write is atomic
Materialization strategy Preferred Policy StealNo Force This combination is most complicated but allows for highest performance No Force complicates enforcing Durability What if system crashes before a modified page written by a committed
transaction makes it to disk Write as little as possible in a convenient place at commit time to support
REDOing modifications Steal complicates enforcing Atomicity What if the transaction that performed udpates aborts What if system crashes before transaction is finished Must remember the old value of P (to support UNDOing the write to page P)
copy A Behrend Foundations of Information Systems | 277
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Record Management
| 278
gt
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management
| 279
gt
copy A Behrend Foundations of Information Systems |
Record Addressing
Record address Identifier for records used to address records eg in indexes or query processing Assigned during insert of a record
Goals Stability of identifier Fast and direct access Less organizational overhead
Direct addressing Byte address or position number in file or page Instable Byte address If record grows in length following records would get new address Position number Insert and delete operations change series or records
Indirect addressing Surrogate with mapping table (complete indirection) Tuple Identifier (TID concept)
| 280
gt
copy A Behrend Foundations of Information Systems |
Surrogate with Mapping Table
Surrogate Record type + serial number Serial number remains constant during recordrsquos life time
Mapping table Maps
surrogate to page
Problems Where to store mapping table How can it be extended How to search mapping table efficiently
rarr H2 use B-Tree to store mapping table
Mapping Table Surrogate | Page ID
| 281
gt
copy A Behrend Foundations of Information Systems |
TID Concept
Record addressing with indirection inside the page
- Each page contains an array with record positions
- The TID of a record consists of the page ID and the index in the position array
Pros
- Access with one page access (two pages in case of overflow)
- Stable
- No mapping table required
Operations
- Insert: reuse an unused position or add a position
- Delete: mark the position as unused in the array
- Update: update all positions in the array that change
- Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position (no double overflow: if the overflow record has to move again, update the TID at the original position)
[Figure: page with a record and an overflow record]
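The TID operations can be sketched as follows. This is a simplified model under stated assumptions, not the lecture's implementation: slots hold the record bytes directly instead of byte offsets, a forwarding TID is assumed to take no space, and `Page`, `RecordStore` and `capacity` are hypothetical names.

```python
class Page:
    def __init__(self, page_id, capacity):
        self.page_id = page_id
        self.capacity = capacity  # bytes usable for record data (assumed figure)
        self.slots = []           # position array: bytes, ("fwd", tid) or None (unused)

    def free(self):
        return self.capacity - sum(len(s) for s in self.slots if isinstance(s, bytes))

    def insert(self, record):
        if len(record) > self.free():
            return None
        if None in self.slots:            # reuse an unused position ...
            i = self.slots.index(None)
            self.slots[i] = record
        else:                             # ... or add a position
            i = len(self.slots)
            self.slots.append(record)
        return (self.page_id, i)          # TID = (page ID, index in position array)


class RecordStore:
    def __init__(self, n_pages, capacity):
        self.pages = [Page(p, capacity) for p in range(n_pages)]

    def insert(self, record, skip=None):
        for page in self.pages:
            if page.page_id != skip:
                tid = page.insert(record)
                if tid is not None:
                    return tid
        raise MemoryError("no page has enough free space")

    def get(self, tid):
        entry = self.pages[tid[0]].slots[tid[1]]
        if isinstance(entry, tuple):      # forwarding TID: second page access
            fwd_page, fwd_slot = entry[1]
            entry = self.pages[fwd_page].slots[fwd_slot]
        return entry

    def delete(self, tid):
        entry = self.pages[tid[0]].slots[tid[1]]
        if isinstance(entry, tuple):      # also release the overflow record
            self.pages[entry[1][0]].slots[entry[1][1]] = None
        self.pages[tid[0]].slots[tid[1]] = None   # mark position as unused

    def update(self, tid, record):
        home = self.pages[tid[0]]
        old = home.slots[tid[1]]
        if isinstance(old, tuple):        # drop the old overflow record first
            self.pages[old[1][0]].slots[old[1][1]] = None
            old = b""
        if len(record) - len(old) <= home.free():
            home.slots[tid[1]] = record   # fits in the home page
        else:
            # Overflow: the original position always points directly at the
            # overflow record, so there is never a chain of two forwards.
            home.slots[tid[1]] = ("fwd", self.insert(record, skip=tid[0]))


store = RecordStore(n_pages=2, capacity=16)
a = store.insert(b"alice")
b = store.insert(b"bob")
store.update(a, b"alice-extended")        # no longer fits in page 0
```

After the update, `a` is still `(0, 0)`: the record moved to page 1, but its TID did not change.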
Free Space Management
Problem: which page has enough space for a new record?
Solution: a free space table lists, for every page, how much space is left
Free space value
- Precise value: ceil(log2(page size)) bits ⇒ 2 bytes for the common page size of 4 KB
- Rough value: use fewer bits; free space = (value × page size) / 2^(bits per value)
Free space table
- With direct page addressing: assuming a single page can hold n free space entries, the first page and every (n+1)-th page after it hold the free space entries
- With indirect page addressing: the free space information is stored in the page table
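A small worked example of the free space values and of the table layout for direct page addressing. The function names are hypothetical, and rounding the rough value down (so the table never promises more space than the page has) is an assumption of this sketch.

```python
import math


def precise_bits(page_size: int) -> int:
    """Bits needed to store an exact free space value for a page."""
    return math.ceil(math.log2(page_size))


def encode_rough(free_bytes: int, page_size: int, bits: int) -> int:
    """Map the free space to one of 2**bits steps, rounding down."""
    return min(free_bytes * 2**bits // page_size, 2**bits - 1)


def decode_rough(value: int, page_size: int, bits: int) -> int:
    """free space = (value * page size) / 2^(bits per value)"""
    return value * page_size // 2**bits


def is_free_space_page(page_no: int, n: int) -> bool:
    """With n entries per page, page 0 and every (n+1)-th page hold the entries."""
    return page_no % (n + 1) == 0


def free_space_page_for(page_no: int, n: int) -> int:
    """Free space page that describes the given data page."""
    return (page_no // (n + 1)) * (n + 1)


bits_4k = precise_bits(4096)            # 12 bits, which fit into 2 bytes
rough = encode_rough(1000, 4096, 4)     # 4 bits per value instead of 12
approx = decode_rough(rough, 4096, 4)   # guaranteed lower bound on the free space
```

With 4 bits per value the table only knows that the page has at least 768 free bytes, although 1000 are actually free; that loss of precision is the cost of the smaller table.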
Physical Access Paths - Index Structures
Overview Indexes
Table scan
- Read all pages and evaluate the search criteria for each record
- Pre-fetching
Index scan
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of the index attributes
- Logical/physical sorting of the values of the key attributes (depending on the index structure)
- Enforcing uniqueness
Types of indexes
- Primary (clustered) index: determines the physical organization; used for the PK
- Secondary (non-clustered) index: redundant access path
[Figure: primary index and secondary index for Pers(PID, NAME, AGE, SALARY), with Age and Salary]
Overview Indexes (2)
Choice of Access Paths
Index scan
- Only useful for low selectivity (low number of result tuples)
- The break-even point depends on the ratio of result tuples to all tuples (usually max. 5%)
- Requires statistics about the data
- Additional costs for index storage and updating
Table scan
- Adequate/efficient for small tables (e.g. 5 pages)
- Queries with high selectivity (large result sets)
- 100-200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time over hit rate for index scan and table scan]
Classification of Index Structures
Classification of one-dimensional index structures (from the slide figure)
- Access principle: key comparison, key transformation
- Sequential: sequential lists (physically sequential), linked lists (logically sequential)
- Tree-based: binary search trees, multiway trees (example: B-Tree), prefix trees (tries)
- Hash-based: static, dynamic
Multiway trees
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size
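B-Tree
[Figure: node layout with entries and free space]
- Entry = (Ki, Di, Pi): key, data, pointer
- Pointers per node: min |P| = k+1, max |P| = 2k+1
- Key ranges of the sub-trees: keys < K1, Ki < keys < Ki+1, keys > Kp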
B-Tree (2)
Keys
- Agnostic to specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length
Operations
- Search for the data of a given key value
- Insertion and deletion of a key-data pair
Payload
- Agnostic to specific data semantics
- Can be a record, a reference (TID), or a mix
[Figure: example B-Tree with k = 2, h = 3]
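To make the parameters k and h concrete, the following sketch computes how many keys a B-Tree of order k and height h can hold, and which order fits into one page. The byte sizes in the usage lines are assumptions for illustration (8-byte key, 8-byte TID as payload, 8-byte pointer, no node header); the minimum assumes that only the root may hold fewer than k keys.

```python
def order_for_page(page_size: int, key_size: int, data_size: int, pointer_size: int) -> int:
    """Largest k such that 2k entries (Ki, Di, Pi) plus the extra pointer P0 fit into one page."""
    entry = key_size + data_size + pointer_size
    return (page_size - pointer_size) // (2 * entry)


def max_keys(k: int, h: int) -> int:
    """Every node is full: 2k keys and 2k+1 children on each of the h levels."""
    return (2 * k + 1) ** h - 1


def min_keys(k: int, h: int) -> int:
    """Root holds one key; every other node holds k keys and has k+1 children."""
    return 2 * (k + 1) ** (h - 1) - 1


k_page = order_for_page(page_size=4096, key_size=8, data_size=8, pointer_size=8)
smallest = min_keys(k=2, h=3)
largest = max_keys(k=2, h=3)
```

For the example tree with k = 2 and h = 3 this gives between 17 and 124 keys. With the assumed sizes a 4 KB page allows k = 85, so three levels already address several million keys.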
Search in the B-Tree
Starting at the root node each node is searched from left to right 1) if Ki matches the desired key value the data record has been found (further
records with the same key value might be located in a sub-tree to which Pi-1 points) 2) if Ki is smaller than the desired value the search will be continued in the root of
the sub-tree identified by Pi-1 3) if Ki is larger than the desired value the comparison with Ki+1 is repeated 4) if K2k is also smaller than the desired value the search will be continued in the
sub-tree of P2k If itlsquos impossible to descend further into a sub-tree (2 or 4) (leaf node) The search is aborted no record with the desired key value is found
Search for 38 20 6
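The search procedure can be sketched as follows, using the same branching rule as for insertion (a wanted value smaller than Ki leads to Pi-1, a value larger than the last key leads to the last pointer). The `Node` class and the small example tree are hypothetical, since the tree from the slide figure is not part of the transcript.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Node:
    keys: list                                     # K1..Kp in sorted order
    data: list                                     # D1..Dp, payload for each key
    children: list = field(default_factory=list)   # P0..Pp, empty for a leaf


def search(node: Optional[Node], wanted) -> Optional[Any]:
    while node is not None:
        for i, key in enumerate(node.keys):        # scan the node from left to right
            if wanted == key:
                return node.data[i]                # 1) key found
            if wanted < key:
                child = i                          # 2) continue in the sub-tree left of Ki
                break
            # 3) wanted > key: compare with the next key
        else:
            child = len(node.keys)                 # 4) larger than the last key
        # In a leaf there is nothing to descend into: the search is aborted.
        node = node.children[child] if node.children else None
    return None


root = Node(
    keys=[20], data=["rec20"],
    children=[
        Node(keys=[6, 12], data=["rec6", "rec12"]),
        Node(keys=[38, 45], data=["rec38", "rec45"]),
    ],
)
hits = [search(root, key) for key in (38, 20, 6)]
miss = search(root, 7)
```

Each search touches at most one node per level, so the number of page accesses is bounded by the height h of the tree.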
Insertions in the B-Tree (1)
Insertion rule: insert only into leaf nodes
At non-leaf nodes: descend the tree as for the search (S = key to insert)
- S ≤ Ki: follow Pi-1
- S > Ki: check Ki+1
- S > K2k: follow P2k
At the leaf node
- Insert the data record according to the sort order
- Special case: the leaf node is full (2k records) → split the leaf node
Splitting
- Generate a new leaf node
- Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
- The middle entry (the (k+1)-th) is used as the new "discriminator" (branching key) and inserted into the parent node
Insertions in the B-Tree (2)
Node splitting during insertion: two possible situations after a split
- The parent node is full → repeat the split on this level
- Enough space → FINISHED
Special case: root split
- Split of the root node → new root with two successor nodes
- The height of the tree grows by 1
- The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing)
- No unloading or loading necessary
- The tree is always balanced
- But: in case of many insertions/deletions, a reorganization can be beneficial
Insertions in the B-Tree (3)
Insertion example: order k = 1 (n = 2k), keys 1, 5, 2, 6, 7, 4, 8, 3
[Figure: tree after each insertion; finally h = 3]
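The example can be replayed with a short sketch of insertion with node splitting. It is a simplified model, assuming distinct keys and omitting the payload Di; the class and method names are hypothetical.

```python
import bisect


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty for a leaf

    @property
    def is_leaf(self):
        return not self.children


class BTree:
    def __init__(self, k):
        self.k = k                       # a node holds at most 2k keys
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:            # root split: the tree grows by one level
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        i = bisect.bisect_left(node.keys, key)   # S <= Ki: follow the pointer left of Ki
        if node.is_leaf:
            node.keys.insert(i, key)
        else:
            split = self._insert(node.children[i], key)
            if split is not None:        # child was split: take over the discriminator
                middle, right = split
                node.keys.insert(i, middle)
                node.children.insert(i + 1, right)
        if len(node.keys) <= 2 * self.k:
            return None                  # enough space: finished
        # Overflow with 2k+1 keys: first k stay, last k move, the middle one goes up.
        k = self.k
        middle = node.keys[k]
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys, node.children = node.keys[:k], node.children[:k + 1]
        return middle, right

    def height(self):
        h, node = 1, self.root
        while not node.is_leaf:
            node, h = node.children[0], h + 1
        return h

    def in_order(self, node=None):
        node = node or self.root
        if node.is_leaf:
            return list(node.keys)
        out = []
        for i, child in enumerate(node.children):
            out += self.in_order(child)
            if i < len(node.keys):
                out.append(node.keys[i])
        return out


tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

Inserting the last key, 3, overflows a leaf and then its parent, which forces the root split; the resulting tree has 4 in the root and height 3.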
Insertion and Deletion in the B-Tree
Problem
- Insertion can create overflow
- Deletion can create underflow and overflow
Example
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; all four nodes need to be accessed; the final tree is the same as the input
[Figure: tree before and after "Insert 22"]
Deletion in the B-Tree
Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow, merge]
Deletion in the B-Tree (2)
Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow, merge]
Remember: each path from the root to a leaf has the same length h
Deletion in the B-Tree (3)
Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow, merge; then overflow, split]
Deletion in the B-Tree (4)
Example: order k = 1 (n = 2k), delete key 3
[Figure: overflow, split; then root split]
Deletion Algorithm
Example (there are different algorithms)
- Search for the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successor nodes
  - Analyze which successor node of K has more elements, the left or the right one
  - If both have the same number of elements, decide for one
  - Replace the key K to be deleted with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)
B-Trees, B+-Trees and B*-Trees
B+-Trees and B*-Trees
- Data is stored only in leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees
- Valid node sizes changed from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
[Figure: example of a non-unique secondary index; B-Tree with k = 2, h = 3]
Indexing Low Cardinality Columns
Problem
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
- Rule of thumb: the break-even hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
[Figure: index entries F and M, each with a long list of TIDs]
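A rough back-of-the-envelope check of why the table scan wins here, using the device figures quoted in this lecture for the choice of access paths (about 100 disk seeks per second, 100 MB/s as the lower bound for sequential reads). The tuple length of 400 bytes is an assumption borrowed from the bitmap cost example; the function names are hypothetical.

```python
def index_scan_seconds(matching_tuples: int, seeks_per_second: float) -> float:
    """Worst case for a secondary index: one random page access per result tuple."""
    return matching_tuples / seeks_per_second


def table_scan_seconds(n_tuples: int, tuple_bytes: int, bytes_per_second: float) -> float:
    """Sequential read of the whole table."""
    return n_tuples * tuple_bytes / bytes_per_second


index_s = index_scan_seconds(500_000, seeks_per_second=100)
scan_s = table_scan_seconds(1_000_000, tuple_bytes=400, bytes_per_second=100e6)
```

Under these assumptions the index access takes about 5000 seconds, while reading the whole 400 MB table sequentially takes about 4 seconds.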
Bitmap Index
Idea (long history, since the 1960s)
- Create a bitmap/bitlist for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set
Necessary condition
- Sequential numbering of the tuples (TIDs)
Bitmaps for attribute Sex
F  1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M  0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Example table
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
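Building the bitmaps is a single pass over the column. The sketch below uses the Sex column of the example table; the function name is hypothetical, and plain Python lists stand in for packed bit arrays.

```python
def build_bitmaps(column):
    """One bitmap per distinct attribute value; bit i belongs to the tuple at position i."""
    bitmaps = {}
    for position, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[position] = 1
    return bitmaps


# Sex column of the 15 example tuples, in table order.
sex = list("fmffmffffmmfmff")
bitmaps = build_bitmaps(sex)
```

Because each tuple has exactly one value, the bitmaps of one attribute are disjoint and together cover every position, which is why M is the bitwise complement of F.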
Querying Bitmap Indexes
Main advantage of bitmap indexes
- Simple and efficient logical combination of predicates is possible
- Read only the data that is relevant for the predicates
- Example: σ_{Sex='f' ∧ Region='n'}(R)
- Bitmaps B1 and B2 in conjunction: for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation for σ_{Sex='f' ∧ Region='n' ∧ Race='Asian'}(R) ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1,000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
Bitmaps
F      1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
AND N  0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
AND A  0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
=      0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
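The conjunction and the cost estimate can be reproduced with a few lines. The three bitmaps are the ones shown above; the function names are hypothetical, and rounding the number of matching tuples down follows the slide's figure of 156.

```python
from fractions import Fraction
from math import prod


def bitmap_and(*bitmaps):
    """Bitwise AND of several bitmaps of equal length."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]

result = bitmap_and(F, N, A)
matching_positions = [i for i, bit in enumerate(result) if bit]


def estimate_pages(n_tuples, tuple_bytes, page_bytes, selectivities, bitmap_pages=1):
    """Pages read by a table scan vs. by bitmap access (worst case: one page per match)."""
    tuples_per_page = page_bytes // tuple_bytes
    table_scan = -(-n_tuples // tuples_per_page)          # ceiling division
    matching = int(n_tuples * prod(selectivities))        # rounded down
    return table_scan, matching + bitmap_pages


table_scan, bitmap_access = estimate_pages(
    n_tuples=10_000, tuple_bytes=400, page_bytes=4096,
    selectivities=[Fraction(1, 2), Fraction(1, 8), Fraction(1, 4)],
)
```

Only the tuple at position 2 (counting from 0) satisfies all three predicates, and the estimate comes out at 157 pages for the bitmap access against 1,000 pages for the table scan.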
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths - Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management
| 279
gt
copy A Behrend Foundations of Information Systems |
Record Addressing
Record address Identifier for records used to address records eg in indexes or query processing Assigned during insert of a record
Goals Stability of identifier Fast and direct access Less organizational overhead
Direct addressing Byte address or position number in file or page Instable Byte address If record grows in length following records would get new address Position number Insert and delete operations change series or records
Indirect addressing Surrogate with mapping table (complete indirection) Tuple Identifier (TID concept)
| 280
gt
copy A Behrend Foundations of Information Systems |
Surrogate with Mapping Table
Surrogate Record type + serial number Serial number remains constant during recordrsquos life time
Mapping table Maps
surrogate to page
Problems Where to store mapping table How can it be extended How to search mapping table efficiently
rarr H2 use B-Tree to store mapping table
Mapping Table Surrogate | Page ID
| 281
gt
copy A Behrend Foundations of Information Systems |
TID Concept
Record addressing with indirection inside the page Each page contains an array with record positions TID of a record consist of page id and index in position array
Pros Access with one page access (two pages in case of overflow) Stable No mapping table required
Operations Insert Reuse unused
position or add position Delete Mark position
as unused in array Update Update all
positions in array Update with overflow Store record
as overflow record and store TID of overflow record at original position (No double overflow Update TID at original position)
Record
Overflow Record
| 282
gt
copy A Behrend Foundations of Information Systems |
Free Space Management
Problem In which page is enough space for new record
Solution Free space table lists for all pages how much space is left
Free space value Precise value Ceil(Log2(page size)) =gt 2 bytes for common page size of 4K Rough value use less bytes free space = (value page size)2^(bits per value)
Free space table With direct page addressing Assuming a single page can take n free space entries First page and each (n+1)-th page takes free space entries
With indirect page addressing Free space information stored in page table
copy A Behrend Foundations of Information Systems | 283
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Physical Access Paths ndash Index Structures
| 284
gt
copy A Behrend Foundations of Information Systems |
Primary Index Secondary Index
Overview Indexes
Table scan Read all pages and for each record
evaluate the search criteria Pre-fetching
Index Scan Use index for search criteria
on one or more attributes Fast access to single values or value ranges of index attributes Logicalphysical sorting of values of key attributes (depending on index structure) Enforcing uniqueness
Types if indexes Primary (Clustered) Index
determines physical organization use for PK Secondary (Non-Clustered)
Index redundant access path
Pers(PID NAME AGE SALARY)
Age
Salary
| 285
gt
copy A Behrend Foundations of Information Systems |
Overview Indexes (2)
Choice of Access Paths Index scan Only useful for low selectivity
(low number of result tuples) Break even-point according to the
output ratio of the number of tuples (usually max 5) Requires statistics about data Additional costs for index storage
and updating
Table Scan adequateefficient for small tables
(eg 5 pages) Queries with high selectivity
(large result sets) 100-200MBs sequential read
~ 100 disk seekss
hit rate
Index Scan
Table Scan
access time
| 286
gt
copy A Behrend Foundations of Information Systems |
Classification of Index Structures
Classification
Multiway Trees Tree structure with multiple children per node Idea chose fan out so that node size suits page size
Onedimensional Index Structures
Key Comparison Key Transformation
Sequential Tree-Based Hash-Based
Prefix Trees (Tries)
Binary Search Trees Dynamic Static Linked Lists
(log seq) Seq Lists
(phys seq) Multiway
Trees Example B-Tree
| 287
gt
copy A Behrend Foundations of Information Systems |
B-Tree
free space
(Ki Di Pi) = entry min |P| = k+1 max |P| = 2k+1
keys lt K1 keys gt Kp Ki lt keys lt Ki+1
| 288
gt
copy A Behrend Foundations of Information Systems |
B-Tree (2)
Example Keys Agnostic to specific key semantic Only defined complete order required Could be of fixed or variable length
Operations Search for data for given key value Insertion and deletion of key-data pair
Payload Agnostic to specific data semantic Can be record or reference (TID) or
mix
B-Tree with k = 2 h = 3
| 289
gt
copy A Behrend Foundations of Information Systems |
Search in the B-Tree
Starting at the root node each node is searched from left to right 1) if Ki matches the desired key value the data record has been found (further
records with the same key value might be located in a sub-tree to which Pi-1 points) 2) if Ki is smaller than the desired value the search will be continued in the root of
the sub-tree identified by Pi-1 3) if Ki is larger than the desired value the comparison with Ki+1 is repeated 4) if K2k is also smaller than the desired value the search will be continued in the
sub-tree of P2k If itlsquos impossible to descend further into a sub-tree (2 or 4) (leaf node) The search is aborted no record with the desired key value is found
Search for 38 20 6
| 290
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (1)
Insertion Rule insert only into leaf nodes At Non-Leaf Nodes descend down the tree as for the search S le Ki follow Pi-1 S gt Ki check Ki+1 S gt K2k follow P2k
At Leaf Node Insert the data record according to the sorting order Special case leaf node is full (2k records) rarr split the leaf node
Splitting Generate a new leaf node Split the 2k+1 entries (in order)
into two leaf nodes first k entries rarr left node last k entries rarr right node
middle entry (k+1-th) is used as new ldquodiscriminatorrdquo (branching) and inserted into the parent node
| 291
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (2)
Node Splitting during Insertion Two possible situations after a split The parent node is full rarr repeat split on this level Enough space rarr FINISHED
Special case root split Split of the root node rarr New root with two successor nodes Height of a tree grows by 1 The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing) No unloading or loading necessary Tree is always balanced But In case of many insertions deletions reorganization can be beneficial
| 292
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (3)
Insertion Example Order k = 1 n=2k Keys 1 5 2 6 7 4 8 3
Finally h=3
| 293
gt
copy A Behrend Foundations of Information Systems |
Insertion and Deletion in the B-Tree
Problem Insertion can create overflow Deletion can create underflow and overflow
Example Insertion of key 22
rarr Overflow rarr Split
Deletion of key 22 rarr Underflow need to access all four nodes finally same as input
Insert 22
| 294
gt
copy A Behrend Foundations of Information Systems |
Underflow Merge
Deletion in the B-Tree
Example Order k = 1 n=2k Delete key 3
| 295
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (2)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Remember Each path from the root to the leaf has the same length h
| 296
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (3)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Overflow Split
| 297
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (4)
Example Order k = 1 n=2k Delete key 3
Overflow Split
Root Split
| 298
gt
copy A Behrend Foundations of Information Systems |
Deletion Algorithm
Example ndash there are different algorithms Search the node that contains the key K to be deleted If key K is in a leaf node delete the key in the leaf node and handle potentially
resulting underflow by merging with sibling If key K is in an inner node pull up new discriminator from one of the successors Analyze which successor node of K has more elements left or right one
If both have the same number of elements decide for one Replace the key K to be deleted with the direct successor Krsquo from the left
successor node or with the direct successor Krsquorsquo from the right successor node respectively Delete Krsquo or Krsquorsquo from the respective successor node (recursively)
Note Major variants Merge (tis lecture) Re-distribution (instead of splitmerge in case of overflowunderflow the entries are
re-distributed under consideration of one or multiple adjacent nodes)
| 299
gt
copy A Behrend Foundations of Information Systems |
B-Trees B+-Trees and B-Trees
B+-Trees and B-Trees Data is only in leaf nodes Key redundancy but higher fan-out rarr lower tree high less IO Simpler delete procedure rarr requires only merging of nodes
Double linked list of all leaf nodes B-Trees Modified valid node sizes
from [k2k] to [43k2k] rarr better node utilization but more splitsmerges
Example Secondary index Non unique
B-Tree with k = 2 h = 3
| 300
gt
copy A Behrend Foundations of Information Systems |
Indexing Low Cardinality Columns
Problem Example B-tree on the sex of customers for a table with 1000000 tuples results in
two lists with approximately 500000 tuples each
Query for all female customers requires 500000 random page accesses (secondary index) Table scan would be much faster
Conclusion B-trees (and also hashing) are useful for predicates with low selectivity
(outputinput cardinality ratio) Rule of thumb margin hit rate is approx 5 higher hit rates do not justify the efforts for an index access
F M
TID TID TID TID hellip TID TID TID TID hellip
| 301
gt
copy A Behrend Foundations of Information Systems |
Bitmap Index
Idea (Long history since the 1960s) Create a bitmapbitlist for each
attribute value Each tuple in the table is assigned to
one bit in the bitmap (by position sequential TID)
Bit values 1 attribute value set 0 attribute value not set
Necessary condition Sequential numbering of the tuples
(TIDs)
F M
1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Name Sex Region Race Carol f n white
Harold m e black Anne f e asian
Iris f ne white hellip m se hisp hellip f e white hellip f sw asian hellip f w black hellip f n asian hellip m e hisp hellip m se black hellip f s white hellip m nw black hellip f s white hellip f w black
Sex
| 302
gt
copy A Behrend Foundations of Information Systems |
Querying Bitmap Indexes
Main advantage of bitmap indexes
- Simple and efficient logical combination of predicates is possible.
- Read only data that is relevant for the predicates.
- Example: σ_{Sex='f' ∧ Region='n'}(R), combining the bitmaps B1 and B2 by conjunction:
  for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i]
Example: I/O cost estimation
- Query: σ_{Sex='f' ∧ Region='n' ∧ Race='Asian'}(R) ("Asian women of region North")
- Selectivity: 1/2 * 1/8 * 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1,000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
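The cost estimate above can be reproduced step by step. This is a sketch under stated assumptions: "4 kB" is taken to mean 4096 bytes, the three bitmaps are assumed to be stored uncompressed with one bit per tuple, and all variable names are chosen for illustration.

```python
import math
from fractions import Fraction

NUM_TUPLES = 10_000
TUPLE_BYTES = 400
PAGE_BYTES = 4096  # "4 kB pages", assumed to mean 4096 bytes

tuples_per_page = PAGE_BYTES // TUPLE_BYTES
selectivity = Fraction(1, 2) * Fraction(1, 8) * Fraction(1, 4)

# Table scan: every data page is read once.
table_scan_pages = math.ceil(NUM_TUPLES / tuples_per_page)

# Bitmap access, worst case: every qualifying tuple lies on a different page.
# 10,000 / 64 = 156.25 hits, rounded to 156 as on the slide.
data_pages = round(NUM_TUPLES * selectivity)

# Three uncompressed bitmaps (Sex, Region, Race) with one bit per tuple.
bitmap_bytes = 3 * NUM_TUPLES // 8
bitmap_pages = math.ceil(bitmap_bytes / PAGE_BYTES)

print(tuples_per_page, selectivity, table_scan_pages, data_pages, bitmap_pages)
# prints: 10 1/64 1000 156 1
```

The three bitmaps occupy 3,750 bytes in total, which is why a single page suffices for them in this estimate.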
[Figure: bitwise AND of the three bitmaps]
F          1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
AND N      0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
AND A      0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
= result   0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
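The conjunction of the bitmaps F, N and A can be sketched as follows. The bit values are copied from the slide; the helper name `bitmap_and` and the list-of-integers representation are assumptions for illustration, whereas a real system would operate on machine words or compressed bitmaps.

```python
def bitmap_and(*bitmaps):
    """Position-wise AND of equally long bitmaps given as lists of 0/1."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]  # Sex = 'f'
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # Region bitmap from the slide
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]  # Race = 'Asian'

result = bitmap_and(F, N, A)

# Positions of the set bits are the sequential TIDs (0-based here) to fetch.
matching_positions = [i for i, bit in enumerate(result) if bit]

print(result)
print(matching_positions)  # prints: [2]
```

Only the tuples at the set positions have to be read from the table, which is the "read only relevant data" advantage named above.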
- DBMS Architecture
- What is in the Lecture
- How Is a Database System Built
- Architectural Blueprint
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths – Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
copy A Behrend Foundations of Information Systems | 267
gt
|
Architectural Trends
| 268
gt
copy A Behrend Foundations of Information Systems |
Different Access Characteristics
OLTP (On-line Transaction Processing) Mix between read-only and update queries Minor analysis tasks Used for data preservation and lookup Read typically only a few records at a time High performance by storing contiguous records in disk pages
OLAP (On-line Analytical Processing) Query-intensive DBMS applications Infrequent batch-oriented updates Complex analysis on large data volumes Read typically only a few attributes of large amounts of historical data in order to
partition them and compute aggregates High performance by storing contiguous values of a single attribute
| 269
gt
copy A Behrend Foundations of Information Systems |
Hardware Developments
Hardware improvements not equally distributed Advances in CPU speed have outpaced advances
in RAM latency Main-memory access has become a performance
bottleneck for many computer applications Bandwidth Latency Address translation (TLB)
rarr Memory Wall Cache memories can reduce the memory latency
when the requested data is found in the cache Vertically fragmented data structures optimize
memory cache usage
| 270
gt
copy A Behrend Foundations of Information Systems |
Row Storage vs Column Storage
Row Storage + easy to addmodify a record - might read unnecessary data
Column Storage + only need to read in relevant data - tuple writes require multiple accesses -gt suitable for read-mostly read-intensive large data repositories
| 271
gt
copy A Behrend Foundations of Information Systems |
Processing Models
[Marcin Zukowski Peter A Boncz Niels Nes Saacutendor Heacuteman MonetDBX100 - A DBMS In The CPU Cache IEEE Data Eng Bull 28(2) p17-22 2005]
| 272
gt
copy A Behrend Foundations of Information Systems |
Transaction Management
Principle of a transaction Sequence of successive DB operations that transform a database from a consistent
state into another consistent state surrounded by BOT EOT (Commit Abort)
Properties ACID Atomicity Consistency Isolation Durability A transaction will always come to an end Normal (commit) changes are permanently stored within the DB Abnormal (abort rollback) already composed changes are taken back
Note EOT state must not be different from BOT state
BOT(begin of transaction)
EOT(end of transaction)
possibly inconsistent database
consistentdatabase
consistentdatabase
DB DB
DML operations
| 273
gt
copy A Behrend Foundations of Information Systems |
ACID Properties of Transactions
Atomicity Indivisibility due to the transaction definition (Begin - End) All-or-nothing principle ie the DBS guarantees Either the complete execution of a transaction hellip hellip or the ineffectiveness of the whole transaction (and of all associated operations)
Consistency A successful transaction guarantees that all consistency requirements (integrity
requirements) have been met
Isolation Multiple transactions run isolated from each other and do not use (inconsistent)
intermediate results from other transactions
Durability All results of successful transactions have to be made persistent
| 274
gt
copy A Behrend Foundations of Information Systems |
Motivation
Atomicity Part of the transaction is done but we want to cancel it ABORTROLLBACK System crashes during transaction some changes made it to the disk some did not
Durability Transaction finished user notified COMMIT System crashes before changes sent successfully to disk (asynchronous write)
Consistency Physical consistency Correctness of the storage and access structures Completely executed modification operations preserve the consistency
Logical consistency Correctness of data contents ndash correspond to a (possible) state of the real world Completely executed transactions preserve the logical consistency
- All modifications of finished transactions are included - No modifications of open transactions are included
Remember Logical consistency requires physical consistency in the first place
UNDO Recovery
REDO Recovery
UNDO Recovery for consistency-related rollbacks
| 275
gt
copy A Behrend Foundations of Information Systems |
Reasons for crashes
Transaction error Violation of system restrictions Violation of security regulations Excessive resource requirements deadlocks
Application-related errors eg wrong operations and values ROLLBACK
System error System crash with loss of main-memory contents Database system operating system hardware power failure
Device error (especially storage-medium error) Destruction of secondary storage systems
Catastrophes Destruction of the computing center
| 276
gt
copy A Behrend Foundations of Information Systems |
Guarantee Atomicity amp Durability
Assumptions System may crash but the disk is durable The only atomicity guarantee is that a disk block write is atomic
Materialization strategy Preferred Policy StealNo Force This combination is most complicated but allows for highest performance No Force complicates enforcing Durability What if system crashes before a modified page written by a committed
transaction makes it to disk Write as little as possible in a convenient place at commit time to support
REDOing modifications Steal complicates enforcing Atomicity What if the transaction that performed udpates aborts What if system crashes before transaction is finished Must remember the old value of P (to support UNDOing the write to page P)
copy A Behrend Foundations of Information Systems | 277
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Record Management
| 278
gt
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management
| 279
gt
copy A Behrend Foundations of Information Systems |
Record Addressing
Record address Identifier for records used to address records eg in indexes or query processing Assigned during insert of a record
Goals Stability of identifier Fast and direct access Less organizational overhead
Direct addressing Byte address or position number in file or page Instable Byte address If record grows in length following records would get new address Position number Insert and delete operations change series or records
Indirect addressing Surrogate with mapping table (complete indirection) Tuple Identifier (TID concept)
| 280
gt
copy A Behrend Foundations of Information Systems |
Surrogate with Mapping Table
Surrogate Record type + serial number Serial number remains constant during recordrsquos life time
Mapping table Maps
surrogate to page
Problems Where to store mapping table How can it be extended How to search mapping table efficiently
rarr H2 use B-Tree to store mapping table
Mapping Table Surrogate | Page ID
| 281
gt
copy A Behrend Foundations of Information Systems |
TID Concept
Record addressing with indirection inside the page Each page contains an array with record positions TID of a record consist of page id and index in position array
Pros Access with one page access (two pages in case of overflow) Stable No mapping table required
Operations Insert Reuse unused
position or add position Delete Mark position
as unused in array Update Update all
positions in array Update with overflow Store record
as overflow record and store TID of overflow record at original position (No double overflow Update TID at original position)
Record
Overflow Record
| 282
gt
copy A Behrend Foundations of Information Systems |
Free Space Management
Problem In which page is enough space for new record
Solution Free space table lists for all pages how much space is left
Free space value Precise value Ceil(Log2(page size)) =gt 2 bytes for common page size of 4K Rough value use less bytes free space = (value page size)2^(bits per value)
Free space table With direct page addressing Assuming a single page can take n free space entries First page and each (n+1)-th page takes free space entries
With indirect page addressing Free space information stored in page table
copy A Behrend Foundations of Information Systems | 283
gt
Physical Access Paths – Index Structures
- [Figure: architectural blueprint of the database system, repeated as a section divider]

Overview Indexes
- Table scan
  - Read all pages and evaluate the search criteria for each record
  - Pre-fetching
- Index scan
  - Use an index for search criteria on one or more attributes
  - Fast access to single values or value ranges of the index attributes
  - Logical/physical sorting of the values of the key attributes (depending on the index structure)
  - Enforcing uniqueness
- Types of indexes
  - Primary (clustered) index: determines the physical organization; use for the PK
  - Secondary (non-clustered) index: redundant access path
- [Figure: primary index and secondary index on Pers(PID, NAME, AGE, SALARY), with the attributes Age and Salary]

Overview Indexes (2)
- Choice of access paths
- Index scan
  - Only useful for low selectivity (low number of result tuples)
  - The break-even point depends on the ratio of result tuples to all tuples (usually max. 5%)
  - Requires statistics about the data
  - Additional costs for index storage and updating
- Table scan
  - Adequate/efficient for small tables (e.g. 5 pages)
  - Adequate for queries with high selectivity (large result sets)
  - 100-200 MB/s sequential read vs. ~100 disk seeks/s
- [Figure: access time over hit rate for index scan and table scan]

Classification of Index Structures
- Classification of one-dimensional index structures
  - Key comparison
    - Sequential: sequential lists (physically sequential), linked lists (logically sequential)
    - Tree-based: binary search trees, multiway trees (example: B-Tree)
  - Key transformation
    - Hash-based: dynamic, static
  - Further label in the figure: prefix trees (tries)
- Multiway trees
  - Tree structure with multiple children per node
  - Idea: choose the fan-out so that the node size suits the page size

B-Tree
- [Figure: node layout with entries and free space]
- Entry = (K_i, D_i, P_i); pointers per node: min |P| = k+1, max |P| = 2k+1
- Key ranges of the subtrees: keys < K_1; K_i < keys < K_{i+1}; keys > K_p

B-Tree (2)
- Keys
  - Agnostic to specific key semantics
  - Only a defined total order is required
  - Can be of fixed or variable length
- Payload
  - Agnostic to specific data semantics
  - Can be a record, a reference (TID), or a mix
- Operations
  - Search for the data for a given key value
  - Insertion and deletion of a key-data pair
- [Figure: example B-Tree with k = 2, h = 3]

Search in the B-Tree
- Starting at the root node, each node is searched from left to right:
  - 1) If K_i matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which P_{i-1} points)
  - 2) If K_i is larger than the desired value, the search is continued in the root of the sub-tree identified by P_{i-1}
  - 3) If K_i is smaller than the desired value, the comparison is repeated with K_{i+1}
  - 4) If K_{2k} (the last key in the node) is also smaller than the desired value, the search is continued in the sub-tree of P_{2k}
- If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search is aborted: no record with the desired key value is found
- [Figure: search for 38, 20, 6]

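The search rules can be sketched as a short Python function. The `Node` class and the sample tree are made up for illustration (the tree is not the one from the slide figure), and `bisect_left` stands in for the left-to-right comparison inside a node: it returns the first position whose key is not smaller than the search key.

```python
from bisect import bisect_left


class Node:
    """Hypothetical B-Tree node: keys[i] belongs to data[i];
    children[i] is the sub-tree to the left of keys[i]."""

    def __init__(self, keys, data, children=None):
        self.keys = keys
        self.data = data
        self.children = children or []  # empty list for leaf nodes


def search(node, key):
    while True:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.data[i]       # case 1: key found in this node
        if not node.children:
            return None               # leaf node: cannot descend further
        node = node.children[i]       # cases 2 and 4: descend into the sub-tree


# Small example tree with one root and two leaves.
left = Node([6, 12], ["rec6", "rec12"])
right = Node([38, 45], ["rec38", "rec45"])
root = Node([20], ["rec20"], [left, right])
```

A lookup such as `search(root, 38)` touches the root and one leaf, so the number of page accesses is bounded by the height of the tree.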
Insertions in the B-Tree (1)
- Insertion rule: insert only into leaf nodes
- At non-leaf nodes: descend the tree as for the search
  - S ≤ K_i: follow P_{i-1}
  - S > K_i: check K_{i+1}
  - S > K_{2k}: follow P_{2k}
- At a leaf node
  - Insert the data record according to the sort order
  - Special case: the leaf node is full (2k records) → split the leaf node
- Splitting
  - Generate a new leaf node
  - Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
  - The middle entry (the (k+1)-th) is used as the new "discriminator" (branching) and inserted into the parent node

Insertions in the B-Tree (2)
- Node splitting during insertion: two possible situations after a split
  - The parent node is full → repeat the split on this level
  - Enough space → finished
- Special case: root split
  - Split of the root node → new root with two successor nodes
  - The height of the tree grows by 1
  - The tree is split from the bottom to the top
- Dynamic reorganization (self-balancing)
  - No unloading or loading necessary
  - The tree is always balanced
  - But: in case of many insertions and deletions, a reorganization can be beneficial

Insertions in the B-Tree (3)
- Insertion example: order k = 1 (n = 2k); keys 1, 5, 2, 6, 7, 4, 8, 3
- [Figure: tree after the insertions; finally h = 3]

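The insertion example can be replayed with a small sketch of the split procedure. The code below is one possible reading of the rules (insert into the leaf, split a node once it holds 2k+1 entries, push the middle entry up); the class and method names are assumptions, and payload data is left out to keep the focus on the keys.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty for leaf nodes


class BTree:
    def __init__(self, k):
        self.k = k
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:  # root split: new root with two successors, height + 1
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        i = bisect_left(node.keys, key)
        if node.children:  # non-leaf node: descend as for the search
            split = self._insert(node.children[i], key)
            if not split:
                return None
            middle, right = split
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
        else:              # leaf node: insert according to the sort order
            node.keys.insert(i, key)
        if len(node.keys) <= 2 * self.k:
            return None    # enough space: finished
        # Overflow (2k+1 entries): first k stay left, last k go to a new
        # right node, the middle entry moves up as the new discriminator.
        k = self.k
        middle = node.keys[k]
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys = node.keys[:k]
        node.children = node.children[:k + 1]
        return middle, right

    def height(self):
        h, node = 1, self.root
        while node.children:
            node = node.children[0]
            h += 1
        return h


def keys_in_order(node):
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += keys_in_order(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

For this key sequence the last insertion (key 3) overflows a leaf and then the root, so the sketch ends with a single root key and height 3, in line with the "finally h = 3" of the example.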
Insertion and Deletion in the B-Tree
- Problem
  - Insertion can create overflow
  - Deletion can create underflow and overflow
- Example
  - Insertion of key 22 → overflow → split
  - Deletion of key 22 → underflow; all four nodes need to be accessed, and the final tree is the same as the input
- [Figure: insert 22]

Deletion in the B-Tree
- Example: order k = 1 (n = 2k); delete key 3
- [Figure: underflow → merge]

Deletion in the B-Tree (2)
- Example: order k = 1 (n = 2k); delete key 3
- [Figure: underflow → merge]
- Remember: each path from the root to a leaf has the same length h

Deletion in the B-Tree (3)
- Example: order k = 1 (n = 2k); delete key 3
- [Figure: underflow → merge, overflow → split]

Deletion in the B-Tree (4)
- Example: order k = 1 (n = 2k); delete key 3
- [Figure: overflow → split, root split]

Deletion Algorithm
- Example (there are different algorithms)
  - Search the node that contains the key K to be deleted
  - If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
  - If key K is in an inner node: pull up a new discriminator from one of the successors
    - Analyze which successor node of K has more elements, the left or the right one; if both have the same number of elements, decide for one
    - Replace the key K to be deleted with the direct predecessor K' from the left successor node or with the direct successor K'' from the right successor node, respectively
    - Delete K' or K'' from the respective successor node (recursively)
- Note: major variants
  - Merge (this lecture)
  - Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)

B-Trees, B+-Trees and B*-Trees
- B+-Trees and B*-Trees
  - Data is only in the leaf nodes
  - Key redundancy, but higher fan-out → lower tree height, less I/O
  - Simpler delete procedure → requires only merging of nodes
  - Doubly linked list of all leaf nodes
- B*-Trees
  - Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
- [Figure: example of a non-unique secondary index; tree with k = 2, h = 3]

Indexing Low Cardinality Columns
- Problem
  - Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
  - A query for all female customers requires 500,000 random page accesses (secondary index)
  - A table scan would be much faster
- Conclusion
  - B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
  - Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
- [Figure: TID lists for the values F and M]

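A rough back-of-the-envelope calculation shows why the table scan wins in this example. The sketch below is a deliberately crude model: it borrows the figures of 100 MB/s sequential read and about 100 disk seeks per second from the access-path overview, and it assumes 4 kB pages with 10 tuples per page (a value taken from the later bitmap cost example, not stated for this table). It ignores caching, index traversal and clustering effects.

```python
PAGE_SIZE = 4096               # bytes per page (assumption)
SEQ_READ_BYTES_PER_S = 100e6   # lower end of 100-200 MB/s sequential read
SEEKS_PER_S = 100              # ~100 random disk accesses per second


def table_scan_seconds(num_pages):
    """Read every page once, sequentially."""
    return num_pages * PAGE_SIZE / SEQ_READ_BYTES_PER_S


def index_scan_seconds(num_result_tuples):
    """Worst case for a secondary index: one random access per result tuple."""
    return num_result_tuples / SEEKS_PER_S


tuples = 1_000_000
tuples_per_page = 10           # assumption, see lead-in
pages = tuples // tuples_per_page

scan_time = table_scan_seconds(pages)      # about 4 seconds
index_time = index_scan_seconds(500_000)   # 5000 seconds
```

Under these assumptions the secondary-index access is slower by roughly three orders of magnitude, which is the point of the "table scan would be much faster" remark.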
Bitmap Index
- Idea (long history, since the 1960s)
  - Create a bitmap/bitlist for each attribute value
  - Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
  - Bit values: 1 = attribute value set, 0 = attribute value not set
- Necessary condition: sequential numbering of the tuples (TIDs)
- Example: bitmaps for the attribute Sex
  F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
  M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
- Example table
  Name    Sex  Region  Race
  Carol   f    n       white
  Harold  m    e       black
  Anne    f    e       asian
  Iris    f    ne      white
  …       m    se      hisp
  …       f    e       white
  …       f    sw      asian
  …       f    w       black
  …       f    n       asian
  …       m    e       hisp
  …       m    se      black
  …       f    s       white
  …       m    nw      black
  …       f    s       white
  …       f    w       black

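Building the bitmaps is a single pass over the column. The sketch below uses the Sex, Region and Race columns of the example table (names are left out because most of them are elided in the source); the function name `build_bitmaps` and the list-of-ints representation are assumptions for illustration, since a real system would pack the bits.

```python
from collections import defaultdict

# (sex, region, race) for the 15 tuples of the example table, in TID order.
ROWS = [
    ("f", "n", "white"), ("m", "e", "black"), ("f", "e", "asian"),
    ("f", "ne", "white"), ("m", "se", "hisp"), ("f", "e", "white"),
    ("f", "sw", "asian"), ("f", "w", "black"), ("f", "n", "asian"),
    ("m", "e", "hisp"), ("m", "se", "black"), ("f", "s", "white"),
    ("m", "nw", "black"), ("f", "s", "white"), ("f", "w", "black"),
]


def build_bitmaps(values):
    """Return one bitmap per distinct value; bit i is 1 if tuple i has that value."""
    bitmaps = defaultdict(lambda: [0] * len(values))
    for position, value in enumerate(values):
        bitmaps[value][position] = 1
    return dict(bitmaps)


sex_bitmaps = build_bitmaps([row[0] for row in ROWS])
```

Because every tuple has exactly one value per attribute, the bitmaps of one attribute are disjoint and together cover all positions, as the F and M rows above show.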
Querying Bitmap Indexes
- Main advantage of bitmap indexes
  - Simple and efficient logical combination of predicates possible
  - Read only the data that is relevant for the predicates
  - Example: σ Sex='f' ∧ Region='n' (R)
  - Bitmaps B1 and B2 in conjunction: for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
- Example: I/O cost estimation
  - σ Sex='f' ∧ Region='n' ∧ Race='Asian' (R) ("Asian women of region North")
  - Selectivity: 1/2 · 1/8 · 1/4 = 1/64
  - N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
  - Table scan: 1000 pages
  - Bitmap access: 10000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
- [Figure: F AND N AND A = result]
  F:      1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
  N:      0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
  A:      0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
  Result: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
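
The conjunction and the cost estimate can be reproduced in a few lines. The sketch below takes the three bit vectors exactly as printed in the figure (labelled F, N and A there) and uses the numbers of the cost example; the function names `bitmap_and` and `io_estimate` are assumptions, and rounding the expected hits down to whole pages is a simplification.

```python
import math

# Bit vectors as printed in the figure.
F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]


def bitmap_and(*bitmaps):
    """Bitwise AND over any number of equally long bitmaps."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


def io_estimate(num_tuples, tuples_per_page, selectivity, bitmap_pages=1):
    """Pages read by a table scan vs. a bitmap access (worst case:
    every qualifying tuple lies in a different page)."""
    table_scan_pages = math.ceil(num_tuples / tuples_per_page)
    bitmap_access_pages = int(num_tuples * selectivity) + bitmap_pages
    return table_scan_pages, bitmap_access_pages


result = bitmap_and(F, N, A)
matching_positions = [pos for pos, bit in enumerate(result) if bit]
scan_pages, bitmap_pages_read = io_estimate(10_000, 10, (1 / 2) * (1 / 8) * (1 / 4))
```

Only the positions that survive the AND have to be fetched from the table, which is why the estimate drops from 1000 pages for the scan to 156 pages plus one page for the bitmaps.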
- DBMS Architecture
- What is in the Lecture
- How is a Database System Built
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths – Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity amp Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths ndash Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees B+-Trees and B-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
-
| 269
gt
copy A Behrend Foundations of Information Systems |
Hardware Developments
Hardware improvements not equally distributed Advances in CPU speed have outpaced advances
in RAM latency Main-memory access has become a performance
bottleneck for many computer applications Bandwidth Latency Address translation (TLB)
rarr Memory Wall Cache memories can reduce the memory latency
when the requested data is found in the cache Vertically fragmented data structures optimize
memory cache usage
| 270
gt
copy A Behrend Foundations of Information Systems |
Row Storage vs Column Storage
Row Storage + easy to addmodify a record - might read unnecessary data
Column Storage + only need to read in relevant data - tuple writes require multiple accesses -gt suitable for read-mostly read-intensive large data repositories
| 271
gt
copy A Behrend Foundations of Information Systems |
Processing Models
[Marcin Zukowski Peter A Boncz Niels Nes Saacutendor Heacuteman MonetDBX100 - A DBMS In The CPU Cache IEEE Data Eng Bull 28(2) p17-22 2005]
| 272
Transaction Management
- Principle of a transaction
  - Sequence of successive DB operations that transforms a database from a consistent state into another consistent state, surrounded by BOT and EOT (Commit, Abort)
- Properties
  - ACID: Atomicity, Consistency, Isolation, Durability
  - A transaction will always come to an end
    - Normal (commit): changes are permanently stored within the DB
    - Abnormal (abort, rollback): changes already made are taken back
- Note: the EOT state need not be different from the BOT state
[Figure: DML operations between BOT (begin of transaction) and EOT (end of transaction) take the DB from a consistent database through a possibly inconsistent database to a consistent database]
| 273
ACID Properties of Transactions
- Atomicity
  - Indivisibility due to the transaction definition (Begin - End)
  - All-or-nothing principle, i.e. the DBS guarantees
    - either the complete execution of a transaction …
    - … or the ineffectiveness of the whole transaction (and of all associated operations)
- Consistency
  - A successful transaction guarantees that all consistency requirements (integrity requirements) have been met
- Isolation
  - Multiple transactions run isolated from each other and do not use (inconsistent) intermediate results from other transactions
- Durability
  - All results of successful transactions have to be made persistent
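A minimal sketch of the all-or-nothing principle, using Python's built-in sqlite3 module as a stand-in DBS. The `account` table, its contents, and the `transfer` helper are assumptions made for illustration; the integrity requirement checked here is simply "no negative balance".

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one transaction (all or nothing)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            (lowest,) = conn.execute("SELECT MIN(balance) FROM account").fetchone()
            if lowest < 0:
                raise ValueError("integrity requirement violated: negative balance")
        return True
    except ValueError:
        return False

def balances(conn):
    """Current state of the hypothetical account table as {id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

A transfer that would overdraw an account is aborted as a whole: neither of its two updates remains visible afterwards, so the database stays in its previous consistent state.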
| 274
Motivation
- Atomicity (→ UNDO recovery)
  - Part of the transaction is done, but we want to cancel it: ABORT/ROLLBACK
  - System crashes during the transaction: some changes made it to the disk, some did not
- Durability (→ REDO recovery)
  - Transaction finished, user notified: COMMIT
  - System crashes before the changes were successfully sent to disk (asynchronous write)
- Consistency (→ UNDO recovery for consistency-related rollbacks)
  - Physical consistency
    - Correctness of the storage and access structures
    - Completely executed modification operations preserve the physical consistency
  - Logical consistency
    - Correctness of data contents: they correspond to a (possible) state of the real world
    - Completely executed transactions preserve the logical consistency
      - All modifications of finished transactions are included
      - No modifications of open transactions are included
  - Remember: logical consistency requires physical consistency in the first place
| 275
Reasons for crashes
- Transaction error
  - Violation of system restrictions
    - Violation of security regulations
    - Excessive resource requirements, deadlocks
  - Application-related errors, e.g. wrong operations and values: ROLLBACK
- System error
  - System crash with loss of main-memory contents
  - Database system, operating system, hardware, power failure
- Device error (especially storage-medium error)
  - Destruction of secondary storage systems
- Catastrophes
  - Destruction of the computing center
| 276
Guarantee Atomicity & Durability
- Assumptions
  - System may crash, but the disk is durable
  - The only atomicity guarantee is that a disk block write is atomic
- Materialization strategy
  - Preferred policy: Steal / No Force (pages modified by uncommitted transactions may be written to disk; pages modified by a transaction need not be written to disk at commit)
  - This combination is the most complicated but allows for the highest performance
- No Force complicates enforcing Durability
  - What if the system crashes before a modified page written by a committed transaction makes it to disk?
  - Write as little as possible, in a convenient place, at commit time to support REDOing modifications
- Steal complicates enforcing Atomicity
  - What if the transaction that performed updates aborts?
  - What if the system crashes before the transaction is finished?
  - Must remember the old value of P (to support UNDOing the write to page P)
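How a log supports both directions can be sketched as a toy restart recovery. The log record layout (`"write"` records with before- and after-image, `"commit"` records) and the function name are assumptions for illustration; real systems add log sequence numbers, checkpoints, and more.

```python
def recover(disk, log):
    """Toy restart recovery: REDO winners, UNDO losers; returns the repaired pages."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    pages = dict(disk)
    # REDO pass (forward): No Force means committed changes may be missing on disk,
    # so the after-images of committed transactions are applied again.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, txn, page, old, new = rec
            pages[page] = new
    # UNDO pass (backward): Steal means uncommitted changes may already be on disk,
    # so the before-images of unfinished transactions are restored.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, txn, page, old, new = rec
            pages[page] = old
    return pages

# Hypothetical crash scenario: T1 committed but page A was never flushed (No Force);
# T2 is unfinished but its change to page B already reached the disk (Steal).
log = [
    ("write", "T1", "A", 10, 20),
    ("commit", "T1"),
    ("write", "T2", "B", 5, 7),
]
disk_after_crash = {"A": 10, "B": 7}
```

In this scenario recovery ends with A = 20 (the committed change is redone) and B = 5 (the uncommitted change is undone). The sketch assumes that winners and losers did not write the same page, as a locking scheduler would ensure.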
| 277
Record Management
[Figure: Architectural Blue Print, repeated as section divider]
| 278
Record
- Record
  - Package of fields that together describe a thing, a person, a fact, etc.
  - Each field represents one property of the entity described by the record
  - Similar to a struct in C
  - Variable length (in contrast to pages)
- Record Manager
  - Organizes the physical storage of records in pages
  - Operations: Get, Insert, Update, Delete, Scan
  - Agnostic to record structure and semantics
    - Records are considered as byte strings of variable length
    - Structure and content of a record are defined by the Access System and the application
- Challenges
  - Record addressing
  - Free space management
| 279
Record Addressing
- Record address
  - Identifier for records, used to address records, e.g. in indexes or query processing
  - Assigned during the insert of a record
- Goals
  - Stability of the identifier
  - Fast and direct access
  - Little organizational overhead
- Direct addressing
  - Byte address or position number in file or page
  - Unstable
    - Byte address: if a record grows in length, the following records would get a new address
    - Position number: insert and delete operations change the sequence of records
- Indirect addressing
  - Surrogate with mapping table (complete indirection)
  - Tuple Identifier (TID concept)
| 280
Surrogate with Mapping Table
- Surrogate
  - Record type + serial number
  - Serial number remains constant during the record's lifetime
- Mapping table
  - Maps surrogate to page
- Problems
  - Where to store the mapping table?
  - How can it be extended?
  - How to search the mapping table efficiently?
  - → H2: use a B-Tree to store the mapping table
[Figure: Mapping Table with columns Surrogate | Page ID]
| 281
TID Concept
- Record addressing with indirection inside the page
  - Each page contains an array with record positions
  - The TID of a record consists of the page id and the index in the position array
- Pros
  - Access with one page access (two pages in case of overflow)
  - Stable
  - No mapping table required
- Operations
  - Insert: reuse an unused position or add a position
  - Delete: mark the position as unused in the array
  - Update: update all affected positions in the array
  - Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position (no double overflow: if the record moves again, update the TID at the original position)
[Figure: page with position array, Record and Overflow Record]
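The operations above can be sketched with a toy record store. Class and method names, the page capacity, and the simplification that the position array and forwarding TIDs take no space are all assumptions; a TID is modelled as a `(page, slot)` tuple.

```python
class Store:
    """Toy TID store: a page is a list of slots (record bytes, forwarding TID, or None)."""

    def __init__(self, page_capacity=32):
        self.page_capacity = page_capacity  # payload bytes per page (assumed)
        self.pages = []

    def _used(self, page_no):
        return sum(len(s) for s in self.pages[page_no] if isinstance(s, bytes))

    def _place(self, data, avoid=None):
        """Store data in any page with room (except `avoid`) and return its TID."""
        for page_no in range(len(self.pages)):
            if page_no != avoid and self._used(page_no) + len(data) <= self.page_capacity:
                break
        else:  # no existing page fits: allocate a new one
            self.pages.append([])
            page_no = len(self.pages) - 1
        slots = self.pages[page_no]
        for slot, entry in enumerate(slots):
            if entry is None:  # reuse an unused position
                slots[slot] = data
                return (page_no, slot)
        slots.append(data)  # or add a position
        return (page_no, len(slots) - 1)

    def insert(self, data):
        return self._place(data)

    def get(self, tid):
        entry = self.pages[tid[0]][tid[1]]
        if isinstance(entry, tuple):  # forwarding TID: second page access
            entry = self.pages[entry[0]][entry[1]]
        return entry

    def delete(self, tid):
        entry = self.pages[tid[0]][tid[1]]
        if isinstance(entry, tuple):  # also free the overflow record
            self.pages[entry[0]][entry[1]] = None
        self.pages[tid[0]][tid[1]] = None  # mark position as unused

    def update(self, tid, data):
        page_no, slot = tid
        entry = self.pages[page_no][slot]
        if isinstance(entry, tuple):  # drop the old overflow record first
            self.pages[entry[0]][entry[1]] = None
            entry = b""
        if self._used(page_no) - len(entry) + len(data) <= self.page_capacity:
            self.pages[page_no][slot] = data  # fits in the home page
        else:  # overflow: the home slot keeps only a forwarding TID
            self.pages[page_no][slot] = self._place(data, avoid=page_no)
```

The TID handed out at insert time stays valid through every update: the record may move to an overflow page, move again, or return to its home page, but a lookup never follows more than one forwarding TID.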
| 282
Free Space Management
- Problem: which page has enough space for a new record?
- Solution: a free space table lists for all pages how much space is left
- Free space value
  - Precise value: Ceil(Log2(page size)) bits ⇒ 2 bytes for a common page size of 4K
  - Rough value: use fewer bytes; free space = (value × page size) / 2^(bits per value)
- Free space table
  - With direct page addressing: assuming a single page can take n free space entries, the first page and each (n+1)-th page takes free space entries
  - With indirect page addressing: free space information is stored in the page table
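The two encodings can be checked with a short calculation. The function names are hypothetical, and rounding the rough value down is an assumption made here so that the table never promises more space than a page actually has.

```python
import math

def precise_entry_bits(page_size):
    """Bits needed for a precise free space value: Ceil(Log2(page size))."""
    return math.ceil(math.log2(page_size))

def precise_entry_bytes(page_size):
    return math.ceil(precise_entry_bits(page_size) / 8)

def encode_rough(free_bytes, page_size, bits):
    """Rough value with `bits` bits per entry, rounded down (assumption)."""
    return min((free_bytes << bits) // page_size, (1 << bits) - 1)

def decode_rough(value, page_size, bits):
    """free space = (value * page size) / 2^(bits per value)"""
    return value * page_size // (1 << bits)
```

For a 4K page the precise value needs 12 bits, i.e. 2 bytes. With a one-byte rough value, 1000 free bytes are stored as 62 and read back as 992 bytes, so the granularity is 16 bytes.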
| 283
Physical Access Paths – Index Structures
[Figure: Architectural Blue Print, repeated as section divider]
| 284
Overview Indexes
- Table scan
  - Read all pages and evaluate the search criteria for each record
  - Pre-fetching
- Index scan
  - Use an index for search criteria on one or more attributes
  - Fast access to single values or value ranges of index attributes
  - Logical/physical sorting of values of key attributes (depending on index structure)
  - Enforcing uniqueness
- Types of indexes
  - Primary (clustered) index: determines physical organization, use for PK
  - Secondary (non-clustered) index: redundant access path
[Figure: primary index and secondary indexes on Age and Salary for Pers(PID, NAME, AGE, SALARY)]
| 285
Overview Indexes (2)
- Choice of access paths
  - Index scan
    - Only useful for low selectivity (low number of result tuples)
    - Break-even point according to the output ratio of the number of tuples (usually max. 5%)
    - Requires statistics about the data
    - Additional costs for index storage and updating
  - Table scan
    - Adequate/efficient for small tables (e.g. 5 pages)
    - Queries with high selectivity (large result sets)
    - 100-200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time over hit rate for Index Scan and Table Scan]
| 286
Classification of Index Structures
- Multiway trees
  - Tree structure with multiple children per node
  - Idea: choose the fan-out so that the node size suits the page size
[Figure: classification of one-dimensional index structures]
- Level 1: key comparison, key transformation
- Level 2: sequential, tree-based, hash-based, prefix trees (tries)
- Leaves: sequential lists (physically sequential), linked lists (logically sequential); binary search trees, multiway trees (example: B-Tree); static, dynamic
| 287
B-Tree
[Figure: B-Tree node layout with entries and free space]
- Entry = (K_i, D_i, P_i)
- min |P| = k+1, max |P| = 2k+1
- Sub-tree key ranges: keys < K_1; K_i < keys < K_{i+1}; keys > K_p
| 288
B-Tree (2)
- Keys
  - Agnostic to specific key semantics
  - Only a defined total order is required
  - Could be of fixed or variable length
- Operations
  - Search for data for a given key value
  - Insertion and deletion of a key-data pair
- Payload
  - Agnostic to specific data semantics
  - Can be a record, a reference (TID), or a mix
[Figure: example B-Tree with k = 2, h = 3]
| 289
Search in the B-Tree
- Starting at the root node, each node is searched from left to right
  1) If K_i matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which P_{i-1} points)
  2) If K_i is larger than the desired value, the search continues in the root of the sub-tree identified by P_{i-1}
  3) If K_i is smaller than the desired value, the comparison is repeated with K_{i+1}
  4) If K_{2k} is also smaller than the desired value, the search continues in the sub-tree of P_{2k}
- If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search is aborted: no record with the desired key value is found
[Figure: search for 38, 20, 6]
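The search rules can be sketched as a short loop. The `Node` class and the two-level example tree are hypothetical (the tree from the slide figure is not reproduced here), and payloads are omitted; `keys[i]` plays the role of K_{i+1} and `children[i]` is the pointer to its left.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys K_1 .. K_p
        self.children = children or []    # p+1 pointers, empty for a leaf

def search(node, key):
    """Return (found, number of nodes visited)."""
    visited = 0
    while True:
        visited += 1
        i = 0
        # rule 3: K_i is smaller than the desired value, compare with the next key
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited          # rule 1: match
        if not node.children:
            return False, visited         # leaf node: cannot descend further
        # rule 2 (pointer left of the first larger key) or rule 4 (rightmost pointer)
        node = node.children[i]

# Hypothetical example tree with k = 2
root = Node([20], [Node([5, 6, 12]), Node([25, 38, 40])])
```

In this example tree, 20 is found in the root after one node access, 38 and 6 are found in a leaf after two, and a search for 7 ends unsuccessfully in the left leaf.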
| 290
Insertions in the B-Tree (1)
- Insertion rule: insert only into leaf nodes
- At non-leaf nodes: descend down the tree as for the search
  - S ≤ K_i: follow P_{i-1}
  - S > K_i: check K_{i+1}
  - S > K_{2k}: follow P_{2k}
- At a leaf node
  - Insert the data record according to the sorting order
  - Special case: the leaf node is full (2k records) → split the leaf node
- Splitting
  - Generate a new leaf node
  - Split the 2k+1 entries (in order) into two leaf nodes
    - first k entries → left node
    - last k entries → right node
  - The middle entry ((k+1)-th) is used as the new "discriminator" (branching) and inserted into the parent node
| 291
Insertions in the B-Tree (2)
- Node splitting during insertion: two possible situations after a split
  - The parent node is full → repeat the split on this level
  - Enough space → FINISHED
- Special case: root split
  - Split of the root node → new root with two successor nodes
  - Height of the tree grows by 1
  - The tree has been split from the bottom to the top
- Dynamic reorganization (self-balancing)
  - No unloading or loading necessary
  - Tree is always balanced
  - But: in case of many insertions/deletions, reorganization can be beneficial
| 292
Insertions in the B-Tree (3)
- Insertion example: order k = 1 (n = 2k)
- Keys: 1, 5, 2, 6, 7, 4, 8, 3
- Finally: h = 3
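The example can be replayed with a minimal sketch of insertion with node splitting. Class and method names are hypothetical, payloads are omitted, and the split follows the rule stated for this lecture (first k entries left, last k entries right, middle entry moves up).

```python
import bisect

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []    # empty for a leaf

class BTree:
    def __init__(self, k):
        self.k = k                        # a node holds at most 2k keys
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                         # root split: height grows by 1
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        i = bisect.bisect_left(node.keys, key)   # S <= K_i: pointer left of K_i
        if node.children:
            split = self._insert(node.children[i], key)
            if split is None:
                return None
            middle, right = split         # child was split: take the discriminator
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
        else:
            node.keys.insert(i, key)      # insert only into leaf nodes
        if len(node.keys) <= 2 * self.k:
            return None
        return self._split(node)          # 2k+1 entries: overflow

    def _split(self, node):
        k = self.k
        middle = node.keys[k]
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys = node.keys[:k]
        node.children = node.children[:k + 1]
        return middle, right

    def height(self):
        h, node = 1, self.root
        while node.children:
            h, node = h + 1, node.children[0]
        return h

def in_order(node):
    """All keys of the sub-tree in sorted order."""
    if not node.children:
        return list(node.keys)
    result = []
    for i, child in enumerate(node.children):
        result += in_order(child)
        if i < len(node.keys):
            result.append(node.keys[i])
    return result

tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

Inserting the last key, 3, overflows a leaf and then the root, so the root is split and the tree reaches height h = 3 with 4 as the only root key.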
| 293
Insertion and Deletion in the B-Tree
- Problem
  - Insertion can create overflow
  - Deletion can create underflow and overflow
- Example
  - Insertion of key 22 → overflow → split
  - Deletion of key 22 → underflow; need to access all four nodes; finally same as input
[Figure: B-Tree before and after "Insert 22"]
| 294
Deletion in the B-Tree
- Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow → merge]
| 295
Deletion in the B-Tree (2)
- Example: order k = 1 (n = 2k), delete key 3
- Remember: each path from the root to a leaf has the same length h
[Figure: underflow → merge]
| 296
Deletion in the B-Tree (3)
- Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow → merge, then overflow → split]
| 297
Deletion in the B-Tree (4)
- Example: order k = 1 (n = 2k), delete key 3
[Figure: overflow → split, then root split]
| 298
Deletion Algorithm
- Example (there are different algorithms)
  - Search the node that contains the key K to be deleted
  - If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
  - If key K is in an inner node: pull up a new discriminator from one of the successors
    - Analyze which successor node of K has more elements, the left or the right one
    - If both have the same number of elements, decide for one
    - Replace the key K to be deleted with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
    - Delete K' or K'' from the respective successor node (recursively)
- Note: major variants
  - Merge (this lecture)
  - Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)
| 299
B-Trees, B+-Trees and B*-Trees
- B+-Trees
  - Data is only in leaf nodes
  - Key redundancy, but higher fan-out → lower tree height, less I/O
  - Simpler delete procedure → requires only merging of nodes
  - Doubly linked list of all leaf nodes
- B*-Trees
  - Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
[Figure: example secondary index, non-unique; tree with k = 2, h = 3]
| 300
Indexing Low Cardinality Columns
- Problem
  - Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
  - A query for all female customers requires 500,000 random page accesses (secondary index)
  - A table scan would be much faster
- Conclusion
  - B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
  - Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
[Figure: lists F → TID TID TID TID … and M → TID TID TID TID …]
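A rough back-of-the-envelope comparison shows why the table scan wins here. The device figures (~100 disk seeks/s, 100 MB/s sequential read) are the ones quoted under Overview Indexes (2); the tuple length of 400 bytes is an assumption, and the function names are hypothetical.

```python
def index_fetch_seconds(matching_tuples, seeks_per_second=100):
    """One random page access per matching tuple (secondary index, worst case)."""
    return matching_tuples / seeks_per_second

def table_scan_seconds(n_tuples, tuple_bytes, mb_per_second=100):
    """Read the whole table sequentially."""
    return n_tuples * tuple_bytes / (mb_per_second * 1_000_000)

index_time = index_fetch_seconds(500_000)        # all female customers via the index
scan_time = table_scan_seconds(1_000_000, 400)   # assumed 400-byte tuples
```

Under these assumptions, fetching 500,000 tuples through the index takes on the order of 5,000 seconds, while scanning the full 400 MB table takes about 4 seconds.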
| 301
Bitmap Index
- Idea (long history, since the 1960s)
  - Create a bitmap/bitlist for each attribute value
  - Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
  - Bit values: 1 = attribute value set, 0 = attribute value not set
- Necessary condition: sequential numbering of the tuples (TIDs)

Name | Sex | Region | Race
Carol | f | n | white
Harold | m | e | black
Anne | f | e | asian
Iris | f | ne | white
… | m | se | hisp
… | f | e | white
… | f | sw | asian
… | f | w | black
… | f | n | asian
… | m | e | hisp
… | m | se | black
… | f | s | white
… | m | nw | black
… | f | s | white
… | f | w | black

Bitmaps for Sex
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
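A minimal sketch of building one bitmap per attribute value from the sample table and combining bitmaps with AND. Function names are hypothetical, the Name column is left out because most names are elided in the table, and bitmaps are plain Python lists rather than packed bits.

```python
rows = [  # (sex, region, race), in table order
    ("f", "n", "white"), ("m", "e", "black"), ("f", "e", "asian"),
    ("f", "ne", "white"), ("m", "se", "hisp"), ("f", "e", "white"),
    ("f", "sw", "asian"), ("f", "w", "black"), ("f", "n", "asian"),
    ("m", "e", "hisp"), ("m", "se", "black"), ("f", "s", "white"),
    ("m", "nw", "black"), ("f", "s", "white"), ("f", "w", "black"),
]

def build_bitmaps(values):
    """One bitmap per distinct value; bit i belongs to the tuple at position i."""
    bitmaps = {}
    for pos, value in enumerate(values):
        bitmaps.setdefault(value, [0] * len(values))[pos] = 1
    return bitmaps

def bitmap_and(*bitmaps):
    """Bitwise conjunction of equally long bitmaps."""
    return [int(all(bits)) for bits in zip(*bitmaps)]

def positions(bitmap):
    """Tuple positions (0-based) whose bit is set."""
    return [i for i, bit in enumerate(bitmap) if bit]

sex = build_bitmaps([r[0] for r in rows])
region = build_bitmaps([r[1] for r in rows])
race = build_bitmaps([r[2] for r in rows])

north = positions(bitmap_and(sex["f"], region["n"], race["asian"]))
east = positions(bitmap_and(sex["f"], region["e"], race["asian"]))
```

The bitmap built for `f` is exactly the F bitmap listed above. Only the bitmaps are read to evaluate the predicate; the table is accessed afterwards for the qualifying positions alone.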
| 302
Querying Bitmap Indexes
- Main advantage of bitmap indexes
  - Simple and efficient logical combination possible
  - Read only data that is relevant for predicates
  - Example: σ_{Sex='f' ∧ Region='n'}(R), bitmaps B1 and B2 in conjunction: for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
- Example: I/O cost estimation
  - σ_{Sex='f' ∧ Region='n' ∧ Race='Asian'}(R) ("Asian women of region North")
  - Selectivity: 1/2 · 1/8 · 1/4 = 1/64
  - N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
  - Table scan: 1,000 pages
  - Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page), plus 1 page for bitmaps
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N: 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A: 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A = 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
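The cost estimate can be reproduced with a few lines of arithmetic. The function name is hypothetical, 4 kB is taken as 4096 bytes, and the bitmap size of one bit per tuple and predicate is an assumption that matches the "1 page for bitmaps" figure.

```python
import math

def io_estimate(n_tuples, tuple_bytes, page_bytes, selectivities):
    """Return (table scan pages, worst-case data pages via bitmaps, bitmap pages)."""
    tuples_per_page = page_bytes // tuple_bytes
    table_scan_pages = math.ceil(n_tuples / tuples_per_page)
    # Worst case: every qualifying tuple lies in a different page.
    expected_hits = n_tuples * math.prod(selectivities)
    # One bit per tuple for each bitmap involved in the predicate.
    bitmap_bytes = len(selectivities) * math.ceil(n_tuples / 8)
    bitmap_pages = math.ceil(bitmap_bytes / page_bytes)
    return table_scan_pages, math.floor(expected_hits), bitmap_pages

scan_pages, hit_pages, bitmap_pages = io_estimate(10_000, 400, 4096, [1 / 2, 1 / 8, 1 / 4])
```

With the numbers from the example, the three bitmaps need 3,750 bytes and fit into a single page, so about 157 page accesses replace a scan of 1,000 pages.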
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity amp Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths ndash Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees B+-Trees and B-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
-
| 270
gt
copy A Behrend Foundations of Information Systems |
Row Storage vs Column Storage
Row Storage + easy to addmodify a record - might read unnecessary data
Column Storage + only need to read in relevant data - tuple writes require multiple accesses -gt suitable for read-mostly read-intensive large data repositories
| 271
gt
copy A Behrend Foundations of Information Systems |
Processing Models
[Marcin Zukowski Peter A Boncz Niels Nes Saacutendor Heacuteman MonetDBX100 - A DBMS In The CPU Cache IEEE Data Eng Bull 28(2) p17-22 2005]
| 272
gt
copy A Behrend Foundations of Information Systems |
Transaction Management
Principle of a transaction Sequence of successive DB operations that transform a database from a consistent
state into another consistent state surrounded by BOT EOT (Commit Abort)
Properties ACID Atomicity Consistency Isolation Durability A transaction will always come to an end Normal (commit) changes are permanently stored within the DB Abnormal (abort rollback) already composed changes are taken back
Note EOT state must not be different from BOT state
BOT(begin of transaction)
EOT(end of transaction)
possibly inconsistent database
consistentdatabase
consistentdatabase
DB DB
DML operations
| 273
gt
copy A Behrend Foundations of Information Systems |
ACID Properties of Transactions
Atomicity Indivisibility due to the transaction definition (Begin - End) All-or-nothing principle ie the DBS guarantees Either the complete execution of a transaction hellip hellip or the ineffectiveness of the whole transaction (and of all associated operations)
Consistency A successful transaction guarantees that all consistency requirements (integrity
requirements) have been met
Isolation Multiple transactions run isolated from each other and do not use (inconsistent)
intermediate results from other transactions
Durability All results of successful transactions have to be made persistent
| 274
gt
copy A Behrend Foundations of Information Systems |
Motivation
Atomicity Part of the transaction is done but we want to cancel it ABORTROLLBACK System crashes during transaction some changes made it to the disk some did not
Durability Transaction finished user notified COMMIT System crashes before changes sent successfully to disk (asynchronous write)
Consistency Physical consistency Correctness of the storage and access structures Completely executed modification operations preserve the consistency
Logical consistency Correctness of data contents ndash correspond to a (possible) state of the real world Completely executed transactions preserve the logical consistency
- All modifications of finished transactions are included - No modifications of open transactions are included
Remember Logical consistency requires physical consistency in the first place
UNDO Recovery
REDO Recovery
UNDO Recovery for consistency-related rollbacks
| 275
gt
copy A Behrend Foundations of Information Systems |
Reasons for crashes
Transaction error Violation of system restrictions Violation of security regulations Excessive resource requirements deadlocks
Application-related errors eg wrong operations and values ROLLBACK
System error System crash with loss of main-memory contents Database system operating system hardware power failure
Device error (especially storage-medium error) Destruction of secondary storage systems
Catastrophes Destruction of the computing center
| 276
gt
copy A Behrend Foundations of Information Systems |
Guarantee Atomicity amp Durability
Assumptions System may crash but the disk is durable The only atomicity guarantee is that a disk block write is atomic
Materialization strategy Preferred Policy StealNo Force This combination is most complicated but allows for highest performance No Force complicates enforcing Durability What if system crashes before a modified page written by a committed
transaction makes it to disk Write as little as possible in a convenient place at commit time to support
REDOing modifications Steal complicates enforcing Atomicity What if the transaction that performed udpates aborts What if system crashes before transaction is finished Must remember the old value of P (to support UNDOing the write to page P)
copy A Behrend Foundations of Information Systems | 277
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Record Management
| 278
gt
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management
| 279
gt
copy A Behrend Foundations of Information Systems |
Record Addressing
Record address Identifier for records used to address records eg in indexes or query processing Assigned during insert of a record
Goals Stability of identifier Fast and direct access Less organizational overhead
Direct addressing Byte address or position number in file or page Instable Byte address If record grows in length following records would get new address Position number Insert and delete operations change series or records
Indirect addressing Surrogate with mapping table (complete indirection) Tuple Identifier (TID concept)
| 280
gt
copy A Behrend Foundations of Information Systems |
Surrogate with Mapping Table
Surrogate Record type + serial number Serial number remains constant during recordrsquos life time
Mapping table Maps
surrogate to page
Problems Where to store mapping table How can it be extended How to search mapping table efficiently
rarr H2 use B-Tree to store mapping table
Mapping Table Surrogate | Page ID
| 281
gt
copy A Behrend Foundations of Information Systems |
TID Concept
Record addressing with indirection inside the page Each page contains an array with record positions TID of a record consist of page id and index in position array
Pros Access with one page access (two pages in case of overflow) Stable No mapping table required
Operations Insert Reuse unused
position or add position Delete Mark position
as unused in array Update Update all
positions in array Update with overflow Store record
as overflow record and store TID of overflow record at original position (No double overflow Update TID at original position)
Record
Overflow Record
| 282
gt
copy A Behrend Foundations of Information Systems |
Free Space Management
Problem In which page is enough space for new record
Solution Free space table lists for all pages how much space is left
Free space value Precise value Ceil(Log2(page size)) =gt 2 bytes for common page size of 4K Rough value use less bytes free space = (value page size)2^(bits per value)
Free space table With direct page addressing Assuming a single page can take n free space entries First page and each (n+1)-th page takes free space entries
With indirect page addressing Free space information stored in page table
copy A Behrend Foundations of Information Systems | 283
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Physical Access Paths ndash Index Structures
| 284
gt
copy A Behrend Foundations of Information Systems |
Primary Index Secondary Index
Overview Indexes
Table scan Read all pages and for each record
evaluate the search criteria Pre-fetching
Index Scan Use index for search criteria
on one or more attributes Fast access to single values or value ranges of index attributes Logicalphysical sorting of values of key attributes (depending on index structure) Enforcing uniqueness
Types if indexes Primary (Clustered) Index
determines physical organization use for PK Secondary (Non-Clustered)
Index redundant access path
Pers(PID NAME AGE SALARY)
Age
Salary
| 285
gt
copy A Behrend Foundations of Information Systems |
Overview Indexes (2)
Choice of Access Paths Index scan Only useful for low selectivity
(low number of result tuples) Break even-point according to the
output ratio of the number of tuples (usually max 5) Requires statistics about data Additional costs for index storage
and updating
Table Scan adequateefficient for small tables
(eg 5 pages) Queries with high selectivity
(large result sets) 100-200MBs sequential read
~ 100 disk seekss
hit rate
Index Scan
Table Scan
access time
| 286
gt
copy A Behrend Foundations of Information Systems |
Classification of Index Structures
Classification
Multiway Trees Tree structure with multiple children per node Idea chose fan out so that node size suits page size
Onedimensional Index Structures
Key Comparison Key Transformation
Sequential Tree-Based Hash-Based
Prefix Trees (Tries)
Binary Search Trees Dynamic Static Linked Lists
(log seq) Seq Lists
(phys seq) Multiway
Trees Example B-Tree
| 287
gt
copy A Behrend Foundations of Information Systems |
B-Tree
free space
(Ki Di Pi) = entry min |P| = k+1 max |P| = 2k+1
keys lt K1 keys gt Kp Ki lt keys lt Ki+1
| 288
gt
copy A Behrend Foundations of Information Systems |
B-Tree (2)
Example Keys Agnostic to specific key semantic Only defined complete order required Could be of fixed or variable length
Operations Search for data for given key value Insertion and deletion of key-data pair
Payload Agnostic to specific data semantic Can be record or reference (TID) or
mix
B-Tree with k = 2 h = 3
| 289
gt
copy A Behrend Foundations of Information Systems |
Search in the B-Tree
Starting at the root node each node is searched from left to right 1) if Ki matches the desired key value the data record has been found (further
records with the same key value might be located in a sub-tree to which Pi-1 points) 2) if Ki is smaller than the desired value the search will be continued in the root of
the sub-tree identified by Pi-1 3) if Ki is larger than the desired value the comparison with Ki+1 is repeated 4) if K2k is also smaller than the desired value the search will be continued in the
sub-tree of P2k If itlsquos impossible to descend further into a sub-tree (2 or 4) (leaf node) The search is aborted no record with the desired key value is found
Search for 38 20 6
| 290
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (1)
Insertion Rule insert only into leaf nodes At Non-Leaf Nodes descend down the tree as for the search S le Ki follow Pi-1 S gt Ki check Ki+1 S gt K2k follow P2k
At Leaf Node Insert the data record according to the sorting order Special case leaf node is full (2k records) rarr split the leaf node
Splitting Generate a new leaf node Split the 2k+1 entries (in order)
into two leaf nodes first k entries rarr left node last k entries rarr right node
middle entry (k+1-th) is used as new ldquodiscriminatorrdquo (branching) and inserted into the parent node
| 291
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (2)
Node Splitting during Insertion Two possible situations after a split The parent node is full rarr repeat split on this level Enough space rarr FINISHED
Special case root split Split of the root node rarr New root with two successor nodes Height of a tree grows by 1 The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing) No unloading or loading necessary Tree is always balanced But In case of many insertions deletions reorganization can be beneficial
| 292
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (3)
Insertion Example Order k = 1 n=2k Keys 1 5 2 6 7 4 8 3
Finally h=3
| 293
gt
copy A Behrend Foundations of Information Systems |
Insertion and Deletion in the B-Tree
Problem Insertion can create overflow Deletion can create underflow and overflow
Example Insertion of key 22
rarr Overflow rarr Split
Deletion of key 22 rarr Underflow need to access all four nodes finally same as input
Insert 22
| 294
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree
Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow → merge]
Deletion in the B-Tree (2)
Example (continued): order k = 1 (n = 2k), delete key 3
[Figure: underflow → merge]
Remember: each path from the root to a leaf has the same length h
Deletion in the B-Tree (3)
Example (continued): order k = 1 (n = 2k), delete key 3
[Figure: underflow → merge, then overflow → split]
Deletion in the B-Tree (4)
Example (continued): order k = 1 (n = 2k), delete key 3
[Figure: overflow → split, then root split]
Deletion Algorithm
Example (there are different algorithms)
- Search for the node that contains the key K to be deleted
- If K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If K is in an inner node: pull up a new discriminator from one of the successors
  - Determine which successor node of K has more elements, the left or the right one; if both have the same number of elements, choose one
  - Replace K with its direct neighbor K' (largest key) from the left successor node or with its direct neighbor K'' (smallest key) from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of a split or merge in case of overflow or underflow, the entries are re-distributed across one or more adjacent nodes)
B-Trees, B+-Trees and B*-Trees
B+-Trees and B*-Trees
- Data is only in leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees
- Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
Example: secondary index, non-unique
[Figure: B-tree with k = 2, h = 3]
Indexing Low Cardinality Columns
Problem
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
[Figure: index entries F and M, each pointing to a long list of TIDs]
Bitmap Index
Idea (long history, since the 1960s)
- Create a bitmap (bit list) for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set
Necessary condition
- Sequential numbering of the tuples (TIDs)
Example: bitmaps for the attribute Sex
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
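Building such a bitmap index takes only a few lines. The sketch below is a hedged illustration: the function name `build_bitmap_index` and the use of Python lists of 0/1 instead of packed bit vectors are assumptions; the Sex column is taken from the example table on the slide.

```python
def build_bitmap_index(column):
    """Return {attribute value: bit list}; bit i is 1 iff tuple i has that value."""
    bitmaps = {value: [0] * len(column) for value in set(column)}
    for position, value in enumerate(column):
        bitmaps[value][position] = 1   # the position plays the role of the sequential TID
    return bitmaps


sex = ["f", "m", "f", "f", "m", "f", "f", "f", "f", "m", "m", "f", "m", "f", "f"]
index = build_bitmap_index(sex)
```

Since every tuple has exactly one value, the bitmaps of one attribute are disjoint and together cover all positions.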
Querying Bitmap Indexes
Main advantage of bitmap indexes
- Simple and efficient logical combination of predicates
- Read only the data that is relevant for the predicates
- Example: σ(Sex='f' ∧ Region='n') R, with bitmaps B1 and B2 combined by conjunction: for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation for σ(Sex='f' ∧ Region='n' ∧ Race='Asian') R ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
F:             1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N:             0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A:             0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
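The conjunction and the cost estimate can be checked with a short Python sketch. The helper name `bitmap_and` and the variable names are assumptions for illustration; the bitmaps and all numbers are the ones from the slide.

```python
from fractions import Fraction


def bitmap_and(*bitmaps):
    """Bitwise AND of several bit lists of equal length."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]   # Sex = 'f'
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]   # Region = 'n'
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]   # Race = 'Asian'

result = bitmap_and(F, N, A)
matching_positions = [i for i, bit in enumerate(result) if bit]

# I/O cost estimate with the numbers from the slide.
selectivity = Fraction(1, 2) * Fraction(1, 8) * Fraction(1, 4)   # 1/64
n_tuples = 10_000
tuples_per_page = 10
table_scan_pages = n_tuples // tuples_per_page
# Worst case: every qualifying tuple lies in a different page, plus 1 page for the bitmaps.
bitmap_access_pages = round(n_tuples * selectivity) + 1
```

Only the positions that survive the AND have to be fetched from the table, which is why the bitmap plan reads roughly 157 pages instead of 1000.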
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths - Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
Processing Models
[Marcin Zukowski, Peter A. Boncz, Niels Nes, Sándor Héman: MonetDB/X100 - A DBMS In The CPU Cache. IEEE Data Eng. Bull. 28(2), pp. 17-22, 2005]
Transaction Management
Principle of a transaction
- Sequence of successive DB operations that transforms a database from one consistent state into another consistent state, enclosed by BOT and EOT (commit or abort)
Properties
- ACID: Atomicity, Consistency, Isolation, Durability
- A transaction always comes to an end
  - Normal (commit): changes are permanently stored in the DB
  - Abnormal (abort, rollback): changes already made are undone
- Note: the EOT state need not differ from the BOT state
[Figure: timeline from BOT (begin of transaction) to EOT (end of transaction); DML operations lead from a consistent database through a possibly inconsistent database to a consistent database]
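The commit and abort paths can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not the lecture's own example: the `account` table, the `transfer` and `balances` helpers, and the CHECK constraint are assumptions chosen to force an abort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Run both updates as one transaction; return True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer(conn, 1, 2, 30)        # normal end: commit
failed = transfer(conn, 1, 2, 500)   # violates the CHECK constraint: rollback
final = balances(conn)
```

In the failed transfer the credit to account 2 has already been executed when the debit violates the constraint; the rollback undoes it, so the database ends in the same consistent state it had at BOT.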
ACID Properties of Transactions
Atomicity
- Indivisibility due to the transaction definition (begin - end)
- All-or-nothing principle, i.e. the DBS guarantees either the complete execution of a transaction or the ineffectiveness of the whole transaction (and of all associated operations)
Consistency
- A successful transaction guarantees that all consistency requirements (integrity requirements) have been met
Isolation
- Multiple transactions run isolated from each other and do not use (inconsistent) intermediate results from other transactions
Durability
- All results of successful transactions have to be made persistent
Motivation
Atomicity → UNDO recovery
- Part of the transaction is done, but we want to cancel it: ABORT/ROLLBACK
- The system crashes during a transaction: some changes made it to the disk, some did not
Durability → REDO recovery
- Transaction finished, user notified: COMMIT
- The system crashes before the changes were successfully sent to disk (asynchronous write)
Consistency → UNDO recovery for consistency-related rollbacks
- Physical consistency: correctness of the storage and access structures; completely executed modification operations preserve the physical consistency
- Logical consistency: correctness of the data contents, which correspond to a (possible) state of the real world; completely executed transactions preserve the logical consistency
  - All modifications of finished transactions are included
  - No modifications of open transactions are included
- Remember: logical consistency requires physical consistency in the first place
Reasons for crashes
Transaction error
- Violation of system restrictions
- Violation of security regulations
- Excessive resource requirements, deadlocks
- Application-related errors, e.g. wrong operations and values → ROLLBACK
System error
- System crash with loss of main-memory contents
- Database system, operating system, hardware, power failure
Device error (especially storage-medium error)
- Destruction of secondary storage systems
Catastrophes
- Destruction of the computing center
Guarantee Atomicity & Durability
Assumptions
- The system may crash, but the disk is durable
- The only atomicity guarantee is that a disk block write is atomic
Materialization strategy: preferred policy Steal/No-Force
- This combination is the most complicated, but allows for the highest performance
- No-Force complicates enforcing durability
  - What if the system crashes before a modified page written by a committed transaction makes it to disk?
  - Write as little as possible, in a convenient place, at commit time to support REDOing modifications
- Steal complicates enforcing atomicity
  - What if the transaction that performed the updates aborts?
  - What if the system crashes before the transaction is finished?
  - Must remember the old value of P (to support UNDOing the write to page P)
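How a log supports both REDO and UNDO under Steal/No-Force can be sketched in a few lines. This is a deliberately simplified toy model, not the lecture's recovery algorithm: the log record layout `("update", tx, page, old, new)` / `("commit", tx)`, the function name `recover`, and the absence of LSNs and checkpoints are all assumptions.

```python
def recover(disk, log):
    """Return a consistent page state from the on-disk pages and the log."""
    committed = {record[1] for record in log if record[0] == "commit"}
    pages = dict(disk)
    # REDO phase: repeat the updates of committed transactions in log order.
    for record in log:
        if record[0] == "update" and record[1] in committed:
            _, _, page, old, new = record
            pages[page] = new
    # UNDO phase: take back the updates of unfinished transactions in reverse order.
    for record in reversed(log):
        if record[0] == "update" and record[1] not in committed:
            _, _, page, old, new = record
            pages[page] = old
    return pages


log = [
    ("update", "T1", "P1", 10, 11),
    ("update", "T2", "P2", 20, 22),
    ("commit", "T1"),
]
# State at the crash: P1 was not yet written (No-Force), P2 was written early (Steal).
disk_at_crash = {"P1": 10, "P2": 22}
recovered = recover(disk_at_crash, log)
```

The new value in each log record is what makes REDO possible for the committed T1, and the old value is what makes UNDO possible for the unfinished T2.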
Record Management
[Figure: architectural blueprint, repeated from the "Architectural Blue Print" slide]
Record
Record
- Package of fields that together describe a thing, a person, a fact, etc.
- Each field represents one property of the entity described by the record
- Similar to a struct in C
- Variable length (in contrast to pages)
Record manager
- Organizes the physical storage of records in pages
- Operations: Get, Insert, Update, Delete, Scan
- Agnostic to record structure and semantics: records are considered byte strings of variable length
- Structure and content of a record are defined by the Access System and the application
Challenges
- Record addressing
- Free space management
Record Addressing
Record address
- Identifier used to address records, e.g. in indexes or query processing
- Assigned during the insert of a record
Goals
- Stability of the identifier
- Fast and direct access
- Little organizational overhead
Direct addressing
- Byte address or position number in a file or page
- Unstable
  - Byte address: if a record grows in length, the following records get new addresses
  - Position number: insert and delete operations change the sequence of records
Indirect addressing
- Surrogate with mapping table (complete indirection)
- Tuple identifier (TID concept)
Surrogate with Mapping Table
Surrogate
- Record type + serial number
- The serial number remains constant during the record's lifetime
Mapping table
- Maps surrogate to page
Problems
- Where to store the mapping table?
- How can it be extended?
- How to search the mapping table efficiently?
→ H2: use a B-tree to store the mapping table
[Figure: mapping table with columns Surrogate | Page ID]
TID Concept
Record addressing with indirection inside the page
- Each page contains an array with record positions
- The TID of a record consists of the page ID and the index in the position array
Pros
- Access with one page access (two pages in case of overflow)
- Stable
- No mapping table required
Operations
- Insert: reuse an unused position or add a position
- Delete: mark the position as unused in the array
- Update: update all affected positions in the array
- Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position (no double overflow: update the TID at the original position)
[Figure: page with a record and an overflow record]
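The TID operations can be sketched with a toy record manager. Everything here is an assumption made for illustration: the names `Page` and `RecordManager`, the tiny `PAGE_CAPACITY`, and the choice to model the position array as a Python list whose entries are the record bytes, a forwarding TID, or `None` for an unused position.

```python
PAGE_CAPACITY = 32  # usable bytes per page; tiny on purpose (assumption)


class Page:
    def __init__(self):
        # Position array: each entry is record bytes, a forwarding TID, or None (unused).
        self.slots = []

    def used(self):
        return sum(len(s) for s in self.slots if isinstance(s, bytes))


class RecordManager:
    def __init__(self):
        self.pages = []

    def insert(self, record):
        if len(record) > PAGE_CAPACITY:
            raise ValueError("record larger than a page")
        for page_id, page in enumerate(self.pages):
            if page.used() + len(record) <= PAGE_CAPACITY:
                break
        else:  # no existing page has room: allocate a new one
            page = Page()
            self.pages.append(page)
            page_id = len(self.pages) - 1
        if None in page.slots:           # reuse an unused position
            slot = page.slots.index(None)
            page.slots[slot] = record
        else:                            # or add a position
            slot = len(page.slots)
            page.slots.append(record)
        return (page_id, slot)           # the TID

    def get(self, tid):
        entry = self.pages[tid[0]].slots[tid[1]]
        if isinstance(entry, tuple):     # overflow: one extra page access
            entry = self.pages[entry[0]].slots[entry[1]]
        return entry

    def delete(self, tid):
        entry = self.pages[tid[0]].slots[tid[1]]
        if isinstance(entry, tuple):
            self.pages[entry[0]].slots[entry[1]] = None
        self.pages[tid[0]].slots[tid[1]] = None

    def update(self, tid, record):
        page = self.pages[tid[0]]
        old = page.slots[tid[1]]
        if isinstance(old, tuple):       # drop the previous overflow record first
            self.pages[old[0]].slots[old[1]] = None
        old_size = len(old) if isinstance(old, bytes) else 0
        if page.used() - old_size + len(record) <= PAGE_CAPACITY:
            page.slots[tid[1]] = record
        else:                            # store the overflow TID at the original position
            page.slots[tid[1]] = self.insert(record)


rm = RecordManager()
t1 = rm.insert(b"a" * 12)
t2 = rm.insert(b"b" * 12)
rm.update(t1, b"A" * 24)   # no longer fits in page 0: overflow record in page 1
rm.delete(t2)
t3 = rm.insert(b"c" * 8)   # reuses the position freed by t2
```

The TID `t1` stays valid across the overflow, and because `update` always rewrites the entry at the original position, a lookup never follows more than one forwarding TID.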
Free Space Management
Problem: which page has enough space for a new record?
Solution: a free space table lists for all pages how much space is left
Free space value
- Precise value: ceil(log2(page size)) bits ⇒ 2 bytes for a common page size of 4K
- Rough value: use fewer bytes; free space = (value · page size) / 2^(bits per value)
Free space table
- With direct page addressing: assuming a single page can take n free space entries, the first page and each (n+1)-th page takes free space entries
- With indirect page addressing: free space information is stored in the page table
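The two ways of storing a free space value can be compared numerically. The sketch below assumes a 4096-byte page and an 8-bit rough value; the function names and the decision to round down (so that the table never promises more space than a page has) are assumptions for illustration.

```python
import math

PAGE_SIZE = 4096


def bits_for_precise_value(page_size):
    """Number of bits for a precise free space value, as on the slide."""
    return math.ceil(math.log2(page_size))


def encode_rough(free_bytes, page_size=PAGE_SIZE, bits=8):
    """Map the free space to one of 2^bits levels, rounding down."""
    return min(free_bytes * 2**bits // page_size, 2**bits - 1)


def decode_rough(value, page_size=PAGE_SIZE, bits=8):
    """free space = (value * page size) / 2^(bits per value)"""
    return value * page_size // 2**bits


precise_bits = bits_for_precise_value(PAGE_SIZE)   # 12 bits, i.e. 2 bytes
precise_bytes = math.ceil(precise_bits / 8)
rough = encode_rough(1000)                         # one byte instead of two
guaranteed_free = decode_rough(rough)
```

With 8 bits the granularity is 16 bytes per level, so 1000 free bytes are recorded as 992: the table needs half the space at the price of slightly underestimating the free space.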
Physical Access Paths - Index Structures
[Figure: architectural blueprint, repeated from the "Architectural Blue Print" slide]
Overview Indexes
Table scan
- Read all pages and evaluate the search criteria for each record
- Pre-fetching
Index scan
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of the index attributes
- Logical/physical sorting of the values of the key attributes (depending on the index structure)
- Enforcing uniqueness
Types of indexes
- Primary (clustered) index: determines the physical organization, used for the primary key (PK)
- Secondary (non-clustered) index: redundant access path
[Figure: primary index and secondary index on table Pers(PID, NAME, AGE, SALARY); labels: Age, Salary]
Overview Indexes (2)
Choice of access paths
Index scan
- Only useful for low selectivity (low number of result tuples)
- Break-even point depends on the ratio of output tuples to input tuples (usually at most 5%)
- Requires statistics about the data
- Additional costs for index storage and updating
Table scan
- Adequate/efficient for small tables (e.g. 5 pages)
- Queries with high selectivity (large result sets)
- 100-200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time over hit rate for index scan and table scan]
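The trade-off between the two access paths can be made concrete with a rough cost model. This is a deliberately pessimistic sketch: it assumes one disk seek per result tuple for the index scan, no buffering, and a table of 1,000,000 tuples of 400 bytes each; only the 100 MB/s and 100 seeks/s figures come from the slide, and all names are hypothetical.

```python
def table_scan_seconds(n_tuples, tuple_bytes, seq_bytes_per_s):
    """Time to read the whole table sequentially."""
    return n_tuples * tuple_bytes / seq_bytes_per_s


def index_scan_seconds(n_matches, seeks_per_s):
    """Time for an index scan if every result tuple costs one random access."""
    return n_matches / seeks_per_s


N_TUPLES = 1_000_000        # assumption
TUPLE_BYTES = 400           # assumption
SEQ_RATE = 100 * 1_000_000  # 100 MB/s sequential read
SEEK_RATE = 100             # ~100 disk seeks/s

scan_time = table_scan_seconds(N_TUPLES, TUPLE_BYTES, SEQ_RATE)
break_even_matches = scan_time * SEEK_RATE
break_even_hit_rate = break_even_matches / N_TUPLES
five_percent_time = index_scan_seconds(N_TUPLES * 5 // 100, SEEK_RATE)
```

Under these assumptions the table scan takes 4 seconds, so the index scan only wins for up to about 400 result tuples. Real systems do better because several matches may share a page and pages are buffered, but the model shows why the index pays off only for small hit rates.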
Classification of Index Structures
Classification
[Figure: classification tree of one-dimensional index structures. Key comparison: sequential (linked lists, logically sequential; sequential lists, physically sequential) and tree-based (binary search trees; multiway trees, example: B-tree). Key transformation: hash-based (dynamic, static). Also shown: prefix trees (tries)]
Multiway trees
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size
B-Tree
[Figure: B-tree node layout with entries and free space]
- Entry = (K_i, D_i, P_i)
- Number of pointers per node: min |P| = k+1, max |P| = 2k+1
- Key ranges of the sub-trees: keys < K_1, K_i < keys < K_(i+1), keys > K_p
B-Tree (2)
Keys
- Agnostic to specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length
Operations
- Search for the data of a given key value
- Insertion and deletion of a key-data pair
Payload
- Agnostic to specific data semantics
- Can be a record, a reference (TID), or a mix
[Figure: example B-tree with k = 2, h = 3]
Search in the B-Tree
Starting at the root node each node is searched from left to right 1) if Ki matches the desired key value the data record has been found (further
records with the same key value might be located in a sub-tree to which Pi-1 points) 2) if Ki is smaller than the desired value the search will be continued in the root of
the sub-tree identified by Pi-1 3) if Ki is larger than the desired value the comparison with Ki+1 is repeated 4) if K2k is also smaller than the desired value the search will be continued in the
sub-tree of P2k If itlsquos impossible to descend further into a sub-tree (2 or 4) (leaf node) The search is aborted no record with the desired key value is found
Search for 38 20 6
| 290
gt
copy A Behrend Foundations of Information Systems |
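The search procedure translates almost directly into code. The following is a minimal sketch: the `Node` dataclass, the `node` helper, and the `rec<key>` payload strings are assumptions for illustration, and the example tree is simply a small B-tree of order k = 1.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    data: list                                     # data[i] belongs to keys[i]
    children: list = field(default_factory=list)   # empty for leaf nodes


def search(node, wanted):
    """Return the data stored under `wanted`, or None if the key does not exist."""
    while True:
        i = 0
        # K_i smaller than the wanted value: go on with K_(i+1).
        while i < len(node.keys) and wanted > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == wanted:
            return node.data[i]                    # key found in this node
        if not node.children:
            return None                            # leaf reached: cannot descend further
        node = node.children[i]                    # pointer left of K_i, or the last pointer


def node(keys, children=()):
    return Node(list(keys), [f"rec{key}" for key in keys], list(children))


root = node([4], [node([2], [node([1]), node([3])]),
                  node([6], [node([5]), node([7, 8])])])
```

A lookup visits at most one node per level, so the number of page accesses is bounded by the tree height h.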
| 272
gt
copy A Behrend Foundations of Information Systems |
Transaction Management
Principle of a transaction Sequence of successive DB operations that transform a database from a consistent
state into another consistent state surrounded by BOT EOT (Commit Abort)
Properties ACID Atomicity Consistency Isolation Durability A transaction will always come to an end Normal (commit) changes are permanently stored within the DB Abnormal (abort rollback) already composed changes are taken back
Note EOT state must not be different from BOT state
BOT(begin of transaction)
EOT(end of transaction)
possibly inconsistent database
consistentdatabase
consistentdatabase
DB DB
DML operations
| 273
gt
copy A Behrend Foundations of Information Systems |
ACID Properties of Transactions
Atomicity Indivisibility due to the transaction definition (Begin - End) All-or-nothing principle ie the DBS guarantees Either the complete execution of a transaction hellip hellip or the ineffectiveness of the whole transaction (and of all associated operations)
Consistency A successful transaction guarantees that all consistency requirements (integrity
requirements) have been met
Isolation Multiple transactions run isolated from each other and do not use (inconsistent)
intermediate results from other transactions
Durability All results of successful transactions have to be made persistent
| 274
gt
copy A Behrend Foundations of Information Systems |
Motivation
Atomicity Part of the transaction is done but we want to cancel it ABORTROLLBACK System crashes during transaction some changes made it to the disk some did not
Durability Transaction finished user notified COMMIT System crashes before changes sent successfully to disk (asynchronous write)
Consistency Physical consistency Correctness of the storage and access structures Completely executed modification operations preserve the consistency
Logical consistency Correctness of data contents ndash correspond to a (possible) state of the real world Completely executed transactions preserve the logical consistency
- All modifications of finished transactions are included - No modifications of open transactions are included
Remember Logical consistency requires physical consistency in the first place
UNDO Recovery
REDO Recovery
UNDO Recovery for consistency-related rollbacks
| 275
gt
copy A Behrend Foundations of Information Systems |
Reasons for crashes
Transaction error Violation of system restrictions Violation of security regulations Excessive resource requirements deadlocks
Application-related errors eg wrong operations and values ROLLBACK
System error System crash with loss of main-memory contents Database system operating system hardware power failure
Device error (especially storage-medium error) Destruction of secondary storage systems
Catastrophes Destruction of the computing center
| 276
gt
copy A Behrend Foundations of Information Systems |
Guarantee Atomicity amp Durability
Assumptions System may crash but the disk is durable The only atomicity guarantee is that a disk block write is atomic
Materialization strategy Preferred Policy StealNo Force This combination is most complicated but allows for highest performance No Force complicates enforcing Durability What if system crashes before a modified page written by a committed
transaction makes it to disk Write as little as possible in a convenient place at commit time to support
REDOing modifications Steal complicates enforcing Atomicity What if the transaction that performed udpates aborts What if system crashes before transaction is finished Must remember the old value of P (to support UNDOing the write to page P)
copy A Behrend Foundations of Information Systems | 277
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Record Management
| 278
gt
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management
| 279
gt
copy A Behrend Foundations of Information Systems |
Record Addressing
Record address Identifier for records used to address records eg in indexes or query processing Assigned during insert of a record
Goals Stability of identifier Fast and direct access Less organizational overhead
Direct addressing Byte address or position number in file or page Instable Byte address If record grows in length following records would get new address Position number Insert and delete operations change series or records
Indirect addressing Surrogate with mapping table (complete indirection) Tuple Identifier (TID concept)
| 280
gt
copy A Behrend Foundations of Information Systems |
Surrogate with Mapping Table
Surrogate Record type + serial number Serial number remains constant during recordrsquos life time
Mapping table Maps
surrogate to page
Problems Where to store mapping table How can it be extended How to search mapping table efficiently
rarr H2 use B-Tree to store mapping table
Mapping Table Surrogate | Page ID
| 281
gt
copy A Behrend Foundations of Information Systems |
TID Concept
Record addressing with indirection inside the page Each page contains an array with record positions TID of a record consist of page id and index in position array
Pros Access with one page access (two pages in case of overflow) Stable No mapping table required
Operations Insert Reuse unused
position or add position Delete Mark position
as unused in array Update Update all
positions in array Update with overflow Store record
as overflow record and store TID of overflow record at original position (No double overflow Update TID at original position)
Record
Overflow Record
| 282
gt
copy A Behrend Foundations of Information Systems |
Free Space Management
Problem In which page is enough space for new record
Solution Free space table lists for all pages how much space is left
Free space value Precise value Ceil(Log2(page size)) =gt 2 bytes for common page size of 4K Rough value use less bytes free space = (value page size)2^(bits per value)
Free space table With direct page addressing Assuming a single page can take n free space entries First page and each (n+1)-th page takes free space entries
With indirect page addressing Free space information stored in page table
copy A Behrend Foundations of Information Systems | 283
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Physical Access Paths ndash Index Structures

Overview Indexes
Table scan:
- Read all pages and evaluate the search criteria for each record
- Pre-fetching
Index scan:
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of the index attributes
- Logical/physical sorting of the values of key attributes (depending on the index structure)
- Enforcing uniqueness
Types of indexes:
- Primary (clustered) index: determines the physical organization; used for the PK
- Secondary (non-clustered) index: redundant access path
(Figure: primary index and secondary index on Pers(PID, NAME, AGE, SALARY); labels: Age, Salary)

Overview Indexes (2)
Choice of access paths
Index scan:
- Only useful for low selectivity (low number of result tuples)
- Break-even point depends on the ratio of output tuples to the total number of tuples (usually max. 5%)
- Requires statistics about the data
- Additional costs for index storage and updating
Table scan:
- Adequate/efficient for small tables (e.g. 5 pages)
- Queries with high selectivity (large result sets)
- 100-200 MB/s sequential read ~ 100 disk seeks/s
(Figure: access time over hit rate for Index Scan and Table Scan)
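
The trade-off can be made concrete with a back-of-the-envelope model. It uses the two figures from the slide (about 100 MB/s sequential read, about 100 seeks/s) and assumes, pessimistically, one seek per matching tuple. The function names and the example table are assumptions for illustration. The model ignores buffering and the chance that several hits share a page, so it overstates the index cost.

```python
def table_scan_seconds(n_tuples, tuple_bytes, seq_bytes_per_s=100e6):
    """Time to read the whole table sequentially."""
    return n_tuples * tuple_bytes / seq_bytes_per_s


def index_scan_seconds(n_hits, seeks_per_s=100):
    """Worst case: every matching tuple costs one random access."""
    return n_hits / seeks_per_s


def cheaper_access_path(n_tuples, tuple_bytes, n_hits):
    scan = table_scan_seconds(n_tuples, tuple_bytes)
    index = index_scan_seconds(n_hits)
    return "index scan" if index < scan else "table scan"
```

For a hypothetical table of 1,000,000 tuples of 400 bytes, the sequential scan takes about 4 seconds in this model, so the index only wins while the number of hits stays in the low hundreds.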

Classification of Index Structures
Classification (figure): One-dimensional Index Structures
- By technique: Key Comparison | Key Transformation
- Families: Sequential | Tree-Based | Hash-Based | Prefix Trees (Tries)
- Sequential: Sequential Lists (physically sequential), Linked Lists (logically sequential)
- Tree-Based: Binary Search Trees, Multiway Trees (example: B-Tree)
- Hash-Based: Static, Dynamic
Multiway Trees:
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size

B-Tree
(Figure: node layout with entries and free space)
- Entry = (Ki, Di, Pi): key, data, pointer
- min |P| = k+1, max |P| = 2k+1
- Sub-tree key ranges: keys < K1 | Ki < keys < Ki+1 | keys > Kp

B-Tree (2)
Keys:
- Agnostic to specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length
Operations:
- Search for the data for a given key value
- Insertion and deletion of a key-data pair
Payload:
- Agnostic to specific data semantics
- Can be a record, a reference (TID), or a mix
(Figure: example B-Tree with k = 2, h = 3)

Search in the B-Tree
Starting at the root node, each node is searched from left to right:
1) If Ki matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which Pi-1 points).
2) If Ki is larger than the desired value, the search continues in the root of the sub-tree identified by Pi-1.
3) If Ki is smaller than the desired value, the comparison is repeated with Ki+1.
4) If K2k is also smaller than the desired value, the search continues in the sub-tree of P2k.
If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search is aborted: no record with the desired key value exists.
(Figure: search for 38, 20, 6)
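
The search procedure can be sketched as follows. The `Node` class and the small example tree are assumptions for illustration; the tree in the slide's figure is not reproduced here. Keys are scanned from left to right, and `children[i]` plays the role of the pointer to the left of `keys[i]`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf node


def search(root, wanted):
    """Return True if `wanted` is stored in the B-tree rooted at `root`."""
    node = root
    while True:
        i = 0
        # Key smaller than the wanted value: compare with the next key.
        while i < len(node.keys) and node.keys[i] < wanted:
            i += 1
        if i < len(node.keys) and node.keys[i] == wanted:
            return True                # match found in this node
        if not node.children:
            return False               # leaf node: cannot descend further
        # Either keys[i] is larger than the wanted value, or all keys are
        # smaller (i == len(keys)); both cases descend into children[i].
        node = node.children[i]


# Hypothetical tree: one root key with two leaf nodes.
example = Node([20], [Node([6, 10]), Node([30, 38])])
```

In the example tree, 20 is found in the root, 6 and 38 in the leaves, and a lookup of 7 ends in the left leaf without a match.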

Insertions in the B-Tree (1)
Insertion rule: insert only into leaf nodes
At non-leaf nodes: descend the tree as for the search
- S ≤ Ki: follow Pi-1
- S > Ki: check Ki+1
- S > K2k: follow P2k
At a leaf node: insert the data record according to the sort order
- Special case: the leaf node is full (2k records) → split the leaf node
Splitting:
- Generate a new leaf node
- Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
- The middle entry (the (k+1)-th) is used as the new "discriminator" (branching) and inserted into the parent node

Insertions in the B-Tree (2)
Node splitting during insertion: two possible situations after a split
- The parent node is full → repeat the split on this level
- Enough space → FINISHED
Special case: root split
- Split of the root node → new root with two successor nodes
- The height of the tree grows by 1
- The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing)
- No unloading or loading necessary
- The tree is always balanced
- But: in case of many insertions/deletions, reorganization can be beneficial

Insertions in the B-Tree (3)
Insertion example: order k = 1, n = 2k; keys: 1, 5, 2, 6, 7, 4, 8, 3
(Figure: tree after the insertions; finally h = 3)
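
The insertion example can be replayed with a minimal sketch of insert-with-split. Class and method names are hypothetical, and payloads are omitted so that only keys are stored. A node is split once it holds 2k+1 keys: the first k stay, the last k move to a new node, and the middle key goes up to the parent.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty for a leaf node


class BTree:
    def __init__(self, k):
        self.k = k
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:            # root split: height grows by 1
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        i = bisect_left(node.keys, key)  # key <= keys[i]: use children[i]
        if not node.children:
            node.keys.insert(i, key)     # leaf: insert in sort order
        else:
            split = self._insert(node.children[i], key)
            if split is None:
                return None
            middle, right = split        # child was split: take discriminator
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
        if len(node.keys) <= 2 * self.k:
            return None                  # enough space: finished
        k = self.k                       # overflow: 2k+1 keys in this node
        middle = node.keys[k]
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys = node.keys[:k]
        node.children = node.children[:k + 1]
        return middle, right

    def height(self):
        h, node = 1, self.root
        while node.children:
            node, h = node.children[0], h + 1
        return h


tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

After the eight insertions the sketch ends with a tree of height 3, in line with the slide. The last key, 3, triggers a leaf split that propagates up to a root split.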

Insertion and Deletion in the B-Tree
Problem:
- Insertion can create overflow
- Deletion can create underflow and overflow
Example:
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; need to access all four nodes; finally same as input
(Figure: Insert 22)

Deletion in the B-Tree
Example: order k = 1, n = 2k; delete key 3
(Figure labels: Underflow, Merge)

Deletion in the B-Tree (2)
Example: order k = 1, n = 2k; delete key 3
(Figure labels: Underflow, Merge)
Remember: each path from the root to a leaf has the same length h.

Deletion in the B-Tree (3)
Example: order k = 1, n = 2k; delete key 3
(Figure labels: Underflow, Merge; Overflow, Split)

Deletion in the B-Tree (4)
Example: order k = 1, n = 2k; delete key 3
(Figure labels: Overflow, Split; Root Split)

Deletion Algorithm
Example (there are different algorithms):
- Search the node that contains the key K to be deleted.
- If K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling.
- If K is in an inner node: pull up a new discriminator from one of the successors.
  - Analyze which successor node of K has more elements, the left or the right one. If both have the same number of elements, decide for one.
  - Replace K with the direct predecessor K' from the left successor sub-tree or with the direct successor K'' from the right successor sub-tree, respectively.
  - Delete K' or K'' from the respective successor node (recursively).
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)

B-Trees, B+-Trees and B*-Trees
B+-Trees and B*-Trees:
- Data is only in leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees:
- Modified valid node sizes from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
Example: secondary index, non-unique
(Figure: B-Tree with k = 2, h = 3)

Indexing Low Cardinality Columns
Problem (example):
- A B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion:
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
(Figure: index entries F and M, each pointing to a list of TIDs)

Bitmap Index
Idea (long history, since the 1960s):
- Create a bitmap/bitlist for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set
Necessary condition: sequential numbering of the tuples (TIDs)
Example table:
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
Bitmaps for Sex:
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
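
Building the bitmaps from a column is a single pass over the tuples in TID order. The sketch below uses the Sex column of the example table; the function name `build_bitmaps` is an assumption.

```python
def build_bitmaps(column):
    """Return one bit list per distinct value; bit i belongs to tuple i."""
    bitmaps = {value: [0] * len(column) for value in set(column)}
    for position, value in enumerate(column):
        bitmaps[value][position] = 1
    return bitmaps


# Sex column of the example table, in tuple order.
sex = ["f", "m", "f", "f", "m", "f", "f", "f", "f", "m", "m", "f", "m", "f", "f"]
sex_bitmaps = build_bitmaps(sex)
```

Each position carries a 1 in exactly one of the bitmaps, which is why F and M are complements of each other here.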

Querying Bitmap Indexes
Main advantage of bitmap indexes:
- Simple and efficient logical combination of bitmaps possible
- Read only the data that is relevant for the predicates
- Example: σ Sex='f' ∧ Region='n' (R), with bitmaps B1 and B2 in conjunction:
    for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation for σ Sex='f' ∧ Region='n' ∧ Race='Asian' (R) ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1000 pages
- Bitmap access: 10000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
Bitmaps:
F:             1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N:             0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A:             0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
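
The conjunction and the cost estimate can be checked with a short sketch. The three bitmaps are copied from the slide; the helper name `bitmap_and` and the use of exact fractions are choices made here for illustration.

```python
from fractions import Fraction

F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]


def bitmap_and(*bitmaps):
    """Bitwise AND of equally long bit lists."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


result = bitmap_and(F, N, A)
matching_positions = [i for i, bit in enumerate(result) if bit]

# I/O cost estimate with the numbers from the slide.
selectivity = Fraction(1, 2) * Fraction(1, 8) * Fraction(1, 4)
n_tuples = 10_000
tuples_per_page = 4096 // 400                      # about 10 tuples per 4 kB page
table_scan_pages = n_tuples // tuples_per_page
bitmap_pages = round(n_tuples * selectivity) + 1   # worst case plus bitmap page
```

Only the third tuple survives all three predicates, and the estimate comes out at 157 page accesses against 1000 for the table scan.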
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths – Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes

ACID Properties of Transactions
Atomicity:
- Indivisibility due to the transaction definition (Begin - End)
- All-or-nothing principle, i.e. the DBS guarantees either the complete execution of a transaction or the ineffectiveness of the whole transaction (and of all associated operations)
Consistency:
- A successful transaction guarantees that all consistency requirements (integrity requirements) have been met
Isolation:
- Multiple transactions run isolated from each other and do not use (inconsistent) intermediate results from other transactions
Durability:
- All results of successful transactions have to be made persistent

Motivation
Atomicity (→ UNDO recovery):
- Part of the transaction is done, but we want to cancel it: ABORT/ROLLBACK
- System crashes during the transaction: some changes made it to the disk, some did not
Durability (→ REDO recovery):
- Transaction finished, user notified: COMMIT
- System crashes before the changes were sent successfully to disk (asynchronous write)
Consistency (→ UNDO recovery for consistency-related rollbacks):
- Physical consistency: correctness of the storage and access structures; completely executed modification operations preserve the consistency
- Logical consistency: correctness of the data contents, which correspond to a (possible) state of the real world; completely executed transactions preserve the logical consistency
  - All modifications of finished transactions are included
  - No modifications of open transactions are included
- Remember: logical consistency requires physical consistency in the first place

Reasons for crashes
Transaction error:
- Violation of system restrictions: violation of security regulations; excessive resource requirements, deadlocks
- Application-related errors, e.g. wrong operations and values
- ROLLBACK
System error:
- System crash with loss of main-memory contents
- Database system, operating system, hardware, power failure
Device error (especially storage-medium error):
- Destruction of secondary storage systems
Catastrophes:
- Destruction of the computing center

Guarantee Atomicity & Durability
Assumptions:
- The system may crash, but the disk is durable
- The only atomicity guarantee is that a disk block write is atomic
Materialization strategy:
- Preferred policy: Steal/No-Force
- This combination is the most complicated but allows for the highest performance
No-Force complicates enforcing Durability:
- What if the system crashes before a modified page written by a committed transaction makes it to disk?
- Write as little as possible, in a convenient place, at commit time to support REDOing modifications
Steal complicates enforcing Atomicity:
- What if the transaction that performed the updates aborts?
- What if the system crashes before the transaction is finished?
- Must remember the old value of P (to support UNDOing the write to page P)
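
How remembered old and new values support UNDO and REDO can be shown with a toy recovery routine. This is a strongly simplified sketch, not a description of a real recovery algorithm. The log record layout and the name `recover` are assumptions, and the sketch assumes that an uncommitted transaction never overwrites a page after a committed one has written to it.

```python
def recover(disk, log):
    """Bring `disk` (page -> value) to a consistent state using `log`.

    Log records (hypothetical layout):
      ("write", tx, page, old_value, new_value)
      ("commit", tx)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # REDO: No-Force means committed changes may be missing on disk.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, page, _, new_value = rec
            disk[page] = new_value

    # UNDO: Steal means uncommitted changes may already be on disk.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, page, old_value, _ = rec
            disk[page] = old_value
    return disk


# T1 committed, but its page was not yet written (No-Force).
# T2 is still open, but its page was already written (Steal).
disk_after_crash = {"A": 1, "B": 20}
log = [
    ("write", "T1", "A", 1, 2),
    ("commit", "T1"),
    ("write", "T2", "B", 10, 20),
]
recovered = recover(dict(disk_after_crash), log)
```

After recovery, page A carries the committed value 2 and page B is back at its old value 10.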

Record Management
(Figure: architectural blueprint, Storage System layer)

Record
Record:
- Package of fields that together describe a thing, a person, a fact, etc.
- Each field represents one property of the entity described by the record
- Similar to a struct in C
- Variable length (in contrast to pages)
Record Manager:
- Organizes the physical storage of records in pages
- Operations: Get, Insert, Update, Delete, Scan
- Agnostic to record structure and semantics: records are considered byte strings of variable length
- Structure and content of a record are defined by the Access System and the application
Challenges:
- Record addressing
- Free space management

Record Addressing
Record address:
- Identifier for records, used to address records, e.g. in indexes or query processing
- Assigned during the insert of a record
Goals:
- Stability of the identifier
- Fast and direct access
- Little organizational overhead
Direct addressing:
- Byte address or position number in the file or page
- Unstable
  - Byte address: if a record grows in length, the following records get new addresses
  - Position number: insert and delete operations change the sequence of records
Indirect addressing:
- Surrogate with mapping table (complete indirection)
- Tuple Identifier (TID concept)

Surrogate with Mapping Table
Surrogate:
- Record type + serial number
- The serial number remains constant during the record's lifetime
Mapping table: maps surrogate to page
Problems:
- Where to store the mapping table?
- How can it be extended?
- How to search the mapping table efficiently?
→ H2: uses a B-Tree to store the mapping table
(Figure: Mapping Table with columns Surrogate | Page ID)
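
A mapping table can be pictured as a dictionary from surrogate to page id. The sketch below is only an in-memory illustration with hypothetical names; it does not address the storage and search problems listed above.

```python
class MappingTable:
    """Maps surrogates (record type, serial number) to page ids."""

    def __init__(self):
        self._next_serial = {}   # record type -> next free serial number
        self._page_of = {}       # surrogate -> page id

    def new_surrogate(self, record_type, page_id):
        serial = self._next_serial.get(record_type, 0)
        self._next_serial[record_type] = serial + 1
        surrogate = (record_type, serial)
        self._page_of[surrogate] = page_id
        return surrogate

    def lookup(self, surrogate):
        return self._page_of[surrogate]

    def move(self, surrogate, new_page_id):
        # The record moved to another page; the surrogate itself stays constant.
        self._page_of[surrogate] = new_page_id
```

Moving a record only changes one table entry, which is what makes the surrogate stable. The price is an extra lookup on every access.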

TID Concept
Record addressing with indirection inside the page:
- Each page contains an array with record positions
- The TID of a record consists of the page id and the index in the position array
| 274
gt
copy A Behrend Foundations of Information Systems |
Motivation
Atomicity Part of the transaction is done but we want to cancel it ABORTROLLBACK System crashes during transaction some changes made it to the disk some did not
Durability Transaction finished user notified COMMIT System crashes before changes sent successfully to disk (asynchronous write)
Consistency Physical consistency Correctness of the storage and access structures Completely executed modification operations preserve the consistency
Logical consistency Correctness of data contents ndash correspond to a (possible) state of the real world Completely executed transactions preserve the logical consistency
- All modifications of finished transactions are included - No modifications of open transactions are included
Remember Logical consistency requires physical consistency in the first place
UNDO Recovery
REDO Recovery
UNDO Recovery for consistency-related rollbacks
| 275
gt
copy A Behrend Foundations of Information Systems |
Reasons for crashes
Transaction error Violation of system restrictions Violation of security regulations Excessive resource requirements deadlocks
Application-related errors eg wrong operations and values ROLLBACK
System error System crash with loss of main-memory contents Database system operating system hardware power failure
Device error (especially storage-medium error) Destruction of secondary storage systems
Catastrophes Destruction of the computing center
| 276
gt
copy A Behrend Foundations of Information Systems |
Guarantee Atomicity amp Durability
Assumptions System may crash but the disk is durable The only atomicity guarantee is that a disk block write is atomic
Materialization strategy Preferred Policy StealNo Force This combination is most complicated but allows for highest performance No Force complicates enforcing Durability What if system crashes before a modified page written by a committed
transaction makes it to disk Write as little as possible in a convenient place at commit time to support
REDOing modifications Steal complicates enforcing Atomicity What if the transaction that performed udpates aborts What if system crashes before transaction is finished Must remember the old value of P (to support UNDOing the write to page P)
copy A Behrend Foundations of Information Systems | 277
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Record Management
| 278
gt
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management

Record Addressing
Record address: identifier used to address records, e.g. in indexes or during query processing
- Assigned when the record is inserted
Goals: stability of the identifier, fast and direct access, low organizational overhead
Direct addressing: byte address or position number in file or page
- Unstable
- Byte address: if a record grows in length, the following records get new addresses
- Position number: insert and delete operations change the sequence of records
Indirect addressing
- Surrogate with mapping table (complete indirection)
- Tuple Identifier (TID concept)

Surrogate with Mapping Table
Surrogate: record type + serial number
- The serial number remains constant during the record's lifetime
Mapping table: maps surrogate to page
Problems
- Where to store the mapping table?
- How can it be extended?
- How to search the mapping table efficiently?
→ H2: use B-Tree to store mapping table
[Figure: Mapping Table with columns Surrogate | Page ID]
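The complete indirection can be sketched in a few lines. This is an illustration only: the class and method names are hypothetical, and a Python dict stands in for the B-Tree that the slide suggests for storing the table.

```python
class MappingTable:
    """Maps a stable surrogate (record type, serial number) to a page id."""

    def __init__(self):
        self._pages = {}        # surrogate -> page id (stand-in for a B-Tree)
        self._next_serial = {}  # record type -> next free serial number

    def register(self, record_type: str, page_id: int):
        """Assign a new surrogate at insert time and remember where the record lives."""
        serial = self._next_serial.get(record_type, 0)
        self._next_serial[record_type] = serial + 1
        surrogate = (record_type, serial)
        self._pages[surrogate] = page_id
        return surrogate

    def lookup(self, surrogate):
        return self._pages[surrogate]

    def move(self, surrogate, new_page_id: int):
        # Only the table entry changes; surrogates stored in indexes stay valid.
        self._pages[surrogate] = new_page_id

table = MappingTable()
s = table.register("Person", page_id=7)
table.move(s, new_page_id=12)
```

Every record access pays for one extra lookup in the table, which is the price of this stability.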

TID Concept
Record addressing with indirection inside the page
- Each page contains an array with record positions
- The TID of a record consists of the page id and the index in the position array
Pros
- Access with one page access (two page accesses in case of overflow)
- Stable
- No mapping table required
Operations
- Insert: reuse an unused position or add a position
- Delete: mark the position as unused in the array
- Update: update the positions in the array
- Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position (no double overflow: update the TID at the original position)
[Figure: page with Record and Overflow Record]
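The operations above can be sketched as follows. This is a simplified model under stated assumptions: pages are Python objects instead of byte arrays, the page size is a tiny 32 bytes, and all names (`Page`, `RecordStore`, the "rec"/"fwd" tags) are hypothetical.

```python
PAGE_SIZE = 32  # bytes of record payload per page (tiny, for illustration only)

class Page:
    def __init__(self):
        # Position array: each entry is None, ("rec", bytes) or ("fwd", overflow_tid).
        self.slots = []

    def used(self):
        return sum(len(e[1]) for e in self.slots if e and e[0] == "rec")

class RecordStore:
    def __init__(self):
        self.pages = [Page()]

    def _place(self, data):
        """Store data in the first page with enough room; return its TID."""
        for page_id, page in enumerate(self.pages):
            if page.used() + len(data) <= PAGE_SIZE:
                break
        else:
            self.pages.append(Page())
            page_id, page = len(self.pages) - 1, self.pages[-1]
        for slot, entry in enumerate(page.slots):
            if entry is None:                     # reuse an unused position
                page.slots[slot] = ("rec", data)
                return (page_id, slot)
        page.slots.append(("rec", data))          # or add a position
        return (page_id, len(page.slots) - 1)

    def _free_overflow(self, entry):
        if entry and entry[0] == "fwd":
            page_id, slot = entry[1]
            self.pages[page_id].slots[slot] = None

    def insert(self, data):
        return self._place(data)

    def get(self, tid):
        kind, value = self.pages[tid[0]].slots[tid[1]]
        if kind == "fwd":                         # overflow: second page access
            kind, value = self.pages[value[0]].slots[value[1]]
        return value

    def delete(self, tid):
        page = self.pages[tid[0]]
        self._free_overflow(page.slots[tid[1]])
        page.slots[tid[1]] = None                 # mark position as unused

    def update(self, tid, data):
        page = self.pages[tid[0]]
        # Drop any old overflow record first, so no chain of overflows builds up.
        self._free_overflow(page.slots[tid[1]])
        page.slots[tid[1]] = None
        if page.used() + len(data) <= PAGE_SIZE:
            page.slots[tid[1]] = ("rec", data)    # fits in the home page
        else:
            page.slots[tid[1]] = ("fwd", self._place(data))

store = RecordStore()
a = store.insert(b"a" * 20)
b = store.insert(b"b" * 10)
store.update(a, b"A" * 30)  # no longer fits next to b: becomes an overflow record
```

The TID `a` never changes, even though the record now lives in another page; a later update always rewrites the forwarding entry at the original position instead of chaining overflow records.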

Free Space Management
Problem: which page has enough space for a new record?
Solution: a free space table lists, for all pages, how much space is left
Free space value
- Precise value: Ceil(Log2(page size)) bits => 2 bytes for a common page size of 4K
- Rough value: use fewer bytes; free space = (value * page size) / 2^(bits per value)
Free space table
- With direct page addressing: assuming a single page can take n free space entries, the first page and each (n+1)-th page hold free space entries
- With indirect page addressing: free space information is stored in the page table
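The two encodings of the free space value can be sketched as follows. The function names are hypothetical, and rounding the rough value down (so that the table never promises more space than a page has) is an assumption of this sketch, not something the slide states.

```python
import math

PAGE_SIZE = 4096  # common page size of 4K

def precise_bits(page_size: int = PAGE_SIZE) -> int:
    """Bits needed for the precise value: Ceil(Log2(page size))."""
    return math.ceil(math.log2(page_size))

def encode_rough(free_bytes: int, bits: int = 8, page_size: int = PAGE_SIZE) -> int:
    """Compress the free space into `bits` bits, rounding down."""
    value = free_bytes * (2 ** bits) // page_size
    return min(value, 2 ** bits - 1)

def decode_rough(value: int, bits: int = 8, page_size: int = PAGE_SIZE) -> int:
    """free space = (value * page size) / 2^(bits per value)"""
    return value * page_size // (2 ** bits)
```

With one byte per page, the free space is known in steps of 16 bytes: 1000 free bytes are stored as 62 and read back as 992.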

Physical Access Paths - Index Structures

Overview Indexes
Table scan
- Read all pages and evaluate the search criteria for each record
- Pre-fetching
Index scan
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of the index attributes
- Logical/physical sorting of the values of the key attributes (depending on the index structure)
- Enforcing uniqueness
Types of indexes
- Primary (clustered) index: determines the physical organization; used for the primary key
- Secondary (non-clustered) index: redundant access path
[Figure: table Pers(PID, NAME, AGE, SALARY) with primary index and secondary indexes; labels: Age, Salary]

Overview Indexes (2)
Choice of Access Paths
Index scan
- Only useful for low selectivity (low number of result tuples)
- Break-even point depends on the ratio of result tuples to all tuples (usually max. 5%)
- Requires statistics about the data
- Additional costs for index storage and updating
Table scan
- Adequate/efficient for small tables (e.g. 5 pages)
- Queries with high selectivity (large result sets)
- 100-200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time over hit rate for index scan and table scan]
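The trade-off can be illustrated with a deliberately crude cost model built from the two disk figures on the slide. The assumptions are mine: 4 KB pages, the lower sequential rate of 100 MB/s, and one disk seek per result tuple for the index scan (no caching, no clustering). Because of these pessimistic assumptions, the absolute break-even point it produces should not be read as a replacement for the rule of thumb above.

```python
PAGE_SIZE = 4096                 # assumed page size in bytes
SEQ_BYTES_PER_S = 100 * 10**6    # lower end of "100-200 MB/s sequential read"
SEEKS_PER_S = 100                # "~100 disk seeks/s"

def table_scan_seconds(num_pages: int) -> float:
    """Read every page of the table sequentially."""
    return num_pages * PAGE_SIZE / SEQ_BYTES_PER_S

def index_scan_seconds(num_result_tuples: int) -> float:
    """Worst case for a secondary index: one random page access per result tuple."""
    return num_result_tuples / SEEKS_PER_S

def cheaper_access_path(num_pages: int, num_result_tuples: int) -> str:
    if index_scan_seconds(num_result_tuples) < table_scan_seconds(num_pages):
        return "index scan"
    return "table scan"
```

For a table of 100,000 pages, `cheaper_access_path(100_000, 100)` picks the index scan, while `cheaper_access_path(100_000, 500_000)` picks the table scan, matching the qualitative shape of the access time curves.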

Classification of Index Structures
[Figure: classification tree. One-dimensional index structures split into key comparison (sequential, tree-based) and key transformation (hash-based). Leaves: sequential lists (physically sequential), linked lists (logically sequential), binary search trees, multiway trees (example: B-Tree), prefix trees (tries), static and dynamic hashing]
Multiway Trees
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size

B-Tree
[Figure: B-Tree node consisting of entries and free space]
- Entry = (Ki, Di, Pi)
- Number of pointers per node: min |P| = k+1, max |P| = 2k+1
- Leftmost sub-tree: keys < K1; sub-tree between Ki and Ki+1: Ki < keys < Ki+1; rightmost sub-tree: keys > Kp

B-Tree (2)
Keys
- Agnostic to specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length
Operations
- Search for the data of a given key value
- Insertion and deletion of a key-data pair
Payload
- Agnostic to specific data semantics
- Can be a record, a reference (TID), or a mix
[Figure: example B-Tree with k = 2, h = 3]

Search in the B-Tree
Starting at the root node, each node is searched from left to right:
1) If Ki matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which Pi-1 points).
2) If Ki is larger than the desired value, the search continues in the root of the sub-tree identified by Pi-1.
3) If Ki is smaller than the desired value, the comparison is repeated with Ki+1.
4) If K2k is also smaller than the desired value, the search continues in the sub-tree of P2k.
If it is impossible to descend further into a sub-tree in step 2 or 4 (leaf node), the search is aborted: no record with the desired key value exists.
Example: search for 38, 20, 6
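The four steps can be sketched as a short loop. The `Node` class and the small example tree are hypothetical (the tree from the slide figure is not recoverable from the transcript); pointers are stored 0-based, so `children[i]` is the sub-tree to the left of `keys[i]`.

```python
class Node:
    def __init__(self, keys, data, children=None):
        self.keys = keys                # K1..Kp in sorted order
        self.data = data                # D1..Dp, payload per key
        self.children = children or []  # P0..Pp; empty list for a leaf

def search(node, wanted):
    """Return the payload stored under `wanted`, or None if the key is absent."""
    while True:
        i = 0
        # Step 3: Ki is smaller than the desired value, compare with the next key.
        while i < len(node.keys) and node.keys[i] < wanted:
            i += 1
        # Step 1: Ki matches, record found.
        if i < len(node.keys) and node.keys[i] == wanted:
            return node.data[i]
        # Leaf reached: it is impossible to descend further.
        if not node.children:
            return None
        # Step 2 (Ki is larger) or step 4 (all keys are smaller): descend.
        node = node.children[i]

# Hypothetical tree: one root key with two leaves.
left = Node([6, 12], ["rec-6", "rec-12"])
right = Node([30, 38], ["rec-30", "rec-38"])
root = Node([20], ["rec-20"], [left, right])
```

Searching this tree for 38, 20 and 6 visits two, one and two nodes respectively; a search for 7 ends in the left leaf without a match.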

Insertions in the B-Tree (1)
Insertion rule: insert only into leaf nodes
At non-leaf nodes: descend the tree as in the search (S = key to be inserted)
- S ≤ Ki: follow Pi-1
- S > Ki: check Ki+1
- S > K2k: follow P2k
At the leaf node
- Insert the data record according to the sort order
- Special case: the leaf node is full (2k records) → split the leaf node
Splitting
- Generate a new leaf node
- Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
- The middle entry (the (k+1)-th) is used as the new "discriminator" (branching key) and inserted into the parent node

Insertions in the B-Tree (2)
Node splitting during insertion: two possible situations after a split
- The parent node is full → repeat the split on this level
- Enough space → finished
Special case: root split
- Split of the root node → new root with two successor nodes
- Height of the tree grows by 1
- The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing)
- No unloading or loading necessary
- Tree is always balanced
- But: in case of many insertions/deletions, a reorganization can be beneficial

Insertions in the B-Tree (3)
Insertion example: order k = 1 (n = 2k), keys 1, 5, 2, 6, 7, 4, 8, 3
Finally: h = 3
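The insertion rule, the split, and the root split can be traced with a compact sketch. It stores keys only (no payload), and the class and method names are hypothetical; it is one possible way to code the procedure described on the slides, not the lecture's reference implementation.

```python
import bisect

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf

class BTree:
    def __init__(self, k: int):
        self.k = k                       # a node holds at most 2k keys
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:            # root split: height grows by 1
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        """Insert below `node`; return (middle key, new right node) on a split."""
        i = bisect.bisect_left(node.keys, key)   # S <= Ki: follow P(i-1)
        if node.children:
            split = self._insert(node.children[i], key)
            if split is None:
                return None
            middle, right = split                # discriminator moves up
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
        else:
            node.keys.insert(i, key)             # insert only into leaf nodes
        if len(node.keys) <= 2 * self.k:
            return None                          # enough space: finished
        # Overflow with 2k+1 entries: first k stay, last k move, middle goes up.
        k = self.k
        middle = node.keys[k]
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys = node.keys[:k]
        node.children = node.children[:k + 1]
        return middle, right

    def height(self) -> int:
        h, node = 1, self.root
        while node.children:
            h, node = h + 1, node.children[0]
        return h

tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

Inserting the last key, 3, overflows a leaf and then the root, so the tree ends with the single root key 4 and height 3, as stated on the slide.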

Insertion and Deletion in the B-Tree
Problem
- Insertion can create overflow
- Deletion can create underflow and overflow
Example
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; all four nodes need to be accessed; the result is the same as the input
[Figure: Insert 22]

Deletion in the B-Tree
Example: order k = 1 (n = 2k), delete key 3
[Figure annotations: underflow → merge]

Deletion in the B-Tree (2)
Example: order k = 1 (n = 2k), delete key 3
[Figure annotations: underflow → merge]
Remember: each path from the root to a leaf has the same length h

Deletion in the B-Tree (3)
Example: order k = 1 (n = 2k), delete key 3
[Figure annotations: underflow → merge, overflow → split]

Deletion in the B-Tree (4)
Example: order k = 1 (n = 2k), delete key 3
[Figure annotations: overflow → split, root split]

Deletion Algorithm
Example (there are different algorithms):
- Search the node that contains the key K to be deleted
- If K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If K is in an inner node: pull up a new discriminator from one of the successors
  - Analyze which successor node of K has more elements, the left or the right one; if both have the same number of elements, decide for one
  - Replace K with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)

B-Trees: B+-Trees and B*-Trees
B+-Trees and B*-Trees
- Data is only in leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees
- Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
Example: secondary index, non-unique
[Figure: B-Tree with k = 2, h = 3]

Indexing Low Cardinality Columns
Problem
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
[Figure: TID lists for F and M: TID TID TID TID …]

Bitmap Index
Idea (long history, since the 1960s)
- Create a bitmap/bitlist for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position; sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set
Necessary condition: sequential numbering of the tuples (TIDs)
Example table:
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
Bitmaps for Sex:
F  1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M  0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
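Building the bitmaps is a single pass over the column. The sketch below uses the Sex column of the example table in tuple order; the function name `build_bitmaps` is hypothetical, and plain Python lists stand in for packed bit vectors.

```python
# Sex column of the example table, in sequential tuple (TID) order.
sex_column = ["f", "m", "f", "f", "m", "f", "f", "f",
              "f", "m", "m", "f", "m", "f", "f"]

def build_bitmaps(column):
    """Create one bitmap per distinct attribute value; bit i belongs to tuple i."""
    bitmaps = {value: [0] * len(column) for value in set(column)}
    for position, value in enumerate(column):
        bitmaps[value][position] = 1
    return bitmaps

bitmaps = build_bitmaps(sex_column)
```

Each position is set in exactly one bitmap, so the index needs one bit per tuple and distinct value, which is why it suits columns with few distinct values.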

Querying Bitmap Indexes
Main advantage of bitmap indexes
- Simple and efficient logical combination (conjunction) of predicates possible
- Read only data that is relevant for the predicates
- Example: σ Sex='f' ∧ Region='n' (R), bitmaps B1 and B2 in conjunction:
  for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation for σ Sex='f' ∧ Region='n' ∧ Race='Asian' (R) ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1000 pages
- Bitmap access: 10000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for bitmaps
F              1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N              0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A              0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A  0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
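Both the conjunction and the cost estimate can be reproduced in a few lines. The bitmaps are the ones shown on the slide; the helper name `bitmap_and` and the use of Python lists instead of packed bit vectors are assumptions of this sketch.

```python
import math
from fractions import Fraction

F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]  # Sex = 'f'
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # Region = 'n'
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]  # Race = 'Asian'

def bitmap_and(*bitmaps):
    """Position-wise AND of any number of equally long bitmaps."""
    return [int(all(bits)) for bits in zip(*bitmaps)]

result = bitmap_and(F, N, A)
matching_positions = [i for i, bit in enumerate(result) if bit]

# I/O cost estimation with the numbers from the slide.
selectivity = Fraction(1, 2) * Fraction(1, 8) * Fraction(1, 4)   # = 1/64
n_tuples = 10_000
tuples_per_page = 10
table_scan_pages = n_tuples // tuples_per_page
bitmap_fetch_pages = math.floor(n_tuples * selectivity)  # worst case: one page per tuple
```

Only the tuple at position 2 (0-based) satisfies all three predicates, so only its page has to be fetched; at full scale the estimate is about 156 data pages plus 1 bitmap page against 1000 pages for the table scan.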
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths - Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees: B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes

Reasons for crashes
Transaction error
- Violation of system restrictions: violation of security regulations, excessive resource requirements, deadlocks
- Application-related errors, e.g. wrong operations and values (ROLLBACK)
System error
- System crash with loss of main-memory contents
- Database system, operating system, hardware, power failure
Device error (especially storage-medium error)
- Destruction of secondary storage systems
Catastrophes
- Destruction of the computing center

Guarantee Atomicity & Durability
Assumptions
- The system may crash, but the disk is durable
- The only atomicity guarantee is that a disk block write is atomic
Materialization strategy: the preferred policy is Steal/No Force
- This combination is the most complicated but allows for the highest performance
No Force complicates enforcing durability
- What if the system crashes before a modified page written by a committed transaction makes it to disk?
- Write as little as possible, in a convenient place, at commit time to support REDOing modifications
Steal complicates enforcing atomicity
- What if the transaction that performed updates aborts?
- What if the system crashes before the transaction is finished?
- Must remember the old value of P (to support UNDOing the write to page P)
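How remembered old and new values support UNDO and REDO can be sketched with a toy recovery routine. The assumptions are mine: the log survives the crash, each log record carries the old and the new value of a whole page, and no two transactions touch the same page. The record format and the function name `recover` are hypothetical.

```python
def recover(disk, log):
    """Bring `disk` (page -> value) into a consistent state after a crash.

    log entries: ("update", txn, page, old_value, new_value) or ("commit", txn)
    """
    committed = {entry[1] for entry in log if entry[0] == "commit"}

    # REDO, needed because of No Force: a committed change may not be on disk yet.
    for entry in log:
        if entry[0] == "update" and entry[1] in committed:
            _, _, page, _, new_value = entry
            disk[page] = new_value

    # UNDO, needed because of Steal: an uncommitted change may already be on disk.
    for entry in reversed(log):
        if entry[0] == "update" and entry[1] not in committed:
            _, _, page, old_value, _ = entry
            disk[page] = old_value
    return disk

# T1 committed, but page P was never flushed (No Force).
# T2 did not commit, but its page Q was flushed early (Steal).
log = [("update", "T1", "P", 1, 2),
       ("commit", "T1"),
       ("update", "T2", "Q", 10, 20)]
disk_after_crash = {"P": 1, "Q": 20}
recovered = recover(dict(disk_after_crash), log)
```

After recovery, P carries the committed value 2 (durability) and Q is back at 10 (atomicity), even though neither page was in the right state on disk at the time of the crash.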
| 276
gt
copy A Behrend Foundations of Information Systems |
Guarantee Atomicity amp Durability
Assumptions System may crash but the disk is durable The only atomicity guarantee is that a disk block write is atomic
Materialization strategy Preferred Policy StealNo Force This combination is most complicated but allows for highest performance No Force complicates enforcing Durability What if system crashes before a modified page written by a committed
transaction makes it to disk Write as little as possible in a convenient place at commit time to support
REDOing modifications Steal complicates enforcing Atomicity What if the transaction that performed udpates aborts What if system crashes before transaction is finished Must remember the old value of P (to support UNDOing the write to page P)
copy A Behrend Foundations of Information Systems | 277
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Record Management
| 278
gt
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management
| 279
gt
copy A Behrend Foundations of Information Systems |
Record Addressing
Record address Identifier for records used to address records eg in indexes or query processing Assigned during insert of a record
Goals Stability of identifier Fast and direct access Less organizational overhead
Direct addressing Byte address or position number in file or page Instable Byte address If record grows in length following records would get new address Position number Insert and delete operations change series or records
Indirect addressing Surrogate with mapping table (complete indirection) Tuple Identifier (TID concept)

Surrogate with Mapping Table

Surrogate
- Record type + serial number
- The serial number remains constant during the record's lifetime

Mapping table
- Maps a surrogate to a page

Problems
- Where to store the mapping table?
- How can it be extended?
- How to search the mapping table efficiently?
→ H2: use a B-Tree to store the mapping table

[Figure: mapping table with the columns Surrogate | Page ID]

TID Concept

Record addressing with indirection inside the page
- Each page contains an array with record positions
- The TID of a record consists of the page ID and the index in the position array

Pros
- Access with one page access (two pages in case of overflow)
- Stable
- No mapping table required

Operations
- Insert: reuse an unused position or add a position
- Delete: mark the position as unused in the array
- Update: update all positions in the array
- Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position (no double overflow: update the TID at the original position)

[Figure: record and overflow record]
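
The TID operations can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's implementation: the names Page, RecordManager, PAGE_CAPACITY and FWD_SIZE are hypothetical, pages are tiny (32 bytes), and each slot of the position array holds the record bytes directly instead of a byte offset into the page.

```python
from typing import Dict, List, Optional, Tuple

TID = Tuple[int, int]   # (page id, index in the page's position array)
PAGE_CAPACITY = 32      # hypothetical, tiny page size in bytes
FWD_SIZE = 8            # hypothetical size of a stored forwarding TID


class Page:
    def __init__(self) -> None:
        # Position array; each slot is None (unused), ("rec", bytes)
        # or ("fwd", TID of the overflow record).
        self.slots: List[Optional[tuple]] = []

    def free(self) -> int:
        used = 0
        for slot in self.slots:
            if slot is not None:
                used += len(slot[1]) if slot[0] == "rec" else FWD_SIZE
        return PAGE_CAPACITY - used

    def put(self, entry: tuple) -> int:
        """Reuse an unused position or add a position; return its index."""
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = entry
                return i
        self.slots.append(entry)
        return len(self.slots) - 1


class RecordManager:
    def __init__(self) -> None:
        self.pages: Dict[int, Page] = {}

    def _page_with_space(self, size: int) -> int:
        for pid, page in self.pages.items():
            if page.free() >= size:
                return pid
        pid = len(self.pages)
        self.pages[pid] = Page()
        return pid

    def insert(self, data: bytes) -> TID:
        pid = self._page_with_space(len(data))
        return pid, self.pages[pid].put(("rec", data))

    def get(self, tid: TID) -> bytes:
        kind, value = self.pages[tid[0]].slots[tid[1]]
        if kind == "fwd":                       # second page access
            kind, value = self.pages[value[0]].slots[value[1]]
        return value

    def _clear(self, tid: TID) -> Page:
        page = self.pages[tid[0]]
        kind, value = page.slots[tid[1]]
        if kind == "fwd":                       # drop the overflow record too
            self.pages[value[0]].slots[value[1]] = None
        page.slots[tid[1]] = None               # mark position as unused
        return page

    def delete(self, tid: TID) -> None:
        self._clear(tid)

    def update(self, tid: TID, data: bytes) -> None:
        page = self._clear(tid)
        if page.free() >= len(data):            # still fits into the home page
            page.slots[tid[1]] = ("rec", data)
        else:                                   # overflow: forward from home slot
            opid = self._page_with_space(len(data))
            slot = self.pages[opid].put(("rec", data))
            page.slots[tid[1]] = ("fwd", (opid, slot))
```

Because an update always rewrites the forwarding entry in the home slot, a record that overflows a second time is still reached with at most two page accesses, and its TID never changes.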

Free Space Management

Problem: which page has enough space for a new record?

Solution: a free space table lists, for all pages, how much space is left

Free space value
- Precise value: ceil(log2(page size)) bits ⇒ 2 bytes for a common page size of 4K
- Rough value: use fewer bytes; free space = (value · page size) / 2^(bits per value)

Free space table
- With direct page addressing: assuming a single page can take n free space entries, the first page and each (n+1)-th page hold free space entries
- With indirect page addressing: free space information is stored in the page table
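
The free space values can be worked through in a short Python sketch. The function names are hypothetical, and three details are assumptions rather than part of the slide: the rough value uses 8 bits, it is rounded down and clamped so that free space is never overstated, and pages are numbered from 0 with page 0 as the first free-space page.

```python
def precise_bits(page_size: int) -> int:
    """Bits needed for an exact byte count: ceil(log2(page_size))."""
    return (page_size - 1).bit_length()


def encode_rough(free_bytes: int, page_size: int, bits: int = 8) -> int:
    """Scale free_bytes down to a `bits`-wide value (rounded down, clamped)."""
    value = free_bytes * (1 << bits) // page_size
    return min(value, (1 << bits) - 1)


def decode_rough(value: int, page_size: int, bits: int = 8) -> int:
    """free space = (value * page size) / 2^(bits per value)."""
    return value * page_size // (1 << bits)


def free_space_entry(page_no: int, n: int) -> tuple:
    """Locate the free-space entry of data page `page_no`.

    Assumed layout: pages 0, n+1, 2(n+1), ... hold n entries each,
    describing the n data pages that follow them.
    Returns (page holding the entry, index of the entry in that page).
    """
    block, offset = divmod(page_no, n + 1)
    if offset == 0:
        raise ValueError("page holds free space entries, not records")
    return block * (n + 1), offset - 1
```

With 8 bits and 4K pages each step of the rough value stands for 16 bytes, so 1000 free bytes are stored as 62 and read back as 992.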

Physical Access Paths – Index Structures

Overview Indexes

Table scan
- Read all pages and evaluate the search criteria for each record
- Pre-fetching

Index scan
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of the index attributes
- Logical/physical sorting of the values of the key attributes (depending on the index structure)
- Enforcing uniqueness

Types of indexes
- Primary (clustered) index: determines the physical organization; used for the PK
- Secondary (non-clustered) index: redundant access path

[Figure: primary index and secondary indexes (Age, Salary) on Pers(PID, NAME, AGE, SALARY)]

Overview Indexes (2)

Choice of access paths

Index scan
- Only useful for low selectivity (low number of result tuples)
- Break-even point according to the output ratio of the number of tuples (usually max. 5%)
- Requires statistics about the data
- Additional costs for index storage and updating

Table scan
- Adequate/efficient for small tables (e.g., 5 pages)
- Adequate/efficient for queries with high selectivity (large result sets)
- 100-200 MB/s sequential read, ~100 disk seeks/s

[Figure: access time and hit rate for index scan vs. table scan]

Classification of Index Structures

Classification: one-dimensional index structures
- Key comparison
  - Sequential
    - Sequential lists (physically sequential)
    - Linked lists (logically sequential)
  - Tree-based
    - Binary search trees
    - Multiway trees (example: B-Tree)
    - Prefix trees (tries)
- Key transformation
  - Hash-based
    - Dynamic
    - Static

Multiway trees
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size

B-Tree

[Figure: node layout with entries followed by free space]
- (Ki, Di, Pi) = entry
- Pointers per node: min |P| = k+1, max |P| = 2k+1
- Sub-tree of the leftmost pointer: keys < K1
- Sub-tree of Pi: Ki < keys < Ki+1
- Sub-tree of the rightmost pointer: keys > Kp

B-Tree (2)

Keys
- Agnostic to the specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length

Operations
- Search for the data for a given key value
- Insertion and deletion of a key-data pair

Payload
- Agnostic to the specific data semantics
- Can be a record, a reference (TID), or a mix

[Figure: example B-Tree with k = 2, h = 3]

Search in the B-Tree

Starting at the root node, each node is searched from left to right:
1) If Ki matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which Pi-1 points).
2) If the desired value is smaller than Ki, the search is continued in the root of the sub-tree identified by Pi-1.
3) If the desired value is larger than Ki, the comparison is repeated with Ki+1.
4) If the desired value is also larger than K2k, the search is continued in the sub-tree of P2k.

If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search is aborted: no record with the desired key value exists.

[Figure: example searches for the keys 38, 20, and 6]
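
The search procedure can be sketched as follows. This is an illustrative Python version, not code from the lecture: the Node class and the small example tree are hypothetical (the tree is not the one shown in the lecture figure), and list index i plays the role of the pointer to the left of key Ki.

```python
from typing import Any, List, Optional


class Node:
    def __init__(self, keys: List[int], data: List[Any],
                 children: Optional[List["Node"]] = None) -> None:
        self.keys = keys                  # K1..Kp, sorted
        self.data = data                  # D1..Dp, payload per key
        self.children = children or []    # p+1 pointers; empty in a leaf


def search(root: Node, key: int) -> Optional[Any]:
    node = root
    while True:
        i = 0
        # The key is larger than Ki: compare with the next key in the node.
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and key == node.keys[i]:
            return node.data[i]           # match found
        if not node.children:
            return None                   # leaf reached: no such key
        # Smaller than Ki: follow the pointer left of Ki;
        # larger than all keys: follow the last pointer.
        node = node.children[i]


# Hypothetical tree with k = 1 and two levels.
root = Node([20], ["d20"], [
    Node([6, 12], ["d6", "d12"]),
    Node([38, 45], ["d38", "d45"]),
])
```

Searching this tree for 20 stops at the root, 6 and 38 are found in the leaves, and a key such as 7 ends in a leaf without a match.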

Insertions in the B-Tree (1)

Insertion rule: insert only into leaf nodes

At non-leaf nodes: descend the tree as for the search (S is the key to insert)
- S ≤ Ki: follow Pi-1
- S > Ki: check Ki+1
- S > K2k: follow P2k

At the leaf node
- Insert the data record according to the sort order
- Special case: the leaf node is full (2k records) → split the leaf node

Splitting
- Generate a new leaf node
- Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
- The middle entry (the (k+1)-th) is used as the new "discriminator" (branching) and inserted into the parent node

Insertions in the B-Tree (2)

Node splitting during insertion: two possible situations after a split
- The parent node is full → repeat the split on this level
- Enough space → FINISHED

Special case: root split
- Split of the root node → new root with two successor nodes
- The height of the tree grows by 1
- The tree has been split from the bottom to the top

Dynamic reorganization (self-balancing)
- No unloading or loading necessary
- The tree is always balanced
- But: in case of many insertions/deletions, a reorganization can be beneficial

Insertions in the B-Tree (3)

Insertion example
- Order k = 1 (n = 2k)
- Keys: 1, 5, 2, 6, 7, 4, 8, 3
- Finally: h = 3
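
The example can be replayed with a small Python sketch of insertion with node splitting. It follows the rules of the previous slides (insert into a leaf, split 2k+1 entries, push the middle entry into the parent) but is only an illustration: the class names are hypothetical and the payload Di is omitted, so nodes hold keys only.

```python
from typing import List, Optional, Tuple


class Node:
    def __init__(self, keys: Optional[List[int]] = None,
                 children: Optional[List["Node"]] = None) -> None:
        self.keys: List[int] = keys or []
        self.children: List["Node"] = children or []

    @property
    def is_leaf(self) -> bool:
        return not self.children


class BTree:
    def __init__(self, k: int) -> None:
        self.k = k                      # order: at most 2k keys per node
        self.root = Node()

    def insert(self, key: int) -> None:
        split = self._insert(self.root, key)
        if split is not None:           # root split: height grows by 1
            mid, right = split
            self.root = Node([mid], [self.root, right])

    def _insert(self, node: Node, key: int) -> Optional[Tuple[int, Node]]:
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1                      # S > Ki: check the next key
        if node.is_leaf:
            node.keys.insert(i, key)    # keep the sort order
        else:
            split = self._insert(node.children[i], key)
            if split is None:
                return None             # enough space below: finished
            mid, right = split
            node.keys.insert(i, mid)    # new discriminator from the child
            node.children.insert(i + 1, right)
        if len(node.keys) <= 2 * self.k:
            return None
        # Overflow: 2k+1 entries. First k stay left, last k go right,
        # the middle entry moves up into the parent.
        k = self.k
        mid = node.keys[k]
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys = node.keys[:k]
        node.children = node.children[:k + 1]
        return mid, right

    def height(self) -> int:
        h, node = 1, self.root
        while not node.is_leaf:
            node = node.children[0]
            h += 1
        return h


tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

Inserting the last key 3 overflows a leaf and then the root, which produces the root split and the final height of 3 stated on the slide.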

Insertion and Deletion in the B-Tree

Problem
- Insertion can create overflow
- Deletion can create underflow and overflow

Example
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; need to access all four nodes; finally the same as the input

[Figure: example tree, "Insert 22"]

Deletion in the B-Tree

Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow, merge]

Deletion in the B-Tree (2)

Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow, merge]
Remember: each path from the root to a leaf has the same length h

Deletion in the B-Tree (3)

Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow, merge; overflow, split]

Deletion in the B-Tree (4)

Example: order k = 1 (n = 2k), delete key 3
[Figure: overflow, split; root split]

Deletion Algorithm

Example (there are different algorithms)
- Search the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successors
  - Analyze which successor node of K has more elements, the left or the right one
  - If both have the same number of elements, decide for one
  - Replace the key K to be deleted with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)

Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)

B-Trees, B+-Trees, and B*-Trees

B+-Trees and B*-Trees
- Data is only in the leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes

B*-Trees
- Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges

[Figure: example of a non-unique secondary index; B-Tree with k = 2, h = 3]

Indexing Low Cardinality Columns

Problem
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster

Conclusion
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access

[Figure: index entries F and M, each with a list of TIDs]
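
A rough calculation illustrates why the table scan wins here. The sketch below uses the figures quoted earlier in the lecture (about 100 disk seeks/s, 100 MB/s as the lower end of the sequential read rate); the tuple length of 400 bytes is an assumption for illustration, and the function names are hypothetical.

```python
def index_scan_seconds(result_tuples: int, seeks_per_s: float = 100.0) -> float:
    """Worst case for a secondary index: one random page access per result tuple."""
    return result_tuples / seeks_per_s


def table_scan_seconds(tuples: int, tuple_bytes: int,
                       bandwidth_bytes_per_s: float = 100e6) -> float:
    """Sequential read of the whole table."""
    return tuples * tuple_bytes / bandwidth_bytes_per_s


index_s = index_scan_seconds(500_000)             # all female customers
scan_s = table_scan_seconds(1_000_000, 400)       # assumed 400-byte tuples
```

Under these assumptions the index access takes about 5000 seconds, while scanning the whole 400 MB table takes about 4 seconds.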

Bitmap Index

Idea (long history, since the 1960s)
- Create a bitmap/bitlist for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position; sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set

Necessary condition
- Sequential numbering of the tuples (TIDs)

Bitmaps for the attribute Sex
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0

Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
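
Building such bitmaps takes one pass over the column. The following Python sketch is an illustration only: build_bitmaps is a hypothetical helper, bitmaps are plain lists of 0/1 rather than packed bits, and SEX is the Sex column of the 15 example tuples in table order.

```python
from typing import Dict, List

# Sex column of the 15 example tuples, in table order (position = sequential TID).
SEX = list("fmffmffffmmfmff")


def build_bitmaps(column: List[str]) -> Dict[str, List[int]]:
    """Create one bitmap per attribute value; bit i is 1 if tuple i has that value."""
    bitmaps: Dict[str, List[int]] = {}
    for pos, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[pos] = 1
    return bitmaps


bitmaps = build_bitmaps(SEX)
f_bits = " ".join(map(str, bitmaps["f"]))
m_bits = " ".join(map(str, bitmaps["m"]))
```

The two resulting bit strings are the F and M bitmaps of the example.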

Querying Bitmap Indexes

Main advantage of bitmap indexes
- Simple and efficient logical combination of bitmaps possible
- Read only data that is relevant for the predicates
- Example: σ(Sex='f' ∧ Region='n')(R), bitmaps B1 and B2 in conjunction:
  for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];

Example: I/O cost estimation
- σ(Sex='f' ∧ Region='n' ∧ Race='Asian')(R) ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1,000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for bitmaps

F:             1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N:             0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A:             0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
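
Both the conjunction and the cost estimate can be reproduced with a short Python sketch. The bit strings F, N and A are copied from the slide; bitmap_and and io_estimate are hypothetical helper names, and 4 kB is taken as 4096 bytes.

```python
import math
from typing import List, Tuple

F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]


def bitmap_and(*bitmaps: List[int]) -> List[int]:
    """Bitwise AND of equally long bitmaps."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


def io_estimate(n_tuples: int, tuple_bytes: int, page_bytes: int,
                selectivity: float, bitmap_pages: int = 1) -> Tuple[int, int]:
    """Return (pages read by a table scan, worst-case pages read via bitmaps)."""
    tuples_per_page = page_bytes // tuple_bytes
    table_scan_pages = math.ceil(n_tuples / tuples_per_page)
    matches = int(n_tuples * selectivity)      # each match on its own page
    return table_scan_pages, matches + bitmap_pages


result = bitmap_and(F, N, A)
positions = [i for i, bit in enumerate(result) if bit]
scan_pages, bitmap_access_pages = io_estimate(10_000, 400, 4096, 1/2 * 1/8 * 1/4)
```

Only the third tuple (position 2) survives the conjunction, and the estimate comes to 1000 pages for the table scan versus 157 pages (156 data pages plus 1 bitmap page) for the bitmap access.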
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths – Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees, and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
| 278
gt
copy A Behrend Foundations of Information Systems |
Record
Record Package of fields that together describe a thing a person a fact etc Each fields represents on property of the entity described by the record Similar to a struct in C Variable length (in contrast to pages)
Record Manager Organizes physical storage of records in pages Operations Get Insert Update Delete Scan Agnostic to record structure and semantic
records considered as byte strings of variable length Structure and content of record is defined be Access System and application
Challenges Record addressing Free space management
| 279
gt
copy A Behrend Foundations of Information Systems |
Record Addressing
Record address Identifier for records used to address records eg in indexes or query processing Assigned during insert of a record
Goals Stability of identifier Fast and direct access Less organizational overhead
Direct addressing Byte address or position number in file or page Instable Byte address If record grows in length following records would get new address Position number Insert and delete operations change series or records
Indirect addressing Surrogate with mapping table (complete indirection) Tuple Identifier (TID concept)
| 280
gt
copy A Behrend Foundations of Information Systems |
Surrogate with Mapping Table
Surrogate Record type + serial number Serial number remains constant during recordrsquos life time
Mapping table Maps
surrogate to page
Problems Where to store mapping table How can it be extended How to search mapping table efficiently
rarr H2 use B-Tree to store mapping table
Mapping Table Surrogate | Page ID
| 281
gt
copy A Behrend Foundations of Information Systems |
TID Concept
Record addressing with indirection inside the page Each page contains an array with record positions TID of a record consist of page id and index in position array
Pros Access with one page access (two pages in case of overflow) Stable No mapping table required
Operations Insert Reuse unused
position or add position Delete Mark position
as unused in array Update Update all
positions in array Update with overflow Store record
as overflow record and store TID of overflow record at original position (No double overflow Update TID at original position)
Record
Overflow Record
| 282
gt
copy A Behrend Foundations of Information Systems |
Free Space Management
Problem In which page is enough space for new record
Solution Free space table lists for all pages how much space is left
Free space value Precise value Ceil(Log2(page size)) =gt 2 bytes for common page size of 4K Rough value use less bytes free space = (value page size)2^(bits per value)
Free space table With direct page addressing Assuming a single page can take n free space entries First page and each (n+1)-th page takes free space entries
With indirect page addressing Free space information stored in page table
copy A Behrend Foundations of Information Systems | 283
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Physical Access Paths ndash Index Structures
| 284
gt
copy A Behrend Foundations of Information Systems |
Primary Index Secondary Index
Overview Indexes
Table scan Read all pages and for each record
evaluate the search criteria Pre-fetching
Index Scan Use index for search criteria
on one or more attributes Fast access to single values or value ranges of index attributes Logicalphysical sorting of values of key attributes (depending on index structure) Enforcing uniqueness
Types if indexes Primary (Clustered) Index
determines physical organization use for PK Secondary (Non-Clustered)
Index redundant access path
Pers(PID NAME AGE SALARY)
Age
Salary
| 285
gt
copy A Behrend Foundations of Information Systems |
Overview Indexes (2)
Choice of Access Paths
- Index scan
  - Only useful for low selectivity (here: a low ratio of result tuples to input tuples)
  - The break-even point depends on the ratio of result tuples to the total number of tuples (usually max. 5%)
  - Requires statistics about the data
  - Additional costs for index storage and updating
- Table scan
  - Adequate/efficient for small tables (e.g., 5 pages)
  - Queries with high selectivity (large result sets)
  - 100–200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time as a function of the hit rate for index scan and table scan]

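The trade-off between the two access paths can be sketched with a rough cost model. This is only an illustration, not the cost model of any real optimizer: it uses the ballpark device figures from the slide, assumes a page size of 4096 bytes, and assumes the worst case for a secondary index (one random page access per result tuple, no caching). All function names are hypothetical.

```python
# Ballpark figures from the slide; the page size is an assumption.
SEQ_READ_BYTES_PER_S = 100 * 1024 * 1024   # lower end of 100-200 MB/s
SEEKS_PER_S = 100                          # ~100 random disk seeks per second
PAGE_SIZE = 4096                           # assumed page size in bytes


def table_scan_seconds(num_pages: int) -> float:
    """Time to read every page of the table sequentially."""
    return num_pages * PAGE_SIZE / SEQ_READ_BYTES_PER_S


def index_scan_seconds(num_tuples: int, hit_rate: float) -> float:
    """Worst case: one random page access per result tuple."""
    return num_tuples * hit_rate / SEEKS_PER_S


def cheaper_access_path(num_tuples: int, num_pages: int, hit_rate: float) -> str:
    """Pick the access path with the lower estimated time under this model."""
    if index_scan_seconds(num_tuples, hit_rate) < table_scan_seconds(num_pages):
        return "index scan"
    return "table scan"
```

Because the model ignores buffering and clustering, it favors the table scan much earlier than a real system would; it is meant to show why statistics about the hit rate are needed, not to predict actual run times.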
Classification of Index Structures
One-dimensional index structures
- Key comparison
  - Sequential
    - Sequential lists (physically sequential)
    - Linked lists (logically sequential)
  - Tree-based
    - Binary search trees
    - Multiway trees (example: B-tree)
    - Prefix trees (tries)
- Key transformation
  - Hash-based: static, dynamic
Multiway trees
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size

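B-Tree
[Figure: node layout P0 (K1, D1, P1) (K2, D2, P2) … (Kp, Dp, Pp), followed by free space]
- Entry = (Ki, Di, Pi)
- Number of pointers per node: min |P| = k+1, max |P| = 2k+1
- Sub-tree of P0: keys < K1; sub-tree of Pi: Ki < keys < Ki+1; sub-tree of Pp: keys > Kp
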
B-Tree (2)
- Keys
  - Agnostic to specific key semantics
  - Only a defined total order is required
  - Can be of fixed or variable length
- Payload
  - Agnostic to specific data semantics
  - Can be a record, a reference (TID), or a mix
- Operations
  - Search for the data of a given key value
  - Insertion and deletion of a key-data pair
[Figure: example B-tree with k = 2, h = 3]

Search in the B-Tree
Starting at the root node, each node is searched from left to right:
1) If Ki matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which Pi-1 points).
2) If the desired value is smaller than Ki, the search continues in the root of the sub-tree identified by Pi-1.
3) If the desired value is larger than Ki, the comparison is repeated with Ki+1.
4) If the desired value is also larger than K2k, the search continues in the sub-tree of P2k.
If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search is aborted: no record with the desired key value exists.
[Figure: example searches for the keys 38, 20, and 6]

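The search steps can be sketched as follows. This is a minimal in-memory illustration, not a page-based implementation; the `Node` class and its field names are assumptions made for the example, and the small tree in the usage note is made up (it is not the tree from the slide's figure).

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Node:
    keys: List[int]                                        # K1..Kp in sort order
    data: List[Any]                                        # D1..Dp, one per key
    children: List["Node"] = field(default_factory=list)   # P0..Pp, empty for a leaf


def search(root: Node, key: int) -> Optional[Any]:
    """Return the data stored under key, or None if the key is absent."""
    node = root
    while True:
        i = 0
        # Case 3: desired value larger than Ki, so compare with the next key.
        while i < len(node.keys) and node.keys[i] < key:
            i += 1
        # Case 1: Ki matches the desired key value.
        if i < len(node.keys) and node.keys[i] == key:
            return node.data[i]
        # Leaf reached without a match: the search is aborted.
        if not node.children:
            return None
        # Cases 2 and 4: descend into the sub-tree left of Ki (or the last one).
        node = node.children[i]
```

For example, with `root = Node([20], ["d20"], [Node([6, 10], ["d6", "d10"]), Node([38, 40], ["d38", "d40"])])`, calling `search(root, 38)` descends into the right leaf, while `search(root, 7)` ends in the left leaf without a match.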
Insertions in the B-Tree (1)
Insertion rule: insert only into leaf nodes
- At non-leaf nodes: descend the tree as for the search (S is the key to insert)
  - S ≤ Ki: follow Pi-1
  - S > Ki: check Ki+1
  - S > K2k: follow P2k
- At the leaf node: insert the data record according to the sort order
  - Special case: leaf node is full (2k records) → split the leaf node
- Splitting
  - Generate a new leaf node
  - Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
  - The middle entry (the (k+1)-th) is used as the new "discriminator" (branching key) and inserted into the parent node

Insertions in the B-Tree (2)
Node splitting during insertion
- Two possible situations after a split
  - The parent node is full → repeat the split on this level
  - Enough space → finished
- Special case: root split
  - Split of the root node → new root with two successor nodes
  - Height of the tree grows by 1
  - The tree has been split from the bottom to the top
- Dynamic reorganization (self-balancing)
  - No unloading or loading necessary
  - The tree is always balanced
  - But: after many insertions and deletions, a reorganization can be beneficial

Insertions in the B-Tree (3)
Insertion example
- Order k = 1 (at most n = 2k = 2 entries per node)
- Keys inserted in this order: 1, 5, 2, 6, 7, 4, 8, 3
- The final tree has height h = 3

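The insertion rule and the split can be tried out with a small in-memory sketch. It stores keys only (no payload), and the `Node` class and function names are assumptions made for this example; a real B-tree would work on pages. Replaying the key sequence of the example with k = 1 yields a tree of height 3.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)   # empty for a leaf


def _insert(node: Node, key: int, k: int) -> Optional[Tuple[int, Node]]:
    """Insert below node; return (discriminator, new right node) if node was split."""
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if not node.children:
        node.keys.insert(i, key)                 # leaf: insert in sort order
    else:
        split = _insert(node.children[i], key, k)   # descend as for the search
        if split is None:
            return None
        mid, right = split
        node.keys.insert(i, mid)                 # discriminator moves into the parent
        node.children.insert(i + 1, right)
    if len(node.keys) <= 2 * k:
        return None                              # enough space: finished
    # Overflow with 2k+1 entries: first k stay, last k move, the middle one goes up.
    mid = node.keys[k]
    right = Node(node.keys[k + 1:], node.children[k + 1:])
    node.keys = node.keys[:k]
    node.children = node.children[:k + 1]
    return mid, right


def insert(root: Node, key: int, k: int) -> Node:
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key, k)
    if split is None:
        return root
    mid, right = split
    return Node([mid], [root, right])            # root split: height grows by 1


def height(node: Node) -> int:
    h = 1
    while node.children:
        node = node.children[0]
        h += 1
    return h


root = Node()
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    root = insert(root, key, k=1)
print(height(root), root.keys)
```

The last insertion (key 3) overflows a leaf, the resulting discriminator overflows the root, and the root split creates the third level.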
Insertion and Deletion in the B-Tree
Problem
- Insertion can create overflow
- Deletion can create underflow and overflow
Example
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; all four nodes need to be accessed, and the final tree is the same as the input
[Figure: B-tree before and after "Insert 22"]

Deletion in the B-Tree
Example: order k = 1 (n = 2k), delete key 3
[Figure: underflow → merge]

Deletion in the B-Tree (2)
[Figure: underflow → merge]
Remember: each path from the root to a leaf has the same length h

Deletion in the B-Tree (3)
[Figure: underflow → merge, then overflow → split]

Deletion in the B-Tree (4)
[Figure: overflow → split, then root split]

Deletion Algorithm
Example (there are different algorithms)
- Search the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successors
  - Analyze which successor node of K has more elements, the left or the right one
  - If both have the same number of elements, decide for one
  - Replace the key K to be deleted with its direct predecessor K′ from the left successor node or with its direct successor K″ from the right successor node, respectively
  - Delete K′ or K″ from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)

B-Trees, B+-Trees and B*-Trees
- B+-Trees and B*-Trees
  - Data is only in leaf nodes
  - Key redundancy, but higher fan-out → lower tree height, less I/O
  - Simpler delete procedure → requires only merging of nodes
  - Doubly linked list of all leaf nodes
- B*-Trees
  - Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
[Figure: example tree with k = 2, h = 3, used as a non-unique secondary index]

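One benefit of the linked leaf level is that a range query only has to find the first relevant leaf and can then follow the links. The sketch below illustrates this under simplifying assumptions: the descent from the root is left out, only the forward link of the doubly linked list is modeled, and the `Leaf` class with its fields is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Leaf:
    keys: List[int]                  # sorted keys of this leaf
    tids: List[str]                  # one reference per key
    next: Optional["Leaf"] = None    # forward link to the right neighbor


def range_scan(start: Leaf, low: int, high: int) -> Iterator[Tuple[int, str]]:
    """Yield all (key, tid) pairs with low <= key <= high, starting at leaf start."""
    leaf: Optional[Leaf] = start
    while leaf is not None:
        for key, tid in zip(leaf.keys, leaf.tids):
            if key > high:
                return               # keys are sorted: nothing further can match
            if key >= low:
                yield key, tid
        leaf = leaf.next             # follow the leaf chain, no access to inner nodes
```

A usage example with three made-up leaves: `l3 = Leaf([9, 11], ["t9", "t11"])`, `l2 = Leaf([5, 7], ["t5", "t7"], l3)`, `l1 = Leaf([1, 3], ["t1", "t3"], l2)`; `list(range_scan(l1, 3, 9))` then visits all three leaves in order.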
Indexing Low Cardinality Columns
- Problem (example)
  - A B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
  - A query for all female customers requires 500,000 random page accesses (secondary index)
  - A table scan would be much faster
- Conclusion
  - B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
  - Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
[Figure: index entries F and M, each pointing to a long TID list]

Bitmap Index
- Idea (long history, since the 1960s)
  - Create a bitmap/bitlist for each attribute value
  - Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
  - Bit values: 1 = attribute value set, 0 = attribute value not set
- Necessary condition: sequential numbering of the tuples (TIDs)

Example table with the bitmaps for the attribute Sex:

Name    Sex  Region  Race   | F  M
Carol   f    n       white  | 1  0
Harold  m    e       black  | 0  1
Anne    f    e       asian  | 1  0
Iris    f    ne      white  | 1  0
…       m    se      hisp   | 0  1
…       f    e       white  | 1  0
…       f    sw      asian  | 1  0
…       f    w       black  | 1  0
…       f    n       asian  | 1  0
…       m    e       hisp   | 0  1
…       m    se      black  | 0  1
…       f    s       white  | 1  0
…       m    nw      black  | 0  1
…       f    s       white  | 1  0
…       f    w       black  | 1  0

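Building the bitmaps amounts to one pass over the column. The following sketch uses the Sex column of the example table; the function name is hypothetical and the bitmaps are plain Python lists rather than packed bit arrays.

```python
from typing import Dict, List


def build_bitmaps(column: List[str]) -> Dict[str, List[int]]:
    """Create one bitmap per distinct value; bit i belongs to tuple i."""
    bitmaps: Dict[str, List[int]] = {value: [0] * len(column) for value in set(column)}
    for position, value in enumerate(column):
        bitmaps[value][position] = 1   # set the bit of this tuple in its value's bitmap
    return bitmaps


# Sex column of the 15 example tuples, in tuple order.
sex = "f m f f m f f f f m m f m f f".split()
bitmaps = build_bitmaps(sex)
print(bitmaps["f"])
print(bitmaps["m"])
```

Every position has exactly one bit set across all bitmaps of an attribute, which is why the M bitmap is the complement of the F bitmap here.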
Querying Bitmap Indexes
- Main advantage of bitmap indexes
  - Simple and efficient logical combination of bitmaps is possible
  - Read only the data that is relevant for the predicates
  - Example: σ Sex='f' ∧ Region='n' (R)
    - Bitmaps B1 and B2 in conjunction: for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
- Example: I/O cost estimation for σ Sex='f' ∧ Region='n' ∧ Race='Asian' (R) ("Asian women of region North")
  - Selectivity: 1/2 · 1/8 · 1/4 = 1/64
  - N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
  - Table scan: 1000 pages
  - Bitmap access: 10000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps

F             1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N             0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A             0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
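
The conjunction and the cost estimate can be reproduced with a few lines. This is a sketch under the slide's assumptions (10,000 tuples, 10 tuples per page, worst case of one page per matching tuple, one page for the bitmaps); the helper names are hypothetical, and the bitmaps are packed into Python integers so that `&` combines all positions at once.

```python
def to_int(bits: str) -> int:
    """Pack a bit string such as '1 0 1' into an integer (leftmost bit = tuple 0)."""
    return int(bits.replace(" ", ""), 2)


F = "1 0 1 1 0 1 1 1 1 0 0 1 0 1 1"   # Sex = 'f'
N = "0 1 1 0 0 1 0 0 0 1 0 0 0 0 0"   # Region = 'n'
A = "0 0 1 0 0 0 1 0 1 0 0 0 0 0 0"   # Race = 'Asian'

# Bitwise AND of the three bitmaps, rendered again as a 15-character bit string.
result = format(to_int(F) & to_int(N) & to_int(A), "015b")
matching_positions = [i for i, bit in enumerate(result) if bit == "1"]


def bitmap_access_pages(num_tuples: int, selectivity: float, bitmap_pages: int = 1) -> int:
    """Worst case: each matching tuple lies in a different page, plus the bitmap pages."""
    return round(num_tuples * selectivity) + bitmap_pages


table_scan_pages = 10_000 // 10                       # ~10 tuples per page
bitmap_pages = bitmap_access_pages(10_000, 1 / 2 * 1 / 8 * 1 / 4)
print(result, matching_positions, table_scan_pages, bitmap_pages)
```

Only the tuple at position 2 satisfies all three predicates in the example bitmaps, and the estimate of about 157 page accesses compares with 1000 pages for the table scan.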
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths – Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
Record Addressing
- Record address
  - Identifier for records
  - Used to address records, e.g., in indexes or during query processing
  - Assigned during the insert of a record
- Goals
  - Stability of the identifier
  - Fast and direct access
  - Little organizational overhead
- Direct addressing
  - Byte address or position number in file or page
  - Unstable
    - Byte address: if a record grows in length, the following records get new addresses
    - Position number: insert and delete operations change the sequence of records
- Indirect addressing
  - Surrogate with mapping table (complete indirection)
  - Tuple identifier (TID concept)

Surrogate with Mapping Table
- Surrogate
  - Record type + serial number
  - The serial number remains constant during the record's lifetime
- Mapping table: maps surrogate to page
- Problems
  - Where to store the mapping table?
  - How can it be extended?
  - How to search the mapping table efficiently?
  → H2: uses a B-tree to store the mapping table
[Figure: mapping table with the columns Surrogate | Page ID]

TID Concept
- Record addressing with indirection inside the page
  - Each page contains an array with record positions
  - The TID of a record consists of the page id and the index in the position array
- Pros
  - Access with one page access (two pages in case of overflow)
  - Stable
  - No mapping table required
- Operations
  - Insert: reuse an unused position or add a position
  - Delete: mark the position as unused in the array
  - Update: update all positions in the array
  - Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position (no double overflow: update the TID at the original position)
[Figure: page with position array, a record, and an overflow record]

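The TID operations can be sketched with a simplified in-memory model. It abstracts from byte offsets: each entry of the position array directly holds the record, a forwarding TID, or a marker for "unused". The class `RecordStore`, its methods, and the page capacity are assumptions made for this illustration, and records larger than a page are not handled.

```python
from typing import List, Optional, Tuple, Union

TID = Tuple[int, int]               # (page id, index in the position array)
Slot = Union[bytes, TID, None]      # record, forwarding TID, or unused position


class RecordStore:
    def __init__(self, page_capacity: int) -> None:
        self.page_capacity = page_capacity    # hypothetical payload bytes per page
        self.pages: List[List[Slot]] = []

    def _free(self, page_id: int) -> int:
        used = sum(len(s) for s in self.pages[page_id] if isinstance(s, bytes))
        return self.page_capacity - used

    def insert(self, record: bytes) -> TID:
        for page_id in range(len(self.pages)):
            if self._free(page_id) >= len(record):
                break
        else:                                 # no page has room: add a new page
            self.pages.append([])
            page_id = len(self.pages) - 1
        slots = self.pages[page_id]
        if None in slots:                     # reuse an unused position
            index = slots.index(None)
            slots[index] = record
        else:                                 # or add a position
            slots.append(record)
            index = len(slots) - 1
        return (page_id, index)

    def read(self, tid: TID) -> Optional[bytes]:
        page_id, index = tid
        entry = self.pages[page_id][index]
        if isinstance(entry, tuple):          # overflow: second page access
            page_id, index = entry
            entry = self.pages[page_id][index]
        return entry

    def delete(self, tid: TID) -> None:
        page_id, index = tid
        entry = self.pages[page_id][index]
        if isinstance(entry, tuple):          # also release the overflow record
            self.pages[entry[0]][entry[1]] = None
        self.pages[page_id][index] = None     # mark the position as unused

    def update(self, tid: TID, record: bytes) -> None:
        page_id, index = tid
        self.delete(tid)                      # drop old record and any overflow copy
        if self._free(page_id) >= len(record):
            self.pages[page_id][index] = record
        else:                                 # store elsewhere, keep the TID stable
            self.pages[page_id][index] = self.insert(record)
```

Because an update always releases the old overflow record before placing the new one, the original position points either to the record itself or to exactly one overflow record, so a read needs at most two page accesses.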
Free Space Management
- Problem: which page has enough space for a new record?
- Solution: a free space table lists, for all pages, how much space is left
- Free space value
  - Precise value: ceil(log2(page size)) bits ⇒ 2 bytes for the common page size of 4K
  - Rough value: use fewer bytes; free space = (value × page size) / 2^(bits per value)
- Free space table
  - With direct page addressing: assuming a single page can hold n free space entries, the first page and each (n+1)-th page hold free space entries
  - With indirect page addressing: free space information is stored in the page table
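
The two kinds of free space values can be illustrated with a short calculation. The function names are hypothetical; rounding the rough value down (so that the table never promises more space than a page has) and clamping it to the largest representable value are design choices of this sketch, not something the slide prescribes.

```python
import math


def precise_bits(page_size: int) -> int:
    """Bits for a precise free space value, following ceil(log2(page size))."""
    return math.ceil(math.log2(page_size))


def encode_rough(free_bytes: int, page_size: int, bits: int) -> int:
    """Map free bytes to a rough value with the given number of bits (rounded down)."""
    value = free_bytes * 2**bits // page_size
    return min(value, 2**bits - 1)            # clamp: a fully empty page would not fit


def decode_rough(value: int, page_size: int, bits: int) -> int:
    """free space = (value * page size) / 2^(bits per value)"""
    return value * page_size // 2**bits


print(precise_bits(4096))                     # 12 bits, i.e. 2 bytes per page
print(encode_rough(1000, 4096, 8))            # rough value for 1000 free bytes
print(decode_rough(62, 4096, 8))              # free space guaranteed by that value
```

With 8 bits per value and 4K pages, each step of the rough value stands for 16 bytes, so the table entry for a page shrinks from 2 bytes to 1 byte at the cost of underestimating the free space by up to 15 bytes.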
copy A Behrend Foundations of Information Systems | 283
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Physical Access Paths ndash Index Structures
| 284
gt
copy A Behrend Foundations of Information Systems |
Primary Index Secondary Index
Overview Indexes
Table scan Read all pages and for each record
evaluate the search criteria Pre-fetching
Index Scan Use index for search criteria
on one or more attributes Fast access to single values or value ranges of index attributes Logicalphysical sorting of values of key attributes (depending on index structure) Enforcing uniqueness
Types if indexes Primary (Clustered) Index
determines physical organization use for PK Secondary (Non-Clustered)
Index redundant access path
Pers(PID NAME AGE SALARY)
Age
Salary
| 285
gt
copy A Behrend Foundations of Information Systems |
Overview Indexes (2)
Choice of Access Paths Index scan Only useful for low selectivity
(low number of result tuples) Break even-point according to the
output ratio of the number of tuples (usually max 5) Requires statistics about data Additional costs for index storage
and updating
Table Scan adequateefficient for small tables
(eg 5 pages) Queries with high selectivity
(large result sets) 100-200MBs sequential read
~ 100 disk seekss
hit rate
Index Scan
Table Scan
access time
| 286
gt
copy A Behrend Foundations of Information Systems |
Classification of Index Structures
Classification
Multiway Trees Tree structure with multiple children per node Idea chose fan out so that node size suits page size
Onedimensional Index Structures
Key Comparison Key Transformation
Sequential Tree-Based Hash-Based
Prefix Trees (Tries)
Binary Search Trees Dynamic Static Linked Lists
(log seq) Seq Lists
(phys seq) Multiway
Trees Example B-Tree
| 287
gt
copy A Behrend Foundations of Information Systems |
B-Tree
free space
(Ki Di Pi) = entry min |P| = k+1 max |P| = 2k+1
keys lt K1 keys gt Kp Ki lt keys lt Ki+1
| 288
gt
copy A Behrend Foundations of Information Systems |
B-Tree (2)
Example Keys Agnostic to specific key semantic Only defined complete order required Could be of fixed or variable length
Operations Search for data for given key value Insertion and deletion of key-data pair
Payload Agnostic to specific data semantic Can be record or reference (TID) or
mix
B-Tree with k = 2 h = 3
| 289
gt
copy A Behrend Foundations of Information Systems |
Search in the B-Tree
Starting at the root node each node is searched from left to right 1) if Ki matches the desired key value the data record has been found (further
records with the same key value might be located in a sub-tree to which Pi-1 points) 2) if Ki is smaller than the desired value the search will be continued in the root of
the sub-tree identified by Pi-1 3) if Ki is larger than the desired value the comparison with Ki+1 is repeated 4) if K2k is also smaller than the desired value the search will be continued in the
sub-tree of P2k If itlsquos impossible to descend further into a sub-tree (2 or 4) (leaf node) The search is aborted no record with the desired key value is found
Search for 38 20 6
| 290
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (1)
Insertion Rule insert only into leaf nodes At Non-Leaf Nodes descend down the tree as for the search S le Ki follow Pi-1 S gt Ki check Ki+1 S gt K2k follow P2k
At Leaf Node Insert the data record according to the sorting order Special case leaf node is full (2k records) rarr split the leaf node
Splitting Generate a new leaf node Split the 2k+1 entries (in order)
into two leaf nodes first k entries rarr left node last k entries rarr right node
middle entry (k+1-th) is used as new ldquodiscriminatorrdquo (branching) and inserted into the parent node
| 291
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (2)
Node Splitting during Insertion Two possible situations after a split The parent node is full rarr repeat split on this level Enough space rarr FINISHED
Special case root split Split of the root node rarr New root with two successor nodes Height of a tree grows by 1 The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing) No unloading or loading necessary Tree is always balanced But In case of many insertions deletions reorganization can be beneficial
| 292
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (3)
Insertion Example Order k = 1 n=2k Keys 1 5 2 6 7 4 8 3
Finally h=3
| 293
gt
copy A Behrend Foundations of Information Systems |
Insertion and Deletion in the B-Tree
Problem Insertion can create overflow Deletion can create underflow and overflow
Example Insertion of key 22
rarr Overflow rarr Split
Deletion of key 22 rarr Underflow need to access all four nodes finally same as input
Insert 22
| 294
gt
copy A Behrend Foundations of Information Systems |
Underflow Merge
Deletion in the B-Tree
Example Order k = 1 n=2k Delete key 3
| 295
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (2)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Remember Each path from the root to the leaf has the same length h
| 296
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (3)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Overflow Split
| 297
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (4)
Example Order k = 1 n=2k Delete key 3
Overflow Split
Root Split
| 298
gt
copy A Behrend Foundations of Information Systems |
Deletion Algorithm
Example ndash there are different algorithms Search the node that contains the key K to be deleted If key K is in a leaf node delete the key in the leaf node and handle potentially
resulting underflow by merging with sibling If key K is in an inner node pull up new discriminator from one of the successors Analyze which successor node of K has more elements left or right one
If both have the same number of elements decide for one Replace the key K to be deleted with the direct successor Krsquo from the left
successor node or with the direct successor Krsquorsquo from the right successor node respectively Delete Krsquo or Krsquorsquo from the respective successor node (recursively)
Note Major variants Merge (tis lecture) Re-distribution (instead of splitmerge in case of overflowunderflow the entries are
re-distributed under consideration of one or multiple adjacent nodes)
| 299
gt
copy A Behrend Foundations of Information Systems |
B-Trees B+-Trees and B-Trees
B+-Trees and B-Trees Data is only in leaf nodes Key redundancy but higher fan-out rarr lower tree high less IO Simpler delete procedure rarr requires only merging of nodes
Double linked list of all leaf nodes B-Trees Modified valid node sizes
from [k2k] to [43k2k] rarr better node utilization but more splitsmerges
Example Secondary index Non unique
B-Tree with k = 2 h = 3

Indexing Low Cardinality Columns
Problem:
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each.
- A query for all female customers requires 500,000 random page accesses (secondary index).
- A table scan would be much faster.
Conclusion:
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio), i.e. few result tuples.
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access.
[Figure: one TID list per value: F → TID, TID, TID, TID, …; M → TID, TID, TID, TID, …]

Bitmap Index
Idea (long history, since the 1960s):
- Create a bitmap/bitlist for each attribute value.
- Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID).
- Bit values: 1 = attribute value set, 0 = attribute value not set.
Necessary condition: sequential numbering of the tuples (TIDs)

Example table:
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black

Bitmaps for attribute Sex:
F  1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M  0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
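
How the bitmaps follow from the column values can be sketched as below. This is a minimal Python sketch: `build_bitmap_index` is a hypothetical helper name, bitmaps are plain Python lists instead of packed bits, and the list position stands in for the sequential TID.

```python
def build_bitmap_index(column):
    """Return {attribute value: bitlist}, one bit per tuple position."""
    index = {}
    for pos, value in enumerate(column):
        bits = index.setdefault(value, [0] * len(column))
        bits[pos] = 1
    return index


# Sex column of the 15 example tuples, in table order.
sex = list("fmffmffffmmfmff")
index = build_bitmap_index(sex)
print(index["f"])  # [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
print(index["m"])  # [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0]
```

Because every tuple has exactly one value, exactly one bitmap has a 1 at each position.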

Querying Bitmap Indexes
Main advantage of bitmap indexes:
- Simple and efficient logical combination of predicates is possible
- Read only the data that is relevant for the predicates
- Example: σ(Sex='f' ∧ Region='n') R, with bitmaps B1 and B2 in conjunction: for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation for σ(Sex='f' ∧ Region='n' ∧ Race='Asian') R ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1,000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps

F    1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N    0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A    0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
AND  0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
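
The conjunction and the cost estimate can be reproduced with a short sketch. It is a minimal Python sketch under stated assumptions: the bitmaps F, N and A are copied as they appear on the slide, `bitmap_and` and `io_pages` are hypothetical helper names, a 4 kB page is taken as 4096 bytes, and each bitmap is assumed to need one bit per tuple.

```python
import math

F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]


def bitmap_and(*bitmaps):
    """Bitwise AND of equally long bitlists."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


def io_pages(n_tuples, tuple_bytes, page_bytes, selectivity, n_bitmaps):
    """Return (table scan pages, bitmap access pages) for the estimate."""
    tuples_per_page = page_bytes // tuple_bytes
    scan = math.ceil(n_tuples / tuples_per_page)
    hits = int(n_tuples * selectivity)          # worst case: one page per hit
    bitmap_bytes = n_bitmaps * math.ceil(n_tuples / 8)
    return scan, hits + math.ceil(bitmap_bytes / page_bytes)


result = bitmap_and(F, N, A)
print([pos for pos, bit in enumerate(result) if bit])   # [2]
print(io_pages(10_000, 400, 4096, 1/2 * 1/8 * 1/4, 3))  # (1000, 157)
```

Only the tuple at position 2 (counting from 0) satisfies all three predicates, and the three bitmaps together occupy 3,750 bytes, which is why a single extra page suffices in the estimate.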
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths - Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees, and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes

TID Concept
Record addressing with indirection inside the page:
- Each page contains an array with record positions.
- The TID of a record consists of the page id and the index in the position array.
Pros:
- Access with one page access (two pages in case of overflow)
- Stable
- No mapping table required
Operations:
- Insert: reuse an unused position or add a position.
- Delete: mark the position as unused in the array.
- Update: update the positions in the array (records may move within the page).
- Update with overflow: store the record as an overflow record and store the TID of the overflow record at the original position. (No double overflow: if the record moves again, update the TID at the original position.)
[Figure labels: Record, Overflow Record]
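
The operations above can be sketched as a toy in-memory store. This is a minimal Python sketch, not a real page layout: `TID`, `Page` and `Store` are hypothetical names, records are `bytes` objects kept directly in the position array, the page capacity is an invented figure, and a failed update is not rolled back.

```python
from typing import NamedTuple


class TID(NamedTuple):
    page_id: int
    slot: int          # index in the page's position array


class Page:
    def __init__(self, capacity):
        self.capacity = capacity   # usable bytes (hypothetical figure)
        self.slots = []            # entry: None (unused) | bytes | TID (forward)

    def free(self):
        used = sum(len(s) for s in self.slots if isinstance(s, bytes))
        return self.capacity - used

    def insert(self, record):
        """Reuse an unused position or add a new one; return its index."""
        for i, entry in enumerate(self.slots):
            if entry is None:
                self.slots[i] = record
                return i
        self.slots.append(record)
        return len(self.slots) - 1


class Store:
    def __init__(self, n_pages, capacity):
        self.pages = [Page(capacity) for _ in range(n_pages)]

    def _place(self, record, skip=None):
        for pid, page in enumerate(self.pages):
            if pid != skip and page.free() >= len(record):
                return TID(pid, page.insert(record))
        raise MemoryError("no page has enough free space")

    def insert(self, record):
        return self._place(record)

    def read(self, tid):
        entry = self.pages[tid.page_id].slots[tid.slot]
        if isinstance(entry, TID):     # overflow: second page access
            entry = self.pages[entry.page_id].slots[entry.slot]
        return entry

    def delete(self, tid):
        home = self.pages[tid.page_id]
        entry = home.slots[tid.slot]
        if isinstance(entry, TID):     # also release the overflow record
            self.pages[entry.page_id].slots[entry.slot] = None
        home.slots[tid.slot] = None    # mark position as unused

    def update(self, tid, record):
        self.delete(tid)               # frees home slot and any overflow copy
        home = self.pages[tid.page_id]
        if home.free() >= len(record):
            home.slots[tid.slot] = record
        else:                          # forward from the original position
            home.slots[tid.slot] = self._place(record, skip=tid.page_id)


store = Store(n_pages=2, capacity=16)
a = store.insert(b"abcd")              # lands on page 0
b = store.insert(b"0123456789")        # also page 0
store.update(a, b"abcdefgh")           # no longer fits on page 0
print(a, store.pages[0].slots[0], store.read(a))
```

The TID `a` handed out at insert time stays valid after the record has moved. Because `update` always rewrites the entry at the original position, a record that overflows a second time is still reached with at most two page accesses.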

Free Space Management
Problem: Which page has enough space for a new record?
Solution: A free space table lists, for all pages, how much space is left.
Free space value:
- Precise value: Ceil(Log2(page size)) bits => 2 bytes for a common page size of 4K
- Rough value: use fewer bits; free space = (value * page size) / 2^(bits per value)
Free space table:
- With direct page addressing: assuming a single page can take n free space entries, the first page and each (n+1)-th page hold the free space entries.
- With indirect page addressing: the free space information is stored in the page table.
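
The two encodings and the table placement can be checked with a few lines of arithmetic. This is a minimal Python sketch with hypothetical function names; rounding the rough value down (so free space is never overstated) and numbering pages from 0 are assumptions, since the slide fixes neither.

```python
import math


def precise_bytes(page_size):
    """Bytes needed for an exact free-space value, per Ceil(Log2(page size))."""
    bits = math.ceil(math.log2(page_size))
    return math.ceil(bits / 8)


def encode_rough(free_bytes, page_size, bits):
    """Round free space down to one of 2**bits levels."""
    levels = 1 << bits
    return min(levels - 1, free_bytes * levels // page_size)


def decode_rough(value, page_size, bits):
    """free space = (value * page size) / 2^(bits per value)"""
    return value * page_size // (1 << bits)


def table_page_for(page_no, n):
    """Page holding the free-space entry of page_no when every (n+1)-th
    page, starting with the first, is a free-space table page."""
    return (page_no // (n + 1)) * (n + 1)


print(precise_bytes(4096))                    # 2
value = encode_rough(1000, 4096, bits=4)
print(value, decode_rough(value, 4096, 4))    # 3 768
print(table_page_for(5, n=3))                 # 4
```

With 4 bits per page, a page with 1,000 free bytes is recorded as having at least 768 bytes free: coarser than the precise value, but a quarter of the storage.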

Physical Access Paths - Index Structures
[Figure: architectural blueprint of the database system (Application, Data System, Access System, Storage System, Buffer, File System, Hardware), as on the Architectural Blue Print slide]

Overview Indexes
Table scan:
- Read all pages and evaluate the search criteria for each record
- Pre-fetching
Index scan:
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of the index attributes
- Logical/physical sorting of the values of key attributes (depending on the index structure)
- Enforcing uniqueness
Types of indexes:
- Primary (clustered) index: determines the physical organization; used for the PK
- Secondary (non-clustered) index: redundant access path
[Figure: table Pers(PID, NAME, AGE, SALARY) with primary and secondary indexes; labels: Age, Salary]

Overview Indexes (2)
Choice of access paths
Index scan:
- Only useful for low selectivity (low number of result tuples)
- Break-even point according to the output ratio of the number of tuples (usually max. 5%)
- Requires statistics about the data
- Additional costs for index storage and updating
Table scan:
- Adequate/efficient for small tables (e.g., 5 pages)
- Queries with high selectivity (large result sets)
- 100-200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time over hit rate for index scan and table scan]
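
The trade-off between the two access paths can be made concrete with a back-of-the-envelope model. This is a minimal Python sketch under loudly stated assumptions: the function names are hypothetical, the table scan runs at the slide's 100 MB/s, every result tuple of the index scan costs one of the ~100 disk seeks/s, nothing is cached, and the table size (1,000,000 tuples of 400 bytes) is an invented example.

```python
def table_scan_seconds(n_tuples, tuple_bytes, seq_bytes_per_s=100e6):
    """Time to read the whole table sequentially."""
    return n_tuples * tuple_bytes / seq_bytes_per_s


def index_scan_seconds(result_tuples, seeks_per_s=100):
    """Worst case for a secondary index: one random access per result tuple."""
    return result_tuples / seeks_per_s


def break_even_hit_rate(tuple_bytes, seq_bytes_per_s=100e6, seeks_per_s=100):
    """Hit rate at which both access paths take equally long in this model."""
    return tuple_bytes * seeks_per_s / seq_bytes_per_s


n, size = 1_000_000, 400                       # hypothetical table
print(table_scan_seconds(n, size))             # 4.0
print(index_scan_seconds(50_000))              # 500.0
print(break_even_hit_rate(size))
```

Because this model is deliberately pessimistic, its break-even hit rate lies far below the 5% rule of thumb; buffering and several result tuples per page move it upward in practice. The sketch is only meant to show why the index scan loses quickly as the hit rate grows.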

Classification of Index Structures
Classification of one-dimensional index structures:
- Key comparison
  - Sequential
    - Linked lists (logically sequential)
    - Sequential lists (physically sequential)
  - Tree-based
    - Binary search trees
    - Multiway trees (example: B-Tree)
    - Prefix trees (tries)
- Key transformation
  - Hash-based
    - Static
    - Dynamic
Multiway trees:
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size

B-Tree
[Figure: B-tree node consisting of entries (Ki, Di, Pi), followed by free space]
- (Ki, Di, Pi) = entry
- min |P| = k+1, max |P| = 2k+1
- Sub-trees: keys < K1 (leftmost pointer), Ki < keys < Ki+1 (pointer Pi), keys > Kp (rightmost pointer)

B-Tree (2)
Keys:
- Agnostic to specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length
Operations:
- Search for the data of a given key value
- Insertion and deletion of a key-data pair
Payload:
- Agnostic to specific data semantics
- Can be a record, a reference (TID), or a mix
[Figure: example B-Tree with k = 2, h = 3]

Search in the B-Tree
Starting at the root node, each node is searched from left to right:
1) If Ki matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which Pi-1 points).
2) If the desired value is smaller than Ki, the search continues in the root of the sub-tree identified by Pi-1.
3) If the desired value is larger than Ki, the comparison is repeated with Ki+1.
4) If the desired value is also larger than K2k, the search continues in the sub-tree of P2k.
If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search is aborted: no record with the desired key value exists.
[Figure: search for 38, 20, 6]
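
The search procedure can be sketched as follows. This is a minimal Python sketch: the `Node` class and the small two-level tree are hypothetical (the tree on the slide is not reproduced here), and `children[i]` plays the role of the pointer to the left of `keys[i]`, with one extra pointer at the end.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    data: list
    children: list = field(default_factory=list)  # empty for a leaf


def search(node, wanted):
    """Return the data stored under `wanted`, or None if it is absent."""
    while True:
        i = 0
        # Skip all keys smaller than the desired value (left to right).
        while i < len(node.keys) and wanted > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == wanted:
            return node.data[i]                    # case 1: found
        if not node.children:
            return None                            # leaf: cannot descend
        node = node.children[i]                    # cases 2 and 4: descend


root = Node([20], ["d20"], [
    Node([6, 10], ["d6", "d10"]),
    Node([38, 40], ["d38", "d40"]),
])
print(search(root, 38), search(root, 20), search(root, 6), search(root, 7))
# d38 d20 d6 None
```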

Insertions in the B-Tree (1)
Insertion rule: insert only into leaf nodes
At non-leaf nodes: descend the tree as for the search
- S ≤ Ki: follow Pi-1
- S > Ki: check Ki+1
- S > K2k: follow P2k
At a leaf node:
- Insert the data record according to the sort order
- Special case: the leaf node is full (2k records) → split the leaf node
Splitting:
- Generate a new leaf node
- Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
- The middle entry (the (k+1)-th) is used as the new "discriminator" (branching) and inserted into the parent node
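
The split step on its own can be sketched like this. It is a minimal Python sketch with a hypothetical helper name; the entries are assumed to be already sorted and the example keys are invented.

```python
def split_entries(entries, k):
    """Split 2k+1 sorted entries into (left, discriminator, right)."""
    if len(entries) != 2 * k + 1:
        raise ValueError("a split needs exactly 2k+1 entries")
    return entries[:k], entries[k], entries[k + 1:]


# k = 2: a full leaf [10, 20, 30, 40] receives key 25.
print(split_entries(sorted([10, 20, 30, 40] + [25]), k=2))
# ([10, 20], 25, [30, 40])
```

Both resulting nodes hold exactly k entries, the minimum fill of a B-tree node, so a split never creates an underflow.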

Insertions in the B-Tree (2)
Node splitting during insertion
Two possible situations after a split:
- The parent node is full → repeat the split on this level
- Enough space → FINISHED
Special case: root split
- Split of the root node → new root with two successor nodes
- The height of the tree grows by 1
- The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing):
- No unloading or loading necessary
- The tree is always balanced
- But: in case of many insertions/deletions, a reorganization can be beneficial

Insertions in the B-Tree (3)
Insertion example: order k = 1 (n = 2k), keys: 1, 5, 2, 6, 7, 4, 8, 3
Finally: h = 3
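
The example can be replayed with a compact insertion routine. This is a minimal Python sketch, not the lecture's code: nodes store keys only (no data part), `Node`, `insert` and `height` are hypothetical names, and splits propagate upward through the return value of the recursive helper.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []     # empty list => leaf


def _insert(node, key, k):
    """Insert below `node`; return (discriminator, new right node) on split."""
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if not node.children:                  # leaf: insert in sort order
        node.keys.insert(i, key)
    else:                                  # inner node: descend as for search
        promoted = _insert(node.children[i], key, k)
        if promoted is None:
            return None
        mid, right = promoted
        node.keys.insert(i, mid)
        node.children.insert(i + 1, right)
    if len(node.keys) <= 2 * k:
        return None                        # enough space: finished
    mid = node.keys[k]                     # overflow: 2k+1 keys, split
    right = Node(node.keys[k + 1:], node.children[k + 1:])
    node.keys = node.keys[:k]
    node.children = node.children[:k + 1]
    return mid, right


def insert(root, key, k):
    """Insert `key` and return the (possibly new) root."""
    promoted = _insert(root, key, k)
    if promoted is None:
        return root
    mid, right = promoted
    return Node([mid], [root, right])      # root split: height grows by 1


def height(node):
    h = 1
    while node.children:
        node = node.children[0]
        h += 1
    return h


root = Node()
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    root = insert(root, key, k=1)
print(height(root), root.keys)             # 3 [4]
```

The last key, 3, overflows a leaf, and the discriminator it pushes up overflows the root as well, which is the root split that raises the height to 3.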

Insertion and Deletion in the B-Tree
Problem:
- Insertion can create overflow
- Deletion can create underflow and overflow
Example:
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; need to access all four nodes; finally the same as the input
[Figure: Insert 22]

Deletion in the B-Tree
Example: order k = 1 (n = 2k), delete key 3
[Figure annotation: Underflow → Merge]

Deletion in the B-Tree (2)
Example: order k = 1 (n = 2k), delete key 3
[Figure annotation: Underflow → Merge]
Remember: each path from the root to a leaf has the same length h

Deletion in the B-Tree (3)
Example: order k = 1 (n = 2k), delete key 3
[Figure annotations: Underflow → Merge; Overflow → Split]

Deletion in the B-Tree (4)
Example: order k = 1 (n = 2k), delete key 3
[Figure annotations: Overflow → Split; Root Split]
| 298
gt
copy A Behrend Foundations of Information Systems |
Deletion Algorithm
Example ndash there are different algorithms Search the node that contains the key K to be deleted If key K is in a leaf node delete the key in the leaf node and handle potentially
resulting underflow by merging with sibling If key K is in an inner node pull up new discriminator from one of the successors Analyze which successor node of K has more elements left or right one
If both have the same number of elements decide for one Replace the key K to be deleted with the direct successor Krsquo from the left
successor node or with the direct successor Krsquorsquo from the right successor node respectively Delete Krsquo or Krsquorsquo from the respective successor node (recursively)
Note Major variants Merge (tis lecture) Re-distribution (instead of splitmerge in case of overflowunderflow the entries are
re-distributed under consideration of one or multiple adjacent nodes)
| 299
gt
copy A Behrend Foundations of Information Systems |
B-Trees B+-Trees and B-Trees
B+-Trees and B-Trees Data is only in leaf nodes Key redundancy but higher fan-out rarr lower tree high less IO Simpler delete procedure rarr requires only merging of nodes
Double linked list of all leaf nodes B-Trees Modified valid node sizes
from [k2k] to [43k2k] rarr better node utilization but more splitsmerges
Example Secondary index Non unique
B-Tree with k = 2 h = 3
| 300
gt
copy A Behrend Foundations of Information Systems |
Indexing Low Cardinality Columns
Problem Example B-tree on the sex of customers for a table with 1000000 tuples results in
two lists with approximately 500000 tuples each
Query for all female customers requires 500000 random page accesses (secondary index) Table scan would be much faster
Conclusion B-trees (and also hashing) are useful for predicates with low selectivity
(outputinput cardinality ratio) Rule of thumb margin hit rate is approx 5 higher hit rates do not justify the efforts for an index access
F M
TID TID TID TID hellip TID TID TID TID hellip
| 301
gt
copy A Behrend Foundations of Information Systems |
Bitmap Index
Idea (Long history since the 1960s) Create a bitmapbitlist for each
attribute value Each tuple in the table is assigned to
one bit in the bitmap (by position sequential TID)
Bit values 1 attribute value set 0 attribute value not set
Necessary condition Sequential numbering of the tuples
(TIDs)
F M
1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Name Sex Region Race Carol f n white
Harold m e black Anne f e asian
Iris f ne white hellip m se hisp hellip f e white hellip f sw asian hellip f w black hellip f n asian hellip m e hisp hellip m se black hellip f s white hellip m nw black hellip f s white hellip f w black
Sex
| 302
gt
copy A Behrend Foundations of Information Systems |
Querying Bitmap Indexes
Main advantage of bitmap indexes Simple and efficient logical join possible Read only data that is relevant for predicates Example σSex=lsquoflsquo ᴧ Region=lsquonlsquo R Bitmaps B1 and B2 in conjunction for (i=0 iltB1length i++) B = B1[i] amp B2[i]
Example IO Costs Estimation σSex=lsquoflsquo ᴧ Region=lsquonlsquo ᴧ Race=lsquoAsianlsquo R
(ldquoAsian women of region Northrdquo) Selectivity 12 18 14 = 164 N=10000 tuples with length of 400 bytes each
(~ 10 tuples per page for 4kB pages) Table scan 1000 pages Bitmap access 1000064 156 pages (worst case each tuple in a different page)
plus 1 page for bitmaps
F 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
AND = AND
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity amp Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths ndash Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees B+-Trees and B-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
-
| 282
gt
copy A Behrend Foundations of Information Systems |
Free Space Management
Problem In which page is enough space for new record
Solution Free space table lists for all pages how much space is left
Free space value Precise value Ceil(Log2(page size)) =gt 2 bytes for common page size of 4K Rough value use less bytes free space = (value page size)2^(bits per value)
Free space table With direct page addressing Assuming a single page can take n free space entries First page and each (n+1)-th page takes free space entries
With indirect page addressing Free space information stored in page table
copy A Behrend Foundations of Information Systems | 283
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Physical Access Paths ndash Index Structures
| 284
gt
copy A Behrend Foundations of Information Systems |
Primary Index Secondary Index
Overview Indexes
Table scan Read all pages and for each record
evaluate the search criteria Pre-fetching
Index Scan Use index for search criteria
on one or more attributes Fast access to single values or value ranges of index attributes Logicalphysical sorting of values of key attributes (depending on index structure) Enforcing uniqueness
Types if indexes Primary (Clustered) Index
determines physical organization use for PK Secondary (Non-Clustered)
Index redundant access path
Pers(PID NAME AGE SALARY)
Age
Salary
| 285
gt
copy A Behrend Foundations of Information Systems |
Overview Indexes (2)
Choice of Access Paths Index scan Only useful for low selectivity
(low number of result tuples) Break even-point according to the
output ratio of the number of tuples (usually max 5) Requires statistics about data Additional costs for index storage
and updating
Table Scan adequateefficient for small tables
(eg 5 pages) Queries with high selectivity
(large result sets) 100-200MBs sequential read
~ 100 disk seekss
hit rate
Index Scan
Table Scan
access time
| 286
gt
copy A Behrend Foundations of Information Systems |
Classification of Index Structures
Classification
Multiway Trees Tree structure with multiple children per node Idea chose fan out so that node size suits page size
Onedimensional Index Structures
Key Comparison Key Transformation
Sequential Tree-Based Hash-Based
Prefix Trees (Tries)
Binary Search Trees Dynamic Static Linked Lists
(log seq) Seq Lists
(phys seq) Multiway
Trees Example B-Tree
| 287
gt
copy A Behrend Foundations of Information Systems |
B-Tree
free space
(Ki Di Pi) = entry min |P| = k+1 max |P| = 2k+1
keys lt K1 keys gt Kp Ki lt keys lt Ki+1
| 288
gt
copy A Behrend Foundations of Information Systems |
B-Tree (2)
Example Keys Agnostic to specific key semantic Only defined complete order required Could be of fixed or variable length
Operations Search for data for given key value Insertion and deletion of key-data pair
Payload Agnostic to specific data semantic Can be record or reference (TID) or
mix
B-Tree with k = 2 h = 3
| 289
gt
copy A Behrend Foundations of Information Systems |
Search in the B-Tree
Starting at the root node each node is searched from left to right 1) if Ki matches the desired key value the data record has been found (further
records with the same key value might be located in a sub-tree to which Pi-1 points) 2) if Ki is smaller than the desired value the search will be continued in the root of
the sub-tree identified by Pi-1 3) if Ki is larger than the desired value the comparison with Ki+1 is repeated 4) if K2k is also smaller than the desired value the search will be continued in the
sub-tree of P2k If itlsquos impossible to descend further into a sub-tree (2 or 4) (leaf node) The search is aborted no record with the desired key value is found
Search for 38 20 6
| 290
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (1)
Insertion Rule insert only into leaf nodes At Non-Leaf Nodes descend down the tree as for the search S le Ki follow Pi-1 S gt Ki check Ki+1 S gt K2k follow P2k
At Leaf Node Insert the data record according to the sorting order Special case leaf node is full (2k records) rarr split the leaf node
Splitting Generate a new leaf node Split the 2k+1 entries (in order)
into two leaf nodes first k entries rarr left node last k entries rarr right node
middle entry (k+1-th) is used as new ldquodiscriminatorrdquo (branching) and inserted into the parent node
| 291
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (2)
Node Splitting during Insertion Two possible situations after a split The parent node is full rarr repeat split on this level Enough space rarr FINISHED
Special case root split Split of the root node rarr New root with two successor nodes Height of a tree grows by 1 The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing) No unloading or loading necessary Tree is always balanced But In case of many insertions deletions reorganization can be beneficial
| 292
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (3)
Insertion Example Order k = 1 n=2k Keys 1 5 2 6 7 4 8 3
Finally h=3
| 293
gt
copy A Behrend Foundations of Information Systems |
Insertion and Deletion in the B-Tree
Problem Insertion can create overflow Deletion can create underflow and overflow
Example Insertion of key 22
rarr Overflow rarr Split
Deletion of key 22 rarr Underflow need to access all four nodes finally same as input
Insert 22
| 294
gt
copy A Behrend Foundations of Information Systems |
Underflow Merge
Deletion in the B-Tree
Example Order k = 1 n=2k Delete key 3
| 295
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (2)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Remember Each path from the root to the leaf has the same length h
| 296
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (3)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Overflow Split
| 297
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (4)
Example Order k = 1 n=2k Delete key 3
Overflow Split
Root Split
| 298
gt
copy A Behrend Foundations of Information Systems |
Deletion Algorithm
Example ndash there are different algorithms Search the node that contains the key K to be deleted If key K is in a leaf node delete the key in the leaf node and handle potentially
resulting underflow by merging with sibling If key K is in an inner node pull up new discriminator from one of the successors Analyze which successor node of K has more elements left or right one
If both have the same number of elements decide for one Replace the key K to be deleted with the direct successor Krsquo from the left
successor node or with the direct successor Krsquorsquo from the right successor node respectively Delete Krsquo or Krsquorsquo from the respective successor node (recursively)
Note Major variants Merge (tis lecture) Re-distribution (instead of splitmerge in case of overflowunderflow the entries are
re-distributed under consideration of one or multiple adjacent nodes)
| 299
gt
copy A Behrend Foundations of Information Systems |
B-Trees B+-Trees and B-Trees
B+-Trees and B-Trees Data is only in leaf nodes Key redundancy but higher fan-out rarr lower tree high less IO Simpler delete procedure rarr requires only merging of nodes
Double linked list of all leaf nodes B-Trees Modified valid node sizes
from [k2k] to [43k2k] rarr better node utilization but more splitsmerges
Example Secondary index Non unique
B-Tree with k = 2 h = 3
| 300
gt
copy A Behrend Foundations of Information Systems |
Indexing Low Cardinality Columns
Problem Example B-tree on the sex of customers for a table with 1000000 tuples results in
two lists with approximately 500000 tuples each
Query for all female customers requires 500000 random page accesses (secondary index) Table scan would be much faster
Conclusion B-trees (and also hashing) are useful for predicates with low selectivity
(outputinput cardinality ratio) Rule of thumb margin hit rate is approx 5 higher hit rates do not justify the efforts for an index access
F M
TID TID TID TID hellip TID TID TID TID hellip
| 301
gt
copy A Behrend Foundations of Information Systems |
Bitmap Index
Idea (Long history since the 1960s) Create a bitmapbitlist for each
attribute value Each tuple in the table is assigned to
one bit in the bitmap (by position sequential TID)
Bit values 1 attribute value set 0 attribute value not set
Necessary condition Sequential numbering of the tuples
(TIDs)
F M
1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Name Sex Region Race Carol f n white
Harold m e black Anne f e asian
Iris f ne white hellip m se hisp hellip f e white hellip f sw asian hellip f w black hellip f n asian hellip m e hisp hellip m se black hellip f s white hellip m nw black hellip f s white hellip f w black
Sex
| 302
gt
copy A Behrend Foundations of Information Systems |
Querying Bitmap Indexes
Main advantage of bitmap indexes Simple and efficient logical join possible Read only data that is relevant for predicates Example σSex=lsquoflsquo ᴧ Region=lsquonlsquo R Bitmaps B1 and B2 in conjunction for (i=0 iltB1length i++) B = B1[i] amp B2[i]
Example IO Costs Estimation σSex=lsquoflsquo ᴧ Region=lsquonlsquo ᴧ Race=lsquoAsianlsquo R
(ldquoAsian women of region Northrdquo) Selectivity 12 18 14 = 164 N=10000 tuples with length of 400 bytes each
(~ 10 tuples per page for 4kB pages) Table scan 1000 pages Bitmap access 1000064 156 pages (worst case each tuple in a different page)
plus 1 page for bitmaps
F 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
AND = AND
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity amp Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths ndash Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees B+-Trees and B-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
-
copy A Behrend Foundations of Information Systems | 283
gt
|
Storage System
Buffer
File System
Hardware
Data System
Application
TID TID TID
TID TID TID
SELECT sfirstname slastname COUNT(lname) FROM Student s INNER JOIN Program p ON sprogramId = pid INNER JOIN Attendance a ON astudentId = sstudentId INNER JOIN Lecture l ON alectureId = lid GROUP BY sfirstname slastname WHERE pname=lsquoDSErsquo
Run
Buffered Pages - Page replacement strategy - Materialization strategy - Logging Backup Recovery
Paged files
Disks Flash RAID SAN hellip
Storage Structures - Record management - Free space management - Physical access paths
Access System
Data
base
Sys
tem
Data model semantics - System catalog - Record format - Logical access paths
Query processing - Parsing - Plan generation - Plan optimization - Plan execution
1 lsquoSmithrsquo 15061982
Table Person id INT name VARCHAR birthday DATE Index P_id_IX on Personid
Database System
SQL JDBC ODBC hellip
Physical Access Paths ndash Index Structures
| 284
gt
copy A Behrend Foundations of Information Systems |
Primary Index Secondary Index
Overview Indexes
Table scan Read all pages and for each record
evaluate the search criteria Pre-fetching
Index Scan Use index for search criteria
on one or more attributes Fast access to single values or value ranges of index attributes Logicalphysical sorting of values of key attributes (depending on index structure) Enforcing uniqueness
Types if indexes Primary (Clustered) Index
determines physical organization use for PK Secondary (Non-Clustered)
Index redundant access path
Pers(PID NAME AGE SALARY)
Age
Salary
| 285
gt
copy A Behrend Foundations of Information Systems |
Overview Indexes (2)
Choice of Access Paths Index scan Only useful for low selectivity
(low number of result tuples) Break even-point according to the
output ratio of the number of tuples (usually max 5) Requires statistics about data Additional costs for index storage
and updating
Table Scan adequateefficient for small tables
(eg 5 pages) Queries with high selectivity
(large result sets) 100-200MBs sequential read
~ 100 disk seekss
hit rate
Index Scan
Table Scan
access time
| 286
gt
copy A Behrend Foundations of Information Systems |
Classification of Index Structures
Classification
Multiway Trees Tree structure with multiple children per node Idea chose fan out so that node size suits page size
Onedimensional Index Structures
Key Comparison Key Transformation
Sequential Tree-Based Hash-Based
Prefix Trees (Tries)
Binary Search Trees Dynamic Static Linked Lists
(log seq) Seq Lists
(phys seq) Multiway
Trees Example B-Tree
| 287
gt
copy A Behrend Foundations of Information Systems |
B-Tree
free space
(Ki Di Pi) = entry min |P| = k+1 max |P| = 2k+1
keys lt K1 keys gt Kp Ki lt keys lt Ki+1
| 288
gt
copy A Behrend Foundations of Information Systems |
B-Tree (2)
Example Keys Agnostic to specific key semantic Only defined complete order required Could be of fixed or variable length
Operations Search for data for given key value Insertion and deletion of key-data pair
Payload Agnostic to specific data semantic Can be record or reference (TID) or
mix
B-Tree with k = 2 h = 3
| 289
gt
copy A Behrend Foundations of Information Systems |
Search in the B-Tree
Starting at the root node each node is searched from left to right 1) if Ki matches the desired key value the data record has been found (further
records with the same key value might be located in a sub-tree to which Pi-1 points) 2) if Ki is smaller than the desired value the search will be continued in the root of
the sub-tree identified by Pi-1 3) if Ki is larger than the desired value the comparison with Ki+1 is repeated 4) if K2k is also smaller than the desired value the search will be continued in the
sub-tree of P2k If itlsquos impossible to descend further into a sub-tree (2 or 4) (leaf node) The search is aborted no record with the desired key value is found
Search for 38 20 6
| 290
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (1)
Insertion Rule insert only into leaf nodes At Non-Leaf Nodes descend down the tree as for the search S le Ki follow Pi-1 S gt Ki check Ki+1 S gt K2k follow P2k
At Leaf Node Insert the data record according to the sorting order Special case leaf node is full (2k records) rarr split the leaf node
Splitting Generate a new leaf node Split the 2k+1 entries (in order)
into two leaf nodes first k entries rarr left node last k entries rarr right node
middle entry (k+1-th) is used as new ldquodiscriminatorrdquo (branching) and inserted into the parent node
| 291
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (2)
Node Splitting during Insertion Two possible situations after a split The parent node is full rarr repeat split on this level Enough space rarr FINISHED
Special case root split Split of the root node rarr New root with two successor nodes Height of a tree grows by 1 The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing) No unloading or loading necessary Tree is always balanced But In case of many insertions deletions reorganization can be beneficial
| 292
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (3)
Insertion Example Order k = 1 n=2k Keys 1 5 2 6 7 4 8 3
Finally h=3
| 293
gt
copy A Behrend Foundations of Information Systems |
Insertion and Deletion in the B-Tree
Problem Insertion can create overflow Deletion can create underflow and overflow
Example Insertion of key 22
rarr Overflow rarr Split
Deletion of key 22 rarr Underflow need to access all four nodes finally same as input
Insert 22

Deletion in the B-Tree (1) to (4)
Example: order k = 1, n = 2k. Delete key 3.
[Figure sequence over four slides: underflow → merge; overflow → split; root split]
Remember: each path from the root to a leaf has the same length h.

Deletion Algorithm
Example (there are different algorithms):
- Search for the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successor nodes
  - Determine which successor node of K, the left or the right one, has more elements
  - If both have the same number of elements, pick one
  - Replace the key K to be deleted with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed across one or more adjacent nodes)

B-Trees, B+-Trees and B*-Trees
B+-Trees and B*-Trees:
- Data is only in leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees:
- Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
Example: secondary index, non-unique.
[Figure: B-Tree with k = 2, h = 3]

Indexing Low Cardinality Columns
Problem:
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion:
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio)
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
[Figure: two lists, F and M, each holding TID, TID, TID, TID, …]
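A rough calculation shows why the table scan wins in this example. The sketch below is only a back-of-the-envelope model under stated assumptions: a tuple size of 400 bytes (borrowed from the lecture's bitmap cost example), the lecture's device figures of about 100 MB/s sequential read and about 100 disk seeks per second, and one random page access per qualifying tuple. The function names are hypothetical.

```python
def table_scan_seconds(n_tuples: int, tuple_bytes: int,
                       seq_bytes_per_s: float) -> float:
    """Time to read the whole table sequentially."""
    return n_tuples * tuple_bytes / seq_bytes_per_s


def index_scan_seconds(n_hits: int, seeks_per_s: float) -> float:
    """Worst case: every qualifying tuple costs one random page access."""
    return n_hits / seeks_per_s


N_TUPLES = 1_000_000        # table size from the slide
TUPLE_BYTES = 400           # assumption, not stated on this slide
SEQ_BYTES_PER_S = 100e6     # ~100 MB/s sequential read
SEEKS_PER_S = 100           # ~100 disk seeks per second

scan = table_scan_seconds(N_TUPLES, TUPLE_BYTES, SEQ_BYTES_PER_S)
index = index_scan_seconds(500_000, SEEKS_PER_S)
print(scan, index)          # 4.0 5000.0 (seconds)
```

Under these assumptions the secondary-index access is slower than the scan by roughly three orders of magnitude, which is the point of the slide.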

Bitmap Index
Idea (long history, since the 1960s):
- Create a bitmap/bitlist for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position, sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set
Necessary condition: sequential numbering of the tuples (TIDs)
Bitmaps for the attribute Sex:
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Example table:
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
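Building the bitmaps is a single pass over the column. The following sketch assumes that the position in the input list plays the role of the sequential TID; `build_bitmaps` is a hypothetical helper, and the column values are the Sex entries of the example table.

```python
from typing import Dict, List, Sequence


def build_bitmaps(column: Sequence[str]) -> Dict[str, List[int]]:
    """One bitmap per distinct attribute value; bit i belongs to tuple i."""
    bitmaps: Dict[str, List[int]] = {}
    for pos, value in enumerate(column):
        # Create an all-zero bitmap the first time a value is seen.
        bitmap = bitmaps.setdefault(value, [0] * len(column))
        bitmap[pos] = 1
    return bitmaps


# Sex column of the 15 example tuples, in table order.
sex = list("fmffmffffmmfmff")
bitmaps = build_bitmaps(sex)
print(bitmaps["f"])   # [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
print(bitmaps["m"])   # [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0]
```

A real system would pack the bits into machine words instead of Python lists; the list form is used here only to keep the correspondence with the slide visible.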

Querying Bitmap Indexes
Main advantage of bitmap indexes:
- Simple and efficient logical combination of predicates is possible
- Only data that is relevant for the predicates is read
- Example: σ_{Sex='f' ∧ Region='n'}(R); the bitmaps B1 and B2 are combined by conjunction: for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation for σ_{Sex='f' ∧ Region='n' ∧ Race='Asian'}(R) ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1,000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
F:             1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N:             0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A:             0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
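The conjunction and the page estimate above can be checked with a few lines. This is an illustrative sketch: `bitmap_and` is a hypothetical helper, the three bitmaps are copied from the slide, and the cost figures follow the slide's worst-case assumption that every hit lies on a different page.

```python
import math
from typing import List


def bitmap_and(*bitmaps: List[int]) -> List[int]:
    """Bitwise AND of equally long bitmaps, position by position."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]   # Sex = 'f'
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]   # Region = 'n'
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]   # Race = 'Asian'

hits = bitmap_and(F, N, A)
positions = [i for i, bit in enumerate(hits) if bit]
print(positions)                                     # [2]

# I/O estimate with the numbers from the slide.
n_tuples = 10_000
tuples_per_page = 10
selectivity = (1 / 2) * (1 / 8) * (1 / 4)            # 1/64
table_scan_pages = n_tuples // tuples_per_page
bitmap_pages = math.floor(n_tuples * selectivity) + 1  # hits + 1 bitmap page
print(table_scan_pages, bitmap_pages)                # 1000 157
```

Only the tuple at position 2 satisfies all three predicates, and the estimated bitmap access touches about 157 pages instead of 1,000.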

Overview Indexes
Table scan:
- Read all pages and evaluate the search criteria for each record
- Pre-fetching
Index scan:
- Use an index for search criteria on one or more attributes
- Fast access to single values or value ranges of index attributes
- Logical/physical sorting of the values of key attributes (depending on the index structure)
- Enforcing uniqueness
Types of indexes:
- Primary (clustered) index: determines the physical organization; used for the PK
- Secondary (non-clustered) index: redundant access path
[Figure: primary index and secondary index; Pers(PID, NAME, AGE, SALARY); Age; Salary]

Overview Indexes (2)
Choice of access paths
Index scan:
- Only useful for low selectivity (low number of result tuples)
- The break-even point depends on the output ratio of the number of tuples (usually max. 5%)
- Requires statistics about the data
- Additional costs for index storage and updating
Table scan:
- Adequate/efficient for small tables (e.g. 5 pages)
- Queries with high selectivity (large result sets)
- 100-200 MB/s sequential read vs. ~100 disk seeks/s
[Figure: access time over hit rate for index scan and table scan]

Classification of Index Structures
Multiway trees:
- Tree structure with multiple children per node
- Idea: choose the fan-out so that the node size suits the page size
Classification of one-dimensional index structures:
- Key comparison
  - Sequential: linked lists (logically sequential), sequential lists (physically sequential)
  - Tree-based: prefix trees (tries), binary search trees, multiway trees (example: B-tree)
- Key transformation
  - Hash-based: dynamic, static

B-Tree
[Figure: node layout with entries and free space]
- (K_i, D_i, P_i) = entry
- min |P| = k+1, max |P| = 2k+1
- Sub-tree key ranges: keys < K_1; K_i < keys < K_{i+1}; keys > K_p

B-Tree (2)
Keys:
- Agnostic to the specific key semantics
- Only a defined total order is required
- Can be of fixed or variable length
Operations:
- Search for the data for a given key value
- Insertion and deletion of key-data pairs
Payload:
- Agnostic to the specific data semantics
- Can be a record, a reference (TID), or a mix
[Figure: example B-Tree with k = 2, h = 3]

Search in the B-Tree
Starting at the root node, each node is searched from left to right:
1) If K_i matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which P_{i-1} points).
2) If K_i is larger than the desired value, the search continues in the root of the sub-tree identified by P_{i-1}.
3) If K_i is smaller than the desired value, the comparison is repeated with K_{i+1}.
4) If K_{2k} is also smaller than the desired value, the search continues in the sub-tree of P_{2k}.
If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search is aborted: no record with the desired key value exists.
Example (figure): search for 38, 20, 6.
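The search procedure can be sketched in a few lines. This is a minimal illustration under assumptions: the `Node` layout and the function `search` are hypothetical, and the small tree below is made up for the demonstration (it is not the tree from the slide's figure). With 0-based lists, `keys[j]` stands for K_{j+1} and `children[j]` for P_j.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Node:
    keys: List[int]
    data: List[Any]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def search(node: Optional[Node], wanted: int) -> Optional[Any]:
    """Return the data stored under `wanted`, or None if the key is absent."""
    while node is not None:
        i = 0
        # Scan left to right while the current key is smaller than `wanted`.
        while i < len(node.keys) and node.keys[i] < wanted:
            i += 1
        if i < len(node.keys) and node.keys[i] == wanted:
            return node.data[i]            # case 1: key found
        if not node.children:
            return None                    # leaf reached: search is aborted
        # Current key is larger than `wanted`, or all keys were smaller:
        # descend into the pointer left of keys[i] (or the last pointer).
        node = node.children[i]
    return None


# Hypothetical two-level tree.
tree = Node(
    keys=[20], data=["d20"],
    children=[
        Node(keys=[5, 10], data=["d5", "d10"]),
        Node(keys=[30, 38, 45], data=["d30", "d38", "d45"]),
    ],
)
print(search(tree, 38), search(tree, 20), search(tree, 6))  # d38 d20 None
```

Within a node the sketch scans linearly, as on the slide; a binary search over the node's keys would return the same result.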
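
B-Tree
Node layout (figure): P0 | (K1, D1, P1) | (K2, D2, P2) | ... | (Kp, Dp, Pp) | free space
- Entry: (Ki, Di, Pi)
- Number of pointers per node: min |P| = k+1, max |P| = 2k+1
- Key ranges of the sub-trees: keys < K1 (leftmost pointer), Ki < keys < Ki+1 (pointer Pi), keys > Kp (rightmost pointer)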

B-Tree (2)
Keys
- Agnostic to the specific key semantics
- Only a well-defined total order is required
- Can be of fixed or variable length
Operations
- Search for the data of a given key value
- Insertion and deletion of a key-data pair
Payload
- Agnostic to the specific data semantics
- Can be a record, a reference (TID), or a mix
Example (figure): B-Tree with k = 2, h = 3

Search in the B-Tree
Starting at the root node, each node is searched from left to right:
1) If Ki matches the desired key value, the data record has been found (further records with the same key value might be located in the sub-tree to which Pi-1 points).
2) If Ki is larger than the desired value, the search continues in the root of the sub-tree identified by Pi-1.
3) If Ki is smaller than the desired value, the comparison is repeated with Ki+1.
4) If K2k is also smaller than the desired value, the search continues in the sub-tree of P2k.
If it is impossible to descend further into a sub-tree in case 2 or 4 (leaf node), the search terminates: no record with the desired key value exists.
Example (figure): search for 38, 20, 6
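
The search procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Node` class, its field names, and the small sample tree are hypothetical and not taken from the lecture (the tree in the slide figure is not part of this transcript).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical B-tree node: keys[i] carries the payload data[i]."""
    keys: List[int]
    data: List[str]
    # children[i] is the pointer left of keys[i]; the last child is the
    # rightmost pointer. An empty list marks a leaf node.
    children: List["Node"] = field(default_factory=list)


def search(node: Node, wanted: int) -> Optional[str]:
    """Return the payload stored under `wanted`, or None if it is absent."""
    while True:
        i = 0
        # Keys smaller than the wanted value: move on to the next key.
        while i < len(node.keys) and node.keys[i] < wanted:
            i += 1
        if i < len(node.keys) and node.keys[i] == wanted:
            return node.data[i]      # key found in this node
        if not node.children:
            return None              # leaf reached, no further descent
        node = node.children[i]      # descend left of keys[i] or rightmost


# Assumed sample tree, not the one from the slide.
leaf_left = Node([5, 10], ["r5", "r10"])
leaf_right = Node([30, 38], ["r30", "r38"])
root = Node([20], ["r20"], [leaf_left, leaf_right])
```

In this sample tree, `search(root, 38)` descends into the right leaf, `search(root, 20)` stops at the root, and `search(root, 6)` ends in the left leaf without a match.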

Insertions in the B-Tree (1)
Insertion rule: insert only into leaf nodes
At non-leaf nodes: descend the tree as for the search (S is the key to be inserted)
- S ≤ Ki → follow Pi-1
- S > Ki → check Ki+1
- S > K2k → follow P2k
At the leaf node
- Insert the data record according to the sort order
- Special case: leaf node is full (2k records) → split the leaf node
Splitting
- Generate a new leaf node
- Split the 2k+1 entries (in order) into two leaf nodes: first k entries → left node, last k entries → right node
- The middle entry (the (k+1)-th) is used as the new "discriminator" (branching key) and inserted into the parent node
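
The leaf-level insertion and the split rule can be sketched as follows. The function names are hypothetical, and a node is modelled as a plain sorted list of integer keys without payloads or child pointers; this is a simplification for illustration, not the lecture's implementation.

```python
from typing import List, Optional, Tuple


def split_entries(entries: List[int], k: int) -> Tuple[List[int], int, List[int]]:
    """Split 2k+1 sorted entries into left node, discriminator, right node."""
    if len(entries) != 2 * k + 1:
        raise ValueError("a split is triggered by exactly 2k+1 entries")
    left = entries[:k]         # first k entries stay in the left node
    middle = entries[k]        # (k+1)-th entry moves up into the parent
    right = entries[k + 1:]    # last k entries go to the new right node
    return left, middle, right


def insert_into_leaf(
    leaf: List[int], key: int, k: int
) -> Tuple[List[int], Optional[int], Optional[List[int]]]:
    """Insert `key` in sort order; split if the leaf would hold 2k+1 entries."""
    entries = sorted(leaf + [key])
    if len(entries) <= 2 * k:
        return entries, None, None   # enough space, no split needed
    return split_entries(entries, k)
```

For k = 2, inserting 25 into the full leaf `[10, 20, 30, 40]` yields the left node `[10, 20]`, the discriminator 25, and the right node `[30, 40]`.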

Insertions in the B-Tree (2)
Node splitting during insertion: two possible situations after a split
- The parent node is full → repeat the split on this level
- Enough space → FINISHED
Special case: root split
- Split of the root node → new root with two successor nodes
- The height of the tree grows by 1
- The tree is split from the bottom to the top
Dynamic reorganization (self-balancing)
- No unloading or loading necessary
- The tree is always balanced
- But: in case of many insertions and deletions, a reorganization can be beneficial

Insertions in the B-Tree (3)
Insertion example (figure): order k = 1, n = 2k; keys inserted in the order 1, 5, 2, 6, 7, 4, 8, 3
Finally: h = 3
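
The example can be replayed with a small insertion sketch. The classes `Node` and `BTree` and all method names are assumptions made for this illustration; payloads are omitted, duplicate keys are not handled, and splits propagate upwards through the return value of the recursive call.

```python
import bisect


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # an empty list marks a leaf


class BTree:
    def __init__(self, k):
        self.k = k                        # order: at most 2k keys per node
        self.root = Node()

    def insert(self, key):
        result = self._insert(self.root, key)
        if result is not None:            # root split: height grows by 1
            middle, right = result
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        # First position whose key is >= the new key (S <= Ki).
        i = bisect.bisect_left(node.keys, key)
        if node.children:
            result = self._insert(node.children[i], key)
            if result is None:
                return None               # child had enough space
            middle, right = result        # child was split
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
        else:
            node.keys.insert(i, key)      # insert only into leaf nodes
        if len(node.keys) <= 2 * self.k:
            return None
        # Overflow with 2k+1 entries: split around the middle entry.
        k = self.k
        middle = node.keys[k]
        right = Node(node.keys[k + 1:], node.children[k + 1:])
        node.keys = node.keys[:k]
        node.children = node.children[:k + 1]
        return middle, right

    def height(self):
        h, node = 1, self.root
        while node.children:
            h, node = h + 1, node.children[0]
        return h


tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
```

With this sketch, the final insertion of key 3 splits a leaf, then its parent, and finally the root, so the tree ends with height 3 and the single root key 4.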

Insertion and Deletion in the B-Tree
Problem
- Insertion can create overflow
- Deletion can create underflow and overflow
Example (figure)
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; all four nodes need to be accessed, and the final tree is the same as the input

Deletion in the B-Tree
Example (figure): order k = 1, n = 2k; delete key 3
Step shown: underflow → merge

Deletion in the B-Tree (2)
Example (figure): order k = 1, n = 2k; delete key 3
Step shown: underflow → merge
Remember: each path from the root to a leaf has the same length h

Deletion in the B-Tree (3)
Example (figure): order k = 1, n = 2k; delete key 3
Steps shown: underflow → merge, overflow → split

Deletion in the B-Tree (4)
Example (figure): order k = 1, n = 2k; delete key 3
Steps shown: overflow → split, root split

Deletion Algorithm
Example (there are different algorithms)
- Search for the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successors
  - Analyze which successor node of K has more elements, the left or the right one
  - If both have the same number of elements, decide for one
  - Replace the key K to be deleted with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed across one or multiple adjacent nodes)
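
The merge variant can be sketched for two sibling nodes and the discriminator between them. The function name and the list-based node model are assumptions for illustration, and child pointers and the recursive propagation to the parent are left out. Splitting an overfull merge result around its middle entry is likewise an assumption, because the split rule in the slides is stated only for exactly 2k+1 entries.

```python
from typing import List, Optional, Tuple

Repair = Tuple[List[int], Optional[int], Optional[List[int]]]


def repair_underflow(left: List[int], separator: int,
                     right: List[int], k: int) -> Repair:
    """Merge two siblings with their discriminator; split again on overflow.

    Returns (node, None, None) if the merged node fits into one node, in
    which case the parent loses the discriminator. Otherwise returns
    (left, new_discriminator, right).
    """
    merged = left + [separator] + right     # underflow -> merge
    if len(merged) <= 2 * k:
        return merged, None, None
    mid = len(merged) // 2                   # overflow -> split
    return merged[:mid], merged[mid], merged[mid + 1:]
```

For k = 2, merging the underfull node `[10]` with the sibling `[30, 40]` and the discriminator 20 gives the single node `[10, 20, 30, 40]`. With the sibling `[30, 40, 50, 60]` the merged node overflows and is split again into `[10, 20, 30]`, the new discriminator 40, and `[50, 60]`, which has the same effect as a re-distribution.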

B-Trees, B+-Trees, and B*-Trees
B+-Trees and B*-Trees
- Data is stored only in leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees
- Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
Example (figure): secondary index, non-unique; B-Tree with k = 2, h = 3
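
One common use of the linked leaf level is a range scan, which is not spelled out on the slide. The sketch below walks the leaf chain from a given start leaf; the `Leaf` class and the helper names are hypothetical, only the forward link of the doubly linked list is modelled, and the descent from the root to the first relevant leaf is omitted.

```python
from typing import Iterator, List, Optional


class Leaf:
    """Hypothetical B+-tree leaf holding sorted keys and a forward link."""

    def __init__(self, keys: List[int]):
        self.keys = keys
        self.next: Optional["Leaf"] = None


def chain(leaves: List[Leaf]) -> Leaf:
    """Link the leaves in order and return the first one."""
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves[0]


def range_scan(start: Leaf, low: int, high: int) -> Iterator[int]:
    """Yield all keys in [low, high] by following the leaf links."""
    leaf: Optional[Leaf] = start
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return               # keys are sorted, nothing more to find
            if key >= low:
                yield key
        leaf = leaf.next


first = chain([Leaf([1, 3]), Leaf([5, 7]), Leaf([9, 11])])
```

Here `list(range_scan(first, 3, 9))` visits all three leaves without touching any inner node.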

Indexing Low Cardinality Columns
Problem
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio), i.e. predicates that return only a small fraction of the tuples
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
Figure: index entries F and M, each with a long list of TIDs (TID, TID, TID, TID, …)
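
The comparison can be made concrete with a small calculation. The function names are hypothetical, the rule of thumb is read as a hit rate of 5 percent, and the figure of 10 tuples per page is an assumption that the slide does not state for this table.

```python
import math


def index_page_accesses(n_tuples: int, hit_rate: float) -> int:
    """Worst case for a secondary index: one random page access per hit."""
    return round(n_tuples * hit_rate)


def scan_pages(n_tuples: int, tuples_per_page: int) -> int:
    """A table scan reads every page of the table exactly once."""
    return math.ceil(n_tuples / tuples_per_page)


def index_worthwhile(hit_rate: float, marginal_hit_rate: float = 0.05) -> bool:
    """Rule of thumb: use the index only up to the marginal hit rate."""
    return hit_rate <= marginal_hit_rate


n = 1_000_000
female_hit_rate = 0.5                                        # about half the customers
random_accesses = index_page_accesses(n, female_hit_rate)    # 500,000
sequential_pages = scan_pages(n, tuples_per_page=10)         # assumed page fill
```

Under these assumptions the index needs five times as many page accesses as the scan, and the random accesses are more expensive than the sequential reads of the scan.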

Bitmap Index
Idea (long history, since the 1960s)
- Create a bitmap/bitlist for each attribute value
- Each tuple in the table is assigned to one bit in the bitmap (by position: sequential TID)
- Bit values: 1 = attribute value set, 0 = attribute value not set
Necessary condition: sequential numbering of the tuples (TIDs)
Example: bitmaps for the attribute Sex
F: 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
M: 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Table
Name    Sex  Region  Race
Carol   f    n       white
Harold  m    e       black
Anne    f    e       asian
Iris    f    ne      white
…       m    se      hisp
…       f    e       white
…       f    sw      asian
…       f    w       black
…       f    n       asian
…       m    e       hisp
…       m    se      black
…       f    s       white
…       m    nw      black
…       f    s       white
…       f    w       black
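
Building the bitmaps for one column can be sketched as follows. The function name is hypothetical, bitmaps are modelled as plain Python lists of 0/1 rather than packed bits, and the list position stands in for the sequential TID.

```python
from collections import defaultdict
from typing import Dict, List


def build_bitmaps(column: List[str]) -> Dict[str, List[int]]:
    """Create one bitmap per distinct attribute value of the column."""
    bitmaps: Dict[str, List[int]] = defaultdict(lambda: [0] * len(column))
    for position, value in enumerate(column):
        bitmaps[value][position] = 1    # bit position = sequential TID
    return dict(bitmaps)


# The Sex column of the example table, in tuple order.
sex = ["f", "m", "f", "f", "m", "f", "f", "f",
       "f", "m", "m", "f", "m", "f", "f"]
bitmaps = build_bitmaps(sex)
```

The resulting `bitmaps["f"]` and `bitmaps["m"]` reproduce the F and M bit lists of the example.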

Querying Bitmap Indexes
Main advantage of bitmap indexes
- Simple and efficient logical combination of predicates possible
- Read only the data that is relevant for the predicates
- Example: σ_{Sex='f' ∧ Region='n'}(R), bitmaps B1 and B2 in conjunction:
  for (i = 0; i < B1.length; i++) B[i] = B1[i] & B2[i];
Example: I/O cost estimation
- Query: σ_{Sex='f' ∧ Region='n' ∧ Race='Asian'}(R) ("Asian women of region North")
- Selectivity: 1/2 · 1/8 · 1/4 = 1/64
- N = 10000 tuples with a length of 400 bytes each (~ 10 tuples per page for 4 kB pages)
- Table scan: 1000 pages
- Bitmap access: 10000/64 ≈ 156 pages (worst case: each tuple in a different page) plus 1 page for the bitmaps
Bitmaps
F:             1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N:             0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A:             0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
F AND N AND A: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
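
The conjunction and the cost estimate can be checked with a short sketch. All variable and function names are hypothetical, bitmaps are again plain lists of 0/1, and a 4 kB page is taken as 4096 bytes.

```python
import math
from fractions import Fraction
from typing import List

F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]   # Sex = 'f'
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]   # Region = 'n'
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]   # Race = 'Asian'


def bitmap_and(*bitmaps: List[int]) -> List[int]:
    """Bitwise AND over any number of equally long bitmaps."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


result = bitmap_and(F, N, A)
matching_positions = [i for i, bit in enumerate(result) if bit]

# I/O cost estimate with the numbers from the example.
selectivity = Fraction(1, 2) * Fraction(1, 8) * Fraction(1, 4)
n_tuples = 10_000
tuples_per_page = 10
scan_pages = math.ceil(n_tuples / tuples_per_page)
data_pages = round(n_tuples * selectivity)      # worst case: one page per hit
bitmap_bytes = 3 * math.ceil(n_tuples / 8)      # three bitmaps, one bit per tuple
bitmap_pages = math.ceil(bitmap_bytes / 4096)   # assumed page size of 4096 bytes
```

Only position 2 survives the conjunction. The three bitmaps together occupy 3750 bytes and therefore fit into one page, so the estimate comes to about 157 page accesses compared with 1000 for the table scan.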
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths – Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees, and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (1)
Insertion Rule insert only into leaf nodes At Non-Leaf Nodes descend down the tree as for the search S le Ki follow Pi-1 S gt Ki check Ki+1 S gt K2k follow P2k
At Leaf Node Insert the data record according to the sorting order Special case leaf node is full (2k records) rarr split the leaf node
Splitting Generate a new leaf node Split the 2k+1 entries (in order)
into two leaf nodes first k entries rarr left node last k entries rarr right node
middle entry (k+1-th) is used as new ldquodiscriminatorrdquo (branching) and inserted into the parent node
| 291
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (2)
Node Splitting during Insertion Two possible situations after a split The parent node is full rarr repeat split on this level Enough space rarr FINISHED
Special case root split Split of the root node rarr New root with two successor nodes Height of a tree grows by 1 The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing) No unloading or loading necessary Tree is always balanced But In case of many insertions deletions reorganization can be beneficial
| 292
gt
copy A Behrend Foundations of Information Systems |
Insertions in the B-Tree (3)
Insertion Example Order k = 1 n=2k Keys 1 5 2 6 7 4 8 3
Finally h=3
| 293
gt
copy A Behrend Foundations of Information Systems |
Insertion and Deletion in the B-Tree
Problem Insertion can create overflow Deletion can create underflow and overflow
Example Insertion of key 22
rarr Overflow rarr Split
Deletion of key 22 rarr Underflow need to access all four nodes finally same as input
Insert 22
| 294
gt
copy A Behrend Foundations of Information Systems |
Underflow Merge
Deletion in the B-Tree
Example Order k = 1 n=2k Delete key 3
| 295
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (2)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Remember Each path from the root to the leaf has the same length h
| 296
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (3)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Overflow Split
| 297
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (4)
Example Order k = 1 n=2k Delete key 3
Overflow Split
Root Split
| 298
gt
copy A Behrend Foundations of Information Systems |
Deletion Algorithm
Example ndash there are different algorithms Search the node that contains the key K to be deleted If key K is in a leaf node delete the key in the leaf node and handle potentially
resulting underflow by merging with sibling If key K is in an inner node pull up new discriminator from one of the successors Analyze which successor node of K has more elements left or right one
If both have the same number of elements decide for one Replace the key K to be deleted with the direct successor Krsquo from the left
successor node or with the direct successor Krsquorsquo from the right successor node respectively Delete Krsquo or Krsquorsquo from the respective successor node (recursively)
Note Major variants Merge (tis lecture) Re-distribution (instead of splitmerge in case of overflowunderflow the entries are
re-distributed under consideration of one or multiple adjacent nodes)
| 299
gt
copy A Behrend Foundations of Information Systems |
B-Trees B+-Trees and B-Trees
B+-Trees and B-Trees Data is only in leaf nodes Key redundancy but higher fan-out rarr lower tree high less IO Simpler delete procedure rarr requires only merging of nodes
Double linked list of all leaf nodes B-Trees Modified valid node sizes
from [k2k] to [43k2k] rarr better node utilization but more splitsmerges
Example Secondary index Non unique
B-Tree with k = 2 h = 3
| 300
gt
copy A Behrend Foundations of Information Systems |
Indexing Low Cardinality Columns
Problem Example B-tree on the sex of customers for a table with 1000000 tuples results in
two lists with approximately 500000 tuples each
Query for all female customers requires 500000 random page accesses (secondary index) Table scan would be much faster
Conclusion B-trees (and also hashing) are useful for predicates with low selectivity
(outputinput cardinality ratio) Rule of thumb margin hit rate is approx 5 higher hit rates do not justify the efforts for an index access
F M
TID TID TID TID hellip TID TID TID TID hellip
| 301
gt
copy A Behrend Foundations of Information Systems |
Bitmap Index
Idea (Long history since the 1960s) Create a bitmapbitlist for each
attribute value Each tuple in the table is assigned to
one bit in the bitmap (by position sequential TID)
Bit values 1 attribute value set 0 attribute value not set
Necessary condition Sequential numbering of the tuples
(TIDs)
F M
1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Name Sex Region Race Carol f n white
Harold m e black Anne f e asian
Iris f ne white hellip m se hisp hellip f e white hellip f sw asian hellip f w black hellip f n asian hellip m e hisp hellip m se black hellip f s white hellip m nw black hellip f s white hellip f w black
Sex
| 302
gt
copy A Behrend Foundations of Information Systems |
Querying Bitmap Indexes
Main advantage of bitmap indexes Simple and efficient logical join possible Read only data that is relevant for predicates Example σSex=lsquoflsquo ᴧ Region=lsquonlsquo R Bitmaps B1 and B2 in conjunction for (i=0 iltB1length i++) B = B1[i] amp B2[i]
Example IO Costs Estimation σSex=lsquoflsquo ᴧ Region=lsquonlsquo ᴧ Race=lsquoAsianlsquo R
(ldquoAsian women of region Northrdquo) Selectivity 12 18 14 = 164 N=10000 tuples with length of 400 bytes each
(~ 10 tuples per page for 4kB pages) Table scan 1000 pages Bitmap access 1000064 156 pages (worst case each tuple in a different page)
plus 1 page for bitmaps
F 1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
N 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
A 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
AND = AND
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity amp Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths ndash Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees, and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
Insertions in the B-Tree (2)
Node splitting during insertion: two possible situations after a split
- The parent node is full → repeat the split on this level
- Enough space → FINISHED
Special case: root split
- Split of the root node → new root with two successor nodes
- The height of the tree grows by 1
- The tree has been split from the bottom to the top
Dynamic reorganization (self-balancing)
- No unloading or loading necessary
- The tree is always balanced
- But: in case of many insertions/deletions, reorganization can be beneficial
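The split step and its propagation to the parent can be sketched as follows. This is a minimal illustration on plain key lists, assuming that a node overflows at 2k+1 keys and that the middle key moves up; the function names `split_node` and `insert_into_parent` are hypothetical.

```python
def split_node(keys, k):
    """Split an overflowing node that holds 2k+1 sorted keys.

    Returns (left_keys, separator, right_keys); the separator moves up
    into the parent node.
    """
    if len(keys) != 2 * k + 1:
        raise ValueError("expected exactly 2k+1 keys in an overflowing node")
    return keys[:k], keys[k], keys[k + 1:]


def insert_into_parent(parent_keys, separator, k):
    """Add the separator to the parent and report whether the parent
    now overflows, i.e. whether the split must be repeated one level up."""
    new_keys = sorted(parent_keys + [separator])
    return new_keys, len(new_keys) > 2 * k


left, sep, right = split_node([1, 2, 5], k=1)
print(left, sep, right)
print(insert_into_parent([2], 6, k=1))      # enough space: finished
print(insert_into_parent([2, 6], 4, k=1))   # parent full: split again
```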
Insertions in the B-Tree (3)
Insertion example: order k = 1, n = 2k
Keys: 1, 5, 2, 6, 7, 4, 8, 3
Finally: h = 3
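The example can be replayed with a compact B-tree insertion sketch. It assumes the usual bottom-up scheme (insert into the leaf, split a node once it holds 2k+1 keys, move the middle key up); the class and method names (`Node`, `BTree`, `insert`, `height`, `in_order`) are hypothetical and not taken from the lecture.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list means leaf node

    @property
    def is_leaf(self):
        return not self.children


class BTree:
    def __init__(self, k):
        self.k = k  # every node holds at most 2k keys
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:
            # Root split: new root with two successor nodes, height + 1.
            separator, right = split
            self.root = Node([separator], [self.root, right])

    def _insert(self, node, key):
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return None  # key already present, nothing to do
        if node.is_leaf:
            node.keys.insert(i, key)
        else:
            split = self._insert(node.children[i], key)
            if split is None:
                return None
            separator, right = split
            node.keys.insert(i, separator)
            node.children.insert(i + 1, right)
        if len(node.keys) <= 2 * self.k:
            return None  # enough space: finished
        return self._split(node)

    def _split(self, node):
        mid = self.k  # index of the middle key among 2k+1 keys
        separator = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return separator, right

    def height(self):
        h, node = 1, self.root
        while not node.is_leaf:
            node = node.children[0]
            h += 1
        return h

    def in_order(self, node=None):
        node = node or self.root
        if node.is_leaf:
            return list(node.keys)
        out = []
        for i, child in enumerate(node.children):
            out += self.in_order(child)
            if i < len(node.keys):
                out.append(node.keys[i])
        return out


tree = BTree(k=1)
for key in [1, 5, 2, 6, 7, 4, 8, 3]:
    tree.insert(key)
print(tree.height(), tree.root.keys, tree.in_order())
```

Under these assumptions the last insertion (key 3) overflows a leaf and then the root, so the tree ends with height 3, in line with the slide.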
Insertion and Deletion in the B-Tree
Problem
- Insertion can create overflow
- Deletion can create underflow and overflow
Example
- Insertion of key 22 → overflow → split
- Deletion of key 22 → underflow; need to access all four nodes; finally the same as the input
Figure: Insert 22
Deletion in the B-Tree
Example: order k = 1, n = 2k; delete key 3
Figure: underflow → merge
Deletion in the B-Tree (2)
Example: order k = 1, n = 2k; delete key 3
Figure: underflow → merge
Remember: each path from the root to a leaf has the same length h
Deletion in the B-Tree (3)
Example: order k = 1, n = 2k; delete key 3
Figure: underflow → merge; overflow → split
Deletion in the B-Tree (4)
Example: order k = 1, n = 2k; delete key 3
Figure: overflow → split; root split
Deletion Algorithm
Example (there are different algorithms)
- Search the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successors
- Analyze which successor node of K has more elements, the left or the right one
- If both have the same number of elements, decide for one
- Replace the key K to be deleted with the direct successor K′ from the left successor node or with the direct successor K″ from the right successor node, respectively
- Delete K′ or K″ from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)
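The merge step for a leaf underflow, including the follow-up split that the deletion example shows ("underflow → merge, overflow → split"), might look like the sketch below. It works on plain key lists, and the function name `merge_on_underflow` and its return format are assumptions made for illustration; it covers only the leaf case, not the full recursive algorithm.

```python
def merge_on_underflow(left_keys, separator, right_keys, k):
    """Merge two sibling leaves and the separator key from their parent.

    Returns ("merged", keys) if the merged node fits into 2k keys,
    otherwise ("split", left, new_separator, right): the merged node
    overflows and is split again, and the new separator goes back up.
    """
    merged = left_keys + [separator] + right_keys
    if len(merged) <= 2 * k:
        return ("merged", merged)
    mid = len(merged) // 2
    return ("split", merged[:mid], merged[mid], merged[mid + 1:])


# k = 1: the right leaf became empty after a deletion; its sibling holds one key.
print(merge_on_underflow([1], 2, [], k=1))
# k = 1: the left leaf became empty, but the sibling is full -> merge overflows.
print(merge_on_underflow([], 4, [5, 6], k=1))
```

In the first case the parent loses its separator and may underflow itself, which is why the merge can propagate upwards.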
B-Trees, B+-Trees, and B*-Trees
B+-Trees and B*-Trees
- Data is only in the leaf nodes
- Key redundancy, but higher fan-out → lower tree height, less I/O
- Simpler delete procedure → requires only merging of nodes
- Doubly linked list of all leaf nodes
B*-Trees
- Modified valid node sizes: from [k, 2k] to [4/3 k, 2k] → better node utilization, but more splits/merges
Example
- Secondary index, non-unique
- B-Tree with k = 2, h = 3
Indexing Low Cardinality Columns
Problem
- Example: a B-tree on the sex of customers for a table with 1,000,000 tuples results in two lists with approximately 500,000 tuples each
- A query for all female customers requires 500,000 random page accesses (secondary index)
- A table scan would be much faster
Conclusion
- B-trees (and also hashing) are useful for predicates with low selectivity (output/input cardinality ratio), i.e. predicates that select few tuples
- Rule of thumb: the marginal hit rate is approx. 5%; higher hit rates do not justify the effort of an index access
Figure: two TID lists, F: TID TID TID TID … and M: TID TID TID TID …
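A back-of-the-envelope comparison makes the conclusion concrete. The sketch below is hypothetical: the function names are invented, and the value of 10 tuples per page is an assumption (the slide does not state the page capacity for this example). The index access is costed at one random page access per qualifying tuple, as in the slide.

```python
import math


def page_accesses(n_tuples, hit_rate, tuples_per_page):
    """Return (table scan pages, index page accesses).

    The index access is the worst case for a secondary index: one
    random page access per qualifying tuple.
    """
    scan = math.ceil(n_tuples / tuples_per_page)
    index = round(n_tuples * hit_rate)
    return scan, index


def index_access_justified(hit_rate, threshold=0.05):
    """Rule of thumb from the slide: use the index only up to about 5% hits."""
    return hit_rate <= threshold


# Sex column: about half of the 1,000,000 tuples qualify.
print(page_accesses(1_000_000, 0.5, tuples_per_page=10))
print(index_access_justified(0.5), index_access_justified(0.01))
```

With the assumed page capacity the scan touches 100,000 pages sequentially, while the index access needs 500,000 random page accesses.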
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity amp Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths ndash Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees B+-Trees and B-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
-
| 296
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (3)
Example Order k = 1 n=2k Delete key 3
Underflow Merge
Overflow Split
| 297
gt
copy A Behrend Foundations of Information Systems |
Deletion in the B-Tree (4)
Example Order k = 1 n=2k Delete key 3
Overflow Split
Root Split
| 298
gt
copy A Behrend Foundations of Information Systems |
Deletion Algorithm
Example ndash there are different algorithms Search the node that contains the key K to be deleted If key K is in a leaf node delete the key in the leaf node and handle potentially
resulting underflow by merging with sibling If key K is in an inner node pull up new discriminator from one of the successors Analyze which successor node of K has more elements left or right one
If both have the same number of elements decide for one Replace the key K to be deleted with the direct successor Krsquo from the left
successor node or with the direct successor Krsquorsquo from the right successor node respectively Delete Krsquo or Krsquorsquo from the respective successor node (recursively)
Note Major variants Merge (tis lecture) Re-distribution (instead of splitmerge in case of overflowunderflow the entries are
re-distributed under consideration of one or multiple adjacent nodes)
| 299
gt
copy A Behrend Foundations of Information Systems |
B-Trees B+-Trees and B-Trees
B+-Trees and B-Trees Data is only in leaf nodes Key redundancy but higher fan-out rarr lower tree high less IO Simpler delete procedure rarr requires only merging of nodes
Double linked list of all leaf nodes B-Trees Modified valid node sizes
from [k2k] to [43k2k] rarr better node utilization but more splitsmerges
Example Secondary index Non unique
B-Tree with k = 2 h = 3
| 300
gt
copy A Behrend Foundations of Information Systems |
Indexing Low Cardinality Columns
Problem Example B-tree on the sex of customers for a table with 1000000 tuples results in
two lists with approximately 500000 tuples each
Query for all female customers requires 500000 random page accesses (secondary index) Table scan would be much faster
Conclusion B-trees (and also hashing) are useful for predicates with low selectivity
(outputinput cardinality ratio) Rule of thumb margin hit rate is approx 5 higher hit rates do not justify the efforts for an index access
F M
TID TID TID TID hellip TID TID TID TID hellip
| 301
gt
copy A Behrend Foundations of Information Systems |
Bitmap Index
Idea (Long history since the 1960s) Create a bitmapbitlist for each
attribute value Each tuple in the table is assigned to
one bit in the bitmap (by position sequential TID)
Bit values 1 attribute value set 0 attribute value not set
Necessary condition Sequential numbering of the tuples
(TIDs)
F M
1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
0 1 0 0 1 0 0 0 0 1 1 0 1 0 0
Name Sex Region Race Carol f n white
Harold m e black Anne f e asian
Iris f ne white hellip m se hisp hellip f e white hellip f sw asian hellip f w black hellip f n asian hellip m e hisp hellip m se black hellip f s white hellip m nw black hellip f s white hellip f w black
Sex
| 302
gt
copy A Behrend Foundations of Information Systems |
Querying Bitmap Indexes
Main advantage of bitmap indexes Simple and efficient logical join possible Read only data that is relevant for predicates Example σSex=lsquoflsquo ᴧ Region=lsquonlsquo R Bitmaps B1 and B2 in conjunction for (i=0 iltB1length i++) B = B1[i] amp B2[i]
Example: I/O cost estimation for σ[Sex='f' ∧ Region='n' ∧ Race='Asian'](R) ("Asian women of region North")
- Selectivity: 1/2 * 1/8 * 1/4 = 1/64
- N = 10,000 tuples with a length of 400 bytes each (~10 tuples per page for 4 kB pages)
- Table scan: 1,000 pages
- Bitmap access: 10,000/64 ≈ 156 pages (worst case: each tuple in a different page), plus 1 page for the bitmaps
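The arithmetic of this estimate can be reproduced step by step. The sketch below only restates the numbers from the slide; that the three bitmaps fit on one page is derived here from the assumption of 1 bit per tuple and a 4096-byte page.

```python
import math
from fractions import Fraction

NUM_TUPLES = 10_000
TUPLE_BYTES = 400
PAGE_BYTES = 4096  # "4 kB pages"

# Combined selectivity of the three predicates (assumed independent).
selectivity = Fraction(1, 2) * Fraction(1, 8) * Fraction(1, 4)

# Table scan: every page is read.
tuples_per_page = PAGE_BYTES // TUPLE_BYTES
table_scan_pages = math.ceil(NUM_TUPLES / tuples_per_page)

# Bitmap access, worst case: every qualifying tuple lies on its own page.
qualifying_tuples = int(NUM_TUPLES * selectivity)

# Three bitmaps with one bit per tuple each.
bitmap_bytes = 3 * math.ceil(NUM_TUPLES / 8)
bitmap_pages = math.ceil(bitmap_bytes / PAGE_BYTES)

bitmap_access_pages = qualifying_tuples + bitmap_pages
print(selectivity, table_scan_pages, qualifying_tuples, bitmap_pages, bitmap_access_pages)
```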
F          1 0 1 1 0 1 1 1 1 0 0 1 0 1 1
AND N      0 1 1 0 0 1 0 0 0 1 0 0 0 0 0
AND A      0 0 1 0 0 0 1 0 1 0 0 0 0 0 0
= result   0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
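The conjunction of the three bitmaps can be sketched in a few lines. The bit lists are copied from the figure; the helper name bitmap_and is hypothetical, and a real system would operate on packed machine words rather than Python lists.

```python
def bitmap_and(*bitmaps):
    """Bitwise AND of equally long bit lists."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


F = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]  # Sex = 'f'
N = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # Region = 'n'
A = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]  # Race = 'Asian'

result = bitmap_and(F, N, A)
# Positions (sequential TIDs, counted from 0) of the qualifying tuples.
qualifying_positions = [i for i, bit in enumerate(result) if bit]

print(*result)
print(qualifying_positions)
```

Only the tuples at the qualifying positions have to be fetched from the table, which is why the bitmap access reads far fewer pages than the scan.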
- DBMS Architecture
- What is in the Lecture
- How is Database System build
- Architectural Blue Print
- Architectural Trends
- Different Access Characteristics
- Hardware Developments
- Row Storage vs Column Storage
- Processing Models
- Transaction Management
- ACID Properties of Transactions
- Motivation
- Reasons for crashes
- Guarantee Atomicity & Durability
- Record Management
- Record
- Record Addressing
- Surrogate with Mapping Table
- TID Concept
- Free Space Management
- Physical Access Paths - Index Structures
- Overview Indexes
- Overview Indexes (2)
- Classification of Index Structures
- B-Tree
- B-Tree (2)
- Search in the B-Tree
- Insertions in the B-Tree (1)
- Insertions in the B-Tree (2)
- Insertions in the B-Tree (3)
- Insertion and Deletion in the B-Tree
- Deletion in the B-Tree
- Deletion in the B-Tree (2)
- Deletion in the B-Tree (3)
- Deletion in the B-Tree (4)
- Deletion Algorithm
- B-Trees, B+-Trees and B*-Trees
- Indexing Low Cardinality Columns
- Bitmap Index
- Querying Bitmap Indexes
Deletion in the B-Tree (4)
Example: order k = 1 (n = 2k), delete key 3
[Figure: deletion steps, with the labels "Overflow", "Split" and "Root Split"]
Deletion Algorithm
Example (there are different algorithms)
- Search for the node that contains the key K to be deleted
- If key K is in a leaf node: delete the key in the leaf node and handle a potentially resulting underflow by merging with a sibling
- If key K is in an inner node: pull up a new discriminator from one of the successors
  - Analyze which successor node of K has more elements, the left or the right one; if both have the same number of elements, decide for one
  - Replace the key K to be deleted with its direct predecessor K' from the left successor node or with its direct successor K'' from the right successor node, respectively
  - Delete K' or K'' from the respective successor node (recursively)
Note: major variants
- Merge (this lecture)
- Re-distribution (instead of split/merge in case of overflow/underflow, the entries are re-distributed under consideration of one or multiple adjacent nodes)
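The steps above can be sketched as follows. This is one possible reading of the merge variant, not the lecture's reference implementation: the names Node, delete and inorder are hypothetical, keys are assumed to be unique, and an underflow is handled by merging with a sibling and splitting the merged node again if it overflows.

```python
import bisect


class Node:
    """B-tree node of order k (k..2k keys, root excepted); no children means leaf."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def _edge_key(node, rightmost):
    """Largest (rightmost=True) or smallest key in the subtree below node."""
    while node.children:
        node = node.children[-1 if rightmost else 0]
    return node.keys[-1 if rightmost else 0]


def _fix_underflow(parent, i, k):
    """Child i has fewer than k keys: merge it with a sibling and the separator."""
    j = i - 1 if i > 0 else i                      # index of the separator key
    left, right = parent.children[j], parent.children[j + 1]
    merged = Node(left.keys + [parent.keys[j]] + right.keys,
                  left.children + right.children)
    if len(merged.keys) <= 2 * k:                  # merged node fits
        del parent.keys[j]
        parent.children[j:j + 2] = [merged]
    else:                                          # overflow: split again
        mid = len(merged.keys) // 2
        cut = mid + 1 if merged.children else 0
        parent.keys[j] = merged.keys[mid]
        parent.children[j] = Node(merged.keys[:mid], merged.children[:cut])
        parent.children[j + 1] = Node(merged.keys[mid + 1:], merged.children[cut:])


def _delete(node, key, k):
    i = bisect.bisect_left(node.keys, key)
    found = i < len(node.keys) and node.keys[i] == key
    if not node.children:                          # leaf: just remove the key
        if found:
            del node.keys[i]
        return
    if found:                                      # inner node: pull up a neighbour
        left, right = node.children[i], node.children[i + 1]
        use_left = len(left.keys) >= len(right.keys)
        key = _edge_key(left if use_left else right, rightmost=use_left)
        node.keys[i] = key                         # new discriminator K' or K''
        i = i if use_left else i + 1
    _delete(node.children[i], key, k)              # recurse into the subtree
    if len(node.children[i].keys) < k:
        _fix_underflow(node, i, k)


def delete(root, key, k):
    """Delete key and return the (possibly new) root."""
    _delete(root, key, k)
    if not root.keys and root.children:            # tree shrinks by one level
        return root.children[0]
    return root


def inorder(node):
    """All keys of the subtree in sorted order."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


root = Node([4], [Node([2], [Node([1]), Node([3])]),
                  Node([6], [Node([5]), Node([7])])])
root = delete(root, 3, k=1)
print(inorder(root))  # [1, 2, 4, 5, 6, 7]
```

In the usage example the leaf underflow propagates up to the root, so the tree loses one level. The re-distribution variant mentioned in the note would replace the merge-then-split step in _fix_underflow.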