cm20145 file structure and indexing dr alwyn barry dr joanna bryson
Post on 21-Dec-2015
219 views
TRANSCRIPT
CM20145CM20145File Structure and File Structure and IndexingIndexing
Dr Alwyn BarryDr Joanna Bryson
Last TimeLast Time Locking & Two-Phase Locking Review How Locks Work Other Possible Mechanisms
Graph-Based Protocols Timestamp-Based Protocols Validation-Based Protocols
Further Refinements Multiple Granularity Multi-version Schemes
Deadlock Handling Prevention Cures
Locking & Indexes
Now: File Structure & Indexing
OverviewOverview Storage Access & Buffers File Organization
Fixed Length Records Variable Length
Organization of Records in Files Sequential & Clustering
Data-Dictionary Storage Intro to Indexing
Basic Concepts Ordered Indices
Dense & Sparse Multilevel Primary & Secondary
Storage Hierarchy (Lecture 9)Storage Hierarchy (Lecture 9)
©Silberschatz, Korth and Sudarshan
Modifications & additions by S Bird, J Bryson
Storage AccessStorage Access Blocks
A fixed-length unit. The units for storage allocation and data transfer. Database files are organized into blocks.
Buffer portion of main memory available to store copies of
disk blocks.
Buffer Manager subsystem responsible for allocating buffer space in
main memory.
Block Transfers Want to minimize the number of block transfers
between disk and memory. Keep as many blocks as possible in main memory.
Trans. Server Processes (Lec 9)Trans. Server Processes (Lec 9)
Buffer ManagerBuffer Manager Called when need a block from disk. If the block is already in the buffer:
the requesting program is given the address of the block in main memory.
Otherwise:1. The buffer manager allocates space in the
buffer for the block: Discard some other block, if necessary for
space. Discarded block is written back to disk if it
was modified since it was last fetched.2. The buffer manager:
Reads the blocks from disk to buffer. Passes block’s new main memory address
to requesting program.
Buffer-Replacement PoliciesBuffer-Replacement Policies Most operating systems replace the block using
the least recently used (LRU) strategy. Past usage often predicts future.
But database queries have well-defined access patterns (e.g. sequential scans). A database system can predict future
references from information in a query. LRU can be a bad strategy for access patterns
that repeatedly scan tables. Join of r and s computed by nested loops
for each tuple tr of r do for each tuple ts of s do if the tuples tr and ts match …
Want a mixed replacement strategy with information provided by the query optimizer.
Buffer-Replacement Policies (2)Buffer-Replacement Policies (2) Pinned block (e.g for inner loop, s)
Memory block that is not allowed to be written back to disk.
Toss-immediately strategy (e.g. for r) frees the space occupied by a block as soon as
its final tuple has been processed. Most recently used (MRU) strategy (for s)
If know iterating through table, then most recent block will be unused the longest.
Buffer manager can use statistical information regarding the probability that a request will reference a particular relation. E.g. the data dictionary is frequently accessed.
Heuristic: always keep data-dictionary blocks pinned in main memory.
OverviewOverview Storage Access & Buffers File Organization
Fixed Length Records Variable Length
Organization of Records in Files Sequential & Clustering
Data-Dictionary Storage Intro to Indexing
Basic Concepts Ordered Indices
Dense & Sparse Multilevel Primary & Secondary
File OrganizationFile Organization The database is stored as a collection of
files. Each file is a sequence of records. A record is a sequence of fields. One approach:
assume record size is fixed, each file has records of one particular type
only, different files are used for different
relations.
This case is easiest to implement, will consider variable length fields later.
Fixed-Length RecordsFixed-Length Records Simple approach:
Store record i starting from byte n (i – 1), where n is the size of each record.
Record access is simple but records may cross blocks. Modification: do not allow records to cross block
boundaries.
Deletion of record I: alternatives: move records i+1,…,n
to i,…,n – 1 move record n to i do not move records, but link all free records on a free list.
Free ListsFree Lists Store the address of first deleted record in file
header. Then store address of second deleted record in location of first, etc.
Addresses are pointers to free space. More space efficient representation: store
addresses in attributes, not one per record. (No pointers stored in in-use records.)
Variable-Length RecordsVariable-Length Records Variable-length records arise in database
systems in several ways: Storage of multiple record types in one file. Record types that allow variable lengths for
one or more fields. Record types that allow repeating fields:
used in some older data models, not 1st Normal Form!
Approachesa)Byte strings,b)Slotted pages,c) Fixed length representation: reserved space,d)Fixed length representation: pointers.
a) Byte-String Representationa) Byte-String Representation
Byte string representation:• Attach an end-of-record () control character to the end of each record.• Difficulties with insertions and deletions.
b) Slotted Page Representationb) Slotted Page Representation
Slotted page header contains: number of record entries end of free space in the block location and size of each record
Records can be moved around within a page to keep them contiguous with no empty space between them; entry in the header must be updated.
Pointers should not point directly to record — instead they should point to the entry for the record in header.
c) Fixed Length Rep: reserved spacec) Fixed Length Rep: reserved space Fixed-length representation:
reserved space pointers
Reserved space – can use fixed-length records of a known maximum length; unused space in shorter records filled with a null or end-of-record symbol.
d) Fixed Length Rep: Pointersd) Fixed Length Rep: Pointers
Pointer method A variable-length record is represented by a list of
fixed-length records, chained together via pointers. Can be used even if the maximum record length is
not known. May waste less space.
d) Fixed Length Rep: Pointers (2)d) Fixed Length Rep: Pointers (2) Space might still be wasted in all records
except the first in a chain. Solution is to allow two kinds of block in file:
Anchor block – contains the first records of chain Overflow block – contains records other than those
that are the first records of chains
OverviewOverview Storage Access & Buffers File Organization
Fixed Length Records Variable Length
Organization of Records in Files Sequential & Clustering
Data-Dictionary Storage Intro to Indexing
Basic Concepts Ordered Indices
Dense & Sparse Multilevel Primary & Secondary
Organization of Records in FilesOrganization of Records in Files Heap (won’t consider this)
a record can be placed anywhere in the file where there is space.
Sequential store records in sequential order, based on the value
of the search key of each record.
Clustering file organization records of several different relations can be stored in
the same file. Storing related records on same block minimizes I/O.
Hashing (next lecture) a hash function computed on some attribute of each
record, result of function specifies which block the record
should be placed in.
Sequential File OrganizationSequential File Organization Suitable for applications that require
sequential processing of the entire file. The records in the file are ordered by a
search key.
Sequential File Organization (2)Sequential File Organization (2) Deletion – use pointer chains. Insertion – locate the position where the record is
to be inserted. If there is free space insert there. If no free space, insert the record in an overflow block. In either case, pointer chain must be updated.
Need to reorganize the file occasionally to restore sequential order
Sequential File Organization (3)Sequential File Organization (3)
It is necessary to fill up the space once a record has been deleted
Shift everything up
Move last record
Clustering File OrganizationClustering File Organization Simple file structure stores each relation in a separate
file. Can instead store several relations in one file using a
clustering file organization. E.g., clustering organization of customer and depositor:
Clustering File Organization (2)Clustering File Organization (2)
Good for queries involving depositor customer, and for queries involving one single customer and their accounts.
Bad for queries involving only customer. Results in variable size records.
Add Pointer ChainsAdd Pointer Chains
Clustering File Organization (3)Clustering File Organization (3)
OverviewOverview Storage Access & Buffers File Organization
Fixed Length Records Variable Length
Organization of Records in Files Sequential & Clustering
Data-Dictionary Storage Intro to Indexing
Basic Concepts Ordered Indices
Dense & Sparse Multilevel Primary & Secondary
Data Dictionary StorageData Dictionary Storage
Information about relations: names of relations, names and types of attributes of each relation, names and definitions of views, integrity constraints.
User / accounting information, e.g. passwords. Statistical and descriptive data:
number of tuples in each relation, access frequency. Physical file organization information:
How relation is stored (sequential/hash/…). Physical location of relation:
operating system file name, or disk addresses of blocks containing relation records.
Information about indices.
The data dictionary (or system catalog) stores metadata (data about data), such as:
Data Dictionary Storage (2)Data Dictionary Storage (2) Catalog structure: use either
specialized data structures designed for efficient access, or
a set of relations, with existing system features used to ensure efficient access.
The latter alternative is usually preferred. A possible catalog representation:
Relation-metadata = (relation-name, number-of-attributes, storage-organization, location)Attribute-metadata = (attribute-name, relation-name, domain-type,
position, length)User-metadata = (user-name, encrypted-password, group)Index-metadata = (index-name, relation-name, index-type,
index-attributes)View-metadata = (view-name, definition)
OverviewOverview Storage Access & Buffers File Organization
Fixed Length Records Variable Length
Organization of Records in Files Sequential & Clustering
Data-Dictionary Storage Intro to Indexing
Basic Concepts Ordered Indices
Dense & Sparse Multilevel Primary & Secondary
Indexing: Basic ConceptsIndexing: Basic Concepts Indexing mechanisms are used to speed
up access to desired data. E.g., author catalog in library
Search Key - attribute or set of attributes used to look up records.
An index file consists of records (called index entries) of the form
Index much smaller than original file. Two basic kinds of indices:
Ordered indices: search keys stored sorted. Hash indices: search keys distributed
uniformly across buckets using hash function.
search-key pointer
Index Evaluation MetricsIndex Evaluation Metrics Types:
Access types supported efficiently. E.g., records with a specified value in the
attribute. records with an attribute value falling in a
specified range.
Time: Access time Insertion time Deletion time
Space: Space overhead
Ordered IndicesOrdered Indices Ordered index:
Index entries are stored sorted on the search key value. E.g., author catalog in library.
Primary index: In a sequentially ordered file, the index that’s search
key specifies the sequential order of the file. Also called clustering index. The search key of a primary index is usually but not
necessarily the primary key. Secondary index:
An index whose search key specifies an order different from the sequential order of the file.
Also called the non-clustering index. Index-sequential file:
Ordered sequential file with a primary index. Efficient for random access and sequential search.
Dense Index Files: ExampleDense Index Files: Example
Dense index — Index record appears for every search-key value in the file.
Dense Index Files: UpdatesDense Index Files: Updates Deletion:
If deleted record was only record with its search-key value, delete search-key too.
Insertion: lookup using the search-key value appearing
in the record to be inserted. if the search-key value does not appear in the
index, insert it. Multilevel update algorithms are simple
extensions of the single-level algorithms
Sparse Index FilesSparse Index Files Sparse Index:
Contains index records for only some search-key values.
Applicable only when records are sequentially ordered on search-key.
To locate record with search-key value K: Find index record with largest search-key value
< K, Search file sequentially starting at the record to
which that index record points. Less space and less maintenance overhead for
insertions and deletions than dense index. Generally slower than for locating records Good tradeoff: sparse index with an index entry
for every block in file, corresponding to least search-key value in the block.
Sparse Index Files: ExampleSparse Index Files: Example
Sparse Index Files: DeletionSparse Index Files: Deletion If deleted record was only record in the
file with its particular search-key value, the search-key is deleted too. Replacing entry in the index with the next
search-key value in the file, in search-key order.
If the next search-key value already has an index entry, just delete without replacement.
Sparse Index Files: InsertionSparse Index Files: Insertion Single-level index insertion:
Perform a lookup using the search-key value appearing in the record to be inserted.
If index stores an entry for each block of the file, no change needs to be made to the index unless a new block is created.
If new block created, first search-key value appearing in the new block is inserted into the index.
Multilevel insertion algorithms are simple extensions.
Multilevel IndexMultilevel Index If primary index doesn’t fit in
memory, access becomes expensive.
To reduce number of disk accesses to index records, treat primary index kept on disk as a sequential file and construct a sparse index on it. outer index – a sparse
index of primary index. inner index – the primary
index file. If outer index is too large to fit
in main memory, create another level.
Must update indices at all levels on insertion or deletion!
Secondary IndicesSecondary Indices Often want to find records based on values of
fields which are not the primary index. Example: Suppose the account database is
stored sequentially by account number, e.g.1. find all accounts in a particular branch.2. find all accounts with balances in a
specified range. A secondary index would have an index record
for each search-key value. Index record points to a bucket that contains
pointers to all the actual records with that particular search-key value.
Example: Secondary IndexExample: Secondary Index
Secondary Index on Secondary Index on balancebalance field of field of accountaccount
Primary and Secondary IndicesPrimary and Secondary Indices Indices offer substantial benefits when
searching for records. When a file is modified, every index on
the file must be updated. Updating indices imposes overhead on database modification.
Sequential scan using primary index is efficient, but a sequential scan using a secondary index is expensive: Each record access may fetch a new
block from disk. Secondary indices have to be dense.
SummarySummary Storage Access & Buffers File Organization
Fixed Length Records Variable Length
Organization of Records in Files Sequential & Clustering
Data-Dictionary Storage Intro to Indexing
Basic Concepts Ordered Indices
Dense & Sparse Multilevel Primary & Secondary
Next: Indexing & Hashing
Reading & ExercisesReading & Exercises
Reading: Silberschatz Chapter 11.5 – 11.8 Connolly & Begg 2.4, 20.3.2, 17.2
(sort of), Appendix C (this also covers Lecture 17)
Exercises: Silberschatz 11.7,14,17,20 C & B – no exercises in the
appendix, borrow Silberschatz!
Changing Whole RelationsChanging Whole Relations If two-phase locking is used :
A delete operation may be performed only if the transaction deleting the tuple has an exclusive lock on the tuple to be deleted.
A transaction that inserts a new tuple into the database is given an X-lock on the tuple.
Insertions and deletions can lead to the phantom phenomenon. A transaction that scans a relation (e.g., find
all Perryridge accounts) and a transaction that inserts a tuple in the relation (e.g., insert new Perryridge acct.) may conflict without accessing any tuple in common.
Non-serializable schedules can result! (Find transaction may not see new account, yet be serialized after the insert transaction.)
Insert and Delete OperationsInsert and Delete Operations A transaction scanning a relation is
reading information about what tuples the relation contains, while another transaction changes the same info. Something should be locked!
One solution: Associate a data item with the relation, to
represent the information about what tuples the relation contains.
Transactions scanning the relation acquire a shared lock in the data item.
Transactions inserting or deleting a tuple acquire an exclusive lock on the data item. (Note: locks on the data item do not conflict with locks on individual tuples.)
Locking for Inserts & DeletesLocking for Inserts & Deletes Previous protocol provides very
low concurrency for insertions/deletions.
Index locking protocols provide higher concurrency while preventing the phantom phenomenon, by requiring locks on certain index buckets.
Index Locking ProtocolIndex Locking Protocol Every relation must have at least one
index. Access to a relation must be made only through one of the indices on the relation.
A transaction Ti that performs a lookup must lock all the index buckets that it accesses, in S-mode.
A transaction Ti may not insert a tuple ti into a relation r without updating all indices to r.
Ti must perform a lookup on every index to find all index buckets that could have possibly contained a pointer to tuple ti, had it existed already, and obtain locks in X-mode on all these index buckets. Ti must also obtain locks in X-mode on all index buckets that it modifies.
The rules of the two-phase locking protocol must be observed.
Concurrency in Index StructuresConcurrency in Index Structures Indices are unlike other database items in that
their only job is to help in accessing data. Index-structures are typically accessed very
often, much more than other database items. Treating index-structures like other database
items leads to low concurrency. Two-phase locking on an index may result in transactions executing practically one-at-a-time.
It is acceptable to have nonserializable concurrent access to an index as long as the accuracy of the index is maintained.
There are index concurrency protocols where locks on internal nodes are released early, and not in a two-phase fashion.