tree-structured indexes

Tree-Structured Indexes

Chapter 10

Indexes is access structures which are used to speed up the retrieval of records in respond to a query.

Index fields are used to construct index. Any fields are of the file can be used to create an index and multiple

indexes on different fields can be constructed on the same file. To find a record or records in the file based on certain selection on

an indexing field, one has to initially access the index which points to one or more blocks in the file where the required records are located.

The most types of indexes are based on ordered files (single level indexes) such as ISAM (Indexed Sequential Access Methods) and tree structures (Multilevel indexes) such as B or B+ tree are commonly used in DBMS to implement dynamically changing multilevel indexes.

Tree-structured indexing techniques support both range searches and equality searches.

ISAM: static structure that is effective when the file is not frequently updated; B+ tree: dynamic, adjusts gracefully under inserts and deletes.

Introduction

The idea behind an ordered index access structure is similar to that behind the index used in a textbook, which lists important terms at the end of the book in alphabetical order along with a list of page numbers where the term appears in the book.

``Find all students with gpa > 3.0’’◦ If data is in sorted file, do binary search to find first such student,

then scan to find others.◦ Cost of binary search can be quite high.

Simple idea: create a second file with one record per page of the format [Key, Pointer] called index entries.

Create an `index’ file.

Range Searches

Can do binary search on (smaller) index file!

Page 1 Page 2 Page N Data File

k2 kNk1 Index File

Each index page contains one pointer more than the number of keys. Each key serves as a separator for the contents of the pages pointed to by the pointers to its left and right.

Index file may still be quite large. But we can apply the idea repeatedly!

ISAM

Leaf pages contain data entries.

P0

K1 P

1K 2 P

2K m

P m

index entry

Non-leaf

Pages

Pages

Overflow page

Primary pages

Leaf

Each tree node is a disk page. The data entries are in the leaf pages of the tree and additional overflow

pages chained to some leaf page. When the file is created, all leaf pages are allocated sequentially and

sorted on the search key value. The non leaf pages are then allocated If there are several inserts to the file subsequently, so that more entries

are inserted into a leaf than will fit onto a single page, additional pages are needed because the index structure is static. These additional pages are allocated from an overflow area.

ISAM

Non-leaf

Pages

Pages

Overflow page

Primary pages

Leaf

Comments on ISAM File creation: Leaf (data) pages allocated

sequentially, sorted by search key; then index pages allocated, then space for overflow pages. Index entries: <search key value, page id>; they

`direct’ search for data entries, which are in leaf pages.

Search: Start at root and determine which subtree by use key comparisons to go to leaf.

Insert: Find leaf data entry belongs to, and put it there.

Delete: Find and remove from leaf; if empty overflow page, de-allocate.

Static tree structure: inserts/deletes affect only leaf pages.

Data Pages

Index Pages

Overflow pages

All searches begins at the root, for example to locate a record with a key value 27, we start at the root and follow the left pointer since 27<40.

We then follow the middle pointer, since 20<=27<33 We assume that each node can hold 2 entries; no need for `next-leaf-

page’ pointers.

Example ISAM Tree

10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97*

20 33 51 63

40

Root

After Inserting 23*, 48*, 41*, 42* ...

10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97*

20 33 51 63

40

Root

23* 48* 41*

42*

Overflow

Pages

Leaf

Index

Pages

Pages

Primary

The entry 23* belongs in the second data page, which already contains 20* and 27* and has no more space. We deal with situation by adding overflow page and putting 23* in the overflow page.

Inserting 48*, 41* and 42* leads to an overflow chain of two pages

... Then Deleting 42*, 51*, 97*

Note that 51* appears in index levels, but not in leaf!

10* 15* 20* 27* 33* 37* 40* 46* 55* 63*

20 33 51 63

40

Root

23* 48* 41*

The deletion of an entry k* is handled by simply removing the entry. If the entry is on an overflow page and the overflow page become empty, the

page can be removed. If the entry is on a primary page and deletion makes the primary empty, leave

the empty primary page as it is. It serves as a placeholder for future insertion. We do not move records from the overflow pages to the primary page when

deletions on the primary page create space. Thus, the number of primary leaf pages is fixed at file creation time.

A static structure such as the ISAM index suffer from the problem that long overflow chains can develop as the file grows, which leading to a poor performance.

B+ tree is dynamic search structures that adjust gracefully to inserts and deletes.

Internal nodes direct the search and the leaf nodes contain the data entries.

To retrieve all leaf page we have to link them using page pointer. Minimum 50% occupancy for each node (except for root). Each node

contains d <= m <= 2d entries. The parameter d is called the order of the tree, which measure the capacity of a tree node. The root node is the only exception on this requirement, it is required 1<=m<=2d

Supports equality and range-searches efficiently.

B+ Tree: Most Widely Used Index

Index Entries

Data Entries("Sequence set")

(Direct search)

Search begins at root, and key comparisons direct it to a leaf (as in ISAM).

Search for 5*, 15*, all data entries >= 24* ... The tree is of order d=2, means each node contains between 2

and 4 entries. To search for 5* we follow the left most child pointer since 5<13.

To search for entries 14 or 15 we follow the second pointer. To find 24*, we follow the fourth child pointer since 24<=24<=30.

Example B+ Tree

Based on the search for 15*, we know it is not in the tree!

Root

17 24 30

2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

13

Find correct leaf L. Put data entry onto L.

◦ If L has enough space, done!◦ Else, must split L (into L and a new node L2)

Redistribute entries evenly, copy up middle key. Insert index entry pointing to L2 into parent of L.

This can happen recursively◦ To split index node, redistribute entries evenly, but

push up middle key. (Contrast with leaf splits.) Splits “grow” tree; root split increases height.

◦ Tree growth: gets wider or one level taller at top.

Inserting a Data Entry into a B+ Tree

Inserting 8* into Example B+ Tree It belongs in the left most leaf, which is already full, which cause split the leaf

page. Insert an entry consisting of the pair [5, pointer] into the parent node. Note how the key 5, which discriminate between the split leaf page and its

newly created sibling, is copied up. Since the parent is also full, another split occurs and the middle key [17,

pointer] is pushed up the tree, in contrast to copied up in the leaf split. Note difference between copy-up and push-up; be sure you understand the

reasons for this.

2* 3* 5* 7* 8*

5

Entry to be inserted in parent node.(Note that 5 iscontinues to appear in the leaf.)

s copied up and

appears once in the index. Contrast

5 24 30

17

13

Entry to be inserted in parent node.(Note that 17 is pushed up and only

this with a leaf split.)

2* 3* 5* 7*

17 24 3013

Example B+ Tree After Inserting 8*

Notice that root was split, leading to increase in height.In this example, we can avoid split by re-distributing entries;

however, this is usually not done in practice.

2* 3*

Root

17

24 30

14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

135

7*5* 8*

Example B+ Tree After Inserting 8* using redistribution

Root

17 24 30

2* 3* 5* 7* 8* 14* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

8

16*

I. Redistribute entries of node N with a sibling before splitting the node. The sibling of a node N is a node that is immediately to the left or right of N.

II. The entry belongs in the left most leaf, which is full.III. The only sibling of this leaf node contains only two entries and

can thus accommodate more entries.IV. Note how the entry in the parent node that points to the second

leaf has a new key value; we copy up the new low key value on the second leaf.

In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice.

Start at root, find leaf L where entry belongs. Remove the entry.

◦ If L is at least half-full, done! ◦ If L has only d-1 entries,

Try to re-distribute, borrowing from sibling (adjacent node with same parent as L).

If re-distribution fails, merge L and sibling. If merge occurred, must delete entry (pointing

to L or sibling) from parent of L. Merge could propagate to root, decreasing

height.

Deleting a Data Entry from a B+ Tree

Example Tree After (Inserting 8*, Then) Deleting 19* and 20* ...If the deletion cause the node to go below the occupancy

threshold, we must either redistribute entries from an adjacent sibling or merge the node with a sibling to maintain minimum occupancy.1. If entries are redistributed between two nodes,

their parent node must be updated to reflect this; the key value in the index entry pointing to the second node must be changed to be the lowest search key in the second node.

2. If two nodes are merged, their parent must be updated to reflect this by deleting the index entry for the second node; this index entry is pointed to by the pointer variable oldchildentry when the delete call returns to the parent node.

Deleting 19* is easy. Remove it from leaf page on which it appears and the leaf still contains two entries.

Deleting 20* is done with re-distribution, because the leaf will remain with only one entry after deletion. The only sibling of the leaf node that contains 20* has three entries, therefore we deal with this situation by redistribution.

We move entry 24* to the leaf page that contain 22* and copy up the new splitting key (27, which is the new low key value of the leaf from which we borrow 24*) into the parent.

Example Tree After (Inserting 8*, Then) Deleting 19* and 20* ...

2* 3*

17

30

14* 16* 33* 34* 38* 39*

135

7*5* 8* 22* 24*

27

27* 29*

2* 3*

Root

17

24 30

14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

135

7*5* 8*

The affected leaf contains only one entry (22*) after deletion, and the only sibling contains just two entries (27* and 29*), therefore we can not redistribute. These two nodes must be merged.

While merging, we`toss’ of index entry (27). Deleting the entry 27 has created a non-leaf-level page with just one entry,

which is below the minimum of d=2. To fix this problem, we must either redistribute or merge. The only sibling of this node contains just two entries (5, 13), so the redistribution is not possible, we must therefore merge.

To merge the two nodes, we also need to pull down the index entry in the parent that currently discriminates between these nodes. This index entry has key value 17 and now we have a total

of four entries. Note that pulling down the splitting key 17 means that it will no longer appears in the parent node following the merge.

... And Then Deleting 24*

30

22* 27* 29* 33* 34* 38* 39*

2* 3* 7* 14* 16* 22* 27* 29* 33* 34* 38* 39*5* 8*

Root30135 17

Tree is shown below during deletion of 24*. (What could be a possible initial tree?)

In contrast to previous example, can re-distribute entry from left child of root to right child.

Example of Non-leaf Re-distribution

Root

135 17 20

22

30

14* 16* 17* 18* 20* 33* 34* 38* 39*22* 27* 29*21*7*5* 8*3*2*

Intuitively, entries are re-distributed by `pushing through’ the splitting entry in the parent node.

It suffices to re-distribute index entry with key 20; we’ve re-distributed 17 as well for illustration.

After Re-distribution

14* 16* 33* 34* 38* 39*22* 27* 29*17* 18* 20* 21*7*5* 8*2* 3*

Root

135

17

3020 22

Important to increase fan-out. (Why?) Key values in index entries only `direct traffic’;

can often compress them.◦ E.g., If we have adjacent index entries with search key

values Dannon Yogurt, David Smith and Devarakonda Murthy, we can abbreviate David Smith to Dav. (The other keys can be compressed too ...) Is this correct? Not quite! What if there is a data entry

Davey Jones? (Can only compress David Smith to Davi) In general, while compressing, must leave each index entry

greater than every key value (in any subtree) to its left. Insert/delete must be suitably modified.

Prefix Key Compression

If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow.

Bulk Loading can be done much more efficiently.

Initialization: Sort all data entries, insert pointer to first (leaf) page in a new (root) page.

Bulk Loading of a B+ Tree

3* 4* 6* 9* 10* 11* 12* 13* 20* 22* 23* 31* 35* 36* 38* 41* 44*

Sorted pages of data entries; not yet in B+ treeRoot

Bulk Loading (Contd.)

Index entries for leaf pages always entered into right-most index page just above leaf level. When this fills up, it splits. (Split may go up right-most path to the root.)

Much faster than repeated inserts, especially when one considers locking!

3* 4* 6* 9* 10* 11* 12*13* 20*22* 23* 31* 35*36* 38*41* 44*

Root

Data entry pages

not yet in B+ tree3523126

10 20

3* 4* 6* 9* 10* 11* 12*13* 20*22* 23* 31* 35*36* 38*41* 44*

6

Root

10

12 23

20

35

38

not yet in B+ treeData entry pages

Option 1: multiple inserts.◦ Slow.◦ Does not give sequential storage of leaves.

Option 2: Bulk Loading ◦ Has advantages for concurrency control.◦ Fewer I/Os during build.◦ Leaves will be stored sequentially (and linked, of

course).◦ Can control “fill factor” on pages.

Summary of Bulk Loading

Order (d) concept replaced by physical space criterion in practice (`at least half-full’).◦ Index pages can typically hold many more entries

than leaf pages.◦ Variable sized records and search keys mean differnt

nodes will contain different numbers of entries.◦ Even with fixed length fields, multiple records with the

same search key value (duplicates) can lead to variable-sized data entries (if we use Alternative (3)).

A Note on `Order’

Tree-structured indexes are ideal for range-searches, also good for equality searches.

ISAM is a static structure.◦ Only leaf pages modified; overflow pages needed.◦ Overflow chains can degrade performance unless

size of data set and data distribution stay constant. B+ tree is a dynamic structure.

◦ Inserts/deletes leave tree height-balanced; log F N cost.

◦ High fanout (F) means depth rarely more than 3 or 4.

◦ Almost always better than maintaining a sorted file.

Summary

◦ Typically, 67% occupancy on average.◦ Usually preferable to ISAM, modulo locking

considerations; adjusts to growth gracefully.◦ If data entries are data records, splits can change rids!

Key compression increases fanout, reduces height. Bulk loading can be much faster than repeated

inserts for creating a B+ tree on a large data set. Most widely used index in database management

systems because of its versatility. One of the most optimized components of a DBMS.

Summary (Contd.)

tree-structured indexes

Documents