Physical Index Structures
• Logically, the index is a sorted list.
• Physically, the sorted order is normally maintained by pointers in a table.
• Tree-structured indexes:
– Binary tree
– B-tree
– B+-tree
Tree Structure
[Figure: a tree with the root node at the top, intermediate nodes below it, and leaf nodes at the bottom. A node is a branching point.]
Binary Tree Index
• Each index entry is a node of the tree.
• The index is a table with four fields:
– the true index fields, key value and address,
– a left, or less-than, pointer that points to a node with a smaller key value, and
– a right, or greater-than, pointer that points to a node with a larger key value.
[Figure: A binary tree node, with the fields left pointer, key value, data pointer (i.e. data file address) and right pointer.]
Binary Tree Index Example
Data file (only key values shown):
Address:    1   2   3   4   5   6   7
Key value:  16  87  13  54  22  35  39
[Figure: the binary tree, each node labelled key value/address: 16/1 (root node), 87/2, 13/3, 54/4, 22/5, 35/6, 39/7.]
Index as a table (row 1 is the root node; "-" means no pointer):
Row  LP  Key  Add  RP
1    3   16   1    2
2    4   87   2    -
3    -   13   3    -
4    5   54   4    -
5    -   22   5    6
6    -   35   6    7
7    -   39   7    -
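The index table in this example can be reproduced with a short sketch. The Python below is an illustration only: it assumes the record with the i-th key sits at data file address i, that index rows are numbered in arrival order, and that pointers hold row numbers. The function name `build_index` and the dictionary layout are hypothetical and do not come from the slides.

```python
def build_index(keys):
    """Build a binary tree index as a table of rows.

    Each row has four fields: LP (left pointer), Key, Add (data file
    address) and RP (right pointer). Pointers are row numbers, or None.
    Assumption: the i-th key arrives i-th and is stored at address i.
    """
    rows = {}  # row number -> {"LP", "Key", "Add", "RP"}
    for address, key in enumerate(keys, start=1):
        rows[address] = {"LP": None, "Key": key, "Add": address, "RP": None}
        if address == 1:
            continue  # the first key becomes the root node
        current = 1
        while True:
            # Smaller keys go down the left pointer, others down the right.
            side = "LP" if key < rows[current]["Key"] else "RP"
            if rows[current][side] is None:
                rows[current][side] = address
                break
            current = rows[current][side]
    return rows


index = build_index([16, 87, 13, 54, 22, 35, 39])
for number, row in index.items():
    print(number, row["LP"], row["Key"], row["Add"], row["RP"])
```

Because every key is placed in arrival order, the shape of the tree depends entirely on the order in which records reach the data file.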
Binary Tree Index Problems
• Data pointers are dispersed throughout every level of the tree. This results in:
– unequal access times
– complex tree traversal programming
• A binary tree is normally unbalanced:
– For the tree to be balanced (i.e. equal branch lengths), the key value at each node must be the median of the values in its sub-trees.
– This is virtually impossible, as the tree is loaded top-down, i.e. in order of arrival of key values. Hence,
– the tree becomes unbalanced, and unequal access times are the result.
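The effect of arrival order on access times can be made concrete with a small sketch. The Python below counts how many nodes must be visited to reach each key, first for the arrival order used in the earlier example and then for a hand-picked order in which each median arrives before the values in its sub-trees. The function name `access_depths` is hypothetical, and the sketch assumes distinct key values.

```python
def access_depths(keys):
    """Return {key: nodes visited to reach it} for a binary tree that is
    loaded top-down in the given arrival order (distinct keys assumed)."""
    children = {}  # key -> [left child key, right child key]
    depths = {}
    root = None
    for key in keys:
        children[key] = [None, None]
        if root is None:
            root = key
            depths[key] = 1
            continue
        current, depth = root, 1
        while True:
            side = 0 if key < current else 1
            if children[current][side] is None:
                children[current][side] = key
                depths[key] = depth + 1
                break
            current = children[current][side]
            depth += 1
    return depths


arrival = access_depths([16, 87, 13, 54, 22, 35, 39])
median_first = access_depths([35, 16, 54, 13, 22, 39, 87])
print(max(arrival.values()), max(median_first.values()))
```

With the arrival order from the example the deepest key needs six node visits, while the median-first order needs at most three for the same seven keys.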
Solution to Balance Problem in Index Tree Structures
• Load the tree “bottom-up”. That is, after a certain number of key values have been input, choose the median value to be promoted to a higher level so that it can point evenly to its left and right.
• This leads to the concepts of:
– multi-value nodes, i.e. multiple key values stored in sequence in each index node, and
– node-splitting, i.e. division of an overfull node into two nodes, taking respectively the low-end and high-end values of the split node.
A B-tree Node
[Figure: a node holding key values K1, K2, K3 with their addresses A1, A2, A3. The left pointer points to a node with key values less than K1. The next pointer points to a node whose key values are > K1 and < K2. The last pointer is labelled "right pointer".]
• Multiple key values per node
• K1 < K2 < K3, i.e. key values in sequence
• Pointers all point to other nodes, and therefore to ALL of the key values in those nodes
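As a rough illustration of how the in-sequence keys and the pointers work together, the Python sketch below examines a single node: it either finds the key and returns its data file address, or reports which pointer to follow. The function name `search_node`, the return convention and the sample addresses are assumptions made for this example, not part of the slides. Pointer 0 is the left pointer (keys less than K1), pointer 1 leads to keys between K1 and K2, and so on.

```python
from bisect import bisect_left


def search_node(keys, addresses, key):
    """Look for key inside one B-tree node.

    keys are in ascending sequence; addresses[i] is the data file
    address stored with keys[i]. Returns ("found", address) or
    ("descend", pointer_number).
    """
    position = bisect_left(keys, key)
    if position < len(keys) and keys[position] == key:
        return ("found", addresses[position])
    # position counts the keys smaller than the search key, which is
    # also the number of the pointer that covers that key range.
    return ("descend", position)


node_keys = [12, 23, 27]        # K1 < K2 < K3
node_addresses = [4, 1, 7]      # hypothetical data file addresses
print(search_node(node_keys, node_addresses, 23))
print(search_node(node_keys, node_addresses, 19))
```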
B-tree Node Splitting
Existing node values: 12 23 27 38
New value to be inserted: 19
The split: 12 19 | 23 | 27 38
– 12 and 19: these values stay in the old node.
– Key value 23 is promoted to the next highest level to point to the other two nodes.
– 27 and 38: these values move to a new node.
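The split shown above can be sketched in a few lines of Python. This is a minimal illustration that assumes the node is already full when the new value arrives and that the middle value of the merged sequence is the one promoted. The function name `split_on_insert` is hypothetical.

```python
from bisect import insort


def split_on_insert(keys, new_key):
    """Insert new_key into a full node and split the result.

    Returns (left, promoted, right): the low-end values that stay in
    the old node, the median promoted to the next level up, and the
    high-end values that move to a new node.
    """
    merged = list(keys)
    insort(merged, new_key)      # keep key values in sequence
    mid = len(merged) // 2
    return merged[:mid], merged[mid], merged[mid + 1:]


left, promoted, right = split_on_insert([12, 23, 27, 38], 19)
print(left, promoted, right)
```

In a B-tree the promoted value leaves both halves, because it carries its own data file address with it to the upper level.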
B-tree Node Split Example
The data file has two records, so the root node of the index is now full.
Data file (cells 1 to 4): cell 1 holds key value 87, cell 2 holds key value 36.
Root node (key value/address): 36/2, 87/1
Then a new data file record of key value 27 is stored in cell 3.
The split: 27 | 36 | 87, with 36 promoted.
[Figure: the new root node holds 36/2 and points to two nodes, one holding 27/3 and one holding 87/1.]
Current State of Index
[Figure: the tree after the split. The root node holds 36/2 (key value/address) and points to the nodes holding 27/3 and 87/1.]
[Figure: the index as a table with rows 1 to 4 and key/address columns K1, A1, K2, A2. Row 1 is the root node: key value 36, address 2, with pointers to rows 2 and 3. Row 2: key value 27, address 3. Row 3: key value 87, address 1.]
B-tree Pros and Cons
• Balanced, i.e. every branch is the same length, i.e. descends to the same level. Therefore,
• the wild variation in access times observable in binary trees is avoided.
• However, the key values (and associated addresses) are still dispersed throughout all levels of the structure, leading to:
– unequal path lengths, and therefore unequal access times, and
– complex tree-traversal algorithms for logically sequential reading/unloading of the data file.
Solution to the Key Dispersal Problem
• Prohibit storage of data file addresses at all levels above leaf level.
• Consequently:
– all accesses follow the same path length, resulting in equal access times, and
– logically sequential reading of the data file requires access to only the leaf level. That is, complex tree-traversal algorithms are not required.
Implementing the Solution
• Since all key values must appear at leaf level, some key values appear more than once in the index, and therefore,
• upper-level nodes don’t need address fields, and leaf-level nodes don’t need downward index pointers,
• the median value to be promoted when a node split occurs must belong to one of the ‘halves’, i.e. the rightmost value of the left half (leading to less-than-or-equal pointers), or the leftmost value of the right half (greater-than-or-equal pointers).
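A leaf split of this kind can be sketched as follows. The Python below takes the first option described above: the promoted value is a copy of the rightmost key of the left half, so the upper-level pointer to the left half means "less than or equal". The function name `insert_into_leaf` and the default capacity of three entries per leaf are assumptions for illustration. The sample data is taken from the B+-tree figures later in these slides.

```python
from bisect import insort


def insert_into_leaf(entries, key, address, capacity=3):
    """Insert (key, address) into a sorted B+-tree leaf.

    Returns (left, separator, right). If the leaf does not overflow,
    separator and right are None. On overflow the leaf is halved and
    the separator is a COPY of the rightmost key of the left half,
    so that key still appears at leaf level with its address.
    """
    merged = list(entries)
    insort(merged, (key, address))
    if len(merged) <= capacity:
        return merged, None, None
    mid = len(merged) // 2
    left, right = merged[:mid], merged[mid:]
    return left, left[-1][0], right


leaf = [(9, 2), (34, 5), (41, 4)]   # (key value, data file address)
left, separator, right = insert_into_leaf(leaf, 25, 6)
print(left, separator, right)
```

Unlike the B-tree split, no key leaves the leaf level: the separator only guides the search, and the address stays with the leaf entry.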
The B+-tree
The data file:
Address:    1   2   3   4   5
Key value:  56  9   72  41  34
[Figure: the root node holds key value 41. Leaf-level nodes (key value/address): the left-hand node holds 9/2, 34/5, 41/4 and the right-hand node holds 56/1, 72/3.]
The left-hand node split when 41 was inserted. The high-order end went to the right-hand node. Hence, the leaf-node pointer.
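Reading the data file in key order needs only the leaf level once the leaves are chained together. The Python sketch below assumes that each leaf holds a pointer to the next leaf in key order, which is one reading of the "leaf-node pointer" mentioned on this slide. The class name `Leaf` and the function `read_in_key_order` are hypothetical.

```python
class Leaf:
    """A leaf-level node: sorted (key, address) entries plus a pointer
    to the next leaf in key order (assumed layout)."""

    def __init__(self, entries, next_leaf=None):
        self.entries = entries
        self.next_leaf = next_leaf


def read_in_key_order(first_leaf):
    """Yield (key, data file address) pairs by walking the leaf chain.
    No upper-level node is visited."""
    leaf = first_leaf
    while leaf is not None:
        for key, address in leaf.entries:
            yield key, address
        leaf = leaf.next_leaf


right_leaf = Leaf([(56, 1), (72, 3)])
left_leaf = Leaf([(9, 2), (34, 5), (41, 4)], next_leaf=right_leaf)
ordered = list(read_in_key_order(left_leaf))
print(ordered)
```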
The B+-tree: Insertion of data file record of key value 25
The data file:
Address:    1   2   3   4   5   6
Key value:  56  9   72  41  34  25
The split: 9 25 | 34 41
[Figure: the root node now holds key values 25 and 41. Leaf-level nodes (key value/address): 9/2, 25/6 in the first node, 34/5, 41/4 in the second, and 56/1, 72/3 in the third.]