01 - B-Trees

Upload: rigcho-azhar

Post on 03-Jun-2018


  • 8/12/2019 01-B Trees

    1/47

    Analysis and Design of Algorithms II

    B-Trees

    Arif Nurwidyantoro

  • 8/12/2019 01-B Trees

    2/47

    Overview

    Motivation

    Definition of B-Trees

    Basic operations of B-Trees

      Search

      Create

      Insert

      Delete

  • 8/12/2019 01-B Trees

    3/47

    Motivation

  • 8/12/2019 01-B Trees

    4/47

    B-Trees

    Balanced search trees

    Designed for disks and other direct-access secondary storage

    Similar to red-black trees

    Branching factor

      Internal nodes may have many children

    Every n-node B-Tree has height O(lg n)

  • 8/12/2019 01-B Trees

    5/47

    B-Trees

    An internal node x containing x.n keys has x.n + 1 children

    The keys in node x serve as dividing points, separating the range of keys handled by x into x.n + 1 subranges

      Each subrange is handled by one child of x

    Searching for a key in a B-Tree

      Make an (x.n + 1)-way decision at each internal node

      Based on comparisons with the x.n keys stored at node x

  • 8/12/2019 01-B Trees

    6/47

    Data structures on secondary storage

    Memory in a computer

      Main memory

      Secondary storage

    Main memory

      Faster

      Silicon memory chips

      Expensive

    Secondary storage

      Magnetic disks

      Bigger capacity

  • 8/12/2019 01-B Trees

    7/47

    Disk Drive

    Platters

      One or more

      Rotate at constant speed around a spindle

      Covered by magnetizable material

    Drive reads and writes each platter by a head at the end of an arm

    Arms can move heads toward or away from the spindle

    Track

      Surface that passes underneath a head while the head is stationary

  • 8/12/2019 01-B Trees

    8/47

    Disk

    Slow

      5400-15000 RPM

      Access is about 5 orders of magnitude slower than main memory access

    Have mechanical parts

    Mechanical movements

      Platter rotation

      Arm movement

  • 8/12/2019 01-B Trees

    9/47

    Disk

    Information is divided into equal-sized pages of bits

      Pages appear consecutively within tracks

      Each disk read/write is of one or more entire pages

      A page is typically 2^11 to 2^14 bytes in length

    Running time has two components

      Number of disk accesses

      CPU (computing) time

    Number of disk accesses

      Measured as the number of pages of information that need to be read from or written to the disk

  • 8/12/2019 01-B Trees

    10/47

    B-Tree and Disk

    The amount of data handled is large

      It does not fit into main memory at once

    B-Tree algorithms

      Copy selected pages from disk into main memory as needed

      Write back to disk the pages that have changed

    Objective

      Optimize disk access by using the full amount of information read in one disk access

      i.e. size of one B-Tree node = one page

  • 8/12/2019 01-B Trees

    11/47

    B-Tree and Disk

    x = pointer to some object

    Disk-Read(x)

      Read object x into main memory

    Disk-Write(x)

      Save any changes made to the attributes of object x
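The read, modify, write-back pattern can be sketched in Python. This is only an illustration: `PagedStore`, `disk_read`, and `disk_write` are hypothetical names, and an in-memory dictionary stands in for the disk so that the number of accesses can be counted.

```python
import copy


class PagedStore:
    """Toy stand-in for a disk: maps page ids to objects and counts accesses."""

    def __init__(self):
        self._pages = {}
        self.reads = 0
        self.writes = 0

    def disk_write(self, page_id, obj):
        """Save the current attributes of obj to its page."""
        self._pages[page_id] = copy.deepcopy(obj)
        self.writes += 1

    def disk_read(self, page_id):
        """Return an in-memory copy of the object stored on the page."""
        self.reads += 1
        return copy.deepcopy(self._pages[page_id])


store = PagedStore()
store.disk_write(0, {"keys": [10, 20]})

x = store.disk_read(0)        # bring the object into main memory
x["keys"].append(30)          # modify the in-memory copy only
unsaved = store.disk_read(0)  # the stored page is still unchanged
store.disk_write(0, x)        # write back the page that changed
saved = store.disk_read(0)
```

The counters `reads` and `writes` are what the running-time analysis on these slides calls the number of disk accesses.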

  • 8/12/2019 01-B Trees

    12/47

    B-Tree and Disk

    Branching factors are typically between 50 and 2000

      Depends on the size of a key relative to the size of a page

    Large branching factors

      Reduce the height of the tree and the number of disk accesses required to find any key
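A rough arithmetic sketch of both points, under stated assumptions: the 4096-byte page, 8-byte keys, and 8-byte child pointers are illustrative values, the function names are hypothetical, and the height bound used is the standard h <= log_t((n + 1) / 2) for a B-tree of minimum degree t.

```python
def max_children(page_bytes: int, key_bytes: int, pointer_bytes: int) -> int:
    """Largest m such that m child pointers and m - 1 keys fit in one page."""
    # m * pointer_bytes + (m - 1) * key_bytes <= page_bytes
    return (page_bytes + key_bytes) // (key_bytes + pointer_bytes)


def height_bound(n_keys: int, t: int) -> int:
    """Largest h with 2 * t**h - 1 <= n_keys, i.e. floor(log_t((n + 1) / 2))."""
    h = 0
    while 2 * t ** (h + 1) - 1 <= n_keys:
        h += 1
    return h


m = max_children(page_bytes=4096, key_bytes=8, pointer_bytes=8)  # 256 children
t = m // 2                                                       # minimum degree 128
wide = height_bound(10**9, t)    # height bound with one node per page
narrow = height_bound(10**9, 2)  # height bound with the smallest legal degree
```

With these assumed sizes, a billion keys need at most 4 levels below the root, against 28 for minimum degree 2, which is the reduction in disk accesses the slide refers to.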

  • 8/12/2019 01-B Trees

    13/47

    Definition of B-Trees

  • 8/12/2019 01-B Trees

    14/47

    B-Trees Properties

    Every node x has the following attributes:

      x.n, the number of keys currently stored in node x

      The x.n keys themselves, x.key_1, x.key_2, ..., x.key_{x.n}, stored in nondecreasing order, so that x.key_1 <= x.key_2 <= ... <= x.key_{x.n}

      x.leaf

        TRUE, if x is a leaf

        FALSE, if x is an internal node

    Each internal node x also contains x.n + 1 pointers x.c_1, x.c_2, ..., x.c_{x.n+1} to its children

    Leaf nodes have no children, so their c_i attributes are undefined
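One possible in-memory representation of these attributes, as a minimal Python sketch. The class name `BTreeNode` is hypothetical, the lists are 0-based (so `keys[0]` plays the role of x.key_1), and x.n is derived from the key list rather than stored separately.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    leaf: bool = True                                # x.leaf
    keys: List[int] = field(default_factory=list)    # x.key_1 .. x.key_{x.n}
    children: List["BTreeNode"] = field(default_factory=list)  # x.c_1 .. x.c_{x.n+1}

    @property
    def n(self) -> int:
        """x.n, the number of keys currently stored in the node."""
        return len(self.keys)


left = BTreeNode(keys=[3, 7])
right = BTreeNode(keys=[12])
parent = BTreeNode(leaf=False, keys=[10], children=[left, right])
```

A leaf simply keeps an empty `children` list, matching the slide's remark that the child attributes of a leaf are undefined.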

  • 8/12/2019 01-B Trees

    15/47

    B-Trees Properties

    The keys x.key_i separate the ranges of keys stored in each subtree: if k_i is any key stored in the subtree with root x.c_i, then

      k_1 <= x.key_1 <= k_2 <= x.key_2 <= ... <= x.key_{x.n} <= k_{x.n+1}

    All leaves have the same depth, which is the tree's height h

  • 8/12/2019 01-B Trees

    16/47

    B-Trees Properties

    Nodes have lower and upper bounds on the number of keys they can contain

    Bounds are expressed in terms of a fixed integer t >= 2, called the minimum degree of the B-Tree

      Every node other than the root must have at least t - 1 keys

      Every internal node other than the root thus has at least t children

      If the tree is nonempty, the root must have at least one key

      Every node may contain at most 2t - 1 keys

      An internal node may have at most 2t children

      A node is full if it contains exactly 2t - 1 keys
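The properties listed on these slides can be checked mechanically. The following is a sketch only: `Node` and `check_subtree` are hypothetical names, keys are assumed to be integers, and the function returns the height of a valid subtree or raises `ValueError` at the first violation it finds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    leaf: bool = True
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def check_subtree(x: Node, t: int, lo: Optional[int] = None,
                  hi: Optional[int] = None, is_root: bool = True) -> int:
    """Return the height of the subtree rooted at x, or raise ValueError."""
    n = len(x.keys)
    # An empty tree is a root leaf with no keys; otherwise the root needs one key.
    min_keys = (0 if x.leaf else 1) if is_root else t - 1
    if not min_keys <= n <= 2 * t - 1:
        raise ValueError(f"node has {n} keys, outside [{min_keys}, {2 * t - 1}]")
    if x.keys != sorted(x.keys):
        raise ValueError("keys are not in nondecreasing order")
    if any((lo is not None and k < lo) or (hi is not None and k > hi)
           for k in x.keys):
        raise ValueError("key outside the range allowed by the ancestors")
    if x.leaf:
        return 0
    if len(x.children) != n + 1:
        raise ValueError("internal node must have n + 1 children")
    bounds = [lo] + x.keys + [hi]
    heights = {check_subtree(c, t, bounds[i], bounds[i + 1], False)
               for i, c in enumerate(x.children)}
    if len(heights) != 1:
        raise ValueError("leaves are not all at the same depth")
    return 1 + heights.pop()


good = Node(leaf=False, keys=[10],
            children=[Node(keys=[5]), Node(keys=[15, 20])])
bad = Node(leaf=False, keys=[10],
           children=[Node(keys=[5]), Node(keys=[8])])  # 8 lies left of 10
```

Calling `check_subtree(good, 2)` returns the height 1, while `check_subtree(bad, 2)` raises because the key 8 sits in the subtree reserved for keys of at least 10.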

  • 8/12/2019 01-B Trees

    17/47

  • 8/12/2019 01-B Trees

    18/47

    Height of B-Tree

    The root of a B-Tree T contains at least one key, and all other nodes contain at least t - 1 keys

    T has at least 2 nodes at depth 1, at least 2t nodes at depth 2, at least 2t^2 nodes at depth 3, and so on, until at depth h it has at least 2t^(h-1) nodes

  • 8/12/2019 01-B Trees

    19/47

    Height of B-Tree

    The number n of keys satisfies n >= 2t^h - 1, so h <= log_t((n + 1) / 2)
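The formula on this slide did not survive in the transcript. The standard derivation, reconstructed here from the node counts on the previous slide (at least one key in the root, and at least t - 1 keys in each of the at least 2t^(i-1) nodes at depth i), runs as follows:

```latex
\begin{align*}
n &\ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1} \\
  &= 1 + 2(t-1)\,\frac{t^{h}-1}{t-1} \\
  &= 2t^{h} - 1 ,
\end{align*}
so $t^{h} \le (n+1)/2$, and taking base-$t$ logarithms gives
\[
  h \le \log_t \frac{n+1}{2} .
\]
```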

  • 8/12/2019 01-B Trees

    20/47

    Height of B-Tree

    Although the height of the tree grows as O(lg n), the base of the logarithm can be many times larger

    B-Trees save a factor of about lg t over red-black trees in the number of nodes examined for most tree operations

  • 8/12/2019 01-B Trees

    21/47

    Basic Operations of B-Tree

  • 8/12/2019 01-B Trees

    22/47

    Basic Operations

    Search

    Create

    Insert

    Delete

  • 8/12/2019 01-B Trees

    23/47

    Convention

    The root of the B-Tree is always in main memory, so we never need to perform a Disk-Read on the root

    Any nodes that are passed as parameters must already have had a Disk-Read operation performed on them

    The procedures presented here are all one-pass algorithms that proceed downward from the root of the tree, without having to back up

  • 8/12/2019 01-B Trees

    24/47

    Search a B-Tree

    Similar to searching a binary search tree, but with a multiway branching decision

      At each internal node x, make an (x.n + 1)-way branching decision

    Input

      Pointer to the root node x of a subtree

      Key k to be searched for in that subtree

    Top-level call

      B-TREE-SEARCH(T.root, k)

    Output

      If k is in the B-tree, return the ordered pair (y, i): a node y and an index i such that y.key_i = k

      Otherwise, return NIL

  • 8/12/2019 01-B Trees

    25/47

    Search a B-Tree

  • 8/12/2019 01-B Trees

    26/47

    Search a B-Tree

    B-TREE-SEARCH accesses O(h) = O(log_t n) disk pages

      h is the height of the B-tree

      n is the number of keys in the B-tree

    Since x.n < 2t, the while loop of lines 2-3 takes O(t) time within each node

    Total CPU time is O(th) = O(t log_t n)
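The pseudocode for B-TREE-SEARCH is not preserved in this transcript, so here is a minimal Python sketch of the search as described on these slides. The `Node` class and the `disk_read` stub are assumptions made for illustration, indices are 0-based, and `None` stands in for NIL.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    leaf: bool = True
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def disk_read(node: Node) -> Node:
    """Stand-in for Disk-Read: a real version would fetch the node's page."""
    return node


def btree_search(x: Node, k: int) -> Optional[Tuple[Node, int]]:
    """Return (y, i) with y.keys[i] == k, or None if k is not in the subtree."""
    i = 0
    # Linear scan over at most 2t - 1 keys: the O(t) work done within each node.
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i
    if x.leaf:
        return None
    # One page is read per level, hence O(h) disk accesses in total.
    return btree_search(disk_read(x.children[i]), k)


last = Node(keys=[25, 30])
root = Node(leaf=False, keys=[10, 20],
            children=[Node(keys=[5]), Node(keys=[15]), last])
hit = btree_search(root, 30)
miss = btree_search(root, 17)
```

Here `hit` is the pair `(last, 1)` and `miss` is `None`. The scanning `while` loop is the one the analysis above charges O(t) per node.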

  • 8/12/2019 01-B Trees

    27/47

    Create an empty B-Tree

    Use B-TREE-CREATE to create an empty root node

    Then call B-TREE-INSERT to add new keys

    Both use an auxiliary procedure ALLOCATE-NODE

      Allocates one disk page to be used as a new node in O(1) time

    B-TREE-CREATE requires O(1) disk operations and O(1) CPU time

  • 8/12/2019 01-B Trees

    28/47

    Create an empty B-Tree
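The creation procedure shown on this slide is missing from the transcript. A minimal Python sketch of what the previous slide describes might look like this, where `Node`, `BTree`, `allocate_node`, and `disk_write` are hypothetical stand-ins rather than the slide's own code:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    leaf: bool = True
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


@dataclass
class BTree:
    t: int       # minimum degree
    root: Node


def allocate_node() -> Node:
    """Stand-in for ALLOCATE-NODE: O(1), and needs no Disk-Read."""
    return Node()


def disk_write(node: Node) -> None:
    """Stand-in for Disk-Write: a real version would persist the node's page."""


def btree_create(t: int) -> BTree:
    """Create an empty B-tree: one allocation, one write, O(1) CPU time."""
    if t < 2:
        raise ValueError("minimum degree t must be at least 2")
    x = allocate_node()
    x.leaf = True     # the empty root is a leaf holding zero keys
    disk_write(x)
    return BTree(t=t, root=x)


tree = btree_create(3)
```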

  • 8/12/2019 01-B Trees

    29/47

    Insert a key into B-Tree

    Insert the new key into an existing leaf node

    Cannot insert a key into a leaf node that is full

    Split operation

      Split a full node y (having 2t - 1 keys) around its median key y.key_t into two nodes having t - 1 keys each

      The median key moves up into y's parent to identify the dividing point between the two new trees

      If the parent is also full, it must be split before the new key can be inserted

      Could end up splitting full nodes all the way up the tree

  • 8/12/2019 01-B Trees

    30/47

    Insert a key into B-Tree

    Another approach: a single pass down from the root to a leaf

      As we travel down the tree searching for the position where the new key belongs, split each full node we come to along the way (including the leaf itself)

      Whenever we want to split a full node, we are assured that its parent is not full

    Procedures involved

      Splitting a node in a B-tree

      Inserting a key into a B-tree in a single pass down the tree

  • 8/12/2019 01-B Trees

    31/47

    Splitting a node in a B-Tree

    Procedure B-TREE-SPLIT-CHILD

    Input

      Nonfull internal node x

      Index i such that x.c_i is a full child of x

    The procedure splits this child in two and adjusts x so that it has an additional child

    To split a full root

      First make the root a child of a new empty root node

  • 8/12/2019 01-B Trees

    32/47

    Splitting a node in a B-Tree

    Split the full node y = x.c_i about its median key S, which moves up into y's parent node x

    Those keys in y that are greater than the median key move into a new node z, which becomes a new child of x

  • 8/12/2019 01-B Trees

    33/47

    Splitting a node in a B-Tree

  • 8/12/2019 01-B Trees

    34/47

    Splitting a node in a B-Tree

    CPU time used: Θ(t)

      Due to the loops on lines 5-6 and 8-9

      The other loops run for O(t) iterations

    Disk operations: O(1)
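The slide with the B-TREE-SPLIT-CHILD pseudocode is missing from the transcript, so the line numbers cited above cannot be matched here. As a hedged illustration of the split described on the preceding slides, the sketch below uses Python list slicing in place of the copy loops; `Node` and `split_child` are hypothetical names and indices are 0-based.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    leaf: bool = True
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def split_child(x: Node, i: int, t: int) -> None:
    """Split the full child y = x.children[i] of the nonfull node x."""
    y = x.children[i]
    if len(y.keys) != 2 * t - 1 or len(x.keys) == 2 * t - 1:
        raise ValueError("y must be full and x must be nonfull")
    median = y.keys[t - 1]
    # z takes the t - 1 largest keys of y (and the last t children, if any).
    z = Node(leaf=y.leaf, keys=y.keys[t:], children=y.children[t:])
    y.keys = y.keys[:t - 1]
    y.children = y.children[:t]
    # The median moves up into x, and z becomes the child just after y.
    x.keys.insert(i, median)
    x.children.insert(i + 1, z)


full = Node(keys=[1, 2, 3])                 # 2t - 1 keys for t = 2
parent = Node(leaf=False, keys=[], children=[full])
split_child(parent, 0, t=2)
```

After the call, `parent.keys` is `[2]` and its two children hold `[1]` and `[3]`. Only y, z, and x change, which is why the number of disk operations stays O(1).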

  • 8/12/2019 01-B Trees

    35/47

    Inserting a key into a B-Tree in a single pass down the tree

  • 8/12/2019 01-B Trees

    36/47

    Inserting a key into a B-Tree in a single pass down the tree

  • 8/12/2019 01-B Trees

    37/47

    Inserting a key into a B-Tree in a single pass down the tree

    For a B-tree of height h

      B-TREE-INSERT performs O(h) disk accesses

        Only O(1) DISK-READ and DISK-WRITE operations occur between calls to B-TREE-INSERT-NONFULL

      Total CPU time is O(th) = O(t log_t n)
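The insertion pseudocode on the two preceding slides did not survive in the transcript. The sketch below shows one way to implement the single-pass insertion they describe. It is an assumption-laden illustration: the names are hypothetical, disk reads and writes are omitted, and the descent is written as a loop rather than as recursive calls.

```python
import bisect
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    leaf: bool = True
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def split_child(x: Node, i: int, t: int) -> None:
    """Split the full child x.children[i]; its median key moves up into x."""
    y = x.children[i]
    z = Node(leaf=y.leaf, keys=y.keys[t:], children=y.children[t:])
    median = y.keys[t - 1]
    y.keys, y.children = y.keys[:t - 1], y.children[:t]
    x.keys.insert(i, median)
    x.children.insert(i + 1, z)


def insert_nonfull(x: Node, k: int, t: int) -> None:
    """Insert k below x, which must not be full; never backs up the tree."""
    while not x.leaf:
        i = bisect.bisect_right(x.keys, k)      # child whose range contains k
        if len(x.children[i].keys) == 2 * t - 1:
            split_child(x, i, t)                # x is nonfull, so this is safe
            if k > x.keys[i]:                   # the median now sits at x.keys[i]
                i += 1
        x = x.children[i]
    bisect.insort(x.keys, k)


def insert(root: Node, k: int, t: int) -> Node:
    """Insert k and return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:
        root = Node(leaf=False, children=[root])  # new empty root above the old one
        split_child(root, 0, t)
    insert_nonfull(root, k, t)
    return root


def inorder(x: Node) -> List[int]:
    """List all keys in sorted order (helper for checking the result)."""
    out: List[int] = []
    for i, key in enumerate(x.keys):
        if not x.leaf:
            out += inorder(x.children[i])
        out.append(key)
    if not x.leaf:
        out += inorder(x.children[-1])
    return out


root = Node()
for key in range(1, 11):
    root = insert(root, key, t=2)
```

Inserting 1 through 10 with t = 2 ends with the single key 4 in the root. Splitting the root is the only way the tree grows in height.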

  • 8/12/2019 01-B Trees

    38/47

  • 8/12/2019 01-B Trees

    39/47

    Deleting a key from a B-Tree

    Deleting a key from an internal node requires rearranging the node's children

    Deletion must not violate the B-tree properties

      Ensure that a node does not get too small during deletion

      A simple approach might have to back up if a node (other than the root) along the path to where the key is to be deleted has the minimum number of keys

  • 8/12/2019 01-B Trees

    40/47

    Deleting a key from a B-Tree

    When the procedure is called recursively on a node x

      The number of keys in x is at least the minimum degree t

      This requires one more key than the minimum required by the usual B-tree conditions

      So that sometimes a key may have to be moved into a child node before recursion descends to that child

    If the root node x ever becomes an internal node having no keys

      We delete x, and x's only child x.c_1 becomes the new root of the tree

      This decreases the height of the tree by one and preserves the property that the root of the tree contains at least one key

  • 8/12/2019 01-B Trees

    41/47

    Various cases deleting a key from B-Tree

    1. If the key k is in node x and x is a leaf, delete the key k from x

    2. If the key k is in node x and x is an internal node, do the following:

      a. If the child y that precedes k in node x has at least t keys, then find the predecessor k' of k in the subtree rooted at y. Recursively delete k', and replace k by k' in x (we can find k' and delete it in a single downward pass)

      b. If y has fewer than t keys, then, symmetrically, examine the child z that follows k in node x. If z has at least t keys, then find the successor k' of k in the subtree rooted at z. Recursively delete k', and replace k by k' in x (we can find k' and delete it in a single downward pass)

      c. Otherwise, if both y and z have only t - 1 keys, merge k and all of z into y, so that x loses both k and the pointer to z, and y now contains 2t - 1 keys. Then free z and recursively delete k from y

  • 8/12/2019 01-B Trees

    42/47

  • 8/12/2019 01-B Trees

    43/47

  • 8/12/2019 01-B Trees

    44/47

    Various cases deleting a key from B-Tree

    3. If the key k is not present in internal node x, determine the root x.c_i of the appropriate subtree that must contain k, if k is in the tree at all. If x.c_i has only t - 1 keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then finish by recursing on the appropriate child of x

      a. If x.c_i has only t - 1 keys but has an immediate sibling with at least t keys, give x.c_i an extra key by moving a key from x down into x.c_i, moving a key from x.c_i's immediate left or right sibling up into x, and moving the appropriate child pointer from the sibling into x.c_i

      b. If x.c_i and both of x.c_i's immediate siblings have t - 1 keys, merge x.c_i with one sibling, which involves moving a key from x down into the new merged node to become the median key for that node

  • 8/12/2019 01-B Trees

    45/47

  • 8/12/2019 01-B Trees

    46/47

    Deleting a key from B-Tree

    Since most of the keys in a B-tree are in the leaves

      In practice, deletion operations are most often used to delete keys from leaves

      B-TREE-DELETE then acts in one downward pass through the tree, without having to back up

    When deleting a key in an internal node, the procedure makes a downward pass through the tree but may have to return to the node from which the key was deleted to replace the key with its predecessor or successor (cases 2a and 2b)

  • 8/12/2019 01-B Trees

    47/47

    Deleting a key from B-Tree

    Involves only O(h) disk operations

    CPU time is O(th) = O(t log_t n)
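The deck gives the deletion cases in prose only. The following Python sketch follows cases 1, 2a-2c, and 3a-3b as listed above, under several assumptions: keys are distinct integers, the names are hypothetical, disk operations are omitted, and for simplicity the predecessor or successor is located first and then deleted in a second descent instead of in the single pass the slides mention.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    leaf: bool = True
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def merge(x: Node, i: int) -> Node:
    """Merge x.children[i], x.keys[i], and x.children[i + 1] into one node."""
    y, z = x.children[i], x.children.pop(i + 1)
    y.keys += [x.keys.pop(i)] + z.keys
    y.children += z.children
    return y


def extreme_key(x: Node, last: bool) -> int:
    """Largest (last=True) or smallest key in the subtree rooted at x."""
    while not x.leaf:
        x = x.children[-1 if last else 0]
    return x.keys[-1 if last else 0]


def delete(x: Node, k: int, t: int) -> None:
    """Delete k below x; x has at least t keys unless it is the root."""
    i = 0
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and x.keys[i] == k:
        if x.leaf:                                      # case 1
            x.keys.pop(i)
        elif len(x.children[i].keys) >= t:              # case 2a
            x.keys[i] = extreme_key(x.children[i], last=True)
            delete(x.children[i], x.keys[i], t)
        elif len(x.children[i + 1].keys) >= t:          # case 2b
            x.keys[i] = extreme_key(x.children[i + 1], last=False)
            delete(x.children[i + 1], x.keys[i], t)
        else:                                           # case 2c
            delete(merge(x, i), k, t)
        return
    if x.leaf:
        return                                          # k is not in the tree
    c = x.children[i]
    if len(c.keys) == t - 1:                            # case 3
        if i > 0 and len(x.children[i - 1].keys) >= t:  # 3a, borrow from left
            left = x.children[i - 1]
            c.keys.insert(0, x.keys[i - 1])
            x.keys[i - 1] = left.keys.pop()
            if not left.leaf:
                c.children.insert(0, left.children.pop())
        elif i < len(x.keys) and len(x.children[i + 1].keys) >= t:  # 3a, right
            right = x.children[i + 1]
            c.keys.append(x.keys[i])
            x.keys[i] = right.keys.pop(0)
            if not right.leaf:
                c.children.append(right.children.pop(0))
        else:                                           # 3b, merge with a sibling
            c = merge(x, i - 1 if i > 0 else i)
    delete(c, k, t)


def btree_delete(root: Node, k: int, t: int) -> Node:
    """Delete k and return the root, shrinking the tree if the root empties."""
    delete(root, k, t)
    if not root.keys and not root.leaf:
        root = root.children[0]
    return root


def inorder(x: Node) -> List[int]:
    """List all keys in sorted order (helper for checking the result)."""
    out: List[int] = []
    for i, key in enumerate(x.keys):
        if not x.leaf:
            out += inorder(x.children[i])
        out.append(key)
    if not x.leaf:
        out += inorder(x.children[-1])
    return out
```

`btree_delete` returns the root because, as noted on the earlier slide, the root is replaced by its only child once it becomes an internal node with no keys.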