2,3,4 and red black tree
Balanced search trees: 2-3-4 trees.
2-3-4 (or 2-4) trees improve the efficiency of the insertItem and deleteItem methods of 2-3 trees, because both operations are performed on the path from the root to the leaf. However, they require more memory, since each node stores 3 data items and 4 pointers.
Definition: A 2-3-4 tree is a general tree which satisfies the following properties:
1. Each node may store up to three data items.
2. Each node may have up to four children.
3. The second and third data items in any node may be empty, in which case the sentinel value emptyFlag is stored there (assume emptyFlag := 0). If they are not empty, the first data item precedes the second one according to the specified ordering relationship, and the second data item precedes the third.
4. For each node, data in the first child precedes the first data item in the node; data in the second child follows the first data item, but precedes the second; data in the third child follows the second data item, but precedes the third; data in the fourth child follows the third data item.
5. All leaf nodes are on the same level.
Example 2-3-4 tree (one level per line, nodes listed left to right):
[4]
[2] [6 8]
[1] [3] [5] [7] [9 10 11]
class Node234tree {
    Node234tree firstChild;
    Node234tree secondChild;
    Node234tree thirdChild;
    Node234tree fourthChild;
    Node234tree parent;
    int firstItem;
    int secondItem;
    int thirdItem;
    // ... class methods follow
}
Search in 2‐3‐4 trees
The search algorithm is similar to that in 2-3 trees and binary search trees. In the example 2-3-4 tree, the search for 10 is carried out as follows:
1. Compare 10 to the only item in the root. 10 > 4, continue the search in the second child.
2. 10 > 6 and 10 > 8, continue the search in the third child.
3. 10 > 9, 10 = 10. Stop.
As in 2-3 trees, the efficiency of the search operation is guaranteed to be O(log n). On average, it will be better than the search efficiency in 2-3 trees, because the height of a 2-3-4 tree might be less than the height of the 2-3 tree with the same data.
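The search steps above can be sketched in Java. This is only an illustration: the class name Node234Sketch, the varargs constructor, and the helpers withChildren and exampleTree are assumptions made for the example, while the item and child field names and the emptyFlag = 0 sentinel follow the node class and definition above. Because 0 is the sentinel, 0 cannot be stored as data in this sketch.

```java
// Sketch only: a cut-down node in the style of Node234tree (no parent link).
class Node234Sketch {
    static final int EMPTY_FLAG = 0;   // sentinel for an unused item slot
    int firstItem, secondItem = EMPTY_FLAG, thirdItem = EMPTY_FLAG;
    Node234Sketch firstChild, secondChild, thirdChild, fourthChild;

    Node234Sketch(int... items) {      // 1 to 3 items, already in order
        firstItem = items[0];
        if (items.length > 1) secondItem = items[1];
        if (items.length > 2) thirdItem = items[2];
    }

    Node234Sketch withChildren(Node234Sketch... c) {   // 2 to 4 children
        firstChild = c[0];
        secondChild = c[1];
        if (c.length > 2) thirdChild = c[2];
        if (c.length > 3) fourthChild = c[3];
        return this;
    }

    // Walk down from the root, comparing the key with each item in turn.
    static boolean search(Node234Sketch node, int key) {
        while (node != null) {
            if (key == node.firstItem) return true;
            if (key < node.firstItem) { node = node.firstChild; continue; }
            if (node.secondItem == EMPTY_FLAG) { node = node.secondChild; continue; }
            if (key == node.secondItem) return true;
            if (key < node.secondItem) { node = node.secondChild; continue; }
            if (node.thirdItem == EMPTY_FLAG) { node = node.thirdChild; continue; }
            if (key == node.thirdItem) return true;
            node = (key < node.thirdItem) ? node.thirdChild : node.fourthChild;
        }
        return false;                  // fell off the tree: key is absent
    }

    // The example tree: root [4], children [2] and [6 8], leaves below them.
    static Node234Sketch exampleTree() {
        Node234Sketch left = new Node234Sketch(2)
            .withChildren(new Node234Sketch(1), new Node234Sketch(3));
        Node234Sketch right = new Node234Sketch(6, 8)
            .withChildren(new Node234Sketch(5), new Node234Sketch(7),
                          new Node234Sketch(9, 10, 11));
        return new Node234Sketch(4).withChildren(left, right);
    }
}
```

Calling Node234Sketch.search(Node234Sketch.exampleTree(), 10) follows the same three steps as the trace: second child of the root, then third child of [6 8], then a match in [9 10 11].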
Insertion in 2-3-4 trees
Step 1. Search for the item to be inserted (same as in 2-3 trees).
Step 2. Insert at the leaf level. The following cases are possible:
• The termination node is a 2-node. Then, make it a 3-node, and insert the new item appropriately.
• The termination node is a 3-node. Then, make it a 4-node, and insert the new item appropriately.
• The termination node is a 4-node. Split it, pass the middle item to the parent, and insert the new item appropriately.
General rules for inserting new nodes in 2-3-4 trees:
Rule 1: During the search step, every time a 2-node connected to a 4-node is encountered, transform it into a 3-node connected to two 2-nodes.
Rule 2: During the search step, every time a 3-node connected to a 4-node is encountered, transform it into a 4-node connected to two 2-nodes.
Note that the two 2-nodes resulting from these transformations together have the same number of children as the original 4-node. This is why the split of a 4-node does not affect any nodes below the level where the split occurs.
Efficiency of search and insert operations
Result 1: Search in a 2-3-4 tree with N nodes takes at most O(log N) time. The worst case occurs when all nodes are 2-nodes. If there are 3-nodes and 4-nodes in the tree, the tree is shorter and the search takes fewer steps.
Result 2: Insertion into a 2-3-4 tree takes at most O(log N) time, and on average requires less than 1 node split.
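A short counting argument gives the height bounds behind Result 1. As a sketch, assume h counts the levels of the tree and n is the number of stored items (both are notational assumptions; the text above counts N nodes). The shortest tree for a given n consists only of 4-nodes and the tallest only of 2-nodes:

```latex
\[
2^{h} - 1 \;\le\; n \;\le\; 3\,\bigl(1 + 4 + \dots + 4^{h-1}\bigr) = 4^{h} - 1
\quad\Longrightarrow\quad
\log_4(n + 1) \;\le\; h \;\le\; \log_2(n + 1)
\]
```

So the height, and with it the length of every search path, stays within a constant factor of log n whatever mix of node types the tree contains.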
Deletion in 2‐3‐4 tree
Consider our example tree (one level per line, nodes listed left to right):
[4]
[2] [6 8]
[1] [3] [5] [7] [9 10 11]
The following special cases (with multiple sub‐cases each) are possible:
Case 1 (three sub-cases): The item is deleted from a leaf node (a node with external children). In the two easy sub-cases the leaf currently contains 2 or 3 items: delete the item, transforming a 4-node into a 3-node, or a 3-node into a 2-node. No other nodes are affected. Example: delete 9. The existing 4-node, containing 9, 10, and 11, is transformed into a 3-node, containing 10 and 11. Deleting from a 2-node (the third sub-case) requires an item to be drawn from the parent node, which in turn must be replaced by an item from the sibling node (if the sibling node is NOT a 2-node as well). See case 2.
Deletion in 2‐3‐4 tree (contd.)
Case 2 (with several more sub-cases): Delete from a node that has non-external children. For example, delete 8. This case can be reduced to case 1 by finding the item that precedes the one to be deleted in an in-order traversal (7, in our example) and exchanging the two items. If 7 were part of a 3- or 4-node, 8 could then be deleted easily. However, since 8 is now the only item in its node, we have a case of underflow. This requires that an item from the parent node be transferred to the underflow node, and substituted in the parent node by an item from the sibling node.
In our example, 7 will be transferred back to where it was, and 9 will move to the parent node to fill the gap.
However, if the sibling node is also a 2-node, so-called fusing takes place. That is, the two 2-node siblings are "fused" into a single 3-node, after an item is transferred from the parent node. The latter means that the parent can now handle one child fewer, and it indeed has one child fewer after two of its former children are fused.
The last sub-case arises when the parent node is also a 2-node. Then, it must in turn borrow from its parent, etc., possibly resulting in the 2-3-4 tree becoming one level shorter.
2-3-(4) Trees
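Efficiency of 2-3 Tree
• As for any search tree, the efficiency depends on the tree's height
• A 2-3 tree of height h with the smallest number of keys is a full tree of 2-nodes
[Figure: a full tree of 2-nodes; level 0 holds 1 item, level 1 holds 2 items, ..., level h−1 holds 2^(h−1) items]
• So $n \ge 1 + 2 + \dots + 2^{h-1} = 2^h - 1$, and therefore $h \le \log_2(n + 1)$
• A 2-3 tree of height h with the largest number of keys is a full tree of 3-nodes, each with two keys and three children
• So $n \le 2 \cdot 1 + 2 \cdot 3 + \dots + 2 \cdot 3^{h-1} = 3^h - 1$, and therefore $h \ge \log_3(n + 1)$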
2-3 Tree Visualization
• http://slady.net/java/bt/view.php?w=600&h=450
• Text input box and three buttons
– Enter a number in the text box and click "Insert" to have it entered into the 2-3 tree
– Similarly, you can delete from the tree and search the tree
– All three buttons work on data entered in the same text box
Practice
• Construct a 2-3 tree for the list 9, 5, 8, 3, 2, 4, 7 by successive insertions
2-3-4 Trees
• Same as a 2-3 tree, except now nodes can have four subtrees (and 3 items)
• Traversal and searching operations are simply extensions of the same procedures for Binary Search Trees (BST) and 2-3 Trees
• 2-3-4 trees are useful because fewer steps are required to resolve insertion and deletion dilemmas than in 2-3 trees
Insertion
• In order to insert an item into a 2-3-(4) tree it is first necessary to locate the leaf that the item will be inserted into
• This requires that we start from the root and make comparison decisions until we arrive at a leaf
• This effectively traverses a "path" from the root node down to that leaf
The difficulty with insertions
• Recall that in 2-3 trees we blindly followed this path to the leaf. Then we (over)filled nodes to accommodate the new item
• Overfilled nodes were resolved by pushing items up to the parent
• This push had the possibility of overfilling the parent as well and the problem was propagated up the tree
Taking advantage (of descent)
• 2-3-4 Trees make use of the descent from the root to the leaf
• 2-3 trees blindly descend to the leaf and make modifications to the tree only on the way back up as nodes become overfull
• 2-3-4 trees also make modifications to the tree during the descent to find the leaf
The simple modification
• While searching for the leaf on the way down, if we encounter a 4-node then we immediately split that node
[Figure: the 4-node root [30 50 70] with children [10 15 20], [40], [60], [80 90] is split into root [50] with children [30] and [70]; [30] keeps [10 15 20] and [40], and [70] keeps [60] and [80 90]]
• Why would you do such craziness?
– To make room for items that might be pushed up! (Now 50, 30 and 70 can each hold at least one more item that might be pushed up from below)
Reasoning
• 2-nodes and 3-nodes can always hold one more item
• The only nodes that can’t hold one more item are 4-nodes
• Items are only pushed up to ancestors
• If we encounter an ancestor on the way down that won’t be able to accept an item we can transform it into 2-nodes that can before the insertion is actually performed
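Special case: the 4-node is the root [figure not recovered]
Other cases: the 4-node has a 2-node parent [figure not recovered]
Other cases: the 4-node has a 3-node parent [figure not recovered]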
Guarantee
• This guarantees that the exact same insertion routine presented for 2-3 trees can be performed without ever having to worry about "resolving" overfull nodes
Example
• Adding 16 to the following tree: root [30 50 70] with children [10 15 20], [40], [60], [80 90]
• Step 1a: (descent) Is the current node (the root) a 4-node? Yes, so split it: the root becomes [50] with children [30] and [70]; [30] has children [10 15 20] and [40], and [70] has children [60] and [80 90]
• Step 1b: Compare search key with current node (16 < 50), so descend to [30]
• Step 1a: Is current node [30] a 4-node? No.
• Step 1b: Compare search key with current node (16 < 30), so descend to [10 15 20]
• Step 1a: Is current node a 4-node? Yes, so split it: 15 moves up, and the parent becomes [15 30] with children [10], [20] and [40]
• 15 < 16 < 30, which is the middle branch
• The arrived-at leaf [20] is a 2-node (or maybe a 3-node) so you can simply insert the new item and nothing will have to propagate up
• The insertion turns the leaf into [16 20]; the tree is now root [50] with children [15 30] and [70], where [15 30] has children [10], [16 20], [40] and [70] has children [60], [80 90]
• Nothing can propagate up because the necessary space was made during descent to the leaf
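The descent traced in this example can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the routine from the slides: the class TopDown234, its list-based Node, and the helpers childIndex, splitChild and show are all names invented here, and duplicate keys are simply sent to the right.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: top-down 2-3-4 insertion that splits every 4-node met on the way down.
class TopDown234 {
    static class Node {
        List<Integer> keys = new ArrayList<>();   // up to 3 keys, kept sorted
        List<Node> kids = new ArrayList<>();      // empty for a leaf
        boolean isLeaf() { return kids.isEmpty(); }
        boolean isFour() { return keys.size() == 3; }
    }

    Node root = new Node();                       // an empty leaf to start with

    void insert(int key) {
        if (root.isFour()) {                      // special case: split the root
            Node newRoot = new Node();
            newRoot.kids.add(root);
            splitChild(newRoot, 0);
            root = newRoot;                       // the tree grows by one level
        }
        Node cur = root;
        while (!cur.isLeaf()) {
            int i = childIndex(cur, key);
            if (cur.kids.get(i).isFour()) {       // split before stepping into a 4-node
                splitChild(cur, i);
                i = childIndex(cur, key);         // middle key moved up: re-pick the branch
            }
            cur = cur.kids.get(i);
        }
        cur.keys.add(childIndex(cur, key), key);  // the leaf is guaranteed to have room
    }

    // Number of keys <= key; this is also the index of the branch to follow.
    static int childIndex(Node n, int key) {
        int i = 0;
        while (i < n.keys.size() && key >= n.keys.get(i)) i++;
        return i;
    }

    // Split the 4-node parent.kids[i]: the middle key moves up into the parent,
    // the outer keys become two 2-nodes that share the four old children.
    static void splitChild(Node parent, int i) {
        Node full = parent.kids.get(i);
        Node left = new Node(), right = new Node();
        left.keys.add(full.keys.get(0));
        right.keys.add(full.keys.get(2));
        if (!full.isLeaf()) {
            left.kids.addAll(full.kids.subList(0, 2));
            right.kids.addAll(full.kids.subList(2, 4));
        }
        parent.keys.add(i, full.keys.get(1));
        parent.kids.set(i, left);
        parent.kids.add(i + 1, right);
    }

    // Text rendering: a node's keys, then its children in parentheses.
    static String show(Node n) {
        StringBuilder sb = new StringBuilder(n.keys.toString());
        if (!n.isLeaf()) {
            sb.append('(');
            for (int i = 0; i < n.kids.size(); i++) {
                if (i > 0) sb.append(' ');
                sb.append(show(n.kids.get(i)));
            }
            sb.append(')');
        }
        return sb.toString();
    }
}
```

With this sketch, inserting 60, 30, 10, 20, 50, 40, 70, 80, 15, 90 in that order happens to rebuild the example tree, and a further insert(16) splits the root and then the leaf [10 15 20] exactly as in the steps above, leaving 16 in the leaf [16 20].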
2-3-4 Tree Visualization
• http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml
• Insert 60, 30, 10, 20, 50, 40, 70, 80, 15, 90, 100
[Figures: the tree after inserting 60, 30, 10, 20; after 50, 40; after 70; after 80, 15; after 90; and after 100]
2-3-4 Deletions
• Like with the 2-3 tree, to delete we
– find the node containing the value to be deleted
– find the value that comes next in the tree
– swap the next value with the value to be deleted
– delete the swapped value, which is now in a leaf node that is
  • either a 3-node or 4-node, and we do not have to physically delete a node, or
  • a 2-node, and we have to take care of removing the node from the tree
2-3-4 Deletions
• Bottom-up strategy
– if the 2-node has a sibling that is a 3-node or 4-node, redistribute the values between the sibling, parent and current node
– if the 2-node has no nearby 3-node or 4-node siblings, then merge the sibling with the parent and move up to the parent level and perform the deletion recursively
2-3-4 Deletions
• Top-down strategy
– when searching for the node containing the value to be deleted we will merge (collapse) any 2-node (except root) into a larger node
– in this way, we can be assured that any physical removal will take place only from a 3-node or 4-node
– there are many merge situations, just as there were 6 split situations
  • the cases depend on the type of node that is the given 2-node's parent and a next sibling to the left or right
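2-3-4 Deletions
Turning a 2-node into a 3-node ...
Case 1: an adjacent sibling has 2 or 3 items
"steal" an item from the sibling by rotating items and moving a subtree
(note: parent has at least two items, unless it is the root)
[Figure, "rotation": parent [30 50] with children [10 20] and the 2-node [40]; 20 moves up into the parent and 30 moves down, giving parent [20 50] with children [10] and [30 40]*; the subtree holding 25 moves from [10 20] to [30 40]]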
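2-3-4 Deletions
Turning a 2-node into a 3-node ...
Case 2: each adjacent sibling has only one item
"steal" an item from the parent and merge the node with a sibling; the merged node holds three items, so it is in fact a 4-node
(note: parent has at least two items, unless it is the root)
[Figure, "merging": parent [30 50] with children [10] and the 2-node [40]; 30 moves down and the two children merge, giving parent [50] with child [10 30 40]*; the subtrees holding 25 and 35 stay below the merged node]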
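Cases 1 and 2 can be sketched on a list-based node in Java. This is an illustration under assumptions: the class Borrow234 and the methods borrowFromLeft and mergeWithRight are hypothetical names, only the left-sibling rotation and the right-sibling merge are shown (the mirror images are analogous), and a third child [60] is assumed in the usage example because the figures do not show the parent's last child.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: the two ways of growing a 2-node before deleting from it.
class Borrow234 {
    static class Node {
        List<Integer> keys = new ArrayList<>();   // sorted keys
        List<Node> kids = new ArrayList<>();      // empty for a leaf
        Node(Integer... ks) { keys.addAll(Arrays.asList(ks)); }
    }

    // Case 1 ("rotation"): the left sibling of parent.kids[i] has a spare key.
    static void borrowFromLeft(Node parent, int i) {
        Node node = parent.kids.get(i);
        Node sib = parent.kids.get(i - 1);
        node.keys.add(0, parent.keys.get(i - 1));                      // separator comes down
        parent.keys.set(i - 1, sib.keys.remove(sib.keys.size() - 1));  // sibling's largest goes up
        if (!sib.kids.isEmpty()) {                                     // move the subtree in between
            node.kids.add(0, sib.kids.remove(sib.kids.size() - 1));
        }
    }

    // Case 2 ("merging"): parent.kids[i] and its right sibling are both 2-nodes.
    static void mergeWithRight(Node parent, int i) {
        Node node = parent.kids.get(i);
        Node sib = parent.kids.remove(i + 1);     // the sibling disappears from the parent
        node.keys.add(parent.keys.remove(i));     // separator comes down
        node.keys.addAll(sib.keys);
        node.kids.addAll(sib.kids);
    }
}
```

For a parent [30 50] with children [10 20], [40], [60], calling borrowFromLeft(parent, 1) yields parent [20 50] with children [10], [30 40], [60]. For a parent [30 50] with children [10], [40], [60], calling mergeWithRight(parent, 0) yields parent [50] with children [10 30 40], [60].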
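2-3-4 Deletions
Turning a 2-node into a 4-node ...
Case 3: parent is root and parent and each adjacent sibling has only one item
merge node with parent and sibling
[Figure, "merging": root [30] with children [10] and the 2-node [40] collapses into the single node [10 30 40]*; the subtrees holding 25 and 35 stay below it]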
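Example
• Delete 40 from the following tree (one level per line, nodes listed left to right):
[40]
[20] [50]
[14] [32] [43] [62 70 79]
[10] [18] [25] [33] [42] [47] [57 60] [66] [74] [81]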
Red-Black Trees: Idea
• We can represent a 4-node of the 2-3-4 tree using only 2-nodes if we add two new nodes (shown in red)
• Note that the ordering property of the 2-3-4 Tree translates into the ordering property of a Binary Search Tree
• The text talks about red and black links, but "links" or children references do not have the ability to contain information about their "color"
[Figure: the 4-node [4 7 12] with subtrees 2, 5, 9, 18 becomes node 7 with red children 4 and 12; 4 gets subtrees 2 and 5, and 12 gets subtrees 9 and 18]
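Here again is our transformed 4-node
[Figure: [4 7 12] with subtrees 2, 5, 9, 18 becomes 7 with red children 4 and 12]
We can also apply this process to a 3-node (in two different ways)
[Figure: the 3-node [4 7] with subtrees 2, 5, 9 becomes either 7 with red left child 4 (4 keeps subtrees 2 and 5, 7 keeps 9), or 4 with red right child 7 (4 keeps 2, 7 keeps 5 and 9)]
2-nodes are essentially left alone
[Figure: the 2-node [4] with subtrees 2 and 5 is unchanged]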
• A Red-Black Tree is a Binary-Tree representation of a 2-3-4 Tree
– Think of black nodes as representing 2-3-4 Tree nodes
– Think of red nodes as being the extra ones required to make a Binary Tree out of the 2-3-4 Tree
– It is no longer true that every leaf is at the same level. However, given a node, every path from it down to a leaf goes through the same number of black nodes
[Figure: a larger 2-3-4 tree and its equivalent red-black tree]
• Every red-black tree has an equivalent 2-3-4 representation
[Figure: the 2-3-4 tree with root [I], children [C] and [N R], and leaves [A], [E G H], [L M], [P], [S X], shown next to its red-black equivalent]
• Thus we have the following restriction in the definition of red-black trees
• There must be an equal number of black nodes on every path from the root to a node with fewer than two children
Practice
• If you take any 2-3-4 tree you can illustrate it as a Red-Black Tree…
• Figure 12-20 from the book as a 2-3-4 tree
• Same tree as a Red-Black tree
• There are 15 other valid variations
• Red-black trees are undercover 2-3-4 trees
• We can tell if a node is really a 4-node by checking if both its left and right children are colored red
• A node is a 3-node if it's not a 4-node and has one red child
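The two checks above fit in a few lines of Java. As a sketch, assume a node carries a boolean red flag and that null links count as black; the class RBKind and the method kind are names made up for this example.

```java
// Sketch only: read off which 2-3-4 node a black red-black node stands for.
class RBKind {
    static class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key; this.red = red; this.left = left; this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red;     // null links are treated as black
    }

    // Returns 2, 3 or 4: a black node plus its red children form one 2-3-4 node.
    static int kind(Node black) {
        int redKids = (isRed(black.left) ? 1 : 0) + (isRed(black.right) ? 1 : 0);
        return 2 + redKids;
    }
}
```

For instance, a black 7 with red children 4 and 12 is reported as a 4-node, the same 7 with only the red child 4 as a 3-node, and a black node with no red children as a 2-node.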
Red-black trees: benefits
• 2-3 and 2-3-4 trees typically waste a lot of memory on unused items and child references
• A 2-node is really stored as a 4-node, with about 2/3 of the memory for that node being unused
• Red-black trees have the advantages of 2-3-4 trees without the overhead
Red-black trees: insertion
• Insertion in a red-black tree can be implemented as a direct translation of the top-down 2-3-4 splitting algorithm
– search from root to leaf and then insert the new value
– we must maintain height-balancing. How?
  • in the 2-3-4 tree, we split any 4-node on the way down the tree using one of 6 cases
  • we do the same for our red-black tree, implementing the 6 cases from the point of view of red-black nodes rather than 4-nodes
  • most of the splitting is done by re-coloring nodes, but there are some cases that require additional rotations
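Red-black trees: insertion
• To split a root 4-node: recolor the two children black
[Figure: the 4-node [X Y Z] becomes [Y] with children [X] and [Z]; in the red-black tree, Y's red children X and Z turn black]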
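Red-black trees: insertion
• To split a non-root 4-node: flip the color of all three nodes
[Figure: parent [W] with the 4-node child [X Y Z] becomes [W Y] with children [X] and [Z]; in the red-black tree, Y turns red and X and Z turn black]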
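Red-black trees: insertion
• One problem is that we may end up with two adjacent red nodes
– This happens if the parent is a 3-node oriented the wrong way round
– We fix this with a single rotation and further recoloring
[Figure: parent [V W] with the 4-node child [X Y Z] becomes [V W Y] with children [X] and [Z]; in the red-black tree, the color flip leaves W and Y both red, and a rotation makes W the top node with children V and Y]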
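The color flip and the repairing rotation can be sketched in Java. The code is an illustration under assumptions: RBSplitSketch, flipColors, rotateLeft and demo are hypothetical names, nodes are labeled with the letters from the figure, and only the left rotation needed for this orientation is shown.

```java
// Sketch only: split a 4-node by a color flip, then repair two adjacent reds.
class RBSplitSketch {
    static class Node {
        final String key;
        boolean red;
        Node left, right;
        Node(String key, boolean red, Node left, Node right) {
            this.key = key; this.red = red; this.left = left; this.right = right;
        }
    }

    // A 4-node is a black node with two red children: flip all three colors.
    static void flipColors(Node n) {
        n.red = !n.red;
        n.left.red = !n.left.red;
        n.right.red = !n.right.red;
    }

    // Left rotation at h: h's right child moves up and takes h's color, h turns red.
    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.red = h.red;
        h.red = true;
        return x;                      // new root of this subtree
    }

    // The situation from the figure: 3-node [V W] stored as black V with red
    // right child W, and below W the 4-node [X Y Z] (black Y, red X and Z).
    static Node demo() {
        Node y = new Node("Y", false,
                          new Node("X", true, null, null),
                          new Node("Z", true, null, null));
        Node w = new Node("W", true, null, y);
        Node v = new Node("V", false, null, w);
        flipColors(y);                 // Y turns red: W and Y are now adjacent reds
        return rotateLeft(v);          // W becomes the black top node
    }
}
```

After demo() the top node is a black W with red children V and Y, which is the 4-node [V W Y], and X and Z are black children of Y.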
Convention
• Leaf nodes are null links; the rest are internal nodes
[Figure: an example red-black tree holding 5, 10, 15, 20, 30, 40, 50, 55, 60, 65, 70, 80, 85, 90, with the null links drawn as leaves]
Definition of black-height
• The number of black nodes on any path from a node, x, in a R-B tree to a descendant leaf, but excluding node x itself, is the node's black-height, denoted bh(x)
• If T is an R-B tree, we define bh(T) as the black-height of its root
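The definition can be turned into a small recursive check in Java. This sketch assumes that null links are the (black) leaves and are counted on the path; the class BlackHeight and the methods bh, below, size and sample are names invented for the example, and in sample the keys 2, 5, 9 and 18 are assumed to be black.

```java
// Sketch only: compute bh(x) and detect paths that disagree.
class BlackHeight {
    static class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key; this.red = red; this.left = left; this.right = right;
        }
    }

    // Black-height of x: black nodes on any path from x down to a null leaf,
    // excluding x itself. Returns -1 if two paths give different counts.
    static int bh(Node x) {
        if (x == null) return 0;               // a null leaf has nothing below it
        int l = below(x.left), r = below(x.right);
        return (l < 0 || l != r) ? -1 : l;
    }

    // Black nodes on a path that starts at child c (c included).
    private static int below(Node c) {
        if (c == null) return 1;               // the null leaf itself is black
        int h = bh(c);
        if (h < 0) return -1;
        return h + (c.red ? 0 : 1);
    }

    // Number of internal (non-null) nodes.
    static int size(Node x) {
        return x == null ? 0 : 1 + size(x.left) + size(x.right);
    }

    // Black 7 with red children 4 and 12, each with two black children.
    static Node sample() {
        Node four = new Node(4, true, new Node(2, false, null, null), new Node(5, false, null, null));
        Node twelve = new Node(12, true, new Node(9, false, null, null), new Node(18, false, null, null));
        return new Node(7, false, four, twelve);
    }
}
```

For the sample tree bh is 2, and its 7 internal nodes comfortably exceed 2^2 − 1 = 3. Recoloring node 2 red makes the paths through it one black node short, and bh reports -1.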
Black heights
[Figure: the example red-black tree from the Convention slide, annotated with the black-heights of selected nodes (values 3, 2, 1 and 0)]
Node Height
• The height of a node v is the number of nodes on the longest path from v to a leaf, but excluding node v itself
[Figure: a tree with nodes A to I; the height of node B is 3 and the height of the tree is 4]
Theorem 1 – Any red-black tree with root x has at least $n = 2^{bh(x)} - 1$ internal nodes, where bh(x) is the black-height of node x
Proof: by induction on the height of x
– Base step: x has height 0 (i.e. null leaf node)
  • What is bh(x)?
– Inductive step: x has positive height and 2 children
  • Each child has black-height of bh(x) or bh(x) – 1 (Why?)
  • Since the height of a child of x is less than the height of x itself, we can apply the inductive hypothesis to conclude that each child has $\ge 2^{bh(x)-1} - 1$ internal nodes. So, the tree rooted at x has at least $(2^{bh(x)-1} - 1) + (2^{bh(x)-1} - 1) + 1 = 2^{bh(x)} - 1$ internal nodes
Theorem 2 – In a red-black tree, at least half the nodes on any path from the root to a leaf, not including the root, must be black
Proof: If a red node on the path has a child, then the color of the child must be black
Theorem 3 – A red-black tree with n internal nodes has height $h \le 2\log_2(n + 1)$
Proof: Let h be the height of the red-black tree with root x. By Theorem 2,
$bh(x) \ge h/2$
From Theorem 1, $n \ge 2^{bh(x)} - 1$
Therefore $n \ge 2^{h/2} - 1$
$n + 1 \ge 2^{h/2}$
$\log_2(n + 1) \ge h/2$
$2\log_2(n + 1) \ge h$
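[Figure: an overfull node n with keys k1 k2 k3 k4 and children c1 to c5, sitting between parent entries p1 and p2, is split into n1 (keys k1 k2) and n2 (key k4); k3 moves up into the parent between p1 and p2]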