2,3,4 and red black tree

Balanced search trees: 2-3-4 trees

2-3-4 (or 2-4) trees improve the efficiency of the insertItem and deleteItem methods of 2-3 trees, because both operations are performed on the way down the path from the root to the leaf. However, they require more memory, since each node stores 3 data items and 4 pointers.

Definition: A 2-3-4 tree is a general tree which satisfies the following properties:

1. Each node may store up to three data items.
2. Each node may have up to four children.
3. The second and third data items in any node may be empty, in which case the sentinel value emptyFlag is stored there (assume emptyFlag := 0). If they are not empty, the first data item precedes the second according to the specified ordering relationship, and the second data item precedes the third.
4. For each node, data in the first child precedes the first data item in the node; data in the second child follows the first data item, but precedes the second; data in the third child follows the second data item, but precedes the third; data in the fourth child follows the third data item.
5. All leaf nodes are on the same level.

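Example 2-3-4 tree (items stored in the same node are bracketed together)

Level 0 (root): [4]
Level 1: [2] [6 8]
Level 2 (leaves): [1] [3] under [2]; [5] [7] [9 10 11] under [6 8]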

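class Node234tree {
    // Child references, ordered left to right
    Node234tree firstChild;
    Node234tree secondChild;
    Node234tree thirdChild;
    Node234tree fourthChild;
    // Reference back to the parent node
    Node234tree parent;
    // Data items; secondItem and thirdItem hold emptyFlag (0) when unused
    int firstItem;
    int secondItem;
    int thirdItem;

    // ... class methods follow
}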

Search in 2-3-4 trees

The search algorithm is similar to that in 2-3 trees and binary search trees. In the example 2-3-4 tree, the search for 10 is carried out as follows:

1. Compare 10 to the only item in the root. 10 > 4, continue the search in the second child.
2. 10 > 6 and 10 > 8, continue the search in the third child.
3. 10 > 9, 10 = 10. Stop.

As in 2-3 trees, the efficiency of the search operation is guaranteed to be O(log n). On average, it will be better than the search efficiency in 2-3 trees, because the height of a 2-3-4 tree might be less than the height of the 2-3 tree with the same data.
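The search steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the notes' own implementation: the class name `Search234Sketch`, the nested `Node` class and the helper `exampleTree()` are hypothetical, and the sketch assumes emptyFlag = 0, so 0 cannot be stored as a key.

```java
// Hypothetical sketch: the node layout mirrors Node234tree; emptyFlag = 0 marks an unused slot.
class Search234Sketch {
    static final int EMPTY_FLAG = 0;

    static class Node {
        Node firstChild, secondChild, thirdChild, fourthChild;
        int firstItem, secondItem = EMPTY_FLAG, thirdItem = EMPTY_FLAG;

        Node(int... items) {
            firstItem = items[0];
            if (items.length > 1) secondItem = items[1];
            if (items.length > 2) thirdItem = items[2];
        }
    }

    /** Returns true if key is stored in the subtree rooted at node. */
    static boolean contains(Node node, int key) {
        while (node != null) {
            if (key == node.firstItem) return true;
            if (key < node.firstItem) { node = node.firstChild; continue; }
            // key > firstItem: if there is no second item, only the second child remains
            if (node.secondItem == EMPTY_FLAG) { node = node.secondChild; continue; }
            if (key == node.secondItem) return true;
            if (key < node.secondItem) { node = node.secondChild; continue; }
            // key > secondItem: same reasoning for the third item
            if (node.thirdItem == EMPTY_FLAG) { node = node.thirdChild; continue; }
            if (key == node.thirdItem) return true;
            node = (key < node.thirdItem) ? node.thirdChild : node.fourthChild;
        }
        return false; // fell off a leaf: the key is not in the tree
    }

    /** Builds the example tree: root [4]; children [2], [6 8]; leaves [1] [3] [5] [7] [9 10 11]. */
    static Node exampleTree() {
        Node root = new Node(4);
        root.firstChild = new Node(2);
        root.firstChild.firstChild = new Node(1);
        root.firstChild.secondChild = new Node(3);
        root.secondChild = new Node(6, 8);
        root.secondChild.firstChild = new Node(5);
        root.secondChild.secondChild = new Node(7);
        root.secondChild.thirdChild = new Node(9, 10, 11);
        return root;
    }
}
```

Calling `Search234Sketch.contains(Search234Sketch.exampleTree(), 10)` follows the same three comparisons as the walk-through: right of 4, right of 6 and 8, then a match in the node holding 9, 10 and 11.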

Insertion in 2-3-4 trees

Step 1. Search for the item to be inserted (same as in 2-3 trees).
Step 2. Insert at the leaf level. The following cases are possible:

• The termination node is a 2-node. Then make it a 3-node, and insert the new item appropriately.
• The termination node is a 3-node. Then make it a 4-node, and insert the new item appropriately.
• The termination node is a 4-node. Split it, pass the middle item to the parent, and insert the new item appropriately.

General rules for inserting new nodes in 2-3-4 trees:

Rule 1: During the search step, every time a 2-node connected to a 4-node is encountered, transform it into a 3-node connected to two 2-nodes.

Rule 2: During the search step, every time a 3-node connected to a 4-node is encountered, transform it into a 4-node connected to two 2-nodes.

Note that the two 2-nodes resulting from these transformations together have the same number of children as the original 4-node. This is why the split of a 4-node does not affect any nodes below the level where the split occurs.


Efficiency of search and insert operations

Result 1: Search in a 2-3-4 tree with N nodes takes O(log N) time in the worst case, which arises when all nodes are 2-nodes. If the tree contains 3-nodes and 4-nodes, it is shorter and the search visits fewer levels.

Result 2: Insertion into a 2-3-4 tree takes O(log N) time, and on average requires fewer than one node split.

Deletion in 2‐3‐4 tree

Consider our example tree:

Level 0 (root): [4]
Level 1: [2] [6 8]
Level 2 (leaves): [1] [3] under [2]; [5] [7] [9 10 11] under [6 8]

The following special cases (with multiple sub-cases each) are possible:

Case 1 (three sub-cases): The item is deleted from a leaf node (a node with external children). In the two easy sub-cases the leaf currently contains 2 or 3 items: delete the item, transforming a 4-node into a 3-node, or a 3-node into a 2-node. No other nodes are affected. Example: delete 9. The existing 4-node containing 9, 10, and 11 is transformed into a 3-node containing 10 and 11. Deleting from a 2-node (the third sub-case) requires an item to be drawn from the parent node, which in turn must be replaced by an item from the sibling node (if the sibling node is NOT a 2-node as well). See case 2.

Deletion in 2‐3‐4 tree (contd.)

Case 2 (with several more sub-cases): Delete from a node that has non-external children. For example, delete 8. This case can be reduced to case 1 by finding the item that precedes the one to be deleted in in-order traversal (7, in our example) and exchanging the two items. If 7 were part of a 3-node or 4-node, 8 could then be deleted easily. However, since 8 is now the only item in its node, we have a case of underflow. This requires that an item from the parent node be transferred to the underflow node, and substituted in the parent node by an item from the sibling node.

In our example, 7 will be transferred back to where it was, and 9 will move to the parent node to fill the gap.

However, if the sibling node is also a 2-node, the so-called fusing takes place. That is, the two 2-node siblings are "fused" into a single 3-node, after an item is transferred from the parent node. The latter means that the parent can now handle one less child, and it indeed has one child less after two of its former children are fused.

The last sub-case arises when the parent node is also a 2-node. Then it must in turn borrow from its parent, and so on; if this reaches the root, the 2-3-4 tree becomes one level shorter.

2-3-(4) Trees

Efficiency of 2-3 Tree

• As for any search tree, the efficiency depends on the tree’s height
• A 2-3 tree of height h with the smallest number of keys is a full tree of 2-nodes: level 0 holds 1 item, level 1 holds 2 items, ..., level h−1 holds $2^{h-1}$ items
• So $n \ge 1 + 2 + \dots + 2^{h-1} = 2^h - 1$, which gives $h \le \log_2(n + 1)$
• A 2-3 tree of height h with the largest number of keys is a full tree of 3-nodes, each with two keys and three children
• So $n \le 2 \cdot 1 + 2 \cdot 3 + \dots + 2 \cdot 3^{h-1} = 3^h - 1$, which gives $h \ge \log_3(n + 1)$
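As a quick check of the two height bounds, here is a worked instance for a 2-3 tree with n = 26 keys (the value 26 is only an illustrative choice, not taken from the slides):

```latex
% Illustrative worked example: a 2-3 tree holding n = 26 keys
\begin{align*}
  h &\le \log_2(n+1) = \log_2 27 \approx 4.75 &&\Rightarrow\; h \le 4,\\
  h &\ge \log_3(n+1) = \log_3 27 = 3          &&\Rightarrow\; h \ge 3.
\end{align*}
```

So any 2-3 tree on 26 keys has height 3 or 4: height 3 only if every node is a 3-node, and height 4 if enough of the nodes are 2-nodes.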

2-3 Tree Visualization

• http://slady.net/java/bt/view.php?w=600&h=450
• Text input box and three buttons
– Enter a number in the text box and click ”Insert” to have it entered into the 2-3 tree
– Similarly, you can delete from the tree and search the tree
– All three buttons work on data entered in the same text box

Practice

• Construct a 2-3 tree for the list 9, 5, 8, 3, 2, 4, 7 by successive insertions

2-3-4 Trees

• Same as 2-3 tree, except now nodes can have four subtrees (and 3 items)
• Traversal and searching operations are simply extensions of the same procedures for Binary Search Trees (BST) and 2-3 Trees
• 2-3-4 trees are useful because the steps required to resolve insertion and deletion dilemmas are reduced compared to 2-3 trees

Insertion

• In order to insert an item into a 2-3-(4) tree it is first necessary to locate the leaf that the item will be inserted into
• This requires that we start from the root and make comparison decisions until we arrive at a leaf
• This effectively traverses a “path” from the root node down to that leaf

The difficulty with insertions

• Recall that in 2-3 trees we blindly followed this path to the leaf. Then we (over)filled nodes to accommodate the new item
• Overfilled nodes were resolved by pushing items up to the parent
• This push had the possibility of overfilling the parent as well, and the problem was propagated up the tree

Taking advantage (of descent)

• 2-3-4 trees make use of the descent from the root to the leaf
• 2-3 trees blindly descend to the leaf and make modifications to the tree only on the way back up as nodes become overfull
• 2-3-4 trees also make modifications to the tree during the descent to find the leaf

The simple modification

• While searching for the leaf on the way down, if we encounter a 4-node then we immediately split that node
– Before the split: root [30 50 70] with children [10 15 20] [40] [60] [80 90]
– After the split: root [50] with children [30] and [70]; [30] has children [10 15 20] [40], and [70] has children [60] [80 90]
• Why would you do such craziness?
– To make room for items that might be pushed up! (Now 50, 30 and 70 can each hold at least one more item that might be pushed up from below.)

Reasoning

• 2-nodes and 3-nodes can always hold one more item
• The only nodes that can’t hold one more item are 4-nodes
• Items are only pushed up to ancestors
• If we encounter an ancestor on the way down that won’t be able to accept an item, we can transform it into 2-nodes that can, before the insertion is actually performed

Special case: the 4-node is the root (figure omitted)

Other cases: the 4-node has a 2-node parent (figure omitted)

Other cases: the 4-node has a 3-node parent (figure omitted)

Guarantee

• This guarantees that the exact same insertion routine presented for 2-3 trees can be performed without ever having to worry about “resolving” overfull nodes

Example

• Adding 16 to the following tree: root [30 50 70] with children [10 15 20] [40] [60] [80 90]
– Step 1a (descent): Is the current node (the root) a 4-node? Yes, so split it. The tree becomes root [50] with children [30] and [70]; [30] has children [10 15 20] [40], and [70] has children [60] [80 90]
– Step 1b: Compare the search key with the current node (16 < 50) and descend to [30]
– Step 1a: Is the current node [30] a 4-node? No.
– Step 1b: Compare the search key with the current node (16 < 30) and descend to [10 15 20]
– Step 1a: Is the current node a 4-node? Yes, so split it: 15 moves up into the parent, which becomes [15 30] with children [10] [20] [40]
– 15 < 16 < 30, which is the middle branch, so descend to [20]
• The leaf arrived at is a 2-node (or maybe a 3-node), so you can simply insert the new item and nothing will have to propagate up
• After the insertion: root [50] with children [15 30] and [70]; [15 30] has children [10] [16 20] [40], and [70] has children [60] [80 90]
• Nothing can propagate up because the necessary space was made during descent to the leaf
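The descent-and-split procedure traced in this example can be sketched as below. This is only one possible rendering, under assumptions that are not in the slides: the class name `TopDown234Sketch` is hypothetical, nodes keep their keys and children in lists rather than in the fixed fields of `Node234tree`, and equal keys are sent to the left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of top-down insertion; list-based nodes are an assumption made for brevity.
class TopDown234Sketch {
    static class Node {
        final List<Integer> keys = new ArrayList<>();   // up to 3 keys, kept sorted
        final List<Node> children = new ArrayList<>();  // empty for a leaf, else keys.size() + 1 entries

        static Node of(List<Integer> keys, Node... kids) {
            Node n = new Node();
            n.keys.addAll(keys);
            n.children.addAll(Arrays.asList(kids));
            return n;
        }
        boolean isLeaf() { return children.isEmpty(); }
        boolean isFull() { return keys.size() == 3; }   // a 4-node
        @Override public String toString() {
            return isLeaf() ? keys.toString() : keys + children.toString();
        }
    }

    Node root = new Node();

    /** Splits the 4-node parent.children[i]; its middle key moves up into parent. */
    private static void splitChild(Node parent, int i) {
        Node full = parent.children.get(i);
        Node right = new Node();
        right.keys.add(full.keys.remove(2));             // largest key goes to the new right node
        int middle = full.keys.remove(1);                // middle key moves up
        if (!full.isLeaf()) {                            // the two right-most subtrees move along
            right.children.add(full.children.remove(2));
            right.children.add(full.children.remove(2));
        }
        parent.keys.add(i, middle);
        parent.children.add(i + 1, right);
    }

    /** Index of the child (or key slot) that key belongs to; equal keys go left. */
    private static int childIndex(Node node, int key) {
        int i = 0;
        while (i < node.keys.size() && key > node.keys.get(i)) i++;
        return i;
    }

    void insert(int key) {
        if (root.isFull()) {                             // special case: the 4-node is the root
            Node newRoot = new Node();
            newRoot.children.add(root);
            splitChild(newRoot, 0);
            root = newRoot;
        }
        Node node = root;
        while (!node.isLeaf()) {
            int i = childIndex(node, key);
            if (node.children.get(i).isFull()) {         // split every 4-node met on the way down
                splitChild(node, i);
                if (key > node.keys.get(i)) i++;         // pick the correct half after the split
            }
            node = node.children.get(i);
        }
        node.keys.add(childIndex(node, key), key);       // the leaf is a 2-node or 3-node by now
    }
}
```

Building the starting tree of the example with `Node.of(...)` and calling `insert(16)` reproduces the final shape above: root [50], children [15 30] and [70], with 16 landing in the leaf [16 20]. Because every 4-node on the path is split before it is entered, the final `add` never overfills a node.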

2-3-4 Tree Visualization

• http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml
• Insert 60, 30, 10, 20, 50, 40, 70, 80, 15, 90, 100

(Figures omitted: the tree after inserting 60, 30, 10, 20; after 50, 40; after 70; after 80, 15; after 90; and after 100.)

2-3-4 Deletions

• Like with the 2-3 tree, to delete we
– find the node containing the value to be deleted
– find the value that comes next in the tree
– swap the next value with the value to be deleted
– delete the swapped value, which is now in a leaf node that is
  • either a 3-node or 4-node, and we do not have to physically delete a node, or
  • a 2-node, and we have to take care of removing the node from the tree

• Bottom-up strategy
– if the 2-node has a sibling that is a 3-node or 4-node, redistribute the values between the sibling, parent and current node
– if the 2-node has no nearby 3-node or 4-node siblings, then merge it with a sibling and an item from the parent, move up to the parent level, and perform the deletion recursively

• Top-down strategy
– when searching for the node containing the value to be deleted, we will merge (collapse) any 2-node (except the root) into a larger node
– in this way, we can be assured that any physical removal will take place only from a 3-node or 4-node
– there are many merge situations, just as there were 6 split situations
  • the cases depend on the type of node that is the given 2-node’s parent and a next sibling to the left or right

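2-3-4 Deletions

Turning a 2-node into a 3-node ...

Case 1: an adjacent sibling has 2 or 3 items
• "steal" an item from the sibling by rotating items and moving a subtree (note: the parent has at least two items, unless it is the root)
• Before: parent [30 50] with children [10 20] and [40]* (the 2-node); the rightmost subtree of [10 20] is 25
• After the "rotation": parent [20 50] with children [10] and [30 40]; subtree 25 is now the leftmost subtree of [30 40]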
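The Case 1 "rotation" can be sketched as follows. This is an illustrative fragment, not the slides' code: the class `Borrow234Sketch`, its list-based `Node` and the method `rotateFromLeft` are hypothetical names, and only the variant that borrows from the left sibling is shown (borrowing from the right sibling is symmetric).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of deletion Case 1: borrow an item through the parent from the left sibling.
class Borrow234Sketch {
    static class Node {
        final List<Integer> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>();  // empty for a leaf
        Node(Integer... ks) { keys.addAll(Arrays.asList(ks)); }
    }

    /**
     * parent.children[i] is a 2-node and its left sibling parent.children[i - 1]
     * holds 2 or 3 keys. The separating key moves down, the sibling's largest key
     * moves up, and the sibling's rightmost subtree (if any) changes owner.
     */
    static void rotateFromLeft(Node parent, int i) {
        Node node = parent.children.get(i);
        Node left = parent.children.get(i - 1);
        node.keys.add(0, parent.keys.get(i - 1));                        // separator moves down
        parent.keys.set(i - 1, left.keys.remove(left.keys.size() - 1));  // sibling's largest key moves up
        if (!left.children.isEmpty()) {
            node.children.add(0, left.children.remove(left.children.size() - 1));
        }
    }
}
```

With parent [30 50] and children [10 20] and [40], `rotateFromLeft(parent, 1)` yields parent [20 50] with children [10] and [30 40], matching the figure description above.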

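Case 2: each adjacent sibling has only one item
• "steal" an item from the parent and merge the node with a sibling (note: the parent has at least two items, unless it is the root)
• Before: parent [30 50] with children [10] and [40]* (the 2-node); subtree 25 hangs to the right of 10 and subtree 35 to the left of 40
• After merging: parent [50] with child [10 30 40]; subtrees 25 and 35 now hang between 10 and 30 and between 30 and 40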

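Case 3: parent is root, and the parent and each adjacent sibling have only one item
• merge the node with the parent and the sibling
• Before: root [30] with children [10] and [40]* (the 2-node); subtrees 25 and 35 as in Case 2
• After merging: root [10 30 40], with subtrees 25 and 35 hanging between 10 and 30 and between 30 and 40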

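Example

• Delete 40 from the following tree

Level 0 (root): [40]
Level 1: [20] [50]
Level 2: [14] [32] under [20]; [43] [62 70 79] under [50]
Level 3 (leaves): [10] [18] under [14]; [25] [33] under [32]; [42] [47] under [43]; [57 60] [66] [74] [81] under [62 70 79]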

Red-Black Trees: Idea

• We can represent a 4-node of the 2-3-4 tree using only 2-nodes if we add two new nodes (shown in red)
• Note that the ordering property of the 2-3-4 Tree translates into the ordering property of a Binary Search Tree
• The text talks about red and black links, but “links” or children references do not have the ability to contain information about their “color”

(Figure: the 4-node [4 7 12] with subtrees 2, 5, 9, 18 becomes a black node 7 with red children 4 and 12; 4 keeps subtrees 2 and 5, and 12 keeps subtrees 9 and 18.)

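Here again is our transformed 4-node: [4 7 12] with subtrees 2, 5, 9, 18 becomes black 7 with red children 4 (subtrees 2, 5) and 12 (subtrees 9, 18)

We can also apply this process to a 3-node (in two different ways): [4 7] with subtrees 2, 5, 9 becomes either
• black 7 with red left child 4 (subtrees 2, 5) and right subtree 9, or
• black 4 with left subtree 2 and red right child 7 (subtrees 5, 9)

2-nodes are essentially left alone: [4] with subtrees 2, 5 stays black 4 with subtrees 2, 5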
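The three translations can be sketched as a single helper that turns one 2-3-4 node into a small red-black subtree. The names `RedBlackEncodingSketch`, `RBNode` and `encode` are hypothetical, and the sketch arbitrarily picks the left-leaning form for 3-nodes; the right-leaning form is equally valid.

```java
// Hypothetical sketch: encode one 2-3-4 node as a red-black subtree.
class RedBlackEncodingSketch {
    static class RBNode {
        final int key;
        final boolean red;
        RBNode left, right;
        RBNode(int key, boolean red) { this.key = key; this.red = red; }
    }

    /**
     * keys holds the 1 to 3 sorted keys of a 2-3-4 node; subtrees holds its
     * keys.length + 1 already-encoded child subtrees (entries may be null).
     */
    static RBNode encode(int[] keys, RBNode[] subtrees) {
        switch (keys.length) {
            case 1: { // 2-node: a single black node
                RBNode black = new RBNode(keys[0], false);
                black.left = subtrees[0];
                black.right = subtrees[1];
                return black;
            }
            case 2: { // 3-node: black larger key with a red left child (left-leaning choice)
                RBNode black = new RBNode(keys[1], false);
                RBNode redLeft = new RBNode(keys[0], true);
                redLeft.left = subtrees[0];
                redLeft.right = subtrees[1];
                black.left = redLeft;
                black.right = subtrees[2];
                return black;
            }
            case 3: { // 4-node: black middle key with two red children
                RBNode black = new RBNode(keys[1], false);
                RBNode redLeft = new RBNode(keys[0], true);
                RBNode redRight = new RBNode(keys[2], true);
                redLeft.left = subtrees[0];
                redLeft.right = subtrees[1];
                redRight.left = subtrees[2];
                redRight.right = subtrees[3];
                black.left = redLeft;
                black.right = redRight;
                return black;
            }
            default:
                throw new IllegalArgumentException("a 2-3-4 node holds 1 to 3 keys");
        }
    }
}
```

For instance, `encode(new int[] {4, 7, 12}, new RBNode[4])` returns a black node 7 whose children 4 and 12 are both red.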

• A Red-Black Tree is a Binary-Tree representation of a 2-3-4 Tree
– Think of black nodes as representing 2-3-4 Tree nodes
– Think of red nodes as being the extra ones required to make a Binary Tree out of the 2-3-4 Tree
– It is no longer true that every leaf is at the same level. However, given a node, every path from it down to a leaf goes through the same number of black nodes

(Figure omitted: a 2-3-4 tree and the equivalent red-black tree.)

• Every red-black tree has an equivalent 2-3-4 representation

(Figure omitted: a 2-3-4 tree on the keys A, C, E, G, H, I, L, M, N, P, R, S, X and the equivalent red-black tree.)

• Thus we have the following restriction in the definition of red-black trees

• There must be an equal number of black nodes on every path from the root to a node with fewer than two children

Practice

• If you take any 2-3-4 tree you can illustrate it as a Red-Black Tree…
• Figure 12-20 from the book as a 2-3-4 tree (figure omitted)
• Same tree as a Red-Black tree (figure omitted)
• There are 15 other valid variations

• Red-black trees are undercover 2-3-4 trees
• We can tell if a node is really a 4-node by checking if both its left and right children are colored red
• A node is a 3-node if it is not a 4-node and has one red child
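The two checks above amount to counting the red children of a black node. A minimal sketch, with hypothetical names (`NodeKindSketch`, `kindOf`) and a bare-bones node class assumed for illustration:

```java
// Hypothetical sketch: recover the 2-3-4 node kind that a black red-black node stands for.
class NodeKindSketch {
    static class Node {
        final int key;
        final boolean red;
        final Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key; this.red = red; this.left = left; this.right = right;
        }
    }

    /** Null links count as black. */
    static boolean isRed(Node n) { return n != null && n.red; }

    /** Returns 2, 3 or 4 for a black node: 2 plus the number of red children. */
    static int kindOf(Node n) {
        if (n == null || n.red) {
            throw new IllegalArgumentException("expected a black node");
        }
        int redChildren = (isRed(n.left) ? 1 : 0) + (isRed(n.right) ? 1 : 0);
        return 2 + redChildren;
    }
}
```

A black node with two red children is reported as a 4-node, with one red child as a 3-node, and with none as a 2-node.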

Red-black trees: benefits

• 2-3 and 2-3-4 trees typically waste a lot of memory on unused items and child references
• 2-nodes are really 4-nodes with 2/3 of memory for that node being unused
• Red-black trees have the advantages of 2-3-4 trees without the overhead

Red-black trees: insertion

• Insertion in a red-black tree can be implemented as a direct translation of the top-down 2-3-4 splitting algorithm
– search from root to leaf and then insert the new value
– we must maintain height-balancing. How?
  • in the 2-3-4 tree, we split any 4-node on the way down the tree using one of 6 cases
  • we do the same for our red-black tree, implementing the 6 cases from the point of view of red-black nodes rather than 4-nodes
  • most of the splitting is done by re-coloring nodes, but there are some cases that require additional rotations

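Red-black trees: insertion

• To split a root 4-node: recolor the two children black

(Figure: the root 4-node [X Y Z] becomes Y with children X and Z; in red-black terms, black Y with red children X and Z becomes black Y with black children X and Z.)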

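Red-black trees: insertion

• To split a non-root 4-node: flip the color of all three nodes

(Figure: the 4-node [X Y Z] below the 2-node [W] is split so the parent becomes [W Y] with children [X] and [Z]; in red-black terms, black Y with red children X and Z becomes red Y with black children X and Z, below black W.)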
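Both recoloring rules for splitting a 4-node can be sketched in a few lines. The class `ColorFlipSketch`, its `Node` type and the method `split4Node` are hypothetical names chosen for this illustration; the sketch assumes the caller has already checked that `y` is black with two red children.

```java
// Hypothetical sketch: splitting a 4-node in a red-black tree by recoloring only.
class ColorFlipSketch {
    static class Node {
        final int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    /**
     * y is a black node with two red children (a 4-node).
     * Both children turn black. A non-root y turns red, which corresponds to
     * the middle key moving up into the parent; a root y stays black.
     */
    static void split4Node(Node y, boolean isRoot) {
        y.left.red = false;
        y.right.red = false;
        y.red = !isRoot;
    }
}
```

After a non-root split the now-red `y` may sit directly below a red parent, which is the situation that needs a rotation in addition to recoloring.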

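Red-black trees: insertion

• One problem is that we may end up with two adjacent red nodes
– This happens if the parent is a 3-node oriented the wrong way round
– We fix this with a single rotation and further recoloring

(Figure: in 2-3-4 terms, the 4-node [X Y Z] below the 3-node [V W] is split so the parent becomes [V W Y] with children [X] and [Z]. In red-black terms, after the color flip the red node Y sits below the red node W, whose parent is V; a single rotation and recoloring make W the subtree root with red children V and Y, while X and Z stay as black children of Y.)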

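Convention

• Leaf nodes are null links; the rest are internal nodes

(Figure omitted: an example red-black tree on the keys 5, 10, 15, 20, 30, 40, 50, 55, 60, 65, 70, 80, 85, 90, with null links drawn as leaves.)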

Definition of black-height

• The number of black nodes on any path from a node, x, in an R-B tree to a descendant leaf, but excluding node x itself, is the node’s black-height, denoted bh(x)
• If T is an R-B tree, we define bh(T) as the black-height of its root
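A black-height computation along the lines of this definition might look as follows. The names `BlackHeightSketch` and `blackHeight` are hypothetical; the sketch follows the convention that leaves are null links and counts them as black, and it returns -1 when two paths below a node disagree, which a valid red-black tree never does.

```java
// Hypothetical sketch: compute bh(x), treating null links as black leaves.
class BlackHeightSketch {
    static class Node {
        final int key;
        final boolean red;
        Node left, right;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    static boolean isBlack(Node n) { return n == null || !n.red; }

    /**
     * Number of black nodes on any path from x down to a null leaf,
     * excluding x itself; -1 if the paths below x do not agree.
     */
    static int blackHeight(Node x) {
        if (x == null) return 0;                 // a null leaf has black-height 0
        int left = blackHeight(x.left);
        int right = blackHeight(x.right);
        if (left < 0 || right < 0) return -1;    // imbalance further down
        left += isBlack(x.left) ? 1 : 0;         // count the child itself if it is black
        right += isBlack(x.right) ? 1 : 0;
        return left == right ? left : -1;
    }
}
```

For a black node with two red children and nothing below them, every path passes only the null leaf as a black node, so the black-height is 1.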

Black heights

(Figure omitted: the same example red-black tree with nodes annotated by their black-heights; the values 0 to 3 appear.)

Node Height

• The height of a node v is the number of nodes on the longest path from v to a leaf, but excluding node v itself

(Figure omitted: a tree on the nodes A to I in which the height of node B is 3 and the height of the tree is 4.)

Theorem 1 – Any red-black tree with root x has at least $n = 2^{bh(x)} - 1$ internal nodes, where bh(x) is the black-height of node x

Proof: by induction on the height of x
– Base step: x has height 0 (i.e. null leaf node)
  • What is bh(x)?
– Inductive step: x has positive height and 2 children
  • Each child has black-height of bh(x) or bh(x) – 1 (Why?)
  • Since the height of a child of x is less than the height of x itself, we can apply the inductive hypothesis to conclude that each child has $\ge 2^{bh(x)-1} - 1$ internal nodes. So, the tree rooted at x has at least $(2^{bh(x)-1} - 1) + (2^{bh(x)-1} - 1) + 1 = 2^{bh(x)} - 1$ internal nodes

Theorem 2 – In a red-black tree, at least half the nodes on any path from the root to a leaf, not including the root, must be black

Proof: If a red node on the path has a child, then the color of the child must be black

Theorem 3 – A red-black tree with n internal nodes has height $h \le 2\log_2(n + 1)$

Proof: Let h be the height of the red-black tree with root x. By Theorem 2, $bh(x) \ge h/2$. From Theorem 1, $n \ge 2^{bh(x)} - 1$. Therefore

$n \ge 2^{h/2} - 1$

$n + 1 \ge 2^{h/2}$

$\log_2(n + 1) \ge h/2$

$2\log_2(n + 1) \ge h$

