lecture 9: avl trees - orccaorcca.on.ca/~yxie/courses/cs2210b-2011/htmls/notes/09-avltree.pdf ·...
TRANSCRIPT
Lecture 9: AVL Trees
D fi iti
Lecture 9: AVL Trees
DefinitionPropertiesInsertion
1Courtesy to Goodrich, Tamassia and Olga Veksler Instructor: Yuzhen Xie
BST PerformanceRecall that for a binary search tree with n nodes and of height h
methods find, insert and remove take O(h) time
The height h is O(n) in the worst case and O(log n) in the best casethe best caseThus in the worst case, find, insert and remove take O(n) time
worst case
2
best case
Balanced Tree MotivationIf we find a way to make sure the height h of a binary search tree is always O(log n), then search tree will be much more efficient, the worst case complexity for find, insert, remove will be O(log n), significantly better than O(n) for a BSTO(n) for a BST
To make sure height h of a tree is always O(log n), the tree must be “balanced”, that is for any node its left subtree should not be much higher than its right subtree
3
Balanced Trees
We have already seen an example of a b l d h i h l bibalanced tree, that is the complete binary tree
C l t bi t i t th l lComplete binary tree is not the only example of a balanced tree, that is tree with logarithmic heightg g
How to implement a balanced tree which allows operations find, remove, insert in O(log n)?
One way is with AVL trees
4
Height of a Node Definition
Recall that the height of a tree is the maximum over all node depths, or, equivalently the longest
6
2 82
3
1equivalently, the longest path in the tree 41 1
001
0
The height of a tree node vis defined as the heightis defined as the height of the subtree rooted at node v
5
AVL Tree DefinitionAn AVL Tree is a binary search tree which satisfies the height-balance
t44
4
propertyfor every internal node v of T, the heights of the children of vcan differ by at most 1
17 782
1 2
3
10can differ by at most 1
We will say that AVL trees are balanced
8832 50
48 62
1
1
2
1
10
Inventors: Adel'son-Vel'skii and Landis 1962
The height-balanceThe height-balance property guarantees that the height of an AVL tree is logarithmic in the number of
example of an AVL tree where the heights are shown next to
the nodes
6
gitems in the tree
Height of an AVL TreeTheorem: Height h of AVL tree storing n keys is O(log n).
Proof: Let us bound f(h): the minimum number of internal nodes of an AVL tree of height hnodes of an AVL tree of height h.We easily see that f(1) = 1 and f(2) = 2For h > 2, an AVL tree of height h contains the root node, one AVL subtree of height h 1 and the other of height at least h 2AVL subtree of height h-1 and the other of height at least h-2.That is, f(h) = 1 + f(h-1) + f(h-2)Knowing f(h-1) > f(h-2), we get f(h) > 2f(h-2).
The number of nodes doubles after 2 steps down the AVL tree
So f(h) > 2f(h-2), f(h) > 4f(h-4), f(h) > 8n(n-6), … … , f(h) > 2if(h-2i)
Solving the base case (h-2i=1) we get i=(h-1)/2Therefore f(h) > 2 (h-1)/2 f(1) = 2 (h-1)/2
T ki l ith f b th id h < 2l f(h) + 1
7
Taking logarithms of both sides: h < 2log f(h) + 1Thus the height of an AVL tree is O(log n)
Operations in an AVL TreeOperations in an AVL Tree
The height of an AVL tree is O(log n)The height of an AVL tree is O(log n)
Thus the search operation takes O(log n)Pe fo med j st like in BST since an AVL t ee is aPerformed just like in BST since any AVL tree is a BST
All th t’ l ft t d i t h h t i tAll that’s left to do is to show how to insertand remove in AVL trees, while maintaining
1 the height-balance property1. the height-balance property 2. the binary search tree order
8
Insertion in an AVL Tree
Insertion starts as in a binary search treeAl d b di t l dAlways done by expanding an external node.Example:
4444
17 7817 78
32 50 8832 50 88
48 6248 62
b f i ti 54
48 62
54w
9
before inserting 54
after insertion
Insertion in an AVL TreeAfter inserting a new item at a leaf, the height-balance property of the AVL tree is very likely lost
44
17 78of the AVL tree is very likely lost
Recall that height-balanceproperty requires that for any
32 50 88
height 1height 3
p p y q yinternal node the height of its children can differ by at most 1 48 62
54To make it an AVL tree again, 54To make it an AVL tree again, need to restore the balance by restructuring the tree
v“Pi t i l” t ti after insertionv“Pictorial” notation:
10
Analysis After InsertionLet us call a node unbalanced if the difference in heights of its left and right subtrees is more than 1After insertion the heights couldAfter insertion, the heights could change (increase) only for ancestors of the insertion node w
height of a node is the length of theheight of a node is the length of the longest path from that node to a leaf
Thus the only possibly unbalanced nodes are the ancestors of insertionnodes are the ancestors of insertion node wwe should search up the tree from the insertion node w looking forthe insertion node w, looking for any unbalanced nodes and correcting this unbalance somehow
also update the height of each node
w
11
also update the height of each node on the path from w to the root, as the heights of nodes on this path may have changed
Analysis After InsertionSo after inserting at node w, we follow the path from w to the root (the path of ancestors of w),(the path of ancestors of w), checking the balance of nodes
Usually, in about 50% of cases there is no unbalanced node after insertion
Suppose the first unbalanced node (as you go from w up the tree) is at position z
w
pHeight difference between the left and the right subtree of z is more than 1
tree was balanced before the insertionz
tree was balanced before the insertioneach insertion can change height only by 1
Thus this height difference is exactly 2One subtree has height p, the other height p+2
wp
p+2
12
w was inserted into the higher subtreep+2
Analysis After InsertionHeight difference between left and right subtrees of z is exactly 2One subtree has height p, the other height p+2
w was inserted into the higher subtreeTh 2 i h b i hi h l f b i hi hThere are 2 cases: right subtree is higher or left subtree is higher
zcase 1: height p+2 zcase 2:
wheight p w height pheight p+2height p+2
name the highest subtree Ssince tree was balanced before insertion height of S was p+1
S before insertion
insertion, height of S was p+1 before insertionname the root of S with ysince y is balanced after insertion,
d i t b l d ft
y
p+1 pp
13
and z is not balanced after insertion, both subtrees of y have height exactly p before insertion S S
Analysis After Insertiony
1 case 2:z
w
case 1: height p+2z
w
case 2:
height p
height p+2
S before insertion
wheight p S
w height pS
S after insertion, height p+2S before insertion, height p+1
S after insertion, height p+2
y
case bcase a
yy
pp
y
pp+1
y
p+1p
14
ww
After Insertion: 4 cases
zcase 1: zcase 2:zcase 1:
height py
z
height p
y
wheight p
height p+1
wheight pheight p+1
zcase 3:y
zcase 4:y
height py
height p
y
wwheight p
height p+1
wheight p
height p+1
Analysis After Insertionzcase 1:
yz
y
wheight p
height pheight p
y
h i ht
xg p
R, height p+1
Let R be the name of the right subtree of yR t i hi h i i t l d
height p
one has height p another height p 1R contains w, which is an internal node
Therefore, R has at least one internal nodeLet x be the root of RThere are 2 cases
another height p-1
There are 2 casesCase 1: x = w in which case p = 0 and both subtrees of w are leafsCase 2:
height of R went from p before insertion to p+1 after insertionwas balanced before insertion and is balanced after insertionx was balanced before insertion and is balanced after insertion
both subtrees of x had height p-1 before insertion After insertion, one subtree of x has height p, the other height p-1
Analysis After Insertion: 4 cases
zcase 1: zcase 2:
height pheight p
y yx x
height p height p
height p , height p-1 Height p , height p-1
zcase 3:y
zcase 4:y
height p
y
height p
y
height ph i h
x x
g pheight p
height p , height p-1 height p , height p-1
One More Pictureafter insertion
z z z
height pw
p+3height p
zy y
xheight pheight p
height p+2
height p+1
wheight pheight p , height p-1
height p
before insertionz
z zy
height p+1height p
p+2
height p
y yx
height pg p
height pheight p
height p-1
height p
height p-1
Rebalance, Finally! Case 1
z y
height p
yx
z xrebalance
height p
height p , height p-1
Tree is unbalanced at node z
left subtree of z has
Tree is balanced at node yleft subtree of y has height p+1left subtree of z has
height pright subtree of z has height p+2
p+1right subtree of y has height p+1
Tree is balanced at nodesheight p+2 Tree is balanced at nodes z, xThe binary search tree order is preserved
Rebalance, Case 2,3,4
zcase 4: case 3: zcase 2: height pzcase 4:
height p
yx
z
height p
y
x
g p
yx
height p
height p
h i ht h i ht 1
height p
height p
xheight p
height p , height p-1height p , height p-1
height p , height p-1
zy
x z yx
y zx
Trinode RestructuringAll 4 cases can be coded with the same algorithm, called trinoderestructuring
zcase 2: height prestructuringIn all 4 cases, out of 3 nodes x, y, z, we make
y
h h
x
the node with the middle key the new parentthe smallest key node as its left child
height p
height p , height p-1
the largest key node as its right child
for the “new parent” the old 1 or 2 subtrees not rooted at x, y, z need to be , y,put in the appropriate locations
The left subtree (if present) goes with the new left childTh i ht bt (if t) ith
y zx
21
The right subtree (if present) goes with the new right child
PseudoCode for Trinode RestructuringAlgorithm TriNodeRestructure(x,y,z)Input: node x, its parent y, its grandparent z. Node z is not balancedOutput: position of the node that goes in the place of z in the tree structure
if k ( ) k ( ) d k ( ) k ( ) h bif key(z) ≤ key(x) and key(x) ≤ key(y) then a = z; b = x; c = yif key(z) ≥ key(x) and key(x) ≥ key(y) then a = y; b = x; c = zif key(z) ≤ key(y) and key(y) ≤ key(x) then a = z; b = y; c = xif key(z) ≥ key(y) and key(y) ≥ key(x) then a = x; b = y; c = zy( ) y(y) y(y) y( ) y
if ( z = root ) thenroot = b; //In this case, root changes after triNodeRestructureb.parent = null
l // t t f t th d l ielse // reconnect parent of z to the node replacing zif z.Parent.leftChild = z then MakeLeftChild(z.Parent, b); else MakeRightChild(z.Parent, b);
if b LeftChild ≠ x and b LeftChild ≠ y and b LeftChild ≠ z thenif b.LeftChild ≠ x and b.LeftChild ≠ y and b.LeftChild ≠ z thenMakeRightChild(a, b.LeftChild);
if b.RightChild ≠ x and b.RightChild ≠ y and b.RightChild ≠ z then MakeLeftChild(c, b.RightChild);
22
MakeLeftChild(b, a);MakeRightChild(b,c);
return b
PseudoCode for Trinode Restructuring
On the previous slide, method MakeLeftChild(a b): makes node b the left child ofMakeLeftChild(a,b): makes node b the left child of node a. This involves 2 steps
a.leftchild = bb.parent = ab.parent a
MakeRightChild(a,b): makes node b the right child of node a. This involves 2 steps:p
a.rightchild = bb.parent = a
T i d t t i t k O(1)Trinode restructuring takes O(1)no loops, no recursive calls, just a constant number of comparisons and changes in parent-child relationships
23
Insertion and Trinode Restructuring
Only 1 trinode restructuring is necessary per insertion
after restructuring the height of thez
after restructuring, the height of the subtree formerly rooted at z (now rooted at x or y) is the same as before insertion, namely p+2, and the tree was balanced before insertion wwas balanced before insertion wThus after insertion, trinode restructuring restores height-balance order globally
z z x or y
p p+1p+2
w
pp+2p+3 p+2
p+1p+1
before insertion after insertion, but beforetrinode restructuring
after insertion andtrinode restructuring
PseudoCode for Insertion into AVL TreeAlgorithm AVLtreeInsert(k,o)Input: key k and value o; Output: node where the entry was inserted
w = TreeInsert(k o T root) // w holds position of new entry (k o)w = TreeInsert(k, o, T.root) // w holds position of new entry (k,o)// now need to check and if needed, restore height-balance propertyz = wwhile (z≠ null) // traverse up the tree checking for imbalancewhile (z ≠ null) // traverse up the tree, checking for imbalance
setHeight(z); // reset the height of z since it may have changedif |getHeight(z.left)-getHeight(z.right)| > 1 then
z=TriNodeRestructure(tallerChild(tallerChild(z)) tallerChild(z) z)z TriNodeRestructure(tallerChild(tallerChild(z)),tallerChild(z),z)setHeight(z.left); setHeight(z.right); setHeight(z);break; // exit while loop, tree is balanced after 1 trinodeRestructure
z = parent(z)p ( )return w
setHeight (z)= 1+max(z.left.height,z.right.height) tallerChild(node) gives the child with larger height Thus
25
tallerChild(node) gives the child with larger height. Thus y = tallerChild(z), and x = tallerChild(tallerChild(z))complexity of AVLtreeInsert(k,o) : O(log n)
Insertion Example, continued
44
17 782
5
4z
88
17 78
32 50
48 62
1
1
3
2
1
1
x
y
54
1
T0 T2
T3
444
unbalanced...T1 44
17
7832 50
622
1 2 2
3 x
y z
T1
88
7832 50
48
1 154
1
T2
...balanced
26T0 T1
2
T3
Insertion in AVL Tree: SummaryAfter inserting into binary tree at node w, we go up the tree, following the ancestor path from w, checking if any node on this path has become unbalanced
If an unbalanced node is found, perform tri-noderestructuring
Since after 1 such restructuring, the tree has become balanced and thus is an AVL tree, can return right after tri node restructuringtri-node restructuring
Travel up the tree is O(log n), and
27
tri-node restructuring is O(1)