
Trees

CS-212

Dick Steflik

What is a Tree

• A tree is a finite set of one or more nodes such that:
  – There is a specially designated node called the root
  – The remaining nodes are partitioned into n ≥ 0 disjoint sets T1,..,Tn, where each of these sets is a tree. T1,..,Tn are the subtrees of the root.

Examples

Tree Terms

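[Figure: a tree with root A, whose children are B and C; the lower levels hold D, E, F and then G, H. Callouts mark the root, the height (h), the leaf (exterior) nodes, and the interior nodes.]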

A is the parent of B and C

B and C are siblings

B and C are children of A

height - the number of nodes in the longest path going from the root to the furthest leaf

parent - the node at the next higher level in the tree to which a given node is directly linked

child - any node at the next lower level in the tree that is directly linked to a given node

siblings - any nodes in the tree having a common parent

order - the number of children in the node having the largest number of children

binary tree - any order 2 tree

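binary search tree - any binary tree having the search tree property: every key in a node's left subtree is less than the node's key, and every key in its right subtree is greater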

Things represented as trees

• Table of Contents

• Subassembly diagrams

• Genealogy diagrams

• Pedigree diagrams

• Tournament playoffs

• Graphics representation

• Organizational charts

Nodal Structure Options

If we know the maximum order that an arbitrary tree is supposed to be, we could allocate our data content and a child pointer for each possible child.

Ex. suppose max. order = 5; each node would look like:

| Data | child 1 | child 2 | child 3 | child 4 | child 5 |

If our tree has many nodes that have fewer than 5 children, this representation could be very wasteful, considering that each child pointer requires 4 bytes of storage (on a 32-bit machine; 8 bytes on a typical 64-bit machine).
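A minimal sketch of this fixed-size layout in C, assuming a maximum order of 5 and an int payload. The names MAX_ORDER, FixedNode and wasted_slots are illustrative and do not come from the slides; the function simply counts how many reserved child pointers a node leaves unused.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ORDER 5   /* assumed maximum number of children per node */

/* Every node reserves MAX_ORDER child pointers, whether used or not. */
typedef struct FixedNode {
    int data;
    struct FixedNode *child[MAX_ORDER];
} FixedNode;

/* Count the child slots of one node that hold NULL (reserved but unused). */
int wasted_slots(const FixedNode *n) {
    assert(n != NULL);
    int wasted = 0;
    for (int i = 0; i < MAX_ORDER; i++)
        if (n->child[i] == NULL)
            wasted++;
    return wasted;
}
```

A leaf wastes all five slots, so the overhead per leaf is wasted_slots(leaf) * sizeof(FixedNode *) bytes.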

Is there a better, less wasteful representation?

As it turns out, YES there is

The lowly order 2 (binary) tree can be used to represent any order n tree, and we can make the statement that:

For any general tree, there is an equivalent binary tree

To do this we must visualize an order 2 tree differently: instead of a collection of parents and children, we view it as a parent, its leftmost child, and that child's siblings.

Instead of this:

      A
    / | \
   B  C  D

This:

    A
   /
  B
   \
    C
     \
      D
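The leftmost-child / sibling view can be sketched in C as follows. This is an illustration under assumptions, not the slides' code: the names LcrsNode, lcrs_new, lcrs_add_child and lcrs_child_count are hypothetical, and the payload is a single char. The node is an ordinary binary node; only the meaning of the two links changes.

```c
#include <assert.h>
#include <stdlib.h>

/* A binary node reinterpreted: 'left' is the leftmost child,
   'right' is the next sibling. */
typedef struct LcrsNode {
    char data;
    struct LcrsNode *left;    /* leftmost child */
    struct LcrsNode *right;   /* next sibling   */
} LcrsNode;

LcrsNode *lcrs_new(char data) {
    LcrsNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    return n;
}

/* Append 'child' as the last child of 'parent'. */
void lcrs_add_child(LcrsNode *parent, LcrsNode *child) {
    if (parent->left == NULL) {
        parent->left = child;          /* first child */
        return;
    }
    LcrsNode *s = parent->left;
    while (s->right != NULL)           /* walk to the last sibling */
        s = s->right;
    s->right = child;
}

/* The number of children is the length of the sibling chain. */
int lcrs_child_count(const LcrsNode *parent) {
    int count = 0;
    for (const LcrsNode *c = parent->left; c != NULL; c = c->right)
        count++;
    return count;
}
```

Each node costs two pointers no matter how many children it has, which is the saving over the fixed child-pointer layout.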

Why do we want to do this?

It turns out that order 2 trees have a very nice structure in that there are only two choices to make: Right or Left. This makes it easier to design algorithms for them.

To explore this, let us look at the problem of creating an algorithm for visiting every node in a tree in some predictable order. This problem is called Traversal and can be accomplished with the following algorithm.

1. start at the root

2. follow all of the left links until you can't go any farther

3. back up one node and try going right; if you can, repeat steps 2 and 3; if you can't, repeat step 3 (stop once you have backed up past the root)

Node Structure (Static)

/* Link is presumably an index into the Nodes array rather than a pointer */
typedef struct {
    Element data;
    Link left;
    Link right;
} Node;

typedef struct {
    int numFree;
    int numInTree;
    Link free;
    Link root;
    Node Nodes[NUMNODES];
} tree;

Node Structure (Dynamic)

typedef struct Node * Tree;

struct Node {
    Element data;
    Tree left;
    Tree right;
};

Static or Dynamic ?

• Static
  – you know the maximum number of nodes
    • not likely to change

• Dynamic
  – size of the tree will change frequently
  – slightly faster than static
    • traversing pointers is faster than calculating addresses for array indexing

Traversal

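[Figure: a tree with root A, left child B and right child C. The traversal path passes each node three times; the passes are numbered 1, 2 and 3 at every node.]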

Notice that each node is visited 3 times

Were we to print out the node data on the first visit to each node, the printout would be: ABC

Were we to print out the node data on the second visit to each node, the printout would be: BAC

Were we to print out the node data on the third visit to each node, the printout would be: BCA

These are called the preorder, inorder and postorder traversals, respectively.
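The three visits can be made explicit with a single recursive walk that records the node data at each of the three points. This is a sketch under assumptions: the names TNode, Visits, walk and MAX_NODES are illustrative, the payload is a char, and the output buffers have a fixed assumed capacity.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16   /* assumed capacity of each output buffer */

typedef struct TNode {
    char data;
    struct TNode *left;
    struct TNode *right;
} TNode;

/* One buffer per visit; n1..n3 count how many entries each one holds. */
typedef struct {
    char first[MAX_NODES];
    char second[MAX_NODES];
    char third[MAX_NODES];
    int n1, n2, n3;
} Visits;

void walk(const TNode *t, Visits *out) {
    if (t == NULL)
        return;
    assert(out->n1 < MAX_NODES - 1);
    out->first[out->n1++] = t->data;    /* visit 1: before the left subtree   */
    walk(t->left, out);
    out->second[out->n2++] = t->data;   /* visit 2: between the two subtrees  */
    walk(t->right, out);
    out->third[out->n3++] = t->data;    /* visit 3: after the right subtree   */
}
```

For the three-node tree with root A, left child B and right child C, the three buffers end up holding ABC, BAC and BCA.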

Pre-order Traversal

void preorder( tnode * t) {
    if (t != NULL) {
        printf(" %d ", t -> data);
        preorder(t -> left);
        preorder(t -> right);
    }
}

In-order Traversal

void inorder( tnode * t)
{
    if (t != NULL)
    {
        inorder(t -> left);
        printf(" %d ", t -> data);
        inorder(t -> right);
    }
}

Post-order Traversal

void postorder( tnode * t) {
    if (t != NULL) {
        postorder(t -> left);
        postorder(t -> right);
        printf(" %d ", t -> data);
    }
}

Notice that...

the first node visited on the pre-order traversal is always the root

the left-most node in the tree is the first node on the inorder traversal

the last node visited on the inorder traversal is the rightmost node in the tree

the last node visited on the postorder traversal is the root

Knowing this, and given the inorder traversal together with either the preorder or the postorder traversal (and assuming no duplicate keys), the tree can be constructed.

Armed with this information…

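• We should be able to construct the tree that produced a pair of traversal paths, provided one of them is the inorder traversal (preorder and postorder alone do not always determine a unique tree)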
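One way to carry out the reconstruction, sketched here for the preorder plus inorder pair and assuming distinct single-character keys. The names RNode, rebuild and postorder_into are hypothetical. The idea uses the observations above: the first preorder key is the root, and its position in the inorder sequence splits the remaining keys into the left and right subtrees.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct RNode {
    char data;
    struct RNode *left;
    struct RNode *right;
} RNode;

/* Rebuild a tree from the preorder and inorder sequences of its n nodes.
   Keys are assumed to be distinct. */
RNode *rebuild(const char *pre, const char *in, int n) {
    if (n <= 0)
        return NULL;
    RNode *root = malloc(sizeof *root);
    assert(root != NULL);
    root->data = pre[0];               /* first preorder key is the root */

    int k = 0;                         /* position of the root in inorder */
    while (k < n && in[k] != pre[0])
        k++;
    assert(k < n);                     /* the two sequences must agree */

    /* k keys lie left of the root, n - k - 1 keys lie right of it */
    root->left  = rebuild(pre + 1,     in,         k);
    root->right = rebuild(pre + 1 + k, in + k + 1, n - k - 1);
    return root;
}

/* Append the postorder sequence of t to out; returns the new length. */
int postorder_into(const RNode *t, char *out, int len) {
    if (t == NULL)
        return len;
    len = postorder_into(t->left, out, len);
    len = postorder_into(t->right, out, len);
    out[len] = t->data;
    return len + 1;
}
```

Rebuilding from preorder ABC and inorder BAC and then walking the result in postorder gives BCA, matching the traversal example.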

Insertion

Tree insert(Tree p, int v) {
    if (p == NULL) {                    /* empty spot found: create the node here */
        p = (Tree) malloc(sizeof(struct Node));
        p->data = v;
        p->right = NULL;
        p->left = NULL;
    }
    else if (v < p->data)
        p->left = insert(p->left, v);
    else if (v > p->data)
        p->right = insert(p->right, v);
    /* v == p->data: key already present, nothing to do */
    return p;
}

Search

Tree search(Tree p, int v) {
    if (p == NULL)
        return NULL;                    /* ran off the tree: not found */
    if (v == p->data) {
        printf("Found\n");
        return p;
    }
    if (v < p->data)
        return search(p->left, v);
    else
        return search(p->right, v);
}

• Note:
  – If the tree is being used to represent a set, the search function would be better named isMember( )

Priority Queue

• BST could be used as a Priority Queue
  – Needs functions: insert( ) findMin( ) removeMin( ) or findMax( ) removeMax( )

• Node with minimum priority is leftmost node

• Node with maximum priority is rightmost node
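A possible shape for findMin( ) and removeMin( ) on the pointer-based tree, given as a sketch rather than the course's implementation. It assumes int keys; the signature of removeMin (returning the new subtree root and passing the key back through a pointer) is a choice made for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct Node { int data; struct Node *left; struct Node *right; };
typedef struct Node *Tree;

/* BST insertion as on the Insertion slide (duplicates are ignored). */
Tree insert(Tree p, int v) {
    if (p == NULL) {
        p = malloc(sizeof *p);
        assert(p != NULL);
        p->data = v;
        p->left = NULL;
        p->right = NULL;
    } else if (v < p->data) {
        p->left = insert(p->left, v);
    } else if (v > p->data) {
        p->right = insert(p->right, v);
    }
    return p;
}

/* The leftmost node holds the minimum key; NULL for an empty tree. */
Tree findMin(Tree p) {
    if (p == NULL)
        return NULL;
    while (p->left != NULL)
        p = p->left;
    return p;
}

/* Remove the leftmost node and return the new root of the subtree.
   The removed key is stored in *out. The tree must not be empty. */
Tree removeMin(Tree p, int *out) {
    assert(p != NULL);
    if (p->left == NULL) {             /* p itself is the leftmost node */
        Tree right = p->right;
        *out = p->data;
        free(p);
        return right;                  /* its right subtree takes its place */
    }
    p->left = removeMin(p->left, out);
    return p;
}
```

findMax( ) and removeMax( ) are the mirror images, following right links instead of left links.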

Deletion

• Three cases we need to consider:
  – v has no children
    • Remove v
  – v has one child
    • Replace v with its child (link v's parent directly to that child)
  – v has 2 children
    • Swap v with its successor (the leftmost node in the right subtree)
    • Do case 1 or case 2 to delete

Deletion

[Figure: example binary search tree for deletion, holding the keys 20, 10, 5, 15, 12, 25 and 30.]
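The three deletion cases might be coded as below. This is a sketch with assumptions: int keys, a hypothetical function name removeKey, and the two-children case handled by copying the successor's key into the node and then deleting the successor from the right subtree (which then falls under the zero-or-one-child case).

```c
#include <assert.h>
#include <stdlib.h>

struct Node { int data; struct Node *left; struct Node *right; };
typedef struct Node *Tree;

/* BST insertion as on the Insertion slide (duplicates are ignored). */
Tree insert(Tree p, int v) {
    if (p == NULL) {
        p = malloc(sizeof *p);
        assert(p != NULL);
        p->data = v;
        p->left = NULL;
        p->right = NULL;
    } else if (v < p->data) {
        p->left = insert(p->left, v);
    } else if (v > p->data) {
        p->right = insert(p->right, v);
    }
    return p;
}

/* Delete key v from the subtree rooted at p; returns the new subtree root. */
Tree removeKey(Tree p, int v) {
    if (p == NULL)
        return NULL;                               /* key not present */
    if (v < p->data) {
        p->left = removeKey(p->left, v);
    } else if (v > p->data) {
        p->right = removeKey(p->right, v);
    } else if (p->left != NULL && p->right != NULL) {
        /* two children: take the successor's key, then delete the successor */
        Tree s = p->right;
        while (s->left != NULL)
            s = s->left;                           /* leftmost node in right subtree */
        p->data = s->data;
        p->right = removeKey(p->right, s->data);
    } else {
        /* zero or one child: the child (or NULL) takes this node's place */
        Tree child = (p->left != NULL) ? p->left : p->right;
        free(p);
        return child;
    }
    return p;
}
```

Inserting the keys 20, 10, 5, 15, 12, 25, 30 in that order (an assumed insertion order for the example keys) gives a tree that exercises all three cases: 5 is a leaf, 15 has one child, and 20 has two.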

Heaps

• The heap property:
  – max heap: the largest key is at the root; at each node the keys of the children must be less than the key of the parent
  – min heap: the smallest key is at the root; the keys of the children must be greater than the key of the parent
  – used as a priority queue, and as the basis for heap sort

Visualize a one dimensional array as a tree as follows (the array contains the keys)

index:  0  1  2  3  4  5  6

          0
        /   \
       1     2
      / \   / \
     3   4 5   6

To Find the child or parent

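• iL = (2*iP)+1   (index of the left child of the node at index iP)

• iR = (2*iP)+2   (index of the right child of the node at index iP)

• iP = (iC-1)/2   (index of the parent of the node at index iC, using integer division)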
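These index formulas translate directly into small C helpers. The function names are illustrative; the arithmetic is exactly the three formulas above, relying on C's integer division to round down.

```c
#include <assert.h>

/* Index of the left child of the node stored at index iP. */
int leftChild(int iP)  { return 2 * iP + 1; }

/* Index of the right child of the node stored at index iP. */
int rightChild(int iP) { return 2 * iP + 2; }

/* Index of the parent of the node stored at index iC. */
int parent(int iC) {
    assert(iC > 0);          /* the root (index 0) has no parent */
    return (iC - 1) / 2;     /* integer division discards the remainder */
}
```

For the seven-element array, the children of index 2 are 5 and 6, and both of them map back to parent 2.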

Adding a Key

• Tree is complete (i.e. it fills up in ascending index positions)

• Must keep track of where end of tree currently is

• Insert new key at end of tree, then bubble it up to its proper location by comparing to the parent's key and swapping if necessary

• This gives O(log2 n) for each add.
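The add operation could look like this for a min heap stored in a fixed-size array. It is a sketch under assumptions: the names Heap, HEAP_CAP and heapAdd are hypothetical, keys are ints, and the size field doubles as the "end of tree" marker.

```c
#include <assert.h>

#define HEAP_CAP 64    /* assumed fixed capacity */

typedef struct {
    int keys[HEAP_CAP];
    int size;          /* number of keys; also the index of the end of the tree */
} Heap;

/* Add a key to a min heap: place it at the end, then bubble it up. */
void heapAdd(Heap *h, int key) {
    assert(h->size < HEAP_CAP);
    int i = h->size++;
    h->keys[i] = key;
    while (i > 0) {
        int p = (i - 1) / 2;               /* parent index */
        if (h->keys[p] <= h->keys[i])
            break;                         /* heap property holds: stop */
        int tmp = h->keys[p];              /* otherwise swap with the parent */
        h->keys[p] = h->keys[i];
        h->keys[i] = tmp;
        i = p;
    }
}
```

The loop moves up one level per iteration, and a complete tree with n keys has about log2 n levels, which is where the O(log2 n) bound comes from.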

Deleting a key

• Swap the last key with the root and remove the last key

• Recursively push the root down to its proper level by comparing it to its children and swapping with the smallest child (for a min heap; the largest child for a max heap)

• This gives O(log2 n) for each deletion
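A matching sketch of deletion from a min heap, again with hypothetical names (Heap, HEAP_CAP, pushDown, heapRemoveMin) and int keys. The last key is moved into the root slot and then pushed down recursively.

```c
#include <assert.h>

#define HEAP_CAP 64    /* assumed fixed capacity */

typedef struct {
    int keys[HEAP_CAP];
    int size;          /* number of keys currently in the heap */
} Heap;

/* Push the key at index i down until neither child is smaller. */
void pushDown(Heap *h, int i) {
    int l = 2 * i + 1;
    int r = 2 * i + 2;
    int smallest = i;
    if (l < h->size && h->keys[l] < h->keys[smallest])
        smallest = l;
    if (r < h->size && h->keys[r] < h->keys[smallest])
        smallest = r;
    if (smallest != i) {
        int tmp = h->keys[i];              /* swap with the smallest child */
        h->keys[i] = h->keys[smallest];
        h->keys[smallest] = tmp;
        pushDown(h, smallest);             /* continue one level lower */
    }
}

/* Remove and return the minimum key (the root) of a non-empty min heap. */
int heapRemoveMin(Heap *h) {
    assert(h->size > 0);
    int min = h->keys[0];
    h->keys[0] = h->keys[--h->size];       /* last key replaces the root */
    pushDown(h, 0);
    return min;
}
```

Calling heapRemoveMin repeatedly returns the keys in ascending order, which is the idea behind heap sort.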

Ensuring O(log2 n) performance

• The main problem with BST is that it's hard to ensure optimum performance.
  – shape of the tree is very dependent on the order the keys were inserted in
    • trees tend to degenerate to varying degrees
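The dependence on insertion order is easy to demonstrate. The sketch below assumes int keys and adds a hypothetical height( ) helper that uses the earlier definition of height (number of nodes on the longest path from the root to a leaf).

```c
#include <assert.h>
#include <stdlib.h>

struct Node { int data; struct Node *left; struct Node *right; };
typedef struct Node *Tree;

/* BST insertion as on the Insertion slide (duplicates are ignored). */
Tree insert(Tree p, int v) {
    if (p == NULL) {
        p = malloc(sizeof *p);
        assert(p != NULL);
        p->data = v;
        p->left = NULL;
        p->right = NULL;
    } else if (v < p->data) {
        p->left = insert(p->left, v);
    } else if (v > p->data) {
        p->right = insert(p->right, v);
    }
    return p;
}

/* Number of nodes on the longest path from p down to a leaf. */
int height(Tree p) {
    if (p == NULL)
        return 0;
    int hl = height(p->left);
    int hr = height(p->right);
    return 1 + (hl > hr ? hl : hr);
}
```

Inserting 1, 2, 3, 4, 5, 6, 7 in sorted order produces a chain of height 7 (effectively a linked list), while inserting the same keys as 4, 2, 6, 1, 3, 5, 7 produces a tree of height 3.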

The solution - Self-balancing trees

• AVL Trees

• Red/Black Trees

• 2-3 trees

• 2-3-4 trees

• B-Trees

AVL Trees

• Height balanced
  – at every node the allowable difference in height between the right and left subtree is one
  – an additional piece of data is required
    • balance factor
      – value = 0 : right and left subtrees are same height
      – value = 1 : right subtree is one higher
      – value = -1 : left subtree is one higher
      – value = 2 or -2 : tree needs to be rebalanced

AVL Method

• Insertion is pretty much like a BST recursive insertion, but on the way out (unwinding the recursion) check and adjust the balance factors; if a difference of 2 or -2 is found, rebalance the tree by doing an RR, LL, RL or LR rotation

Out of Balance Conditions

• insertion into left subtree of left child of x: called LL (mirror image of RR); fix with a rotateRight

• insertion into right subtree of right child of x: called RR (mirror image of LL); fix with a rotateLeft

• insertion into right subtree of left child of x: called LR (mirror image of RL); fix with a rotateLeft (on the left child), then a rotateRight (on x)

• insertion into left subtree of right child of x: called RL (mirror image of LR); fix with a rotateRight (on the right child), then a rotateLeft (on x)
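Putting the method and the four cases together, an AVL insertion might be sketched as follows. Several things here are assumptions for the example: int keys, the names AvlNode, avlInsert, update and balance, and storing each node's height rather than the balance factor itself (the balance factor is computed as right height minus left height, matching the sign convention above).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct AvlNode {
    int data;
    int height;                        /* nodes on the longest path down from here */
    struct AvlNode *left;
    struct AvlNode *right;
} AvlNode;

int height(const AvlNode *n) { return n ? n->height : 0; }

void update(AvlNode *n) {
    int hl = height(n->left), hr = height(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

/* Positive means the right subtree is higher, negative the left. */
int balance(const AvlNode *n) { return height(n->right) - height(n->left); }

/* Promote the left child over x (used for LL). */
AvlNode *rotateRight(AvlNode *x) {
    AvlNode *l = x->left;
    x->left = l->right;
    l->right = x;
    update(x);
    update(l);
    return l;
}

/* Promote the right child over x (used for RR). */
AvlNode *rotateLeft(AvlNode *x) {
    AvlNode *r = x->right;
    x->right = r->left;
    r->left = x;
    update(x);
    update(r);
    return r;
}

AvlNode *avlInsert(AvlNode *p, int v) {
    if (p == NULL) {
        p = malloc(sizeof *p);
        assert(p != NULL);
        p->data = v;
        p->height = 1;
        p->left = NULL;
        p->right = NULL;
        return p;
    }
    if (v < p->data)
        p->left = avlInsert(p->left, v);
    else if (v > p->data)
        p->right = avlInsert(p->right, v);
    else
        return p;                              /* duplicate: nothing to do */

    update(p);                                 /* on the way out of the recursion */
    if (balance(p) == -2) {                    /* left side too high */
        if (v > p->left->data)                 /* LR: right subtree of left child */
            p->left = rotateLeft(p->left);
        return rotateRight(p);                 /* LL, or second half of LR */
    }
    if (balance(p) == 2) {                     /* right side too high */
        if (v < p->right->data)                /* RL: left subtree of right child */
            p->right = rotateRight(p->right);
        return rotateLeft(p);                  /* RR, or second half of RL */
    }
    return p;
}
```

With this sketch, inserting 1 through 7 in sorted order yields a tree of height 3 rooted at 4, rather than the height-7 chain a plain BST would produce.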

LL

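[Figure: LL case. Before the rotation, k2 is the root with left child k1; A and B are k1's subtrees and C is k2's right subtree. After the rotation, k1 is the root with right child k2; A is k1's left subtree, and B and C are k2's subtrees.]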

/* Fixes the LL case: promotes the left child k1 over k2 (a rotateRight) */
Node * rotateRight(Node * k2) {
    Node * k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    return k1;
}

LR

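[Figure: LR case. Before: k3 is the root with left child k1, and k1's right child is k2; A is k1's left subtree, B and C are k2's subtrees, and D is k3's right subtree. After the double rotation: k2 is the root with children k1 (subtrees A and B) and k3 (subtrees C and D).]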

/* Fixes the LR case: rotate the left child left, then rotate k3 right */
Node * DoubleRotateLeft(Node * k3) {
    k3->left = rotateLeft(k3->left);
    return rotateRight(k3);
}

[Figure: LR rotation, continued: nodes k1, k2 and k3 with subtrees A, B, C and D.]
