cpsc 221: data structures dictionary adt binary search...

43
CPSC 221: Data Structures Dictionary ADT Binary Search Trees Alan J. Hu (Using Steve Wolfman’s Slides)

Upload: others

Post on 17-Jul-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

CPSC 221: Data StructuresDictionary ADT

Binary Search Trees

Alan J. Hu(Using Steve Wolfman’s Slides)

Page 2: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Learning Goals

After this unit, you should be able to...• Determine if a given tree is an instance of a particular type

(e.g. binary search tree, heap, etc.)• Describe and use pre-, in- and post-order traversal

algorithms• Describe the properties of binary trees, binary search trees,

and more general trees; Implement iterative and recursive algorithms for navigating them in C++

• Compare and contrast ordered versus unordered trees in terms of complexity and scope of application

• Insert and delete elements from a binary tree

Page 3: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Today’s Outline

• Binary Trees• Dictionary ADT• Binary Search Trees• Deletion• Some troubling questions

Page 4: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Binary Trees• Binary tree is

– an empty tree (NULL, in our case)– or, a root node with two subtrees

• Properties– max # of leaves: – max # of nodes:

• Representation:

A

B

D E

C

F

HG

JIDataright

pointerleft

pointer

recursive definition!

Page 5: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Binary Trees• Binary tree is

– an empty tree (NULL, in our case)– or, a root node with two subtrees

• Properties– max # of leaves: 2h

– max # of nodes: 2h+1-1

• Representation:

A

B

D E

C

F

HG

JIDataright

pointerleft

pointer

recursive definition!

Page 6: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

RepresentationA

right pointer

leftpointer

A

B

D E

C

F

Bright

pointerleft

pointer

Cright

pointerleft

pointer

Dright

pointerleft

pointer

Eright

pointerleft

pointer

Fright

pointerleft

pointer

struct Node {KTYPE key;DTYPE data;Node * left;Node * right;

};

Page 7: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Today’s Outline

• Binary Trees• Dictionary ADT• Binary Search Trees• Deletion• Some troubling questions

Page 8: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

What We Can Do So Far

• Stack– Push– Pop

• Queue– Enqueue– Dequeue

What’s wrong with Lists?

• List– Insert– Remove– Find

• Priority Queue– Insert– DeleteMin

Page 9: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Dictionary ADT• Dictionary operations

– create– destroy– insert– find– delete

• Stores values associated with user-specified keys– values may be any (homogenous) type– keys may be any (homogenous) comparable type

• midterm– would be tastier with

brownies• prog-project

– so painful… who invented templates?

• wolf– the perfect mix of oomph

and Scrabble value

insert

find(wolf)

• brownies- tasty

• wolf- the perfect mix of oomph

and Scrabble value

Page 10: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Search/Set ADT• Dictionary operations

– create– destroy– insert– find– delete

• Stores keys– keys may be any (homogenous) comparable– quickly tests for membership

• Berner• Whippet• Alsatian• Sarplaninac• Beardie• Sarloos• Malamute• Poodle

insert

find(Wolf)

• Min Pin

NOT FOUND

Page 11: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

A Modest Few Uses

• Arrays and “Associative” Arrays• Sets• Dictionaries• Router tables• Page tables• Symbol tables• C++ Structures• Python’s __dict__ that stores fields/methods

Page 12: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Desiderata

• Fast insertion– runtime:

• Fast searching– runtime:

• Fast deletion– runtime:

Page 13: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Naïve Implementations

• Linked list

• Unsorted array

• Sorted array

insert deletefind

Page 14: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Naïve Implementations

• Linked list

• Unsorted array

• Sorted array

insert deletefind

so close!

Page 15: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Today’s Outline

• Binary Trees• Dictionary ADT• Binary Search Trees• Deletion• Some troubling questions

Page 16: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Binary Search Tree Dictionary Data Structure

4

121062

115

8

14

13

7 9

• Binary tree property– each node has 2 children– result:

• storage is small• operations are simple• average depth is small

• Search tree property– all keys in left subtree

smaller than root’s key– all keys in right subtree

larger than root’s key– result:

• easy to find any given key

Page 17: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Example and Counter-Example

3

1171

84

5

4

181062

115

8

20

21BINARY SEARCH TREE NOT ABINARY SEARCH TREE

7

15

Page 18: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

In Order Listing

2092

155

10

307 17

In order listing:25791015172030

struct Node {// constructors omittedKTYPE key;DTYPE data;Node *left, *right;

};

Page 19: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Aside: Traversals

• Pre-Order Traversal: Process the data at the node first, then process left child, then process right child.

• Post-Order Traversal: Process left child, then process right child, then process data at the node.

• In-Order Traversal: Process left child, then process data at the node, then process right child.

Code?19

Page 20: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Aside: Traversals

• Pre-Order Traversal: Process the data at the node first, then process left child, then process right child.

• Post-Order Traversal: Process left child, then process right child, then process data at the node.

• In-Order Traversal: Process left child, then process data at the node, then process right child.

Who cares? These are the most common ways in which code processes trees.

20

Page 21: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Finding a NodeNode *& find(Comparable key,

Node *& root) {if (root == NULL)return root;

else if (key < root->key)return find(key,

root->left);else if (key > root->key)return find(key,

root->right);elsereturn root;

}

2092

155

10

307 17

runtime:

a. O(1)b. O(lg n)c. O(n)d. O(n lg n)e. None of these

Page 22: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Finding a NodeNode *& find(Comparable key,

Node *& root) {if (root == NULL)return root;

else if (key < root->key)return find(key,

root->left);else if (key > root->key)return find(key,

root->right);elsereturn root;

}

2092

155

10

307 17

WARNING: Much fancy footwork with refs (&) coming. You can do all of this without refs... just watch out for special cases.

Page 23: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Iterative FindNode * find(Comparable key,

Node * root) {while (root != NULL &&

root->key != key) {if (key < root->key)root = root->left;

else root = root->right;

}

return root;}

Look familiar?

2092

155

10

307 17

(It’s trickier to get the ref return to work here. We won’t worry.)

Page 24: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Insert

2092

155

10

307 17

runtime:

void insert(Comparable key,Node *& root) {

Node *& target(find(key,root));

assert(target == NULL);

target = new Node(key);}

Funky game we can play with the *& version.

Page 25: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Reminder: Value vs. Reference Parameters

• Value parameters (Object foo)– copies parameter– no side effects

• Reference parameters (Object & foo)– shares parameter– can affect actual value– use when the value needs to be changed

• Const reference parameters (const Object & foo)– shares parameter– cannot affect actual value– use when the value is too intricate for pass-by-value

Page 26: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

BuildTree for BSTs

• Suppose the data 1, 2, 3, 4, 5, 6, 7, 8, 9 is inserted into an initially empty BST:– in order

– in reverse order

– median first, then left median, right median, etc.

Page 27: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Analysis of BuildTree

• Worst case: O(n2) as we’ve seen• Average case assuming all orderings equally likely

turns out to be O(n lg n).

Page 28: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Bonus: FindMin/FindMax

• Find minimum

• Find maximum 2092

155

10

307 17

Page 29: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Double Bonus: SuccessorFind the next larger nodein this node’s subtree.

// Note: If no succ, returns (a useful) NULL.Node *& succ(Node *& root) {if (root->right == NULL)

return root->right;else

return min(root->right);}

Node *& min(Node *& root) {if (root->left == NULL) return root;else return min(root->left);

}

2092

155

10

307 17

Page 30: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

More Double Bonus: PredecessorFind the next smaller nodein this node’s subtree.

Node *& pred(Node *& root) {if (root->left == NULL)

return root->left;else

return max(root->left);}

Node *& max(Node *& root) {if (root->right == NULL) return root;else return max(root->right);

}

2092

155

10

307 17

Page 31: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Today’s Outline

• Some Tree Review (here for reference, not discussed)

• Binary Trees• Dictionary ADT• Binary Search Trees• Deletion• Some troubling questions

Page 32: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Deletion

2092

155

10

307 17

Why might deletion be harder than insertion?

Page 33: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Lazy Deletion (“Tombstones”)

• Instead of physically deleting nodes, just mark them as deleted+ simpler+ physical deletions done in batches+ some adds just flip deleted flag– extra memory for “tombstone”– many lazy deletions slow finds– some operations may have to be

modified (e.g., min and max)

2092

155

10

307 17

Page 34: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Lazy Deletion

2092

155

10

307 17

Delete(17)

Delete(15)

Delete(5)

Find(9)

Find(16)

Insert(5)

Find(17)

Page 35: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Real Deletion - Leaf Case

2092

155

10

307 17

Delete(17)

Page 36: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Real Deletion - One Child Case

2092

155

10

307

Delete(15)

Page 37: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Real Deletion - Two Child Case

3092

205

10

7

Delete(5)

Page 38: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Finally…

3092

207

10

Page 39: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Delete Codevoid delete(Comparable key, Node *& root) {Node *& handle(find(key, root));Node * toDelete = handle;if (handle != NULL) {if (handle->left == NULL) { // Leaf or one childhandle = handle->right;

} else if (handle->right == NULL) { // One childhandle = handle->left;

} else { // Two child caseNode *& successor(succ(handle));handle->data = successor->data;toDelete = successor;successor = successor->right; // Succ has <= 1 child

}}delete toDelete;

}Refs make this short and “elegant”…

but could be done without them with a bit more work.

Page 40: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Today’s Outline

• Binary Trees• Dictionary ADT• Binary Search Trees• Deletion• Some troubling questions

Page 41: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Thinking about Binary Search Trees

• Observations– Each operation views two new elements at a time– Elements (even siblings) may be scattered in memory– Binary search trees are fast if they’re shallow

• Realities– For large data sets, disk accesses dominate runtime– Some deep and some shallow BSTs exist for any data

One more piece of bad news: what happens to a balanced tree after many insertions/deletions?

Page 42: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Solutions?

• Reduce disk accesses?

• Keep BSTs shallow?

Page 43: CPSC 221: Data Structures Dictionary ADT Binary Search Treeshkhosrav/Courses/CPSC221/handouts/hu/sli… · Learning Goals After this unit, you should be able to... • Determine if

Coming Up

• Self-balancing Binary Search Trees• Huge Search Tree Data Structure