C++ Programming: Program Design Including Data Structures, Fourth Edition Chapter 19: Searching and Sorting Algorithms

Upload: marah-avery

Post on 31-Dec-2015

Objectives

In this chapter, you will:

• Learn the various search algorithms

• Explore how to implement the sequential and binary search algorithms

• Discover how the sequential and binary search algorithms perform

• Become aware of the lower bound on comparison-based search algorithms

Objectives (continued)

• Learn the various sorting algorithms

• Explore how to implement the bubble, selection, insertion, quick, and merge sorting algorithms

• Discover how the sorting algorithms discussed in this chapter perform

Searching and Sorting Algorithms

• Searching is the most important operation that can be performed on a list

• Using a search algorithm, you can:

− Determine whether a particular item is in the list

− If the data is specially organized (for example, sorted), find the location in the list where a new item can be inserted

− Find the location of an item to be deleted

Searching and Sorting Algorithms (continued)

• Because searching and sorting require comparisons of data, the algorithms should work on any type of data that provides appropriate functions to compare data items

• Data can be organized with the help of an array or a linked list:

− unorderedLinkedList

− unorderedArrayListType

Search Algorithms

• Associated with each item in a data set is a special member that uniquely identifies the item in the data set

− Called the key of the item

• Key comparison: comparing the key of the search item with the key of an item in the list

− Can be counted: number of key comparisons

Sequential Search
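
A minimal sketch of a sequential search on an array-based list. It assumes std::vector stands in for the book's list class; the function name seqSearch and the convention of returning -1 for "not found" are illustrative choices, not taken from the slides.

```cpp
#include <vector>

// Returns the index of the first element equal to item, or -1 if the
// item is not in the list. Each loop iteration makes one key comparison.
template <class T>
int seqSearch(const std::vector<T>& list, const T& item)
{
    const int length = static_cast<int>(list.size());
    for (int loc = 0; loc < length; ++loc)
        if (list[loc] == item)      // key comparison
            return loc;
    return -1;                      // searched the whole list without success
}
```

The loop is the only part whose cost grows with the list, which is why the analysis counts key comparisons rather than statements.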

Sequential Search Analysis

• The statements before and after the loop are executed only once, and hence require very little computer time

• The statements in the for loop are the ones that are repeated several times

− Execution of the other statements in the loop is directly related to the outcome of the key comparison

• Speed of a computer does not affect the number of key comparisons required

Sequential Search Analysis (continued)

• L: a list of length n

• If the search item is not in the list: n comparisons

• If the search item is in the list:

− If the search item is the first element of L: one key comparison (best case)

− If the search item is the last element of L: n comparisons (worst case)

− Average number of comparisons, assuming each element is equally likely to be the search item: $(1 + 2 + \dots + n)/n = (n + 1)/2$

Binary Search

• Binary search can be applied only to sorted lists

• Uses the “divide and conquer” technique

− Compare the search item to the middle element

− If the two are equal, the search ends

− If the search item is less than the middle element, restrict the search to the lower half of the list

− Otherwise, search the upper half of the list
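
The steps above can be sketched in C++ as follows. This is one possible rendering, assuming a sorted std::vector in place of the book's ordered list class; the name binarySearch and the -1 return value for an unsuccessful search are assumptions.

```cpp
#include <vector>

// Searches a sorted list for item; returns its index, or -1 if absent.
template <class T>
int binarySearch(const std::vector<T>& list, const T& item)
{
    int first = 0;
    int last = static_cast<int>(list.size()) - 1;

    while (first <= last)
    {
        int mid = first + (last - first) / 2;   // middle of the current sublist
        if (list[mid] == item)
            return mid;                         // found
        else if (item < list[mid])
            last = mid - 1;                     // continue in the lower half
        else
            first = mid + 1;                    // continue in the upper half
    }
    return -1;                                  // sublist became empty
}
```

Computing mid as first + (last - first) / 2 rather than (first + last) / 2 avoids integer overflow on very large lists.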

Performance of Binary Search

• Every iteration cuts the size of the search list in half

• If list L has 1000 items:

− Because 1000 < 1024 = $2^{10}$, at most 11 iterations are needed to find x

• Every iteration makes at most two key comparisons

− In this case, at most 22 key comparisons

• Sequential search would make about 500 key comparisons (on average) if x is in L
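
These counts can be checked empirically. The sketch below instruments both searches to return the number of key comparisons they make; the function names and the use of std::vector<int> are assumptions made for illustration.

```cpp
#include <vector>

// Number of key comparisons a sequential search makes while looking for item.
int seqComparisons(const std::vector<int>& list, int item)
{
    int count = 0;
    for (int x : list)
    {
        ++count;                    // one comparison: x == item
        if (x == item)
            break;
    }
    return count;
}

// Number of key comparisons a binary search makes on a sorted list.
int binComparisons(const std::vector<int>& list, int item)
{
    int count = 0;
    int first = 0;
    int last = static_cast<int>(list.size()) - 1;
    while (first <= last)
    {
        int mid = first + (last - first) / 2;
        ++count;                    // first comparison: list[mid] == item
        if (list[mid] == item)
            break;
        ++count;                    // second comparison: item < list[mid]
        if (item < list[mid])
            last = mid - 1;
        else
            first = mid + 1;
    }
    return count;
}
```

Summing seqComparisons over every item of a sorted 1000-element list and dividing by 1000 gives the average for a successful sequential search, while the largest value returned by binComparisons stays within the bound of 22.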

Binary Search Algorithm and the class orderedArrayListType

Asymptotic Notation: Big-O Notation

• After an algorithm is designed, it should be analyzed

• There are various ways to design a particular algorithm

− Certain algorithms take very little computer time to execute; others take a considerable amount of time

• Lines 1 to 6 each have one operation, << or >>

• Line 7 has one operation, >=

• Either Line 8 or Line 9 executes; each has one operation

• There are three operations, <<, in Line 11

• The total number of operations executed in this code is 6 + 1 + 1 + 3 = 11
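
A fragment consistent with these counts might look like the sketch below. The variable names, the prompts, the function name largerOfTwo, and the use of stream parameters in place of cin and cout are all assumptions; the "Line" comments follow the numbering used in the bullets, and Line 10 is assumed to contain no operation.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Reads two numbers and reports the larger one (hypothetical example).
int largerOfTwo(std::istream& in, std::ostream& out)
{
    int num1, num2, max;

    out << "Enter the first number: ";      // Line 1: one <<
    in >> num1;                             // Line 2: one >>
    out << '\n';                            // Line 3: one <<
    out << "Enter the second number: ";     // Line 4: one <<
    in >> num2;                             // Line 5: one >>
    out << '\n';                            // Line 6: one <<

    if (num1 >= num2)                       // Line 7: one >=
        max = num1;                         // Line 8: one assignment
    else
        max = num2;                         // Line 9: one assignment

    out << "The maximum number is: " << max << '\n';   // Line 11: three <<
    return max;
}
```

Whatever the input, exactly one of Lines 8 and 9 runs, so the count is the same constant for every execution: 6 + 1 + 1 + 3 = 11.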

Asymptotic Notation: Big-O Notation (continued)

• We can use Big-O notation to compare the sequential and binary search algorithms:

− Sequential search: $O(n)$ key comparisons

− Binary search: $O(\log_2 n)$ key comparisons

Lower Bound on Comparison-Based Search Algorithms

• Comparison-based search algorithm: searches the list by comparing the target element with the list elements

Sorting Algorithms

• There are several sorting algorithms in the literature

• We discuss some of the commonly used sorting algorithms

• To compare their performance, we provide some analysis of these algorithms

• These sorting algorithms can be applied to either array-based lists or linked lists

Sorting a List: Bubble Sort

• Suppose list[0]...list[n - 1] is a list of n elements, indexed 0 to n – 1

• Bubble sort algorithm:

− In a series of n - 1 iterations, compare successive elements, list[index] and list[index + 1]

− If list[index] is greater than list[index + 1], then swap them
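
A minimal sketch of the algorithm described above, assuming std::vector in place of the book's array-based list. The function name matches the bubbleSort mentioned in the slides, but the signature is an assumption.

```cpp
#include <utility>
#include <vector>

// Sorts list into ascending order with bubble sort.
template <class T>
void bubbleSort(std::vector<T>& list)
{
    const int n = static_cast<int>(list.size());

    // n - 1 iterations; after each one, the largest remaining
    // element has moved to the end of the unsorted portion.
    for (int iteration = 1; iteration < n; ++iteration)
        for (int index = 0; index < n - iteration; ++index)
            if (list[index] > list[index + 1])              // key comparison
                std::swap(list[index], list[index + 1]);    // three assignments
}
```

The inner loop shrinks by one on each iteration because the tail of the list is already in its final position.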

Analysis: Bubble Sort

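• bubbleSort contains nested loops

− Outer loop executes n – 1 times

− For each iteration of the outer loop, the inner loop executes a certain number of times

• Comparisons: $(n - 1) + (n - 2) + \dots + 2 + 1 = n(n - 1)/2 = O(n^2)$

• Assignments (worst case, three per swap): $3 \cdot n(n - 1)/2 = O(n^2)$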

Bubble Sort Algorithm and the class unorderedArrayListType

[Code listing omitted; annotation: "Calls bubbleSort"]

Selection Sort: Array-Based Lists

• Selection sort: rearrange list by selecting an element and moving it to its proper position

• Find the smallest (or largest) element and move it to the beginning (end) of the list

Selection Sort (continued)

• On successive passes, locate the smallest item in the list starting from the next element
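
A minimal sketch of selection sort, assuming std::vector in place of the book's array-based list. The helper names minLocation and selectionSort follow the names used in the analysis of this algorithm; their signatures are assumptions, and std::swap plays the role of swap.

```cpp
#include <utility>
#include <vector>

// Returns the index of the smallest element in list[first..last].
// For a sublist of length k this makes k - 1 key comparisons.
template <class T>
int minLocation(const std::vector<T>& list, int first, int last)
{
    int minIndex = first;
    for (int loc = first + 1; loc <= last; ++loc)
        if (list[loc] < list[minIndex])
            minIndex = loc;
    return minIndex;
}

// Repeatedly moves the smallest remaining element to the front
// of the unsorted portion of the list.
template <class T>
void selectionSort(std::vector<T>& list)
{
    const int n = static_cast<int>(list.size());
    for (int loc = 0; loc < n - 1; ++loc)
    {
        int minIndex = minLocation(list, loc, n - 1);
        std::swap(list[loc], list[minIndex]);   // executed n - 1 times
    }
}
```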

Analysis: Selection Sort

• swap: three assignments; executed n − 1 times

− $3(n - 1) = O(n)$

• minLocation:

− For a list of length k, k − 1 key comparisons

− Executed n − 1 times (by selectionSort)

− Number of key comparisons: $(n - 1) + (n - 2) + \dots + 2 + 1 = n(n - 1)/2 = O(n^2)$

Insertion Sort: Array-Based Lists

• The insertion sort algorithm sorts the list by moving each element to its proper place

Insertion Sort (continued)

• Pseudocode algorithm:
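
One way to render insertion sort in C++, as a sketch that assumes std::vector in place of the book's array-based list. The variable names firstOutOfOrder and location are illustrative.

```cpp
#include <vector>

// Sorts list into ascending order with insertion sort.
template <class T>
void insertionSort(std::vector<T>& list)
{
    const int n = static_cast<int>(list.size());

    for (int firstOutOfOrder = 1; firstOutOfOrder < n; ++firstOutOfOrder)
    {
        // Only act when the element is smaller than its left neighbor.
        if (list[firstOutOfOrder] < list[firstOutOfOrder - 1])
        {
            T temp = list[firstOutOfOrder];
            int location = firstOutOfOrder;

            // Shift larger elements one position to the right.
            do
            {
                list[location] = list[location - 1];
                --location;
            } while (location > 0 && temp < list[location - 1]);

            list[location] = temp;   // drop the element into its proper place
        }
    }
}
```

On an already sorted list the if condition is false on every pass, so the function makes one key comparison per iteration of the for loop.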

Analysis: Insertion Sort

• The for loop executes n – 1 times

• Best case (list is already sorted):

− Key comparisons: $n - 1 = O(n)$

• Worst case: for each for iteration, the if statement evaluates to true

− Key comparisons: $1 + 2 + \dots + (n - 1) = n(n - 1)/2 = O(n^2)$

• Average number of key comparisons and of item assignments: $\frac{1}{4}n^2 + O(n) = O(n^2)$

Lower Bound on Comparison-Based Sort Algorithms

• Comparison tree: graph used to trace the execution of a comparison-based algorithm

− Let L be a list of n distinct elements, n > 0, so that for any j and k, where $1 \le j \le n$, $1 \le k \le n$, and $j \ne k$, either L[j] < L[k] or L[j] > L[k]

− Node: represents a comparison, labeled as j:k (comparison of L[j] with L[k]); if L[j] < L[k], follow the left branch; otherwise, follow the right branch

− Leaf: represents the final ordering of the elements of L

Lower Bound on Comparison-Based Sort Algorithms (continued)

[Figure: comparison tree with its root, a branch, and a path labeled]

Lower Bound on Comparison-Based Sort Algorithms (continued)

• Associated with each root-to-leaf path is a unique permutation of the elements of L

− Because the sort algorithm only moves the data and makes comparisons

• For a list of n elements, n > 0, there are n! different permutations

− Any of these might be the correct ordering of L

• Thus, the tree must have at least n! leaves
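
The leaf count leads directly to the lower bound. As a short worked derivation (writing h for the height of the comparison tree, a symbol introduced here for illustration): a binary tree of height h has at most $2^h$ leaves, and the number of comparisons in the worst case equals the length of the longest root-to-leaf path.

```latex
\begin{align*}
n! \le 2^{h}
  &\;\Longrightarrow\; h \ge \log_2 (n!) \\
\log_2 (n!) = \sum_{i=1}^{n} \log_2 i
  &\;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i
  \;\ge\; \frac{n}{2}\,\log_2 \frac{n}{2} \\
  &\;\Longrightarrow\; h = \Omega(n \log_2 n)
\end{align*}
```

So any comparison-based sort makes on the order of $n \log_2 n$ key comparisons in the worst case.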

Quick Sort: Array-Based Lists

• Uses the divide-and-conquer technique

− The list is partitioned into two sublists

− Each sublist is then sorted

− Sorted sublists are combined into one list in such a way that the combined list is sorted

Quick Sort: Array-Based Lists (continued)

• To partition the list into two sublists, first we choose an element of the list called the pivot

• The pivot divides the list into lowerSublist and upperSublist

− The elements in lowerSublist are < pivot

− The elements in upperSublist are ≥ pivot

Quick Sort: Array-Based Lists (continued)

• Partition algorithm (we assume that pivot is chosen as the middle element of the list):

− Step 1: Determine pivot; swap it with the first element of the list

− Step 2: For each remaining element in the list, if the current element is less than pivot, (1) increment smallIndex, and (2) swap the current element with the element pointed to by smallIndex

− Step 3: Swap the first element (pivot) with the array element pointed to by smallIndex
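
The partition algorithm and the recursive sort built on it can be sketched as follows. This assumes std::vector in place of the book's array-based list; the names partitionList, recQuickSort, and quickSort, and their signatures, are illustrative assumptions.

```cpp
#include <utility>
#include <vector>

// Partitions list[first..last] around its middle element and
// returns the final index of the pivot.
template <class T>
int partitionList(std::vector<T>& list, int first, int last)
{
    // Determine the pivot and move it to the first position.
    std::swap(list[first], list[first + (last - first) / 2]);
    const T pivot = list[first];
    int smallIndex = first;

    // Move every element smaller than the pivot into the lower sublist.
    for (int index = first + 1; index <= last; ++index)
        if (list[index] < pivot)
        {
            ++smallIndex;
            std::swap(list[smallIndex], list[index]);
        }

    // Put the pivot between the two sublists.
    std::swap(list[first], list[smallIndex]);
    return smallIndex;
}

template <class T>
void recQuickSort(std::vector<T>& list, int first, int last)
{
    if (first < last)
    {
        int pivotLocation = partitionList(list, first, last);
        recQuickSort(list, first, pivotLocation - 1);   // lower sublist
        recQuickSort(list, pivotLocation + 1, last);    // upper sublist
    }
}

template <class T>
void quickSort(std::vector<T>& list)
{
    recQuickSort(list, 0, static_cast<int>(list.size()) - 1);
}
```

Because the pivot already sits in its final position after partitioning, no separate combining step is needed once both sublists are sorted.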

Quick Sort: Array-Based Lists (continued)

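• Step 1 determines the pivot and moves the pivot to the first array position

• During the execution of Step 2, the list elements are rearranged: the pivot comes first, followed by the elements less than pivot (ending at smallIndex), then the elements greater than or equal to pivot, then the elements not yet examined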

Analysis: Quick Sort

Merge Sort: Linked List-Based Lists

• Quick sort: $O(n \log_2 n)$ average case; $O(n^2)$ worst case

• Merge sort: always $O(n \log_2 n)$

− Uses the divide-and-conquer technique: partitions the list into two sublists, sorts the sublists, and combines the sublists into one sorted list

− Differs from quick sort in how the list is partitioned: divides the list into two sublists of nearly equal size

Merge Sort: Linked List-Based Lists (continued)

• General algorithm: if the list contains more than one element, then

− Divide the list into two sublists of nearly equal size

− Merge sort both sublists

− Merge the sorted sublists

• We next describe the algorithms needed for the divide and merge steps
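
The overall shape of the recursion can be sketched with the standard library's linked list. This is an illustration only: it assumes std::list<int> in place of the book's linked list class and uses the existing members splice and merge for the divide and merge steps.

```cpp
#include <iterator>
#include <list>

// Recursive merge sort on a linked list (illustrative sketch).
void mergeSort(std::list<int>& lst)
{
    if (lst.size() < 2)             // zero or one element: already sorted
        return;

    // Divide: move the second half of lst into its own list.
    std::list<int> second;
    auto half = static_cast<std::list<int>::difference_type>((lst.size() + 1) / 2);
    auto middle = std::next(lst.begin(), half);
    second.splice(second.begin(), lst, middle, lst.end());

    // Merge sort both sublists.
    mergeSort(lst);
    mergeSort(second);

    // Merge the sorted sublists; second is left empty.
    lst.merge(second);
}
```

Both splice and merge relink nodes rather than copying elements, which is the reason merge sort suits linked lists.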

Divide

Divide (continued)

• Every time we advance middle by one node, we advance current by one node

• After advancing current by one node, if it is not NULL, we again advance it by one node

− Eventually, current becomes NULL and middle points to the last node of the first sublist
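
The two-pointer traversal described above can be sketched as follows. The node layout (info, link) and the name and signature of divideList are assumptions made for this example; nullptr is used where the slides write NULL.

```cpp
// Hypothetical singly linked node type.
template <class T>
struct nodeType
{
    T info;
    nodeType<T>* link;
};

// Splits the list headed by first1 into two sublists of nearly equal
// size. On return, first1 still heads the first sublist and first2
// heads the second (nullptr if the list has fewer than two nodes).
template <class T>
void divideList(nodeType<T>* first1, nodeType<T>*& first2)
{
    if (first1 == nullptr || first1->link == nullptr)
    {
        first2 = nullptr;           // nothing to divide
        return;
    }

    nodeType<T>* middle = first1;
    nodeType<T>* current = first1->link->link;

    // middle advances one node while current advances up to two.
    while (current != nullptr)
    {
        middle = middle->link;
        current = current->link;
        if (current != nullptr)
            current = current->link;
    }

    first2 = middle->link;          // second sublist starts after middle
    middle->link = nullptr;         // terminate the first sublist
}
```

For a list with an odd number of nodes, the first sublist ends up one node longer than the second.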

Merge

• Sorted sublists are merged into a sorted list by comparing the elements of the sublists and then adjusting the pointers of the nodes with the smaller info
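
A sketch of the merge step on singly linked nodes. The node layout and the signature are assumptions; the name mergeList follows the function named in the merge sort analysis.

```cpp
// Hypothetical singly linked node type.
template <class T>
struct nodeType
{
    T info;
    nodeType<T>* link;
};

// Merges two sorted linked lists by relinking their nodes and
// returns the head of the merged list.
template <class T>
nodeType<T>* mergeList(nodeType<T>* first1, nodeType<T>* first2)
{
    if (first1 == nullptr) return first2;
    if (first2 == nullptr) return first1;

    // The node with the smaller info becomes the head.
    nodeType<T>* newHead;
    if (first1->info < first2->info)
    {
        newHead = first1;
        first1 = first1->link;
    }
    else
    {
        newHead = first2;
        first2 = first2->link;
    }

    // Repeatedly attach the node with the smaller info.
    nodeType<T>* lastSmall = newHead;
    while (first1 != nullptr && first2 != nullptr)
    {
        if (first1->info < first2->info)
        {
            lastSmall->link = first1;
            lastSmall = first1;
            first1 = first1->link;
        }
        else
        {
            lastSmall->link = first2;
            lastSmall = first2;
            first2 = first2->link;
        }
    }

    // One list is exhausted; attach the rest of the other.
    lastSmall->link = (first1 != nullptr) ? first1 : first2;
    return newHead;
}
```

No nodes are created or copied; only link fields change.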

Analysis: Merge Sort

• Suppose that L is a list of n elements, where n > 0

• Suppose that n is a power of 2; that is, $n = 2^m$ for some nonnegative integer m, so that we can divide the list into two sublists, each of size $n/2 = 2^m/2 = 2^{m-1}$

− m is the number of recursion levels

Analysis: Merge Sort (continued)

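• To merge a sorted list of size s with a sorted list of size t, the maximum number of comparisons is s + t − 1

• The function mergeList merges two sorted lists into a sorted list

− This is where the actual work (comparisons and assignments) is done

− Max. # of comparisons at level k of recursion: with $n = 2^m$ and $0 \le k < m$, level k holds $2^k$ sublists of size $2^{m-k}$, each formed by merging two sorted halves of size $2^{m-k-1}$, so the maximum is $2^k(2^{m-k} - 1) = 2^m - 2^k = n - 2^k$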

Analysis: Merge Sort (continued)

• The maximum number of comparisons at each level of the recursion is $O(n)$

− The maximum number of comparisons is $O(nm)$, where m is the number of levels of the recursion; since $n = 2^m$, $m = \log_2 n$

− Thus, $O(nm) = O(n \log_2 n)$

• W(n), the number of key comparisons in the worst case: $O(n \log_2 n)$

• A(n), the number of key comparisons in the average case: $O(n \log_2 n)$

Programming Example: Election Results

• The presidential election for the student council of your university is about to be held

• You have to write a program to analyze the data and report the winner

• The university has four major divisions (labeled region 1 – 4), and each division has several departments

• Each department in each division handles its own voting and reports the votes received by each candidate to the election committee

Programming Example: Election Results (continued)

• The voting is reported in the following form: firstName lastName regionNumber numberOfVotes
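
Reading one record in this form might look like the sketch below. The struct voteRecord and the function parseVoteRecord are hypothetical names introduced for illustration; the range check on the region number assumes the four regions are numbered 1 through 4, as stated in the problem description.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type holding one reported line of voting data.
struct voteRecord
{
    std::string firstName;
    std::string lastName;
    int regionNumber;
    int numberOfVotes;
};

// Parses a line of the form
//     firstName lastName regionNumber numberOfVotes
// Returns false if a field is missing or malformed, if the region is
// not in 1..4, or if the vote count is negative.
bool parseVoteRecord(const std::string& line, voteRecord& rec)
{
    std::istringstream in(line);
    if (!(in >> rec.firstName >> rec.lastName
             >> rec.regionNumber >> rec.numberOfVotes))
        return false;

    return rec.regionNumber >= 1 && rec.regionNumber <= 4
        && rec.numberOfVotes >= 0;
}
```

A caller would typically read the input file line by line with std::getline, pass each line to parseVoteRecord, and add numberOfVotes to the matching candidate's total for that region.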

Programming Example: Election Results (continued)

• The input file containing the voting data looks like the following:

• The main program component is a candidate

− class candidateType

personType

Candidate

Main Program

• Read each candidate’s name into candidateList

• Sort candidateList

• Process the voting data

• Calculate the total votes received by each candidate

• Print the results

fillNames

Sort Names

Process Voting Data

Add Votes

Print Heading and Print Results

Summary

• On average, a sequential search searches half the list and makes $O(n)$ comparisons

− Not efficient for large lists

• A binary search requires the list to be sorted

− A successful search makes, on average, about $2\log_2 n - 3$ key comparisons

• Let f be a function of n: by asymptotic, we mean the study of the function f as n becomes larger and larger without bound

Summary (continued)

• Binary search algorithm is the optimal worst-case algorithm for solving search problems by using the comparison method

− A search algorithm of order less than $\log_2 n$ cannot be comparison based

• Bubble sort: $O(n^2)$ key comparisons and item assignments

• Selection sort: $O(n^2)$ key comparisons and $O(n)$ item assignments

Summary (continued)

• Insertion sort: $O(n^2)$ key comparisons and item assignments

• Both the quick sort and merge sort algorithms sort a list by partitioning it

− Quick sort: average number of key comparisons is $O(n \log_2 n)$; worst-case number of key comparisons is $O(n^2)$

− Merge sort: number of key comparisons is $O(n \log_2 n)$