Data Structures and Algorithms
Lecture 17, 18 and 19 (Sorting)
Instructor: Quratulain
Date: 10, 13 and 17 November, 2009
Faculty of Computer Science, IBA

Introduction to Sorting

To sort a collection of data is to place it in order.
◦ “Sort” here is computing jargon. Normal people use “sort” to mean placing things in categories.

Efficient sorting is of great interest.
◦ Sorting is a very common operation.
◦ Sorting code that is written with little thought is often much less efficient than code using a good sorting algorithm.
◦ Some algorithms (like Binary Search) require sorted data. The efficiency of sorting affects the desirability of such algorithms.

Introduction to Sorting

We will deal primarily with algorithms that solve the General Sorting Problem.
◦ In this problem, we are given:
   ▪ A list. Items are all of the same type.
   ▪ A comparison function. Given two list items, determine which should come first. Using this function is the only way we can make such a determination.
◦ We return:
   ▪ A sorted list with the same items as the original list.
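As a rough illustration of the General Sorting Problem, the sketch below sorts a list of ints with the C standard library function `qsort`, which orders items only through the comparison function it is handed. The names `compare_ints` and `sort_ints` are made up for this example and are not from the lecture.

```c
#include <stdlib.h>

/* Comparison function: returns a negative value, zero, or a positive
   value when *a should come before, is equivalent to, or should come
   after *b. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort n ints into ascending order. qsort never looks at the items
   itself; it only calls compare_ints to decide their order. */
void sort_ints(int *items, size_t n)
{
    qsort(items, n, sizeof items[0], compare_ints);
}
```

Swapping in a different comparison function (for example, one that reverses the sign) changes the order produced without touching the sorting code.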

Introduction to Sorting: Analyzing Sorting Algorithms

We will analyze sorting algorithms according to five criteria:
◦ Efficiency
   ▪ What is the (worst-case) order of the algorithm?
   ▪ Is the algorithm much faster on average than its worst-case performance?
◦ Requirements on Data
   ▪ Does the algorithm need random-access data? Does it work well with linked lists?
   ▪ What operations does the algorithm require the data to have? Of course, we always need “compare”. What else?
◦ Space Usage
   ▪ Can the algorithm sort in-place? In-place means the algorithm does not copy the data to a separate buffer.
   ▪ How much additional storage is required? Every algorithm requires at least a little.
◦ Stability
   ▪ Is the algorithm stable? A sorting algorithm is stable if it does not reverse the order of equivalent items.
◦ Performance on Nearly Sorted Data
   ▪ Is the algorithm faster when given sorted (or nearly sorted) data? Nearly sorted means that all items are close to where they should be, or that only a limited number of items are out of order.

Introduction to Sorting: Overview of Algorithms

There is no known sorting algorithm that has all the properties we would like one to have.

We will examine a number of sorting algorithms. Generally, these fall into two categories: O(n^2) and O(n log n).
◦ Quadratic [O(n^2)] Algorithms
   ▪ Bubble Sort
   ▪ Selection Sort
   ▪ Insertion Sort
   ▪ Quicksort
   ▪ A tree-based sort (later in semester)
◦ Log-Linear [O(n log n)] Algorithms
   ▪ Merge Sort
   ▪ Heap Sort (mostly later in semester)
◦ Special-Purpose Algorithm
   ▪ Radix Sort

It may seem odd that an algorithm called “Quicksort” is in the slow category. More about this later.

Sorting Algorithms I: Bubble Sort - Introduction

One of the simplest sorting algorithms is also one of the worst: Bubble Sort.
◦ We cover it because it is easy to understand and analyze.
◦ But we never use it.

Bubble Sort uses two basic operations:
◦ Compare
   ▪ Given two consecutive items in a list, which comes first?
◦ Swap
   ▪ Given two consecutive items in a list, swap them.

Note that Bubble Sort only performs these operations on consecutive pairs of items.

Sorting Algorithms I: Bubble Sort - Description

Bubble Sort proceeds in a number of “passes”.
◦ In each pass, we go through the list, considering each consecutive pair of items.
◦ We compare. If the pair is out of order, we swap.
◦ The larger items rise to the top like bubbles.
   ▪ I assume we are sorting in ascending order.
◦ After the first pass, the last item in the list is the largest.
◦ Thus, later passes need not go through the entire list.

We can improve Bubble Sort’s performance on some kinds of nearly sorted data:
◦ During each pass, keep track of whether we have done any swaps during that pass.
◦ If not, then the data was sorted when the pass began. Quit.

Sorting Algorithms I: Bubble Sort

1. Set i to 0.
2. Set j to i + 1.
3. If a[i] > a[j], exchange their values.
4. Set j to j + 1. If j < n, go to step 3.
5. Set i to i + 1. If i < n - 1, go to step 2.
6. a is now sorted in ascending order.

Note: n is the number of elements in the array.

Code

    int hold, j, pass;
    int switched = 1;   /* nonzero while the previous pass made a swap */

    for (pass = 0; pass < n - 1 && switched; pass++) {
        switched = 0;
        /* The last `pass` items are already in place, so stop before them. */
        for (j = 0; j < n - pass - 1; j++) {
            if (x[j] > x[j + 1]) {
                switched = 1;
                hold = x[j];
                x[j] = x[j + 1];
                x[j + 1] = hold;
            }
        }
    }

Sorting Algorithms I: Bubble Sort - Analysis

Efficiency
◦ Bubble Sort is O(n^2).
◦ Bubble Sort also has an average-case time of O(n^2).

Requirements on Data
◦ Bubble Sort works on a linked list.
◦ Operations needed: (compare and) swap of consecutive items.

Space Usage
◦ Bubble Sort can be done in-place.
◦ It does not require significant additional storage.

Stability
◦ Bubble Sort is stable.

Performance on Nearly Sorted Data
◦ We can write Bubble Sort to be O(n) if no item is far out of place.
◦ Bubble Sort is O(n^2) even if only one item is out of order, when that item is far from its place (for example, the smallest item sitting at the end of the list).

Sorting Algorithms I: Bubble Sort - Comments

Bubble Sort is very slow.
◦ It is never used, except:
   ▪ In C.S. classes, as a simple example.
   ▪ By people who do not understand sorting algorithms.

Bubble Sort does not require random-access data; however, our implementation does.
◦ Can we fix this?
   ▪ Yes, we can rewrite our Bubble Sort to use forward iterators.
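C has no iterators, so the following is only a sketch of the same idea under an assumption: a singly linked list that can be walked forward only. The `struct node` type and the name `bubble_sort_list` are hypothetical and are not part of the lecture code.

```c
#include <stddef.h>

/* Hypothetical singly linked list node, used only for this sketch. */
struct node {
    int value;
    struct node *next;
};

/* Bubble Sort that only ever steps forward through the list.
   Values are swapped between consecutive nodes; links are left alone. */
void bubble_sort_list(struct node *head)
{
    struct node *end = NULL;   /* first node of the already-sorted tail */
    int switched = 1;          /* nonzero while the last pass swapped   */

    while (switched && head != end) {
        struct node *p;
        switched = 0;
        for (p = head; p->next != end; p = p->next) {
            if (p->value > p->next->value) {
                int hold = p->value;
                p->value = p->next->value;
                p->next->value = hold;
                switched = 1;
            }
        }
        end = p;   /* p now holds the largest unsorted value */
    }
}
```

Each pass stops one node earlier than the previous one, mirroring the `n - pass - 1` bound of the array version, and the `switched` flag gives the same early exit.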

Insertion Sort

In each pass of an insertion sort, one or more pieces of data are inserted into their correct location in an ordered list (just as a card player picks up cards and places them in his hand in order).

Insertion Sort

In the insertion sort, the list is divided into 2 parts:
◦ Sorted
◦ Unsorted

In each pass, the first element of the unsorted sublist is transferred to the sorted sublist by inserting it at the appropriate place.

If we have a list of n elements, it will take, at most, n-1 passes to sort the data.

Insertion Sort

To visualize this type of sort, picture the list divided by a wall.

The first part of the list is the sorted portion, which is separated by a “conceptual” wall from the unsorted portion of the list.

Insertion Sort Pseudocode

See the sorting handouts.
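The handout is not reproduced in this transcript. As a stand-in, here is a minimal sketch of insertion sort in C that follows the sorted/unsorted "wall" description above. The function name `insertion_sort` and the variable names are assumptions made for illustration.

```c
#include <stddef.h>

/* Sort x[0..n-1] into ascending order.
   Everything to the left of `wall` is already sorted. */
void insertion_sort(int x[], size_t n)
{
    size_t wall;   /* index of the first unsorted element */

    for (wall = 1; wall < n; wall++) {
        int item = x[wall];   /* first element of the unsorted part */
        size_t j = wall;

        /* Shift larger sorted elements one place to the right. */
        while (j > 0 && x[j - 1] > item) {
            x[j] = x[j - 1];
            j--;
        }
        x[j] = item;   /* insert at the appropriate place */
    }
}
```

The outer loop runs at most n - 1 times, matching the pass count given earlier. Because the inner loop uses a strict `>`, equivalent items are never moved past each other.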

Selection Sort

Imagine some data that you can examine all at once.

To sort it, you could select the smallest element and put it in its place, select the next smallest and put it in its place, etc.

For a card player, this process is analogous to looking at an entire hand of cards and ordering them by selecting cards one at a time and placing them in their proper order.

Selection Sort

The selection sort follows this idea.

Given a list of data to be sorted, we simply select the smallest item and place it in a sorted list.

We then repeat these steps until the list is sorted.

Selection Sort

In the selection sort, the list at any moment is divided into 2 sublists, sorted and unsorted, separated by a “conceptual” wall.

We select the smallest element from the unsorted sublist and exchange it with the element at the beginning of the unsorted data.

After each selection and exchange, the wall between the 2 sublists moves, increasing the number of sorted elements and decreasing the number of unsorted elements.

Selection Sort

You can select either the smallest or the largest element and place it in its position. Here the smallest is used.

Selection Sort Pseudocode

See the sorting handout.
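The handout is not included in this transcript. A minimal sketch of selection sort in C, following the description above (select the smallest unsorted element, exchange it with the element at the wall), might look like this. The name `selection_sort` and the variable names are assumptions.

```c
#include <stddef.h>

/* Sort x[0..n-1] into ascending order.
   Elements to the left of `wall` are sorted and in their final place. */
void selection_sort(int x[], size_t n)
{
    size_t wall, j;

    for (wall = 0; wall + 1 < n; wall++) {
        size_t smallest = wall;

        /* Select the smallest element of the unsorted sublist. */
        for (j = wall + 1; j < n; j++) {
            if (x[j] < x[smallest]) {
                smallest = j;
            }
        }

        /* Exchange it with the first unsorted element. */
        if (smallest != wall) {
            int hold = x[wall];
            x[wall] = x[smallest];
            x[smallest] = hold;
        }
    }
}
```

The inner loop always scans the whole unsorted sublist, so the number of comparisons is the same whether or not the data starts out nearly sorted.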

Quick Sort

In the bubble sort, consecutive items are compared, and possibly exchanged, on each pass through the list.

This means that many exchanges may be needed to move an element to its correct position.

Quick sort is more efficient than bubble sort because a typical exchange involves elements that are far apart, so fewer exchanges are required to correctly position an element.

Quick Sort

Each iteration of the quick sort selects an element, known as the pivot, and divides the list into 3 groups:
◦ Elements whose keys are less than (or equal to) the pivot’s key.
◦ The pivot element.
◦ Elements whose keys are greater than the pivot’s key.

Quick Sort

The sorting then continues by quick sorting the left partition followed by quick sorting the right partition.

The basic algorithm is as follows:

Quick Sort

1) Partitioning Step: Take an element in the unsorted array and determine its final location in the sorted array. This occurs when all values to the left of the element in the array are less than (or equal to) the element, and all values to the right of the element are greater than (or equal to) the element. We now have 1 element in its proper location and two unsorted subarrays.

2) Recursive Step: Perform step 1 on each unsorted subarray.

Quick Sort

Each time step 1 is performed on a subarray, another element is placed in its final location in the sorted array, and two unsorted subarrays are created.

When a subarray consists of one element, that subarray is sorted.

Therefore, that element is in its final location.

Quick Sort

There are several partitioning strategies used in practice (i.e., several “versions” of quick sort), but the one we are about to describe is known to work well.

For simplicity we will select the last element as the pivot element.

We could also choose a different pivot element and swap it with the last element in the array.

Quick Sort

Below is the array we would like to sort:

1 4 8 9 0 11 5 10 7 6

Quick Sort

The index left starts at the first element and right starts at the next-to-last element.

We want to move all of the elements smaller than the pivot to the left part of the array and all of the elements larger than the pivot to the right part.

1 4 8 9 0 11 5 10 7 6    (left at 1, right at 7; the pivot is the last element, 6)

Quick Sort

We move left to the right, skipping over elements that are smaller than the pivot.

1 4 8 9 0 11 5 10 7 6    (left stops at 8; right is still at 7)

Quick Sort

We then move right to the left, skipping over elements that are greater than the pivot.

When left and right have stopped, left is on an element greater than (or equal to) the pivot and right is on an element smaller than (or equal to) the pivot.

1 4 8 9 0 11 5 10 7 6    (left at 8, right at 5)

Quick Sort

If left is to the left of right (or if left = right), those elements are swapped.

1 4 8 9 0 11 5 10 7 6    (before the swap: left at 8, right at 5)

1 4 5 9 0 11 8 10 7 6    (after the swap)

Quick Sort

Effectively, large elements are pushed to the right and small elements are pushed to the left.

We then repeat the process until left and right cross.

Quick Sort

1 4 5 9 0 11 8 10 7 6    (after the first swap)

1 4 5 9 0 11 8 10 7 6    (left stops at 9, right stops at 0)

1 4 5 0 9 11 8 10 7 6    (9 and 0 swapped)

Quick Sort

1 4 5 0 9 11 8 10 7 6    (left and right continue moving)

1 4 5 0 9 11 8 10 7 6    (right at 0, left at 9: left and right have crossed)

Quick Sort

At this point, left and right have crossed so no swap is performed.

The final part of the partitioning is to swap the pivot element with the element at left.

1 4 5 0 9 11 8 10 7 6    (before: right at 0, left at 9)

1 4 5 0 6 11 8 10 7 9    (after: pivot 6 swapped with 9)

Quick Sort

Note that all elements to the left of the pivot are less than (or equal to) the pivot and all elements to the right of the pivot are greater than (or equal to) the pivot.

Hence, the pivot element has been placed in its final sorted position.

1 4 5 0 6 11 8 10 7 9    (the pivot 6 is in its final position)

Quick Sort

We now repeat the process using the sub-arrays to the left and right of the pivot.

1 4 5 0 6 11 8 10 7 9

Left sub-array: 1 4 5 0
Right sub-array: 11 8 10 7 9

Quick Sort Pseudocode

quickSort(list, leftMostIndex, rightMostIndex) {
    if (leftMostIndex >= rightMostIndex)
        return
    end if

    pivot = list[rightMostIndex]
    left = leftMostIndex
    right = rightMostIndex - 1

    // Make sure right and left don't cross
    loop (left <= right)
        // Find key on left that belongs on right
        loop (list[left] < pivot)
            left = left + 1
        end loop

        // Find key on right that belongs on left
        // (the bound check is necessary if pivot is the smallest element in the array)
        loop (right >= leftMostIndex && list[right] > pivot)
            right = right - 1
        end loop

        // Swap out-of-order elements
        // Must account for special case of list[left] = list[right] = pivot
        if (left <= right) // Why swap if left = right?
            swap(list, left, right)
            left = left + 1
            right = right - 1
        end if
    end loop

    // Move the pivot element to its correct location
    swap(list, left, rightMostIndex)

    // Continue splitting list and sorting
    quickSort(list, leftMostIndex, right)
    quickSort(list, left + 1, rightMostIndex)
}
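The pseudocode above translates fairly directly into C. The sketch below is one such translation, not a tuned implementation. It assumes an array of ints and signed indices (right may step to one position before the sub-array). The names `quick_sort` and `swap_items` are invented for this example.

```c
/* Exchange list[i] and list[j]. */
static void swap_items(int list[], int i, int j)
{
    int hold = list[i];
    list[i] = list[j];
    list[j] = hold;
}

/* Sort list[lo..hi] (inclusive bounds) using the last element as pivot. */
void quick_sort(int list[], int lo, int hi)
{
    int pivot, left, right;

    if (lo >= hi) {
        return;   /* zero or one element: already sorted */
    }

    pivot = list[hi];
    left = lo;
    right = hi - 1;

    while (left <= right) {
        /* Find key on left that belongs on right.
           The pivot at list[hi] stops this scan. */
        while (list[left] < pivot) {
            left++;
        }
        /* Find key on right that belongs on left. */
        while (right >= lo && list[right] > pivot) {
            right--;
        }
        /* Swap out-of-order elements; when left == right both
           indices must still move so the loop terminates. */
        if (left <= right) {
            swap_items(list, left, right);
            left++;
            right--;
        }
    }

    /* Move the pivot element to its correct location. */
    swap_items(list, left, hi);

    quick_sort(list, lo, right);
    quick_sort(list, left + 1, hi);
}
```

Calling `quick_sort(a, 0, 9)` on the ten-element example array from the earlier slides performs the same first partition that was traced by hand, leaving 6 at index 4.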

Quick Sort

A couple of notes about quick sort:
◦ There are better ways to choose the pivot value (such as the median-of-three method).
◦ Also, when the subarrays get small, it becomes more efficient to use the insertion sort as opposed to continued use of quick sort.
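The slides do not spell out the median-of-three method. One common reading, shown here only as a sketch, is to look at the first, middle and last elements of the sub-array and use the one whose value lies between the other two. The helper name `median_of_three` is an assumption.

```c
/* Return the index (lo, the midpoint, or hi) whose element is the
   median of list[lo], list[mid] and list[hi]. */
int median_of_three(const int list[], int lo, int hi)
{
    int mid = lo + (hi - lo) / 2;
    int a = list[lo];
    int b = list[mid];
    int c = list[hi];

    if ((a <= b && b <= c) || (c <= b && b <= a)) {
        return mid;   /* the middle element lies between the other two */
    }
    if ((b <= a && a <= c) || (c <= a && a <= b)) {
        return lo;    /* the first element lies between the other two */
    }
    return hi;        /* otherwise the last element is the median */
}
```

The element at the returned index could then be swapped with the last element before partitioning, which fits the earlier remark about choosing a different pivot and swapping it into the last position.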

Efficiency of Quick Sort

Suppose the list has n items and every split divides it into two halves, i.e. n = 2^m.
◦ The first pass makes about n comparisons (n - 1, to be exact).
◦ The list is split into two halves of approximately n/2 items each.
◦ Splitting continues in the same way, so the total number of comparisons is approximately
   n + 2*(n/2) + 4*(n/4) + 8*(n/8) + … + n*(n/n)
   = n + n + n + n + … + n    [the list is split m times]
◦ The total number of comparisons in the average case is O(n*m), where m = log_2 n. Thus, it is O(n log n).
◦ In the worst case it is O(n^2).

Bubble Sort vs. Quick Sort

If we calculate the Big-O notation, we find that (in the average case):
◦ Bubble Sort: O(n^2)
◦ Quick Sort: O(n log_2 n)

Heap Sort

Idea: take the items that need to be sorted and insert them into a heap.

By calling deleteHeap, we remove the smallest (or largest) element, depending on whether we are working with a min-heap or a max-heap, respectively.

Hence, the elements are removed in ascending or descending order.

Efficiency: O(n log_2 n)
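To make the idea concrete, here is a small sketch in C that copies the items into an array-based min-heap and then removes the smallest item repeatedly, which yields ascending order. It uses a temporary buffer for clarity rather than sorting in place. The names `heap_sort`, `sift_up` and `sift_down` are assumptions, and the removal loop plays the role of the deleteHeap call mentioned above.

```c
#include <stdlib.h>

/* Move heap[i] up until its parent is no larger than it. */
static void sift_up(int heap[], size_t i)
{
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        int hold;
        if (heap[parent] <= heap[i]) {
            break;
        }
        hold = heap[parent];
        heap[parent] = heap[i];
        heap[i] = hold;
        i = parent;
    }
}

/* Move heap[i] down until both children are no smaller than it. */
static void sift_down(int heap[], size_t size, size_t i)
{
    for (;;) {
        size_t left = 2 * i + 1;
        size_t right = left + 1;
        size_t smallest = i;
        int hold;
        if (left < size && heap[left] < heap[smallest]) {
            smallest = left;
        }
        if (right < size && heap[right] < heap[smallest]) {
            smallest = right;
        }
        if (smallest == i) {
            break;
        }
        hold = heap[i];
        heap[i] = heap[smallest];
        heap[smallest] = hold;
        i = smallest;
    }
}

/* Sort x[0..n-1] into ascending order.
   Returns 0 on success, -1 if the temporary heap cannot be allocated. */
int heap_sort(int x[], size_t n)
{
    int *heap;
    size_t size = 0;
    size_t i;

    if (n == 0) {
        return 0;
    }
    heap = malloc(n * sizeof *heap);
    if (heap == NULL) {
        return -1;
    }

    /* Insert every item into the min-heap. */
    for (i = 0; i < n; i++) {
        heap[size] = x[i];
        sift_up(heap, size);
        size++;
    }

    /* Remove the smallest item n times (the deleteHeap step). */
    for (i = 0; i < n; i++) {
        x[i] = heap[0];
        size--;
        heap[0] = heap[size];
        sift_down(heap, size, 0);
    }

    free(heap);
    return 0;
}
```

Each insertion and each removal walks at most one root-to-leaf path of the heap, which is where the log factor in the stated efficiency comes from.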

Efficiency Summary

Sort        Worst Case      Average Case
Insertion   O(n^2)          O(n^2)
Selection   O(n^2)          O(n^2)
Bubble      O(n^2)          O(n^2)
Quick       O(n^2)          O(n log_2 n)
Heap        O(n log_2 n)    O(n log_2 n)
Merge       O(n log_2 n)    O(n log_2 n)

Searching

◦ Sequential search
◦ Indexed sequential search
◦ Binary search
◦ Binary tree
◦ Hashing