The Heroic Tales of Sorting Algorithms



Notation: O(x) = worst-case running time; Ω(x) = best-case running time; Θ(x) = best and worst case are the same.

Page numbers refer to the Preiss text book Data Structures and Algorithms with Object-Oriented Design Patterns in Java.

This page was created with some references to Paul's spiffy sorting algorithms page, which can be found here. Most of the image scans of the text book (except the code samples) were gratefully taken from that site.

Each entry below gives: the sorting algorithm and its page number, an implementation summary, comments, the algorithm's type, whether it is stable, and its asymptotic complexities.

Straight Insertion Sort (page 495)

Summary: On each pass the current item is inserted into the sorted section of the list. It starts at the last position of the sorted section and moves backwards until it finds the proper place for the current item. The item is then inserted into that place, and all items after it are shuffled one position towards the end of the list to accommodate it. It is for this reason that, if the list is already sorted, the sort is O(n), because every element is already in its sorted position. If, however, the list is sorted in reverse, it takes O(n²) time, as each insertion searches through the entire sorted section of the list and shuffles all the other elements down the list.

Comments: Good for nearly sorted lists; very bad for out-of-order lists, due to the shuffling.

Type: Insertion. Stable: Yes.

Complexity: Best case O(n); worst case O(n²).
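Here's a minimal sketch of the idea in Java (my own illustration, not Preiss's code; sorts an int array in ascending order):

// Straight insertion sort: grow a sorted prefix one element per pass.
static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int current = a[i];
        int j = i - 1;
        // Walk backwards through the sorted section until the proper
        // place for 'current' is found, shuffling elements along as we go.
        while (j >= 0 && a[j] > current) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = current;
    }
}

If the list is already sorted, the while loop exits immediately on every pass, which is where the O(n) best case comes from.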


Binary Insertion Sort (page 497)

Summary: This is an extension of Straight Insertion as above; however, instead of doing a linear search for the correct position each time, it does a binary search, which is O(log n) instead of O(n). The only problem is that it always has to do a binary search, even if the item is already in its correct position. This brings the cost of the best case up to O(n log n). Due to the possibility of having to shuffle all the other elements down the list on each pass, the worst-case running time remains at O(n²).

Comments: This is better than Straight Insertion if the comparisons are costly. This is because, even though it always has to do log n comparisons, that would generally work out to be less than a linear search.

Type: Insertion. Stable: Yes.

Complexity: Best case O(n log n); worst case O(n²).
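A sketch of the same idea with the linear search swapped for a binary search (again my own illustration, not the book's code):

// Binary insertion sort: find the insertion point with a binary search.
static void binaryInsertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int current = a[i];
        // Binary search the sorted section a[0..i-1]: always O(log n)
        // comparisons, even if 'current' is already in place.
        int lo = 0, hi = i;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] <= current) lo = mid + 1; // '<=' keeps equal items in order (stable)
            else hi = mid;
        }
        // The shuffling still costs up to O(n) per pass, hence the O(n²) worst case.
        for (int j = i; j > lo; j--) a[j] = a[j - 1];
        a[lo] = current;
    }
}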

Bubble Sort (page 499)

Summary: On each pass over the data, adjacent elements are compared and switched if they are out of order, e.g. e1 with e2, then e2 with e3, and so on. This means that on each pass, the largest element left unsorted has been "bubbled" to its rightful place at the end of the array. However, because all adjacent out-of-order pairs are swapped, the algorithm could be finished sooner. Preiss claims that it will always take O(n²) time because it keeps sorting even if the list is already in order; as we can see, his algorithm doesn't recognise that. Now someone with a bit more knowledge than Preiss will obviously see that you can end the algorithm when a pass makes no swaps, thereby making the best case O(n) (when it is already sorted) and the worst case still O(n²).

Comments: In general this is better than Insertion Sort, I believe, because it has a good chance of being sorted in much less than O(n²) time, unless you are a blind Preiss follower.

Type: Exchange. Stable: Yes. NOTE: Preiss uses a bad algorithm and claims that best and worst case are O(n²). We, however, using a little bit of insight, can see that the following is correct for a better bubble sort algorithm (which Peake does agree with?).

Complexity: Best case O(n); worst case O(n²).
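Here's a sketch of that better bubble sort, stopping as soon as a pass makes no swaps (illustrative Java, not Preiss's code):

// Bubble sort with an early exit: best case O(n) on sorted input.
static void bubbleSort(int[] a) {
    boolean swapped = true;
    for (int last = a.length - 1; last > 0 && swapped; last--) {
        swapped = false;
        for (int i = 0; i < last; i++) {
            if (a[i] > a[i + 1]) { // adjacent pair out of order: switch them
                int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                swapped = true;
            }
        }
        // If no swaps were made on this pass, the list is sorted and we stop.
    }
}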

Quicksort (page 501)

Summary: I strongly recommend looking at the diagram for this one. The code is also useful and provided below (included is the selectPivot method, even though that probably won't help your understanding anyway). The quicksort operates along these lines: firstly a pivot is selected, and removed from the list (hidden at the end). Then the elements are partitioned into 2 sections: one which is less than the pivot, and one that is greater. This partitioning is achieved by exchanging values. Then the pivot is restored in the middle, and those 2 sections are recursively quicksorted.

Comments: A complicated but effective sorting algorithm.

Type: Exchange. Stable: No.

Complexity: Best case O(n log n); worst case O(n²). Refer to page 506 for more information about these values. Note: Preiss on page 524 says that the worst case is O(n log n), contradicting page 506, but I believe that it is O(n²), as per page 506.
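Here's a compact quicksort sketch along those lines, using the simplest pivot choice (the last element) rather than Preiss's selectPivot; see Programs 15.7 and 15.9 below for his version:

// Quicksort: partition around a pivot, then recursively sort both sections.
static void quickSort(int[] a, int left, int right) {
    if (left >= right) return;
    int pivot = a[right];                    // pivot "hidden at the end"
    int store = left;
    for (int i = left; i < right; i++) {     // partition by exchanging values
        if (a[i] < pivot) {
            int tmp = a[i]; a[i] = a[store]; a[store] = tmp;
            store++;
        }
    }
    int tmp = a[store]; a[store] = a[right]; a[right] = tmp; // restore pivot in the middle
    quickSort(a, left, store - 1);
    quickSort(a, store + 1, right);
}

Call it as quickSort(a, 0, a.length - 1). A last-element pivot hits the O(n²) worst case on already-sorted input, which is exactly why the book bothers with a median-of-three selectPivot.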

Straight Selection Sort (page 511)

Summary: This one, although not very efficient, is very simple. Basically, it does n linear passes over the list, and on each pass it selects the largest value and swaps it with the last unsorted element. This means that it isn't stable, because, for example, a 3 could be swapped with a 5 that is to the left of a different 3.

Comments: A very simple algorithm to code, and a very simple one to explain, but a little slow. Note that you can do this using the smallest value instead, swapping it with the first unsorted element.

Type: Selection. Stable: No.

Complexity: Unlike bubble sort, this one is truly Θ(n²), where best case and worst case are the same, because even if the list is sorted, the same number of selections must still be performed.
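A sketch of the select-the-largest version described above (my own Java, not the book's):

// Straight selection sort: each pass swaps the largest unsorted value
// into the last unsorted position.
static void selectionSort(int[] a) {
    for (int last = a.length - 1; last > 0; last--) {
        int max = 0;
        for (int i = 1; i <= last; i++) {    // linear search for the largest value
            if (a[i] > a[max]) max = i;
        }
        // The long-range swap is what breaks stability.
        int tmp = a[last]; a[last] = a[max]; a[max] = tmp;
    }
}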


Heap Sort (page 513)

Summary: This uses a similar idea to Straight Selection Sorting, except that instead of using a linear search for the maximum, a heap is constructed, from which the maximum can easily be removed (and the heap re-formed) in log n time. This means you do n passes, each time doing a log n remove-maximum, meaning that the algorithm will always run in Θ(n log n) time, as the original order of the list makes no difference.

Comments: This utilises just about the only good use of heaps, that is, finding the maximum element of a max heap (or the minimum of a min heap). It is in every way as good as the straight selection sort, but faster.

Type: Selection. Stable: No.

Complexity: Best case O(n log n); worst case O(n log n). Ok, now I know that looks tempting, but for a much more programmer-friendly solution, look at Merge Sort instead, for a better O(n log n) sort.
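An in-place heap sort sketch: build a max heap, then repeatedly swap the maximum to the end and re-form the heap (illustrative, not the book's class-based code):

// Heap sort: n remove-maximum operations at O(log n) each.
static void heapSort(int[] a) {
    int n = a.length;
    for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n); // build the max heap
    for (int end = n - 1; end > 0; end--) {
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp; // move the maximum to its final place
        siftDown(a, 0, end);                         // re-form the heap in O(log n)
    }
}

static void siftDown(int[] a, int root, int size) {
    while (2 * root + 1 < size) {
        int child = 2 * root + 1;
        if (child + 1 < size && a[child + 1] > a[child]) child++; // pick the larger child
        if (a[root] >= a[child]) return;
        int tmp = a[root]; a[root] = a[child]; a[child] = tmp;
        root = child;
    }
}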

2-Way Merge Sort (page 519)

Summary: It is fairly simple to take 2 sorted lists and combine them into another sorted list: simply go through them, comparing the heads of each list and removing the smallest to join the new sorted list. As you may guess, this is an O(n) operation. With 2-way merge sorting, we apply this method to a single unsorted list. In brief, the algorithm recursively splits up the array until it is fragmented into single-element arrays. Each of those single elements is then merged with its pair, then those pairs are merged with their pairs, and so on, until the entire list is united in sorted order. Note that if there is ever an odd number, an extra operation is added, where the odd element is added to one of the pairs, so that that particular pair will have 1 more element than most of the others; this won't have any effect on the actual sorting.

Comments: Now isn't this much easier to understand than Heap Sort? It's really quite intuitive. This one is best explained with the aid of the diagram, and if you haven't already, you should look at it.

Type: Merge. Stable: Yes.

Complexity: Best and worst case: Θ(n log n).
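Here's a merge sort sketch; the merge loop is the O(n) combining of two sorted lists described above (my illustration; it returns a new sorted array rather than sorting in place):

import java.util.Arrays;

// 2-way merge sort: split, recursively sort each half, then merge.
static int[] mergeSort(int[] a) {
    if (a.length <= 1) return a;
    int mid = a.length / 2;                  // an odd element just joins one half
    int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
    int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
    int[] merged = new int[a.length];
    int i = 0, j = 0, k = 0;
    while (i < left.length && j < right.length) {
        // Repeatedly take the smaller head; '<=' keeps the sort stable.
        merged[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
    }
    while (i < left.length) merged[k++] = left[i++];
    while (j < right.length) merged[k++] = right[j++];
    return merged;
}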

Bucket Sort (page 526)

Summary: Bucket sort initially creates a "counts" array whose size is the size of the range of all possible values for the data we are sorting; e.g. if all of the values could be between 1 and 100, the array would have 100 elements. 2 passes are then done on the list. The first tallies up the occurrences of each number into the "counts" array; that is, for each index of the array, the data it contains signifies the number of times that number occurred in the list. The second and final pass goes through the counts array, regenerating the list in sorted form. So if there were 3 instances of 1, 0 of 2, and 1 of 3, the sorted list would be recreated as 1,1,1,3. The diagram will most likely remove all shadows of doubt in your minds.

Comments: This suffers a limitation that Radix doesn't: if the possible range of your numbers is very high, you would need too many "buckets" and it would be impractical. The other limitation that Radix doesn't have, but this one does, is that stability is not maintained. It does, however, outperform Radix Sort if the possible range is very small.

Type: Distribution. Stable: No.

Complexity: Best and worst case: Θ(m + n), where m is the number of possible values. Obviously this is O(n) for most values of m, so long as m isn't too large. The reason these distribution sorts break the O(n log n) barrier is that no comparisons are performed!
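A sketch of the two-pass counts idea, assuming the values are known to lie in 0..m-1 (the range bound m is something you must know up front):

// Bucket (counting) sort: tally occurrences, then regenerate in order.
static void bucketSort(int[] a, int m) {
    int[] counts = new int[m];               // one "bucket" per possible value
    for (int v : a) counts[v]++;             // first pass: tally occurrences
    int k = 0;
    for (int v = 0; v < m; v++) {            // second pass: regenerate the list sorted
        for (int c = 0; c < counts[v]; c++) a[k++] = v;
    }
}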

Radix Sort (page 528)

Summary: This is an extremely spiffy implementation of the bucket sort algorithm. This time, several bucket-like sorts are performed (one for each digit), but instead of having a counts array representing the range of all possible values for the data, it represents all of the possible values for each individual digit, which in decimal numbering is only 10. Firstly a bucket sort is performed using only the least significant digit to sort by, then another is done using the next least significant digit, and so on until the end, when you have done a number of bucket sorts equal to the maximum number of digits of your biggest number. Because with each bucket sort there are only 10 buckets (the counts array is of size 10), this will always be an O(n) sorting algorithm (for a fixed maximum number of digits)! See below for a Radix Sort example. On each of the adapted bucket sorts it does, the counts array stores the number of each digit; then the offsets are created using the counts, and then the sorted array is regenerated using the offsets and the original data.

Comments: This is the god of sorting algorithms. It will sort the largest list, with the biggest numbers, and has a guaranteed O(n) time complexity. And it ain't very complex to understand or implement. My recommendation is to use this one wherever possible.

Type: Distribution. Stable: Yes.

Complexity: Best and worst case: Θ(n). Bloody awesome!
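Here's a least-significant-digit radix sort sketch matching the counts-and-offsets passes worked through in the example below (my own Java; assumes non-negative ints and decimal digits):

// Radix sort: one adapted bucket sort per decimal digit.
static void radixSort(int[] a) {
    int max = 0;
    for (int v : a) max = Math.max(max, v);  // pass count = digits of the biggest number
    int[] out = new int[a.length];
    for (int div = 1; max / div > 0; div *= 10) {
        int[] count = new int[10];           // only 10 buckets per pass
        for (int v : a) count[(v / div) % 10]++;
        int[] offset = new int[10];          // starting position for each digit
        for (int d = 1; d < 10; d++) offset[d] = offset[d - 1] + count[d - 1];
        for (int v : a) out[offset[(v / div) % 10]++] = v; // place, then bump the offset
        System.arraycopy(out, 0, a, 0, a.length);
    }
}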

    Radix Sort Example:

    First Pass:

    Data: 67 50 70 25 93 47 21

    Buckets on first pass (least significant digit):

index:  0  1  2  3  4  5  6  7  8  9
count:  2  1  0  1  0  1  0  2  0  0
offset: 0  2  3  3  4  4  5  5  7  7

    Data after first pass

    50 70 21 93 25 67 47

That data is created by doing a single pass over the unsorted data, using the offsets to work out where each item belongs. For example, it looks at the first item, 67, then at the offset for the digit 7, and inserts it into the 5th position. The offset at 7 is then incremented, so that the next value encountered which has a least significant digit of 7 is placed into the 6th position. Continuing the example, the number 50 would then be looked at and inserted into the 0th position, its offset incremented so that the next value with that digit, which is 70, would be inserted into the 1st position, and so on until the end of the list.

As you can see, this data is sorted by its least significant digit.

    Second Pass:

    Data after first pass

    50 70 21 93 25 67 47

Buckets on second pass (most significant digit):

index:  0  1  2  3  4  5  6  7  8  9
count:  0  0  2  0  1  1  1  1  0  1
offset: 0  0  0  2  2  3  4  5  6  6

    Data after second pass (sorted)

    21 25 47 50 67 70 93

Look at this diagram for another example, noting that the "offsets" array is unnecessary there.

    Images:

Straight Insertion - Figure 15.2

Bubble Sorting - Figure 15.3

Quick Sorting - Figure 15.4

Program 15.7 AbstractQuickSorter code

Program 15.9 MedianOfThreeQuickSorter class selectPivot method

Straight Selection Sorting - Figure 15.5

Building a heap - Figure 15.7

Heap Sorting - Figure 15.8

Two-way merge sorting - Figure 15.10

Bucket Sorting - Figure 15.12

Radix Sorting - Figure 15.13