The Heroic Tales of Sorting Algorithms



Notation: O(x) = worst-case running time; Ω(x) = best-case running time; Θ(x) = best and worst case are the same.

Page numbers refer to the Preiss text book Data Structures and Algorithms with Object-Oriented Design Patterns in Java.

This page was created with some references to Paul's spiffy sorting algorithms page, which can be found here. Most of the image scans of the text book (except the code samples) were gratefully taken from that site.

Each entry below gives: the sorting algorithm and its page number, an implementation summary, comments, the algorithm's type, whether it is stable, and its asymptotic complexities.

Straight Insertion Sort (page 495)

Summary: On each pass the current item is inserted into the sorted section of the list. It starts at the last position of the sorted section and moves backwards until it finds the proper place for the current item. The item is then inserted into that place, and all items after it are shuffled one position towards the end of the list to accommodate it. It is for this reason that, if the list is already sorted, the sort is O(n), because every element is already in its sorted position. If, however, the list is sorted in reverse, it takes O(n²) time, as each insertion searches through the entire sorted section of the list and shuffles all the other elements down the list.

Comments: Good for nearly sorted lists; very bad for out-of-order lists, due to the shuffling.

Type: Insertion. Stable: Yes.

Complexity: Best case O(n); worst case O(n²).
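Here's a minimal sketch of the idea in Java (my own illustration, not Preiss's code; sorts an int array in ascending order):

// Straight insertion sort: grow a sorted prefix one element per pass.
static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int current = a[i];
        int j = i - 1;
        // Walk backwards through the sorted section until the proper
        // place for 'current' is found, shuffling elements along as we go.
        while (j >= 0 && a[j] > current) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = current;
    }
}

If the list is already sorted, the while loop exits immediately on every pass, which is where the O(n) best case comes from.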


Binary Insertion Sort (page 497)

Summary: This is an extension of Straight Insertion as above; however, instead of doing a linear search for the correct position each time, it does a binary search, which is O(log n) instead of O(n). The only problem is that it always has to do a binary search, even if the item is already in its correct position. This brings the cost of the best case up to O(n log n). Due to the possibility of having to shuffle all the other elements down the list on each pass, the worst-case running time remains at O(n²).

Comments: This is better than Straight Insertion if the comparisons are costly. This is because, even though it always has to do log n comparisons, that would generally work out to be less than a linear search.

Type: Insertion. Stable: Yes.

Complexity: Best case O(n log n); worst case O(n²).
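A sketch of the same idea with the linear search swapped for a binary search (again my own illustration, not the book's code):

// Binary insertion sort: find the insertion point with a binary search.
static void binaryInsertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int current = a[i];
        // Binary search the sorted section a[0..i-1]: always O(log n)
        // comparisons, even if 'current' is already in place.
        int lo = 0, hi = i;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] <= current) lo = mid + 1; // '<=' keeps equal items in order (stable)
            else hi = mid;
        }
        // The shuffling still costs up to O(n) per pass, hence the O(n²) worst case.
        for (int j = i; j > lo; j--) a[j] = a[j - 1];
        a[lo] = current;
    }
}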

Bubble Sort (page 499)

Summary: On each pass over the data, adjacent elements are compared and switched if they are out of order, e.g. e1 with e2, then e2 with e3, and so on. This means that on each pass, the largest element left unsorted has been "bubbled" to its rightful place at the end of the array. However, because all adjacent out-of-order pairs are swapped, the algorithm could be finished sooner. Preiss claims that it will always take O(n²) time because it keeps sorting even if the list is already in order; as we can see, his algorithm doesn't recognise that. Now someone with a bit more knowledge than Preiss will obviously see that you can end the algorithm when a pass makes no swaps, thereby making the best case O(n) (when it is already sorted) and the worst case still O(n²).

Comments: In general this is better than Insertion Sort, I believe, because it has a good chance of being sorted in much less than O(n²) time, unless you are a blind Preiss follower.

Type: Exchange. Stable: Yes. NOTE: Preiss uses a bad algorithm and claims that best and worst case are O(n²). We, however, using a little bit of insight, can see that the following is correct for a better bubble sort algorithm (which Peake does agree with?).

Complexity: Best case O(n); worst case O(n²).
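Here's a sketch of that better bubble sort, stopping as soon as a pass makes no swaps (illustrative Java, not Preiss's code):

// Bubble sort with an early exit: best case O(n) on sorted input.
static void bubbleSort(int[] a) {
    boolean swapped = true;
    for (int last = a.length - 1; last > 0 && swapped; last--) {
        swapped = false;
        for (int i = 0; i < last; i++) {
            if (a[i] > a[i + 1]) { // adjacent pair out of order: switch them
                int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                swapped = true;
            }
        }
        // If no swaps were made on this pass, the list is sorted and we stop.
    }
}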

Quicksort (page 501)

Summary: I strongly recommend looking at the diagram for this one. The code is also useful and provided below (included is the selectPivot method, even though that probably won't help your understanding anyway). The quicksort operates along these lines: firstly a pivot is selected, and removed from the list (hidden at the end). Then the elements are partitioned into 2 sections: one which is less than the pivot, and one that is greater. This partitioning is achieved by exchanging values. Then the pivot is restored in the middle, and those 2 sections are recursively quicksorted.

Comments: A complicated but effective sorting algorithm.

Type: Exchange. Stable: No.

Complexity: Best case O(n log n); worst case O(n²). Refer to page 506 for more information about these values. Note: Preiss on page 524 says that the worst case is O(n log n), contradicting page 506, but I believe that it is O(n²), as per page 506.
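Here's a compact quicksort sketch along those lines, using the simplest pivot choice (the last element) rather than Preiss's selectPivot; see Programs 15.7 and 15.9 below for his version:

// Quicksort: partition around a pivot, then recursively sort both sections.
static void quickSort(int[] a, int left, int right) {
    if (left >= right) return;
    int pivot = a[right];                    // pivot "hidden at the end"
    int store = left;
    for (int i = left; i < right; i++) {     // partition by exchanging values
        if (a[i] < pivot) {
            int tmp = a[i]; a[i] = a[store]; a[store] = tmp;
            store++;
        }
    }
    int tmp = a[store]; a[store] = a[right]; a[right] = tmp; // restore pivot in the middle
    quickSort(a, left, store - 1);
    quickSort(a, store + 1, right);
}

Call it as quickSort(a, 0, a.length - 1). A last-element pivot hits the O(n²) worst case on already-sorted input, which is exactly why the book bothers with a median-of-three selectPivot.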

Straight Selection Sort (page 511)

Summary: This one, although not very efficient, is very simple. Basically, it does n linear passes over the list, and on each pass it selects the largest value and swaps it with the last unsorted element. This means that it isn't stable, because, for example, a 3 could be swapped with a 5 that is to the left of a different 3.

Comments: A very simple algorithm to code, and a very simple one to explain, but a little slow. Note that you can do this using the smallest value instead, swapping it with the first unsorted element.

Type: Selection. Stable: No.

Complexity: Unlike bubble sort, this one is truly Θ(n²), where best case and worst case are the same, because even if the list is sorted, the same number of selections must still be performed.
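A sketch of the select-the-largest version described above (my own Java, not the book's):

// Straight selection sort: each pass swaps the largest unsorted value
// into the last unsorted position.
static void selectionSort(int[] a) {
    for (int last = a.length - 1; last > 0; last--) {
        int max = 0;
        for (int i = 1; i <= last; i++) {    // linear search for the largest value
            if (a[i] > a[max]) max = i;
        }
        // The long-range swap is what breaks stability.
        int tmp = a[last]; a[last] = a[max]; a[max] = tmp;
    }
}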


Heap Sort (page 513)

Summary: This uses a similar idea to Straight Selection Sorting, except that instead of using a linear search for the maximum, a heap is constructed, from which the maximum can easily be removed (and the heap re-formed) in log n time. This means you do n passes, each time doing a log n remove-maximum, meaning that the algorithm will always run in Θ(n log n) time, as the original order of the list makes no difference.

Comments: This utilises just about the only good use of heaps, that is, finding the maximum element of a max heap (or the minimum of a min heap). It is in every way as good as the straight selection sort, but faster.

Type: Selection. Stable: No.

Complexity: Best case O(n log n); worst case O(n log n). Ok, now I know that looks tempting, but for a much more programmer-friendly solution, look at Merge Sort instead, for a better O(n log n) sort.
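An in-place heap sort sketch: build a max heap, then repeatedly swap the maximum to the end and re-form the heap (illustrative, not the book's class-based code):

// Heap sort: n remove-maximum operations at O(log n) each.
static void heapSort(int[] a) {
    int n = a.length;
    for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n); // build the max heap
    for (int end = n - 1; end > 0; end--) {
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp; // move the maximum to its final place
        siftDown(a, 0, end);                         // re-form the heap in O(log n)
    }
}

static void siftDown(int[] a, int root, int size) {
    while (2 * root + 1 < size) {
        int child = 2 * root + 1;
        if (child + 1 < size && a[child + 1] > a[child]) child++; // pick the larger child
        if (a[root] >= a[child]) return;
        int tmp = a[root]; a[root] = a[child]; a[child] = tmp;
        root = child;
    }
}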

2-Way Merge Sort (page 519)

Summary: It is fairly simple to take 2 sorted lists and combine them into another sorted list: simply go through them, comparing the heads of each list and removing the smallest to join the new sorted list. As you may guess, this is an O(n) operation. With 2-way merge sorting, we apply this method to a single unsorted list. In brief, the algorithm recursively splits up the array until it is fragmented into single-element arrays. Each of those single elements is then merged with its pair, then those pairs are merged with their pairs, and so on, until the entire list is united in sorted order. Note that if there is ever an odd number, an extra operation is added, where the odd element is added to one of the pairs, so that that particular pair will have 1 more element than most of the others; this won't have any effect on the actual sorting.

Comments: Now isn't this much easier to understand than Heap Sort? It's really quite intuitive. This one is best explained with the aid of the diagram, and if you haven't already, you should look at it.

Type: Merge. Stable: Yes.

Complexity: Best and worst case: Θ(n log n).
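Here's a merge sort sketch; the merge loop is the O(n) combining of two sorted lists described above (my illustration; it returns a new sorted array rather than sorting in place):

import java.util.Arrays;

// 2-way merge sort: split, recursively sort each half, then merge.
static int[] mergeSort(int[] a) {
    if (a.length <= 1) return a;
    int mid = a.length / 2;                  // an odd element just joins one half
    int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
    int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
    int[] merged = new int[a.length];
    int i = 0, j = 0, k = 0;
    while (i < left.length && j < right.length) {
        // Repeatedly take the smaller head; '<=' keeps the sort stable.
        merged[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
    }
    while (i < left.length) merged[k++] = left[i++];
    while (j < right.length) merged[k++] = right[j++];
    return merged;
}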

Bucket Sort (page 526)

Summary: Bucket sort initially creates a "counts" array whose size is the size of the range of all possible values for the data we are sorting; e.g. if all of the values could be between 1 and 100, the array would have 100 elements. 2 passes are then done on the list. The first tallies up the occurrences of each number into the "counts" array; that is, for each index of the array, the data it contains signifies the number of times that number occurred in the list. The second and final pass goes through the counts array, regenerating the list in sorted form. So if there were 3 instances of 1, 0 of 2, and 1 of 3, the sorted list would be recreated as 1,1,1,3. The diagram will most likely remove all shadows of doubt in your minds.

Comments: This suffers a limitation that Radix doesn't: if the possible range of your numbers is very high, you would need too many "buckets" and it would be impractical. The other limitation that Radix doesn't have, but this one does, is that stability is not maintained. It does, however, outperform Radix Sort if the possible range is very small.

Type: Distribution. Stable: No.

Complexity: Best and worst case: Θ(m + n), where m is the number of possible values. Obviously this is O(n) for most values of m, so long as m isn't too large. The reason these distribution sorts break the O(n log n) barrier is that no comparisons are performed!
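A sketch of the two-pass counts idea, assuming the values are known to lie in 0..m-1 (the range bound m is something you must know up front):

// Bucket (counting) sort: tally occurrences, then regenerate in order.
static void bucketSort(int[] a, int m) {
    int[] counts = new int[m];               // one "bucket" per possible value
    for (int v : a) counts[v]++;             // first pass: tally occurrences
    int k = 0;
    for (int v = 0; v < m; v++) {            // second pass: regenerate the list sorted
        for (int c = 0; c < counts[v]; c++) a[k++] = v;
    }
}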

Radix Sort (page 528)

Summary: This is an extremely spiffy implementation of the bucket sort algorithm. This time, several bucket-like sorts are performed (one for each digit), but instead of having a counts array representing the range of all possible values for the data, it represents all of the possible values for each individual digit, which in decimal numbering is only 10. Firstly a bucket sort is performed using only the least significant digit to sort by, then another is done using the next least significant digit, and so on until the end, when you have done a number of bucket sorts equal to the maximum number of digits of your biggest number. Because with each bucket sort there are only 10 buckets (the counts array is of size 10), this will always be an O(n) sorting algorithm (for a fixed maximum number of digits)! See below for a Radix Sort example. On each of the adapted bucket sorts it does, the counts array stores the number of each digit; then the offsets are created using the counts, and then the sorted array is regenerated using the offsets and the original data.

Comments: This is the god of sorting algorithms. It will sort the largest list, with the biggest numbers, and has a guaranteed O(n) time complexity. And it ain't very complex to understand or implement. My recommendation is to use this one wherever possible.

Type: Distribution. Stable: Yes.

Complexity: Best and worst case: Θ(n). Bloody awesome!
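Here's a least-significant-digit radix sort sketch matching the counts-and-offsets passes worked through in the example below (my own Java; assumes non-negative ints and decimal digits):

// Radix sort: one adapted bucket sort per decimal digit.
static void radixSort(int[] a) {
    int max = 0;
    for (int v : a) max = Math.max(max, v);  // pass count = digits of the biggest number
    int[] out = new int[a.length];
    for (int div = 1; max / div > 0; div *= 10) {
        int[] count = new int[10];           // only 10 buckets per pass
        for (int v : a) count[(v / div) % 10]++;
        int[] offset = new int[10];          // starting position for each digit
        for (int d = 1; d < 10; d++) offset[d] = offset[d - 1] + count[d - 1];
        for (int v : a) out[offset[(v / div) % 10]++] = v; // place, then bump the offset
        System.arraycopy(out, 0, a, 0, a.length);
    }
}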

    Radix Sort Example:

    First Pass:

    Data: 67 50 70 25 93 47 21

    Buckets on first pass (least significant digit):

index:  0  1  2  3  4  5  6  7  8  9
count:  2  1  0  1  0  1  0  2  0  0
offset: 0  2  3  3  4  4  5  5  7  7

    Data after first pass

    50 70 21 93 25 67 47

That data is created by doing a single pass over the unsorted data, using the offsets to work out where each item belongs. For example, it looks at the first item, 67, then at the offset for the digit 7, and inserts it into the 5th position. The offset at 7 is then incremented, so that the next value encountered which has a least significant digit of 7 is placed into the 6th position. Continuing the example, the number 50 would then be looked at and inserted into the 0th position, its offset incremented so that the next value with that digit, which is 70, would be inserted into the 1st position, and so on until the end of the list.

As you can see, this data is sorted by its least significant digit.

    Second Pass:

    Data after first pass

    50 70 21 93 25 67 47

Buckets on second pass (most significant digit):

index:  0  1  2  3  4  5  6  7  8  9
count:  0  0  2  0  1  1  1  1  0  1
offset: 0  0  0  2  2  3  4  5  6  6

    Data after second pass (sorted)

    21 25 47 50 67 70 93

Look at this diagram for another example, noting that the "offsets" array is unnecessary there.

    Images:

Straight Insertion - Figure 15.2

Bubble Sorting - Figure 15.3

Quick Sorting - Figure 15.4

Program 15.7 AbstractQuickSorter code

Program 15.9 MedianOfThreeQuickSorter class selectPivot method

Straight Selection Sorting - Figure 15.5

Building a heap - Figure 15.7

Heap Sorting - Figure 15.8

Two-way merge sorting - Figure 15.10

Bucket Sorting - Figure 15.12

Radix Sorting - Figure 15.13