Chapter 7 Sorting: Outline
Post on 05-Jan-2016
- Introduction
- Searching and List Verification
- Definitions
- Insertion Sort
- Quick Sort
- Merge Sort
- Heap Sort
- Counting Sort
- Radix Sort
- Shell Sort
- Summary of Internal Sorting
Introduction (1/9)
- Why are efficient sorting methods so important?
  - The efficiency of a searching strategy depends on the assumptions we make about the arrangement of records in the list.
  - No single sorting technique is the "best" for all initial orderings and sizes of the list being sorted.
  - We examine several techniques, indicating when one is superior to the others.
Introduction (2/9)
- Sequential search
  - We search the list by examining the key values list[0].key, …, list[n-1].key in that order, until the correct record is located or we have examined all the records in the list.
  - Example: list has n records with the keys 4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95
  - Unsuccessful search: n+1 comparisons ⇒ O(n)
  - Average successful search:
    $$\sum_{i=0}^{n-1} \frac{i+1}{n} = \frac{n+1}{2} = O(n)$$
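The sequential search described above can be sketched in a few lines of Python. This is only an illustrative sketch: the name `seq_search` and the use of a plain list of keys (rather than records with a `key` field) are assumptions made for brevity.

```python
def seq_search(keys, searchnum):
    """Return the index of searchnum in keys, or -1 if it is absent."""
    # Examine keys[0], keys[1], ... in that order until a match is found.
    for i, key in enumerate(keys):
        if key == searchnum:
            return i
    # Every key was examined without success.
    return -1


keys = [4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95]
print(seq_search(keys, 26))
print(seq_search(keys, 5))
```

A successful search for the key at index i costs i+1 comparisons, which is where the (n+1)/2 average comes from.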
Introduction (3/9)
- Binary search
  - Binary search assumes that the list is ordered on the key field such that list[0].key ≤ list[1].key ≤ … ≤ list[n-1].key.
  - The search begins by comparing searchnum (the search key) with list[middle].key, where middle = (n-1)/2.
[Figure 7.1: Decision tree for binary search (p. 323) on the keys 4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95. Each node shows a key and its index: 4 [0], 15 [1], 17 [2], 26 [3], 30 [4], 46 [5], 48 [6], 56 [7], 58 [8], 82 [9], 90 [10], 95 [11].]
Introduction (4/9)
- Binary search (cont'd)
  - Analysis of binsearch: it makes no more than O(log n) comparisons.
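A minimal Python sketch of the binary search analysed above follows. The slides name the routine `binsearch` but do not show its code in this transcript, so the body below is an assumed, conventional iterative version over a plain list of keys.

```python
def binsearch(keys, searchnum):
    """Return the index of searchnum in the sorted list keys, or -1 if absent."""
    left, right = 0, len(keys) - 1
    while left <= right:
        middle = (left + right) // 2
        if keys[middle] == searchnum:
            return middle
        if keys[middle] < searchnum:
            left = middle + 1   # continue in the right half
        else:
            right = middle - 1  # continue in the left half
    return -1


keys = [4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95]
print(binsearch(keys, 46))
```

Each iteration halves the range still under consideration, which gives the O(log n) bound on comparisons.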
Introduction (5/9)
- List verification
  - Compare lists to verify that they are identical, or identify the discrepancies.
- Example: Internal Revenue Service (IRS), e.g., employee vs. employer reports
- Reports three types of errors:
  - all records found in list1 but not in list2
  - all records found in list2 but not in list1
  - all records that are in list1 and list2 with the same key but have different values for different fields
Introduction (6/9)
- Verifying using a sequential search
  - Check whether each element of list1 is also in list2.
  - The elements of list2 that are not in list1 are then reported.
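The slide's code for this step is not preserved in the transcript. As a rough sketch under that caveat, verification by sequential search might look like the following; the name `verify1` follows the later complexity slide, while the return format (two lists of unmatched keys) is an assumption.

```python
def verify1(list1, list2):
    """Return (keys only in list1, keys only in list2) in O(mn) time."""
    marked = [False] * len(list2)   # marked[j] is True once list2[j] is matched
    only_in_1 = []
    for key in list1:
        # Sequentially search list2 for this key.
        for j, other in enumerate(list2):
            if other == key:
                marked[j] = True
                break
        else:
            only_in_1.append(key)   # no match found anywhere in list2
    # Whatever was never marked appears in list2 but not in list1.
    only_in_2 = [k for k, m in zip(list2, marked) if not m]
    return only_in_1, only_in_2


print(verify1([3, 1, 4, 5], [5, 9, 3]))
```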
Introduction (7/9)
- Fast verification of two lists (cases handled while scanning the two sorted lists)
  - The elements of the two lists match.
  - The element of list1 is not in list2.
  - The element of list2 is not in list1.
  - The remaining elements of one list are not members of the other list.
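The four cases above correspond to a merge-style scan over two sorted lists. The sketch below is an assumed reconstruction, not the slide's own code: it compares keys only (the field comparison for matched records is left out) and uses Python's built-in `sorted` for the sorting step.

```python
def verify2(list1, list2):
    """Return (keys only in list1, keys only in list2) after sorting both lists."""
    a, b = sorted(list1), sorted(list2)
    i = j = 0
    only_in_1, only_in_2 = [], []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            only_in_1.append(a[i])   # element of list1 is not in list2
            i += 1
        elif a[i] == b[j]:
            i += 1                   # elements of the two lists match
            j += 1
        else:
            only_in_2.append(b[j])   # element of list2 is not in list1
            j += 1
    # The remaining elements of one list are not members of the other.
    only_in_1.extend(a[i:])
    only_in_2.extend(b[j:])
    return only_in_1, only_in_2


print(verify2([3, 1, 4, 5], [5, 9, 3]))
```

After sorting, the scan itself touches each element once, so the sorting cost dominates.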
Introduction (8/9)
- Complexities
  - Assume the two lists are randomly arranged.
  - Verify1: O(mn)
  - Verify2: sorts the lists before verification
    - O(tsort(n) + tsort(m) + m + n) ⇒ O(max[n log n, m log m])
    - tsort(n): the time needed to sort the n records in list1
    - tsort(m): the time needed to sort the m records in list2
    - We will show that it is possible to sort n records in O(n log n) time.
Definition
- Given ($R_0, R_1, \ldots, R_{n-1}$), where each $R_i$ has a key value $K_i$, find a permutation $\sigma$ such that $K_{\sigma(i-1)} \le K_{\sigma(i)}$, $0 < i \le n-1$.
- $\sigma$ denotes a unique permutation with the following properties:
  - Sorted: $K_{\sigma(i-1)} \le K_{\sigma(i)}$, $0 < i \le n-1$
  - Stable: if $i < j$ and $K_i = K_j$, then $R_i$ precedes $R_j$ in the sorted list
Introduction (9/9)
- Two important applications of sorting:
  - An aid to search
  - Matching entries in lists
- Internal sort
  - The list is small enough to sort entirely in main memory.
- External sort
  - There is too much information to fit into main memory.
Insertion Sort (1/3)
- Concept:
  - The basic step in this method is to insert a record R into a sequence of ordered records, $R_1, R_2, \ldots, R_i$ ($K_1 \le K_2 \le \ldots \le K_i$), in such a way that the resulting sequence of size i+1 is also ordered.
- Variations
  - Binary insertion sort: reduces search time
  - List insertion sort: reduces insert time
Insertion Sort (2/3)
- Insertion sort program
[Figure: trace of the insertion sort program on list[0..4] = 5, 2, 3, 1, 4, for i = 1, 2, 3, 4]

  i   next   list[0..4]
  -   -      5 2 3 1 4
  1   2      2 5 3 1 4
  2   3      2 3 5 1 4
  3   1      1 2 3 5 4
  4   4      1 2 3 4 5
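The program itself is not preserved in the transcript. The following is a minimal Python sketch of insertion sort that behaves as the traced example does; the variable name `nxt` stands in for the slide's `next`, and the function name is an assumption.

```python
def insertion_sort(lst):
    """Sort lst in place by inserting lst[i] into the ordered prefix lst[0..i-1]."""
    for i in range(1, len(lst)):
        nxt = lst[i]              # the record to insert
        j = i - 1
        # Shift larger keys one position to the right.
        while j >= 0 and nxt < lst[j]:
            lst[j + 1] = lst[j]
            j -= 1
        lst[j + 1] = nxt          # drop the record into the gap


data = [5, 2, 3, 1, 4]
insertion_sort(data)
print(data)
```

Because a record is only shifted past keys strictly larger than it, equal keys keep their original order, so this version is stable.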
Insertion Sort (3/3)
- Analysis of InsertionSort:
  - A record $R_i$ is left out of order (LOO) iff $R_i < \max_{0 \le j < i} \{R_j\}$.
  - If k is the number of LOO records, then the computing time is O((k+1)n).
  - The worst-case time is O(n^2): $O\left(\sum_{j=0}^{n-2} j\right) = O(n^2)$
  - The average time is O(n^2).
  - The best time is O(n).
Quick Sort (1/6)
- The quick sort scheme developed by C. A. R. Hoare has the best average behavior among all the sorting methods we shall be studying.
- Given ($R_0, R_1, \ldots, R_{n-1}$), let $K_i$ denote a pivot key.
- If $K_i$ is placed in position $s(i)$, then $K_j \le K_{s(i)}$ for $j < s(i)$, and $K_j \ge K_{s(i)}$ for $j > s(i)$.
- After a positioning has been made, the original file is partitioned into two subfiles, $\{R_0, \ldots, R_{s(i)-1}\}$ and $\{R_{s(i)+1}, \ldots, R_{n-1}\}$, on either side of $R_{s(i)}$, and they will be sorted independently.
Quick Sort (2/6)
- Quick sort concept
  - Select a pivot key.
  - Interchange the elements to their correct positions according to the pivot.
  - The original file is partitioned into two subfiles, and they will be sorted independently.
- Example (the pivot is the first key of each sublist):

  R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
  26  5 37  1 61 11 59 15 48 19   initial list
  11  5 19  1 15 26 59 61 48 37   after partitioning R0..R9 around 26
   1  5 11 19 15 26 59 61 48 37   after partitioning R0..R4 around 11
   1  5 11 15 19 26 59 61 48 37   after partitioning R3..R4 around 19
   1  5 11 15 19 26 48 37 59 61   after partitioning R6..R9 around 59
   1  5 11 15 19 26 37 48 59 61   after partitioning R6..R7 around 48
In-Place Partitioning Example (pivot = 6)
a = 6 2 8 5 11 10 4 1 9 7 3     bigElement = 8, smallElement = 3: swap
a = 6 2 3 5 11 10 4 1 9 7 8     bigElement = 11, smallElement = 1: swap
a = 6 2 3 5 1 10 4 11 9 7 8     bigElement = 10, smallElement = 4: swap
a = 6 2 3 5 1 4 10 11 9 7 8     bigElement = 10, smallElement = 4
bigElement is not to the left of smallElement, so the process terminates. Swap the pivot and smallElement.
a = 4 2 3 5 1 6 10 11 9 7 8
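The partitioning walk-through above can be expressed as a short Python sketch. It assumes the pivot is the leftmost key, as in the example; the function names `partition` and `quick_sort` and the index variables are illustrative choices, not taken from the slides.

```python
def partition(a, left, right):
    """Put the pivot a[left] in its final place in a[left..right]; return its index."""
    pivot = a[left]
    i, j = left, right + 1
    while True:
        # bigElement: next key from the left that is not smaller than the pivot.
        i += 1
        while i <= right and a[i] < pivot:
            i += 1
        # smallElement: next key from the right that is not larger than the pivot.
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:                      # bigElement is not to the left of smallElement
            break
        a[i], a[j] = a[j], a[i]
    a[left], a[j] = a[j], a[left]       # swap pivot and smallElement
    return j


def quick_sort(a, left=0, right=None):
    """Sort a[left..right] in place by partitioning and recursing on both subfiles."""
    if right is None:
        right = len(a) - 1
    if left < right:
        s = partition(a, left, right)
        quick_sort(a, left, s - 1)
        quick_sort(a, s + 1, right)


a = [6, 2, 8, 5, 11, 10, 4, 1, 9, 7, 3]
print(partition(a, 0, len(a) - 1), a)
```

The scan from the right needs no bounds check because the pivot itself, sitting at a[left], stops it.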
Quick Sort (4/6)
- Analysis for quick sort
  - Assume that each time a record is positioned, the list is divided into two parts of roughly the same size.
  - Positioning a record in a list with n elements takes O(n) time.
  - T(n) is the time taken to sort n elements:
    $$\begin{aligned} T(n) &\le cn + 2T(n/2) \quad \text{for some } c \\ &\le cn + 2(cn/2 + 2T(n/4)) \\ &\;\;\vdots \\ &\le cn\log_2 n + nT(1) = O(n \log n) \end{aligned}$$
- Time complexity
  - Average case and best case: O(n log n)
  - Worst case: O(n^2)
  - Best internal sorting method considering the average case
- Unstable
Quick Sort (5/6)
- Lemma 7.1: Let $T_{avg}(n)$ be the expected time for quicksort to sort a file with n records. Then there exists a constant k such that $T_{avg}(n) \le k\,n \log_e n$ for $n \ge 2$.
- Space for quick sort
  - If the smaller of the two subarrays is always sorted first, the maximum stack space is O(log n).
  - Stack space complexity:
    - Average case and best case: O(log n)
    - Worst case: O(n)
Quick Sort (6/6)
- Quick sort variations
  - Quick sort using a median of three: pick the median of the first, middle, and last keys in the current sublist as the pivot. Thus, pivot = median{$K_l$, $K_{(l+r)/2}$, $K_r$}.
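Median of Three Partitioning Example
a = 6 2 8 5 11 10 4 1 9 7 3     first = 6, middle = 10, last = 3; the median, 6, is the pivot
a = 3 2 8 5 11 10 4 1 9 7 6     first and last exchanged
a = 3 2 8 5 11 6 4 1 9 7 10     middle and last exchanged, so first ≤ middle ≤ last
a = 3 2 8 5 11 7 4 1 9 6 10     pivot 6 moved next to the last position
a = 3 2 1 5 11 7 4 8 9 6 10     8 and 1 exchanged
a = 3 2 1 5 4 7 11 8 9 6 10     11 and 4 exchanged
a = 3 2 1 5 4 6 11 8 9 7 10     pivot 6 exchanged into its final position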
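The rows of this example are consistent with the following median-of-three partitioning sketch. The code is an assumed reconstruction that happens to reproduce the traced rows; the function names, and the choice to handle sublists of fewer than three keys directly, are illustrative.

```python
def median3_partition(a, left, right):
    """Partition a[left..right] (at least 3 keys) around the median of three."""
    mid = (left + right) // 2
    # Order the first, middle, and last keys so that a[left] <= a[mid] <= a[right].
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[mid] < a[left]:
        a[left], a[mid] = a[mid], a[left]
    if a[right] < a[mid]:
        a[mid], a[right] = a[right], a[mid]
    # Park the pivot (the median) next to the last position.
    a[mid], a[right - 1] = a[right - 1], a[mid]
    pivot = a[right - 1]
    i, j = left, right - 1
    while True:
        i += 1
        while a[i] < pivot:     # stops at right-1 at the latest
            i += 1
        j -= 1
        while a[j] > pivot:     # stops at left at the latest
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    a[i], a[right - 1] = a[right - 1], a[i]   # pivot into its final position
    return i


def quick_sort_median3(a, left=0, right=None):
    """Quick sort that uses median-of-three partitioning on sublists of 3+ keys."""
    if right is None:
        right = len(a) - 1
    if right - left < 2:
        # One or two keys: order them directly.
        if right > left and a[right] < a[left]:
            a[left], a[right] = a[right], a[left]
        return
    s = median3_partition(a, left, right)
    quick_sort_median3(a, left, s - 1)
    quick_sort_median3(a, s + 1, right)


a = [6, 2, 8, 5, 11, 10, 4, 1, 9, 7, 3]
print(median3_partition(a, 0, len(a) - 1), a)
```

Ordering the three sampled keys first leaves a small key at the left end and a large key at the right end, so neither inner scan needs an explicit bounds check.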
Merge Sort (1/7)
- Before looking at the merge sort algorithm to sort n records, let us see how one may merge two sorted lists to get a single sorted list.
- Merging
  - Uses O(n) additional space.
  - It merges the sorted lists (list[i], …, list[m]) and (list[m+1], …, list[n]) into a single sorted list, (sorted[i], …, sorted[n]).
  - Copy sorted[1..n] to list[1..n].
- Merge (using O(n) space)
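The merging step described above might be sketched as follows. The parameter names mirror the slide's list, sorted, i, m, and n (with `sorted_` renamed to avoid shadowing Python's built-in), and the inclusive index bounds are an assumption carried over from the slide's notation.

```python
def merge(lst, sorted_, i, m, n):
    """Merge sorted runs lst[i..m] and lst[m+1..n] (inclusive) into sorted_[i..n]."""
    j, k = m + 1, i
    while i <= m and j <= n:
        # Taking from the left run on ties keeps the merge stable.
        if lst[i] <= lst[j]:
            sorted_[k] = lst[i]
            i += 1
        else:
            sorted_[k] = lst[j]
            j += 1
        k += 1
    # Exactly one run still has keys left; copy them over unchanged.
    if i <= m:
        sorted_[k:n + 1] = lst[i:m + 1]
    else:
        sorted_[k:n + 1] = lst[j:n + 1]


lst = [1, 5, 9, 2, 3, 10]
out = [None] * len(lst)
merge(lst, out, 0, 2, 5)
print(out)
```

The output buffer is where the O(n) additional space goes.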
Merge Sort (3/7) to (7/7)
- Recursive merge sort concept
[Figures: recursive merge sort concept, one step per slide]
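Since the figures for the recursive concept are not preserved, here is a small top-down sketch in Python of the general idea: split the list in half, sort each half recursively, and merge the results. It returns a new list rather than sorting in place, which is a simplification chosen for readability and not necessarily how the slides implement it.

```python
def merge_sort(lst):
    """Return a new sorted list using recursive (top-down) merge sort."""
    if len(lst) <= 1:
        return list(lst)
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])     # sort the left half
    right = merge_sort(lst[mid:])    # sort the right half
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:      # ties go to the left half (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([26, 5, 77, 1, 61, 11, 59, 15, 48, 19]))
```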
Heap Sort (1/3)
- The challenge of merge sort
  - Merge sort requires additional storage proportional to the number of records in the file being sorted.
- Heap sort
  - Requires only a fixed amount of additional storage
  - Slightly slower than merge sort using O(n) additional space
  - The worst-case and average computing time is O(n log n), the same as merge sort
  - Unstable
Heap Sort (2/3)
- adjust: adjust the binary tree to establish the heap
[Figure: adjust traced on the binary tree stored in positions [1]..[10] with keys 26, 5, 77, 1, 61, 11, 59, 15, 48, 19, starting from root = 1, rootkey = 26, n = 10. Code comments shown in the figure: /* compare root and max. root */ and /* move to parent */.]
Heap Sort (3/3)
- heapsort
[Figure: heapsort traced on the binary tree stored in positions [1]..[10] with keys 26, 5, 77, 1, 61, 11, 59, 15, 48, 19 (n = 10). Figure labels: ascending order (max heap), bottom-up, top-down.]
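The adjust and heapsort routines are only visible as figure residue in this transcript, so the following is an assumed reconstruction of the standard max-heap version. Unlike the slides, which number positions from 1, this sketch uses Python's 0-based indexing, so the children of position p are 2p+1 and 2p+2.

```python
def adjust(a, root, n):
    """Sift a[root] down within a[0:n] so the subtree rooted there is a max heap."""
    rootkey = a[root]
    child = 2 * root + 1
    while child < n:
        # Pick the larger of the two children.
        if child + 1 < n and a[child] < a[child + 1]:
            child += 1
        if rootkey >= a[child]:
            break
        a[(child - 1) // 2] = a[child]   # move the child up to its parent
        child = 2 * child + 1
    a[(child - 1) // 2] = rootkey


def heapsort(a):
    """Sort a in place into ascending order using a max heap."""
    n = len(a)
    # Build the initial heap bottom-up, starting from the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        adjust(a, i, n)
    # Repeatedly move the maximum to the end and re-adjust the shrunken heap.
    for i in range(n - 1, 0, -1):
        a[0], a[i] = a[i], a[0]
        adjust(a, 0, i)


data = [26, 5, 77, 1, 61, 11, 59, 15, 48, 19]
heapsort(data)
print(data)
```

Only a constant number of extra variables is used, which matches the fixed additional storage noted on the first heap sort slide.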
Counting Sort
- For key values within a small range
  1. Scan list[1..n] to count the frequency of every value.
  2. Sum the counts to find the proper index at which to put each value x.
  3. Scan list[1..n] and put each value into sorted[].
  4. Copy sorted to list.
- O(n) for time and space
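The four steps above can be sketched as follows. The sketch assumes the keys are non-negative integers no larger than a known `max_key`; that parameter, like the function name, is an assumption introduced for illustration.

```python
def counting_sort(lst, max_key):
    """Sort lst in place, assuming every key is an integer in 0..max_key."""
    # Step 1: count the frequency of every value.
    count = [0] * (max_key + 1)
    for x in lst:
        count[x] += 1
    # Step 2: running sums give the first output index for each value.
    start = [0] * (max_key + 1)
    total = 0
    for value in range(max_key + 1):
        start[value] = total
        total += count[value]
    # Step 3: scan the list again and place each key at its index.
    sorted_ = [None] * len(lst)
    for x in lst:
        sorted_[start[x]] = x
        start[x] += 1
    # Step 4: copy the result back.
    lst[:] = sorted_


data = [3, 1, 2, 3, 0, 1]
counting_sort(data, 3)
print(data)
```

The work is proportional to n plus the size of the key range, which is why the method suits keys within a small range.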
Radix Sort (1/5)
- We consider the problem of sorting records that have several keys.
  - These keys are labeled $K^0$ (most significant key), $K^1$, …, $K^{r-1}$ (least significant key).
  - Let $K_i^j$ denote key $K^j$ of record $R_i$.
  - A list of records $R_0, \ldots, R_{n-1}$ is lexically sorted with respect to the keys $K^0, K^1, \ldots, K^{r-1}$ iff
    $$(K_i^0, K_i^1, \ldots, K_i^{r-1}) \le (K_{i+1}^0, K_{i+1}^1, \ldots, K_{i+1}^{r-1}), \quad 0 \le i < n-1$$
Radix Sort (2/5)
- Example: sorting a deck of cards on two keys, suit and face value, in which the keys have the ordering relation:
  - $K^0$ [Suit]: ♣ < ♦ < ♥ < ♠
  - $K^1$ [Face value]: 2 < 3 < 4 < … < 10 < J < Q < K < A
- Thus, a sorted deck of cards has the ordering: 2♣, …, A♣, …, 2♠, …, A♠
- Two approaches to sort:
  1. MSD (Most Significant Digit) first: sort on $K^0$, then $K^1$, ...
  2. LSD (Least Significant Digit) first: sort on $K^{r-1}$, then $K^{r-2}$, ...
Radix Sort (3/5)
- MSD first
  1. MSD sort first, e.g., bin sort with four bins (one per suit)
  2. LSD sort second
  - Result: 2♣, …, A♣, …, 2♠, …, A♠
Radix Sort (4/5)
- LSD first
  1. LSD sort first, e.g., face sort with 13 bins: 2, 3, 4, …, 10, J, Q, K, A
  2. MSD sort second (may not be needed; we can just classify these 13 piles into 4 separate piles by considering them from face 2 to face A)
  - Simpler than the MSD approach because we do not have to sort the subpiles independently
  - Result: 2♣, …, A♣, …, 2♠, …, A♠
Radix Sort (5/5)
- We can also use an LSD or MSD sort when we have only one logical key, if we interpret this key as a composite of several keys.
- Example:
  - Integer: the digit in the far right position is the least significant, and the digit in the far left position is the most significant.
  - Range: 0 ≤ K ≤ 999
  - Use an LSD or MSD sort for three keys ($K^0$, $K^1$, $K^2$).
  - Since an LSD sort does not require the maintenance of independent subpiles, it is easier to implement.
[Figure: a three-digit key; each digit ranges over 0-9, with the MSD on the left and the LSD on the right.]
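As an illustration of the LSD approach for integers in the range 0 to 999, the sketch below distributes the keys into ten bins once per digit, starting from the rightmost digit. The function name and the `digits` and `radix` parameters are assumptions, and lists of bins are used instead of the linked queues a textbook implementation might use.

```python
def radix_sort_lsd(keys, digits=3, radix=10):
    """Return keys sorted ascending; each key must be in 0 .. radix**digits - 1."""
    for d in range(digits):
        divisor = radix ** d
        bins = [[] for _ in range(radix)]
        # Distribute on the d-th digit from the right; appending keeps the
        # order produced by the previous (less significant) pass.
        for k in keys:
            bins[(k // divisor) % radix].append(k)
        # Collect the bins in order 0..radix-1.
        keys = [k for b in bins for k in b]
    return keys


print(radix_sort_lsd([179, 208, 306, 93, 859, 984, 55, 9, 271, 33]))
```

Each pass must be stable for the final order to be correct, which the append-then-collect scheme guarantees. No subpile is ever sorted on its own.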
Shell Sort
- for (h = magic1; h > 0; h /= magic2): insertion sort the elements that are distance h apart
- Idea: give the data a chance to "long jump"
- Insertion sort is very fast for a partially sorted array
- The problem is how to find good magic numbers
  - Several sets have been discussed
  - Remember 3n+1
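One way to read the loop above is sketched below. It interprets "3n+1" as the gap sequence 1, 4, 13, 40, … generated by h = 3h + 1; that interpretation, along with starting from the largest such gap that is below roughly n/3 and dividing by 3 each round, is an assumption rather than something the slide spells out.

```python
def shell_sort(a):
    """Sort a in place with insertion sorts over gaps 1, 4, 13, 40, ..."""
    n = len(a)
    # Choose the starting gap from the sequence h = 3h + 1.
    h = 1
    while h < n // 3:
        h = 3 * h + 1
    while h > 0:
        # Insertion sort the elements that are distance h apart.
        for i in range(h, n):
            nxt = a[i]
            j = i
            while j >= h and a[j - h] > nxt:
                a[j] = a[j - h]   # a key can jump h positions in one move
                j -= h
            a[j] = nxt
        h //= 3                   # 13 -> 4 -> 1 -> 0


data = [26, 5, 77, 1, 61, 11, 59, 15, 48, 19]
shell_sort(data)
print(data)
```

The final pass with h = 1 is an ordinary insertion sort, but by then the array is already partially sorted.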
Summary of Internal Sorting (1/2)
- Insertion Sort
  - Works well when the list is already partially ordered
  - The best sorting method for small n
- Merge Sort
  - The best worst-case behavior (O(n log n))
  - Requires more storage than heap sort
  - Slightly more overhead than quick sort
- Quick Sort
  - The best average behavior
  - O(n^2) in the worst case
- Radix Sort
  - Depends on the size of the keys and the choice of the radix
Summary of Internal Sorting (2/2)
- Analysis of the average running times