mergesort analysis


Upload: chhavish

Post on 07-Apr-2018


8/4/2019 Mergesort Analysis

http://slidepdf.com/reader/full/mergesort-analysis 1/2

Here are a few notes on mergesort.

Merge Procedure

Here is a description of the merge procedure in pseudo-code, which seeks to merge two sorted (assume that they are in ascending order) arrays L1 and L2 (of size n1 and n2 respectively) and produce a sorted array L (of size n1 + n2) (in ascending order).

L = Merge(L1, L2, n1, n2)
BEGIN
(1) p1 = 1 [pointer to the current position in the first array]
(2) p2 = 1 [pointer to the current position in the second array]
(3) while (p1 ≤ n1 AND p2 ≤ n2)
{
(3a) if (L1(p1) ≤ L2(p2)), then append L1(p1) to L and increment p1,
(3b) else append L2(p2) to L and increment p2.
} [end while loop]
(4) if p1 > n1 but p2 ≤ n2, then append the remaining elements of L2 (i.e. all elements L2(p2) to L2(n2)) onto L.
(5) elseif p2 > n2 but p1 ≤ n1, then append the remaining elements of L1 (i.e. all elements L1(p1) to L1(n1)) onto L.
END
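The pseudo-code above can be sketched in Python roughly as follows. This is only an illustration, not the textbook's code: the function name merge and the 0-based indices are assumptions of this sketch (the pseudo-code counts positions from 1), and the two slice-appends at the end stand in for steps (4) and (5).

```python
def merge(left, right):
    """Merge two ascending lists into one ascending list."""
    merged = []
    p1, p2 = 0, 0  # 0-based counterparts of the pointers p1 and p2

    # Step (3): repeatedly take the smaller of the two front elements.
    while p1 < len(left) and p2 < len(right):
        if left[p1] <= right[p2]:
            merged.append(left[p1])
            p1 += 1
        else:
            merged.append(right[p2])
            p2 += 1

    # Steps (4) and (5): at most one of these slices is non-empty,
    # because the loop only exits once one list is exhausted.
    merged.extend(left[p1:])
    merged.extend(right[p2:])
    return merged


print(merge([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]
```

Using <= rather than < in the comparison mirrors step (3a): on ties the element from the first array is taken first.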

Time Complexity:

The time complexity of this operation is O(n1 + n2). Why? Look at the while loop. It will exit when p1 > n1 or when p2 > n2 (never both at once, since each iteration advances exactly one pointer). In the former case, we see that all n1 elements of L1 must surely have been copied into L, which takes a total of n1 operations. Now let us assume that x elements of L2 got copied into L (x could be 0). This takes another x operations. Once we are out of the while loop, there are still n2 − x elements remaining in L2 and these need to be copied into L, which takes n2 − x more operations. Hence the total number of operations is n1 + x + (n2 − x) = n1 + n2. The latter case is symmetric.

The textbook mentions that in the worst case, the number of comparisons taking place in the above routine is at most n1 + n2 − 1. Let us look at this with an example on two sorted arrays L1 = {1, 3, 5} and L2 = {2, 4, 6}.

We initialize p1 = p2 = 1. As we have L1(1) < L2(1), we append L1(p1 = 1) to L. So L now contains the element 1, and p1 is incremented to 2. The next time, we see that L2(p2 = 1) is less than L1(p1 = 2) and hence we have L = {1, 2}. Now p2 is incremented to 2. In the next step, we have L1(p1) < L2(p2) and hence L = {1, 2, 3}. If you continue likewise you will see that at some stage, we have p1 = 3 and p2 = 3 and L = {1, 2, 3, 4}. Now the element 5 from L1 gets added to L, and p1 is incremented to 4. As p1 > n1 the while loop exits. So far, we have performed a total of 5 comparison operations and 5 copying operations. We need to perform one more copying operation, which is putting the number 6 from L2 into L.

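In the general case, the total number of comparison operations will be at most n1 + n2 − 1 (every comparison is followed by exactly one copy, and at least one element is still left to be copied when the while loop exits), and the total number of copying operations will be exactly n1 + n2. The total of both comparison and copying operations will be less than 2(n1 + n2). In any event, the time complexity of the merge procedure is O(n1 + n2).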
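To check these counts, here is a small instrumented version of the merge in Python. It is a sketch under assumptions: the name merge_with_counts and the way copies are tallied (one per element placed in L) are choices made for this illustration. It reproduces the worked example and also shows an input where far fewer than n1 + n2 − 1 comparisons are needed.

```python
def merge_with_counts(left, right):
    """Merge two ascending lists; also return (comparisons, copies)."""
    merged = []
    p1 = p2 = 0
    comparisons = 0

    while p1 < len(left) and p2 < len(right):
        comparisons += 1  # one element comparison per loop iteration
        if left[p1] <= right[p2]:
            merged.append(left[p1])
            p1 += 1
        else:
            merged.append(right[p2])
            p2 += 1

    # Leftover elements are copied without any further comparisons.
    merged.extend(left[p1:])
    merged.extend(right[p2:])

    copies = len(merged)  # every element is copied into L exactly once
    return merged, comparisons, copies


# Interleaved inputs (the example from the notes): n1 + n2 - 1 comparisons.
print(merge_with_counts([1, 3, 5], [2, 4, 6])[1:])  # (5, 6)

# All of the first list precedes the second: only n1 comparisons.
print(merge_with_counts([1, 2, 3], [4, 5, 6])[1:])  # (3, 6)
```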


Merge Sorting

Look at the pseudo-code for mergesort on page 318 of the book. It recursively divides the array in a peculiar way until you get just individual elements. The merge procedure is then applied bottom-up. In class, I simplified this by asking you to directly look at individual elements and merge them to produce sorted sub-arrays of size 2 each, then take adjacent sub-arrays of size 2 and merge them to produce sorted sub-arrays of size 4, and so on, until you get one final sorted array.

In class, we derived the time complexity of the entire mergesort procedure to be O(n log n). Here is the gist: There are some k steps, in each of which we spend O(n) time in the different merging operations. What is the maximum value of k? Again: In the first step, we had n arrays each of size 1 which were merged to give you n/2 arrays each of size 2, which again were merged in the second step to give you n/4 arrays of size 4. This will go on for k steps. In the kth step, a total of n/2^(k−1) arrays, each of size 2^(k−1), are merged; in the last step these are two arrays of size n/2, which give the final array. Clearly this can go on only until 2^k = n, i.e. until k = log n (logarithm to base 2). Hence the total number of steps is equal to log n, yielding an overall time complexity of O(n log n).
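The bottom-up scheme described here can be sketched in Python as below. This is not the book's pseudo-code: the names merge and bottom_up_mergesort, the returned pass counter, and the choice to carry an unpaired last sub-array forward unchanged are all assumptions of this sketch.

```python
def merge(left, right):
    """Merge two ascending lists into one ascending list."""
    merged = []
    p1 = p2 = 0
    while p1 < len(left) and p2 < len(right):
        if left[p1] <= right[p2]:
            merged.append(left[p1])
            p1 += 1
        else:
            merged.append(right[p2])
            p2 += 1
    merged.extend(left[p1:])
    merged.extend(right[p2:])
    return merged


def bottom_up_mergesort(items):
    """Sort by merging adjacent sub-arrays of size 1, 2, 4, ...

    Returns the sorted list and the number of merging steps (passes).
    """
    runs = [[x] for x in items]  # n sorted sub-arrays of size 1
    passes = 0
    while len(runs) > 1:
        next_runs = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                next_runs.append(merge(runs[i], runs[i + 1]))
            else:
                next_runs.append(runs[i])  # unpaired sub-array, kept as is
        runs = next_runs
        passes += 1  # each pass touches every element once: O(n) work
    return (runs[0] if runs else []), passes


print(bottom_up_mergesort([7, 2, 8, 1, 5, 3, 6, 4]))
# ([1, 2, 3, 4, 5, 6, 7, 8], 3)
```

For n = 8 the number of sub-arrays goes 8, 4, 2, 1, so there are 3 = log 8 passes, matching the count of steps in the argument above.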

I assumed, in this case, that the size of the original array was of the form n = 2^m where m is an integer. The natural question is what happens if n is not an integral power of two. There are several things you can do. Here is one: Let us suppose that n is not an integral power of 2, so that 2^m < n < 2^(m+1). Now let us add some x dummy elements to the array (a dummy element could be something like a very large number) so that n + x = 2^(m+1). In the very worst case, we need to add x = 2^m − 1 dummy elements, and this will at most double the size of the original array. In other words, we would now have n' = 2^(m+1) elements instead of n elements (and note that n' < 2n). Now apply the mergesort procedure to this new array. The total time complexity is O(n' log n'), which is at most O(2n log 2n); since 2n log 2n = 2n log n + 2n, this is no worse than O(n log n). There are many other tricks, some of which will be quicker by a constant factor, but they will not affect the time complexity. Therefore we lose nothing by restricting ourselves to arrays whose size is an integral power of 2 (if anything, it simplifies the analysis).
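The padding trick can be sketched in Python as follows. Several things here are assumptions for illustration: the function names, the use of math.inf as the "very large number" (which presumes the real elements are numbers no larger than infinity), and the use of the standard library's heapq.merge as a stand-in for the merge procedure.

```python
import heapq
import math


def next_power_of_two(n):
    """Smallest power of two that is >= n (returns 1 for n <= 1)."""
    power = 1
    while power < n:
        power *= 2
    return power


def mergesort_power_of_two(items):
    """Bottom-up mergesort; assumes len(items) is a power of two."""
    runs = [[x] for x in items]
    while len(runs) > 1:
        # The number of runs is even at every pass, so pairs always exist.
        runs = [list(heapq.merge(runs[i], runs[i + 1]))
                for i in range(0, len(runs), 2)]
    return runs[0]


def padded_mergesort(items):
    """Pad with dummy elements, sort, then drop the dummies again."""
    n = len(items)
    padded = list(items) + [math.inf] * (next_power_of_two(n) - n)
    # The dummies are the largest values, so they end up at the back.
    return mergesort_power_of_two(padded)[:n]


print(next_power_of_two(5))               # 8
print(padded_mergesort([9, 4, 7, 1, 8]))  # [1, 4, 7, 8, 9]
```

With n = 5 the array is padded to 8 elements, i.e. x = 3 dummies, which is within the worst case of 2^m − 1 = 3 for m = 2.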
