Source: people.cs.aau.dk/~adavid/teaching/MTP-06/13-MVP06-notes.pdf


Sorting (Chapter 9)

Alexandre David, B2-206


21-04-2006 Alexandre David, MVP'06 2

Sorting

Problem: Arrange an unordered collection of elements into monotonically increasing (or decreasing) order.

Let S = <a1, a2, …, an>. Sort S into S’ = <a1’, a2’, …, an’> such that ai’ ≤ aj’ for 1 ≤ i ≤ j ≤ n and S’ is a permutation of S.

The elements to sort (actually used for comparisons) are also called the keys.
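The two conditions of the problem statement (ordered output, permutation of the input) can be checked mechanically. A minimal sketch; the function name `is_sorted_permutation` is an illustrative choice, not part of the course material:

```python
from collections import Counter

def is_sorted_permutation(original, result):
    """Return True if result is a non-decreasing permutation of original."""
    # Condition 1: ai' <= aj' for i <= j (checking neighbours is enough).
    ordered = all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    # Condition 2: S' holds exactly the same elements as S, with multiplicity.
    same_elements = Counter(original) == Counter(result)
    return ordered and same_elements
```

Such a checker is handy for testing any of the sorting algorithms discussed later.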


Recall on Comparison Based Sorting Algorithms

Bubble sort, selection sort, insertion sort.
Quick sort, merge sort, heap sort.

O(n²), Θ(n²), Ω(n), Θ(n logn), Ω(n logn)

You should know these complexities from a previous course on algorithms.


Characteristics of Sorting Algorithms

In-place sorting: No need for additional memory (or only constant size).
Stable sorting: Elements with equal keys keep their original relative order.
Internal sorting: Elements fit in process memory.
External sorting: Elements are on auxiliary storage.

We assume internal sorting is possible.
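To illustrate stability, here is a small example using Python's built-in `sorted`, which is documented to be stable. The records and the choice of the second field as key are made up for the illustration:

```python
# Records are (name, key) pairs; we sort on the key only.
records = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]

# sorted() is stable: records with equal keys keep their input order,
# so "a" stays before "d" and "b" stays before "c".
by_key = sorted(records, key=lambda record: record[1])
```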


Fundamental Distinction

Comparison based sorting:
    Compare-exchange of pairs of elements.
    Lower bound is Ω(n logn) (proof based on decision trees).
    Merge & heap-sort are optimal.

Non-comparison based sorting:
    Use information on the element to sort.
    Lower bound is Ω(n).
    Counting & radix-sort are optimal.

We assume comparison based sorting is used.
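For contrast, a minimal sketch of a non-comparison sort: counting sort on small non-negative integer keys. It assumes the caller knows an upper bound `max_key` on the keys (an assumption of this sketch); it never compares two elements with each other and runs in Θ(n + max_key):

```python
def counting_sort(keys, max_key):
    """Sort integers in the range 0..max_key without comparing elements."""
    counts = [0] * (max_key + 1)
    for key in keys:
        counts[key] += 1              # use the key itself as an index
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)  # emit each value as often as it occurred
    return result
```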


Issues in Parallel Sorting

Where to store input & output?
    One process or distributed?
    Enumeration of processes used to distribute output.
How to compare?
How many elements per process?
As many processes as elements ⇒ poor performance because of inter-process communication.


Parallel Compare-Exchange

Communication cost: ts + tw.
Comparison cost much cheaper ⇒ communication time dominates.


Blocks of Elements Per Process

n elements distributed over processes P0, P1, …, Pp-1, with n/p elements per process.

Blocks: A0 ≤ A1 ≤ … ≤ Ap-1


Compare-Split

Exchange: Θ(ts + tw·n/p)
Merge: Θ(n/p)
Split: O(n/p)

For large blocks: Θ(n/p)
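The exchange, merge, and split steps can be simulated sequentially. A minimal sketch, assuming both blocks are already sorted; the two arguments stand in for the blocks held by the two processes, and the function name is illustrative:

```python
def compare_split(block_lo, block_hi):
    """Merge two sorted blocks, then split: the first process keeps the
    smaller elements, the second process keeps the larger ones."""
    merged = []
    i = j = 0
    # Merge step: linear in the total number of elements.
    while i < len(block_lo) and j < len(block_hi):
        if block_lo[i] <= block_hi[j]:
            merged.append(block_lo[i])
            i += 1
        else:
            merged.append(block_hi[j])
            j += 1
    merged.extend(block_lo[i:])
    merged.extend(block_hi[j:])
    # Split step: each process retains as many elements as it started with.
    k = len(block_lo)
    return merged[:k], merged[k:]
```

In a real message-passing program each process would receive the other's block and keep only its own half; the sketch returns both halves for clarity.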


Sorting Networks

Mostly of theoretical interest.
Key idea: Perform many comparisons in parallel.
Key elements:
    Comparators: 2 inputs, 2 outputs.
    Network architecture: Comparators arranged in columns, each performing a permutation.
    Speed proportional to the depth.


Comparators


Sorting Networks


Bitonic Sequence

Definition: A bitonic sequence is a sequence of elements <a0, a1, …, an-1> s.t.

1. ∃i, 0 ≤ i ≤ n-1 s.t. <a0, …, ai> is monotonically increasing and <ai+1, …, an-1> is monotonically decreasing,
2. or there is a cyclic shift of indices so that 1) is satisfied.

Example: <1,2,4,7,6,0> & <8,9,2,1,0,4> are bitonic sequences.
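The definition can be turned into a brute-force check that tries every cyclic shift. This is only a sketch for experimenting with small sequences (it is quadratic in the length), it treats "monotonically" as non-strict, and the function names are illustrative:

```python
def is_up_then_down(seq):
    """Condition 1: a non-decreasing run followed by a non-increasing run."""
    n = len(seq)
    i = 0
    while i + 1 < n and seq[i] <= seq[i + 1]:   # climb as far as possible
        i += 1
    while i + 1 < n and seq[i] >= seq[i + 1]:   # then descend as far as possible
        i += 1
    return i >= n - 1                            # did we reach the last element?

def is_bitonic(seq):
    """Condition 2: some cyclic shift of seq satisfies condition 1."""
    seq = list(seq)
    shifts = range(max(1, len(seq)))
    return any(is_up_then_down(seq[s:] + seq[:s]) for s in shifts)
```

Both examples from the slide pass the check; the second one only after a cyclic shift.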


Bitonic Sort

Rearrange a bitonic sequence to be sorted.
Divide & conquer type of algorithm (similar to quicksort) using bitonic splits.
Sorting a bitonic sequence using bitonic splits = bitonic merge.
But we need a bitonic sequence…


Bitonic Split

<a0, a1, …, an/2-1, an/2, an/2+1, …, an-1>

s1 = <min{a0, an/2}, min{a1, an/2+1}, …, min{an/2-1, an-1}>
s2 = <max{a0, an/2}, max{a1, an/2+1}, …, max{an/2-1, an-1}>

s1 ≤ s2, and s1 & s2 are bitonic!

And in fact the procedure works even if the original sequence needs a cyclic shift to look like this particular case.
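A minimal sketch of the bitonic split, assuming an even-length bitonic input; the function name is illustrative:

```python
def bitonic_split(seq):
    """Split a bitonic sequence of even length into s1 (element-wise minima)
    and s2 (element-wise maxima) of the two halves."""
    half = len(seq) // 2
    s1 = [min(seq[i], seq[i + half]) for i in range(half)]
    s2 = [max(seq[i], seq[i + half]) for i in range(half)]
    return s1, s2
```

For the bitonic sequence <1,2,4,7,6,0> this yields s1 = <1,2,0> and s2 = <7,6,4>: every element of s1 is at most every element of s2, and both halves are again bitonic.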


Bitonic Merging Network

logn stages of n/2 comparators each.

⊕BM[n]

Cost: Θ(logn) obviously.


Bitonic Sort

Use the bitonic network to merge bitonic sequences of increasing length… starting from 2, etc.
Bitonic network is a component.


Bitonic Sort

logn stages

Cost: O(log²n).
Simulated on a serial computer: O(n log²n).

Not cost optimal compared to the optimal serial algorithm.
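The network can be simulated serially with a short recursive sketch. It assumes the input length is a power of two; the recursive formulation and the function names are choices of this sketch, not taken from the slides. Sorting the two halves in opposite directions produces the bitonic sequence that the merge step needs:

```python
def bitonic_merge(seq, ascending=True):
    """Sort a bitonic sequence (length a power of two) by repeated splits."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    lo = [min(seq[i], seq[i + half]) for i in range(half)]  # bitonic, all <= hi
    hi = [max(seq[i], seq[i + half]) for i in range(half)]  # bitonic
    if ascending:
        return bitonic_merge(lo, True) + bitonic_merge(hi, True)
    return bitonic_merge(hi, False) + bitonic_merge(lo, False)

def bitonic_sort(seq, ascending=True):
    """Sort a sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    first = bitonic_sort(seq[:half], True)     # increasing half
    second = bitonic_sort(seq[half:], False)   # decreasing half
    return bitonic_merge(first + second, ascending)  # first + second is bitonic
```

Run serially, each of the O(log²n) comparator columns costs Θ(n), which matches the O(n log²n) bound above.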


Mapping to Hypercubes & Mesh – Idea

Communication intensive, so the mapping needs special care.
How are the input wires paired?
    Pairs have their labels differing by only one bit ⇒ mapping to hypercube straightforward.
    For a mesh (lower connectivity) there are several solutions, but all worse than the hypercube: TP = Θ(log²n) + Θ(√n) for 1 element/process.
Block of elements: sort locally (Θ((n/p) log(n/p))) & use bitonic merge ⇒ cost optimal.

But not efficient & not scalable because the sequential algorithm is suboptimal.

Hypercube: Neighbors differ from each other by one bit.
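The one-bit pairing is easy to see in code: the partner of a wire is obtained by flipping one bit of its label, which is also how neighbors are addressed in a hypercube. A small sketch; the function names and the convention of passing the bit position explicitly are assumptions of this illustration:

```python
def partner(label, bit):
    """Label of the wire (or process) paired with `label` when flipping `bit`."""
    return label ^ (1 << bit)

def stage_pairs(n_wires, bit):
    """All compare-exchange pairs (low, high) for one column on n_wires wires."""
    return [(i, partner(i, bit)) for i in range(n_wires) if i < partner(i, bit)]
```

For 8 wires, `stage_pairs(8, 2)` pairs wires at distance 4 and `stage_pairs(8, 0)` pairs direct neighbors; in both cases the two labels differ in exactly one bit.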


Bubble Sort

Difficult to parallelize as it is because it is inherently sequential.

procedure BUBBLE_SORT(n)
begin
    for i := n-1 downto 1 do
        for j := 1 to i do
            compare_exchange(aj, aj+1);
end

Θ(n²)

It is difficult to sort n elements in time logn using n processes (cost optimal w.r.t. the best serial algorithm in n logn) but it is easy to parallelize other (less efficient) algorithms.


Odd-Even Transposition Sort

Odd phase: compare-exchange (a1,a2), (a3,a4), …
Even phase: compare-exchange (a2,a3), (a4,a5), …

Θ(n²)
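A sequential simulation of the two alternating phases, as a minimal sketch. Python lists are 0-based, so the pair (a1,a2) of the slide is `a[0], a[1]` here; n phases are enough to sort n elements:

```python
def odd_even_transposition_sort(values):
    """Sort by alternating odd and even compare-exchange phases."""
    a = list(values)
    n = len(a)
    for phase in range(n):
        # Odd phase: pairs (a1,a2), (a3,a4), ... start at index 0.
        # Even phase: pairs (a2,a3), (a4,a5), ... start at index 1.
        start = 0 if phase % 2 == 0 else 1
        for i in range(start, n - 1, 2):
            # All compare-exchanges of one phase touch disjoint pairs,
            # so they could run in parallel.
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```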


Odd-Even Transposition Sort

Easy to parallelize!
Θ(n) if 1 process/element.
Not cost optimal but use fewer processes, an optimal local sort, and compare-splits:

TP = Θ((n/p) log(n/p)) + Θ(n) + Θ(n)

local sort (optimal) + comparisons + communication

Cost optimal for p = O(logn) but not scalable (few processes).

Write speedup & efficiency to find the bound on p but you can also see it with TP.
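The bound on p can be read off TP directly, as the note suggests. A short derivation, assuming the optimal serial sort with W = Θ(n logn) as the reference:

```latex
\begin{align*}
T_P &= \Theta\!\left(\frac{n}{p}\log\frac{n}{p}\right) + \Theta(n) + \Theta(n) \\
p\,T_P &= \Theta\!\left(n\log\frac{n}{p}\right) + \Theta(np) \\
W &= \Theta(n\log n) \\
p\,T_P = \Theta(W) &\iff np = O(n\log n) \iff p = O(\log n)
\end{align*}
```

The first term of the cost never exceeds Θ(n logn), so the Θ(np) term alone decides cost optimality.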


Improvement: Shellsort

2 phases:
    Move elements on longer distances.
    Odd-even transposition but stop when no change.

Idea: Quickly move elements close to their final position to reduce the number of iterations of odd-even transposition.


Quicksort

Average complexity: O(n logn).
    But very efficient in practice.
    Average “robust”.
    Low overhead and very simple.
Divide & conquer algorithm:
    Partition A[q..r] into A[q..s] ≤ A[s+1..r].
    Recursively sort sub-arrays.
    Subtlety: How to partition?


[Figure: quicksort partitioning steps on an example array, with q and r marking the bounds of the current sub-array.]

Hoare partitioning is better. Check in your algorithm course.
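For reference, a minimal sketch of quicksort with Hoare partitioning. Taking the middle element as pivot is an assumption of this sketch; the returned index s satisfies A[q..s] ≤ A[s+1..r], matching the partition described earlier:

```python
def hoare_partition(a, q, r):
    """Partition a[q..r] in place; return s with a[q..s] <= a[s+1..r]."""
    pivot = a[(q + r) // 2]     # assumption: middle element as pivot
    i, j = q - 1, r + 1
    while True:
        i += 1
        while a[i] < pivot:     # skip elements already on the correct side
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:              # the two scans met: partition is complete
            return j
        a[i], a[j] = a[j], a[i]

def quicksort(a, q=0, r=None):
    """Sort the list a in place."""
    if r is None:
        r = len(a) - 1
    if q < r:
        s = hoare_partition(a, q, r)
        quicksort(a, q, s)
        quicksort(a, s + 1, r)
```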


BUG


Parallel Quicksort

Simple version:
    Recursive decomposition with one process per recursive call.
    Not cost optimal: Lower bound = n (initial partitioning).
    Best we can do: Use O(logn) processes.
Need to parallelize the partitioning step.


Parallel Quicksort for CRCW PRAM

See execution of quicksort as constructing a binary tree.

[Figure: binary tree of pivots built by quicksort on an example array.]


BUG

Text & algorithm 9.5: A[p..s] ≤ x < A[s+1..q].
Figures & algorithm 9.6: A[p..s] < x ≤ A[s+1..q].


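[Figure: tree-construction algorithm for the CRCW PRAM, annotated with "only one succeeds" (concurrent writes) and the test A[i] ≤ A[parent_i].]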

This algorithm does not correspond exactly to the serial version. Time for partitioning: O(1).


[Figure: step-by-step construction of the binary tree on an 8-element example array (processes 1 to 8), starting with root = 1.]

Each step: Θ(1). Average height: Θ(logn).
This is cost-optimal, but it is only a model.
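The tree construction can be simulated sequentially, one round per parallel step. This is a sketch under stated assumptions: the arbitration of concurrent writes is modelled by letting the first writer win, ties between equal keys are broken by index, indices are 0-based, and the function name is illustrative. An in-order traversal of the finished tree yields the sorted sequence:

```python
def pram_quicksort(a):
    """Simulate quicksort as binary-tree construction on a CRCW PRAM."""
    n = len(a)
    if n == 0:
        return []
    left = [None] * n
    right = [None] * n
    root = 0                                  # assumed winner of the write to root
    parent = {i: root for i in range(n) if i != root}   # still-active processes
    while parent:                             # one iteration = one Θ(1) step
        attempts = {}
        for i, p in parent.items():
            goes_left = (a[i], i) < (a[p], p)     # ties broken by index
            attempts.setdefault((p, goes_left), []).append(i)
        for (p, goes_left), writers in attempts.items():
            winner = writers[0]               # concurrent write: only one succeeds
            if goes_left:
                left[p] = winner
            else:
                right[p] = winner
            del parent[winner]                # the winner has found its place
            for i in writers[1:]:
                parent[i] = winner            # the losers descend one level

    out = []
    def visit(node):                          # in-order traversal
        if node is not None:
            visit(left[node])
            out.append(a[node])
            visit(right[node])
    visit(root)
    return out
```

The number of rounds equals the height of the tree, so a balanced tree finishes in Θ(logn) rounds while a degenerate one needs Θ(n).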


Parallel Quicksort – Shared Address (Realistic)

Same idea but remove contention:
    Choose the pivot & broadcast it.
    Each process rearranges its block of elements locally.
    Global rearrangement of the blocks.
When the blocks reach a certain size, local sort is used.


Cost

Scalability determined by time to broadcast the pivot & compute the prefix-sums.
Cost optimal.


MPI Formulation of Quicksort

Arrays must be explicitly distributed.
Two phases:
    Local partition smaller/larger than pivot.
    Determine who will sort the sub-arrays.
And send the sub-arrays to the right process.


Final Word

Pivot selection is very important.
Affects performance.
Bad pivot means idle processes.