linear sorts


Upload: ann-medina

Post on 31-Dec-2015


TRANSCRIPT

Page 1: Linear Sorts

Linear Sorts

Counting sort

Bucket sort

Radix sort

Page 2: Linear Sorts

Linear Sorts

• We will study algorithms that do not depend only on comparing whole keys to be sorted.

• Counting sort
• Bucket sort
• Radix sort

Page 3: Linear Sorts

Counting sort

• Assumptions:
  – n records
  – Each record contains a key and data
  – All keys are integers in the range 1 to k

• Space:
  – The unsorted list is stored in A; the sorted list will be stored in an additional array B
  – Uses an additional array C of size k

Page 4: Linear Sorts

Counting sort

• Main idea:
  1. For each key value i, i = 1, …, k, count the number of times the key occurs in the unsorted input array A. Store the results in an auxiliary array C.
  2. Use these counts to compute the offsets. offset_i is used to calculate the location where a record with key value i will be stored in the sorted output list B. The offset_i value is the location in B where the last record with key i belongs.

• When would you use counting sort?
• How much memory is needed?

Page 5: Linear Sorts

Counting Sort

Counting-Sort(A, B, k)
  1. for i ← 1 to k
  2.     do C[i] ← 0
  3. for j ← 1 to length[A]
  4.     do C[A[j]] ← C[A[j]] + 1
  5. for i ← 2 to k
  6.     do C[i] ← C[i] + C[i-1]
  7. for j ← length[A] downto 1
  8.     do B[C[A[j]]] ← A[j]
  9.        C[A[j]] ← C[A[j]] - 1

Input: A[1 .. n], A[j] ∈ {1, 2, ..., k}

Output: B[1 .. n], sorted

Uses C[1 .. k], auxiliary storage

Adapted from Cormen, Leiserson, Rivest
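
The Counting-Sort pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch, not the slides' own code: the function name `counting_sort` is made up for illustration, Python lists are 0-based (so the output index is shifted by one), and slot `c[0]` is left unused so that `c` can be indexed directly by the keys 1..k.

```python
def counting_sort(a, k):
    """Return a sorted copy of a, whose items are integers in 1..k."""
    c = [0] * (k + 1)              # c[0] is unused; c[1..k] mirror C[1..k]
    for key in a:                  # pseudocode lines 3-4: count each key
        c[key] += 1
    for i in range(2, k + 1):      # lines 5-6: running totals (offsets)
        c[i] += c[i - 1]
    b = [None] * len(a)
    for key in reversed(a):        # lines 7-9: scan A from right to left
        b[c[key] - 1] = key        # -1 converts the 1-based offset to 0-based
        c[key] -= 1
    return b


print(counting_sort([4, 3, 1, 4, 4, 3], 4))   # [1, 3, 3, 4, 4, 4]
```

The call uses a six-element input with k = 4, so the intermediate contents of `c` can be followed by hand.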

Page 6: Linear Sorts

Example: k = 4, length = 6

index  1 2 3 4 5 6
A      4 3 1 4 4 3

Counting-Sort(A, B, k)
  1. for i ← 1 to k
  2.     do C[i] ← 0
  3. for j ← 1 to length[A]
  4.     do C[A[j]] ← C[A[j]] + 1
  5. for i ← 2 to k
  6.     do C[i] ← C[i] + C[i-1]

index              1 2 3 4
C after lines 1-2  0 0 0 0
C after lines 3-4  1 0 2 3
C after lines 5-6  1 1 3 6

Page 7: Linear Sorts

  7. for j ← length[A] downto 1
  8.     do B[C[A[j]]] ← A[j]
  9.        C[A[j]] ← C[A[j]] - 1

index  1 2 3 4 5 6
A      4 3 1 4 4 3
B      (empty, filled by lines 7-9)

C      1 1 3 6

Regions of B: position 1 holds key 1, positions 2-3 hold key 3, positions 4-6 hold key 4.

Page 8: Linear Sorts

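Counting sort

Original list A:

 1  3 Clinton
 2  4 Smith
 3  1 Xu
 4  2 Adams
 5  3 Dunn
 6  4 Yi
 7  2 Baum
 8  1 Fu
 9  3 Gold
10  1 Lu
11  1 Land

C:

key   initial   final counts   "offsets"
1     0         4              (4) (3) 2
2     0         2              6
3     0         3              (9) 8
4     0         2              11

Values in parentheses are offsets that have already been used and decremented.

B (sort buckets), after placing Land, Lu and Gold:

 3  1 Lu
 4  1 Land
 9  3 Gold

All other positions of B (1-2, 5-8, 10-11) are still empty.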
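
The record example can be replayed with a short Python sketch. It assumes each record is a `(key, name)` tuple; the names `counting_sort_records` and `roster` are made up for illustration. Records are placed from the end of the input backwards, so Land, Lu and Gold are the first three to be written into the output.

```python
def counting_sort_records(records, k):
    """Stable counting sort of (key, data) tuples with keys in 1..k."""
    counts = [0] * (k + 1)                 # counts[0] is unused
    for key, _ in records:
        counts[key] += 1                   # final counts per key
    for i in range(2, k + 1):
        counts[i] += counts[i - 1]         # offsets: last slot for each key
    out = [None] * len(records)
    for rec in reversed(records):          # right-to-left keeps equal keys in order
        key = rec[0]
        out[counts[key] - 1] = rec         # 1-based offset -> 0-based index
        counts[key] -= 1
    return out


roster = [(3, "Clinton"), (4, "Smith"), (1, "Xu"), (2, "Adams"),
          (3, "Dunn"), (4, "Yi"), (2, "Baum"), (1, "Fu"),
          (3, "Gold"), (1, "Lu"), (1, "Land")]

for position, rec in enumerate(counting_sort_records(roster, 4), start=1):
    print(position, *rec)
```

In the printed list the four records with key 1 appear as Xu, Fu, Lu, Land, the same relative order as in the input.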

Page 9: Linear Sorts

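Analysis:

• O(k + n) time
  – What if k = O(n)? Then the running time is O(n).

• But sorting takes Ω(n lg n)? That lower bound applies only to comparison sorts; counting sort never compares two keys.
• Requires k + n extra storage.
• This is a stable sort: it preserves the original order of equal keys.
• Clearly no good for sorting 32-bit values: C would need k = 2^32 entries.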
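
To put a number on the storage requirement, the arithmetic below is a rough sketch. It assumes every slot of B and C takes 4 bytes, which is an illustrative assumption rather than something the slides state; the function name is hypothetical.

```python
def counting_sort_extra_bytes(n, k, bytes_per_slot=4):
    """Extra storage for B (n slots) plus C (k slots); slot size is assumed."""
    return (n + k) * bytes_per_slot


n = 1_000_000
small = counting_sort_extra_bytes(n, k=100)        # keys in 1..100
huge = counting_sort_extra_bytes(n, k=2**32)       # arbitrary 32-bit keys

print(small)                  # 4000400 bytes, about 4 MB
print(huge / 2**30)           # a little over 16 GiB, almost all of it for C
```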

Page 10: Linear Sorts

Bucket sort

• Keys are distributed uniformly in interval [0, 1)

• The records are distributed into n buckets

• The buckets are sorted using one of the well known sorts

• Finally the buckets are combined

Page 11: Linear Sorts

Bucket sort

Input list (positions 1-10): .78 .17 .39 .26 .72 .94 .21 .12 .23 .68

bucket   Step 1: distribute   Step 2: sorted
0        /                    /
1        .12 .17              .12 .17
2        .23 .21 .26          .21 .23 .26
3        .39                  .39
4        /                    /
5        /                    /
6        .68                  .68
7        .72 .78              .72 .78
8        /                    /
9        .94                  .94

Step 3: combine the buckets in order: .12 .17 .21 .23 .26 .39 .68 .72 .78 .94
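
The three steps (distribute, sort each bucket, combine) can be sketched in Python as below. This is only an illustration: the name `bucket_sort` is made up, the bucket index is taken to be `int(n * x)`, and Python's built-in `list.sort` stands in for "one of the well known sorts".

```python
def bucket_sort(keys):
    """Sort keys assumed to lie in [0, 1) using len(keys) buckets."""
    n = len(keys)
    buckets = [[] for _ in range(n)]
    for x in keys:                       # step 1: distribute
        buckets[int(n * x)].append(x)    # bucket i covers [i/n, (i+1)/n)
    for bucket in buckets:               # step 2: sort each bucket
        bucket.sort()                    # any well-known sort would do here
    return [x for bucket in buckets for x in bucket]   # step 3: combine


data = [.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]
print(bucket_sort(data))
```

With ten keys there are ten buckets, so bucket 2 collects .26, .21 and .23 before being sorted.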

Page 12: Linear Sorts

Analysis

• p = 1/n is the probability that a key goes to bucket i.
• Expected size of a bucket is np = n · (1/n) = 1.

• The expected time to sort one bucket is Θ(1).

• Overall expected time is Θ(n).

Page 13: Linear Sorts

How did IBM get rich originally?

• In the early 1900's IBM produced punched card readers for census tabulation.

• Cards have 80 columns with 12 places for punches per column. Only 10 places are needed for decimal digits.
  – Picture of punch card.

• Sorters had 12 bins.
• Key idea: sort on the least significant digit first.

Page 14: Linear Sorts

[Figure: A punched card]

Page 15: Linear Sorts

Card punching machine

[Figure: IBM card punching machine]

Page 16: Linear Sorts

Hollerith’s tabulating machines

• As the cards were fed through a "tabulating machine," pins passed through the positions where holes were punched, completing an electrical circuit that registered a value.

• The 1880 census in the U.S. took seven years to complete.

• With Hollerith's "tabulating machines" the 1890 census took the Census Bureau six weeks.

Page 17: Linear Sorts

Card sorting machine

[Figure: IBM’s card sorting machine]

Page 18: Linear Sorts

Radix sort

• Main idea
  – Break the key into a "digit" representation: key = i_d, i_(d-1), …, i_2, i_1
  – A "digit" can be a number in any base, a character, etc.

• Radix sort:
    for i = 1 to d
        sort "digit" i using a stable sort

• Analysis: Θ(d · (stable sort time)), where d is the number of "digits"

Page 19: Linear Sorts

Radix sort

• Which stable sort?
  – Since the range of values of a digit is small, the best stable sort to use is counting sort.
  – When counting sort is used, the time complexity is Θ(d(n + k)), where k is the range of a "digit".

• When k = O(n), this is Θ(dn).

Page 20: Linear Sorts

Radix sort with decimal digits

position   Input list   after digit 1   after digit 2   Sorted list (after digit 3)
1          178          910             910             139
2          139          321             321             178
3          326          572             326             294
4          572          294             139             321
5          294          326             368             326
6          321          178             572             368
7          910          368             178             572
8          368          139             294             910
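
The decimal example can be reproduced with a least-significant-digit radix sort built on a counting sort per digit. The sketch below is one possible Python rendering; the function names are made up, and digits run from 0 to 9, so the count array is indexed from 0 rather than 1.

```python
def counting_sort_by_digit(nums, place, base=10):
    """Stable counting sort on the digit selected by place (1, 10, 100, ...)."""
    counts = [0] * base
    for x in nums:
        counts[(x // place) % base] += 1
    for d in range(1, base):
        counts[d] += counts[d - 1]          # counts[d] = items with digit <= d
    out = [0] * len(nums)
    for x in reversed(nums):                # right-to-left scan keeps it stable
        d = (x // place) % base
        counts[d] -= 1
        out[counts[d]] = x
    return out


def radix_sort(nums, digits, base=10):
    """Sort non-negative integers that have at most `digits` digits."""
    place = 1
    for _ in range(digits):                 # least significant digit first
        nums = counting_sort_by_digit(nums, place, base)
        print(nums)                         # show the list after each pass
        place *= base
    return nums


radix_sort([178, 139, 326, 572, 294, 321, 910, 368], digits=3)
```

The three printed lists correspond to the three columns that follow the input list in the example.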

Page 21: Linear Sorts

Radix sort with unstable digit sort

position   Input list   after digit 1   after digit 2
1          17           13              17
2          13           17              13

List not sorted, since the digit sort is unstable and both keys have second digit equal to 1.
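
One way to see this failure in code is to make the per-digit counting sort deliberately unstable. In the sketch below (an illustration, not taken from the slides) the `stable` flag is a made-up switch: when it is false, the final loop scans the input left to right instead of right to left, which reverses the order of equal digits.

```python
def digit_sort(nums, place, stable=True):
    """Counting sort on one decimal digit; unstable when stable=False."""
    counts = [0] * 10
    for x in nums:
        counts[(x // place) % 10] += 1
    for d in range(1, 10):
        counts[d] += counts[d - 1]
    out = [0] * len(nums)
    # Filling each digit's slots from the back while scanning forward
    # reverses ties; scanning backward preserves them.
    order = reversed(nums) if stable else nums
    for x in order:
        d = (x // place) % 10
        counts[d] -= 1
        out[counts[d]] = x
    return out


def radix_sort_two_digits(nums, stable):
    for place in (1, 10):                   # ones digit, then tens digit
        nums = digit_sort(nums, place, stable)
    return nums


print(radix_sort_two_digits([17, 13], stable=True))    # [13, 17]
print(radix_sort_two_digits([17, 13], stable=False))   # [17, 13]
```

With the unstable variant the second pass undoes the work of the first, because both tens digits are 1.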

Page 22: Linear Sorts

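Is Quicksort stable?

• Note that the data is sorted by key only.
• Since the sort is unstable, it cannot be used for radix sort.

Each record is written as two digits: Key, then Data.

position                   1    2    3
Input                      51   55   48
After partition of 0 to 2  48   55   51
After partition of 1 to 2  48   55   51

The two records with key 5 (51 and 55) end up in the opposite order.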
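
The trace can be reproduced with a small quicksort that compares only the key digit. The slides do not say which partition scheme they use; the sketch below assumes a Lomuto-style partition with the last element as pivot, which happens to give the same intermediate lists. All names are made up for illustration.

```python
def key_of(record):
    """The key is the tens digit; the ones digit is the attached data."""
    return record // 10


def partition(a, lo, hi):
    pivot = key_of(a[hi])                   # last element is the pivot
    i = lo - 1
    for j in range(lo, hi):
        if key_of(a[j]) <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[hi] = a[hi], a[i + 1]       # move the pivot into place
    return i + 1


def quicksort(a, lo, hi):
    if lo < hi:
        p = partition(a, lo, hi)
        print(f"after partition of {lo} to {hi}: {a}")
        quicksort(a, lo, p - 1)
        quicksort(a, p + 1, hi)


records = [51, 55, 48]
quicksort(records, 0, len(records) - 1)
print(records)                              # [48, 55, 51]
```

The keys come out in order (4, 5, 5), but record 55 now precedes record 51, so the sort is not stable.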

Page 23: Linear Sorts

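Is Heapsort stable?

• Note that the data is sorted by key only.
• Since the sort is unstable, it cannot be used for radix sort.

Each record is written as two digits: Key, then Data.

position                                            1    2
Heap (complete binary tree, and max heap)           51   55
After swap                                          55   51
Sorted                                              55   51

In the heap, 51 is the root and 55 is its child.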
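
The two-record case can be checked with a compact heapsort that compares only the key digit. This is a minimal sketch with made-up names, using a 0-based array in which the children of node i are at 2i + 1 and 2i + 2.

```python
def key_of(record):
    """The key is the tens digit; the ones digit is the attached data."""
    return record // 10


def sift_down(a, root, end):
    """Restore the max-heap property for a[root:end], comparing keys only."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        if child + 1 < end and key_of(a[child + 1]) > key_of(a[child]):
            child += 1                      # pick the larger child
        if key_of(a[root]) >= key_of(a[child]):
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heapsort(a):
    n = len(a)
    for root in range(n // 2 - 1, -1, -1):  # build the max heap
        sift_down(a, root, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]         # move the current maximum to the end
        sift_down(a, 0, end)


records = [51, 55]                          # already a max heap: the keys are equal
heapsort(records)
print(records)                              # [55, 51]
```

The single swap of the root with the last element is enough to reverse the two equal-key records.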

Page 24: Linear Sorts

Example

Sort 1 million 64-bit numbers.

We could use an in-place comparison sort, which would run in Θ(n lg n) in the average case. lg 1,000,000 ≈ 20 passes over the data.

We can treat a 64-bit number as a 4-digit, radix-2^16 number. So d = 4, k = 2^16, n = 1,000,000.

Θ(d(n + k)) = Θ(4(2^16 + n)). This takes 4 × 2 passes over the data.

| d3: 16 bits | d2: 16 bits | d1: 16 bits | d0: 16 bits |

64-bit number = d3·(2^16)^3 + d2·(2^16)^2 + d1·(2^16)^1 + d0·(2^16)^0

Adapted from Cormen, Leiserson, Rivest
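
A sketch of this scheme in Python is shown below: each 64-bit value is split into four 16-bit "digits" with shifts and masks, and one stable counting-sort pass is made per digit. The constant and function names are made up for illustration, and the input is assumed to consist of unsigned integers below 2^64.

```python
import math

RADIX_BITS = 16
K = 1 << RADIX_BITS            # k = 2^16 possible values per digit
D = 64 // RADIX_BITS           # d = 4 digits per 64-bit number


def digit(x, i):
    """Return d_i, the i-th 16-bit digit of x (i = 0 is least significant)."""
    return (x >> (RADIX_BITS * i)) & (K - 1)


def radix_sort_u64(nums):
    """LSD radix sort with one stable counting-sort pass per 16-bit digit."""
    for i in range(D):
        counts = [0] * K
        for x in nums:                      # first pass over the data: count
            counts[digit(x, i)] += 1
        for d in range(1, K):
            counts[d] += counts[d - 1]
        out = [0] * len(nums)
        for x in reversed(nums):            # second pass over the data: place
            d = digit(x, i)
            counts[d] -= 1
            out[counts[d]] = x
        nums = out
    return nums


print(round(math.log2(1_000_000), 2))       # 19.93, the "about 20 passes"
sample = [2**63 + 5, 1, 2**48, 65536, 65535, 0, 2**64 - 1]
print(radix_sort_u64(sample) == sorted(sample))   # True
```

Each digit costs one counting pass and one placement pass over the data, which is where the 4 × 2 figure comes from.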