Transcript

Improve Run Generation• Overlap input,output, and internal CPU work.• Reduce the number of runs (equivalently, increase average

run length).

DISK

MEMORY

DISK

Internal Quick Sort6 2 8 5 11 10 4 1 9 7 3

Use 6 as the pivot (median of 3).

Input first, middle, and last blocks first.

In-place partitioning.

Input blocks from the ends toward the middle.

Sort left and right groups recursively.

Can begin output as soon as left most block is ready.

4 2 3 5 1 6 10 11 9 7 8

Alternative Internal Sort Scheme

DISK

DISK

B1 B2 B3

Partition into 3 buffers.

Steady State Operation

Read from disk

Write to disk

Run generation

• Synchronization is done when the active input buffer gets empty (the active output buffer will be full at this time).

DISK

MEMORY

DISK

New Strategy

• Use 2 input and 2 output buffers.

• Rest of memory is used for a min loser tree.

Input 1Input 0

Output 0 Output 1

Loser Tree

Steady State Operation

Read from disk

Write to disk

Run generation

• Synchronization is done when the active input buffer gets empty (the active output buffer will be full at this time).

4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8

4

3

8

O0 O1

I0 I1

Initialize

4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8

4

6

8

3

5

1

7

Initialize

O0 O1

I0 I1

4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8

4

6

8

3

5

3

7

1

6

2

9

Initialize

O0 O1

I0 I1

4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8

4

6

8

3

5

3

7

2

5

2

8

1

6

4

9

Initialize

O0 O1

I0 I1

4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8

4

6

8

3

5

3

7

2

5

5

8

1

6

4

9

Initialize

O0 O1

I0 I1

4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8

4

6

8

3

5

3

7

2

5

5

8

2

6

4

9

Initialize

O0 O1

I0 I1

Generate Run 1

14 3 6 8 5 7 3 2 6 9 4 5 2 5 8

4

6

8

3

5

3

7

2

5

5

8

2

6

4

9

O0 O1

I0 I1

3

5

4

Generate Run 1

4 3 6 8 5 7 3 2 6 9 4 5 2 5 8

4

6

8

3

5

3

7

2

5

5

8

2

6

4

9

O0 O1

I0 I1

3

5

4

1

3

3

5

O0

2

3

Generate Run 1

4 3 6 8 5 7 3 6 9 4 5 2 5 8

4

6

8

3

5

3

7

2

5

5

8

3

6

4

9

O1

I0 I1

3

5

4

1

5

4

45

O0

2

3

Generate Run 1

4 3 6 8 5 7 3 6 9 4 5

2

5 8

4

6

8

3

5

3

7

2

5

5

8

3

6

4

9

O1

I0 I1

3

5

4

1

5

4

5

O0

2

34 3 6 8 5 7 3 6 9 4 5

2

5 8

4

6

8

3

5

3

7

2

5

5

8

3

6

4

9

O1

I0 I1

1

5

4

4

1

9

2

5

O0

2

34 3 6 8 5 7 3 6 9 4 5

2

5 8

4

6

8

3

5

3

7

4

5

5

8

3

6

4

9

O1

I0 I1

1

5

4

4

1

9

2

Continue With Run 1

O1

3

45

O0

2

4 3 6 8 5 7 3 6 9 4 5

2

5 8

4

6

8

3

5

3

7

4

5

5

8

4

6

4

9

I0 I1

1

5

1

1

9

2

Continue With Run 1

1

5

1

O1

3

45

O0

2

4 3 6 8 5 7

3

6 9 4 5

2

5 8

4

6

8

3

5

3

7

4

5

5

8

4

6

4

9

I0 I1

1

5

1

1

9

2

Continue With Run 1

5

9

9

5

7

91

O1

3

45

O0

2

4

3

6 8 5 7

3

6 9 4 5

2

5 8

4

6

8

3

5

3

7

4

5

5

8

4

6

4

9

I0 I1

1

5

1

1

9

2

Continue With Run 1

5

9

5

7

2

91

O1

3

45

O0

4

3

6 8 5 7

3

6 9 4 5 5 8

4

6

8

3

5

3

7

4

5

5

8

4

6

4

9

I0 I1

5

1

6

1

3

5

9

5

7

2

91

O1

3

45

O0

4

3

6 8 5 7

3

6 9 4 5 5 8

4

6

8

3

5

3

7

4

5

5

8

4

6

4

9

I0 I1

5

1

6

1

3

5

9

5

7

2

Continue With Run 1

2

2 91

O1

3

45

O0

4

3

6 8 5 7

3

6 9 4 5 5 8

4

6

8

3

5

3

7

4

5

5

8

4

6

4

9

I0 I1

5

1

6

1

3

5

9

5

7

Continue With Run 1

2

6

6

5

2 91

O1

3

45

O0

4

3

6 8 5 7

3

6 9

4

5 5 8

4

6

8

3

5

3

7

4

5

5

8

4

6

4

9

I0 I1

5

1

6

1

3

5

9

5

7

Continue With Run 1

2

6

6

5

1

1

9

5

Buffer Reduction

• We may reduce the number of buffers to 3.

• At any time, 1 us used to read into, 1 to write from, and the 3rd both feeds the loser tree and is filled from the tree.

• The 3 physical buffers rotate through these three roles.

• Let k be number of external nodes in loser tree.

• Run size >= k.

• Sorted input => 1 run.

• Reverse of sorted input => n/k runs.

• Average run size is ~2k.

• Memory capacity = m records.

• Run size using fill memory, sort, and output run scheme = m.

• Use loser tree scheme. Assume block size is b records. Need memory for 4 buffers (4b records). Loser tree k = m – 4b. Average run size = 2k = 2(m – 4b). 2k >= m when m >= 8b.

• Assume b = 100.

m 600 1000 5000 10000

k 200 600 4600 9600

2k 400 1200 9200 19200

• Total internal processing time using fill memory, sort, and output run scheme = O((n/m) m log m) = O(n log m).

• Total internal processing time using loser tree = O(n log k).

• Loser tree scheme generates runs that differ in their lengths.

4 3 6 9

Merging Runs Of Different Length

4 3

6

9

7 15

22

7

13

22


Top Related