algorithms on secondary storage...
TRANSCRIPT
-
Worcester Polytechnic Institute
Algorithms on Secondary Storage“Sorting”
Chapter 13.
Professor E. Rundensteiner
CS542
-
Worcester Polytechnic Institute
Lesson: Using secondary storage effectively
─ Data too large to “live” in memory
─ “Regular” algorithms on small scale only
─ Must design algorithms for “large data”
─ I/O costs dominate – must reduce I/O
-
Worcester Polytechnic Institute
Why Sort?
• A classic problem in computer science!
• Data requested in sorted order via SQL:
─e.g., find students in increasing gpa order
• Sorting is first step in bulk loading B+ tree index.
• Sorting useful for eliminating duplicate copies in a collection of records
• Sort-merge join algorithm involves sorting.
• Problem: sort 1Gb of data with 1Mb of RAM.
─why not virtual memory?
-
Worcester Polytechnic Institute
2-Way Sort: Requires 3 Buffers
• PREPARE:
1 pass:
Read a page,
sort it, write it.
• MERGE:
Many passes:
Merge smaller
runs into
larger runs. Main memory buffer
INPUT 1
INPUT 2
OUTPUT
DiskDisk
Main memory buffer
INPUT /
OUTPUT
DiskDisk
-
Worcester Polytechnic Institute
Two-Way External Merge Sort
-
Worcester Polytechnic Institute
• Idea: Divide and conquer: sort subfiles and merge into larger sorts
Input file
1-page runs
2-page runs
4-page runs
8-page runs
PASS 0
PASS 1
PASS 2
PASS 3
9
3,4 6,2 9,4 8,7 5,6 3,1 2
3,4 5,62,6 4,9 7,8 1,3 2
2,3
4,6
4,7
8,9
1,3
5,6 2
2,3
4,4
6,7
8,9
1,2
3,5
6
1,2
2,3
3,4
4,5
6,6
7,8
-
Worcester Polytechnic Institute
Input file
1-page runs
2-page runs
4-page runs
8-page runs
PASS 0
PASS 1
PASS 2
PASS 3
9
3,4 6,2 9,4 8,7 5,6 3,1 2
3,4 5,62,6 4,9 7,8 1,3 2
2,3
4,6
4,7
8,9
1,3
5,6 2
2,3
4,4
6,7
8,9
1,2
3,5
6
1,2
2,3
3,4
4,5
6,6
7,8
Cost ?
-
Worcester Polytechnic Institute
Input file
1-page runs
2-page runs
4-page runs
8-page runs
PASS 0
PASS 1
PASS 2
PASS 3
9
3,4 6,2 9,4 8,7 5,6 3,1 2
3,4 5,62,6 4,9 7,8 1,3 2
2,3
4,6
4,7
8,9
1,3
5,6 2
2,3
4,4
6,7
8,9
1,2
3,5
6
1,2
2,3
3,4
4,5
6,6
7,8
Costs for pass :all pages
# of passes :height of tree
Total cost : product of above
-
Worcester Polytechnic Institute
Input file
1-page runs
2-page runs
4-page runs
8-page runs
PASS 0
PASS 1
PASS 2
PASS 3
9
3,4 6,2 9,4 8,7 5,6 3,1 2
3,4 5,62,6 4,9 7,8 1,3 2
2,3
4,6
4,7
8,9
1,3
5,6 2
2,3
4,4
6,7
8,9
1,2
3,5
6
1,2
2,3
3,4
4,5
6,6
7,8
• Each pass reads + writes each page in file.
• N pages in file => 2N
• Number of passes
• So total cost is:
log2 1N
2 12N Nlog
-
Worcester Polytechnic Institute
External Merge Sort
•What if we had more buffer pages?
•How do we utilize them wisely ?
- Two main ideas !
-
Worcester Polytechnic Institute
Phase 1 : Prepare
B Main memory buffers
INPUT 1
INPUT B
DiskDisk
INPUT 2
. . .. . .
• Construct as large as possible starter lists.
-
Worcester Polytechnic Institute
Phase 2 : Merge
Compose as many sorted sublistsinto one long sorted list.
B Main memory buffers
INPUT 1
INPUT B-1
OUTPUT
DiskDisk
INPUT 2
. . .. . .
-
Worcester Polytechnic Institute
Cost of External Merge Sort
•Cost = 2N * (# of passes)
•# of passes: 1 1 log /B N B
-
Worcester Polytechnic Institute
Example
• Buffer : with 5 buffer pages
• File to sort : 108 pages
─Pass 0:
▪ Size of each run?
▪ Number of runs?
─Pass 1:
▪ Size of each run?
▪ Number of runs?
─ etc ???
-
Worcester Polytechnic Institute
Example
• Buffer : with 5 buffer pages
• File to sort : 108 pages
─Pass 0: = 22 sorted runs of 5 pages each (last run is only 3 pages)
─Pass 1: = 6 sorted runs of 20 pages each (last run is only 8 pages)
─Pass 2: 2 sorted runs, 80 pages and 28 pages
─Pass 3: Sorted file of 108 pages
• Total I/O costs: ? 2*N * (4 passes)
108 5/
22 4/
-
Worcester Polytechnic Institute
Example
Buffer : with 5 buffer pages ; File to sort : 108 pages
Total IO costs
= 2 * N ( # of passes )
= 2N * (
= 2 * 108 * (1 + )
= 2 * 108 * (1 + log_4 ( 22 ))
= 2 * 108 * 4
5/1084log_
1 1 log /B N B
-
Worcester Polytechnic Institute
Number of Passes of External Sort
N B=3 B=5 B=9 B=17 B=129 B=257
100 7 4 3 2 1 1
1,000 10 5 4 3 2 2
10,000 13 7 5 4 2 2
100,000 17 9 6 5 3 3
1,000,000 20 10 7 5 3 3
10,000,000 23 12 8 6 4 3
100,000,000 26 14 9 7 4 4
1,000,000,000 30 15 10 8 5 4
- gain of utilizing all available buffers
- importance of a high fan-in during merging
-
Worcester Polytechnic Institute
Optimizing External Sorting
•Cost metric ?
─I/O only (until now)
─CPU is nontrivial, worth reducing
-
Worcester Polytechnic Institute
Optimizing Internal Sorting: Phase 1
. . .12
4
28
103
5
CURRENT SETINPUT
OUTPUT
➢1 input, 1 output, B-2 current set buffer
➢Main idea: repeatedly pick tuple in current set with smallest k
value that is still greater than largest k value in output buffer and
append it to output buffer
-
Worcester Polytechnic Institute
Internal Sort Algorithm
. . .124
2910
3
5
CURRENT SETINPUT
OUTPUT
8
-
Worcester Polytechnic Institute
Internal Sort Algorithm
. . .12
4
2
4
10
3
5
CURRENT SETINPUT
OUTPUT
9
8
-
Worcester Polytechnic Institute
Internal Sort Algorithm
. . .12
4
2
8
103
5
CURRENT SETINPUT
OUTPUT
➢Input & Output? new input page is read in, if it is fully consumed.
Output is written out, when it is full
➢When terminate current run? When all tuples in current set are
smaller than largest tuple in output buffer.
-
Worcester Polytechnic Institute
Analysis of Heapsort
• Fact: Average length of a run in heapsort is 2B
─The “snowplow” analogy
• Worst-Case:
─What is min length of a run?
─How does this arise?
• Best-Case:
─What is max length of a run?
─How does this arise?
• Quicksort is faster, but ...
B
-
Worcester Polytechnic Institute
Optimizing External Sorting: Phase 2
• Further optimization for external sorting:
─Blocked I/O
─Double buffering
-
Worcester Polytechnic Institute
Blocking of Buffer Pages: Ext. Sort
OUTPUT OUTPUT'
Disk
INPUT 1
INPUT k
INPUT 2
INPUT 1
INPUT 2
block size b
INPUT k
B main memory buffers
Disk
block size b
• Thus far : Do 1 I/O a page at a time
• But : Reading a block of pages sequentially is cheaper
• Make input/output a block of b pages
-
Worcester Polytechnic Institute
Example: Blocking for Merge Sort
• Idea : buffer blocks = b pages; 1 buffer block of b pages for input,
1 buffer block of b pages for outputthen we can merge |B-b/b| runs in that first pass
Example : 10 buffer pages
if one-page input and output buffer blocks,
then we produce 9 runs at a time
if two-page input and output buffer blocks,
then we produce 4 runs at a time
-
Worcester Polytechnic Institute
Blocking of Buffer Pages
OUTPUT OUTPUT'
Disk
INPUT 1
INPUT k
INPUT 2
INPUT 1
INPUT 2
block size b
INPUT k
B main memory buffers, k-way merge
Disk
block size b
Pro: Save total costs of “IOs”Cons: Reduce fan-out during merge passes
In practice, most files still sorted in 2-3 passes.
-
Worcester Polytechnic Institute
Double Buffering – Overlap CPU and I/O
• To reduce wait time for I/O request to complete, prefetch into `shadow block’.
OUTPUT
OUTPUT'
Disk Disk
INPUT 1
INPUT k
INPUT 2
INPUT 1'
INPUT 2'
INPUT k'
block sizeb
B main memory buffers, k-way merge
Potentially, more passesIn practice, most files still sorted in 2-3 passes.
-
Worcester Polytechnic Institute
Using B+ Trees for Sorting
• Scenario: Table to be sorted has B+ tree index on sorting column(s).
• Idea: Retrieve records in order by traversing leaf pages.
• Is this a good idea?
• Cases to consider:
─B+ tree is clustered Good idea!
─B+ tree is not clustered Bad idea!
-
Worcester Polytechnic Institute
Clustered B+ Tree Used for Sorting
• Traversal of tree from root to left-most leaf,
• then if Alt. 1, retrieve all leaf pages in order
• If Alt. 2, additional cost of retrieving actual data records
Always better than external sorting!
(Directs search)
Data Records
Index
Data Entries("Sequence set")
-
Worcester Polytechnic Institute
Unclustered B+ Tree Used for Sorting
• Alternative (2) for data; each data entry contains rid of a data record. In general, one I/O per data record!
(Directs search)
Data Records
Index
Data Entries("Sequence set")
-
Worcester Polytechnic Institute
External Sorting vs. Unclustered Index
N Sorting p=1 p=10 p=100
100 200 100 1,000 10,000
1,000 2,000 1,000 10,000 100,000
10,000 40,000 10,000 100,000 1,000,000
100,000 600,000 100,000 1,000,000 10,000,000
1,000,000 8,000,000 1,000,000 10,000,000 100,000,000
10,000,000 80,000,000 10,000,000 100,000,000 1,000,000,000
p: # of records per page (p=100 realistic)B=1,000 and block size=32 for sorting
-
Worcester Polytechnic Institute
Summary
• External sorting is important; DBMS may dedicate part of buffer pool for sorting!
• External merge sort minimizes disk I/O costs:
─In practice, # of phases rarely more than 3.
• Choice of internal sort algorithm may matter.
• Best sorts are wildly fast:
─Despite 40+ years of research, still improving!
• Clustered B+ tree is good for sorting; unclusteredtree is usually very bad.