DULO: An Effective Buffer Cache Management Scheme to Exploit Both Temporal and Spatial Localities
Song Jiang, Wayne State University
Feng Chen and Xiaoning Ding, Ohio State
Reported by wen
“Disk Wall” is a Critical Issue
Disk performance for workloads without dominant sequential accesses can be seriously degraded.
Throughput of accesses to sequentially placed disk blocks can be an order of magnitude higher than that of accesses to randomly placed blocks.
Unfortunately, spatial locality of cached blocks is largely ignored and only temporal locality is considered in system buffer cache management.
Sequential Locality is Unique in Disks
Sequential locality: disk accesses are fastest when made in sequence.
Disk speed is limited by mechanical constraints.
seek/rotation (high latency and power consumption)
The OS can guess the sequential disk layout, but its guess is not always right.
Our solution: DULO
Exploiting DUal LOcalities (DULO)
Temporal locality of program execution
Sequential locality of disk accesses
minimizing random disk accesses
Application independent approach
putting disk access information on OS map
Traditional efforts to break the disk bottleneck
1. Reduce disk accesses through memory caching
Replacement algorithms exploit temporal locality to reduce disk accesses for requested blocks.
Because the only objective is to reduce block misses, this approach fails to exploit blocks stored sequentially on a single disk track.
2. I/O request scheduling
Try to minimize seeks and thereby maximize global disk throughput, e.g. SSTF, C-SCAN.
Traditional efforts... cont.
3. Prefetching
The prefetching manager predicts future request patterns to fetch data in advance.
Conclusion:
I/O scheduling and prefetching are positioned at a level lower than the buffer cache, so they have limited ability to catch the opportunities lost in buffer cache management.
[Figure: I/O stack layers: application I/O requests, buffer cache (caching & prefetching), I/O scheduler, disk driver, disk]
DULO key operations
Forming sequences
A sequence is defined as a number of blocks whose disk locations are adjacent and have been accessed during a limited time period.
Sorting the sequences in the LRU stack according to their recency and size.
sequences of large recency and size are placed close to the LRU stack bottom
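The sequence definition above (blocks at adjacent disk locations, accessed within a limited time period) can be illustrated with a minimal sketch. The `Block` type, the `form_sequences` function, and the `max_gap` threshold are hypothetical names chosen for this example; the slides do not specify how the time window is measured, so a simple difference between access timestamps is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Block:
    lbn: int          # logical block number (disk location)
    last_access: int  # timestamp of the most recent access (assumed unit: ticks)


def form_sequences(blocks, max_gap):
    """Group blocks into runs of adjacent LBNs accessed close together in time.

    Assumption: two neighbouring blocks belong to the same sequence when their
    LBNs differ by exactly 1 and their access times differ by at most max_gap.
    """
    sequences = []
    for blk in sorted(blocks, key=lambda b: b.lbn):
        if sequences:
            prev = sequences[-1][-1]
            adjacent = blk.lbn == prev.lbn + 1
            close_in_time = abs(blk.last_access - prev.last_access) <= max_gap
            if adjacent and close_in_time:
                sequences[-1].append(blk)
                continue
        # Start a new sequence (a random block is a sequence of length 1).
        sequences.append([blk])
    return sequences


blocks = [Block(7, 100), Block(8, 101), Block(9, 102), Block(10, 500), Block(42, 103)]
for seq in form_sequences(blocks, max_gap=5):
    print([b.lbn for b in seq])
```

In this example block 10 is adjacent to block 9 on disk but was accessed much later, so it is not merged into the same sequence.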
Sequence Preview
Sequence the random disk block requested
DULO Design
1. Structuring LRU stack
2. Block Table: A data structure for Dual Locality
3. Forming Sequences
4. The DULO Replacement Algorithm
DULO Design: structuring LRU stack
[Figure: LRU stack layout with labels: staging section, evicting section, correlation buffer, sequencing bank]
The correlation buffer in DULO is similar to the correlation reference period used in the LRU-K replacement algorithm. Its size is fixed.
The sequencing bank is used to prepare a collection of blocks to be sequenced, and its size ranges from 0 to a maximum value, BANK MAX.
The blocks that are leaving the stack are sorted in the evicting section for a replacement order reflecting both their sequentiality and their recency.
DULO Design: Block Table, a data structure for tracking disk blocks
[Figure: block table indexed by the parts of an LBN, with leaf entries holding timestamps (time1, time2). Example: LBN 5140 = 0*512^2 + 10*512 + 20, giving indices 0, 10, 20]
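The LBN example above splits a block number into per-level indices, much like digits in base 512. A minimal sketch of that arithmetic follows; the function names and the defaults of three levels with a fan-out of 512 are assumptions taken from the single example on the slide, not a description of the actual implementation.

```python
def lbn_to_indices(lbn, fanout=512, levels=3):
    """Split an LBN into one index per table level, most significant first."""
    indices = []
    for _ in range(levels):
        lbn, digit = divmod(lbn, fanout)
        indices.append(digit)
    if lbn != 0:
        raise ValueError("LBN does not fit in a table with this many levels")
    return indices[::-1]


def indices_to_lbn(indices, fanout=512):
    """Inverse of lbn_to_indices: recombine per-level indices into an LBN."""
    lbn = 0
    for index in indices:
        lbn = lbn * fanout + index
    return lbn


print(lbn_to_indices(5140))         # [0, 10, 20]
print(indices_to_lbn([0, 10, 20]))  # 5140
```

Blocks with adjacent LBNs share all but the last index, so they land in neighbouring entries of the same leaf, which is what makes scanning for disk-adjacent blocks cheap.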
DULO Design: Identifying Long Disk Sequence
[Figure: block table lookup for identifying a disk sequence; labels: LBN, Block, entries N1 to N4 with access timestamps]
DULO Design: Identifying Long Disk Sequence (cont.)
[Figure: two example timestamp patterns, one marked "Sequence" and the other "Not a sequence"]
DULO Design: Identifying Long Disk Sequence (cont.)
[Figure: blocks that are "Continuously Accessed" versus "Not Continuously Accessed"; the latter are marked "Not a Sequence (Lacking Stability)"]
DULO design: Replacement Algorithm
Adapted GreedyDual algorithm: a global inflation value L, and a value H for each sequence
Calculate H values for sequences in sequencing bank:
H = L + 1 / Length( sequence )
Random blocks have larger H values
When a sequence (s) is replaced,
L = H value of s .
L increases monotonically and makes future sequences have larger H values
Sequences with smaller H values are placed closer to the bottom of LRU stack
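The rules above can be traced with a small sketch. The class name `EvictingSection` and its methods are hypothetical; sequences are modelled as plain lists of LBNs and the section is kept as a list sorted by H, which is a simplification for illustration and not how a kernel would maintain it.

```python
class EvictingSection:
    """Toy model of the adapted GreedyDual ordering described above."""

    def __init__(self):
        self.L = 0.0       # global inflation value
        self.entries = []  # (H, sequence) pairs, smallest H first (stack bottom)

    def add(self, sequence):
        """Assign H = L + 1 / Length(sequence) and place the sequence by H."""
        h = self.L + 1.0 / len(sequence)
        self.entries.append((h, sequence))
        self.entries.sort(key=lambda entry: entry[0])
        return h

    def evict(self):
        """Replace the sequence at the bottom and raise L to its H value."""
        h, sequence = self.entries.pop(0)
        self.L = h
        return sequence


section = EvictingSection()
print(section.add([1]))               # random block: H = 0 + 1 = 1.0
print(section.add([10, 11, 12, 13]))  # 4-block sequence: H = 0 + 0.25 = 0.25
print(section.evict())                # the long sequence goes first
print(section.L)                      # L is now 0.25
print(section.add([20]))              # a later random block: H = 0.25 + 1 = 1.25
```

Because L only grows, a block admitted later gets a larger H than an otherwise identical block admitted earlier, which is how recency is folded into the same ordering as sequence length.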
[Figure: LRU stack with L = L0; sequences are labeled H = L0 + 1 and H = L0 + 0.25]
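[Figure: LRU stack after a replacement, with L = L1; newer sequences are labeled H = L1 + 1 and H = L1 + 0.25, older ones H = L0 + 0.25 and H = L0 + 1]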
DULO-Caching Principles
Moving long sequences to the bottom of stack
replace them early, get them back fast from disks
Replacement priority is set by sequence length.
Moving LRU sequences to the bottom of stack
exploiting temporal locality of data accesses
Keeping random blocks in upper level stack
hold them: expensive to get back from disks.
Evaluation Results: extreme example
Evaluation Results: Mixed request patterns
Parameter discussion
• The (maximum) sequencing bank size
• The (minimum) evicting section size.
Evaluation Results: bank size
The optimal bank size is roughly in the range from 4 MB to 16 MB.
Bank Size
• A bank that is too small has little chance to form long sequences.
• Meanwhile, the bank size must be less than the evicting section size. A large bank causes the evicting section to be refilled too late, which forces the random blocks in it to be evicted.
Evaluation Results: eviction size
A larger evicting section usually means better performance, and the figure shows this trend.
The DULO Implementation: Linux Caching
• The Linux 2.6 kernel groups all the process pages and file pages into two LRU lists called the active list and the inactive list.
• We partition the inactive list into a staging section and an evicting section because the list is the place where new blocks are added and old blocks are replaced.
The DULO Implementation: Implementation Issues
• With two lists, both newly accessed pages and not recently accessed pages demoted from the active list might be added to the inactive list and possibly be sequenced in the same bank.
Related Work
• Algorithms such as 2Q, MQ, ARC, and LIRS focus only on how to better utilize temporal locality, to better predict the blocks to be accessed and minimize the page fault rate. None of these algorithms considers spatial locality.
• Prefetchers cannot change the I/O request stream the way the buffer cache does, but they can take advantage of the more sequential I/O request streams that result from the DULO cache manager.
• Misses on the two types of blocks are treated equally, without giving preference to random blocks.
Limitations
1. DULO attempts to provide random blocks with a caching privilege. DULO can do little to help when I/O requests access random blocks that have not been accessed for a long time.
2. We adapt DULO to the 2Q-like Linux replacement policy. How to adapt DULO to other advanced caching algorithms, and how DULO interacts with their replacement policies, remain to be understood.
3. Integration between caching and prefetching policies in the DULO scheme.
Conclusions
Disk performance is limited by
Non-uniform accesses: fast sequential, slow random
OS has limited knowledge of disk-layout: unable to effectively exploit sequential locality.
DULO can significantly improve I/O performance by exploiting both temporal and spatial locality.