![Page 1: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/1.jpg)
Ji-Yong Shin
Cornell University
In collaboration with Mahesh Balakrishnan (MSR SVC), Tudor Marian (Google),
Lakshmi Ganesh (UT Austin), and Hakim Weatherspoon (Cornell)
HotStorage Talk on June 13, 2012
Gecko: A Contention-Oblivious Design for Cloud Storage
![Page 2: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/2.jpg)
Cloud and Virtualization
• What happens to storage?
[Figure: four guest VMs running on a VMM over four shared disks; I/O streams are labeled SEQUENTIAL and RANDOM.]
![Page 3: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/3.jpg)
• Sequential streams are no longer sequential
– 1 to 8 VMs + EXT4 FS
– 4-disk RAID-0 setting
– Sequential Writer (256KB)
– Random Writer (4KB)
[Figure: two plots of throughput (MB/s, 0 to 350) against # of VMs. Left: "Sequential Writers Only" (1 to 8 VMs). Right: "Sequential Writers + 1 Random Writer" (1 to 7 VMs).]
![Page 4: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/4.jpg)
Existing Solutions for IO Contention?
• IO scheduling
– Entails increased latency for certain workloads
– May still require moving the disk head
• Workload placement
– Requires prior knowledge or dynamic prediction
– Limits freedom of placing VMs in the cloud
![Page 5: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/5.jpg)
Log-structured File System to the Rescue?
– Write everything as log to tail
– Perfect prediction for writes
– Assume reads are handled by cache
[Figure: writes appended at the log tail of a shared disk (Addr 0, 1, 2, …, N); the same layout under RAID0 + LFS, with a log tail on each shared disk.]
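The log-structured idea on this slide can be illustrated with a toy sketch. It is not the Gecko or LFS implementation: the class name `LogStructuredStore` and its fields are hypothetical, and the "disk" is just a Python list. It only shows that writes to arbitrary logical addresses land on consecutive physical addresses at the log tail.

```python
class LogStructuredStore:
    """Toy log-structured block store: every write appends at the log tail.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self, num_blocks):
        self.log = [None] * num_blocks  # physical address space 0..N-1
        self.tail = 0                   # next physical address to write
        self.l2p = {}                   # logical address -> physical address

    def write(self, logical_addr, data):
        if self.tail >= len(self.log):
            raise RuntimeError("log full: garbage collection needed")
        phys = self.tail
        self.log[phys] = (logical_addr, data)
        self.l2p[logical_addr] = phys   # any older copy becomes garbage
        self.tail += 1
        return phys

    def read(self, logical_addr):
        return self.log[self.l2p[logical_addr]][1]


store = LogStructuredStore(num_blocks=8)
# Logically random writes still land on consecutive physical addresses.
placements = [store.write(addr, f"data{i}") for i, addr in enumerate((70, 3, 41, 3))]
print(placements)     # [0, 1, 2, 3]
print(store.read(3))  # data3 (the newest copy wins)
```

The overwritten copy of logical block 3 at physical address 1 is left behind as garbage, which is what later has to be collected.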
![Page 6: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/6.jpg)
Challenges of Log-Structured File System
• Garbage collection is the Achilles’ Heel of LFS
[Figure: a single shared disk and a RAID0 + LFS array of shared disks, each with a log tail; "Garbage Collection (GC) from Log Head" is marked on the disks.]
![Page 7: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/7.jpg)
Challenges of Log-Structured File System
• Garbage collection is the Achilles’ Heel of LFS
– 2-disk RAID-0 setting of LFS
– GC under write-only workload
[Figure: "RAID 0 + LFS", throughput (MB/s, 0 to 250) against time (s, 0 to 120). Series: Aggregate Writes, Application Writes, GC Writes.]
![Page 8: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/8.jpg)
Summary of Challenges in the Cloud
• Server consolidation through cloud and virtualization
– The number of cores and VMs per server is increasing
– Storage is not yet maturely virtualized
• RAID cannot preserve high throughput
– IO performance varies depending on coexisting VMs
• LFS only solves write-write contention
– GC operation interferes with logging
– First-class reads can interfere with logging
![Page 9: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/9.jpg)
Rest of the Talk
• Gecko, a chain logging design
– Overview
– Caching reads
– Garbage collection strategies
– Metadata management
• Evaluation
• Summary
![Page 10: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/10.jpg)
Gecko: Chain logging Design
• Cutting the log tail from the body
– GC reads do not interrupt the sequential write
– 1 uncontended drive is much faster than N contended drives
[Figure: physical address space laid out across Disk 0, Disk 1, and Disk 2, with the log tail and "Garbage Collection from Log Head" (GC) marked.]
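A rough sketch of why chaining separates the tail from the body, compared with striping. The function names, the disk size, and the one-block stripe unit are assumptions made for illustration; the slides do not give the actual address layout.

```python
def chain_locate(phys_addr, blocks_per_disk):
    """Chained layout: disk 0 holds the first blocks_per_disk addresses, and so on."""
    return divmod(phys_addr, blocks_per_disk)  # (disk index, offset on disk)


def raid0_locate(phys_addr, num_disks):
    """RAID-0 layout with a one-block stripe unit (an assumption)."""
    return phys_addr % num_disks, phys_addr // num_disks


BLOCKS_PER_DISK = 1000  # hypothetical disk size in blocks
NUM_DISKS = 3

tail_writes = range(2500, 2504)  # consecutive writes at the log tail
gc_reads = range(300, 304)       # GC reading old data at the log head

chain_tail_disks = {chain_locate(a, BLOCKS_PER_DISK)[0] for a in tail_writes}
chain_gc_disks = {chain_locate(a, BLOCKS_PER_DISK)[0] for a in gc_reads}
raid_tail_disks = {raid0_locate(a, NUM_DISKS)[0] for a in tail_writes}
raid_gc_disks = {raid0_locate(a, NUM_DISKS)[0] for a in gc_reads}

print(chain_tail_disks, chain_gc_disks)  # {2} {0}
print(raid_tail_disks & raid_gc_disks)   # {0, 1, 2}
```

In the chained layout the tail writes and the GC reads touch disjoint disks, whereas under striping both streams hit every disk.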
![Page 11: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/11.jpg)
Gecko Overview and Properties
• Fault tolerance + Read performance
• No write-write contention, No GC-write contention
• Power saving w/o consistency concerns
[Figure: a primary chain (Disk 0, Disk 1, Disk 2) and a mirror chain (Disk 0’, Disk 1’, Disk 2’), with the log tail and reads marked; in the power-saving case Disk 0’ and Disk 1’ are off.]
![Page 12: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/12.jpg)
Gecko Caching
• What happens to reads going to tail drives?
[Figure: a chain of Disk 0, Disk 1, and Disk 2 with writes going to the tail; a tail cache (flash) is shown alongside, and reads are marked.]
Blocks at least 86% of reads from a real workload (500GB disk, 34GB cache).
Prevents first-class read-write contention.
Revival of LFS using Flash
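One way to picture the tail cache is as a bounded cache of the most recently written blocks, so that reads of recent data never reach the tail drive. The sketch below is an assumption-laden illustration: the `TailCache` class, the oldest-first eviction, and the `read_block` helper are hypothetical, and the slides do not state the actual caching policy.

```python
from collections import OrderedDict


class TailCache:
    """Bounded cache of the most recently written blocks (stand-in for flash)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # logical address -> data, oldest first

    def on_write(self, logical_addr, data):
        self.blocks.pop(logical_addr, None)  # re-insert so it becomes newest
        self.blocks[logical_addr] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the oldest write

    def lookup(self, logical_addr):
        return self.blocks.get(logical_addr)


def read_block(logical_addr, cache, read_from_disk):
    """Serve a read from the cache if possible, otherwise from disk."""
    data = cache.lookup(logical_addr)
    if data is not None:
        return data, "cache"
    return read_from_disk(logical_addr), "disk"


disk = {}
cache = TailCache(capacity=2)
for addr, data in [(1, "a"), (2, "b"), (3, "c")]:
    disk[addr] = data  # the write also goes to the log on disk
    cache.on_write(addr, data)

print(read_block(3, cache, disk.__getitem__))  # ('c', 'cache')
print(read_block(1, cache, disk.__getitem__))  # ('a', 'disk')
```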
![Page 13: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/13.jpg)
Gecko Garbage Collection (GC)
• Move-to-tail GC
– + Simple
– - GC shares write bandwidth
• Compact-in-body GC
– + GC is independent from writes
– - Complicates metadata management
[Figure: chain of Disk 0, Disk 1, and Disk 2 illustrating the two GC strategies.]
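A minimal sketch of one move-to-tail GC step, to make the "GC shares write bandwidth" trade-off concrete. The data layout (a list of `(logical_addr, data)` entries plus a logical-to-physical dictionary) and the function name are assumptions, and log wrap-around is ignored.

```python
def move_to_tail_gc(log, l2p, head, tail):
    """Reclaim the block at the log head; re-append it at the tail if still live.

    log: list of (logical_addr, data) or None, indexed by physical address
    l2p: dict mapping logical address -> current physical address
    Returns the new (head, tail). Hypothetical sketch; no wrap-around.
    """
    entry = log[head]
    if entry is not None:
        logical_addr, _data = entry
        if l2p.get(logical_addr) == head:  # live: this is the newest copy
            log[tail] = entry              # this write competes with application writes
            l2p[logical_addr] = tail
            tail += 1
        log[head] = None                   # space at the head is now free
    return head + 1, tail


log = [(7, "old"), (9, "a"), (7, "new"), None, None]
l2p = {7: 2, 9: 1}
head, tail = 0, 3

head, tail = move_to_tail_gc(log, l2p, head, tail)  # stale copy of block 7: dropped
head, tail = move_to_tail_gc(log, l2p, head, tail)  # live block 9: moved to the tail

print(log)         # [None, None, (7, 'new'), (9, 'a'), None]
print(head, tail)  # 2 4
```

The live block consumes a slot at the tail, which is the bandwidth an application write could have used; compact-in-body GC avoids this at the cost of more complex metadata.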
![Page 14: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/14.jpg)
Gecko Metadata and Persistence
• Primary map: less than 8 GB RAM for an 8 TB storage
• Inverse map: 8 GB flash for an 8 TB storage (flushed every 1024 writes)
[Figure: logical-to-physical map (in memory); data (in disk) across Disk 0, Disk 1, and Disk 2, with empty and filled regions and head and tail marked; physical-to-logical map (in flash), with head and tail marked. Labels: 4KB pages, 4-byte entries.]
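The map size quoted on this slide can be checked with simple arithmetic, using the 4KB page size and 4-byte entries from the figure. Treating 8 TB as decimal terabytes is an assumption; the slide does not say which unit is meant.

```python
PAGE_SIZE = 4 * 1024   # 4KB pages (from the slide)
ENTRY_SIZE = 4         # 4-byte map entries (from the slide)
STORAGE = 8 * 10**12   # 8 TB, assuming decimal terabytes

num_pages = STORAGE // PAGE_SIZE      # one map entry per 4KB page
map_bytes = num_pages * ENTRY_SIZE

print(num_pages)          # 1953125000
print(map_bytes / 10**9)  # 7.8125 (GB, i.e. less than 8 GB)
print(num_pages < 2**32)  # True: a 4-byte entry can index every page
```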
![Page 15: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/15.jpg)
Evaluation Setup
• In-kernel version
– Implemented as a block device for portability
– Similar to software RAID
– Move-to-tail GC
• User-level emulator
– For fast prototyping
– Runs block traces
– Compact-in-body GC
![Page 16: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/16.jpg)
Evaluation
• Performance under move-to-tail GC
– 2-disk Gecko chain, write-only workload
– GC does not affect aggregate throughput
[Figure: two plots of throughput (MB/s, 0 to 250) against time (s, 0 to 120), one for RAID 0 + LFS and one for Gecko. Series: Aggregate Writes, Application Writes, GC Writes.]
![Page 17: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/17.jpg)
Evaluation
• Performance under compact-in-body GC (CIB GC)
– Write-only workload is used
– Application throughput is not affected
[Figure: bar chart "Average Application Throughput" (MB/s, 0 to 120) comparing No GC and CIB GC.]
![Page 18: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/18.jpg)
Summary
• Log-structured designs
– Oblivious to write-write contention
– Sensitive to GC/read-write contention
• Gecko fixes the GC-write and read-write contention
– Separates the tail of the log from its body
– Flash re-enables log-structured designs
• Tail flash cache for read-write contention
• Small flash memory for persistence
![Page 19: Gecko: A Contention-Oblivious Design for the Cloudjyshin/talks/hotstorage12-shin.pdf · Gecko: Chain logging Design •Cutting the log tail from the body –GC reads do not interrupt](https://reader033.vdocument.in/reader033/viewer/2022053017/5f1ac2e34351343f9a008e3d/html5/thumbnails/19.jpg)
Future work
• Experiments with real workloads
• Exploration to minimize read-read contention
• IO handling policy inside Gecko