F2FS: A New File System for Flash Storage
Changman Lee, Dongho Sim, Joo-Young Hwang, and Sangyeun Cho
S/W Development Team Memory Business Samsung Electronics Co., Ltd.
Presenter: Jonggyu Park
Contents
• Introduction
  • Why LFS is good for Flash?
  • Drawbacks of Conventional LFS
• Design
  • Flash Friendly On-disk Layout
  • Efficient Index Structure
  • Multi-head Logging
  • Adaptive Logging
  • Recovery & fsync acceleration
• Evaluation
  • Experimental Setup
  • Mobile Benchmark
  • Server Benchmark
• Conclusion
Why LFS is good for Flash?
• Flash memory
  • Out-of-place update (no in-place update)
  • Random writes are harmful (they increase GC costs)
  • Sequential I/O is faster than random I/O
• Log-structured file system
  • Out-of-place update
  • Mostly sequential writes
[Figure: Updating block ‘A’ on LFS: block A is invalidated and a new block A’ is written.]
[Figure: Garbage collection in flash memory: valid blocks are copied from the victim block to a new block. Legend: valid block, invalid block.]
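
The two ideas on this slide, out-of-place update and garbage collection, can be sketched with a toy append-only log. This is an illustrative model only: the class `ToyFlashLog` and its methods are hypothetical names, and the sketch ignores page sizes, erase counts, and everything an actual FTL does.

```python
class ToyFlashLog:
    """Toy log: pages are only appended; an update invalidates the old copy."""

    def __init__(self):
        self.blocks = [[]]   # each block is a list of [name, payload, valid]
        self.where = {}      # name -> (block index, page index) of the live copy

    def _append(self, name, payload):
        block = self.blocks[-1]
        block.append([name, payload, True])
        self.where[name] = (len(self.blocks) - 1, len(block) - 1)

    def write(self, name, payload):
        # Out-of-place update: mark the old copy invalid, then write a new page.
        if name in self.where:
            b, p = self.where[name]
            self.blocks[b][p][2] = False
        self._append(name, payload)

    def collect(self, victim):
        # Garbage collection: copy valid pages to a new block, erase the victim.
        self.blocks.append([])
        copied = 0
        for name, payload, valid in self.blocks[victim]:
            if valid:
                self._append(name, payload)
                copied += 1
        self.blocks[victim] = []
        return copied


log = ToyFlashLog()
log.write("A", "v1")
log.write("B", "v1")
log.write("A", "v2")          # A is invalidated, A' is appended
copied = log.collect(0)       # only B and A' are still valid
print(copied, log.where)
```

The return value of `collect` is the copy cost of cleaning one victim block: the more valid pages remain in the victim, the more expensive the collection.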
Drawbacks of Conventional LFS
• HDD-optimized Layout
• Wandering Tree Problem (index structure)
• No data classification
• High cleaning costs under high utilization
• High ‘fsync’ overhead (a checkpoint for every single fsync)
F2FS (Flash-Friendly File System)
• HDD-optimized layout → Flash-aware layout
• Wandering tree problem (index structure) → Efficient index structure
• No data classification → Multi-head logging and data hot/cold separation
• High cleaning costs under high utilization → Adaptive logging
• High ‘fsync’ overhead (a checkpoint for every single fsync) → fsync acceleration
Flash Friendly On-disk Layout
• Flash-aware on-disk layout
  • FS metadata are in the random write zone
  • Main area is aligned to the zone size
  • Cleaning is performed in a unit of section (FTL’s GC unit)
• Layout units
  • Block: 4KB
  • Segment: 2MB
  • Section: n segments
  • Zone: m sections
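
As a worked example of the unit hierarchy, the sketch below maps a block number in the main area to its segment, section, and zone. The block and segment sizes come from the slide; the function name `locate` and the example values for n (segments per section) and m (sections per zone) are assumptions chosen for illustration.

```python
BLOCK_SIZE = 4 * 1024                  # 4KB, from the slide
SEGMENT_SIZE = 2 * 1024 * 1024         # 2MB, from the slide
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE   # 512 blocks per segment


def locate(block_no, segs_per_section, sections_per_zone):
    """Return the segment, section, and zone that contain a main-area block."""
    segment = block_no // BLOCKS_PER_SEGMENT
    section = segment // segs_per_section
    zone = section // sections_per_zone
    return {
        "offset_in_segment": block_no % BLOCKS_PER_SEGMENT,
        "segment": segment,
        "section": section,
        "zone": zone,
    }


# Hypothetical geometry: n = 4 segments per section, m = 2 sections per zone.
pos = locate(5000, segs_per_section=4, sections_per_zone=2)
print(pos)
```

With this geometry one section covers 4 × 2MB = 8MB, which is the amount of space cleaned at a time if the section size is matched to the FTL's GC unit.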
Flash Friendly On-disk Layout
| Type | Description |
|---|---|
| CP (Checkpoint) | File system info, bitmaps for valid NAT/SIT sets, and summary of current active segments |
| SIT (Segment Information Table) | Segment info such as valid block count and bitmap for the validity of all the blocks |
| NAT (Node Address Table) | Block address table for all the node blocks stored in the main area |
| SSA (Segment Summary Area) | Summary entries that contain the owner info of all the data and node blocks |
How to read data on F2FS
Reading /file
1) Obtain the ‘root’ inode through NAT
2) Search its data block for a directory entry named ‘file’
3) Translate the inode number to the address through NAT
4) Obtain the ‘file’ inode by reading the corresponding block
5) Obtain a direct node block address translated by NAT
6) Access the data block using the direct node block
[Figure: metadata areas (CP, SIT, NAT, SSA) and the main area holding the root inode, the ‘file’ inode, its direct node, and its data block; the animation highlights steps 1 to 6 in turn.]
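
The six steps above can be sketched with dictionaries standing in for the NAT and the main area. Everything here is a simplified assumption for illustration: the node IDs, block addresses, field names (`data_addr`, `direct_nid`, `data_addrs`), and the constant `ROOT_NID` are hypothetical and do not reflect the real on-disk format.

```python
ROOT_NID = 1   # hypothetical node ID of the root inode

# NAT: node ID -> block address of that node in the main area.
nat = {1: 100, 2: 200, 3: 300}

# Main area: block address -> block contents (toy representation).
main_area = {
    100: {"type": "inode", "data_addr": 101},       # root inode
    101: {"file": 2},                               # root dentries: name -> inode number
    200: {"type": "inode", "direct_nid": 3},        # inode of 'file'
    300: {"type": "direct", "data_addrs": [301]},   # direct node of 'file'
    301: b"hello",                                  # data block of 'file'
}


def read_file(name, block_index=0):
    root_inode = main_area[nat[ROOT_NID]]             # 1) root inode through NAT
    dentries = main_area[root_inode["data_addr"]]
    ino = dentries[name]                              # 2) find the directory entry
    inode_addr = nat[ino]                             # 3) inode number -> address
    file_inode = main_area[inode_addr]                # 4) read the file inode
    direct_addr = nat[file_inode["direct_nid"]]       # 5) direct node address via NAT
    direct_node = main_area[direct_addr]
    return main_area[direct_node["data_addrs"][block_index]]   # 6) data block


print(read_file("file"))
```

Every hop from one node to another goes through the NAT, which is the property the index-structure slides rely on.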
Efficient Index Structure
• Conventional LFS
[Figure: conventional LFS layout. SB and CP, followed by the log containing the inode map, the inode for the directory, the inode for the regular file, directory data, file data, direct and indirect pointer blocks, segment usage, and segment summary.]
Efficient Index Structure
• F2FS
[Figure: F2FS layout. SB, CP, NAT, SIT, and SSA, followed by the log containing the inode for the directory, the inode for the regular file, directory data, file data, direct nodes, and indirect nodes.]
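
The difference between the two index structures can be illustrated by counting which blocks must be rewritten when one file data block is updated. This is a rough sketch under simplifying assumptions: the helper names (`Log`, `update_conventional`, `update_f2fs`) are hypothetical, and the NAT update is modeled as a dictionary assignment rather than a block write.

```python
class Log:
    """Append-only log that records which blocks get written."""

    def __init__(self):
        self.next_addr = 0
        self.written = []

    def append(self, name):
        addr = self.next_addr
        self.next_addr += 1
        self.written.append(name)
        return addr


def update_conventional(log, path_to_root):
    # Each parent stores its child's physical address, so relocating the
    # child forces a rewrite of the parent, all the way up the index.
    addr = None
    for name in path_to_root:
        addr = log.append(name)
    return addr


def update_f2fs(log, nat, direct_nid):
    # Parents refer to the direct node by node ID; only its NAT entry changes.
    log.append("file data")
    nat[direct_nid] = log.append("direct node")


lfs_log = Log()
update_conventional(lfs_log, ["file data", "direct pointer block",
                              "indirect pointer block", "inode", "inode map"])

f2fs_log, nat = Log(), {7: None}
update_f2fs(f2fs_log, nat, direct_nid=7)
print(len(lfs_log.written), len(f2fs_log.written))
```

In the address-pointer model the update "wanders" up through every level of the index, while in the NAT model it stops at the direct node.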
Multi-head Logging
• Hot/Cold separation (‘>’ reads as ‘hotter than’)
  • Node > Data
  • Direct node > Indirect node
  • Directory > Regular file
• Zone-aware allocation
[Figure: zone-blind allocation vs. zone-aware allocation. In both panels, segments grouped into zones are mapped to flash blocks through the FTL mapping.]
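
One way to turn these orderings into a log-selection rule is sketched below. The slide states only the relative hotness; the three-level hot/warm/cold split per block type follows the six-log scheme of the F2FS paper, and the function name `classify` and its parameters are hypothetical.

```python
from enum import Enum


class Temp(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


def classify(kind, is_dir=False, indirect=False, moved_by_cleaning=False):
    """Pick the log (block type, temperature) for a block about to be written."""
    if kind == "node":
        if indirect:
            return ("node", Temp.COLD)            # indirect nodes change rarely
        return ("node", Temp.HOT if is_dir else Temp.WARM)
    if kind == "data":
        if is_dir:
            return ("data", Temp.HOT)             # directory entry blocks
        if moved_by_cleaning:
            return ("data", Temp.COLD)            # survived cleaning, likely cold
        return ("data", Temp.WARM)                # ordinary user data
    raise ValueError(f"unknown block kind: {kind}")


print(classify("node", is_dir=True))
print(classify("data", moved_by_cleaning=True))
```

Writing blocks of similar temperature into the same log tends to make whole segments become invalid together, which lowers the amount of valid data a cleaner has to copy.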
Adaptive Logging
• To reduce cleaning costs at high utilization, F2FS uses adaptive logging
• Append logging (logging to clean segments)
  • Needs cleaning operations if there are no free segments
  • Cleaning causes mostly random reads and sequential writes
• Threaded logging (logging to dirty segments)
  • Reuses invalid blocks in dirty segments
  • No cleaning needed
  • Causes random writes
[Figure: segments in two zones with invalid blocks scattered among valid ones. Threaded logging writes data into invalid blocks.]
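
A minimal sketch of the switch between the two policies follows. The threshold parameter `min_free_segments` and both function names are assumptions for illustration; the slide only says that append logging needs cleaning once free segments run out.

```python
def pick_policy(free_segments, min_free_segments=1):
    """Append to clean segments while enough exist, otherwise reuse dirty ones."""
    return "append" if free_segments >= min_free_segments else "threaded"


def threaded_write(segment, payloads):
    """Fill invalid slots (None) of a dirty segment; return the slots written."""
    pending = list(payloads)
    written = []
    for i, block in enumerate(segment):
        if not pending:
            break
        if block is None:                 # invalid block: safe to overwrite
            segment[i] = pending.pop(0)
            written.append(i)
    return written


dirty = ["a", None, "b", None, None, "c"]   # None marks an invalid block
slots = threaded_write(dirty, ["x", "y"])
print(pick_policy(free_segments=0), slots, dirty)
```

The slots returned by `threaded_write` are not contiguous, which is the random-write cost of threaded logging; in exchange no valid block had to be copied.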
Recovery and ‘fsync’ Acceleration
• Recovery
  • Checkpoint and roll-back
• ‘fsync’ acceleration
  • On fsync, direct node blocks are written with an fsync mark
  • No need to create a checkpoint
  • After a crash, fsynced blocks are compared with the old blocks
Recovery and ‘fsync’ Acceleration
[Figure: metadata areas (SB, CP, NAT, SIT, SSA) and the log holding dir1, file1, and file2 with their node blocks, the checkpoint, and then the new file2 data with its new node block carrying the fsync mark.]
1. Create dir1, file1, and file2
2. Create checkpoint
3. File2 update and fsync
4. Sudden power off
Recovery:
5. Roll back to the latest stable checkpoint
6. Roll forward to file2’s fsynced data
7. Create new checkpoint
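
The walkthrough above can be sketched as a replay over a toy log. This is a simplification under stated assumptions: the record format, the function name `recover`, and the extra un-fsynced write to file1 are hypothetical, and the real roll-forward compares node blocks rather than copying whole records.

```python
def recover(log, checkpoint_pos):
    """Rebuild file state after a crash from a toy log of write records.

    log            -- list of dict records in write order
    checkpoint_pos -- number of records covered by the last stable checkpoint
    """
    state = {}
    # Roll back: trust only what the latest stable checkpoint covers.
    for rec in log[:checkpoint_pos]:
        state[rec["name"]] = rec["data"]
    # Roll forward: replay later records only if they carry the fsync mark.
    for rec in log[checkpoint_pos:]:
        if rec.get("fsync"):
            state[rec["name"]] = rec["data"]
    # A new checkpoint would be written here to make the result stable.
    return state


log = [
    {"name": "dir1", "data": "v1"},
    {"name": "file1", "data": "v1"},
    {"name": "file2", "data": "v1"},
    # checkpoint taken here (checkpoint_pos = 3)
    {"name": "file1", "data": "v2"},                  # written, never fsynced
    {"name": "file2", "data": "v2", "fsync": True},   # fsynced before power off
]
state = recover(log, checkpoint_pos=3)
print(state)
```

The fsynced update to file2 survives without any checkpoint having been written for it, while the un-fsynced update to file1 is discarded by the roll-back.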
Evaluation Setup
• Hardware & Software Specs

| Target | System | Storage Devices |
|---|---|---|
| Mobile | CPU: Exynos 5410; Memory: 2GB; OS: Linux 3.4.5; Android: JB 4.2.2 | eMMC 16GB (2GB partition) |
| Server | CPU: Intel i7-3770; Memory: 4GB; OS: Linux 3.14; Ubuntu 12.10 server | SATA SSD 250GB; NVMe SSD 960GB |

• Summary of benchmarks
Mobile Benchmark
• In iozone-RW, F2FS performs 3.1x better than ext4
  • In F2FS, more than 90% of writes are sequential
• F2FS reduces the write amount per fsync in the SQLite workload
  • F2FS reduces the amount of data writes by about 46% over ext4
• F2FS reduces the elapsed time by 20% (Facebook) and 40% (Twitter) compared with ext4
Server Benchmark
• On SATA SSD, F2FS performs
  • 2.5x better than ext4 on the varmail benchmark
  • 16% better than ext4 on the OLTP benchmark
• On PCIe SSD, F2FS performs
  • 1.8x better than ext4 on the varmail benchmark
  • 13% better than ext4 on the OLTP benchmark
Adaptive Logging Performance
• Adaptive logging gives graceful performance degradation under highly aged conditions
• Fileserver test on SATA SSD (94% util.)
  • Performance improvement: 2x/3x over ext4/btrfs
• IOzone test on eMMC (100% util.)
  • Performance is similar to ext4
Conclusion
• F2FS contributions
  • Flash-friendly on-disk layout
  • Efficient index structure
  • Multi-head logging
  • Adaptive logging
  • Recovery and fsync acceleration
• Evaluation
  • F2FS outperforms other file systems on various benchmarks
  • F2FS turns random writes into sequential writes
  • F2FS reduces fsync overheads
  • Adaptive logging relieves cleaning overheads
Thank you