![Page 1: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/1.jpg)
Toolbox for Dimensioning Windows Storage Systems
Jalil Boukhobza, Claude Timsit
12/09/2006
PRiSM laboratory, Versailles Saint Quentin University
![Page 2: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/2.jpg)
PRiSM Lab / University of Versailles, 12/09/2006
Outline
- Introduction
- Overview of the Windows I/O subsystem architecture
- The developed tools
  - I/O benchmarking
  - Storage parameter extraction
  - I/O simulation
- Summary
![Page 3: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/3.jpg)
Introduction
- The Windows I/O system is poorly studied.
  - CreateFile(): many file access modes
  - Different caching algorithms
  - Big performance fluctuations for a given workload (ratio of 2 to 10)
- Disk subsystems are built independently of the OS.
  - Interactions with the OS are not easily predictable.

What is the performance of a given workload on a given system architecture for a defined I/O strategy on Windows systems, and how can it be optimized?
![Page 4: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/4.jpg)
Overview of the Windows I/O system architecture
Different file access modes in the CreateFile() function:
- Without using the file system cache: no buffer mode (FILE_FLAG_NO_BUFFERING)
- Using the file system cache: sequential, normal, and write through modes (FILE_FLAG_SEQUENTIAL_SCAN, FILE_ATTRIBUTE_NORMAL, FILE_FLAG_WRITE_THROUGH)
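As a rough illustration of how these access modes map onto CreateFile(), the sketch below lists the Win32 constant behind each mode. The numeric values are the ones defined in the Windows SDK headers; the dictionary and the helper names (`ACCESS_MODES`, `flags_for`, `uses_file_system_cache`) are hypothetical and exist only for this example. The value returned by `flags_for` is what would be passed as the `dwFlagsAndAttributes` argument of CreateFile().

```python
# Win32 constants for the dwFlagsAndAttributes argument of CreateFile().
FILE_ATTRIBUTE_NORMAL = 0x00000080
FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000
FILE_FLAG_NO_BUFFERING = 0x20000000
FILE_FLAG_WRITE_THROUGH = 0x80000000

# Hypothetical lookup table: access mode name -> flag value.
ACCESS_MODES = {
    "normal": FILE_ATTRIBUTE_NORMAL,
    "sequential": FILE_FLAG_SEQUENTIAL_SCAN,
    "no_buffer": FILE_FLAG_NO_BUFFERING,
    "write_through": FILE_FLAG_WRITE_THROUGH,
}


def flags_for(mode):
    """Return the flag value to pass to CreateFile() for an access mode."""
    return ACCESS_MODES[mode]


def uses_file_system_cache(mode):
    """Only the no buffer mode bypasses the file system cache."""
    return not (flags_for(mode) & FILE_FLAG_NO_BUFFERING)


if __name__ == "__main__":
    for name in ACCESS_MODES:
        print(f"{name:14s} flags=0x{flags_for(name):08X} "
              f"cached={uses_file_system_cache(name)}")
```

Keeping the mapping in one table makes it easy for a test harness to loop over every mode with the same workload.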
[Figure: Windows I/O path. Components shown: I/O request, FastIO, Cache Manager, page miss, Virtual Memory Manager, File System Driver, Storage Device Driver, Storage Device]
![Page 5: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/5.jpg)
Developed tools
1. I/O benchmarking
- All Windows file access modes (Win32 CreateFile()): normal, sequential, random, no buffer, write through
- Request sizes
- Sequential / random / interleaved accesses
- Flexible test file selection (zoning)
- Control over test file fragmentation (file defragmenter and mover)
- Results:
  - I/O throughputs
  - Response times
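The two result metrics listed above can be illustrated with a small, portable sketch. This is not the authors' benchmarking tool: it uses plain Python file I/O, so `buffering=0` only disables Python's own buffer and does not select any of the Win32 access modes. The function name `run_read_benchmark` and its parameters are assumptions made for this example.

```python
import os
import random
import tempfile
import time


def run_read_benchmark(path, request_size, pattern="sequential", seed=0):
    """Read a file once in fixed-size requests and time every request."""
    file_size = os.path.getsize(path)
    offsets = list(range(0, file_size - request_size + 1, request_size))
    if pattern == "random":
        random.Random(seed).shuffle(offsets)  # same requests, random order

    response_times = []
    total_bytes = 0
    with open(path, "rb", buffering=0) as f:
        for offset in offsets:
            start = time.perf_counter()
            f.seek(offset)
            data = f.read(request_size)
            response_times.append(time.perf_counter() - start)
            total_bytes += len(data)

    elapsed = sum(response_times)
    throughput = total_bytes / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
    return {
        "requests": len(offsets),
        "bytes": total_bytes,
        "response_times_s": response_times,
        "throughput_mb_s": throughput,
    }


if __name__ == "__main__":
    # Build a small throwaway test file, then read it in 64 KB requests.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (1024 * 1024))
    result = run_read_benchmark(tmp.name, 64 * 1024, pattern="random")
    print(result["requests"], "requests,", round(result["throughput_mb_s"], 1), "MB/s")
    os.remove(tmp.name)
```

Recording one response time per request, rather than only a total, is what makes per-request plots possible.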
[Figure: Read transfer rates for different modes. x-axis: request sizes (KB); y-axis: transfer rates (MB/sec); curves: 1 - no buffer, 2 - normal, 3 - sequential]
[Figure: Normal mode read. x-axis: request number; y-axis: response times (ms); series: 64 KB and 320 KB requests]
![Page 6: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/6.jpg)
2. Storage parameter extraction
- Configuration parameters: partly provided by manufacturers (zoning information, cache segment size and number).
- Performance parameters: measured to discover the real application performance, which may differ from the advertised peak performance (seek times, memory-to-memory and disk-cache-to-memory throughput, etc.).
- Disk cache algorithms: this tool helps users identify the algorithms in use (e.g. read ahead and lazy write).
![Page 7: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/7.jpg)
Example
[Figure: No buffer mode read of 64 KB. x-axis: request number; y-axis: response time (ms). Annotations: per-request response time; seek times; study the periodicity to find the track size]
Cache segment size
- Read a block of size T from the disk.
- Re-read that block: if it is entirely loaded from the disk cache, then segment size ≥ T, so increment T; otherwise decrement T.
- Empty the cache.
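The probing loop above can be sketched against a toy model. Everything here is an assumption made for illustration: `SimulatedDisk` stands in for a real drive and reports a cache hit directly, whereas on real hardware a hit would have to be inferred from the response time. The names `reread_hits` and `estimate_segment_size`, the starting size, and the step are hypothetical.

```python
class SimulatedDisk:
    """Toy disk whose cache holds a single segment (illustration only)."""

    def __init__(self, segment_size_kb):
        self._segment_size_kb = segment_size_kb
        self._cached = None  # (start_kb, length_kb) held in the segment

    def flush_cache(self):
        self._cached = None

    def read(self, start_kb, length_kb):
        """Return True when the whole range is served from the disk cache."""
        if self._cached is not None:
            c_start, c_len = self._cached
            if c_start <= start_kb and start_kb + length_kb <= c_start + c_len:
                return True
        # Miss: the segment keeps only as much of the data as fits in it.
        self._cached = (start_kb, min(length_kb, self._segment_size_kb))
        return False


def reread_hits(disk, t_kb):
    """Read T KB, re-read it, empty the cache, and report whether the re-read hit."""
    disk.read(0, t_kb)
    hit = disk.read(0, t_kb)
    disk.flush_cache()
    return hit


def estimate_segment_size(disk, start_t_kb=64, step_kb=4, max_t_kb=4096):
    """Increment T while re-reads hit, decrement it while they miss."""
    disk.flush_cache()
    t = start_t_kb
    if reread_hits(disk, t):
        while t + step_kb <= max_t_kb and reread_hits(disk, t + step_kb):
            t += step_kb
        return t  # largest probed T that was fully cached
    while t > step_kb:
        t -= step_kb
        if reread_hits(disk, t):
            return t
    return None  # nothing was ever served from the cache


if __name__ == "__main__":
    print(estimate_segment_size(SimulatedDisk(segment_size_kb=256)))
```

The estimate is only as fine as the step: the result is the largest probed size that fit, not necessarily the exact segment size.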
Disk cache updating algorithms
Generally simple algorithms (LRU, FIFO, LFU, etc.). Once the segment size is known, they can be tested by issuing different sequences of block reads and then re-reading the blocks to see which one is served from the disk (and so has been ejected from the cache).
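One possible read sequence for telling LRU from FIFO is sketched below on a toy cache. This is an illustrative probe, not the sequences used by the authors' tool: `SimulatedSegmentCache` and `identify_policy` are hypothetical names, the cache reports hits directly instead of through timing, and LFU is not covered.

```python
from collections import OrderedDict


class SimulatedSegmentCache:
    """Toy disk cache with n segments and a hidden replacement policy."""

    def __init__(self, n_segments, policy):
        self.n_segments = n_segments
        self.policy = policy  # "LRU" or "FIFO"
        self._segments = OrderedDict()

    def flush(self):
        self._segments.clear()

    def read(self, block):
        """Return True on a cache hit, False when the disk must be accessed."""
        if block in self._segments:
            if self.policy == "LRU":
                self._segments.move_to_end(block)  # refresh recency on a hit
            return True
        if len(self._segments) >= self.n_segments:
            self._segments.popitem(last=False)  # eject the head of the queue
        self._segments[block] = True
        return False


def identify_policy(cache, n_segments):
    """Fill the cache, re-touch the oldest block, overflow, then see who left."""
    if n_segments < 2:
        raise ValueError("at least two segments are needed to tell the policies apart")

    def ejected_after_probe(candidate):
        cache.flush()
        for block in range(n_segments):
            cache.read(block)          # fill every segment
        cache.read(0)                  # touch the oldest block again
        cache.read(n_segments)         # a new block forces one ejection
        return not cache.read(candidate)  # a miss means it was ejected

    if ejected_after_probe(0):
        return "FIFO"  # the re-touched block was ejected anyway
    if ejected_after_probe(1):
        return "LRU"   # the least recently used block was ejected
    return "unknown"


if __name__ == "__main__":
    print(identify_policy(SimulatedSegmentCache(4, "LRU"), 4))
```

Each candidate is checked in a fresh run because re-reading a block changes the cache state and would disturb the next check.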
![Page 8: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/8.jpg)
3. The I/O simulation tool (WinIOSim)
Goals:
- Application optimization: identifying the best I/O strategy for an application I/O workload on a given architecture.
- Hardware optimization: finding the optimal hardware configuration for a given I/O workload.
What’s new?
- Implementation of Windows-specific cache algorithms, depending on the access modes, identified by reverse engineering work.
  - Specific sequences of I/O requests issued by the system and application processes for each access mode.
- Disk subsystem reactions to these algorithms.
  - Specific reactions for specific sequences (issued by the file system cache), depending on the disk.
![Page 9: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/9.jpg)
The WinIOSim architecture
[Figure: WinIOSim architecture. Modules shown: I/O generator 1, I/O generator 2, trace file, application process, system process, process memory, file system cache, I/O scheduler, disk cache, disk]
![Page 10: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/10.jpg)
The WinIOSim modules
- I/O generators:
  - Workload generator
    - Request types, sizes, number, inter-arrival times, access modes, requested addresses, etc.
    - Different possible distributions for each parameter (Poisson, uniform, exponential, etc.) thanks to OMNeT++.
    - Implemented request criticality (synchronous and asynchronous requests).
  - Trace files extracted using Filemon.
- Process memory and file system cache modules: simulating the data copy operations, updating policies, etc.
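To make the workload generator parameters concrete, here is a minimal stand-alone sketch. WinIOSim itself relies on OMNeT++ for its distributions; this example uses Python's `random` module instead, and the `Request` fields, the `generate_workload` name, and every default value are assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Request:
    arrival_ms: float   # absolute arrival time
    op: str             # "read" or "write"
    offset_kb: int      # requested address within the test file
    size_kb: int        # request size
    synchronous: bool   # request criticality


def generate_workload(n, mean_interarrival_ms=5.0, sizes_kb=(64, 128, 320),
                      file_size_kb=1024 * 1024, read_ratio=0.7,
                      sync_ratio=0.5, seed=0):
    """Draw n requests: exponential inter-arrival times, uniform addresses."""
    rng = random.Random(seed)
    now = 0.0
    requests = []
    for _ in range(n):
        now += rng.expovariate(1.0 / mean_interarrival_ms)
        size = rng.choice(sizes_kb)
        offset = rng.randrange(0, file_size_kb - size + 1)
        op = "read" if rng.random() < read_ratio else "write"
        requests.append(Request(now, op, offset, size, rng.random() < sync_ratio))
    return requests


if __name__ == "__main__":
    for request in generate_workload(5):
        print(request)
```

Seeding the generator keeps a synthetic workload reproducible, so the same request stream can be replayed against different simulated architectures.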
![Page 11: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/11.jpg)
The WinIOSim modules (2)
- Application and system processes:
  - Request flow control.
  - Request grouping and splitting.
  - File system cache prefetching algorithms, lazy write and write through algorithms depending on access modes.
  - Both communicate to issue the final request sequence (as seen by the disk).
- Different buses: simulating bus throughput, delays, sharing.
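Request splitting and grouping can be pictured with the sketch below. It is not taken from WinIOSim: the 64 KB block size is borrowed from the examples elsewhere in this presentation, and the function names `split_request` and `group_requests` are hypothetical.

```python
def split_request(offset_kb, size_kb, block_kb=64):
    """Cut one request into pieces that never cross a block boundary."""
    if size_kb <= 0 or block_kb <= 0:
        raise ValueError("sizes must be positive")
    pieces = []
    position = offset_kb
    end = offset_kb + size_kb
    while position < end:
        next_boundary = (position // block_kb + 1) * block_kb
        stop = min(next_boundary, end)
        pieces.append((position, stop - position))
        position = stop
    return pieces


def group_requests(pieces):
    """Merge (offset, size) pieces that are contiguous on disk."""
    merged = []
    for offset, size in sorted(pieces):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            merged[-1] = (merged[-1][0], merged[-1][1] + size)
        else:
            merged.append((offset, size))
    return merged


if __name__ == "__main__":
    print(split_request(0, 320))
    print(group_requests(split_request(32, 100)))
```

Splitting on aligned boundaries means an unaligned request yields a short first and last piece, which grouping can later merge back into one contiguous request.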
![Page 12: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/12.jpg)
The WinIOSim modules (3)
- I/O scheduler: controls the flow of requests to the disk subsystem.
  - Queuing system: FIFO, SCAN, LOOK.
- Disk: mapping, zoning, spare area, number of platters, seek times, rotational speed, head switching times, track and cylinder skew, etc.
- Disk cache: segmentation, read ahead algorithms, lazy write and write through algorithms, cache updating policies, etc.
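The three queuing disciplines named above can be compared with a short sketch. These are the textbook versions, not necessarily what WinIOSim implements: requests are plain cylinder numbers, the head is assumed to start by sweeping toward higher cylinders, and the `schedule` function and its parameters are hypothetical.

```python
def schedule(requests, head, policy, max_cylinder=None):
    """Return (service_order, total_head_movement) for a queue of cylinders."""
    if policy == "FIFO":
        order = list(requests)
        path = list(order)
    elif policy in ("SCAN", "LOOK"):
        # Sweep upward first, then reverse and serve what was left behind.
        up = sorted(c for c in requests if c >= head)
        down = sorted((c for c in requests if c < head), reverse=True)
        order = up + down
        path = list(up)
        if policy == "SCAN" and down:
            if max_cylinder is None:
                raise ValueError("SCAN needs max_cylinder")
            path.append(max_cylinder)  # SCAN travels to the disk edge before reversing
        path += down
    else:
        raise ValueError(f"unknown policy: {policy}")

    movement = 0
    position = head
    for cylinder in path:
        movement += abs(cylinder - position)
        position = cylinder
    return order, movement


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]
    for policy in ("FIFO", "SCAN", "LOOK"):
        order, movement = schedule(queue, head=53, policy=policy, max_cylinder=199)
        print(policy, order, movement)
```

SCAN and LOOK serve requests in the same order here; they differ only in how far the head travels before reversing, which shows up in the total movement.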
![Page 13: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/13.jpg)
Simulator’s file system cache strategies
Windows prefetching algorithms:
- No buffer mode
- Read operations:
  - Sequential mode: loading data sequentially (Bn, Bn+1, Bn+2, Bn+3, etc.)
  - Normal mode (default)
    - System process: one requested block: 3 blocks of 64 KB
[Figure: blocks B1, B2, B3, B4 with sub-blocks numbered 1, 2, 3. Legend: 64 KB block loaded by the system process; 64 KB block loaded by the application process]
The final sequence of request blocks is: B1,1, B1,2, B1,3, B3,1, B2,1, B3,2, B2,2, B3,3, B2,3, B4,1, B4,2, ...
What are the disk cache reactions? Will it load part of this data?
![Page 14: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/14.jpg)
Simulator’s file system cache strategies (2)
Write operations:
- Sequential and normal modes: for one request, some blocks are flushed to the disk and the others are written to the file system cache (and flushed to the disk later on).
- Write through mode: each written block -> file system cache -> disk cache -> disk + modification of a system file on the disk -> acknowledge.
Example with a 320 KB request size:
[Figure: requests Req 1 to Req 8 shown against the file system cache and the disk, with a flush from the cache to the disk. Legend: 64 KB block copied to the disk; 64 KB block copied to the file system cache and flushed to the disk later on]
[Figure: Normal mode write, 320 KB. x-axis: request number; y-axis: response time (ms)]
![Page 15: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/15.jpg)
Configuration of the simulator
Inputs
- I/O generator configuration
  - Modeled by the user
  - Real I/O traces
- The simulated architecture definition. If existing:
  - Obtained from manufacturers (rarely complete)
  - Obtained using the WIOTester parameter extraction tool we developed
Outputs
- Response times and throughputs (the 2 main metrics for I/Os)
- The different states of all the modules at each stage of the simulation
![Page 16: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/16.jpg)
Validation of the simulator
Measurements (SQLio & WIOTester) vs. simulation (WinIOSim)
- For read operations:
  - Sequential access with “no buffer” mode and “normal” mode
  - Random access with “no buffer” mode and “normal” mode
- For write operations:
  - “Normal” mode
  - “No buffer” mode
![Page 17: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/17.jpg)
Tested architectures
![Page 18: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/18.jpg)
Validation results
[Figure: six panels, each plotting the percentage of error between measures and simulations against request sizes (KB) for the Dell, Asus, and HP architectures. The two sequential read panels use a 0 to 50 scale; the other panels use a 0 to 10 scale.]
- Reading sequential data with the no buffer mode
- Reading random data with the no buffer mode
- Reading sequential data with the normal mode
- Reading random data with the normal mode
- Writing data with the no buffer mode
- Writing data with the normal mode
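The slides do not spell out how the percentage of error is computed. A common definition, assumed here, is the absolute difference between the measured and simulated values relative to the measured value. The function name `percent_error` and the numbers in the usage lines are hypothetical and are not results from the presentation.

```python
def percent_error(measured, simulated):
    """Relative error of a simulated value against a measured one, in percent."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return 100.0 * abs(measured - simulated) / abs(measured)


if __name__ == "__main__":
    # Hypothetical throughputs in MB/sec, for illustration only.
    measured_mb_s = 40.0
    simulated_mb_s = 38.0
    print(f"{percent_error(measured_mb_s, simulated_mb_s):.1f}% error")
```

Normalizing by the measured value treats the measurement as the reference, which matches the idea of validating a simulator against real hardware.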
![Page 19: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/19.jpg)
Summary
- Efficient tool for Windows I/O system performance prediction and optimization.
  - Based on the complementarity of measurements and simulations.
- Flexible and dedicated I/O benchmarking tool.
- I/O parameter extraction tool.
- Very accurate and flexible simulations of the whole Windows I/O system, from application to disk (<10% error).
- Simulation of the interactions between the modules, for example the file system cache and the disk cache.
![Page 20: Toolbox for Dimensioning Windows Storage Systems](https://reader036.vdocument.in/reader036/viewer/2022082517/56813d34550346895da6f764/html5/thumbnails/20.jpg)
Thank you!
Questions?
www.prism.uvsq.fr/~jboukh