iops ÿ g+ busy o io '! ' ¦ + x
TRANSCRIPT
iops busy IO
†
iops IO size read/write IOIOiops
iops iopsOS 1iops IO busy
IO3
IO
Method of guaranteeing IO performancethat uses IO busy rate for each iops.
Kazuichi Oe†
It is well known that maximum performance (iops) of a hard disk is influenced by accessposition and range on the volume, the distribution of IO size, the ratio of read/write and etc..In order to guarantee the IO performance for the specified application, it is needed to monitorthe current maximum IO performance dynamically. We propose the method of guaranteeingthe IO performance by using the Rate of IO busy for each iops. The Rate of IO busy foreach iops can be calculated from statistical information of the operating system. We haveverified the effectiveness of this method by using the real samba workload, backup workloadand the mixture of them
1.
IO read/write
seek
IO
OS
OS IO
RAID
†
Fujitsu Laboratories Ltd.
OS
OS
IO
• IO busy IO busy iops
• IO busy busy
IO busy iops
busy
IO
2
3
4
IO busy IO
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan73
ComSys20112011/12/1
5 6
1 IO
iops
PC
PC
IO
2.
IO
PC
Facade controller
1) SLO
service-level-objective Facade controller
SLO request read/write
Facade controller
iops
Facade controller
SLO
IO queue
iops
VM OS IO
VM PC 1
2) PC
IO share IO share
PC IO PC
read/write PC
flow control flow
control PC read/write IO
share PC IO queue PC
PC IO queue
IO share IO
PC VM SFQ VM
IO queue
1) IO queue
VM
iops
OS IO
Linux Block IO Controller3) Linux
Block IO Controller IO queue CFQ
IO queue VM
IO
2) 1) IO
queue
IO queue
4) Linux XFS
SGI IRIX IO
8)
XFS
IO
IO
IO 5) 6)
3.
• IO size read/write
• IO
• IO IO
OS
IO size size
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan74
ComSys20112011/12/1
• IO busy iops
– IO busy IO busy iops
– IO busy busy
IO busy iops
• IO busy
• IO busy =100%
IO
IO
RAID
RAID 64KB stripe
size stripe size *
IO
IO RAID
1/( )
•
Samba
1000
Veritas VxFS 500GB
7)
•
NetVault
2
2TB 500GB A
500GB B
A B A B
(A+B) iops %util svctm
%util Linux IO busy
svctm 1IO
iostat svctm iops
iops = 1/svctm
1
http://www.bakbone.co.jp/products/netvault.html
1 IO size
Fig. 1 IO size distribution used to experiment
3 4
A
iops
B A
90% 89% A B
iops
A 70% 69% A
B
A+B
30%
3 4
A 239iops 65iops
3.7
1
Table 1 Storage device environment
PC FUJITSU PRIMERGY RX100S4
Celeron 3.06 GHz,Mem=2GB,CentOS 5.3
disk Newtech Evolution II 400GB
(SATA 2U)
6disk/2TB RAID5
(stripe size=64KB)
2
Table 2 Workload used to experiment
IO size 1 128KB
read:write 1 50:50
offset 2 0-30GB, 300-330GB
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan75
ComSys20112011/12/1
2 IO offset
Fig. 2 IO offset distribution used to experiment
3 IO offset svctm
Fig. 3 Relation between IO offset and svctm (file sharing)
iostat kernel block
device IO queue IO CPU
%util
%util=100 IO
%util iops
%util iops %util
%util iops
%util 100%
iops
%util busy
5 busy
A B
90% A+B 95%
B 80%
2.6.9-67
ll rw blk.c
4 IO offset svctm
Fig. 4 Relation between IO offset and svctm (backup)
5
Fig. 5 Summary of workload analysis result
A A+B 85%
busy
iops %util
iops IO
4. IO busy IO
IO IO busy
busy
busy IO busy iops
IO busy iops
1iops busy
1iops IO busy
IO
IO busy busy
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan76
ComSys20112011/12/1
IO busy IO
IO busy IO busy
6
busy
1iops IO busy
1iops IO busy
IO
iops IO busy
1iops busy
1iops busy
IO busy
IO busy
IO busy busy
IO busy busy
busy
OS
IO busy busy
6 IO
busy 100% IO busy
busy IO
busy busy IO busy
iops
IO busy 100% 4
busy 80-85%
2-3iops IO busy 100%
IO busy busy
IO busy =100%
IO busy busy
2
IO busy
• IO busy IO busy
• 1iops busy 1iops
busy
IO
3 80%-95%
95%
6
Fig. 6 Workload information obtained when operating it
IO
IO
A a iops
B A B
IO busy X% X IO
busy busy
iops IO busy
1iops IO busy
busy per iops = IO busy /iops
Linux iostat -x
busy per iops = %util/(r/s + w/s)
1iops IO busy
B
iops B A
iops
( 1 ) a iops IO busy
a busy = a * busy per iops
( 2 ) B IO busy
b busy = X - a busy
( 3 ) b util B
b iops
b = b busy / busy per iops
( 4 ) B b iops
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan77
ComSys20112011/12/1
(4) 5
IO1 IO
IO b 1
IO 1
IO IO
5
kernel driver
IO busy
A B IO busy 100%
IO busy
IO busy
100%
100%
IO busy 10
3 2100% IO busy
IO
busy New X X New X
New X X x
New X
New X = X - x
7 12 busy
New X 12 x
x
• IO busy =100 iops(=C)
• IO busy 12 iops
IO busy 3
IO busy 12
1 1 1iops
IO busy
12 =
2 2
IO busy 100%
B
18 thread
2 20 603
5 B IO busy 100%
IO busy 100%
IO kernel
A B IO
B kernel
IO IO busy
IO busy =100%
A B iops 20 % B
IO busy 100%
7 1
1iops IO busy =c
2 1iops IO busy
=d d 12 1iops
IO busy 10 d Ciops
x 1
10 IO busy =(c-d)*C
x = (c-d)*C*y (y=0 ∼ 1)
New X
New X = X - (c-d)*C*y (y=0 ∼ 1)
New X
New X
y
y 4
busy New X
New X busy
y
New X
5
y=0.5 0.5
New X
IO busy busy
)
IO busy IO busy
1iops busy
1iops busy
2
busy
IO busy
New X IO busy
IO busy IO busy
4 IO busy = busy
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan78
ComSys20112011/12/1
7 busy
Fig. 7 Method of updating proportion limit busy rate
20%
IO
3 busy
iops IO busy
busy
1iops IO busy
iops
1IO
3 svctmA+B
iops=80 svctm=5.68ms
iops=160 svctm=5.72 svctm
0.78% iops 80
160 IO
1iops
IO busy
1iops IO
busy iops
5.
3 1
8
• A IO thread
– thread
10thread IO
3 500GB
IO ( 9 )
– thread
8
Fig. 8 Outline of verification system
4 5 6
thread
• B IO thread
– thread
10thread IO
3 500GB
IO ( 9 )
– thread
4 5 6
thread
thread
• thread
– B
– IO busy =95%
A B
3 3
1 2 3 4
5 6
3 thread A,B
Table 3 Workload given to thread A and B
A B
1
2
3
20 iostat -x
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan79
ComSys20112011/12/1
9
Fig. 9 How to give load
1 10 4
11 %util=100
A 50iops
A
B iops 7
2 %util=100
B iops
11
%util=100 3 500
IO busy 63%
%util=100 3 700
thread A 50iops
1 IO busy
2
960 A 5iops B
4 1 iops
Table 4 Target iops of evaluation 1
A iops B iops ( )
5 50 5
50 50 10
5 50 10
5 2 iops
Table 5 Target iops of evaluation 2
A iops B iops ( )
10 200 5
100 200 10
50 200 10
6 3 iops
Table 6 Target iops of evaluation 3
A iops B iops ( )
5 30 5
30 30 10
5 30 10
4 X
10 1
Fig. 10 Outcome of an experiment of evaluation 1
11 1 %util=100
Fig. 11 Outcome of an experiment of evaluation 1
(frequency to which %util=100 is observed)
A B %util
50% IO busy
%util/iops 5iops
0.8 1.3
IO busy
2 12 5
13 %util=100
A 100iops
A iops
B
IO busy
87% A 100iops
1 IO busy
3 14 6
15 %util=100
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan80
ComSys20112011/12/1
12 2
Fig. 12 Outcome of an experiment of evaluation 2
13 2 %util=100
Fig. 13 Outcome of an experiment of evaluation 2
(frequency to which %util=100 is observed)
A 100iops
2
3 1 IO busy
1 2
3 A thread IO
1 IO busy
2
IO busy IO busy
IO busy
1iops IO busy
1iops IO busy
1
14 3
Fig. 14 Outcome of an experiment of evaluation 3
15 3 %util=100
Fig. 15 Outcome of an experiment of evaluation 3
(frequency to which %util=100 is observed)
6.
IO busy iops
busy
busy IO busy
iops
IO busy IO busy
1iops IO busy
IO busy busy
3
IO
IO busy
2
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan81
ComSys20112011/12/1
1 IO
busy
7.
IT
1) Christopher R.Lumb, Arif Merchant, and
Guillermo A.Alvarez. Facade: virtual storage
devices with performance guarantees. Proceed-
ings of FAST’03, March 2003.
2) Ajay Gulati, Irfan Ahmad, and Carl A. Wald-
spurger. PARDA: Proportional Allocation of
Resources for Distributed Storage Access. Pro-
ceedings of FAST’09, February 2009.
3) http://sourceforge.jp/projects/sfnet ioband/
4)
HOKKE-2009
5)
SPA
2005
6) Kyung D.Ryu, Jeffery K. Hollingsworth, and
Peter J. Keleher. Efficient Network and I/O
Throttling for Fine-Grain Cycle Steamling. In
Proceedings of the ACM/IEEE SC2001 Con-
ference (SC ‘01), November 2001
7) : iops
IO Busy IO
8 SAC-
SIS2010, May 2010.
8) SGI IRIX Admin: Disks and Filesystems(
), 007-2825-009JP
コンピュータシステム・シンポジウム C om puter System Sym posium
ⓒ 2011 Information Processing Society of Japan82
ComSys20112011/12/1