DiPerF: automated DIstributed PERformance testing Framework
Catalin Dumitrescu, Ioan Raicu, Matei Ripeanu, Ian Foster
Distributed Systems Laboratory, Computer Science Department
University of Chicago
4/18/2004 DiPerF: automated DIstributed PERformance testing Framework 2
Introduction
• Goals
  – Simplify and automate distributed performance testing
    • grid services
    • web services
    • network services
  – Define a comprehensive list of performance metrics
  – Produce accurate client views of service performance
  – Create analytical models of service performance
• Framework implementation
  – Grid3
  – PlanetLab
  – NFS-style cluster (UChicago CS Cluster)
Framework
• Coordinates a distributed pool of machines
  – Tested with over 100 machines
  – Scalable to 1000s of machines
• Controller
  – Receives the address of the service and the client code
  – Distributes the client code across all machines in the pool
  – Gathers, aggregates, and summarizes performance statistics
• Tester
  – Receives the client code
  – Runs the code and produces performance statistics
  – Sends the statistics report back to the controller
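The tester's role can be sketched as a simple loop: run the shipped client code repeatedly for a fixed duration and record one timing sample per iteration. This is an assumed structure for illustration only, not the actual DiPerF code; `client_cmd` and the 0.2-second duration are hypothetical placeholders.

```python
import subprocess
import time

def run_tester(client_cmd, duration_sec):
    """Sketch of a DiPerF-style tester loop (assumed structure):
    repeatedly run the client code sent by the controller and
    record the elapsed time of each iteration."""
    samples = []
    deadline = time.time() + duration_sec
    while time.time() < deadline:
        start = time.time()
        # Run one iteration of the client code against the service.
        subprocess.run(client_cmd, shell=True, check=False,
                       capture_output=True)
        samples.append(time.time() - start)  # per-iteration elapsed time
    return samples  # shipped back to the controller for aggregation
```

The samples a tester returns would then be timestamp-translated and merged by the controller, as described on the following slides.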
Architecture Overview
[Diagram: the controller sends the client code to each tester; each tester runs the code against the service, takes performance measures, and sends statistics back to the controller.]
Time Synchronization
• Time synchronization is needed at the testers so data can be aggregated at the controller
  – Distributed approach:
    • each tester uses the Network Time Protocol (NTP) to synchronize its clock
  – Centralized approach:
    • the controller synchronizes via time translation
    • uses the network RTT to estimate the offset between tester and controller
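The centralized approach can be sketched as follows. Assuming the network delay is symmetric, half the round-trip time approximates the one-way latency, so the tester's clock reading can be compared against the controller's clock at the probe midpoint. `probe_fn` is a hypothetical stand-in for whatever call fetches the tester's local timestamp; this is an illustrative sketch, not the DiPerF implementation.

```python
import time

def estimate_offset(probe_fn):
    """Estimate the clock offset of a tester relative to the controller.

    probe_fn (hypothetical) sends a probe to the tester and returns the
    tester's local timestamp. Assumes symmetric network delay, so
    RTT / 2 approximates the one-way latency.
    """
    t_send = time.time()        # controller clock at send
    tester_time = probe_fn()    # tester's local clock reading
    t_recv = time.time()        # controller clock at receive
    rtt = t_recv - t_send
    # Offset = tester clock minus controller clock at the probe midpoint.
    return tester_time - (t_send + rtt / 2)

def translate(tester_timestamp, offset):
    """Map a tester-local timestamp onto the controller's timeline."""
    return tester_timestamp - offset
```

With the offset cached per tester, the controller can translate every timestamp in a tester's report onto its own timeline before aggregation.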
Metric Aggregation
[Diagram: n testers start at different points on a shared timeline between START and FINISH; at a given moment one tester has run 3600 seconds (10000 iterations), another 1000 seconds (2000 iterations), another 500 seconds (200 iterations), and one has just started (0 seconds, 0 iterations). The controller must aggregate these staggered reports onto a single timeline.]
Performance Metrics
• service response time
  – time from when a client issues a request to when it is completed, minus the network latency and minus the execution time of the client code
• service throughput
  – number of jobs issued per second and completed successfully by the end of the test time
• service utilization
  – percentage of service resource utilization throughout the entire test, per client
• service balance among clients
  – ratio between the number of jobs completed and service utilization, per client
• service load
  – number of concurrent clients per second accessing the service
• network latency to the service
  – time taken for a minimum-sized packet to traverse the network from the client to the service
• time synchronization error
  – real time difference between client and service, measured as a function of network latency
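The first two definitions translate directly into arithmetic; a minimal sketch (helper names are illustrative, not from the DiPerF source):

```python
def service_response_time(t_request, t_complete,
                          network_latency, client_exec_time):
    """Response time per the slide's definition: total elapsed time
    minus network latency and minus the client code's own execution
    time. All arguments in seconds."""
    return (t_complete - t_request) - network_latency - client_exec_time

def service_throughput(jobs_completed, test_duration_sec):
    """Jobs completed successfully per second over the test."""
    return jobs_completed / test_duration_sec
```

For example, a request issued at t=0 and completed at t=5 with 0.5 s of network latency and 1 s of client-side execution yields a 3.5 s service response time.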
Services Tested
• HTTP
  – client used “wget” to invoke a program via CGI over HTTP on an Apache HTTP server
• GT2 GRAM
  – job submission via the Globus Gatekeeper 2.4.3 from Globus Toolkit 2.4
• GT3.02
  – simple factory-based grid service; creates a new service instance and invokes a method on it, without security features enabled
• GT3.2
  – identical to the GT3.02 test, except that GT3.2 was an alpha release
• MonALISA
  – grid monitoring web service
GT2.4: Response Time, Load, and Throughput (over time)
[Chart: x-axis Time (sec), 0–6500; left y-axis Response Time (sec) / Load (Clients Concurrently Running), 0–200; right y-axis Throughput (Jobs / Sec), 0–4. Series: Service Response Time (sec), Load (Clients Concurrently Running), and Throughput (Jobs / Sec), each with a fitted approximation.]
GT2.4: Service Utilization (per Machine)
[Chart: x-axis Machine ID, 0–100; left y-axis Ratio of Jobs Completed vs. Service Utilization, 0–1400; right y-axis Total Service Utilization %, 0.0%–5.0%. Series: Ratio (Jobs Completed / Service Utilization) and Total Service Utilization, each with a fitted approximation.]
GT2.4: Average Load and Jobs Completed (per Machine)
[Chart: x-axis Machine ID, 0–100; y-axis Average Aggregate Load (per machine), 60–85. Series: Jobs Completed.]
Analytical Model
• Model performance characteristics
  – used to estimate a service’s performance based on the service load
    • throughput
    • service response time
• Dynamic resource allocator
  – maintain QoS while maximizing resource utilization
• Polynomial approximations (x = service load):
  – Throughput:
    y = -3·10⁻²²·x⁶ + 6·10⁻¹⁸·x⁵ - 5·10⁻¹⁴·x⁴ + 2·10⁻¹⁰·x³ - 2·10⁻⁷·x² + 0.0001·x + 0.3212
  – Service response time:
    y = -2·10⁻¹⁹·x⁶ + 4·10⁻¹⁵·x⁵ - 3·10⁻¹¹·x⁴ + 10⁻⁷·x³ - 0.0003·x² + 0.2545·x + 9.1001
• Neural networks
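The fitted polynomials can be evaluated directly; a small sketch, with the coefficients transcribed from the slide (highest power first) and x taken to be the service load:

```python
# Degree-6 fit coefficients from the slide, highest power first.
THROUGHPUT = [-3e-22, 6e-18, -5e-14, 2e-10, -2e-7, 1e-4, 0.3212]
RESPONSE   = [-2e-19, 4e-15, -3e-11, 1e-7, -3e-4, 0.2545, 9.1001]

def poly(coeffs, x):
    """Evaluate a polynomial via Horner's rule."""
    y = 0.0
    for c in coeffs:
        y = y * x + c
    return y
```

At zero load the fits reduce to their constant terms (throughput 0.3212 jobs/sec, response time 9.1001 sec), which gives a quick sanity check on the transcription.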
Contributions & Future Work
• Contributions
  – Analytical models for resource managers
  – Service capacity
  – Scalability study
  – Resource distribution among clients
  – Accurate client views of service performance
  – How network latency affects service performance
• Future Work
  – Verify the analytical models
    • polynomial approximations
    • neural networks
  – Test more services
References
• Presentation Slides
  – http://people.cs.uchicago.edu/~iraicu/research/documents/uchicago/cs33340/diperf_presentation.pdf
• Report
  – http://people.cs.uchicago.edu/~iraicu/research/documents/uchicago/cs33340/diperf_report.pdf
• References
  – http://www.globus.org
  – http://www.ivdgl.org
  – http://www.planet-lab.org
  – http://www.ntp.org
  – http://xenia.media.mit.edu/~nelson/research/ntp-survey99/html/