April 19, 2023
University of Dortmund
On Grid Performance Evaluation using Synthetic Workloads
JSSPP 2006
Alexandru Iosup, Dick Epema
PDS Group, ST/EWI, TU Delft
Carsten Franke, Alexander Papaspyrou, Lars Schley, Baiyi Song, and Ramin Yahyapour
University of Dortmund
Outline
• A Short Introduction to Grid Computing
• On Grid Performance Evaluation
  • Experimental Environments
  • Performance Indicators
  • General Workload Modeling
  • Grid-Specific Workload Modeling
  • The GrenchMark Framework
• Future Work
• Conclusions
A Short Introduction to Grid Computing
• Typical grid environment
  • Applications [!]
    • Unitary, composite
    • Data
  • Resources
    • Compute (Clusters)
    • Storage
    • (Dedicated) Network
  • Virtual Organizations, Projects
    • Groups, Users
• Grids vs. parallel production environments
  • Dynamic
  • Heterogeneous
  • Very large-scale (world)
  • No central administration
→ Most resource management problems are NP-hard
Experimental Environments: Real-World Testbeds
• Real-World Testbed
  • DAS, NorduGrid, Grid3/OSG, Grid’5000…
• Pros
  • True performance, also shows “it works!”
  • Infrastructure in place
• Cons
  • Time-intensive
  • Exclusive access (repeatability)
  • Controlled environment problem (limited scenarios)
  • Workload structure (little or no realistic data)
  • What to measure (new environment)
Experimental Environments: Simulated and Emulated Testbeds
• Simulated and Emulated Testbeds
  • GridSim, SimGrid, GangSim, MicroGrid …
  • Essentially a trade-off of precision vs. speed
• Pros
  • Exclusive access (repeatability)
  • Controlled environment (unlimited scenarios)
• Cons
  • Synthetic Grids: What to generate? How to generate? Clusters, Disks, Network, VOs, Groups, Users, Applications, etc.
  • Workload structure (little or no realistic data)
  • What to measure (new environment)
  • Validity of results (accuracy vs. time)
Grid Performance Evaluation: Current Practice
• Performance Indicators
  • Define my own metrics, or use utilization (U) and AWT/ART, or both
• Workload Structure
  • Run my own workload, or use traces that are not validated by peer researchers; do not make comparisons!
  • Run benchmarks from typical parallel production environments
  • Mostly an “all users are created equal” assumption
Need a common performance evaluation framework for Grid
Grid Performance Evaluation: Current Issues
• Performance Indicators
  • What are the metrics for the new environment?
• Workload Structure
  • Which general aspects are important?
  • Which Grid-specific aspects need to be addressed?
Need a common performance evaluation framework for Grid
Performance Indicators
• Time-, Resource-, and System-Related Metrics
  • Traditional: utilization, A(W)RT, A(W)WT, A(W)SD
  • New: waste, fairness (or service quality reliability)
• Workload Completion and Failure Metrics
  “In Grids, functionality may be even more important than performance”
  • Workload Completion (WC)
  • Task and Enabled Task Completion (TC, ETC)
  • System Failure Factor (SFF)
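The traditional indicators named on this slide can be sketched in a few lines. The block below is only an illustration, not the authors' definition: the `Job` record, the function names, and the choice of resource consumption (processors × runtime) as the weight in the "W" variants are assumptions made here.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: float  # submission time, seconds
    start: float   # start of execution, seconds
    end: float     # completion time, seconds
    procs: int     # number of processors used


def weighted_mean(values, weights):
    """Mean of values, each weighted by the matching entry in weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def traditional_metrics(jobs, total_procs):
    """Compute utilization, AWT/ART/ASD and weighted AWWT/AWRT/AWSD.

    Assumes every job has a positive runtime (end > start).
    """
    waits = [j.start - j.submit for j in jobs]
    responses = [j.end - j.submit for j in jobs]
    runtimes = [j.end - j.start for j in jobs]
    slowdowns = [resp / run for resp, run in zip(responses, runtimes)]

    # Assumption: the weight of a job is its resource consumption.
    areas = [j.procs * run for j, run in zip(jobs, runtimes)]
    equal = [1.0] * len(jobs)

    # Observation window: first submission to last completion.
    window = max(j.end for j in jobs) - min(j.submit for j in jobs)

    return {
        "utilization": sum(areas) / (total_procs * window),
        "AWT": weighted_mean(waits, equal),
        "ART": weighted_mean(responses, equal),
        "ASD": weighted_mean(slowdowns, equal),
        "AWWT": weighted_mean(waits, areas),
        "AWRT": weighted_mean(responses, areas),
        "AWSD": weighted_mean(slowdowns, areas),
    }
```

For two jobs `Job(0, 0, 10, 2)` and `Job(0, 10, 20, 4)` on a 4-processor system, this yields a utilization of 0.75 and an AWT of 5 seconds. The weighted variants give the larger job more influence.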
General Aspects for Workload Modeling
• User/Group/VO model
  • Detailed modeling for top-5/10 users, then clustering (use squash area to group)
• Submission patterns
  • Yearly, monthly, weekly, daily
  • Do daily patterns exist? (Are Grids truly global?)
• Temporal patterns
  • Repeated submission (batches of jobs)
  • Job dependencies (composite applications common in Grid(?))
• Feedback
  • Empirical rules (don’t submit jobs when the system is busy). But: reactive submission tools, co-allocators, evolving applications, etc.
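One possible reading of the user-model bullet is sketched below: rank users by squash area (taken here as processors × runtime summed over a user's jobs), keep the top few for detailed modeling, and lump the rest together. The function name, the tuple layout of the input, and the single "rest" bucket are assumptions for illustration; a real study would cluster the remaining users rather than merge them.

```python
from collections import defaultdict


def group_users_by_squash_area(jobs, top_k=5):
    """Split users into the top_k consumers and an aggregated remainder.

    jobs: iterable of (user, procs, runtime_seconds) tuples.
    Returns (top, rest): top maps each top user to their squash area,
    rest is the summed squash area of all other users.
    """
    area = defaultdict(float)
    for user, procs, runtime in jobs:
        area[user] += procs * runtime

    # Largest consumers first.
    ranked = sorted(area.items(), key=lambda item: item[1], reverse=True)
    top = dict(ranked[:top_k])
    rest = sum(a for _, a in ranked[top_k:])
    return top, rest
```

Calling `group_users_by_squash_area(trace, top_k=10)` on a parsed trace would give the candidates for detailed per-user models, plus the volume left for clustering.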
Grid-Specific Workload Modeling: Computation Management
• Processor co-allocation
  • Fixed, non-fixed, semi-fixed jobs
• Job flexibility
  • Moldable, evolvable, flexible, *-ble…
• Other aspects
  • Background load: define top jobs (by consumption), model the rest as background load
  • Project stage
Grid-Specific Workload Modeling: Data Management
• Clearly Defined I/O Requirements
  • Files, streams, …
  • Data location and size
• Replicas
  • Replica location
• Other aspects
  • …
Grid-Specific Workload Modeling: Network Management
• Clearly Defined Network Requirements
  • Bandwidth, latency, …
  • Communication pattern
• Special Situations
  • Dedicated paths, other QoS
• Other aspects
  • Background load
Grid-Specific Workload Modeling: Locality/Origin Management
• Job issuer and execution site
  Not all VOs are created equal!
  • Two-level view: Which VO generates the next job? Within a VO, which user generates the next job?
  • Three-level view, multi-level view (Project, VO, Group, User)
• (Usage) Service Level Agreements
  • Use my system 50% for 7 days, or 20% for 30 days
  • Dedicated paths, other QoS
• Other aspects
  • Background load pertaining to same (u)SLA
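The two-level view can be sketched as two successive weighted draws: first pick the VO, then pick a user within that VO. The function name, the dictionary layout, and the use of fixed weights are assumptions for illustration; the slides do not prescribe how the probabilities are obtained.

```python
import random


def next_job_issuer(vo_weights, user_weights, rng=random):
    """Pick the (VO, user) pair that issues the next job.

    vo_weights: dict mapping VO name -> relative submission weight.
    user_weights: dict mapping VO name -> {user name -> relative weight}.
    rng: the random module or a random.Random instance.
    """
    # Level 1: which VO generates the next job?
    vos = list(vo_weights)
    vo = rng.choices(vos, weights=[vo_weights[v] for v in vos], k=1)[0]

    # Level 2: within that VO, which user generates the next job?
    users = list(user_weights[vo])
    user = rng.choices(
        users, weights=[user_weights[vo][u] for u in users], k=1
    )[0]
    return vo, user
```

Unequal VO weights express "not all VOs are created equal" directly. A multi-level view (Project, VO, Group, User) would repeat the same draw once per level.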
Grid-Specific Workload Modeling: Failure Modeling
• Error level
  • Infrastructure
  • Middleware
  • Application
  • User
• Fault tolerance scheme for submitted jobs
  • Capture the system feedback in the model
• Other aspects
  • …
Grid-Specific Workload Modeling: Economic Models
• Pricing
  • Application cost
  • Application utility
• Other aspects
  • …
GrenchMark: a Framework for Analyzing, Testing, and Comparing Grids
• What’s in a name?
  grid benchmark → working towards a generic tool for the whole community: it helps standardize testing procedures, but it is too early for benchmarks; we use synthetic grid workloads instead
• What’s it about?
  A systematic approach to analyzing, testing, and comparing grid settings, based on synthetic workloads
  • A set of metrics for analyzing grid settings
  • A set of representative grid applications
    • Both real and synthetic
  • Easy-to-use tools to create synthetic grid workloads
  • Flexible, extensible framework
GrenchMark Overview: Easy to Generate and Run Synthetic Workloads
… but More Complicated Than You Think
• Workload structure
  • User-defined and statistical models
  • Dynamic job arrival
  • Burstiness and self-similarity
  • Feedback, background load
  • Machine usage assumptions
  • Users, VOs
• Metrics
  • A(W) Run/Wait/Resp. Time
  • Efficiency, MakeSpan
  • Failure rate [!]
• (Grid) notions
  • Co-allocation, interactive jobs, malleable, moldable, …
• Measurement methods
  • Long workloads
  • Saturated / non-saturated system
  • Start-up, production, and cool-down scenarios
  • Scaling workload to system
• Applications
  • Synthetic
  • Real
• Workload definition language
  • Base language layer
  • Extended language layer
• Other
  • Can use the same workload for both simulations and real environments
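To make "statistical models" and "dynamic job arrival" concrete, the sketch below generates a small synthetic workload with exponentially distributed inter-arrival times and then rescales its arrival rate. It is not GrenchMark's workload definition language or generator: all names, the record fields, and the distributions are assumptions chosen for brevity. Plain Poisson arrivals like these are neither bursty nor self-similar, so a realistic model would need a different arrival process.

```python
import random


def generate_workload(n_jobs, mean_interarrival, users, max_procs,
                      mean_runtime, seed=0):
    """Generate n_jobs synthetic job records with Poisson arrivals."""
    rng = random.Random(seed)  # fixed seed keeps experiments repeatable
    now = 0.0
    jobs = []
    for job_id in range(n_jobs):
        # Exponential inter-arrival times give a Poisson arrival process.
        now += rng.expovariate(1.0 / mean_interarrival)
        jobs.append({
            "id": job_id,
            "submit": now,
            "user": rng.choice(users),
            "procs": rng.randint(1, max_procs),
            "runtime": rng.expovariate(1.0 / mean_runtime),
        })
    return jobs


def scale_arrivals(jobs, factor):
    """Return a copy of the workload with submit times multiplied by factor.

    factor < 1 compresses arrivals (higher load on the same system);
    factor > 1 stretches them (lower load).
    """
    return [dict(job, submit=job["submit"] * factor) for job in jobs]
```

Because the records are plain data, the same generated list could in principle be fed to a simulator or replayed against a real system, which is the property the last bullet above highlights.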
GrenchMark: Iterative Research Roadmap
Simple functional system
A. Iosup, J. Maassen, R.V. van Nieuwpoort, D.H.J. Epema, Synthetic Grid Workloads with Ibis, KOALA, and GrenchMark, CoreGRID IW, Nov 2005.
GrenchMark: Iterative Research Roadmap
Open-GrenchMark
Community Effort
Complex extensible system
A. Iosup, D.H.J. Epema, GrenchMark: A Framework for Analyzing, Testing, and Comparing Grids, IEEE CCGrid'06, May 2006.
Take home message
• Performance Evaluation of Grid Systems
  - need a common performance evaluation framework for grids
  - need real grid traces (scheduling, accounting, monitoring, etc.)
  - need more research on workload modeling and performance indicators
• Performance indicators
  - failure metrics as important as traditional performance metrics
• Workload modeling
  - generic workload modeling needs validation based on real grid traces
  - computation/data/network management
  - locality/origin management
  - failure modeling
  - economic models
• GrenchMark
  - generic tool for the whole community
  - generates diverse grid workloads
  - easy-to-use, flexible, portable, extensible, …