Scheduling
John Easton
[email protected]
What is a scheduler?
“A means of employing one or more predictive models to evaluate the performance of an application in a system and use this information to assign tasks, communications and data to resources”
Which is all pretty easy until we have to handle:
– Serial, parallel and distributed applications
– Simultaneous execution on shared resources
– Enterprise-quality service levels and response times
But let’s talk about locking first
Simple scheduling assumes no locking model
Most data access needs to be coordinated
If you get it wrong, then it doesn’t matter how good your scheduler is, it simply won’t perform
– Which defeats the whole idea / purpose of doing it
So schedulers need to understand locking models
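One way to picture a scheduler that "understands" locking is sketched below. It assumes, purely for illustration, that each task declares up front the data items it will lock, and it groups tasks into waves so that no two tasks in the same wave contend for the same lock. The task and lock names are hypothetical and the greedy grouping is only one possible approach.

```python
from typing import Dict, List, Set


def lock_aware_waves(task_locks: Dict[str, Set[str]]) -> List[List[str]]:
    """Greedily group tasks into waves; tasks in one wave share no lock."""
    waves: List[List[str]] = []
    claimed_per_wave: List[Set[str]] = []  # locks already claimed by each wave
    for task, locks in task_locks.items():
        for i, claimed in enumerate(claimed_per_wave):
            if claimed.isdisjoint(locks):
                # No contention with this wave: run the task alongside it.
                waves[i].append(task)
                claimed |= locks
                break
        else:
            # Every existing wave holds a lock this task needs: start a new wave.
            waves.append([task])
            claimed_per_wave.append(set(locks))
    return waves


# Hypothetical tasks and the data items each one locks.
tasks = {
    "t1": {"accounts"},
    "t2": {"orders"},
    "t3": {"accounts", "orders"},
    "t4": {"prices"},
}
waves = lock_aware_waves(tasks)
print(waves)
```

Here t3 needs both locks already claimed in the first wave, so it is deferred rather than scheduled onto a resource where it would only wait.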
Different sorts of scheduling
Job scheduling
– Optimise the throughput of the system (measured by number of jobs executed)
Resource scheduling
– Coordinate access to a resource by managing multiple requests for access whilst optimising utilisation
Application scheduling
– Optimise (promote) the performance of a given application
These requirements conflict because each has a different view of what “performance” means
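The conflict can be made concrete with a small sketch. It assumes a single resource that runs one job at a time and uses made-up runtimes; "throughput" is taken to mean jobs finished by a deadline, and the "application" view cares only about when one particular job (here called big_app) completes. All names and numbers are illustrative.

```python
from typing import Dict, List


def completion_times(order: List[str], runtime: Dict[str, float]) -> Dict[str, float]:
    """Run jobs back to back on one resource and record when each finishes."""
    clock = 0.0
    done: Dict[str, float] = {}
    for job in order:
        clock += runtime[job]
        done[job] = clock
    return done


def jobs_done_by(deadline: float, order: List[str], runtime: Dict[str, float]) -> int:
    """Job-scheduling view of performance: how many jobs finish by the deadline."""
    return sum(1 for t in completion_times(order, runtime).values() if t <= deadline)


# Hypothetical runtimes in arbitrary time units.
runtime = {"big_app": 8.0, "j1": 1.0, "j2": 2.0, "j3": 3.0}

# Job scheduler: shortest job first maximises jobs completed early.
throughput_order = sorted(runtime, key=runtime.get)
# Application scheduler: promote big_app ahead of everything else.
app_order = ["big_app"] + [j for j in throughput_order if j != "big_app"]

for name, order in (("throughput", throughput_order), ("application", app_order)):
    finished = jobs_done_by(10.0, order, runtime)
    big = completion_times(order, runtime)["big_app"]
    print(name, "jobs done by t=10:", finished, "big_app finishes at:", big)
```

The ordering that is best by one measure is worse by the other, which is the sense in which the three kinds of scheduling pull against each other.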
So what does a scheduler need to do?
Select a set of resources on which to schedule the task(s) of the application
– Resource discovery - Identify what resources are present
– Resource location - Determine which resources are available for use
– Resource selection - Select candidate resources
Assign application task(s) to compute resources
Distribute data or co-locate data and computation
– The easy bit to do (really) badly
Order tasks on compute resources
Order communication between tasks
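The first two steps above can be sketched as a small pipeline. This is a minimal illustration, not a real scheduler: the resource list is hard-coded, "speed" is a made-up relative figure, and the assignment rule (largest task first, onto whichever resource would finish it earliest) is just one simple heuristic chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Resource:
    name: str
    available: bool
    speed: float  # hypothetical relative speed: work units per second


def discover() -> List[Resource]:
    """Resource discovery: identify what is present (hard-coded for the sketch)."""
    return [
        Resource("node-a", True, 2.0),
        Resource("node-b", False, 4.0),
        Resource("node-c", True, 1.0),
    ]


def locate(resources: List[Resource]) -> List[Resource]:
    """Resource location: keep only the resources available for use."""
    return [r for r in resources if r.available]


def select(resources: List[Resource], min_speed: float) -> List[Resource]:
    """Resource selection: keep candidates that meet a minimum speed."""
    return [r for r in resources if r.speed >= min_speed]


def assign(tasks: Dict[str, float], candidates: List[Resource]) -> Dict[str, str]:
    """Place each task, largest first, where it is predicted to finish earliest."""
    finish = {r.name: 0.0 for r in candidates}
    speed = {r.name: r.speed for r in candidates}
    placement: Dict[str, str] = {}
    for task, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        best = min(finish, key=lambda n: finish[n] + work / speed[n])
        finish[best] += work / speed[best]
        placement[task] = best
    return placement


candidates = select(locate(discover()), min_speed=1.0)
placement = assign({"t1": 8.0, "t2": 4.0, "t3": 2.0}, candidates)
print(placement)
```

Data placement and the ordering of communication are deliberately left out; as the slide notes, those are the parts that are easy to do badly.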
Scheduling model
•Program model: abstraction of the programs to be scheduled
•Performance model: abstraction of the behaviour of the program in the system
•Scheduling policy: sets of rules used to produce schedules
•Description of the performance activity to be optimised by the performance model
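How the program model, performance model and scheduling policy fit together can be sketched in a few lines. Everything here is an assumption made for illustration: the program model is a dictionary of task sizes, the performance model predicts makespan from relative speeds, the quantity being optimised is that predicted makespan, and the policy simply tries every mapping, which is only feasible for tiny problems.

```python
from itertools import product
from typing import Dict, Tuple

# Program model: the application abstracted as tasks with an amount of work.
program = {"t1": 6.0, "t2": 2.0}
# System description used by the performance model: relative resource speeds.
speeds = {"fast": 3.0, "slow": 1.0}


def predicted_makespan(schedule: Dict[str, str]) -> float:
    """Performance model: predict when the busiest resource finishes."""
    busy = {name: 0.0 for name in speeds}
    for task, resource in schedule.items():
        busy[resource] += program[task] / speeds[resource]
    return max(busy.values())


def exhaustive_policy() -> Tuple[Dict[str, str], float]:
    """Scheduling policy: try every task-to-resource mapping, keep the best."""
    tasks = list(program)
    best_schedule: Dict[str, str] = {}
    best_cost = float("inf")
    for choice in product(speeds, repeat=len(tasks)):
        schedule = dict(zip(tasks, choice))
        cost = predicted_makespan(schedule)
        if cost < best_cost:
            best_schedule, best_cost = schedule, cost
    return best_schedule, best_cost


schedule, cost = exhaustive_policy()
print(schedule, cost)
```

Swapping in a different performance measure (for example total resource time instead of makespan) changes which schedule the same policy selects.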
Grid scheduling needs
Produce performance predictions that are timeframe-specific
– Since the performance of resources varies over time, predictions of performance must do likewise
Utilise dynamic information to represent performance variations
Adapt to a wide range of infrastructural factors
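A minimal sketch of a timeframe-specific prediction follows. It assumes that a monitor supplies recent load samples (fraction of the resource in use, oldest first) and smooths them with an exponentially weighted moving average; the smoothing factor, the load figures and the simple "available fraction" formula are all illustrative choices rather than anything prescribed by the slides.

```python
from typing import Sequence


def ewma(samples: Sequence[float], alpha: float = 0.5) -> float:
    """Exponentially weighted moving average; later samples count for more."""
    estimate = samples[0]
    for sample in samples[1:]:
        estimate = alpha * sample + (1 - alpha) * estimate
    return estimate


def predict_runtime(work: float, peak_speed: float, recent_load: Sequence[float]) -> float:
    """Predict runtime from the share of the resource expected to be free."""
    available = 1.0 - ewma(recent_load)
    return work / (peak_speed * available)


# The same job on the same machine, predicted at two different times.
quiet = predict_runtime(100.0, 10.0, [0.0, 0.0, 0.0])
busy = predict_runtime(100.0, 10.0, [0.25, 0.75, 0.5])
print(quiet, busy)
```

The prediction for the busy period is twice that for the quiet one, so a schedule built on a single static figure would be wrong for at least one of them.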
Why not use an MPP or cluster scheduler for the grid?
MPP schedulers control all the resources
All resources lie in a single administrative domain
Resource pool is invariant
Impact of contention from other applications in the system is minimal
All compute and communications resources exhibit similar performance characteristics
Unwritten truths
Efficient application performance and efficient system performance are NOT the same thing
It is not possible to obtain optimal performance for multiple applications simultaneously
Load balancing may not provide optimal application scheduling or system utilisation
You can’t create a performance-efficient schedule without modelling the system in detail
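The point about load balancing can be illustrated with a short calculation. The sketch assumes "load balancing" means giving every resource the same number of equal-sized tasks, and uses two hypothetical resources whose speeds differ by a factor of four.

```python
from typing import Dict


def makespan(counts: Dict[str, int], speeds: Dict[str, float], work_per_task: float) -> float:
    """Time until the slowest-finishing resource has completed its tasks."""
    return max(n * work_per_task / speeds[r] for r, n in counts.items())


# Hypothetical heterogeneous pair: one resource is four times faster.
speeds = {"fast": 4.0, "slow": 1.0}

balanced = makespan({"fast": 5, "slow": 5}, speeds, work_per_task=4.0)
speed_aware = makespan({"fast": 8, "slow": 2}, speeds, work_per_task=4.0)
print(balanced, speed_aware)
```

With equal task counts the fast resource sits idle while the slow one finishes; weighting the split by speed more than halves the completion time in this example.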
Comparison of cluster scheduling technologies - I
Load Leveler for Linux
– Function: A combination of cluster resource management and scheduler.
– Formal support? YES
– List price: From 171GBP per CPU
– Platforms: RHEL 3.0 Linux on pSeries or xSeries
– Comments: Port of a mature product to Linux. Capable of working in a mixed Linux / AIX environment. Supported only in configurations of up to 128 blades.
Mosix
– Function: A combination of cluster resource management and scheduler.
– Formal support? NO
– List price: Free to use
– Platforms: Unspecified Redhat / SUSE Linux. Requires modifications to the Linux kernel in addition to user level applications.
– Comments: An attempt to make Linux "cluster aware", leaning towards single system image. Interesting, but just too risky for commercial application.
Open PBS
– Function: A combination of cluster resource management and scheduler.
– Formal support? NO
– List price: Open Source Software. Free to use.
– Platforms: Most UNIX and Linux
– Comments: Stable, but very low function. No support available.
Torque
– Function: A combination of cluster resource management and scheduler.
– Formal support? NO
– List price: Open Source Software. Free to use.
– Platforms: Source distribution. Would need porting to whatever platform is required. Appears to be designed to work under UNIX or Linux.
– Comments: Basically OpenPBS with enhancements in the areas of Scalability, Fault Tolerance, Usability and Scheduling Interfaces. Seems to be under on-going development. Possible concerns over stability as a result.
PBS Pro
– Function: A combination of cluster resource management and scheduler.
– Formal support? YES
– List price: Commercial product. Thought to be in the region of $125 to $500 per CPU, price available on individual quotation.
– Platforms: Wide variety of Linux and other systems, on 32 and 64 bit architectures. Notes that workarounds are required to install on some platforms.
Comparison of cluster scheduling technologies - II
Silver / Moab Grid Scheduler
– Function: A GRID, or Meta-scheduler.
– Formal support? Various support contracts available on chargeable basis from Cluster Resources Inc. Price by negotiation, based out of the USA.
– List price: Open Source Software. Free to use, but support costs by negotiation (unknown).
– Comments: Requires Maui for local Cluster Management.
Maui
– Function: Extends existing cluster schedulers / Resource Managers, providing higher function scheduling facilities
– Formal support? Various support contracts available on chargeable basis from Cluster Resources Inc. Price by negotiation, based out of the USA.
– List price: Open Source software. Free to use, but support costs by negotiation (unknown).
– Platforms: All Linux, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD, etc.
– Comments: Interfaces to resource managers including: PBS Scheduling API (TORQUE, OpenPBS and PBSPro), Loadleveler Scheduling API, and the Platform LSF Scheduling API. Additionally, partial support of the SGE Scheduling API (Sun Grid Engine), BProc Scheduling API (Scyld BProc) and SSS XML Scheduling API
Condor (including Condor-G, GlideIn, etc)
– Function: Condor is a sophisticated cluster scheduler, with built-in multicluster capabilities (flocking). Condor-G extends this to allow scheduling to Globus "grids" from Condor, and GlideIn allows a Globus resource to be temporarily added to a Condor cluster.
– Formal support? Various levels of support available, on a chargeable basis from University of Wisconsin-Madison. Price by negotiation, only available during limited US business hours.
– List price: Free to use, but support costs by negotiation. Note that this no longer appears to be Open Source.
– Platforms: Support is stated for HPUX, Solaris, IRIX, RedHat Linux, Windows 2000, 2003, XP, Digital Unix, Apple OS X, AIX, all at specific versions, on specific platforms. There are some limitations of function on some platforms.
– Comments: Looks very interesting.
Comparison of cluster scheduling technologies - III
Platform LSF
– Function: Acts as a cluster scheduler.
– Formal support? YES
– List price: Commercial product. Generally understood to be priced in the 1000USD per CPU price range.
– Platforms: HP-UX, AIX, Linux, Windows NT, 2000 & XP, IRIX, Solaris and Apple OS X, all at various specific versions.
– Comments: Comes with web-based interface for job submission / monitoring.
Data Synapse GridServer
– Function: Acts as a cluster scheduler.
– Formal support? YES
– List price: Commercial product. Generally understood to be priced in the 1000USD per CPU price range.
– Platforms: Any JVM subject to support restrictions
– Comments: Like all Java applications, tends not to just "run anywhere"
Sun Grid Engine
– Function: Acts as a cluster scheduler.
– Formal support? YES, for Enterprise edition
– List price: Free and Commercial versions available.
– Platforms: Linux, Solaris, AIX, HP-UX
– Comments: Free version quite widely accepted. Enterprise version requires Solaris infrastructure.
Very few of these technologies deliver the necessary functionality for them to play anything more than a niche role
– For example, being used on a single function cluster within a department
So what sort of problems do we need to address for commercial scheduling to be more of a reality?
– Platform support
– Guaranteed levels of service, LoB requirements, response time etc.
– Security
– More complex inter/intra company environments
– Commercial ROIs, TCOs etc.
The realities of the commercial world…
[Figure: cross-LOB batch workload flow, with numbered steps 1 to 4. Labels: CIO; LOB A; LOB B; 70/30 fair share; ITWS-A]
– Cross-LOB batch workload submissions
– Intelligently schedule and guarantee the completion of batch workload
– Shared virtual pool of grid-enabled, heterogeneous IT resources including desktops, servers, supercomputers and mainframes
Value Proposition: Maximizes infrastructure ROI by driving 90%+ cross-enterprise utilization against resource policy across a heterogeneous environment.
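The 70/30 fair share between the two lines of business can be sketched as follows. This is an assumed, simplified reading of "fair share": both LOBs always have work queued, every job occupies one slot, and the next slot goes to whichever LOB would be least over its entitlement after receiving it. The function and its tie-breaking rule are illustrative only and are not taken from any product named in the slides.

```python
from fractions import Fraction
from typing import Dict, List


def fair_share_order(shares: Dict[str, int], slots: int) -> List[str]:
    """Hand out slots one at a time in proportion to each LOB's share."""
    used = {lob: 0 for lob in shares}
    order: List[str] = []
    for _ in range(slots):
        # Exact fractions avoid floating-point ties; equal scores go to the
        # LOB listed first.
        lob = min(shares, key=lambda name: Fraction(used[name] + 1, shares[name]))
        used[lob] += 1
        order.append(lob)
    return order


order = fair_share_order({"LOB A": 70, "LOB B": 30}, slots=10)
print(order)
print(order.count("LOB A"), order.count("LOB B"))
```

Over ten slots LOB A receives seven and LOB B three, interleaved rather than in two blocks, so neither line of business waits for the other's whole allocation to drain.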
Automate Cross-Enterprise Workloads in an On Demand Environment
Automate Business Scheduling Across Multiple Scheduling Clusters
– Design and schedule batch workload and dependencies such as: time, data and job events to be executed
– Shared virtual grid-enabled pool of heterogeneous IT resources for each batch job within a complex business process
– Complete automation and real-time monitoring of mission-critical batch workload and dependencies
– Reliable and predictable results provided faster
Value Proposition: Coordinates the cross-enterprise scheduling of workload execution across clusters of heterogeneous scheduling environments
Grid reference architecture
[Figure: architecture layers separated by firewalls (FW). Labels: Management; Presentation & user interfaces; Portal; Metascheduler; Grid control points; Service pools; Idle pool; Service virtualisation; Compute virtualisation; Data virtualisation; Data management]
Compute complexity
[Figure: service requests pass through a policy (rules)-based metascheduler to service pools 1 to n, each behind a firewall (FW). VPN links connect to 3rd party “burst in” compute capacity, or to an alternative (3rd party) service provider.]
Data complexity
[Figure: data sources separated by firewalls (FW). Labels: Storage network; Filesystem data; Structured data; Unstructured non-file data; Public data sources; 3rd party data sources; Remote disaster recovery data repository]
[Figure: network of five sites, A to E. Link bandwidths: 40, 10, 400, 400, 20 and 200 Mb/s. Link distances: 9337, 244, 99, 213, 10058 and 3795 km.]
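Bandwidth and distance figures like these feed directly into the cost of moving data to where the computation runs. The sketch below is a rough estimate under stated assumptions: the pairing of 40 Mb/s with 9337 km is a guess, since the transcript does not say which bandwidth belongs to which link; signals are assumed to travel at about 200,000 km/s in fibre; and protocol overhead, routing detours and contention are ignored.

```python
FIBRE_KM_PER_S = 200_000.0  # assumed propagation speed, roughly two thirds of c


def transfer_seconds(size_megabytes: float, bandwidth_mbit_s: float, distance_km: float) -> float:
    """One-way propagation delay plus time to push the data through the link."""
    propagation = distance_km / FIBRE_KM_PER_S
    serialisation = size_megabytes * 8.0 / bandwidth_mbit_s
    return propagation + serialisation


# Hypothetical 1000 MB data set over two kinds of link.
slow_long = transfer_seconds(1000.0, 40.0, 9337.0)   # assumed pairing
fast_short = transfer_seconds(1000.0, 400.0, 99.0)   # assumed pairing
print(round(slow_long, 3), round(fast_short, 3))
```

For bulk transfers the bandwidth term dominates and distance barely matters; for many small messages between tasks the propagation delay becomes the larger share, which is one reason data placement and communication ordering are scheduled separately.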
OK – so let’s do this for real…
Policy-based metascheduler implementation
[Figure: a user submits work through a web portal to the metascheduler, which drives a network of grids. Components: Monitor daemon; Scheduler daemon; Control script(s)]
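A policy (rules)-based metascheduler of this kind can be sketched as an ordered list of rules evaluated against the state a monitor reports. The sketch is hypothetical throughout: the Request and Rule types, the rule names, the pool names and the free-slot figures are invented for illustration and do not describe the implementation shown on the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    service: str
    priority: int


@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]
    pool: str


def route(request: Request, free_slots: Dict[str, int], rules: List[Rule]) -> Optional[str]:
    """Return the pool named by the first matching rule that has free capacity."""
    for rule in rules:
        if rule.matches(request) and free_slots.get(rule.pool, 0) > 0:
            return rule.pool
    return None  # nothing can take the request right now


# Hypothetical policy: keep pricing work local, let urgent work burst out,
# and send everything else to the idle pool.
rules = [
    Rule("pricing-local", lambda r: r.service == "pricing", "service pool 1"),
    Rule("urgent-burst", lambda r: r.priority >= 9, "3rd party"),
    Rule("default", lambda r: True, "idle pool"),
]
# State as a monitor might report it: service pool 1 is currently full.
free = {"service pool 1": 0, "3rd party": 5, "idle pool": 2}

print(route(Request("pricing", 9), free, rules))
print(route(Request("pricing", 1), free, rules))
```

Keeping the policy as data separate from the routing loop means the rules can be changed without touching the scheduler itself.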
Trends & challenges
Trends
– Use of increasingly dynamic information
– Use of meta-information
– Scheduling of more real-world programs
– Restrictions on the program domain
– Deriving scheduling information from programming language(s)
Challenges
– Portability vs. performance
– Grid-aware programming
– Scalability, Efficiency, Repeatability
– Metascheduling
Questions?