Scalable Cluster Management: Frameworks, Tools, and Systems
David A. Evensky, Ann C. Gentile, Pete Wyckoff,
Robert C. Armstrong, Robert L. Clay, Ron Brightwell
Sandia National Laboratories
Lilith: a tool framework for very large clusters
• Most current tools for clusters are designed as monolithic programs, to do one task well.
• If you need a new task, you need a new tool.
• The Lilith framework allows users to easily construct new tools using a component framework.
Control of large distributed systems
• System administration
• Auditing & job control by users
• Interrogation of processes
• Simple Applications
1 sec program on 1000 nodes: 16 min 10 sec
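The figure above is close to what running the program on one node after another would cost (1000 × 1 sec is 16 min 40 sec). The slide does not say how the number was measured, so the following is only a back-of-the-envelope sketch of why a tree-structured launch scales better. The fan-out, the per-hop cost, and all function names are assumptions made for illustration.

```python
def serial_seconds(nodes: int, task_seconds: float) -> float:
    """Wall-clock time if the task is run on one node after another."""
    return nodes * task_seconds


def tree_levels(nodes: int, fanout: int) -> int:
    """Levels below the root needed for a fanout-ary tree to reach `nodes` nodes."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    levels, reached, width = 0, 0, 1
    while reached < nodes:
        width *= fanout      # number of nodes on the next level down
        reached += width
        levels += 1
    return levels


def tree_seconds(nodes: int, fanout: int, task_seconds: float,
                 hop_seconds: float) -> float:
    """Pessimistic tree estimate: every parent contacts its children one at a
    time, levels start in sequence, then all nodes run the task in parallel."""
    return tree_levels(nodes, fanout) * fanout * hop_seconds + task_seconds


def fmt(seconds: float) -> str:
    """Format a duration in the 'M min S sec' style used on the slide."""
    minutes, rest = divmod(round(seconds), 60)
    return f"{minutes} min {rest} sec"


if __name__ == "__main__":
    # hop_seconds=0.1 and fanout=10 are hypothetical values, not measurements.
    print("serial:", fmt(serial_seconds(1000, 1.0)))
    print("tree:  ", fmt(tree_seconds(1000, fanout=10, task_seconds=1.0,
                                      hop_seconds=0.1)))
```

Under these assumptions the serial cost grows linearly with the node count, while the tree cost grows only with the number of levels.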
Lilith: Scalable component framework
[Figure labels: Client; Data Distribution; Execution; Result Collection]
• Lilith spans a tree of machines executing user-defined code.
• User code (Lilim/Lilly) provides component functionality on a single node
• Provides scalable distribution, result collection
Component Methods
• MO[] distributeOnTree(MO, int[])
– data distribution down the tree
• MO onTree(MO)
– component action on the node
• MO collateOnTree(MO[])
– result collection and condensation
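How the three methods fit together can be sketched as a recursive walk over a tree. This is a Python analogy, not Lilith's Java code: `MO` is treated as an opaque value, the slide's `int[]` argument (whose meaning is not given) is replaced by a simple child count, and `TreeNode`, `Component`, `MatchNames`, and `run_on_tree` are hypothetical names. The walk here is sequential and in-process, whereas the slides describe a tree of machines.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class TreeNode:
    name: str
    children: List["TreeNode"] = field(default_factory=list)


class Component:
    """Hypothetical stand-in for user code; method names mirror the slide."""

    def distribute_on_tree(self, mo: Any, child_count: int) -> List[Any]:
        # Default: hand every child the same object.
        return [mo] * child_count

    def on_tree(self, mo: Any, node: TreeNode) -> Any:
        raise NotImplementedError

    def collate_on_tree(self, mos: List[Any]) -> Any:
        raise NotImplementedError


class MatchNames(Component):
    """Example component: report the names of nodes that start with a prefix."""

    def on_tree(self, mo: str, node: TreeNode) -> List[str]:
        return [node.name] if node.name.startswith(mo) else []

    def collate_on_tree(self, mos: List[List[str]]) -> List[str]:
        # Condense the local result and the children's results into one list.
        return sorted(name for part in mos for name in part)


def run_on_tree(node: TreeNode, component: Component, mo: Any) -> Any:
    """Distribute down, act on this node, then collate results on the way up."""
    child_inputs = component.distribute_on_tree(mo, len(node.children))
    results = [component.on_tree(mo, node)]
    for child, child_mo in zip(node.children, child_inputs):
        results.append(run_on_tree(child, component, child_mo))
    return component.collate_on_tree(results)


if __name__ == "__main__":
    root = TreeNode("service", [TreeNode("compute1"),
                                TreeNode("compute2", [TreeNode("compute3")])])
    print(run_on_tree(root, MatchNames(), "compute"))
```

Because each node condenses its subtree's results before passing them up, the root only ever handles one result per direct child.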
Security
Uses purely Java 2 mechanisms at this time.
• User sends credential with call
• LilithHost creates ProtectionDomain from user credential
• LilithHost calls checkPermission
• Sandbox is set up similarly, using the User credential and PolicyFile
[Diagram labels: LilithHost; Policy; Keys; Method invocation]
Prototypical tools
System monitoring tool to track the state of a cluster of machines
PS-tool to get sortable process information from selected nodes of the cluster
Lilith Lights tool
• Snake toy app
– demo that draws a snake over the front panel
– no global repository for state; all info is distributed
– Snake’s movement was limited to the left half of the machine
• a program error in the declaration of drand48() biased the results
Who serves whom?
• Programmers adapt to:
– The OS that runs on the machine
– The system configuration chosen by the admins
– Changing system environments
• We are economically driven to heterogeneous distributed computing
• Why can’t the user dictate the software environment as a resource request?
DASE
• Dynamically Adaptive Software Environment
• Provide multi-OS/multi-environment capability
• Manage multiple SW environments
• “save” user environment for reuse later
• Integration with SW component architectures
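The idea of asking for a software environment as part of a resource request can be sketched as follows. Nothing here is DASE's actual interface: `ResourceRequest`, `Node`, `allocate`, and the environment names are hypothetical, and switching a node's environment is reduced to setting a field where a real system would reboot or reprovision the node.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class ResourceRequest:
    nodes: int          # how many nodes the user wants
    environment: str    # which software environment they should run


@dataclass
class Node:
    name: str
    environment: Optional[str] = None   # environment currently booted
    busy: bool = False


def allocate(request: ResourceRequest, pool: List[Node],
             known_environments: Set[str]) -> List[Node]:
    """Pick free nodes and switch them to the requested environment.

    Returns an empty list if there are not enough free nodes.
    """
    if request.environment not in known_environments:
        raise ValueError(f"unknown environment: {request.environment}")
    free = [n for n in pool if not n.busy]
    if len(free) < request.nodes:
        return []
    # Prefer nodes that already run the requested environment (no switch needed).
    free.sort(key=lambda n: n.environment != request.environment)
    chosen = free[:request.nodes]
    for node in chosen:
        node.environment = request.environment  # stand-in for a reboot
        node.busy = True
    return chosen


if __name__ == "__main__":
    pool = [Node("a", "Linux"), Node("b", "NT"), Node("c", "Linux", busy=True)]
    granted = allocate(ResourceRequest(2, "NT"), pool, {"Linux", "NT"})
    print([(n.name, n.environment) for n in granted])
```

The point of the sketch is only that the environment travels with the request, so the system adapts to the user rather than the other way around.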
DASE Service Object Model
[Diagram labels: Physical system; Logical partitioning; “system” model; App Object (resource spec, data/map objects); Partitioner; Solver; Visualizer; Mesher; Scheduler; ResourceRequest]
Flexible Resource Management
[Diagram labels: DASE Client; DASE Session Manager; Scheduler/Resource Management; Virtual Machine; Application Environment; Hierarchical Net Booting; RM/VM (three instances); TFlops; PRE; HPVM; Custom; Linux; NT; Components; Frameworks; control; information; App Environment Specification]
Scalable Unit
[Diagram: Scalable Unit wiring. Legend: power, serial, Ethernet, Myrinet. Shown twice: a 100BaseT hub, a 16 port Myrinet switch, seven compute nodes and one service node, a terminal server, and a power controller. Other labels: sss0; 8 Myrinet LAN cables; To system support network.]
System Support Hierarchy
• sss1
– Admin access
– Master copy of system software
• Scalable Unit (three shown)
– sss0: in-use copy of system software
– nodes: NFS mount root from sss0
Hardware Management
• Discovery and Control
– Perl scripts that
• control individual devices (power controller, terminal server, machine, switch)
• build a database of configuration info (MAC and IP addresses, serial numbers, etc.)
• Roles
– database is augmented with each component’s role in the system (compute, sss0, terminal server, etc.)
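The shape of such a configuration database can be sketched as below. The original tooling is described as Perl scripts; this Python sketch only illustrates the data involved, and `Device`, `ConfigDatabase`, and their methods are hypothetical names. The addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Device:
    name: str
    kind: str            # e.g. "power controller", "terminal server", "machine", "switch"
    mac: str
    ip: str
    serial_number: str
    role: Optional[str] = None   # filled in after discovery, e.g. "compute", "sss0"


class ConfigDatabase:
    """In-memory stand-in for the configuration database."""

    def __init__(self) -> None:
        self._devices: Dict[str, Device] = {}

    def discover(self, device: Device) -> None:
        """Record a device found during discovery."""
        self._devices[device.name] = device

    def assign_role(self, name: str, role: str) -> None:
        """Augment an already discovered device with its role in the system."""
        self._devices[name].role = role

    def with_role(self, role: str) -> List[str]:
        """Names of all devices that have the given role, sorted."""
        return sorted(d.name for d in self._devices.values() if d.role == role)


if __name__ == "__main__":
    db = ConfigDatabase()
    db.discover(Device("n0", "machine", "00:00:5e:00:53:01", "192.0.2.1", "SN-0001"))
    db.discover(Device("n1", "machine", "00:00:5e:00:53:02", "192.0.2.2", "SN-0002"))
    db.discover(Device("ts0", "terminal server", "00:00:5e:00:53:03", "192.0.2.3", "SN-0003"))
    db.assign_role("n0", "compute")
    db.assign_role("n1", "sss0")
    db.assign_role("ts0", "terminal server")
    print(db.with_role("compute"))
```

Keeping roles separate from discovered facts means the same hardware record can be reused when a machine is reassigned.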
“Virtual Machines”
• Allows arbitrary grouping of scalable units that use the same system software
• Operations to update system software and boot nodes, scalable units, or machines
• Update system software on an SU in 1 min.
• Update system software on 24 SUs in 1.5 min.
• Boot an SU in 5 min. (staged for power drain)
• Boot 24 SUs in 10 min.
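To put the timings above in perspective, the short calculation below compares them with a hypothetical one-SU-at-a-time baseline. The baseline is an assumption for illustration, not a measurement from the slides, and the function names are made up.

```python
def serial_estimate(single_unit_minutes: float, units: int) -> float:
    """Minutes needed if scalable units were handled strictly one after another."""
    return single_unit_minutes * units


def speedup(single_unit_minutes: float, units: int, measured_minutes: float) -> float:
    """Ratio of the serial estimate to the time reported on the slide."""
    return serial_estimate(single_unit_minutes, units) / measured_minutes


if __name__ == "__main__":
    # Timings taken from the slide: (operation, 1 SU, 24 SUs), all in minutes.
    for operation, one_su, all_sus in [("update", 1.0, 1.5), ("boot", 5.0, 10.0)]:
        print(operation,
              "serial estimate:", serial_estimate(one_su, 24), "min,",
              "reported:", all_sus, "min,",
              "ratio:", speedup(one_su, 24, all_sus))
```

Against that baseline, the reported update of 24 SUs is 16 times faster and the reported boot is 12 times faster.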
“Virtual Machines”
• sss1 uses rdist to push system software down
• Scalable Unit (three shown)
– sss0: in-use copy of system software
– nodes: NFS mount root from sss0
[Diagram labels: Linux 2.3; Beta; Alpha; Production; SU configuration database]
http://dancer.ca.sandia.gov
http://www.cplant.ca.sandia.gov
http://www.cs.sandia.gov/cplant