Cray Operating System and I/O Road Map
Charlie Carroll
May 08 Cray Inc. Proprietary Slide 2
Cray Operating Systems Focus
  Performance
    Maximize compute cycles delivered to applications while also providing necessary services
      Lightweight kernel on compute node
      Standard Linux environment on service nodes
    Optimize network performance through close interaction with hardware
  System stability
    Correct defects which impact stability
    Develop and implement features to increase system robustness
  Scalability
    Scale to increase performance without compromising stability
    Provide better system management tools to manage larger systems
Cray Operating Systems and I/O
  Compute node kernels
    XT CNL
    XT Catamount
    X2 CNL
    XMT
  Service node kernel
    Supports all compute nodes
  File systems
    Lustre
    DVS (Data Virtualization Service)
  Networking
    Portals
    TCP/IP
    uGNI and DMAPP
  Operating system services
    Checkpoint / restart
    Node health daemon
    CSA (Comprehensive System Accounting)
  System management
    Interface to system data
    ALPS (Application-Level Placement Scheduler)
      Interfaces to PBS Pro, Moab/Torque and LSF
    Command interface
Cray OSIO Themes
  System stability
    Failover
      Lustre
      Service nodes
    Portals & Lustre
      Significant effort to improve robustness, defect corrections, and increased testing
    Node health check
      More and better tools to evaluate compute node health
  Performance
    Tension between lightweight kernel and features
      We'll hold the line on features
    Huge page support
    Analyze efficacy of topology-aware job placement
Cray OSIO Themes, continued
  System management
    Unify the interface to system management data
    "Play nicely" with customers' existing data center infrastructure
    Look ahead to increasing scale
  Support hardware
    Ideal is to release software in advance of the hardware
      OS for Quad-core, PCIe, and NUMA support went well
  Lustre
    Work with Sun to build their test capability
    Continue to improve our troubleshooting tools
      Nic Henke talk at 9:15am on Thursday
    Possibly become more selective about taking Lustre features
Cray OSIO Themes, continued
  System size and scalability
    Portals working to run Global Arrays across some very big systems
  Internal infrastructure
    Become more like Linux in our build and delivery infrastructure
    Better mechanics, such as kernel source release
OSIO Release Process
  Moving to two significant releases per year
    GA in roughly Q2 and Q4
    LA release one quarter earlier
      LA to GA requires testing at large (40+ cabinets) scale
  Mid-release hardware may be supported with a product-specific release
    XT5 will require v2.1HD release
    Goal is to minimize risk to the v2.1 customer base
  Maintenance releases will be consolidated and scheduled
  Moving toward having the ability to release service node software independently of the compute nodes
Features in Amazon (XT V2.1, GA in 3Q08)
  Lustre 1.6
    Performance improvements
    New configuration methodology
  DVS (Data Virtualization Service)
    Ability to project NFS to compute nodes
  SLES10 SP1
    Kernel and user space
    Automated site data migration tools for software upgrades
  SIO node reboot
    Increased system uptime
  Node health, phase 1
    Ping nodes of jobs which terminate abnormally
    Admin-downs the nodes that do not respond
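The phase 1 node health behavior listed above can be illustrated with a minimal Python sketch. It is an illustration only, not Cray's implementation: the function names, the node ID strings, the `ping` callable, and the use of a nonzero exit code as the test for abnormal termination are all assumptions made for the example.

```python
from typing import Callable, Iterable, List


def unresponsive_nodes(job_nodes: Iterable[str],
                       ping: Callable[[str], bool]) -> List[str]:
    """Return the nodes that fail to answer a ping, in the order given."""
    return [node for node in job_nodes if not ping(node)]


def check_job(exit_code: int,
              job_nodes: Iterable[str],
              ping: Callable[[str], bool]) -> List[str]:
    """Return the nodes to admin-down after a job ends.

    Assumption: a nonzero exit code stands in for "terminated abnormally".
    Jobs that end normally are not examined at all.
    """
    if exit_code == 0:
        return []
    return unresponsive_nodes(job_nodes, ping)


# Hypothetical example: three nodes, one of which no longer responds.
responsive = {"nid00001", "nid00003"}
to_down = check_job(1, ["nid00001", "nid00002", "nid00003"],
                    lambda node: node in responsive)
print(to_down)
```

In a real system the returned list would be handed to whatever mechanism marks nodes admin-down; that step is left out because the slides do not describe it.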
Features in Amazon (XT V2.1), continued
  CSA (Comprehensive System Accounting)
    System management and billing
  Mazama log manager
    Centralized log management
    Search, filtering and log features
  Virtual Channel 2 (VC2)
    Higher throughput in some high-load situations such as all-to-all
  Kernel changes for NUMA
    Needed for XT5; base kernel going forward
  EAL3 support
    Security validation
Features in XT’s Congo Release (GA in 2Q09)
  Node health, phase 2
    User configurable for when to run, how to react to errors
    More checks: file systems and OS
    Initiated locally on each node, that is, scalable
  Attribute management
    Single, documented interface to system information
  SLES10 SP2
  Build split
    Mostly internal, RPMs put in locations more like Linux
    Source updates easier
Features in XT’s Congo Release, continued
  Checkpoint / restart
    Mitigates job failures
    Support MPI and Shmem applications
  Portals changes for XT5
    Better network performance in XT5’s NUMA architecture
    More consistent performance
  SDB node failover
    Aids system resiliency
  LDAP integration into CSA
    Eliminates need for separate user database for CSA
  DVS
    See Dave Wallace’s talk at 8:45am on Thursday
Features in XT’s Congo Release, continued
  Package manifests
    Smoother installation process
  OpenFabrics Enterprise Distribution (OFED) / InfiniBand support
    Enabler for external Lustre
  Catamount not supported in Congo and later releases
Features Being Discussed
  External Lustre
    In 2008 we will provide IB cable to connect to customer-provided Lustre servers
    Broader involvement under discussion
  External login nodes
  Dynamic libraries
  Resiliency features
    We will do more for system and application resiliency. Exact features are under discussion.
  SLES11
    We expect to track Novell’s releases
Baker-Gemini Features
  Support for Gemini
    Support for MPI applications via User-level Gemini Network Interface API (uGNI API)
    Support for PGAS languages via Gemini Distributed Memory Applications API (DMAPP API)
  Link resiliency
    Baker/Gemini will provide the capability to ride through many types of link errors. A single hardware link failure will not take down the entire system, although some applications may be terminated.
Cray Linux Environment (CLE) Congo
  [Timeline figure: 2007 Q1 through 2011 Q4, with release markers 2.0, 2.1, Congo, Danube, Ganges]
  Themes as listed on the figure: CNL “Capability”; Scaling, XT5; Reliability, Supportability; Reliability, Supportability; Resiliency, Gemini; Marble, Next-gen NIC
  Node health, phase 2
  Checkpoint / restart
  SDB node failover
  Portals improvements
  OFED / Infiniband support
  Administrative interface
  DVS improvements
Cray Linux Environment (CLE) Danube
  [Timeline figure: 2007 Q1 through 2011 Q4, with release markers 2.0, 2.1, Congo, Danube, Ganges]
  Themes as listed on the figure: CNL “Capability”; Scaling, XT5; Reliability, Supportability; Reliability, Supportability; Resiliency, Gemini; Marble, Next-gen NIC
  Baker-Gemini High-Speed Network
    Layered Driver Stack
    Takes advantage of new NIC
    Minimizes software overhead
    OS bypass
    Improved MPI performance: latency, bandwidth, msgs/sec
    PGAS Support: UPC & CAF
  Resiliency Improvements
    Hardware rerouting (adaptive traffic)
    Rerouting in software around down links
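One way to picture software rerouting around down links is a shortest-path search over the network graph that skips failed links. The sketch below is a generic breadth-first search in Python, not the Gemini routing algorithm, which the slides do not describe; the function name `route`, the integer node IDs, and the four-node ring are hypothetical.

```python
from collections import deque


def route(links, down, src, dst):
    """Return a shortest path from src to dst that avoids down links.

    links and down are iterables of (a, b) pairs; links are treated as
    bidirectional. Returns a list of nodes, or None if dst is unreachable.
    """
    down_set = {frozenset(link) for link in down}

    # Build an adjacency map from the links that are still up.
    adjacency = {}
    for a, b in links:
        if frozenset((a, b)) in down_set:
            continue
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    # Breadth-first search, remembering each node's predecessor.
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical four-node ring: 0 - 1 - 2 - 3 - 0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(route(ring, [], 0, 1))        # direct link is up
print(route(ring, [(0, 1)], 0, 1))  # direct link is down, go the long way
```

When every path between two nodes is down the function returns None, which loosely mirrors the slide's point that some applications may still be terminated even though the system as a whole stays up.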
Cray Linux Environment (CLE) Ganges
  [Timeline figure: 2007 Q1 through 2011 Q4, with release markers 2.0, 2.1, Congo, Danube, Ganges]
  Themes as listed on the figure: CNL “Capability”; Scaling, XT5; Reliability, Supportability; Reliability, Supportability; Resiliency, Gemini; Marble, Next-gen NIC
  Support for next-generation NIC
  Features to support Marble