SSS Build and Configuration Management Update
February 24, 2003
Narayan Desai
desai@mcs.anl.gov

Communication Infrastructure Overview
• Service Directory
  • Feature complete
  • Validating
  • Robust
• SSSLib
  • Supports 5 wire protocols
  • Data encryption implemented
  • Uses “trusted endpoint” model
  • 5 language bindings based on same code base
• Event Manager
  • Feature complete
  • Validating
  • Performs well
  • Stable

Communication Infrastructure Stress Testing
• At scale tests run last week
  • ~240 nodes
  • 32 processes per node
• Tests
  • Service directory
  • Event Manager
  • Senders
  • Receivers
• Results
  • Thread safety issues revealed
  • Minor race conditions fixed
  • Now runs at scale for extended stress tests

Communication Infrastructure Futures
• Schema updates
  • Restriction based syntax (more on this later)
  • Bring service directory and event manager schemas in line with other current schemas
• Parallel Implementations
• High availability support
• More wire protocol modules

Build and Configuration Management Status
• Complete implementation in use on Chiba City
• Second implementation underway at Oak Ridge
• Complete schemas exist and are used for validation in all components
• System model includes 3 components
  • Same model shown at last face to face
• Basic validation of approach demonstrated by multiple, disparate implementations
• Transition to restriction based syntax completed

Cluster Hardware Infrastructure
• Handles all pre-software install node interactions
  • Power controllers
  • BIOS setup
  • Node identification
  • Ethernet switch setup
  • IPMI
• First component a node interacts with in the BCM stack
• Initial version that supports Chiba City hardware in use
• Stores and serves hardware topology information

Build System
• Cluster configuration management system
• Handles software installation and system configuration
• Handles user access control
• Stores and serves node attribute information
• Stores and serves node configuration information
• OSCAR based implementation underway
• City toolkit based implementation completed and in use

Node State Manager
• Administrative control panel for a cluster
• Manages system administrative state information
• Integrates with cluster diagnostic system
• Stores and serves information on node states and administrative states
• Generates events on node state changes
• Provides imperative interface to diagnostic system
• Initial system diagnostics supplied by AmIHappy
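
The "generates events on node state changes" behavior can be illustrated with a small sketch. This is not the SSS Node State Manager or its interface; the class name, method names, and event fields below are all hypothetical, chosen only to show the idea of emitting an event when a stored state actually changes.

```python
class NodeStateManager:
    """Toy store (hypothetical) that tracks node state and notifies subscribers."""

    def __init__(self):
        self._states = {}        # node name -> current state
        self._subscribers = []   # callables invoked with each event

    def subscribe(self, callback):
        """Register a callable that receives one dict per state change."""
        self._subscribers.append(callback)

    def set_state(self, node, state):
        """Store the state; emit an event only if the value changed."""
        old = self._states.get(node)
        if old == state:
            return False  # no change, so no event is generated
        self._states[node] = state
        event = {"node": node, "old": old, "new": state}
        for callback in self._subscribers:
            callback(event)
        return True


events = []
manager = NodeStateManager()
manager.subscribe(events.append)
manager.set_state("n1", "on")
manager.set_state("n1", "on")   # repeated value: no event
manager.set_state("n1", "off")
print(len(events))
```

In a real deployment the events would presumably go through the Event Manager rather than in-process callbacks; the sketch only shows the change-detection step.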

Build and Configuration Futures
• Schemas stable
• Develop a more modular cluster hardware infrastructure implementation with better hardware support
• OSCAR deployment of SSS components underway
• Develop better system diagnostics
• Work towards better node state manager integration
• Figure out more interesting uses of restriction based syntax
  • (getting curious yet?)

Restriction Based Syntax
• All functions that can take multiple arguments treat argument data as a restriction, not as an explicit argument
• Restrictions match all data that meets the criteria specified
• Allows matching to be performed inside of components
• Allows operations to use matching
• Opens the door to transactions
• Makes data ownership more explicit

Example
<set-node-state state='on' adminstate='offline'>
  <node-state adminstate='online'/>
</set-node-state>
• Operates on all nodes where adminstate='online'
• This allows all operations to be performed efficiently
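
One way to read the example message is sketched below. It assumes that the attributes on the outer element are the new values and that each child element is a restriction selecting the nodes to change; that reading matches the "operates on all nodes where adminstate='online'" bullet but is not taken from the SSS schemas. The node table and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory node table; field names mirror the slide's attributes.
NODES = [
    {"name": "n1", "state": "off", "adminstate": "online"},
    {"name": "n2", "state": "on", "adminstate": "offline"},
    {"name": "n3", "state": "off", "adminstate": "online"},
]

MESSAGE = """
<set-node-state state='on' adminstate='offline'>
  <node-state adminstate='online'/>
</set-node-state>
"""


def matches(node, restriction):
    """A node matches when every attribute named in the restriction is equal."""
    return all(node.get(key) == value for key, value in restriction.items())


def apply_set_node_state(message, nodes):
    """Apply the outer element's values to every node matched by a child restriction."""
    root = ET.fromstring(message)
    new_values = dict(root.attrib)                      # assumed: values to set
    restrictions = [dict(child.attrib) for child in root]  # assumed: selectors
    changed = []
    for node in nodes:
        if any(matches(node, r) for r in restrictions):
            node.update(new_values)
            changed.append(node["name"])
    return changed


changed = apply_set_node_state(MESSAGE, NODES)
print(changed)
```

Because the matching runs inside the component that owns the data, the client sends one message instead of first querying for node names and then issuing one command per node.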

API Augmentation
• APIs only control server side functionality
  • i.e., what can clients count on from components?
• Our APIs currently consist entirely of XML schemas
• This may not be sufficient
  • Clients may wait for events
  • Event generation is not specified
  • Semantics for all commands aren’t specified (yet)
  • Data ownership is not yet clear