![Page 1: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/1.jpg)
Notes from Installing a Mac G5 Cluster at SLAC
Chuck Boeheim
SLAC Computing Services
![Page 2: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/2.jpg)
Background
• New joint Stanford/SLAC department: Kavli Institute for Particle Astrophysics
• Choices for compute cluster included joining Opteron purchase, or Mac G5 cluster
• Astronomers like Macs
– Compatible with laptops and desktops they were buying
![Page 3: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/3.jpg)
Configuration
• 2 Mac G5 file servers
– Two 2.5 TB Apple Xserve RAID units each
• 2 Mac G5 interactive servers
• 10 Mac G5 compute nodes
– No graphics cards
– No CD drives
• Plus two development servers for us
![Page 4: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/4.jpg)
![Page 5: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/5.jpg)
Physical
• Good overall engineering
• Good airflow; quiet
• Hot swap disks in server nodes
• Rails fit standard racks well
• Easy to mount
![Page 6: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/6.jpg)
Management
• Supports serial console (57,600 baud)
– No graphics card in G5 server
• Supports serial BIOS
– But must press button to boot into BIOS
• No power management
• Supports network install
• Initial disk partition and boot a little tricky
• Subsequent installs can be fully automated
![Page 7: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/7.jpg)
Installing the first server
• “Secret handshake”
– Install from CD
• Run Server Assistant on another Mac to start install, and again to configure
• Run Apple Remote Desktop to get graphic login
• Command line alternatives are available, though obscure
![Page 8: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/8.jpg)
The Partitioning Conundrum
• System has initial OS with one 250 GB partition
• Cannot re-partition the boot drive
• One solution:
– Boot server from CD
– Ssh to server
– Partition the disk
• diskutil partitionDisk disk0 3 "Journaled HFS+" System 10G "Journaled HFS+" Cache 1G "Journaled HFS+" Work 200G
• Netboot may also be a possibility
• Fastest solution: insert disks in 2nd and 3rd bays of server and format
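A minimal sketch of how the partition command on this slide is put together, assuming the argument order shown there: disk, partition count, then one format/name/size triple per partition. The helper name `partition_command` and the `LAYOUT` table are hypothetical, and the format string is written as "Journaled HFS+", which is an assumed reading of the slide's format name. The sketch only builds and prints the command; it does not run it.

```python
import shlex

def partition_command(disk, partitions):
    """Build the diskutil argument list for (format, name, size) triples.

    Hypothetical helper: it mirrors the argument order on the slide and
    does not execute anything.
    """
    args = ["diskutil", "partitionDisk", disk, str(len(partitions))]
    for fmt, name, size in partitions:
        args += [fmt, name, size]
    return args

# Layout taken from the slide: System 10G, Cache 1G, Work 200G.
# The format name "Journaled HFS+" is an assumption (see lead-in).
LAYOUT = [
    ("Journaled HFS+", "System", "10G"),
    ("Journaled HFS+", "Cache", "1G"),
    ("Journaled HFS+", "Work", "200G"),
]

command = partition_command("disk0", LAYOUT)

# Print a shell-quoted form so the format name with a space stays one argument.
print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the layout in a table makes it harder for the partition count to drift out of step with the number of triples when the layout changes.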
![Page 9: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/9.jpg)
Reference Server
• Install second server– Mac OS X Server 10.3.5
– All patches
– Xcode tools
– Local directory configuration
– AFS
– Fink
• Shut down, move hard disk to bay 2 of netboot server
![Page 10: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/10.jpg)
Make a Network Install Master
• Netboot serves disk images
• Uses DHCP, BSDP (Boot Server Discovery Protocol), TFTP, and NFS to serve disk image to target machine
• Use Network Image Utility to make image of reference disk
• Turn on Netboot in Server Admin
• Boot client from network (different “secret handshake”)
![Page 11: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/11.jpg)
Network Install Sequence
[Diagram: message sequence between Target Server, Router (configured with DHCP helper address), DHCP Server, and Netboot Server. In order: DHCP broadcast, DHCP response, BSDP broadcast, BSDP response, TFTP bootloader, NFS disk image.]
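The sequence on this slide can be written down as an ordered table, sketched here under assumptions. The protocol order and message labels come from the diagram. The `sender` field for each step is my reading of the diagram and is not stated on the slide. `Step`, `NETWORK_INSTALL_SEQUENCE`, and `protocols_in_order` are hypothetical names.

```python
from typing import List, NamedTuple

class Step(NamedTuple):
    protocol: str  # protocol named in the diagram
    sender: str    # assumed originator of the message (not stated on the slide)
    message: str   # label used in the diagram

# Order taken from the diagram; senders are an interpretation.
NETWORK_INSTALL_SEQUENCE: List[Step] = [
    Step("DHCP", "target server", "DHCP broadcast"),
    Step("DHCP", "DHCP server", "DHCP response"),
    Step("BSDP", "target server", "BSDP broadcast"),
    Step("BSDP", "netboot server", "BSDP response"),
    Step("TFTP", "netboot server", "bootloader"),
    Step("NFS", "netboot server", "disk image"),
]

def protocols_in_order(steps: List[Step]) -> List[str]:
    """Return each protocol once, in the order it first appears."""
    seen: List[str] = []
    for step in steps:
        if step.protocol not in seen:
            seen.append(step.protocol)
    return seen

for number, step in enumerate(NETWORK_INSTALL_SEQUENCE, start=1):
    print(f"{number}. {step.protocol:<4} {step.message} (from {step.sender})")
```

A table like this is a convenient checklist when a network install stalls: the last step that completed shows which service to examine next.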
![Page 12: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/12.jpg)
Netboot Issues
• Our standard DHCP server worked just fine.
• BSDP is an extension of DHCP
– We put the netboot server on the same subnet as the cluster for simplicity
– Should be able to use router helper to put it on a different subnet
• Network Image Utility had some conflicts
![Page 13: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/13.jpg)
Users: Looks like Unix
• Use NIS, LDAP, Kerberos for accounts
• Most utilities present: ssh, bash, perl, emacs, X-windows
• Fink supplies most GNU tools
![Page 14: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/14.jpg)
Sysadmins: Sorta like Unix
• Strong BSD heritage
• Some things in different places than Linux
• GUI tools for configuring
– Most, but not all, can configure a remote server
• The serversetup and networksetup commands can configure most settings
• Directory Access (used to set up NIS or LDAP) seems not to be scriptable
![Page 15: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/15.jpg)
Startup Befuddlement
• Not inittab or /etc/rc.d based
• SystemStarter starts many daemons
• Ssh starts out of xinetd
• Ypbind started when needed by Directory Services
• Watchdog starts server processes
– /etc/watchdog.conf
– Sorta like inittab
• Mac OS X 10.4 (Tiger) is going to introduce yet another way
![Page 16: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/16.jpg)
Authentication Compatibility
• Can use NIS, LDAP, Kerberos, Active Directory
• Supports PAM, but not everything uses it
• LoginWindow (and a few other things) are directly Kerberos-aware and don’t use PAM
![Page 17: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/17.jpg)
Infrastructure Fit
• Taylor (SLAC’s config tool) ran with little modification
• About half the config modules worked with little change
• The remainder took completely Mac OS X specific rewrites
![Page 18: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/18.jpg)
Nits
• /etc/passwd is there, but not used. Uses NetInfo instead for local accounts
• Shadow passwords in different place and format than Linux, Solaris
• AFS permissions: default is to copy owner mode bits to group and other
– Copy file from AFS to local, ends up world-readable
– Change ‘realmodes’ in /var/db/openafs/etc/config/settings.plist
• Mac HFS file system is not case-sensitive: Makefile and makefile are the same!
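Two of the nits above can be illustrated with a short sketch, under assumptions. `afs_default_mode` models the slide's statement that the owner's mode bits are copied to group and other; it is a model of that sentence, not of the AFS client's actual code. `case_collisions` flags names that would land on the same file on a case-insensitive volume; it uses `str.lower()` as an approximation of the file system's own case folding. Both function names are hypothetical.

```python
from collections import defaultdict

def afs_default_mode(mode):
    """Model the described default: copy owner bits to group and other."""
    owner = (mode >> 6) & 0o7
    # Clear the group/other bits, then fill both with the owner bits.
    return (mode & ~0o77) | (owner << 3) | owner

def case_collisions(names):
    """Return groups of distinct names that differ only by letter case."""
    groups = defaultdict(set)
    for name in names:
        # Approximation: real case folding on the volume may differ.
        groups[name.lower()].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]

# A private file (owner read/write only) appears readable and writable by all.
print(oct(afs_default_mode(0o600)))

# Makefile and makefile would be the same file on a case-insensitive volume.
print(case_collisions(["Makefile", "makefile", "README"]))
```

Running a check like `case_collisions` over a source tree before copying it from a case-sensitive file system can catch silent overwrites early.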
![Page 19: Notes from Installing a Mac G5 Cluster at SLAC](https://reader035.vdocument.in/reader035/viewer/2022070411/5681472f550346895db46bfd/html5/thumbnails/19.jpg)
Conclusions
• Good hardware package, but lacking power management
• Network install suitable for cluster operation, but still a few wrinkles
• Generally good configuration management tools, but some divergence from standard Unix tools