Implementing Dual-Boot Clusters in a
Distributed Environment
Surajit Bose, Technology Services Manager
Dustin King, Systems Imaging Architect
Our Environment
• Not central IT
• Over 100 computer clusters, mostly unstaffed
• Dorms, Row Houses, Graduate Residences
• Central and Branch Libraries
• Student Centers
• Most open 24/7
• Approximately 500 cluster machines
• Historically, even mix of Dells and Apples
Our Prior Infrastructure
• Campus-wide Kerberos authentication
• PXE/Ghost for Windows imaging
• Windows machines joined to AD
• Domain scripts for Windows maintenance
• NetRestore for Mac imaging
• Macs bound to LDAP
• Radmind for Mac maintenance
• Linux server environment
Why Dual-Boot?
• Bypass question of optimal platform mix
• Improve availability of single-platform software
• Provide choice for students
• Homogenize inventory
• Seemed like a cool thing to try
Desiderata
• Network-based full-disk imaging
• Platform parity
• Manage each platform independently
• Ease of switching OS
• Non-ridiculous login times
• Server-side control
• Consistent imaging process across hardware
• Shared local storage across OSes
What We Discovered
• Managing the reboot cycle is difficult
• Existing solutions unsatisfactory for us
  • BootPicker, NetRestore/WinClone Mac-centric
  • rEFIt makes management difficult
• No network boot environment works for both Dell and Apple machines
• Partition order matters
What We Decided
• Control boot process with EFI shell environment (SCUBA)
• Inter-OS communication via locally stored state file
• NetBoot install environment (Genie)
• Use convoluted partition scheme
• Use Paragon NTFS and MacDrive
• Use customized login screens
• Nightly maintenance reboots
• Server-side tracking of machine state
EFI Shell Environment
• Boot to EFI shell
• Fits on a flash drive for full-disk imaging
• Shell modified to ignore keyboard interrupts
• EFI toolkit has network stack, http client, Python
• Startup script
  • validates nvram boot options
  • checks with server
  • reads and updates local state file
  • sets nextboot value in nvram
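The four startup-script steps can be sketched roughly as follows. This is only an illustration in Python (which the EFI toolkit includes), not the presenters' script: the function name `startup`, the `cached_server_flags` key, and the injected `fetch_server_flags` and `pick_target` callables are all hypothetical, and falling back to the cached server response when the network is down is an assumption. Reading and writing NVRAM is left to the caller, since the slides do not show how the real script does it.

```python
from typing import Callable, Dict, Iterable, Tuple


def startup(
    nvram_boot_options: Iterable[str],
    expected_boot_options: Iterable[str],
    fetch_server_flags: Callable[[], Dict[str, str]],
    state: Dict[str, object],
    pick_target: Callable[[Dict[str, str], Dict[str, object]], str],
) -> Tuple[str, Dict[str, object]]:
    """Return (next_boot, new_state) for one pass of the startup script."""
    # Step 1: validate that every expected boot option exists in NVRAM.
    present = set(nvram_boot_options)
    missing = [name for name in expected_boot_options if name not in present]
    if missing:
        raise RuntimeError("missing NVRAM boot options: " + ", ".join(missing))

    # Step 2: check with the server; on failure, reuse the cached response
    # (assumption: the cache in the state file is used as a fallback).
    new_state = dict(state)
    try:
        server_flags = fetch_server_flags()
        new_state["cached_server_flags"] = server_flags  # step 3: update state
    except OSError:
        server_flags = new_state.get("cached_server_flags", {})

    # Step 4: decide the next boot target; the caller writes it to NVRAM.
    next_boot = pick_target(server_flags, new_state)
    if next_boot not in present:
        raise RuntimeError("no NVRAM boot option for target: " + next_boot)
    return next_boot, new_state


def offline() -> Dict[str, str]:
    """Stand-in for a server check that fails because the network is down."""
    raise OSError("network unreachable")


# Example: the server is unreachable, so the cached suggestion is used.
next_boot, state = startup(
    ["mac", "windows"],
    ["mac", "windows"],
    offline,
    {"cached_server_flags": {"suggested": "windows"}},
    lambda server, local: server.get("suggested", "mac"),
)
```

Passing the server check and the target picker in as callables keeps the sketch testable without real firmware or a real server.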
Priority of Boot Flags
• Required (from server)
• Mac Maintenance (from local state file, set by script)
• Windows Maintenance (from local state file, set by script)
• Requested (from local state file, set by user)
• Suggested (from server)
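As a minimal sketch of this ordering, the function below walks the flags from highest to lowest priority and returns the first one that is set. The dictionary keys (`required`, `mac_maintenance`, `windows_maintenance`, `requested`, `suggested`), the target names `"mac"` and `"windows"`, and the default are hypothetical; the slides do not give the real names or say what happens when no flag is set.

```python
from typing import Mapping


def choose_boot_target(
    server: Mapping[str, object],
    state: Mapping[str, object],
    default: str = "mac",  # assumption: fallback when no flag is set
) -> str:
    """Pick the OS to boot next, honoring the flag priority on the slide."""
    # 1. Required (from server)
    if server.get("required"):
        return str(server["required"])
    # 2. Mac maintenance (from local state file, set by script)
    if state.get("mac_maintenance"):
        return "mac"
    # 3. Windows maintenance (from local state file, set by script)
    if state.get("windows_maintenance"):
        return "windows"
    # 4. Requested (from local state file, set by user)
    if state.get("requested"):
        return str(state["requested"])
    # 5. Suggested (from server)
    if server.get("suggested"):
        return str(server["suggested"])
    return default


# A user request outranks a server suggestion...
user_choice = choose_boot_target({"suggested": "mac"}, {"requested": "windows"})
# ...but a pending maintenance flag outranks the user request.
maintenance = choose_boot_target({}, {"windows_maintenance": True, "requested": "mac"})
```

Here `user_choice` is `"windows"` and `maintenance` is `"windows"`: the first because the request beats the suggestion, the second because maintenance beats the request.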
Local State File
• Houses maintenance and requested boot flags
• Caches most recent response from the server
• Has to be writable from both OSes as well as EFI shell environment
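Because the file must be writable from Mac OS X, Windows, and the EFI shell, a very plain text format is a natural fit. The sketch below assumes a `key=value` line format and hypothetical helper names (`read_state`, `write_state`, `update_state`); the slides do not describe the actual file format. It writes to a temporary file and then renames it, which reduces the chance of leaving a half-written file, although a rename on a FAT volume is not guaranteed to be atomic.

```python
import os
import tempfile
from typing import Dict


def read_state(path: str) -> Dict[str, str]:
    """Parse key=value lines; a missing file yields an empty state."""
    state: Dict[str, str] = {}
    try:
        with open(path, "r", encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue  # skip blanks, comments, and malformed lines
                key, _, value = line.partition("=")
                state[key.strip()] = value.strip()
    except FileNotFoundError:
        pass
    return state


def write_state(path: str, state: Dict[str, str]) -> None:
    """Write the state to a temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".state-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as fh:
            for key in sorted(state):
                fh.write(f"{key}={state[key]}\n")
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def update_state(path: str, **changes: str) -> Dict[str, str]:
    """Read, modify, and rewrite the state file; return the new state."""
    state = read_state(path)
    state.update(changes)
    write_state(path, state)
    return state
```

For example, a login screen could record a user's choice with `update_state(path, requested="windows")`, and a maintenance script could later add `mac_maintenance="1"` without disturbing the existing entry.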
Genie
• Based on NetInstall set built with Mac OS X Server Admin Tools
• Bash scripts check server for configuration and manage imaging process
• Report progress through iHook
Windows Login Screen (pGina)
Mac Login Screen (SCUBA)
Partition Scheme
• EFI System Partition: leave alone per Apple recommendation
• FAT: store Windows images and local state file
• NTFS: local storage space for users
• NTFS: Windows system partition
• HFS+: EFI shell environment
• HFS+: Mac system partition
Handling Partitions
• Mac OS X
  • Paragon NTFS
  • Remount volumes under /Library/Mounts
• Windows XP
  • MacDrive
  • Some partitions already invisible
  • Remount volumes under c:\stucomp\mnt
Nightly Maintenance
• Scripts on each OS write maintenance flags into state file
• Windows
  • Python reboot service
  • Domain startup scripts
• Mac
  • Radmind
  • iHook
Server-Side Setup
• Genie
• Background downloads
• SCUBA flags
• Printer configuration
• Imaging request page
• Status “database”
Gotchas
• Per-seat licensing costs
• Mouse and keyboard confusion
• NetBoot memory management horror
• Windows reboot behavior
• Time and Kerberos logins
• Permissions on shared volumes
• SSH keys
Planned Enhancements
• Improve build processes for EFI, NetBoot environments
• Increase structural similarity of configuration and management between platforms
• Implement PKI for client-server communications
• Explore emerging solutions (e.g. XHooks)
• Implement cross-platform monitoring system
• Reduce power usage on clients
• Create documentation
• Release as open-source
Acknowledgments
• Karl Kuehn, Software Image Developer
• Alex Schorsch, Student Developer
• Fangling Zhang, Student Developer
• Paul Nuyujukian, Student Developer
• Ian Comfort, Systems Administrator
Questions?