StackiFest16: Building a Cluster with Stacki - Greg Bruno
Greg Bruno, VP of Engineering
Workshop: Building a Cluster with Stacki
Datacenter Architecture
Frontend Services
Services to build backend nodes
◦ DHCP
◦ TFTP
◦ Named (optional)
Services to access backend nodes
◦ SSH key management
◦ Parallel execution shell

Host Configuration Spreadsheet

Backend Installation
Save your Host Configuration spreadsheet as a CSV
Import CSV on frontend
◦ “stack load hostfile file=hosts.csv”
Tell backend nodes to install on their next PXE boot
◦ “stack set host boot backend action=install”
PXE boot all backend nodes
Done!
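
The two frontend commands above can be collected into a small wrapper. This is a minimal sketch, assuming the spreadsheet was exported as hosts.csv as on the slide. The names DRY_RUN, run, and install_backends are hypothetical helpers introduced here and are not part of Stacki. By default the sketch only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: wraps the two frontend commands shown on the slide.
# DRY_RUN, run, and install_backends are illustrative names, not Stacki features.

DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_backends() {
    hostfile="$1"
    # Stop early if the exported CSV is missing.
    if [ ! -f "$hostfile" ]; then
        echo "missing host file: $hostfile" >&2
        return 1
    fi
    # Import the Host Configuration spreadsheet on the frontend.
    run stack load hostfile "file=$hostfile" || return 1
    # Tell backend nodes to install on their next PXE boot.
    run stack set host boot backend action=install
}
```

Calling `install_backends hosts.csv` prints both commands. Setting DRY_RUN=0 on a frontend would run them, after which the backend nodes still have to be PXE booted by hand.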

BitTorrent-Inspired Package Installation

Stacki
Custom Partitioning
Storage Partition Spreadsheet

Custom Partitioning
Load the storage partition spreadsheet
◦ “stack list storage partition”
◦ “stack load storage partition file=partition.csv”
Prep the host for reinstall
◦ “stack list host partition”
◦ “stack remove host partition backend-0-0”
◦ “stack set host attr backend-0-0 attr=nukedisks value=true”
◦ “stack set host boot backend-0-0 action=install”
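
The reinstall prep repeats the host name in every command, so it is a natural candidate for a small function. The sketch below assumes the same three commands as the slide. DRY_RUN, run, and prep_reinstall are hypothetical names introduced for illustration. It prints the commands by default rather than running them.

```shell
#!/bin/sh
# Sketch only: emits the reinstall-prep commands from the slide for one host.
# DRY_RUN, run, and prep_reinstall are illustrative names, not Stacki features.

DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prep_reinstall() {
    host="$1"
    if [ -z "$host" ]; then
        echo "usage: prep_reinstall HOST" >&2
        return 1
    fi
    # Remove the partition entries shown by "stack list host partition".
    run stack remove host partition "$host" || return 1
    # Set the nukedisks attribute for this host, as on the slide.
    run stack set host attr "$host" attr=nukedisks value=true || return 1
    # Install on the next PXE boot.
    run stack set host boot "$host" action=install
}
```

For example, `prep_reinstall backend-0-0` prints the three commands from the slide with the host name filled in.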

Adding a Pallet
Adding CentOS to Default Box
We will substitute the “os” pallet with the full CentOS 6.7 distribution
◦ “stack list pallet”
◦ “stack add pallet CentOS*iso”
◦ “stack enable pallet CentOS”
◦ “stack disable pallet os”

Create a New Box with CentOS
Making a new box
◦ “stack list box”
◦ “stack add box centos”
◦ “stack enable pallet CentOS box=centos”
◦ “stack enable pallet stacki box=centos”
Assign a host to a new box
◦ “stack list host”
◦ “stack set host box backend-0-0 box=centos”
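
Creating a box and assigning several hosts to it can be sketched as one function that loops over host names. This assumes the pallet has already been added with “stack add pallet” and that the box should also carry the stacki pallet, as on the slide. DRY_RUN, run, and make_box are hypothetical names introduced here. Commands are printed rather than run by default.

```shell
#!/bin/sh
# Sketch only: creates a box, enables two pallets in it, and assigns hosts.
# DRY_RUN, run, and make_box are illustrative names, not Stacki features.

DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

make_box() {
    if [ "$#" -lt 2 ]; then
        echo "usage: make_box BOX OS_PALLET [HOST...]" >&2
        return 1
    fi
    box="$1"
    os_pallet="$2"
    shift 2
    # Create the box and enable the OS pallet plus the stacki pallet in it.
    run stack add box "$box" || return 1
    run stack enable pallet "$os_pallet" "box=$box" || return 1
    run stack enable pallet stacki "box=$box" || return 1
    # Assign each remaining argument (a host name) to the new box.
    for host in "$@"; do
        run stack set host box "$host" "box=$box" || return 1
    done
}
```

For example, `make_box centos CentOS backend-0-0` prints the slide's commands for the centos box, and any further host names are assigned to the same box.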

Boxes
[Diagram: the Default box contains the OS Pallet and the Stacki Pallet; the CentOS box contains the CentOS Pallet and the Stacki Pallet. A second view of the same diagram adds the hosts backend-0-0 and backend-0-1.]
Why is this hard and important?
The “Step 0” Problem
• Check namenodes are empty
• Format/start HDFS
• Create all directories
• Create all metastores
• Start services (HBase, Hive, Oozie, Sqoop, Impala, etc.)
• Deploy client configuration
• Configure database
• Setup/assign monitors (activity, services, and host)
• Test database connections
• Validate/resolve hostnames
• Consistent host timezones
• No bad kernel versions running
• (CDH) version consistency
• Java version consistency
• Daemons versions consistency
• Mgmt Agents versions consistency
• Host specification/SSH ports
• MUCH MORE …
• DHCP Server/Client setup
• TFTP/PXE configuration
• Server OS installation
• Node OS Install
• RAID configuration
• Boot configuration
• System/data disk partitioning
• Monitoring system setup and config
• Lights Out/IPMI setup
• User accounts added and synced
• SSH keys on all hosts
• Network node configuration
• Config Mgmt install and configuration
• Route configuration
• OS upgrades/updates
• Site specific software and configuration
• Host specification/SSH ports
• Security
• Firewall setup
• Cluster Mgmt utility
• Database install and config
• Multiple network config
• Package installation
• MUCH MORE …
Clusters are Different
Adding new servers does require coordination
Newly added servers must:
• Have same software stack as original servers
• Have same configuration as original servers
• Know about original servers
And, original servers must:
• Know about new servers
Result: The management complexity added to the Operations staff is “exponential”
Exponential Complexity
The Pain Curve
The Pain Threshold
The pain threshold differs for every organization
Function of:
• cluster(s) size
• number of people in Operations
• Operations staff cluster expertise
Moore’s Law
Moore’s Law and Infrastructure Value
What it Means for You
Time is Money
The clock starts ticking when hosts land on your loading dock
Without your applications online, you have a paperweight that consumes power, cooling, and management’s attention