Developing the Stratoscale System at Scale - Muli Ben-Yehuda, Stratoscale - devopsdays Tel Aviv 2015
Developing the Stratoscale System at Scale
Muli Ben-Yehuda, Chief Scientist
devopsdays TLV
October 2015
What is the Stratoscale System?
Run Virtual Machines & Containers
High Performance Storage
Intelligent Resource Management
The Stratoscale operating system turns any cluster of standard x86-servers into a single, intelligent, private cloud for running virtual machines and containers.
Stratoscale provides all the necessary software components - including software-defined storage and networking, compute (hypervisor), and management services - required for building and running your very own cloud infrastructure.
Who We Are
• Software focused on intelligent, scale-out hyper-converged infrastructure
• Targeting Enterprises, Service Providers and Web-Scale users
• Founded in 2013
• Backed by top VCs and strategic investors
• $42M in two investment rounds
• 12 patents submitted, 20+ drafts to be filed
• Experienced management team
• Anobit, Waze, XIV, Mellanox, Primesense, Panaya, ...
• Team of 70+ leading experts
• Based in Herzliya, Israel and Boston, MA
What This Talk Is About: Scaling
● People
● Processes
● Systems
● Development
Context: The Technology Stack
● Remote Memory
● AC/PC Live Migration
● SLA-Based Computing
● Scale-Out Distributed Storage
● Cloud Management Stack
● HA Clustering
● Analysis and Insight Generation
● Memory Dedup & Compression
● Storage Dedup & Compression
● Single Pane Mgmt
● Standard APIs
It's not your usual devops env
● Kernel and hypervisor
● Distributed Storage
● Networking
● Cloud management
● UI/UX
In the beginning
● 0-10 developers
● Single git repo for all company source code
● Atlassian Bamboo for CI/CD
● Softlayer bare-metal servers
CI not keeping up
● 10-20 developers
● Let's write our own CI system
● How hard can it be?
SO YOU BELIEVE WRITING YOUR OWN CI WAS A GOOD IDEA?
TELL ME MORE ABOUT GROWTH
Growing pains
● 20+ developers
● Build times are long and getting longer
● Multiple build types
● Rapid growth
● 1 vanilla, 3 vanillas, 5 vanillas, …
● Everyone is an owner → no one is an owner
● Cascading changes affect everyone immediately
● Build is broken more often than it is not
Scaling development (tools)
● Goal: 50+ developers
● First, we need some tools
● Osmosis is an rsync replacement with git tendencies
● Solvent is a build artifact repository
● The Inaugurator is a tiny Linux image that does self-provisioning for bare-metal servers
● Upseto is a repo/git-submodule replacement
● These tools and others are available at https://github.com/Stratoscale/
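One way to read "an rsync replacement with git tendencies" is a transfer tool built on content-addressed storage: file contents are stored once under their hash, and a directory tree becomes a small manifest of paths and hashes. The following is a minimal sketch of that general idea only. It is not Osmosis's actual design or API; `ContentStore`, `checkin`, and `checkout` are hypothetical names invented for illustration.

```python
import hashlib
import os


class ContentStore:
    """Toy content-addressed object store (hypothetical; not Osmosis itself)."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, data: bytes) -> str:
        """Store a blob under its SHA-1 digest; identical content is stored once."""
        digest = hashlib.sha1(data).hexdigest()
        path = os.path.join(self.root, digest)
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(data)
        return digest

    def get(self, digest: str) -> bytes:
        """Read a blob back by its digest."""
        with open(os.path.join(self.root, digest), "rb") as f:
            return f.read()


def checkin(store, directory):
    """Store every file under `directory`; return {relative path: digest}."""
    manifest = {}
    for base, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(base, name)
            with open(full, "rb") as f:
                digest = store.put(f.read())
            manifest[os.path.relpath(full, directory)] = digest
    return manifest


def checkout(store, manifest, directory):
    """Recreate the tree described by `manifest` under `directory`."""
    for rel, digest in manifest.items():
        target = os.path.join(directory, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as f:
            f.write(store.get(digest))
```

With a layout like this, two build outputs that share most of their files cost little extra storage or transfer, which is one plausible reason such a tool would suit build artifacts and provisioning images.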
Scaling development (tests)
● Unit tests as part of dev flow
● make → run unit tests
● Jenkins → continuous build
● Unit tests for function/class/multiple classes
● Whitebox tests for testing daemons at the API level
● voodoo for mock objects (https://github.com/shlomimatichin/Voodoo-Mock)
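As an illustration of unit testing a class against a mock object, here is a minimal sketch. It uses Python's standard `unittest.mock` rather than Voodoo-Mock, whose API is not shown in the talk. The `Installer` class and its `provisioner` collaborator are hypothetical and exist only for this example.

```python
import unittest
from unittest import mock


class Installer:
    """Hypothetical class under test: installs an image on a list of nodes."""

    def __init__(self, provisioner):
        self._provisioner = provisioner

    def install(self, nodes, image):
        """Provision every node; return the nodes that failed."""
        failed = []
        for node in nodes:
            if not self._provisioner.provision(node, image):
                failed.append(node)
        return failed


class InstallerTest(unittest.TestCase):
    def test_reports_nodes_that_fail_to_provision(self):
        # The mock stands in for the real provisioner, so no hardware is needed.
        provisioner = mock.Mock()
        # First call succeeds, second call fails.
        provisioner.provision.side_effect = [True, False]

        failed = Installer(provisioner).install(["node1", "node2"], "image-a")

        self.assertEqual(failed, ["node2"])
        provisioner.provision.assert_has_calls(
            [mock.call("node1", "image-a"), mock.call("node2", "image-a")]
        )
```

Saved as a module, this runs with `python -m unittest`, which is the kind of step a `make` target can invoke so that unit tests run on every build.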
The Rackattack
● Running lots of tests at scale requires lots of iron
● Some tests can run in VMs
● But there is no substitute for bare-metal servers
● Rackattack allocates, provisions, and reclaims bare-metal servers in seconds, using osmosis/solvent/inaugurator
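The allocate → provision → reclaim lifecycle can be sketched as a context manager, so that nodes are returned to the pool even when a test fails. This is a toy in-memory model, not Rackattack's real client API; `NodePool` and `allocation` are hypothetical names, and "provisioning" here only records an image label.

```python
import contextlib


class NodePool:
    """Hypothetical in-memory stand-in for a pool of bare-metal servers."""

    def __init__(self, hostnames):
        self._free = list(hostnames)
        self._images = {}

    def allocate(self, count):
        """Take `count` nodes out of the free list, or fail if too few remain."""
        if count > len(self._free):
            raise RuntimeError("not enough free nodes")
        nodes, self._free = self._free[:count], self._free[count:]
        return nodes

    def provision(self, nodes, image):
        """Record which image each node runs (a real system would install it)."""
        for node in nodes:
            self._images[node] = image

    def reclaim(self, nodes):
        """Wipe the recorded image and return the nodes to the free list."""
        for node in nodes:
            self._images.pop(node, None)
            self._free.append(node)

    def free_count(self):
        return len(self._free)

    def image_of(self, node):
        return self._images.get(node)


@contextlib.contextmanager
def allocation(pool, count, image):
    """Allocate and provision nodes; always reclaim them on exit."""
    nodes = pool.allocate(count)
    try:
        pool.provision(nodes, image)
        yield nodes
    finally:
        pool.reclaim(nodes)
```

A test would then wrap its body in `with allocation(pool, 2, "some-image") as nodes:` and never has to clean up hardware explicitly.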
Subsystem tests
● Isolate subsystems
● Management
● Storage
● Networking
● Cluster
● Runtime
● Test features
● Integration with neighbours
System tests
● End-to-end testing
● Using API/CLI/GUI
● Allocate nodes → install system → run test scenarios
● Test user stories
Problems solved
● 50+ developers
● Fast dev → test → deploy → run cycles
● Fast provisioning of bare-metal and virtual test envs
● Rapid test feedback
● Automated tools for dev/test/ops
● Eat our own dogfood
The next challenges
● 2x order of magnitude scaling
● 200+ developers x 1K nodes per developer
● We need better API definitions
● We need better testing coverage
● We need ingrained best practices
● (Even more) continuous integration & continuous delivery
● How do you do on-premise continuous delivery?
● Serviceability - call home, logs, analysis, ...
In conclusion
● Scaling up is hard to do
● Sense of accomplishment: guaranteed
● Different approaches for different growth stages
● Find the right mix of DIY and available solutions
● Testing is crucial
● Devops is not just for web apps
Thank you!