Skynet vs Planet of Apes
S01-E03
Adrien Blind - @adrienblind
Ludovic Piot - @lpiot
Laurent Grangeau - @laurentgrangeau
Jérôme Petazzoni - @jpetazzo
Tweet us on #skynetvsapes! @lpiot @jpetazzo @adrienblind @laurentgrangeau
Casting
Ludovic Piot - @lpiot
Laurent Grangeau - @laurentgrangeau
Adrien Blind - @AdrienBlind
Jérôme Petazzoni - @jpetazzo
Teaser: Deliver smart entertainment
“Two self-managed Docker clusters, deployed on public clouds, fight each other in a ruthless battle.
One has been designed to resist any form of threat.
The other one’s only aim is to destroy the first one.
Who’s going to win?”
Teaser: Deliver smart entertainment
Skynet: Autonomy first
▪ Hybrid: IaaS from Azure & AWS
▪ Self-healed
▪ Religion: Docker Swarm
▪ Custom chaos monkey
VS
Apes: Services first
▪ Various services from AWS
▪ Cloud-healed
▪ Religion: Kubernetes
▪ BBVA’s chaos monkey
Season 01 timeline
▪ S01E01, 05/04, Devoxx BOF: Trailer
▪ S01E02, 19/04, Breizhcamp: Grand Premier
▪ S01E02bis, 11/05, RivieraDev: Rise of the machines
▪ S01E02ter, 15/05, Cloud Europe: Rise of Planet of Apes
▪ S01E03, 22/06, Voxxed Lux: Skynet Returns!
▪ S01E05, 09/11, Devops D-Day: Grand finale
Previously in SkynetVsApes:
● Proof of concept
● Clusters creation
● Decentralised storage: Infinit.sh test
● Netflix’s Chaos monkey test
chaosmonkey principles
▪ Main objective: targets and terminates instances in a region
▪ When: randomly, within a given time range
▪ Frequency: one instance every 2 days (configurable… 😈)
▪ How: identifies instances running a given app through Spinnaker
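The principles above can be sketched in a few lines of Python. This is an illustrative model only, not Netflix's chaosmonkey code: the instance IDs, the date, the termination window, and the `pick_victim` / `pick_time` helpers are all hypothetical, and the real tool obtains its instance list from Spinnaker.

```python
import random
from datetime import datetime, timedelta


def pick_victim(instances, rng):
    """Choose one instance at random among those running the target app."""
    if not instances:
        return None
    # Sort first so the choice depends only on the seed, not on set ordering.
    return rng.choice(sorted(instances))


def pick_time(window_start, window_end, rng):
    """Choose a random moment inside the allowed termination window."""
    span = (window_end - window_start).total_seconds()
    return window_start + timedelta(seconds=rng.uniform(0, span))


rng = random.Random(42)  # seeded only to make the sketch repeatable
instances = {"i-apes-1", "i-apes-2", "i-apes-3"}  # hypothetical instance IDs
start = datetime(2017, 6, 22, 9, 0)   # hypothetical window start
end = datetime(2017, 6, 22, 15, 0)    # hypothetical window end

victim = pick_victim(instances, rng)
when = pick_time(start, end, rng)
print(f"terminate {victim} at {when:%H:%M}")
```

Running this once every two days would give the default frequency mentioned on the slide; shortening that period is where the 😈 comes in.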
chaosmonkey architecture
[Architecture diagram: chaosmonkey]
chaosmonkey install = WTF !
chaosmonkey install = easy!
chaosmonkey install process
▪ AWS account
▪ KOPS: K8s on AWS
▪ Helm: Spinnaker pkg
▪ Spinnaker custom pkg
▪ MySQL pkg
▪ chaosmonkey pod
▪ chaosmonkey docker image
Skynet platform
[Diagram components]
▪ Skynet-cluster (Docker swarm mode cluster)
▪ Three skynet-node instances, each running Docker Engine 17.05-ce
▪ Skynet-storage
▪ Skynet-registry (one per node)
▪ Skynet-resilience (one per node)
▪ Skynet-provisioning
▪ Skynet-terminator
▪ Internet
Skynet-storage (Infinit.sh)
[Diagram components]
▪ Three servers joined in one Infinit network
▪ One Infinit silo per server: 10 GB, 30 GB, and 5 GB
▪ One user per server, with a passport for the network
▪ Infinit volume
▪ One Docker volume plugin per server
▪ One Docker registry container per server
Skynet-provisioning (Terraform)
▪ Using Terraform to automate cluster creation
▪ Leverages Terraform’s multiple providers
  ▪ AWS
  ▪ Azure
  ▪ GCE (soon…)
▪ On master node
  ▪ docker swarm join --token token-master
▪ On slave nodes
  ▪ docker swarm join --token token-slave
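As a minimal sketch of how a provisioning script might assemble the join command for each role: the `build_join_command` helper, the token values, and the manager address `10.0.0.1:2377` are placeholders chosen for illustration, not taken from the Skynet repository. Note that the real `docker swarm join` also expects the address of an existing manager, which the slide leaves out.

```python
def build_join_command(role, tokens, manager_addr):
    """Return the argv list joining a node to the swarm with the given role."""
    if role not in tokens:
        raise ValueError(f"unknown role: {role}")
    return ["docker", "swarm", "join", "--token", tokens[role], manager_addr]


# Placeholder tokens, named as on the slide; real tokens come from the cluster.
tokens = {"master": "token-master", "slave": "token-slave"}

cmd = build_join_command("slave", tokens, "10.0.0.1:2377")  # hypothetical address
print(" ".join(cmd))
```

Building the command as a list rather than a string keeps it safe to hand to a process launcher without shell quoting issues.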
Skynet-resilience
▪ Focused on platform’s resilience
  ▪ Apps resilience provided by swarm’s orchestrator
  ▪ Beware of apps architecture!
▪ A simple docker image / service
  ▪ Encloses Terraform provisioning scripts
  ▪ Deployed as a global service on every node
  ▪ Small introspector script checking whether the underlying Docker engine is the cluster leader, to trigger Terraform regularly
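The introspector idea can be sketched as follows. This is an assumption about how such a script could look, not the actual Skynet-resilience code: the `is_leader` and `heal_once` names are hypothetical, and the sketch assumes the `docker` and `terraform` CLIs are available inside the container. The command runner is injectable so the logic can be exercised without a real cluster.

```python
import subprocess

# Ask the local engine whether it is the swarm leader.
# On worker nodes this command fails, which is treated as "not leader".
LEADER_CHECK = ["docker", "node", "inspect", "self",
                "--format", "{{ .ManagerStatus.Leader }}"]


def is_leader(run=subprocess.run):
    """Return True only when the local engine reports itself as swarm leader."""
    try:
        result = run(LEADER_CHECK, capture_output=True, text=True, check=False)
    except OSError:
        return False  # docker CLI not available on this host
    return result.returncode == 0 and result.stdout.strip() == "true"


def heal_once(run=subprocess.run):
    """Re-apply the Terraform configuration, but only from the leader."""
    if not is_leader(run):
        return False
    run(["terraform", "apply", "-auto-approve"], check=False)
    return True
```

Calling `heal_once()` in a loop with a sleep between iterations gives the "regularly" part; because the service is global, every node runs the check but only the current leader triggers Terraform.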
Skynet-terminator
▪ Chaos monkey to destroy Apes
▪ A simple docker image / service
  ▪ Encloses a script to shoot Apes’ instances
  ▪ Deployed as a single replica
Some next steps to strengthen the setup
▪ Skynet
  ▪ Finalize decentralized storage for Skynet
  ▪ Build up our dedicated, purpose-built, immutable OS with LinuxKit
  ▪ Let’s be serious: achieve Skynet’s self-healing with InfraKit
  ▪ Unleash the cloud with edge computing: push armed nodes on Raspberry Pis
  ▪ Implement the least privilege model to enhance resiliency
▪ Apes
  ▪ Unleash its own chaos monkey
  ▪ Federate several clusters
▪ These are platforms: still have to deliver a brain app
Deep dive: Least Privilege Model
What’s “Least Privilege”?
▪ Give each agent* in the system exactly the information it needs; no more, no less
▪ Similar to compartmentalization of classified information
  ▪ In the Manhattan Project, most teams working on the first A-bomb didn’t have access to the big picture
▪ In information security: if a service doesn’t need a particular password, token, or permission, it shouldn’t have it!
*Process, user, service, node...
Least Privilege in SwarmKit
▪ In a SwarmKit cluster, the only nodes with full access are the manager nodes
▪ Worker nodes do not have access to the Raft log
▪ Worker nodes only know the addresses of the manager nodes
▪ Worker nodes only have their own private key (and CA cert)
▪ Communication between two nodes uses a session key (different at each connection; rotated every 12 hours)
Worker nodes are like Jon Snow
▪ A worker node has access to detailed information about a container only if it is designated to run this container
▪ A worker node knows about an overlay network (and the associated keys, if it’s an encrypted overlay) only if it is supposed to run a container attached to this network
▪ Overlay networks are resilient to MITM attacks due to the top-down approach of their configuration (see this DockerCon talk by @lbernail for details)
▪ Secrets (a special SwarmKit construct) are only pushed to the worker nodes that need them, and never written to their disk
SwarmKit clusters are resilient
▪ Least privilege reduces the splash damage of node compromise
▪ Compromising a node only compromises the containers on that node; not the whole cluster
▪ Enables effective definition of security perimeters through tags
▪ Services can be restricted to specific perimeters through placement constraints
docker service create --constraint node.labels.security==low ...
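To make the effect of such a constraint concrete, here is a small model of how a `node.labels.<key>==<value>` constraint narrows down the eligible nodes. It is a simplified illustration, not SwarmKit's scheduler: the node names and labels are invented, and only the `==` form of a label constraint is handled.

```python
def eligible_nodes(nodes, constraint):
    """Return the names of nodes whose labels satisfy 'node.labels.K==V'."""
    key, _, value = constraint.partition("==")
    prefix = "node.labels."
    if not key.startswith(prefix) or not value:
        raise ValueError(f"unsupported constraint: {constraint}")
    label = key[len(prefix):]
    return sorted(name for name, labels in nodes.items()
                  if labels.get(label) == value)


# Hypothetical cluster: node name -> labels set by an operator.
nodes = {
    "core-1": {"security": "high"},
    "compute-1": {"security": "low"},
    "compute-2": {"security": "low"},
    "honeypot-1": {},  # unlabeled nodes never match a label constraint
}

print(eligible_nodes(nodes, "node.labels.security==low"))
```

With these sample labels, only the two compute nodes can receive tasks of the constrained service, so a compromise of that service stays inside the low-security perimeter.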
Enforcement through authorization
▪ By default, the Docker API is all or nothing
▪ Authorization plugins* let you vet each API request
▪ Example:
  ▪ deny deployment requests lacking a placement constraint specifying a security tag
  ▪ only allow deployment in a given security zone if the authenticated user is within the appropriate group
▪ Authorization plugins can be cascaded
*Available on Docker CE & EE. UCP is an authorization plugin.
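The two example rules could be expressed as a policy function like the one below. This is only a sketch of the decision logic, under several assumptions: the `vet_service_create` name, the `zone-<level>` group naming, and the sample users are hypothetical, and a real authorization plugin would receive the raw API request from the Docker daemon rather than an already parsed dictionary. The `TaskTemplate.Placement.Constraints` path follows the service specification of the Docker Engine API.

```python
def vet_service_create(spec, user_groups):
    """Return (allowed, reason) for a service creation request."""
    constraints = (spec.get("TaskTemplate", {})
                       .get("Placement", {})
                       .get("Constraints", []))
    prefix = "node.labels.security=="
    zones = [c[len(prefix):] for c in constraints if c.startswith(prefix)]
    # Rule 1: every deployment must name a security zone.
    if not zones:
        return False, "missing security placement constraint"
    # Rule 2: the user must belong to the group matching each requested zone.
    for zone in zones:
        if f"zone-{zone}" not in user_groups:  # hypothetical group naming
            return False, f"user not allowed in zone {zone}"
    return True, "ok"


spec = {"TaskTemplate": {"Placement": {
    "Constraints": ["node.labels.security==low"]}}}

print(vet_service_create(spec, {"zone-low"}))
print(vet_service_create(spec, {"zone-high"}))
print(vet_service_create({}, {"zone-low"}))
```

Because plugins can be cascaded, a policy like this could sit alongside other checks, each one able to veto the request.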
Skynet
▪ An AI like Skynet could leverage different security levels…
▪ Core services
  ▪ managers, storage, self-healing routines
  ▪ self-provisioned instances on well-protected, well-funded IaaS accounts
▪ Compute nodes
  ▪ machine learning, deep learning
  ▪ instances hacked a long time ago, or deployed with fragile funds
▪ Honeypots
  ▪ scamming and phishing operations, quarantine
  ▪ anything goes!
Deep dive: Self-Healing Infrastructure
From imperative to declarative
▪ Example: managing VM deployment with IaaS
▪ Imperative style
  ▪ I create VMs directly
  ▪ I use a web console, CLI, API…
  ▪ when things break, I have to find out what and fix it
▪ Declarative style
  ▪ I describe what I want (with a CloudFormation template, Terraform plan…)
  ▪ I run a tool to reconcile my infrastructure with the description I wrote
  ▪ when things break, I just run the tool again
  ▪ enables infrastructure as code
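The reconcile step at the heart of the declarative style can be modelled in a few lines. This is a toy illustration of the idea, not how Terraform or CloudFormation compute their plans: the `plan` function, the resource names, and the counts are all invented for the example.

```python
def plan(desired, actual):
    """Compare desired and actual instance counts; return the actions to take."""
    actions = []
    for name in sorted(set(desired) | set(actual)):
        diff = desired.get(name, 0) - actual.get(name, 0)
        if diff > 0:
            actions.append(("create", name, diff))
        elif diff < 0:
            actions.append(("destroy", name, -diff))
    return actions


desired = {"manager": 3, "worker": 5}              # what I described
actual = {"manager": 3, "worker": 2, "orphan": 1}  # what really exists

for action in plan(desired, actual):
    print(action)
```

The operator never says how to repair the cluster: the tool derives the actions from the gap between the description and reality, which is why "just run the tool again" works after a failure.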
From declarative to self-healing
▪ Self-healing infrastructure continuously reconciles (fixes) itself
▪ Examples:
  ▪ AWS Auto Scaling groups
  ▪ Convox (leverages AWS Lambda)
  ▪ InfraKit (cross-platform approach)
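What separates self-healing from a one-off declarative run is the loop: observation and correction happen continuously, with nobody triggering them. The sketch below simulates that loop in memory; `self_heal`, `observe`, and `act` are hypothetical names, the state is a plain dictionary rather than real cloud instances, and a real controller would sleep between rounds and run indefinitely.

```python
def self_heal(desired, observe, act, rounds):
    """Repeatedly observe the real state and correct any drift from `desired`."""
    fixes = 0
    for _ in range(rounds):
        actual = observe()
        for name, want in desired.items():
            have = actual.get(name, 0)
            if have != want:
                act(name, want - have)  # positive: create, negative: destroy
                fixes += 1
    return fixes


state = {"worker": 3}  # simulated infrastructure


def observe():
    return dict(state)


def act(name, delta):
    state[name] = state.get(name, 0) + delta


state["worker"] -= 2  # a chaos monkey shoots two workers
fixes = self_heal({"worker": 3}, observe, act, rounds=3)
print(state, fixes)
```

The first round repairs the damage and the following rounds find nothing to do, which is the steady state a self-healing platform aims for.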
The story in which you can be the hero
Work in (perpetual) progress!
Propose cool hacks: pull requests on the repo
… and be credited for your participation!