1 COMPANY CONFIDENTIAL – DO NOT DISTRIBUTE #Perform2015
Mastering Continuous Monitoring in a Microservices World
feat. Ansible, Docker, Mesos & Co.
Monitoring for a Microservices-ready Web Hosting Platform
• Romain Bigeard
• Senior Consultant at Avocado Consulting
• Infrastructure Background
• APM and Automation for the past four years
• https://au.linkedin.com/in/rbigeard
• @romainbigeardIT
Who am I
Big Telco (37000+ employees)
Project: VAS Platform migration from Solaris to a Private Linux Based Cloud
The Customer
• Focus on elasticity, automation and DevOps principles
• Ready for microservices migration
• Identical prod and non-prod environments
• Holistic monitoring
Platform Goals
• Blocks of VMs can be spun up at will through a Web GUI
• Each Block has a Virtual Load Balancer
• Dockerized Applications are then deployed to blocks by a Continuous Deployment System (Atlassian Bamboo)
• Ansible Chosen as “Automation” Glue
Focus on Automation
Block Creation, Container Deployment, Customer Traffic
Web GUI
Technology Choices
• YAML Syntax easy to comprehend
• Agentless architecture
• Combines configuration management and orchestration
• Good match with Docker
• Integrates well with CI systems
Ansible
Ansible
• Light-weight containers
• Perfect match for running microservices
• Allows for rapid deployment/redeployment
• Containers contain all application dependencies
• Container images can inherit from existing containers
• Huge Ecosystem: https://www.mindmeister.com/389671722/docker-ecosystem
• Docker images distributed by a registry (think App Store)
Docker
Monitoring Challenges
• Must be elastic
• Must provide state-of-the-art Application Performance Monitoring
• Must integrate with current “corporate-wide” monitoring and alerting solution (HPOV)
• Must add value to every environment: Dev, SVT, Production
Monitoring Challenges
Dynatrace fulfills those requirements!
ELK complements Dynatrace for Log Processing
Three layers of Dynatrace infrastructure must be automated
Monitoring Challenges
Dynatrace Agent Automation
• Leverages Docker image ‘Inheritance’
• Dynatrace Agent integrated in “Parent” images for Tomcat and JBoss
• Application Startup Script determines Dynatrace Agent configuration via DNS TXT Records
Dynatrace Agent Automation
• Docker can build images automatically by reading the instructions from a Dockerfile
• Each image consists of a series of layers
Docker Image Inheritance
# Parent image ships with the Dynatrace Agent inside
FROM docker-registry.acme.com/tomcat7:7.0.60.1
RUN rpm -ivh http://repo/application1-6.2.0.1238-linux-3.tls1.rpm
CMD ["/usr/sbin/application1"]
Dockerfile Example
docker-registry/tomcat7:7.0.60.1
docker-registry/application:1.0.0.1
Dynatrace Agent
Application startup script queries DNS:
dig -x 10.1.1.51 TXT +short
"node_environment=production"
"node_shortname=vm001"
"node_service=application1"
This allows the script to build:
JAVA_OPTS="-agentpath:${dynatrace_binary}=name=${node_service}-${node_environment},server=${dynatrace_server} $JAVA_OPTS"
Determine Dynatrace Agent Config via DNS TXT
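A minimal sketch of how a startup script could turn the TXT answers into the agent option. The helper names txt_value and build_agent_opts are hypothetical and do not come from the talk; the agent library path and the server address are placeholders that the caller passes in.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not from the talk.

# txt_value KEY RECORDS
# Print the value for KEY from TXT answers such as "node_service=application1".
txt_value() {
    printf '%s\n' "$2" | tr -d '"' |
        awk -F= -v key="$1" '$1 == key { print $2; exit }'
}

# build_agent_opts RECORDS AGENT_LIBRARY SERVER
# Print the -agentpath option, or fail if a required record is missing.
build_agent_opts() {
    service=$(txt_value node_service "$1")
    environment=$(txt_value node_environment "$1")
    if [ -z "$service" ] || [ -z "$environment" ]; then
        echo "missing node_service or node_environment in TXT records" >&2
        return 1
    fi
    printf '%s\n' "-agentpath:$2=name=${service}-${environment},server=$3"
}

# Possible wiring in a startup script (node_ip, dynatrace_binary and
# dynatrace_server are assumed to be set elsewhere):
#   records=$(dig -x "$node_ip" TXT +short)
#   JAVA_OPTS="$(build_agent_opts "$records" "$dynatrace_binary" "$dynatrace_server") $JAVA_OPTS"
```

Failing fast when a record is missing avoids starting an application with an agent name such as "-production", which would be hard to trace back later.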
Dynatrace Collector Automation
• Dynatrace Collectors run like any other Dockerized Application on the Platform
• Dynatrace Collectors are behind a load balancer
• They are deployed by the CD System (Bamboo)
• Number of Collectors can be scaled up within minutes!
Dynatrace Collector Automation
Dynatrace Server Automation
• Ansible playbook deploys the Dynatrace Server
• Dynatrace Server considered a core service on the platform
• Dynatrace offers its own Ansible playbooks/roles: https://github.com/dynatrace/Dynatrace-Ansible
Dynatrace Server Automation
End Result
End Result
[Diagram: Customer Traffic, Application, Dynatrace Collector Block, Dynatrace Server]
Integration with Corporate Alerting Solution
SNMP Traps
Same monitoring solution used in every environment.
• in Dev to allow for rapid performance testing
• in Volume Testing for every run, in order to spot possible performance problems under load
• in Production for Alarming
• in Production for Diagnosing Problems
Dynatrace is used…
• Elastic platform
• Ready for microservices migration
• Strong cooperation between Dev and Ops
• Monitoring solution spans every environment
Overall Solution Benefits
Managing and scaling microservices with Apache Mesos and Ruxit
• Alexander Ramos Jardim
• IT Operations Manager at B2W
• Working at B2W since 2008
• Architecture design
• Operations management
• Incident response management
Who am I
americanas.com.br
submarino.com.br
shoptime.com.br
soubarato.com.br
http://www.soubarato.com.br
• 25% market share of Brazilian e-commerce market
• Largest e-commerce company in Latin America
• Revenue: R$10 billion (US$2.5 billion)
• 4 brands: Submarino, Americanas.com, Shoptime, Soubarato
• 2 carriers and 9 distribution centers
B2W Digital
The reign of monolith applications
Java + Spring
WebLogic
Numerous apps per JVM
Backend does all the work: HTML, business logic, data access, you name it
Outdated monolithic architectures
[Diagram: HARDWARE, OPERATING SYSTEM, two JVMs each running APP1 and APP2]
Changes are slow and risky!
We’ve had difficulty innovating on our apps
[Diagram: HARDWARE, OPERATING SYSTEM, two JVMs each running APP1 and APP2]
Why did the site break?
July 2014: Black Night
• Too many requests
• Architecture couldn't scale properly
• Team couldn't react to incidents
Board realized we needed a microservices approach
What happened?
• Going microservices
• Going DevOps
• Decentralized governance
• Hybrid cloud infrastructure (AWS + private)
This involved...
2015: PaaS
But why..?
So many apps!
Lack of technical standards
But where is it?
[Diagram: APP1, APP2]
APP1 depends on APP2
Where is this specified?
Lack of integration between Ops tools
Now, we have...
‘Go’ is coming soon
Allows Devs to use any technology
One CI fits all technologies
Deploys Docker containers in 2 seconds
Marathon
Example Mesos Architecture
Source: DigitalOcean
Easier Deployments
Dev pushes code
Hook triggers build
Docker image built + test + Docker push
Relax, your code is ready for production
Dev triggers deployment
Runs deployment
Deploys containers on Mesos cluster
Marathon
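The talk does not show how the deployment step talks to Marathon. One plausible shape, sketched here under that assumption, is to generate an app definition and POST it to Marathon's /v2/apps REST endpoint. The function names, app id, image name, and resource sizes are illustrative only.

```shell
#!/bin/sh
# Sketch only: not B2W's actual pipeline. Values below are placeholders.

# marathon_app_json APP_ID IMAGE INSTANCES
# Print a Marathon app definition that runs IMAGE as a Docker container.
marathon_app_json() {
    cat <<EOF
{
  "id": "$1",
  "instances": $3,
  "cpus": 0.5,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "$2" }
  }
}
EOF
}

# deploy_app MARATHON_URL APP_ID IMAGE INSTANCES
# Create the app by POSTing the definition to Marathon.
deploy_app() {
    marathon_app_json "$2" "$3" "$4" |
        curl --fail --silent --show-error -X POST \
             -H 'Content-Type: application/json' \
             --data @- "$1/v2/apps"
}

# Possible call from a CI job (URL and names are hypothetical):
#   deploy_app http://marathon.example:8080 /shop/cart registry.example/cart:1.0.0 3
```

Keeping the JSON generation separate from the HTTP call makes the definition easy to inspect or store in version control before anything reaches the cluster.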
But, wait…
What's happening inside my containers?
Now we need visibility into API and job management with Docker
Ruxit completes the puzzle
The day Marathon died
The network retransmission episode
No changes to applications.
No plugins.
One agent does it all.
• We still have lots of auxiliary operational tools to be integrated
• Need deeper integration of Ruxit into our ticketing system
• Need to integrate Ruxit and persistence technologies
Looking ahead