Breaking Up with Your Data Center (presentation)
DESCRIPTION
Telescope Inc.'s presentation at the AnsibleFest conference. In this talk, they discuss how they moved from a bare-metal datacenter to AWS, what they learned along the way, and how they scale up to meet voting demand with Ansible.
TRANSCRIPT
WHO IS TELESCOPE?
Industry Leader in Participation Media
Offices in Los Angeles, New York, London and Brazil
Real-time Platform – Powering most demanding high profile audience participation campaigns for TV, Sports and Brands
Unique traffic path and unique requirements/High Capacity/Unique Traffic Spikes
WHO IS TELESCOPE?
Since 2002, leading media companies and brands have entrusted Telescope to power and optimize results on their most demanding and highest-profile audience engagement initiatives.
Our participation platform enables several billion interactive experiences every year.
SHOW NIGHT TRAFFIC
[Chart: total vote transactions per minute on a show night, from about 5:55 PM to 12:20 AM]
INTRO SLIDES ARE THE BEST!
Why we moved from bare metal to AWS
Why we chose Ansible for our automation
How Ansible solved our automation challenges
OUR FIRST LOVE
6 racks across two buildings at One Wilshire
2 SysAdmins responsible for hardware and operations
No environment for developers to play in
Our hardware was restricting our ability to grow
Long lead time to build or increase total capacity:
Research latest hardware
Order hardware
Order power and space
Rack, stack, cable
Install all the things
IS THIS RIGHT FOR US?
Hard to justify building up a datacenter
Telescope’s case required significant burst capacity
Business needs changed quickly
Support large architecture with a small team
Wanted to automate, but didn’t know where to start
SO MANY OPTIONS
Rackspace, AWS, Azure, Softlayer, Google, Oracle, Digital Ocean, Heroku
Puppet, Chef, Ansible, Bcfg2, Salt, CFEngine, Vagrant, Cobbler
OUR DATA PROFILE
WHY AWS?
Leader in cloud hosting
Strong APIs with many supporting communities across multiple languages
AWS services were a complement to our own technologies
It (AWS) is the overwhelming market share leader, with more than five times the cloud IaaS compute capacity in use than the aggregate total of the other 14 providers in this Magic Quadrant. It is a thought leader; it is extraordinarily innovative, exceptionally agile, and very responsive to the market. It has the richest array of IaaS features and PaaS-like capabilities, and continues to rapidly expand its service offerings. It is the provider most commonly chosen for strategic adoption.
- Gartner, Cloud Infrastructure as a Service report
TIME TO AUTOMATE
Can you make a build in one step?
- Joel Spolsky, “The Joel Test: 12 Steps to Better Code”
TOO DEPENDENT
TOO NEEDY
JUST RIGHT
IT’S A MATCH!
ansible-playbook scale_api.yml -e "env=prod total=40"
SCALE UP: PROVISION INSTANCES
  tasks:
    # Create Instances: launch {{ zone2 }} new API nodes in the second availability zone
    - name: Create the {{ zone2 }} instances in AZ2
      ec2:
        assign_public_ip: true
        region: "{{ aws_region }}"
        keypair: "{{ aws_keypair }}"
        instance_type: "{{ connectapi.az2.instance_type }}"
        image: "{{ connectapi.az2.ec2_image }}"
        wait: true
        group: "{{ connectapi.az2.group }}"
        instance_tags: "{{ connectapi.az2.instance_tags }}"
        count: "{{ zone2 }}"
        vpc_subnet_id: "{{ connectapi.az2.vpc_subnet_id }}"
        zone: "{{ connectapi.az2.zone }}"
        volumes: "{{ connectapi.az2.volumes | default([]) }}"
      register: ec2_az2

    # Put the new instances into in-memory groups so later plays can target them
    - add_host: name={{ item.public_ip }} groups=scaleup_connectapi,tag_connectapi_{{ env }}
      with_items: "{{ ec2_az2.instances }}"

# Install things: run the normal API playbook, limited to the new hosts
- include: connect_api.yml scaleup=scaleup_connectapi

Create Instances
Install things
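The slides pass `total=40` on the command line but use a `zone2` count in the `ec2` task, and the step that connects the two is not shown. A minimal sketch of one way it could be derived, assuming an even split across two availability zones (the split rule and the `zone1` name are assumptions, not taken from the deck):

```yaml
# Sketch only: derive per-AZ instance counts from the "total" extra var.
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Split the requested total across two AZs (even split is an assumption)
      set_fact:
        zone1: "{{ (total | int + 1) // 2 }}"   # AZ1 takes the extra instance when total is odd
        zone2: "{{ (total | int) // 2 }}"

    - name: Show the resulting counts
      debug:
        msg: "AZ1: {{ zone1 }} instances, AZ2: {{ zone2 }} instances"
```

Extra vars passed with `-e "key=value"` arrive as strings, which is why the sketch casts `total` with `| int` before doing arithmetic.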
SCALE UP: INSTALL THINGS
# Target: hosts tagged connectapi for this env, intersected with the scale-up group (or all)
- hosts: "tag_connectapi_{{ env }}:&{{ scaleup | default('all') }}"
  # Install
  roles:
    - common
    - { role: ndb, goal: apinode }
    - memcached
    - nginx
    - tomcat
    - maven
    - connect_api
    - { role: flume, goal: api }
    - { role: zabbix, goal: agent }
  # Load Balancer
  post_tasks:
    - name: add machine back into the load balancer
      local_action:
        module: ec2_elb
        instance_id: "{{ ansible_ec2_instance_id }}"
        state: "present"
        ec2_elbs: "connectapi-{{ env }}"
        aws_access_key: "{{ aws_access_key }}"
        aws_secret_key: "{{ aws_secret_key }}"
        region: "{{ aws_region }}"

Target
Install
Load Balancer
MONITORING
Ansible templates give us flexibility for inflexible tools
zabbix_agentd.conf
# {{ ansible_managed }}
# This is a config file for the Zabbix agent daemon (Unix)
…
Server={% for host in groups['tag_zabbix_prod'] %} {{ hostvars[host].ec2_private_ip_address }}, {% endfor %}
…
HostMetadata={{ ec2_security_group_names }}
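The `for` loop in the template emits a comma after every address, including the last one. A hedged alternative, assuming an Ansible version that ships the `extract` filter (newer than the release this deck was likely written against), is to build the list once as a fact and join it; the fact name `zabbix_servers` is hypothetical:

```yaml
# Sketch only: build a clean comma-separated list of Zabbix server IPs.
- name: Build the Zabbix server list without a trailing comma
  set_fact:
    zabbix_servers: "{{ groups['tag_zabbix_prod'] | map('extract', hostvars, 'ec2_private_ip_address') | join(',') }}"
```

The template line would then shrink to `Server={{ zabbix_servers }}`.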
MONITORING
Instance Creation
- name: Create the Instances for this vpc
  ec2:
    region: "{{ aws_region }}"
    keypair: "{{ aws_keypair }}"
    instance_type: "{{ item.instance_type }}"
    image: "{{ item.ec2_image }}"
    wait: true
    group: "{{ item.group }}"
    instance_tags: "{{ item.instance_tags }}"
    count: "{{ item.count }}"
    vpc_subnet_id: "{{ item.vpc_subnet_id }}"
    zone: "{{ item.zone }}"
    volumes: "{{ item.volumes }}"
    ebs_optimized: "{{ item.ebs_optimized }}"
  register: ec2
  with_items: "{{ ec2_instances }}"

ec2_instances:
  - instance_type: m3.large
    ec2_image: ami-3aba131f
    group:
      - connectapi-prod
      - telescope-access
    zone: us-west-1a
    vpc_subnet_id: subnet-a13763e7
    instance_tags:
      connectapi: prod
      Name: ConnectAPI-prod
MONITORING
Zabbix Auto registration based on host metadata
DO YOU EVEN TAG, BRO?
tag_connectapi_{{ env }}
tag_webapp_{{ env }}
tag_hbase{{ cluster }}_{{ env }}:&hadooprole_master
tag_hbase{{ cluster }}_{{ env }}:&hadooprole_node
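The group names above follow the `tag_<key>_<value>` convention that the EC2 dynamic inventory script uses for instance tags, and `:&` in a host pattern means "only hosts that are in both groups". A minimal sketch of a play that uses one of these patterns (the playbook file name and the `cluster` value in the usage note are hypothetical):

```yaml
# Sketch only: run on hosts that are in BOTH the HBase tag group and the master role group.
- hosts: "tag_hbase{{ cluster }}_{{ env }}:&hadooprole_master"
  gather_facts: false
  tasks:
    - name: Show which hosts matched the intersection
      debug:
        msg: "{{ inventory_hostname }} is an HBase master in {{ env }}"
```

It could be invoked as `ansible-playbook show_masters.yml -e "cluster=1 env=prod"`. Quoting the whole pattern keeps the YAML parser from tripping over the braces and colon.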
DO YOU EVEN TAG, BRO?
- hosts: localhost
  pre_tasks:
    # Add hosts to temp group, giving each one a 1-based position
    - name: Add new instance to host group
      add_host:
        hostname: "{{ item.1 }}"
        groupname: "__zookeeper_temp"
        position: "{{ item.0 + 1 }}"
      with_indexed_items: "{{ groups['tag_zookeeper_' + env] }}"

# Target temp group
- hosts: __zookeeper_temp
  # Install
  roles:
    - role: zookeeper
    - { role: zabbix, goal: agent }

Add hosts to temp group
Install
Target temp group
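The slides do not show what the `zookeeper` role does with `position`. One plausible use, given that every ZooKeeper server needs a unique numeric id in a `myid` file inside its data directory, is sketched below. This is an assumption, and the data directory path is hypothetical:

```yaml
# Sketch only: write the per-host position as the ZooKeeper server id.
- name: Write this node's ZooKeeper id
  copy:
    content: "{{ position }}\n"
    dest: /var/lib/zookeeper/myid   # hypothetical dataDir; adjust to the real one
```

Because `position` is set per host by `add_host`, each member of `__zookeeper_temp` would get a different id without any hand-maintained inventory variables.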
DO YOU EVEN TAG, BRO?
- hosts: localhost
  pre_tasks:
    # Add hosts to temp group
    - name: Add instances to temporary host group
      add_host: hostname={{ item.1 }} groupname=__hbase_temp
      with_indexed_items: "{{ groups['tag_hbase' + cluster + '_' + env] }}"

    # Randomly choose
    - name: Pick one from the temporary group
      add_host:
        hostname: "{{ item }}"
        groupname: "__the_chosen_one"
      with_random_choice: "{{ groups['__hbase_temp'] }}"

# Install (sudo/user are the older spellings of become/remote_user)
- hosts: __the_chosen_one
  gather_facts: true
  user: ec2-user
  sudo: yes

Add hosts to temp group
Install
Randomly choose
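When it does not matter which host is chosen, a shorter route to "do this on exactly one machine" is `run_once`. The sketch below is an alternative to the deck's approach, not a reconstruction of it. It executes on the first host of the play's batch instead of a random one, and the task itself is a placeholder:

```yaml
# Sketch only: perform a cluster-wide action from a single host.
- hosts: "tag_hbase{{ cluster }}_{{ env }}"
  gather_facts: false
  tasks:
    - name: Placeholder for a one-time cluster action
      debug:
        msg: "Running once, on {{ inventory_hostname }}"
      run_once: true
```

The temp-group approach in the slides is still the better fit when the choice must be random or when whole roles, not single tasks, have to run on the chosen host.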
YO DAWG! I HEARD YOU LIKE ANSIBLE!
# Send the tarball
- name: copy tower tarball
  copy:
    src: ansible-tower-setup-latest.tar.gz
    dest: /tmp/tower-latest.tar.gz
    force: yes

# Set group_vars
- name: set the group vars file
  template: src=group_vars_all.j2 dest=/tmp/ansible-tower-setup-latest/group_vars/all
  when: "upgrade is defined and upgrade == 'yes'"

# Backup
- name: Create a database backup of tower at /var/log/awx
  shell: awx-manage dumpdata > /var/log/awx/backup-{{ ansible_date_time.epoch }}.json
  when: "backup is defined and backup != 'no'"

# Install
- name: run the tower install
  command: ./setup.sh chdir=/tmp/ansible-tower-setup-latest
  when: "upgrade is defined and upgrade == 'yes'"

- name: place the license file
  template: src=license.j2 dest=/etc/awx/license

Send the tarball
Set group_vars
Install
Backup
WHERE ARE WE NOW?
Most of our major architecture has been moved to AWS
Still shipping data to our datacenter for storage and analytics
Coordinating migration of our last pieces based on client restrictions
HOW HAS THIS CHANGED US
Changed our conversations
“Ok, so lets just kill all the queue boxes, Bring up 2 new boxes and new API and start from scratch”
-11:52pm Thursday Night
“I need your help since you are not doing anything”
“Dude I am busy installing and provisioning 16 hadoop servers”
“Good, so you have time”
-3:42pm Tuesday afternoon
WHAT’S LEFT?
Chronos as a replacement for cron
Zabbix module for better automation
Scaling down
Adding JMX interfaces
Eventually express all items/triggers from a dictionary so changes are tracked in git
Akamai module
Push cache config changes with web app updates
Remove any embarrassing hacks from Hadoop, HBase, NDB roles and share them if there is interest
Operational tasks to make day to day easier
HAPPILY EVER AFTER
Telescope is happy in its new relationships
AWS has been a good match for Telescope
Ansible has let us take control of our architecture like never before
Looking forward to solving bigger problems
QUESTIONS?
Iain Wright, Lead Systems Administrator
John Weatherford, VP Engineering
#TeamTelescope
P.S. If this sounds interesting to you, come talk to us!