Automating Cloud Applications Using Open Source


Post on 24-Jan-2015







With the proliferation of tools, frameworks, and libraries, it’s now easier than ever to build cloud-based systems. However, while each tool is designed to solve a specific pain point, gaps remain when it comes to managing the cloud-based software lifecycle as a whole. Using real-world examples, BrightTag engineers explain how they helped design a highly scalable platform and automated zero-downtime deploys using primarily off-the-shelf open source software. The talk breaks the software lifecycle into three high-level areas: design, deployment, and monitoring. The session reviews considerations for designing applications to take advantage of cloud-based deployment and demonstrates how to use existing open source tools such as Fabric, HAProxy, libcloud, and Graphite to create a scalable and flexible infrastructure.


1. Automating Life in the Cloud
Joshua Buss, Matthew Kemp & Cody Ray

3. Designing for the Cloud
Use Case 0: Scalability and Reliability
Add more features!
This widget is too slow!
No more downtime!
We're losing potential customers in Asia!

4. Scalability
Use Case 0: Scalability and Reliability
Focus on scaling applications horizontally.

5. Service Oriented Architecture
Use Case 0: Scalability and Reliability
Wikipedia definition:
SOA as an architecture relies on service-orientation as its fundamental design principle. If a service presents a simple interface that abstracts away its underlying complexity, users can access independent services without knowledge of the service's platform implementation.
Layman's terms:
A complex system is broken into simple components that are able to interact with each other (and possibly outside sources).

6. What is a Service in SOA?
Use Case 0: Scalability and Reliability
An independent unit that's composable with other components.
[Diagram: Presentation (web, API, etc.), Business Logic, Data Access, Data Stores]

7. Services at BrightTag
Use Case 0: Scalability and Reliability
[Diagram: ui, stathub, database; tagserve, datahub, database]

8. Service Division of Labor
Use Case 0: Scalability and Reliability
When should you split services up?

9. Design for Failure
Use Case 0: Scalability and Reliability
Keep failures self-contained.
"Release It!" by Nygard is a great resource for stability patterns.

10. Redundancy at BrightTag
Use Case 0: Scalability and Reliability
Run a full stack in each region.
[Diagram: the ui, stathub, tagserve, datahub, and database stack shown twice, once per region]

11. Load Balancers
Use Case 0: Scalability and Reliability
Services are over HTTP.
Able to use standard tools and components without extra effort.

12. Backwards Compatibility
Use Case 0: Scalability and Reliability
Changes need to be allowed, but compatibility needs to be maintained.
13. Cross-Region Data Replication
Case 1: Inter-Region Communication
Need some data available in all regions, but keep inter-region communication to a minimum.

14. What is Cassandra?
Case 1: Inter-Region Communication
Google's Bigtable data model on Amazon's Dynamo infrastructure.

15. Cassandra Token Ring
Case 1: Inter-Region Communication
    Region  Node         Token range
    East    cassandra01  [0-63]
    East    cassandra02  [64-127]
    East    cassandra03  [128-191]
    East    cassandra04  [192-255]
    West    cassandra01  [1-64]
    West    cassandra02  [65-128]
    West    cassandra03  [129-192]
    West    cassandra04  [193-0]
Key hashes to 157?

16. How Cassandra Writes
Case 1: Inter-Region Communication
[Diagram: the same East and West token rings as on the previous slide, with the label "Write goes here."]

17. Cross Region Messaging (Hiveway)
Case 1: Inter-Region Communication
Cross-region messaging over HTTPS with compression.
[Diagram: messages pass between a local hiveway and a remote hiveway]

18. Smooth Code Pushes
Use Case 2: Zero Downtime Builds

19. Mirror Environment Cutover
Use Case 2: Zero Downtime Builds
Easy migrations and upgrade path.
Can be more expensive.

20. Rolling Deploy
Use Case 2: Zero Downtime Builds
More complicated migrations and upgrades.
Longer deploy window.
Usually cheaper.

21. Fabric Pseudocode
Use Case 2: Zero Downtime Builds
    for region in regions:
        for app in apps:
            for server in region:
                if app on server:
                    maintenance app
                    scp new code to dir
                    symlink app/current to app/
                    restart app
                    wait for healthy

22. Health Checks at BrightTag
Use Case 2: Zero Downtime Builds
Standardized health checks across services.
    $ curl -si http://service/bthc
    HTTP/1.1 204 No Content

    $ curl -si http://service/bthc?action=maint
    HTTP/1.1 500 Internal Server Error
    Connection: close
    Content-Length: 5

    MAINT

23. Keeping an Eye on the Pulse
Use Case 2: Zero Downtime Builds
At-a-glance environment health.
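The rolling deploy loop from the Fabric pseudocode slide, combined with the health check convention (204 when healthy, 500 while in maintenance), can be sketched in plain Python. This is only an illustration under stated assumptions: `FakeServer`, `deploy_one`, `rolling_deploy`, the host names, and the version labels are all hypothetical, and the in-memory state changes stand in for the remote commands (scp, symlink, restart) that a real Fabric task would run.

```python
from typing import Dict, List

HEALTHY = 204      # status the health check slide shows for a healthy service
MAINTENANCE = 500  # status it shows once maintenance mode is switched on


class FakeServer:
    """In-memory stand-in for a host (hypothetical; no SSH or HTTP involved)."""

    def __init__(self, name: str, apps: List[str]) -> None:
        self.name = name
        self.status: Dict[str, int] = {app: HEALTHY for app in apps}
        self.version: Dict[str, str] = {app: "v1" for app in apps}

    def health(self, app: str) -> int:
        return self.status[app]


def deploy_one(server: FakeServer, app: str, version: str, max_checks: int = 5) -> None:
    """Deploy one app on one server, following the pseudocode's steps."""
    server.status[app] = MAINTENANCE  # "maintenance app": drop out of rotation
    server.version[app] = version     # stands in for scp plus the symlink switch
    server.status[app] = HEALTHY      # stands in for "restart app"
    # "wait for healthy": succeeds on the first check here because the
    # fake restart is instant; a real deploy would sleep between polls.
    for _ in range(max_checks):
        if server.health(app) == HEALTHY:
            return
    raise RuntimeError(f"{app} on {server.name} did not become healthy")


def rolling_deploy(regions: Dict[str, List[FakeServer]],
                   apps: List[str], version: str) -> List[str]:
    """Walk regions, apps, and servers one at a time; return the deploy order."""
    order = []
    for region, servers in regions.items():
        for app in apps:
            for server in servers:
                if app in server.status:  # "if app on server"
                    deploy_one(server, app, version)
                    order.append(f"{region}/{server.name}/{app}")
    return order


regions = {
    "east": [FakeServer("web01", ["ui"]), FakeServer("web02", ["ui", "tagserve"])],
    "west": [FakeServer("web03", ["tagserve"])],
}
order = rolling_deploy(regions, ["ui", "tagserve"], "v2")
```

Because only one server is touched at a time and each one must report healthy before the loop moves on, the remaining servers keep serving traffic, which is what gives the rolling approach its longer deploy window.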
24. Runtime Controls
Use Case 2: Zero Downtime Builds
Provide multiple modes of operation.

25. Connectivity
Use Case 3: Generating /etc/hosts

26. What is Zerg?
Use Case 3: Generating /etc/hosts
[Figure: images joined by "+" and "=", not preserved in this transcript]

27. Flask and libcloud Working Together
Use Case 3: Generating /etc/hosts
    DRIVER_MAPPING = {
        "dev": {
            "office": get_driver(Provider.EUCALYPTUS)(
                DEV_ID, secret=DEV_KEY, host="openmaster", port=8773,
                secure=False, path="/services/Cloud")
        },
        "prod": {
            "us-east-1": get_driver(Provider.EC2_US_EAST)(PROD_ID, PROD_KEY),
            "eu-west-1": get_driver(Provider.EC2_EU_WEST)(PROD_ID, PROD_KEY)
        }
    }

    @app.route("/hosts/<env>/<region>")
    def hosts(env, region):
        hosts = DRIVER_MAPPING[env][region].list_nodes()
        return str([host.extra['private_dns'] for host in hosts])

28. The Zerg Code
Use Case 3: Generating /etc/hosts
    @app.route("/etchosts/<env>/<region>")
    def etchosts(env, region):
        driver = DRIVER_MAPPING[env][region]
        sorted_nodes = sorted((node.name, node.private_ips, node.public_ips)
                              for node in driver.list_nodes())
        hosts = [{'private_ip': private_ips[0], 'name': name, 'public_ip': public_ips[0]}
                 for (name, private_ips, public_ips) in sorted_nodes]
        response = render_template('etc_hosts.txt', hosts=hosts)
        return Response(response, content_type='text/plain')
Template:
    # The following lines are desirable for IPv6 capable hosts
    ::1 ip6-localhost ip6-loopback
    {% for host in hosts %}
    {{ "%-21s%-21s# External: %s"|format(host.private_ip, host.name, host.public_ip) }}
    {%- endfor %}

29. The Zerg HTTP Response
Use Case 3: Generating /etc/hosts
    $ curl -s http://zerg/etchosts/prod/eu-west-1
    # The following lines are desirable for IPv6 capable hosts
    ::1 ip6-localhost ip6-loopback
    10.0.0.10            server01             # External:
    [entries for server02 through server06 follow; the remaining IP addresses are not preserved in this transcript]

30. The bash script
Use Case 3: Generating /etc/hosts
    # Set variables
    read -r -d '' STATIC_HOSTS [...] > ${TMPDIR}/static_hosts
    cp ${TMPDIR}/static_hosts ${TMPDIR}/new_hosts
    wget -qO- "http://${ZERG_IP}/etchosts/${E}/${R}" >> ${TMPDIR}/new_hosts &&
    if [[ $(diff ${TMPDIR}/new_hosts /etc/hosts | wc -l | awk '{print $1}') -lt 7 || ${FORCE} == "--force" ]]; then
        cp ${TMPDIR}/new_hosts /etc/hosts;
    fi

31. Configuring Load Balanced Services
Use Case 4: Generating Load Balancer Configuration
Update timing is tricky to get right.
Too important to leave completely autonomous.

32. Consistency > *
Use Case 4: Generating Load Balancer Configuration
Need a rock-solid foundation to deploy onto.

33. Single Puppet Master
Use Case 4: Generating Load Balancer Configuration
Set environment per-instance: /etc/puppet/puppet.conf
Symlink /etc/puppet/environments/ on master to various git checkouts of the source:
    $ cd /etc/puppet/environments
    $ ln -s ~/src/puppet/prod_stable prod_stable
    $ ln -s ~/src/puppet/dev_stable dev_stable
    $ ln -s ~/src/puppet/dev_test dev_test
Use cron to keep all branches up-to-date.

34. Source Controlled Puppet Configs
Use Case 4: Generating Load Balancer Configuration
Each environment has its own branch.
Make a new branch for every new feature.
Merge into a test branch to test.
Merge into stable.
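The guarded /etc/hosts update from the slides above (render one line per node, then overwrite the file only when the change is small or a force flag is given) might look like the following in Python. Treat it as a sketch under assumptions: the function names, the static entry, and the 203.0.113.x external addresses are hypothetical, and counting changed lines with `difflib` only approximates what `diff | wc -l` reports, since `diff` also prints hunk headers and separators. The format string is the one from the Zerg template.

```python
import difflib
from typing import Dict, List

STATIC_HOSTS = "127.0.0.1 localhost\n"  # hypothetical static entries


def render_hosts(nodes: List[Dict[str, str]]) -> str:
    """Format one line per node, sorted by name, using the template's pattern."""
    lines = [
        "%-21s%-21s# External: %s" % (n["private_ip"], n["name"], n["public_ip"])
        for n in sorted(nodes, key=lambda n: n["name"])
    ]
    return "\n".join(lines) + "\n"


def changed_lines(old: str, new: str) -> int:
    """Count lines that were added or removed between two file bodies."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def should_replace(old: str, new: str, force: bool = False, limit: int = 7) -> bool:
    """Accept the new file only if the change is small, unless forced."""
    return force or changed_lines(old, new) < limit


# Hypothetical node data; a real run would fetch it from the Zerg service.
nodes = [
    {"name": "server02", "private_ip": "10.0.0.11", "public_ip": "203.0.113.11"},
    {"name": "server01", "private_ip": "10.0.0.10", "public_ip": "203.0.113.10"},
]
current = STATIC_HOSTS
candidate = STATIC_HOSTS + render_hosts(nodes)
accept = should_replace(current, candidate)
```

The size limit acts as a safety valve: if the service returns an empty or wildly different node list, the large diff blocks the overwrite until an operator forces it.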
35. The App Definitions in Zerg
Use Case 4: Load Balancer Configs
    APP_DEFS : {
        "zerg": { "type": "http", "healthcheck": {"port": 19999, "resource": "/zerghealth"} },
        "awesome": { "type": "http", "healthcheck": {"port": 20000, "resource": "/ahc"}, "frontend" : "10080" },
        "haproxy_awesome": { "type": "http", "healthcheck": {"port": 20001, "resource": "/"} },
        "foo": { "type": "http", "healthcheck": {"port": 20002, "resource": "/"}, "frontend" : "10081" },
        "mashed_potatoes": { "type": "http", "healthcheck": {"port": 20003, "resource": "/"}, "frontend" : "10082" },
        "haproxy_foo": { "type": "http", "healthcheck": {"port": 20004, "resource": "/hc"} },
        "thehardproblem": { "type": "http", "healthcheck": {"port": 20006, "resource": "/"} },
        "redis": { "type": "tcp", "healthcheck": {"port": 20007, "resource": "/rhc"} },
        "dataserver": { "type": "http", "healthcheck": {"port": 20008, "resource": "/"}, "frontend" : "10083" },
        "itshards": { "type": "http", "healthcheck": {"port": 20009, "resource": "/"} },
        "devnull": { "type": "http", "healthcheck": {"port": 20010, "resource": "/hc"} }
    }

36. The Zerg Code
Use Case 4: Load Balancer Configs
    @app.route("/haproxy/<env>/<region>/<type>")
    def haproxy(env, region, type):
        instances = get_region_manifest(region)
        apps = {}
        for app in APP_DEFS[env]:
            if 'frontend' in APP_PORTS[env][app].keys():
                app_object = {
                    'servers': [],
                    'backend_port': APP_PORTS[env][app]['healthcheck']['port'],
                    'frontend_port': APP_PORTS[env][app]['frontend']
                }
                for server in instances:
                    if app in instances[server]['roles']:
                        app_object['servers'].append({'name': server, 'details': instances[server]})
                apps[app] = app_object
        return render_template('haproxy_%s_%s_%s.txt' % (env, region, type), vips=apps)
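To make the data shaping in the view above concrete, here is a small standalone sketch that builds the same kind of per-app structure (servers, backend port, frontend port) and prints it as an HAProxy backend stanza. It runs without Flask or libcloud. The `INSTANCES` manifest, its addresses, and the helper names `build_vips` and `render_backend` are assumptions made for illustration; only the two app definitions are taken from the slide.

```python
from typing import Any, Dict

# Two entries copied from the app definitions slide.
APP_DEFS: Dict[str, Dict[str, Any]] = {
    "awesome": {"type": "http",
                "healthcheck": {"port": 20000, "resource": "/ahc"},
                "frontend": "10080"},
    "zerg": {"type": "http",
             "healthcheck": {"port": 19999, "resource": "/zerghealth"}},
}

# Hypothetical region manifest: server name -> roles and private address.
INSTANCES: Dict[str, Dict[str, Any]] = {
    "server01": {"roles": ["awesome", "zerg"], "private ip": "10.0.0.10"},
    "server02": {"roles": ["awesome"], "private ip": "10.0.0.11"},
}


def build_vips(app_defs: Dict[str, Dict[str, Any]],
               instances: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Collect servers for every app that declares a frontend port."""
    vips = {}
    for app, definition in app_defs.items():
        if "frontend" not in definition:
            continue  # apps without a frontend are not load balanced
        servers = [
            {"name": name, "details": details}
            for name, details in sorted(instances.items())
            if app in details["roles"]
        ]
        vips[app] = {
            "servers": servers,
            "backend_port": definition["healthcheck"]["port"],
            "frontend_port": definition["frontend"],
        }
    return vips


def render_backend(app: str, vip: Dict[str, Any]) -> str:
    """Render one backend stanza in HAProxy configuration syntax."""
    lines = [f"backend {app}", "    balance roundrobin"]
    for server in vip["servers"]:
        address = server["details"]["private ip"]
        lines.append(f"    server {server['name']} {address}:{vip['backend_port']} check")
    return "\n".join(lines)


vips = build_vips(APP_DEFS, INSTANCES)
stanza = render_backend("awesome", vips["awesome"])
```

Keeping the data shaping separate from the rendering makes it easy to inspect what would change before a new configuration reaches a load balancer, which fits the talk's point that this step is too important to leave completely autonomous.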
37. The Zerg Flask Template
Use Case 4: Load Balancer Configs
    global
        blah blah
    defaults
        blah blah

    frontend dataserver_vip
        bind *:{{ vips.dataserver.frontend_port }}
        default_backend dataserver

    frontend mashed_potatoes_vip
        bind *:{{ vips.mashed_potatoes.frontend_port }}
        default_backend mashed_potatoes

    backend dataserver
        balance roundrobin
        {%- for server in vips.dataserver.servers %}
        server {{ server['name'] }} {{ server.details['private ip'] }}:{{ vips.dataserver.backend_port }} check
        {%- endfor %}

    backend mashed_potatoes
        balance roundrobin
        {%- for server in vips.mashed_potatoes.servers %}
        server {{ server['name'] }} {{ server.details['private ip'] }}:{{ vips.mashed_potatoes.backend_port }} check
        {%- endfor %}

38. The Zerg H
