astricon - realities of global infrastructure in the cloud

@cvonwallenstein from @DynInc

Global Infrastructure in the Cloud

Cory von Wallenstein, Chief Technology Officer, Dyn Inc.

@cvonwallenstein

http://www.flickr.com/photos/notaperfectpilot/8119088205/

“Wired people should know something about wires” - Neal Stephenson, quoted in Andrew Blum’s TED Talk “What is the Internet, Really?”

http://www.ted.com/talks/andrew_blum_what_is_the_internet_really.html

Going Global in the Cloud

• Never been easier
• Never been more affordable
• Why should or shouldn’t you?
• If so, how?

A Word on Costs and Value

• Unlikely to save you raw dollars
• Likely to spend the same or more
• But here’s what you gain:
– Flexibility (can’t really screw this up)
– Performance (many caveats here)
– Reliability (if you do it right)
– Efficiency (if your team embraces it)
• Are those worthwhile to you?

Why go from 1 to N?

Reason 1: Disaster Recovery

http://maps.google.com

http://www.cogentco.com/files/images/network/network_map/networkmap_global_large.png

Speed of light: 299,792.458 km/second

Theoretical RTT: ~40 ms

Real RTT: ~90 ms

• Things don’t work as well at 90ms RTT latency as they do at 9ms RTT latency

• Where can you go to get out of the way of a disaster but not create latency headaches?
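
The gap between a theoretical and a real round-trip time can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement. The 6,000 km path length is a hypothetical value chosen only because it yields roughly 40 ms at the vacuum speed of light, and the two-thirds velocity factor for optical fiber is a common rule of thumb rather than a figure from the talk.

```python
# Back-of-the-envelope round-trip time (RTT) for a long-haul path.

SPEED_OF_LIGHT_KM_S = 299_792.458  # in vacuum, as quoted on the slide

# Assumption (not from the talk): light in optical fiber travels at roughly
# two thirds of its vacuum speed.
FIBER_VELOCITY_FACTOR = 2 / 3


def rtt_ms(distance_km: float, velocity_factor: float = 1.0) -> float:
    """Return the round-trip propagation delay in milliseconds."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in (0, 1]")
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor)
    return 2 * one_way_seconds * 1000


# Hypothetical path length, picked to land near the ~40 ms theoretical figure.
PATH_KM = 6000

vacuum_rtt = rtt_ms(PATH_KM)
fiber_rtt = rtt_ms(PATH_KM, FIBER_VELOCITY_FACTOR)
print(f"vacuum: {vacuum_rtt:.1f} ms, fiber: {fiber_rtt:.1f} ms")
```

Under these assumptions the vacuum figure comes out near 40 ms and the fiber figure near 60 ms. The remainder of the gap to a measured ~90 ms would plausibly come from indirect cable routes and per-hop equipment delay, which this sketch does not model.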

http://www.globaldatavault.com/natural-disaster-threat-maps.htm

http://www.datacenterknowledge.com/archives/2012/07/09/outages-surviving-electric-squirrels-ups-failures/

“A frying squirrel took out half of our Santa Clara data center two years back,” - Mike Christian, Yahoo

http://blog.level3.com/level-3-network/the-10-most-bizarre-and-annoying-causes-of-fiber-cuts/

“Squirrel chews account for a whopping 17% of our damages so far this year! But let me add that it is down from 28% just last year and it continues to decrease since we added cable guards to our plant.” - Fred Lawler, Level(3)

Reason 2: Get closer to users

http://www.akamai.com/html/technology/dataviz1.html

Reason 3: “Sorry, we’re full”

http://www.theregister.co.uk/2010/10/12/capgemini_merlin_data_center/

How: Figure out who and where

• Figure out what your motivations are
– Disaster recovery
– Get closer to users
– Future scaling
• Take a latency inventory of your apps
– To end users
– To other dependencies
• Get out the maps! Fire up traceroute!
– EC2: US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo), and GovCloud.
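
One way to start a latency inventory is to time TCP handshakes from a candidate location to each dependency. The sketch below is a minimal illustration using only the Python standard library. It complements traceroute rather than replacing it, and the target names in the usage note are hypothetical. A TCP connect takes roughly one network round trip, plus a name lookup when the host is given as a hostname.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it immediately
    return (time.perf_counter() - start) * 1000


def latency_inventory(targets, samples=5):
    """Map each target name to its median connect time in ms.

    targets: dict of name -> (host, port).
    Unreachable targets are recorded as None instead of raising.
    """
    inventory = {}
    for name, (host, port) in targets.items():
        try:
            timings = [tcp_connect_ms(host, port) for _ in range(samples)]
        except OSError:  # refused, timed out, or name did not resolve
            inventory[name] = None
            continue
        inventory[name] = statistics.median(timings)
    return inventory
```

A call such as `latency_inventory({"api": ("api.example.com", 443), "db": ("db.example.com", 5432)})`, run from each region under consideration, gives a first table of application-to-dependency latencies to compare.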

How: Deploy and manage w/ sanity

• Software defined datacenters
– Fancy term for “I defined the architecture in code instead of Microsoft Visio”
• Configuration management
– Orchestrate the cloud APIs, and the config of systems
– Chef
– Puppet
– CFEngine, and more
• Huge loss if you don’t take advantage of this
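
To make "architecture in code" concrete, here is a minimal sketch of the idea in plain Python: the desired layout is declared as data, and a function works out what has to change to reach it. This is not how Chef, Puppet, or CFEngine are implemented, and it calls no cloud API. The roles, regions, and counts are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerGroup:
    role: str
    region: str
    count: int


# Desired architecture, kept in version control instead of a diagram.
# Roles, regions and counts are illustrative only.
DESIRED = [
    ServerGroup("web", "us-east", 4),
    ServerGroup("web", "eu-west", 2),
    ServerGroup("db", "us-east", 2),
]


def plan(desired, running):
    """Return {(role, region): delta}.

    A positive delta means servers to launch, a negative delta means
    servers to terminate. Groups already at their target are omitted.
    """
    target = {(g.role, g.region): g.count for g in desired}
    changes = {}
    for key in sorted(set(target) | set(running)):
        delta = target.get(key, 0) - running.get(key, 0)
        if delta:
            changes[key] = delta
    return changes


# Hypothetical current state, as a cloud API inventory call might report it.
running = {("web", "us-east"): 4, ("web", "eu-west"): 1, ("cache", "us-east"): 1}

for (role, region), delta in plan(DESIRED, running).items():
    action = "launch" if delta > 0 else "terminate"
    print(f"{action} {abs(delta)} x {role} in {region}")
```

Because the desired state is ordinary data, adding a region is a one-line change that can be reviewed and rolled back like any other code.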

How: Coordinating global traffic

• What’s the app?
– Application agnostic, like DNS Global Server Load Balancing
  • Fancy term for “DNS servers monitor your servers and change DNS answers when events are detected”
– Application specific, like DUNDi
  • Decentralized coordination and fault tolerance
• Avoid SPOFs like the plague
– Keep it simple, keep it scalable
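
The "monitor your servers and change DNS answers" idea can be sketched as a pure selection function. This is an illustration of the concept, not any vendor's implementation. The endpoint pool is hypothetical (the addresses come from the RFC 5737 documentation ranges), and two policies are assumptions of this sketch: prefer endpoints in the client's region, and answer with the full pool if every health check fails.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    address: str
    region: str


# Hypothetical pool; addresses are from the RFC 5737 documentation ranges.
POOL = [
    Endpoint("192.0.2.10", "us-east"),
    Endpoint("198.51.100.10", "eu-west"),
    Endpoint("203.0.113.10", "ap-southeast"),
]


def dns_answers(pool, healthy, client_region):
    """Pick the addresses a GSLB-style DNS server might return.

    pool: all configured endpoints.
    healthy: set of addresses that passed the most recent health check.
    client_region: region the querying client is believed to be in.
    """
    up = [e for e in pool if e.address in healthy]
    if not up:
        # Assumed policy: if monitoring says everything is down, answer with
        # the whole pool rather than returning nothing.
        return [e.address for e in pool]
    local = [e.address for e in up if e.region == client_region]
    # Prefer in-region endpoints; otherwise fall back to any healthy one.
    return local or [e.address for e in up]
```

With all endpoints healthy, a client in `eu-west` gets only the `eu-west` address. If that endpoint fails its check, the same query returns the remaining healthy addresses, which is the failover behavior the slide describes.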

What can you expect?

• Flexibility
– Deploy new servers in new locations in hours instead of weeks
• Performance
– If horizontally scalable on commodity hardware, you win. Else, be careful.
– If closer to users and site-to-site latency not an issue or data is distributed/eventually consistent, you win. Else, be careful.

What can you expect?

• Reliability
– If you understand “regions” and “availability zones”, you win. Else, be careful.

http://joyent.com/blog/if-i-was-your-cloud-provider-i-d-never-let-you-down
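
One practical consequence of understanding availability zones is spreading instances across them so that losing a single zone does not take out the whole service. The sketch below shows round-robin placement and the worst-case survivor count. The zone names are hypothetical, and it assumes zones fail independently of one another. A region-wide failure is outside what this models.

```python
from itertools import cycle


def spread(instance_count, zones):
    """Assign instances to zones round-robin; returns {zone: count}."""
    if not zones:
        raise ValueError("at least one zone is required")
    placement = {zone: 0 for zone in zones}
    for _, zone in zip(range(instance_count), cycle(zones)):
        placement[zone] += 1
    return placement


def worst_case_survivors(placement):
    """Instances still running if the most heavily loaded zone fails."""
    return sum(placement.values()) - max(placement.values())


# Hypothetical zone names within a single region.
three_zones = spread(6, ["zone-a", "zone-b", "zone-c"])
one_zone = spread(6, ["zone-a"])

print(three_zones, worst_case_survivors(three_zones))
print(one_zone, worst_case_survivors(one_zone))
```

Six instances in one zone leave nothing running after a zone failure, while the same six spread over three zones leave four.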

What can you expect?

• Efficiency
– Automation
– More instrumentation -> reduced MTTD
– More scalable
– Most important: More focus on what delivers your business core competitive advantage.

Thank you (and we’re hiring!)

VP Technical Operations, Director of Engineering, Director of Security, Network Engineers, Software Engineers, System Engineers, System Administrators (and more!)

Reach out to me: dyn.com, cvw@dyn.com, @cvonwallenstein
