The Journey of Chaos Engineering Begins with a Single Step

#PDSummit16

Upload: bruce-wong

Post on 14-Apr-2017

TRANSCRIPT

Page 1

The Journey of Chaos Engineering Begins with a Single Step

Page 2

Bruce Wong

Senior Engineering Manager

Twilio

@bruce_m_wong

https://www.linkedin.com/in/brucemwong

Page 4

2009

2012

2014

http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html

https://github.com/Netflix/SimianArmy

http://techblog.netflix.com/2015/09/chaos-engineering-upgraded.html

Page 5

http://readwrite.com/2014/09/17/netflix-chaos-engineering-for-everyone/

http://techblog.netflix.com/2014/09/introducing-chaos-engineering.html

Page 6

https://www.twilio.com/

Page 7

https://customers.twilio.com/

Page 8

The journey of a thousand miles begins with a single step.

-Lao Tzu

Page 9

James Burns

Tech Lead

Twilio

@1mentat

https://www.linkedin.com/in/james-burns-7816a82

Page 10

Preparation

Pre-Launch Log Aggregation System

-Stage env

-Synthetic Traffic

Page 11

The Master of Disaster

•Network Issues

•Partitions

•Thundering Herds

•Cascading Failures

•Resource Starvation

•CPU

•Memory

•Disk IO

•Network IO

•Application Load

Page 12

> sudo halt
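
The slide shows only the command itself. As a rough sketch of how such a first experiment might be scripted, the Python below picks one host at random from a staging pool and builds the `ssh ... sudo halt` command for it. The host names, the function names, and the use of ssh are assumptions made for illustration, not details from the talk. By default it is a dry run that only returns the command.

```python
import random
import subprocess

# Hypothetical staging hosts; the talk does not name any.
STAGE_HOSTS = ["stage-log-1.example.internal", "stage-log-2.example.internal"]


def pick_victim(hosts, rng=random):
    """Choose one host at random from a non-empty pool."""
    if not hosts:
        raise ValueError("no hosts to choose from")
    return rng.choice(list(hosts))


def halt_command(host):
    """Build the ssh command that would halt the chosen host."""
    return ["ssh", host, "sudo", "halt"]


def run_experiment(hosts, dry_run=True, rng=random):
    """Pick a victim and halt it; with dry_run=True only return the command."""
    cmd = halt_command(pick_victim(hosts, rng))
    if not dry_run:
        # ssh may exit non-zero as the host goes down, so do not raise on it.
        subprocess.run(cmd, check=False)
    return cmd
```

Calling `run_experiment(STAGE_HOSTS)` returns the command without executing anything, which makes it possible to review the target before passing `dry_run=False` in a stage environment.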

Page 13

Incident Start

Page 14

Impact?

Page 15

Post-Mortem

Page 17

Round 2

•Network Issues

•Partitions

•Thundering Herds

•Cascading Failures

•Resource Starvation

•CPU

•Memory

•Disk IO

•Network IO

•Application Load

Page 18

> sudo halt

Page 19

Third-Party API Failure

Page 20

Well, that’s not what I expected to see

Page 21

Outcomes

Instrument

Instrument

Instrument

API SLAs

Architectural Change!
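
One way to read "Instrument" and "API SLAs" together is to wrap each call to a dependency so that its outcome and latency are recorded and compared against a latency budget. The sketch below is a minimal illustration under that assumption. The class name `InstrumentedCall`, the `sla_seconds` parameter, and the counter labels are hypothetical and do not come from the talk.

```python
import time
from collections import Counter


class InstrumentedCall:
    """Wrap a callable to a dependency and record outcome counts and latencies."""

    def __init__(self, func, sla_seconds, clock=time.monotonic):
        self.func = func
        self.sla_seconds = sla_seconds  # hypothetical latency budget for one call
        self.clock = clock              # injectable so tests can use a fake clock
        self.counts = Counter()
        self.latencies = []

    def __call__(self, *args, **kwargs):
        start = self.clock()
        try:
            result = self.func(*args, **kwargs)
        except Exception:
            self.counts["error"] += 1
            raise
        else:
            self.counts["ok"] += 1
            return result
        finally:
            # Runs for both successes and failures, so slow errors are counted too.
            elapsed = self.clock() - start
            self.latencies.append(elapsed)
            if elapsed > self.sla_seconds:
                self.counts["sla_breach"] += 1
```

Wrapping a client function as `send = InstrumentedCall(send, sla_seconds=0.5)` leaves its behaviour unchanged while `send.counts` and `send.latencies` accumulate the data a dashboard would need during a failure experiment.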

Page 22

Recap

• Start Simple

• Instrumentation Gaps

• Understand your dashboards

• Prevent outages

Page 23

http://www.crisistextline.org/

http://polarisproject.org/befree-textline

http://trekmedics.org/

https://www.twilio.org/

Page 24

When you wish upon a blue moon…

Page 25

Please provide feedback for this session by filling out the feedback survey