in this talk - qcon · @nora_js “we ran a chaos experiment which verifies that our fallback path...
TRANSCRIPT
![Page 1: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/1.jpg)
![Page 2: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/2.jpg)
In this talk
●●●●
@nora_js
![Page 3: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/3.jpg)
In this talk
●●●●
@nora_js
![Page 4: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/4.jpg)
Known ways of testing for availability
●●●●
![Page 5: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/5.jpg)
![Page 6: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/6.jpg)
![Page 7: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/7.jpg)
You can’t keep blaming your cloud provider
@nora_js
![Page 8: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/8.jpg)
Why is there a fear of Chaos when it’s inevitable?
![Page 9: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/9.jpg)
![Page 10: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/10.jpg)
Meet “Chaos Carol”
![Page 11: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/11.jpg)
Where is Carol starting her Chaos?
![Page 12: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/12.jpg)
![Page 13: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/13.jpg)
Start with a steady state
●
●
@nora_js
![Page 14: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/14.jpg)
![Page 15: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/15.jpg)
There isn’t always money in microservices
![Page 16: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/16.jpg)
Randomly turn things off?
Recreate things that already happened?
![Page 17: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/17.jpg)
![Page 18: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/18.jpg)
Let people know? Let the Chaos run automatically?
![Page 19: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/19.jpg)
![Page 20: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/20.jpg)
![Page 21: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/21.jpg)
![Page 22: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/22.jpg)
![Page 23: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/23.jpg)
Socialization
●●
●
![Page 24: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/24.jpg)
![Page 25: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/25.jpg)
![Page 26: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/26.jpg)
![Page 27: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/27.jpg)
![Page 28: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/28.jpg)
● Focus more on asking the questions, rather than answering them.
● Find customers willing to try first. Then share their stories.● Be honest. Don’t make false promises about what Chaos
will do.
![Page 29: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/29.jpg)
![Page 30: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/30.jpg)
![Page 31: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/31.jpg)
●●
●
●
![Page 32: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/32.jpg)
![Page 33: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/33.jpg)
![Page 34: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/34.jpg)
![Page 35: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/35.jpg)
![Page 36: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/36.jpg)
![Page 37: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/37.jpg)
![Page 38: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/38.jpg)
![Page 39: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/39.jpg)
![Page 40: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/40.jpg)
![Page 41: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/41.jpg)
![Page 42: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/42.jpg)
![Page 43: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/43.jpg)
![Page 44: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/44.jpg)
![Page 45: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/45.jpg)
![Page 46: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/46.jpg)
![Page 47: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/47.jpg)
![Page 48: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/48.jpg)
![Page 49: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/49.jpg)
![Page 50: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/50.jpg)
![Page 51: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/51.jpg)
![Page 52: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/52.jpg)
![Page 53: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/53.jpg)
![Page 54: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/54.jpg)
ChAP
●
●●●
@nora_js
![Page 55: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/55.jpg)
![Page 56: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/56.jpg)
![Page 57: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/57.jpg)
![Page 58: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/58.jpg)
![Page 59: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/59.jpg)
![Page 60: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/60.jpg)
Targeted Chaos: Kafka Problems
●●
●
@nora_js
![Page 61: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/61.jpg)
Targeted Chaos: Kafka Ideas
●●●●●●
@nora_js
![Page 62: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/62.jpg)
![Page 63: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/63.jpg)
![Page 64: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/64.jpg)
@nora_js
“We ran a chaos experiment which verifies that our fallback path works
(crucial for our availability) and it successfully caught a issue in the fallback path and the issue was
resolved before it resulted in any availability incident!”
![Page 65: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/65.jpg)
“While [failing calls] we discovered an increase in license requests for the experiment cluster even though fallbacks were all successful. This likely
means that whoever was consuming the fallback was retrying the call, causing an increase in
license requests.”
@nora_js
![Page 66: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/66.jpg)
●●
●
@nora_js
![Page 67: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/67.jpg)
Should you develop experiments for the service teams?
Let them do it on their own?
![Page 68: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/68.jpg)
Takeaways
●
●
●
@nora_js
![Page 69: In this talk - QCon · @nora_js “We ran a chaos experiment which verifies that our fallback path works (crucial for our availability) and it successfully caught a issue in the fallback](https://reader035.vdocument.in/reader035/viewer/2022071417/6114e0d25e2ca013756efba4/html5/thumbnails/69.jpg)