Chaos Engineering Bootcamp @ Velocity

In June I taught a solo three-hour hands-on chaos engineering workshop. There were 200 engineers in the audience following along, and it's my favourite of all the talks I've given!

Tammy Butow
9:00am–12:30pm Tuesday, June 20, 2017
Systems Engineering
Average rating: 4.38 (13 ratings)

Here are my slides: https://speakerdeck.com/tammybutow/chaos-engineering-bootcamp


Here is the talk info: https://conferences.oreilly.com/velocity/vl-ca/public/schedule/detail/58140

Description

Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. Chaos engineering can be thought of as the facilitation of experiments to uncover systemic weaknesses. These experiments follow four steps:

1. Start by defining “steady state” as some measurable output of a system that indicates normal behavior.
2. Hypothesize that this steady state will continue in both the control group and the experimental group.
3. Introduce variables that reflect real-world events like servers that crash, hard drives that malfunction, network connections that are severed, etc.
4. Try to disprove the hypothesis by looking for a difference in steady state between the control group and the experimental group.

Tammy Butow leads a hands-on tutorial on chaos engineering, covering the tools and practices you need to implement it in your organization. Even if you’re already practicing chaos engineering, you’ll learn new ways to apply it within your engineering organization and discover how other companies are using it, along with the positive results they have had using chaos to create reliable distributed systems.
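The four experiment steps in the description above can be sketched in a few lines of Python. This is only an illustrative simulation, not the material used in the workshop: the function names, the failure probabilities, and the tolerance are all hypothetical, and a real experiment would measure a live system's metrics instead of drawing random numbers.

```python
import random
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    control_rate: float
    experiment_rate: float
    hypothesis_holds: bool


def handle_request(rng, failure_probability):
    """Hypothetical stand-in for one request; returns True on success."""
    return rng.random() >= failure_probability


def success_rate(rng, requests, failure_probability):
    """Step 1: steady state, measured here as the fraction of successful requests."""
    successes = sum(handle_request(rng, failure_probability) for _ in range(requests))
    return successes / requests


def run_experiment(requests=1000, baseline_failure=0.01,
                   injected_failure=0.0, tolerance=0.05, seed=42):
    """Compare a control group with a group that has a simulated fault injected."""
    rng = random.Random(seed)
    # Step 2: the hypothesis is that both groups stay within `tolerance` of each other.
    control = success_rate(rng, requests, baseline_failure)
    # Step 3: introduce the variable (extra failures simulate, say, a crashed server).
    failure = min(1.0, baseline_failure + injected_failure)
    experiment = success_rate(rng, requests, failure)
    # Step 4: try to disprove the hypothesis by comparing the two steady states.
    holds = abs(control - experiment) <= tolerance
    return ExperimentResult(control, experiment, holds)


if __name__ == "__main__":
    print(run_experiment(injected_failure=0.0))
    print(run_experiment(injected_failure=0.5))
```

With no injected failure the two groups should stay close and the hypothesis survives; with a large injected failure rate the difference in steady state disproves it, which is the signal that a weakness has been found.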

Outline

Laying the foundations

What is chaos engineering?
The principles of chaos engineering
Why are many engineering organizations (including Netflix, Dropbox, Uber, National Australia Bank, and Yandex) using chaos engineering, and how can every engineering organization use chaos engineering to create reliable systems?
How to get started using chaos engineering with your own team and how to measure success

Chaos tools

Common open source chaos tools
How to use chaos engineering for cloud and physical infrastructure servers
How to get started using Chaos Monkey

Advanced topics

How to get started using chaos engineering for databases (MySQL)
How to get started using chaos engineering for Go
What is intuition engineering, and how can tools like Vizceral help you create reliable distributed systems?

Where can you learn more?
How to join the chaos community