Build Systems that Drive Business
Sep 30–Oct 1, 2018: Training
Oct 1–3, 2018: Tutorials & Conference
New York, NY

Chaos engineering bootcamp

Tammy Butow (Gremlin), Ana Margarita Medina (Gremlin), Patrick Higgins (Gremlin)
1:30pm–5:00pm Monday, October 1, 2018
DevOps and SRE
Location: Sutton South/Regent Parlor
Level: Beginner
Secondary topics: Resilient, Performant & Secure Distributed Systems

Materials or downloads needed in advance

  • A laptop with the ability to SSH into a remote server (You'll be provided with cloud infrastructure.)

What you'll learn

  • Learn how to implement chaos engineering

Description

Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. Chaos engineering can be thought of as the facilitation of experiments to uncover systemic weaknesses. These experiments follow four steps:

  1. Define “steady state” as some measurable output of a system that indicates normal behavior
  2. Hypothesize that this steady state will continue in both the control group and the experimental group
  3. Introduce variables that reflect real-world events like servers that crash, hard drives that malfunction, and network connections that are severed
  4. Try to disprove the hypothesis by looking for a difference in steady state between the control group and the experimental group
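The four steps above can be sketched as a small simulation. This is only an illustrative sketch, not the workshop's material or any vendor's API: the function names, the simulated request, the baseline failure rate, and the tolerance are all hypothetical, and a real experiment would measure a live system rather than a random number generator.

```python
import random


def simulate_request(rng, failure_rate):
    """Hypothetical stand-in for a real request; returns True on success."""
    return rng.random() >= failure_rate


def success_rate(rng, n_requests, failure_rate):
    """Step 1: the steady-state metric, here the fraction of successful requests."""
    successes = sum(simulate_request(rng, failure_rate) for _ in range(n_requests))
    return successes / n_requests


def run_experiment(injected_failure_rate, tolerance=0.02, n_requests=5000, seed=0):
    """Compare a control group against a group with an injected fault."""
    rng = random.Random(seed)       # seeded so the sketch is repeatable
    baseline_failure_rate = 0.01    # assumed background failure rate

    # Step 2: hypothesize that both groups stay within `tolerance` of each other.
    control = success_rate(rng, n_requests, baseline_failure_rate)

    # Step 3: introduce a real-world-style event (extra failures) in one group only.
    experimental = success_rate(
        rng, n_requests, baseline_failure_rate + injected_failure_rate
    )

    # Step 4: try to disprove the hypothesis by looking for a difference.
    difference = abs(control - experimental)
    return {
        "control": control,
        "experimental": experimental,
        "hypothesis_holds": difference <= tolerance,
    }


if __name__ == "__main__":
    print(run_experiment(injected_failure_rate=0.0))
    print(run_experiment(injected_failure_rate=0.2))
```

With no injected fault the two groups should agree within the tolerance, while a large injected failure rate should disprove the hypothesis. In practice the steady-state metric, the tolerance, and the fault itself would be chosen for the system under test.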

Tammy Butow, Ana Medina, and Patrick Higgins lead a hands-on deep dive into chaos engineering, covering the tools and practices you need to implement it in your organization. If you already practice chaos engineering, you’ll find new ways to apply it within your engineering organization and learn how other companies use it, including the positive results they have seen from using chaos to build reliable distributed systems.

Tammy Butow

Gremlin

Tammy Butow is a principal SRE at Gremlin, where she works on chaos engineering, the facilitation of controlled experiments to identify systemic weaknesses. Gremlin helps engineers build resilient systems using its control plane and API. Previously, Tammy led SRE teams at Dropbox responsible for the databases and storage systems used by over 500 million customers and was an IMOC (incident manager on call), where she was responsible for managing and resolving high-severity incidents across the company. She has also worked in infrastructure engineering, security engineering, and product engineering. Tammy is the cofounder of Girl Geek Academy, a global movement to teach one million women technical skills by 2025. Tammy is Australian and enjoys riding bikes, skateboarding, snowboarding, and surfing. She also loves mosh pits, crowd surfing, metal, and hardcore punk.

Ana Margarita Medina

Gremlin

Ana Medina is a San Francisco-based chaos engineer at Gremlin, where she helps companies avoid outages by running proactive chaos engineering experiments. Previously, she was an engineer on the SRE and infrastructure teams at Uber, specifically focusing on chaos engineering and cloud computing. She tweets at @Ana_M_Medina, mostly about traveling, diversity in tech, and mental health.

Patrick Higgins

Gremlin

Patrick Higgins is a UI engineer at Gremlin, where he helps developers unleash the power of controlled chaos. He is passionate about finding effective ways to make UIs resilient to failure. He fills his weekends with playing soccer and assisting with civic causes that he cares about.

Comments

Ana Margarita Medina | CHAOS ENGINEER
07/20/2018 1:25pm EDT

Super stoked for this! Find me over at @Ana_M_Medina if you wanna talk chaos before the workshop :)

Patrick Higgins | UI ENGINEER
07/19/2018 10:09am EDT

Hey everyone! Really looking forward to the workshop! If you’re looking to connect pre-conference, you can find me at @higgyCodes on Twitter! Can’t wait!

Tammy Butow | PRINCIPAL SITE RELIABILITY ENGINEER
06/22/2018 2:18pm EDT

Looking forward to seeing everyone! This is going to be super fun :) If you want to read more about chaos engineering before coming along to the workshop, you can find me over on Twitter: twitter.com/tammybutow