Engineering the Future of Software
Feb 3–4, 2019: Training
Feb 4–6, 2019: Tutorials & Conference
New York, NY

Chaos engineering and scalability at Audible.com

Tyler Lund (Audible.com)
2:15pm–3:05pm Wednesday, February 6, 2019
Chaos engineering
Location: Sutton Center/Sutton South
Secondary topics:  Best Practice, Case Study
Average rating: 4.45 (11 ratings)

Who is this presentation for?

  • Engineers, managers, and SREs

Level

Intermediate

Prerequisite knowledge

  • Experience with distributed service-oriented architectures

What you'll learn

  • Explore the evolution of Audible's service architecture
  • Learn how chaos engineering reveals resiliency issues in complex distributed systems
  • Gain knowledge of the specific types of chaos experiments and frameworks
  • Discover a goals-driven approach to implementing a chaos engineering program in your company

Description

You think you know how your system works, until 1:00am one night when it just doesn’t. No matter how big or small your system is, the interactions among its dependencies will turn chaotic when you least expect it. Chaos engineering is a way to understand and tame this chaos.

As Audible.com continues on its path toward distributed microservices and serverless technologies, the audio and playback experience team has used chaos engineering to run experiments that measure the resiliency and reliability of the playback experience. Tyler Lund discusses why Audible is evolving its architecture in this direction and how the customer experience has improved through measured experimentation. The goal of chaos engineering is to experiment with failure scenarios in a software system in order to verify its resiliency at scale. In a distributed system operating at Amazon scale, as Audible does, a single point of failure can affect millions of users.

Join Tyler to learn how Audible tests disruptions and failures such as data center outages, network latency, host failures, surges in request volume, and more to ensure the customer experience never suffers. Tyler dives deep into Audible’s reasons for embracing chaos engineering, how it has been applied and automated, the types of experiments run, and how Audible garnered support throughout the organization to expand experimentation. He also covers the metrics and goals used to win executive support and track success, along with the specific tests the team runs, how they built an automation framework, and the issues they’ve uncovered.
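
As a rough illustration of the kind of experiment described above, the Python sketch below wraps a dependency call so that it can be slowed down or made to fail, then checks how often a caller with retries and a fallback still serves live content. It is not Audible's framework, which this abstract does not detail. Every name in it (with_chaos, InjectedFault, fetch_audio_chunk, play_with_fallback, run_experiment) is hypothetical, and the failure rates are arbitrary.

```python
import random
import time


class InjectedFault(Exception):
    """Stands in for a real dependency failure (hypothetical name)."""


def with_chaos(call, *, failure_rate, added_latency_s, rng, sleep=time.sleep):
    """Wrap `call` so each invocation is delayed and may fail at random."""
    def wrapped(*args, **kwargs):
        sleep(added_latency_s)               # simulated network latency
        if rng.random() < failure_rate:      # simulated host or service failure
            raise InjectedFault("injected dependency failure")
        return call(*args, **kwargs)
    return wrapped


def fetch_audio_chunk(chunk_id):
    """Toy stand-in for a downstream playback service."""
    return f"chunk-{chunk_id}"


def play_with_fallback(fetch, chunk_id, retries=2):
    """Caller under test: retry a few times, then fall back to cached audio."""
    for _ in range(retries + 1):
        try:
            return fetch(chunk_id)
        except InjectedFault:
            continue
    return "cached-chunk"


def run_experiment(failure_rate, requests=1000, seed=7):
    """Return the fraction of requests served live while faults are injected."""
    rng = random.Random(seed)                # seeded so a run can be repeated
    flaky = with_chaos(fetch_audio_chunk, failure_rate=failure_rate,
                       added_latency_s=0.0, rng=rng)
    live = sum(play_with_fallback(flaky, i) != "cached-chunk"
               for i in range(requests))
    return live / requests


if __name__ == "__main__":
    for rate in (0.0, 0.3, 1.0):
        print(f"failure_rate={rate}: served live {run_experiment(rate):.3f}")
```

The experiment states a hypothesis first (for example, "with 30% of dependency calls failing, more than 90% of requests are still served live") and then checks the measured fraction against it. A real program would inject faults into running infrastructure rather than into an in-process function.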

Learn how Audible embraces chaos to make things better for its users and how you can too.


Tyler Lund

Audible.com

Tyler Lund manages the audio and playback experience at Audible.com, where he’s responsible for ensuring customers have an immersive and reliable experience every time they listen to Audible. Tyler has worked at Audible for seven years, managing web, Android, iOS, and services teams along the way. Previously, Tyler worked on high-frequency order management and trading systems in the financial industry and designed automated test harnesses and platforms. Tyler writes about his passions (software development, raising twins, and brewing beer) on his blog, Dadontherunblog.com.