Developers are increasingly expected to be on call, provide out-of-hours support, and respond to production outages. For those without much experience handling incidents, this can be scary and intimidating, like being dropped in the deep end. But it doesn’t have to be that way.
The content team at the Financial Times has transformed its incident response from a number of mildly terrifying multihour outages to a stable platform where team members feel comfortable on call. Drawing on this experience, Euan Finlay shares practical tips and advice on setting up an incident response framework, what to do when “everything is on fire,” and how to improve things afterward—along with some horror stories of his own.
Euan Finlay is an integration engineer at the Financial Times, where he works across multiple teams to support microservices, containers, and the website as a whole. As someone on the ops-ier side of DevOps, he has occasionally admitted to being a sysadmin while in public.
©2018, O’Reilly UK Ltd