Build Systems that Drive Business
30–31 Oct 2018: Training
31 Oct–2 Nov 2018: Tutorials & Conference
London, UK

Don't panic! How to cope now that you're responsible for production

Euan Finlay (Financial Times)
16:35–17:15 Thursday, 1 November 2018
DevOps and SRE
Location: Blenheim Room - Palace Suite
Secondary topics: Systems Monitoring & Orchestration
Average rating: 4.43 (7 ratings)

What you'll learn

  • Learn tips and best practices for incident response

Description

Developers are increasingly expected to be on call, provide out-of-hours support, and respond to production outages. Without much experience handling incidents, this can be scary and intimidating—like being dropped in the deep end. But it doesn’t have to be that way.

The content team at the Financial Times has transformed its incident response from a number of mildly terrifying multihour outages to a stable platform where team members feel comfortable on call. Drawing on this experience, Euan Finlay shares practical tips and advice on setting up an incident response framework, what to do when “everything is on fire,” and how to improve things afterward—along with some horror stories of his own.

Euan Finlay

Financial Times

Euan is part of the Operations & Reliability team at the FT, managing incidents across the globe. Before that, he led a distributed team responsible for Go microservices, Docker containers in Kubernetes, and the backend APIs powering the website.

On the Ops-ier side of DevOps, he has occasionally admitted to being a sysadmin in public.