Build Systems that Drive Business
30–31 Oct 2018: Training
31 Oct–2 Nov 2018: Tutorials & Conference
London, UK

Leveraging Envoy when responding to high-severity incidents

14:10–14:50 Thursday, 1 November 2018
Monitoring, Observability, and Performance
Location: King's Suite - Balmoral
Average rating: 3.20 out of 5 (5 ratings)

Prerequisite knowledge

  • A basic understanding of Envoy Proxy and microservice architecture

What you'll learn

  • How engineers at Lyft use Envoy’s extensive metrics to identify the root cause of production incidents

Description

Managing a high-severity incident is inherently stressful, and it’s made even worse when the available data is lacking and heterogeneous. Lyft runs Envoy at every hop of the network, providing best-in-class observability across the entirety of Lyft’s network topology. Having that set of homogeneous data vastly reduces the time it takes to identify a production issue.

Constance Caramanolis simulates a production incident and walks you through a page from the dreaded PagerDuty notification to resolution, demonstrating how engineers at Lyft use Envoy’s extensive metrics to identify the root cause of the incident and then proceed to remedy the situation.

Constance Caramanolis

Lyft

Constance Caramanolis is a software engineer on the server networking team at Lyft, where, for the past two years, she has built and deployed Envoy and its ecosystem. Constance focuses on configuration management, network security, and engineering education, and is an Envoy maintainer. Previously, Constance worked at Microsoft on several different projects and teams.