Production Engineering, SRE, and DevOps sessions
Edward Muller (Heroku) is the Engineering Manager of Heroku's Operational Experience team, which focuses on helping customers understand how their Heroku apps are operating. During his career he has written open source and closed source software in several different programming languages, run an ISP, architected systems at a large financial company, owned a cyber cafe, and designed, installed and managed networks running all sorts of systems from Linux to Microsoft Windows to Novell NetWare. He has spent the last 11 years working on PaaS systems with a focus on operations, logging and observability.
9:00am–12:30pm Tuesday, June 11, 2019
Ryan Kitchens, Lorin Hochstein, and Nora Jones discuss incident management and explore effective approaches and techniques that help you build the capacity to encounter failure and manage its consequences successfully.
1:30pm–5:00pm Tuesday, June 11, 2019
Explore the key concepts behind large system design with Jenny Liao, as she guides you through building, scaling and provisioning a system. Apply the concepts you learn to evaluate and build systems of your own. You will be working in small groups.
11:35am–12:15pm Thursday, June 13, 2019
Alex Elman explains how Indeed used a site-wide outage as an opportunity to build resilience, improve reliability, and make lasting improvements to the engineering culture.
1:25pm–2:05pm Thursday, June 13, 2019
Charity Majors explains why the only environment that matters is production. For the good of humanity, ditch the rest.
2:20pm–3:00pm Thursday, June 13, 2019
DevOps squads coordinate in almost every aspect of their work. Laura Maguire explores how high-performing teams responding to service outages demonstrate sophisticated, nuanced practices that ease the cognitive burden of coping with complex, time-pressured incidents.
3:50pm–4:30pm Thursday, June 13, 2019
A lot has been said about the SRE profession (how to start an SRE team, how to scale a single team in place, etc.), but how to move from a single SRE team to an SRE organization that requires several teams has been largely unexplored. Gustavo Franco takes new SRE leaders and individual contributors through what it takes to join or start a second team and beyond.
4:45pm–5:25pm Thursday, June 13, 2019
DevOps and platform teams have too many projects, not enough time, and users who can easily ask if the thing is done, because "it's really holding them up." James Heimbuck explores the good, the bad, and the ugly of how SendGrid incorporates product management practices into planning and execution within DevOps and platform teams to cut off scope creep and never-ending projects and realize value.