Are “blameless” postmortems real? Sure, small companies here and there talk about them, but what about large enterprises? As organizations experiment with greater concurrency and integration between their departments and move toward continuous delivery of customer value, failure is assured. Asking how failure can be avoided isn’t as useful or relevant as focusing on how an organization reacts when failure occurs and how to create a sustainable, actionable process for describing, exploring, and remedying failure.
J. Paul Reed and Kevina Finn-Braun discuss the hurdles, lessons, and surprises Salesforce’s Service Reliability Engineering team discovered while rolling out actionable retrospectives in a large, complex organization, including what works, what doesn’t, and techniques that, well... the jury is still out on. This is the story of Kevina and Paul’s months-long journey to identify the specifics of what made reliability retrospectives difficult, why actionable takeaways were often lacking, and how the feedback loops within the company’s operations organization weren’t serving Salesforce’s needs. Kevina and Paul led a series of experiments, putting the SRE team on the road to improving their ability to respond, react, remediate, and reincorporate lessons learned from failure into the organization.
J. Paul Reed is the founder of Release Engineering Approaches, a consultancy incorporating a host of tools and techniques to help organizations “Simply Ship. Every time.” Paul has worked across a number of industries, from financial services to cloud-based infrastructure, with teams from 2 to 2,000 on everything from tooling, operational analysis and improvement, and team culture transformation to business value optimization. He is also the chief delivery officer and a visiting scientist at Praxisflow.
Kevina Finn-Braun’s focus throughout her 18 years in the Internet industry has been operational excellence and risk management. Kevina is currently director of site reliability service management at Salesforce, where she leads the team focused on operational process improvements in the areas of incident, problem, and change management. In her previous role as director of business continuity at Yahoo, she led the team focused on risk management and service continuity best practices.
©2016, O'Reilly Media, Inc.