What isn’t site reliability engineering? Does your NOC escalate outages to your DevOps engineer, who in turn calls your packaging and deployment team? Did your Chef just sprinkle some Salt on your Ansible Red Hat and call it SRE? Lots of companies claim to have SRE teams, but some don’t quite understand the full value proposition, or which shiny technologies and organizational structures will hurt your operations rather than empower your team to accomplish its mission.
Blake Bisset and Jonah Horowitz share stories about anti-patterns in monitoring, incident response, configuration management, and more that they’ve tripped over on their own teams, seen proposed as good practice in talks at other conferences, or heard in conversations with peers in the industry. Blake and Jonah also explain how Google and Netflix view the role of the SRE (and how it differs from the traditional system administrator role). You’ll learn that freedom and responsibility are key, trust is required, and chaos is (sometimes) your friend.
Blake Bisset got his first legal tech job at 16. He won’t say how long ago, except that he’s legitimately entitled to make shaky fists while shouting, “Get off my LAN!” He’s cofounded three startups (a joint venture with DuPont/ConAgra, a biotech spinoff from UW, and one that started this time a bunch of kids were sitting around on New Year’s Eve, wondering why they couldn’t watch movies on the internet), only to end up spending a half-decade as an SRM at YouTube and Chrome, where his happiest accomplishment was holding the go/bestpostmortem link for several years.
Jonah Horowitz is a site reliability engineer at Stripe, where he works with all of the company’s individual engineering teams to drive reliability efforts, including monitoring, alerting, deployment pipelines, and chaos resiliency. Previously, Jonah worked at several startups around the Bay Area, including Netflix, Quantcast (a leading ad-tech startup, where he grew the company’s network to process over three million events per second), and Looksmart (a contextual advertising company), and was on the founding team of Walmart.com (now @Walmart Labs), where he built out the company’s software deployment pipelines and its product image management systems.
©2017, O'Reilly Media, Inc.