Build Systems that Drive Business

Sep 30–Oct 1, 2018: Training
Oct 1–3, 2018: Tutorials & Conference

New York, NY

Building successful site reliability engineering in large enterprises

Liz Fong-Jones (Honeycomb), Dave Rensin (Google)

11:35am–12:15pm Tuesday, October 2, 2018

DevOps and SRE
Location: Murray Hill

Level: Beginner

Secondary topics: Systems Monitoring & Orchestration

What you'll learn

Explore eight key lessons that Google’s customer reliability engineering team learned while helping large enterprises adopt SRE as an operations engineering model.

Description

Google’s customer reliability engineering team is a specialized group of SREs who go into the world and teach enterprise customers of public cloud infrastructure—via their actual production systems—how to “do SRE” in their orgs. In the team’s two years of existence, its members have found that some things they thought would be hard weren’t, while others were nigh on impossible. The team has written many postmortems and learned a bunch of lessons you can only learn the hard way. Liz Fong-Jones and Dave Rensin share eight of these key lessons.

Topics include:

Why it’s easier to bootstrap SRE in a large traditional enterprise than a cloud-native one
Things enterprises assume are true but aren’t
Where to start in your SRE journey (including SLOs and quantitative risk measurements)
All the things the team should have known but still learned the hard way—and how you can avoid them when bootstrapping SRE in your culture (or your customers’ cultures)

Liz Fong-Jones

Honeycomb

Liz Fong-Jones is a developer advocate at Honeycomb, a labor and ethics organizer, and a site reliability engineer (SRE) with 15+ years of experience. Previously, she was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights. She lives in Brooklyn with her wife, metamours, and a Samoyed/Golden Retriever mix, and in San Francisco and Seattle with her other partners. She plays classical piano, leads an EVE Online alliance, and advocates for transgender rights as a board member of the National Center for Transgender Equality.

Dave Rensin

Google

Dave Rensin is the director of customer reliability engineering (CRE) at Google. His team takes Google SREs focused on the reliability and availability of internal Google systems and focuses them on the reliability and availability of customer production systems running on Google Cloud. His mission is to teach Google customers how to design, build, and run highly available systems using Google SRE practices and tools. Dave is the author of several books, including two for O’Reilly, and holds more than a dozen patents in distributed systems, data acquisition, access control, and pattern matching.

©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com