Build Systems that Drive Business

Sep 30–Oct 1, 2018: Training
Oct 1–3, 2018: Tutorials & Conference

New York, NY

DevOps and SRE sessions

Building and running complex systems that are both fast and reliable requires teams and applications that work well together. The cultural shift is evident: software engineers and system administrators are breaking down walls as they move toward shared responsibilities, quickening the pace of software development and delivery. This track explores these new ways of working together, with insights and lessons gathered from taking software from concept to production.

Track host

Tanya Reilly (Google) is a system administrator and site reliability engineer at Google, where she works on low-level infrastructure like distributed locking, load balancing, and bootstrapping. Previously, she was a system administrator at Eircom.net, Ireland's largest ISP, and the entire IT department for a small software house.

9:00am–12:30pm Monday, October 1, 2018

Kubernetes 101

Location: Beekman/Sutton North • Level: Beginner

Secondary topics: Systems Architecture & Infrastructure

Bridget Kromhout (Microsoft)

Average rating: 4.78 (9 ratings)

Bridget Kromhout walks you through launching clusters and details all the moving parts you need to know about to use Kubernetes in production.

9:00am–12:30pm Monday, October 1, 2018

Ansible for SRE teams

Location: Nassau • Level: Beginner

Secondary topics: Systems Monitoring & Orchestration

James Meickle (Quantopian)

Average rating: 4.00 (1 rating)

Ansible is a "batteries included" automation, configuration management, and orchestration tool that's fast to learn and flexible enough for any architecture. Join James Meickle to get started with Ansible, with an eye toward sustainable development in cloud environments.

1:30pm–5:00pm Monday, October 1, 2018

Chaos engineering bootcamp

Location: Sutton South/Regent Parlor • Level: Beginner

Secondary topics: Resilient, Performant & Secure Distributed Systems

Tammy Butow (Gremlin), Ana Margarita Medina (Gremlin), Patrick Higgins (Gremlin)

Average rating: 3.00 (2 ratings)

Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. Tammy Butow, Ana Medina, and Patrick Higgins lead a hands-on deep dive into chaos engineering, covering the tools and practices you need to implement it in your organization.

1:30pm–5:00pm Monday, October 1, 2018

Smart networking with service meshes

Location: Murray Hill East (B) • Level: Intermediate

Secondary topics: Systems Architecture & Infrastructure

Anubhav Mishra (HashiCorp)

Average rating: 3.00 (2 ratings)

Over the past year, service meshes have gained significant interest. Most service meshes have two components: a control plane and a data plane. Anubhav Mishra explains what it takes to build a scalable control plane and data plane. He also discusses how HashiCorp Consul provides features, such as a distributed key-value store and service discovery, that make it ideal for a control plane.

11:35am–12:15pm Tuesday, October 2, 2018

Building successful site reliability engineering in large enterprises

Location: Murray Hill • Level: Beginner

Secondary topics: Systems Monitoring & Orchestration

Liz Fong-Jones (Honeycomb), Dave Rensin (Google)

Average rating: 4.25 (4 ratings)

Implementing site reliability engineering (SRE) doesn't have to be intimidating, and it isn't only for cloud-native organizations. Liz Fong-Jones and Dave Rensin share eight key lessons that Google's customer reliability engineering team learned while helping large enterprises adopt SRE as an operations engineering model.

1:30pm–2:10pm Tuesday, October 2, 2018

How NTSB air disaster analysis can help you in an emergency

Location: Murray Hill • Level: Beginner

Secondary topics: Systems Architecture & Infrastructure

Matt Rogish (ReactiveOps)

Average rating: 5.00 (1 rating)

Matt Rogish explains how NTSB investigations of air disasters have dramatically improved flight safety. He then applies lessons learned in disaster recovery and analysis, teamwork, task saturation, and systems design to modern software application and infrastructure architecture at scale, with the goal of higher availability, fewer errors, and more scalable systems.

2:25pm–3:05pm Tuesday, October 2, 2018

Disaster resilience the Waffle House way, from flattops to feature flags and more

Location: Murray Hill • Level: Non-technical

Secondary topics: Resilient, Performant & Secure Distributed Systems

Heidi Waterhouse (LaunchDarkly)

Average rating: 5.00 (4 ratings)

Waffle House's hurricane disaster plan has everything you could want from an IT disaster plan, including contact trees, failover states, and runbooks on partial operation. Heidi Waterhouse shares lessons about state drawn from the world outside computers and explains how to quantify them using a finite state machine and implement them automatically while you are in a less-than-perfect condition.

4:45pm–5:25pm Tuesday, October 2, 2018

The ops in serverless

Location: Murray Hill

Jennifer Davis (Microsoft)

Average rating: 4.00 (2 ratings)

Rather than a future of NoOps, serverless has increased the need for specialized operations engineering. Jennifer Davis explores the role of operations in serverless, covering testing, monitoring, and debugging functions.

Sponsorship Opportunities

For exhibition and sponsorship opportunities, email velocity@oreilly.com

Partner Opportunities

For information on trade opportunities with O'Reilly conferences, email partners@oreilly.com

©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com