Fueling innovative software
July 15-18, 2019
Portland, OR

Large-scale automated storage on Kubernetes

Matt Schallert (Chronosphere)
11:50am-12:30pm Thursday, July 18, 2019
Secondary topics: Data Driven
Average rating: 3.00 out of 5 (5 ratings)

Who is this presentation for?

  • Database developers and site reliability engineers

Level

Intermediate

Description

Managing large stateful applications is tough.

Matt Schallert outlines how Uber used Kubernetes to automate the management of a challenging stateful workload: M3DB, its sharded, replicated, multizone time series database. He examines the operational challenges the company faced while scaling M3DB from a handful of clusters to more than 40 across multiple data centers and cloud providers, all while trying to create an environment-agnostic solution for open source users.

Matt then demonstrates methods of managing stateful workloads in a declarative manner to ease operational burden. You’ll see how M3DB’s declarative approach to cluster management can be extended to other workloads using its common set of open source libraries. This approach made orchestrating M3DB easier.
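
As a rough illustration of what "declarative" means here, the sketch below compares a desired cluster layout with the observed one and derives the actions needed to converge. It is a minimal sketch under stated assumptions, not M3DB's or Kubernetes' actual API: the function name, the zone names, and the per-zone instance counts are all hypothetical.

```python
from typing import Dict, List


def reconcile(desired: Dict[str, int], observed: Dict[str, int]) -> List[str]:
    """Return the actions that move the observed state toward the desired state.

    Both arguments map a zone name to an instance count. This hypothetical
    model is for illustration only and is not taken from M3DB.
    """
    actions = []
    # Visit every zone named in either state, in a stable order.
    for zone in sorted(set(desired) | set(observed)):
        want = desired.get(zone, 0)
        have = observed.get(zone, 0)
        if have < want:
            actions.append(f"add {want - have} instance(s) in {zone}")
        elif have > want:
            actions.append(f"remove {have - want} instance(s) in {zone}")
    return actions


# The operator declares the end state; the loop works out the steps.
desired_state = {"zone-a": 3, "zone-b": 3}
observed_state = {"zone-a": 3, "zone-b": 1, "zone-c": 2}
plan = reconcile(desired_state, observed_state)
```

The operator only edits the desired state. Running the same comparison repeatedly is safe, because once the observed state matches the desired state the plan is empty.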

Along the way, Matt shares lessons learned that you can apply to a variety of stateful workloads across bare metal and cloud environments, whether they run under an orchestration system or you manage instances directly. You’ll walk away with advice for managing stateful systems at scale and lessons to bear in mind when considering an orchestration system for state management.

Prerequisite knowledge

  • Familiarity with designing and automating databases or data ingestion pipelines, specifically in the domain of metrics and time series data

What you'll learn

  • Learn how the declarative approach to infrastructure embraced by systems such as Kubernetes can make stateful systems easier to operate and automate, and how these ideas apply to a variety of workloads, such as time series databases and ingestion pipelines

Matt Schallert

Chronosphere

Matt is a Senior Software Engineer at Chronosphere and works on M3, an open source metrics platform. Recently, his efforts have been focused on improving the operational experience for users of M3. Previously, Matt was a Senior Site Reliability Engineer at Uber, where he helped launch M3, and prior to that he was an SRE at Tumblr. In his spare time, Matt can be found hiking, skiing, and building data centers in his apartment.