M3 and Prometheus: Monitoring at planet scale for everyone

Rob Skillington (Chronosphere), Łukasz Szczęsny (M3)

16:45–17:25 Wednesday, 6 November 2019

Location: Expo Hall Sessions

Expo Plus Sessions; Monitoring, Observability, and Performance; Overcoming Obstacles: Lessons in Resilience

Who is this presentation for?

Software engineers, SREs, and DevOps engineers

Level

Intermediate

Description

For the past few years, Prometheus has solved many people’s monitoring needs, and it’s exceptional at what it does. As its popularity has exploded, many users now want to store more metrics with longer retention and to establish a single pane of glass on top of Prometheus for monitoring across regions.

M3, first developed at Uber, is an open source metrics platform that integrates with Prometheus and that you can deploy and run using Kubernetes and Helm. It can cost-efficiently store petabytes of metrics data, replicated for high availability, using compaction-averse time series storage and an index that efficiently serves dimension-based regexp queries across billions of metrics.

Rob Skillington and Łukasz Szczęsny use a real-world example to cover how to deploy M3Coordinator and M3DB using the M3 Kubernetes operator and how to connect your Prometheus instances together into a single global monitoring system.

Prerequisite knowledge

Familiarity with metrics, monitoring, and alerting

What you'll learn

Identify how to solve the challenges of scaling an observability organization
Learn how to set up and scale a metrics monitoring and alerting stack: start with a single Prometheus server, understand when and why to add remote storage, and see what it looks like to deploy M3 as a global store for Prometheus metrics

Rob Skillington

Chronosphere

Rob Skillington is the chief technology officer at Chronosphere. Previously, he was on the monitoring team at Uber, where he created M3DB, an open source time series database built for M3 to scale to the needs of Uber’s ever-growing metrics footprint of more than ten billion metrics. He’s also a member of OpenMetrics, an effort to create an open standard for transmitting metrics at scale.

Łukasz Szczęsny

M3

Łukasz Szczęsny has worked as an infrastructure engineer in many roles. As an early SRE on Uber’s observability team, he developed parts of M3 and other monitoring infrastructure.