Bring the Noise: Making Effective Use of a Quarter Million Metrics

Abe Stanway (Etsy), Jon Cowie (Chef)
Operations, Ballroom AB
Tutorial
Please note: to attend, your registration must include Tutorials on Tuesday.

At Etsy, we collect over a quarter million metrics from a variety of monitoring systems – everything from 404s to how much money we make, all in real time. However, with so many metrics and only a hundred engineers, how can we effectively monitor everything to separate the signal from the noise?

We’d like to introduce Velocity to two complementary tools we’ve developed to solve this problem.

Skyline is our real-time anomaly detection system. It continually analyzes each of our metrics as it comes in, looking for anomalous behavior. The algorithm we use automatically determines what it means for any given metric to be anomalous, and when an anomaly is detected, our engineers are alerted via an interactive dashboard and can react accordingly.
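
To make the idea concrete, here is a minimal sketch of one common way to flag an anomalous data point: compare the newest value against the mean and standard deviation of the metric's own recent history. This is an illustrative assumption, not a description of Skyline's actual algorithm; the function name `is_anomalous` and the `threshold` parameter are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(series, threshold=3.0):
    """Return True if the latest point is an outlier relative to its history.

    Hypothetical helper: flags the last value when it lies more than
    `threshold` sample standard deviations from the mean of earlier values.
    """
    if len(series) < 3:
        # Too little history to estimate a spread.
        return False

    history, latest = series[:-1], series[-1]
    mu = mean(history)
    sigma = stdev(history)

    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return latest != mu

    return abs(latest - mu) > threshold * sigma


# Usage: a steady metric followed by a sudden spike.
steady = [10, 11, 9, 10, 12, 10, 11, 10]
spiking = [10, 11, 9, 10, 12, 10, 11, 50]
print(is_anomalous(steady), is_anomalous(spiking))
```

Because the baseline is computed per metric, no hand-tuned limit is needed for each time series, which is what makes this kind of check workable across a very large number of metrics.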

Of course, with so many metrics, an anomalous event often impacts many metrics in similar ways. We wanted to surface these correlations automatically, to avoid manually curating dashboards, so we built Oculus – a way to index and compare all metrics with each other for similarity. This way, we can detect how a problem impacts many different metrics at once, furthering our understanding of the incident. We can then save and index the incident “pattern” itself so that if it happens again, Oculus will let us know immediately, while telling us the past diagnosis and solution.
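
As a rough sketch of what comparing metrics for similarity can look like, the example below normalizes each series so that only its shape matters, then ranks candidates by Euclidean distance to a target series. It is an assumption made for illustration, not Oculus's actual implementation; the names `normalize`, `distance`, and `most_similar` and the sample metric names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def normalize(series):
    """Rescale a series to zero mean and unit variance (z-scores)."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A flat series has no shape; represent it as all zeros.
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def distance(a, b):
    """Euclidean distance between two equal-length series after normalizing."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    na, nb = normalize(a), normalize(b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)))


def most_similar(target, candidates, limit=5):
    """Return up to `limit` candidate names, closest shape first."""
    scored = sorted((distance(target, s), name) for name, s in candidates.items())
    return [name for _, name in scored[:limit]]


# Usage: "errors" has the same shape as the target at ten times the scale.
target = [1, 2, 8, 2, 1]
candidates = {
    "errors": [10, 20, 80, 20, 10],
    "logins": [5, 5, 5, 5, 5],
    "latency": [8, 2, 1, 2, 1],
}
print(most_similar(target, candidates))
```

Normalizing first means two metrics with very different units can still match when they spike and recover together.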

This joint talk will cover the following topics:

  • Metric Overload – The situation which gave rise to the development of these tools and how we approached the problem
  • Skyline – The architecture and algorithms we use for real-time anomaly detection on a massive scale
  • Oculus – The architecture and algorithms we use to compute similarity and correlation across all of our metric data
  • Demonstration – This talk wouldn’t be any fun if we didn’t show you the tools firsthand!

Abe Stanway

Etsy

Abe is an engineer at Etsy. He was previously a 2011 hackNY Fellow, and he is also the founder of Hacker League. You may know him as the insolent author of commitlogsfromlastnight.com.

Jon Cowie

Chef

Jon is a Web Operations Engineer at Etsy, having done the rounds as a Sysadmin at several London tech startups. He likes Chef. A lot.
