At Etsy, we collect over a quarter million metrics from a variety of monitoring systems – everything from 404s to how much money we make, all in real time. However, with so many metrics and only a hundred engineers, how can we effectively monitor everything to separate the signal from the noise?
We’d like to introduce Velocity to two complementary tools we’ve developed to solve this problem.
Skyline is our real-time anomaly detection system. It continually analyzes each of our metrics, as they come in, for anomalous behavior. The algorithm we use automatically determines what it means for any given metric to be anomalous, and when an anomaly is detected, our engineers are alerted via an interactive dashboard and can react accordingly.
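As a rough illustration of per-metric anomaly detection (not Skyline's actual algorithm, which this abstract does not specify), here is a minimal sketch that assumes a simple three-sigma rule: the latest datapoint is flagged if it sits more than three standard deviations from the mean of the points before it. The function name `is_anomalous` and the `sigmas` parameter are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(series, sigmas=3.0):
    """Flag the latest point if it is far from the mean of the earlier points.

    Assumption: a plain three-sigma rule, used here only for illustration.
    The threshold is derived from the metric's own history, so no per-metric
    tuning is required.
    """
    if len(series) < 3:
        # Too little history to estimate a standard deviation.
        return False
    history, latest = series[:-1], series[-1]
    mu = mean(history)
    sd = stdev(history)
    if sd == 0:
        # A perfectly flat history: any change at all is a deviation.
        return latest != mu
    return abs(latest - mu) > sigmas * sd


if __name__ == "__main__":
    steady = [10, 11, 9, 10, 12, 10, 11, 10]
    spike = [10, 11, 9, 10, 12, 10, 11, 50]
    print(is_anomalous(steady), is_anomalous(spike))
```

Because the threshold comes from each metric's own recent values, the same function can be applied unchanged to metrics with very different scales.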
Of course, with so many metrics, an anomalous event often impacts many metrics in similar ways. We wanted to surface these correlations automatically, to avoid manually curating dashboards, so we built Oculus – a way to index and compare all metrics with each other for similarity. This way, we can detect how a problem impacts many different metrics at once, furthering our understanding of the incident. We can then save and index the incident “pattern” itself so that if it happens again, Oculus will let us know immediately, while telling us the past diagnosis and solution.
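To make the idea of comparing metrics for similarity concrete, here is a minimal sketch. It assumes one simple approach, which is not necessarily what Oculus does: each series is rescaled to zero mean and unit variance so that only its shape matters, and candidates are ranked by Euclidean distance to a query series. The names `normalize`, `distance`, and `most_similar` are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def normalize(series):
    """Rescale a series to zero mean and unit variance (shape only)."""
    mu = mean(series)
    sd = pstdev(series)
    if sd == 0:
        # A flat series has no shape; map it to all zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sd for x in series]


def distance(a, b):
    """Euclidean distance between two equal-length series after normalizing."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    na, nb = normalize(a), normalize(b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)))


def most_similar(query, metrics, top=3):
    """Return the names of the `top` metrics closest in shape to `query`."""
    scored = sorted((distance(query, s), name) for name, s in metrics.items())
    return [name for _, name in scored[:top]]


if __name__ == "__main__":
    # Made-up example data: two metrics spike together, one does not.
    metrics = {
        "errors": [1, 2, 3, 10, 3, 2],
        "latency_ms": [10, 20, 30, 100, 30, 20],
        "signups": [5, 5, 5, 5, 5, 6],
    }
    incident = [2, 4, 6, 20, 6, 4]
    print(most_similar(incident, metrics, top=2))
```

Under the same assumptions, a saved incident "pattern" is just another query series: storing it alongside a diagnosis and rerunning the comparison against incoming data is one way such a pattern could be recognized again.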
This joint talk will cover the following topics:
Abe is an engineer at Etsy. He was a 2011 hackNY Fellow and is the founder of Hacker League. You may know him as the insolent author of commitlogsfromlastnight.com.
Jon is a Web Operations Engineer at Etsy, having done the rounds as a sysadmin at several London tech startups. He likes Chef. A lot.