Less Alarming Alerts

Operations and Culture
Location: King's Suite - Balmoral Level: Intermediate

Pretty much every company that has computers on the internet has someone who gets called when those computers go down. While this practice isn’t surprising, what is surprising is how little time we spend as an industry discussing the right way to design and implement alerts. Not in a technical sense; what we need to discuss is how to make alerts something that is actually of value to the business, and worth the disruption they cause in people’s lives. That may sound a bit dramatic, but “pager fatigue” is a real risk to business, and “phantom pages” are a sign that things have gotten out of hand. We have terms for the bad things; it’s time to start talking about the good things. Topics we’ll cover include:

  • The difference between metrics, alerts, alarms, and other particulars.
  • How to determine who should be called when a problem arises.
  • Simple and effective techniques for your team to respond to alerts and alarms.
  • How to attack your monitoring setup to eliminate alerts without adding risk.
  • Defining what “production ready” software is in a way that the business people will agree to.

At OmniTI, we’re often forced to walk into the middle of an existing infrastructure that is already set on fire. The only thing worse than having no alerts in that situation is having hundreds of alerts screaming at you constantly. Over the years we’ve had to come up with a way to help keep our operations team sane while also providing business value, and most importantly giving comfort to the folks that have brought us in. The methods that we’ve developed can be used by any operations team to help bring sanity back to their world, and end the cycle of “pager fatigue”.

Robert Treat

OmniTI

Having worked on database-backed, internet-based systems for over a decade, Robert is co-author of the book Beginning PHP and PostgreSQL 8, maintains the phpPgAdmin software package, and has been recognized as a major contributor to the PostgreSQL project. An international speaker on databases, open source, and managing web operations at scale, he spends his days as COO of OmniTI, a web consultancy focused on building and managing highly scalable web infrastructure.
