A Survey of Statistical Anomaly Detection Methods on Machine Data

Baron Schwartz (VividCortex)

Many data scientists working with machine data are looking to anomaly detection as a way to find the interesting needles in the haystack. Although more sophisticated and robust methods exist, statistical methods are often a good, simple approximation and can be made very efficient.

This talk will cover the following:

  • What is anomaly detection?
  • Why are people expressing interest in anomaly detection?
  • What statistical techniques are often used?
  • What should we learn from disciplines such as finance?
  • How well do statistical techniques work on machine data?
  • What online/realtime methods can be applied for high throughput and efficiency?
  • What options, besides statistical techniques, are worth considering?

Although this presentation will include some math, it will all be explained in intuitive terms, so no mathematical background is needed to understand the concepts presented. Sample data and a spreadsheet will be provided so you can repeat the examples.

Note that this speaker has no horse in this race and isn’t trying to sell you anything (VividCortex doesn’t do anomaly detection). This is purely educational.

Baron Schwartz

VividCortex

Baron is co-founder and CEO of VividCortex, a SaaS server performance management product. He is the lead author of High Performance MySQL and has written a variety of open-source software.