Build resilient systems at scale
28–30 October 2015 • Amsterdam, The Netherlands

Statistics for engineers

Heinrich Hartmann (Circonus)
9:30–11:00 Wednesday, 28/10/2015
Location: Emerald Room
Average rating: 3.31 (26 ratings)

Prerequisite Knowledge

High-school-level mathematics, set notation, familiarity with the UNIX command line, and the ability to read Python code.

Materials or downloads needed in advance

Please install the following software components in advance:

A working Python 2.x installation including:

I recommend installing the Anaconda Python distribution, which ships with those tools by default.


Gathering all kinds of telemetry data is key to operating reliable distributed systems at scale. Once you have set up your monitoring systems and recorded all relevant data, the challenge is to make sense of it and extract valuable information, such as:

  • Are we fulfilling our SLA?
  • How did our query response times change with the last update?
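As a rough illustration of how such questions can be put to recorded data, the sketch below checks a latency SLA before and after an update. The sample values, the 200 ms threshold, the 90% target, and the helper names `fraction_within` and `meets_sla` are all hypothetical and are not taken from the tutorial materials.

```python
# Hypothetical response times in milliseconds (illustrative only).
before_update = [120, 135, 150, 180, 210, 140, 160, 500, 130, 145]
after_update = [110, 125, 140, 150, 190, 130, 150, 230, 120, 135]

SLA_THRESHOLD_MS = 200  # assumed latency limit per request
SLA_TARGET = 0.90       # assumed fraction of requests that must meet the limit


def fraction_within(latencies, threshold_ms):
    """Return the fraction of requests served within threshold_ms."""
    within = sum(1 for value in latencies if value <= threshold_ms)
    return within / float(len(latencies))


def meets_sla(latencies, threshold_ms=SLA_THRESHOLD_MS, target=SLA_TARGET):
    """Return True if enough requests finished within the threshold."""
    return fraction_within(latencies, threshold_ms) >= target


for label, sample in [("before", before_update), ("after", after_update)]:
    print(label, fraction_within(sample, SLA_THRESHOLD_MS), meets_sla(sample))
```

With these made-up samples, 8 of 10 requests meet the limit before the update and 9 of 10 after it, so only the second sample satisfies the assumed target.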

Statistics is the art of extracting information from data. In this tutorial, we cover the basic statistical knowledge that helps you in your daily work as a system operator. Topics include probabilistic models and ways of summarizing distributions with mean values, quantiles, and histograms, as well as the relations between these summaries.
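The three summaries named above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions: the latency sample is invented, the quantile uses the nearest-rank definition (one of several common definitions, and not necessarily the one used in the tutorial), and the helper names are hypothetical.

```python
import math

# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies = [12, 15, 11, 14, 13, 250, 16, 12, 13, 14]


def mean(values):
    """Arithmetic mean of the sample."""
    return sum(values) / float(len(values))


def quantile(values, q):
    """Return the q-quantile (0 < q <= 1) using the nearest-rank definition."""
    ordered = sorted(values)
    rank = int(math.ceil(q * len(ordered)))  # 1-based rank into the sorted sample
    return ordered[max(rank, 1) - 1]


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    counts = {}
    for value in values:
        lower_edge = int(value // bin_width) * bin_width
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return counts


print("mean:", mean(latencies))
print("median:", quantile(latencies, 0.5))
print("histogram:", sorted(histogram(latencies, 10).items()))
```

For this sample the mean is 37.0 while the median is 13: a single outlier pulls the mean far away from the typical request. The histogram shows the same thing, with nine values in the 10 to 19 ms bin and one value in the 250 ms bin.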

The tutorial focuses on practical aspects, and will give you hands-on knowledge of how to handle, import, analyze, and visualize telemetry data with UNIX command line tools and the IPython toolkit.

Heinrich Hartmann


Heinrich Hartmann is the lead data scientist at Circonus, where he drives the development of analytics methods that transform monitoring data into actionable information as part of the Circonus monitoring platform. Heinrich earned his PhD in mathematics from the University of Bonn and afterward worked as a researcher at the University of Oxford. In 2012, he shifted his focus to computer science, and he now applies his 10+ years of mathematical expertise to data analytics.