Human Confirmation Bias in Monitoring of Systems

Chris Baker (Dyn)
Operations
Mission City Ballroom B4

As engineers we seek to apply “scientific knowledge, mathematics, and ingenuity to develop solutions to technical problems.” The key word, however, is “seek.” Although technology is immune to bias and heuristics, humans are not. And, even if we don’t like to admit it, we’re all human and those biases often impair our ability to apply the scientific method and come to proper conclusions. This talk will explore the role bias has played in the identification, attribution and resolution of a collection of issues, which will be illustrated through first-hand experience.

The first scenario centers on exploratory analysis and asks the question: is it too good to be true? Key business indicators show growth in customers, growth in traffic, growth in load; everything makes sense…or does it? When growth trends show up in a company’s key business indicators, how do you avoid the correlation/causation dilemma? Do you dig into the data to establish a clear causal relationship, or do you let bias take over and assume the data as presented is enough? Examining this can help you stay ahead of a potential problem.
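
As a rough illustration of why trending indicators are easy to over-read, the sketch below generates two series that grow independently and compares their correlation before and after removing the trend. The series names, growth rates, and noise levels are made-up assumptions for illustration, not data from the talk.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def diffs(xs):
    """Period-over-period changes, which strip out a linear trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Hypothetical weekly indicators: both grow over time, but their
# noise is generated independently, so neither drives the other.
rng = random.Random(42)
weeks = range(52)
customers = [1000 + 20 * w + rng.gauss(0, 15) for w in weeks]
load = [50 + 1.5 * w + rng.gauss(0, 2) for w in weeks]

# Correlation of the raw levels is dominated by the shared upward trend.
r_levels = pearson(customers, load)

# Correlation of week-over-week changes reflects only the independent noise.
r_changes = pearson(diffs(customers), diffs(load))

print(f"levels:  r = {r_levels:.2f}")
print(f"changes: r = {r_changes:.2f}")
```

The raw levels correlate strongly simply because both series rise over time, while the week-over-week changes do not. Differencing is only one simple way to check whether a relationship survives once the common trend is removed; it does not by itself establish causation.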

The second scenario focuses on root cause analysis and attribution. When something goes wrong, where do you start? The first question that comes up during a production incident is usually: is this customer-impacting? As a member of an ops team or a manager of an application support group, how do you know? Chances are you have some checks in place (via Nagios, OpenNMS, etc.) monitoring and probing your system for responses, and maybe you have deployed some form of real-time user monitoring. But how well do you understand the implementation of those checks? Are you confident that they interact with the system in the same way customers do? Group problem identification and root cause analysis are areas where confirmation bias and the availability heuristic can cause major issues. How do you avoid them?
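
To make the gap between a check and the customer experience concrete, here is a minimal sketch that evaluates the same response two ways. The `Response` type, both check functions, the expected text, and the latency threshold are hypothetical and are not taken from Nagios, OpenNMS, or the talk.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical record of one probe against the service."""
    status: int
    body: str
    elapsed_ms: float


def shallow_check(resp):
    """Passes whenever the service answers with HTTP 200."""
    return resp.status == 200


def customer_like_check(resp, expected_text, max_ms):
    """Passes only if the page holds what a customer expects, fast enough."""
    return (
        resp.status == 200
        and expected_text in resp.body
        and resp.elapsed_ms <= max_ms
    )


# A degraded response: the status is 200, but the body is an error page
# and the request took several seconds.
degraded = Response(
    status=200,
    body="<html>Service temporarily unavailable</html>",
    elapsed_ms=4200.0,
)

shallow_ok = shallow_check(degraded)
customer_ok = customer_like_check(
    degraded, expected_text="Account balance", max_ms=1000.0
)

print("shallow check passes:", shallow_ok)
print("customer-like check passes:", customer_ok)
```

Here the shallow check passes while the customer-like check fails. A dashboard built on the first would answer "is this customer-impacting?" with a comfortable but wrong "no", which is exactly the kind of evidence confirmation bias tends to accept without question.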

After hearing this presentation, the next time an attendee faces a monitoring problem, they will stop and think: am I being biased?

Chris Baker

Dyn

Chris currently holds the title Doomsayer As A Service at Dyn. The goal of this role is to perform data analysis and experimentation related to network traffic, system performance, and stress testing, with a focus on identifying potential risks and thresholds and communicating them to business and technical stakeholders. Previously, Chris worked at Fidelity Investments as a Senior Data Analyst. He graduated from Worcester Polytechnic Institute with a Master’s in System Dynamics and a Bachelor’s in Management of Information Systems and Philosophy.