Health checking: A not-so-trivial task in the distributed containerized world

Alexander Rukletsov (Mesosphere)

11:35am–12:15pm Tuesday, October 3, 2017

Orchestration, Scheduling, and Containers
Location: Beekman

Who is this presentation for?

Distributed systems engineers and DevOps engineers

Prerequisite knowledge

A basic understanding of containers (e.g., Docker) and cluster orchestrators (e.g., Mesos or Kubernetes)

What you'll learn

Understand the importance and challenges of health checking in distributed cloud-native apps

Description

People usually think of a health check as a simple sequence: performing a specific action and judging whether the target application is healthy based on the outcome. This becomes trickier when the application consists of multiple containers managed by a cluster orchestrator and monitored by third-party tooling. In this situation, a number of questions arise, including:

What entity should interpret the result? Should the reasoning about the health of a task be done locally (less context) or globally (greater overhead)?
How often should health status be delivered to balance excessive network overhead against an up-to-date status?
Should health checks be aware of environment-specific intricacies such as namespaces and software-defined networks?
How do you keep the overhead imposed by health checks manageable and reasonable?
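
To make the first two questions concrete, here is a minimal sketch of one possible trade-off: the check result is interpreted locally, and a status is delivered only when it changes, so the network is not loaded with a message per probe. It is an illustration under stated assumptions, not the approach used by Apache Mesos, Kubernetes, or AWS. The names `LocalHealthChecker`, `check`, `report`, and `max_consecutive_failures` are hypothetical, and the three-failure threshold is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LocalHealthChecker:
    """Hypothetical local checker; not any orchestrator's real implementation."""

    check: Callable[[], bool]          # the "specific action", e.g. an HTTP or command probe
    report: Callable[[bool], None]     # delivers a status update to a remote observer
    max_consecutive_failures: int = 3  # assumed threshold before declaring "unhealthy"
    healthy: bool = True               # last reported status (assumed healthy at start)
    failures: int = 0                  # current run of consecutive failed probes

    def run_once(self) -> None:
        """Perform one probe, interpret it locally, report only on a status change."""
        try:
            ok = self.check()
        except Exception:
            ok = False  # a probe that raises is treated as a failed probe

        if ok:
            self.failures = 0
            new_status = True
        else:
            self.failures += 1
            # Tolerate isolated failures; flip only after a full run of them.
            new_status = self.failures < self.max_consecutive_failures

        if new_status != self.healthy:
            self.healthy = new_status
            self.report(new_status)  # one message per transition, not per probe


# Usage: six simulated probe outcomes lead to only two delivered updates.
outcomes = iter([True, False, False, False, False, True])
reports = []
checker = LocalHealthChecker(check=lambda: next(outcomes), report=reports.append)
for _ in range(6):
    checker.run_once()
print(reports)
```

Reporting only transitions keeps overhead low, but the remote observer cannot tell a quiet healthy task from a checker that has stopped running. Closing that gap requires some form of periodic delivery, which is the kind of trade-off the questions above point at.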

Alexander Rukletsov discusses the perils of modern health checking and shares lessons learned during the revamp of the Apache Mesos health checks subsystem. Alexander explores challenges and trade-offs, offers an overview of how modern distributed systems such as AWS, Apache Mesos, and Kubernetes tackle the problem of health checking, and looks at alternative solutions.

Alexander Rukletsov

Mesosphere

Alex Rukletsov is an Apache committer and Mesos PMC member at Mesosphere. He loves making programs run faster, reducing the cognitive load of code, and creating the right abstractions. In a previous life, Alex segmented medical images and investigated the behavior of human vessels at several German research institutes. His areas of interest include distributed systems, object recognition, and probabilistic and heuristic algorithms.
