Trickery and Tooling for Distributed System Diagnosis and Debugging

Philip Zeyliger (Cloudera)
Hadoop Platform Murray Hill Suite
Average rating: 4.50 (6 ratings)
Slides:   1-PDF 

All is quiet on the log file front, yet the system is down. What next? This talk will cover the tricks of the trade for debugging distributed systems. Motivated by experience gained diagnosing Hadoop, we’ll dig into the JVM, Linux esoterica, and outlier visualization.

Distributed systems make for tricky diagnosis problems. Which component is at fault? Is it the network, the machine, the process, or, even worse, some emergent complex behavior?

The answer lies in a methodology for finding outliers and then tooling to dig into certain issues deeply. I’ll cover tooling and tricks for both.

The talk will be illustrated by examples from open source systems (especially Hadoop).

Philip Zeyliger


Philip Zeyliger came to Cloudera from Google, where he worked on scalable storage for user-facing applications. Before that, he worked in finance, at D.E. Shaw. Philip holds a bachelor’s degree in mathematics from Harvard University. His interests include systems and databases. He’s a committer on the Apache Avro project.

Comments on this page are now closed.


Philip Zeyliger
10/31/2013 1:57pm EDT

Sorry for the delay. They’re now posted, both here, and at .

Marek K Kolodziej
10/30/2013 4:29pm EDT

Would it be possible to post the slides here, like the other speakers have?

Philip Zeyliger
10/27/2013 7:08am EDT

Hi Ken,

Looking forward to meeting you. Office hours are a good bet for finding me.

As for network issues, tracing (à la Dapper, X-Trace, etc.) is very slowly coming into our ecosystem. I’ll be talking a tiny bit about Twitter’s Zipkin and some in-progress Hadoop integration.

Ken Krugler
10/25/2013 2:13pm EDT

Hi Philip – as a fellow Avroer, looking forward to seeing you in person :) I’m most interested in tools & techniques for tracking down network issues (or even figuring out that the problem is in the network), as that’s been a thorn in my side for years.


Contact Us

View a complete list of Strata + Hadoop World 2013 contacts