Build & maintain complex distributed systems
October 1–2, 2017: Training
October 2–4, 2017: Tutorials & Conference
New York, NY

Schedule: Monitoring, Tracing and Metrics sessions

New monitoring paradigms, including how to monitor large-scale, complex, dynamic, and distributed systems built on emerging architectures like microservices and serverless.

Track Host

Baron Schwartz is the founder and CEO of VividCortex, the best way to see what your production database servers are doing. He is the lead author of High Performance MySQL and has written a variety of open-source software.

1:30pm–5:00pm Monday, October 2, 2017
Location: Regent
Level: Intermediate
Sasha Goldshtein (Sela Group)
Average rating: 5.00/5 (2 ratings)
Sasha Goldshtein leads a hands-on workshop on Linux dynamic tracing. You'll explore the BPF Compiler Collection (BCC), a set of tools and libraries for dynamic tracing, and gain firsthand experience of memory leak analysis, generic function tracing, kernel tracepoints, static tracepoints in user-space programs, and the baked-in tools for file I/O, network, and CPU analysis.

3:50pm–4:30pm Tuesday, October 3, 2017
Location: Gramercy
Level: Intermediate
Sarah Wells (Financial Times)
Average rating: 4.86/5 (7 ratings)
Most people think about microservices as a solution for scale. That may be the case, but operating them is definitely a scale challenge. Sarah Wells explains why, when you have 100+ services, everything needs to be automated, or else you'll spend two days updating Jenkins build pipelines or be woken up every night by false alarms caused by network blips.

11:35am–12:15pm Wednesday, October 4, 2017
Location: Beekman
Level: Intermediate
Cindy Sridharan (imgix)
Average rating: 3.00/5 (3 ratings)
As the systems we build become more distributed and (in the case of containerization) ephemeral, traditional monitoring tools prove to be grossly insufficient. Fortunately, the state of monitoring has evolved to meet these new demands, but it brings its own set of technical and organizational challenges. Cindy Sridharan offers an honest overview of monitoring challenges and trade-offs.

1:30pm–2:10pm Wednesday, October 4, 2017
Location: Beekman
Level: Intermediate
Mark McBride (Turbine Labs)
Average rating: 4.00/5 (1 rating)
With the recent flourishing of observability systems, there's no shortage of things to monitor. Sadly, humans have limited capacity to process them all. Mark McBride outlines three key metrics (request rate, success rate, and the latency histogram) that provide a high-level abstraction of the customer experience. If these three metrics are good, your system is healthy from a customer perspective.
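
As a rough illustration of the three metrics named in this abstract, the sketch below derives request rate, success rate, and a latency histogram from a batch of request records. It is not taken from the talk: the `Request` record, the `summarize` function, and the bucket boundaries in `BUCKET_BOUNDS_MS` are all hypothetical names and values chosen for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical record shape)."""
    ok: bool            # True if the request succeeded
    latency_ms: float   # time taken to serve the request


# Assumed upper bounds (inclusive) for the latency buckets, in milliseconds.
# Anything slower than the last bound lands in a final overflow bucket.
BUCKET_BOUNDS_MS = [10, 50, 100, 500, 1000]


def summarize(requests, window_seconds):
    """Return (request_rate, success_rate, histogram) for one time window.

    request_rate is requests per second over the window, success_rate is
    the fraction of successful requests (None when there were none), and
    histogram[i] counts requests whose latency falls in bucket i.
    """
    total = len(requests)
    request_rate = total / window_seconds
    success_rate = sum(r.ok for r in requests) / total if total else None

    histogram = [0] * (len(BUCKET_BOUNDS_MS) + 1)
    for r in requests:
        # bisect_left finds the first bound >= latency, i.e. the bucket index.
        histogram[bisect_left(BUCKET_BOUNDS_MS, r.latency_ms)] += 1

    return request_rate, success_rate, histogram
```

Calling `summarize(batch, 60)` on one minute of records yields the three numbers for that minute. Keeping latency as a histogram rather than a single average preserves the slow tail, which is usually what customers notice.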

2:25pm–3:05pm Wednesday, October 4, 2017
Location: Beekman
Level: Intermediate
Baron Schwartz (VividCortex)
Observability (or lack thereof), like testability and maintainability, is a fundamental property of systems. But what does observable code look like? What instrumentation creates systems that are observable later in arbitrary ways, in circumstances you can't foresee? Baron Schwartz outlines the most useful things to know about observability in systems in production.

3:50pm–4:30pm Wednesday, October 4, 2017
Location: Beekman
Level: Intermediate
Dina Goldshtein (Riverbed)
Event Tracing for Windows (ETW) is the most important diagnostic tool Windows developers have at their disposal. Dina Goldshtein explores the rich and wonderful world of ETW events, which span numerous OS components. You’ll learn how to diagnose complex issues in production systems and discover ways to automate ETW collection and analysis to build self-diagnosing applications.

4:45pm–5:25pm Wednesday, October 4, 2017
Location: Beekman
Level: Intermediate
Sasha Goldshtein (Sela Group)
Sasha Goldshtein explores a holistic set of BPF-based tools for monitoring JVM applications on Linux and outlines a systems performance checklist that includes classics like fileslower, opensnoop, and strace, all based on the noninvasive, fast, and safe BPF technology.