Speed improves customer engagement. With the emergence of microservices, it is common for a single customer interaction, such as loading the home page or querying a search endpoint, to invoke hundreds of calls to tens of backend services. In this multitenant environment, traditional monitoring and profiling tools can’t tell us why a specific request was slow.
Distributed tracing is the only tool available today that follows a request across several systems. The gathered traces let you debug how a specific request was processed across services, see where it spent most of its time, and understand why it was slow.
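To make the idea concrete, here is a minimal sketch of what a trace records and how it answers "where did the time go?". It is not PinTrace or Zipkin code: the `Span` fields, the `Tracer` class, the `handle_home_page` function, and the service names are all hypothetical, and everything runs in one process, whereas a real tracer would propagate the trace and span IDs across network calls.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """One timed unit of work; spans sharing a trace_id form one request."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str
    name: str
    start: float = 0.0
    duration: float = 0.0


class Tracer:
    """Hypothetical in-process tracer that collects finished spans."""

    def __init__(self) -> None:
        self.spans: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, service: str, name: str):
        parent = self._stack[-1] if self._stack else None
        current = Span(
            # Child spans inherit the trace ID; a root span starts a new one.
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            service=service,
            name=name,
            start=time.perf_counter(),
        )
        self._stack.append(current)
        try:
            yield current
        finally:
            current.duration = time.perf_counter() - current.start
            self._stack.pop()
            self.spans.append(current)


def self_time(span: Span, spans: List[Span]) -> float:
    """Time spent in the span itself, excluding its direct children."""
    children = sum(s.duration for s in spans if s.parent_id == span.span_id)
    return span.duration - children


def slowest_by_self_time(spans: List[Span]) -> Span:
    """Return the span that accounts for the most time on its own."""
    return max(spans, key=lambda s: self_time(s, spans))


def handle_home_page(tracer: Tracer) -> None:
    """Simulated request: a frontend call fanning out to two backends."""
    with tracer.span("frontend", "GET /home"):
        with tracer.span("search", "query"):
            time.sleep(0.05)  # stand-in for a slow backend call
        with tracer.span("ads", "fetch"):
            time.sleep(0.01)
```

Calling `handle_home_page(Tracer())` and then `slowest_by_self_time(tracer.spans)` on the collected spans points at the simulated `search` call, which is the kind of per-request answer that aggregate metrics cannot give.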
Suman Karumuri outlines the architecture of PinTrace, a Zipkin-based distributed tracing infrastructure. Suman shares the challenges of instrumenting and deploying tracing in a polyglot microservices architecture at scale, a few examples of how Pinterest uses production traces to debug p99 latency issues and identify unnecessary network calls and performance bottlenecks, and a few distributed tracing use cases beyond performance optimization.
Suman Karumuri is the lead for distributed tracing at Pinterest. Previously, he served as the lead for the Zipkin project at Twitter. He is the author of the upcoming book Distributed Tracing from O’Reilly.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.