An execution graph is a data model that brings the capabilities of a debugger and profiler to the investigation of failures and performance problems in large distributed systems. Andre Vachon explains that existing distributed tracing tools work well when every component uses a common tracing library, but they break down when multiple heterogeneous systems use different data collection formats and data correlation models.
Execution graphs provide a more general schema that can be populated from heterogeneous, loosely coupled distributed systems. Andre offers an overview of the data model, data correlation, and UI that have been built to investigate virtual machine creations in Azure. Each logging and tracing system in the distributed system exports its relevant data so that correlations can be made in a central data schema. From this, the complete flow of low-level operations can be stitched into a single end-to-end view. A very simple visualization model is then delivered on top of this schema, making it much easier to investigate the reliability and performance of complex code flows through multiple services. Complex failures and performance issues that once took weeks to investigate and required effort from multiple teams can now be diagnosed in minutes.
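The idea of exporting differently shaped records into one central schema and stitching them into an end-to-end view can be illustrated with a minimal sketch. It is not the schema presented in the talk: the `Node` fields, the two source formats, the exporter functions, and the sample records below are all hypothetical and chosen only to show the shape of the technique.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One operation in the graph. Every field name here is an assumption."""
    op_id: str
    parent_id: Optional[str]
    service: str
    name: str
    start_ms: int
    end_ms: int
    ok: bool = True
    children: list[Node] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


# Hypothetical exporters: each source system maps its own record format
# onto the shared Node schema.
def from_frontend(rec: dict) -> Node:
    return Node(rec["activityId"], rec.get("parentActivityId"), "frontend",
                rec["op"], rec["t0"], rec["t1"], rec["result"] == "Success")


def from_allocator(rec: dict) -> Node:
    # This source logs a start time plus a duration, and identifies its
    # caller through a request id instead of a parent id.
    return Node(rec["id"], rec["request_id"], "allocator", rec["task"],
                rec["start"], rec["start"] + rec["elapsed"], not rec["failed"])


def stitch(nodes: list[Node]) -> list[Node]:
    """Link each node to its parent and return the root operations."""
    by_id = {n.op_id: n for n in nodes}
    roots = []
    for n in nodes:
        parent = by_id.get(n.parent_id) if n.parent_id else None
        if parent is None:
            roots.append(n)
        else:
            parent.children.append(n)
    for n in nodes:
        n.children.sort(key=lambda c: c.start_ms)
    return roots


def first_failure(node: Node) -> Optional[Node]:
    """Follow the earliest failing child downward to the deepest failed node."""
    if node.ok:
        return None
    for child in node.children:
        found = first_failure(child)
        if found is not None:
            return found
    return node


def render(node: Node, depth: int = 0) -> list[str]:
    """Produce an indented text view of one end-to-end flow."""
    status = "ok" if node.ok else "FAILED"
    lines = [f"{'  ' * depth}{node.name} [{node.service}] "
             f"{node.duration_ms} ms {status}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Made-up sample records for a single VM creation request.
frontend_logs = [{"activityId": "a1", "parentActivityId": None,
                  "op": "CreateVM", "t0": 0, "t1": 900, "result": "Failure"}]
allocator_logs = [
    {"id": "b7", "request_id": "a1", "task": "AllocateNode",
     "start": 100, "elapsed": 300, "failed": False},
    {"id": "b8", "request_id": "a1", "task": "AttachDisk",
     "start": 450, "elapsed": 400, "failed": True},
]

nodes = ([from_frontend(r) for r in frontend_logs]
         + [from_allocator(r) for r in allocator_logs])
roots = stitch(nodes)
print("\n".join(render(roots[0])))
print("first failure:", first_failure(roots[0]).name)
```

In this sketch the correlation step is a plain id lookup. Real systems would likely need richer matching rules per source, which is exactly the part a shared schema is meant to isolate from the visualization layer.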
Andre Vachon has worked on the Windows operating system and development tools in various groups at Microsoft, including WinDbg, the MS C++ compiler, Microsoft crash analysis, and Skype telemetry. Currently, Andre is part of the Azure Performance team, focused on delivering tools to help identify and improve the performance of Azure Compute and Storage services. Andre is a frequent speaker at WinHEC conferences, delivering insights into device driver development and debugging.
©2016, O'Reilly Media, Inc.