The proliferation of good metrics collection and visualization toolkits over the past five years has been a huge benefit to developers. But with so many metrics available, a rapidly growing number of services, and limited cognitive capacity, which ones should we focus on?
Mark McBride outlines three key metrics—request rate, success rate, and the latency histogram—that provide a high-level abstraction of the customer experience. If these three metrics are good, your system is healthy from a customer perspective. Using concrete examples from a multiyear journey to improve service reliability while scaling a consumer site dramatically, Mark walks you through a customer-centric monitoring approach that fosters better teamwork and faster incident resolution.
As your service gets refactored into smaller services, internal teams become customers as well. These three key metrics serve as a common frame of reference for talking about service behavior across teams. Teams can quickly evaluate how their service is behaving for customers and can also quickly evaluate how their dependencies are serving them. This makes communication about performance and reliability issues crisper and dramatically improves incident troubleshooting and resolution.
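To make the three key metrics concrete, here is a minimal sketch of computing request rate, success rate, and a latency histogram from a batch of request records. It is not taken from the talk: the `Request` record, the bucket boundaries in `BUCKETS_MS`, and the rule that any status below 500 counts as a success are all assumptions chosen for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical record shape)."""
    status: int        # HTTP-style status code (assumption)
    latency_ms: float  # time taken to serve the request


# Hypothetical bucket upper bounds in milliseconds; the final
# histogram slot counts everything slower than the last bound.
BUCKETS_MS = [10, 50, 100, 500, 1000]


def summarize(requests, window_seconds):
    """Return the three key metrics for one observation window."""
    total = len(requests)
    # Assumption: anything below 500 counts as a success.
    ok = sum(1 for r in requests if r.status < 500)

    counts = [0] * (len(BUCKETS_MS) + 1)
    for r in requests:
        # Index of the first bucket whose upper bound is >= latency.
        counts[bisect_left(BUCKETS_MS, r.latency_ms)] += 1

    return {
        "request_rate": total / window_seconds,            # requests per second
        "success_rate": ok / total if total else None,     # undefined with no traffic
        "latency_histogram": counts,
    }


sample = [
    Request(200, 8),
    Request(200, 42),
    Request(500, 700),
    Request(200, 1500),
]
metrics = summarize(sample, window_seconds=2)
```

Because a service's callers and its dependencies can both be summarized in the same three numbers, a team could run a summary like this on inbound traffic and again on each outbound dependency, then compare the results side by side during an incident.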
Mark McBride is founder and CEO of Turbine Labs, building products that help engineers ship features more quickly and safely. Previously, Mark was services engineer lead at Nest Labs and Google, where he was responsible for developing the server infrastructure that lets Nest customers connect with their homes from wherever they are. He was also an early developer on Twitter’s streaming API, delivering thousands of messages per second in real time to millions of users. During his time at Twitter, Mark managed developer productivity and led the web delivery, developer tools, and infrastructure test teams. He also worked with a variety of deploy pipelines and led development of some of Twitter’s early service migrations, which grew into a suite of tools used to migrate millions of requests per second from legacy services to modern replacements.
©2017, O'Reilly Media, Inc.