Prometheus is an open source systems monitoring and alerting toolkit inspired by Borgmon, Google’s internal monitoring tool. It is well suited to monitoring containerized, dynamically orchestrated microservices and is poised to become the de facto standard for monitoring the next generation of cloud-native applications.
Prometheus takes a different approach from the one popularized by traditional monitoring systems. Cindy Sridharan explores the architecture and philosophy of Prometheus and explains how powerful features like the query language, flexible data model, and relabeling can be leveraged to gain valuable insights into application performance. You’ll learn why Prometheus is a good fit for modern, cloud-native applications: think applications and batch workloads running in a containerized, dynamically orchestrated microservices environment where failure is the norm. Along the way, Cindy explains how easy it is to integrate Prometheus clients into services, which lets teams build and scale seamlessly; how alerting and notifications driven by time series data and based on percentiles simplify reasoning about the availability of distributed services; how the Pushgateway lets applications push metrics for Prometheus to scrape; and how the Alertmanager deduplicates, groups, and routes Prometheus alerts to services like Slack and PagerDuty.
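To make the idea of percentile-based alerting concrete, here is a minimal Python sketch of estimating a latency percentile from cumulative histogram buckets by linear interpolation, similar in spirit to what the Prometheus query language does for histograms. The function name `estimate_quantile`, the bucket boundaries, and the counts are illustrative assumptions, not material from the talk.

```python
import math

def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, ending with a +Inf bucket that holds the total count.
    This is a hypothetical helper written for illustration only.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # The quantile falls in the +Inf bucket: report the
                # highest finite bound instead of infinity.
                return prev_bound
            if count == prev_count:
                return upper  # empty bucket, avoid dividing by zero
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound

# Hypothetical request latencies in seconds: 50 requests took <= 0.1s,
# 90 took <= 0.5s, 99 took <= 1.0s, and 100 requests were seen in total.
latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p95 = estimate_quantile(0.95, latency_buckets)
```

An alert could then compare a value like `p95` against a latency objective, so that it fires on the experience of the slowest requests rather than on an average that hides them.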
Cindy also outlines what Prometheus does not offer (anomaly detection, request tracing, horizontal scalability out of the box, and long-term storage, for example) and covers some of the other open source tools in the ecosystem that tackle these issues, looking at how request tracing across different services can be implemented with the OpenTracing spec and a backend like Zipkin and how tools like DigitalOcean’s Vulcan augment Prometheus with long-term storage. Cindy then concludes with a brief example of monitoring a Dockerized application with Prometheus (contingent on how this issue pans out) or monitoring a bare-bones Kubernetes cluster with Prometheus.
Cindy Sridharan is a Distributed Systems Engineer at Apple. Previously, she was an engineer at imgix, where she worked on API development, infrastructure, and other miscellaneous backend engineering tasks.
She likes thinking about building resilient and maintainable systems. She maintains a blog where she shares her ideas and experience about several of these topics. She is soon to be the author of a book on Distributed Systems Observability with O’Reilly.
©2017, O'Reilly Media, Inc.