Microsoft monitors a large number of product and business metrics so that it can track the health of its products and business and act quickly when anything falls outside historical patterns. However, it is extremely difficult to build a scalable system that is also smart enough to do two things: first, provide automated anomaly detection across a wide spectrum of time series related to business or product performance, and second, and probably harder, provide automated diagnostic insight into why a business incident happened.
Tony Xing and Bixiong Xu offer an overview of Project Kensho, Microsoft’s one-stop shop for business incident monitoring and automated insights, covering the technology’s evolution, architecture, and algorithms, along with its benefits and trade-offs. Tony and Bixiong walk through customer pain points in this area and detail Microsoft’s path to AI-powered business incident monitoring and diagnostics that serve teams across the company. Along the way, they share a case study on monitoring key Bing Ads metrics and generating automated diagnostic insights.
Tony Xing is a senior product manager on the AI, data, and infrastructure (AIDI) team within Microsoft’s AI and Research Organization. Previously, he was a senior product manager on the Skype data team within Microsoft’s Application and Service Group, where he worked on products for data ingestion, real-time data analytics, and the data quality platform.
Bixiong Xu is the principal development manager on the AI, data, and infrastructure team at Microsoft.
©2018, O’Reilly UK Ltd