If you need to predict how much revenue an ecommerce site will generate this quarter, you could use the previous quarter’s revenue as a guide, but that ignores other relevant signals, such as how much traffic came to the site in the current quarter, the site’s bounce rate, or other metrics that may be much better predictors. However, to understand which metrics can serve as predictors (or support other tasks), you must first understand which metrics are related to each other and how. For a small-scale operation, these relationships can be defined manually. For certain types of metrics, such as IT metrics, tools such as configuration management databases (CMDBs) may automate some of the discovery of these relationships. But if you want to incorporate metrics beyond IT, such as application metrics or business metrics like revenue, at the vast scale most digital businesses require, machine learning tools are needed.
Inbal Tadeski shares key machine learning methods for correlating metrics at scale, without any manual configuration. Implementing these methods at scale can be computationally expensive, so Inbal also shares ways to reduce the computational resources needed. In particular, she discusses how to scale the similarity and clustering methods. Along the way, Inbal explains how to identify causality, since correlation does not necessarily imply causation. In many cases, it may not matter that metrics are correlated but not causally related. Sometimes, however, it does.
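To make the idea of correlating metrics without manual configuration concrete, here is a minimal sketch. It is not the method presented in the session: it assumes plain Pearson correlation on differenced series as the similarity measure and a simple threshold-based grouping as the clustering step. The function name `correlation_groups`, the `threshold` value, and the synthetic `traffic`, `revenue`, and `cpu` series are all hypothetical.

```python
import numpy as np


def correlation_groups(series, threshold=0.8):
    """Group metrics whose differenced series are strongly correlated.

    `series` maps a metric name to an equal-length sequence of values.
    Illustrative only: brute-force pairwise comparison is O(n^2) in the
    number of metrics, which is exactly the cost that needs reducing at scale.
    """
    names = list(series)
    # Difference each series so shared trends do not inflate the correlation.
    diffs = np.array([np.diff(np.asarray(series[n], dtype=float)) for n in names])
    corr = np.corrcoef(diffs)  # pairwise Pearson correlation matrix

    # Union-find: link any two metrics with |correlation| >= threshold.
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Synthetic example data (hypothetical): revenue follows traffic, cpu does not.
rng = np.random.default_rng(0)
traffic = rng.normal(size=200).cumsum()
revenue = 2.0 * traffic + rng.normal(scale=0.1, size=200)
cpu = rng.normal(size=200).cumsum()

groups = correlation_groups({"traffic": traffic, "revenue": revenue, "cpu": cpu})
print(groups)
```

A grouping like this only says that two metrics move together. It says nothing about which one drives the other, so any causal reading needs a separate analysis.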
Topics include:
Inbal Tadeski is a data scientist at Anodot, a provider of real-time machine learning anomaly detection and analytics solutions for detection of business incidents. Previously, Inbal was a research engineer at HP Labs, where she specialized in machine learning and data mining. She holds an MSc in computer science with a focus on machine learning from Hebrew University in Jerusalem and a BSc in computer science from Ben Gurion University.
©2017, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com