There has been a shift from leveraging big data to leveraging live streaming data to facilitate faster data-driven decision making. (Note that streaming services such as YouTube and Netflix do not stream live data.) As the number of live data streams grows over time, owing to IoT among other sources, it is critical to develop techniques to extract actionable insights from them. Although anomaly detection has received attention over the last few years at O’Reilly and other industry conferences, it is a necessary but not a sufficient step: anomaly detection over a set of live data streams may result in anomaly fatigue and, consequently, limit effective decision making.
One way to address this is to carry out anomaly detection in a multidimensional space. However, this is typically very expensive computationally and hence not suitable for live data streams. Another approach is to carry out anomaly detection on individual data streams and then leverage correlation analysis to minimize false positives, which in turn helps surface actionable insights faster. In this talk, we walk the audience through the following topics:
Further, we shall illustrate the concepts using data sets from production.
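The second approach described above (per-stream detection followed by correlation analysis) can be illustrated with a minimal sketch. This is not the speakers' method: the trailing-window z-score detector, the Pearson correlation check, the function names, the thresholds, and the synthetic `cpu` and `latency` streams are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def zscore_anomalies(series, window=10, threshold=3.0):
    """Flag indices deviating more than `threshold` sigmas from the trailing window."""
    flagged = set()
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            flagged.add(i)
    return flagged


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def corroborated_anomalies(streams, min_corr=0.8, window=10, threshold=3.0):
    """Keep an anomaly only if a correlated stream is anomalous at the same index."""
    per_stream = {
        name: zscore_anomalies(s, window, threshold) for name, s in streams.items()
    }
    kept = {name: set() for name in streams}
    names = list(streams)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(streams[a], streams[b])) >= min_corr:
                shared = per_stream[a] & per_stream[b]
                kept[a] |= shared
                kept[b] |= shared
    return per_stream, kept


# Synthetic (hypothetical) streams: latency tracks cpu linearly.
cpu = [10 + (i % 3) for i in range(30)]
latency = [2 * c + 1 for c in cpu]
cpu[12], latency[12] = 40, 81   # spike present in both streams
cpu[26] = 18                    # spike present in cpu only

raw, kept = corroborated_anomalies({"cpu": cpu, "latency": latency})
print("per-stream:", raw)
print("corroborated:", kept)
```

In this toy data, the per-stream detector flags indices 12 and 26 for `cpu`, but only index 12 survives the correlation step, because `latency` shows nothing unusual at index 26. A real deployment over live streams would likely compute the correlation over a trailing window and tolerate small time lags between streams; both are omitted here for brevity.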
Arun Kejariwal is a statistical learning principal at Machine Zone (MZ), where he leads a team of top-tier researchers and works on research and development of novel techniques for install and click fraud detection, for assessing the efficacy of TV campaigns, and for optimizing marketing campaigns. In addition, his team is building novel methods for bot detection, intrusion detection, and real-time anomaly detection. Previously, Arun worked at Twitter, where he developed and open-sourced techniques for anomaly detection and breakout detection. His research includes the development of practical and statistically rigorous techniques and methodologies to deliver high performance, availability, and scalability in large-scale distributed clusters. Some of the techniques he helped develop have been presented at international conferences and published in peer-reviewed journals.
Francois Orsini is the chief technology officer for MZ’s Satori business unit. Previously, he served as vice president of platform engineering and chief architect, bringing his expertise in building server-side architecture and implementation for a next-gen social and server platform; was a database architect and evangelist at Sun Microsystems; and worked in OLTP database systems, middleware, and real-time infrastructure development at companies like Oracle, Sybase, and Cloudscape. Francois has extensive experience in database and infrastructure development, with expertise in distributed data management systems, scalability, security, resource management, HA cluster solutions, soft real-time, and connectivity services. He also collaborated with Visa International and Visa USA to implement the first Visa Cash Virtual ATM for the internet and founded a VC-backed startup called Unikala in 1999. Francois holds a bachelor’s degree in civil engineering and computer sciences from the Paris Institute of Technology.
©2018, O’Reilly UK Ltd • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.