Presented By O'Reilly and Cloudera
Make Data Work
Dec 4–5, 2017: Training
Dec 5–7, 2017: Tutorials & Conference
Singapore

Anomaly detection on live data

11:15am–11:55am Wednesday, December 6, 2017
Machine Learning
Location: 323

Who is this presentation for?

  • Data scientists and analysts

What you'll learn

  • Explore the design and architecture of MZ's Satori platform and learn how to leverage it for anomaly detection on live data

Description

Data-driven decision making has become the norm in every industry, and there has been a shift from leveraging big data to leveraging live data in order to facilitate faster decision making in a stateless compute tier. Although niche services such as Periscope and Facebook Live focus on live video, no general-purpose platform exists to democratize live data processing. To address this gap, MZ recently launched the Satori platform, which offers live messaging, a cloud-based, managed messaging service; live discovery, a dynamic SQL-based real-time message filtering service that can query at line rate with no configuration or need to index data in advance; and live reactions, in-stream bots that attach to data channels and react at ultralow latencies.

Anomalies occur frequently in live data for a multitude of reasons, so detection and filtering of anomalies is of paramount importance for robust decision making. Dhruv Choudhary, Arun Kejariwal, and Francois Orsini explore the design and architecture of MZ’s Satori platform and share techniques for anomaly detection on live data.

Topics include:

  • How to handle low SNR (signal-to-noise ratio), which is typical of live data
  • How to handle seasonality, trend, and structural changes
  • One-pass incremental algorithms
  • Trade-offs between speed and accuracy
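
As an illustration of the "one-pass incremental algorithms" topic, the sketch below flags outliers in a stream using Welford's running mean and variance, so each value is seen once and memory use stays constant. It is a generic textbook technique, not the speakers' method or part of the Satori platform; the class name `OnlineZScoreDetector` and the `threshold` and `warmup` parameters are hypothetical choices made for this example.

```python
import math


class OnlineZScoreDetector:
    """One-pass outlier detector based on Welford's running statistics.

    Illustrative only: names and defaults are assumptions, not Satori APIs.
    """

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # values to observe before flagging anything
        self.n = 0                  # number of values seen so far
        self.mean = 0.0             # running mean
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Score x against the statistics seen so far, then fold x into them."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))  # sample standard deviation
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True

        # Welford's update: constant time and memory per value.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


if __name__ == "__main__":
    detector = OnlineZScoreDetector()
    stream = [9.9, 10.1] * 10 + [50.0]
    for value in stream:
        if detector.update(value):
            print("anomaly:", value)
```

This sketch favors speed over accuracy: it assumes a roughly stationary signal, folds flagged values back into the running statistics, and does not model seasonality, trend, or structural changes.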

Francois Orsini

MZ

Francois Orsini is the chief technology officer for MZ’s Satori business unit. Previously, he served as vice president of platform engineering and chief architect, bringing his expertise in building server-side architecture and implementation for a next-gen social and server platform; was a database architect and evangelist at Sun Microsystems; and worked in OLTP database systems, middleware, and real-time infrastructure development at companies like Oracle, Sybase, and Cloudscape. Francois has extensive experience working with database and infrastructure development, honing his expertise in distributed data management systems, scalability, security, resource management, HA cluster solutions, soft real-time, and connectivity services. He also collaborated with Visa International and Visa USA to implement the first Visa Cash Virtual ATM for the internet and founded a VC-backed startup called Unikala in 1999. Francois holds a bachelor’s degree in civil engineering and computer sciences from the Paris Institute of Technology.

Arun Kejariwal

MZ

Arun Kejariwal is a statistical learning principal at Machine Zone (MZ), where he leads a team of top-tier researchers and works on research and development of novel techniques for install and click fraud detection, for assessing the efficacy of TV campaigns, and for optimizing marketing campaigns. In addition, his team is building novel methods for bot detection, intrusion detection, and real-time anomaly detection. Previously, Arun worked at Twitter, where he developed and open-sourced techniques for anomaly detection and breakout detection. His research includes the development of practical and statistically rigorous techniques and methodologies to deliver high performance, availability, and scalability in large-scale distributed clusters. Some of the techniques he helped develop have been presented at international conferences and published in peer-reviewed journals.

Dhruv Choudhary

MZ

Dhruv Choudhary is a research scientist at MZ, where he is researching stream anomaly detection algorithms for time series analysis and computer vision. Previously, Dhruv worked in the connected car space, building data products around driver aggression, car behavior, and risk analysis. He holds a master’s degree from Georgia Tech, where he focused on applying control theory techniques to systems problems; his thesis formulated energy-efficient thread scheduling for asymmetric architectures as an optimal control problem.
