Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Stream analytics in the enterprise: A look at Intel’s internal IoT implementation

Moty Fania (Intel)
17:25–18:05 Thursday, 2/06/2016
IoT & real-time
Location: Capital Suite 7
Tags: iot

Prerequisite knowledge

Attendees should have a basic understanding of relevant big data technologies and the architecture of stream analytics systems.

Description

Recent years have seen significant evolution of the Internet of Things. It has become increasingly easy to connect devices to the Internet and send sensor data to the public cloud. However, adoption of IoT platforms and stream analytics within the enterprise is lagging, due in part to companies’ lack of the expertise required to deploy an on-premises platform and demonstrate high value through real-life use cases.

Moty Fania shares Intel’s IT experience implementing an on-premises IoT platform for internal use cases. The platform was based on open source big data technologies and containers and was designed as a multitenant platform with built-in analytical capabilities. Moty highlights the key lessons learned from this journey and offers a thorough review of the platform’s architecture.

Intel IT’s goal was to allow users and organizations in Intel to gain insights and business value from real-time analytics and become more proactive. Intel deployed a platform based on several open source technologies, including Akka, Kafka, and Spark Streaming, with a full stack of algorithms such as multisensor change detection, anomaly detection, and more. Unlike other IoT analytics implementations that settle for basic statistics or make many assumptions about the collected data, Intel’s implementation includes a generic analytics layer that uses machine learning and advanced statistical tests to provide meaningful insights to users in different use cases and business domains.
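The abstract does not say which statistical tests the analytics layer uses. Purely as an illustration of what streaming anomaly detection can look like, the following Python sketch flags readings that deviate strongly from a sliding window of recent values (a rolling z-score). The class name, window size, and threshold are hypothetical and are not taken from Intel's platform.

```python
from collections import deque
from math import sqrt


class RollingZScoreDetector:
    """Flags readings that deviate strongly from a sliding window of recent values.

    Illustrative only: this is a generic rolling z-score, not Intel's algorithm.
    """

    def __init__(self, window=20, threshold=3.0, min_history=5):
        self.values = deque(maxlen=window)  # most recent readings
        self.threshold = threshold          # z-score above which a reading is flagged
        self.min_history = min_history      # readings required before judging

    def update(self, x):
        """Return True if x is anomalous relative to the window, then record x."""
        is_anomaly = False
        if len(self.values) >= self.min_history:
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = sqrt(var)
            if std > 0 and abs(x - mean) / std > self.threshold:
                is_anomaly = True
        self.values.append(x)
        return is_anomaly


# Hypothetical sensor readings with one obvious spike.
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 35.0, 20.0]
detector = RollingZScoreDetector(window=5, threshold=3.0)
flags = [detector.update(r) for r in readings]
```

A simple detector like this assumes roughly stable readings within the window, which is exactly the kind of assumption the description says a more generic analytics layer tries to avoid.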

Moty outlines Intel’s “smart data pipe”/stream processing framework, Pigeon, which enables stream analytics at scale. Pigeon, based on Akka, implements a cluster that runs topologies, each of which processes data according to arbitrary user-defined logic. It handles the creation of topologies, balances them across the cluster, and allows nodes to join or leave dynamically. Pigeon is optimized for easy deployment with Docker and CoreOS and cuts down development time by enabling a single developer to deploy a massive, elastic, real-time processing cluster with the click of a button. Spark Streaming was used to deploy self-service data monitors that allow users to define their own rules and get an actuation when a certain condition is met. These user-defined rules are monitored in near real time on the stream.
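To make the idea of self-service data monitors concrete, here is a minimal sketch in plain Python of user-defined rules evaluated against a stream of events. It is not Spark Streaming code and does not reflect the actual implementation; the `Rule` class, the `monitor` function, the event fields, and the sensor names are all hypothetical.

```python
import operator
from dataclasses import dataclass

# Comparison operators a user may choose from when defining a rule.
OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


@dataclass(frozen=True)
class Rule:
    """A hypothetical user-defined rule: fire when a sensor value meets a condition."""
    name: str
    sensor: str
    op: str
    threshold: float

    def matches(self, event):
        return (event["sensor"] == self.sensor
                and OPS[self.op](event["value"], self.threshold))


def monitor(events, rules):
    """Yield (rule name, event) for every event that satisfies a rule."""
    for event in events:
        for rule in rules:
            if rule.matches(event):
                yield rule.name, event


rules = [
    Rule("overheat", "temp", ">", 80.0),
    Rule("low_voltage", "voltage", "<", 210.0),
]
events = [
    {"sensor": "temp", "value": 72.5},
    {"sensor": "voltage", "value": 205.0},
    {"sensor": "temp", "value": 85.1},
]
alerts = [(name, event["value"]) for name, event in monitor(events, rules)]
```

In a real deployment the event source would be an unbounded stream rather than a list, and each match would trigger an actuation such as a notification instead of being collected.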

Moty then explains how Pigeon and its analytics capabilities were applied to several use cases, both internal and external, with interesting results. In one POC, Pigeon helped identify a fab tool causing a yield problem; in another, it revealed malfunctions in electrical network voltage sensors. Moty concludes by exploring how operational activities can be “translated” into IoT stream analytics scenarios to allow a higher level of proactivity and a shift from manual monitoring and firefighting to higher-value work.

Moty Fania

Intel

Moty Fania is a principal engineer for big data analytics at Intel IT and the CTO of the Advanced Analytics Group, which delivers big data and AI solutions across Intel. With over 15 years of experience in analytics, data warehousing, and decision support solutions, Moty leads the development and architecture of various big data and AI initiatives, such as IoT systems, predictive engines, online inference systems, and more. Moty holds a bachelor’s degree in economics and computer science and a master’s degree in business administration from Ben-Gurion University.