Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

IoT edge processing with Apache NiFi, Apache MiniFi, and multiple deep learning libraries

4:20pm–5:00pm Thursday, 09/13/2018
Data engineering and architecture
Location: 1E 07/08 Level: Beginner
Average rating: 4.00 (2 ratings)

Who is this presentation for?

  • Data engineers, data scientists, and programmers

Prerequisite knowledge

  • A working knowledge of Linux, Hadoop, and Python

What you'll learn

  • How to use deep learning with the IoT


Timothy Spann leads a hands-on deep dive into using Apache MiniFi with Apache MXNet and other deep learning libraries on edge devices, such as Raspberry Pis with Movidius and the NVIDIA Jetson TX1.

You’ll learn how to run deep learning models on edge devices and send images, GPS data, sensor data, and deep learning results when values exceed norms. The data is sent over site-to-site (S2S) to NiFi for further processing, additional TensorFlow processing, and augmentation with weather and geolocation data. The stream is landed as ORC files in HDFS with Hive tables on top. Processed data in Avro format, with its schema stored in Schema Registry, is sent via Kafka to Streaming Analytics Manager, where additional processing and rules are applied before results go to Druid endpoints. Visualization is shown in Zeppelin and Superset.
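
As a rough illustration of the "send only when values exceed norms" step, an edge device might run a small filter like the one below before handing data to its agent. This is a minimal sketch, not the presenter's code: the `Reading` type, the field names, and the `NORMS` thresholds are all hypothetical, and the actual transport (MiniFi and S2S) is left out entirely.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

# Hypothetical normal ranges; real limits depend on the device and deployment.
NORMS = {
    "temperature_c": (0.0, 45.0),
    "humidity_pct": (20.0, 80.0),
}


@dataclass
class Reading:
    device_id: str
    temperature_c: float
    humidity_pct: float
    label: str         # top class from an on-device model (assumed field)
    confidence: float  # model confidence in [0, 1] (assumed field)


def out_of_norm(reading: Reading) -> List[str]:
    """Return the names of the fields that fall outside their normal range."""
    violations = []
    for field, (low, high) in NORMS.items():
        value = getattr(reading, field)
        if value < low or value > high:
            violations.append(field)
    return violations


def to_message(reading: Reading) -> Optional[str]:
    """Serialize the reading as JSON only if at least one value exceeds norms."""
    violations = out_of_norm(reading)
    if not violations:
        return None  # nothing unusual, so nothing is sent upstream
    payload = asdict(reading)
    payload["violations"] = violations
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    samples = [
        Reading("pi-1", 22.0, 50.0, "person", 0.91),  # within norms, dropped
        Reading("pi-1", 60.0, 50.0, "person", 0.88),  # too hot, emitted
    ]
    for sample in samples:
        message = to_message(sample)
        if message is not None:
            print(message)
```

Filtering on the device keeps routine readings off the network, so only unusual events are forwarded for downstream processing.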

Potential use cases for this solution include security camera monitoring, utility asset anomaly detection, and temperature and humidity filtering for devices.




Tim Spann is a Senior Solutions Engineer at Cloudera, where he works with Apache NiFi, MiniFi, Kafka, MXNet, TensorFlow, Apache Spark, big data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, streaming technologies, and Java programming. Previously, he was a senior solutions architect at AirisData and a senior field engineer at Pivotal. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton on big data, the IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as IoT Fusion, Strata, ApacheCon, DataWorks Summit Berlin, DataWorks Summit Sydney, and Oracle Code NYC. He holds a BS and MS in computer science.