Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

High performance results using Spark to analyze mining equipment sensor data

Ankur Gupta (Bitwise Inc.)
2:05pm–2:45pm Thursday, 10/01/2015
IoT & Real-time
Location: 3D 02/11 Level: Intermediate
Average rating: 3.44 (9 ratings)

Remote health monitoring (RHM) systems collect data from sensors on mining equipment in use across the globe to analyze productivity, availability, utilization, and status. Legacy RHM systems suffer from low performance, poor scalability as equipment deployments and data volumes grow, availability and manageability problems (up to 10 hours of downtime), and escalating maintenance costs. Using an open source technology stack and distributed processing, we implemented a solution that delivers high-performance, low-latency results, sustaining over 120,000 writes per second.

We will share the technical architecture and tools we used to implement the solution:

  • Kafka ingests data from collector servers and queues events for Spark
  • Spark performs real-time complex event processing and trend-level analysis
  • Spark provides for remote analysis and storage in the Cassandra distributed database
  • Visualization
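
The complex event processing step can be illustrated with a small sketch. The abstract does not include the project's code, so this is plain Python rather than Spark, and every name in it (Reading, detect_overheating, the temperature channel, the window size, and the threshold) is a hypothetical chosen for illustration. It shows one kind of business rule a streaming job might apply: raise an alert when the rolling mean of a machine's recent readings crosses a limit.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    equipment_id: str   # hypothetical machine identifier
    timestamp: int      # seconds since an arbitrary start
    temperature: float  # hypothetical sensor channel


def detect_overheating(readings, window=3, threshold=90.0):
    """Yield (equipment_id, timestamp, window_mean) each time the mean of the
    last `window` readings for one machine exceeds `threshold`."""
    # One fixed-size buffer per machine; old readings fall off automatically.
    recent = defaultdict(lambda: deque(maxlen=window))
    for r in readings:
        buf = recent[r.equipment_id]
        buf.append(r.temperature)
        # Only evaluate the rule once a full window is available.
        if len(buf) == window:
            avg = mean(buf)
            if avg > threshold:
                yield (r.equipment_id, r.timestamp, avg)


# Stand-in for events arriving from a queue, interleaved across machines.
stream = [
    Reading("truck-1", 0, 85.0),
    Reading("truck-1", 1, 92.0),
    Reading("drill-7", 1, 60.0),
    Reading("truck-1", 2, 96.0),
    Reading("truck-1", 3, 97.0),
]
alerts = list(detect_overheating(stream))
```

In a deployment like the one described, the list would be replaced by events read from Kafka and the alerts would be written to Cassandra; the per-machine window logic is the part this sketch is meant to show.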

Challenges encountered in the project included collecting data across different geographies, processing the data in real time, applying business rules, and providing remote monitoring. We will also cover technical benchmarks achieved for streaming, distribution, in-memory computation, data syncing into Cassandra, and visualization, as well as some tips for easy integration.

Ankur Gupta

Bitwise Inc.

Meet us at Booth #105 and check out our Open Source ETL on Hadoop Utility, developed in partnership with Capital One.

Comments on this page are now closed.


Chintan Bhatt
11/18/2015 11:31pm EST

Hi Ankur,
Please provide me with sensor data for my research in big data mining.

Ankur Gupta
10/01/2015 10:16pm EDT

Hi Robert – Yes, there are several ways to measure productivity, such as ‘reduced failure rate’ and ‘effective maintenance of equipment’, which could lead to ‘higher productivity’. If you want specifics, please send me your email. Thanks.

Robert Cohen
09/25/2015 10:30am EDT

Have you been able to measure some of the productivity gains on a per employee basis or with any other yardstick?