Epilepsy, a disease that causes seizures, affects 65 million people worldwide. Unfortunately, doctors can treat the disease only if they have the necessary data. Typically, this means patients have to spend days in a hospital connected to cumbersome EEG machines. Dataiku and Bioserenity, two Parisian startups, have partnered to provide a solution to this problem: an at-home, real-time, wearable EEG with data hosted in the cloud.
If you’ve worked with the time series data generated by connected devices, you know that it’s a type of data that needs special attention. Creating a database schema that can handle “big time series data” is hard. Processing that data on Apache Spark and providing a real-time web app is even harder. As connected devices proliferate, how will data scientists and engineers cope with this new data?
Eric Kramer, a data scientist at Dataiku, describes the architecture that makes this system possible. Eric talks about successes and failures, explaining what works and what doesn’t when it comes to handling large amounts of time series data, and offers an overview of the tools Dataiku and Bioserenity use for the job. Eric also explains how they created a real-time web app that processes petabytes of data generated by connected devices using an open source NoSQL database, Apache Spark, and Dataiku’s Data Science Studio, and explores how variations on this technology stack could apply to a wide range of connected-device applications.
Eric Kramer left medical school to join Dataiku, a big data startup in Paris. Eric specializes in the analysis of medical data and the possibilities at the intersection of medicine, data, and predictive analytics.
©2016, O’Reilly UK Ltd • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.