Digital consumer companies are disrupting the old guard and fundamentally changing the way we do business. Uber, Airbnb, and Zipcar, for example, have disrupted the traditional taxi, hotel, and car rental businesses by using software to create new business models. Opportunities in the industrial world are expected to outpace these consumer business cases, and time series data is growing exponentially as new machines around the world get connected. Venkatesh Sivasubramanian and Luis Ramos explain how GE makes it faster and easier for systems to access a massive volume of time series data through a common layer and to run analytics on it. They take what they've learned from Apache Arrow and apply it today for highly efficient time series storage using Apache Apex, Spark, and Kudu.
At the heart of GE’s digital portfolio is the Predix platform, a cloud-based platform as a service (PaaS) for the Industrial IoT. Predix provides the tools, framework, guidelines, and best practices that enable you to create solutions for running industrial-scale analytics. Distributed processing is the de facto standard for dealing with large volumes of data. But because there are many heterogeneous data processing systems geared to different workloads, agreeing on and standardizing the communication layer becomes paramount. Apache Arrow is working with several projects to do just that: agree on a common in-memory columnar storage layer that avoids serializing data between different systems. Venkat and Luis discuss GE’s approach, which applies similar concepts to time series-centric data.
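To make the columnar idea concrete, here is a minimal sketch of a time series held as one contiguous buffer per field, with a consumer reading the value column through a view instead of a serialized copy. It uses only the Python standard library. The class `TimeSeriesColumns` and its method names are hypothetical illustrations; this is neither the Apache Arrow API nor GE's implementation.

```python
from array import array


class TimeSeriesColumns:
    """Hypothetical columnar container: one contiguous buffer per field."""

    def __init__(self):
        self.timestamps = array("q")  # signed 64-bit epoch milliseconds
        self.values = array("d")      # 64-bit float sensor readings

    def append(self, ts_ms, value):
        # Each field is appended to its own column, not stored as a row object.
        self.timestamps.append(ts_ms)
        self.values.append(value)

    def values_view(self):
        # memoryview exposes the column's underlying buffer without copying it.
        return memoryview(self.values)


cols = TimeSeriesColumns()
for i, reading in enumerate([20.5, 21.0, 19.5, 22.0]):
    cols.append(1_000 + i * 100, reading)

# A consumer scans only the value column; the timestamps are never touched.
view = cols.values_view()
mean = sum(view) / len(view)
print(mean)
```

Because each field lives in its own buffer, an aggregate over the readings touches only that column, and the view shares memory with the producer rather than going through a serialize and deserialize step.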
Venkatesh Sivasubramanian is currently a Senior Director at GE Digital, where he drives the architecture and development of Data Services for Predix, an Industrial IoT platform. Prior to joining GE Digital, he worked as a lead engineer in the Big Fast Data team at WalmartLabs, building its stream processing engine and distributed systems. Venkatesh holds a master’s degree in software engineering from Birla Institute of Technology and Science (BITS), India.
Luis Ramos is a senior staff engineer at GE Digital who recently transitioned from GE Global Research, where he drove initiatives on industrial big data projects during early stages of Predix. Currently with the Predix Data Services team, Luis leads the Time Series Service development team. Prior to GE, he worked in startups, where he contributed to Hadoop ecosystem projects and built an analytics system that was used by major telecom companies including Verizon, Sprint, and T-Mobile for smartphone usage and MND. Luis holds a master’s degree in computer science from Cal State Fullerton.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.