Uber’s mission is to provide transportation as reliable as running water, everywhere, for everyone. To fulfill its mission, Uber relies on making data-driven decisions at every level, and most of these decisions can benefit from faster data processing.
Vinoth Chandar and Prasanna Rajaperumal explore data processing systems for near-real-time use cases, making the case that adding new incremental processing primitives to existing Hadoop technologies can solve many problems at reduced cost and in a unified manner. Along the way, Vinoth and Prasanna introduce Hoodie, a newly open-sourced storage system from Uber that adds these primitives to existing Hadoop technologies, providing near-real-time data at 10x reduced cost using Spark and Hadoop. They also share their production experience.
Vinoth Chandar is the co-creator of the Hudi project at Uber and also PMC/lead of Apache Hudi (Incubating). Previously, he was a senior staff engineer at Uber, where he led projects across technology areas such as data infrastructure, data architecture, and mobile/network performance. Vinoth has a keen interest in unified architectures for data analytics and processing. Earlier in his career, he was the LinkedIn lead on Voldemort and worked on Oracle Server’s replication engine, HPC, and stream processing.
Prasanna Rajaperumal is a senior engineer at Uber working on the next generation of Uber’s data infrastructure and on data systems that scale with Uber’s hypergrowth. Over the last six months, he has focused on building a library that ingests change logs into large HDFS datasets, optimized for analytical workloads. Prasanna has held various roles building data systems at companies small and large. Previously, he was a software engineer at Cloudera, working on data infrastructure for indexing and visualizing customer log files.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.