From determining the most convenient rider pickup points to predicting the fastest routes, Uber uses data-driven analytics to create seamless trip experiences.
Inside Uber, big data is spread across many systems. Analysts and engineers want to run SQL analytics on any data source, ideally in real time. However, copying data from one source to another is expensive, which makes it challenging to support real-time SQL analytics across all data sources.
This talk will share Uber’s engineering effort to support real-time SQL analytics on any data source on the fly, without copying data. We will start with our big data infrastructure, specifically Hadoop, Spark, and Presto. Then we will talk about how Uber uses Presto as its interactive SQL engine, with Hadoop Distributed File System, Pinot, MySQL, and Elasticsearch deployed as storage solutions. We will focus on how Uber built the Presto Elasticsearch Connector from scratch to support real-time analytics on heterogeneous data. Finally, we will share our production experience and roadmap.
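To make the idea of querying several stores in place more concrete, here is a minimal sketch of how one Presto SQL statement can address tables from different connectors through catalog.schema.table names. The catalog, schema, table, and column names below are hypothetical placeholders, not Uber's actual setup, and the snippet only builds the query text rather than running it against a cluster.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a Presto-style fully qualified table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query(trips_table: str, events_table: str) -> str:
    """Build one SQL statement joining tables served by two different connectors."""
    return (
        "SELECT t.city_id, count(*) AS recent_events\n"
        f"FROM {trips_table} AS t\n"
        f"JOIN {events_table} AS e ON t.trip_id = e.trip_id\n"
        "GROUP BY t.city_id"
    )


# Hypothetical names: a Hive catalog over HDFS and an Elasticsearch catalog.
trips = qualified("hive", "rides", "trips")
events = qualified("elasticsearch", "default", "trip_events")

query = build_federated_query(trips, events)
print(query)
```

In a setup like this, the engine would read each table from its own store at query time, so no data needs to be copied between systems before the join.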
Zhenxiao Luo is an engineering manager at Uber, where he runs the interactive analytics team. Previously, he led the development and operations of Presto at Netflix and worked on big data and Hadoop-related projects at Facebook, Cloudera, and Vertica. He holds a master’s degree from the University of Wisconsin-Madison and a bachelor’s degree from Fudan University.
©2019, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org