Uber’s geospatial data is increasing exponentially as the company grows. As a result, its big data systems must become more scalable, reliable, and performant to support business decisions, user recommendations, and experiments that rely on geospatial data. Zhenxiao Luo and Wei Yan explain how Uber runs geospatial analysis efficiently in its big data systems, including Hadoop, Hive, and Presto.
Zhenxiao and Wei start with an overview of Uber’s big data infrastructure before explaining how Uber models geospatial data and outlining its data ingestion pipeline. They then discuss techniques for improving geospatial query performance and their experience applying them, focusing on geospatial data processing in big data systems, including Hadoop and Presto. Zhenxiao and Wei conclude by sharing Uber’s use cases and roadmap.
Zhenxiao Luo is an engineering manager at Uber, where he runs the interactive analytics team. Previously, he led the development and operations of Presto at Netflix and worked on big data and Hadoop-related projects at Facebook, Cloudera, and Vertica. He holds a master’s degree from the University of Wisconsin-Madison and a bachelor’s degree from Fudan University.
Wei Yan is a senior engineer at Uber, where he builds data processing and querying systems that scale along with Uber’s hypergrowth.
©2017, O'Reilly Media, Inc.