Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Uber's data science workbench

Peng Du (Uber Inc.), Randy Wei (Uber Inc.)
11:00am–11:40am Wednesday, March 15, 2017
Secondary topics: Data Platform, Logistics

What you'll learn

  • Explore Uber’s data science workbench


Peng Du and Randy Wei offer an overview of Uber’s data science workbench, a central platform where data scientists perform interactive data analysis through notebooks like Jupyter and RStudio, share and collaborate on scripts, and publish results to dashboards. The workbench is integrated with other Uber services, which provides features such as task scheduling, model publishing, and job monitoring.

Uber’s data science workbench provides clients with a scalable compute environment in two parts: dedicated Docker containers, spawned in response to requests for notebook instances, and a YARN/Mesos-managed cluster for compute engines such as Spark, Hive, and Presto. The workbench also supports social features: clients can share, comment on, and collaborate on notebook scripts with appropriate access control. All files, including scripts and results, are kept in a version control system so that people can track progress and compare results.
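
As a rough illustration of the "one dedicated container per notebook request" idea described above, the sketch below builds the docker command line that such a request might translate into. It is not Uber's implementation: the NotebookRequest type, the container naming scheme, the resource limits, and the use of the public jupyter/base-notebook image are all assumptions made for the example. The command is only constructed, never executed.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class NotebookRequest:
    """Hypothetical request for a notebook instance (not Uber's schema)."""
    user: str
    host_port: int
    memory: str = "4g"                     # assumed per-container memory limit
    cpus: float = 2.0                      # assumed per-container CPU limit
    image: str = "jupyter/base-notebook"   # public image, assumed for illustration


def container_name(req: NotebookRequest) -> str:
    # One dedicated container per user, named after the user.
    return f"notebook-{req.user}"


def docker_run_command(req: NotebookRequest) -> list[str]:
    """Build (but do not execute) the docker CLI call for one request."""
    return [
        "docker", "run", "--detach",
        "--name", container_name(req),
        "--memory", req.memory,
        "--cpus", str(req.cpus),
        # Jupyter listens on port 8888 inside the container by default.
        "--publish", f"{req.host_port}:8888",
        req.image,
    ]


if __name__ == "__main__":
    request = NotebookRequest(user="alice", host_port=9001)
    print(" ".join(docker_run_command(request)))
```

Keeping command construction separate from execution makes the mapping from request to container easy to inspect and test. A real service would also need authentication, port allocation, and cleanup of idle containers, none of which are modeled here.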

To improve the productivity of data scientists, the workbench is also integrated with multiple services at Uber. A mature script can be scheduled as a periodic task in Uber’s job scheduling service, and people can publish their results through dashboard services like Shiny and their models through Uber’s machine-learning platform. Finally, for complicated tasks that involve long-running jobs in Spark, Hive, or Presto, the workbench registers the jobs in Uber’s monitoring service so that people can check their progress and debugging information.

Peng Du

Uber Inc.

Peng Du is a senior software engineer at Uber. He holds a PhD in computer science and an MA in applied mathematics, both from the University of California, San Diego.

Randy Wei

Uber Inc.

Randy Wei is a software engineer at Uber. He holds a bachelor’s degree in computer science from the University of California, Berkeley.