Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Schedule: Transportation and Logistics sessions

9:00am-5:00pm Tuesday, March 26, 2019
Location: 2022
Alex Kudriashova (Astro Digital), Jonathan Francis (Starbucks), JoLynn Lavin (General Mills), Robin Way (Corios), June Andrews (GE), Kyungtaak Noh (SK Telecom), Taposh DuttaRoy (Kaiser Permanente), Sabrina Dahlgren (Kaiser Permanente), Craig Rowley (Columbia Sportswear), Ambal Balakrishnan (IBM), Benjamin Glicksberg (UCSF), Patrick Lucey (Stats Perform), Rhonda Textor (True Fit)
Hear practical insights from household brands and global companies: the challenges they tackled, the approaches they took, and the benefits and drawbacks of their solutions.
2:40pm-3:20pm Wednesday, March 27, 2019
Zhenxiao Luo (Twitter)
Average rating: 4.09 (11 ratings)
From determining the most convenient rider pickup points to predicting the fastest routes, Uber uses data-driven analytics to create seamless trip experiences. Zhenxiao Luo explains how Uber supports real-time analytics with deep learning on the fly, without any data copying.
2:40pm-3:20pm Wednesday, March 27, 2019
James Taylor (Lyft)
Average rating: 3.56 (9 ratings)
James Taylor offers an overview of an automated feedback loop at Lyft that adapts ETL based on the aggregate cost of queries run across the cluster. He also discusses future work to enhance the system with materialized views, transparently rewriting queries when possible to reduce the ad hoc joins and sorting performed by the most expensive queries.
2:40pm-3:20pm Wednesday, March 27, 2019
Divya Choudhary (University of Southern California)
Average rating: 4.50 (2 ratings)
Divya Choudhary explains how GO-JEK mines the chat messages and notes, written in a local language, that customers send to their drivers while waiting for a ride to arrive. These messages yield information about pickup points and their names (some of which even Google Maps does not know) and help create a better customer pickup experience feature.
4:20pm-5:00pm Wednesday, March 27, 2019
Alex Kira (Uber)
Average rating: 4.00 (13 ratings)
Uber operates at scale, with thousands of microservices serving millions of rides a day, leading to 100+ PB of data. Alex Kira details Uber's journey toward a unified and scalable data workflow system used to manage this data and shares the challenges faced and how the company has rearchitected the system to make it highly available and horizontally scalable.
5:10pm-5:50pm Wednesday, March 27, 2019
Rakesh Kumar (Lyft), Thomas Weise (Lyft)
Average rating: 4.00 (3 ratings)
Rakesh Kumar and Thomas Weise explore how Lyft dynamically prices its rides with a combination of various data sources, ML models, and streaming infrastructure for low latency, reliability, and scalability, allowing the pricing system to adapt more readily to real-world changes.
5:10pm-5:50pm Wednesday, March 27, 2019
Average rating: 4.50 (2 ratings)
GE produces a third of the world's power and 60% of its airplane engines, a critical portion of the world's infrastructure that requires meticulous monitoring of the hundreds of sensors streaming data from each turbine. June Andrews and John Rutherford explain how GE's monitoring and diagnostics teams released into production the first real-time ML systems used to determine turbine health.
11:00am-11:40am Thursday, March 28, 2019
Mark Grover (Lyft), Tao Feng (Lyft)
Average rating: 4.40 (10 ratings)
Lyft has reduced the time it takes to discover data by 10x by building its own data portal, Amundsen. Mark Grover and Tao Feng offer a demo of Amundsen and lead a deep dive into its architecture, covering how it leverages centralized metadata, PageRank, and a comprehensive data graph to achieve its goal. They also explore the future roadmap, unsolved problems, and its collaboration model.
1:50pm-2:30pm Thursday, March 28, 2019
Piero Molino (Uber AI)
Average rating: 4.60 (5 ratings)
Piero Molino offers an overview of Ludwig, a deep learning toolbox that allows you to train models and use them for prediction without the need to write code. It's unique in its ability to help make deep learning easier to understand for nonexperts and enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike.
4:40pm-5:20pm Thursday, March 28, 2019
Alex Gorbachev (Pythian), Paul Spiegelhalter (Pythian)
Average rating: 4.67 (3 ratings)
Alex Gorbachev and Paul Spiegelhalter use the example of a mining haul truck to explain how to map preventive maintenance needs to supervised machine learning problems, create labeled datasets, do feature engineering from sensor and alert data, and evaluate models, and then how to convert it all into a complete AI solution on Google Cloud Platform that's integrated with existing on-premises systems.