Presented By
O’Reilly + Cloudera

Make Data Work

March 25-28, 2019
San Francisco, CA

Schedule: Transportation and Logistics sessions

9:00am–5:00pm Tuesday, March 26, 2019

Data Case Studies

Location: 2022

Alex Kudriashova (Astro Digital), Jonathan Francis (Starbucks), JoLynn Lavin (General Mills), Robin Way (Corios), June Andrews (GE), Kyungtaak Noh (SK Telecom), Taposh DuttaRoy (Kaiser Permanente), Sabrina Dahlgren (Kaiser Permanente), Craig Rowley (Columbia Sportswear), Ambal Balakrishnan (IBM), Benjamin Glicksberg (UCSF), Patrick Lucey (Stats Perform), Rhonda Textor (True Fit)

Hear practical insights from household brands and global companies: the challenges they tackled, the approaches they took, and the benefits and drawbacks of their solutions.

2:40pm–3:20pm Wednesday, March 27, 2019

Real-time analytics at Uber: Bring SQL into everything

Data Engineering & Architecture
Location: 2004

Zhenxiao Luo (Twitter)

Average rating: 4.09 (11 ratings)

From determining the most convenient rider pickup points to predicting the fastest routes, Uber uses data-driven analytics to create seamless trip experiences. Zhenxiao Luo explains how Uber supports real-time analytics with deep learning on the fly, without any data copying.

2:40pm–3:20pm Wednesday, March 27, 2019

Adaptive ETL to optimize query performance at Lyft

Data Engineering & Architecture
Location: 2001

James Taylor (Lyft)

Average rating: 3.56 (9 ratings)

James Taylor offers an overview of an automated feedback loop at Lyft that adapts ETL based on the aggregate cost of queries run across the cluster. He also discusses future work to enhance the system with materialized views, which would reduce the number of ad hoc joins and sorts performed by the most expensive queries by transparently rewriting those queries when possible.

2:40pm–3:20pm Wednesday, March 27, 2019

From an archived data field to GO-JEK’s world-class product feature for customer experience

Data Science, Machine Learning & AI
Location: 2009

Divya Choudhary (University of Southern California)

Average rating: 4.50 (2 ratings)

Divya Choudhary explains how GO-JEK uses the random chat messages and notes, written in a local language, that customers send to their drivers while waiting for a ride to arrive. From these, the company carves out unparalleled information about pickup points and their names (some of which even Google Maps does not know) to help create a world-class customer pickup experience feature.

4:20pm–5:00pm Wednesday, March 27, 2019

Managing Uber's data workflows at scale

Data Engineering & Architecture
Location: 2001

Alex Kira (Uber)

Average rating: 4.00 (13 ratings)

Uber operates at scale, with thousands of microservices serving millions of rides a day, leading to 100+ PB of data. Alex Kira details Uber's journey toward a unified and scalable data workflow system used to manage this data and shares the challenges faced and how the company has rearchitected the system to make it highly available and horizontally scalable.

5:10pm–5:50pm Wednesday, March 27, 2019

The magic behind your Lyft ride prices: A case study on machine learning and streaming

Data Science, Machine Learning & AI
Location: 2009

Rakesh Kumar (Lyft), Thomas Weise (Lyft)

Average rating: 4.00 (3 ratings)

Rakesh Kumar and Thomas Weise explore how Lyft dynamically prices its rides using a combination of data sources, ML models, and streaming infrastructure built for low latency, reliability, and scalability, allowing the pricing system to be more adaptable to real-world changes.

5:10pm–5:50pm Wednesday, March 27, 2019

Critical turbine maintenance: Monitoring and diagnosing planes and power plants in real time

Data Engineering & Architecture, Streaming and IoT
Location: 2006

June Andrews (GE), John Rutherford (GE)

Average rating: 4.50 (2 ratings)

GE produces a third of the world's power and 60% of its airplane engines, a critical portion of the world's infrastructure that requires meticulous monitoring of the hundreds of sensors streaming data from each turbine. June Andrews and John Rutherford explain how GE's monitoring and diagnostics teams put into production the first real-time ML systems used to determine turbine health.

11:00am–11:40am Thursday, March 28, 2019

Disrupting data discovery

Data Engineering & Architecture
Location: 2001

Mark Grover (Lyft), Tao Feng (Lyft)

Average rating: 4.40 (10 ratings)

Lyft has reduced the time it takes to discover data by 10x by building its own data portal, Amundsen. Mark Grover and Tao Feng offer a demo of Amundsen and lead a deep dive into its architecture, covering how it leverages centralized metadata, PageRank, and a comprehensive data graph to achieve its goal. They also explore the future roadmap, unsolved problems, and its collaboration model.

1:50pm–2:30pm Thursday, March 28, 2019

Ludwig, a code-free deep learning toolbox

Data Science, Machine Learning & AI
Location: 2007

Piero Molino (Uber AI)

Average rating: 4.60 (5 ratings)

Piero Molino offers an overview of Ludwig, a deep learning toolbox that allows you to train models and use them for prediction without the need to write code. It's unique in its ability to help make deep learning easier to understand for nonexperts and enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike.

4:40pm–5:20pm Thursday, March 28, 2019

Machine learning for preventive maintenance of mining haul trucks

Data Science, Machine Learning & AI
Location: 2009

Alex Gorbachev (Pythian), Paul Spiegelhalter (Pythian)

Average rating: 4.67 (3 ratings)

Alex Gorbachev and Paul Spiegelhalter use the example of a mining haul truck to explain how to map preventive maintenance needs to supervised machine learning problems, create labeled datasets, do feature engineering from sensor and alert data, and evaluate models, and then how to convert it all into a complete AI solution on Google Cloud Platform that's integrated with existing on-premises systems.

©2019, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com