Presented By O’Reilly and Cloudera

Make Data Work

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Schedule: Transportation and Logistics sessions

9:00am–5:00pm Tuesday, 09/11/2018

Data Case Studies

Location: 1E 10

Paco Nathan (derwen.ai), Katharina Warzel (EveryMundo), Mike Berger (Mount Sinai Health System), Sam Helmich (Deere & Company), Stephanie Fischer (datanizing GmbH), Maryam Jahanshahi (TapRecruit), Greg Quist (SmartCover Systems), Ann Nguyen (Whole Whale), Steve Otto (Navistar), Jennifer Lim (Cerner), S Anand (Gramener), Ian Brooks (Cloudera)

Hear practical insights from household brands and global companies: the challenges they tackled, the approaches they took, and the benefits and drawbacks of their solutions.

11:20am–12:00pm Wednesday, 09/12/2018

Your 10 billion rides are arriving now: Scaling Apache Spark for data pipelines and intelligent systems at Uber

Location: 1A 10 Level: Intermediate

Felix Cheung (Uber)

Average rating: 4.60 (5 ratings)

Did you know that your Uber rides are powered by Apache Spark? Join Felix Cheung to learn how Uber is building its data platform with Apache Spark at enormous scale and discover the unique challenges the company faced and overcame.

2:05pm–2:45pm Wednesday, 09/12/2018

Achieving personalization with LSTMs

Location: 1A 15/16 Level: Intermediate

Ankit Jain (Uber)

Average rating: 3.00 (3 ratings)

Personalization is a common theme in social networks and ecommerce businesses. Personalization at Uber involves an understanding of how each driver and rider is expected to behave on the platform. Ankit Jain explains how Uber employs deep learning using LSTMs and its huge database to understand and predict the behavior of each and every user on the platform.

11:20am–12:00pm Thursday, 09/13/2018

The care and feeding of data scientists: Concrete tips for retaining your data science team

Location: 1E 10/11 Level: Non-technical

Michelangelo D'Agostino (ShopRunner)

Average rating: 4.75 (4 ratings)

Data scientists are hard to hire. But too often, companies struggle to find the right talent only to make avoidable mistakes that cause their best data scientists to leave. From org structure and leadership to tooling, infrastructure, and more, Michelangelo D'Agostino shares concrete (and inexpensive) tips for keeping your data scientists engaged, productive, and adding business value.

11:20am–12:00pm Thursday, 09/13/2018

Near-real-time anomaly detection at Lyft

Location: 1E 07/08 Level: Beginner

Thomas Weise (Lyft), Mark Grover (Lyft)

Average rating: 2.50 (2 ratings)

Thomas Weise and Mark Grover explain how Lyft uses its streaming platform to detect and respond to anomalous events, using data science tools for machine learning and a process that allows for fast and predictable deployment.

1:10pm–1:50pm Thursday, 09/13/2018

A/B testing at Uber: How we built a BYOM (bring your own metrics) platform

Location: 1A 21/22 Level: Intermediate

Milene Darnis (Uber)

Average rating: 4.22 (9 ratings)

Every new launch at Uber is vetted via robust A/B testing. Given the pace at which Uber operates, the metrics needed to assess the impact of experiments constantly evolve. Milene Darnis explains how the team built a scalable and self-serve platform that lets users plug in any metric to analyze.

1:10pm–1:50pm Thursday, 09/13/2018

How Komatsu is improving mining efficiencies using the IoT and machine learning

Location: 1E 09 Level: Non-technical

Shawn Terry (Komatsu Mining Corp)

Average rating: 4.50 (2 ratings)

Global heavy equipment manufacturer Komatsu is using IoT data to continuously monitor some of the largest mining equipment, with the aim of improving mine performance and efficiency. Shawn Terry details the company's data journey and explains how it is using advanced analytics and predictive modeling to draw insights from terabytes of IoT data from connected mining equipment.

1:10pm–1:50pm Thursday, 09/13/2018

Executive Briefing: Analytics for executives—Building an approachable language to drive data science in your organization

Location: 1E 14 Level: Non-technical

Brandy Freitas (Pitney Bowes)

Average rating: 4.50 (6 ratings)

Data science is an approachable field given the right framing. Often, though, practitioners and executives describe opportunities in completely different languages. Join Brandy Freitas to develop context and vocabulary around data science topics to help build a culture of data within your organization.

2:00pm–2:40pm Thursday, 09/13/2018

Using Alluxio as a fault-tolerant pluggable optimization component of JD.com's compute frameworks

Location: 1E 09 Level: Beginner

Tao Huang (JD.com), Mang Zhang (JD.com), Bing Bai (JD.com)

Average rating: 3.00 (1 rating)

Tao Huang, Mang Zhang, and Bing Bai explain how JD.com uses Alluxio to support ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization component. For example, one framework, JDPresto, has seen a 10x performance improvement on average.

2:00pm–2:40pm Thursday, 09/13/2018

Big data at speed

Location: 1A 06/07 Level: Intermediate

Ted Malaska (Capital One), Mark Grover (Lyft)

Many details go into building a big data system for speed: determining an acceptable latency for data access, deciding where to store the data, solving multiregion problems, and even knowing just what data you have and where stream processing fits in. Mark Grover and Ted Malaska share challenges, best practices, and lessons learned doing big data processing and analytics at scale and at speed.

4:20pm–5:00pm Thursday, 09/13/2018

Real-time machine intelligence in IndyCar and Tour de France

Location: 1E 10/11 Level: Beginner

Yasuyuki Kataoka (NTT Innovation Institute, Inc.)

Average rating: 3.00 (4 ratings)

One of the challenges of sports data analytics is how to deliver machine intelligence that goes beyond a mere real-time monitoring tool. Yasuyuki Kataoka highlights various real-time machine learning models in both IndyCar and the Tour de France, sharing real-time data processing architectures, machine learning models, and demonstrations that deliver meaningful insights for players and fans.
