Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Data Platforms sessions

9:00–12:30 Tuesday, 30 April 2019
Mark Madsen (Teradata), Todd Walter (Teradata)
Average rating: ***..
(3.71, 7 ratings)
Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build a multiuse data infrastructure that is not subject to past constraints. Mark Madsen and Todd Walter explore design assumptions and principles and walk you through a reference architecture to use as you work to unify your analytics infrastructure.
11:15–11:55 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Moty Fania (Intel)
Average rating: ***..
(3.83, 6 ratings)
Moty Fania shares his experience implementing a sales AI platform that handles processing of millions of website pages and sifts through millions of tweets per day. The platform is based on unique open source technologies and was designed for real-time data extraction and actuation.
12:05–12:45 Wednesday, 1 May 2019
Case studies, Strata Business Summit
Location: Capital Suite 12
Dirk Petzoldt (Zalando SE)
Average rating: ****.
(4.18, 11 ratings)
Dirk Petzoldt shares a case study from Europe’s leading online fashion platform Zalando illustrating its journey to a scalable, personalized machine learning–based marketing platform.
14:05–14:45 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Jian Chang (Alibaba Group), Sanjian Chen (Alibaba Group)
Average rating: ***..
(3.33, 3 ratings)
Jian Chang and Sanjian Chen share the architecture design and many detailed technology innovations of Alibaba TSDB, a state-of-the-art database for IoT data management, and discuss lessons learned from years of development and continuous improvement.
14:55–15:35 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Mark Grover (Lyft), Deepak Tiwari (Lyft)
Average rating: ****.
(4.69, 13 ratings)
Lyft’s data platform is at the heart of the company's business. Decisions from pricing to ETA to business operations rely on it. Moreover, it powers the enormous scale and speed at which Lyft operates. Mark Grover and Deepak Tiwari walk you through the choices Lyft made in developing and sustaining the data platform, along with what lies ahead.
14:55–15:35 Wednesday, 1 May 2019
Law and Ethics, Strata Business Summit
Location: Capital Suite 10/11
Average rating: ****.
(4.11, 9 ratings)
Shailesh Chauhan explains how Uber built its business intelligence platform, detailing why the company took a platform approach rather than adding features in a piecemeal fashion.
16:35–17:15 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Neelesh Salian (Stitch Fix)
Average rating: ****.
(4.25, 4 ratings)
Developing data infrastructure is not trivial; neither is changing it. It takes effort and discipline to make changes that can affect your team. Neelesh Salian discusses how Stitch Fix's data platform team maintains and innovates its infrastructure for the company's data scientists.
17:25–18:05 Wednesday, 1 May 2019
Felix Cheung (Uber)
Average rating: ****.
(4.42, 12 ratings)
Did you know that your Uber rides are powered by Apache Spark? Join Felix Cheung to learn how Uber is building its data platform with Apache Spark at enormous scale and discover the unique challenges the company faced and overcame.
17:25–18:05 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Nate Keating (Google)
Average rating: ****.
(4.00, 5 ratings)
AI will change how we live in the next 30 years, but it's currently limited to a small group of companies. To scale the impact of AI across the globe, we need to reduce the cost of building AI solutions, but how? Nate Keating explains how to apply lessons learned from other industries, specifically the automobile industry, which went through a similar cycle.
17:25–18:05 Wednesday, 1 May 2019
Mark Samson (Cloudera), Phillip Radley (BT)
Average rating: *****
(5.00, 2 ratings)
It's now possible to build a modern data platform capable of storing, processing, and analyzing a wide variety of data across multiple public and private cloud platforms and on-premises data centers. Mark Samson and Phillip Radley outline an information architecture for such a platform, informed by working with multiple large organizations that have built such platforms over the last five years.
11:15–11:55 Thursday, 2 May 2019
Data Engineering and Architecture, Expo Hall, Streaming and IoT
Location: Expo Hall 2 (Capital Hall N24)
Thomas Weise (Lyft)
Average rating: ****.
(4.50, 14 ratings)
Fast data and stream processing are essential for making Lyft rides a good experience for passengers and drivers. Lyft's systems need to track and react to event streams in real time to update locations, compute routes and estimates, balance prices, and more. Thomas Weise offers an overview of the streaming platform that powers these use cases.
12:05–12:45 Thursday, 2 May 2019
David Josephsen (Sparkpost)
Average rating: ***..
(3.50, 2 ratings)
David Josephsen tells the story of how Sparkpost's reliability engineering team abandoned ELK for a DIY schema-on-read logging infrastructure. Join in to learn the architectural details, trials, and tribulations from the company's Internal Event Hose data ingestion pipeline project, which uses Fluentd, Kinesis, Parquet, and AWS Athena to make logging sane.
12:05–12:45 Thursday, 2 May 2019
Pradeep Bhadani (Hotels.com), Elliot West (Hotels.com)
Average rating: ****.
(4.17, 6 ratings)
Travel platform Expedia Group likes to give its data teams flexibility and autonomy to work with different technologies. However, this approach generates challenges that cannot be solved by existing tools. Pradeep Bhadani and Elliot West explain how the company built a unified virtual data lake on top of its many heterogeneous and distributed data platforms.
12:05–12:45 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Václav Surovec (Deutsche Telekom), Gabor Kotalik (Deutsche Telekom)
Average rating: ****.
(4.00, 2 ratings)
Knowledge of customers' location and travel patterns is important for many companies, including German telco service operator Deutsche Telekom. Václav Surovec and Gabor Kotalik explain how a commercial roaming project using Cloudera Hadoop helped the company better analyze the behavior of its customers from 10 countries and provide better predictions and visualizations for management.
14:05–14:45 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Willem Pienaar (GOJEK), Zhi Ling Chen (GOJEK)
Average rating: ****.
(4.80, 5 ratings)
Features are key to driving impact with AI at all scales, allowing organizations to dramatically accelerate innovation and time to market. Willem Pienaar and Zhiling Chen explain how GOJEK, Indonesia's first billion-dollar startup, unlocked insights in AI by building a feature store called Feast, and the lessons they learned along the way.
16:35–17:15 Thursday, 2 May 2019
Thomas Phelan (BlueData)
Average rating: ***..
(3.29, 7 ratings)
Organizations need to keep ahead of their competition by using the latest AI, ML, and DL technologies such as Spark, TensorFlow, and H2O. The challenge is in how to deploy these tools and keep them running in a consistent manner while maximizing the use of scarce hardware resources, such as GPUs. Thomas Phelan discusses the effective deployment of such applications in a container environment.