Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Expo Hall sessions

11:15–11:55 Wednesday, 1 May 2019
Location: Expo Hall (Capital Hall N24)
Mounia Lalmas (Spotify)
Average rating: ****.
(4.16, 19 ratings)
Spotify's mission is "to match fans and artists in a personal and relevant way." Mounia Lalmas shares some of the research the company is doing to achieve this, from applying machine learning to validating metrics, illustrated through examples within the context of home and search.
11:15–11:55 Wednesday, 1 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Itai Yaffe (Nielsen)
Average rating: ****.
(4.45, 11 ratings)
NMC (Nielsen Marketing Cloud) provides customers (both marketers and publishers) with real-time analytics tools to profile their target audiences. To achieve that, the company needs to ingest billions of events per day into its big data stores in a scalable, cost-efficient way. Itai Yaffe explains how NMC continuously transforms its data infrastructure to support these goals.
12:05–12:45 Wednesday, 1 May 2019
Location: Expo Hall (Capital Hall N24)
Matthew Honnibal (Explosion AI)
Average rating: ****.
(4.00, 4 ratings)
Matthew Honnibal shares "one weird trick" that can give your NLP project a better chance of success: avoid a waterfall methodology where data definition, corpus construction, modeling, and deployment are performed as separate phases of work.
12:05–12:45 Wednesday, 1 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Ted Dunning (MapR)
Average rating: ****.
(4.67, 6 ratings)
As a community, we have been pushing streaming architectures, particularly microservices, for several years now. But what are the results in the field? Ted Dunning shares several (anonymized) case histories, describing the good, the bad, and the ugly. In particular, Ted covers how several teams who were new to big data fared by skipping MapReduce and jumping straight into streaming.
14:05–14:45 Wednesday, 1 May 2019
Location: Expo Hall (Capital Hall N24)
Secondary topics:  Security and Privacy
Mikio Braun (Zalando)
Average rating: *****
(5.00, 3 ratings)
Mikio Braun explores techniques and concepts around fairness, privacy, and security when it comes to machine learning models.
14:05–14:45 Wednesday, 1 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Simona Meriam (Nielsen)
Average rating: ****.
(4.57, 7 ratings)
Simona Meriam explains how Nielsen Marketing Cloud (NMC) used to manage its Kafka consumer offsets with the Spark-Kafka 0.8 consumer and why the company decided to upgrade to the 0.10 consumer. Simona reviews the problems encountered during the upgrade and details the process that led to the solution.
14:55–15:35 Wednesday, 1 May 2019
Location: Expo Hall (Capital Hall N24)
Secondary topics:  Deep Learning
Wolff Dobson (Google, Inc.)
Average rating: ***..
(3.83, 6 ratings)
Wolff Dobson covers the latest in TensorFlow. Whether you're a beginner or are migrating from 1.x to 2.0, you'll learn the best ways to set up your model, feed your data to it, and distribute it for fast training. You'll also discover how TensorFlow has recently been upgraded to be more intuitive.
14:55–15:35 Wednesday, 1 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Geir Engdahl (Cognite), Daniel Bergqvist (Google)
Average rating: ****.
(4.00, 2 ratings)
Geir Engdahl and Daniel Bergqvist explain how Cognite is developing IIoT smart maintenance systems that can process 10M samples a second from thousands of sensors. You'll explore an architecture designed for high performance, robust streaming sensor data ingest, and cost-effective storage of large volumes of time series data as well as best practices learned along the way.
16:35–17:15 Wednesday, 1 May 2019
Location: Expo Hall (Capital Hall N24)
Secondary topics:  Ethics, Security and Privacy
Maren Eckhoff (QuantumBlack)
Average rating: ****.
(4.50, 4 ratings)
The success of machine learning algorithms in a wide range of domains has led to a desire to leverage their power in ever more areas. Maren Eckhoff discusses modern explainability techniques that increase the transparency of black box algorithms, drive adoption, and help manage ethical, legal, and business risks. Many of these methods can be applied to any model without limiting performance.
16:35–17:15 Wednesday, 1 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Constantin Muraru (Adobe), Dan Popescu (Adobe)
Average rating: *****
(5.00, 2 ratings)
With the current crop of cloud providers, obtaining servers to run your real-time application has never been easier. But what happens when you wish to deploy your (web) applications frequently, on hundreds or even thousands of servers, in a fast, reliable way, with minimal human intervention? Constantin Muraru and Dan Popescu tell you how to tackle this challenge.
17:25–18:05 Wednesday, 1 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Ted Malaska (Capital One)
Average rating: ****.
(4.12, 8 ratings)
The world of data is all about building the best path to support time and quality to value. 80% to 90% of the work is getting the data into the hands and tools that can create value. Ted Malaska takes you on a journey to investigate strategies and designs that can change the way your company looks at and approaches data.
11:15–11:55 Thursday, 2 May 2019
Location: Expo Hall (Capital Hall N24)
Secondary topics:  Ethics
Average rating: *****
(5.00, 3 ratings)
Machine learning (ML) algorithms are good at learning new behaviors but bad at identifying when those behaviors are harmful or don’t make sense. Bias, ethics, and fairness are big risk factors in ML. However, we creators have a lot of experience dealing with intelligent beings: one another. Jerry Overton uses this common sense to build a checklist for protecting against ethical violations with ML.
11:15–11:55 Thursday, 2 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Thomas Weise (Lyft)
Average rating: ****.
(4.50, 14 ratings)
Fast data and stream processing are essential for making Lyft rides a good experience for passengers and drivers. Lyft's systems need to track and react to event streams in real time to update locations, compute routes and estimates, balance prices, and more. Thomas Weise offers an overview of the streaming platform that powers these use cases.
12:05–12:45 Thursday, 2 May 2019
Location: Expo Hall (Capital Hall N24)
Oliver Gindele (Datatonic)
Average rating: ****.
(4.50, 6 ratings)
The success of deep learning has reached the realm of structured data in the past few years, where neural networks have been shown to improve the effectiveness and predictability of recommendation engines. Oliver Gindele offers a brief overview of such deep recommender systems and explains how they can be implemented in TensorFlow.
12:05–12:45 Thursday, 2 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Kai Wähner (Confluent)
Average rating: ****.
(4.75, 8 ratings)
How do you leverage the flexibility and extreme scale of the public cloud and the Apache Kafka ecosystem to build scalable, mission-critical machine learning infrastructures that span multiple public clouds or bridge your on-premises data center to the cloud? Join Kai Wähner to learn how to use technologies such as TensorFlow with Kafka’s open source ecosystem for machine learning infrastructures.
14:05–14:45 Thursday, 2 May 2019
Location: Expo Hall (Capital Hall N24)
Alex Jaimes (Dataminr)
Average rating: ***..
(3.00, 2 ratings)
When emergency events occur, social signals and sensor data are generated. Alex Jaimes explains how to apply machine learning and deep learning to process large amounts of heterogeneous data from various sources in real time, with a particular focus on how such information can help first responders during emergencies and critical events and support other social good use cases.
14:05–14:45 Thursday, 2 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Holden Karau (Independent), Kris Nova (Independent)
Average rating: ****.
(4.86, 7 ratings)
In the Kubernetes world, where declarative resources are first-class citizens, running complicated workloads across distributed infrastructure is easy, and processing big data workloads using Spark is common practice, we can finally look at constructing a hybrid system that runs Spark in a distributed, cloud native way. Join experts Kris Nova and Holden Karau for a fun adventure.
14:55–15:35 Thursday, 2 May 2019
Location: Expo Hall 2 (Capital Hall N24)
Michael Freedman (TimescaleDB | Princeton University)
Average rating: ****.
(4.75, 4 ratings)
Time series databases require ingesting high volumes of structured data, answering complex, performant queries for recent and historical time intervals, and performing specialized time-centric analysis and data management. Michael Freedman explains how to avoid these operational problems by reengineering Postgres to serve as a general data platform, including high-volume time series workloads.