Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Tutorials

These expert-led presentations on Tuesday, 30 April give you a chance to dive deep into the subject matter. Please note: to attend tutorials, you must register for a Gold or Silver pass. These passes do not include access to training courses on Monday or Tuesday.

Tuesday, 30 April

9:00–12:30 Tuesday, 30 April 2019
Location: Capital Suite 8
Secondary topics: Financial Services
Ted Malaska (Capital One), Jonathan Seidman (Cloudera)
Average rating: ***..
(3.50, 12 ratings)
The enterprise data management space has changed dramatically in recent years, and this has led to new challenges for organizations in creating successful data practices. Jonathan Seidman and Ted Malaska share guidance and best practices from planning to implementation based on years of experience working with companies to deliver successful data projects.

9:00–12:30 Tuesday, 30 April 2019
Location: Capital Suite 11
Robin Moffatt (Confluent)
Average rating: *****
(5.00, 5 ratings)
Robin Moffatt walks you through the architectural reasoning for Apache Kafka and the benefits of real-time integration. You'll then build a streaming data pipeline using nothing but your bare hands, Kafka Connect, and KSQL.

9:00–12:30 Tuesday, 30 April 2019
Location: Capital Suite 14
Secondary topics: Model lifecycle management
Danilo Sato (ThoughtWorks), Christoph Windheuser (ThoughtWorks)
Average rating: ****.
(4.31, 13 ratings)
Danilo Sato and Christoph Windheuser walk you through applying continuous delivery (CD), pioneered by ThoughtWorks, to data science and machine learning. Join in to learn how to make changes to your models while safely integrating and deploying them into production, using testing and automation techniques to release reliably at any time and with a high frequency.

9:00–12:30 Tuesday, 30 April 2019
Location: Capital Suite 15
Holden Karau (Independent), Trevor Grant (IBM), Francesca Lazzeri (Microsoft)
Average rating: ****.
(4.43, 7 ratings)
Holden Karau, Francesca Lazzeri, and Trevor Grant offer an overview of Kubeflow and walk you through using it to train and serve models across different cloud environments (and on-premises). You'll use a script to do the initial setup work, so you can jump (almost) straight into training a model on one cloud and then look at how to set up serving in another cluster/cloud.

9:00–12:30 Tuesday, 30 April 2019
Location: Capital Suite 4
Krishnan Saidapet (REAN Cloud, A Hitachi Vantara company)
Average rating: ***..
(3.43, 7 ratings)
Krishnan Saidapet offers an overview of the latest big data and machine learning serverless technologies from Amazon Web Services (AWS) and leads a deep dive into using them to process and analyze two different datasets: the publicly available Bureau of Labor Statistics dataset and the Chest X-Ray Image Data dataset.

9:00–12:30 Tuesday, 30 April 2019
Location: Capital Suite 10
Secondary topics: Security and Privacy
Mark Donsky (Okera), Ifigeneia Derekli (Cloudera), Lars George (Okera), Michael Ernest (Dataiku)
Average rating: ****.
(4.00, 2 ratings)
New regulations such as CCPA and GDPR are driving new compliance, governance, and security challenges for big data. Infosec and security groups must ensure a consistently secured and governed environment across multiple workloads. Mark Donsky, Ifigeneia Derekli, Lars George, and Michael Ernest share hands-on best practices for meeting these challenges, with special attention paid to CCPA.

9:00–12:30 Tuesday, 30 April 2019
Location: Capital Suite 2/3
Melinda King (ROI Training)
Average rating: ***..
(3.00, 8 ratings)
Melinda King offers an introduction to designing and building machine learning (ML) models on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, you’ll learn ML and TensorFlow concepts and build skills in developing, evaluating, and productionizing ML models.

9:00–12:30 Tuesday, 30 April 2019
Location: S11 A
Mark Madsen (Teradata), Todd Walter (Archimedata)
Average rating: ***..
(3.71, 7 ratings)
Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build a multiuse data infrastructure that is not subject to past constraints. Mark Madsen and Todd Walter explore design assumptions and principles and walk you through a reference architecture to use as you work to unify your analytics infrastructure.

9:00–17:00 Tuesday, 30 April 2019
Location: Capital Suite 12
Paco Nathan (derwen.ai), Ganes Kesari (Gramener), Alicia Williams (Google), Semih Kumluk (Turkcell), Simon Moritz (Ericsson), Samuel Cristóbal (Innaxis), Volker Schnecke (Novo Nordisk), Julia Butter (Scout24), Cecilia Marchi (Jakala), Caroline Goulard (Dataveyes), Marc Rind (ADP), Juan Bengochea (Royal Caribbean Cruise Lines), Aaronpal Dhanda (EasyJet)
Hear practical insights from household brands and global companies: the challenges they tackled, the approaches they took, and the benefits (and drawbacks) of their solutions.

9:00–17:00 Tuesday, 30 April 2019
Location: Capital Suite 13
Alistair Croll (Solve For Interesting), Nicolette Bullivant (Santander UK Technology), Charlotte Werger (Van Lanschot Kempen), Daniel First (QuantumBlack), Yiannis Kanellopoulos (Code4Thought), Romi Mahajan (Quantarium), Rashed Iqbal (Investment and Development Office), Martin Leijen (Rabobank / Digital Transformation Office), Tal Doron (GigaSpaces), Chris Taggart (OpenCorporates), Jan Novotny (Deutsche Bank)
From analyzing risk and detecting fraud to predicting payments and improving customer experience, take a deep dive into the ways data technologies are transforming the financial industry.

13:30–17:00 Tuesday, 30 April 2019
Location: Capital Suite 4
Colm Moynihan (Cloudera), Jonathan Seidman (Cloudera), Michael Kohs (Cloudera)
Average rating: ****.
(4.00, 2 ratings)
Moving to the cloud poses a number of challenges. Join Colm Moynihan, Jonathan Seidman, and Michael Kohs to explore cloud architecture and challenges and learn how to use Cloudera Altus to build data warehousing and data engineering clusters and run workloads that share metadata between them using Cloudera SDX.

13:30–17:00 Tuesday, 30 April 2019
Location: S11 A
Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio), Ivan Kelly (Streamlio)
Average rating: ***..
(3.00, 10 ratings)
Many industry segments have been grappling with fast data (high-volume, high-velocity data). Arun Kejariwal and Karthik Ramasamy walk you through the state-of-the-art systems for each stage of an end-to-end data processing pipeline for real-time data (messaging, compute, and storage), as well as algorithms to extract insights (e.g., heavy hitters and quantiles) from data streams.

13:30–17:00 Tuesday, 30 April 2019
Location: Capital Suite 8
Peter Aiken (Data BluePrint | DAMA International | Virginia Commonwealth University)
Average rating: ***..
(3.43, 14 ratings)
Peter Aiken offers a more operational perspective on the use of data strategy, which is especially useful for organizations just getting started with data.

13:30–17:00 Tuesday, 30 April 2019
Location: Capital Suite 10
Boris Lublinsky (Lightbend), Dean Wampler (Anyscale)
Average rating: ****.
(4.20, 5 ratings)
Boris Lublinsky and Dean Wampler walk you through using ML in streaming data pipelines and doing periodic model retraining and low-latency scoring in live streams. You'll explore using Kafka as a data backplane, the pros and cons of microservices versus systems like Spark and Flink, tips for TensorFlow and SparkML, performance considerations, model metadata tracking, and other techniques.

13:30–17:00 Tuesday, 30 April 2019
Location: Capital Suite 14
Alexander Thomas (John Snow Labs), Claudiu Branzan (Accenture)
Average rating: ****.
(4.00, 4 ratings)
Alex Thomas and Claudiu Branzan lead a hands-on introduction to scalable NLP using the highly performant, highly scalable open source Spark NLP library. You’ll spend about half your time coding as you work through four sections, each with an end-to-end working code base that you can change and improve.

13:30–17:00 Tuesday, 30 April 2019
Location: Capital Suite 11
Melinda King (ROI Training)
Average rating: ***..
(3.12, 8 ratings)
Melinda King offers an introduction to designing and building machine learning (ML) models on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, you’ll learn ML and TensorFlow concepts and build skills in developing, evaluating, and productionizing ML models.

13:30–17:00 Tuesday, 30 April 2019
Location: Capital Suite 15
Matt Fuller (Starburst)
Average rating: *****
(5.00, 2 ratings)
Used by Facebook, Netflix, Airbnb, LinkedIn, Twitter, Uber, and others, Presto has become the ubiquitous open source software for SQL on anything. Presto was built from the ground up for fast interactive SQL analytics against disparate data sources ranging in size from GBs to PBs. Join Matt Fuller to learn how to use Presto and explore use cases and best practices you can implement today.

13:30–17:00 Tuesday, 30 April 2019
Location: Capital Suite 2/3
Francesca Lazzeri (Microsoft), Aashish Bhateja (Microsoft)
Average rating: ****.
(4.25, 4 ratings)
Time series modeling and forecasting is fundamentally important to various practical domains; in the past few decades, machine learning model-based forecasting has become very popular in both private and public decision-making processes. Francesca Lazzeri walks you through using Azure Machine Learning to build and deploy your time series forecasting models.