Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK
 
Capital Suite 12
9:00 Measure what matters: How your measurement strategy can reduce opex Radhika Dutt (Radical Product), Geordie Kaytes (Fresh Tilled Soil), Nidhi Aggarwal (Radical Product)
13:30 Architecting a next-generation data platform Ted Malaska (Blizzard Entertainment), Jonathan Seidman (Cloudera)
Capital Suite 13
9:00 Running data analytic workloads in the cloud Eugene Fratkin (Cloudera), Vinithra Varadharajan (Cloudera), Mael Ropars (Cloudera), Jason Wang (Cloudera)
13:30 Natural language understanding at scale with spaCy and Spark NLP David Talby (Pacific AI), Claudiu Branzan (G2 Web Services)
Capital Suite 14
9:00 Architecting a data platform for enterprise use Mark Madsen (Think Big Analytics), Todd Walter (Teradata)
13:30 Managing data science in the enterprise Dan Enthoven (Domino Data Lab)
Capital Suite 2/3
9:00 Data Case Studies Dan Jeavons (Shell), Hollie Lubbock (Fjord), Jivan Virdee (Fjord), Fausto Morales (Arundo), Marty Cochrane (Arundo), Jane McConnell (Teradata), Paul Ibberson (Teradata), Michael Troughton (Conduce), Jonathan Genah (DHL Supply Chain), Allison Nau (Cox Automotive UK), Dave Fitch (The Data Lab), Maria Assunta Palmieri (Data Reply), Niranjan Thomas (Dow Jones), Erik Elgersma (FrieslandCampina), Viola Melis (Typeform), Carme Artigas (Synergic Partners), Nuria Bombardo (PepsiCo)
Capital Suite 4
9:00 Findata Day Paul Lashmet (Arcadia Data), Anthony Culligan (SETL), Konrad Sippel (Deutsche Börse), Paul Lynn (Nordea), Mikheil Nadareishvili (TBC Bank), Olaf Hein (ORDIX AG), Robert Passarella (Alpha Features), Louise Beaumont (Publicis Groupe | techUK | NPSO), Alistair Croll (Solve For Interesting), Christina Erlwein-Sayer (OptiRisk Systems), Angelique Mohring (GainX), Saeed Amen (Cuemacro), Gisele Frederick (Zingr.io)
Capital Suite 8
9:00 Modern real-time streaming architectures Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio), Ivan Kelly (Streamlio)
13:30 Kafka streaming microservices with Akka Streams and Kafka Streams Dean Wampler (Lightbend), Boris Lublinsky (Lightbend)
Capital Suite 9
13:30 Making data visual: A practical session on using visualization for insight Danyel Fisher (Honeycomb.io), Miriah Meyer (University of Utah)
Capital Suite 10
13:30
Capital Suite 11
Capital Suite 15
9:00 Leveraging Spark and deep learning frameworks to understand data at scale Vartika Singh (Cloudera), Juan Yu (Cloudera), Marton Balassi (Cloudera), Steven Totman (Cloudera)
13:30 Securing and governing hybrid, cloud, and on-premises big data deployments, step by step Mark Donsky (Okera), Steffen Maerkl (Cloudera), Andre Araujo (Cloudera)
10:30 Morning break | Room: Capital Suite Foyer
15:00 Afternoon break | Room: Capital Suite Foyer
12:30 Lunch sponsored by IBM | Room: N11
7:30 Coffee break sponsored by Redis Labs | Room: Auditorium Foyer
17:00 Opening Reception | Room: Expo Hall (Capital Hall 24)
19:00 Strata Dine-Around | Room: Various locations
9:00-12:30 (3h 30m) Data-driven business management, Strata Business Summit Visualization, Design, and UX
Measure what matters: How your measurement strategy can reduce opex
Radhika Dutt (Radical Product), Geordie Kaytes (Fresh Tilled Soil), Nidhi Aggarwal (Radical Product)
These days it’s easy for companies to say, “We measure everything!” The problem is that the most popular metrics may not be appropriate or relevant for your business. Measurement isn’t free and should be done strategically. Radhika Dutt, Geordie Kaytes, and Nidhi Aggarwal explain how to align measurement with your product strategy so you can measure what matters for your business.
13:30-17:00 (3h 30m) Data engineering and architecture Data Platforms
Architecting a next-generation data platform
Ted Malaska (Blizzard Entertainment), Jonathan Seidman (Cloudera)
Using Customer 360 and the IoT as examples, Jonathan Seidman and Ted Malaska explain how to architect a modern, real-time big data platform leveraging recent advancements in the open source software world, using components like Kafka, Flink, Kudu, Spark Streaming, and Spark SQL and modern storage engines to enable new forms of data processing and analytics.
9:00-12:30 (3h 30m) Data engineering and architecture
Running data analytic workloads in the cloud
Eugene Fratkin (Cloudera), Vinithra Varadharajan (Cloudera), Mael Ropars (Cloudera), Jason Wang (Cloudera)
Vinithra Varadharajan, Jason Wang, Eugene Fratkin, and Mael Ropars detail new paradigms to effectively run production-level pipelines with minimal operational overhead. Join in to learn how to remove barriers to data discovery, metadata sharing, and access control.
13:30-17:00 (3h 30m) Data science and machine learning Text and Language processing and analysis
Natural language understanding at scale with spaCy and Spark NLP
David Talby (Pacific AI), Claudiu Branzan (G2 Web Services)
Natural language processing is a key component in many data science systems. David Talby and Claudiu Branzan lead a hands-on tutorial on scalable NLP using spaCy for building annotation pipelines, Spark NLP for building distributed natural language machine-learned pipelines, and Spark ML and TensorFlow for using deep learning to build and apply word embeddings.
9:00-12:30 (3h 30m) Data engineering and architecture Data Platforms
Architecting a data platform for enterprise use
Mark Madsen (Think Big Analytics), Todd Walter (Teradata)
Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build multiuse data infrastructure that is not subject to past constraints. Mark Madsen and Todd Walter explore design assumptions and principles and walk you through a reference architecture to use as you work to unify your analytics infrastructure.
13:30-17:00 (3h 30m) Strata Business Summit
Managing data science in the enterprise
Dan Enthoven (Domino Data Lab)
The honeymoon era of data science is ending, and accountability is coming. Not content to wait for results that may or may not arrive, successful data science leaders deliver measurable impact on an increasing share of an enterprise's KPIs. Dan Enthoven outlines a holistic approach to people, process, and technology to build a sustainable competitive advantage.
9:00-17:00 (8h)
Data Case Studies
Dan Jeavons (Shell), Hollie Lubbock (Fjord), Jivan Virdee (Fjord), Fausto Morales (Arundo), Marty Cochrane (Arundo), Jane McConnell (Teradata), Paul Ibberson (Teradata), Michael Troughton (Conduce), Jonathan Genah (DHL Supply Chain), Allison Nau (Cox Automotive UK), Dave Fitch (The Data Lab), Maria Assunta Palmieri (Data Reply), Niranjan Thomas (Dow Jones), Erik Elgersma (FrieslandCampina), Viola Melis (Typeform), Carme Artigas (Synergic Partners), Nuria Bombardo (PepsiCo)
Hear practical insights from household brands and global companies: the challenges they tackled, approaches they took, and the benefits—and drawbacks—of their solutions.
9:00-17:00 (8h)
Findata Day
Paul Lashmet (Arcadia Data), Anthony Culligan (SETL), Konrad Sippel (Deutsche Börse), Paul Lynn (Nordea), Mikheil Nadareishvili (TBC Bank), Olaf Hein (ORDIX AG), Robert Passarella (Alpha Features), Louise Beaumont (Publicis Groupe | techUK | NPSO), Alistair Croll (Solve For Interesting), Christina Erlwein-Sayer (OptiRisk Systems), Angelique Mohring (GainX), Saeed Amen (Cuemacro), Gisele Frederick (Zingr.io)
From analyzing risk and detecting fraud to predicting payments and improving customer experience, take a deep dive into the ways data technologies are transforming the financial industry.
9:00-12:30 (3h 30m) Data engineering and architecture, Streaming systems and real-time applications
Modern real-time streaming architectures
Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio), Ivan Kelly (Streamlio)
The need for instant data-driven insights has led to the proliferation of messaging and streaming frameworks. Karthik Ramasamy, Arun Kejariwal, and Ivan Kelly walk you through state-of-the-art streaming frameworks, algorithms, and architectures, covering the typical challenges in modern real-time big data platforms and offering insights on how to address them.
13:30-17:00 (3h 30m) Data engineering and architecture, Streaming systems and real-time applications
Kafka streaming microservices with Akka Streams and Kafka Streams
Dean Wampler (Lightbend), Boris Lublinsky (Lightbend)
Dean Wampler and Boris Lublinsky walk you through building streaming apps as microservices using Akka Streams and Kafka Streams. Along the way, Dean and Boris discuss the strengths and weaknesses of each tool for particular design needs and contrast them with Spark Streaming and Flink, so you'll know when to choose them instead.
9:00-12:30 (3h 30m) Law, ethics, and governance, Strata Business Summit Security and Privacy
General Data Protection Regulation (GDPR) tutorial and ePrivacy introduction
Aurélie Pols (Mind Your Privacy)
Aurélie Pols walks you through a "5+5 pillars" framework for GDPR readiness, explaining what the GDPR means to data-fueled businesses. You'll learn how to attribute responsibility to assure compliance and build toward ethical data practices, minimizing risk for your company while fostering trust with your clients.
13:30-17:00 (3h 30m) Visualization and user experience Visualization, Design, and UX
Making data visual: A practical session on using visualization for insight
Danyel Fisher (Honeycomb.io), Miriah Meyer (University of Utah)
Danyel Fisher and Miriah Meyer explore the human side of data analysis and visualization, covering operationalization, the process of reducing vague problems to specific tasks, and how to choose a visual representation that addresses those tasks. Along the way, they also discuss single views and explain how to link them into multiple views.
9:00-12:30 (3h 30m) Data science and machine learning, Emerging technologies and case studies Text and Language processing and analysis
Introduction to natural language processing with Python
Barbara Fusinska (Google)
Natural language processing techniques help address tasks like text classification, information extraction, and content generation. Barbara Fusinska offers an overview of natural language processing and walks you through building a bag-of-words representation, using Python and its machine learning libraries, and then using it for text classification.
13:30-17:00 (3h 30m)
Session
9:00-17:00 (8h) Big data and data science in the cloud
Serverless machine learning with TensorFlow
Carl Osipov (Google)
Carl Osipov walks you through building a complete machine learning pipeline from ingest, exploration, training, and evaluation to deployment and prediction.
9:00-12:30 (3h 30m) Data science and machine learning
Leveraging Spark and deep learning frameworks to understand data at scale
Vartika Singh (Cloudera), Juan Yu (Cloudera), Marton Balassi (Cloudera), Steven Totman (Cloudera)
Vartika Singh, Marton Balassi, Steven Totman, and Juan Yu outline approaches for preprocessing, training, inference, and deployment across datasets (time series, audio, video, text, etc.) that leverage Spark, its extended ecosystem of libraries, and deep learning frameworks.
13:30-17:00 (3h 30m) Law, ethics, and governance, Platform security and cybersecurity Security and Privacy
Securing and governing hybrid, cloud, and on-premises big data deployments, step by step
Mark Donsky (Okera), Steffen Maerkl (Cloudera), Andre Araujo (Cloudera)
Hybrid big data deployments present significant new security risks. Security admins must ensure a consistently secured and governed experience for end users and administrators across multiple workloads. Mark Donsky, Steffen Maerkl, and André Araujo share best practices for meeting these challenges as they walk you through securing a Hadoop cluster.
10:30-11:00 (30m)
Break: Morning break
15:00-15:30 (30m)
Break: Afternoon break
12:30-13:30 (1h)
Break: Lunch sponsored by IBM
7:30-9:00 (1h 30m)
Break: Coffee break sponsored by Redis Labs
17:00-18:00 (1h)
Opening Reception
Join us after tutorials on Tuesday in the Expo Hall. Grab a drink and mingle with fellow Strata attendees while you check out all of the exhibitors.
19:00-21:00 (2h)
Strata Dine-Around
Get to know your fellow attendees over dinner. We've made reservations for you at some of the most sought-after restaurants in town. This is a great chance to make new connections and sample some of the great cuisine London has to offer.