Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK
 
Capital Suite 12
13:30 Developing a modern enterprise data strategy Edd Wilder-James (Silicon Valley Data Science), Scott Kurth (Silicon Valley Data Science)
Capital Suite 13
13:30 Distributed deep learning on AWS using MXNet Anima Anandkumar (UC Irvine)
Capital Suite 14
9:00 Data 101 Shannon Cutt (O'Reilly Media, Inc.), Edd Wilder-James (Silicon Valley Data Science), Jim Scott (MapR Technologies, Inc.), Ellen Friedman (Independent), Martin Goodson (Evolution AI), Majken Sander (TimeXtender), Darren Cook (QQ Trend Ltd.)
13:30 Interactive data visualizations using Visdown Amit Kapoor (narrativeVIZ Consulting), Bargava Subramanian (Cisco Systems)
Capital Suite 11
Capital Suite 2/3
9:00 A practitioner’s guide to securing your Hadoop cluster Mark Donsky (Cloudera), Ben Spivey (Cloudera), Mubashir Kazia (Cloudera)
13:30 Unraveling data with Spark using machine learning Jeffrey Shmain (Cloudera), Jayant Shekhar (Sparkflows Inc.), Vartika Singh (Cloudera)
Capital Suite 4
9:00 Deploying and managing Hive, Spark, and Impala in the public cloud Matthew Jacobs (Cloudera), Andrei Savu (Cloudera), Vinithra Varadharajan (Cloudera), Jennifer Wu (Cloudera)
13:30 Real-time data pipelines with Apache Kafka Ian Wrigley (Confluent)
Capital Suite 8
9:00 Architecting a data platform John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
13:30 Architecting a next-generation data platform Jonathan Seidman (Cloudera), Mark Grover (Cloudera), Ted Malaska (Blizzard)
Capital Suite 9
9:00 Just enough Scala for Spark Dean Wampler (Lightbend)
13:30 Spark and R with sparklyr Douglas Ashton (Mango Solutions), Kate Ross-Smith (Mango Solutions), Mark Sellors (Mango Solutions)
Capital Suite 10
Capital Suite 15
9:00 FinData Day Doron Reuter (ING), Aida Mehonic (ASI Data Science), Colin White (Goldman Sachs), Fabio Oberto (UniCredit Business Integrated Solutions), Ivan Luciano Danesi (UniCredit Business Integrated Solutions), Tanvi Singh (Credit Suisse), Olivier de Garrigues (Trifacta)
13:30 Data Case Studies Allison Nau (Cox Automotive UK), Sriskandarajah Suhothayan (WSO2), Roland Major (Transport for London), Denis C. Bauer (Commonwealth Scientific and Industrial Research Organisation), Alberto Rey (easyJet PLC), Sameer Tilak (Kaiser Permanente), Anand Iyer (Cloudera), Wael Elrifai (Pentaho)
London Suite 3
9:00 Hardcore Data Science Ira Cohen (Anodot), Yingsong Zhang (ASI Data Science), Ali Hürriyetoglu (Statistics Netherlands), Marco Puts (Statistics Netherlands), Piet Daas (Statistics Netherlands), Robin Senge (inovex GmbH), Mathew Salvaris (Microsoft), Miguel Gonzalez-Fierro (Microsoft), Kay Brodersen (Google), Ding Ding (Intel), Alan Mosca (Birkbeck, University of London), Eduard Vazquez (Cortexica Vision Systems), Aida Mehonic (ASI Data Science)
12:30 Lunch | Room: Capital Suite Foyer
17:00 Opening Reception | Room: Capital Hall (N24)
9:00-12:30 (3h 30m) Data science and advanced analytics
Practical machine learning with Python
Angie Ma (ASI)
Angie Ma offers a hands-on overview of implementing machine learning with Python, providing practical experience while covering the most commonly used libraries, including NumPy, pandas, and scikit-learn.
13:30-17:00 (3h 30m) Data-driven business management, Strata Business Summit
Developing a modern enterprise data strategy
Edd Wilder-James (Silicon Valley Data Science), Scott Kurth (Silicon Valley Data Science)
Big data and data science have great potential for accelerating business, but how do you reconcile the business opportunity with the sea of possible technologies? Data should serve the strategic imperatives of a business—those aspirations that will define an organization’s future vision. Scott Kurth and Edd Wilder-James explain how to create a modern data strategy that powers data-driven business.
9:00-12:30 (3h 30m) Data science and advanced analytics AI, Deep learning
Deep learning for object detection and neural network deployment
Alison Lowndes (NVIDIA)
Alison Lowndes leads a hands-on exploration of approaches to the challenging problem of detecting whether an object of interest is present within an image and, if so, recognizing its precise location within the image. Along the way, Alison walks you through testing three different approaches to deploying a trained DNN for inference.
13:30-17:00 (3h 30m) Data science and advanced analytics Cloud, Deep learning
Distributed deep learning on AWS using MXNet
Anima Anandkumar (UC Irvine)
Deep learning is the state of the art in domains such as computer vision and natural language understanding. MXNet is a highly flexible and developer-friendly deep learning framework. Anima Anandkumar provides hands-on experience with MXNet, using preconfigured Deep Learning AMIs and CloudFormation templates to help speed your development.
9:00-12:30 (3h 30m)
Data 101
Shannon Cutt (O'Reilly Media, Inc.), Edd Wilder-James (Silicon Valley Data Science), Jim Scott (MapR Technologies, Inc.), Ellen Friedman (Independent), Martin Goodson (Evolution AI), Majken Sander (TimeXtender), Darren Cook (QQ Trend Ltd.)
Data 101 introduces you to core principles of data architecture, teaches you how to build and manage successful data teams, and inspires you to do more with your data through real-world applications. Setting the foundation for deeper dives on the following days of Strata Data Conference, Data 101 reinforces data fundamentals and helps you focus on how data can solve your business problems.
13:30-17:00 (3h 30m) Visualization & user experience
Interactive data visualizations using Visdown
Amit Kapoor (narrativeVIZ Consulting), Bargava Subramanian (Cisco Systems)
Crafting interactive data visualizations for the web is hard—you're stuck using proprietary tools or must become proficient in JavaScript libraries like D3. But what if creating a visualization was as easy as writing text? Amit Kapoor and Bargava Subramanian outline the grammar of interactive graphics and explain how to use declarative markdown-based tool Visdown to build them with ease.
9:00-17:00 (8h) Spark & beyond Text Analysis and Mining
Spark camp: Apache Spark 2.0 for analytics and text mining with Spark ML
This one-day hands-on class introduces you to Apache Spark 2.0 core concepts with a focus on Spark's machine learning library, using text mining on real-world data as the primary end-to-end use case.
9:00-12:30 (3h 30m) Hadoop platform and applications, Platform Security and Cybersecurity
A practitioner’s guide to securing your Hadoop cluster
Mark Donsky (Cloudera), Ben Spivey (Cloudera), Mubashir Kazia (Cloudera)
Ben Spivey, Mark Donsky, and Mubashir Kazia walk you through securing a Hadoop cluster. You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.
13:30-17:00 (3h 30m) Spark & beyond
Unraveling data with Spark using machine learning
Jeffrey Shmain (Cloudera), Jayant Shekhar (Sparkflows Inc.), Vartika Singh (Cloudera)
Vartika Singh, Jayant Shekhar, and Jeffrey Shmain walk you through the machine learning algorithms available in the Spark framework (and more), showing how to use them to find meaningful patterns in real-world data and derive value from it.
9:00-12:30 (3h 30m) Big data and the Cloud, Data engineering and architecture
Deploying and managing Hive, Spark, and Impala in the public cloud
Matthew Jacobs (Cloudera), Andrei Savu (Cloudera), Vinithra Varadharajan (Cloudera), Jennifer Wu (Cloudera)
Public cloud usage for Hadoop workloads is accelerating. Consequently, Hadoop components have adapted to leverage cloud infrastructure. Andrei Savu, Vinithra Varadharajan, Matthew Jacobs, and Jennifer Wu explore best practices for Hadoop deployments in the public cloud and provide detailed guidance for deploying, configuring, and managing Hive, Spark, and Impala in the public cloud.
13:30-17:00 (3h 30m) Stream processing and analytics
Real-time data pipelines with Apache Kafka
Ian Wrigley (Confluent)
Ian Wrigley demonstrates how to use Kafka Connect and Kafka Streams to build real-world, real-time streaming data pipelines—using Kafka Connect to ingest data from a relational database into Kafka topics as the data is being generated and then using Kafka Streams to process and enrich the data in real time before writing it out for further analysis.
9:00-12:30 (3h 30m) Data engineering and architecture, Spark & beyond
Architecting a data platform
John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
What are the essential components of a data platform? John Akred and Stephen O'Sullivan explain how the various parts of the Hadoop, Spark, and big data ecosystems fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.
13:30-17:00 (3h 30m) Data engineering and architecture, Hadoop platform and applications
Architecting a next-generation data platform
Jonathan Seidman (Cloudera), Mark Grover (Cloudera), Ted Malaska (Blizzard)
Using Entity 360 as an example, Jonathan Seidman, Ted Malaska, Mark Grover, and Gwen Shapira explain how to architect a modern, real-time big data platform leveraging recent advancements in the open source software world, using components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL with Hadoop to enable new forms of data processing and analytics.
9:00-12:30 (3h 30m) Spark & beyond
Just enough Scala for Spark
Dean Wampler (Lightbend)
Apache Spark is written in Scala. Hence, many if not most data engineers adopting Spark are also adopting Scala, while most data scientists continue to use Python and R. Dean Wampler offers an overview of the core features of Scala you need to use Spark effectively, using hands-on exercises with the Spark APIs.
13:30-17:00 (3h 30m) Big data and the Cloud, Spark & beyond
Spark and R with sparklyr
Douglas Ashton (Mango Solutions), Kate Ross-Smith (Mango Solutions), Mark Sellors (Mango Solutions)
R is a top contender for statistics and machine learning, but Spark has emerged as the leader for in-memory distributed data analysis. Douglas Ashton, Kate Ross-Smith, and Mark Sellors introduce Spark, cover data manipulation with Spark as a backend to dplyr and machine learning via MLlib, and explore RStudio's sparklyr package, giving you the power of Spark without having to leave your R session.
9:00-12:30 (3h 30m) Big data and the Cloud
Building your first big data application on AWS
Want to ramp up your knowledge of Amazon's big data web services and launch your first big data application on the cloud? Rahul Bhartia walks you through building a big data application in real time using a combination of open source technologies, including Apache Hadoop, Spark, and Zeppelin, as well as AWS managed services such as Amazon EMR, Amazon Kinesis, and more.
13:30-17:00 (3h 30m) Big data and the Cloud, Data engineering and architecture
Architecting and building enterprise-class Spark and Hadoop in cloud environments
James Malone (Google)
James Malone explores using managed Spark and Hadoop solutions in public clouds alongside cloud products for storage, analysis, and message queues to meet enterprise requirements via the Spark and Hadoop ecosystem.
9:00-12:30 (3h 30m)
FinData Day
Doron Reuter (ING), Aida Mehonic (ASI Data Science), Colin White (Goldman Sachs), Fabio Oberto (UniCredit Business Integrated Solutions), Ivan Luciano Danesi (UniCredit Business Integrated Solutions), Tanvi Singh (Credit Suisse), Olivier de Garrigues (Trifacta)
Finance is information. From analyzing risk and detecting fraud to predicting payments and improving customer experience, data technologies are transforming the financial industry. And we're diving deep into this change with a new day of data-meets-finance talks, tailored for Strata Data Conference events in the world's financial hubs.
13:30-17:00 (3h 30m)
Data Case Studies
Allison Nau (Cox Automotive UK), Sriskandarajah Suhothayan (WSO2), Roland Major (Transport for London), Denis C. Bauer (Commonwealth Scientific and Industrial Research Organisation), Alberto Rey (easyJet PLC), Sameer Tilak (Kaiser Permanente), Anand Iyer (Cloudera), Wael Elrifai (Pentaho)
In a series of 6 half-hour talks aimed at a business audience, you’ll hear data-themed case studies from household brands and global companies, explaining the challenges they wanted to tackle, the approaches they took, and the benefits—and drawbacks—of their solutions. If you want practical insights about applied data, look no further.
9:00-17:00 (8h)
Hardcore Data Science
Ira Cohen (Anodot), Yingsong Zhang (ASI Data Science), Ali Hürriyetoglu (Statistics Netherlands), Marco Puts (Statistics Netherlands), Piet Daas (Statistics Netherlands), Robin Senge (inovex GmbH), Mathew Salvaris (Microsoft), Miguel Gonzalez-Fierro (Microsoft), Kay Brodersen (Google), Ding Ding (Intel), Alan Mosca (Birkbeck, University of London), Eduard Vazquez (Cortexica Vision Systems), Aida Mehonic (ASI Data Science)
A full day of hardcore data science, exploring emerging topics and new areas of study made possible by vast troves of raw data and cutting-edge architectures for analyzing and exploring information. Along the way, leading data science practitioners teach new techniques and technologies to add to your data science toolbox.
12:30-13:30 (1h)
Break: Lunch
17:00-18:00 (1h) Event
Opening Reception
Grab a drink and mingle with fellow Strata Data Conference attendees while you check out all of the exhibitors in the Exhibit Hall.