Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Schedule: Cloud sessions

9:00am - 5:00pm Monday, September 25 & Tuesday, September 26
Data engineering
Location: 1A 04/05
SOLD OUT
Bruce Martin (Cloudera)
Average rating: *....
(1.50, 2 ratings)
Bruce Martin leads you through designing and architecting solutions to a challenging business problem. You'll explore big data application architecture concepts in general and then apply them to the design of a challenging system.
9:00am - 5:00pm Monday, September 25 & Tuesday, September 26
SOLD OUT
Jesse Anderson (Big Data Institute)
To handle real-time big data, you need to solve two difficult problems: how to ingest that much data and how to process it. Jesse Anderson explores the latest real-time frameworks (both open source and managed cloud services), discusses the leading cloud providers, and explains how to choose the right one for your company.
9:00am - 12:30pm Tuesday, September 26, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1E 10 Level: Intermediate
Jennifer Wu (Cloudera), Fahd Siddiqui (Cloudera), Paul George (Cloudera), Eugene Fratkin (Cloudera)
Average rating: *....
(1.50, 2 ratings)
Jennifer Wu, Paul George, Fahd Siddiqui, and Eugene Fratkin lead a deep dive into running data engineering workloads as a managed service in the public cloud. Along the way, they share AWS infrastructure best practices and explain how data engineering workloads interoperate with data analytics workloads.
9:00am - 12:30pm Tuesday, September 26, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 23/24 Level: Beginner
Pranav Rastogi (Microsoft)
Average rating: **...
(2.50, 2 ratings)
As big data solutions are rapidly moving to the cloud, it's becoming increasingly important to know how to use Apache Hadoop, Spark, R Server, and other open source technologies in the cloud. Pranav Rastogi walks you through building big data applications on Azure HDInsight and other Azure services.
1:30pm - 5:00pm Tuesday, September 26, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1E 15/16 Level: Intermediate
Ryan Nienhuis (Amazon Web Services (AWS)), Radhika Ravirala (Amazon Web Services (AWS)), Allan MacInnis (Amazon Web Services), Ben Snively (Amazon Web Services (AWS))
Average rating: ****.
(4.00, 2 ratings)
Want to learn how to use Amazon's big data web services to launch your first big data application on the cloud? Ryan Nienhuis, Radhika Ravirala, Allan MacInnis, and Ben Snively walk you through building a big data application using a combination of open source technologies and AWS managed services.
1:30pm - 5:00pm Tuesday, September 26, 2017
Data Engineering & Architecture, Security
Location: 1A 18 Level: Intermediate
Mark Donsky (Cloudera), Manish Ahluwalia (NerdWallet), André Araujo (Cloudera), Syed Rafice (Cloudera)
Average rating: *****
(5.00, 1 rating)
Mark Donsky, André Araujo, Syed Rafice, and Manish Ahluwalia walk you through securing a Hadoop cluster. You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.
2:05pm - 2:45pm Wednesday, September 27, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 15/16/17 Level: Intermediate
Henry Robinson (Cloudera), Greg Rahn (Cloudera)
Cloud environments will likely play a key role in your business’s future. Henry Robinson and Greg Rahn explore the workload considerations when evaluating the cloud for analytics and discuss common architectural patterns to optimize price and performance.
5:25pm - 6:05pm Wednesday, September 27, 2017
Artificial Intelligence, Machine Learning & Data Science
Location: 1A 12/14 Level: Intermediate
Leo Dirac (Amazon Web Services)
Average rating: *****
(5.00, 5 ratings)
Leo Dirac demonstrates how to apply the latest deep learning techniques to semantically understand images. You'll learn what embeddings are, how to extract them from your images using deep convolutional neural networks (CNNs), and how they can be used to cluster and classify large datasets of images.
5:25pm - 6:05pm Wednesday, September 27, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 15/16/17 Level: Intermediate
Josh Baer (Spotify), Alison Gilles (Spotify)
Average rating: ****.
(4.00, 1 rating)
In early 2016, Spotify decided that it didn’t want to be in the data center business. The future was the cloud. Josh Baer and Alison Gilles explain what it took to move Spotify to the cloud, covering Spotify's technology choices, challenges faced, and the lessons Spotify learned along the way.
11:20am - 12:00pm Thursday, September 28, 2017
Gwen Shapira (Confluent)
Average rating: ****.
(4.50, 2 ratings)
Gwen Shapira explains how the three realities of modern programming (the explosion of data and data systems, building business processes as microservices instead of monolithic applications, and the rise of the public cloud) affect how developers and companies operate today and why companies across all industries are turning to streaming data and Apache Kafka for mission-critical applications.
11:20am - 12:00pm Thursday, September 28, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 15/16/17 Level: Intermediate
Chris Mills (The Meet Group)
if(we)'s batch event processing pipeline is different from yours, but the process of migrating it from running in a data center to running in AWS is likely pretty similar. Chris Mills explains what was easier than expected, what was harder, and what the company wished it had known before starting the migration.
11:20am - 12:00pm Thursday, September 28, 2017
Stephen Wu (Microsoft)
Average rating: ****.
(4.00, 1 rating)
Remote storage in the cloud provides an infinitely scalable, cost-effective, and performant solution for big data customers. Adoption is rapid due to the flexibility and cost savings associated with unlimited storage capacity when separating compute and storage. Stephen Wu demonstrates how to tune the performance of your workloads when your data is stored in remote storage in the cloud.
1:15pm - 1:55pm Thursday, September 28, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 15/16/17 Level: Intermediate
Bill Havanki (Cloudera)
Speed and reliability in deploying big data clusters are key to effectiveness in the cloud. Drawing on ideas from his book Moving Hadoop to the Cloud, which covers essential practices like baking images and automating cluster configuration, Bill Havanki explains how you can automate the creation of new clusters from scratch and use metrics gathered using the cloud provider to scale up.
1:15pm - 1:55pm Thursday, September 28, 2017
Big data and the Cloud, Machine Learning & Data Science
Location: 1A 08/10 Level: Intermediate
Edgar Ruiz (RStudio)
Average rating: ****.
(4.00, 1 rating)
With R and sparklyr, a Spark standalone cluster can be used to analyze large datasets found in S3 buckets. Edgar Ruiz walks you through setting up a Spark standalone cluster using EC2 and offers an overview of S3 bucket folder and file setup, connecting R to Spark, the settings needed to read S3 data into Spark, and an approach to importing and wrangling data.
2:05pm - 2:45pm Thursday, September 28, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 15/16/17 Level: Beginner
Michael McCune (Red Hat)
Average rating: *****
(5.00, 2 ratings)
Notebook interfaces like Apache Zeppelin and Project Jupyter are excellent starting points for sketching out ideas and exploring data-driven algorithms, but where does the process lead after the notebook work has been completed? Michael McCune offers some answers as they relate to cloud-native platforms.
2:55pm - 3:35pm Thursday, September 28, 2017
Business case studies, Strata Business Summit
Location: 1A 18 Level: Non-technical
Moderated by:
Steven Totman (Cloudera)
Panelists:
Siew Choo Soh (DBS Bank), Meena Ram (CIBC), David Leach (Qrious)
Big data and the cloud have spread around the world, and Singapore, New Zealand, Australia, and Canada are already seeing dramatic investments and returns. In a panel moderated by Steve Totman, senior executives from a variety of leading companies, including DBS, CIBC, and Qrious, share use cases, challenges, and how to be successful.
4:35pm - 5:15pm Thursday, September 28, 2017
Data Engineering & Architecture, Data-driven business management
Location: 1A 15/16/17 Level: Intermediate
Felipe Hoffa (Google)
Average rating: *****
(5.00, 1 rating)
With Google BigQuery, anyone can easily analyze more than five years of GitHub metadata and 42+ terabytes of open source code. Felipe Hoffa explains how to leverage this data to understand the community and code related to any language or project. Relevant for open source creators, users, and choosers, this is data that you can leverage to make better choices.
4:35pm - 5:15pm Thursday, September 28, 2017
Emerging Technologies, Machine Learning & Data Science
Location: 1A 08/10 Level: Non-technical
Karim Chine (RosettaHUB)
Karim Chine offers an overview of rosettaHUB (which aims to establish a global open data science metacloud centered on usability, reproducibility, auditability, and shareability) and shares the results of the rosettaHUB/AWS Educate initiative, which involved 30 higher education institutions and research labs and over 3,000 researchers, educators, and students.