Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Running multidisciplinary big data workloads in the cloud

Sudhanshu Arora (Cloudera), Stefan Salandy (Cloudera), Suraj Acharya (Cloudera), Brandon Freeman (Cloudera), Jason Wang (Cloudera), Shravan Pabba (Cloudera)
1:30pm–5:00pm Tuesday, 09/11/2018

Who is this presentation for?

  • Data engineers, data scientists, BI engineers, analytic engineers, and those in IT

Prerequisite knowledge

  • Familiarity with public cloud concepts

Materials or downloads needed in advance

  • A WiFi-enabled laptop (If you want to use the CLI, you need to have Python 3.6 installed and have terminal access.)
  • An AWS account and credentials set up prior to the tutorial
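The two prerequisites above can be checked before the tutorial with a short preflight script. This is only a sketch under stated assumptions: it reads "Python 3.6" as 3.6 or newer, and it looks for AWS credentials in the two standard places (the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables, or a `~/.aws/credentials` file). The function names `python_ok` and `aws_credentials_present` are hypothetical helpers, not part of any Cloudera or AWS tool, and the check does not confirm that the credentials are valid.

```python
import os
import sys
from pathlib import Path


def python_ok(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the interpreter is at least the assumed minimum version."""
    return tuple(version_info[:2]) >= minimum


def aws_credentials_present(env=os.environ, home=None):
    """Heuristic: True if AWS credentials appear to be configured.

    Checks the standard environment variables first, then the default
    shared credentials file. It does not verify the keys with AWS.
    """
    has_env_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(
        env.get("AWS_SECRET_ACCESS_KEY")
    )
    home_dir = Path(home) if home is not None else Path.home()
    has_file = (home_dir / ".aws" / "credentials").is_file()
    return has_env_keys or has_file


if __name__ == "__main__":
    print("Python 3.6+:", python_ok())
    print("AWS credentials found:", aws_credentials_present())
```

Running the script from the terminal you plan to use for the CLI gives a quick yes/no on each item; a `False` on the second line usually means the credentials still need to be configured.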

What you'll learn

  • Learn how to successfully run a data analytics pipeline in the cloud and integrate data engineering and data analytic workflows
  • Understand considerations and best practices for data analytics pipelines in the cloud
  • Explore approaches for sharing metadata across workloads in a big data PaaS

Description

Organizations now run diverse, multidisciplinary big data workloads that span data engineering, analytic database, and data science applications. Many of these workloads operate on the same underlying data, and the workloads themselves can be transient or long-running. One of the challenges is keeping the data context consistent across these workloads.

Sudhanshu Arora, Stefan Salandy, Suraj Acharya, Brandon Freeman, Jason Wang, and Shravan Pabba demonstrate how to manage the shared data experience so that it stays consistent across all of these workloads. You’ll learn how to run a data analytics pipeline in the cloud, integrate data engineering and data analytic workflows, and explore considerations and best practices for data analytics pipelines in the cloud. Along the way, you’ll see how to share metadata across workloads in a big data PaaS.

You’ll use the Cloudera Altus PaaS offering, powered by Cloudera Altus SDX, to run various big data workloads.

Sudhanshu Arora

Cloudera

Sudhanshu Arora is a software engineer at Cloudera, where he leads the development for data management and governance solutions. Previously, Sudhanshu was with the platform team at Informatica, where he helped design and implement its next-generation metadata repository.

Stefan Salandy

Cloudera

Stefan Salandy is a systems engineer at Cloudera.

Suraj Acharya

Cloudera

Suraj Acharya is a software engineer on the cloud team at Cloudera.

Brandon Freeman

Cloudera

Brandon Freeman is a Mid-Atlantic region strategic system engineer at Cloudera, specializing in infrastructure, the cloud, and Hadoop. Previously, Brandon was an infrastructure architect at Explorys, working in operations, architecture, and performance optimization for the Cloudera Hadoop environments, where he was responsible for designing, building, and managing many large Hadoop clusters.

Jason Wang

Cloudera

Jason Wang is a software engineer at Cloudera focusing on the cloud.

Shravan Pabba

Cloudera

Shravan (Sean) Pabba is a principal systems engineer at Cloudera, where he helps customers and prospects adopt, architect, and build applications using the Cloudera platform. His current area of focus is Cloudera Altus. Before Cloudera, Sean worked as a solutions architect at various companies, including GigaSpaces and IBM, where he was involved in the architecture, design, and development of distributed and mainframe applications.

Comments

Brandon Freeman | SALES ENGINEER
09/19/2018 12:24pm EDT

The slides are now live.

Ramesh Kanchanam | PRINCIPAL DATA ARCHITECT
09/19/2018 7:51am EDT

I don’t see the slides online. When will they be available?