Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Analytics in the cloud: Building a modern cloud-based big data warehouse

Greg Rahn (Cloudera)
11:00am–11:40am Thursday, March 8, 2018
Average rating: 3.40 out of 5 (5 ratings)

Who is this presentation for?

  • Data engineers and architects, enterprise architects, DBAs, and data scientists

Prerequisite knowledge

  • A basic understanding of on-premises big data warehousing technologies and cloud concepts (object storage, cloud virtual machines, etc.)

What you'll learn

  • Understand the different cloud topologies available and how to map use cases to them, how and when to take advantage of object storage versus local storage, how the on-prem multitenancy model maps to the cloud, and how to translate enterprise security and governance to the cloud

Description

Organizations are increasingly looking to move their analytics and data warehouses to the cloud, not only to take advantage of the flexibility new technologies can provide but also to empower their end users with simple provisioning and instant access to data to better support a self-service BI model. But successfully transitioning analytic workloads to the cloud requires an understanding of the architectural decisions that need to be made and the trade-offs involved in making them. Greg Rahn explains how to build a big data warehouse that realizes the full potential of the cloud while minimizing friction for self-service BI and analytics.

When migrating data and analytics to the cloud, you need to know when to use object storage rather than local storage, how to design for multitenant isolation, and how to tune performance for SLAs. Greg explores the workload considerations to weigh when evaluating the cloud and offers an overview of common architectural patterns for optimizing price and performance, so you can answer these questions and more.

Topics include:

  • When to use transient versus long-lived clusters
  • The trade-offs between object stores and locally attached storage
  • Architectures for multitenancy to enable self-service access without risking predictability
  • Translating enterprise security and governance to cloud environments

Greg Rahn

Cloudera

Greg Rahn is director of product management at Cloudera, where he’s responsible for driving SQL product strategy as part of the company’s data warehouse product team, including working directly with Impala. For over 20 years, Greg has worked with relational database systems in a variety of roles, including software engineering, database administration, database performance engineering, and most recently product management, providing a holistic view and expertise on the database market. Previously, Greg was part of the esteemed Real-World Performance Group at Oracle and was the first member of the product management team at Snowflake Computing.