Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Analytics in the cloud: Building a modern cloud-based big data warehouse

Greg Rahn (Cloudera)
14:05–14:45 Wednesday, 23 May 2018
Average rating: 3.29 (7 ratings)

Who is this presentation for?

  • Data engineers and architects, enterprise architects, DBAs, and data scientists

Prerequisite knowledge

  • A basic understanding of the cloud and data warehousing

What you'll learn

  • Explore available cloud topologies and how to map use cases to them
  • Learn how and when to take advantage of object storage versus local storage, how the on-premises multitenancy model maps to the cloud, and how to translate enterprise security and governance to the cloud

Description

Organizations are increasingly looking to move their analytics and data warehouses to the cloud—not only to take advantage of the flexibility new technologies can provide but also to empower their end users with simple provisioning and instant access to data to better support a self-service BI model. However, achieving success in moving analytic workloads to the cloud requires understanding the architectural decisions that will need to be made and the trade-offs in making them.

Greg Rahn explains how to build a big data warehouse that realizes the full potential of the cloud while minimizing friction for self-service BI and analytics. He covers when it’s best to use object storage versus local storage, how to design for multitenant isolation, how to tune performance for SLAs, and other considerations. Greg also discusses the workload considerations when evaluating the cloud and the common architectural patterns for optimizing price and performance.

Topics include:

  • When to use transient clusters versus long-lived clusters
  • The trade-offs between object stores and locally attached storage
  • Architectures for multitenancy to enable self-service access without risking predictability
  • Translating enterprise security and governance to cloud environments

Greg Rahn

Cloudera

Greg Rahn is director of product management at Cloudera, where he’s responsible for driving SQL product strategy as part of the company’s data warehouse product team, including working directly with Impala. For over 20 years, Greg has worked with relational database systems in a variety of roles, including software engineering, database administration, database performance engineering, and most recently product management, giving him a holistic view of the database market. Previously, Greg was part of the esteemed Real-World Performance Group at Oracle and was the first member of the product management team at Snowflake Computing.