Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Analytics in the Cloud - Building a Modern Cloud-based Big Data Warehouse

Greg Rahn (Cloudera)
14:05–14:45 Wednesday, 23 May 2018

Who is this presentation for?

Data Engineer/Architect, Enterprise Architect, DBA, Data Scientist

Prerequisite knowledge

General understanding of cloud and data warehousing.

What you'll learn

In comparing on-premises with cloud big data warehouses:

  • The different cloud topologies available and how to map use cases to them
  • How and when to take advantage of object storage vs. local storage
  • How the on-prem multi-tenancy model maps to cloud
  • How to translate enterprise security and governance to cloud


More and more organizations are looking to move their analytics and data warehouses to the cloud, not only to take advantage of the flexibility new technologies can provide but also to empower their end users with simple provisioning and instant access to data, which better supports a self-service BI model. Moving analytic workloads to the cloud successfully requires understanding the architectural decisions that need to be made and the trade-offs involved in making them. Greg Rahn discusses how to build a big data warehouse that realizes the full potential of the cloud while minimizing friction for self-service BI and analytics.
When migrating data and analytics to the cloud, you’ll need to understand when it’s best to use object storage vs. local storage, how to design for multi-tenant isolation, and how to tune performance for SLAs, among other questions. Greg covers the workload considerations to weigh when evaluating the cloud and the common architectural patterns for optimizing price and performance, so you can answer these questions and more. In particular, he will discuss:

  • When to use transient clusters vs. long-lived clusters
  • The trade-offs between object stores and locally attached storage
  • Architectures for multi-tenancy to enable self-service access without risking predictability
  • Translating enterprise security and governance to cloud environments

Greg Rahn


Greg Rahn is a director of product management at Cloudera, where he is responsible for driving SQL product strategy as part of Cloudera’s analytic database product, including working directly with Impala. Over his 20-year career, Greg has worked with relational database systems in a variety of roles, including software engineering, database administration, database performance engineering, and most recently, product management, to provide a holistic view and expertise on the database market. Previously, Greg was part of the esteemed Real-World Performance Group at Oracle and was the first member of the product management team at Snowflake Computing.
