Presented By O'Reilly and Cloudera
Make Data Work
Dec 4–5, 2017: Training
Dec 5–7, 2017: Tutorials & Conference
Singapore

Rethinking data marts in the cloud: Common architectural patterns for analytics

Henry Robinson (Cloudera), Greg Rahn (Cloudera)
5:05pm–5:45pm Wednesday, December 6, 2017
Big data and the cloud, Data engineering and architecture
Location: 310/311 Level: Intermediate

Who is this presentation for?

  • Architects and those in IT

Prerequisite knowledge

  • A basic understanding of SQL and cloud principles

What you'll learn

  • Explore the common cloud architectural patterns to optimize price and performance
  • Learn the trade-offs when running analytics in cloud environments

Description

Cloud environments will likely play a key role in your business’s future. Given the allure of on-demand provisioning and usage-based cost optimization, that is no surprise. However, to realize the full potential of the cloud, it’s critical to understand how best to use these environments for different workloads without disrupting the business.

To migrate data marts and analytics to the cloud, you’ll need to understand, among other things, when it’s best to use object storage versus local storage, how to design for multitenant isolation, and how to tune performance for SLAs. Henry Robinson and Greg Rahn explore the workload considerations when evaluating the cloud and discuss the common architectural patterns to optimize price and performance. You’ll learn how to incorporate the cloud into your overall infrastructure landscape and the benefits of a heterogeneous strategy.

Topics include:

  • When to use transient clusters versus long-lived clusters
  • The trade-offs between object stores and locally attached storage
  • Architectures for multitenancy
  • Translating enterprise security and governance to cloud environments

Henry Robinson

Cloudera

Henry Robinson is a software engineer at Cloudera. For the past few years, he has worked on Apache Impala, an SQL query engine for data stored in Apache Hadoop, and leads the scalability effort to bring Impala to clusters of thousands of nodes. Henry’s main interest is in distributed systems. He is a PMC member for the Apache ZooKeeper, Apache Flume, and Apache Impala open source projects.

Greg Rahn

Cloudera

Greg Rahn is a director of product management at Cloudera, where he is responsible for driving SQL product strategy as part of Cloudera’s analytic database product, including working directly with Impala. Over his 20-year career, Greg has worked with relational database systems in a variety of roles, including software engineering, database administration, database performance engineering, and most recently, product management, to provide a holistic view and expertise on the database market. Previously, Greg was part of the esteemed Real-World Performance Group at Oracle and was the first member of the product management team at Snowflake Computing.
