Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

Multi-tenant, multi-cluster, and multi-container Apache HBase deployment

Jonathan Hsieh (Cloudera, Inc), Dima Spivak (StreamSets)
1:15pm–1:55pm Wednesday, 09/30/2015
Production Ready Hadoop
Location: 3D 05/08
Level: Intermediate

With the number of production Apache HBase clusters increasing, there is greater demand for running multiple applications on single clusters, for data reliability and availability, and for developers to better test their applications. We’ll lay out how these new demands can be addressed using multi-tenant, multi-cluster, or multi-container deployments.

A multi-cluster approach is a viable option when single-cluster fault tolerance is insufficient. We will discuss several deployments and strategies in which availability-sensitive applications benefit from geographically distributed clusters.

Multi-tenancy becomes vital as the number of users and use cases for Apache Hadoop and Apache HBase continues to grow. The ability to handle multiple workloads and multiple frameworks on less hardware improves cost efficiency. We’ll present current solutions and new features, such as request scheduling, that can help overcome the isolation challenges of these deployments.

Multi-container deployments of HBase on a single host using Docker, a container-based virtualization platform, have driven Cloudera’s efforts to improve the quality of HBase releases. We will discuss how a “distributed” HBase cluster can be quickly deployed, and how it can help test HBase applications with less hardware while reducing complexity.

Jonathan Hsieh

Cloudera, Inc

Jonathan Hsieh is a software engineer at Cloudera. He is an Apache HBase committer and the founder of Apache Flume.

Dima Spivak

StreamSets


Dima Spivak is a software engineer at StreamSets, where he works on test infrastructure. He is also a committer and PMC member on the Apache HBase project. Before joining StreamSets, he developed test infrastructure at Cloudera.