Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Deploying Hadoop on user namespace containers

Abin Shahab (Altiscale)
11:00am–11:40am Thursday, 03/31/2016
Data Innovations

Location: 230 C

Prerequisite knowledge

Attendees should have a basic knowledge of Hadoop and Docker containers.

Description

Docker is a very popular container technology. A Docker container provides an isolated, VM-like environment that can be treated as a machine but is much lighter than a VM. Containers provide this isolation by taking advantage of Linux namespaces, which allow per-process isolation and sharing of user (UID), mount (filesystem), network, PID, UTS, and IPC resources.
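
As a rough illustration of per-process namespaces, the Python sketch below reads the namespace links that Linux exposes under /proc/<pid>/ns and reports which kinds two processes share. The helper names (parse_ns_link, read_namespaces, shared_kinds) are hypothetical and chosen for this example only; this is not how Docker or Altiscale inspects containers.

```python
import os

# Namespace kinds named in the description, as they appear under /proc/<pid>/ns.
NAMESPACE_KINDS = ("user", "mnt", "net", "pid", "uts", "ipc")


def parse_ns_link(link):
    """Split a link target such as 'pid:[4026531836]' into ('pid', 4026531836)."""
    kind, _, rest = link.partition(":")
    return kind, int(rest.strip("[]"))


def read_namespaces(pid="self"):
    """Return {kind: inode} for a process, or {} where /proc/<pid>/ns is unavailable."""
    result = {}
    for kind in NAMESPACE_KINDS:
        try:
            link = os.readlink(f"/proc/{pid}/ns/{kind}")
        except OSError:
            continue  # non-Linux system, older kernel, or insufficient permission
        parsed_kind, inode = parse_ns_link(link)
        result[parsed_kind] = inode
    return result


def shared_kinds(a, b):
    """Namespace kinds in which two processes are NOT isolated from each other."""
    return sorted(k for k in a if k in b and a[k] == b[k])
```

Two processes are in the same namespace of a given kind when their links carry the same inode number, so comparing read_namespaces() for a host process and a containerized process would show which resources are isolated and which are shared.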

Altiscale deploys Hadoop in its data centers to allow customers to process petabytes of data without worrying about managing Hadoop clusters. Altiscale clusters grow and shrink elastically with the compute and storage usage of the customer. This elasticity is achieved by growing and shrinking the pool of slave nodes. Docker containers enable Altiscale to launch NodeManagers and DataNodes in under a second and respond rapidly to changing customer demands. Applying user namespaces on top of Docker also offers more isolation than Docker alone provides, so that no user inside these Hadoop slaves has root privileges.
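
To make the root-privilege point concrete: a Linux user namespace maps IDs inside the namespace to different IDs on the host, using lines of the form "first-ID-inside first-ID-outside length" in /proc/<pid>/uid_map. The sketch below translates a container UID to its host UID under such a mapping. The mapping values are hypothetical and the function names are invented for this example; Altiscale's actual configuration is not described in this abstract.

```python
def parse_id_map(text):
    """Parse uid_map-style lines: '<first id inside> <first id outside> <length>'."""
    ranges = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(x) for x in line.split())
            ranges.append((inside, outside, length))
    return ranges


def to_host_id(ranges, container_id):
    """Translate an ID seen inside the namespace to the host's ID, or None if unmapped."""
    for inside, outside, length in ranges:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None


# Hypothetical mapping: 65536 container IDs starting at host ID 100000.
example_map = parse_id_map("0 100000 65536\n")
print(to_host_id(example_map, 0))     # 100000: container root is unprivileged on the host
print(to_host_id(example_map, 1000))  # 101000
```

Under a mapping like this, a process that runs as UID 0 inside the container appears to the host as an ordinary unprivileged user.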

Abin Shahab walks attendees through Altiscale’s elastic cluster model, describes the design decisions behind it, and discusses the issues encountered and fixed along the way. Abin then looks to the future, describing improvements in Docker and Hadoop that will enable better isolation and elasticity.

Abin Shahab

Altiscale

Abin Shahab is a senior software engineer at Altiscale as well as a contributor to Docker and LXC. Abin’s work at Altiscale is focused on multitenant Hadoop clusters using Docker containers. Prior to joining Altiscale, Abin worked on graph databases and search engines at Guidewire, Symantec, and Vivisimo (IBM). Abin holds a master’s degree in software engineering from Carnegie Mellon University and a bachelor’s degree in computer science from the University of Arizona.