Docker is a widely used container technology. A Docker container provides an isolated, VM-like environment that can be treated as a machine but is much lighter than a VM. Containers achieve this isolation through Linux namespaces, which allow per-process isolation and sharing of UID, mount (filesystem), network, PID, UTS, and IPC resources.
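To make the namespace idea concrete, here is a minimal Python sketch that lists the namespaces a process belongs to. It assumes a Linux host, where the kernel exposes one symbolic link per namespace under /proc/<pid>/ns. The helper names (read_namespaces, shared_namespaces) are illustrative and are not part of Docker or of the talk.

```python
import os

# Namespace kinds named in the text, keyed by the entry names the Linux
# kernel exposes under /proc/<pid>/ns.
NAMESPACE_KINDS = {
    "user": "UID",
    "mnt": "mount (filesystem)",
    "net": "network",
    "pid": "PID",
    "uts": "UTS",
    "ipc": "IPC",
}


def read_namespaces(ns_dir="/proc/self/ns"):
    """Return {entry name: namespace identifier} for one process.

    Returns an empty dict when ns_dir does not exist (for example on a
    non-Linux host).
    """
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        path = os.path.join(ns_dir, name)
        # Each entry is a symlink whose target looks like "net:[4026531992]".
        if name in NAMESPACE_KINDS and os.path.islink(path):
            result[name] = os.readlink(path)
    return result


def shared_namespaces(a, b):
    """Names of the namespaces that two processes have in common."""
    return sorted(name for name in a if name in b and a[name] == b[name])


if __name__ == "__main__":
    for name, ident in read_namespaces().items():
        print(f"{NAMESPACE_KINDS[name]:<20} {ident}")
```

Two processes are in the same namespace when the identifiers match. Comparing the output for a process on the host with the output for a process inside a container shows which resources are isolated and which are shared.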
Altiscale deploys Hadoop in its data centers so that customers can process petabytes of data without having to manage Hadoop clusters themselves. Altiscale clusters grow and shrink elastically with each customer's compute and storage usage. This elasticity is achieved by adding and removing slave nodes. Docker containers enable Altiscale to launch NodeManagers and DataNodes in under a second and to respond rapidly to changing customer demands. Applying user namespaces on top of Docker also provides more isolation than Docker alone, so that no user inside these Hadoop slaves has root privileges.
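The effect of a user namespace on root privileges can be illustrated with a short sketch. On Linux, /proc/<pid>/uid_map holds lines of the form "inside-ID outside-ID length", which translate user IDs inside the namespace to user IDs on the host. The mapping below (container UIDs 0 to 65535 shifted to start at host UID 100000) is an assumed example, not Altiscale's actual configuration, and the function names are hypothetical.

```python
def parse_uid_map(text):
    """Parse uid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(f) for f in fields)
        entries.append((inside, outside, length))
    return entries


def to_host_uid(container_uid, entries):
    """Translate a UID inside the namespace to the host UID, or None."""
    for inside, outside, length in entries:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None  # the UID is not mapped in this namespace


# Assumed example mapping: container UIDs 0..65535 start at host UID 100000.
EXAMPLE_MAP = "0 100000 65536\n"

if __name__ == "__main__":
    entries = parse_uid_map(EXAMPLE_MAP)
    print("container root maps to host UID", to_host_uid(0, entries))
```

With a mapping like this, UID 0 inside the container corresponds to an unprivileged UID on the host, which is how a process can appear to be root inside a slave container without holding root on the underlying machine.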
Abin Shahab walks attendees through Altiscale’s elastic cluster model, describes the design decisions behind it, and discusses the issues encountered and fixed along the way. Abin then looks to the future, describing improvements in Docker and Hadoop that will enable better isolation and elasticity.
Abin Shahab is a senior software engineer at Altiscale as well as a contributor to Docker and LXC. Abin’s work at Altiscale is focused on multitenant Hadoop clusters using Docker containers. Prior to joining Altiscale, Abin worked on graph databases and search engines at Guidewire, Symantec, and Vivisimo (IBM). Abin holds a master’s degree in software engineering from Carnegie Mellon University and a bachelor’s degree in computer science from the University of Arizona.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.