Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Hadoop in the cloud: Good fit or round peg in a square hole?

Thomas Phelan (BlueData), Joel Baxter (BlueData)
1:50pm–2:30pm Wednesday, 03/30/2016
Hadoop Use Cases

Location: 230 A
Average rating: 4.40 (10 ratings)

Each big data application and use case has different performance and operational requirements. Choosing the Hadoop deployment option that is best suited for a given application depends on a number of factors. Thomas Phelan and Joel Baxter discuss the advantages and disadvantages of running various Hadoop use cases and applications in each of the following environments:

  • On-premises, with bare metal
  • On-premises, with hypervisor-based virtualization
  • On-premises, with containers (i.e., operating system virtualization)
  • In the public cloud (e.g., Amazon EMR or Azure HDInsight)

They then provide a set of rules to help users evaluate big data runtime environments and deployment options to determine which is best suited for a given application. The answer is not always obvious.

Thomas and Joel conclude with a Q&A session in which they run real-world Hadoop use cases through the proposed environment selection criteria and ask attendees to select the optimal environment. These answers are then compared against the Hadoop runtime environments actually selected by the customers running these real-world applications and use cases.

Thomas Phelan


Thomas Phelan is cofounder and chief architect of BlueData. Previously, he was a member of the original team at Silicon Graphics that designed and implemented XFS, the first commercially available 64-bit file system. He was also an early employee at VMware, where he was a senior staff engineer and a key member of the ESX storage architecture team. There he designed and developed the ESX storage I/O load-balancing subsystem and modular pluggable storage architecture, and led teams working on key storage initiatives such as the cloud storage gateway and vFlash.

Joel Baxter


Joel Baxter is an engineer at BlueData, where he focuses on virtualization, containers, and Hadoop-related technologies to build an infrastructure platform for big data analytics. His background is in the provisioning and configuration of virtual compute, storage, and networking to serve the needs of application clusters. Before joining BlueData, Joel spent nine years at VMware on the vCenter and vSphere teams, developing the management layer for vMotion, the vSphere Distributed Switch, policy-based storage management, and related vSphere features such as Virtual SAN.