Each big data application and use case has different performance and operational requirements. Choosing the Hadoop deployment option that is best suited for a given application depends on a number of factors. Thomas Phelan and Joel Baxter discuss the advantages and disadvantages of running various Hadoop use cases and applications in a range of runtime environments and deployment options.
They then provide a set of rules to help users evaluate big data runtime environments and deployment options to determine which is best suited for a given application. The answer is not always obvious.
Thomas and Joel conclude with a Q&A session, where they will run real-world Hadoop use cases through the proposed Hadoop environment selection criteria and ask attendees to select the optimal environment. These answers will then be compared against the Hadoop runtime environments actually selected by the customers running these real-world applications and use cases.
Thomas Phelan is cofounder and chief architect of BlueData. Previously, he was a member of the original team at Silicon Graphics that designed and implemented XFS, the first commercially available 64-bit file system. He was also an early employee at VMware, where he was a senior staff engineer and a key member of the ESX storage architecture team. There he designed and developed the ESX storage I/O load-balancing subsystem and modular pluggable storage architecture, and led teams working on many key storage initiatives such as the cloud storage gateway and vFlash.
Joel Baxter is an engineer at BlueData, where he focuses on virtualization, containers, and Hadoop-related technologies to build an infrastructure platform for big data analytics. His background is in the provisioning and configuration of virtual compute, storage, and networking to serve the needs of application clusters. Before joining BlueData, Joel did a 9-year tour at VMware on the vCenter and vSphere teams, developing the management layer for vMotion, the vSphere Distributed Switch, and policy-based storage management and related vSphere features such as Virtual SAN.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.