Often, when an organization first ventures into the cloud for running Hadoop clusters, it carries over practices that worked well on-premises, along with the idea that each cluster should last a long time and be carefully tended. It soon becomes apparent that there is a different, perhaps more effective way: deploying clusters on demand, scaling them as needed, and destroying them to save costs when demand slackens.
The problem is that it’s a lot of work to deploy a cluster in the cloud. There’s still the usual installation and configuration for all of the cluster services, but now you also need to think about allocating instances, placing them into your virtual networks, setting up security, creating new accounts, and more. How can all of that be done quickly enough to support an agile system of cloud cluster management?
Drawing on ideas from his book Moving Hadoop to the Cloud, Bill Havanki explains how you can automate the creation of new clusters from scratch and use metrics gathered through the cloud provider to scale them up. Moving Hadoop to the Cloud covers many of the techniques you need, including creating instance images with most of your work baked in ahead of time, using automation to handle the rest of the work, and devising your own cloud-based metrics, tailored to Hadoop clusters, that tell you when your cluster could use more resources. Bill then takes you even further, demonstrating how to automate the creation of entire clusters, relying on the cloud provider API and your own scripting to make it happen. Once you can automatically create new clusters, you can also trigger similar actions from your metrics to scale your cluster up in response to demand, fully harnessing cloud flexibility for effective cluster management.
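As a rough illustration of how a Hadoop-specific metric might drive a scale-up decision, here is a minimal Python sketch. It is not taken from the talk or the book. The ClusterMetrics fields, the workers_to_add function, and the idea of using a backlog of pending containers as the signal are all assumptions made for this example, and the sketch deliberately leaves out any cloud provider API calls.

```python
from dataclasses import dataclass


@dataclass
class ClusterMetrics:
    """Hypothetical snapshot of cluster load (all field names are illustrative)."""
    worker_count: int           # workers currently in the cluster
    pending_containers: int     # containers waiting for resources
    containers_per_worker: int  # rough capacity of a single worker


def workers_to_add(metrics: ClusterMetrics, max_workers: int) -> int:
    """Return how many workers to add to absorb the backlog, capped at max_workers."""
    if metrics.pending_containers <= 0 or metrics.containers_per_worker <= 0:
        return 0
    # Ceiling division: number of workers needed to run all pending containers.
    needed = -(-metrics.pending_containers // metrics.containers_per_worker)
    # Never grow the cluster beyond the configured ceiling.
    headroom = max(0, max_workers - metrics.worker_count)
    return min(needed, headroom)
```

In a real deployment, the returned count would be handed to whatever provisioning script or provider API call allocates the new instances. The cap keeps a sudden spike in demand from growing the cluster, and its cost, without bound.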
Bill Havanki is a software engineer at Cloudera, where he contributes to Hadoop components and systems for deploying Hadoop clusters into public cloud services. Previously, Bill worked for 15 years developing software for government contracts, focusing mostly on analytic frameworks and authentication and authorization systems. He holds a BS in electrical engineering from Rutgers University and an MS in computer engineering from North Carolina State University. A New Jersey native, Bill currently lives near Annapolis, Maryland, with his family.
©2017, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com