Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Moving big data as a service to a multicloud world

Sriram Ganesan (Qubole), Prakhar Jain (Qubole)
11:00am–11:40am Wednesday, March 15, 2017
Big data and the Cloud
Location: 210 A/E
Level: Intermediate
Secondary topics: Architecture, Cloud
Average rating: 3.00 (2 ratings)

Who is this presentation for?

  • Data engineers, data team admins, and big data DevOps engineers

Prerequisite knowledge

  • A basic understanding of public cloud services and models

What you'll learn

  • Understand the differences in big data deployment models across several different public cloud platforms, including AWS, Google Cloud Platform, Microsoft Azure, and Oracle Bare Metal Cloud Service
  • Explore Qubole's Cloudman cluster management tool


Qubole started out by offering Hadoop as a service in AWS. Over time, it extended its big data capabilities beyond Hadoop and its cloud infrastructure support beyond AWS. To do this, Qubole needed to build a simple, cloud-agnostic, multipurpose provisioning tool that could be extended for further engines and further cloud support. Sriram Ganesan and Prakhar Jain describe how and why Qubole built cluster management tool Cloudman to deploy Spark, Hadoop, and other big data engines across several public IaaS cloud platforms, such as AWS, Microsoft Azure, and Oracle Public Cloud.

Topics include:

  • How Qubole minimizes the amount of time spent in deployment by relying heavily on dynamic provisioning during cluster usage
  • How Qubole minimized the footprint of software and hardware resources by making Cloudman a multitenant web service and not a client-server setup that lives within the cluster
  • The consolidation of APIs across several different public cloud platforms
  • How Qubole accomplished the above without excluding the cloud-specific features that are important to its customers, such as AWS Spot instances
  • How Qubole minimizes the amount of effort in adding and supporting a new public cloud platform
  • Qubole’s decision to build Cloudman in-house as opposed to using other available cluster orchestration systems such as Ambari, Mesos, or Docker
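
One way to picture the API consolidation described above is a small common interface with a per-cloud escape hatch for features such as Spot instances. The sketch below is purely illustrative and is not Cloudman's actual design: the names NodeRequest, CloudProvider, FakeAws, and FakeAzure, the "spot" option, and the instance type strings are all hypothetical, and the adapters only return made-up node IDs instead of calling any real cloud API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class NodeRequest:
    """Cloud-neutral description of the nodes a cluster needs (hypothetical)."""
    count: int
    instance_type: str
    # Provider-specific options, e.g. {"spot": True}; validated per provider.
    extras: dict = field(default_factory=dict)


class CloudProvider(ABC):
    """Common interface each cloud adapter implements (hypothetical)."""
    name = "base"
    supported_extras = frozenset()

    def provision(self, request: NodeRequest) -> list:
        # Reject options this cloud does not understand instead of ignoring them.
        unknown = set(request.extras) - self.supported_extras
        if unknown:
            raise ValueError(f"{self.name} does not support: {sorted(unknown)}")
        return self._launch(request)

    @abstractmethod
    def _launch(self, request: NodeRequest) -> list:
        """Return the IDs of the launched nodes."""


class FakeAws(CloudProvider):
    """Stand-in adapter; a real one would call the provider's SDK here."""
    name = "aws"
    supported_extras = frozenset({"spot"})

    def _launch(self, request: NodeRequest) -> list:
        market = "spot" if request.extras.get("spot") else "on-demand"
        return [f"aws-{market}-{request.instance_type}-{i}"
                for i in range(request.count)]


class FakeAzure(CloudProvider):
    """Stand-in adapter with no cloud-specific options."""
    name = "azure"

    def _launch(self, request: NodeRequest) -> list:
        return [f"azure-{request.instance_type}-{i}"
                for i in range(request.count)]


if __name__ == "__main__":
    request = NodeRequest(count=2, instance_type="m4.large", extras={"spot": True})
    print(FakeAws().provision(request))
```

Under these assumptions, supporting a new cloud means writing one more adapter class, while callers keep using the same provision call; cloud-specific features stay reachable through the extras field without leaking into the shared interface.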

Sriram Ganesan


Sriram Ganesan is a member of the technical staff at Qubole, where he works on HBase and cluster orchestration. Previously, Sriram was at Directi, where he worked on scaling the backend of a leading chat app. Sriram holds a bachelor of computer science engineering from the National Institute of Technology, Trichy, India.


Prakhar Jain


Prakhar Jain is a member of the technical staff at Qubole, where he works on the cluster orchestration stack. Prakhar holds a bachelor of computer science engineering from the Indian Institute of Technology, Bombay, India.