Managing the System Lifecycle and Configuration of Apache Hadoop and Other Distributed Systems

Philip Zeyliger (Cloudera)
Operations Ballroom EFGH
Please note: to attend, your registration must include Workshops.

We’ve built a system (working name: Cloudera Management Framework, or CMF)
to operate and configure existing distributed systems in datacenters.
Distributed systems must be managed as a whole, not merely
as the sum of the individual computers involved. For example,
configuration isn’t just the parameters and configuration
files for individual processes; it also covers which machines play which part.
Operationally, managing a distributed system as if it were merely its parts causes headaches, because any machine can be down at any point.

In this talk, we’ll focus on the architecture of CMF and describe
the building blocks and concepts needed to manage
complex distributed systems. Whereas existing systems
(e.g., Chef, Puppet, CFEngine, as well as VM-based systems like EC2) focus on configuring
individual machines, CMF focuses on operating the lifecycle
of clusters running HDFS, MapReduce, HBase, ZooKeeper, and Flume.
To do this, we introduce the notion of ‘casting’, implement
distributed process supervision, and discuss both model-based
and observation-based views of resources.
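
As a rough illustration of the model-based versus observation-based distinction, the sketch below treats the intended cluster as a set of (host, service, role) assignments and compares it with what is reported as running. Everything here is hypothetical: the host names, the `reconcile` function, and the data layout are assumptions made for illustration and are not taken from CMF.

```python
# Hypothetical sketch: names and structure are illustrative, not CMF's API.

# Model-based view: the intended cluster, i.e. which host plays which role.
MODEL = {
    ("host1", "HDFS", "NameNode"),
    ("host2", "HDFS", "DataNode"),
    ("host3", "HDFS", "DataNode"),
    ("host1", "ZooKeeper", "Server"),
}

# Observation-based view: what is reported as running right now.
OBSERVED = {
    ("host1", "HDFS", "NameNode"),
    ("host2", "HDFS", "DataNode"),
    ("host1", "ZooKeeper", "Server"),
    ("host4", "HDFS", "DataNode"),
}


def reconcile(model, observed):
    """Compare intended role assignments with the observed ones.

    Returns two sorted lists: roles in the model that are not running,
    and roles that are running but absent from the model.
    """
    to_start = sorted(model - observed)
    to_stop = sorted(observed - model)
    return to_start, to_stop


to_start, to_stop = reconcile(MODEL, OBSERVED)
print("missing from cluster:", to_start)
print("not in model:", to_stop)
```

Keeping the role assignments as cluster-level data, rather than per-machine settings, is what lets a single comparison cover the whole cluster even when some hosts are down.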


Philip Zeyliger


Philip Zeyliger came to Cloudera from Google, where he worked on scalable storage for user-facing applications. Before that, he worked in finance, at D.E. Shaw. Philip holds a bachelor’s degree in mathematics from Harvard University. His interests include systems and databases. He’s a committer on the Apache Avro project.



Ben Rockwood
06/14/2011 2:15pm PDT

In hindsight I should have thoroughly read the full description. The title of this talk was misleading in itself and should have been: “Managing the System Lifecycle and Configuration of Apache Hadoop using Cloudera CMF”

I had wrongly inferred from the title alone that this was a ZooKeeper talk, and it is not. The talk came off like an infomercial for Cloudera, which is fine, but not what I (or the other attendees I ran into afterwards) expected.

That said, it’s a good talk for those new to Hadoop’s architecture, and the slides in Section 4 will be helpful for those getting started, especially if they haven’t read “Hadoop: The Definitive Guide”.
