Presented By O'Reilly and Cloudera
Make Data Work
Dec 4–5, 2017: Training
Dec 5–7, 2017: Tutorials & Conference

Real-world patterns for continuously deployed advanced analytics

Graham Gear (Cloudera)
12:05pm–12:45pm Thursday, December 7, 2017

Who is this presentation for?

  • Data engineers and scientists, developers, those in operations, analysts, and IT managers

Prerequisite knowledge

  • A basic understanding of the Hadoop ecosystem (MR, Spark, Impala, notebooks, etc.)

What you'll learn

  • Explore successful deployment patterns for Agile advanced analytics


The core drivers of a Hadoop deployment have traditionally been scale, flexibility, and economics. Flexibility is often touted as the ability to add a new dimension, dataset, or complex line of questioning without investing in an expensive, difficult multimonth project. It is true that Hadoop’s flexibility lets us collapse these projects into tasks and shorten the window in which they can be delivered, but how far can we push this?

Graham Gear draws on real-world processes and systems to explain how it’s possible to apply continuous delivery techniques to advanced analytics, realizing business value earlier and more safely. Along the way Graham shows just how far we can push the flexibility of a modern advanced analytics platform.

Topics include:

  • Can anyone deploy a Hadoop platform and the advanced analytics applications running on top of it in the same way the likes of Google, Facebook, and Netflix deploy theirs?
  • How do we ensure the workloads that aren’t one-offs can be operationalized and brought within the scope of a supporting IT organization and not left to languish as a science experiment?
  • Can we deploy continuously, treating each commit as a release candidate destined for production?
  • Can we safely take the lead time from idea to realized business value from years to months, weeks, days, or even hours, getting more insights and models into production faster?
  • Where is the industry today: what is the state of the art and what should your organization be aiming for?

Graham Gear


Graham Gear is director of system engineering at Cloudera and an Apache Hadoop committer. Having thirstily read the Google papers that inspired Hadoop and watched as the community coalesced, Graham could clearly see the huge potential of the Hadoop ecosystem and has been contributing to it and helping organizations take advantage of it for many years. Previously, Graham delivered large-scale distributed systems with a keen analytical focus; he began his career implementing sonar algorithms leveraging MPI on large Beowulf clusters at a defense research institution.