Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Unified integration for data lakes and modern data applications

Jonathan Gray (Cask)
2:55pm–3:35pm Wednesday, 09/28/2016
Location: 1 E 09
Average rating: 4.33 (3 ratings)

What you'll learn

  • Understand the Cask Data Application Platform (CDAP), the first unified integration platform for big data and the IoT

Description

    Building, running, and governing a data lake and production data applications on Hadoop is often a difficult process, marked by slow development cycles and painful operations. Traditional development tools and techniques are missing from the Hadoop ecosystem, and mastering data ingestion, data integration, enterprise governance, and security has become a formidable challenge when building big data solutions. The challenge only increases as the Hadoop ecosystem continues to grow, use cases mature, SLAs intensify, and services become customer-facing and revenue-generating. The IT organization owns the task of mitigating these issues. More importantly, it also has an opportunity to help the business reduce time to insights and make better decisions faster by providing it with a modern self-service environment for its data.

    Jonathan Gray proposes a modern, unified integration architecture that helps IT mitigate these issues while giving the business that self-service environment. Drawing on his experiences as an early committer on Apache HBase, building real-time systems on Hadoop at Facebook, and working with customers at Cask, Jonathan explores the benefits of the Cask Data Application Platform (CDAP), the first unified integration platform for big data and the IoT. CDAP ensures data and process consistency between applications and underlying infrastructure technologies, across multiple environments, and between different parts of the IT organization. It also provides a single environment for design, operations, data science, and governance for data lakes, data applications, and the IoT. Jonathan discusses the requirements for building and running modern production applications on Hadoop and outlines the architecture CDAP offers to address these challenges in the context of common use cases.

    Jonathan Gray


    Jonathan Gray is the founder and CEO of Cask. Jonathan is an entrepreneur and software engineer with a background in startups, open source, and all things data. Previously, he was a software engineer at Facebook, where he helped drive HBase engineering efforts, including Facebook Messages and several other large-scale projects, from inception to production. An open source evangelist, Jonathan was responsible for helping build the Facebook engineering brand through developer outreach and refocusing the open source strategy of the company. Prior to Facebook, Jonathan founded, where he became an early adopter of Hadoop and HBase. He is now a core contributor and active committer in the community. Jonathan holds a bachelor’s degree in electrical and computer engineering from Carnegie Mellon University.