Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Filling the data lake

Chuck Yarbrough (Pentaho)
2:05pm–2:45pm Wednesday, 09/28/2016
Location: 1B 01/02
Average rating: 2.50 (2 ratings)

What you'll learn

  • Understand a solution to getting data into a data lake that uses metadata to autogenerate ingestion processes
Description

    A major challenge in today’s big data world is getting data into a data lake in a simple, automated way. Many organizations use Python or another language to code their way through these processes. But when the number of data sources grows into the hundreds (or often thousands), coding a script for each source becomes time-consuming and extremely difficult to manage and maintain.

    Developers need the ability to create a simple process that can support many disparate data sources by detecting metadata and passing that metadata through what Pentaho calls “metadata injection.” With this capability, teams can drive hundreds of data ingestion and preparation processes through just a few transformations, reducing development time and risk and speeding time to insights. Chuck Yarbrough outlines this template-driven data ingestion and explains how to simplify and automate your data ingestion processes.
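
    The template-driven idea can be illustrated with a minimal Python sketch. This is not Pentaho's metadata injection API; the function names (`detect_metadata`, `ingest`) and the sample data are hypothetical, and the type detection is deliberately naive (it inspects only the first data row). The point is that a single generic routine handles differently shaped sources because the detected metadata, not the code, describes each source.

```python
import csv
import io


def detect_metadata(text, delimiter=","):
    """Infer column names and simple types from a delimited text sample.

    Hypothetical helper: looks only at the header and the first data row.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    first_row = next(reader)
    types = []
    for value in first_row:
        try:
            int(value)
            types.append(int)
        except ValueError:
            try:
                float(value)
                types.append(float)
            except ValueError:
                types.append(str)
    return {"delimiter": delimiter, "columns": list(zip(header, types))}


def ingest(text, metadata):
    """Generic template: the injected metadata decides how rows are parsed."""
    reader = csv.reader(io.StringIO(text), delimiter=metadata["delimiter"])
    next(reader)  # skip the header row
    records = []
    for row in reader:
        record = {
            name: cast(value)
            for (name, cast), value in zip(metadata["columns"], row)
        }
        records.append(record)
    return records


# Two made-up sources with different layouts, handled by the same template.
orders = "id,amount\n1,9.50\n2,12.00\n"
sensors = "device|reading\nA7|21\nB2|19\n"

order_rows = ingest(orders, detect_metadata(orders))
sensor_rows = ingest(sensors, detect_metadata(sensors, delimiter="|"))
```

    Adding another source in this sketch means supplying its metadata rather than writing another script, which is the maintenance benefit the session describes.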

    This session is sponsored by Pentaho.

    Chuck Yarbrough


    Chuck Yarbrough is the senior director of solutions marketing and management at Pentaho, a leading big data analytics company that helps organizations engineer big data connections, blend data, and report on and visualize all of their data. Chuck is responsible for creating and driving Pentaho solutions that leverage the Pentaho platform, enabling customers to implement big data solutions more quickly and achieve greater ROI faster. Chuck has more than 20 years of experience helping organizations use technology to run, manage, and transform their business through better use of data. A lifelong participant in the data game, Chuck has held leadership roles at Deloitte Consulting, SAP Business Objects, Hyperion, and National Semiconductor.