A major challenge in today’s big data world is getting data into a data lake in a simple, automated way. Many organizations use Python or another language to code their way through these processes. But when the number of data sources grows into the hundreds, or often thousands, coding a script for each source becomes time-consuming and extremely difficult to manage and maintain.
Developers need the ability to create a simple process that can support many disparate data sources by detecting metadata and passing that metadata through what Pentaho calls “metadata injection.” With this capability, teams can drive hundreds of data ingestion and preparation processes through just a few transformations, reducing development time and risk and speeding time to insights. Chuck Yarbrough outlines this template-driven data ingestion and explains how to simplify and automate your data ingestion processes.
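The template-driven idea above can be sketched in a few lines of plain Python. This is only an illustration of the general pattern, not Pentaho's metadata injection API: the helper names `detect_metadata` and `run_template`, the sample data, and the type-inference rule (try `int`, then `float`, else keep the string) are all assumptions made for this example.

```python
import csv
import io
from typing import Callable, Iterable


def detect_metadata(rows: list[dict[str, str]]) -> dict[str, Callable[[str], object]]:
    """Infer one converter per column from the rows (hypothetical helper)."""

    def infer(values: list[str]) -> Callable[[str], object]:
        # Try the narrowest type first; fall back to plain strings.
        for caster in (int, float):
            try:
                for value in values:
                    caster(value)
                return caster
            except ValueError:
                continue
        return str

    return {col: infer([row[col] for row in rows]) for col in rows[0].keys()}


def run_template(source: Iterable[str], delimiter: str = ",") -> list[dict[str, object]]:
    """One generic 'transformation': the detected metadata drives the parsing."""
    rows = list(csv.DictReader(source, delimiter=delimiter))
    if not rows:
        return []
    metadata = detect_metadata(rows)  # column name -> converter
    return [{col: cast(row[col]) for col, cast in metadata.items()} for row in rows]


# Two unrelated sources (made-up data) handled by the same template.
orders = io.StringIO("order_id,amount\n1,9.50\n2,12.00\n")
sensors = io.StringIO("device;reading\nA7;41\nB2;39\n")

orders_rows = run_template(orders)
sensors_rows = run_template(sensors, delimiter=";")
print(orders_rows)
print(sensors_rows)
```

The point of the sketch is that adding a new source means supplying its location and delimiter, not writing another script; the column names and types are discovered at run time and passed into the shared template.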
This session is sponsored by Pentaho.
Chuck Yarbrough is the senior director of solutions marketing and management at Pentaho, a leading big data analytics company that helps organizations engineer big data connections, blend data, and report on and visualize all of their data. Chuck is responsible for creating and driving Pentaho solutions that leverage the Pentaho platform, enabling customers to implement big data solutions more quickly and achieve greater ROI faster. Chuck has more than 20 years of experience helping organizations use technology to run, manage, and transform their business through better use of data. A lifelong participant in the data game, Chuck has held leadership roles at Deloitte Consulting, SAP Business Objects, Hyperion, and National Semiconductor.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.