Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

High-performance data lakes for AI workloads using object storage (sponsored by Minio)

11:50am-12:30pm Wednesday, March 27, 2019
Location: 2022

What you'll learn

  • Learn how a team at a large IT services management company created a high-performance data lake using object storage for consumption by big data workloads


Traditionally, big data application teams have relied on the power of MapReduce to process large amounts of data. The current generation of AI and big data applications offers interactive rather than batch processing. These AI workloads are no longer constrained by the performance of MapReduce and HDFS but instead achieve massive levels of performance by processing data in memory. In effect, these applications have become stateless in order to circumvent the limitations of the data infrastructure that feeds them.

Modern data lakes, built on disaggregated infrastructure, can provide the performance and flexibility required to run cloud-native applications such as AI and ML workloads. By separating compute from the storage layer, we’re no longer constrained by an infrastructure that binds the two together. Now these applications can run on a data infrastructure based on object storage, which provides both performance and scaling advantages for AI workloads.

Recently, Scott Mcclellan’s team—which analyzes over six petabytes of data using Hadoop technology—created a high-performance data lake using object storage for consumption by big data workloads. Scott shares his experience deploying object storage for AI workloads.

This session is sponsored by Minio.

Scott Mcclellan


Scott Mcclellan is CTO at PRGX. A creative, results-driven technology leader, Scott is a change agent and problem solver with a passion for technology. He’s skilled in grasping and explaining the big picture and conceptualizing, developing, and implementing solutions. Scott has substantial experience working with business leaders and C-level executives. Previously, he was chief technologist and VP of engineering for Hewlett-Packard’s cloud services and for scalable computing, where he set technical direction for the company’s scalable computing business and introduced new products focused on cloud service providers and high-performance computing customers.