Presented By O’Reilly and Intel AI
Put AI to work
Sep 4-5, 2018: Training
Sep 5-7, 2018: Tutorials & Conference
San Francisco, CA

High-performance input pipelines for scalable deep learning

Vas Chellappa (Pure Storage)
4:50pm-5:30pm Friday, September 7, 2018
Implementing AI
Location: Imperial B
Secondary topics: Edge computing and hardware

Who is this presentation for?

  • Data scientists and members of infrastructure teams

Prerequisite knowledge

  • A high-level understanding of the process for performance-testing GPUs and AI pipelines

What you'll learn

  • Explore a new benchmark suite for evaluating and tuning input pipelines


As GPU technology continues to advance, the demand for faster data delivery grows with it. In deep learning, input pipelines are responsible for a complex chain of actions that ultimately feeds data into GPU memory: reading files from storage, deserializing them into data structures, preprocessing on a CPU, and copying to the GPU. These pipelines bring together complex hardware systems—including cluster networks, peripheral interconnects, modern CPUs, and storage devices—along with sophisticated software systems to drive the data movement and transformation.
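The chain of stages described above can be sketched as a sequence of composed generators. This is a minimal, framework-free illustration (the file contents, field names, and the `to_device` stand-in are all hypothetical; a real pipeline would read from disk and enqueue onto an accelerator):

```python
import json

# Hypothetical on-disk records: each "file" holds one JSON-serialized example.
FAKE_STORAGE = {
    "shard-0": '{"pixels": [0, 1, 2], "label": 0}',
    "shard-1": '{"pixels": [3, 4, 5], "label": 1}',
}

def read_files(filenames):
    """Stage 1: read raw bytes from storage."""
    for name in filenames:
        yield FAKE_STORAGE[name]

def deserialize(raw_records):
    """Stage 2: deserialize bytes into in-memory data structures."""
    for raw in raw_records:
        yield json.loads(raw)

def preprocess(examples):
    """Stage 3: CPU preprocessing (here, a toy normalization)."""
    for ex in examples:
        ex["pixels"] = [p / 255.0 for p in ex["pixels"]]
        yield ex

def to_device(examples):
    """Stage 4: stand-in for the host-to-GPU copy."""
    for ex in examples:
        yield ex  # a real pipeline would copy this into GPU memory

pipeline = to_device(preprocess(deserialize(read_files(["shard-0", "shard-1"]))))
batch = list(pipeline)
```

Each stage pulls from the one before it, so the whole pipeline streams record by record; the performance question the talk addresses is how fast this chain can run relative to the GPU's consumption rate.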

Vas Chellappa explains how to keep your GPUs fed with data as you train the next generation of deep learning architectures and shares a new benchmark suite for evaluating and tuning input pipelines. Vas examines results with TensorFlow's tf.data Dataset API on an NVIDIA DGX-1 with V100 GPUs and provides guidance on key tuning parameters and diagnostic techniques for improving performance.
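Two of the tuning knobs typically at play here are the degree of parallelism in preprocessing and the depth of the prefetch buffer that overlaps preparation with consumption. The sketch below models both in plain Python with a thread pool and a bounded queue; the function name and default values are illustrative, not part of any library API:

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

def preprocess(record):
    # Stand-in for per-record CPU work (decode, augment, normalize).
    return record * 2

def prefetching_pipeline(records, num_parallel_calls=4, prefetch_buffer=8):
    """Preprocess records on num_parallel_calls worker threads while a
    bounded queue of prefetch_buffer items decouples the producer from
    the consumer, so the two sides run concurrently."""
    buf = queue.Queue(maxsize=prefetch_buffer)
    done = object()  # sentinel marking the end of the stream

    def producer():
        with ThreadPoolExecutor(max_workers=num_parallel_calls) as pool:
            # pool.map preserves input order even with parallel workers.
            for result in pool.map(preprocess, records):
                buf.put(result)
        buf.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is done:
            return
        yield item

out = list(prefetching_pipeline(range(10)))
```

If the consumer (the GPU, in the real setting) ever blocks on an empty queue, the pipeline is the bottleneck; raising the parallelism or buffer depth, or speeding up the stages upstream, is the kind of tuning the benchmark suite is meant to guide.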


Vas Chellappa

Pure Storage

Vas Chellappa manages the big data analytics team at Pure Engineering, which sifts through 24 TB of streaming data a day to find test failures so that engineers can focus on much more fun things. Vas holds a PhD in electrical and computer engineering with a focus on computer systems from Carnegie Mellon University.