As GPUs get faster, they need data delivered to them faster. In deep learning, input pipelines are responsible for a complex chain of actions that ultimately feeds data into GPU memory: defining how files are read from storage, deserializing them into data structures, preprocessing them on a CPU, and copying the results to the GPU. These pipelines bring together complex hardware systems (cluster networks, peripheral interconnects, modern CPUs, and storage devices) and sophisticated software systems that drive the data movement and transformation.
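The four stages named above (read, deserialize, preprocess, copy to the GPU) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the benchmark suite or the TensorFlow pipeline discussed in the session: the record format (JSON), the helper names (`read_records`, `deserialize`, `preprocess`, `prefetch`, `copy_to_device`, `run_pipeline`), and the default parameters are all hypothetical. The copy step is a placeholder because no GPU is involved. The one idea it tries to show is prefetching: running the upstream stages in a background thread so that the consumer rarely waits for data.

```python
import json
import queue
import threading

_SENTINEL = object()  # marks the end of the stream inside the prefetch queue


def read_records(n):
    """Stage 1 (hypothetical): stand-in for reading raw bytes from storage."""
    for i in range(n):
        yield json.dumps({"id": i, "pixels": [i, i + 1, i + 2]}).encode("utf-8")


def deserialize(raw):
    """Stage 2: turn raw bytes into an in-memory data structure."""
    return json.loads(raw.decode("utf-8"))


def preprocess(example):
    """Stage 3: CPU-side transformation (here, divide each value by 255)."""
    return {"id": example["id"], "pixels": [p / 255.0 for p in example["pixels"]]}


def batched(examples, batch_size):
    """Group examples into lists of at most batch_size items."""
    batch = []
    for ex in examples:
        batch.append(ex)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


def prefetch(iterable, buffer_size):
    """Produce items in a background thread, buffering up to buffer_size of them."""
    q = queue.Queue(maxsize=buffer_size)

    def worker():
        try:
            for item in iterable:
                q.put(item)  # blocks while the buffer is full
        finally:
            q.put(_SENTINEL)  # always signal completion so the consumer can stop

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is _SENTINEL:
            return
        yield item


def copy_to_device(batch):
    """Stage 4 (placeholder): a real pipeline would copy the batch to GPU memory here."""
    return list(batch)


def run_pipeline(num_records=10, batch_size=4, buffer_size=2):
    """Chain the four stages and return the batches that reached the 'device'."""
    examples = (preprocess(deserialize(raw)) for raw in read_records(num_records))
    batches = prefetch(batched(examples, batch_size), buffer_size)
    return [copy_to_device(b) for b in batches]
```

With the defaults, `run_pipeline()` returns three batches of 4, 4, and 2 examples in their original order. In this sketch, `batch_size` and `buffer_size` stand in for the kind of tuning parameters an input pipeline exposes. Which parameters matter in practice depends on the framework and hardware.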
Vas Chellappa explains how to keep your GPUs fed with data as you train the next generation of deep learning architectures and shares a new benchmark suite for evaluating and tuning input pipelines. Vas examines results from TensorFlow’s Dataset API (tf.data) on a DGX-1 with V100 GPUs and provides guidance on key tuning parameters and diagnostic techniques for improving performance.
Vas Chellappa manages the big data analytics team at Pure Engineering, which sifts through 24 TB of streaming data a day to find test failures so that engineers can focus on much more fun things. Vas holds a PhD in electrical and computer engineering with a focus on computer systems from Carnegie Mellon University.
©2018, O'Reilly Media, Inc.