Accelerating training, inference, and ML applications on NVIDIA GPUs
Who is this presentation for?
- Researchers and developers who are designing and optimizing deep learning models in TensorFlow
Description
Maggie Zhang, Nathan Luehr, Josh Romero, Pooya Davoodi, and Davide Onofrio dive into techniques to accelerate training and inference for common deep learning and machine learning workloads. You’ll learn how DALI can eliminate I/O and data processing bottlenecks in real-world applications and how automatic mixed precision (AMP) can give you up to a 3x improvement in training performance on Volta GPUs. You’ll see best practices for multi-GPU and multinode scaling using Horovod. The presenters also use a deep learning profiler to visualize TensorFlow operations and identify optimization opportunities. Finally, you’ll learn to deploy the trained models using INT8 quantization in TensorRT (TRT), all through new, convenient APIs in the TensorFlow framework.
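As a rough illustration of why mixed precision training relies on loss scaling, the sketch below simulates FP16 rounding with Python's standard library only. It is not the AMP implementation and does not use TensorFlow; the helper `to_fp16`, the fixed `LOSS_SCALE`, and the example gradient value are all assumptions chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" format code packs a value as a 16-bit float.
    return struct.unpack("e", struct.pack("e", x))[0]


# Hypothetical values: a fixed loss scale and a very small gradient.
LOSS_SCALE = 1024.0
true_grad = 1e-8  # smaller than the smallest positive FP16 value (about 6e-8)

# Stored directly in FP16, the gradient underflows to zero.
unscaled = to_fp16(true_grad)

# Scaling the loss scales the gradient by the same factor, which
# moves it into the range FP16 can represent.
scaled = to_fp16(true_grad * LOSS_SCALE)

# Dividing by the scale in higher precision recovers the gradient.
recovered = scaled / LOSS_SCALE

print("unscaled FP16 gradient:", unscaled)
print("recovered gradient:", recovered)
```

The point of the sketch is only that scaling before the FP16 step and unscaling afterward keeps small gradients from vanishing; production AMP typically adjusts the scale dynamically rather than using a fixed constant.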
Prerequisite knowledge
- A working knowledge of TensorFlow
Materials or downloads needed in advance
- For the best experience, please bring a laptop with:
- SSH terminal connection capabilities
- A browser (any browser should be fine)
- NVIDIA NSight tools installed **BEFORE** you arrive onsite.
What you'll learn
- Discover components from NVIDIA’s software stack to speed up pipelines and eliminate I/O bottlenecks
- Learn how to enable mixed precision when training models and use TRT to optimize your trained models for inference
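To make the idea of INT8 quantization concrete, here is a minimal sketch of symmetric quantization in plain Python. It does not call TensorRT and is not how TensorRT chooses its ranges; the functions `quantize_int8` and `dequantize`, the max-based scale, and the sample weights are assumptions used only to show the round trip from floating point to 8-bit integers and back.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Assumption: the scale comes from the largest magnitude seen.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical weights for illustration.
weights = [0.02, -0.5, 0.31, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per value is bounded by half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q)
print("max round-trip error:", max_error)
```

Real deployments usually pick the range from calibration data rather than from the raw maximum, so treat the scale selection here as a simplification.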
Maggie Zhang
NVIDIA
Maggie Zhang is a deep learning software engineer at NVIDIA, where she works on deep learning frameworks. She earned her PhD in computer science and engineering from the University of New South Wales in Australia. Her research background includes GPU and CPU heterogeneous computing, compiler optimization, computer architecture, and deep learning.
Nathan Luehr
NVIDIA
Nathan Luehr is a senior developer technology engineer at NVIDIA, where he works to accelerate deep learning frameworks. His background is in theoretical chemistry. He holds a doctoral degree from Stanford University, where he worked to accelerate electronic structure calculations on GPUs.
Josh Romero
NVIDIA
Josh Romero is a developer technology engineer at NVIDIA. He has extensive experience in GPU computing from porting and optimizing high-performance computing (HPC) applications to more recent work with deep learning. Josh earned his PhD from Stanford University, where his research focused on developing new computational fluid dynamics methods to better exploit GPU hardware.
Pooya Davoodi
NVIDIA
Pooya Davoodi is a senior software engineer at NVIDIA working on accelerating TensorFlow on NVIDIA GPUs. Previously, Pooya worked on Caffe2, Caffe, cuDNN, and other CUDA libraries.
Davide Onofrio
NVIDIA
Davide Onofrio is a senior deep learning software technical marketing engineer at NVIDIA, where he focuses on developing and presenting developer-oriented technical content on deep learning. Davide has several years of experience working as a computer vision and machine learning engineer in biometrics, VR, and the automotive industry. He earned a PhD in signal processing at the Politecnico di Milano.
Comments
We removed the link, but we uploaded the slides to the conference website. They should already be available.
Thanks for your comment.
The link to the slides shared during the presentation (http://bit.ly/331RRAs) doesn’t work. Is there another way to get the slides?
About the requirements:
– We will SSH into a remote VM from a terminal window
– We will connect the browser to run Python notebooks
So the OS should not matter provided you can SSH to a remote machine.
We removed all the installation requirements, so NVIDIA Nsight Systems is not needed on your laptop.
Thanks for leaving a comment.
When you say “NVIDIA NSight tools”, do you mean NVIDIA NSight Systems or other NVIDIA NSight tools (NSight Graphics? NSight Compute? others?).
Also, do you expect the tutorial work to be done on Windows? Linux? Either? It is not clear from the description if/what Windows/Linux specific software might be needed.
Thanks