Deep learning with Horovod and Spark using GPUs and Docker containers
Who is this presentation for?
- Data scientists and IT administrators
Data volume and complexity increase by the day, so it’s imperative that companies understand their business needs in order to stay ahead of their competition. Thanks to AI, ML, and deep learning (DL) projects such as Apache Spark, H2O, TensorFlow, and Horovod, these organizations no longer have to lock themselves into a specific vendor technology or proprietary solution to maintain this competitive advantage. These feature-rich deep learning applications are available directly from the open source community, with many different algorithms and options tailored for specific use cases.
One of the biggest challenges for the enterprise is how to deploy these open source tools in an easy and consistent manner (keeping in mind that some of them depend on specific operating system kernel versions and software components). For example, TensorFlow can leverage NVIDIA GPU resources, but running TensorFlow with GPUs requires users to set up the NVIDIA CUDA libraries on the host and then install and configure TensorFlow to make use of the GPU computing facility. The combination of device drivers, libraries, and software versions can be daunting and may end in failure for many users.
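Containers sidestep much of this setup pain by packaging the CUDA userspace libraries alongside the framework. As a rough sketch (assuming the NVIDIA Container Toolkit is installed on the host and using the public tensorflow/tensorflow:latest-gpu image), a single command can verify GPU visibility from inside a TensorFlow container:

```shell
# Sketch only: assumes Docker plus the NVIDIA Container Toolkit on the host.
# The CUDA libraries ship inside the image, so nothing TensorFlow-specific
# needs to be installed on the host itself.
docker run --rm --gpus all tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

If the host drivers and toolkit are in place, the command prints the list of visible GPU devices; an empty list indicates the container cannot see the hardware.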
Moreover, since GPUs are a premium resource, organizations want to maximize their use. Clusters using these resources need to be configured on demand and freed immediately after computation is complete. Docker containers are ideal for enabling just this sort of instant cluster provisioning and deprovisioning. They also ensure reproducible and consistent deployment.
Thomas Phelan demonstrates how to deploy AI, ML, and DL applications, including Spark, TensorFlow, and Horovod, using GPU hardware acceleration on Docker containers in a secure multitenant environment. The use of GPU-based services within Docker containers does require some careful consideration, so he’ll also explore some best practices.
Prerequisite knowledge
- A basic understanding of Docker containers and NVIDIA GPUs (helpful but not required)
What you'll learn
- Discover how to spin up and tear down GPU-enabled AI, ML, and DL clusters in Docker containers
- Learn about quota management of GPU resources for better manageability, GPU isolation to specific clusters to avoid resource conflict or contention, the dynamic attach and detach of GPU resources from running clusters, and transient use of GPUs for the duration of a job
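The isolation and transient-use patterns above can be illustrated with plain Docker commands (a hedged sketch, assuming the NVIDIA Container Toolkit and a multi-GPU host; the device index and image tag are illustrative):

```shell
# Pin a container to one specific GPU so that concurrent clusters do not
# contend for the same device (device index 0 is illustrative).
docker run --rm --gpus '"device=0"' \
  nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Transient use: --rm removes the container as soon as the job exits,
# returning the GPU to the pool for the next workload.
```

Higher-level platforms layer quota management and dynamic attach/detach on top of this basic device-assignment mechanism.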
Thomas Phelan is cofounder and chief architect of BlueData. Previously, he was a member of the original team at Silicon Graphics that designed and implemented XFS, the first commercially available 64-bit file system, and an early employee at VMware. As a senior staff engineer at VMware and a key member of the ESX storage architecture team, he designed and developed the ESX storage I/O load-balancing subsystem and modular pluggable storage architecture, and he led teams working on key storage initiatives such as the cloud storage gateway and vFlash.