Building applications powered by deep learning is hard, particularly in an enterprise setting where reproducibility is of paramount importance. It's difficult to justify shipping a model to production when you can't be sure you'll be able to build it again. But reproducibility during deep learning model development comes at a cost: it can be difficult to maintain a high degree of computational performance throughout the model development lifecycle while still guaranteeing reproducible results.
Evan Sparks describes the key ingredients of reproducible deep learning models in an enterprise setting. He then explains how to maintain a high degree of resource utilization and throughput through workload-aware cluster resource orchestration techniques.
Evan Sparks is a cofounder and CEO of Determined AI, a software company that makes machine learning engineers and data scientists fantastically more productive. Previously, Evan worked in quantitative finance and web intelligence. He holds a PhD in computer science from the University of California, Berkeley, where, as a member of the AMPLab, he contributed to the design and implementation of much of the large-scale machine learning ecosystem around Apache Spark, including MLlib and KeystoneML. He also holds an AB in computer science from Dartmouth College.
©2019, O'Reilly Media, Inc.