Unshattering the mirror: Defragmenting the deep learning ecosystem
Who is this presentation for?
- Directors of AI, VPs of engineering, deep learning developers, ML engineers, and data scientists
Companies like Google, Facebook, Amazon, and Microsoft have comprehensive internal tooling that enables their developers to collaborate and build end-to-end AI applications. These platforms allow developers to be insanely productive and deliver new AI features and applications orders of magnitude faster than the rest of the industry. Meanwhile, developers at every other organization on the planet are left piecing this infrastructure together from a combination of highly specialized point solutions, legacy systems not designed for modern AI workflows, and half-baked open source projects. These tools lack standard interfaces and file formats and are often incompatible in surprising ways.
Evan Sparks details the deficiencies of existing reference architectures for AI development infrastructure and the opportunities for end-to-end system design in AI development, with deep dives into two examples. First, integrating cluster resource management and fine-grained scheduling with hyperparameter optimization yields orders-of-magnitude improvements in training performance and convergence to better models. Second, workload-aware checkpointing enables seamless and rapid fault tolerance, auto-scaling, and collaboration.
Prerequisite knowledge
- A working knowledge of the model development process
- A high-level understanding of the existing tooling in the DL ecosystem for tasks like model development, hyperparameter optimization, model compression, and training cluster management
What you'll learn
- Gain a better understanding of the gaps in current publicly available AI development infrastructure and a good sense of what you should demand from this infrastructure
Evan Sparks is a cofounder and CEO of Determined AI, a software company that makes machine learning engineers and data scientists fantastically more productive. Previously, Evan worked in quantitative finance and web intelligence. He holds a PhD in computer science from the University of California, Berkeley, where, as a member of the AMPLab, he contributed to the design and implementation of much of the large-scale machine learning ecosystem around Apache Spark, including MLlib and KeystoneML. He also holds an AB in computer science from Dartmouth College.