You’ve trained machine learning models on your data, but how do you put them into production? When you have tens of thousands of model versions, each written in any mix of frameworks and exposed as a REST API endpoint, and your users love to chain algorithms and run ensembles in parallel, how do you keep latency under 20 ms on just a few servers?
Although AI is a hot topic, there has not been much discussion of the infrastructure and scaling challenges that come with it. Kenny Daniel explains why AI and machine learning are a natural fit for serverless computing and shares a complete operating system for AI—a general architecture for scalable and serverless machine learning in production. Along the way, Kenny discusses the issues Algorithmia ran into when implementing its on-demand scaling over GPU clusters and outlines one possible vision for the future of cloud-based machine learning.
Kenny Daniel is founder and CTO of Algorithmia. Kenny’s goal with Algorithmia is to accelerate AI development by creating a marketplace where algorithm developers can share their creations and application developers can make their applications smarter by incorporating the latest machine learning algorithms. He came up with the idea while working on his PhD, when he encountered a plethora of algorithms that never saw the light of day. Kenny has also worked with companies like wine enthusiast app Delectable to build out their deep learning-based image recognition systems. Kenny holds degrees from Carnegie Mellon University and the University of Southern California, where he studied artificial intelligence and mechanism design.
©2017, O'Reilly Media, Inc.