Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Deploying machine learning models in the enterprise

Diego Oppenheimer (Algorithmia)
5:25pm–6:05pm Wednesday, 09/12/2018
Data science and machine learning
Location: 1E 10/11
Level: Intermediate
Secondary topics: Model lifecycle management
Average rating: 4.50 (2 ratings)

Who is this presentation for?

  • Managers and executives, data scientists, ML engineers, DevOps engineers, and IT infrastructure decision makers

Prerequisite knowledge

  • Familiarity with big data and machine learning
  • Experience with or interest in deploying machine learning models

What you'll learn

  • Explore technical and organizational challenges and best practices for deploying scalable ML models

Description

All the conferences and thought leaders have been painting a vision of the businesses of the future being powered by data, but if we’re honest with ourselves, the vast majority of our massive data science investments are being deployed to PowerPoint or maybe a business dashboard. Productionizing your machine learning (ML) portfolio is the next big step on the path to ROI from AI.

You probably started out years ago on a “big data” initiative: You collected and cleaned your data and built data warehouses, and when those filled up, you upgraded to data lakes. You hired data engineers and data scientists, and around the organization, everyone brushed up their SQL querying skills and got some licenses for Tableau and Power BI.

Then you saw what Google, Uber, Facebook, and Amazon were doing with machine learning to automate business processes and customer interactions. To avoid being broadsided, you hired more data scientists and machine learning engineers. They joined your teams and started using your big data investments to train models. But what you probably found is that your tech stack and DevOps processes don’t fit ML models. Unlike most of your systems, ML models require short spikes of massive compute; they are often written in different languages from your core code; they need different hardware to perform well; one model probably has applications across many teams; and the people making the models often don’t have the engineering experience to write production code but need to iterate faster than traditional engineers. Expecting your engineering and DevOps teams to deploy ML models well is like showing up to SeaWorld with a giraffe on the grounds that the staff already handle large mammals.

There is a path forward. Almost five years ago, Algorithmia launched a marketplace for models, functions, and algorithms. Today 65,000 developers are on the platform deploying 4,500 models. The result has been a layer of tools and best practices that make deploying ML models frictionless, scalable, and low maintenance. The company refers to it as the “AI layer.”

Drawing on this experience, Diego Oppenheimer covers the strategic and technical hurdles each company must overcome and the best practices developed while deploying over 4,000 ML models for 70,000 engineers.

Topics include:

  • Best practices for your organization
  • Continuous model deployment
  • Varying languages (Your code base probably isn’t in Python or R, but your ML models probably are.)
  • Managing your portfolio of ML models
  • Standardizing versioning
  • Enabling models across your organization
  • Analytics on how and where models are being used
  • Maintaining auditability
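
As a rough illustration of how several of these topics fit together (standardized versioning, a language-neutral calling convention, usage analytics, and auditability), here is a minimal Python sketch. It is not Algorithmia's implementation or API. The `ModelRegistry` class, its method names, and the "churn" rule are all hypothetical, and a real system would persist the log and run models as isolated services.

```python
import json
from collections import Counter
from datetime import datetime, timezone


class ModelRegistry:
    """Hypothetical in-memory registry: versioned models plus an audit log."""

    def __init__(self):
        self._models = {}    # (name, version) -> callable
        self.audit_log = []  # one dict per call, in call order

    def register(self, name, version, predict_fn):
        key = (name, version)
        if key in self._models:
            # Published versions are immutable, so callers can pin them safely.
            raise ValueError(f"{name} {version} is already registered")
        self._models[key] = predict_fn

    def call(self, name, version, payload_json, caller):
        # JSON in, JSON out: callers need not be written in Python.
        result = self._models[(name, version)](json.loads(payload_json))
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "model": name,
            "version": version,
            "caller": caller,
        })
        return json.dumps(result)

    def usage_by_caller(self):
        # Simple analytics: how many calls each team or service has made.
        return Counter(entry["caller"] for entry in self.audit_log)


registry = ModelRegistry()
# Stand-in "model": a fixed threshold rule, not a trained model.
registry.register("churn", "1.0.0", lambda x: {"churn": x["visits"] < 3})
response = registry.call("churn", "1.0.0", '{"visits": 1}', caller="billing")
```

Because every call names an exact version and is logged with its caller, the same records answer both "who is using this model?" and "which version produced this result?".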

Diego Oppenheimer

Algorithmia

Diego Oppenheimer is the founder and CEO of Algorithmia. An entrepreneur and product developer with an extensive background in all things data, Diego has designed, managed, and shipped some of Microsoft’s most used data analysis products, including Excel, Power Pivot, SQL Server, and Power BI. Diego holds a bachelor’s degree in information systems and a master’s degree in business intelligence and data analytics from Carnegie Mellon University.