Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Streamlining a machine learning project team

Sourav Dey (Manifold), Alex Ng (Manifold)
1:30pm–5:00pm Tuesday, March 26, 2019
Average rating: 4.25 (4 ratings)

Who is this presentation for?

  • Practicing data scientists, data engineers, and CDOs with a mandate to build a team inside the organization



Prerequisite knowledge

  • Basic knowledge of the software engineering process
  • Familiarity with machine learning concepts and vocabulary (model, training, etc.)

What you'll learn

  • Understand how to get value from machine learning in a way that positively impacts the company's bottom line, by streamlining teams and shortening time to production


Artificial intelligence is already helping many businesses become more responsive and competitive, but how do you move machine learning models efficiently from research to deployment at enterprise scale? It’s imperative to plan for deployment from day one, both in tool selection and in the feedback and development process.

As recently as a few years ago, data scientists were the people who played in a sandbox—when they came up with a useful model, it was thrown over the wall to another team that would reimplement it to put it into production. Those days are over now: there’s only one Git repo in the entire company, and everything you commit is essentially in production. But teams are still run as if data science is mainly about experimentation.

Sourav Dey and Alex Ng share best practices for working in this new reality. Data scientists can still play in a sandbox but must do so in a way that offers a turnkey path for taking models into production. Just as in DevOps, where people work at the intersection of development and operations, today people are working at the intersection of data science and software engineering and need to be integrated into the team with tools and support. Manifold developed the Lean AI process and the open source Orbyter package for Docker-first data science to help do just that.

Sourav and Alex explain how to streamline a machine learning project and help your engineers work as an integrated part of your development and production teams.

Topics include:

  • Understanding both the business problem and the data
  • Containerized data science for cleaner workflows
  • Data engineering as a core competency
  • Building iterative data models to deliver value early
  • Best practices for bookkeeping ML experiments
  • Developing user trust in the data models
  • Seamless deployment at production scale
  • Observing and validating on-the-ground model use

Sourav Dey


Sourav Dey is CTO at Manifold, an artificial intelligence engineering services firm with offices in Boston and Silicon Valley. Sourav leads the engineering team, focusing on work across client projects, developing platform technologies to make Manifold ML engineers more efficient, and communicating with business stakeholders. Prior to Manifold, Sourav led teams building data products across the technology stack, from smart thermostats and security cams at Google-Nest to wireless communication at Qualcomm. Sourav’s career has always been at the intersection of math and computer science: he holds a PhD in signal processing from MIT and bachelor’s degrees in math and CS, also from MIT.


Alex Ng


Alexander Ng is a director of infrastructure and DevOps at Manifold. Previously, he had a stint as an engineer and technical lead doing DevOps at Kyruus and engineering work for the Navy. He holds a BS in electrical engineering from Boston University.

Comments on this page are now closed.


Sourav Dey | CTO
11/13/2018 6:16am PST

Hi all, we’re really looking forward to this tutorial! To help us make this session as productive as possible for you, please let us know some of the specific challenges you’ve run into—whether technical or team-related—when trying to move ML projects from development to production. We may not be able to address every specific scenario, but we’ll be sure to cover some of the most interesting in addition to the ones we commonly see.