Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Building a secure and transparent ML pipeline using open source technologies

11:15–11:55 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
Secondary topics:  Ethics, Security and Privacy
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Data scientists, researchers, machine learning engineers, and ML DevOps teams



Prerequisite knowledge

  • Familiarity with ML (useful but not required)

What you'll learn

  • Open source tools for creating scalable, end-to-end ML pipelines that are open, transparent, and fair


The application of AI algorithms in domains such as criminal justice, credit scoring, and hiring holds great promise. At the same time, it raises legitimate concerns about algorithmic fairness, and there’s a growing demand for fairness, accountability, and transparency from machine learning (ML) systems. Training data isn’t the only possible source of bias and adversarial contamination: both can also be introduced through inappropriate data handling, inappropriate model selection, or incorrect algorithm design.

What we need is a pipeline that is open, transparent, secure, and fair and that fully integrates into the AI lifecycle. Such a pipeline requires a robust set of bias and adversarial checkers, “debiasing” and “defense” algorithms, and explanations. Nick Pentreath explains how to build such a pipeline leveraging open source projects such as AI Fairness 360 (AIF360), the Adversarial Robustness Toolbox (ART), Fabric for Deep Learning (FfDL), Model Asset eXchange (MAX), and Seldon Core.
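
To make the idea of a "bias checker" concrete, here is a minimal sketch in plain Python of two common group-fairness measures, disparate impact and statistical parity difference. It is illustrative only and does not use the API of AIF360 or any other project named above; the function names, the toy data, and the 0.8 threshold (the widely cited "four-fifths" heuristic) are assumptions made for this example, not details from the talk.

```python
from typing import Sequence


def selection_rate(outcomes: Sequence[int]) -> float:
    """Fraction of favorable outcomes (encoded as 1) within one group."""
    if not outcomes:
        raise ValueError("group has no outcomes")
    return sum(outcomes) / len(outcomes)


def disparate_impact(unprivileged: Sequence[int], privileged: Sequence[int]) -> float:
    """Ratio of selection rates; 1.0 means both groups are favored equally."""
    privileged_rate = selection_rate(privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return selection_rate(unprivileged) / privileged_rate


def statistical_parity_difference(unprivileged: Sequence[int], privileged: Sequence[int]) -> float:
    """Difference of selection rates; 0.0 means both groups are favored equally."""
    return selection_rate(unprivileged) - selection_rate(privileged)


def passes_bias_check(unprivileged: Sequence[int], privileged: Sequence[int],
                      threshold: float = 0.8) -> bool:
    """Hypothetical pipeline gate: flag the model if disparate impact < threshold."""
    return disparate_impact(unprivileged, privileged) >= threshold


# Toy model decisions (1 = favorable, 0 = unfavorable), invented for illustration.
unprivileged_decisions = [1, 0, 0, 0, 1, 0, 0, 0]  # selection rate 0.25
privileged_decisions = [1, 1, 0, 1, 0, 1, 0, 0]    # selection rate 0.50

print("disparate impact:", disparate_impact(unprivileged_decisions, privileged_decisions))
print("statistical parity difference:",
      statistical_parity_difference(unprivileged_decisions, privileged_decisions))
print("passes check:", passes_bias_check(unprivileged_decisions, privileged_decisions))
```

In a real pipeline a check like this would typically run on held-out predictions before deployment, with a failed check triggering a debiasing step or blocking the release.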

Nick Pentreath


Nick Pentreath is a principal engineer at the Center for Open Source Data & AI Technologies (CODAIT) at IBM, where he works on machine learning. Previously, he cofounded Graphflow, a machine learning startup focused on recommendations, and was at Goldman Sachs, Cognitive Match, and Mxit. He’s a committer and PMC member of the Apache Spark project and author of Machine Learning with Spark. Nick is passionate about combining commercial focus with machine learning and cutting-edge technology to build intelligent systems that learn from data to add business value.