Transparency, auditability, and stability of predictive models and results are typically key differentiators in effective machine learning applications. Patrick Hall shares tips and techniques learned through implementing interpretable machine learning solutions in industries like financial services, telecom, and health insurance.
Using a set of publicly available, heavily annotated examples, Patrick walks you through several holistic approaches to interpretable machine learning. The examples use the well-known University of California Irvine (UCI) credit card dataset and popular open source packages to train constrained, interpretable models and to visualize, explain, and test more complex models in the context of an example credit-risk application. Along the way, Patrick draws on his applied experience to highlight crucial success factors and common pitfalls rarely discussed in blog posts or open source software documentation, such as the need for both local and global explanations and the approximate nature of nearly all machine learning explanation techniques.
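For a concrete sense of the constrained models the tutorial covers, here is a minimal sketch of a monotonically constrained XGBoost classifier trained on the UCI credit card dataset. The file path, the selected columns, and the constraint signs are illustrative assumptions, not the tutorial's exact code:

    # Minimal sketch (assumptions: a local CSV copy of the UCI credit card
    # dataset with an illustrative target column name; not the tutorial's code).
    import pandas as pd
    import xgboost as xgb
    from sklearn.model_selection import train_test_split

    data = pd.read_csv("uci_credit_card.csv")           # assumed local path
    y = data["default.payment.next.month"]              # assumed target column name
    X = data[["LIMIT_BAL", "PAY_0", "BILL_AMT1", "PAY_AMT1"]]  # illustrative subset

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=12345)

    # One constraint per column: -1 = monotonically decreasing,
    # +1 = monotonically increasing, 0 = unconstrained. The signs below
    # are assumptions from domain intuition, e.g., longer recent payment
    # delays (PAY_0) should only increase predicted default risk.
    params = {
        "objective": "binary:logistic",
        "eval_metric": "auc",
        "monotone_constraints": "(-1, 1, 0, -1)",
        "max_depth": 3,
        "eta": 0.08,
    }
    dtrain = xgb.DMatrix(X_train, label=y_train)
    dtest = xgb.DMatrix(X_test, label=y_test)
    model = xgb.train(params, dtrain, num_boost_round=200,
                      evals=[(dtest, "test")], early_stopping_rounds=10)

Because each feature's effect is constrained to a single direction, a model like this is much easier to summarize, test, and defend than an unconstrained GBM.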
Outline:
• Enhancing transparency in machine learning models with Python and XGBoost
• Increasing transparency and accountability in your machine learning project with Python
• Explaining your predictive models to business stakeholders with local interpretable model-agnostic explanations (LIME) using Python and H2O (see the LIME sketch after this outline)
• Debugging machine learning models for accuracy, trustworthiness, and stability with Python and H2O
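The LIME session above fits a simple, local surrogate model around an individual prediction to explain it. Here is a minimal sketch using the open source lime package; a scikit-learn model and synthetic data stand in for the H2O model and credit data used in the tutorial, and all names are illustrative assumptions:

    # Minimal LIME sketch (assumptions: the open source `lime` package, plus
    # a scikit-learn model and synthetic data in place of the tutorial's
    # H2O model and credit data).
    from lime.lime_tabular import LimeTabularExplainer
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    X, y = make_classification(n_samples=500, n_features=4, random_state=0)
    model = GradientBoostingClassifier().fit(X, y)

    explainer = LimeTabularExplainer(
        X,
        feature_names=["x0", "x1", "x2", "x3"],   # illustrative names
        class_names=["no default", "default"],
        mode="classification")

    # Fit a local, interpretable surrogate around one row and print the
    # per-feature contributions -- a *local* explanation of one prediction.
    exp = explainer.explain_instance(X[0], model.predict_proba, num_features=4)
    print(exp.as_list())

As the abstract notes, local explanations like these are approximate and should be paired with global summaries of model behavior before being shown to business stakeholders.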
Patrick Hall is principal scientist at bnh.ai, a boutique law firm focused on AI and analytics; a senior director of product at H2O.ai, a leading Silicon Valley machine learning software company; and a lecturer in the Department of Decision Sciences at George Washington University, where he teaches graduate classes in data mining and machine learning.
At both bnh.ai and H2O.ai, he works to mitigate AI risks and advance the responsible practice of machine learning. Previously, Patrick held global customer-facing and R&D roles at SAS. He holds multiple patents in automated market segmentation using clustering and deep neural networks. Patrick is the 11th person worldwide to become a Cloudera Certified Data Scientist. He studied computational chemistry at the University of Illinois before graduating from the Institute for Advanced Analytics at North Carolina State University.
Comments
I didn’t really use any one deck of slides, but here are the resources I shared during the tutorial.
Getting Started
• Tutorial URL: https://aquarium.h2o.ai
• Create a new account
• Check your email
• Use the temporary password to log in to Aquarium
• Browse the labs
• Click View Detail under Patrick Hall's MLI Tutorial
• Click Start Lab (this can take several minutes)
• Click the Jupyter URL when it becomes available
• Enter the token h2o
• Browse and run the Jupyter notebooks
• Please click End Lab when you are finished
Criticism
• Cynthia Rudin: “Please Stop Explaining Black Box Models for High Stakes Decisions”
• Cassie Kozyrkov: "Explainable AI won't deliver. Here's why."
• Yann LeCun, Peter Norvig, etc.
Other Resources by the Instructor
• All of the resources for this lab are freely available here: https://github.com/jphall663/interpretable_machine_learning_with_python
• The 2018 JSM presentation related to the post-hoc explanation approaches herein: https://github.com/jphall663/jsm_2018_slides
• The 2018 JSM proceedings paper related to the monotonic GBM and post-hoc explanation approaches herein: https://github.com/jphall663/jsm_2018_paper
• The 2019 H2O World presentation which puts forward an interpretable machine learning workflow: https://github.com/jphall663/h2oworld_sf_2019
• The awesome-machine-learning-interpretability metalist that includes many debugging, explanation, fairness, interpretability, privacy, and security resources: https://github.com/jphall663/awesome-machine-learning-interpretability
• A recent article on the security risks of ML models: https://www.oreilly.com/ideas/proposals-for-model-vulnerability-and-security
• Interpretable Machine Learning "Good, Bad, and Ugly" slides: https://github.com/h2oai/h2o-meetups/blob/master/2018_04_30_NYC_MLI_good_bad_ugly/MLI_good_bad_ugly.pdf
Hi Patrick, thanks for the very enlightening tutorial. I will certainly use this in my work at Field Nation. Would you be willing to share your slides from your presentation? Thanks, Alex