Presented By O’Reilly and Intel AI
Put AI to Work
April 29-30, 2018: Training
April 30-May 2, 2018: Tutorials & Conference
New York, NY

Model evaluation in the land of deep learning

Pramit Choudhary (h2o.ai)
4:00pm–4:40pm Wednesday, May 2, 2018
Implementing AI, Interacting with AI
Location: Nassau East/West
Average rating: 4.67 (3 ratings)

Who is this presentation for?

  • Data scientists, machine learning practitioners, and product managers involved with analytical or predictive modeling workflows

Prerequisite knowledge

  • A basic understanding of machine learning concepts and deep neural networks

What you'll learn

  • Understand why evaluating models with summary metrics like RMSE or the confusion matrix alone is not enough
  • Learn tricks and algorithms to enable interpretability in image classification problems

Description

Model evaluation metrics are typically tied to the predictive learning task at hand. There are different metrics for classification (ROC-AUC, confusion matrix), regression (RMSE, R2 score), ranking (precision-recall, F1 score), and so on. These metrics, coupled with cross-validation or hold-out validation techniques, can help analysts and data scientists select a performant model. However, model performance decays over time because of variability in the data. At that point, point-estimate metrics are no longer enough, and a better understanding of the why, what, and how of the categorization process is needed.
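
The point-estimate metrics named above can be computed in a few lines. The following is a minimal sketch using scikit-learn on synthetic data (neither of which the session references): it pairs ROC-AUC and a confusion matrix with a hold-out split and a cross-validated score for classification, and RMSE and R2 for regression.

    # Minimal sketch of point-estimate metrics plus hold-out and cross-validation.
    import numpy as np
    from sklearn.datasets import make_classification, make_regression
    from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
    from sklearn.metrics import roc_auc_score, confusion_matrix, mean_squared_error, r2_score
    from sklearn.model_selection import cross_val_score, train_test_split

    # Classification: ROC-AUC and confusion matrix on a hold-out split.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print("ROC-AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    print("Confusion matrix:\n", confusion_matrix(y_te, clf.predict(X_te)))

    # Cross-validated estimate of the same classifier's ROC-AUC.
    print("CV ROC-AUC:", cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())

    # Regression: RMSE and R2 score on a hold-out split.
    Xr, yr = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
    Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, test_size=0.25, random_state=0)
    reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(Xr_tr, yr_tr)
    pred = reg.predict(Xr_te)
    print("RMSE:", np.sqrt(mean_squared_error(yr_te, pred)))
    print("R2:", r2_score(yr_te, pred))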

Evaluating model decisions might still be easy for linear models but gets difficult in the world of deep neural networks (DNNs). This complexity can increase multifold for use cases related to computer vision (image classification, image captioning, visual question answering (VQA)), text classification, sentiment analysis, or topic modeling. ResNet, a recently published state-of-the-art DNN architecture, has variants with over 200 layers, and interpreting input features and output categorization across that many layers is challenging. The lack of decomposability and intuitiveness associated with DNNs prevents their widespread adoption despite their superior performance compared to more classical machine learning approaches. Faithful interpretation of DNNs will not only provide insight into failure modes (false positives and false negatives) but also enable humans in the loop to evaluate the model's robustness against noise. This brings trust and transparency to the predictive algorithm.
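
As an illustration of the robustness check mentioned above, the hypothetical helper below (not from the talk; the names prediction_stability, predict_fn, and noise_std are assumptions) perturbs a single input with Gaussian noise and reports how often a classifier's predicted class stays unchanged.

    import numpy as np

    def prediction_stability(predict_fn, x, noise_std=0.01, n_trials=50, seed=0):
        """Fraction of noisy copies of `x` that keep the original predicted class.

        `predict_fn` maps a batch of inputs to predicted class labels.
        """
        rng = np.random.default_rng(seed)
        original = predict_fn(x[np.newaxis, ...])[0]
        noisy = x[np.newaxis, ...] + rng.normal(0.0, noise_std, size=(n_trials,) + x.shape)
        return np.mean(predict_fn(noisy) == original)

    # Usage (assuming `model` is any fitted classifier with a predict method):
    # stability = prediction_stability(model.predict, X_test[0])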

Pramit Choudhary shares tricks for producing class-discriminative visualizations for computer vision problems that use convolutional neural networks (CNNs), along with approaches for improving the transparency of CNNs by capturing metrics during the validation step and highlighting the salient features in an image that drive the prediction.
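
One widely used class-discriminative visualization for CNNs is Grad-CAM. The sketch below is an assumption about approach rather than necessarily the method covered in the session: it computes a Grad-CAM-style heatmap for a Keras ResNet50, where the layer name conv5_block3_out refers to that network's last convolutional block.

    # Grad-CAM-style heatmap for a Keras ResNet50 (illustrative sketch only).
    import numpy as np
    import tensorflow as tf

    model = tf.keras.applications.ResNet50(weights="imagenet")
    last_conv = model.get_layer("conv5_block3_out")
    grad_model = tf.keras.Model(model.inputs, [last_conv.output, model.output])

    def grad_cam(image, class_index=None):
        """Return a heatmap highlighting image regions that drive the predicted class."""
        x = tf.keras.applications.resnet50.preprocess_input(
            image[np.newaxis, ...].astype("float32"))
        with tf.GradientTape() as tape:
            conv_out, preds = grad_model(x)
            if class_index is None:
                class_index = int(tf.argmax(preds[0]))
            class_score = preds[:, class_index]
        grads = tape.gradient(class_score, conv_out)   # d(score)/d(feature maps)
        weights = tf.reduce_mean(grads, axis=(1, 2))   # global-average-pool the gradients
        cam = tf.reduce_sum(weights[:, tf.newaxis, tf.newaxis, :] * conv_out, axis=-1)
        cam = tf.nn.relu(cam)[0]                       # keep only positive influence
        return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

    # Usage: pass a 224x224 RGB image array; upsample the returned heatmap and
    # overlay it on the original image to see which regions drove the class score.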

Pramit Choudhary

h2o.ai

Pramit Choudhary is a lead data scientist/ML scientist at h2o.ai, where he focuses on optimizing and applying classical machine learning and Bayesian design strategies to solve large-scale real-world problems. Currently, he is leading initiatives to find better ways to translate a predictive model's learned decision policies into meaningful insights for both supervised and unsupervised problems.