Model evaluation metrics are typically tied to the predictive learning task. There are different metrics for classification (ROC-AUC, confusion matrix), regression (RMSE, R² score), ranking (precision, recall, F1 score), and so on. These metrics, coupled with cross-validation or hold-out validation techniques, help analysts and data scientists select a performant model. However, model performance decays over time because of variability in the data. At that point, point-estimate metrics are no longer enough, and a better understanding of the why, what, and how of the categorization process is needed.
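As a concrete illustration of the point estimates mentioned above, a minimal sketch of a binary confusion matrix and its derived precision, recall, and F1 metrics might look like this (the helper name and return shape are assumptions for illustration, not from the talk):

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived point estimates for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}
```

Such summary numbers say how often a model is right, but nothing about why, which is exactly the gap the rest of this abstract addresses.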
Evaluating model decisions may still be easy for linear models but gets difficult in the world of deep neural networks (DNNs). The complexity increases multifold for use cases in computer vision (image classification, image captioning, visual question answering (VQA)) and natural language processing (text classification, sentiment analysis, topic modeling). ResNet, a recently published state-of-the-art DNN architecture, has variants with over 200 layers, and interpreting how input features drive the output categorization across that many layers is challenging. The lack of decomposability and intuitiveness associated with DNNs prevents their widespread adoption despite their superior performance compared to more classical machine learning approaches. Faithful interpretation of DNNs not only provides insight into failure modes (false positives and false negatives) but also enables the humans in the loop to evaluate the model's robustness against noise, bringing trust and transparency to the predictive algorithm.
Pramit Choudhary shares tricks for enabling class-discriminative visualizations for computer vision problems that use convolutional neural networks (CNNs), along with approaches for making CNNs more transparent by capturing metrics during the validation step and highlighting the salient features in an image that drive a prediction.
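The abstract does not specify which visualization technique is used, but one common class-discriminative approach (in the style of Grad-CAM) weights each convolutional channel by the average gradient of the target class score with respect to that channel's activation map, then sums and rectifies the weighted maps into a saliency heatmap. A minimal sketch of that combination step, assuming the activations and gradients have already been extracted from the network:

```python
def saliency_heatmap(activations, gradients):
    """Combine conv activations and class-score gradients into a heatmap.

    activations, gradients: nested lists indexed [channel][row][col],
    e.g. from a CNN's last convolutional layer (extraction not shown).
    """
    h, w = len(activations[0]), len(activations[0][0])
    # Weight each channel by the global average of its gradient map.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    # Weighted sum of activation maps across channels.
    heatmap = [[0.0] * w for _ in range(h)]
    for wc, amap in zip(weights, activations):
        for i in range(h):
            for j in range(w):
                heatmap[i][j] += wc * amap[i][j]
    # ReLU: keep only features with a positive influence on the class.
    return [[max(0.0, v) for v in row] for row in heatmap]
```

Upsampled to the input resolution and overlaid on the image, such a heatmap highlights the regions most responsible for a given class prediction, which is one way to surface the salient features described above.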
Pramit Choudhary is a lead data scientist/ML scientist at H2O.ai, where he focuses on optimizing and applying classical machine learning and Bayesian design strategies to solve large-scale real-world problems. Currently, he is leading initiatives to find better ways of surfacing a predictive model's learned decision policies as meaningful insights, for both supervised and unsupervised problems.
©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org