Auditing and Debugging Deep Learning Models via Decision Boundaries: Individual-level and Group-level Analysis

01/03/2020
by Roozbeh Yousefzadeh, et al.

Deep learning models have been criticized for their lack of interpretability, which undermines confidence in their use for important applications. Nevertheless, they are widely deployed in applications consequential to human lives, mostly because of their superior performance. There is therefore a great need for computational methods that can explain, audit, and debug such models. Here, we use flip points to accomplish these goals for deep learning models with continuous output scores (e.g., computed by softmax) used in social applications. A flip point is any point that lies on the boundary between two output classes: for example, for a model with a binary yes/no output, a flip point is any input that generates equal scores for "yes" and "no". The flip point closest to a given input is of particular importance because it reveals the smallest change to the input that would alter the model's classification, and we show that computing it is a well-posed optimization problem. Flip points also enable us to systematically study the decision boundaries of a deep learning classifier. The resulting insight into a deep model's decision boundaries can clearly explain its output at the individual level, via an explanation report that is understandable by non-experts. We also develop a procedure to understand and audit model behavior towards groups of people. Flip points can further be used to alter the decision boundaries in order to correct undesirable behaviors. We demonstrate our methods by investigating several models trained on standard datasets used in social applications of machine learning, and we identify the features most responsible for particular classifications and misclassifications.
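As a rough illustration of the closest-flip-point computation described in the abstract, the sketch below searches for the boundary point nearest a given input of a binary softmax classifier by minimizing the distance to the input while penalizing any gap between the two class scores. This is a minimal penalty-method sketch, not the authors' implementation (the paper formulates the search as a well-posed constrained optimization problem); the names closest_flip_point and model, and the optimizer settings, are assumptions.

    # Minimal sketch (assumed names and settings), not the authors' code:
    # find the flip point nearest to x for a binary softmax classifier.
    import torch

    def closest_flip_point(model, x, steps=500, lr=1e-2, penalty=10.0):
        """Minimize ||x_hat - x||^2 subject to equal class scores,
        with the boundary constraint enforced via a quadratic penalty."""
        x_hat = x.clone().detach().requires_grad_(True)
        opt = torch.optim.Adam([x_hat], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            scores = torch.softmax(model(x_hat), dim=-1)
            gap = scores[..., 0] - scores[..., 1]  # zero exactly on the boundary
            loss = ((x_hat - x) ** 2).sum() + penalty * (gap ** 2).sum()
            loss.backward()
            opt.step()
        return x_hat.detach()

    # Hypothetical usage: flip = closest_flip_point(net, x0); the difference
    # (flip - x0) shows the smallest feature changes that would flip the decision.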

