Influence Functions in Deep Learning Are Fragile

06/25/2020
by Samyadeep Basu, et al.

Influence functions approximate the effect of training samples on test-time predictions and have a wide variety of applications in machine learning interpretability and uncertainty estimation. A commonly used (first-order) influence function can be implemented efficiently as a post-hoc method requiring access only to the gradients and Hessian of the model. For linear models, influence functions are well-defined due to the convexity of the underlying loss function and are generally accurate even in difficult settings where model changes are fairly large, such as estimating group influences. Influence functions, however, are not well understood in the context of deep learning with non-convex loss functions. In this paper, we provide a comprehensive and large-scale empirical study of the successes and failures of influence functions in neural network models trained on datasets such as Iris, MNIST, CIFAR-10 and ImageNet. Through our extensive experiments, we show that the network architecture, its depth and width, as well as the extent of model parameterization and the regularization techniques used, have strong effects on the accuracy of influence functions. In particular, we find that (i) influence estimates are fairly accurate for shallow networks, while for deeper networks the estimates are often erroneous; (ii) for certain network architectures and datasets, training with weight-decay regularization is important for obtaining high-quality influence estimates; and (iii) the accuracy of influence estimates can vary significantly depending on the examined test points. These results suggest that influence functions in deep learning are, in general, fragile, and they call for improved influence-estimation methods to mitigate these issues in non-convex setups.
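The abstract notes that the (first-order) influence function can be computed post hoc from the model's gradients and Hessian. As a rough illustration only, the sketch below applies that estimate to a toy logistic-regression problem in PyTorch; the synthetic data, the weight_decay value, and the use of an explicit Hessian with torch.linalg.solve are assumptions for illustration and not the paper's implementation (deep networks of the kind studied here would require Hessian-vector-product approximations such as conjugate gradients or LiSSA).

```python
# Minimal sketch (not the paper's code) of the first-order influence estimate of
# Koh & Liang (2017): I(z_i, z_test) = -grad L(z_test)^T H^{-1} grad L(z_i),
# where H is the Hessian of the regularized training loss at the trained parameters.
# An explicit Hessian is only tractable for tiny models like this logistic regression.
import torch

torch.manual_seed(0)
n, d = 50, 5                      # synthetic toy data (assumption)
X = torch.randn(n, d)
y = (torch.rand(n) < 0.5).float()
x_test, y_test = torch.randn(d), torch.tensor(1.0)
weight_decay = 0.01               # L2 regularization; keeps H positive-definite

def point_loss(w, x, t):
    # Logistic-regression loss on a single example.
    return torch.nn.functional.binary_cross_entropy_with_logits(x @ w, t)

def train_loss(w):
    # Average training loss plus the weight-decay term; its Hessian defines H.
    bce = torch.nn.functional.binary_cross_entropy_with_logits(X @ w, y)
    return bce + 0.5 * weight_decay * w.dot(w)

# Train the toy model so the influence estimate is taken at (roughly) an optimum.
w = torch.zeros(d, requires_grad=True)
opt = torch.optim.LBFGS([w], max_iter=100)

def closure():
    opt.zero_grad()
    loss = train_loss(w)
    loss.backward()
    return loss

opt.step(closure)

# Hessian of the training objective and gradient of the test loss at the optimum.
H = torch.autograd.functional.hessian(train_loss, w.detach())
g_test = torch.autograd.grad(point_loss(w, x_test, y_test), w)[0]
H_inv_g_test = torch.linalg.solve(H, g_test)

# Influence of each training point; positive values flag points whose upweighting
# is predicted to increase the test loss.
influences = torch.stack([
    -torch.autograd.grad(point_loss(w, X[i], y[i]), w)[0] @ H_inv_g_test
    for i in range(n)
])
print(influences.topk(3))
```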


Related research

- 09/12/2022: If Influence Functions are the Answer, Then What is the Question?
  Influence functions efficiently estimate the effect of removing a single...

- 03/22/2023: Revisiting the Fragility of Influence Functions
  In the last few years, many works have tried to explain the predictions ...

- 03/14/2017: Understanding Black-box Predictions via Influence Functions
  How can we explain the predictions of a black-box model? In this paper, ...

- 05/30/2019: On the Accuracy of Influence Functions for Measuring Group Effects
  Influence functions estimate the effect of removing particular training ...

- 05/02/2023: Class based Influence Functions for Error Detection
  Influence functions (IFs) are a powerful tool for detecting anomalous ex...

- 08/07/2023: Studying Large Language Model Generalization with Influence Functions
  When trying to gain better visibility into a machine learning model in o...

- 12/08/2020: Efficient Estimation of Influence of a Training Instance
  Understanding the influence of a training instance on a neural network m...
