Estimating Training Data Influence by Tracking Gradient Descent

02/19/2020
by Garima Pruthi, et al.

We introduce a method called TrackIn that computes the influence of a training example on a prediction made by the model. It does so by tracking how the loss on the test point changes during training whenever the training example of interest is used. We provide a scalable implementation of TrackIn that combines a few key ideas: (a) a first-order approximation to the exact computation, (b) random projections to speed up the computation of the first-order approximation for large models, (c) saved checkpoints of standard training procedures, and (d) cherry-picking layers of a deep neural network. An experimental evaluation shows that TrackIn is more effective at identifying mislabelled training examples than related methods such as influence functions and representer points. We also discuss insights from applying the method to vision, regression, and natural language tasks.
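As an illustration of ideas (a) and (c), the following is a minimal sketch of how a first-order, checkpoint-based influence score could be computed. It is not the authors' implementation. It assumes that the first-order approximation reduces to a sum, over saved checkpoints, of the learning rate times the dot product between the loss gradient at the training example and the loss gradient at the test example. The toy linear model, the squared loss, and the names `squared_loss_grad` and `checkpoint_influence` are hypothetical and chosen only for this example.

```python
import numpy as np


def squared_loss_grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w (toy linear model)."""
    return (w @ x - y) * x


def checkpoint_influence(checkpoints, learning_rates, z_train, z_test,
                         grad_fn=squared_loss_grad):
    """Sum over checkpoints of lr * <grad loss(train), grad loss(test)>.

    Assumption: this sum stands in for the first-order approximation
    mentioned in the abstract; it is a sketch, not the paper's code.
    """
    x_tr, y_tr = z_train
    x_te, y_te = z_test
    total = 0.0
    for w, lr in zip(checkpoints, learning_rates):
        g_train = grad_fn(w, x_tr, y_tr)  # gradient at the training example
        g_test = grad_fn(w, x_te, y_te)   # gradient at the test example
        total += lr * float(g_train @ g_test)
    return total


# Two hypothetical saved checkpoints of a 2-parameter linear model.
checkpoints = [np.array([0.0, 0.0]), np.array([0.5, 0.5])]
learning_rates = [0.1, 0.1]

z_train = (np.array([1.0, 0.0]), 1.0)
z_same = (np.array([1.0, 0.0]), 1.0)       # test point that agrees with z_train
z_opposite = (np.array([1.0, 0.0]), -1.0)  # same input, conflicting target

score_same = checkpoint_influence(checkpoints, learning_rates, z_train, z_same)
score_opposite = checkpoint_influence(checkpoints, learning_rates, z_train, z_opposite)
print(score_same, score_opposite)
```

Under this reading, a positive score suggests that gradient steps on the training example tended to reduce the loss on the test point, and a negative score suggests they tended to increase it. Only a handful of checkpoints and two gradient evaluations per checkpoint are needed, which is what makes the saved-checkpoint idea inexpensive.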


research
03/25/2020

RelatIF: Identifying Explanatory Training Examples via Relative Influence

In this work, we focus on the use of influence functions to identify rel...
research
10/12/2020

Explaining Neural Matrix Factorization with Gradient Rollback

Explaining the predictions of neural black-box models is an important pr...
research
11/23/2018

Representer Point Selection for Explaining Deep Neural Networks

We propose to explain the predictions of a deep neural network, by point...
research
02/04/2023

How Many and Which Training Points Would Need to be Removed to Flip this Prediction?

We consider the problem of identifying a minimal subset of training data...
research
06/07/2021

Interactive Label Cleaning with Example-based Explanations

We tackle sequential learning under label noise in applications where a ...
research
05/26/2023

Theoretical and Practical Perspectives on what Influence Functions Do

Influence functions (IF) have been seen as a technique for explaining mo...
research
03/14/2023

Simfluence: Modeling the Influence of Individual Training Examples by Simulating Training Runs

Training data attribution (TDA) methods offer to trace a model's predict...
