TRAK: Attributing Model Behavior at Scale

03/24/2023
by Sung-Min Park, et al.

The goal of data attribution is to trace model predictions back to training data. Despite a long line of work towards this goal, existing approaches to data attribution tend to force users to choose between computational tractability and efficacy. That is, computationally tractable methods can struggle with accurately attributing model predictions in non-convex settings (e.g., in the context of deep neural networks), while methods that are effective in such regimes require training thousands of models, which makes them impractical for large models or datasets. In this work, we introduce TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models. In particular, by leveraging only a handful of trained models, TRAK can match the performance of attribution methods that require training thousands of models. We demonstrate the utility of TRAK across various modalities and scales: image classifiers trained on ImageNet, vision-language models (CLIP), and language models (BERT and mT5). We provide code for using TRAK (and reproducing our work) at https://github.com/MadryLab/trak .
