Incorporating Priors with Feature Attribution on Text Classification

06/19/2019
by Frederick Liu, et al.

Feature attribution methods, proposed recently, help users interpret the predictions of complex models. Our approach integrates feature attributions into the objective function to allow machine learning practitioners to incorporate priors in model building. To demonstrate the effectiveness of our technique, we apply it to two tasks: (1) mitigating unintended bias in text classifiers by neutralizing identity terms; (2) improving classifier performance in a scarce data setting by forcing the model to focus on toxic terms. Our approach adds an L2 distance loss between feature attributions and task-specific prior values to the objective. Our experiments show that (i) a classifier trained with our technique reduces undesired model biases without a trade-off on the original task; (ii) incorporating priors helps model performance in scarce data settings.
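To make the shape of the objective concrete, here is a minimal sketch of a joint loss of the kind the abstract describes: a task loss plus a weighted distance between attributions and prior values. The abstract does not name the attribution method or the model, so several things here are assumptions chosen for illustration: a linear bag-of-words classifier, gradient-times-input attributions on the logit, a squared L2 distance, and the hypothetical names `attributions`, `joint_loss`, `priors`, and `lam`. It should not be read as the authors' implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def attributions(weights, x):
    # Assumed attribution method: gradient-times-input on the logit.
    # For a linear logit w.x + b, the gradient w.r.t. x_i is w_i.
    return [w * xi for w, xi in zip(weights, x)]


def joint_loss(weights, bias, x, label, priors, lam):
    """Return (total, cross_entropy, prior_loss) for one example.

    priors maps a feature index to its target attribution; features
    without an entry are left unconstrained.
    """
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    p = sigmoid(logit)
    cross_entropy = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    attr = attributions(weights, x)
    # Squared L2 distance between attributions and their prior values.
    prior_loss = sum((attr[i] - target) ** 2 for i, target in priors.items())
    return cross_entropy + lam * prior_loss, cross_entropy, prior_loss


# Hypothetical features: [filler word, identity term, insult].
weights = [0.1, 1.5, 2.0]
x = [1.0, 1.0, 0.0]   # a non-toxic sentence that mentions the identity term
priors = {1: 0.0}     # neutralize the identity term: target attribution 0
total, ce, prior = joint_loss(weights, 0.0, x, label=0, priors=priors, lam=1.0)
print(total, ce, prior)
```

In this toy setup the identity term carries a large weight, so the prior term penalizes the model until that term's attribution moves toward zero. A prior for the second task would instead assign a nonzero target to toxic terms. The weight `lam` controls how strongly the prior competes with the task loss.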

research
06/25/2019

Learning Explainable Models Using Attribution Priors

Two important topics in deep learning both involve incorporating humans ...
research
11/15/2021

Fast Axiomatic Attribution for Neural Networks

Mitigating the dependence on spurious correlations present in the traini...
research
08/31/2021

Explaining Classes through Word Attribution

In recent years, several methods have been proposed for explaining indiv...
research
08/24/2018

Building a Robust Text Classifier on a Test-Time Budget

We propose a generic and interpretable learning framework for building r...
research
12/20/2019

Learned Feature Attribution Priors

Deep learning models have achieved breakthrough successes in domains whe...
research
02/14/2018

Authorship Attribution Using the Chaos Game Representation

The Chaos Game Representation, a method for creating images from nucleot...
research
09/10/2023

Mitigating Word Bias in Zero-shot Prompt-based Classifiers

Prompt-based classifiers are an attractive approach for zero-shot classi...
