Gold Doesn't Always Glitter: Spectral Removal of Linear and Nonlinear Guarded Attribute Information

03/15/2022
by Shun Shao, et al.

We describe a simple and effective method, Spectral Attribute removaL (SAL), for removing guarded information from neural representations. The method uses singular value decomposition and eigenvalue decomposition to project the input representations onto directions with reduced covariance with the guarded information, rather than onto the directions of maximal covariance for which these factorization methods are normally used. We begin with linear information removal and then generalize the algorithm to nonlinear information removal through the use of kernels. Our experiments demonstrate that the algorithm retains better main-task performance after removing the guarded information than previous methods do. They also show that a relatively small amount of guarded attribute data suffices to remove information about these attributes, which lowers exposure to such possibly sensitive data and makes the method better suited to low-resource scenarios.
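
The linear case described in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated here: that the representations X and the guarded attributes Z are mean-centered, that the empirical cross-covariance between them is factorized with SVD, and that the top k left singular directions are the ones discarded. The function name `sal_linear_projection`, the parameter `k`, and the toy data are hypothetical and chosen only for illustration; this is not the authors' implementation, and the kernel extension is not covered.

```python
import numpy as np

def sal_linear_projection(X, Z, k):
    """Return a (d, d) projection that drops the k directions of X
    with the largest covariance with Z (illustrative sketch only).

    X: (n, d) array of representations.
    Z: (n, m) array encoding the guarded attribute (e.g. one-hot).
    """
    # Center both views so the product below is a cross-covariance.
    Xc = X - X.mean(axis=0)
    Zc = Z - Z.mean(axis=0)
    cross_cov = Xc.T @ Zc / X.shape[0]          # shape (d, m)

    # Left singular vectors are ordered by decreasing singular value,
    # i.e. by how strongly each direction of X covaries with Z.
    U, _, _ = np.linalg.svd(cross_cov, full_matrices=True)

    # Keep the low-covariance directions instead of the top ones.
    U_keep = U[:, k:]                            # shape (d, d - k)
    return U_keep @ U_keep.T

# Toy data (assumed): a binary guarded attribute leaks into feature 0.
rng = np.random.default_rng(0)
n, d = 500, 8
z = rng.integers(0, 2, size=n)
Z = np.eye(2)[z]                                 # one-hot, shape (n, 2)
X = rng.normal(size=(n, d))
X[:, 0] += 3.0 * z

P = sal_linear_projection(X, Z, k=1)
X_clean = X @ P                                  # projected representations

# Residual cross-covariance between the projected data and Z.
residual = (X_clean - X_clean.mean(axis=0)).T @ (Z - Z.mean(axis=0)) / n
```

In this toy setup the centered one-hot encoding of a binary attribute has rank one, so removing a single direction is enough to drive the empirical cross-covariance to numerical zero. With richer attributes, `k` would need to be chosen with more care.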


