DeepAI AI Chat
Log In Sign Up

Parameter-efficient Modularised Bias Mitigation via AdapterFusion

02/13/2023
by   Deepak Kumar, et al.
Brown University
Johannes Kepler University Linz
Association for Computing Machinery
0

Large pre-trained language models contain societal biases and carry along these biases to downstream tasks. Current in-processing bias mitigation approaches (like adversarial training) impose debiasing by updating a model's parameters, effectively transferring the model to a new, irreversible debiased state. In this work, we propose a novel approach to develop stand-alone debiasing functionalities separate from the model, which can be integrated into the model on-demand, while keeping the core model untouched. Drawing from the concept of AdapterFusion in multi-task learning, we introduce DAM (Debiasing with Adapter Modules) - a debiasing approach to first encapsulate arbitrary bias mitigation functionalities into separate adapters, and then add them to the model on-demand in order to deliver fairness qualities. We conduct a large set of experiments on three classification tasks with gender, race, and age as protected attributes. Our results show that DAM improves or maintains the effectiveness of bias mitigation, avoids catastrophic forgetting in a multi-attribute scenario, and maintains on-par task performance, while granting parameter-efficiency and easy switching between the original and debiased models.

READ FULL TEXT

page 1

page 2

page 3

page 4

01/30/2023

How Far Can It Go?: On Intrinsic Gender Bias Mitigation for Text Classification

To mitigate gender bias in contextualized language models, different int...
11/26/2019

Towards Fairness in Visual Recognition: Effective Strategies for Bias Mitigation

Computer vision models learn to perform a task by capturing relevant sta...
10/11/2021

Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting

Although pre-trained language models, such as BERT, achieve state-of-art...
06/09/2022

Unlearning Protected User Attributes in Recommendations with Adversarial Training

Collaborative filtering algorithms capture underlying consumption patter...
09/16/2022

Less is Better: Recovering Intended-Feature Subspace to Robustify NLU Models

Datasets with significant proportions of bias present threats for traini...
09/08/2021

Sustainable Modular Debiasing of Language Models

Unfair stereotypical biases (e.g., gender, racial, or religious biases) ...
05/30/2022

Parameter Efficient Diff Pruning for Bias Mitigation

In recent years language models have achieved state of the art performan...