Debiasing Pre-trained Contextualised Embeddings

01/23/2021
by Masahiro Kaneko et al.

In comparison to the numerous debiasing methods proposed for static, non-contextualised word embeddings, the discriminative biases in contextualised embeddings have received relatively little attention. We propose a fine-tuning method that can be applied at the token or sentence level to debias pre-trained contextualised embeddings. The proposed method can be applied to any pre-trained contextualised embedding model without retraining it. Using gender bias as an illustrative example, we then conduct a systematic study using several state-of-the-art (SoTA) contextualised representations on multiple benchmark datasets to evaluate the level of bias encoded in different contextualised embeddings before and after debiasing with the proposed method. We find that applying token-level debiasing to all tokens and across all layers of a contextualised embedding model produces the best performance. Interestingly, we observe a trade-off between creating an accurate and an unbiased contextualised embedding model, and different contextualised embedding models respond differently to this trade-off.
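Conceptually, token-level debiasing applied across all layers can be sketched as a fine-tuning objective with two terms: one that suppresses each token embedding's alignment with a pre-computed bias (attribute) direction at every layer, and one that keeps the fine-tuned embeddings close to those of the original, frozen encoder. The code below is a minimal sketch of that idea under those assumptions, not the paper's exact objective; the names `debias_loss`, `attribute_direction`, and the choice of `bert-base-uncased` are hypothetical.

```python
# Minimal sketch (not the authors' exact method): fine-tune a pre-trained
# contextualised encoder so that, at every layer, each token embedding keeps
# only a small component along a pre-computed bias direction, while staying
# close to the original (frozen) encoder's embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # hypothetical choice; any pre-trained encoder works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
reference = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
reference.eval()  # frozen copy, used only as a regularisation target
for p in reference.parameters():
    p.requires_grad_(False)


def layerwise_states(encoder, sentences):
    """Encode sentences and return the hidden states of every layer."""
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    outputs = encoder(**batch)
    return outputs.hidden_states, batch["attention_mask"]


def debias_loss(sentences, attribute_direction, alpha=1.0):
    """Token-level debiasing loss applied to all tokens in all layers.

    attribute_direction: a unit vector of size hidden_dim, assumed to be
    pre-computed from gendered attribute words (e.g. he/she pairs).
    """
    states, mask = layerwise_states(model, sentences)
    with torch.no_grad():
        ref_states, _ = layerwise_states(reference, sentences)

    token_mask = mask.unsqueeze(-1).float()  # (batch, seq_len, 1)
    bias_term, preserve_term = 0.0, 0.0
    for h, h_ref in zip(states, ref_states):
        # squared projection of every token embedding onto the bias direction
        proj = (h * attribute_direction).sum(dim=-1, keepdim=True)
        bias_term = bias_term + (proj.pow(2) * token_mask).sum() / token_mask.sum()
        # keep the debiased embeddings close to the original ones
        preserve_term = preserve_term + ((h - h_ref).pow(2) * token_mask).sum() / token_mask.sum()
    return bias_term + alpha * preserve_term


# One fine-tuning step over a toy batch (illustrative sentences).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
direction = torch.randn(model.config.hidden_size)
direction = direction / direction.norm()  # placeholder bias direction
loss = debias_loss(["The doctor said she would help.",
                    "The nurse said he would help."], direction)
loss.backward()
optimizer.step()
```

In practice the bias direction(s) would be estimated from lists of attribute words rather than sampled at random, and the choices sketched here (which tokens to debias, and whether to apply the loss at all layers or only the last) correspond to the token-level vs. sentence-level and layer-selection settings the paper compares.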
