Model and Evaluation: Towards Fairness in Multilingual Text Classification

03/28/2023
by Nankai Lin, et al.

Research on bias in text classification models has grown rapidly, but most of it addresses the fairness of monolingual models, and research on fairness in multilingual text classification is still very limited. In this paper, we focus on multilingual text classification and propose a debiasing framework based on contrastive learning. The proposed method does not rely on any external language resources and can be extended to other languages. The model contains four modules: a multilingual text representation module, a language fusion module, a text debiasing module, and a text classification module. The multilingual text representation module uses a multilingual pre-trained language model to represent the text. The language fusion module uses contrastive learning to bring the semantic spaces of different languages closer together. The text debiasing module uses contrastive learning to prevent the model from identifying information about sensitive attributes. The text classification module performs the underlying multilingual text classification task. In addition, existing work on the fairness of multilingual text classification evaluates models in a relatively simple way: it reuses the monolingual equality difference metric, so the evaluation is performed on a single language. We propose a multi-dimensional fairness evaluation framework for multilingual text classification that evaluates the model's monolingual equality difference, multilingual equality difference, multilingual equality performance difference, and the destructiveness of the fairness strategy. We hope that our work provides a more general debiasing method and a more comprehensive evaluation framework for multilingual text fairness tasks.
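The abstract names the equality difference metric without defining it. As a rough illustration only, the sketch below assumes the commonly used false-positive formulation: the sum, over demographic groups, of the absolute gap between the overall false positive rate and each group's false positive rate. The function names, the binary label encoding (1 = positive), and the toy data are hypothetical and are not taken from the paper; the multilingual variants proposed by the authors are not reproduced here.

```python
from collections import defaultdict


def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN); returns 0.0 when there are no negative examples."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


def fpr_equality_difference(y_true, y_pred, groups):
    """Sum of |overall FPR - group FPR| over all groups (assumed definition)."""
    overall = false_positive_rate(y_true, y_pred)
    # Collect the labels and predictions belonging to each group.
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    return sum(
        abs(overall - false_positive_rate(ts, ps))
        for ts, ps in by_group.values()
    )


# Toy data for a single language: two hypothetical groups, "a" and "b".
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 1, 1]
groups = ["a", "a", "b", "b", "a", "b"]
score = fpr_equality_difference(y_true, y_pred, groups)
```

A score of zero would mean every group has the same false positive rate as the full evaluation set; larger values indicate larger gaps between groups. Under this assumed definition, a monolingual evaluation would compute the score separately for each language's test set.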


