INCLUSIFY: A benchmark and a model for gender-inclusive German

12/05/2022
by David Pomerenke, et al.

Gender-inclusive language is important for achieving gender equality in languages with gender inflections, such as German. Although it stirs some controversy, it is increasingly adopted by companies and political institutions. A handful of tools have been developed to help people use gender-inclusive language by identifying instances of the generic masculine and suggesting more inclusive reformulations. In this report, we define the underlying tasks in terms of natural language processing and present a dataset and measures for benchmarking them. We also present a model that implements these tasks by combining an inclusive language database with an elaborate sequence of processing steps via standard pre-trained models. Our model achieves a recall of 0.89 and a precision of 0.82 in our benchmark for identifying exclusive language, and one of its top five suggestions is chosen in real-world texts in 44% of cases. We sketch how the area could be further advanced by training end-to-end models and using large language models, and we urge the community to include more gender-inclusive texts in their training data so as not to present an obstacle to the adoption of gender-inclusive language. Through these efforts, we hope to contribute to restoring justice in language and, to a small extent, in reality.
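The abstract reports precision and recall for identifying exclusive language and a top-five hit rate for suggestions. As a rough illustration of how such measures are typically computed, here is a minimal sketch. The function names, the representation of a flagged instance as a (sentence index, word) pair, and the toy data are all assumptions made for illustration; they are not taken from the paper's benchmark or code.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of flagged items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def top_k_hit_rate(suggestion_lists, chosen, k=5):
    """Share of cases where the chosen reformulation is among the first k suggestions."""
    if not chosen:
        return 0.0
    hits = sum(
        1
        for suggestions, choice in zip(suggestion_lists, chosen)
        if choice in suggestions[:k]
    )
    return hits / len(chosen)


# Hypothetical toy data: (sentence index, word) pairs flagged as generic masculine.
gold = {(0, "Lehrer"), (1, "Studenten"), (2, "Mitarbeiter")}
predicted = {(0, "Lehrer"), (1, "Studenten"), (3, "Kunden")}
precision, recall = precision_recall(predicted, gold)

# Hypothetical suggestions per flagged instance, and the reformulation a writer chose.
suggestion_lists = [["Lehrkräfte", "Lehrende"], ["Studierende"]]
chosen = ["Lehrende", "Studentinnen und Studenten"]
hit_rate = top_k_hit_rate(suggestion_lists, chosen, k=5)
```

In this toy setup two of the three predicted instances are correct and two of the three gold instances are found, and one of the two chosen reformulations appears among the suggestions. How the actual benchmark matches instances and counts chosen suggestions may differ.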


