Keeping Up with the Language Models: Robustness-Bias Interplay in NLI Data and Models

05/22/2023
by Ioana Baldini, et al.

Auditing unwanted social bias in language models (LMs) is inherently hard due to the multidisciplinary nature of the work. In addition, the rapid evolution of LMs can make benchmarks irrelevant in no time. Bias auditing is further complicated by LM brittleness: when a presumably biased outcome is observed, is it due to model bias or model brittleness? We propose enlisting the models themselves to help construct bias auditing datasets that remain challenging, and introduce bias measures that distinguish between types of model errors. First, we extend an existing bias benchmark for NLI (BBNLI) using a combination of LM-generated lexical variations, adversarial filtering, and human validation. We demonstrate that the newly created dataset (BBNLI-next) is more challenging than BBNLI: on average, BBNLI-next reduces the accuracy of state-of-the-art NLI models from 95.3%. Second, we employ BBNLI-next to showcase the interplay between robustness and bias, and the subtlety in differentiating between the two. Third, we point out shortcomings in current bias scores used in the literature and propose bias measures that take into account pro-/anti-stereotype bias and model brittleness. We will publicly release the BBNLI-next dataset to inspire research on rapidly expanding benchmarks to keep up with model evolution, along with research on the robustness-bias interplay in bias auditing. Note: This paper contains offensive text examples.
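The dataset-construction step described above (LM-generated lexical variations passed through adversarial filtering before human validation) can be illustrated with a minimal sketch. The model choice (roberta-large-mnli), the helper names nli_label and adversarial_filter, and the candidate-triple format are assumptions made for illustration only; they are not the paper's released pipeline, which additionally relies on human validation of the retained items.

```python
# Minimal sketch of adversarial filtering for NLI bias-probe candidates.
# Assumption: candidates are (premise, hypothesis, gold_label) triples,
# e.g. LM-generated lexical variations of existing BBNLI items.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-large-mnli"  # any strong off-the-shelf NLI model would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()


def nli_label(premise: str, hypothesis: str) -> str:
    """Return the filter model's predicted NLI label for a premise/hypothesis pair."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[logits.argmax(dim=-1).item()]


def adversarial_filter(candidates):
    """Keep only the candidate pairs the filter model labels incorrectly.

    The surviving items are the "challenging" ones that would then be sent
    to human validation before entering the benchmark.
    """
    kept = []
    for premise, hypothesis, gold in candidates:
        if nli_label(premise, hypothesis).lower() != gold.lower():
            kept.append((premise, hypothesis, gold))
    return kept
```

In this sketch, filtering against a single strong NLI model is a simplification; using the models themselves to discard examples they already solve is what keeps the resulting benchmark challenging as models improve.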



