Targeted Data Generation: Finding and Fixing Model Weaknesses

05/28/2023
by Zexue He, et al.

Even when aggregate accuracy is high, state-of-the-art NLP models often fail systematically on specific subgroups of data, resulting in unfair outcomes and eroding user trust. Additional data collection may not address these weaknesses, because such challenging subgroups may be unknown to users and underrepresented in both existing and new data. We propose Targeted Data Generation (TDG), a framework that automatically identifies challenging subgroups and generates new data for those subgroups using large language models (LLMs) with a human in the loop. TDG estimates the expected benefit and potential harm of data augmentation for each subgroup, and selects the subgroups most likely to improve within-group performance without hurting overall performance. In our experiments, TDG significantly improves accuracy on challenging subgroups for state-of-the-art sentiment analysis and natural language inference models, while also improving overall test accuracy.
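The selection step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the abstract does not say how benefit and harm are estimated, so the `Subgroup` fields, the `select_subgroups` function, and the `max_harm` and `top_k` parameters below are all hypothetical names chosen for illustration. The sketch assumes per-subgroup estimates are already available and simply keeps the below-average subgroups whose estimated harm to overall accuracy is tolerable, ranked by estimated within-group gain.

```python
from dataclasses import dataclass


@dataclass
class Subgroup:
    """Hypothetical record for one candidate subgroup (illustrative only)."""
    name: str
    accuracy: float          # current within-group accuracy
    expected_benefit: float  # estimated within-group gain from augmentation
    potential_harm: float    # estimated drop in overall accuracy


def select_subgroups(subgroups, overall_accuracy, max_harm=0.01, top_k=2):
    """Pick subgroups to augment.

    Assumed rule, not taken from the paper: keep subgroups that
    underperform the overall model, have a positive estimated benefit,
    and whose estimated harm stays within `max_harm`; then return the
    `top_k` with the largest estimated benefit.
    """
    candidates = [
        s for s in subgroups
        if s.accuracy < overall_accuracy
        and s.expected_benefit > 0
        and s.potential_harm <= max_harm
    ]
    candidates.sort(key=lambda s: s.expected_benefit, reverse=True)
    return candidates[:top_k]


if __name__ == "__main__":
    # Made-up numbers purely to exercise the function.
    groups = [
        Subgroup("negation", 0.70, 0.08, 0.000),
        Subgroup("sarcasm", 0.65, 0.05, 0.030),
        Subgroup("short reviews", 0.95, 0.01, 0.000),
        Subgroup("comparatives", 0.80, 0.03, 0.005),
    ]
    for s in select_subgroups(groups, overall_accuracy=0.92):
        print(s.name)
```

In this toy run, the "sarcasm" group is skipped despite its low accuracy because its estimated harm exceeds the tolerance, which mirrors the abstract's point that augmentation should not come at the cost of overall performance.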


research
05/24/2023

Large Language Models as Counterfactual Generator: Strengths and Weaknesses

Large language models (LLMs) have demonstrated remarkable performance in...
research
10/31/2019

Adversarial NLI: A New Benchmark for Natural Language Understanding

We introduce a new large-scale NLI benchmark dataset, collected via an i...
research
10/10/2022

Metaphorical Paraphrase Generation: Feeding Metaphorical Language Models with Literal Texts

This study presents a new approach to metaphorical paraphrase generation...
research
05/09/2022

Improving negation detection with negation-focused pre-training

Negation is a common linguistic feature that is crucial in many language...
research
09/01/2023

Will Sentiment Analysis Need Subculture? A New Data Augmentation Approach

The renowned proverb that "The pen is mightier than the sword" underscor...
research
06/24/2023

Towards Robust Aspect-based Sentiment Analysis through Non-counterfactual Augmentations

While state-of-the-art NLP models have demonstrated excellent performanc...
research
05/11/2018

Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness

Natural Language Inference is a challenging task that has received subst...
