Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets

04/04/2019
by Nelson F. Liu, et al.

Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks. While model performance on these challenge datasets is significantly lower compared to the original benchmark, it is unclear what particular weaknesses they reveal. For example, a challenge dataset may be difficult because it targets phenomena that current models cannot capture, or because it simply exploits blind spots in a model's specific training set. We introduce inoculation by fine-tuning, a new analysis method for studying challenge datasets by exposing models (the metaphorical patient) to a small amount of data from the challenge dataset (a metaphorical pathogen) and assessing how well they can adapt. We apply our method to analyze the NLI "stress tests" (Naik et al., 2018) and the Adversarial SQuAD dataset (Jia and Liang, 2017). We show that after slight exposure, some of these datasets are no longer challenging, while others remain difficult. Our results indicate that failures on challenge datasets may lead to very different conclusions about models, training datasets, and the challenge datasets themselves.
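To make the analysis loop concrete, here is a minimal sketch of inoculation by fine-tuning under stated assumptions: the helpers split, clone, fine_tune, and evaluate are hypothetical stand-ins for whatever training and evaluation code a given model uses, and this is not the authors' released implementation.

    import random

    def inoculate(model, original_test, challenge_data, sizes=(100, 500, 1000), seed=0):
        """Hedged sketch of inoculation by fine-tuning; the helpers used below are hypothetical."""
        rng = random.Random(seed)
        # Hold out part of the challenge data for evaluation; fine-tune on the rest.
        challenge_train, challenge_test = split(challenge_data)   # hypothetical split helper
        results = []
        for k in sizes:
            sample = rng.sample(challenge_train, k)                # small "pathogen" dose of k examples
            patient = fine_tune(clone(model), sample)              # hypothetical clone/fine-tune routines
            results.append({
                "k": k,
                "original": evaluate(patient, original_test),      # did original-benchmark accuracy survive?
                "challenge": evaluate(patient, challenge_test),    # did the challenge gap close?
            })
        return results

Roughly, if the challenge gap closes after a small dose while original-test accuracy holds, the failure looks like a blind spot in the original training set; if the gap persists, the targeted phenomenon appears genuinely hard for the model; and if original-test accuracy degrades, the challenge set itself may be the outlier.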

