QAGAN: Adversarial Approach To Learning Domain Invariant Language Features

06/24/2022
by Shubham Shrivastava, et al.

Training models that are robust to data domain shift has gained increasing interest in both academia and industry. Question answering, a typical problem in Natural Language Processing (NLP) research, has seen much success with the advent of large transformer models. However, existing approaches mostly work under the assumption that data is drawn from the same distribution during training and testing, which is unrealistic and does not scale in the wild. In this paper, we explore an adversarial training approach to learning domain-invariant features so that language models can generalize well to out-of-domain datasets. We also examine several other ways to boost model performance, including data augmentation by paraphrasing sentences, conditioning the end-of-answer-span prediction on the start word, and a carefully designed annealing function. Our initial results show that, in combination with these methods, we achieve a 15.2% improvement in EM score and a 5.6% boost in F1 score on the out-of-domain validation dataset over the baseline. We also dissect our model outputs and visualize the model's hidden states by projecting them onto a lower-dimensional space, and find that our adversarial training approach indeed encourages the model to learn domain-invariant embeddings and bring them closer together in the multi-dimensional space.
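To make the adversarial idea more concrete, below is a minimal PyTorch sketch of one common way to implement such a domain-invariance objective: a small domain discriminator attached to the encoder through a gradient-reversal layer. The class names, pooling choice, loss weighting, and the annealed_lambda schedule are illustrative assumptions, not the paper's exact QAGAN implementation.

import math
import torch
import torch.nn as nn


class GradientReversal(torch.autograd.Function):
    # Identity in the forward pass; negates (and scales) gradients on the way back,
    # so the encoder is pushed to produce features the discriminator cannot separate.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class DomainDiscriminator(nn.Module):
    # Classifies the source domain of a pooled transformer hidden state.
    def __init__(self, hidden_dim=768, num_domains=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_domains),
        )

    def forward(self, pooled_hidden, lam=1.0):
        return self.net(GradientReversal.apply(pooled_hidden, lam))


def annealed_lambda(step, total_steps):
    # Hypothetical annealing schedule for the adversarial weight; the abstract only
    # says a "carefully designed annealing function" is used, so this is an assumption.
    p = step / max(total_steps, 1)
    return 2.0 / (1.0 + math.exp(-10.0 * p)) - 1.0


# Assumed usage inside a training step (encoder, QA span loss, and domain labels
# are placeholders for whatever backbone and data pipeline is used):
#   pooled = encoder_outputs.last_hidden_state[:, 0]            # [CLS]-style pooling
#   domain_logits = discriminator(pooled, annealed_lambda(step, total_steps))
#   adv_loss = nn.functional.cross_entropy(domain_logits, domain_labels)
#   total_loss = qa_span_loss + adv_loss                        # relative weighting is an assumption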


