Pseudo-OOD training for robust language models

10/17/2022
by   Dhanasekar Sundararaman, et al.

While pre-trained large-scale deep models have garnered attention for many downstream natural language processing (NLP) tasks, such models often make unreliable predictions on out-of-distribution (OOD) inputs. As such, OOD detection is a key component of a reliable machine-learning model for any industry-scale application. Common approaches often assume access to additional OOD samples during the training stage; however, the outlier distribution is often unknown in advance. Instead, we propose a post hoc framework called POORE - POsthoc pseudo-Ood REgularization - that generates pseudo-OOD samples using in-distribution (IND) data. The model is fine-tuned with a new regularization loss that separates the embeddings of IND and OOD data, which leads to significant gains on the OOD detection task during testing. We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art results in OOD detection.
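The following is a minimal sketch of the idea described above, not the authors' exact POORE recipe: pseudo-OOD samples are created by corrupting in-distribution (IND) sequences, and a margin-based regularization pushes their embeddings away from the IND embeddings during fine-tuning. The corruption scheme, the function names (make_pseudo_ood, ood_margin_loss), and the margin value are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn.functional as F

def make_pseudo_ood(input_ids, mask_token_id, p=0.4):
    # Corrupt IND sequences by randomly replacing a fraction p of tokens with
    # the mask token; the corrupted sequences serve as pseudo-OOD inputs.
    noise = torch.rand(input_ids.shape, device=input_ids.device)
    return torch.where(noise < p, torch.full_like(input_ids, mask_token_id), input_ids)

def ood_margin_loss(ind_emb, ood_emb, margin=1.0):
    # Pull IND sentence embeddings toward their centroid and push pseudo-OOD
    # embeddings at least `margin` away from it.
    centroid = ind_emb.mean(dim=0, keepdim=True)
    ind_dist = (ind_emb - centroid).norm(dim=-1)
    ood_dist = (ood_emb - centroid).norm(dim=-1)
    return ind_dist.mean() + F.relu(margin - ood_dist).mean()

During fine-tuning, such a regularizer would typically be added to the usual task loss with a small weight, and at test time the distance of an input's embedding from the IND centroid can serve as an OOD score; both choices are assumptions here rather than specifics from the paper.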


