Types of Out-of-Distribution Texts and How to Detect Them

09/14/2021
by Udit Arora, et al.

Despite agreement on the importance of detecting out-of-distribution (OOD) examples, there is little consensus on the formal definition of OOD examples and how to best detect them. We categorize these examples by whether they exhibit a background shift or a semantic shift, and find that the two major approaches to OOD detection, model calibration and density estimation (language modeling for text), have distinct behavior on these types of OOD data. Across 14 pairs of in-distribution and OOD English natural language understanding datasets, we find that density estimation methods consistently beat calibration methods in background shift settings, while performing worse in semantic shift settings. In addition, we find that both methods generally fail to detect examples from challenge data, highlighting a weak spot for current methods. Since no single method works well across all settings, our results call for an explicit definition of OOD examples when evaluating different detection methods.
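To make the two families of detectors concrete, here is a minimal sketch of one score from each. The abstract does not say which scores are used, so both choices are assumptions made for illustration: maximum softmax probability stands in for the calibration approach, and language-model perplexity stands in for density estimation. The function names, and the inputs (classifier logits and per-token log-probabilities from some language model), are hypothetical.

```python
import math
from typing import Sequence


def max_softmax_probability(logits: Sequence[float]) -> float:
    """Confidence of a classifier: the largest softmax probability.

    Assumption: `logits` are the raw class scores of a task classifier.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return max(exps) / sum(exps)


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity of a text under a language model.

    Assumption: `token_log_probs` holds the natural-log probability the
    language model assigned to each token of the text.
    """
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def calibration_ood_score(logits: Sequence[float]) -> float:
    """Higher means more likely OOD: low classifier confidence."""
    return 1.0 - max_softmax_probability(logits)


def density_ood_score(token_log_probs: Sequence[float]) -> float:
    """Higher means more likely OOD: text is unlikely under the LM."""
    return perplexity(token_log_probs)


if __name__ == "__main__":
    # A confident prediction vs. a near-uniform one (made-up logits).
    print(calibration_ood_score([4.0, 0.0, 0.0]))
    print(calibration_ood_score([0.1, 0.0, 0.0]))
    # A text whose tokens each received probability 0.25 (made-up values).
    print(density_ood_score([math.log(0.25)] * 4))
```

In practice either score would be compared against a threshold chosen on held-out data. The sketch only shows that the two scores draw on different signals: the first depends on the classifier's output over labels, the second on how likely the input text itself is.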


