An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction

09/04/2019
by Stefan Larson, et al.

Task-oriented dialog systems need to know when a query falls outside their range of supported intents, but current text classification corpora only define label sets that cover every example. We introduce a new dataset that includes out-of-scope queries, i.e., queries that do not fall into any of the system's supported intents. This poses a new challenge because models cannot assume that every query at inference time belongs to a system-supported intent class. Our dataset also covers 150 intent classes over 10 domains, capturing the breadth that a production task-oriented agent must handle. We evaluate a range of benchmark classifiers on our dataset along with several out-of-scope identification schemes. We find that while the classifiers perform well on in-scope intent classification, they struggle to identify out-of-scope queries. Our dataset and evaluation fill an important gap in the field, offering a more rigorous and realistic way to benchmark text classification in task-driven dialog systems.
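To make the evaluation setting concrete, the following is a minimal sketch of one possible out-of-scope identification scheme: rejecting a query when the classifier's top confidence falls below a threshold. The abstract does not specify which schemes were evaluated, so the thresholding approach, the `oos` label name, the default threshold of 0.7, and the function names are all assumptions made for illustration. The sketch reports in-scope accuracy and out-of-scope recall separately, since the abstract distinguishes between the two.

```python
from typing import Dict, List, Tuple

# Hypothetical label for out-of-scope queries (an assumption, not taken from the abstract).
OOS_LABEL = "oos"


def predict_with_threshold(scores: Dict[str, float], threshold: float = 0.7) -> str:
    """Return the top-scoring intent, or OOS_LABEL if its score is below the threshold.

    `scores` maps each supported intent to a classifier confidence in [0, 1].
    The threshold value is illustrative and would need tuning on held-out data.
    """
    best_intent = max(scores, key=scores.get)
    if scores[best_intent] < threshold:
        return OOS_LABEL
    return best_intent


def evaluate(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (in-scope accuracy, out-of-scope recall) for (gold, predicted) pairs."""
    in_scope = [(g, p) for g, p in pairs if g != OOS_LABEL]
    out_scope = [(g, p) for g, p in pairs if g == OOS_LABEL]
    # Accuracy over queries whose gold label is a supported intent.
    accuracy = sum(g == p for g, p in in_scope) / len(in_scope) if in_scope else 0.0
    # Fraction of gold out-of-scope queries that the system rejected.
    recall = sum(p == OOS_LABEL for _, p in out_scope) / len(out_scope) if out_scope else 0.0
    return accuracy, recall
```

For example, with hypothetical intents `balance` and `transfer`, a confident score distribution such as `{"balance": 0.9, "transfer": 0.1}` yields `balance`, while a flat distribution with no score above the threshold is rejected as `oos`. A classifier can score well on the first metric and poorly on the second, which is the kind of gap the abstract describes.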

