DIET: Lightweight Language Understanding for Dialogue Systems

04/21/2020
by Tanja Bunk, et al.

Large-scale pre-trained language models have shown impressive results on language understanding benchmarks like GLUE and SuperGLUE, improving considerably over other pre-training methods like distributed representations (GloVe) and purely supervised approaches. We introduce the Dual Intent and Entity Transformer (DIET) architecture, and study the effectiveness of different pre-trained representations on intent and entity prediction, two common dialogue language understanding tasks. DIET advances the state of the art on a complex multi-domain NLU dataset and achieves similarly high performance on other simpler datasets. Surprisingly, we show that there is no clear benefit to using large pre-trained models for this task, and in fact DIET improves upon the current state of the art even in a purely supervised setup without any pre-trained embeddings. Our best performing model outperforms fine-tuning BERT and is about six times faster to train.
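To make the joint intent-and-entity setup concrete, the sketch below shows a model that shares a transformer encoder between a sentence-level intent head and a token-level entity head, which is the general shape of the task DIET addresses. This is a minimal illustration, not the authors' implementation: the class name, layer sizes, and the plain linear output heads are simplifying assumptions.

```python
# Minimal sketch of a joint intent-and-entity model in the spirit of DIET.
# NOT the authors' implementation: layer sizes, names, and the use of a
# vanilla nn.TransformerEncoder with plain linear heads are illustrative only.
import torch
import torch.nn as nn


class JointIntentEntityModel(nn.Module):
    def __init__(self, vocab_size, num_intents, num_entity_tags,
                 d_model=128, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Sentence-level head: intent classification from a pooled representation.
        self.intent_head = nn.Linear(d_model, num_intents)
        # Token-level head: BIO-style entity tag prediction for every token.
        self.entity_head = nn.Linear(d_model, num_entity_tags)

    def forward(self, token_ids, padding_mask=None):
        # token_ids: (batch, seq_len); padding_mask: True at padded positions.
        hidden = self.encoder(self.embed(token_ids),
                              src_key_padding_mask=padding_mask)
        # Mean-pool non-padded positions to get a sentence representation.
        if padding_mask is not None:
            keep = (~padding_mask).unsqueeze(-1).float()
            pooled = (hidden * keep).sum(1) / keep.sum(1).clamp(min=1)
        else:
            pooled = hidden.mean(1)
        intent_logits = self.intent_head(pooled)   # (batch, num_intents)
        entity_logits = self.entity_head(hidden)   # (batch, seq_len, num_entity_tags)
        return intent_logits, entity_logits


# Toy usage: 2 sentences of 6 tokens each over a 100-word vocabulary.
model = JointIntentEntityModel(vocab_size=100, num_intents=5, num_entity_tags=9)
tokens = torch.randint(1, 100, (2, 6))
intent_logits, entity_logits = model(tokens)
print(intent_logits.shape, entity_logits.shape)
```

Training such a model would combine an intent classification loss on the pooled output with a per-token entity tagging loss; the single shared encoder is what makes the two tasks inform each other.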


Related research

04/30/2020 - Enriched Pre-trained Transformers for Joint Slot Filling and Intent Detection
Detecting the user's intent and finding the corresponding slots among th...

05/12/2021 - News Headline Grouping as a Challenging NLU Task
Recent progress in Natural Language Understanding (NLU) has seen the lat...

09/05/2019 - Effective Use of Transformer Networks for Entity Tracking
Tracking entities in procedural language requires understanding the tran...

04/19/2022 - ALBETO and DistilBETO: Lightweight Spanish Language Models
In recent years there have been considerable advances in pre-trained lan...

04/24/2020 - Data Annealing for Informal Language Understanding Tasks
There is a huge performance gap between formal and informal language und...

04/14/2023 - Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games
Language models pre-trained on large self-supervised corpora, followed b...

12/16/2022 - Decoder Tuning: Efficient Language Understanding as Decoding
With the evergrowing sizes of pre-trained models (PTMs), it has been an ...
