Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games

04/14/2023
by   Benjamin Towle, et al.

Pre-training language models on large self-supervised corpora, followed by task-specific fine-tuning, has become the dominant paradigm in NLP. These pre-training datasets often have a one-to-many structure: in dialogue, for example, there are many valid responses for a given context. However, only some of these responses will be desirable in our downstream task. This raises the question of how we should train the model so that it can emulate the desirable behaviours, but not the undesirable ones. Current approaches train in a one-to-one setup, where only a single target response is given for a single dialogue context, leading to models that learn to predict only the average response while ignoring the full range of possible responses. Using text-based games as a testbed, our approach, PASA, uses discrete latent variables to capture the range of different behaviours represented in our larger pre-training dataset. We then use knowledge distillation to distil the posterior probability distribution into a student model. This probability distribution is far richer than learning only from the hard targets of the dataset, and thus allows the student model to benefit from the wider range of actions the teacher model has learned. Results show up to a 49% improvement over the previous state-of-the-art model on the Jericho Walkthroughs dataset.
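
The abstract gives no implementation details, but the distillation step it describes can be illustrated with a short sketch. The snippet below is a minimal, hypothetical example rather than the authors' PASA implementation: it assumes a teacher that outputs a full distribution over valid actions and blends the usual hard-target cross-entropy with a KL term against the teacher's soft distribution. The function name `distillation_loss` and the `temperature` and `alpha` hyperparameters are illustrative assumptions, and the teacher's discrete latent variables are not modelled here.

```python
# Illustrative sketch only (not the authors' code): distilling a teacher's
# soft action distribution into a student policy.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, hard_target,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL distillation with the usual hard-target loss.

    student_logits, teacher_logits: (batch, num_actions) unnormalised scores.
    hard_target: (batch,) index of the single action recorded in the dataset.
    """
    # Soft targets: the teacher's full distribution over possible actions.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean")

    # Hard target: the single ground-truth action from the dataset.
    ce = F.cross_entropy(student_logits, hard_target)

    # Scaling the KL term by temperature**2 keeps gradient magnitudes
    # comparable across temperatures (a standard distillation convention).
    return alpha * (temperature ** 2) * kl + (1.0 - alpha) * ce


# Toy usage with random tensors.
student_logits = torch.randn(4, 10)
teacher_logits = torch.randn(4, 10)
hard_target = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, hard_target)
```

The design intuition matches the abstract: the teacher's soft distribution carries information about the whole range of plausible actions, so the student learns more than it could from the single hard target alone.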
