Variational Latent-State GPT for Semi-supervised Task-Oriented Dialog Systems

09/09/2021
by Hong Liu, et al.

Recently, two approaches, fine-tuning large pre-trained language models and variational training, have separately attracted significant interest for semi-supervised end-to-end task-oriented dialog (TOD) systems. In this paper, we propose the Variational Latent-State GPT model (VLS-GPT), which is the first to combine the strengths of the two approaches. Among the many possible model choices, we propose a generative model and an inference model for variational learning of the end-to-end TOD system, both of which are auto-regressive language models based on GPT-2 and can be further trained over a mix of labeled and unlabeled dialog data in a semi-supervised manner. We develop a sampling-then-forward-computation strategy, which overcomes the memory explosion issue of using GPT in variational learning and speeds up training. Semi-supervised TOD experiments are conducted on two benchmark multi-domain datasets in different languages: MultiWOZ2.1 and CrossWOZ. VLS-GPT is shown to significantly outperform both supervised-only and semi-supervised baselines.
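
For readers unfamiliar with variational training, the following is a generic sketch of the kind of bound such methods typically optimize on unlabeled data. It is not taken from the paper: the symbols are assumptions introduced here, with x standing for the observed dialog text, h for the unobserved latent states, p_θ for the generative model, and q_φ for the inference model. The paper's exact objective may differ.

```latex
% Generic evidence lower bound (ELBO); notation is assumed, not the paper's.
\log p_\theta(x)
  \;\ge\;
  \mathbb{E}_{q_\phi(h \mid x)}
  \bigl[ \log p_\theta(x, h) - \log q_\phi(h \mid x) \bigr]
```

On labeled dialogs, where h is observed, both models can instead be trained directly by maximum likelihood, which is what makes a mix of labeled and unlabeled data usable in one training procedure.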

research
07/06/2022

A Challenge on Semi-Supervised and Reinforced Task-Oriented Dialog Systems

A challenge on Semi-Supervised and Reinforced Task-Oriented Dialog Syste...
research
09/17/2020

A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

Structured belief states are crucial for user goal tracking and database...
research
04/13/2022

Revisiting Markovian Generative Architectures for Efficient Task-Oriented Dialog Systems

Recently, Transformer based pretrained language models (PLMs), such as G...
research
10/17/2022

Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems

Recent advances in neural approaches greatly improve task-oriented dialo...
research
06/07/2019

Semi-supervised Stochastic Multi-Domain Learning using Variational Inference

Supervised models of NLP rely on large collections of text which closely...
research
02/22/2023

Semi-Supervised Approach for Early Stuck Sign Detection in Drilling Operations

A real-time stuck pipe prediction methodology is proposed in this paper....
research
06/26/2023

How About Kind of Generating Hedges using End-to-End Neural Models?

Hedging is a strategy for softening the impact of a statement in convers...
