KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation

06/17/2023
by Yuxi Feng, et al.

Self-training (ST) has proven fruitful in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend to over-exploit the previously learned text distribution, suffering from mode collapse and poor generation diversity. Second, generating pseudo text in each iteration is time-consuming, severely slowing down the training process. In this work, we propose KEST, a novel and efficient self-training framework that addresses these problems. KEST uses a kernel-based loss, rather than standard cross entropy, to learn from the soft pseudo text produced by a shared non-autoregressive generator. We demonstrate both theoretically and empirically that KEST can benefit from more diverse pseudo text in an efficient manner, allowing it not only to refine and exploit the previously fitted distribution but also to explore a larger potential text space, with a guarantee of improved performance. Experiments on three controllable generation tasks demonstrate that KEST significantly improves control accuracy while maintaining comparable text fluency and generation diversity against several strong baselines.
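To make the kernel-based objective concrete, the following is a minimal PyTorch sketch of one plausible instantiation: a squared maximum mean discrepancy (MMD) under an RBF kernel between feature batches drawn from real text and from soft pseudo text. The encoder, kernel choice, bandwidth, and tensor shapes here are illustrative assumptions, not the paper's exact formulation.

    import torch

    def rbf_kernel(x, y, bandwidth=1.0):
        # Gram matrix under an RBF kernel; x: (n, d), y: (m, d) -> (n, m).
        sq_dists = torch.cdist(x, y, p=2).pow(2)
        return torch.exp(-sq_dists / (2 * bandwidth ** 2))

    def mmd_loss(real_feats, pseudo_feats, bandwidth=1.0):
        # Squared MMD between two batches of features. Unlike token-level
        # cross entropy, this kernel distance compares distributions of
        # representations, so diverse pseudo text is not penalized for
        # deviating from any single reference sequence.
        k_rr = rbf_kernel(real_feats, real_feats, bandwidth)
        k_pp = rbf_kernel(pseudo_feats, pseudo_feats, bandwidth)
        k_rp = rbf_kernel(real_feats, pseudo_feats, bandwidth)
        return k_rr.mean() + k_pp.mean() - 2 * k_rp.mean()

    # Toy usage: in practice the features might come from a shared encoder,
    # with soft pseudo text embedded as probability-weighted token embeddings
    # (hypothetical setup; batch size and dimension below are arbitrary).
    real = torch.randn(32, 768)
    pseudo = torch.randn(32, 768)
    loss = mmd_loss(real, pseudo)

Such a loss stays differentiable with respect to soft token distributions, which is one reason a distribution-level objective pairs naturally with a non-autoregressive generator that emits full token distributions in parallel rather than decoding pseudo text one token at a time.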

Related research

12/16/2022
DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
Self-training (ST) has prospered again in language understanding by augm...

09/25/2020
Controllable Text Generation with Focused Variation
This work introduces Focused-Variation Network (FVN), a novel model to c...

05/15/2022
Classifiers are Better Experts for Controllable Text Generation
This paper proposes a simple method for controllable text generation bas...

06/20/2023
On Compositionality and Improved Training of NADO
NeurAlly-Decomposed Oracle (NADO) is a powerful approach for controllabl...

11/14/2022
Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention
Recently, powerful Transformer architectures have proven superior in gen...

12/10/2021
Discourse-Aware Prompt Design for Text Generation
Current efficient fine-tuning methods (e.g., adapters, prefix-tuning, et...
