Critic-Guided Decoding for Controlled Text Generation

12/21/2022
by Minbeom Kim, et al.

Steering language generation toward desired objectives, or away from undesired content, has been a long-standing goal in the use of language models (LMs). Recent work has demonstrated that reinforcement learning and weighted decoding are effective approaches to achieving a higher level of language control and quality, each with its own pros and cons. In this work, we propose a novel critic decoding method for controlled language generation (CriticControl) that combines the strengths of reinforcement learning and weighted decoding. Specifically, we adopt the actor-critic framework to train an LM-steering critic from non-differentiable reward models. As in weighted decoding, our method freezes the language model and manipulates the output token distribution using the trained critic, improving training efficiency and stability. Evaluation of our method on three controlled generation tasks, namely topic control, sentiment control, and detoxification, shows that our approach generates more coherent and better-controlled text than previous methods. In addition, CriticControl demonstrates superior generalization ability in zero-shot settings. Human evaluation studies corroborate our findings.
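The core idea of critic-weighted decoding can be sketched as follows: the frozen LM proposes a next-token distribution, and a critic's per-token value estimates re-weight that distribution before sampling. The exact combination rule below (multiplying probabilities by critic values raised to a guidance strength `beta`, then renormalizing) is an illustrative assumption, not the paper's specification; `critic_values` stands in for a trained critic's outputs.

```python
import math

def critic_guided_sample(lm_logits, critic_values, beta=1.0):
    """Re-weight a frozen LM's token distribution with critic scores.

    lm_logits: raw next-token logits from the frozen language model.
    critic_values: hypothetical critic estimates in (0, 1] that each token
        steers generation toward the control objective (e.g. on-topic,
        positive sentiment, non-toxic).
    beta: guidance strength; beta=0 recovers the plain LM distribution.
    Returns the renormalized probability distribution over tokens.
    """
    # Softmax over the LM's logits (numerically stabilized).
    m = max(lm_logits)
    exps = [math.exp(l - m) for l in lm_logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Multiply each probability by the critic's value, raised to beta,
    # then renormalize so the result is again a distribution.
    weighted = [p * (v ** beta) for p, v in zip(probs, critic_values)]
    z2 = sum(weighted)
    return [w / z2 for w in weighted]

# Toy example: 3-token vocabulary; the LM slightly prefers token 0,
# but the critic strongly prefers token 2.
probs = critic_guided_sample([2.0, 1.0, 1.5], [0.1, 0.2, 0.9], beta=1.0)
steered_token = max(range(len(probs)), key=probs.__getitem__)  # token 2 wins
```

Because the LM is never updated, only the lightweight critic needs training, which is the efficiency and stability advantage the abstract refers to.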

