Sampling with Attribute-Related Information for Controlling Language Models

05/12/2022
by Shangda Wu, et al.

The dominant approaches to controlling language models are fine-tuning large language models or prompt engineering, but these methods often require condition-specific data or considerable hand-crafting. We propose Gamma Sampling, a simple new guided decoding method that requires neither complex engineering nor extra data. Gamma Sampling introduces attribute-related information (provided by humans or by language models themselves) into the sampling process to guide language models to generate text with desired attributes. Experiments on controlling the topics and sentiments of generated text show that Gamma Sampling is superior in diversity, attribute relevance, and overall quality of generated samples while maintaining a fast generation speed. In addition, we successfully applied Gamma Sampling to control other attributes of language, such as relatedness and repetition, which further demonstrates the versatility and effectiveness of the method. Gamma Sampling is available in the Python package samplings via "from samplings import gamma_sampling".
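As a rough illustration of the mechanism (a minimal sketch inferred from the abstract, not the paper's exact formulation or the samplings package's API; the attribute token set, the mass-rescaling scheme, and all names below are assumptions), attribute-related information can enter the sampling step by boosting the total probability mass of attribute-related tokens with an exponent gamma and renormalizing before sampling:

    import numpy as np

    def gamma_sampling_sketch(probs, attr_ids, gamma, rng=None):
        # Illustrative sketch only: raise the total probability mass of
        # attribute-related tokens to the power gamma (for a mass below 1,
        # gamma < 1 pushes it toward 1), renormalize, then sample.
        rng = rng or np.random.default_rng()
        probs = np.asarray(probs, dtype=float)

        attr_mass = probs[attr_ids].sum()   # mass on desired-attribute tokens
        boosted = attr_mass ** gamma        # gamma < 1 amplifies, gamma > 1 suppresses

        new_probs = probs.copy()
        new_probs[attr_ids] *= boosted / attr_mass           # scale attribute tokens up
        other = np.setdiff1d(np.arange(len(probs)), attr_ids)
        new_probs[other] *= (1 - boosted) / (1 - attr_mass)  # scale the rest down

        return rng.choice(len(probs), p=new_probs)           # sums to 1 by construction

    # Toy vocabulary of 5 tokens; tokens 1 and 3 carry the desired attribute.
    probs = [0.40, 0.10, 0.30, 0.05, 0.15]
    print(gamma_sampling_sketch(probs, attr_ids=[1, 3], gamma=0.3))

In this reading, a single knob trades attribute strength against fluency: gamma near 1 leaves the distribution untouched, while smaller values move more probability mass onto attribute-related tokens (and values above 1 would instead suppress them, e.g. for repetition control). The released package's actual function arguments may differ from this sketch.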

Related research

12/04/2019
Plug and Play Language Models: a Simple Approach to Controlled Text Generation
Large transformer-based language models (LMs) trained on huge text corpo...

10/09/2020
Plug-and-Play Conversational Models
There has been considerable progress made towards conversational models ...

09/14/2020
GeDi: Generative Discriminator Guided Sequence Generation
Class-conditional language models (CC-LMs) can be used to generate natur...

03/15/2022
Do Language Models Plagiarize?
Past literature has illustrated that language models do not fully unders...

02/27/2022
Controllable Natural Language Generation with Contrastive Prefixes
To guide the generation of large pretrained language models (LM), previo...

07/14/2023
Generating Efficient Training Data via LLM-based Attribute Manipulation
In this paper, we propose a novel method, Chain-of-Thoughts Attribute Ma...

07/28/2023
Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
This work aims at decreasing the end-to-end generation latency of large ...
