Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation

08/15/2023
by Charles O'Neill, et al.

Large Language Models (LLMs) hold immense potential to generate synthetic data of high quality and utility, with applications ranging from downstream model training to practical data utilisation. However, contemporary models, despite their impressive capabilities, consistently struggle to produce data that is both coherent and diverse. To address the coherency issue, we introduce contrastive expert guidance, in which the difference between the logit distributions of a fine-tuned and a base language model is emphasised to ensure domain adherence. To ensure diversity, we use existing real and synthetic examples as negative prompts to the model. We term this dual-pronged approach to logit reshaping STEER: Semantic Text Enhancement via Embedding Repositioning. STEER operates at inference time and systematically guides the LLM to strike a balance between adherence to the data distribution (ensuring semantic fidelity) and deviation from prior synthetic examples or existing real datasets (ensuring diversity and authenticity). This balancing act is achieved by dynamically moving towards or away from chosen representations in the latent space. STEER outperforms previous synthetic data generation techniques, achieving a better balance between data diversity and coherency across three distinct tasks: hypothesis generation, toxic and non-toxic comment generation, and commonsense reasoning task generation. We also demonstrate how STEER's hyperparameters allow fine-grained control over the diversity-coherency trade-off, highlighting its versatility.
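For concreteness, below is a minimal sketch of what such inference-time logit reshaping might look like, assuming Hugging Face transformers and two causal LMs (a base model and a fine-tuned expert). The guidance weights `gamma` and `delta`, the checkpoint path, and the additive form of the negative-prompt penalty are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of STEER-style logit reshaping: (a) contrastive expert
# guidance amplifies the fine-tuned model's deviation from its base model,
# and (b) negative prompting penalises tokens favoured when a prior example
# is prepended as context. Weights and paths are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
base = AutoModelForCausalLM.from_pretrained("gpt2").eval()
expert = AutoModelForCausalLM.from_pretrained("./ft-gpt2").eval()  # hypothetical fine-tuned checkpoint

@torch.no_grad()
def next_token_logits(ids, neg_ids, gamma=1.0, delta=0.5):
    l_exp = expert(ids).logits[:, -1, :]    # fine-tuned (expert) logits
    l_base = base(ids).logits[:, -1, :]     # base-model logits
    # Contrastive expert guidance: push towards the expert's domain.
    guided = l_exp + gamma * (l_exp - l_base)
    # Negative prompting: condition on a prior example and subtract its pull.
    neg_ctx = torch.cat([neg_ids, ids], dim=-1)
    l_neg = expert(neg_ctx).logits[:, -1, :]
    return guided - delta * (l_neg - l_exp)

ids = tok("Hypothesis:", return_tensors="pt").input_ids
neg = tok("A previously generated synthetic example.", return_tensors="pt").input_ids
for _ in range(40):
    probs = torch.softmax(next_token_logits(ids, neg), dim=-1)
    ids = torch.cat([ids, torch.multinomial(probs, 1)], dim=-1)
print(tok.decode(ids[0], skip_special_tokens=True))
```

In this sketch, raising `gamma` pushes generations harder towards the fine-tuned domain (coherency), while raising `delta` pushes them away from prior examples (diversity), mirroring the trade-off the abstract describes.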


Related research

06/03/2021 · Fingerprinting Fine-tuned Language Models in the Wild
There are concerns that the ability of language models (LMs) to generate...

05/24/2023 · Generating Faithful Synthetic Data with Large Language Models: A Case Study in Computational Social Science
Large Language Models (LLMs) have democratized synthetic data generation...

02/12/2023 · MarioGPT: Open-Ended Text2Level Generation through Large Language Models
Procedural Content Generation (PCG) algorithms provide a technique to ge...

11/28/2022 · GPT-Neo for commonsense reasoning: a theoretical and practical lens
Recent work has demonstrated substantial gains in pre-training large-sca...

08/08/2023 · ChatGPT for Arabic Grammatical Error Correction
Recently, large language models (LLMs) fine-tuned to follow human instru...

06/26/2023 · Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation
In this paper, we introduce a data-driven approach for Formality-Sensiti...

04/27/2023 · CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants
A wave of new task-based virtual assistants has been fueled by increasin...
