disco: a toolkit for Distributional Control of Generative Models

03/08/2023
by Germán Kruszewski, et al.

Pre-trained language models and other generative models have revolutionized NLP and beyond. However, these models tend to reproduce undesirable biases present in their training data, and they may overlook patterns that are important but challenging to capture. To address these limitations, researchers have introduced distributional control techniques. These techniques, which are not limited to language, allow controlling the prevalence (i.e., the expectations) of any features of interest in the model's outputs. Despite their potential, widespread adoption of these techniques has been hindered by the difficulty of adapting complex, disconnected code. Here, we present disco, an open-source Python library that brings these techniques to the broader public.
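To make the idea concrete, here is a minimal sketch of the mechanism behind distributional control: the target distribution is an exponential tilting of the base model, p(x) ∝ a(x)·exp(λ·φ(x)), with λ chosen so that the expectation of a feature φ matches a desired value. The toy distribution, feature, and target below are hypothetical illustrations, not the disco API.

```python
import math

# Toy base distribution over outputs (stands in for a language model; hypothetical).
base = {"great": 0.4, "good": 0.3, "bad": 0.2, "awful": 0.1}
# Binary feature of interest: 1.0 if the output is "positive".
phi = {x: 1.0 if x in {"great", "good"} else 0.0 for x in base}

target = 0.9  # desired expectation of phi under the controlled distribution

def tilted(lmbda):
    # p(x) proportional to base(x) * exp(lambda * phi(x)):
    # the energy-based-model form used in distributional control.
    w = {x: p * math.exp(lmbda * phi[x]) for x, p in base.items()}
    z = sum(w.values())
    return {x: v / z for x, v in w.items()}

def expectation(dist):
    return sum(dist[x] * phi[x] for x in dist)

# E[phi] is monotone in lambda, so bisection finds the lambda hitting the target.
lo, hi = -20.0, 20.0
for _ in range(100):
    mid = (lo + hi) / 2.0
    if expectation(tilted(mid)) < target:
        lo = mid
    else:
        hi = mid

ctrl = tilted((lo + hi) / 2.0)
print(round(expectation(ctrl), 3))  # ~0.9
```

In practice the base model is a neural generator that can only be sampled from, so libraries in this space fit λ and then sample from the tilted distribution approximately (e.g., by fine-tuning a policy or by importance/quasi-rejection sampling); the closed-form bisection here works only because the toy distribution is enumerable.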


