COLLIE: Systematic Construction of Constrained Text Generation Tasks

07/17/2023
by   Shunyu Yao, et al.

Text generation under constraints has seen increasing interest in natural language processing, especially with the rapidly improving capabilities of large language models. However, existing benchmarks for constrained generation usually focus on fixed constraint types (e.g., generate a sentence containing certain words) that have proved to be easy for state-of-the-art models like GPT-4. We present COLLIE, a grammar-based framework that allows the specification of rich, compositional constraints with diverse generation levels (word, sentence, paragraph, passage) and modeling challenges (e.g., language understanding, logical reasoning, counting, semantic planning). We also develop tools for automatic extraction of task instances given a constraint structure and a raw text corpus. Using COLLIE, we compile the COLLIE-v1 dataset with 2080 instances comprising 13 constraint structures. We perform systematic experiments across five state-of-the-art instruction-tuned language models and analyze their performance to reveal shortcomings. COLLIE is designed to be extensible and lightweight, and we hope the community finds it useful for developing more complex constraints and evaluations in the future.
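To make the idea of compositional constraints concrete, the following is a minimal Python sketch of how an atomic constraint might be expressed and combined. It is an illustration only and does not reproduce COLLIE's actual grammar or API: the `Constraint` class, its fields (`split`, `measure`, `relation`, `target`), and the `all_of` helper are all hypothetical names introduced here.

```python
import operator
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Constraint:
    """Hypothetical atomic constraint: split text into units, measure them,
    and compare the measurement with a target value."""
    split: Callable[[str], List[str]]     # generation level (here: words)
    measure: Callable[[List[str]], int]   # quantity computed over the units
    relation: Callable[[int, int], bool]  # comparison against the target
    target: int

    def check(self, text: str) -> bool:
        return self.relation(self.measure(self.split(text)), self.target)


def all_of(*constraints: Constraint) -> Callable[[str], bool]:
    """Compose constraints with logical AND (hypothetical helper)."""
    return lambda text: all(c.check(text) for c in constraints)


# Counting constraint: the sentence has exactly 5 words.
exactly_five_words = Constraint(str.split, len, operator.eq, 5)

# Counting constraint: at least 2 words start with the letter "t".
at_least_two_t_words = Constraint(
    str.split,
    lambda words: sum(w.lower().startswith("t") for w in words),
    operator.ge,
    2,
)

check = all_of(exactly_five_words, at_least_two_t_words)
```

Under these assumptions, `check("The tiny cat sat today")` passes both constraints, while `check("A dog barked")` fails the word-count constraint. Separating the unit of text, the measurement, and the comparison is one way to let new constraints be built by recombining a small set of parts.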

research
06/28/2023

Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio

Despite rapid advancement in the field of Constrained Natural Language G...
research
02/17/2023

Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints

The limits of open-ended generative models are unclear, yet increasingly...
research
12/18/2015

A Planning based Framework for Essay Generation

Generating an article automatically with computer program is a challengi...
research
05/12/2023

TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

Language models (LMs) are powerful tools for natural language processing...
research
11/03/2022

LMentry: A Language Model Benchmark of Elementary Language Tasks

As the performance of large language models rapidly improves, benchmarks...
research
09/19/2023

Toward Unified Controllable Text Generation via Regular Expression Instruction

Controllable text generation is a fundamental aspect of natural language...
research
07/19/2023

On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models

Since late 2022, Large Language Models (LLMs) have become very prominent...