TwistList: Resources and Baselines for Tongue Twister Generation

06/06/2023
by Tyler Loakman, et al.

Previous work in phonetically-grounded language generation has mainly focused on domains such as lyrics and poetry. In this paper, we present work on the generation of tongue twisters, a form of language that must be phonetically conditioned to maximise sound overlap while remaining semantically consistent with an input topic and grammatically correct. We present TwistList, a large annotated dataset of tongue twisters consisting of 2.1K+ human-authored examples. We additionally present several benchmark systems (referred to as TwisterMisters) for the proposed task of tongue twister generation, including models that require training on in-domain data and models that do not. We report the results of automatic and human evaluation to demonstrate how existing mainstream pre-trained models perform on this task with limited (or no) task-specific training and data, and with no explicit phonetic knowledge. We find that tongue twister generation is challenging for models under these conditions, yet some models are still capable of generating acceptable examples of this language type.

