Understanding the Properties of Generated Corpora

06/22/2022
by Naama Zwerdling, et al.

Models for text generation have become central to many research tasks, especially the generation of sentence corpora. However, understanding the properties of an automatically generated text corpus remains challenging. We propose a set of tools for examining the properties of generated text corpora. Applying these tools to various generated corpora gave us new insights into the properties of the generative models. As part of our characterization process, we found remarkable differences between the corpora generated by two leading generative technologies.

research
07/27/2023

Evaluating Generative Models for Graph-to-Text Generation

Large language models (LLMs) have been widely employed for graph-to-text...
research
06/17/2020

Automatically Ranked Russian Paraphrase Corpus for Text Generation

The article is focused on automatic development and ranking of a large c...
research
12/25/2017

Generative Adversarial Nets for Multiple Text Corpora

Generative adversarial nets (GANs) have been successfully applied to the...
research
04/13/2020

Reverse Engineering Configurations of Neural Text Generation Models

This paper seeks to develop a deeper understanding of the fundamental pr...
research
04/21/2018

Eval all, trust a few, do wrong to none: Comparing sentence generation models

In this paper, we study recent neural generative models for text generat...
research
06/28/2023

You Can Generate It Again: Data-to-text Generation with Verification and Correction Prompting

Despite significant advancements in existing models, generating text des...
research
12/13/2016

Vicinity-Driven Paragraph and Sentence Alignment for Comparable Corpora

Parallel corpora have driven great progress in the field of Text Simplif...
