CoTK: An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation

02/03/2020
by Fei Huang, et al.

In text generation evaluation, many practical issues, such as inconsistent experimental settings and metric implementations, are often ignored but lead to unfair evaluation and untenable conclusions. We present CoTK, an open-source toolkit that aims to support fast development and fair evaluation of text generation. In model development, CoTK helps handle cumbersome issues such as data processing, metric implementation, and reproduction. It standardizes the development steps and reduces human errors that may lead to inconsistent experimental settings. In model evaluation, CoTK provides implementations of many commonly used metrics and benchmark models across different experimental settings. As a unique feature, CoTK can indicate when, and for which metrics, results cannot be fairly compared. We demonstrate that it is convenient to use CoTK for model development and evaluation, particularly across different experimental settings.
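To make the idea of flagging unfair comparisons concrete, here is a minimal sketch of one way it could be done: fingerprint the evaluation settings and treat two scores as comparable only when the fingerprints match. This is an illustration under stated assumptions, not CoTK's actual API; the helper names (`settings_fingerprint`, `report`, `comparable`), the setting keys, and the scores are all hypothetical.

```python
import hashlib
import json


def settings_fingerprint(settings: dict) -> str:
    """Return a stable hash of the evaluation settings (hypothetical helper)."""
    # sort_keys makes the serialization independent of dict insertion order
    canonical = json.dumps(settings, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def report(score: float, settings: dict) -> dict:
    """Bundle a metric score with the fingerprint of the settings that produced it."""
    return {"score": score, "fingerprint": settings_fingerprint(settings)}


def comparable(result_a: dict, result_b: dict) -> bool:
    """Treat two results as fairly comparable only if their fingerprints match."""
    return result_a["fingerprint"] == result_b["fingerprint"]


# Illustrative settings; the keys and values are made up for this sketch.
settings_1 = {"tokenizer": "whitespace", "lowercase": True, "max_ngram": 4}
settings_2 = {"tokenizer": "whitespace", "lowercase": False, "max_ngram": 4}

run_a = report(0.21, settings_1)
# Same settings in a different key order: should still be comparable.
run_b = report(0.24, dict(reversed(list(settings_1.items()))))
# One setting differs: should be flagged as not comparable.
run_c = report(0.30, settings_2)

print(comparable(run_a, run_b))
print(comparable(run_a, run_c))
```

A hash is used here rather than a field-by-field comparison so that a single short string can be published alongside a score and checked by others without access to the full configuration.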


research
02/02/2021

MAUVE: Human-Machine Divergence Curves for Evaluating Open-Ended Text Generation

Despite major advances in open-ended text generation, there has been lim...
research
02/16/2023

dump1030: open-source plug-and-play demodulator/decoder for 1030MHz uplink

Automatic Dependent Surveillance (ADS), Automatic Dependent Surveillance...
research
02/27/2023

TabGenie: A Toolkit for Table-to-Text Generation

Heterogeneity of data-to-text generation datasets limits the research on ...
research
09/04/2018

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation

We introduce Texar, an open-source toolkit aiming to support the broad s...
research
05/10/2017

Analysing Data-To-Text Generation Benchmarks

Recently, several data-sets associating data to text have been created t...
research
09/17/2023

ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing

Evaluating outputs of large language models (LLMs) is challenging, requi...
research
06/21/2017

JaTeCS an open-source JAva TExt Categorization System

JaTeCS is an open source Java library that supports research on automati...
