VizSeq: A Visual Analysis Toolkit for Text Generation Tasks

09/12/2019
by Changhan Wang, et al.

Automatic evaluation of text generation tasks (e.g. machine translation, text summarization, image captioning and video description) usually relies heavily on task-specific metrics such as BLEU and ROUGE. These metrics, however, are abstract numbers that do not align perfectly with human assessment, so inspecting detailed examples is a necessary complement for identifying system error patterns. In this paper, we present VizSeq, a visual analysis toolkit for instance-level and corpus-level system evaluation on a wide variety of text generation tasks. It supports multimodal sources and multiple text references, providing visualization in a Jupyter notebook or a web app interface. It can be used locally or deployed onto public servers for centralized data hosting and benchmarking. It covers the most common n-gram based metrics, accelerated with multiprocessing, and also provides the latest embedding-based metrics such as BERTScore.
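The abstract mentions multi-reference n-gram metrics accelerated with multiprocessing. As an illustration only (this is not VizSeq's actual implementation, and all function names here are hypothetical), a minimal sentence-level BLEU with multiple references and a parallel map over the corpus might be sketched as:

```python
# Minimal sketch of a multi-reference, smoothed sentence-level BLEU,
# scored over a corpus with a worker pool. Illustrative only; VizSeq's
# real metric implementations differ.
import math
from collections import Counter
# Thread-based pool keeps the sketch portable; a process pool would be
# used for real CPU-bound speedups on large corpora.
from multiprocessing.dummy import Pool

def _ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(pair, max_n=4):
    hyp, refs = pair
    hyp, refs = hyp.split(), [r.split() for r in refs]
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = _ngrams(hyp, n)
        # Clip each hypothesis n-gram count by its max count in any reference.
        max_ref = Counter()
        for r in refs:
            for ng, c in _ngrams(r, n).items():
                max_ref[ng] = max(max_ref[ng], c)
        matched = sum(min(c, max_ref[ng]) for ng, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        # Add-one smoothing so one zero precision does not zero the score.
        log_prec += math.log((matched + 1) / (total + 1))
    # Brevity penalty against the closest reference length.
    ref_len = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) >= ref_len else math.exp(1 - ref_len / max(len(hyp), 1))
    return bp * math.exp(log_prec / max_n)

def corpus_scores(hyps, refs_list, workers=4):
    # Independent sentence scores parallelize trivially with a pool map.
    with Pool(workers) as pool:
        return pool.map(sentence_bleu, list(zip(hyps, refs_list)))
```

Because each sentence score is independent, the pool map is embarrassingly parallel, which is why n-gram metrics benefit so directly from multiprocessing.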
