Evaluating Factual Consistency of Texts with Semantic Role Labeling

05/22/2023
by Jing Fan, et al.

Automated evaluation of text generation systems has recently received increasing attention, particularly for checking whether generated text stays faithful to its input sources. Existing methods frequently rely on task-specific language models, which offer little interpretability of the generated scores. We introduce SRLScore, a reference-free evaluation metric designed with text summarization in mind. Our approach constructs fact tuples from Semantic Role Labels, extracted from both the input and the summary text. A final factuality score is computed by an adjustable scoring mechanism, which allows for easy adaptation of the method across domains. Correlation with human judgments on English summarization datasets shows that SRLScore is competitive with state-of-the-art methods and generalizes stably across datasets without requiring further training or hyperparameter tuning. We experiment with an optional co-reference resolution step, but find that the performance boost is mostly outweighed by the additional compute required. Our metric is available online at https://github.com/heyjing/SRLScore.
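To make the idea of comparing fact tuples concrete, here is a minimal sketch in Python. It is not the authors' implementation: the role inventory (`agent`, `relation`, `patient`), the uniform weights, the exact-match comparison, and the function names `tuple_similarity` and `factuality_score` are all assumptions chosen for illustration. The Semantic Role Labeling step is also assumed to have already run, so the tuples are written by hand.

```python
from typing import Dict, List, Optional

# A fact tuple maps a semantic role to a text span; None marks an absent role.
FactTuple = Dict[str, Optional[str]]

# Hypothetical role inventory with uniform weights (assumption, not from the paper).
ROLES = ("agent", "relation", "patient")
WEIGHTS: Dict[str, float] = {role: 1.0 for role in ROLES}


def tuple_similarity(summary_fact: FactTuple, source_fact: FactTuple,
                     weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted exact-match overlap, computed over roles present in the summary fact."""
    total = 0.0
    matched = 0.0
    for role, weight in weights.items():
        span = summary_fact.get(role)
        if span is None:
            continue  # roles missing from the summary fact are not penalized
        total += weight
        other = source_fact.get(role)
        if other is not None and span.lower() == other.lower():
            matched += weight
    return matched / total if total else 0.0


def factuality_score(summary_facts: List[FactTuple], source_facts: List[FactTuple],
                     weights: Dict[str, float] = WEIGHTS) -> float:
    """Average, over summary facts, of the best similarity to any source fact."""
    if not summary_facts:
        return 0.0  # arbitrary convention for this sketch
    best = [
        max((tuple_similarity(s, f, weights) for f in source_facts), default=0.0)
        for s in summary_facts
    ]
    return sum(best) / len(best)


source = [{"agent": "Mueller", "relation": "gave", "patient": "a book"}]
consistent = [{"agent": "Mueller", "relation": "gave", "patient": "a book"}]
inconsistent = [{"agent": "Mary", "relation": "gave", "patient": "a book"}]

print(factuality_score(consistent, source))
print(factuality_score(inconsistent, source))
```

In this sketch, changing the entries of `WEIGHTS` is what stands in for the adjustable scoring mechanism: raising the weight of one role makes a mismatch in that role cost more, without retraining anything.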


research
06/22/2021

BARTScore: Evaluating Generated Text as Text Generation

A wide variety of NLP applications, such as machine translation, summari...
research
04/02/2022

CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation

Existing reference-free metrics have obvious limitations for evaluating ...
research
05/24/2022

MaskEval: Weighted MLM-Based Evaluation for Text Summarization and Simplification

In text summarization and simplification, system outputs must be evaluat...
research
06/03/2019

Handling Divergent Reference Texts when Evaluating Table-to-Text Generation

Automatically constructed datasets for generating text from semi-structu...
research
01/17/2023

On the State of German (Abstractive) Text Summarization

With recent advancements in the area of Natural Language Processing, the...
research
12/07/2020

CX DB8: A queryable extractive summarizer and semantic search engine

Competitive Debate's increasingly technical nature has left competitors ...
research
10/25/2022

Towards Interpretable Summary Evaluation via Allocation of Contextual Embeddings to Reference Text Topics

Despite extensive recent advances in summary generation models, evaluati...
