FFCI: A Framework for Interpretable Automatic Evaluation of Summarization

11/27/2020
by   Fajri Koto, et al.

In this paper, we propose FFCI, a framework for automatic summarization evaluation that comprises four elements: Faithfulness, Focus, Coverage, and Inter-sentential coherence. We design FFCI by comprehensively studying traditional evaluation metrics and model-based evaluations, including question answering (QA) approaches, semantic textual similarity (STS), next-sentence prediction (NSP), and scores from 19 pre-trained language models. Our study reveals three key findings: (1) for faithfulness evaluation, calculating BERTScore between the summary and article sentences yields a higher correlation score than recently proposed QA-based evaluation methods; (2) GPT2Score has the best Pearson's correlation for focus and coverage; and (3) a simple NSP model is effective at evaluating inter-sentential coherence.
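
As a rough illustration of the first finding, scoring faithfulness by comparing summary sentences against article sentences might be sketched as below. This is a minimal sketch under stated assumptions, not the paper's implementation: `token_f1` is a toy unigram-overlap stand-in for BERTScore, and taking the best-matching article sentence per summary sentence and then averaging is an assumed aggregation that the abstract does not specify. The function names and example sentences are hypothetical.

```python
from collections import Counter


def token_f1(candidate: str, reference: str) -> float:
    """Toy stand-in for BERTScore (assumption): unigram-overlap F1."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def faithfulness(summary_sents, article_sents, score=token_f1):
    """Average, over summary sentences, of the best score against any
    article sentence. The max-then-mean aggregation is an assumption."""
    if not summary_sents or not article_sents:
        return 0.0
    best = [max(score(s, a) for a in article_sents) for s in summary_sents]
    return sum(best) / len(best)


# Hypothetical example data.
article = [
    "the cat sat on the mat",
    "it was raining outside",
]
supported = ["the cat sat on the mat"]
unsupported = ["dogs love sunny beaches"]

print(faithfulness(supported, article))    # 1.0
print(faithfulness(unsupported, article))  # 0.0
```

The `score` parameter is left pluggable so that a model-based sentence-pair metric could be swapped in for the toy overlap function without changing the aggregation.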
