Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation

08/24/2022
by Cyril Chhun, et al.

Research on Automatic Story Generation (ASG) relies heavily on human and automatic evaluation. However, there is no consensus on which human evaluation criteria to use, and no analysis of how well automatic criteria correlate with them. In this paper, we propose to re-evaluate ASG evaluation. We introduce a set of 6 orthogonal and comprehensive human criteria, carefully motivated by the social sciences literature. We also present HANNA, an annotated dataset of 1,056 stories produced by 10 different ASG systems. HANNA allows us to quantitatively evaluate the correlations of 72 automatic metrics with human criteria. Our analysis highlights the weaknesses of current metrics for ASG and allows us to formulate practical recommendations for ASG evaluation.

research
06/13/2023

HAUSER: Towards Holistic and Automatic Evaluation of Simile Generation

Similes play an imperative role in creative writing such as story and di...
research
05/19/2021

OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics

Automatic metrics are essential for developing natural language generati...
research
05/24/2021

Towards Standard Criteria for human evaluation of Chatbots: A Survey

Human evaluation is becoming a necessity to test the performance of Chat...
research
09/11/2019

What Makes A Good Story? Designing Composite Rewards for Visual Storytelling

Previous storytelling approaches mostly focused on optimizing traditiona...
research
09/17/2020

Small but Mighty: New Benchmarks for Split and Rephrase

Split and Rephrase is a text simplification task of rewriting a complex ...
research
04/29/2022

Seeing without Looking: Analysis Pipeline for Child Sexual Abuse Datasets

The online sharing and viewing of Child Sexual Abuse Material (CSAM) are...
research
10/31/2018

dAIrector: Automatic Story Beat Generation through Knowledge Synthesis

dAIrector is an automated director which collaborates with humans storyt...
