Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation

04/12/2021

With recent advances in open-domain story generation, the lack of reliable automatic evaluation metrics has become an increasingly pressing issue that hinders progress in the field. Prior research suggests that learnable evaluation metrics yield more accurate assessments, as they correlate more highly with human judgments. A critical bottleneck in obtaining a reliable learnable metric is the lack of high-quality training data from which classifiers can learn to distinguish plausible from implausible machine-generated stories. Previous work relied on heuristically manipulating plausible examples at the text level to mimic possible system drawbacks such as repetition, contradiction, or irrelevant content; the results can be unnatural and oversimplify the characteristics of implausible machine-generated stories. We propose to tackle these issues by generating a more comprehensive set of implausible stories using plots, which are structured representations of the controllable factors used to generate stories. Because these plots are compact and structured, they are easier to manipulate to produce text with targeted undesirable properties while maintaining the grammatical correctness and naturalness of the generated sentences. To improve the quality of the generated implausible stories, we further apply the adversarial filtering procedure presented by <cit.> to select a more nuanced set of implausible texts. Experiments show that evaluation metrics trained on our generated data yield more reliable automatic assessments that correlate markedly better with human judgments than the baselines.
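
To make the idea of plot-level manipulation concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes, purely for illustration, that a plot is a list of keyword lists with one entry per story sentence, and the three perturbation functions (`repeat_step`, `shuffle_steps`, `swap_in_foreign_step`) are hypothetical names chosen to mirror the drawbacks named in the abstract (repetition, incoherent ordering, irrelevant content). The story generator that would turn a perturbed plot back into text, and the adversarial filtering step, are not shown.

```python
import random
from typing import List

# Assumed plot format (not taken from the paper): one keyword list per sentence.
Plot = List[List[str]]


def repeat_step(plot: Plot, rng: random.Random) -> Plot:
    """Duplicate one plot step so the generated story would repeat itself."""
    i = rng.randrange(len(plot))
    return [list(s) for s in plot[: i + 1]] + [list(plot[i])] + [list(s) for s in plot[i + 1:]]


def shuffle_steps(plot: Plot, rng: random.Random) -> Plot:
    """Reorder plot steps to break the logical order of events."""
    shuffled = [list(s) for s in plot]
    # Without at least two distinct steps, no reordering can change the plot.
    if len({tuple(s) for s in plot}) < 2:
        return shuffled
    while shuffled == plot:
        rng.shuffle(shuffled)
    return shuffled


def swap_in_foreign_step(plot: Plot, other: Plot, rng: random.Random) -> Plot:
    """Replace one step with a step from an unrelated plot (irrelevant content)."""
    out = [list(s) for s in plot]
    out[rng.randrange(len(out))] = list(rng.choice(other))
    return out


# Toy plots, invented for this example.
plot = [["boy", "park"], ["dog", "lost"], ["search", "hours"], ["found", "happy"]]
other = [["chef", "soup"], ["oven", "fire"]]

rng = random.Random(0)  # fixed seed so the perturbations are reproducible
repeated = repeat_step(plot, rng)
shuffled = shuffle_steps(plot, rng)
irrelevant = swap_in_foreign_step(plot, other, rng)
```

In a pipeline of the kind the abstract describes, each perturbed plot would be passed to a plot-conditioned generator to produce a fluent but implausible story. Working on the plot rather than the surface text is what keeps the resulting sentences grammatical.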

Related Research

UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

Data-driven Natural Language Generation: Paving the Road to Success

HAUSER: Towards Holistic and Automatic Evaluation of Simile Generation

OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics

Open-Domain Text Evaluation via Meta Distribution Modeling

Evaluation Metrics for Symbolic Knowledge Extracted from Machine Learning Black Boxes: A Discussion Paper

An Impartial Transformer for Story Visualization