Evaluating for Diversity in Question Generation over Text

Generating diverse and relevant questions over text is a task with widespread applications. We argue that commonly used evaluation metrics such as BLEU and METEOR are not suitable for this task, because reference questions are inherently diverse, and we propose a scheme for extending conventional metrics to reflect diversity. We further propose a variational encoder-decoder model for the task. Through automatic and human evaluation, we show that our variational model improves diversity without loss of quality, and we demonstrate how our evaluation scheme reflects this improvement.
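To make the evaluation problem concrete, below is a minimal illustrative sketch of one way a single-reference metric such as BLEU could be extended to score a *set* of generated questions against a *set* of references. This is a toy under stated assumptions, not the paper's published scheme: the function name `diversity_aware_bleu` and the greedy one-to-one matching are our own illustrative choices.

```python
# Sketch (not the paper's exact scheme): extend BLEU to a set-level,
# diversity-aware score by matching each generated question to a
# *distinct* reference, so that copying one reference N times is no
# longer rewarded. The greedy matching strategy is an assumption.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def diversity_aware_bleu(generated, references):
    """Average BLEU under a greedy one-to-one matching of generated
    questions to reference questions."""
    smooth = SmoothingFunction().method1
    remaining = [r.split() for r in references]
    total = 0.0
    for hyp in (g.split() for g in generated):
        if not remaining:
            break  # more generations than references: extras score 0
        # pick the best-scoring remaining reference for this hypothesis
        scores = [sentence_bleu([ref], hyp, smoothing_function=smooth)
                  for ref in remaining]
        best = max(range(len(scores)), key=scores.__getitem__)
        total += scores[best]
        del remaining[best]  # each reference may be used only once
    return total / max(len(generated), 1)

refs = ["who wrote the novel ?",
        "when was the novel published ?"]
gens = ["who wrote the novel ?",
        "who wrote the novel ?"]  # low diversity: the duplicate scores poorly
print(diversity_aware_bleu(gens, refs))
```

Under this kind of scoring, emitting the same high-quality question repeatedly is penalized, since each additional copy must be matched against a different, less similar reference; a conventional multi-reference BLEU would instead reward every copy equally.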

Related Research

02/17/2022 · Revisiting the Evaluation Metrics of Paraphrase Generation
Paraphrase generation is an important NLP task that has achieved signifi...

07/03/2020 · On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation
The goal of text generation models is to fit the underlying real probabi...

04/06/2020 · Evaluating the Evaluation of Diversity in Natural Language Generation
Despite growing interest in natural language generation (NLG) models tha...

04/04/2022 · Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors
Generating high quality texts with high diversity is important for many ...

04/10/2022 · DISK: Domain-constrained Instance Sketch for Math Word Problem Generation
A math word problem (MWP) is a coherent narrative which reflects the und...

04/11/2017 · Creativity: Generating Diverse Questions using Variational Autoencoders
Generating diverse questions for given images is an important task for c...

08/12/2021 · Generating Diverse Descriptions from Semantic Graphs
Text generation from semantic graphs is traditionally performed with det...
