DISTO: Evaluating Textual Distractors for Multi-Choice Questions using Negative Sampling based Approach

04/10/2023
by Bilal Ghanem, et al.

Multiple choice questions (MCQs) are an efficient and common way to assess reading comprehension (RC). Every MCQ needs a set of distractor answers that are incorrect, but plausible enough to test student knowledge. Distractor generation (DG) models have been proposed, and their performance is typically evaluated using machine translation (MT) metrics. However, MT metrics often misjudge the suitability of generated distractors. We propose DISTO: the first learned evaluation metric for generated distractors. We validate DISTO by showing its scores correlate highly with human ratings of distractor quality. At the same time, DISTO ranks the performance of state-of-the-art DG models very differently from MT-based metrics, showing that MT metrics should not be used for distractor evaluation.
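
As a rough illustration of why surface-overlap scoring can misjudge distractors, the sketch below uses a simple unigram F1 overlap as a stand-in for n-gram based MT metrics. It is not DISTO and not any specific metric from the paper; the function name `unigram_f1` and all three example sentences are hypothetical, chosen only to make the failure mode visible.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between two strings (a toy stand-in for n-gram metrics)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data (not taken from the paper).
reference_distractor = "the boy lost his dog"       # gold distractor
correct_answer = "the boy found his dog"            # the MCQ's right answer
paraphrase = "the child misplaced his puppy"        # plausible, reworded distractor

# A candidate that merely copies the correct answer is useless as a distractor,
# yet it shares most tokens with the gold distractor.
score_answer_copy = unigram_f1(correct_answer, reference_distractor)

# A reworded but plausible distractor shares few tokens with the gold one.
score_paraphrase = unigram_f1(paraphrase, reference_distractor)

print(f"copy of correct answer: {score_answer_copy:.2f}")
print(f"plausible paraphrase:   {score_paraphrase:.2f}")
```

Under this toy scoring, the candidate that duplicates the correct answer outranks the reworded distractor, which is the kind of mismatch with human judgment that motivates a learned metric.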
