Assessing the Quality of Multiple-Choice Questions Using GPT-4 and Rule-Based Methods

07/16/2023
by Steven Moore, et al.

Multiple-choice questions with item-writing flaws can negatively impact student learning and skew analytics. These flaws are often present in student-generated questions, making it difficult to assess their quality and suitability for classroom use. Existing methods for evaluating multiple-choice questions tend to focus on machine-readability metrics without considering the questions' intended use within course materials or their pedagogical implications. In this study, we compared the performance of a rule-based method we developed against a machine-learning-based method using GPT-4 on the task of automatically assessing multiple-choice questions for 19 common item-writing flaws. Analyzing 200 student-generated questions from four different subject areas, we found that the rule-based method correctly detected 91% of the flaws. We demonstrated the effectiveness of both methods in identifying common item-writing flaws present in student-generated questions across different subject areas. The rule-based method can accurately and efficiently evaluate multiple-choice questions from multiple domains, outperforming GPT-4 and going beyond existing metrics that do not account for the educational use of such questions. Finally, we discuss how these automated methods could be used to improve the quality of questions based on the identified flaws.
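To give a sense of what a rule-based check for item-writing flaws can look like, here is a minimal sketch in Python. The function name and the specific rules are hypothetical illustrations of well-known item-writing guidelines (e.g., avoiding "all of the above" options, negatively worded stems, and a conspicuously long correct answer), not the paper's actual 19 rules.

```python
import re

def detect_flaws(stem, options, answer_index):
    """Return a list of item-writing flaws found in one multiple-choice question.

    Hypothetical rules for illustration only; the paper's rule set is larger.
    """
    flaws = []

    # Flaw: "all/none of the above" used as an answer option.
    if any(re.search(r"\b(all|none) of the above\b", o, re.IGNORECASE) for o in options):
        flaws.append("all_or_none_of_the_above")

    # Flaw: negatively worded stem (e.g., "Which of these is NOT ...").
    if re.search(r"\b(not|except)\b", stem, re.IGNORECASE):
        flaws.append("negative_stem")

    # Flaw: the correct answer is conspicuously longer than every distractor.
    lengths = [len(o) for o in options]
    distractor_lengths = [l for i, l in enumerate(lengths) if i != answer_index]
    if distractor_lengths and lengths[answer_index] > 1.5 * max(distractor_lengths):
        flaws.append("longest_answer_is_correct")

    # Flaw: fewer than three distractors accompany the correct answer.
    if len(options) < 4:
        flaws.append("too_few_options")

    return flaws

flaws = detect_flaws(
    "Which of the following is NOT a mammal?",
    ["Dolphin", "Bat", "Penguin", "All of the above"],
    answer_index=2,
)
# Flags both the negative stem and the "all of the above" option.
```

A checker like this is fast and deterministic, which is one plausible reason a rule-based approach can outperform an LLM on surface-level flaws, though rules cannot judge content-dependent problems such as implausible distractors.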

