Automatic assessment of text-based responses in post-secondary education: A systematic review
Text-based open-ended questions in academic formative and summative assessments help students become deep learners and prepare them to understand concepts for a subsequent test conceptually. However, grading text-based questions, especially in large (>50 enrolled students) courses, is a tedious and time-costing process for instructors. Text processing models continue progressing with the rapid development of Artificial Intelligence (AI) tools and Natural Language Processing (NLP) algorithms. Especially after breakthroughs in Large Language Models (LLM), there is immense potential to automate rapid assessment and feedback of text-based responses in education. This systematic review adopts a scientific and reproducible literature search strategy based on the PRISMA process using explicit inclusion and exclusion criteria to study text-based automatic assessment systems in post-secondary education, screening 838 papers and synthesizing 93 studies. To understand how text-based automatic assessment systems have been developed and applied in education in recent years, all included studies are summarized and categorized according to a proposed comprehensive theoretical framework, including the input and output of the automatic assessment system, research motivation, and research outcome, aiming to answer three research questions accordingly. Additionally, the typical studies of automated assessment systems and application domains in these studies are investigated and summarized. This systematic review will provide an overview of recent educational applications of text-based assessment systems for understanding the latest AI/NLP developments assisting in text-based assessments in higher education. We expect it will particularly benefit researchers and educators incorporating LLMs such as ChatGPT into their educational activities.
READ FULL TEXT