Comparing Human and Automated Evaluation of Open-Ended Student Responses to Questions of Evolution

03/22/2016
by Michael J Wiser, et al.

Written responses can provide a wealth of data for understanding student reasoning on a topic. Yet they are time- and labor-intensive to score, leading many instructors to forgo them except as limited parts of summative assessments at the end of a unit or course. Recent developments in machine learning (ML) have produced computational methods for scoring written responses for the presence or absence of specific concepts. Here, we compare the scores from one particular ML program, EvoGrader, with human scoring of responses to questions that are similar in structure and content to, but distinct from, the ones the program was trained on. We find substantial inter-rater reliability between the human and ML scoring. However, enough systematic differences remain between the two that we advise using the ML scoring only for formative, rather than summative, assessment of student reasoning.
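
As a rough illustration of what agreement between a human scorer and an ML scorer can look like numerically, the sketch below computes Cohen's kappa for presence/absence scores of a single concept. The abstract does not say which reliability statistic was used, so the choice of Cohen's kappa is an assumption here, and the function name `cohens_kappa` and the two score lists are hypothetical, made up for illustration only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical scores.

    Hypothetical helper for illustration; not taken from the paper or EvoGrader.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of responses")

    n = len(rater_a)

    # Observed agreement: fraction of responses given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up scores: 1 = concept present, 0 = concept absent, one entry per response.
human_scores = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
ml_scores = [1, 1, 0, 0, 1, 0, 0, 1, 0, 1]

kappa = cohens_kappa(human_scores, ml_scores)
print(f"kappa = {kappa:.2f}")
```

In this toy data the two scorers agree on 8 of 10 responses, and chance agreement is 0.5, so kappa comes out at about 0.6. Kappa is used instead of raw percent agreement because it discounts the agreement two scorers would reach by chance given how often each assigns each score.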


