Clustering students' open-ended questionnaire answers

Open responses form a rich but underused source of information in educational data mining and intelligent tutoring systems. One of the major obstacles is the difficulty of clustering short texts automatically. In this paper, we investigate the problem of clustering free-form questionnaire answers. We present comparative experiments on clustering ten sets of open responses from course feedback queries in English and Finnish. We also evaluate how well the main topics can be extracted from the clusterings with the HITS algorithm. The main result is that, for the English data, affinity propagation performed well despite frequent outliers and considerable overlap between the real clusters. For the Finnish data, however, performance was poorer and none of the methods clearly outperformed the others. Similarly, topic extraction was very successful for the English data but only satisfactory for the Finnish data. The most interesting discovery was that stemming could actually degrade the clustering quality significantly.
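To make the topic-extraction step more concrete, the following is a minimal sketch of how HITS could rank the words of a single cluster. The abstract does not say how the graph is constructed, so the sketch assumes a bipartite graph in which each answer acts as a hub pointing to the words it contains, and the words act as authorities. The function name `hits_top_words`, the whitespace tokenization, and the example answers are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def hits_top_words(answers, iterations=50):
    """Rank the words of one cluster with HITS on an answer-word graph.

    Assumption (not from the paper): each answer is a hub that links to
    every distinct word it contains; words are the authorities.
    """
    # Naive tokenization: lowercase and split on whitespace.
    links = [set(answer.lower().split()) for answer in answers]
    hub = [1.0] * len(links)
    auth = {}
    for _ in range(iterations):
        # Authority update: sum the hub scores of answers containing the word.
        scores = defaultdict(float)
        for h, words in zip(hub, links):
            for w in words:
                scores[w] += h
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        auth = {w: v / norm for w, v in scores.items()}
        # Hub update: sum the authority scores of the words in the answer.
        hub = [sum(auth[w] for w in words) for words in links]
        norm = math.sqrt(sum(v * v for v in hub)) or 1.0
        hub = [v / norm for v in hub]
    # Highest authority first; ties are broken alphabetically.
    return sorted(auth, key=lambda w: (-auth[w], w))


# Hypothetical cluster of course-feedback answers.
cluster = [
    "more exercises please",
    "exercises were useful",
    "too few exercises",
]
top_words = hits_top_words(cluster)
```

In this toy cluster the word shared by all three answers receives the highest authority score, which is the behaviour one would hope for from a topic label. A real pipeline would likely also remove stop words and, given the finding above, treat stemming with caution.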


