The Influence of Domain-Based Preprocessing on Subject-Specific Clustering

11/16/2020
by Alexandra Gkolia, et al.

The sudden move of most university teaching online due to the global Covid-19 pandemic has increased the workload of academics. One contributing factor is the high volume of queries coming from students. Because these queries are not limited to the synchronous time frame of a lecture, many of them are likely to be related or even equivalent. One way to deal with this problem is to cluster the questions by topic. In our previous work, we aimed to find an improved, highly efficient clustering method using a recurring LDA model. Our data set contained questions posted online for a Computer Science course at the University of Bath. A significant number of these questions contained code excerpts, which caused a problem in clustering: certain terms were treated as common English words rather than being recognised as code-specific terms. To address this, we implemented tagging of these technical terms in Python as part of preprocessing the data set. In this paper, we explore the tagging of data sets, focusing on identifying code excerpts, and provide empirical results to justify our reasoning.
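The tagging step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term list `CODE_TERMS`, the line heuristic `CODE_LINE`, the function name `tag_code_terms` and the `code_` prefix are all assumptions made for illustration. The idea is that a word such as "for" is left alone in ordinary prose but is given a distinct token when it appears in a line that looks like code, so a later stop-word filter or topic model does not treat it as a common English word.

```python
import re

# Hypothetical set of code terms that also occur as ordinary English words.
# The vocabulary actually used in the paper is not given in the abstract.
CODE_TERMS = {"for", "while", "if", "else", "class", "return", "import", "print"}

# Assumed heuristic: a line "looks like code" if it contains typical code
# punctuation, a call such as name(...), or a block header ending in ":".
CODE_LINE = re.compile(
    r"[;{}]|\w+\(.*\)|==|\+=|^\s*(def|class|import|for|while|if)\b.*:\s*$"
)

# Matches identifiers (words that do not start with a digit).
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*\b")


def tag_code_terms(text, prefix="code_"):
    """Prefix known code terms, but only on lines that look like code."""

    def tag(match):
        word = match.group(0)
        return prefix + word if word in CODE_TERMS else word

    tagged_lines = []
    for line in text.splitlines():
        if CODE_LINE.search(line):
            line = IDENTIFIER.sub(tag, line)
        tagged_lines.append(line)
    return "\n".join(tagged_lines)


question = "What should I do for the coursework?\nfor i in range(10):\n    print(i)"
print(tag_code_terms(question))
```

In this example the "for" in the first, prose line is unchanged, while the "for" and "print" in the code lines become `code_for` and `code_print`. A line-level regular expression is a crude detector; it is used here only to keep the sketch short.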

research
10/04/2020

Unification of HDP and LDA Models for Optimal Topic Clustering of Subject Specific Question Banks

There has been an increasingly popular trend in Universities for curricu...
research
01/25/2022

Perspective on Code Submission and Automated Evaluation Platforms for University Teaching

We present a perspective on platforms for code submission and automated ...
research
07/14/2023

Towards Generalizable Detection of Urgency of Discussion Forum Posts

Students who take an online course, such as a MOOC, use the course's dis...
research
10/22/2020

Kwame: A Bilingual AI Teaching Assistant for Online SuaCode Courses

Introductory hands-on courses such as our smartphone-based coding course...
research
11/21/2020

Query Game 2.0: Improvement of a Web-Based Query Game for Cavite State University Main Campus

Purpose: The study aimed to improve the previous study covering a web-ba...
research
04/21/2021

Clustering Introductory Computer Science Exercises Using Topic Modeling Methods

Manually determining concepts present in a group of questions is a chall...
research
01/11/2021

The audiovisual resource as a pedagogical tools in times of covid 19. An empirical analysis of its efficiency

The global pandemic caused by the COVID virus led universities to a chan...
