Does Informativeness Matter? Active Learning for Educational Dialogue Act Classification

04/12/2023
by Wei Tan, et al.

Dialogue Acts (DAs) can be used to explain what expert tutors do and what students know during the tutoring process. Most empirical studies use random sampling to obtain sentence samples for manual DA annotation, and the annotated samples are then used to train DA classifiers. However, these studies have paid little attention to sample informativeness, which reflects how much information the selected samples carry and indicates the extent to which a classifier can learn patterns from them. Notably, informativeness may vary among samples, and a classifier might need only a small number of low-informativeness samples to learn their patterns. Because random sampling may overlook sample informativeness, it can spend human labelling effort on samples that contribute little to training the classifiers. As an alternative, researchers suggest using the statistical sampling methods of Active Learning (AL) to identify informative samples for training the classifiers. However, the use of AL methods in educational DA classification tasks is under-explored. In this paper, we first examine the informativeness of annotated sentence samples. We then investigate how AL methods can select informative samples to support DA classifiers during the AL sampling process. The results reveal that most annotated sentences in the training dataset present low informativeness, and that the patterns of these sentences can be easily captured by the DA classifier. We also demonstrate how AL methods can reduce the cost of manual annotation in the AL sampling process.
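To make the idea of selecting informative samples concrete, the following is a minimal sketch of one common AL strategy, entropy-based uncertainty sampling. The abstract does not name the AL methods studied, so this strategy is an assumption chosen for illustration, not the authors' method. The function names (`entropy`, `select_most_uncertain`) and the probability values in `pool_probs` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier outputs for four unlabelled sentences
# over three DA classes (made-up values, for illustration only).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction: low informativeness
    [0.40, 0.35, 0.25],  # uncertain prediction
    [0.70, 0.20, 0.10],  # moderately confident prediction
    [0.34, 0.33, 0.33],  # near-uniform prediction: most uncertain
]

# Indices of the two sentences to send for manual annotation next.
selected = select_most_uncertain(pool_probs, k=2)
```

Under this sketch, sentences the classifier already predicts confidently are left unlabelled, so annotation effort goes to the samples the classifier is least sure about.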

