Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children's mindreading ability

06/03/2021
by Venelin Kovatchev, et al.

In this paper we implement and compare seven data augmentation strategies for the automatic scoring of children's ability to understand others' thoughts, feelings, and desires (or "mindreading"). We recruit in-domain experts to re-annotate augmented samples and determine to what extent each strategy preserves the original rating. We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. To determine how well automatic systems generalize to unseen data, we create UK-MIND-20, a new corpus of children's performance on tests of mindreading, consisting of 10,320 question-answer pairs. We obtain a new state-of-the-art performance on the MIND-CA corpus, improving the macro-F1 score by 6 points. Results indicate that both the number of training examples and the quality of the augmentation strategies affect the performance of the systems. Task-specific augmentations generally outperform task-agnostic augmentations. Automatic augmentations based on vectors (GloVe, FastText) perform the worst. We find that systems trained on MIND-CA generalize well to UK-MIND-20. We demonstrate that data augmentation strategies also improve performance on unseen data.
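
The headline result above is reported as macro-F1, which averages the per-class F1 scores so that every class counts equally regardless of how often it occurs. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `macro_f1` and the toy labels 0, 1, and 2 are assumptions made for illustration.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for class g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true class but was missed

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical ratings, not taken from MIND-CA or UK-MIND-20.
gold = [0, 1, 2, 2, 1, 0]
pred = [0, 1, 2, 1, 1, 0]
print(round(macro_f1(gold, pred), 4))
```

Because each class contributes equally to the average, a gain on a rare class moves macro-F1 as much as the same gain on a frequent one.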


