Cross Encoding as Augmentation: Towards Effective Educational Text Classification

05/30/2023
by Hyun Seung Lee, et al.

Text classification in education, usually called auto-tagging, is the automated process of assigning relevant tags to educational content, such as questions and textbooks. However, auto-tagging suffers from a data scarcity problem, which stems from two major challenges: 1) the tag space is large and 2) the task is multi-label. Although retrieval approaches reportedly perform well in low-resource scenarios, few efforts have directly addressed the data scarcity problem. To mitigate these issues, we propose CEAA, a novel retrieval approach that provides effective learning in educational text classification. Our main contributions are as follows: 1) we leverage transfer learning from question-answering datasets, and 2) we propose a simple but effective data augmentation method that introduces cross-encoder style texts to a bi-encoder architecture for more efficient inference. An extensive set of experiments shows that our proposed method is effective in multi-label scenarios and on low-resource tags compared to state-of-the-art models.
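The abstract does not spell out how cross-encoder style texts are built or fed to the bi-encoder, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a "cross-encoder style" text is the question joined to a tag with a separator, added alongside the original question as an extra training example, and that inference scores a question against each tag independently, as a bi-encoder would. The names `SEP`, `embed`, `augment`, and `retrieve` are hypothetical, and the bag-of-words `embed` is a toy stand-in for a learned encoder.

```python
import math
from collections import Counter

# Assumed separator; the abstract does not name the token CEAA uses.
SEP = " [SEP] "


def embed(text):
    """Toy stand-in for a learned encoder: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def augment(pairs):
    """Keep each (question, tag) pair and add one cross-encoder style copy.

    Assumption: the augmented input is the question and tag joined by SEP.
    """
    out = []
    for question, tag in pairs:
        out.append((question, tag))
        out.append((question + SEP + tag, tag))
    return out


def retrieve(question, tags, k=2):
    """Bi-encoder style inference: embed question and tags separately,
    then return the k tags with the highest similarity (multi-label)."""
    q_vec = embed(question)
    ranked = sorted(tags, key=lambda t: cosine(q_vec, embed(t)), reverse=True)
    return ranked[:k]


train_pairs = augment([("solve x + 1 = 2", "linear equations")])
top_tags = retrieve(
    "find the area of a triangle",
    ["triangle area", "linear equations", "fractions"],
    k=1,
)
```

In this sketch the augmented texts appear only in the training pairs; `retrieve` never concatenates question and tag, which is what keeps inference at bi-encoder cost, since tag embeddings could be computed once and reused.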


