MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

04/25/2020
by   Jiaao Chen, et al.

This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. TMix creates a large number of augmented training samples by interpolating text in hidden space. Moreover, we leverage recent advances in data augmentation to guess low-entropy labels for unlabeled data, making them as easy to use as labeled data. By mixing labeled, unlabeled and augmented data, MixText significantly outperformed current pre-trained and fine-tuned models and other state-of-the-art semi-supervised learning methods on several text classification benchmarks. The improvement is especially prominent when supervision is extremely limited. We have publicly released our code at https://github.com/GT-SALT/MixText.
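The two ideas named in the abstract, interpolating in hidden space and guessing low-entropy labels, can be illustrated with a minimal sketch. This is not the released MixText code: the function names `mix_hidden`, `sample_lambda` and `sharpen`, the Beta-distributed mixing weight, and the default values `alpha=0.75` and `temperature=0.5` are all assumptions made for illustration, and plain Python lists stand in for the hidden states of an encoder.

```python
import random


def mix_hidden(h_a, h_b, y_a, y_b, lam):
    """Linearly interpolate two hidden vectors and their label distributions.

    h_a, h_b: hidden representations of two texts (equal-length lists).
    y_a, y_b: label distributions (one-hot or soft) of the two texts.
    lam: mixing weight in [0, 1]; lam = 1 returns the first example unchanged.
    """
    h_mix = [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return h_mix, y_mix


def sample_lambda(alpha=0.75, rng=random):
    """Draw a mixing weight from Beta(alpha, alpha) (assumed choice).

    Taking max(lam, 1 - lam) keeps the mixed sample closer to the first input.
    """
    lam = rng.betavariate(alpha, alpha)
    return max(lam, 1.0 - lam)


def sharpen(probs, temperature=0.5):
    """Lower the entropy of a guessed label distribution (assumed technique).

    Raising each probability to 1/temperature and renormalising pushes mass
    toward the most likely class when temperature < 1.
    """
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


if __name__ == "__main__":
    # Toy hidden states and one-hot labels for two texts.
    h_mix, y_mix = mix_hidden([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0],
                              sample_lambda())
    print(h_mix, y_mix)
    # A guessed label for an unlabeled text, sharpened before use.
    print(sharpen([0.6, 0.4]))
```

In a real model the interpolation would be applied to the activations of an intermediate encoder layer rather than to toy vectors, and the mixed label would serve as the training target for the mixed representation.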


research
04/29/2019

Unsupervised Data Augmentation

Despite its success, deep learning still needs large labeled datasets to...
research
10/23/2022

SAT: Improving Semi-Supervised Text Classification with Simple Instance-Adaptive Self-Training

Self-training methods have been explored in recent years and have exhibi...
research
04/23/2020

Semi-Supervised Models via Data Augmentation for Classifying Interactive Affective Responses

We present semi-supervised models with data augmentation (SMDA), a semi-...
research
04/19/2023

ESimCSE Unsupervised Contrastive Learning Jointly with UDA Semi-Supervised Learning for Large Label System Text Classification Mode

The challenges faced by text classification with large tag systems in na...
research
12/13/2022

The Hateful Memes Challenge Next Move

State-of-the-art image and text classification models, such as Convoluti...
research
01/16/2021

Weakly-Supervised Hierarchical Models for Predicting Persuasive Strategies in Good-faith Textual Requests

Modeling persuasive language has the potential to better facilitate our ...
research
02/07/2020

Snippext: Semi-supervised Opinion Mining with Augmented Data

Online services are interested in solutions to opinion mining, which is ...
