Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining

by Zijun Sun, et al.

The goal of semi-supervised learning is to use an unlabeled, in-domain dataset U to improve models trained on a labeled dataset D. In the context of large-scale language-model (LM) pretraining, how to make the best use of U is poorly understood: Is semi-supervised learning still beneficial in the presence of large-scale pretraining? Should U be used for in-domain LM pretraining or for pseudo-label generation? How should a pseudo-label-based semi-supervised model actually be implemented? How do different semi-supervised strategies affect performance for D of different sizes, U of different sizes, etc.? In this paper, we conduct a comprehensive study of semi-supervised learning for text classification in the context of large-scale LM pretraining. Our study sheds light on the behavior of semi-supervised learning methods: (1) when an LM is pretrained in-domain on U, open-domain LM pretraining is unnecessary; (2) both the in-domain pretraining strategy and the pseudo-label-based strategy introduce significant performance boosts, with the former performing better with larger U, the latter performing better with smaller U, and the combination leading to the largest performance boost; (3) self-training (pretraining first on the pseudo-labeled data D' and then fine-tuning on D) yields better performance when D is small, while joint training on the combination of the pseudo-labeled data D' and the original dataset D yields better performance when D is large. Using semi-supervised learning strategies, we are able to achieve a performance of around 93.8 and a competitive performance of 96.6. Our work marks an initial step in understanding the behavior of semi-supervised learning models in the context of large-scale pretraining.
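The two pseudo-label strategies contrasted in finding (3) can be pictured with a minimal sketch. It rests on loud assumptions: a toy bag-of-words logistic regression stands in for the neural classifiers the abstract refers to, the tiny datasets are invented for illustration, and every name (`featurize`, `prob`, `train`, `pseudo_label`) is hypothetical rather than taken from the paper. A teacher trained on D labels U to produce D'. The student is then trained either jointly on D and D', or on D' first and fine-tuned on D.

```python
import math
from collections import defaultdict


def featurize(text):
    """Bag-of-words counts plus a bias feature (toy featurizer, assumption)."""
    feats = defaultdict(float)
    for tok in text.lower().split():
        feats[tok] += 1.0
    feats["<bias>"] = 1.0
    return feats


def prob(weights, text):
    """Probability of the positive class under a logistic-regression model."""
    z = sum(weights.get(f, 0.0) * v for f, v in featurize(text).items())
    return 1.0 / (1.0 + math.exp(-z))


def train(data, weights=None, epochs=20, lr=0.5):
    """Plain SGD on the logistic loss; passing `weights` continues from them."""
    weights = dict(weights or {})
    for _ in range(epochs):
        for text, label in data:
            err = label - prob(weights, text)
            for f, v in featurize(text).items():
                weights[f] = weights.get(f, 0.0) + lr * err * v
    return weights


def pseudo_label(teacher, unlabeled):
    """Label each unlabeled text with the teacher's most likely class."""
    return [(t, int(prob(teacher, t) >= 0.5)) for t in unlabeled]


# Invented toy data: D is labeled, U is unlabeled and in-domain.
D = [("great movie", 1), ("terrible movie", 0)]
U = ["great acting", "terrible plot", "great fun", "terrible pacing"]

teacher = train(D)                 # teacher trained on the labeled set D
D_prime = pseudo_label(teacher, U) # pseudo-labeled set D'

# Joint training: one student trained on the union of D and D'.
joint_student = train(D + D_prime)

# Self-training: train on D' first, then fine-tune the same weights on D.
self_trained_student = train(D, weights=train(D_prime))
```

In this sketch the only difference between the two students is the order in which the data is presented: `joint_student` sees D and D' mixed together, while `self_trained_student` treats D' as a pretraining stage and finishes on the clean labels in D.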



