Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings

01/22/2019
by Hwiyeol Jo, et al.

We propose a novel and simple method for semi-supervised text classification. The method starts from the hypothesis that a classifier with pretrained word embeddings always outperforms the same classifier with randomly initialized word embeddings, as empirically observed across NLP tasks. Our method first builds two sets of classifiers as a form of model ensemble and initializes their word embeddings differently: one set with random vectors, the other with pretrained word embeddings. Following the self-training framework, we focus on the examples where the two classifiers' predictions on unlabeled data differ. We also introduce label refinement and early stopping in the meta-epoch to improve confidence in the labels assigned by prediction. We experiment on four classification datasets, showing that our method outperforms training on the labeled set alone. Delta-training also outperforms conventional self-training on multi-class classification, showing robust performance against error accumulation.
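The core idea of the abstract can be sketched in code: train two copies of the same classifier that differ only in initialization (random vs. "pretrained"), then pseudo-label the unlabeled examples on which they disagree using the pretrained-init model, and retrain on the enlarged set. The sketch below is a minimal, hypothetical illustration on toy 2-D features with a hand-rolled logistic regression; the "pretrained" initialization is simulated by a weight vector near the true separator, standing in for pretrained word embeddings. It omits the paper's label refinement and meta-epoch early stopping.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, w_init, epochs=200, lr=0.5):
    """Binary logistic regression via plain gradient descent."""
    w = w_init.astype(float).copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))       # sigmoid probabilities
        w -= lr * X.T @ (p - y) / len(y)       # gradient of log-loss
    return w

def predict(w, X):
    return (X @ w > 0).astype(int)

# Toy data: label = 1 iff x0 + x1 > 0 (a stand-in for text features).
X_lab = rng.normal(size=(20, 2))
y_lab = (X_lab.sum(axis=1) > 0).astype(int)
X_unl = rng.normal(size=(200, 2))              # unlabeled pool

# Same model, two initializations: random vs. simulated "pretrained"
# (a vector near the true separator -- an assumption of this sketch).
w_rand = train_logreg(X_lab, y_lab, rng.normal(size=2))
w_pre = train_logreg(X_lab, y_lab, np.array([1.0, 1.0]))

# Delta step: keep only unlabeled points where the two models disagree,
# pseudo-label them with the pretrained-init model, retrain on the union.
disagree = predict(w_rand, X_unl) != predict(w_pre, X_unl)
X_new = np.vstack([X_lab, X_unl[disagree]])
y_new = np.concatenate([y_lab, predict(w_pre, X_unl[disagree])])
w_final = train_logreg(X_new, y_new, w_pre)
```

The disagreement mask is what distinguishes this from vanilla self-training, which would pseudo-label by confidence alone; disagreements flag examples where the weaker (random-init) model is likely wrong and the pseudo-label is therefore informative.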

