LOPS: Learning Order Inspired Pseudo-Label Selection for Weakly Supervised Text Classification

05/25/2022
by Dheeraj Mekala, et al.

Weakly supervised text classification methods typically train a deep neural classifier on pseudo-labels. The quality of these pseudo-labels is crucial to final performance, yet they are inevitably noisy because they are generated heuristically, so selecting the correct ones has substantial potential to boost performance. One straightforward solution is to select samples based on the softmax probability that the neural classifier assigns to their pseudo-labels. However, our experiments show that such solutions are ineffective and unstable, because poorly calibrated models make erroneous high-confidence predictions. Recent studies on the memorization effects of deep neural models suggest that these models first memorize training samples with clean labels and then those with noisy labels. Inspired by this observation, we propose LOPS, a novel pseudo-label selection method that takes the learning order of samples into consideration. We hypothesize that the learning order reflects, in terms of ranking, the probability of wrong annotation, and we therefore propose to select the samples that are learnt earlier. LOPS can be viewed as a strong performance-boosting plug-in for most existing weakly supervised text classification methods, as confirmed by extensive experiments on four real-world datasets.
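
The selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how learning order is measured, so this example assumes a sample counts as "learnt" at the first epoch in which the classifier's prediction matches its pseudo-label. The function names, the `keep_fraction` parameter, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def first_learned_epoch(
    epoch_predictions: Sequence[Sequence[int]],
    pseudo_labels: Sequence[int],
) -> List[Optional[int]]:
    """Return, per sample, the first epoch whose prediction equals its pseudo-label.

    epoch_predictions[e][i] is the predicted class of sample i after epoch e.
    Samples never predicted as their pseudo-label get None.
    """
    learned: List[Optional[int]] = [None] * len(pseudo_labels)
    for epoch, preds in enumerate(epoch_predictions):
        for i, (pred, label) in enumerate(zip(preds, pseudo_labels)):
            if learned[i] is None and pred == label:
                learned[i] = epoch
    return learned


def select_early_learned(
    epoch_predictions: Sequence[Sequence[int]],
    pseudo_labels: Sequence[int],
    keep_fraction: float = 0.5,
) -> List[int]:
    """Keep the indices of the earliest-learned samples (assumed selection rule)."""
    learned = first_learned_epoch(epoch_predictions, pseudo_labels)
    # Rank only samples that were learned at some point; sorted() is stable,
    # so ties keep their original index order.
    ranked = sorted(
        (i for i, e in enumerate(learned) if e is not None),
        key=lambda i: learned[i],
    )
    n_keep = int(len(pseudo_labels) * keep_fraction)
    return sorted(ranked[:n_keep])


# Toy run: 4 samples, 3 epochs of made-up predictions.
pseudo_labels = [0, 1, 1, 0]
epoch_predictions = [
    [0, 0, 1, 1],  # epoch 0: samples 0 and 2 match their pseudo-labels
    [0, 1, 1, 1],  # epoch 1: sample 1 now matches
    [0, 1, 1, 1],  # epoch 2: sample 3 never matches
]
selected = select_early_learned(epoch_predictions, pseudo_labels, keep_fraction=0.5)
print(selected)
```

Unlike a softmax-threshold filter, this ranking only uses when a sample is first fit, so it does not rely on the classifier's confidence values being well calibrated.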

research
09/02/2018

Weakly-Supervised Neural Text Classification

Deep neural networks are gaining increasing popularity for the classic t...
research
10/13/2022

LIME: Weakly-Supervised Text Classification Without Seeds

In weakly-supervised text classification, only label names act as source...
research
05/24/2022

WeDef: Weakly Supervised Backdoor Defense for Text Classification

Existing backdoor defense methods are only effective for limited trigger...
research
10/06/2021

Weakly-supervised Text Classification Based on Keyword Graph

Weakly-supervised text classification has received much attention in rec...
research
08/11/2023

Weakly Supervised Text Classification on Free Text Comments in Patient-Reported Outcome Measures

Free text comments (FTC) in patient-reported outcome measures (PROMs) da...
research
10/13/2022

ComSearch: Equation Searching with Combinatorial Strategy for Solving Math Word Problems with Weak Supervision

Previous studies have introduced a weakly-supervised paradigm for solvin...
research
06/05/2023

CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels

Utilizing language models (LMs) without internal access is becoming an a...
