Identifying Semantically Difficult Samples to Improve Text Classification

02/13/2023
by Shashank Mujumdar, et al.

In this paper, we investigate how addressing difficult samples in a text dataset affects the downstream text classification task. We define difficult samples as non-obvious cases for text classification, identified by analysing them in the semantic embedding space. Specifically, these are (i) semantically similar samples that belong to different classes and (ii) semantically dissimilar samples that belong to the same class. We propose a penalty function that measures an overall difficulty score for every sample in the dataset. Exhaustive experiments on 13 standard datasets show a consistent improvement of up to 9% from our approach to identifying difficult samples for a text classification model.
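
The two notions of difficulty described above can be illustrated with a small sketch. This is not the paper's penalty function, which the abstract does not specify. It is a simple stand-in that assumes precomputed sentence embeddings and cosine similarity, and the function name `difficulty_scores` and the toy data are hypothetical. A sample's score grows when it is close to samples of other classes (case i) and far from samples of its own class (case ii).

```python
import numpy as np

def difficulty_scores(embeddings, labels):
    """Illustrative difficulty score per sample (an assumption, not the paper's formula).

    score = mean cosine similarity to other-class samples
          + mean cosine distance to same-class samples
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    # Normalise rows so that dot products are cosine similarities.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = X @ X.T

    same = y[:, None] == y[None, :]
    np.fill_diagonal(same, False)       # a sample is not its own neighbour
    other = y[:, None] != y[None, :]

    scores = np.zeros(len(y))
    for i in range(len(y)):
        # Case (i): semantically similar samples from different classes.
        cross = sim[i, other[i]].mean() if other[i].any() else 0.0
        # Case (ii): semantically dissimilar samples from the same class.
        within = (1.0 - sim[i, same[i]]).mean() if same[i].any() else 0.0
        scores[i] = cross + within
    return scores

# Toy 2-D "embeddings": the last sample is labelled 1 but sits among class 0.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]]
toy_labels = [0, 0, 1, 1, 1]
scores = difficulty_scores(toy_embeddings, toy_labels)
hardest = int(np.argmax(scores))
```

In this toy setup the last sample receives the highest score, since it is both near the class 0 samples and far from the other class 1 samples. A real pipeline would replace the toy vectors with embeddings from a sentence encoder and would likely restrict the averages to nearest neighbours rather than the whole dataset.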


research
05/26/2017

A WL-SPPIM Semantic Model for Document Classification

In this paper, we explore SPPIM-based text classification method, and th...
research
11/05/2018

Evolutionary Data Measures: Understanding the Difficulty of Text Classification Tasks

Classification tasks are usually analysed and improved through new model...
research
09/23/2020

Text Classification with Novelty Detection

This paper studies the problem of detecting novel or unexpected instance...
research
05/10/2018

Text classification based on ensemble extreme learning machine

In this paper, we propose a novel approach based on cost-sensitive ensem...
research
10/06/2020

Identifying Spurious Correlations for Robust Text Classification

The predictions of text classifiers are often driven by spurious correla...
research
05/04/2022

Are All the Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification

In this paper, we ask the research question of whether all the datasets ...
research
06/25/2022

Protoformer: Embedding Prototypes for Transformers

Transformers have been widely applied in text classification. Unfortunat...
