Multidimensional Perceptron for Efficient and Explainable Long Text Classification

04/04/2023
by Yexiang Wang, et al.

Because of the inevitable cost and complexity of transformers and pre-trained models, long text classification raises efficiency concerns. Meanwhile, in highly sensitive domains such as healthcare and legal long-text mining, potential distrust of models, though underrated and underexplored, may raise serious concerns. Existing methods generally segment the long text, encode each segment with a pre-trained model, and use attention or RNNs to obtain a long-text representation for classification. In this work, we propose a simple but effective model, Segment-aWare multIdimensional PErceptron (SWIPE), to replace attention/RNNs in the above framework. Unlike prior efforts, SWIPE can effectively learn the label of the entire text with supervised training, while perceiving the labels of the segments and estimating their contributions to the long-text label in an unsupervised manner. As a general classifier, SWIPE can support different encoders, and it outperforms SOTA models in terms of classification accuracy and model efficiency. Notably, SWIPE also achieves superior interpretability, making long text classification results more transparent.
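
The generic segment, encode, aggregate framework that the abstract attributes to existing methods can be sketched roughly as follows. This is only an illustration under stated assumptions and is not SWIPE: the hashed bag-of-words `encode` function is a toy stand-in for a pre-trained encoder, mean pooling stands in for the attention/RNN aggregator, and the classifier weights are hypothetical, untrained values. All function names are invented for this sketch.

```python
import zlib
from typing import List


def segment(text: str, size: int = 4) -> List[List[str]]:
    """Split a text into consecutive windows of at most `size` words."""
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def encode(tokens: List[str], dim: int = 8) -> List[float]:
    """Toy stand-in for a pre-trained encoder: hashed bag-of-words counts."""
    vec = [0.0] * dim
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    return vec


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Aggregate segment vectors into one document vector.

    Placeholder for the attention/RNN aggregator; assumes a non-empty list.
    """
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def linear_score(vec: List[float], weights: List[float]) -> float:
    """Score a vector with a single linear unit (one class logit)."""
    return sum(v * w for v, w in zip(vec, weights))


text = "the patient reported chest pain and was admitted for observation overnight"
segments = segment(text)
segment_vectors = [encode(s) for s in segments]
document_vector = mean_pool(segment_vectors)

# Hypothetical, untrained weights used only to make the sketch executable.
weights = [0.1 * (i + 1) for i in range(8)]
document_score = linear_score(document_vector, weights)
segment_scores = [linear_score(v, weights) for v in segment_vectors]
```

With mean pooling and a linear scorer, the document score is simply the average of the per-segment scores, so each segment's contribution can be read off directly. That property belongs to this toy setup only and says nothing about how SWIPE estimates segment contributions.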


