Bag-of-Words vs. Sequence vs. Graph vs. Hierarchy for Single- and Multi-Label Text Classification

04/08/2022
by Andor Diera, et al.

Graph neural networks have triggered a resurgence of graph-based text classification methods, defining today's state of the art. We show that a simple multi-layer perceptron (MLP) using a Bag of Words (BoW) outperforms the recent graph-based models TextGCN and HeteGCN in an inductive text classification setting and is comparable to HyperGAT in single-label classification. We also run our own experiments on multi-label classification, where the simple MLP outperforms the recent sequence-based gMLP and aMLP models. Moreover, we fine-tune a sequence-based BERT and a lightweight DistilBERT model, both of which outperform all other models in the single-label and multi-label settings on most datasets. These results call into question the importance of the synthetic graphs used in modern text classifiers. In terms of parameters, DistilBERT is still twice as large as our BoW-based wide MLP, while graph-based models like TextGCN require setting up an 𝒪(N^2) graph, where N is the vocabulary plus corpus size.
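To make the size comparison in the abstract concrete, here is a minimal sketch in plain Python. It assumes whitespace tokenization, raw term counts as BoW features, and a single hidden layer. The function names, the toy documents, and the hidden size are all illustrative assumptions and are not taken from the paper. The graph count is the number of entries in a dense adjacency matrix over N = vocabulary + corpus nodes, which is an upper bound rather than what a sparse implementation would store.

```python
from collections import Counter


def build_vocab(docs):
    """Map each distinct whitespace token to a column index."""
    vocab = {}
    for doc in docs:
        for tok in doc.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def bow_vector(doc, vocab):
    """Term-count vector; out-of-vocabulary tokens are ignored."""
    vec = [0] * len(vocab)
    for tok, count in Counter(doc.lower().split()).items():
        if tok in vocab:
            vec[vocab[tok]] = count
    return vec


def dense_graph_entries(vocab_size, num_docs):
    """Entries of a dense adjacency matrix over word and document nodes."""
    n = vocab_size + num_docs
    return n * n


def mlp_parameter_count(vocab_size, hidden, num_classes):
    """Weights plus biases of an MLP with one hidden layer (assumed shape)."""
    return (vocab_size * hidden + hidden) + (hidden * num_classes + num_classes)


# Toy corpus, purely illustrative.
train_docs = ["graph networks classify text", "simple models classify text well"]
vocab = build_vocab(train_docs)

# An unseen document is vectorized with the fixed vocabulary alone,
# which is what makes the BoW representation usable inductively.
unseen = bow_vector("simple graph models", vocab)

graph_entries = dense_graph_entries(len(vocab), len(train_docs))
mlp_params = mlp_parameter_count(len(vocab), hidden=4, num_classes=2)
```

In this sketch the MLP parameter count grows linearly with the vocabulary size for a fixed hidden width, whereas the dense graph grows quadratically in vocabulary plus corpus size, and the graph would need new document nodes before an unseen document could be scored.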


research
09/08/2021

Forget me not: A Gentle Reminder to Mind the Simple Multi-Layer Perceptron Baseline for Text Classification

Graph neural networks have triggered a resurgence of graph-based text cl...
research
12/10/2020

GNN-XML: Graph Neural Networks for Extreme Multi-label Text Classification

Extreme multi-label text classification (XMTC) aims to tag a text instan...
research
03/26/2021

Heterogeneous Graph Neural Networks for Multi-label Text Classification

Multi-label text classification (MLTC) is an attractive and challenging ...
research
04/23/2023

Graph Neural Networks for Text Classification: A Survey

Text Classification is the most essential and fundamental problem in Nat...
research
05/21/2023

F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks

Computational complexity and overthinking problems have become the bottl...
research
10/27/2022

BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text Classification

Multi-label Text Classification (MLTC) is the task of categorizing docum...
research
01/11/2023

Multi-label Image Classification using Adaptive Graph Convolutional Networks: from a Single Domain to Multiple Domains

This paper proposes an adaptive graph-based approach for multi-label ima...
