Word-Class Embeddings for Multiclass Text Classification

11/26/2019
by Alejandro Moreo, et al.

Pre-trained word embeddings encode general word semantics and lexical regularities of natural language, and have proven useful across many NLP tasks, including word sense disambiguation, machine translation, and sentiment analysis. In supervised tasks such as multiclass text classification (the focus of this article), it seems appealing to enhance word representations with ad-hoc embeddings that encode task-specific information. We propose (supervised) word-class embeddings (WCEs), and show that, when concatenated to (unsupervised) pre-trained word embeddings, they substantially facilitate the training of deep-learning models in multiclass classification by topic. We show empirical evidence that WCEs yield a consistent improvement in multiclass classification accuracy, using four popular neural architectures and six widely used and publicly available datasets for multiclass text classification. Our code implementing WCEs is publicly available at https://github.com/AlexMoreo/word-class-embeddings
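The abstract does not spell out how WCEs are computed, so the following is only a rough sketch of the concatenation idea, not the authors' implementation (for that, see the repository linked above). It assumes that a word's class embedding can be approximated by a standardised, normalised dot product between the word's document weights and the class labels. The function names `word_class_embeddings` and `concat_embeddings`, the toy matrices, and the random stand-in for pre-trained vectors are all hypothetical.

```python
import numpy as np


def word_class_embeddings(X, Y):
    """Hypothetical WCE sketch: one row per word, one column per class.

    X: (n_docs, n_words) non-negative term weights (e.g. counts or tf-idf).
    Y: (n_docs, n_classes) binary document-label matrix.
    The scoring function used here is an assumption, not the paper's definition.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # L2-normalise each word column so frequent words do not dominate.
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    norms[norms == 0] = 1.0
    wce = (X / norms).T @ Y  # shape: (n_words, n_classes)
    # Standardise each class dimension to zero mean and unit variance.
    std = wce.std(axis=0, keepdims=True)
    std[std == 0] = 1.0
    return (wce - wce.mean(axis=0, keepdims=True)) / std


def concat_embeddings(pretrained, wce):
    """Append the supervised WCE columns to the unsupervised word vectors."""
    if pretrained.shape[0] != wce.shape[0]:
        raise ValueError("both matrices need one row per vocabulary word")
    return np.concatenate([pretrained, wce], axis=1)


# Toy data: 4 documents, 3 words, 2 classes.
X = np.array([[2, 0, 1],
              [1, 0, 0],
              [0, 3, 1],
              [0, 1, 0]])
Y = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]])

# Random stand-in for 5-dimensional pre-trained vectors (illustration only).
pretrained = np.random.default_rng(0).normal(size=(3, 5))

wce = word_class_embeddings(X, Y)
E = concat_embeddings(pretrained, wce)
print(E.shape)
```

In this sketch each word ends up with its pre-trained dimensions followed by one extra dimension per class, so the resulting matrix could be used to initialise the embedding layer of a neural classifier.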


