Efficient strategies for hierarchical text classification: External knowledge and auxiliary tasks

05/05/2020
by Kervy Rivas Rojas, et al.

In hierarchical text classification, we perform a sequence of inference steps to predict the category of a document from top to bottom of a given class taxonomy. Most studies have focused on developing novel neural network architectures to deal with the hierarchical structure, but we prefer to look for efficient ways to strengthen a baseline model. We first define the task as a sequence-to-sequence problem. Afterwards, we propose an auxiliary synthetic task of bottom-up classification. Then, from external dictionaries, we retrieve textual definitions for the classes of all the hierarchy's layers and map them into the word vector space. We use the class-definition embeddings as an additional input to condition the prediction of the next layer and in an adapted beam search. Whereas the modified search did not provide large gains, the combination of the auxiliary task and the additional input of class definitions significantly enhances the classification accuracy. With our efficient approaches, we outperform previous studies, using a drastically reduced number of parameters, on two well-known English datasets.
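
As a rough illustration of the framing described in the abstract, the sketch below builds the two target sequences for one document (top-down for the main task, bottom-up for the auxiliary task) and computes a class-definition embedding. The class path, the toy word vectors, and the helper names `make_targets` and `definition_embedding` are all hypothetical. Averaging word vectors is only one simple way to map a definition into the word vector space and is an assumption here, since the abstract does not state which mapping is used.

```python
from typing import Dict, List, Sequence, Tuple


def make_targets(path: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Return (top_down, bottom_up) target sequences for one document.

    `path` lists the document's classes from the root layer to the leaf layer.
    """
    top_down = list(path)             # main task: predict layer by layer, root first
    bottom_up = list(reversed(path))  # auxiliary task: same labels, leaf first
    return top_down, bottom_up


def definition_embedding(
    definition: str, word_vectors: Dict[str, List[float]]
) -> List[float]:
    """Average the vectors of the in-vocabulary words of a class definition.

    Averaging is an assumption made for this sketch, not the paper's stated method.
    """
    tokens = [t for t in definition.lower().split() if t in word_vectors]
    if not tokens:
        raise ValueError("no in-vocabulary words in definition")
    dim = len(word_vectors[tokens[0]])
    return [sum(word_vectors[t][i] for t in tokens) / len(tokens) for i in range(dim)]


# Toy example (hypothetical taxonomy path and 2-dimensional word vectors).
path = ["Science", "Physics", "Optics"]
top_down, bottom_up = make_targets(path)

toy_vectors = {"study": [1.0, 0.0], "of": [0.0, 0.0], "light": [0.0, 1.0]}
optics_vec = definition_embedding("Study of light", toy_vectors)
```

In a sequence-to-sequence model, a vector such as `optics_vec` for the class predicted at one layer could be supplied as an extra decoder input when predicting the next layer, which is the conditioning role the abstract assigns to the class-definition embeddings.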


