Hierarchy-Aware T5 with Path-Adaptive Mask Mechanism for Hierarchical Text Classification

09/17/2021
by   Wei Huang, et al.

Hierarchical Text Classification (HTC), which aims to predict text labels organized in a hierarchical space, is an important but under-investigated task in natural language processing. Existing methods usually encode the entire hierarchical structure and fail to construct a robust label-dependent model, which makes accurate prediction on sparse lower-level labels difficult and leads to low Macro-F1. In this paper, we propose a novel PAMM-HiA-T5 model for HTC: a hierarchy-aware T5 model with a path-adaptive mask mechanism that not only builds the knowledge of upper-level labels into lower-level ones but also introduces path dependency information into label prediction. Specifically, we generate a multi-level sequential label structure using Breadth-First Search (BFS) and the T5 model to exploit hierarchical dependency across different levels. To further improve label dependency prediction within each path, we then propose an original path-adaptive mask mechanism (PAMM) that identifies each label's path information and eliminates sources of noise from other paths. Comprehensive experiments on three benchmark datasets show that our PAMM-HiA-T5 model greatly outperforms all state-of-the-art HTC approaches, especially in Macro-F1. The ablation studies show that the improvements come mainly from our approach rather than from T5 itself.
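The two ideas in the abstract, a BFS-ordered multi-level label sequence and a mask that restricts each label to its own path, can be illustrated with a small sketch. This is not the authors' implementation: the toy hierarchy, the function names `bfs_label_sequence` and `path_mask`, the `/` level separator, and the rule "a label may see only itself and its ancestors" are all assumptions made for illustration, and the model itself (T5) is omitted.

```python
from collections import deque

# Hypothetical toy hierarchy (parent -> children); "Root" is a virtual root.
HIERARCHY = {
    "Root": ["Science", "Sports"],
    "Science": ["Physics", "Biology"],
    "Sports": ["Soccer"],
    "Physics": [],
    "Biology": [],
    "Soccer": [],
}


def bfs_label_sequence(hierarchy, gold_labels, root="Root", level_sep="/"):
    """Flatten a document's gold labels into a level-by-level token sequence.

    Labels are visited breadth-first, so all level-1 labels come before
    any level-2 label. Levels are separated by `level_sep` (an assumed
    separator token, not taken from the paper).
    """
    gold = set(gold_labels)
    levels = []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in hierarchy.get(node, []):
            if child in gold:
                if len(levels) <= depth:
                    levels.append([])
                levels[depth].append(child)
                queue.append((child, depth + 1))
    tokens = []
    for i, level in enumerate(levels):
        if i > 0:
            tokens.append(level_sep)
        tokens.extend(level)
    return tokens


def path_mask(hierarchy, tokens, root="Root"):
    """Build a 0/1 matrix where mask[i][j] == 1 means position i may use j.

    Assumed rule: each token sees itself and, among earlier tokens, only
    its own ancestors, so labels on other paths are hidden.
    """
    parent = {c: p for p, children in hierarchy.items() for c in children}

    def ancestors(label):
        found = set()
        while label in parent and parent[label] != root:
            label = parent[label]
            found.add(label)
        return found

    mask = []
    for i, tok in enumerate(tokens):
        allowed = ancestors(tok)
        mask.append([
            1 if j == i or (j < i and tokens[j] in allowed) else 0
            for j in range(len(tokens))
        ])
    return mask


tokens = bfs_label_sequence(HIERARCHY, ["Science", "Physics", "Sports", "Soccer"])
mask = path_mask(HIERARCHY, tokens)
```

In this toy case the sequence is `Science Sports / Physics Soccer`, and the mask row for `Physics` enables only `Science` and `Physics` itself, so `Sports` and `Soccer` cannot influence it. In a real sequence-to-sequence model such a matrix would be applied to subword-level attention, which this sketch does not attempt.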

research
05/05/2022

Exploiting Global and Local Hierarchies for Hierarchical Text Classification

Hierarchical text classification aims to leverage label hierarchy in mul...
research
04/06/2020

Joint Embedding of Words and Category Labels for Hierarchical Multi-label Text Classification

Text classification has become increasingly challenging due to the conti...
research
03/08/2022

Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification

Hierarchical text classification is a challenging subtask of multi-label...
research
05/24/2023

HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification

Hierarchical text classification (HTC) is a challenging subtask of multi...
research
04/12/2021

HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization

The current state-of-the-art model HiAGM for hierarchical text classific...
research
08/02/2023

Global Hierarchical Neural Networks using Hierarchical Softmax

This paper presents a framework in which hierarchical softmax is used to...
research
04/30/2018

Staircase Network: structural language identification via hierarchical attentive units

Language recognition system is typically trained directly to optimize cl...
