Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers

06/24/2023
by Yu Zhang, et al.

Instead of relying on human-annotated training samples to build a classifier, weakly supervised scientific paper classification aims to classify papers using only category descriptions (e.g., category names, category-indicative keywords). Existing studies on weakly supervised paper classification pay little attention to two challenges: (1) given a large and fine-grained label space, papers should be classified not only into coarse-grained research topics but also into fine-grained themes, and potentially into multiple themes; and (2) the full text should be used to complement the paper title and abstract for classification. Moreover, instead of viewing the entire paper as a long linear sequence, one should exploit structural information such as citation links across papers and the hierarchy of sections and paragraphs within each paper. To tackle these challenges, we propose FUTEX, a framework that uses the cross-paper network structure and the in-paper hierarchy structure to classify full-text scientific papers under weak supervision. A network-aware contrastive fine-tuning module and a hierarchy-aware aggregation module are designed to leverage the two types of structural signals, respectively. Experiments on two benchmark datasets demonstrate that FUTEX significantly outperforms competitive baselines and is on par with fully supervised classifiers that use 1,000 to 60,000 ground-truth training samples.
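
The abstract names two structural signals but does not spell out how they are used, so the following is only an illustrative sketch under stated assumptions, not the FUTEX implementation. It assumes that a citation-linked paper serves as the positive example in an InfoNCE-style contrastive loss, that paragraph vectors are averaged into section vectors and then into a paper vector, and that a paper receives every label whose description vector exceeds a cosine-similarity threshold. All function names, the temperature, and the threshold are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    (assumed here to be a citation-linked paper) than to the negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def aggregate_paper(sections):
    """Assumed two-level aggregation: paragraphs -> section -> paper.
    `sections` is a list of sections, each a list of paragraph vectors."""
    return mean_vector([mean_vector(paragraphs) for paragraphs in sections])


def predict_labels(paper_vector, label_vectors, threshold=0.5):
    """Multi-label prediction: keep every label whose description vector
    is similar enough to the paper vector (threshold is an assumption)."""
    return sorted(
        name
        for name, vector in label_vectors.items()
        if cosine(paper_vector, vector) >= threshold
    )
```

For example, `aggregate_paper([[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0]]])` averages the two sections equally regardless of how many paragraphs each contains, which is one possible way to keep a long section from dominating the paper representation.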

research
11/07/2021

MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information

We study the problem of weakly supervised text classification, which aim...
research
07/24/2022

Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions

Action understanding has evolved into the era of fine granularity, as mo...
research
04/26/2018

Network Transplanting

This paper focuses on a novel problem, i.e., transplanting a category-an...
research
09/22/2021

Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data

Existing text classification methods mainly focus on a fixed label set, ...
research
06/12/2023

Weakly-Supervised Scientific Document Classification via Retrieval-Augmented Multi-Stage Training

Scientific document classification is a critical task for a wide range o...
research
07/26/2023

UnScientify: Detecting Scientific Uncertainty in Scholarly Full Text

This demo paper presents UnScientify, an interactive system designed to ...
research
05/01/2018

Weakly Supervised Attention Learning for Textual Phrases Grounding

Grounding textual phrases in visual content is a meaningful yet challeng...
