
Human in the loop: How to effectively create coherent topics by manually labeling only a few documents per class

by Anton Thielmann, et al.

Few-shot methods for accurate modeling in sparse-label settings have improved significantly. However, applications of few-shot modeling in natural language processing remain confined to document classification. With recent performance improvements, supervised few-shot methods combined with a simple topic extraction method pose a significant challenge to unsupervised topic modeling methods. Our research shows that supervised few-shot learning, combined with a simple topic extraction method, can outperform unsupervised topic modeling techniques at generating coherent topics, even when only a few labeled documents per class are used.
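The general idea of pairing a few-shot classifier with a simple topic extraction step can be illustrated with a small sketch. The abstract does not name the classifier or the extraction method, so everything below is an assumption made for illustration: a nearest-centroid classifier over bag-of-words counts stands in for the few-shot model, and a class-level TF-IDF score stands in for the topic extraction step. The function names (`fit_centroids`, `predict`, `extract_topics`) and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; keeps alphabetic tokens only.
    return [t for t in text.lower().split() if t.isalpha()]


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def fit_centroids(labeled):
    # Assumed stand-in for a few-shot model: sum term counts per class.
    centroids = {}
    for text, label in labeled:
        centroids.setdefault(label, Counter()).update(tokenize(text))
    return centroids


def predict(text, centroids):
    # Assign the class whose centroid is most similar to the document.
    vec = Counter(tokenize(text))
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def extract_topics(docs, labels, top_n=2):
    # Assumed extraction step: term frequency within a class, weighted by
    # log(n_classes / number of classes containing the term). Terms that
    # occur in every class therefore score zero.
    class_counts = {}
    for text, label in zip(docs, labels):
        class_counts.setdefault(label, Counter()).update(tokenize(text))
    n_classes = len(class_counts)
    topics = {}
    for label, counts in class_counts.items():
        total = sum(counts.values())
        scores = {}
        for term, count in counts.items():
            df = sum(1 for c in class_counts.values() if term in c)
            scores[term] = (count / total) * math.log(n_classes / df)
        # Sort by descending score, breaking ties alphabetically.
        topics[label] = sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
    return topics


# A few manually labeled documents per class (toy data).
labeled = [
    ("the team won the football match", "sports"),
    ("the striker scored a goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell sharply", "finance"),
]
unlabeled = [
    "the goal decided the match",
    "interest rates hurt stock markets",
]

centroids = fit_centroids(labeled)
predicted = [predict(doc, centroids) for doc in unlabeled]

# Topics are extracted from all documents grouped by (given or predicted) class.
all_docs = [text for text, _ in labeled] + unlabeled
all_labels = [label for _, label in labeled] + predicted
topics = extract_topics(all_docs, all_labels)
print(predicted)
print(topics)
```

In this sketch the human supplies only the four labeled documents; the remaining documents are assigned to classes automatically, and each class's highest-scoring terms serve as its topic words. A real few-shot model would likely replace the centroid classifier, but the division of labor stays the same.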



