Re-visiting Automated Topic Model Evaluation with Large Language Models

05/20/2023
by Dominik Stammbach, et al.

Topic models are used to make sense of large text collections. However, automatically evaluating topic model output and determining the optimal number of topics have both been longstanding challenges, with no effective automated solutions to date. This paper proposes using large language models to evaluate such output. We find that large language models appropriately assess the resulting topics, correlating more strongly with human judgments than existing automated metrics. We then investigate whether large language models can be used to automatically determine the optimal number of topics. We automatically assign labels to documents and find that choosing the configurations with the purest labels returns reasonable values for the optimal number of topics.
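
The abstract does not spell out how label purity is computed, so the following is only a minimal sketch under an assumption: that "purity" means the standard clustering purity score, i.e. the fraction of documents whose assigned label matches the majority label of their topic. The function names `purity` and `best_num_topics`, the `runs` dictionary, and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def purity(topic_assignments, labels):
    """Standard clustering purity (assumed definition, not confirmed by the source).

    topic_assignments: topic id per document
    labels: label per document (e.g. assigned by a language model)
    """
    if not labels or len(topic_assignments) != len(labels):
        raise ValueError("need two non-empty sequences of equal length")
    # Count how often each label occurs inside each topic.
    counts_by_topic = {}
    for topic, label in zip(topic_assignments, labels):
        counts_by_topic.setdefault(topic, Counter())[label] += 1
    # Sum the size of the majority label in every topic.
    majority_total = sum(c.most_common(1)[0][1] for c in counts_by_topic.values())
    return majority_total / len(labels)


def best_num_topics(runs, labels):
    """Pick the configuration whose topics have the purest labels.

    runs: hypothetical dict mapping a number of topics k to the
          per-document topic ids produced by a model trained with k topics.
    """
    return max(runs, key=lambda k: purity(runs[k], labels))


# Toy data, invented for illustration only.
labels = ["sports", "sports", "politics", "politics", "science", "science"]
runs = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
chosen_k = best_num_topics(runs, labels)
```

Plain purity can only stay the same or rise when topics are split further, so in practice a comparison across very different numbers of topics may need a correction or a complementary score; whether and how the paper handles this is not stated in the abstract.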


