Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence

07/05/2021
by Alexander Hoyle, et al.

Topic model evaluation, like evaluation of other unsupervised methods, can be contentious. However, the field has coalesced around automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference corpus. Recent models relying on neural components surpass classical topic models according to these metrics. At the same time, unlike the evaluation of classical models, the evaluation of neural topic models suffers from a validation gap: automatic coherence for neural models has not been validated using human experimentation. In addition, as we show via a meta-analysis of topic modeling literature, there is a substantial standardization gap in the use of automated topic modeling benchmarks. We address both the standardization gap and the validation gap. Using two of the most widely used topic model evaluation datasets, we assess a dominant classical model and two state-of-the-art neural models in a systematic, clearly documented, reproducible way. We use automatic coherence along with the two most widely accepted human judgment tasks, namely, topic rating and word intrusion. Automated evaluation will declare one model significantly different from another when corresponding human evaluations do not, calling into question the validity of fully automatic evaluations independent of human judgments.
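To make the idea of a co-occurrence-based coherence estimate concrete, here is a minimal sketch of one common estimator, normalized pointwise mutual information (NPMI), averaged over pairs of a topic's top words. The abstract does not say which coherence metric is used, so the choice of NPMI, the document-level co-occurrence counting, the function name `npmi_coherence`, and the toy corpus are all assumptions made for illustration, not the authors' setup.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words.

    Assumption: two words "co-occur" if they appear in the same
    reference document. Sliding-window counts are another common choice.
    """
    if len(top_words) < 2:
        raise ValueError("need at least two top words to form a pair")

    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of reference documents containing every given word.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: NPMI lower bound
        elif p12 == 1.0:
            scores.append(1.0)   # words always co-occur: avoid dividing by log(1) = 0
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical toy reference corpus (tokenized documents).
toy_corpus = [
    ["cat", "dog"],
    ["cat", "dog"],
    ["car", "road"],
    ["car", "road"],
]

coherent = npmi_coherence(["cat", "dog"], toy_corpus)    # words share documents
incoherent = npmi_coherence(["cat", "car"], toy_corpus)  # words never share a document
```

A score like this depends entirely on the reference corpus and the counting scheme, which is one reason differing benchmark setups can make reported coherence values hard to compare across papers.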

research
05/23/2023

Contextualized Topic Coherence Metrics

The recent explosion in work on neural topic modeling has been criticize...
research
06/01/2023

A Call for Standardization and Validation of Text Style Transfer Evaluation

Text Style Transfer (TST) evaluation is, in practice, inconsistent. Ther...
research
10/28/2022

Are Neural Topic Models Broken?

Recently, the relationship between automated and human evaluation of top...
research
05/18/2019

Automatic Evaluation of Local Topic Quality

Topic models are typically evaluated with respect to the global topic di...
research
06/30/2021

Evaluation of Thematic Coherence in Microblogs

Collecting together microblogs representing opinions about the same topi...
research
05/21/2021

Have you tried Neural Topic Models? Comparative Analysis of Neural and Non-Neural Topic Models with Application to COVID-19 Twitter Data

Topic models are widely used in studying social phenomena. We conduct a ...
research
09/25/2019

PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space

Finding the right reviewers to assess the quality of conference submissi...
