Federated Non-negative Matrix Factorization for Short Texts Topic Modeling with Mutual Information

05/26/2022
by   Shijing Si, et al.

Non-negative matrix factorization (NMF) based topic modeling is widely used in natural language processing (NLP) to uncover hidden topics in short text documents. Training a high-quality topic model usually requires a large amount of textual data. In many real-world scenarios, however, customer textual data is private and sensitive, which precludes uploading it to data centers. This paper proposes a Federated NMF (FedNMF) framework, which allows multiple clients to collaboratively train a high-quality NMF-based topic model with locally stored data. However, standard federated learning significantly undermines the performance of topic models in downstream tasks (e.g., text classification) when the data distribution across clients is heterogeneous. To alleviate this issue, we further propose FedNMF+MI, which simultaneously maximizes the mutual information (MI) between the count features of local texts and their topic weight vectors. Experimental results show that FedNMF+MI outperforms Federated Latent Dirichlet Allocation (FedLDA) and FedNMF without MI on short texts by a significant margin in both coherence score and classification F1 score.
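
To make the federated setup concrete, the following is a minimal sketch of how clients might jointly fit an NMF topic model without sharing raw text. It is not the paper's algorithm: it assumes a Frobenius-norm objective, standard multiplicative updates on each client, and FedAvg-style averaging of the shared topic-word matrix weighted by client document counts. The mutual information term of FedNMF+MI is omitted. The names `local_update`, `fed_nmf`, `n_steps`, and `n_rounds` are hypothetical.

```python
import numpy as np

EPS = 1e-10  # keeps the multiplicative updates away from division by zero


def local_update(X, W, H, n_steps=5):
    """Run a few multiplicative NMF updates on one client's data.

    X: (docs x vocab) non-negative count matrix held by the client.
    W: (docs x topics) document-topic weights; never leaves the client.
    H: (topics x vocab) topic-word matrix received from the server.
    """
    H = H.copy()
    for _ in range(n_steps):
        W = W * (X @ H.T) / (W @ H @ H.T + EPS)
        H = H * (W.T @ X) / (W.T @ W @ H + EPS)
    return W, H


def fed_nmf(client_data, n_topics, n_rounds=20, seed=0):
    """Assumed FedAvg-style loop: only H is exchanged, X and W stay local."""
    rng = np.random.default_rng(seed)
    vocab_size = client_data[0].shape[1]
    H = rng.random((n_topics, vocab_size)) + 0.1
    Ws = [rng.random((X.shape[0], n_topics)) + 0.1 for X in client_data]
    sizes = np.array([X.shape[0] for X in client_data], dtype=float)

    for _ in range(n_rounds):
        local_Hs = []
        for k, X in enumerate(client_data):
            Ws[k], H_k = local_update(X, Ws[k], H)
            local_Hs.append(H_k)
        # Server step: average client topic matrices, weighted by document count.
        H = np.average(np.stack(local_Hs), axis=0, weights=sizes)
    return Ws, H
```

In this sketch only the topic-word matrix `H` travels between clients and server, so the count features and per-document topic weights remain on each client. When client data is heterogeneous, the averaged `H` can fit each client poorly, which is the degradation the abstract says the MI term is meant to mitigate.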
