Topic Modelling on Consumer Financial Protection Bureau Data: An Approach Using BERT Based Embeddings

05/15/2022
by Vasudeva Raju Sangaraju, et al.

Customer reviews and comments help businesses understand how users feel about their products and services. To support efficient customer assistance, however, this data must be analyzed so that sentiment can be linked to specific topics or aspects. Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA) fail to capture semantic relationships and are not specific to any domain. In this study, we evaluate BERTopic, a novel method that generates topics using sentence embeddings, on Consumer Financial Protection Bureau (CFPB) data. Our work shows that BERTopic is flexible and provides more meaningful and diverse topics than LDA and LSA. Furthermore, domain-specific pre-trained embeddings (FinBERT) yield even better topics. We evaluated the topics using two coherence measures, c_v and UMass.
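To make the UMass measure mentioned above concrete, here is a minimal sketch of how a UMass-style coherence score can be computed for a single topic from document co-occurrence counts. It is an illustration under stated assumptions, not the evaluation code used in the paper: the function name `umass_coherence`, the `eps` smoothing parameter, the choice to average over word pairs, and the toy complaint tokens are all hypothetical.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents, eps=1.0):
    """Mean pairwise UMass-style score for one topic (illustrative sketch).

    top_words: topic words ordered from most to least probable.
    documents: iterable of tokenized documents (lists of strings).
    eps: smoothing constant added to the co-document count (assumed 1.0).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for i, j in combinations(range(len(top_words)), 2):
        earlier, later = top_words[i], top_words[j]
        denom = doc_freq(earlier)
        if denom == 0:
            continue  # skip pairs whose conditioning word never occurs
        scores.append(math.log((doc_freq(earlier, later) + eps) / denom))

    # Averaging over pairs is an assumption; some formulations sum instead.
    return sum(scores) / len(scores) if scores else float("nan")


# Toy tokenized complaints, invented purely for illustration.
docs = [
    ["credit", "card", "fee"],
    ["credit", "card", "interest"],
    ["mortgage", "loan", "payment"],
    ["loan", "payment", "late"],
]

coherent = umass_coherence(["credit", "card"], docs)
mixed = umass_coherence(["credit", "loan"], docs)
```

In this toy corpus, "credit" and "card" always appear together while "credit" and "loan" never do, so the first topic scores higher than the second. Higher (less negative) values indicate that a topic's top words tend to co-occur in the same documents.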

research
07/18/2018

Latent Dirichlet Allocation (LDA) for Topic Modeling of the CFPB Consumer Complaints

A text mining approach is proposed based on latent Dirichlet allocation ...
research
04/08/2020

Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence

Topic models extract meaningful groups of words from documents, allowing...
research
04/21/2022

Is Neural Topic Modelling Better than Clustering? An Empirical Study on Clustering with Contextual Embeddings for Topics

Recent work incorporates pre-trained word embeddings such as BERT embedd...
research
05/04/2020

Modelling Grocery Retail Topic Distributions: Evaluation, Interpretability and Stability

Understanding the shopping motivations behind market baskets has high co...
research
11/15/2021

Regional Topics in British Grocery Retail Transactions

Understanding the customer behaviours behind transactional data has high...
research
10/07/2015

Assisting Composition of Email Responses: a Topic Prediction Approach

We propose an approach for helping agents compose email replies to custo...
research
02/05/2021

How Pandemic Spread in News: Text Analysis Using Topic Model

Researches about COVID-19 has increased largely, no matter in the biolog...
