A Flexible Clustering Pipeline for Mining Text Intentions

02/01/2022
by Xinyu Chen, et al.

Mining latent intentions from large volumes of natural language input is a key step in helping data analysts design and refine Intelligent Virtual Assistants (IVAs) for customer service and sales support. We created a flexible and scalable clustering pipeline within the Verint Intent Manager (VIM) that integrates the fine-tuning of language models, a high-performing k-NN library, and community detection techniques to help analysts quickly surface and organize relevant user intentions from conversational texts. The fine-tuning step is necessary because pre-trained language models cannot encode texts in a way that efficiently surfaces particular clustering structures when the target texts come from an unseen domain or the clustering task is not topic detection. We describe the pipeline and demonstrate its performance using BERT on three real-world text mining tasks. As deployed in the VIM application, this clustering pipeline produces high-quality results, improving the performance of data analysts and reducing the time it takes to surface intentions from customer service data, which in turn reduces the time it takes to build and deploy IVAs in new domains.
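The last two stages named in the abstract, a k-NN graph over text embeddings followed by community detection, can be illustrated with a minimal sketch. This is not the VIM implementation. It assumes the embeddings already exist (here they are hand-written toy 2-D vectors rather than output from a fine-tuned BERT model), it uses a brute-force cosine search in place of a high-performing k-NN library, and it uses a simple deterministic label propagation as a stand-in for whichever community detection technique the pipeline actually applies. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_graph(vectors, k):
    """Build an undirected k-NN graph by brute force (assumption: a real
    pipeline would use an approximate k-NN library instead)."""
    n = len(vectors)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(vectors[i], vectors[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def label_propagation(adj, max_iters=20):
    """Asynchronous label propagation with deterministic tie-breaking
    (smallest label wins). A stand-in for community detection."""
    labels = {node: node for node in adj}
    for _ in range(max_iters):
        changed = False
        for node in sorted(adj):
            if not adj[node]:
                continue  # isolated node keeps its own label
            counts = Counter(labels[nb] for nb in adj[node])
            top = max(counts.values())
            best = min(lab for lab, c in counts.items() if c == top)
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def cluster(vectors, k=2):
    """Return one cluster id per vector, numbered by first appearance."""
    labels = label_propagation(knn_graph(vectors, k))
    remap = {}
    return [remap.setdefault(labels[i], len(remap)) for i in range(len(vectors))]


# Toy embeddings (assumption): two well-separated groups of three texts each.
toy_embeddings = [
    (1.0, 0.0), (0.9, 0.1), (0.95, 0.05),
    (0.0, 1.0), (0.1, 0.9), (0.05, 0.95),
]
clusters = cluster(toy_embeddings, k=2)
print(clusters)
```

In this sketch the quality of the clusters depends entirely on the embeddings, which is the point the abstract makes about fine-tuning: if the encoder does not place texts with the same intention close together, neither the k-NN graph nor the community detection step can recover the intended structure.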


