Building the Intent Landscape of Real-World Conversational Corpora with Extractive Question-Answering Transformers

08/26/2022
by   Jean-Philippe Corbeil, et al.

For companies with customer service, mapping intents inside their conversational data is crucial in building applications based on natural language understanding (NLU). Nevertheless, there is no established automated technique to gather the intents from noisy online chats or voice transcripts. Simple clustering approaches are not suited to intent-sparse dialogues. To solve this intent-landscape task, we propose an unsupervised pipeline that extracts the intents and the taxonomy of intents from real-world dialogues. Our pipeline mines intent-span candidates with an extractive Question-Answering ELECTRA model and leverages sentence embeddings to apply a low-level density clustering followed by a top-level hierarchical clustering. Our results demonstrate the generalization ability of an ELECTRA large model fine-tuned on the SQuAD2 dataset to understand dialogues. With the right prompting question, this model achieves a rate of linguistic validation on intent spans beyond 85%. We furthermore reconstructed the intent schemes of five domains from the MultiDoGo dataset with an average recall of 94.3%.
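The two-level clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithms, so a simplified DBSCAN-style routine stands in for the low-level density clustering and a single-linkage merge of cluster centroids stands in for the top-level hierarchical clustering. The intent spans, the 2-D vectors (hand-made stand-ins for sentence embeddings), and the parameters `eps`, `min_pts`, and `max_dist` are all hypothetical, and the question-answering extraction step is not run here.

```python
import math

def density_cluster(points, eps, min_pts):
    """Simplified DBSCAN-style clustering; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)
    cluster = -1

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels

def centroids(points, labels):
    """Mean vector of each non-noise cluster."""
    groups = {}
    for point, label in zip(points, labels):
        if label >= 0:
            groups.setdefault(label, []).append(point)
    return {label: tuple(sum(axis) / len(members) for axis in zip(*members))
            for label, members in groups.items()}

def group_clusters(cents, max_dist):
    """Single-linkage agglomerative merge of centroids, stopping at max_dist."""
    groups = [[label] for label in sorted(cents)]

    def link(g, h):
        return min(math.dist(cents[a], cents[b]) for a in g for b in h)

    while len(groups) > 1:
        d, i, j = min((link(groups[i], groups[j]), i, j)
                      for i in range(len(groups))
                      for j in range(i + 1, len(groups)))
        if d > max_dist:
            break
        merged = groups.pop(j)          # j > i, so index i stays valid
        groups[i].extend(merged)
    return groups

# Hypothetical intent spans with toy 2-D "embeddings" (assumption, not real data).
spans = ["book a flight", "reserve a plane ticket", "buy airline tickets",
         "cancel my flight", "cancel the booking", "drop my reservation",
         "reset my password", "forgot my password", "unlock my account",
         "thanks bye"]
vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (1.0, 0.0), (1.1, 0.0), (1.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
           (9.0, 0.0)]

labels = density_cluster(vectors, eps=0.3, min_pts=2)
taxonomy = group_clusters(centroids(vectors, labels), max_dist=2.0)
print(labels)    # low-level intent clusters, -1 marks spans treated as noise
print(taxonomy)  # top-level groups of low-level cluster ids
```

In this toy setup the low-level step separates booking, cancellation, and account spans and leaves the isolated span as noise, while the top-level step groups the two flight-related clusters together. With real sentence embeddings, a cosine-based distance and library implementations of the clustering algorithms would likely be more appropriate.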


research
03/02/2023

QAID: Question Answering Inspired Few-shot Intent Detection

Intent detection with semantically similar fine-grained intents is a cha...
research
03/25/2019

Question Embeddings Based on Shannon Entropy: Solving intent classification task in goal-oriented dialogue system

Question-answering systems and voice assistants are becoming major part ...
research
12/18/2020

Exploring Fluent Query Reformulations with Text-to-Text Transformers and Reinforcement Learning

Query reformulation aims to alter potentially noisy or ambiguous text se...
research
07/17/2019

Almawave-SLU: A new dataset for SLU in Italian

The widespread use of conversational and question answering systems made...
research
08/10/2020

A Bootstrapped Model to Detect Abuse and Intent in White Supremacist Corpora

Intelligence analysts face a difficult problem: distinguishing extremist...
research
09/24/2019

Technical report on Conversational Question Answering

Conversational Question Answering is a challenging task since it require...
research
02/01/2022

A Flexible Clustering Pipeline for Mining Text Intentions

Mining the latent intentions from large volumes of natural language inpu...
