Dialog Intent Induction via Density-based Deep Clustering Ensemble

01/18/2022
by Jiashu Pu, et al.
Existing task-oriented chatbots rely heavily on spoken language understanding (SLU) systems to determine the intent of a user's utterance and other key information needed to fulfill specific tasks. In real-life applications, it is crucial to occasionally induce novel dialog intents from conversation logs to improve the user experience. In this paper, we propose the Density-based Deep Clustering Ensemble (DDCE) method for dialog intent induction. Compared to existing K-means-based methods, our proposed method is more effective in real-life scenarios where a large number of outliers exist. To maximize data utilization, we jointly optimize the text representations and the hyperparameters of the clustering algorithm. In addition, we design an outlier-aware clustering ensemble framework to handle the overfitting issue. Experimental results on seven datasets show that our proposed method significantly outperforms state-of-the-art baselines.
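
The contrast with K-means drawn in the abstract can be illustrated with a minimal sketch. The code below is plain DBSCAN, a standard density-based algorithm, and is not the DDCE method itself: the abstract does not specify which density-based algorithm, representation model, or ensemble scheme the authors use. The toy 2-D points stand in for utterance embeddings, and the names `dbscan`, `eps`, and `min_pts` as well as the chosen values are assumptions made for illustration only.

```python
from math import dist

NOISE = -1  # label given to points that belong to no dense region


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one label per point, NOISE (-1) for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, keep expanding
        cluster += 1
    return labels


# Toy 2-D "embeddings" (hypothetical): two dense groups plus two outliers.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (2.5, 9.0), (9.0, 0.5),
]
labels = dbscan(points, eps=0.3, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1, -1]
```

The two isolated points receive the label -1 instead of being forced into a cluster. K-means, by contrast, assigns every point to its nearest centroid, so outlying utterances would be absorbed into an intent cluster and would pull its centroid toward them.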


