Topic Detection from Conversational Dialogue Corpus with Parallel Dirichlet Allocation Model and Elbow Method

06/05/2020
by   Haider Khalid, et al.

A conversational system needs to know how to switch between topics in order to sustain a conversation over a longer period. Topic detection from a dialogue corpus has therefore become an important task, and accurate prediction of conversation topics is essential for building coherent and engaging dialogue systems. In this paper, we propose a topic detection approach based on the Parallel Latent Dirichlet Allocation (PLDA) model, which clusters a vocabulary of known similar words using TF-IDF scores and the Bag of Words (BOW) technique. In the experiments, we use K-means clustering with the Elbow method to interpret and validate within-cluster consistency and to select the optimal number of clusters. We evaluate our approach by comparing it with traditional LDA and clustering techniques. The experimental results show that combining PLDA with the Elbow method selects the optimal number of clusters and refines the topics for the conversation.
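The cluster-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it leaves out PLDA entirely, uses a plain Lloyd's K-means on TF-IDF vectors, and picks the elbow as the point of the inertia curve that lies farthest below the chord joining its first and last values, which is one common heuristic among several. The function names (`tfidf_vectors`, `kmeans_inertia`, `elbow_k`), the IDF variant, and the toy utterances are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Dense TF-IDF vectors for tokenized, non-empty documents.

    Assumption: idf = log(N / df) + 1, one of several common variants.
    """
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] / len(d) * idf[w] for w in vocab])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points, k, iters=50, seed=0):
    """Run Lloyd's K-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # Keep the old center when a cluster ends up empty.
        new_centers = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_k(inertias):
    """Return the 1-based k whose inertia lies farthest below the chord
    joining the first and last inertia values (needs at least two values)."""
    n = len(inertias)
    first, last = inertias[0], inertias[-1]
    best_k, best_gap = 1, 0.0
    for i, value in enumerate(inertias):
        chord = first + (last - first) * i / (n - 1)
        gap = chord - value
        if gap > best_gap:
            best_k, best_gap = i + 1, gap
    return best_k


if __name__ == "__main__":
    # Hypothetical toy utterances, already tokenized.
    utterances = [
        "i love watching football matches".split(),
        "the football team won the match".split(),
        "what is your favourite pizza topping".split(),
        "i had pizza and pasta for dinner".split(),
        "the new phone has a great camera".split(),
        "my phone battery drains quickly".split(),
    ]
    vectors = tfidf_vectors(utterances)
    inertias = [kmeans_inertia(vectors, k) for k in range(1, len(vectors) + 1)]
    print("inertia per k:", [round(v, 3) for v in inertias])
    print("elbow at k =", elbow_k(inertias))
```

On real dialogue data the inertia curve is usually noisier than on a toy corpus, so the result depends on the initialization seed and on the range of k values tried; inspecting the plotted curve alongside any automated pick is a reasonable precaution.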

