Unsupervised Dialogue Topic Segmentation in Hyperdimensional Space

08/21/2023
by Seongmin Park, et al.

We present HyperSeg, a hyperdimensional computing (HDC) approach to unsupervised dialogue topic segmentation. HDC is a class of vector symbolic architectures that leverages the probabilistic orthogonality of randomly drawn vectors at extremely high dimensions (typically over 10,000). HDC generates rich token representations through its low-cost initialization of many unrelated vectors. This is especially beneficial in topic segmentation, which often operates as a resource-constrained pre-processing step for downstream transcript understanding tasks. HyperSeg outperforms the current state-of-the-art in 4 out of 5 segmentation benchmarks – even when baselines are given partial access to the ground truth – and is 10 times faster on average. We show that HyperSeg also improves downstream summarization accuracy. With HyperSeg, we demonstrate the viability of HDC in a major language task. We open-source HyperSeg to provide a strong baseline for unsupervised topic segmentation.
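The probabilistic orthogonality that HDC relies on can be illustrated with a minimal sketch. This is not HyperSeg's implementation: the bipolar {-1, +1} encoding, the fixed seed, and the helper names `random_bipolar` and `cosine` are assumptions made for illustration. It only shows that two independently drawn vectors at 10,000 dimensions are nearly orthogonal, so many unrelated token vectors can be initialized cheaply without training.

```python
import math
import random


def random_bipolar(dim, rng):
    """Draw a random hypervector with entries in {-1, +1} (assumed encoding)."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
DIM = 10_000            # dimensionality in the range mentioned in the abstract

a = random_bipolar(DIM, rng)
b = random_bipolar(DIM, rng)

sim_ab = cosine(a, b)  # independent draws: close to 0
sim_aa = cosine(a, a)  # a vector with itself: 1

print(f"cos(a, b) = {sim_ab:.4f}")
print(f"cos(a, a) = {sim_aa:.4f}")
```

For independent bipolar vectors the cosine similarity has a standard deviation of about 1/sqrt(DIM), roughly 0.01 at this dimensionality, which is why random draws can serve as effectively unrelated symbols.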

research
05/26/2022

Unsupervised Abstractive Dialogue Summarization with Word Graphs and POV Conversion

We advance the state-of-the-art in unsupervised abstractive dialogue sum...
research
06/24/2021

Unsupervised Topic Segmentation of Meetings with BERT Embeddings

Topic segmentation of meetings is the task of dividing multi-person meet...
research
06/12/2021

Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair Coherence Scoring

Dialogue topic segmentation is critical in several dialogue modeling pro...
research
05/15/2023

Uncovering the Potential of ChatGPT for Discourse Analysis in Dialogue: An Empirical Study

Large Language Models (LLMs) like ChatGPT have proven a great shallow un...
research
05/31/2021

APEX: Unsupervised, Object-Centric Scene Segmentation and Tracking for Robot Manipulation

Recent advances in unsupervised learning for object detection, segmentat...
research
11/27/2022

Topic Segmentation in the Wild: Towards Segmentation of Semi-structured & Unstructured Chats

Breaking down a document or a conversation into multiple contiguous segm...
