Unsupervised Word Segmentation Using Temporal Gradient Pseudo-Labels

03/30/2023
by   Tzeviya Sylvia Fuchs, et al.
0

Unsupervised word segmentation in audio utterances is challenging as, in speech, there is typically no gap between words. In a preliminary experiment, we show that recent deep self-supervised features are very effective for word segmentation but require supervision for training the classification head. To extend their effectiveness to unsupervised word segmentation, we propose a pseudo-labeling strategy. Our approach relies on the observation that the temporal gradient magnitude of the embeddings (i.e. the distance between the embeddings of subsequent frames) is typically minimal far from the boundaries and higher nearer the boundaries. We use a thresholding function on the temporal gradient magnitude to define a psuedo-label for wordness. We train a linear classifier, mapping the embedding of a single frame to the pseudo-label. Finally, we use the classifier score to predict whether a frame is a word or a boundary. In an empirical investigation, our method, despite its simplicity and fast run time, is shown to significantly outperform all previous methods on two datasets.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/15/2020

SF-Net: Single-Frame Supervision for Temporal Action Localization

In this paper, we study an intermediate form of supervision, i.e., singl...
research
04/27/2022

Unsupervised Word Segmentation using K Nearest Neighbors

In this paper, we propose an unsupervised kNN-based approach for word se...
research
12/20/2022

Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data

Self-training has been shown to be helpful in addressing data scarcity f...
research
04/23/2021

STRUDEL: Self-Training with Uncertainty Dependent Label Refinement across Domains

We propose an unsupervised domain adaptation (UDA) approach for white ma...
research
09/12/2023

OTAS: Unsupervised Boundary Detection for Object-Centric Temporal Action Segmentation

Temporal action segmentation is typically achieved by discovering the dr...
research
12/22/2022

Timestamp-Supervised Action Segmentation in the Perspective of Clustering

Video action segmentation aims to slice the video into several action se...

Please sign up or login with your details

Forgot password? Click here to reset