Unsupervised Discovery of Recurring Speech Patterns Using Probabilistic Adaptive Metrics

08/03/2020
by Okko Räsänen, et al.

Unsupervised spoken term discovery (UTD) aims at finding recurring segments of speech in a corpus of acoustic speech data. One potential approach to this problem is to use dynamic time warping (DTW) to find well-aligning patterns in the speech data. However, automatically selecting initial candidate segments for DTW alignment, and detecting "sufficiently good" alignments among them, requires some type of pre-defined criteria, often operationalized as threshold parameters for pair-wise distance metrics between signal representations. In existing UTD systems, the optimal hyperparameters may differ across datasets, limiting their applicability to new corpora and truly low-resource scenarios. In this paper, we propose a novel probabilistic approach to DTW-based UTD named PDTW. In PDTW, distributional characteristics of the processed corpus are utilized for adaptive evaluation of alignment quality, thereby enabling systematic discovery of pattern pairs whose similarity exceeds what would be expected by coincidence. We test PDTW on the Zero Resource Speech Challenge 2017 datasets as a part of the 2020 implementation of the challenge. The results show that the system performs consistently on all five tested languages using fixed hyperparameters, clearly outperforming the earlier DTW-based system in terms of coverage of the detected patterns.
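
The idea of judging an alignment against what chance alone would produce can be illustrated with a minimal sketch. This is not the PDTW algorithm from the paper. The function names `dtw_cost`, `chance_probability`, and `is_match`, the Euclidean frame distance, the length normalization, and the significance level `alpha` are all assumptions chosen for illustration. The sketch computes a DTW alignment cost for a candidate pair and compares it with the empirical distribution of costs between randomly paired segments from the same corpus, so the decision rule adapts to the corpus instead of relying on a fixed distance threshold.

```python
import math
import random


def dtw_cost(a, b):
    """Length-normalized DTW alignment cost between two sequences of
    feature vectors (lists of equal-dimension float lists)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumed)
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)


def chance_probability(cost, background_costs):
    """Empirical probability that a randomly chosen segment pair aligns
    at least as well as `cost` (add-one smoothing avoids a zero estimate)."""
    hits = sum(1 for c in background_costs if c <= cost)
    return (hits + 1) / (len(background_costs) + 1)


def is_match(cost, background_costs, alpha=0.05):
    """Accept a pair only if its cost is unlikely under the chance distribution.
    `alpha` is a hypothetical significance level, not a value from the paper."""
    return chance_probability(cost, background_costs) < alpha


if __name__ == "__main__":
    rng = random.Random(0)

    def random_segment(length=10, dim=3):
        return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(length)]

    # Background distribution: costs between randomly paired segments.
    background = [dtw_cost(random_segment(), random_segment()) for _ in range(200)]

    # Candidate pair: a segment and a slightly perturbed copy of it.
    seg = random_segment()
    noisy = [[x + rng.gauss(0.0, 0.05) for x in frame] for frame in seg]
    cost = dtw_cost(seg, noisy)
    print(cost, chance_probability(cost, background), is_match(cost, background))
```

Because the acceptance rule is stated as a probability under the corpus's own cost distribution, the same `alpha` can in principle be reused across datasets whose raw distance scales differ, which is the motivation the abstract gives for an adaptive criterion.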

research
10/10/2017

A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments

Most speech and language technologies are trained with massive amounts o...
research
07/27/2018

A small Griko-Italian speech translation corpus

This paper presents an extension to a very low-resource parallel corpus ...
research
12/12/2017

The Zero Resource Speech Challenge 2017

We describe a new challenge aimed at discovering subword and word units ...
research
07/26/2020

Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery

Unsupervised spoken term discovery consists of two tasks: finding the ac...
research
11/28/2020

Unsupervised Spoken Term Discovery Based on Re-clustering of Hypothesized Speech Segments with Siamese and Triplet Networks

Spoken term discovery from untranscribed speech audio could be achieved ...
research
03/23/2017

An embedded segmental K-means model for unsupervised segmentation and clustering of speech

Unsupervised segmentation and clustering of unlabelled speech are core p...
