Unsupervised Pattern Discovery from Thematic Speech Archives Based on Multilingual Bottleneck Features

11/03/2020
by Man-Ling Sung, et al.

The present study tackles the problem of automatically discovering spoken keywords from untranscribed audio archives without requiring word-by-word speech transcription by automatic speech recognition (ASR) technology. The problem is of practical significance in many applications of speech analytics, including those concerning low-resource languages and large amounts of multilingual and multi-genre data. We propose a two-stage approach, which comprises unsupervised acoustic modeling and decoding, followed by pattern mining in acoustic unit sequences. The whole process starts by deriving and modeling a set of subword-level speech units from untranscribed data. With the acoustic models trained in this unsupervised manner, a given audio archive is represented by a pseudo transcription, from which spoken keywords can be discovered by string mining algorithms. For unsupervised acoustic modeling, a deep neural network trained on multilingual speech corpora is used to generate speech segmentation and compute bottleneck features for segment clustering. Experimental results show that the proposed system is able to effectively extract topic-related words and phrases from the lecture recordings on MIT OpenCourseWare.
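
The second stage, pattern mining in acoustic unit sequences, can be illustrated with a minimal sketch. The abstract does not say which string mining algorithm is used, so the example below substitutes plain counting of exact contiguous n-grams and should not be read as the authors' method. The unit labels (`u1`, `u7`, ...), the function name `mine_frequent_patterns`, and its thresholds are all hypothetical.

```python
from collections import Counter


def mine_frequent_patterns(sequences, min_len=3, max_len=6, min_count=2):
    """Return contiguous unit n-grams that occur at least min_count times.

    sequences: list of pseudo transcriptions, each a list of unit labels.
    This is an assumed stand-in for the string mining step, not the
    algorithm used in the paper.
    """
    counts = Counter()
    for seq in sequences:
        for n in range(min_len, max_len + 1):
            # Slide a window of length n over the unit sequence.
            for i in range(len(seq) - n + 1):
                counts[tuple(seq[i:i + n])] += 1
    frequent = [(p, c) for p, c in counts.items() if c >= min_count]
    # Most frequent first; longer patterns first among ties.
    return sorted(frequent, key=lambda pc: (-pc[1], -len(pc[0])))


# Hypothetical pseudo transcriptions of three utterances.
pseudo_transcripts = [
    ["u3", "u7", "u1", "u9", "u4"],
    ["u2", "u7", "u1", "u9", "u5"],
    ["u7", "u1", "u9", "u8"],
]

patterns = mine_frequent_patterns(pseudo_transcripts, min_len=3, max_len=3)
for pattern, count in patterns:
    print(" ".join(pattern), count)
```

Here the recurring unit sequence `u7 u1 u9` plays the role of a candidate spoken keyword. Exact matching is the simplest possible choice; decoded unit sequences from real speech are noisy, so a practical system would likely need to tolerate substitutions and deletions between occurrences.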

research
10/31/2020

Multilingual Bottleneck Features for Improving ASR Performance of Code-Switched Speech in Under-Resourced Languages

In this work, we explore the benefits of using multilingual bottleneck f...
research
11/28/2020

Unsupervised Spoken Term Discovery Based on Re-clustering of Hypothesized Speech Segments with Siamese and Triplet Networks

Spoken term discovery from untranscribed speech audio could be achieved ...
research
06/01/2023

Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili

We consider hate speech detection through keyword spotting on radio broa...
research
07/29/2020

Exploiting Cross-Lingual Knowledge in Unsupervised Acoustic Modeling for Low-Resource Languages

(Short version of Abstract) This thesis describes an investigation on un...
research
11/20/2021

Deep Spoken Keyword Spotting: An Overview

Spoken keyword spotting (KWS) deals with the identification of keywords ...
research
06/02/2022

Pronunciation Dictionary-Free Multilingual Speech Synthesis by Combining Unsupervised and Supervised Phonetic Representations

This paper proposes a multilingual speech synthesis method which combine...
research
04/02/2021

Unsupervised Acoustic Unit Discovery by Leveraging a Language-Independent Subword Discriminative Feature Representation

This paper tackles automatically discovering phone-like acoustic units (...
