Robust Feature Clustering for Unsupervised Speech Activity Detection

06/25/2018
by Harishchandra Dubey, et al.

In applications such as zero-resource speech processing or very-low-resource speech-language systems, it may not be feasible to collect speech activity detection (SAD) annotations. However, state-of-the-art supervised SAD techniques based on neural networks or other machine learning methods require annotated training data matched to the target domain. This paper establishes a clustering approach for fully unsupervised SAD, useful for cases where SAD annotations are not available. The proposed approach leverages the Hartigan dip test in a recursive strategy for segmenting the feature space into prominent modes. The statistical dip is invariant to distortions, which lends robustness to the proposed method. We evaluate the method on NIST OpenSAD 2015 and NIST OpenSAT 2017 public safety communications data. The results show the superiority of the proposed approach over the two-component GMM baseline.

Index Terms: Clustering, Hartigan dip test, NIST OpenSAD, NIST OpenSAT, speech activity detection, zero-resource speech processing, unsupervised learning.
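To make the comparison point concrete, the following is a minimal sketch of what a two-component GMM baseline for unsupervised SAD can look like. It is not the authors' implementation: the abstract does not state which features or settings were used, so the sketch assumes a single per-frame log-energy value, a plain EM fit, and the convention that the component with the higher mean is speech. The function names `fit_two_component_gmm` and `label_speech_frames` are hypothetical.

```python
import math
import random


def _log_gauss(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def fit_two_component_gmm(values, n_iter=50, var_floor=1e-6):
    """Fit a 1-D two-component GMM with EM (hypothetical helper).

    Returns (weights, means, variances), each a list of length 2.
    """
    n = len(values)
    lo, hi = min(values), max(values)
    # Assumption: start the means at the quarter points of the data range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    grand_mean = sum(values) / n
    grand_var = max(sum((v - grand_mean) ** 2 for v in values) / n, var_floor)
    variances = [grand_var, grand_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for v in values:
            logp = [math.log(weights[k]) + _log_gauss(v, means[k], variances[k])
                    for k in range(2)]
            top = max(logp)
            p = [math.exp(lp - top) for lp in logp]
            total = p[0] + p[1]
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave a collapsed component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2
                        for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, var_floor)
    return weights, means, variances


def label_speech_frames(values, weights, means, variances):
    """Mark a frame as speech if the higher-mean component is more likely.

    Assumption: speech frames carry more energy than non-speech frames.
    """
    speech = 0 if means[0] > means[1] else 1
    other = 1 - speech
    labels = []
    for v in values:
        score_speech = math.log(weights[speech]) + _log_gauss(
            v, means[speech], variances[speech])
        score_other = math.log(weights[other]) + _log_gauss(
            v, means[other], variances[other])
        labels.append(score_speech > score_other)
    return labels


# Usage on synthetic log-energies (illustrative values, not real data).
random.seed(0)
frames = ([random.gauss(-60.0, 3.0) for _ in range(300)]
          + [random.gauss(-30.0, 4.0) for _ in range(200)])
gmm = fit_two_component_gmm(frames)
decisions = label_speech_frames(frames, *gmm)
```

A baseline of this kind fixes the number of clusters at two in advance, whereas the approach described in the abstract uses the dip test recursively to decide where the feature space splits into modes.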


