Detection of Glottal Closure Instants using Deep Dilated Convolutional Neural Networks

04/26/2018
by Prathosh A. P., et al.

Glottal Closure Instants (GCIs) correspond to the temporal locations of significant excitation of the vocal tract during the production of voiced speech. Detection of GCIs from speech signals is a well-studied problem, given its importance in speech processing. Most existing approaches to GCI detection proceed in two stages: (i) transformation of the speech signal into a representative signal in which GCIs are better localized, and (ii) extraction of GCIs from the representative signal obtained in the first stage. The first stage is accomplished using signal processing techniques based on the principles of speech production, and the second with heuristic algorithms such as dynamic programming and peak-picking. These methods are thus task-specific and depend on the technique used to extract the representative signal. In this paper, by contrast, we formulate GCI detection from a representation learning perspective, in which an appropriate representation is learned implicitly from raw speech samples. Specifically, GCI detection is cast as a supervised multi-task learning problem, which is solved using a deep dilated convolutional neural network that jointly optimizes a classification cost and a regression cost. The learning capabilities of the proposed model are demonstrated with several experiments on standard datasets. The results compare well with state-of-the-art algorithms and are better in the presence of real-world non-stationary noise.
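
To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch of a dilated 1-D convolution and of a joint classification-plus-regression cost. It is an illustration under stated assumptions, not the authors' implementation: the function names, the binary cross-entropy and squared-error terms, the weighting factor `alpha`, and the choice to count the regression error only on frames that contain a GCI are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


def dilated_conv1d(x, w, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form) of x with kernel w.

    Kernel taps are spaced `dilation` samples apart, so a short kernel
    covers a long stretch of the raw waveform.
    """
    k = len(w)
    span = (k - 1) * dilation      # samples covered beyond the first tap
    n_out = len(x) - span          # output length with no padding
    out = np.zeros(n_out)
    for i in range(k):
        out += w[i] * x[i * dilation : i * dilation + n_out]
    return out


def receptive_field(kernel_size, dilations):
    """Input samples seen by one output of a stack of dilated layers (stride 1)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def joint_loss(p_gci, y_cls, d_pred, d_true, alpha=1.0):
    """Hypothetical multi-task cost: binary cross-entropy plus squared error.

    p_gci  : predicted probability that a frame contains a GCI
    y_cls  : 1 if the frame contains a GCI, else 0
    d_pred : predicted GCI position within the frame
    d_true : reference GCI position within the frame
    alpha  : assumed weight balancing the two terms
    """
    eps = 1e-12                    # guards against log(0)
    bce = -(y_cls * np.log(p_gci + eps) + (1 - y_cls) * np.log(1 - p_gci + eps))
    # Assumption: the regression error is only meaningful where a GCI exists.
    sq_err = y_cls * (d_pred - d_true) ** 2
    return float(np.mean(bce + alpha * sq_err))
```

With a kernel size of 3 and dilations 1, 2, 4 and 8, `receptive_field(3, [1, 2, 4, 8])` gives 31 samples, which shows how doubling the dilation per layer widens the context seen by each output without adding parameters.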

research
11/25/2018

Glottal Closure Instants Detection From Pathological Acoustic Speech Signal Using Deep Learning

In this paper, we propose a classification based glottal closure instant...
research
05/20/2020

Statistical and Neural Network Based Speech Activity Detection in Non-Stationary Acoustic Environments

Speech activity detection (SAD), which often rests on the fact that the ...
research
12/28/2019

Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

The pseudo-periodicity of voiced speech can be exploited in several spee...
research
04/22/2019

hf0: A hybrid pitch extraction method for multimodal voice

Pitch or fundamental frequency (f0) extraction is a fundamental problem ...
research
10/05/2020

FaultNet: A Deep Convolutional Neural Network for bearing fault classification

The increased presence of advanced sensors on the production floors has ...
research
12/28/2019

Glottal Closure and Opening Instant Detection from Speech Signals

This paper proposes a new procedure to detect Glottal Closure and Openin...
research
03/07/2021

An Optimized Signal Processing Pipeline for Syllable Detection and Speech Rate Estimation

Syllable detection is an important speech analysis task with application...