Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks

01/05/2022
by   Dhananjaya Gowda, et al.

Formant tracking is investigated in this study by using trackers based on dynamic programming (DP) and deep neural nets (DNNs). Using the DP approach, six formant estimation methods were first compared. The six methods include linear prediction (LP) algorithms, weighted LP algorithms and the recently developed quasi-closed phase forward-backward (QCP-FB) method. QCP-FB gave the best performance in the comparison. Therefore, a novel formant tracking approach, which combines benefits of deep learning and signal processing based on QCP-FB, was proposed. In this approach, the formants predicted by a DNN-based tracker from a speech frame are refined using the peaks of the all-pole spectrum computed by QCP-FB from the same frame. Results show that the proposed DNN-based tracker performed better both in detection rate and estimation error for the lowest three formants compared to reference formant trackers. Compared to the popular Wavesurfer, for example, the proposed tracker gave a reduction of 29 three formants, respectively.
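The refinement step described in the abstract, in which formants predicted by a DNN are adjusted using the peaks of an all-pole spectrum from the same frame, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the refinement rule, so the sketch snaps each predicted formant to the nearest spectral peak. Plain autocorrelation-method LP stands in for QCP-FB, which is not implemented here. The function names `lp_coefficients`, `spectral_peaks`, and `refine_formants` are hypothetical.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Plain autocorrelation-method LP (a stand-in, NOT QCP-FB).

    Returns the coefficients of A(z) = 1 - sum_k a_k z^-k.
    """
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    # One-sided autocorrelation r[0], r[1], ...
    r = np.correlate(windowed, windowed, mode="full")[len(windowed) - 1:]
    # Toeplitz normal equations R a = r[1..order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))

def spectral_peaks(a, fs, n_fft=1024):
    """Frequencies (Hz) of local maxima of the all-pole spectrum 1/|A(e^jw)|."""
    spectrum = 1.0 / np.abs(np.fft.rfft(a, n_fft))
    interior = spectrum[1:-1]
    is_peak = (interior > spectrum[:-2]) & (interior > spectrum[2:])
    bins = np.nonzero(is_peak)[0] + 1
    return bins * fs / n_fft

def refine_formants(predicted, peaks):
    """Assumed rule: move each predicted formant to its nearest spectral peak."""
    predicted = np.asarray(predicted, dtype=float)
    peaks = np.asarray(peaks, dtype=float)
    if peaks.size == 0:
        return predicted  # nothing to refine against
    nearest = np.abs(peaks[None, :] - predicted[:, None]).argmin(axis=1)
    return peaks[nearest]

# Usage on one frame (random noise here, only to show the call sequence)
fs = 8000
frame = np.random.default_rng(0).standard_normal(400)
a = lp_coefficients(frame, order=10)
refined = refine_formants([500.0, 1500.0, 2500.0], spectral_peaks(a, fs))
```

Under this nearest-peak rule, two predicted formants can land on the same peak. A practical tracker would need a tie-breaking or continuity constraint, which the abstract does not specify.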

research
08/17/2023

Refining a Deep Learning-based Formant Tracker using Linear Prediction Methods

In this study, formant tracking is investigated by refining the formants...
research
08/31/2023

Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking in Speech Signals

In this paper, we propose a new method for the accurate estimation and t...
research
04/03/2019

Unsupervised Deep Tracking

We propose an unsupervised visual tracking method in this paper. Differe...
research
06/07/2020

Maximum Phase Modeling for Sparse Linear Prediction of Speech

Linear prediction (LP) is a ubiquitous analysis method in speech proces...
research
07/02/2018

Waveform to Single Sinusoid Regression to Estimate the F0 Contour from Noisy Speech Using Recurrent Deep Neural Networks

The fundamental frequency (F0) represents pitch in speech that determine...
research
07/22/2020

Unsupervised Deep Representation Learning for Real-Time Tracking

The advancement of visual tracking has continuously been brought by deep...
research
03/01/2021

Unsupervised Classification of Voiced Speech and Pitch Tracking Using Forward-Backward Kalman Filtering

The detection of voiced speech, the estimation of the fundamental freque...
