Acoustic-to-articulatory Inversion based on Speech Decomposition and Auxiliary Feature

04/02/2022
by Jianrong Wang, et al.

Acoustic-to-articulatory inversion (AAI) aims to recover the movement of articulators from speech signals. Achieving speaker-independent AAI remains a challenge given the limited data. Moreover, most current works use only audio speech as input, which creates an inevitable performance bottleneck. To address these problems, we first pre-train a speech decomposition network that decomposes audio speech into a speaker embedding and a content embedding, which serve as new personalized speech features suited to the speaker-independent case. Second, to further improve AAI, we propose a novel auxiliary feature network that estimates lip auxiliary features from these personalized speech features. Experimental results on three public datasets show that, compared with the state-of-the-art method using only the audio speech feature, the proposed method reduces the average RMSE by 0.25 and increases the average correlation coefficient by 2.0. Importantly, the average RMSE decreases by 0.29 and the average correlation coefficient increases by 5.0.
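
The two metrics reported above, RMSE and the correlation coefficient, can be illustrated with a minimal sketch. It assumes predicted and ground-truth articulatory trajectories stored as arrays of shape (frames, channels) and assumes the Pearson coefficient, since the abstract does not say which variant is used. The function names `rmse` and `pearson` and the synthetic data are hypothetical, not taken from the paper.

```python
import numpy as np


def rmse(pred, target):
    """Per-channel root-mean-square error over frames (axis 0)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.sqrt(np.mean((pred - target) ** 2, axis=0))


def pearson(pred, target):
    """Per-channel Pearson correlation over frames (axis 0).

    Assumes every channel has non-zero variance in both arrays.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    pred_c = pred - pred.mean(axis=0)
    target_c = target - target.mean(axis=0)
    num = np.sum(pred_c * target_c, axis=0)
    den = np.sqrt(np.sum(pred_c ** 2, axis=0) * np.sum(target_c ** 2, axis=0))
    return num / den


if __name__ == "__main__":
    # Synthetic stand-in for articulatory data: 200 frames, 12 channels.
    # The shape and noise level are illustrative assumptions only.
    rng = np.random.default_rng(0)
    target = rng.standard_normal((200, 12))
    pred = target + 0.1 * rng.standard_normal((200, 12))

    # Average over channels, as one would before averaging over utterances.
    print("average RMSE:", rmse(pred, target).mean())
    print("average correlation:", pearson(pred, target).mean())
```

Lower RMSE and higher correlation both indicate a closer match, so the reported changes (RMSE down, correlation up) point in the same direction.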
