Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation

04/05/2022
by Marc-Antoine Georges, et al.

We propose a computational model of speech production that combines three components: a pre-trained neural articulatory synthesizer that reproduces complex speech stimuli from a limited set of interpretable articulatory parameters; a DNN-based internal forward model that predicts the sensory consequences of articulatory commands; and an internal inverse model, based on a recurrent neural network, that recovers articulatory commands from the acoustic speech input. The forward and inverse models are jointly trained in a self-supervised way from raw acoustic-only speech data from different speakers. The imitation simulations are evaluated both objectively and subjectively and show encouraging performance.
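The data flow among the three components can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random stand-in for the pre-trained synthesizer, the single-layer networks, and the two mean-squared-error losses are all assumptions made for illustration, since the abstract does not specify them. Only the forward pass and the loss values are shown, with no training step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the abstract does not give the real ones.
N_ACOUSTIC, N_ARTIC, N_HIDDEN = 20, 6, 16

# Frozen random map standing in for the pre-trained articulatory synthesizer.
W_SYNTH = rng.normal(size=(N_ARTIC, N_ACOUSTIC))


def synthesizer(artic):
    """Stand-in synthesizer: articulatory parameters -> acoustic frames."""
    return np.tanh(artic @ W_SYNTH)


def init_params():
    """Random weights for the (assumed) inverse and forward models."""
    return {
        "W_in": rng.normal(scale=0.1, size=(N_ACOUSTIC, N_HIDDEN)),
        "W_rec": rng.normal(scale=0.1, size=(N_HIDDEN, N_HIDDEN)),
        "W_out": rng.normal(scale=0.1, size=(N_HIDDEN, N_ARTIC)),
        "W_fwd": rng.normal(scale=0.1, size=(N_ARTIC, N_ACOUSTIC)),
    }


def inverse_model(params, acoustic):
    """Simple recurrent pass: frames (T, N_ACOUSTIC) -> commands (T, N_ARTIC)."""
    h = np.zeros(N_HIDDEN)
    commands = []
    for frame in acoustic:
        h = np.tanh(frame @ params["W_in"] + h @ params["W_rec"])
        commands.append(np.tanh(h @ params["W_out"]))
    return np.stack(commands)


def forward_model(params, artic):
    """Feed-forward prediction of the acoustic outcome of the commands."""
    return np.tanh(artic @ params["W_fwd"])


def imitation_losses(params, acoustic):
    """Assumed losses: match the heard speech, and match the synthesizer."""
    artic = inverse_model(params, acoustic)
    predicted = forward_model(params, artic)
    produced = synthesizer(artic)
    imitation = np.mean((predicted - acoustic) ** 2)
    forward = np.mean((predicted - produced) ** 2)
    return imitation, forward


params = init_params()
speech = rng.uniform(-1.0, 1.0, size=(50, N_ACOUSTIC))  # fake acoustic frames
imitation, forward = imitation_losses(params, speech)
print(f"imitation loss: {imitation:.4f}, forward-model loss: {forward:.4f}")
```

In this sketch the loop needs no articulatory ground truth: both losses are computed from acoustic frames alone, which is one way to read "self-supervised" here. A real system would minimize them jointly by gradient descent.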


