Estimating articulatory movements in speech production with transformer networks

04/11/2021
by Sathvik Udupa, et al.

We estimate articulatory movements in speech production from two modalities: acoustics and phonemes. Acoustic-to-articulatory inversion (AAI) is a sequence-to-sequence task. Phoneme-to-articulatory (PTA) motion estimation, on the other hand, faces a key challenge in reliably aligning the text with the articulatory movements. To address this challenge, we explore the use of a transformer architecture, FastSpeech, with explicit duration modelling to learn hard alignments between the phonemes and articulatory movements. We also train a transformer model on AAI. We use the correlation coefficient (CC) and root mean squared error (rMSE) to assess estimation performance against existing methods on both tasks. We observe 154 improvement in CC with subject-dependent, pooled and fine-tuning strategies, respectively, for PTA estimation. Additionally, on the AAI task, we obtain 1.5 state-of-the-art baseline. We further present the computational benefits of using the transformer architecture as representation blocks.
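As a rough illustration of two ideas named in the abstract, the sketch below shows (a) how per-phoneme durations can give a hard alignment by repeating each phoneme's feature vector for its number of frames, in the spirit of a FastSpeech-style length regulator, and (b) how CC and RMSE might be computed per articulator channel. This is a minimal NumPy sketch, not the authors' implementation: the function names `length_regulate` and `cc_and_rmse`, the toy features, and the assumption that durations are already known in frames are all hypothetical.

```python
import numpy as np

def length_regulate(phoneme_feats, durations):
    """Repeat each phoneme's feature row for its duration in frames.

    phoneme_feats: (num_phonemes, dim) array; durations: (num_phonemes,) ints.
    Returns a (sum(durations), dim) array, i.e. a hard phoneme-to-frame alignment.
    """
    return np.repeat(phoneme_feats, durations, axis=0)

def cc_and_rmse(pred, target):
    """Per-channel Pearson correlation and RMSE for (frames, channels) arrays."""
    pred_c = pred - pred.mean(axis=0)
    targ_c = target - target.mean(axis=0)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (targ_c ** 2).sum(axis=0))
    cc = (pred_c * targ_c).sum(axis=0) / denom
    rmse = np.sqrt(((pred - target) ** 2).mean(axis=0))
    return cc, rmse

# Toy data (assumed for illustration): 3 phonemes, 2-dimensional features.
phoneme_feats = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
durations = np.array([2, 1, 3])                      # frames per phoneme
frames = length_regulate(phoneme_feats, durations)   # shape (6, 2)

# A target that is an affine function of the prediction: CC is 1, RMSE is not 0.
target = frames * 2.0 + 1.0
cc, rmse = cc_and_rmse(frames, target)
```

The toy target highlights why the two metrics are reported together: correlation captures whether the trajectory shapes agree, while RMSE also penalises differences in scale and offset.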
