Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS

06/03/2019
by Mutian He, et al.

Neural TTS has demonstrated strong capabilities to generate human-like speech with high quality and naturalness, but its generalization to out-of-domain texts remains a challenging task, largely owing to the design of the attention-based sequence-to-sequence acoustic model. Various errors occur on texts with unseen context, including attention collapse, skipping, and repeating, which limit broader applications. In this paper, we propose a novel stepwise monotonic attention method for sequence-to-sequence acoustic modeling to improve robustness on out-of-domain texts. The method exploits the strictly monotonic nature of TTS alignment by imposing an extra constraint on monotonic attention: the alignment between input and output sequences must not only be monotonic but also allow no skipping of inputs. At inference time, soft attention can be used to avoid the mismatch between training and test that arises with monotonic hard attention. Experimental results show that the proposed method achieves significant improvements in robustness across various out-of-domain scenarios, without any regression on the in-domain test set.
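As a concrete illustration of the no-skipping constraint, below is a minimal NumPy sketch of a soft stepwise monotonic alignment update: at each decoder step, the expected attention mass on an encoder position either stays where it is or advances by exactly one position, so no input can be skipped or revisited. The function name, the per-position "stay" probability p_stay (a stand-in for a sigmoid over attention energies), and the toy initialization are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np

def stepwise_monotonic_update(p_stay, alpha_prev):
    """One soft stepwise monotonic attention update (expected alignment).

    At each decoder step the attention head either stays on its current
    encoder position (probability p_stay[j]) or moves forward by exactly
    one position (probability 1 - p_stay[j]); inputs are never skipped.

    p_stay     : (T_enc,) "stay" probabilities, e.g. sigmoid of attention energies.
    alpha_prev : (T_enc,) expected alignment from the previous decoder step.
    """
    stay = alpha_prev * p_stay                      # mass that remains on position j
    move = np.roll(alpha_prev * (1.0 - p_stay), 1)  # mass arriving at j from j-1
    move[0] = 0.0                                   # nothing can move into position 0; mass
                                                    # past the last position is dropped here
    return stay + move

# Toy usage: 5 encoder positions, attention initialised on the first one.
T_enc, T_dec = 5, 8
rng = np.random.default_rng(0)
alpha = np.zeros(T_enc)
alpha[0] = 1.0
for _ in range(T_dec):
    p_stay = rng.uniform(0.3, 0.9, size=T_enc)      # stand-in for sigmoid(energies)
    alpha = stepwise_monotonic_update(p_stay, alpha)
    print(np.round(alpha, 3))                       # mass only ever flows one step right
```

Because mass can only shift forward by at most one position per step, skipping of inputs (and the attention collapse it causes) is ruled out by construction.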

Related research

07/18/2018
Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis
This paper proposes a forward attention method for the sequence-to-seque...

05/15/2019
Exact Hard Monotonic Attention for Character-Level Transduction
Many common character-level, string-to-string transduction tasks, e.g., ...

04/08/2021
On Biasing Transformer Attention Towards Monotonicity
Many sequence-to-sequence tasks in natural language processing are rough...

11/02/2020
FeatherTTS: Robust and Efficient attention based Neural TTS
Attention based neural TTS is elegant speech synthesis pipeline and has ...

12/14/2017
Monotonic Chunkwise Attention
Sequence-to-sequence models with soft attention have been successfully a...

07/14/2023
Expressive Monotonic Neural Networks
The monotonic dependence of the outputs of a neural network on some of i...

08/29/2018
Hard Non-Monotonic Attention for Character-Level Transduction
Character-level string-to-string transduction is an important component ...
