Nana-HDR: A Non-attentive Non-autoregressive Hybrid Model for TTS

09/28/2021
by Shilun Lin, et al.

This paper presents Nana-HDR, a new non-attentive, non-autoregressive TTS model with a hybrid Transformer-based Dense-fuse encoder and an RNN-based decoder. It consists of three main parts: first, a novel Dense-fuse encoder with dense connections between basic Transformer blocks for coarse feature fusion and a multi-head attention layer for fine feature fusion; second, a single-layer non-autoregressive RNN-based decoder; and third, a duration predictor, in place of an attention model, connecting the hybrid encoder and decoder. Experiments indicate that Nana-HDR exploits the advantages of each component: the strong text-encoding ability of the Transformer-based encoder, stateful decoding free from exposure bias and local information preference, and the stable alignment provided by the duration predictor. With these advantages, Nana-HDR achieves competitive naturalness and robustness on two Mandarin corpora.
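The three components map naturally onto a short sketch. Below is a minimal PyTorch illustration of the described pipeline; all class names, layer sizes, and the exact dense-fusion scheme are assumptions made for illustration, since the abstract specifies only the high-level design, not the paper's hyperparameters.

import torch
import torch.nn as nn


class DenseFuseEncoder(nn.Module):
    """Transformer blocks with dense connections (coarse feature fusion)
    followed by a multi-head attention layer (fine feature fusion)."""

    def __init__(self, dim=256, n_blocks=4, n_heads=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.TransformerEncoderLayer(dim, n_heads, dim * 4, batch_first=True)
             for _ in range(n_blocks)]
        )
        # One assumed realisation of the dense connections: project the
        # concatenation of all earlier features back to the model dimension.
        self.fuse = nn.ModuleList(
            [nn.Linear(dim * (i + 2), dim) for i in range(n_blocks)]
        )
        self.fine_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, x):                       # x: (batch, tokens, dim)
        feats = [x]
        for block, fuse in zip(self.blocks, self.fuse):
            h = block(feats[-1])
            feats.append(fuse(torch.cat(feats + [h], dim=-1)))
        out, _ = self.fine_attn(feats[-1], feats[-1], feats[-1])
        return out


class DurationPredictor(nn.Module):
    """Predicts a non-negative duration per input token; this replaces the
    attention model as the bridge between encoder and decoder."""

    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, 1), nn.Softplus())

    def forward(self, enc):                     # (batch, tokens, dim) -> (batch, tokens)
        return self.net(enc).squeeze(-1)


def length_regulate(enc, durations):
    """Expand each encoder state by its rounded duration (a batch of one is
    assumed here to keep the sketch free of padding logic)."""
    reps = durations.round().long().clamp(min=1)
    return torch.stack([torch.repeat_interleave(e, r, dim=0)
                        for e, r in zip(enc, reps)])


class NonAutoregressiveRNNDecoder(nn.Module):
    """Single-layer GRU run once over the expanded encoder states: stateful
    decoding without feeding back its own predictions."""

    def __init__(self, dim=256, n_mels=80):
        super().__init__()
        self.rnn = nn.GRU(dim, dim, num_layers=1, batch_first=True)
        self.proj = nn.Linear(dim, n_mels)

    def forward(self, x):                       # (batch, frames, dim)
        h, _ = self.rnn(x)
        return self.proj(h)                     # predicted mel-spectrogram


encoder = DenseFuseEncoder()
predictor = DurationPredictor()
decoder = NonAutoregressiveRNNDecoder()
phones = torch.randn(1, 12, 256)                # embedded phoneme sequence
states = encoder(phones)
mel = decoder(length_regulate(states, predictor(states)))   # (1, frames, 80)

Because the decoder consumes the length-regulated encoder states rather than its own previous outputs, it keeps a recurrent state without autoregressive feedback, which is how this design sidesteps exposure bias while the duration predictor supplies the alignment.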
