Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model

07/22/2018
by Andros Tjandra, et al.

A sequence-to-sequence model is a neural network module for mapping between two sequences of different lengths. The sequence-to-sequence model has three core modules: encoder, decoder, and attention. Attention is the bridge that connects the encoder and decoder modules, and it improves model performance in many tasks. In this paper, we propose two ideas to improve sequence-to-sequence model performance by enhancing the attention module. First, we maintain the history of the location and the expected context from several previous time-steps. Second, we apply multiscale convolution to several previous attention vectors before combining them with the current decoder state. We applied our proposed framework to sequence-to-sequence speech recognition and text-to-speech systems. The results reveal that our proposed extension significantly improves performance compared to a standard attention baseline.
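The abstract describes the extension only at a high level. As a rough illustrative sketch (not the authors' implementation), attention that conditions on a multiscale convolution over a history of previous alignment vectors might look like the following; the function name, random filters, and the simple sum over the history are all assumptions made for illustration, standing in for learned parameters:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multiscale_location_attention(enc, dec_state, attn_history,
                                  kernel_sizes=(1, 3, 5), seed=0):
    """Score each encoder position from (a) content match with the current
    decoder state and (b) multiscale 1-D convolutions over the aggregated
    history of previous attention distributions.

    enc:          (T, d) encoder hidden states
    dec_state:    (d,)   current decoder state
    attn_history: (H, T) previous attention distributions (one row per step)
    """
    rng = np.random.default_rng(seed)
    # Aggregate the alignment history into one location vector of length T.
    loc = attn_history.sum(axis=0)
    # One 1-D convolution per scale; random filters stand in for learned ones.
    feats = []
    for k in kernel_sizes:
        f = rng.standard_normal(k) / k
        feats.append(np.convolve(loc, f, mode="same"))
    F = np.stack(feats, axis=1)            # (T, num_scales) location features
    w = rng.standard_normal(len(kernel_sizes))
    scores = enc @ dec_state + F @ w       # content term + location term
    return softmax(scores)                 # new attention distribution over T
```

In this sketch the location term lets positions attended in previous steps (at several receptive-field widths) bias the current alignment, which is the intuition behind keeping an attention history.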


Related research

- 12/04/2019: Integrating Whole Context to Sequence-to-sequence Speech Recognition
  Because an attention based sequence-to-sequence speech (Seq2Seq) recogni...
- 06/13/2018: Double Path Networks for Sequence to Sequence Learning
  Encoder-decoder based Sequence to Sequence learning (S2S) has made remar...
- 11/29/2017: PSIque: Next Sequence Prediction of Satellite Images using a Convolutional Sequence-to-Sequence Network
  Predicting unseen weather phenomena is an important issue for disaster m...
- 03/27/2019: Multilevel Text Normalization with Sequence-to-Sequence Networks and Multisource Learning
  We define multilevel text normalization as sequence-to-sequence processi...
- 04/04/2019: Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions
  We propose a fully convolutional sequence-to-sequence encoder architectu...
- 12/21/2017: Variational Attention for Sequence-to-Sequence Models
  The variational encoder-decoder (VED) encodes source information as a se...
