Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing

05/23/2017
by Andros Tjandra, et al.

Recently, encoder-decoder neural networks have shown impressive performance on many sequence-related tasks. The architecture commonly uses an attention mechanism that allows the model to learn alignments between the source and the target sequence. Most attention mechanisms used today are based on a global attention property, which requires computing a weighted summarization over the entire sequence of encoder states. However, this is computationally expensive and often produces misalignments on longer input sequences. Furthermore, it does not fit the monotonic, left-to-right nature of several tasks, such as automatic speech recognition (ASR) and grapheme-to-phoneme conversion (G2P). In this paper, we propose a novel attention mechanism with local and monotonic properties, and we explore various ways to control those properties. Experimental results on ASR, G2P, and machine translation between two languages with similar sentence structures demonstrate that the proposed encoder-decoder model with local monotonic attention achieves significant performance improvements and reduces computational complexity compared with a model using the standard global attention architecture.
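
To make the contrast with global attention concrete, the following is a minimal NumPy sketch of the general idea behind local monotonic attention, not the paper's exact parameterization: the shift predictor, the window size, and the Gaussian width `sigma` are illustrative assumptions. The center of attention can only move forward, and only a small window of encoder states around it is scored.

```python
import numpy as np

def local_monotonic_attention(enc_states, prev_pos, query,
                              window_size=5, sigma=1.5):
    """One decoding step of (sketched) local monotonic attention.

    enc_states: (T, d) encoder states
    prev_pos:   scalar, attention center from the previous step
    query:      (d,) current decoder state
    Returns the context vector and the new, monotonically advanced center.
    """
    T, d = enc_states.shape

    # Predict a strictly positive shift so the center can only move
    # forward (a toy predictor here; the paper explores several ways
    # to constrain this quantity).
    shift = np.exp(np.tanh(query @ enc_states.mean(axis=0) / np.sqrt(d)))
    pos = min(prev_pos + shift, T - 1)

    # Restrict scoring to a local window around the predicted center.
    lo = max(int(pos) - window_size, 0)
    hi = min(int(pos) + window_size + 1, T)
    local = enc_states[lo:hi]                      # (W, d)

    # Content scores inside the window, modulated by a Gaussian prior
    # centered at `pos` so the weights stay concentrated and local.
    scores = local @ query / np.sqrt(d)
    prior = np.exp(-((np.arange(lo, hi) - pos) ** 2) / (2 * sigma ** 2))
    weights = np.exp(scores - scores.max()) * prior
    weights /= weights.sum()

    context = weights @ local                      # (d,) context vector
    return context, pos
```

Because each decoding step touches only the 2·window_size + 1 states around the current center rather than all T encoder states, the per-token cost is O(w) instead of O(T), which is the computational advantage the abstract refers to.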
