You May Not Need Attention

10/31/2018
by Ofir Press, et al.

In NMT, how far can we get without attention and without separate encoding and decoding? To answer that question, we introduce a recurrent neural translation model that does not use attention and does not have a separate encoder and decoder. Our eager translation model is low-latency, writing target tokens as soon as it reads the first source token, and uses constant memory during decoding. It performs on par with the standard attention-based model of Bahdanau et al. (2014), and outperforms it on long sentences.
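
The model's key property, emitting a target token after each source token while carrying only a fixed-size recurrent state, can be illustrated with a short sketch. The code below is a minimal illustration under assumptions of my own (PyTorch, a single LSTM, placeholder vocabulary sizes, dimensions, and class names), not the authors' exact architecture: at every step the network consumes the current source token together with the previously emitted target token, and the LSTM state is the only memory used during decoding.

```python
# Minimal sketch of an "eager", attention-free translation model (illustrative
# only; vocabulary sizes, dimensions, and names are placeholders).
import torch
import torch.nn as nn

class EagerTranslator(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, dim=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        # One recurrent network serves as both "encoder" and "decoder".
        self.rnn = nn.LSTM(2 * dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src_tok, prev_tgt_tok, state=None):
        # Each step sees the current source token and the previously emitted
        # target token; no attention over past source states is computed.
        x = torch.cat([self.src_embed(src_tok),
                       self.tgt_embed(prev_tgt_tok)], dim=-1)
        h, state = self.rnn(x, state)
        return self.out(h), state  # logits over target vocab + new state

# Decoding is constant-memory: only `state` is carried between steps, and a
# target token can be emitted as soon as the first source token is read.
model = EagerTranslator(src_vocab=1000, tgt_vocab=1000)
state = None
prev = torch.zeros(1, 1, dtype=torch.long)        # BOS placeholder
for src_tok in torch.randint(0, 1000, (5,)):      # stream of source tokens
    logits, state = model(src_tok.view(1, 1), prev, state)
    prev = logits.argmax(dim=-1)                  # emit a target token immediately
```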

Related research

10/17/2016
Interactive Attention for Neural Machine Translation
Conventional attention-based Neural Machine Translation (NMT) conducts d...

05/23/2022
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
In Neural Machine Translation (NMT), each token prediction is conditione...

04/28/2019
Neural Machine Translation with Recurrent Highway Networks
Recurrent Neural Networks have lately gained a lot of popularity in lang...

09/13/2021
Attention Weights in Transformer NMT Fail Aligning Words Between Sequences but Largely Explain Model Predictions
This work proposes an extensive analysis of the Transformer architecture...

12/06/2017
Multi-channel Encoder for Neural Machine Translation
Attention-based Encoder-Decoder has the effective architecture for neura...

07/01/2017
Efficient Attention using a Fixed-Size Memory Representation
The standard content-based attention mechanism typically used in sequenc...

06/06/2020
Challenges and Thrills of Legal Arguments
State-of-the-art attention based models, mostly centered around the tran...
