Modelling Bahdanau Attention using Election methods aided by Q-Learning

11/10/2019
by   Rakesh Bal, et al.

Neural Machine Translation has lately gained a lot of "attention" with the advent of increasingly sophisticated and markedly improved models. The attention mechanism has proved to be a boon in this direction: it assigns weights to the input words, making it easier for the decoder to identify the words that represent the present context. However, newer attention models have become more complex and computationally heavier, which makes inference slow. In this paper, we model the attention network using techniques inspired by social choice theory. In addition, the attention mechanism, viewed as a Markov Decision Process, should in theory be representable by reinforcement learning techniques. We therefore propose an election method (k-Borda), fine-tuned using Q-learning, as a replacement for attention networks. The inference time of this network is lower than that of a standard Bahdanau translator, and the translation results are comparable. This not only verifies the above claims experimentally but also provides faster inference.
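
For readers unfamiliar with the election method named above, the following is a minimal sketch of generic k-Borda committee selection: each voter ranks all candidates, a candidate in position i of a ranking over m candidates earns m - 1 - i points, and the k candidates with the highest totals win. The abstract does not say how voters and candidates map onto the translator, so treating candidates as source-word positions is purely an illustrative assumption, and the function name `k_borda` and the example rankings are hypothetical. The Q-learning fine-tuning step is not shown.

```python
from typing import Hashable, List, Sequence


def k_borda(rankings: Sequence[Sequence[Hashable]], k: int) -> List[Hashable]:
    """Return the k candidates with the highest total Borda score.

    Each ranking lists every candidate from most to least preferred.
    Position i in a ranking over m candidates is worth m - 1 - i points.
    """
    scores: dict = {}
    for ranking in rankings:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (m - 1 - position)
    # sorted() is stable, so tied candidates keep their first-seen order.
    ordered = sorted(scores, key=lambda c: -scores[c])
    return ordered[:k]


# Hypothetical example: three voters rank four source-word positions (0..3).
rankings = [
    [2, 0, 1, 3],
    [2, 1, 0, 3],
    [0, 2, 3, 1],
]
winners = k_borda(rankings, k=2)
print(winners)
```

In this toy example the totals are 8, 6, 3 and 1 points for positions 2, 0, 1 and 3 respectively, so the size-2 committee consists of positions 2 and 0.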
