Two-Headed Monster And Crossed Co-Attention Networks

11/10/2019
by   Yaoyiran Li, et al.
0

This paper presents some preliminary investigations of a new co-attention mechanism in neural transduction models. We propose a paradigm, termed Two-Headed Monster (THM), which consists of two symmetric encoder modules and one decoder module connected with co-attention. As a specific and concrete implementation of THM, Crossed Co-Attention Networks (CCNs) are designed based on the Transformer model. We demonstrate CCNs on WMT 2014 EN-DE and WMT 2016 EN-FI translation tasks and our model outperforms the strong Transformer baseline by 0.51 (big) and 0.74 (base) BLEU points on EN-DE and by 0.17 (big) and 0.47 (base) BLEU points on EN-FI.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/05/2019

Learning Deep Transformer Models for Machine Translation

Transformer is the state-of-the-art model in recent machine translation ...
research
06/12/2017

Attention Is All You Need

The dominant sequence transduction models are based on complex recurrent...
research
08/23/2021

Recurrent multiple shared layers in Depth for Neural Machine Translation

Learning deeper models is usually a simple and effective approach to imp...
research
11/06/2021

Transformer Based Bengali Chatbot Using General Knowledge Dataset

An AI chatbot provides an impressive response after learning from the tr...
research
05/02/2020

Hard-Coded Gaussian Attention for Neural Machine Translation

Recent work has questioned the importance of the Transformer's multi-hea...
research
07/09/2020

DCANet: Learning Connected Attentions for Convolutional Neural Networks

While self-attention mechanism has shown promising results for many visi...
research
08/30/2023

Dual-path Transformer Based Neural Beamformer for Target Speech Extraction

Neural beamformers, which integrate both pre-separation and beamforming ...

Please sign up or login with your details

Forgot password? Click here to reset