Layer-Wise Cross-View Decoding for Sequence-to-Sequence Learning

05/16/2020
by Fenglin Liu, et al.

In sequence-to-sequence learning, the attention mechanism has been a great success in bridging information between the encoder and the decoder. However, it is often overlooked that the decoder has only a single view of the source sequence, namely the representations generated by the last encoder layer, which are supposed to provide a global view of the source. Such an implementation keeps the decoder from accessing concrete, fine-grained, local source information. In this work, we explore reusing the representations from different encoder layers for layer-wise cross-view decoding, that is, presenting different views of the source sequence to different decoder layers. We investigate multiple representative strategies for cross-view decoding, of which the granularity consistent attention (GCA) strategy proves the most efficient and effective in experiments on neural machine translation. In particular, GCA surpasses the previous state-of-the-art architecture on three machine translation datasets.
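To make the idea of layer-wise cross-view decoding concrete, the sketch below shows one plausible realization in PyTorch. It assumes, as an illustration only, that the granularity consistent attention (GCA) strategy pairs decoder layer i with the output of encoder layer i, so lower decoder layers attend to lower-level (more local) source representations and the top decoder layer still sees the global view; the model sizes and class names are hypothetical and this is not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class CrossViewTransformer(nn.Module):
    """Minimal sketch of layer-wise cross-view decoding.

    Assumption (not stated in the abstract): GCA is approximated by
    letting decoder layer i cross-attend to the output of encoder
    layer i, instead of every decoder layer attending only to the
    last encoder layer.
    """

    def __init__(self, d_model=512, nhead=8, num_layers=6, dim_ff=2048):
        super().__init__()
        self.encoder_layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead, dim_ff, batch_first=True)
             for _ in range(num_layers)]
        )
        self.decoder_layers = nn.ModuleList(
            [nn.TransformerDecoderLayer(d_model, nhead, dim_ff, batch_first=True)
             for _ in range(num_layers)]
        )

    def forward(self, src, tgt):
        # Encode, keeping the representation produced by every layer
        # (the different "views" of the source sequence).
        views = []
        x = src
        for enc in self.encoder_layers:
            x = enc(x)
            views.append(x)

        # Decode: layer i uses the i-th encoder view as its memory,
        # rather than only the final encoder output.
        y = tgt
        for dec, memory in zip(self.decoder_layers, views):
            y = dec(y, memory)
        return y


if __name__ == "__main__":
    model = CrossViewTransformer()
    src = torch.randn(2, 10, 512)   # (batch, src_len, d_model)
    tgt = torch.randn(2, 7, 512)    # (batch, tgt_len, d_model)
    out = model(src, tgt)
    print(out.shape)                # torch.Size([2, 7, 512])
```

The only structural change relative to a standard Transformer is that the encoder returns all per-layer outputs and each decoder layer receives a different one as its cross-attention memory; embeddings, masking, and the output projection are omitted for brevity.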


