Exploiting Deep Representations for Neural Machine Translation

10/24/2018
by Zi-Yi Dou, et al.

Advanced neural machine translation (NMT) models generally implement the encoder and decoder as stacks of multiple layers, which allows systems to model complex functions and capture complicated linguistic structures. However, only the top layers of the encoder and decoder are leveraged in the subsequent process, which misses the opportunity to exploit the useful information embedded in the other layers. In this work, we propose to simultaneously expose all of these signals with layer aggregation and multi-layer attention mechanisms. In addition, we introduce an auxiliary regularization term to encourage different layers to capture diverse information. Experimental results on the widely used WMT14 English-German and WMT17 Chinese-English translation data demonstrate the effectiveness and universality of the proposed approach.
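
To make the idea of layer aggregation and a diversity term more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's formulation: the abstract does not specify the aggregation function or the regularizer, so this sketch assumes a softmax-weighted sum of per-layer vectors and uses the mean cosine similarity between adjacent layers as a stand-in penalty. All function names (`aggregate_layers`, `diversity_penalty`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(layer_outputs, layer_scores):
    """Fuse per-layer vectors into one vector.

    Assumption: a weighted sum whose weights are softmax(layer_scores).
    In a trained model the scores would be learned parameters.
    """
    weights = softmax(layer_scores)
    dim = len(layer_outputs[0])
    return [
        sum(w * layer[i] for w, layer in zip(weights, layer_outputs))
        for i in range(dim)
    ]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diversity_penalty(layer_outputs):
    """Mean cosine similarity between adjacent layers.

    Assumption: adding this value to the training loss would push
    neighbouring layers toward less similar representations.
    """
    sims = [cosine(a, b) for a, b in zip(layer_outputs, layer_outputs[1:])]
    return sum(sims) / len(sims)


# Toy example: three "layers", each producing a 2-dimensional vector.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = aggregate_layers(layers, [0.0, 0.0, 0.0])  # equal scores -> plain average
penalty = diversity_penalty(layers)
```

With equal scores the fused vector is simply the average of the layer outputs; unequal scores would let the model favour some layers over others instead of relying on the top layer alone.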

research
06/04/2019

Exploiting Sentential Context for Neural Machine Translation

In this work, we present novel approaches to exploit sentential context ...
research
07/29/2022

GTrans: Grouping and Fusing Transformer Layers for Neural Machine Translation

Transformer structure, stacked by a sequence of encoder and decoder netw...
research
11/03/2020

Layer-Wise Multi-View Learning for Neural Machine Translation

Traditional neural machine translation is limited to the topmost encoder...
research
07/19/2021

Residual Tree Aggregation of Layers for Neural Machine Translation

Although attention-based Neural Machine Translation has achieved remarka...
research
02/16/2020

Multi-layer Representation Fusion for Neural Machine Translation

Neural machine translation systems require a number of stacked layers fo...
research
08/27/2019

Multi-Layer Softmaxing during Training Neural Machine Translation for Flexible Decoding with Fewer Layers

This paper proposes a novel procedure for training an encoder-decoder ba...
research
02/15/2019

Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement

With the promising progress of deep neural networks, layer aggregation h...
