Learning Light-Weight Translation Models from Deep Transformer

12/27/2020
by Bei Li, et al.

Recently, deep models have shown tremendous improvements in neural machine translation (NMT). However, systems of this kind are computationally expensive and memory intensive. In this paper, we take a natural step towards learning strong but light-weight NMT systems. We propose a novel group-permutation based knowledge distillation approach to compressing the deep Transformer model into a shallow model. Experimental results on several benchmarks validate the effectiveness of our method: the compressed model is 8× shallower than the deep model, with almost no loss in BLEU. To further enhance the teacher model, we present a Skipping Sub-Layer method that randomly omits sub-layers to introduce perturbation into training, which achieves a BLEU score of 30.63 on English-German newstest2014. The code is publicly available at https://github.com/libeineu/GPKD.
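The abstract does not spell out the distillation objective, but a common starting point for this kind of teacher-student compression is word-level knowledge distillation, in which the shallow student is trained to match the deep teacher's output distribution alongside the gold translations. Below is a minimal PyTorch sketch of such a loss; the group-permutation component of GPKD itself is not reproduced here, and the interpolation weight `alpha` and temperature `T` are assumed hyper-parameters, not values from the paper.

```python
# Minimal word-level knowledge distillation sketch (PyTorch).
# Illustrates the generic KD objective for training a shallow student
# from a frozen deep teacher; GPKD's group-permutation step is omitted.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, gold, alpha=0.5, T=1.0):
    """Interpolate cross-entropy on gold tokens with KL to the teacher."""
    ce = F.cross_entropy(student_logits, gold)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # standard temperature rescaling of the gradient magnitude
    return alpha * ce + (1.0 - alpha) * kl

# Example: a batch of 8 target tokens over a 100-word toy vocabulary.
s = torch.randn(8, 100)             # student logits
t = torch.randn(8, 100)             # teacher logits (teacher kept frozen)
y = torch.randint(0, 100, (8,))     # gold token ids
loss = kd_loss(s, t.detach(), y)
```

In practice the teacher is kept frozen (hence `detach()`), and the student sees a mix of the distillation signal and the usual translation objective.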
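The Skipping Sub-Layer method is likewise described only at a high level. One plausible reading, in the spirit of stochastic depth, is sketched below: during training, an entire residual sub-layer (self-attention or feed-forward) is dropped with some probability so that only the identity path remains. `d_model` and `p_skip` are hypothetical names, and the paper's actual skipping schedule may differ.

```python
# Sketch of randomly skipping a residual sub-layer during training.
import torch
import torch.nn as nn

class SkippableSubLayer(nn.Module):
    """Pre-norm residual sub-layer that is randomly omitted in training."""

    def __init__(self, sublayer: nn.Module, d_model: int, p_skip: float = 0.2):
        super().__init__()
        self.sublayer = sublayer          # e.g. self-attention or FFN block
        self.norm = nn.LayerNorm(d_model)
        self.p_skip = p_skip              # assumed hyper-parameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # With probability p_skip, drop the sub-layer entirely at training
        # time, leaving only the identity (residual) path; at inference
        # the full layer always runs.
        if self.training and torch.rand(()).item() < self.p_skip:
            return x
        return x + self.sublayer(self.norm(x))
```

Perturbing the teacher this way forces it not to rely on any single sub-layer, which is consistent with the abstract's claim that it strengthens the teacher before distillation.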

