The NiuTrans System for WNGT 2020 Efficiency Task

09/16/2021
by   Chi Hu, et al.
0

This paper describes the submissions of the NiuTrans Team to the WNGT 2020 Efficiency Shared Task. We focus on the efficient implementation of deep Transformer models <cit.> using NiuTensor (https://github.com/NiuTrans/NiuTensor), a flexible toolkit for NLP tasks. We explored the combination of deep encoder and shallow decoder in Transformer models via model compression and knowledge distillation. The neural machine translation decoding also benefits from FP16 inference, attention caching, dynamic batching, and batch pruning. Our systems achieve promising results in both translation quality and efficiency, e.g., our fastest system can translate more than 40,000 tokens per second with an RTX 2080 Ti while maintaining 42.9 BLEU on newstest2018. The code, models, and docker images are available at NiuTrans.NMT (https://github.com/NiuTrans/NiuTrans.NMT).

READ FULL TEXT

page 3

page 5

research
09/16/2021

The NiuTrans System for the WMT21 Efficiency Task

This paper describes the NiuTrans system for the WMT21 translation effic...
research
09/09/2021

Bag of Tricks for Optimizing Transformer Efficiency

Improving Transformer efficiency has become increasingly attractive rece...
research
12/27/2020

Learning Light-Weight Translation Models from Deep Transformer

Recently, deep models have shown tremendous improvements in neural machi...
research
06/07/2023

MobileNMT: Enabling Translation in 15MB and 30ms

Deploying NMT models on mobile devices is essential for privacy, low lat...
research
06/09/2021

Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding

In this paper, we propose Shallow Aggressive Decoding (SAD) to improve t...
research
05/30/2018

Marian: Cost-effective High-Quality Neural Machine Translation in C++

This paper describes the submissions of the "Marian" team to the WNMT 20...
research
08/28/2023

Attention Visualizer Package: Revealing Word Importance for Deeper Insight into Encoder-Only Transformer Models

This report introduces the Attention Visualizer package, which is crafte...

Please sign up or login with your details

Forgot password? Click here to reset