
The NiuTrans System for the WMT21 Efficiency Task

by Chenglong Wang, et al.
Northeastern University

This paper describes the NiuTrans system for the WMT21 translation efficiency task. Following last year's work, we explore various techniques to improve efficiency while maintaining translation quality. We investigate combinations of lightweight Transformer architectures and knowledge distillation strategies. We also improve translation efficiency with graph optimization, low-precision computation, dynamic batching, and parallel pre/post-processing. Our system translates 247,000 words per second on an NVIDIA A100, 3× faster than last year's system. It is the fastest system with the lowest memory consumption on the GPU-throughput track. The code, model, and pipeline will be available at NiuTrans.NMT.
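The abstract names dynamic batching without describing it. As a rough illustration only, one common form of the idea sorts sentences by length and fills each batch up to a padded-token budget, so that short and long sentences are not padded to the same width. The sketch below is not the NiuTrans implementation; the function name `dynamic_batches` and the parameter `max_tokens` are hypothetical and chosen for this example.

```python
from typing import List


def dynamic_batches(sentences: List[List[str]], max_tokens: int) -> List[List[int]]:
    """Group sentence indices into batches whose padded size stays within max_tokens.

    Hypothetical helper for illustration; not taken from NiuTrans.NMT.
    """
    # Sort indices by sentence length so each batch holds similar lengths.
    order = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))

    batches: List[List[int]] = []
    current: List[int] = []
    for idx in order:
        length = len(sentences[idx])
        # Lengths are ascending, so the sentence being added is the longest
        # in the batch: padded size = (batch size after adding) * length.
        if current and (len(current) + 1) * length > max_tokens:
            batches.append(current)
            current = []
        current.append(idx)

    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Toy input: five tokenized sentences of lengths 5, 1, 2, 5, 1.
    toy = [["w"] * n for n in (5, 1, 2, 5, 1)]
    for batch in dynamic_batches(toy, max_tokens=6):
        print(batch)
```

Because the batches hold original indices, translations can be written back in input order after decoding. A sentence longer than the budget still gets a batch of its own in this sketch.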


The NiuTrans System for WNGT 2020 Efficiency Task

This paper describes the submissions of the NiuTrans Team to the WNGT 20...

The RoyalFlush System for the WMT 2022 Efficiency Task

This paper describes the submission of the RoyalFlush neural machine tra...

Learning Light-Weight Translation Models from Deep Transformer

Recently, deep models have shown tremendous improvements in neural machi...

Bag of Tricks for Optimizing Transformer Efficiency

Improving Transformer efficiency has become increasingly attractive rece...

JIT-Masker: Efficient Online Distillation for Background Matting

We design a real-time portrait matting pipeline for everyday use, partic...

MobileNMT: Enabling Translation in 15MB and 30ms

Deploying NMT models on mobile devices is essential for privacy, low lat...

BLCU-ICALL at SemEval-2022 Task 1: Cross-Attention Multitasking Framework for Definition Modeling

This paper describes the BLCU-ICALL system used in the SemEval-2022 Task...