
Marian: Cost-effective High-Quality Neural Machine Translation in C++

05/30/2018
by Marcin Junczys-Dowmunt, et al.

This paper describes the submissions of the "Marian" team to the WNMT 2018 shared task. We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning, and other methods to optimize the Transformer model on GPU and CPU. By further integrating these methods with averaging attention networks, a recently introduced faster Transformer variant, we create a number of high-quality, high-performance models on GPU and CPU, dominating the Pareto frontier for this shared task.
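
As a concrete illustration of why averaging attention networks speed up decoding, below is a minimal C++ sketch of the cumulative-average recurrence at the core of an AAN decoder layer. Decoder self-attention must attend over all previous positions, costing O(t) work at step t, whereas the averaged context g_t = ((t - 1) * g_{t-1} + y_t) / t can be updated in constant time per step. The class and names here are illustrative assumptions for exposition, not Marian's actual implementation or API.

#include <cstddef>
#include <vector>

// Minimal sketch of the cumulative-average state used by an averaging
// attention network (AAN) decoder layer. Unlike decoder self-attention,
// which attends over all previous positions (O(t) work per step), the
// AAN context can be updated in O(1) per step via a running average:
//   g_t = ((t - 1) * g_{t-1} + y_t) / t
// Class and method names are hypothetical, not Marian's API.
class AverageAttentionState {
public:
  explicit AverageAttentionState(std::size_t dim)
      : sum_(dim, 0.0f), step_(0) {}

  // Fold the embedding y_t of the newly decoded token into the running
  // average and return the updated context vector g_t.
  std::vector<float> step(const std::vector<float>& y) {
    ++step_;
    std::vector<float> avg(sum_.size());
    for (std::size_t i = 0; i < sum_.size(); ++i) {
      sum_[i] += y[i];           // running sum over positions 1..t
      avg[i] = sum_[i] / step_;  // g_t = sum / t
    }
    return avg;
  }

private:
  std::vector<float> sum_;  // elementwise sum of token embeddings so far
  std::size_t step_;        // number of tokens folded in (t)
};

In the full AAN layer, the returned context additionally passes through a feed-forward block and a gate that mixes it with the current token embedding; those parts are omitted here since the constant-time state update is what matters for inference speed.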

