Hierarchical Transformer for Multilingual Machine Translation

03/05/2021
by Albina Khusainova, et al.

The choice of parameter sharing strategy in multilingual machine translation models determines how optimally the parameter space is used and hence directly influences ultimate translation quality. Inspired by linguistic trees that show the degree of relatedness between different languages, a new general approach to parameter sharing in multilingual machine translation was recently suggested. The main idea is to use these expert language hierarchies as a basis for the multilingual architecture: the closer two languages are, the more parameters they share. In this work, we test this idea using the Transformer architecture and show that, despite the success reported in previous work, there are problems inherent to training such hierarchical models. We demonstrate that with a carefully chosen training strategy the hierarchical architecture can outperform both bilingual models and multilingual models with full parameter sharing.
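The sharing scheme described in the abstract can be sketched as follows: languages sit at the leaves of an expert linguistic tree, each tree node owns one block of Transformer layers, and a language's effective encoder is the chain of blocks on the path from the root to its leaf. Closely related languages share deep ancestors and therefore many parameters. This is a minimal illustrative sketch, not the paper's implementation; the tree, node names, and `block` placeholder are assumptions.

```python
class Node:
    """One node of the language hierarchy; owns a block of Transformer layers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # Placeholder for a stack of Transformer layers owned by this node.
        self.block = f"layers[{name}]"

    def path_blocks(self):
        # Blocks from the root down to this node: the effective
        # per-language encoder assembled from shared components.
        blocks = []
        node = self
        while node is not None:
            blocks.append(node.block)
            node = node.parent
        return list(reversed(blocks))


# A toy tree (hypothetical): two Slavic languages share an intermediate
# node, while Turkish branches directly off the root.
root = Node("root")
slavic = Node("slavic", parent=root)
ru = Node("ru", parent=slavic)
uk = Node("uk", parent=slavic)
tr = Node("tr", parent=root)

print(ru.path_blocks())  # ['layers[root]', 'layers[slavic]', 'layers[ru]']

# Russian and Ukrainian share the root and Slavic blocks;
# Russian and Turkish share only the root block.
shared_ru_uk = set(ru.path_blocks()) & set(uk.path_blocks())
shared_ru_tr = set(ru.path_blocks()) & set(tr.path_blocks())
```

In a real model each `block` would be a stack of Transformer layers, and the degree of sharing between two languages is exactly the number of tree ancestors they have in common.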


Related research

09/01/2018
Parameter Sharing Methods for Multilingual Self-Attentional Translation Models
In multilingual neural machine translation, it has been shown that shari...

05/12/2020
A Framework for Hierarchical Multilingual Machine Translation
Multilingual machine translation has recently been in vogue given its po...

10/15/2021
Breaking Down Multilingual Machine Translation
While multilingual training is now an essential ingredient in machine tr...

12/31/2020
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Multilingual machine translation enables a single model to translate bet...

12/27/2021
Parameter Differentiation based Multilingual Neural Machine Translation
Multilingual neural machine translation (MNMT) aims to translate multipl...

01/01/2021
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
The advent of the Transformer can arguably be described as a driving for...

05/22/2022
Multilingual Machine Translation with Hyper-Adapters
Multilingual machine translation suffers from negative interference acro...
