
Hierarchical Transformer for Multilingual Machine Translation

by Albina Khusainova, et al.

The choice of parameter sharing strategy in multilingual machine translation models determines how efficiently the parameter space is used and hence directly influences the final translation quality. Inspired by linguistic trees that show the degree of relatedness between languages, a new general approach to parameter sharing in multilingual machine translation was recently proposed. The main idea is to use these expert language hierarchies as the basis for a multilingual architecture: the closer two languages are, the more parameters they share. In this work, we test this idea using the Transformer architecture and show that, despite the success reported in previous work, there are problems inherent to training such hierarchical models. We demonstrate that, with a carefully chosen training strategy, the hierarchical architecture can outperform bilingual models and multilingual models with full parameter sharing.
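The sharing rule in the abstract (closer languages share more parameters) can be illustrated with a small sketch. It is not the authors' implementation: the tree below, the node names, and the helpers `path_to_root` and `shared_modules` are hypothetical, and each tree node merely stands in for one block of model parameters. A language uses every block on its path from the root, so two languages share exactly the blocks on their common path prefix.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy (child -> parent), chosen only for illustration.
# Each node stands in for one block of parameters (for example, some layers).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "turkic": "root",
    "slavic": "root",
    "tatar": "turkic",
    "kazakh": "turkic",
    "russian": "slavic",
}


def path_to_root(node: str) -> List[str]:
    """Return the nodes from the root down to `node`, inclusive."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path[::-1]


def shared_modules(lang_a: str, lang_b: str) -> List[str]:
    """Blocks used by both languages: the common prefix of their root paths."""
    shared = []
    for a, b in zip(path_to_root(lang_a), path_to_root(lang_b)):
        if a != b:
            break
        shared.append(a)
    return shared


if __name__ == "__main__":
    print(shared_modules("tatar", "kazakh"))   # closely related pair
    print(shared_modules("tatar", "russian"))  # distantly related pair
```

In this toy tree, the closely related pair shares both the root block and the family-level block, while the distant pair shares only the root block, which is the monotonic relationship the abstract describes.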




Parameter Sharing Methods for Multilingual Self-Attentional Translation Models

In multilingual neural machine translation, it has been shown that shari...

A Framework for Hierarchical Multilingual Machine Translation

Multilingual machine translation has recently been in vogue given its po...

XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders

Multilingual machine translation enables a single model to translate bet...

Multilingual Neural Machine Translation with Task-Specific Attention

Multilingual machine translation addresses the task of translating betwe...

Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers

The advent of the Transformer can arguably be described as a driving for...

Multilingual Machine Translation with Hyper-Adapters

Multilingual machine translation suffers from negative interference acro...

Parameter Differentiation based Multilingual Neural Machine Translation

Multilingual neural machine translation (MNMT) aims to translate multipl...