Sparsely Factored Neural Machine Translation

by Noe Casas, et al.

The standard approach to incorporating linguistic information into neural machine translation systems consists in maintaining a separate vocabulary for each annotated feature (e.g. POS tags, dependency relation labels), embedding the features, and then aggregating each feature embedding with the embedding of every subword of the word it belongs to. This approach, however, cannot easily accommodate annotation schemes that are not dense, i.e. that do not provide a label for every word. We propose a method suited to such sparse annotations, showing large improvements on out-of-domain data and comparable quality on in-domain data. Experiments are performed on morphologically rich languages such as Basque and German in low-resource scenarios.
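The factor-aggregation scheme described above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the vocabularies, dimensions, and the sum aggregation are illustrative assumptions, and a dedicated `<none>` entry stands in for words whose annotation is missing, which is the sparse case the paper targets.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # illustrative embedding dimension

# Hypothetical vocabularies: one for subwords, one per annotated feature.
subword_vocab = {"you@@": 0, "r": 1, "house": 2}
pos_vocab = {"<none>": 0, "PRON": 1, "NOUN": 2}

subword_emb = rng.normal(size=(len(subword_vocab), DIM))
pos_emb = rng.normal(size=(len(pos_vocab), DIM))

def embed(subword, pos_tag=None):
    """Aggregate (here: sum) a subword embedding with the embedding of its
    word-level factor; a missing annotation falls back to the <none> entry,
    so sparsely annotated words are handled uniformly."""
    v = subword_emb[subword_vocab[subword]]
    v = v + pos_emb[pos_vocab[pos_tag if pos_tag is not None else "<none>"]]
    return v

# "your" is split into two subwords that both inherit its POS tag, while
# "house" is left unannotated -- the sparse-annotation case.
sentence = [("you@@", "PRON"), ("r", "PRON"), ("house", None)]
vectors = np.stack([embed(s, t) for s, t in sentence])
print(vectors.shape)  # (3, 8)
```

In a real model the aggregated vectors would feed the encoder of the translation network; concatenation followed by a projection is a common alternative to summation.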



