Optimizing Segmentation Granularity for Neural Machine Translation

10/19/2018
by Elizabeth Salesky, et al.

In neural machine translation (NMT), it has become standard to translate using subword units, which allow for an open vocabulary and improve accuracy on infrequent words. Byte-pair encoding (BPE) and its variants are the predominant approach to generating these subwords, as they are unsupervised, resource-free, and empirically effective. However, the granularity of these subword units is a hyperparameter to be tuned for each language and task, using methods such as grid search. Tuning may be done inexhaustively or skipped entirely due to resource constraints, leading to sub-optimal performance. In this paper, we propose a method to automatically tune this parameter using only one training pass. We incrementally introduce new vocabulary online based on the held-out validation loss, beginning with smaller, general subwords and adding larger, more specific units over the course of training. Our method matches the results found with grid search, optimizing segmentation granularity without any additional training time. We also show benefits in training efficiency and performance improvements for rare words due to the way embeddings for larger units are incrementally constructed by combining those from smaller units.
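The idea of constructing an embedding for a newly introduced, larger subword unit from the embeddings of its constituent smaller units can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `grow_vocab` and the use of averaging as the combination function are assumptions made for the example, since the abstract does not specify the exact combination.

```python
import numpy as np

def grow_vocab(embeddings, merges):
    """Add embeddings for newly introduced (larger) subword units.

    embeddings: dict mapping subword string -> embedding vector
    merges: list of (left, right) pairs, each defining a new unit
            whose surface form is left + right

    Each new unit's embedding is initialized by combining those of its
    parts (averaging here, as an illustrative choice), so larger units
    start near the representation their pieces already learned.
    """
    for left, right in merges:
        new_unit = left + right
        if new_unit not in embeddings:
            embeddings[new_unit] = (embeddings[left] + embeddings[right]) / 2.0
    return embeddings

# Toy example: three small units, one merge introduced mid-training.
rng = np.random.default_rng(0)
emb = {u: rng.standard_normal(4) for u in ("th", "e", "r")}
emb = grow_vocab(emb, [("th", "e")])
```

After the call, `emb["the"]` exists and lies between its constituents' vectors, so the model can immediately use the larger unit without learning its embedding from scratch.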


research
08/31/2015

Neural Machine Translation of Rare Words with Subword Units

Neural machine translation (NMT) models typically operate with a fixed v...
research
07/25/2018

Finding Better Subword Segmentation for Neural Machine Translation

For different language pairs, word-level neural machine translation (NMT...
research
09/30/2019

Regressing Word and Sentence Embeddings for Regularization of Neural Machine Translation

In recent years, neural machine translation (NMT) has become the dominan...
research
04/29/2018

Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates

Subword units are an effective way to alleviate the open vocabulary prob...
research
06/13/2016

Zero-Resource Translation with Multi-Lingual Neural Machine Translation

In this paper, we propose a novel finetuning algorithm for the recently ...
research
09/14/2019

A Universal Parent Model for Low-Resource Neural Machine Translation Transfer

Transfer learning from a high-resource language pair `parent' has been p...
research
05/24/2019

A Call for Prudent Choice of Subword Merge Operations

Most neural machine translation systems are built upon subword units ext...
