Active Learning for Neural Machine Translation

12/30/2022
by   Neeraj Vashistha, et al.
0

The machine translation mechanism translates texts automatically between different natural languages, and Neural Machine Translation (NMT) has gained attention for its rational context analysis and fluent translation accuracy. However, processing low-resource languages that lack relevant training attributes like supervised data is a current challenge for Natural Language Processing (NLP). We incorporated a technique known Active Learning with the NMT toolkit Joey NMT to reach sufficient accuracy and robust predictions of low-resource language translation. With active learning, a semi-supervised machine learning strategy, the training algorithm determines which unlabeled data would be the most beneficial for obtaining labels using selected query techniques. We implemented two model-driven acquisition functions for selecting the samples to be validated. This work uses transformer-based NMT systems; baseline model (BM), fully trained model (FTM) , active learning least confidence based model (ALLCM), and active learning margin sampling based model (ALMSM) when translating English to Hindi. The Bilingual Evaluation Understudy (BLEU) metric has been used to evaluate system results. The BLEU scores of BM, FTM, ALLCM and ALMSM systems are 16.26, 22.56 , 24.54, and 24.20, respectively. The findings in this paper demonstrate that active learning techniques helps the model to converge early and improve the overall quality of the translation system.

READ FULL TEXT

page 1

page 9

research
10/27/2022

COMET-QE and Active Learning for Low-Resource Machine Translation

Active learning aims to deliver maximum benefit when resources are scarc...
research
07/30/2018

Active Learning for Interactive Neural Machine Translation of Data Streams

We study the application of active learning techniques to the translatio...
research
10/26/2022

Eeny, meeny, miny, moe. How to choose data for morphological inflection

Data scarcity is a widespread problem in numerous natural language proce...
research
03/09/2022

Onception: Active Learning with Expert Advice for Real World Machine Translation

Active learning can play an important role in low-resource settings (i.e...
research
10/21/2014

Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

We explore how to improve machine translation systems by adding more tra...
research
10/02/2018

Semi-supervised and Active-learning Scenarios: Efficient Acoustic Model Refinement for a Low Resource Indian Language

We address the problem of efficient acoustic-model refinement (continuou...
research
06/02/2022

BayesFormer: Transformer with Uncertainty Estimation

Transformer has become ubiquitous due to its dominant performance in var...

Please sign up or login with your details

Forgot password? Click here to reset