Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences

07/06/2020
by Xiangyu Duan, et al.

In this paper, we propose a new machine translation (MT) task that uses no parallel sentences but can consult a ground-truth bilingual dictionary. Motivated by a monolingual speaker's ability to learn to translate by looking words up in a bilingual dictionary, we propose this task to see how much an MT system can achieve using only the bilingual dictionary and large-scale monolingual corpora, while remaining independent of parallel sentences. We propose anchored training (AT) to tackle the task. AT uses the bilingual dictionary to establish anchoring points that close the gap between the source language and the target language. Experiments on various language pairs show that our approaches are significantly better than various baselines, including dictionary-based word-by-word translation, dictionary-supervised cross-lingual word embedding transformation, and unsupervised MT. On distant language pairs, where unsupervised MT struggles to perform well, AT performs remarkably better, achieving performance comparable to supervised SMT trained on more than 4M parallel sentences.
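To make the simplest baseline named above concrete, the following is a minimal sketch of dictionary-based word-by-word translation. It is an illustration under stated assumptions, not the authors' implementation: the function name `word_by_word_translate`, the whitespace tokenization, the lowercasing, and the toy English-French entries are all hypothetical, and each source word is assumed to have a single dictionary translation.

```python
def word_by_word_translate(sentence, dictionary):
    """Replace each source token with its dictionary entry.

    Tokens missing from the dictionary are copied through unchanged.
    Whitespace tokenization and lowercasing are simplifying assumptions.
    """
    tokens = sentence.lower().split()
    return " ".join(dictionary.get(token, token) for token in tokens)


# Hypothetical toy dictionary; a real bilingual dictionary would be far
# larger and may list several translations per word.
toy_dictionary = {"the": "le", "cat": "chat", "sleeps": "dort"}

translation = word_by_word_translate("The cat sleeps quietly", toy_dictionary)
print(translation)
```

A lookup of this kind ignores word order, morphology, and context, which is one way to see why the abstract treats it as a baseline rather than as a solution to the task.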


