"Wikily" Neural Machine Translation Tailored to Cross-Lingual Tasks

04/16/2021
by   Mohammad Sadegh Rasooli, et al.

We present a simple but effective approach for leveraging Wikipedia for neural machine translation, as well as for the cross-lingual tasks of image captioning and dependency parsing, without using any direct supervision from external parallel data or supervised models in the target language. We show that the first sentences and titles of linked Wikipedia pages, as well as cross-lingual image captions, are strong signals for seed parallel data, from which we extract bilingual dictionaries and cross-lingual word embeddings for mining parallel text from Wikipedia. Our final model achieves high BLEU scores that are close to, and sometimes higher than, those of strong supervised baselines in low-resource languages; for example, in English-to-Kazakh the supervised baseline reaches a BLEU of 4.0 versus 12.1 for our model. Moreover, we tailor our wikily translation models to unsupervised image captioning and cross-lingual dependency parser transfer. In image captioning, we train a multi-tasking machine translation and image captioning pipeline for Arabic and English, in which the Arabic training data is a wikily translation of the English captioning data. Our captioning results in Arabic are slightly better than those of the corresponding supervised model. In dependency parsing, we translate a large amount of monolingual text and use it as artificial training data in an annotation projection framework. We show that our model outperforms recent work on cross-lingual transfer of dependency parsers.
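As a rough illustration of the seed-data idea in the abstract, the sketch below builds a toy bilingual dictionary from a few aligned title pairs and then uses it to pick likely translation pairs from two unaligned sentence lists. It is not the authors' implementation: the Dice co-occurrence score, the lexical-overlap score, the function names (`build_dictionary`, `overlap_score`, `mine_pairs`), the threshold, and the English-Indonesian toy data are all assumptions made for this example, and the paper's use of cross-lingual word embeddings is not modeled here.

```python
from collections import Counter, defaultdict


def build_dictionary(seed_pairs):
    """Map each source word to the target word with the highest Dice score.

    Assumption: Dice co-occurrence over seed title pairs stands in for a
    real dictionary-extraction step; it is only a toy substitute.
    """
    src_count, tgt_count = Counter(), Counter()
    cooc = defaultdict(Counter)
    for src, tgt in seed_pairs:
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        for s in src_words:
            for t in tgt_words:
                cooc[s][t] += 1

    dictionary = {}
    for s, targets in cooc.items():
        # Dice(s, t) = 2 * c(s, t) / (c(s) + c(t))
        dictionary[s] = max(
            targets,
            key=lambda t: 2 * targets[t] / (src_count[s] + tgt_count[t]),
        )
    return dictionary


def overlap_score(src, tgt, dictionary):
    """Fraction of source words whose dictionary entry occurs in the target."""
    src_words = src.lower().split()
    tgt_words = set(tgt.lower().split())
    if not src_words:
        return 0.0
    hits = sum(1 for w in src_words if dictionary.get(w) in tgt_words)
    return hits / len(src_words)


def mine_pairs(src_sents, tgt_sents, dictionary, threshold=0.5):
    """For each source sentence, keep its best-scoring target if good enough."""
    mined = []
    for src in src_sents:
        best = max(tgt_sents, key=lambda tgt: overlap_score(src, tgt, dictionary))
        score = overlap_score(src, best, dictionary)
        if score >= threshold:
            mined.append((src, best, score))
    return mined


# Toy seed data standing in for titles of linked Wikipedia pages (hypothetical).
seed_pairs = [
    ("white house", "rumah putih"),
    ("white wine", "anggur putih"),
    ("big house", "rumah besar"),
]
dictionary = build_dictionary(seed_pairs)

# Unaligned candidate sentences standing in for text from linked pages.
src_sents = ["big white house", "white wine"]
tgt_sents = ["anggur putih", "rumah putih besar"]
mined = mine_pairs(src_sents, tgt_sents, dictionary)
```

In this toy setup the mined pairs could be added back to the seed set and the dictionary rebuilt, which is one simple way to picture how a small seed might bootstrap a larger mined corpus.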

08/15/2017

Fluency-Guided Cross-Lingual Image Captioning

Image captioning has so far been explored mostly in English, as most ava...
08/15/2019

Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards

Generating image descriptions in different languages is essential to sat...
01/06/2017

Cross-Lingual Dependency Parsing with Late Decoding for Truly Low-Resource Languages

In cross-lingual dependency annotation projection, information is often ...
10/03/2020

Unsupervised Cross-lingual Image Captioning

Most recent image captioning works are conducted in English as the major...
07/14/2017

CUNI System for the WMT17 Multimodal Translation Task

In this paper, we describe our submissions to the WMT17 Multimodal Trans...
07/29/2021

The Cross-Lingual Arabic Information REtrieval (CLAIRE) System

Despite advances in neural machine translation, cross-lingual retrieval ...
07/24/2018

Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!

Argumentation mining (AM) requires the identification of complex discour...

Code Repositories

ImageTranslate

"Wikily" Neural Machine Translation Tailored to Cross-Lingual Tasks.

