Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation

12/20/2022
by Matthieu Futeral, et al.

One of the major challenges of machine translation (MT) is ambiguity, which can in some cases be resolved by accompanying context such as an image. However, recent work in multimodal MT (MMT) has shown that obtaining improvements from images is challenging, limited not only by the difficulty of building effective cross-modal representations but also by the lack of specific evaluation and training data. We present a new MMT approach based on a strong text-only MT model, which uses neural adapters and a novel guided self-attention mechanism and is jointly trained on both visual masking and MMT. We also release CoMMuTE, a Contrastive Multilingual Multimodal Translation Evaluation dataset composed of ambiguous sentences and their possible translations, together with a disambiguating image for each translation. Our approach obtains results competitive with strong text-only models on standard English-to-French benchmarks and outperforms these baselines and state-of-the-art MMT systems by a large margin on our contrastive test set.
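A contrastive test set of this kind can be scored as a ranking task: for each ambiguous source sentence and image, the system should prefer the translation that matches the image over the one that does not. The sketch below is a minimal illustration of that idea, not the authors' evaluation code. It assumes the model exposes a score where lower is better (for example a perplexity), and every name in it (`ContrastiveExample`, `contrastive_accuracy`, the two toy scorers, the image file names) is hypothetical. The example sentences are invented for illustration and are not taken from CoMMuTE.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical container for one contrastive item: an ambiguous source
# sentence, the image that disambiguates it, and two candidate translations.
@dataclass(frozen=True)
class ContrastiveExample:
    source: str
    image: str       # image identifier (a file name here, for illustration)
    correct: str     # translation consistent with the image
    incorrect: str   # translation of the other sense

# Assumed scorer signature: (source, image, translation) -> score,
# where LOWER means the model prefers that translation (e.g. perplexity).
Scorer = Callable[[str, str, str], float]


def contrastive_accuracy(examples: Iterable[ContrastiveExample], score: Scorer) -> float:
    """Fraction of examples where the correct translation gets the lower score."""
    items = list(examples)
    if not items:
        raise ValueError("need at least one example")
    hits = sum(
        1
        for ex in items
        if score(ex.source, ex.image, ex.correct) < score(ex.source, ex.image, ex.incorrect)
    )
    return hits / len(items)


# Invented examples: the same English sentence paired with two images,
# each image selecting a different French translation of "bat".
EXAMPLES = [
    ContrastiveExample(
        source="She picked up the bat.",
        image="baseball_game.jpg",
        correct="Elle a ramassé la batte.",
        incorrect="Elle a ramassé la chauve-souris.",
    ),
    ContrastiveExample(
        source="She picked up the bat.",
        image="cave.jpg",
        correct="Elle a ramassé la chauve-souris.",
        incorrect="Elle a ramassé la batte.",
    ),
]

# Toy stand-in for an image-aware model: it "reads" the image through a lookup.
IMAGE_HINT = {"baseball_game.jpg": "batte", "cave.jpg": "chauve-souris"}


def toy_image_aware_score(source: str, image: str, translation: str) -> float:
    return 1.0 if IMAGE_HINT[image] in translation else 2.0


# Toy stand-in for a text-only model: the score ignores the image entirely.
def toy_text_only_score(source: str, image: str, translation: str) -> float:
    return float(len(translation))


if __name__ == "__main__":
    print(contrastive_accuracy(EXAMPLES, toy_image_aware_score))
    print(contrastive_accuracy(EXAMPLES, toy_text_only_score))
```

Because each source sentence appears once per sense, a scorer that ignores the image always prefers the same translation and lands at 50% accuracy on such pairs, which is what makes this kind of test set a direct probe of whether the image is actually used.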


