Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation

12/20/2022
by Yaoming Zhu, et al.

Multimodal machine translation (MMT) aims to improve translation quality by incorporating information from other modalities, such as vision. Previous MMT systems mainly focus on better access to and use of visual information, and tend to validate their methods on image-related datasets. These studies face two challenges. First, they can only utilize triple data (bilingual texts with images), which is scarce; second, current benchmarks are relatively restricted and do not correspond to realistic scenarios. This paper therefore establishes both new methods and new datasets for MMT. First, we propose a framework, 2/3-Triplet, with two new approaches that enhance MMT by utilizing large-scale non-triple data: monolingual image-text data and parallel text-only data. Second, we construct an English-Chinese e-commerce multimodal translation dataset (including training and test sets), named EMMT, whose test set is carefully selected to contain ambiguous words that would be mistranslated without the help of images. Experiments show that our method is more suitable for real-world scenarios and can significantly improve translation performance by using more non-triple data. In addition, our model rivals various SOTA models on conventional multimodal translation benchmarks.


research
07/21/2019

Hindi Visual Genome: A Dataset for Multimodal English-to-Hindi Machine Translation

Visual Genome is a dataset connecting structured image information with ...
research
12/28/2020

Towards Fully Automated Manga Translation

We tackle the problem of machine translation of manga, Japanese comics. ...
research
08/02/2022

Silo NLP's Participation at WAT2022

This paper provides the system description of "Silo NLP's" submission to...
research
08/05/2019

Predicting Actions to Help Predict Translations

We address the task of text translation on the How2 dataset using a stat...
research
06/01/2021

ViTA: Visual-Linguistic Translation by Aligning Object Tags

Multimodal Machine Translation (MMT) enriches the source text with visua...
research
03/17/2022

On Vision Features in Multimodal Machine Translation

Previous work on multimodal machine translation (MMT) has focused on the...
research
03/01/2020

Towards Automatic Face-to-Face Translation

In light of the recent breakthroughs in automatic machine translation sy...
