A Visual Attention Grounding Neural Model for Multimodal Machine Translation

08/24/2018
by Mingyang Zhou, et al.

We introduce a novel multimodal machine translation model that utilizes parallel visual and textual information. Our model jointly optimizes two objectives: learning a shared visual-language embedding and learning to translate. It does so through a visual attention grounding mechanism that links the visual semantics of the image to the corresponding textual semantics. Our approach achieves results competitive with the state of the art on the Multi30K and Ambiguous COCO datasets. We also collected a new multilingual multimodal product description dataset to simulate a real-world international online shopping scenario. On this dataset, our visual attention grounding model outperforms other methods by a large margin.
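The abstract does not spell out the formulation, so the following is only a minimal sketch of how a joint objective with visual attention grounding might look, not the authors' implementation. Everything in it is an assumption made for illustration: the dot-product attention, the cosine similarity, the hinge ranking loss, the `margin` and `alpha` parameters, and all function names.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attended_sentence_embedding(word_states, image_vec):
    # Assumption: attention scores are dot products between each word state
    # and the image vector; the sentence embedding is the weighted sum.
    scores = word_states @ image_vec           # shape (num_words,)
    weights = softmax(scores)                  # non-negative, sums to 1
    return weights @ word_states, weights      # shapes (dim,), (num_words,)

def cosine(a, b):
    # Cosine similarity with a small constant to avoid division by zero.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def grounding_loss(sent_vec, image_vec, negative_image_vec, margin=0.1):
    # Assumption: a hinge ranking loss that asks the matched image to score
    # higher than a mismatched image by at least `margin`.
    pos = cosine(sent_vec, image_vec)
    neg = cosine(sent_vec, negative_image_vec)
    return max(0.0, margin - pos + neg)

def joint_loss(translation_loss, grounding, alpha=0.5):
    # Assumption: the two objectives are combined as a weighted sum.
    return alpha * translation_loss + (1.0 - alpha) * grounding

# Toy usage: two word states in a 2-D space and an image aligned with word 0.
word_states = np.array([[1.0, 0.0], [0.0, 1.0]])
image_vec = np.array([1.0, 0.0])
sent_vec, weights = attended_sentence_embedding(word_states, image_vec)
```

In this toy setup the word whose state aligns with the image receives the larger attention weight, which is the intuition behind linking visual and textual semantics. A real system would compute the word states with a learned encoder and the translation loss with a decoder.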


