BERT Can See Out of the Box: On the Cross-modal Transferability of Text Representations

02/25/2020
by   Thomas Scialom, et al.
0

Pre-trained language models such as BERT have recently contributed to significant advances in Natural Language Processing tasks. Interestingly, while multilingual BERT models have demonstrated impressive results, recent works have shown how monolingual BERT can also be competitive in zero-shot cross-lingual settings. This suggests that the abstractions learned by these models can transfer across languages, even when trained on monolingual data. In this paper, we investigate whether such generalization potential applies to other modalities, such as vision: does BERT contain abstractions that generalize beyond text? We introduce BERT-gen, an architecture for text generation based on BERT, able to leverage on either mono- or multi- modal representations. The results reported under different configurations indicate a positive answer to our research question, and the proposed model obtains substantial improvements over the state-of-the-art on two established Visual Question Generation datasets.

READ FULL TEXT

page 3

page 8

research
10/25/2019

On the Cross-lingual Transferability of Monolingual Representations

State-of-the-art unsupervised multilingual models (e.g., multilingual BE...
research
06/04/2019

How multilingual is Multilingual BERT?

In this paper, we show that Multilingual BERT (M-BERT), released by Devl...
research
05/24/2023

Meta-Learning For Vision-and-Language Cross-lingual Transfer

Current pre-trained vison-language models (PVLMs) achieve excellent perf...
research
08/31/2021

Monolingual versus Multilingual BERTology for Vietnamese Extractive Multi-Document Summarization

Recent researches have demonstrated that BERT shows potential in a wide ...
research
10/22/2020

Towards Fully Bilingual Deep Language Modeling

Language models based on deep neural networks have facilitated great adv...
research
02/15/2022

Delving Deeper into Cross-lingual Visual Question Answering

Visual question answering (VQA) is one of the crucial vision-and-languag...
research
12/21/2021

DB-BERT: a Database Tuning Tool that "Reads the Manual"

DB-BERT is a database tuning tool that exploits information gained via n...

Please sign up or login with your details

Forgot password? Click here to reset