BERTGEN: Multi-task Generation through BERT

by Faidon Mitzalis, et al.

We present BERTGEN, a novel generative, decoder-only model that extends BERT by fusing the multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively. BERTGEN is trained auto-regressively for language generation tasks, namely image captioning, machine translation and multimodal machine translation, under a multitask setting. Through a comprehensive set of evaluations, we show that BERTGEN outperforms many strong baselines across the tasks explored. We also demonstrate BERTGEN's ability for zero-shot language generation, where it exhibits performance competitive with supervised counterparts. Finally, we conduct ablation studies which show that BERTGEN substantially benefits from multi-tasking and effectively transfers relevant inductive biases from the pretrained models.
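The abstract's idea of using a BERT-style (mask-filling) model auto-regressively can be sketched as a simple decoding loop: at each step a [MASK] slot is appended to the prefix, the model fills it in, and the completed prefix is fed back for the next step. The snippet below is a minimal, self-contained illustration of that loop with a hypothetical `step_fn` standing in for the fused VL-BERT/M-BERT model; it is not the authors' implementation.

```python
def generate(step_fn, prompt_tokens, max_len=10, eos="[EOS]"):
    """Greedy BERT-style auto-regressive decoding sketch.

    `step_fn(tokens)` is a stand-in for a pretrained mask-filling model:
    it receives the prefix plus a trailing [MASK] token and returns the
    token predicted for that slot.
    """
    tokens = list(prompt_tokens)
    for _ in range(max_len):
        next_tok = step_fn(tokens + ["[MASK]"])  # fill the trailing mask
        if next_tok == eos:                      # stop at end-of-sequence
            break
        tokens.append(next_tok)                  # grow the prefix and repeat
    return tokens


# Toy "model" for demonstration only: emits a fixed continuation.
def toy_step(tokens):
    prefix = tokens[:-1]              # drop the [MASK] slot
    continuation = ["nice", "day", "[EOS]"]
    return continuation[len(prefix) - 2]  # prompt length is 2 in this demo


print(generate(toy_step, ["hello", "world"]))
# → ['hello', 'world', 'nice', 'day']
```

In BERTGEN-like setups the same loop serves every task (captioning, translation, multimodal translation); only the conditioning prefix changes, e.g. image features plus a source sentence instead of plain text.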


M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training

This paper presents a Multitask Multilingual Multimodal Pre-trained mode...

Does Multimodality Help Human and Machine for Translation and Image Captioning?

This paper presents the systems developed by LIUM and CVC for the WMT16 ...

Unbabel's Submission to the WMT2019 APE Shared Task: BERT-based Encoder-Decoder for Automatic Post-Editing

This paper describes Unbabel's submission to the WMT2019 APE Shared Task...

BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation

The success of bidirectional encoders using masked language models, such...

Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description

We present the results from the second shared task on multimodal machine...

Multimodal Machine Translation through Visuals and Speech

Multimodal machine translation involves drawing information from more th...

Multilingual Alignment of Contextual Word Representations

We propose procedures for evaluating and strengthening contextual embedd...