Large Scale Multi-Lingual Multi-Modal Summarization Dataset

02/13/2023
by Yash Verma, et al.

Significant developments in techniques such as encoder-decoder models have enabled us to represent information comprising multiple modalities. This information can further enhance many downstream tasks in information retrieval and natural language processing; however, improving multi-modal techniques and evaluating their performance requires large-scale multi-modal data that offers sufficient diversity. Multi-lingual modeling for tasks such as multi-modal summarization, text generation, and translation leverages information derived from high-quality multi-lingual annotated data. In this work, we present M3LS, currently the largest multi-lingual multi-modal summarization dataset. It consists of over a million instances of document-image pairs, along with a professionally annotated multi-modal summary for each pair. It is derived from news articles published by the British Broadcasting Corporation (BBC) over a decade and spans 20 languages, targeting diversity across five language roots. It is also the largest summarization dataset for 13 languages and includes cross-lingual summarization data for 2 languages. We formally define the multi-lingual multi-modal summarization task using our dataset and report baseline scores from various state-of-the-art summarization techniques in a multi-lingual setting. We also compare M3LS with many similar datasets to analyze its uniqueness and difficulty.

research
12/16/2021

CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summarization for 1500+ Language Pairs

We present CrossSum, a large-scale dataset comprising 1.65 million cross...
research
04/30/2020

MLSUM: The Multilingual Summarization Corpus

We present MLSUM, the first large-scale MultiLingual SUMmarization datas...
research
06/27/2021

Multi-Modal Chorus Recognition for Improving Song Search

We discuss a novel task, Chorus Recognition, which could potentially ben...
research
06/15/2020

DynE: Dynamic Ensemble Decoding for Multi-Document Summarization

Sequence-to-sequence (s2s) models are the basis for extensive work in na...
research
12/08/2020

Cross-lingual Approach to Abstractive Summarization

Automatic text summarization extracts important information from texts a...
research
10/25/2021

"So You Think You're Funny?": Rating the Humour Quotient in Standup Comedy

Computational Humour (CH) has attracted the interest of Natural Language...
research
05/29/2017

Emergent Communication in a Multi-Modal, Multi-Step Referential Game

Inspired by previous work on emergent communication in referential games...
