DialogCC: Large-Scale Multi-Modal Dialogue Dataset

12/08/2022
by Young Jun Lee, et al.

Because image sharing is a key part of instant messaging, there has been active research on learning image-text multi-modal dialogue models. However, training a multi-modal dialogue model that generalizes well is challenging because existing multi-modal dialogue datasets are small, cover limited topics, and offer a restricted variety of images per dialogue. In this paper, we present a multi-modal dialogue dataset creation pipeline that matches large-scale images to dialogues based on CLIP similarity. Using this automatic pipeline, we propose a large-scale multi-modal dialogue dataset, DialogCC, which covers diverse real-world topics and various images per dialogue. With extensive experiments, we demonstrate that training a multi-modal dialogue model with our dataset can improve generalization performance. Additionally, existing models trained with our dataset achieve state-of-the-art performance on image and text retrieval tasks. The source code and the dataset will be released after publication.
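The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that utterance and image embeddings have already been computed by a CLIP-style encoder (here replaced by toy 3-dimensional vectors), and the function names, the `threshold` value of 0.5, and the `top_k` parameter are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def match_images_to_utterances(utterance_embs, image_embs, threshold=0.5, top_k=1):
    """For each utterance, keep up to top_k images whose similarity
    is at least `threshold`, sorted from most to least similar.

    Both `threshold` and `top_k` are illustrative assumptions, not
    values taken from the paper.
    """
    matches = []
    for u in utterance_embs:
        scored = [(i, cosine_similarity(u, img)) for i, img in enumerate(image_embs)]
        kept = [(i, s) for i, s in scored if s >= threshold]
        kept.sort(key=lambda pair: pair[1], reverse=True)
        matches.append(kept[:top_k])
    return matches


# Toy stand-ins for CLIP text and image embeddings (assumed, not real data).
utterance_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
image_embs = [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

matches = match_images_to_utterances(utterance_embs, image_embs)
for idx, found in enumerate(matches):
    print(f"utterance {idx}: {found}")
```

In this toy run the first utterance is paired with the first image, while the second utterance gets no image because nothing clears the threshold. A filter of this kind is one plausible way to keep weakly related image-dialogue pairs out of an automatically built dataset.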


