Multimodal Conversational AI: A Survey of Datasets and Approaches

05/13/2022
by Anirudh Sundar, et al.

As humans, we experience the world with all our senses or modalities (sound, sight, touch, smell, and taste). We use these modalities, particularly sight and touch, to convey and interpret specific meanings. Multimodal expressions are central to conversations; the modalities in a rich set amplify and often compensate for one another. A multimodal conversational AI system answers questions, fulfills tasks, and emulates human conversations by understanding and expressing itself via multiple modalities. This paper motivates, defines, and mathematically formulates the multimodal conversational research objective. We provide a taxonomy of the research required to meet that objective: multimodal representation, fusion, alignment, translation, and co-learning. We survey state-of-the-art datasets and approaches for each research area and highlight their limiting assumptions. Finally, we identify multimodal co-learning as a promising direction for multimodal conversational AI research.

research
10/05/2018

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations

Emotion recognition in conversations is a challenging Artificial Intelli...
research
09/28/2011

Cognitive Principles in Robust Multimodal Interpretation

Multimodal conversational interfaces provide a natural means for users t...
research
01/18/2021

MONAH: Multi-Modal Narratives for Humans to analyze conversations

In conversational analyses, humans manually weave multimodal information...
research
06/06/2020

Multimodal Systems: Taxonomy, Methods, and Challenges

Naturally, humans use multiple modalities to convey information. The mod...
research
10/23/2022

McQueen: a Benchmark for Multimodal Conversational Query Rewrite

The task of query rewrite aims to convert an in-context query to its ful...
research
11/16/2020

Conversational agents for learning foreign languages – a survey

Conversational practice, while crucial for all language learners, can be...
research
05/16/2023

ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing

While various AI explanation (XAI) methods have been proposed to interpr...
