Which One Are You Referring To? Multimodal Object Identification in Situated Dialogue

02/28/2023
by Holy Lovenia, et al.

The demand for multimodal dialogue systems has been rising in various domains, emphasizing the importance of interpreting multimodal inputs from conversational and situational contexts. We explore three methods to tackle this problem and evaluate them on the largest situated dialogue dataset, SIMMC 2.1. Our best method, scene-dialogue alignment, improves performance by roughly 20% over the SIMMC 2.1 baselines. We also provide analysis and discussion regarding the limitations of our methods and potential directions for future work. Our code is publicly available at https://github.com/holylovenia/multimodal-object-identification.
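The abstract does not describe how scene-dialogue alignment is implemented, so the snippet below is only a minimal PyTorch sketch of the general idea of matching dialogue context against candidate scene objects. The class name, encoder dimensions, and the threshold-based selection rule are all assumptions for illustration, not the authors' method; their actual implementation is in the linked repository.

```python
# Hypothetical sketch: score each scene object against a pooled dialogue
# embedding in a shared space, then keep objects above a decision threshold.
# Dimensions, names, and the threshold are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SceneDialogueAligner(nn.Module):
    """Projects dialogue and scene-object features into a shared space and
    scores every candidate object against the dialogue context."""

    def __init__(self, dialogue_dim=768, object_dim=512, shared_dim=256):
        super().__init__()
        self.dialogue_proj = nn.Linear(dialogue_dim, shared_dim)
        self.object_proj = nn.Linear(object_dim, shared_dim)

    def forward(self, dialogue_emb, object_embs):
        # dialogue_emb: (batch, dialogue_dim) pooled dialogue-context embedding
        # object_embs:  (batch, num_objects, object_dim) per-object scene features
        d = F.normalize(self.dialogue_proj(dialogue_emb), dim=-1)   # (B, D)
        o = F.normalize(self.object_proj(object_embs), dim=-1)      # (B, N, D)
        # Cosine similarity between the dialogue and each candidate object.
        return torch.einsum("bd,bnd->bn", d, o)                     # (B, N)

# Toy usage: score 5 candidate objects for one dialogue turn and select
# those whose similarity exceeds a (hypothetical) threshold of 0.0.
model = SceneDialogueAligner()
scores = model(torch.randn(1, 768), torch.randn(1, 5, 512))
referred = (scores > 0.0).nonzero(as_tuple=False)
print(scores)
print(referred)
```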

Related research:

03/08/2023
FaceChat: An Emotion-Aware Face-to-face Dialogue Framework
While current dialogue systems like ChatGPT have made significant advanc...

06/27/2023
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic
In human conversations, individuals can indicate relevant regions within...

06/21/2023
OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue
Large multimodal language models (LMMs) have achieved significant succes...

09/16/2022
Selecting Stickers in Open-Domain Dialogue through Multitask Learning
With the increasing popularity of online chatting, stickers are becoming...

02/16/2020
A Multimodal Dialogue System for Conversational Image Editing
In this paper, we present a multimodal dialogue system for Conversationa...

06/05/2019
Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)
Sarcasm is often expressed through several verbal and non-verbal cues, e...

05/27/2023
MPCHAT: Towards Multimodal Persona-Grounded Conversation
In order to build self-consistent personalized dialogue agents, previous...
