Ensemble of MRR and NDCG models for Visual Dialog

04/15/2021
by Idan Schwartz, et al.

Assessing an AI agent that can converse in human language and understand visual content is challenging. Generation metrics, such as BLEU scores, favor correct syntax over semantics. Hence, a discriminative approach is often used, where an agent ranks a set of candidate options. The mean reciprocal rank (MRR) metric evaluates model performance by taking into account the rank of a single human-derived answer. This approach, however, raises a new challenge: the ambiguity and synonymy of answers, for instance, semantic equivalence (e.g., 'yeah' and 'yes'). To address this, the normalized discounted cumulative gain (NDCG) metric has been used to capture the relevance of all the correct answers via dense annotations. However, the NDCG metric favors uncertain answers that are usually applicable, such as 'I don't know'. Crafting a model that excels on both MRR and NDCG metrics is challenging. Ideally, an AI agent should give a human-like reply and validate the correctness of any answer. To address this issue, we describe a two-step non-parametric ranking approach that can merge strong MRR and NDCG models. Using our approach, we manage to keep most of the MRR state-of-the-art performance (70.41 [...]) and of the NDCG state-of-the-art performance (72.16 [...]) [...] the recent Visual Dialog 2020 challenge. Source code is available at https://github.com/idansc/mrr-ndcg.
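To make the two metrics concrete, the following is a minimal Python sketch of how MRR and NDCG can be computed for ranked candidate answers, followed by one possible way to merge two rankers. The function names, the toy scores, and the `two_step_merge` rule (keep the top-k candidates of the MRR-oriented model, then order the rest by the NDCG-oriented model) are illustrative assumptions. The abstract does not spell out the two steps, so this should not be read as the authors' exact method, and the NDCG here is the textbook form computed over the full ranking rather than any challenge-specific variant.

```python
import math


def reciprocal_rank(ranking, human_answer):
    """1 / rank of the single human-derived answer (ranks start at 1)."""
    return 1.0 / (ranking.index(human_answer) + 1)


def mean_reciprocal_rank(rankings, human_answers):
    """Average reciprocal rank over several questions."""
    rr = [reciprocal_rank(r, a) for r, a in zip(rankings, human_answers)]
    return sum(rr) / len(rr)


def dcg(gains):
    """Discounted cumulative gain: gain at position p is divided by log2(p + 1)."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))


def ndcg(ranking, relevance):
    """NDCG of a ranking given dense relevance annotations.

    relevance maps a candidate to its annotated relevance;
    candidates without an annotation count as 0.
    """
    gains = [relevance.get(c, 0.0) for c in ranking]
    ideal_dcg = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def two_step_merge(mrr_scores, ndcg_scores, k=1):
    """Hypothetical merge rule (an assumption, not taken from the abstract).

    Step 1: keep the top-k candidates of the MRR-oriented model.
    Step 2: order the remaining candidates by the NDCG-oriented model.
    """
    by_mrr = sorted(mrr_scores, key=mrr_scores.get, reverse=True)
    head = by_mrr[:k]
    tail = sorted(
        (c for c in ndcg_scores if c not in head),
        key=ndcg_scores.get,
        reverse=True,
    )
    return head + tail


# Toy example with made-up scores for four candidate answers.
mrr_scores = {"yes": 0.9, "no": 0.5, "yeah": 0.2, "i don't know": 0.1}
ndcg_scores = {"i don't know": 0.8, "yeah": 0.7, "yes": 0.6, "no": 0.1}
relevance = {"yes": 1.0, "yeah": 1.0, "i don't know": 0.5}

merged = two_step_merge(mrr_scores, ndcg_scores, k=1)
print(merged)
print("reciprocal rank:", reciprocal_rank(merged, "yes"))
print("NDCG:", round(ndcg(merged, relevance), 4))
```

In this toy case the merged ranking places the human-derived answer 'yes' first, which preserves the reciprocal rank, while the remaining positions follow the NDCG-oriented model, which tends to rank broadly applicable answers such as 'I don't know' highly.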

research
02/26/2019

Image-Question-Answer Synergistic Network for Visual Dialog

The image, question (combined with the history for de-referencing), and ...
research
11/26/2016

Visual Dialog

We introduce the task of Visual Dialog, which requires an AI agent to ho...
research
01/17/2020

Modality-Balanced Models for Visual Dialogue

The Visual Dialog task requires a model to exploit both image and conver...
research
11/23/2022

Unified Multimodal Model with Unlikelihood Training for Visual Dialog

The task of visual dialog requires a multimodal chatbot to answer sequen...
research
01/06/2023

You Truly Understand What I Need: Intellectual and Friendly Dialogue Agents grounding Knowledge and Persona

To build a conversational agent that interacts fluently with humans, pre...
research
06/05/2017

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

We present a novel training framework for neural sequence models, partic...
research
08/16/2021

Computational extraction of metrics and normative data on the alternative uses test on a set of 420 household objects

The Alternative Uses Test (AUT) is a classical test which has long been ...
