Visual Dialogue without Vision or Dialogue

12/16/2018
by Daniela Massiceti, et al.

We characterise some of the quirks and shortcomings in the exploration of Visual Dialogue (VD), a sequential question-answering task in which the questions and corresponding answers are related through given visual stimuli. To do so, we develop an embarrassingly simple method based on Canonical Correlation Analysis (CCA) that, on the standard dataset, achieves near state-of-the-art performance on a standard metric. In direct contrast to current complex and over-parametrised architectures that are both compute- and time-intensive, our method ignores the visual stimuli, ignores the sequencing of dialogue, does not need gradients, uses off-the-shelf feature extractors, has at least an order of magnitude fewer parameters, and learns in practically no time. We argue that these results are indicative of issues in current approaches to Visual Dialogue, relating particularly to implicit dataset biases, under-constrained task objectives, and over-constrained evaluation metrics, and we consequently discuss some avenues to ameliorate these issues.
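To make the "no gradients, no vision, no dialogue history" idea concrete, the following is a minimal sketch of how linear CCA could be used to rank candidate answers for a single question. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which feature extractors are used, so random vectors stand in for question and answer features, and the function names `fit_cca` and `rank_candidates`, the regulariser `reg`, and the cosine-similarity scoring are all hypothetical choices.

```python
import numpy as np


def fit_cca(X, Y, k, reg=1e-3):
    """Closed-form linear CCA (whitening + SVD); no gradient steps are needed.

    X: (n, dx) question features, Y: (n, dy) answer features (assumed inputs).
    Returns the means, the two projection matrices, and the top-k correlations.
    """
    n = X.shape[0]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my

    # Regularised covariance and cross-covariance matrices.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    Wx = Kx @ U[:, :k]      # projects questions into the shared space
    Wy = Ky @ Vt.T[:, :k]   # projects answers into the shared space
    return mx, my, Wx, Wy, s[:k]


def rank_candidates(q, candidates, model):
    """Order candidate answers by cosine similarity to the projected question."""
    mx, my, Wx, Wy, _ = model
    zq = (q - mx) @ Wx
    zc = (candidates - my) @ Wy
    zq = zq / np.linalg.norm(zq)
    zc = zc / np.linalg.norm(zc, axis=1, keepdims=True)
    return np.argsort(-(zc @ zq))  # best candidate first


if __name__ == "__main__":
    # Synthetic stand-in data: answers are a noisy linear function of questions.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 8))
    Y = X @ rng.normal(size=(8, 6)) + 0.01 * rng.normal(size=(500, 6))
    model = fit_cca(X, Y, k=4)
    print(rank_candidates(X[0], Y[:10], model))
```

In this sketch the only learned quantities are the two projection matrices, which is one way to see why such a method can have far fewer parameters than a deep model and can be fitted in a single linear-algebra pass.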

