From VQA to Multimodal CQA: Adapting Visual QA Models for Community QA Tasks

08/29/2018
by Avikalp Srivastava, et al.

In this work, we present novel methods to adapt visual QA (VQA) models for two community QA (CQA) tasks of practical significance, automated question category classification and finding experts for question answering, on questions containing both text and an image. To the best of our knowledge, this is the first work to tackle the multimodality challenge in CQA, and it is an enabling step towards basic question answering on image-based CQA. We first analyze the differences between VQA and CQA datasets and discuss the limitations of applying VQA models directly to CQA tasks; we then propose novel augmentations to VQA-based models to address those limitations. Our model, augmented with an image-text combination method tailored for CQA and with auxiliary tasks for learning better grounding features, significantly outperforms the text-only and VQA model baselines on both tasks on real-world CQA data from Yahoo! Chiebukuro, a Japanese counterpart of Yahoo! Answers.
