Cross-Modal Generative Augmentation for Visual Question Answering

05/11/2021
by Zixu Wang, et al.

Data augmentation can effectively improve the performance of multimodal machine learning. This paper introduces a generative model for data augmentation that leverages the correlations among multiple modalities. Unlike conventional data augmentation approaches, which apply low-level operations with deterministic heuristics, our method learns an augmentation sampler that generates samples of the target modality conditioned on the observed modalities, within the variational auto-encoder framework. Additionally, the proposed model can quantify the confidence of augmented data by its generative probability, and can be updated jointly with a downstream pipeline. Experiments on Visual Question Answering tasks demonstrate the effectiveness of the proposed generative model, which boosts strong UpDn-based models to state-of-the-art performance.
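
The sampling-and-confidence idea described above can be illustrated with a small sketch. This is not the authors' model: the decoder here is an untrained stand-in, every name (`decode`, `sample_augmentations`, `confidence_weights`) is hypothetical, and the confidence is taken to be the log-probability of each sample under a fixed-variance Gaussian decoder given its sampled latent, which is only one possible reading of "generative probability". The sketch draws a latent, decodes a target-modality vector conditioned on an observed-modality vector, and turns the log-probabilities into normalized weights.

```python
import math
import random


def gaussian_log_prob(x, mean, std):
    """Log-density of x under a diagonal Gaussian with shared std."""
    return sum(
        -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((xi - mi) / std) ** 2
        for xi, mi in zip(x, mean)
    )


def decode(z, cond):
    """Hypothetical decoder mean; a fixed map standing in for a trained network."""
    return [math.tanh(zi + ci) for zi, ci in zip(z, cond)]


def sample_augmentations(cond, n_samples, std=0.1, rng=None):
    """Draw target-modality samples conditioned on `cond`, each with its log-probability."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in cond]            # z ~ N(0, I)
        mean = decode(z, cond)                             # decoder mean given z and cond
        x = [m + std * rng.gauss(0.0, 1.0) for m in mean]  # x ~ N(mean, std^2 I)
        samples.append((x, gaussian_log_prob(x, mean, std)))
    return samples


def confidence_weights(log_probs):
    """Softmax over log-probabilities: positive weights that sum to 1."""
    top = max(log_probs)  # subtract the max for numerical stability
    exps = [math.exp(lp - top) for lp in log_probs]
    total = sum(exps)
    return [e / total for e in exps]


cond = [0.2, -0.5, 1.0]  # toy observed-modality feature vector (assumed)
augmented = sample_augmentations(cond, n_samples=4)
weights = confidence_weights([lp for _, lp in augmented])
```

In a downstream pipeline, weights of this kind could scale each augmented example's contribution to the training loss, so that low-probability samples count for less.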

04/04/2016

Character-Level Question Answering with Attention

We show that a character-level encoder-decoder framework can be successf...
10/04/2020

Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space

In this paper, we propose a novel data augmentation method, referred to ...
03/15/2022

Adversarial Counterfactual Augmentation: Application in Alzheimer's Disease Classification

Data augmentation has been widely used in deep learning to reduce over-f...
11/12/2017

High-Order Attention Models for Visual Question Answering

The quest for algorithms that enable cognitive abilities is an important...
01/23/2020

Variational Hierarchical Dialog Autoencoder for Dialogue State Tracking Data Augmentation

Recent works have shown that generative data augmentation, where synthet...
07/15/2021

Cross-modal Variational Auto-encoder for Content-based Micro-video Background Music Recommendation

In this paper, we propose a cross-modal variational auto-encoder (CMVAE)...