UniCon: Unidirectional Split Learning with Contrastive Loss for Visual Question Answering

08/24/2022
by Yuwei Sun, et al.

Visual question answering (VQA), which leverages multi-modal data, has attracted intensive interest in real-life applications such as home robots and clinical diagnosis. Nevertheless, one of the challenges is to design robust learning across different client tasks. This work aims to bridge the gap between the need for large-scale training data and the constraint on client data sharing, mainly due to confidentiality. We propose Unidirectional Split Learning with Contrastive Loss (UniCon) to tackle VQA tasks trained on distributed data silos. In particular, UniCon trains a global model over the entire data distribution of different clients, learning refined cross-modal representations via contrastive learning. The learned representations of the global model aggregate knowledge from different local tasks. Moreover, we devise a unidirectional split learning framework to enable more efficient knowledge sharing. Comprehensive experiments with five state-of-the-art VQA models on the VQA-v2 dataset demonstrate the efficacy of UniCon, which achieves an accuracy of 49.89%. To our knowledge, this is the first study of VQA under the constraint of data confidentiality using self-supervised split learning.
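
As a rough illustration of the cross-modal contrastive objective described in the abstract, the sketch below pairs image and question embeddings produced by client-side encoders (up to a cut layer, as in split learning) and applies a symmetric InfoNCE-style loss, treating matched image/question pairs as positives and all other pairs in the batch as negatives. The class names, dimensions, and encoder architecture here are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of a cross-modal contrastive loss in a split-learning setting.
# Not the authors' code; names and sizes are assumptions for illustration only.
import torch
import torch.nn.functional as F
from torch import nn


class ClientEncoder(nn.Module):
    """Client-side encoder up to the cut layer (hypothetical architecture)."""

    def __init__(self, in_dim: int, cut_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.ReLU(),
            nn.Linear(512, cut_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def cross_modal_contrastive_loss(img_emb, txt_emb, temperature: float = 0.07):
    """Symmetric InfoNCE-style loss over a batch of image/question embeddings."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)  # diagonal = positives
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Toy usage: clients send cut-layer activations to a server, which computes the loss.
    image_features = torch.randn(8, 2048)    # e.g., pre-extracted visual features
    question_features = torch.randn(8, 768)  # e.g., pooled question embeddings
    img_client = ClientEncoder(2048)
    txt_client = ClientEncoder(768)
    loss = cross_modal_contrastive_loss(img_client(image_features), txt_client(question_features))
    loss.backward()
    print(float(loss))
```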


