Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation

04/13/2021
by Jae Won Cho, et al.

In this work, we address the missing-modality problem that arises in the Visual Question Answer-Difference prediction task and propose a novel method for the task. The missing modality is the set of ground-truth answers, which is not available at test time, and we handle it with a privileged knowledge distillation scheme. To do so efficiently, we first introduce a model, the "Big" Teacher, that takes the image/question/answer triplet as its input and outperforms the baseline. We then use a combination of models to distill knowledge into a target network (the student) that takes only the image/question pair as its input. We evaluate our models on the VizWiz and VQA-V2 Answer Difference datasets and, through extensive experiments and ablations, report the performance of our method and point to diverse possibilities for future research.
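The abstract does not spell out the training objective, so the following is only a minimal sketch of how a privileged distillation loss of this kind is often set up, not the authors' implementation. It assumes a multi-label setting with one sigmoid output per label, teacher logits that were computed by a model with access to the answers, and two hypothetical hyperparameters, `alpha` (weight of the distillation term) and `temperature` (softening factor). All function and parameter names are illustrative.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between probabilities and targets in [0, 1]."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def student_loss(student_logits, teacher_logits, labels,
                 alpha=0.5, temperature=2.0):
    """Hypothetical student objective: supervised term plus distillation term.

    student_logits: outputs of the image/question-only student
    teacher_logits: outputs of a teacher that also saw the answers (assumed given)
    labels:         ground-truth multi-label targets (assumed binary or soft)
    """
    # Supervised term: student predictions against the ground-truth labels.
    hard = bce([sigmoid(z) for z in student_logits], labels)

    # Distillation term: student against the teacher's softened predictions.
    soft_targets = [sigmoid(z / temperature) for z in teacher_logits]
    soft_preds = [sigmoid(z / temperature) for z in student_logits]
    soft = bce(soft_preds, soft_targets)

    return (1.0 - alpha) * hard + alpha * soft
```

At test time only the student is evaluated, so the answers are never needed as an input; the teacher's outputs enter the picture solely through the distillation term during training.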


research
05/31/2022

What Knowledge Gets Distilled in Knowledge Distillation?

Knowledge distillation aims to transfer useful information from a teache...
research
10/15/2021

From Multimodal to Unimodal Attention in Transformers using Knowledge Distillation

Multimodal Deep Learning has garnered much interest, and transformers ha...
research
12/21/2021

Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix

In the context of multi-modality knowledge distillation research, the ex...
research
12/10/2018

Spatial Knowledge Distillation to aid Visual Reasoning

For tasks involving language and vision, the current state-of-the-art me...
research
09/26/2019

Compact Trilinear Interaction for Visual Question Answering

In Visual Question Answering (VQA), answers have a great correlation wit...
research
08/17/2022

Progressive Cross-modal Knowledge Distillation for Human Action Recognition

Wearable sensor-based Human Action Recognition (HAR) has achieved remark...
research
08/28/2019

Online Sensor Hallucination via Knowledge Distillation for Multimodal Image Classification

We deal with the problem of information fusion driven satellite image/sc...
