Hadamard Product for Low-rank Bilinear Pooling

10/14/2016
by Jin-Hwa Kim, et al.

Bilinear models provide richer representations than linear models. They have been applied to various visual tasks, such as object recognition, segmentation, and visual question-answering, achieving state-of-the-art performance by exploiting the expanded representations. However, bilinear representations tend to be high-dimensional, which limits their applicability to computationally complex tasks. We propose low-rank bilinear pooling using the Hadamard product for an efficient attention mechanism in multimodal learning. We show that our model outperforms compact bilinear pooling on visual question-answering tasks, achieving state-of-the-art results on the VQA dataset while being more parsimonious.
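
The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea behind low-rank bilinear pooling with a Hadamard product. It assumes a scalar bilinear form x^T W y whose weight matrix is constrained to rank d as W = U V^T; under that assumption the form equals the sum of the element-wise (Hadamard) product of the two projections U^T x and V^T y. The sizes N, M, d and all function names are hypothetical, and the sketch omits any nonlinearities, biases, or output projection a full model may use.

```python
import random

def project(A, v):
    """Compute A^T v, where A is a list of len(v) rows with d columns each."""
    d = len(A[0])
    return [sum(A[i][k] * v[i] for i in range(len(v))) for k in range(d)]

def full_bilinear(x, W, y):
    """Scalar bilinear form x^T W y with an explicit N x M weight matrix."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def low_rank_bilinear(x, U, V, y):
    """Same form with W = U V^T: sum of the Hadamard product of projections."""
    px = project(U, x)  # d-dimensional projection of x
    py = project(V, y)  # d-dimensional projection of y
    return sum(a * b for a, b in zip(px, py))

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
N, M, d = 8, 6, 2  # hypothetical input sizes and rank

x = [rng.uniform(-1.0, 1.0) for _ in range(N)]
y = [rng.uniform(-1.0, 1.0) for _ in range(M)]
U = random_matrix(N, d, rng)
V = random_matrix(M, d, rng)

# Materialize W = U V^T only to check the identity; the low-rank path never needs it.
W = [[sum(U[i][k] * V[j][k] for k in range(d)) for j in range(M)]
     for i in range(N)]

full_value = full_bilinear(x, W, y)
low_rank_value = low_rank_bilinear(x, U, V, y)

params_full = N * M            # entries in the explicit weight matrix
params_low_rank = d * (N + M)  # entries in U and V combined

print(abs(full_value - low_rank_value) < 1e-9)
print(params_full, params_low_rank)
```

With these toy sizes the factorized form stores d(N + M) weights instead of NM, and the saving grows as N and M increase relative to d, which is one way to read the parsimony claimed in the abstract.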

research
06/06/2016

Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding

Modeling textual or visual information with vector representations train...
research
05/21/2018

Bilinear Attention Networks

Attention networks in multimodal learning provide an efficient way to ut...
research
05/18/2017

MUTAN: Multimodal Tucker Fusion for Visual Question Answering

Bilinear models provide an appealing framework for mixing and merging in...
research
03/23/2017

Multimodal Compact Bilinear Pooling for Multimodal Neural Machine Translation

In state-of-the-art Neural Machine Translation, an attention mechanism i...
research
06/20/2017

Compact Tensor Pooling for Visual Question Answering

Performing high level cognitive tasks requires the integration of featur...
research
06/03/2019

Low-rank Random Tensor for Bilinear Pooling

Bilinear pooling is capable of extracting high-order information from da...
research
12/18/2020

Trying Bilinear Pooling in Video-QA

Bilinear pooling (BLP) refers to a family of operations recently develop...
