Visual Concept Reasoning Networks

08/26/2020
by   Taesup Kim, et al.
24

A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks. It approximates sparsely connected networks by explicitly defining multiple branches to simultaneously learn representations with different visual concepts or properties. Dependencies or interactions between these representations are typically defined by dense and local operations, however, without any adaptiveness or high-level reasoning. In this work, we propose to exploit this strategy and combine it with our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts. We associate each branch with a visual concept and derive a compact concept state by selecting a few local descriptors through an attention module. These concept states are then updated by graph-based interaction and used to adaptively modulate the local descriptors. We describe our proposed model by split-transform-attend-interact-modulate-merge stages, which are implemented by opting for a highly modularized architecture. Extensive experiments on visual recognition tasks such as image classification, semantic segmentation, object detection, scene recognition, and action recognition show that our proposed model, VCRNet, consistently improves the performance by increasing the number of parameters by less than 1

READ FULL TEXT

page 8

page 13

research
03/02/2022

Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions

With the achievements of Transformer in the field of natural language pr...
research
09/13/2023

Dynamic Spectrum Mixer for Visual Recognition

Recently, MLP-based vision backbones have achieved promising performance...
research
05/28/2014

Detection Bank: An Object Detection Based Video Representation for Multimedia Event Recognition

While low-level image features have proven to be effective representatio...
research
11/30/2018

Graph-Based Global Reasoning Networks

Globally modeling and reasoning over relations between regions can be be...
research
03/03/2017

Deep Collaborative Learning for Visual Recognition

Deep neural networks are playing an important role in state-of-the-art v...
research
07/18/2023

Human Action Recognition in Still Images Using ConViT

Understanding the relationship between different parts of the image play...
research
01/26/2018

Neural Algebra of Classifiers

The world is fundamentally compositional, so it is natural to think of v...

Please sign up or login with your details

Forgot password? Click here to reset