AOGNets: Deep AND-OR Grammar Networks for Visual Recognition

11/15/2017
by Xilai Li, et al.

This paper presents a method of learning deep AND-OR Grammar (AOG) networks for visual recognition, which we term AOGNets. An AOGNet consists of a number of stages, each composed of a number of AOG building blocks. An AOG building block is designed based on a principled AND-OR grammar and is represented by a hierarchical and compositional AND-OR graph. Each node applies some basic operation (e.g., Conv-BatchNorm-ReLU) to its input. There are three types of nodes: an AND-node explores composition, and its input is computed by concatenating the features of its child nodes; an OR-node represents alternative ways of composition in the spirit of exploitation, and its input is the element-wise sum of the features of its child nodes; and a Terminal-node takes as input a channel-wise slice of the input feature map of the AOG building block. AOGNets aim to harness the best of two worlds (grammar models and deep neural networks) in representation learning with end-to-end training. In experiments, AOGNets are tested on three highly competitive image classification benchmarks: CIFAR-10, CIFAR-100 and ImageNet-1K. AOGNets obtain better performance than the widely used Residual Net and its variants, and are closely comparable to the Dense Net. AOGNets are also tested on object detection on the PASCAL VOC 2007 and 2012 benchmarks using the vanilla Faster R-CNN system, where they obtain better performance than the Residual Net.
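
To make the three node types concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the feature map is modeled as a list of channels, the node operation is reduced to a ReLU stand-in for Conv-BatchNorm-ReLU, and the graph layout (two Terminal-nodes composed by one AND-node, whose output is summed with a third Terminal-node by a root OR-node) as well as all function names are assumptions chosen for illustration.

```python
from typing import List

Channel = List[float]
FeatureMap = List[Channel]  # a list of channels, each a flat list of activations


def node_op(x: FeatureMap) -> FeatureMap:
    """Hypothetical stand-in for the per-node operation.

    The abstract gives Conv-BatchNorm-ReLU as an example; only the ReLU
    is kept here so the sketch needs no external libraries.
    """
    return [[max(0.0, v) for v in channel] for channel in x]


def terminal_node(block_input: FeatureMap, start: int, stop: int) -> FeatureMap:
    """Terminal-node: its input is a channel-wise slice of the block input."""
    return node_op(block_input[start:stop])


def and_node(children: List[FeatureMap]) -> FeatureMap:
    """AND-node: its input is the channel-wise concatenation of its children."""
    concatenated = [channel for child in children for channel in child]
    return node_op(concatenated)


def or_node(children: List[FeatureMap]) -> FeatureMap:
    """OR-node: its input is the element-wise sum of its children."""
    shape = [len(channel) for channel in children[0]]
    for child in children:
        # Element-wise summation requires identically shaped children.
        assert [len(channel) for channel in child] == shape
    summed = [
        [sum(values) for values in zip(*channels)]
        for channels in zip(*children)
    ]
    return node_op(summed)


def toy_aog_block(block_input: FeatureMap) -> FeatureMap:
    """Illustrative block over an input split into two channel groups."""
    half = len(block_input) // 2
    left = terminal_node(block_input, 0, half)
    right = terminal_node(block_input, half, len(block_input))
    whole = terminal_node(block_input, 0, len(block_input))
    composed = and_node([left, right])   # one way to build the full width
    return or_node([whole, composed])    # sum of the two alternatives


if __name__ == "__main__":
    x = [[1.0, -2.0], [3.0, 4.0]]  # 2 channels, 2 spatial positions
    print(toy_aog_block(x))
```

In this toy layout the OR-node combines two alternative ways of covering the full channel range: the unsplit slice and the composition of the two halves. Concatenation in the AND-node restores the full channel count, which is what makes the element-wise sum in the OR-node well defined.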
