Object Recognition with Multi-Scale Pyramidal Pooling Networks

07/07/2012
by Jonathan Masci, et al.

We present a Multi-Scale Pyramidal Pooling Network, featuring a novel pyramidal pooling layer at multiple scales and a novel encoding layer. Thanks to the pyramidal pooling layer, the network does not require all images of a given classification task to be of equal size. The encoding layer improves generalisation performance compared with similar neural network architectures, especially when training data is scarce. We evaluate our system against convolutional neural networks and state-of-the-art computer vision methods on various benchmark datasets. We also present results on industrial steel defect classification, where existing architectures are not applicable because they require equally sized input images. The proposed architecture can be seen as a fully supervised hierarchical bag-of-features extension that is trained online and can be fine-tuned for any given task.
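
To illustrate why pooling over a pyramid of grids removes the fixed-input-size requirement, here is a minimal sketch in plain Python. It is not the authors' layer: the abstract does not specify the pooling operator or the grid sizes, so max pooling, the levels (1, 2, 4), and the function name pyramid_max_pool are all assumptions made for illustration.

```python
from typing import List, Sequence


def pyramid_max_pool(
    fmap: Sequence[Sequence[float]],
    levels: Sequence[int] = (1, 2, 4),  # assumed grid sizes, not from the paper
) -> List[float]:
    """Max-pool a 2-D feature map over an n x n grid for each n in `levels`.

    The output length is sum(n * n for n in levels), so it does not depend
    on the height or width of the input map.
    """
    h, w = len(fmap), len(fmap[0])
    pooled: List[float] = []
    for n in levels:
        for i in range(n):
            # Rows covered by grid row i: floor(i*h/n) up to ceil((i+1)*h/n).
            r0 = (i * h) // n
            r1 = ((i + 1) * h + n - 1) // n
            for j in range(n):
                # Columns covered by grid column j, computed the same way.
                c0 = (j * w) // n
                c1 = ((j + 1) * w + n - 1) // n
                pooled.append(
                    max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1))
                )
    return pooled


# Two feature maps of different sizes yield vectors of the same length.
small = [[float(r * 7 + c) for c in range(7)] for r in range(5)]    # 5 x 7
large = [[float(r * 9 + c) for c in range(9)] for r in range(12)]   # 12 x 9
print(len(pyramid_max_pool(small)), len(pyramid_max_pool(large)))
```

Because every grid cell is sized relative to the input, any layer that follows receives a fixed-length vector regardless of image size. When the map is smaller than the grid, neighbouring cells overlap rather than being left empty.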


