Inability of spatial transformations of CNN feature maps to support invariant recognition

04/30/2020
by Ylva Jansson, et al.

A large number of deep learning architectures use spatial transformations of CNN feature maps or filters to better handle variability in object appearance caused by natural image transformations. In this paper, we prove that, for general affine transformations, spatial transformations of CNN feature maps cannot align the feature maps of a transformed image with those of the original image unless the extracted features are themselves invariant. Our proof is based on elementary analysis and covers both the single-layer and the multi-layer network case. The results imply that methods based on spatial transformations of CNN feature maps or filters cannot replace image alignment of the input and cannot enable invariant recognition for general affine transformations, in particular not for scaling or shear transformations. For rotations and reflections, spatially transforming feature maps or filters can enable invariance, but only for networks with learnt or hardcoded rotation- or reflection-invariant features.
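The claim about scaling can be illustrated numerically with a toy 1D example. The sketch below is not the paper's proof or code. It assumes a Gaussian "image", a single fixed Gaussian filter standing in for one convolutional layer, and linear interpolation as the spatial warp; all names (`image`, `features`, `warp_by_scale`) are hypothetical and chosen for illustration only. It compares three cases: undoing a translation on the feature map, undoing a scaling on the feature map, and undoing the scaling on the input before feature extraction.

```python
import numpy as np

def gaussian(x, sigma):
    """Unnormalised Gaussian bump centred at zero."""
    return np.exp(-x**2 / (2.0 * sigma**2))

# Sampling grid, wide enough that all signals are ~0 at the borders.
x = np.linspace(-20.0, 20.0, 2001)
dx = x[1] - x[0]

def image(coords):
    """Toy 1D 'image' (assumption: a Gaussian blob of width 1)."""
    return gaussian(coords, 1.0)

# Fixed filter of finite size (assumption: Gaussian, width 1, unit sum).
kernel = gaussian(np.arange(-300, 301) * dx, 1.0)
kernel /= kernel.sum()

def features(signal):
    """One 'convolutional layer': convolution with the fixed filter."""
    return np.convolve(signal, kernel, mode="same")

def warp_by_scale(signal, s):
    """Spatial warp g(x) -> g(s * x), using linear interpolation."""
    return np.interp(s * x, x, signal)

reference = features(image(x))

# Case 1: translation. Shifting the feature map back aligns it.
shift = 100  # samples
feat_translated = features(np.roll(image(x), shift))
err_translation = np.max(np.abs(np.roll(feat_translated, -shift) - reference))

# Case 2: scaling. Warping the feature map back does NOT align it,
# because the filter has a fixed size relative to the rescaled image.
s = 2.0
scaled_image = image(x / s)
feat_scaled = features(scaled_image)
err_scaling = np.max(np.abs(warp_by_scale(feat_scaled, s) - reference))

# Case 3: aligning the input first, then extracting features, works.
aligned_input = warp_by_scale(scaled_image, s)
err_input_alignment = np.max(np.abs(features(aligned_input) - reference))

print("translation, feature-map alignment:", err_translation)
print("scaling, feature-map alignment:    ", err_scaling)
print("scaling, input alignment:          ", err_input_alignment)
```

In this toy setting the translation and input-alignment errors are at the level of numerical noise, while the error after rescaling the feature map is clearly nonzero: rescaling the feature map is equivalent to having applied a rescaled filter to the original image, and a fixed, non-scale-invariant filter does not give the same response.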

research
01/14/2020

The problems with using STNs to align CNN feature maps

Spatial transformer networks (STNs) were designed to enable CNNs to lear...
research
11/24/2017

Efficient and Invariant Convolutional Neural Networks for Dense Prediction

Convolutional neural networks have shown great success on feature extrac...
research
05/24/2017

Deep Rotation Equivariant Network

Recently, learning equivariant representations has attracted considerabl...
research
04/24/2020

Understanding when spatial transformer networks do not support invariance, and what to do about it

Spatial transformer networks (STNs) were designed to enable convolutiona...
research
12/30/2019

Recognizing Instagram Filtered Images with Feature De-stylization

Deep neural networks have been shown to suffer from poor generalization ...
research
04/22/2021

Deep Lucas-Kanade Homography for Multimodal Image Alignment

Estimating homography to align image pairs captured by different sensors...
research
01/22/2020

Learning to Correct 3D Reconstructions from Multiple Views

This paper is about reducing the cost of building good large-scale 3D re...
