How deep convolutional neural networks lose spatial information with training

by   Umberto M. Tomasini, et al.

A central question of machine learning is how deep nets manage to learn tasks in high dimensions. An appealing hypothesis is that they achieve this feat by building a representation of the data where information irrelevant to the task is lost. For image datasets, this view is supported by the observation that after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the net. This loss of sensitivity correlates with performance, and surprisingly correlates with a gain of sensitivity to white noise acquired during training. These facts are unexplained, and as we demonstrate still hold when white noise is added to the images of the training set. Here, we (i) show empirically for various architectures that stability to image diffeomorphisms is achieved by spatial pooling in the first half of the net, and by channel pooling in the second half, (ii) introduce a scale-detection task for a simple model of data where pooling is learned during training, which captures all empirical observations above and (iii) compute in this model how stability to diffeomorphisms and noise scale with depth. The scalings are found to depend on the presence of strides in the net architecture. We find that the increased sensitivity to noise is due to the perturbing noise piling up during pooling, after being rectified by ReLU units.


page 2

page 5

page 6

page 8

page 9


Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Existing deep convolutional neural networks (CNNs) require a fixed-size ...

Relative stability toward diffeomorphisms in deep nets indicates performance

Understanding why deep nets can classify data in large dimensions remain...

Learned Deformation Stability in Convolutional Neural Networks

Conventional wisdom holds that interleaved pooling layers in convolution...

A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs

U-Net architectures are ubiquitous in state-of-the-art deep learning, ho...

multi-patch aggregation models for resampling detection

Images captured nowadays are of varying dimensions with smartphones and ...

Streaming Networks: Enable A Robust Classification of Noise-Corrupted Images

The convolution neural nets (conv nets) have achieved a state-of-the-art...

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning

Understanding when the noise in stochastic gradient descent (SGD) affect...

Please sign up or login with your details

Forgot password? Click here to reset