DeepAI

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

07/09/2018
by Rosanne Liu, et al.

Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space. Although convolutional networks would seem appropriate for this task, we show that they fail spectacularly. We demonstrate and carefully analyze the failure first on a toy problem, at which point a simple fix becomes obvious. We call this solution CoordConv, which works by giving convolution access to its own input coordinates through the use of extra coordinate channels. Without sacrificing the computational and parametric efficiency of ordinary convolution, CoordConv allows networks to learn either perfect translation invariance or varying degrees of translation dependence, as required by the task. CoordConv solves the coordinate transform problem with perfect generalization, 150 times faster and with 10--100 times fewer parameters than convolution. This stark contrast raises the question: to what extent has this inability of convolution persisted insidiously inside other tasks, subtly hampering performance from within? A complete answer to this question will require further investigation, but we show preliminary evidence that swapping convolution for CoordConv can improve models on a diverse set of tasks. Using CoordConv in a GAN produced less mode collapse as the transform between high-level spatial latents and pixels becomes easier to learn. A Faster R-CNN detection model trained on MNIST detection showed 24% better IOU when using CoordConv, and in the Reinforcement Learning (RL) domain agents playing Atari games benefit significantly from the use of CoordConv layers.
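The extra-coordinate-channel idea described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under our own assumptions, not the authors' reference code: the class name `CoordConv2d` and the choice to normalize coordinates to [-1, 1] are ours, though both match the common convention used in the community implementations listed below.

```python
import torch
import torch.nn as nn


class CoordConv2d(nn.Module):
    """A Conv2d whose input is augmented with two extra channels
    holding each pixel's x and y coordinates, normalized to [-1, 1]."""

    def __init__(self, in_channels, out_channels, **kwargs):
        super().__init__()
        # Two extra input channels carry the x and y coordinate maps.
        self.conv = nn.Conv2d(in_channels + 2, out_channels, **kwargs)

    def forward(self, x):
        b, _, h, w = x.shape
        # Build coordinate grids in [-1, 1] and broadcast over the batch.
        ys = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(b, 1, h, w)
        xs = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(b, 1, h, w)
        # Concatenate coordinate channels, then apply an ordinary convolution.
        return self.conv(torch.cat([x, xs, ys], dim=1))


layer = CoordConv2d(3, 8, kernel_size=3, padding=1)
out = layer(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 8, 32, 32])
```

If the convolution learns zero weights for the coordinate channels, the layer reduces to an ordinary convolution with full translation invariance; nonzero weights give it the translation-dependent behavior the paper describes.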

Related Research

03/24/2020

A Simple Fix for Convolutional Neural Network via Coordinate Embedding

Convolutional Neural Networks (CNN) have been widely applied in the realm...
12/15/2018

3DTI-Net: Learn Inner Transform Invariant 3D Geometry Features using Dynamic GCN

Deep learning on point clouds has made a lot of progress recently. Many ...
01/19/2018

EffNet: An Efficient Structure for Convolutional Neural Networks

With the ever increasing application of Convolutional Neural Networks to...
04/26/2017

A Generalization of Convolutional Neural Networks to Graph-Structured Data

This paper introduces a generalization of Convolutional Neural Networks ...
08/22/2019

Deep Green Function Convolution for Improving Saliency in Convolutional Neural Networks

Current saliency methods require to learn large scale regional features ...
01/23/2020

DCT-Conv: Coding filters in convolutional networks with Discrete Cosine Transform

Convolutional neural networks are based on a huge number of trained weig...
11/14/2021

Visual design intuition: Predicting dynamic properties of beams from raw cross-section images

In this work we aim to mimic the human ability to acquire the intuition ...

Code Repositories

CoordConv-pytorch

PyTorch implementation of CoordConv introduced in the 'An intriguing failing of convolutional neural networks and the CoordConv solution' paper. (https://arxiv.org/pdf/1807.03247.pdf)



CoordConv

PyTorch implementation of "An intriguing failing of convolutional neural networks and the CoordConv solution" - https://arxiv.org/abs/1807.03247



coord-conv-pytorch

An intriguing failing of convolutional neural networks and the CoordConv solution in PyTorch



CNN-SoilTextureClassification

1-dimensional convolutional neural networks (CNN) for the classification of soil texture based on hyperspectral data

