Greedy Layerwise Learning Can Scale to ImageNet

12/29/2018
by Eugene Belilovsky, et al.

Shallow supervised 1-hidden-layer neural networks have a number of favorable properties that make them easier to interpret, analyze, and optimize than their deep counterparts, but they lack representational power. Here we use 1-hidden-layer learning problems to sequentially build deep networks layer by layer, which can inherit properties from shallow networks. Contrary to previous approaches using shallow networks, we focus on problems where deep learning is reported as critical for success. We thus study CNNs on image recognition tasks using the large-scale ImageNet dataset and the CIFAR-10 dataset. Using a simple set of ideas for architecture and training, we find that solving sequential 1-hidden-layer auxiliary problems leads to a CNN that exceeds AlexNet performance on ImageNet. Extending our training methodology to construct individual layers by solving 2- and 3-hidden-layer auxiliary problems, we obtain an 11-layer network that exceeds VGG-11 on ImageNet, achieving 89.8% top-5 single-crop accuracy. To our knowledge, this is the first competitive alternative to end-to-end training of CNNs that can scale to ImageNet. We conduct a wide range of experiments to study the properties this training methodology induces on the intermediate layers.
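A minimal sketch of the layerwise scheme described above, assuming a PyTorch-style API: each new block is trained against its own auxiliary classifier on top of the frozen stack built so far, then frozen in turn. The factory functions `make_block` and `make_head`, the function names, and the SGD hyperparameters are illustrative placeholders, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def train_block(block, head, frozen, loader, epochs=1):
    """Solve one auxiliary problem: fit `block` plus a throwaway
    classifier `head` on features from the frozen stack."""
    opt = torch.optim.SGD(
        list(block.parameters()) + list(head.parameters()),
        lr=0.1, momentum=0.9)          # illustrative hyperparameters
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():      # earlier layers stay fixed
                x = frozen(x)
            loss = loss_fn(head(block(x)), y)
            opt.zero_grad()
            loss.backward()            # gradients reach only the new block
            opt.step()

def greedy_layerwise(make_block, make_head, loader, depth):
    """Grow a deep network one block at a time; each block is trained
    on its own auxiliary objective, then frozen."""
    frozen = nn.Sequential()           # empty Sequential acts as identity
    for k in range(depth):
        block, head = make_block(k), make_head(k)  # head is discarded later
        train_block(block, head, frozen, loader)
        for p in block.parameters():
            p.requires_grad_(False)    # freeze before growing deeper
        frozen.append(block)
    return frozen
```

For the 2- and 3-hidden-layer variant mentioned in the abstract, `make_block` would return a small stack of layers instead of a single one; the loop is otherwise unchanged.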


Related research:

- Gradual DropIn of Layers to Train Very Deep Neural Networks (11/22/2015): We introduce the concept of dynamically growing a neural network during ...
- Decoupled Greedy Learning of CNNs (01/23/2019): A commonly cited inefficiency of neural network training by back-propagation ...
- Hierarchical Transfer Convolutional Neural Networks for Image Classification (03/30/2018): In this paper, we address the issue of how to enhance the generalization ...
- Decoupled Greedy Learning of CNNs for Synchronous and Asynchronous Distributed Learning (06/11/2021): A commonly cited inefficiency of neural network training using back-propagation ...
- Representational Capacity of Deep Neural Networks -- A Computing Study (07/19/2019): There is some theoretical evidence that deep neural networks with multiple ...
- Scaling the Scattering Transform: Deep Hybrid Networks (03/27/2017): We use the scattering network as a generic and fixed initialization of ...
- DeepObfuscation: Securing the Structure of Convolutional Neural Networks via Knowledge Distillation (06/27/2018): This paper investigates the piracy problem of deep learning models. ...
