DeepAI AI Chat
Log In Sign Up

Making EfficientNet More Efficient: Exploring Batch-Independent Normalization, Group Convolutions and Reduced Resolution Training

06/07/2021
by   Dominic Masters, et al.
0

Much recent research has been dedicated to improving the efficiency of training and inference for image classification. This effort has commonly focused on explicitly improving theoretical efficiency, often measured as ImageNet validation accuracy per FLOP. These theoretical savings have, however, proven challenging to achieve in practice, particularly on high-performance training accelerators. In this work, we focus on improving the practical efficiency of the state-of-the-art EfficientNet models on a new class of accelerator, the Graphcore IPU. We do this by extending this family of models in the following ways: (i) generalising depthwise convolutions to group convolutions; (ii) adding proxy-normalized activations to match batch normalization performance with batch-independent statistics; (iii) reducing compute by lowering the training resolution and inexpensively fine-tuning at higher resolution. We find that these three methods improve the practical efficiency for both training and inference. Our code will be made available online.

READ FULL TEXT

page 1

page 2

page 3

page 4

06/07/2021

Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence

We investigate the reasons for the performance degradation incurred with...
02/11/2021

High-Performance Large-Scale Image Recognition Without Normalization

Batch normalization is a key component of most image classification mode...
02/10/2017

Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models

Batch Normalization is quite effective at accelerating and improving the...
09/28/2020

Group Whitening: Balancing Learning Efficiency and Representational Capacity

Batch normalization (BN) is an important technique commonly incorporated...
03/22/2018

Group Normalization

Batch Normalization (BN) is a milestone technique in the development of ...
11/01/2018

Stochastic Normalizations as Bayesian Learning

In this work we investigate the reasons why Batch Normalization (BN) imp...
11/28/2016

Dense Prediction on Sequences with Time-Dilated Convolutions for Speech Recognition

In computer vision pixelwise dense prediction is the task of predicting ...